Hacker News

Last time I downloaded Wikipedia, it was 4.5 GB. If I were to knock up this hack, I would definitely scrape pages instead.


On-demand page scrape + memoisation is almost certainly a win here. Even if thousands of people are hitting this, many will choose the same queries (I'm sure Kevin Bacon and xkcd and philosophy are in there a bunch), especially in the tails of the paths (Latin, Mathematics, ...).
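A minimal sketch of that idea, assuming a hypothetical `fetch_links` helper (the real version would hit the live page and parse its outbound article links; here a tiny fake graph and a call counter stand in so the caching behaviour is visible):

```python
from functools import lru_cache

# Counts how often we actually "scrape", so repeat queries can be
# seen hitting the cache instead of the network.
calls = {"count": 0}

@lru_cache(maxsize=None)
def fetch_links(title):
    """Hypothetical scraper: return the outbound links of a page."""
    calls["count"] += 1
    # Placeholder data; a real version would fetch and parse the page.
    fake_graph = {
        "Kevin Bacon": ("Philadelphia", "Footloose"),
        "xkcd": ("Webcomic", "Randall Munroe"),
    }
    return fake_graph.get(title, ())

fetch_links("Kevin Bacon")
fetch_links("Kevin Bacon")  # served from the memo cache, no second scrape
fetch_links("xkcd")
print(calls["count"])  # 2: the repeated query cost nothing
```

With popular endpoints like Kevin Bacon or Philosophy showing up in many paths, most of the graph a path search touches would already be cached after the first few queries.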



