
I'm also curious whether it removes some of the limitations of the Algolia version. I wanted to download my content for some statistical analysis (notes at http://www.gwern.net/HN ), but I discovered that there seem to be hard limits on how much of my data I can reach: https://github.com/algolia/hn-search/pull/36


There's a cheat to work around that limit: use the created_at_i parameter.

Example: https://github.com/minimaxir/hacker-news-download-all-storie...


If you want to get data for a single user through Algolia's API from the command line, you could also use https://github.com/jaredsohn/hnuserdownload. It uses the same technique as minimaxir's code (a post of his was the inspiration).


I don't know Python, so I'm not sure what your source code is doing. At a guess, you've hacked together some sort of repeated-query loop with a moving time window?


> Get 1000 entries and process them.
> Take the timestamp of the last entry.
> Requery the API, asking for articles made before that timestamp.

Repeat.
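
For anyone who doesn't read Python, here is a rough sketch of that loop. It is not the code from either linked repo. The endpoint (search_by_date) and the parameter names (tags, hitsPerPage, numericFilters on created_at_i) are my assumptions about the Algolia HN Search API, so check them against its docs. fetch_page and download_all are hypothetical names made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint for the Algolia HN Search API; verify against its documentation.
API_URL = "https://hn.algolia.com/api/v1/search_by_date"


def fetch_page(author, before=None, page_size=1000):
    """Fetch one page of an author's items, newest first, created before `before`."""
    # Parameter names below are assumptions about the API, not taken from the thread.
    params = {"tags": "author_" + author, "hitsPerPage": page_size}
    if before is not None:
        params["numericFilters"] = "created_at_i<%d" % before
    url = API_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["hits"]


def download_all(fetch, page_size=1000):
    """Requery with an ever-earlier cutoff until a page comes back empty."""
    items = []
    before = None  # no cutoff on the first request
    while True:
        page = fetch(before, page_size)
        if not page:
            break
        items.extend(page)
        # The oldest timestamp on this page becomes the next cutoff.
        before = min(hit["created_at_i"] for hit in page)
    return items
```

Usage would look something like download_all(lambda before, n: fetch_page("someuser", before, n)). One caveat: because the cutoff is a strict "less than", items that share the exact timestamp of the last entry on a page can be skipped.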



