Common Crawl maintains a free, open repository of web crawl data (commoncrawl.org)
27 points by doener 9 months ago | 1 comment


I haven't seen this resource before.

You can also search the index [1].

I downloaded and ran the example code [2] to look up a URL and fetch its content, and the response was instantaneous!

  [1] http://index.commoncrawl.org/CC-MAIN-2025-08-index?url=ycombinator.com&output=json
  [2] https://commoncrawl.org/get-started
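
For anyone curious what the lookup-then-fetch flow looks like, here is a rough Python sketch. It is not the example code from [2]. It assumes the index returns one JSON record per line with `filename`, `offset`, and `length` fields, and that the archive files are served from `https://data.commoncrawl.org/` with HTTP range requests supported. The helper names are made up for illustration.

```python
import gzip
import json
import urllib.parse
import urllib.request

INDEX_HOST = "http://index.commoncrawl.org"  # host taken from link [1]
DATA_HOST = "https://data.commoncrawl.org"   # assumption: public data host


def build_index_url(crawl_id, url):
    """Build an index query URL shaped like link [1]."""
    query = urllib.parse.urlencode({"url": url, "output": "json"})
    return f"{INDEX_HOST}/{crawl_id}-index?{query}"


def parse_index_lines(text):
    """Parse the response body, assumed to be one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def byte_range(record):
    """Turn a record's offset/length (assumed field names) into a Range header."""
    start = int(record["offset"])
    end = start + int(record["length"]) - 1  # HTTP ranges are inclusive
    return f"bytes={start}-{end}"


def fetch_record(record):
    """Download only the bytes for one capture and gunzip them."""
    request = urllib.request.Request(
        f"{DATA_HOST}/{record['filename']}",
        headers={"Range": byte_range(record)},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return gzip.decompress(response.read())


def main():
    """Look up ycombinator.com in one crawl and print the start of a capture."""
    index_url = build_index_url("CC-MAIN-2025-08", "ycombinator.com")
    with urllib.request.urlopen(index_url, timeout=30) as response:
        records = parse_index_lines(response.read().decode("utf-8"))
    if records:
        raw = fetch_record(records[0])
        print(raw[:500].decode("utf-8", errors="replace"))
```

Calling `main()` does the network round trip. Only a small byte range of the archive file is requested rather than the whole file, which would explain why the fetch feels fast.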



