
Why not a static cache for anyone not logged in, and only do this check when you're authenticated, which gives access to editing pages?
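Roughly something like this sketch (in Go; renderPage, verifyChallenge, and the "session" cookie name are stand-ins made up for illustration, not anything the Arch Wiki actually runs):

    // Hypothetical sketch: serve anonymous traffic from a static cache and
    // only run the (more expensive) bot check for requests that carry a
    // session cookie, i.e. users who can actually edit pages.
    package main

    import (
    	"net/http"
    	"sync"
    )

    var (
    	cacheMu sync.RWMutex
    	cache   = map[string][]byte{} // rendered page bodies keyed by path
    )

    // renderPage and verifyChallenge stand in for the wiki backend and the
    // Anubis-style check; both names are invented for this sketch.
    func renderPage(path string) []byte        { return []byte("<html>...</html>") }
    func verifyChallenge(r *http.Request) bool { return true }

    func handler(w http.ResponseWriter, r *http.Request) {
    	if _, err := r.Cookie("session"); err != nil {
    		// Not logged in: serve the cached copy, rendering it once if needed.
    		cacheMu.RLock()
    		body, ok := cache[r.URL.Path]
    		cacheMu.RUnlock()
    		if !ok {
    			body = renderPage(r.URL.Path)
    			cacheMu.Lock()
    			cache[r.URL.Path] = body
    			cacheMu.Unlock()
    		}
    		w.Write(body)
    		return
    	}

    	// Logged in: run the challenge check before hitting the live backend.
    	if !verifyChallenge(r) {
    		http.Error(w, "challenge required", http.StatusForbidden)
    		return
    	}
    	w.Write(renderPage(r.URL.Path))
    }

    func main() {
    	http.HandleFunc("/", handler)
    	http.ListenAndServe(":8080", nil)
    }

Anonymous reads never touch the wiki backend after the first render of a path; only cookie-bearing requests go through the challenge and the live renderer.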

edit: Because HN is throwing "you're posting too fast" errors again:

> That falls short of the "meets their needs" test. Authenticated users already have a check (i.e., the auth process). Anubis is to stop/limit bots from reading content.

The Arch Wiki is a high-value target for scraping, so they'll just solve the Anubis challenge once a week. It's not going to stop them.



> The Arch Wiki is a high-value target for scraping, so they'll just solve the Anubis challenge once a week. It's not going to stop them.

The goal of Anubis isn't to stop them from scraping entirely, but rather to slow down aggressive scraping (e.g., sites with lots of pages being scraped every 6 hours [1]) so that the scraping doesn't impact the backend nearly as much.

[1] https://pod.geraspora.de/posts/17342163, which was linked as an example in the original blog post describing the motivation for Anubis [2].

[2] https://xeiaso.net/blog/2025/anubis/


The point of a static cache is that your backend isn't impacted at all.


That falls short of the "meets their needs" test. Authenticated users already have a check (i.e., the auth process). Anubis is to stop/limit bots from reading content.


... Are you saying a bot couldn't authenticate?

You'd still need a layer there; the login could also have been done manually just to pull a session token.


> The Arch Wiki is a high-value target for scraping, so they'll just solve the Anubis challenge once a week.

ISTR that Anubis lets the site owner control the expiry on the check; if you're still getting hit by bots, drop the expiry to 5s with a lower "work" effort, so that every challenge takes (say) 2s and only lasts for 5s.

(That still might not help, though, because it optimises for bots at the expense of humans: a human will only make maybe one actual request every 30-200 seconds, while a bot could do a lot in 5s.)


Rather than a time to live, you probably want a number of requests to live: decrement a counter associated with the token on every request until it expires.

An obvious follow-up is to decrement it by a larger amount if requests are made at a higher frequency.
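A rough sketch of that idea (hypothetical, not how Anubis actually tracks tokens; the budget, the 2s threshold, and the penalty of 5 are made-up numbers):

    // Sketch of a "requests to live" token: each solved challenge grants a
    // request budget, and bot-like request rates burn the budget faster.
    package main

    import (
    	"fmt"
    	"sync"
    	"time"
    )

    type tokenState struct {
    	remaining int       // requests left before the token expires
    	lastSeen  time.Time // time of the previous request with this token
    }

    type TokenStore struct {
    	mu     sync.Mutex
    	tokens map[string]*tokenState
    }

    func NewTokenStore() *TokenStore {
    	return &TokenStore{tokens: make(map[string]*tokenState)}
    }

    // Issue registers a freshly solved challenge with a request budget.
    func (s *TokenStore) Issue(token string, budget int) {
    	s.mu.Lock()
    	defer s.mu.Unlock()
    	s.tokens[token] = &tokenState{remaining: budget, lastSeen: time.Now()}
    }

    // Spend is called on every request: it decrements the counter, with a
    // larger penalty when requests arrive faster than a human plausibly
    // would, and reports whether the token is still valid.
    func (s *TokenStore) Spend(token string) bool {
    	s.mu.Lock()
    	defer s.mu.Unlock()

    	st, ok := s.tokens[token]
    	if !ok {
    		return false // unknown or expired token: re-challenge
    	}

    	cost := 1
    	if time.Since(st.lastSeen) < 2*time.Second {
    		cost = 5 // hypothetical penalty for bot-like request rates
    	}
    	st.lastSeen = time.Now()
    	st.remaining -= cost

    	if st.remaining <= 0 {
    		delete(s.tokens, token) // budget exhausted: force a new challenge
    		return false
    	}
    	return true
    }

    func main() {
    	s := NewTokenStore()
    	s.Issue("abc123", 200) // token issued after a solved challenge
    	fmt.Println(s.Spend("abc123"))
    }

The frequency threshold is the main knob: set it just above the fastest rate a human plausibly clicks through pages, and a scraper burns its budget in seconds while a reader barely notices.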


Does anyone know if static caches work? No one seems to have replied to that point. It seems like a simple and user-friendly solution.


Caches would only work if the bots were hitting routes that any human had ever hit before.


They'd also work if the bot, or another bot, has hit that route before. It's a wiki; the amount of content is finite, and each route getting hit once isn't a problem.



