
That user-agent seems to be in the robots.txt as _disallowed_, but somehow it gets through the paywall? That seems counter-intuitive.


It's just blocking the root. Look up the specifications for the robots.txt for more information. One purpose is to reduce load on the parts of a website that the owner does not want indexed.


Definitely incorrect: the paths in the robots.txt are prefixes, so `/` matches any path starting with `/`, that is, everything. Look up the specifications for the robots.txt for more information! (Or, for instance, look up how you'd block the whole site in robots.txt if you wanted to!)
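
For anyone who wants to check the prefix behaviour locally, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot names (`ExampleBot`, `OtherBot`), the `example.com` URLs, and the rules themselves are made up for illustration, not taken from the site under discussion. Real crawlers may also implement extras such as wildcards that this parser does not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: one bot is disallowed from "/", another from "/news".
rules = """\
User-agent: ExampleBot
Disallow: /

User-agent: OtherBot
Disallow: /news
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# "Disallow: /" is a prefix rule, so it covers the root and every deeper path.
root_allowed = parser.can_fetch("ExampleBot", "https://example.com/")
deep_allowed = parser.can_fetch("ExampleBot", "https://example.com/articles/2024/story.html")

# "Disallow: /news" blocks anything whose path starts with "/news",
# including "/newsletter", but leaves unrelated paths alone.
news_allowed = parser.can_fetch("OtherBot", "https://example.com/news/today")
newsletter_allowed = parser.can_fetch("OtherBot", "https://example.com/newsletter")
about_allowed = parser.can_fetch("OtherBot", "https://example.com/about")

print(root_allowed, deep_allowed, news_allowed, newsletter_allowed, about_allowed)
```

Under these made-up rules, the bot disallowed from `/` is refused for every URL on the host, not only the root page, which is the point about prefixes.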


No, `/` means the entire site: the root and everything below it.



