
What's a good alternative to robots.txt?


Proper access control on your web site.


Why not simply refuse to show the transaction unless the visitor is logged in as the user it belongs to?
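A minimal sketch of that check, assuming a hypothetical Transaction record with an owner_id and a current_user_id that is None for anonymous visitors (none of these names come from the thread):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    # Hypothetical record; only the owner matters for this check.
    id: int
    owner_id: int


def can_view(transaction: Transaction, current_user_id: Optional[int]) -> bool:
    """Return True only for the logged-in owner of the transaction."""
    # Anonymous visitors have no user id, so they are always refused.
    if current_user_id is None:
        return False
    return current_user_id == transaction.owner_id


def response_status(transaction: Transaction, current_user_id: Optional[int]) -> int:
    # Returning 404 rather than 403 avoids confirming that the page exists.
    return 200 if can_view(transaction, current_user_id) else 404
```

Crawlers are never logged in, so with a check like this there is nothing on the page for them to index in the first place.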


You can also put a meta robots tag (<meta name="robots" content="noindex">) inside the page HTML. If you want to block a lot of pages, your best bet is to add it to your master layout or template. Like a robots.txt rule, this keeps the pages out of search results, but without exposing a list of blocked paths.

The downside is that Google will still crawl the page and use your bandwidth, but the page won't be indexed.
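Here is a rough sketch of the master layout idea, using only Python's standard library. The layout, the render_page function and the indexable flag are made up for illustration; a real site would do the same thing in its own template engine.

```python
import html
from string import Template

# Hypothetical master layout; $robots_meta is filled in per page.
LAYOUT = Template("""<!doctype html>
<html>
<head>
<title>$title</title>
$robots_meta
</head>
<body>
$body
</body>
</html>""")

NOINDEX_META = '<meta name="robots" content="noindex">'


def render_page(title: str, body: str, indexable: bool = True) -> str:
    """Render a page, adding the noindex tag when indexable is False."""
    robots_meta = "" if indexable else NOINDEX_META
    # The title is escaped; body is assumed to be trusted HTML already.
    return LAYOUT.substitute(
        title=html.escape(title), body=body, robots_meta=robots_meta
    )
```

Calling render_page("Checkout", "<p>Order summary</p>", indexable=False) produces a page whose head contains the noindex tag, while pages rendered with the default stay indexable.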


I would suggest doing both: check the user agent for bots and, if the request comes from a bot, send a 404 header and exit before the page has to load. Also add a meta noindex just in case. Robots.txt DOES NOT prevent indexing, just crawling.
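Roughly what that could look like as a bare WSGI app. The BOT_MARKERS list is an illustrative assumption and far from exhaustive, and user-agent strings are easy to spoof, so treat this as a sketch rather than real protection:

```python
# Illustrative substrings only; real crawlers use many more user agents.
BOT_MARKERS = ("googlebot", "bingbot", "slurp", "duckduckbot")

PAGE = (
    b"<html><head>"
    b'<meta name="robots" content="noindex">'
    b"</head><body>Order details</body></html>"
)


def is_bot(user_agent: str) -> bool:
    """Return True if the user agent contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def app(environ, start_response):
    # Answer recognised bots with a 404 before any page work is done.
    if is_bot(environ.get("HTTP_USER_AGENT", "")):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]
    # Everyone else gets the page, which still carries the noindex tag
    # for any crawler the marker list failed to recognise.
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [PAGE]
```

To try it locally, the app can be served with wsgiref.simple_server.make_server("", 8000, app).serve_forever().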


Blocking Googlebot's access to those pages is the easiest. But robots.txt would work, if you just did

User-agent: *
Disallow: /checkouts/
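A rule like this, written with the standard Disallow directive, can be sanity-checked with Python's built-in urllib.robotparser. The example paths and the allowed helper are assumptions for illustration:

```python
from urllib.robotparser import RobotFileParser

# The rule under test, written with the standard Disallow directive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkouts/
"""

parser = RobotFileParser()
# parse() takes the file's lines, so no network request is needed.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path: str, agent: str = "Googlebot") -> bool:
    """Return True if the given agent may crawl the given path."""
    return parser.can_fetch(agent, path)
```

Here allowed("/checkouts/123") is False and allowed("/products/1") is True. Keep in mind this only stops well-behaved crawlers from fetching the pages; it does not stop the URLs from being indexed if they are linked elsewhere.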



