You can still avoid loading whole new pages: attach JavaScript event handlers to your anchor tags and do whatever Ajax content trickery you want that way. The page content itself stays maximally flexible and useful to every agent if the URLs inside it are actual URLs.
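A rough sketch of what I mean, assuming a container with id "content" to swap markup into (the id and the href filter are my own illustrative choices, not anything prescribed):

    // Keep real URLs in the markup, but hijack clicks for Ajax loading.
    // '#content' is an assumed container id; a real app would extract just
    // the relevant fragment of the response rather than the whole page.
    var links = document.querySelectorAll('a[href^="/"]');
    Array.prototype.forEach.call(links, function (link) {
      link.addEventListener('click', function (event) {
        event.preventDefault();                        // stop the full page load
        var xhr = new XMLHttpRequest();
        xhr.open('GET', link.getAttribute('href'));
        xhr.onload = function () {
          document.getElementById('content').innerHTML = xhr.responseText;
        };
        xhr.send();
      });
    });

With that in place the hrefs stay plain /shop/shoes-style URLs, so spiders and script-less browsers still get working links.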
The only problem with that is you end up with a mix of both. If a spider collects all the non-Ajax links and a search result sends a JavaScript-enabled browser to one of them, the user lands on e.g. /shop/shoes.
If the site is Ajax-enabled for a slicker experience, then as the user browses on from there they might end up with something like this in their address bar:
/shop/shoes#!shop/socks
or even
/shop/shoes#!help/technical
which starts to look really weird. The Google hashbang spec at least fixes this problem: the spider understands the app's normal URLs and will dispatch users to them.
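To make that concrete: a hash-based router typically only ever rewrites the fragment, so whatever path the page happened to be served from just stays put underneath it. A toy sketch (navigate, dispatch and loadContent are invented names for illustration):

    // Stand-in for the real Ajax fetch; invented for this sketch.
    function loadContent(route) {
      console.log('would load content for ' + route);
    }

    // Navigation only rewrites the fragment, so if this page was served from
    // /shop/shoes, navigate('shop/socks') leaves you at /shop/shoes#!shop/socks.
    function navigate(route) {
      window.location.hash = '#!' + route;
    }

    // Render whatever state the fragment names, on load and on every change.
    function dispatch() {
      var hash = window.location.hash;
      if (hash.indexOf('#!') === 0) {
        loadContent(hash.slice(2));   // 'shop/socks', 'help/technical', ...
      }
    }

    window.addEventListener('hashchange', dispatch, false);
    dispatch();

Serve that from /shop/shoes and the first "socks" click leaves you at exactly the half-and-half URL above.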
Can you not use JavaScript to figure out that your URL is a mess and redirect accordingly? JavaScript snippets for redirecting people to a site's homepage have been available on dynamicdrive.com for at least a decade now.
That's one redirect to the homepage (which you're already doing by 301-redirecting the JavaScript-free URLs anyway), so it's hardly going to be difficult.
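The 301 half of that is trivial too. A minimal Node sketch, assuming everything is meant to hang off the root; in practice this would more likely be a rewrite rule in the web server config, Node is used only to keep the example in JavaScript:

    var http = require('http');

    http.createServer(function (req, res) {
      var path = req.url.split('?')[0];
      if (path !== '/') {
        // 301 the JavaScript-free URL to its hashbang twin on the root.
        res.writeHead(301, { 'Location': '/#!' + path.slice(1) });
        res.end();
        return;
      }
      // The root serves the single-page app shell.
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end('<html><body>app shell goes here</body></html>');
    }).listen(8080);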
Considering the haphazard redirects already going on for incoming links to hashbanged sites, I'm puzzled why this isn't a trivial problem:
Incoming link is to /shop/shoes#!shop/socks
JavaScript right at the top of /shop/shoes uses window.location to send the user to /#!shop/shoes
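Something like this, say, at the top of the /shop/shoes page. Whether the fragment or the path should win when both are present is my assumption about the desired behaviour, not anything specified above:

    // Runs before anything else on pages like /shop/shoes.
    //   /shop/shoes#!shop/socks  ->  /#!shop/socks   (a fragment is present, keep it)
    //   /shop/shoes              ->  /#!shop/shoes   (no fragment, convert the path)
    (function () {
      var path = window.location.pathname;
      var hash = window.location.hash;
      if (path === '/') {
        return; // already on the hashbang homepage, nothing to fix
      }
      if (hash.indexOf('#!') === 0) {
        window.location.replace('/' + hash);
      } else {
        window.location.replace('/#!' + path.slice(1));
      }
    })();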
1.) This is a limitation of Google's crawlable Ajax proposal, one that probably would not have occurred with a proper standards body. What sequence of events would have to happen to produce that as an inbound URL? I suspect some earlier JavaScript would have had to fail to allow that scenario.
2.) The site is already paying this price by redirecting _escaped_fragment_ URLs and the old clean-style URLs. All inbound links will have this problem, so you're only shifting some of the burden through this door instead of the others.
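For anyone who hasn't read the proposal: the crawler rewrites /#!shop/socks into something like /?_escaped_fragment_=shop/socks (URL-escaped) before requesting it, and the server is expected to answer with a crawlable snapshot or a redirect. A rough Node sketch of the detection side only; renderSnapshot is a stand-in of my own, not part of the scheme:

    var http = require('http');
    var url = require('url');

    // Stand-in for real server-side rendering of the state named by `fragment`.
    function renderSnapshot(fragment) {
      return '<html><body>Snapshot of ' + fragment + '</body></html>';
    }

    http.createServer(function (req, res) {
      var query = url.parse(req.url, true).query;
      var fragment = query._escaped_fragment_;

      if (fragment !== undefined) {
        // Googlebot asking about /#!shop/socks shows up here; hand back static HTML.
        res.writeHead(200, { 'Content-Type': 'text/html' });
        res.end(renderSnapshot(fragment));
        return;
      }

      // Normal browsers get the app shell plus the client-side router.
      res.writeHead(200, { 'Content-Type': 'text/html' });
      res.end('<html><body>app shell goes here</body></html>');
    }).listen(8080);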
No, with Google's proposal the #! links all hang off the site root; see Lifehacker's and Twitter's implementations. So these ugly half-and-half URLs never exist, and you're not paying a double-request price.
Ah, you're right, and yes, that could introduce redundant work on the server, depending on the implementation. However, the two major implementations I've seen (Twitter and Lifehacker) use it from the root and so don't have that problem.