We adopted continuous deployment at my last company, and it was a huge win for us. It resulted in less downtime, reduced the cognitive load on our developers, and let us turn around changes and bug fixes faster, which made everyone happier. Here is the approach we took:
We got continuous integration working first by setting up an automated test server. We used cerberus, a trivially simple ruby gem that polls a repo, grabs the code, runs the tests, and reports the result. You could install this anywhere, even an old Mac mini if you wanted; we spun up a low-end server for it. We wrote great tests, got decent coverage, and made adjustments to our automated testing strategy to increase our confidence.
Then we worked on zero-downtime deployment and rollback. This was actually the hardest part for us. With regard to schema changes, if we had to hammer our database (new index on a big table) then we needed to take the site down, but otherwise our strategy was to always add to the schema, wait for the dust to settle, and then remove things later. This worked for us, but we had a relatively simple schema.
I haven't quite figured out how an ajax-heavy site would pull this off. That seems like a hard problem since you need to detect the changes and refresh your javascript code.
We then combined these two to get automated deployment to a staging server. We could have written monitoring code at this point, but we decided to punt on that, relying on email notification for crashes and watching our performance dashboard.
And finally, we set it up to deploy to production, and it just worked, and we never looked back. It was the most productive and pleasant setup I've ever worked in.
Regarding ajax-heavy applications, I have been faced with that particular problem. A few of my sites are javascript-heavy apps, with long flows between page reloads. If I ever change my server API in a way that would break the javascript, I need to signal to the client that it needs to refresh. (I keep track of a version number for the server code; whenever that is bumped, it means the client is out of sync.)
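A minimal sketch of that version-bump signal, checked on every ajax response. Everything here is an assumption for illustration — the `X-App-Version` header name, the `BOOT_VERSION` constant, and the placeholder prompt are not the poster's actual setup:

```javascript
// Version the client recorded when the page was served.
// In practice this would be injected into the page at render time.
const BOOT_VERSION = "42";

// Decide whether the server's reported version means we are out of sync.
function needsReload(bootVersion, serverVersion) {
  return serverVersion !== undefined && serverVersion !== bootVersion;
}

// Wrap fetch so every ajax response is checked against the boot version.
// Assumes the server sends its deployed version in an X-App-Version header.
async function apiFetch(url, options) {
  const resp = await fetch(url, options);
  const serverVersion = resp.headers.get("X-App-Version") ?? undefined;
  if (needsReload(BOOT_VERSION, serverVersion)) {
    // Signal the user rather than reloading blindly, so in-progress
    // work isn't lost.
    showReloadPrompt();
  }
  return resp;
}

function showReloadPrompt() {
  // Placeholder: in a real app this would be a lightbox or banner.
  console.log("Server updated; please refresh the page.");
}
```

The nice part of checking the header on every response (rather than polling) is that the prompt appears exactly when the client first talks to the new server.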
This can be kinda awkward. What I've opted for is either a lightbox or some kind of message saying we need to refresh the page. But that's not ideal.
Has anyone dealt with this issue before? With javascript-heavy apps, development is more like that of a traditional desktop app or a mobile app, where you have to deal with the client/server interface in a non-trivial way.
I've been doing continuous deployment (multiple production deployments per day) with an ajax-heavy application and have been doing a sort of rolling API change, where both the client and the server still function using the last generation's contract, so I rarely get a client request that the server can't deal with, or send a response that the client can't deal with. It doesn't work with every kind of change, but it has helped me.
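The "both generations still work" idea can be sketched from the client's side like this. The renamed field (`name` to `full_name`) is a hypothetical example, not the poster's actual API:

```javascript
// Rolling API change: the client is written to accept both the previous
// generation's response shape and the new one, so it keeps working no
// matter which server version answers during a rollout.
// Hypothetical change: the API is renaming `name` to `full_name`.
function readUser(payload) {
  return {
    id: payload.id,
    // Prefer the new field, fall back to the old one.
    fullName: payload.full_name ?? payload.name,
  };
}

// Old-generation response still works...
const oldResp = readUser({ id: 1, name: "Ada" });
// ...and so does the new-generation one.
const newResp = readUser({ id: 2, full_name: "Grace" });
```

Once the new server version is fully deployed and all old clients have cycled out, the fallback branch can be deleted — the same add-then-remove-later rhythm described for schema changes above.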
Something that you could do (and I may wind up doing) is just having the page ask the server if it needs to refresh itself every so often.
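That periodic check might look something like this. The `/version` endpoint, the interval, and the version string are all assumptions for the sketch:

```javascript
// The version this page was served with (injected at render time).
const PAGE_VERSION = "2011-06-01-1200";
const POLL_INTERVAL_MS = 60 * 1000; // once a minute

// Pure comparison, kept separate so it's easy to test.
function isStale(pageVersion, serverVersion) {
  return serverVersion !== pageVersion;
}

// Every so often, ask the server what version it's running.
// Assumes a hypothetical /version route returning the version as text.
function startVersionPoll() {
  setInterval(async () => {
    const serverVersion = (await (await fetch("/version")).text()).trim();
    if (isStale(PAGE_VERSION, serverVersion)) {
      // Prompt the user to refresh (lightbox, banner, etc.).
      console.log("New version deployed; please refresh.");
    }
  }, POLL_INTERVAL_MS);
}
// startVersionPoll(); // call once on page load
```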
I haven't directly, but I just happen to be staring at Pivotal Tracker, which handles it pretty nicely. They do as you say and push a little lightbox which says "A system change has occurred requiring a refresh of this page. Please click 'OK' to reload Tracker."
I think the way Gmail handles the problem is to keep multiple instances of the server-side software running, one for each version of the API. In my experience, whenever Gmail rolls out a new feature, I don't see it until I do a refresh, and frequently that's what Gmail tells me I need to do to see the new feature.