
I'd say the way failover/HA is done at Heroku is straightforward. Will (the designer) or I plan to write about it some day, pending laziness.

It took some time to figure out because it required breaking some orthodoxy, but I'm happy with the result.

The thresholds are documented at https://devcenter.heroku.com/articles/heroku-postgres-ha#fai....

The promotion is done by rebinding the URL of the database and restarting the app. This neatly shares mechanism with password changes, which is one reason we decided it was worth throwing out network-transparency orthodoxy when it came to HA: the clients must be controlled anyway to deal with security considerations.
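The comment doesn't spell out Heroku's internals, but the client-side pattern it describes can be sketched roughly like this: the app resolves its database endpoint from `DATABASE_URL` (the standard Heroku convention) only at process start, so rebinding the variable and restarting repoints the app, whether the reason is a failover or a credential rotation. The parsing details below are illustrative, not Heroku's actual code.

```python
import os
from urllib.parse import urlparse

def database_config():
    """Resolve the database endpoint from DATABASE_URL at process start.

    Because the URL is read only on startup, an operator can promote a
    follower by rebinding DATABASE_URL and restarting the app -- the same
    path used to rotate passwords.
    """
    url = urlparse(os.environ["DATABASE_URL"])
    return {
        "host": url.hostname,
        "port": url.port or 5432,
        "user": url.username,
        "password": url.password,
        "dbname": url.path.lstrip("/"),
    }
```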



One approach I used when we migrated data centres, to avoid having to manage the timing of IP address changes in our apps, was haproxy.

We configured slaves in the new data centre, set up haproxy there with the old data centre databases as the main backend and the new data centre databases as the backup, changed the apps to point to haproxy, shut down the masters to let haproxy shift all the traffic, and promoted the slaves once we were certain they'd caught up. We had a period of a few seconds where the sites were effectively read-only, but that was it.
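The main/backup arrangement described above maps directly onto haproxy's `backup` server flag: `backup` servers receive traffic only once every non-backup server in the backend is down, so shutting down the old masters is what triggers the cutover. A minimal sketch, with hypothetical hostnames:

```
# Hypothetical hostnames; traffic flows to the old DC until its
# health check fails, then haproxy falls through to the backup.
listen postgres
    bind *:5432
    mode tcp
    option tcp-check
    server old-dc-master old-dc-db.example.com:5432 check
    server new-dc-slave  new-dc-db.example.com:5432 check backup
```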

We're planning to roll out haproxy with keepalived or ucarp to mediate between a lot more of our backend services.
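For what it's worth, pairing haproxy with keepalived usually means floating a virtual IP between two haproxy hosts via VRRP, so the proxy itself isn't a single point of failure. A minimal keepalived sketch, with the interface and address as assumed placeholders:

```
# Runs on each haproxy host; the MASTER holds the virtual IP
# until it fails, at which point the BACKUP node takes it over.
vrrp_instance VI_1 {
    state MASTER            # "BACKUP" on the standby node
    interface eth0          # assumed interface name
    virtual_router_id 51
    priority 100            # lower priority on the standby
    advert_int 1
    virtual_ipaddress {
        10.0.0.100          # assumed floating IP the apps connect to
    }
}
```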


> I'd say the way failover/HA is done at Heroku is straightforward.

> It took some time to figure out because it required breaking some orthodoxy

If it's hard to figure out, it doesn't sound too straightforward.


Only because the idea of manipulating clients of the database to do HA is somewhat heretical.


Just wanted to put in a vote for a writeup. Mind sharing a few more details here in the meantime?



