
I also run three small Consul clusters, and I went ahead and read the Raft paper [1] so I can debug Consul election problems if they occur.

Consul is awesome when it works, but when it breaks it can be hell to get it working again. Thankfully it usually works fine. I've only had one outage, and it fixed itself after I restarted the service.

[1] https://raft.github.io/raft.pdf



> so I can debug Consul election problems if they occur

Interestingly, reading this reminds me of a HashiCorp Nomad marketing piece [1]:

> "We have people who are first-time system administrators deploying applications, building containers, maintaining Nomad. There is a guy on our team who worked in the IT help desk for eight years — just today he upgraded an entire cluster himself."

I always thought, "But what if something goes wrong? Just call the HashiCorp engineers?" :p

[1] https://www.hashicorp.com/case-studies/roblox


That seems to be a general problem with these types of solutions. You have the exact same issue with something like ZooKeeper. It's awesome when it works, but good luck trying to figure out why it's broken.

Just relying on these types of services, as the author of the previous post does, is something that can keep me up at night.



