
Kubernetes handles most of this seamlessly for the cluster infrastructure.

The central master handles node failures by removing nodes that aren't heartbeating.
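To make the heartbeat idea concrete, here is a minimal sketch of heartbeat-based eviction. It is not Kubernetes code: the `NodeRegistry` class, its method names, and the timeout value are all hypothetical, made up for illustration.

```python
import time


class NodeRegistry:
    """Toy registry (hypothetical, not the Kubernetes implementation).

    Records the last heartbeat time per node and drops nodes whose
    most recent heartbeat is older than the timeout.
    """

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so the logic can be tested without sleeping
        self.last_seen = {}

    def heartbeat(self, node_name):
        # A node calls this periodically to signal that it is alive.
        self.last_seen[node_name] = self.clock()

    def evict_stale(self):
        # Remove every node that has not heartbeated within the timeout
        # and return the names of the removed nodes.
        now = self.clock()
        stale = [
            name
            for name, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        ]
        for name in stale:
            del self.last_seen[name]
        return sorted(stale)
```

A master-like loop would call `evict_stale()` on a timer and reschedule whatever was running on the nodes it returns.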

On the node, we require a process monitor for the kubelet (by default we use supervisord). The kubelet in turn monitors Docker (and also does garbage collection and resource limiting), and all of the other node daemons (e.g. the Kubernetes proxy) are run, monitored, and restarted by the kubelet.
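Each link in the supervision chain described above boils down to "start a process, and start it again when it exits". A rough sketch of that loop follows. The `supervise` function and its parameters are hypothetical, and this is not how supervisord or the kubelet is actually implemented. Real supervisors restart indefinitely; the restart count is capped here only so the example terminates.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts, backoff_seconds=0.0):
    """Run `command`, restarting it each time it exits.

    Hypothetical helper for illustration. The command is started once
    and then restarted up to `max_restarts` times. Returns the exit
    code of every run, in order.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        proc = subprocess.Popen(command)
        exit_codes.append(proc.wait())  # block until the child exits
        if attempt < max_restarts:
            time.sleep(backoff_seconds)  # pause before restarting
    return exit_codes


if __name__ == "__main__":
    # A child that always fails, so it gets restarted twice.
    failing_child = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(supervise(failing_child, max_restarts=2))
```

The point of layering is that each supervisor only has to watch the one thing directly below it.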



Fully agree. K8s has lots of self-healing capabilities as long as it is functioning correctly. Our goal was to extend this with a lower layer that detects failures in the components that are mission-critical for k8s deployments to run properly, such as etcd, Docker state, and SkyDNS.



