How do you handle updating the machine that Incus itself runs on? I imagine you have to be super careful not to introduce any breakage, because then all the VMs/containers go down.
What about kernel updates that require reboots? I have heard of ksplice/kexec, but I have never seen them used anywhere.
I'm not quite sure what your question is here? It's very much like any other system that needs to reboot: you prepare for those reboots in advance.
Of course, to some extent things like vSphere/Virtuozzo, LXD/Incus, and even plain Qemu/Virsh setups can do live migration of VMs, so you can worry a bit less about making the workloads inside the VMs fault tolerant - but only to some extent.
I.e. if your team runs PostgreSQL, you run it as a cluster with Patroni and VIPs and all that lovely industry-standard magic, and tell the dev teams to use that VIP as the entry point (in reality things are a bit more complicated, with Haproxy/Pgbouncer on top, but that's enough to express the idea).
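To make that concrete, a rough sketch (the VIP address and config path here are made up for illustration):

    # check cluster state from any node with Patroni's own tooling
    patronictl -c /etc/patroni/patroni.yml list

    # applications connect to the VIP, not to an individual host;
    # 10.0.0.100 is a hypothetical address that follows the current entry point
    psql "host=10.0.0.100 port=5432 dbname=app user=app"

Failover happens behind that one address, so a host reboot turns into a short reconnect for clients rather than an outage.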
I'm not sure that clustering goes beyond "multiple hosts with a single API to rule them all" - thus I assume that when a physical node needs maintenance, it won't magically migrate/restart VEs on other cluster members. I may be wrong here.
P.S. Microcloud tries to achieve this, AFAIR, but it's from Canonical, so it's built on LXD.
Okay, then my question still stands. You are saying "similar to any other system which needs to reboot", but this is nowhere near similar to something like k8s, which has first-class support for this. You cordon and drain the node you are about to take down for maintenance, Kubernetes automatically redistributes all the workloads to the other nodes, and after you are done you uncordon the node.
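Roughly this workflow, to be concrete (the node name is just an example):

    # stop new pods from landing on the node and evict the existing ones
    kubectl cordon node-1
    kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

    # ... kernel update, reboot, etc. ...

    # let the node take workloads again
    kubectl uncordon node-1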
How does this look with Incus? Obviously if the workload you are running has some kind of multi-node support you can use that, but I'm wondering if Incus has a way to do this in some kind of generalized way like k8s does?
But I did some more reading, and there seems to be support for live migration of VMs, and limited live migration for containers. Moving stopped instances is supported for both VMs and containers.
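From what I can tell from the docs, live-migrating a VM looks roughly like this (instance and member names are placeholders, and I believe the VM needs stateful migration enabled first):

    # allow stateful operations on the VM, then move it while it's running
    incus config set vm1 migration.stateful=true
    incus move vm1 --target server2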
Indeed, container live migration is limited, and the docs are a bit unclear on "network devices" - is a bridged interface a network device or not?
A bit ironic, given that CRIU, AFAIK, was created by the same Virtuozzo guys who provided OpenVZ back then, and those VEs could live migrate - I was personally testing it in 2007-2008. Granted, there was no systemd in those days, if that's what complicates things. And of course it required their patched kernel.
> Live migration for containers
For containers, there is limited support for live migration using CRIU. However, because of extensive kernel dependencies, only very basic containers (non-systemd containers without a network device) can be migrated reliably. In most real-world scenarios, you should stop the container, move it over and then start it again.
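In practice that recommendation amounts to something like this (names are placeholders):

    # cold-migrate a container to another cluster member
    incus stop c1
    incus move c1 --target server2
    incus start c1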
It is built-in functionality [1] and requires no extra orchestration. In a cluster setup, you would be using virtualized storage (Ceph-based) and a virtualized network (OVN). You can replace a container/VM on one host with another on a different host with the same storage volumes, network and address. This is what k8s does with pod migrations too (edit: except the address).
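For planned maintenance specifically, the closest thing to cordon/drain that I'm aware of is cluster evacuation (the member name is a placeholder):

    # migrate or stop this member's instances before taking it down
    incus cluster evacuate server1

    # after the reboot, bring its instances back
    incus cluster restore server1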
There are a couple of differences, though. The first is the pets-vs-cattle treatment of containers by Incus and k8s respectively. Incus tries to resurrect dead containers as faithfully as possible. This means that Incus treats container crashes like system crashes, and its recovery involves a systemd bootup inside the container (the kernel too, in the case of VMs). This is what accounts for the delay. K8s, on the other hand, doesn't care about dead containers/pods at all. It just creates another pod, likely with a different address, and expects it to handle the interruption.
Another difference is the orchestration mechanism behind this. K8s, as you may be aware, uses control loops on controller nodes to detect the crash and initiate the recovery. The recovery is mediated by the kubelets on the worker nodes. Incus seems to have the orchestrator on all nodes. They take decisions based on consensus and manage the recovery process themselves.
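And for the unplanned case, if I remember the config key correctly, the automatic healing is opt-in, something like:

    # evacuate instances from a cluster member that has been offline for 30 seconds
    incus config set cluster.healing_threshold 30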
> and address. This is what k8s does with pod migrations too.
That's not true of Pods; each Pod has its own distinct network identity. You're correct about the network, though, since AFAIK the Service and Pod CIDRs are fixed for the lifespan of the k8s cluster.
You spoke to it further down, but guarded it with "likely", and I can say with certainty that it's not likely, it unconditionally does. That's not to say address re-use isn't possible over a long enough time horizon, but that bookkeeping is delegated to the CNI.
---
Your "dead container" one also has some nuance, in that kubelet will for sure restart a failed container, in place, with the same network identity. When fresh identity comes into play is if the Node fails, or the control loop determines something in the Pod's configuration has changed (env-vars, resources, scheduling constraints, etc) in which case it will be recreated, even if by coincidence on the same Node
> I can say with certainty that it's not likely, it unconditionally does. That's not to say address re-use isn't possible over a long enough time horizon, but that bookkeeping is delegated to the CNI
You are 100% wrong then. The kube-ovn CNI enables static address assignment and "sticky" IPAM on both pods and KubeVirt VMs.
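If I recall the annotation key correctly, pinning an address is as simple as this (names and the IP are made up, and the subnet has to be one kube-ovn manages):

    # request a fixed address from kube-ovn at pod creation time
    kubectl apply -f - <<'EOF'
    apiVersion: v1
    kind: Pod
    metadata:
      name: static-ip-demo
      annotations:
        ovn.kubernetes.io/ip_address: "10.16.0.15"
    spec:
      containers:
      - name: app
        image: nginx
    EOF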
Heh, I knew I was going to get in trouble since the CNI could do whatever it likes, but I felt safe due to Pods having mostly random identities. In that moment I had forgotten about StatefulSets, which, I agree with your linked CNI's opinion, would actually be a great candidate for static address assignment.
Sorry for the lapse, and I'll try to be more careful when using "unconditional" to describe pluggable software.
I agree with everything you pointed out. Those points were on my mind too, but I avoided them on purpose for the sake of brevity. It was getting too long-winded and convoluted for my liking. Thanks for adding a separate clarification, though.