Hi HN. I'm interested in resources describing the architecture and implementation of critical control system software. Specifically, I'd like to understand how these systems are, or can be, designed so that loading, patching, and deploying software in these environments can be done with zero downtime.
Are there any books, code, or other resources you would recommend?
I'm trying to understand the actual environment. When you say "deployment", what is changed, where does it start, and how far does it propagate?
For example, would one option for zero downtime be to have replicated (2 or more) "control systems" beyond some "layer" (sorry, it's hard to be precise without knowing more), enforcing synchronicity between them while only one is actually controlling at any time? Then, when you are patching or updating, you freeze on one, update the other, then switch to the other. Not advocating a solution, just trying to understand the situation by throwing out an example to talk around.
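To make the example concrete, here is a rough Python sketch of that freeze/update/switch idea. Everything in it (the `Controller` and `RedundantPair` classes, the toy control law, the version strings) is made up for illustration and not taken from any real control system or library. It assumes the full controller state can be copied between replicas, which real systems may not allow so easily.

```python
from dataclasses import dataclass, field


@dataclass
class Controller:
    """Hypothetical replica of a control system (illustrative only)."""
    name: str
    version: str
    state: dict = field(default_factory=dict)

    def step(self, sensor_value: float) -> float:
        # Toy control law: remember the last input, emit a proportional command.
        self.state["last_input"] = sensor_value
        return 0.5 * sensor_value


class RedundantPair:
    """Two replicas; only `active` drives the output at any time."""

    def __init__(self, a: Controller, b: Controller) -> None:
        self.active, self.standby = a, b

    def step(self, sensor_value: float) -> float:
        command = self.active.step(sensor_value)
        # Keep the standby synchronised with the active replica's state.
        self.standby.state = dict(self.active.state)
        return command

    def upgrade(self, new_version: str) -> None:
        # 1. Patch the standby while the active replica keeps controlling.
        self.standby.version = new_version
        # 2. Resynchronise state just before the handover.
        self.standby.state = dict(self.active.state)
        # 3. Switch roles; the old active stays on the old version
        #    so it can serve as a rollback target.
        self.active, self.standby = self.standby, self.active


pair = RedundantPair(Controller("A", "1.0"), Controller("B", "1.0"))
pair.step(2.0)        # A is controlling
pair.upgrade("1.1")   # B is patched, then takes over
pair.step(4.0)        # B is controlling, on the new version
```

The sketch skips the hard parts, such as making the switch atomic with respect to the control loop and handling a state format that changes between versions.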
I'm not an expert in this at all, but if what I'm talking about above is even close to being on track, I'd recommend this book for starters: https://www.amazon.com/Introduction-Embedded-Systems-Cyber-P...