I kinda think that's the point. It's not exactly a software problem, but an information theory problem: per Lamport's 3f+1 bound, if you need to tolerate 2 Byzantine failures, you need at least 7 machines.
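A quick aside making that arithmetic explicit (the function name is made up; the bound itself is Lamport's lower bound for Byzantine agreement, cited later in the thread):

```python
def min_replicas(f):
    """Minimum machines for Byzantine agreement with f faulty
    participants, per Lamport's 3f+1 lower bound."""
    return 3 * f + 1

print(min_replicas(1))  # → 4
print(min_replicas(2))  # → 7
```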
Many of us in high-assurance systems have witnessed triple modular redundancy (TMR) fail. I started saying 3 out of 5 for that reason; maybe the same goes for the other commenter. Where possible, I also want the computers to be in different locations, using different hardware, with different developers working against the same API.
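A minimal sketch of 3-out-of-5 voting (the names are made up, and a real voter would also handle timeouts and replicas that never answer):

```python
from collections import Counter

def vote(replies, quorum=3):
    """Majority-vote over replica replies; return the value agreed
    by at least `quorum` replicas, else raise."""
    value, count = Counter(replies).most_common(1)[0]
    if count < quorum:
        raise RuntimeError("no 3-of-5 majority among replies")
    return value

# Two of five replicas return wrong values; the majority still wins.
print(vote([42, 42, 7, 42, 13]))  # → 42
```

Note this tolerates two *wrong values* only because a trusted voter sits outside the replica set; without that, you're back to the 3f+1 generals bound.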
Yes, but those aren't Byzantine failures. Byzantine failures present as incorrect values, not simply the absence of a value (as in a total hardware failure).
See the abstract in Lamport's original paper introducing the Byzantine generals problem [0].
We also see a similar issue in error correction -- an introductory undergrad course might teach this via Lagrange interpolation [1]: you need only n+k of the evaluation points in the presence of erasure errors, but n+2k in the general case (where n is the size of the actual message, and k is the maximum number of errors to correct).
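A toy sketch of the erasure case over GF(97) (the field size and message are arbitrary illustration choices): n message symbols define a degree-(n-1) polynomial, so any n of the n+k transmitted evaluations recover it by Lagrange interpolation.

```python
P = 97  # small prime field for the toy example

def poly_eval(coeffs, x):
    """Evaluate the message polynomial at x, mod P."""
    return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P

def lagrange_interpolate(points, x):
    """Evaluate the unique interpolant through `points` at x, mod P."""
    total = 0
    for xi, yi in points:
        num = den = 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # Fermat inverse
    return total

msg = [12, 34, 56]                               # n = 3 message symbols
n, k = len(msg), 2
codeword = [(x, poly_eval(msg, x)) for x in range(1, n + k + 1)]
survivors = codeword[k:]                         # worst case: k symbols erased
recovered = [lagrange_interpolate(survivors, x) for x, _ in codeword[:k]]
assert recovered == [y for _, y in codeword[:k]]  # erased symbols restored
```

The key difference from the Byzantine case: with erasures we know *which* symbols are missing, so n surviving points suffice; with arbitrary corruption we don't, hence the extra k of redundancy.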
Partial hardware failures exist: wrong values start going through the system, a bit flip being the easiest example. NonStop's design has been countering both partial and total hardware failures for a long time now.
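A minimal sketch of why that matters: a partial failure produces a wrong value rather than no value, which only an end-to-end integrity check will catch (the payload and the use of CRC32 here are illustrative choices, not NonStop's actual mechanism):

```python
import zlib

payload = b"transfer 100 to account 7"
stored_crc = zlib.crc32(payload)

# Simulate a partial failure: one flipped bit in an otherwise valid message.
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]

assert corrupted != payload
assert zlib.crc32(corrupted) != stored_crc  # the single-bit flip is detected
```

CRC32 detects all single-bit errors by construction, so the check above always fires for a one-bit flip; a crash-only failure model would never even see this case.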