24 drives. Same model. Likely the same batch. Similar wear. Imagine most of them failing at around the same time, and the rest failing while you're rebuilding the array, because the increased load pushes drives that are already near the same wear point over the edge.
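To put rough numbers on the rebuild risk, here's a back-of-envelope sketch; the AFR values and rebuild window are my own illustrative guesses, not measurements from anyone's array:

```python
# Chance that at least one surviving drive fails during a rebuild, treating
# failures as independent with a given annualized failure rate (AFR).
# All numbers below are assumptions for illustration only.

def p_second_failure(n_surviving: int, afr: float, rebuild_days: float) -> float:
    p_one = afr * (rebuild_days / 365.0)       # per-drive failure probability within the rebuild window
    return 1.0 - (1.0 - p_one) ** n_surviving  # P(at least one of the survivors fails)

# 24-drive array, one dead, 23 survivors, a 3-day rebuild:
print(f"{p_second_failure(23, afr=0.015, rebuild_days=3):.1%}")  # ~0.3% with healthy drives
print(f"{p_second_failure(23, afr=0.20,  rebuild_days=3):.1%}")  # ~3.7% if they're all near end of life
```

And that still assumes independent failures; a batch or firmware defect that hits every drive at the same wear point makes the real number much worse.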
I ordered my NAS drives on Amazon. To avoid getting the same batch (all consecutive serial numbers), I bought one half from amazon.co.uk and the other half from amazon.de. One could also stage the orders over time.
Software bugs might cause that (e.g. a drive fails after exactly 1 billion I/O operations because some counter overflows). But hardware wear probably won't be that consistent.
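Purely to illustrate the overflow idea (a hypothetical, not a documented firmware bug): a fixed-width counter wraps back to zero at its maximum, so anything keyed on it misbehaves at exactly the same operation count on every drive of that model.

```python
MASK32 = 0xFFFFFFFF       # width of a hypothetical 32-bit operation counter in firmware

ops = 0xFFFFFFFF          # one increment away from the maximum
ops = (ops + 1) & MASK32  # C-style unsigned arithmetic: the counter silently wraps
print(ops)                # 0 -- every drive of the same model hits this at the same count
```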
I've seen this happen to a friend. Back in the noughties they built a home NAS similar to the one in the article, using fewer (smaller) drives in a RAID5 configuration. It lasted until one drive died and a second followed it during the rebuild. Granted, it wasn't using ZFS, there was no regular scrubbing, drive failure rates in the 00s were probably different, and they didn't power it down when not in use. The point is the correlated failure, not the precise cause.
Usual disclaimers, n=1, rando on the internet, etc.
You’re far better off having two RAIDs: one as a daily backup target for progressive snapshots that only powers on occasionally to back up and stays off the rest of the time.
I don’t understand how it is better to have an occasional (= significantly time-delayed) backup. You’ll lose all changes since the last backup. And you’re doubling the cost, compared to just one extra hard drive for RAID 6.
Really important stuff is already being backed up to a second location anyway.
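Rough arithmetic on the cost point, with made-up drive counts and prices just to make the comparison concrete:

```python
# Illustrative numbers only (my assumptions, not from the thread).
drive_cost  = 200      # currency units per drive
data_drives = 8        # drives' worth of usable capacity needed

raid5           = (data_drives + 1) * drive_cost       # one parity drive
raid6           = (data_drives + 2) * drive_cost       # two parity drives: "one extra hard drive"
two_raid_arrays = 2 * (data_drives + 1) * drive_cost   # primary plus an occasionally-on backup RAID

print(raid5, raid6, two_raid_arrays)   # 1800 2000 3600
```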
Reliable storage is tricky.