For example, with 4 drives where any 2 are allowed to fail, you'd need RAID 6 or something with even more redundancy. To be honest, I'm not aware of a setup like that which can sustain the intensive I/O that high-performance SSDs have to deliver to keep up with the Ethereum chain head. Maybe a super-powered NAS could, but I'm skeptical. At the very least, it would be a prohibitively expensive system.
I guess you could rsync the primary disk to its mirror backup every 24 hours, so that if disk #1 fails, disk #2 would be at most 24 hours behind the chain head, and about 12 hours behind on average.
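To make the staleness arithmetic concrete, here is a minimal Python sketch. It assumes disk #1 fails at a uniformly random moment between two syncs and ignores how long the rsync itself takes; the function name `staleness_hours` is made up for illustration.

```python
def staleness_hours(sync_interval_hours):
    """Return (average, worst-case) lag of the backup disk, in hours.

    Assumption: the primary fails at a uniformly random time between
    two syncs, and the sync itself is treated as instantaneous.
    """
    if sync_interval_hours <= 0:
        raise ValueError("sync interval must be positive")
    average = sync_interval_hours / 2   # midpoint of the interval
    worst_case = sync_interval_hours    # failure just before the next sync
    return average, worst_case


if __name__ == "__main__":
    avg, worst = staleness_hours(24)
    print(f"average lag: {avg} h, worst case: {worst} h")
```

Halving the sync interval halves both numbers, but it also doubles how often disk #1 takes the extra read load.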
Practically speaking, that adds extra read load to disk #1 whenever rsync is active. It would not only shorten the disk's lifespan even further, but could also throttle the SSD on top of the I/O load from node operations, unless you stopped the node for the duration of the rsync and missed a bunch of attestations each day.
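For a rough sense of what "a bunch of attestations" means, here is a small Python sketch. It relies on the fact that each active validator attests once per epoch (32 slots of 12 seconds, about 6.4 minutes). The 30-minute rsync window is a made-up figure for illustration, and the estimate ignores the time the node needs to catch up after restarting.

```python
SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
EPOCH_SECONDS = SECONDS_PER_SLOT * SLOTS_PER_EPOCH  # 384 s, about 6.4 minutes


def missed_attestations(downtime_minutes, validators=1):
    """Estimate attestations missed while the node is stopped.

    Each active validator attests once per epoch, so the expected number
    of missed attestations is roughly (downtime / epoch length) per validator.
    """
    if downtime_minutes < 0 or validators < 0:
        raise ValueError("inputs must be non-negative")
    epochs_offline = downtime_minutes * 60 / EPOCH_SECONDS
    return epochs_offline * validators


if __name__ == "__main__":
    # Hypothetical: the node is stopped for a 30-minute rsync once a day.
    per_day = missed_attestations(30)
    print(f"about {per_day:.1f} missed attestations per validator per day")
```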
Alternatively, you could run a second node (RPC only, not validating: virtually everyone who ever got slashed made the mistake of bringing a validating backup node online concurrently). Then, if node #1 fails, all you would need to do is import your keys into node #2, which is a short SCP command away. But that means buying and maintaining twice as much computing hardware (CapEx), consuming twice as much bandwidth and power, etc.
Honestly, I don't think any of this is worth it just to mitigate the few hours, or even few days, of downtime that happen every few years when a node dies. The missed attestation penalties are quite small.