Thomas pfp
Thomas
@aviationdoctor.eth
Woke up to a flurry of failed attestations from my solo validator. The NUC’s SSD finally gave up the ghost after three years of intense I/O. Not the most pleasant start to a Sunday, but I was expecting it at some point and had a spare SSD on hand. Swapped the disks, flashed Ubuntu Server, clean-installed eth-docker, and less than one hour of tinkering later, I am now syncing from checkpoint, which should complete by tomorrow. Running a solo validator does take some effort, and the rewards of ~2.7% APY aren’t much, but putting the de- in decentralization is priceless
3 replies
2 recasts
45 reactions
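The rebuild Thomas describes (flash Ubuntu Server, clean-install eth-docker, sync from a checkpoint) roughly maps onto eth-docker's own `ethd` helper. This is a hedged sketch, not his exact steps: the clone path is an assumption, and `./ethd config` is an interactive wizard where you pick clients and a checkpoint sync URL.

```shell
# Sketch of a clean eth-docker rebuild on a fresh Ubuntu Server install.
# Assumed paths; run as a user with sudo rights.
git clone https://github.com/eth-educators/eth-docker.git ~/eth-docker
cd ~/eth-docker
./ethd install            # installs Docker and other prerequisites
./ethd config             # interactive; choose clients and a checkpoint sync URL
./ethd up                 # start the node stack
./ethd logs -f consensus  # watch sync progress
```

With checkpoint sync the consensus client starts from a recent finalized state instead of genesis, so most of the remaining wait is the execution client catching up — consistent with "should complete by tomorrow."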

InsideTheSim 🎩🍪 pfp
InsideTheSim 🎩🍪
@insidethesim.eth
How over the top would it be to have a failover setup where you have, say, 4 drives and any 2 can fail? Idea would be to have time to replace without downtime.
1 reply
0 recast
3 reactions

Thomas pfp
Thomas
@aviationdoctor.eth
For 4 drives tolerating 2 failures, you'd need a RAID 6 or higher setup. I'm not aware of that keeping up with the intensive I/O and high-performance SSDs required to stay at the Ethereum chain head, to be honest. Maybe with a super-powered NAS, but I'm skeptical; at the very least it would be a prohibitively expensive system.

I guess you could rsync the primary disk to its mirror backup every 24 hours, so that disk #2 would be no more than 12 hours behind the chain head on average if disk #1 fails. Practically speaking, that adds read load to disk #1 whenever rsync is active, which not only shortens its lifespan even further but could throttle the SSD given the I/O load from node operations, unless you stopped the node for the duration of the rsync and missed a bunch of attestations each day.

Alternatively, you could run a second node (RPC only, not validating; virtually everyone who ever got slashed made the mistake of bringing a validating backup node online concurrently). Then, if node #1 fails, all you need to do is import your keys into node #2, which is a short SCP command away. But that means buying and maintaining twice as much computing CapEx, twice as much bandwidth and power consumed, etc.

Honestly, I don't think any of this is worth it to mitigate the few hours, or even few days, of downtime that happens every few years when a node dies. The missed attestation penalties are quite small.
2 replies
0 recast
3 reactions
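The mirror idea and the "penalties are quite small" claim above can both be sketched in shell. Temp demo directories stand in for the real node data dir (a deliberate assumption so the sketch is runnable anywhere), and the 32 ETH stake at ~2.7% APY are just the thread's own numbers:

```shell
#!/bin/sh
# Nightly-mirror sketch: demo dirs stand in for the node's data directory.
SRC=$(mktemp -d)
DST=$(mktemp -d)
echo "chaindata" > "$SRC/db.bin"

# -a preserves attributes, --delete keeps the mirror exact;
# fall back to cp if rsync isn't installed.
if command -v rsync >/dev/null 2>&1; then
    rsync -a --delete "$SRC"/ "$DST"/
else
    cp -a "$SRC"/. "$DST"/
fi
ls "$DST"

# Back-of-envelope downtime cost, using the thread's numbers
# (32 ETH stake, ~2.7% APY, one day offline).
awk 'BEGIN { printf "one day offline ~ %.4f ETH in missed rewards\n", 32*0.027/365 }'
```

In practice you'd run the rsync from cron, either with the node stopped or accepting a possibly inconsistent copy. The estimate is the point: a day of downtime forgoes on the order of thousandths of an ETH in rewards, with the offline penalty being of roughly similar magnitude, which is why the thread concludes the mitigation isn't worth it.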