TL;DR
Ethereum is not wiping the chain clean, but ordinary clients will soon stop storing and serving very old block bodies and receipts.
• "Drop-Day" (first phase) is set for 1 May 2025 — all Execution-Layer clients may prune everything before the Merge (~155 GB).
• A public test on Sepolia began 1 June 2025; mainnet activation is targeted for the Pectra hard-fork window (May–June 2025).
• The rule is formalised in EIP-4444 ("History Expiry") and is part of the broader "Purge" clean-up stage of the roadmap.
Old headers (needed for consensus proofs) stay forever; full archival copies and the emerging Portal Network will still let anyone fetch ancient data — just no guarantee that every full node has it.
⸻
1. What exactly is being pruned?
| Component | Before 1 May 2025 | After 1 May 2025 (phase 1) | Future target (phase 2+) |
|---|---|---|---|
| Block headers | Always kept | Kept | Kept |
| Block bodies & receipts | Always kept | Everything before the Merge (≈ block 15,537,394) may be dropped | Rolling window (e.g. last 1–2 years) as specified in later EIPs |
| State (accounts/storage) | Snapshot sync only keeps recent state; archival nodes keep everything | Same | Moving toward stateless Verkle proofs + snapshots |

Headers are <15 GB total; the heavy pieces are bodies + receipts (>500 GB). The first drop saves ~155 GB on a default geth full node.
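To see where a given node stands, here is a minimal probe, assuming web3.py (pip install web3) and an execution client on the default local RPC port; clients differ in how they signal a pruned block, so the error handling is deliberately broad.

```python
# Minimal sketch: does this node still serve a given block body?
# Assumes a local EL node at http://localhost:8545 and web3.py;
# the Merge boundary block number is taken from the table above.
from web3 import Web3

MERGE_BLOCK = 15_537_394
w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))

def has_body(number: int) -> bool:
    """True if the node returns the full block body (txs included)."""
    try:
        w3.eth.get_block(number, full_transactions=True)
        return True
    except Exception:  # pruned nodes error or return null, per client
        return False

print("pre-Merge body: ", has_body(MERGE_BLOCK - 1_000_000))
print("post-Merge body:", has_body(MERGE_BLOCK + 1_000_000))
```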
2. Why do core devs want this?
• Disk cost & sync time: Archive nodes exceed 15 TB; ordinary full nodes grow ~120 GB/year.
• Stateless roadmap: Verkle tries and portal-style retrieval require that most nodes stop acting as historical data warehouses.
• Bug-surface reduction: fewer legacy code paths; a lighter DB → faster state reads.
3. Will data really disappear?
• No — just not replicated everywhere. Clients may delete; some will run with --history=archive flags or sell "warm-storage" services (a local-first, archive-fallback read pattern is sketched after this list).
• Portal Network: a p2p protocol (uTP transport + content-addressed chunks) that lets light clients ask the network for missing data; a request is forwarded through the overlay until a peer that actually stores the content answers.
• Research fallback: historians, chain-indexers, and institutions can (and already do) snapshot the chain to IPFS, Filecoin, AWS Glacier, etc.
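Putting the first and last bullets into practice, application code can adopt a local-first, archive-fallback read path. A minimal sketch, assuming web3.py; the second URL is a placeholder for whichever archive or warm-storage provider you choose.

```python
# Sketch: try the local (possibly pruned) node first, then fall back
# to an archive endpoint. URLs are placeholders, not real services.
from web3 import Web3

ENDPOINTS = [
    "http://localhost:8545",            # your pruned full node
    "https://archive.example.com/rpc",  # hypothetical archive/warm-storage tier
]

def get_block_anywhere(number: int):
    """Return the first successful block-body response across endpoints."""
    for url in ENDPOINTS:
        w3 = Web3(Web3.HTTPProvider(url, request_kwargs={"timeout": 10}))
        try:
            return w3.eth.get_block(number, full_transactions=True)
        except Exception:
            continue  # pruned, unreachable, or erroring; try the next one
    raise RuntimeError(f"no configured endpoint serves block {number}")
```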
4. Trade-offs & failure modes
| Benefit | Hidden assumption | Potential failure |
|---|---|---|
| Smaller disks → more home validators | Enough archival nodes exist to serve history on demand | "Free-rider" problem: nobody volunteers; retrieval latency spikes |
| Less code → fewer consensus bugs | Portal routing works at scale | Sybil spam or routing-table eclipse attacks deny history |
| Faster sync & cheaper infra for L2 sequencers | Dapps rarely need receipts older than a year | Forensics, tax audits, or long-running games may break |
Mitigations
1. Economic incentives: Portal credits / retrieval fees.
2. Redundancy: Encourage each client team + at least N paid providers to keep full archives.
3. App-level caching: Indexers export Merkle proofs at time-of-trade; no need to reconstruct later.
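Mitigation 3 is the cheapest to adopt today: capture receipts the moment your app sees them, while every node still serves them. A sketch, assuming web3.py; the on-disk layout is an illustrative choice, not a standard.

```python
# Sketch: persist receipts at event time so deep history is never needed.
# Assumes web3.py and a node at the default local RPC port.
from pathlib import Path
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))
CACHE = Path("receipt_cache")
CACHE.mkdir(exist_ok=True)

def archive_receipt(tx_hash: str) -> None:
    """Fetch a receipt while it is still served, and store it locally."""
    receipt = w3.eth.get_transaction_receipt(tx_hash)
    # Web3.to_json serialises HexBytes and AttributeDict fields cleanly.
    (CACHE / f"{tx_hash}.json").write_text(Web3.to_json(receipt))
```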
5. How you can prepare
• Running a validator? Nothing to do; beacon-chain duties are unaffected.
• Building an explorer / analytics stack?
  – Spin up an archive EL node behind your indexer before 1 May 2025 and keep it immutable.
  – Or switch to hosted endpoints (e.g. the Infura "archive" tier), but budget for service risk.
• Need only recent state? After Pectra, a standard geth sync will be ~85 GB lighter and 20–30 % faster.
• Curious hacker? Join the Sepolia history-expiry test and time how long portal retrievals take vs. archive DB reads; a starter probe follows below.
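For that last bullet, a good first experiment is locating the earliest block body your node still serves. A sketch, assuming web3.py, a node on the default port, and that expiry removes a contiguous prefix of history (which is what makes binary search valid).

```python
# Sketch: binary-search the earliest block body this node still serves.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))

def serves_body(n: int) -> bool:
    try:
        w3.eth.get_block(n, full_transactions=True)
        return True
    except Exception:
        return False

lo, hi = 0, w3.eth.block_number  # invariant: hi serves a body, lo may not
while lo < hi:
    mid = (lo + hi) // 2
    if serves_body(mid):
        hi = mid       # mid is served; the cutoff is at mid or earlier
    else:
        lo = mid + 1   # mid is pruned; the cutoff is after mid
print("earliest retrievable block body:", lo)
```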
6. Alternative framings / approaches
• Sliding-window pruning (see EIP-6943) — rather than a big-bang "drop pre-Merge", keep a moving two-year window so history fades gradually; this gives apps time to adapt.
• Compression instead of deletion — store bodies as canonical differences against chain snapshots (Zstandard + delta encoding), yielding a 12× size cut while preserving local availability; harder to implement but avoids reliance on external archivists (a toy version is sketched after this list).
• Rent-based storage market — make history retrieval a paid service inside the protocol (similar to EigenLayer's restaking for DA guarantees); it doubles as an economic incentive for preservation.
All three ideas are still research-stage.
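For a feel of the second idea, here is a toy of the compression approach using the zstandard package (pip install zstandard). It stands in for real delta encoding by training a shared dictionary over sample bodies, which captures cross-block redundancy; the 12× figure above is the post's claim, not something this toy demonstrates.

```python
# Toy sketch of "compression instead of deletion": a shared zstd
# dictionary approximates delta-coding bodies against a snapshot.
import zstandard as zstd

def compress_bodies(bodies: list[bytes]) -> list[bytes]:
    # train_dictionary wants a large, varied sample set to work well.
    dictionary = zstd.train_dictionary(112_640, bodies)  # ~110 KiB dictionary
    cctx = zstd.ZstdCompressor(level=19, dict_data=dictionary)
    return [cctx.compress(body) for body in bodies]
```

Decompression needs the same dictionary, so it would have to ship alongside the snapshot it was trained on.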
7. Testable hypotheses for the roadmap
| Hypothesis | Metric | How to falsify after May 2025 |
|---|---|---|
| Portal retrieval median ≤ 2 s for a pre-Merge body | Measure 100 random look-ups from 10 geo-locations | If p95 latency > 5 s |
| Full-node disk usage falls by ≥ 120 GB | Compare du on ~/.ethereum/geth/chaindata pre- and post-fork | If reduction < 80 GB |
| Mainnet reorg depth stays ≤ 2 blocks despite the smaller DB | Track reorg metrics on relays | If deep reorgs (>5 blocks) spike post-fork |
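The first row is easy to script from one vantage point. A sketch, assuming web3.py; point it at a portal gateway or archive endpoint of your choice and repeat from several regions to approximate the 10-geo-location metric.

```python
# Sketch: p50/p95 latency for random pre-Merge block-body look-ups.
import random
import statistics
import time
from web3 import Web3

MERGE_BLOCK = 15_537_394
w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # or a portal gateway

latencies, misses = [], 0
for _ in range(100):
    n = random.randrange(1, MERGE_BLOCK)
    t0 = time.perf_counter()
    try:
        w3.eth.get_block(n, full_transactions=True)
        latencies.append(time.perf_counter() - t0)
    except Exception:
        misses += 1  # unretrievable blocks count against the hypothesis

latencies.sort()
if latencies:
    p50 = statistics.median(latencies)
    p95 = latencies[int(0.95 * len(latencies))]
    print(f"p50 {p50:.2f}s  p95 {p95:.2f}s  misses {misses}/100")
```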
⸻
8. Bottom line
Yes, Ethereum will soon let most nodes "forget" deep history, but nothing vital to consensus is being erased, and the data will still be recoverable — just not stored everywhere by default.
If your workflow depends on ancient receipts, start archiving now or plan to query a portal-compatible archive service after Pectra.
⸻
Questions on tooling, portal setup, or archival strategies? Feel free to drill down — I can share config snippets or alternative sync modes.