Base Mainnet Flashblocks pending state lag while canonical latest stayed healthy
Date: 2026-06-11
Summary
We observed a self-hosted Base mainnet Reth node where the canonical RPC path recovered and stayed healthy, but the Flashblocks path remained badly stale until a node restart.
The key symptom was:
- eth_getBlockByNumber("latest") and newHeads were current.
- eth_subscribe ["newFlashblocks"] was roughly 450 blocks behind newHeads.
- Reth metrics reported reth_reth_flashblocks_pending_snapshot_height=47205334, matching the stale pending block height seen by downstream RPC reads.
- Restarting the Base node immediately restored newFlashblocks and the pending snapshot to the canonical head.
This looks like Flashblocks pending-state production/subscription lag, not a full node crash, OOM, or canonical sync failure.
Node setup
The node was running Base mainnet Reth with Flashblocks enabled:
- Base Reth tag: v1.0.0
- Base repo commit: 47b8b3690d3ef34530f8f90441bc733df01c1dda
- Execution command included: --websocket-url=wss://mainnet.flashblocks.base.org/ws
- The command did not include --engine.cross-block-cache-size.
- Containers had been up for 7 days before restart. OOMKilled=false, RestartCount=0.
At the pre-restart snapshot, the machine was not under memory pressure:
- Host memory: 61 GiB total, 45 GiB available.
- Swap used: 3.6 GiB / 30 GiB.
- Execution container: about 20.06 GiB / 61.91 GiB memory.
- Execution process: VmRSS=49935668 kB, VmSwap=2102636 kB, Threads=409.
Timeline UTC
17:28-18:02: Flashblocks upstream reconnect/reorder/reorg signatures
In the Reth execution logs during 2026-06-11T17:28Z..18:02Z, we saw:
- No pong response from upstream, reconnecting: 7 times.
- WebSocket connection established: 7 times.
- Received non-zero index Flashblock: 2 times.
- reorg detected: 17 times.
Representative lines:
2026-06-11T17:28:52.425095Z WARN No pong response from upstream, reconnecting
2026-06-11T17:28:54.203726Z INFO WebSocket connection established
2026-06-11T17:28:56.525666Z ERROR Received non-zero index Flashblock for new block
2026-06-11T17:49:25.964924Z WARN No pong response from upstream, reconnecting
2026-06-11T17:49:27.741799Z INFO WebSocket connection established
We did not observe these signatures in that window:
- State root task timed out: 0
- could not process Flashblock: 0
- long read transaction timeout: 0
- OOM signature: 0
- exact missing canonical error: 0
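The counts above can be reproduced with a simple substring tally over the execution log. The following is a minimal sketch, not the tooling we actually used: the function name `count_signatures` is hypothetical, the signature strings are copied from the lists above, and the sample input is the first three representative lines from this report.

```python
# Substrings copied from the signature lists in this report.
SIGNATURES = [
    "No pong response from upstream, reconnecting",
    "WebSocket connection established",
    "Received non-zero index Flashblock",
    "reorg detected",
    "State root task timed out",
    "could not process Flashblock",
]


def count_signatures(lines, signatures=SIGNATURES):
    """Count how many log lines contain each signature (0 if never seen)."""
    counts = {sig: 0 for sig in signatures}
    for line in lines:
        for sig in signatures:
            if sig in line:
                counts[sig] += 1
    return counts


# Sample input: three of the representative lines quoted above.
sample_lines = [
    "2026-06-11T17:28:52.425095Z WARN No pong response from upstream, reconnecting",
    "2026-06-11T17:28:54.203726Z INFO WebSocket connection established",
    "2026-06-11T17:28:56.525666Z ERROR Received non-zero index Flashblock for new block",
]
counts = count_signatures(sample_lines)
for sig, n in counts.items():
    print(f"{sig}: {n}")
```

Reporting explicit zeros matters here: the absence of state-root timeout and missing-canonical signatures is part of what separates this incident from the full-stall reports.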
18:00-18:05: canonical RPC healthy, Flashblocks stale
At 2026-06-11T18:01:12Z..18:01:45Z, repeated eth_getBlockByNumber("latest") calls were current:
- First sample: block 47205762, timestamp 2026-06-11T18:01:11Z, age 1s.
- Last sample: block 47205776, timestamp 2026-06-11T18:01:39Z, age 6s.
- All samples were 0s..6s old.
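The "age" figures above are the difference between the sample time and the block's `timestamp` field, which JSON-RPC returns as a hex-encoded Unix time. A minimal sketch of that calculation follows; `block_age_seconds` is a hypothetical helper, and the sample time of 18:01:12Z is an assumption inferred from the reported window start and the 1s age.

```python
from datetime import datetime, timezone


def block_age_seconds(timestamp_hex, sampled_at):
    """Age of a block at sample time, given its hex `timestamp` field."""
    block_time = datetime.fromtimestamp(int(timestamp_hex, 16), tz=timezone.utc)
    return (sampled_at - block_time).total_seconds()


# First pre-restart sample: block timestamp 2026-06-11T18:01:11Z.
block_time = datetime(2026, 6, 11, 18, 1, 11, tzinfo=timezone.utc)
timestamp_hex = hex(int(block_time.timestamp()))  # shape of the RPC field

# Assumed read time, inferred from the window start in this report.
sampled_at = datetime(2026, 6, 11, 18, 1, 12, tzinfo=timezone.utc)

age = block_age_seconds(timestamp_hex, sampled_at)
print(f"age={age:.0f}s")
```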
Around the same time, Reth Flashblocks metrics showed the pending snapshot was stale:
reth_reth_flashblocks_upstream_messages 4817121
reth_reth_flashblocks_reconnect_attempts 418
reth_reth_flashblocks_upstream_errors 30
reth_reth_flashblocks_unexpected_block_order 227
reth_reth_flashblocks_block_processing_error 208
reth_reth_flashblocks_pending_clear_reorg 824
reth_reth_flashblocks_pending_clear_catchup 50141
reth_reth_flashblocks_pending_snapshot_height 47205334
reth_reth_flashblocks_pending_snapshot_fb_index 10
reth_sync_block_validation_state_root_task_timeout_total 0
reth_sync_block_validation_state_root_parallel_fallback_total 0
reth_sync_block_validation_state_root_task_fallback_success_total 0
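One way to turn these metrics into a staleness signal is to compare the pending snapshot height against the canonical latest block. The sketch below assumes unlabeled `name value` lines as shown above; `parse_metrics`, `pending_snapshot_gap`, and the `MAX_GAP_BLOCKS` threshold are hypothetical names and values, not part of Reth.

```python
MAX_GAP_BLOCKS = 10  # hypothetical alert threshold, not a Reth default


def parse_metrics(text):
    """Parse unlabeled 'name value' metric lines into a dict of floats."""
    metrics = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip blanks, comments, and anything that is not 'name value'.
        if len(parts) != 2 or parts[0].startswith("#"):
            continue
        metrics[parts[0]] = float(parts[1])
    return metrics


def pending_snapshot_gap(metrics, latest_block):
    """Blocks by which the pending snapshot trails the canonical latest."""
    height = int(metrics["reth_reth_flashblocks_pending_snapshot_height"])
    return latest_block - height


# Values from the pre-restart snapshot in this report.
sample = """\
reth_reth_flashblocks_pending_clear_catchup 50141
reth_reth_flashblocks_pending_snapshot_height 47205334
reth_reth_flashblocks_pending_snapshot_fb_index 10
"""
metrics = parse_metrics(sample)
gap = pending_snapshot_gap(metrics, latest_block=47205776)
print(f"gap={gap} stale={gap > MAX_GAP_BLOCKS}")
```

With the last pre-restart latest sample (47205776), the snapshot trails by 442 blocks, in line with the roughly 450-block lag seen on the subscription path.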
A local WebSocket probe to the node around 2026-06-11T18:04Z showed:
newHeads count=8 unique_blocks=8 first=47205873 last=47205880
newFlashblocks count=68 unique_blocks=7 first=47205422 last=47205428
errors=[]
So newFlashblocks was about 450 blocks behind newHeads, while newHeads and HTTP latest were current.
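For reference, the probe summary and the lag figure can be derived from the block numbers seen on each subscription. This is a sketch of the arithmetic only, not the probe itself; `summarize` and `subscription_lag` are hypothetical names, and the newHeads list is reconstructed from the reported first, last, and unique-block counts.

```python
def summarize(name, block_numbers):
    """Format a probe summary line from the block numbers seen, in order."""
    return (
        f"{name} count={len(block_numbers)} "
        f"unique_blocks={len(set(block_numbers))} "
        f"first={block_numbers[0]} last={block_numbers[-1]}"
    )


def subscription_lag(last_head, last_flashblock_block):
    """Blocks by which newFlashblocks trails newHeads (<= 0 means caught up)."""
    return last_head - last_flashblock_block


# Eight unique heads from 47205873 to 47205880 must be consecutive.
heads = list(range(47205873, 47205881))
line = summarize("newHeads", heads)
print(line)

pre_restart = subscription_lag(47205880, 47205428)
post_restart = subscription_lag(47205989, 47205991)
print(f"pre_restart_lag={pre_restart} post_restart_lag={post_restart}")
```

A healthy Flashblocks stream is expected to sit at or slightly ahead of the canonical head, which is why the post-restart lag comes out at or below zero.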
18:06-18:09: restart cleared the lag
We restarted the node at 2026-06-11T18:06Z.
After restart, HTTP latest stayed current:
- First sample: block 47205939, timestamp 2026-06-11T18:07:05Z, age 5s.
- Last sample: block 47205954, timestamp 2026-06-11T18:07:35Z, age 8s.
Reth metrics around 18:08Z showed:
reth_reth_flashblocks_upstream_messages 832
reth_reth_flashblocks_pending_snapshot_height 47205983
reth_sync_block_validation_state_root_task_timeout_total 0
reth_sync_block_validation_state_root_parallel_fallback_total 0
reth_sync_block_validation_state_root_task_fallback_success_total 0
The WebSocket probe around 18:08Z showed Flashblocks caught up:
newHeads count=8 unique_blocks=8 first=47205982 last=47205989
newFlashblocks count=81 unique_blocks=9 first=47205983 last=47205991
errors=[]
Downstream impact
The node serves a latency-sensitive application that uses the official Flashblocks paths:
- eth_subscribe ["pendingLogs", filter] for ERC20 Transfer logs.
- eth_subscribe ["newFlashblocks"] probes.
- eth_getBlockByNumber("pending") / BlockId::pending() through live read paths such as eth_call, eth_estimateGas, debug_traceCall, eth_getTransactionCount, and eth_getBalance.
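To make the request shapes concrete, the paths listed above correspond to JSON-RPC payloads along the following lines. This is an illustrative sketch: `rpc_request` is a hypothetical helper, the filter object is assumed to follow the standard log-filter shape, and the topic is the well-known ERC20 `Transfer(address,address,uint256)` event signature hash rather than our production filter.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def rpc_request(method, params):
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# keccak256("Transfer(address,address,uint256)")
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

payloads = [
    rpc_request("eth_subscribe", ["newFlashblocks"]),
    # Assumed filter shape: standard log filter with a topics array.
    rpc_request("eth_subscribe", ["pendingLogs", {"topics": [TRANSFER_TOPIC]}]),
    # Second parameter False requests transaction hashes only.
    rpc_request("eth_getBlockByNumber", ["pending", False]),
]
for payload in payloads:
    print(json.dumps(payload))
```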
During this incident, the application saw a split-brain view of the same Base node:
- Ordinary block subscription / canonical state had advanced to blocks such as 47205325 and later 47205760.
- Base live/pending reads still returned stale heights such as 47204902 and 47205334.
- The stale 47205334 matched reth_reth_flashblocks_pending_snapshot_height before restart.
One concrete downstream failure:
- A Base sell preparation path repeatedly failed before transaction submission because exit quote calldata simulation became unavailable and the transaction actor rejected actions where current_block was far ahead of the pending-derived live_block.
- Before restart, a manual rescue attempt found a route but was rejected with current_block=47205760, live_block=47205334.
- After restart, the same class of rescue action passed preparation and confirmed shortly afterward. We are omitting transaction identifiers from this public report.
This does not prove eth_sendRawTransaction itself was broken. The failure happened earlier: stale Flashblocks pending state polluted downstream quote/simulation/readiness logic while canonical latest was already healthy.
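The rejection described above amounts to a freshness guard on the pending-derived block height. A minimal sketch of such a guard follows; `live_state_is_fresh` and the `MAX_LIVE_BLOCK_LAG` value are hypothetical and do not reflect our application's actual code or threshold.

```python
MAX_LIVE_BLOCK_LAG = 5  # hypothetical tolerance, in blocks


def live_state_is_fresh(current_block, live_block, max_lag=MAX_LIVE_BLOCK_LAG):
    """True if the pending-derived height is within max_lag of canonical."""
    return current_block - live_block <= max_lag


# Values from the rejected manual rescue attempt in this report.
current_block, live_block = 47205760, 47205334
lag = current_block - live_block
fresh = live_state_is_fresh(current_block, live_block)
print(f"lag={lag} fresh={fresh}")
```

A guard like this fails closed: it stops actions from being prepared against stale pending state, at the cost of blocking them entirely until the Flashblocks path catches up.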
Working hypothesis
Our current hypothesis is:
- A Flashblocks upstream reconnect/reorder/reorg sequence caused pending-state production to fall behind.
- Canonical sync and ordinary newHeads recovered, but the Flashblocks pending snapshot and newFlashblocks subscription did not catch up.
- Downstream consumers that actively subscribe to pendingLogs and query pending state can observe the stale Flashblocks path even when operators checking only latest / newHeads see a healthy node.
- Restarting the node clears the stale Flashblocks pending state.
We do not yet know whether high downstream pendingLogs/pending read load merely exposed the condition, amplified it, or is required to trigger it.
Similar public issues we found
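These issues look related or adjacent:
- base/base#2675: Base v8.0.0 on mainnet stops synching, including No pong response from upstream, Received non-zero index Flashblock, could not process Flashblock ... missing canonical header, and OOM-risk old-state-root signatures.
- base/base#2526: archive node stalls after Flashblocks disconnect / timeout sequence, with No pong response, canonical head plateau, state-root timeout, cache mutex blocking, and missing canonical header.
- base/base#2896: Flashblocks rebuilt twice / parent hash mismatch during possible reorg or sequencer failover.
- base/base#694: non-sequential Flashblocks.
- base/base#781: Flashblocks processor panic on empty Flashblocks during reorg/depth-limit reconciliation.
- base/base#613: documents official Flashblocks pendingLogs and newFlashblocks subscriptions.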
Our incident differs from the full-stall reports because canonical latest / newHeads were healthy at the final pre-restart sampling point, while the Flashblocks path remained about 450 blocks behind.
Questions for the Base team
- Is it expected that newFlashblocks and reth_reth_flashblocks_pending_snapshot_height can remain hundreds of blocks behind while newHeads / latest are current?
- Is there a known condition where Flashblocks pending-state production stops catching up after upstream reconnect/reorg/order errors, without causing a full canonical sync stall?
- Are pendingLogs subscribers or high-volume pending state reads known to affect Flashblocks pending snapshot catch-up?
- Is there a health metric or RPC invariant we should monitor to distinguish:
  - canonical chain unhealthy,
  - Flashblocks upstream disconnected,
  - Flashblocks pending snapshot stale,
  - pendingLogs consumer lag?
- Are there recommended Reth flags for Flashblocks-heavy RPC nodes, especially around --engine.cross-block-cache-size or RPC cache settings?
- Is restart currently the expected recovery action when reth_reth_flashblocks_pending_snapshot_height remains stale while canonical latest is healthy?
Raw evidence retained locally
We retained:
- Reth execution logs around 2026-06-11T17:28Z..18:09Z.
- Pre-restart Docker/container/resource snapshots.
- Pre- and post-restart Reth metrics.
- WebSocket probe outputs for newHeads and newFlashblocks.
- Downstream application logs showing stale pending-derived live_block values matching Reth metrics.