Skip to content

Conversation

@mediocregopher
Copy link
Collaborator

@mediocregopher mediocregopher commented Oct 14, 2025

I observed the following behavior on a node which was started without a --debug.tip, after having manually dropped a stage:

2025-10-14T12:17:32.267719Z  INFO Preparing stage pipeline_stages=1/16 stage=Era checkpoint=23575428 target=None
2025-10-14T12:17:32.267733Z  INFO Executing stage pipeline_stages=1/16 stage=Era checkpoint=23575428 target=None
2025-10-14T12:17:32.268088Z  INFO Finished stage pipeline_stages=1/16 stage=Era checkpoint=0 target=None

As shown, the Era stage was setting its checkpoint to zero. This was occurring because when the target has already been reached Era immediately returns ready from poll_execute_ready:

if input.target_reached() || self.item.is_some() {
return Poll::Ready(Ok(()));
}

(target_reached assumes a zero target block number if the target field is None.)

At this point self.item is None. When execute is then called Era has no work to do, but was returning the target (defaulting again to zero) rather than previous checkpoint:

If a stage was then manually dropped again then both Era (the first stage) and Finish (the last stage) would be zero. This check in check_pipeline_consistency would never hit, and no pipeline sync would be reran:

// If the checkpoint of any stage is less than the checkpoint of the first stage,
// retrieve and return the block hash of the latest header and use it as the target.
if stage_checkpoint < first_stage_checkpoint {
debug!(
target: "consensus::engine",
first_stage_checkpoint,
inconsistent_stage_id = %stage_id,
inconsistent_stage_checkpoint = stage_checkpoint,
"Pipeline sync progress is inconsistent"
);
return self.blockchain_db().block_hash(first_stage_checkpoint);
}
}

Since the Finish stage is used to determine current block height generally, the node would then assume it has no blocks.

2025-10-14T13:59:36.340411Z  INFO Status connected_peers=0 latest_block=0

@mediocregopher mediocregopher changed the title fix: Set Era pipeline stage to last checkpoint when there is no work fix: Set Era pipeline stage to last checkpoint when there is no target Oct 14, 2025
@mediocregopher mediocregopher marked this pull request as ready for review October 14, 2025 14:21
@mediocregopher mediocregopher added C-bug An unexpected or incorrect behavior A-staged-sync Related to staged sync (pipelines and stages) labels Oct 14, 2025
@yongkangc
Copy link
Member

^ This has been making it really difficult to test the new trie changesets against main; everytime I would drop the MerkleChangeSets stage the node was getting into a weird state

do you mean that after this fix, it would be much easier to test the trieset changes against main? @mediocregopher

Copy link
Collaborator

@mattsse mattsse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice find

};

Ok(ExecOutput { checkpoint: StageCheckpoint::new(height), done: height == input.target() })
Ok(ExecOutput { checkpoint: StageCheckpoint::new(height), done: height >= input.target() })
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense

Comment on lines +214 to +219
// It's possible for a pipeline sync to be executed with a None target, e.g. after a
// stage was manually dropped, and `reth node` is then called without a `--debug.tip`.
//
// In this case we don't want to simply default to zero, as that would overwrite the
// previously stored checkpoint block number. Instead we default to that previous
// checkpoint.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this checks out

@github-project-automation github-project-automation bot moved this from Backlog to In Progress in Reth Tracker Oct 15, 2025
@mediocregopher mediocregopher added this pull request to the merge queue Oct 15, 2025
@mediocregopher
Copy link
Collaborator Author

^ This has been making it really difficult to test the new trie changesets against main; everytime I would drop the MerkleChangeSets stage the node was getting into a weird state

do you mean that after this fix, it would be much easier to test the trieset changes against main? @mediocregopher

Now that I've figured this issue out I could potentially work around it, but yes with this fix in using reth-bench-compare on the trie changesets PR is much more straightforward

Merged via the queue into main with commit 00f1733 Oct 15, 2025
43 of 44 checks passed
@mediocregopher mediocregopher deleted the mediocregopher/era-pipeline-fix branch October 15, 2025 08:31
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Oct 15, 2025
frisitano added a commit to scroll-tech/reth that referenced this pull request Nov 17, 2025
* docs: yellowpaper sections in consensus implementation (paradigmxyz#18881)

* fix(era-utils): fix off-by-one for Excluded end bound in process_iter (paradigmxyz#18731)

Co-authored-by: Roman Hodulák <[email protected]>

* refactor: eliminate redundant allocation in precompile cache example (paradigmxyz#18886)

* chore: relax `ChainSpec` impls (paradigmxyz#18894)

* chore: make clippy happy (paradigmxyz#18900)

* fix(trie): Reveal extension child when extension is last remaining child of a branch (paradigmxyz#18891)

* chore(node): simplify EngineApiExt bounds by removing redundant constraints (paradigmxyz#18905)

* refactor(engine): separate concerns in on_forkchoice_updated for better maintainability (paradigmxyz#18661)

Co-authored-by: Nathaniel Bajo <[email protected]>
Co-authored-by: YK <[email protected]>
Co-authored-by: Brian Picciano <[email protected]>

* feat(provider): add get_account_before_block to ChangesetReader (paradigmxyz#18898)

* refactor: replace collect().is_empty() with next().is_none() in tests (paradigmxyz#18902)

Co-authored-by: Alexey Shekhirin <[email protected]>

* ci: cache hive simulator images to reduce prepare-hive job time (paradigmxyz#18899)

* docs: duplicate comment in Eip4844PoolTransactionError (paradigmxyz#18858)

* chore: align node_config threshold constant (paradigmxyz#18914)

* feat: wait for new blocks when build is in progress (paradigmxyz#18831)

Co-authored-by: Roman Hodulák <[email protected]>

* perf(tree): worker pooling for storage in multiproof generation (paradigmxyz#18887)

Co-authored-by: Brian Picciano <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Alexey Shekhirin <[email protected]>

* chore(grafana): use precompile address as legend (paradigmxyz#18913)

* refactor: remove needless collect() calls in trie tests (paradigmxyz#18937)

* fix(examples): change method to launch with debug capabilities (paradigmxyz#18946)

* fix(example): launch with debug capabilities (paradigmxyz#18947)

* fix(testsuite): Fix unused updates in e2e-test-utils (paradigmxyz#18953)

* fix(payload): correct Debug label for PayloadTimestamp in PayloadServiceCommand (paradigmxyz#18954)

* chore(rpc): Moves `SequencerMetrics` into `reth-optimism-rpc` (paradigmxyz#18921)

* refactor: replace println! with structured logging in test_vectors (paradigmxyz#18956)

* refactor(cli): use structured logging (tracing) in p2p command (paradigmxyz#18957)

* perf(tree): add elapsed time to parallel state root completion log (paradigmxyz#18959)

* fix(trie): Properly upsert into StoragesTrie in repair-trie (paradigmxyz#18941)

* fix: misleading error message in db list: show actual table name (paradigmxyz#18896)

* fix: remove noisy stderr prints in ERA1 cleanup (EraClient::delete_outside_range) (paradigmxyz#18895)

* fix: use max B256 for upper bound in empty-storage check (paradigmxyz#18962)

* ci: remove reproducible build from release.yml (paradigmxyz#18958)

* chore(rpc): Remove redundant U256::from in suggested_priority_fee (paradigmxyz#18969)

* chore(ci): update eest 7594 issue link in hive expected failures file (paradigmxyz#18976)

* perf(tests): remove redundant format! in ef-tests run_only (paradigmxyz#18909)

* feat(cli): enable traces export via `tracing-otlp` cli arg (paradigmxyz#18242)

Co-authored-by: Dan Cline <[email protected]>

* feat: allow otlp level to be configurable (paradigmxyz#18981)

* chore(optimism): remove unnecessary Debug bounds from header generics (paradigmxyz#18989)

* refactor: convert satisfy_base_fee_ids to use closure (paradigmxyz#18979)

* chore: bump otlp crates (paradigmxyz#18984)

* fix(network): prevent metric leak in outgoing message queue on session teardown (paradigmxyz#18847)

* chore: remove unused imports in blockchain_provider (paradigmxyz#18867)

Co-authored-by: Matthias Seitz <[email protected]>

* fix(stateless): enforce BLOCKHASH ancestor header limit (paradigmxyz#18920)

* chore(evm): mark ExecuteOutput as unused and slated for removal (paradigmxyz#18754)

* refactor: unify `Pipeline` creation codepaths (paradigmxyz#18955)

* fix(engine): flatten storage cache (paradigmxyz#18880)

* perf(tree): worker pooling for account proofs (paradigmxyz#18901)

Co-authored-by: Brian Picciano <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Alexey Shekhirin <[email protected]>

* refactor(storage): fix ChainStateKey enum variant name (paradigmxyz#18992)

* refactor(trie): remove proof task manager (paradigmxyz#18934)

Co-authored-by: Brian Picciano <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Alexey Shekhirin <[email protected]>

* fix: required optimism primitives features in db-api (paradigmxyz#19005)

* refactor(engine): simplify InvalidBlockWitnessHook::on_invalid_block for better testability (paradigmxyz#18696)

* chore: replace poll_next_unpin loop with poll_recv_many (paradigmxyz#18978)

* fix: Set Era pipeline stage to last checkpoint when there is no target (paradigmxyz#19000)

* ci: Add tests for Paris scenario in hive.yml (paradigmxyz#19013)

* chore: bump book timeout (paradigmxyz#19016)

* feat: add metrics for safe and finalized block heights (paradigmxyz#18987)

* chore(privitives-traits): remove unused serde derives and camelCase attribute (paradigmxyz#19014)

* chore: refactor loop in `add_new_transactions` (paradigmxyz#19006)

* chore(ci): bump hive eest to v5.3.0 (paradigmxyz#19021)

* feat(devp2p): make eth p2p networkId configurable (paradigmxyz#19020)

Co-authored-by: frankudoags <frankudoags.com>

* chore: remove unused Args struct from exex-subscription example (paradigmxyz#19019)

* feat: add pending sequence as pub (paradigmxyz#19022)

* chore: bump alloy-core (paradigmxyz#19026)

* fix: unused warnings for tracing (paradigmxyz#19025)

* fix: respect cli blob size setting (paradigmxyz#19024)

* feat(engine): deprecate TestPipelineBuilder::with_executor_results (paradigmxyz#19017)

* perf: background init of workers (paradigmxyz#19012)

* chore(ci): update expected failures (paradigmxyz#19034)

* fix: use header type generic for mask (paradigmxyz#19037)

* fix: correct `Compact` impl for `Option` (paradigmxyz#19042)

* chore: increase versioned hash index cache (paradigmxyz#19038)

* chore(primitives-traits): relax SignerRecoverable bounds for Extended<B,T> (paradigmxyz#19045)

* feat: bump revm (paradigmxyz#18999)

* fix: resolve upstream merge

Signed-off-by: Gregory Edison <[email protected]>

* bump revm

* update deps and fix lints

* update openvm deps

* skip exex wal storage test

* pin revm tag scroll-v91

* bump openvm compat

---------

Signed-off-by: Gregory Edison <[email protected]>
Co-authored-by: josé v <[email protected]>
Co-authored-by: Forostovec <[email protected]>
Co-authored-by: Roman Hodulák <[email protected]>
Co-authored-by: Skylar Ray <[email protected]>
Co-authored-by: Arsenii Kulikov <[email protected]>
Co-authored-by: Léa Narzis <[email protected]>
Co-authored-by: Brian Picciano <[email protected]>
Co-authored-by: radik878 <[email protected]>
Co-authored-by: William Nwoke <[email protected]>
Co-authored-by: Nathaniel Bajo <[email protected]>
Co-authored-by: YK <[email protected]>
Co-authored-by: Dan Cline <[email protected]>
Co-authored-by: Merkel Tranjes <[email protected]>
Co-authored-by: Alexey Shekhirin <[email protected]>
Co-authored-by: Federico Gimenez <[email protected]>
Co-authored-by: stevencartavia <[email protected]>
Co-authored-by: emmmm <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: MIHAO PARK <[email protected]>
Co-authored-by: Tilak Madichetti <[email protected]>
Co-authored-by: Emilia Hane <[email protected]>
Co-authored-by: maradini77 <[email protected]>
Co-authored-by: sashaodessa <[email protected]>
Co-authored-by: Alvarez <[email protected]>
Co-authored-by: MozirDmitriy <[email protected]>
Co-authored-by: drhgencer <[email protected]>
Co-authored-by: Matthias Seitz <[email protected]>
Co-authored-by: anim001k <[email protected]>
Co-authored-by: Julian Meyer <[email protected]>
Co-authored-by: Karl Yu <[email protected]>
Co-authored-by: Jennifer <[email protected]>
Co-authored-by: Ivan Wang <[email protected]>
Co-authored-by: GarmashAlex <[email protected]>
Co-authored-by: Udoagwa Franklin <[email protected]>
Co-authored-by: Luca Provini <[email protected]>
Co-authored-by: Galoretka <[email protected]>
Co-authored-by: sashass1315 <[email protected]>
Co-authored-by: frisitano <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-staged-sync Related to staged sync (pipelines and stages) C-bug An unexpected or incorrect behavior

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

4 participants