Fix eth_getBlockByNumber("latest") returning null intermittently #1816

Merged
librelois merged 13 commits into polkadot-evm:master from moonbeam-foundation:manuel/fix-frequent-null-response
Feb 11, 2026

Conversation

@manuelmauro
Contributor

@manuelmauro manuelmauro commented Feb 11, 2026

Goal of the changes

Fix intermittent null / inconsistent latest Ethereum RPC responses by making latest-based RPCs read from the latest indexed canonical block instead of the client best head, especially while mapping-sync lags or after reorgs.

What reviewers need to know

  • client/db/src/kv/mod.rs

    • Adds persisted meta key LATEST_CANONICAL_INDEXED_BLOCK.
    • latest_block_hash() now:
      • reads the cached canonical indexed block number,
      • clamps to current best,
      • verifies canonical/indexed consistency,
      • walks back (bounded by MAX_WALKBACK_DEPTH = 16) to recover from stale pointers/reorgs,
      • persists recovered pointer.
    • This avoids falling back to genesis for transient stale-head scenarios and stabilizes latest queries.
  • client/mapping-sync/src/kv/mod.rs

    • Updates LATEST_CANONICAL_INDEXED_BLOCK when syncing a new best canonical block.
    • Makes mapping-sync the authoritative writer of canonical indexed head metadata.
  • client/rpc/src/eth/mod.rs, client/rpc/src/eth/client.rs, client/rpc/src/eth/filter.rs, client/rpc-core/src/eth.rs

    • eth_getBlockByNumber("latest"), eth_blockNumber, eth_coinbase, and log/filter paths now use backend.latest_block_hash() / latest indexed number for consistent behavior.
    • eth_blockNumber and eth_coinbase methods become async in RPC core/impl to support indexed-head lookup.
    • Filter polling updates last_poll from the same indexed-capped head it queries, preventing skipped logs when best head is ahead of indexed head.
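The `latest_block_hash()` steps listed above (read the cached pointer, clamp to best, verify, walk back at most `MAX_WALKBACK_DEPTH`, persist) can be sketched roughly as follows. This is a minimal in-memory model, not the code from `client/db/src/kv/mod.rs`: `MockBackend`, its fields, the `u64` stand-in hashes, and the `None` result when the bound is exhausted are all assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Recovery bound named in the PR description.
const MAX_WALKBACK_DEPTH: u64 = 16;

/// Hypothetical stand-in for the node client plus the mapping DB.
struct MockBackend {
    /// Best block number reported by the client.
    best_number: u64,
    /// Canonical hash at each height (hashes are plain u64 values here).
    canonical: HashMap<u64, u64>,
    /// Hashes that mapping-sync has already indexed.
    indexed: HashSet<u64>,
    /// Stand-in for the persisted LATEST_CANONICAL_INDEXED_BLOCK meta value.
    cached_latest: Option<u64>,
}

impl MockBackend {
    /// Returns the canonical hash at `number` only if that block is indexed.
    fn indexed_canonical_hash(&self, number: u64) -> Option<u64> {
        let hash = *self.canonical.get(&number)?;
        if self.indexed.contains(&hash) {
            Some(hash)
        } else {
            None
        }
    }

    /// Resolves "latest" as (number, hash) of the newest indexed canonical block.
    fn latest_block_hash(&mut self) -> Option<(u64, u64)> {
        // Start from the cached pointer (or best if absent), clamped to best.
        let start = self
            .cached_latest
            .unwrap_or(self.best_number)
            .min(self.best_number);
        // Walk back at most MAX_WALKBACK_DEPTH blocks from the start.
        let lowest = start.saturating_sub(MAX_WALKBACK_DEPTH);
        for number in (lowest..=start).rev() {
            if let Some(hash) = self.indexed_canonical_hash(number) {
                // Persist the recovered pointer for the next call.
                self.cached_latest = Some(number);
                return Some((number, hash));
            }
        }
        // Assumption: the caller decides what to do past the bound.
        None
    }
}
```

With best at 10 and only blocks 0 to 7 indexed, a stale cached pointer of 12 is clamped to 10 and the walk stops at block 7; the pointer is then stored so the next call starts there.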

Risks / compatibility notes

  • Behavioral change: latest semantics now track indexed canonical visibility (consistent across RPCs) rather than raw client best head.
  • Recovery search is intentionally bounded (16 blocks). Very deep stale windows still degrade gracefully (eventual catch-up behavior covered by tests).

Testing

  • Added: ts-tests/tests/test-latest-block-consistency.ts
    • startup non-null latest checks,
    • eth_blockNumber/eth_getBlockByNumber("latest") consistency,
    • eth_coinbase availability,
    • indexing lag beyond recovery-limit scenario,
    • reorg consistency for latest/blockNumber/logs.
  • Updated: ts-tests/tests/test-filter-api.ts
    • new regression test for no log skips when indexed head lags best head,
    • robust dynamic filter-id error assertions.

References

Summary by CodeRabbit

  • Bug Fixes

    • Fixed race conditions in RPC "latest" block responses when block indexing lags behind the active chain
    • Improved consistency of eth_blockNumber and eth_coinbase responses during indexing delays
    • Enhanced filter stability when polling across indexing lag periods
  • New Features

    • Added persistence mechanism for tracking the latest indexed canonical block state
  • Tests

    • Added comprehensive test suite validating latest block consistency across startup, block production, reorgs, and indexing lag scenarios

@manuelmauro manuelmauro marked this pull request as ready for review February 11, 2026 13:50
@manuelmauro manuelmauro requested a review from sorpaas as a code owner February 11, 2026 13:50
@coderabbitai

coderabbitai bot commented Feb 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Converted author and block_number in EthApi from synchronous to asynchronous; RPC implementations, filters, and block-resolution now use backend.latest_block_hash() / latest indexed block number and perform canonical-chain verification; mapping-sync persists the latest canonical indexed block to the DB.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Trait Definition: client/rpc-core/src/eth.rs | Converted author and block_number signatures on EthApi to async methods. |
| Client RPC Implementation: client/rpc/src/eth/client.rs | Made author and block_number async; await backend.latest_block_hash() and map backend/client errors to internal_err; use latest-indexed hash/number instead of direct best_hash/best_number. |
| RPC Server & Block Resolution: client/rpc/src/eth/mod.rs | Refactored block_info_by_number to branch on BlockNumberOrHash variants; Latest resolves via backend.latest_block_hash() then substrate lookup; added canonical-chain verification for non-genesis blocks; updated EthApiServer to await inner async methods. |
| Filters / Logs: client/rpc/src/eth/filter.rs | Added private async helper to fetch latest indexed block number; replaced uses of best_number with latest-indexed number in filter_changes and logs code paths; moved awaits outside locks and adjusted capping/comparison logic. |
| Mapping DB (KV): client/db/src/kv/mod.rs | Added LATEST_CANONICAL_INDEXED_BLOCK meta key and MAX_WALKBACK_DEPTH; added methods to read/write latest canonical indexed block; latest_block_hash now validates canonical/indexed state and walks back up to the bound to find the latest indexed canonical block. |
| Mapping Sync Side-effect: client/mapping-sync/src/kv/mod.rs | When a block becomes new best (is_new_best), persist its number via set_latest_canonical_indexed_block(...) to update the latest canonical indexed block meta. |
| Misc (exports/traits): client/rpc/src/eth/... | Adjusted exported RPC trait/impl signatures to async for author and block_number and updated internal call sites accordingly. |

Sequence Diagram(s)

sequenceDiagram
    participant C as Client
    participant S as EthApiServer
    participant I as EthApiImpl
    participant B as Backend
    participant DB as MappingDB

    C->>S: eth_author / eth_blockNumber
    S->>I: call async author()/block_number() and await
    I->>B: await latest_block_hash()
    B-->>I: latest indexed block hash
    I->>DB: lookup eth->substrate hash / number
    DB-->>I: substrate hash / block number
    I->>I: canonical verification (if number > 0)
    I-->>S: RpcResult(author / number)
    S-->>C: JSON-RPC response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰

I hopped through async lanes today,
Fetched the latest hash along the way,
No racy bests to chase or bind,
Numbers and authors now await, aligned,
A tiny hop keeps mapping kind.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 30.43% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main fix: resolving intermittent null responses from eth_getBlockByNumber("latest"), which is the core issue addressed across all modified files. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


Ensures eth_getFilterChanges, eth_getFilterLogs, and eth_getLogs use the
latest indexed block number instead of Substrate's best_number when
resolving "latest" block references.

This prevents inconsistencies where these filter RPCs could reference
blocks not yet visible via eth_getBlockByNumber("latest").

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
client/rpc/src/eth/filter.rs (1)

285-305: ⚠️ Potential issue | 🟠 Major

Avoid advancing last_poll past the indexed range.

With current_number capped to latest_indexed_number, keeping last_poll = best_number + 1 can skip logs when the indexer lags (blocks between latest indexed and best are never revisited).

🐛 Suggested fix
-                    // Update filter `last_poll`.
-                    locked.insert(
-                        key,
-                        FilterPoolItem {
-                            last_poll: BlockNumberOrHash::Num(best_number + 1),
-                            filter_type: pool_item.filter_type.clone(),
-                            at_block: pool_item.at_block,
-                            pending_transaction_hashes: HashSet::new(),
-                        },
-                    );
-
                     // Either the filter-specific `to` block or latest indexed block.
                     // Use latest indexed block to ensure consistency with other RPCs.
                     let mut current_number = filter
                         .to_block
                         .and_then(|v| v.to_min_block_num())
                         .map(|s| s.unique_saturated_into())
                         .unwrap_or(latest_indexed_number);

                     if current_number > latest_indexed_number {
                         current_number = latest_indexed_number;
                     }
+
+                    // Update filter `last_poll` based on the capped range.
+                    let next_poll =
+                        UniqueSaturatedInto::<u64>::unique_saturated_into(current_number)
+                            .saturating_add(1);
+                    locked.insert(
+                        key,
+                        FilterPoolItem {
+                            last_poll: BlockNumberOrHash::Num(next_poll),
+                            filter_type: pool_item.filter_type.clone(),
+                            at_block: pool_item.at_block,
+                            pending_transaction_hashes: HashSet::new(),
+                        },
+                    );
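
The skip described in this comment can be reproduced with a small replay of two polls. This is a hedged sketch, not the `filter.rs` code: `Heads` and `scanned_blocks` are hypothetical names, the filter's own `to` block is omitted, and block numbers are plain `u64` values.

```rust
/// Heads observed at one poll: client best vs. latest indexed block.
#[derive(Clone, Copy)]
struct Heads {
    best: u64,
    indexed: u64,
}

/// Replays a series of polls and returns every block number the filter scanned.
/// `advance_from_capped` selects the suggested fix (true) or the old rule (false).
fn scanned_blocks(polls: &[Heads], advance_from_capped: bool) -> Vec<u64> {
    let mut last_poll = 0u64;
    let mut scanned = Vec::new();
    for heads in polls {
        // The scanned range is capped to the indexed head in both variants.
        let current = heads.indexed.min(heads.best);
        if last_poll <= current {
            scanned.extend(last_poll..=current);
        }
        last_poll = if advance_from_capped {
            // Suggested fix: resume right after the last block actually scanned.
            current.saturating_add(1)
        } else {
            // Old rule: jump past best, even if those blocks were never scanned.
            heads.best.saturating_add(1)
        };
    }
    scanned
}
```

For polls at (best 5, indexed 3) and then (best 6, indexed 6), the old rule scans blocks 0 to 3 and then only block 6, so blocks 4 and 5 are never visited; advancing from the capped value scans 0 through 6 without a gap.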

Member

@librelois librelois left a comment

Findings

  1. High: latest can regress to genesis on KV backend instead of returning the most recent indexed head
    The new latest path now depends on backend.latest_block_hash() in client/rpc/src/eth/mod.rs:156, client/rpc/src/eth/client.rs:75, and client/rpc/src/eth/client.rs:99.
    In the KV backend, client/db/src/kv/mod.rs:102 returns genesis whenever best_number is not yet indexed (:134-135). During the same lag window that currently produces null, this change can return block 0 (and eth_blockNumber = 0) transiently.
    So this likely removes null, but may replace it with a worse semantic error on KV deployments.

  2. Medium: new error surface for eth_blockNumber / eth_coinbase when no indexed canonical block exists yet
    eth_blockNumber and eth_coinbase are now async and call latest_block_hash() directly in client/rpc/src/eth/client.rs:71 and client/rpc/src/eth/client.rs:95.
    If backend returns error (for example SQL path when no canonical indexed row yet), these now return RPC internal errors rather than previous best-chain values.

Open Questions / Assumptions

  1. Are affected operators on SQL backend or KV backend?
    If SQL: this fix is much more likely to address the reported null latest behavior correctly.
    If KV: I would not merge as-is without changing KV latest_block_hash() semantics.

Merge Risk Summary

  • For SQL-backed nodes, the eth_getBlockByNumber("latest") race is likely fixed by using indexed latest hash instead of client.info().best_number.
  • For KV-backed nodes, this is not safe yet because latest can transiently jump to genesis under indexing lag.

Recommended Next Actions

  1. Adjust KV client/db/src/kv/mod.rs:102 to return the highest indexed canonical block <= best_number (not genesis fallback on tip miss).
  2. Add regression tests for eth_getBlockByNumber("latest") and eth_blockNumber under simulated mapping lag (both SQL and KV).
  3. Re-test operator scenario after that; expected behavior should be “latest indexed block”, never null, never sudden block 0 unless truly at genesis.
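
The difference between the genesis fallback flagged in finding 1 and the behavior requested in action 1 can be shown with two small functions. This is an illustrative sketch only: both function names are hypothetical, and the set of indexed canonical heights is modeled as a `BTreeSet<u64>` rather than the real mapping DB.

```rust
use std::collections::BTreeSet;

/// Behavior flagged in the review: a miss at the tip falls straight back to genesis.
fn latest_with_genesis_fallback(indexed: &BTreeSet<u64>, best_number: u64) -> u64 {
    if indexed.contains(&best_number) {
        best_number
    } else {
        0
    }
}

/// Requested behavior: the highest indexed canonical height <= best_number.
fn latest_indexed_at_or_below(indexed: &BTreeSet<u64>, best_number: u64) -> Option<u64> {
    indexed.range(..=best_number).next_back().copied()
}
```

With heights 0 to 3 indexed and best at 5, the first function reports block 0 while the second reports block 3, which is the "latest indexed block" the review asks for.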

@manuelmauro
Contributor Author

> During the same lag window that currently produces null, this change can return block 0 (and eth_blockNumber = 0) transiently.

@librelois addressed in e1177f8

@manuelmauro
Contributor Author

> Are affected operators on SQL backend or KV backend?

The SQL backend is not affected by this issue because its latest_block_hash() implementation queries the database directly:

SELECT substrate_block_hash FROM blocks WHERE is_canon = 1 ORDER BY block_number DESC LIMIT 1
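
For readers less familiar with SQL, the query's selection rule can be mirrored over an in-memory slice. This is only a sketch of the semantics: `BlockRow` is a hypothetical struct whose fields reuse the column names from the query, and nothing else about the real schema is assumed.

```rust
/// Hypothetical row type; field names follow the columns in the query.
struct BlockRow {
    substrate_block_hash: [u8; 32],
    block_number: u64,
    is_canon: bool,
}

/// Equivalent of: WHERE is_canon = 1 ORDER BY block_number DESC LIMIT 1.
fn latest_canonical_hash(rows: &[BlockRow]) -> Option<[u8; 32]> {
    rows.iter()
        .filter(|row| row.is_canon)
        .max_by_key(|row| row.block_number)
        .map(|row| row.substrate_block_hash)
}
```

Because the result is derived from rows that are already indexed, the answer can lag the client best head but cannot point at a block the index does not contain.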

Member

@librelois librelois left a comment

Suggested improvements:

  1. Add fallback/backfill when the meta key is absent (derive from currently indexed canonical best instead of hard-falling to genesis).
  2. Add an integration test for upgrade/restart behavior: existing synced DB + no new imports yet + eth_getBlockByNumber("latest").
  3. On stale cached number (reorg case), walk back to the prior indexed canonical block instead of returning genesis.

@manuelmauro
Contributor Author

> Suggested improvements:
> 3. On stale cached number (reorg case), walk back to prior indexed canonical block instead of returning genesis.

Addressed this in eb4bc7a and 57f1ebb


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@ts-tests/tests/test-filter-api.ts`:
- Around line 149-172: The test step "should not skip log filter changes when
best head is ahead of indexed head" calls waitForBlock(..., 10000) which may
wait up to 10s, so add an explicit Mocha timeout at the start of the step's
async function to avoid Mocha's default 2s timeout; inside the step(...)
callback (the async function where waitForBlock is called) call
this.timeout(15000) (or similar >=10000) to ensure the test has enough time to
complete while keeping the function signature (async function () { ... }) so
this is available.
🧹 Nitpick comments (2)
client/rpc/src/eth/filter.rs (1)

230-244: Consider aligning Block filter with Log filter for consistency.

The FilterType::Block branch still uses best_number for the upper bound (line 232), while the FilterType::Log branch uses latest_indexed_number. This creates potential inconsistency:

  • Block filters may iterate over unindexed blocks and silently skip them (lines 356-359 return no hash when block is None)
  • Log filters explicitly cap at the latest indexed block

This may be intentional since block filters gracefully handle missing blocks, but documenting this distinction would help maintainability. If consistent behavior is preferred, consider capping the Block branch similarly.

client/db/src/kv/mod.rs (1)

495-504: Consider adding write lock for consistency with other write methods.

set_latest_canonical_indexed_block doesn't acquire write_lock, unlike other write methods (e.g., write_hashes at line 408). While atomic single-key writes are safe at the database level, and last-write-wins is acceptable for this metadata, using the lock would maintain consistency with the codebase patterns.

The current implementation is functionally correct—concurrent calls would just race to set valid values—but adding the lock would align with the defensive coding style used elsewhere.

♻️ Optional: Add write lock for consistency
 pub fn set_latest_canonical_indexed_block(&self, block_number: u64) -> Result<(), String> {
+    let _lock = self.write_lock.lock();
     let mut transaction = sp_database::Transaction::new();
     transaction.set(
         columns::META,
         static_keys::LATEST_CANONICAL_INDEXED_BLOCK,
         &block_number.encode(),
     );
     self.db.commit(transaction).map_err(|e| e.to_string())
 }
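
A self-contained model of the locked write, paired with a read that reports an absent key, might look like the sketch below. It is an assumption-heavy stand-in: `MetaStore` is hypothetical, a `HashMap` replaces the META column, `std::sync::Mutex` replaces whatever lock type the crate uses, the key bytes are invented, and `to_le_bytes` stands in for `.encode()` (SCALE encodes fixed-width integers as little-endian bytes).

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical key bytes; the real value of
/// `static_keys::LATEST_CANONICAL_INDEXED_BLOCK` is not shown in this thread.
const LATEST_CANONICAL_INDEXED_BLOCK: &[u8] = b"LATEST_CANONICAL_INDEXED_BLOCK";

struct MetaStore {
    /// Serializes writers, mirroring the suggested `write_lock`.
    write_lock: Mutex<()>,
    /// Stand-in for the META column of the KV database.
    meta: Mutex<HashMap<Vec<u8>, Vec<u8>>>,
}

impl MetaStore {
    fn new() -> Self {
        MetaStore {
            write_lock: Mutex::new(()),
            meta: Mutex::new(HashMap::new()),
        }
    }

    /// Stores the pointer while holding the write lock.
    fn set_latest_canonical_indexed_block(&self, block_number: u64) -> Result<(), String> {
        let _guard = self.write_lock.lock().map_err(|e| e.to_string())?;
        let mut meta = self.meta.lock().map_err(|e| e.to_string())?;
        meta.insert(
            LATEST_CANONICAL_INDEXED_BLOCK.to_vec(),
            block_number.to_le_bytes().to_vec(),
        );
        Ok(())
    }

    /// Reads the pointer; `Ok(None)` means the key was never written,
    /// which is the case a backfill would need to handle.
    fn latest_canonical_indexed_block(&self) -> Result<Option<u64>, String> {
        let meta = self.meta.lock().map_err(|e| e.to_string())?;
        match meta.get(LATEST_CANONICAL_INDEXED_BLOCK) {
            None => Ok(None),
            Some(raw) => {
                if raw.len() != 8 {
                    return Err("unexpected encoded length".to_string());
                }
                let mut bytes = [0u8; 8];
                bytes.copy_from_slice(raw);
                Ok(Some(u64::from_le_bytes(bytes)))
            }
        }
    }
}
```

In this mock the inner `Mutex` already makes the single insert safe, so the extra `write_lock` is redundant in the same way the comment describes: it changes nothing functionally and only keeps the write path consistent with the other writers.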

@librelois librelois merged commit 1a2f1ae into polkadot-evm:master Feb 11, 2026
7 checks passed
@librelois librelois deleted the manuel/fix-frequent-null-response branch February 11, 2026 19:19
librelois added a commit to moonbeam-foundation/frontier that referenced this pull request Feb 11, 2026
…kadot-evm#1816)

* fix: 🐛 do not rely on substrate state for latest block info

* fix: 🐛 use latest indexed block for eth_blockNumber and eth_coinbase

* fix: 🐛 use latest indexed block for eth_getLogs and filter RPCs

Ensures eth_getFilterChanges, eth_getFilterLogs, and eth_getLogs use the
latest indexed block number instead of Substrate's best_number when
resolving "latest" block references.

This prevents inconsistencies where these filter RPCs could reference
blocks not yet visible via eth_getBlockByNumber("latest").

* fix: 🐛 track latest canonical indexed block in mapping-sync

* fix: 🐛 walk back to find indexed canonical block instead of returning genesis

* fix: 🐛 walk back on stale cached block during reorg instead of returning genesis

* test: ✅ add integration tests

* backfill and persist latest indexed canonical block on miss/stale meta

* rustfmt

* prevent filter log skips under indexing lag & harden lat pointer recover

* explicit Mocha timeout

* add more tests

* prettier

---------

Co-authored-by: Éloïs <c@elo.tf>