meyer9 (Collaborator) commented Sep 26, 2025

Ref #176

This PR adds an in-memory storage backend and tests that work with any storage backend. We can easily add more test cases to the file to test against a future SQLite backend.
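A minimal sketch of what "tests that work with any storage backend" can look like: the test body is generic over a trait, so each backend only needs a one-line call. The `KvBackend` trait, `InMemoryBackend` struct, and `check_versioned_reads` function are hypothetical, simplified stand-ins for illustration; they are not the PR's `ExternalStorage` trait or its actual test suite.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified storage trait (NOT the PR's `ExternalStorage`).
trait KvBackend {
    /// Record `value` for `key` as of `block`.
    fn put(&mut self, block: u64, key: &str, value: u64);
    /// Latest value for `key` written at or before `max_block`.
    fn get(&self, key: &str, max_block: u64) -> Option<u64>;
}

/// Hypothetical in-memory backend keyed by `(key, block)`.
#[derive(Default)]
struct InMemoryBackend {
    entries: BTreeMap<(String, u64), u64>,
}

impl KvBackend for InMemoryBackend {
    fn put(&mut self, block: u64, key: &str, value: u64) {
        self.entries.insert((key.to_string(), block), value);
    }

    fn get(&self, key: &str, max_block: u64) -> Option<u64> {
        // All versions of `key` up to `max_block`; the last one is the newest.
        self.entries
            .range((key.to_string(), 0)..=(key.to_string(), max_block))
            .next_back()
            .map(|(_, v)| *v)
    }
}

/// Shared test body: any backend implementing the trait can be passed in.
fn check_versioned_reads<B: KvBackend>(mut backend: B) {
    backend.put(1, "a", 10);
    backend.put(5, "a", 50);
    assert_eq!(backend.get("a", 0), None);
    assert_eq!(backend.get("a", 3), Some(10));
    assert_eq!(backend.get("a", 9), Some(50));
    assert_eq!(backend.get("b", 9), None);
}
```

Under these assumptions, a future backend (for example a SQLite one) would reuse the same checks by calling `check_versioned_reads` with its own type from a `#[test]` function.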

@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch from d7fe4bc to 56afdf7 Compare September 26, 2025 17:38
@meyer9 meyer9 added the W-historical-proofs Workstream: historical-proofs label Sep 26, 2025
@meyer9 meyer9 linked an issue Sep 26, 2025 that may be closed by this pull request
@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch 2 times, most recently from 955e4f1 to 7731223 Compare September 26, 2025 17:47
@meyer9 meyer9 marked this pull request as ready for review September 26, 2025 17:48
@meyer9 meyer9 marked this pull request as draft September 26, 2025 17:48
@meyer9 meyer9 force-pushed the meyer9/166-finalize-storage-interface branch from 3e311c7 to 5ef8999 Compare September 26, 2025 17:48
@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch from 7731223 to a737cf2 Compare September 26, 2025 17:49
@meyer9 meyer9 marked this pull request as ready for review September 27, 2025 05:01
@meyer9 meyer9 force-pushed the meyer9/166-finalize-storage-interface branch from 5ef8999 to 84c813c Compare September 29, 2025 16:13
@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch from a737cf2 to 1eef699 Compare September 29, 2025 16:13
@meyer9 meyer9 force-pushed the meyer9/166-finalize-storage-interface branch from 84c813c to 9ca3987 Compare October 8, 2025 15:45
@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch from 1eef699 to 1c7a7a3 Compare October 8, 2025 15:49
wiz-b4c72f16a4 bot commented Oct 8, 2025

Wiz Scan Summary

Scanner Findings
Vulnerability Finding Vulnerabilities -
Data Finding Sensitive Data -
Total -

@meyer9 meyer9 force-pushed the meyer9/176-in-memory-storage-backend branch from 1c7a7a3 to f336fc3 Compare October 8, 2025 17:14
let mut collected_entries: Vec<(Nibbles, BranchNodeCompact)> =
    if let Some(addr) = hashed_address {
        // Storage trie cursor
        for ((block, address, path), branch) in &storage.storage_branches {
Collaborator

Shouldn't we merge the changes from trie_updates for all blocks up to the specified max_block_number?

Collaborator Author

I think this should merge them all already. The BTreeMap is sorted by block number ASC, so when we collect all the entries, it will choose the latest block <= max_block_number.
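To illustrate the ordering argument above, here is a small sketch, assuming a map keyed by `(block, path)`. Iterating a `BTreeMap` in ascending key order visits older blocks first, so inserting into a per-path map lets later blocks overwrite earlier ones. The function name `latest_per_path` and the string values standing in for branch nodes are hypothetical; the real code uses `Nibbles` and `BranchNodeCompact`.

```rust
use std::collections::BTreeMap;

/// For each path, keep the value from the latest block <= `max_block`.
/// Keys are `(block, path)`, so iteration is ascending by block first and
/// later inserts into `latest` overwrite earlier ones.
fn latest_per_path(
    branches: &BTreeMap<(u64, Vec<u8>), &'static str>,
    max_block: u64,
) -> BTreeMap<Vec<u8>, &'static str> {
    let mut latest = BTreeMap::new();
    for ((_block, path), node) in branches
        .iter()
        // Keys are sorted by block, so we can stop at the first newer block.
        .take_while(|((block, _), _)| *block <= max_block)
    {
        latest.insert(path.clone(), *node);
    }
    latest
}
```

Note that this only covers entries that are actually present in the map being iterated; it says nothing about updates stored elsewhere.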

dhyaniarun1993 (Collaborator), Oct 9, 2025

If I understand correctly, we store live updates only in the trie_updates BTreeMap; we aren't saving data to the storage_branches BTreeMap during live syncing.

Hence changes from live updates won't be visible to the cursor, since the cursor reads only the storage_branches BTreeMap.

Collaborator Author

Ah, I see. Those should be saved to *_branches and hashed_*. The trie_updates and post_states maps are just indexes from block number to the slots/trie nodes changed, used for pruning/reorgs.
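A rough sketch of the split described here, assuming one versioned map that cursors read plus a per-block index that records which keys each block touched. The `Store` struct, its `changed_paths` field (playing the role of an index like trie_updates), and the methods `store_block` and `unwind_above` are hypothetical names for illustration, not the PR's implementation.

```rust
use std::collections::BTreeMap;

#[derive(Default)]
struct Store {
    /// Versioned data the cursors read: (block, path) -> node.
    branches: BTreeMap<(u64, String), u64>,
    /// Per-block index: block -> paths changed in that block.
    changed_paths: BTreeMap<u64, Vec<String>>,
}

impl Store {
    /// Write each update to the versioned map AND record it in the index.
    fn store_block(&mut self, block: u64, updates: &[(&str, u64)]) {
        for (path, node) in updates {
            self.branches.insert((block, path.to_string()), *node);
            self.changed_paths
                .entry(block)
                .or_default()
                .push(path.to_string());
        }
    }

    /// Drop everything written after `block`, e.g. when unwinding a reorg.
    /// The index tells us exactly which versioned entries to remove.
    fn unwind_above(&mut self, block: u64) {
        let Some(first_removed) = block.checked_add(1) else { return };
        let removed = self.changed_paths.split_off(&first_removed);
        for (b, paths) in removed {
            for path in paths {
                self.branches.remove(&(b, path));
            }
        }
    }
}
```

The point of the sketch is that the index alone is not readable state: if a write path only updated `changed_paths`, a cursor over `branches` would never see those changes.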

Collaborator Author

fixed: fa65493

Collaborator

Thanks. Will review it.

        .collect()
    } else {
        // Account trie cursor
        for ((block, path), branch) in &storage.account_branches {
Collaborator

Same here

emhane (Member) commented Oct 9, 2025

Does this PR reference any issue?

Copilot AI review requested due to automatic review settings October 10, 2025 06:36
Copilot AI left a comment

Pull Request Overview

This PR implements an in-memory storage backend for external trie node storage and creates a comprehensive test suite that can work with any storage backend implementation. The changes enable easier testing and development of the external proofs feature while providing a foundation for adding future storage backends like SQLite.

Key changes:

  • Implements a complete in-memory storage backend with thread-safe operations using tokio RwLock
  • Creates an extensive test suite covering all storage operations, cursor functionality, and edge cases
  • Adds proper documentation and comments to the storage trait interfaces

Reviewed Changes

Copilot reviewed 5 out of 6 changed files in this pull request and generated 4 comments.

File | Description
--- | ---
crates/exex/external-proofs/src/in_memory.rs | Complete in-memory implementation of ExternalStorage trait with cursors for accounts, storage, and trie branches
crates/exex/external-proofs/src/storage_tests.rs | Comprehensive test suite covering all storage operations, cursor navigation, block versioning, and edge cases
crates/exex/external-proofs/src/storage.rs | Added documentation comments to the storage trait and BlockStateDiff struct
crates/exex/external-proofs/src/lib.rs | Added module declarations for in_memory and storage_tests
crates/exex/external-proofs/Cargo.toml | Added tokio dependency and test dependencies with feature flags

Comment on lines +3 to +4
#[cfg(test)]
mod tests {
Copilot AI Oct 10, 2025

The entire file is wrapped in a #[cfg(test)] module, but the file itself is already conditionally included only for tests via #[cfg(test)] in lib.rs. This creates redundant test-only compilation guards.

Comment on lines +391 to +394
let inner = self
    .inner
    .try_read()
    .map_err(|_| ExternalStorageError::Other(eyre::eyre!("Failed to acquire read lock")))?;
Copilot AI Oct 10, 2025

Using try_read() in synchronous cursor methods could fail under contention. Consider using blocking read() or redesigning the API to be async, as lock contention in this context likely indicates a design issue rather than a recoverable error.
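The difference between the two approaches can be sketched with the standard library's `std::sync::RwLock`, used here as a stand-in so the example has no external dependencies (the PR itself uses tokio's `RwLock`, which also provides `try_read` and a `blocking_read` method that must not be called from within an async execution context). The helper names `read_nonblocking` and `read_blocking` are hypothetical.

```rust
use std::sync::{RwLock, TryLockError};

/// Non-blocking read: fails immediately if a writer currently holds the lock.
fn read_nonblocking(lock: &RwLock<u64>) -> Result<u64, String> {
    match lock.try_read() {
        Ok(guard) => Ok(*guard),
        // A writer is active: surfaced to the caller as an error.
        Err(TryLockError::WouldBlock) => Err("Failed to acquire read lock".to_string()),
        // A writer panicked earlier; the data is still readable.
        Err(TryLockError::Poisoned(e)) => Ok(*e.into_inner()),
    }
}

/// Blocking read: waits for any writer to finish instead of returning an error.
fn read_blocking(lock: &RwLock<u64>) -> u64 {
    *lock.read().unwrap_or_else(|e| e.into_inner())
}
```

With the non-blocking variant, a concurrent write turns into a spurious read failure that every caller has to handle; the blocking variant trades that for a short wait.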

Comment on lines +403 to +406
let inner = self
    .inner
    .try_read()
    .map_err(|_| ExternalStorageError::Other(eyre::eyre!("Failed to acquire read lock")))?;
Copilot AI Oct 10, 2025

Same lock contention issue as in trie_cursor. Using try_read() in synchronous cursor methods could fail under contention. Consider using blocking read() or redesigning the API to be async.

Comment on lines +414 to +417
let inner = self
    .inner
    .try_read()
    .map_err(|_| ExternalStorageError::Other(eyre::eyre!("Failed to acquire read lock")))?;
Copilot AI Oct 10, 2025

Same lock contention issue as in previous cursor methods. Using try_read() in synchronous cursor methods could fail under contention. Consider using blocking read() or redesigning the API to be async.

@emhane emhane added K-feature Kind: feature A-db Area: database labels Oct 10, 2025
    blocks_to_add: HashMap<u64, BlockStateDiff>,
) -> ExternalStorageResult<()> {
    let mut inner = self.inner.write().await;

Collaborator

Here as well, we need to update account_branches, storage_branches, hashed_accounts and hashed_storage.

}

for (hashed_address, storage) in &block_state_diff.post_state.storages {
    for (slot, value) in &storage.storage {
Collaborator

I think it's better to move the wiped logic here. Let me know what you think.

Collaborator Author

good idea, moved this to a common fn: 65e441e
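One way such a shared helper could look, as a sketch only: it assumes that "wiped" means every slot previously stored for the address should read as zero from this block onward, before the block's own slot updates are applied. The `Slots` alias, the `u8` stand-ins for hashed addresses and slots, and the functions `apply_storage_diff` and `latest` are all hypothetical; the commit referenced above may handle this differently.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// (address, slot, block) -> value; a zero value stands for "deleted".
/// `u8` addresses and slots are simplified stand-ins for hashed keys.
type Slots = BTreeMap<(u8, u8, u64), u64>;

/// Apply one address's storage diff for `block`, handling the wiped flag
/// in a single place so every write path shares the same logic.
fn apply_storage_diff(
    slots: &mut Slots,
    block: u64,
    address: u8,
    wiped: bool,
    updates: &[(u8, u64)],
) {
    if wiped {
        // Every slot ever written for this address reads as zero from now on.
        let existing: BTreeSet<u8> = slots
            .range((address, 0, 0)..=(address, u8::MAX, u64::MAX))
            .map(|((_, slot, _), _)| *slot)
            .collect();
        for slot in existing {
            slots.insert((address, slot, block), 0);
        }
    }
    // Updates from the same block take precedence over the wipe.
    for (slot, value) in updates {
        slots.insert((address, *slot, block), *value);
    }
}

/// Latest value for a slot at or before `max_block` (zero if never written).
fn latest(slots: &Slots, address: u8, slot: u8, max_block: u64) -> u64 {
    slots
        .range((address, slot, 0)..=(address, slot, max_block))
        .next_back()
        .map(|(_, v)| *v)
        .unwrap_or(0)
}
```

Keeping the wipe handling inside one function means that initial storage and later per-block updates cannot drift apart in how they treat wiped storage.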

meyer9 (Collaborator, Author) commented Oct 10, 2025

Clippy does not recognize the lint as being used even though it definitely is (removing it causes warnings). I had to use #[allow(dead_code)] instead to fix this, but it will be removed very soon anyway. This seems like a bug in Clippy.

@dhyaniarun1993 dhyaniarun1993 added this pull request to the merge queue Oct 12, 2025
Merged via the queue into unstable with commit cd922b9 Oct 12, 2025
39 of 40 checks passed
@dhyaniarun1993 dhyaniarun1993 deleted the meyer9/176-in-memory-storage-backend branch October 12, 2025 08:27
dhyaniarun1993 added a commit that referenced this pull request Oct 13, 2025
Based on #183

This PR adds a backfill job that accepts a DB transaction and copies the
current state to the database. The transaction ensures we see a
consistent view of the database at the current block, even if the node
is syncing. This requires `--db.read-transaction-timeout 0`.

This currently doesn't handle interrupting the job because the state may
update while syncing and may read a different version of the database
upon restart.

---------

Co-authored-by: Arun Dhyani
Co-authored-by: Emilia Hane
emhane added a commit that referenced this pull request Oct 20, 2025
Ref #176 

This PR adds an in-memory storage backend and tests that work with any
storage backend. We can easily add more test cases to the file to test
against a future SQLite backend.

---------

Co-authored-by: Arun Dhyani
Co-authored-by: Emilia Hane
emhane added a commit that referenced this pull request Oct 20, 2025
meyer9 added a commit that referenced this pull request Oct 31, 2025
meyer9 added a commit that referenced this pull request Oct 31, 2025
emhane added a commit that referenced this pull request Nov 11, 2025
emhane added a commit that referenced this pull request Nov 11, 2025
meyer9 added a commit that referenced this pull request Nov 13, 2025
meyer9 added a commit that referenced this pull request Nov 13, 2025
emhane added a commit that referenced this pull request Nov 18, 2025
emhane added a commit that referenced this pull request Nov 18, 2025
emhane added a commit that referenced this pull request Nov 25, 2025
emhane added a commit that referenced this pull request Nov 25, 2025
Labels

A-db Area: database K-feature Kind: feature W-historical-proofs Workstream: historical-proofs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create in-memory storage backend for testing

4 participants