feat(l1): move storage heal paths to their own table #2359
Conversation
```rust
// Delete read values
let txn = self.db.begin_write()?;
{
    let mut table = txn.open_table(STORAGE_HEAL_PATHS_TABLE)?;
    for (hash, _) in res.iter() {
        table.remove(<H256 as Into<AccountHashRLP>>::into(*hash))?;
    }
}
txn.commit()?;
```
Having the get_* method delete keys sounds confusing. Would it be too bad to split this?
Maybe a rename? (`get_and_remove_` or `retrieve_`, or something that gives the idea of consuming the elements?)
I agree, I think `take` would be suitable here.
I don't like the idea of splitting it too much, as this has only one use case, in which we want to delete as soon as we read.
Updated ba937d5
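For reference, a consuming `take_*`-style read along the lines discussed above could look roughly like the sketch below. The method name, the `limit` parameter, and the `decode_entry` helper are illustrative only; the table definition, RLP wrapper types, and error conversions are assumed to be the ones already used in `crates/storage/store_db/redb.rs`, so this is a sketch rather than the PR's actual code.

```rust
// Sketch only: `take_storage_heal_paths`, `limit` and `decode_entry` are
// illustrative names, not the PR's actual API.
fn take_storage_heal_paths(
    &self,
    limit: usize,
) -> Result<Vec<(H256, Vec<Nibbles>)>, StoreError> {
    let txn = self.db.begin_write()?;
    let res = {
        let mut table = txn.open_table(STORAGE_HEAL_PATHS_TABLE)?;
        // Read at most `limit` entries...
        let mut res = Vec::new();
        for entry in table.iter()?.take(limit) {
            let (key, value) = entry?;
            // `decode_entry` stands in for whatever RLP decoding the
            // surrounding file already uses.
            res.push(decode_entry(key.value(), value.value())?);
        }
        // ...then delete them within the same write transaction, so reading
        // the paths always consumes them.
        for (hash, _) in res.iter() {
            table.remove(<H256 as Into<AccountHashRLP>>::into(*hash))?;
        }
        res
    };
    txn.commit()?;
    Ok(res)
}
```

Keeping the read and the delete in one transaction preserves the behaviour in the diff above while giving the method a name that signals consumption.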
crates/storage/store_db/redb.rs
Outdated
```rust
fn set_storage_heal_paths(&self, paths: Vec<(H256, Vec<Nibbles>)>) -> Result<(), StoreError> {
    let key_values = paths
        .into_iter()
        .map(|(hash, paths)| {
            (
                <H256 as Into<AccountHashRLP>>::into(hash),
                <Vec<Nibbles> as Into<TriePathsRLP>>::into(paths),
            )
        })
        .collect();
    self.write_batch(STORAGE_HEAL_PATHS_TABLE, key_values)
```
After merging #2336 this needs the following change:
```diff
-fn set_storage_heal_paths(&self, paths: Vec<(H256, Vec<Nibbles>)>) -> Result<(), StoreError> {
+async fn set_storage_heal_paths(&self, paths: Vec<(H256, Vec<Nibbles>)>) -> Result<(), StoreError> {
     let key_values = paths
         .into_iter()
         .map(|(hash, paths)| {
             (
                 <H256 as Into<AccountHashRLP>>::into(hash),
                 <Vec<Nibbles> as Into<TriePathsRLP>>::into(paths),
             )
         })
         .collect();
-    self.write_batch(STORAGE_HEAL_PATHS_TABLE, key_values)
+    self.write_batch(STORAGE_HEAL_PATHS_TABLE, key_values).await
```
Similar changes will be needed at the API level and for libmdbx.
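For illustration only, the API-level counterpart would be giving the corresponding trait method the same `async` signature; a minimal sketch, assuming the same types as the diff above (the trait name is a placeholder, and whether the project uses native `async fn` in traits or the `async_trait` crate is an assumption):

```rust
// Placeholder trait name; only the method signature mirrors the suggestion
// above. How the project actually expresses async trait methods (native
// `async fn` vs. `#[async_trait]`) is an assumption here.
trait StoreEngineSketch {
    async fn set_storage_heal_paths(
        &self,
        paths: Vec<(H256, Vec<Nibbles>)>,
    ) -> Result<(), StoreError>;
}
```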
Thanks!
Updated with merge!
Oppen left a comment
Left a few comments.
ElFantasma left a comment
It LGTM
**Motivation**

During snap sync, we download account ranges and then, for each downloaded account, we request its storage and bytecodes. For these requests we use fetcher processes that receive incoming messages from a channel (storage roots, bytecode hashes, etc.), place them on a queue, and then group them in batches and spawn parallel processes to fetch them. All fetchers share the common behaviour of reading requests, batching, and fetching, differing only in the content of the queue. We have had many bugs due to how these fetchers worked, as we might update one of them and forget about the rest. This PR aims to reduce the sources of bugs and keep a unified behaviour for fetchers by adding generic functions that represent the fetcher behaviour. In this PR we add the generic function `run_queue`, which receives a generic queue (a `Vec<T>`) and an async function that operates over a batch in said queue.

**Description**

* Add generic function `run_queue` to abstract queue logic from fetcher processes
* Use `run_queue` in `bytecode_fetcher`, `large_storage_fetcher`, and `storage_fetcher`

*Considerations*

* As this PR was done alongside #2359, this won't be applied to the `storage_healer`, which will stop reading messages
* While the batch size could be a const generic instead of a regular argument, doing so would force us to make the other generic arguments in `run_queue` explicit, which looks pretty bad

Closes #issue_number

---------

Co-authored-by: Mario Rugiero <[email protected]>
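For illustration, the generic `run_queue` described above might look roughly like the following sketch; the signature, return type, and batching details are assumptions rather than the PR's actual code.

```rust
use std::future::Future;

// Illustrative sketch: drain the queue in fixed-size batches and hand each
// batch to an async closure (e.g. one that spawns peer requests).
async fn run_queue<T, F, Fut>(queue: &mut Vec<T>, batch_size: usize, mut fetch_batch: F)
where
    F: FnMut(Vec<T>) -> Fut,
    Fut: Future<Output = ()>,
{
    while !queue.is_empty() {
        // Take up to `batch_size` elements off the front of the queue.
        let batch: Vec<T> = queue.drain(..batch_size.min(queue.len())).collect();
        fetch_batch(batch).await;
    }
}
```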
**Motivation**

During state sync, we store the account hashes of the storages we failed to fetch, along with their root path, in the store so the storage healer can then read them and heal them. For this we used the `SnapState` table, where the whole pending storage paths map was a single value. This used to work fine at a smaller scale, but when this map gets too big, reading and writing it becomes very expensive and can disrupt other processes. This PR moves the pending storage paths to their own table and changes how we interact with them:

* The storage healer no longer fetches the whole map, but instead reads a specific number of storages from it whenever its queue is not filled.
* The storage healer no longer uses a channel; it instead reads incoming requests directly from the store.
* Fetchers that need to communicate with the storage healer now do so by adding paths to the store.

**Description**

* Remove storage heal paths from snap state
* Add new DB table for storage heal paths
* Remove the channel from the storage healer and instead manage incoming and outgoing storage heal paths through the store (this also solves the issues of the rebuilder not being able to input storage heal requests and of the storage healer being kept alive indefinitely upon forced shutdown)

Closes #issue_number
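As a rough illustration of the store-driven flow described above: method names other than `set_storage_heal_paths` are placeholders, the types are the ones used elsewhere in this PR, and the real healer loop, batch sizes, and termination logic are more involved.

```rust
// Hypothetical shape of the healer loop: top up the local queue from the new
// table instead of draining a channel. `take_storage_heal_paths` is an assumed
// consuming read; the healing work itself is elided.
const QUEUE_TARGET: usize = 1024; // assumed queue size

async fn storage_healer_sketch(store: &Store) -> Result<(), StoreError> {
    let mut queue: Vec<(H256, Vec<Nibbles>)> = Vec::new();
    loop {
        // Refill from the store whenever the queue has spare capacity.
        if queue.len() < QUEUE_TARGET {
            queue.extend(store.take_storage_heal_paths(QUEUE_TARGET - queue.len())?);
        }
        if queue.is_empty() {
            break; // nothing left to heal (termination simplified)
        }
        // Heal a batch from `queue`; any paths that still need work can be
        // written back with store.set_storage_heal_paths(...).
        queue.clear(); // placeholder for the actual healing work
    }
    Ok(())
}
```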