Parallelize audit node preload #456
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #456 +/- ##
==========================================
- Coverage 88.61% 88.15% -0.47%
==========================================
Files 39 38 -1
Lines 9109 8346 -763
==========================================
- Hits 8072 7357 -715
+ Misses 1037 989 -48
info!(
    "Preload of nodes for insert ({} objects loaded) completed.",
    load_count
);
Do we really want to log this as an INFO level log? Although this will arguably be much less hot than a read path, it may be possible for this to become noisy.
This is actually intentional: when we switched the previous preload log to DEBUG, we lost visibility into how long node preloading was taking during sequencing (and how many objects were loaded).
akd/src/append_only_zks.rs
  // preload the nodes that we will visit during the insertion
- let (_, time_s) =
+ let (fallable_load_count, time_s) =
If the first item in the tuple is intended to be named in a manner indicating that it can fail, I believe we may want to rename this fallible_load_count.
I copied this from elsewhere in the file, but I will edit both to use the right spelling :)
akd/src/append_only_zks.rs
.filter(|node| {
    node.node_type != TreeNodeType::Leaf
        && node.get_latest_epoch() > start_epoch
        && node.min_descendant_epoch < end_epoch
Should this be <=? Previously, we were returning an empty vec when determining retrieval nodes in cases where the min descendant epoch was strictly greater than the end epoch. Since we're doing the inverse here, shouldn't the comparison be less than or equal to?
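To make the boundary case in this question concrete, here is a minimal sketch. The `Node` struct, its `latest_epoch` field, and the three helper functions are hypothetical simplifications for illustration, not the actual akd types; only the comparisons mirror the diff and the behavior described above.

```rust
/// Hypothetical, simplified stand-in for the node fields the filter reads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TreeNodeType {
    Leaf,
    Interior,
}

#[derive(Debug, Clone, Copy)]
pub struct Node {
    pub node_type: TreeNodeType,
    pub latest_epoch: u64,
    pub min_descendant_epoch: u64,
}

/// Earlier check as described in the review: skip the node (return nothing)
/// when its min descendant epoch is strictly greater than the end epoch.
pub fn old_check_skips(node: &Node, end_epoch: u64) -> bool {
    node.min_descendant_epoch > end_epoch
}

/// Filter as written in the diff, using a strict `<` on the end epoch.
pub fn keep_strict(node: &Node, start_epoch: u64, end_epoch: u64) -> bool {
    node.node_type != TreeNodeType::Leaf
        && node.latest_epoch > start_epoch
        && node.min_descendant_epoch < end_epoch
}

/// Same filter, but using `<=`, the exact negation of the earlier `>` check.
pub fn keep_inclusive(node: &Node, start_epoch: u64, end_epoch: u64) -> bool {
    node.node_type != TreeNodeType::Leaf
        && node.latest_epoch > start_epoch
        && node.min_descendant_epoch <= end_epoch
}
```

The two filters only disagree for a non-leaf node whose `min_descendant_epoch` equals `end_epoch`: the earlier check would not have skipped it, the strict filter drops it, and the inclusive filter keeps it.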
Overview
Perform the same optimization as in #454, but for audit node preloading. This involved rewriting the audit node preloading path to be more similar to the regular node preloading path, which should help readability and maintainability. We could likely refactor this even further to reduce code duplication in the future.
Benchmark
Ran the azks benchmark on my 10-core MacBook Pro, on trunk (7fea6be) and on this PR:
Audit Generation (1000 Leaves)

Trunk:
This PR:

Audit Generation (50,000 Leaves)

Trunk:
This PR:

Audit Generation (100,000 Leaves)

Trunk:
This PR:

Performance does regress at small load sizes because we are overly parallel there and the overhead outweighs any benefit. I plan to introduce a change that will allow users to configure parallelism levels based on their expected load size, overriding the current mechanism where parallelism is automatically determined by the core count of the system.
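As a rough illustration of what such a setting could look like, here is a minimal sketch. The `Parallelism` enum and its `resolve` method are hypothetical names invented for this example and are not part of the akd API or of the planned change; the only real call used is `std::thread::available_parallelism` from the standard library.

```rust
use std::thread;

/// Hypothetical configuration type, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Parallelism {
    /// Derive the level from the parallelism the system reports.
    Auto,
    /// Use a caller-chosen level, clamped to at least 1.
    Fixed(usize),
}

impl Parallelism {
    /// Resolve the setting to a concrete number of parallel tasks.
    pub fn resolve(self) -> usize {
        match self {
            // Fall back to 1 if the system cannot report its parallelism.
            Parallelism::Auto => thread::available_parallelism()
                .map(|n| n.get())
                .unwrap_or(1),
            Parallelism::Fixed(n) => n.max(1),
        }
    }
}
```

Under this sketch, a caller expecting small loads could pass something like `Parallelism::Fixed(1)` to avoid the overhead seen in the 1000-leaf case, while `Parallelism::Auto` would keep the current core-count-based behavior.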