accounts-db: Reduce read-only cache lock contention#10641
accounts-db: Reduce read-only cache lock contention#10641vadorovsky merged 1 commit intoanza-xyz:masterfrom
Conversation
`DashMap` defaults to `num_cpus * 4` shards (~256 on 64-core validators), so each shard holds ~200 of the ~45-55k cached accounts and locking one shard blocks many unrelated keys. The current load/store times on master are: ``` read_only_accounts_cache_load_us=5474i read_only_accounts_cache_store_us=293i read_only_accounts_cache_load_us=4659i read_only_accounts_cache_store_us=227i read_only_accounts_cache_load_us=5042i read_only_accounts_cache_store_us=220i read_only_accounts_cache_load_us=3286i read_only_accounts_cache_store_us=92i read_only_accounts_cache_load_us=7615i read_only_accounts_cache_store_us=599i read_only_accounts_cache_load_us=2819i read_only_accounts_cache_store_us=82i read_only_accounts_cache_load_us=3240i read_only_accounts_cache_store_us=204i read_only_accounts_cache_load_us=3832i read_only_accounts_cache_store_us=240i read_only_accounts_cache_load_us=8635i read_only_accounts_cache_store_us=378i read_only_accounts_cache_load_us=9273i read_only_accounts_cache_store_us=260i ``` Increase the shard count to 2^16 (65 536) that keeps a power of two and roughly matches the number of cached accounts we observe on mainnet-beta. The average load is still ~1 account per shard (collisions are common), but compared with the default `num_cpus * 4` shards - where we saw hot shards carrying ~200 accounts - this dramatically lowers contention. After the change, the only visible sleeps are the ones taking ~100ms intervals, so it's the sleep we trigger on each iteration of evictor, rather than sleep coming from the lock contention. In between these sleep intervals, the only visible work is dropping the cache memory (the actual eviction). The new load/store times are: ``` read_only_accounts_cache_load_us=4103i read_only_accounts_cache_store_us=10i read_only_accounts_cache_load_us=4577i read_only_accounts_cache_store_us=72i read_only_accounts_cache_load_us=5233i read_only_accounts_cache_store_us=19i read_only_accounts_cache_load_us=4928i read_only_accounts_cache_store_us=12i read_only_accounts_cache_load_us=2270i read_only_accounts_cache_store_us=5i read_only_accounts_cache_load_us=3722i read_only_accounts_cache_store_us=48i read_only_accounts_cache_load_us=5225i read_only_accounts_cache_store_us=28i read_only_accounts_cache_load_us=5243i read_only_accounts_cache_store_us=9i read_only_accounts_cache_load_us=4631i read_only_accounts_cache_store_us=14i read_only_accounts_cache_load_us=4564i read_only_accounts_cache_store_us=18i ```
b0b94a9 to
02e5530
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #10641 +/- ##
=======================================
Coverage 83.0% 83.0%
=======================================
Files 844 844
Lines 316091 316092 +1
=======================================
+ Hits 262569 262578 +9
+ Misses 53522 53514 -8 🚀 New features to boost your workflow:
|
brooksprumo
left a comment
There was a problem hiding this comment.
Nice perf win. I think the contention reduction easily justifies the extra mem usage for the additional shards. And now we also don't have different behavior based on num cpus.
Backport to v3.1?
|
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis. |
`DashMap` defaults to `num_cpus * 4` shards (~256 on 64-core validators), so each shard holds ~200 of the ~45-55k cached accounts and locking one shard blocks many unrelated keys. The current load/store times on master are: ``` read_only_accounts_cache_load_us=5474i read_only_accounts_cache_store_us=293i read_only_accounts_cache_load_us=4659i read_only_accounts_cache_store_us=227i read_only_accounts_cache_load_us=5042i read_only_accounts_cache_store_us=220i read_only_accounts_cache_load_us=3286i read_only_accounts_cache_store_us=92i read_only_accounts_cache_load_us=7615i read_only_accounts_cache_store_us=599i read_only_accounts_cache_load_us=2819i read_only_accounts_cache_store_us=82i read_only_accounts_cache_load_us=3240i read_only_accounts_cache_store_us=204i read_only_accounts_cache_load_us=3832i read_only_accounts_cache_store_us=240i read_only_accounts_cache_load_us=8635i read_only_accounts_cache_store_us=378i read_only_accounts_cache_load_us=9273i read_only_accounts_cache_store_us=260i ``` Increase the shard count to 2^16 (65 536) that keeps a power of two and roughly matches the number of cached accounts we observe on mainnet-beta. The average load is still ~1 account per shard (collisions are common), but compared with the default `num_cpus * 4` shards - where we saw hot shards carrying ~200 accounts - this dramatically lowers contention. After the change, the only visible sleeps are the ones taking ~100ms intervals, so it's the sleep we trigger on each iteration of evictor, rather than sleep coming from the lock contention. In between these sleep intervals, the only visible work is dropping the cache memory (the actual eviction). The new load/store times are: ``` read_only_accounts_cache_load_us=4103i read_only_accounts_cache_store_us=10i read_only_accounts_cache_load_us=4577i read_only_accounts_cache_store_us=72i read_only_accounts_cache_load_us=5233i read_only_accounts_cache_store_us=19i read_only_accounts_cache_load_us=4928i read_only_accounts_cache_store_us=12i read_only_accounts_cache_load_us=2270i read_only_accounts_cache_store_us=5i read_only_accounts_cache_load_us=3722i read_only_accounts_cache_store_us=48i read_only_accounts_cache_load_us=5225i read_only_accounts_cache_store_us=28i read_only_accounts_cache_load_us=5243i read_only_accounts_cache_store_us=9i read_only_accounts_cache_load_us=4631i read_only_accounts_cache_store_us=14i read_only_accounts_cache_load_us=4564i read_only_accounts_cache_store_us=18i ``` (cherry picked from commit ec0ade5)
…#10641) (#10645) accounts-db: Reduce read-only cache lock contention (#10641) `DashMap` defaults to `num_cpus * 4` shards (~256 on 64-core validators), so each shard holds ~200 of the ~45-55k cached accounts and locking one shard blocks many unrelated keys. The current load/store times on master are: ``` read_only_accounts_cache_load_us=5474i read_only_accounts_cache_store_us=293i read_only_accounts_cache_load_us=4659i read_only_accounts_cache_store_us=227i read_only_accounts_cache_load_us=5042i read_only_accounts_cache_store_us=220i read_only_accounts_cache_load_us=3286i read_only_accounts_cache_store_us=92i read_only_accounts_cache_load_us=7615i read_only_accounts_cache_store_us=599i read_only_accounts_cache_load_us=2819i read_only_accounts_cache_store_us=82i read_only_accounts_cache_load_us=3240i read_only_accounts_cache_store_us=204i read_only_accounts_cache_load_us=3832i read_only_accounts_cache_store_us=240i read_only_accounts_cache_load_us=8635i read_only_accounts_cache_store_us=378i read_only_accounts_cache_load_us=9273i read_only_accounts_cache_store_us=260i ``` Increase the shard count to 2^16 (65 536) that keeps a power of two and roughly matches the number of cached accounts we observe on mainnet-beta. The average load is still ~1 account per shard (collisions are common), but compared with the default `num_cpus * 4` shards - where we saw hot shards carrying ~200 accounts - this dramatically lowers contention. After the change, the only visible sleeps are the ones taking ~100ms intervals, so it's the sleep we trigger on each iteration of evictor, rather than sleep coming from the lock contention. In between these sleep intervals, the only visible work is dropping the cache memory (the actual eviction). The new load/store times are: ``` read_only_accounts_cache_load_us=4103i read_only_accounts_cache_store_us=10i read_only_accounts_cache_load_us=4577i read_only_accounts_cache_store_us=72i read_only_accounts_cache_load_us=5233i read_only_accounts_cache_store_us=19i read_only_accounts_cache_load_us=4928i read_only_accounts_cache_store_us=12i read_only_accounts_cache_load_us=2270i read_only_accounts_cache_store_us=5i read_only_accounts_cache_load_us=3722i read_only_accounts_cache_store_us=48i read_only_accounts_cache_load_us=5225i read_only_accounts_cache_store_us=28i read_only_accounts_cache_load_us=5243i read_only_accounts_cache_store_us=9i read_only_accounts_cache_load_us=4631i read_only_accounts_cache_store_us=14i read_only_accounts_cache_load_us=4564i read_only_accounts_cache_store_us=18i ``` (cherry picked from commit ec0ade5) Co-authored-by: Michal R <[email protected]>
This PR builds on #10640. It can be merged independently, but the issue was identified after applying that change, and the profiles below were captured with #10640 applied.
Problem
DashMapdefaults tonum_cpus * 4shards (~256 on 64-core validators), so each shard holds ~200 of the ~45-55k cached accounts and locking one shard blocks many unrelated keys.The current load/store times on master are:
Summary of Changes
Increase the shard count to 2^16 (65 536) that keeps a power of two and roughly matches the number of cached accounts we observe on mainnet-beta. The average load is still ~1 account per shard (collisions are common), but compared with the default
num_cpus * 4shards - where we saw hot shards carrying ~200 accounts - this dramatically lowers contention.After the change, the only visible sleeps are the ones taking ~100ms intervals, so it's the sleep we trigger on each iteration of evictor, rather than sleep coming from the lock contention. In between these sleep intervals, the only visible work is dropping the cache memory (the actual eviction).
The new load/store times are: