@yihuang yihuang commented Jan 12, 2026

Description

Use a bitmap plus sync.Map to replace the btree in the secondary store, reducing memory allocation.
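To make the idea concrete, here is a minimal sketch of a bitmap plus sync.Map structure. It is not the code from this PR: the names `versionStore`, `Set`, `Delete`, and `Latest` are hypothetical, and it assumes the secondary store is keyed by transaction index within a block of known size, and that a given index is only written or deleted by one goroutine at a time. The bitmap records which indices have an entry, so finding the latest entry below a given index is a backwards bit scan rather than a tree descent, and the values live in a sync.Map.

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
	"sync/atomic"
)

// versionStore is a hypothetical per-key store: transaction index -> value.
type versionStore struct {
	words  []atomic.Uint64 // bit i set => transaction i has an entry
	values sync.Map        // int (transaction index) -> []byte
}

func newVersionStore(blockSize int) *versionStore {
	return &versionStore{words: make([]atomic.Uint64, (blockSize+63)/64)}
}

// Set stores the value first and publishes the bit second, so a reader
// that observes the bit can normally find the value.
func (s *versionStore) Set(idx int, value []byte) {
	s.values.Store(idx, value)
	w, mask := &s.words[idx/64], uint64(1)<<(idx%64)
	for {
		old := w.Load()
		if old&mask != 0 || w.CompareAndSwap(old, old|mask) {
			return
		}
	}
}

// Delete clears the bit first, then drops the value.
func (s *versionStore) Delete(idx int) {
	w, mask := &s.words[idx/64], uint64(1)<<(idx%64)
	for {
		old := w.Load()
		if old&mask == 0 || w.CompareAndSwap(old, old&^mask) {
			break
		}
	}
	s.values.Delete(idx)
}

// Latest returns the entry with the highest index strictly below `before`.
func (s *versionStore) Latest(before int) (int, []byte, bool) {
	if limit := len(s.words) * 64; before > limit {
		before = limit
	}
	if before <= 0 {
		return 0, nil, false
	}
	last := before - 1
	mask := ^uint64(0) >> (63 - last%64) // keep bits 0..last%64 of the first word
	for wi := last / 64; wi >= 0; wi-- {
		w := s.words[wi].Load() & mask
		for w != 0 {
			idx := wi*64 + 63 - bits.LeadingZeros64(w)
			if v, ok := s.values.Load(idx); ok {
				return idx, v.([]byte), true
			}
			w &^= 1 << (idx % 64) // value removed concurrently; try the next lower bit
		}
		mask = ^uint64(0)
	}
	return 0, nil, false
}

func main() {
	s := newVersionStore(128)
	s.Set(3, []byte("a"))
	s.Set(70, []byte("c"))
	idx, v, ok := s.Latest(71)
	fmt.Println(idx, string(v), ok) // 70 c true
	idx, v, ok = s.Latest(70)
	fmt.Println(idx, string(v), ok) // 3 a true
	_, _, ok = s.Latest(3)
	fmt.Println(ok) // false
}
```

In a layout like this the bitmap is allocated up front for every key, whether or not the key is ever contended. That is one plausible source of the higher baseline in the no-conflict benchmarks, though the sketch does not establish that this is what happens in the PR.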

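The benchmarks show a large allocation drop in the random and worst-case scenarios once workers are involved: B/op falls by roughly 48-53% and 60-70% respectively, and allocs/op by roughly 27-31% and 39-51%. CPU time also improves (geomean -3.31%, up to -13.51% for worst-case with 10 workers). The no-conflict case regresses slightly in allocation (about +3-4% B/op and +7% allocs/op) with essentially unchanged timings, probably because of the bitmap's higher baseline overhead, but overall it looks like a good trade.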

NOTE: we could potentially replace the top-level btree with sync.Map as well, but that would add complexity and overhead to iteration support. If we decide to treat iteration as a second-class citizen, we can do that.

benchstat /tmp/before /tmp/after
goos: darwin
goarch: arm64
pkg: github.com/cosmos/cosmos-sdk/blockstm
cpu: Apple M3 Max
                                         │ /tmp/before │             /tmp/after              │
                                         │   sec/op    │    sec/op     vs base               │
BlockSTM/random-10000/100-sequential-16     1.266 ± 1%    1.180 ±  0%   -6.78% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-1-16       1.308 ± 1%    1.203 ±  1%   -8.01% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-5-16      275.6m ± 0%   271.8m ±  0%   -1.37% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-10-16     141.2m ± 0%   139.4m ±  1%   -1.27% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-15-16     120.4m ± 2%   115.5m ±  2%   -4.10% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-20-16     144.6m ± 2%   137.6m ±  2%   -4.83% (p=0.002 n=6)
BlockSTM/no-conflict-10000-sequential-16    1.286 ± 2%    1.272 ±  1%        ~ (p=0.065 n=6)
BlockSTM/no-conflict-10000-worker-1-16      1.323 ± 0%    1.324 ±  0%        ~ (p=0.589 n=6)
BlockSTM/no-conflict-10000-worker-5-16     286.0m ± 0%   286.6m ±  0%   +0.22% (p=0.004 n=6)
BlockSTM/no-conflict-10000-worker-10-16    150.0m ± 1%   150.1m ±  0%        ~ (p=0.589 n=6)
BlockSTM/no-conflict-10000-worker-15-16    120.4m ± 2%   120.3m ±  1%        ~ (p=0.589 n=6)
BlockSTM/no-conflict-10000-worker-20-16    123.9m ± 3%   124.3m ±  1%        ~ (p=0.818 n=6)
BlockSTM/worst-case-10000-sequential-16     1.267 ± 0%    1.259 ±  1%   -0.63% (p=0.009 n=6)
BlockSTM/worst-case-10000-worker-1-16       1.302 ± 1%    1.273 ±  1%   -2.21% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-5-16      427.0m ± 3%   418.3m ±  2%        ~ (p=0.093 n=6)
BlockSTM/worst-case-10000-worker-10-16     410.0m ± 1%   354.6m ±  2%  -13.51% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-15-16     420.9m ± 1%   379.6m ±  1%   -9.80% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-20-16     418.9m ± 1%   376.7m ±  2%  -10.08% (p=0.002 n=6)
BlockSTM/iterate-10000/100-sequential-16    1.283 ± 1%    1.282 ±  0%        ~ (p=0.485 n=6)
BlockSTM/iterate-10000/100-worker-1-16      1.360 ± 1%    1.330 ±  0%   -2.23% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-5-16     292.5m ± 0%   288.7m ±  1%   -1.29% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-10-16    164.3m ± 1%   165.3m ±  5%        ~ (p=0.818 n=6)
BlockSTM/iterate-10000/100-worker-15-16    179.0m ± 1%   172.7m ± 17%        ~ (p=0.132 n=6)
BlockSTM/iterate-10000/100-worker-20-16    214.8m ± 3%   202.4m ±  4%   -5.77% (p=0.002 n=6)
geomean                                    394.4m        381.4m         -3.31%

                                         │  /tmp/before   │              /tmp/after              │
                                         │      B/op      │     B/op       vs base               │
BlockSTM/random-10000/100-sequential-16     9.384Mi ±  0%   9.384Mi ±  0%        ~ (p=0.535 n=6)
BlockSTM/random-10000/100-worker-1-16       71.39Mi ±  0%   36.92Mi ±  0%  -48.29% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-5-16       72.03Mi ±  0%   37.18Mi ±  0%  -48.39% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-10-16      72.87Mi ±  0%   37.36Mi ±  0%  -48.74% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-15-16      80.54Mi ±  0%   39.86Mi ±  1%  -50.51% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-20-16     103.44Mi ±  1%   48.39Mi ±  1%  -53.22% (p=0.002 n=6)
BlockSTM/no-conflict-10000-sequential-16    9.384Mi ± 19%   9.384Mi ± 19%        ~ (p=0.232 n=6)
BlockSTM/no-conflict-10000-worker-1-16      77.27Mi ±  0%   80.02Mi ±  0%   +3.55% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-5-16      79.00Mi ±  0%   81.61Mi ±  0%   +3.31% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-10-16     81.26Mi ±  0%   84.09Mi ±  0%   +3.49% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-15-16     87.54Mi ±  0%   90.55Mi ±  0%   +3.44% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-20-16     90.38Mi ±  1%   94.01Mi ±  1%   +4.02% (p=0.002 n=6)
BlockSTM/worst-case-10000-sequential-16     9.155Mi ±  0%   9.155Mi ±  0%        ~ (p=1.000 n=6)
BlockSTM/worst-case-10000-worker-1-16       83.11Mi ±  0%   33.62Mi ±  0%  -59.55% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-5-16      125.78Mi ±  2%   44.48Mi ±  1%  -64.64% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-10-16     183.79Mi ±  1%   54.92Mi ±  1%  -70.12% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-15-16     212.50Mi ±  0%   66.11Mi ±  0%  -68.89% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-20-16     213.49Mi ±  1%   67.77Mi ±  1%  -68.26% (p=0.002 n=6)
BlockSTM/iterate-10000/100-sequential-16    15.72Mi ±  0%   15.72Mi ±  0%        ~ (p=1.000 n=6)
BlockSTM/iterate-10000/100-worker-1-16     130.84Mi ±  0%   81.05Mi ±  0%  -38.05% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-5-16     134.54Mi ±  0%   83.34Mi ±  0%  -38.05% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-10-16    145.93Mi ±  1%   90.59Mi ±  2%  -37.92% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-15-16     204.5Mi ±  1%   122.1Mi ±  4%  -40.31% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-20-16     268.5Mi ±  1%   154.4Mi ±  1%  -42.48% (p=0.002 n=6)
geomean                                     77.04Mi         48.21Mi        -37.42%

                                         │ /tmp/before  │              /tmp/after              │
                                         │  allocs/op   │  allocs/op   vs base                 │
BlockSTM/random-10000/100-sequential-16     220.0k ± 0%   220.0k ± 0%        ~ (p=1.000 n=6)
BlockSTM/random-10000/100-worker-1-16       891.5k ± 0%   653.2k ± 0%  -26.73% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-5-16       898.9k ± 0%   658.3k ± 0%  -26.77% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-10-16      908.8k ± 0%   661.8k ± 0%  -27.18% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-15-16      999.7k ± 0%   711.8k ± 1%  -28.80% (p=0.002 n=6)
BlockSTM/random-10000/100-worker-20-16     1270.6k ± 1%   881.8k ± 1%  -30.60% (p=0.002 n=6)
BlockSTM/no-conflict-10000-sequential-16    220.0k ± 0%   220.0k ± 0%        ~ (p=1.000 n=6)
BlockSTM/no-conflict-10000-worker-1-16      1.105M ± 0%   1.184M ± 0%   +7.19% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-5-16      1.124M ± 0%   1.203M ± 0%   +6.94% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-10-16     1.150M ± 0%   1.231M ± 0%   +7.01% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-15-16     1.222M ± 0%   1.305M ± 0%   +6.76% (p=0.002 n=6)
BlockSTM/no-conflict-10000-worker-20-16     1.256M ± 1%   1.345M ± 1%   +7.13% (p=0.002 n=6)
BlockSTM/worst-case-10000-sequential-16     220.0k ± 0%   220.0k ± 0%        ~ (p=1.000 n=6) ¹
BlockSTM/worst-case-10000-worker-1-16       985.0k ± 0%   597.4k ± 0%  -39.34% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-5-16      1452.9k ± 2%   810.6k ± 2%  -44.21% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-10-16      2.088M ± 1%   1.013M ± 1%  -51.45% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-15-16      2.409M ± 0%   1.232M ± 0%  -48.88% (p=0.002 n=6)
BlockSTM/worst-case-10000-worker-20-16      2.419M ± 1%   1.263M ± 1%  -47.80% (p=0.002 n=6)
BlockSTM/iterate-10000/100-sequential-16    290.0k ± 0%   290.0k ± 0%        ~ (p=1.000 n=6) ¹
BlockSTM/iterate-10000/100-worker-1-16      1.450M ± 0%   1.091M ± 0%  -24.72% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-5-16      1.490M ± 0%   1.123M ± 0%  -24.66% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-10-16     1.613M ± 1%   1.220M ± 2%  -24.35% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-15-16     2.251M ± 1%   1.649M ± 4%  -26.74% (p=0.002 n=6)
BlockSTM/iterate-10000/100-worker-20-16     2.946M ± 1%   2.102M ± 1%  -28.66% (p=0.002 n=6)
geomean                                     1.039M        811.8k       -21.89%
¹ all samples are equal

Solution:
- use bitmap + sync.Map to replace btree to optimize memory allocation
@yihuang yihuang requested a review from Eric-Warehime January 12, 2026 18:01
@yihuang yihuang changed the title from "optim: optimize secondary store memory allocation" to "optim: optimize block-stm secondary store memory allocation" Jan 12, 2026
codecov bot commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 95.00000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 70.50%. Comparing base (82f1fc2) to head (9a290cf).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| blockstm/secondary_store.go | 93.75% | 2 Missing ⚠️ |
| blockstm/stm.go | 50.00% | 1 Missing ⚠️ |

@@            Coverage Diff             @@
##             main   #25767      +/-   ##
==========================================
- Coverage   70.56%   70.50%   -0.07%     
==========================================
  Files         838      839       +1     
  Lines       54570    54599      +29     
==========================================
- Hits        38508    38494      -14     
- Misses      16062    16105      +43     

| Files with missing lines | Coverage Δ |
|---|---|
| blockstm/mvdata.go | 96.87% <100.00%> (-0.16%) ⬇️ |
| blockstm/mviterator.go | 100.00% <100.00%> (ø) |
| blockstm/stm.go | 77.41% <50.00%> (-1.90%) ⬇️ |
| blockstm/secondary_store.go | 93.75% <93.75%> (ø) |

... and 2 files with indirect coverage changes

@Eric-Warehime Eric-Warehime changed the title from "optim: optimize block-stm secondary store memory allocation" to "perf: optimize block-stm secondary store memory allocation" Jan 20, 2026