Improve BytesRefHash.sort performance by retrieve byte directly from the pool. by tyronecai · Pull Request #15775 · apache/lucene

tyronecai · 2026-02-27T01:52:16Z

Description

RadixSort computes a large number of statistical histograms, which requires numerous byteAt calls.
Currently, each byteAt call first retrieves the BytesRef corresponding to position i from the pool,
and then uses cmp.ByteAt to get the value of that BytesRef at position k.

Because the byteAt calls are so frequent, they cause an observable performance penalty.
profile_cpu_44438.html

Therefore, we can read the byte value for a given start and i directly from the pool, skipping the intermediate BytesRef.
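To illustrate the difference between the two access paths, here is a minimal sketch. It does not use Lucene's classes: `PoolByteAtSketch`, `Slice`, `byteAtViaSlice`, and `byteAtDirect` are hypothetical names, and the storage layout (one flat buffer with a 1-byte length prefix per term) is an assumption made only to keep the example small. Returning -1 past the end of a term is also a convention chosen for this sketch.

```java
import java.nio.charset.StandardCharsets;

/** Toy model only: a flat buffer with 1-byte length prefixes, not Lucene's pool. */
class PoolByteAtSketch {
  /** Hypothetical stand-in for a BytesRef-like view into the buffer. */
  static final class Slice {
    byte[] bytes;
    int offset;
    int length;
  }

  private final byte[] buffer;
  private final int[] starts; // start offset of each term's length prefix
  private final Slice scratch = new Slice();

  PoolByteAtSketch(String... terms) {
    byte[][] encoded = new byte[terms.length][];
    int total = 0;
    for (int i = 0; i < terms.length; i++) {
      encoded[i] = terms[i].getBytes(StandardCharsets.UTF_8);
      if (encoded[i].length > 127) {
        throw new IllegalArgumentException("sketch supports terms up to 127 bytes");
      }
      total += 1 + encoded[i].length;
    }
    buffer = new byte[total];
    starts = new int[terms.length];
    int pos = 0;
    for (int i = 0; i < terms.length; i++) {
      starts[i] = pos;
      buffer[pos++] = (byte) encoded[i].length;
      System.arraycopy(encoded[i], 0, buffer, pos, encoded[i].length);
      pos += encoded[i].length;
    }
  }

  /** Indirect path: fill a view object for term i, then read byte k through it. */
  int byteAtViaSlice(int i, int k) {
    int start = starts[i];
    scratch.bytes = buffer;
    scratch.length = buffer[start];
    scratch.offset = start + 1;
    return k < scratch.length ? scratch.bytes[scratch.offset + k] & 0xFF : -1;
  }

  /** Direct path: read byte k of term i straight from the buffer. */
  int byteAtDirect(int i, int k) {
    int start = starts[i];
    int length = buffer[start];
    return k < length ? buffer[start + 1 + k] & 0xFF : -1;
  }

  public static void main(String[] args) {
    PoolByteAtSketch pool = new PoolByteAtSketch("lucene", "sort", "a");
    for (int i = 0; i < 3; i++) {
      for (int k = 0; k < 8; k++) {
        if (pool.byteAtViaSlice(i, k) != pool.byteAtDirect(i, k)) {
          throw new AssertionError("paths disagree at i=" + i + ", k=" + k);
        }
      }
    }
    System.out.println("both paths return the same bytes");
  }
}
```

Both methods return the same values; the direct path simply avoids populating the intermediate view on every call, which is the kind of saving the description refers to. Whether that saving matters outside a microbenchmark is a separate question.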

Uses the same environment as #15772.

without #15772 + without pool.byteAt

sort 33554432 unique terms in 4543.19 ms

without #15772 + with pool.byteAt

sort 33554432 unique terms in 3866.08 ms, a reduction of (4543.19 - 3866.08) / 4543.19 = 0.149

with #15772 + without pool.byteAt

sort 33554432 unique terms in 3385.94 ms

with #15772 + with pool.byteAt

sort 33554432 unique terms in 2937.54 ms, a reduction of (3385.94 - 2937.54) / 3385.94 = 0.132
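For reference, the relative reductions quoted for these timings are computed as (before - after) / before. A small sketch that reproduces the arithmetic from the four measurements above (`SortTimingReduction` and `relativeReduction` are illustrative names, not part of the PR):

```java
import java.util.Locale;

class SortTimingReduction {
  /** Fraction of time saved when going from beforeMs to afterMs. */
  static double relativeReduction(double beforeMs, double afterMs) {
    return (beforeMs - afterMs) / beforeMs;
  }

  public static void main(String[] args) {
    // Timings in milliseconds, taken from the four runs listed above.
    double withoutBoth = 4543.19;
    double poolByteAtOnly = 3866.08;
    double with15772Only = 3385.94;
    double withBoth = 2937.54;

    System.out.printf(Locale.ROOT, "pool.byteAt without #15772: %.3f%n",
        relativeReduction(withoutBoth, poolByteAtOnly));
    System.out.printf(Locale.ROOT, "pool.byteAt with #15772: %.3f%n",
        relativeReduction(with15772Only, withBoth));
  }
}
```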

However, I'm not sure if this change is appropriate from a code structure perspective,
although it does improve performance.

@dweiss @mikemccand please take a look and give some advice

tyronecai · 2026-03-02T05:22:18Z

@dweiss

Could you please take a look and see if there are any issues with this change?

dweiss · 2026-03-02T17:17:24Z

Checks don't pass. Also, refining this class in microbenchmarks is not the same as running it in the wild - there is significant complexity if you add higher-tier code. I'd say that this may not be worth the increased code complexity. You'd need to try macro-benchmarks (luceneutil) and see if this shows any improvement, I don't think you'll see much there.

tyronecai · 2026-03-02T23:22:20Z

Checks don't pass. Also, refining this class in microbenchmarks is not the same as running it in the wild - there is significant complexity if you add higher-tier code. I'd say that this may not be worth the increased code complexity. You'd need to try macro-benchmarks (luceneutil) and see if this shows any improvement, I don't think you'll see much there.

Okay, I also think this change is a bit strange.

tyronecai added 2 commits February 27, 2026 09:17

Improve the performance of BytesRefHash.sort by reducing the byteAt c…

6ff5fad

…all path. Optimize performance by directly accessing bytes from the pool.

Add byteAt function to get one byte from pool

135d76e

github-actions bot added the module:core/other label Feb 27, 2026

tyronecai added 2 commits February 27, 2026 09:55

Update comment.

a1d8052

Update comment

0fc2545

tyronecai changed the title ~~Improve BytesRefHash.sort performance by get byte directly from the pool.~~ Improve BytesRefHash.sort performance by retrieve byte directly from the pool. Feb 27, 2026

Update changes.

edb6650

github-actions bot added this to the 10.5.0 milestone Feb 27, 2026

tyronecai added 2 commits February 27, 2026 10:01

Update changes.

83e6333

update

a638eeb

tyronecai closed this Mar 2, 2026

tyronecai wants to merge 7 commits into apache:main from tyronecai:patch-2