Improve BytesRefHash.sort performance by retrieve byte directly from the pool. #15775
tyronecai wants to merge 7 commits into apache:main
…all path. Optimize performance by directly accessing bytes from the pool.
Could you please take a look and see whether there are any issues with this change?
Checks don't pass. Also, refining this class in microbenchmarks is not the same as running it in the wild: there is significant complexity once you add higher-tier code. I'd say that this may not be worth the increased code complexity. You'd need to try macro-benchmarks (luceneutil) and see whether they show any improvement; I don't think you'll see much there.
Okay, I also think this change is a bit strange.
Description
RadixSort computes a large number of histograms, and each of them makes numerous byteAt calls.
Currently, byteAt calls `get` to retrieve the BytesRef corresponding to position i from the pool, and then uses `cmp.ByteAt` to get the byte of that BytesRef at position k. Because the byteAt calls are so frequent, they cause an observable performance penalty.
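To illustrate why byteAt is so hot, here is a minimal sketch of a single radix-sort histogram pass. It is not Lucene's sorter code: the class name `RadixHistogramSketch`, the `histogram` helper, and the 257-bucket layout are assumptions made for illustration. The point is only that every pass calls byteAt once per entry, and a sort runs one pass per byte position per bucket.

```java
import java.util.function.IntBinaryOperator;

/** Hypothetical illustration only; not the sorter used by BytesRefHash. */
public class RadixHistogramSketch {

    /**
     * Counts how many entries in [from, to) have each byte value at position k.
     * Assumed convention: byteAt(i, k) returns -1 when entry i is shorter than
     * k + 1 bytes, so bucket 0 holds "too short" and bucket b + 1 holds byte b.
     */
    public static int[] histogram(IntBinaryOperator byteAt, int from, int to, int k) {
        int[] counts = new int[257];
        for (int i = from; i < to; i++) {
            // One byteAt call per entry per pass: this is the hot call site.
            counts[byteAt.applyAsInt(i, k) + 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        byte[][] terms = {"ab".getBytes(), "ac".getBytes(), "b".getBytes()};
        IntBinaryOperator byteAt = (i, k) -> k < terms[i].length ? terms[i][k] & 0xFF : -1;
        int[] first = histogram(byteAt, 0, terms.length, 0);
        System.out.println("entries starting with 'a': " + first['a' + 1]);
    }
}
```

Whatever byteAt does internally is multiplied by the number of entries and the number of passes, which is why a small per-call overhead shows up in the profile.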
profile_cpu_44438.html
Therefore, we can directly retrieve the byte values corresponding to start and i from the pool.
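The difference between the two access paths can be sketched with a toy pool. This is not the code in this PR and not Lucene's ByteBlockPool: `PoolByteAtSketch`, its one-byte length prefix, and the `Slice` view are all hypothetical and stand in for the real pool layout and for BytesRef. It only shows the shape of the change: the indirect path builds a view of the entry and then indexes it, while the direct path reads the byte straight out of the backing buffer.

```java
import java.nio.charset.StandardCharsets;

/** Toy stand-in for a byte pool; hypothetical, not Lucene's ByteBlockPool. */
public class PoolByteAtSketch {

    /** Hypothetical view of one entry, playing the role of a BytesRef. */
    public static final class Slice {
        final int offset;
        final int length;

        Slice(int offset, int length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // One flat buffer; each entry is stored as [1-byte length][payload bytes].
    private final byte[] buffer = new byte[1024];
    private int used = 0;

    /** Appends a value and returns the start offset of its entry. */
    public int add(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        if (bytes.length > 127) {
            throw new IllegalArgumentException("toy pool stores at most 127 bytes per entry");
        }
        if (used + 1 + bytes.length > buffer.length) {
            throw new IllegalStateException("toy pool is full");
        }
        int start = used;
        buffer[used++] = (byte) bytes.length;
        System.arraycopy(bytes, 0, buffer, used, bytes.length);
        used += bytes.length;
        return start;
    }

    /** Materialises a view of the entry that begins at {@code start}. */
    public Slice get(int start) {
        return new Slice(start + 1, buffer[start]);
    }

    /** Indirect path: build the view first, then index into it. */
    public int byteAtViaView(int start, int k) {
        Slice slice = get(start);
        return k < slice.length ? buffer[slice.offset + k] & 0xFF : -1;
    }

    /** Direct path: read the length prefix and the byte from the buffer. */
    public int byteAtDirect(int start, int k) {
        int length = buffer[start];
        return k < length ? buffer[start + 1 + k] & 0xFF : -1;
    }

    public static void main(String[] args) {
        PoolByteAtSketch pool = new PoolByteAtSketch();
        int start = pool.add("abc");
        System.out.println(pool.byteAtViaView(start, 1) == pool.byteAtDirect(start, 1));
    }
}
```

Both methods return the same value for every position. The direct variant simply skips the intermediate view, at the cost of the caller knowing how entries are laid out in the pool, which is the code-structure trade-off raised in this PR.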
Using the same environment as #15772:
without #15772 + without pool.byteAt
without #15772 + with pool.byteAt
with #15772 + without pool.byteAt
with #15772 + with pool.byteAt
However, I'm not sure if this change is appropriate from a code structure perspective,
although it does improve performance.
@dweiss @mikemccand please take a look and give some advice