Avoid allocations in IndexFileNames.parseGeneration method#14920
Avoid allocations in IndexFileNames.parseGeneration method#14920doom369 wants to merge 2 commits intoapache:mainfrom
Conversation
|
This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop receiving this reminder on future updates to the PR. |
Your profiler is lying to you. Toss it and get a better one :) |
| int third = (second != -1) ? filename.indexOf('_', second + 1) : -1; | ||
|
|
||
| int parts = (second == -1 || second >= end) ? 2 : (third == -1 || third >= end) ? 3 : 4; | ||
|
|
There was a problem hiding this comment.
This is a massive increase in complexity of the code: the tradeoff isn't worth it.
It's a simple string method called once, doesn't even approach the cost of the system call to actually read the file.
There was a problem hiding this comment.
In our scenario, index updates (and we have ~6 different indexes) can happen every 5 seconds. It's profiling info from a 5-minute production run with a async-profiler (one of the best at the moment, imho). So, if these allocations were captured, they're not rare. Agree about complexity. But this method now should be also 5-10x faster if that matters for you. If you want, I can provide you with a microbenchmark.
rmuir
left a comment
There was a problem hiding this comment.
Sorry, this doesn't look like the right tradeoff to me. I'd recommend using a better profiler.
I don't see IndexFileNames anywhere in the CPU nor heap profiling of our benchmarks:
|
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
|
This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution! |
IndexFileNames.parseGenerationallocates multiple unnecessary strings (3-6 strings per call) + string[] objects:With often reindexing this allocations become notable during profiling. These allocations could be easily eliminated with a manual underscore search. After the fix only 1 string will be allocated in the worst case.