Plumb merge parallelism via `ConcurrentMergeScheduler` instead of codec by abernardi597 · Pull Request #507 · mikemccand/luceneutil

abernardi597 · 2026-01-12T16:55:12Z

By leaving the HNSW vector codec merge executor as null, the codecs will leverage the intraMergeTaskExecutor provided at merge-time to parallelize the merges.
To populate the field, the corresponding parameters are properly set using ConcurrentMergeScheduler.setMaxMergesAndThreads(int, int).
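
A minimal sketch of the fallback described above, using only JDK types. The class and method names are hypothetical, and this models the behaviour as the PR describes it rather than Lucene's actual code path.

```java
import java.util.concurrent.Executor;

/** Toy model of the executor fallback described in this PR; not Lucene's actual code. */
final class MergeExecutorFallback {
  /**
   * Returns the codec-level executor when one was configured; otherwise returns
   * the executor supplied at merge time (the intra-merge executor).
   */
  static Executor chooseMergeExecutor(Executor codecExecutor, Executor intraMergeExecutor) {
    return codecExecutor != null ? codecExecutor : intraMergeExecutor;
  }
}
```

With the codec executor left as null, the only knob that matters is whatever sized the merge-time executor, which is what this PR routes through the CMS.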

This change results in a bit of a dependency issue with constructing the codec when using the default value ConcurrentMergeScheduler.AUTO_DETECT_MERGES_AND_THREADS, as the auto-detected value only becomes available after setting it on the CMS.
I modified KnnIndexer.createIndex to take a lambda for setting non-default options in order to get the CMS and use its values to construct the codec, but it's admittedly a bit janky.
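
The ordering problem can be illustrated with a small stand-alone model. Everything below (Scheduler, Config, the cores/2 heuristic) is a hypothetical stand-in chosen for illustration; it is not the KnnIndexer code and not Lucene's auto-detection logic.

```java
import java.util.function.Consumer;

/** Toy model of the "configure via lambda" ordering; all types are hypothetical stand-ins. */
final class CreateIndexSketch {
  static final int AUTO_DETECT = -1; // plays the role of AUTO_DETECT_MERGES_AND_THREADS

  /** Stand-in for the merge scheduler: the resolved value exists only after a set call. */
  static final class Scheduler {
    private int maxThreadCount = AUTO_DETECT;

    void setMaxThreads(int requested, int cores) {
      // Placeholder heuristic for illustration only; not Lucene's detection logic.
      maxThreadCount = (requested == AUTO_DETECT) ? Math.max(1, cores / 2) : requested;
    }

    int getMaxThreadCount() {
      return maxThreadCount;
    }
  }

  /** Stand-in for the writer config: owns the scheduler and the codec's worker count. */
  static final class Config {
    final Scheduler scheduler = new Scheduler();
    int codecMergeWorkers = 1;
  }

  /** Builds a default config, then lets the caller override it in one place. */
  static Config createIndex(Consumer<Config> customizer) {
    Config config = new Config();
    customizer.accept(config);
    return config;
  }

  /** Usage: resolve the scheduler first, then size the codec from the resolved value. */
  static Config example(int cores) {
    return createIndex(cfg -> {
      cfg.scheduler.setMaxThreads(AUTO_DETECT, cores);
      cfg.codecMergeWorkers = cfg.scheduler.getMaxThreadCount();
    });
  }
}
```

The point of the lambda is that both steps happen against the same config object, so the codec never sees the unresolved sentinel.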

I also modified some of the reported merge timings and print statements.

These changes were originally made as part of #502.

github-actions · 2026-01-27T00:12:27Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

mikemccand

Phew, this was hard to think about. I like the motivation for this change (use Lucene's defaults, stop creating an extra executor with confusing knobs to tune), but it's too forceful (overloading CMS hard limit on merge debt and HNSW max merge concurrency).

Could we keep the HNSW merge concurrency parameter separate? Rename it to -hnswMergeWorkerCount or so, and pass that to Lucene99HnswVectorsFormat when we getCodec(...). The javadocs for that class state that you must provide an executor when numMergeWorkers > 1, but I think that's not true? It seems to (correctly) fall back to CMS's intra-merge thread pool? Maybe open an upstream PR to fix that if so.

And then change what -numMergeThread does as you do here? It becomes the soft limit (maxMergeThreads) for CMS, which is also what it sizes its intra-merge thread pool to, so that will be consistent with how KnnGraphTester works today. Since you must also set CMS's hard limit on merge debt, pick something dynamic ... maybe max(numMergeThread+2, 3*numMergeThread/2) ish? And add a new parameter (-maxMergeDebt maybe?) so user could override that dynamic default if they want?
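
As a worked example of the suggested dynamic default, here is a small sketch. The method names and the UNSET sentinel are hypothetical, and the override parameter mirrors the tentatively named -maxMergeDebt flag from the comment above.

```java
/** Sketch of the suggested default for the CMS hard limit; names are hypothetical. */
final class MergeDebtDefaults {
  static final int UNSET = -1; // hypothetical sentinel: the override flag was omitted

  /** max(numMergeThread + 2, 3 * numMergeThread / 2), using integer division. */
  static int defaultMaxMergeCount(int maxMergeThreads) {
    return Math.max(maxMergeThreads + 2, 3 * maxMergeThreads / 2);
  }

  /** Uses the explicit override when given, otherwise the dynamic default. */
  static int resolveMaxMergeCount(int maxMergeThreads, int maxMergeDebtOverride) {
    return maxMergeDebtOverride == UNSET
        ? defaultMaxMergeCount(maxMergeThreads)
        : maxMergeDebtOverride;
  }
}
```

For 1, 4, and 8 merge threads this gives hard limits of 3, 6, and 12. The +2 term wins below 4 threads, the two terms tie at 4 and 5, and the 3/2 factor wins from 6 threads up. Either way the hard limit stays above the soft limit.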

mikemccand · 2026-02-10T11:16:33Z

src/main/knn/KnnGraphTester.java

    numQueryVectors = 1000;
    dim = 256;
    topK = 100;
-    numMergeThread = 1;


OK we are also switching to Lucene's default concurrency with this change right? Previously knnPerfTest.py was defaulting to 1 and 1 here?

abernardi597 · 2026-02-27

Yes, it makes more sense to me that omitting an option should use whatever the Lucene default is.
It's also more ergonomic, in my opinion, if you want the Lucene default; otherwise you'd have to specify -1 on the CLI, which can be ambiguous.

mikemccand · 2026-02-10T11:32:05Z

src/main/knn/KnnGraphTester.java

-    numMergeThread = 1;
-    numMergeWorker = 1;
+    numMergeThread = ConcurrentMergeScheduler.AUTO_DETECT_MERGES_AND_THREADS;
+    numMergeWorker = ConcurrentMergeScheduler.AUTO_DETECT_MERGES_AND_THREADS;


OK what actually is the difference between worker and thread here, even before your change?

It looks like numMergeWorker is how much concurrency a single HNSW merge will use, and numMergeThread is the size of the globally shared (ConcurrentMergeScheduler's intraMergeExecutor) thread pool.

I don't get why these are separately configurable. If I have only one merge to run, wouldn't I want to use all available concurrency? So I should set numMergeWorker >= numMergeThread. But then it's silly to expect/ask for more concurrency than you can possibly execute, so then numMergeWorker should be <= numMergeThread. Intersecting the two ... I should just always set them to the same value?

Or, perhaps we lose throughput if we try to do too many concurrent tasks for a single HNSW merge (thread sync/context switch overhead), in which case maybe I limit workers and expect multiple merges to typically be running to soak up all the concurrent threads?

Anyway it's all intensely confusing, and this PR is great progress (eliminating extra confusion added on top by creating another executor, and falling back to Lucene's preferred default path)...

abernardi597 · 2026-02-27

If I follow correctly, it seems like we can just set numMergeWorker to cms.getMaxThreadCount() instead of cms.getMaxMergeCount().

I'm not concerned about thread overhead, since the HNSW merges are batched anyway to mitigate the overhead of over-provisioning concurrency.

mikemccand · 2026-02-10T11:46:49Z

src/main/knn/KnnGraphTester.java

-      ).createIndex();
+      ).createIndex(iwc -> {
+        ConcurrentMergeScheduler cms = (ConcurrentMergeScheduler) iwc.getMergeScheduler();
+        cms.setMaxMergesAndThreads(numMergeWorker, numMergeThread);


Hmmm ... this isn't quite right ... it's overloading two very different settings.

setMaxMergesAndThreads takes two parameters, maxMergeCount and maxThreadCount. Think of maxMergeCount as a hard limit on number of running merges (merge debt), and maxThreadCount as a soft limit.

When merge debt crosses the soft limit, ConcurrentMergeScheduler (CMS) begins pausing/unpausing enough merges to keep the running merge count at the soft limit. If merge debt continues to grow, CMS takes the even more drastic step of forcefully stalling the incoming indexing threads (the threads causing new segments to poof into existence, the "producers", or the "mutator threads" in GC-speak). These are CMS's two backpressure mechanisms against the adversarial mutators.

Whereas, numMergeWorkers here is a very different concept: it's how much concurrency the HNSW merger, for a single segment, is able to take advantage of. It really should not be a configurable parameter, I think -- it should be "as much concurrency as you have to offer" (numMergeThreads here).
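
The two backpressure mechanisms can be sketched as a toy model. This is a simplification under stated assumptions: the names are hypothetical, and the exact comparisons Lucene uses (for example > versus >=) may differ from what is shown here.

```java
/** Toy model of CMS backpressure as described above; not Lucene's implementation. */
final class BackpressureModel {
  /** Snapshot of what the scheduler would do for a given merge debt. */
  static final class State {
    final int running;
    final int paused;
    final boolean indexingStalled;

    State(int running, int paused, boolean indexingStalled) {
      this.running = running;
      this.paused = paused;
      this.indexingStalled = indexingStalled;
    }
  }

  /**
   * @param mergeDebt number of merges that currently want to run
   * @param maxThreadCount soft limit: merges allowed to make progress at once
   * @param maxMergeCount hard limit: beyond this, indexing threads are stalled
   */
  static State apply(int mergeDebt, int maxThreadCount, int maxMergeCount) {
    int running = Math.min(mergeDebt, maxThreadCount); // soft limit caps active merges
    int paused = mergeDebt - running;                  // the excess is paused, not dropped
    boolean stalled = mergeDebt > maxMergeCount;       // assumed threshold for stalling producers
    return new State(running, paused, stalled);
  }
}
```

With a soft limit of 4 and a hard limit of 6, a debt of 5 leaves 4 merges running and 1 paused, while a debt of 7 also stalls the indexing threads. Neither limit says anything about how many threads a single HNSW merge may use, which is why reusing them for numMergeWorker conflates two concepts.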

abernardi597 · 2026-02-27 (edited)

See above, I think we can use getMaxThreadCount() for numMergeWorker. Really, -numMergeWorker should be -maxMergeCount and -numMergeThread should be -maxMergeThreads?

And add an override flag for hnswMergeThreads, perhaps.
Does it make sense to overload the flag to control whether the intra-merge executor is used or not?
That is:

- if you omit the flag, the intra-merge executor is used for HNSW merges at whatever getMaxThreadCount() is
- if you provide a value for the flag, a separate executor is used

As far as I can tell, there isn't much else that uses the intra-merge executor, so it seems redundant to even support the separate executor. I wonder why this escape hatch was added instead of just using the intra-merge executor?

github-actions · 2026-02-25T00:16:21Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the dev@lucene.apache.org list. Thank you for your contribution!

abernardi597 · 2026-03-09T21:57:09Z

The latest revision of this PR:

- Adds a numMaxMerge flag to optionally override the corresponding CMS parameter
- Uses the existing numMergeThread flag to optionally override the corresponding CMS parameter
- Uses the existing numMergeWorker flag to optionally override HNSW graph merge parallelism
  - By default, use the intraMergeExecutor from CMS with parallelism set to the (possibly computed) getMaxThreadCount()
  - Using -numMergeWorker 1 uses the calling merge thread (no parallelism)
  - Using -numMergeWorker N for N>1 uses a separate ForkJoinPool with parallelism N
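
The three -numMergeWorker cases above could be resolved along these lines. This is a sketch, not the code in the PR: the class, the Setup holder, and the UNSET sentinel are hypothetical, and a null executor stands for "let the codec fall back to the CMS intra-merge executor".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;

/** Sketch of how the -numMergeWorker flag could map to codec settings; names are hypothetical. */
final class GraphMergeParallelism {
  static final int UNSET = -1; // hypothetical sentinel: -numMergeWorker was omitted

  /** Worker count plus the executor to hand to the codec (null = use the intra-merge executor). */
  static final class Setup {
    final int numMergeWorkers;
    final ExecutorService executor;

    Setup(int numMergeWorkers, ExecutorService executor) {
      this.numMergeWorkers = numMergeWorkers;
      this.executor = executor;
    }
  }

  static Setup resolve(int numMergeWorker, int cmsMaxThreadCount) {
    if (numMergeWorker == UNSET) {
      // Default: share the CMS intra-merge executor at its (possibly computed) thread count.
      return new Setup(cmsMaxThreadCount, null);
    }
    if (numMergeWorker < 1) {
      throw new IllegalArgumentException("numMergeWorker must be >= 1, got " + numMergeWorker);
    }
    if (numMergeWorker == 1) {
      // No parallelism: the graph merge runs on the calling merge thread.
      return new Setup(1, null);
    }
    // Explicit override: a dedicated pool; the caller is responsible for shutting it down.
    return new Setup(numMergeWorker, new ForkJoinPool(numMergeWorker));
  }
}
```

Only the explicit N>1 case creates a second pool, so the default path keeps a single place to size merge concurrency.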

abernardi597 force-pushed the parallel-merge branch from cbf0b90 to 8ec4176 Compare January 12, 2026 17:59

github-actions bot added the Stale label Jan 27, 2026

abernardi597 force-pushed the parallel-merge branch from 8ec4176 to 6c00719 Compare February 3, 2026 18:24

github-actions bot removed the Stale label Feb 4, 2026

mikemccand reviewed Feb 10, 2026

github-actions bot added Stale and removed Stale labels Feb 25, 2026

abernardi597 force-pushed the parallel-merge branch from 6c00719 to dc50806 Compare March 9, 2026 20:40

abernardi597 added 2 commits March 9, 2026 21:18

- Fix merge concurrency (543294f)
- Add graph merge executor override (9591d33)

abernardi597 force-pushed the parallel-merge branch from dc50806 to 9591d33 Compare March 9, 2026 21:18

abernardi597 wants to merge 2 commits into mikemccand:main from abernardi597:parallel-merge
2 participants