Upgrade to Lucene 10 #3151

Open
lintool wants to merge 10 commits into master from lucene10
lintool (Member) commented Mar 8, 2026

The branch lucene10 is a long-lived feature branch for upgrading to Lucene 10. The idea is that all other Lucene 10 features would be PRed against this branch, and when we're ready, we can merge this to master in one go.

That means we should keep this branch mergeable into master at all times.

WIP, will keep in draft form until we're ready to merge.

lintool mentioned this pull request Mar 8, 2026

chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98022af436

    public KnnVectorsFormat getKnnVectorsFormatForField(String field) {
      return new DelegatingKnnVectorsFormat(new Lucene99HnswVectorsFormat(args.M, args.efC), 4096);
    }
  });
}

this.writer = new IndexWriter(dir, config);

P1: Reapply writer settings before creating HNSW IndexWriter

The constructor now instantiates IndexWriter immediately after codec selection, but no longer sets OpenMode.CREATE or setRAMBufferSizeMB(args.memoryBuffer) on IndexWriterConfig. As a result, rerunning indexing against an existing -index path can append to prior data instead of recreating the index, and the -memoryBuffer CLI argument is silently ignored. This changes indexing semantics and can invalidate reproducibility for dense-vector experiments.
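
A minimal sketch of the fix this comment asks for, assuming both settings are applied to the IndexWriterConfig before the writer is constructed. The names `HnswWriterConfigSketch`, `newConfig`, and `memoryBufferMB` are hypothetical stand-ins for the indexer's constructor and `args.memoryBuffer`, and the codec setup from the snippet above is omitted. It needs lucene-core on the classpath.

```java
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;

public class HnswWriterConfigSketch {

  // Hypothetical helper: memoryBufferMB stands in for args.memoryBuffer.
  static IndexWriterConfig newConfig(double memoryBufferMB) {
    IndexWriterConfig config = new IndexWriterConfig();

    // Recreate the index on every run instead of appending to prior data.
    config.setOpenMode(OpenMode.CREATE);

    // Honor the -memoryBuffer CLI argument instead of Lucene's default.
    config.setRAMBufferSizeMB(memoryBufferMB);

    return config;
  }
}
```

Lucene's defaults are `OpenMode.CREATE_OR_APPEND` and a 16 MB RAM buffer, so dropping these two calls changes behavior without raising any error.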

rs = searcher.search(query, useRM3 ? searchArgs.rerankcutoff : k, BREAK_SCORE_TIES_BY_DOCID, true);
}
context = new RerankerContext<>(searcher, null, query, null,
TopDocs rs = searcher.search(query, useRM3 ? searchArgs.rerankcutoff : k, BREAK_SCORE_TIES_BY_DOCID, true);

P2: Keep Lucene8 compatibility branch in feedback-term search

_search still uses backwardsCompatibilityLucene8 to avoid sorted tie-breaking on old indexes, but _get_feedback_terms was changed to always call search(..., BREAK_SCORE_TIES_BY_DOCID, true). On Lucene 8 indexes, this reintroduces the docvalues-based incompatibility that the compatibility flag is meant to bypass, so feedback-term extraction can fail even when normal search() still works.
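
To illustrate the branch being asked for, here is a standalone sketch in plain Java that models the two code paths without using Lucene. `TieBreakSketch`, `Hit`, and `rank` are hypothetical names, and the docid tie-break is an assumption about what `BREAK_SCORE_TIES_BY_DOCID` does, inferred from its name. With the compatibility flag set, hits are ordered by score only and ties keep their incoming order; otherwise ties are broken by docid.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TieBreakSketch {

  // Hypothetical stand-in for a scored hit; not a Lucene type.
  static final class Hit {
    final String docid;
    final float score;

    Hit(String docid, float score) {
      this.docid = docid;
      this.score = score;
    }
  }

  // Returns docids in ranked order. The flag mirrors backwardsCompatibilityLucene8.
  static List<String> rank(List<Hit> hits, boolean lucene8Compat) {
    // Score descending.
    Comparator<Hit> byScore = (a, b) -> Float.compare(b.score, a.score);
    // Score descending, then docid ascending (assumed tie-break rule).
    Comparator<Hit> byScoreThenDocid = (a, b) -> {
      int c = Float.compare(b.score, a.score);
      return c != 0 ? c : a.docid.compareTo(b.docid);
    };

    List<Hit> sorted = new ArrayList<>(hits);
    // List.sort is stable, so in compat mode tied hits keep their input order.
    sorted.sort(lucene8Compat ? byScore : byScoreThenDocid);

    List<String> docids = new ArrayList<>();
    for (Hit h : sorted) {
      docids.add(h.docid);
    }
    return docids;
  }
}
```

In the real searcher the compat path would skip the sort argument entirely rather than pick a different comparator; the point here is only that both `_search` and `_get_feedback_terms` would consult the same flag.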

codecov (bot) commented Mar 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.39%. Comparing base (a64b6e4) to head (e9d571a).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3151      +/-   ##
============================================
- Coverage     74.14%   73.39%   -0.75%     
- Complexity     1653     1702      +49     
============================================
  Files           196      199       +3     
  Lines         13117    13403     +286     
  Branches       1708     1765      +57     
============================================
+ Hits           9725     9837     +112     
- Misses         2691     2845     +154     
- Partials        701      721      +20     
