Coordinate DocValuesSkipper across fields for multi-range conjunctions by sgup432 · Pull Request #15793 · apache/lucene

sgup432 · 2026-03-04T02:06:24Z

Description

See the related issue for more details: #15770

This PR adds MultiFieldDocValuesRangeQuery, which coordinates DocValuesSkipper evaluation across fields. BooleanQuery.rewrite() detects the pattern (2+ required NumericDocValuesRangeQuery clauses on distinct fields) and replaces them with a single coordinated query.
The main logic of MultiFieldDocValuesRangeQuery lives in its Concatenated iterator, which works with the DocValuesSkipper of each target field and advances them together.
The PR also includes a JMH benchmark to validate the optimization.
The benchmark covers different data patterns, document counts, and numbers of range fields queried together.
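To make the coordination idea concrete, here is a minimal, self-contained sketch. It is a toy model and not the code in this PR: `CoordinatedSkipSketch`, `BLOCK_SIZE`, and the helper methods are hypothetical names, per-block min/max is recomputed on the fly instead of being read from a real DocValuesSkipper, and every document is assumed to have a value in every field. It only illustrates the general principle that a block of doc IDs can be dropped for all fields as soon as one field's min/max range is disjoint from its query range.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative toy model only; not the PR's code and not Lucene's DocValuesSkipper API. */
final class CoordinatedSkipSketch {

  // Tiny block size so the example can be traced by hand (hypothetical value).
  static final int BLOCK_SIZE = 4;

  /** Returns {min, max} of values[from..to), standing in for per-block skip metadata. */
  static long[] blockMinMax(long[] values, int from, int to) {
    long min = Long.MAX_VALUE;
    long max = Long.MIN_VALUE;
    for (int doc = from; doc < to; doc++) {
      min = Math.min(min, values[doc]);
      max = Math.max(max, values[doc]);
    }
    return new long[] {min, max};
  }

  /** True if at least one field's block range is disjoint from that field's query range. */
  static boolean anyFieldRulesOut(long[][] fields, long[] lower, long[] upper, int from, int to) {
    for (int f = 0; f < fields.length; f++) {
      long[] minMax = blockMinMax(fields[f], from, to);
      if (minMax[1] < lower[f] || minMax[0] > upper[f]) {
        return true;
      }
    }
    return false;
  }

  /** True if the doc's value in every field f lies in [lower[f], upper[f]]. */
  static boolean matchesAll(long[][] fields, long[] lower, long[] upper, int doc) {
    for (int f = 0; f < fields.length; f++) {
      long v = fields[f][doc];
      if (v < lower[f] || v > upper[f]) {
        return false;
      }
    }
    return true;
  }

  /** Returns doc IDs matching all ranges; fields[f][doc] is the value of field f for doc. */
  static List<Integer> search(long[][] fields, long[] lower, long[] upper) {
    List<Integer> hits = new ArrayList<>();
    int docCount = fields[0].length;
    for (int from = 0; from < docCount; from += BLOCK_SIZE) {
      int to = Math.min(docCount, from + BLOCK_SIZE);
      if (anyFieldRulesOut(fields, lower, upper, from, to)) {
        continue; // one disjoint field is enough: no per-doc work for any field
      }
      for (int doc = from; doc < to; doc++) {
        if (matchesAll(fields, lower, upper, doc)) {
          hits.add(doc);
        }
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    long[][] fields = {
      {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12},
      {10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120}
    };
    // field0 in [5, 8] AND field1 in [60, 100]
    System.out.println(search(fields, new long[] {5, 60}, new long[] {8, 100})); // prints [5, 6, 7]
  }
}
```

In this toy run the first and third blocks are dropped because field0 alone rules them out, so field1 is never inspected per document there.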

JMH Benchmark Results

Pattern	Docs	Fields	Without optimization	With optimization	Speedup
clustered	1M	3	16,417	61,342	3.7x
clustered	1M	5	11,523	57,487	5.0x
clustered	10M	3	16,148	55,677	3.4x
clustered	10M	5	13,128	42,154	3.2x
mixed	1M	3	859	1,001	1.17x
mixed	1M	5	514	873	1.70x
mixed	10M	3	76	79	1.03x
mixed	10M	5	50	69	1.38x
random	1M	3	62	68	1.10x
random	1M	5	45	64	1.42x
random	10M	3	4.3	6.5	1.51x
random	10M	5	3.5	5.8	1.65x
sorted	1M	3	920	841	0.91x
sorted	1M	5	611	882	1.44x
sorted	10M	3	69	78	1.14x
sorted	10M	5	55	68	1.22x
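The Speedup column appears to be the ratio of the "with" score to the "without" score, which implies that higher scores are better. The units are not stated in the description, so that reading is an assumption. A small sketch of the arithmetic, using two rows from the table (`SpeedupSketch` is a hypothetical name):

```java
import java.util.Locale;

/** Illustrative arithmetic only; assumes higher benchmark scores are better. */
final class SpeedupSketch {

  /** Ratio of the optimized score to the baseline score; above 1 means the optimization helped. */
  static double speedup(double baseline, double optimized) {
    return optimized / baseline;
  }

  public static void main(String[] args) {
    // clustered, 1M docs, 3 fields
    System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(16_417, 61_342))); // prints 3.74x
    // sorted, 1M docs, 3 fields: the one row where the ratio falls below 1
    System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(920, 841))); // prints 0.91x
  }
}
```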

Query used

{"bool":{"filter":[{"range":{"field0":{"gte":"X","lte":"Y"}}},{"range":{"field1":{"gte":"A","lte":"B"}}},{"range":{"field2":{"gte":"M","lte":"N"}}}]}}

Data patterns:

clustered: All field values increase with docID (e.g., time-series data where timestamp, sequence number, and sensor readings grow together). Narrow query ranges eliminate most blocks. Best case for coordination (3.2–5.0x).
mixed: Combination of monotonic (timestamp), low-cardinality (20 values, like order status), and random fields (price). Resembles e-commerce order filtering. Small to moderate gains (1.03–1.70x).
sorted: Index sorted by one field (timestamp), other fields random. Resembles time-series indexed by ingestion time but queried on unsorted metric fields. Similar to mixed (0.91–1.44x), including one slight regression at 1M docs with 3 fields.
random: All fields uniformly random with wide query ranges. Worst case for skipping, but still shows gains (1.10–1.65x): when one field eliminates a block, the remaining fields do not need to be checked for that block.
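The gap between the clustered and random patterns follows from how block-level min/max metadata behaves. The sketch below is a toy model under stated assumptions, not the benchmark from this PR: `DataPatternSketch`, the 1,024-doc corpus, the 64-doc block size, and the multiplier 751 are all hypothetical choices. It counts how many blocks a single narrow range can rule out when values grow with docID versus when they are scattered across doc IDs.

```java
/** Illustrative toy model only; sizes and the scatter function are arbitrary assumptions. */
final class DataPatternSketch {

  static final int DOCS = 1024;
  static final int BLOCK_SIZE = 64;

  /** Counts blocks whose [min, max] value range does not overlap [lower, upper]. */
  static int skippableBlocks(long[] values, long lower, long upper) {
    int skipped = 0;
    for (int from = 0; from < values.length; from += BLOCK_SIZE) {
      int to = Math.min(values.length, from + BLOCK_SIZE);
      long min = Long.MAX_VALUE;
      long max = Long.MIN_VALUE;
      for (int doc = from; doc < to; doc++) {
        min = Math.min(min, values[doc]);
        max = Math.max(max, values[doc]);
      }
      if (max < lower || min > upper) {
        skipped++;
      }
    }
    return skipped;
  }

  /** Values increase with docID, like the clustered pattern. */
  static long[] clustered() {
    long[] values = new long[DOCS];
    for (int doc = 0; doc < DOCS; doc++) {
      values[doc] = doc;
    }
    return values;
  }

  /** Same set of values, but spread across doc IDs by a fixed odd multiplier. */
  static long[] scattered() {
    long[] values = new long[DOCS];
    for (int doc = 0; doc < DOCS; doc++) {
      values[doc] = (doc * 751L) % DOCS;
    }
    return values;
  }

  public static void main(String[] args) {
    System.out.println(skippableBlocks(clustered(), 100, 163)); // prints 14 (of 16 blocks)
    System.out.println(skippableBlocks(scattered(), 100, 163)); // prints 0
  }
}
```

With scattered values every block spans nearly the full value range, so no single field can rule a block out on its own. Any remaining benefit then has to come from the combination of fields, which is consistent with the smaller speedups reported for the random pattern.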

Coordinate DocValuesSkipper across fields for multi-range conjunctions

de048cd

github-actions bot added the module:core/search label Mar 4, 2026

sgup432 mentioned this pull request Mar 4, 2026

Add benchmark for multi-field numeric range conjunction queries mikemccand/luceneutil#549

Open

Add changelog

6de0d39

github-actions bot added this to the 11.0.0 milestone Mar 4, 2026

sgup432 wants to merge 2 commits into apache:main from sgup432:multi_field_doc_values_skip
