
Lucene103 blocktree TrieReader regression: 4x slower seekExact for _id lookups vs lucene90 FST #15820

@Vikasht34

Description

Component: core/codecs

The lucene103 blocktree codec replaced the in-memory FST term index with
an on-disk TrieReader. This causes a significant performance regression
for workloads that perform high-frequency seekExact() calls on the _id
field during document indexing.

Environment

  • OpenSearch 3.3 (Lucene 10.x with lucene103 codec) vs OpenSearch 2.19 (Lucene 9.12.0 with lucene90 codec)
  • JDK: Amazon Corretto 21.0.8
  • Workload: 32 KNN indices, 6 shards each, mixed ingest+query (50/50),
    bulk indexing with explicit _id (UUID), ~400 segments per index at
    refresh_interval=1s

Problem

Every indexed document with an explicit _id triggers
PerThreadIDVersionAndSeqNoLookup.getDocID(), which calls
SegmentTermsEnum.seekExact(BytesRef) on every segment to check for
version conflicts. With ~400 segments per index, each document therefore
requires ~400 seekExact calls.
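The per-segment fan-out can be illustrated with a small stdlib-only model. This is not Lucene code; PerSegmentLookupModel, seekExact, and lookup are hypothetical stand-ins for SegmentTermsEnum.seekExact and PerThreadIDVersionAndSeqNoLookup.getDocID:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Minimal model of why per-document _id lookups scale with segment count:
// the version-conflict check must probe segments until a match is found,
// so a miss (a brand-new _id) costs one seekExact per segment.
public class PerSegmentLookupModel {
    static int probes = 0;

    // Stand-in for SegmentTermsEnum.seekExact on one segment's terms index.
    static boolean seekExact(Set<String> segmentIds, String id) {
        probes++;
        return segmentIds.contains(id);
    }

    // Stand-in for the per-document lookup across all segments of an index.
    static boolean lookup(List<Set<String>> segments, String id) {
        for (Set<String> seg : segments) {
            if (seekExact(seg, id)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int numSegments = 400;
        List<Set<String>> segments = new ArrayList<>();
        for (int i = 0; i < numSegments; i++) segments.add(new HashSet<>());

        // A fresh UUID exists in no segment: worst case, all 400 are probed.
        lookup(segments, UUID.randomUUID().toString());
        System.out.println("probes for one new _id: " + probes); // 400
    }
}
```

Under these assumptions, indexing one new document costs as many seekExact calls as there are live segments, which is why segment count dominates the regression.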

In lucene90, seekExact navigates an in-memory FST (heap-resident).
In lucene103, seekExact navigates a TrieReader via memory-mapped file
reads, where each read triggers MemorySessionImpl.checkValidStateRaw(),
the Panama Foreign Memory API check that the backing memory session is
still alive.

JFR Evidence

Write thread profiling (JFR ExecutionSample) shows:

lucene10.3.1: 10.0% of write thread time in the seekExact path
DataInput.readVLong()
SegmentTermsEnumFrame.loadBlock()
SegmentTermsEnum.lambda$prepareSeekExact$1(BytesRef)
SegmentTermsEnum.seekExact(BytesRef)
PerThreadIDVersionAndSeqNoLookup.getDocID()

lucene9.12: 2.6% of write thread time in the seekExact path
FST$Arc$BitTable.isBitSet()
FST.findTargetArc()
SegmentTermsEnum.seekExact(BytesRef)
PerThreadIDVersionAndSeqNoLookup.getDocID()

Additionally, 6.6% of write thread time is spent in
MemorySessionImpl.checkValidStateRaw() on memory-mapped reads triggered
by the TrieReader navigation.

Combined: 16.6% of write thread time vs 2.6% on lucene90, a ~6.4x
regression for this code path.
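The regression factor follows directly from the JFR percentages above; a quick arithmetic check using only the numbers reported in this issue:

```java
public class RegressionMath {
    public static void main(String[] args) {
        // lucene103: seekExact path (10.0%) plus Panama liveness checks (6.6%)
        double lucene103Pct = 10.0 + 6.6;   // 16.6% of write thread time
        double lucene90Pct  = 2.6;          // FST-based seekExact path
        double factor = lucene103Pct / lucene90Pct;
        System.out.printf("%.1f%% vs %.1f%% = %.1fx regression%n",
                lucene103Pct, lucene90Pct, factor);
        // prints "16.6% vs 2.6% = 6.4x regression"
    }
}
```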

Impact

At 256,000 seekExact calls/sec (32 TPS × 20 docs/bulk × 400 segments),
this overhead causes:

  • 1.9x per-document indexing latency (577µs vs 303µs)
  • Search thread saturation under mixed workload (queries slow down due
    to CPU contention)
  • Ingestion stalls at 297k docs/tenant vs 600k+ on lucene90

Increasing refresh_interval from 1s to 30s (reducing segments from ~400
to ~13) mitigates the issue by cutting seekExact call volume ~30x,
pushing the stall point from 297k to 497k docs/tenant.
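The call-rate arithmetic behind both figures can be checked directly; this sketch uses only the numbers reported in this issue (32 TPS, 20 docs/bulk, ~400 vs ~13 segments):

```java
public class SeekRateMath {
    public static void main(String[] args) {
        int tps = 32, docsPerBulk = 20;

        // refresh_interval=1s: ~400 segments per index
        int callsAt1s = tps * docsPerBulk * 400;   // 256,000 seekExact/sec
        // refresh_interval=30s: ~13 segments per index
        int callsAt30s = tps * docsPerBulk * 13;   // 8,320 seekExact/sec

        System.out.println("1s refresh:  " + callsAt1s + " seekExact/sec");
        System.out.println("30s refresh: " + callsAt30s + " seekExact/sec");
        System.out.println("reduction:   ~" + (callsAt1s / callsAt30s) + "x"); // ~30x
    }
}
```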

Root Cause

Two compounding factors:

  1. TrieReader replaces in-memory FST with on-disk trie navigation.
    The FST was loaded into Java heap at segment open time — navigation
    was pure CPU (BitTable.isBitSet). The TrieReader reads from
    memory-mapped files, adding I/O indirection.

  2. Each memory-mapped read triggers checkValidStateRaw() — the Panama
    Foreign Memory API check that verifies the Arena (memory session) is
    still open. This check runs on every read from the mmapped file.
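A minimal stdlib sketch of the difference between the two read paths. ValidityCheckModel and its methods are hypothetical stand-ins, not the actual Panama or Lucene implementation; the point is only that the mmap path pays a per-read liveness check that the heap path does not:

```java
// Heap-resident FST reads are plain array accesses; the memory-mapped
// TrieReader path must confirm the backing session is still alive before
// every read, analogous to MemorySessionImpl.checkValidStateRaw().
public class ValidityCheckModel {
    static long checks = 0;

    static final byte[] data = new byte[1024];
    static volatile boolean sessionOpen = true; // stands in for Arena liveness

    // Heap path: direct array read, no per-read validation.
    static byte readHeap(int pos) {
        return data[pos];
    }

    // Mmap-style path: validate liveness on every single read.
    static byte readMapped(int pos) {
        if (!sessionOpen) throw new IllegalStateException("session closed");
        checks++; // one liveness check per read
        return data[pos];
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) readMapped(i % data.length);
        System.out.println("liveness checks for 1000 reads: " + checks); // 1000
    }
}
```

In this model the check is a cheap flag test; the JFR data above suggests the real checkValidStateRaw() cost becomes material only at hundreds of thousands of reads per second.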

The _id field is special: it is looked up via seekExact on every single
document indexed. It has a random access pattern (UUIDs) that does not
benefit from the TrieReader's sequential access optimizations.
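Why random UUIDs are a worst case can be sketched by comparing the shared prefixes of consecutive lookup keys (AccessPatternSketch is a hypothetical illustration, not Lucene code):

```java
import java.util.UUID;

// Sequential ids share long prefixes between consecutive lookups, so trie
// navigation revisits the same warm nodes; random UUIDs share almost no
// prefix, so every seekExact walks a cold path from the trie root.
public class AccessPatternSketch {
    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length()), i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) i++;
        return i;
    }

    public static void main(String[] args) {
        // Sequential ids: long shared prefix between consecutive keys.
        System.out.println(commonPrefix("doc-0000123", "doc-0000124")); // 10
        // Random UUIDs: first hex chars match with probability 1/16, so
        // the shared prefix is almost always 0-1 characters.
        String u1 = UUID.randomUUID().toString();
        String u2 = UUID.randomUUID().toString();
        System.out.println(commonPrefix(u1, u2));
    }
}
```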

How to Reproduce

  1. Create an index with many small segments (refresh_interval=1s,
    continuous ingestion)
  2. Bulk index documents with explicit _id (UUIDs)
  3. Profile write threads with JFR
  4. Compare seekExact time between lucene90 and lucene103 codecs

