Skip to content

Tracking: refactor aggregate spill pipeline for lower OOM risk #18906

@dqhl76

Description

@dqhl76

Summary

This issues tracks the refactor work for aggregate pipeline.

Tracking PRs:

Problems

Final aggregate can OOM:
For now only the partial stage can spill. During final aggregation, large hashtable have no spill path, so memory pressure can escalate to OOM.

Partial aggregate spills too eagerly:
The partial stage checks memory on every processed block and, if pressured, immediately spills the in-memory hash table. When pressure persists, it may spill repeatedly, producing many tiny fragments and causing excessive read/write count.

Proposed changes

  • Recursive spill in final aggregation

Add a spill path to the final stage. When memory pressure is high, final aggregation repartitions and spills its payload recursively until it can fit in memory (bounded by a max recursion depth).

  • Batch/stream spill triggering for partial aggregation

Replace per-block “spill immediately” with a batch/stream policy so spilled fragments are reasonably sized. This reduces tiny files and I/O amplification.

  • 🚧 Rate-controlled synchronous spill/restore

Use a blocking (synchronous) spill/restore operator to bound concurrency and smooth throughput. This replaces the current max_aggregate_restore_worker setting and provides clearer backpressure.

Limitation

Potential performance regression when no spill is needed: Supporting recursive spill adds repartition capability that introduces overhead even on clean (no-spill) paths. Will try to reduce the overhead..

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions