refactor: refine experimental final aggregate spill #18907

dqhl76 · 2025-10-31T06:29:42Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Replace the final aggregate spiller with a sync writer from SpillsBufferPool
Refactor the experimental final aggregate processor to split local/shared state, reuse hash tables, and bound recursive spilling via max_aggregate_spill_level
Rename the feature flag enable_experiment_aggregate_final to enable_experiment_aggregate and add the max_aggregate_spill_level setting
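
To illustrate the local/shared state split mentioned above, here is a minimal sketch and not the PR's actual code. The names `SharedState`, `LocalState`, and `run_workers` are hypothetical, and the real structs (RoundPhase, LocalRoundState, FinalAggregateSharedState) may be shaped quite differently. The idea shown is that each worker fills a private buffer without locking and touches the shared, mutex-guarded state only once, at hand-off.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared state: one instance for all workers, guarded by a mutex.
#[derive(Default)]
pub struct SharedState {
    pending: Vec<Vec<u64>>,
}

/// Hypothetical local state: owned by a single worker, so no locking is needed.
pub struct LocalState {
    buffer: Vec<u64>,
}

/// Spawn `workers` threads, let each fill its local buffer, then hand the
/// buffer over to the shared state. Returns the total number of rows handed over.
pub fn run_workers(workers: usize, rows_per_worker: u64) -> usize {
    let shared = Arc::new(Mutex::new(SharedState::default()));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut local = LocalState { buffer: Vec::new() };
                // Hot path: only local state is touched here.
                for i in 0..rows_per_worker {
                    local.buffer.push(i);
                }
                // Hand-off: the lock is taken once per worker.
                shared.lock().unwrap().pending.push(local.buffer);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let guard = shared.lock().unwrap();
    let total: usize = guard.pending.iter().map(Vec::len).sum();
    total
}
```

Keeping the per-worker data out of the mutex means contention is limited to the hand-off points, which is one plausible motivation for such a split.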

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):

Copilot

Pull Request Overview

This PR refactors the aggregate spill implementation by renaming settings and introducing a new spiller architecture with multi-level spill support. The main changes consolidate spill logic, add a maximum recursion depth setting for aggregate spills, and improve memory management during aggregation.

Renamed enable_experiment_aggregate_final to enable_experiment_aggregate and changed its default from 0 to 1
Added max_aggregate_spill_level setting to limit recursion depth during aggregate spilling
Replaced FinalAggregateSpiller with NewAggregateSpiller using parquet-based spilling
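
As a rough illustration of why a recursion limit matters for spilling, the sketch below sums values per key and repartitions the input whenever it exceeds an in-memory row budget, but only while the current level is below a maximum. Everything here is an assumption made for the example: the function `aggregate_bounded`, the `FANOUT` constant, the row-count budget, and the in-memory `Vec` buckets standing in for spill files. It does not reflect how the real max_aggregate_spill_level setting is interpreted, including what a value of 0 means.

```rust
use std::collections::HashMap;

/// Sum `value` per `key`. If the input is larger than `max_rows_in_memory`
/// and the spill level is still below `max_spill_level`, repartition the rows
/// (a stand-in for spilling) and recurse one level deeper.
pub fn aggregate_bounded(
    rows: Vec<(u64, u64)>,
    level: u32,
    max_spill_level: u32,
    max_rows_in_memory: usize,
) -> HashMap<u64, u64> {
    // Hypothetical number of sub-partitions per level.
    const FANOUT: u64 = 4;

    if rows.len() > max_rows_in_memory && level < max_spill_level {
        // Use a different "digit" of the key at each level so that rows
        // landing in the same bucket can be separated further on the next level.
        let divisor = FANOUT.saturating_pow(level);
        let mut buckets: Vec<Vec<(u64, u64)>> = vec![Vec::new(); FANOUT as usize];
        for (key, value) in rows {
            let idx = ((key / divisor) % FANOUT) as usize;
            buckets[idx].push((key, value));
        }
        let mut result = HashMap::new();
        for bucket in buckets {
            // A key always maps to exactly one bucket, so the partial results are disjoint.
            result.extend(aggregate_bounded(
                bucket,
                level + 1,
                max_spill_level,
                max_rows_in_memory,
            ));
        }
        return result;
    }

    // Either the input fits the budget or the level limit was reached:
    // aggregate in memory.
    let mut table = HashMap::new();
    for (key, value) in rows {
        *table.entry(key).or_insert(0) += value;
    }
    table
}
```

With heavily skewed input (for example, every row sharing one key), repartitioning never shrinks the bucket, so without the level bound the recursion would not terminate. With the bound, the function falls back to in-memory aggregation once the limit is hit.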

Reviewed Changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 4 comments.

Show a summary per file

File	Description
settings_getter_setter.rs	Renamed getter method to match new setting name and added getter for max spill level
settings_default.rs	Renamed setting, changed default to 1, and added max_aggregate_spill_level setting
new_transform_final_aggregate.rs	Major refactor to extract state management, improve spilling logic, and add multi-level spill support
new_final_aggregate_state.rs	New file containing extracted state management structs (RoundPhase, LocalRoundState, FinalAggregateSharedState)
new_aggregate_spiller.rs	New file implementing parquet-based spiller replacing legacy serialization approach
final_aggregate_spiller.rs	Deleted file - legacy spiller implementation removed
aggregate_meta.rs	Added NewSpilledPayload struct and NewBucketSpilled enum variant
transform_partition_bucket*.rs	Added unreachable handlers for NewBucketSpilled variant
payload/partitioned_payload/aggregate_hashtable	Added reset_for_reuse methods to support hashtable reuse
build_partition_bucket.rs	Updated to use new spiller and pass max_aggregate_spill_level parameter
physical_aggregate_final.rs	Updated to use renamed setting
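
The table mentions reset_for_reuse methods for hashtable reuse, and a later commit in this PR releases the hashtable instead of resetting it under high memory pressure. The sketch below shows the general trade-off using the standard library's `HashMap`. The `ReusableTable` type and its methods are hypothetical and are not Databend's payload or aggregate hashtable API.

```rust
use std::collections::HashMap;

/// Hypothetical aggregation table wrapping a std HashMap.
pub struct ReusableTable {
    groups: HashMap<u64, u64>,
}

impl ReusableTable {
    pub fn new() -> Self {
        Self { groups: HashMap::new() }
    }

    /// Add `value` to the running sum for `key`.
    pub fn add(&mut self, key: u64, value: u64) {
        *self.groups.entry(key).or_insert(0) += value;
    }

    pub fn len(&self) -> usize {
        self.groups.len()
    }

    pub fn capacity(&self) -> usize {
        self.groups.capacity()
    }

    /// Drop all entries but keep the allocation, so the next round
    /// can insert without growing the table again.
    pub fn reset_for_reuse(&mut self) {
        self.groups.clear();
    }

    /// Drop entries and allocation, returning memory at the cost of
    /// re-allocating in the next round.
    pub fn release(&mut self) {
        self.groups = HashMap::new();
    }
}
```

Resetting favors throughput when rounds are of similar size, while releasing favors a lower memory footprint, which is presumably why the choice would depend on memory pressure.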

src/query/settings/src/settings_default.rs

...elines/processors/transforms/aggregator/new_final_aggregate/new_transform_final_aggregate.rs

.../src/pipelines/processors/transforms/aggregator/new_final_aggregate/new_aggregate_spiller.rs

github-actions · 2025-11-02T15:49:23Z

Docker Image for PR

tag: pr-18907-07241e7-1762098381

note: this image tag is only available for internal use.

github-actions · 2025-11-03T01:39:44Z

ClickBench Report

github-actions · 2025-11-04T09:10:29Z

Docker Image for PR

tag: pr-18907-af8289a-1762247328

note: this image tag is only available for internal use.

github-actions · 2025-11-04T09:59:07Z

ClickBench Report

github-actions · 2025-11-04T14:41:07Z

Docker Image for PR

tag: pr-18907-39ec537-1762267174

note: this image tag is only available for internal use.

github-actions · 2025-11-04T15:31:28Z

ClickBench Report

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

...rc/pipelines/processors/transforms/aggregator/new_aggregate/new_transform_final_aggregate.rs

github-actions bot added the pr-refactor label (this PR changes the code base without new features or bugfix) Oct 31, 2025

dqhl76 mentioned this pull request Oct 31, 2025

Tracking: refactor aggregate spill pipeline for lower OOM risk #18906

Open

3 tasks

dqhl76 requested a review from Copilot November 2, 2025 09:45

Copilot AI reviewed Nov 2, 2025

dqhl76 added and removed the ci-benchmark label (Benchmark: run all test) Nov 2, 2025

dqhl76 added 10 commits November 4, 2025 14:21

fix: repartition not work

75d4c16

fix: re-enable check spill by setting max spill depth refactor: split local/shared state

write should close

5855caf

fix: try improve performance

7ca23d5

used for CI

88bdd92

fix: result incorrect when cluster+force_spill

fc5575c

fixup

a28f6a3

max_aggregate_spill_level = 0

a28dfd3

refactor: release hashtable instead of resetting when memory pressure is high

0410d7e

make clippy happy

b1dd1da

try fix

c1ef96c

dqhl76 force-pushed the performance-final-aggregate-31 branch from 7ceabe9 to 4587399 Compare November 4, 2025 06:21

revert the previous performance improve with final_aggregate

1ba503a

dqhl76 force-pushed the performance-final-aggregate-31 branch from 4587399 to 1ba503a Compare November 4, 2025 06:35

dqhl76 added and removed the ci-benchmark label (Benchmark: run all test) Nov 4, 2025

clean

e77a85f

refactor: try to improve hits benchmark and clean

b18b21e

dqhl76 added and removed the ci-benchmark label (Benchmark: run all test) Nov 4, 2025

disable

0234267

dqhl76 marked this pull request as ready for review November 5, 2025 01:09

dqhl76 requested a review from zhang2014 November 5, 2025 01:09

chatgpt-codex-connector bot reviewed Nov 5, 2025

...rc/pipelines/processors/transforms/aggregator/new_aggregate/new_transform_final_aggregate.rs (resolved)

zhang2014 approved these changes Nov 5, 2025

zhang2014 merged commit 5b874fa into databendlabs:main Nov 5, 2025
95 of 173 checks passed

dqhl76 commented Oct 31, 2025 • edited by drmingdrmer

2 participants
