Is your feature request related to a problem or challenge?
Currently, GroupedTopKAggregateStream supports two types of query:
select id, max(time) from t group by id order by max(time) desc limit 10
select id, min(time) from t group by id order by min(time) asc limit 10
We have another use case that I think could also use GroupedTopKAggregateStream to speed up execution, such as the queries below:
select distinct id from t order by id desc/asc limit 10
select id from t group by id order by id desc/asc limit 10
This works because if a certain id = x is in the global top 10 (the second phase of AggregateExec), then x must appear in the local top 10 (the first phase of AggregateExec) of at least one partition.
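To illustrate the argument, here is a minimal Python sketch of the two-phase idea. It is not DataFusion code: the function names (`local_top_k`, `global_top_k`, `reference_top_k`) and the sample partitions are hypothetical, and it only models distinct ids with a limit of k.

```python
import heapq


def local_top_k(partition, k, descending=True):
    """First phase (hypothetical): keep only the top k distinct ids of one partition."""
    pick = heapq.nlargest if descending else heapq.nsmallest
    return pick(k, set(partition))


def global_top_k(partitions, k, descending=True):
    """Second phase (hypothetical): merge the local results and take the top k again."""
    merged = set()
    for partition in partitions:
        merged.update(local_top_k(partition, k, descending))
    pick = heapq.nlargest if descending else heapq.nsmallest
    return pick(k, merged)


def reference_top_k(partitions, k, descending=True):
    """Baseline: deduplicate every id first, then sort and apply the limit."""
    all_ids = {x for partition in partitions for x in partition}
    return sorted(all_ids, reverse=descending)[:k]


# Made-up sample data: three partitions with overlapping ids.
partitions = [[5, 1, 9, 9, 3], [7, 2, 9, 8], [4, 4, 6, 10]]

desc_result = global_top_k(partitions, 3, descending=True)
asc_result = global_top_k(partitions, 3, descending=False)
```

Each partition emits at most k ids in the first phase, yet the merged result matches the baseline that deduplicates everything before applying the limit.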
Describe the solution you'd like
- Modify the TopKAggregation optimizer rule to pass the information to AggregateExec.
- Modify GroupedTopKAggregateStream to support this case.
Describe alternatives you've considered
No response
Additional context
No response