More efficient window implementation #217

nils-braun · 2021-08-18T12:00:26Z

The current OVER implementation works as follows:
If there are OVER calls in a SELECT (aka a projection), they are handled one after the other: first the current index is stored, then the data is partitioned and sorted, then grouped and the window applied, and finally the original sorting/partitioning is restored. The reason is that if we mix OVER and non-OVER calls in a single projection, they will not fit together if they have different partitioning/sorting/index.
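To illustrate the store/sort/apply/restore cycle described above, here is a minimal sketch in plain pandas. It is not the actual dask-sql code: the helper name `over_with_restore`, the temporary column `_original_position`, and the choice of a running sum as the window function are all assumptions made for the example.

```python
import pandas as pd


def over_with_restore(df, partition_by, order_by, column, output):
    """Hypothetical helper: evaluate one OVER call and restore the row order."""
    # 1. Remember the current row order in a temporary column.
    tmp = df.assign(_original_position=range(len(df)))
    # 2. Sort so that rows are ordered inside each partition.
    tmp = tmp.sort_values([partition_by, order_by], kind="stable")
    # 3. Group and apply the window (here: a running sum per partition).
    tmp[output] = tmp.groupby(partition_by)[column].cumsum()
    # 4. Restore the original order so the new column lines up
    #    with the non-OVER columns of the same projection.
    tmp = tmp.sort_values("_original_position")
    return tmp.drop(columns="_original_position")


df = pd.DataFrame(
    {"g": ["b", "a", "b", "a"], "t": [2, 1, 1, 2], "x": [10, 20, 30, 40]}
)
result = over_with_restore(df, "g", "t", "x", "running_sum")
```

Steps 1 and 4 are pure overhead, and in this scheme they are repeated for every OVER call in the projection.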
This PR introduces a much simpler implementation: using one of the optimization rules in Calcite, we split up projections into projections without OVER and ones that only contain window operations (which are now called LogicalWindow). This allows us to group and shuffle without the need to restore any ordering or partitioning, as the window is now treated independently.
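The split can be pictured as two independent steps. The following is a rough pandas sketch under stated assumptions, not the plugin code from this PR: `window_step` and `project_step` are hypothetical stand-ins for a window-only node and a projection without OVER, and the aggregation spec format is invented for the example.

```python
import pandas as pd


def window_step(df, partition_by, order_by, aggregations):
    """Hypothetical stand-in for a window-only plan node.

    `aggregations` maps an output name to (input column, pandas function name).
    """
    # Sort once; the order is NOT restored afterwards.
    out = df.sort_values([partition_by, order_by], kind="stable")
    grouped = out.groupby(partition_by)
    # Compute all window aggregations in a single pass over the groups.
    new_columns = {
        output: grouped[column].transform(func)
        for output, (column, func) in aggregations.items()
    }
    return out.assign(**new_columns)


def project_step(df, expressions):
    """Hypothetical stand-in for a projection that contains no OVER."""
    return df.assign(**expressions)


df = pd.DataFrame(
    {"g": ["b", "a", "b", "a"], "t": [2, 1, 1, 2], "x": [10, 20, 30, 40]}
)
windowed = window_step(
    df, "g", "t", {"running_sum": ("x", "cumsum"), "partition_max": ("x", "max")}
)
final = project_step(windowed, {"x_doubled": lambda d: d["x"] * 2})
```

Because the window step produces complete rows, the later projection works on whatever order the window left behind, and no bookkeeping column is needed.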

I have not done any benchmarking yet (will do later), but I expect this to be much faster for larger data sets and many OVER operations.

nils-braun added 7 commits August 10, 2021 22:42

Add more optimization rules, for example one that splits OVER from other projects

24cbbc7

Implemented a more optimized window handler plugin

bb24593

Huge refactoring to integrate multiple window aggregations into one pass - needs refinement still

b808829

Make sure to keep output names even after optimization

328c568

Refine the new implementation with docu and type annotations

d497566

Merge branch 'main' into feature/more-efficient-window-implementation

f0491ce

Make sure the mapped function is pickleable without dask

cf16194

nils-braun merged commit 7b60e4a into main Aug 18, 2021

nils-braun deleted the feature/more-efficient-window-implementation branch August 18, 2021 21:25
