More efficient window implementation #217
Merged
The current `OVER` implementation works as follows: if there are `OVER` calls in a `SELECT` (aka a projection), they are handled one after the other by first storing the current index, then partitioning and sorting, then grouping and applying the window, and finally restoring the original index/partitioning/sorting. The reason for this is that if we mix `OVER` and non-`OVER` calls in a single projection, they will not fit together if they have different partitioning/sorting/index.

This PR introduces a much simpler implementation: using one of the optimization rules in Calcite, we split projections up into projections without `OVER` and ones that only contain window operations (which are now called `LogicalWindow`). This allows us to group and shuffle without the need to restore any ordering or partitioning, as the window is now treated independently.

I have not done any benchmarking so far (will do later), but I would expect this to be much faster for larger data sets and many `OVER` operations.
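
To illustrate the idea of treating the window as its own step, here is a minimal, library-free sketch in plain Python. It is not the code in this PR and does not use Calcite or dask. The helper names `project` and `window_running_sum`, the sample rows, and the choice of a running sum as the window function are all assumptions made for illustration. The window step partitions and sorts once and appends its result as a new column; the plain projection then only selects columns, so nothing has to be restored afterwards.

```python
from itertools import groupby
from operator import itemgetter


def project(rows, columns):
    """Plain projection without OVER: keep only the requested columns."""
    return [{c: row[c] for c in columns} for row in rows]


def window_running_sum(rows, partition_by, order_by, value, out):
    """Window-only step (hypothetical helper).

    Roughly SUM(value) OVER (PARTITION BY partition_by ORDER BY order_by),
    appended to every row under the column name `out`.
    """
    # Sort once by partition key and order key; groupby needs sorted input.
    ordered = sorted(rows, key=itemgetter(partition_by, order_by))
    result = []
    for _, group in groupby(ordered, key=itemgetter(partition_by)):
        total = 0
        for row in group:
            total += row[value]
            # The window result travels with the row, so the original
            # ordering never has to be restored.
            result.append({**row, out: total})
    return result


rows = [
    {"id": 1, "g": "a", "t": 2, "x": 10},
    {"id": 2, "g": "b", "t": 1, "x": 5},
    {"id": 3, "g": "a", "t": 1, "x": 1},
]

# Step 1: window-only operation. Step 2: projection without OVER.
windowed = window_running_sum(rows, "g", "t", "x", "running_x")
final = project(windowed, ["id", "running_x"])
running_by_id = {row["id"]: row["running_x"] for row in final}
```

Because each row carries its own window result, the rows may come out in a different order than they went in, which is fine for SQL semantics unless an outer `ORDER BY` is present.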