[SPARK-34079][SQL] Merge non-correlated scalar subqueries #32298
Closed
peter-toth wants to merge 83 commits into apache:master from peter-toth:SPARK-34079-multi-column-scalar-subquery
e0e39d5 [SPARK-34079][SQL] Merging non-correlated scalar subqueries to multi-… (peter-toth)
0a7e0e2 no need for the whole plan traversal in this PR (peter-toth)
e35cdc1 Merge commit 'fdccd88c2a6dd18c9d446b63fccd5c6188ea125c' into SPARK-34… (peter-toth)
0cff7b2 add MULTI_SCALAR_SUBQUERY pattern (peter-toth)
22e833d Merge commit '9af338cd685bce26abbc2dd4d077bde5068157b1' into SPARK-34… (peter-toth)
42add09 Merge commit '132cbf0c8c1a382f33d8d212f931f5956f85a2f9' into SPARK-34… (peter-toth)
e63111d add some tests, add more docs (peter-toth)
c84f0ee Merge commit '2634dbac35c5e8d5216b38fd4256f5fd059f341f' into SPARK-34… (peter-toth)
ee8f12a fix test (peter-toth)
17fd666 rename mergePlans() to tryMergePlans() (peter-toth)
6134fa9 add test to cover aggregate and group expression merge (peter-toth)
2828345 do not merge different aggregate implementations and add test (peter-toth)
1f2f75c drop MultiScalarSubquery, use ScalarSubquery(CreateStruct()) instead (peter-toth)
a3e84a4 refactor, add support for filter and join, add new tests, add… (peter-toth)
0fe66dc minor fixes (peter-toth)
100cb9c extract common scalar subqueries (peter-toth)
9d8dd6b Merge commit 'cd2ef9cb43f4e27302906ac2f605df4b66a72a2f' into SPARK-34… (peter-toth)
f83f22b add and update docs (peter-toth)
41c0f0a Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
d10a8be Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
db34640 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
d081885 Clean up code, add more comments (peter-toth)
e98754a temp (peter-toth)
bb623cf Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
ae1d84e fix messages (peter-toth)
2eb14f1 regenerate expected plans (peter-toth)
060e4b7 add more comments (peter-toth)
d86d2c4 use Header alias type (peter-toth)
0a97c8b Merge commit '6e8a4626117f0cb5535875f7181f56350ad4f195' into SPARK-34… (peter-toth)
61f2b34 Merge commit '8ae88d01b46d581367d0047b50fcfb65078ab972' into SPARK-34… (peter-toth)
532d05e remove dependency on `spark.sql.execution.reuseSubquery`, remove unnec… (peter-toth)
c488377 refactor (attilapiros)
e0a7610 accept Attila's suggestion but keep the `merged` flag, minor name cha… (peter-toth)
63c3709 fix review findings (peter-toth)
dabbea4 add negative test case to general join matching where only a non-chil… (peter-toth)
4d97de5 amend a test to cover extra projects on both sides (peter-toth)
cc8690e improve generic node merging (peter-toth)
83c78ca minor fix (peter-toth)
3130913 fix merging logic if merging a plan into a merged plan (peter-toth)
3e8f7fa remove unused mapAttributes() (peter-toth)
252c9b1 do not merge nondeterministic plans (peter-toth)
fa5e786 fix test and check adaptive path as well (peter-toth)
e292732 check test results (peter-toth)
963c423 check for same instance of subqueries (peter-toth)
9efaf2a use the new `isCorrelated()` (peter-toth)
96a502d move deterministic check as early as possible (peter-toth)
5b91d61 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
8bcf515 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
87ba289 fix comments (peter-toth)
6d5a124 rephrase general node merging (peter-toth)
851ca29 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
96d0cab use CTE nodes (peter-toth)
a57ed32 no need for extra shuffle with subquery `CTERelationDef`s (peter-toth)
0b34d83 regenerate expected plan stability output (peter-toth)
4985d43 remove obsolete assert (peter-toth)
de9b312 fix LogicalPlanTagInSparkPlanSuite, for logical scan plan trees consi… (peter-toth)
13a2fad fix row-level runtime filtering as after subquery merging bloom filte… (peter-toth)
92ce6e5 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
67ffae6 fix header scaladoc (peter-toth)
a32a85c rename subquery flag to mergedScalarSubquery, fix CTERelationRef scal… (peter-toth)
1bc8a45 fix test name (peter-toth)
224edef add new testcase "Merge non-correlated scalar subqueries in a subquery" (peter-toth)
a7fd1c5 add test "Merge non-correlated scalar subqueries with conflicting names" (peter-toth)
a5eb5df add test "Merging subqueries from different places" (peter-toth)
8457148 add test "Do not merge subqueries with different join conditions", fi… (peter-toth)
1ff64e4 add test "Do not merge subqueries with different filter conditions" (peter-toth)
13a1cdb simplify do not merge test cases (peter-toth)
4da3fe6 drop general node merging code path (peter-toth)
96ed6fd use canonicalized form in Filter and Join condition comparison (peter-toth)
dbe81e2 simplify aggregate check (peter-toth)
ba299d5 fix aggregate grouping compare (peter-toth)
65f3425 simplify header (peter-toth)
dc5e9b9 Merge branch 'master' into SPARK-34079-multi-column-scalar-subquery (peter-toth)
c64373b rebase on top of https://github.com/apache/spark/pull/34929 (peter-toth)
3993eab revert regenerated q5 expected output (peter-toth)
f93283d fix nested subqueries, add test (peter-toth)
8c5c9ac fix comment (peter-toth)
3b7ad2c rename method (peter-toth)
c268580 fix test name (peter-toth)
169fd6b fix removeReferences (peter-toth)
19128ff rename merged subquery flag in cte def (peter-toth)
1c4d14b simplify removeReferences, fix tests (peter-toth)
2590edf fix scala 2.13 (peter-toth)
389 changes: 389 additions & 0 deletions
...talyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/MergeScalarSubqueries.scala
This is because with this PR some bloom filter aggregate subqueries can be merged. E.g.
=>
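To make the idea of merging aggregate subqueries concrete, here is a minimal toy sketch. It is not Spark's Catalyst API and not the plans from this PR: `Plan`, `Relation`, `Filter`, `Aggregate` and `tryMerge` are hypothetical names made up for illustration, and expressions are plain strings. It assumes two subqueries can be merged only when everything below the aggregate is identical, which loosely mirrors the "do not merge subqueries with different filter conditions" tests named in the commit list.

```scala
// Toy plan model. These types are hypothetical and are NOT Catalyst classes.
sealed trait Plan
final case class Relation(name: String) extends Plan
final case class Filter(condition: String, child: Plan) extends Plan
final case class Aggregate(aggExprs: Vector[String], child: Plan) extends Plan

object MergeSketch {
  /** Try to merge `newPlan` into `cached`.
    *
    * On success, returns the merged plan together with, for each output
    * column of `newPlan`, its index in the merged plan's output.
    * Returns None when the two plans differ below the aggregate.
    */
  def tryMerge(newPlan: Plan, cached: Plan): Option[(Plan, Vector[Int])] =
    (newPlan, cached) match {
      case (Aggregate(newExprs, newChild), Aggregate(cachedExprs, cachedChild))
          if newChild == cachedChild =>
        // Keep the cached expressions first and append only the new ones.
        val merged = (cachedExprs ++ newExprs).distinct
        val indices = newExprs.map(e => merged.indexOf(e))
        Some((Aggregate(merged, cachedChild), indices))
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    val filtered = Filter("a > 0", Relation("t"))
    val q1 = Aggregate(Vector("max(a)"), filtered)
    val q2 = Aggregate(Vector("min(a)", "max(a)"), filtered)

    // Same child: the two subqueries collapse into one aggregate.
    println(tryMerge(q2, q1))

    // Different child (no filter): the subqueries are left alone.
    println(tryMerge(q2, Aggregate(Vector("max(a)"), Relation("t"))))
  }
}
```

In this model each original subquery would then read its value from the merged output by index, which is roughly the role the commit list assigns to `ScalarSubquery(CreateStruct())`; the real rule also has to handle attribute remapping, nondeterministic plans and aggregate implementation differences, none of which is modelled here.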