-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-40599][SQL] Add multiTransform methods to TreeNode to generate alternatives #38034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
52546d0
b99f2f9
8de8f88
afcef4e
60323c8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -618,6 +618,212 @@ abstract class TreeNode[BaseType <: TreeNode[BaseType]] extends Product with Tre | |||||
| } | ||||||
| } | ||||||
|
|
||||||
| /** | ||||||
| * Returns alternative copies of this node where `rule` has been recursively applied to the tree. | ||||||
| * | ||||||
| * Users should not expect a specific directionality. If a specific directionality is needed, | ||||||
| * multiTransformDownWithPruning or multiTransformUpWithPruning should be used. | ||||||
| * | ||||||
| * @param rule a function used to generate transformed alternatives for a node | ||||||
|
||||||
| * @param rule a function used to generate transformed alternatives for a node | |
| * @param rule a function used to generate alternatives for a node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| * when the rule returns large number of alternatives, this function returns the alternatives as a | |
| * when the rule returns many alternatives for many nodes, this function returns the alternatives as a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's also mention, if the rule applies but you want the not-apply behavior, you can just return Seq(originalNode)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
peter-toth marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worried about making multiTransform too complicated. Given the only use case for now is projecting expressions such as output partitions/orderings, can we simplify the rule a little bit? My preference is to remove this autoContinue flag and fully rely on the callers.
It is not always easy to determine if we will do any child expression mapping
If we have a real SQL use case, I'm happy to change my mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, indeed. And I don't have any other use case either, so I will remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, removed.
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if my understanding is correct, the only way to mark this rule as ineffective is that it returns stream with one alternative which is eq to this ? then why we use map here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, thanks, this looks wrong. I will fix it with other requests today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest commit fixes this, although there is relevant issue that I'm not sure we can mark the rule ineffective if the one element original stream is returned:
https://github.com/apache/spark/pull/38034/files#diff-94575875fbf007fdaf43e4946c69c18649294fed974a46816ab1986f6350541bR691-R699
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's OK. We can mark the rule as ineffective only if the partial function does not apply. Once it applies, no matter what it returns, it's effective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the comment needs an update. We should also explain why only the down direction is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've modified the scala docs, let me know if we need more defails or fixes.