Fix Duplicated filters within (filter(TableScan)) plan for unparser #13422
* Eliminate duplicated filter within (filter(TableScan)) plan
* Updates
* fix
* add test
* fix
alamb
left a comment
Thank you @Sevenannn -- this is a great idea. Thank you @jayzhan211 for the review
Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look.
jayzhan211
left a comment
👍🏻
Thanks @Sevenannn @alamb
Thanks @jayzhan211 @alamb
Thank you for all the bug fixes @Sevenannn -- it turns out that @wiedld just found you had fixed a bug we ran into in InfluxDB as well (#12979) 🙏
Which issue does this PR close?
N/A
Rationale for this change
When the unparser rewrites a plan that has an aggregate over a join whose left or right input is a Filter on top of a TableScan containing the same filter, that filter is emitted twice.
For query
The logical plan is
The rewritten query will be:
SELECT customer.c_custkey, count(orders.o_orderkey) FROM customer LEFT JOIN orders ON ((customer.c_custkey = orders.o_custkey) AND (orders.o_comment NOT LIKE '%special%requests%' AND orders.o_comment NOT LIKE '%special%requests%')) GROUP BY customer.c_custkey
Under the current approach, the filter orders.o_comment NOT LIKE Utf8("%special%requests%") occurs twice in the final query. This does not affect the correctness of the query result, but the duplicated condition adds performance overhead.
What changes are included in this PR?
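The idea of dropping a repeated conjunct can be illustrated with a minimal, standalone sketch. It is not the code in this PR and does not use any DataFusion API: `dedup_conjuncts` and `render_conjunction` are hypothetical helpers, and filters are modelled as plain strings rather than expressions, purely as an assumption for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical helper (not a DataFusion API): keep the first occurrence of
/// each conjunct and preserve the original order.
fn dedup_conjuncts<'a>(conjuncts: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    conjuncts
        .iter()
        .copied()
        // `insert` returns false when the conjunct was already seen.
        .filter(|c| seen.insert(*c))
        .collect()
}

/// Hypothetical helper: AND the unique conjuncts back together.
fn render_conjunction(conjuncts: &[&str]) -> String {
    dedup_conjuncts(conjuncts).join(" AND ")
}
```

Passing the join key condition followed by the `NOT LIKE` filter twice to `render_conjunction` would yield a condition in which the filter appears only once.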
Are these changes tested?
Yes
Are there any user-facing changes?
No