[SPARK-33540][SQL] Subexpression elimination for interpreted predicate #30497

viirya · 2020-11-25T09:14:00Z

What changes were proposed in this pull request?

This patch adds support for subexpression elimination in interpreted predicate.

Why are the changes needed?

As with interpreted projection, there are cases where the codegen predicate cannot be used, e.g. the schema is too complex or an expression does not support codegen. When the same expressions (subexpressions) occur repeatedly within the predicate expressions, performance is poor because each occurrence is re-computed. Interpreted predicate should support subexpression elimination, as interpreted projection already does.
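To make the motivation concrete, here is a minimal, self-contained sketch of the idea in Python. It is not Spark's implementation: the `Col`, `Lit`, `Op`, and `Predicate` names are hypothetical toy classes invented for illustration. The sketch assumes expressions are immutable trees compared structurally, finds operator nodes that occur more than once, and caches their results for the duration of a single row evaluation.

```python
import operator
from collections import Counter
from dataclasses import dataclass

# Hypothetical toy expression nodes (not Spark classes). Frozen dataclasses
# are hashable and compared structurally, so equal subtrees share a cache key.
@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class Op:
    name: str      # key into OPS
    left: object
    right: object

# "and" is evaluated eagerly here (no short-circuiting) to keep the toy simple.
OPS = {
    "+": operator.add,
    "*": operator.mul,
    ">": operator.gt,
    "<": operator.lt,
    "and": lambda a, b: a and b,
}

def count_ops(expr, counts):
    """Count how often each structurally equal Op node occurs in the tree."""
    if isinstance(expr, Op):
        counts[expr] += 1
        count_ops(expr.left, counts)
        count_ops(expr.right, counts)
    return counts

class Predicate:
    """Interpreted predicate with optional per-row subexpression caching."""

    def __init__(self, expr, eliminate_subexpressions):
        self.expr = expr
        counts = count_ops(expr, Counter())
        # Only nodes that occur more than once are worth caching.
        self.shared = (
            {e for e, n in counts.items() if n > 1}
            if eliminate_subexpressions else set()
        )
        self.op_evaluations = 0  # instrumentation for the comparison below

    def eval(self, row):
        cache = {}  # results are valid for one row only
        return bool(self._eval(self.expr, row, cache))

    def _eval(self, expr, row, cache):
        if isinstance(expr, Col):
            return row[expr.name]
        if isinstance(expr, Lit):
            return expr.value
        if expr in self.shared and expr in cache:
            return cache[expr]
        self.op_evaluations += 1
        result = OPS[expr.name](
            self._eval(expr.left, row, cache),
            self._eval(expr.right, row, cache),
        )
        if expr in self.shared:
            cache[expr] = result
        return result

# (a + b) * 2 appears on both sides of the conjunction.
heavy = Op("*", Op("+", Col("a"), Col("b")), Lit(2))
predicate = Op("and", Op(">", heavy, Lit(10)), Op("<", heavy, Lit(100)))

row = {"a": 3, "b": 4}
plain = Predicate(predicate, eliminate_subexpressions=False)
cached = Predicate(predicate, eliminate_subexpressions=True)
print(plain.eval(row), plain.op_evaluations)
print(cached.eval(row), cached.op_evaluations)
```

Both variants return the same result for a row, but the cached variant evaluates the repeated `(a + b) * 2` subtree only once per row, so its operator-evaluation count is lower. The cache is rebuilt for every row because cached values depend on the input row.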

Does this PR introduce any user-facing change?

No, this doesn't change user behavior.

How was this patch tested?

Unit test and benchmark.

SparkQA · 2020-11-25T13:00:08Z

Test build #131763 has finished for PR 30497 at commit 56c09ca.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

HyukjinKwon · 2020-11-25T13:15:24Z

retest this please

SparkQA · 2020-11-25T16:50:31Z

Test build #131776 has finished for PR 30497 at commit 56c09ca.

This patch fails PySpark unit tests.
This patch merges cleanly.
This patch adds no public classes.

dongjoon-hyun

+1, LGTM. Thank you, @viirya, @HyukjinKwon, @maropu.
Merged to master for Apache Spark 3.1.

viirya · 2020-11-25T17:51:02Z

Thanks @dongjoon-hyun @HyukjinKwon @maropu

Extend subexpression elimination to predicate.

56c09ca

viirya force-pushed the SPARK-33540 branch from 6941283 to 56c09ca Compare November 25, 2020 09:14

github-actions bot added the SQL label Nov 25, 2020

HyukjinKwon approved these changes Nov 25, 2020

maropu approved these changes Nov 25, 2020

dongjoon-hyun approved these changes Nov 25, 2020

dongjoon-hyun closed this in 9643eab Nov 25, 2020

viirya deleted the SPARK-33540 branch December 27, 2023 18:24
