@viirya
Member

@viirya viirya commented Nov 25, 2020

What changes were proposed in this pull request?

This patch adds subexpression elimination support to the interpreted predicate.

Why are the changes needed?

As with interpreted projection, there are cases where the codegen predicate cannot be used, e.g. the schema is too complex or the predicate contains an expression that does not support codegen. When the same expressions (subexpressions) occur repeatedly within the predicate expressions, performance is poor because each occurrence is re-computed. The interpreted predicate should support subexpression elimination, as interpreted projection already does.
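
To make the cost concrete, here is a minimal, language-agnostic sketch in Python of the idea. It is not Spark's implementation: the `Col`, `Lit`, `BinOp`, and `Evaluator` names are hypothetical, and the per-row dictionary cache is just one simple way to avoid re-computing a repeated subexpression during interpreted evaluation.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical expression nodes. Frozen dataclasses are hashable and compare
# structurally, so two separately built copies of `a * b + 1` are equal.
@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Lit:
    value: int

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Col, Lit, BinOp]

# Note: "and" is evaluated eagerly here (no short-circuit) to keep counts simple.
OPS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "and": lambda a, b: a and b,
}

class Evaluator:
    """Interpreted evaluator that optionally caches BinOp results per row."""

    def __init__(self, use_cache: bool):
        self.use_cache = use_cache
        self.binop_evals = 0  # how many BinOp nodes were actually computed

    def eval_row(self, expr, row):
        # The cache lives for one row only; results depend on the row's values.
        cache = {} if self.use_cache else None
        return self._eval(expr, row, cache)

    def _eval(self, expr, row, cache):
        if isinstance(expr, Col):
            return row[expr.name]
        if isinstance(expr, Lit):
            return expr.value
        if cache is not None and expr in cache:
            return cache[expr]  # repeated subexpression: reuse the result
        self.binop_evals += 1
        left = self._eval(expr.left, row, cache)
        right = self._eval(expr.right, row, cache)
        result = OPS[expr.op](left, right)
        if cache is not None:
            cache[expr] = result
        return result

def shared():
    # Built fresh on each call to show that matching is structural.
    return BinOp("+", BinOp("*", Col("a"), Col("b")), Lit(1))

# Predicate: (a * b + 1 > 10) and (a * b + 1 < 100)
predicate = BinOp("and",
                  BinOp(">", shared(), Lit(10)),
                  BinOp("<", shared(), Lit(100)))
row = {"a": 3, "b": 4}

naive = Evaluator(use_cache=False)
cached = Evaluator(use_cache=True)
naive_result = naive.eval_row(predicate, row)
cached_result = cached.eval_row(predicate, row)
print(naive.binop_evals, cached.binop_evals)  # prints "7 5"
```

Both evaluators return the same result; the cached one computes `a * b + 1` once per row instead of once per occurrence. The saving grows with the cost of the shared subexpression and the number of times it appears.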

Does this PR introduce any user-facing change?

No, this doesn't change user behavior.

How was this patch tested?

Unit tests and a benchmark.

@SparkQA

SparkQA commented Nov 25, 2020

Test build #131763 has finished for PR 30497 at commit 56c09ca.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

retest this please

@SparkQA

SparkQA commented Nov 25, 2020

Test build #131776 has finished for PR 30497 at commit 56c09ca.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @viirya, @HyukjinKwon, @maropu.
Merged to master for Apache Spark 3.1.

@viirya
Member Author

viirya commented Nov 25, 2020

Thanks @dongjoon-hyun @HyukjinKwon @maropu

@viirya viirya deleted the SPARK-33540 branch December 27, 2023 18:24