@ayushi-agarwal
Contributor

What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-31869 added an improvement that can reduce the number of exchanges in a plan. Because output partitioning is overridden in Gluten, the test was failing: an extra exchange was showing up in the plan.
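
To illustrate why expanding the output partitioning can remove an exchange, here is a minimal, self-contained sketch. It does not use Spark or Gluten classes: `Partitioning`, `HashPartitioning`, `PartitioningCollection`, `expand`, `satisfies`, and the `limit` parameter are all simplified, hypothetical stand-ins, and column names are plain strings. The assumption is an inner-like equi-join, where each streamed key equals its build-side key on every output row, so a hash partitioning on the streamed keys is equally valid when expressed on the build keys.

```scala
object ExpandPartitioningSketch {
  // Simplified stand-ins for partitioning classes (hypothetical, not the Spark API).
  sealed trait Partitioning
  final case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
  final case class PartitioningCollection(partitionings: Seq[Partitioning]) extends Partitioning
  case object UnknownPartitioning extends Partitioning

  // Expand a hash partitioning on streamed keys into every equivalent variant
  // obtained by swapping a streamed key for its matching build key.
  // `limit` caps the number of generated variants (an assumed safeguard).
  def expand(
      p: Partitioning,
      streamedKeys: Seq[String],
      buildKeys: Seq[String],
      limit: Int = 8): Partitioning = {
    // Map each streamed key to the build keys it is joined with.
    val streamToBuild: Map[String, Seq[String]] =
      streamedKeys.zip(buildKeys).groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2) }

    p match {
      case HashPartitioning(keys, n) if keys.exists(streamToBuild.contains) =>
        // For each key position, keep the key or use one of its build-side equivalents.
        val alternatives = keys.foldLeft(Seq(Seq.empty[String])) { (acc, key) =>
          val choices = key +: streamToBuild.getOrElse(key, Seq.empty)
          for (prefix <- acc; c <- choices) yield prefix :+ c
        }
        PartitioningCollection(alternatives.distinct.take(limit).map(HashPartitioning(_, n)))
      case other => other
    }
  }

  // Simplified check: a hash partitioning satisfies a clustering requirement
  // when all of its keys are among the required keys.
  def satisfies(p: Partitioning, required: Seq[String]): Boolean = p match {
    case HashPartitioning(keys, _)  => keys.nonEmpty && keys.forall(required.contains)
    case PartitioningCollection(ps) => ps.exists(satisfies(_, required))
    case UnknownPartitioning        => false
  }
}
```

With a streamed side partitioned on `t1.a` and a join on `t1.a = t2.b`, `satisfies(HashPartitioning(Seq("t1.a"), 10), Seq("t2.b"))` is false, so a later operator that needs clustering on `t2.b` would require an exchange. After `expand(..., Seq("t1.a"), Seq("t2.b"))` the collection also contains a partitioning on `t2.b`, and the same check passes without one.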

(Fixes: #3559)

How was this patch tested?

Ran the unit tests locally.

@github-actions

github-actions bot commented Jan 9, 2024

#3559

@github-actions

github-actions bot commented Jan 9, 2024

Run Gluten Clickhouse CI

@ayushi-agarwal
Contributor Author

@JkSelf @rui-mo Please review. Thank you

@github-actions

Run Gluten Clickhouse CI

@ayushi-agarwal ayushi-agarwal marked this pull request as ready for review January 10, 2024 09:32
@ayushi-agarwal
Contributor Author

@JkSelf @zhli1142015 The build is green. Kindly review.

@rui-mo rui-mo requested review from ulysses-you and zzcclp January 11, 2024 06:06
   case BuildLeft =>
     joinType match {
-      case _: InnerLike | RightOuter => right.outputPartitioning
+      case _: InnerLike | RightOuter => expandPartitioning(right.outputPartitioning)

Contributor

HashJoinLikeExecTransformer is also extended by ShuffledHashJoinExecTransformer. Do we need this fix for SHJ?

Contributor Author

Yes, I was looking into that. This change helps there too, so I think we can keep it for SHJ as well.

@ulysses-you
Contributor

It seems the code differs since Spark 3.4. Shall we move this to the shim module?

}

// https://issues.apache.org/jira/browse/SPARK-31869
// ToDo: https://issues.apache.org/jira/browse/SPARK-45882

Contributor Author

@ayushi-agarwal ayushi-agarwal Jan 11, 2024

I have added a TODO here. For https://issues.apache.org/jira/browse/SPARK-45882 we would need to add the change in the shim module. Shall that be taken up in a separate PR?
This PR makes the changes that were added in Spark 3.1 by apache/spark#28676.

Contributor Author

@ulysses-you Shall we move this to the shim module as part of this PR, or in a separate PR?

Contributor

I'd like to resolve this TODO in this PR, as we need to move the whole code to the shim module.

Contributor Author

Thanks @ulysses-you, I will update the PR.

@github-actions

Run Gluten Clickhouse CI

Contributor

@ulysses-you ulysses-you left a comment

Thank you @ayushi-agarwal, LGTM. cc @rui-mo if you have other comments.

@zzcclp
Contributor

zzcclp commented Jan 12, 2024

LGTM

@ayushi-agarwal
Contributor Author

ayushi-agarwal commented Jan 12, 2024

Thank you @zzcclp @ulysses-you @zhli1142015 for reviewing.

Development

Successfully merging this pull request may close these issues.

[VL] Track all the failed unit test in Spark 3.4.

5 participants