[SPARK-32056][SQL][Follow-up] Coalesce partitions for repartition hint and sql when AQE is enabled #28952
|
I also think this might be worth a new JIRA ticket, but since we initially discussed it as a follow-up, I filed it as a follow-up first. |
|
ok to test |
|
Does jenkins not work? |
|
Yea, it seems jenkins got sick last night... |
  case s: ShuffleExchangeExec => s
}
assert(shuffle.size == 1)
assert(shuffle(0).outputPartitioning.numPartitions == 10)
let's put 10 as a parameter, to make this method a bit more general.
|
+1 looks good to me too |
|
Test build #124649 has finished for PR 28952 at commit
|
|
retest this please |
|
Test build #124684 has finished for PR 28952 at commit
|
|
Test build #124712 has finished for PR 28952 at commit
|
|
retest this please |
|
Test build #124726 has started for PR 28952 at commit |
|
retest this please |
|
Test build #124774 has finished for PR 28952 at commit
|
|
retest this please... |
|
Test build #124804 has finished for PR 28952 at commit
|
dongjoon-hyun
left a comment
+1, LGTM. Thank you, @viirya and @cloud-fan .
Merged to master for Apache Spark 3.1.
|
Thanks all! |
…nt and sql when AQE is enabled As a follow-up to apache#28900, this patch extends coalescing partitions to repartitioning using hints and SQL syntax without specifying the number of partitions, when AQE is enabled. When repartitioning using hints and SQL syntax, we should follow the shuffling behavior of repartition by expression/range to coalesce partitions when AQE is enabled. Yes. After this change, if users don't specify the number of partitions when repartitioning using `REPARTITION`/`REPARTITION_BY_RANGE` hint or `DISTRIBUTE BY`/`CLUSTER BY`, AQE will coalesce partitions. Unit tests. Closes apache#28952 from viirya/SPARK-32056-sql. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
What changes were proposed in this pull request?
As a follow-up to #28900, this patch extends partition coalescing to repartitioning done through hints and SQL syntax when the number of partitions is not specified and AQE is enabled.
Why are the changes needed?
When repartitioning using hints and SQL syntax, we should follow the shuffling behavior of repartition by expression/range and coalesce partitions when AQE is enabled.
Does this PR introduce any user-facing change?
Yes. After this change, if users don't specify the number of partitions when repartitioning using
REPARTITION/REPARTITION_BY_RANGE hint or DISTRIBUTE BY/CLUSTER BY, AQE will coalesce partitions.
How was this patch tested?
Unit tests.
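To illustrate the user-facing change described above, here is a minimal sketch, not taken from the patch itself. It assumes Spark 3.1 or later running in local mode; the view name `t`, the object name, and the chosen configuration values are hypothetical and serve only as an illustration. It contrasts repartitioning without a partition count (where AQE may coalesce) against a hint that specifies one.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical illustration; names and values below are assumptions.
object RepartitionHintAqeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("repartition-hint-aqe-sketch")
      // Enable AQE and its partition coalescing.
      .config("spark.sql.adaptive.enabled", "true")
      .config("spark.sql.adaptive.coalescePartitions.enabled", "true")
      .config("spark.sql.shuffle.partitions", "10")
      .getOrCreate()

    spark.range(0, 100).createOrReplaceTempView("t")

    // No partition count given: per this PR, AQE is allowed to coalesce
    // the shuffle partitions produced by the hint or by DISTRIBUTE BY.
    val hinted = spark.sql("SELECT /*+ REPARTITION(id) */ * FROM t")
    val distributed = spark.sql("SELECT * FROM t DISTRIBUTE BY id")

    // Partition count given explicitly in the hint: the user asked for
    // a specific number, so this case is outside the change described here.
    val fixed = spark.sql("SELECT /*+ REPARTITION(10, id) */ * FROM t")

    // Print the resulting partition counts for comparison.
    println(s"hinted:      ${hinted.rdd.getNumPartitions}")
    println(s"distributed: ${distributed.rdd.getNumPartitions}")
    println(s"fixed:       ${fixed.rdd.getNumPartitions}")

    spark.stop()
  }
}
```

The printed counts depend on data size and AQE settings such as the advisory partition size, so no expected output is given here.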