Align GpuUnionExec with Spark 4.1's partitioner-aware union behavior [databricks] #14164
Conversation
…as children operators Signed-off-by: Niranjan Artal <[email protected]>
This was built on top of #14120, so it can be merged after Spark 4.1.1 support is added.
Greptile Overview

Greptile Summary

This PR implements Spark 4.1's partitioner-aware union behavior (SPARK-52921) for GpuUnionExec and fixes several missing or incorrect outputPartitioning overrides.

Key Changes:
- Spark 4.1 partitioner-aware union
- outputPartitioning fixes
- Testing: integration tests in dpp_test.py

Confidence Score: 4.5/5
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Spark as Spark 4.1 Planner
    participant GpuUnion as GpuUnionExec
    participant Shim as GpuUnionExecShim
    participant RDD as GpuPartitionerAwareUnionRDD
    participant Children as Child RDDs
    Spark->>GpuUnion: Plan UnionExec
    GpuUnion->>Shim: getOutputPartitioning(children)
    alt All children have compatible partitioning
        Shim-->>GpuUnion: HashPartitioning/SinglePartition
        Note over Shim: SQL_UNION_OUTPUT_PARTITIONING=true
    else Incompatible partitioning
        Shim-->>GpuUnion: UnknownPartitioning(0)
    end
    Spark->>GpuUnion: executeColumnar()
    GpuUnion->>Shim: unionColumnarRdds()
    alt UnknownPartitioning
        Shim->>Children: sc.union(rdds)
        Note over Shim,Children: Concatenate all partitions
        Children-->>GpuUnion: Sequential RDD
    else Has known partitioning
        Shim->>RDD: new GpuPartitionerAwareUnionRDD
        RDD->>Children: Group partitions at index i
        Note over RDD,Children: Partition-aware: rdds[0][i] + rdds[1][i] + ...
        Children-->>GpuUnion: Partitioner-aware RDD
    end
```
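The two union strategies in the diagram can be modeled in a few lines. This is a language-agnostic sketch in Python (function names are illustrative, not the plugin's Scala shim code), treating an RDD as a list of partitions:

```python
def sequential_union(rdds):
    # UnknownPartitioning path: concatenate every child's partitions in
    # order, so the result has sum(len(r) for r in rdds) partitions.
    return [part for rdd in rdds for part in rdd]

def partitioner_aware_union(rdds):
    # Known-partitioning path: all children must have the same number of
    # partitions; output partition i is the concatenation of each child's
    # partition i, which preserves the child partitioner.
    n = len(rdds[0])
    assert all(len(rdd) == n for rdd in rdds), "children must align"
    return [[row for rdd in rdds for row in rdd[i]] for i in range(n)]
```

With two children of two partitions each, the sequential path yields four output partitions, while the partitioner-aware path yields two, keeping co-partitioned rows together.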
4 files reviewed, 1 comment
sql-plugin/src/main/spark411/scala/com/nvidia/spark/rapids/shims/GpuUnionExecShim.scala
```scala
{"spark": "400"}
{"spark": "401"}
spark-rapids-shim-json-lines ***/
package com.nvidia.spark.rapids.shims
```
Missing {"spark": "400db173"}?
It's better to test Databricks by adding the [databricks] marker in the PR title.
Added the [databricks] marker to the PR title. I will add "400db173" in the PR that adds support for Databricks 17.3; adding it now might be confusing for some.
```scala
      children.map(_.executeColumnar()),
      numOutputRows,
      numOutputBatches)
  }
```
Do we also need to update what we say our output partitioning is?
|
NOTE: release/26.02 has been created from main. Please retarget your PR to release/26.02 if it should be included in the release. |
Signed-off-by: Niranjan Artal <[email protected]>
No files reviewed, no comments
Signed-off-by: Niranjan Artal <[email protected]>
3 files reviewed, no comments
4 files reviewed, no comments
jihoonson
left a comment
Thanks @nartal1. Are all these changes covered by existing tests?
GpuProjectExecLike: Fixed outputPartitioning to remap expressions through aliases (was missing, unlike Spark's PartitioningPreservingUnaryExecNode).
GpuBroadcastHashJoinExecBase: Added missing outputPartitioning override.
GpuCustomShuffleReaderExec: Fixed outputPartitioning for AQE coalesced reads to preserve HashPartitioning (matching Spark's AQEShuffleReadExec).
GpuShuffledHashJoinExec: Added outputPartitioning override matching Spark's HashJoin trait behavior.
GpuShuffledSymmetricHashJoinExec: Added outputPartitioning override for InnerLike and FullOuter join types.
GpuShuffledAsymmetricHashJoinExec: Added outputPartitioning override for LeftOuter and RightOuter join types.
GpuBroadcastNestedLoopJoinExecBase: Added outputPartitioning override matching Spark's BroadcastNestedLoopJoinExec behavior.
```scala
 * This is critical for Spark 4.1+ where UnionExec uses outputPartitioning
 * to decide between partitioner-aware union vs concatenation.
 */
override def outputPartitioning: Partitioning = {
```
Do we not need to handle the case when the project has aliases?
Also wonder if it's a good idea to have GpuProjectExecLike extend PartitioningPreservingUnaryExecNode instead of copying this code.
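The alias-handling question above is about partitioning propagation through projections: a project like `SELECT a AS x` over a child hash-partitioned on `a` should report hash partitioning on `x`, and if a partitioning column is dropped entirely, the partitioning is no longer known. A rough model (Python sketch with hypothetical names, not the plugin's Scala code):

```python
def remap_partitioning(part_cols, alias_map, output_cols):
    """Rewrite hash-partitioning columns through a projection's aliases.

    part_cols:   columns the child's HashPartitioning uses
    alias_map:   {child_column: output_alias} pairs from the project list
    output_cols: columns the project actually emits
    Returns the remapped columns, or None (UnknownPartitioning).
    """
    remapped = [alias_map.get(c, c) for c in part_cols]
    # If any partitioning column no longer appears in the output,
    # the child's partitioning cannot be preserved.
    if any(c not in output_cols for c in remapped):
        return None
    return remapped
```

This is roughly what Spark's PartitioningPreservingUnaryExecNode does with its expression-level alias map; extending that trait instead of copying the logic is the refactor suggested above.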
```scala
  }

  override def outputPartitioning: Partitioning =
    GpuUnionExecShim.getOutputPartitioning(children, output, conf)
```
Hmm, can we take the outputPartitioning of the cpu exec as a parameter of GpuUnionExec instead of duplicating the Spark code?
Thanks @jihoonson for the review. Your review comments on the refactor make sense. I will do it as a follow-on PR for 26.04 if that's okay. Filed an issue for these: #14229
…nionexec_paritionaware
4 files reviewed, 1 comment
...ugin/src/main/spark411/scala/com/nvidia/spark/rapids/shims/GpuPartitionerAwareUnionRDD.scala
4 files reviewed, no comments
build
jihoonson
left a comment
Thank you for filing a follow-up issue. LGTM
Fixes #14083 and contributes to #14135.
Description
This PR aligns GpuUnionExec with Apache Spark 4.1's change to UnionExec behavior introduced in SPARK-52921.
In Spark 4.1, UnionExec was changed to use SQLPartitioningAwareUnionRDD which groups partitions at corresponding indices across child RDDs, rather than concatenating all partitions sequentially.
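Under this change, the partitioner-aware path is only taken when it is safe. A simplified model of the decision (not Spark's actual implementation; the tuple encoding of a partitioning is hypothetical): the union reports a real partitioning only when every child reports the same non-trivial partitioning with the same partition count, and the feature flag is on; otherwise it reports UnknownPartitioning(0) and falls back to sequential concatenation.

```python
def union_output_partitioning(child_partitionings, flag_enabled=True):
    # child_partitionings: list of (kind, num_partitions) tuples,
    # e.g. ("hash(a)", 8) or ("unknown", 0).
    first = child_partitionings[0]
    if (flag_enabled
            and first[0] != "unknown"
            and all(p == first for p in child_partitionings)):
        return first           # partitioner-aware union, partitioning kept
    return ("unknown", 0)      # fall back to sequential concatenation
```

This mirrors why the GPU side must report accurate outputPartitioning on every child exec: an over- or under-reported partitioning flips the planner into the wrong branch.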
Most of the partitioner-aware union code is copied from Apache Spark's codebase.
While fixing this, we audited the outputPartitioning overrides of other execs and updated or added the missing ones.
These are also adapted from Spark's code and modified to fit this repo.
- GpuProjectExecLike: Fixed outputPartitioning to remap expressions through aliases (was missing, unlike Spark's PartitioningPreservingUnaryExecNode).
- GpuBroadcastHashJoinExecBase: Added missing outputPartitioning override.
- GpuCustomShuffleReaderExec: Fixed outputPartitioning for AQE coalesced reads to preserve HashPartitioning (matching Spark's AQEShuffleReadExec).
- GpuShuffledHashJoinExec: Added outputPartitioning override matching Spark's HashJoin trait behavior.
- GpuShuffledSymmetricHashJoinExec: Added outputPartitioning override for InnerLike and FullOuter join types.
- GpuShuffledAsymmetricHashJoinExec: Added outputPartitioning override for LeftOuter and RightOuter join types.
- GpuBroadcastNestedLoopJoinExecBase: Added outputPartitioning override matching Spark's BroadcastNestedLoopJoinExec behavior.

Some of the integration tests were failing on Spark 4.1 with the below error:
Testing
All the integration tests in dpp_test.py pass now with this PR.
Before this PR:
With this PR:
Checklists
(Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)