Collaborator

@nartal1 nartal1 commented Jan 17, 2026

Fixes #14083 and contributes to #14135.

Description

This PR aligns GpuUnionExec with Apache Spark 4.1's change to UnionExec behavior introduced in SPARK-52921.
In Spark 4.1, UnionExec was changed to use SQLPartitioningAwareUnionRDD, which groups partitions at corresponding indices across child RDDs rather than concatenating all partitions sequentially.
Most of the partitioner-aware union code is copied from Apache Spark's codebase.
While fixing this, I audited the outputPartitioning overrides of other execs, fixing the incorrect ones and adding the missing ones.
These are also copied from Spark's code and adapted to fit this repo.
  • GpuProjectExecLike: Fixed outputPartitioning to remap expressions through aliases (was missing, unlike Spark's PartitioningPreservingUnaryExecNode).
  • GpuBroadcastHashJoinExecBase: Added missing outputPartitioning override.
  • GpuCustomShuffleReaderExec: Fixed outputPartitioning for AQE coalesced reads to preserve HashPartitioning (matching Spark's AQEShuffleReadExec).
  • GpuShuffledHashJoinExec: Added outputPartitioning override matching Spark's HashJoin trait behavior.
  • GpuShuffledSymmetricHashJoinExec: Added outputPartitioning override for InnerLike and FullOuter join types.
  • GpuShuffledAsymmetricHashJoinExec: Added outputPartitioning override for LeftOuter and RightOuter join types.
  • GpuBroadcastNestedLoopJoinExecBase: Added outputPartitioning override matching Spark's BroadcastNestedLoopJoinExec behavior.
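
For the shuffled join items above, the general idea is that the join type decides which side's partitioning can still be reported after the join. The following is a minimal sketch of that rule using hypothetical stand-in types (`JoinPartitioningSketch` and everything inside it are illustrative names, not Spark's or this PR's classes), and it only covers the four join types named in the list:

```scala
object JoinPartitioningSketch {
  // Hypothetical stand-ins for Spark's Partitioning hierarchy.
  sealed trait Partitioning { def numPartitions: Int }
  case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
  case class UnknownPartitioning(numPartitions: Int) extends Partitioning
  case class PartitioningCollection(parts: Seq[Partitioning]) extends Partitioning {
    def numPartitions: Int = parts.head.numPartitions
  }

  // Hypothetical stand-ins for the join types mentioned in the description.
  sealed trait JoinType
  case object Inner extends JoinType
  case object LeftOuter extends JoinType
  case object RightOuter extends JoinType
  case object FullOuter extends JoinType

  def outputPartitioning(
      joinType: JoinType,
      left: Partitioning,
      right: Partitioning): Partitioning = joinType match {
    // Every output row matched on both sides, so either partitioning holds.
    case Inner => PartitioningCollection(Seq(left, right))
    // Unmatched rows carry nulls for the other side, so only one side survives.
    case LeftOuter => left
    case RightOuter => right
    // Both sides may contain null-padded rows: only the partition count is known.
    case FullOuter => UnknownPartitioning(left.numPartitions)
  }
}
```

Broadcast joins are not covered by this sketch.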

Before this change, some of the integration tests failed on Spark 4.1 with the following error:

AssertionError: CPU and GPU list have different lengths at [] CPU: 10 GPU: 20

Testing

All the integration tests in dpp_test.py now pass with this PR.
Before this PR:

========================================================== 16 failed, 89 passed, 154 warnings in 127.48s (0:02:07) ==========================================================

With this PR:

=============================================================== 105 passed, 154 warnings in 123.63s (0:02:03) ===============================================================

Checklists

  • This PR has added documentation for new or modified features or behaviors.
  • This PR has added new tests or modified existing tests to cover new code paths.
    (Please explain in the PR description how the new code paths are tested, such as names of the new/existing tests that cover them.)
  • Performance testing has been performed and its results are added in the PR description. Or, an issue has been filed with a link in the PR description.

@nartal1 nartal1 self-assigned this Jan 17, 2026
@nartal1 nartal1 added the audit_4.1.0 Audit related tasks for 4.1.0 label Jan 17, 2026
Signed-off-by: Niranjan Artal <[email protected]>
Collaborator Author

nartal1 commented Jan 17, 2026

This PR is built on top of #14120, so it can be merged only after Spark 4.1.1 support is added.

Contributor

greptile-apps bot commented Jan 17, 2026

Greptile Overview

Greptile Summary

This PR implements Spark 4.1's partitioner-aware union behavior (SPARK-52921) for GpuUnionExec and fixes outputPartitioning across several GPU exec nodes.

Key Changes

Spark 4.1 Partitioner-Aware Union:

  • Implements GpuPartitionerAwareUnionRDD that groups partitions at corresponding indices (e.g., partition 0 from all children) rather than concatenating sequentially
  • Falls back to concatenation when outputPartitioning is UnknownPartitioning
  • Adds logic to compare child partitionings and determine compatibility
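
To make the difference between the two union strategies concrete, here is a minimal plain-Scala sketch. It is not the PR's GpuPartitionerAwareUnionRDD: `UnionSketch`, `Rdd`, `Partition`, and the `partitioningKnown` flag are hypothetical names that model each child RDD as a sequence of partitions.

```scala
object UnionSketch {
  // Hypothetical model: a partition is a Seq of rows, an "RDD" is a Seq of partitions.
  type Partition = Seq[Int]
  type Rdd = Seq[Partition]

  // Sequential union: output partition count is the sum of the children's counts.
  def concatUnion(children: Seq[Rdd]): Rdd = children.flatten

  // Partitioner-aware union: output partition i holds partition i of every child.
  // Assumes (and checks) that all children have the same number of partitions.
  def partitionAwareUnion(children: Seq[Rdd]): Rdd = {
    require(children.nonEmpty, "need at least one child")
    val n = children.head.length
    require(children.forall(_.length == n), "children must have equal partition counts")
    (0 until n).map(i => children.flatMap(_(i)))
  }

  // Fall back to concatenation when the children's partitioning is not known
  // to be compatible (modelled here as a simple boolean).
  def union(children: Seq[Rdd], partitioningKnown: Boolean): Rdd =
    if (partitioningKnown) partitionAwareUnion(children) else concatUnion(children)
}
```

With two children of two partitions each, `concatUnion` yields four output partitions while `partitionAwareUnion` yields two, so the two strategies report different partition counts for the same input.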

outputPartitioning Fixes:

  • GpuProjectExecLike: Now remaps expressions through aliases and flattens PartitioningCollection, matching Spark's PartitioningPreservingUnaryExecNode
  • GpuShuffledHashJoinExec: Returns PartitioningCollection for InnerLike joins
  • GpuShuffledSymmetricHashJoinExec and GpuShuffledAsymmetricHashJoinExec: Added proper partitioning based on join type
  • GpuCustomShuffleReaderExec: Preserves HashPartitioning for coalesced reads with updated partition count
  • GpuBroadcastHashJoinExecBase and GpuBroadcastNestedLoopJoinExecBase: Added missing overrides
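
The alias remapping and PartitioningCollection flattening described for GpuProjectExecLike can be illustrated with a small model. This is a sketch under simplifying assumptions, not the PR's code: partitioning keys are plain column names, each child column has at most one output name, and `AliasRemapSketch` with its types is hypothetical.

```scala
object AliasRemapSketch {
  // Hypothetical stand-ins for Spark's Partitioning hierarchy.
  sealed trait Partitioning { def numPartitions: Int }
  case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
  case class UnknownPartitioning(numPartitions: Int) extends Partitioning
  case class PartitioningCollection(parts: Seq[Partitioning]) extends Partitioning {
    def numPartitions: Int = parts.head.numPartitions
  }

  // Flatten nested collections into a flat list of candidate partitionings.
  private def flatten(p: Partitioning): Seq[Partitioning] = p match {
    case PartitioningCollection(ps) => ps.flatMap(flatten)
    case other => Seq(other)
  }

  // `aliases` maps a child column name to the name the projection outputs,
  // e.g. SELECT a AS x gives Map("a" -> "x"). Dropped columns are absent.
  def remap(child: Partitioning, aliases: Map[String, String]): Partitioning = {
    // Keep only partitionings whose keys all survive the projection.
    val remapped: Seq[Partitioning] = flatten(child).collect {
      case HashPartitioning(keys, n) if keys.forall(aliases.contains) =>
        HashPartitioning(keys.map(aliases), n)
    }
    remapped match {
      case Seq() => UnknownPartitioning(child.numPartitions)
      case Seq(single) => single
      case many => PartitioningCollection(many)
    }
  }
}
```

For example, a child partitioned by `a` or by `b` that is projected as `SELECT a AS x` reports hash partitioning on `x`. Without the remap, a parent such as the union would see keys that no longer exist in the output.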

Testing: Integration tests in dpp_test.py now pass (16 failures → 0 failures).

Confidence Score: 4.5/5

  • This PR is safe to merge with minor validation needed on edge cases
  • The implementation closely follows Spark's approach and all integration tests pass. The main concern is the assumption in GpuPartitionerAwareUnionRDD that all RDDs have the same partition count, which relies on caller validation
  • Verify that GpuPartitionerAwareUnionRDD's partition count assumptions are always satisfied by getOutputPartitioning logic

Important Files Changed

| Filename | Overview |
|---|---|
| sql-plugin/src/main/spark411/scala/com/nvidia/spark/rapids/shims/GpuUnionExecShim.scala | Implements Spark 4.1 partitioner-aware union with proper attribute mapping and partitioning comparison |
| sql-plugin/src/main/spark411/scala/com/nvidia/spark/rapids/shims/GpuPartitionerAwareUnionRDD.scala | Groups partitions at corresponding indices; assumes all RDDs have same partition count (validated by caller) |
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/basicPhysicalOperators.scala | Fixed GpuProjectExecLike.outputPartitioning to remap expressions through aliases and flatten PartitioningCollection |
| sql-plugin/src/main/scala/org/apache/spark/sql/rapids/execution/GpuCustomShuffleReaderExec.scala | Fixed outputPartitioning for AQE coalesced reads to preserve HashPartitioning with updated partition count |
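
The GpuCustomShuffleReaderExec change can be pictured as follows. This is a rough sketch based only on the description above, with hypothetical names (`CoalescedReadSketch`, `readerPartitioning`, `numSpecs`, `isCoalesced`) and only hash partitioning modelled:

```scala
object CoalescedReadSketch {
  // Hypothetical stand-ins for Spark's Partitioning hierarchy.
  sealed trait Partitioning { def numPartitions: Int }
  case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning
  case class UnknownPartitioning(numPartitions: Int) extends Partitioning

  // numSpecs: number of partitions the reader produces after AQE rewrote the read.
  // isCoalesced: the reader only merges whole shuffle partitions, so rows with
  // equal keys still end up in the same output partition.
  def readerPartitioning(
      shuffle: Partitioning,
      numSpecs: Int,
      isCoalesced: Boolean): Partitioning = shuffle match {
    // Keep the hash keys, but report the new (smaller) partition count.
    case h: HashPartitioning if isCoalesced => h.copy(numPartitions = numSpecs)
    // Any other kind of read gives no guarantee beyond the partition count.
    case _ => UnknownPartitioning(numSpecs)
  }
}
```

In this model, a 200-partition hash shuffle coalesced to 10 partitions still reports hash partitioning on the same keys, with 10 partitions.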

Sequence Diagram

sequenceDiagram
    participant Spark as Spark 4.1 Planner
    participant GpuUnion as GpuUnionExec
    participant Shim as GpuUnionExecShim
    participant RDD as GpuPartitionerAwareUnionRDD
    participant Children as Child RDDs
    
    Spark->>GpuUnion: Plan UnionExec
    GpuUnion->>Shim: getOutputPartitioning(children)
    alt All children have compatible partitioning
        Shim-->>GpuUnion: HashPartitioning/SinglePartition
        Note over Shim: SQL_UNION_OUTPUT_PARTITIONING=true
    else Incompatible partitioning
        Shim-->>GpuUnion: UnknownPartitioning(0)
    end
    
    Spark->>GpuUnion: executeColumnar()
    GpuUnion->>Shim: unionColumnarRdds()
    alt UnknownPartitioning
        Shim->>Children: sc.union(rdds)
        Note over Shim,Children: Concatenate all partitions
        Children-->>GpuUnion: Sequential RDD
    else Has known partitioning
        Shim->>RDD: new GpuPartitionerAwareUnionRDD
        RDD->>Children: Group partitions at index i
        Note over RDD,Children: Partition-aware: rdds[0][i] + rdds[1][i] + ...
        Children-->>GpuUnion: Partitioner-aware RDD
    end

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment

{"spark": "400"}
{"spark": "401"}
spark-rapids-shim-json-lines ***/
package com.nvidia.spark.rapids.shims
Collaborator

Is {"spark": "400db173"} missing?
It would be better to test on Databricks by adding the [databricks] marker to the PR title.

Collaborator Author

Added the [databricks] marker to the PR title. I will add "400db173" in the PR that adds support for Databricks 17.3; adding it now might be confusing.

@nartal1 nartal1 changed the title Align GpuUnionExec with Spark 4.1's partitioner-aware union behavior. Align GpuUnionExec with Spark 4.1's partitioner-aware union behavior [databricks] Jan 20, 2026
children.map(_.executeColumnar()),
numOutputRows,
numOutputBatches)
}
Collaborator

Do we also need to update what we say our output partitioning is?

Collaborator

nvauto commented Jan 26, 2026

NOTE: release/26.02 has been created from main. Please retarget your PR to release/26.02 if it should be included in the release.

Contributor

@greptile-apps greptile-apps bot left a comment

No files reviewed, no comments

Signed-off-by: Niranjan Artal <[email protected]>
Contributor

@greptile-apps greptile-apps bot left a comment

3 files reviewed, no comments

@res-life res-life linked an issue Jan 29, 2026 that may be closed by this pull request
@nartal1 nartal1 requested a review from a team January 29, 2026 22:31
Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

gerashegalov
gerashegalov previously approved these changes Jan 30, 2026
@nartal1 nartal1 changed the base branch from main to release/26.02 January 30, 2026 03:22
@nartal1 nartal1 dismissed gerashegalov’s stale review January 30, 2026 03:22

The base branch was changed.

Collaborator

@jihoonson jihoonson left a comment

Thanks @nartal1. Are all these changes covered by existing tests?

GpuProjectExecLike: Fixed outputPartitioning to remap expressions through aliases (was missing, unlike Spark's PartitioningPreservingUnaryExecNode).
GpuBroadcastHashJoinExecBase: Added missing outputPartitioning override
GpuCustomShuffleReaderExec: Fixed outputPartitioning for AQE coalesced reads to preserve HashPartitioning (matching Spark's AQEShuffleReadExec).
GpuShuffledHashJoinExec: Added outputPartitioning override matching Spark's HashJoin trait behavior.
GpuShuffledSymmetricHashJoinExec: Added outputPartitioning override for InnerLike and FullOuter join types.
GpuShuffledAsymmetricHashJoinExec: Added outputPartitioning override for LeftOuter and RightOuter join types.
GpuBroadcastNestedLoopJoinExecBase: Added outputPartitioning override matching Spark's BroadcastNestedLoopJoinExec behavior.

* This is critical for Spark 4.1+ where UnionExec uses outputPartitioning
* to decide between partitioner-aware union vs concatenation.
*/
override def outputPartitioning: Partitioning = {
Collaborator

Do we not need to handle the case when the project has aliases?

Collaborator

I also wonder whether it would be a good idea to have GpuProjectExecLike extend PartitioningPreservingUnaryExecNode instead of copying this code.

}

override def outputPartitioning: Partitioning =
GpuUnionExecShim.getOutputPartitioning(children, output, conf)
Collaborator

Hmm, can we take the outputPartitioning of the cpu exec as a parameter of GpuUnionExec instead of duplicating the Spark code?

Collaborator Author

Thanks @jihoonson for the review. Your review comments on the refactor make sense. I will address them in a follow-on PR for 26.04, if that's okay. Filed an issue for these: #14229

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, 1 comment

Contributor

@greptile-apps greptile-apps bot left a comment

4 files reviewed, no comments

Collaborator Author

nartal1 commented Jan 30, 2026

build

Collaborator

@jihoonson jihoonson left a comment

Thank you for filing a follow-up issue. LGTM

@gerashegalov gerashegalov merged commit 4c18b5a into NVIDIA:release/26.02 Jan 31, 2026
44 checks passed

Labels

audit_4.1.0 Audit related tasks for 4.1.0

6 participants