[FEA][AUDIT][SPARK-52921][SQL] Specify outputPartitioning for UnionExec for same output partitioning as children operators #14083

@abellina

This is a Spark 4.1 audit task.

With SPARK-52921, UnionExec reports an outputPartitioning when all of its children have the same output partitioning. According to the upstream issue, Union otherwise has "unknown" partitioning. I am not 100% sure why we would want this, but I am filing it in case we need it. My guess is that if we don't follow suit, spark-rapids could add shuffles in cases where the CPU plan wouldn't. That said, I am not certain, and I would like comments from @revans2 and others.
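To make the shuffle concern concrete, here is a toy Python model. It is not Spark or spark-rapids code: `HashPartitioning`, `union_output_partitioning`, and `needs_shuffle` are made-up names for illustration, and the model assumes a shuffle is inserted whenever the reported partitioning does not equal the required one. It also ignores how the union physically combines its children's partitions and how partitioning expressions map onto the union's output attributes.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class HashPartitioning:
    """Toy stand-in for a hash partitioning (hypothetical, not Spark's class)."""
    columns: Tuple[str, ...]
    num_partitions: int


def union_output_partitioning(
    children: Sequence[Optional[HashPartitioning]],
) -> Optional[HashPartitioning]:
    """Return the shared partitioning if every child reports the same one.

    None models "unknown" partitioning.
    """
    if not children:
        return None
    first = children[0]
    if first is not None and all(child == first for child in children):
        return first
    return None


def needs_shuffle(
    output: Optional[HashPartitioning], required: HashPartitioning
) -> bool:
    """Assumption: a shuffle is needed unless the output already matches."""
    return output != required


if __name__ == "__main__":
    p = HashPartitioning(("k",), 8)

    # A union that propagates the children's shared partitioning.
    propagated = union_output_partitioning([p, p])
    # A union that always reports unknown partitioning.
    unknown = None

    print("propagating union needs shuffle:", needs_shuffle(propagated, p))
    print("unknown union needs shuffle:", needs_shuffle(unknown, p))
```

In this model, a downstream operator that requires hash partitioning on `k` needs no extra shuffle when the union propagates the shared partitioning, but does need one when the union reports unknown partitioning. That is the mismatch I am guessing could appear between the CPU and GPU plans.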

While auditing, I was looking at a follow-up commit (apache/spark@c0acf45023f), so I created this issue for the main Spark change so that we can discuss it here.

Here's another related follow-up: apache/spark@8edc7685b97
