@Kimahriman
Contributor

What changes were proposed in this pull request?

Fix the nullability of the Base64 expression to be based on the child's nullability, and not always be nullable.

Why are the changes needed?

#47303 switched Base64 to StaticInvoke, which had the side effect of making the expression always nullable. That change was also backported to Spark 3.5.2 and caused schema mismatch errors for stateful streams when we upgraded. This PR restores the previous behavior, which StaticInvoke supports through its returnNullable argument: if the child is non-nullable, we know the result will be non-nullable.
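
To make the mechanism concrete, here is a minimal sketch in plain Scala. It is a simplified model, not Spark's Catalyst code: `Expr`, `Leaf`, `Invoke`, `base64Before`, and `base64After` are hypothetical names, and the nullability rule is an assumption about how a StaticInvoke-style node combines its arguments with a `returnNullable` flag.

```scala
// Simplified model only: these are NOT Spark's Catalyst classes.
object StaticInvokeNullabilitySketch {
  sealed trait Expr { def nullable: Boolean }

  // A leaf expression whose nullability is declared by the caller.
  final case class Leaf(nullable: Boolean) extends Expr

  // Toy stand-in for a StaticInvoke-style node (assumed rule, see lead-in).
  final case class Invoke(
      arguments: Seq[Expr],
      propagateNull: Boolean = true,
      returnNullable: Boolean = true) extends Expr {
    // A null check is only needed if nulls propagate and some argument may be null.
    private def needNullCheck: Boolean =
      propagateNull && arguments.exists(_.nullable)

    // The result may be null if an input may be null or the method itself may return null.
    def nullable: Boolean = needNullCheck || returnNullable
  }

  // Before the fix: returnNullable is left at its default, so the result is always nullable.
  def base64Before(child: Expr): Expr = Invoke(Seq(child))

  // After the fix: the encoder never returns null for a non-null input.
  def base64After(child: Expr): Expr = Invoke(Seq(child), returnNullable = false)
}
```

With this model, `base64Before(Leaf(nullable = false)).nullable` is `true`, while `base64After(Leaf(nullable = false)).nullable` is `false`, which mirrors the behavior change described above.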

Does this PR introduce any user-facing change?

Restores the nullability of the Base64 expression to what it was in Spark 3.5.1 and earlier.
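
As a rough illustration of why a nullability flip is user-facing for stateful streams, the sketch below compares a schema recorded before an upgrade with one computed after it. It is a toy model: `Field`, `isCompatible`, and `mismatches` are hypothetical, and the compatibility rule is an assumption for illustration, not Spark's actual state schema check.

```scala
// Toy model: Field and isCompatible are hypothetical, not Spark's schema classes.
object StateSchemaSketch {
  final case class Field(name: String, dataType: String, nullable: Boolean)

  // Assumed rule: a field stored as non-nullable cannot accept a field that may now be null.
  def isCompatible(stored: Field, current: Field): Boolean =
    stored.name == current.name &&
      stored.dataType == current.dataType &&
      (stored.nullable || !current.nullable)

  // Names of the fields that would fail the assumed check.
  def mismatches(stored: Seq[Field], current: Seq[Field]): Seq[String] =
    stored.zip(current).collect { case (s, c) if !isCompatible(s, c) => s.name }
}
```

Under these assumptions, a column stored as `Field("encoded", "string", nullable = false)` is reported as a mismatch once the same column is computed as nullable, and passes again once the nullability is restored.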

How was this patch tested?

New UT

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Aug 30, 2024
Member

@MaxGekk MaxGekk left a comment

LGTM except for one comment.

Please append the [SQL] tag to the PR's title.

Comment on lines 470 to 473
assert(!Base64(Literal(bytes)).nullable)
assert(Base64(Literal.create(null, BinaryType)).nullable)
assert(!UnBase64(Literal("AQIDBA==")).nullable)
assert(UnBase64(Literal.create(null, StringType)).nullable)
Member

@MaxGekk MaxGekk Sep 1, 2024

Let's add a check for the case where you pass a non-NULL expression but mark it as nullable:

assert(Base64(Literal(bytes).castNullable()).nullable)
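
The point of this extra check is that nullability is a declared property of the child, not a property of the value it happens to hold. A minimal sketch of that distinction follows, using hypothetical stand-ins (`Lit`, `MarkNullable`, `Encode`) rather than Spark's `Literal`, `castNullable()`, or `Base64`:

```scala
// Toy model: these types only imitate the shape of the expressions in the test.
object DeclaredNullabilitySketch {
  sealed trait Expr { def nullable: Boolean }

  // A literal is nullable only when it actually holds no value.
  final case class Lit(value: Option[String]) extends Expr {
    def nullable: Boolean = value.isEmpty
  }

  // Hypothetical wrapper that declares its child nullable regardless of the value.
  final case class MarkNullable(child: Expr) extends Expr {
    def nullable: Boolean = true
  }

  // Hypothetical encoder whose nullability follows its child's declared nullability.
  final case class Encode(child: Expr) extends Expr {
    def nullable: Boolean = child.nullable
  }
}
```

In this model, `Encode(MarkNullable(Lit(Some("AQIDBA=="))))` is nullable even though the wrapped value is present, which is the case the suggested assertion covers.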

Contributor Author

Added

@Kimahriman Kimahriman changed the title [SPARK-49476] Fix nullability of base64 function [SPARK-49476][SQL] Fix nullability of base64 function Sep 1, 2024
assert(UnBase64(Literal.create(null, StringType)).nullable)
assert(UnBase64(Literal("AQIDBA==").castNullable()).nullable)


Contributor

nit: Adding one blank line here is enough

Member

@MaxGekk MaxGekk left a comment

+1, LGTM. Merging to master/3.5.
Thank you, @Kimahriman, and @LuciferYang and @yaooqinn for the review.

@MaxGekk MaxGekk closed this in c274c5a Sep 2, 2024
MaxGekk added a commit that referenced this pull request Sep 2, 2024
Closes #47941 from Kimahriman/base64-nullability.

Lead-authored-by: Adam Binford <[email protected]>
Co-authored-by: Maxim Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit c274c5a)
Signed-off-by: Max Gekk <[email protected]>
@Kimahriman
Contributor Author

Made a follow-up to fix a test in the 3.5 backport: #47964

yaooqinn pushed a commit that referenced this pull request Sep 3, 2024
### What changes were proposed in this pull request?

Fix a test that is failing from backporting #47941

### Why are the changes needed?

Fix test

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Fixed test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47964 from Kimahriman/base64-proto-test.

Authored-by: Adam Binford <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
@dongjoon-hyun
Member

Thank you, @Kimahriman and all.

turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
* [SPARK-49476][SQL] Fix nullability of base64 function

Closes apache#47941 from Kimahriman/base64-nullability.

Lead-authored-by: Adam Binford <[email protected]>
Co-authored-by: Maxim Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
(cherry picked from commit c274c5a)
Signed-off-by: Max Gekk <[email protected]>

* [SPARK-49476][SQL][3.5][FOLLOWUP] Fix base64 proto test

Closes apache#47964 from Kimahriman/base64-proto-test.

Authored-by: Adam Binford <[email protected]>
Signed-off-by: Kent Yao <[email protected]>

---------

Signed-off-by: Max Gekk <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
Co-authored-by: Adam Binford <[email protected]>
Co-authored-by: Maxim Gekk <[email protected]>