[SPARK-48921][SQL] ScalaUDF encoders in subquery should be resolved for MergeInto #47380

viirya · 2024-07-17T02:32:14Z

What changes were proposed in this pull request?

We received a customer report that a MergeInto query on an Iceberg table, which worked before, fails after upgrading to Spark 3.4.

The error looks like

Caused by: org.apache.spark.SparkRuntimeException: Error while decoding: org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to nullable on unresolved object
upcast(getcolumnbyordinal(0, StringType), StringType, - root class: java.lang.String).toString.

The source table of the MergeInto query uses a ScalaUDF. The error occurs when Spark invokes the deserializer of the ScalaUDF's input encoder before that deserializer has been resolved.

The encoders of a ScalaUDF are resolved by the rule ResolveEncodersInUDF, which is applied at the end of the analysis phase.

While rewriting MergeInto into a ReplaceData query, Spark creates an Exists subquery, and the ScalaUDF becomes part of that subquery's plan. Note that the ScalaUDF itself is already resolved by the analyzer at this point.

The ResolveSubquery rule then resolves a subquery plan only if it is not yet resolved. Because the subquery containing the ScalaUDF is already resolved, the rule skips it, so ResolveEncodersInUDF is never applied to it. As a result, the analyzed ReplaceData query contains a ScalaUDF whose encoders are unresolved, which causes the error.

An earlier revision of this patch modified ResolveSubquery to resolve the subquery plan whenever it is not analyzed, to make sure the subquery plan is fully analyzed. That approach was replaced during review.

This patch moves the ResolveEncodersInUDF rule before the MergeInto rewrite to make sure the ScalaUDF in the subquery plan is fully analyzed.
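The ordering problem described above can be illustrated with a small toy model. This is only a sketch: `Udf`, `Plan`, and every function below are hypothetical stand-ins invented for illustration, not Spark's classes or rules. It assumes a plan carries a `resolved` flag that is independent of whether the encoders of its UDFs are resolved.

```scala
// Toy model with hypothetical names; none of these types are Spark classes.
object RuleOrderSketch {
  // A UDF call: the expression counts as resolved even if its encoder is not.
  final case class Udf(name: String, encoderResolved: Boolean)

  // A plan holds UDF calls, nested subquery plans, and a `resolved` flag.
  final case class Plan(udfs: List[Udf], subqueries: List[Plan], resolved: Boolean)

  // Stand-in for ResolveEncodersInUDF: touches only the UDFs of the given plan.
  def resolveEncoders(p: Plan): Plan =
    p.copy(udfs = p.udfs.map(_.copy(encoderResolved = true)))

  // Stand-in for ResolveSubquery: analyzes a subquery only if it is not resolved.
  def resolveSubqueries(p: Plan): Plan =
    p.copy(subqueries = p.subqueries.map { s =>
      if (s.resolved) s else resolveEncoders(resolveSubqueries(s)).copy(resolved = true)
    })

  // Stand-in for the MergeInto rewrite: the source plan becomes a subquery.
  def rewriteMerge(source: Plan): Plan =
    Plan(udfs = Nil, subqueries = List(source), resolved = false)

  // True only if every UDF in the plan and in all nested subqueries has a resolved encoder.
  def allEncodersResolved(p: Plan): Boolean =
    p.udfs.forall(_.encoderResolved) && p.subqueries.forall(allEncodersResolved)

  // Order before the patch: rewrite first, encoder resolution at the very end.
  def rewriteThenEncoders(source: Plan): Plan =
    resolveEncoders(resolveSubqueries(rewriteMerge(source)))

  // Order after the patch: encoder resolution first, then the rewrite.
  def encodersThenRewrite(source: Plan): Plan =
    resolveSubqueries(rewriteMerge(resolveEncoders(source)))
}
```

In this model, a source plan that is already marked resolved but still has an unresolved encoder stays broken under `rewriteThenEncoders`, because the subquery is skipped, while `encodersThenRewrite` resolves the encoder before the UDF is moved into the subquery.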

Why are the changes needed?

Fixing production query error.

Does this PR introduce any user-facing change?

Yes, fixing user-facing issue.

How was this patch tested?

Tested manually with a MergeInto query, and added a unit test.

Was this patch authored or co-authored using generative AI tooling?

No

dongjoon-hyun · 2024-07-17T03:18:24Z

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/ResolveSubquerySuite.scala

+  val testRelation = LocalRelation($"a".int, $"b".double)
+  val testRelation2 = LocalRelation($"c".int, $"d".string)
+
+  test("SPARK-48921: ScalaUDF in subquery should run through analyzer") {


Thank you for adding this.

dongjoon-hyun · 2024-07-17T03:21:02Z

Do you happen to know which JIRA issue is related to this regression, @viirya ?

after upgrading to Spark 3.4.

dongjoon-hyun

+1, LGTM (Pending CIs).

cc @cloud-fan, @yaooqinn , too

viirya · 2024-07-17T05:02:54Z

Do you happen to know which JIRA issue is related to this regression, @viirya ?

after upgrading to Spark 3.4.

Thank you for the review, @dongjoon-hyun.

It is not caused by a particular JIRA change, so I don't think it is a regression.

The Iceberg MergeInto query error happens in the row-level group filter query, a feature that was added in Spark 3.4. The Spark version the customer used previously does not have this feature, so it did not trigger the issue.

viirya · 2024-07-17T05:04:24Z

I re-triggered the failed Run Docker integration tests.

All CI checks have passed now: https://github.com/viirya/spark-1/actions/runs/9967182407/job/27542878853

dongjoon-hyun · 2024-07-17T05:07:36Z

Got it. Feel free to merge and backport, @viirya ~

viirya · 2024-07-17T05:09:06Z

Thank you @dongjoon-hyun. I will keep it for a day and merge if no more comments.

cloud-fan · 2024-07-17T13:00:30Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

    private def resolveSubQueries(plan: LogicalPlan, outer: LogicalPlan): LogicalPlan = {
      plan.transformAllExpressionsWithPruning(_.containsPattern(PLAN_EXPRESSION), ruleId) {
-        case s @ ScalarSubquery(sub, _, exprId, _, _, _) if !sub.resolved =>
+        case s @ ScalarSubquery(sub, _, exprId, _, _, _) if !sub.analyzed =>


will we ever set the analyzed flag to true for plans in SubqueryExpression?

Ah, it runs execute instead of executeCheck. It will re-enter this every call.

Hmm, maybe we should change to executeCheck?

CheckAnalysis will check subquery expressions recursively, so we shouldn't check it here.

shall we mark ScalaUDF as unresolved if the encoder is not resolved yet?

I added a checkAnalysis after analysis of the subquery plan now.

CheckAnalysis will check subquery expressions recursively, so we shouldn't check it here.

I meant to check sub plan included in the subquery, not the subquery expression itself. It shouldn't be recursive.

shall we mark ScalaUDF as unresolved if the encoder is not resolved yet?

I tried that before as well, but it had a side effect that caused the MergeInto query to fail, so I removed it before submitting this PR.

CheckAnalysis will check subquery expressions recursively, so we shouldn't check it here.

I meant to check sub plan included in the subquery, not the subquery expression itself. It shouldn't be recursive.

It isn't recursive, but InlineCTE.buildCTEMap has some issues on it.
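The point raised at the start of this thread, that a guard on the analyzed flag is re-entered on every call when the subquery is analyzed through an entry point that never sets the flag, can be sketched with a toy model. All names below (`SubPlan`, `execute`, `executeAndMark`, `runBatch`) are hypothetical and chosen for illustration; they are not Spark's methods, and the mutable counter exists only to make the effect visible.

```scala
// Toy model with hypothetical names; it mirrors the discussion, not Spark's code.
object AnalyzedFlagSketch {
  final class SubPlan {
    var analyzed: Boolean = false // set only by the marking entry point
    var analyzerRuns: Int = 0     // how many times the full analyzer ran on this plan
  }

  // Stand-in for an entry point that runs the rules but never marks the plan.
  def execute(p: SubPlan): Unit = p.analyzerRuns += 1

  // Stand-in for an entry point that runs the rules and then marks the plan.
  def executeAndMark(p: SubPlan): Unit = {
    p.analyzerRuns += 1
    p.analyzed = true
  }

  // A fixed-point batch: the subquery rule fires on each iteration whose guard passes.
  def runBatch(p: SubPlan, iterations: Int, run: SubPlan => Unit): Int = {
    (1 to iterations).foreach(_ => if (!p.analyzed) run(p))
    p.analyzerRuns
  }
}
```

With the non-marking entry point the guard never becomes false, so the subquery is re-analyzed once per iteration; with the marking entry point it is analyzed exactly once.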

cloud-fan · 2024-07-17T13:01:45Z

is this a rule order issue? Shall we run ResolveEncodersInUDF before rewriting MergeInto?

viirya · 2024-07-17T13:29:09Z

is this a rule order issue? Shall we run ResolveEncodersInUDF before rewriting MergeInto?

That also works. Actually, that was my first fix.

cloud-fan · 2024-07-17T16:59:22Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

        executeSameContext(e.plan)
      }

+      checkAnalysis(newSubqueryPlan)


This change is risky as we may fail earlier than before, while before we can still resolve the subquery expression after more iterations.

Hmm, makes sense. Maybe go back to the fix that moves ResolveEncodersInUDF?

I think adjusting the rule order is probably the safest solution for now. The current way of resolving subquery expressions is quite fragile. Ideally we should recursively invoke the full analyzer only once (and must invoke once) for each subquery expression, instead of doing it again and again with the if resolved check.

Okay. I will make the change (and maybe adjust the test).

Btw, in the MergeInto issue we encountered, the ScalaUDF is placed in a subquery plan by the MergeInto rewrite rule, so moving ResolveEncodersInUDF before the rewrite fixes that issue.

But it doesn't fix the general issue where a ScalaUDF is in some other subquery plan.

I would prefer to fix the general issue rather than only the corner case we encountered.

Let me fix the issue we encountered first. We can consider the general issue later.
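The "general issue" mentioned above concerns a ScalaUDF sitting in any nested subquery plan. One way to picture a general fix is encoder resolution that descends into every subquery regardless of its resolved state. The sketch below is a toy model under that assumption; `Udf`, `Plan`, `resolveEncodersDeep`, and `unresolvedUdfs` are hypothetical names, and this is not how Spark implements it.

```scala
// Toy model with hypothetical names; an illustration, not Spark's implementation.
object RecursiveEncoderSketch {
  final case class Udf(name: String, encoderResolved: Boolean)
  final case class Plan(udfs: List[Udf], subqueries: List[Plan])

  // Resolve encoders in this plan and, recursively, in every nested subquery plan.
  def resolveEncodersDeep(p: Plan): Plan =
    Plan(
      udfs = p.udfs.map(_.copy(encoderResolved = true)),
      subqueries = p.subqueries.map(resolveEncodersDeep)
    )

  // Collect the names of UDFs whose encoders are still unresolved, at any depth.
  def unresolvedUdfs(p: Plan): List[String] =
    p.udfs.filterNot(_.encoderResolved).map(_.name) ++ p.subqueries.flatMap(unresolvedUdfs)
}
```

Unlike a rule-order change, a traversal of this shape does not depend on which rule created the subquery.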

viirya · 2024-07-18T01:11:54Z

@cloud-fan I changed the rule order of ResolveEncodersInUDF. The unit test is updated too.

cloud-fan · 2024-07-18T01:36:11Z

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

      HandleNullInputsForUDF,
      UpdateAttributeNullability),
-    Batch("UDF", Once,
-      ResolveEncodersInUDF),


hmm, shall we move the MergeInto rewrite rule after ResolveEncodersInUDF instead?

IIRC we need to run ResolveEncodersInUDF after the ScalaUDF Null Handling batch

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala

dongjoon-hyun

It seems that the new code fails to compile.

viirya · 2024-07-18T15:04:02Z

Thanks for the review, @dongjoon-hyun @huaxingao @yaooqinn @cloud-fan.

I got an error when running merge_spark_pr.py. Could you help merge and backport this PR? If there are any conflicts, I will open backport PRs.

Thank you.

dongjoon-hyun · 2024-07-18T20:00:55Z

Merged to master.

Could you make backporting PRs to branch-3.5 and branch-3.4 because there are conflicts, @viirya ?

viirya · 2024-07-18T20:02:30Z

Thank you @dongjoon-hyun . I will create backporting PRs.

viirya · 2024-07-18T20:19:44Z

The backport PR for branch-3.5: #47406

For branch-3.4, although MergeInto was added there, executing it is not really supported. We have it internally because we backported some changes to our internal 3.4 branch. So we don't need a backport PR for branch-3.4.

Closes apache#47380 from viirya/fix_subquery_resolve.

Lead-authored-by: Liang-Chi Hsieh
Co-authored-by: Kent Yao
Signed-off-by: Dongjoon Hyun

What changes were proposed in this pull request?

Follow-up of #47380: we should resolve the Scala UDF within all subqueries, instead of modifying the rule order to make the DML rewrites work.

This PR also moves the DML rewrite rules to the main resolution batch, so that other rules such as ResolveTableConstraints can be applied to the results of the DML rewrite.

Why are the changes needed?

A better and simpler fix for SPARK-48921.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests

Was this patch authored or co-authored using generative AI tooling?

No

Closes #50973 from gengliangwang/fixDML.

Authored-by: Gengliang Wang
Signed-off-by: Gengliang Wang

Commit 3657b7c: SPARK-48921: ScalaUDF in subquery should run through analyzer

github-actions bot added the SQL label Jul 17, 2024

dongjoon-hyun reviewed Jul 17, 2024

dongjoon-hyun approved these changes Jul 17, 2024

huaxingao approved these changes Jul 17, 2024

yaooqinn approved these changes Jul 17, 2024

cloud-fan reviewed Jul 17, 2024

Commit 708b959: check and set analyzed

cloud-fan reviewed Jul 17, 2024

Commit c90e463: Move ResolveEncodersInUDF rule

viirya changed the title from "[SPARK-48921][SQL] ScalaUDF in subquery should run through analyzer" to "[SPARK-48921][SQL] ScalaUDF encoders in subquery should be resolved for MergeInto" Jul 18, 2024

cloud-fan reviewed Jul 18, 2024 (sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala, outdated)

cloud-fan reviewed Jul 18, 2024 (sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala, outdated)

Commit 544828d: For review

viirya force-pushed the fix_subquery_resolve branch from 4b62a8f to 544828d July 18, 2024 02:08

cloud-fan reviewed Jul 18, 2024 (sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala)

cloud-fan approved these changes Jul 18, 2024

Commit fd17bd4: Add comment

dongjoon-hyun approved these changes Jul 18, 2024

dongjoon-hyun reviewed Jul 18, 2024 (sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala, outdated)

Commit 0722454: Fix

yaooqinn reviewed Jul 18, 2024 (sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala, outdated)

Commit c93cac5: Update sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/anal… …ysis/Analyzer.scala

yaooqinn approved these changes Jul 18, 2024

dongjoon-hyun closed this in fab6d83 Jul 18, 2024

gengliangwang mentioned this pull request May 22, 2025: [SPARK-52252][SQL] ScalaUDF encoders in subquery should be resolved #50973 (Closed)
