[SPARK-32668][SQL] HiveGenericUDTF initialize UDTF should use StructObjectInspector method #29490
Conversation
```diff
-protected lazy val inputInspectors = children.map(toInspector)
+protected lazy val inputInspectors = {
+  val inspectors = children.map(toInspector)
+  val fields = inspectors.indices.map(index => s"col$index").asJava
```
The field name is not important here, so just use col0, col1, ...
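For context, a minimal sketch of how those generated names can feed into the `StructObjectInspector` that the non-deprecated `initialize` overload expects. The helper name is hypothetical, not the exact code in `hiveUDFs.scala`:

```scala
import scala.collection.JavaConverters._

import org.apache.hadoop.hive.serde2.objectinspector.{
  ObjectInspector, ObjectInspectorFactory, StructObjectInspector}

// Hypothetical helper: wrap the per-argument inspectors in a struct
// inspector with synthetic field names (col0, col1, ...). The names are
// arbitrary; Hive only needs the struct shape to resolve the input types.
def toStructInspector(inspectors: Seq[ObjectInspector]): StructObjectInspector = {
  val fieldNames = inspectors.indices.map(i => s"col$i").asJava
  ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, inspectors.asJava)
}
```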
Yes, I have the same problem when using a Hive UDTF in spark-sql or spark.sql, because `initialize(ObjectInspector[] args)` is not overridden.
Then what do you think about this? cc @dongjoon-hyun @sunchao
Thank you for pinging me, @ulysses-you.
Oops, sorry @ulysses-you, I just remembered that you pinged me on this PR. This looks mostly good to me except for one question: since the new API was added in 0.13 and Spark still supports Hive 0.12, do we need to take care of backward compatibility here? cc @wangyum
cc @somani
Thanks @sunchao. If I'm not missing something, Hive 1.2 has been removed since SPARK-32981 in branch-3.1, and Hive 0.12 is far too old for the master branch, so we don't need to care about compatibility with it. cc @dongjoon-hyun, is that right?
@sunchao, do you think this will be affected by the Hive metastore client versions?
This is the part I'm not sure about, that is, Spark loading permanent UDF classes from an HMS at version 0.12. But apparently Hive 0.12 doesn't support permanent UDFs, so this seems not to be an issue.
BTW, #30665 is trying to solve the same issue, and we should consolidate on one PR.
I think there could be cases where one registers Hive 0.12 UDFs (although I'm not sure how rare this is) in an HMS that Spark talks to, and things may fail when calling those UDFs since they don't have the new `initialize(StructObjectInspector)` method.
@sunchao, you mean a user creates a permanent UDF from a Hive 0.12 built-in function? If so, I believe that's really rare.
@ulysses-you nvm, please ignore my comment above :) I was thinking of the case where Spark somehow loads the Hive 0.12 classes.
Do we still need this error message? See sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveSessionCatalog.scala, lines 91 to 93 at dfa6fb4.
Merged to master.
Thanks all!
What changes were proposed in this pull request?

Use `initialize(StructObjectInspector argOIs)` instead of `initialize(ObjectInspector[] args)` in `HiveGenericUDTF`.

Why are the changes needed?

In our case, we implement a Hive `GenericUDTF` and override `initialize(StructObjectInspector argOIs)`. It then executes fine with Hive but fails with Spark SQL. Here is the Spark SQL error message:

The reason is that Spark's `HiveGenericUDTF` calls `initialize(ObjectInspector[] argOIs)` to initialize the UDTF, but that is a deprecated method. We should use `initialize(StructObjectInspector argOIs)` instead, so that we are compatible with both methods, the same as Hive.

Does this PR introduce any user-facing change?

Yes, it fixes the UDTF initialize method.

How was this patch tested?

Manual test, and passed `HiveUDFDynamicLoadSuite`.
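For illustration, here is a minimal sketch of the kind of UDTF this PR fixes: one that overrides only the non-deprecated `initialize(StructObjectInspector)` overload added in Hive 0.13. The class name and output schema are hypothetical; it simply forwards each argument as a string row:

```scala
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF
import org.apache.hadoop.hive.serde2.objectinspector.{
  ObjectInspector, ObjectInspectorFactory, StructObjectInspector}
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory

// Hypothetical UDTF: overrides only initialize(StructObjectInspector).
// Before this PR, Spark's HiveGenericUDTF invoked the deprecated
// initialize(ObjectInspector[]) overload, so a UDTF like this worked in
// Hive but failed in Spark SQL.
class ForwardArgsUDTF extends GenericUDTF {

  override def initialize(argOIs: StructObjectInspector): StructObjectInspector = {
    // Declare a single string output column named "value".
    val fieldNames = java.util.Arrays.asList("value")
    val fieldOIs = java.util.Arrays.asList[ObjectInspector](
      PrimitiveObjectInspectorFactory.javaStringObjectInspector)
    ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs)
  }

  override def process(args: Array[AnyRef]): Unit = {
    // Emit one output row per input argument.
    args.foreach(arg => forward(Array[AnyRef](String.valueOf(arg))))
  }

  override def close(): Unit = {}
}
```

Once registered (for example, `CREATE FUNCTION forward_args AS '...'` pointing at this class), a query like `SELECT forward_args('a', 'b')` would fail before this change and succeed after it, assuming a setup along these lines.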