-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-21533][SQL] Print warning messages when override function configure found in Hive UDFs
#18768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@gatorsmile @hvanhovell What do you think this? Thanks! |
|
Test build #80035 has finished for PR 18768 at commit
|
|
Test build #80037 has finished for PR 18768 at commit
|
|
Test build #80038 has finished for PR 18768 at commit
|
|
Jenkins, retest this please. |
| } | ||
|
|
||
| def validateHiveUserDefinedFunction(udfClass: Class[_]): Unit = { | ||
| if (hasInheritanceOf[GenericUDF]("configure", udfClass) || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When GenericUDF API has configure method? Seems GenericUDF at 0.10.0 has no such method?
https://hive.apache.org/javadocs/r0.10.0/api/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked it seemed this API had been implemented in v0.11.0 (https://github.com/apache/hive/blob/release-0.11.0/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDF.java). This is the commit for this API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If ran with a Hive version that configure is not implemented yet, is hasInheritanceOf safe from NoSuchMethodException ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC spark always refers to hive-exec-1.2.1.spark2.jar , so it seems we have no chance to get the exception. But, I think it is not a bad idea to catch NoSuchMethodException there for understandability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that's right. As you said, it is still good to catch the exception.
|
Test build #80040 has finished for PR 18768 at commit
|
| } | ||
|
|
||
| private def isSubClassOf(t: Type, parent: Class[_]): Boolean = t match { | ||
| case cls: Class[_] => parent.isAssignableFrom(cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will you pass in something not a Class[_]? If not, we can simply inline isAssignableFrom check into hasInheritanceOf. Then we don't need to have Type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
Btw, is it possibly to add an unit test for it? |
|
Test build #80052 has finished for PR 18768 at commit
|
|
Test build #80053 has finished for PR 18768 at commit
|
|
Is this the only function we miss for different types of Hive UD(A/T)Fs? |
|
Could we document this limit in the https://spark.apache.org/docs/latest/sql-programming-guide.html#unsupported-hive-functionality? |
| * Construct a [[FunctionBuilder]] based on the provided class that represents a function. | ||
| */ | ||
| private def makeFunctionBuilder(name: String, clazz: Class[_]): FunctionBuilder = { | ||
| validateHiveUserDefinedFunction(clazz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am neural to introduce a warning message for this case. Not sure how helpful the warning message will be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. So, I think it's ok to revisit this again if we have more reports from users.
|
@viirya I've no idea about a way to add tests for this kind of cases and I think the previous prs similar to this case (just add code for warning) also had no test. But, if we could, we'd better to do... @gatorsmile I checked these Hive classes and I found some other functions that Spark currently ignores... (e.g., |
|
Yes! Thanks for your efforts! Please update the documentation at first. |
|
ok, I'll make a pr later. Thanks! |
|
I think the document about these unsupported functions is enough for users, so I'll close this for now. If we have more response from users, we could revisit this again. Thanks. |
## What changes were proposed in this pull request? This pr added documents about unsupported functions in Hive UDF/UDTF/UDAF. This pr relates to apache#18768 and apache#18527. ## How was this patch tested? N/A Author: Takeshi Yamamuro <[email protected]> Closes apache#18792 from maropu/HOTFIX-20170731.
What changes were proposed in this pull request?
This pr added code to print warning messages when an unsupported function
public void configure(MapredContext mapredContext)used inGenericUDFandGenericUDF. This is because Spark does not callconfigureinternally and this is error-prone.How was this patch tested?
Manually checked. If this pr applied, you hit a warning message below;