[SPARK-21845] [SQL] Make codegen fallback of expressions configurable #19062
GeneratePredicate.generate(expression, inputSchema)
} catch {
  case e @ (_: JaninoRuntimeException | _: CompileException)
      if sqlContext == null || sqlContext.conf.wholeStageFallback =>
Because sqlContext is always null when this runs on executors, this condition always evaluates to true there.
Better to put this comment in https://github.com/apache/spark/pull/19062/files#diff-b9f96d092fb3fea76bcf75e016799678R57?
Removing the null check here makes sense, although it means existing Spark jobs that previously switched to the non-codegen version even with sqlContext.conf.wholeStageFallback = false will now start failing at runtime (perhaps rightly so). It might be worth calling this out in the 2.3 release notes and/or the migration guide.
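To make the effect of the null check concrete, here is a minimal sketch of the two guard conditions. `Conf`, `Ctx`, `withNullCheck`, and `withoutNullCheck` are hypothetical stand-ins introduced for illustration, not Spark classes; only the shape of the boolean expression is taken from the diff above.

```scala
// Hypothetical stand-ins for SQLConf and SQLContext; these are not Spark's classes.
final case class Conf(wholeStageFallback: Boolean)
final case class Ctx(conf: Conf)

object FallbackCondition {
  // Shape of the guard in the diff: a null context forces fallback on,
  // regardless of what the flag says.
  def withNullCheck(ctx: Ctx): Boolean =
    ctx == null || ctx.conf.wholeStageFallback

  // Guard without the null check: only the flag decides.
  // Assumes a non-null context is always available at this point.
  def withoutNullCheck(ctx: Ctx): Boolean =
    ctx.conf.wholeStageFallback
}
```

With a null context, `withNullCheck` returns true even when the user intended the flag to be false, which is the behavior being discussed in this thread.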
"TestSQLContext",
new SparkConf()
  .set("spark.sql.test", "")
  .set(SQLConf.CODEGEN_FALLBACK.key, "false")
Set it to false so that the fallback does not hide actual bugs in our expression codegen that cause compilation failures.
final val sqlContext = SparkSession.getActiveSession.map(_.sqlContext).orNull

// whether we should fallback when hitting compilation errors caused by codegen
private val codeGenFallBack = sqlContext == null || sqlContext.conf.codegenFallback
Is it better to add !Utils.isTesting && or to drop !Utils.isTesting && from WholeStageCodegenExec to make these conditions consistent?
Originally, I did it the way you suggested. However, with that approach I would need to remove the test case. So I think we can just keep using codegenFallback to control it.
I see
Test build #81158 has finished for PR 19062 at commit
dongjoon-hyun
left a comment
+1, LGTM.
LGTM
LGTM
// sqlContext will be null when we are being deserialized on the slaves. In this instance
// the value of subexpressionEliminationEnabled will be set by the deserializer after the
// constructor has run.
val subexpressionEliminationEnabled: Boolean = if (sqlContext != null) {
cc @marmbrus @cloud-fan Does this sound OK? Currently, our codebase does not check for null in most places. This value will always be non-null when we initialize it.
Test build #81229 has finished for PR 19062 at commit
Test build #81230 has finished for PR 19062 at commit
Thanks! Merging to master.
I think this makes the master build fail @gatorsmile :
Thanks! Let me first revert this PR.
What changes were proposed in this pull request?
We should make the codegen fallback of expressions configurable. So far, it has always been on, which can hide compilation bugs in our codegen. Thus, we should also disable the codegen fallback when running test cases.
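The idea can be sketched in isolation roughly as follows. This is a simplified illustration, not the code in this patch: `CodegenFallbackSketch`, `CompileError`, `Predicate`, and `newPredicate` are hypothetical names, and `CompileError` merely stands in for the compiler exceptions caught in the real code.

```scala
object CodegenFallbackSketch {
  // Hypothetical stand-in for a compilation failure raised by generated code.
  final class CompileError(msg: String) extends RuntimeException(msg)

  type Predicate = Int => Boolean

  // Tries the "compiled" path first. On a compile error it returns the
  // interpreted predicate only when fallbackEnabled is true; otherwise the
  // guard does not match and the error propagates to the caller.
  def newPredicate(
      compile: () => Predicate,
      interpreted: Predicate,
      fallbackEnabled: Boolean): Predicate =
    try compile()
    catch {
      case _: CompileError if fallbackEnabled => interpreted
    }
}
```

With `fallbackEnabled = false`, as proposed for test runs, a broken `compile` surfaces as an exception instead of being silently replaced by the interpreted path.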
How was this patch tested?
Added test cases