-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20204][SQL] remove SimpleCatalystConf and CatalystConf type alias #17521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @ueshin
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the content in this file is copied from SQLConf, except the lazy val for session local timezone, and the 3 copy methods here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copied from object SQLConf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
moved to CatalystConf.STATIC and SQLConf.STATIC
|
Test build #75485 has finished for PR 17521 at commit
|
|
Test build #75486 has finished for PR 17521 at commit
|
|
Test build #75489 has finished for PR 17521 at commit
|
|
Test build #75491 has finished for PR 17521 at commit
|
|
Test build #75492 has finished for PR 17521 at commit
|
|
To be clear, I don't think we should have two separate places to define config entries. If this is what the pr is doing, I strongly veto. I get that there is a small incremental improvement we can do by properly bucketing things into logical vs physical components, but I don't think it is worth the overhead of us needing to think about where a config is each time we want to look it up ... |
|
LGTM |
|
Test build #75510 has finished for PR 17521 at commit
|
|
Merging in master. |
|
After merging my local branch up to this PR, I ran some of the regression tests from a machine in the Eastern time zone and observed the following failures: One failure is SQLQuerySuite/SPARK-3173 Timestamp support in the parser. Once I change my machine to Pacific time zone, the failure is gone. I am not making a statement that this PR causes the above failures. It could be something before this PR. I just want to share this information. |
|
@nsyca can you look into it? |
|
I will investigate. I am searching from the last good point I merged my private branch with the master trunk and will go from there. |
|
@nsyca can you try marking |
|
@dilipbiswal has narrowed down that this PR is changing the behaviour. He will continue to investigate and will post an update in the next hour or so before he calls it a day. |
|
@cloud-fan @nsyca Thanks !! Changing to make it lazy works for the test cases i have tried. I am running the full tests now. |
|
@cloud-fan @nsyca A quick update.. I ran the problematic tests and they pass with a change to move the time zone setting code to PlanTest.scala just before we create the SQLConf like following - // Timezone is fixed to America/Los_Angeles for those timezone sensitive tests (timestamp_*)
TimeZone.setDefault(TimeZone.getTimeZone("America/Los_Angeles"))
// Add Locale setting
Locale.setDefault(Locale.US)
protected val conf = new SQLConf().copy(SQLConf.CASE_SENSITIVE -> true)Please let me know what you think .. |
|
Update: nvm. The purpose of this configuration is session specific. |
|
also cc @ueshin , I think the default value of |
|
These test cases are timezone sensitive, right? If so, the changes made by @dilipbiswal are reasonable to me. |
|
I agree with @cloud-fan's suggestion, |
|
@dilipbiswal Can you please try it based on what @cloud-fan and @ueshin suggested? Does it resolve the issue you report? |
|
@gatorsmile @cloud-fan @ueshin Sorry .. i was on transit from work. Sure, i will make a try. However , i wanted to understand this a bit more. In my understanding, the current problem we are trying to address is only in our test environment. In our tests as part of the infrastructure we want to keep the timezone constant as we have hardcoded timestamp literals in the test and in order to make row comparisions work we need to match the timezone with the literals we use in the test. Today, we do that in QueryTest.scala here However, due to the change in the PR, this is happening a bit late as we have created a cloned copy of SQLConf in PlanTest.scala (super class of QueryTest.scala) in here. So in my understanding its this copy step where the configurations are used the first time and populates the singleton object SQLConf. Previously Analyzer, optimizer etc used to instantiate a new CatalystConf class and they were seeing the updated default timezone. Please correct me on this.. Thats the reason, i was suggesting to move the setting of timezone up into PlanTest before the place where SQLConf is cloned. I will study the ConfigEntryWithDefault approach in the meantime. |
|
Just add another view point to this incident. If those failed test cases were written with the time zone fixed to a region rather than Pacific time zone, we would fail fast in the first run over the regression machines in the west coast. We should have picked the time zone like GMT+1, one that is different from the GMT used by |
|
The |
|
@viirya Thanks !! Actually i assumed the problem is limited only to tests. I thought changing the timezone on the fly is not a realistic scenario. Looks like it is :-) I will submit a pr with wenchen's suggestion. |
|
The pr is at #17537. |
What changes were proposed in this pull request?
This is a follow-up of #17285 .
How was this patch tested?
existing tests