[SPARK-5498][SQL][FOLLOW] add schema to table partition #20846
Conversation
     parameters: Map[String, String] = Map.empty,
-    stats: Option[CatalogStatistics] = None) {
+    stats: Option[CatalogStatistics] = None,
+    schema: Option[StructType] = None) {
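To make the shape of the change concrete, here is a rough, self-contained sketch. PartitionMeta and effectiveSchema are hypothetical stand-ins, not Spark's actual CatalogTablePartition (which carries more fields, such as the partition spec and storage format); the idea is that a partition may optionally record its own schema, and callers can fall back to the table schema when it is absent.

```scala
import org.apache.spark.sql.types.StructType

// Simplified stand-in for the change above: a partition may record its own schema.
case class PartitionMeta(
    parameters: Map[String, String] = Map.empty,
    schema: Option[StructType] = None)

// One plausible use of the new field: fall back to the table schema
// when the partition did not record a schema of its own.
def effectiveSchema(tableSchema: StructType, partition: PartitionMeta): StructType =
  partition.schema.getOrElse(tableSchema)
```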
The partition schema is stored in CatalogTable. It is not clear to me what exception you got.
@dongjoon-hyun Could you help @liutang123 investigate the issue?
Sure, @gatorsmile. I'll take a look over the weekend.
Sometimes a partition's schema is different from the table's.
@liutang123, Spark should not do this kind of risky thing. Hive 2.3.2 also disallows incompatible schema changes like the following:

hive> CREATE TABLE test_par(a string) PARTITIONED BY (b bigint) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
OK
Time taken: 0.262 seconds
hive> ALTER TABLE test_par CHANGE a a bigint RESTRICT;
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Unable to alter table. The following columns have types incompatible with the existing columns in their respective positions :
a
hive> SELECT VERSION();
OK
2.3.2 r857a9fd8ad725a53bd95c1b2d6612f9b1155f44d
Time taken: 0.711 seconds, Fetched: 1 row(s)

cc @gatorsmile
@dongjoon-hyun, thanks for reviewing.
We do not allow users to change a table column's type. Currently, only column comments may be changed when the command is issued through Spark. However, users can still change the type through Hive. Thus, there is nothing we can do from the Spark side, right?
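As an illustration of that restriction, here is a minimal sketch of a Spark SQL session. It is hypothetical, not code from this PR: the test_par table and column names are taken from the examples elsewhere in this thread, and the comment text is made up.

```scala
// Allowed through Spark: the column keeps its name and type; only the comment changes.
spark.sql("ALTER TABLE test_par CHANGE COLUMN a a string COMMENT 'updated comment'")

// Rejected by Spark's DDL check: changing the column type.
// The same change still succeeds when issued directly in Hive, which is how a
// partition's schema can end up differing from the table's schema.
spark.sql("ALTER TABLE test_par CHANGE COLUMN a a bigint")
```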
Right, @gatorsmile.
Can one of the admins verify this patch?
What JIRA was this really about?
Closes apache#21766
Closes apache#21679
Closes apache#21161
Closes apache#20846
Closes apache#19434
Closes apache#18080
Closes apache#17648
Closes apache#17169

Add:
Closes apache#22813
Closes apache#21994
Closes apache#22005
Closes apache#22463

Add:
Closes apache#15899

Add:
Closes apache#22539
Closes apache#21868
Closes apache#21514
Closes apache#21402
Closes apache#21322
Closes apache#21257
Closes apache#20163
Closes apache#19691
Closes apache#18697
Closes apache#18636
Closes apache#17176

Closes apache#23001 from wangyum/CloseStalePRs.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: hyukjinkwon <[email protected]>
What changes were proposed in this pull request?

When querying an ORC table in which some partition schemas differ from the table schema, a ClassCastException occurs.

Reproduction:

create table test_par(a string) PARTITIONED BY (b bigint) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat';
ALTER TABLE test_par CHANGE a a bigint restrict; -- in hive
select * from test_par;

How was this patch tested?

Manual test.
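For reference, a hedged sketch of the same reproduction driven from a Hive-enabled Spark shell. Assumptions beyond the steps above: a row is inserted so the partition actually contains ORC data written with the original string schema, and the type change is issued from Hive because Spark rejects it, as discussed earlier in the thread.

```scala
// Create the ORC table and write one partition while column `a` is still a string.
spark.sql(
  """CREATE TABLE test_par(a string) PARTITIONED BY (b bigint)
    |ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
    |STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
    |OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'""".stripMargin)
spark.sql("INSERT INTO test_par PARTITION (b = 1) VALUES ('x')")

// Issued from Hive, not Spark:
//   ALTER TABLE test_par CHANGE a a bigint RESTRICT;

// The table schema now says `a bigint`, but the existing partition's ORC files
// were written with `a string`; scanning them is what raises the ClassCastException.
spark.sql("SELECT * FROM test_par").show()
```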