-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-49152][SQL][FOLLOWUP] table location string should be Hadoop Path string #47759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| case SetTableLocation(ResolvedV1TableIdentifier(ident), None, location) => | ||
| AlterTableSetLocationCommand(ident, None, location) | ||
|
|
||
| // V2 catalog doesn't support setting partition location yet, we must use v1 command here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not related to table location but is also a followup of #47660 to fix a missing case.
| private def qualifyLocInTableSpec(tableSpec: TableSpec): TableSpec = { | ||
| tableSpec.withNewLocation(tableSpec.location.map(loc => CatalogUtils.makeQualifiedPath( | ||
| CatalogUtils.stringToURI(loc), hadoopConf).toString)) | ||
| val newLoc = tableSpec.location.map { loc => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code here follows SessionCatalog#makeQualifiedTablePath
| /** | ||
| * A reserved property to specify the location of the table. The files of the table | ||
| * should be under this location. | ||
| * should be under this location. The location is a Hadoop Path string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for clarification.
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, this looks correct to me.
| c INT) | ||
| USING parquet | ||
| LOCATION 'file:///path/to/table' | ||
| LOCATION 'file:/path/to/table' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've investigated this locally. The new result is actually consistent with the production behavior (with Hive Metastore). What happens is:
- CREATE TABLE command qualifies the table location, and keeps it as
URIinCatalogTable#storage#locationUri - When saving the table into HMS, we have to turn
URIinto string, by usingnew Path(uri).toString. The Hadoop Path string will omit theauthoritycomponent if it's not present. Sofile:///pathbecomesfile:/path. - When reading the table back from HMS, we turn string back to
URI, but theauthoritycomponent won't be filled.
However, with InMemoryCatalog, we keep CatalogTable directly and there is no URI <-> string round trip. So the authority component is still there.
This actually doesn't matter, as empty authority is the same as no authority, in URI string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the details.
|
Merged to master. |
|
Could you make a backporting PR to branch-3.5 seperately in order to make it sure all CIs pass, @cloud-fan ? |
…ath string This is a followup of apache#47660 to restore the behavior change. The table location string should be Hadoop Path string instead of URL string which escapes all special chars. restore the unintentional behavior change. No, it's not released yet new test no Closes apache#47759 from cloud-fan/fix. Authored-by: Wenchen Fan <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
This is a followup of #47660 to restore the behavior change. The table location string should be Hadoop Path string instead of URL string which escapes all special chars.
Why are the changes needed?
restore the unintentional behavior change.
Does this PR introduce any user-facing change?
No, it's not released yet
How was this patch tested?
new test
Was this patch authored or co-authored using generative AI tooling?
no