-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-20196][PYTHON][SQL] update doc for catalog functions for all languages, add pyspark refreshByPath API #17512
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #75466 has finished for PR 17512 at commit
|
|
will update after #17518 + changes to R doc too |
|
updated @gatorsmile |
|
Test build #75515 has finished for PR 17512 at commit
|
R/pkg/R/SQLContext.R
Outdated
| } | ||
|
|
||
| #' Create a SparkDataFrame from a SparkSQL Table | ||
| #' Create a SparkDataFrame from a SparkSQL table or temporary view |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
table or view
Here, actually, it includes both temporary views or persistent views.
R/pkg/R/SQLContext.R
Outdated
| #' | ||
| #' Returns the specified Table as a SparkDataFrame. The Table must have already been registered | ||
| #' in the SparkSession. | ||
| #' Returns the specified table or temporary view as a SparkDataFrame. The temporary view must have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here.
python/pyspark/sql/catalog.py
Outdated
| """Recover all the partitions of the given table and update the catalog.""" | ||
| """Recovers all the partitions of the given table and update the catalog. | ||
| Only works with a partitioned table, and not a temporary view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a temporary view -> a view
Both temporary and persistent views are not supported. We will detect and issue exceptions.
|
|
||
| /** | ||
| * Recovers all the partitions in the directory of a table and update the catalog. | ||
| * Only works with a partitioned table, and not a temporary view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here.
| /** | ||
| * Refreshes the cache entry and the associated metadata for all Dataset (if any), that contain | ||
| * the given data source path. | ||
| * the given data source path. Path matching is by prefix, i.e. "/" would invalidate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invalidate -> invalidate and refresh
We also do the re-cache, but the new version cached lazily.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason in here, CatalogImpl.scala is very different from Catalog.scala - let me know if you want me to change them - for now I've updated the first sentence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. I found this sentence is copied from Catalog.scala. Maybe, we can update them to
Path matching is by prefix, i.e. "/" would invalidate all the cached entries and make the new versions cached lazily.
|
|
||
| /** | ||
| * Recovers all the partitions in the directory of a table and update the catalog. | ||
| * Only works with a partitioned table, and not a temporary view. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a temporary view.
->
not a view.
|
Test build #75534 has started for PR 17512 at commit |
|
Jenkins, retest this please |
|
Test build #75536 has finished for PR 17512 at commit
|
R/pkg/R/SQLContext.R
Outdated
| #' | ||
| #' Returns the specified Table as a SparkDataFrame. The Table must have already been registered | ||
| #' in the SparkSession. | ||
| #' Returns the specified table or view as a SparkDataFrame. The table or view must already exists or |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: exists -> exist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks for catching this!
|
LGTM except minor comments. |
|
Test build #75555 has finished for PR 17512 at commit
|
|
merged to master, thanks! |
What changes were proposed in this pull request?
Update doc to remove external for createTable, add refreshByPath in python
How was this patch tested?
manual