Obtain filepath from Dask DataFrame #1145
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main    #1145      +/-   ##
==========================================
+ Coverage   81.43%   81.58%   +0.14%
==========================================
  Files          78       78
  Lines        4395     4403       +8
  Branches      797      798       +1
==========================================
+ Hits         3579     3592      +13
+ Misses        640      631       -9
- Partials      176      180       +4
... and 1 file with indirect coverage changes
jdye64 left a comment
Logic looks good. Had one recommendation around a utility function to use if it fits your needs. If it doesn't, don't worry about it.
tests/unit/test_context.py (Outdated)

    c.schema["root"].filepaths["df"]
    ...
    def test_ddf_filepath(tmpdir):
Does the parquet_ddf fixture work for the test here, or does it explicitly need to be redefined here?
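For illustration, reusing the fixture might look roughly like this. This is only a sketch: it assumes parquet_ddf is a shared conftest fixture that yields a Dask DataFrame read from Parquet, and the assertion is a placeholder rather than the PR's actual test body.

```python
# Sketch of the reviewer's suggestion: lean on the shared `parquet_ddf`
# fixture instead of rebuilding the Parquet data with tmpdir inside the test.
# The fixture's exact contents and the expected filepath value are assumptions.
from dask_sql import Context


def test_ddf_filepath(parquet_ddf):
    c = Context()
    c.create_table("df", parquet_ddf)
    # The PR records the originating Parquet path on the schema; assert that
    # something was stored for the registered table.
    assert c.schema["root"].filepaths["df"] is not None
```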
Builds upon #1074
Uses part of https://github.com/jdye64/dask-sql/blob/expand_predicate_pushdown/dask_sql/physical/rel/logical/table_scan.py#L112-134
For some reason, this doesn't work with #1102 yet, which suggests some error on the DPP side, even though DPP is able to obtain the correct filepaths using this PR. As a result, DPP still just returns the original plan when create_table uses a Dask DataFrame instead of a Parquet filepath.
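The idea in the linked table_scan.py snippet amounts to inspecting the DataFrame's task-graph layers to find the Parquet read it was built from. A rough sketch of that kind of graph inspection is below; it is not the PR's exact implementation, and the creation_info attribute with its "args"/"kwargs" keys is an assumption that depends on the dask version in use.

```python
# Sketch only: recover the Parquet path a Dask DataFrame was created from by
# walking its HighLevelGraph layers. Assumes a dask version where read_parquet
# produces a DataFrameIOLayer exposing a `creation_info` dict.
from dask.layers import DataFrameIOLayer


def get_parquet_filepath(ddf):
    """Return the path passed to read_parquet when building `ddf`, else None."""
    for layer in ddf.dask.layers.values():
        if isinstance(layer, DataFrameIOLayer):
            info = getattr(layer, "creation_info", None) or {}
            args = info.get("args", ())
            kwargs = info.get("kwargs", {})
            # read_parquet(path, ...) usually takes the path as the first
            # positional argument; fall back to the keyword form.
            if args:
                return args[0]
            if "path" in kwargs:
                return kwargs["path"]
    return None
```

With something along these lines, create_table can still populate c.schema["root"].filepaths even when it is handed a Dask DataFrame rather than a path string, which is what the test above exercises.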