Skip to content

Conversation

@sarahyurick
Copy link
Collaborator

@sarahyurick sarahyurick commented May 12, 2023

Builds upon #1074

Uses part of https://github.com/jdye64/dask-sql/blob/expand_predicate_pushdown/dask_sql/physical/rel/logical/table_scan.py#L112-134

For some reason, this doesn't work with #1102 yet, indicating some error on the DPP side(?), even though DPP is able to obtain the correct filepaths by using this PR. So DPP still just returns the original plan when create_table uses a Dask DataFrame instead of a Parquet filepath.

@codecov-commenter
Copy link

codecov-commenter commented May 12, 2023

Codecov Report

Merging #1145 (b35ea00) into main (41cb27a) will increase coverage by 0.14%.
The diff coverage is 100.00%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main    #1145      +/-   ##
==========================================
+ Coverage   81.43%   81.58%   +0.14%     
==========================================
  Files          78       78              
  Lines        4395     4403       +8     
  Branches      797      798       +1     
==========================================
+ Hits         3579     3592      +13     
+ Misses        640      631       -9     
- Partials      176      180       +4     
Impacted Files Coverage Δ
dask_sql/context.py 93.96% <100.00%> (+0.18%) ⬆️

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Collaborator

@jdye64 jdye64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic looks good. Had one recommendation around a utility function to use if it fits your needs. If it doesn't don't worry about it though.

@sarahyurick sarahyurick marked this pull request as ready for review May 15, 2023 05:45
c.schema["root"].filepaths["df"]


def test_ddf_filepath(tmpdir):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the parquet_ddf fixture work for the test here, or does it explicitly need to be redefined here?

@ayushdg ayushdg merged commit 2df7476 into dask-contrib:main May 26, 2023
@sarahyurick sarahyurick deleted the read_parquet_filepath branch May 26, 2023 22:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants