Closed
Labels
bug, rust
Description
What happened:
While playing around with hooking dask-sql into Coiled's benchmarks, I noticed some issues around DPP with test_query_3:
pyo3_runtime.PanicException: called `Result::unwrap()` on an `Err` value: Os { code: 20, kind: NotADirectory, message: "Not a directory" }
I think this is because dynamic_partition_pruning::read_table assumes the table's filepath is a directory of chunked parquet files and has no handling for the case where it points to a single parquet file, so fs::read_dir fails and the unwrap() panics:
let paths = fs::read_dir(tables.get(&table_string).unwrap().filepath.clone()).unwrap();
What you expected to happen:
In general, I would expect this to emit a warning and skip DPP rather than bubble up to an error, though I don't think it should be too difficult to modify the handling of tables for the single file case? cc @sarahyurick
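For illustration, here is a minimal sketch of what the single-file handling could look like. It is not the actual dask-sql code: collect_table_paths and paths_or_skip are hypothetical helper names, and the eprintln! call only stands in for whatever warning mechanism the project would really use. The idea is to check whether the path is a file before calling fs::read_dir, and to return None (skip DPP) instead of unwrapping an error.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of dask-sql): list the files that make up a
/// table whose location is either a single file or a directory of chunks.
fn collect_table_paths(location: &Path) -> io::Result<Vec<PathBuf>> {
    let metadata = fs::metadata(location)?;

    // Single-file case: the location itself is the only file to read.
    if metadata.is_file() {
        return Ok(vec![location.to_path_buf()]);
    }

    // Directory case: gather every regular file directly inside it.
    let mut paths = Vec::new();
    for entry in fs::read_dir(location)? {
        let path = entry?.path();
        if path.is_file() {
            paths.push(path);
        }
    }
    paths.sort(); // deterministic order, independent of the filesystem
    Ok(paths)
}

/// Hypothetical caller: warn and return None so the optimizer can skip DPP
/// instead of panicking when the table files cannot be listed.
fn paths_or_skip(location: &Path) -> Option<Vec<PathBuf>> {
    match collect_table_paths(location) {
        Ok(paths) if !paths.is_empty() => Some(paths),
        Ok(_) => {
            eprintln!("warning: no files found at {:?}; skipping DPP", location);
            None
        }
        Err(err) => {
            eprintln!("warning: cannot read {:?} ({}); skipping DPP", location, err);
            None
        }
    }
}
```

With something along these lines, a table registered from a single parquet file would yield a one-element path list, and any I/O failure would downgrade to a warning rather than a PanicException on the Python side.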
Environment:
- dask-sql version: 2023.10.1
- Python version: 3.9
- Operating System: ubuntu20.04
- Install method (conda, pip, source): conda