Merged
27 commits
7b17557
basic predicate-pushdown support
rjzamora Mar 15, 2022
b5cb2cb
remove explicit Dispatch class
rjzamora Mar 15, 2022
017f65e
use _Frame.fillna
rjzamora Mar 15, 2022
e08f6cf
cleanup comments
rjzamora Mar 16, 2022
f63b814
test coverage
rjzamora Mar 16, 2022
4b1bc97
improve test coverage
rjzamora Mar 16, 2022
7f78c58
add xfail test for dt accessor in predicate and fix test_show.py
rjzamora Mar 16, 2022
60f9149
fix some naming issues
rjzamora Mar 16, 2022
5d9b369
add config and use assert_eq
rjzamora Mar 16, 2022
6951a1d
add logging events when predicate-pushdown bails
rjzamora Mar 16, 2022
116d668
move bail logic earlier in function
rjzamora Mar 17, 2022
600a020
address easier code review comments
rjzamora Mar 17, 2022
359cab0
typo fix
rjzamora Mar 17, 2022
6abf658
fix creation_info access bug
rjzamora Mar 18, 2022
94294f5
convert any expression to DNF
rjzamora Mar 18, 2022
f663e0b
csv test coverage
rjzamora Mar 18, 2022
a18a149
include IN coverage
rjzamora Mar 18, 2022
a3725fb
improve test rigor
rjzamora Mar 18, 2022
38ca9fb
address code review
rjzamora Mar 22, 2022
fe32ec9
Merge remote-tracking branch 'upstream/main' into predicate-pushdown
rjzamora Mar 24, 2022
21722d1
Merge remote-tracking branch 'upstream/main' into predicate-pushdown
charlesbluca Mar 24, 2022
88051c9
Merge remote-tracking branch 'upstream/main' into predicate-pushdown
charlesbluca Mar 24, 2022
275609c
skip parquet tests when deps are not installed
rjzamora Mar 25, 2022
01f762f
Merge branch 'predicate-pushdown' of https://github.com/rjzamora/dask…
rjzamora Mar 25, 2022
3d2f6d3
fix bug
rjzamora Mar 25, 2022
f718791
add pyarrow dep to cluster workers
rjzamora Mar 25, 2022
0c69a40
roll back test skipping changes
rjzamora Mar 25, 2022
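
The "convert any expression to DNF" commit refers to disjunctive normal form: an OR of AND-groups. This is the shape that the `filters` argument of `dask.dataframe.read_parquet` accepts, as a list of lists of `(column, op, value)` tuples. The following is a minimal sketch of such a conversion, not the code from this PR. The tuple-based expression encoding and the `to_dnf` name are assumptions made for illustration, and negation is not handled.

```python
from itertools import product

# Hypothetical expression encoding (not dask-sql's internal representation):
#   (column, op, value)        -> a single comparison
#   ("and", expr, expr, ...)   -> conjunction
#   ("or", expr, expr, ...)    -> disjunction
# Columns literally named "and" or "or" would be ambiguous in this toy encoding.


def to_dnf(expr):
    """Return a list of AND-groups; the outer list is an OR of those groups."""
    head = expr[0]
    if head == "or":
        # OR: concatenate the AND-groups of every operand.
        return [group for sub in expr[1:] for group in to_dnf(sub)]
    if head == "and":
        # AND: distribute over OR by picking one AND-group from each operand.
        operand_groups = [to_dnf(sub) for sub in expr[1:]]
        return [
            [pred for group in combo for pred in group]
            for combo in product(*operand_groups)
        ]
    # Leaf comparison: one AND-group holding a single predicate.
    return [[expr]]


# a > 1 AND (b == 2 OR c IN (3, 4))
expr = ("and", ("a", ">", 1), ("or", ("b", "==", 2), ("c", "in", (3, 4))))
dnf_filters = to_dnf(expr)
```

Here `dnf_filters` holds two AND-groups, one per branch of the OR, each repeating the `a > 1` predicate.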
2 changes: 2 additions & 0 deletions .github/docker-compose.yaml
@@ -11,5 +11,7 @@ services:
     container_name: dask-worker
     image: daskdev/dask:latest
     command: dask-worker dask-scheduler:8786
+    environment:
+      EXTRA_CONDA_PACKAGES: "pyarrow>1.0.0"  # required for parquet IO
     volumes:
       - /tmp:/tmp
8 changes: 7 additions & 1 deletion dask_sql/physical/rel/logical/filter.py
@@ -1,12 +1,14 @@
 import logging
 from typing import TYPE_CHECKING, Union

+import dask.config as dask_config
 import dask.dataframe as dd
 import numpy as np

 from dask_sql.datacontainer import DataContainer
 from dask_sql.physical.rel.base import BaseRelPlugin
 from dask_sql.physical.rex import RexConverter
+from dask_sql.physical.utils.filter import attempt_predicate_pushdown

 if TYPE_CHECKING:
     import dask_sql
@@ -31,7 +33,11 @@ def filter_or_scalar(df: dd.DataFrame, filter_condition: Union[np.bool_, dd.Seri

     # In SQL, a NULL in a boolean is False on filtering
     filter_condition = filter_condition.fillna(False)
-    return df[filter_condition]
+    out = df[filter_condition]
+    if dask_config.get("sql.predicate_pushdown"):
+        return attempt_predicate_pushdown(out)
+    else:
+        return out


 class DaskFilterPlugin(BaseRelPlugin):
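
The control flow added to `filter_or_scalar` can be sketched without Dask. Apply the filter with NULLs treated as False, then hand the result to an optimizer only when the config flag is set, and fall back to the unoptimized result if the optimizer bails. This is a sketch under stated assumptions. `CONFIG`, `filter_rows`, and the `optimize` callable are hypothetical stand-ins for `dask.config`, `filter_or_scalar`, and `attempt_predicate_pushdown`. Signalling a bail with `NotImplementedError` is also an assumption, not how the PR does it.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-in for dask's config store.
CONFIG = {"sql.predicate_pushdown": True}


def filter_rows(rows, mask, optimize):
    """Filter rows by a nullable boolean mask, then optionally optimize."""
    # SQL semantics: a NULL (None) condition filters the row out.
    keep = [bool(m) if m is not None else False for m in mask]
    out = [row for row, k in zip(rows, keep) if k]
    if not CONFIG.get("sql.predicate_pushdown"):
        return out
    try:
        return optimize(out)
    except NotImplementedError as err:
        # Bail: the unoptimized result is still correct, just not pushed down.
        logger.warning("Predicate pushdown failed: %s", err)
        return out
```

The point of the design is that the optimization is strictly optional: every path returns the same filtered rows, so disabling the flag or hitting an unsupported predicate affects performance only.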