Merged
104 commits
fa18499
Merge remote-tracking branch 'upstream/main'
jdye64 Apr 13, 2023
d6b470c
Merge remote-tracking branch 'upstream/main'
jdye64 May 2, 2023
856f7c0
Merge remote-tracking branch 'upstream/main'
jdye64 May 8, 2023
4b86547
Merge remote-tracking branch 'upstream/main'
jdye64 May 30, 2023
28fce59
Merge remote-tracking branch 'upstream/main'
jdye64 Jun 14, 2023
6637b84
Bump ADP -> 26.0.0
jdye64 Jun 14, 2023
c59cdbd
warn on optimization failure instead of erroring and exiting
jdye64 Jun 14, 2023
79a1f7c
Merge branch 'main' into adp_26
ayushdg Jun 20, 2023
ba585a5
Merge branch 'main' into adp_26
ayushdg Jun 26, 2023
23af11d
Merge remote-tracking branch 'origin/main' into adp_26
charlesbluca Jun 30, 2023
5c02c5a
Resolve initial build errors
charlesbluca Jun 30, 2023
7c36bf5
Merge remote-tracking branch 'origin/main' into adp_26
charlesbluca Jul 7, 2023
515dae6
Switch to crates release, add zlib to host/build deps
charlesbluca Jul 7, 2023
ef399e8
Add zlib to aarch build deps
charlesbluca Jul 7, 2023
8fbd1ff
Merge remote-tracking branch 'origin/main' into adp_26
charlesbluca Jul 7, 2023
bca9911
Merge branch 'main' into adp_26
jdye64 Jul 10, 2023
6858578
Bump to ADP 27 and introduce support for wildcard expressions, a wild…
jdye64 Jul 12, 2023
24e0f90
remove bit of logic that is no longer needed to manually check the wi…
jdye64 Jul 12, 2023
d776229
experiment with removing zlib, hoping that fixes os x build
jdye64 Jul 12, 2023
99ec801
Change expected_df result to 1.5 from 1. 3/2 is in fact 1.5 and not 1
jdye64 Jul 12, 2023
8997f7f
Fix cargo test
jdye64 Jul 12, 2023
379a978
add .cargo/config.toml in hopes of fixing linker build issues on osx
jdye64 Jul 13, 2023
e030bef
Remove extra config.toml
charlesbluca Jul 13, 2023
b2e85df
Try overriding runner-installed toolchain
charlesbluca Jul 14, 2023
d01088d
Revert "Try overriding runner-installed toolchain"
charlesbluca Jul 14, 2023
ca70f0f
Initial migration to maturin build system
charlesbluca Jul 17, 2023
d900f0e
Make some modifications to Rust package name
charlesbluca Jul 17, 2023
7d1be92
Adjust native library name from _.internal to dask_planner
jdye64 Jul 17, 2023
83fb5c3
Resolve initial conda build issues
charlesbluca Jul 17, 2023
c7bbbd7
Replace setuptools-rust with maturin in CI
charlesbluca Jul 17, 2023
6dc6347
Constrain maturin, remove setuptools-rust from CI envs
charlesbluca Jul 17, 2023
6dcf5e0
Update docs and Rust CI
charlesbluca Jul 17, 2023
b7c02c9
Remove more dask_planner appearances
charlesbluca Jul 17, 2023
a3e1a68
Bump pyarrow min version to resolve 3.8 conflicts
charlesbluca Jul 17, 2023
3ff8240
test commit seeing how CI will respond without cmd_loop import
jdye64 Jul 17, 2023
ce56a08
Merge branch 'adp_26' of github.com:jdye64/dask-sql into adp_26
charlesbluca Jul 17, 2023
ae7a3d6
Rename module to _datafusion_lib
charlesbluca Jul 17, 2023
0c2908c
Switch to maturin develop for CI installs
charlesbluca Jul 17, 2023
849dc42
Fix failing cargo tests, changed output, from datafusion version bump
jdye64 Jul 18, 2023
1f73b56
Fix cargo test syntax issue
jdye64 Jul 18, 2023
79b6eac
Fix failing Rust tests
Jul 18, 2023
405470f
Remove linux config.toml options
jdye64 Jul 18, 2023
32e4613
Merge remote-tracking branch 'origin/main' into adp_26
charlesbluca Jul 19, 2023
7870c96
Fix Rust object import
charlesbluca Jul 20, 2023
2961cfa
Apply code suggestions
charlesbluca Jul 20, 2023
9983700
Bump to recent ADP commit
charlesbluca Jul 25, 2023
9fd4770
Initial unblocker for pyarrow string handling
charlesbluca Jul 25, 2023
36c58ab
Compatibility code for old or no pyarrow installation
charlesbluca Jul 25, 2023
b4b2cdb
Added RexCall Operation to handle InSubquery Expr and also adjusted c…
jdye64 Jul 25, 2023
336b8ea
Add Sarah's fix for datetime.time error
jdye64 Jul 26, 2023
465e9df
Add condition to guard against complex function names that contain a …
jdye64 Jul 26, 2023
483bab5
unmarked xfail for queries 6, 9, & 54
jdye64 Jul 26, 2023
955bf4d
Quick fix for pydantic upstream breakage
charlesbluca Jul 26, 2023
8757515
Update dask_sql/physical/utils/filter.py
charlesbluca Jul 26, 2023
5271eea
Apply Sarah's suggestions
charlesbluca Jul 27, 2023
69441fc
Attempt to unblock failures at parse_datetime
charlesbluca Jul 27, 2023
7688f8b
Disable pyarrow strings for now
charlesbluca Jul 28, 2023
19aed3f
Remove breakpoint
charlesbluca Jul 28, 2023
fe185ae
Merge remote-tracking branch 'origin/main' into adp_26
charlesbluca Aug 8, 2023
62ba03b
Remove pydantic constraint now that fastapi is bumped
charlesbluca Aug 8, 2023
bbb0dc5
Apply pyproject suggestions
charlesbluca Aug 8, 2023
3706115
Bump build system to maturin 1.1
charlesbluca Aug 8, 2023
1fc8849
Move filter datetime handling, remove string datetime handling for now
charlesbluca Aug 9, 2023
510b063
Actually check containment in InSubquery call
charlesbluca Aug 9, 2023
3d4d948
bring back decorrelated_where_exists and decorrelate_when_in
jdye64 Aug 9, 2023
4fa155c
Checkstyle fixes
jdye64 Aug 9, 2023
3f23aed
Remove xfail for queries 58 and 61 which pass now
jdye64 Aug 9, 2023
bcd1f29
Fix pytest syntax issue
jdye64 Aug 9, 2023
d183976
whatever, have it your way black
jdye64 Aug 9, 2023
32f5adf
Remove debugging println
charlesbluca Aug 10, 2023
e2d4399
re-add support for ilike using the case_insensitive member of like
ayushdg Aug 10, 2023
378d48a
Merge branch 'adp_26' of github.com:jdye64/dask-sql into adp_26
charlesbluca Aug 11, 2023
67a5d86
Handle non-decimal scalar args for cuDF in RexCall
charlesbluca Aug 11, 2023
d767726
Try using maturin with zig for wheel builds
charlesbluca Aug 11, 2023
294cd28
Install protoc for all wheel builds and zlib1g-dev in linux builds
charlesbluca Aug 11, 2023
1bcb71e
Remove Cargo tests because that code is already being tested in DataF…
jdye64 Aug 14, 2023
caf6761
Adjust optimizer/utils test includes
jdye64 Aug 14, 2023
edbc669
Adjust import path for doctest
jdye64 Aug 14, 2023
f6411a4
Adjust import path for doctest (more)
jdye64 Aug 14, 2023
a94864a
Check if zlib is installed on ubuntu runners
charlesbluca Aug 14, 2023
24f465f
Try invoking maturin directly for conda builds
charlesbluca Aug 14, 2023
a0acebc
Revert "Try invoking maturin directly for conda builds"
charlesbluca Aug 14, 2023
488cbaf
Install protoc via apt
charlesbluca Aug 14, 2023
6ca444b
Add zlib to conda environment so that conda install c-compiler can lo…
jdye64 Aug 14, 2023
4109b38
Remove pytest coalesce option for Sum(b) with a string conditional re…
jdye64 Aug 14, 2023
3cadd47
Revert "Install protoc via apt"
charlesbluca Aug 15, 2023
276f2fa
Try not using zig for x86_64 builds
charlesbluca Aug 15, 2023
66ebed4
Try installing protoc from apt again
charlesbluca Aug 15, 2023
789dd35
Revert "Try installing protoc from apt again"
charlesbluca Aug 15, 2023
d154e66
Try explicitly setting PROTOC location for x86_64 builds
charlesbluca Aug 15, 2023
3ac265c
Where is protoc?
charlesbluca Aug 15, 2023
64411d4
Fix protoc binary location
charlesbluca Aug 15, 2023
9ed0550
Disable docker container for linux x86_64 build
charlesbluca Aug 15, 2023
1688ce0
Properly upload artifacts for ARM/intel
charlesbluca Aug 15, 2023
e3210f7
Disable aarch64 builds for now
charlesbluca Aug 15, 2023
4efeb01
Constrain mlflow to avoid import error
charlesbluca Aug 15, 2023
6020cdf
Set wheel tags to manylinux_2_17
charlesbluca Aug 15, 2023
d7d8413
Use manylinux docker container for x86_64 builds
charlesbluca Aug 15, 2023
319e2ef
No sudo for protoc installation
charlesbluca Aug 15, 2023
70dca3f
Install protoc directly from github
charlesbluca Aug 15, 2023
7b7d7a9
Specify PROTOC environment variable for x86_64 runs
charlesbluca Aug 15, 2023
60ef2e7
More doc updates to reflect new installation style
charlesbluca Aug 17, 2023
8a49552
Fix docker builds
charlesbluca Aug 17, 2023
6e7c451
Bump ADP to stable 28.0.0
charlesbluca Aug 17, 2023
File renamed without changes.
5 changes: 4 additions & 1 deletion .github/CODEOWNERS
@@ -2,4 +2,7 @@
* @ayushdg @charlesbluca @galipremsagar

# rust codeowners
dask_planner/ @ayushdg @charlesbluca @galipremsagar @jdye64
.cargo/ @ayushdg @charlesbluca @galipremsagar @jdye64
src/ @ayushdg @charlesbluca @galipremsagar @jdye64
Cargo.toml @ayushdg @charlesbluca @galipremsagar @jdye64
Cargo.lock @ayushdg @charlesbluca @galipremsagar @jdye64
Comment on lines 4 to +8
Contributor

I wonder if it makes sense to start adding teams to own portions of the code, and then add people to those teams.

Collaborator

That makes sense to me. It might be worth scoping out separately from this issue since, IIUC, those teams would need to be created at the org level, which I don't have permissions to do.

11 changes: 6 additions & 5 deletions .github/workflows/conda.yml
@@ -6,10 +6,9 @@ on:
pull_request:
paths:
- setup.py
- dask_planner/Cargo.toml
- dask_planner/Cargo.lock
- dask_planner/pyproject.toml
- dask_planner/rust-toolchain.toml
- Cargo.toml
- Cargo.lock
- pyproject.toml
- continuous_integration/recipe/**
- .github/workflows/conda.yml
schedule:
@@ -34,7 +33,9 @@ jobs:
fail-fast: false
matrix:
python: ["3.8", "3.9", "3.10"]
arch: ["linux-64", "linux-aarch64"]
# FIXME: aarch64 builds are consuming too much memory to run on GHA
# arch: ["linux-64", "linux-aarch64"]
arch: ["linux-64"]
steps:
- uses: actions/checkout@v3
with:
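
A note on the matrix change above: GitHub Actions runs one job per combination of matrix values, so commenting out `linux-aarch64` halves the number of conda build jobs. The following is a minimal sketch of that expansion, not code from this PR; `expand_matrix` is a hypothetical helper name, and the two dictionaries simply mirror the `python` and `arch` lists before and after the change.

```python
from itertools import product


def expand_matrix(matrix):
    """Return one dict per combination of matrix values (hypothetical helper)."""
    keys = list(matrix)
    return [dict(zip(keys, combo)) for combo in product(*(matrix[k] for k in keys))]


# Matrix before the change: three Python versions on two architectures.
before = {"python": ["3.8", "3.9", "3.10"], "arch": ["linux-64", "linux-aarch64"]}

# Matrix after the change: aarch64 is commented out in the workflow.
after = {"python": ["3.8", "3.9", "3.10"], "arch": ["linux-64"]}

if __name__ == "__main__":
    print(len(expand_matrix(before)), "jobs before")
    print(len(expand_matrix(after)), "jobs after")
```

Because `fail-fast: false` is set, each of the remaining jobs runs to completion independently of the others.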
178 changes: 113 additions & 65 deletions .github/workflows/release.yml
@@ -15,111 +15,159 @@ concurrency:
env:
upload: ${{ github.event_name == 'release' && github.repository == 'dask-contrib/dask-sql' }}

# Required shell entrypoint to have properly activated conda environments
defaults:
run:
shell: bash -l {0}

jobs:
wheels:
name: Build and publish py3.${{ matrix.python }} wheels on ${{ matrix.os }}
runs-on: ${{ matrix.os }}
linux:
name: Build and publish wheels for linux ${{ matrix.target }}
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
python: ["8", "9", "10"] # 3.x
target: [x86_64, aarch64]
steps:
- uses: actions/checkout@v3
- name: Install Protoc
uses: arduino/setup-protoc@v1
if: matrix.target == 'aarch64'
with:
version: '3.x'
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/setup-python@v4
with:
python-version: '3.10'
- name: Build wheels for x86_64
if: matrix.target == 'x86_64'
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.target }}
args: --release --out dist
sccache: 'true'
manylinux: '2_17'
before-script-linux: >
DOWNLOAD_URL=$(curl --retry 6 --retry-delay 10 -s https://api.github.com/repos/protocolbuffers/protobuf/releases/latest | grep -o '"browser_download_url": "[^"]*' | cut -d'"' -f4 | grep "\linux-x86_64.zip$") &&
curl --retry 6 --retry-delay 10 -LO $DOWNLOAD_URL &&
unzip protoc-*-linux-x86_64.zip -d $HOME/.local
docker-options: --env PROTOC=/root/.local/bin/protoc
- name: Build wheels for aarch64
if: matrix.target == 'aarch64'
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.target }}
args: --release --out dist --zig
sccache: 'true'
manylinux: '2_17'
- name: Check dist files
run: |
pip install twine

twine check dist/*
ls -lh dist/
- name: Upload binary wheels
uses: actions/upload-artifact@v3
with:
fetch-depth: 0
name: wheels for linux ${{ matrix.target }}
path: dist/*
- name: Publish package
if: env.upload == 'true'
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/*

windows:
name: Build and publish wheels for windows
runs-on: windows-latest
steps:
- uses: actions/checkout@v3
- name: Install Protoc
if: matrix.os != 'ubuntu-latest'
uses: arduino/setup-protoc@v1
with:
version: '3.x'
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Set up QEMU for linux-aarch64
if: matrix.os == 'ubuntu-latest'
uses: docker/setup-qemu-action@v2
- uses: actions/setup-python@v4
with:
platforms: arm64
- name: Add rust toolchain target for macos-aarch64
if: matrix.os == 'macos-latest'
run: rustup target add aarch64-apple-darwin
python-version: '3.10'
architecture: x64
- name: Build wheels
uses: pypa/[email protected]
uses: PyO3/maturin-action@v1
with:
target: x64
args: --release --out dist
sccache: 'true'
- name: Check dist files
run: |
pip install twine

twine check dist/*
ls dist/
- name: Upload binary wheels
uses: actions/upload-artifact@v3
with:
name: wheels for windows
path: dist/*
- name: Publish package
if: env.upload == 'true'
env:
CIBW_BUILD: 'cp3${{ matrix.python }}-*'
CIBW_SKIP: '*musllinux*'
CIBW_ARCHS_LINUX: 'aarch64 x86_64'
CIBW_ARCHS_WINDOWS: 'AMD64'
CIBW_ARCHS_MACOS: 'x86_64 arm64'
# Without CARGO_NET_GIT_FETCH_WITH_CLI we oom (https://github.com/rust-lang/cargo/issues/10583)
CIBW_ENVIRONMENT_LINUX: >
CARGO_NET_GIT_FETCH_WITH_CLI="true"
PATH="$HOME/.cargo/bin:$HOME/.local/bin:$PATH"
CIBW_ENVIRONMENT_WINDOWS: 'PATH="$UserProfile\.cargo\bin;$PATH"'
CIBW_BEFORE_BUILD: 'pip install -U setuptools-rust'
CIBW_BEFORE_BUILD_LINUX: >
ARCH=$([ $(uname -m) == x86_64 ] && echo x86_64 || echo aarch_64) &&
DOWNLOAD_URL=$(curl --retry 6 --retry-delay 10 -s https://api.github.com/repos/protocolbuffers/protobuf/releases/latest | grep -o '"browser_download_url": "[^"]*' | cut -d'"' -f4 | grep "\linux-${ARCH}.zip$") &&
curl --retry 6 --retry-delay 10 -LO $DOWNLOAD_URL &&
unzip protoc-*-linux-$ARCH.zip -d $HOME/.local &&
protoc --version &&
pip install -U setuptools-rust &&
pip list &&
curl --retry 6 --retry-delay 10 https://sh.rustup.rs -sSf | sh -s -- --default-toolchain=stable --profile=minimal -y &&
rustup show
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/*

macos:
name: Build and publish wheels for macos ${{ matrix.target }}
runs-on: macos-latest
strategy:
fail-fast: false
matrix:
target: [x86_64, aarch64]
steps:
- uses: actions/checkout@v3
- name: Install Protoc
uses: arduino/setup-protoc@v1
with:
version: '3.x'
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/setup-python@v4
with:
package-dir: .
output-dir: dist
config-file: "dask_planner/pyproject.toml"
- name: Set up Python
uses: conda-incubator/[email protected]
python-version: '3.10'
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
miniforge-variant: Mambaforge
use-mamba: true
python-version: "3.8"
channel-priority: strict
target: ${{ matrix.target }}
args: --release --out dist
sccache: 'true'
- name: Check dist files
run: |
mamba install twine
pip install twine

twine check dist/*
ls -lh dist/
- name: Upload binary wheels
uses: actions/upload-artifact@v3
with:
name: wheels for py3.${{ matrix.python }} on ${{ matrix.os }}
name: wheels for macos ${{ matrix.target }}
path: dist/*
- name: Publish package
if: env.upload == 'true'
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/*

sdist:
name: Build and publish source distribution
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Build sdist
uses: PyO3/maturin-action@v1
with:
fetch-depth: 0
- name: Set up Python
uses: conda-incubator/setup-[email protected]
command: sdist
args: --out dist
- uses: actions/setup-python@v4
with:
miniforge-variant: Mambaforge
use-mamba: true
python-version: "3.8"
channel-priority: strict
- name: Build source distribution
run: |
mamba install setuptools-rust twine

python setup.py sdist
python-version: '3.10'
- name: Check dist files
run: |
pip install twine

twine check dist/*
ls -lh dist/
- name: Publish source distribution
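
For readers following the `before-script-linux` step in the x86_64 job: it asks the GitHub releases API for the latest protobuf release, keeps the `browser_download_url` entry that ends in `linux-x86_64.zip`, and unzips it under `$HOME/.local`. The sketch below illustrates only the selection step, offline and in Python rather than `curl`/`grep`/`cut`. It assumes the release payload has an `assets` list whose entries carry `browser_download_url`; `pick_asset_url`, `sample_release`, and the `example.invalid` URLs are hypothetical and not part of this PR.

```python
from typing import Any, Dict


def pick_asset_url(release: Dict[str, Any], suffix: str) -> str:
    """Return the first asset download URL ending with `suffix` (hypothetical helper)."""
    for asset in release.get("assets", []):
        url = asset.get("browser_download_url", "")
        if url.endswith(suffix):
            return url
    raise LookupError(f"no release asset ends with {suffix!r}")


# Hypothetical, trimmed-down payload; a real API response has many more fields.
sample_release = {
    "tag_name": "v0.0.0",
    "assets": [
        {"browser_download_url": "https://example.invalid/protoc-0.0.0-linux-aarch_64.zip"},
        {"browser_download_url": "https://example.invalid/protoc-0.0.0-linux-x86_64.zip"},
        {"browser_download_url": "https://example.invalid/protoc-0.0.0-osx-x86_64.zip"},
    ],
}

if __name__ == "__main__":
    print(pick_asset_url(sample_release, "linux-x86_64.zip"))
```

Raising when nothing matches makes a missing asset fail loudly, whereas the shell pipeline would pass an empty `$DOWNLOAD_URL` on to the next `curl` call.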
5 changes: 0 additions & 5 deletions .github/workflows/rust.yml
@@ -51,7 +51,6 @@ jobs:
- name: Optionally update upstream dependencies
if: needs.detect-ci-trigger.outputs.triggered == 'true'
run: |
cd dask_planner
bash update-dependencies.sh
- name: Install Protoc
uses: arduino/setup-protoc@v1
@@ -60,11 +59,9 @@ jobs:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Check workspace in debug mode
run: |
cd dask_planner
cargo check
- name: Check workspace in release mode
run: |
cd dask_planner
cargo check --release

# test the crate
@@ -84,7 +81,6 @@ jobs:
- name: Optionally update upstream dependencies
if: needs.detect-ci-trigger.outputs.triggered == 'true'
run: |
cd dask_planner
bash update-dependencies.sh
- name: Install Protoc
uses: arduino/setup-protoc@v1
@@ -93,5 +89,4 @@ jobs:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Run tests
run: |
cd dask_planner
cargo test
5 changes: 1 addition & 4 deletions .github/workflows/test-upstream.yml
@@ -68,11 +68,10 @@ jobs:
- name: Optionally update upstream cargo dependencies
if: env.which_upstream == 'DataFusion'
run: |
cd dask_planner
bash update-dependencies.sh
- name: Build the Rust DataFusion bindings
run: |
python setup.py build install
maturin develop
- name: Install hive testing dependencies
if: matrix.os == 'ubuntu-latest'
run: |
@@ -124,11 +123,9 @@ jobs:
env:
UPDATE_ALL_CARGO_DEPS: false
run: |
cd dask_planner
bash update-dependencies.sh
- name: Install dependencies and nothing else
run: |
mamba install setuptools-rust
pip install -e . -vv

which python
3 changes: 1 addition & 2 deletions .github/workflows/test.yml
@@ -72,7 +72,7 @@ jobs:
shared-key: test
- name: Build the Rust DataFusion bindings
run: |
python setup.py build install
maturin develop
- name: Install hive testing dependencies
if: matrix.os == 'ubuntu-latest'
run: |
@@ -118,7 +118,6 @@ jobs:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Install dependencies and nothing else
run: |
mamba install "setuptools-rust>=1.5.2"
pip install -e . -vv

which python
10 changes: 1 addition & 9 deletions .gitignore
@@ -46,23 +46,15 @@ venv
# IDE
.idea
.vscode
planner/.classpath
planner/.project
planner/.settings/
planner/.idea
planner/*.iml
*.swp

# project specific
planner/dependency-reduced-pom.xml
planner/target/
dask_sql/jar
.next/
dask-worker-space/
node_modules/
docs/source/_build/
tests/unit/queries
tests/unit/data
target/*

# Ignore development specific local testing files
dev_tests
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -20,9 +20,9 @@ repos:
rev: v1.0
hooks:
- id: cargo-check
args: ['--manifest-path', './dask_planner/Cargo.toml', '--verbose', '--']
args: ['--manifest-path', './Cargo.toml', '--verbose', '--']
- id: clippy
args: ['--manifest-path', './dask_planner/Cargo.toml', '--verbose', '--', '-D', 'warnings']
args: ['--manifest-path', './Cargo.toml', '--verbose', '--', '-D', 'warnings']
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.2.0
hooks:
@@ -39,4 +39,4 @@ repos:
entry: cargo +nightly fmt
language: system
types: [rust]
args: ['--manifest-path', './dask_planner/Cargo.toml', '--verbose', '--']
args: ['--manifest-path', './Cargo.toml', '--verbose', '--']