Skip to content

Conversation

@charlesbluca
Copy link
Collaborator

With rapidsai/cudf#9455 merged, cuDF now has handling for lists of ascending bools and na_position, making it consistent with Pandas.

)

df_result = c.sql(
"SELECT * FROM df ORDER BY a ASC NULLS FIRST, b DESC NULLS LAST, c DESC NULLS LAST"
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old logic that we were using to decide if we could do single-partition sorts (not any(sort_null_first[1:])) was incorrect, but we weren't catching that in this test because we weren't checking the full dataframe's sorting. This new test should fail with the old logic but pass with this PR.

@charlesbluca
Copy link
Collaborator Author

This should be good to go in, but would currently depend on nightly cuDF; going to leave this as a draft until 21.12 is released

@codecov-commenter
Copy link

codecov-commenter commented Oct 23, 2021

Codecov Report

Merging #262 (df4fb38) into main (ebdf4d5) will increase coverage by 0.09%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #262      +/-   ##
==========================================
+ Coverage   95.89%   95.99%   +0.09%     
==========================================
  Files          64       64              
  Lines        2730     2794      +64     
  Branches      408      420      +12     
==========================================
+ Hits         2618     2682      +64     
+ Misses         72       71       -1     
- Partials       40       41       +1     
Impacted Files Coverage Δ
dask_sql/physical/utils/sort.py 90.38% <100.00%> (+7.62%) ⬆️
dask_sql/datacontainer.py 94.11% <0.00%> (-5.89%) ⬇️
dask_sql/cmd.py 100.00% <0.00%> (ø)
dask_sql/physical/rel/convert.py 87.50% <0.00%> (ø)
dask_sql/physical/rel/logical/window.py 98.81% <0.00%> (ø)
dask_sql/context.py 99.09% <0.00%> (+<0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebdf4d5...df4fb38. Read the comment docs.

@charlesbluca
Copy link
Collaborator Author

After some more thought here, I actually think it makes sense to merge in changes that depend on nightly cuDF - at the moment, GPU support is still experimental and I feel like it should be expected to require the latest changes on cuDF's end.

Hopefully in the future we can relax this requirement and allow for stable versions (with performance being the only implication), but for now I think this should be good to merge as long as we also make sure to emphasize the need for cuDF nightlies in docs on GPU support.

@charlesbluca charlesbluca marked this pull request as ready for review November 1, 2021 15:11
@quasiben
Copy link
Contributor

quasiben commented Nov 1, 2021

After some more thought here, I actually think it makes sense to merge in changes that depend on nightly cuDF - at the moment, GPU support is still experimental and I feel like it should be expected to require the latest changes on cuDF's end.

Agreed I think for cuDF/GPU support we can be quite liberal in our experimentation and communicate that to end users.

@quasiben quasiben merged commit 0bf65bb into dask-contrib:main Nov 1, 2021
@charlesbluca charlesbluca deleted the simplify-single-part-sort branch January 19, 2022 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants