Fixes for pandas 2, latest cudf, and wheel building #4144
rapids-bot[bot] merged 9 commits into rapidsai:branch-24.04
Conversation
bdice
left a comment
Seems fine to me. We’ll want to get this merged soon to unblock CUDA 12.2 work on branch-24.04.
It looks like the ARM builds are failing. Do we know what is going on there?
Both ARM and x86-64 builds are failing. cuGraph is relying on functions from libcudf that were deprecated in 24.02 and removed in 24.04: rapidsai/cudf#14848
Yeah, I started a Slack thread to get help on those fixes.
ChuckHastings
left a comment
Thanks for fixing this!
rlratzel
left a comment
Thank you very much for taking care of this!
mkdir -p ./dist
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

# Download wheels built during this job.
Thanks for fixing this. It's unfortunate that we have to do it this way rather than something closer to what we do for conda testing (adding a channel that points to the local builds). I haven't verified it one way or the other, but I'm curious whether our other wheel test environments have similar problems; it seems like they would.
When I set up wheels originally I did this for every package that needs it. I know cugraph added a bunch of wheels afterwards, so it's probably worth an audit.
I don't think this is all that different from the conda case. The main difference is that there are multiple install lines. Those were initially there to avoid needing --pre to force installation of pre-releases, but with the newer versioning strategies that I put in place for wheels, that probably won't be an issue any more. It's untested, but we ought to be able to use a single install command here now.
/merge
This PR contains a number of different fixes currently required to get cugraph tests passing:
- There are two main changes for pandas 2 compatibility:
- [pandas renamed `DataFrame.applymap` to `DataFrame.map`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.map.html) so creating the renumbering map with a column `map` caused problems for attribute-based column access `renumber_map.map`. Those columns are now renamed to `renumber_map`.
- Empty columns now default to str rather than float, so tests that assumed we could access the values as cupy arrays failed because cudf's string columns cannot be converted to cupy arrays. These columns are now always cast to float in the tests before the cupy conversion.
- cugraph-dgl and cugraph-pyg's wheel builds were not downloading the latest cugraph/pylibcugraph wheels to run tests. As a result, the above pandas 2 fixes did not take effect when running the dgl and pyg tests. I updated the wheel building scripts to account for this discrepancy.
- rapidsai/cudf#14202 made a breaking change to how characters are encoded in strings columns in cudf, which broke cugraph_etl. This PR fixes the code that depended on the old APIs.
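
The two pandas 2 compatibility changes above can be sketched as follows. This is a minimal illustration, not the code from this PR: it uses pandas and NumPy on the CPU as stand-ins for cudf and cupy, and the table contents and the `id` column name are made up. It assumes only that a column named `map` collides with `DataFrame.map` on pandas versions that define that method, and that an empty column may not be float unless it is cast.

```python
import pandas as pd

# Hypothetical renumbering table; the values and the "id" column are illustrative.
renumber_map = pd.DataFrame({"map": [10, 11, 12], "id": [0, 1, 2]})

# Bracket access always returns the column, whatever it is named.
by_key = renumber_map["map"]

# Attribute access resolves DataFrame methods before columns, so on pandas
# versions that define DataFrame.map this is a bound method, not the data.
by_attr = renumber_map.map
shadowed = hasattr(pd.DataFrame, "map")

# Renaming the column to a name with no matching DataFrame attribute
# makes attribute access return the column again.
renamed = renumber_map.rename(columns={"map": "renumber_map"})
restored = renamed.renumber_map

# Second change: an empty column is not guaranteed to be float. The explicit
# object dtype here stands in for the non-float default; casting to float
# first keeps the later array conversion numeric.
empty = pd.Series([], dtype=object)
values = empty.astype("float64").to_numpy()
```

Bracket access (`renumber_map["map"]`) would also have avoided the collision; renaming the column keeps existing attribute-style call sites working.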
This PR also includes a small patch to the cugraph_etl CMake so that it exports the correct package name (previously it was using cugraph).
Authors:
- Vyas Ramasubramani (https://github.com/vyasr)
Approvers:
- GALI PREM SAGAR (https://github.com/galipremsagar)
- Bradley Dice (https://github.com/bdice)
- Chuck Hastings (https://github.com/ChuckHastings)
- Rick Ratzel (https://github.com/rlratzel)
- Jake Awe (https://github.com/AyodeAwe)
URL: rapidsai/cugraph#4144