
adding wheel build for libcudf #15483

Merged
rapids-bot[bot] merged 122 commits into rapidsai:branch-24.10 from msarahan:libcudf-wheel
Aug 23, 2024

Conversation

@msarahan
Contributor

@msarahan msarahan commented Apr 8, 2024

Description

Contributes to rapidsai/build-planning#33

Adds a standalone libcudf wheel, containing the libcudf C++ shared library.

Fixes #16588

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Notes for Reviewers

Dependency Flows

My (@jameslamb's) interpretation of the state we want to reach with this PR:

```mermaid
---
title: Build dependencies
---
flowchart TD
    A[libcudf] --> B[pylibcudf]
    A --> C[cudf]
    B --> C[cudf]
    B --> F[cudf-kafka]
    D[dask-cudf]
    E[cudf-polars]
    G[custreamz]
```
```mermaid
---
title: Runtime dependencies
---
flowchart TD
    A[libcudf] --> B[pylibcudf]
    B --> C[cudf]
    A --> C
    B --> E[cudf-polars]
    C --> D[dask-cudf]
    C --> F[cudf-kafka]
    C --> G[custreamz]
    F --> G
```

Size changes

| wheel       | size (before) | size (this PR) |
|-------------|---------------|----------------|
| libcudf     | ---           | 440M           |
| pylibcudf   | 469M          | 16M            |
| cudf        | 478M          | 24M            |
| cudf-polars | 0.04M         | 0.04M          |
| dask-cudf   | 0.05M         | 0.05M          |
| TOTAL      | 947M          | 480M           |

NOTES: size = compressed, "before" = 2024-08-21 nightlies (58799d6)

How I calculated those:
```shell
docker run \
    --rm \
    -v $(pwd):/opt/work:ro \
    -w /opt/work \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2024-08-21 \
    --env RAPIDS_NIGHTLY_SHA=58799d6 \
    --env RAPIDS_PR_NUMBER=15483 \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env RAPIDS_REPOSITORY=rapidsai/cudf \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.11 \
    bash

mkdir -p "${WHEEL_DIR_BEFORE}"
mkdir -p "${WHEEL_DIR_AFTER}"

cpp_projects=(
    libcudf
)
py_projects=(
    cudf
    pylibcudf
)
py_pure_projects=(
    cudf_polars
    dask_cudf
)

# TODO: calculate the date
for project in "${py_projects[@]}"; do
    # before
    RAPIDS_BUILD_TYPE=nightly \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="branch-24.10" \
    RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_BEFORE}"

    # after
    RAPIDS_BUILD_TYPE=pull-request \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_AFTER}"
done

for project in "${py_pure_projects[@]}"; do
    # before
    RAPIDS_BUILD_TYPE=nightly \
    RAPIDS_PY_WHEEL_PURE="1" \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="branch-24.10" \
    RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_BEFORE}"

    # after
    RAPIDS_BUILD_TYPE=pull-request \
    RAPIDS_PY_WHEEL_PURE="1" \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
        rapids-download-wheels-from-s3 python "${WHEEL_DIR_AFTER}"
done

for project in "${cpp_projects[@]}"; do
    # before (note: no RAPIDS_PY_WHEEL_PURE here -- libcudf is not a pure wheel)
    RAPIDS_BUILD_TYPE=nightly \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="branch-24.10" \
    RAPIDS_SHA=${RAPIDS_NIGHTLY_SHA} \
        rapids-download-wheels-from-s3 cpp "${WHEEL_DIR_BEFORE}"

    # after
    RAPIDS_BUILD_TYPE=pull-request \
    RAPIDS_PY_WHEEL_NAME="${project}_${RAPIDS_PY_CUDA_SUFFIX}" \
    RAPIDS_REF_NAME="pull-request/${RAPIDS_PR_NUMBER}" \
        rapids-download-wheels-from-s3 cpp "${WHEEL_DIR_AFTER}"
done

du -sh ${WHEEL_DIR_BEFORE}/*
du -sh ${WHEEL_DIR_BEFORE}
du -sh ${WHEEL_DIR_AFTER}/*
du -sh ${WHEEL_DIR_AFTER}
```

Related work

Corresponding devcontainers PR: rapidsai/devcontainers#271

We should merge that when this is just about ready, then re-run the failing devcontainers CI jobs here.

@github-actions github-actions bot added Python Affects Python cuDF API. CMake CMake build issue labels Apr 8, 2024
@msarahan msarahan added DO NOT MERGE Hold off on merging; see PR for details and removed Python Affects Python cuDF API. labels Apr 8, 2024
@github-actions github-actions bot added the Python Affects Python cuDF API. label Apr 8, 2024
@github-actions github-actions bot added the ci label Apr 9, 2024
@msarahan msarahan force-pushed the libcudf-wheel branch 5 times, most recently from 973ce98 to 1ac95d4 Compare April 11, 2024 20:01
@msarahan
Contributor Author

Stuck on dlpack here. The failure is:

```text
/__w/cudf/cudf/python/cudf/build/cp310-cp310-manylinux_2_28_aarch64/cudf/_lib/interop.cxx:1443:10: fatal error: dlpack/dlpack.h: No such file or directory
 1443 | #include "dlpack/dlpack.h"
      |          ^~~~~~~~~~~~~~~~~
compilation terminated.
```

https://github.com/rapidsai/cudf/actions/runs/8662725105/job/23755706695?pr=15483#step:9:290

msarahan added a commit to msarahan/devcontainers that referenced this pull request Apr 18, 2024
I'm not at all sure that I've done this correctly. This is intended to support local development with a devcontainer for the work in rapidsai/cudf#15483
@msarahan
Contributor Author

@vyasr the build is passing now, but the tests seem to show that the library (libcudf.so) can't be found. It does seem like libcudf-cuXY is being installed OK, so I'm thinking it has to be some kind of RPATH issue. I don't see how you've addressed this in the RMM PR. Any advice?

I've inspected one of the libcudf wheels, and it has both lib and lib64 folders. The lib folder has nvcomp stuff, and lib64 has libcudf.so.
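Since a wheel is just a zip archive, that inspection can be reproduced with Python's standard library. A minimal sketch (the wheel filename in the usage comment is hypothetical):

```python
import zipfile

def list_shared_libs(wheel_path):
    """Return the shared-library entries bundled inside a wheel.

    A wheel is a zip archive, so we just scan its member names for
    files whose basename contains '.so' (this also catches versioned
    sonames like 'libarrow.so.1601').
    """
    with zipfile.ZipFile(wheel_path) as whl:
        return [n for n in whl.namelist() if ".so" in n.rsplit("/", 1)[-1]]

# Hypothetical usage:
# list_shared_libs("libcudf_cu12-24.10.0a0-py3-none-manylinux_2_28_aarch64.whl")
```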

@vyasr
Contributor

vyasr commented Apr 19, 2024

You won't see this in the rmm PR because rmm is header-only, so there is no library there. The raft PR is a more useful example here. What you want to look at is the Python components of the libraft package, specifically the load.py module. The idea is that we never set RPATHs on libraries explicitly. Instead, we use ctypes to dlopen the library before we import any of the extension modules. This is more robust than setting RPATHs because of the various ways in which Python packages can coexist on a user's system (different environments, PYTHONPATH modification, etc.).
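A minimal sketch of that dlopen-before-import pattern. The directory layout and names here are illustrative, not the actual libraft/libcudf load.py:

```python
import ctypes
import os

def load_library(lib_dir, soname="libcudf.so"):
    """dlopen the C++ shared library before any extension module that
    links against it is imported.

    RTLD_GLOBAL makes the library's symbols visible to extension modules
    loaded afterwards, so no RPATH has to be baked into those extensions.
    """
    return ctypes.CDLL(os.path.join(lib_dir, soname), mode=ctypes.RTLD_GLOBAL)

# In the package's __init__.py one would call load_library(...) first,
# and only then import the compiled extension modules.
```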

@msarahan msarahan force-pushed the libcudf-wheel branch 2 times, most recently from 5b6bc46 to 580779d Compare April 23, 2024 17:16
@vyasr
Contributor

vyasr commented Apr 29, 2024

You have to set the variables before the rapids-cmake clone happens, which in this case is in the include(../../rapids_config.cmake) line. IOW the variables as currently set won't have any effect. Also, I think you'll need to specify the repo as vyasr/rapids-cmake, not just vyasr.
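For illustration, a hedged sketch of that ordering. The override variable names are assumptions based on this discussion, and the branch name is hypothetical:

```cmake
# These overrides must be set before rapids_config.cmake runs, because
# that include() is where the rapids-cmake clone actually happens.
set(rapids-cmake-repo "vyasr/rapids-cmake")
set(rapids-cmake-branch "my-test-branch")  # hypothetical branch name

include(../../rapids_config.cmake)  # clone happens here; overrides now take effect
```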

@vyasr
Contributor

vyasr commented Aug 22, 2024

I've gone through and responded to the many threads where there was open discussion that needed some feedback from me, but I haven't reviewed yet. Will do that soon.

Contributor

@bdice bdice left a comment


I only have a couple minor suggestions, so I am approving this. There are some open conversations, mostly items that I think are between @jameslamb and @vyasr to resolve (or defer for later work). If you need more feedback on those, let me know.

I also removed some reviewers that were autoassigned, to reduce the noise for those folks.

@bdice bdice mentioned this pull request Aug 22, 2024
3 tasks
Contributor

@bdice bdice left a comment


Error in the pip devcontainers:

```text
ERROR: Could not find a version that satisfies the requirement libcudf-cu12==24.10.*,>=0.0.0a0 (from versions: none)

ERROR: No matching distribution found for libcudf-cu12==24.10.*,>=0.0.0a0
```

Is this blocked by rapidsai/devcontainers#271?

Edit: yes. From the description:

We should merge that when this is just about ready, then re-run the failing devcontainers CI jobs here.

Contributor

@vyasr vyasr left a comment


Almost there. Really only the build.sh change that's a blocker I think.

```diff
-python -m auditwheel repair -w ${package_dir}/final_dist ${package_dir}/dist/*
+python -m auditwheel repair \
+    --exclude libcudf.so \
+    --exclude libarrow.so.1601 \
```
Contributor

Depending on merge order this can be removed.

```shell
RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"

mkdir -p ${package_dir}/final_dist
python -m auditwheel repair --exclude libarrow.so.1601 -w ${package_dir}/final_dist ${package_dir}/dist/*
```
Contributor

As above, this will be removable soon.

AyodeAwe pushed a commit to rapidsai/devcontainers that referenced this pull request Aug 23, 2024
I'm not at all sure that I've done this correctly. This is intended to
support local development with a devcontainer for the work in
rapidsai/cudf#15483

Co-authored-by: James Lamb <[email protected]>
@jameslamb
Member

devcontainers job passed after merging rapidsai/devcontainers#271:

https://github.com/rapidsai/cudf/actions/runs/10515765176/job/29173481700?pr=15483

😁 thanks @AyodeAwe

Given that + all the approvals here, I'm gonna merge this. Thanks so much for all the help everyone!!!

@jameslamb
Member

/merge


Labels

  • 3 - Ready for Review (Ready for review by team)
  • ci
  • CMake (CMake build issue)
  • conda
  • improvement (Improvement / enhancement to an existing function)
  • non-breaking (Non-breaking change)
  • Python (Affects Python cuDF API.)

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Update update-version.sh for pylibcudf

7 participants