[Docker] optimize dockerfile remove deepep and blackwell merge it to… #7343
Merged
22 commits
ee3aabf [Docker] optimize dockerfile remove deepep and blackwell merge it to… (whybeyoung)
2e2d0ec upd (whybeyoung)
3d2ff9d upd (whybeyoung)
fc107e3 upd (whybeyoung)
0fd0c7c upd (whybeyoung)
164982f Merge branch 'main' into pddocker (whybeyoung)
9279f9e Merge branch 'main' into pddocker (whybeyoung)
02d682a upd (zhyncs)
6e35885 upd (zhyncs)
3ed8155 upd (zhyncs)
49527f1 upd (zhyncs)
3b7cfdb upd (zhyncs)
e5d623d upd (zhyncs)
10abd98 upd (zhyncs)
78423cf upd (zhyncs)
8ad1e18 Merge branch 'main' into pddocker (zhyncs)
f726c0e upd (zhyncs)
a0f94e2 upd (zhyncs)
54fc5e2 upd (zhyncs)
9d31dc5 upd (zhyncs)
8fafb89 upd (zhyncs)
1fc768d upd (zhyncs)
This file was deleted.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,51 +1,98 @@ | ||
| ARG CUDA_VERSION=12.4.1 | ||
| | ||
| FROM nvcr.io/nvidia/tritonserver:24.12-py3-min | ||
| ARG CUDA_VERSION=12.6.1 | ||
| FROM nvidia/cuda:${CUDA_VERSION}-cudnn-devel-ubuntu22.04 | ||
| | ||
| ARG BUILD_TYPE=all | ||
| ENV DEBIAN_FRONTEND=noninteractive | ||
| ENV DEBIAN_FRONTEND=noninteractive \ | ||
| CUDA_HOME=/usr/local/cuda \ | ||
| GDRCOPY_HOME=/usr/src/gdrdrv-2.4.4/ \ | ||
| NVSHMEM_DIR=/sgl-workspace/nvshmem/install | ||
| | ||
| # Set timezone and install all packages | ||
| RUN echo 'tzdata tzdata/Areas select America' | debconf-set-selections \ | ||
| && echo 'tzdata tzdata/Zones/America select Los_Angeles' | debconf-set-selections \ | ||
| && apt update -y \ | ||
| && apt install software-properties-common -y \ | ||
| && apt install python3 python3-pip -y \ | ||
| && apt install curl git sudo libibverbs-dev -y \ | ||
| && apt install rdma-core infiniband-diags openssh-server perftest -y \ | ||
| && python3 --version \ | ||
| && python3 -m pip --version \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && apt clean | ||
| | ||
| # For openbmb/MiniCPM models | ||
| RUN pip3 install datamodel_code_generator --break-system-packages | ||
| && echo 'tzdata tzdata/Zones/America select Los_Angeles' | debconf-set-selections \ | ||
| && apt-get update && apt-get install -y --no-install-recommends \ | ||
| tzdata \ | ||
| software-properties-common netcat-openbsd kmod unzip openssh-server \ | ||
| curl wget lsof zsh ccache tmux htop git-lfs tree \ | ||
| python3 python3-pip python3-dev libpython3-dev \ | ||
| build-essential cmake \ | ||
| libopenmpi-dev libnuma1 libnuma-dev \ | ||
| libibverbs-dev libibverbs1 libibumad3 \ | ||
| librdmacm1 libnl-3-200 libnl-route-3-200 libnl-route-3-dev libnl-3-dev \ | ||
| ibverbs-providers infiniband-diags perftest \ | ||
| libgoogle-glog-dev libgtest-dev libjsoncpp-dev libunwind-dev \ | ||
| libboost-all-dev libssl-dev \ | ||
| libgrpc-dev libgrpc++-dev libprotobuf-dev protobuf-compiler-grpc \ | ||
| pybind11-dev \ | ||
| libhiredis-dev libcurl4-openssl-dev \ | ||
| libczmq4 libczmq-dev \ | ||
| libfabric-dev \ | ||
| patchelf \ | ||
| nvidia-dkms-550 \ | ||
| devscripts debhelper fakeroot dkms check libsubunit0 libsubunit-dev \ | ||
| && ln -sf /usr/bin/python3 /usr/bin/python \ | ||
| && rm -rf /var/lib/apt/lists/* \ | ||
| && apt-get clean | ||
| | ||
| # GDRCopy installation | ||
| RUN mkdir -p /tmp/gdrcopy && cd /tmp \ | ||
| && git clone https://github.com/NVIDIA/gdrcopy.git -b v2.4.4 \ | ||
| && cd gdrcopy/packages \ | ||
| && CUDA=/usr/local/cuda ./build-deb-packages.sh \ | ||
| && dpkg -i gdrdrv-dkms_*.deb libgdrapi_*.deb gdrcopy-tests_*.deb gdrcopy_*.deb \ | ||
| && cd / && rm -rf /tmp/gdrcopy | ||
| | ||
| # Fix DeepEP IBGDA symlink | ||
| RUN ln -sf /usr/lib/x86_64-linux-gnu/libmlx5.so.1 /usr/lib/x86_64-linux-gnu/libmlx5.so | ||
| | ||
| # Clone and install SGLang | ||
| WORKDIR /sgl-workspace | ||
| RUN python3 -m pip install --no-cache-dir --upgrade pip setuptools wheel html5lib six \ | ||
| && git clone --depth=1 https://github.com/sgl-project/sglang.git \ | ||
| && cd sglang \ | ||
| && case "$CUDA_VERSION" in \ | ||
| 12.6.1) CUINDEX=126 ;; \ | ||
| 12.8.1) CUINDEX=128 ;; \ | ||
| *) echo "Unsupported CUDA version: $CUDA_VERSION" && exit 1 ;; \ | ||
| esac \ | ||
| && python3 -m pip install --no-cache-dir -e "python[${BUILD_TYPE}]" --extra-index-url https://download.pytorch.org/whl/cu${CUINDEX} \ | ||
| && if [ "$CUDA_VERSION" = "12.8.1" ]; then \ | ||
| python3 -m pip install --no-cache-dir nvidia-nccl-cu12==2.27.3 --force-reinstall --no-deps ; \ | ||
| python3 -m pip install --no-cache-dir https://github.com/sgl-project/whl/releases/download/v0.1.9/sgl_kernel-0.1.9+cu128-cp39-abi3-manylinux2014_x86_64.whl --force-reinstall --no-deps ; \ | ||
| fi | ||
| | ||
| # Build and install NVSHMEM + DeepEP | ||
| RUN wget https://developer.download.nvidia.com/compute/redist/nvshmem/3.2.5/source/nvshmem_src_3.2.5-1.txz \ | ||
| && git clone https://github.com/deepseek-ai/DeepEP.git \ | ||
| && tar -xf nvshmem_src_3.2.5-1.txz && mv nvshmem_src nvshmem \ | ||
| && cd nvshmem \ | ||
| && git apply /sgl-workspace/DeepEP/third-party/nvshmem.patch \ | ||
| && sed -i '1i#include <unistd.h>' examples/moe_shuffle.cu \ | ||
| && rm -f /sgl-workspace/nvshmem_src_3.2.5-1.txz \ | ||
| && NVSHMEM_SHMEM_SUPPORT=0 \ | ||
| NVSHMEM_UCX_SUPPORT=0 \ | ||
| NVSHMEM_USE_NCCL=0 \ | ||
| NVSHMEM_MPI_SUPPORT=0 \ | ||
| NVSHMEM_IBGDA_SUPPORT=1 \ | ||
| NVSHMEM_PMIX_SUPPORT=0 \ | ||
| NVSHMEM_TIMEOUT_DEVICE_POLLING=0 \ | ||
| NVSHMEM_USE_GDRCOPY=1 \ | ||
| cmake -S . -B build/ -DCMAKE_INSTALL_PREFIX=${NVSHMEM_DIR} -DCMAKE_CUDA_ARCHITECTURES=90 \ | ||
| && cmake --build build --target install -j \ | ||
| && cd /sgl-workspace/DeepEP \ | ||
| && NVSHMEM_DIR=${NVSHMEM_DIR} pip install . | ||
| | ||
| ARG CUDA_VERSION | ||
| RUN python3 -m pip install --upgrade pip setuptools wheel html5lib six --break-system-packages --ignore-installed \ | ||
| && git clone --depth=1 https://github.com/sgl-project/sglang.git \ | ||
| && if [ "$CUDA_VERSION" = "12.1.1" ]; then \ | ||
| export CUINDEX=121; \ | ||
| elif [ "$CUDA_VERSION" = "12.4.1" ]; then \ | ||
| export CUINDEX=124; \ | ||
| elif [ "$CUDA_VERSION" = "12.8.1" ]; then \ | ||
| export CUINDEX=124; \ | ||
| elif [ "$CUDA_VERSION" = "11.8.0" ]; then \ | ||
| export CUINDEX=118; \ | ||
| python3 -m pip install --no-cache-dir sgl-kernel -i https://docs.sglang.ai/whl/cu118 --break-system-packages; \ | ||
| else \ | ||
| echo "Unsupported CUDA version: $CUDA_VERSION" && exit 1; \ | ||
| fi \ | ||
| && if [ "$CUDA_VERSION" = "12.4.1" ]; then \ | ||
| python3 -m pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cu126 --break-system-packages; \ | ||
| else \ | ||
| python3 -m pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cu${CUINDEX} --break-system-packages; \ | ||
| fi \ | ||
| && cd sglang \ | ||
| && python3 -m pip --no-cache-dir install -e "python[${BUILD_TYPE}]" --break-system-packages \ | ||
| && if [ "$CUDA_VERSION" = "12.8.1" ]; then \ | ||
| python3 -m pip install nvidia-nccl-cu12==2.26.2.post1 --force-reinstall --no-deps --break-system-packages; \ | ||
| fi | ||
| # Python tools | ||
| RUN python3 -m pip install --no-cache-dir \ | ||
| datamodel_code_generator \ | ||
| mooncake_transfer_engine==0.3.3.post2 \ | ||
| pre-commit \ | ||
| pytest \ | ||
| black \ | ||
| isort \ | ||
| icdiff \ | ||
| uv \ | ||
| wheel \ | ||
| scikit-build-core | ||
| | ||
| ENV DEBIAN_FRONTEND=interactive | ||
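
The CUDA version handling in the new `RUN` step above can be tried outside of Docker. The following is a minimal sketch in POSIX shell, not code from the PR: the function names `cuindex_for` and `torch_index_url` are hypothetical and exist only for illustration. The two supported versions, the error message, and the `https://download.pytorch.org/whl/cu…` URL pattern are taken from the diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of the PR): mirrors the case statement in the
# new Dockerfile, mapping a CUDA toolkit version to the PyTorch index suffix.
cuindex_for() {
    case "$1" in
        12.6.1) echo 126 ;;
        12.8.1) echo 128 ;;
        *) echo "Unsupported CUDA version: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: builds the value that the Dockerfile passes to
# pip's --extra-index-url for a given CUDA version.
torch_index_url() {
    cuindex=$(cuindex_for "$1") || return 1
    echo "https://download.pytorch.org/whl/cu${cuindex}"
}

# Example call for one of the two versions the new Dockerfile accepts.
torch_index_url 12.8.1
```

Any version other than 12.6.1 or 12.8.1 makes the helper return a non-zero status, which is how the Dockerfile's `exit 1` branch aborts the build. Assuming the image is built with the standard Docker CLI, the version would be chosen with something like `docker build --build-arg CUDA_VERSION=12.8.1 .`; the build context and any image tag are not shown in this PR.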