[Bug] sglang[all]>=0.4.4.post2 installation environment is very confusing #4812
Closed
Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose instead; otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
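I reinstalled the latest version of SGLang with `sglang[all]>=0.4.4.post2`, but launching the server fails with a series of missing-module errors. The install finishes without complaint, yet `torchvision` and then `compressed_tensors` are reported missing at import time, and installing `torchvision` by hand replaces torch 2.5.1 with 2.6.0. The full session follows.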
Reproduction
(sglang_env) xlab@xlab:/mnt/Agent/SGLang$ uv pip install --no-cache "sglang[all]>=0.4.4.post2" --find-links https://flashinfer.ai/whl/cu124/torch2.5/flashinfer-python
Using Python 3.12.3 environment at: sglang_env
Resolved 129 packages in 9.99s
Prepared 129 packages in 35m 27s
Installed 129 packages in 160ms
+ aiohappyeyeballs==2.6.1
+ aiohttp==3.11.14
+ aiosignal==1.3.2
+ airportsdata==20250224
+ annotated-types==0.7.0
+ anthropic==0.49.0
+ anyio==4.9.0
+ asttokens==3.0.0
+ attrs==25.3.0
+ certifi==2025.1.31
+ cffi==1.17.1
+ charset-normalizer==3.4.1
+ click==8.1.8
+ cloudpickle==3.1.1
+ cuda-bindings==12.8.0
+ cuda-python==12.8.0
+ datasets==3.4.1
+ decorator==5.2.1
+ decord==0.6.0
+ dill==0.3.8
+ diskcache==5.6.3
+ distro==1.9.0
+ executing==2.2.0
+ fastapi==0.115.12
+ filelock==3.18.0
+ flashinfer-python==0.2.3+cu124torch2.5
+ frozenlist==1.5.0
+ fsspec==2024.12.0
+ h11==0.14.0
+ hf-transfer==0.1.9
+ httpcore==1.0.7
+ httpx==0.28.1
+ huggingface-hub==0.29.3
+ idna==3.10
+ importlib-metadata==8.6.1
+ interegular==0.3.3
+ ipython==9.0.2
+ ipython-pygments-lexers==1.1.1
+ jedi==0.19.2
+ jinja2==3.1.6
+ jiter==0.9.0
+ jsonschema==4.23.0
+ jsonschema-specifications==2024.10.1
+ lark==1.2.2
+ litellm==1.64.1
+ llguidance==0.7.10
+ markupsafe==3.0.2
+ matplotlib-inline==0.1.7
+ modelscope==1.24.0
+ mpmath==1.3.0
+ multidict==6.2.0
+ multiprocess==0.70.16
+ nest-asyncio==1.6.0
+ networkx==3.4.2
+ ninja==1.11.1.4
+ numpy==2.2.4
+ nvidia-cublas-cu12==12.4.5.8
+ nvidia-cuda-cupti-cu12==12.4.127
+ nvidia-cuda-nvrtc-cu12==12.4.127
+ nvidia-cuda-runtime-cu12==12.4.127
+ nvidia-cudnn-cu12==9.1.0.70
+ nvidia-cufft-cu12==11.2.1.3
+ nvidia-curand-cu12==10.3.5.147
+ nvidia-cusolver-cu12==11.6.1.9
+ nvidia-cusparse-cu12==12.3.1.170
+ nvidia-nccl-cu12==2.21.5
+ nvidia-nvjitlink-cu12==12.4.127
+ nvidia-nvtx-cu12==12.4.127
+ openai==1.68.2
+ orjson==3.10.16
+ outlines==0.1.11
+ outlines-core==0.1.26
+ packaging==24.2
+ pandas==2.2.3
+ parso==0.8.4
+ pexpect==4.9.0
+ pillow==11.1.0
+ prometheus-client==0.21.1
+ prompt-toolkit==3.0.50
+ propcache==0.3.1
+ psutil==7.0.0
+ ptyprocess==0.7.0
+ pure-eval==0.2.3
+ pyarrow==19.0.1
+ pycountry==24.6.1
+ pycparser==2.22
+ pydantic==2.10.6
+ pydantic-core==2.27.2
+ pygments==2.19.1
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.1.0
+ python-multipart==0.0.20
+ pytz==2025.2
+ pyyaml==6.0.2
+ pyzmq==26.3.0
+ referencing==0.36.2
+ regex==2024.11.6
+ requests==2.32.3
+ rpds-py==0.24.0
+ safetensors==0.5.3
+ sentencepiece==0.2.0
+ setproctitle==1.3.5
+ setuptools==78.1.0
+ sgl-kernel==0.0.5.post3
+ sglang==0.4.4.post2
+ six==1.17.0
+ sniffio==1.3.1
+ soundfile==0.13.1
+ stack-data==0.6.3
+ starlette==0.46.1
+ sympy==1.13.1
+ tiktoken==0.9.0
+ tokenizers==0.21.1
+ torch==2.5.1
+ torchao==0.9.0
+ tqdm==4.67.1
+ traitlets==5.14.3
+ transformers==4.50.0
+ triton==3.1.0
+ typing-extensions==4.13.0
+ tzdata==2025.2
+ urllib3==2.3.0
+ uvicorn==0.34.0
+ uvloop==0.21.0
+ wcwidth==0.2.13
+ xgrammar==0.1.16
+ xxhash==3.5.0
+ yarl==1.18.3
+ zipp==3.21.0
(sglang_env) xlab@xlab:/mnt/Agent/SGLang$ python -m sglang.launch_server --model-path /mnt/Model/Qwen --tp-size 4 --reasoning-parser deepseek-r1 --host 0.0.0.0
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/launch_server.py", line 6, in <module>
from sglang.srt.entrypoints.http_server import launch_server
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/entrypoints/http_server.py", line 45, in <module>
from sglang.srt.entrypoints.engine import _launch_subprocesses
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/entrypoints/engine.py", line 40, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/data_parallel_controller.py", line 27, in <module>
from sglang.srt.managers.io_struct import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/io_struct.py", line 25, in <module>
from sglang.srt.managers.schedule_batch import BaseFinishReason
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/schedule_batch.py", line 43, in <module>
from sglang.srt.configs.model_config import ModelConfig
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/configs/__init__.py", line 3, in <module>
from sglang.srt.configs.deepseekvl2 import DeepseekVL2Config
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/configs/deepseekvl2.py", line 7, in <module>
import torchvision.transforms as T
ModuleNotFoundError: No module named 'torchvision'
(sglang_env) xlab@xlab:/mnt/Agent/SGLang$ uv pip install torchvision
Using Python 3.12.3 environment at: sglang_env
Resolved 27 packages in 758ms
Uninstalled 2 packages in 528ms
Installed 4 packages in 105ms
+ nvidia-cusparselt-cu12==0.6.2
- torch==2.5.1
+ torch==2.6.0
+ torchvision==0.21.0
- triton==3.1.0
+ triton==3.2.0
(sglang_env) xlab@xlab:/mnt/Agent/SGLang$ python -m sglang.launch_server --model-path /mnt/Model/Qwen --tp-size 4 --reasoning-parser deepseek-r1 --host 0.0.0.0
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/launch_server.py", line 6, in <module>
from sglang.srt.entrypoints.http_server import launch_server
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/entrypoints/http_server.py", line 45, in <module>
from sglang.srt.entrypoints.engine import _launch_subprocesses
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/entrypoints/engine.py", line 40, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/data_parallel_controller.py", line 27, in <module>
from sglang.srt.managers.io_struct import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/io_struct.py", line 25, in <module>
from sglang.srt.managers.schedule_batch import BaseFinishReason
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/managers/schedule_batch.py", line 43, in <module>
from sglang.srt.configs.model_config import ModelConfig
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/configs/model_config.py", line 25, in <module>
from sglang.srt.layers.quantization import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/layers/quantization/__init__.py", line 46, in <module>
from sglang.srt.layers.quantization.compressed_tensors.compressed_tensors import (
File "/mnt/Agent/SGLang/sglang_env/lib/python3.12/site-packages/sglang/srt/layers/quantization/compressed_tensors/compressed_tensors.py", line 9, in <module>
from compressed_tensors.config import (
ModuleNotFoundError: No module named 'compressed_tensors'
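Each launch attempt above stops at the first failing import, so the missing modules only surface one at a time. A minimal sketch of a check that lists them all at once, using only the standard library, is below. The module list holds just the two names seen in the tracebacks above and is an assumption, not SGLang's actual dependency list; `find_missing_modules` is a hypothetical helper name.

```python
import importlib.util


def find_missing_modules(names):
    """Return the top-level module names that cannot be located in this environment.

    Only top-level names are supported here; find_spec on a dotted name
    would import the parent package first.
    """
    missing = []
    for name in names:
        # find_spec returns None when no finder can locate the module.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


# Assumption: these are only the two modules reported missing in the tracebacks above.
MODULES_FROM_TRACEBACKS = ["torchvision", "compressed_tensors"]

if __name__ == "__main__":
    for module_name in find_missing_modules(MODULES_FROM_TRACEBACKS):
        print(f"missing: {module_name}")
```

Running this inside the activated environment before `python -m sglang.launch_server` would show every missing module from the list in one pass, rather than one per traceback.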
Environment
(sglang_env) xlab@xlab:/mnt/Agent/SGLang$ python -m sglang.check_env
Python: 3.12.3 (main, Feb 4 2025, 14:48:35) [GCC 13.3.0]
CUDA available: True
GPU 0,1,2,3: NVIDIA RTX 6000 Ada Generation
GPU 0,1,2,3 Compute Capability: 8.9
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 12.8, V12.8.93
CUDA Driver Version: 570.124.06
PyTorch: 2.6.0+cu124
sglang: 0.4.4.post2
sgl_kernel: 0.0.5.post3
flashinfer: Module Not Found
triton: 3.2.0
transformers: 4.50.0
torchao: 0.9.0
numpy: 2.2.4
aiohttp: 3.11.14
fastapi: 0.115.12
hf_transfer: 0.1.9
huggingface_hub: 0.29.3
interegular: 0.3.3
modelscope: 1.24.0
orjson: 3.10.16
outlines: 0.1.11
packaging: 24.2
psutil: 7.0.0
pydantic: 2.10.6
multipart: Module Not Found
zmq: Module Not Found
uvicorn: 0.34.0
uvloop: 0.21.0
vllm: Module Not Found
xgrammar: 0.1.16
openai: 1.68.2
tiktoken: 0.9.0
anthropic: 0.49.0
litellm: 1.64.1
decord: 0.6.0
NVIDIA Topology:
        GPU0    GPU1    GPU2    GPU3    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0    X       NODE    SYS     SYS     20-29           2               N/A
GPU1    NODE    X       SYS     SYS     20-29           2               N/A
GPU2    SYS     SYS     X       NODE    60-69           6               N/A
GPU3    SYS     SYS     NODE    X       60-69           6               N/A
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
ulimit soft: 1024
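One thing that stands out in the logs: the installed wheel was `flashinfer-python==0.2.3+cu124torch2.5`, while `check_env` now reports PyTorch 2.6.0+cu124 and `flashinfer: Module Not Found`. Whether that mismatch is the cause is not confirmed here. As a rough sketch, the following flags such a version drift by comparing the `torchX.Y` part of a wheel's local version tag with the installed torch version. The tag format is inferred from the single version string in the log and is an assumption; both function names are hypothetical.

```python
import re


def torch_tag_from_local_version(version):
    """Extract 'X.Y' from a local version tag such as '0.2.3+cu124torch2.5'.

    Returns None when the version carries no '+...torchX.Y' suffix.
    """
    match = re.search(r"\+.*torch(\d+\.\d+)", version)
    return match.group(1) if match else None


def matches_installed_torch(wheel_version, torch_version):
    """Compare the wheel's torch tag with the major.minor of the torch version.

    Returns True or False, or None when the wheel version has no torch tag.
    """
    tag = torch_tag_from_local_version(wheel_version)
    if tag is None:
        return None
    # Drop any local suffix such as '+cu124', then keep major.minor only.
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return tag == major_minor


if __name__ == "__main__":
    # Values copied from the install log and the check_env output above.
    print(matches_installed_torch("0.2.3+cu124torch2.5", "2.6.0+cu124"))
```

With the versions from this report the check comes back `False`, whereas it would have been `True` for the originally installed torch 2.5.1.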