[Bug] Missing gguf and partial-json-parser dependencies in 0.4.4.post2 #4869
Closed
Description
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. The bug has not been fixed in the latest version.
- 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose. Otherwise, it will be closed.
- 5. Please use English, otherwise it will be closed.
Describe the bug
I installed SGLang with uv add "sglang[all]". I first hit the missing compressed_tensors dependency (as described in #4818). After installing it with uv add compressed-tensors, I ran into two more missing dependencies: gguf and partial-json-parser.
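To see all missing modules at once instead of discovering them one traceback at a time, a minimal sketch like the following may help. The module names are the import names from the errors in this report; the helper name find_missing is only illustrative and is not part of SGLang.

```python
import importlib.util

# Import names taken from the errors in this report.
REQUIRED_MODULES = ["compressed_tensors", "gguf", "partial_json_parser"]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it with the interpreter from the same virtual environment that SGLang is installed in; otherwise the result reflects a different environment.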
Reproduction
I've been more or less following DSPy's tutorial on classification finetuning (https://dspy.ai/tutorials/classification_finetuning/?h=bootstrapfinetune#dspy-program).
Calling lm.launch() triggered the following ModuleNotFoundError exceptions:
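import dspy
from dspy.clients.lm_local import LocalProvider

# `pipeline` is the DSPy program built earlier in the tutorial (not shown here).
lm = dspy.LM("openai/local:meta-llama/Llama-3.2-1B-Instruct", provider=LocalProvider(), max_tokens=8196)
pipeline.set_lm(lm)
lm.launch()  # starts the local SGLang server; this is where the import errors surface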
I first encountered:
/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/managers/session_controller.py:57: SyntaxWarning: invalid escape sequence '\-'
prefix = " " * len(origin_prefix) + " \- " + child.req.rid
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/launch_server.py", line 6, in <module>
from sglang.srt.entrypoints.http_server import launch_server
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/entrypoints/http_server.py", line 45, in <module>
from sglang.srt.entrypoints.engine import _launch_subprocesses
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/entrypoints/engine.py", line 40, in <module>
from sglang.srt.managers.data_parallel_controller import (
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/managers/data_parallel_controller.py", line 31, in <module>
from sglang.srt.managers.scheduler import run_scheduler_process
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/managers/scheduler.py", line 108, in <module>
from sglang.srt.managers.tp_worker import TpModelWorker
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/managers/tp_worker.py", line 35, in <module>
from sglang.srt.model_executor.model_runner import ModelRunner
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/model_executor/model_runner.py", line 47, in <module>
from sglang.srt.lora.lora_manager import LoRAManager
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/lora/lora_manager.py", line 27, in <module>
from sglang.srt.lora.lora import LoRAAdapter
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/lora/lora.py", line 32, in <module>
from sglang.srt.model_loader.loader import DefaultModelLoader
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/model_loader/__init__.py", line 8, in <module>
from sglang.srt.model_loader.loader import BaseModelLoader, get_model_loader
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/model_loader/loader.py", line 17, in <module>
import gguf
ModuleNotFoundError: No module named 'gguf'
And the following after installing gguf:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/launch_server.py", line 6, in <module>
from sglang.srt.entrypoints.http_server import launch_server
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/entrypoints/http_server.py", line 45, in <module>
from sglang.srt.entrypoints.engine import _launch_subprocesses
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/entrypoints/engine.py", line 59, in <module>
from sglang.srt.openai_api.adapter import load_chat_template_for_openai_api
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/openai_api/adapter.py", line 41, in <module>
from sglang.srt.function_call_parser import FunctionCallParser
File "/<TRUNCATED>/.venv/lib/python3.12/site-packages/sglang/srt/function_call_parser.py", line 9, in <module>
import partial_json_parser
ModuleNotFoundError: No module named 'partial_json_parser'
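Separately from the missing modules, the SyntaxWarning at the top of the first traceback comes from the "\-" sequence in a normal string literal, which Python 3.12 reports as an invalid escape sequence. A possible fix, sketched here with hypothetical stand-in values for origin_prefix and the request id, is to use a raw string so the backslash is kept literally:

```python
# Hypothetical stand-ins for the values used in session_controller.py.
origin_prefix = "abc"
rid = "req-1"

# A raw string keeps the backslash as-is and avoids the invalid-escape warning.
prefix = " " * len(origin_prefix) + r" \- " + rid
print(prefix)
```

Writing the literal as " \\- " would produce the same string; the raw-string form is just closer to the original line.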
Environment
Python: 3.12.9 (main, Feb 5 2025, 08:49:00) [GCC 11.4.0]
CUDA available: True
GPU 0: Tesla V100-SXM3-32GB
GPU 0 Compute Capability: 7.0
CUDA_HOME: None
PyTorch: 2.5.1+cu124
sglang: 0.4.4.post2
sgl_kernel: 0.0.5.post3
flashinfer: Module Not Found
triton: 3.1.0
transformers: 4.50.0
torchao: 0.9.0
numpy: 2.0.2
aiohttp: 3.11.14
fastapi: 0.115.12
hf_transfer: 0.1.9
huggingface_hub: 0.29.3
interegular: 0.3.3
modelscope: 1.24.0
orjson: 3.10.16
outlines: 0.1.11
packaging: 24.2
psutil: 7.0.0
pydantic: 2.11.0
multipart: Module Not Found
zmq: Module Not Found
uvicorn: 0.34.0
uvloop: 0.21.0
vllm: Module Not Found
xgrammar: 0.1.16
openai: 1.61.0
tiktoken: 0.9.0
anthropic: 0.49.0
litellm: 1.63.7
decord: 0.6.0
NVIDIA Topology:
GPU0 GPU1 GPU2 GPU3 GPU4 GPU5 GPU6 GPU7 GPU8 GPU9 GPU10 GPU11 GPU12 GPU13 GPU14 GPU15 NIC0 NIC1 NIC2 NIC3 NIC4 NIC5 NIC6 NIC7 NIC8 NIC9 CPU Affinity NUMA Affinity GPU NUMA ID
GPU0 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 PIX PXB NODE NODE SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU1 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 PIX PXB NODE NODE SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU2 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 PXB PIX NODE NODE SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU3 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 PXB PIX NODE NODE SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU4 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NODE NODE PIX PXB SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU5 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NODE NODE PIX PXB SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NODE NODE PXB PIX SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU7 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NODE NODE PXB PIX SYS SYS SYS SYS SYS SYS 0-23,48-71 0 N/A
GPU8 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 NV6 SYS SYS SYS SYS NODE NODE PIX PXB NODE NODE 24-47,72-95 1 N/A
GPU9 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 NV6 SYS SYS SYS SYS NODE NODE PIX PXB NODE NODE 24-47,72-95 1 N/A
GPU10 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 NV6 SYS SYS SYS SYS NODE NODE PXB PIX NODE NODE 24-47,72-95 1 N/A
GPU11 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 NV6 SYS SYS SYS SYS NODE NODE PXB PIX NODE NODE 24-47,72-95 1 N/A
GPU12 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 NV6 SYS SYS SYS SYS NODE NODE NODE NODE PIX PXB 24-47,72-95 1 N/A
GPU13 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 NV6 SYS SYS SYS SYS NODE NODE NODE NODE PIX PXB 24-47,72-95 1 N/A
GPU14 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X NV6 SYS SYS SYS SYS NODE NODE NODE NODE PXB PIX 24-47,72-95 1 N/A
GPU15 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 NV6 X SYS SYS SYS SYS NODE NODE NODE NODE PXB PIX 24-47,72-95 1 N/A
NIC0 PIX PIX PXB PXB NODE NODE NODE NODE SYS SYS SYS SYS SYS SYS SYS SYS X PXB NODE NODE SYS SYS SYS SYS SYS SYS
NIC1 PXB PXB PIX PIX NODE NODE NODE NODE SYS SYS SYS SYS SYS SYS SYS SYS PXB X NODE NODE SYS SYS SYS SYS SYS SYS
NIC2 NODE NODE NODE NODE PIX PIX PXB PXB SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE X PXB SYS SYS SYS SYS SYS SYS
NIC3 NODE NODE NODE NODE PXB PXB PIX PIX SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE PXB X SYS SYS SYS SYS SYS SYS
NIC4 SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE NODE NODE NODE NODE NODE NODE SYS SYS SYS SYS X PIX NODE NODE NODE NODE
NIC5 SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE NODE NODE NODE NODE NODE NODE SYS SYS SYS SYS PIX X NODE NODE NODE NODE
NIC6 SYS SYS SYS SYS SYS SYS SYS SYS PIX PIX PXB PXB NODE NODE NODE NODE SYS SYS SYS SYS NODE NODE X PXB NODE NODE
NIC7 SYS SYS SYS SYS SYS SYS SYS SYS PXB PXB PIX PIX NODE NODE NODE NODE SYS SYS SYS SYS NODE NODE PXB X NODE NODE
NIC8 SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE NODE NODE PIX PIX PXB PXB SYS SYS SYS SYS NODE NODE NODE NODE X PXB
NIC9 SYS SYS SYS SYS SYS SYS SYS SYS NODE NODE NODE NODE PXB PXB PIX PIX SYS SYS SYS SYS NODE NODE NODE NODE PXB X
Legend:
X = Self
SYS = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
PHB = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
PXB = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
PIX = Connection traversing at most a single PCIe bridge
NV# = Connection traversing a bonded set of # NVLinks
NIC Legend:
NIC0: mlx5_0
NIC1: mlx5_1
NIC2: mlx5_2
NIC3: mlx5_3
NIC4: mlx5_4
NIC5: mlx5_5
NIC6: mlx5_6
NIC7: mlx5_7
NIC8: mlx5_8
NIC9: mlx5_9
ulimit soft: 131072