Your current environment
vllm version 0.14.0
vllm-omni version 0.14.0rc1
torch version 2.9.1
Your code version
The commit id or version of vllm
0.14.0
The commit id or version of vllm-omni
0.14.0rc1
🐛 Describe the bug
Launch command (/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice is a local path):
vllm-omni serve /data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice \
  --served-model-name Qwen3-TTS-12Hz-1.7B-CustomVoice \
  --stage-configs-path qwen3_tts.yaml \
  --host 0.0.0.0 \
  --port 8800 \
  --gpu-memory-utilization 0.5 \
  --trust-remote-code \
  --enforce-eager \
  --omni
Contents of qwen3_tts.yaml:
stage_args:
  - stage_id: 0
    stage_type: llm  # Use llm stage type to launch OmniLLM
    runtime:
      devices: "0"
      max_batch_size: 1
    engine_args:
      model_stage: qwen3_tts
      model_arch: Qwen3TTSForConditionalGeneration
      worker_cls: vllm_omni.worker.gpu_generation_worker.GPUGenerationWorker
      scheduler_cls: vllm_omni.core.sched.omni_generation_scheduler.OmniGenerationScheduler
      enforce_eager: true
      trust_remote_code: true
      async_scheduling: false
      enable_prefix_caching: false
      engine_output_type: audio  # Final output: audio waveform
      gpu_memory_utilization: 0.1
      distributed_executor_backend: "mp"
      max_num_batched_tokens: 10000
    final_output: true
    final_output_type: audio
Errors are printed at startup, but the server still starts:
[Stage-0] WARNING 01-29 09:58:53 [mooncake_connector.py:18] Mooncake not available, MooncakeOmniConnector will not work
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:492] speaker_encoder_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:489] talker_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:492] speaker_encoder_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
[Stage-0] INFO 01-29 09:59:04 [model.py:530] Resolved architecture: Qwen3TTSForConditionalGeneration
[Stage-0] ERROR 01-29 09:59:04 [repo_utils.py:65] Error retrieving safetensors: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice'. Use repo_type argument if needed., retrying 1 of 2
[Stage-0] ERROR 01-29 09:59:06 [repo_utils.py:63] Error retrieving safetensors: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice'. Use repo_type argument if needed.
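For context on the `repo_utils.py` errors above: Hugging Face repo ids may only take the form `repo_name` or `namespace/repo_name`, so an absolute local path is rejected by the hub lookup even though the weights still load from disk. A rough illustration of the shape check (an approximation for demonstration, not the hub's actual validation code):

```python
import re

# Approximate pattern: one path segment, or "namespace/name" with two.
# A local path like "/data/LLM/..." starts with "/" and has extra
# slashes, so it cannot match — hence the retrying safetensors error.
REPO_ID_RE = re.compile(r"^[\w.-]+(/[\w.-]+)?$")

print(bool(REPO_ID_RE.match("Qwen/Qwen3-TTS")))                          # valid repo id
print(bool(REPO_ID_RE.match("/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice")))  # local path, rejected
```

Since the model is loaded from a local directory anyway, this error should be harmless noise rather than the cause of the audio problem.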
At inference time, running the client:
python openai_speech_client.py \
--text "今天天气真好" \
--voice Ryan \
--instructions "用开心的语气说"
fails with a 400 and this error body:
{"error":{"message":"1 validation error:\n {'type': 'literal_error', 'loc': ('body', 'voice'), 'msg': "Input should be 'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'", 'input': 'Ryan', 'ctx': {'expected': "'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'"}}\n\n File "/root/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/utils.py", line 709, in create_speech\n POST /v1/audio/speech [{'type': 'literal_error', 'loc': ('body', 'voice'), 'msg': "Input should be 'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'", 'input': 'Ryan', 'ctx': {'expected': "'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'"}}]","type":"Bad Request","param":null,"code":400}}
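The 400 above is the request schema rejecting `voice` before it reaches the model: the OpenAI-compatible `/v1/audio/speech` endpoint validates `voice` against OpenAI's fixed voice list, so a model speaker name like `Ryan` never passes validation. A minimal sketch of an equivalent check (`validate_voice` is a hypothetical helper for illustration, not vllm's actual code, which uses a pydantic `Literal` field):

```python
# The fixed OpenAI-compatible voice names from the error message above.
ALLOWED_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}

def validate_voice(voice: str) -> str:
    """Reject any voice name outside the OpenAI-compatible list."""
    if voice not in ALLOWED_VOICES:
        raise ValueError(
            f"voice must be one of {sorted(ALLOWED_VOICES)}, got {voice!r}"
        )
    return voice

print(validate_voice("alloy"))   # accepted
try:
    validate_voice("Ryan")       # model speaker name -> rejected with 400
except ValueError as e:
    print("rejected:", e)
```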
A different test I wrote myself returns 200 OK, but the WAV audio does not match the input text:
import requests

url = "http://localhost:8800/v1/audio/speech"
data = {
    "input": "你好在吗",
    "voice": "alloy",
    "response_format": "wav",
}
response = requests.post(url, json=data)
if response.status_code == 200:
    with open("test_vllm.wav", "wb") as f:
        f.write(response.content)
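One way to narrow down the mismatch is to confirm the 200 response is a structurally valid WAV and check its duration and sample rate against expectations, using the standard-library `wave` module. A sketch with a hypothetical `describe_wav` helper, demonstrated on synthetic silence rather than a live server response:

```python
import io
import wave

def describe_wav(audio_bytes: bytes) -> dict:
    """Return basic properties of a WAV byte stream."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate": rate,
            "duration_s": frames / rate,
        }

# Synthetic stand-in for response.content: 0.5 s of 16-bit mono
# silence at 24 kHz (12000 frames of 2 zero bytes each).
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)
    wf.setframerate(24000)
    wf.writeframes(b"\x00\x00" * 12000)

print(describe_wav(buf.getvalue()))  # {'channels': 1, 'sample_rate': 24000, 'duration_s': 0.5}
```

If `wave.open` raises or the duration is implausible for the input text, the server is returning malformed or wrong audio rather than the client mis-saving it.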
Is something wrong with my launch command, or am I using the API incorrectly?
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.