
[Bug]: Error retrieving safetensors: Repo id must be in the form 'repo_name' and 400 Bad Request #1041

@CodeWang-Ay

Description


Your current environment

vllm version: 0.14.0
vllm-omni version: 0.14.0rc1
torch version: 2.9.1

Your code version

The commit id or version of vllm
0.14.0
The commit id or version of vllm-omni
0.14.0rc1

🐛 Describe the bug

Launch command, using the local path /data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice:

vllm-omni serve /data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice \
    --served-model-name Qwen3-TTS-12Hz-1.7B-CustomVoice \
    --stage-configs-path qwen3_tts.yaml \
    --host 0.0.0.0 \
    --port 8800 \
    --gpu-memory-utilization 0.5 \
    --trust-remote-code \
    --enforce-eager \
    --omni

Contents of qwen3_tts.yaml:

stage_args:
  - stage_id: 0
    stage_type: llm  # Use llm stage type to launch OmniLLM
    runtime:
      devices: "0"
      max_batch_size: 1
    engine_args:
      model_stage: qwen3_tts
      model_arch: Qwen3TTSForConditionalGeneration
      worker_cls: vllm_omni.worker.gpu_generation_worker.GPUGenerationWorker
      scheduler_cls: vllm_omni.core.sched.omni_generation_scheduler.OmniGenerationScheduler
      enforce_eager: true
      trust_remote_code: true
      async_scheduling: false
      enable_prefix_caching: false
      engine_output_type: audio  # Final output: audio waveform
      gpu_memory_utilization: 0.1
      distributed_executor_backend: "mp"
      max_num_batched_tokens: 10000
    final_output: true
    final_output_type: audio

Errors are logged at startup, but the server still starts:

[Stage-0] WARNING 01-29 09:58:53 [mooncake_connector.py:18] Mooncake not available, MooncakeOmniConnector will not work
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:492] speaker_encoder_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:489] talker_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:492] speaker_encoder_config is None. Initializing talker model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
[Stage-0] INFO 01-29 09:58:54 [configuration_qwen3_tts.py:441] code_predictor_config is None. Initializing code_predictor model with default values
[Stage-0] INFO 01-29 09:59:04 [model.py:530] Resolved architecture: Qwen3TTSForConditionalGeneration
[Stage-0] ERROR 01-29 09:59:04 [repo_utils.py:65] Error retrieving safetensors: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice'. Use repo_type argument if needed., retrying 1 of 2
[Stage-0] ERROR 01-29 09:59:06 [repo_utils.py:63] Error retrieving safetensors: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice'. Use repo_type argument if needed.
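The error text comes from huggingface_hub's repo-id validation: a repo id must look like `repo_name` or `namespace/repo_name`, so an absolute filesystem path can never pass it. A minimal sketch of that rule (a simplified, hypothetical regex for illustration, not the library's exact pattern):

```python
import re

# Simplified approximation of the Hugging Face repo-id format:
# "repo_name" or "namespace/repo_name", no leading slash, so a
# local absolute path like the one in the log can never validate.
REPO_ID_RE = re.compile(r"^[A-Za-z0-9][\w.\-]*(/[A-Za-z0-9][\w.\-]*)?$")

def looks_like_repo_id(s: str) -> bool:
    return REPO_ID_RE.match(s) is not None

print(looks_like_repo_id("Qwen/Qwen3-TTS"))                             # True
print(looks_like_repo_id("/data/LLM/Qwen3-TTS-12Hz-1.7B-CustomVoice"))  # False
```

Since the server keeps running, this suggests the path is used for the actual weight load while a separate Hub metadata lookup fails and falls back; the retry messages would then be noisy but non-fatal.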

Error at inference time:

python openai_speech_client.py \
--text "今天天气真好" \
--voice Ryan \
--instructions "用开心的语气说"

The request fails with 400 and returns this error body:

{"error":{"message":"1 validation error:\n {'type': 'literal_error', 'loc': ('body', 'voice'), 'msg': "Input should be 'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'", 'input': 'Ryan', 'ctx': {'expected': "'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'"}}\n\n File "/root/miniconda3/envs/vllm/lib/python3.12/site-packages/vllm/entrypoints/utils.py", line 709, in create_speech\n POST /v1/audio/speech [{'type': 'literal_error', 'loc': ('body', 'voice'), 'msg': "Input should be 'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'", 'input': 'Ryan', 'ctx': {'expected': "'alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'onyx', 'nova', 'sage', 'shimmer' or 'verse'"}}]","type":"Bad Request","param":null,"code":400}}
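The 400 is a schema validation error, not a model error: the `/v1/audio/speech` request body restricts `voice` to the OpenAI literal set listed in the message, so "Ryan" is rejected before the TTS model is ever invoked. A sketch of the equivalent check (the allowed set is taken from the error message above; `validate_voice` is a hypothetical helper):

```python
# Accepted values, as enumerated in the 400 response above.
ALLOWED_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo", "fable",
    "onyx", "nova", "sage", "shimmer", "verse",
}

def validate_voice(voice: str) -> str:
    # Mirrors the server-side literal check: any other value -> 400.
    if voice not in ALLOWED_VOICES:
        raise ValueError(
            f"Input should be one of {sorted(ALLOWED_VOICES)}, got {voice!r}"
        )
    return voice

validate_voice("alloy")   # passes
# validate_voice("Ryan")  # raises ValueError, mirroring the 400 response
```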

A separate test I wrote myself returns 200 OK, but the resulting WAV audio does not match the input text:

import requests

url = "http://localhost:8800/v1/audio/speech"

data = {
    "input": "你好在吗",  # "Hello, are you there?"
    "voice": "alloy",
    "response_format": "wav"
}

response = requests.post(url, json=data)

if response.status_code == 200:
    with open("test_vllm.wav", "wb") as f:
        f.write(response.content)
else:
    # Surface the error body instead of silently dropping it
    print(response.status_code, response.text)
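To narrow down whether the mismatch is a wrong synthesis or a malformed/mis-parameterized file, the returned bytes can be inspected with the standard-library `wave` module. `describe_wav` is a hypothetical debugging helper, not part of vllm-omni:

```python
import io
import wave

def describe_wav(data: bytes) -> dict:
    """Parse a WAV byte blob and report its basic parameters,
    to check whether the server returned plausible audio."""
    with wave.open(io.BytesIO(data), "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate if rate else 0.0,
        }
```

For example, `describe_wav(response.content)` after the POST above; an unexpected sample rate or a near-zero duration would point at a container/encoding problem rather than the model.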

Is the problem with my launch command, or with how I am calling the API?

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels: bug (Something isn't working)