[Feature] Send response with request id #301

Merged
hsliuustc0106 merged 1 commit into vllm-project:main from Bounty-hunter:send_respond_with_id
Dec 16, 2025

Conversation

Contributor @Bounty-hunter commented Dec 12, 2025


Purpose

async_omni now retrieves responses by request id, following the same approach as vLLM:

generate:
initializes a dedicated queue for each request and asynchronously awaits responses from that queue.

output_handler:
runs a loop that receives responses and dispatches each one to the correct request queue according to its request id.

Test Plan

benchmark:

#!/bin/bash
# Launch four generation requests concurrently and measure total wall time.

start=$(date +%s)


python openai_chat_completion_client_for_multimodal_generation.py \
    --query-type text \
    --prompt "Generate a 400-character introduction about Huawei." \
    > out1.log 2>&1 &

python openai_chat_completion_client_for_multimodal_generation.py \
    --query-type text \
    --prompt "Generate a 400-character introduction about Beijing." \
    > out2.log 2>&1 &

python openai_chat_completion_client_for_multimodal_generation.py \
    --query-type text \
    --prompt "Generate a 400-character introduction about wuhan." \
    > out3.log 2>&1 &

python openai_chat_completion_client_for_multimodal_generation.py \
    --query-type text \
    --prompt "Generate a 400-character introduction about shenzhen." \
    > out4.log 2>&1 &

wait


end=$(date +%s)

echo "All tasks completed."
echo "Total elapsed time: $((end - start)) seconds"

Test Result

The responses are correct; each client receives the output for its own prompt:

[root@devserver-bms-163 qwen2_5_omni]# cat out1.log 
Chat completion output from text: Huawei is a Chinese tech giant. It's known for its high - quality mobile phones like the P series and Mate series. They also have great networking equipment. Their products are sold worldwide. Huawei has a large R D team that constantly innovates. They're really important in the global tech industry.If you want to know more about Huawei, like their specific technologies or business strategies, feel free to ask me.
Audio saved to audio_0.wav
[root@devserver-bms-163 qwen2_5_omni]# cat out2.log 
Chat completion output from text: Beijing is an amazing city. It's the capital of China with a rich history. There are ancient palaces like the Forbidden City that show off its long - standing culture. The Great Wall is also nearby, a symbol of strength. It has modern skyscrapers too, showing it's a big business hub. The food there is diverse, from Peking duck to all kinds of local snacks. And the people are friendly and welcoming. It's really a place full of contrasts. So, what do you think? Do you want to know more about any specific part of Beijing?
Audio saved to audio_0.wav
[root@devserver-bms-163 qwen2_5_omni]# cat out3.log 
Chat completion output from text: Wuhan is a city in China. It's got a population of over ten million. It has a long history, being an important cultural and economic center. There are many famous places like the Yellow Crane Tower. The food there is also amazing, with things like hot and dry noodles. It's a place full of energy and vitality.If you want to know more about Wuhan, like its attractions or local culture, feel free to ask me.
Audio saved to audio_0.wav
[root@devserver-bms-163 qwen2_5_omni]# cat out4.log 
Chat completion output from text: Shenzhen is a really cool city in Guangdong Province. It's known for being super modern and full of innovation. There are lots of high - tech companies there. The city has a great business environment too. It attracts people from all over to start their own businesses. And it's also a place with a lot of cultural diversity. You can find different cuisines everywhere you go. It's growing fast, always changing and becoming more amazing.If you want to know more about Shenzhen, like its famous attractions or local life, feel free to ask me.
Audio saved to audio_0.wav

With pipelined execution, the total execution time drops from 789 s to 642 s.


If requests are more balanced across different stages, the benefits will be more significant.



@Bounty-hunter Bounty-hunter force-pushed the send_respond_with_id branch 2 times, most recently from 594de26 to 15b350c Compare December 12, 2025 09:13
@Bounty-hunter Bounty-hunter changed the title [WIP] send response with request id [Feature] Send response with request id Dec 12, 2025
Contributor (Author) @Bounty-hunter commented:

@Gaohan123 @fake0fan @hsliuustc0106
Please help review this PR; it addresses #286 and #293:
(1) Concurrent requests cannot be processed in a pipelined manner.
(2) Responses of concurrent requests may get mixed up.

I would appreciate your feedback.

Collaborator @hsliuustc0106 commented:


This change seems to fix an important bug in the entrypoint. Could you also provide a systematic test plan for Qwen2.5/3-Omni online serving?

Collaborator @david6666666 commented:

@codex review

@chatgpt-codex-connector (bot) left a comment:

💡 Codex Review

Here are some automated review suggestions for this pull request.


@david6666666 david6666666 linked an issue Dec 15, 2025 that may be closed by this pull request
1 task
@Bounty-hunter Bounty-hunter force-pushed the send_respond_with_id branch 3 times, most recently from 5f979a7 to 344a91e Compare December 16, 2025 12:01
Collaborator @hsliuustc0106 left a comment:

lgtm

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Dec 16, 2025
@hsliuustc0106 hsliuustc0106 enabled auto-merge (squash) December 16, 2025 15:14
@hsliuustc0106 hsliuustc0106 merged commit bd53347 into vllm-project:main Dec 16, 2025
4 checks passed
faaany pushed a commit to faaany/vllm-omni that referenced this pull request Dec 19, 2025
yenuo26 pushed a commit to yenuo26/vllm-omni that referenced this pull request Dec 29, 2025
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026

Development

Successfully merging this pull request may close these issues.

[Feature]: Async omni client get the stages response according to request id