[Feature] Add Gradio Demo for Qwen2.5Omni#60

Merged
Gaohan123 merged 15 commits into main from pr_gradio on Nov 26, 2025
Conversation

SamitHuang (Collaborator) commented Nov 11, 2025

Purpose

Support a Gradio web service for Qwen2.5-Omni and Qwen3-Omni.

Test Plan

Run:

```
python gradio_demo.py --model Qwen/Qwen2.5-Omni-7B --port 7861
```

then open http://127.0.0.1:7861/ in a browser.
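For context, the command above implies a CLI shaped roughly like the following minimal sketch. This is an illustration only, not the PR's actual `gradio_demo.py`: the `build_parser` name, the `--host` flag, and the defaults are assumptions, and the real script goes on to build a Gradio UI and call `demo.launch()`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # CLI mirroring the test-plan command:
    #   python gradio_demo.py --model Qwen/Qwen2.5-Omni-7B --port 7861
    parser = argparse.ArgumentParser(
        description="Gradio demo for Qwen omni models (illustrative sketch)"
    )
    parser.add_argument("--model", default="Qwen/Qwen2.5-Omni-7B",
                        help="HuggingFace model ID to serve")
    parser.add_argument("--host", default="127.0.0.1",
                        help="Interface the web server binds to")
    parser.add_argument("--port", type=int, default=7861,
                        help="Port for the Gradio server")
    return parser


if __name__ == "__main__":
    # Parse the example command line (the real script would use sys.argv):
    args = build_parser().parse_args(
        ["--model", "Qwen/Qwen2.5-Omni-7B", "--port", "7861"]
    )
    # The real script would build the model UI here and call
    # demo.launch(server_name=args.host, server_port=args.port).
    print(f"would serve {args.model} on http://{args.host}:{args.port}/")
```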

Test Result

[screenshot: "gradio mm" — the multimodal Gradio demo UI]

Future Updates

TODOs (from demo to app level):

  • Support camera recording input
  • Possibly save history and support multi-turn dialogues
  • Support realtime interaction
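The history/multi-turn TODO above could be sketched as a plain Python state holder that replays prior turns as chat messages on each new request. This is a hypothetical sketch, not code from this PR; `ChatHistory`, `add_turn`, and `as_messages` are invented names.

```python
from dataclasses import dataclass, field


@dataclass
class ChatHistory:
    """Accumulates (role, content) turns for a multi-turn dialogue."""
    turns: list = field(default_factory=list)

    def add_turn(self, role: str, content: str) -> None:
        self.turns.append((role, content))

    def as_messages(self) -> list:
        # Replay prior turns in the OpenAI-style chat-message shape
        # that a serving endpoint would typically expect.
        return [{"role": r, "content": c} for r, c in self.turns]


history = ChatHistory()
history.add_turn("user", "Describe this image.")
history.add_turn("assistant", "It shows a cat.")
history.add_turn("user", "What color is it?")
print(history.as_messages())
```

In a Gradio UI this state would live in a per-session component so the follow-up question ("What color is it?") is answered with the earlier turns in context.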

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing a test command.
  • The test results, such as pasting a before/after results comparison, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft.


@SamitHuang SamitHuang added the enhancement New feature or request label Nov 11, 2025
@@ -0,0 +1,194 @@
import argparse
Collaborator commented:

I think we can make this Gradio function a universal frontend built under the vllm-omni folder,

maybe served like:

vllm serve xxmodel --display Gradio

SamitHuang (Collaborator, Author) replied Nov 12, 2025:

Good suggestion, but the UI can vary across models; each model needs its own build_interface function. We need a clean way to tell model developers how to link their UI and inference input args to this universal --display gradio arg.
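One possible shape for linking per-model build_interface functions to a universal `--display gradio` flag is a prefix registry that a generic frontend could consult. This is only a sketch of the idea under discussion; `UI_REGISTRY`, `register_ui`, and `resolve_ui` are illustrative names, not part of vllm-omni.

```python
from typing import Callable, Dict

# Maps a model-ID prefix to the function that builds its Gradio UI.
UI_REGISTRY: Dict[str, Callable] = {}


def register_ui(model_prefix: str):
    """Decorator letting each model register its own build_interface."""
    def deco(build_interface: Callable) -> Callable:
        UI_REGISTRY[model_prefix] = build_interface
        return build_interface
    return deco


@register_ui("Qwen/Qwen2.5-Omni")
def build_qwen25_omni_ui():
    # A real implementation would return a gr.Blocks with the
    # text/audio/image/video inputs this model supports.
    return "qwen2.5-omni interface"


def resolve_ui(model_id: str) -> Callable:
    """Look up the UI builder for a served model by ID prefix."""
    for prefix, builder in UI_REGISTRY.items():
        if model_id.startswith(prefix):
            return builder
    raise KeyError(f"no registered UI for {model_id}")


print(resolve_ui("Qwen/Qwen2.5-Omni-7B")())
```

A hypothetical `vllm serve Qwen/Qwen2.5-Omni-7B --display gradio` entry point would then call `resolve_ui(model_id)()` and launch whatever interface comes back, keeping per-model UI code out of the generic frontend.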

README.md Outdated

## Run examples (Qwen2.5-omni)

### Offlince Inference
Collaborator commented:
Typo: "Offlince" → "Offline"

README.md Outdated
Launch the gradio service:

```
python gradio_demo.py --model Qwen/Qwen2.5-Omni-7B --port 7861
Collaborator commented:
We should use the bash.sh instead of the python script, to align with the offline inference.

In addition, it seems offline inference is not paired with the Gradio demo; they should not be placed at the same level.

README.md Outdated
Then open `http://localhost:7861/` on the local browser.


## Further details
Collaborator commented:
Please delete this section.

SamitHuang force-pushed the pr_gradio branch 2 times, most recently from 3e75b07 to 0ea59c0, on November 17, 2025 at 13:59.
SamitHuang (Collaborator, Author) commented:

updated based on #64

hsliuustc0106 (Collaborator) commented Nov 21, 2025:

please align with PR #76 to support multimodal inputs with multiple requests

SamitHuang (Collaborator, Author) commented:

rebase and update to support text + audio + image + video inputs

hsliuustc0106 (Collaborator) commented Nov 25, 2025:

@codex address that feedback

Gaohan123 (Collaborator) left a review:

lgtm, thanks!

Gaohan123 merged commit b2d8457 into main on Nov 26, 2025.
2 of 3 checks passed
Gaohan123 deleted the pr_gradio branch on December 1, 2025 at 09:55.
princepride pushed a commit to princepride/vllm-omni that referenced this pull request Jan 10, 2026