[Doc][Bagel] Add BAGEL-7B-MoT documentation and edit the default stage configuration#987
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 144ecab1d8
Force-pushed from 54f6501 to fd93af3
Pull request overview
This pull request adds comprehensive documentation, example scripts, and configuration files for the BAGEL-7B-MoT multimodal model in vLLM-Omni. The PR addresses issue #936 by providing complete deployment guides for both online serving and offline inference modes.
Changes:
- Added single-GPU configuration file (
bagel_single_gpu.yaml) and updated dual-GPU config memory utilization - Created example Python scripts for text-to-image and image-to-text online serving
- Added comprehensive README documentation for both online serving and offline inference examples
- Added user guide documentation and shell scripts for various inference modes
Reviewed changes
Copilot reviewed 14 out of 16 changed files in this pull request and generated 7 comments.
Summary per file:
| File | Description |
|---|---|
| vllm_omni/model_executor/stage_configs/bagel_single_gpu.yaml | New single-GPU configuration with reduced memory utilization (0.40/0.50) |
| vllm_omni/model_executor/stage_configs/bagel.yaml | Increased GPU memory utilization from 0.4 to 0.8 for the dual-GPU setup |
| examples/online_serving/bagel/t2i.py | Text-to-image example using the OpenAI SDK |
| examples/online_serving/bagel/i2t.py | Image-to-text example with a hardcoded-path issue |
| examples/online_serving/bagel/README.md | Comprehensive online serving documentation |
| examples/offline_inference/bagel/README.md | Detailed offline inference guide with setup instructions |
| examples/offline_inference/bagel/run_t2i.sh | Shell script for text-to-image inference |
| examples/offline_inference/bagel/run_t2t.sh | Shell script for text-to-text inference |
| examples/offline_inference/bagel/run_i2t.sh | Shell script for image-to-text inference |
| examples/offline_inference/bagel/run_t2t_multiple_prompt.sh | Batch text-to-text inference script |
| examples/offline_inference/bagel/text_prompts_10.txt | Sample text prompts file |
| examples/online_serving/bagel/cat.jpg | Sample image for the examples |
| docs/user_guide/examples/online_serving/bagel.md | User guide for online serving |
| docs/user_guide/examples/offline_inference/bagel.md | User guide for offline inference |
Force-pushed from e332929 to ab16373
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 11 comments.
Force-pushed from b779ce2 to a019b81
PTAL ❤️ @princepride
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
- If you encounter warnings about flash_attn, try installing a lower version such as 2.8.1 with the command below.

  ```
  uv pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.1/flash_attn-2.8.1+cu12torch2.9cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
  ```
Bagel directly uses vLLM's flash-attn; I don't think we need to install an extra flash-attn.
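If that is the case, a quick sanity check could look like the sketch below. The vllm.vllm_flash_attn module path reflects how recent vLLM wheels vendor flash-attn and is an assumption here, not something verified in this PR:

```python
# Hedged check: recent vLLM wheels bundle their own flash-attn build, so a
# separate flash_attn install should be unnecessary. The module path is assumed.
try:
    from vllm import vllm_flash_attn  # noqa: F401  (bundled with vLLM wheels)
    print("vLLM's bundled flash-attn is importable; no extra install needed.")
except ImportError:
    print("Bundled flash-attn not found; vLLM will pick another attention backend.")
```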
Force-pushed from 72e2539 to 49fe0df
I have deleted the unnecessary content. PTAL again. Thank you very much! ❤️
princepride left a comment
A small change is needed.
Thank you for your advice. ❤️ However, both qwen2.5_omni and qwen3_omni are written this way. @princepride
Okay
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 14 comments.
Force-pushed from 49fe0df to 801d59a
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
The prompt was split into multiple requests because it was not enclosed in quotes. Signed-off-by: Ding Zuhao <[email protected]> Signed-off-by: jzz <[email protected]>
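To illustrate the bug this commit fixes: without quotes, the shell word-splits the prompt, so each word is parsed as a separate argument (and hence a separate request). The script name and --prompt flag below are hypothetical stand-ins:

```python
# Demonstration of shell word-splitting on an unquoted prompt.
import shlex

unquoted = './run_t2t.sh --prompt a cat on a mat'    # prompt not enclosed in quotes
quoted = './run_t2t.sh --prompt "a cat on a mat"'    # prompt enclosed in quotes

print(shlex.split(unquoted))
# ['./run_t2t.sh', '--prompt', 'a', 'cat', 'on', 'a', 'mat']  -> split into pieces
print(shlex.split(quoted))
# ['./run_t2t.sh', '--prompt', 'a cat on a mat']              -> one prompt argument
```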
…al online serving and offline inference. Signed-off-by: jzz <[email protected]>
- Add offline_inference and online_serving README files for the BAGEL model
- Add docs for both offline and online serving examples
- Create i2t.py and t2i.py example scripts using the OpenAI SDK
- Fix broken links with local Windows paths
- Fix typos and grammar issues (Staged -> Stages, add articles)
- Add language identifiers to code blocks (bash, python)
- Fix inline comments that would break shell commands

Signed-off-by: jzz <[email protected]>
Increased GPU memory utilization from 0.4 to 0.8 for model stages. Signed-off-by: Ding Zuhao <[email protected]> Signed-off-by: jzz <[email protected]>
Force-pushed from f9db4eb to 6e88ea9
Head branch was pushed to by a user without write access
Force-pushed from b68531f to cfd81dc
Force-pushed from 9de79f8 to 086bad5
Could you please merge this for me again? The CI has passed. Thank you very much! @hsliuustc0106 ❤️
Special thanks to my co-author @princepride ([email protected]) for the significant contributions to this work. ❤️
…e configuration (vllm-project#987) Signed-off-by: Ding Zuhao <[email protected]> Signed-off-by: jzz <[email protected]>


Purpose
Add comprehensive documentation and example test scripts for running the BAGEL-7B-MoT model in vLLM-Omni.
Addresses #936.
Test Plan
Tested on dual NVIDIA RTX 5000 Ada GPUs (32GB each) and one NVIDIA A100 (80GB).
Container: runpod/pytorch:1.0.2-cu1281-torch280-ubuntu2404
Ran all the commands in the README.
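One quick pre-flight step before running the README commands is to confirm the server is up. The port and the OpenAI-style /v1/models route below are assumptions based on typical vLLM defaults, not values taken from this PR:

```python
# Hedged sanity check that the serving endpoint is reachable before testing.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:8000/v1/models", timeout=5) as resp:
    models = json.loads(resp.read())
print([m["id"] for m in models.get("data", [])])  # list the served model ids
```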
Test Result
All passed.
Changes

Documentation
- Added READMEs, user guides, and example scripts for BAGEL online serving and offline inference.

Configuration
- Edited the devices of vllm_omni/model_executor/stage_configs/bagel.yaml.

Essential Elements of an Effective PR Description Checklist
- Update supported_models.md and examples for a new model.
@princepride PTAL ❤️