
[Misc] Add per-request generator_device to online image gen and edit#1183

Merged
ZJY0516 merged 3 commits into vllm-project:main from gcanlin:generator-device
Feb 10, 2026

Conversation

@gcanlin
Collaborator

@gcanlin gcanlin commented Feb 4, 2026


Purpose

The seed field in OmniDiffusionSamplingParams was never converted to a torch.Generator in the online serving path; only the offline examples did that manually. The runner already had a seed-to-generator conversion, but it unconditionally used self.device, which fails when users need a CPU generator. This PR adds a per-request generator_device field so online requests can choose where the generator lives.
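The selection order this PR introduces can be sketched as a small pure function. This is a simplified sketch for illustration only: the real runner works on req.sampling_params and torch.device objects rather than plain strings, and also normalizes a CPU runner device.

```python
from typing import Optional


def pick_generator_device(runner_device: str, generator_device: Optional[str]) -> str:
    """Choose where the torch.Generator should live.

    Mirrors the order in this PR's runner change:
    1. an explicit per-request generator_device wins,
    2. otherwise fall back to the runner's own device.
    """
    if generator_device is not None:
        return generator_device
    return runner_device


# A per-request "cpu" override beats a GPU runner device:
print(pick_generator_device("cuda", "cpu"))  # cpu
print(pick_generator_device("cuda", None))   # cuda
```

The point of keeping the override first is that a user can force a CPU generator for reproducibility even when the model itself runs on an accelerator.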

Test Plan

Added a print statement to check whether the generator is on CPU:

        self.kv_transfer_manager.receive_kv_cache(
            req, target_device=getattr(self.pipeline, "device", None)
        )

        # Build a generator from the per-request seed if none was supplied.
        if req.sampling_params.generator is None and req.sampling_params.seed is not None:
            if req.sampling_params.generator_device is not None:
                # Per-request override, e.g. "cpu" even when the model runs on GPU.
                gen_device = req.sampling_params.generator_device
            elif self.device.type == "cpu":
                gen_device = "cpu"
            else:
                gen_device = self.device
            req.sampling_params.generator = torch.Generator(device=gen_device).manual_seed(
                req.sampling_params.seed
            )

        # Temporary check for the test plan: confirm where the generator lives.
        print(f"Generator device: {req.sampling_params.generator.device}")

Test Result

(APIServer pid=4164109) INFO:     Started server process [4164109]
(APIServer pid=4164109) INFO:     Waiting for application startup.
(APIServer pid=4164109) INFO:     Application startup complete.
(APIServer pid=4164109) INFO 02-09 02:58:55 [api_server.py:1138] Generating 1 image(s) 1190x2032
(APIServer pid=4164109) INFO 02-09 02:58:55 [async_omni.py:318] [AsyncOrchestrator] Entering scheduling loop: stages=1, final_stage=0
[Stage-0] INFO 02-09 02:58:55 [diffusion_engine.py:75] Pre-processing completed in 0.1226 seconds
[Stage-0] INFO 02-09 02:58:56 [manager.py:538] Deactivating all adapters: 0 layers
[Stage-0] WARNING 02-09 02:58:56 [kv_transfer_manager.py:356] No connector available for receiving KV cache
Generator device: cpu
[Stage-0] WARNING 02-09 02:58:56 [pipeline_qwen_image_edit_plus.py:570] negative_prompt is not set. The official Qwen-Image-Edit model may produce lower-quality results without a negative_prompt. Qwen official repository recommends to use whitespace string as negative_prompt. Note: some distilled variants may not be affected by this.


…it endpoints

Signed-off-by: gcanlin <canlinguosdu@gmail.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cc9e42a56b


@hsliuustc0106
Collaborator

Does it help resolve the generator accuracy problem?

@gcanlin
Collaborator Author

gcanlin commented Feb 4, 2026

Does it help resolve the generator accuracy problem?

It lets users set a CPU generator to observe whether the accuracy problem is related to the device generator, since generators on different devices can behave differently.

Signed-off-by: gcanlin <canlinguosdu@gmail.com>
@gcanlin gcanlin marked this pull request as draft February 4, 2026 02:36
@gcanlin gcanlin marked this pull request as ready for review February 4, 2026 02:43
if req.sampling_params.generator is None and req.sampling_params.seed is not None:
req.sampling_params.generator = torch.Generator(device=self.device).manual_seed(req.sampling_params.seed)
if req.sampling_params.generator_device is not None:
gen_device = req.sampling_params.generator_device
Collaborator


I think users shouldn't be aware of the backend hardware model; it's very confusing.

Collaborator Author


Oh, I see your concern. We could offer users only two choices, "device" and "host". How about that? I think it's necessary to let users choose a CPU generator. Our current offline examples already expose the generator to users. It might be better to remove req.sampling_params.generator and keep only a device string limited to "device" and "host", with the dispatch written in the model runner.

generator = torch.Generator(device=current_omni_platform.device_type).manual_seed(args.seed)

Collaborator

@david6666666 david6666666 Feb 4, 2026


I mean: req.sampling_params.generator is reasonable; generator_device is what's confusing.

Collaborator Author

@gcanlin gcanlin Feb 4, 2026


I mean: req.sampling_params.generator is reasonable; generator_device is what's confusing.

Offline, users can choose their own generator; online, they currently can't, which is what this PR addresses. We should at least keep offline and online consistent.

Otherwise we face the question: how can a user set a CPU generator on the online server? In offline inference it's easy:

generator = torch.Generator(device=current_omni_platform.device_type).manual_seed(args.seed)

Collaborator


Yeah, makes sense for online serving, but does vLLM have such a parameter we can refer to, or should we change the parameter name?

Collaborator Author


In SGLang-Diffusion it's also called generator_device, actually. vLLM doesn't seem to have a similar parameter. For now, I think we can keep this parameter temporarily just to make online and offline consistent, and unify generator_device and generator in a follow-up PR.
https://github.com/sgl-project/sglang/blob/c1d529c19605cbf1f9be8db6d6d225b1465ea2e0/python/sglang/multimodal_gen/runtime/entrypoints/openai/image_api.py#L260

Collaborator Author


@david6666666 Could you please take another look? I have added the test result.

@david6666666
Collaborator

@hsliuustc0106 @Bounty-hunter PTAL thx

@Bounty-hunter
Contributor

LGTM

@gcanlin gcanlin changed the title [Misc] Add per-request generator_device to online image generation/ed… [Misc] Add per-request generator_device to online image gen and edit Feb 10, 2026
@gcanlin
Collaborator Author

gcanlin commented Feb 10, 2026

Also cc @ZJY0516 @princepride PTAL. Many thanks!

Collaborator

@ZJY0516 ZJY0516 left a comment


LGTM

req.sampling_params.generator = torch.Generator(device=self.device).manual_seed(req.sampling_params.seed)
if req.sampling_params.generator_device is not None:
gen_device = req.sampling_params.generator_device
elif self.device.type == "cpu":
Collaborator


nit: Why not just gen_device = self.device?

Collaborator Author


We want to give users the ability to choose whether the generator lives on the host or the device. If we directly use gen_device = self.device, it will always be the device. In the online scenario, users can't pass a whole generator object, so we need to initialize the generator on the server side. Adding this field lets users request a CPU generator even when the model runs on GPU.
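For illustration, a per-request override might look like the fragment below. Only seed and generator_device come from this PR; the surrounding fields and any endpoint path are assumptions for the sketch.

```json
{
  "prompt": "a watercolor fox",
  "size": "1024x1024",
  "seed": 42,
  "generator_device": "cpu"
}
```

With generator_device omitted, the runner falls back to its own device as before, so existing clients are unaffected.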

@ZJY0516
Collaborator

ZJY0516 commented Feb 10, 2026

Honestly, the parameters for online diffusion serving have become quite messy at this point.

@Gaohan123 Gaohan123 added this to the v0.16.0 milestone Feb 10, 2026
@david6666666
Collaborator

LGTM

@gcanlin
Collaborator Author

gcanlin commented Feb 10, 2026

Honestly, the parameters for online diffusion serving have become quite messy at this point.

Yeah. Any suggestions for refactoring?

@Gaohan123 Gaohan123 added the ready label to trigger buildkite CI label Feb 10, 2026
@gcanlin
Collaborator Author

gcanlin commented Feb 10, 2026

Ready to merge:)

@ZJY0516 ZJY0516 merged commit e192b0a into vllm-project:main Feb 10, 2026
7 checks passed
YanickSchraner pushed a commit to YanickSchraner/vllm-omni that referenced this pull request Feb 20, 2026
