[Misc] Add per-request generator_device to online image gen and edit#1183
ZJY0516 merged 3 commits into vllm-project:main from
Conversation
…it endpoints Signed-off-by: gcanlin <canlinguosdu@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: cc9e42a56b
Does it help resolve the generator accuracy problem?

It helps users set a CPU generator to observe whether the accuracy problem is related to the device generator, since generators on different devices can behave differently.
Signed-off-by: gcanlin <canlinguosdu@gmail.com>
if req.sampling_params.generator is None and req.sampling_params.seed is not None:
    req.sampling_params.generator = torch.Generator(device=self.device).manual_seed(req.sampling_params.seed)
if req.sampling_params.generator_device is not None:
    gen_device = req.sampling_params.generator_device
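The hunk above can be read as one small dispatch rule: a user-supplied generator wins, otherwise a seed is materialized on the per-request `generator_device` if set, else on the runner's device. A minimal sketch of that rule, assuming `sampling_params` carries `generator`, `seed`, and `generator_device` fields as in the diff (the helper name `resolve_generator` is hypothetical, not code from this PR):

```python
from types import SimpleNamespace
from typing import Optional

import torch


def resolve_generator(sampling_params, runner_device: torch.device) -> Optional[torch.Generator]:
    """Turn a per-request seed into a torch.Generator, honoring generator_device."""
    if sampling_params.generator is not None:
        return sampling_params.generator  # caller supplied one (offline path)
    if sampling_params.seed is None:
        return None  # nothing to seed
    # Per-request override wins; otherwise fall back to the runner's device.
    device = sampling_params.generator_device or runner_device
    return torch.Generator(device=device).manual_seed(sampling_params.seed)


# Force a CPU generator even when the model itself could run on an accelerator.
params = SimpleNamespace(generator=None, seed=42, generator_device="cpu")
gen = resolve_generator(params, torch.device("cpu"))
print(gen.device, gen.initial_seed())  # cpu 42
```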
I think users shouldn't be aware of the backend hardware model; it's very confusing.
Oh, I get your concerns. How about giving users only two choices, "device" and "host"? I think it's necessary to let users choose a CPU generator. Actually, our current offline examples already expose the generator to users. It might be better to remove req.sampling_params.generator and keep only a device string that accepts "device" or "host", with the dispatch written in the model runner.
req.sampling_params.generator is reasonable, generator_device is confusing I mean.
req.sampling_params.generator is reasonable, generator_device is confusing I mean.
In offline inference, users can choose their generator; online, users can't do anything, which is what this PR addresses. We should at least keep offline and online consistent.
Otherwise we hit the problem that in the online server there is no way to set a CPU generator, while in offline inference we can easily write the code below.
generator = torch.Generator(device=current_omni_platform.device_type).manual_seed(args.seed)
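The online counterpart of that offline line is just a request field. A hypothetical request body sketch; only `generator_device` is the field this PR adds, the endpoint shape and other field names are assumptions:

```python
import json

# Hypothetical body for an online image-generation request. Only
# `generator_device` comes from this PR; the rest is illustrative.
payload = {
    "prompt": "a watercolor fox",
    "seed": 42,
    "generator_device": "cpu",  # ask the server to seed a host-side generator
}
print(json.dumps(payload, sort_keys=True))
```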
Yeah, makes sense for online serving, but does vLLM have such a param we can refer to, or can we change the param name?
In SGLang-Diffusion it's also called generator_device, actually. vLLM doesn't seem to have a similar parameter. For now, I think we can keep this parameter temporarily just to make online and offline consistent, and unify generator_device and generator in a follow-up PR.
https://github.com/sgl-project/sglang/blob/c1d529c19605cbf1f9be8db6d6d225b1465ea2e0/python/sglang/multimodal_gen/runtime/entrypoints/openai/image_api.py#L260
@david6666666 Could you please take another look? I have added the test result.
@hsliuustc0106 @Bounty-hunter PTAL, thx

LGTM

Also cc @ZJY0516 @princepride PTAL. Many thanks!
    req.sampling_params.generator = torch.Generator(device=self.device).manual_seed(req.sampling_params.seed)
if req.sampling_params.generator_device is not None:
    gen_device = req.sampling_params.generator_device
elif self.device.type == "cpu":
nit: Why not just gen_device = self.device?
We want to give users the ability to choose a generator on the host or on the device. If we directly use gen_device = self.device, it will always be the device. In the online scenario users can't pass a whole generator, so we need to initialize the generator server-side. Adding this field lets users get a CPU generator even when the model runs on GPU.
Honestly, the parameters for online diffusion serving have become quite messy at this point.

LGTM

Yeah. Any suggestions for refactoring?

Ready to merge :)
…llm-project#1183) Signed-off-by: gcanlin <canlinguosdu@gmail.com>
Purpose
The seed field in OmniDiffusionSamplingParams was never converted to a torch.Generator in the online serving path; only the offline examples did that manually. The runner already had a seed-to-generator conversion, but it unconditionally used self.device, which fails when users need a CPU generator.
Test Plan
Added a print statement to check whether the generator is on CPU.
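A minimal sketch of that check; the exact print site inside the runner is an assumption, but the observable behavior is just the resolved generator's device:

```python
import torch

# Resolve a seeded generator the way the server would for
# generator_device="cpu", then print where it lives.
gen = torch.Generator(device="cpu").manual_seed(0)
print(gen.device)  # cpu
```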
Test Result