[Misc] Add per-request generator_device to online image gen and edit #1183
Changes from 2 commits
```diff
@@ -159,7 +159,13 @@ def execute_model(self, req: OmniDiffusionRequest) -> DiffusionOutput:
         self.kv_transfer_manager.receive_kv_cache(req, target_device=getattr(self.pipeline, "device", None))

         if req.sampling_params.generator is None and req.sampling_params.seed is not None:
-            req.sampling_params.generator = torch.Generator(device=self.device).manual_seed(req.sampling_params.seed)
+            if req.sampling_params.generator_device is not None:
+                gen_device = req.sampling_params.generator_device
+            elif self.device.type == "cpu":
+                gen_device = "cpu"
+            else:
+                gen_device = self.device
+            req.sampling_params.generator = torch.Generator(device=gen_device).manual_seed(req.sampling_params.seed)

         # Refresh cache context if needed
         if (
```

Collaborator: nit: Why not just

Author: We want to give users the ability to choose the generator on
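The device-selection logic in the hunk can be sketched as a standalone function; `pick_generator_device` is a hypothetical name for illustration, and only `torch` is assumed:

```python
import torch

def pick_generator_device(generator_device, runner_device):
    # Mirrors the hunk: an explicit per-request override wins; a CPU
    # runner always uses a CPU generator; otherwise fall back to the
    # runner's own device.
    if generator_device is not None:
        return generator_device
    if runner_device.type == "cpu":
        return "cpu"
    return runner_device

# A seeded generator on the chosen device makes the sampled noise
# reproducible across requests:
dev = pick_generator_device(None, torch.device("cpu"))
a = torch.randn(3, generator=torch.Generator(device=dev).manual_seed(7))
b = torch.randn(3, generator=torch.Generator(device=dev).manual_seed(7))
print(torch.equal(a, b))
```

The CPU fallback matters for portability: a seeded CPU generator produces the same noise no matter which accelerator runs the model, while a device generator keeps the random draws on the accelerator.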
I think users shouldn't be aware of the backend hardware model; it's very confusing.
Oh, I get your concerns. We could give users only two choices, "device" and "host" — how about that? I think it's necessary to let users choose a CPU generator. Actually, our current offline examples already expose the generator to users. It might be better to remove `req.sampling_params.generator` and keep only a device string limited to `device` and `host`, writing the dispatch in the model runner. See `vllm-omni/examples/offline_inference/image_to_image/image_edit.py`, line 330 at 7883056.
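A minimal sketch of the proposed dispatch, assuming the model runner maps the suggested `device`/`host` strings to a concrete generator device (`resolve_generator_device` and the string values are illustrative, not the PR's actual API):

```python
import torch

def resolve_generator_device(device_str, runner_device):
    # Hypothetical model-runner dispatch for the proposed two-value API:
    # "host"   -> CPU generator (portable across accelerators)
    # "device" -> generator lives on the runner's accelerator
    if device_str == "host":
        return torch.device("cpu")
    if device_str == "device":
        return runner_device
    raise ValueError(
        f"generator device must be 'device' or 'host', got {device_str!r}"
    )

host_dev = resolve_generator_device("host", torch.device("cpu"))
gen = torch.Generator(device=host_dev).manual_seed(0)
print(host_dev, gen.device)
```

Hiding the hardware model behind two abstract strings would keep the online API stable even if the backend runs on CUDA, ROCm, or another accelerator.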
`req.sampling_params.generator` is reasonable; what I mean is that `generator_device` is confusing.
In offline inference, users can choose their generator, but online they can't do anything about it, which is what this PR addresses. At minimum we should keep offline and online consistent.
Otherwise we face the problem of how to set a CPU generator on the online server, whereas in offline inference we can easily set it as in the code below.
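The offline snippet referred to is not reproduced in this thread; a minimal sketch of what pinning the generator to the CPU in offline inference looks like, assuming only `torch` (how the generator is attached to vllm-omni's sampling params follows the field name in the diff, shown here as a comment):

```python
import torch

# Offline, the user constructs the generator directly, so choosing the
# CPU is a one-liner; the seeded CPU generator yields the same noise
# regardless of which accelerator runs the model.
generator = torch.Generator(device="cpu").manual_seed(1234)

# e.g. sampling_params.generator = generator  (field name from the diff)
noise = torch.randn(4, generator=generator)
again = torch.randn(4, generator=torch.Generator(device="cpu").manual_seed(1234))
assert torch.equal(noise, again)
```

Online serving accepts only JSON-serializable parameters, so a `torch.Generator` object can't be sent over the wire — hence the PR's string-valued `generator_device` field as the online equivalent.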
Yeah, that makes sense for online serving, but does vLLM have such a parameter we can refer to, or can we change the parameter name?
In SGLang-Diffusion it's actually also `generator_device`. It seems vLLM doesn't have a similar parameter. For now, I think we can keep this parameter temporarily, just to make online and offline consistent, and unify `generator_device` and `generator` in a follow-up PR.
https://github.com/sgl-project/sglang/blob/c1d529c19605cbf1f9be8db6d6d225b1465ea2e0/python/sglang/multimodal_gen/runtime/entrypoints/openai/image_api.py#L260
@david6666666 Could you please take another look? I have added the test result.