[Feature] add Tensor Parallelism to SD_3.5 #1336
Conversation
Signed-off-by: GG-li <3226868735@qq.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a2c2e8ab32
@wtomin @Bounty-hunter Please take a look
```python
self.net = nn.ModuleList([])
self.net.append(act_fn)
self.net.append(nn.Dropout(dropout))
self.net.append(RowParallelLinear(inner_dim, dim_out, bias=bias))
```
Is `input_is_parallel=True` required for `RowParallelLinear`?
`input_is_parallel` is `True` by default. I've fixed the implementation.
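For context, row parallelism shards the weight along the input dimension: each rank multiplies its shard of the (already column-split) activation by its weight shard, and an all-reduce sums the partial results — which is why the layer expects its input to already be parallel. A minimal NumPy sketch of that math (illustrative only, not vLLM's implementation; `tp` is a hypothetical world size):

```python
import numpy as np

rng = np.random.default_rng(0)
tp = 4                      # hypothetical tensor-parallel world size
in_dim, out_dim = 8, 6

x = rng.normal(size=(2, in_dim))        # full activation
w = rng.normal(size=(in_dim, out_dim))  # full weight

# Row-parallel: shard the weight (and the input) along the input dimension.
x_shards = np.split(x, tp, axis=1)
w_shards = np.split(w, tp, axis=0)

# Each rank computes a partial matmul; the all-reduce is a plain sum here.
partials = [xs @ ws for xs, ws in zip(x_shards, w_shards)]
out = sum(partials)

assert np.allclose(out, x @ w)
```

The sum of the per-shard partial products is exactly the full matmul, so the layer is correct only if the incoming activation is already split the same way — hence `input_is_parallel=True`.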
```python
# Compute QKV for text stream (context projections)
qkv, _ = self.add_kv_proj(encoder_hidden_states)
txt_query, txt_key, txt_value = qkv.chunk(3, dim=-1)
qkv_add, _ = self.add_kv_proj(encoder_hidden_states)
```
`self.add_kv_proj` may not be initialized if `added_kv_proj_dim` is `None`; a check is needed before calling `self.add_kv_proj`.
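A guard along these lines would avoid the `AttributeError` (a sketch only — the toy `Attention` class and `project_context` method are illustrative stand-ins, not the module's real code):

```python
class Attention:
    """Toy stand-in for the attention module, just to show the guard."""

    def __init__(self, added_kv_proj_dim=None):
        self.added_kv_proj_dim = added_kv_proj_dim
        # add_kv_proj only exists when the extra projection dim is set.
        if added_kv_proj_dim is not None:
            self.add_kv_proj = lambda x: (x, None)

    def project_context(self, encoder_hidden_states):
        # Guard: only call add_kv_proj when it was actually created.
        if self.added_kv_proj_dim is None:
            return None
        qkv, _ = self.add_kv_proj(encoder_hidden_states)
        return qkv

assert Attention().project_context([1.0]) is None
assert Attention(added_kv_proj_dim=16).project_context([1.0]) == [1.0]
```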
```python
else:
    self.to_out = None

self.norm_added_q = RMSNorm(head_dim, eps=eps) if qk_norm else nn.Identity()
```
Can you check whether diffusers' `RMSNorm` can be replaced by vLLM's: `from vllm.model_executor.layers.layernorm import RMSNorm`. I guess vLLM's `RMSNorm` is better, and it is commonly used in vllm-omni.
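For reference, RMSNorm normalizes each feature vector by its root-mean-square and applies a learned per-feature scale; a minimal NumPy version of that formula (illustrative only — not either library's implementation):

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # Normalize by the root-mean-square over the last dimension,
    # then apply the learned per-feature scale.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return (x / rms) * weight

x = np.array([[1.0, 2.0, 3.0, 4.0]])
w = np.ones(4)
y = rms_norm(x, w)

# With a unit scale, each normalized row has RMS ~1.
assert np.allclose(np.sqrt(np.mean(y * y, axis=-1)), 1.0, atol=1e-3)
```

Both the diffusers and vLLM layers implement this same formula, so a swap mainly trades on kernel quality and consistency with the rest of vllm-omni.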
```python
if final_dropout:
    self.net.append(nn.Dropout(dropout))

def forward(self, hidden_states: torch.Tensor, *args, **kwargs) -> torch.Tensor:
```
Maybe you can rewrite it as follows:

```python
for layer in self.net:
    output = layer(hidden_states)
    hidden_states = output[0] if isinstance(output, tuple) else output
return hidden_states
```
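The point of the suggestion is that parallel linear layers return `(output, bias)` tuples while plain `nn` layers return tensors, so the loop has to unwrap both cases uniformly. A self-contained sketch of that pattern with stand-in layers (the helper and layer names are illustrative):

```python
def run_net(net, hidden_states):
    # Uniformly unwrap layers that return (output, bias) tuples
    # (e.g. parallel linear layers) and layers that return plain values.
    for layer in net:
        output = layer(hidden_states)
        hidden_states = output[0] if isinstance(output, tuple) else output
    return hidden_states

double = lambda x: x * 2            # plain layer: returns a value
parallel = lambda x: (x + 1, None)  # parallel layer: returns (output, bias)

assert run_net([double, parallel], 3) == 7  # (3 * 2) + 1
```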
```python
total_num_heads=num_heads,
disable_tp=True,
bias=True,
return_bias=True,
```
Please check: the returned bias is not used afterwards, which could easily lead to misunderstanding. If `skip_bias_add` is `False`, the returned bias is `None`, so we don't need to return it.
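My understanding of the contract (an assumption based on this thread, modeled here with a toy function rather than vLLM's code): the bias slot in the returned `(output, bias)` tuple is only populated when `skip_bias_add=True`; otherwise the bias is already folded into the output and the second element is `None`, so returning it just invites confusion.

```python
import numpy as np

def parallel_linear(x, w, b, skip_bias_add=False):
    # Toy model of the (output, bias) contract described above:
    # skip_bias_add=False -> bias is added here, None is returned;
    # skip_bias_add=True  -> caller is expected to add the bias later.
    y = x @ w
    if skip_bias_add:
        return y, b
    return y + b, None

x = np.ones((1, 3))
w = np.eye(3)
b = np.full(3, 0.5)

out, bias = parallel_linear(x, w, b)
assert bias is None and np.allclose(out, 1.5)

out, bias = parallel_linear(x, w, b, skip_bias_add=True)
assert np.allclose(out + bias, 1.5)
```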
@GG-li Can you also help to support the online serving text-to-image script with tensor parallelism? So far the online serving script does not have an argument corresponding to tensor parallel size.
Signed-off-by: GG-li <3226868735@qq.com>
@hsliuustc0106 @ZJY0516 @SamitHuang I think it is ready to merge.
Signed-off-by: GG-li <3226868735@qq.com>
Purpose
Feature for #1217.
Add tensor parallelism to Stable Diffusion 3.5.
Test Plan
```shell
python text_to_image.py --model stabilityai/stable-diffusion-3.5-medium --prompt "a cup of coffee on the table" --negative_prompt "ugly, unclear" --cfg_scale 4.0 --num_inference_steps 50 --output "tp_enabled.png" --tensor_parallel_size 4
```
Test Result
Image size: 1024x1024
Essential Elements of an Effective PR Description Checklist
- Update `supported_models.md` and `examples` for a new model. Please run `mkdocs serve` to sync the documentation editions to `./docs`.