[Models] Replace all nn.Conv2d with vLLM's Conv2dLayer
#28842
Conversation
Signed-off-by: Isotr0py <[email protected]>
Code Review
This pull request replaces nn.Conv2d with vLLM's Conv2dLayer across multiple model files. The refactoring is mostly correct, but there are several critical errors in vllm/model_executor/models/chameleon.py: torch.Conv2dLayer is used instead of Conv2dLayer, which will raise an AttributeError because Conv2dLayer is not a member of the torch module. I've provided suggestions to fix these issues; please apply the corrections so the model works as expected.
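A minimal sketch of the suggested correction (channel sizes are illustrative; the import path matches the vllm.model_executor.layers.conv module named later in this thread):

```python
from vllm.model_executor.layers.conv import Conv2dLayer

# Wrong: Conv2dLayer is not an attribute of the torch module, so this
# raises AttributeError as soon as the line executes.
# conv = torch.Conv2dLayer(in_channels=3, out_channels=256, kernel_size=14)

# Right: reference the imported vLLM layer directly.
conv = Conv2dLayer(in_channels=3, out_channels=256, kernel_size=14)
```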
💡 Codex Review
Here are some automated review suggestions for this pull request.
```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
```
Unsupported padding="valid" passed to Conv2dLayer in Siglip embeddings
Conv2dLayer stores the passed padding as-is and forwards it to torch.nn.functional.conv2d, which only accepts integers or tuples. The new code hands it the string "valid", causing a runtime TypeError when embedding images. Compute the numeric padding or drop the argument to preserve the previous behaviour.
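A sketch of the suggested fix: since "valid" simply means no implicit padding, pass 0 explicitly (the stride keyword is an assumption carried over from the original Siglip patch embedding, which the excerpt above truncates):

```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
    stride=self.patch_size,  # assumed from the original Siglip code
    padding=0,               # numeric equivalent of the old padding="valid"
)
```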
Good bot, thanks! :)
```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
```
Siglip2 NaViT embedding uses Conv2dLayer with string padding
Similar to the Siglip change, Conv2dLayer does not translate string paddings. Passing padding="valid" will be forwarded directly to F.conv2d, triggering a TypeError when the model processes images. Replace the string with an explicit numeric padding (likely 0) before constructing the layer.
```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
```
Idefics2 vision embeddings now call Conv2dLayer with padding string
The new Conv2dLayer wrapper forwards padding directly to F.conv2d and does not implement the "valid" shortcut that nn.Conv2d provided. Using the string here will cause a runtime failure when forward runs. Replace with the correct numeric padding (0) or add conversion logic.
```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
```
Keye vision embeddings pass unsupported padding="valid"
Conv2dLayer's constructor only handles integer/tuple padding values. Passing the string "valid", which nn.Conv2d previously allowed, will lead to an exception in forward when the convolution executes. Replace the string with the equivalent numeric padding.
Can you update the type annotation to account for this?
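Presumably this refers to widening the padding parameter's annotation now that string shortcuts are handled; a hypothetical sketch (the real Conv2dLayer signature may differ):

```python
class Conv2dLayer:  # hypothetical sketch, not the actual vLLM class body
    def __init__(
        self,
        in_channels: int,
        out_channels: int,
        kernel_size: int | tuple[int, int],
        stride: int | tuple[int, int] = 1,
        padding: int | tuple[int, int] | str = 0,  # now also "same"/"valid"
    ) -> None:
        self.padding = padding
```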
```python
self.patch_embedding = Conv2dLayer(
    in_channels=config.num_channels,
    out_channels=self.embed_dim,
    kernel_size=self.patch_size,
```
PaddleOCR vision embeddings forward padding string to Conv2dLayer
The padding argument is now the literal string "valid", but Conv2dLayer passes self.padding straight to F.conv2d, which expects integers or tuples. This will raise a TypeError when embeddings are computed. Compute the numeric padding instead of relying on the string shortcut.
Signed-off-by: Isotr0py <[email protected]>
/gemini review
Code Review
This pull request systematically replaces all instances of torch.nn.Conv2d with vllm.model_executor.layers.conv.Conv2dLayer across various model implementations. This is a consistent and necessary refactoring to leverage vLLM's custom convolution layer, which likely offers optimizations or specific functionalities tailored for the framework. The changes are applied uniformly, including updating import statements and isinstance checks where appropriate. No functional issues or bugs were identified in this mechanical replacement.
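As an illustration of the isinstance updates mentioned above (a sketch; the function name is mine and the exact call sites vary per model, but Conv2dLayer is assumed to be an nn.Module subclass):

```python
import torch.nn as nn

from vllm.model_executor.layers.conv import Conv2dLayer


def uses_patch_conv(module: nn.Module) -> bool:
    # Before the refactor this checked isinstance(module, nn.Conv2d);
    # after it, the vLLM layer must be matched instead.
    return isinstance(module, Conv2dLayer)
```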
Signed-off-by: Isotr0py <[email protected]>
Signed-off-by: Isotr0py <[email protected]>
/gemini review
Code Review
This pull request systematically replaces all instances of torch.nn.Conv2d with vLLM's custom Conv2dLayer across various model implementations. This is a crucial step towards unifying convolution operations within the vLLM framework, likely enabling custom optimizations or distributed computing features. The changes in vllm/model_executor/layers/conv.py correctly extend the ConvLayerBase to handle string-based padding ("same", "valid") and include a necessary validation check for strided convolutions with "same" padding. This ensures correctness and prevents undefined behavior. The widespread adoption of Conv2dLayer across numerous models demonstrates a consistent application of this architectural change.
```python
if padding == "same" and any(s != 1 for s in stride):
    raise ValueError("padding='same' is not supported for strided convolutions")
```
The addition of this validation check is crucial for correctness. padding='same' behavior is not well-defined for strided convolutions in all frameworks, and explicitly disallowing it prevents potential silent miscalculations or unexpected output dimensions. This improves the robustness of the Conv2dLayer.
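For context, a minimal sketch of how such string handling might translate to numeric paddings (the helper name and the assumption that kernel_size/stride arrive as tuples are mine, not the PR's actual code):

```python
def resolve_padding(
    padding: int | tuple[int, ...] | str,
    kernel_size: tuple[int, ...],
    stride: tuple[int, ...],
) -> int | tuple[int, ...]:
    # "valid" means no implicit padding at all.
    if padding == "valid":
        return 0
    if padding == "same":
        # Ill-defined for stride > 1: no single symmetric pad keeps the
        # output spatial size equal to the input size, hence the hard error.
        if any(s != 1 for s in stride):
            raise ValueError(
                "padding='same' is not supported for strided convolutions")
        # For stride 1 and odd kernels, padding k // 2 per side preserves size.
        return tuple(k // 2 for k in kernel_size)
    return padding  # already numeric; pass through unchanged
```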
…t#28842) Signed-off-by: Isotr0py <[email protected]>
…t#28842) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: Bhagyashri <[email protected]>
…t#28842) Signed-off-by: Isotr0py <[email protected]> Signed-off-by: Amir Samani <[email protected]>
Signed-off-by: Isotr0py <[email protected]> Signed-off-by: jiang1.li <[email protected]>
Due to the latest changes from upstream, gemma3 is failing to compile on HPU (vllm-project/vllm#27772, vllm-project/vllm#28842).
- replace unfold with view/reshape
- replace text embedding to avoid dynamic shape
- remove the merge_multimodal replacement since the masked_scatter issue is fixed
- re-enable the gemma3 model test

Signed-off-by: Jimin Ha <[email protected]>
Purpose
Replace nn.Conv2d usages with Conv2dLayer.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.