Simplify weight loading in Transformers backend #21382
Conversation
Signed-off-by: Harry Mellor <[email protected]>
Code Review
This pull request simplifies the Transformers backend by removing the `nn.Module` inheritance from `TransformersModel` and streamlining the weight mappers. However, the weight mapper in `TransformersForMultimodalLM` has a critical issue that will cause weight loading to fail for some models. The review includes a fix for this issue, as well as suggestions for code simplification and readability.
```python
hf_to_vllm_mapper = WeightsMapper(
    orig_to_new_prefix={
        "language_model.model": "language_model",
        "text_model.model": "text_model",
        "text_model.lm_head": "lm_head",
        "language_model.lm_head": "lm_head",
        # deal with Qwen2-VL mapping
        "model.layers": "language_model.layers",
    })
```
The updated `hf_to_vllm_mapper` for `TransformersForMultimodalLM` appears to be incorrect. The `AutoWeightsLoader` is initialized with `self` (the `TransformersForMultimodalLM` instance), which has `self.model` as an attribute containing the `PreTrainedModel`. Therefore, parameter names within the model are expected to be prefixed with `model.` (e.g., `model.language_model...`).
The current mappings, such as `"language_model.model": "language_model"`, will cause the loader to look for a top-level `language_model` attribute on `TransformersForMultimodalLM`, which doesn't exist. The target prefixes should include `model.` to correctly map to the nested structure.
For example, `language_model.model` from the checkpoint should map to `model.language_model` in the vLLM model.
```python
hf_to_vllm_mapper = WeightsMapper(
    orig_to_new_prefix={
        "language_model.model": "model.language_model",
        "text_model.model": "model.text_model",
        "text_model.lm_head": "model.lm_head",
        "language_model.lm_head": "model.lm_head",
        # deal with Qwen2-VL mapping
        "model.layers": "model.language_model.layers",
    })
```
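For illustration, here is a minimal sketch of how a prefix remapping like this resolves checkpoint names. It is a simplified stand-in for the real vLLM `WeightsMapper` (assumed here to replace the first matching key prefix), not the actual implementation:

```python
# Simplified stand-in for prefix remapping; not the real vLLM WeightsMapper.
def remap_prefix(name: str, orig_to_new_prefix: dict[str, str]) -> str:
    # Replace the first matching prefix in the checkpoint parameter name.
    for old, new in orig_to_new_prefix.items():
        if name.startswith(old):
            return new + name[len(old):]
    return name

mapping = {
    "language_model.model": "model.language_model",
    "text_model.model": "model.text_model",
    "text_model.lm_head": "model.lm_head",
    "language_model.lm_head": "model.lm_head",
    "model.layers": "model.language_model.layers",
}

# A checkpoint key now resolves under the nested `self.model` attribute of
# TransformersForMultimodalLM instead of a non-existent top-level attribute.
print(remap_prefix("language_model.model.layers.0.self_attn.q_proj.weight", mapping))
# -> model.language_model.layers.0.self_attn.q_proj.weight
```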
… tests Signed-off-by: Harry Mellor <[email protected]>
Isotr0py left a comment:
LGTM as long as tests can pass.
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: qizixi <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: x22x22 <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: Jinzhen Lin <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: Paul Pak <[email protected]>
Signed-off-by: Harry Mellor <[email protected]> Signed-off-by: Diego-Castan <[email protected]>
Signed-off-by: Harry Mellor <[email protected]>
- `TransformersModel` no longer inherits from `nn.Module`, so `self.model = AutoModel.from_config(...)` is no longer instantly registered as a child module of `TransformersModel`
- In `TransformersForCausalLM` and `TransformersForMultimodalLM` we set `self.model` to the inner `model` from `TransformersModel`
- `hf_to_vllm_mapper` is no longer necessary in `TransformersForCausalLM` and can be converted to a class variable in `TransformersForMultimodalLM`
- Since `hf_to_vllm_mapper` is no longer a `@property` in any Transformers backend classes, we can update `SupportsQuant` to access the class attribute directly, as instructed in the comments

A simplified sketch of the resulting structure is shown below.
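The sketch below only illustrates the shape described in the list above; class and attribute names follow the PR description, while the constructor signatures and bodies are placeholders rather than the actual vLLM code:

```python
import torch.nn as nn
from transformers import AutoModel


class TransformersModel:
    """Plain helper class (no longer an nn.Module), so `self.model` is not
    automatically registered as a child module of this class."""

    def __init__(self, config):
        self.model = AutoModel.from_config(config)


class TransformersForCausalLM(nn.Module):
    # hf_to_vllm_mapper removed: checkpoint names already line up.

    def __init__(self, config):
        super().__init__()
        # Take the inner HF model directly so it registers under `self.model`.
        self.model = TransformersModel(config).model


class TransformersForMultimodalLM(nn.Module):
    # Now a plain class variable rather than a @property, so interfaces such
    # as SupportsQuant can read it straight off the class.
    hf_to_vllm_mapper = ...  # WeightsMapper(...) as discussed above

    def __init__(self, config):
        super().__init__()
        self.model = TransformersModel(config).model
```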