[LLM Inference] Support Qwen2_Moe Inference Model by CJ77Qi · Pull Request #8892 · PaddlePaddle/PaddleNLP

CJ77Qi · 2024-08-07T10:50:15Z

PR types

New features

PR changes

Models

Description

Support Qwen-Moe Inference Model

Currently supports bf16/wint8 with single-GPU inference.
Verified on Qwen/Qwen1.5-MoE-A2.7B.

TODO:

Support multi-GPU inference for Qwen/Qwen2-57B-A14B, as well as wint4.
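For readers unfamiliar with the term, "wint8" here is assumed to mean weight-only int8 quantization: weights are stored as int8 with a floating-point scale, while activations stay in higher precision. The following is a minimal pure-Python sketch of symmetric per-output-channel quantization under that assumption. The function names are hypothetical and this is not the kernel or API used by the PR.

```python
from typing import List, Tuple


def quantize_weight_int8(weight: List[List[float]]) -> Tuple[List[List[int]], List[float]]:
    """Symmetric per-row quantization (illustrative only).

    `weight` is laid out as [out_features][in_features]; each output row
    gets its own scale so that its largest magnitude maps to 127.
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # An all-zero row has no dynamic range; any non-zero scale works.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_weight(q_rows: List[List[int]], scales: List[float]) -> List[List[float]]:
    """Recover approximate float weights from int8 values and per-row scales."""
    return [[v * s for v in row] for row, s in zip(q_rows, scales)]
```

The round-trip error per element is bounded by roughly half a scale step, which is why per-channel scales are usually preferred over a single scale for the whole matrix.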

paddle-bot · 2024-08-07T10:50:21Z

Thanks for your contribution!

CLAassistant · 2024-08-07T10:56:56Z

All committers have signed the CLA.

codecov · 2024-08-07T11:26:17Z

Codecov Report

Attention: Patch coverage is 0% with 476 lines in your changes missing coverage. Please review.

Project coverage is 53.88%. Comparing base (f6fc7ff) to head (674b24d).
Report is 227 commits behind head on develop.

Files with missing lines	Patch %	Lines
...lp/experimental/transformers/qwen2_moe/modeling.py	0.00%	397 Missing ⚠️
...erimental/transformers/fused_transformer_layers.py	0.00%	77 Missing ⚠️
paddlenlp/experimental/transformers/__init__.py	0.00%	1 Missing ⚠️
...lp/experimental/transformers/qwen2_moe/__init__.py	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8892      +/-   ##
===========================================
- Coverage    54.05%   53.88%   -0.18%     
===========================================
  Files          650      652       +2     
  Lines       103884   104356     +472     
===========================================
+ Hits         56155    56230      +75     
- Misses       47729    48126     +397


yuanlehome · 2024-08-27T03:31:46Z

@@ -1,4 +1,4 @@
-# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.
+# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.


Please revert this to 2023.

yuanlehome · 2024-08-27T03:32:11Z

    fused_rms_norm,
    masked_multihead_attention,
    variable_length_memory_efficient_attention,
+    fused_moe,


This is already imported above.

yuanlehome · 2024-08-27T03:33:40Z

+        shared_expert_ffn1_weight_attrs=None,
+        shared_expert_ffn1_weight_scale_attrs=None,
+        shared_expert_ffn2_weight_attrs=None,
+        shared_expert_ffn2_weight_scale_attrs=None,
+        shared_expert_gate_weight_attrs=None,


Move these, along with shared_expert_intermediate_size below, into MoeConfig.
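A rough sketch of what this suggestion could look like is given below. The class name `MoeConfigSketch`, the `num_experts` and `top_k` fields, and the `has_shared_expert` helper are hypothetical and only illustrate grouping the shared-expert arguments into one config object. The actual `MoeConfig` in PaddleNLP may be laid out differently.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class MoeConfigSketch:
    """Hypothetical container for MoE-related options (not the real MoeConfig)."""

    # Assumed routing options; names are illustrative.
    num_experts: int = 0
    top_k: int = 0

    # Shared-expert options taken from the diff hunk under review.
    shared_expert_intermediate_size: int = 0
    shared_expert_ffn1_weight_attrs: Optional[Any] = None
    shared_expert_ffn1_weight_scale_attrs: Optional[Any] = None
    shared_expert_ffn2_weight_attrs: Optional[Any] = None
    shared_expert_ffn2_weight_scale_attrs: Optional[Any] = None
    shared_expert_gate_weight_attrs: Optional[Any] = None

    def has_shared_expert(self) -> bool:
        # A positive intermediate size is treated as "shared expert enabled".
        return self.shared_expert_intermediate_size > 0


# Illustrative values only; callers pass one object instead of six keyword arguments.
cfg = MoeConfigSketch(num_experts=8, top_k=2, shared_expert_intermediate_size=1024)
```

Grouping the arguments this way keeps the transformer-layer constructor signature stable when further MoE options are added.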

yuanlehome · 2024-08-27T03:34:31Z

@@ -0,0 +1,15 @@
+# Copyright (c) 2023 PaddlePaddle Authors. All Rights Reserved.


yuanlehome · 2024-08-27T03:34:57Z

                token = token.lower()
-            if (
-                token != self.unk_token
+            if (self.convert_tokens_to_ids(token) == self.convert_tokens_to_ids(self.unk_token)


Don't modify this; please revert it.

yuanlehome · 2024-08-27T03:35:54Z

+# Copyright 2018 The OpenAI Team Authors and HuggingFace Inc. team.
+# Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.


wawltor · 2024-08-28T03:43:54Z

+                        config=config,
+                        dtype=predictor_args.dtype,
+                    )
+                model.eval()


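Could the code here be reorganized and redesigned? As it stands, every newly added model requires adding its own model-initialization path.

Yes, this work is already planned; we expect to reach a conclusion in September.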
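One common way to avoid a new initialization branch per model is a builder registry. The sketch below is only an illustration of that general pattern, not the design the maintainers settled on. All names (`register_model`, `build_model`, the `"qwen2_moe"` key, the dummy builder) are hypothetical.

```python
from typing import Callable, Dict

# Maps a model key to a function that constructs the inference model.
_MODEL_BUILDERS: Dict[str, Callable[..., object]] = {}


def register_model(name: str) -> Callable[[Callable[..., object]], Callable[..., object]]:
    """Decorator that records a builder under `name`."""

    def decorator(builder: Callable[..., object]) -> Callable[..., object]:
        if name in _MODEL_BUILDERS:
            raise ValueError(f"builder already registered for {name!r}")
        _MODEL_BUILDERS[name] = builder
        return builder

    return decorator


def build_model(name: str, **kwargs) -> object:
    """Look up the builder for `name` and call it with the given arguments."""
    try:
        builder = _MODEL_BUILDERS[name]
    except KeyError:
        raise ValueError(f"no inference model registered for {name!r}") from None
    return builder(**kwargs)


@register_model("qwen2_moe")
def _build_qwen2_moe(config=None, dtype="bfloat16"):
    # Placeholder: a real builder would load the inference model here.
    return {"arch": "qwen2_moe", "config": config, "dtype": dtype}
```

With this shape, the predictor calls `build_model(name, config=config, dtype=dtype)` once, and adding a model means adding one registered builder rather than another `if`/`elif` branch.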

Co-authored-by: yuanlehome <[email protected]>

CJ77Qi changed the title ~~[Inference LLM] Support Qwen2_Moe Inference Model~~ [LLM Inference] Support Qwen2_Moe Inference Model Aug 26, 2024

yuanlehome reviewed Aug 27, 2024


yuanlehome force-pushed the qwen2_moe branch from 90ee61e to 43ecf04 Compare August 27, 2024 12:48

supprot qwen-moe

674b24d

yuanlehome force-pushed the qwen2_moe branch from 43ecf04 to 674b24d Compare August 27, 2024 12:57

yuanlehome approved these changes Aug 27, 2024


wawltor reviewed Aug 28, 2024


wawltor merged commit 34a71c8 into PaddlePaddle:develop Aug 28, 2024

Mangodadada pushed a commit to Mangodadada/PaddleNLP that referenced this pull request Sep 10, 2024

supprot qwen-moe (PaddlePaddle#8892)

a2bf616

Co-authored-by: yuanlehome <[email protected]>

wawltor merged 1 commit into PaddlePaddle:develop from CJ77Qi:qwen2_moe
