【inference】support load or save Llama2-7b in three patterns by lizexu123 · Pull Request #8712 · PaddlePaddle/PaddleNLP

lizexu123 · 2024-07-04T09:01:25Z

PR types

New features

PR changes

Others

Description

Additional notes:

Export pdmodel:

python ./predict/export_model.py --model_name_or_path meta-llama/Llama-2-7b-chat --output_path ./inference --dtype float16

Export json:

FLAGS_enable_pir_api=1 python ./predict/export_model.py --model_name_or_path meta-llama/Llama-2-7b-chat --output_path ./inference --dtype float16

Three run modes are supported:

Run Llama2-7b.pdmodel with the old IR

python ./predict/predictor.py --model_name_or_path ./llama2-7b --dtype float16 --mode static

Run Llama2-7b.pdmodel with PIR

FLAGS_enable_pir_in_executor=1 python ./predict/predictor.py --model_name_or_path ./llama2-7b --dtype float16 --mode static

Run Llama2-7b.json with PIR

FLAGS_enable_pir_api=1 python ./predict/predictor.py --model_name_or_path ./inference --dtype float16 --mode static
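The three run modes above differ only in one environment flag and the model path. A minimal sketch of that mapping follows; the mode names, the `RUN_MODES` table, and `build_run` are hypothetical helpers for illustration, while the flags, script path, and arguments are copied from the commands above.

```python
# Hypothetical helper, not part of PaddleNLP: it only assembles the
# environment flags and arguments quoted in the PR description.
RUN_MODES = {
    # mode name (illustrative): (extra environment flags, model path)
    "pdmodel_old_ir": ({}, "./llama2-7b"),
    "pdmodel_pir": ({"FLAGS_enable_pir_in_executor": "1"}, "./llama2-7b"),
    "json_pir": ({"FLAGS_enable_pir_api": "1"}, "./inference"),
}


def build_run(mode):
    """Return (extra_env, argv) for one of the three run modes."""
    extra_env, model_path = RUN_MODES[mode]
    argv = [
        "python", "./predict/predictor.py",
        "--model_name_or_path", model_path,
        "--dtype", "float16",
        "--mode", "static",
    ]
    # Copy the dict so callers cannot mutate the table above.
    return dict(extra_env), argv
```

To launch a mode, one could pass the result to `subprocess.run(argv, env={**os.environ, **extra_env})`; that call is left out here so the sketch stays side-effect free.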
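To test saving the optimized model and then running inference, set inference_config.use_optimized_model(True). The first run saves the optimized model; the second run loads the optimized model directly for inference.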
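The save-on-first-run, load-on-second-run behavior described above follows a common cache pattern. The sketch below illustrates that pattern with the standard library only; it is not Paddle's implementation, and `load_or_optimize`, `cache_path`, and `optimize` are hypothetical names.

```python
import os


def load_or_optimize(cache_path, optimize):
    """Return (model_bytes, loaded_from_cache).

    Illustrative only: the first call runs `optimize` and saves its result
    to `cache_path`; later calls read the saved result instead.
    """
    if os.path.exists(cache_path):
        # Second and later runs: load the previously saved artifact.
        with open(cache_path, "rb") as f:
            return f.read(), True
    # First run: do the expensive optimization, then persist it.
    data = optimize()
    with open(cache_path, "wb") as f:
        f.write(data)
    return data, False
```

The returned flag makes it easy to log whether a given run paid the optimization cost or reused the saved result.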

paddle-bot · 2024-07-04T09:01:30Z

Thanks for your contribution!

codecov · 2024-07-04T09:38:02Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 55.73%. Comparing base (6d464bf) to head (de3ae3a).
Report is 219 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #8712      +/-   ##
===========================================
- Coverage    55.74%   55.73%   -0.01%     
===========================================
  Files          623      623              
  Lines        97456    97459       +3     
===========================================
  Hits         54323    54323              
- Misses       43133    43136       +3

yuanlehome · 2024-07-04T11:46:15Z

-        inference_config.disable_glog_info()
+        # inference_config.disable_glog_info()


yuanlehome · 2024-07-04T11:46:22Z

-        inference_config.disable_glog_info()
+        # inference_config.disable_glog_info()
        inference_config.enable_new_executor()
+


yuanlehome · 2024-07-04T11:46:31Z

+        # if use optimized_model to inference
+        # inference_config.use_optimized_model(True)


yuanlehome · 2024-07-04T11:46:36Z

    def _preprocess(self, input_text: str | list[str]):
        inputs = super()._preprocess(input_text)
        inputs["max_new_tokens"] = np.array(self.config.max_length, dtype="int64")
-


wawltor

LGTM

fix

e0a7f79

fix

d557f18

Remove comments

23a9f3f

yuanlehome reviewed Jul 4, 2024

lizexu123 added 2 commits July 4, 2024 11:54

Restore + delete

d11f394

Delete + restore

de3ae3a

yuanlehome approved these changes Jul 4, 2024

wawltor approved these changes Jul 5, 2024

wawltor merged commit d8ddba9 into PaddlePaddle:develop Jul 5, 2024

wawltor merged 5 commits into PaddlePaddle:develop from lizexu123:llama2-7b-s
