
Conversation

@NiuBlibing (Contributor) commented Jun 13, 2024

Like #4007, this adds support for qwen2-72b-instruct's LoRA adapter with tp-size 1, 2, 4, and 8.

Ref #3793

@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" to "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora with tp 8" on Jun 13, 2024
@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora with tp 8" to "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" on Jun 13, 2024
@NiuBlibing changed the title from "Add 3696 bgmv-kernel to support qwen2-72b-instruct lora" to "support load qwen2-72b-instruct lora" on Jun 13, 2024
@NiuBlibing marked this pull request as draft on June 13, 2024 10:33
@NiuBlibing closed this on Jun 13, 2024
@NiuBlibing reopened this on Jun 13, 2024
@NiuBlibing closed this on Jun 13, 2024
@NiuBlibing reopened this on Jun 14, 2024
@NiuBlibing closed this on Jun 14, 2024
@NiuBlibing reopened this on Jun 14, 2024
@NiuBlibing closed this on Jun 14, 2024
@NiuBlibing (Contributor, Author) commented

Currently the punica kernel cannot support Qwen2-72B-Instruct because 3696 is not divisible by 64. Hopefully #5036 or #5356 will resolve this.
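
A minimal sketch of the arithmetic behind that limitation (assuming an intermediate_size of 29568 for Qwen2-72B-Instruct and a punica bgmv kernel that only handles shard sizes that are multiples of 64; neither value is stated in this thread):

# Hypothetical Python illustration, not code from this PR.
intermediate_size = 29568              # assumed Qwen2-72B-Instruct config value
tp_size = 8
shard = intermediate_size // tp_size   # 3696 columns per GPU shard
print(shard, shard % 64)               # prints "3696 48": not a multiple of 64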

@jeejeelee (Collaborator) commented Jun 14, 2024

Could you provide your running script?

I can test Qwen2-72B-Instruct+LoRA on my local device using #5036.

@NiuBlibing (Contributor, Author) commented

> Could you provide your running script?
>
> I can test Qwen2-72B-Instruct+LoRA on my local device using #5356.

I just start it with the vLLM CLI:

python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-72B-Chat-test --model ./Qwen/Qwen2-72B-Instruct/ --gpu-memory-utilization 0.9 --tensor-parallel-size 8 --enable-lora --lora-dtype bfloat16 --lora-modules test=/path/to/lora/
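
For completeness, a minimal sketch of querying the adapter once that server is up, assuming the default port 8000; the adapter is addressed by the name given in --lora-modules ("test"), while "Qwen2-72B-Chat-test" addresses the base model:

# Hypothetical client request; model names and port come from the command above.
import requests

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "test",           # LoRA adapter name from --lora-modules
        "prompt": "Hello, Qwen!",
        "max_tokens": 32,
    },
)
print(resp.json())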

@jeejeelee (Collaborator) commented

> > Could you provide your running script?
> > I can test Qwen2-72B-Instruct+LoRA on my local device using #5356.
>
> I just start it with the vLLM CLI:
>
> python -m vllm.entrypoints.openai.api_server --served-model-name Qwen2-72B-Chat-test --model ./Qwen/Qwen2-72B-Instruct/ --gpu-memory-utilization 0.9 --tensor-parallel-size 8 --enable-lora --lora-dtype bfloat16 --lora-modules test=/path/to/lora/

Sorry, actually #5036 was used for the testing.

I have completed the test; #5036 can resolve this issue.

However, there are still some other issues with #5036 that need to be resolved. I will address them ASAP.

