Error when using load_in_8bit with deepspeed  #16

@maharjanjenish

Description

I am trying to fine-tune a Yi-34B model. Fine-tuning works with `load_in_4bit` set to True, but when I use `load_in_8bit` instead, I get the following error:

```
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != signed char
```

Here's my config:

```
--deepspeed_stage 2 \
--use_gradient_checkpointing True \
--model_name_or_path ../Yi-34B \
--use_flash_attention_2 False \
--load_in_8bit True \
--apply_lora True \
--lora_rank 128 \
--lora_alpha 256 \
--lora_dropout 0.05 \
--raw_lora_target_modules "all" \
--fuse_after_training True \
--save_steps 100 \
--logging_steps 1 \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 2 \
--max_length 4096 \
--deepspeed_stage "stage_2" \
--single_gpu False \
--push_to_hub False \
--report_to_wandb False
```
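
For context, here is a minimal sketch of what I believe the 8-bit load path looks like under the hood. I am assuming the trainer ultimately delegates to transformers' bitsandbytes integration; the exact call site in this repo may differ, and the `device_map` usage is illustrative for a standalone load rather than a DeepSpeed run:

```python
# Minimal sketch of an 8-bit load via transformers' bitsandbytes integration
# (an assumption on my part; this repo's actual loading code may differ).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "../Yi-34B",  # same checkpoint as in the config above
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,  # fp16 activations over int8 Linear weights
    device_map="auto",  # standalone 8-bit loads expect a device map
)
# If an int8 weight tensor ever reaches a plain fp16 matmul instead of the
# bitsandbytes int8 kernel, the result would be a dtype mismatch like the
# Half != signed char error reported above.
```

Since the same command succeeds with `load_in_4bit`, the mismatch seems specific to how the int8 weights interact with the DeepSpeed stage 2 engine.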
