Merged
2 changes: 1 addition & 1 deletion paddlemix/examples/README.md
@@ -17,7 +17,7 @@ The paddlemix `examples` directory provides a one-stop model experience, including model…
| [groundingdino](./groundingdino/) | ✅ | ❌ | 🚧 | ❌ | ✅ | ❌ |
| [imagebind](./imagebind/) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| [InternLM-XComposer2](./internlm_xcomposer2/) | ✅ | ❌ | ✅ | ❌ | ❌ | ❌ |
-| [Internvl2](./internvl2/) | ✅ | ❌ | ✅ | ❌ | ❌ | |
+| [Internvl2](./internvl2/) | ✅ | ❌ | ✅ | ❌ | ❌ | |
| [llava](./llava/) | ✅ | ✅ | ✅ | ✅ | 🚧 | ✅ |
| [llava-next](./llava_next_interleave/) | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| [minigpt4](./minigpt4) | ✅ | ✅ | ✅ | ❌ | ✅ | ❌ |
31 changes: 29 additions & 2 deletions paddlemix/examples/llava/README.md
@@ -104,8 +104,35 @@ python paddlemix/tools/supervised_finetune.py paddlemix/config/llava/v1_5/lora_s
python paddlemix/tools/supervised_finetune.py paddlemix/config/llava/v1_5/sft_argument.json
```

-## 5 NPU Hardware Training
-Refer to [tools](../../tools/README.md) to install Paddle for NPU hardware and set the environment variables. Once configured, you can run the fine-tuning commands directly for training or prediction.
+## 6 NPU Hardware Training
+Refer to [tools](../../tools/README.md) to install Paddle for NPU hardware and set up the environment. Before running prediction or training, set the following environment variables:
+```shell
+export ASCEND_RT_VISIBLE_DEVICES=8
+export FLAGS_npu_storage_format=0
+export FLAGS_use_stride_kernel=0
+export FLAGS_npu_jit_compile=0
+export FLAGS_npu_scale_aclnn=True
+export FLAGS_npu_split_aclnn=True
+export FLAGS_allocator_strategy=auto_growth
+export CUSTOM_DEVICE_BLACK_LIST=set_value,set_value_with_tensor
+```
+
+Prediction:
+```shell
+python paddlemix/examples/llava/run_predict_multiround.py \
+    --model-path "paddlemix/llava/llava-v1.6-7b" \
+    --image-file "https://bj.bcebos.com/v1/paddlenlp/models/community/GroundingDino/000000004505.jpg" \
+    --fp16
+```
+Fine-tuning:
+```shell
+# llava LoRA fine-tuning
+python paddlemix/tools/supervised_finetune.py paddlemix/config/llava/v1_5/lora_sft_argument.json
+
+# llava full-parameter fine-tuning
+python paddlemix/tools/supervised_finetune.py paddlemix/config/llava/v1_5/sft_argument.json
+```
+

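As a rough illustration of the NPU setup described above, the following Python sketch applies the same environment variables to a child process before launching a command. The helper names `build_npu_env` and `run_with_npu_env` are hypothetical and not part of PaddleMIX; the values are copied from the README, and the device index in `ASCEND_RT_VISIBLE_DEVICES` will likely differ per machine.

```python
import os
import subprocess
import sys

# Values copied from the README section above. The device index is
# machine-specific, so treat "8" as a placeholder.
NPU_ENV = {
    "ASCEND_RT_VISIBLE_DEVICES": "8",
    "FLAGS_npu_storage_format": "0",
    "FLAGS_use_stride_kernel": "0",
    "FLAGS_npu_jit_compile": "0",
    "FLAGS_npu_scale_aclnn": "True",
    "FLAGS_npu_split_aclnn": "True",
    "FLAGS_allocator_strategy": "auto_growth",
    "CUSTOM_DEVICE_BLACK_LIST": "set_value,set_value_with_tensor",
}


def build_npu_env(base=None, device="8"):
    """Return a copy of the environment with the NPU settings applied.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env.update(NPU_ENV)
    env["ASCEND_RT_VISIBLE_DEVICES"] = str(device)
    return env


def run_with_npu_env(argv, device="8"):
    """Run a command (for example the fine-tuning script) under the NPU environment."""
    return subprocess.run(argv, env=build_npu_env(device=device), check=True)


if __name__ == "__main__":
    # Demo only: echo one variable from a child interpreter instead of
    # starting an actual training or prediction run.
    run_with_npu_env(
        [sys.executable, "-c", "import os; print(os.environ['FLAGS_npu_jit_compile'])"]
    )
```

Passing the variables through `env=` keeps them scoped to the launched command, which has the same effect as the `export` lines for that one process without changing the parent shell.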
### References
```BibTeX
12 changes: 6 additions & 6 deletions paddlemix/processors/qwen2_vl_processing.py
@@ -667,7 +667,7 @@ def smart_resize(
     return h_bar, w_bar


-def fetch_image(ele: dict[str, str | Image.Image], size_factor: int = IMAGE_FACTOR) -> Image.Image:
+def fetch_image(ele: dict[str, Union[str, Image.Image]], size_factor: int = IMAGE_FACTOR) -> Image.Image:
     if "image" in ele:
         image = ele["image"]
     else:
@@ -715,7 +715,7 @@ def fetch_image(ele: dict[str, str | Image.Image], size_factor: int = IMAGE_FACT
 def smart_nframes(
     ele: dict,
     total_frames: int,
-    video_fps: int | float,
+    video_fps: Union[int, float],
 ) -> int:
     """calculate the number of frames for video used for model inputs.

@@ -850,7 +850,7 @@ def gaussian_kernel_1d(size, sigma):
     kernel = np.exp(-x**2 / (2 * sigma**2))
     return kernel / kernel.sum()

-def fetch_video(ele: dict, image_factor: int = IMAGE_FACTOR) -> paddle.Tensor | list[Image.Image]:
+def fetch_video(ele: dict, image_factor: int = IMAGE_FACTOR) -> Union[paddle.Tensor, list[Image.Image]]:
     if isinstance(ele["video"], str):
         video_reader_backend = get_video_reader_backend()

@@ -902,7 +902,7 @@ def fetch_video(ele: dict, image_factor: int = IMAGE_FACTOR) -> paddle.Tensor |
         return images


-def extract_vision_info(conversations: list[dict] | list[list[dict]]) -> list[dict]:
+def extract_vision_info(conversations: Union[list[dict], list[list[dict]]]) -> list[dict]:
     vision_infos = []
     if isinstance(conversations[0], dict):
         conversations = [conversations]
Expand All @@ -921,8 +921,8 @@ def extract_vision_info(conversations: list[dict] | list[list[dict]]) -> list[di


def process_vision_info(
conversations: list[dict] | list[list[dict]],
) -> tuple[list[Image.Image] | None, list[paddle.Tensor | list[Image.Image]] | None]:
conversations: Union[list[dict], list[list[dict]]],
) -> tuple[Union[list[Image.Image], None, list[Union[paddle.Tensor, list[Image.Image]]], None]]:
vision_infos = extract_vision_info(conversations)
image_inputs = []
video_inputs = []
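
The signature changes in this file replace the `X | Y` annotation syntax with `typing.Union`, presumably so the module imports on Python versions before 3.10, where an evaluated annotation such as `int | float` raises `TypeError`. The sketch below uses hypothetical names (`frames_per_clip`, `Pair`, `Flattened`) and assumes Python 3.9+ for the built-in generics. It shows that the two spellings describe the same types, and how a tuple of two independently optional values is written.

```python
import sys
from typing import Optional, Union, get_args, get_type_hints


# Hypothetical stand-in for a parameter annotated like `video_fps`.
def frames_per_clip(video_fps: Union[int, float], seconds: int = 2) -> int:
    """Toy function: number of whole frames in a clip of the given length."""
    return int(video_fps * seconds)


# The annotation is an ordinary Union of the two member types.
assert get_type_hints(frames_per_clip)["video_fps"] == Union[int, float]
assert get_args(Union[int, float]) == (int, float)

# On Python 3.10+ the `|` spelling compares equal to the Union spelling.
if sys.version_info >= (3, 10):
    assert (int | float) == Union[int, float]

# Union[X, None] is the same type as Optional[X].
assert Union[list[int], None] == Optional[list[int]]

# Two independently optional values: one Union per tuple slot.
Pair = tuple[Union[list[int], None], Union[list[str], None]]
assert len(get_args(Pair)) == 2

# Listing every alternative inside a single Union yields a one-slot tuple.
Flattened = tuple[Union[list[int], None, list[str], None]]
assert len(get_args(Flattened)) == 1
```

When translating `A | None, B | None` inside a `tuple[...]`, each element needs its own `Union` (or `Optional`) so the tuple keeps its arity.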
2 changes: 1 addition & 1 deletion paddlemix/tools/README.md
@@ -18,7 +18,7 @@ The PaddleMIX toolbox carries on the PaddlePaddle suites' one-stop experience, extreme performance, ecosystem…
| [groundingdino](./groundingdino/) | 🚧 | ❌ | ✅ | ❌ |
| [imagebind](./imagebind/) | ❌ | ❌ | ❌ | ❌ |
| [InternLM-XComposer2](./internlm_xcomposer2/) | ✅ | ❌ | ❌ | ❌ |
-| [Internvl2](./internvl2/)| ✅ | ❌ | ❌ | |
+| [Internvl2](./internvl2/)| ✅ | ❌ | ❌ | |
| [llava](./llava/) | ✅ | ✅ | 🚧 | ✅ |
| [llava-next](./llava_next_interleave/) | ❌ | ❌ | ❌ | ❌ |
| [minigpt4](./minigpt4) | ✅ | ❌ | ✅ | ❌ |