Merged
100 commits
2c95f78
add qwen3_vl
openvino-dev-samples Sep 12, 2025
b47cc60
Update setup.py
openvino-dev-samples Sep 13, 2025
8654a53
Update modeling_visual_language.py
openvino-dev-samples Sep 13, 2025
e1f75c3
Update model_patcher.py
openvino-dev-samples Sep 13, 2025
d260216
update
openvino-dev-samples Sep 14, 2025
047e30b
set to static shape
openvino-dev-samples Sep 14, 2025
6c88fbf
add qwen3vl_moe support
openvino-dev-samples Sep 15, 2025
a2c7350
Update modeling_visual_language.py
openvino-dev-samples Sep 19, 2025
c7b2d28
Update modeling_visual_language.py
openvino-dev-samples Sep 22, 2025
9b76446
Update model_patcher.py
openvino-dev-samples Sep 22, 2025
8e7cdd2
Update modeling_visual_language.py
openvino-dev-samples Sep 23, 2025
3cb4e20
Revert "Update modeling_visual_language.py"
openvino-dev-samples Sep 29, 2025
741501e
Update modeling_visual_language.py
openvino-dev-samples Sep 29, 2025
02f9c50
transformers 4.57
IlyasMoutawwakil Nov 26, 2025
c68919f
patch dynamic cache layer
IlyasMoutawwakil Nov 26, 2025
073fc46
fix qwen and gpt_oss
IlyasMoutawwakil Nov 27, 2025
5b245cf
fix seq2seq models as well
IlyasMoutawwakil Nov 27, 2025
513977a
fix
IlyasMoutawwakil Nov 27, 2025
43d5842
fix
IlyasMoutawwakil Nov 27, 2025
79a0bbf
more decoder fixes
IlyasMoutawwakil Nov 27, 2025
bc57cec
limit awq
IlyasMoutawwakil Nov 27, 2025
6489d7e
fix dynamic layer in optimum-onnx's model patcher
IlyasMoutawwakil Nov 27, 2025
11b5a5a
remove
IlyasMoutawwakil Nov 27, 2025
d6cd7a6
fix donut
IlyasMoutawwakil Nov 28, 2025
272a624
vlm fixes
IlyasMoutawwakil Nov 28, 2025
c62546e
fix speecht5
IlyasMoutawwakil Nov 28, 2025
a7ede39
fix whisper
IlyasMoutawwakil Nov 28, 2025
817bc54
fix
IlyasMoutawwakil Nov 28, 2025
225b81d
fix qwenvl
IlyasMoutawwakil Nov 28, 2025
7c5c92c
better fix
IlyasMoutawwakil Nov 28, 2025
9116262
fix recursion issue
IlyasMoutawwakil Nov 28, 2025
f4591a7
fix llama4 and quantization
IlyasMoutawwakil Dec 1, 2025
a5029bd
fix setup
IlyasMoutawwakil Dec 1, 2025
d1449c6
fix gemma3 and skip grouped beam search
IlyasMoutawwakil Dec 1, 2025
a679411
fix
IlyasMoutawwakil Dec 1, 2025
25d2f66
fix quants
IlyasMoutawwakil Dec 1, 2025
bfcf961
fix
IlyasMoutawwakil Dec 1, 2025
b714f6d
fix
IlyasMoutawwakil Dec 1, 2025
3ca93c8
revert line
IlyasMoutawwakil Dec 1, 2025
20250f6
test offline on python 3.10
IlyasMoutawwakil Dec 1, 2025
e5d2dc6
ov 2025.4.0
IlyasMoutawwakil Dec 1, 2025
ad94d8f
fix
IlyasMoutawwakil Dec 1, 2025
99372b8
simply skip phi4
IlyasMoutawwakil Dec 1, 2025
d14416b
Apply suggestion from @IlyasMoutawwakil
IlyasMoutawwakil Dec 2, 2025
d8e65e7
add test case
openvino-dev-samples Dec 3, 2025
01f7176
update test case
openvino-dev-samples Dec 3, 2025
b2b1aed
Merge main.
popovaan Dec 11, 2025
c70b290
Merge master.
popovaan Dec 17, 2025
67764cc
Removed not related changes.
popovaan Dec 23, 2025
890b953
Removed not related changes.
popovaan Dec 23, 2025
d7e159d
Removed not related changes.
popovaan Dec 23, 2025
26b97a3
Merge branch 'main' into qwen3vl
popovaan Dec 23, 2025
d9acc87
Error fix.
popovaan Dec 23, 2025
7c1ca28
Fix error.
popovaan Dec 29, 2025
87cd373
Fix quantization test.
popovaan Dec 29, 2025
032496d
Removed not needed code.
popovaan Dec 29, 2025
273025d
Fix quantization test.
popovaan Dec 29, 2025
8a355ce
Fix tests.
popovaan Dec 29, 2025
fd90d63
Minor fix.
popovaan Dec 29, 2025
3e55be1
Minor fix.
popovaan Dec 29, 2025
6a2dbdd
Tests corrections.
popovaan Dec 30, 2025
ac4cacf
Fix quantization params.
popovaan Dec 30, 2025
3d2fc82
Minor corrections.
popovaan Dec 30, 2025
810f071
Minor correction.
popovaan Dec 30, 2025
443de4a
Added video test.
popovaan Jan 2, 2026
293621a
Code style.
popovaan Jan 2, 2026
4f56b5c
Removed wrong changes.
popovaan Jan 2, 2026
69d2337
Docs update.
popovaan Jan 2, 2026
e564a11
Minor refactor.
popovaan Jan 2, 2026
7b80d1d
Removed not used code.
popovaan Jan 12, 2026
a79488c
Removed not used code.
popovaan Jan 12, 2026
b7f620f
Code style.
popovaan Jan 12, 2026
d44adcb
Update optimum/exporters/openvino/model_configs.py
popovaan Jan 12, 2026
46f1759
Applied comments.
popovaan Jan 12, 2026
2f198bb
Add outputs specification.
popovaan Jan 12, 2026
16d8206
Minor fix.
popovaan Jan 12, 2026
24da115
Update optimum/intel/openvino/modeling_visual_language.py
popovaan Jan 12, 2026
5a95067
Use num deepstack layers from config.
popovaan Jan 13, 2026
c4ba9c2
Code style.
popovaan Jan 13, 2026
80376e5
Applied comments.
popovaan Jan 16, 2026
3d94276
Code style.
popovaan Jan 16, 2026
590e4c3
Inherited _OVQwen3VLForCausalLM from Qwen3VLModel.
popovaan Jan 16, 2026
ad2f66b
Merge remote-tracking branch 'upstream/main' into qwen3vl
popovaan Jan 16, 2026
48ba80c
Update optimum/exporters/openvino/model_configs.py
popovaan Jan 16, 2026
3c62201
Minor correction.
popovaan Jan 16, 2026
fd2cac5
Update optimum/exporters/openvino/model_configs.py
popovaan Jan 16, 2026
002ae3f
Applied comments.
popovaan Jan 16, 2026
9797d2b
Minor correction.
popovaan Jan 16, 2026
b14e0a7
Merge branch 'main' into qwen3vl
IlyasMoutawwakil Jan 28, 2026
869d8e7
Error fixed.
popovaan Feb 2, 2026
deeb4e2
Removed wrong change.
popovaan Feb 2, 2026
c7877e1
Test corrected.
popovaan Feb 2, 2026
796566d
Applied comments.
popovaan Feb 4, 2026
953d77e
Update optimum/intel/openvino/modeling_visual_language.py
popovaan Feb 5, 2026
72eb222
Added links, minor corrections.
popovaan Feb 5, 2026
6eb3398
Apply suggestion from @echarlaix
popovaan Feb 5, 2026
3f03a9d
Apllied comments.
popovaan Feb 5, 2026
d8c9093
Fixed error.
popovaan Feb 5, 2026
336de2e
Code style.
popovaan Feb 5, 2026
d7bfb38
Added comment.
popovaan Feb 9, 2026
1 change: 1 addition & 0 deletions docs/source/openvino/models.mdx
@@ -128,6 +128,7 @@ Here is the list of the supported architectures :
- Qwen2MoE
- Qwen2VL
- Qwen2.5VL
- Qwen3VL
- ResNet
- Roberta
- Roformer
4 changes: 4 additions & 0 deletions optimum/exporters/openvino/convert.py
@@ -665,6 +665,10 @@ def export_from_model(
)
logging.disable(logging.NOTSET)

# Remove empty (model, export_config) pairs; they can occur when a config class is shared between model versions.
# Example: Qwen2VL and Qwen3VL share a config class, but "vision_embeddings_pos" is used only in Qwen3VL.
models_and_export_configs = {k: v for k, v in models_and_export_configs.items() if v != (None, None)}

if library_name == "open_clip":
if hasattr(model.config, "save_pretrained"):
model.config.save_pretrained(output)
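The `(None, None)` filtering added in this hunk can be exercised in isolation. The dict contents below are illustrative placeholders, not real export configs:

```python
# Sketch of the filtering done in export_from_model: drop submodel entries
# whose (model, export_config) pair is the (None, None) placeholder.
# Keys and values here are hypothetical stand-ins.
models_and_export_configs = {
    "vision_embeddings": ("model", "config"),
    "vision_embeddings_pos": (None, None),  # unused by this model version
    "language_model": ("model", "config"),
}

models_and_export_configs = {
    k: v for k, v in models_and_export_configs.items() if v != (None, None)
}
print(sorted(models_and_export_configs))  # ['language_model', 'vision_embeddings']
```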
298 changes: 272 additions & 26 deletions optimum/exporters/openvino/model_configs.py

Large diffs are not rendered by default.

88 changes: 88 additions & 0 deletions optimum/exporters/openvino/model_patcher.py
@@ -4062,6 +4062,52 @@ def __exit__(self, exc_type, exc_value, traceback):
self._model.forward = self._model.__orig_forward


class Qwen3VLLanguageModelPatcher(OVDecoderModelPatcher):
def __init__(
self,
config: "OnnxConfig",
model: Union["PreTrainedModel"],
model_kwargs: Optional[Dict[str, Any]] = None,
):
# Adapted from https://github.com/huggingface/transformers/blob/v4.51.3/src/transformers/models/phi4_multimodal/modeling_phi4_multimodal.py#L2156-L2178
# Audio and vision feature processing is moved outside the model.
# The corresponding method in the original model: https://github.com/huggingface/transformers/blob/v4.57.6/src/transformers/models/qwen3_vl/modeling_qwen3_vl.py#L1344-L1362
def lm_forward(
self,
attention_mask,
position_ids,
past_key_values,
inputs_embeds,
visual_pos_masks,
deepstack_visual_embeds,
use_cache=True,
):
from transformers.cache_utils import DynamicCache

pkv = DynamicCache.from_legacy_cache(past_key_values)
outputs = self.model.language_model(
inputs_embeds=inputs_embeds,
attention_mask=attention_mask,
position_ids=position_ids,
use_cache=use_cache,
past_key_values=pkv,
visual_pos_masks=visual_pos_masks,
deepstack_visual_embeds=deepstack_visual_embeds,
)
hidden_states = outputs[0]
# Compute logits from the hidden states; loss-related slicing and upcasting from the original forward are not needed here
logits = self.lm_head(hidden_states)
return (logits, outputs.past_key_values.to_legacy_cache())

model.__orig_forward = model.forward
model.forward = types.MethodType(lm_forward, model)
super().__init__(config, model, model_kwargs)

def __exit__(self, exc_type, exc_value, traceback):
super().__exit__(exc_type, exc_value, traceback)
self._model.forward = self._model.__orig_forward

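`Qwen3VLLanguageModelPatcher` above follows a save/swap/restore pattern on the model's `forward`. A minimal standalone sketch of that pattern (the toy class and its doubling behaviour are purely illustrative, not the PR's code):

```python
import types

class ToyModel:
    def forward(self, x):
        return x + 1

def traced_forward(self, x):
    # replacement forward with export-friendly behaviour
    return x * 2

model = ToyModel()
model.__orig_forward = model.forward  # save the bound original
model.forward = types.MethodType(traced_forward, model)  # swap in the patch
print(model.forward(3))  # 6 while patched

model.forward = model.__orig_forward  # restore, as done in __exit__
print(model.forward(3))  # 4 after restore
```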

def patch_qwen2vl_vision_blocks(model, force_new_behaviour=False):
if not force_new_behaviour and is_transformers_version("<=", "4.48.99"):
# Modified from https://github.com/huggingface/transformers/blob/v4.45.2/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L390
@@ -4276,6 +4322,48 @@ def __exit__(self, exc_type, exc_value, traceback):
block.attn.forward = block.attn._orig_forward


class Qwen3VLVisionEmbMergerPatcher(ModelPatcher):
def __init__(
self,
config: "OnnxConfig",
model: Union["PreTrainedModel"],
model_kwargs: Optional[Dict[str, Any]] = None,
):
model.__orig_forward = model.forward

# Modified from https://github.com/huggingface/transformers/blob/v4.45.2/src/transformers/models/qwen2_vl/modeling_qwen2_vl.py#L1118
# Added an attention_mask input instead of cu_lens, which the original computes internally (unsupported by tracing due to a loop with dynamic length).
# Separated the patch_embed and rot_pos_emb calls so they can run as part of another model.
# The corresponding code in the original model: https://github.com/huggingface/transformers/blob/main/src/transformers/models/qwen3_vl/modeling_qwen3_vl.py#L794-L808
def image_embed_forward(
self, hidden_states: torch.Tensor, attention_mask: torch.Tensor, rotary_pos_emb: torch.Tensor
) -> torch.Tensor:
deepstack_feature_lists = []
for layer_num, blk in enumerate(self.blocks):
hidden_states = blk(hidden_states, attention_mask=attention_mask, rotary_pos_emb=rotary_pos_emb)
if layer_num in self.deepstack_visual_indexes:
deepstack_feature = self.deepstack_merger_list[self.deepstack_visual_indexes.index(layer_num)](
hidden_states
)
deepstack_feature_lists.append(deepstack_feature)
last_hidden_state = self.merger(hidden_states)
return last_hidden_state, torch.stack(deepstack_feature_lists, dim=0)

model.forward = types.MethodType(image_embed_forward, model)
super().__init__(config, model, model_kwargs)

def __enter__(self):
patch_qwen2vl_vision_blocks(self._model)
super().__enter__()

def __exit__(self, exc_type, exc_value, traceback):
super().__exit__(exc_type, exc_value, traceback)
self._model.forward = self._model.__orig_forward
for block in self._model.blocks:
block.forward = block._orig_forward
block.attn.forward = block.attn._orig_forward


# copied from https://github.com/huggingface/transformers/blob/v4.47.1/src/transformers/models/granitemoe/modeling_granitemoe.py#L321
def _granite_moe_topk_gating_forward(self, hidden_states):
# compute the top_k routing decision
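The deepstack loop in `image_embed_forward` above taps intermediate hidden states at selected block indexes and stacks them into one tensor. A shape-only sketch with identity stand-ins (module names, indexes, and dimensions here are illustrative, not taken from the Qwen3VL config):

```python
import torch

blocks = [torch.nn.Identity() for _ in range(4)]  # stand-in vision blocks
deepstack_visual_indexes = [1, 3]                 # layers to tap
deepstack_merger_list = [torch.nn.Identity() for _ in deepstack_visual_indexes]

hidden_states = torch.zeros(5, 8)  # (num_patches, hidden_dim)
deepstack_feature_lists = []
for layer_num, blk in enumerate(blocks):
    hidden_states = blk(hidden_states)
    if layer_num in deepstack_visual_indexes:
        merger = deepstack_merger_list[deepstack_visual_indexes.index(layer_num)]
        deepstack_feature_lists.append(merger(hidden_states))

# One stacked tensor: (num_tapped_layers, num_patches, hidden_dim)
stacked = torch.stack(deepstack_feature_lists, dim=0)
print(stacked.shape)  # torch.Size([2, 5, 8])
```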
1 change: 1 addition & 0 deletions optimum/exporters/openvino/utils.py
@@ -295,6 +295,7 @@ def get_submodels(model):
"phi3_v",
"qwen2_vl",
"qwen2_5_vl",
"qwen3_vl",
"got_ocr2",
"gemma3",
"idefics3",