[OpenVINO] Add image-text-to-embedding task for Qwen2VL by jianxiah-intel · Pull Request #1649 · huggingface/optimum-intel

jianxiah-intel · 2026-03-25T02:21:47Z

What does this PR do?

This PR adds image-text-to-embedding task support for Qwen2VL-based models in the OpenVINO exporter.

Added Qwen2VLEmbeddingPatcher to patch the model forward for export, replacing the full causal LM forward with a backbone-only call that outputs last_hidden_state.
Added LMEmbeddingConfigHelper and registered the image-text-to-embedding task in the tasks manager for qwen2_vl.
Implemented OVModelForImageTextToEmbedding as the inference class, along with OVLMEmbeddingModel (stateless LM submodel without KV-cache) and _OVQwen2VLForEmbedding.
Introduced MODEL_TYPE_TO_IMAGE_TEXT_TO_EMBEDDING_CLS_MAPPING as a dedicated dispatch table for the new task, parallel to the existing MODEL_TYPE_TO_CLS_MAPPING.
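
The forward-patching idea described above can be sketched in plain Python. This is only an illustration of the pattern, not the code in this PR: `ToyBackbone`, `ToyCausalLM`, and `backbone_only_forward` are hypothetical names, and the toy model uses lists instead of tensors. It shows a causal LM forward being temporarily swapped for a backbone-only call that returns `last_hidden_state`, then restored.

```python
from contextlib import contextmanager


class ToyBackbone:
    """Hypothetical stand-in for the transformer backbone (not Qwen2VL)."""

    def __call__(self, token_ids):
        # One fake 2-dimensional hidden vector per token: shape [T, D].
        return [[float(t), float(t) * 0.5] for t in token_ids]


class ToyCausalLM:
    """Hypothetical causal LM: backbone followed by an lm_head projection."""

    def __init__(self):
        self.model = ToyBackbone()

    def lm_head(self, hidden):
        # Collapse each hidden vector to a single fake "logit".
        return [sum(vec) for vec in hidden]

    def forward(self, token_ids):
        return {"logits": self.lm_head(self.model(token_ids))}


@contextmanager
def backbone_only_forward(model):
    """Temporarily replace model.forward with a backbone-only call."""
    original_forward = model.forward

    def patched_forward(token_ids):
        # Skip lm_head entirely and expose the raw hidden states.
        return {"last_hidden_state": model.model(token_ids)}

    model.forward = patched_forward
    try:
        yield model
    finally:
        # Restore the original forward even if export fails.
        model.forward = original_forward


lm = ToyCausalLM()
with backbone_only_forward(lm) as patched:
    export_output = patched.forward([1, 2])  # backbone-only output
restored_output = lm.forward([1, 2])  # original causal LM output
```

Restoring the original forward in a `finally` block keeps the model usable for regular generation after the export step, which is the usual reason for wrapping this kind of patch in a context manager.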

Support for additional architectures (e.g., Qwen3VL Embedding) will be added in follow-up PRs.
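
As a rough sketch of how a dedicated dispatch table keeps the new task separate and makes follow-up architectures a one-line addition, consider the snippet below. The class, the dictionary name, and `resolve_embedding_class` are hypothetical stand-ins, not the identifiers or logic used in optimum-intel.

```python
class ToyQwen2VLForEmbedding:
    """Hypothetical stand-in for the Qwen2VL embedding inference class."""


# Hypothetical dispatch table: model type -> inference class for this task.
# Supporting another architecture would mean adding one more entry here.
TOY_IMAGE_TEXT_TO_EMBEDDING_CLS_MAPPING = {
    "qwen2_vl": ToyQwen2VLForEmbedding,
}


def resolve_embedding_class(model_type: str):
    """Return the class registered for model_type, or raise a clear error."""
    try:
        return TOY_IMAGE_TEXT_TO_EMBEDDING_CLS_MAPPING[model_type]
    except KeyError:
        supported = ", ".join(sorted(TOY_IMAGE_TEXT_TO_EMBEDDING_CLS_MAPPING))
        raise ValueError(
            f"Unsupported model type {model_type!r} for image-text-to-embedding; "
            f"supported: {supported}"
        ) from None


resolved = resolve_embedding_class("qwen2_vl")
```

Keeping this table separate from the generation mapping means an unregistered model type fails early with an explicit message instead of silently falling back to a causal LM class.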

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

jianxiah-intel added 2 commits March 25, 2026 10:08

[OpenVINO] Add image-text-to-embedding task for Qwen2VL

dd880f4

Add a new task type that runs the full Qwen2VL (for now) multimodal pipeline through a stateless language model backbone (no lm_head, no KV-cache) and returns raw last_hidden_state [B, T, D].

add e2e test that compare the results of ov exported model and transf…

a6e129e

…ormers output.

jianxiah-intel wants to merge 2 commits into huggingface:main from jianxiah-intel:image-text-to-embedding