Add Qwen3-Omni (dense) OpenVINO export and inference support by sgonorov · Pull Request #1640 · huggingface/optimum-intel

sgonorov · 2026-03-18T22:31:43Z

What does this PR do?

Adds OpenVINO export and inference support for the Qwen3-Omni dense multimodal model (text + vision + audio).

The model is exported as 6 sub-models: language model, text embeddings, vision patch embeddings, vision merger (with deepstack features), vision position embeddings, and audio encoder. Inference supports text-only, image, audio, and combined image+audio inputs through OVModelForVisualCausalLM.
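For orientation, the export step could be scripted roughly as follows. This is a sketch under stated assumptions, not code from the PR: it relies on the existing `optimum-cli export openvino` entry point and on the `image-text-to-text` task under which this PR registers `qwen3_omni`. The helper name `build_export_command`, the model ID, and the output directory are hypothetical placeholders.

```python
import shlex
from typing import List


def build_export_command(
    model_id: str,
    output_dir: str,
    task: str = "image-text-to-text",
) -> List[str]:
    """Build (but do not run) the argv for an OpenVINO export via optimum-cli."""
    return [
        "optimum-cli", "export", "openvino",
        "--model", model_id,
        "--task", task,
        output_dir,
    ]


if __name__ == "__main__":
    # Placeholder model ID and output directory; substitute real values.
    cmd = build_export_command("<qwen3-omni-dense-model-id>", "qwen3_omni_ov")
    print(shlex.join(cmd))
```

If the export succeeds, the resulting directory would presumably be loaded with `OVModelForVisualCausalLM.from_pretrained(output_dir)`, the class named in the description above.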

Key implementation details:

- Language model exports hidden_states alongside logits for correct stateful transformation.
- 4D position IDs for Qwen3-Omni's multimodal RoPE (vs. 3D in Qwen3-VL).
- Audio inference replicates the original model's windowed chunking pipeline before calling the exported audio encoder.
- Vision merger uses InputEmbeddingPatcher for the position embedding sub-model to work around inspect.signature resolution issues after prior exports.

Note: Requires transformers from commit 3d1a4f5e for Qwen3-Omni dense variant support (not yet in a stable release). The Talker subsystem (speech synthesis) is not present in the dense 4B variant and is not supported.
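The windowed chunking mentioned in the audio inference item can be pictured with a small sketch. It only illustrates the general idea of splitting a long sequence of audio feature frames into fixed-size windows and padding the last one. The window length, the padding value, and the function name `chunk_frames` are assumptions for illustration, not the values or code used by Qwen3-Omni or by this PR.

```python
from typing import List, Tuple


def chunk_frames(
    frames: List[float],
    window: int,
    pad_value: float = 0.0,
) -> Tuple[List[List[float]], List[int]]:
    """Split `frames` into consecutive windows of length `window`.

    Returns the padded chunks and the true (unpadded) length of each chunk,
    so padded positions can be masked or dropped after encoding.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    chunks: List[List[float]] = []
    lengths: List[int] = []
    for start in range(0, len(frames), window):
        piece = frames[start:start + window]
        lengths.append(len(piece))
        # Pad only the final, shorter chunk up to the window length.
        chunks.append(piece + [pad_value] * (window - len(piece)))
    return chunks, lengths


if __name__ == "__main__":
    chunks, lengths = chunk_frames([float(i) for i in range(10)], window=4)
    print(lengths)     # [4, 4, 2]
    print(chunks[-1])  # [8.0, 9.0, 0.0, 0.0]
```

Keeping the true lengths next to the padded chunks is one way to let a fixed-shape encoder call run on every window while still recovering only the valid outputs afterwards.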

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2026-03-19T16:37:29Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Wovchena · 2026-03-27T10:00:50Z

optimum/exporters/openvino/model_configs.py

        "transformers",
        "AutoModelForImageTextToText",
    )
+    TasksManager._CUSTOM_CLASSES[("pt", "qwen3_omni", "image-text-to-text")] = (


I suspect a new omni task is needed

Wovchena · 2026-03-30T05:37:49Z

tests/openvino/utils_tests.py

    "qwen3": "optimum-intel-internal-testing/tiny-random-qwen3",
    "qwen3_moe": "optimum-intel-internal-testing/tiny-random-qwen3moe",
    "qwen3_vl": "optimum-intel-internal-testing/tiny-random-qwen3-vl",
+    "qwen3_omni": "optimum-intel-internal-testing/tiny-random-qwen3-omni",


Doesn't exist

sgonorov marked this pull request as ready for review March 25, 2026 22:13

Wovchena reviewed Mar 30, 2026


sgonorov added 3 commits March 30, 2026 14:34

Initial implementation

6b9edd5

Fixes and debugging

606797c

Qwen3-Omni tests generation

e04d859

sgonorov force-pushed the qwen-3-omni-dense-support branch from 03f88fb to e04d859 on March 30, 2026 10:35

sgonorov and others added 5 commits March 31, 2026 04:50

Fixes and polishing

e1e619d

Other inputs support

033da0d

More unit test and logic fixes

ca0d292

More unit test and logic fixes

1490ace

More unit test and logic fixes pt.3

1b52029

sgonorov wants to merge 8 commits into huggingface:main from sgonorov:qwen-3-omni-dense-support
