Motivation.
Ming-flash-omni Preview is an upgraded version of Ming-Omni, built upon a sparser Mixture-of-Experts (MoE) variant of Ling-Flash-2.0. According to its technical report, Ming-flash-omni Preview shows competitive performance in vision-text understanding, image generation, audio understanding, and text-to-speech.
The primary objective of this proposal is to adapt this model to the vllm-omni framework.
Proposed Change.
The implementation follows a three-phase roadmap:
Phase 1: This phase focuses on adapting Ming-flash-omni to run on Ascend with the current vllm adapter.
Phase 2: This phase integrates Ming-flash-omni into the full multi-stage pipeline: the Thinker serves as the multi-modal encoder, the Talker as the LLM, and Show as the VisionDecoder or AudioDecoder.
Phase 3: This phase focuses on performance tuning to maximize NPU throughput.
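For illustration, the Phase 2 multi-stage flow can be sketched as a simple chain of stages. This is a minimal sketch only; the function names (`thinker`, `talker`, `show`, `pipeline`) and decoder names are placeholders, not the actual vllm-omni or Ming-flash-omni APIs.

```python
# Hypothetical sketch of the Phase 2 pipeline: Thinker (multi-modal
# encoder) -> Talker (LLM) -> Show (VisionDecoder or AudioDecoder).
# All names here are illustrative assumptions, not real APIs.
from typing import Any, Dict


def thinker(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Encode multi-modal inputs (text / image / audio) into hidden states."""
    return {"hidden": f"encoded({sorted(inputs)})"}


def talker(hidden: Dict[str, Any]) -> Dict[str, Any]:
    """LLM stage: generate tokens conditioned on the encoded hidden states."""
    return {"tokens": f"generated({hidden['hidden']})"}


def show(tokens: Dict[str, Any], modality: str) -> str:
    """Decode generated tokens into the target modality."""
    decoder = {"image": "VisionDecoder", "audio": "AudioDecoder"}[modality]
    return f"{decoder}({tokens['tokens']})"


def pipeline(inputs: Dict[str, Any], modality: str = "audio") -> str:
    """Run the full Thinker -> Talker -> Show chain."""
    return show(talker(thinker(inputs)), modality)
```

The point of the sketch is the stage boundary: each stage consumes only the previous stage's output, which is what lets the pipeline map onto vllm-omni's multi-stage execution.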
Feedback Period.
No response
CC List.
No response
Any Other Things.
No response
Before submitting a new issue...