[Bug]: vllm-omni(v0.12.0) results of talker model of qwen2.5-omni are incorrect when running with enforce eager being False #912

@Jinxinxiang5525

Description

Your current environment

The output of python collect_env.py
Your output of `python collect_env.py` here

Your code version

The commit id or version of vllm

The commit id or version of vllm-omni

🐛 Describe the bug

The tokens output by the talker model of qwen2.5-omni are incorrect when `enforce_eager` (in qwen2_5_omni.yaml) is set to false (i.e., graph compilation is enabled on the NPU). Under the same parameters, the thinker model's output is correct.
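For reference, the workaround implied above is to keep the talker in eager mode. The key below is a hypothetical sketch; the actual layout of `qwen2_5_omni.yaml` in vllm-omni may nest this setting differently.

```yaml
# Sketch only: the exact key path in qwen2_5_omni.yaml may differ.
# Setting enforce_eager to true disables graph compilation on the NPU,
# which avoids the incorrect talker tokens described in this issue.
enforce_eager: true
```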

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Metadata

Labels

bug (Something isn't working)
