[ModelRunner] remove unused args (follow vllm changes) (#159)

MengqingCao · web-flow · commit 79fbb20b4db5 · 2025-02-25T17:51:09.000+08:00
### What this PR does / why we need it?
The argument list of `Attention.forward()` was changed by vllm-project/vllm#13555. This PR removes the now-unused arguments `kv_caches` and `attn_metadata` from the model call in `execute_model`.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with the existing tests.

Signed-off-by: MengqingCao <cmq0113@163.com>
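
For reviewers, here is a minimal sketch of why a call site has to follow a signature change like this one. It is not vllm or vllm-ascend code: `forward_after_change` is a hypothetical stand-in, and the sketch assumes the callee takes no `**kwargs` that would absorb the extra keywords. Under that assumption, Python rejects the old call style with a `TypeError`.

```python
def forward_after_change(input_ids, positions, intermediate_tensors=None):
    """Hypothetical stand-in for a forward() that no longer takes
    `kv_caches` or `attn_metadata` (assumed signature, for illustration)."""
    return (input_ids, positions, intermediate_tensors)


def old_style_call_fails():
    """Return True if passing the removed keywords raises TypeError."""
    try:
        forward_after_change(
            input_ids=[1, 2],
            positions=[0, 1],
            kv_caches=[],          # removed argument
            attn_metadata=None,    # removed argument
            intermediate_tensors=None,
        )
    except TypeError:
        # Python rejects keyword arguments the callee does not declare.
        return True
    return False


# New call style: the two removed keywords are simply dropped.
new_style_result = forward_after_change(
    input_ids=[1, 2],
    positions=[0, 1],
    intermediate_tensors=None,
)
```

The diff below makes the same kind of change at the real call site by dropping the two keyword arguments.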
diff --git a/vllm_ascend/model_runner.py b/vllm_ascend/model_runner.py
@@ -1142,8 +1142,6 @@ def execute_model(
                 hidden_or_intermediate_states = model_executable(
                     input_ids=model_input.input_tokens,
                     positions=model_input.input_positions,
-                    kv_caches=kv_caches,
-                    attn_metadata=model_input.attn_metadata,
                     intermediate_tensors=intermediate_tensors,
                     **MultiModalKwargs.as_kwargs(multi_modal_kwargs,
                                                  device=self.device),