Commit 875dcc0
fix AiterFlashAttentionImpl init (#20103)
Summary:
Pull Request resolved: #20103
Signed-off-by: Yu Guo <[email protected]>
Running Llama 4 fails with ```TypeError: AiterFlashAttentionImpl.__init__() got multiple values for argument 'use_irope'```. The cause is that `AiterFlashAttentionImpl.__init__()` is missing the `kv_sharing_target_layer_name` argument that the attention layer passes to every backend implementation (see https://github.com/vllm-project/vllm/blob/296ce95d8e72f4c6680bda539058f48dbe0f340a/vllm/attention/layer.py#L54), so the extra positional argument collides with `use_irope`.
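For context, here is a minimal sketch of the fix as described above: adding `kv_sharing_target_layer_name` to the constructor so the caller's arguments line up again. Apart from `kv_sharing_target_layer_name` and `use_irope`, the parameter list and the `NotImplementedError` guard are assumptions for illustration, not the exact diff.

```python
from typing import Optional


class AiterFlashAttentionImpl:
    def __init__(
        self,
        num_heads: int,
        head_size: int,
        scale: float,
        num_kv_heads: int,
        alibi_slopes: Optional[list[float]],
        sliding_window: Optional[int],
        kv_cache_dtype: str,
        logits_soft_cap: Optional[float] = None,
        attn_type: str = "decoder",
        # The caller in vllm/attention/layer.py passes this argument; before the
        # fix it was missing here, so the value landed in the `use_irope` slot
        # positionally and then collided with the explicit `use_irope=` keyword.
        kv_sharing_target_layer_name: Optional[str] = None,
        use_irope: bool = False,
    ) -> None:
        if kv_sharing_target_layer_name is not None:
            # Guard (assumed behavior): this backend does not implement KV
            # sharing, so reject the option explicitly instead of ignoring it.
            raise NotImplementedError(
                "KV sharing is not supported by the AITER FlashAttention backend."
            )
        self.num_heads = num_heads
        self.head_size = head_size
        self.scale = scale
        self.num_kv_heads = num_kv_heads
        self.use_irope = use_irope
        # ... remaining initialization unchanged ...
```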
Test Plan:
Launch a Llama 4 server with this fix.
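As an illustration of how such a server could be smoke-tested once launched (the base URL, API key, and model id below are assumptions, not part of this PR):

```python
# Hypothetical smoke test against a locally running vLLM OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
resp = client.chat.completions.create(
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # illustrative model id
    messages=[{"role": "user", "content": "Say hello."}],
    max_tokens=32,
)
print(resp.choices[0].message.content)
```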
Rollback Plan:
Differential Revision: D773406371

parent 296ce95 · commit 875dcc0
1 file changed: 4 lines added, 0 lines removed.
(Diff: additions at diff lines 390 and 396–398 of the changed file.)