Remove bindings for flash_attn_with_kvcache #107

MatthewBonanni · 2025-11-10T22:22:21Z

vLLM no longer calls flash_attn_with_kvcache anywhere. This PR removes its Python binding and sets FLASHATTENTION_DISABLE_APPENDKV, which reduces build time and binary size.

Signed-off-by: Matthew Bonanni <[email protected]>
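
Downstream code that might still reference the removed function could guard against its absence. The following is only a minimal sketch: the helper `has_symbol` is hypothetical, and the module path `vllm.vllm_flash_attn` is an assumption about where the bindings are importable from, not something this PR states.

```python
import importlib


def has_symbol(module_name: str, symbol: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `symbol`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is not installed (or failed to import): treat as missing.
        return False
    return hasattr(module, symbol)


# Assumed import path; adjust to wherever the bindings live in your install.
FA_MODULE = "vllm.vllm_flash_attn"

if __name__ == "__main__":
    if has_symbol(FA_MODULE, "flash_attn_with_kvcache"):
        print("flash_attn_with_kvcache is still exported (older build)")
    else:
        print("flash_attn_with_kvcache is not available in this build")
```

The check only inspects the Python-level attribute; it says nothing about whether the underlying kernels were compiled in.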

MatthewBonanni added 2 commits November 10, 2025 16:11

remove method

d92764f

Signed-off-by: Matthew Bonanni <[email protected]>

disable AppendKV

220b280

Signed-off-by: Matthew Bonanni <[email protected]>

This was referenced Nov 10, 2025

[Test] Remove old non-varlen FA2 test vllm-project/vllm#28420

Merged

[Attention] Bump FA for removed method vllm-project/vllm#28429

Merged

remove from init

17cd59d

Signed-off-by: Matthew Bonanni <[email protected]>

LucasWilkinson merged commit 58e0626 into vllm-project:main Nov 13, 2025
1 check passed

MatthewBonanni deleted the remove_unused_method branch November 14, 2025 14:47
