@MatthewBonanni
vLLM no longer uses flash_attn_with_kvcache anywhere. This PR removes its Python binding and sets FLASHATTENTION_DISABLE_APPENDKV to reduce build time and binary size.
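
For anyone who wants to double-check the "no longer used anywhere" claim against their own checkout before dropping a binding, here is a minimal sketch of a whole-word reference scan. It is not part of this PR: the helper name `find_references` and the choice to scan only `.py` files are assumptions for illustration.

```python
import os
import re


def find_references(root, symbol, extensions=(".py",)):
    """Return sorted (path, line_number) pairs where `symbol` appears as a whole word.

    Hypothetical helper: not taken from vLLM or flash-attention.
    """
    # \b keeps longer identifiers that merely contain the symbol from matching.
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if pattern.search(line):
                        hits.append((path, lineno))
    return sorted(hits)


if __name__ == "__main__":
    # Scan the current directory; an empty result suggests the symbol is unused.
    for path, lineno in find_references(".", "flash_attn_with_kvcache"):
        print(f"{path}:{lineno}")
```

An empty result only covers the file types scanned, so C++/CUDA sources or dynamic lookups would need a separate check.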

Signed-off-by: Matthew Bonanni <[email protected]>
@LucasWilkinson LucasWilkinson merged commit 58e0626 into vllm-project:main Nov 13, 2025
1 check passed
@MatthewBonanni MatthewBonanni deleted the remove_unused_method branch November 14, 2025 14:47