
Conversation

@MatthewBonanni (Contributor) commented Nov 10, 2025

Purpose

This PR removes test_flash_attn_with_paged_kv, which tests a function (flash_attn_with_kvcache) that is no longer used anywhere in vLLM. Removing it helps eliminate the dependence on FA2.
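For context, the dropped test exercised the paged KV-cache entry point of FA2. Below is a minimal sketch of that call shape, based on the upstream flash-attn flash_attn_with_kvcache API; the shapes, sizes, and values are illustrative assumptions, not copied from the removed test.

```python
import torch
from flash_attn import flash_attn_with_kvcache  # upstream FA2 API

# Illustrative sizes (assumptions, not the removed test's parameters)
batch, seqlen_q, num_heads, head_dim = 2, 1, 8, 128
num_blocks, block_size, blocks_per_seq = 64, 16, 4

q = torch.randn(batch, seqlen_q, num_heads, head_dim, dtype=torch.float16, device="cuda")
# Paged KV cache laid out as (num_blocks, block_size, num_heads, head_dim)
k_cache = torch.randn(num_blocks, block_size, num_heads, head_dim, dtype=torch.float16, device="cuda")
v_cache = torch.randn_like(k_cache)

# Per-sequence KV lengths and the block table mapping each sequence to its cache blocks
cache_seqlens = torch.tensor([37, 12], dtype=torch.int32, device="cuda")
block_table = torch.arange(batch * blocks_per_seq, dtype=torch.int32, device="cuda").reshape(batch, blocks_per_seq)

out = flash_attn_with_kvcache(
    q, k_cache, v_cache,
    cache_seqlens=cache_seqlens,
    block_table=block_table,
    causal=True,
)
```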



Signed-off-by: Matthew Bonanni <[email protected]>
@MatthewBonanni MatthewBonanni changed the title [Test] Remove old non-varlet FA2 test [Test] Remove old non-varlen FA2 test Nov 10, 2025
@gemini-code-assist (bot) left a comment

Code Review

This pull request removes test_flash_attn_with_paged_kv, a test for the obsolete function flash_attn_with_kvcache. The change is a straightforward code cleanup, removing unused test code. The deletion is self-contained and I found no issues with it.
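For contrast with the "non-varlen" entry point being dropped, here is a rough sketch of the variable-length (varlen) interface referenced in the PR title, using the upstream flash_attn_varlen_func API; the packing, sizes, and values are illustrative assumptions.

```python
import torch
from flash_attn import flash_attn_varlen_func  # upstream API; vLLM ships its own vllm-flash-attn fork

# Two sequences of lengths 5 and 3, packed into (total_tokens, num_heads, head_dim)
# (sizes are illustrative assumptions)
num_heads, head_dim = 8, 128
seqlens = [5, 3]
total_tokens = sum(seqlens)

q = torch.randn(total_tokens, num_heads, head_dim, dtype=torch.float16, device="cuda")
k = torch.randn_like(q)
v = torch.randn_like(q)

# Cumulative offsets marking sequence boundaries in the packed tensors: [0, 5, 8]
cu_seqlens = torch.tensor([0, 5, 8], dtype=torch.int32, device="cuda")

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens,
    cu_seqlens_k=cu_seqlens,
    max_seqlen_q=max(seqlens),
    max_seqlen_k=max(seqlens),
    causal=True,
)
```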

@LucasWilkinson (Collaborator) left a comment

LGTM; can we remove this on the vllm-flash-attn side too?

@LucasWilkinson LucasWilkinson enabled auto-merge (squash) November 10, 2025 21:05
@github-actions github-actions bot added the ready (ONLY add when PR is ready to merge/full CI is needed) label Nov 10, 2025
@MatthewBonanni (Contributor, Author) commented Nov 10, 2025

> LGTM; can we remove this on the vllm-flash-attn side too?

@LucasWilkinson are you sure? I realized that this still exists on upstream main, so removing it would cause a divergence: https://github.com/Dao-AILab/flash-attention

Edit: discussed offline. Created PR: vllm-project/flash-attention#107

@yewentao256 (Member) left a comment

LGTM, thanks for the work!

@LucasWilkinson LucasWilkinson merged commit 0bf29fa into vllm-project:main Nov 10, 2025
26 checks passed
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Nov 13, 2025
Signed-off-by: Matthew Bonanni <[email protected]>
Signed-off-by: xuebwang-amd <[email protected]>
devpatelio pushed a commit to SumanthRH/vllm that referenced this pull request Nov 29, 2025