Conversation
@tdoublep tdoublep commented Mar 14, 2024

The recent fix #3269 removed flash_attn as an explicit dependency, since it was breaking builds in a number of environments and increased the wheel size.

This PR leaves the wheel unchanged, but installs flash_attn independently within the Docker build (for both the test and runtime images). This allows us to use the FlashAttentionBackend in containerized environments without affecting those using the package in other ways.
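A minimal sketch of what such an independent install could look like in a Dockerfile. Note this is an illustration of the approach described above, not the PR's actual diff; the stage names and the `--no-build-isolation` flag are assumptions:

```dockerfile
# Hypothetical fragment: install flash_attn inside the image build
# instead of declaring it as a dependency of the wheel.
FROM vllm-base AS runtime

# flash-attn compiles against an already-installed torch, so it is
# installed as a separate step after the main requirements.
RUN pip install flash-attn --no-build-isolation
```

Keeping the install in the Dockerfile means users who `pip install` the wheel directly are unaffected, while container users still get the FlashAttentionBackend.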

@tdoublep tdoublep changed the title Install flash-attn in Docker image Install flash_attn in Docker image Mar 14, 2024
@simon-mo simon-mo merged commit 06ec486 into vllm-project:main Mar 14, 2024