Conversation

@CISC CISC commented Apr 7, 2025

This allows BF16 KV-cache on CUDA.

Full disclosure: I noticed this feature in the ik_llama.cpp repo, but this is not an upstream of that code; it was a simple feature to add independently.
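For context, a BF16 KV-cache would typically be requested at the command line. This is a hypothetical sketch based on llama.cpp's `--cache-type-k`/`--cache-type-v` options, not taken from this PR; the model path and prompt are placeholders:

```shell
# Hypothetical invocation: request a BF16 KV-cache on a CUDA build.
# -ngl 99 offloads all layers to the GPU so the CUDA KV-cache path is used.
./llama-cli -m model.gguf -ngl 99 \
  --cache-type-k bf16 \
  --cache-type-v bf16 \
  -p "Hello"
```

Without CUDA (or this change), asking for a bf16 cache type would fall back or fail, since the backend must support BF16 tensors for the K/V buffers.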

CISC commented Apr 7, 2025

Actually, I see this will conflict with changes just made in llama.cpp, so I will move this PR there instead.

@CISC CISC closed this Apr 7, 2025
@CISC CISC deleted the cuda-bf16-kv-cache branch April 7, 2025 20:22