Conversation

@CISC CISC commented Apr 7, 2025

This allows BF16 KV-cache on CUDA.

Full disclosure: I noticed this feature in the ik_llama.cpp repo, but this is not an upstream of that code; it was a simple feature to add independently.
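For context, a BF16 KV-cache would typically be requested at the command line. This is a hypothetical sketch based on llama.cpp's `--cache-type-k`/`--cache-type-v` options, not taken from this PR; the model path and prompt are placeholders:

```shell
# Hypothetical invocation: request a BF16 KV-cache on a CUDA build.
# -ngl 99 offloads all layers to the GPU so the CUDA KV-cache path is used.
./llama-cli -m model.gguf -ngl 99 \
  --cache-type-k bf16 \
  --cache-type-v bf16 \
  -p "Hello"
```

Without CUDA (or this change), asking for a bf16 cache type would fall back or fail, since the backend must support BF16 tensors for the K/V buffers.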

CISC commented Apr 7, 2025

Actually, I see this will conflict with changes just made in llama.cpp, so I will move this PR there instead.

@CISC CISC closed this Apr 7, 2025
@CISC CISC deleted the cuda-bf16-kv-cache branch April 7, 2025 20:22