metal : enable FA for MLA heads #18950

Merged
ggerganov merged 1 commit into master from gg/metal-fa-mla-tune on Jan 20, 2026

Conversation

@ggerganov
Member

ref #18936

Re-enable flash attention (FA) for a K head size of 576 (the MQA mode of MLA), and tune the number of simdgroups and the loop unrolling for performance.
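For readers wondering where 576 comes from, here is a minimal sketch. It assumes DeepSeek-style MLA dimensions (a 512-wide compressed KV latent plus a 64-wide RoPE part), which this PR does not state, and it assumes the value is the leading latent part of each cached row. The function names and the softmax scale are hypothetical and chosen for illustration only; this is not the Metal kernel.

```python
import math

# Assumed DeepSeek-style MLA dimensions; these are NOT stated in this PR.
KV_LORA_RANK = 512  # width of the compressed KV latent (assumption)
ROPE_DIM = 64       # width of the decoupled RoPE part (assumption)


def mla_mqa_head_sizes(kv_lora_rank=KV_LORA_RANK, rope_dim=ROPE_DIM):
    """Return (K head size, V head size) for MLA run in MQA mode."""
    return kv_lora_rank + rope_dim, kv_lora_rank


def mqa_attention(queries, kv_rows, v_size):
    """Attend every query head over one shared KV head (MQA).

    queries: one query vector per head, each as long as the K head size
    kv_rows: cached rows shared by all heads, each as long as the K head size
    v_size:  number of leading components of each row used as the value
             (an assumption made for this sketch)
    """
    out = []
    for q in queries:
        # Illustrative scale only; a real model may use a different one.
        scale = 1.0 / math.sqrt(len(q))
        scores = [scale * sum(a * b for a, b in zip(q, row)) for row in kv_rows]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * row[i] for w, row in zip(weights, kv_rows)) / total
            for i in range(v_size)
        ])
    return out
```

Under these assumptions, `mla_mqa_head_sizes()` gives a K head size of 576 and a V head size of 512, so the kernel has to handle K and V heads of different widths.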

@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and Apple Metal (https://en.wikipedia.org/wiki/Metal_(API)) labels Jan 20, 2026
@ggerganov ggerganov merged commit 2711919 into master Jan 20, 2026
78 checks passed
@ggerganov ggerganov deleted the gg/metal-fa-mla-tune branch January 20, 2026 10:21
michaelw9999 pushed a commit to michaelw9999/llama.cpp that referenced this pull request Mar 4, 2026

Labels

Apple Metal: https://en.wikipedia.org/wiki/Metal_(API)
ggml: changes relating to the ggml tensor library for machine learning
