Move GLM4 f32 attention fix to the correct function by 0cc4m · Pull Request #13750 · ggml-org/llama.cpp

0cc4m · 2025-05-24T13:31:28Z

@ggerganov You merged SWA support (#13194) 3 hours before I merged my GLM4 fix (#13639). Both PRs touched the same build_attn functions, so there should have been a merge conflict. Instead, for whatever reason, my patch was applied to the newly created build_attn function that takes a unified_iswa KV cache, which GLM4 does not use, so the fix stopped having any effect. This PR moves my patch back to the unified build_attn function. I don't think I've seen something like this before; quite the coincidence.
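A minimal sketch of the situation described above, assuming build_attn is overloaded on the KV-cache input type and that the fix amounts to requesting f32 precision for GLM4 (inferred from the PR title). All type names, the precision enum, and the signatures are hypothetical stand-ins, not the actual llama.cpp code:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two attention-input types; the real
// llama.cpp types and build_attn signatures differ.
struct attn_input_unified {};       // plain unified KV cache (the path GLM4 uses, per the PR)
struct attn_input_unified_iswa {};  // unified_iswa KV cache introduced with SWA support

enum class precision { default_prec, f32 };

// Overload for the unified KV cache: the GLM4 fix belongs here,
// because this is the overload GLM4 graphs actually call.
inline precision build_attn(const attn_input_unified &, const std::string & arch) {
    return arch == "glm4" ? precision::f32 : precision::default_prec;
}

// Overload for the unified_iswa KV cache: GLM4 graphs do not go through
// this path, so a GLM4-only check placed here would silently do nothing.
inline precision build_attn(const attn_input_unified_iswa &, const std::string &) {
    return precision::default_prec;
}

// Small self-check of the dispatch in this sketch.
inline void check_glm4_dispatch() {
    assert(build_attn(attn_input_unified{}, "glm4") == precision::f32);
    assert(build_attn(attn_input_unified{}, "other") == precision::default_prec);
}
```

Because both overloads share a name and a similar body, a patch hunk can plausibly apply cleanly to either one, which may be why no conflict was reported.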

ggerganov · 2025-05-24T13:54:14Z

Huh, it's indeed strange that we didn't get a merge conflict.

LostRuins · 2025-05-24T14:12:09Z

Thanks for spotting it quickly.

Move GLM4 f32 attention fix to the correct function

7ad56c7

0cc4m requested a review from ggerganov May 24, 2025 13:31

ggerganov approved these changes May 24, 2025

0cc4m merged commit 259469c into master May 24, 2025
46 checks passed

0cc4m deleted the 0cc4m/glm4-fix2 branch May 24, 2025 14:49

0cc4m merged 1 commit into master from 0cc4m/glm4-fix2