
model: add glm-asr support#17901

Merged
CISC merged 12 commits into ggml-org:master from piDack:glm_asr_support
Dec 15, 2025


@piDack
Contributor

@piDack piDack commented Dec 10, 2025


This PR adds support for the GLM-ASR architecture, specifically validating with the zai-org/GLM-ASR-Nano-2512 model.

Key Changes:

  • Model Support: Implemented necessary logic to support GLM-ASR models.
  • Conversion Script: Updated convert_hf_to_gguf.py to handle dynamic configuration keys (GLM-ASR uses "lm_config" instead of "text_config"). It now identifies the config section by checking:
    llm_config_key = "lm_config" if "lm_config" in self.hparams else "text_config"
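
A minimal sketch of how that key check might be used, assuming `hparams` is the plain dict loaded from the model's config.json. The helper name `select_llm_config` is hypothetical and is not taken from convert_hf_to_gguf.py:

```python
def select_llm_config(hparams):
    """Return (key, section) for the language-model part of a config dict.

    Hypothetical helper: prefers "lm_config" (used by GLM-ASR) and falls
    back to "text_config" when "lm_config" is absent.
    """
    llm_config_key = "lm_config" if "lm_config" in hparams else "text_config"
    # An empty dict is returned if neither section exists.
    return llm_config_key, hparams.get(llm_config_key, {})


# Illustrative configs; the field values are made up.
glm_asr_like = {"lm_config": {"hidden_size": 2048}}
other_model = {"text_config": {"hidden_size": 1024}}

print(select_llm_config(glm_asr_like)[0])
print(select_llm_config(other_model)[0])
```

Falling back to "text_config" keeps the behaviour unchanged for models that do not define "lm_config".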

Result

[screenshot]

@piDack piDack changed the title [model] add glm-asr support model: add glm-asr support Dec 10, 2025
@piDack piDack requested review from CISC and ngxson December 12, 2025 06:59
Contributor

@ngxson ngxson left a comment


The mtmd changes look good, waiting for final approval from @CISC

@CISC
Member

CISC commented Dec 14, 2025

You probably need to rebase to fix server CIs.

@piDack piDack requested a review from CISC December 14, 2025 08:11
Member

@CISC CISC left a comment


EditorConfig error is unrelated.

@piDack
Contributor Author

piDack commented Dec 14, 2025

EditorConfig error is unrelated.

Hi, I think this PR is ready to be merged. It seems the failures in the ggml-ci-x64-cpu-low-perf and ggml-ci-x64-cpu-high-perf CI jobs are unrelated to my changes.

Member

@CISC CISC left a comment


Tested conversion, it was not a pleasant experience. :)

This works far more smoothly...

@CISC CISC merged commit 745fa0e into ggml-org:master Dec 15, 2025
1 check passed
@LostRuins
Collaborator

Gave this a try and it's working for me, however the transcription quality drops rapidly the longer the audio gets. A 10s clip works perfectly fine, but a 1 min clip just gets interpreted as random words.

Also, since I couldn't find quants online, I made some: https://huggingface.co/concedo/GLM-ASR-Nano-2512-GGUF

@CISC
Member

CISC commented Dec 19, 2025

Gave this a try and it's working for me, however the transcription quality drops rapidly the longer the audio gets. A 10s clip works perfectly fine, but a 1 min clip just gets interpreted as random words.

Sounds like a rope issue?

@LostRuins
Collaborator

Does this use linear rope or 1d mrope?

@LostRuins
Collaborator

So it looks like linear rope, but I don't think it's related to rope at all, though maybe it is an attention issue. Anyway, you can test on mtmd-cli.

Here are some test files:

mtmd-cli.exe -m c:\Users\user\Desktop\GLM-ASR-Nano-1.6B-2512-Q8_0.gguf --mmproj c:\Users\user\Desktop\mmproj-GLM-ASR-Nano-2512-Q8_0.gguf -p "what do you hear?" -n 150 --audio file.wav

this one will work fine
half_ok.wav
Perfect transcription

The rumblings of dissatisfaction in this land have become more ominous recently. Scouts report that several barbarian tribes speaking strange tongues have arrived on the shores of Kairenaka to our west.

exact same file but longer, it breaks down or frequently returns empty output
full_bad.wav
often blank output, best result is still bad:

The Ramblings of this stress and discord in the west have become more Ramming, as recently, Scouts report that several barbaric tribes speaking strange tongues have arrived on the shores of Karnak. To our west, to make matters worse, these people's now walking and hand with my land. The son of the Behemoth, God of the living, and Pharaoh in his realm, he has now reached far beyond the boundaries of the world, possessing great might. Whereas before the women, children, and worldly possessions were hidden, now they're heard beyond the boundaries.

I suspect it's worse once the audio preprocessor splits the audio into multiple chunks. I'm willing to help with troubleshooting, but only if someone actually plans to look into it.
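
To check the chunking suspicion against clip length, a rough sketch like the following could estimate how many chunks a WAV file would be split into. The 30-second window is an assumption borrowed from Whisper-style preprocessing and has not been confirmed for this model; both function names are hypothetical:

```python
import math
import wave

# Assumption: a Whisper-style fixed 30 s window. Not confirmed for GLM-ASR.
CHUNK_SECONDS = 30.0


def wav_duration_seconds(path):
    """Duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()


def chunk_count(duration_s, chunk_s=CHUNK_SECONDS):
    """Number of fixed-size chunks needed to cover duration_s seconds."""
    if duration_s <= 0:
        return 0
    return math.ceil(duration_s / chunk_s)


# Under the 30 s assumption, a 10 s clip fits in one chunk
# while a 1 min clip needs two.
print(chunk_count(10.0), chunk_count(60.0))
```

If the degradation only appears once `chunk_count` exceeds 1, that would be consistent with the chunking hypothesis, though it would not prove it.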

@LostRuins
Collaborator

complete.wav

Here is the complete audio source file. If you use this, it will just output garbage.
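
One way to find the length at which output starts to degrade is to cut progressively longer prefixes from the complete file and run each one through mtmd-cli. The sketch below uses only Python's standard `wave` module and assumes an uncompressed PCM WAV; `trim_wav` is a hypothetical helper name:

```python
import wave


def trim_wav(src_path, dst_path, seconds):
    """Write the first `seconds` of a PCM WAV file to dst_path.

    Returns the number of frames written. If the source is shorter than
    `seconds`, the whole file is copied.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        n_frames = min(src.getnframes(), int(seconds * src.getframerate()))
        frames = src.readframes(n_frames)
    with wave.open(dst_path, "wb") as dst:
        # Copy channel count, sample width and rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return n_frames
```

For example, `trim_wav("complete.wav", "first_20s.wav", 20)` would produce a 20-second prefix to pass to the `--audio` flag.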

@arch-btw
Contributor

arch-btw commented Dec 26, 2025

@piDack I think this implementation needs to be updated now that these files have been merged to make the model HF-compatible:

https://huggingface.co/zai-org/GLM-ASR-Nano-2512/commit/8172188f128059b57f09686dda5b36edff1606dd

Currently it's outputting this:

Model GlmAsrForConditionalGeneration is not supported
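
A quick way to see which architecture name a downloaded checkpoint declares is to read the "architectures" list from its config.json. This is only a sketch: the `supported` set is supplied by the caller and is hypothetical, not the converter's actual registry:

```python
import json


def unsupported_architectures(config_path, supported):
    """Return the architecture names in config.json that are not in `supported`."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Hugging Face configs list model class names under "architectures".
    declared = config.get("architectures", [])
    return [name for name in declared if name not in supported]
```

A non-empty result for the updated repository would match the error above, which suggests the declared architecture name changed with the linked commit.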

Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* [model] add glm-asr support

* fix format for ci

* fix convert format for ci

* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review

* check root architecture for convert hf script

* fix conficlt with upstream

* fix convert script for glm asr & format clip-impl

* format

* restore hparams text

* improved conversion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* [model] add glm-asr support

* fix format for ci

* fix convert format for ci

* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review

* check root architecture for convert hf script

* fix conficlt with upstream

* fix convert script for glm asr & format clip-impl

* format

* restore hparams text

* improved conversion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

Labels

examples python python script changes

5 participants