common : restore grammar-based rejection sampling#18137

Merged
ggerganov merged 2 commits into master from gg/common-restore-grammar
Dec 17, 2025

Conversation

@ggerganov
Member

cont #17937
ref #18107 (comment)
rel #18135

I keep underestimating how much the grammar sampling is used and how slow it is. For now, let's restore the old behaviour and I'll try to figure out a way to adapt #17004 to this.
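For context, here is a minimal sketch of what grammar-based rejection sampling means in general. It is not the llama.cpp code: the toy "digits only" grammar and the names `grammar_accepts`, `sample`, and `sample_with_rejection` are assumptions made for illustration. The idea is to sample from the unconstrained distribution first and run the (slow) grammar check on that single token, falling back to filtering the whole candidate list only when the token is rejected.

```python
import random

def grammar_accepts(piece: str) -> bool:
    """Toy grammar (hypothetical): a token is valid if it is a non-empty run of ASCII digits."""
    return len(piece) > 0 and all(c in "0123456789" for c in piece)

def sample(probs, rng):
    """Draw one token from a {token: probability} mapping."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

def sample_with_rejection(probs, rng):
    """Return (token, number_of_grammar_checks)."""
    # Fast path: sample without the grammar, then check only the chosen token.
    tok = sample(probs, rng)
    if grammar_accepts(tok):
        return tok, 1
    # Slow path: run the grammar over every candidate, then resample.
    allowed = {t: p for t, p in probs.items() if grammar_accepts(t)}
    if not allowed:
        raise ValueError("grammar rejects every candidate")
    return sample(allowed, rng), 1 + len(probs)
```

When the model already prefers grammar-valid tokens, most calls take the fast path and pay for a single grammar check instead of one per vocabulary entry.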

@ggerganov ggerganov changed the title from "common : restart grammar-based rejection sampling" to "common : restore grammar-based rejection sampling" Dec 17, 2025
@ggerganov ggerganov requested a review from aldehir December 17, 2025 12:16
@aldehir
Contributor

aldehir commented Dec 17, 2025

Works as it did before.

I'll look into other constrained decoding implementations. It does seem like a challenging problem to solve, though.

@pwilkin
Member

pwilkin commented Dec 17, 2025

Dumb idea - what if we precomputed, for each grammar state, the entire set of accepted / rejected tokens?

@aldehir
Contributor

aldehir commented Dec 17, 2025

@pwilkin it's not dumb. I was considering this too, but doing it lazily and caching the result.

The tricky part is handling UTF-8 codepoints that span tokens, so you'd need to consider the partial codepoint part of the state as well.
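A minimal sketch of the lazy-caching idea, under strong simplifying assumptions: the grammar here is a toy two-state automaton for `[0-9]+ ("," [0-9]+)*`, tokens are plain ASCII strings, and `step`, `advance`, and `AcceptCache` are hypothetical names. A real grammar state would have to be reduced to a hashable key, and the UTF-8 issue is ignored here.

```python
DIGITS = "0123456789"

def step(state, ch):
    """Toy automaton: state 0 expects a digit, state 1 follows a digit."""
    if ch in DIGITS:
        return 1
    if ch == "," and state == 1:
        return 0
    return None  # dead state: the grammar rejects this character

def advance(state, piece):
    """Feed a whole token through the automaton; None means rejected."""
    for ch in piece:
        state = step(state, ch)
        if state is None:
            return None
    return state

class AcceptCache:
    """Computes the accepted-token set for a grammar state on first use only."""

    def __init__(self, vocab):
        self.vocab = vocab
        self.cache = {}
        self.misses = 0  # how many times the full vocabulary was scanned

    def accepted(self, state):
        if state not in self.cache:
            self.misses += 1
            self.cache[state] = frozenset(
                i for i, piece in enumerate(self.vocab)
                if advance(state, piece) is not None
            )
        return self.cache[state]
```

Usage: `AcceptCache(vocab).accepted(state)` scans the vocabulary the first time a state is seen and is a dictionary lookup afterwards, so states that never occur during generation cost nothing.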

@pwilkin
Member

pwilkin commented Dec 17, 2025

Fair point, but I think that can be optimized this way:
-> for most cases, you can wildcard the Unicode codepoints (in the best case, you can already reject based on the first byte of a codepoint if every character starting with that byte is rejected)
-> for the cases where it matters, you can introduce pseudo-states, i.e. pairs of <grammar state, codepoint bytes so far>. This adds memory overhead, but since most codepoints can be optimized away as in the point above, it shouldn't matter that much
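The two points above can be sketched as follows. This is an illustration under assumptions, not a proposal for the actual implementation: the grammar is a single looping state that allows only `a` and `é`, tokens are raw byte strings, and `ALLOWED`, `advance`, and `accepted` are hypothetical names. The cache key is the pseudo-state `(grammar state, pending UTF-8 bytes)`.

```python
import codecs

# Hypothetical toy grammar: state -> set of allowed characters; state 0 loops on itself.
ALLOWED = {0: {"a", "\u00e9"}}  # "\u00e9" is e-acute, encoded in UTF-8 as C3 A9

def advance(pseudo, token):
    """Advance a (grammar_state, pending_bytes) pseudo-state by one byte token."""
    g, pending = pseudo
    dec = codecs.getincrementaldecoder("utf-8")()
    try:
        # Decode every complete codepoint; an unfinished tail stays buffered.
        text = dec.decode(pending + token, final=False)
    except UnicodeDecodeError:
        return None  # malformed byte sequence
    for ch in text:
        if ch not in ALLOWED[g]:
            return None
    tail = dec.getstate()[0]  # bytes of a codepoint that is still incomplete
    # Early rejection: drop the partial codepoint if no allowed character
    # has an encoding that starts with these bytes.
    if tail and not any(c.encode("utf-8").startswith(tail) for c in ALLOWED[g]):
        return None
    return (g, tail)

VOCAB = [b"a", b"b", b"\xc3", b"\xa9", b"\xc3\xa9", b"\xe2"]
_cache = {}

def accepted(pseudo):
    """Lazily computed set of token ids accepted from a pseudo-state."""
    if pseudo not in _cache:
        _cache[pseudo] = frozenset(
            i for i, tok in enumerate(VOCAB) if advance(pseudo, tok) is not None
        )
    return _cache[pseudo]
```

In this sketch the lone lead byte `\xe2` is rejected immediately because no allowed character starts with it, while `\xc3` is accepted and creates the extra pseudo-state `(0, b"\xc3")`, from which only the continuation byte `\xa9` is valid.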

@ggerganov ggerganov merged commit 4301e27 into master Dec 17, 2025
64 of 71 checks passed
@ggerganov ggerganov deleted the gg/common-restore-grammar branch December 17, 2025 17:46
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* common : restart grammar-based rejection sampling

* sampling : allow null samplers
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* common : restart grammar-based rejection sampling

* sampling : allow null samplers

3 participants