Gemma 4 template parser fixes by pwilkin · Pull Request #21326 · ggml-org/llama.cpp

pwilkin · 2026-04-02T18:08:36Z

Overview

As in topic

Additional information

Quick fixes for some observed discrepancies + refactoring of the parser architecture for the dict format

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: YES for the refactoring, NO for the fixes

aldehir

We should roll a dedicated parser for this model.

aldehir · 2026-04-02T18:21:48Z

common/peg-parser.cpp

+    common_peg_parse_result operator()(const common_peg_string_delim_parser & p) const {
+        trie matcher({p.delimiter});
+
+        size_t pos = start_pos;
+        size_t last_valid_pos = start_pos;
+
+        while (pos < ctx.input.size()) {
+            auto utf8_result = common_parse_utf8_codepoint(ctx.input, pos);
+
+            if (utf8_result.status == utf8_parse_result::INCOMPLETE) {
+                if (!ctx.is_lenient()) {
+                    return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_FAIL, start_pos);
+                }
+                return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, last_valid_pos);
+            }
+
+            if (utf8_result.status == utf8_parse_result::INVALID) {
+                return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_FAIL, start_pos);
+            }
+
+            auto match = matcher.check_at(ctx.input, pos);
+
+            if (match == trie::COMPLETE_MATCH) {
+                return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_SUCCESS, start_pos, pos);
+            }
+
+            if (match == trie::PARTIAL_MATCH) {
+                return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_SUCCESS, start_pos, pos);
+            }
+
+            pos += utf8_result.bytes_consumed;
+            last_valid_pos = pos;
+        }
+
+        if (last_valid_pos == ctx.input.size() && ctx.is_lenient()) {
+            return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, last_valid_pos);
+        }
+        return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, last_valid_pos);
+    }


This is functionally equivalent to p.until("<|\"|>") + p.literal("<|\"|>"). There's no need for a new parser.

aldehir · 2026-04-02T18:32:35Z

common/chat.h

+    std::string                         thinking_start_tag;  // e.g., "💭"
+    std::string                         thinking_end_tag;    // e.g., "_flow"


Beats me, model went cuckoo :P

pwilkin · 2026-04-02T18:41:16Z

We should roll a dedicated parser for this model.

Yes, but I want something out quickly when people are testing. We'll definitely do a proper one and cleanup later on.

aldehir · 2026-04-02T18:52:16Z

Ok, I'm good with that.

aldehir

Ok, undo the core peg parser changes, undo unrelated changes, and this is fine as a quick fix for now.

aldehir · 2026-04-02T18:52:47Z

common/chat-auto-parser-generator.cpp

+                value_parser = p.literal(QUOTE) +
+                    p.tool_arg_string_value(p.until(QUOTE)) +
                    p.literal(QUOTE);
+            } else if (type == "number" || type == "integer") {
+                value_parser = p.tool_arg_value(g4.gemma4_number());
+            } else if (type == "boolean") {
+                value_parser = p.tool_arg_value(g4.gemma4_bool());
+            } else if (type == "null") {
+                value_parser = p.tool_arg_value(g4.gemma4_null());
+            } else if (type == "object") {
+                value_parser = p.tool_arg_value(g4.gemma4_dict());
+            } else if (type == "array") {
+                value_parser = p.tool_arg_value(g4.gemma4_array());
            } else {
-                // Numbers, booleans: raw text up to the next comma or closing brace
-                value_parser = p.tool_arg_value(p.until_one_of({",", "}"}));
+                value_parser = p.tool_arg_value(g4.gemma4_value());


Should use gemma4_value_for_type() here?

aldehir · 2026-04-02T18:53:50Z

common/chat-peg-parser.cpp

+
+static std::string normalize_gemma4_to_json(const std::string & input) {
+    std::string result;
+    result.reserve(input.size() * 2);


In the previous chat-peg-parser, I had a mapper that would build this JSON up incrementally via the AST instead of through a separate pass. Was that removed?

No, it's still there (void common_chat_peg_mapper::map(const common_peg_ast_node & node)). Would have to adapt to the funny format, the model I used for the refactoring was too dumb to do it apparently.

pwilkin · 2026-04-02T19:06:32Z

Done.

ch10299342 · 2026-04-02T21:25:31Z

The new gemma4.jinja template still has the same issue as the GGUF-embedded template: value['type'] | upper crashes with Unknown (built-in) filter 'upper' for type Array when a tool parameter uses JSON Schema
array-style types like ["string", "null"].

The format_parameters macro already handles this correctly for array items (lines ~487–489), but not at the property level. The fix is one line before the if/elif chain:

       {%- endif -%}

   {%- set value_type = value['type'] if value['type'] is string else value['type'][0] -%}

   {%- if value['type'] | upper == 'STRING' -%}

   {%- if value_type | upper == 'STRING' -%}

...

   {%- elif value['type'] | upper == 'OBJECT' -%}

   {%- elif value_type | upper == 'OBJECT' -%}

...

   {%- elif value['type'] | upper == 'ARRAY' -%}

   {%- elif value_type | upper == 'ARRAY' -%}

...

   type:<|"|>{{ value['type'] | upper }}<|"|>}

   type:<|"|>{{ value_type | upper }}<|"|>}

Reproduced with gemma-4-27b-it (GGUF) served via llama-server --jinja when the client sends tool schemas with array types

Cherry-picked from ggml-org/llama.cpp: - fix: gemma 4 template (ggml-org#21326) - vocab: fix Gemma4 tokenizer (ggml-org#21343) - llama-model: read final_logit_softcapping for Gemma 4 (ggml-org#21390) - llama: add custom newline split for Gemma 4 (ggml-org#21406) - common: add gemma 4 specialized parser (ggml-org#21418) Resolved conflict in chat.h/chat.cpp: kept our extended common_chat_template_direct_apply signature as internal _full variant.

Partially implements Gemma4 chat template fix from llama.cpp master. What was done: - Add COMMON_CHAT_FORMAT_PEG_GEMMA4 enum value to common_chat_format What was NOT implemented (infrastructure missing): - chat-auto-parser module does not exist in this fork - common_peg_gemma4_builder class for tool calling - normalize_gemma4_to_json() function - gemma4.jinja template file - tests for gemma4 tool calling The full fix requires the chat-auto-parser infrastructure which is not present in this fork. This enum addition is a placeholder for future implementation when the chat template system is upgraded. See original PR: ggml-org/llama.cpp#21326 Co-authored-by: Piotr Wilkin (ilintar) <[email protected]>

Rebased onto upstream master (b8672+) which includes Gemma 4 model support (PR ggml-org#21309, ggml-org#21326, ggml-org#21418). This enables loading Gemma 4 E2B/E4B GGUF models on-device via llama.cpp. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

pwilkin requested a review from a team as a code owner April 2, 2026 18:08

pwilkin requested review from CISC, aldehir, ggerganov and ngxson April 2, 2026 18:09

aldehir reviewed Apr 2, 2026

View reviewed changes

github-actions bot added the testing Everything test related label Apr 2, 2026

aldehir approved these changes Apr 2, 2026

View reviewed changes

Remake clean version

a507ae5

pwilkin force-pushed the gemma4-parser branch from 4b1f4d1 to a507ae5 Compare April 2, 2026 19:06

NickM-27 mentioned this pull request Apr 2, 2026

Misc. bug: Gemma4 tool calling leaves unexpected tokens in tool calls #21316

Closed

ggerganov approved these changes Apr 2, 2026

View reviewed changes

pwilkin merged commit 5208e2d into ggml-org:master Apr 2, 2026
46 checks passed

gerstnr mentioned this pull request Apr 3, 2026

fix: minja upper/lower filter fails on array type values #21342

Closed

slomin mentioned this pull request Apr 3, 2026

feat(runtime): add Unsloth Gemma 4 support for E2B, E4B, and 26B-A4B potato-os/core#267

Merged

GodEmperor785 mentioned this pull request Apr 3, 2026

Gemma 4 needs new llama.cpp fixes oobabooga/text-generation-webui#7459

Closed

1 task

gerstnr mentioned this pull request Apr 4, 2026

fix(gemma4): handle nullable type arrays #21433

Open

luvwinnie pushed a commit to luvwinnie/llama.cpp that referenced this pull request Apr 5, 2026

fix: gemma 4 template (ggml-org#21326)

3482b55

wordingone pushed a commit to wordingone/llama-cpp-turboquant-cuda that referenced this pull request Apr 6, 2026

fix: gemma 4 template (ggml-org#21326)

3a576ca

CSCSoftware mentioned this pull request Apr 6, 2026

Eval bug: Gemma 4 generates <unused> tokens in infinite loop on Vulkan backend #21516

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Gemma 4 template parser fixes#21326

Gemma 4 template parser fixes#21326
pwilkin merged 1 commit intoggml-org:masterfrom
pwilkin:gemma4-parser

pwilkin commented Apr 2, 2026

Uh oh!

aldehir left a comment

Uh oh!

aldehir Apr 2, 2026

Uh oh!

aldehir Apr 2, 2026

Uh oh!

pwilkin Apr 2, 2026

Uh oh!

pwilkin commented Apr 2, 2026

Uh oh!

aldehir commented Apr 2, 2026

Uh oh!

aldehir left a comment •

edited

Loading

Uh oh!

aldehir Apr 2, 2026

Uh oh!

aldehir Apr 2, 2026

Uh oh!

pwilkin Apr 2, 2026

Uh oh!

pwilkin commented Apr 2, 2026

Uh oh!

ch10299342 commented Apr 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

		std::string thinking_start_tag; // e.g., "💭"
		std::string thinking_end_tag; // e.g., "_flow"

Conversation

pwilkin commented Apr 2, 2026

Overview

Additional information

Requirements

Uh oh!

aldehir left a comment

Choose a reason for hiding this comment

Uh oh!

aldehir Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

aldehir Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

pwilkin Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

pwilkin commented Apr 2, 2026

Uh oh!

aldehir commented Apr 2, 2026

Uh oh!

aldehir left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aldehir Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

aldehir Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

pwilkin Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

pwilkin commented Apr 2, 2026

Uh oh!

ch10299342 commented Apr 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

aldehir left a comment •

edited

Loading