Gemma 4 template parser fixes #21326

Merged
pwilkin merged 1 commit into ggml-org:master from pwilkin:gemma4-parser
Apr 2, 2026


Conversation

@pwilkin
Member

@pwilkin pwilkin commented Apr 2, 2026

Overview

As in topic

Additional information

Quick fixes for some observed discrepancies, plus a refactoring of the parser architecture for the dict format.

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES for the refactoring, NO for the fixes

@pwilkin pwilkin requested a review from a team as a code owner April 2, 2026 18:08
@pwilkin pwilkin requested review from CISC, aldehir, ggerganov and ngxson April 2, 2026 18:09
Contributor

@aldehir aldehir left a comment

We should roll a dedicated parser for this model.

Comment on lines +740 to +778
common_peg_parse_result operator()(const common_peg_string_delim_parser & p) const {
    trie matcher({p.delimiter});

    size_t pos            = start_pos;
    size_t last_valid_pos = start_pos;

    while (pos < ctx.input.size()) {
        auto utf8_result = common_parse_utf8_codepoint(ctx.input, pos);

        if (utf8_result.status == utf8_parse_result::INCOMPLETE) {
            if (!ctx.is_lenient()) {
                return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_FAIL, start_pos);
            }
            return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, last_valid_pos);
        }

        if (utf8_result.status == utf8_parse_result::INVALID) {
            return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_FAIL, start_pos);
        }

        auto match = matcher.check_at(ctx.input, pos);

        if (match == trie::COMPLETE_MATCH) {
            return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_SUCCESS, start_pos, pos);
        }

        if (match == trie::PARTIAL_MATCH) {
            // A partial delimiter match at the end of the buffer may still complete
            // once more input arrives, so request more input rather than succeeding.
            return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, pos);
        }

        pos += utf8_result.bytes_consumed;
        last_valid_pos = pos;
    }

    return common_peg_parse_result(COMMON_PEG_PARSE_RESULT_NEED_MORE_INPUT, start_pos, last_valid_pos);
}
Contributor

This is functionally equivalent to p.until("<|\"|>") + p.literal("<|\"|>"). There's no need for a new parser.
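For illustration, here is a minimal standalone sketch (plain C++, not the actual common_peg combinator API) of the until-then-literal composition the comment describes: consume content up to the delimiter, then consume the delimiter itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Standalone sketch, not the real common_peg combinators: parse "content up to
// a delimiter, then the delimiter itself" -- the until(delim) + literal(delim)
// composition. Returns {matched, content}; content excludes the delimiter.
static std::pair<bool, std::string> until_then_literal(const std::string & input,
                                                       const std::string & delim) {
    const size_t at = input.find(delim);
    if (at == std::string::npos) {
        return { false, "" };  // delimiter never appears: the literal part fails
    }
    return { true, input.substr(0, at) };
}
```

This covers only the non-streaming happy path; the streaming/lenient cases are what the surrounding machinery in the PR handles.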

common/chat.h Outdated
Comment on lines +218 to +219
std::string thinking_start_tag; // e.g., "💭"
std::string thinking_end_tag; // e.g., "_flow"
Contributor

???

Member Author

Beats me, model went cuckoo :P

@pwilkin
Member Author

pwilkin commented Apr 2, 2026

We should roll a dedicated parser for this model.

Yes, but I want to get something out quickly while people are testing. We'll definitely do a proper one and clean up later on.

@github-actions github-actions bot added the testing Everything test related label Apr 2, 2026
@aldehir
Contributor

aldehir commented Apr 2, 2026

Ok, I'm good with that.

Contributor

@aldehir aldehir left a comment

Ok, undo the core peg parser changes, undo unrelated changes, and this is fine as a quick fix for now.

Comment on lines +578 to +592
    value_parser = p.literal(QUOTE) +
                   p.tool_arg_string_value(p.until(QUOTE)) +
                   p.literal(QUOTE);
} else if (type == "number" || type == "integer") {
    value_parser = p.tool_arg_value(g4.gemma4_number());
} else if (type == "boolean") {
    value_parser = p.tool_arg_value(g4.gemma4_bool());
} else if (type == "null") {
    value_parser = p.tool_arg_value(g4.gemma4_null());
} else if (type == "object") {
    value_parser = p.tool_arg_value(g4.gemma4_dict());
} else if (type == "array") {
    value_parser = p.tool_arg_value(g4.gemma4_array());
} else {
    // Unknown type: fall back to the generic Gemma 4 value parser
    value_parser = p.tool_arg_value(g4.gemma4_value());
Contributor

Should use gemma4_value_for_type() here?
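For context, a hedged sketch of what a centralized gemma4_value_for_type()-style dispatch could look like; the helper name and the string return value are illustrative stand-ins, not the actual API.

```cpp
#include <cassert>
#include <string>

// Illustrative only: centralize the JSON-schema-type -> parser choice in one
// helper instead of repeating the if/else chain at every call site. The real
// helper would return a parser object; a name stands in for it here.
static std::string gemma4_parser_for_type(const std::string & type) {
    if (type == "string")                      return "quoted_string";
    if (type == "number" || type == "integer") return "number";
    if (type == "boolean")                     return "bool";
    if (type == "null")                        return "null";
    if (type == "object")                      return "dict";
    if (type == "array")                       return "array";
    return "value";  // unknown type: fall back to the generic value parser
}
```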

Comment on lines +79 to +82

static std::string normalize_gemma4_to_json(const std::string & input) {
    std::string result;
    result.reserve(input.size() * 2);
Contributor

In the previous chat-peg-parser, I had a mapper that would build this JSON up incrementally via the AST instead of through a separate pass. Was that removed?

Member Author

No, it's still there (void common_chat_peg_mapper::map(const common_peg_ast_node & node)). It would have to be adapted to the funny format; the model I used for the refactoring was apparently too dumb to do it.
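As a rough illustration of the separate normalization pass: the <|"|> string delimiter is taken from the snippets above, and the real normalize_gemma4_to_json() likely handles more than this one rewrite.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Sketch only: rewrite Gemma 4's <|"|> string delimiters to plain JSON double
// quotes so the result can be handed to a standard JSON parser. The actual
// normalize_gemma4_to_json() may do considerably more than this.
static std::string replace_gemma4_quotes(std::string s) {
    const std::string from = "<|\"|>";
    size_t pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), "\"");
        pos += 1;  // step past the quote we just inserted
    }
    return s;
}
```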

@pwilkin
Member Author

pwilkin commented Apr 2, 2026

Done.

@ch10299342

The new gemma4.jinja template still has the same issue as the GGUF-embedded template: value['type'] | upper crashes with Unknown (built-in) filter 'upper' for type Array when a tool parameter uses JSON Schema array-style types like ["string", "null"].

The format_parameters macro already handles this correctly for array items (lines ~487–489), but not at the property level. The fix is one line before the if/elif chain:

      {%- endif -%}
+     {%- set value_type = value['type'] if value['type'] is string else value['type'][0] -%}

-     {%- if value['type'] | upper == 'STRING' -%}
+     {%- if value_type | upper == 'STRING' -%}

...

-     {%- elif value['type'] | upper == 'OBJECT' -%}
+     {%- elif value_type | upper == 'OBJECT' -%}

...

-     {%- elif value['type'] | upper == 'ARRAY' -%}
+     {%- elif value_type | upper == 'ARRAY' -%}

...

-     type:<|"|>{{ value['type'] | upper }}<|"|>}
+     type:<|"|>{{ value_type | upper }}<|"|>}

Reproduced with gemma-4-27b-it (GGUF) served via llama-server --jinja when the client sends tool schemas with array types.
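The root cause can be shown outside Jinja as well. A minimal C++ analogue (names hypothetical) of the property-level fallback: a JSON Schema "type" is either a single string or a list of alternatives, and only the scalar form can be uppercased directly.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical analogue of the template fix: when "type" is an array of
// alternatives such as ["string", "null"], use the first alternative. This
// mirrors `value['type'] if value['type'] is string else value['type'][0]`.
static std::string effective_type(const std::vector<std::string> & type_field) {
    // The empty-field default is an assumption for this sketch only.
    return type_field.empty() ? std::string("string") : type_field.front();
}
```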

@pwilkin pwilkin merged commit 5208e2d into ggml-org:master Apr 2, 2026
46 checks passed
luvwinnie pushed a commit to luvwinnie/llama.cpp that referenced this pull request Apr 5, 2026
luvwinnie added a commit to luvwinnie/llama.cpp that referenced this pull request Apr 5, 2026
Cherry-picked from ggml-org/llama.cpp:
- fix: gemma 4 template (ggml-org#21326)
- vocab: fix Gemma4 tokenizer (ggml-org#21343)
- llama-model: read final_logit_softcapping for Gemma 4 (ggml-org#21390)
- llama: add custom newline split for Gemma 4 (ggml-org#21406)
- common: add gemma 4 specialized parser (ggml-org#21418)

Resolved conflict in chat.h/chat.cpp: kept our extended
common_chat_template_direct_apply signature as internal _full variant.
Nexesenex added a commit to Nexesenex/ik_llama.cpp.nxs that referenced this pull request Apr 5, 2026
Partially implements Gemma4 chat template fix from llama.cpp master.

What was done:
- Add COMMON_CHAT_FORMAT_PEG_GEMMA4 enum value to common_chat_format

What was NOT implemented (infrastructure missing):
- chat-auto-parser module does not exist in this fork
- common_peg_gemma4_builder class for tool calling
- normalize_gemma4_to_json() function
- gemma4.jinja template file
- tests for gemma4 tool calling

The full fix requires the chat-auto-parser infrastructure which is not
present in this fork. This enum addition is a placeholder for future
implementation when the chat template system is upgraded.

See original PR: ggml-org/llama.cpp#21326

Co-authored-by: Piotr Wilkin (ilintar) <[email protected]>
wordingone pushed a commit to wordingone/llama-cpp-turboquant-cuda that referenced this pull request Apr 6, 2026
kgluszczyk added a commit to Creatiwi-ai/llama.cpp that referenced this pull request Apr 6, 2026
Rebased onto upstream master (b8672+) which includes Gemma 4 model
support (PR ggml-org#21309, ggml-org#21326, ggml-org#21418). This enables loading Gemma 4
E2B/E4B GGUF models on-device via llama.cpp.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Labels

testing Everything test related


4 participants