Skip to content

common : fix tool call type detection for nullable and enum schemas#21327

Merged
pwilkin merged 2 commits intoggml-org:masterfrom
sacredvoid:fix/gemma4-tool-call-type-detection
Apr 3, 2026
Merged

common : fix tool call type detection for nullable and enum schemas#21327
pwilkin merged 2 commits intoggml-org:masterfrom
sacredvoid:fix/gemma4-tool-call-type-detection

Conversation

@sacredvoid
Copy link
Copy Markdown
Contributor

@sacredvoid sacredvoid commented Apr 2, 2026

Overview

Fixes #21316

The Gemma4 dict parser and the tagged parser both only check type_v.is_string() when figuring out if a tool argument is a string. This breaks for schemas that use nullable types like "type": ["string", "null"] or enum fields without an explicit "type" key, both of which are pretty common in OpenAPI/Home Assistant setups.

When the type isn't recognized as "string", the parser falls through to the raw-value path and captures <|"|> delimiter tokens as literal text, which is how you end up with output like "domain": "[<|\"|>light<|\"|>]" instead of "domain": "light".

The fix is straightforward:

  • Extract the effective type from type arrays (grab the first non-null entry)
  • Infer string type from enum fields that have string values
  • Applied to both build_tool_parser_tag_gemma4_dict and build_tool_parser_tag_tagged since they share the same pattern

All existing tests pass (test-chat-auto-parser: 56 tests, 373 assertions, 0 failures; test-chat: all passed).

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES - used Claude Code to explore the codebase, trace through the parser logic, and identify the root cause. The fix itself is a simple type-detection improvement that I reviewed and tested manually.

@sacredvoid sacredvoid requested a review from a team as a code owner April 2, 2026 18:20
@pwilkin
Copy link
Copy Markdown
Member

pwilkin commented Apr 2, 2026

Can you add tests please? Looks legit to me, might want @aldehir @CISC @ngxson to take a look.

@ngxson
Copy link
Copy Markdown
Contributor

ngxson commented Apr 2, 2026

Yes I'll test it tomorrow

@aldehir
Copy link
Copy Markdown
Contributor

aldehir commented Apr 2, 2026

This should be common functionality implemented in common/json-schema-to-grammar.cpp by common_schema_info. There is already logic to identify if a particular schema reduces to a string, which already handles the ["string", "null"] and enum cases. It needs to be extended to produce a type name if that's what we need.

@sacredvoid
Copy link
Copy Markdown
Contributor Author

Thanks for looking at this. @aldehir good call - I didn't realize common_schema_info already handles the schema reduction for ["string", "null"] and enum cases. Makes way more sense to extend that rather than patching each parser separately.

I'll rework it to add a type name getter (or similar) to common_schema_info and have both parsers use that. Will add tests as well.

One thing I want to check: for the enum case where there's no explicit "type" key but all values are strings, does common_schema_info already reduce that to string? Or would that need new logic too?

@aldehir
Copy link
Copy Markdown
Contributor

aldehir commented Apr 2, 2026

It does infer types if there is no type property. I say hold off until we get input from other maintainers.

I might have jumped the gun and perhaps this surgical change might be good for now and we can address the general case later.

…add tests

Fix enum type inference to scan all enum values (not just index 0) so
schemas like {"enum": [0, "celsius"]} correctly detect string type.

Fix schema_delegates in peg-parser to handle nullable type arrays
(["string", "null"]) and typeless enum schemas in raw mode, allowing
the tagged parser to use raw text instead of JSON-formatted strings.

Add test cases for Qwen3-Coder (TAG_WITH_TAGGED format):
- nullable string ["string", "null"]
- nullable string with null first ["null", "string"]
- nullable integer ["integer", "null"]
- enum without explicit type key
@sacredvoid sacredvoid requested a review from pwilkin as a code owner April 2, 2026 23:30
@github-actions github-actions bot added the testing Everything test related label Apr 2, 2026
@ngxson
Copy link
Copy Markdown
Contributor

ngxson commented Apr 3, 2026

I wasn't been able to test this on my side, but the added case in test-chat.cpp seems correct.

Copy link
Copy Markdown
Member

@pwilkin pwilkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we utilize this fix now and possibly generalize later.

@pwilkin pwilkin merged commit af5c138 into ggml-org:master Apr 3, 2026
42 of 46 checks passed
icex added a commit to icex/llama.cpp that referenced this pull request Apr 5, 2026
Includes:
- server: Fix undefined timing measurement errors (ggml-org#21201)
- server: save and clear idle slots on new task --clear-idle (ggml-org#20993)
- common: fix tool call type detection for nullable/enum schemas (ggml-org#21327)
- CUDA: fix FA kernel selection logic (ggml-org#21271)
- kv-cache: do not quantize SWA KV cache (ggml-org#21277) + revert (ggml-org#21332)
- common/parser: fix call ID detection + atomicity (ggml-org#21230)
- jinja: coerce input for string-specific filters (ggml-org#21370)
- Various CI, HIP, WebGPU, and documentation fixes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

testing Everything test related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misc. bug: Gemma4 tool calling leaves unexpected tokens in tool calls

4 participants