
fix: handle non-capturing groups (?:...) in JSON schema pattern converter#21124

Merged
pwilkin merged 1 commit into ggml-org:master from XciD:fix/visit-pattern-non-capturing-group on Mar 28, 2026

Conversation

@XciD XciD (Contributor) commented Mar 28, 2026

Summary

  • Fix SIGSEGV in _visit_pattern() when a JSON schema pattern contains non-capturing groups (?:...)
  • Skip ?: after ( to treat non-capturing groups as regular groups
  • Safely handle unsupported lookahead/lookbehind syntax by skipping to matching )

Root cause

In common/json-schema-to-grammar.cpp, _visit_pattern() (around line 420): when the parser sees ( followed by ?, it pushes a warning but does not advance i past the ?:. The recursive transform() call then interprets the ? as a quantifier and calls seq.back() on an empty vector, which is undefined behavior and manifests as a SIGSEGV.

Reproducer

Start llama-server with any model, then:

curl -X POST http://127.0.0.1:8080/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: dummy" \
  -H "anthropic-version: 2023-06-01" \
  -d '{"model":"any","max_tokens":50,"messages":[{"role":"user","content":"hi"}],
  "tools":[{"name":"t","description":"t","input_schema":{"type":"object",
  "properties":{"arr":{"type":"array","items":{"anyOf":[{"type":"object",
  "properties":{"d":{"type":"string","pattern":"^(?:foo)$"}},
  "required":["d"]}]}}}}}],"tool_choice":{"type":"auto"}}'

The server crashes with SIGSEGV before returning a response. With this fix, it returns HTTP 200 correctly.

This affects real-world clients (e.g. Claude Code with Notion MCP tools) that send tool schemas containing date validation patterns like ^(?:(?:\d\d[2468][048]|...)-02-29|...)$.

@XciD XciD requested a review from a team as a code owner March 28, 2026 15:55
@aldehir aldehir (Contributor) commented Mar 28, 2026

Can you add a test to tests/test-json-schema-to-grammar.cpp?

@XciD XciD force-pushed the fix/visit-pattern-non-capturing-group branch from 42b7e0d to dd6d5f7 on March 28, 2026 15:59
@XciD XciD requested a review from ggerganov as a code owner March 28, 2026 15:59
@github-actions github-actions bot added the testing Everything test related label Mar 28, 2026
fix: handle non-capturing groups (?:...) in JSON schema pattern converter

The regex-to-grammar converter in _visit_pattern() crashes with SIGSEGV
when a JSON schema "pattern" field contains a non-capturing group (?:...).

Root cause: when the parser sees '(' followed by '?', it pushes a warning
but does not advance past '?:'. The recursive transform() call then
interprets '?' as a quantifier and calls seq.back() on an empty vector,
causing undefined behavior.

This commonly occurs when serving OpenAI-compatible tool calls from
clients that include complex regex patterns in their JSON schemas (e.g.,
date validation patterns like ^(?:(?:\d\d[2468][048]|...)-02-29|...)$).

The fix:
- Skip '?:' after '(' to treat non-capturing groups as regular groups
- For unsupported syntax (?=, ?!, etc.), skip to matching ')' safely,
  handling escaped characters to avoid miscounting parenthesis depth
- Adjust the ')' unbalanced-parentheses check using direct char
  comparisons instead of substr
- Add test cases for non-capturing groups (C++ only, as the JS/Python
  implementations do not yet support this syntax)
@XciD XciD force-pushed the fix/visit-pattern-non-capturing-group branch from dd6d5f7 to 0b7e5f1 on March 28, 2026 16:02
@pwilkin pwilkin merged commit e397d38 into ggml-org:master Mar 28, 2026
44 of 45 checks passed
icex added a commit to icex/llama.cpp that referenced this pull request Apr 5, 2026
Includes:
- fix: handle non-capturing groups (?:...) in JSON schema pattern converter (ggml-org#21124)
- memory: respect unified KV cache in hybrid memory for eval tasks (ggml-org#21224)
- fix: CUDA FA kernel selection, head dimension 512 support
- rotate activations for better quantization (ggml-org#21038)
- Various parser, jinja, webui, and CI fixes

Conflicts resolved:
- llama-kv-cache.cpp: keep TurboQuant InnerQ stubs + upstream Hadamard helpers
- llama-graph.cpp: keep TurboQuant V-padding + upstream self_v_rot
- fattn-tile.cu: add upstream D=512 before TurboQuant HIP guard
- fattn.cu: combine D=512 (upstream) + D=640 (TurboQuant) exclusions

3 participants