json: document schema conversion in GBNF readme, align manual grammar examples & converters #7841
  {"array", {"\"[\" space ( value (\",\" space value)* )? \"]\" space", {"value"}}},
  {"uuid", {"\"\\\"\" [0-9a-fA-F]{8} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{4} \"-\" [0-9a-fA-F]{12} \"\\\"\" space", {}}},
- {"char", {"[^\"\\\\] | \"\\\\\" ([\"\\\\/bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
+ {"char", {"[^\"\\\\\\x7F\\x00-\\x1F] | [\\\\] ([\"\\\\bfnrt] | \"u\" [0-9a-fA-F]{4})", {}}},
Should these possibly be updated to use the new . operator, or is now not the time for that?
Oh I see, we can't do that, because we need to exclude backslashes from this list. Never mind, carry on! :)
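The point about backslashes can be checked with a minimal Python sketch. It uses regular expressions as a rough stand-in for GBNF, which is an assumption, and the names `CHAR`, `STRING_STRICT`, `STRING_DOT`, and `accepts` are hypothetical and not part of llama.cpp. It compares a translation of the updated `char` rule with a hypothetical variant where the negated class is replaced by a match-anything `.`:

```python
import re

# Approximate regex translation of the updated `char` rule as shown in the diff
# (illustration only; GBNF is not regex and this is not llama.cpp code).
CHAR = r'[^"\\\x7F\x00-\x1F]|\\(?:["\\bfnrt]|u[0-9a-fA-F]{4})'

# A JSON-like string built from the strict rule.
STRING_STRICT = re.compile(r'"(?:' + CHAR + r')*"')

# Hypothetical variant: the negated class is swapped for "any character".
STRING_DOT = re.compile(
    r'"(?:.|\\(?:["\\bfnrt]|u[0-9a-fA-F]{4}))*"', re.DOTALL
)


def accepts(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    bad_escape = '"a\\qb"'  # contains the invalid escape \q
    print(accepts(STRING_STRICT, bad_escape))  # strict rule
    print(accepts(STRING_DOT, bad_escape))     # "." variant
```

Because `.` also matches a lone backslash, the variant lets invalid escapes such as `\q` through, while the negated class forces every backslash to go through the escape alternative.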
grammars/json.gbnf
  # Optional space: by convention, applied in this grammar after literal chars when allowed
- ws ::= ([ \t\n] ws)?
+ ws ::= [ \t\n]{,20}
This feels like a good change -- I like constraining the output in this way. Could even consider limiting it to something more restrictive like {,4} or {,8}, but this is a good start.
I've drafted an updated space rule in #7866. No matter what the bound is with this current syntax, models like Llama-3-8B & Phi-3-mini seem keen to misuse it. But given near-unlimited indent space only (and only 1 newline at a time), they're very sensible.
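For readers comparing the two `ws` definitions discussed here, a minimal Python sketch follows. It again treats regular expressions as an approximation of GBNF (an assumption), and the names `WS_UNBOUNDED`, `WS_BOUNDED`, and `is_ws` are hypothetical. It only shows the difference in accepted length, not how a model behaves under either rule:

```python
import re

# Rough regex equivalents of the two `ws` rules from the diff (illustration only).
WS_UNBOUNDED = re.compile(r'[ \t\n]*')     # ws ::= ([ \t\n] ws)?
WS_BOUNDED = re.compile(r'[ \t\n]{0,20}')  # ws ::= [ \t\n]{,20}


def is_ws(pattern: re.Pattern, text: str) -> bool:
    """Return True if the whole text is accepted as whitespace by the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for n in (0, 20, 21, 1000):
        run = " " * n
        print(n, is_ws(WS_UNBOUNDED, run), is_ws(WS_BOUNDED, run))
```

The recursive rule accepts a whitespace run of any length, so sampling can keep emitting spaces indefinitely. The bounded rule stops accepting after 20 characters; a tighter bound such as `{0,4}` or `{0,8}` would only change the constant.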
`JSON Schemas → GBNF` section to the grammar readme
cc/ @HanClinto @ExtReMLapin