model : gpt-oss add response_format support #15494
|
@aldehir Thanks for the change; my chat completion request with response_format now seems to work with the llama.cpp backend. One question: would the grammar rule also affect the reasoning token generation of gpt-oss, i.e. force the reasoning tokens to be generated in the JSON schema format? That would certainly hurt performance. |
|
@samshipengs with |
|
@aldehir i haven't looked at the |
|
@samshipengs Ah, ok. The grammar for gpt-oss when using If you're finding reasoning traces in your structured output, I would verify you are passing in |
|
@aldehir I was using I noticed that if I don't use structured output, i.e. don't pass response_format, it seems to give me a more sensible answer (I'm looking at the final channel of the harmony format response) compared to the one parsed when passing a pydantic model in response_format. Is the grammar-based constrained decoding in llama.cpp done with GBNF? Do we know whether OpenAI (for their commercial models) uses the same constrained decoding technique? |
|
@samshipengs the grammar is defined in gbnf, but I don't know the specifics about the constrained decoding implementation. If you can provide an example of such a task, I can look further into it. |
This reverts commit 32732f2.
Add response_format support to gpt-oss models.

The generic grammar implementation is not great for gpt-oss,

curl example
{ "choices": [ { "finish_reason": "stop", "index": 0, "message": { "role": "assistant", "content": "{\n\n \"country\": \"Switzerland\"\n\n , \"landmarks\":[\n\n\"[\"]\n\n }" } } ], ...

Note the weirdness around landmarks.
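For readers who want to reproduce this kind of request, here is a minimal sketch of a chat completion payload carrying a response_format schema. The original curl example is not preserved here, so everything in it is an assumption: the schema is inferred from the fields visible in the responses (a string `country` and an array-of-strings `landmarks`), while the prompt, the schema name `city_info`, and the helper `build_request` are hypothetical. It uses the OpenAI-style `json_schema` wrapper and only builds the JSON body; it does not contact a server.

```python
import json

# Assumed schema, inferred from the fields visible in the responses:
# a string "country" and an array-of-strings "landmarks", both required.
LANDMARK_SCHEMA = {
    "type": "object",
    "properties": {
        "country": {"type": "string"},
        "landmarks": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["country", "landmarks"],
}


def build_request(city: str) -> dict:
    """Build a chat completion body with an OpenAI-style response_format.

    Hypothetical helper: the prompt and the schema name are illustrative only.
    """
    return {
        "messages": [{"role": "user", "content": city}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "city_info", "schema": LANDMARK_SCHEMA},
        },
    }


payload = build_request("Zürich")
# Serialized body, ready to POST to a chat completions endpoint.
body = json.dumps(payload, ensure_ascii=False, indent=2)
print(body)
```

The body can then be sent with any HTTP client to the server's chat completions endpoint.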
This PR wraps the response_format schema in a harmony-aware grammar so the model can answer properly,

{ "choices": [ { "finish_reason": "stop", "index": 0, "message": { "role": "assistant", "reasoning_content": "The user gave the city \"Zürich\". We need to output JSON in the defined schema. The schema says: object with properties: \"country\" (string) and \"landmarks\" (array of strings). It's required at least those two. We must supply. Provide country: Switzerland. Landmarks: choose 3 notable landmarks: \"Bahnhofstrasse\", \"Lake Zürich (Limmat, scenic)\", \"Zürcher Mozartplatz and its cathedral\"? Let's find known landmarks: \"Château Fraiture\"? Wait landmarks: \"Lake Zürich\", \"Bahnhofstrasse\", \"Old Town\" (Altstadt), \"Kunsthaus Zürich\". Choose 3: \"Bahnhofstrasse\", \"Lake Zürich\", \"Kunsthaus Zürich\". Compose JSON. Ensure it's valid according to schema. Should be:\n\n{\n \"country\": \"Switzerland\",\n \"landmarks\": [\n \"Bahnhofstrasse\",\n \"Lake Zürich\",\n \"Kunsthaus Zürich\"\n ]\n}\n\nMake sure no extra keys. Provide only JSON.", "content": "{\"country\":\"Switzerland\",\"landmarks\":[\"Bahnhofstrasse\",\"Lake Zürich\",\"Kunsthaus Zürich\"]}" } } ], ...

fixes #15276
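As a rough client-side check, the sketch below parses the `content` field of a response shaped like the one above and verifies the two fields. The helper `parse_structured` is hypothetical, the `reasoning_content` string is shortened for brevity, and the type checks are hand-written rather than a full JSON Schema validation.

```python
import json

# Response shaped like the one above; reasoning_content is shortened here.
response = {
    "choices": [
        {
            "finish_reason": "stop",
            "index": 0,
            "message": {
                "role": "assistant",
                "reasoning_content": "The user gave the city \"Zürich\". ...",
                "content": "{\"country\":\"Switzerland\",\"landmarks\":"
                           "[\"Bahnhofstrasse\",\"Lake Zürich\",\"Kunsthaus Zürich\"]}",
            },
        }
    ]
}


def parse_structured(resp: dict) -> dict:
    """Parse the final answer and check the two expected fields.

    Hypothetical helper; only `content` is parsed, the reasoning is ignored.
    """
    message = resp["choices"][0]["message"]
    data = json.loads(message["content"])
    if not isinstance(data, dict):
        raise ValueError("content is not a JSON object")
    if not isinstance(data.get("country"), str):
        raise ValueError("country must be a string")
    landmarks = data.get("landmarks")
    if not isinstance(landmarks, list) or not all(isinstance(x, str) for x in landmarks):
        raise ValueError("landmarks must be a list of strings")
    return data


result = parse_structured(response)
print(result["country"], result["landmarks"])
```

A check like this only confirms the shape of the answer. The earlier output from the generic grammar would pass it too, since a `landmarks` array containing the single string `"["` is still an array of strings, so a passing check says nothing about the quality of the answer.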