[Bugfix][ResponsesAPI] Fix crash when tool_choice=required exceeds max_output_tokens #37258

Merged
DarkLight1337 merged 3 commits into vllm-project:main from chaunceyjiang:response_required_max_tokens on Mar 17, 2026

@chaunceyjiang (Collaborator) commented Mar 17, 2026

Purpose

Follow-up to #36841.

Fixes the CI failure at https://buildkite.com/vllm/ci/builds/56537?group_by=test#019cf9dc-06da-4341-aa86-6e0d6cb06ec8


[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     return await self.responses_full_generator(
--
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/responses/serving.py", line 711, in responses_full_generator
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     output = self._make_response_output_items(request, final_output, tokenizer)
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]   File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/responses/serving.py", line 904, in _make_response_output_items
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     return parser.extract_response_outputs(
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]   File "/usr/local/lib/python3.12/dist-packages/vllm/parser/abstract_parser.py", line 325, in extract_response_outputs
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     tool_calls, content = self._parse_tool_calls(
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]                           ^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]   File "/usr/local/lib/python3.12/dist-packages/vllm/parser/abstract_parser.py", line 426, in _parse_tool_calls
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     tool_calls = TypeAdapter(list[FunctionDefinition]).validate_json(content)
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]   File "/usr/local/lib/python3.12/dist-packages/pydantic/type_adapter.py", line 492, in validate_json
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]     return self.validator.validate_json(
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[2026-03-17T04:29:59Z] (APIServer pid=2510) ERROR 03-17 04:29:59 [server_utils.py:374] pydantic_core._pydantic_core.ValidationError: 1 validation error for list[function-wrap[__log_extra_fields__()]]
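
The traceback ends in `TypeAdapter(list[FunctionDefinition]).validate_json(content)`, so the crash appears to come from validating tool-call JSON that was cut off by the token limit. A minimal sketch of that failure mode follows. `FunctionCallSketch` and the truncated string are hypothetical stand-ins, not vLLM's actual `FunctionDefinition` or the real model output.

```python
from pydantic import BaseModel, TypeAdapter, ValidationError


class FunctionCallSketch(BaseModel):
    # Hypothetical stand-in for vLLM's FunctionDefinition.
    name: str
    parameters: dict


adapter = TypeAdapter(list[FunctionCallSketch])

# Assumed example of output cut off before the JSON array closes.
truncated = '[{"name": "get_weather", "parameters": {"city"'

try:
    adapter.validate_json(truncated)
    raised = False
except ValidationError:
    # Incomplete JSON is reported as a ValidationError, matching the log above.
    raised = True
```

Left unhandled, this exception propagates out of the response generator, which is consistent with the server error shown in the log.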



Test Plan

See the e2e test.

Reference behavior, testing gpt-5 with the OpenAI API:

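from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# `prompt` and `tools` are defined elsewhere in the test and are not shown here.
response = client.responses.create(
    model="gpt-5",
    input=prompt,
    tools=tools,
    tool_choice="required",
    max_output_tokens=1,  # force truncation before any tool call can complete
)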

Test Result

gpt-5

{
    "id": "resp_0c59c8e31591e43d0069b8f1e2a17c8190bffa061a344becb7",
    "created_at": 1773728226.0,
    "error": null,
    "incomplete_details": {
        "reason": "max_output_tokens"
    },
    "instructions": null,
    "metadata": {},
    "model": "gpt-5",
    "object": "response",
    "output": [
        {
            "id": "rs_0c59c8e31591e43d0069b8f1e3de108190b7204a20852a1dca",
            "summary": [],
            "type": "reasoning",
            "content": null,
            "encrypted_content": null,
            "status": null
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 1.0,
    "tool_choice": "required",
....
}
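
A client can detect this situation from the response body alone. The sketch below is one possible check, assuming the response has already been converted to a plain dict; `summarize` is a hypothetical helper name, and the sample dict is trimmed from the output above.

```python
def summarize(response: dict) -> tuple[bool, list[dict]]:
    """Return (was_truncated, function_call_items) for a Responses API payload."""
    details = response.get("incomplete_details") or {}
    truncated = details.get("reason") == "max_output_tokens"
    calls = [
        item
        for item in response.get("output", [])
        if item.get("type") == "function_call"
    ]
    return truncated, calls


# Trimmed copy of the gpt-5 response shown above.
response = {
    "error": None,
    "incomplete_details": {"reason": "max_output_tokens"},
    "output": [{"type": "reasoning", "summary": [], "content": None}],
    "tool_choice": "required",
}

truncated, calls = summarize(response)
```

Here the response is marked as truncated and carries no function call items, even though `tool_choice` was `"required"`.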

vllm

{
    "id": "resp_89d52120b02c63ff",
    "created_at": 1773728620.0,
    "error": null,
    "incomplete_details": {
        "reason": "max_output_tokens"
    },
    "instructions": null,
    "metadata": null,
    "model": "my-model",
    "object": "response",
    "output": [
        {
            "id": "rs_a1ff3b9137e892f3",
            "summary": [],
            "type": "reasoning",
            "content": [
                {
                    "text": "The",
                    "type": "reasoning_text"
                }
            ],
            "encrypted_content": null,
            "status": null
        }
    ],
    "parallel_tool_calls": true,
    "temperature": 1.0,
    "tool_choice": "required",


@mergify bot added the bug label (Something isn't working) Mar 17, 2026

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request addresses a crash in the Responses API when tool_choice="required" and the generated output for the tool call exceeds max_output_tokens. The fix correctly handles potential ValidationError during JSON parsing of the model's output by suppressing the exception. This prevents the crash and ensures that if the tool call JSON is invalid or truncated, no tool call is returned, which is the desired behavior. A new test case is added to validate this fix, confirming that the system remains stable under these conditions.
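
The suppression approach described in this review can be sketched as follows. This is an illustration under stated assumptions, not the code merged in this PR: `FunctionCallSketch` and `parse_tool_calls` are hypothetical names, and the real parser lives in `vllm/parser/abstract_parser.py`.

```python
from contextlib import suppress

from pydantic import BaseModel, TypeAdapter, ValidationError


class FunctionCallSketch(BaseModel):
    # Hypothetical stand-in for vLLM's FunctionDefinition.
    name: str
    parameters: dict


def parse_tool_calls(content: str) -> list[FunctionCallSketch]:
    """Parse tool calls, returning an empty list if the JSON is invalid or truncated."""
    tool_calls: list[FunctionCallSketch] = []
    with suppress(ValidationError):
        tool_calls = TypeAdapter(list[FunctionCallSketch]).validate_json(content)
    return tool_calls


# Truncated output yields no tool calls instead of an exception.
cut_off = parse_tool_calls('[{"name": "get_weather", "parameters": {"ci')

# Complete output still parses normally.
complete = parse_tool_calls('[{"name": "get_weather", "parameters": {"city": "Paris"}}]')
```

Returning an empty list lets the caller build a response with `incomplete_details` set rather than failing the whole request.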

@chaunceyjiang (Collaborator, Author) commented

/cc @DarkLight1337 PTAL.

@DarkLight1337 enabled auto-merge (squash) March 17, 2026 07:00
@github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Mar 17, 2026
@DarkLight1337 merged commit 132bfd4 into vllm-project:main Mar 17, 2026
47 checks passed
@chaunceyjiang deleted the response_required_max_tokens branch March 17, 2026 09:03