
[Frontend] Improve error message when tool_choice=required hits max_tokens#36883

Closed
chenwenxiaolive wants to merge 1 commit intovllm-project:mainfrom
chenwenxiaolive:fix/improve-tool-choice-required-error-message

Conversation


@chenwenxiaolive chenwenxiaolive commented Mar 12, 2026

Summary

  • When tool_choice='required' is set and the model exhausts max_tokens before producing a complete tool call, users previously received a cryptic JSON parse error (Invalid JSON: EOF while parsing a string). This PR adds clear, actionable error messages that guide users to increase max_tokens.
  • Added an early check in chat_completion_full_generator() that detects finish_reason='length' with tool_choice='required' and raises a descriptive error before attempting JSON parsing.
  • Wrapped validate_json() in _parse_tool_calls_from_content() with a try-except to provide a helpful error message as a fallback.
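The two error paths described above can be sketched as follows. This is a minimal standalone approximation, not the actual PR diff: `chat_completion_full_generator()` and `_parse_tool_calls_from_content()` are the real vLLM names mentioned in the summary, but the function signatures and surrounding plumbing here are simplified stand-ins.

```python
import json


def raise_if_truncated(tool_choice: str, finish_reason: str) -> None:
    # Early check: detect truncation before attempting JSON parsing,
    # mirroring the check added in chat_completion_full_generator().
    if tool_choice == "required" and finish_reason == "length":
        raise ValueError(
            "Model reached the max token limit (finish_reason='length') "
            "before generating complete tool calls, but "
            "tool_choice='required' demands valid tool calls. "
            "Please increase `max_tokens`."
        )


def parse_tool_calls_from_content(content: str):
    # Fallback: wrap JSON validation so a truncated payload yields an
    # actionable message instead of a raw parse error, mirroring the
    # try-except added around validate_json().
    try:
        return json.loads(content)
    except json.JSONDecodeError as e:
        raise ValueError(
            "When tool_choice='required', the model must produce valid "
            "JSON tool calls, but the generated content could not be "
            "parsed. Consider increasing `max_tokens`."
        ) from e
```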

Fixes #36794

Test plan

  • Added unit test test_parse_tool_calls_incomplete_json_gives_clear_error verifying the improved error message
  • Existing tool_choice=required tests continue to pass
  • Manual verification: set tool_choice="required" with a low max_tokens and confirm the new error message appears
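A plain-Python sketch of what the new unit test might look like. The helper below is a hypothetical stand-in reproducing only the error path this PR adds, not the real vLLM function:

```python
import json


def _parse_tool_calls_from_content(content: str):
    # Hypothetical stand-in for the vLLM helper, reproducing only the
    # improved error path added by this PR.
    try:
        return json.loads(content)
    except json.JSONDecodeError as e:
        raise ValueError(
            "When tool_choice='required', the model must produce valid "
            "JSON tool calls, but the generated content could not be parsed."
        ) from e


def test_parse_tool_calls_incomplete_json_gives_clear_error():
    # Simulate model output cut off mid-string by max_tokens.
    truncated = '{"name": "get_weather", "arguments": {"city": "Par'
    try:
        _parse_tool_calls_from_content(truncated)
    except ValueError as e:
        assert "tool_choice='required'" in str(e)
    else:
        raise AssertionError("expected a ValueError for truncated JSON")
```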

🤖 Generated with Claude Code

…okens

When tool_choice='required' is set and the model exhausts max_tokens
before producing a complete tool call, users previously got a cryptic
JSON parse error. This adds clear, actionable error messages that
guide users to increase max_tokens.

Fixes vllm-project#36794

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mergify

mergify bot commented Mar 12, 2026

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @chenwenxiaolive.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@github-actions

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs will not trigger a full CI run by default. Instead, only the fastcheck CI will run, covering a small and essential subset of tests to quickly catch errors.

You can ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@chaunceyjiang
Collaborator

see #36841

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves error messaging for truncated tool calls when tool_choice='required'. The changes add an early check for finish_reason='length' and a fallback for JSON parsing errors, which is a good approach. I have one high-severity suggestion regarding test coverage for the new logic to ensure its robustness.

Comment on lines +1458 to +1467
if (request.tool_choice == "required"
        and output.finish_reason == "length"):
    raise ValueError(
        "Model reached the max token limit "
        "(finish_reason='length') before generating "
        "complete tool calls, but tool_choice='required' "
        "demands valid tool calls. Please increase "
        "`max_tokens` to allow the model to finish "
        "generating the tool call."
    )
Contributor


high

This is a good addition for providing a clear error message. However, this new logic path is not covered by automated tests. The new test in test_tool_choice_required.py only covers the fallback mechanism in _parse_tool_calls_from_content. To prevent future regressions, it's important to add a unit test that specifically triggers this early check. This would likely involve mocking self.engine.generate to return a result with output.finish_reason='length' and asserting that the correct ValueError is raised.
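One way to exercise that early check without a real engine is with stub objects. The sketch below inlines the check rather than calling it through `chat_completion_full_generator()`; all names here are hypothetical stand-ins for the vLLM request/output types:

```python
from types import SimpleNamespace


def raise_if_length_with_required_tools(request, output) -> None:
    # Inlined copy of the early check this review comment asks to cover.
    if request.tool_choice == "required" and output.finish_reason == "length":
        raise ValueError(
            "Model reached the max token limit (finish_reason='length') "
            "before generating complete tool calls."
        )


def test_required_tool_choice_hits_length_limit():
    # Stub out just the attributes the check inspects.
    request = SimpleNamespace(tool_choice="required")
    output = SimpleNamespace(finish_reason="length")
    try:
        raise_if_length_with_required_tools(request, output)
    except ValueError as e:
        assert "finish_reason='length'" in str(e)
    else:
        raise AssertionError("expected a ValueError")
```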

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request improves error handling for cases where tool_choice='required' is used and the model fails to produce valid tool call JSON, particularly when hitting a token limit. It introduces an early check for finish_reason='length' in chat_completion_full_generator and adds a more descriptive error in _parse_tool_calls_from_content when JSON parsing fails. While the changes are a good improvement, there is a logical inconsistency between the two new error handling paths that results in a potentially misleading error message. I've suggested a change to address this.

Comment on lines +1133 to +1141
raise ValueError(
    "When tool_choice='required', the model must produce "
    "valid JSON tool calls, but the generated content "
    "could not be parsed. This can happen when the model "
    "hits the max token limit (finish_reason='length') "
    "before completing the tool call. Consider increasing "
    "`max_tokens` to allow the model to finish generating "
    "the tool call."
) from e
Contributor


high

The new pre-check in chat_completion/serving.py specifically handles the finish_reason='length' case, so by the time this except block is reached from the chat_completion_full_generator call path, the finish reason is known not to be 'length'. The part of this error message suggesting the max token limit (finish_reason='length') as a possible cause is therefore misleading on this primary call path. The message should be generalized to say that JSON parsing failed for some other reason. Note that this change will require updating the assertion in test_parse_tool_calls_incomplete_json_gives_clear_error.

                raise ValueError(
                    "When tool_choice='required', the model must produce "
                    "valid JSON tool calls, but the generated content "
                    "could not be parsed. This may be due to the model "
                    "failing to follow tool use instructions or generating "
                    "malformed JSON."
                ) from e



Successfully merging this pull request may close these issues.

[Feature]: Improve Error Message When Max Length Reached With Tool Call=Required
