fix: respect tool_choice="none" by excluding tools from template #210

Open
awanawana wants to merge 1 commit into waybarrios:main from awanawana:fix/tool-choice-none

@awanawana

Summary

Fixes #162

When tool_choice is set to "none", models like Qwen2.5-Instruct and Llama 3.x-Instruct still activate their tool-calling behavior if tools are present in the chat template context. This causes:

  • finish_reason: "tool_calls" instead of "stop"
  • content: null with tool_calls containing empty {} arguments
  • The model encoding its output data as function names instead of returning it as text content

Root Cause

The chat completions handler was passing tools to chat_kwargs without checking tool_choice. Even with tool_choice="none", any tools present in the request were passed on to apply_chat_template(). Qwen2.5 and Llama 3.x have tool-calling Jinja templates that activate when a tools key is present — even an empty list triggers this behavior.

Solution

Before adding tools to chat_kwargs, check if tool_choice == "none". If so, skip adding tools entirely. This prevents the template from activating tool-calling mode.

# Before
if request.tools:
    chat_kwargs["tools"] = convert_tools_for_template(request.tools)

# After
if request.tools and request.tool_choice != "none":
    chat_kwargs["tools"] = convert_tools_for_template(request.tools)

This matches the behavior of upstream vLLM's --exclude-tools-when-tool-choice-none flag.
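
For illustration, here is a minimal, self-contained sketch of the same guard. `build_chat_kwargs` is a hypothetical helper name, and the `convert_tools_for_template` stub below only stands in for the project's real converter, so treat this as a sketch of the condition rather than the handler's actual code:

```python
from typing import Any, Optional


def convert_tools_for_template(tools: list) -> list:
    """Stand-in stub (assumption): the real converter may reshape each entry."""
    return [dict(tool) for tool in tools]


def build_chat_kwargs(tools: Optional[list], tool_choice: Any = None) -> dict:
    """Return template kwargs, omitting the "tools" key when tool_choice is "none"."""
    chat_kwargs: dict = {}
    # Same condition as the "After" snippet: tools are present and the
    # caller has not explicitly disabled tool use.
    if tools and tool_choice != "none":
        chat_kwargs["tools"] = convert_tools_for_template(tools)
    return chat_kwargs
```

With a single tool definition, `build_chat_kwargs([tool], "none")` returns an empty dict, while `"auto"`, a named-function dict, or an unset `tool_choice` still forwards the tools.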

Test Plan

# Start server
vllm-mlx serve mlx-community/Qwen2.5-14B-Instruct-4bit --port 8080

# Test with tool_choice="none"
curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Qwen2.5-14B-Instruct-4bit",
    "messages": [{"role": "user", "content": "List 3 fruits as a JSON array"}],
    "tool_choice": "none",
    "tools": [],
    "max_tokens": 200
  }'

# Expected: finish_reason="stop", content contains JSON array, no tool_calls
# Before fix: finish_reason="tool_calls", content=null
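
The expectations above could also be checked programmatically. The sketch below assumes the server returns the OpenAI-style chat completion shape (`choices[0].finish_reason`, `choices[0].message`); `check_tool_choice_none` is a hypothetical helper, not part of this PR:

```python
from typing import Any


def check_tool_choice_none(response: dict) -> list:
    """Return a list of problems; an empty list means the response looks correct."""
    problems = []
    choice: dict = response["choices"][0]
    message: dict = choice.get("message") or {}
    finish_reason: Any = choice.get("finish_reason")

    if finish_reason != "stop":
        problems.append(f"finish_reason is {finish_reason!r}, expected 'stop'")
    if not message.get("content"):
        problems.append("content is empty or null")
    if message.get("tool_calls"):
        problems.append("tool_calls present despite tool_choice='none'")
    return problems
```

One way to use it is to save the curl output to a file, load it with `json.load`, and assert that the returned list is empty.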

🤖 Generated with Claude Code

When tool_choice is set to "none", models like Qwen2.5 and Llama 3.x
still activate their tool-calling behavior if tools are present in the
chat template context (even with an empty list). This causes the model
to output tool_calls with empty arguments instead of normal text content.

The fix checks tool_choice before passing tools to the chat template.
When tool_choice="none", tools are not included in chat_kwargs,
preventing the template from activating tool-calling mode.

Fixes waybarrios#162

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@Thump604
Collaborator

Note: our PR #173 takes a more complete approach to tool_choice=none. In addition to stripping tools from the chat template (which this PR does), #173 also suppresses tool call parsing in the response handler. Without parser suppression, the model can still generate text that looks like a tool call and the parser will extract it as one, even though the user requested tool_choice=none.

The two fixes are complementary but independent: template stripping prevents the model from seeing tool definitions, parser suppression prevents false-positive tool call extraction from generated text.
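
To make the second half concrete, here is a rough sketch of what parser suppression could look like. This is not the code from #173: `finalize_message` and the `parse_tool_calls` callable are hypothetical names used only to illustrate skipping tool call extraction when tool_choice is "none":

```python
from typing import Any, Callable


def finalize_message(
    text: str,
    tool_choice: Any,
    parse_tool_calls: Callable[[str], list],
) -> dict:
    """Build the assistant message, skipping tool-call parsing for tool_choice="none"."""
    if tool_choice == "none":
        # Parser suppression: return the text untouched, even if it happens
        # to look like a tool call.
        return {"content": text, "tool_calls": None, "finish_reason": "stop"}

    calls = parse_tool_calls(text)
    if calls:
        return {"content": None, "tool_calls": calls, "finish_reason": "tool_calls"}
    return {"content": text, "tool_calls": None, "finish_reason": "stop"}
```

Combined with template stripping, this would cover both paths: the model is not shown tool definitions, and any tool-call-shaped text it still produces is returned as plain content.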

Development

Successfully merging this pull request may close these issues.

tool_choice: "none" ignored — model generates tool_calls with empty {} arguments when no tools defined

2 participants