Skip to content

DeepSeek direct API 400 "reasoning_content must be passed back" on multi-turn tool calls #17212

@Zhangxiaocheng28

Description

@Zhangxiaocheng28

Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"

Summary

When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is:

{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}

Root Cause Chain

  1. _supports_reasoning_extra_body() gate is too narrow (run_agent.py:7315): For non-OpenRouter base URLs, it returns False. Direct DeepSeek API hits this path.

  2. User config sets reasoning_effort: none (thinking disabled), but because supports_reasoning is False, the reasoning_config dict is never translated into the extra_body sent to the API.

  3. DeepSeek's direct API uses its own parameter namespace: when thinking is not explicitly disabled, the model defaults to thinking=enabled. The control parameter is {"thinking": {"type": "disabled"}}not the OpenRouter-style {"reasoning": {"enabled": false}}.

  4. With thinking enabled, the model generates reasoning_content in its response. However, the agent's compression pipeline strips this field before storing the message.

  5. On the next API call, the compressed messages no longer contain reasoning_content, but the DeepSeek API requires it to be present when the previous assistant message originally had it → 400 error.

Affected Code

run_agent.py:7294-7329_supports_reasoning_extra_body()
The function correctly gates reasoning for OpenRouter routes (where deepseek/ is a known prefix at line 7322), but has no awareness of direct DeepSeek API endpoints (api.deepseek.com).

agent/transports/chat_completions.py — missing DeepSeek thinking handling
Before the fix below, there was no code path that sent {"thinking": {"type": "disabled"}} to the direct DeepSeek API.

Working Fix

Added DeepSeek V-series thinking.type control to chat_completions.py before the general reasoning block. This mirrors the existing Kimi/Moonshot pattern (lines 232-240).

# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning".  Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }

File: agent/transports/chat_completions.py
Lines: insert after line 240 (after Kimi block, before the # Reasoning block at line 257)

Verification

The fix has been confirmed working on a production install (WSL Ubuntu 24.04, hermes-agent v0.11.0, DeepSeek v4-pro direct API). Multi-turn tool-calling conversations that previously hit the 400 on the second turn now complete successfully.

Alternative / More General Fix

A more systematic approach would be to update _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, so the general reasoning block in chat_completions.py could handle it. However, that block emits reasoning-keyed dicts, and the direct DeepSeek API expects thinking-keyed dicts — so the model-specific fix above is cleaner and follows the existing Kimi pattern.

Steps to Reproduce

  1. Configure Hermes to use DeepSeek direct API (api.deepseek.com)
  2. Use model deepseek-v4-pro or deepseek-v4-flash
  3. Set reasoning_effort: none in config (recommended for cost-saving with direct API)
  4. Send any message that triggers a tool call (e.g. "run date")
  5. Agent will fail with 400 on the second API call after the tool result is returned

Environment

  • hermes-agent v0.11.0
  • WSL Ubuntu 24.04
  • Python 3.11
  • DeepSeek API direct (not OpenRouter)
  • Model: deepseek-v4-pro

Metadata

Metadata

Assignees

No one assigned

    Labels

    P2Medium — degraded but workaround existscomp/agentCore agent loop, run_agent.py, prompt builderprovider/deepseekDeepSeek APItype/bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions