Bug Report: DeepSeek direct API 400 error — "reasoning_content must be passed back"
Summary
When using DeepSeek models (v4-pro, v4-flash, v3, v3.1) via the direct API endpoint (api.deepseek.com), the agent fails with a 400 error on multi-turn conversations after the first tool call. The error message is:
{"error":{"type":"invalid_request_error","message":"reasoning_content must be passed back when the API responds with reasoning_content. ..."}}
Root Cause Chain
- _supports_reasoning_extra_body() gate is too narrow (run_agent.py:7315): for non-OpenRouter base URLs it returns False, and the direct DeepSeek API hits this path.
- The user config sets reasoning_effort: none (thinking disabled), but because supports_reasoning is False, the reasoning_config dict is never translated into the extra_body sent to the API.
- DeepSeek's direct API uses its own parameter namespace: when thinking is not explicitly disabled, the model defaults to thinking=enabled. The control parameter is {"thinking": {"type": "disabled"}}, not the OpenRouter-style {"reasoning": {"enabled": false}}.
- With thinking enabled, the model generates reasoning_content in its response, but the agent's compression pipeline strips this field before storing the message.
- On the next API call, the compressed messages no longer contain reasoning_content, yet the DeepSeek API requires it to be present whenever the previous assistant message originally had it, producing the 400 error.
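The last two links in the chain can be sketched in isolation: a compression step that drops reasoning_content leaves a message history that the API rejects on replay. compress_message below is a hypothetical stand-in for the agent's real compression pipeline, shown only to illustrate the lossy step.

```python
# Minimal sketch of the failure mechanism. compress_message is a
# hypothetical stand-in for the agent's real compression pipeline.

def compress_message(msg: dict) -> dict:
    """Strip fields the pipeline treats as transient, including
    reasoning_content -- this is the lossy step."""
    drop = {"reasoning_content"}
    return {k: v for k, v in msg.items() if k not in drop}

# Turn 1: with thinking enabled, the assistant reply carries
# reasoning_content alongside its tool call.
assistant_reply = {
    "role": "assistant",
    "content": "",
    "reasoning_content": "I should call the date tool...",
    "tool_calls": [{"id": "call_1", "type": "function",
                    "function": {"name": "run", "arguments": '{"cmd": "date"}'}}],
}

stored = compress_message(assistant_reply)

# Turn 2: the replayed history no longer has the field DeepSeek
# requires, which is what triggers the 400.
assert "reasoning_content" not in stored
```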
Affected Code
run_agent.py:7294-7329 — _supports_reasoning_extra_body()
The function correctly gates reasoning for OpenRouter routes (where deepseek/ is a known prefix at line 7322), but has no awareness of direct DeepSeek API endpoints (api.deepseek.com).
agent/transports/chat_completions.py — missing DeepSeek thinking handling
Before the fix below, there was no code path that sent {"thinking": {"type": "disabled"}} to the direct DeepSeek API.
Working Fix
Added DeepSeek V-series thinking.type control to chat_completions.py before the general reasoning block. This mirrors the existing Kimi/Moonshot pattern (lines 232-240).
# DeepSeek V-series (v4-pro, v4-flash, etc.): thinking.type control
# Direct DeepSeek API uses "thinking" parameter, not OpenRouter's
# "reasoning". Without this, reasoning_effort: none from config is
# silently ignored and the model defaults to thinking=enabled.
# That generates reasoning_content which compression strips →
# 400 "reasoning_content must be passed back" on next turn.
if model_lower.startswith("deepseek-v"):
    _ds_thinking_enabled = True
    if reasoning_config and isinstance(reasoning_config, dict):
        if reasoning_config.get("enabled") is False:
            _ds_thinking_enabled = False
    extra_body["thinking"] = {
        "type": "enabled" if _ds_thinking_enabled else "disabled",
    }
File: agent/transports/chat_completions.py
Lines: insert after line 240 (after Kimi block, before the # Reasoning block at line 257)
Verification
The fix has been confirmed working on a production install (WSL Ubuntu 24.04, hermes-agent v0.11.0, DeepSeek v4-pro direct API). Multi-turn tool-calling conversations that previously hit the 400 on the second turn now complete successfully.
Alternative / More General Fix
A more systematic approach would be to update _supports_reasoning_extra_body() to recognize direct DeepSeek API endpoints, so the general reasoning block in chat_completions.py could handle it. However, that block emits reasoning-keyed dicts, and the direct DeepSeek API expects thinking-keyed dicts — so the model-specific fix above is cleaner and follows the existing Kimi pattern.
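A widened gate might look like the sketch below. Everything here is hypothetical: the real _supports_reasoning_extra_body lives in run_agent.py with different arguments, and the OpenRouter prefix set is invented for illustration. Note that even with this gate, the transport would still need a reasoning-to-thinking translation for the direct endpoint, which is the reason the model-specific fix is preferred.

```python
# Hypothetical sketch of a broadened gate; names and the prefix set
# are illustrative, not the real run_agent.py implementation.

def supports_reasoning_extra_body(base_url: str, model: str) -> bool:
    if "openrouter.ai" in base_url:
        # OpenRouter routes gate on a known provider prefix.
        return model.split("/")[0] in {"deepseek", "anthropic", "openai"}
    if "api.deepseek.com" in base_url:
        # Would also require translating "reasoning" dicts into
        # "thinking" dicts before sending.
        return True
    return False

assert supports_reasoning_extra_body("https://api.deepseek.com/v1", "deepseek-v4-pro")
assert not supports_reasoning_extra_body("https://example.com/v1", "deepseek-v4-pro")
```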
Steps to Reproduce
- Configure Hermes to use the DeepSeek direct API (api.deepseek.com)
- Use model deepseek-v4-pro or deepseek-v4-flash
- Set reasoning_effort: none in config (recommended for cost savings with the direct API)
- Send any message that triggers a tool call (e.g. "run date")
- The agent fails with a 400 on the second API call, after the tool result is returned
Environment
- hermes-agent v0.11.0
- WSL Ubuntu 24.04
- Python 3.11
- DeepSeek API direct (not OpenRouter)
- Model: deepseek-v4-pro