fix(openai_compat): strip <think> tags from content before sending to…#1233
fix(openai_compat): strip <think> tags from content before sending to…#1233denysvitali wants to merge 1 commit intosipeed:mainfrom
Conversation
… user Some models (e.g. MiniMax M2.5, DeepSeek) embed chain-of-thought in the content field using <think> tags rather than a dedicated reasoning_content field. Move these blocks into ReasoningContent so channels don't render raw thinking output to users. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nikolasdehor
left a comment
There was a problem hiding this comment.
This addresses the same problem as PR #1237 (leaked thinking tags) but takes a different approach: extracting <think> content into ReasoningContent rather than stripping all reasoning/thinking tags.
Comparison with PR #1237:
- #1233 (this PR): only handles
<think>...</think>, moves content toReasoningContent - #1237: handles
<think>,<thinking>,<thought>,<reasoning>,<final>, escaped unicode, unclosed blocks -- strips all of them
Feedback on this PR:
-
Only
<think>is handled -- some providers use<thinking>,<thought>, or<reasoning>tags. PR #1237 is more comprehensive in this regard. -
Good idea to preserve reasoning -- moving thinking content to
ReasoningContentrather than discarding it is arguably better than pure stripping. This lets channels/UI that support reasoning display optionally show it. However, the condition `if msgReasoning == "" && strings.Contains(msgContent, "")" means this only fires when no dedicated reasoning field exists. -
Regex concern --
(?s)<think>(.*?)</think>with the lazy quantifier is correct for matching the innermost<think>block. However, this won't handle nested tags (unlikely but possible) or malformed HTML. -
No test coverage -- PR #1237 includes tests; this PR does not. Tests are important for a parsing change like this.
-
Redundant regex compilation inside callback --
thinkTagRe.FindStringSubmatch(m)is called insideReplaceAllStringFunc, which already matched the regex. The submatch is available if you useReplaceAllStringSubmatchFunc(not in stdlib, but the pattern can be restructured to useFindAllStringSubmatch+ manual reconstruction).
I would recommend consolidating this with PR #1237. The "move to ReasoningContent" behavior from this PR is superior to pure stripping, while #1237 has better tag coverage and tests.
|
@denysvitali Hi! This PR has had no activity for over 2 weeks, so I'm closing it for now to keep things organized. Feel free to reopen anytime if you'd like to continue. |
📝 Description
Some models (e.g. MiniMax M2.5, DeepSeek) embed chain-of-thought in the content field using tags rather than a dedicated reasoning_content field. Move these blocks into ReasoningContent so channels don't render raw thinking output to users.
🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
📚 Technical Context (Skip for Docs)
🧪 Test Environment
📸 Evidence (Optional)
Click to view Logs/Screenshots
☑️ Checklist