fix(web_fetch): include fetched page content in ForLLM response#512
CrisisAlpha wants to merge 1 commit into sipeed:main
Conversation
The web_fetch tool's ForLLM field only returned metadata (byte count, extractor type, truncation status) but omitted the actual extracted text. This caused the LLM to never see fetched page content, making the tool effectively useless and causing repeated fetch loops. Append the extracted text content to ForLLM so the LLM can read and reason about fetched web pages. Fixes sipeed#388 Co-authored-by: Cursor <[email protected]>
nikolasdehor
left a comment
Critical bug fix — a one-line change that makes the web_fetch tool actually useful. Without this, the LLM saw only metadata and would loop until hitting max iterations.
The fix is correct: appending the actual text content to ForLLM with a double-newline separator.
Also good: the test fixes change && to || in multiple assertions (strings.Contains(x, "a") && strings.Contains(x, "b") → ||). The old tests using && with a negation (!a && !b means "neither contains a NOR b") would only fail if BOTH conditions were missing, so it could pass even if one was missing. The corrected || means the test fails if EITHER is missing, which is the right behavior.
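The assertion-logic difference can be sketched in isolation (the `got` string and helper names here are illustrative, not the project's actual test fixtures):

```go
package main

import (
	"fmt"
	"strings"
)

// oldCheckFails reproduces the buggy failure condition: it is true only
// when BOTH substrings are absent, so a result missing just one of them
// still lets the test pass.
func oldCheckFails(got string) bool {
	return !strings.Contains(got, "bytes") && !strings.Contains(got, "extractor")
}

// newCheckFails reproduces the corrected condition: true when EITHER
// substring is absent, so any missing field fails the test.
func newCheckFails(got string) bool {
	return !strings.Contains(got, "bytes") || !strings.Contains(got, "extractor")
}

func main() {
	// A result that has "bytes" but is missing "extractor".
	got := "Fetched 1024 bytes from https://example.com"
	fmt.Println(oldCheckFails(got)) // false — buggy test would (wrongly) pass
	fmt.Println(newCheckFails(got)) // true  — corrected test catches it
}
```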
One thing to keep in mind: the text variable can be quite large (up to the truncation limit). Since this now goes into ForLLM which is sent as a tool result to the LLM provider, make sure the content is already bounded by the existing truncation logic upstream. Looking at the existing code, text is already truncated by maxOutputLen, so this should be fine.
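As a rough sketch of the kind of upstream bound being relied on here (the function name, limit, and truncation marker are illustrative; the project's actual `maxOutputLen` logic may differ):

```go
package main

import "fmt"

// truncate caps text at maxOutputLen bytes and reports whether anything
// was cut, appending a visible marker so downstream consumers can tell.
func truncate(text string, maxOutputLen int) (string, bool) {
	if len(text) <= maxOutputLen {
		return text, false
	}
	return text[:maxOutputLen] + "... [truncated]", true
}

func main() {
	out, truncated := truncate("0123456789", 4)
	fmt.Println(out, truncated) // 0123... [truncated] true
}
```

Because the bound is applied before the text reaches `ForLLM`, appending it to the tool result cannot grow past the configured limit.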
LGTM.
Hi @CrisisAlpha, could you try again on the latest branch? I've tested there and confirmed that this issue has been fixed.
Hey @lxowalle, thanks for checking! Just synced with the latest main (cb0c870) and the issue is still present. Since the agent loop sends ForLLM (not ForUser) to the LLM as the tool result, the LLM never receives the page content. The user sees it in the chat, but the LLM can't reason about it. Could you double-check? Happy to rebase if needed!
@lxowalle I can confirm CrisisAlpha's analysis is correct -- the bug is still present on main. Looking at the code path:
The net effect is that the LLM receives a metadata-only string with zero page content. The LLM has no way to read the fetched page, which causes it to loop calling web_fetch repeatedly until max_tool_iterations is exhausted. This PR's fix (appending text to ForLLM) is the correct approach. The ForUser/ForLLM split is intentional architecture, and the content simply needs to be in both.
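A minimal sketch of the fix described above, assuming the `ToolResult`/`ForLLM`/`ForUser` names from the thread (the builder function itself and the separator details are illustrative, not the project's actual code):

```go
package main

import "fmt"

// ToolResult mirrors the split described in the thread: ForLLM is sent
// back to the model as the tool result, ForUser is shown in the chat.
type ToolResult struct {
	ForLLM  string
	ForUser string
}

// buildResult is a hypothetical stand-in for the web_fetch result builder;
// the field names come from the discussion, the function itself does not.
func buildResult(urlStr, extractor, text string, truncated bool) *ToolResult {
	summary := fmt.Sprintf("Fetched %d bytes from %s (extractor: %s, truncated: %v)",
		len(text), urlStr, extractor, truncated)
	return &ToolResult{
		// The fix: append the extracted text after the metadata line with
		// a double-newline separator, so the model actually sees the page.
		ForLLM:  summary + "\n\n" + text,
		ForUser: summary,
	}
}

func main() {
	r := buildResult("https://example.com", "readability", "Example Domain.", false)
	fmt.Println(r.ForLLM)
}
```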
Okay, I'll double-check. |
Hi @CrisisAlpha, I tested and confirmed that the message for the user can be passed to the channel, but it cannot be passed to the LLM. I suggest modifying the method to:

```go
return &ToolResult{
	ForLLM: fmt.Sprintf(
		"Fetched %d bytes from %s (extractor: %s, truncated: %v): %s",
		len(text),
		urlStr,
		extractor,
		truncated,
		string(resultJSON),
	),
	ForUser: "",
}
```

This ensures that the message will not be sent to the channel repeatedly.
Fixed in #833.
📝 Description
The `web_fetch` tool's `ForLLM` field only returned a metadata summary line (byte count, extractor type, truncation status), omitting the actual extracted page content. This meant the LLM could never read fetched web pages, causing it to loop calling `web_fetch` repeatedly until `max_tool_iterations` was hit.

The fix appends the extracted text content to the `ForLLM` response so the LLM can read and reason about fetched pages.

🗣️ Type of Change
🤖 AI Code Generation
🔗 Related Issue
Fixes #388
📚 Technical Context (Skip for Docs)

`ForLLM` is what the LLM sees as the tool result. Without the actual content, the tool provides no value and causes the LLM to loop indefinitely.

🧪 Test Environment
📸 Evidence (Optional)
Click to view Logs/Screenshots
Before (broken): `ForLLM` returned only the metadata summary line.

After (fixed): `ForLLM` returns the metadata line plus the actual extracted content.

All tests pass: `make fmt`, `make vet`, `make test` ✓

☑️ Checklist
Made with Cursor