
fix(web_fetch): include fetched page content in ForLLM response#512

Closed
CrisisAlpha wants to merge 1 commit into sipeed:main from CrisisAlpha:fix/web-fetch-forllm-content
Conversation

@CrisisAlpha
Contributor

📝 Description

The web_fetch tool's ForLLM field only returned a metadata summary line (byte count, extractor type, truncation status), omitting the actual extracted page content. This meant the LLM could never read fetched web pages, causing it to loop calling web_fetch repeatedly until max_tool_iterations was hit.

The fix appends the extracted text content to the ForLLM response so the LLM can read and reason about fetched pages.

🗣️ Type of Change

  • 🐞 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 📖 Documentation update
  • ⚡ Code refactoring (no functional changes, no api changes)

🤖 AI Code Generation

  • 🤖 Fully AI-generated (100% AI, 0% Human)
  • 🛠️ Mostly AI-generated (AI draft, Human verified/modified)
  • 👨‍💻 Mostly Human-written (Human lead, AI assisted or none)

🔗 Related Issue

Fixes #388

📚 Technical Context (Skip for Docs)

🧪 Test Environment

  • Hardware: Mac (x86_64)
  • OS: macOS
  • Model/Provider: N/A (unit tests only, no API keys needed)
  • Channels: N/A

📸 Evidence (Optional)


Before (broken): ForLLM returns only:

Fetched 5000 bytes from https://example.com (extractor: text, truncated: false)

After (fixed): ForLLM returns metadata + actual content:

Fetched 5000 bytes from https://example.com (extractor: text, truncated: false)

[actual page content here]

All tests pass: make fmt, make vet, make test

☑️ Checklist

  • My code/docs follow the style of this project.
  • I have performed a self-review of my own changes.
  • I have updated the documentation accordingly.

Made with Cursor

The web_fetch tool's ForLLM field only returned metadata (byte count,
extractor type, truncation status) but omitted the actual extracted text.
This caused the LLM to never see fetched page content, making the tool
effectively useless and causing repeated fetch loops.

Append the extracted text content to ForLLM so the LLM can read and
reason about fetched web pages.

Fixes sipeed#388

Co-authored-by: Cursor <[email protected]>

@nikolasdehor nikolasdehor left a comment


Critical bug fix — a one-line change that makes the web_fetch tool actually useful. Without this, the LLM saw only metadata and would loop until hitting max iterations.

The fix is correct: appending the actual text content to ForLLM with a double-newline separator.

Also good: the test fixes change && to || in multiple assertions of the form !strings.Contains(x, "a") && !strings.Contains(x, "b"). By De Morgan's laws, the old && version means "neither substring is present", so the test only failed when BOTH substrings were missing and could pass even if one was missing. The corrected || version fails if EITHER substring is missing, which is the right behavior.

One thing to keep in mind: the text variable can be quite large (up to the truncation limit). Since this now goes into ForLLM which is sent as a tool result to the LLM provider, make sure the content is already bounded by the existing truncation logic upstream. Looking at the existing code, text is already truncated by maxOutputLen, so this should be fine.

LGTM.

@lxowalle
Collaborator

Hi @CrisisAlpha, could you try again on the latest branch? I've tested and confirmed that this issue has been fixed on the latest branch.

@CrisisAlpha
Contributor Author

Hey @lxowalle, thanks for checking! I just synced with the latest main (cb0c870) and the issue is still present. In pkg/tools/web.go L486-492, ForLLM contains only the metadata summary (byte count, extractor, truncated flag). The actual fetched text is only included in ForUser as JSON.

Since the agent loop sends ForLLM (not ForUser) to the LLM as the tool result, the LLM never receives the page content. The user sees it in the chat, but the LLM can't reason about it.

Could you double-check? Happy to rebase if needed!

@nikolasdehor

@lxowalle I can confirm CrisisAlpha's analysis is correct -- the bug is still present on main.

Looking at the code path:

  1. In pkg/tools/web.go L486-492, ForLLM is set to only the metadata summary line. The actual text content is only serialized into ForUser as part of the JSON result.

  2. In pkg/agent/loop.go L706-713, the agent loop sends ForLLM (not ForUser) to the LLM as the tool result. This is the correct design -- ForUser is for display to the human, ForLLM is what the model reasons about.

The net effect is that the LLM receives a metadata-only string with zero page content. The LLM has no way to read the fetched page, which causes it to loop calling web_fetch repeatedly until max_tool_iterations is exhausted.

This PR's fix (appending text to ForLLM) is the correct approach. The ForUser/ForLLM split is intentional architecture, and the content simply needs to be in both.

@lxowalle
Collaborator

Okay, I'll double-check.

@lxowalle
Collaborator

Hi @CrisisAlpha, I tested and confirmed that the message for the user is passed to the channel but not to the LLM. I suggest modifying the method to:

	return &ToolResult{
		ForLLM: fmt.Sprintf(
			"Fetched %d bytes from %s (extractor: %s, truncated: %v): %s",
			len(text),
			urlStr,
			extractor,
			truncated,
			string(resultJSON),
		),
		ForUser: "",
	}

This ensures that the message will not be sent to the channel repeatedly.

@sipeed-bot sipeed-bot bot added type: bug Something isn't working domain: tool labels Mar 3, 2026
@lxowalle
Collaborator

lxowalle commented Mar 5, 2026

Fixed by #833.

@lxowalle lxowalle closed this Mar 5, 2026



Development

Successfully merging this pull request may close these issues.

web_fetch tool: ForLLM response missing actual page content
