
fix(web): ensure fresh URL fetch context reaches model #208

Closed
punishell wants to merge 1 commit into sipeed:main from punishell:fix/web-fetch-fresh-url

Conversation

@punishell

Make web_fetch return the fetched payload to the LLM (not just metadata), fix HTML text extraction regex replacements, and add a prompt rule requiring fresh web_fetch calls for URL visit/check requests to prevent stale memory answers.

The bot was constantly hallucinating when asked, e.g., to visit a website.

@Leeaandrob
Collaborator

@Zepan Ensures a fresh URL fetch context reaches the model — fixes web_fetch tool breaking silently when context is reused.

Recommendation: Merge. +8/-6, small correctness fix for web tools.


@Leeaandrob left a comment


PR #208 Deep Review — @punishell

Hey @punishell, thanks for diagnosing this. The root cause is spot-on: ForLLM was sending only metadata ("Fetched 5000 bytes from ...") but never the actual page content. The LLM literally could not see what was fetched, so it hallucinated the page content. This PR correctly fixes that.


Verification

  • Checked out branch locally: fix/web-fetch-fresh-url
  • go vet ./pkg/tools/... ./pkg/agent/... — clean
  • go test ./pkg/tools/... -run "Web|Fetch|Search" — all 10 tests passing
  • go test ./pkg/agent/... — all 10 tests passing
  • Traced the ForLLM content → tool result message path (agent/loop.go:551-558) — confirmed ForLLM is what the model receives as the tool response
  • Verified ReplaceAllLiteralString("") vs ReplaceAllString("") produce identical results when replacement is empty string (no $ expansion possible)

Summary of Findings

HIGH (Should Fix)

  • H1: Existing test TestWebTool_WebFetch_Success now passes by coincidence, not by intent

MEDIUM

  • M1: ReplaceAllLiteralString → ReplaceAllString is a no-op change (adds diff noise)

LOW

  • L1: System prompt rule is English-only

POSITIVE

  1. Critical bug fix — ForLLM was only sending metadata, never the actual content. web_fetch was fundamentally broken.
  2. System prompt defense-in-depth rule is a reasonable pattern to prevent hallucination
  3. Small, focused changeset — exactly what's needed, nothing more
  4. Content extraction and truncation logic is correctly preserved

Verdict: REQUEST CHANGES

What needs to change before merge:

  • Update TestWebTool_WebFetch_Success (web_test.go:39-42) to validate the new ForLLM format (see H1)
  • Remove the no-op ReplaceAllLiteralString → ReplaceAllString changes, or justify why they're needed (see M1)

Estimated effort: ~30 minutes. Happy to re-review.


@Leeaandrob left a comment


PR #208 Re-Review (Post Commit ed0a2e8)

Verdict: REQUEST_CHANGES

@punishell — Thanks for addressing the earlier feedback. The no-op ReplaceAllLiteralString changes are gone, and the English-only keyword approach was replaced with regex-based URL detection. However, the new commit introduces a critical false-positive problem with the URL regex that makes this worse than the original state for many common conversations.


What was fixed (from previous review)

  • No-op ReplaceAllLiteralString → ReplaceAllString changes removed — Clean diff now.
  • English-only keyword list replaced — The new approach uses URL regex detection, which is language-agnostic.
  • Core bug fix intact — ForLLM: string(resultJSON) correctly sends actual content to the model.
  • All tests pass — go test -race ./pkg/tools/... ./pkg/agent/... both PASS.
  • go vet clean

New Findings

[HIGH] URL regex has severe false positives — matches filenames, JS libraries, and PicoClaw's own files

File: pkg/agent/loop.go:48

var userMessageURLRegex = regexp.MustCompile(`(?i)(https?://[^\s<>"')\]]+|\b[a-z0-9][a-z0-9.-]*\.[a-z]{2,}\b(?:/[^\s<>"')\]]*)?)`) 

The bare-domain pattern \b[a-z0-9][a-z0-9.-]*\.[a-z]{2,}\b matches ANY word.extension where extension is 2+ characters. Tested locally:

Input                 Match     Problem
config.json           ✅ MATCH  Every config discussion triggers it
MEMORY.md             ✅ MATCH  PicoClaw's own workspace file!
node.js               ✅ MATCH  Extremely common in tech conversations
vue.js                ✅ MATCH  Common JS library
three.js              ✅ MATCH  Common JS library
my-app.service        ✅ MATCH  Systemd service discussions
https://example.com   ✅ MATCH  Correct
github.com/sipeed     ✅ MATCH  Correct

When a false positive fires, the system injects:

[System note: This message includes URL(s). Before answering, you must call web_fetch for each URL...]

The model then tries to web_fetch("config.json") or web_fetch("MEMORY.md"), which fails and wastes tool calls. This makes the user experience worse than without the feature for any tech conversation mentioning filenames.

Fix: Only match https?:// URLs (not bare domains). The bare domain pattern causes far more harm than good:

var userMessageURLRegex = regexp.MustCompile(`https?://[^\s<>"')\]]+`)

If bare domain support is desired, require at least a TLD from a known list (.com, .org, .io, .dev, etc.) rather than any 2+ char extension.


[MEDIUM] No tests for enforceFreshWebFetchHint

File: pkg/agent/loop.go:281-289

The new function has zero test coverage. Given the regex issues above, tests are especially important to catch regressions. At minimum:

func TestEnforceFreshWebFetchHint(t *testing.T) {
    // Should inject note for real URLs
    assert.Contains(t, enforceFreshWebFetchHint("check https://example.com"), "[System note")
    // Should NOT inject note for filenames
    assert.Equal(t, "edit config.json", enforceFreshWebFetchHint("edit config.json"))
    // Should NOT inject note for plain text
    assert.Equal(t, "hello world", enforceFreshWebFetchHint("hello world"))
}

[LOW] System prompt rule #3 and the enforceFreshWebFetchHint are redundant

The PR adds URL enforcement in two places:

  1. System prompt rule #3 in context.go — tells the model generically to fetch URLs
  2. enforceFreshWebFetchHint in loop.go — detects URLs and injects a system note per-message

Both serve the same purpose. The per-message injection (#2) is more forceful and makes #1 somewhat redundant. Consider keeping only one approach to avoid double-prompting the model.


Summary

Severity  Count  Details
CRITICAL  0      (none)
HIGH      1      URL regex false positives (config.json, MEMORY.md, node.js, etc.)
MEDIUM    1      No tests for enforceFreshWebFetchHint
LOW       1      Redundant enforcement in system prompt + per-message injection
POSITIVE  1      Core bug fix (ForLLM now sends actual content) is correct and valuable

The core fix in web.go (line 407) is valuable and correct. The problem is entirely in the new URL detection logic in loop.go. I'd recommend:

  1. Simplify the regex to only match https?:// URLs (drop bare domain matching)
  2. Add unit tests for enforceFreshWebFetchHint
  3. Consider removing the redundant system prompt rule #3 since the per-message injection is more reliable

@punishell force-pushed the fix/web-fetch-fresh-url branch from 89625e7 to 079812d on February 19, 2026 at 08:27

@nikolasdehor left a comment


The ForLLM fix in web.go is the key change — the LLM was receiving only a summary string instead of actual fetched content, causing hallucination. This alone fixes the core issue.

Concern about enforceFreshWebFetchHint(): the regex matches any URL in any context (code blocks, error messages, etc.), not just "fetch this page" requests. The injected system note is a fragile workaround that's unnecessary after the ForLLM fix. Consider deferring or dropping the hint mechanism — ship the web.go fix alone.

Approving for the critical web.go fix. LGTM.

web_fetch was returning only a short metadata string to the LLM
("Fetched N bytes from ..."), so the model never saw the actual page
content and fell back to hallucinated answers. Now ForLLM carries the
full JSON payload (url, status, text) so the model reasons on real
fetched data.
@punishell force-pushed the fix/web-fetch-fresh-url branch from 079812d to 137cf94 on February 20, 2026 at 06:42
@punishell
Author

Removed everything else, keeping just the 1-line change: ForLLM: string(resultJSON) instead of the metadata string.

@lxowalle
Collaborator

The real reason for this issue is that the returned ForUser parameter was not handled correctly. I have tested this in the new version and it has been fixed. You can try it again.

@nikolasdehor

PRs #389 and #512 both address this. #512 has been reviewed and approved — it should be the one to land. Suggesting this issue reference #512 as the fix.

@sipeed-bot added labels "type: bug (Something isn't working)" and "domain: tool" on Mar 3, 2026
@lxowalle
Collaborator

lxowalle commented Mar 5, 2026

Fixed in #833.

@lxowalle lxowalle closed this Mar 5, 2026