fix(mcp): retry listTools() before evicting client on transient failure #17153
Open
nil957 wants to merge 1 commit into anomalyco:dev
Previously, a single transient listTools() failure (timeout, pipe hiccup, GC pause) would permanently delete the MCP client from the session state with no retry or reconnection mechanism. This caused MCP tools to silently vanish mid-session while the server process was still running.

This commit:
1. Adds retry logic (3 attempts with 1s delay) before marking a client as failed
2. Removes the immediate 'delete s.clients[clientName]' that permanently evicted healthy servers on transient errors
3. Logs retry attempts at warn level for visibility

The client remains in 'failed' status after all retries are exhausted, but is no longer deleted, allowing future state() calls to attempt reconnection.

Fixes anomalyco#17099
Contributor
Thanks for updating your PR! It now meets our contributing guidelines. 👍
Issue for this PR
Closes #17099
Type of change
What does this PR do?
This PR fixes a critical bug where MCP tools silently vanish mid-session after a single transient listTools() failure.

The Problem:
In mcp/index.ts, when client.listTools() fails (timeout, pipe hiccup, GC pause), the code immediately executes delete s.clients[clientName], permanently removing the client from the singleton state. The MCP server process may still be running perfectly fine, but the tools are gone until OpenCode restarts.

The Fix:
1. Adds retry logic (3 attempts with a 1s delay) before marking a client as failed
2. Removes the immediate delete s.clients[clientName] that permanently evicted healthy servers
3. Logs retry attempts at warn level for visibility

Why it works:
Transient failures (network blips, GC pauses, brief timeouts) are common in long-running sessions. Retrying gives the MCP server a chance to respond before we give up. Because the client is no longer deleted, it stays in the session state with a failed status, so later state() calls can attempt to reconnect.
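For reviewers who want the shape of the change without opening the diff, here is a minimal sketch of the retry-then-mark-failed pattern described above. It is not the code in mcp/index.ts: the names withRetry, refreshTools, ToolLister, and ClientEntry are hypothetical, and the state map is simplified to a plain record. Only the 3-attempt count, the 1s delay, the warn-level log, and the "mark failed instead of delete" behaviour come from the PR description.

```typescript
// Illustrative sketch only. All names here are hypothetical and are not
// taken from mcp/index.ts.

type Status = "connected" | "failed"

interface ToolLister {
  listTools(): Promise<string[]>
}

interface ClientEntry {
  client: ToolLister
  status: Status
  tools: string[]
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Call fn up to `attempts` times, waiting `delayMs` after each failed attempt
// except the last. Each failure is reported through `warn`.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  delayMs = 1000,
  warn: (msg: string) => void = console.warn,
): Promise<T> {
  let lastError: unknown
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      warn(`listTools attempt ${attempt}/${attempts} failed`)
      if (attempt < attempts) await sleep(delayMs)
    }
  }
  throw lastError
}

// Refresh one client's tool list. When every attempt fails, the entry is
// marked "failed" but stays in the map, so a later call can try again.
async function refreshTools(
  clients: Record<string, ClientEntry>,
  name: string,
  attempts = 3,
  delayMs = 1000,
): Promise<void> {
  const entry = clients[name]
  if (!entry) return
  try {
    entry.tools = await withRetry(() => entry.client.listTools(), attempts, delayMs)
    entry.status = "connected"
  } catch {
    entry.status = "failed" // deliberately no `delete clients[name]`
  }
}
```

With this shape, a server that fails twice and then answers is treated as healthy, while a server that fails all three attempts is still present in the map with a failed status and can be retried on the next refresh.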
How did you verify your code works?
Screenshots / recordings
N/A - backend change
Checklist