Handle local agent connection recovery#2853

Merged
wwwillchen merged 8 commits into dyad-sh:main from wwwillchen:local-agent-retry-codex-2
Mar 2, 2026

Conversation

@wwwillchen
Collaborator

Handle connection-drop and retry behavior in local agent IPC handling.
Add an end-to-end scenario that verifies recovery after a temporary local agent disconnect.
Align test fixtures and snapshots for local-agent reconnection behavior.

@wwwillchen wwwillchen requested a review from a team March 2, 2026 21:35
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the resilience of the local agent by introducing automatic retry mechanisms for transient connection failures during AI stream processing. The changes ensure that agent conversations can seamlessly recover from network interruptions, preserving ongoing work and providing a more robust user experience. This improvement is validated through new comprehensive end-to-end tests and supported by an updated fake LLM server capable of simulating these failure conditions.

Highlights

  • Local Agent Connection Recovery: Implemented robust retry and resume logic within the local agent handler to automatically recover from transient connection drops during AI stream processing, ensuring continuity of agent operations.
  • Tool Event Replay on Retry: Ensured that tool calls and their results emitted before a connection drop are correctly replayed and re-injected into the message history upon retry, preventing loss of context and redundant execution.
  • End-to-End Testing for Resilience: Added a new end-to-end test scenario that simulates a local agent connection drop and verifies the system's ability to recover, complete the task, and maintain UI integrity without displaying errors.
  • Enhanced Fake LLM Server for Testing: Updated the fake LLM server to support configurable connection drop simulations, allowing specific turns and attempts to be designated for connection termination to thoroughly test recovery mechanisms.
Changelog
  • e2e-tests/fixtures/engine/local-agent/connection-drop.ts
    • Added a new fixture to define a scenario where the local agent connection drops on the first attempt of turn 1, simulating a realistic network interruption.
  • e2e-tests/local_agent_connection_retry.spec.ts
    • Added a new E2E test case to verify the local agent's ability to recover from a simulated connection drop, checking for correct file creation and UI state.
  • e2e-tests/snapshots/local_agent_connection_retry.spec.ts_after-connection-retry.txt
    • Added a new snapshot file capturing the expected file system state after a successful connection recovery in the E2E test.
  • e2e-tests/snapshots/local_agent_connection_retry.spec.ts_local-agent---recovers-from-connection-drop-1.aria.yml
    • Added a new ARIA snapshot for the chat UI, confirming the correct display of messages and tool cards after connection recovery.
  • src/__tests__/local_agent_handler.test.ts
    • Added a unit test to verify that the stream retries and resumes correctly after a transient termination, appending new output.
    • Added a unit test to confirm that previously emitted tool calls and results are replayed in the message history when a stream is retried after termination.
  • src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
    • Introduced constants MAX_TERMINATED_STREAM_RETRIES, STREAM_RETRY_BASE_DELAY_MS, and STREAM_CONTINUE_MESSAGE for retry configuration and user instruction.
    • Defined RetryReplayEvent type to capture tool call and tool result events for replaying during stream retries.
    • Refactored handleLocalAgentStream to include a while loop for stream retries, managing terminatedRetryCount and needsContinuationInstruction.
    • Modified stream processing to capture streamErrorFromIteration and streamErrorFromCallback for comprehensive error handling.
    • Implemented shouldRetryTerminatedStreamError to determine if a stream error warrants a retry based on error type, retry count, and abort status.
    • Added maybeCaptureRetryReplayEvent to record tool calls and results that need to be replayed if a stream terminates.
    • Implemented maybeAppendRetryReplayForRetry to construct and append replay messages (partial response, tool calls, tool results) to the message history for subsequent retry attempts.
    • Added utility functions buildTerminatedRetryContinuationInstruction, unwrapStreamError, getErrorMessage, isTerminatedStreamError, delay, and toToolResultOutput to support the retry logic.
    • Updated the access to prevStepMessages to use optional chaining (?.response?.messages) for safer property access.
  • testing/fake-llm-server/localAgentHandler.ts
    • Added connectionAttempts map to track retry attempts for connection drop simulation.
    • Implemented logic to check dropConnectionByTurn and dropConnectionOnAttempts from the fixture to simulate connection drops by destroying the response socket.
  • testing/fake-llm-server/localAgentTypes.ts
    • Added dropConnectionOnAttempts property to LocalAgentFixture to specify which attempts should trigger a connection drop.
    • Added dropConnectionByTurn property to LocalAgentFixture for per-turn connection drop configuration, including turnIndex and attempts.
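The two new fixture fields described above can be sketched together as follows. This is a hypothetical illustration: only dropConnectionOnAttempts and dropConnectionByTurn (with turnIndex and attempts) come from this changelog; the surrounding fixture shape and the predicate are assumptions.

```typescript
// Hypothetical illustration of the new LocalAgentFixture fields.
// Only `dropConnectionOnAttempts` and `dropConnectionByTurn` come from
// this PR; the surrounding shape is assumed for the sketch.
interface DropConnectionByTurn {
  turnIndex: number; // which conversation turn to interrupt
  attempts: number[]; // 1-based attempts at that turn that should drop
}

interface LocalAgentFixtureSketch {
  trigger: string; // assumed field
  dropConnectionOnAttempts?: number[]; // top-level attempt list
  dropConnectionByTurn?: DropConnectionByTurn[]; // per-turn configuration
}

// Drop the connection on the first attempt of turn 1; the retry succeeds.
const connectionDropFixture: LocalAgentFixtureSketch = {
  trigger: "simulate a connection drop",
  dropConnectionByTurn: [{ turnIndex: 1, attempts: [1] }],
};

// A simple predicate a fake server might apply per attempt: a per-turn
// rule wins; otherwise fall back to the top-level attempt list.
function shouldDrop(
  fixture: LocalAgentFixtureSketch,
  turnIndex: number,
  attempt: number,
): boolean {
  const rule = fixture.dropConnectionByTurn?.find(
    (r) => r.turnIndex === turnIndex,
  );
  if (rule) return rule.attempts.includes(attempt);
  return fixture.dropConnectionOnAttempts?.includes(attempt) ?? false;
}
```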
Activity
  • No human activity (comments, reviews) has been recorded on this pull request yet.

@wwwillchen
Collaborator Author

@BugBot run

Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request introduces a connection recovery mechanism for the local agent, adding retry logic with backoff for transient stream terminations, validated by unit and end-to-end tests. A security audit confirms the changes introduce no new vulnerabilities: the implementation correctly uses AbortController, retry limits, and backoff delays while preserving conversation-history integrity. My review focuses on making the error-detection logic more robust and correcting the handling of tool results during retry to ensure compatibility with the AI SDK; I've also included a minor refactoring suggestion to improve code clarity.

Comment on lines +1205 to +1219
function isTerminatedStreamError(error: unknown): boolean {
  const normalized = unwrapStreamError(error);
  const message = getErrorMessage(normalized).toLowerCase();
  if (message.includes("typeerror: terminated") || message === "terminated") {
    return true;
  }
  const cause =
    isRecord(normalized) && "cause" in normalized
      ? normalized.cause
      : undefined;
  if (cause) {
    return isTerminatedStreamError(cause);
  }
  return false;
}

high

The current implementation of isTerminatedStreamError is brittle as it primarily checks for the specific string "typeerror: terminated". This is tailored to the mocked error in the unit tests but is unlikely to cover the range of real-world network errors that can occur when a stream is terminated (e.g., ECONNRESET).

To make the retry logic more robust and effective in production, this check should be expanded to include common network error codes (like ECONNRESET) and more generic error messages (like "fetch failed").
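A sketch of what the broadened check might look like. The error codes and message fragments below are common Node.js/undici network-failure signatures chosen for illustration, not a verified or exhaustive list:

```typescript
// Sketch: broaden detection beyond the exact "terminated" string.
// The code and fragment lists are illustrative assumptions, not exhaustive.
const RETRYABLE_CODES = new Set([
  "ECONNRESET",
  "EPIPE",
  "ETIMEDOUT",
  "UND_ERR_SOCKET",
]);
const RETRYABLE_FRAGMENTS = ["terminated", "fetch failed", "socket hang up"];

function isTransientNetworkError(error: unknown): boolean {
  if (error == null || typeof error !== "object") return false;
  const err = error as { code?: unknown; message?: unknown; cause?: unknown };
  // Prefer a typed error code when one is present.
  if (typeof err.code === "string" && RETRYABLE_CODES.has(err.code)) {
    return true;
  }
  const message =
    typeof err.message === "string" ? err.message.toLowerCase() : "";
  if (RETRYABLE_FRAGMENTS.some((f) => message.includes(f))) return true;
  // Fall back to inspecting the error's cause chain.
  return err.cause != null && isTransientNetworkError(err.cause);
}
```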

Comment on lines +1357 to +1366
function toToolResultOutput(value: unknown): { type: "text"; value: string } {
  if (typeof value === "string") {
    return { type: "text", value };
  }
  try {
    return { type: "text", value: JSON.stringify(value) };
  } catch {
    return { type: "text", value: String(value) };
  }
}

high

The toToolResultOutput function incorrectly wraps the tool output in an object { type: 'text', value: '...' }. This structure is then assigned to the output field of a tool-result message part. The output field of a tool-result should contain the raw, JSON-serializable result of the tool execution, not an object that mimics a content part.

This incorrect nesting will likely be misinterpreted by the AI SDK when it serializes the message for the model provider, potentially causing errors. The function should pass through the raw tool output, as the AI SDK handles the final serialization. Note that the corresponding unit test assertion in src/__tests__/local_agent_handler.test.ts will also need to be updated to reflect this change.

function toToolResultOutput(value: unknown): unknown {
  // The AI SDK will handle serialization of the output, so we should pass the raw value through.
  return value;
}

Comment on lines +957 to +966
maybeAppendRetryReplayForRetry({
  partialResponse: fullResponse.slice(
    passAttemptResponseStartLength,
  ),
  retryReplayEvents,
  currentMessageHistoryRef: currentMessageHistory,
  accumulatedAiMessagesRef: accumulatedAiMessages,
  onCurrentMessageHistoryUpdate: (next) =>
    (currentMessageHistory = next),
});

medium

The function maybeAppendRetryReplayForRetry modifies currentMessageHistory via a callback, onCurrentMessageHistoryUpdate. This pattern of passing a setter can make the data flow harder to trace and maintain.

For better clarity, consider refactoring maybeAppendRetryReplayForRetry to be a pure function that returns the new message history. The caller can then explicitly re-assign currentMessageHistory. This would make the state update more explicit and easier to follow.
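The suggested pure-function shape might look like this minimal sketch (the message type is simplified; names beyond those quoted above are assumptions):

```typescript
// Sketch: return the new history instead of mutating via a setter callback.
type ModelMessage = {
  role: "system" | "user" | "assistant" | "tool";
  content: unknown;
};

function appendRetryReplay(
  history: ModelMessage[],
  replayMessages: ModelMessage[],
): ModelMessage[] {
  // Pure: leaves the input untouched and returns a new array when there
  // is something to append, or the original array when there is not.
  return replayMessages.length > 0 ? [...history, ...replayMessages] : history;
}

// The caller then re-assigns explicitly, making the state update traceable:
// currentMessageHistory = appendRetryReplay(currentMessageHistory, replay);
```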


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ca24a3719

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +958 to +959
partialResponse: fullResponse.slice(
passAttemptResponseStartLength,

P1 Badge Replay only model output when constructing retry history

The retry branch builds partialResponse from fullResponse, but fullResponse includes synthetic XML added via onXmlComplete (tool-call/result markup and other UI tags), not just model text. If a disconnect happens after tool XML is appended, retry history injects that XML back as assistant text before the next streamText() call, so the model sees renderer-specific markup and duplicated tool transcript data. This can derail continuation and produce repeated or inconsistent tool behavior; retry context should be derived from streamed model text only.

Useful? React with 👍 / 👎.

Comment on lines +1314 to +1316
if (event.type === "tool-call") {
  replayMessages.push({
    role: "assistant",

P1 Badge Replay only completed tool exchanges after termination

This retry replay loop re-adds every captured tool-call even when no matching tool-result was captured before the stream dropped. In the common failure window between a tool-call event and its tool-result, the next attempt receives an orphaned tool call in history, which can cause provider-side message validation failures or make the model re-run a state-changing tool. Retry replay should only include tool calls that have corresponding results.

Useful? React with 👍 / 👎.
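One way to implement the pairing this comment asks for — a sketch assuming the captured events carry a toolCallId, which is not confirmed by the excerpt above:

```typescript
// Sketch: keep only tool-call events whose toolCallId has a captured
// result, so orphaned calls are never replayed into the retry history.
type ReplayEvent =
  | { type: "tool-call"; toolCallId: string; toolName: string; input: unknown }
  | { type: "tool-result"; toolCallId: string; output: unknown };

function completedExchangesOnly(events: ReplayEvent[]): ReplayEvent[] {
  const resultIds = new Set(
    events.filter((e) => e.type === "tool-result").map((e) => e.toolCallId),
  );
  // Tool-calls with no result before the drop are excluded entirely.
  return events.filter(
    (e) => e.type === "tool-result" || resultIds.has(e.toolCallId),
  );
}
```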

Contributor

@devin-ai-integration bot left a comment


Devin Review found 1 potential issue.

View 7 additional findings in Devin Review.

Open in Devin Review

cubic-dev-ai[bot]

This comment was marked as resolved.

@greptile-apps
Contributor

greptile-apps bot commented Mar 2, 2026

Greptile Summary

This PR adds connection-drop resilience to the local agent IPC stream handler: when a TypeError: terminated (TCP teardown mid-stream) is detected, the handler replays any already-emitted tool-call/tool-result events into the message history, appends a system continuation instruction, and re-opens the stream (up to MAX_TERMINATED_STREAM_RETRIES = 2 times with exponential backoff). It also adds two E2E test scenarios for the two main drop windows (during a text turn, and mid-tool-call stream), two unit tests, and extends the fake-server fixture DSL with dropConnectionAfterToolCallByTurn.

Key observations:

  • Fake server turn-index bug: countToolResultRounds in testing/fake-llm-server/localAgentHandler.ts finds the last user message as its baseline. In a retry, the continuation instruction is the last user message, so replayed tool results (which appear before it) are not counted — turnIndex resets to 0. For the connection-drop fixture (drop after a completed tool turn), this causes the fake server to re-serve turn 0 on retry, executing write_file a second time. The write is idempotent so no data is lost, but toHaveCount(1) for edit-card assertions could become flaky.
  • isTerminatedStreamError relies on exact error-message strings ("typeerror: terminated") which are Node.js/V8 specific and could silently stop retrying if formatting changes across environments or SDK versions.
  • Duplicated retry-decision blocks: the same ~20-line retry handling sequence is copy-pasted for streamErrorFromIteration and streamResult.response finalization errors; extracting a shared helper would reduce maintenance surface.
  • Confusing variable names in maybeAppendRetryReplayForRetry: toolCallsWithResult is populated from tool-result events and toolResultsWithCall from tool-call events — the names are semantically inverted relative to the event type that populates them.
  • Good incidental fix: optional chaining was added for steps[...].response?.messages (line ~803), preventing a potential crash when step data is incomplete.
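A sketch of the baseline fix implied by the first observation above — skipping the continuation instruction when locating the last real user message, so replayed tool rounds are still counted. The message shape and the STREAM_CONTINUE_MESSAGE text are assumptions:

```typescript
// Sketch: when computing the turn index, ignore a trailing retry
// continuation instruction appended as a user message.
type Msg = { role: string; content: string };
const STREAM_CONTINUE_MESSAGE = "Continue from where you left off."; // assumed text

function lastRealUserIndex(messages: Msg[]): number {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m.role === "user" && m.content !== STREAM_CONTINUE_MESSAGE) return i;
  }
  return -1;
}

function countToolResultRoundsFixed(messages: Msg[]): number {
  const baseline = lastRealUserIndex(messages);
  // Count tool rounds after the real user message, including replayed ones,
  // so a retry does not reset turnIndex to 0.
  return messages.slice(baseline + 1).filter((m) => m.role === "tool").length;
}
```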

Confidence Score: 3/5

  • Mostly safe to merge — the production retry logic is sound, but the fake-server turn-index issue could make the new E2E tests flaky, and error detection relies on brittle string matching.
  • The core retry mechanism (replay events, continuation instruction, exponential backoff) is well-designed and the unit tests cover the happy-path retry scenarios. However, the fake server's countToolResultRounds does not account for the continuation instruction being appended as the last user message, which can cause the server to re-serve an earlier turn (and duplicate tool execution) in the connection-drop E2E scenario. Additionally, isTerminatedStreamError uses Node.js-specific message strings that could stop retrying silently if the error format changes, and the duplicated retry blocks increase future maintenance risk.
  • testing/fake-llm-server/localAgentHandler.ts (countToolResultRounds baseline logic) and src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts (isTerminatedStreamError and duplicated retry blocks).

Important Files Changed

Filename Overview
src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts Core retry logic added; contains duplicated retry blocks and confusingly-named variables in maybeAppendRetryReplayForRetry, plus brittle error-string detection in isTerminatedStreamError. Also includes a good bug-fix for optional chaining on steps[...].response?.messages.
testing/fake-llm-server/localAgentHandler.ts countToolResultRounds uses the last user message as its baseline, but the retry path appends a continuation instruction as the last user message — causing the function to return turnIndex=0 regardless of replayed tool rounds, potentially triggering duplicate tool execution in E2E tests.
testing/fake-llm-server/localAgentTypes.ts New dropConnectionAfterToolCallByTurn field added to LocalAgentFixture type; well-documented with JSDoc, straightforward addition.
testing/fake-llm-server/chatCompletionHandler.ts Correctly extends the synthetic-message detection to also skip the new retry continuation instruction when searching for the original fixture trigger; minimal, correct change.
e2e-tests/local_agent_connection_retry.spec.ts Good coverage of both drop-before-finish and drop-after-tool-call scenarios; positional ordering assertions are a nice touch, though the toHaveCount(1) assertion for edit cards may be fragile if duplicate tool execution occurs (see fake server issue).
src/__tests__/local_agent_handler.test.ts Two well-structured unit tests added for text-only and tool-replay retry scenarios; verifies continuation instruction injection and replayed tool history on second attempt.
e2e-tests/fixtures/engine/local-agent/connection-drop.ts Fixture correctly uses dropConnectionByTurn to drop turn 1 (post-tool text turn), which is a realistic mid-stream interruption scenario.
e2e-tests/fixtures/engine/local-agent/connection-drop-after-tool-call.ts Fixture for drop-after-tool-call scenario is correctly configured with dropConnectionAfterToolCallByTurn; tests the "orphaned tool call" window.

Sequence Diagram

sequenceDiagram
    participant Client as Electron Client
    participant Handler as local_agent_handler
    participant SDK as AI SDK (streamText)
    participant LLM as Local Agent LLM

    Client->>Handler: handleLocalAgentStream()
    loop Retry while loop (max 2 retries)
        Handler->>SDK: streamText(attemptMessages)
        SDK->>LLM: POST /v1/chat/completions
        LLM-->>SDK: SSE stream (text-delta, tool-call, tool-result)
        SDK-->>Handler: fullStream events (captured in retryReplayEvents)
        
        alt Stream terminates normally
            SDK-->>Handler: streamResult.response resolved
            Handler->>Handler: Accumulate responseMessages
            Handler->>Client: sendResponseChunk()
            Note over Handler: break out of retry loop
        else Stream terminates with TypeError("terminated")
            Note over Handler: isTerminatedStreamError() → true
            Handler->>Handler: maybeAppendRetryReplayForRetry()<br/>(inject replayed tool exchanges into currentMessageHistory)
            Handler->>Handler: append continuation instruction<br/>(needsContinuationInstruction = true)
            Handler->>Handler: delay(400ms * retryCount)
            Note over Handler: continue → retry loop
            Handler->>SDK: streamText([...replay, continuationInstruction])
            SDK->>LLM: POST /v1/chat/completions (with context)
            LLM-->>SDK: SSE stream (resumed)
            SDK-->>Handler: fullStream events
            SDK-->>Handler: streamResult.response resolved
            Note over Handler: break
        else Retries exhausted (retryCount >= MAX_TERMINATED_STREAM_RETRIES)
            Handler->>Client: throw error (caught upstream)
        end
    end
    Handler->>Client: chat:response:end

Last reviewed commit: 47623be

@github-actions bot added the needs-human:review-issue label ("ai agent flagged an issue that requires human review") Mar 2, 2026
@github-actions
Contributor

github-actions bot commented Mar 2, 2026

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

Severity File Issue
🟡 MEDIUM src/.../local_agent_handler.ts Retry-on-terminated logic duplicated in two catch branches
🟡 MEDIUM src/.../local_agent_handler.ts isTerminatedStreamError uses fragile string matching without documenting source
🟢 Low Priority Notes (4 items)
  • baseMessageHistoryCount not updated after retry replay - src/.../local_agent_handler.ts — If mid-turn compaction triggers during a retry iteration, the stale baseMessageHistoryCount could cause incorrect message slicing of in-flight messages.
  • Recursive cause traversal has no depth limit - src/.../local_agent_handler.ts:~1205 — isTerminatedStreamError recursively walks error.cause with no max depth. A circular cause chain (unlikely but possible) would stack overflow.
  • Whitespace-only partial responses silently discarded - src/.../local_agent_handler.ts — maybeAppendRetryReplayForRetry uses partialResponse.trim() to decide whether to replay text. Whitespace-only partial output is dropped from model context but remains in fullResponse, creating a minor mismatch.
  • dropConnectionOnAttempts defined but never exercised - testing/fake-llm-server/localAgentTypes.ts — The simpler top-level field exists in the type and handler fallback logic but no fixture uses it. Consider removing it or adding a fixture.
🚫 Dropped False Positives (8 items)
  • passProducedChatText not reset between retry attempts — Dropped: Tracking text production across all attempts is correct behavior. The flag indicates whether the pass produced any text at all, regardless of which attempt emitted it.
  • Partial text from dropped connection remains in chat — Dropped: This is intentional design. STREAM_CONTINUE_MESSAGE instructs the model to continue from where it left off, so the partial text is preserved and the continuation seamlessly follows.
  • onError callback timing: streamErrorFromCallback may not be set when checked — Dropped: The ?? operator correctly handles both paths — either the iterator throws (caught as streamErrorFromIteration) or the callback fires first. The fallback covers both orderings.
  • Module-level toolStreamingEntries not cleared on retry — Dropped: The AI SDK generates unique UUIDs per tool call, so stale entries from a failed attempt won't collide with new ones. The orphaned entries are a theoretical memory concern but not a corruption risk.
  • connectionAttempts Map grows unboundedly — Dropped: E2E test servers are short-lived per test run, and session IDs are unique per test. Not a practical concern.
  • No user-visible indication of retry occurring — Dropped: The retry delay is 400–800ms, well within normal streaming latency. Adding a UI indicator for sub-second pauses would be over-engineering.
  • Raw TypeError surfaces when retries exhausted — Dropped: Error propagation via throw streamError is correct. The error is handled by the caller's existing error handling path, same as before this PR.
  • Retry counter resets per pass — Dropped: Per-pass retry semantics are reasonable — each pass is a separate conversation turn with independent network conditions.

Generated by Dyadbot multi-agent code review

Comment on lines +985 to +1011
if (
  shouldRetryTerminatedStreamError({
    error: err,
    retryCount: terminatedRetryCount,
    aborted: abortController.signal.aborted,
  })
) {
  maybeAppendRetryReplayForRetry({
    partialResponse: fullResponse.slice(
      passAttemptResponseStartLength,
    ),
    retryReplayEvents,
    currentMessageHistoryRef: currentMessageHistory,
    accumulatedAiMessagesRef: accumulatedAiMessages,
    onCurrentMessageHistoryUpdate: (next) =>
      (currentMessageHistory = next),
  });
  terminatedRetryCount += 1;
  needsContinuationInstruction = true;
  const retryDelayMs =
    STREAM_RETRY_BASE_DELAY_MS * terminatedRetryCount;
  logger.warn(
    `Transient stream termination while finalizing response for chat ${req.chatId}; retrying pass (${terminatedRetryCount}/${MAX_TERMINATED_STREAM_RETRIES}) after ${retryDelayMs}ms`,
  );
  await delay(retryDelayMs);
  continue;
}

🟡 MEDIUM | duplication

Retry-on-terminated logic duplicated in two catch branches

This retry block (lines 985–1011) is nearly identical to the one at lines 948–975 — the only difference is the log message string. Any future change to the retry logic (delay formula, counter increment, telemetry, etc.) must be applied in both places, making it easy to introduce inconsistencies.

💡 Suggestion: Extract a shared helper, e.g. async function performTerminatedRetry(logContext: string), that encapsulates maybeAppendRetryReplayForRetry, counter increment, delay, and the continue signal. Call it from both branches.
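The extraction might look like this sketch. The constant values match those named in this PR, but the helper's exact signature (a mutable state object plus a boolean "should continue" result) is an assumption:

```typescript
// Sketch: one helper for the duplicated retry sequence. Returns true when
// the caller should `continue` the retry loop, false when it should rethrow.
const MAX_TERMINATED_STREAM_RETRIES = 2;
const STREAM_RETRY_BASE_DELAY_MS = 400;

interface RetryState {
  terminatedRetryCount: number;
  needsContinuationInstruction: boolean;
}

async function performTerminatedRetry(
  state: RetryState,
  logContext: string,
  appendReplay: () => void, // wraps maybeAppendRetryReplayForRetry at the call site
): Promise<boolean> {
  if (state.terminatedRetryCount >= MAX_TERMINATED_STREAM_RETRIES) return false;
  appendReplay();
  state.terminatedRetryCount += 1;
  state.needsContinuationInstruction = true;
  const retryDelayMs = STREAM_RETRY_BASE_DELAY_MS * state.terminatedRetryCount;
  console.warn(
    `${logContext}; retrying pass (${state.terminatedRetryCount}/${MAX_TERMINATED_STREAM_RETRIES}) after ${retryDelayMs}ms`,
  );
  await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
  return true;
}
```

Both catch branches would then reduce to `if (await performTerminatedRetry(state, "...", replay)) continue; throw err;`, with only the log context differing.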

Comment on lines +1205 to +1219
function isTerminatedStreamError(error: unknown): boolean {
  const normalized = unwrapStreamError(error);
  const message = getErrorMessage(normalized).toLowerCase();
  if (message.includes("typeerror: terminated") || message === "terminated") {
    return true;
  }
  const cause =
    isRecord(normalized) && "cause" in normalized
      ? normalized.cause
      : undefined;
  if (cause) {
    return isTerminatedStreamError(cause);
  }
  return false;
}

🟡 MEDIUM | fragile-detection

isTerminatedStreamError relies on fragile string matching without documenting the source

This function detects "terminated" errors by matching lowercased error message strings ("typeerror: terminated", "terminated"). This is a fragile heuristic that will silently stop working if the underlying HTTP library (undici? node-fetch? Node.js built-in?) changes its error message wording. There's no comment explaining:

  • Which library emits TypeError: terminated and under what condition
  • Why a typed error check (e.g., error.code or error subclass) can't be used instead

The recursive cause traversal (with no depth limit) adds to the opacity.

💡 Suggestion: Add a comment documenting which library emits this error and why string matching is necessary. If possible, also check error.code or the error constructor as a more robust detection mechanism.
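A sketch combining both suggestions — a documenting comment plus a bounded cause traversal. The attribution to undici is itself an assumption that should be verified against the Node.js version in use:

```typescript
// Sketch: same string-based detection, but with the suspected source
// documented and the cause traversal bounded against circular chains.
// ASSUMPTION to verify: undici (Node's built-in fetch) throws
// `TypeError: terminated` when the socket closes mid-body, and exposes
// no stable error code for this case, which is why string matching is used.
const MAX_CAUSE_DEPTH = 5;

function isTerminatedStreamErrorBounded(error: unknown, depth = 0): boolean {
  if (depth > MAX_CAUSE_DEPTH || error == null || typeof error !== "object") {
    return false;
  }
  const err = error as { message?: unknown; cause?: unknown };
  const message =
    typeof err.message === "string" ? err.message.toLowerCase() : "";
  if (message.includes("terminated")) return true;
  return isTerminatedStreamErrorBounded(err.cause, depth + 1);
}
```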

cursor[bot]

This comment was marked as resolved.

`Transient stream termination for chat ${req.chatId}; retrying pass (${terminatedRetryCount}/${MAX_TERMINATED_STREAM_RETRIES}) after ${retryDelayMs}ms`,
);
await delay(retryDelayMs);
continue;

Duplicated retry-and-continue blocks risk inconsistent future fixes

Low Severity

The shouldRetryTerminatedStreamError → maybeAppendRetryReplayForRetry → increment → delay → continue sequence is copy-pasted nearly identically across two error-handling paths (stream iteration error and response finalization error). Only the error variable name and log message differ. If the retry logic ever needs updating, both blocks must be changed in lockstep, creating a maintenance risk and inconsistency hazard.

Additional Locations (1)


@wwwillchen
Collaborator Author

@BugBot run


@chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1583cc151c


Comment on lines +445 to +447
const attemptKey = `${sessionId}-${passIndex}-${turnIndex}`;
const currentAttempt = (connectionAttempts.get(attemptKey) || 0) + 1;
connectionAttempts.set(attemptKey, currentAttempt);

P2 Badge Reset connection-drop attempt state per fixture session

The global connectionAttempts counter is incremented for each ${sessionId}-${passIndex}-${turnIndex} key but never cleared, and sessionId is derived from the first user message content. When the same fixture prompt is reused in the same fake-server process (for example with Playwright --repeat-each or multiple tests that send the same trigger), later runs start at attempt 2+ so rules like attempts: [1] no longer drop the connection, making the reconnection fixtures non-deterministic and potentially masking regressions.

Useful? React with 👍 / 👎.
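One possible mitigation, sketched under the assumption that the handler can tell when a conversation is starting fresh (e.g., the request contains no prior assistant messages); the reset heuristic is illustrative, not part of this PR:

```typescript
// Sketch: reset a session's attempt counters when a conversation starts
// over, so rules like `attempts: [1]` fire again under --repeat-each.
const connectionAttempts = new Map<string, number>();

function noteAttempt(
  sessionId: string,
  passIndex: number,
  turnIndex: number,
  isFreshConversation: boolean, // assumed: detectable from request shape
): number {
  if (isFreshConversation) {
    // Forget counters from earlier runs of the same fixture prompt.
    for (const key of [...connectionAttempts.keys()]) {
      if (key.startsWith(`${sessionId}-`)) connectionAttempts.delete(key);
    }
  }
  const attemptKey = `${sessionId}-${passIndex}-${turnIndex}`;
  const currentAttempt = (connectionAttempts.get(attemptKey) ?? 0) + 1;
  connectionAttempts.set(attemptKey, currentAttempt);
  return currentAttempt;
}
```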

Contributor

@cubic-dev-ai bot left a comment


2 issues found across 7 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="e2e-tests/local_agent_connection_retry.spec.ts">

<violation number="1" location="e2e-tests/local_agent_connection_retry.spec.ts:44">
P2: The conversation-order check is layout-dependent (`boundingBox().y`) and can become flaky across environments; assert DOM/message sequence instead of pixel coordinates.</violation>
</file>

<file name="testing/fake-llm-server/localAgentHandler.ts">

<violation number="1" location="testing/fake-llm-server/localAgentHandler.ts:480">
P2: The global `connectionAttempts` map is never cleared between test sessions. Since `sessionId` is derived from user message content, reusing the same fixture prompt within the same fake-server process (e.g., via Playwright `--repeat-each` or multiple tests sharing a trigger) causes the attempt counter to start at 2+ on subsequent runs. This means `attempts: [1]` rules won't fire, making connection-drop fixtures non-deterministic and potentially masking regressions. Consider clearing the relevant keys when a new session/fixture starts, or resetting the map between test runs.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment on lines +44 to +52
const introY = (await introText.boundingBox())?.y;
const editCardY = (await recoveredEditCard.boundingBox())?.y;
const completionY = (await completionText.boundingBox())?.y;
expect(introY).toBeDefined();
expect(editCardY).toBeDefined();
expect(completionY).toBeDefined();
expect(introY!).toBeLessThan(editCardY!);
expect(editCardY!).toBeLessThan(completionY!);


@cubic-dev-ai cubic-dev-ai bot Mar 2, 2026


P2: The conversation-order check is layout-dependent (boundingBox().y) and can become flaky across environments; assert DOM/message sequence instead of pixel coordinates.

Suggested change
const introY = (await introText.boundingBox())?.y;
const editCardY = (await recoveredEditCard.boundingBox())?.y;
const completionY = (await completionText.boundingBox())?.y;
expect(introY).toBeDefined();
expect(editCardY).toBeDefined();
expect(completionY).toBeDefined();
expect(introY!).toBeLessThan(editCardY!);
expect(editCardY!).toBeLessThan(completionY!);
await expect(recoveredEditCard).toBeVisible();
const [introEl, editEl, completionEl] = await Promise.all([
introText.elementHandle(),
recoveredEditCard.elementHandle(),
completionText.elementHandle(),
]);
expect(introEl).toBeTruthy();
expect(editEl).toBeTruthy();
expect(completionEl).toBeTruthy();
const isInOrder = await po.page.evaluate(
([intro, edit, completion]) =>
!!intro &&
!!edit &&
!!completion &&
!!(intro.compareDocumentPosition(edit) & Node.DOCUMENT_POSITION_FOLLOWING) &&
!!(edit.compareDocumentPosition(completion) & Node.DOCUMENT_POSITION_FOLLOWING),
[introEl, editEl, completionEl],
);
expect(isInOrder).toBe(true);

turnScopedDropAfterToolCallAttempts &&
turnScopedDropAfterToolCallAttempts.length > 0
? (() => {
const attemptKey = `${sessionId}-${passIndex}-${turnIndex}-after-tool-call`;

@cubic-dev-ai cubic-dev-ai bot Mar 2, 2026


P2: The global connectionAttempts map is never cleared between test sessions. Since sessionId is derived from user message content, reusing the same fixture prompt within the same fake-server process (e.g., via Playwright --repeat-each or multiple tests sharing a trigger) causes the attempt counter to start at 2+ on subsequent runs. This means attempts: [1] rules won't fire, making connection-drop fixtures non-deterministic and potentially masking regressions. Consider clearing the relevant keys when a new session/fixture starts, or resetting the map between test runs.


@github-actions

github-actions bot commented Mar 2, 2026

🎭 Playwright Test Results

❌ Some tests failed

| OS | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🍎 macOS | 236 | 2 | 6 | 6 |

Summary: 236 passed, 2 failed, 6 flaky, 6 skipped

Failed Tests

🍎 macOS

  • context_manage.spec.ts > manage context - smart context
    • Error: expect(string).toMatchSnapshot(expected) failed
  • template-create-nextjs.spec.ts > create next.js app
    • Error: expect(locator).toMatchAriaSnapshot(expected) failed

📋 Re-run Failing Tests (macOS)

Copy and paste to re-run all failing spec files locally:

npm run e2e \
  e2e-tests/context_manage.spec.ts \
  e2e-tests/template-create-nextjs.spec.ts

⚠️ Flaky Tests

🍎 macOS

  • engine.spec.ts > send message to engine - openai gpt-5 (passed after 1 retry)
  • hmr_path.spec.ts > HMR after navigating back to root should stay on root (passed after 1 retry)
  • partial_response.spec.ts > partial message is resumed (passed after 1 retry)
  • select_component.spec.ts > select component next.js (passed after 1 retry)
  • setup_flow.spec.ts > Setup Flow > setup banner shows correct state when node.js is installed (passed after 1 retry)
  • setup.spec.ts > setup ai provider (passed after 1 retry)

📊 View full report

@github-actions

github-actions bot commented Mar 2, 2026

🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

| Severity | File | Issue |
| --- | --- | --- |
| 🟡 MEDIUM | local_agent_handler.ts:1306 | Swapped variable names in maybeAppendRetryReplayForRetry |
| 🟡 MEDIUM | local_agent_handler.ts:1194 | getErrorMessage duplicates existing utility in src/lib/errors.ts |
| 🟡 MEDIUM | local_agent_handler.ts:623 | Inner retry while loop lacks high-level documentation |
| 🟡 MEDIUM | local_agent_handler.ts:947 | Synthetic </think> tag not captured in retry replay events |
| 🟡 MEDIUM | local_agent_handler.ts:976 | No visual feedback to user during retry delay |
| 🟡 MEDIUM | local_agent_handler.ts:983 | Raw TypeError: terminated shown to user after retries exhausted |
🟢 Low Priority Notes (7 items)
  • Retry delay doesn't respect abort signal (local_agent_handler.ts:980) — 400–800ms delay blocks even if user cancels
  • Orphaned toolStreamingEntries on mid-input termination (local_agent_handler.ts) — module-level Map leaks entries when stream dies between tool-input-start and tool-input-end
  • steps type annotation is overly loose (local_agent_handler.ts:616) — Uses Array<unknown> instead of the SDK's step type
  • dropConnectionOnAttempts defined but never used by any fixture (localAgentTypes.ts:56) — Dead infrastructure; only per-turn variants are used
  • STREAM_CONTINUE_MESSAGE lacks "why" comment (local_agent_handler.ts:84) — Specific wording matters (prevents duplicate text) but isn't documented
  • Continuation instruction relies on model compliance (local_agent_handler.ts:84) — Model may repeat text; no programmatic dedup
  • Only 2 retries with short total window (~1.2s) (local_agent_handler.ts:82) — May be insufficient for genuine intermittent connectivity issues
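The abort-signal note above can be addressed with a signal-aware delay; this is a sketch of one approach, not the handler's actual `delay()` helper:

```typescript
// Resolve after `ms`, or immediately once the signal aborts, so a retry
// backoff never outlives a user cancellation.
function abortableDelay(ms: number, signal?: AbortSignal): Promise<void> {
  return new Promise((resolve) => {
    if (signal?.aborted) return resolve();
    const timer = setTimeout(() => {
      signal?.removeEventListener("abort", onAbort);
      resolve();
    }, ms);
    const onAbort = () => {
      clearTimeout(timer);
      resolve();
    };
    signal?.addEventListener("abort", onAbort, { once: true });
  });
}
```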
🚫 Dropped False Positives (3 items)
  • Partial text from failed stream persists in user-visible response — Dropped: This is the intended design. The unit test explicitly validates that fullResponse contains both partial and recovered text. The continuation instruction handles deduplication.
  • streamErrorFromCallback may be stale from previous retry iteration — Dropped: The variable is declared fresh per loop iteration and onError fires synchronously during stream processing, so timing conflicts are not realistic.
  • onError no longer sends error to frontend for non-retryable errors — Dropped: Non-retryable errors are thrown and caught by the outer handler which does send chat:response:error. The path where onError fires but iteration succeeds is unrealistic with the AI SDK.

Generated by Dyadbot multi-agent code review

Comment on lines +1306 to +1316
const toolCallsWithResult = new Set<string>();
const toolResultsWithCall = new Set<string>();

for (const event of retryReplayEvents) {
if (event.type === "tool-call") {
toolResultsWithCall.add(event.toolCallId);
continue;
}
if (event.type === "tool-result") {
toolCallsWithResult.add(event.toolCallId);
}

🟡 MEDIUM | naming

Set variable names are swapped relative to their content

toolCallsWithResult is populated from tool-result events (line 1315), and toolResultsWithCall is populated from tool-call events (line 1311). The names are the opposite of what they store. The intersection logic for completedToolExchangeIds happens to produce the correct result, but the swapped names make this code confusing and error-prone for future maintenance.

💡 Suggestion: Rename to seenToolCallIds and seenToolResultIds to directly describe what each set holds.
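A sketch of the rename with the population logic matching the names; the event shape is simplified from the handler's replay-event type:

```typescript
type ReplayEvent =
  | { type: "tool-call"; toolCallId: string }
  | { type: "tool-result"; toolCallId: string }
  | { type: "text"; text: string };

function completedToolExchangeIds(events: ReplayEvent[]): Set<string> {
  const seenToolCallIds = new Set<string>();
  const seenToolResultIds = new Set<string>();
  for (const event of events) {
    if (event.type === "tool-call") seenToolCallIds.add(event.toolCallId);
    else if (event.type === "tool-result") seenToolResultIds.add(event.toolCallId);
  }
  // A tool exchange is complete only when both the call and its result
  // were captured before the connection dropped.
  return new Set([...seenToolCallIds].filter((id) => seenToolResultIds.has(id)));
}
```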

return error;
}

function getErrorMessage(error: unknown): string {

🟡 MEDIUM | duplication

getErrorMessage duplicates existing utility in src/lib/errors.ts

A getErrorMessage function already exists at src/lib/errors.ts. This PR adds a second, slightly different implementation. Having two getErrorMessage functions with different behavior is a maintenance hazard.

💡 Suggestion: Extend the existing getErrorMessage in src/lib/errors.ts to handle the additional cases, or rename this one to something distinct like formatStreamErrorMessage.

let terminatedRetryCount = 0;
let needsContinuationInstruction = false;

while (!abortController.signal.aborted) {

🟡 MEDIUM | missing-why-comment

Inner retry while loop lacks high-level documentation

This inner while loop is the core of this PR — it implements the entire retry-on-terminated-stream protocol. A block comment explaining the high-level contract would help future readers understand: (1) what conditions cause a retry, (2) what state is preserved vs reset between attempts, and (3) how replay events feed into the next attempt's message history.

💡 Suggestion: Add a comment like: "Retry loop: if the stream terminates with a transient error, captured text/tool events are replayed into message history, a continuation instruction is appended, and the stream is re-opened. Breaks on success, abort, or non-retryable error."
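A stripped-down sketch of that contract, with the suggested comment in place; `processStream` and `isTransient` are hypothetical stand-ins for the real stream pass and retryability check:

```typescript
const MAX_TERMINATED_STREAM_RETRIES = 2;

async function runWithRetries(
  processStream: (attempt: number) => Promise<string>,
  isTransient: (error: unknown) => boolean,
): Promise<string> {
  let terminatedRetryCount = 0;
  // Retry loop: if the stream terminates with a transient error, the real
  // handler replays captured text/tool events into message history and
  // appends a continuation instruction before re-opening the stream.
  // Exits on success; rethrows on a non-retryable error or exhausted cap.
  while (true) {
    try {
      return await processStream(terminatedRetryCount);
    } catch (error) {
      if (
        !isTransient(error) ||
        terminatedRetryCount >= MAX_TERMINATED_STREAM_RETRIES
      ) {
        throw error;
      }
      terminatedRetryCount += 1;
    }
  }
}
```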

Comment on lines +947 to +949
const closingThinkBlock = "</think>\n";
fullResponse += closingThinkBlock;
await updateResponseInDb(placeholderMessageId, fullResponse);

🟡 MEDIUM | data-integrity

Synthetic </think> tag not captured in retry replay events

When a stream terminates mid-thinking-block, this code appends a synthetic </think>\n to fullResponse and persists it to DB. However, maybeCaptureRetryReplayText is only called from the text-delta handler, so this synthetic closing tag is NOT included in the retry replay events. This means the replay text fed into message history on retry is inconsistent with what's shown to the user (user sees closed thinking block, but message history has an unclosed one).
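One way to keep the two in sync is to route the synthetic tag through the same capture path; this sketch uses a simplified `maybeCaptureRetryReplayText` and a hypothetical `closeUnterminatedThinkBlock` wrapper:

```typescript
// Simplified stand-ins for the handler's retry-replay machinery.
const retryReplayText: string[] = [];

function maybeCaptureRetryReplayText(text: string): void {
  retryReplayText.push(text);
}

// Hypothetical wrapper: append the synthetic closing tag AND record it in
// the replay buffer, so the retried message history matches what was
// persisted to the DB and shown to the user.
function closeUnterminatedThinkBlock(fullResponse: string): string {
  const closingThinkBlock = "</think>\n";
  maybeCaptureRetryReplayText(closingThinkBlock);
  return fullResponse + closingThinkBlock;
}
```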

Comment on lines +976 to +983
STREAM_RETRY_BASE_DELAY_MS * terminatedRetryCount;
logger.warn(
`Transient stream termination for chat ${req.chatId}; retrying pass (${terminatedRetryCount}/${MAX_TERMINATED_STREAM_RETRIES}) after ${retryDelayMs}ms`,
);
await delay(retryDelayMs);
continue;
}
throw streamError;

🟡 MEDIUM | UX

No visual feedback during retry; raw error after retries exhausted

Two UX concerns here:

  1. During retry: The user experiences a 400–800ms pause with no visual indication that recovery is happening. The stream stops mid-response, waits, then resumes. Consider a transient "Reconnecting…" indicator.

  2. After retries exhausted: The throw streamError surfaces as a raw TypeError: terminated to the user, which is not actionable. After silently retrying multiple times, showing a user-friendly message like "The connection to your local agent was lost and could not be recovered" would be much better.

if (turnScopedDropAttempts && turnScopedDropAttempts.length > 0) {
const attemptKey = `${sessionId}-${passIndex}-${turnIndex}`;
const currentAttempt = (connectionAttempts.get(attemptKey) || 0) + 1;
connectionAttempts.set(attemptKey, currentAttempt);

Retry drop counters leak across conversations

Medium Severity

connectionAttempts is keyed by sessionId, but sessionId is derived from the first user message content. Separate chats that start with the same fixture trigger share the same key, so attempt counts bleed across runs. In localAgentHandler.ts, later conversations may skip configured first-attempt drops unexpectedly.


// Check if we should simulate a connection drop for this attempt
const turnScopedDropAttempts =
fixture.dropConnectionByTurn?.find((rule) => rule.turnIndex === turnIndex)
?.attempts ?? fixture.dropConnectionOnAttempts;

Global drop rule applies to every turn

Medium Severity

dropConnectionOnAttempts is documented for the first turn, but localAgentHandler.ts falls back to it for any turnIndex without a per-turn rule. This makes later turns drop unexpectedly, so fixture behavior diverges from the declared contract in localAgentTypes.ts.
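A sketch of the narrower fallback, with the rule and fixture shapes simplified from `localAgentTypes.ts`:

```typescript
interface DropRule {
  turnIndex: number;
  attempts: number[];
}
interface Fixture {
  dropConnectionByTurn?: DropRule[];
  dropConnectionOnAttempts?: number[];
}

function dropAttemptsForTurn(
  fixture: Fixture,
  turnIndex: number,
): number[] | undefined {
  const perTurn = fixture.dropConnectionByTurn?.find(
    (rule) => rule.turnIndex === turnIndex,
  )?.attempts;
  if (perTurn) return perTurn;
  // Only the first turn falls back to the global rule, matching the
  // documented contract; later turns need an explicit per-turn rule.
  return turnIndex === 0 ? fixture.dropConnectionOnAttempts : undefined;
}
```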



// Track connection attempts per session+turn for connection drop simulation.
// Key: `${sessionId}-${passIndex}-${turnIndex}`, Value: attempt count
const connectionAttempts = new Map<string, number>();

Connection attempt map never gets cleared

Low Severity

connectionAttempts accumulates per-session keys but has no cleanup path. In long-running fake-server processes, each unique sessionId/pass/turn key remains forever, causing unbounded growth and stale state retention in localAgentHandler.ts.


...currentMessageHistory,
buildTerminatedRetryContinuationInstruction(),
]
: currentMessageHistory;

Retry instruction resets fixture turn state

Medium Severity

Appending buildTerminatedRetryContinuationInstruction() as the last user message can make fixture turn detection use the synthetic retry prompt as the boundary. countToolResultRounds() then ignores prior replayed tool results, so retries can restart from turn 0 and re-run earlier fixture work instead of resuming.


}
try {
return { type: "text", value: JSON.stringify(value) };
} catch {

Undefined tool results break replay message encoding

Medium Severity

toToolResultOutput() returns JSON.stringify(value) directly for non-strings. When value is undefined, JSON.stringify returns undefined, so replayed tool-result messages get a non-string output.value. That can make the retry replay payload invalid and fail the recovery attempt.
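A sketch of a guard for that case; falling back to the string "null" for `undefined` is an assumption about the desired encoding, not the project's decision:

```typescript
function toToolResultOutput(value: unknown): { type: "text"; value: string } {
  if (typeof value === "string") return { type: "text", value };
  try {
    const serialized = JSON.stringify(value);
    // JSON.stringify returns undefined for undefined, functions, and
    // symbols; substitute a valid string so output.value stays a string.
    return { type: "text", value: serialized ?? "null" };
  } catch {
    // Circular structures etc. fall back to a best-effort string.
    return { type: "text", value: String(value) };
  }
}
```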


@wwwillchen

@BugBot run

@wwwillchen

@BugBot run

@wwwillchen wwwillchen merged commit 6f53e89 into dyad-sh:main Mar 2, 2026
7 of 8 checks passed

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 47623be5f3


Comment on lines +61 to +62
(userTextContent.includes("incomplete todo(s)") ||
userTextContent.includes("previous response stream was interrupted"))

P2: Select the most recent fixture for retry continuation

Including "previous response stream was interrupted" in this fallback means retry continuation prompts now reuse the "search earlier messages" path, but that path picks the first tc=local-agent/... trigger in the chat rather than the latest one. In a chat that has multiple fixture prompts, a retry can be routed to an older fixture, and handleLocalAgentFixture then derives turn state from the wrong script, which can replay earlier turns and re-run tool side effects instead of resuming the interrupted one.
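A sketch of scanning from the end of the chat instead; the message shape and trigger regex are simplified assumptions:

```typescript
interface ChatMessage {
  role: string;
  content: string;
}

// Walk backwards so a retry continuation resolves to the most recent
// `tc=local-agent/...` trigger rather than the first one in the chat.
function findLatestFixtureTrigger(
  messages: ChatMessage[],
): string | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.role !== "user") continue;
    const match = msg.content.match(/tc=local-agent\/[\w-]+/);
    if (match) return match[0];
  }
  return undefined;
}
```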



@cursor cursor bot left a comment


✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

if (event.type === "tool-result") {
toolCallsWithResult.add(event.toolCallId);
}
}

Swapped variable names in tool exchange tracking sets

Medium Severity

toolCallsWithResult is populated from tool-result events and toolResultsWithCall is populated from tool-call events — the opposite of what the names suggest. The final completedToolExchangeIds computation is still correct because set intersection is commutative, but the naming is misleading. A future developer trusting the variable names while modifying this logic (e.g., changing the intersection to a directional filter) could easily introduce a real bug.



@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is ON, but it could not run because the branch was deleted or merged before autofix could start.

if (event.type === "tool-result") {
toolCallsWithResult.add(event.toolCallId);
}
}

Swapped set variable names obscure retry replay logic

Low Severity

In maybeAppendRetryReplayForRetry, the sets toolCallsWithResult and toolResultsWithCall have their population logic swapped relative to their names. toolCallsWithResult is populated from tool-result events, and toolResultsWithCall is populated from tool-call events — the opposite of what the names suggest. The intersection used to compute completedToolExchangeIds is commutative, so the result is still correct. However, the naming mismatch is confusing and could mislead future maintainers into introducing a real bug when modifying this code.



@greptile-apps greptile-apps bot left a comment


12 files reviewed, 1 comment


@greptile-apps

greptile-apps bot commented Mar 2, 2026

Additional Comments (1)

testing/fake-llm-server/localAgentHandler.ts (lines 64-96)
countToolResultRounds miscalculates turn index after retry with replay

countToolResultRounds finds the last user message and counts tool-result rounds that appear after it. In the retry path, handleLocalAgentStream appends a continuation instruction as the last user message after any replay messages:

[user: fixture trigger]
[assistant: "I'll create a file…" + write_file]  ← replay
[tool:  write_file result]                         ← replay
[assistant: "Partial response…"]                  ← replay (turn 1 partial)
[user: "[System] …stream was interrupted…"]        ← continuation (LAST user msg)

Because the continuation instruction is the last user message, lastUserIndex points to it and the tool-result at the replay position is not counted. The function returns rounds = 0, so turnIndex = 0.

This causes the fake server to re-serve turn 0 (which may include tool calls) on retry, instead of the correct turn 1. For the connection-drop fixture this means write_file gets executed a second time (the file write is idempotent, so it may not surface as a test failure, but the conversation history will contain a duplicate tool exchange and potentially two edit cards in the UI).

A targeted fix would be to skip synthetic user messages (continuation instructions, todo-reminders) when searching for the baseline user message:

function countToolResultRounds(messages: any[]): number {
  // Skip synthetic messages injected by the retry/outer-loop machinery
  const isSynthetic = (msg: any) =>
    isTodoReminderMessage(msg) ||
    (typeof msg?.content === "string" &&
      msg.content.includes("previous response stream was interrupted")) ||
    (Array.isArray(msg?.content) &&
      msg.content.some(
        (p: any) =>
          p.type === "text" &&
          p.text?.includes("previous response stream was interrupted"),
      ));

  let lastUserIndex = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i]?.role === "user" && !isSynthetic(messages[i])) {
      lastUserIndex = i;
      break;
    }
  }
  // ... rest unchanged
}

@github-actions

github-actions bot commented Mar 2, 2026

🔍 Dyadbot Code Review Summary

Verdict: 🤔 NOT SURE - Potential issues

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary

| Severity | File | Issue |
| --- | --- | --- |
| 🟡 MEDIUM | local_agent_handler.ts:1210 | Non-retryable stream errors produce degraded user-facing error messages |
| 🟡 MEDIUM | local_agent_handler.ts:1353 | Set variable names are swapped, making retry replay logic confusing |
| 🟡 MEDIUM | local_agent_handler.ts:982-1048 | Retry-and-continue logic duplicated across two error handling sites |
| 🟡 MEDIUM | local_agent_connection_retry.spec.ts:78 | Second test uses .first() instead of asserting count=1, may mask duplicate tool execution |
🟢 Low Priority Notes (7 items)
  • dropConnectionOnAttempts unused - localAgentTypes.ts:50 - Declared in type but no fixture uses it; dead test infrastructure
  • streamingPreview not reset between retries - local_agent_handler.ts:267 - Stale tool preview could briefly flicker on retry (self-correcting)
  • connectionAttempts map never cleaned up - localAgentHandler.ts:26 - Module-level map grows unbounded in long-running fake server
  • Unnecessary xmlChunk variable - local_agent_handler.ts:496 - Intermediate variable used once, adds no clarity
  • Magic string matching for retry detection - chatCompletionHandler.ts:59 - Hardcoded substring should reference shared constant
  • Second test lacks message ordering verification - local_agent_connection_retry.spec.ts:62 - Missing bounding box ordering + snapshot compared to first test
  • Local getErrorMessage shadows src/lib/errors.ts export - local_agent_handler.ts:1232 - Same function name with different behavior is confusing
🚫 Dropped False Positives (4 items)
  • No user-visible feedback during retry delay - Dropped: Max total delay is ~1.2s across all retries; users won't notice a sub-second pause during AI streaming
  • Retry delay doesn't respect abort signal - Dropped: Max delay is 800ms, negligible wait before abort is detected on next loop iteration
  • maybeCaptureRetryReplayEvent takes unknown - Dropped: Reasonable defensive TypeScript pattern; function is called from a generic stream loop
  • MAX_TERMINATED_STREAM_RETRIES = 2 is conservative - Dropped: Reasonable starting point; PR adds telemetry to determine right threshold from real-world data

Generated by Dyadbot multi-agent code review


@github-actions github-actions bot left a comment


Multi-agent review: 4 MEDIUM issues found

"Local agent stream error:",
getErrorMessage(normalizedError),
);
},

🟡 MEDIUM | error-handling

Non-retryable stream errors produce degraded user-facing error messages

The old onError callback immediately sent chat:response:error to the UI with a well-formatted AI error: ${error?.error?.message || JSON.stringify(error)} message. The new callback silently captures the error, and for non-retryable errors, it eventually throws to the outer catch (line 1210) which formats it as `Error: ${error}`.

If the unwrapped error is a plain object (e.g., from API rate limits, auth failures, server 500s), the user will see Error: [object Object] in the ChatErrorBox instead of a meaningful message.

💡 Suggestion: In the outer catch block (line 1210-1213), use the getErrorMessage() helper already defined in this file to format the error, and preserve the AI error: prefix:

error: `AI error: ${getErrorMessage(error)}`

`Transient stream termination for chat ${req.chatId}; retrying pass (${terminatedRetryCount}/${MAX_TERMINATED_STREAM_RETRIES}) after ${retryDelayMs}ms`,
);
await delay(retryDelayMs);
continue;

🟡 MEDIUM | duplication

Retry-and-continue logic duplicated across two error handling sites

The ~20-line retry block here (stream iteration errors) is nearly identical to lines 1030-1048 (response finalization errors). The only differences are the error variable name, the phase telemetry string, and the log message.

This duplication makes it easy to update one path and forget the other when retry behavior evolves.

💡 Suggestion: Extract a helper like handleTerminatedStreamRetry(error, phase) that encapsulates the shouldRetry check, replay append, counter increment, telemetry event, log, and delay. Return a boolean so the caller knows whether to continue or fall through.
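A sketch of the suggested extraction, reduced to the retry decision and backoff so both call sites can share it; telemetry and replay-append hooks are omitted, and the constants mirror the handler:

```typescript
const MAX_TERMINATED_STREAM_RETRIES = 2;
const STREAM_RETRY_BASE_DELAY_MS = 400;

interface RetryState {
  terminatedRetryCount: number;
}

// Returns whether the caller should `continue` its retry loop, and how
// long to back off before doing so (linear backoff per attempt).
function shouldRetryTerminatedStream(
  state: RetryState,
  isRetryable: boolean,
): { retry: boolean; delayMs: number } {
  if (
    !isRetryable ||
    state.terminatedRetryCount >= MAX_TERMINATED_STREAM_RETRIES
  ) {
    return { retry: false, delayMs: 0 };
  }
  state.terminatedRetryCount += 1;
  return {
    retry: true,
    delayMs: STREAM_RETRY_BASE_DELAY_MS * state.terminatedRetryCount,
  };
}
```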

if (event.type === "tool-result") {
toolCallsWithResult.add(event.toolCallId);
}
}

🟡 MEDIUM | naming

Set variable names are swapped, making retry replay logic confusing

toolCallsWithResult is populated from tool-result events, and toolResultsWithCall is populated from tool-call events — the names are backwards. The intersection logic works correctly because set intersection is commutative, but the misleading names could cause bugs if this code is modified by someone trusting the variable names.

💡 Suggestion: Rename to match what they actually track, e.g., idsWithToolCall and idsWithToolResult, or simply toolCallIds and toolResultIds.

await expect(
po.page
.getByRole("button", {
name: /recovered-after-tool-call\.ts .*src\/recovered-after-tool-call\.ts.*Edit/,

🟡 MEDIUM | test-correctness

Using .first() may mask duplicate tool execution visible to user

The first test (recovers from connection drop) correctly asserts toHaveCount(1) for the edit card. This test uses .first() without asserting the count first. If the retry logic accidentally creates duplicate edit cards (e.g., the tool runs twice), this test would still pass while the user would see a confusing duplicate.

💡 Suggestion: Add await expect(editCard).toHaveCount(1) before .first(), matching the pattern in the first test.


Labels

needs-human:review-issue ai agent flagged an issue that requires human review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant