fix: retry transient local agent server errors (#3044)
Conversation
@BugBot run
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the resilience of the local agent's streaming capabilities by introducing a robust retry mechanism for transient server errors originating from external providers. By intelligently retrying failed stream passes and leveraging existing continuation paths, the system can gracefully recover from temporary service interruptions, leading to a more stable and reliable user experience.
@BugBot run
Code Review
The pull request introduces a robust retry mechanism for transient server errors encountered during the local agent stream processing. This significantly improves the reliability and resilience of the AI agent by allowing it to recover from temporary network or provider-side issues. A new test case has been added to validate this functionality, which is a good practice. The refactoring of the retry logic and the introduction of specific retryable error patterns are well-implemented.
```typescript
const RETRYABLE_STREAM_ERROR_STATUS_CODES = new Set([
  408, 429, 500, 502, 503, 504,
]);
const RETRYABLE_STREAM_ERROR_PATTERNS = [
  "server_error",
  "internal server error",
  "service unavailable",
  "bad gateway",
  "gateway timeout",
  "too many requests",
  "rate_limit",
  "overloaded",
  "timeout",
  "econnrefused",
  "enotfound",
  "econnreset",
  "epipe",
  "etimedout",
];
```
Consider grouping these new constants (`RETRYABLE_STREAM_ERROR_STATUS_CODES`, `RETRYABLE_STREAM_ERROR_PATTERNS`) with the other related constants (e.g., `MAX_TERMINATED_STREAM_RETRIES`, `STREAM_RETRY_BASE_DELAY_MS`, `STREAM_CONTINUE_MESSAGE`) for better organization and readability. This would make it easier to find all configuration-like values in one place.
```typescript
const MAX_TERMINATED_STREAM_RETRIES = 3;
const STREAM_RETRY_BASE_DELAY_MS = 400;
const STREAM_CONTINUE_MESSAGE =
  "[System] Your previous response stream was interrupted by a transient network error. Continue from exactly where you left off and do not repeat text that has already been sent.";
const RETRYABLE_STREAM_ERROR_STATUS_CODES = new Set([
  408, 429, 500, 502, 503, 504,
]);
const RETRYABLE_STREAM_ERROR_PATTERNS = [
  "server_error",
  "internal server error",
  "service unavailable",
  "bad gateway",
  "gateway timeout",
  "too many requests",
  "rate_limit",
  "overloaded",
  "timeout",
  "econnrefused",
  "enotfound",
  "econnreset",
  "epipe",
  "etimedout",
];
```
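For readers following along, the pattern list above is presumably consumed by a substring check over the stringified error. A hedged sketch of such a check — the PR's real `isRetryableProviderStreamError` is not shown in this thread and may differ:

```typescript
// Hypothetical helper built from the constants above; a local name is used
// to keep this sketch self-contained.
const PATTERNS = [
  "server_error",
  "internal server error",
  "service unavailable",
  "bad gateway",
  "gateway timeout",
  "too many requests",
  "rate_limit",
  "overloaded",
  "timeout",
  "econnrefused",
  "enotfound",
  "econnreset",
  "epipe",
  "etimedout",
];

function matchesRetryablePattern(error: unknown): boolean {
  // Lower-case so codes like "ECONNRESET" in Node error messages still match.
  const text = String(error).toLowerCase();
  return PATTERNS.some((pattern) => text.includes(pattern));
}
```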
Greptile Summary

This PR extends the local agent stream retry logic to cover structured provider-side errors (e.g. Azure `server_error`).
Confidence Score: 4/5
Important Files Changed
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Stream iteration / response.await] -->|error thrown| B{aborted?}
    B -- yes --> C[Break / log]
    B -- no --> D{shouldRetryTransientStreamError?}
    D -->|retryCount >= MAX 3| E[throw error → surface to user]
    D -->|retryCount < MAX| F{isTerminatedStreamError?}
    F -- yes --> H[Schedule retry with continuation instruction]
    F -- no --> G{isRetryableProviderStreamError?}
    G -->|statusCode in set OR ≥500| H
    G -->|error string matches pattern| H
    G -- no match --> E
    H --> I[maybeAppendRetryReplayForRetry]
    I --> J[terminatedRetryCount++]
    J --> K[needsContinuationInstruction = true]
    K --> L[delay STREAM_RETRY_BASE_DELAY_MS × retryCount]
    L --> M[continue → restart stream pass]
```
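The delay step in the flowchart amounts to a simple linear backoff. A minimal sketch — the constant values come from the diff, but the helper names are assumptions:

```typescript
const STREAM_RETRY_BASE_DELAY_MS = 400;
const MAX_TERMINATED_STREAM_RETRIES = 3;

// Linear backoff: retry pass 1 waits 400ms, pass 2 waits 800ms, pass 3 waits 1200ms.
function retryDelayMs(retryCount: number): number {
  return STREAM_RETRY_BASE_DELAY_MS * retryCount;
}

// Worst-case latency added before the error finally surfaces to the user.
function totalWorstCaseDelayMs(): number {
  let total = 0;
  for (let pass = 1; pass <= MAX_TERMINATED_STREAM_RETRIES; pass++) {
    total += retryDelayMs(pass);
  }
  return total;
}
```

Note the 2400ms worst case matches the "~2.4s" figure cited elsewhere in this review thread.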
Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
Line: 1033-1039
Comment:
**Telemetry event name no longer reflects its scope**
The event name `"local_agent:terminated_stream_retry"` is now fired for both transport-level `"terminated"` errors and structured provider-side errors (e.g. Azure `server_error`). Any telemetry dashboards filtering on this event name will silently lump in provider retries under the old "terminated" bucket, making it hard to distinguish the two failure modes.
The same stale name is used again at line 1082 (the `response` await catch path).
Consider a name that covers both cases, and optionally add an `errorKind` field to differentiate:
```suggestion
sendTelemetryEvent("local_agent:transient_stream_retry", {
chatId: req.chatId,
dyadRequestId,
retryCount: terminatedRetryCount,
error: String(streamError),
phase: "stream_iteration",
});
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
Line: 1357-1363
Comment:
**`unwrapStreamError` discards the outer envelope's HTTP status code**
`unwrapStreamError` returns `error.error` when an `"error"` key is present, which strips any `statusCode` / `status` sitting on the outer wrapper. For a provider error shaped like `{ status: 408, error: { message: "Request Timeout" } }`, the `408` is lost and the code falls through to string-pattern matching — but `"request timeout"` is not in `RETRYABLE_STREAM_ERROR_PATTERNS`, so a legitimate 408 wrapped error would silently not be retried.
Consider extracting the status code from both levels before unwrapping:
```typescript
const statusCode =
(isRecord(error) && typeof (error as any).statusCode === "number" && (error as any).statusCode) ||
(isRecord(error) && typeof (error as any).status === "number" && (error as any).status) ||
// ... then fall through to the unwrapped object
```
Or alternatively, add `"request timeout"` to `RETRYABLE_STREAM_ERROR_PATTERNS` so the string-path still catches it.
How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "Remove 'timeout' err..."
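The reviewer's partial snippet can be fleshed out into a self-contained helper. This is a hedged sketch of the proposal only — `isRecord` and the envelope shape are assumptions drawn from the comment, not the PR's actual code:

```typescript
// Read statusCode/status from the outer envelope before unwrapping the nested
// "error" object, so a shape like { status: 408, error: { message: "Request
// Timeout" } } keeps its 408 instead of falling through to pattern matching.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

function extractStatusCode(error: unknown): number | undefined {
  if (!isRecord(error)) return undefined;
  if (typeof error.statusCode === "number") return error.statusCode;
  if (typeof error.status === "number") return error.status;
  // Then fall through to the wrapped inner error object, if any.
  if ("error" in error) return extractStatusCode(error.error);
  return undefined;
}
```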
```typescript
if (
  typeof statusCode === "number" &&
  (statusCode >= 500 || RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode))
) {
  return true;
}
```
**`>= 500` makes the explicit Set redundant and retries non-retryable codes**

`statusCode >= 500` catches every 5xx code including non-transient ones like `501 Not Implemented` and `505 HTTP Version Not Supported`, which a provider would never recover from on a retry. Because the Set already enumerates the exact 5xx codes worth retrying (`500, 502, 503, 504`) alongside the 4xx ones (`408, 429`), the `>= 500` branch is both overly broad and redundant.
Consider replacing the condition with just the Set lookup:
```suggestion
if (
  typeof statusCode === "number" &&
  RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode)
) {
  return true;
}
```
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
Line: 1366-1371
Comment:
**`>= 500` makes the explicit Set redundant and retries non-retryable codes**
`statusCode >= 500` catches every 5xx code including non-transient ones like `501 Not Implemented` and `505 HTTP Version Not Supported`, which a provider would never recover from on a retry. Because the Set already enumerates the exact 5xx codes worth retrying (`500, 502, 503, 504`) alongside the 4xx ones (`408, 429`), the `>= 500` branch is both overly broad and redundant.
Consider replacing the condition with just the Set lookup:
```suggestion
if (
typeof statusCode === "number" &&
RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode)
) {
return true;
}
```
How can I resolve this? If you propose a fix, please make it concise.
🔍 Dyadbot Code Review Summary

Verdict: 🤔 NOT SURE - Potential issues

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary
🟢 Low Priority Notes (4 items)
🚫 Dropped False Positives (3 items)
Generated by Dyadbot multi-agent code review
```diff
  } catch (err) {
    if (
-     shouldRetryTerminatedStreamError({
+     shouldRetryTransientStreamError({
```
Exhausted-retries telemetry missing for new provider errors
Low Severity
In the response_finalization phase, the telemetry guard at line 1096 still only checks isTerminatedStreamError(err). Since shouldRetryTransientStreamError now also retries isRetryableProviderStreamError errors, when those new provider errors exhaust retries, the terminated_stream_retries_exhausted telemetry event silently won't fire. The stream_iteration phase (line 1047) correctly fires telemetry unconditionally on exhaustion, making this an inconsistency between the two paths.
Additional Locations (1)
```typescript
  "too many requests",
  "rate_limit",
  "overloaded",
  "timeout",
```
🟡 MEDIUM | correctness
Bare timeout pattern is too broad
The substring 'timeout' will match any error whose message/code/type contains the word — including non-transient client-imposed timeouts (e.g., AbortSignal.timeout(), user-configured request timeouts, or messages like "connection timeout set too low"). This would cause up to 3 silent retries of non-retryable errors before surfacing the real failure.
The more specific 'etimedout' and 'gateway timeout' patterns already cover the network-level and HTTP 504 cases. Status code 408 is also handled by the Set.
💡 Suggestion: Remove the bare 'timeout' entry and rely on the existing specific patterns (etimedout, gateway timeout, status 408/504).
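The over-matching this comment describes is easy to reproduce with a plain substring check. The messages below are illustrative, not real provider output:

```typescript
// A bare "timeout" substring flags client-side aborts alongside genuinely
// transient failures; the specific patterns do not.
const bareMatch = (msg: string): boolean => msg.toLowerCase().includes("timeout");

const specificMatch = (msg: string): boolean =>
  ["etimedout", "gateway timeout"].some((p) => msg.toLowerCase().includes(p));

const transient = "HTTP 504 Gateway Timeout"; // worth retrying
const clientAbort = "The operation was aborted due to timeout"; // e.g. a client-imposed AbortSignal timeout
```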
```typescript
import { DEFAULT_MAX_TOOL_CALL_STEPS } from "@/constants/settings_constants";

const RETRYABLE_STREAM_ERROR_STATUS_CODES = new Set([
  408, 429, 500, 502, 503, 504,
]);
```
🟡 MEDIUM | user-experience
429 rate-limit errors retried too aggressively
429 (Too Many Requests) is included in RETRYABLE_STREAM_ERROR_STATUS_CODES and will be retried after only ~400ms-1200ms (linear backoff). Providers that emit 429 typically expect longer backoff (seconds to minutes) and may include a Retry-After header. Retrying in <2s will likely hit the rate limit again immediately, burning all 3 retry attempts and delaying the error the user sees by ~2.4s with no benefit.
💡 Suggestion: Either exclude 429 from automatic retry (and let the existing rate-limit error UI surface immediately), or apply a significantly longer minimum delay for 429s and respect Retry-After headers if present.
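One way to implement that suggestion — the 5-second floor, parameter names, and header plumbing are assumptions, not code from the PR:

```typescript
// Give 429s a longer minimum delay and honor Retry-After when provided,
// instead of the flat linear backoff used for other retryable statuses.
const STREAM_RETRY_BASE_DELAY_MS = 400;
const MIN_RATE_LIMIT_DELAY_MS = 5_000; // assumed floor, not from the PR

function retryDelayFor(
  statusCode: number,
  retryCount: number,
  retryAfterSeconds?: number,
): number {
  if (statusCode === 429) {
    const headerDelayMs = (retryAfterSeconds ?? 0) * 1000;
    return Math.max(headerDelayMs, MIN_RATE_LIMIT_DELAY_MS);
  }
  return STREAM_RETRY_BASE_DELAY_MS * retryCount;
}
```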
```diff
@@ -1338,7 +1394,7 @@ function shouldRetryTerminatedStreamError(params: {
   return (
```
🟡 MEDIUM | observability
Exhausted-retries telemetry misses new provider errors
The terminated_stream_retries_exhausted telemetry event in the response-finalization path (~line 1096) is still gated on isTerminatedStreamError(err). After this PR, retries can also be exhausted by provider-side errors (e.g., Azure server_error, 429s) matched by isRetryableProviderStreamError — but those cases silently skip telemetry.
Note: the stream-iteration exhaustion path (~line 1047) fires unconditionally, so only the response-finalization path has the gap.
💡 Suggestion: Change the condition at ~line 1096 to isTerminatedStreamError(err) || isRetryableProviderStreamError(err).
🟡 Missing "retries exhausted" telemetry for retryable provider errors in response finalization phase
The PR expands shouldRetryTransientStreamError (line 1066) to cover both terminated errors and retryable provider errors, but the fallback telemetry guard at local_agent_handler.ts:1096 still only checks isTerminatedStreamError(err). When a retryable provider error (e.g., 500 server_error) exhausts its MAX_TERMINATED_STREAM_RETRIES retries during the response finalization phase, no "local_agent:terminated_stream_retries_exhausted" telemetry event is emitted — unlike the stream iteration error site (local_agent_handler.ts:1047) which unconditionally sends the telemetry. This creates an observability blind spot for the newly added error types.
(Refers to lines 1096-1107)
1 issue found across 2 files
Confidence score: 3/5
- There is a concrete reliability risk in `src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts`: using `statusCode >= 500` broadens retries to all 5xx responses, including non-transient cases that are unlikely to succeed on retry.
- Because this issue is medium severity (6/10) with high confidence (9/10), it introduces meaningful behavior risk rather than a minor housekeeping concern.
- The retry logic may ignore the intended curated `RETRYABLE_STREAM_ERRORS` behavior, which could cause unnecessary retry loops and delayed failure handling for users.
- Pay close attention to `src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts` - ensure retry conditions exclude non-retryable 5xx statuses and align with the curated retry policy.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts">
<violation number="1" location="src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts:1368">
P2: `statusCode >= 500` catches every 5xx code including non-transient ones like `501 Not Implemented` and `505 HTTP Version Not Supported`, which will never recover on retry. This makes the curated `RETRYABLE_STREAM_ERROR_STATUS_CODES` Set redundant for 5xx codes and causes unnecessary retries for permanent failures. Replace the compound condition with just the Set lookup.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
```typescript
if (
  typeof statusCode === "number" &&
  (statusCode >= 500 || RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode))
```
P2: statusCode >= 500 catches every 5xx code including non-transient ones like 501 Not Implemented and 505 HTTP Version Not Supported, which will never recover on retry. This makes the curated RETRYABLE_STREAM_ERROR_STATUS_CODES Set redundant for 5xx codes and causes unnecessary retries for permanent failures. Replace the compound condition with just the Set lookup.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts, line 1368:
<comment>`statusCode >= 500` catches every 5xx code including non-transient ones like `501 Not Implemented` and `505 HTTP Version Not Supported`, which will never recover on retry. This makes the curated `RETRYABLE_STREAM_ERROR_STATUS_CODES` Set redundant for 5xx codes and causes unnecessary retries for permanent failures. Replace the compound condition with just the Set lookup.</comment>
<file context>
@@ -1329,7 +1349,43 @@ function isTerminatedStreamError(error: unknown): boolean {
+
+ if (
+ typeof statusCode === "number" &&
+ (statusCode >= 500 || RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode))
+ ) {
+ return true;
</file context>
```suggestion
  RETRYABLE_STREAM_ERROR_STATUS_CODES.has(statusCode)
```
Removed 'timeout' from the list of error messages.
@BugBot run
🔍 Dyadbot Code Review Summary

Verdict: ✅ YES - Ready to merge

Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard.

Issues Summary
🟢 Low Priority Notes (3 items)
🚫 Dropped False Positives (7 items)
Generated by Dyadbot multi-agent code review
```typescript
const MAX_TERMINATED_STREAM_RETRIES = 3;
const STREAM_RETRY_BASE_DELAY_MS = 400;
const STREAM_CONTINUE_MESSAGE =
  "[System] Your previous response stream was interrupted by a transient network error. Continue from exactly where you left off and do not repeat text that has already been sent.";
```
🟡 MEDIUM | user-experience / correctness
Continuation instruction semantically wrong for provider errors
STREAM_CONTINUE_MESSAGE says "interrupted by a transient network error. Continue from exactly where you left off" — but when a provider emits a structured server error (e.g. Azure server_error) before any text was streamed, there is nothing to continue from. The model receives a misleading instruction to "continue" non-existent partial output.
For the existing terminated-stream path this was appropriate (the TCP connection dropped mid-response), but provider errors can fire before any output is generated.
💡 Suggestion: Only set needsContinuationInstruction = true when fullResponse is non-empty, or use a distinct retry message for provider errors that fired before any output (e.g. a simple "Please retry the request" instead of "continue from where you left off").
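A minimal sketch of that suggestion — pick the message based on whether any output was actually streamed. The continue message is from the diff; the alternate retry message text is an assumption:

```typescript
const STREAM_CONTINUE_MESSAGE =
  "[System] Your previous response stream was interrupted by a transient network error. Continue from exactly where you left off and do not repeat text that has already been sent.";
// Hypothetical alternate message for retries that fired before any output.
const STREAM_RETRY_MESSAGE =
  "[System] The previous request failed with a transient provider error. Please retry the request.";

// Only ask the model to "continue" when there is partial output to continue from.
function retryInstruction(fullResponse: string): string {
  return fullResponse.length > 0 ? STREAM_CONTINUE_MESSAGE : STREAM_RETRY_MESSAGE;
}
```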
🎭 Playwright Test Results

❌ Some tests failed

Summary: 787 passed, 9 failed, 11 flaky, 252 skipped

Failed Tests

🍎 macOS

🪟 Windows

📋 Re-run Failing Tests (macOS)

Copy and paste to re-run all failing spec files locally:

```shell
npm run e2e \
  e2e-tests/queued_message.spec.ts
```


Summary

`server_error`

Test plan

🤖 Generated with Claude Code