fix: improve terminated stream retry telemetry and bump retries to 3#2877
Conversation
- Include dyadRequestId in terminated_stream_retry telemetry events - Add terminated_stream_retries_exhausted telemetry event when all retries are used up (both stream_iteration and response_finalization) - Bump MAX_TERMINATED_STREAM_RETRIES from 2 to 3 Made-with: Cursor
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the robustness and observability of stream handling by increasing the maximum number of retry attempts for terminated streams and significantly improving the associated telemetry. The changes provide better insight into retry behavior and exhaustion, which will aid in debugging and understanding stream stability. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with π and π on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
| sendTelemetryEvent( | ||
| "local_agent:terminated_stream_retries_exhausted", | ||
| { | ||
| chatId: req.chatId, | ||
| dyadRequestId, | ||
| retryCount: terminatedRetryCount, | ||
| error: String(streamError), | ||
| phase: "stream_iteration", | ||
| }, | ||
| ); |
There was a problem hiding this comment.
π‘ Missing isTerminatedStreamError guard causes false retries_exhausted telemetry for non-terminated errors
In the stream_iteration phase, the terminated_stream_retries_exhausted telemetry event is sent unconditionally for any stream error that doesn't qualify for retry. However, shouldRetryTerminatedStreamError returns false for two distinct reasons: (1) the error IS a terminated stream error but retries are exhausted, or (2) the error is NOT a terminated stream error at all (e.g., auth error, rate limit, server error).
Root Cause and Comparison with response_finalization
At local_agent_handler.ts:1020-1029, when shouldRetryTerminatedStreamError returns false, the code unconditionally sends terminated_stream_retries_exhausted before throwing. This means any non-terminated stream error (e.g. a 401 auth error or 429 rate-limit error) will produce a misleading terminated_stream_retries_exhausted telemetry event.
The response_finalization phase at local_agent_handler.ts:1069 correctly guards the same telemetry event with if (isTerminatedStreamError(err)), but this guard is missing in the stream_iteration phase.
Impact: False/misleading telemetry events will pollute the terminated_stream_retries_exhausted metric, making it unreliable for diagnosing actual terminated stream exhaustion.
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| if (isTerminatedStreamError(streamError)) { | |
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| } |
Was this helpful? React with π or π to provide feedback.
| error: String(streamError), | ||
| phase: "stream_iteration", | ||
| }, | ||
| ); |
There was a problem hiding this comment.
Missing guard emits false "retries exhausted" telemetry events
Medium Severity
In the stream_iteration phase, the terminated_stream_retries_exhausted telemetry event fires unconditionally when shouldRetryTerminatedStreamError returns false. Since that function also returns false for errors that aren't terminated stream errors at all, non-terminated stream errors will incorrectly emit this event. The response_finalization phase correctly guards this with if (isTerminatedStreamError(err)), but the stream_iteration phase is missing the equivalent guard.
Additional Locations (1)
Greptile SummaryThis PR improves telemetry around terminated stream retries in Key changes:
Confidence Score: 4/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Stream Error Detected] --> B{isTerminatedStreamError?}
B -- No, stream_iteration --> C[Emit exhausted event unconditionally - INCONSISTENCY]
B -- No, response_finalization --> D[Skip exhausted event - correct]
B -- Yes --> E{retryCount less than MAX=3?}
E -- Yes --> F[Emit terminated_stream_retry event with dyadRequestId\nincrement retryCount, delay and continue]
E -- No --> G[Emit terminated_stream_retries_exhausted event]
G --> H{Phase?}
H -- stream_iteration --> I[throw streamError]
H -- response_finalization --> J[warn log, clear steps and messages]
Last reviewed commit: db3b99c |
| sendTelemetryEvent( | ||
| "local_agent:terminated_stream_retries_exhausted", | ||
| { | ||
| chatId: req.chatId, | ||
| dyadRequestId, | ||
| retryCount: terminatedRetryCount, | ||
| error: String(streamError), | ||
| phase: "stream_iteration", | ||
| }, | ||
| ); | ||
| throw streamError; |
There was a problem hiding this comment.
terminated_stream_retries_exhausted fired for non-terminated errors
In the stream_iteration phase, the terminated_stream_retries_exhausted event is emitted unconditionally whenever shouldRetryTerminatedStreamError returns false. However, shouldRetryTerminatedStreamError returns false in two distinct cases:
- Retries are exhausted (correct β the event is appropriate)
- The error is not a terminated stream error (incorrect β the event name is misleading and will produce noisy/wrong telemetry)
The response_finalization phase (added just below this) correctly guards with if (isTerminatedStreamError(err)) before emitting the same event. The stream_iteration path is missing that guard.
Since both streamErrorFromCallback (set in onError) and streamErrorFromIteration (caught from the fullStream async iteration) can be arbitrary errors β not just terminated-stream errors β a non-terminated error will fall through shouldRetryTerminatedStreamError and incorrectly emit this event.
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| throw streamError; | |
| if (isTerminatedStreamError(streamError)) { | |
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts
Line: 1020-1030
Comment:
**`terminated_stream_retries_exhausted` fired for non-terminated errors**
In the `stream_iteration` phase, the `terminated_stream_retries_exhausted` event is emitted unconditionally whenever `shouldRetryTerminatedStreamError` returns `false`. However, `shouldRetryTerminatedStreamError` returns `false` in **two** distinct cases:
1. Retries are exhausted (correct β the event is appropriate)
2. The error is **not** a terminated stream error (incorrect β the event name is misleading and will produce noisy/wrong telemetry)
The `response_finalization` phase (added just below this) correctly guards with `if (isTerminatedStreamError(err))` before emitting the same event. The `stream_iteration` path is missing that guard.
Since both `streamErrorFromCallback` (set in `onError`) and `streamErrorFromIteration` (caught from the `fullStream` async iteration) can be arbitrary errors β not just terminated-stream errors β a non-terminated error will fall through `shouldRetryTerminatedStreamError` and incorrectly emit this event.
```suggestion
if (isTerminatedStreamError(streamError)) {
sendTelemetryEvent(
"local_agent:terminated_stream_retries_exhausted",
{
chatId: req.chatId,
dyadRequestId,
retryCount: terminatedRetryCount,
error: String(streamError),
phase: "stream_iteration",
},
);
}
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Code Review
This pull request improves the telemetry for terminated stream retries by adding the dyadRequestId and introducing a new terminated_stream_retries_exhausted event. It also increases the maximum number of retries to 3. The changes are well-aligned with the description. I've found one area for improvement regarding error handling consistency when retries are exhausted.
| if (isTerminatedStreamError(err)) { | ||
| sendTelemetryEvent( | ||
| "local_agent:terminated_stream_retries_exhausted", | ||
| { | ||
| chatId: req.chatId, | ||
| dyadRequestId, | ||
| retryCount: terminatedRetryCount, | ||
| error: String(err), | ||
| phase: "response_finalization", | ||
| }, | ||
| ); | ||
| } |
There was a problem hiding this comment.
There's an inconsistency in error handling for exhausted retries between the stream_iteration and response_finalization phases.
In the stream_iteration phase (lines 1020-1030), when retries are exhausted, the streamError is correctly re-thrown, causing the agent turn to fail. However, in this response_finalization phase, the error is not re-thrown after sending the terminated_stream_retries_exhausted event. This causes the error to be silently swallowed, and the agent proceeds with an empty response as if the turn was successful.
To ensure consistent and robust error handling, the error should be re-thrown here to propagate the failure, just as it's done in the other phase.
| if (isTerminatedStreamError(err)) { | |
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(err), | |
| phase: "response_finalization", | |
| }, | |
| ); | |
| } | |
| if (isTerminatedStreamError(err)) { | |
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(err), | |
| phase: "response_finalization", | |
| }, | |
| ); | |
| throw err; | |
| } |
π Dyadbot Code Review SummaryVerdict: π€ NOT SURE - Potential issues Reviewed by 3 independent agents: Correctness Expert, Code Health Expert, UX Wizard. Issues Summary
π’ Low Priority Notes (1 item)
π« Dropped False Positives (3 items)
Generated by Dyadbot multi-agent code review |
| sendTelemetryEvent( | ||
| "local_agent:terminated_stream_retries_exhausted", | ||
| { | ||
| chatId: req.chatId, | ||
| dyadRequestId, | ||
| retryCount: terminatedRetryCount, | ||
| error: String(streamError), | ||
| phase: "stream_iteration", | ||
| }, | ||
| ); | ||
| throw streamError; |
There was a problem hiding this comment.
π‘ MEDIUM | correctness
Missing isTerminatedStreamError guard on exhausted event (stream_iteration path)
The terminated_stream_retries_exhausted event fires unconditionally here whenever shouldRetryTerminatedStreamError returns false. But that function returns false for three reasons: (1) error is not a terminated stream error, (2) retries are exhausted, or (3) the stream was aborted. Only case (2) should emit this event.
The response_finalization path (line ~1069) correctly wraps its exhausted event with if (isTerminatedStreamError(err)). This path is missing the equivalent guard, which will produce misleading telemetry data.
π‘ Suggestion: Add an isTerminatedStreamError(streamError) guard to match the response_finalization pattern:
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| throw streamError; | |
| if (isTerminatedStreamError(streamError)) { | |
| sendTelemetryEvent( | |
| "local_agent:terminated_stream_retries_exhausted", | |
| { | |
| chatId: req.chatId, | |
| dyadRequestId, | |
| retryCount: terminatedRetryCount, | |
| error: String(streamError), | |
| phase: "stream_iteration", | |
| }, | |
| ); | |
| } |
There was a problem hiding this comment.
2 issues found across 1 file
Confidence score: 3/5
- There is a concrete regression risk in
src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts: unlikestream_iteration, theresponse_finalizationpath appears to swallow errors after retries are exhausted instead of re-throwing, which can hide failures from callers and make debugging harder. - Telemetry accuracy may be impacted in
src/pro/main/ipc/handlers/local_agent/local_agent_handler.tsbecauseterminated_stream_retries_exhaustedcan be emitted for non-terminated errors; adding anisTerminatedStreamErrorguard should prevent misleading signals. - Given the higher-severity, high-confidence error-handling inconsistency (7/10, 8/10), this carries moderate merge risk until failure propagation behavior is made consistent across phases.
- Pay close attention to
src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts- error propagation and telemetry event gating need to align with actual terminated-stream conditions.
Prompt for AI agents (unresolved issues)
Check if these issues are valid β if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts">
<violation number="1" location="src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts:1020">
P1: Inconsistent error handling: in the `stream_iteration` phase, exhausted retries correctly re-throw the error (`throw streamError`), but here in the `response_finalization` phase the error is silently swallowed after sending telemetry. The code falls through to set `steps = []` and `responseMessages = []`, allowing the agent to proceed with an empty response as if the turn succeeded. Add `throw err;` after the telemetry event to propagate the failure consistently.</violation>
<violation number="2" location="src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts:1020">
P2: `terminated_stream_retries_exhausted` is emitted for non-terminated errors; guard it with `isTerminatedStreamError` to avoid incorrect telemetry.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| await delay(retryDelayMs); | ||
| continue; | ||
| } | ||
| sendTelemetryEvent( |
There was a problem hiding this comment.
P1: Inconsistent error handling: in the stream_iteration phase, exhausted retries correctly re-throw the error (throw streamError), but here in the response_finalization phase the error is silently swallowed after sending telemetry. The code falls through to set steps = [] and responseMessages = [], allowing the agent to proceed with an empty response as if the turn succeeded. Add throw err; after the telemetry event to propagate the failure consistently.
Prompt for AI agents
Check if this issue is valid β if so, understand the root cause and fix it. At src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts, line 1020:
<comment>Inconsistent error handling: in the `stream_iteration` phase, exhausted retries correctly re-throw the error (`throw streamError`), but here in the `response_finalization` phase the error is silently swallowed after sending telemetry. The code falls through to set `steps = []` and `responseMessages = []`, allowing the agent to proceed with an empty response as if the turn succeeded. Add `throw err;` after the telemetry event to propagate the failure consistently.</comment>
<file context>
@@ -1016,6 +1017,16 @@ export async function handleLocalAgentStream(
await delay(retryDelayMs);
continue;
}
+ sendTelemetryEvent(
+ "local_agent:terminated_stream_retries_exhausted",
+ {
</file context>
| await delay(retryDelayMs); | ||
| continue; | ||
| } | ||
| sendTelemetryEvent( |
There was a problem hiding this comment.
P2: terminated_stream_retries_exhausted is emitted for non-terminated errors; guard it with isTerminatedStreamError to avoid incorrect telemetry.
Prompt for AI agents
Check if this issue is valid β if so, understand the root cause and fix it. At src/pro/main/ipc/handlers/local_agent/local_agent_handler.ts, line 1020:
<comment>`terminated_stream_retries_exhausted` is emitted for non-terminated errors; guard it with `isTerminatedStreamError` to avoid incorrect telemetry.</comment>
<file context>
@@ -1016,6 +1017,16 @@ export async function handleLocalAgentStream(
await delay(retryDelayMs);
continue;
}
+ sendTelemetryEvent(
+ "local_agent:terminated_stream_retries_exhausted",
+ {
</file context>
π Playwright Test Resultsβ Some tests failed
Summary: 728 passed, 1 failed, 10 flaky, 240 skipped Failed Testsπͺ Windows
|


Summary
dyadRequestIdinterminated_stream_retrytelemetry events for better traceabilityterminated_stream_retries_exhaustedtelemetry event when all retries are used up (both stream_iteration and response_finalization phases)MAX_TERMINATED_STREAM_RETRIESfrom 2 to 3Test plan
local_agent_handlerunit tests pass (stream retry tests verified)dyadRequestIdin retry scenarios#skip-bugbot
Made with Cursor