fix: Fixes race conditions and missing error handling that cause subagents to hang indefinitely when errors occur. #18389
When a session is cancelled (e.g., due to an error or abort), the callbacks array containing pending Promise resolve/reject handlers was not being processed. This caused callers waiting for the session to complete to hang indefinitely. This fix ensures all pending callbacks are rejected with an AbortError before the session state is cleaned up, both in the cancel() function and in the state dispose handler.
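The cancellation behaviour described above can be sketched as follows. This is a minimal illustration, not the actual code in session/prompt.ts: the `Callback` type, the `AbortError` class, and the shape of the `state` map are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real state in session/prompt.ts may differ.
type Callback = {
  resolve: (value: string) => void
  reject: (reason: unknown) => void
}

// Assumed error type; the project may use a different AbortError implementation.
class AbortError extends Error {
  constructor(message: string) {
    super(message)
    this.name = "AbortError"
  }
}

// Per-session state, keyed by session ID.
const state = new Map<string, { callbacks: Callback[] }>()

function cancel(sessionID: string): void {
  const entry = state.get(sessionID)
  if (!entry) return // already cancelled or never started

  // Reject every pending waiter so no caller is left hanging...
  const error = new AbortError(`Session ${sessionID} was cancelled`)
  for (const callback of entry.callbacks) callback.reject(error)
  entry.callbacks.length = 0

  // ...and only then drop the session state.
  state.delete(sessionID)
}
```

The same rejection loop would apply in a dispose handler, so waiters are released on either path.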
- Add try-catch in task.ts to catch and propagate subagent execution errors
- Check session state existence in loop() before adding callbacks
- Prevent race conditions where callbacks are added to deleted session state
- Ensure parent agent receives proper error notification when subagent fails

This fixes issues where subagents appear to be running but are actually crashed or cancelled, causing parent agents to wait indefinitely.
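The state-existence check in loop() might look roughly like the sketch below. The names `Waiter`, `sessions`, and `enqueue` are hypothetical and chosen for the example; only the idea of checking for the session state before queuing a callback comes from the description above.

```typescript
// Hypothetical types and names; not the actual session/prompt.ts implementation.
type Waiter = {
  resolve: (value: string) => void
  reject: (reason: unknown) => void
}

const sessions = new Map<string, { callbacks: Waiter[] }>()

// Queue a waiter only if the session state still exists.
// Returns false (and rejects the waiter) when the state has already been deleted.
function enqueue(sessionID: string, waiter: Waiter): boolean {
  const entry = sessions.get(sessionID)
  if (!entry) {
    // Without this check, the waiter would be attached to nothing and never settle.
    waiter.reject(new Error(`Session ${sessionID} was cancelled or no longer exists`))
    return false
  }
  entry.callbacks.push(waiter)
  return true
}

// A caller-facing wrapper: the returned promise settles either when the
// session finishes or immediately with an error if the state is gone.
function loop(sessionID: string): Promise<string> {
  return new Promise((resolve, reject) => {
    enqueue(sessionID, { resolve, reject })
  })
}
```

Rejecting immediately, rather than returning silently, is what lets the parent agent learn that the subagent is no longer running.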
The following comment was made by an LLM and may be inaccurate:
Based on the search results, I found the following potentially related PRs:
Most relevant: PR #13422 appears to be the most directly related, as it specifically addresses propagating subagent errors to the parent session, which is the core issue being fixed in PR #18389.
Issue for this PR
Closes #18378
Type of change
What does this PR do?
Fixes race conditions and missing error handling that cause subagents to hang indefinitely when errors occur.
Changes Made
1. prompt.ts - session/prompt.ts
2. task.ts - tool/task.ts
let result: MessageV2.WithParts
try {
  result = await SessionPrompt.prompt({ /* ... */ })
} catch (error) {
  // Surface the subagent failure to the parent instead of leaving it waiting
  throw new Error(
    `Subagent execution failed: ${error instanceof Error ? error.message : String(error)}`,
  )
}
Problem Solved
Before: The parent agent shows the subagent as "running" forever, even though the subagent has crashed or been cancelled.
After: The parent agent receives a proper error notification and can handle the failure gracefully.
How did you verify your code works?
Screenshots / recordings
Not applicable (backend error handling fix)
Checklist