fix(core): drop late tool calls after SIGINT cancellation (#28091)#28096
fix(core): drop late tool calls after SIGINT cancellation (#28091)#28096KittiphonKamnuan wants to merge 2 commits into
Conversation
Closes google-gemini#28091. A SIGINT delivered after the stream consumer has already started a turn could still result in a delayed provider tool-call chunk being executed locally: the side-effect ran, and the result was submitted back to the model after the user had cancelled. The root cause is that the abort signal was only checked at the top of each `for await` iteration on the model stream. A chunk that arrived after cancellation but in the same iteration as the abort would still materialize tool calls, push them onto the scheduler, and run. Defense-in-depth fix at three layers: 1. `Turn.run` (`packages/core/src/core/turn.ts`): re-check `signal.aborted` between processing text/thought parts and yielding `ToolCallRequest` events. If the user cancelled mid-chunk, yield `UserCancelled` and stop instead of forwarding the tool call. 2. Non-interactive CLI driver (`packages/cli/src/nonInteractiveCli.ts`): re-check `signal.aborted` after the stream loop exits and before calling `scheduler.schedule(...)`, so a late tool call that survived the stream layer is not handed to the scheduler. 3. Scheduler (`packages/core/src/scheduler/scheduler.ts`): in `_startBatch`, short-circuit via `cancelAllQueued('Operation cancelled')` if the caller's signal is already aborted, before validating + enqueuing. This mirrors the existing pattern in `_processNextItem` and is the final backstop against late tool calls reaching the executor. Each layer has a regression test for the late-after-cancel ordering.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a race condition where tool calls could be executed locally even after a user has issued a SIGINT cancellation. By implementing a defense-in-depth strategy, the changes ensure that the abort signal is respected at the stream processing, CLI driver, and scheduler layers, preventing late-arriving tool call chunks from triggering side effects after the user has terminated the operation. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
📊 PR Size: size/M
|
There was a problem hiding this comment.
Code Review
This pull request introduces guards to check for aborted signals before scheduling or executing tool calls, preventing late-arriving side effects from running after user cancellation. It also adds corresponding regression tests. The review feedback highlights a critical issue in the scheduler's early return where bypassing the request queue processing can cause subsequent queued tool call batches to hang indefinitely, and suggests invoking _processNextInRequestQueue() before returning.
| if (signal.aborted) { | ||
| this.state.cancelAllQueued('Operation cancelled'); | ||
| this.isProcessing = false; | ||
| return this.state.completedBatch; | ||
| } |
There was a problem hiding this comment.
Returning early when signal.aborted is true bypasses the finally block of _startBatch. This prevents this._processNextInRequestQueue() from being called, which will cause any subsequent or concurrent tool call batches queued in this.requestQueue to hang indefinitely and never resolve or reject. To prevent this promise/queue leak, we must ensure this._processNextInRequestQueue() is called before returning early.
| if (signal.aborted) { | |
| this.state.cancelAllQueued('Operation cancelled'); | |
| this.isProcessing = false; | |
| return this.state.completedBatch; | |
| } | |
| if (signal.aborted) { | |
| this.state.cancelAllQueued('Operation cancelled'); | |
| this.isProcessing = false; | |
| this._processNextInRequestQueue(); | |
| return this.state.completedBatch; | |
| } |
References
- Asynchronous operations that can be cancelled by the user should accept and propagate an AbortSignal to ensure cancellability and prevent dangling processes or network requests.
Follow-up to gemini-code-assist review on google-gemini#28096. The previous fix short-circuited `_startBatch` with an early `return` *outside* the `try { ... } finally { ... }`. That bypassed the `finally` block, so `_processNextInRequestQueue()` never ran when the signal was already aborted — any concurrent batches queued in `requestQueue` would hang indefinitely instead of being drained / rejected. Move the abort guard inside the `try` so the existing `finally` (`isProcessing = false`, `clearBatch()`, `_processNextInRequestQueue()`) runs on the early-return path too. Add a regression test that schedules a second batch after the aborted one to confirm the scheduler still makes progress.
|
Thanks for catching the queue leak, @gemini-code-assist. Fixed in e0efeb8 by moving the abort guard inside the This keeps the existing cleanup pattern intact:
Slightly different from the suggested patch (which duplicated the cleanup into the early-return branch) — putting the guard inside Added a regression test that schedules a second batch after the aborted one to confirm the scheduler still drains the queue and the follow-up call resolves. All 68 scheduler + turn tests pass, lint and tsc clean. |
|
That is a solid approach. Moving the abort guard inside the |
Summary
Closes #28091.
A SIGINT delivered after the stream consumer has already started a turn can still result in a delayed provider tool-call chunk being executed locally — the side effect runs and the tool result is submitted back to the model after the user has cancelled. The standalone reproducer in the issue confirms this on
0.47.0.Root cause
The abort signal is only checked at the top of each `for await` iteration on the model stream (`Turn.run` in `packages/core/src/core/turn.ts`). A chunk that arrives after cancellation but in the same iteration as the abort still:
Fix — defense in depth at three layers
`Turn.run` (`packages/core/src/core/turn.ts`)
Re-check `signal.aborted` between processing text/thought parts and yielding `ToolCallRequest` events. If the user cancelled mid-chunk, yield `UserCancelled` and stop instead of forwarding the tool call.
Non-interactive CLI driver (`packages/cli/src/nonInteractiveCli.ts`)
Re-check `signal.aborted` after the stream loop exits and before calling `scheduler.schedule(...)`, so a late tool call that survived the stream layer is not handed to the scheduler.
Scheduler (`packages/core/src/scheduler/scheduler.ts`)
In `_startBatch`, short-circuit via `cancelAllQueued('Operation cancelled')` if the caller's signal is already aborted, before validating + enqueuing. Mirrors the existing pattern in `_processNextItem`; final backstop against late tool calls reaching the executor.
Each layer ships a regression test for the late-after-cancel ordering.
Test plan
Files changed