feat(agent): mid-turn message injection for responsive follow-ups #2985

Merged — chengyongru merged 2 commits into nightly on Apr 10, 2026
Conversation
Allow user messages sent during an active agent turn to be injected
into the running LLM context instead of being queued behind a
per-session lock. Inspired by Claude Code's mid-turn queue drain
mechanism (query.ts:1547-1643).
Key design decisions:
- Messages are injected as natural user messages between iterations,
no tool cancellation or special system prompt needed
- Two drain checkpoints: after tool execution and after final LLM
response ("last-mile" to prevent dropping late arrivals)
- Bounded by MAX_INJECTION_CYCLES (5) to prevent consuming the
iteration budget on rapid follow-ups
- had_injections flag bypasses _sent_in_turn suppression so follow-up
responses are always delivered
Closes #1609
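The drain step described above can be sketched as a non-blocking flush of a per-session queue. This is a minimal illustration, not the PR's actual code: `drain_injections` and its queue are assumed names.

```python
import asyncio

def drain_injections(queue: asyncio.Queue) -> list:
    """Non-blocking flush: collect every message currently queued.

    Called between agent-loop iterations so mid-turn user messages
    join the context as ordinary user messages.
    """
    drained = []
    while True:
        try:
            drained.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            break
    return drained
```

Because the flush never awaits, it can run at both checkpoints without stalling the turn when no follow-ups arrived.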
fix(agent): harden mid-turn injection with streaming fix, bounded queue, and message safety

- Fix streaming protocol violation: Checkpoint 2 now checks for injections BEFORE calling on_stream_end, passing resuming=True when injections are found so streaming channels (Feishu) don't prematurely finalize the card
- Bound the pending queue to maxsize=20 with QueueFull handling
- Add a warning log when an injection batch exceeds _MAX_INJECTIONS_PER_TURN
- Re-publish leftover queue messages to the bus in the _dispatch finally block to prevent silent message loss on early exit (max_iterations, tool_error, cancel)
- Fix PEP 8 blank line before dataclass and logger.info indentation
- Add 12 new tests covering drain, checkpoints, cycle cap, queue routing, cleanup, and leftover re-publish
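The bounded-queue behavior from this hardening commit can be approximated as follows. `make_pending_queue` and `enqueue_pending` are illustrative helper names, not nanobot's API; only the maxsize=20 bound and the warn-on-full behavior come from the commit message.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

_PENDING_MAXSIZE = 20  # bound described in the commit message

def make_pending_queue() -> asyncio.Queue:
    return asyncio.Queue(maxsize=_PENDING_MAXSIZE)

def enqueue_pending(queue: asyncio.Queue, message: str) -> bool:
    """Queue a mid-turn message; return False (with a warning) when full."""
    try:
        queue.put_nowait(message)
        return True
    except asyncio.QueueFull:
        logger.warning("pending queue full (maxsize=%d); message not queued",
                       queue.maxsize)
        return False
```

The bound keeps a flood of rapid follow-ups from growing the queue without limit; the caller can then decide whether a rejected message is dropped or re-routed.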
jhkim43 added a commit to jhkim43/nanobot that referenced this pull request (Apr 11, 2026).
chengyongru added a commit that referenced this pull request (Apr 11, 2026).
Re-bin pushed a commit that referenced this pull request (Apr 11, 2026).
xzq-xu added a commit to xzq-xu/nanobot that referenced this pull request (Apr 13, 2026):

Upstream's pending_queue injection (PR HKUDS#2985) fully replaces the SteeringHook mechanism. Gateway now passes pending_queue directly to _process_message, so the per-call extra_hooks parameter and the steering.py / messages.py files are no longer needed.

- Delete nanobot/agent/steering.py (InterruptionChecker + SteeringHook)
- Delete nanobot/agent/messages.py (AgentMessage dual-layer model)
- Remove extra_hooks parameter from _process_message and _run_agent_loop
- Restore original hook merging in _run_agent_loop

Made-with: Cursor
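The shape of that simplification might look like the following. All names here (`Session`, `gateway_dispatch`, `process_message`) are assumptions for illustration, not nanobot's actual API; the point is only that the queue travels with the call instead of being wired through extra hooks.

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class Session:
    """Hypothetical per-session state: the gateway owns one pending
    queue per session and hands it straight to the processing loop,
    so no extra_hooks plumbing is required."""
    pending_queue: asyncio.Queue = field(
        default_factory=lambda: asyncio.Queue(maxsize=20)
    )

async def gateway_dispatch(session: Session, text: str, process_message) -> None:
    # The queue is passed directly; no per-call hook parameter.
    await process_message(text, pending_queue=session.pending_queue)
```
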
Summary
Motivation
Currently, nanobot uses a per-session asyncio.Lock that serializes message processing. When a task takes a long time (e.g. web_search, a long exec), new messages from the user must wait until the entire turn completes. This makes the agent feel unresponsive. See #1609 for the full discussion.

A previous attempt (#1233, closed) tried to interrupt tool execution mid-way, which introduced significant complexity (~500 lines) with cancellation edge cases. This PR takes a simpler approach: don't cancel tools, just inject messages between iterations.
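The serialization problem is easy to reproduce with a toy example (illustrative only): a follow-up sent while a long task holds the per-session lock cannot even start until the first turn ends.

```python
import asyncio

async def handle_message(lock: asyncio.Lock, name: str, work: float, log: list) -> None:
    # Every message for the session funnels through the same lock.
    async with lock:
        log.append(f"{name}:start")
        await asyncio.sleep(work)  # stand-in for web_search / a long exec
        log.append(f"{name}:end")

async def demo() -> list:
    lock = asyncio.Lock()
    log: list = []
    await asyncio.gather(
        handle_message(lock, "long_task", 0.05, log),
        handle_message(lock, "follow_up", 0.0, log),
    )
    return log
```

The log shows the follow-up starting only after the long task fully finishes, which is exactly the unresponsiveness this PR targets.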
How It Works
Two drain checkpoints in the agent loop:

- Checkpoint 1, after tool execution: messages that arrived while tools were running are injected before the next LLM call
- Checkpoint 2, after the final LLM response: a "last-mile" drain so late arrivals are not dropped
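A stripped-down sketch of the loop shape, assuming hypothetical function names (the real implementation lives in nanobot/agent/runner.py):

```python
_MAX_INJECTION_CYCLES = 5  # cap from the PR description

def run_turn(llm_step, execute_tools, drain, max_cycles=_MAX_INJECTION_CYCLES):
    """llm_step(history) -> (reply, tool_calls); drain() -> injected messages."""
    history = []
    cycles = 0
    while True:
        reply, tool_calls = llm_step(history)
        if tool_calls:
            history += execute_tools(tool_calls)
            injected = drain()  # checkpoint 1: after tool execution
        else:
            injected = drain()  # checkpoint 2: "last mile" after final reply
            if not injected or cycles >= max_cycles:
                return reply
        if injected and cycles < max_cycles:
            # Fold follow-ups in as ordinary user messages.
            history += [{"role": "user", "content": m} for m in injected]
            cycles += 1
```

Because injections are plain user messages appended to history, no tool cancellation or special prompt is required; the cycle cap stops rapid follow-ups from consuming the whole iteration budget.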
Key Design Decisions
- Injected as natural user messages — no [Follow-up] markers needed; the LLM handles multi-turn naturally
- _MAX_INJECTIONS_PER_TURN = 3 bounds each injection batch
- _MAX_INJECTION_CYCLES = 5 bounds repeat injection rounds per turn
- had_injections bypasses _sent_in_turn so follow-up responses are always delivered
- try/finally guards the queue lifecycle

Edge Cases Handled
- _sent_in_turn conflict: follow-up responses bypass the MessageTool suppression check
- The finally block in _dispatch() ensures no dangling queues
- If injection_callback throws, the error is logged and the injection is skipped

Changes
- nanobot/agent/runner.py: injection_callback on AgentRunSpec, _drain_injections() helper, two drain checkpoints in run(), had_injections on AgentRunResult
- nanobot/agent/loop.py: _pending_queues dict, message routing in run(), queue lifecycle in _dispatch(), _drain_pending callback, _sent_in_turn bypass, _run_agent_loop return value unpacking (3-tuple → 5-tuple)

Related