
feat(agent): mid-turn message injection for responsive follow-ups #2985

Merged
chengyongru merged 2 commits into nightly from feat/mid-turn-injection
Apr 10, 2026

Conversation

@chengyongru (Collaborator) commented Apr 9, 2026

Summary

  • Allow user messages sent during an active agent turn to be injected into the running LLM context instead of waiting behind the per-session lock
  • Between iterations, queued messages are drained from the pending queue and injected as user messages in the current turn

Motivation

Currently, nanobot uses a per-session asyncio.Lock that serializes message processing. When a task takes a long time (e.g. web_search, long exec), new messages from the user must wait until the entire turn completes. This makes the agent feel unresponsive. See #1609 for the full discussion.
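To make the problem concrete, here is a minimal sketch of the pre-PR behavior under a per-session asyncio.Lock. The names (session_locks, handle_message) are illustrative, not nanobot's actual API; the point is that a second message cannot even start until the first turn releases the lock:

```python
import asyncio

# Hypothetical sketch: one asyncio.Lock per session serializes the whole
# turn, so a follow-up message waits behind the first message's latency.
session_locks: dict[str, asyncio.Lock] = {}
order: list[str] = []

async def handle_message(session_id: str, text: str, work_seconds: float) -> None:
    lock = session_locks.setdefault(session_id, asyncio.Lock())
    async with lock:                       # entire turn runs under the lock
        order.append(f"start:{text}")
        await asyncio.sleep(work_seconds)  # stands in for LLM/tool latency
        order.append(f"end:{text}")

async def main() -> None:
    # The second message arrives mid-turn but cannot start until the
    # first turn finishes, which is what makes the agent feel unresponsive.
    await asyncio.gather(
        handle_message("s1", "hello", 0.05),
        handle_message("s1", "what time is it?", 0.0),
    )

asyncio.run(main())
```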

Previous attempt (#1233, closed) tried to interrupt tool execution mid-way, which introduced significant complexity (~500 lines) with cancellation edge cases. This PR takes a simpler approach: don't cancel tools, just inject messages between iterations.

How It Works

User sends "hello" → agent starts processing (holds session lock)
User sends "what time is it?" → routed to pending queue (not a new task)
  ↓
Agent iteration: LLM call → tool execution → [DRAIN] → next LLM call
                                              ↑ injected as user message
  ↓
LLM sees both messages naturally and responds to both

Two drain checkpoints in the agent loop:

  1. After tool execution (before next LLM call) — tools run to completion, then new messages are appended
  2. After final response ("last-mile") — if the user sent a follow-up while the LLM was generating its final answer, continue the loop instead of breaking
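The loop with its two checkpoints can be sketched as follows. This is a simplified stand-in for nanobot's runner, not its real API: the queue holds raw strings, the llm callable and message dicts are illustrative, and drain() plays the role of _drain_injections():

```python
import asyncio

# Sketch of an agent loop with the two drain checkpoints described above.
async def run_turn(history, pending: asyncio.Queue, llm, max_iterations=10):
    def drain():
        drained = []
        while not pending.empty():
            drained.append({"role": "user", "content": pending.get_nowait()})
        return drained

    for _ in range(max_iterations):
        reply = llm(history)
        history.append({"role": "assistant", "content": reply["content"]})
        if reply.get("tool_calls"):
            history.append({"role": "tool", "content": "tool result"})
            history.extend(drain())  # Checkpoint 1: after tool execution
            continue
        injected = drain()           # Checkpoint 2: "last-mile" check
        if injected:
            history.extend(injected)
            continue                 # keep looping instead of breaking
        break
    return history
```

Because drained messages are plain user messages appended to the history, the next LLM call sees them with no special prompting.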

Key Design Decisions

| Decision | Rationale |
| --- | --- |
| Inject as natural user messages | No special system prompt or [Follow-up] markers needed; the LLM handles multi-turn conversation naturally |
| No tool cancellation | Avoids the complexity that caused #1233 to be closed (partial writes, inconsistent state) |
| _MAX_INJECTIONS_PER_TURN = 3 | Prevents context-window pressure from rapid message accumulation |
| _MAX_INJECTION_CYCLES = 5 | Prevents injection loops from consuming the iteration budget |
| had_injections bypasses _sent_in_turn | When follow-ups are injected, the final response is new content the user has not seen, so it is always delivered |
| Pending queue lifecycle via try/finally | Prevents memory leaks; the queue is registered before lock acquisition and cleaned up after |
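The per-drain cap can be sketched as a small helper. The constant name mirrors the PR, but the implementation below is illustrative, not the actual _drain_injections() code; it drains at most _MAX_INJECTIONS_PER_TURN messages per checkpoint and logs a warning when more are left queued:

```python
import asyncio
import logging

logger = logging.getLogger("agent")
_MAX_INJECTIONS_PER_TURN = 3

def drain_injections(pending: "asyncio.Queue[str]") -> list[dict]:
    """Drain up to the per-checkpoint cap; leave the rest for later drains."""
    drained: list[dict] = []
    while not pending.empty() and len(drained) < _MAX_INJECTIONS_PER_TURN:
        drained.append({"role": "user", "content": pending.get_nowait()})
    if not pending.empty():
        # Capping here limits context-window pressure; the leftover
        # messages are picked up at the next checkpoint.
        logger.warning("injection batch capped at %d; %d still queued",
                       _MAX_INJECTIONS_PER_TURN, pending.qsize())
    return drained
```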

Edge Cases Handled

  • Last-mile: Messages arriving after the LLM's final response but before turn end are caught by Checkpoint 2
  • _sent_in_turn conflict: Follow-up responses bypass the MessageTool suppression check
  • Queue cleanup: finally block in _dispatch() ensures no dangling queues
  • Bounded accumulation: Both a per-drain limit (3 messages) and a per-turn limit (5 injection cycles)
  • Graceful degradation: If injection_callback throws, the error is logged and injection is skipped

Changes

  • nanobot/agent/runner.py: injection_callback on AgentRunSpec, _drain_injections() helper, two drain checkpoints in run(), had_injections on AgentRunResult
  • nanobot/agent/loop.py: _pending_queues dict, message routing in run(), queue lifecycle in _dispatch(), _drain_pending callback, _sent_in_turn bypass
  • Test files: updated _run_agent_loop return value unpacking (3-tuple → 5-tuple)

Related

Allow user messages sent during an active agent turn to be injected
into the running LLM context instead of being queued behind a
per-session lock. Inspired by Claude Code's mid-turn queue drain
mechanism (query.ts:1547-1643).

Key design decisions:
- Messages are injected as natural user messages between iterations,
  no tool cancellation or special system prompt needed
- Two drain checkpoints: after tool execution and after final LLM
  response ("last-mile" to prevent dropping late arrivals)
- Bounded by MAX_INJECTION_CYCLES (5) to prevent consuming the
  iteration budget on rapid follow-ups
- had_injections flag bypasses _sent_in_turn suppression so follow-up
  responses are always delivered

Closes #1609
@chengyongru chengyongru marked this pull request as ready for review April 10, 2026 15:26
…ue, and message safety

- Fix streaming protocol violation: Checkpoint 2 now checks for injections
  BEFORE calling on_stream_end, passing resuming=True when injections found
  so streaming channels (Feishu) don't prematurely finalize the card
- Bound pending queue to maxsize=20 with QueueFull handling
- Add warning log when injection batch exceeds _MAX_INJECTIONS_PER_TURN
- Re-publish leftover queue messages to bus in _dispatch finally block to
  prevent silent message loss on early exit (max_iterations, tool_error, cancel)
- Fix PEP 8 blank line before dataclass and logger.info indentation
- Add 12 new tests covering drain, checkpoints, cycle cap, queue routing,
  cleanup, and leftover re-publish
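The bounded-queue routing from this hardening commit can be sketched as follows. The function and the "fallback" string are illustrative assumptions; only the maxsize=20 bound and the QueueFull handling come from the commit message:

```python
import asyncio

# Sketch of bounded message routing: a full pending queue raises
# asyncio.QueueFull on put_nowait, and the message falls back to normal
# handling instead of being silently dropped.
PENDING_MAXSIZE = 20

def route_message(pending: asyncio.Queue, text: str) -> str:
    try:
        pending.put_nowait(text)
        return "queued"      # will be drained mid-turn
    except asyncio.QueueFull:
        return "fallback"    # e.g. handled as a normal message instead
```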
@chengyongru chengyongru merged commit bc4cc49 into nightly Apr 10, 2026
3 checks passed
jhkim43 added a commit to jhkim43/nanobot that referenced this pull request Apr 11, 2026
chengyongru added a commit that referenced this pull request Apr 11, 2026
Re-bin pushed a commit that referenced this pull request Apr 11, 2026
xzq-xu added a commit to xzq-xu/nanobot that referenced this pull request Apr 13, 2026
Upstream's pending_queue injection (PR HKUDS#2985) fully replaces the
SteeringHook mechanism. Gateway now passes pending_queue directly
to _process_message, so the per-call extra_hooks parameter and
the steering.py / messages.py files are no longer needed.

- Delete nanobot/agent/steering.py (InterruptionChecker + SteeringHook)
- Delete nanobot/agent/messages.py (AgentMessage dual-layer model)
- Remove extra_hooks parameter from _process_message and _run_agent_loop
- Restore original hook merging in _run_agent_loop

Made-with: Cursor
@chengyongru chengyongru deleted the feat/mid-turn-injection branch April 14, 2026 03:43