Background tasks dispatched via copaw agents chat --background are spontaneously cancelled when the target agent undergoes a workspace reload. The reload's graceful shutdown logic has a blind spot: it only checks CoPaw's per-workspace TaskTracker for active tasks, but background tasks submitted through /api/agent/process/task are managed by agentscope_runtime's AgentApp and are invisible to that tracker. The old workspace is therefore stopped immediately, killing all in-flight background tasks.
All affected sessions end with _is_interrupted=True and "The tool call has been interrupted by the user" — but no user issued a stop command.
Steps to Reproduce
Any config change endpoint that calls schedule_agent_reload() triggers this, including PUT /agent/running-config, PUT /agent/system-prompt-files, PUT /agents/{agentId}, and PUT /config/channels.
Actual vs Expected
Actual: All background tasks are cancelled within seconds of a reload. Sessions show _is_interrupted=True. Task status API shows "pending" or "cancelled".
Expected: Background tasks should survive agent reloads (or at least be given a grace period to complete). The graceful shutdown should be aware of all running tasks, not just those tracked by CoPaw's TaskTracker.
Logs / Screenshots
All interrupted sessions end with the same pattern:
{
  "metadata": {"_is_interrupted": true},
  "content": "I noticed that you have interrupted me. What can I do for you?"
}
Preceding tool result:
<system-info>The tool call has been interrupted by the user.</system-info>
Root Cause Analysis
Affected code path
The task lifecycle is owned by agentscope_runtime, not by CoPaw's TaskTracker (src/copaw/app/runner/task_tracker.py).
Step-by-step
1. Tasks are dispatched and running. Each goes through agentscope_runtime's /api/agent/process/task endpoint, which creates an asyncio.Task wrapping DynamicMultiAgentRunner.stream_query(). At task start, the runner resolves to workspace A's AgentRunner and holds a reference to it.
2. A config change triggers schedule_agent_reload() (src/copaw/app/utils.py:15), which fires MultiAgentManager.reload_agent() in the background.
3. reload_agent() (src/copaw/app/multi_agent_manager.py:208-319) creates a new Workspace with a fresh TaskTracker, starts it, atomically swaps it in (self.agents[agent_id] = new_instance, line 312), then calls _graceful_stop_old_instance().
4. _graceful_stop_old_instance() — THE BUG (src/copaw/app/multi_agent_manager.py:91-186):
has_active = await old_instance.task_tracker.has_active_tasks()  # line 105
if has_active:
    # Wait up to 60s for tasks, then stop
    ...
else:
    # No active tasks: stop immediately  ← THIS PATH IS TAKEN
    await old_instance.stop(final=False)  # line 176
has_active returns False because TaskTracker only tracks tasks registered via attach_or_start() (console channel, messaging channels). Background tasks from /api/agent/process/task are managed by agentscope_runtime's AgentApp and are never registered in CoPaw's TaskTracker.
5. Old workspace is stopped immediately. stop(final=False) (workspace.py:363) calls ServiceManager.stop_all(final=False), which stops the runner, MCP clients, and channels. The in-flight tasks receive asyncio.CancelledError.
6. CancelledError propagates to agent interrupt:
# runner.py:541-545
except asyncio.CancelledError as exc:
    if agent is not None:
        await agent.interrupt()  # cancels agent's reply task
    raise AgentException("Task has been cancelled!") from exc
agent.interrupt() (react_agent.py:1031-1046) cancels the agent's _reply_task, producing the _is_interrupted=True metadata.
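The steps above can be modelled in a few lines of plain asyncio. This is a minimal sketch, not CoPaw's code: ToyTracker, ToyWorkspace, background_job, and simulate_reload are hypothetical stand-ins that assume the tracker only sees tasks explicitly registered with it and that stopping a workspace cancels everything running on it.

```python
import asyncio


class ToyTracker:
    """Hypothetical stand-in for a per-workspace task tracker."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def register(self, task: asyncio.Task) -> None:
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    def has_active_tasks(self) -> bool:
        return any(not t.done() for t in self._tasks)


class ToyWorkspace:
    """Hypothetical workspace: stop() cancels all work running on it."""

    def __init__(self) -> None:
        self.tracker = ToyTracker()
        self.running: list[asyncio.Task] = []  # all work, tracked or not

    async def stop(self) -> None:
        for task in self.running:
            task.cancel()
        await asyncio.gather(*self.running, return_exceptions=True)


async def background_job(log: list[str]) -> None:
    try:
        await asyncio.sleep(10)  # stands in for a long-running tool call
        log.append("finished")
    except asyncio.CancelledError:
        log.append("interrupted")
        raise


async def simulate_reload() -> tuple[bool, list[str]]:
    log: list[str] = []
    old = ToyWorkspace()
    # The job runs on the old workspace but is never registered with its tracker.
    old.running.append(asyncio.create_task(background_job(log)))
    await asyncio.sleep(0)  # let the job start
    has_active = old.tracker.has_active_tasks()
    if not has_active:
        await old.stop()  # "no active tasks" branch: immediate stop
    return has_active, log


if __name__ == "__main__":
    print(asyncio.run(simulate_reload()))
```

In this toy model the tracker reports no active tasks even though a job is mid-flight, so the stop path cancels it, which mirrors the interrupted sessions described in this report.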
Why the evidence matches
| Observation | Explanation |
| --- | --- |
| All 10 tasks interrupted | All used the same old workspace runner |
| Different durations before death (12s–134s) | Tasks dispatched at different times, killed by the same reload event |
| All interrupted during sleep commands | sleep yields to the event loop where CancelledError is delivered |
| _is_interrupted=True in all sessions | Standard agentscope interrupt response to CancelledError |
| Task status API shows "pending/cancelled" | agentscope_runtime's tracker is separate; status doesn't update correctly after workspace stop |
| No user-initiated /stop | Stop was triggered by workspace reload, not user |
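The observation that every task died while sleeping is consistent with how asyncio delivers cancellation: a cancelled task only sees CancelledError at its next suspension point. The sketch below illustrates that with hypothetical names (worker, cancel_mid_sleep); it uses asyncio.sleep as a stand-in for whatever await the real tool call was suspended on.

```python
import asyncio


async def worker(events: list[str]) -> None:
    events.append("started")
    try:
        await asyncio.sleep(60)  # suspension point: cancellation lands here
        events.append("completed")
    except asyncio.CancelledError:
        events.append("cancelled during sleep")
        raise  # re-raise so the task is marked as cancelled


async def cancel_mid_sleep() -> list[str]:
    events: list[str] = []
    task = asyncio.create_task(worker(events))
    await asyncio.sleep(0.01)  # give the worker time to reach its sleep
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        events.append("caller saw CancelledError")
    return events


if __name__ == "__main__":
    print(asyncio.run(cancel_mid_sleep()))
```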
Suggested Fix
Option A (Preferred): Register AgentApp tasks with CoPaw's TaskTracker
In DynamicMultiAgentRunner.stream_query() (_app.py:104), register each background task with the resolved workspace's TaskTracker so _graceful_stop_old_instance waits for them.
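A rough sketch of what Option A could look like, under stated assumptions: register_external and wait_all are hypothetical methods (the real TaskTracker may need different API additions), and stream_query here is a plain coroutine standing in for the real method. The idea is only that the running task registers itself with the resolved workspace's tracker before doing any work.

```python
import asyncio


class ToyTracker:
    """Hypothetical tracker with an external-registration hook."""

    def __init__(self) -> None:
        self._tasks: set[asyncio.Task] = set()

    def register_external(self, task: asyncio.Task) -> None:
        # Hypothetical API: track a task this tracker did not create.
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    def has_active_tasks(self) -> bool:
        return bool(self._tasks)

    async def wait_all(self, timeout: float) -> bool:
        """Wait for tracked tasks; return True if all finished in time."""
        if not self._tasks:
            return True
        _, pending = await asyncio.wait(set(self._tasks), timeout=timeout)
        return not pending


async def stream_query(tracker: ToyTracker, log: list[str]) -> None:
    # Register the current task before doing any work.
    task = asyncio.current_task()
    assert task is not None
    tracker.register_external(task)
    await asyncio.sleep(0.05)  # stands in for the actual agent run
    log.append("finished")


async def graceful_stop(tracker: ToyTracker, log: list[str], timeout: float) -> None:
    if tracker.has_active_tasks():
        await tracker.wait_all(timeout)  # now sees the background task
    log.append("old workspace stopped")


async def demo_option_a() -> list[str]:
    log: list[str] = []
    tracker = ToyTracker()
    job = asyncio.create_task(stream_query(tracker, log))
    await asyncio.sleep(0)  # let the job register itself
    await graceful_stop(tracker, log, timeout=1.0)
    await job
    return log


if __name__ == "__main__":
    print(asyncio.run(demo_option_a()))
```

Registering via asyncio.current_task() keeps the change local to the runner; the done callback removes the task from the tracker whether it finishes, fails, or is cancelled.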
Option B: Delay old workspace stop unconditionally
Always schedule a delayed cleanup with a configurable grace period (e.g. 60–300s) before stopping the old workspace after a reload, instead of relying solely on has_active_tasks().
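Option B might be sketched as follows. ToyWorkspace, stop_after_grace, and the grace_seconds parameter are hypothetical; the sketch assumes only that the old workspace exposes an awaitable stop() and that the reload can return before cleanup finishes.

```python
import asyncio


class ToyWorkspace:
    """Hypothetical workspace that records when it is stopped."""

    def __init__(self) -> None:
        self.stopped = False

    async def stop(self) -> None:
        self.stopped = True


async def stop_after_grace(workspace: ToyWorkspace, grace_seconds: float) -> None:
    # Unconditional delay: give untracked in-flight work time to finish.
    await asyncio.sleep(grace_seconds)
    await workspace.stop()


async def demo_option_b() -> tuple[bool, bool]:
    old = ToyWorkspace()
    # Keep a reference to the cleanup task so it is not garbage-collected.
    cleanup = asyncio.create_task(stop_after_grace(old, grace_seconds=0.05))
    await asyncio.sleep(0)  # the reload returns right away
    stopped_immediately = old.stopped
    await cleanup
    return stopped_immediately, old.stopped


if __name__ == "__main__":
    print(asyncio.run(demo_option_b()))
```

The trade-off is that tasks outliving the grace period are still cancelled, and the old workspace's resources stay allocated for the full delay even when nothing is running.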
Key files to modify
| File | Purpose |
| --- | --- |
| src/copaw/app/multi_agent_manager.py:91-186 | _graceful_stop_old_instance: add awareness of AgentApp tasks |
| src/copaw/app/_app.py:104-126 | DynamicMultiAgentRunner.stream_query: register tasks with TaskTracker |
| src/copaw/app/runner/task_tracker.py | May need API additions for external task registration |
CoPaw Version
1.0.2
Related PR(s): N/A
Security considerations: N/A
Additional Notes
The anyio>=4.0.0,<4.13.0 pin (pyproject.toml; see #2632, "[Bug]: Main process CPU pegged at 100% after 20-30 min due to anyio 4.13.0 _deliver_cancellation busy-wait loop") addresses a separate anyio cancellation busy-loop issue that may exacerbate symptoms.
agentscope-runtime==1.1.3 is the external package managing /agent/process/task endpoints and background task lifecycle.