Skip to content

fix: handle errors in streaming output callback to prevent group hangs#719

Closed
cmraible-bot wants to merge 1 commit intoqwibitai:mainfrom
cmraible-bot:fix/output-chain-error-handling-nanoclaw
Closed

fix: handle errors in streaming output callback to prevent group hangs#719
cmraible-bot wants to merge 1 commit intoqwibitai:mainfrom
cmraible-bot:fix/output-chain-error-handling-nanoclaw

Conversation

@cmraible-bot
Copy link
Copy Markdown

Summary

• The outputChain in container-runner.ts chains onOutput callbacks via .then() with no .catch() handler
• If onOutput throws (e.g. channel.sendMessage fails due to network error, invalid JID, rate limiting), the promise chain silently rejects
• The close handler uses outputChain.then(() => resolve(...)) — with a rejected chain, this .then() never fires, and runContainerAgent's promise hangs forever
• This permanently blocks all message processing for the affected group until the process is restarted
• Fix: Added .catch() to the output chain so errors are logged and the chain continues, ensuring the container-runner promise always settles

Why this matters

The synchronous try/catch around line 365 only catches JSON.parse() errors. onOutput is async (returns a Promise), so when it's scheduled via .then(), any errors it throws happen on a later microtask — outside the synchronous try/catch. Without .catch() on the promise chain, the rejection is unhandled and the close handler's outputChain.then(() => resolve(...)) never fires.

Test plan

  • Added test: onOutput error does not hang the promise — verifies that when onOutput throws, the promise still resolves instead of hanging
  • All existing tests still pass
  • Typecheck passes
  • Prettier formatting check passes

🤖 Generated with Claude Code

The outputChain in container-runner.ts had no .catch() handler, so if the
onOutput callback threw (e.g. channel.sendMessage fails), the promise
chain would reject silently. Subsequent outputChain.then() calls at the
close handler would never fire, causing runContainerAgent's promise to
hang forever — permanently blocking all message processing for that group
until restart.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@TomGranot
Copy link
Copy Markdown
Collaborator

This looks solid — preventing group hangs from streaming errors is important. @gavrielc mind giving this a quick look? Should be a straightforward review.

@Andy-NanoClaw-AI Andy-NanoClaw-AI added PR: Fix Bug fix Status: Needs Review Ready for maintainer review labels Mar 5, 2026
klapom added a commit to klapom/nanoclaw that referenced this pull request Mar 5, 2026
…res)

Bug fixes applied:
- qwibitai#636: task-scheduler recalculates next_run before enqueue
- qwibitai#655: LIMIT 200 on message queries to prevent OOM
- qwibitai#670: rateLimitResetAt field in ContainerOutput interface
- qwibitai#694: ANTHROPIC_MODEL passthrough to container env
- qwibitai#700: session rotation at 5MB JSONL threshold
- qwibitai#701: session retry on corrupted resume (clear + retry)
- qwibitai#708: update_task MCP tool in ipc-mcp-stdio
- qwibitai#719: outputChain .catch() to prevent group hang
- qwibitai#729: fix send_message description (remove incorrect scheduled-task note)
- qwibitai#735: datePrefix() injects current date/time into all agent prompts
- qwibitai#738: ANTHROPIC_MODEL from .env passed to agent container
- qwibitai#746: systemd OnFailure restart prevention logic (container hardening)
- qwibitai#751: DM-with-bot JID normalization
- qwibitai#754: setOnPipeCallback to reset idle timer on piped messages
- qwibitai#756: cursorBeforePipe rollback on container crash

Features added:
- qwibitai#723: streaming infrastructure (STREAM_TEXT markers, onStreamDelta)
- qwibitai#742: container hardening (entrypoint.sh privilege drop, env sanitize)
- qwibitai#680: add-cli skill (CLI send binary)
- qwibitai#727: add-memory skill extracted to .claude/skills/add-memory/
- qwibitai#744: add-s3-storage skill extracted to .claude/skills/add-s3-storage/

Test fixes:
- Mock fs.promises in container-runner.test.ts to prevent real I/O
- Add ANTHROPIC_MODEL to config mock
- Fix cpSync expectation: { recursive: true, force: true }
- Fix isActive() to use state.active instead of state.process
- Fix container-runtime error message: Docker → Container runtime

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PR: Fix Bug fix Status: Needs Review Ready for maintainer review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants