fix(wecom): use channel context instead of HTTP request context for async message processing#914
Conversation
…sync message processing The HTTP request context is canceled as soon as the handler returns the response, causing PublishInbound to fail with "context canceled" when processMessage runs asynchronously in a goroutine. Use the channel's long-lived context (c.ctx) instead.
There was a problem hiding this comment.
Pull request overview
Fixes async WeCom webhook message processing by avoiding the short-lived HTTP request context (which gets canceled when the handler returns), and instead using the channel’s long-lived context for background processing.
Changes:
- Switch WeCom Bot webhook async processing to use
c.ctxinstead of the handler/request context. - Switch WeCom App webhook async processing to use
c.ctxinstead of the handler/request context.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/channels/wecom/bot.go | Uses the channel lifecycle context for goroutine-based message processing to avoid context canceled on publish. |
| pkg/channels/wecom/app.go | Same context change for the WeCom App webhook callback async processing path. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| go c.processMessage(ctx, msg) | ||
| // Process the message with the channel's long-lived context (not the HTTP | ||
| // request context, which is canceled as soon as we return the response). | ||
| go c.processMessage(c.ctx, msg) |
There was a problem hiding this comment.
c.ctx may be nil here if the channel hasn't been started yet (e.g., current tests invoke handleMessageCallback on a newly constructed channel without calling Start). A nil context will panic downstream in BaseChannel.HandleMessage / MessageBus.PublishInbound when ctx.Err() is called. Ensure c.ctx is always non-nil (initialize it in NewWeComAppChannel, or add a safe fallback here when c.ctx == nil).
| go c.processMessage(c.ctx, msg) | |
| procCtx := c.ctx | |
| if procCtx == nil { | |
| procCtx = context.Background() | |
| } | |
| go c.processMessage(procCtx, msg) |
| // Process the message with the channel's long-lived context (not the HTTP | ||
| // request context, which is canceled as soon as we return the response). | ||
| go c.processMessage(c.ctx, msg) |
There was a problem hiding this comment.
There are existing tests for handleMessageCallback, but they only assert the HTTP response and don't validate the async processing path. Please add a regression test that passes a canceled request context into handleMessageCallback and asserts an inbound message is still published (e.g., via msgBus.ConsumeInbound with a timeout), to prevent reintroducing the "context canceled" failure this change is addressing.
| // Process the message with the channel's long-lived context (not the HTTP | ||
| // request context, which is canceled as soon as we return the response). | ||
| go c.processMessage(c.ctx, msg) |
There was a problem hiding this comment.
Current webhook callback tests only check the immediate HTTP response and don't assert that the async goroutine successfully publishes the inbound message. Add a regression test that cancels the request context and verifies an inbound message is still consumed from the bus (e.g., ConsumeInbound with a timeout), so this bugfix stays covered.
| go c.processMessage(ctx, msg) | ||
| // Process the message with the channel's long-lived context (not the HTTP | ||
| // request context, which is canceled as soon as we return the response). | ||
| go c.processMessage(c.ctx, msg) |
There was a problem hiding this comment.
c.ctx can be nil here when the channel hasn't been started yet (e.g., current unit tests call handleMessageCallback without calling Start). Passing a nil context into processMessage eventually reaches BaseChannel.HandleMessage → MessageBus.PublishInbound(ctx, ...), which will panic on ctx.Err(). Initialize a non-nil default (e.g., set c.ctx = context.Background() in the constructor, or fall back to context.Background()/a detached context here when c.ctx == nil).
| go c.processMessage(c.ctx, msg) | |
| procCtx := c.ctx | |
| if procCtx == nil { | |
| procCtx = context.Background() | |
| } | |
| go c.processMessage(procCtx, msg) |
…n tests The ctx field was only set in Start(), so tests calling handleMessageCallback without Start() caused a nil pointer dereference in MessageBus.PublishInbound.
yinwm
left a comment
There was a problem hiding this comment.
Code Review
✅ Correct Problem Identification
The PR correctly identifies a real bug:
- Root cause: HTTP request context is canceled immediately after the handler returns, but
processMessageruns asynchronously in a goroutine, causingPublishInboundto fail with "context canceled" error - Fix direction is correct: Using the channel's long-lived context is the right approach
⚠️ Potential Issue: Resource Leak
There's a subtle bug in the implementation. The cancel function created in the constructor is overwritten in Start(), causing a resource leak:
// In constructor:
ctx, cancel := context.WithCancel(context.Background())
return &WeComBotChannel{
ctx: ctx,
cancel: cancel, // cancel_1
}
// In Start():
c.ctx, c.cancel = context.WithCancel(ctx) // cancel_2 overwrites cancel_1cancel_1 will never be called, causing a goroutine/resource leak (albeit small).
🔧 Suggested Fix
Release the previous cancel in Start():
func (c *WeComBotChannel) Start(ctx context.Context) error {
logger.InfoC("wecom", "Starting WeCom Bot channel...")
// Release the old cancel created in constructor
if c.cancel != nil {
c.cancel()
}
c.ctx, c.cancel = context.WithCancel(ctx)
// ...
}This applies to both WeComBotChannel and WeComAppChannel.
📝 Minor Suggestions
- Consider adding a test case to verify async processing doesn't fail due to context cancellation
- The comments are good, could be slightly more explicit about the "context canceled" error scenario
Summary
| Aspect | Assessment |
|---|---|
| Problem identification | ✅ Correct |
| Fix direction | ✅ Correct |
| Implementation details | |
| Code style | ✅ Good |
Please address the cancel overwrite issue and this PR should be good to merge. Thanks for the fix!
yinwm
left a comment
There was a problem hiding this comment.
wait you leak pr @alexhoshina
Changes from upstream: - fix(channel): config cleanup and regex precompile (sipeed#911, sipeed#916) - fix(github_copilot): improve error handling (sipeed#919) - fix(wecom): context leaks and data race fixes (sipeed#914, sipeed#918) - fix(tools): HTTP client caching and response body cleanup (sipeed#940) - feat(tui): Add configurable Launcher and Gateway process management (sipeed#909) - feat(migrate): Update migration system with openclaw support (sipeed#910) - fix(skills): Use registry-backed search for skills discovery (sipeed#929) - build: Add armv6 support to goreleaser (sipeed#905) - docs: Sync READMEs and channel documentation
…eled fix(wecom): use channel context instead of HTTP request context for async message processing
…eled fix(wecom): use channel context instead of HTTP request context for async message processing
📝 Description
The HTTP request context is canceled as soon as the handler returns the response, causing PublishInbound to fail with "context canceled" when processMessage runs asynchronously in a goroutine. Use the channel's long-lived context (c.ctx) instead.