feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract#473
feat(hooks/plugin): add lifecycle hooks + phase-1 plugin contract#473gh-xj wants to merge 19 commits intosipeed:mainfrom
Conversation
There was a problem hiding this comment.
Pull request overview
Adds a lightweight, typed lifecycle hook registry to PicoClaw and integrates hook trigger points throughout the agent loop to enable observability, filtering, and guardrails with minimal overhead.
Changes:
- Introduces
pkg/hookswith typed event structs, aHookRegistry, and trigger/registration APIs (void + modifying hooks). - Integrates hook triggers into
pkg/agent/loop.goaround inbound messages, session boundaries, LLM calls, tool calls, and outbound publishing. - Adds unit tests for hook ordering, cancellation, concurrency, and concurrent registration/trigger behavior, plus a design doc.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/hooks/types.go | Defines the 8 hook event payload types used across the system. |
| pkg/hooks/hooks.go | Implements the hook registry, priority ordering, and void/modifying execution semantics. |
| pkg/hooks/hooks_test.go | Adds tests for concurrency, ordering, cancellation, and race-oriented behavior. |
| pkg/agent/loop.go | Wires hook triggers into the agent loop and wraps outbound publishing via sendOutbound. |
| cmd/picoclaw/main.go | Creates and attaches a hook registry during startup (agent + gateway commands). |
| docs/plans/2026-02-19-hook-system-design.md | Documents the hook model, execution patterns, and integration points. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| go func(reg HookRegistration[T]) { | ||
| defer wg.Done() | ||
| if err := reg.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } | ||
| }(h) |
There was a problem hiding this comment.
Hook handlers can panic (especially since they’re user/extension code), and right now a panic in any handler will crash the whole process (including from the goroutines in triggerVoid). Consider recovering around reg.Handler invocation and logging the panic as a hook failure so hooks can’t take down the agent.
pkg/agent/loop.go
Outdated
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) | ||
| return nil | ||
| }) |
There was a problem hiding this comment.
In SetHooks, the MessageTool callback always returns nil even when message_sending cancels delivery (sendOutbound just returns). That makes MessageTool report success and set sentInRound=true while the message was never published, which can also suppress the loop’s normal outbound response (alreadySent becomes true). Consider having sendOutbound return a “sent/canceled” status (and reason) so this callback can return an error on cancel and avoid marking the tool call as successful.
| if al.hooks != nil { | ||
| event := &hooks.MessageSendingEvent{Channel: msg.Channel, ChatID: msg.ChatID, Content: msg.Content} | ||
| al.hooks.TriggerMessageSending(ctx, event) | ||
| if event.Cancel { |
There was a problem hiding this comment.
When message_sending cancels, sendOutbound silently drops the message. Since MessageSendingEvent includes CancelReason, consider logging (or otherwise surfacing) the cancel reason here so operators can understand why messages disappeared (and so cancellations are debuggable).
| if event.Cancel { | |
| if event.Cancel { | |
| reason := strings.TrimSpace(event.CancelReason) | |
| if reason == "" { | |
| reason = "unspecified" | |
| } | |
| logger.Warnf("Outbound message to channel %s (chatID=%s) canceled by hook: %s", msg.Channel, msg.ChatID, reason) |
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| agentLoop.SetHooks(hooks.NewHookRegistry()) | ||
|
|
There was a problem hiding this comment.
agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.
|
|
||
| for i := range 5 { | ||
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | ||
| count.Add(1) | ||
| // Small sleep to verify concurrency (all run in parallel). | ||
| time.Sleep(10 * time.Millisecond) | ||
| return nil | ||
| }) | ||
| } | ||
|
|
||
| start := time.Now() | ||
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | ||
| elapsed := time.Since(start) | ||
|
|
||
| if count.Load() != 5 { | ||
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | ||
| } | ||
| // If truly concurrent, total time should be much less than 5*10ms = 50ms. | ||
| if elapsed > 40*time.Millisecond { | ||
| t.Errorf("Handlers appear sequential: elapsed %v", elapsed) | ||
| } |
There was a problem hiding this comment.
TestVoidHooksConcurrent relies on timing (sleeping 10ms per handler and asserting total elapsed < 40ms). This is likely to be flaky under CI load or on slower machines, even if handlers are concurrent. Prefer a synchronization-based assertion (e.g., a barrier channel all handlers must reach before any can proceed) so the test doesn’t depend on wall-clock thresholds.
| for i := range 5 { | |
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| count.Add(1) | |
| // Small sleep to verify concurrency (all run in parallel). | |
| time.Sleep(10 * time.Millisecond) | |
| return nil | |
| }) | |
| } | |
| start := time.Now() | |
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | |
| elapsed := time.Since(start) | |
| if count.Load() != 5 { | |
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | |
| } | |
| // If truly concurrent, total time should be much less than 5*10ms = 50ms. | |
| if elapsed > 40*time.Millisecond { | |
| t.Errorf("Handlers appear sequential: elapsed %v", elapsed) | |
| } | |
| started := make(chan struct{}, 5) | |
| release := make(chan struct{}) | |
| done := make(chan struct{}) | |
| for i := range 5 { | |
| r.OnMessageReceived("hook-"+string(rune('A'+i)), i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| started <- struct{}{} | |
| <-release | |
| count.Add(1) | |
| return nil | |
| }) | |
| } | |
| go func() { | |
| r.TriggerMessageReceived(ctx, &MessageReceivedEvent{Content: "test"}) | |
| close(done) | |
| }() | |
| // Ensure all handlers have started and reached the barrier. | |
| for i := 0; i < 5; i++ { | |
| select { | |
| case <-started: | |
| case <-time.After(1 * time.Second): | |
| t.Fatalf("timeout waiting for handler %d to start", i+1) | |
| } | |
| } | |
| // Release all handlers to complete. | |
| close(release) | |
| // Wait for TriggerMessageReceived to finish. | |
| select { | |
| case <-done: | |
| case <-time.After(1 * time.Second): | |
| t.Fatal("timeout waiting for handlers to complete") | |
| } | |
| if count.Load() != 5 { | |
| t.Errorf("Expected 5 handlers called, got %d", count.Load()) | |
| } |
pkg/hooks/hooks.go
Outdated
| for _, h := range hooks { | ||
| if err := h.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": h.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } |
There was a problem hiding this comment.
A panic inside a modifying hook handler will crash the agent loop. Since these are intended as extension points, consider adding panic recovery around h.Handler(ctx, event) (and then logging) so a single bad hook can’t bring down the whole process.
pkg/hooks/types.go
Outdated
| ToolName string | ||
| Args map[string]any // Modifiable | ||
| Channel string | ||
| ChatID string | ||
| Cancel bool | ||
| CancelMsg string // Message returned to LLM when canceled |
There was a problem hiding this comment.
Cancellation fields are named inconsistently across events (MessageSendingEvent has CancelReason, BeforeToolCallEvent has CancelMsg). Since these structs are part of the public hook API, consider standardizing naming (e.g., CancelReason vs CancelMessage) so hook authors don’t have to remember special cases.
| ToolName string | |
| Args map[string]any // Modifiable | |
| Channel string | |
| ChatID string | |
| Cancel bool | |
| CancelMsg string // Message returned to LLM when canceled | |
| ToolName string | |
| Args map[string]any // Modifiable | |
| Channel string | |
| ChatID string | |
| Cancel bool | |
| CancelReason string // Reason for cancellation; preferred over CancelMsg for new hooks | |
| CancelMsg string // DEPRECATED: use CancelReason; kept for backward compatibility |
pkg/agent/loop.go
Outdated
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) | ||
| return nil | ||
| }) | ||
| } |
There was a problem hiding this comment.
SetHooks mutates each MessageTool’s sendCallback without any synchronization, while MessageTool.Execute reads sendCallback without locking. If SetHooks could be called while the agent loop is running, this is a data race. Either document/enforce that SetHooks must be called before Run starts, or add synchronization (e.g., protect sendCallback with a mutex/atomic pointer or rewire at tool construction time only).
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| agentLoop.SetHooks(hooks.NewHookRegistry()) | ||
|
|
There was a problem hiding this comment.
agentLoop.SetHooks(hooks.NewHookRegistry()) installs an empty registry unconditionally. With this wiring, al.hooks is non-nil even when no handlers are registered, so the loop will still construct hook events and call Trigger* on every iteration (adding overhead and undermining the stated “zero-cost when unused” goal). Consider leaving hooks nil by default and only calling SetHooks when hooks are actually configured/registered, or add a cheap “has handlers” check before building events.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
Comments suppressed due to low confidence (1)
pkg/agent/loop.go:848
maybeSummarizelaunches a goroutine but captures and reuses the requestctxwhen callingsendOutbound. Since the goroutine may run after the parent request is canceled, hooks may see a canceled context and behave differently (e.g., skip work / cancel the message). Use a fresh context (e.g.,context.Background()or a derived context with timeout) for this background notification, or explicitly document that hooks will receive the caller's ctx even in background work.
go func() {
defer al.summarizing.Delete(summarizeKey)
if !constants.IsInternalChannel(channel) {
al.sendOutbound(ctx, bus.OutboundMessage{
Channel: channel,
ChatID: chatID,
Content: "Memory threshold reached. Optimizing conversation history...",
})
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| // Hook registry is nil by default for true zero-cost. | ||
| // Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks. | ||
|
|
There was a problem hiding this comment.
Same as earlier in agentCmd: this is comment-only guidance and does not actually wire hooks. If the intent is to support hooks in gateway mode, consider adding an explicit enablement mechanism (config/env/flag) and calling agentLoop.SetHooks(...), or adjust the PR description accordingly.
| // BeforeToolCallEvent is fired before a tool is executed. | ||
| // Handlers can modify Args, or set Cancel to block execution. | ||
| type BeforeToolCallEvent struct { | ||
| ToolName string | ||
| Args map[string]any // Modifiable | ||
| Channel string | ||
| ChatID string | ||
| Cancel bool | ||
| CancelReason string // Message returned to LLM when canceled | ||
| } |
There was a problem hiding this comment.
BeforeToolCallEvent.Args is documented as modifiable but may be nil depending on the upstream tool call (arguments are omitempty). If the API expects modification, consider documenting that Args is always non-nil (and enforce it in the integration/trigger path), or explicitly state that handlers must handle nil.
| if al.hooks != nil { | ||
| btcEvent := &hooks.BeforeToolCallEvent{ | ||
| ToolName: tc.Name, | ||
| Args: tc.Arguments, | ||
| Channel: opts.Channel, | ||
| ChatID: opts.ChatID, | ||
| } | ||
| al.hooks.TriggerBeforeToolCall(ctx, btcEvent) | ||
| if btcEvent.Cancel { | ||
| toolCanceled = true | ||
| toolResult = tools.ErrorResult(btcEvent.CancelReason) | ||
| } | ||
| tc.Arguments = btcEvent.Args | ||
| } |
There was a problem hiding this comment.
BeforeToolCallEvent.Args is populated from tc.Arguments, which can be nil (tool call arguments are omitempty). If a hook tries to modify args (the documented use-case), writing to a nil map will panic and the modification will be silently dropped (panic is recovered in the hook runner). Initialize args to a non-nil map before triggering the hook (and/or when constructing btcEvent).
pkg/agent/loop.go
Outdated
| al.hooks.TriggerBeforeToolCall(ctx, btcEvent) | ||
| if btcEvent.Cancel { | ||
| toolCanceled = true | ||
| toolResult = tools.ErrorResult(btcEvent.CancelReason) |
There was a problem hiding this comment.
When a hook cancels a tool call, tools.ErrorResult(btcEvent.CancelReason) is used even if CancelReason is empty. That can produce an empty tool result message fed back into the LLM, which is hard to debug. Consider defaulting to a non-empty message (and/or including the tool name / handler) when CancelReason == "".
| toolResult = tools.ErrorResult(btcEvent.CancelReason) | |
| reason := btcEvent.CancelReason | |
| if strings.TrimSpace(reason) == "" { | |
| reason = fmt.Sprintf("tool call %q was cancelled by before_tool_call hook", tc.Name) | |
| } | |
| toolResult = tools.ErrorResult(reason) |
cmd/picoclaw/main.go
Outdated
| msgBus := bus.NewMessageBus() | ||
| agentLoop := agent.NewAgentLoop(cfg, msgBus, provider) | ||
| // Hook registry is nil by default for true zero-cost. | ||
| // Call agentLoop.SetHooks(hooks.NewHookRegistry()) to enable hooks. | ||
|
|
There was a problem hiding this comment.
The PR description claims main wiring will "create and set HookRegistry", but the code only adds comments and never actually enables hooks (no hooks.NewHookRegistry() import/call). Either add real wiring (e.g., a config/env/flag to enable and call agentLoop.SetHooks(...)) or update the PR description/checklist to reflect that hooks are intentionally opt-in and not wired by default.
96dc911 to
f73fb27
Compare
nikolasdehor
left a comment
There was a problem hiding this comment.
Well-designed hook system. The generic approach with HookHandler[T] and HookRegistration[T] is clean and type-safe.
Key strengths:
- Zero-cost when unused: The
len==0early return in trigger methods means no allocations or goroutines when no hooks are registered. Thehooksfield defaults to nil onAgentLoop, which is the right choice. - Copy-on-write
insertSorted: Allocating a new backing array on registration avoids data races between concurrent readers (trigger under RLock) and writers (register under Lock). Good design. - Panic recovery: Both
triggerVoid(concurrent goroutines) andtriggerModifying(sequential loop) havedefer recover()— prevents a misbehaving hook from crashing the agent loop. sendOutboundreturns bool: The caller (including the MessageTool callback path) can detect when delivery was canceled by a hook. Clean API.- Test coverage: 15 tests covering execution order, priority, cancellation, concurrency, and panic recovery.
Minor observations:
- The
SetHooksmethod rewires the MessageTool callback, which is a good approach for intercepting tool-generated outbound messages. Thecontext.Background()usage there is acceptable since the message context is not available at hook registration time. - The
maybeSummarizesignature change to includectx context.Contextis a clean way to thread context through to the hook triggers.
Solid addition to the codebase.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/agent/loop.go
Outdated
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") |
There was a problem hiding this comment.
When sendOutbound returns false, the MessageTool callback returns a generic "message canceled by hook" error. This drops the hook-provided CancelReason, making it hard for the LLM/operator to understand what policy blocked the message. Consider having sendOutbound return the cancel reason (or an error type) so it can be propagated here.
pkg/hooks/hooks.go
Outdated
| // triggerVoid runs all handlers concurrently and waits for completion. | ||
| // Handlers MUST NOT mutate the event — it is shared across goroutines. | ||
| // Errors are logged but do not propagate to the caller. | ||
| func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) { |
There was a problem hiding this comment.
triggerVoid passes the same event pointer to multiple goroutines. Even with the comment saying handlers must not mutate, it only takes one handler to accidentally write to the struct/map/slice fields to introduce a data race between hooks. To make the system safer for third-party hooks, consider passing each handler its own (shallow or deep) copy of the event, or running void hooks sequentially when the event contains reference types.
pkg/agent/loop.go
Outdated
| } | ||
|
|
||
| // SetHooks installs a hook registry. Must be called before Run starts. | ||
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { |
There was a problem hiding this comment.
SetHooks mutates al.hooks and rewires MessageTool callbacks without any guard against the loop already running. If SetHooks is called after Run starts (despite the comment), this can race with sendOutbound/trigger calls and tool execution. Consider enforcing this contract (e.g., return an error or panic when al.running is true, or protect the hooks pointer/callback rewiring with a mutex/atomic).
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | |
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | |
| if al.running.Load() { | |
| panic("SetHooks must be called before Run starts") | |
| } |
|
Rebased/merged with latest upstream Local checks all pass:
Current GitHub Actions status is |
|
Phase 1 plugin direction has now been implemented in this PR as requested, in separate commits:
Key docs:
This keeps risk bounded: dynamic plugin loading remains out-of-scope; current model is compile-time registration on top of typed lifecycle hooks. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
pkg/agent/loop.go:901
- The context passed to sendOutbound in the goroutine may be canceled before the message is sent. The ctx parameter comes from the parent request and could be canceled if the original request completes or times out. Consider using context.Background() or a detached context for goroutines that should continue after the request ends, similar to line 227 in SetHooks.
if st, ok := tool.(tools.ContextualTool); ok {
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/hooks/hooks_test.go
Outdated
| wg.Add(1) | ||
| go func() { | ||
| defer wg.Done() | ||
| r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error { |
There was a problem hiding this comment.
All registration goroutines use the same handler name "reg-hook" but different priorities. When multiple hooks with the same name are registered, there's no deduplication, so this test registers 10 hooks all named "reg-hook". While this tests concurrent registration safety, it doesn't reflect realistic usage. Consider using unique names like fmt.Sprintf("reg-hook-%d", i) to better represent production scenarios.
| r.OnMessageReceived("reg-hook", i, func(_ context.Context, _ *MessageReceivedEvent) error { | |
| r.OnMessageReceived(fmt.Sprintf("reg-hook-%d", i), i, func(_ context.Context, _ *MessageReceivedEvent) error { |
| } | ||
|
|
||
| // SetHooks installs a hook registry. Must be called before Run starts. | ||
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if !al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // SetPluginManager installs a plugin manager and routes its hook registry into the loop. | ||
| // Must be called before Run starts. | ||
| func (al *AgentLoop) SetPluginManager(pm *plugin.Manager) { | ||
| al.pluginManager = pm | ||
| if pm == nil { | ||
| al.SetHooks(nil) | ||
| return | ||
| } | ||
| al.SetHooks(pm.HookRegistry()) | ||
| } | ||
|
|
||
| // EnablePlugins is a convenience helper to build and install a plugin manager. | ||
| func (al *AgentLoop) EnablePlugins(plugins ...plugin.Plugin) error { | ||
| pm := plugin.NewManager() | ||
| if err := pm.RegisterAll(plugins...); err != nil { | ||
| return err | ||
| } | ||
| al.SetPluginManager(pm) | ||
| return nil | ||
| } | ||
|
|
||
| // sendOutbound wraps bus.PublishOutbound with the message_sending hook. |
There was a problem hiding this comment.
The hook integration in AgentLoop lacks test coverage. While pkg/hooks has comprehensive unit tests, there are no integration tests verifying that hooks are correctly triggered during message processing, tool execution, or LLM calls. Consider adding tests that verify hook execution in realistic agent loop scenarios, especially for the critical paths like before_tool_call cancellation and message_sending filtering.
pkg/hooks/hooks.go
Outdated
| // triggerVoid runs all handlers concurrently and waits for completion. | ||
| // Handlers MUST NOT mutate the event — it is shared across goroutines. | ||
| // Errors are logged but do not propagate to the caller. | ||
| func triggerVoid[T any](ctx context.Context, hooks []HookRegistration[T], event *T, hookName string) { | ||
| if len(hooks) == 0 { | ||
| return | ||
| } | ||
| var wg sync.WaitGroup | ||
| for _, h := range hooks { | ||
| wg.Add(1) | ||
| go func(reg HookRegistration[T]) { | ||
| defer wg.Done() | ||
| defer func() { | ||
| if r := recover(); r != nil { | ||
| logger.ErrorCF("hooks", "Hook panic", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "panic": fmt.Sprintf("%v", r), | ||
| }) | ||
| } | ||
| }() | ||
| if err := reg.Handler(ctx, event); err != nil { | ||
| logger.WarnCF("hooks", "Hook error", | ||
| map[string]any{ | ||
| "hook": hookName, | ||
| "handler": reg.Name, | ||
| "error": err.Error(), | ||
| }) | ||
| } | ||
| }(h) | ||
| } | ||
| wg.Wait() | ||
| } |
There was a problem hiding this comment.
Void hooks documentation states "Handlers MUST NOT mutate the event" but there's no enforcement mechanism. Since event data is shared across concurrent goroutines, mutations could cause data races. Consider adding a note in public-facing documentation that hook handlers must be read-only for observe-only hooks, or consider using immutable event types (though this would add overhead).
| "iteration": iteration, | ||
| }) | ||
|
|
||
| // Create async callback for tools that implement AsyncTool | ||
| // NOTE: Following openclaw's design, async tools do NOT send results directly to users. | ||
| // Instead, they notify the agent via PublishInbound, and the agent decides | ||
| // whether to forward the result to the user (in processSystemMessage). | ||
| asyncCallback := func(callbackCtx context.Context, result *tools.ToolResult) { | ||
| // Log the async completion but don't send directly to user | ||
| // The agent will handle user notification via processSystemMessage | ||
| if !result.Silent && result.ForUser != "" { | ||
| logger.InfoCF("agent", "Async tool completed, agent will handle notification", | ||
| map[string]any{ | ||
| "tool": tc.Name, | ||
| "content_len": len(result.ForUser), | ||
| }) | ||
| } | ||
| } | ||
|
|
||
| toolResult := agent.Tools.ExecuteWithContext( | ||
| ctx, | ||
| tc.Name, | ||
| tc.Arguments, | ||
| opts.Channel, | ||
| opts.ChatID, | ||
| asyncCallback, | ||
| ) | ||
| // Fire before_tool_call hook |
There was a problem hiding this comment.
When tc.Arguments is nil, a new empty map is created for the hook event (line 789), but modifications to this map are not propagated back to tc.Arguments because args is reassigned on line 806. However, if tc.Arguments was originally nil, line 806 correctly assigns the (possibly modified) map back to tc.Arguments. This logic is correct but subtle. Consider adding a comment explaining that the empty map creation ensures hooks always receive a non-nil map that can be safely modified.
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if !al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }) { | ||
| return fmt.Errorf("message canceled by hook") | ||
| } |
There was a problem hiding this comment.
The SetSendCallback method on MessageTool is being called without synchronization. If a tool execution is in progress during SetHooks, there's a race condition between reading sendCallback in Execute() and writing it in SetSendCallback(). While unlikely due to expected pre-Run() usage, this violates Go's race detection rules.
pkg/hooks/hooks_test.go
Outdated
| // Register with priorities: 50, 10, 30, 20, 40 | ||
| priorities := []int{50, 10, 30, 20, 40} | ||
| for _, p := range priorities { | ||
| r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error { |
There was a problem hiding this comment.
The test name creation logic "p-"+string(rune('0'+p)) doesn't produce readable test names. For priority 50, rune('0'+50) equals 98 which is 'b', not a meaningful name. This should use fmt.Sprintf("p-%d", p) instead to generate names like "p-50", "p-10", etc.
| r.OnBeforeToolCall("p-"+string(rune('0'+p)), p, func(_ context.Context, _ *BeforeToolCallEvent) error { | |
| r.OnBeforeToolCall(fmt.Sprintf("p-%d", p), p, func(_ context.Context, _ *BeforeToolCallEvent) error { |
|
Follow-up for style/quality validation: GitHub Actions is still Could a maintainer approve workflow execution for PR #473 so CI (including lint) can run? Latest head: |
|
Reviewed Copilot feedback on the latest head and triaged stale vs actionable. Addressed in follow-up commits:
Items considered stale/outdated:
Remaining open by design (defer):
If maintainers want, I can do a separate follow-up PR for an optional "safe mode" that deep-copies event payloads for observe-only hooks (with measured overhead tradeoff). |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
pkg/agent/loop.go
Outdated
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { |
There was a problem hiding this comment.
Inconsistent indentation: this if statement should be aligned with the surrounding code. It appears to have extra indentation (starts with tabs when it should match line 229).
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { |
pkg/agent/loop.go
Outdated
| toolStart := time.Now() | ||
| if !toolCanceled { | ||
| toolResult = agent.Tools.ExecuteWithContext( | ||
| ctx, | ||
| tc.Name, | ||
| tc.Arguments, | ||
| opts.Channel, | ||
| opts.ChatID, | ||
| asyncCallback, | ||
| ) | ||
| } | ||
| toolDuration := time.Since(toolStart) |
There was a problem hiding this comment.
The toolStart timer is set before checking if the tool is canceled, but when toolCanceled is true, the tool never actually executes. This means the Duration in the after_tool_call hook will be near-zero for canceled tools, which is misleading. Consider either moving toolStart to line 839 (inside the if block), or documenting that Duration will be near-zero for canceled operations.
docs/design/plugin-system-roadmap.md
Outdated
|
|
||
| ## Current Status (Phase 0: Foundation) | ||
|
|
||
| Implemented in current hooks MR: |
There was a problem hiding this comment.
Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.
docs/design/plugin-system-roadmap.md
Outdated
|
|
||
| ## Maintainer Review Notes | ||
|
|
||
| The current hooks MR should be reviewed as Phase 0 only. It intentionally establishes extension points while avoiding high-risk runtime plugin mechanics. |
There was a problem hiding this comment.
Inconsistent terminology: "MR" (Merge Request) is GitLab terminology, while this is a GitHub repository. The codebase uses "PR" (Pull Request) elsewhere in documentation. Should use "PR" for consistency.
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }); !sent { | ||
| if strings.TrimSpace(reason) == "" { | ||
| reason = "unspecified" | ||
| } | ||
| return fmt.Errorf("message canceled by hook: %s", reason) | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
When SetPluginManager(nil) is called after hooks were previously installed, the MessageTool callbacks are not restored to their original state. They remain pointing to sendOutbound, which will use the now-nil hooks registry. Consider resetting the MessageTool callbacks to the original direct bus.PublishOutbound behavior when h is nil, or document that SetHooks should not be called with nil after hooks were set.
| al.hooks = h | |
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }); !sent { | |
| if strings.TrimSpace(reason) == "" { | |
| reason = "unspecified" | |
| } | |
| return fmt.Errorf("message canceled by hook: %s", reason) | |
| } | |
| return nil | |
| }) | |
| } | |
| } | |
| } | |
| } | |
| // If hooks are being cleared, restore MessageTool callbacks to direct bus publishing. | |
| if h == nil { | |
| al.hooks = nil | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| // Bypass hook interception and publish directly to the bus. | |
| return al.bus.PublishOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }) | |
| }) | |
| } | |
| } | |
| } | |
| } | |
| return | |
| } | |
| al.hooks = h | |
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | |
| for _, agentID := range al.registry.ListAgentIDs() { | |
| if agent, ok := al.registry.GetAgent(agentID); ok { | |
| if tool, ok := agent.Tools.Get("message"); ok { | |
| if mt, ok := tool.(*tools.MessageTool); ok { | |
| mt.SetSendCallback(func(channel, chatID, content string) error { | |
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| Channel: channel, | |
| ChatID: chatID, | |
| Content: content, | |
| }); !sent { | |
| if strings.TrimSpace(reason) == "" { | |
| reason = "unspecified" | |
| } | |
| return fmt.Errorf("message canceled by hook: %s", reason) | |
| } | |
| return nil | |
| }) | |
| } | |
| } | |
| } | |
| } |
docs/design/plugin-system-roadmap.md
Outdated
| ## Phase 1: Static Plugin Contract (Compile-time) | ||
|
|
||
| Goal: define a minimal public plugin contract for Go modules. | ||
|
|
||
| Proposed: | ||
|
|
||
| - Add `pkg/plugin` with a small interface: | ||
| - `Name() string` | ||
| - `Register(*hooks.HookRegistry) error` | ||
| - Register plugins at startup in code. | ||
| - Add compatibility metadata (`PluginAPIVersion`) for forward checks. | ||
|
|
||
| Exit criteria: | ||
|
|
||
| - Example plugin module builds against the contract. |
There was a problem hiding this comment.
The PR description states that Phase 1 has been implemented in this PR (commit 8d4cbd7), but this roadmap document still lists Phase 1 as "Proposed" with future tense. The document should be updated to reflect that Phase 1 is now implemented, moving it to a "Current Status" or "Completed" section to match the actual state of the codebase.
| ## Phase 1: Static Plugin Contract (Compile-time) | |
| Goal: define a minimal public plugin contract for Go modules. | |
| Proposed: | |
| - Add `pkg/plugin` with a small interface: | |
| - `Name() string` | |
| - `Register(*hooks.HookRegistry) error` | |
| - Register plugins at startup in code. | |
| - Add compatibility metadata (`PluginAPIVersion`) for forward checks. | |
| Exit criteria: | |
| - Example plugin module builds against the contract. | |
| ## Phase 1: Static Plugin Contract (Compile-time) — Completed | |
| Goal: define a minimal public plugin contract for Go modules. | |
| Status: Implemented in commit 8d4cbd7: | |
| - `pkg/plugin` provides a small interface: | |
| - `Name() string` | |
| - `Register(*hooks.HookRegistry) error` | |
| - Plugins are registered at startup in code. | |
| - Compatibility metadata (`PluginAPIVersion`) enables forward checks. | |
| Exit criteria (met): | |
| - An example plugin module builds against the contract. |
| - Attach once via `agentLoop.SetHooks(registry)` before `Run()`. | ||
| - If hooks are not set, default behavior is unchanged. | ||
|
|
||
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) |
There was a problem hiding this comment.
Missing blank line between documentation links. For better readability and consistency with Markdown best practices, add a blank line between these two documentation references.
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) | |
| See runnable examples: [docs/hooks-plugin-examples.md](docs/hooks-plugin-examples.md) |
pkg/agent/loop.go
Outdated
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ |
There was a problem hiding this comment.
Using context.Background() here discards the original context that may contain cancellation signals, deadlines, or tracing information. The MessageTool callback should accept and propagate a context parameter instead of creating a new background context. This could lead to operations continuing even after the parent context is cancelled.
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | |
| ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second) | |
| defer cancel() | |
| if sent, reason := al.sendOutbound(ctx, bus.OutboundMessage{ |
|
Added a reviewer-context block to the PR body with verified external references and explicit scope boundaries. Key point: this PR is intentionally bounded to hooks + phase-1 compile-time plugin contract; dynamic runtime loading/marketplace/sandbox are explicitly out-of-scope. References used: |
|
I kept this PR intentionally small.
Dynamic plugin loading/marketplace/sandbox are not part of this PR. If this scope looks OK, please enable CI for this fork PR. I will fix anything that fails right away. |
|
Thanks maintainers for reviewing. Quick scope:
Why this matters (security-first):
Why this is a good opportunity for PicoClaw:
If this phase-1 direction is acceptable, I will keep follow-ups small and separate. |
|
Updated roadmap with explicit runtime plugin direction. Short version:
This keeps current PR scope unchanged (hooks + compile-time contract) while documenting a safer long-term runtime path. |
|
Added plugin API compatibility guard in this PR. What changed:
Reason:
|
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case uint64: | ||
| if n > maxIntU64 { | ||
| return 0, false | ||
| } |
There was a problem hiding this comment.
Float to int conversion truncates the decimal part without rounding. For timeout values, this could lead to unexpected behavior. For example, timeout_seconds: 30.9 would become 30 instead of 31. Consider documenting this truncation behavior or using rounding (e.g., int(n + 0.5)) to avoid surprising users with values just below a threshold.
| return int(n), true | ||
| case int16: |
There was a problem hiding this comment.
The conversion from int64 to int can overflow on 32-bit systems if the value exceeds MaxInt32. The function should check if the value is within the valid range for int before converting, similar to how it's done for unsigned types. This could cause silent overflow and incorrect behavior.
pkg/agent/loop.go
Outdated
| func (al *AgentLoop) SetHooks(h *hooks.HookRegistry) { | ||
| if al.running.Load() { | ||
| panic("SetHooks must be called before Run starts") | ||
| } | ||
| al.hooks = h | ||
|
|
||
| // Rewire MessageTool callbacks to route through sendOutbound for hook interception. | ||
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { | ||
| mt.SetSendCallback(func(channel, chatID, content string) error { | ||
| if sent, reason := al.sendOutbound(context.Background(), bus.OutboundMessage{ | ||
| Channel: channel, | ||
| ChatID: chatID, | ||
| Content: content, | ||
| }); !sent { | ||
| if strings.TrimSpace(reason) == "" { | ||
| reason = "unspecified" | ||
| } | ||
| return fmt.Errorf("message canceled by hook: %s", reason) | ||
| } | ||
| return nil | ||
| }) | ||
| } | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
When SetHooks(nil) is called (e.g., via SetPluginManager(nil)), the MessageTool callback is still rewired to route through sendOutbound. This means even with hooks disabled, message tool sends go through an extra layer of indirection. Consider adding a check to skip the MessageTool rewiring when h == nil, or restore the original callback behavior.
| -> outbound publish | ||
| -> session_end | ||
| ``` | ||
|
|
There was a problem hiding this comment.
The lifecycle map shows a linear flow, but in reality, llm_input, llm_output, before_tool_call, and after_tool_call can occur multiple times in an iteration loop (up to MaxToolIterations). Consider adding a note about iteration/looping behavior to avoid confusion for plugin developers who need to understand when their hooks will fire.
| Note: The map above is shown as a single linear pass for readability. In practice, the | |
| agent loop may iterate up to `MaxToolIterations`, and the following hooks can fire | |
| multiple times within a single overall lifecycle: | |
| `llm_input`, `llm_output`, `before_tool_call`, and `after_tool_call`. |
pkg/agent/loop.go
Outdated
| for _, agentID := range al.registry.ListAgentIDs() { | ||
| if agent, ok := al.registry.GetAgent(agentID); ok { | ||
| if tool, ok := agent.Tools.Get("message"); ok { | ||
| if mt, ok := tool.(*tools.MessageTool); ok { |
There was a problem hiding this comment.
Incorrect indentation: this line has an extra tab character. The if mt, ok := tool.(*tools.MessageTool); ok { should align with the line above it at the same indentation level.
pkg/agent/loop.go
Outdated
| defer al.summarizing.Delete(summarizeKey) | ||
| if !constants.IsInternalChannel(channel) { | ||
| al.bus.PublishOutbound(bus.OutboundMessage{ | ||
| al.sendOutbound(ctx, bus.OutboundMessage{ |
There was a problem hiding this comment.
The context from the parent function is captured in a goroutine, which can lead to the goroutine using a canceled or expired context if the parent function completes before the goroutine. Consider using context.Background() or context.TODO() for the sendOutbound call within the goroutine, or ensure the context lifetime extends beyond the goroutine's execution.
| al.sendOutbound(ctx, bus.OutboundMessage{ | |
| al.sendOutbound(context.Background(), bus.OutboundMessage{ |
Add a typed lifecycle hook system inspired by OpenClaw, designed for PicoClaw's ultra-lightweight philosophy. Provides 8 interception points around the agent loop for observability, content filtering, and guardrails. Two execution patterns: - Void hooks (concurrent): message_received, after_tool_call, llm_input, llm_output, session_start, session_end - Modifying hooks (sequential by priority, with cancel): message_sending, before_tool_call Key design choices: - Zero-cost when unused: all triggers check len==0 and return immediately - Copy-on-write registration: insertSorted allocates new backing array so concurrent readers never race with writers - Panic recovery in all handler dispatch paths - sendOutbound wrapper returns cancel status to callers - MessageTool callback rewired via SetHooks for content filtering 15 tests covering execution, priority ordering, cancel semantics, concurrency (barrier-based), panic recovery, and error swallowing. Co-Authored-By: Claude Opus 4.6 <[email protected]>
116904c to
0d2c4f9
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
Description
This PR introduces a bounded plugin foundation for PicoClaw:
pkg/plugin) with explicit startup wiring.The goal is to enable extension without expanding core dependencies or adding runtime plugin complexity.
Scope
Included:
pkg/hooks: typed hook events + registry + trigger semantics.pkg/agent/loop.go: hook integration + plugin manager wiring.pkg/plugin: compile-time plugin interface and manager.docs/hooks-plugin-examples.mddocs/plugin-system-roadmap.mdNot included in this PR:
Behavior & Risk
SetHooksis enforced as pre-run only.Validation
Local:
make fmtgo generate ./...go vet ./...go test ./...External context (references)
Type of Change
AI Code Generation
Related Issue
N/A
Technical Context
Test Environment
Checklist