Skip to content

fix(kernel): idle reactive agents triggering spurious background API#791

Open
Vn1k wants to merge 1 commit intoRightNow-AI:mainfrom
Vn1k:fix/idle-agent-fires-api-calls-without-user-input
Open

fix(kernel): idle reactive agents triggering spurious background API#791
Vn1k wants to merge 1 commit intoRightNow-AI:mainfrom
Vn1k:fix/idle-agent-fires-api-calls-without-user-input

Conversation

@Vn1k
Copy link

@Vn1k Vn1k commented Mar 22, 2026

## Summary

Fixes a bug where agents with `[autonomous] max_iterations = N` in their
`agent.toml` were firing background API calls while idle — observed as
unprompted requests with 24K+ input tokens and `finish_reason: "length"`
up to 90 minutes after the last user message.

The default `assistant` agent ships with `[autonomous] max_iterations = 100`,
making this affect every default installation.

**Evidence — provider logs showing two calls from the same session:**

<img width="1110" height="104" alt="screenshot-2026-03-23_00-04-45" src="https://github.com/user-attachments/assets/7202938f-87f8-43f9-a825-44a7270912aa" />

The first row (03:42 PM, `finish_reason: length`, 24,582 tokens) fired
**93 minutes after the last user message** (02:09 PM, `finish_reason: stop`).
No user interaction occurred between the two calls.

## Changes

- **`heartbeat.rs`** — Only use `heartbeat_interval_secs` as the unresponsive
  timeout for agents with `Continuous` or `Periodic` schedule. Reactive agents
  now always use `default_timeout_secs` (180s) regardless of any `[autonomous]`
  config. Previously, the 30s default caused Reactive agents to be marked
  `Crashed` after just 60s of user inactivity.

- **`kernel.rs`** — Added two guards in the heartbeat monitor loop:
  - Unresponsive check: skip `Crashed` marking for Reactive agents — an idle
    conversational agent is not crashed, it is waiting for the next user message.
  - Recovery loop: if a Reactive agent ends up `Crashed`, reset it silently to
    `Running` without publishing a `HealthCheckFailed` event or entering the
    recovery counter, preventing the idle → Crashed → recovered → idle cycle
    that triggered background `send_message` calls with the full session context.
    
## Testing

- [x] `cargo clippy --workspace --all-targets -- -D warnings` passes
  (one pre-existing warning in `openfang-types/src/model_catalog.rs:311`
  unrelated to this fix — exists on `main` before these changes)
- [x] `cargo test --workspace` passes — 228/228 kernel tests pass
- [x] Live integration tested — no background API calls observed after 10+
  minutes of inactivity with `assistant` agent

## Security

- [x] No new unsafe code
- [x] No secrets or API keys in diff
- [x] User input validated at boundaries

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant