A high-performance plugin for ElizaOS that significantly reduces LLM costs through intelligent message batching, multi-agent coordination, and strategic planning.
- 🚀 Extreme Cost Reduction: Time-sliced batching reduces LLM calls by 20-40× across ALL rooms and agents simultaneously.
- 🕒 Time-Sliced Batching: Processes messages from multiple rooms in 50-500ms windows with a single LLM call.
- 🧠 Intelligent Planning: Assesses message complexity, knowledge requirements, and token budgets before generating responses.
- ⚡ Always-On Batching: All messages use time-sliced batching for optimal multi-agent coordination (critical messages can bypass for instant response).
- 🤖 Multi-Agent Coordination: Single planning call coordinates responses for all agents across all active rooms.
- 🎯 Smart Filtering: Topic relevance and flood detection prevent unnecessary processing.
- 🚨 Priority Routing: Critical messages bypass batching for immediate response (DMs, mentions, urgent keywords).
- 🔮 Predictive Pre-Warming: Learns daily patterns to predict and prepare for high-traffic periods.
- 🗂️ Semantic Clustering: Groups similar messages to reduce redundant processing.
- 💰 Budget Pooling: All agents share a global budget pool for optimal resource allocation.
- 📊 Rich Metrics: Detailed tracking of token usage, costs, latency, and I/O.
- 🛡️ Durable Queuing: Messages are persisted on enqueue so batches survive process restarts.
```bash
bun add @elizaos/plugin-autonomous
```

Add the plugin to your character configuration:

```json
{
  "plugins": ["@elizaos/plugin-autonomous"]
}
```

Configure behavior via `.env` or runtime settings:
| Setting | Description | Default |
|---|---|---|
| `AUTONOMOUS_BATCH_THRESHOLD` | Messages/sec globally for "high load" status (batching always enabled) | `2` |
| `AUTONOMOUS_BATCH_MAX_SIZE` | Min messages per time slice (efficiency gate) | `2` |
| `AUTONOMOUS_WORKER_CONCURRENCY` | Parallel time slice processors | `4` |
| `AUTONOMOUS_BUDGET_DAILY_USD` | Max daily spend (USD). Omit = unlimited | `undefined` |
| `AUTONOMOUS_DEADLINE_MS` | Max processing time per message (ms) | `undefined` |
| `AUTONOMOUS_BUDGET_MODE` | Action when over budget (`dynamic` \| `reject`) | `dynamic` |
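For example, a `.env` using these settings might look like this (the specific values are illustrative, not recommendations):

```env
AUTONOMOUS_BATCH_THRESHOLD=2
AUTONOMOUS_BATCH_MAX_SIZE=2
AUTONOMOUS_WORKER_CONCURRENCY=4
AUTONOMOUS_BUDGET_DAILY_USD=5
AUTONOMOUS_DEADLINE_MS=3000
AUTONOMOUS_BUDGET_MODE=dynamic
```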
Time Slice Behavior:
- Slice duration dynamically adjusts from 50ms (high load) to 500ms (low load)
- Minimum 2 messages required before processing (prevents inefficient singleton batches)
- Messages from ALL rooms are collected in each time slice
- Single LLM call processes all rooms simultaneously
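As a rough sketch of how the 50-500ms adaptation could work (the linear interpolation and the `HIGH_LOAD_RATE` cutoff are assumptions for illustration, not the plugin's exact formula):

```typescript
// Sketch: adaptive slice duration between the documented 50ms and
// 500ms bounds. The interpolation and cutoff are assumed.
const MIN_SLICE_MS = 50;   // high load
const MAX_SLICE_MS = 500;  // low load
const HIGH_LOAD_RATE = 10; // msg/s at which the floor is hit (assumed)

function sliceDurationMs(msgPerSec: number): number {
  const load = Math.min(msgPerSec / HIGH_LOAD_RATE, 1); // normalize to 0..1
  // Higher load -> shorter slices: lower latency while still batching.
  return Math.round(MAX_SLICE_MS - load * (MAX_SLICE_MS - MIN_SLICE_MS));
}

sliceDurationMs(0);  // 500 (quiet)
sliceDurationMs(10); // 50  (busy)
```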
Advanced Features:
- Priority routing enabled by default (`AUTONOMOUS_PRIORITY_ENABLED=true`)
- Semantic clustering tracks similar messages automatically
- Predictive learning builds hourly patterns for load anticipation
- `dynamic` (default): Automatically downgrades to cheaper/faster models when budget is tight.
- `reject`: Queues or rejects messages when budget is exceeded.
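A minimal sketch of the two modes (the `ModelTier` names and the downgrade thresholds are assumptions; only the downgrade-vs-reject behavior comes from the description above):

```typescript
type BudgetMode = 'dynamic' | 'reject';
type ModelTier = 'quality' | 'balanced' | 'fast';

// Sketch: choose a model tier from the remaining daily budget.
function pickTier(
  mode: BudgetMode,
  spentUsd: number,
  dailyUsd: number
): ModelTier | 'queue' {
  const remaining = 1 - spentUsd / dailyUsd;
  if (remaining <= 0) return mode === 'reject' ? 'queue' : 'fast';
  if (remaining < 0.1) return 'fast';     // tight: cheapest/fastest model
  if (remaining < 0.3) return 'balanced'; // getting close: mid tier
  return 'quality';
}
```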
All messages are processed through time-sliced batching for optimal multi-agent coordination. Even in "slow" rooms with a single message, batching allows multiple agents to coordinate their responses in a single LLM call, which is far more efficient than per-agent planning.
Exception: Critical priority messages (DMs, urgent mentions) can bypass batching for instant response via the "express lane."
The plugin monitors message velocity across all agents and rooms for metrics and logging:
- Low Load (< 2 msg/s): Quiet batching (no banner).
- High Load (≥ 2 msg/s): Shows colorful console output.
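A sliding-window counter is one simple way to compute such a rate (this buffer approach is an assumption, not the plugin's internal tracker):

```typescript
// Sketch: messages/sec over a 5-second sliding window.
const WINDOW_MS = 5_000;
const timestamps: number[] = [];

function recordMessage(now = Date.now()): void {
  timestamps.push(now);
}

function messagesPerSecond(now = Date.now()): number {
  // Drop timestamps that fell out of the window.
  while (timestamps.length && timestamps[0] < now - WINDOW_MS) timestamps.shift();
  return timestamps.length / (WINDOW_MS / 1_000);
}
// A result >= 2 corresponds to the "high load" banner below.
```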
When high load is detected, you'll see:
```
🔥 HIGH LOAD BATCHING 🔥
Rate: 3.2 msg/s (threshold: 2)
Queue Depth: 8 messages
Room Count: 2 active rooms
→ Queuing message abc123... for batch processing
```
Messages are collected into time slices (50-500ms windows) across all rooms simultaneously:
```
⏰ TIME SLICE READY ⏰
Slice ID: 42
Messages: 12
Rooms: 3
Agents: 8
Wait Time: 150ms
```
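Conceptually, the collector opens a slice on the first message and flushes everything that arrives within the window. A sketch (types and field names are assumptions; the real queue also persists messages and enforces the 2-message minimum):

```typescript
// Sketch: one shared time slice across all rooms.
interface QueuedMessage { roomId: string; text: string; }
interface TimeSlice { id: number; messages: QueuedMessage[]; }

let nextSliceId = 1;
let current: TimeSlice | null = null;

function enqueue(
  msg: QueuedMessage,
  sliceMs: number,
  flush: (slice: TimeSlice) => void
): void {
  if (!current) {
    current = { id: nextSliceId++, messages: [] };
    // One timer per slice: every message arriving in the window rides along.
    setTimeout(() => { const slice = current!; current = null; flush(slice); }, sliceMs);
  }
  current.messages.push(msg);
}
```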
The planner analyzes all messages from all rooms and all active agents in a single LLM call:
- Decides who should respond, ignore, or react
- Assigns complexity scores (0-100)
- Selects optimal pipelines (Fast/Balanced/Quality)
- Filters inactive agents automatically
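Based on that description, a single planning call likely yields one decision per (agent, message) pair, roughly shaped like this (field names are illustrative, not the plugin's schema):

```typescript
// Sketch of one planner decision for an (agent, message) pair.
interface PlanDecision {
  agentId: string;
  roomId: string;
  messageId: string;
  action: 'reply' | 'ignore' | 'react';
  complexity: number;                        // 0-100 score
  pipeline: 'fast' | 'balanced' | 'quality'; // execution pipeline
}
```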
Time slice processing shows beautiful ANSI art:
```
📦 TIME SLICE PROCESSING 📦
Slice ID: 42
Rooms: 3
Messages: 12
Agents: 8 (Alice, Bob, Charlie, ...)

📋 PLANNING COMPLETE 📋
Decisions: 24
Time: 234ms
Tokens: 1,456 in, 178 out
Actions: reply:12, ignore:10, react:2

🚀 EXECUTING PLANS 🚀
Executing 14 response(s)...

✅ TIME SLICE PROCESSED ✅
Time: 567ms
Slice ID: 42
```
Before adding agents to a time slice, the plugin applies several filters:
- Topic Relevance: Agents with 0% relevance (and not mentioned) are skipped
- Active Status: Only running agents are included in planning
- Flood Protection: During floods (20+ msg/5s), only agents with 30%+ relevance participate
- Priority Express Lane: Critical messages (DMs, @mentions, "urgent" keywords) bypass batching
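Put together, the gate might look like this (the thresholds mirror the list above; the function itself is illustrative):

```typescript
// Sketch: should this agent join the current time slice?
function includeAgent(agent: {
  running: boolean;
  relevance: number;    // 0..1 topic relevance
  mentioned: boolean;
  floodActive: boolean; // 20+ msg in 5s
}): boolean {
  if (!agent.running) return false;                            // Active Status
  if (agent.relevance === 0 && !agent.mentioned) return false; // Topic Relevance
  if (agent.floodActive && agent.relevance < 0.3) return false; // Flood Protection
  return true;
}
```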
Priority Scoring (0-100):
- Direct messages: +100
- Agent @mentions: +100
- Replies to agent: +80
- Urgent keywords: +80
- Voice messages: +100
Critical priority (≥80 score) → Immediate processing, skip batch
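A direct translation of that scoring into code (the cap at 100 and the additive combination are assumptions):

```typescript
// Sketch: score a message, then route critical ones to the express lane.
function priorityScore(msg: {
  isDM: boolean;
  mentionsAgent: boolean;
  repliesToAgent: boolean;
  hasUrgentKeyword: boolean;
  isVoice: boolean;
}): number {
  let score = 0;
  if (msg.isDM) score += 100;
  if (msg.mentionsAgent) score += 100;
  if (msg.repliesToAgent) score += 80;
  if (msg.hasUrgentKeyword) score += 80;
  if (msg.isVoice) score += 100;
  return Math.min(score, 100); // clamp to the 0-100 scale
}

const isCritical = (score: number): boolean => score >= 80; // skip the batch
```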
Messages are automatically clustered by semantic similarity:
- Lightweight word-overlap clustering (no LLM calls)
- Groups "hello", "hi", "hey" together for efficient processing
- 60% similarity threshold for clustering
- 60-second cluster lifetime
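Word-overlap similarity can be as simple as Jaccard over word sets (the exact metric, and any normalization that lets greetings like "hi"/"hello" match, are assumptions):

```typescript
// Sketch: Jaccard word-overlap similarity, no LLM involved.
function wordOverlap(a: string, b: string): number {
  const wa = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const wb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  const shared = [...wa].filter((w) => wb.has(w)).length;
  const union = new Set([...wa, ...wb]).size;
  return union === 0 ? 0 : shared / union;
}
// Messages join a cluster at >= 0.6 similarity; clusters expire
// after 60 seconds (both values from the list above).
```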
The system learns your traffic patterns automatically:
- Tracks hourly message rates (0-23 hours)
- Builds confidence over time (100 samples = 100% confidence)
- Predicts load for upcoming hours
- Recommendations: `warm` (prepare), `normal`, `cool` (scale down)
Example: If your Discord is always busy 9am-5pm, the system learns this and can pre-optimize resources.
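A running mean per hour of day is enough to reproduce the behavior described (the in-memory storage and the `cool` cutoff are assumptions; the 100-sample confidence rule is from the list above):

```typescript
// Sketch: learn an hourly load profile and recommend an action.
const hourlyRate = new Array<number>(24).fill(0);
const hourlySamples = new Array<number>(24).fill(0);

function recordHour(hour: number, msgPerSec: number): void {
  hourlySamples[hour]++;
  // Incremental running mean: one number per hour instead of raw samples.
  hourlyRate[hour] += (msgPerSec - hourlyRate[hour]) / hourlySamples[hour];
}

function recommend(hour: number): 'warm' | 'normal' | 'cool' {
  const confidence = Math.min(hourlySamples[hour] / 100, 1); // 100 samples = 100%
  if (confidence < 0.25) return 'normal';    // not enough data yet (assumed cutoff)
  if (hourlyRate[hour] >= 2) return 'warm';  // expected high load
  if (hourlyRate[hour] < 0.1) return 'cool'; // expected quiet (assumed cutoff)
  return 'normal';
}
```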
The ResourceTracker service monitors:
- Token usage (Input/Output)
- Estimated cost
- Latency (E2E, LLM, DB)
- I/O operations
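The per-batch numbers could be represented roughly like this (field names are illustrative, not the ResourceTracker API):

```typescript
// Sketch: the metrics tracked for each processed batch.
interface BatchMetrics {
  tokensIn: number;
  tokensOut: number;
  estimatedCostUsd: number;
  latency: { e2eMs: number; llmMs: number; dbMs: number };
  ioOps: number; // database and network operations
}
```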
After each batch processing (when there are actual metrics to show), you'll see a loud efficiency report:
```
╔══════════════════════════════════════════════════════════════╗
║               🚀 AUTONOMOUS EFFICIENCY REPORT 🚀               ║
╠══════════════════════════════════════════════════════════════╣
║ 📊 Messages Served:        45                                 ║
║ 🧠 LLM Calls Made:         8                                  ║
║ 🎯 Messages per LLM Call:  5.63                               ║
║ 💰 Cost per Message (¢):   2.45                               ║
║ 📦 Batches Processed:      3                                  ║
╚══════════════════════════════════════════════════════════════╝
```
Note: The report only displays when there are messages or batches to report on - no empty reports!
Without plugin-autonomous:
- 8 agents × 10 messages × 5 rooms = 400 LLM calls
With per-room batching (v1):
- 5 rooms × 1 planning call = 5 LLM calls (80× reduction)
With time-sliced batching (v2):
- 1 time slice × 1 planning call = 1 LLM call (400× reduction!)
Scenario: 8 agents monitoring 3 busy Discord channels
Per-Message (No Batching):
- 50 messages arrive in 10 seconds
- 50 msg × 8 agents = 400 LLM calls
- Cost: ~$2.00
Time-Sliced Batching:
- Messages collected in 20 time slices (500ms each)
- 20 slices × 1 LLM call = 20 LLM calls
- Cost: ~$0.10
- Savings: 95% ($1.90)
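The arithmetic behind those numbers, assuming a flat ~$0.005 per LLM call purely for illustration:

```typescript
// Worked numbers from the scenario above (assumed flat cost per call).
const COST_PER_CALL_USD = 0.005;
const perMessageCalls = 50 * 8;    // 400 calls without batching
const slices = (10 * 1000) / 500;  // 10s of traffic / 500ms = 20 slices
const batchedCalls = slices;       // one planning call per slice
const savedUsd = (perMessageCalls - batchedCalls) * COST_PER_CALL_USD;
// savedUsd === 1.90 -> 95% of the original ~$2.00
```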
Time slices enable true multi-room coordination:
```
Time Slice #1 (500ms):
  Room A: 3 messages
  Room B: 2 messages
  Room C: 1 message

→ Single LLM call coordinates all 8 agents across all 3 rooms
→ 6 messages, 8 agents, 1 LLM call = 48 decisions
```
- ARCHITECTURE.md - Technical architecture and data flow
- EXAMPLES.md - Real-world usage patterns and examples
- CONFIG.md - Complete configuration reference
- DESIGN_DECISIONS.md - Architectural rationale
```bash
bun run build
bun run test
```