@elizaos/plugin-autonomous

A high-performance plugin for ElizaOS that significantly reduces LLM costs through intelligent message batching, multi-agent coordination, and strategic planning.

Features

  • 🚀 Extreme Cost Reduction: Time-sliced batching reduces LLM calls by 20-40× across ALL rooms and agents simultaneously.
  • 🕒 Time-Sliced Batching: Processes messages from multiple rooms in 50-500ms windows with a single LLM call.
  • 🧠 Intelligent Planning: Assesses message complexity, knowledge requirements, and token budgets before generating responses.
  • Always-On Batching: All messages use time-sliced batching for optimal multi-agent coordination (critical messages can bypass for instant response).
  • 🤖 Multi-Agent Coordination: Single planning call coordinates responses for all agents across all active rooms.
  • 🎯 Smart Filtering: Topic relevance and flood detection prevent unnecessary processing.
  • 🚨 Priority Routing: Critical messages bypass batching for immediate response (DMs, mentions, urgent keywords).
  • 🔮 Predictive Pre-Warming: Learns daily patterns to predict and prepare for high-traffic periods.
  • 🗂️ Semantic Clustering: Groups similar messages to reduce redundant processing.
  • 💰 Budget Pooling: All agents share a global budget pool for optimal resource allocation.
  • 📊 Rich Metrics: Detailed tracking of token usage, costs, latency, and I/O.
  • 🛡️ Durable Queuing: Messages are persisted on enqueue so batches survive process restarts.

Installation

bun add @elizaos/plugin-autonomous

Configuration

Add the plugin to your character configuration:

{
  "plugins": ["@elizaos/plugin-autonomous"]
}

Environment Variables (Optional)

Configure behavior via .env or runtime settings:

Setting                          Description                                                            Default
AUTONOMOUS_BATCH_THRESHOLD       Global msg/s rate that counts as "high load" (batching is always on)  2
AUTONOMOUS_BATCH_MAX_SIZE        Minimum messages per time slice (efficiency gate)                     2
AUTONOMOUS_WORKER_CONCURRENCY    Parallel time-slice processors                                        4
AUTONOMOUS_BUDGET_DAILY_USD      Maximum daily spend in USD (omit for unlimited)                       undefined
AUTONOMOUS_DEADLINE_MS           Maximum processing time per message (ms)                              undefined
AUTONOMOUS_BUDGET_MODE           Behavior when over budget (dynamic | reject)                          dynamic
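For example, a .env that caps daily spend and keeps the default graceful-degradation mode might look like this (the values are illustrative, not recommendations):

# Illustrative values; tune to your workload
AUTONOMOUS_BATCH_THRESHOLD=2
AUTONOMOUS_BATCH_MAX_SIZE=2
AUTONOMOUS_WORKER_CONCURRENCY=4
AUTONOMOUS_BUDGET_DAILY_USD=5
AUTONOMOUS_DEADLINE_MS=10000
AUTONOMOUS_BUDGET_MODE=dynamic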

Time Slice Behavior:

  • Slice duration dynamically adjusts from 50ms (high load) to 500ms (low load)
  • Minimum 2 messages required before processing (prevents inefficient singleton batches)
  • Messages from ALL rooms are collected in each time slice
  • Single LLM call processes all rooms simultaneously

Advanced Features:

  • Priority routing enabled by default (AUTONOMOUS_PRIORITY_ENABLED=true)
  • Semantic clustering tracks similar messages automatically
  • Predictive learning builds hourly patterns for load anticipation

Budget Modes

  • dynamic (Default): Automatically downgrades to cheaper/faster models when budget is tight.
  • reject: Queues or rejects messages when budget is exceeded.
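A minimal sketch of how dynamic mode could pick a model tier from the remaining daily budget. The tier names and thresholds here are assumptions for illustration, not the plugin's actual internals:

// Sketch only: tier names and thresholds are hypothetical.
type ModelTier = 'quality' | 'balanced' | 'fast';

function pickTier(spentUsd: number, dailyBudgetUsd?: number): ModelTier {
  if (dailyBudgetUsd === undefined) return 'quality'; // no cap configured
  const remaining = 1 - spentUsd / dailyBudgetUsd;
  if (remaining > 0.5) return 'quality';  // plenty of budget left
  if (remaining > 0.1) return 'balanced'; // getting tight: cheaper model
  return 'fast';                          // nearly exhausted: cheapest model
}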

How It Works

1. Always-On Time-Sliced Batching

All messages are processed through time-sliced batching for optimal multi-agent coordination. Even in "slow" rooms with a single message, batching allows multiple agents to coordinate their responses in a single LLM call, which is far more efficient than per-agent planning.

Exception: Critical priority messages (DMs, urgent mentions) can bypass batching for instant response via the "express lane."

The plugin monitors message velocity across all agents and rooms for metrics and logging:

  • Low Load (< 2 msg/s): Quiet batching (no banner).
  • High Load (≥ 2 msg/s): Shows colorful console output.

When high load is detected, you'll see:

🔥 HIGH LOAD BATCHING 🔥
   Rate: 3.2 msg/s (threshold: 2)
   Queue Depth: 8 messages
   Room Count: 2 active rooms
   → Queuing message abc123... for batch processing
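The routing decision itself is simple: critical messages take the express lane, everything else is queued for the next slice. A rough sketch, with hypothetical type and function names:

// Hypothetical shapes; the plugin's real types may differ.
interface ScoredMessage {
  id: string;
  priority: number; // 0-100, see "Priority Scoring" below
}

function route(
  msg: ScoredMessage,
  enqueue: (m: ScoredMessage) => void,
  processNow: (m: ScoredMessage) => void,
): void {
  if (msg.priority >= 80) {
    processNow(msg); // express lane: DMs, mentions, urgent keywords
  } else {
    enqueue(msg);    // everything else waits for the next time slice
  }
}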

2. Time-Sliced Batching

Messages are collected into time slices (50-500ms windows) across all rooms simultaneously:

⏰ TIME SLICE READY ⏰
   Slice ID: 42
   Messages: 12
   Rooms: 3
   Agents: 8
   Wait Time: 150ms
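One way to read the 50-500ms behavior is a window that shrinks as the message rate climbs toward the high-load threshold. A sketch under that assumption (the exact formula is internal to the plugin):

// Sketch: interpolate slice duration from observed load.
const MIN_SLICE_MS = 50;  // high load: flush quickly
const MAX_SLICE_MS = 500; // low load: wait longer to fill the batch

function sliceDurationMs(msgPerSec: number, highLoadRate = 2): number {
  const load = Math.min(msgPerSec / highLoadRate, 1); // 0 (idle) .. 1 (high)
  return Math.round(MAX_SLICE_MS - load * (MAX_SLICE_MS - MIN_SLICE_MS));
}

// sliceDurationMs(0) -> 500   sliceDurationMs(2) -> 50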

3. Multi-Room Planning

The planner analyzes all messages from all rooms and all active agents in a single LLM call:

  • Decides who should respond, ignore, or react
  • Assigns complexity scores (0-100)
  • Selects optimal pipelines (Fast/Balanced/Quality)
  • Filters inactive agents automatically

Time slice processing prints a colorful console trace:

📦 TIME SLICE PROCESSING 📦
   Slice ID: 42
   Rooms: 3
   Messages: 12
   Agents: 8 (Alice, Bob, Charlie, ...)

📋 PLANNING COMPLETE 📋
   Decisions: 24
   Time: 234ms
   Tokens: 1,456 in, 178 out
   Actions: reply:12, ignore:10, react:2

🚀 EXECUTING PLANS 🚀
   Executing 14 response(s)...

✅ TIME SLICE PROCESSED ✅
   Time: 567ms
   Slice ID: 42
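The per-agent, per-message decisions inside that plan might be modeled roughly like this (the field names are illustrative, not the plugin's actual types):

// Hypothetical decision shape returned by the single planning call.
interface PlanDecision {
  agentId: string;
  messageId: string;
  action: 'reply' | 'ignore' | 'react';
  complexity: number;                        // 0-100 score
  pipeline: 'fast' | 'balanced' | 'quality'; // chosen per decision
}

// After filtering, one planning call yields a PlanDecision for each
// relevant (agent, message) pair across every room in the slice.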

4. Smart Filtering & Priority Routing

Before adding agents to a time slice, the plugin applies several filters:

  • Topic Relevance: Agents with 0% relevance (and not mentioned) are skipped
  • Active Status: Only running agents are included in planning
  • Flood Protection: During floods (20+ msg/5s), only agents with 30%+ relevance participate
  • Priority Express Lane: Critical messages (DMs, @mentions, "urgent" keywords) bypass batching

Priority Scoring (0-100):

  • Direct messages: +100
  • Agent @mentions: +100
  • Replies to agent: +80
  • Urgent keywords: +80
  • Voice messages: +100

Critical priority (≥80 score) → Immediate processing, skip batch
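The scoring table reduces to a handful of additive checks; a sketch, assuming hypothetical detection helpers and clamping the sum to the 0-100 range:

// Sketch of the priority table; the is*/has* fields are assumed.
function priorityScore(msg: {
  isDM: boolean;
  mentionsAgent: boolean;
  repliesToAgent: boolean;
  hasUrgentKeyword: boolean;
  isVoice: boolean;
}): number {
  let score = 0;
  if (msg.isDM) score += 100;
  if (msg.mentionsAgent) score += 100;
  if (msg.repliesToAgent) score += 80;
  if (msg.hasUrgentKeyword) score += 80;
  if (msg.isVoice) score += 100;
  return Math.min(score, 100); // >= 80 is "critical": skips the batch
}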

4.5 Semantic Clustering

Messages are automatically clustered by semantic similarity:

  • Lightweight word-overlap clustering (no LLM calls)
  • Groups "hello", "hi", "hey" together for efficient processing
  • 60% similarity threshold for clustering
  • 60-second cluster lifetime
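Word-overlap similarity with a 60% threshold can be sketched as a Jaccard comparison over word sets. This is a plausible reading of "lightweight word-overlap clustering", not the plugin's exact metric:

// Jaccard word overlap: |A ∩ B| / |A ∪ B|, clustered at >= 0.6.
function wordOverlap(a: string, b: string): number {
  const wa = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const wb = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  let shared = 0;
  for (const w of wa) if (wb.has(w)) shared++;
  const union = wa.size + wb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// "hey there" vs "hey there friend": 2 shared / 3 total = 0.67 -> same cluster
const sameCluster = wordOverlap('hey there', 'hey there friend') >= 0.6;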

4.6 Predictive Pattern Learning

The system learns your traffic patterns automatically:

  • Tracks hourly message rates (0-23 hours)
  • Builds confidence over time (100 samples = 100% confidence)
  • Predicts load for upcoming hours
  • Recommendations: warm (prepare), normal, cool (scale down)

Example: If your Discord is always busy 9am-5pm, the system learns this and can pre-optimize resources.
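A sketch of hourly rate tracking with sample-count confidence, matching the "100 samples = 100% confidence" rule above (class and method names are illustrative):

// Per-hour running average of message rate, with confidence from sample count.
class HourlyPattern {
  private avg = new Array<number>(24).fill(0);
  private samples = new Array<number>(24).fill(0);

  record(hour: number, msgPerSec: number): void {
    const n = ++this.samples[hour];
    this.avg[hour] += (msgPerSec - this.avg[hour]) / n; // incremental mean
  }

  predict(hour: number): { rate: number; confidence: number } {
    return {
      rate: this.avg[hour],
      confidence: Math.min(this.samples[hour] / 100, 1), // 100 samples = 100%
    };
  }
}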

5. Resource Tracking & Efficiency Reports

The ResourceTracker service monitors:

  • Token usage (Input/Output)
  • Estimated cost
  • Latency (E2E, LLM, DB)
  • I/O operations

After each batch is processed (when there are actual metrics to show), you'll see a loud efficiency report:

╔══════════════════════════════════════════════════════════════╗
║  🚀 AUTONOMOUS EFFICIENCY REPORT 🚀                         ║
╠══════════════════════════════════════════════════════════════╣
║  📊 Messages Served:     45                                 ║
║  🧠 LLM Calls Made:       8                                 ║
║  🎯 Messages per LLM Call: 5.63                             ║
║  💰 Cost per Message (¢): 2.45                              ║
║  📦 Batches Processed:    3                                 ║
╚══════════════════════════════════════════════════════════════╝

Note: The report only displays when there are messages or batches to report on - no empty reports!
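The headline numbers in that report reduce to simple ratios; a sketch (the stats shape is illustrative):

// How the report's headline ratios are derived.
interface BatchStats {
  messages: number;
  llmCalls: number;
  costUsd: number;
  batches: number;
}

function efficiency(s: BatchStats) {
  return {
    messagesPerLlmCall: s.messages / s.llmCalls,         // 45 / 8 = 5.63
    costPerMessageCents: (s.costUsd * 100) / s.messages, // in ¢
    batches: s.batches,
  };
}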

Scaling Benefits

Traditional vs Time-Sliced Batching

Without plugin-autonomous:

  • 8 agents × 10 messages × 5 rooms = 400 LLM calls

With per-room batching (v1):

  • 5 rooms × 1 planning call = 5 LLM calls (80× reduction)

With time-sliced batching (v2):

  • 1 time slice × 1 planning call = 1 LLM call (400× reduction!)

Real-World Example: Discord with 8 Agents

Scenario: 8 agents monitoring 3 busy Discord channels

Per-Message (No Batching):

  • 50 messages arrive in 10 seconds
  • 50 msg × 8 agents = 400 LLM calls
  • Cost: ~$2.00

Time-Sliced Batching:

  • Messages collected in 20 time slices (500ms each)
  • 20 slices × 1 LLM call = 20 LLM calls
  • Cost: ~$0.10
  • Savings: 95% ($1.90)

Multi-Room Coordination

Time slices enable true multi-room coordination:

Time Slice #1 (500ms):
  Room A: 3 messages
  Room B: 2 messages  
  Room C: 1 message
  → Single LLM call coordinates all 8 agents across all 3 rooms
  → 6 messages, 8 agents, 1 LLM call = 48 decisions

Development

Build

bun run build

Test

bun run test
