ShieldCortex: Architecture

Overview

ShieldCortex is a security layer and brain-like memory system for AI agents. It combines persistent memory (STM/LTM/episodic) with a 6-layer defence pipeline that scans every memory write for threats.

Agent → ShieldCortex → Memory Store (SQLite)
         ↓
    Tier 1 (sync, 1-5ms):
    Trust → Firewall → Sensitivity → Fragmentation → Credential → Audit
         ↓ (if QUARANTINE + verify enabled)
    Tier 2 (async, 500-2000ms):
    Cloud LLM Verification → verdict → optional QUARANTINE→BLOCK upgrade

Memory Model

Short-Term Memory (STM)

  • Scope: Current coding session
  • Decay: Fast (hours)
  • Limit: 100 memories max

Long-Term Memory (LTM)

  • Scope: Cross-session, persistent
  • Content: Architecture decisions, code patterns, user preferences
  • Decay: Slow (weeks/months), reinforced by access
  • Limit: 1,000 memories max

Episodic Memory

  • Scope: Specific events/outcomes
  • Content: "When I tried X, Y happened", successful solutions
  • Decay: Based on utility

Salience Detection

| Factor | Weight | Description |
|---|---|---|
| Explicit request | 1.0 | User says "remember this" |
| Architecture decision | 0.9 | System design choices |
| Error resolution | 0.8 | Debugging breakthroughs |
| Code pattern | 0.7 | Reusable implementation patterns |
| User preference | 0.7 | Coding style, tool preferences |
| Repeated mention | 0.6 | Topics that come up multiple times |
| File location | 0.5 | Where important code lives |
| Temporary context | 0.2 | Current debugging state |

Base salience: 0.25. Deletion threshold: 0.2.

Temporal Decay & Reinforcement

  • Decay: score = base_score * (0.995 ^ hours_since_access)
  • Reinforcement: Each access boosts score by 1.2x
  • Consolidation: High-access STM → LTM (runs every 4 hours)
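The decay and reinforcement rules above can be sketched as a few pure functions. This is a minimal illustration, not the code in src/memory/decay.ts: the function names are hypothetical, the cap at 1.0 after reinforcement is an assumption, and the sketch assumes the deletion threshold is compared against the decayed score.

```typescript
// Illustrative sketch only; names are hypothetical, constants come from the text above.
const DECAY_RATE_PER_HOUR = 0.995;  // score = base_score * (0.995 ^ hours_since_access)
const REINFORCEMENT_FACTOR = 1.2;   // each access boosts the score by 1.2x
const DELETION_THRESHOLD = 0.2;     // deletion threshold from the salience section

/** Applies the exponential decay formula for a given idle time in hours. */
function decayedScore(baseScore: number, hoursSinceAccess: number): number {
  return baseScore * Math.pow(DECAY_RATE_PER_HOUR, hoursSinceAccess);
}

/** Boosts a score on access. Assumption: scores are capped at 1.0. */
function reinforce(score: number): number {
  return Math.min(1.0, score * REINFORCEMENT_FACTOR);
}

/**
 * Idle hours until a score decays to the deletion threshold.
 * Assumption: the threshold is checked against the decayed score.
 */
function hoursUntilDeletion(baseScore: number): number {
  if (baseScore <= DELETION_THRESHOLD) return 0;
  return Math.log(DELETION_THRESHOLD / baseScore) / Math.log(DECAY_RATE_PER_HOUR);
}
```

Under these assumptions, a memory at the base salience of 0.25 that is never accessed reaches the 0.2 threshold after roughly 44.5 hours, while one starting at 1.0 takes roughly 321 hours (about 13 days).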

Defence Pipeline

Every addMemory() call runs through a tiered defence pipeline:

1. Trust Scorer (src/defence/trust/)

Scores the source of the memory write:

| Source | Trust Score |
|---|---|
| user | 1.0 |
| cli | 0.9 |
| hook | 0.8 |
| api | 0.7 |
| agent | 0.5 |
| web | 0.3 |
| unknown | 0.1 |

Low trust (< 0.5) escalates detections to BLOCK in balanced mode.
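As a rough sketch, the trust hierarchy amounts to a lookup plus a threshold check. The names below are hypothetical rather than taken from source-scorer.ts, and falling back to the unknown score for unrecognised source types is an assumption.

```typescript
// Illustrative sketch only; scores are copied from the trust table above.
const TRUST_SCORES = new Map<string, number>([
  ["user", 1.0],
  ["cli", 0.9],
  ["hook", 0.8],
  ["api", 0.7],
  ["agent", 0.5],
  ["web", 0.3],
  ["unknown", 0.1],
]);

const LOW_TRUST_THRESHOLD = 0.5;

/** Returns the trust score for a source type. Assumption: unrecognised types score as "unknown". */
function trustScoreFor(sourceType: string): number {
  return TRUST_SCORES.get(sourceType) ?? 0.1;
}

/** Low trust is strictly below 0.5. */
function isLowTrust(score: number): boolean {
  return score < LOW_TRUST_THRESHOLD;
}
```

Because the comparison is strict, an agent source (0.5) does not count as low trust, whereas web (0.3) and unknown (0.1) do.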

2. Memory Firewall (src/defence/firewall/)

Four detection modules run in parallel:

  • Instruction Detector: prompt injection, fake system prompts, hidden instructions, social engineering, delimiter attacks, frontmatter injection
  • Privilege Detector: credential references, system commands, destructive filesystem ops, network exfiltration, external URLs
  • Encoding Detector: base64, hex (including plain continuous hex), URL encoding, zero-width chars, RTL override, Unicode homoglyphs
  • Anomaly Scorer: entropy analysis, length anomalies, repetition patterns

Modes:

  • strict: any detection → BLOCK
  • balanced: context-aware; instruction injection → QUARANTINE (low trust → BLOCK), encoding decoded and re-scanned, zero-width/RTL always quarantined
  • permissive: allow all, populate indicators only
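The three modes can be read as a small decision function. The sketch below is a simplification under several assumptions: the "ALLOW" result name and the function name are hypothetical, every detection kind in balanced mode is treated like instruction injection, low-trust escalation is assumed to take precedence over the "always quarantined" rule, and the decode-and-re-scan step for encoded content is left out.

```typescript
// Illustrative sketch only; not the logic in src/defence/firewall/index.ts.
type FirewallMode = "strict" | "balanced" | "permissive";
type FirewallResult = "ALLOW" | "QUARANTINE" | "BLOCK"; // "ALLOW" is an assumed name

const LOW_TRUST_THRESHOLD = 0.5;

/**
 * Maps a list of detection labels to a verdict.
 * Assumption: in balanced mode every detection is quarantined unless the source is low trust.
 */
function decideFirewallResult(
  mode: FirewallMode,
  detections: readonly string[],
  trustScore: number,
): FirewallResult {
  if (detections.length === 0) return "ALLOW";
  if (mode === "permissive") return "ALLOW"; // indicators are recorded, nothing is stopped
  if (mode === "strict") return "BLOCK";     // any detection blocks
  // balanced: quarantine by default, escalate to BLOCK for low-trust sources
  return trustScore < LOW_TRUST_THRESHOLD ? "BLOCK" : "QUARANTINE";
}
```

For example, an instruction-injection detection from an api source (0.7) would be quarantined in balanced mode, while the same detection from a web source (0.3) would be blocked.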

3. Sensitivity Classifier (src/defence/sensitivity/)

Classifies content as PUBLIC / INTERNAL / CONFIDENTIAL / RESTRICTED. Detects passwords, API keys, PII, credentials. RESTRICTED content is blocked. CONFIDENTIAL is redacted on recall.

4. Fragmentation Detector (src/defence/fragmentation/)

Cross-references new memories with recent ones to catch multi-step assembly attacks:

  • Entity extraction from content
  • Temporal analysis of related memories
  • Assembly pattern detection (fragments that combine into exploits)

5. Audit Logger (src/defence/audit/)

Full forensic trail of every memory operation: source, trust score, firewall result, sensitivity level, anomaly score, threat indicators, blocked patterns, duration.

6. Credential Leak Detection (src/defence/credential-leak/)

Scans content for 25+ credential patterns across 11 providers (AWS, GitHub, Stripe, etc.). Entropy analysis catches generic secrets. Blocked credentials upgrade the firewall result to BLOCK.
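Entropy analysis for generic secrets is commonly done with Shannon entropy over the characters of a token. The sketch below shows that general technique; whether ShieldCortex computes it this way is not stated here, and the function names, the 20-character minimum, and the 3.5 bits-per-character threshold are all assumptions chosen for illustration.

```typescript
// Illustrative sketch only; thresholds are assumptions, not values from credential-leak/index.ts.

/** Shannon entropy in bits per character, counted over UTF-16 code units. */
function shannonEntropy(text: string): number {
  if (text.length === 0) return 0;
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    counts.set(ch, (counts.get(ch) ?? 0) + 1);
  }
  let entropy = 0;
  counts.forEach((count) => {
    const p = count / text.length;
    entropy -= p * Math.log2(p);
  });
  return entropy;
}

/** Flags long, high-entropy tokens as possible secrets (hypothetical thresholds). */
function looksLikeGenericSecret(token: string, minLength = 20, minEntropy = 3.5): boolean {
  return token.length >= minLength && shannonEntropy(token) >= minEntropy;
}
```

A token made of one repeated character has zero entropy and is never flagged, while a 20-character token with no repeated characters has an entropy of log2(20), about 4.32 bits per character, and is flagged under these thresholds.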

Tier 2: LLM Verification (src/cloud/verify.ts)

Optional async layer for content that Tier 1 flags as QUARANTINE. Submits content to /v1/verify for cloud-based LLM analysis (Claude 3.5 Haiku).

  • Fail-open: if the LLM is unavailable or times out, the Tier 1 verdict stands unchanged
  • Advisory mode (default): fire-and-forget HTTP request, returns { status: 'pending' } immediately
  • Enforce mode: awaits the LLM verdict; upgrades QUARANTINE → BLOCK if verdict is THREAT with confidence >= 0.7
  • Credentials are redacted before sending to the LLM
  • Configurable timeout (default 5000ms, range 1000-30000ms)
  • Gated by: cloud enabled + API key set + verify enabled + firewall result matches triggers

Config (~/.shieldcortex/config.json):

{
  "verifyEnabled": true,
  "verifyMode": "advisory",
  "verifyTriggers": ["QUARANTINE"],
  "verifyTimeoutMs": 5000
}

API: runDefencePipelineWithVerify() wraps the sync pipeline and adds optional verification. Returns DefencePipelineResultWithVerify which extends the standard result with a verification field.
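The fail-open and enforce-mode rules can be summarised as a pure function over the Tier 1 result and an optional LLM verdict. This is a sketch of the decision logic only, not the body of runDefencePipelineWithVerify(): the type and function names are hypothetical, a missing verdict is modelled as null, and clamping out-of-range timeouts (rather than rejecting them) is an assumption.

```typescript
// Illustrative sketch only; names and shapes are hypothetical.
type FirewallResult = "ALLOW" | "QUARANTINE" | "BLOCK"; // "ALLOW" is an assumed name
type VerifyMode = "advisory" | "enforce";

interface LlmVerdict {
  verdict: string;    // the text above names "THREAT"; other values are not listed
  confidence: number; // 0..1
}

const THREAT_CONFIDENCE_THRESHOLD = 0.7;

/** Combines the Tier 1 result with a Tier 2 verdict; null means unavailable or timed out. */
function resolveFinalResult(
  tier1: FirewallResult,
  mode: VerifyMode,
  verdict: LlmVerdict | null,
): FirewallResult {
  if (verdict === null) return tier1;  // fail-open: Tier 1 stands unchanged
  if (mode !== "enforce") return tier1; // advisory mode never changes the result
  const isConfidentThreat =
    verdict.verdict === "THREAT" && verdict.confidence >= THREAT_CONFIDENCE_THRESHOLD;
  return tier1 === "QUARANTINE" && isConfidentThreat ? "BLOCK" : tier1;
}

/** Keeps verifyTimeoutMs inside 1000-30000 ms, defaulting to 5000 ms. Assumption: clamp, not reject. */
function clampVerifyTimeoutMs(requested?: number): number {
  if (requested === undefined || Number.isNaN(requested)) return 5000;
  return Math.min(30000, Math.max(1000, requested));
}
```

Keeping the decision separate from the HTTP call makes the fail-open behaviour easy to check: any failure path only has to produce null.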

Knowledge Graph (src/graph/)

Entities and relationships automatically extracted from memories:

  • Pattern-based entity extraction (files, tools, languages, concepts, people, services)
  • Entity resolution with fuzzy matching
  • Subject-predicate-object triples
  • Graph traversal and path finding
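To make the triple and path-finding ideas concrete, here is a small sketch that stores subject-predicate-object triples and finds a shortest path between two entities with breadth-first search. The Triple shape, the function name, and the choice to traverse edges in both directions are assumptions for illustration; the entity names in the usage example are made up.

```typescript
// Illustrative sketch only; not the implementation in src/graph/.
interface Triple {
  subject: string;
  predicate: string;
  object: string;
}

/** Breadth-first search over triples, treating each triple as an undirected edge (assumption). */
function findEntityPath(triples: Triple[], start: string, goal: string): string[] | null {
  const neighbours = new Map<string, string[]>();
  const link = (from: string, to: string): void => {
    const list = neighbours.get(from);
    if (list) list.push(to);
    else neighbours.set(from, [to]);
  };
  for (const t of triples) {
    link(t.subject, t.object);
    link(t.object, t.subject);
  }

  // previous[node] records how each node was first reached, for path reconstruction.
  const previous = new Map<string, string | null>([[start, null]]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const node = queue.shift() as string;
    if (node === goal) {
      const path: string[] = [];
      let cursor: string | null = node;
      while (cursor !== null) {
        path.unshift(cursor);
        cursor = previous.get(cursor) ?? null;
      }
      return path;
    }
    for (const next of neighbours.get(node) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // goal is not reachable from start
}

// Hypothetical usage with made-up entities:
const exampleTriples: Triple[] = [
  { subject: "auth.ts", predicate: "uses", object: "JWT" },
  { subject: "JWT", predicate: "signed_by", object: "jose" },
];
const examplePath = findEntityPath(exampleTriples, "auth.ts", "jose");
```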

Database Schema

SQLite with FTS5 full-text search. Location: ~/.shieldcortex/memories.db

CREATE TABLE memories (
  id INTEGER PRIMARY KEY,
  type TEXT NOT NULL,           -- 'short_term', 'long_term', 'episodic'
  category TEXT,                -- 'architecture', 'pattern', 'preference', etc.
  title TEXT NOT NULL,
  content TEXT NOT NULL,
  project TEXT,
  tags TEXT,                    -- JSON array
  salience REAL DEFAULT 0.5,
  access_count INTEGER DEFAULT 0,
  last_accessed TIMESTAMP,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  decayed_score REAL,
  metadata TEXT,                -- JSON
  trust_score REAL,
  sensitivity_level TEXT,
  source TEXT                   -- JSON { type, identifier }
);

CREATE VIRTUAL TABLE memories_fts USING fts5(
  title, content, tags,
  content='memories',
  content_rowid='id'
);

File Structure

shieldcortex/
├── src/
│   ├── index.ts                    # MCP server entry point
│   ├── server.ts                   # MCP server setup, tool definitions
│   ├── database/
│   │   └── init.ts                 # SQLite setup, schema, transactions
│   ├── memory/
│   │   ├── types.ts                # Memory type definitions
│   │   ├── store.ts                # Core CRUD operations, links
│   │   ├── salience.ts             # Salience scoring
│   │   ├── decay.ts                # Temporal decay logic
│   │   ├── consolidate.ts          # STM → LTM consolidation
│   │   ├── similarity.ts           # Semantic similarity
│   │   ├── activation.ts           # Spreading activation
│   │   └── contradiction.ts        # Contradiction detection
│   ├── cloud/
│   │   ├── config.ts               # Cloud + verify config (~/.shieldcortex/config.json)
│   │   ├── cli.ts                  # CLI flag handlers (cloud + verify)
│   │   ├── sync.ts                 # Fire-and-forget audit sync
│   │   └── verify.ts               # LLM verification HTTP client (Tier 2)
│   ├── defence/
│   │   ├── pipeline.ts             # Orchestrates all layers (sync + async verify)
│   │   ├── types.ts                # Defence type definitions
│   │   ├── firewall/
│   │   │   ├── index.ts            # Firewall orchestrator
│   │   │   ├── instruction-detector.ts
│   │   │   ├── privilege-detector.ts
│   │   │   ├── encoding-detector.ts
│   │   │   └── anomaly-scorer.ts
│   │   ├── trust/
│   │   │   ├── source-scorer.ts    # Trust hierarchy
│   │   │   └── recall-filter.ts    # Filter by trust on recall
│   │   ├── sensitivity/
│   │   │   ├── classifier.ts       # PUBLIC/INTERNAL/CONFIDENTIAL/RESTRICTED
│   │   │   ├── patterns.ts         # Detection patterns
│   │   │   └── redaction.ts        # Auto-redact secrets
│   │   ├── fragmentation/
│   │   │   ├── entity-extractor.ts
│   │   │   ├── temporal-analyzer.ts
│   │   │   └── assembly-detector.ts
│   │   ├── credential-leak/
│   │   │   └── index.ts            # 25+ credential patterns, entropy analysis
│   │   ├── audit/
│   │   │   ├── logger.ts           # Write audit entries
│   │   │   └── queries.ts          # Query audit trail
│   │   └── scanner/
│   │       └── scan-existing.ts    # Retroactive memory scanner
│   ├── integrations/
│   │   ├── langchain.ts            # ShieldCortexMemory + ShieldCortexGuard
│   │   └── index.ts
│   ├── graph/
│   │   ├── extract.ts              # Entity/triple extraction
│   │   ├── resolve.ts              # Entity resolution
│   │   └── backfill.ts             # Backfill existing memories
│   ├── api/
│   │   └── visualization-server.ts # REST API + WebSocket + defence endpoints
│   ├── tools/
│   │   ├── remember.ts
│   │   ├── recall.ts
│   │   ├── forget.ts
│   │   ├── context.ts
│   │   └── graph.ts
│   ├── context/
│   │   └── project-context.ts      # Project auto-detection
│   ├── service/
│   │   ├── install.ts              # Cross-platform service installer
│   │   └── templates.ts            # launchd/systemd/Windows templates
│   ├── setup/
│   │   ├── migrate.ts              # Claude Cortex → ShieldCortex migration
│   │   ├── settings-hooks.ts       # Auto-configure hooks
│   │   └── doctor.ts               # Installation health check
│   ├── worker/
│   │   └── brain-worker.ts         # Background processing
│   └── embeddings/
│       └── generator.ts            # Text embeddings
├── scripts/
│   ├── session-start-hook.mjs      # Auto-recall context
│   ├── pre-compact-hook.mjs        # Auto-extract before compaction
│   ├── session-end-hook.mjs        # Auto-extract on exit
│   └── stop-hook.mjs               # Check last response (opt-in)
├── hooks/
│   └── openclaw/cortex-memory/     # OpenClaw hook
├── dashboard/                      # Next.js 3D brain visualization
├── package.json
├── tsconfig.json
└── README.md

Anti-Bloat Safeguards

  • Max 100 STM, 1,000 LTM memories
  • 10KB content limit per memory
  • 100MB database hard limit
  • Auto-consolidation every 4 hours
  • Auto-vacuum after deletions
  • Decay scores persisted every 5 minutes