
# Technical Architecture

Marty McEnroe edited this page Feb 19, 2026 · 1 revision


AssemblyZero is a multi-agent orchestration platform built on LangGraph state machines, with cross-model adversarial review (Claude drafts, Gemini reviews) and RAG-like codebase intelligence.


## LLM Invocation Patterns

AssemblyZero uses four LLM providers across three distinct roles:

| Provider | Class | Use Case | Cost Model |
|---|---|---|---|
| Claude CLI (`claude -p`) | `ClaudeCLIProvider` | Drafting, implementation, all creative work | Free (Max subscription) |
| Anthropic API | `AnthropicProvider` | Automatic fallback when CLI fails | Per-token ($5-25/M output) |
| Fallback | `FallbackProvider` | CLI first (180s) → API (300s), transparent failover | Free first, paid if needed |
| Gemini | `GeminiProvider` | Adversarial review only (different model family) | Free (API quota) |
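
The failover row above can be sketched as a small wrapper. This is a minimal illustration, assuming each provider exposes a `call(prompt)` method; the real `FallbackProvider` interface may differ:

```python
import concurrent.futures

class FallbackProvider:
    """Sketch of CLI-first failover: try the free CLI path, then the paid API.

    `cli` and `api` are any objects with a `call(prompt) -> str` method; the
    timeouts mirror the 180 s / 300 s budgets described above.
    """

    def __init__(self, cli, api, cli_timeout=180, api_timeout=300):
        self.cli, self.api = cli, api
        self.cli_timeout, self.api_timeout = cli_timeout, api_timeout

    def call(self, prompt: str) -> str:
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=2)
        try:
            try:
                # CLI first: free via the Max subscription.
                return pool.submit(self.cli.call, prompt).result(timeout=self.cli_timeout)
            except Exception:
                # Timeout or CLI failure: transparent failover to the paid API.
                return pool.submit(self.api.call, prompt).result(timeout=self.api_timeout)
        finally:
            pool.shutdown(wait=False, cancel_futures=True)
```

The point of the design is that callers never see the failover: one `call()` is free in the common case and paid only when the CLI path breaks.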

### Why `claude -p` Instead of the API?

The Claude CLI (`claude -p`) is invoked as a subprocess with strict flags:

```bash
claude -p --output-format json --tools "" --strict-mcp-config --model claude-opus-4-6
```

* `--tools ""` — Disables all built-in tools (no file ops, no web access)
* `--strict-mcp-config` — Disables MCP tool loading
* `--output-format json` — Returns structured response with token counts and cost

This gives us free LLM calls via the Max subscription, deterministic side-effect-free behavior, and structured token accounting. The API exists only as a paid fallback for resilience.
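
As an illustration, the subprocess invocation might look like the following. `build_argv` and `call_claude_cli` are hypothetical names, and the sketch assumes the prompt is passed on stdin; the project's actual wrapper may differ:

```python
import json
import subprocess

def build_argv(model: str = "claude-opus-4-6") -> list[str]:
    """Assemble the strict, side-effect-free flag set described above."""
    return [
        "claude", "-p",
        "--output-format", "json",   # structured reply with token counts and cost
        "--tools", "",               # disable all built-in tools
        "--strict-mcp-config",       # disable MCP tool loading
        "--model", model,
    ]

def call_claude_cli(prompt: str, model: str = "claude-opus-4-6") -> dict:
    """Run the CLI once and parse its JSON envelope (180 s budget, per the table)."""
    proc = subprocess.run(build_argv(model), input=prompt,
                          capture_output=True, text=True, timeout=180)
    proc.check_returncode()
    return json.loads(proc.stdout)
```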

### Cross-Model Review

The key architectural decision: Claude and Gemini are different model families. When Gemini reviews Claude's work, it brings genuinely different training biases and blind spots. This is adversarial verification, not self-review.

See: ADR 0208: LLM Invocation Strategy


## LangGraph State Machines

All workflows are implemented as LangGraph StateGraph instances with typed state, conditional edges, and SQLite checkpointing.

### Five Workflows

| Workflow | Nodes | Purpose | Key Conditional Edge |
|---|---|---|---|
| Issue | 7 | GitHub issue → structured brief | Review verdict routes to revision or filing |
| Requirements (LLD) | 10 | Issue → approved Low-Level Design | Gemini verdict routes to revision or approval |
| Implementation Spec | 7 | LLD → concrete implementation instructions | Completeness check routes to revision or review |
| TDD Implementation | 13 | Spec → code + tests + PR | Test results route to fix loop or report generation |
| Scout | Variable | External intelligence gathering | Search results route to deeper research or synthesis |

### Requirements Workflow Graph

The most complex workflow, showing the full draft-review-revise cycle:

```mermaid
graph TD
    N0["N0: Load Input<br/>(issue body, config)"]
    N0b["N0b: Analyze Codebase<br/>(file scanning, patterns)"]
    N1["N1: Generate Draft<br/>(Claude drafts LLD)"]
    N1_5["N1.5: Mechanical Validation<br/>(format, structure checks)"]
    N1b["N1b: Validate Test Plan<br/>(requirement refs in scenarios)"]
    N2["N2: Human Gate - Draft<br/>(optional human review)"]
    N3["N3: Review<br/>(Gemini adversarial review)"]
    N4["N4: Human Gate - Verdict<br/>(optional override)"]
    N5["N5: Finalize<br/>(write approved LLD)"]

    N0 --> N0b
    N0b --> N1
    N1 --> N1_5
    N1_5 --> N1b
    N1b --> N2
    N2 --> N3
    N3 --> N4
    N4 -->|"APPROVE"| N5
    N4 -->|"BLOCK/REVISE"| N1
```
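
The verdict edge out of N4 reduces to a routing function of the kind LangGraph's conditional edges take. A sketch with assumed state keys (`verdict`, `iteration`) and an illustrative revision cap; the real graph's names and cap behavior may differ:

```python
MAX_ITERATIONS = 3  # hypothetical revision cap

def route_on_verdict(state: dict) -> str:
    """Conditional edge out of N4: advance on APPROVE, loop back otherwise."""
    if state.get("verdict") == "APPROVE":
        return "N5"  # finalize: write the approved LLD
    if state.get("iteration", 0) >= MAX_ITERATIONS:
        return "N5"  # assumed behavior at the cap; the real graph may abort instead
    return "N1"      # BLOCK/REVISE: regenerate the draft
```

In LangGraph this function would be registered with `add_conditional_edges("N4", route_on_verdict)` so the verdict, not hard-coded wiring, decides the next node.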

### Checkpointing

Each workflow uses `SqliteSaver` from `langgraph.checkpoint.sqlite`:

```python
from langgraph.checkpoint.sqlite import SqliteSaver

with SqliteSaver.from_conn_string(str(db_path)) as memory:
    app = workflow.compile(checkpointer=memory)
    config = {"configurable": {"thread_id": f"issue-{issue_number}"}}
    result = app.invoke(initial_state, config)
```

* Thread ID = issue number → enables resume after interruption
* Per-issue databases prevent deadlocks during concurrent workflows
* Recursion limits scale with iteration count: `(max_iters × edges_per_loop) + buffer`
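
The recursion-limit formula in the last bullet, as a one-liner (the default buffer here is illustrative, not the project's configured value):

```python
def recursion_limit(max_iters: int, edges_per_loop: int, buffer: int = 10) -> int:
    """Scale LangGraph's recursion limit with the revision loop.

    Each revision traverses every edge in the loop once, so the limit must
    cover max_iters full passes plus a safety buffer for setup/teardown edges.
    """
    return max_iters * edges_per_loop + buffer
```

For example, with a 3-revision cap and roughly 6 edges per loop, `recursion_limit(3, 6)` yields 28.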

## Codebase Intelligence (RAG-Like)

AssemblyZero uses a RAG-like pattern for codebase understanding, but without vector embeddings. Instead, it uses deterministic techniques:

### How It Works

| Technique | What It Does | Where Used |
|---|---|---|
| AST Summarization | Extracts function signatures, class hierarchies, imports | `summarize_file_for_context()` in N0b node |
| Pattern Scanning | Finds similar implementations in codebase by structure | Spec workflow N1 (analyze codebase) |
| Token Budget Management | Trims context to fit model limits (~60KB target) | All draft nodes |
| Section-Aware Truncation | Preserves important sections, trims verbose ones | Implementation spec context injection |
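
AST summarization of the kind in the first row can be approximated with the standard-library `ast` module. This is a sketch, not the actual `summarize_file_for_context()`, whose output shape is not documented here:

```python
import ast

def summarize_source(source: str) -> dict:
    """Sketch of AST summarization: keep signatures and imports, drop bodies."""
    tree = ast.parse(source)
    summary = {"imports": [], "functions": [], "classes": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            summary["imports"] += [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            summary["imports"].append(node.module or "")
        elif isinstance(node, ast.FunctionDef):
            args = ", ".join(a.arg for a in node.args.args)
            summary["functions"].append(f"{node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            summary["classes"].append(node.name)
    return summary
```

A summary like this is deterministic and orders of magnitude smaller than the file itself, which is what makes it fit inside a draft prompt's token budget.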

### The Flow

```mermaid
graph LR
    Issue["Issue/LLD"]
    Scan["Scan Codebase<br/>(files-to-modify table)"]
    AST["AST Summarize<br/>(signatures, imports)"]
    Similar["Find Similar<br/>(pattern matching)"]
    Budget["Token Budget<br/>(trim to ~60KB)"]
    Draft["Inject into<br/>Draft Prompt"]

    Issue --> Scan
    Scan --> AST
    Scan --> Similar
    AST --> Budget
    Similar --> Budget
    Budget --> Draft
```
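
The token-budget step (the ~60KB target) can be sketched as a greedy packer over prioritized sections. The section-pair input and the drop-whole-sections policy are simplifications of the section-aware truncation described above:

```python
def fit_to_budget(sections: list[tuple[str, str]], budget_bytes: int = 60_000) -> str:
    """Greedy sketch: keep whole sections in priority order until ~60 KB is used.

    `sections` is (name, text) pairs, highest priority first. A real
    implementation would trim within sections rather than drop them outright.
    """
    out, used = [], 0
    for name, text in sections:
        block = f"## {name}\n{text}\n"
        size = len(block.encode())
        if used + size > budget_bytes:
            continue  # skip sections that would blow the budget
        out.append(block)
        used += size
    return "".join(out)
```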

### Why Not Vector Embeddings?

* **Deterministic:** Same input always produces same context (auditable)
* **No infrastructure:** No vector database to deploy or maintain
* **Token-efficient:** AST summaries are compact; full file contents would exceed budgets
* **Sufficient:** For a single-repo workflow, structural analysis covers the use cases

## Full Pipeline Flow

```text
Idea → GitHub Issue → LLD (Design) → Implementation Spec → Code + Tests + PR
         │                │                   │                    │
         ▼                ▼                   ▼                    ▼
    Issue Workflow    Requirements        Spec Workflow      TDD Implementation
    (7 nodes)         Workflow            (7 nodes)          (13 nodes)
                      (10 nodes)
```

Each stage produces an artifact that feeds the next:

| Stage | Input | Output | Reviewed By |
|---|---|---|---|
| Issue | Idea description | Structured GitHub issue | Optional Gemini |
| Requirements | Issue body | Approved LLD | Gemini (mandatory) |
| Spec | Approved LLD + codebase analysis | Implementation spec with line-level instructions | Gemini (mandatory) |
| Implementation | Spec (or LLD) | Working code, passing tests, PR | Gemini (mandatory) |

### Governance Gates

Every stage transition requires a Gemini review verdict:

* **APPROVE** → advance to next stage
* **BLOCK** → loop back for revision (up to max iterations)
* **Human override** → orchestrator can waive gates for hotfixes

These gates are enforced by the LangGraph state machine — there is no edge that bypasses review.


## Key Source Files

| File | Contents |
|---|---|
| `assemblyzero/core/llm_provider.py` | All four LLM providers, `LLMCallResult`, `get_provider()` factory |
| `assemblyzero/workflows/requirements/graph.py` | Requirements workflow StateGraph (10 nodes) |
| `assemblyzero/workflows/implementation_spec/graph.py` | Spec workflow StateGraph (7 nodes) |
| `assemblyzero/workflows/issue/graph.py` | Issue workflow StateGraph (7 nodes) |
| `tools/run_requirements_workflow.py` | Requirements workflow CLI runner with `SqliteSaver` |
| `tools/run_implement_from_lld.py` | TDD implementation workflow CLI runner |
| `docs/adrs/0208-llm-invocation-strategy.md` | ADR for LLM invocation decisions |
