Technical Architecture
AssemblyZero is a multi-agent orchestration platform built on LangGraph state machines, with cross-model adversarial review (Claude drafts, Gemini reviews) and RAG-like codebase intelligence.
AssemblyZero uses four LLM providers across three distinct roles:
| Provider | Class | Use Case | Cost Model |
|---|---|---|---|
| Claude CLI (`claude -p`) | `ClaudeCLIProvider` | Drafting, implementation, all creative work | Free (Max subscription) |
| Anthropic API | `AnthropicProvider` | Automatic fallback when CLI fails | Per-token ($5-25/M output) |
| Fallback | `FallbackProvider` | CLI first (180s) → API (300s) transparent failover | Free first, paid if needed |
| Gemini | `GeminiProvider` | Adversarial review only (different model family) | Free (API quota) |
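The failover row can be illustrated with a minimal sketch. This is not the actual `FallbackProvider`: the class name `FallbackSketch`, the `(prompt, timeout)` call signature, and the two stub providers are hypothetical. Only the 180s and 300s budgets come from the table.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: a provider call takes (prompt, timeout_seconds) and returns text.
ProviderCall = Callable[[str, float], str]


@dataclass
class FallbackSketch:
    """Try the primary provider first; on any failure, retry on the secondary."""

    primary: ProviderCall
    secondary: ProviderCall
    primary_timeout: float = 180.0    # CLI budget from the table
    secondary_timeout: float = 300.0  # API budget from the table

    def invoke(self, prompt: str) -> tuple[str, str]:
        """Return (response_text, name_of_the_provider_that_answered)."""
        try:
            return self.primary(prompt, self.primary_timeout), "primary"
        except Exception:
            # Timeouts, non-zero exits, and parse errors all land here.
            return self.secondary(prompt, self.secondary_timeout), "secondary"


def flaky_cli(prompt: str, timeout: float) -> str:
    """Stub standing in for a CLI call that times out."""
    raise TimeoutError(f"CLI did not answer within {timeout}s")


def stub_api(prompt: str, timeout: float) -> str:
    """Stub standing in for the paid API call."""
    return f"api:{prompt}"


provider = FallbackSketch(primary=flaky_cli, secondary=stub_api)
print(provider.invoke("draft the LLD"))  # ('api:draft the LLD', 'secondary')
```

Returning the provider name alongside the text is one way to keep the failover transparent to callers while still recording which calls were paid.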
The Claude CLI (claude -p) is invoked as a subprocess with strict flags:
```bash
claude -p --output-format json --tools "" --strict-mcp-config --model claude-opus-4-6
```
- `--tools ""`: Disables all built-in tools (no file ops, no web access)
- `--strict-mcp-config`: Disables MCP tool loading
- `--output-format json`: Returns structured response with token counts and cost
This gives us free LLM calls via the Max subscription, deterministic side-effect-free behavior, and structured token accounting. The API exists only as a paid fallback for resilience.
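A subprocess wrapper around this command might look like the sketch below. It makes several assumptions: the helper names are hypothetical, the prompt is assumed to be passed on stdin, and the JSON field names (`result`, `total_cost_usd`, `usage`) are assumed rather than taken from this project's code, so verify them against real CLI output.

```python
import json
import subprocess

# Flags copied from the command shown in this section.
CLAUDE_COMMAND = [
    "claude", "-p",
    "--output-format", "json",
    "--tools", "",
    "--strict-mcp-config",
    "--model", "claude-opus-4-6",
]


def build_command() -> list[str]:
    """Return a fresh copy of the argv list so callers cannot mutate the template."""
    return list(CLAUDE_COMMAND)


def parse_response(stdout: str) -> dict:
    """Extract text and accounting data from the JSON envelope.

    The field names used here are assumptions about the CLI's output shape.
    """
    payload = json.loads(stdout)
    return {
        "text": payload.get("result", ""),
        "cost_usd": payload.get("total_cost_usd"),
        "usage": payload.get("usage", {}),
    }


def call_claude(prompt: str, timeout: float = 180.0) -> dict:
    """Run the CLI once; raises on timeout or non-zero exit so a caller can fall back."""
    completed = subprocess.run(
        build_command(),
        input=prompt,          # assumption: the prompt is read from stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )
    return parse_response(completed.stdout)


sample = '{"result": "hello", "total_cost_usd": 0.0, "usage": {"output_tokens": 2}}'
print(parse_response(sample)["text"])  # hello
```

Passing the flags as an argv list rather than a shell string keeps the empty `--tools ""` argument intact without any quoting concerns.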
The key architectural decision: Claude and Gemini are different model families. When Gemini reviews Claude's work, it brings genuinely different training biases and blind spots. This is adversarial verification, not self-review.
See: ADR 0208: LLM Invocation Strategy
All workflows are implemented as LangGraph StateGraph instances with typed state, conditional edges, and SQLite checkpointing.
| Workflow | Nodes | Purpose | Key Conditional Edge |
|---|---|---|---|
| Issue | 7 | GitHub issue → structured brief | Review verdict routes to revision or filing |
| Requirements (LLD) | 10 | Issue → approved Low-Level Design | Gemini verdict routes to revision or approval |
| Implementation Spec | 7 | LLD → concrete implementation instructions | Completeness check routes to revision or review |
| TDD Implementation | 13 | Spec → code + tests + PR | Test results route to fix loop or report generation |
| Scout | Variable | External intelligence gathering | Search results route to deeper research or synthesis |
The Requirements (LLD) workflow is the most complex; its graph shows the full draft-review-revise cycle:
```mermaid
graph TD
    N0["N0: Load Input<br/>(issue body, config)"]
    N0b["N0b: Analyze Codebase<br/>(file scanning, patterns)"]
    N1["N1: Generate Draft<br/>(Claude drafts LLD)"]
    N1_5["N1.5: Mechanical Validation<br/>(format, structure checks)"]
    N1b["N1b: Validate Test Plan<br/>(requirement refs in scenarios)"]
    N2["N2: Human Gate - Draft<br/>(optional human review)"]
    N3["N3: Review<br/>(Gemini adversarial review)"]
    N4["N4: Human Gate - Verdict<br/>(optional override)"]
    N5["N5: Finalize<br/>(write approved LLD)"]
    N0 --> N0b
    N0b --> N1
    N1 --> N1_5
    N1_5 --> N1b
    N1b --> N2
    N2 --> N3
    N3 --> N4
    N4 -->|"APPROVE"| N5
    N4 -->|"BLOCK/REVISE"| N1
```
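To make the revision loop concrete, the following sketch walks the graph above in plain Python, without LangGraph. The node names come from the diagram; `route_after_verdict` and `walk` are hypothetical helpers. In a real StateGraph, a routing function of this kind would receive the workflow state and be registered with `add_conditional_edges`.

```python
# Linear edges from the diagram; N4 is the only node with a conditional edge.
NEXT_NODE = {
    "N0": "N0b", "N0b": "N1", "N1": "N1_5", "N1_5": "N1b",
    "N1b": "N2", "N2": "N3", "N3": "N4",
}


def route_after_verdict(verdict: str) -> str:
    """Conditional edge out of N4: APPROVE finalizes, any other verdict redrafts."""
    return "N5" if verdict == "APPROVE" else "N1"


def walk(verdicts: list[str]) -> list[str]:
    """Trace the visited nodes, consuming one verdict per arrival at N4.

    Raises StopIteration if the verdicts run out before an APPROVE.
    """
    remaining = iter(verdicts)
    node = "N0"
    path = [node]
    while node != "N5":
        if node == "N4":
            node = route_after_verdict(next(remaining))
        else:
            node = NEXT_NODE[node]
        path.append(node)
    return path


trace = walk(["REVISE", "APPROVE"])
print(" -> ".join(trace))
```

With one REVISE followed by one APPROVE, the trace passes through N1 twice before reaching N5.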
Each workflow uses SqliteSaver from langgraph.checkpoint.sqlite:
```python
from langgraph.checkpoint.sqlite import SqliteSaver

# The context manager closes the SQLite connection when the run finishes.
with SqliteSaver.from_conn_string(str(db_path)) as memory:
    app = workflow.compile(checkpointer=memory)
    # The thread ID keys the checkpoint history, so reusing it resumes the run.
    config = {"configurable": {"thread_id": f"issue-{issue_number}"}}
    result = app.invoke(initial_state, config)
```
- The thread ID is derived from the issue number, which enables resume after interruption
- Per-issue databases prevent deadlocks during concurrent workflows
- Recursion limits scale with iteration count: (max_iters × edges_per_loop) + buffer
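The recursion-limit formula can be worked through in a short sketch. The helper names, the buffer of 10, and the `issue-<n>.sqlite` file naming are assumptions for illustration. `recursion_limit` is a standard top-level key in a LangGraph run config, and LangGraph's default of 25 is lower than what a multi-round revision loop can need.

```python
from pathlib import Path


def recursion_limit(max_iters: int, edges_per_loop: int, buffer: int = 10) -> int:
    """(max_iters × edges_per_loop) + buffer, as stated above. The buffer value is assumed."""
    return max_iters * edges_per_loop + buffer


def run_config(issue_number: int, max_iters: int, edges_per_loop: int) -> dict:
    """Build a run config with a per-issue thread ID and a scaled recursion limit."""
    return {
        "configurable": {"thread_id": f"issue-{issue_number}"},
        "recursion_limit": recursion_limit(max_iters, edges_per_loop),
    }


def checkpoint_path(root: Path, issue_number: int) -> Path:
    """One SQLite file per issue (hypothetical naming scheme)."""
    return root / f"issue-{issue_number}.sqlite"


# The Requirements loop N1 → N1.5 → N1b → N2 → N3 → N4 → N1 has 6 edges.
print(recursion_limit(max_iters=5, edges_per_loop=6))  # 40
print(run_config(42, 5, 6)["configurable"]["thread_id"])  # issue-42
```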
AssemblyZero uses a RAG-like pattern for codebase understanding, but without vector embeddings. Instead, it uses deterministic techniques:
| Technique | What It Does | Where Used |
|---|---|---|
| AST Summarization | Extracts function signatures, class hierarchies, imports | `summarize_file_for_context()` in N0b node |
| Pattern Scanning | Finds similar implementations in codebase by structure | Spec workflow N1 (analyze codebase) |
| Token Budget Management | Trims context to fit model limits (~60KB target) | All draft nodes |
| Section-Aware Truncation | Preserves important sections, trims verbose ones | Implementation spec context injection |
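As an illustration of the AST Summarization row, here is a minimal sketch built on Python's standard `ast` module. It is not the project's `summarize_file_for_context()`; the function `summarize_source` and its output format are hypothetical, and it requires Python 3.9+ for `ast.unparse`.

```python
import ast


def _signature(fn) -> str:
    """Render a function definition as a one-line signature without its body."""
    prefix = "async def" if isinstance(fn, ast.AsyncFunctionDef) else "def"
    returns = f" -> {ast.unparse(fn.returns)}" if fn.returns else ""
    return f"{prefix} {fn.name}({ast.unparse(fn.args)}){returns}"


def summarize_source(source: str) -> list[str]:
    """List imports, classes, and function signatures in source order."""
    summary: list[str] = []

    def visit(body, indent: str = "") -> None:
        for node in body:
            if isinstance(node, (ast.Import, ast.ImportFrom)):
                summary.append(indent + ast.unparse(node))
            elif isinstance(node, ast.ClassDef):
                bases = ", ".join(ast.unparse(base) for base in node.bases)
                header = f"class {node.name}({bases})" if bases else f"class {node.name}"
                summary.append(indent + header)
                visit(node.body, indent + "    ")  # methods are indented under the class
            elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                summary.append(indent + _signature(node))

    visit(ast.parse(source).body)
    return summary


SAMPLE = '''
import os
from pathlib import Path

class Loader(Base):
    def load(self, path: str) -> bytes:
        return Path(path).read_bytes()

def main():
    pass
'''

for line in summarize_source(SAMPLE):
    print(line)
```

Because the summary is derived purely from the parse tree, the same file always yields the same lines, and function bodies never enter the prompt.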
```mermaid
graph LR
    Issue["Issue/LLD"]
    Scan["Scan Codebase<br/>(files-to-modify table)"]
    AST["AST Summarize<br/>(signatures, imports)"]
    Similar["Find Similar<br/>(pattern matching)"]
    Budget["Token Budget<br/>(trim to ~60KB)"]
    Draft["Inject into<br/>Draft Prompt"]
    Issue --> Scan
    Scan --> AST
    Scan --> Similar
    AST --> Budget
    Similar --> Budget
    Budget --> Draft
```
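The budget step in this flow could be sketched as a greedy, priority-ordered trim. The function name, the `(name, text, priority)` tuple shape, and the greedy policy are assumptions; the source specifies only the ~60KB target and that important sections are preserved while verbose ones are trimmed.

```python
BUDGET_BYTES = 60 * 1024  # the ~60KB target


def fit_to_budget(
    sections: list[tuple[str, str, int]],
    budget: int = BUDGET_BYTES,
) -> dict[str, str]:
    """Keep sections in priority order (lower number = more important).

    Each section receives as many bytes as remain; later sections are
    truncated or emptied once the budget is spent.
    """
    remaining = budget
    kept: dict[str, str] = {}
    for name, text, _priority in sorted(sections, key=lambda s: s[2]):
        raw = text.encode("utf-8")
        if len(raw) > remaining:
            # Cut at the byte limit and drop any half-cut UTF-8 character.
            text = raw[:remaining].decode("utf-8", errors="ignore")
        kept[name] = text
        remaining -= len(text.encode("utf-8"))
    return kept


context = fit_to_budget(
    [
        ("similar_code", "b" * 40, 1),
        ("signatures", "a" * 40, 0),
        ("notes", "c" * 40, 2),
    ],
    budget=100,
)
print({name: len(text) for name, text in context.items()})
# {'signatures': 40, 'similar_code': 40, 'notes': 20}
```

Measuring in bytes rather than tokens keeps the trim deterministic and independent of any tokenizer, at the cost of being only an approximation of the model's actual limit.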
- Deterministic: Same input always produces same context (auditable)
- No infrastructure: No vector database to deploy or maintain
- Token-efficient: AST summaries are compact; full file contents would exceed budgets
- Sufficient: For a single-repo workflow, structural analysis covers the use cases
```text
Idea → GitHub Issue → LLD (Design) → Implementation Spec → Code + Tests + PR
             │              │                 │                    │
             ▼              ▼                 ▼                    ▼
      Issue Workflow  Requirements      Spec Workflow     TDD Implementation
         (7 nodes)      Workflow          (7 nodes)           (13 nodes)
                       (10 nodes)
```
Each stage produces an artifact that feeds the next:
| Stage | Input | Output | Reviewed By |
|---|---|---|---|
| Issue | Idea description | Structured GitHub issue | Optional Gemini |
| Requirements | Issue body | Approved LLD | Gemini (mandatory) |
| Spec | Approved LLD + codebase analysis | Implementation spec with line-level instructions | Gemini (mandatory) |
| Implementation | Spec (or LLD) | Working code, passing tests, PR | Gemini (mandatory) |
Stage transitions are gated on a Gemini review verdict (mandatory from Requirements onward, optional at the Issue stage):
- APPROVE → advance to next stage
- BLOCK → loop back for revision (up to max iterations)
- Human override → orchestrator can waive gates for hotfixes
These gates are enforced by the LangGraph state machine: apart from the explicit human override, there is no edge that bypasses review.
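The gate rules can be condensed into a single decision function, sketched below. `GateState`, `next_step`, and the default of three iterations are hypothetical. The source also does not say what happens once iterations are exhausted, so the `halt` outcome is an assumption.

```python
from dataclasses import dataclass


@dataclass
class GateState:
    verdict: str                  # "APPROVE" or "BLOCK", as returned by the reviewer
    iteration: int                # completed draft-review rounds so far
    max_iterations: int = 3       # hypothetical default
    human_override: bool = False  # set by the orchestrator, e.g. for a hotfix


def next_step(state: GateState) -> str:
    """Map a gate state to 'advance', 'revise', or 'halt'."""
    if state.verdict == "APPROVE" or state.human_override:
        return "advance"
    if state.iteration < state.max_iterations:
        return "revise"
    # Assumption: stop and surface the failure rather than loop forever.
    return "halt"


print(next_step(GateState(verdict="BLOCK", iteration=1)))                       # revise
print(next_step(GateState(verdict="BLOCK", iteration=3)))                       # halt
print(next_step(GateState(verdict="BLOCK", iteration=3, human_override=True)))  # advance
```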
| File | Contents |
|---|---|
| `assemblyzero/core/llm_provider.py` | All four LLM providers, `LLMCallResult`, `get_provider()` factory |
| `assemblyzero/workflows/requirements/graph.py` | Requirements workflow StateGraph (10 nodes) |
| `assemblyzero/workflows/implementation_spec/graph.py` | Spec workflow StateGraph (7 nodes) |
| `assemblyzero/workflows/issue/graph.py` | Issue workflow StateGraph (7 nodes) |
| `tools/run_requirements_workflow.py` | Requirements workflow CLI runner with SqliteSaver |
| `tools/run_implement_from_lld.py` | TDD implementation workflow CLI runner |
| `docs/adrs/0208-llm-invocation-strategy.md` | ADR for LLM invocation decisions |
- LangGraph Evolution — Roadmap for supervisor pattern and LangSmith
- Gemini Verification — Multi-model review architecture
- Governance Gates — Gate enforcement details
- Codebase Intelligence — Intelligence layer design
- Mechanical Validation — Auto-fix layer before review
- Historical Intelligence — Learning from past issues