11 changes: 1 addition & 10 deletions .agents/learnings/2026-04-07-v2.35.0-release-postmortem.md
@@ -1,17 +1,8 @@
---
id: learning-2026-04-07-v2350-release
type: learning
date: 2026-04-07
category: process
confidence: 0.1648
maturity: provisional
utility: 0.8000
harmful_count: 0
reward_count: 1
helpful_count: 1
last_decay_at: 2026-04-12T12:33:50-04:00
last_reward: 0.80
last_reward_at: 2026-04-11T17:51:26-04:00
utility: 0.80
---

# Learning: v2.35.0 Release Process — CI Failure Cascade
@@ -1,9 +1,6 @@
---
id: learning-2026-04-12-tier1-forge-wins
type: learning
date: 2026-04-12
category: architecture
confidence: high
maturity: provisional
utility: 0.7
---
@@ -1,9 +1,6 @@
---
id: learning-2026-04-12-parallel-worktree-isolation
type: learning
date: 2026-04-12
category: process
confidence: high
maturity: provisional
utility: 0.8
---
3 changes: 0 additions & 3 deletions .agents/learnings/2026-04-12-yagni-bridge-not-clone.md
@@ -1,9 +1,6 @@
---
id: learning-2026-04-12-yagni-bridge-not-clone
type: learning
date: 2026-04-12
category: architecture
confidence: high
maturity: provisional
utility: 0.9
---
10 changes: 1 addition & 9 deletions .agents/patterns/2026-02-21-council-judges-pattern.md
@@ -1,14 +1,6 @@
---
utility: 0.7951
last_reward: 0.80
reward_count: 152
last_reward_at: 2026-03-31T09:33:17-04:00
confidence: 0.8140
last_decay_at: 2026-04-12T12:33:50-04:00
helpful_count: 151
maturity: established
maturity_changed_at: 2026-03-05T09:08:21-05:00
maturity_reason: utility 0.70 >= 0.55, reward_count 5 >= 5, helpful > harmful (4 > 0)
utility: 0.80
Comment on lines 2 to +3
**P2:** Retain MemRL history fields in pattern frontmatter

This truncation drops machine-managed fields (reward_count, helpful_count, harmful_count, confidence, last_reward_at, last_decay_at) from an actively cited pattern, which resets its feedback history. In cli/internal/lifecycle/feedback.go, UpdateMarkdownUtility treats missing counters as 0 and recomputes confidence from that baseline, so the next ao feedback/ao feedback-loop update will rewrite this artifact as if it had no prior usage, distorting maturity/utility evolution for the flywheel.


---

# Council Judges Pattern
10 changes: 1 addition & 9 deletions .agents/patterns/contracts-first-wave-planning.md
@@ -1,14 +1,6 @@
---
utility: 0.6368
last_reward: 0.80
reward_count: 3
last_reward_at: 2026-04-11T20:02:42-04:00
confidence: 0.3713
last_decay_at: 2026-04-12T12:33:50-04:00
helpful_count: 2
maturity: candidate
maturity_changed_at: 2026-04-11T20:02:42-04:00
maturity_reason: utility 0.64 >= 0.55 and reward_count 3 >= 3
utility: 0.64
---

# Contracts-First Wave Planning
10 changes: 1 addition & 9 deletions .agents/patterns/pre-mortem-first.md
@@ -1,14 +1,6 @@
---
utility: 0.7952
last_reward: 0.80
reward_count: 132
last_reward_at: 2026-04-11T20:02:42-04:00
confidence: 0.9541
last_decay_at: 2026-04-12T12:33:50-04:00
helpful_count: 131
maturity: established
maturity_changed_at: 2026-03-05T15:24:08-05:00
maturity_reason: utility 0.70 >= 0.55, reward_count 5 >= 5, helpful > harmful (4 > 0)
utility: 0.80
---

# Pre-Mortem First
10 changes: 1 addition & 9 deletions .agents/patterns/topological-wave-decomposition.md
@@ -1,14 +1,6 @@
---
utility: 0.7966
last_reward: 0.80
reward_count: 179
last_reward_at: 2026-04-11T20:02:42-04:00
confidence: 0.9633
last_decay_at: 2026-04-12T12:33:50-04:00
helpful_count: 178
maturity: established
maturity_changed_at: 2026-03-05T00:15:38-05:00
maturity_reason: utility 0.70 >= 0.55, reward_count 5 >= 5, helpful > harmful (4 > 0)
utility: 0.80
---

# Topological Wave Decomposition
10 changes: 1 addition & 9 deletions .agents/patterns/warn-then-fail-ratchet.md
@@ -1,14 +1,6 @@
---
utility: 0.7966
last_reward: 0.80
reward_count: 253
last_reward_at: 2026-03-31T09:33:17-04:00
confidence: 0.8247
last_decay_at: 2026-04-12T12:33:50-04:00
helpful_count: 252
maturity: established
maturity_changed_at: 2026-03-03T10:14:50-05:00
maturity_reason: utility 0.73 >= 0.55, reward_count 6 >= 5, helpful > harmful (5 > 0)
utility: 0.80
---

# Warn-Then-Fail Ratchet
152 changes: 23 additions & 129 deletions CLAUDE.md
@@ -62,17 +62,11 @@ cd cli && make sync-hooks # Sync embedded hooks/skills into cli/embedded/

| Script | Purpose |
|--------|---------|
| `scripts/pre-push-gate.sh` | Smart pre-push validation (use `--fast` for diff-based checks) |
| `scripts/ci-local-release.sh` | Local release validation gate (run before releasing) |
| `scripts/retag-release.sh` | Retag existing release with post-tag commits |
| `scripts/extract-release-notes.sh` | Extract notes from CHANGELOG.md for GitHub release |
| `scripts/security-gate.sh` | Security scanning (semgrep, gosec, gitleaks) |
| `scripts/validate-go-fast.sh` | Quick Go validation (build + vet + test) |
| `scripts/sync-skill-counts.sh` | Sync skill counts across all docs after adding/removing skills |
| `scripts/pre-push-gate.sh` | Smart pre-push validation (`--fast` for diff-based) |
| `scripts/ci-local-release.sh` | Local release validation gate (run before tagging) |
| `scripts/sync-skill-counts.sh` | Sync skill counts across docs after adding/removing skills |
| `scripts/generate-cli-reference.sh` | Regenerate CLI docs after changing commands/flags |
| `scripts/audit-codex-parity.sh` | Audit generated `skills-codex/` for semantic drift that simple rewrites miss |
| `scripts/regen-codex-hashes.sh` | Regenerate manifest/marker hashes after changing skills-codex/ files |
| `scripts/prune-agents.sh` | Clean up bloated .agents/ directory |
| `scripts/regen-codex-hashes.sh` | Regenerate hashes after changing skills-codex/ files |

## CI Validation

@@ -81,28 +75,11 @@ All pushes to `main` run `.github/workflows/validate.yml` (24 jobs). **Run check
### Quick Local Validation

```bash
# Recommended: smart conditional gate (only checks relevant to changed files):
scripts/pre-push-gate.sh --fast

# These checks are now included in the pre-push gate (no need to run separately):
# bash skills/heal-skill/scripts/heal.sh --strict # → check 12
# ./tests/docs/validate-doc-release.sh # → check 25
# ./scripts/check-contract-compatibility.sh # → check 26

# If you changed Go code:
cd cli && make build && make test

# If you changed hooks or lib/hook-helpers.sh:
cd cli && make sync-hooks

# If you changed skills-codex/ files:
scripts/regen-codex-hashes.sh

# Full gate (all 33 checks, ~3min):
scripts/pre-push-gate.sh

# Release gate (runs everything including security):
scripts/ci-local-release.sh
scripts/pre-push-gate.sh --fast # Recommended: diff-based conditional checks
cd cli && make build && make test # If you changed Go code
cd cli && make sync-hooks # If you changed hooks/ or lib/hook-helpers.sh
scripts/regen-codex-hashes.sh # If you changed skills-codex/ files
scripts/pre-push-gate.sh # Full gate (all 33 checks, ~3min)
```

### Rules That Break CI
@@ -119,33 +96,11 @@ This updates SKILL-TIERS.md, PRODUCT.md, README.md, docs/SKILLS.md, docs/ARCHITE

**Every `references/*.md` must be linked in SKILL.md.** If a file exists in `skills/<name>/references/`, the skill's SKILL.md must contain a markdown link to it or a `Read` instruction referencing it. Use `heal.sh --strict` to check.

**Codex skills are manually maintained.** Edit `skills-codex/<name>/SKILL.md` directly or add a durable override in `skills-codex-overrides/<name>/`. The sync script (`sync-codex-native-skills.sh`) is deprecated — it overwrites manual edits. To audit for drift:
**Codex skills are manually maintained.** Edit `skills-codex/<name>/SKILL.md` directly or add overrides in `skills-codex-overrides/<name>/`. Audit drift with `bash scripts/audit-codex-parity.sh --skill <name>`.

```bash
bash scripts/audit-codex-parity.sh --skill <name>
```
**Embedded hooks must stay in sync.** After editing `hooks/`, `lib/hook-helpers.sh`, or `skills/standards/references/`: run `cd cli && make sync-hooks`.

**Embedded hooks must stay in sync.** After editing `hooks/`, `lib/hook-helpers.sh`, or `skills/standards/references/`:

```bash
cd cli && make sync-hooks
```

**CLI docs must stay in sync.** After adding/changing CLI commands or flags:

```bash
scripts/generate-cli-reference.sh
```

**Codex maintenance flow.** For Codex-specific skill changes:

```bash
# 1. Edit skills-codex/<name>/SKILL.md directly, or add override in skills-codex-overrides/<name>/
# 2. Audit for drift
bash scripts/audit-codex-parity.sh --skill <name>
# 3. Validate artifacts
bash scripts/validate-codex-generated-artifacts.sh --scope worktree
```
**CLI docs must stay in sync.** After changing commands/flags: run `scripts/generate-cli-reference.sh`.

**Contracts must be catalogued.** Files added to `docs/contracts/` need a link in `docs/INDEX.md`.

@@ -155,28 +110,13 @@ bash scripts/validate-codex-generated-artifacts.sh --scope worktree

**No secrets in code.** CI greps for hardcoded passwords, API keys, tokens in non-test files.

## Testing Rules (AI-Native Test Shape)
## Testing Rules

- **L2 first, L1 always.** AI agents write tests where bugs are found (L2 integration) AND where regressions hide (L1 unit). L2 first for bug-finding, L1 always for regression safety. The traditional pyramid's cost economics are obsolete — agents write both. See `skills/standards/references/test-pyramid.md` for the full theory and evidence.
- **No coverage-padding tests.** Tests that use trivial `!= ""` or `!= nil` assertions solely to inflate coverage metrics are banned. Every test must assert behavioral correctness, not just presence. If a function's coverage is low, write a real test or accept the metric.
- **No `cov*_test.go` naming convention.** Test files must be named after the source file they test (e.g., `goals_test.go` not `cov15_goals_init_test.go`).
See `.claude/rules/go.md` and `.claude/rules/python.md` for language-specific testing conventions. Key rules: L2 integration tests first, L1 unit tests always. No coverage-padding. No `cov*_test.go` naming.

## Release Pipeline

Releases are automated via GoReleaser + GitHub Actions:

1. **Normal release**: Tag triggers the workflow automatically
```bash
git tag v2.X.0 && git push origin v2.X.0
```
2. **Retag release** (roll post-tag commits into existing release):
```bash
scripts/retag-release.sh v2.X.0
```

The workflow builds cross-platform binaries, creates the GitHub release, updates the Homebrew tap (`boshu2/homebrew-agentops`), generates SBOM + security report, and attests SLSA provenance.

**Always run `scripts/ci-local-release.sh` before tagging.**
Tag triggers GoReleaser + GitHub Actions: `git tag v2.X.0 && git push origin v2.X.0`. **Always run `scripts/ci-local-release.sh` before tagging.** Retag with `scripts/retag-release.sh v2.X.0`.

## Agent Goals

@@ -195,62 +135,16 @@ Research → Plan → Implement → Validate
└──── Knowledge Flywheel ────┘
```

## Claude Code Startup Surface

`CLAUDE.md` is the startup surface in Claude Code. Do not expect `SessionStart`
or first-prompt hooks to inject briefings into the conversation.

- Use the goal stated in the user prompt or recovered handoff as the working objective.
- If you want the full software-factory lane, run `/rpi "goal"` explicitly.
- If you want a compiled goal-time briefing first, run `ao knowledge brief --goal "goal"`.
- Treat `.agents/ao/factory-goal.txt` and `.agents/ao/factory-briefing.txt` as
silent runtime state, not operator-facing instructions.

### Session & Swarm Constraints
## Session Constraints

- **Multi-phase work:** Route through `ao rpi` — it enforces 90-min phase timeouts and 10-min stall detection. Raw sessions have neither.
- **Validation overhead is by design:** Pre-mortem + vibe cost 3-5x implementation time. This ratio prevents bug rework — do not shortcut.
- **Before spawning workers:** Verify no file overlap across the wave (see swarm SKILL.md pre-flight). File collisions are the #1 swarm failure mode.
- **Multi-phase work:** Route through `ao rpi` (enforces timeouts and stall detection).
- **Before spawning workers:** Verify no file overlap across the wave. File collisions are the #1 swarm failure mode.
- **Before proposing new capability:** Check `ao rpi serve --help`, `hooks/hooks.json`, and `GOALS.md` first.

### Gas City Integration (gc bridge)

The `ao` CLI has a Gas City (`gc`) bridge that enables RPI phases to run as gc sessions. Key files:

| File | Purpose |
|------|---------|
| `cli/cmd/ao/gc_bridge.go` | Bridge primitives: availability, version, status/session parsing, city.toml discovery |
| `cli/cmd/ao/gc_events.go` | Event emitters (`ao:phase`, `ao:gate`, `ao:failure`, `ao:metric`) to gc event bus |
| `cli/cmd/ao/rpi_phased_gc.go` | `gcExecutor` — PhaseExecutor backend that runs phases as gc sessions |

**How it works:**
- `gcBridgeCityPath(cwd)` walks up from cwd looking for `city.toml` to locate the city root.
- `gcBridgeReady(cityPath)` checks binary availability, version >= 0.13.0, and controller state.
- `selectExecutorFromCaps` with `RuntimeMode: "gc"` creates a `gcExecutor` using city path from opts.
- Phase events are emitted to the gc event bus for observability (`gcEmitPhaseEvent`, `gcEmitGateEvent`).
- When gc is not available, event emission silently no-ops (graceful degradation).

**Testing:** Run `go test ./cmd/ao/ -run "TestGC"` for all gc bridge tests (L1 unit + L2 integration + L3 live controller).

**Deprecation status:** The following files are superseded by gc and marked deprecated:

| File | Status | gc replacement |
|------|--------|----------------|
| `rpi_loop_supervisor.go` | Deprecated | gc controller + session management |
| `rpi_c2_events.go` | Deprecated | gc event bus (`gc event emit`) |
| `rpi_phased_tmux.go` | Deprecated | `gcExecutor` in `rpi_phased_gc.go` |
| `rpi_workers.go` | Deprecated (partial) | gc agent health patrol |
| `rpi_parallel.go` | Deprecated (partial) | gc convoys + sling (pending worktree support) |
| `fire.go` | Deprecated (partial) | gc sling + bead dispatch (pending FIRE semantics) |

Do not write new tests for deprecated files. Do not add features to them.
- **Gas City (gc) bridge:** `cli/cmd/ao/gc_bridge.go`, `gc_events.go`, `rpi_phased_gc.go`. Do not write new tests or features for deprecated files (`rpi_loop_supervisor.go`, `rpi_c2_events.go`, `rpi_phased_tmux.go`, `rpi_workers.go`, `rpi_parallel.go`, `fire.go`).

### Execution Discipline

- **Produce artifacts, not just plans.** When asked to research, plan, or investigate, always produce actionable output (code changes, tests, or concrete files) within the session. Do not spend an entire session only planning unless explicitly told to "just plan."
- **Verify before committing.** After modifying Go files, run `go test ./...` and `go vet ./...` before committing. After modifying Python files, run relevant tests. Never commit code that hasn't been verified.
- **Execute first, research second.** When asked to run tests or execute something, start running within the first 2-3 messages. If research is needed, do it concurrently or after initial execution — not instead of it.
- **Parallel agent caution.** When working with parallel agents, worktrees, or swarm workers: avoid using git worktrees unless explicitly requested, watch for linter side-effects and import errors from partial changes, and verify the base branch is up-to-date before starting parallel work.
- **First-Edit Rule.** Your first Edit, Write, or executable Bash call MUST happen within your first 3 responses. If the user asked you to DO something, start doing it immediately. Research while doing, not instead of doing. If you reach response 3 without producing output, stop and act on what you know.
- **Intent Echo.** Before starting any non-trivial task, state in ONE sentence what you understand the user wants: "I understand you want me to [action] [scope] [constraint]." Wait for confirmation before proceeding with multi-file changes. This is especially important for removal, refactoring, scope changes, or requests with "just" or "only."
- **Two-Correction Rule.** If the user corrects your approach twice on the same task: STOP, re-read the original request, state what you now understand differently, and ask "Is this what you mean?" Do not attempt a third approach without explicit confirmation.
- **Verify before committing.** Go: `go test ./...` and `go vet ./...`. Python: run relevant tests. Never commit unverified code.
- **First-Edit Rule.** First Edit/Write/Bash must happen within your first 3 responses. Execute first, research second.
- **Intent Echo.** Before non-trivial tasks, state in ONE sentence what you understand. Wait for confirmation on multi-file changes.
- **Two-Correction Rule.** If corrected twice on the same task: STOP, re-read, state what you now understand differently, and confirm before trying again.