diff --git a/AGENTS.md b/AGENTS.md
index 093ac2c16..75c9ae5d8 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -2,6 +2,7 @@
 - ALWAYS USE PARALLEL TOOLS WHEN APPLICABLE.
 - The default branch in this repo is `dev`.
 - Local `main` ref may not exist; use `dev` or `origin/dev` for diffs.
+- **NEVER commit directly to `dev`.** Always create a feature branch, rebase on `origin/dev`, and open a PR. No exceptions.
 - Prefer automation: execute requested actions without confirmation unless blocked by missing info or safety/irreversibility.
 
 ## Style Guide
@@ -153,8 +154,12 @@ Use `context_history` to navigate the edit DAG:
 - Test actual implementation, do not duplicate logic into tests
 - Tests cannot run from repo root (guard: `do-not-run-tests-from-root`); run from package dirs like `packages/opencode`.
 
-## Git Hooks
+## Git Workflow
 
+- **NEVER commit or push directly to `dev`.** Always work on a feature/fix/docs branch and create a PR.
+- All changes go through PRs — no exceptions, not even "quick fixes" or docs-only changes.
+- **Always rebase on `origin/dev` before creating a PR.** Run `git fetch origin && git rebase origin/dev` on your branch first. No exceptions.
+- When creating a PR with `gh pr create`, always target the `origin` repo explicitly: `gh pr create --repo e6qu/frankencode --base dev`.
 - NEVER bypass pre-commit hooks. No `HUSKY=0`, no `--no-verify`. Fix the issue instead.
 - Pre-commit runs: prettier format, typecheck, tests. All must pass before commit.
 - Commit messages must follow conventional commits (`feat:`, `fix:`, `chore:`, etc).
diff --git a/DO_NEXT.md b/DO_NEXT.md
index 2632e5810..6abd7cece 100644
--- a/DO_NEXT.md
+++ b/DO_NEXT.md
@@ -1,65 +1,50 @@
 # Frankencode — Do Next
 
-## Completed
-
-- [x] Plan Mode Fixes (removed experimental flag, enabled plan_enter tool)
-- [x] **Verification tool** implementation
-- [x] Fixed `focus-rewrite-history` agent missing tool permissions
-- [x] Added `verification` config schema to config.ts
-- [x] Fixed TypeScript errors in verify.ts
-- [x] Added VerifyTool to registry
-- [x] Added `/verify` command
-- [x] Typecheck passes
-- [x] All 1384 tests pass
-- [x] Progressive Disclosure for Skills - lazy load content on demand
-- [x] Evaluator-Optimizer - evaluator/optimizer agents + refine tool
-- [x] Skills as Scripts - scripts in skill directories become callable tools
-- [x] Code review — found 16 bugs (#21-#36)
-
-## In Progress — Bug Fix Pass
-
-### P0 — Critical (do first)
-
-- [ ] **#28** Refine tool: pass `git diff` output or changed file paths into evaluator prompt
-- [ ] **#29** Refine tool: verify `tools: {}` behavior — if it blocks tools, pass correct tool set
-- [ ] **#32** Skill template: verify all `template` consumers handle `Promise`
-
-### P1 — High
-
-- [ ] **#21** Circuit breaker: move `this.lastFailure = now` before the throw
-- [ ] **#27** Verify config: replace shallow spread with deep merge (`mergeDeep` from remeda)
-- [ ] **#36** Evaluator agent: remove `bash: "allow"` from permission set
-
-### P2 — Medium
-
-- [ ] **#22** Circuit breaker: call `breaker.reset()` after successful check
-- [ ] **#24** Circuit breaker: increase default cooldownMs to 30000+
-- [ ] **#25** Verify tool: implement `scope`/`files` filtering or remove unused params
-- [ ] **#30** Refine parseEvaluation: add fallback parsing for score/passed
-- [ ] **#34** Scripts: add argument validation/sanitization
-
-### P3 — Low (can defer)
-
-- [ ] **#23** Circuit breaker: rename `open` to `closed` or `tripped`
-- [ ] **#26** Verify: use shell-word splitter instead of `split(" ")`
-- [ ] **#31** Refine: add child session cleanup after loop
-- [ ] **#33** Skill.get(): add content caching layer
-- [ ] **#35** Scripts: use `::` separator for tool IDs
-
-## Backlog — Testing
-
-- [ ] Unit tests for VerifyTool (circuit-breaker, config loading, error parsing)
-- [ ] Unit tests for RefineTool (evaluation parsing, iteration loop)
-- [ ] Unit tests for Scripts (discovery, tool generation)
-- [ ] Unit tests for CAS, filterEdited, EditGraph, SideThread CRUD
-- [ ] Test classifier_threads + distill_threads
-- [ ] Test /btw command
-- [ ] CAS garbage collection
-- [ ] TUI rendering of edit indicators
-
-## Backlog — Documentation
-
-- [ ] Document /verify command in user guide
-- [ ] Document refine tool usage patterns
-- [ ] Document skill scripts feature
-- [ ] Update README with new features
+## Implemented
+
+- [x] CAS (SQLite) + Part Editing (EditMeta, LifecycleMeta, filterEdited, context_edit, context_deref)
+- [x] Conversation Graph (edit_graph DAG, context_history with log/tree/checkout/fork)
+- [x] Focus Agent + Side Threads (side_thread table, thread_park, thread_list, classifier, focus agents)
+- [x] Integration (system prompt injection, plugin hooks, lifecycle sweeper)
+- [x] v2: query/toolName targeting, classifier_threads, distill_threads, /btw, /focus, /reset-context
+- [x] Config-based control (no feature toggles)
+- [x] Documentation (README, docs/context-editing, docs/schema, docs/agents, AGENTS.md)
+- [x] Ephemeral commands (/threads, /history, /tree, /deref, /classify)
+- [x] /cost TUI command with usage dialog
+- [x] Verify tool (test/lint/typecheck with circuit breaker)
+- [x] Refine tool (evaluator-optimizer loop)
+- [x] Script discovery and execution from skills
+- [x] 40 bugs fixed (code review audits + ephemeral fixes)
+- [x] 25 regression tests for bug fixes
+
+## Next — Upstream Sync
+
+The upstream `anomalyco/opencode` has diverged significantly (~50 commits). Key conflict areas:
+
+- [ ] **Rebase onto upstream/dev** — resolve conflicts in `skill.ts` (Effect service rewrite), `prompt.ts`, `message-v2.ts`, `instance.ts`
+- [ ] **Adapt to Effect-ification** — upstream moved to `LayerMap` and scoped services for Skill, File, Format, VCS, FileTime, FileWatcher; our `Instance.state()` usage in `skill.ts` may need to adapt to `SkillService`
+- [ ] **Verify `instance-state.ts` deletion** — upstream deleted this; check if our code depends on it (used by `Skill.state`, `Command.state`)
+- [ ] **Test after rebase** — run full suite, fix any breakage from upstream changes
+
+## Next — Testing
+
+- [ ] Unit tests for CAS (store, get, dedup via ON CONFLICT)
+- [ ] Unit tests for filterEdited (hidden parts stripped, empty messages dropped)
+- [ ] Unit tests for EditGraph (commit chain, log walk, checkout restore)
+- [ ] Unit tests for SideThread CRUD
+- [ ] Unit tests for ContextEdit validation (ownership, budget, recency, privileged agents)
+- [ ] Unit tests for lifecycle sweeper (discardable auto-hide, ephemeral auto-externalize)
+- [ ] Test classifier_threads + distill_threads with a real session
+- [ ] Test /btw command (verify it forks, doesn't pollute main thread)
+
+## Next — Features
+
+- [ ] CAS garbage collection (orphan cleanup, size limits)
+- [ ] TUI rendering of edit indicators (hidden/replaced/annotated parts)
+- [ ] Session.remove() cleanup of EditGraph rows (add CASCADE or explicit delete)
+- [ ] CAS.store() ownership: stop overwriting session_id on hash collision
+
+## Next — Design Decisions
+
+- [ ] Explore: make /btw use Session.fork() for true message-level isolation
+- [ ] Evaluate upstream's `tools` deprecation and migration to permission-only model
diff --git a/PLAN.md b/PLAN.md
index 77de2ef6c..a4f4ca360 100644
--- a/PLAN.md
+++ b/PLAN.md
@@ -1,140 +1,74 @@
 # Frankencode Feature Roadmap
 
-Features derived from Claude Blog research (Sept 2025 - Mar 2026) + plan mode fixes + circuit-breaker.
+> **Frankencode** is a fork of [OpenCode](https://github.com/anomalyco/opencode) (`dev` branch) that adds surgical, reversible, agent-driven context editing with content-addressable storage and a conversation history graph.
+
+**Status (2026-03-18):** All features implemented. 40 bugs fixed. 1401 tests passing. See `STATUS.md` for current state, `DO_NEXT.md` for what's next.
 
 ---
 
-## 0. Plan Mode Fixes ✅ COMPLETE
+## Next: Upstream Rebase Plan
 
-**Done**: Removed experimental flag, enable `plan_enter` tool, ensure seamless mode switching.
+The upstream `anomalyco/opencode` has diverged by ~50 commits. The major change is an **Effect-ification wave** that rewrites services (Skill, File, Format, VCS, etc.) from namespace-based modules to Effect scoped services with `LayerMap`.
 
-### Files Modified
+### High-conflict files (require manual resolution):
 
-| File | Change |
-| `src/tool/registry.ts` | Remove flag check, always include `PlanExitTool`, add `PlanEnterTool` |
-| `src/session/prompt.ts` | Remove experimental flag branching, use "new" plan mode logic |
-| `src/tool/plan.ts` | Uncomment `PlanEnterTool`, add import for `ENTER_DESCRIPTION` |
-| `src/config/config.ts` | Add `verification` config schema |
+| File | Upstream change | Our change | Strategy |
+|------|----------------|------------|----------|
+| `skill/skill.ts` | Rewritten to `SkillService` (Effect) | Content cache added | Reimplement cache inside Effect service |
+| `session/prompt.ts` | ~99 lines changed | +filterEdited, +filterEphemeral, +focus injection | Apply our additions to new upstream base |
+| `session/message-v2.ts` | ~107 lines changed | +EditMeta, +LifecycleMeta, +filterEdited | Merge schema additions into new shape |
+| `project/instance.ts` | Refactored, `instance-state.ts` deleted | We use `Instance.state()` | Adapt to new Instance API |
 
----
+### Low-conflict files (additive, straightforward merge):
 
-## 1. Verification Tool ✅ COMPLETE
+All Frankencode-only files (CAS, edit graph, context tools, side threads, agents) should merge cleanly since upstream doesn't have them.
 
-**Problem**: Generated code often passes the "looks right" test but fails on execution, edge cases, or style requirements. Agents need a way to self-verify their work.
+---
 
-**Solution**: A verification tool that agents can call to validate their work against objective criteria. Also exposed as slash command and CLI command.
+## 0. Plan Mode Fixes ✅ COMPLETE
 
-### Files
+**Done**: Removed experimental flag, enable `plan_enter` tool, ensure seamless mode switching.
 
-| File | Status |
-| --------------------------------- | ----------- |
-| `src/tool/verify.ts` | ✅ COMPLETE |
-| `src/config/config.ts` | ✅ COMPLETE |
-| `src/tool/registry.ts` | ✅ COMPLETE |
-| `src/command/index.ts` | ✅ COMPLETE |
-| `src/command/template/verify.txt` | ✅ COMPLETE |
+---
 
-### Completed Tasks
+## 1. Verification Tool ✅ COMPLETE
 
-- [x] Fix LSP errors in `src/tool/verify.ts`
-- [x] Import `Config` from `../config/config`
-- [x] Add `VerifyTool` to `src/tool/registry.ts`
-- [x] Create `/verify` command template
-- [x] Add `/verify` command to `src/command/index.ts`
-- [x] Typecheck passes
-- [x] All 1384 tests pass
+Verification tool with circuit-breaker for test/lint/typecheck. Exposed as `/verify` command.
 
 ---
 
 ## 2. Progressive Disclosure for Skills ✅ COMPLETE
 
-**Problem**: Loading many skills bloats context window.
-
-**Solution**: Three-tier loading — metadata always, full content on demand.
-
-### Tasks
-
-- [x] Update skill schema to separate `Meta` (name, description, location) from `Loaded` (includes content)
-- [x] Implement lazy content loading in `Skill.get()`
-- [x] Update `all()` and `available()` to return `Meta[]` without content
-- [x] Update command registration to lazy-load skill content
-- [x] Typecheck passes
+Three-tier loading — metadata always, full content on demand. `Meta` vs `Loaded` types.
 
 ---
 
 ## 3. Skills as Scripts ✅ COMPLETE
 
-- [x] Create `src/skill/scripts.ts`
-- [x] Integrate with registry
-- [x] Typecheck passes
+Scripts in skill `scripts/` directories become callable tools. Auto-discovered and registered.
 
 ---
 
 ## 4. Evaluator-Optimizer ✅ COMPLETE
 
-- [x] Create evaluator/optimizer agents
-- [x] Create `src/tool/refine.ts`
-- [x] Add to tool registry
-- [x] Typecheck passes
-
----
-
-## 5. Bug Fix Pass ⬜ IN PROGRESS
-
-**Problem**: Code review on 2026-03-17 found 16 bugs across new features (2 critical, 4 high, 6 medium, 4 low). See `BUGS.md` for full details.
-
-### P0 — Critical (blocks feature correctness)
-
-- [ ] **#28** Refine tool: evaluator/optimizer have no visibility into actual changes — include `git diff` or file paths in prompt
-- [ ] **#29** Refine tool: `tools: {}` may prevent agents from using tools — verify behavior and fix
-- [ ] **#32** Skill template returns `Promise` not `string` — verify consumer compatibility
-
-### P1 — High (incorrect behavior)
-
-- [ ] **#21** Circuit breaker `lastFailure` set after throw — move assignment before throw
-- [ ] **#27** Verify config shallow merge loses nested keys — use deep merge
-- [ ] **#36** Evaluator agent has bash access — remove or restrict
-
-### P2 — Medium (suboptimal behavior)
-
-- [ ] **#22** Circuit breaker never resets on success — add `breaker.reset()` on pass
-- [ ] **#24** Default cooldown (1s) effectively zero — increase to 30s+
-- [ ] **#25** `scope`/`files`/`criteria` params unused — implement or remove
-- [ ] **#30** `parseEvaluation` brittle against LLM template echoing — improve parsing
-- [ ] **#34** Scripts: arbitrary argument injection — add validation
-- [ ] **#36** Evaluator agent has bash (also medium from security angle)
-
-### P3 — Low (cosmetic/minor)
-
-- [ ] **#23** Circuit breaker inverted `open` semantics — rename to `closed` or `tripped`
-- [ ] **#26** `command.split(" ")` breaks quoted args — use shell-word splitter
-- [ ] **#31** Refine child sessions never cleaned up — add cleanup
-- [ ] **#33** `Skill.get()` re-parses file every call — add content cache
-- [ ] **#35** Scripts tool ID collision — use `::` separator
+Evaluator agent reviews changes, optimizer improves based on feedback. Refine tool orchestrates the loop.
 
 ---
 
-## Priority & Dependencies
+## 5. Bug Fix Pass ✅ COMPLETE
 
-| Feature | Priority | Status | Dependencies |
-| ---------------------- | -------- | ------------- | ----------------- |
-| Plan Mode Fixes | P0 | ✅ Complete | None |
-| Verification Tool | P1 | ⚠️ Has bugs | None |
-| Progressive Disclosure | P1 | ⚠️ Has bugs | None |
-| Skills as Scripts | P2 | ⚠️ Has bugs | None |
-| Evaluator-Optimizer | P2 | ⚠️ Has bugs | Verification Tool |
-| Bug Fix Pass | P0 | ⬜ Not started | All above |
+All 16 bugs (#21-#36) from the 2026-03-17 code review fixed in PR #12. 25 regression tests added.
 
 ---
 
-## Estimated LOC
-
-| Feature | ~LOC | Actual |
-| ---------------------- | ---- | ------ |
-| Plan Mode Fixes | ~20 | ~20 |
-| Verification Tool | ~120 | 259 |
-| Progressive Disclosure | ~60 | ~30 |
-| Skills as Scripts | ~100 | 107 |
-| Evaluator-Optimizer | ~150 | 207 |
-| Bug Fix Pass | ~100 | TBD |
-| **Total** | ~630 | ~623+ |
+## Summary
+
+| Feature | Status |
+| ---------------------- | ------------- |
+| Plan Mode Fixes | ✅ Complete |
+| Verification Tool | ✅ Complete |
+| Progressive Disclosure | ✅ Complete |
+| Skills as Scripts | ✅ Complete |
+| Evaluator-Optimizer | ✅ Complete |
+| Bug Fix Pass (16 bugs) | ✅ Complete |
+| Upstream Rebase | ⬜ Next |
diff --git a/STATUS.md b/STATUS.md
index 97ec83bbf..058b26137 100644
--- a/STATUS.md
+++ b/STATUS.md
@@ -1,37 +1,58 @@
-# Frankencode Status
+# Frankencode — Project Status
 
-## Session: 2026-03-17
+**Date:** 2026-03-18
+**Upstream:** `anomalyco/opencode` @ `dev`
+**Fork:** `e6qu/frankencode` @ `dev`
 
-### Completed
+## Overview
 
-- ✅ Plan Mode Fixes (removed experimental flag, enabled plan_enter tool)
-- ✅ Fixed `focus-rewrite-history` agent missing tool permissions
-- ✅ Researched 6 months of Claude Blog posts
-- ✅ Designed 4 new features (Verification, Progressive Disclosure, Skills as Scripts, Evaluator-Optimizer)
-- ✅ Verification Tool (`/verify` command with circuit-breaker)
-- ✅ Progressive Disclosure for Skills (lazy load content on demand)
-- ✅ Evaluator-Optimizer (evaluator/optimizer agents + refine tool)
-- ✅ Skills as Scripts (scripts in skill directories become callable tools)
-- ✅ Code review of all new features — found 16 bugs (2 critical, 4 high, 6 medium, 4 low)
+Frankencode is a fork of OpenCode that adds surgical, reversible, agent-driven context editing with content-addressable storage and a conversation history graph. All 4 planned phases are implemented. Currently in hardening/testing phase.
 
-### In Progress
+## Branch Status
 
-- ⬜ Bug fix pass — 16 bugs logged in `BUGS.md` (#21-#36)
+| Branch | Status | PR |
+|--------|--------|----|
+| `dev` | Main development branch | — |
+| `fix/code-review-bugs` | 16 bug fixes + 25 tests | [#12](https://github.com/e6qu/frankencode/pull/12) (merged) |
+| `docs/upstream-sync-notes` | Docs update with upstream analysis | [#13](https://github.com/e6qu/frankencode/pull/13) |
 
-### Blocked
+## Upstream Divergence
 
-- Refine tool (#28, #29) — fundamentally incomplete, evaluator/optimizer have no context about actual changes
+- **10 commits ahead** of upstream (Frankencode features)
+- **~50 commits behind** upstream (Effect refactors, bug fixes, model updates)
 
-### Critical Issues to Resolve Before Merge
+### Upstream changes requiring attention:
 
-1. **Refine tool is non-functional** — evaluator receives no code context, optimizer may have no tools
-2. **Skill template returns Promise** — may inject `"[object Promise]"` into prompts
-3. **Verify circuit breaker has 4 interacting bugs** — lastFailure timing, no success reset, 1s cooldown, shallow config merge
+1. **Effect-ification** — `SkillService`, `FileService`, `FormatService`, `VcsService`, etc. refactored to Effect scoped services
+2. **`instance-state.ts` deleted** — our `Instance.state()` usage needs review
+3. **`skill.ts` rewritten** (333 lines changed) — conflicts with our content cache
+4. **`prompt.ts` changed** (~99 lines) — conflicts with our filterEdited/filterEphemeral pipeline
+5. **`message-v2.ts` changed** (~107 lines) — conflicts with our EditMeta/LifecycleMeta additions
 
-### Next Steps
+## Test Status
 
-1. Fix P0 critical bugs (#28, #29, #32)
-2. Fix P1 high bugs (#21, #27, #36)
-3. Fix P2 medium bugs (#22, #24, #25, #30, #34)
-4. Unit tests for all new features
-5. Integration testing
+- **1401 tests passing**, 0 failures, 8 skipped
+- **25 new regression tests** for bug fixes (verify, refine, scripts, skill cache, agent permissions)
+- **Typecheck:** clean (`bun typecheck`)
+
+## Bug Status
+
+- **0 active bugs**
+- **40 bugs fixed** (tracked in BUGS.md)
+- **4 open design issues** (CAS GC, objective staleness, EditGraph leak, CAS ownership)
+
+## Feature Inventory
+
+| Feature | Status | Files |
+|---------|--------|-------|
+| Content-Addressable Store | Done | `src/cas/` |
+| Context editing (6 operations) | Done | `src/context-edit/`, `src/tool/context-edit.ts` |
+| Edit graph (DAG history) | Done | `src/cas/graph.ts`, `src/tool/context-history.ts` |
+| Side threads | Done | `src/session/side-thread.ts`, `src/tool/thread-*.ts` |
+| Focus agent | Done | `src/agent/agent.ts`, `src/agent/prompt/focus.txt` |
+| Classifier + distill | Done | `src/tool/classifier-threads.ts`, `src/tool/distill-threads.ts` |
+| Ephemeral commands | Done | `src/command/index.ts`, `src/session/prompt.ts` |
+| Verify tool | Done | `src/tool/verify.ts` |
+| Refine tool | Done | `src/tool/refine.ts` |
+| Script discovery | Done | `src/skill/scripts.ts` |
+| /cost command | Done | TUI dialog |
diff --git a/WHAT_WE_DID.md b/WHAT_WE_DID.md
index f4debf00b..6d5f2d100 100644
--- a/WHAT_WE_DID.md
+++ b/WHAT_WE_DID.md
@@ -104,67 +104,95 @@ Root-level: `PLAN.md`, `WHAT_WE_DID.md`, `DO_NEXT.md`
 
 ### Modified files:
 
-| File | Changes |
-| ------------------------------ | ------------------------------------------------------------------ |
-| `src/session/message-v2.ts` | +EditMeta +LifecycleMeta on PartBase, +filterEdited() |
-| `src/session/prompt.ts` | +filterEdited +sweeper in pipeline, +focus status in system prompt |
-| `src/storage/schema.ts` | +exports for new tables |
-| `src/tool/registry.ts` | +10 new tools in BUILTIN array, +removed experimental flag checks |
-| `src/agent/agent.ts` | +classifier +focus +focus-rewrite-history agent definitions |
-| `src/command/index.ts` | +btw +focus +focus-rewrite-history +reset-context commands |
-| `packages/plugin/src/index.ts` | +context.edit.before/after hook types |
-| `src/tool/plan.ts` | +uncommented PlanEnterTool, +exported |
-
-### Phase 7-9 New files:
-
-| File | Purpose |
-| --------------------------------- | --------------------------------------------- |
-| `src/tool/verify.ts` | Verification tool with circuit-breaker |
-| `src/tool/refine.ts` | Evaluator-optimizer loop tool |
-| `src/skill/scripts.ts` | Skill scripts discovery and tool registration |
-| `src/agent/prompt/evaluator.txt` | Evaluator agent prompt |
-| `src/agent/prompt/optimizer.txt` | Optimizer agent prompt |
-| `src/command/template/verify.txt` | /verify command template |
-
-### Phase 7-9 Modified files:
-
-| File | Changes |
-| ---------------------- | --------------------------------------------- |
-| `src/skill/skill.ts` | +Meta/Loaded types, lazy content loading |
-| `src/tool/registry.ts` | +VerifyTool, +RefineTool, +Scripts.asTools() |
-| `src/command/index.ts` | +/verify command, lazy skill template loading |
-| `src/agent/agent.ts` | +evaluator +optimizer agent definitions |
-
-## Phase 7: Verification Tool + Progressive Disclosure
-
-- Verification tool (`/verify` command) with circuit-breaker for test/lint/typecheck
-- Progressive disclosure for skills: metadata loaded at startup, content lazy-loaded on demand
-- Added `Meta` and `Loaded` types to skill schema
-- `Skill.get()` now returns full content, `Skill.all()` returns only metadata
-
-## Phase 8: Evaluator-Optimizer
-
-- Evaluator agent reviews code changes against quality criteria (correctness, completeness, quality, best practices)
-- Optimizer agent improves code based on evaluator feedback
-- Refine tool orchestrates evaluator → optimizer loop until quality threshold (score >= 7) or max iterations
-- Scoring system (1-10) with structured output format
-
-## Phase 9: Skills as Scripts
-
-- Skills can include `scripts/` directory with executable files (.ts, .js, .py, .sh)
-- Scripts automatically discovered and registered as callable tools
-- Tool naming: `{skill}_{script_name}` (e.g., `agents-sdk_setup`)
-- Scripts executed with appropriate interpreter based on extension
-
-## Phase 10: Code Review
-
-- Systematic review of all new code from Phases 6-9
-- Found 16 bugs across 6 files (2 critical, 4 high, 6 medium, 4 low)
-- **Circuit breaker** (verify.ts): 4 interacting bugs — lastFailure unreachable after throw, no reset on success, inverted naming, 1s cooldown ineffective
-- **Refine tool** (refine.ts): evaluator/optimizer receive no code context (no diff, no file paths), `tools: {}` may block tool usage, brittle XML parsing, session leak
-- **Skill progressive disclosure** (skill.ts, command/index.ts): template getter returns Promise not string, content re-parsed on every load
-- **Scripts** (scripts.ts): argument injection risk, tool ID collision with underscored names
-- **Config** (verify.ts): shallow merge loses nested circuitBreaker defaults
-- **Agent permissions** (agent.ts): evaluator has bash access despite being read-only reviewer
-- Logged all bugs in `BUGS.md` (#21-#36)
-- Updated PLAN.md with bug fix pass (Section 5) prioritized by severity
+| File | Changes |
+|------|---------|
+| `src/session/message-v2.ts` | +EditMeta +LifecycleMeta on PartBase, +filterEdited() |
+| `src/session/prompt.ts` | +filterEdited +sweeper in pipeline, +focus status in system prompt |
+| `src/storage/schema.ts` | +exports for new tables |
+| `src/tool/registry.ts` | +10 new tools in BUILTIN array |
+| `src/agent/agent.ts` | +classifier +focus +focus-rewrite-history agent definitions |
+| `src/command/index.ts` | +btw +focus +focus-rewrite-history +reset-context commands |
+| `packages/plugin/src/index.ts` | +context.edit.before/after hook types |
+| `src/tool/plan.ts` | +uncommented PlanEnterTool, +exported |
+
+---
+
+## Phase 5: Hardening — Ephemeral Commands + Bug Fixes
+
+### Ephemeral commands (PRs #7, #8)
+
+- `/threads`, `/history`, `/tree`, `/deref`, `/classify` — readonly commands that don't pollute context
+- Fork-based ephemeral: fork session → run prompt → extract result → delete session
+- `filterEphemeral()` — drops ephemeral messages from LLM context entirely
+- Fixed schema crash (`afterTurns: 0` violated `min(1)`) and content leak into 1 LLM turn
+
+### /cost TUI command (PR #11)
+
+- `/cost` slash command showing session usage (input/output/cache tokens, cost breakdown)
+- TUI dialog with formatted cost metrics
+
+### Code review bug fixes (PRs #10, #12)
+
+- Fixed 40 bugs total across the codebase (24 in earlier PRs, 16 in PR #12)
+- Circuit breaker fixes in `verify.ts`: lastFailure timing, success reset, naming, cooldown, config merge, command splitting
+- Refine tool fixes: evaluator context, tool access, parsing robustness, session cleanup
+- Script tool fixes: argument injection prevention, tool ID collision
+- Skill content caching, evaluator permission lockdown
+- 25 regression tests covering all fixed bugs
+
+### New files (Phase 5):
+
+| File | Purpose |
+|------|---------|
+| `src/tool/verify.ts` | Verify tool (test/lint/typecheck with circuit breaker) |
+| `src/tool/refine.ts` | Refine tool (evaluator-optimizer loop) |
+| `src/skill/scripts.ts` | Script discovery and execution from skills |
+| `src/agent/prompt/evaluator.txt` | Evaluator agent prompt |
+| `src/agent/prompt/optimizer.txt` | Optimizer agent prompt |
+| `src/command/template/verify.txt` | /verify command template |
+| `test/tool/verify.test.ts` | Verify tool tests (circuit breaker, config, commands) |
+| `test/tool/refine.test.ts` | Refine tool tests (parseEvaluation, session cleanup) |
+| `test/tool/scripts.test.ts` | Script tool tests (ID format, arg injection) |
+| `test/skill/skill-cache.test.ts` | Skill content caching tests |
+
+### Modified files (Phase 5):
+
+| File | Changes |
+|------|---------|
+| `src/agent/agent.ts` | +evaluator +optimizer agents, fixed evaluator perms |
+| `src/command/index.ts` | +verify +objective +threads +history +tree +deref +classify commands |
+| `src/config/config.ts` | +verification config schema |
+| `src/session/prompt.ts` | +filterEphemeral in pipeline |
+| `src/skill/skill.ts` | +content cache with state-reload clearing |
+| `test/agent/agent.test.ts` | +evaluator/optimizer permission tests |
+
+---
+
+## Upstream Sync Status (2026-03-18)
+
+**Upstream:** `anomalyco/opencode` (`dev` branch)
+**Our fork:** `e6qu/frankencode` (`dev` branch)
+**Divergence:** 10 commits ahead, ~50 commits behind
+
+### Notable upstream changes since fork:
+
+- **Effect-ification wave:** `SkillService`, `FileService`, `FormatService`, `FileTimeService`, `VcsService`, `FileWatcherService` all refactored to Effect scoped services with `LayerMap`
+- **Instance refactor:** `instance-state.ts` deleted, services moved to Effect layer
+- **Compaction fix:** Message transforms now applied during compaction (#17823)
+- **Context overflow:** `context_length_exceeded` error code now handled (#17748)
+- **Permission fix:** Prompt tool enables preserved with empty agent permissions (#17064)
+- **VCS fix:** HEAD filter bug fixed (#17829)
+- **Zen updates:** Model pricing, Gemini 3 Pro deprecated
+- **Docs:** `tools` config marked deprecated (#17951), snapshot config annotated (#17861)
+
+### Rebase risk assessment:
+
+| Area | Risk | Notes |
+|------|------|-------|
+| `skill/skill.ts` | **High** | Upstream rewrote to Effect service (333 lines changed); we added content cache |
+| `session/prompt.ts` | **High** | Upstream changed ~99 lines; we added filterEdited, filterEphemeral, focus injection |
+| `session/message-v2.ts` | **Medium** | Upstream changed ~107 lines; we added EditMeta, LifecycleMeta, filterEdited |
+| `project/instance.ts` | **Medium** | Upstream refactored Instance; we use `Instance.state()` for skill cache |
+| `agent/agent.ts` | **Low** | Upstream didn't touch agent definitions; our changes are additive |
+| `tool/registry.ts` | **Low** | Upstream removed some tools; we added 9 |
+| New Frankencode files | **None** | CAS, edit graph, context tools — no upstream conflict |
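
Reviewer note on the circuit-breaker items in this diff (failure timestamp recorded before the throw, reset on success, `tripped` instead of the inverted `open`, a cooldown of 30 s or more): the intended behavior might look roughly like the sketch below. It is an illustration under stated assumptions, not the code in `src/tool/verify.ts`. The class name, the `run` method, the injected clock, and the default threshold of 3 failures are all hypothetical; only `lastFailure`, `reset()`, `cooldownMs`, and `tripped` echo names that appear in the bug list.

```typescript
// Hypothetical sketch: these names are illustrative and are NOT taken from
// the repository's verify.ts.
class CircuitOpenError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "CircuitOpenError";
  }
}

class CircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  // maxFailures = 3 is an assumed default; 30_000 ms follows the "30000+" cooldown item.
  constructor(maxFailures = 3, cooldownMs = 30_000, now: () => number = () => Date.now()) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True while calls are being rejected: enough failures, cooldown not yet elapsed.
  get tripped(): boolean {
    return this.failures >= this.maxFailures && this.now() - this.lastFailure < this.cooldownMs;
  }

  // Clears the failure history.
  reset(): void {
    this.failures = 0;
    this.lastFailure = 0;
  }

  // Runs a check, updating breaker state on success or failure.
  run<T>(check: () => T): T {
    if (this.tripped) {
      throw new CircuitOpenError("verification paused: too many recent failures");
    }
    try {
      const result = check();
      this.reset(); // a passing check resets the breaker
      return result;
    } catch (err) {
      this.failures += 1;
      this.lastFailure = this.now(); // recorded before the error propagates
      throw err;
    }
  }
}
```

Injecting the clock keeps the cooldown testable without real waiting, which is one plausible way the circuit-breaker regression tests could be structured; the actual tests in `test/tool/verify.test.ts` may take a different approach.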