refactor: Major codebase restructuring with improved architecture and testing by spachava753 · Pull Request #163 · spachava753/cpe

spachava753 · 2026-02-02T05:22:28Z

Summary

This PR represents a significant refactoring effort to improve code organization, testability, and maintainability across the codebase.

Key Changes

Architecture Improvements

Separated Cobra commands from business logic - Moved business logic from cmd/ to internal/commands/ to decouple the CLI framework from application logic. This allows replacing Cobra in the future without rewriting command implementations.
Split config loader into focused modules - The monolithic loader.go has been separated into four single-responsibility files: loader.go (file discovery/parsing), env.go (environment variable expansion), validator.go (validation rules), and resolver.go (parameter merging/resolution).
Unified duplicate interfaces - Three interface patterns (ToolRegistrar, Generator, Renderer) were duplicated across packages. All duplicates now reference canonical definitions in internal/types.
Adopted gai.GeneratorWrapper for middleware pattern - Refactored generator wrappers to embed GeneratorWrapper for automatic interface delegation. Each wrapper now only overrides Generate, with Register/Count/Stream delegated automatically.
Moved MCP connection to separate initialization phase - Introduced a two-phase approach to MCP configuration resolution. MCP server connections are now established in a dedicated initialization phase before generator creation.
Functional options pattern for generator creation - Replaced CreateToolCapableGenerator with NewGenerator using idiomatic Go functional options, separating required from optional parameters.
Replaced callback wrapping with generator middleware - The subagent logging system now uses a single emittingMiddleware that wraps the ToolCapableGenerator's Generate method, eliminating the need for callback interception.

New Features

Verbosity control for subagent output - Subagent logging now defaults to concise mode. Verbose mode available via --verbose-subagent flag or CPE_VERBOSE_SUBAGENT=true.
Custom linter for cmd package structure - Added a go/ast-based linter that ensures the cmd package only contains Cobra command setup.
code-desc subcommand - New command to print execute_go_code tool description, showing exactly what would be provided to the LLM.

Testing Improvements

Converted to cupaloy snapshot testing - Replaced hardcoded expected values with cupaloy snapshot assertions across 41 test files, removing ~2700 lines of inline expected values and capturing them in 363 auto-generated snapshot files.
Added comprehensive specification docs - New specs for testing strategy, conversation persistence, MCP handling, and scripts.

Dependency Updates

Upgraded gai to v0.29.0 (new ToolResultMessage API with composable block types)
Updated openai-go from v2 to v3 (transitive dependency)

Bug Fixes

Fixed middleware order for token usage printing
Fixed regression in FormatExecuteGoCodeResultMarkdown where shell code fences were accidentally removed
Added goconst linter and extracted repeated string literals

Breaking Changes

MCP server connection errors now fail fast instead of continuing with partial success
Duplicate tool names across servers now cause immediate failure instead of being silently skipped

Stats

491 files changed
~13,400 insertions, ~6,300 deletions
20 commits

Move the JSON schema generator from a standalone cmd/gen-schema package to a goyek task in scripts/. This consolidates development utilities into the existing task runner, making it consistent with other dev tasks like lint and debug-proxy. Run with: go run ./scripts gen-schema

Replace hardcoded expected values with cupaloy snapshot assertions across 41 test files. This removes ~2700 lines of inline expected values and captures them in 363 auto-generated snapshot files. Snapshot testing improves test maintainability by separating expected output from test logic. When behavior changes intentionally, snapshots can be updated with UPDATE_SNAPSHOTS=true instead of manually editing expected strings. Tests that check error conditions with dynamic content (temp paths, timing) or boolean conditions were left with their original assertion patterns.

Add four new specification documents covering previously undocumented parts of the codebase. The testing strategy spec documents the testing philosophy, table-driven test patterns, snapshot testing with cupaloy, mocking strategies, and test utilities. The conversation persistence spec covers SQLite-backed storage, the message model with parent-child relationships, CLI integration, and the DialogStorage interface. The MCP handling spec documents client integration, server configuration, connection lifecycle, tool discovery and execution, and CLI commands. The scripts spec covers the Goyek-based task runner, available tasks including lint and debug proxies, and how to add new tasks. Each spec follows the established format from existing specs with overview, design, architecture, and implementation details.

Three interface patterns were duplicated across agent, commands, and subagentlog packages: - ToolRegistrar (tool registration) - Generator (dialog generation) - Renderer (content rendering) All duplicates now reference the canonical definitions in internal/types. This reduces code duplication and ensures consistent interface contracts across packages.

The monolithic loader.go has been separated into four single-responsibility files: loader.go for file discovery and parsing, env.go for environment variable expansion, validator.go for validation rules, and resolver.go for parameter merging and resolution logic. Each module now has dedicated tests with comprehensive coverage including edge cases like config format fallback parsing, key collision behavior in environment expansion, and timeout resolution precedence. The refactoring also improves error handling with more informative messages and fixes a potential bug where partial YAML parsing could corrupt the config struct during format fallback.

Move business logic from cmd/ package to internal/commands/ to decouple the CLI framework from application logic. This allows replacing Cobra in the future without rewriting command implementations and improves testability by making business logic independent of flag parsing. The cmd/ package now contains only Cobra command definitions, flags, and command tree structure. All execution logic including user input processing, configuration resolution, system prompt loading, generator creation, and storage management is delegated to internal/commands/.

The lint task now includes a go/ast-based linter that ensures the cmd package only contains cobra command setup (global variables, init functions, and the Execute entry point). Any other functions or methods are flagged with a message directing developers to move business logic to ./internal packages.

Address golangci-lint issues including goimports formatting, context propagation, and resource cleanup. The subagent event server now uses context-aware listener creation and bounded shutdown timeout to prevent potential hangs. Test files use context-aware HTTP functions and properly close response bodies.

Upgrade gai to v0.27.0 and refactor generator wrappers to embed GeneratorWrapper for automatic interface delegation. Each wrapper now only overrides Generate, with Register/Count/Stream delegated automatically. All wrappers expose WithXxx WrapperFuncs for use with gai.Wrap, and generator.go composes the middleware stack declaratively. Block filters and tool result printer move into the middleware stack to operate at the gai.Generator level before the tool execution loop.

…e instead of Register The previous implementation wrapped tool callbacks via the Register method, which required the inner generator to implement types.ToolRegistrar (two-parameter Register). After upgrading gai, the inner generator implements gai.ToolCapableGenerator which only has a single-parameter Register method. The new approach overrides Generate and checks if the last message in the dialog is a tool result. It looks up the tool name by matching the tool call ID from the previous assistant message. This maintains the same user-facing behavior while being compatible with the updated gai interfaces. Additional improvements include accepting an io.Writer for output instead of hardcoding os.Stderr, and displaying mimetype information for multimedia tool results instead of skipping them. Test coverage has been expanded to cover edge cases like unknown tool names, renderer errors, empty dialogs, and multimedia content.

Enable the goconst linter in golangci-lint to detect repeated strings that should be constants. Extract constants across multiple packages to satisfy the linter and improve maintainability. Also fix a regression in FormatExecuteGoCodeResultMarkdown where shell code fences were accidentally removed in a previous refactor.

The TokenUsagePrinter was being applied before ResponsePrinter in the middleware stack, causing token usage to print before assistant responses. Since gai.Wrap applies wrappers in reverse order, TokenUsagePrinting must come first in the slice to be outermost and print after response content.

The test assertion checking for "content, err := Run(ctx)" in the execute_go_code description was no longer valid after changes to the generated documentation.

…cription This command shows exactly what description would be provided to the LLM when code mode is enabled, including all MCP tools converted to Go functions. It connects to configured MCP servers, fetches their tools, and applies the same partitioning logic used during agent execution. The output respects both server-level tool filtering (enabledTools/disabledTools) and the codeMode.excludedTools configuration, ensuring excluded tools do not appear in the generated Go function definitions. The command also accepts the --model flag to use model-specific code mode settings.

Subagent logging now defaults to concise mode which shows abbreviated output: tool names only (without payloads), tool results as success indicators (✓), and thought traces truncated to 2 lines. This reduces noise while still providing visibility into subagent activity. The previous verbose behavior showing full tool payloads, complete thought traces, and full tool results is available via --verbose-subagent flag or CPE_VERBOSE_SUBAGENT=true environment variable. The implementation adds a RenderMode type to the renderer with RenderModeConcise as the default (zero value) and RenderModeVerbose for full output. Filtering happens in the root agent's renderer, not in subagents, allowing the same event stream to support different verbosity preferences.

This refactoring introduces a two-phase approach to MCP configuration resolution. MCP server connections are now established in a dedicated initialization phase before generator creation, keeping config resolution pure while enabling better testability and explicit lifecycle management. New types and functions have been added to support this architecture. MCPState holds all active MCP connections keyed by server name, and MCPConn represents a single server connection with its filtered tools. The InitializeConnections function connects to all configured servers and applies tool filtering upfront. The CreateToolCapableGenerator function now accepts a pre-initialized MCPState instead of raw server configs. Tool callbacks are created on-the-fly during registration using new helper functions NewToolCallback and ToGaiTool. The PartitionTools function has been updated to work with MCPState and return MCPConn entries directly, eliminating the need for the separate ServerToolsInfo type. The FetchTools function and ToolData type have been removed from the mcp package as they are no longer needed with the new initialization approach. BREAKING CHANGE: MCP server connection errors now fail fast instead of continuing with partial success. Duplicate tool names across servers now cause immediate failure instead of being silently skipped with a warning.

The CreateToolCapableGenerator function now accepts a *config.Config instead of extracting individual fields as separate parameters. This reduces the parameter count from 8 to 6 by consolidating config-derived values (Model, Timeout, CodeMode) into the config struct, while keeping genuinely runtime-specific parameters (systemPrompt, disablePrinting, mcpState, callbackWrapper) as separate arguments. The deleted config_refactor.md spec file is no longer needed as the refactoring work it described has been completed.

Replaces CreateToolCapableGenerator with NewGenerator using idiomatic Go functional options. This separates required parameters (context, config, system prompt, MCP state) from optional configuration, making the API cleaner and more extensible. The new API provides four options: WithDisablePrinting for non-interactive modes, WithCallbackWrapper for tool callback instrumentation, WithMiddleware for custom middleware injection, and WithBaseGenerator for testing or custom generator implementations. Also fixes an unsafe type assertion after gai.Wrap by using the comma-ok idiom, and removes an unused test helper function.

…eware for event emission The subagent logging system previously used a dual mechanism for emitting events: callback wrapping via EmittingToolCallback for tool_call/tool_result events, and inner generator wrapping for thought_trace events. This has been consolidated into a single emittingMiddleware that wraps the ToolCapableGenerator's Generate method. The middleware emits events by inspecting the dialog and response at the Generate boundary: tool_result events are emitted before calling the inner generator (for tool results in the dialog from the previous iteration), while thought_trace and tool_call events are emitted after the inner generator returns (from blocks in the response). This eliminates the need for callback interception during tool registration. The callback wrapper code in commands/mcp.go has been removed since event emission is now handled entirely by EmittingGenerator in ExecuteSubagent. Tests have been updated to use the new middleware pattern, and obsolete callback-based test snapshots have been removed.

The gai module upgrade to v0.29.0 brings a new ToolResultMessage API that uses composable block types instead of separate modality and mime-type parameters. This change required updating openai-go from v2 to v3 as a transitive dependency. The ToolResultMessage signature changed from (id, modality, mimeType, data) to (id, blocks...), requiring updates to all call sites to use the new TextBlock helper for text content.

spachava753 added 20 commits January 25, 2026 15:57

test(codemode): remove obsolete main.go example assertion

bee9aea

The test assertion checking for "content, err := Run(ctx)" in the execute_go_code description was no longer valid after changes to the generated documentation.

spachava753 merged commit f29807d into main Feb 2, 2026

spachava753 deleted the refactor branch February 15, 2026 21:16

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: Major codebase restructuring with improved architecture and testing#163

refactor: Major codebase restructuring with improved architecture and testing#163
spachava753 merged 20 commits intomainfrom
refactor

spachava753 commented Feb 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

spachava753 commented Feb 2, 2026

Summary

Key Changes

Architecture Improvements

New Features

Testing Improvements

Dependency Updates

Bug Fixes

Breaking Changes

Stats

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant