[Enhancement]: Support running CPE as an MCP Server #112

@spachava753

Description

Enhancement Category

New MCP tool/server integration

Problem or Use Case

I want to use CPE as a sub-agent within other MCP-compliant environments (like Claude Desktop, another CPE instance, or other AI assistants).

Currently, CPE operates primarily as a CLI tool and MCP client. While it can consume other MCP servers, it cannot easily be consumed by others.

There is an existing issue (#104) for a native subagent tool, but a more composable and standard approach would be to expose CPE itself as an MCP Server. This would allow:

  1. Composition: One CPE instance can call another CPE instance as a tool (sub-agent).
  2. Interoperability: Any MCP client can utilize CPE's powerful "Code Mode" and agentic capabilities.
  3. Structured Data: For CPE to be a useful tool, it often needs to return structured data (JSON) rather than just unstructured text to stdout.
  4. Agent Hierarchies and Specialization: By exposing CPE as an MCP server, users can compose trees of specialized agents. A "root" agent could have access to tools like run_architect (a slow, reasoning-heavy CPE instance) and run_linter (a fast, specialized CPE instance). Since those sub-agents can themselves be CPE instances with their own MCP tools, this enables arbitrarily deep, customized agent trees.

Proposed Solution

1. New Command: `cpe mcp serve`

Implements the MCP server protocol over the stdio transport. Exposes the subagent defined in the config file as an MCP tool.

2. Subagent Configuration

Each cpe.yaml defines a single subagent. This keeps configs self-contained since each subagent can have drastically different requirements for system prompts, models, and tool configurations.

New top-level field:

```yaml
# architect.cpe.yaml - specialized architect subagent
version: "1.0"

subagent:
  toolName: "consult_architect"
  toolDescription: "Consult an architect agent for high-level design decisions and system architecture planning. Returns structured analysis."
  outputSchemaPath: "./schemas/architect_response.json"

models:
  - ref: opus
    id: claude-opus-4-20250514
    type: anthropic
    api_key_env: ANTHROPIC_API_KEY
    # ... full model config

defaults:
  model: "opus"
  systemPromptPath: "./prompts/architect.prompt"
  codeMode:
    enabled: false

mcpServers: {}  # No tools needed for this agent
```
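For illustration, the schema file referenced by `outputSchemaPath` above might contain something like the following. The field names here are hypothetical; the actual schema is entirely up to the user:

```json
{
  "type": "object",
  "properties": {
    "summary": { "type": "string" },
    "recommendations": {
      "type": "array",
      "items": { "type": "string" }
    },
    "risks": {
      "type": "array",
      "items": { "type": "string" }
    }
  },
  "required": ["summary", "recommendations"]
}
```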
```yaml
# search.cpe.yaml - cheap search-only subagent
version: "1.0"

subagent:
  toolName: "web_search"
  toolDescription: "Fast, cheap search agent for web lookups. No code execution capability."
  outputSchemaPath: "./schemas/search_response.json"

models:
  - ref: flash
    id: gemini-2.0-flash
    type: gemini
    api_key_env: GEMINI_API_KEY

defaults:
  model: "flash"
  codeMode:
    enabled: false

mcpServers:
  exa:
    command: "exa-mcp"
    type: stdio
```

3. Parent Agent Configuration

The parent agent configures subagents as MCP servers:

```yaml
# parent cpe.yaml
mcpServers:
  architect:
    command: "cpe"
    args: ["mcp", "serve", "--config", "./architect.cpe.yaml"]
    type: stdio
  search:
    command: "cpe"
    args: ["mcp", "serve", "--config", "./search.cpe.yaml"]
    type: stdio
```

4. Structured Output via `final_answer` Tool

When `outputSchemaPath` is defined, CPE uses a `final_answer` tool pattern to enforce structured output:

  1. Create a `final_answer` tool whose input schema is the output schema loaded from the file
  2. Register it with a nil callback (leveraging existing gai behavior, where a nil callback terminates execution immediately)
  3. When the agent calls `final_answer(...)`, generation terminates
  4. Extract the tool call parameters from the returned dialog and return them as the subagent's structured output

This approach:

  • Avoids parsing unstructured LLM text output
  • Leverages native tool calling for schema validation
  • Uses existing gai infrastructure (no new callback machinery needed)
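Step 4 above can be sketched roughly as follows. This is an illustrative Python sketch only: the dialog is modeled as plain dicts, while the real implementation would operate on gai's Go dialog types, whose shape this does not claim to match.

```python
import json

def extract_final_answer(dialog):
    """Return the parsed arguments of the last final_answer call, or None.

    `dialog` is a list of message dicts; tool calls carry their arguments
    as a JSON string, mirroring common tool-calling APIs.
    """
    for msg in reversed(dialog):
        for call in msg.get("tool_calls", []):
            if call["name"] == "final_answer":
                # The arguments already conform to the output schema,
                # since it was used as the tool's input schema.
                return json.loads(call["arguments"])
    return None

# Example: a finished dialog ending in a final_answer call.
dialog = [
    {"role": "user", "content": "Design a job queue."},
    {"role": "assistant", "tool_calls": [
        {"name": "final_answer",
         "arguments": '{"summary": "Use a durable queue", "risks": []}'}
    ]},
]
result = extract_final_answer(dialog)
```

The key point is that no text parsing is involved: the structured output is read directly from the tool call's arguments.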

5. Tool Input Schema

Each exposed subagent tool accepts:

  • `prompt` (required string): The instruction for the subagent
  • `inputs` (optional array of strings): File paths or URLs to include as context
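Expressed as JSON Schema, that input contract could look roughly like this (a sketch; the exact schema CPE would advertise may differ):

```json
{
  "type": "object",
  "properties": {
    "prompt": {
      "type": "string",
      "description": "The instruction for the subagent"
    },
    "inputs": {
      "type": "array",
      "items": { "type": "string" },
      "description": "File paths or URLs to include as context"
    }
  },
  "required": ["prompt"]
}
```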

6. Bounded Parallelism Guidance

When code mode is enabled and subagent tools are available, the parent agent's system prompt should include guidance like:

> When calling subagent tools, limit concurrent calls to 8 to avoid rate limiting issues.

This enables "wide" or fan-out intelligence patterns where the parent agent can dispatch multiple specialized subagents in parallel.
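The fan-out pattern the guidance describes can be sketched like this. It is a minimal Python illustration, not CPE's code mode: `call_subagent` is a hypothetical stand-in for a real MCP tool invocation, and the semaphore enforces the suggested cap of 8 in-flight calls.

```python
import asyncio

async def call_subagent(prompt: str) -> str:
    # Stand-in for a real MCP round trip to a subagent server.
    await asyncio.sleep(0)
    return f"result for {prompt!r}"

async def fan_out(prompts, limit=8):
    """Dispatch all prompts concurrently, at most `limit` in flight."""
    sem = asyncio.Semaphore(limit)

    async def bounded(p):
        async with sem:
            return await call_subagent(p)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(bounded(p) for p in prompts))

results = asyncio.run(fan_out([f"task {i}" for i in range(20)]))
```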

Alternatives Considered

  • Native Subagent Tool (#104): Building a specific internal tool inside CPE to spawn child processes. While useful, it's specific to CPE. Making CPE an MCP server is more generic and standard.
  • Multiple subagents per config: Initially considered a plural `subagents` field defining multiple tools in one config. Rejected because each subagent benefits from having its own complete, self-contained configuration file.
  • Scripting: Wrapping CPE in shell scripts to parse output. This is brittle and doesn't handle structured data well.

Impact Scope

Users with MCP integrations

Additional Context

This aligns with the "Unix philosophy" of small, composable tools, but applied to AI agents. It leverages the existing "Code Mode" strength of CPE, making it available as a service to other agents.

Labels: enhancement (New feature or request)
