diff --git a/docs/users/features/_meta.ts b/docs/users/features/_meta.ts index f5218e85ff..9cf6d403f0 100644 --- a/docs/users/features/_meta.ts +++ b/docs/users/features/_meta.ts @@ -1,6 +1,7 @@ export default { commands: 'Commands', 'sub-agents': 'SubAgents', + arena: 'Agent Arena', skills: 'Skills', headless: 'Headless Mode', checkpointing: { diff --git a/docs/users/features/arena.md b/docs/users/features/arena.md new file mode 100644 index 0000000000..7b53238c7f --- /dev/null +++ b/docs/users/features/arena.md @@ -0,0 +1,218 @@ +# Agent Arena + +> Dispatch multiple AI models simultaneously to execute the same task, compare their solutions side-by-side, and select the best result to apply to your workspace. + +> [!warning] +> Agent Arena is experimental. It has [known limitations](#limitations) around display modes and session management. + +Agent Arena lets you pit multiple AI models against each other on the same task. Each model runs as a fully independent agent in its own isolated Git worktree, so file operations never interfere. When all agents finish, you compare results and select a winner to merge back into your main workspace. + +Unlike [subagents](/users/features/sub-agents), which delegate focused subtasks within a single session, Arena agents are complete, top-level agent instances — each with its own model, context window, and full tool access. + +This page covers: + +- [When to use Agent Arena](#when-to-use-agent-arena) +- [Starting an arena session](#start-an-arena-session) +- [Interacting with agents](#interact-with-agents), including display modes and navigation +- [Comparing results and selecting a winner](#compare-results-and-select-a-winner) +- [Best practices](#best-practices) + +## When to use Agent Arena + +Agent Arena is most effective when you want to **evaluate or compare** how different models tackle the same problem. 
The strongest use cases are: + +- **Model benchmarking**: Evaluate different models' capabilities on real tasks in your actual codebase, not synthetic benchmarks +- **Best-of-N selection**: Get multiple independent solutions and pick the best implementation +- **Exploring approaches**: See how different models reason about and solve the same problem — useful for learning and insight +- **Risk reduction**: For critical changes, validate that multiple models converge on a similar approach before committing + +Agent Arena uses significantly more tokens than a single session (each agent has its own context window and model calls). It works best when the value of comparison justifies the cost. For routine tasks where you trust your default model, a single session is more efficient. + +## Start an arena session + +Use the `/arena` slash command to launch a session. Specify the models you want to compete and the task: + +``` +/arena --models qwen3.5-plus,glm-5,kimi-k2.5 "Refactor the authentication module to use JWT tokens" +``` + +If you omit `--models`, an interactive model selection dialog appears, letting you pick from your configured providers. + +### What happens when you start + +1. **Worktree setup**: Qwen Code creates isolated Git worktrees for each agent at `~/.qwen/arena//worktrees//`. Each worktree mirrors your current working directory state exactly — including staged changes, unstaged changes, and untracked files. +2. **Agent spawning**: Each agent starts in its own worktree with full tool access and its configured model. Agents are launched sequentially but execute in parallel. +3. **Execution**: All agents work on the task independently with no shared state or communication. You can monitor their progress and interact with any of them. +4. **Completion**: When all agents finish (or fail), you enter the result comparison phase. 
+
+## Interact with agents
+
+### Display modes
+
+Agent Arena currently supports **in-process mode**, where all agents run asynchronously within the same terminal process. A tab bar at the bottom of the terminal lets you switch between agents.
+
+> [!note]
+> **Split-pane display modes are planned for the future.** We intend to support tmux-based and iTerm2-based split-pane layouts, where each agent gets its own terminal pane for true side-by-side viewing. Currently, only in-process tab switching is available.
+
+### Navigate between agents
+
+In in-process mode, use keyboard shortcuts to switch between agent views:
+
+| Shortcut | Action                            |
+| :------- | :-------------------------------- |
+| `Right`  | Switch to the next agent tab      |
+| `Left`   | Switch to the previous agent tab  |
+| `Up`     | Switch focus to the input box     |
+| `Down`   | Switch focus to the agent tab bar |
+
+The tab bar shows each agent's current status:
+
+| Indicator | Meaning                |
+| :-------- | :--------------------- |
+| `●`       | Running or idle        |
+| `✓`       | Completed successfully |
+| `✗`       | Failed                 |
+| `○`       | Cancelled              |
+
+### Interact with individual agents
+
+When viewing an agent's tab, you can:
+
+- **Send messages**: type in the input area to give the agent additional instructions
+- **Approve tool calls**: if an agent requests tool approval, the confirmation dialog appears in its tab
+- **View full history**: scroll through the agent's complete conversation, including model output, tool calls, and results
+
+Each agent is a full, independent session. Anything you can do with the main agent, you can do with an arena agent.
+
+## Compare results and select a winner
+
+When all agents complete, the Arena enters the result comparison phase. You'll see:
+
+- **Status summary**: Which agents succeeded, failed, or were cancelled
+- **Execution metrics**: Duration, rounds of reasoning, token usage, and tool call counts for each agent
+
+A selection dialog presents the successful agents.
Choose one to apply its changes to your main workspace, or discard all results. + +### What happens when you select a winner + +1. The winning agent's changes are extracted as a diff against the baseline +2. The diff is applied to your main working directory +3. All worktrees and temporary branches are cleaned up automatically + +If you want to inspect results before deciding, each agent's full conversation history is available via the tab bar while the selection dialog is active. + +## Configuration + +Arena behavior can be customized in [settings.json](/users/configuration/settings): + +```json +{ + "arena": { + "worktreeBaseDir": "~/.qwen/arena", + "maxRoundsPerAgent": 50, + "timeoutSeconds": 600 + } +} +``` + +| Setting | Description | Default | +| :------------------------ | :--------------------------------- | :-------------- | +| `arena.worktreeBaseDir` | Base directory for arena worktrees | `~/.qwen/arena` | +| `arena.maxRoundsPerAgent` | Maximum reasoning rounds per agent | `50` | +| `arena.timeoutSeconds` | Timeout for each agent in seconds | `600` | + +## Best practices + +### Choose models that complement each other + +Arena is most valuable when you compare models with meaningfully different strengths. For example: + +``` +/arena --models qwen3.5-plus,glm-5,kimi-k2.5 "Optimize the database query layer" +``` + +Comparing three versions of the same model family yields less insight than comparing across providers. + +### Keep tasks self-contained + +Arena agents work independently with no communication. Tasks should be fully describable in the prompt without requiring back-and-forth: + +**Good**: "Refactor the payment module to use the strategy pattern. Update all tests." + +**Less effective**: "Let's discuss how to improve the payment module" — this benefits from conversation, which is better suited to a single session. + +### Limit the number of agents + +Up to 5 agents can run simultaneously. 
In practice, 2-3 agents provide the best balance of comparison value to resource cost. More agents mean:
+
+- Higher token costs (each agent has its own context window)
+- Longer total execution time
+- More results to compare
+
+Start with 2-3 and scale up only when the comparison value justifies it.
+
+### Use Arena for high-impact decisions
+
+Arena shines when the stakes justify running multiple models:
+
+- Choosing an architecture for a new module
+- Selecting an approach for a complex refactor
+- Validating a critical bug fix from multiple angles
+
+For routine changes like renaming a variable or updating a config file, a single session is faster and cheaper.
+
+## Troubleshooting
+
+### Agents failing to start
+
+- Verify that each model in `--models` is properly configured with valid API credentials
+- Check that your working directory is a Git repository (worktrees require Git)
+- Ensure you have write access to the worktree base directory (`~/.qwen/arena/` by default)
+
+### Worktree creation fails
+
+- Run `git worktree list` to check for stale worktrees from previous sessions
+- Clean up stale worktrees with `git worktree prune`
+- Ensure your Git version supports worktrees (`git --version`, requires Git 2.5+)
+
+### Agent takes too long
+
+- Increase the timeout: set `arena.timeoutSeconds` in settings
+- Reduce task complexity: Arena tasks should be focused and well-defined
+- Lower `arena.maxRoundsPerAgent` if agents are spending too many rounds
+
+### Applying winner fails
+
+- Check for uncommitted changes in your main working directory that might conflict
+- The diff is applied as a patch, so merge conflicts are possible if your working directory changed during the session
+
+## Limitations
+
+Agent Arena is experimental. Current limitations:
+
+- **In-process mode only**: Split-pane display via tmux or iTerm2 is not yet available. All agents run within a single terminal window with tab switching.
+- **No diff preview before selection**: You can view each agent's conversation history, but there is no unified diff viewer to compare solutions side-by-side before picking a winner. +- **No worktree retention**: Worktrees are always cleaned up after selection. There is no option to preserve them for further inspection. +- **No session resumption**: Arena sessions cannot be resumed after exiting. If you close the terminal mid-session, worktrees remain on disk and must be cleaned up manually via `git worktree prune`. +- **Maximum 5 agents**: The hard limit of 5 concurrent agents cannot be changed. +- **Git repository required**: Arena requires a Git repository for worktree isolation. It cannot be used in non-Git directories. + +## Comparison with other multi-agent modes + +Agent Arena is one of several planned multi-agent modes in Qwen Code. **Agent Team** and **Agent Swarm** are not yet implemented — the table below describes their intended design for reference. + +| | **Agent Arena** | **Agent Team** (planned) | **Agent Swarm** (planned) | +| :---------------- | :----------------------------------------------------- | :------------------------------------------------- | :------------------------------------------------------- | +| **Goal** | Competitive: Find the best solution to the _same_ task | Collaborative: Tackle _different_ aspects together | Batch parallel: Dynamically spawn workers for bulk tasks | +| **Agents** | Pre-configured models compete independently | Teammates collaborate with assigned roles | Workers spawned on-the-fly, destroyed on completion | +| **Communication** | No inter-agent communication | Direct peer-to-peer messaging | One-way: results aggregated by parent | +| **Isolation** | Full: separate Git worktrees | Independent sessions with shared task list | Lightweight ephemeral context per worker | +| **Output** | One selected solution applied to workspace | Synthesized results from multiple perspectives | Aggregated results from parallel 
processing | +| **Best for** | Benchmarking, choosing between model approaches | Research, complex collaboration, cross-layer work | Batch operations, data processing, map-reduce tasks | + +## Next steps + +Explore related approaches for parallel and delegated work: + +- **Lightweight delegation**: [Subagents](/users/features/sub-agents) handle focused subtasks within your session — better when you don't need model comparison +- **Manual parallel sessions**: Run multiple Qwen Code sessions yourself in separate terminals with [Git worktrees](https://git-scm.com/docs/git-worktree) for full manual control diff --git a/eslint.config.js b/eslint.config.js index d0963e876b..7b54f58a83 100644 --- a/eslint.config.js +++ b/eslint.config.js @@ -59,6 +59,7 @@ export default tseslint.config( ...importPlugin.configs.typescript.rules, 'import/no-default-export': 'warn', 'import/no-unresolved': 'off', // Disable for now, can be noisy with monorepos/paths + 'import/namespace': 'off', // Disabled due to https://github.com/import-js/eslint-plugin-import/issues/2866 }, }, { diff --git a/packages/cli/src/acp-integration/acpAgent.ts b/packages/cli/src/acp-integration/acpAgent.ts index af3590422e..246d800194 100644 --- a/packages/cli/src/acp-integration/acpAgent.ts +++ b/packages/cli/src/acp-integration/acpAgent.ts @@ -58,11 +58,11 @@ import { AcpFileSystemService } from './service/filesystem.js'; import { Readable, Writable } from 'node:stream'; import type { LoadedSettings } from '../config/settings.js'; import { SettingScope } from '../config/settings.js'; +import type { ApprovalModeValue } from './session/types.js'; import { z } from 'zod'; import type { CliArgs } from '../config/config.js'; import { loadCliConfig } from '../config/config.js'; import { Session } from './session/Session.js'; -import type { ApprovalModeValue } from './session/types.js'; import { formatAcpModelId } from '../utils/acpModelUtils.js'; const debugLogger = createDebugLogger('ACP_AGENT'); diff --git 
a/packages/cli/src/acp-integration/session/Session.ts b/packages/cli/src/acp-integration/session/Session.ts index 1458ce177f..f1e9892e7d 100644 --- a/packages/cli/src/acp-integration/session/Session.ts +++ b/packages/cli/src/acp-integration/session/Session.ts @@ -16,7 +16,7 @@ import type { ToolCallConfirmationDetails, ToolResult, ChatRecord, - SubAgentEventEmitter, + AgentEventEmitter, } from '@qwen-code/qwen-code-core'; import { AuthType, @@ -530,7 +530,7 @@ export class Session implements SessionContext { // Access eventEmitter from TaskTool invocation const taskEventEmitter = ( invocation as { - eventEmitter: SubAgentEventEmitter; + eventEmitter: AgentEventEmitter; } ).eventEmitter; @@ -539,7 +539,7 @@ export class Session implements SessionContext { const subagentType = (args['subagent_type'] as string) ?? ''; // Create a SubAgentTracker for this tool execution - const subAgentTracker = new SubAgentTracker( + const subSubAgentTracker = new SubAgentTracker( this, this.client, parentToolCallId, @@ -547,7 +547,7 @@ export class Session implements SessionContext { ); // Set up sub-agent tool tracking - subAgentCleanupFunctions = subAgentTracker.setup( + subAgentCleanupFunctions = subSubAgentTracker.setup( taskEventEmitter, abortSignal, ); diff --git a/packages/cli/src/acp-integration/session/SubAgentTracker.test.ts b/packages/cli/src/acp-integration/session/SubAgentTracker.test.ts index 86832afddd..0be126ff43 100644 --- a/packages/cli/src/acp-integration/session/SubAgentTracker.test.ts +++ b/packages/cli/src/acp-integration/session/SubAgentTracker.test.ts @@ -10,26 +10,26 @@ import type { SessionContext } from './types.js'; import type { Config, ToolRegistry, - SubAgentEventEmitter, - SubAgentToolCallEvent, - SubAgentToolResultEvent, - SubAgentApprovalRequestEvent, - SubAgentStreamTextEvent, + AgentEventEmitter, + AgentToolCallEvent, + AgentToolResultEvent, + AgentApprovalRequestEvent, + AgentStreamTextEvent, ToolEditConfirmationDetails, 
   ToolInfoConfirmationDetails,
 } from '@qwen-code/qwen-code-core';
 import {
-  SubAgentEventType,
+  AgentEventType,
   ToolConfirmationOutcome,
   TodoWriteTool,
 } from '@qwen-code/qwen-code-core';
 import type { AgentSideConnection } from '@agentclientprotocol/sdk';
 import { EventEmitter } from 'node:events';
 
-// Helper to create a mock SubAgentToolCallEvent with required fields
+// Helper to create a mock AgentToolCallEvent with required fields
 function createToolCallEvent(
-  overrides: Partial<SubAgentToolCallEvent> & { name: string; callId: string },
-): SubAgentToolCallEvent {
+  overrides: Partial<AgentToolCallEvent> & { name: string; callId: string },
+): AgentToolCallEvent {
   return {
     subagentId: 'test-subagent',
     round: 1,
@@ -40,14 +40,14 @@ function createToolCallEvent(
   };
 }
 
-// Helper to create a mock SubAgentToolResultEvent with required fields
+// Helper to create a mock AgentToolResultEvent with required fields
 function createToolResultEvent(
-  overrides: Partial<SubAgentToolResultEvent> & {
+  overrides: Partial<AgentToolResultEvent> & {
     name: string;
     callId: string;
     success: boolean;
   },
-): SubAgentToolResultEvent {
+): AgentToolResultEvent {
   return {
     subagentId: 'test-subagent',
     round: 1,
@@ -56,15 +56,15 @@ function createToolResultEvent(
   };
 }
 
-// Helper to create a mock SubAgentApprovalRequestEvent with required fields
+// Helper to create a mock AgentApprovalRequestEvent with required fields
 function createApprovalEvent(
-  overrides: Partial<SubAgentApprovalRequestEvent> & {
+  overrides: Partial<AgentApprovalRequestEvent> & {
     name: string;
     callId: string;
-    confirmationDetails: SubAgentApprovalRequestEvent['confirmationDetails'];
-    respond: SubAgentApprovalRequestEvent['respond'];
+    confirmationDetails: AgentApprovalRequestEvent['confirmationDetails'];
+    respond: AgentApprovalRequestEvent['respond'];
   },
-): SubAgentApprovalRequestEvent {
+): AgentApprovalRequestEvent {
   return {
     subagentId: 'test-subagent',
     round: 1,
@@ -102,10 +102,10 @@ function createInfoConfirmation(
   };
 }
 
-// Helper to create a mock SubAgentStreamTextEvent with required fields
+// Helper to create a mock AgentStreamTextEvent with required fields
 function createStreamTextEvent(
-  overrides: Partial<SubAgentStreamTextEvent> & { text: string },
-): SubAgentStreamTextEvent {
+  overrides: Partial<AgentStreamTextEvent> & { text: string },
+): AgentStreamTextEvent {
   return {
     subagentId: 'test-subagent',
     round: 1,
@@ -120,7 +120,7 @@ describe('SubAgentTracker', () => {
   let sendUpdateSpy: ReturnType;
   let requestPermissionSpy: ReturnType;
   let tracker: SubAgentTracker;
-  let eventEmitter: SubAgentEventEmitter;
+  let eventEmitter: AgentEventEmitter;
   let abortController: AbortController;
 
   beforeEach(() => {
@@ -151,7 +151,7 @@ describe('SubAgentTracker', () => {
       'parent-call-123',
       'test-subagent',
     );
-    eventEmitter = new EventEmitter() as unknown as SubAgentEventEmitter;
+    eventEmitter = new EventEmitter() as unknown as AgentEventEmitter;
     abortController = new AbortController();
   });
 
@@ -169,19 +169,19 @@ describe('SubAgentTracker', () => {
       tracker.setup(eventEmitter, abortController.signal);
 
       expect(onSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_CALL,
+        AgentEventType.TOOL_CALL,
         expect.any(Function),
       );
       expect(onSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_RESULT,
+        AgentEventType.TOOL_RESULT,
         expect.any(Function),
       );
       expect(onSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_WAITING_APPROVAL,
+        AgentEventType.TOOL_WAITING_APPROVAL,
         expect.any(Function),
       );
       expect(onSpy).toHaveBeenCalledWith(
-        SubAgentEventType.STREAM_TEXT,
+        AgentEventType.STREAM_TEXT,
         expect.any(Function),
       );
     });
@@ -193,19 +193,19 @@ describe('SubAgentTracker', () => {
       cleanups[0]();
 
       expect(offSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_CALL,
+        AgentEventType.TOOL_CALL,
         expect.any(Function),
       );
       expect(offSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_RESULT,
+        AgentEventType.TOOL_RESULT,
         expect.any(Function),
       );
       expect(offSpy).toHaveBeenCalledWith(
-        SubAgentEventType.TOOL_WAITING_APPROVAL,
+        AgentEventType.TOOL_WAITING_APPROVAL,
         expect.any(Function),
       );
       expect(offSpy).toHaveBeenCalledWith(
-        SubAgentEventType.STREAM_TEXT,
+        AgentEventType.STREAM_TEXT,
expect.any(Function), ); }); @@ -222,7 +222,7 @@ describe('SubAgentTracker', () => { description: 'Reading file', }); - eventEmitter.emit(SubAgentEventType.TOOL_CALL, event); + eventEmitter.emit(AgentEventType.TOOL_CALL, event); // Allow async operations to complete await vi.waitFor(() => { @@ -258,7 +258,7 @@ describe('SubAgentTracker', () => { args: { todos: [] }, }); - eventEmitter.emit(SubAgentEventType.TOOL_CALL, event); + eventEmitter.emit(AgentEventType.TOOL_CALL, event); // Give time for any async operation await new Promise((resolve) => setTimeout(resolve, 10)); @@ -276,7 +276,7 @@ describe('SubAgentTracker', () => { args: {}, }); - eventEmitter.emit(SubAgentEventType.TOOL_CALL, event); + eventEmitter.emit(AgentEventType.TOOL_CALL, event); await new Promise((resolve) => setTimeout(resolve, 10)); @@ -290,7 +290,7 @@ describe('SubAgentTracker', () => { // First emit tool call to store state eventEmitter.emit( - SubAgentEventType.TOOL_CALL, + AgentEventType.TOOL_CALL, createToolCallEvent({ name: 'read_file', callId: 'call-123', @@ -306,7 +306,7 @@ describe('SubAgentTracker', () => { resultDisplay: 'File contents', }); - eventEmitter.emit(SubAgentEventType.TOOL_RESULT, resultEvent); + eventEmitter.emit(AgentEventType.TOOL_RESULT, resultEvent); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalledWith( @@ -334,7 +334,7 @@ describe('SubAgentTracker', () => { resultDisplay: undefined, }); - eventEmitter.emit(SubAgentEventType.TOOL_RESULT, resultEvent); + eventEmitter.emit(AgentEventType.TOOL_RESULT, resultEvent); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalledWith( @@ -356,7 +356,7 @@ describe('SubAgentTracker', () => { // Store args via tool call eventEmitter.emit( - SubAgentEventType.TOOL_CALL, + AgentEventType.TOOL_CALL, createToolCallEvent({ name: TodoWriteTool.Name, callId: 'call-todo', @@ -377,7 +377,7 @@ describe('SubAgentTracker', () => { }), }); - eventEmitter.emit(SubAgentEventType.TOOL_RESULT, resultEvent); + 
eventEmitter.emit(AgentEventType.TOOL_RESULT, resultEvent); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalledWith({ @@ -393,7 +393,7 @@ describe('SubAgentTracker', () => { tracker.setup(eventEmitter, abortController.signal); eventEmitter.emit( - SubAgentEventType.TOOL_CALL, + AgentEventType.TOOL_CALL, createToolCallEvent({ name: 'test_tool', callId: 'call-cleanup', @@ -402,7 +402,7 @@ describe('SubAgentTracker', () => { ); eventEmitter.emit( - SubAgentEventType.TOOL_RESULT, + AgentEventType.TOOL_RESULT, createToolResultEvent({ name: 'test_tool', callId: 'call-cleanup', @@ -413,7 +413,7 @@ describe('SubAgentTracker', () => { // Emit another result for same callId - should not have stored args sendUpdateSpy.mockClear(); eventEmitter.emit( - SubAgentEventType.TOOL_RESULT, + AgentEventType.TOOL_RESULT, createToolResultEvent({ name: 'test_tool', callId: 'call-cleanup', @@ -447,7 +447,7 @@ describe('SubAgentTracker', () => { respond: respondSpy, }); - eventEmitter.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, event); + eventEmitter.emit(AgentEventType.TOOL_WAITING_APPROVAL, event); await vi.waitFor(() => { expect(requestPermissionSpy).toHaveBeenCalled(); @@ -483,7 +483,7 @@ describe('SubAgentTracker', () => { respond: respondSpy, }); - eventEmitter.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, event); + eventEmitter.emit(AgentEventType.TOOL_WAITING_APPROVAL, event); await vi.waitFor(() => { expect(respondSpy).toHaveBeenCalledWith( @@ -504,7 +504,7 @@ describe('SubAgentTracker', () => { respond: respondSpy, }); - eventEmitter.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, event); + eventEmitter.emit(AgentEventType.TOOL_WAITING_APPROVAL, event); await vi.waitFor(() => { expect(respondSpy).toHaveBeenCalledWith(ToolConfirmationOutcome.Cancel); @@ -525,7 +525,7 @@ describe('SubAgentTracker', () => { respond: respondSpy, }); - eventEmitter.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, event); + eventEmitter.emit(AgentEventType.TOOL_WAITING_APPROVAL, event); await 
vi.waitFor(() => { expect(respondSpy).toHaveBeenCalledWith(ToolConfirmationOutcome.Cancel); @@ -548,7 +548,7 @@ describe('SubAgentTracker', () => { respond: vi.fn(), }); - eventEmitter.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, event); + eventEmitter.emit(AgentEventType.TOOL_WAITING_APPROVAL, event); await vi.waitFor(() => { expect(requestPermissionSpy).toHaveBeenCalled(); @@ -572,7 +572,7 @@ describe('SubAgentTracker', () => { text: 'Hello, this is a response from the model.', }); - eventEmitter.emit(SubAgentEventType.STREAM_TEXT, event); + eventEmitter.emit(AgentEventType.STREAM_TEXT, event); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalled(); @@ -593,15 +593,15 @@ describe('SubAgentTracker', () => { tracker.setup(eventEmitter, abortController.signal); eventEmitter.emit( - SubAgentEventType.STREAM_TEXT, + AgentEventType.STREAM_TEXT, createStreamTextEvent({ text: 'First chunk ' }), ); eventEmitter.emit( - SubAgentEventType.STREAM_TEXT, + AgentEventType.STREAM_TEXT, createStreamTextEvent({ text: 'Second chunk ' }), ); eventEmitter.emit( - SubAgentEventType.STREAM_TEXT, + AgentEventType.STREAM_TEXT, createStreamTextEvent({ text: 'Third chunk' }), ); @@ -640,7 +640,7 @@ describe('SubAgentTracker', () => { text: 'This should not be emitted', }); - eventEmitter.emit(SubAgentEventType.STREAM_TEXT, event); + eventEmitter.emit(AgentEventType.STREAM_TEXT, event); await new Promise((resolve) => setTimeout(resolve, 10)); @@ -655,7 +655,7 @@ describe('SubAgentTracker', () => { thought: true, }); - eventEmitter.emit(SubAgentEventType.STREAM_TEXT, event); + eventEmitter.emit(AgentEventType.STREAM_TEXT, event); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalled(); @@ -680,7 +680,7 @@ describe('SubAgentTracker', () => { thought: false, }); - eventEmitter.emit(SubAgentEventType.STREAM_TEXT, event); + eventEmitter.emit(AgentEventType.STREAM_TEXT, event); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalled(); @@ -705,7 +705,7 @@ 
describe('SubAgentTracker', () => { text: 'Default behavior text.', }); - eventEmitter.emit(SubAgentEventType.STREAM_TEXT, event); + eventEmitter.emit(AgentEventType.STREAM_TEXT, event); await vi.waitFor(() => { expect(sendUpdateSpy).toHaveBeenCalled(); diff --git a/packages/cli/src/acp-integration/session/SubAgentTracker.ts b/packages/cli/src/acp-integration/session/SubAgentTracker.ts index acbe950821..e9af7282cb 100644 --- a/packages/cli/src/acp-integration/session/SubAgentTracker.ts +++ b/packages/cli/src/acp-integration/session/SubAgentTracker.ts @@ -5,18 +5,18 @@ */ import type { - SubAgentEventEmitter, - SubAgentToolCallEvent, - SubAgentToolResultEvent, - SubAgentApprovalRequestEvent, - SubAgentUsageEvent, - SubAgentStreamTextEvent, + AgentEventEmitter, + AgentToolCallEvent, + AgentToolResultEvent, + AgentApprovalRequestEvent, + AgentUsageEvent, + AgentStreamTextEvent, ToolCallConfirmationDetails, AnyDeclarativeTool, AnyToolInvocation, } from '@qwen-code/qwen-code-core'; import { - SubAgentEventType, + AgentEventType, ToolConfirmationOutcome, createDebugLogger, } from '@qwen-code/qwen-code-core'; @@ -106,12 +106,12 @@ export class SubAgentTracker { /** * Sets up event listeners for a sub-agent's tool events. 
* - * @param eventEmitter - The SubAgentEventEmitter from TaskTool + * @param eventEmitter - The AgentEventEmitter from TaskTool * @param abortSignal - Signal to abort tracking if parent is cancelled * @returns Array of cleanup functions to remove listeners */ setup( - eventEmitter: SubAgentEventEmitter, + eventEmitter: AgentEventEmitter, abortSignal: AbortSignal, ): Array<() => void> { const onToolCall = this.createToolCallHandler(abortSignal); @@ -120,19 +120,19 @@ export class SubAgentTracker { const onUsageMetadata = this.createUsageMetadataHandler(abortSignal); const onStreamText = this.createStreamTextHandler(abortSignal); - eventEmitter.on(SubAgentEventType.TOOL_CALL, onToolCall); - eventEmitter.on(SubAgentEventType.TOOL_RESULT, onToolResult); - eventEmitter.on(SubAgentEventType.TOOL_WAITING_APPROVAL, onApproval); - eventEmitter.on(SubAgentEventType.USAGE_METADATA, onUsageMetadata); - eventEmitter.on(SubAgentEventType.STREAM_TEXT, onStreamText); + eventEmitter.on(AgentEventType.TOOL_CALL, onToolCall); + eventEmitter.on(AgentEventType.TOOL_RESULT, onToolResult); + eventEmitter.on(AgentEventType.TOOL_WAITING_APPROVAL, onApproval); + eventEmitter.on(AgentEventType.USAGE_METADATA, onUsageMetadata); + eventEmitter.on(AgentEventType.STREAM_TEXT, onStreamText); return [ () => { - eventEmitter.off(SubAgentEventType.TOOL_CALL, onToolCall); - eventEmitter.off(SubAgentEventType.TOOL_RESULT, onToolResult); - eventEmitter.off(SubAgentEventType.TOOL_WAITING_APPROVAL, onApproval); - eventEmitter.off(SubAgentEventType.USAGE_METADATA, onUsageMetadata); - eventEmitter.off(SubAgentEventType.STREAM_TEXT, onStreamText); + eventEmitter.off(AgentEventType.TOOL_CALL, onToolCall); + eventEmitter.off(AgentEventType.TOOL_RESULT, onToolResult); + eventEmitter.off(AgentEventType.TOOL_WAITING_APPROVAL, onApproval); + eventEmitter.off(AgentEventType.USAGE_METADATA, onUsageMetadata); + eventEmitter.off(AgentEventType.STREAM_TEXT, onStreamText); // Clean up any remaining states 
this.toolStates.clear(); }, @@ -146,7 +146,7 @@ export class SubAgentTracker { abortSignal: AbortSignal, ): (...args: unknown[]) => void { return (...args: unknown[]) => { - const event = args[0] as SubAgentToolCallEvent; + const event = args[0] as AgentToolCallEvent; if (abortSignal.aborted) return; // Look up tool and build invocation for metadata @@ -187,7 +187,7 @@ export class SubAgentTracker { abortSignal: AbortSignal, ): (...args: unknown[]) => void { return (...args: unknown[]) => { - const event = args[0] as SubAgentToolResultEvent; + const event = args[0] as AgentToolResultEvent; if (abortSignal.aborted) return; const state = this.toolStates.get(event.callId); @@ -215,7 +215,7 @@ export class SubAgentTracker { abortSignal: AbortSignal, ): (...args: unknown[]) => Promise { return async (...args: unknown[]) => { - const event = args[0] as SubAgentApprovalRequestEvent; + const event = args[0] as AgentApprovalRequestEvent; if (abortSignal.aborted) return; const state = this.toolStates.get(event.callId); @@ -292,7 +292,7 @@ export class SubAgentTracker { abortSignal: AbortSignal, ): (...args: unknown[]) => void { return (...args: unknown[]) => { - const event = args[0] as SubAgentUsageEvent; + const event = args[0] as AgentUsageEvent; if (abortSignal.aborted) return; this.messageEmitter.emitUsageMetadata( @@ -312,7 +312,7 @@ export class SubAgentTracker { abortSignal: AbortSignal, ): (...args: unknown[]) => void { return (...args: unknown[]) => { - const event = args[0] as SubAgentStreamTextEvent; + const event = args[0] as AgentStreamTextEvent; if (abortSignal.aborted) return; // Emit streamed text as agent message or thought based on the flag diff --git a/packages/cli/src/acp-integration/session/emitters/MessageEmitter.ts b/packages/cli/src/acp-integration/session/emitters/MessageEmitter.ts index 4b2bf82bfb..c4e0b971c4 100644 --- a/packages/cli/src/acp-integration/session/emitters/MessageEmitter.ts +++ 
b/packages/cli/src/acp-integration/session/emitters/MessageEmitter.ts @@ -5,6 +5,7 @@ */ import type { GenerateContentResponseUsageMetadata } from '@google/genai'; +import type { SubagentMeta } from '../types.js'; import type { Usage } from '@agentclientprotocol/sdk'; import { BaseEmitter } from './BaseEmitter.js'; @@ -77,7 +78,7 @@ export class MessageEmitter extends BaseEmitter { usageMetadata: GenerateContentResponseUsageMetadata, text: string = '', durationMs?: number, - subagentMeta?: import('../types.js').SubagentMeta, + subagentMeta?: SubagentMeta, ): Promise { const usage: Usage = { inputTokens: usageMetadata.promptTokenCount ?? 0, diff --git a/packages/cli/src/config/config.ts b/packages/cli/src/config/config.ts index 34a9c25cd4..34d7c3fe68 100755 --- a/packages/cli/src/config/config.ts +++ b/packages/cli/src/config/config.ts @@ -51,16 +51,16 @@ import { appEvents } from '../utils/events.js'; import { mcpCommand } from '../commands/mcp.js'; // UUID v4 regex pattern for validation -const UUID_REGEX = - /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i; +const SESSION_ID_REGEX = + /^[0-9a-f]{8}-[0-9a-f]{4}-[1-5][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}(-agent-[a-zA-Z0-9_.-]+)?$/i; /** - * Validates if a string is a valid UUID format - * @param value - The string to validate - * @returns True if the string is a valid UUID, false otherwise + * Validates if a string is a valid session ID format. + * Accepts a standard UUID, or a UUID followed by `-agent-{suffix}` + * (used by Arena to give each agent a deterministic session ID). 
*/ -function isValidUUID(value: string): boolean { - return UUID_REGEX.test(value); +function isValidSessionId(value: string): boolean { + return SESSION_ID_REGEX.test(value); } import { isWorkspaceTrusted } from './trustedFolders.js'; @@ -568,10 +568,13 @@ export async function parseArguments(): Promise { if (argv['sessionId'] && (argv['continue'] || argv['resume'])) { return 'Cannot use --session-id with --continue or --resume. Use --session-id to start a new session with a specific ID, or use --continue/--resume to resume an existing session.'; } - if (argv['sessionId'] && !isValidUUID(argv['sessionId'] as string)) { + if ( + argv['sessionId'] && + !isValidSessionId(argv['sessionId'] as string) + ) { return `Invalid --session-id: "${argv['sessionId']}". Must be a valid UUID (e.g., "123e4567-e89b-12d3-a456-426614174000").`; } - if (argv['resume'] && !isValidUUID(argv['resume'] as string)) { + if (argv['resume'] && !isValidSessionId(argv['resume'] as string)) { return `Invalid --resume: "${argv['resume']}". Must be a valid UUID (e.g., "123e4567-e89b-12d3-a456-426614174000").`; } return true; @@ -1058,6 +1061,18 @@ export async function loadCliConfig( lsp: { enabled: lspEnabled, }, + agents: settings.agents + ? { + displayMode: settings.agents.displayMode, + arena: settings.agents.arena + ? { + worktreeBaseDir: settings.agents.arena.worktreeBaseDir, + preserveArtifacts: + settings.agents.arena.preserveArtifacts ?? 
false, + } + : undefined, + } + : undefined, }); if (lspEnabled) { diff --git a/packages/cli/src/config/settingsSchema.ts b/packages/cli/src/config/settingsSchema.ts index 373988d728..85e8181046 100644 --- a/packages/cli/src/config/settingsSchema.ts +++ b/packages/cli/src/config/settingsSchema.ts @@ -1244,6 +1244,104 @@ const SETTINGS_SCHEMA = { description: 'Configuration for web search providers.', showInDialog: false, }, + agents: { + type: 'object', + label: 'Agents', + category: 'Advanced', + requiresRestart: false, + default: {}, + description: + 'Settings for multi-agent collaboration features (Arena, Team, Swarm).', + showInDialog: false, + properties: { + displayMode: { + type: 'enum', + label: 'Display Mode', + category: 'Advanced', + requiresRestart: false, + default: undefined as string | undefined, + description: + 'Display mode for multi-agent sessions. Currently only "in-process" is supported.', + showInDialog: false, + options: [ + { value: 'in-process', label: 'In-process' }, + // { value: 'tmux', label: 'tmux' }, + // { value: 'iterm2', label: 'iTerm2' }, + ], + }, + arena: { + type: 'object', + label: 'Arena', + category: 'Advanced', + requiresRestart: false, + default: {}, + description: 'Settings for Arena (multi-model competitive execution).', + showInDialog: false, + properties: { + worktreeBaseDir: { + type: 'string', + label: 'Worktree Base Directory', + category: 'Advanced', + requiresRestart: true, + default: undefined as string | undefined, + description: + 'Custom base directory for Arena worktrees. 
Defaults to ~/.qwen/arena.', + showInDialog: false, + }, + preserveArtifacts: { + type: 'boolean', + label: 'Preserve Arena Artifacts', + category: 'Advanced', + requiresRestart: false, + default: false, + description: + 'When enabled, Arena worktrees and session state files are preserved after the session ends or the main agent exits.', + showInDialog: true, + }, + maxRoundsPerAgent: { + type: 'number', + label: 'Max Rounds Per Agent', + category: 'Advanced', + requiresRestart: false, + default: undefined as number | undefined, + description: + 'Maximum number of rounds (turns) each agent can execute. No limit if unset.', + showInDialog: false, + }, + timeoutSeconds: { + type: 'number', + label: 'Timeout (seconds)', + category: 'Advanced', + requiresRestart: false, + default: undefined as number | undefined, + description: + 'Total timeout in seconds for the Arena session. No limit if unset.', + showInDialog: false, + }, + }, + }, + team: { + type: 'object', + label: 'Team', + category: 'Advanced', + requiresRestart: false, + default: {}, + description: + 'Settings for Agent Team (role-based collaborative execution). Reserved for future use.', + showInDialog: false, + }, + swarm: { + type: 'object', + label: 'Swarm', + category: 'Advanced', + requiresRestart: false, + default: {}, + description: + 'Settings for Agent Swarm (parallel sub-agent execution). 
Reserved for future use.', + showInDialog: false, + }, + }, + }, hooksConfig: { type: 'object', @@ -1315,6 +1413,17 @@ const SETTINGS_SCHEMA = { }, }, }, + + experimental: { + type: 'object', + label: 'Experimental', + category: 'Experimental', + requiresRestart: true, + default: {}, + description: 'Setting to enable experimental features', + showInDialog: false, + properties: {}, + }, } as const satisfies SettingsSchema; export type SettingsSchemaType = typeof SETTINGS_SCHEMA; diff --git a/packages/cli/src/gemini.tsx b/packages/cli/src/gemini.tsx index 58a735c73c..9913a54006 100644 --- a/packages/cli/src/gemini.tsx +++ b/packages/cli/src/gemini.tsx @@ -35,6 +35,7 @@ import { KeypressProvider } from './ui/contexts/KeypressContext.js'; import { SessionStatsProvider } from './ui/contexts/SessionContext.js'; import { SettingsContext } from './ui/contexts/SettingsContext.js'; import { VimModeProvider } from './ui/contexts/VimModeContext.js'; +import { AgentViewProvider } from './ui/contexts/AgentViewContext.js'; import { useKittyKeyboardProtocol } from './ui/hooks/useKittyKeyboardProtocol.js'; import { themeManager } from './ui/themes/theme-manager.js'; import { detectAndEnableKittyProtocol } from './ui/utils/kittyProtocolDetector.js'; @@ -162,13 +163,15 @@ export async function startInteractiveUI( > - + + + diff --git a/packages/cli/src/services/BuiltinCommandLoader.ts b/packages/cli/src/services/BuiltinCommandLoader.ts index 08ee98eb2b..726db95f7c 100644 --- a/packages/cli/src/services/BuiltinCommandLoader.ts +++ b/packages/cli/src/services/BuiltinCommandLoader.ts @@ -9,6 +9,7 @@ import type { SlashCommand } from '../ui/commands/types.js'; import type { Config } from '@qwen-code/qwen-code-core'; import { aboutCommand } from '../ui/commands/aboutCommand.js'; import { agentsCommand } from '../ui/commands/agentsCommand.js'; +import { arenaCommand } from '../ui/commands/arenaCommand.js'; import { approvalModeCommand } from '../ui/commands/approvalModeCommand.js'; import 
{ authCommand } from '../ui/commands/authCommand.js'; import { bugCommand } from '../ui/commands/bugCommand.js'; @@ -61,6 +62,7 @@ export class BuiltinCommandLoader implements ICommandLoader { const allDefinitions: Array = [ aboutCommand, agentsCommand, + arenaCommand, approvalModeCommand, authCommand, bugCommand, diff --git a/packages/cli/src/ui/App.test.tsx b/packages/cli/src/ui/App.test.tsx index be09fe52fa..8df422f4b7 100644 --- a/packages/cli/src/ui/App.test.tsx +++ b/packages/cli/src/ui/App.test.tsx @@ -9,6 +9,11 @@ import { render } from 'ink-testing-library'; import { Text, useIsScreenReaderEnabled } from 'ink'; import { App } from './App.js'; import { UIStateContext, type UIState } from './contexts/UIStateContext.js'; +import { + UIActionsContext, + type UIActions, +} from './contexts/UIActionsContext.js'; +import { AgentViewProvider } from './contexts/AgentViewContext.js'; import { StreamingState } from './types.js'; vi.mock('ink', async (importOriginal) => { @@ -43,6 +48,10 @@ vi.mock('./components/Footer.js', () => ({ Footer: () => Footer, })); +vi.mock('./components/agent-view/AgentTabBar.js', () => ({ + AgentTabBar: () => null, +})); + describe('App', () => { const mockUIState: Partial = { streamingState: StreamingState.Idle, @@ -58,13 +67,24 @@ describe('App', () => { }, }; - it('should render main content and composer when not quitting', () => { - const { lastFrame } = render( - - - , + const mockUIActions = { + refreshStatic: vi.fn(), + } as unknown as UIActions; + + const renderWithProviders = (uiState: UIState) => + render( + + + + + + + , ); + it('should render main content and composer when not quitting', () => { + const { lastFrame } = renderWithProviders(mockUIState as UIState); + expect(lastFrame()).toContain('MainContent'); expect(lastFrame()).toContain('Composer'); }); @@ -75,11 +95,7 @@ describe('App', () => { quittingMessages: [{ id: 1, type: 'user', text: 'test' }], } as UIState; - const { lastFrame } = render( - - - , - ); + const { 
lastFrame } = renderWithProviders(quittingUIState); expect(lastFrame()).toContain('Quitting...'); }); @@ -90,11 +106,7 @@ describe('App', () => { dialogsVisible: true, } as UIState; - const { lastFrame } = render( - - - , - ); + const { lastFrame } = renderWithProviders(dialogUIState); expect(lastFrame()).toContain('MainContent'); expect(lastFrame()).toContain('DialogManager'); @@ -107,11 +119,7 @@ describe('App', () => { ctrlCPressedOnce: true, } as UIState; - const { lastFrame } = render( - - - , - ); + const { lastFrame } = renderWithProviders(ctrlCUIState); expect(lastFrame()).toContain('Press Ctrl+C again to exit.'); }); @@ -123,11 +131,7 @@ describe('App', () => { ctrlDPressedOnce: true, } as UIState; - const { lastFrame } = render( - - - , - ); + const { lastFrame } = renderWithProviders(ctrlDUIState); expect(lastFrame()).toContain('Press Ctrl+D again to exit.'); }); @@ -135,11 +139,7 @@ describe('App', () => { it('should render ScreenReaderAppLayout when screen reader is enabled', () => { (useIsScreenReaderEnabled as vi.Mock).mockReturnValue(true); - const { lastFrame } = render( - - - , - ); + const { lastFrame } = renderWithProviders(mockUIState as UIState); expect(lastFrame()).toContain( 'Notifications\nFooter\nMainContent\nComposer', @@ -149,11 +149,7 @@ describe('App', () => { it('should render DefaultAppLayout when screen reader is not enabled', () => { (useIsScreenReaderEnabled as vi.Mock).mockReturnValue(false); - const { lastFrame } = render( - - - , - ); + const { lastFrame } = renderWithProviders(mockUIState as UIState); expect(lastFrame()).toContain('MainContent\nComposer'); }); diff --git a/packages/cli/src/ui/AppContainer.test.tsx b/packages/cli/src/ui/AppContainer.test.tsx index 9e9d4f6732..833d2bed24 100644 --- a/packages/cli/src/ui/AppContainer.test.tsx +++ b/packages/cli/src/ui/AppContainer.test.tsx @@ -78,6 +78,21 @@ vi.mock('./hooks/useAutoAcceptIndicator.js'); vi.mock('./hooks/useGitBranchName.js'); 
vi.mock('./contexts/VimModeContext.js'); vi.mock('./contexts/SessionContext.js'); +vi.mock('./contexts/AgentViewContext.js', () => ({ + useAgentViewState: vi.fn(() => ({ + activeView: 'main', + agents: new Map(), + })), + useAgentViewActions: vi.fn(() => ({ + switchToMain: vi.fn(), + switchToAgent: vi.fn(), + switchToNext: vi.fn(), + switchToPrevious: vi.fn(), + registerAgent: vi.fn(), + unregisterAgent: vi.fn(), + unregisterAll: vi.fn(), + })), +})); vi.mock('./components/shared/text-buffer.js'); vi.mock('./hooks/useLogger.js'); @@ -268,7 +283,7 @@ describe('AppContainer State Management', () => { listSubagents: vi.fn().mockResolvedValue([]), addChangeListener: vi.fn(), loadSubagent: vi.fn(), - createSubagentScope: vi.fn(), + createSubagent: vi.fn(), }; vi.spyOn(mockConfig, 'getSubagentManager').mockReturnValue( mockSubagentManager as SubagentManager, diff --git a/packages/cli/src/ui/AppContainer.tsx b/packages/cli/src/ui/AppContainer.tsx index c6bfa67c3c..273108e89b 100644 --- a/packages/cli/src/ui/AppContainer.tsx +++ b/packages/cli/src/ui/AppContainer.tsx @@ -52,6 +52,7 @@ import { useAuthCommand } from './auth/useAuth.js'; import { useEditorSettings } from './hooks/useEditorSettings.js'; import { useSettingsCommand } from './hooks/useSettingsCommand.js'; import { useModelCommand } from './hooks/useModelCommand.js'; +import { useArenaCommand } from './hooks/useArenaCommand.js'; import { useApprovalModeCommand } from './hooks/useApprovalModeCommand.js'; import { useResumeCommand } from './hooks/useResumeCommand.js'; import { useSlashCommandProcessor } from './hooks/slashCommandProcessor.js'; @@ -96,6 +97,7 @@ import { } from './hooks/useExtensionUpdates.js'; import { useCodingPlanUpdates } from './hooks/useCodingPlanUpdates.js'; import { ShellFocusContext } from './contexts/ShellFocusContext.js'; +import { useAgentViewState } from './contexts/AgentViewContext.js'; import { t } from '../i18n/index.js'; import { useWelcomeBack } from './hooks/useWelcomeBack.js'; 
import { useDialogClose } from './hooks/useDialogClose.js'; @@ -470,6 +472,8 @@ export const AppContainer = (props: AppContainerProps) => { const { isModelDialogOpen, openModelDialog, closeModelDialog } = useModelCommand(); + const { activeArenaDialog, openArenaDialog, closeArenaDialog } = + useArenaCommand(); const { isResumeDialogOpen, @@ -509,6 +513,7 @@ export const AppContainer = (props: AppContainerProps) => { openEditorDialog, openSettingsDialog, openModelDialog, + openArenaDialog, openPermissionsDialog, openApprovalModeDialog, quit: (messages: HistoryItem[]) => { @@ -533,6 +538,7 @@ export const AppContainer = (props: AppContainerProps) => { openEditorDialog, openSettingsDialog, openModelDialog, + openArenaDialog, setDebugMessage, dispatchExtensionStateUpdate, openPermissionsDialog, @@ -669,12 +675,15 @@ export const AppContainer = (props: AppContainerProps) => { // Track whether suggestions are visible for Tab key handling const [hasSuggestionsVisible, setHasSuggestionsVisible] = useState(false); - // Auto-accept indicator + const agentViewState = useAgentViewState(); + + // Auto-accept indicator — disabled on agent tabs (agents handle their own) const showAutoAcceptIndicator = useAutoAcceptIndicator({ config, addItem: historyManager.addItem, onApprovalModeChange: handleApprovalModeChange, shouldBlockTab: () => hasSuggestionsVisible, + disabled: agentViewState.activeView !== 'main', }); const { messageQueue, addMessage, clearQueue, getQueuedMessagesText } = @@ -687,9 +696,26 @@ export const AppContainer = (props: AppContainerProps) => { // Callback for handling final submit (must be after addMessage from useMessageQueue) const handleFinalSubmit = useCallback( (submittedValue: string) => { + // Route to active in-process agent if viewing a sub-agent tab. 
+ if (agentViewState.activeView !== 'main') { + const agent = agentViewState.agents.get(agentViewState.activeView); + if (agent) { + agent.interactiveAgent.enqueueMessage(submittedValue.trim()); + return; + } + } addMessage(submittedValue); }, - [addMessage], + [addMessage, agentViewState], + ); + + const handleArenaModelsSelected = useCallback( + (models: string[]) => { + const value = models.join(','); + buffer.setText(`/arena start --models ${value} `); + closeArenaDialog(); + }, + [buffer, closeArenaDialog], ); // Welcome back functionality (must be after handleFinalSubmit) @@ -765,10 +791,17 @@ export const AppContainer = (props: AppContainerProps) => { } }, [buffer, terminalWidth, terminalHeight]); - // Compute available terminal height based on controls measurement + // agentViewState is declared earlier (before handleFinalSubmit) so it + // is available for input routing. Referenced here for layout computation. + + // Compute available terminal height based on controls measurement. + // When in-process agents are present the AgentTabBar renders an extra + // row at the top of the layout; subtract it so downstream consumers + // (shell, transcript, etc.) don't overestimate available space. + const tabBarHeight = agentViewState.agents.size > 0 ? 
1 : 0; const availableTerminalHeight = Math.max( 0, - terminalHeight - controlsHeight - staticExtraHeight - 2, + terminalHeight - controlsHeight - staticExtraHeight - 2 - tabBarHeight, ); config.setShellExecutionConfig({ @@ -1047,6 +1080,8 @@ export const AppContainer = (props: AppContainerProps) => { exitEditorDialog, isSettingsDialogOpen, closeSettingsDialog, + activeArenaDialog, + closeArenaDialog, isFolderTrustDialogOpen, showWelcomeBackDialog, handleWelcomeBackClose, @@ -1304,6 +1339,7 @@ export const AppContainer = (props: AppContainerProps) => { isThemeDialogOpen || isSettingsDialogOpen || isModelDialogOpen || + activeArenaDialog !== null || isPermissionsDialogOpen || isAuthDialogOpen || isAuthenticating || @@ -1354,6 +1390,7 @@ export const AppContainer = (props: AppContainerProps) => { quittingMessages, isSettingsDialogOpen, isModelDialogOpen, + activeArenaDialog, isPermissionsDialogOpen, isApprovalModeDialogOpen, isResumeDialogOpen, @@ -1447,6 +1484,7 @@ export const AppContainer = (props: AppContainerProps) => { quittingMessages, isSettingsDialogOpen, isModelDialogOpen, + activeArenaDialog, isPermissionsDialogOpen, isApprovalModeDialogOpen, isResumeDialogOpen, @@ -1543,6 +1581,9 @@ export const AppContainer = (props: AppContainerProps) => { exitEditorDialog, closeSettingsDialog, closeModelDialog, + openArenaDialog, + closeArenaDialog, + handleArenaModelsSelected, dismissCodingPlanUpdate, closePermissionsDialog, setShellModeActive, @@ -1592,6 +1633,9 @@ export const AppContainer = (props: AppContainerProps) => { exitEditorDialog, closeSettingsDialog, closeModelDialog, + openArenaDialog, + closeArenaDialog, + handleArenaModelsSelected, dismissCodingPlanUpdate, closePermissionsDialog, setShellModeActive, diff --git a/packages/cli/src/ui/commands/arenaCommand.test.ts b/packages/cli/src/ui/commands/arenaCommand.test.ts new file mode 100644 index 0000000000..99f9022590 --- /dev/null +++ b/packages/cli/src/ui/commands/arenaCommand.test.ts @@ -0,0 +1,395 @@ +/** 
+ * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import { + type ArenaManager, + AgentStatus, + ArenaSessionStatus, +} from '@qwen-code/qwen-code-core'; +import { arenaCommand } from './arenaCommand.js'; +import type { + CommandContext, + OpenDialogActionReturn, + SlashCommand, +} from './types.js'; +import { createMockCommandContext } from '../../test-utils/mockCommandContext.js'; + +function getArenaSubCommand( + name: 'start' | 'stop' | 'status' | 'select', +): SlashCommand { + const command = arenaCommand.subCommands?.find((item) => item.name === name); + if (!command?.action) { + throw new Error(`Arena subcommand "${name}" is missing an action`); + } + return command; +} + +describe('arenaCommand stop subcommand', () => { + let mockContext: CommandContext; + let mockConfig: { + getArenaManager: ReturnType<typeof vi.fn>; + setArenaManager: ReturnType<typeof vi.fn>; + cleanupArenaRuntime: ReturnType<typeof vi.fn>; + getAgentsSettings: ReturnType<typeof vi.fn>; + }; + + beforeEach(() => { + mockConfig = { + getArenaManager: vi.fn(() => null), + setArenaManager: vi.fn(), + cleanupArenaRuntime: vi.fn().mockResolvedValue(undefined), + getAgentsSettings: vi.fn(() => ({})), + }; + + mockContext = createMockCommandContext({ + invocation: { + raw: '/arena stop', + name: 'arena', + args: 'stop', + }, + executionMode: 'interactive', + services: { + config: mockConfig as never, + }, + }); + }); + + it('returns an error when no arena session is running', async () => { + const stopCommand = getArenaSubCommand('stop'); + const result = await stopCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: 'No running Arena session found.', + }); + }); + + it('opens stop dialog when a running session exists', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.RUNNING), + } as unknown as ArenaManager; + mockConfig.getArenaManager =
vi.fn(() => mockManager); + + const stopCommand = getArenaSubCommand('stop'); + const result = (await stopCommand.action!( + mockContext, + '', + )) as OpenDialogActionReturn; + + expect(result).toEqual({ + type: 'dialog', + dialog: 'arena_stop', + }); + }); + + it('opens stop dialog when a completed session exists', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const stopCommand = getArenaSubCommand('stop'); + const result = (await stopCommand.action!( + mockContext, + '', + )) as OpenDialogActionReturn; + + expect(result).toEqual({ + type: 'dialog', + dialog: 'arena_stop', + }); + }); +}); + +describe('arenaCommand status subcommand', () => { + let mockContext: CommandContext; + let mockConfig: { + getArenaManager: ReturnType<typeof vi.fn>; + }; + + beforeEach(() => { + mockConfig = { + getArenaManager: vi.fn(() => null), + }; + + mockContext = createMockCommandContext({ + invocation: { + raw: '/arena status', + name: 'arena', + args: 'status', + }, + executionMode: 'interactive', + services: { + config: mockConfig as never, + }, + }); + }); + + it('returns an error when no arena session exists', async () => { + const statusCommand = getArenaSubCommand('status'); + const result = await statusCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: 'No Arena session found.
Start one with /arena start.', + }); + }); + + it('opens status dialog when a session exists', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.RUNNING), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const statusCommand = getArenaSubCommand('status'); + const result = (await statusCommand.action!( + mockContext, + '', + )) as OpenDialogActionReturn; + + expect(result).toEqual({ + type: 'dialog', + dialog: 'arena_status', + }); + }); + + it('opens status dialog for completed session', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const statusCommand = getArenaSubCommand('status'); + const result = (await statusCommand.action!( + mockContext, + '', + )) as OpenDialogActionReturn; + + expect(result).toEqual({ + type: 'dialog', + dialog: 'arena_status', + }); + }); +}); + +describe('arenaCommand select subcommand', () => { + let mockContext: CommandContext; + let mockConfig: { + getArenaManager: ReturnType<typeof vi.fn>; + setArenaManager: ReturnType<typeof vi.fn>; + cleanupArenaRuntime: ReturnType<typeof vi.fn>; + getAgentsSettings: ReturnType<typeof vi.fn>; + }; + + beforeEach(() => { + mockConfig = { + getArenaManager: vi.fn(() => null), + setArenaManager: vi.fn(), + cleanupArenaRuntime: vi.fn().mockResolvedValue(undefined), + getAgentsSettings: vi.fn(() => ({})), + }; + + mockContext = createMockCommandContext({ + invocation: { + raw: '/arena select', + name: 'arena', + args: 'select', + }, + executionMode: 'interactive', + services: { + config: mockConfig as never, + }, + }); + }); + + it('returns error when no arena session exists', async () => { + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: 'No arena session found.
Start one with /arena start.', + }); + }); + + it('returns error when arena is still running', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.RUNNING), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: + 'Arena session is still running. Wait for it to complete or use /arena stop first.', + }); + }); + + it('returns error when all agents failed', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.FAILED, + model: { modelId: 'model-1' }, + }, + ]), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: + 'No successful agent results to select from. 
All agents failed or were cancelled.\n' + + 'Use /arena stop to end the session.', + }); + }); + + it('opens dialog when no args provided and agents have results', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.COMPLETED, + model: { modelId: 'model-1' }, + }, + { + agentId: 'agent-2', + status: AgentStatus.COMPLETED, + model: { modelId: 'model-2' }, + }, + ]), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, ''); + + expect(result).toEqual({ + type: 'dialog', + dialog: 'arena_select', + }); + }); + + it('applies changes directly when model name is provided', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.COMPLETED, + model: { modelId: 'gpt-4o', displayName: 'gpt-4o' }, + }, + { + agentId: 'agent-2', + status: AgentStatus.COMPLETED, + model: { modelId: 'claude-sonnet', displayName: 'claude-sonnet' }, + }, + ]), + applyAgentResult: vi.fn().mockResolvedValue({ success: true }), + cleanup: vi.fn().mockResolvedValue(undefined), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, 'gpt-4o'); + + expect(mockManager.applyAgentResult).toHaveBeenCalledWith('agent-1'); + expect(mockConfig.cleanupArenaRuntime).toHaveBeenCalled(); + expect(result).toEqual({ + type: 'message', + messageType: 'info', + content: + 'Applied changes from gpt-4o to workspace. 
Arena session complete.', + }); + }); + + it('returns error when specified model not found', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.COMPLETED, + model: { modelId: 'gpt-4o', displayName: 'gpt-4o' }, + }, + ]), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, 'nonexistent'); + + expect(result).toEqual({ + type: 'message', + messageType: 'error', + content: 'No idle agent found matching "nonexistent".', + }); + }); + + it('asks for confirmation when --discard flag is used', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.COMPLETED, + model: { modelId: 'gpt-4o' }, + }, + ]), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, '--discard'); + + expect(result).toEqual({ + type: 'confirm_action', + prompt: 'Discard all Arena results and clean up worktrees?', + originalInvocation: { raw: '/arena select' }, + }); + }); + + it('discards results after --discard confirmation', async () => { + const mockManager = { + getSessionStatus: vi.fn(() => ArenaSessionStatus.COMPLETED), + getAgentStates: vi.fn(() => [ + { + agentId: 'agent-1', + status: AgentStatus.COMPLETED, + model: { modelId: 'gpt-4o' }, + }, + ]), + cleanup: vi.fn().mockResolvedValue(undefined), + } as unknown as ArenaManager; + mockConfig.getArenaManager = vi.fn(() => mockManager); + mockContext.overwriteConfirmed = true; + + const selectCommand = getArenaSubCommand('select'); + const result = await selectCommand.action!(mockContext, '--discard'); 
+ + expect(mockConfig.cleanupArenaRuntime).toHaveBeenCalled(); + expect(result).toEqual({ + type: 'message', + messageType: 'info', + content: 'Arena results discarded. All worktrees cleaned up.', + }); + }); +}); diff --git a/packages/cli/src/ui/commands/arenaCommand.ts b/packages/cli/src/ui/commands/arenaCommand.ts new file mode 100644 index 0000000000..c178a021d5 --- /dev/null +++ b/packages/cli/src/ui/commands/arenaCommand.ts @@ -0,0 +1,659 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type { + SlashCommand, + CommandContext, + ConfirmActionReturn, + MessageActionReturn, + OpenDialogActionReturn, + SlashCommandActionReturn, +} from './types.js'; +import { CommandKind } from './types.js'; +import { + ArenaManager, + ArenaEventType, + isTerminalStatus, + isSuccessStatus, + ArenaSessionStatus, + AuthType, + createDebugLogger, + stripStartupContext, + type Config, + type ArenaModelConfig, + type ArenaAgentErrorEvent, + type ArenaAgentCompleteEvent, + type ArenaAgentStartEvent, + type ArenaSessionCompleteEvent, + type ArenaSessionErrorEvent, + type ArenaSessionStartEvent, + type ArenaSessionUpdateEvent, +} from '@qwen-code/qwen-code-core'; +import { + MessageType, + type ArenaAgentCardData, + type HistoryItemWithoutId, +} from '../types.js'; + +/** + * Parsed model entry with optional auth type. + */ +interface ParsedModel { + authType?: string; + modelId: string; +} + +/** + * Parses arena command arguments. 
+ * + * Supported formats: + * /arena start --models model1,model2 + * /arena start --models authType1:model1,authType2:model2 + * + * Model format: [authType:]modelId + * - "gpt-4o" → uses default auth type + * - "openai:gpt-4o" → uses "openai" auth type + */ +function parseArenaArgs(args: string): { + models: ParsedModel[]; + task: string; +} { + const modelsMatch = args.match(/--models\s+(\S+)/); + + let models: ParsedModel[] = []; + let task = args; + + if (modelsMatch) { + const modelStrings = modelsMatch[1]!.split(',').filter(Boolean); + models = modelStrings.map((str) => { + // Check for authType:modelId format + const colonIndex = str.indexOf(':'); + if (colonIndex > 0) { + return { + authType: str.substring(0, colonIndex), + modelId: str.substring(colonIndex + 1), + }; + } + return { modelId: str }; + }); + task = task.replace(/--models\s+\S+/, '').trim(); + } + + // Strip surrounding quotes from task + task = task.replace(/^["']|["']$/g, '').trim(); + + return { models, task }; +} + +const debugLogger = createDebugLogger('ARENA_COMMAND'); + +interface ArenaExecutionInput { + task: string; + models: ArenaModelConfig[]; + approvalMode?: string; +} + +function buildArenaExecutionInput( + parsed: ReturnType<typeof parseArenaArgs>, + config: Config, +): ArenaExecutionInput | MessageActionReturn { + if (!parsed.task) { + return { + type: 'message', + messageType: 'error', + content: + 'Usage: /arena start --models model1,model2 \n' + + '\n' + + 'Options:\n' + + ' --models [authType:]model1,[authType:]model2\n' + + ' Models to compete (required, at least 2)\n' + + ' Format: authType:modelId or just modelId\n' + + '\n' + + 'Examples:\n' + + ' /arena start --models openai:gpt-4o,anthropic:claude-3 "implement sorting"\n' + + ' /arena start --models qwen-coder-plus,kimi-for-coding "fix the bug"', + }; + } + + if (parsed.models.length < 2) { + return { + type: 'message', + messageType: 'error', + content: + 'Arena requires at least 2 models.
Use --models model1,model2 to specify.\n' + + 'Format: [authType:]modelId (e.g., openai:gpt-4o or just gpt-4o)', + }; + } + + // Get the current auth type as default for models without explicit auth type + const contentGeneratorConfig = config.getContentGeneratorConfig(); + const defaultAuthType = + contentGeneratorConfig?.authType ?? AuthType.USE_OPENAI; + + // Build ArenaModelConfig for each model, resolving display names from + // the model registry when available. + const modelsConfig = config.getModelsConfig(); + const models: ArenaModelConfig[] = parsed.models.map((parsedModel) => { + const authType = + (parsedModel.authType as AuthType | undefined) ?? defaultAuthType; + const registryModels = modelsConfig.getAvailableModelsForAuthType(authType); + const resolved = registryModels.find((m) => m.id === parsedModel.modelId); + return { + modelId: parsedModel.modelId, + authType, + displayName: resolved?.label ?? parsedModel.modelId, + }; + }); + + return { + task: parsed.task, + models, + approvalMode: config.getApprovalMode(), + }; +} + +/** + * Persists a single arena history item to the session JSONL file. + * + * Arena events fire asynchronously (after the slash command's recording + * window has closed), so each item must be recorded individually. + */ +function recordArenaItem(config: Config, item: HistoryItemWithoutId): void { + try { + const chatRecorder = config.getChatRecordingService(); + if (!chatRecorder) return; + chatRecorder.recordSlashCommand({ + phase: 'result', + rawCommand: '/arena', + outputHistoryItems: [{ ...item } as Record], + }); + } catch { + debugLogger.error('Failed to record arena history item'); + } +} + +function executeArenaCommand( + config: Config, + ui: CommandContext['ui'], + input: ArenaExecutionInput, +): void { + // Capture the main session's chat history so arena agents start with + // conversational context. 
Strip the leading startup context (env info + // user message + model ack) because each agent generates its own for + // its worktree directory — keeping the parent's would duplicate it. + let chatHistory; + try { + const fullHistory = config.getGeminiClient().getHistory(); + chatHistory = stripStartupContext(fullHistory); + } catch { + debugLogger.debug('Could not retrieve chat history for arena agents'); + } + + const manager = new ArenaManager(config); + const emitter = manager.getEventEmitter(); + const detachListeners: Array<() => void> = []; + const agentLabels = new Map(); + + const addArenaMessage = ( + type: 'info' | 'warning' | 'error' | 'success', + text: string, + ) => { + ui.addItem({ type, text }, Date.now()); + }; + + const addAndRecordArenaMessage = ( + type: 'info' | 'warning' | 'error' | 'success', + text: string, + ) => { + const item: HistoryItemWithoutId = { type, text }; + ui.addItem(item, Date.now()); + recordArenaItem(config, item); + }; + + const handleSessionStart = (event: ArenaSessionStartEvent) => { + const modelList = event.models + .map((model, index) => ` ${index + 1}. ${model.modelId}`) + .join('\n'); + // SESSION_START fires synchronously before the first await in + // ArenaManager.start(), so the slash command processor's finally + // block already captures this item — no extra recording needed. 
+ addArenaMessage( + MessageType.INFO, + `Arena started with ${event.models.length} agents on task: "${event.task}"\nModels:\n${modelList}`, + ); + }; + + const handleAgentStart = (event: ArenaAgentStartEvent) => { + agentLabels.set(event.agentId, event.model.modelId); + debugLogger.debug( + `Arena agent started: ${event.model.modelId} (${event.agentId})`, + ); + }; + + const handleSessionUpdate = (event: ArenaSessionUpdateEvent) => { + const attachHintPrefix = 'To view agent panes, run: '; + if (event.message.startsWith(attachHintPrefix)) { + const command = event.message.slice(attachHintPrefix.length).trim(); + addAndRecordArenaMessage( + MessageType.INFO, + `Arena panes are running in tmux. Attach with: \`${command}\``, + ); + return; + } + + if (event.type === 'success') { + addAndRecordArenaMessage(MessageType.SUCCESS, event.message); + } else if (event.type === 'info') { + addAndRecordArenaMessage(MessageType.INFO, event.message); + } else { + addAndRecordArenaMessage(MessageType.WARNING, event.message); + } + }; + + const handleAgentError = (event: ArenaAgentErrorEvent) => { + const label = agentLabels.get(event.agentId) || event.agentId; + addAndRecordArenaMessage( + MessageType.ERROR, + `[${label}] failed: ${event.error}`, + ); + }; + + const buildAgentCardData = ( + result: ArenaAgentCompleteEvent['result'], + ): ArenaAgentCardData => ({ + label: result.model.modelId, + status: result.status, + durationMs: result.stats.durationMs, + totalTokens: result.stats.totalTokens, + inputTokens: result.stats.inputTokens, + outputTokens: result.stats.outputTokens, + toolCalls: result.stats.toolCalls, + successfulToolCalls: result.stats.successfulToolCalls, + failedToolCalls: result.stats.failedToolCalls, + rounds: result.stats.rounds, + error: result.error, + diff: result.diff, + }); + + const handleAgentComplete = (event: ArenaAgentCompleteEvent) => { + if (!isTerminalStatus(event.result.status)) { + return; + } + + const agent = buildAgentCardData(event.result); + 
const item = { + type: 'arena_agent_complete', + agent, + } as HistoryItemWithoutId; + ui.addItem(item, Date.now()); + recordArenaItem(config, item); + }; + + const handleSessionError = (event: ArenaSessionErrorEvent) => { + addAndRecordArenaMessage(MessageType.ERROR, `${event.error}`); + }; + + const handleSessionComplete = (event: ArenaSessionCompleteEvent) => { + const item = { + type: 'arena_session_complete', + sessionStatus: event.result.status, + task: event.result.task, + totalDurationMs: event.result.totalDurationMs ?? 0, + agents: event.result.agents.map(buildAgentCardData), + } as HistoryItemWithoutId; + ui.addItem(item, Date.now()); + recordArenaItem(config, item); + }; + + emitter.on(ArenaEventType.SESSION_START, handleSessionStart); + detachListeners.push(() => + emitter.off(ArenaEventType.SESSION_START, handleSessionStart), + ); + emitter.on(ArenaEventType.AGENT_START, handleAgentStart); + detachListeners.push(() => + emitter.off(ArenaEventType.AGENT_START, handleAgentStart), + ); + emitter.on(ArenaEventType.SESSION_UPDATE, handleSessionUpdate); + detachListeners.push(() => + emitter.off(ArenaEventType.SESSION_UPDATE, handleSessionUpdate), + ); + emitter.on(ArenaEventType.AGENT_ERROR, handleAgentError); + detachListeners.push(() => + emitter.off(ArenaEventType.AGENT_ERROR, handleAgentError), + ); + emitter.on(ArenaEventType.AGENT_COMPLETE, handleAgentComplete); + detachListeners.push(() => + emitter.off(ArenaEventType.AGENT_COMPLETE, handleAgentComplete), + ); + emitter.on(ArenaEventType.SESSION_ERROR, handleSessionError); + detachListeners.push(() => + emitter.off(ArenaEventType.SESSION_ERROR, handleSessionError), + ); + emitter.on(ArenaEventType.SESSION_COMPLETE, handleSessionComplete); + detachListeners.push(() => + emitter.off(ArenaEventType.SESSION_COMPLETE, handleSessionComplete), + ); + + config.setArenaManager(manager); + + const cols = process.stdout.columns || 120; + const rows = Math.max((process.stdout.rows || 40) - 2, 1); + + const 
lifecycle = manager + .start({ + task: input.task, + models: input.models, + cols, + rows, + approvalMode: input.approvalMode, + chatHistory, + }) + .then( + () => { + debugLogger.debug('Arena agents settled'); + }, + (error) => { + const message = error instanceof Error ? error.message : String(error); + addAndRecordArenaMessage(MessageType.ERROR, `${message}`); + debugLogger.error('Arena session failed:', error); + + // Clear the stored manager so subsequent /arena start calls + // are not blocked by the stale reference after a startup failure. + config.setArenaManager(null); + + // Detach listeners on failure — session is done for good. + for (const detach of detachListeners) { + detach(); + } + }, + ); + + // NOTE: listeners are NOT detached when start() resolves because agents + // may still be alive (IDLE) and accept follow-up tasks. The listeners + // reference this manager's emitter, so they are garbage collected when + // the manager is cleaned up and replaced. + + // Store so that stop can wait for start() to fully unwind before cleanup + manager.setLifecyclePromise(lifecycle); +} + +export const arenaCommand: SlashCommand = { + name: 'arena', + description: 'Manage Arena sessions', + kind: CommandKind.BUILT_IN, + subCommands: [ + { + name: 'start', + description: + 'Start an Arena session with multiple models competing on the same task', + kind: CommandKind.BUILT_IN, + action: async ( + context: CommandContext, + args: string, + ): Promise => { + const executionMode = context.executionMode ?? 'interactive'; + if (executionMode !== 'interactive') { + return { + type: 'message', + messageType: 'error', + content: + 'Arena is not supported in non-interactive mode. 
Use interactive mode to start an Arena session.', + }; + } + + const { services, ui } = context; + const { config } = services; + + if (!config) { + return { + type: 'message', + messageType: 'error', + content: 'Configuration not available.', + }; + } + + // Refuse to start if a session already exists (regardless of status) + const existingManager = config.getArenaManager(); + if (existingManager) { + return { + type: 'message', + messageType: 'error', + content: + 'An Arena session exists. Use /arena stop or /arena select to end it before starting a new one.', + }; + } + + const parsed = parseArenaArgs(args); + if (parsed.models.length === 0) { + return { + type: 'dialog', + dialog: 'arena_start', + }; + } + + const executionInput = buildArenaExecutionInput(parsed, config); + if ('type' in executionInput) { + return executionInput; + } + + executeArenaCommand(config, ui, executionInput); + }, + }, + { + name: 'stop', + description: 'Stop the current Arena session', + kind: CommandKind.BUILT_IN, + action: async ( + context: CommandContext, + ): Promise => { + const executionMode = context.executionMode ?? 'interactive'; + if (executionMode !== 'interactive') { + return { + type: 'message', + messageType: 'error', + content: + 'Arena is not supported in non-interactive mode. Use interactive mode to stop an Arena session.', + }; + } + + const { config } = context.services; + if (!config) { + return { + type: 'message', + messageType: 'error', + content: 'Configuration not available.', + }; + } + + const manager = config.getArenaManager(); + if (!manager) { + return { + type: 'message', + messageType: 'error', + content: 'No running Arena session found.', + }; + } + + return { + type: 'dialog', + dialog: 'arena_stop', + }; + }, + }, + { + name: 'status', + description: 'Show the current Arena session status', + kind: CommandKind.BUILT_IN, + action: async ( + context: CommandContext, + ): Promise => { + const executionMode = context.executionMode ?? 
'interactive'; + if (executionMode !== 'interactive') { + return { + type: 'message', + messageType: 'error', + content: 'Arena is not supported in non-interactive mode.', + }; + } + + const { config } = context.services; + if (!config) { + return { + type: 'message', + messageType: 'error', + content: 'Configuration not available.', + }; + } + + const manager = config.getArenaManager(); + if (!manager) { + return { + type: 'message', + messageType: 'error', + content: 'No Arena session found. Start one with /arena start.', + }; + } + + return { + type: 'dialog', + dialog: 'arena_status', + }; + }, + }, + { + name: 'select', + altNames: ['choose'], + description: + 'Select a model result and merge its diff into the current workspace', + kind: CommandKind.BUILT_IN, + action: async ( + context: CommandContext, + args: string, + ): Promise< + | void + | MessageActionReturn + | OpenDialogActionReturn + | ConfirmActionReturn + > => { + const executionMode = context.executionMode ?? 'interactive'; + if (executionMode !== 'interactive') { + return { + type: 'message', + messageType: 'error', + content: 'Arena is not supported in non-interactive mode.', + }; + } + + const { config } = context.services; + if (!config) { + return { + type: 'message', + messageType: 'error', + content: 'Configuration not available.', + }; + } + + const manager = config.getArenaManager(); + + if (!manager) { + return { + type: 'message', + messageType: 'error', + content: 'No arena session found. Start one with /arena start.', + }; + } + + const sessionStatus = manager.getSessionStatus(); + if ( + sessionStatus === ArenaSessionStatus.RUNNING || + sessionStatus === ArenaSessionStatus.INITIALIZING + ) { + return { + type: 'message', + messageType: 'error', + content: + 'Arena session is still running. Wait for it to complete or use /arena stop first.', + }; + } + + // Handle --discard flag before checking for successful agents, + // so users can clean up worktrees even when all agents failed. 
+ const trimmedArgs = args.trim(); + if (trimmedArgs === '--discard') { + if (!context.overwriteConfirmed) { + return { + type: 'confirm_action', + prompt: 'Discard all Arena results and clean up worktrees?', + originalInvocation: { + raw: context.invocation?.raw || '/arena select --discard', + }, + }; + } + + await config.cleanupArenaRuntime(true); + return { + type: 'message', + messageType: 'info', + content: 'Arena results discarded. All worktrees cleaned up.', + }; + } + + const agents = manager.getAgentStates(); + const hasSuccessful = agents.some((a) => isSuccessStatus(a.status)); + + if (!hasSuccessful) { + return { + type: 'message', + messageType: 'error', + content: + 'No successful agent results to select from. All agents failed or were cancelled.\n' + + 'Use /arena stop to end the session.', + }; + } + + // Handle direct model selection via args + if (trimmedArgs) { + const matchingAgent = agents.find( + (a) => + isSuccessStatus(a.status) && + a.model.modelId.toLowerCase() === trimmedArgs.toLowerCase(), + ); + + if (!matchingAgent) { + return { + type: 'message', + messageType: 'error', + content: `No idle agent found matching "${trimmedArgs}".`, + }; + } + + const label = matchingAgent.model.modelId; + const result = await manager.applyAgentResult(matchingAgent.agentId); + if (!result.success) { + return { + type: 'message', + messageType: 'error', + content: `Failed to apply changes from ${label}: ${result.error}`, + }; + } + + await config.cleanupArenaRuntime(true); + return { + type: 'message', + messageType: 'info', + content: `Applied changes from ${label} to workspace. 
Arena session complete.`, + }; + } + + // No args → open the select dialog + return { + type: 'dialog', + dialog: 'arena_select', + }; + }, + }, + ], +}; diff --git a/packages/cli/src/ui/commands/types.ts b/packages/cli/src/ui/commands/types.ts index 76eda2c071..20eb695ce6 100644 --- a/packages/cli/src/ui/commands/types.ts +++ b/packages/cli/src/ui/commands/types.ts @@ -139,6 +139,10 @@ export interface OpenDialogActionReturn { dialog: | 'help' + | 'arena_start' + | 'arena_select' + | 'arena_stop' + | 'arena_status' | 'auth' | 'theme' | 'editor' diff --git a/packages/cli/src/ui/components/BaseTextInput.tsx b/packages/cli/src/ui/components/BaseTextInput.tsx new file mode 100644 index 0000000000..07eb1a6934 --- /dev/null +++ b/packages/cli/src/ui/components/BaseTextInput.tsx @@ -0,0 +1,287 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview BaseTextInput — shared text input component with rendering + * and common readline keyboard handling. + * + * Provides: + * - Viewport line rendering from a TextBuffer with cursor display + * - Placeholder support when buffer is empty + * - Configurable border/prefix styling + * - Standard readline shortcuts (Ctrl+A/E/K/U/W, Escape, etc.) + * - An `onKeypress` interceptor so consumers can layer custom behavior + * + * Used by both InputPrompt (with syntax highlighting + complex key handling) + * and AgentComposer (with minimal customization). 
+ */ + +import type React from 'react'; +import { useCallback } from 'react'; +import { Box, Text } from 'ink'; +import chalk from 'chalk'; +import type { TextBuffer } from './shared/text-buffer.js'; +import type { Key } from '../hooks/useKeypress.js'; +import { useKeypress } from '../hooks/useKeypress.js'; +import { keyMatchers, Command } from '../keyMatchers.js'; +import { cpSlice, cpLen } from '../utils/textUtils.js'; +import { theme } from '../semantic-colors.js'; + +// ─── Types ────────────────────────────────────────────────── + +export interface RenderLineOptions { + /** The text content of this visual line. */ + lineText: string; + /** Whether the cursor is on this visual line. */ + isOnCursorLine: boolean; + /** The cursor column within this visual line (visual col, not logical). */ + cursorCol: number; + /** Whether the cursor should be rendered. */ + showCursor: boolean; + /** Index of this line within the rendered viewport (0-based). */ + visualLineIndex: number; + /** Absolute visual line index (scrollVisualRow + visualLineIndex). */ + absoluteVisualIndex: number; + /** The underlying text buffer. */ + buffer: TextBuffer; + /** The first visible visual row (scroll offset). */ + scrollVisualRow: number; +} + +export interface BaseTextInputProps { + /** The text buffer driving this input. */ + buffer: TextBuffer; + /** Called when the user submits (Enter). Buffer is cleared automatically. */ + onSubmit: (text: string) => void; + /** + * Optional key interceptor. Called before default readline handling. + * Return `true` if the key was handled (skips default processing). + */ + onKeypress?: (key: Key) => boolean; + /** Whether to show the blinking block cursor. Defaults to true. */ + showCursor?: boolean; + /** Placeholder text shown when the buffer is empty. */ + placeholder?: string; + /** Custom prefix node (defaults to `> `). */ + prefix?: React.ReactNode; + /** Border color for the input box. 
*/ + borderColor?: string; + /** Whether keyboard handling is active. Defaults to true. */ + isActive?: boolean; + /** + * Custom line renderer for advanced rendering (e.g. syntax highlighting). + * When not provided, lines are rendered as plain text with cursor overlay. + */ + renderLine?: (opts: RenderLineOptions) => React.ReactNode; +} + +// ─── Default line renderer ────────────────────────────────── + +/** + * Renders a single visual line with an inverse-video block cursor. + * Uses codepoint-aware string operations for Unicode/emoji safety. + */ +export function defaultRenderLine({ + lineText, + isOnCursorLine, + cursorCol, + showCursor, +}: RenderLineOptions): React.ReactNode { + if (!isOnCursorLine || !showCursor) { + return {lineText || ' '}; + } + + const len = cpLen(lineText); + + // Cursor past end of line — append inverse space + if (cursorCol >= len) { + return ( + + {lineText} + {chalk.inverse(' ') + '\u200B'} + + ); + } + + const before = cpSlice(lineText, 0, cursorCol); + const cursorChar = cpSlice(lineText, cursorCol, cursorCol + 1); + const after = cpSlice(lineText, cursorCol + 1); + + return ( + + {before} + {chalk.inverse(cursorChar)} + {after} + + ); +} + +// ─── Component ────────────────────────────────────────────── + +export const BaseTextInput: React.FC = ({ + buffer, + onSubmit, + onKeypress, + showCursor = true, + placeholder, + prefix, + borderColor, + isActive = true, + renderLine = defaultRenderLine, +}) => { + // ── Keyboard handling ── + + const handleKey = useCallback( + (key: Key) => { + // Let the consumer intercept first + if (onKeypress?.(key)) { + return; + } + + // ── Standard readline shortcuts ── + + // Submit (Enter, no modifiers) + if (keyMatchers[Command.SUBMIT](key)) { + if (buffer.text.trim()) { + const text = buffer.text; + buffer.setText(''); + onSubmit(text); + } + return; + } + + // Newline (Shift+Enter, Ctrl+Enter, Ctrl+J) + if (keyMatchers[Command.NEWLINE](key)) { + buffer.newline(); + return; + } + + // Escape 
→ clear input + if (keyMatchers[Command.ESCAPE](key)) { + if (buffer.text.length > 0) { + buffer.setText(''); + } + return; + } + + // Ctrl+C → clear input + if (keyMatchers[Command.CLEAR_INPUT](key)) { + if (buffer.text.length > 0) { + buffer.setText(''); + } + return; + } + + // Ctrl+A → home + if (keyMatchers[Command.HOME](key)) { + buffer.move('home'); + return; + } + + // Ctrl+E → end + if (keyMatchers[Command.END](key)) { + buffer.move('end'); + return; + } + + // Ctrl+K → kill to end of line + if (keyMatchers[Command.KILL_LINE_RIGHT](key)) { + buffer.killLineRight(); + return; + } + + // Ctrl+U → kill to start of line + if (keyMatchers[Command.KILL_LINE_LEFT](key)) { + buffer.killLineLeft(); + return; + } + + // Ctrl+W / Alt+Backspace → delete word backward + if (keyMatchers[Command.DELETE_WORD_BACKWARD](key)) { + buffer.deleteWordLeft(); + return; + } + + // Ctrl+X Ctrl+E → open in external editor + if (keyMatchers[Command.OPEN_EXTERNAL_EDITOR](key)) { + buffer.openInExternalEditor(); + return; + } + + // Backspace + if ( + key.name === 'backspace' || + key.sequence === '\x7f' || + (key.ctrl && key.name === 'h') + ) { + buffer.backspace(); + return; + } + + // Fallthrough — delegate to buffer's built-in input handler + buffer.handleInput(key); + }, + [buffer, onSubmit, onKeypress], + ); + + useKeypress(handleKey, { isActive }); + + // ── Rendering ── + + const linesToRender = buffer.viewportVisualLines; + const [cursorVisualRow, cursorVisualCol] = buffer.visualCursor; + const scrollVisualRow = buffer.visualScrollRow; + + const resolvedBorderColor = borderColor ?? theme.border.focused; + const resolvedPrefix = prefix ?? ( + {'> '} + ); + + return ( + + {resolvedPrefix} + + {buffer.text.length === 0 && placeholder ? ( + showCursor ? 
( + + {chalk.inverse(placeholder.slice(0, 1))} + {placeholder.slice(1)} + + ) : ( + {placeholder} + ) + ) : ( + linesToRender.map((lineText, idx) => { + const absoluteVisualIndex = scrollVisualRow + idx; + const isOnCursorLine = absoluteVisualIndex === cursorVisualRow; + + return ( + + {renderLine({ + lineText, + isOnCursorLine, + cursorCol: cursorVisualCol, + showCursor, + visualLineIndex: idx, + absoluteVisualIndex, + buffer, + scrollVisualRow, + })} + + ); + }) + )} + + + ); +}; diff --git a/packages/cli/src/ui/components/Composer.tsx b/packages/cli/src/ui/components/Composer.tsx index 1935492459..78eefabc3b 100644 --- a/packages/cli/src/ui/components/Composer.tsx +++ b/packages/cli/src/ui/components/Composer.tsx @@ -104,8 +104,8 @@ export const Composer = () => { {/* Exclusive area: only one component visible at a time */} {/* Hide footer when a confirmation dialog (e.g. ask_user_question) is active */} - {!showSuggestions && - uiState.streamingState !== StreamingState.WaitingForConfirmation && + {uiState.isInputActive && + !showSuggestions && (showShortcuts ? 
( ) : ( diff --git a/packages/cli/src/ui/components/DialogManager.tsx b/packages/cli/src/ui/components/DialogManager.tsx index 26390e2706..11d10303ef 100644 --- a/packages/cli/src/ui/components/DialogManager.tsx +++ b/packages/cli/src/ui/components/DialogManager.tsx @@ -20,6 +20,10 @@ import { AuthDialog } from '../auth/AuthDialog.js'; import { EditorSettingsDialog } from './EditorSettingsDialog.js'; import { PermissionsModifyTrustDialog } from './PermissionsModifyTrustDialog.js'; import { ModelDialog } from './ModelDialog.js'; +import { ArenaStartDialog } from './arena/ArenaStartDialog.js'; +import { ArenaSelectDialog } from './arena/ArenaSelectDialog.js'; +import { ArenaStopDialog } from './arena/ArenaStopDialog.js'; +import { ArenaStatusDialog } from './arena/ArenaStatusDialog.js'; import { ApprovalModeDialog } from './ApprovalModeDialog.js'; import { theme } from '../semantic-colors.js'; import { useUIState } from '../contexts/UIStateContext.js'; @@ -237,6 +241,49 @@ export const DialogManager = ({ if (uiState.isModelDialogOpen) { return ; } + if (uiState.activeArenaDialog === 'start') { + return ( + uiActions.closeArenaDialog()} + onConfirm={(models) => uiActions.handleArenaModelsSelected?.(models)} + /> + ); + } + if (uiState.activeArenaDialog === 'status') { + const arenaManager = config.getArenaManager(); + if (arenaManager) { + return ( + + ); + } + } + if (uiState.activeArenaDialog === 'stop') { + return ( + + ); + } + if (uiState.activeArenaDialog === 'select') { + const arenaManager = config.getArenaManager(); + if (arenaManager) { + return ( + + ); + } + } + if (uiState.isAuthDialogOpen || uiState.authError) { return ( diff --git a/packages/cli/src/ui/components/HistoryItemDisplay.tsx b/packages/cli/src/ui/components/HistoryItemDisplay.tsx index a82847cc8f..c50d9d874b 100644 --- a/packages/cli/src/ui/components/HistoryItemDisplay.tsx +++ b/packages/cli/src/ui/components/HistoryItemDisplay.tsx @@ -24,6 +24,7 @@ import { WarningMessage, ErrorMessage, 
RetryCountdownMessage, + SuccessMessage, } from './messages/StatusMessages.js'; import { Box } from 'ink'; import { AboutBox } from './AboutBox.js'; @@ -38,6 +39,7 @@ import { getMCPServerStatus } from '@qwen-code/qwen-code-core'; import { SkillsList } from './views/SkillsList.js'; import { ToolsList } from './views/ToolsList.js'; import { McpStatus } from './views/McpStatus.js'; +import { ArenaAgentCard, ArenaSessionCard } from './arena/ArenaCards.js'; import { InsightProgressMessage } from './messages/InsightProgressMessage.js'; interface HistoryItemDisplayProps { @@ -132,6 +134,9 @@ const HistoryItemDisplayComponent: React.FC = ({ {itemForDisplay.type === 'info' && ( )} + {itemForDisplay.type === 'success' && ( + + )} {itemForDisplay.type === 'warning' && ( )} @@ -191,6 +196,18 @@ const HistoryItemDisplayComponent: React.FC = ({ {itemForDisplay.type === 'mcp_status' && ( )} + {itemForDisplay.type === 'arena_agent_complete' && ( + + )} + {itemForDisplay.type === 'arena_session_complete' && ( + + )} {itemForDisplay.type === 'insight_progress' && ( )} diff --git a/packages/cli/src/ui/components/InputPrompt.tsx b/packages/cli/src/ui/components/InputPrompt.tsx index 8b6ff40847..52add983b0 100644 --- a/packages/cli/src/ui/components/InputPrompt.tsx +++ b/packages/cli/src/ui/components/InputPrompt.tsx @@ -18,7 +18,6 @@ import { useShellHistory } from '../hooks/useShellHistory.js'; import { useReverseSearchCompletion } from '../hooks/useReverseSearchCompletion.js'; import { useCommandCompletion } from '../hooks/useCommandCompletion.js'; import type { Key } from '../hooks/useKeypress.js'; -import { useKeypress } from '../hooks/useKeypress.js'; import { keyMatchers, Command } from '../keyMatchers.js'; import type { CommandContext, SlashCommand } from '../commands/types.js'; import type { Config } from '@qwen-code/qwen-code-core'; @@ -43,7 +42,13 @@ import { useShellFocusState } from '../contexts/ShellFocusContext.js'; import { useUIState } from 
'../contexts/UIStateContext.js'; import { useUIActions } from '../contexts/UIActionsContext.js'; import { useKeypressContext } from '../contexts/KeypressContext.js'; +import { + useAgentViewState, + useAgentViewActions, +} from '../contexts/AgentViewContext.js'; import { FEEDBACK_DIALOG_KEYS } from '../FeedbackDialog.js'; +import { BaseTextInput } from './BaseTextInput.js'; +import type { RenderLineOptions } from './BaseTextInput.js'; /** * Represents an attachment (e.g., pasted image) displayed above the input prompt @@ -78,30 +83,8 @@ export interface InputPromptProps { isEmbeddedShellFocused?: boolean; } -// The input content, input container, and input suggestions list may have different widths -export const calculatePromptWidths = (terminalWidth: number) => { - const widthFraction = 0.9; - const FRAME_PADDING_AND_BORDER = 4; // Border (2) + padding (2) - const PROMPT_PREFIX_WIDTH = 2; // '> ' or '! ' - const MIN_CONTENT_WIDTH = 2; - - const innerContentWidth = - Math.floor(terminalWidth * widthFraction) - - FRAME_PADDING_AND_BORDER - - PROMPT_PREFIX_WIDTH; - - const inputWidth = Math.max(MIN_CONTENT_WIDTH, innerContentWidth); - const FRAME_OVERHEAD = FRAME_PADDING_AND_BORDER + PROMPT_PREFIX_WIDTH; - const containerWidth = inputWidth + FRAME_OVERHEAD; - const suggestionsWidth = Math.max(20, Math.floor(terminalWidth * 1.0)); - - return { - inputWidth, - containerWidth, - suggestionsWidth, - frameOverhead: FRAME_OVERHEAD, - } as const; -}; +// Re-export from shared utils for backwards compatibility +export { calculatePromptWidths } from '../utils/layoutUtils.js'; // Large paste placeholder thresholds const LARGE_PASTE_CHAR_THRESHOLD = 1000; @@ -132,6 +115,9 @@ export const InputPrompt: React.FC = ({ const uiState = useUIState(); const uiActions = useUIActions(); const { pasteWorkaround } = useKeypressContext(); + const { agents, agentTabBarFocused } = useAgentViewState(); + const { setAgentTabBarFocused } = useAgentViewActions(); + const hasAgents = agents.size > 
0; const [justNavigatedHistory, setJustNavigatedHistory] = useState(false); const [escPressCount, setEscPressCount] = useState(0); const [showEscapePrompt, setShowEscapePrompt] = useState(false); @@ -230,7 +216,8 @@ export const InputPrompt: React.FC = ({ const resetCommandSearchCompletionState = commandSearchCompletion.resetCompletionState; - const showCursor = focus && isShellFocused && !isEmbeddedShellFocused; + const showCursor = + focus && isShellFocused && !isEmbeddedShellFocused && !agentTabBarFocused; const resetEscapeState = useCallback(() => { if (escapeTimerRef.current) { @@ -356,6 +343,17 @@ export const InputPrompt: React.FC = ({ onChange: customSetTextAndResetCompletionSignal, }); + // When an arena session starts (agents appear), reset history position so + // that pressing down-arrow immediately focuses the agent tab bar instead + // of cycling through input history. + const prevHasAgentsRef = useRef(hasAgents); + useEffect(() => { + if (hasAgents && !prevHasAgentsRef.current) { + inputHistory.resetHistoryNav(); + } + prevHasAgentsRef.current = hasAgents; + }, [hasAgents, inputHistory]); + // Effect to reset completion if history navigation just occurred and set the text useEffect(() => { if (justNavigatedHistory) { @@ -416,13 +414,30 @@ export const InputPrompt: React.FC = ({ }, []); const handleInput = useCallback( - (key: Key) => { + (key: Key): boolean => { + // When the tab bar has focus, block all non-printable keys so arrow + // keys and shortcuts don't interfere. Printable characters fall + // through to BaseTextInput's default handler so the first keystroke + // appears in the input immediately (the tab bar handler releases + // focus on the same event). 
+ if (agentTabBarFocused) { + if ( + key.sequence && + key.sequence.length === 1 && + !key.ctrl && + !key.meta + ) { + return false; // let BaseTextInput type the character + } + return true; // consume non-printable keys + } + // TODO(jacobr): this special case is likely not needed anymore. // We should probably stop supporting paste if the InputPrompt is not // focused. /// We want to handle paste even when not focused to support drag and drop. if (!focus && !key.paste) { - return; + return true; } if (key.paste) { @@ -464,18 +479,18 @@ export const InputPrompt: React.FC = ({ // Normal paste handling for small content buffer.handleInput(key); } - return; + return true; } if (vimHandleInput && vimHandleInput(key)) { - return; + return true; } // Handle feedback dialog keyboard interactions when dialog is open if (uiState.isFeedbackDialogOpen) { // If it's one of the feedback option keys (1-4), let FeedbackDialog handle it if ((FEEDBACK_DIALOG_KEYS as readonly string[]).includes(key.name)) { - return; + return true; } else { // For any other key, close feedback dialog temporarily and continue with normal processing uiActions.temporaryCloseFeedbackDialog(); @@ -501,7 +516,7 @@ export const InputPrompt: React.FC = ({ } setShellModeActive(!shellModeActive); buffer.setText(''); // Clear the '!' from input - return; + return true; } // Toggle keyboard shortcuts display with "?" 
when buffer is empty @@ -512,7 +527,7 @@ export const InputPrompt: React.FC = ({ onToggleShortcuts ) { onToggleShortcuts(); - return; + return true; } // Hide shortcuts on any other key press @@ -542,33 +557,33 @@ export const InputPrompt: React.FC = ({ setReverseSearchActive, reverseSearchCompletion.resetCompletionState, ); - return; + return true; } if (commandSearchActive) { cancelSearch( setCommandSearchActive, commandSearchCompletion.resetCompletionState, ); - return; + return true; } if (shellModeActive) { setShellModeActive(false); resetEscapeState(); - return; + return true; } if (completion.showSuggestions) { completion.resetCompletionState(); setExpandedSuggestionIndex(-1); resetEscapeState(); - return; + return true; } // Handle double ESC for clearing input if (escPressCount === 0) { if (buffer.text === '') { - return; + return true; } setEscPressCount(1); setShowEscapePrompt(true); @@ -584,7 +599,7 @@ export const InputPrompt: React.FC = ({ resetCompletionState(); resetEscapeState(); } - return; + return true; } // Ctrl+Y: Retry the last failed request. @@ -594,19 +609,19 @@ export const InputPrompt: React.FC = ({ // If no failed request exists, a message will be shown to the user. 
if (keyMatchers[Command.RETRY_LAST](key)) { uiActions.handleRetryLastPrompt(); - return; + return true; } if (shellModeActive && keyMatchers[Command.REVERSE_SEARCH](key)) { setReverseSearchActive(true); setTextBeforeReverseSearch(buffer.text); setCursorPosition(buffer.cursor); - return; + return true; } if (keyMatchers[Command.CLEAR_SCREEN](key)) { onClearScreen(); - return; + return true; } if (reverseSearchActive || commandSearchActive) { @@ -631,29 +646,29 @@ export const InputPrompt: React.FC = ({ if (showSuggestions) { if (keyMatchers[Command.NAVIGATION_UP](key)) { navigateUp(); - return; + return true; } if (keyMatchers[Command.NAVIGATION_DOWN](key)) { navigateDown(); - return; + return true; } if (keyMatchers[Command.COLLAPSE_SUGGESTION](key)) { if (suggestions[activeSuggestionIndex].value.length >= MAX_WIDTH) { setExpandedSuggestionIndex(-1); - return; + return true; } } if (keyMatchers[Command.EXPAND_SUGGESTION](key)) { if (suggestions[activeSuggestionIndex].value.length >= MAX_WIDTH) { setExpandedSuggestionIndex(activeSuggestionIndex); - return; + return true; } } if (keyMatchers[Command.ACCEPT_SUGGESTION_REVERSE_SEARCH](key)) { sc.handleAutocomplete(activeSuggestionIndex); resetState(); setActive(false); - return; + return true; } } @@ -665,7 +680,7 @@ export const InputPrompt: React.FC = ({ handleSubmitAndClear(textToSubmit); resetState(); setActive(false); - return; + return true; } // Prevent up/down from falling through to regular history navigation @@ -673,14 +688,14 @@ export const InputPrompt: React.FC = ({ keyMatchers[Command.NAVIGATION_UP](key) || keyMatchers[Command.NAVIGATION_DOWN](key) ) { - return; + return true; } } // If the command is a perfect match, pressing enter should execute it. 
if (completion.isPerfectMatch && keyMatchers[Command.RETURN](key)) { handleSubmitAndClear(buffer.text); - return; + return true; } if (completion.showSuggestions) { @@ -688,12 +703,12 @@ export const InputPrompt: React.FC = ({ if (keyMatchers[Command.COMPLETION_UP](key)) { completion.navigateUp(); setExpandedSuggestionIndex(-1); // Reset expansion when navigating - return; + return true; } if (keyMatchers[Command.COMPLETION_DOWN](key)) { completion.navigateDown(); setExpandedSuggestionIndex(-1); // Reset expansion when navigating - return; + return true; } } @@ -708,7 +723,7 @@ export const InputPrompt: React.FC = ({ setExpandedSuggestionIndex(-1); // Reset expansion after selection } } - return; + return true; } } @@ -716,28 +731,28 @@ export const InputPrompt: React.FC = ({ if (isAttachmentMode && attachments.length > 0) { if (key.name === 'left') { setSelectedAttachmentIndex((i) => Math.max(0, i - 1)); - return; + return true; } if (key.name === 'right') { setSelectedAttachmentIndex((i) => Math.min(attachments.length - 1, i + 1), ); - return; + return true; } if (keyMatchers[Command.NAVIGATION_DOWN](key)) { // Exit attachment mode and return to input setIsAttachmentMode(false); setSelectedAttachmentIndex(-1); - return; + return true; } if (key.name === 'backspace' || key.name === 'delete') { handleAttachmentDelete(selectedAttachmentIndex); - return; + return true; } if (key.name === 'return' || key.name === 'escape') { setIsAttachmentMode(false); setSelectedAttachmentIndex(-1); - return; + return true; } // For other keys, exit attachment mode and let input handle them setIsAttachmentMode(false); @@ -758,7 +773,7 @@ export const InputPrompt: React.FC = ({ ) { setIsAttachmentMode(true); setSelectedAttachmentIndex(attachments.length - 1); - return; + return true; } if (!shellModeActive) { @@ -766,16 +781,16 @@ export const InputPrompt: React.FC = ({ setCommandSearchActive(true); setTextBeforeReverseSearch(buffer.text); setCursorPosition(buffer.cursor); - return; + 
return true; } if (keyMatchers[Command.HISTORY_UP](key)) { inputHistory.navigateUp(); - return; + return true; } if (keyMatchers[Command.HISTORY_DOWN](key)) { inputHistory.navigateDown(); - return; + return true; } // Handle arrow-up/down for history on single-line or at edges if ( @@ -784,27 +799,33 @@ export const InputPrompt: React.FC = ({ (buffer.visualCursor[0] === 0 && buffer.visualScrollRow === 0)) ) { inputHistory.navigateUp(); - return; + return true; } if ( keyMatchers[Command.NAVIGATION_DOWN](key) && (buffer.allVisualLines.length === 1 || buffer.visualCursor[0] === buffer.allVisualLines.length - 1) ) { - inputHistory.navigateDown(); - return; + if (inputHistory.navigateDown()) { + return true; + } + if (hasAgents) { + setAgentTabBarFocused(true); + return true; + } + return true; } } else { // Shell History Navigation if (keyMatchers[Command.NAVIGATION_UP](key)) { const prevCommand = shellHistory.getPreviousCommand(); if (prevCommand !== null) buffer.setText(prevCommand); - return; + return true; } if (keyMatchers[Command.NAVIGATION_DOWN](key)) { const nextCommand = shellHistory.getNextCommand(); if (nextCommand !== null) buffer.setText(nextCommand); - return; + return true; } } @@ -815,7 +836,7 @@ export const InputPrompt: React.FC = ({ // paste markers may not work reliably and Enter key events can leak from pasted text. 
if (pasteWorkaround && recentPasteTime !== null) { // Paste occurred recently, ignore this submit to prevent auto-execution - return; + return true; } const [row, col] = buffer.cursor; @@ -828,65 +849,21 @@ export const InputPrompt: React.FC = ({ handleSubmitAndClear(buffer.text); } } - return; - } - - // Newline insertion - if (keyMatchers[Command.NEWLINE](key)) { - buffer.newline(); - return; - } - - // Ctrl+A (Home) / Ctrl+E (End) - if (keyMatchers[Command.HOME](key)) { - buffer.move('home'); - return; - } - if (keyMatchers[Command.END](key)) { - buffer.move('end'); - return; - } - // Ctrl+C (Clear input) - if (keyMatchers[Command.CLEAR_INPUT](key)) { - if (buffer.text.length > 0) { - buffer.setText(''); - resetCompletionState(); - } - return; - } - - // Kill line commands - if (keyMatchers[Command.KILL_LINE_RIGHT](key)) { - buffer.killLineRight(); - return; - } - if (keyMatchers[Command.KILL_LINE_LEFT](key)) { - buffer.killLineLeft(); - return; - } - - if (keyMatchers[Command.DELETE_WORD_BACKWARD](key)) { - buffer.deleteWordLeft(); - return; - } - - // External editor - if (keyMatchers[Command.OPEN_EXTERNAL_EDITOR](key)) { - buffer.openInExternalEditor(); - return; + return true; } // Ctrl+V for clipboard image paste if (keyMatchers[Command.PASTE_CLIPBOARD_IMAGE](key)) { handleClipboardImage(); - return; + return true; } // Handle backspace with placeholder-aware deletion if ( - key.name === 'backspace' || - key.sequence === '\x7f' || - (key.ctrl && key.name === 'h') + pendingPastes.size > 0 && + (key.name === 'backspace' || + key.sequence === '\x7f' || + (key.ctrl && key.name === 'h')) ) { const text = buffer.text; const [row, col] = buffer.cursor; @@ -899,7 +876,6 @@ export const InputPrompt: React.FC = ({ offset += col; // Check if we're at the end of any placeholder - let placeholderDeleted = false; for (const placeholder of pendingPastes.keys()) { const placeholderStart = offset - placeholder.length; if ( @@ -918,20 +894,22 @@ export const InputPrompt: 
React.FC = ({ if (parsed) { freePlaceholderId(parsed.charCount, parsed.id); } - placeholderDeleted = true; - break; + return true; } } + // No placeholder matched — fall through to BaseTextInput's default backspace + } - if (!placeholderDeleted) { - // Normal backspace behavior - buffer.backspace(); + // Ctrl+C with completion active — also reset completion state + if (keyMatchers[Command.CLEAR_INPUT](key)) { + if (buffer.text.length > 0) { + resetCompletionState(); } - return; + // Fall through to BaseTextInput's default CLEAR_INPUT handler } - // Fall back to the text buffer's default input handling for all other keys - buffer.handleInput(key); + // All remaining keys (readline shortcuts, text input) handled by BaseTextInput + return false; }, [ focus, @@ -969,15 +947,89 @@ export const InputPrompt: React.FC = ({ pendingPastes, parsePlaceholder, freePlaceholderId, + agentTabBarFocused, + hasAgents, + setAgentTabBarFocused, ], ); - useKeypress(handleInput, { isActive: !isEmbeddedShellFocused }); + const renderLineWithHighlighting = useCallback( + (opts: RenderLineOptions): React.ReactNode => { + const { + lineText, + isOnCursorLine, + cursorCol: cursorVisualColAbsolute, + showCursor: showCursorOpt, + absoluteVisualIndex, + buffer: buf, + } = opts; + const mapEntry = buf.visualToLogicalMap[absoluteVisualIndex]; + const [logicalLineIdx, logicalStartCol] = mapEntry; + const logicalLine = buf.lines[logicalLineIdx] || ''; + const tokens = parseInputForHighlighting(logicalLine, logicalLineIdx); + + const visualStart = logicalStartCol; + const visualEnd = logicalStartCol + cpLen(lineText); + const segments = buildSegmentsForVisualSlice( + tokens, + visualStart, + visualEnd, + ); + + const renderedLine: React.ReactNode[] = []; + let charCount = 0; + segments.forEach((seg, segIdx) => { + const segLen = cpLen(seg.text); + let display = seg.text; + + if (isOnCursorLine) { + const segStart = charCount; + const segEnd = segStart + segLen; + if ( + cursorVisualColAbsolute >= 
segStart && + cursorVisualColAbsolute < segEnd + ) { + const charToHighlight = cpSlice( + seg.text, + cursorVisualColAbsolute - segStart, + cursorVisualColAbsolute - segStart + 1, + ); + const highlighted = showCursorOpt + ? chalk.inverse(charToHighlight) + : charToHighlight; + display = + cpSlice(seg.text, 0, cursorVisualColAbsolute - segStart) + + highlighted + + cpSlice(seg.text, cursorVisualColAbsolute - segStart + 1); + } + charCount = segEnd; + } + + const color = + seg.type === 'command' || seg.type === 'file' + ? theme.text.accent + : theme.text.primary; - const linesToRender = buffer.viewportVisualLines; - const [cursorVisualRowAbsolute, cursorVisualColAbsolute] = - buffer.visualCursor; - const scrollVisualRow = buffer.visualScrollRow; + renderedLine.push( + + {display} + , + ); + }); + + if (isOnCursorLine && cursorVisualColAbsolute === cpLen(lineText)) { + // Add zero-width space after cursor to prevent Ink from trimming trailing whitespace + renderedLine.push( + + {showCursorOpt ? chalk.inverse(' ') + '\u200B' : ' \u200B'} + , + ); + } + + return {renderedLine}; + }, + [], + ); const getActiveCompletion = () => { if (commandSearchActive) return commandSearchCompletion; @@ -1014,10 +1066,33 @@ export const InputPrompt: React.FC = ({ } const borderColor = - isShellFocused && !isEmbeddedShellFocused + isShellFocused && !isEmbeddedShellFocused && !agentTabBarFocused ? (statusColor ?? theme.border.focused) : theme.border.default; + const prefixNode = ( + + {shellModeActive ? ( + reverseSearchActive ? ( + + (r:){' '} + + ) : ( + '!' + ) + ) : commandSearchActive ? ( + (r:) + ) : showYoloStyling ? ( + '*' + ) : ( + '>' + )}{' '} + + ); + return ( <> {attachments.length > 0 && ( @@ -1037,142 +1112,17 @@ export const InputPrompt: React.FC = ({ ))} )} - - - {shellModeActive ? ( - reverseSearchActive ? ( - - (r:){' '} - - ) : ( - '!' - ) - ) : commandSearchActive ? ( - (r:) - ) : showYoloStyling ? 
( - '*' - ) : ( - '>' - )}{' '} - - - {buffer.text.length === 0 && placeholder ? ( - showCursor ? ( - - {chalk.inverse(placeholder.slice(0, 1))} - {placeholder.slice(1)} - - ) : ( - {placeholder} - ) - ) : ( - linesToRender.map((lineText, visualIdxInRenderedSet) => { - const absoluteVisualIdx = - scrollVisualRow + visualIdxInRenderedSet; - const mapEntry = buffer.visualToLogicalMap[absoluteVisualIdx]; - const cursorVisualRow = cursorVisualRowAbsolute - scrollVisualRow; - const isOnCursorLine = - focus && visualIdxInRenderedSet === cursorVisualRow; - - const renderedLine: React.ReactNode[] = []; - - const [logicalLineIdx, logicalStartCol] = mapEntry; - const logicalLine = buffer.lines[logicalLineIdx] || ''; - const tokens = parseInputForHighlighting( - logicalLine, - logicalLineIdx, - ); - - const visualStart = logicalStartCol; - const visualEnd = logicalStartCol + cpLen(lineText); - const segments = buildSegmentsForVisualSlice( - tokens, - visualStart, - visualEnd, - ); - - let charCount = 0; - segments.forEach((seg, segIdx) => { - const segLen = cpLen(seg.text); - let display = seg.text; - - if (isOnCursorLine) { - const relativeVisualColForHighlight = cursorVisualColAbsolute; - const segStart = charCount; - const segEnd = segStart + segLen; - if ( - relativeVisualColForHighlight >= segStart && - relativeVisualColForHighlight < segEnd - ) { - const charToHighlight = cpSlice( - seg.text, - relativeVisualColForHighlight - segStart, - relativeVisualColForHighlight - segStart + 1, - ); - const highlighted = showCursor - ? chalk.inverse(charToHighlight) - : charToHighlight; - display = - cpSlice( - seg.text, - 0, - relativeVisualColForHighlight - segStart, - ) + - highlighted + - cpSlice( - seg.text, - relativeVisualColForHighlight - segStart + 1, - ); - } - charCount = segEnd; - } - - const color = - seg.type === 'command' || seg.type === 'file' - ? 
theme.text.accent - : theme.text.primary; - - renderedLine.push( - - {display} - , - ); - }); - - if ( - isOnCursorLine && - cursorVisualColAbsolute === cpLen(lineText) - ) { - // Add zero-width space after cursor to prevent Ink from trimming trailing whitespace - renderedLine.push( - - {showCursor ? chalk.inverse(' ') + '\u200B' : ' \u200B'} - , - ); - } - - return ( - - {renderedLine} - - ); - }) - )} - - + isActive={!isEmbeddedShellFocused} + renderLine={renderLineWithHighlighting} + /> {shouldShowSuggestions && ( = ({ : null; return ( - + {/* Main loading line */} > should truncate long primary text instead of wrapping 1`] = ` -"MockResponding This is an extremely long loading phrase that should be truncated in t (esc to -Spinner cancel, 5s)" +" MockResponding This is an extremely long loading phrase that should be truncated in (esc to + Spinner cancel, 5s)" `; diff --git a/packages/cli/src/ui/components/agent-view/AgentChatView.tsx b/packages/cli/src/ui/components/agent-view/AgentChatView.tsx new file mode 100644 index 0000000000..4853164364 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/AgentChatView.tsx @@ -0,0 +1,272 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentChatView — displays a single in-process agent's conversation. + * + * Renders the agent's message history using HistoryItemDisplay — the same + * component used by the main agent view. AgentMessage[] is converted to + * HistoryItem[] by agentMessagesToHistoryItems() so all 27 HistoryItem types + * are available without duplicating rendering logic. + * + * Layout: + * - Static area: finalized messages (efficient Ink ) + * - Live area: tool groups still executing / awaiting confirmation + * - Status line: spinner while the agent is running + * + * Model text output is shown only after each round completes (no live + * streaming), which avoids per-chunk re-renders and keeps the display simple. 
+ */ + +import { Box, Text, Static } from 'ink'; +import { useMemo, useState, useEffect, useCallback, useRef } from 'react'; +import { + AgentStatus, + AgentEventType, + getGitBranch, + type AgentStatusChangeEvent, +} from '@qwen-code/qwen-code-core'; +import { + useAgentViewState, + useAgentViewActions, +} from '../../contexts/AgentViewContext.js'; +import { useUIState } from '../../contexts/UIStateContext.js'; +import { useTerminalSize } from '../../hooks/useTerminalSize.js'; +import { HistoryItemDisplay } from '../HistoryItemDisplay.js'; +import { ToolCallStatus } from '../../types.js'; +import { theme } from '../../semantic-colors.js'; +import { GeminiRespondingSpinner } from '../GeminiRespondingSpinner.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { agentMessagesToHistoryItems } from './agentHistoryAdapter.js'; +import { AgentHeader } from './AgentHeader.js'; + +// ─── Main Component ───────────────────────────────────────── + +interface AgentChatViewProps { + agentId: string; +} + +export const AgentChatView = ({ agentId }: AgentChatViewProps) => { + const { agents } = useAgentViewState(); + const { setAgentShellFocused } = useAgentViewActions(); + const uiState = useUIState(); + const { historyRemountKey, availableTerminalHeight, constrainHeight } = + uiState; + const { columns: terminalWidth } = useTerminalSize(); + const agent = agents.get(agentId); + const contentWidth = terminalWidth - 4; + + // Force re-render on message updates and status changes. + // STREAM_TEXT is deliberately excluded — model text is shown only after + // each round completes (via committed messages), avoiding per-chunk re-renders. 
+ const [, setRenderTick] = useState(0); + const tickRef = useRef(0); + const forceRender = useCallback(() => { + tickRef.current += 1; + setRenderTick(tickRef.current); + }, []); + + useEffect(() => { + if (!agent) return; + + const emitter = agent.interactiveAgent.getEventEmitter(); + if (!emitter) return; + + const onStatusChange = (_event: AgentStatusChangeEvent) => forceRender(); + const onToolCall = () => forceRender(); + const onToolResult = () => forceRender(); + const onRoundEnd = () => forceRender(); + const onApproval = () => forceRender(); + const onOutputUpdate = () => forceRender(); + + emitter.on(AgentEventType.STATUS_CHANGE, onStatusChange); + emitter.on(AgentEventType.TOOL_CALL, onToolCall); + emitter.on(AgentEventType.TOOL_RESULT, onToolResult); + emitter.on(AgentEventType.ROUND_END, onRoundEnd); + emitter.on(AgentEventType.TOOL_WAITING_APPROVAL, onApproval); + emitter.on(AgentEventType.TOOL_OUTPUT_UPDATE, onOutputUpdate); + + return () => { + emitter.off(AgentEventType.STATUS_CHANGE, onStatusChange); + emitter.off(AgentEventType.TOOL_CALL, onToolCall); + emitter.off(AgentEventType.TOOL_RESULT, onToolResult); + emitter.off(AgentEventType.ROUND_END, onRoundEnd); + emitter.off(AgentEventType.TOOL_WAITING_APPROVAL, onApproval); + emitter.off(AgentEventType.TOOL_OUTPUT_UPDATE, onOutputUpdate); + }; + }, [agent, forceRender]); + + const interactiveAgent = agent?.interactiveAgent; + const messages = interactiveAgent?.getMessages() ?? []; + const pendingApprovals = interactiveAgent?.getPendingApprovals(); + const liveOutputs = interactiveAgent?.getLiveOutputs(); + const shellPids = interactiveAgent?.getShellPids(); + const status = interactiveAgent?.getStatus(); + const isRunning = + status === AgentStatus.RUNNING || status === AgentStatus.INITIALIZING; + + // Derive the active PTY PID: first shell PID among currently-executing tools. + // Resets naturally to undefined when the tool finishes (shellPids cleared). 
+ const activePtyId = + shellPids && shellPids.size > 0 + ? shellPids.values().next().value + : undefined; + + // Track whether the user has toggled input focus into the embedded shell. + // Mirrors the main agent's embeddedShellFocused in AppContainer. + const [embeddedShellFocused, setEmbeddedShellFocusedLocal] = useState(false); + + // Sync to AgentViewContext so AgentTabBar can suppress arrow-key navigation + // when an agent's embedded shell is focused. + useEffect(() => { + setAgentShellFocused(embeddedShellFocused); + return () => setAgentShellFocused(false); + }, [embeddedShellFocused, setAgentShellFocused]); + + // Reset focus when the shell exits (activePtyId disappears). + useEffect(() => { + if (!activePtyId) setEmbeddedShellFocusedLocal(false); + }, [activePtyId]); + + // Ctrl+F: toggle shell input focus when a PTY is active. + useKeypress( + (key) => { + if (key.ctrl && key.name === 'f') { + if (activePtyId || embeddedShellFocused) { + setEmbeddedShellFocusedLocal((prev) => !prev); + } + } + }, + { isActive: true }, + ); + + // Convert AgentMessage[] → HistoryItem[] via adapter. + // tickRef.current in deps ensures we rebuild when events fire even if + // messages.length and pendingApprovals.size haven't changed (e.g. a + // tool result updates an existing entry in place). + const allItems = useMemo( + () => + agentMessagesToHistoryItems( + messages, + pendingApprovals ?? new Map(), + liveOutputs, + shellPids, + ), + // eslint-disable-next-line react-hooks/exhaustive-deps + [ + agentId, + messages.length, + pendingApprovals?.size, + liveOutputs?.size, + shellPids?.size, + tickRef.current, + ], + ); + + // Split into committed (Static) and pending (live area). + // Any tool_group with an Executing or Confirming tool — plus everything + // after it — stays in the live area so confirmation dialogs remain + // interactive (Ink's <Static> cannot receive input).
+ const splitIndex = useMemo(() => { + for (let idx = allItems.length - 1; idx >= 0; idx--) { + const item = allItems[idx]!; + if ( + item.type === 'tool_group' && + item.tools.some( + (t) => + t.status === ToolCallStatus.Executing || + t.status === ToolCallStatus.Confirming, + ) + ) { + return idx; + } + } + return allItems.length; // all committed + }, [allItems]); + + const committedItems = allItems.slice(0, splitIndex); + const pendingItems = allItems.slice(splitIndex); + + const core = interactiveAgent?.getCore(); + const agentWorkingDir = core?.runtimeContext.getTargetDir() ?? ''; + // Cache the branch — it won't change during the agent's lifetime and + // getGitBranch uses synchronous execSync which blocks the render loop. + const agentGitBranch = useMemo( + () => (agentWorkingDir ? getGitBranch(agentWorkingDir) : ''), + // eslint-disable-next-line react-hooks/exhaustive-deps + [agentId], + ); + + if (!agent || !interactiveAgent || !core) { + return ( + + + Agent "{agentId}" not found. + + + ); + } + + const agentModelId = core.modelConfig.model ?? ''; + + return ( + + {/* Committed message history. + key includes historyRemountKey: when refreshStatic() clears the + terminal it bumps the key, forcing Static to remount and re-emit + all items on the cleared screen. */} + , + ...committedItems.map((item) => ( + + )), + ]} + > + {(item) => item} + + + {/* Live area — tool groups awaiting confirmation or still executing. + Must remain outside Static so confirmation dialogs are interactive. + Pass PTY state so ShellInputPrompt is reachable via Ctrl+F. 
*/} + {pendingItems.map((item) => ( + + ))} + + {/* Spinner */} + {isRunning && ( + + + + )} + + ); +}; diff --git a/packages/cli/src/ui/components/agent-view/AgentComposer.tsx b/packages/cli/src/ui/components/agent-view/AgentComposer.tsx new file mode 100644 index 0000000000..d26d5db2fe --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/AgentComposer.tsx @@ -0,0 +1,308 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentComposer — footer area for in-process agent tabs. + * + * Replaces the main Composer when an agent tab is active so that: + * - The loading indicator reflects the agent's status (not the main agent) + * - The input prompt sends messages to the agent (via enqueueMessage) + * - Keyboard events are scoped — no conflict with the main InputPrompt + * + * Wraps its content in a local StreamingContext.Provider so reusable + * components like LoadingIndicator and GeminiRespondingSpinner read the + * agent's derived streaming state instead of the main agent's. 
+ */ + +import { Box, Text, useStdin } from 'ink'; +import { useCallback, useEffect, useMemo, useState } from 'react'; +import { + AgentStatus, + isTerminalStatus, + ApprovalMode, + APPROVAL_MODES, +} from '@qwen-code/qwen-code-core'; +import { + useAgentViewState, + useAgentViewActions, +} from '../../contexts/AgentViewContext.js'; +import { useConfig } from '../../contexts/ConfigContext.js'; +import { StreamingContext } from '../../contexts/StreamingContext.js'; +import { StreamingState } from '../../types.js'; +import { useTerminalSize } from '../../hooks/useTerminalSize.js'; +import { useAgentStreamingState } from '../../hooks/useAgentStreamingState.js'; +import { useKeypress, type Key } from '../../hooks/useKeypress.js'; +import { useTextBuffer } from '../shared/text-buffer.js'; +import { calculatePromptWidths } from '../../utils/layoutUtils.js'; +import { BaseTextInput } from '../BaseTextInput.js'; +import { LoadingIndicator } from '../LoadingIndicator.js'; +import { QueuedMessageDisplay } from '../QueuedMessageDisplay.js'; +import { AgentFooter } from './AgentFooter.js'; +import { keyMatchers, Command } from '../../keyMatchers.js'; +import { theme } from '../../semantic-colors.js'; +import { t } from '../../../i18n/index.js'; + +// ─── Types ────────────────────────────────────────────────── + +interface AgentComposerProps { + agentId: string; +} + +// ─── Component ────────────────────────────────────────────── + +export const AgentComposer: React.FC = ({ agentId }) => { + const { agents, agentTabBarFocused, agentShellFocused, agentApprovalModes } = + useAgentViewState(); + const { + setAgentInputBufferText, + setAgentTabBarFocused, + setAgentApprovalMode, + } = useAgentViewActions(); + const agent = agents.get(agentId); + const interactiveAgent = agent?.interactiveAgent; + + const config = useConfig(); + const { columns: terminalWidth } = useTerminalSize(); + const { inputWidth } = calculatePromptWidths(terminalWidth); + const { stdin, setRawMode } = 
useStdin(); + + const { + status, + streamingState, + isInputActive, + elapsedTime, + lastPromptTokenCount, + } = useAgentStreamingState(interactiveAgent); + + // ── Escape to cancel the active agent round ── + + useKeypress( + (key) => { + if ( + key.name === 'escape' && + streamingState === StreamingState.Responding + ) { + interactiveAgent?.cancelCurrentRound(); + } + }, + { + isActive: + streamingState === StreamingState.Responding && !agentShellFocused, + }, + ); + + // ── Shift+Tab to cycle this agent's approval mode ── + + const agentApprovalMode = + agentApprovalModes.get(agentId) ?? ApprovalMode.DEFAULT; + + useKeypress( + (key) => { + const isShiftTab = key.shift && key.name === 'tab'; + const isWindowsTab = + process.platform === 'win32' && + key.name === 'tab' && + !key.ctrl && + !key.meta; + if (isShiftTab || isWindowsTab) { + const currentIndex = APPROVAL_MODES.indexOf(agentApprovalMode); + const nextIndex = + currentIndex === -1 ? 0 : (currentIndex + 1) % APPROVAL_MODES.length; + setAgentApprovalMode(agentId, APPROVAL_MODES[nextIndex]!); + } + }, + { isActive: !agentShellFocused }, + ); + + // ── Input buffer (independent from main agent) ── + + const isValidPath = useCallback((): boolean => false, []); + + const buffer = useTextBuffer({ + initialText: '', + viewport: { height: 3, width: inputWidth }, + stdin, + setRawMode, + isValidPath, + }); + + // Sync agent buffer text to context so AgentTabBar can guard tab switching + useEffect(() => { + setAgentInputBufferText(buffer.text); + return () => setAgentInputBufferText(''); + }, [buffer.text, setAgentInputBufferText]); + + // When agent input is not active (agent running, completed, etc.), + // auto-focus the tab bar so arrow keys switch tabs directly. 
+ // We also depend on streamingState so that transitions like + // WaitingForConfirmation → Responding re-trigger the effect — the + // approval keypress releases tab-bar focus (printable char handler), + // but isInputActive stays false throughout, so without this extra + // dependency the focus would never be restored. + useEffect(() => { + if (!isInputActive) { + setAgentTabBarFocused(true); + } + }, [isInputActive, streamingState, setAgentTabBarFocused]); + + // ── Focus management between input and tab bar ── + + const handleKeypress = useCallback( + (key: Key): boolean => { + // When tab bar has focus, block all non-printable keys so they don't + // act on the hidden buffer. Printable characters fall through to + // BaseTextInput naturally; the tab bar handler releases focus on the + // same event so the keystroke appears in the input immediately. + if (agentTabBarFocused) { + if ( + key.sequence && + key.sequence.length === 1 && + !key.ctrl && + !key.meta + ) { + return false; // let BaseTextInput type the character + } + return true; // consume non-printable keys + } + + // Down arrow at the bottom edge (or empty buffer) → focus the tab bar + if (keyMatchers[Command.NAVIGATION_DOWN](key)) { + if ( + buffer.text === '' || + buffer.allVisualLines.length === 1 || + buffer.visualCursor[0] === buffer.allVisualLines.length - 1 + ) { + setAgentTabBarFocused(true); + return true; + } + } + return false; + }, + [buffer, agentTabBarFocused, setAgentTabBarFocused], + ); + + // ── Message queue (accumulate while streaming, flush as one prompt on idle) ── + + const [messageQueue, setMessageQueue] = useState([]); + + // When agent becomes idle (and not terminal), flush queued messages. 
+ useEffect(() => { + if ( + streamingState === StreamingState.Idle && + messageQueue.length > 0 && + status !== undefined && + !isTerminalStatus(status) + ) { + const combined = messageQueue.join('\n'); + setMessageQueue([]); + interactiveAgent?.enqueueMessage(combined); + } + }, [streamingState, messageQueue, interactiveAgent, status]); + + const handleSubmit = useCallback( + (text: string) => { + const trimmed = text.trim(); + if (!trimmed || !interactiveAgent) return; + if (streamingState === StreamingState.Idle) { + interactiveAgent.enqueueMessage(trimmed); + } else { + setMessageQueue((prev) => [...prev, trimmed]); + } + }, + [interactiveAgent, streamingState], + ); + + // ── Render ── + + const statusLabel = useMemo(() => { + switch (status) { + case AgentStatus.COMPLETED: + return { text: t('Completed'), color: theme.status.success }; + case AgentStatus.FAILED: + return { + text: t('Failed: {{error}}', { + error: + interactiveAgent?.getError() ?? + interactiveAgent?.getLastRoundError() ?? + 'unknown', + }), + color: theme.status.error, + }; + case AgentStatus.CANCELLED: + return { text: t('Cancelled'), color: theme.text.secondary }; + default: + return null; + } + }, [status, interactiveAgent]); + + // ── Approval-mode styling (mirrors main InputPrompt) ── + + const isYolo = agentApprovalMode === ApprovalMode.YOLO; + const isAutoAccept = agentApprovalMode !== ApprovalMode.DEFAULT; + + const statusColor = isYolo + ? theme.status.errorDim + : isAutoAccept + ? theme.status.warningDim + : undefined; + + const inputBorderColor = + !isInputActive || agentTabBarFocused + ? theme.border.default + : (statusColor ?? theme.border.focused); + + const prefixNode = ( + {isYolo ? '*' : '>'} + ); + + return ( + + + {/* Loading indicator — mirrors main Composer but reads agent's + streaming state via the overridden StreamingContext. 
*/} + + + {/* Terminal status for completed/failed agents */} + {statusLabel && ( + + {statusLabel.text} + + )} + + + + {/* Input prompt — always visible, like the main Composer */} + + + {/* Footer: approval mode + context usage */} + + + + ); +}; diff --git a/packages/cli/src/ui/components/agent-view/AgentFooter.tsx b/packages/cli/src/ui/components/agent-view/AgentFooter.tsx new file mode 100644 index 0000000000..7b05e4e478 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/AgentFooter.tsx @@ -0,0 +1,66 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Lightweight footer for agent tabs showing approval mode + * and context usage. Mirrors the main Footer layout but without + * main-agent-specific concerns (vim mode, shell mode, exit prompts, etc.). + */ + +import type React from 'react'; +import { Box, Text } from 'ink'; +import { ApprovalMode } from '@qwen-code/qwen-code-core'; +import { AutoAcceptIndicator } from '../AutoAcceptIndicator.js'; +import { ContextUsageDisplay } from '../ContextUsageDisplay.js'; +import { theme } from '../../semantic-colors.js'; + +interface AgentFooterProps { + approvalMode: ApprovalMode | undefined; + promptTokenCount: number; + contextWindowSize: number | undefined; + terminalWidth: number; +} + +export const AgentFooter: React.FC = ({ + approvalMode, + promptTokenCount, + contextWindowSize, + terminalWidth, +}) => { + const showApproval = + approvalMode !== undefined && approvalMode !== ApprovalMode.DEFAULT; + const showContext = promptTokenCount > 0 && contextWindowSize !== undefined; + + if (!showApproval && !showContext) { + return null; + } + + return ( + + + {showApproval ? 
( + + ) : null} + + + {showContext && ( + + + + )} + + + ); +}; diff --git a/packages/cli/src/ui/components/agent-view/AgentHeader.tsx b/packages/cli/src/ui/components/agent-view/AgentHeader.tsx new file mode 100644 index 0000000000..1bf9d4c34b --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/AgentHeader.tsx @@ -0,0 +1,64 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Compact header for agent tabs, visually distinct from the + * main view's boxed logo header. Shows model, working directory, and git + * branch in a bordered info panel. + */ + +import type React from 'react'; +import { Box, Text } from 'ink'; +import { shortenPath, tildeifyPath } from '@qwen-code/qwen-code-core'; +import { theme } from '../../semantic-colors.js'; +import { useTerminalSize } from '../../hooks/useTerminalSize.js'; + +interface AgentHeaderProps { + modelId: string; + modelName?: string; + workingDirectory: string; + gitBranch?: string; +} + +export const AgentHeader: React.FC = ({ + modelId, + modelName, + workingDirectory, + gitBranch, +}) => { + const { columns: terminalWidth } = useTerminalSize(); + const maxPathLen = Math.max(20, terminalWidth - 12); + const displayPath = shortenPath(tildeifyPath(workingDirectory), maxPathLen); + + const modelText = + modelName && modelName !== modelId ? 
`${modelId} (${modelName})` : modelId; + + return ( + + + {'Model: '} + {modelText} + + + {'Path: '} + {displayPath} + + {gitBranch && ( + + {'Branch: '} + {gitBranch} + + )} + + ); +}; diff --git a/packages/cli/src/ui/components/agent-view/AgentTabBar.tsx b/packages/cli/src/ui/components/agent-view/AgentTabBar.tsx new file mode 100644 index 0000000000..c7b0b113c2 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/AgentTabBar.tsx @@ -0,0 +1,167 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentTabBar — horizontal tab strip for in-process agent views. + * + * Rendered at the top of the terminal whenever in-process agents are registered. + * + * On the main tab, Left/Right switch tabs when the input buffer is empty. + * On agent tabs, the tab bar uses an exclusive-focus model: + * - Down arrow at the input's bottom edge focuses the tab bar + * - Left/Right switch tabs only when the tab bar is focused + * - Up arrow or typing returns focus to the input + * + * Tab indicators: running, idle/completed, failed, cancelled + */ + +import { Box, Text } from 'ink'; +import { useState, useEffect, useCallback } from 'react'; +import { AgentStatus, AgentEventType } from '@qwen-code/qwen-code-core'; +import { + useAgentViewState, + useAgentViewActions, + type RegisteredAgent, +} from '../../contexts/AgentViewContext.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { useUIState } from '../../contexts/UIStateContext.js'; +import { theme } from '../../semantic-colors.js'; + +// ─── Status Indicators ────────────────────────────────────── + +function statusIndicator(agent: RegisteredAgent): { + symbol: string; + color: string; +} { + const status = agent.interactiveAgent.getStatus(); + switch (status) { + case AgentStatus.RUNNING: + case AgentStatus.INITIALIZING: + return { symbol: '\u25CF', color: theme.status.warning }; // ● running + case AgentStatus.IDLE: + return { 
symbol: '\u25CF', color: theme.status.success }; // ● idle (ready) + case AgentStatus.COMPLETED: + return { symbol: '\u2713', color: theme.status.success }; // ✓ completed + case AgentStatus.FAILED: + return { symbol: '\u2717', color: theme.status.error }; // ✗ failed + case AgentStatus.CANCELLED: + return { symbol: '\u25CB', color: theme.text.secondary }; // ○ cancelled + default: + return { symbol: '\u25CB', color: theme.text.secondary }; // ○ fallback + } +} + +// ─── Component ────────────────────────────────────────────── + +export const AgentTabBar: React.FC = () => { + const { activeView, agents, agentShellFocused, agentTabBarFocused } = + useAgentViewState(); + const { switchToNext, switchToPrevious, setAgentTabBarFocused } = + useAgentViewActions(); + const { embeddedShellFocused } = useUIState(); + + useKeypress( + (key) => { + if (embeddedShellFocused || agentShellFocused) return; + if (!agentTabBarFocused) return; + + if (key.name === 'left') { + switchToPrevious(); + } else if (key.name === 'right') { + switchToNext(); + } else if (key.name === 'up') { + setAgentTabBarFocused(false); + } else if ( + key.sequence && + key.sequence.length === 1 && + !key.ctrl && + !key.meta + ) { + // Printable character → return focus to input (key falls through + // to BaseTextInput's useKeypress and gets typed normally) + setAgentTabBarFocused(false); + } + }, + { isActive: true }, + ); + + // Subscribe to STATUS_CHANGE events from all agents so the tab bar + // re-renders when an agent's status transitions (e.g. RUNNING → COMPLETED). + // Without this, status indicators would be stale until the next unrelated render. 
+ const [, setTick] = useState(0); + const forceRender = useCallback(() => setTick((t) => t + 1), []); + + useEffect(() => { + const cleanups: Array<() => void> = []; + for (const [, agent] of agents) { + const emitter = agent.interactiveAgent.getEventEmitter(); + if (emitter) { + emitter.on(AgentEventType.STATUS_CHANGE, forceRender); + cleanups.push(() => + emitter.off(AgentEventType.STATUS_CHANGE, forceRender), + ); + } + } + return () => cleanups.forEach((fn) => fn()); + }, [agents, forceRender]); + + const isFocused = agentTabBarFocused; + + // Navigation hint varies by context + const hint = isFocused ? '\u2190/\u2192 switch \u2191 input' : '\u2193 tabs'; + + return ( + + {/* Main tab */} + + + {' Main '} + + + + {/* Separator */} + + {'\u2502'} + + + {/* Agent tabs */} + {[...agents.entries()].map(([agentId, agent]) => { + const isActive = activeView === agentId; + const { symbol, color: indicatorColor } = statusIndicator(agent); + + return ( + + + {` ${agent.modelId} `} + + + {` ${symbol}`} + + + ); + })} + + {/* Navigation hint */} + + {hint} + + + ); +}; diff --git a/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.test.ts b/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.test.ts new file mode 100644 index 0000000000..afedfc2b68 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.test.ts @@ -0,0 +1,510 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect } from 'vitest'; +import { agentMessagesToHistoryItems } from './agentHistoryAdapter.js'; +import type { + AgentMessage, + ToolCallConfirmationDetails, +} from '@qwen-code/qwen-code-core'; +import { ToolCallStatus } from '../../types.js'; + +// ─── Helpers ──────────────────────────────────────────────── + +function msg( + role: AgentMessage['role'], + content: string, + extra?: Partial<AgentMessage>, +): AgentMessage { + return { role, content, timestamp: 0, ...extra }; +} + 
+const 
noApprovals = new Map(); + +function toolCallMsg( + callId: string, + toolName: string, + opts?: { description?: string; renderOutputAsMarkdown?: boolean }, +): AgentMessage { + return msg('tool_call', `Tool call: ${toolName}`, { + metadata: { + callId, + toolName, + description: opts?.description ?? '', + renderOutputAsMarkdown: opts?.renderOutputAsMarkdown, + }, + }); +} + +function toolResultMsg( + callId: string, + toolName: string, + opts?: { + success?: boolean; + resultDisplay?: string; + outputFile?: string; + }, +): AgentMessage { + return msg('tool_result', `Tool ${toolName}`, { + metadata: { + callId, + toolName, + success: opts?.success ?? true, + resultDisplay: opts?.resultDisplay, + outputFile: opts?.outputFile, + }, + }); +} + +// ─── Role mapping ──────────────────────────────────────────── + +describe('agentMessagesToHistoryItems — role mapping', () => { + it('maps user message', () => { + const items = agentMessagesToHistoryItems( + [msg('user', 'hello')], + noApprovals, + ); + expect(items).toHaveLength(1); + expect(items[0]).toMatchObject({ type: 'user', text: 'hello' }); + }); + + it('maps plain assistant message', () => { + const items = agentMessagesToHistoryItems( + [msg('assistant', 'response')], + noApprovals, + ); + expect(items[0]).toMatchObject({ type: 'gemini', text: 'response' }); + }); + + it('maps thought assistant message', () => { + const items = agentMessagesToHistoryItems( + [msg('assistant', 'thinking...', { thought: true })], + noApprovals, + ); + expect(items[0]).toMatchObject({ + type: 'gemini_thought', + text: 'thinking...', + }); + }); + + it('maps assistant message with error metadata', () => { + const items = agentMessagesToHistoryItems( + [msg('assistant', 'oops', { metadata: { error: true } })], + noApprovals, + ); + expect(items[0]).toMatchObject({ type: 'error', text: 'oops' }); + }); + + it('maps info message with no level → type info', () => { + const items = agentMessagesToHistoryItems( + [msg('info', 'note')], + 
noApprovals, + ); + expect(items[0]).toMatchObject({ type: 'info', text: 'note' }); + }); + + it.each([ + ['warning', 'warning'], + ['success', 'success'], + ['error', 'error'], + ] as const)('maps info message with level=%s', (level, expectedType) => { + const items = agentMessagesToHistoryItems( + [msg('info', 'text', { metadata: { level } })], + noApprovals, + ); + expect(items[0]).toMatchObject({ type: expectedType }); + }); + + it('maps unknown info level → type info', () => { + const items = agentMessagesToHistoryItems( + [msg('info', 'x', { metadata: { level: 'verbose' } })], + noApprovals, + ); + expect(items[0]).toMatchObject({ type: 'info' }); + }); + + it('skips unknown roles without crashing', () => { + const items = agentMessagesToHistoryItems( + [ + msg('user', 'before'), + // force an unknown role + { role: 'unknown' as AgentMessage['role'], content: 'x', timestamp: 0 }, + msg('user', 'after'), + ], + noApprovals, + ); + expect(items).toHaveLength(2); + expect(items[0]).toMatchObject({ type: 'user', text: 'before' }); + expect(items[1]).toMatchObject({ type: 'user', text: 'after' }); + }); +}); + +// ─── Tool grouping ─────────────────────────────────────────── + +describe('agentMessagesToHistoryItems — tool grouping', () => { + it('merges a tool_call + tool_result pair into one tool_group', () => { + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'read_file'), toolResultMsg('c1', 'read_file')], + noApprovals, + ); + expect(items).toHaveLength(1); + expect(items[0]!.type).toBe('tool_group'); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools).toHaveLength(1); + expect(group.tools[0]!.name).toBe('read_file'); + }); + + it('merges multiple parallel tool calls into one tool_group', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'read_file'), + toolCallMsg('c2', 'write_file'), + toolResultMsg('c1', 'read_file'), + toolResultMsg('c2', 'write_file'), + 
], + noApprovals, + ); + expect(items).toHaveLength(1); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools).toHaveLength(2); + expect(group.tools[0]!.name).toBe('read_file'); + expect(group.tools[1]!.name).toBe('write_file'); + }); + + it('preserves tool call order by first appearance', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c2', 'second'), + toolCallMsg('c1', 'first'), + toolResultMsg('c1', 'first'), + toolResultMsg('c2', 'second'), + ], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.name).toBe('second'); + expect(group.tools[1]!.name).toBe('first'); + }); + + it('breaks tool groups at non-tool messages', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'tool_a'), + toolResultMsg('c1', 'tool_a'), + msg('assistant', 'between'), + toolCallMsg('c2', 'tool_b'), + toolResultMsg('c2', 'tool_b'), + ], + noApprovals, + ); + expect(items).toHaveLength(3); + expect(items[0]!.type).toBe('tool_group'); + expect(items[1]!.type).toBe('gemini'); + expect(items[2]!.type).toBe('tool_group'); + }); + + it('handles tool_result arriving without a prior tool_call gracefully', () => { + const items = agentMessagesToHistoryItems( + [ + toolResultMsg('c1', 'orphan', { + success: true, + resultDisplay: 'output', + }), + ], + noApprovals, + ); + expect(items).toHaveLength(1); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.callId).toBe('c1'); + expect(group.tools[0]!.status).toBe(ToolCallStatus.Success); + }); +}); + +// ─── Tool status ───────────────────────────────────────────── + +describe('agentMessagesToHistoryItems — tool status', () => { + it('Executing: tool_call with no result yet', () => { + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + noApprovals, + ); + const group = items[0] as 
Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.status).toBe(ToolCallStatus.Executing); + }); + + it('Success: tool_result with success=true', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'read'), + toolResultMsg('c1', 'read', { success: true }), + ], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.status).toBe(ToolCallStatus.Success); + }); + + it('Error: tool_result with success=false', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'write'), + toolResultMsg('c1', 'write', { success: false }), + ], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.status).toBe(ToolCallStatus.Error); + }); + + it('Confirming: tool_call present in pendingApprovals', () => { + const fakeApproval = {} as ToolCallConfirmationDetails; + const approvals = new Map([['c1', fakeApproval]]); + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + approvals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.status).toBe(ToolCallStatus.Confirming); + expect(group.tools[0]!.confirmationDetails).toBe(fakeApproval); + }); + + it('Confirming takes priority over Executing', () => { + // pending approval AND no result yet → Confirming, not Executing + const approvals = new Map([['c1', {} as ToolCallConfirmationDetails]]); + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + approvals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.status).toBe(ToolCallStatus.Confirming); + }); +}); + +// ─── Tool metadata ─────────────────────────────────────────── + +describe('agentMessagesToHistoryItems — tool metadata', () => { + it('forwards resultDisplay from 
tool_result', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'read'), + toolResultMsg('c1', 'read', { + success: true, + resultDisplay: 'file contents', + }), + ], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.resultDisplay).toBe('file contents'); + }); + + it('forwards renderOutputAsMarkdown from tool_call', () => { + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'web_fetch', { renderOutputAsMarkdown: true }), + toolResultMsg('c1', 'web_fetch', { success: true }), + ], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.renderOutputAsMarkdown).toBe(true); + }); + + it('forwards description from tool_call', () => { + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'read', { description: 'reading src/index.ts' })], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.description).toBe('reading src/index.ts'); + }); +}); + +// ─── liveOutputs overlay ───────────────────────────────────── + +describe('agentMessagesToHistoryItems — liveOutputs', () => { + it('uses liveOutput as resultDisplay for Executing tools', () => { + const liveOutputs = new Map([['c1', 'live stdout so far']]); + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + noApprovals, + liveOutputs, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.resultDisplay).toBe('live stdout so far'); + }); + + it('ignores liveOutput for completed tools', () => { + const liveOutputs = new Map([['c1', 'stale live output']]); + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'shell'), + toolResultMsg('c1', 'shell', { + success: true, + resultDisplay: 'final output', + }), + ], + noApprovals, + 
liveOutputs, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.resultDisplay).toBe('final output'); + }); + + it('falls back to entry resultDisplay when no liveOutput for callId', () => { + const liveOutputs = new Map([['other-id', 'unrelated']]); + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + noApprovals, + liveOutputs, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.resultDisplay).toBeUndefined(); + }); +}); + +// ─── shellPids overlay ─────────────────────────────────────── + +describe('agentMessagesToHistoryItems — shellPids', () => { + it('sets ptyId for Executing tools with a known PID', () => { + const shellPids = new Map([['c1', 12345]]); + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + noApprovals, + undefined, + shellPids, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.ptyId).toBe(12345); + }); + + it('does not set ptyId for completed tools', () => { + const shellPids = new Map([['c1', 12345]]); + const items = agentMessagesToHistoryItems( + [ + toolCallMsg('c1', 'shell'), + toolResultMsg('c1', 'shell', { success: true }), + ], + noApprovals, + undefined, + shellPids, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.ptyId).toBeUndefined(); + }); + + it('does not set ptyId when shellPids is not provided', () => { + const items = agentMessagesToHistoryItems( + [toolCallMsg('c1', 'shell')], + noApprovals, + ); + const group = items[0] as Extract< + (typeof items)[0], + { type: 'tool_group' } + >; + expect(group.tools[0]!.ptyId).toBeUndefined(); + }); +}); + +// ─── ID stability ──────────────────────────────────────────── + +describe('agentMessagesToHistoryItems — ID stability', () => { + it('assigns monotonically increasing 
IDs', () => { + const items = agentMessagesToHistoryItems( + [ + msg('user', 'u1'), + msg('assistant', 'a1'), + msg('info', 'i1'), + toolCallMsg('c1', 'tool'), + toolResultMsg('c1', 'tool'), + ], + noApprovals, + ); + const ids = items.map((i) => i.id); + expect(ids).toEqual([0, 1, 2, 3]); + }); + + it('tool_group consumes one ID regardless of how many calls it contains', () => { + const items = agentMessagesToHistoryItems( + [ + msg('user', 'go'), + toolCallMsg('c1', 'tool_a'), + toolCallMsg('c2', 'tool_b'), + toolResultMsg('c1', 'tool_a'), + toolResultMsg('c2', 'tool_b'), + msg('assistant', 'done'), + ], + noApprovals, + ); + // user=0, tool_group=1, assistant=2 + expect(items.map((i) => i.id)).toEqual([0, 1, 2]); + }); + + it('IDs from a prefix of messages are stable when more messages are appended', () => { + const base: AgentMessage[] = [msg('user', 'u'), msg('assistant', 'a')]; + + const before = agentMessagesToHistoryItems(base, noApprovals); + const after = agentMessagesToHistoryItems( + [...base, msg('info', 'i')], + noApprovals, + ); + + expect(after[0]!.id).toBe(before[0]!.id); + expect(after[1]!.id).toBe(before[1]!.id); + expect(after[2]!.id).toBe(2); + }); +}); diff --git a/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.ts b/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.ts new file mode 100644 index 0000000000..951618abf0 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/agentHistoryAdapter.ts @@ -0,0 +1,194 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview agentHistoryAdapter — converts AgentMessage[] to HistoryItem[]. + * + * This adapter bridges the sub-agent data model (AgentMessage[] from + * AgentInteractive) to the shared rendering model (HistoryItem[] consumed by + * HistoryItemDisplay). It lives in the CLI package so that packages/core types + * are never coupled to CLI rendering types. 
+ * + * ID stability: AgentMessage[] is append-only, so the resulting HistoryItem[] + * only ever grows. Index-based IDs are therefore stable: Ink's <Static> + * requires items never shift or be removed, which this guarantees. + */ + +import type { + AgentMessage, + ToolCallConfirmationDetails, + ToolResultDisplay, +} from '@qwen-code/qwen-code-core'; +import type { HistoryItem, IndividualToolCallDisplay } from '../../types.js'; +import { ToolCallStatus } from '../../types.js'; + +/** + * Convert AgentMessage[] + pendingApprovals into HistoryItem[]. + * + * Consecutive tool_call / tool_result messages are merged into a single + * tool_group HistoryItem. pendingApprovals overlays confirmation state so + * ToolGroupMessage can render confirmation dialogs. + * + * liveOutputs (optional) provides real-time display data for executing tools. + * shellPids (optional) provides PTY PIDs for interactive shell tools so + * HistoryItemDisplay can render ShellInputPrompt on the active shell. + */ +export function agentMessagesToHistoryItems( + messages: readonly AgentMessage[], + pendingApprovals: ReadonlyMap, + liveOutputs?: ReadonlyMap, + shellPids?: ReadonlyMap, +): HistoryItem[] { + const items: HistoryItem[] = []; + let nextId = 0; + let i = 0; + + while (i < messages.length) { + const msg = messages[i]!; + + // ── user ────────────────────────────────────────────────── + if (msg.role === 'user') { + items.push({ type: 'user', text: msg.content, id: nextId++ }); + i++; + + // ── assistant ───────────────────────────────────────────── + } else if (msg.role === 'assistant') { + if (msg.metadata?.['error']) { + items.push({ type: 'error', text: msg.content, id: nextId++ }); + } else if (msg.thought) { + items.push({ type: 'gemini_thought', text: msg.content, id: nextId++ }); + } else { + items.push({ type: 'gemini', text: msg.content, id: nextId++ }); + } + i++; + + // ── info / warning / success / error ────────────────────── + } else if (msg.role === 'info') { + const level = 
msg.metadata?.['level'] as string | undefined; + const type = + level === 'warning' || level === 'success' || level === 'error' + ? level + : 'info'; + items.push({ type, text: msg.content, id: nextId++ }); + i++; + + // ── tool_call / tool_result → tool_group ────────────────── + } else if (msg.role === 'tool_call' || msg.role === 'tool_result') { + const groupId = nextId++; + + const callMap = new Map< + string, + { + callId: string; + name: string; + description: string; + resultDisplay: ToolResultDisplay | string | undefined; + outputFile: string | undefined; + renderOutputAsMarkdown: boolean | undefined; + success: boolean | undefined; + } + >(); + const callOrder: string[] = []; + + while ( + i < messages.length && + (messages[i]!.role === 'tool_call' || + messages[i]!.role === 'tool_result') + ) { + const m = messages[i]!; + const callId = (m.metadata?.['callId'] as string) ?? `unknown-${i}`; + + if (m.role === 'tool_call') { + if (!callMap.has(callId)) callOrder.push(callId); + callMap.set(callId, { + callId, + name: (m.metadata?.['toolName'] as string) ?? 'unknown', + description: (m.metadata?.['description'] as string) ?? 
'', + resultDisplay: undefined, + outputFile: undefined, + renderOutputAsMarkdown: m.metadata?.['renderOutputAsMarkdown'] as + | boolean + | undefined, + success: undefined, + }); + } else { + // tool_result — attach to existing call entry + const entry = callMap.get(callId); + const resultDisplay = m.metadata?.['resultDisplay'] as + | ToolResultDisplay + | string + | undefined; + const outputFile = m.metadata?.['outputFile'] as string | undefined; + const success = m.metadata?.['success'] as boolean; + + if (entry) { + entry.success = success; + entry.resultDisplay = resultDisplay; + entry.outputFile = outputFile; + } else { + // Result arrived without a prior tool_call message (shouldn't + // normally happen, but handle gracefully) + callOrder.push(callId); + callMap.set(callId, { + callId, + name: (m.metadata?.['toolName'] as string) ?? 'unknown', + description: '', + resultDisplay, + outputFile, + renderOutputAsMarkdown: undefined, + success, + }); + } + } + i++; + } + + const tools: IndividualToolCallDisplay[] = callOrder.map((callId) => { + const entry = callMap.get(callId)!; + const approval = pendingApprovals.get(callId); + + let status: ToolCallStatus; + if (approval) { + status = ToolCallStatus.Confirming; + } else if (entry.success === undefined) { + status = ToolCallStatus.Executing; + } else if (entry.success) { + status = ToolCallStatus.Success; + } else { + status = ToolCallStatus.Error; + } + + // For executing tools, use live output if available (Gap 4) + const resultDisplay = + status === ToolCallStatus.Executing && liveOutputs?.has(callId) + ? liveOutputs.get(callId) + : entry.resultDisplay; + + return { + callId: entry.callId, + name: entry.name, + description: entry.description, + resultDisplay, + outputFile: entry.outputFile, + renderOutputAsMarkdown: entry.renderOutputAsMarkdown, + status, + confirmationDetails: approval, + ptyId: + status === ToolCallStatus.Executing + ? 
shellPids?.get(callId) + : undefined, + }; + }); + + items.push({ type: 'tool_group', tools, id: groupId }); + } else { + // Skip unknown roles + i++; + } + } + + return items; +} diff --git a/packages/cli/src/ui/components/agent-view/index.ts b/packages/cli/src/ui/components/agent-view/index.ts new file mode 100644 index 0000000000..c1e595c228 --- /dev/null +++ b/packages/cli/src/ui/components/agent-view/index.ts @@ -0,0 +1,12 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +export { AgentTabBar } from './AgentTabBar.js'; +export { AgentChatView } from './AgentChatView.js'; +export { AgentHeader } from './AgentHeader.js'; +export { AgentComposer } from './AgentComposer.js'; +export { AgentFooter } from './AgentFooter.js'; +export { agentMessagesToHistoryItems } from './agentHistoryAdapter.js'; diff --git a/packages/cli/src/ui/components/arena/ArenaCards.tsx b/packages/cli/src/ui/components/arena/ArenaCards.tsx new file mode 100644 index 0000000000..1ad7d8e2ac --- /dev/null +++ b/packages/cli/src/ui/components/arena/ArenaCards.tsx @@ -0,0 +1,290 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { Box, Text } from 'ink'; +import { theme } from '../../semantic-colors.js'; +import { formatDuration } from '../../utils/formatters.js'; +import { getArenaStatusLabel } from '../../utils/displayUtils.js'; +import type { ArenaAgentCardData } from '../../types.js'; + +// ─── Helpers ──────────────────────────────────────────────── + +// ─── Agent Complete Card ──────────────────────────────────── + +interface ArenaAgentCardProps { + agent: ArenaAgentCardData; + width?: number; +} + +export const ArenaAgentCard: React.FC<ArenaAgentCardProps> = ({ + agent, + width, +}) => { + const { icon, text, color } = getArenaStatusLabel(agent.status); + const duration = formatDuration(agent.durationMs); + const tokens = agent.totalTokens.toLocaleString(); + 
const inTokens = 
agent.inputTokens.toLocaleString(); + const outTokens = agent.outputTokens.toLocaleString(); + + return ( + + {/* Line 1: Status icon + text + label + duration */} + + + {icon} {agent.label} · {text} · {duration} + + + + {/* Line 2: Tokens */} + + + Tokens: {tokens} (in {inTokens}, out {outTokens}) + + + + {/* Line 3: Tool Calls with colored success/error counts */} + + + Tool Calls: {agent.toolCalls} + {agent.failedToolCalls > 0 && ( + <> + {' '} + ( + + ✓ {agent.successfulToolCalls} + + + ✕ {agent.failedToolCalls}) + + )} + + + + {/* Error line (if terminated with error) */} + {agent.error && ( + + {agent.error} + + )} + + ); +}; + +// ─── Session Complete Card ────────────────────────────────── + +interface ArenaSessionCardProps { + sessionStatus: string; + task: string; + totalDurationMs: number; + agents: ArenaAgentCardData[]; + width?: number; +} + +/** + * Pad or truncate a string to a fixed visual width. + */ +function pad( + str: string, + len: number, + align: 'left' | 'right' = 'left', +): string { + if (str.length >= len) return str.slice(0, len); + const padding = ' '.repeat(len - str.length); + return align === 'right' ? padding + str : str + padding; +} + +/** + * Truncate a string to a maximum length, adding ellipsis if truncated. + */ +function truncate(str: string, maxLen: number): string { + if (str.length <= maxLen) return str; + return str.slice(0, maxLen - 1) + '…'; +} + +/** + * Calculate diff stats from a unified diff string. + * Returns the stats string and individual counts for colored rendering. 
+ */ +function getDiffStats(diff: string | undefined): { + text: string; + additions: number; + deletions: number; +} { + if (!diff) return { text: '', additions: 0, deletions: 0 }; + const lines = diff.split('\n'); + let additions = 0; + let deletions = 0; + for (const line of lines) { + if (line.startsWith('+') && !line.startsWith('+++')) { + additions++; + } else if (line.startsWith('-') && !line.startsWith('---')) { + deletions++; + } + } + return { text: `+${additions}/-${deletions}`, additions, deletions }; +} + +const MAX_MODEL_NAME_LENGTH = 35; + +export const ArenaSessionCard: React.FC<ArenaSessionCardProps> = ({ + sessionStatus, + task, + agents, + width, +}) => { + // Truncate task for display + const maxTaskLen = 60; + const displayTask = + task.length > maxTaskLen ? task.slice(0, maxTaskLen - 1) + '…' : task; + + // Column widths for the agent table (unified with Arena Results) + const colStatus = 14; + const colTime = 8; + const colTokens = 10; + const colChanges = 10; + + const titleLabel = + sessionStatus === 'idle' + ? 'Agents Status · Idle' + : sessionStatus === 'completed' + ? 'Arena Complete' + : sessionStatus === 'cancelled' + ? 'Arena Cancelled' + : 'Arena Failed'; + + return ( + + {/* Title - neutral color (not green) */} + + + {titleLabel} + + + + + + {/* Task */} + + + Task: + "{displayTask}" + + + + + + {/* Table header - unified columns: Agent, Status, Time, Tokens, Changes */} + + + + Agent + + + + + Status + + + + + Time + + + + + Tokens + + + + + Changes + + + + + {/* Table separator */} + + + {'─'.repeat((width ?? 
60) - 8)} + + + + {/* Agent rows */} + {agents.map((agent) => { + const { text: statusText, color } = getArenaStatusLabel(agent.status); + const diffStats = getDiffStats(agent.diff); + return ( + + + + {truncate(agent.label, MAX_MODEL_NAME_LENGTH)} + + + + {statusText} + + + + {pad(formatDuration(agent.durationMs), colTime - 1, 'right')} + + + + + {pad( + agent.totalTokens.toLocaleString(), + colTokens - 1, + 'right', + )} + + + + {diffStats.additions > 0 || diffStats.deletions > 0 ? ( + + + +{diffStats.additions} + + / + -{diffStats.deletions} + + ) : ( + - + )} + + + ); + })} + + + + {/* Hint */} + {sessionStatus === 'idle' && ( + + + Switch to an agent tab to continue, or{' '} + /arena select to pick a + winner. + + + )} + {sessionStatus === 'completed' && ( + + + Run /arena select to pick a + winner. + + + )} + + ); +}; diff --git a/packages/cli/src/ui/components/arena/ArenaSelectDialog.tsx b/packages/cli/src/ui/components/arena/ArenaSelectDialog.tsx new file mode 100644 index 0000000000..88fe5a5072 --- /dev/null +++ b/packages/cli/src/ui/components/arena/ArenaSelectDialog.tsx @@ -0,0 +1,260 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { useCallback, useMemo } from 'react'; +import { Box, Text } from 'ink'; +import { + type ArenaManager, + isSuccessStatus, + type Config, +} from '@qwen-code/qwen-code-core'; +import { theme } from '../../semantic-colors.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { MessageType, type HistoryItemWithoutId } from '../../types.js'; +import type { UseHistoryManagerReturn } from '../../hooks/useHistoryManager.js'; +import { formatDuration } from '../../utils/formatters.js'; +import { getArenaStatusLabel } from '../../utils/displayUtils.js'; +import { DescriptiveRadioButtonSelect } from '../shared/DescriptiveRadioButtonSelect.js'; +import type { DescriptiveRadioSelectItem } from 
'../shared/DescriptiveRadioButtonSelect.js'; + +interface ArenaSelectDialogProps { + manager: ArenaManager; + config: Config; + addItem: UseHistoryManagerReturn['addItem']; + closeArenaDialog: () => void; +} + +export function ArenaSelectDialog({ + manager, + config, + addItem, + closeArenaDialog, +}: ArenaSelectDialogProps): React.JSX.Element { + const pushMessage = useCallback( + (result: { messageType: 'info' | 'error'; content: string }) => { + const item: HistoryItemWithoutId = { + type: + result.messageType === 'info' ? MessageType.INFO : MessageType.ERROR, + text: result.content, + }; + addItem(item, Date.now()); + + try { + const chatRecorder = config.getChatRecordingService(); + chatRecorder?.recordSlashCommand({ + phase: 'result', + rawCommand: '/arena select', + outputHistoryItems: [{ ...item } as Record], + }); + } catch { + // Best-effort recording + } + }, + [addItem, config], + ); + + const onSelect = useCallback( + async (agentId: string) => { + closeArenaDialog(); + const mgr = config.getArenaManager(); + if (!mgr) { + pushMessage({ + messageType: 'error', + content: 'No arena session found. Start one with /arena start.', + }); + return; + } + + const agent = + mgr.getAgentState(agentId) ?? + mgr.getAgentStates().find((item) => item.agentId === agentId); + const label = agent?.model.modelId || agentId; + + pushMessage({ + messageType: 'info', + content: `Applying changes from ${label}…`, + }); + const result = await mgr.applyAgentResult(agentId); + if (!result.success) { + pushMessage({ + messageType: 'error', + content: `Failed to apply changes from ${label}: ${result.error}`, + }); + return; + } + + try { + await config.cleanupArenaRuntime(true); + } catch (err) { + pushMessage({ + messageType: 'error', + content: `Warning: failed to clean up arena resources: ${err instanceof Error ? err.message : String(err)}`, + }); + } + pushMessage({ + messageType: 'info', + content: `Applied changes from ${label} to workspace. 
Arena session complete.`, + }); + }, + [closeArenaDialog, config, pushMessage], + ); + + const onDiscard = useCallback(async () => { + closeArenaDialog(); + const mgr = config.getArenaManager(); + if (!mgr) { + pushMessage({ + messageType: 'error', + content: 'No arena session found. Start one with /arena start.', + }); + return; + } + + try { + pushMessage({ + messageType: 'info', + content: 'Discarding Arena results and cleaning up…', + }); + await config.cleanupArenaRuntime(true); + pushMessage({ + messageType: 'info', + content: 'Arena results discarded. All worktrees cleaned up.', + }); + } catch (err) { + pushMessage({ + messageType: 'error', + content: `Failed to clean up arena worktrees: ${err instanceof Error ? err.message : String(err)}`, + }); + } + }, [closeArenaDialog, config, pushMessage]); + + const result = manager.getResult(); + const agents = manager.getAgentStates(); + + const items: Array<DescriptiveRadioSelectItem<string>> = useMemo( + () => + agents.map((agent) => { + const label = agent.model.modelId; + const statusInfo = getArenaStatusLabel(agent.status); + const duration = formatDuration(agent.stats.durationMs); + const tokens = agent.stats.totalTokens.toLocaleString(); + + // Build diff summary from cached result if available + let diffAdditions = 0; + let diffDeletions = 0; + if (isSuccessStatus(agent.status) && result) { + const agentResult = result.agents.find( + (a) => a.agentId === agent.agentId, + ); + if (agentResult?.diff) { + const lines = agentResult.diff.split('\n'); + for (const line of lines) { + if (line.startsWith('+') && !line.startsWith('+++')) { + diffAdditions++; + } else if (line.startsWith('-') && !line.startsWith('---')) { + diffDeletions++; + } + } + } + } + + // Title: full model name (not truncated) + const title = {label}; + + // Description: status, time, tokens, changes (unified with Arena Complete columns) + const description = ( + + {statusInfo.text} + · + {duration} + · + {tokens} tokens + {(diffAdditions > 0 || 
diffDeletions > 0) && ( + <> + 
· + +{diffAdditions} + / + -{diffDeletions} + lines + + )} + + ); + + return { + key: agent.agentId, + value: agent.agentId, + title, + description, + disabled: !isSuccessStatus(agent.status), + }; + }), + [agents, result], + ); + + useKeypress( + (key) => { + if (key.name === 'escape') { + closeArenaDialog(); + } + if (key.name === 'd' && !key.ctrl && !key.meta) { + onDiscard(); + } + }, + { isActive: true }, + ); + + const task = result?.task || ''; + + return ( + + {/* Neutral title color (not green) */} + + Arena Results + + + + + Task: + {`"${task.length > 60 ? task.slice(0, 59) + '…' : task}"`} + + + + + + Select a winner to apply changes: + + + + + !item.disabled)} + onSelect={(agentId: string) => { + onSelect(agentId); + }} + isFocused={true} + showNumbers={false} + /> + + + + + Enter to select, d to discard all, Esc to cancel + + + + ); +} diff --git a/packages/cli/src/ui/components/arena/ArenaStartDialog.tsx b/packages/cli/src/ui/components/arena/ArenaStartDialog.tsx new file mode 100644 index 0000000000..6ce6108873 --- /dev/null +++ b/packages/cli/src/ui/components/arena/ArenaStartDialog.tsx @@ -0,0 +1,161 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { useMemo, useState } from 'react'; +import { Box, Text } from 'ink'; +import Link from 'ink-link'; +import { AuthType } from '@qwen-code/qwen-code-core'; +import { useConfig } from '../../contexts/ConfigContext.js'; +import { theme } from '../../semantic-colors.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { MultiSelect } from '../shared/MultiSelect.js'; +import { t } from '../../../i18n/index.js'; + +interface ArenaStartDialogProps { + onClose: () => void; + onConfirm: (selectedModels: string[]) => void; +} + +const MODEL_PROVIDERS_DOCUMENTATION_URL = + 'https://qwenlm.github.io/qwen-code-docs/en/users/configuration/settings/#modelproviders'; + +export function ArenaStartDialog({ + 
onClose, + onConfirm, +}: ArenaStartDialogProps): React.JSX.Element { + const config = useConfig(); + const [errorMessage, setErrorMessage] = useState(null); + + const modelItems = useMemo(() => { + const allModels = config.getAllConfiguredModels(); + const selectableModels = allModels.filter((model) => !model.isRuntimeModel); + + return selectableModels.map((model) => { + const token = `${model.authType}:${model.id}`; + const isQwenOauth = model.authType === AuthType.QWEN_OAUTH; + return { + key: token, + value: token, + label: `[${model.authType}] ${model.label}`, + disabled: isQwenOauth, + }; + }); + }, [config]); + const hasDisabledQwenOauth = modelItems.some((item) => item.disabled); + const selectableModelCount = modelItems.filter( + (item) => !item.disabled, + ).length; + const needsMoreModels = selectableModelCount < 2; + const shouldShowMoreModelsHint = + selectableModelCount >= 2 && selectableModelCount < 3; + + useKeypress( + (key) => { + if (key.name === 'escape') { + onClose(); + } + }, + { isActive: true }, + ); + + const handleConfirm = (values: string[]) => { + if (values.length < 2) { + setErrorMessage( + t('Please select at least 2 models to start an Arena session.'), + ); + return; + } + + setErrorMessage(null); + onConfirm(values); + }; + + return ( + + {t('Select Models')} + + {modelItems.length === 0 ? ( + + + {t('No models available. Please configure models first.')} + + + ) : ( + + + + )} + + {errorMessage && ( + + {errorMessage} + + )} + + {(hasDisabledQwenOauth || needsMoreModels) && ( + + {hasDisabledQwenOauth && ( + + {t('Note: qwen-oauth models are not supported in Arena.')} + + )} + {needsMoreModels && ( + <> + + {t('Arena requires at least 2 models. 
To add more:')} + + + {t( + ' - Run /auth to set up a Coding Plan (includes multiple models)', + )} + + + {t(' - Or configure modelProviders in settings.json')} + + + )} + + )} + + {shouldShowMoreModelsHint && ( + <> + + + {t('Configure more models with the modelProviders guide:')} + + + + + + {MODEL_PROVIDERS_DOCUMENTATION_URL} + + + + + )} + + + + {t('Space to toggle, Enter to confirm, Esc to cancel')} + + + + ); +} diff --git a/packages/cli/src/ui/components/arena/ArenaStatusDialog.tsx b/packages/cli/src/ui/components/arena/ArenaStatusDialog.tsx new file mode 100644 index 0000000000..e4a48031a7 --- /dev/null +++ b/packages/cli/src/ui/components/arena/ArenaStatusDialog.tsx @@ -0,0 +1,288 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { useEffect, useMemo, useState } from 'react'; +import { Box, Text } from 'ink'; +import { + type ArenaManager, + type ArenaAgentState, + type InProcessBackend, + type AgentStatsSummary, + isSettledStatus, + ArenaSessionStatus, + DISPLAY_MODE, +} from '@qwen-code/qwen-code-core'; +import { theme } from '../../semantic-colors.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { formatDuration } from '../../utils/formatters.js'; +import { getArenaStatusLabel } from '../../utils/displayUtils.js'; + +const STATUS_REFRESH_INTERVAL_MS = 2000; +const IN_PROCESS_REFRESH_INTERVAL_MS = 1000; + +interface ArenaStatusDialogProps { + manager: ArenaManager; + closeArenaDialog: () => void; + width?: number; +} + +function truncate(str: string, maxLen: number): string { + if (str.length <= maxLen) return str; + return str.slice(0, maxLen - 1) + '…'; +} + +function pad( + str: string, + len: number, + align: 'left' | 'right' = 'left', +): string { + if (str.length >= len) return str.slice(0, len); + const padding = ' '.repeat(len - str.length); + return align === 'right' ? 
padding + str : str + padding; +} + +function getElapsedMs(agent: ArenaAgentState): number { + if (isSettledStatus(agent.status)) { + return agent.stats.durationMs; + } + return Date.now() - agent.startedAt; +} + +function getSessionStatusLabel(status: ArenaSessionStatus): { + text: string; + color: string; +} { + switch (status) { + case ArenaSessionStatus.RUNNING: + return { text: 'Running', color: theme.status.success }; + case ArenaSessionStatus.INITIALIZING: + return { text: 'Initializing', color: theme.status.warning }; + case ArenaSessionStatus.IDLE: + return { text: 'Idle', color: theme.status.success }; + case ArenaSessionStatus.COMPLETED: + return { text: 'Completed', color: theme.status.success }; + case ArenaSessionStatus.CANCELLED: + return { text: 'Cancelled', color: theme.status.warning }; + case ArenaSessionStatus.FAILED: + return { text: 'Failed', color: theme.status.error }; + default: + return { text: String(status), color: theme.text.secondary }; + } +} + +const MAX_MODEL_NAME_LENGTH = 35; + +export function ArenaStatusDialog({ + manager, + closeArenaDialog, + width, +}: ArenaStatusDialogProps): React.JSX.Element { + const [tick, setTick] = useState(0); + + // Detect in-process backend for live stats reading + const backend = manager.getBackend(); + const isInProcess = backend?.type === DISPLAY_MODE.IN_PROCESS; + const inProcessBackend = isInProcess ? (backend as InProcessBackend) : null; + + useEffect(() => { + const interval = isInProcess + ? IN_PROCESS_REFRESH_INTERVAL_MS + : STATUS_REFRESH_INTERVAL_MS; + const timer = setInterval(() => { + setTick((prev) => prev + 1); + }, interval); + return () => clearInterval(timer); + }, [isInProcess]); + + // Force re-read on every tick + void tick; + + const sessionStatus = manager.getSessionStatus(); + const sessionLabel = getSessionStatusLabel(sessionStatus); + const agents = manager.getAgentStates(); + const task = manager.getTask() ?? 
''; + + // For in-process mode, read live stats directly from AgentInteractive + const liveStats = useMemo(() => { + if (!inProcessBackend) return null; + const statsMap = new Map(); + for (const agent of agents) { + const interactive = inProcessBackend.getAgent(agent.agentId); + if (interactive) { + statsMap.set(agent.agentId, interactive.getStats()); + } + } + return statsMap; + // eslint-disable-next-line react-hooks/exhaustive-deps + }, [inProcessBackend, agents, tick]); + + const maxTaskLen = 60; + const displayTask = + task.length > maxTaskLen ? task.slice(0, maxTaskLen - 1) + '…' : task; + + const colStatus = 14; + const colTime = 8; + const colTokens = 10; + const colRounds = 8; + const colTools = 8; + + useKeypress( + (key) => { + if (key.name === 'escape' || key.name === 'q' || key.name === 'return') { + closeArenaDialog(); + } + }, + { isActive: true }, + ); + + // Inner content width: total width minus border (2) and paddingX (2*2) + const innerWidth = (width ?? 80) - 6; + + return ( + + {/* Title */} + + + Arena Status + + · + {sessionLabel.text} + + + + + {/* Task */} + + + Task: + "{displayTask}" + + + + + + {/* Table header */} + + + + Agent + + + + + Status + + + + + Time + + + + + Tokens + + + + + Rounds + + + + + Tools + + + + + {/* Separator */} + + {'─'.repeat(innerWidth)} + + + {/* Agent rows */} + {agents.map((agent) => { + const label = agent.model.modelId; + const { text: statusText, color } = getArenaStatusLabel(agent.status); + const elapsed = getElapsedMs(agent); + + // Use live stats from AgentInteractive when in-process, otherwise + // fall back to the cached ArenaAgentState.stats (file-polled). + const live = liveStats?.get(agent.agentId); + const totalTokens = live?.totalTokens ?? agent.stats.totalTokens; + const rounds = live?.rounds ?? agent.stats.rounds; + const toolCalls = live?.totalToolCalls ?? agent.stats.toolCalls; + const successfulToolCalls = + live?.successfulToolCalls ?? 
agent.stats.successfulToolCalls; + const failedToolCalls = + live?.failedToolCalls ?? agent.stats.failedToolCalls; + + return ( + + + + + {truncate(label, MAX_MODEL_NAME_LENGTH)} + + + + {statusText} + + + + {pad(formatDuration(elapsed), colTime - 1, 'right')} + + + + + {pad(totalTokens.toLocaleString(), colTokens - 1, 'right')} + + + + + {pad(String(rounds), colRounds - 1, 'right')} + + + + {failedToolCalls > 0 ? ( + + + {successfulToolCalls} + + / + {failedToolCalls} + + ) : ( + 0 ? theme.status.success : theme.text.primary + } + > + {pad(String(toolCalls), colTools - 1, 'right')} + + )} + + + + ); + })} + + {agents.length === 0 && ( + + No agents registered yet. + + )} + + ); +} diff --git a/packages/cli/src/ui/components/arena/ArenaStopDialog.tsx b/packages/cli/src/ui/components/arena/ArenaStopDialog.tsx new file mode 100644 index 0000000000..65f363793a --- /dev/null +++ b/packages/cli/src/ui/components/arena/ArenaStopDialog.tsx @@ -0,0 +1,213 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { useCallback, useMemo, useState } from 'react'; +import { Box, Text } from 'ink'; +import { + ArenaSessionStatus, + createDebugLogger, + type Config, +} from '@qwen-code/qwen-code-core'; +import { theme } from '../../semantic-colors.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import { MessageType, type HistoryItemWithoutId } from '../../types.js'; +import type { UseHistoryManagerReturn } from '../../hooks/useHistoryManager.js'; +import { DescriptiveRadioButtonSelect } from '../shared/DescriptiveRadioButtonSelect.js'; +import type { DescriptiveRadioSelectItem } from '../shared/DescriptiveRadioButtonSelect.js'; + +const debugLogger = createDebugLogger('ARENA_STOP_DIALOG'); + +type StopAction = 'cleanup' | 'preserve'; + +interface ArenaStopDialogProps { + config: Config; + addItem: UseHistoryManagerReturn['addItem']; + closeArenaDialog: () => void; +} + +export 
function ArenaStopDialog({ + config, + addItem, + closeArenaDialog, +}: ArenaStopDialogProps): React.JSX.Element { + const [isProcessing, setIsProcessing] = useState(false); + + const pushMessage = useCallback( + (result: { messageType: 'info' | 'error'; content: string }) => { + const item: HistoryItemWithoutId = { + type: + result.messageType === 'info' ? MessageType.INFO : MessageType.ERROR, + text: result.content, + }; + addItem(item, Date.now()); + + try { + const chatRecorder = config.getChatRecordingService(); + chatRecorder?.recordSlashCommand({ + phase: 'result', + rawCommand: '/arena stop', + outputHistoryItems: [{ ...item } as Record], + }); + } catch { + // Best-effort recording + } + }, + [addItem, config], + ); + + const onStop = useCallback( + async (action: StopAction) => { + if (isProcessing) return; + setIsProcessing(true); + closeArenaDialog(); + + const mgr = config.getArenaManager(); + if (!mgr) { + pushMessage({ + messageType: 'error', + content: 'No running Arena session found.', + }); + return; + } + + try { + const sessionStatus = mgr.getSessionStatus(); + if ( + sessionStatus === ArenaSessionStatus.RUNNING || + sessionStatus === ArenaSessionStatus.INITIALIZING + ) { + pushMessage({ + messageType: 'info', + content: 'Stopping Arena agents…', + }); + await mgr.cancel(); + } + await mgr.waitForSettled(); + pushMessage({ + messageType: 'info', + content: 'Cleaning up Arena resources…', + }); + + if (action === 'preserve') { + await mgr.cleanupRuntime(); + } else { + await mgr.cleanup(); + } + config.setArenaManager(null); + + if (action === 'preserve') { + pushMessage({ + messageType: 'info', + content: + 'Arena session stopped. Worktrees and session files were preserved. ' + + 'Use /arena select --discard to manually clean up later.', + }); + } else { + pushMessage({ + messageType: 'info', + content: + 'Arena session stopped. 
All Arena resources (including Git worktrees) were cleaned up.', + }); + } + } catch (error) { + const message = error instanceof Error ? error.message : String(error); + debugLogger.error('Failed to stop Arena session:', error); + pushMessage({ + messageType: 'error', + content: `Failed to stop Arena session: ${message}`, + }); + } + }, + [isProcessing, closeArenaDialog, config, pushMessage], + ); + + const configPreserve = + config.getAgentsSettings().arena?.preserveArtifacts ?? false; + + const items: Array> = useMemo( + () => [ + { + key: 'cleanup', + value: 'cleanup' as StopAction, + title: Stop and clean up, + description: ( + + Remove all worktrees and session files + + ), + }, + { + key: 'preserve', + value: 'preserve' as StopAction, + title: Stop and preserve artifacts, + description: ( + + Keep worktrees and session files for later inspection + + ), + }, + ], + [], + ); + + const defaultIndex = configPreserve ? 1 : 0; + + useKeypress( + (key) => { + if (key.name === 'escape') { + closeArenaDialog(); + } + }, + { isActive: !isProcessing }, + ); + + return ( + + + Stop Arena Session + + + + + Choose what to do with Arena artifacts: + + + + + { + onStop(action); + }} + isFocused={!isProcessing} + showNumbers={false} + /> + + + {configPreserve && ( + + + Default: preserve (agents.arena.preserveArtifacts is enabled) + + + )} + + + + Enter to confirm, Esc to cancel + + + + ); +} diff --git a/packages/cli/src/ui/components/messages/StatusMessages.tsx b/packages/cli/src/ui/components/messages/StatusMessages.tsx index e6e945bbd5..b6b026a28f 100644 --- a/packages/cli/src/ui/components/messages/StatusMessages.tsx +++ b/packages/cli/src/ui/components/messages/StatusMessages.tsx @@ -75,7 +75,7 @@ export const SuccessMessage: React.FC = ({ text }) => ( export const WarningMessage: React.FC = ({ text }) => ( diff --git a/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx b/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx index 
396ee8c3a5..8ab45c2dee 100644 --- a/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx +++ b/packages/cli/src/ui/components/shared/DescriptiveRadioButtonSelect.tsx @@ -66,7 +66,11 @@ export function DescriptiveRadioButtonSelect({ renderItem={(item, { titleColor }) => ( {item.title} - {item.description} + {typeof item.description === 'string' ? ( + {item.description} + ) : ( + item.description + )} )} /> diff --git a/packages/cli/src/ui/components/shared/MultiSelect.tsx b/packages/cli/src/ui/components/shared/MultiSelect.tsx new file mode 100644 index 0000000000..b910430ba5 --- /dev/null +++ b/packages/cli/src/ui/components/shared/MultiSelect.tsx @@ -0,0 +1,193 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type React from 'react'; +import { useCallback, useEffect, useMemo, useState } from 'react'; +import { Box, Text } from 'ink'; +import { theme } from '../../semantic-colors.js'; +import { useSelectionList } from '../../hooks/useSelectionList.js'; +import { useKeypress } from '../../hooks/useKeypress.js'; +import type { SelectionListItem } from '../../hooks/useSelectionList.js'; + +export interface MultiSelectItem extends SelectionListItem { + label: string; +} + +export interface MultiSelectProps { + items: Array>; + initialIndex?: number; + initialSelectedKeys?: string[]; + onConfirm: (selectedValues: T[]) => void; + onChange?: (selectedValues: T[]) => void; + onHighlight?: (value: T) => void; + isFocused?: boolean; + showNumbers?: boolean; + showScrollArrows?: boolean; + maxItemsToShow?: number; +} + +const EMPTY_SELECTED_KEYS: string[] = []; + +function getSelectedValues( + items: Array>, + selectedKeys: Set, +): T[] { + return items + .filter((item) => selectedKeys.has(item.key)) + .map((item) => item.value); +} + +export function MultiSelect({ + items, + initialIndex = 0, + initialSelectedKeys = EMPTY_SELECTED_KEYS, + onConfirm, + onChange, + onHighlight, + isFocused = true, + 
showNumbers = true, + showScrollArrows = false, + maxItemsToShow = 10, +}: MultiSelectProps): React.JSX.Element { + const [selectedKeys, setSelectedKeys] = useState>( + () => new Set(initialSelectedKeys), + ); + const [scrollOffset, setScrollOffset] = useState(0); + + useEffect(() => { + setSelectedKeys((prev) => { + const next = new Set(initialSelectedKeys); + if ( + prev.size === next.size && + Array.from(next).every((key) => prev.has(key)) + ) { + return prev; + } + return next; + }); + }, [initialSelectedKeys]); + + const { activeIndex } = useSelectionList({ + items, + initialIndex, + isFocused, + // Disable numeric quick-select in useSelectionList — in a multi-select + // context, onSelect triggers onConfirm (submit), so numeric keys would + // accidentally submit the dialog instead of toggling checkboxes. + // Numbers are still rendered visually via the showNumbers prop below. + showNumbers: false, + onHighlight, + onSelect: () => { + onConfirm(getSelectedValues(items, selectedKeys)); + }, + }); + + const toggleSelectionAtIndex = useCallback( + (index: number) => { + const item = items[index]; + if (!item || item.disabled) { + return; + } + + setSelectedKeys((prev) => { + const next = new Set(prev); + if (next.has(item.key)) { + next.delete(item.key); + } else { + next.add(item.key); + } + return next; + }); + }, + [items], + ); + + useEffect(() => { + onChange?.(getSelectedValues(items, selectedKeys)); + }, [items, selectedKeys, onChange]); + + useKeypress( + (key) => { + if (key.name === 'space' || key.sequence === ' ') { + toggleSelectionAtIndex(activeIndex); + } + }, + { isActive: isFocused }, + ); + + useEffect(() => { + const newScrollOffset = Math.max( + 0, + Math.min(activeIndex - maxItemsToShow + 1, items.length - maxItemsToShow), + ); + if (activeIndex < scrollOffset) { + setScrollOffset(activeIndex); + } else if (activeIndex >= scrollOffset + maxItemsToShow) { + setScrollOffset(newScrollOffset); + } + }, [activeIndex, items.length, scrollOffset, 
maxItemsToShow]); + + const visibleItems = useMemo( + () => items.slice(scrollOffset, scrollOffset + maxItemsToShow), + [items, scrollOffset, maxItemsToShow], + ); + const numberColumnWidth = String(items.length).length; + const hasMoreAbove = scrollOffset > 0; + const hasMoreBelow = scrollOffset + maxItemsToShow < items.length; + const moreAboveCount = scrollOffset; + const moreBelowCount = Math.max( + 0, + items.length - (scrollOffset + maxItemsToShow), + ); + + return ( + + {showScrollArrows && hasMoreAbove && ( + ↑ {moreAboveCount} more above + )} + + {visibleItems.map((item, index) => { + const itemIndex = scrollOffset + index; + const isActive = activeIndex === itemIndex; + const isChecked = selectedKeys.has(item.key); + + const itemNumberText = `${String(itemIndex + 1).padStart( + numberColumnWidth, + )}.`; + const checkboxText = item.disabled ? '[x]' : isChecked ? '[✓]' : '[ ]'; + + let textColor = theme.text.primary; + if (item.disabled) { + textColor = theme.text.secondary; + } else if (isActive) { + textColor = theme.status.success; + } else if (isChecked) { + textColor = theme.text.accent; + } + + return ( + + + {checkboxText} + + {showNumbers && ( + + {itemNumberText} + + )} + + {item.label} + + + ); + })} + + {showScrollArrows && hasMoreBelow && ( + ↓ {moreBelowCount} more below + )} + + ); +} diff --git a/packages/cli/src/ui/components/shared/text-buffer.ts b/packages/cli/src/ui/components/shared/text-buffer.ts index 369c7fff55..c68bd1a4b4 100644 --- a/packages/cli/src/ui/components/shared/text-buffer.ts +++ b/packages/cli/src/ui/components/shared/text-buffer.ts @@ -1907,8 +1907,8 @@ export function useTextBuffer({ else if (key.ctrl && key.name === 'b') move('left'); else if (key.name === 'right' && !key.meta && !key.ctrl) move('right'); else if (key.ctrl && key.name === 'f') move('right'); - else if (key.name === 'up') move('up'); - else if (key.name === 'down') move('down'); + else if (key.name === 'up' && !key.shift) move('up'); + else if 
(key.name === 'down' && !key.shift) move('down'); else if ((key.ctrl || key.meta) && key.name === 'left') move('wordLeft'); else if (key.meta && key.name === 'b') move('wordLeft'); else if ((key.ctrl || key.meta) && key.name === 'right') diff --git a/packages/cli/src/ui/components/subagents/runtime/AgentExecutionDisplay.tsx b/packages/cli/src/ui/components/subagents/runtime/AgentExecutionDisplay.tsx index 8f9fe2a6a6..8da7a3a246 100644 --- a/packages/cli/src/ui/components/subagents/runtime/AgentExecutionDisplay.tsx +++ b/packages/cli/src/ui/components/subagents/runtime/AgentExecutionDisplay.tsx @@ -8,7 +8,7 @@ import React, { useMemo } from 'react'; import { Box, Text } from 'ink'; import type { TaskResultDisplay, - SubagentStatsSummary, + AgentStatsSummary, Config, } from '@qwen-code/qwen-code-core'; import { theme } from '../../../semantic-colors.js'; @@ -467,7 +467,7 @@ const ExecutionSummaryDetails: React.FC<{ * Tool usage statistics component */ const ToolUsageStats: React.FC<{ - executionSummary?: SubagentStatsSummary; + executionSummary?: AgentStatsSummary; }> = ({ executionSummary }) => { if (!executionSummary) { return ( diff --git a/packages/cli/src/ui/contexts/AgentViewContext.tsx b/packages/cli/src/ui/contexts/AgentViewContext.tsx new file mode 100644 index 0000000000..b2c35e6d38 --- /dev/null +++ b/packages/cli/src/ui/contexts/AgentViewContext.tsx @@ -0,0 +1,308 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentViewContext — React context for in-process agent view switching. + * + * Tracks which view is active (main or an agent tab) and the set of registered + * AgentInteractive instances. Consumed by AgentTabBar, AgentChatView, and + * DefaultAppLayout to implement tab-based agent navigation. + * + * Kept separate from UIStateContext to avoid bloating the main state with + * in-process-only concerns and to make the feature self-contained. 
+ */ + +import { + createContext, + useContext, + useCallback, + useMemo, + useState, +} from 'react'; +import { + type AgentInteractive, + type ApprovalMode, + type Config, +} from '@qwen-code/qwen-code-core'; +import { useArenaInProcess } from '../hooks/useArenaInProcess.js'; + +// ─── Types ────────────────────────────────────────────────── + +export interface RegisteredAgent { + interactiveAgent: AgentInteractive; + /** Model identifier shown in tabs and paths (e.g. "glm-5"). */ + modelId: string; + /** Human-friendly model name (e.g. "GLM 5"). */ + modelName?: string; + color: string; +} + +export interface AgentViewState { + /** 'main' or an agentId */ + activeView: string; + /** Registered in-process agents keyed by agentId */ + agents: ReadonlyMap; + /** Whether any agent tab's embedded shell currently has input focus. */ + agentShellFocused: boolean; + /** Current text in the active agent tab's input buffer (empty when on main). */ + agentInputBufferText: string; + /** Whether the tab bar has keyboard focus (vs the agent input). */ + agentTabBarFocused: boolean; + /** Per-agent approval modes (keyed by agentId). 
*/ + agentApprovalModes: ReadonlyMap; +} + +export interface AgentViewActions { + switchToMain(): void; + switchToAgent(agentId: string): void; + switchToNext(): void; + switchToPrevious(): void; + registerAgent( + agentId: string, + interactiveAgent: AgentInteractive, + modelId: string, + color: string, + modelName?: string, + ): void; + unregisterAgent(agentId: string): void; + unregisterAll(): void; + setAgentShellFocused(focused: boolean): void; + setAgentInputBufferText(text: string): void; + setAgentTabBarFocused(focused: boolean): void; + setAgentApprovalMode(agentId: string, mode: ApprovalMode): void; +} + +// ─── Context ──────────────────────────────────────────────── + +const AgentViewStateContext = createContext(null); +const AgentViewActionsContext = createContext(null); + +// ─── Defaults (used when no provider is mounted) ──────────── + +const DEFAULT_STATE: AgentViewState = { + activeView: 'main', + agents: new Map(), + agentShellFocused: false, + agentInputBufferText: '', + agentTabBarFocused: false, + agentApprovalModes: new Map(), +}; + +const noop = () => {}; + +const DEFAULT_ACTIONS: AgentViewActions = { + switchToMain: noop, + switchToAgent: noop, + switchToNext: noop, + switchToPrevious: noop, + registerAgent: noop, + unregisterAgent: noop, + unregisterAll: noop, + setAgentShellFocused: noop, + setAgentInputBufferText: noop, + setAgentTabBarFocused: noop, + setAgentApprovalMode: noop, +}; + +// ─── Hook: useAgentViewState ──────────────────────────────── + +export function useAgentViewState(): AgentViewState { + return useContext(AgentViewStateContext) ?? DEFAULT_STATE; +} + +// ─── Hook: useAgentViewActions ────────────────────────────── + +export function useAgentViewActions(): AgentViewActions { + return useContext(AgentViewActionsContext) ?? 
DEFAULT_ACTIONS; +} + +// ─── Provider ─────────────────────────────────────────────── + +interface AgentViewProviderProps { + config?: Config; + children: React.ReactNode; +} + +export function AgentViewProvider({ + config, + children, +}: AgentViewProviderProps) { + const [activeView, setActiveView] = useState('main'); + const [agents, setAgents] = useState>( + () => new Map(), + ); + const [agentShellFocused, setAgentShellFocused] = useState(false); + const [agentInputBufferText, setAgentInputBufferText] = useState(''); + const [agentTabBarFocused, setAgentTabBarFocused] = useState(false); + const [agentApprovalModes, setAgentApprovalModes] = useState< + Map + >(() => new Map()); + + // ── Navigation ── + + const switchToMain = useCallback(() => { + setActiveView('main'); + setAgentTabBarFocused(false); + }, []); + + const switchToAgent = useCallback( + (agentId: string) => { + if (agents.has(agentId)) { + setActiveView(agentId); + } + }, + [agents], + ); + + const switchToNext = useCallback(() => { + const ids = ['main', ...agents.keys()]; + const currentIndex = ids.indexOf(activeView); + const nextIndex = (currentIndex + 1) % ids.length; + setActiveView(ids[nextIndex]!); + }, [agents, activeView]); + + const switchToPrevious = useCallback(() => { + const ids = ['main', ...agents.keys()]; + const currentIndex = ids.indexOf(activeView); + const prevIndex = (currentIndex - 1 + ids.length) % ids.length; + setActiveView(ids[prevIndex]!); + }, [agents, activeView]); + + // ── Registration ── + + const registerAgent = useCallback( + ( + agentId: string, + interactiveAgent: AgentInteractive, + modelId: string, + color: string, + modelName?: string, + ) => { + setAgents((prev) => { + const next = new Map(prev); + next.set(agentId, { + interactiveAgent, + modelId, + color, + modelName, + }); + return next; + }); + // Seed approval mode from the agent's own config + const mode = interactiveAgent.getCore().runtimeContext.getApprovalMode(); + setAgentApprovalModes((prev) 
=> { + const next = new Map(prev); + next.set(agentId, mode); + return next; + }); + }, + [], + ); + + const unregisterAgent = useCallback((agentId: string) => { + setAgents((prev) => { + if (!prev.has(agentId)) return prev; + const next = new Map(prev); + next.delete(agentId); + return next; + }); + setAgentApprovalModes((prev) => { + if (!prev.has(agentId)) return prev; + const next = new Map(prev); + next.delete(agentId); + return next; + }); + setActiveView((current) => (current === agentId ? 'main' : current)); + }, []); + + const unregisterAll = useCallback(() => { + setAgents(new Map()); + setAgentApprovalModes(new Map()); + setActiveView('main'); + setAgentTabBarFocused(false); + }, []); + + const setAgentApprovalMode = useCallback( + (agentId: string, mode: ApprovalMode) => { + // Update the agent's runtime config so tool scheduling picks it up + const agent = agents.get(agentId); + if (agent) { + agent.interactiveAgent.getCore().runtimeContext.setApprovalMode(mode); + } + // Update UI state + setAgentApprovalModes((prev) => { + const next = new Map(prev); + next.set(agentId, mode); + return next; + }); + }, + [agents], + ); + + // ── Memoized values ── + + const state: AgentViewState = useMemo( + () => ({ + activeView, + agents, + agentShellFocused, + agentInputBufferText, + agentTabBarFocused, + agentApprovalModes, + }), + [ + activeView, + agents, + agentShellFocused, + agentInputBufferText, + agentTabBarFocused, + agentApprovalModes, + ], + ); + + const actions: AgentViewActions = useMemo( + () => ({ + switchToMain, + switchToAgent, + switchToNext, + switchToPrevious, + registerAgent, + unregisterAgent, + unregisterAll, + setAgentShellFocused, + setAgentInputBufferText, + setAgentTabBarFocused, + setAgentApprovalMode, + }), + [ + switchToMain, + switchToAgent, + switchToNext, + switchToPrevious, + registerAgent, + unregisterAgent, + unregisterAll, + setAgentShellFocused, + setAgentInputBufferText, + setAgentTabBarFocused, + setAgentApprovalMode, + ], + 
); + + // ── Arena in-process bridge ── + // Bridge arena manager events to agent registration. The hook is kept + // in its own file for separation of concerns; it's called here so the + // provider is the single owner of agent tab lifecycle. + useArenaInProcess(config ?? null, actions); + + return ( + + + {children} + + + ); +} diff --git a/packages/cli/src/ui/contexts/UIActionsContext.tsx b/packages/cli/src/ui/contexts/UIActionsContext.tsx index 19464cccc5..7dccb8f459 100644 --- a/packages/cli/src/ui/contexts/UIActionsContext.tsx +++ b/packages/cli/src/ui/contexts/UIActionsContext.tsx @@ -17,6 +17,7 @@ import { import { type SettingScope } from '../../config/settings.js'; import { type CodingPlanRegion } from '../../constants/codingPlan.js'; import type { AuthState } from '../types.js'; +import { type ArenaDialogType } from '../hooks/useArenaCommand.js'; // OpenAICredentials type (previously imported from OpenAIKeyPrompt) export interface OpenAICredentials { apiKey: string; @@ -54,6 +55,9 @@ export interface UIActions { exitEditorDialog: () => void; closeSettingsDialog: () => void; closeModelDialog: () => void; + openArenaDialog: (type: Exclude) => void; + closeArenaDialog: () => void; + handleArenaModelsSelected?: (models: string[]) => void; dismissCodingPlanUpdate: () => void; closePermissionsDialog: () => void; setShellModeActive: (value: boolean) => void; diff --git a/packages/cli/src/ui/contexts/UIStateContext.tsx b/packages/cli/src/ui/contexts/UIStateContext.tsx index 0d461e70ca..e2a2897701 100644 --- a/packages/cli/src/ui/contexts/UIStateContext.tsx +++ b/packages/cli/src/ui/contexts/UIStateContext.tsx @@ -33,6 +33,7 @@ import type { UpdateObject } from '../utils/updateCheck.js'; import { type UseHistoryManagerReturn } from '../hooks/useHistoryManager.js'; import { type RestartReason } from '../hooks/useIdeTrustListener.js'; import { type CodingPlanUpdateRequest } from '../hooks/useCodingPlanUpdates.js'; +import { type ArenaDialogType } from 
'../hooks/useArenaCommand.js'; export interface UIState { history: HistoryItem[]; @@ -52,6 +53,7 @@ export interface UIState { quittingMessages: HistoryItem[] | null; isSettingsDialogOpen: boolean; isModelDialogOpen: boolean; + activeArenaDialog: ArenaDialogType; isPermissionsDialogOpen: boolean; isApprovalModeDialogOpen: boolean; isResumeDialogOpen: boolean; diff --git a/packages/cli/src/ui/hooks/slashCommandProcessor.ts b/packages/cli/src/ui/hooks/slashCommandProcessor.ts index 82cd52060c..4fd3e92a02 100644 --- a/packages/cli/src/ui/hooks/slashCommandProcessor.ts +++ b/packages/cli/src/ui/hooks/slashCommandProcessor.ts @@ -7,6 +7,7 @@ import { useCallback, useMemo, useEffect, useRef, useState } from 'react'; import { type PartListUnion } from '@google/genai'; import type { UseHistoryManagerReturn } from './useHistoryManager.js'; +import type { ArenaDialogType } from './useArenaCommand.js'; import { type Logger, type Config, @@ -66,6 +67,7 @@ const SLASH_COMMANDS_SKIP_RECORDING = new Set([ interface SlashCommandProcessorActions { openAuthDialog: () => void; + openArenaDialog?: (type: Exclude<ArenaDialogType, null>) => void; openThemeDialog: () => void; openEditorDialog: () => void; openSettingsDialog: () => void; @@ -456,6 +458,18 @@ export const useSlashCommandProcessor = ( return { type: 'handled' }; case 'dialog': switch (result.dialog) { + case 'arena_start': + actions.openArenaDialog?.('start'); + return { type: 'handled' }; + case 'arena_select': + actions.openArenaDialog?.('select'); + return { type: 'handled' }; + case 'arena_stop': + actions.openArenaDialog?.('stop'); + return { type: 'handled' }; + case 'arena_status': + actions.openArenaDialog?.('status'); + return { type: 'handled' }; case 'auth': actions.openAuthDialog(); return { type: 'handled' }; diff --git a/packages/cli/src/ui/hooks/useAgentStreamingState.ts b/packages/cli/src/ui/hooks/useAgentStreamingState.ts new file mode 100644 index 0000000000..881f715b2c --- /dev/null +++ 
b/packages/cli/src/ui/hooks/useAgentStreamingState.ts @@ -0,0 +1,166 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Hook that subscribes to an AgentInteractive's events and + * derives streaming state, elapsed time, input-active flag, and status. + * + * Extracts the common reactivity + derived-state pattern shared by + * AgentComposer and AgentChatView so each component only deals with + * layout and interaction. + */ + +import { useState, useEffect, useCallback, useMemo, useRef } from 'react'; +import { + AgentStatus, + AgentEventType, + isTerminalStatus, + type AgentInteractive, + type AgentEventEmitter, +} from '@qwen-code/qwen-code-core'; +import { StreamingState } from '../types.js'; +import { useTimer } from './useTimer.js'; + +// ─── Types ────────────────────────────────────────────────── + +export interface AgentStreamingInfo { + /** The agent's current lifecycle status. */ + status: AgentStatus | undefined; + /** Derived streaming state for StreamingContext / LoadingIndicator. */ + streamingState: StreamingState; + /** Whether the agent can accept user input right now. */ + isInputActive: boolean; + /** Seconds elapsed while in Responding state (resets each cycle). */ + elapsedTime: number; + /** Prompt token count from the most recent round (for context usage). */ + lastPromptTokenCount: number; +} + +// ─── Hook ─────────────────────────────────────────────────── + +/** + * Subscribe to an AgentInteractive's events and derive UI streaming state. + * + * @param interactiveAgent - The agent instance, or undefined if not yet registered. + * @param events - Which event types trigger a re-render. Defaults to + * STATUS_CHANGE, TOOL_WAITING_APPROVAL, and TOOL_RESULT — sufficient for + * composer / footer use. Callers like AgentChatView can pass a broader set + * (e.g. include TOOL_CALL, ROUND_END, TOOL_OUTPUT_UPDATE) for richer updates. 
+ */ +export function useAgentStreamingState( + interactiveAgent: AgentInteractive | undefined, + events?: ReadonlyArray<(typeof AgentEventType)[keyof typeof AgentEventType]>, +): AgentStreamingInfo { + // ── Force-render on agent events ── + + const [, setTick] = useState(0); + const tickRef = useRef(0); + const forceRender = useCallback(() => { + tickRef.current += 1; + setTick(tickRef.current); + }, []); + + // ── Track last prompt token count from USAGE_METADATA events ── + + const [lastPromptTokenCount, setLastPromptTokenCount] = useState( + () => interactiveAgent?.getLastPromptTokenCount() ?? 0, + ); + + const subscribedEvents = events ?? DEFAULT_EVENTS; + + useEffect(() => { + if (!interactiveAgent) return; + const emitter: AgentEventEmitter | undefined = + interactiveAgent.getEventEmitter(); + if (!emitter) return; + + const handler = () => forceRender(); + for (const evt of subscribedEvents) { + emitter.on(evt, handler); + } + + // Dedicated listener for usage metadata — updates React state directly + // so the token count is available immediately (even if no other event + // triggers a re-render). Prefers totalTokenCount (prompt + output) + // because output becomes history for the next round, matching + // geminiChat.ts. + const usageHandler = (event: { + usage?: { totalTokenCount?: number; promptTokenCount?: number }; + }) => { + const count = + event?.usage?.totalTokenCount ?? 
event?.usage?.promptTokenCount; + if (typeof count === 'number' && count > 0) { + setLastPromptTokenCount(count); + } + }; + emitter.on(AgentEventType.USAGE_METADATA, usageHandler); + + return () => { + for (const evt of subscribedEvents) { + emitter.off(evt, handler); + } + emitter.off(AgentEventType.USAGE_METADATA, usageHandler); + }; + }, [interactiveAgent, forceRender, subscribedEvents]); + + // ── Derived state ── + + const status = interactiveAgent?.getStatus(); + const pendingApprovals = interactiveAgent?.getPendingApprovals(); + const hasPendingApprovals = + pendingApprovals !== undefined && pendingApprovals.size > 0; + + const streamingState = useMemo(() => { + if (hasPendingApprovals) { + return StreamingState.WaitingForConfirmation; + } + if (status === AgentStatus.RUNNING || status === AgentStatus.INITIALIZING) { + return StreamingState.Responding; + } + return StreamingState.Idle; + }, [status, hasPendingApprovals]); + + const isInputActive = + (streamingState === StreamingState.Idle || + streamingState === StreamingState.Responding) && + status !== undefined && + !isTerminalStatus(status); + + // ── Timer (resets each time we enter Responding) ── + + const [timerResetKey, setTimerResetKey] = useState(0); + const prevStreamingRef = useRef(streamingState); + useEffect(() => { + if ( + streamingState === StreamingState.Responding && + prevStreamingRef.current !== StreamingState.Responding + ) { + setTimerResetKey((k) => k + 1); + } + prevStreamingRef.current = streamingState; + }, [streamingState]); + + const elapsedTime = useTimer( + streamingState === StreamingState.Responding, + timerResetKey, + ); + + return { + status, + streamingState, + isInputActive, + elapsedTime, + lastPromptTokenCount, + }; +} + +// ─── Defaults ─────────────────────────────────────────────── + +const DEFAULT_EVENTS = [ + AgentEventType.STATUS_CHANGE, + AgentEventType.TOOL_WAITING_APPROVAL, + AgentEventType.TOOL_RESULT, +] as const; diff --git 
a/packages/cli/src/ui/hooks/useArenaCommand.ts b/packages/cli/src/ui/hooks/useArenaCommand.ts new file mode 100644 index 0000000000..0392a0f1f0 --- /dev/null +++ b/packages/cli/src/ui/hooks/useArenaCommand.ts @@ -0,0 +1,37 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { useCallback, useState } from 'react'; + +export type ArenaDialogType = 'start' | 'select' | 'stop' | 'status' | null; + +interface UseArenaCommandReturn { + activeArenaDialog: ArenaDialogType; + openArenaDialog: (type: Exclude<ArenaDialogType, null>) => void; + closeArenaDialog: () => void; +} + +export function useArenaCommand(): UseArenaCommandReturn { + const [activeArenaDialog, setActiveArenaDialog] = + useState<ArenaDialogType>(null); + + const openArenaDialog = useCallback( + (type: Exclude<ArenaDialogType, null>) => { + setActiveArenaDialog(type); + }, + [], + ); + + const closeArenaDialog = useCallback(() => { + setActiveArenaDialog(null); + }, []); + + return { + activeArenaDialog, + openArenaDialog, + closeArenaDialog, + }; +} diff --git a/packages/cli/src/ui/hooks/useArenaInProcess.ts b/packages/cli/src/ui/hooks/useArenaInProcess.ts new file mode 100644 index 0000000000..c75634a2a9 --- /dev/null +++ b/packages/cli/src/ui/hooks/useArenaInProcess.ts @@ -0,0 +1,177 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview useArenaInProcess — bridges ArenaManager in-process events + * to AgentViewContext agent registration. + * + * Subscribes to `config.onArenaManagerChange()` to react immediately when + * the arena manager is set or cleared. Event listeners are attached to the + * manager's emitter as soon as it appears — the backend is resolved lazily + * inside the AGENT_START handler, which only fires after the backend is + * initialized. 
+ */ + +import { useEffect, useRef } from 'react'; +import { + ArenaEventType, + ArenaSessionStatus, + DISPLAY_MODE, + type ArenaAgentStartEvent, + type ArenaManager, + type ArenaSessionCompleteEvent, + type Config, + type InProcessBackend, +} from '@qwen-code/qwen-code-core'; +import type { AgentViewActions } from '../contexts/AgentViewContext.js'; +import { theme } from '../semantic-colors.js'; + +const AGENT_COLORS = [ + theme.text.accent, + theme.text.link, + theme.status.success, + theme.status.warning, + theme.text.code, + theme.status.error, +]; + +/** + * Bridge arena in-process events to agent tab registration/unregistration. + * + * Called by AgentViewProvider — accepts config and actions directly so the + * hook has no dependency on AgentViewContext (avoiding a circular import). + */ +export function useArenaInProcess( + config: Config | null, + actions: AgentViewActions, +): void { + const actionsRef = useRef(actions); + actionsRef.current = actions; + + useEffect(() => { + if (!config) return; + + let detachArenaListeners: (() => void) | null = null; + const retryTimeouts = new Set<ReturnType<typeof setTimeout>>(); + + /** Remove agent tabs, cancel pending retries, and detach arena events. */ + const detachSession = () => { + actionsRef.current.unregisterAll(); + for (const t of retryTimeouts) clearTimeout(t); + retryTimeouts.clear(); + detachArenaListeners?.(); + detachArenaListeners = null; + }; + + /** Attach to an arena manager's event emitter. The backend is resolved + * lazily — we only need it when registering agents, not at subscribe + * time. This avoids the race where setArenaManager fires before + * manager.start() initializes the backend. */ + const attachSession = (manager: ArenaManager) => { + const emitter = manager.getEventEmitter(); + let colorIndex = 0; + + const nextColor = () => AGENT_COLORS[colorIndex++ % AGENT_COLORS.length]!; + + /** Resolve the InProcessBackend, or null if not applicable. 
*/ + const getInProcessBackend = (): InProcessBackend | null => { + const backend = manager.getBackend(); + if (!backend || backend.type !== DISPLAY_MODE.IN_PROCESS) return null; + return backend as InProcessBackend; + }; + + // Register agents that already started (events may have fired before + // the callback was attached). + const inProcessBackend = getInProcessBackend(); + if (inProcessBackend) { + for (const agentState of manager.getAgentStates()) { + const interactive = inProcessBackend.getAgent(agentState.agentId); + if (interactive) { + actionsRef.current.registerAgent( + agentState.agentId, + interactive, + agentState.model.modelId, + nextColor(), + agentState.model.displayName, + ); + } + } + } + + // AGENT_START fires *before* backend.spawnAgent() creates the + // AgentInteractive, so getAgent() may return undefined. Retry briefly. + const MAX_RETRIES = 20; + const RETRY_MS = 50; + + const onAgentStart = (event: ArenaAgentStartEvent) => { + const tryRegister = (retriesLeft: number) => { + const backend = getInProcessBackend(); + if (!backend) return; // not an in-process session + + const interactive = backend.getAgent(event.agentId); + if (interactive) { + actionsRef.current.registerAgent( + event.agentId, + interactive, + event.model.modelId, + nextColor(), + event.model.displayName, + ); + return; + } + if (retriesLeft > 0) { + const timeout = setTimeout(() => { + retryTimeouts.delete(timeout); + tryRegister(retriesLeft - 1); + }, RETRY_MS); + retryTimeouts.add(timeout); + } + }; + tryRegister(MAX_RETRIES); + }; + + const onSessionComplete = (event: ArenaSessionCompleteEvent) => { + // IDLE means agents finished but the session is still alive for + // follow-up interaction — keep the tab bar. 
+ if (event.result.status === ArenaSessionStatus.IDLE) return; + detachSession(); + }; + + const onSessionError = () => detachSession(); + + emitter.on(ArenaEventType.AGENT_START, onAgentStart); + emitter.on(ArenaEventType.SESSION_COMPLETE, onSessionComplete); + emitter.on(ArenaEventType.SESSION_ERROR, onSessionError); + + detachArenaListeners = () => { + emitter.off(ArenaEventType.AGENT_START, onAgentStart); + emitter.off(ArenaEventType.SESSION_COMPLETE, onSessionComplete); + emitter.off(ArenaEventType.SESSION_ERROR, onSessionError); + }; + }; + + const handleManagerChange = (manager: ArenaManager | null) => { + detachSession(); + if (manager) { + attachSession(manager); + } + }; + + // Subscribe to future changes. + config.onArenaManagerChange(handleManagerChange); + + // Handle the case where a manager already exists when we mount. + const current = config.getArenaManager(); + if (current) { + attachSession(current); + } + + return () => { + config.onArenaManagerChange(null); + detachSession(); + }; + }, [config]); +} diff --git a/packages/cli/src/ui/hooks/useAutoAcceptIndicator.ts b/packages/cli/src/ui/hooks/useAutoAcceptIndicator.ts index 3135a362b4..3d075f8a65 100644 --- a/packages/cli/src/ui/hooks/useAutoAcceptIndicator.ts +++ b/packages/cli/src/ui/hooks/useAutoAcceptIndicator.ts @@ -19,6 +19,8 @@ export interface UseAutoAcceptIndicatorArgs { addItem?: (item: HistoryItemWithoutId, timestamp: number) => void; onApprovalModeChange?: (mode: ApprovalMode) => void; shouldBlockTab?: () => boolean; + /** When true, the keyboard handler is disabled (e.g. agent tab is active). 
*/ + disabled?: boolean; } export function useAutoAcceptIndicator({ @@ -26,6 +28,7 @@ export function useAutoAcceptIndicator({ addItem, onApprovalModeChange, shouldBlockTab, + disabled, }: UseAutoAcceptIndicatorArgs): ApprovalMode { const currentConfigValue = config.getApprovalMode(); const [showAutoAcceptIndicator, setShowAutoAcceptIndicator] = @@ -78,7 +81,7 @@ export function useAutoAcceptIndicator({ } } }, - { isActive: true }, + { isActive: !disabled }, ); return showAutoAcceptIndicator; diff --git a/packages/cli/src/ui/hooks/useDialogClose.ts b/packages/cli/src/ui/hooks/useDialogClose.ts index d71a211901..119d1c96cd 100644 --- a/packages/cli/src/ui/hooks/useDialogClose.ts +++ b/packages/cli/src/ui/hooks/useDialogClose.ts @@ -7,6 +7,7 @@ import { useCallback } from 'react'; import { SettingScope } from '../../config/settings.js'; import type { AuthType, ApprovalMode } from '@qwen-code/qwen-code-core'; +import type { ArenaDialogType } from './useArenaCommand.js'; // OpenAICredentials type (previously imported from OpenAIKeyPrompt) interface OpenAICredentials { apiKey: string; @@ -42,6 +43,10 @@ export interface DialogCloseOptions { isSettingsDialogOpen: boolean; closeSettingsDialog: () => void; + // Arena dialogs + activeArenaDialog: ArenaDialogType; + closeArenaDialog: () => void; + // Folder trust dialog isFolderTrustDialogOpen: boolean; @@ -83,6 +88,11 @@ export function useDialogClose(options: DialogCloseOptions) { return true; } + if (options.activeArenaDialog !== null) { + options.closeArenaDialog(); + return true; + } + if (options.isFolderTrustDialogOpen) { // FolderTrustDialog doesn't expose close function, but ESC would prevent exit // We follow the same pattern - prevent exit behavior diff --git a/packages/cli/src/ui/hooks/useGeminiStream.test.tsx b/packages/cli/src/ui/hooks/useGeminiStream.test.tsx index 33680358ec..4330ba7a53 100644 --- a/packages/cli/src/ui/hooks/useGeminiStream.test.tsx +++ b/packages/cli/src/ui/hooks/useGeminiStream.test.tsx @@ 
-203,6 +203,7 @@ describe('useGeminiStream', () => { .fn() .mockReturnValue(contentGeneratorConfig), getMaxSessionTurns: vi.fn(() => 50), + getArenaAgentClient: vi.fn(() => null), } as unknown as Config; mockOnDebugMessage = vi.fn(); mockHandleSlashCommand = vi.fn().mockResolvedValue(false); diff --git a/packages/cli/src/ui/hooks/useGeminiStream.ts b/packages/cli/src/ui/hooks/useGeminiStream.ts index 75a1c53649..108b2fe839 100644 --- a/packages/cli/src/ui/hooks/useGeminiStream.ts +++ b/packages/cli/src/ui/hooks/useGeminiStream.ts @@ -430,6 +430,12 @@ export const useGeminiStream = ( isSubmittingQueryRef.current = false; abortControllerRef.current?.abort(); + // Report cancellation to arena status reporter (if in arena mode). + // This is needed because cancellation during tool execution won't + // flow through sendMessageStream where the inline reportCancelled() + // lives — tools get cancelled and handleCompletedTools returns early. + config.getArenaAgentClient()?.reportCancelled(); + // Log API cancellation const prompt_id = config.getSessionId() + '########' + getPromptCount(); const cancellationEvent = new ApiCancelEvent( @@ -1433,6 +1439,9 @@ export const useGeminiStream = ( role: 'user', parts: combinedParts, }); + + // Report cancellation to arena (safety net — cancelOngoingRequest + config.getArenaAgentClient()?.reportCancelled(); } const callIdsToMarkAsSubmitted = geminiTools.map( @@ -1469,6 +1478,7 @@ export const useGeminiStream = ( geminiClient, performMemoryRefresh, modelSwitchedFromQuotaError, + config, ], ); diff --git a/packages/cli/src/ui/hooks/useInputHistory.ts b/packages/cli/src/ui/hooks/useInputHistory.ts index 58fc9d4a6c..65e0256a5b 100644 --- a/packages/cli/src/ui/hooks/useInputHistory.ts +++ b/packages/cli/src/ui/hooks/useInputHistory.ts @@ -18,6 +18,7 @@ export interface UseInputHistoryReturn { handleSubmit: (value: string) => void; navigateUp: () => boolean; navigateDown: () => boolean; + resetHistoryNav: () => void; } export function 
useInputHistory({ @@ -107,5 +108,6 @@ export function useInputHistory({ handleSubmit, navigateUp, navigateDown, + resetHistoryNav, }; } diff --git a/packages/cli/src/ui/hooks/useSelectionList.test.ts b/packages/cli/src/ui/hooks/useSelectionList.test.ts index 8383d89c9c..e488fe1756 100644 --- a/packages/cli/src/ui/hooks/useSelectionList.test.ts +++ b/packages/cli/src/ui/hooks/useSelectionList.test.ts @@ -5,6 +5,7 @@ */ import { describe, it, expect, vi, beforeEach, afterEach } from 'vitest'; +import { useEffect, useState } from 'react'; import { renderHook, act } from '@testing-library/react'; import { useSelectionList, @@ -915,6 +916,37 @@ describe('useSelectionList', () => { expect(result.current.activeIndex).toBe(2); }); + + it('should handle equivalent items regenerated on each render', () => { + const { result } = renderHook(() => { + const [tick, setTick] = useState(0); + const regeneratedItems = [ + { value: 'A', key: 'A' }, + { value: 'B', disabled: true, key: 'B' }, + { value: 'C', key: 'C' }, + ]; + + const selection = useSelectionList({ + items: regeneratedItems, + onSelect: mockOnSelect, + initialIndex: 0, + }); + + useEffect(() => { + if (tick === 0) { + setTick(1); + } + }, [tick]); + + return { + tick, + activeIndex: selection.activeIndex, + }; + }); + + expect(result.current.tick).toBe(1); + expect(result.current.activeIndex).toBe(0); + }); }); describe('Manual Control', () => { diff --git a/packages/cli/src/ui/hooks/useSelectionList.ts b/packages/cli/src/ui/hooks/useSelectionList.ts index c09aec8027..81045a5bf0 100644 --- a/packages/cli/src/ui/hooks/useSelectionList.ts +++ b/packages/cli/src/ui/hooks/useSelectionList.ts @@ -133,6 +133,27 @@ const computeInitialIndex = ( return targetIndex; }; +const areItemsStructurallyEqual = ( + a: Array>, + b: Array>, +): boolean => { + if (a === b) { + return true; + } + + if (a.length !== b.length) { + return false; + } + + for (let i = 0; i < a.length; i++) { + if (a[i]?.key !== b[i]?.key || a[i]?.disabled !== 
b[i]?.disabled) { + return false; + } + } + + return true; +}; + function selectionListReducer( state: SelectionListState, action: SelectionListAction, @@ -176,22 +197,30 @@ function selectionListReducer( case 'INITIALIZE': { const { initialIndex, items } = action.payload; + const initialIndexChanged = initialIndex !== state.initialIndex; const activeKey = - initialIndex === state.initialIndex && - state.activeIndex !== state.initialIndex + !initialIndexChanged && state.activeIndex !== state.initialIndex ? state.items[state.activeIndex]?.key : undefined; - - if (items === state.items && initialIndex === state.initialIndex) { + const targetIndex = computeInitialIndex(initialIndex, items, activeKey); + const itemsStructurallyEqual = areItemsStructurallyEqual( + items, + state.items, + ); + + if ( + !initialIndexChanged && + targetIndex === state.activeIndex && + itemsStructurallyEqual + ) { return state; } - const targetIndex = computeInitialIndex(initialIndex, items, activeKey); - return { ...state, - items, + items: itemsStructurallyEqual ? 
state.items : items, activeIndex: targetIndex, + initialIndex, pendingHighlight: false, }; } diff --git a/packages/cli/src/ui/layouts/DefaultAppLayout.tsx b/packages/cli/src/ui/layouts/DefaultAppLayout.tsx index 93ad311c67..ddb3f2df00 100644 --- a/packages/cli/src/ui/layouts/DefaultAppLayout.tsx +++ b/packages/cli/src/ui/layouts/DefaultAppLayout.tsx @@ -5,36 +5,77 @@ */ import type React from 'react'; +import { useEffect, useRef } from 'react'; import { Box } from 'ink'; import { MainContent } from '../components/MainContent.js'; import { DialogManager } from '../components/DialogManager.js'; import { Composer } from '../components/Composer.js'; import { ExitWarning } from '../components/ExitWarning.js'; +import { AgentTabBar } from '../components/agent-view/AgentTabBar.js'; +import { AgentChatView } from '../components/agent-view/AgentChatView.js'; +import { AgentComposer } from '../components/agent-view/AgentComposer.js'; import { useUIState } from '../contexts/UIStateContext.js'; +import { useUIActions } from '../contexts/UIActionsContext.js'; +import { useAgentViewState } from '../contexts/AgentViewContext.js'; import { useTerminalSize } from '../hooks/useTerminalSize.js'; export const DefaultAppLayout: React.FC = () => { const uiState = useUIState(); + const { refreshStatic } = useUIActions(); + const { activeView, agents } = useAgentViewState(); const { columns: terminalWidth } = useTerminalSize(); + const hasAgents = agents.size > 0; + const isAgentTab = activeView !== 'main' && agents.has(activeView); + + // Clear terminal on view switch so previous view's output + // is removed. refreshStatic clears the terminal and bumps the + // historyRemountKey so MainContent's re-renders all items + // when switching back. + const prevViewRef = useRef(activeView); + useEffect(() => { + if (prevViewRef.current !== activeView) { + prevViewRef.current = activeView; + refreshStatic(); + } + }, [activeView, refreshStatic]); return ( - - - - {uiState.dialogsVisible ? 
( - - + {isAgentTab ? ( + <> + {/* Agent view: chat history + agent-specific composer */} + + + + + + + ) : ( + <> + {/* Main view: conversation history + main composer / dialogs */} + + + {uiState.dialogsVisible ? ( + + + + ) : ( + + )} + - ) : ( - - )} + + )} - - + {/* Tab bar: visible whenever in-process agents exist and input is active */} + {hasAgents && !uiState.dialogsVisible && } ); }; diff --git a/packages/cli/src/ui/types.ts b/packages/cli/src/ui/types.ts index 8f4c41f6d3..95cf9888ff 100644 --- a/packages/cli/src/ui/types.ts +++ b/packages/cli/src/ui/types.ts @@ -11,6 +11,7 @@ import type { ToolCallConfirmationDetails, ToolConfirmationOutcome, ToolResultDisplay, + AgentStatus, } from '@qwen-code/qwen-code-core'; import type { PartListUnion } from '@google/genai'; import { type ReactNode } from 'react'; @@ -128,6 +129,11 @@ export type HistoryItemWarning = HistoryItemBase & { text: string; }; +export type HistoryItemSuccess = HistoryItemBase & { + type: 'success'; + text: string; +}; + export type HistoryItemRetryCountdown = HistoryItemBase & { type: 'retry_countdown'; text: string; @@ -256,6 +262,40 @@ export type HistoryItemMcpStatus = HistoryItemBase & { showTips: boolean; }; +/** + * Arena agent completion card data. + */ +export interface ArenaAgentCardData { + label: string; + status: AgentStatus; + durationMs: number; + totalTokens: number; + inputTokens: number; + outputTokens: number; + toolCalls: number; + successfulToolCalls: number; + failedToolCalls: number; + rounds: number; + error?: string; + diff?: string; +} + +export type HistoryItemArenaAgentComplete = HistoryItemBase & { + type: 'arena_agent_complete'; + agent: ArenaAgentCardData; +}; + +export type HistoryItemArenaSessionComplete = HistoryItemBase & { + type: 'arena_session_complete'; + sessionStatus: string; + task: string; + totalDurationMs: number; + agents: ArenaAgentCardData[]; +}; + +/** + * Insight progress message. 
+ */ export type HistoryItemInsightProgress = HistoryItemBase & { type: 'insight_progress'; progress: InsightProgressProps; @@ -275,6 +315,7 @@ export type HistoryItemWithoutId = | HistoryItemInfo | HistoryItemError | HistoryItemWarning + | HistoryItemSuccess | HistoryItemRetryCountdown | HistoryItemAbout | HistoryItemHelp @@ -290,6 +331,8 @@ export type HistoryItemWithoutId = | HistoryItemToolsList | HistoryItemSkillsList | HistoryItemMcpStatus + | HistoryItemArenaAgentComplete + | HistoryItemArenaSessionComplete | HistoryItemInsightProgress; export type HistoryItem = HistoryItemWithoutId & { id: number }; @@ -297,6 +340,7 @@ export type HistoryItem = HistoryItemWithoutId & { id: number }; // Message types used by internal command feedback (subset of HistoryItem types) export enum MessageType { INFO = 'info', + SUCCESS = 'success', ERROR = 'error', WARNING = 'warning', USER = 'user', @@ -313,6 +357,8 @@ export enum MessageType { TOOLS_LIST = 'tools_list', SKILLS_LIST = 'skills_list', MCP_STATUS = 'mcp_status', + ARENA_AGENT_COMPLETE = 'arena_agent_complete', + ARENA_SESSION_COMPLETE = 'arena_session_complete', INSIGHT_PROGRESS = 'insight_progress', } diff --git a/packages/cli/src/ui/utils/InlineMarkdownRenderer.tsx b/packages/cli/src/ui/utils/InlineMarkdownRenderer.tsx index ce31078d1c..2403db96f5 100644 --- a/packages/cli/src/ui/utils/InlineMarkdownRenderer.tsx +++ b/packages/cli/src/ui/utils/InlineMarkdownRenderer.tsx @@ -103,7 +103,7 @@ const RenderInlineInternal: React.FC = ({ const codeMatch = fullMatch.match(/^(`+)(.+?)\1$/s); if (codeMatch && codeMatch[2]) { renderedNode = ( - + {codeMatch[2]} ); diff --git a/packages/cli/src/ui/utils/displayUtils.ts b/packages/cli/src/ui/utils/displayUtils.ts index b8f603170e..4f8fabb169 100644 --- a/packages/cli/src/ui/utils/displayUtils.ts +++ b/packages/cli/src/ui/utils/displayUtils.ts @@ -5,6 +5,34 @@ */ import { theme } from '../semantic-colors.js'; +import { AgentStatus } from '@qwen-code/qwen-code-core'; + +// --- 
Status Labels --- + +export interface StatusLabel { + icon: string; + text: string; + color: string; +} + +export function getArenaStatusLabel(status: AgentStatus): StatusLabel { + switch (status) { + case AgentStatus.IDLE: + return { icon: '✓', text: 'Idle', color: theme.status.success }; + case AgentStatus.COMPLETED: + return { icon: '✓', text: 'Done', color: theme.status.success }; + case AgentStatus.CANCELLED: + return { icon: '⊘', text: 'Cancelled', color: theme.status.warning }; + case AgentStatus.FAILED: + return { icon: '✗', text: 'Failed', color: theme.status.error }; + case AgentStatus.RUNNING: + return { icon: '○', text: 'Running', color: theme.text.secondary }; + case AgentStatus.INITIALIZING: + return { icon: '○', text: 'Initializing', color: theme.text.secondary }; + default: + return { icon: '○', text: status, color: theme.text.secondary }; + } +} // --- Thresholds --- export const TOOL_SUCCESS_RATE_HIGH = 95; diff --git a/packages/cli/src/ui/utils/layoutUtils.ts b/packages/cli/src/ui/utils/layoutUtils.ts new file mode 100644 index 0000000000..208babcfc8 --- /dev/null +++ b/packages/cli/src/ui/utils/layoutUtils.ts @@ -0,0 +1,40 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Shared layout calculation utilities for the terminal UI. + */ + +/** + * Calculate the widths for the input prompt area based on terminal width. + * + * Returns the content width (for the text buffer), the total container width + * (including border + padding + prefix), the suggestions dropdown width, + * and the frame overhead constant. + */ +export const calculatePromptWidths = (terminalWidth: number) => { + const widthFraction = 0.9; + const FRAME_PADDING_AND_BORDER = 4; // Border (2) + padding (2) + const PROMPT_PREFIX_WIDTH = 2; // '> ' or '! 
' + const MIN_CONTENT_WIDTH = 2; + + const innerContentWidth = + Math.floor(terminalWidth * widthFraction) - + FRAME_PADDING_AND_BORDER - + PROMPT_PREFIX_WIDTH; + + const inputWidth = Math.max(MIN_CONTENT_WIDTH, innerContentWidth); + const FRAME_OVERHEAD = FRAME_PADDING_AND_BORDER + PROMPT_PREFIX_WIDTH; + const containerWidth = inputWidth + FRAME_OVERHEAD; + const suggestionsWidth = Math.max(20, Math.floor(terminalWidth * 1.0)); + + return { + inputWidth, + containerWidth, + suggestionsWidth, + frameOverhead: FRAME_OVERHEAD, + } as const; +}; diff --git a/packages/core/src/agents/arena/ArenaAgentClient.test.ts b/packages/core/src/agents/arena/ArenaAgentClient.test.ts new file mode 100644 index 0000000000..6ab61039c1 --- /dev/null +++ b/packages/core/src/agents/arena/ArenaAgentClient.test.ts @@ -0,0 +1,568 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import * as os from 'node:os'; +import { ArenaAgentClient } from './ArenaAgentClient.js'; +import { safeAgentId } from './types.js'; +import type { ArenaControlSignal } from './types.js'; +import { uiTelemetryService } from '../../telemetry/uiTelemetry.js'; +import type { SessionMetrics } from '../../telemetry/uiTelemetry.js'; +import { ToolCallDecision } from '../../telemetry/tool-call-decision.js'; + +const createMockMetrics = ( + overrides: Partial<{ + totalRequests: number; + totalTokens: number; + promptTokens: number; + candidatesTokens: number; + totalLatencyMs: number; + totalCalls: number; + totalSuccess: number; + totalFail: number; + }> = {}, +): SessionMetrics => ({ + models: { + 'test-model': { + api: { + totalRequests: overrides.totalRequests ?? 0, + totalErrors: 0, + totalLatencyMs: overrides.totalLatencyMs ?? 0, + }, + tokens: { + prompt: overrides.promptTokens ?? 
0, + candidates: overrides.candidatesTokens ?? 0, + total: overrides.totalTokens ?? 0, + cached: 0, + thoughts: 0, + tool: 0, + }, + }, + }, + tools: { + totalCalls: overrides.totalCalls ?? 0, + totalSuccess: overrides.totalSuccess ?? 0, + totalFail: overrides.totalFail ?? 0, + totalDurationMs: 0, + totalDecisions: { + [ToolCallDecision.ACCEPT]: 0, + [ToolCallDecision.REJECT]: 0, + [ToolCallDecision.MODIFY]: 0, + [ToolCallDecision.AUTO_ACCEPT]: 0, + }, + byName: {}, + }, + files: { + totalLinesAdded: 0, + totalLinesRemoved: 0, + }, +}); + +describe('ArenaAgentClient', () => { + let tempDir: string; + + beforeEach(async () => { + tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'arena-reporter-test-')); + vi.spyOn(uiTelemetryService, 'getMetrics').mockReturnValue( + createMockMetrics(), + ); + }); + + afterEach(async () => { + vi.restoreAllMocks(); + try { + await fs.rm(tempDir, { recursive: true, force: true }); + } catch { + // Ignore cleanup errors + } + }); + + describe('create() factory', () => { + it('should return null when ARENA_AGENT_ID is not set', () => { + const original = process.env['ARENA_AGENT_ID']; + const originalSession = process.env['ARENA_SESSION_ID']; + const originalDir = process.env['ARENA_SESSION_DIR']; + delete process.env['ARENA_AGENT_ID']; + delete process.env['ARENA_SESSION_ID']; + delete process.env['ARENA_SESSION_DIR']; + + const reporter = ArenaAgentClient.create(); + expect(reporter).toBeNull(); + + // Restore + if (original !== undefined) { + process.env['ARENA_AGENT_ID'] = original; + } + if (originalSession !== undefined) { + process.env['ARENA_SESSION_ID'] = originalSession; + } + if (originalDir !== undefined) { + process.env['ARENA_SESSION_DIR'] = originalDir; + } + }); + + it('should return null when ARENA_SESSION_ID is not set', () => { + const originalAgent = process.env['ARENA_AGENT_ID']; + const originalSession = process.env['ARENA_SESSION_ID']; + const originalDir = process.env['ARENA_SESSION_DIR']; + + 
process.env['ARENA_AGENT_ID'] = 'test-agent'; + delete process.env['ARENA_SESSION_ID']; + process.env['ARENA_SESSION_DIR'] = tempDir; + + const reporter = ArenaAgentClient.create(); + expect(reporter).toBeNull(); + + // Restore + if (originalAgent !== undefined) { + process.env['ARENA_AGENT_ID'] = originalAgent; + } else { + delete process.env['ARENA_AGENT_ID']; + } + if (originalSession !== undefined) { + process.env['ARENA_SESSION_ID'] = originalSession; + } + if (originalDir !== undefined) { + process.env['ARENA_SESSION_DIR'] = originalDir; + } else { + delete process.env['ARENA_SESSION_DIR']; + } + }); + + it('should return null when ARENA_SESSION_DIR is not set', () => { + const originalAgent = process.env['ARENA_AGENT_ID']; + const originalSession = process.env['ARENA_SESSION_ID']; + const originalDir = process.env['ARENA_SESSION_DIR']; + + process.env['ARENA_AGENT_ID'] = 'test-agent'; + process.env['ARENA_SESSION_ID'] = 'test-session'; + delete process.env['ARENA_SESSION_DIR']; + + const reporter = ArenaAgentClient.create(); + expect(reporter).toBeNull(); + + // Restore + if (originalAgent !== undefined) { + process.env['ARENA_AGENT_ID'] = originalAgent; + } else { + delete process.env['ARENA_AGENT_ID']; + } + if (originalSession !== undefined) { + process.env['ARENA_SESSION_ID'] = originalSession; + } else { + delete process.env['ARENA_SESSION_ID']; + } + if (originalDir !== undefined) { + process.env['ARENA_SESSION_DIR'] = originalDir; + } else { + delete process.env['ARENA_SESSION_DIR']; + } + }); + + it('should return an instance when all env vars are set', () => { + const originalAgent = process.env['ARENA_AGENT_ID']; + const originalSession = process.env['ARENA_SESSION_ID']; + const originalDir = process.env['ARENA_SESSION_DIR']; + + process.env['ARENA_AGENT_ID'] = 'test-agent'; + process.env['ARENA_SESSION_ID'] = 'test-session'; + process.env['ARENA_SESSION_DIR'] = tempDir; + + const reporter = ArenaAgentClient.create(); + 
expect(reporter).toBeInstanceOf(ArenaAgentClient); + + // Restore + if (originalAgent !== undefined) { + process.env['ARENA_AGENT_ID'] = originalAgent; + } else { + delete process.env['ARENA_AGENT_ID']; + } + if (originalSession !== undefined) { + process.env['ARENA_SESSION_ID'] = originalSession; + } else { + delete process.env['ARENA_SESSION_ID']; + } + if (originalDir !== undefined) { + process.env['ARENA_SESSION_DIR'] = originalDir; + } else { + delete process.env['ARENA_SESSION_DIR']; + } + }); + }); + + describe('init()', () => { + it('should create the agents/ and control/ directories', async () => { + const reporter = new ArenaAgentClient('agent-1', tempDir); + await reporter.init(); + + const agentsDir = path.join(tempDir, 'agents'); + const controlDir = path.join(tempDir, 'control'); + const agentsStat = await fs.stat(agentsDir); + const controlStat = await fs.stat(controlDir); + expect(agentsStat.isDirectory()).toBe(true); + expect(controlStat.isDirectory()).toBe(true); + }); + + it('should be idempotent', async () => { + const reporter = new ArenaAgentClient('agent-1', tempDir); + await reporter.init(); + await reporter.init(); // Should not throw + + const agentsDir = path.join(tempDir, 'agents'); + const stat = await fs.stat(agentsDir); + expect(stat.isDirectory()).toBe(true); + }); + }); + + describe('updateStatus()', () => { + it('should write per-agent status file with stats from telemetry', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue( + createMockMetrics({ + totalRequests: 3, + totalTokens: 1500, + promptTokens: 1000, + candidatesTokens: 500, + totalCalls: 7, + totalSuccess: 6, + totalFail: 1, + }), + ); + + await reporter.updateStatus('Editing files'); + + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 
'utf-8')); + + expect(content.agentId).toBe(agentId); + expect(content.status).toBe('running'); + expect(content.rounds).toBe(3); + expect(content.currentActivity).toBe('Editing files'); + expect(content.stats.totalTokens).toBe(1500); + expect(content.stats.inputTokens).toBe(1000); + expect(content.stats.outputTokens).toBe(500); + expect(content.stats.toolCalls).toBe(7); + expect(content.stats.successfulToolCalls).toBe(6); + expect(content.stats.failedToolCalls).toBe(1); + expect(content.finalSummary).toBeNull(); + expect(content.error).toBeNull(); + expect(content.updatedAt).toBeTypeOf('number'); + }); + + it('should perform atomic write (no partial reads)', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + // Write status multiple times rapidly + const promises = []; + for (let i = 0; i < 10; i++) { + promises.push(reporter.updateStatus()); + } + await Promise.all(promises); + + // The file should be valid JSON (no corruption from concurrent writes) + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + expect(content.agentId).toBe(agentId); + expect(content.status).toBe('running'); + }); + + it('should reflect latest telemetry on each call', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + // First update + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue( + createMockMetrics({ + totalRequests: 1, + totalTokens: 100, + totalCalls: 5, + }), + ); + await reporter.updateStatus(); + + // Second update with updated telemetry + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue( + createMockMetrics({ + totalRequests: 2, + totalTokens: 200, + totalCalls: 8, + }), + ); + await reporter.updateStatus(); + + const statusPath = path.join( + tempDir, + 'agents', + 
`${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + + expect(content.rounds).toBe(2); + expect(content.stats.totalTokens).toBe(200); + expect(content.stats.toolCalls).toBe(8); + }); + + it('should auto-initialize if not yet initialized', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + // Skip init() call + + await reporter.updateStatus(); + + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + expect(content.agentId).toBe(agentId); + }); + }); + + describe('checkControlSignal()', () => { + it('should return null when no control file exists', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + const signal = await reporter.checkControlSignal(); + expect(signal).toBeNull(); + }); + + it('should read and delete control file', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + // Write a control signal + const controlSignal: ArenaControlSignal = { + type: 'shutdown', + reason: 'User cancelled', + timestamp: Date.now(), + }; + const controlPath = path.join( + tempDir, + 'control', + `${safeAgentId(agentId)}.json`, + ); + await fs.writeFile(controlPath, JSON.stringify(controlSignal), 'utf-8'); + + // Read it + const signal = await reporter.checkControlSignal(); + expect(signal).not.toBeNull(); + expect(signal!.type).toBe('shutdown'); + expect(signal!.reason).toBe('User cancelled'); + + // File should be deleted (consumed) + await expect(fs.access(controlPath)).rejects.toThrow(); + }); + + it('should return null on subsequent reads (consume-once)', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + // Write a control 
signal + const controlSignal: ArenaControlSignal = { + type: 'cancel', + reason: 'Timeout', + timestamp: Date.now(), + }; + const controlPath = path.join( + tempDir, + 'control', + `${safeAgentId(agentId)}.json`, + ); + await fs.writeFile(controlPath, JSON.stringify(controlSignal), 'utf-8'); + + // First read should return the signal + const first = await reporter.checkControlSignal(); + expect(first).not.toBeNull(); + + // Second read should return null + const second = await reporter.checkControlSignal(); + expect(second).toBeNull(); + }); + }); + + describe('reportCompleted()', () => { + it('should write status with completed state and optional summary', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + await reporter.reportCompleted('Successfully implemented feature X'); + + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + + expect(content.status).toBe('completed'); + expect(content.finalSummary).toBe('Successfully implemented feature X'); + expect(content.error).toBeNull(); + }); + + it('should write status with completed state and no summary', async () => { + const agentId = 'model-a'; + const reporter = new ArenaAgentClient(agentId, tempDir); + await reporter.init(); + + await reporter.reportCompleted(); + + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId(agentId)}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + + expect(content.status).toBe('completed'); + expect(content.finalSummary).toBeNull(); + expect(content.error).toBeNull(); + }); + }); + + describe('stats aggregation and wall-clock durationMs', () => { + it('should aggregate multi-model stats and use wall-clock durationMs', async () => { + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue({ + models: { + 'model-a': { + api: { + totalRequests: 3, + 
totalErrors: 0, + totalLatencyMs: 1000, + }, + tokens: { + prompt: 100, + candidates: 50, + total: 150, + cached: 0, + thoughts: 0, + tool: 0, + }, + }, + 'model-b': { + api: { + totalRequests: 2, + totalErrors: 1, + totalLatencyMs: 500, + }, + tokens: { + prompt: 200, + candidates: 100, + total: 300, + cached: 0, + thoughts: 0, + tool: 0, + }, + }, + }, + tools: { + totalCalls: 10, + totalSuccess: 8, + totalFail: 2, + totalDurationMs: 2000, + totalDecisions: { + [ToolCallDecision.ACCEPT]: 0, + [ToolCallDecision.REJECT]: 0, + [ToolCallDecision.MODIFY]: 0, + [ToolCallDecision.AUTO_ACCEPT]: 0, + }, + byName: {}, + }, + files: { totalLinesAdded: 0, totalLinesRemoved: 0 }, + }); + + const reporter = new ArenaAgentClient('model-a', tempDir); + await reporter.init(); + await reporter.updateStatus(); + + const statusPath = path.join( + tempDir, + 'agents', + `${safeAgentId('model-a')}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + + expect(content.stats.rounds).toBe(5); + expect(content.stats.totalTokens).toBe(450); + expect(content.stats.inputTokens).toBe(300); + expect(content.stats.outputTokens).toBe(150); + expect(content.stats.toolCalls).toBe(10); + expect(content.stats.successfulToolCalls).toBe(8); + expect(content.stats.failedToolCalls).toBe(2); + // durationMs should be wall-clock time, not API latency sum (1500) + expect(content.stats.durationMs).toBeGreaterThanOrEqual(0); + expect(content.stats.durationMs).toBeLessThan(5000); + }); + + it('should return zeros when no models exist', async () => { + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue( + createMockMetrics(), + ); + // Override with empty models + vi.mocked(uiTelemetryService.getMetrics).mockReturnValue({ + ...createMockMetrics(), + models: {}, + }); + + const reporter = new ArenaAgentClient('model-a', tempDir); + await reporter.init(); + await reporter.updateStatus(); + + const statusPath = path.join( + tempDir, + 'agents', + 
`${safeAgentId('model-a')}.json`, + ); + const content = JSON.parse(await fs.readFile(statusPath, 'utf-8')); + + expect(content.stats.rounds).toBe(0); + expect(content.stats.totalTokens).toBe(0); + expect(content.stats.inputTokens).toBe(0); + expect(content.stats.outputTokens).toBe(0); + // durationMs is wall-clock, so still non-negative even with no models + expect(content.stats.durationMs).toBeGreaterThanOrEqual(0); + }); + }); + + describe('safeAgentId()', () => { + it('should pass through typical model IDs unchanged', () => { + expect(safeAgentId('qwen-coder-plus')).toBe('qwen-coder-plus'); + }); + + it('should handle IDs without unsafe characters', () => { + expect(safeAgentId('simple-id')).toBe('simple-id'); + }); + + it('should replace slashes with double dashes', () => { + expect(safeAgentId('org/model-name')).toBe('org--model-name'); + }); + + it('should handle multiple unsafe characters', () => { + expect(safeAgentId('a/b\\c:d')).toBe('a--b--c--d'); + }); + }); +}); diff --git a/packages/core/src/agents/arena/ArenaAgentClient.ts b/packages/core/src/agents/arena/ArenaAgentClient.ts new file mode 100644 index 0000000000..12780f8de9 --- /dev/null +++ b/packages/core/src/agents/arena/ArenaAgentClient.ts @@ -0,0 +1,241 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import { createDebugLogger } from '../../utils/debugLogger.js'; +import { isNodeError } from '../../utils/errors.js'; +import { atomicWriteJSON } from '../../utils/atomicFileWrite.js'; +import { uiTelemetryService } from '../../telemetry/uiTelemetry.js'; +import type { + ArenaAgentStats, + ArenaControlSignal, + ArenaStatusFile, +} from './types.js'; +import { safeAgentId } from './types.js'; +import { AgentStatus } from '../runtime/agent-types.js'; + +const debugLogger = createDebugLogger('ARENA_AGENT_CLIENT'); + +const AGENTS_SUBDIR = 'agents'; +const CONTROL_SUBDIR = 
'control'; + +/** + * ArenaAgentClient is used by child agent processes to communicate + * their status back to the main ArenaManager process via file-based IPC. + * + * Status files are written to a centralized arena session directory: + * `/agents/.json` + * + * Control signals are read from: + * `/control/.json` + * + * It self-activates based on the ARENA_AGENT_ID environment variable. + * When running outside an Arena session, `ArenaAgentClient.create()` + * returns null. + */ +export class ArenaAgentClient { + private readonly agentsDir: string; + private readonly controlDir: string; + private readonly statusFilePath: string; + private readonly controlFilePath: string; + private readonly startTimeMs: number; + private initialized = false; + + /** + * Static factory - returns an instance if ARENA_AGENT_ID, ARENA_SESSION_ID, + * and ARENA_SESSION_DIR env vars are present, null otherwise. + */ + static create(): ArenaAgentClient | null { + const agentId = process.env['ARENA_AGENT_ID']; + const sessionId = process.env['ARENA_SESSION_ID']; + const sessionDir = process.env['ARENA_SESSION_DIR']; + + if (!agentId || !sessionId || !sessionDir) { + return null; + } + + return new ArenaAgentClient(agentId, sessionDir); + } + + constructor( + private readonly agentId: string, + arenaSessionDir: string, + ) { + const safe = safeAgentId(agentId); + this.agentsDir = path.join(arenaSessionDir, AGENTS_SUBDIR); + this.controlDir = path.join(arenaSessionDir, CONTROL_SUBDIR); + this.statusFilePath = path.join(this.agentsDir, `${safe}.json`); + this.controlFilePath = path.join(this.controlDir, `${safe}.json`); + this.startTimeMs = Date.now(); + } + + /** + * Initialize the agents/ and control/ directories under the arena session + * dir. Called automatically on first use if not invoked explicitly. 
+ */ + async init(): Promise<void> { + await fs.mkdir(this.agentsDir, { recursive: true }); + await fs.mkdir(this.controlDir, { recursive: true }); + this.initialized = true; + debugLogger.info( + `ArenaAgentClient initialized for agent ${this.agentId} at ${this.agentsDir}`, + ); + } + + /** + * Write current status to the per-agent status file using atomic write + * (write to temp file then rename). + * + * Stats are derived automatically from uiTelemetryService which is the + * canonical source for token counts, tool calls, and API request counts. + */ + async updateStatus(currentActivity?: string): Promise<void> { + await this.ensureInitialized(); + + const stats = this.getStatsFromTelemetry(); + + const statusFile: ArenaStatusFile = { + agentId: this.agentId, + status: AgentStatus.RUNNING, + updatedAt: Date.now(), + rounds: stats.rounds, + currentActivity, + stats, + finalSummary: null, + error: null, + }; + + await atomicWriteJSON(this.statusFilePath, statusFile); + } + + /** + * Read and delete control.json (consume-once pattern). + * Returns null if no control signal is pending. + */ + async checkControlSignal(): Promise<ArenaControlSignal | null> { + await this.ensureInitialized(); + + try { + const content = await fs.readFile(this.controlFilePath, 'utf-8'); + // Parse before deleting so a corrupted file isn't silently consumed + const signal = JSON.parse(content) as ArenaControlSignal; + await fs.unlink(this.controlFilePath); + return signal; + } catch (error: unknown) { + // File doesn't exist = no signal pending + if (isNodeError(error) && error.code === 'ENOENT') { + return null; + } + // Re-throw permission errors so they surface immediately + if (isNodeError(error) && error.code === 'EACCES') { + throw error; + } + debugLogger.error('Error reading control signal:', error); + return null; + } + } + + /** + * Report that the agent has completed the current task successfully. + * This is the primary signal to the main process that the agent is done working. 
+ */ + async reportCompleted(finalSummary?: string): Promise<void> { + await this.ensureInitialized(); + + const stats = this.getStatsFromTelemetry(); + + const statusFile: ArenaStatusFile = { + agentId: this.agentId, + status: AgentStatus.COMPLETED, + updatedAt: Date.now(), + rounds: stats.rounds, + stats, + finalSummary: finalSummary ?? null, + error: null, + }; + + await atomicWriteJSON(this.statusFilePath, statusFile); + } + + /** + * Report that the agent hit an error (API/auth/rate-limit, loop, etc.). + */ + async reportError(errorMessage: string): Promise<void> { + await this.ensureInitialized(); + + const stats = this.getStatsFromTelemetry(); + + const statusFile: ArenaStatusFile = { + agentId: this.agentId, + status: AgentStatus.FAILED, + updatedAt: Date.now(), + rounds: stats.rounds, + stats, + finalSummary: null, + error: errorMessage, + }; + + await atomicWriteJSON(this.statusFilePath, statusFile); + } + + /** + * Report that the agent's current request was cancelled by the user. + */ + async reportCancelled(): Promise<void> { + await this.ensureInitialized(); + + const stats = this.getStatsFromTelemetry(); + + const statusFile: ArenaStatusFile = { + agentId: this.agentId, + status: AgentStatus.CANCELLED, + updatedAt: Date.now(), + rounds: stats.rounds, + stats, + finalSummary: null, + error: null, + }; + + await atomicWriteJSON(this.statusFilePath, statusFile); + } + + /** + * Build ArenaAgentStats from uiTelemetryService metrics + */ + private getStatsFromTelemetry(): ArenaAgentStats { + const metrics = uiTelemetryService.getMetrics(); + + let rounds = 0; + let totalTokens = 0; + let inputTokens = 0; + let outputTokens = 0; + + for (const model of Object.values(metrics.models)) { + rounds += model.api.totalRequests; + totalTokens += model.tokens.total; + inputTokens += model.tokens.prompt; + outputTokens += model.tokens.candidates; + } + + return { + rounds, + totalTokens, + inputTokens, + outputTokens, + durationMs: Date.now() - this.startTimeMs, + toolCalls: 
metrics.tools.totalCalls, + successfulToolCalls: metrics.tools.totalSuccess, + failedToolCalls: metrics.tools.totalFail, + }; + } + + private async ensureInitialized(): Promise<void> { + if (!this.initialized) { + await this.init(); + } + } +} diff --git a/packages/core/src/agents/arena/ArenaManager.test.ts b/packages/core/src/agents/arena/ArenaManager.test.ts new file mode 100644 index 0000000000..a21f15d634 --- /dev/null +++ b/packages/core/src/agents/arena/ArenaManager.test.ts @@ -0,0 +1,505 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import * as os from 'node:os'; +import { ArenaManager } from './ArenaManager.js'; +import { ArenaEventType } from './arena-events.js'; +import { ArenaSessionStatus, ARENA_MAX_AGENTS } from './types.js'; + +const hoistedMockSetupWorktrees = vi.hoisted(() => vi.fn()); +const hoistedMockCleanupSession = vi.hoisted(() => vi.fn()); +const hoistedMockGetWorktreeDiff = vi.hoisted(() => vi.fn()); +const hoistedMockApplyWorktreeChanges = vi.hoisted(() => vi.fn()); +const hoistedMockDetectBackend = vi.hoisted(() => vi.fn()); + +vi.mock('../index.js', async (importOriginal) => { + const actual = await importOriginal(); + return { + ...actual, + detectBackend: hoistedMockDetectBackend, + }; +}); + +// Mock GitWorktreeService to avoid real git operations. +// The class mock includes static methods used by ArenaManager. 
+vi.mock('../../services/gitWorktreeService.js', () => { + const MockClass = vi.fn().mockImplementation(() => ({ + checkGitAvailable: vi.fn().mockResolvedValue({ available: true }), + isGitRepository: vi.fn().mockResolvedValue(true), + setupWorktrees: hoistedMockSetupWorktrees, + cleanupSession: hoistedMockCleanupSession, + getWorktreeDiff: hoistedMockGetWorktreeDiff, + applyWorktreeChanges: hoistedMockApplyWorktreeChanges, + })); + // Static methods called by ArenaManager + (MockClass as unknown as Record)['getBaseDir'] = () => + path.join(os.tmpdir(), 'arena-mock'); + (MockClass as unknown as Record)['getSessionDir'] = ( + sessionId: string, + ) => path.join(os.tmpdir(), 'arena-mock', sessionId); + (MockClass as unknown as Record)['getWorktreesDir'] = ( + sessionId: string, + ) => path.join(os.tmpdir(), 'arena-mock', sessionId, 'worktrees'); + return { GitWorktreeService: MockClass }; +}); + +// Mock the Config class +const createMockConfig = ( + workingDir: string, + arenaSettings: Record = {}, +) => ({ + getWorkingDir: () => workingDir, + getModel: () => 'test-model', + getSessionId: () => 'test-session', + getUserMemory: () => '', + getToolRegistry: () => ({ + getFunctionDeclarations: () => [], + getFunctionDeclarationsFiltered: () => [], + getTool: () => undefined, + }), + getAgentsSettings: () => ({ arena: arenaSettings }), + getUsageStatisticsEnabled: () => false, + getTelemetryEnabled: () => false, + getTelemetryLogPromptsEnabled: () => false, +}); + +describe('ArenaManager', () => { + let tempDir: string; + let mockConfig: ReturnType; + let mockBackend: ReturnType; + + beforeEach(async () => { + // Create a temp directory - no need for git repo since we mock GitWorktreeService + tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'arena-test-')); + // Use tempDir as worktreeBaseDir to avoid slow filesystem access in deriveWorktreeDirName + mockConfig = createMockConfig(tempDir, { worktreeBaseDir: tempDir }); + + mockBackend = createMockBackend(); + 
hoistedMockDetectBackend.mockResolvedValue({ backend: mockBackend }); + + hoistedMockSetupWorktrees.mockImplementation( + async ({ + sessionId, + sourceRepoPath, + worktreeNames, + }: { + sessionId: string; + sourceRepoPath: string; + worktreeNames: string[]; + }) => { + const worktrees = worktreeNames.map((name) => ({ + id: `${sessionId}/${name}`, + name, + path: path.join(sourceRepoPath, `.arena-${sessionId}`, name), + branch: `arena/${sessionId}/${name}`, + isActive: true, + createdAt: Date.now(), + })); + + return { + success: true, + sessionId, + worktrees, + worktreesByName: Object.fromEntries( + worktrees.map((worktree) => [worktree.name, worktree]), + ), + errors: [], + }; + }, + ); + hoistedMockCleanupSession.mockResolvedValue({ + success: true, + removedWorktrees: [], + removedBranches: [], + errors: [], + }); + hoistedMockGetWorktreeDiff.mockResolvedValue(''); + hoistedMockApplyWorktreeChanges.mockResolvedValue({ success: true }); + }); + + afterEach(async () => { + try { + await fs.rm(tempDir, { recursive: true, force: true }); + } catch { + // Ignore cleanup errors + } + }); + + describe('constructor', () => { + it('should create an ArenaManager instance', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager).toBeDefined(); + expect(manager.getSessionId()).toBeUndefined(); + expect(manager.getSessionStatus()).toBe(ArenaSessionStatus.INITIALIZING); + }); + + it('should not have a backend before start', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager.getBackend()).toBeNull(); + }); + }); + + describe('start validation', () => { + it('should reject start with less than 2 models', async () => { + const manager = new ArenaManager(mockConfig as never); + + await expect( + manager.start({ + models: [{ modelId: 'model-1', authType: 'openai' }], + task: 'Test task', + }), + ).rejects.toThrow('Arena requires at least 2 models'); + }); + + it('should reject start with more than max models', async () 
=> { + const manager = new ArenaManager(mockConfig as never); + + const models = Array.from({ length: ARENA_MAX_AGENTS + 1 }, (_, i) => ({ + modelId: `model-${i}`, + authType: 'openai', + })); + + await expect( + manager.start({ + models, + task: 'Test task', + }), + ).rejects.toThrow( + `Arena supports a maximum of ${ARENA_MAX_AGENTS} models`, + ); + }); + + it('should reject start with empty task', async () => { + const manager = new ArenaManager(mockConfig as never); + + await expect( + manager.start({ + models: [ + { modelId: 'model-1', authType: 'openai' }, + { modelId: 'model-2', authType: 'openai' }, + ], + task: '', + }), + ).rejects.toThrow('Arena requires a task/prompt'); + }); + + it('should reject start with duplicate model IDs', async () => { + const manager = new ArenaManager(mockConfig as never); + + await expect( + manager.start({ + models: [ + { modelId: 'model-1', authType: 'openai' }, + { modelId: 'model-1', authType: 'openai' }, + ], + task: 'Test task', + }), + ).rejects.toThrow('Arena models must have unique identifiers'); + }); + }); + + describe('event emitter', () => { + it('should return the event emitter', () => { + const manager = new ArenaManager(mockConfig as never); + const emitter = manager.getEventEmitter(); + expect(emitter).toBeDefined(); + expect(typeof emitter.on).toBe('function'); + expect(typeof emitter.off).toBe('function'); + expect(typeof emitter.emit).toBe('function'); + }); + }); + + describe('PTY interaction methods', () => { + it('should expose PTY interaction methods', () => { + const manager = new ArenaManager(mockConfig as never); + expect(typeof manager.switchToAgent).toBe('function'); + expect(typeof manager.switchToNextAgent).toBe('function'); + expect(typeof manager.switchToPreviousAgent).toBe('function'); + expect(typeof manager.getActiveAgentId).toBe('function'); + expect(typeof manager.getActiveSnapshot).toBe('function'); + expect(typeof manager.getAgentSnapshot).toBe('function'); + expect(typeof 
manager.forwardInput).toBe('function'); + expect(typeof manager.resizeAgents).toBe('function'); + }); + + it('should return null for active agent ID when no session', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager.getActiveAgentId()).toBeNull(); + }); + + it('should return null for active snapshot when no session', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager.getActiveSnapshot()).toBeNull(); + }); + }); + + describe('cancel', () => { + it('should handle cancel when no session is active', async () => { + const manager = new ArenaManager(mockConfig as never); + await expect(manager.cancel()).resolves.not.toThrow(); + }); + }); + + describe('cleanup', () => { + it('should handle cleanup when no session is active', async () => { + const manager = new ArenaManager(mockConfig as never); + await expect(manager.cleanup()).resolves.not.toThrow(); + }); + }); + + describe('getAgentStates', () => { + it('should return empty array when no agents', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager.getAgentStates()).toEqual([]); + }); + }); + + describe('getAgentState', () => { + it('should return undefined for non-existent agent', () => { + const manager = new ArenaManager(mockConfig as never); + expect(manager.getAgentState('non-existent')).toBeUndefined(); + }); + }); + + describe('applyAgentResult', () => { + it('should return error for non-existent agent', async () => { + const manager = new ArenaManager(mockConfig as never); + const result = await manager.applyAgentResult('non-existent'); + expect(result.success).toBe(false); + expect(result.error).toContain('not found'); + }); + }); + + describe('getAgentDiff', () => { + it('should return error message for non-existent agent', async () => { + const manager = new ArenaManager(mockConfig as never); + const diff = await manager.getAgentDiff('non-existent'); + expect(diff).toContain('not found'); + }); + }); + + 
describe('backend initialization', () => { + it('should emit SESSION_UPDATE with type warning when backend detection returns warning', async () => { + const manager = new ArenaManager(mockConfig as never); + const updates: Array<{ + type: string; + message: string; + sessionId: string; + }> = []; + manager.getEventEmitter().on(ArenaEventType.SESSION_UPDATE, (event) => { + updates.push({ + type: event.type, + message: event.message, + sessionId: event.sessionId, + }); + }); + + hoistedMockDetectBackend.mockResolvedValueOnce({ + backend: mockBackend, + warning: 'fallback to tmux backend', + }); + + await manager.start(createValidStartOptions()); + + expect(hoistedMockDetectBackend).toHaveBeenCalledWith( + undefined, + expect.anything(), + ); + const warningUpdate = updates.find((u) => u.type === 'warning'); + expect(warningUpdate).toBeDefined(); + expect(warningUpdate?.message).toContain('fallback to tmux backend'); + expect(warningUpdate?.sessionId).toBe('test-session'); + }); + + it('should emit SESSION_ERROR and mark FAILED when backend init fails', async () => { + const manager = new ArenaManager(mockConfig as never); + const sessionErrors: string[] = []; + manager.getEventEmitter().on(ArenaEventType.SESSION_ERROR, (event) => { + sessionErrors.push(event.error); + }); + + mockBackend.init.mockRejectedValueOnce(new Error('init failed')); + + await expect(manager.start(createValidStartOptions())).rejects.toThrow( + 'init failed', + ); + expect(manager.getSessionStatus()).toBe(ArenaSessionStatus.FAILED); + expect(sessionErrors).toEqual(['init failed']); + }); + }); + + describe('chat history forwarding', () => { + it('should pass chatHistory to backend spawnAgent calls', async () => { + const manager = new ArenaManager(mockConfig as never); + const chatHistory = [ + { role: 'user' as const, parts: [{ text: 'prior question' }] }, + { role: 'model' as const, parts: [{ text: 'prior answer' }] }, + ]; + + await manager.start({ + ...createValidStartOptions(), + 
chatHistory, + }); + + // Both agents should have been spawned with chatHistory in + // the inProcess config. + expect(mockBackend.spawnAgent).toHaveBeenCalledTimes(2); + for (const call of mockBackend.spawnAgent.mock.calls) { + const spawnConfig = call[0] as { + inProcess?: { chatHistory?: unknown }; + }; + expect(spawnConfig.inProcess?.chatHistory).toEqual(chatHistory); + } + }); + + it('should pass undefined chatHistory when not provided', async () => { + const manager = new ArenaManager(mockConfig as never); + + await manager.start(createValidStartOptions()); + + expect(mockBackend.spawnAgent).toHaveBeenCalledTimes(2); + for (const call of mockBackend.spawnAgent.mock.calls) { + const spawnConfig = call[0] as { + inProcess?: { chatHistory?: unknown }; + }; + expect(spawnConfig.inProcess?.chatHistory).toBeUndefined(); + } + }); + }); + + describe('active session lifecycle', () => { + it('cancel should stop backend and move session to CANCELLED', async () => { + const manager = new ArenaManager(mockConfig as never); + + // Disable auto-exit so agents stay running until we cancel. + mockBackend.setAutoExit(false); + + const startPromise = manager.start({ + ...createValidStartOptions(), + timeoutSeconds: 30, + }); + + // Wait until the backend has spawned all agents. + // (Agents are spawned sequentially; cancelling between spawns would + // cause spawnAgentPty to overwrite the CANCELLED status back to RUNNING.) + await waitForCondition( + () => mockBackend.spawnAgent.mock.calls.length >= 2, + ); + + await manager.cancel(); + expect(mockBackend.stopAll).toHaveBeenCalledTimes(1); + expect(manager.getSessionStatus()).toBe(ArenaSessionStatus.CANCELLED); + + await startPromise; + expect(manager.getSessionStatus()).toBe(ArenaSessionStatus.CANCELLED); + }); + + it('cleanup should release backend and worktree resources after start', async () => { + const manager = new ArenaManager(mockConfig as never); + + // auto-exit is on by default, so agents terminate quickly. 
+ await manager.start(createValidStartOptions()); + + await manager.cleanup(); + + expect(mockBackend.cleanup).toHaveBeenCalledTimes(1); + // cleanupSession is called with worktreeDirName (short ID), not the full sessionId. + // For 'test-session', the short ID is 'testsess' (first 8 chars with dashes removed). + expect(hoistedMockCleanupSession).toHaveBeenCalledWith('testsess'); + expect(manager.getBackend()).toBeNull(); + expect(manager.getSessionId()).toBeUndefined(); + }); + }); +}); + +describe('ARENA_MAX_AGENTS', () => { + it('should be 5', () => { + expect(ARENA_MAX_AGENTS).toBe(5); + }); +}); + +function createMockBackend() { + type ExitCb = ( + agentId: string, + exitCode: number | null, + signal: number | null, + ) => void; + let onAgentExit: ExitCb | null = null; + let autoExit = true; + + const backend = { + type: 'tmux' as const, + init: vi.fn().mockResolvedValue(undefined), + spawnAgent: vi.fn(async (config: { agentId: string }) => { + // By default, simulate immediate agent termination so tests + // don't hang in waitForAllAgentsSettled. + if (autoExit) { + setTimeout(() => onAgentExit?.(config.agentId, 0, null), 5); + } + }), + stopAgent: vi.fn(), + stopAll: vi.fn(), + cleanup: vi.fn().mockResolvedValue(undefined), + setOnAgentExit: vi.fn((cb: ExitCb) => { + onAgentExit = cb; + }), + waitForAll: vi.fn().mockResolvedValue(true), + switchTo: vi.fn(), + switchToNext: vi.fn(), + switchToPrevious: vi.fn(), + getActiveAgentId: vi.fn().mockReturnValue(null), + getActiveSnapshot: vi.fn().mockReturnValue(null), + getAgentSnapshot: vi.fn().mockReturnValue(null), + getAgentScrollbackLength: vi.fn().mockReturnValue(0), + forwardInput: vi.fn().mockReturnValue(false), + writeToAgent: vi.fn().mockReturnValue(false), + resizeAll: vi.fn(), + getAttachHint: vi.fn().mockReturnValue(null), + /** Disable automatic agent exit for tests that need to control timing. 
*/ + setAutoExit(value: boolean) { + autoExit = value; + }, + }; + return backend; +} + +function createValidStartOptions() { + return { + models: [ + { modelId: 'model-1', authType: 'openai' }, + { modelId: 'model-2', authType: 'openai' }, + ], + task: 'Implement feature X', + }; +} + +async function waitForMicrotask(): Promise<void> { + // Use setImmediate (or setTimeout fallback) to yield to the event loop + // and allow other async operations (like the start() method) to progress. + await new Promise((resolve) => { + if (typeof setImmediate === 'function') { + setImmediate(resolve); + } else { + setTimeout(resolve, 0); + } + }); +} + +async function waitForCondition( + predicate: () => boolean, + timeoutMs = 1000, +): Promise<void> { + const startedAt = Date.now(); + while (!predicate()) { + if (Date.now() - startedAt > timeoutMs) { + throw new Error('Timed out while waiting for condition'); + } + await waitForMicrotask(); + } +} diff --git a/packages/core/src/agents/arena/ArenaManager.ts b/packages/core/src/agents/arena/ArenaManager.ts new file mode 100644 index 0000000000..6a386158f2 --- /dev/null +++ b/packages/core/src/agents/arena/ArenaManager.ts @@ -0,0 +1,1648 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import { GitWorktreeService } from '../../services/gitWorktreeService.js'; +import { Storage } from '../../config/storage.js'; +import type { Config } from '../../config/config.js'; +import { getCoreSystemPrompt } from '../../core/prompts.js'; +import { createDebugLogger } from '../../utils/debugLogger.js'; +import { isNodeError } from '../../utils/errors.js'; +import { atomicWriteJSON } from '../../utils/atomicFileWrite.js'; +import type { AnsiOutput } from '../../utils/terminalSerializer.js'; +import { ArenaEventEmitter, ArenaEventType } from './arena-events.js'; +import type { AgentSpawnConfig, Backend, DisplayMode } from 
'../index.js'; +import { detectBackend, DISPLAY_MODE } from '../index.js'; +import type { InProcessBackend } from '../backends/InProcessBackend.js'; +import { + AgentEventType, + type AgentStatusChangeEvent, +} from '../runtime/agent-events.js'; +import { + type ArenaConfig, + type ArenaConfigFile, + type ArenaControlSignal, + type ArenaStartOptions, + type ArenaAgentResult, + type ArenaSessionResult, + type ArenaAgentState, + type ArenaCallbacks, + type ArenaStatusFile, + ArenaSessionStatus, + ARENA_MAX_AGENTS, + safeAgentId, +} from './types.js'; +import { + AgentStatus, + isTerminalStatus, + isSettledStatus, + isSuccessStatus, +} from '../runtime/agent-types.js'; +import { + logArenaSessionStarted, + logArenaAgentCompleted, + logArenaSessionEnded, + makeArenaSessionStartedEvent, + makeArenaAgentCompletedEvent, + makeArenaSessionEndedEvent, +} from '../../telemetry/index.js'; +import type { ArenaSessionEndedStatus } from '../../telemetry/index.js'; + +const debugLogger = createDebugLogger('ARENA'); + +const ARENA_POLL_INTERVAL_MS = 500; + +/** + * ArenaManager orchestrates multi-model competitive execution. + * + * It manages: + * - Git worktree creation for isolated environments + * - Parallel agent execution via PTY subprocesses (through Backend) + * - Event emission for UI updates + * - Result collection and comparison + * - Active agent switching, input routing, and screen capture + */ +export class ArenaManager { + private readonly config: Config; + private readonly eventEmitter: ArenaEventEmitter; + private readonly worktreeService: GitWorktreeService; + private readonly arenaBaseDir: string; + private readonly callbacks: ArenaCallbacks; + private backend: Backend | null = null; + private cachedResult: ArenaSessionResult | null = null; + + private sessionId: string | undefined; + /** Short directory name used for worktree paths (derived from sessionId). 
*/ + private worktreeDirName: string | undefined; + private sessionStatus: ArenaSessionStatus = ArenaSessionStatus.INITIALIZING; + private agents: Map = new Map(); + private arenaConfig: ArenaConfig | undefined; + + private startedAt: number | undefined; + private masterAbortController: AbortController | undefined; + private terminalCols: number; + private terminalRows: number; + private pollingInterval: ReturnType | null = null; + private lifecyclePromise: Promise | null = null; + /** Cleanup functions for in-process event bridge listeners. */ + private eventBridgeCleanups: Array<() => void> = []; + /** Guard to prevent double-emitting the session-ended telemetry event. */ + private sessionEndedLogged = false; + + constructor(config: Config, callbacks: ArenaCallbacks = {}) { + this.config = config; + this.callbacks = callbacks; + this.eventEmitter = new ArenaEventEmitter(); + const arenaSettings = config.getAgentsSettings().arena; + // Use the user-configured base dir, or default to ~/.qwen/arena. + this.arenaBaseDir = + arenaSettings?.worktreeBaseDir ?? + path.join(Storage.getGlobalQwenDir(), 'arena'); + this.worktreeService = new GitWorktreeService( + config.getWorkingDir(), + this.arenaBaseDir, + ); + this.terminalCols = process.stdout.columns || 120; + this.terminalRows = process.stdout.rows || 40; + } + + // ─── Public API ──────────────────────────────────────────────── + + /** + * Get the event emitter for subscribing to Arena events. + */ + getEventEmitter(): ArenaEventEmitter { + return this.eventEmitter; + } + + /** + * Get the current session ID. + */ + getSessionId(): string | undefined { + return this.sessionId; + } + + /** + * Get the current session status. + */ + getSessionStatus(): ArenaSessionStatus { + return this.sessionStatus; + } + + /** + * Get the current task description (available while session is active). + */ + getTask(): string | undefined { + return this.arenaConfig?.task; + } + + /** + * Get all agent states. 
+ */ + getAgentStates(): ArenaAgentState[] { + return Array.from(this.agents.values()); + } + + /** + * Get a specific agent state. + */ + getAgentState(agentId: string): ArenaAgentState | undefined { + return this.agents.get(agentId); + } + + /** + * Get the cached session result (available after session completes). + */ + getResult(): ArenaSessionResult | null { + return this.cachedResult; + } + + /** + * Get the underlying backend for direct access. + * Returns null before the session initializes a backend. + */ + getBackend(): Backend | null { + return this.backend; + } + + /** + * Store the outer lifecycle promise so cancel/stop can wait for start() + * to fully unwind before proceeding with cleanup. + */ + setLifecyclePromise(p: Promise): void { + this.lifecyclePromise = p; + } + + /** + * Wait for the start lifecycle to fully settle (including error handling + * and listener teardown). Resolves immediately if no lifecycle is active. + */ + async waitForSettled(): Promise { + if (this.lifecyclePromise) { + await this.lifecyclePromise; + } + } + + // ─── PTY Interaction ─────────────────────────────────────────── + + /** + * Switch the active agent for screen display and input routing. + */ + switchToAgent(agentId: string): void { + this.backend?.switchTo(agentId); + } + + /** + * Switch to the next agent in order. + */ + switchToNextAgent(): void { + this.backend?.switchToNext(); + } + + /** + * Switch to the previous agent in order. + */ + switchToPreviousAgent(): void { + this.backend?.switchToPrevious(); + } + + /** + * Get the ID of the currently active agent. + */ + getActiveAgentId(): string | null { + return this.backend?.getActiveAgentId() ?? null; + } + + /** + * Get the screen snapshot for the currently active agent. + */ + getActiveSnapshot(): AnsiOutput | null { + return this.backend?.getActiveSnapshot() ?? null; + } + + /** + * Get the screen snapshot for a specific agent. 
+ */ + getAgentSnapshot( + agentId: string, + scrollOffset: number = 0, + ): AnsiOutput | null { + return this.backend?.getAgentSnapshot(agentId, scrollOffset) ?? null; + } + + /** + * Get the maximum scrollback length for an agent's terminal buffer. + */ + getAgentScrollbackLength(agentId: string): number { + return this.backend?.getAgentScrollbackLength(agentId) ?? 0; + } + + /** + * Forward keyboard input to the currently active agent. + */ + forwardInput(data: string): boolean { + return this.backend?.forwardInput(data) ?? false; + } + + /** + * Resize all agent terminals. + */ + resizeAgents(cols: number, rows: number): void { + this.terminalCols = cols; + this.terminalRows = rows; + this.backend?.resizeAll(cols, rows); + } + + // ─── Session Lifecycle ───────────────────────────────────────── + + /** + * Start an Arena session. + * + * @param options - Arena start options + * @returns Promise resolving to the session result + */ + async start(options: ArenaStartOptions): Promise { + // Validate options + this.validateStartOptions(options); + + // Use caller-provided terminal size if available + if (options.cols && options.cols > 0) { + this.terminalCols = options.cols; + } + if (options.rows && options.rows > 0) { + this.terminalRows = options.rows; + } + + this.sessionId = this.config.getSessionId(); + this.worktreeDirName = await this.deriveWorktreeDirName(this.sessionId); + this.startedAt = Date.now(); + this.sessionStatus = ArenaSessionStatus.INITIALIZING; + this.masterAbortController = new AbortController(); + + const sourceRepoPath = this.config.getWorkingDir(); + const arenaSettings = this.config.getAgentsSettings().arena; + + this.arenaConfig = { + sessionId: this.sessionId, + task: options.task, + models: options.models, + maxRoundsPerAgent: + options.maxRoundsPerAgent ?? arenaSettings?.maxRoundsPerAgent, + timeoutSeconds: options.timeoutSeconds ?? 
arenaSettings?.timeoutSeconds, + approvalMode: options.approvalMode, + sourceRepoPath, + chatHistory: options.chatHistory, + }; + + debugLogger.info(`Starting Arena session: ${this.sessionId}`); + debugLogger.info(`Task: ${options.task}`); + debugLogger.info( + `Models: ${options.models.map((m) => m.modelId).join(', ')}`, + ); + + // Fail fast on missing git or non-repo directory before any UI output + // so the user gets a clean, single error message without the + // "Arena started…" banner. + const gitCheck = await this.worktreeService.checkGitAvailable(); + if (!gitCheck.available) { + throw new Error(gitCheck.error!); + } + const isRepo = await this.worktreeService.isGitRepository(); + if (!isRepo) { + throw new Error( + 'Failed to start arena: current directory is not a git repository.', + ); + } + + // Emit session start event + this.eventEmitter.emit(ArenaEventType.SESSION_START, { + sessionId: this.sessionId, + task: options.task, + models: options.models, + timestamp: Date.now(), + }); + + // Log arena session start telemetry + logArenaSessionStarted( + this.config, + makeArenaSessionStartedEvent({ + arena_session_id: this.sessionId, + model_ids: options.models.map((m) => m.modelId), + task_length: options.task.length, + }), + ); + + try { + // Detect and initialize the backend. + // Priority: explicit option > agents.displayMode setting > auto-detect + const displayMode = + options.displayMode ?? 
+ (this.config.getAgentsSettings().displayMode as + | DisplayMode + | undefined); + await this.initializeBackend(displayMode); + + // If cancelled during backend init, bail out early + if (this.masterAbortController?.signal.aborted) { + this.sessionStatus = ArenaSessionStatus.CANCELLED; + const result = await this.collectResults(); + this.emitSessionEnded('cancelled'); + return result; + } + + // Set up worktrees for all agents + this.emitProgress(`Setting up environment for agents…`); + await this.setupWorktrees(); + + // If cancelled during worktree setup, bail out early + if (this.masterAbortController?.signal.aborted) { + this.sessionStatus = ArenaSessionStatus.CANCELLED; + const result = await this.collectResults(); + this.emitSessionEnded('cancelled'); + return result; + } + + // Emit worktree info for each agent + const worktreeInfo = Array.from(this.agents.values()) + .map( + (agent, i) => + ` ${i + 1}. ${agent.model.modelId} → ${agent.worktree.path}`, + ) + .join('\n'); + this.emitProgress(`Environment ready. Agent worktrees:\n${worktreeInfo}`); + + // Start all agents in parallel via PTY + this.emitProgress('Launching agents…'); + this.sessionStatus = ArenaSessionStatus.RUNNING; + await this.runAgents(); + + // Mark session as idle (agents finished but still alive) unless + // already cancelled/timed out. + if (this.sessionStatus === ArenaSessionStatus.RUNNING) { + this.sessionStatus = ArenaSessionStatus.IDLE; + } + + // Collect results (uses this.sessionStatus for result status) + const result = await this.collectResults(); + this.cachedResult = result; + + // Emit session complete event + this.eventEmitter.emit(ArenaEventType.SESSION_COMPLETE, { + sessionId: this.sessionId, + result, + timestamp: Date.now(), + }); + + this.callbacks.onArenaComplete?.(result); + + // NOTE: session-ended telemetry is NOT emitted here. + // The session is "done running" but the user hasn't picked a winner + // or discarded yet. 
The ended event fires from applyAgentResult() + // (status: 'selected') or cleanup/cleanupRuntime (status: 'discarded'). + + return result; + } catch (error) { + this.sessionStatus = ArenaSessionStatus.FAILED; + + const errorMessage = + error instanceof Error ? error.message : String(error); + + // Emit session error event + this.eventEmitter.emit(ArenaEventType.SESSION_ERROR, { + sessionId: this.sessionId, + error: errorMessage, + timestamp: Date.now(), + }); + + // Log arena session failed telemetry + this.emitSessionEnded('failed'); + + this.callbacks.onArenaError?.( + error instanceof Error ? error : new Error(errorMessage), + ); + + throw error; + } + } + + /** + * Cancel the current Arena session. + */ + async cancel(): Promise { + if (!this.sessionId) { + return; + } + + debugLogger.info(`Cancelling Arena session: ${this.sessionId}`); + + // Stop polling + this.stopPolling(); + + // Abort the master controller + this.masterAbortController?.abort(); + + // Force stop all PTY processes (sends Ctrl-C) + this.backend?.stopAll(); + + // Final stats sync so telemetry reflects the latest counters. + // For PTY agents: read each agent's status file one last time. + // For in-process agents: pull counters from the interactive object. + await this.pollAgentStatuses().catch(() => {}); + for (const agent of this.agents.values()) { + if (!isTerminalStatus(agent.status)) { + agent.syncStats?.(); + } + } + + // Update agent statuses — skip agents already in a terminal state + // (COMPLETED, FAILED, CANCELLED) so we don't overwrite a successful result. + for (const agent of this.agents.values()) { + if (!isTerminalStatus(agent.status)) { + agent.abortController.abort(); + agent.stats.durationMs = Date.now() - agent.startedAt; + this.updateAgentStatus(agent.agentId, AgentStatus.CANCELLED); + } + } + + this.sessionStatus = ArenaSessionStatus.CANCELLED; + + // NOTE: session-ended telemetry is NOT emitted here. 
+ // start() emits 'cancelled' when it unwinds through its early-cancel + // paths. If cancel() is called after start() has already returned + // (all agents done, user viewing results), the ended event fires + // from cleanup() / cleanupRuntime() instead. + } + + /** + * Clean up the Arena session (remove worktrees, kill processes, etc.). + */ + async cleanup(): Promise { + if (!this.sessionId) { + return; + } + + debugLogger.info(`Cleaning up Arena session: ${this.sessionId}`); + + // If no session-ended event was emitted yet, emit before tearing down. + // Use 'cancelled' if the session was explicitly stopped, 'discarded' if + // the user simply left without picking a winner. + this.emitSessionEnded( + this.sessionStatus === ArenaSessionStatus.CANCELLED + ? 'cancelled' + : 'discarded', + ); + + // Stop polling in case cleanup is called without cancel + this.stopPolling(); + + // Remove in-process event bridge listeners + this.teardownEventBridge(); + + // Clean up backend resources + if (this.backend) { + await this.backend.cleanup(); + } + + // Clean up worktrees + await this.worktreeService.cleanupSession(this.worktreeDirName!); + + this.agents.clear(); + this.cachedResult = null; + this.sessionId = undefined; + this.worktreeDirName = undefined; + this.arenaConfig = undefined; + this.backend = null; + this.sessionEndedLogged = false; + } + + /** + * Clean up runtime resources (processes, backend, memory) without removing + * worktrees or session files on disk. Used when preserveArtifacts is enabled. + */ + async cleanupRuntime(): Promise { + if (!this.sessionId) { + return; + } + + debugLogger.info( + `Cleaning up Arena runtime (preserving artifacts): ${this.sessionId}`, + ); + + // If no session-ended event was emitted yet, emit before tearing down. + this.emitSessionEnded( + this.sessionStatus === ArenaSessionStatus.CANCELLED + ? 
'cancelled' + : 'discarded', + ); + + this.stopPolling(); + + // Remove in-process event bridge listeners + this.teardownEventBridge(); + + if (this.backend) { + await this.backend.cleanup(); + } + + this.agents.clear(); + this.cachedResult = null; + this.sessionId = undefined; + this.worktreeDirName = undefined; + this.arenaConfig = undefined; + this.backend = null; + this.sessionEndedLogged = false; + } + + /** + * Apply the result from a specific agent to the main working directory. + */ + async applyAgentResult( + agentId: string, + ): Promise<{ success: boolean; error?: string }> { + const agent = this.agents.get(agentId); + if (!agent) { + return { success: false, error: `Agent ${agentId} not found` }; + } + + if (!isSuccessStatus(agent.status)) { + return { + success: false, + error: `Agent ${agentId} has not completed (current status: ${agent.status})`, + }; + } + + const applyResult = await this.worktreeService.applyWorktreeChanges( + agent.worktree.path, + ); + + if (applyResult.success) { + this.emitSessionEnded('selected', agent.model.modelId); + } + + return applyResult; + } + + /** + * Get the diff for a specific agent's changes. + */ + async getAgentDiff(agentId: string): Promise { + const agent = this.agents.get(agentId); + if (!agent) { + return `Agent ${agentId} not found`; + } + + return this.worktreeService.getWorktreeDiff(agent.worktree.path); + } + + // ─── Private: Telemetry ─────────────────────────────────────── + + /** + * Emit the `arena_session_ended` telemetry event exactly once. 
+ * + * Called from: + * - start() early-cancel paths → 'cancelled' + * - start() catch block → 'failed' + * - applyAgentResult() on success → 'selected' (with winner) + * - cleanup() / cleanupRuntime() → 'discarded' (user left without picking) + */ + private emitSessionEnded( + status: ArenaSessionEndedStatus, + winnerModelId?: string, + ): void { + if (this.sessionEndedLogged) return; + this.sessionEndedLogged = true; + + const agents = Array.from(this.agents.values()); + logArenaSessionEnded( + this.config, + makeArenaSessionEndedEvent({ + arena_session_id: this.sessionId ?? '', + status, + duration_ms: this.startedAt ? Date.now() - this.startedAt : 0, + display_backend: this.backend?.type, + agent_count: agents.length, + completed_agents: agents.filter( + (a) => a.status === AgentStatus.COMPLETED, + ).length, + failed_agents: agents.filter((a) => a.status === AgentStatus.FAILED) + .length, + cancelled_agents: agents.filter( + (a) => a.status === AgentStatus.CANCELLED, + ).length, + winner_model_id: winnerModelId, + }), + ); + } + + // ─── Private: Progress ───────────────────────────────────────── + + /** + * Emit a progress message via SESSION_UPDATE so the UI can display + * setup status. 
+ */ + private emitProgress( + message: string, + type: 'info' | 'warning' | 'success' = 'info', + ): void { + if (!this.sessionId) return; + this.eventEmitter.emit(ArenaEventType.SESSION_UPDATE, { + sessionId: this.sessionId, + type, + message, + timestamp: Date.now(), + }); + } + + // ─── Private: Validation ─────────────────────────────────────── + + private validateStartOptions(options: ArenaStartOptions): void { + if (!options.models || options.models.length < 2) { + throw new Error('Arena requires at least 2 models to compare'); + } + + if (options.models.length > ARENA_MAX_AGENTS) { + throw new Error(`Arena supports a maximum of ${ARENA_MAX_AGENTS} models`); + } + + if (!options.task || options.task.trim().length === 0) { + throw new Error('Arena requires a task/prompt'); + } + + // Check for duplicate model IDs + const modelIds = options.models.map((m) => m.modelId); + const uniqueIds = new Set(modelIds); + if (uniqueIds.size !== modelIds.length) { + throw new Error('Arena models must have unique identifiers'); + } + + // Check for collisions after filesystem-safe normalization. + // safeAgentId replaces characters like / \ : to '--', so distinct + // model IDs (e.g. "org/model" and "org--model") can map to the same + // status/control file path and corrupt each other's state. + const safeIds = modelIds.map((id) => safeAgentId(id)); + const uniqueSafeIds = new Set(safeIds); + if (uniqueSafeIds.size !== safeIds.length) { + const collisions = modelIds.filter( + (id, i) => safeIds.indexOf(safeIds[i]!) !== i, + ); + throw new Error( + `Arena model IDs collide after path normalization: ${collisions.join(', ')}. ` + + 'Choose model IDs that remain unique when special characters (/ \\ : etc.) are replaced.', + ); + } + } + + // ─── Private: Backend Initialization ─────────────────────────── + + /** + * Initialize the backend. 
+ */ + private async initializeBackend(displayMode?: DisplayMode): Promise<void> { + const { backend, warning } = await detectBackend(displayMode, this.config); + await backend.init(); + this.backend = backend; + + if (warning && this.sessionId) { + this.eventEmitter.emit(ArenaEventType.SESSION_UPDATE, { + sessionId: this.sessionId, + type: 'warning', + message: warning, + timestamp: Date.now(), + }); + } + + // Surface attach hint for external tmux sessions + const attachHint = backend.getAttachHint(); + if (attachHint && this.sessionId) { + this.eventEmitter.emit(ArenaEventType.SESSION_UPDATE, { + sessionId: this.sessionId, + type: 'info', + message: `To view agent panes, run: ${attachHint}`, + timestamp: Date.now(), + }); + } + } + + // ─── Private: Worktree Setup ─────────────────────────────────── + + /** + * Derive a short, filesystem-friendly directory name from the full session ID. + * Uses the first 8 hex characters of the UUID. If that path already exists, + * appends a numeric suffix (-2, -3, …) until an unused name is found. 
+ */ + private async deriveWorktreeDirName(sessionId: string): Promise<string> { + const shortId = sessionId.replaceAll('-', '').slice(0, 8); + let candidate = shortId; + let suffix = 2; + + while (true) { + const candidatePath = path.join(this.arenaBaseDir, candidate); + try { + await fs.access(candidatePath); + candidate = `${shortId}-${suffix}`; + suffix++; + } catch { + return candidate; + } + } + } + + private async setupWorktrees(): Promise<void> { + if (!this.arenaConfig) { + throw new Error('Arena config not initialized'); + } + + debugLogger.info('Setting up worktrees for Arena agents'); + + const worktreeNames = this.arenaConfig.models.map((m) => m.modelId); + + const result = await this.worktreeService.setupWorktrees({ + sessionId: this.worktreeDirName!, + sourceRepoPath: this.arenaConfig.sourceRepoPath, + worktreeNames, + metadata: { arenaSessionId: this.arenaConfig.sessionId }, + }); + + if (!result.success) { + const errorMessages = result.errors + .map((e) => `${e.name}: ${e.error}`) + .join('; '); + throw new Error(`Failed to set up worktrees: ${errorMessages}`); + } + + // Create agent states + for (let i = 0; i < this.arenaConfig.models.length; i++) { + const model = this.arenaConfig.models[i]!; + const worktreeName = worktreeNames[i]!; + const worktree = result.worktreesByName[worktreeName]; + + if (!worktree) { + throw new Error( + `No worktree created for model ${model.modelId} (name: ${worktreeName})`, + ); + } + + const agentId = model.modelId; + + const agentState: ArenaAgentState = { + agentId, + model, + status: AgentStatus.INITIALIZING, + worktree, + abortController: new AbortController(), + agentSessionId: `${this.sessionId}#${agentId}`, + stats: { + rounds: 0, + totalTokens: 0, + inputTokens: 0, + outputTokens: 0, + durationMs: 0, + toolCalls: 0, + successfulToolCalls: 0, + failedToolCalls: 0, + }, + startedAt: 0, + accumulatedText: '', + }; + + this.agents.set(agentId, agentState); + } + + debugLogger.info(`Created ${this.agents.size} agent 
worktrees`); + } + + // ─── Private: Agent Execution ────────────────────────────────── + + private async runAgents(): Promise { + if (!this.arenaConfig) { + throw new Error('Arena config not initialized'); + } + + debugLogger.info('Starting Arena agents sequentially via backend'); + + const backend = this.requireBackend(); + + // Wire up exit handler on the backend + backend.setOnAgentExit((agentId, exitCode, signal) => { + this.handleAgentExit(agentId, exitCode, signal); + }); + + const isInProcess = backend.type === DISPLAY_MODE.IN_PROCESS; + + // Spawn agents sequentially — each spawn completes before starting the next. + // This creates a visual effect where panes appear one by one. + for (const agent of this.agents.values()) { + await this.spawnAgentPty(agent); + } + + this.emitProgress('All agents are now live and working on the task.'); + + // For in-process mode, set up event bridges instead of file-based polling. + // For PTY mode, start polling agent status files. + if (isInProcess) { + this.setupInProcessEventBridge(backend as InProcessBackend); + } else { + this.startPolling(); + } + + // Set up timeout + const timeoutSeconds = this.arenaConfig.timeoutSeconds; + + // Wait for all agents to reach IDLE or TERMINATED, or timeout. + // Unlike waitForAll (which waits for PTY exit), this resolves as soon + // as every agent has finished its first task in interactive mode. + const allSettled = await this.waitForAllAgentsSettled( + timeoutSeconds ? 
timeoutSeconds * 1000 : undefined, + ); + + // Stop polling when all agents are done (no-op for in-process mode) + if (!isInProcess) { + this.stopPolling(); + } + + if (!allSettled) { + debugLogger.info('Arena session timed out, stopping remaining agents'); + this.sessionStatus = ArenaSessionStatus.CANCELLED; + + // Terminate remaining active agents + for (const agent of this.agents.values()) { + if (!isTerminalStatus(agent.status)) { + backend.stopAgent(agent.agentId); + agent.abortController.abort(); + agent.stats.durationMs = Date.now() - agent.startedAt; + this.updateAgentStatus(agent.agentId, AgentStatus.CANCELLED); + } + } + } + + debugLogger.info('All Arena agents settled or timed out'); + } + + private async spawnAgentPty(agent: ArenaAgentState): Promise<void> { + if (!this.arenaConfig) { + return; + } + + const backend = this.requireBackend(); + + const { agentId, model, worktree } = agent; + + debugLogger.info(`Spawning agent PTY: ${agentId}`); + + agent.startedAt = Date.now(); + this.updateAgentStatus(agentId, AgentStatus.RUNNING); + + // Emit agent start event + this.eventEmitter.emit(ArenaEventType.AGENT_START, { + sessionId: this.arenaConfig.sessionId, + agentId, + model, + worktreePath: worktree.path, + timestamp: Date.now(), + }); + + this.callbacks.onAgentStart?.(agentId, model); + + // Build the CLI command to spawn the agent as a full interactive instance + const spawnConfig = this.buildAgentSpawnConfig(agent); + + try { + await backend.spawnAgent(spawnConfig); + } catch (error) { + const errorMessage = + error instanceof Error ? 
error.message : String(error); + agent.error = errorMessage; + this.updateAgentStatus(agentId, AgentStatus.FAILED); + + this.eventEmitter.emit(ArenaEventType.AGENT_ERROR, { + sessionId: this.requireConfig().sessionId, + agentId, + error: errorMessage, + timestamp: Date.now(), + }); + + debugLogger.error(`Failed to spawn agent: ${agentId}`, error); + } + } + + private requireBackend(): Backend { + if (!this.backend) { + throw new Error('Arena backend not initialized.'); + } + return this.backend; + } + + private requireConfig(): ArenaConfig { + if (!this.arenaConfig) { + throw new Error('Arena config not initialized'); + } + return this.arenaConfig; + } + + private handleAgentExit( + agentId: string, + exitCode: number | null, + _signal: number | null, + ): void { + const agent = this.agents.get(agentId); + if (!agent) { + return; + } + + // Already in a terminal state (e.g. cancelled via cancel()) + if (isTerminalStatus(agent.status)) { + return; + } + + agent.stats.durationMs = Date.now() - agent.startedAt; + + if ( + exitCode !== 0 && + exitCode !== null && + !agent.abortController.signal.aborted + ) { + agent.error = `Process exited with code ${exitCode}`; + this.eventEmitter.emit(ArenaEventType.AGENT_ERROR, { + sessionId: this.requireConfig().sessionId, + agentId, + error: agent.error, + timestamp: Date.now(), + }); + } + + this.updateAgentStatus( + agentId, + agent.abortController.signal.aborted + ? AgentStatus.CANCELLED + : AgentStatus.FAILED, + ); + debugLogger.info(`Agent exited: ${agentId} (exit code: ${exitCode})`); + } + + /** + * Build the spawn configuration for an agent subprocess. + * + * The agent is launched as a full interactive CLI instance, running in + * its own worktree with the specified model. The task is passed via + * the --prompt-interactive argument so the CLI enters interactive mode and + * immediately starts working on the task. 
+ */ + private buildAgentSpawnConfig(agent: ArenaAgentState): AgentSpawnConfig { + const { agentId, model, worktree } = agent; + + // Build CLI args for spawning an interactive agent. + // Note: --cwd is NOT a valid CLI flag; the working directory is set + // via AgentSpawnConfig.cwd which becomes the PTY's cwd. + const args: string[] = []; + + // Set the model and auth type + args.push('--model', model.modelId); + args.push('--auth-type', model.authType); + + // Pass the task via --prompt-interactive (-i) so the CLI enters + // interactive mode AND immediately starts working on the task. + // (--prompt runs non-interactively and would exit after completion.) + if (this.arenaConfig?.task) { + args.push('--prompt-interactive', this.arenaConfig.task); + } + + // Set approval mode if specified + if (this.arenaConfig?.approvalMode) { + args.push('--approval-mode', this.arenaConfig.approvalMode); + } + + // Pass the agent's session ID so the child CLI uses it for telemetry + // correlation instead of generating a random UUID. + args.push('--session-id', agent.agentSessionId); + + // Construct env vars for the agent + const arenaSessionDir = this.getArenaSessionDir(); + const env: Record<string, string> = { + QWEN_CODE: '1', + ARENA_AGENT_ID: agentId, + ARENA_SESSION_ID: this.arenaConfig?.sessionId ?? 
'', + ARENA_SESSION_DIR: arenaSessionDir, + }; + + // If the model has auth overrides, pass them via env + if (model.apiKey) { + env['QWEN_API_KEY'] = model.apiKey; + } + if (model.baseUrl) { + env['QWEN_BASE_URL'] = model.baseUrl; + } + + const spawnConfig: AgentSpawnConfig = { + agentId, + command: process.execPath, // Use the same Node.js binary + args: [path.resolve(process.argv[1]!), ...args], // Re-launch the CLI entry point (must be absolute path since cwd changes) + cwd: worktree.path, + env, + cols: this.terminalCols, + rows: this.terminalRows, + inProcess: { + agentName: model.modelId, + initialTask: this.arenaConfig?.task, + runtimeConfig: { + promptConfig: { + systemPrompt: getCoreSystemPrompt( + this.config.getUserMemory(), + model.modelId, + ), + }, + modelConfig: { model: model.modelId }, + runConfig: { + max_turns: this.arenaConfig?.maxRoundsPerAgent, + max_time_minutes: this.arenaConfig?.timeoutSeconds + ? Math.ceil(this.arenaConfig.timeoutSeconds / 60) + : undefined, + }, + }, + authOverrides: { + authType: model.authType, + apiKey: model.apiKey, + baseUrl: model.baseUrl, + }, + chatHistory: this.arenaConfig?.chatHistory, + }, + }; + + debugLogger.info( + `[buildAgentSpawnConfig] agentId=${agentId}, command=${spawnConfig.command}, cliEntry=${process.argv[1]}, resolvedEntry=${path.resolve(process.argv[1]!)}`, + ); + debugLogger.info( + `[buildAgentSpawnConfig] args=${JSON.stringify(spawnConfig.args)}`, + ); + debugLogger.info( + `[buildAgentSpawnConfig] cwd=${spawnConfig.cwd}, env keys=${Object.keys(env).join(',')}`, + ); + + return spawnConfig; + } + + // ─── Private: Status & Results ───────────────────────────────── + + /** Decide whether a status transition is valid. Returns the new status or null. 
*/ + private resolveTransition( + current: AgentStatus, + incoming: AgentStatus, + ): AgentStatus | null { + if (current === incoming) return null; + if (isTerminalStatus(current)) { + // Allow revival: COMPLETED → RUNNING (agent received new input) + if ( + current === AgentStatus.COMPLETED && + incoming === AgentStatus.RUNNING + ) { + return incoming; + } + return null; + } + return incoming; + } + + private updateAgentStatus( + agentId: string, + newStatus: AgentStatus, + options?: { roundCancelledByUser?: boolean }, + ): void { + const agent = this.agents.get(agentId); + if (!agent) { + return; + } + + const previousStatus = agent.status; + agent.status = newStatus; + + this.eventEmitter.emit(ArenaEventType.AGENT_STATUS_CHANGE, { + sessionId: this.requireConfig().sessionId, + agentId, + previousStatus, + newStatus, + timestamp: Date.now(), + }); + + const label = agent.model.modelId; + + // Emit a success message when an agent finishes its initial task. + if ( + this.sessionStatus === ArenaSessionStatus.RUNNING && + previousStatus === AgentStatus.RUNNING && + newStatus === AgentStatus.IDLE + ) { + if (options?.roundCancelledByUser) { + this.emitProgress(`Agent ${label} is cancelled by user.`, 'warning'); + } else { + this.emitProgress(`Agent ${label} finished initial task.`, 'success'); + } + } + + // Emit progress messages for follow-up transitions (only after + // the initial task — the session is IDLE once all agents first settle). 
+ if (this.sessionStatus === ArenaSessionStatus.IDLE) { + if ( + previousStatus === AgentStatus.IDLE && + newStatus === AgentStatus.RUNNING + ) { + this.emitProgress(`Agent ${label} is working on a follow-up task…`); + } else if ( + previousStatus === AgentStatus.RUNNING && + newStatus === AgentStatus.IDLE + ) { + if (options?.roundCancelledByUser) { + this.emitProgress(`Agent ${label} is cancelled by user.`, 'warning'); + } else { + this.emitProgress( + `Agent ${label} finished follow-up task.`, + 'success', + ); + } + } + } + + // Emit AGENT_COMPLETE when agent reaches a terminal status + if (isTerminalStatus(newStatus)) { + const result = this.buildAgentResult(agent); + + this.eventEmitter.emit(ArenaEventType.AGENT_COMPLETE, { + sessionId: this.requireConfig().sessionId, + agentId, + result, + timestamp: Date.now(), + }); + + // Log arena agent completed telemetry + const agentTelemetryStatus = + newStatus === AgentStatus.COMPLETED + ? ('completed' as const) + : newStatus === AgentStatus.FAILED + ? ('failed' as const) + : ('cancelled' as const); + logArenaAgentCompleted( + this.config, + makeArenaAgentCompletedEvent({ + arena_session_id: this.sessionId ?? 
'', + agent_session_id: agent.agentSessionId, + agent_model_id: agent.model.modelId, + status: agentTelemetryStatus, + duration_ms: agent.stats.durationMs, + rounds: agent.stats.rounds, + total_tokens: agent.stats.totalTokens, + input_tokens: agent.stats.inputTokens, + output_tokens: agent.stats.outputTokens, + tool_calls: agent.stats.toolCalls, + successful_tool_calls: agent.stats.successfulToolCalls, + failed_tool_calls: agent.stats.failedToolCalls, + }), + ); + + this.callbacks.onAgentComplete?.(result); + } + } + + private buildAgentResult(agent: ArenaAgentState): ArenaAgentResult { + return { + agentId: agent.agentId, + model: agent.model, + status: agent.status, + worktree: agent.worktree, + finalText: agent.accumulatedText || undefined, + error: agent.error, + stats: { ...agent.stats }, + startedAt: agent.startedAt, + endedAt: Date.now(), + }; + } + + // ─── Arena Session Directory ────────────────────────────────── + + /** + * Get the arena session directory for the current session. + * All status and control files are stored here. + * + * Returns the absolute path to the session directory, e.g. + * `~/.qwen/worktrees//`. The directory contains: + * - `config.json` — consolidated session config + per-agent status + * - `agents/.json` — individual agent status files + * - `control/` — control signals (shutdown, cancel) + */ + getArenaSessionDir(): string { + if (!this.arenaConfig) { + throw new Error('Arena config not initialized'); + } + return GitWorktreeService.getSessionDir( + this.worktreeDirName!, + this.arenaBaseDir, + ); + } + + // ─── Private: Polling & Control Signals ────────────────────── + + /** + * Wait for all agents to reach IDLE or TERMINATED state. + * Returns true if all agents settled, false if timeout was reached. 
+ */ + private waitForAllAgentsSettled(timeoutMs?: number): Promise<boolean> { + return new Promise((resolve) => { + const checkSettled = () => { + for (const agent of this.agents.values()) { + if (!isSettledStatus(agent.status)) { + return false; + } + } + return true; + }; + + if (checkSettled()) { + resolve(true); + return; + } + + let timeoutHandle: ReturnType<typeof setTimeout> | undefined; + if (timeoutMs !== undefined) { + timeoutHandle = setTimeout(() => { + clearInterval(pollHandle); + resolve(false); + }, timeoutMs); + } + + // Re-check periodically (piggybacks on the same polling interval) + const pollHandle = setInterval(() => { + if (checkSettled()) { + clearInterval(pollHandle); + if (timeoutHandle) clearTimeout(timeoutHandle); + resolve(true); + } + }, ARENA_POLL_INTERVAL_MS); + }); + } + + /** + * Start polling agent status files at a fixed interval. + */ + private startPolling(): void { + if (this.pollingInterval) { + return; + } + + this.pollingInterval = setInterval(() => { + this.pollAgentStatuses().catch((error) => { + debugLogger.error('Error polling agent statuses:', error); + }); + }, ARENA_POLL_INTERVAL_MS); + } + + /** + * Stop the polling interval. + */ + private stopPolling(): void { + if (this.pollingInterval) { + clearInterval(this.pollingInterval); + this.pollingInterval = null; + } + } + + /** + * Set up event bridges for in-process agents. + * Subscribes to each AgentInteractive's events to update ArenaManager state. + * Listeners are tracked in `eventBridgeCleanups` for teardown. + */ + private setupInProcessEventBridge(backend: InProcessBackend): void { + for (const agent of this.agents.values()) { + const interactive = backend.getAgent(agent.agentId); + if (!interactive) continue; + + const emitter = interactive.getEventEmitter(); + if (!emitter) continue; + + // AgentInteractive emits canonical AgentStatus values — no mapping needed. 
+ + const syncStats = () => { + const { totalToolCalls, totalDurationMs, ...rest } = + interactive.getStats(); + Object.assign(agent.stats, rest, { + toolCalls: totalToolCalls, + durationMs: totalDurationMs, + }); + }; + + agent.syncStats = syncStats; + + const applyStatus = ( + incoming: AgentStatus, + options?: { roundCancelledByUser?: boolean }, + ) => { + const resolved = this.resolveTransition(agent.status, incoming); + if (!resolved) return; + if (resolved === AgentStatus.FAILED) { + agent.error = + interactive.getLastRoundError() || interactive.getError(); + } + if (isSettledStatus(resolved)) { + agent.stats.durationMs = Date.now() - agent.startedAt; + } + this.updateAgentStatus(agent.agentId, resolved, options); + }; + + // Sync stats before mapping so counters are up-to-date even when + // the provider omits usage_metadata events. + const onStatusChange = (event: AgentStatusChangeEvent) => { + syncStats(); + applyStatus(event.newStatus, { + roundCancelledByUser: event.roundCancelledByUser, + }); + // Write status files so external consumers get a consistent + // file-based view regardless of backend mode. + this.flushInProcessStatusFiles().catch((err) => + debugLogger.error('Failed to flush in-process status files:', err), + ); + }; + + const onUsageMetadata = () => { + syncStats(); + this.flushInProcessStatusFiles().catch((err) => + debugLogger.error('Failed to flush in-process status files:', err), + ); + }; + + emitter.on(AgentEventType.STATUS_CHANGE, onStatusChange); + emitter.on(AgentEventType.USAGE_METADATA, onUsageMetadata); + + // Store cleanup functions so listeners can be removed during teardown + this.eventBridgeCleanups.push(() => { + emitter.off(AgentEventType.STATUS_CHANGE, onStatusChange); + emitter.off(AgentEventType.USAGE_METADATA, onUsageMetadata); + }); + + // Reconcile: if the agent already transitioned before the bridge was + // attached (e.g. 
fast completion or createChat failure during spawn), + // backfill stats and apply its current status now so + // waitForAllAgentsSettled sees it. + syncStats(); + applyStatus(interactive.getStatus()); + } + + // Flush status files once after reconciliation so that agents which + // already settled before the bridge was attached still get written to disk. + this.flushInProcessStatusFiles().catch((err) => + debugLogger.error('Failed to flush in-process status files:', err), + ); + } + + /** + * Remove all event bridge listeners registered by setupInProcessEventBridge. + */ + private teardownEventBridge(): void { + for (const cleanup of this.eventBridgeCleanups) { + cleanup(); + } + this.eventBridgeCleanups.length = 0; + } + + /** + * Read per-agent status files from `/agents/` directory. + * Updates agent stats, invokes the onAgentStatsUpdate callback, and merges + * the consolidated agent status into `config.json` at the arena session root. + */ + private async pollAgentStatuses(): Promise<void> { + const sessionDir = this.getArenaSessionDir(); + const agentsDir = path.join(sessionDir, 'agents'); + const consolidatedAgents: Record<string, ArenaStatusFile> = {}; + + for (const agent of this.agents.values()) { + // Only poll agents that are actively working + if ( + isSettledStatus(agent.status) || + agent.status === AgentStatus.INITIALIZING + ) { + continue; + } + + try { + const statusPath = path.join( + agentsDir, + `${safeAgentId(agent.agentId)}.json`, + ); + const content = await fs.readFile(statusPath, 'utf-8'); + const statusFile = JSON.parse(content) as ArenaStatusFile; + + // Collect for consolidated file + consolidatedAgents[agent.agentId] = statusFile; + + // Update agent stats from the status file. 
+ agent.stats = { + ...agent.stats, + ...statusFile.stats, + }; + + // Detect state transitions from the sideband status file + const resolved = this.resolveTransition( + agent.status, + statusFile.status, + ); + if (resolved) { + if (resolved === AgentStatus.FAILED && statusFile.error) { + agent.error = statusFile.error; + } + this.updateAgentStatus(agent.agentId, resolved); + } + + this.callbacks.onAgentStatsUpdate?.(agent.agentId, statusFile.stats); + } catch (error: unknown) { + // File may not exist yet (agent hasn't written first status) + if (isNodeError(error) && error.code === 'ENOENT') { + continue; + } + debugLogger.error( + `Error reading status for agent ${agent.agentId}:`, + error, + ); + } + } + + // Merge consolidated agent status into config.json at the arena session root + if (Object.keys(consolidatedAgents).length > 0) { + await this.writeConsolidatedStatus(consolidatedAgents); + } + } + + /** + * Merge agent status data into the arena session's config.json. + * Reads the existing config, adds/updates `updatedAt` and `agents`, + * then writes back atomically (temp file → rename). 
+ */ + private async writeConsolidatedStatus( + agents: Record<string, ArenaStatusFile>, + ): Promise<void> { + const sessionDir = this.getArenaSessionDir(); + const configPath = path.join(sessionDir, 'config.json'); + + try { + // Read existing config.json written by GitWorktreeService + let config: ArenaConfigFile; + try { + const content = await fs.readFile(configPath, 'utf-8'); + config = JSON.parse(content) as ArenaConfigFile; + } catch { + // If config.json doesn't exist yet, create a minimal one + const arenaConfig = this.requireConfig(); + config = { + arenaSessionId: arenaConfig.sessionId, + sourceRepoPath: arenaConfig.sourceRepoPath, + worktreeNames: arenaConfig.models.map( + (m) => m.displayName || m.modelId, + ), + createdAt: this.startedAt!, + }; + } + + // Merge in the agent status data + config.updatedAt = Date.now(); + config.agents = agents; + + await atomicWriteJSON(configPath, config); + } catch (error) { + debugLogger.error( + 'Failed to write consolidated status to config.json:', + error, + ); + } + } + + /** + * Build an ArenaStatusFile snapshot from in-memory agent state. + */ + private buildStatusFile(agent: ArenaAgentState): ArenaStatusFile { + return { + agentId: agent.agentId, + status: agent.status, + updatedAt: Date.now(), + rounds: agent.stats.rounds, + stats: { ...agent.stats }, + finalSummary: null, + error: agent.error ?? null, + }; + } + + /** + * Write status files for all in-process agents and update the + * consolidated config.json. + * + * In PTY mode these files are written by ArenaAgentClient inside each + * child process. In in-process mode there is no child process, so the + * ArenaManager writes them directly so that external consumers + * (e.g. an orchestrating agent) get a consistent file-based view + * regardless of backend. 
+ */ + private async flushInProcessStatusFiles(): Promise<void> { + const sessionDir = this.getArenaSessionDir(); + const agentsDir = path.join(sessionDir, 'agents'); + await fs.mkdir(agentsDir, { recursive: true }); + + const consolidatedAgents: Record<string, ArenaStatusFile> = {}; + + for (const agent of this.agents.values()) { + const statusFile = this.buildStatusFile(agent); + const filePath = path.join( + agentsDir, + `${safeAgentId(agent.agentId)}.json`, + ); + await atomicWriteJSON(filePath, statusFile); + consolidatedAgents[agent.agentId] = statusFile; + } + + if (Object.keys(consolidatedAgents).length > 0) { + await this.writeConsolidatedStatus(consolidatedAgents); + } + } + + /** + * Write a control signal to the arena session's control/ directory. + * The child agent consumes (reads + deletes) this file. + */ + async sendControlSignal( + agentId: string, + type: ArenaControlSignal['type'], + reason: string, + ): Promise<void> { + const agent = this.agents.get(agentId); + if (!agent) { + debugLogger.error( + `Cannot send control signal: agent ${agentId} not found`, + ); + return; + } + + const controlSignal: ArenaControlSignal = { + type, + reason, + timestamp: Date.now(), + }; + + const sessionDir = this.getArenaSessionDir(); + const controlDir = path.join(sessionDir, 'control'); + const controlPath = path.join(controlDir, `${safeAgentId(agentId)}.json`); + + try { + await fs.mkdir(controlDir, { recursive: true }); + await fs.writeFile( + controlPath, + JSON.stringify(controlSignal, null, 2), + 'utf-8', + ); + debugLogger.info( + `Sent ${type} control signal to agent ${agentId}: ${reason}`, + ); + } catch (error) { + debugLogger.error( + `Failed to send control signal to agent ${agentId}:`, + error, + ); + } + } + + private async collectResults(): Promise<ArenaSessionResult> { + if (!this.arenaConfig) { + throw new Error('Arena config not initialized'); + } + + const agents: ArenaAgentResult[] = []; + + for (const agent of this.agents.values()) { + const result = this.buildAgentResult(agent); + + // Get diff 
for agents that finished their task (IDLE or COMPLETED) + if (isSuccessStatus(agent.status)) { + try { + result.diff = await this.worktreeService.getWorktreeDiff( + agent.worktree.path, + ); + } catch (error) { + debugLogger.error( + `Failed to get diff for agent ${agent.agentId}:`, + error, + ); + } + } + + agents.push(result); + } + + const endedAt = Date.now(); + + return { + sessionId: this.arenaConfig.sessionId, + task: this.arenaConfig.task, + status: this.sessionStatus, + agents, + startedAt: this.startedAt!, + endedAt, + totalDurationMs: endedAt - this.startedAt!, + wasRepoInitialized: false, + }; + } +} diff --git a/packages/core/src/agents/arena/arena-events.ts b/packages/core/src/agents/arena/arena-events.ts new file mode 100644 index 0000000000..def7c24440 --- /dev/null +++ b/packages/core/src/agents/arena/arena-events.ts @@ -0,0 +1,184 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { EventEmitter } from 'events'; +import type { + ArenaModelConfig, + ArenaAgentResult, + ArenaSessionResult, +} from './types.js'; +import type { AgentStatus } from '../runtime/agent-types.js'; + +/** + * Arena event types. 
+ */ +export enum ArenaEventType { + /** Arena session started */ + SESSION_START = 'session_start', + /** Informational or warning update during session lifecycle */ + SESSION_UPDATE = 'session_update', + /** Arena session completed */ + SESSION_COMPLETE = 'session_complete', + /** Arena session failed */ + SESSION_ERROR = 'session_error', + /** Agent started */ + AGENT_START = 'agent_start', + /** Agent status changed */ + AGENT_STATUS_CHANGE = 'agent_status_change', + /** Agent completed */ + AGENT_COMPLETE = 'agent_complete', + /** Agent error */ + AGENT_ERROR = 'agent_error', +} + +export type ArenaEvent = + | 'session_start' + | 'session_update' + | 'session_complete' + | 'session_error' + | 'agent_start' + | 'agent_status_change' + | 'agent_complete' + | 'agent_error'; + +/** + * Event payload for session start. + */ +export interface ArenaSessionStartEvent { + sessionId: string; + task: string; + models: ArenaModelConfig[]; + timestamp: number; +} + +/** + * Event payload for session complete. + */ +export interface ArenaSessionCompleteEvent { + sessionId: string; + result: ArenaSessionResult; + timestamp: number; +} + +/** + * Event payload for session error. + */ +export interface ArenaSessionErrorEvent { + sessionId: string; + error: string; + timestamp: number; +} + +/** + * Event payload for agent start. + */ +export interface ArenaAgentStartEvent { + sessionId: string; + agentId: string; + model: ArenaModelConfig; + worktreePath: string; + timestamp: number; +} + +/** + * Event payload for agent error. + */ +export interface ArenaAgentErrorEvent { + sessionId: string; + agentId: string; + error: string; + timestamp: number; +} + +/** + * Event payload for agent complete. + */ +export interface ArenaAgentCompleteEvent { + sessionId: string; + agentId: string; + result: ArenaAgentResult; + timestamp: number; +} + +/** + * Event payload for agent status change. 
+ */ +export interface ArenaAgentStatusChangeEvent { + sessionId: string; + agentId: string; + previousStatus: AgentStatus; + newStatus: AgentStatus; + timestamp: number; +} + +/** + * Event payload for session update (informational or warning). + */ +export type ArenaSessionUpdateType = 'info' | 'warning' | 'success'; + +export interface ArenaSessionUpdateEvent { + sessionId: string; + type: ArenaSessionUpdateType; + message: string; + timestamp: number; +} + +/** + * Type map for arena events. + */ +export interface ArenaEventMap { + [ArenaEventType.SESSION_START]: ArenaSessionStartEvent; + [ArenaEventType.SESSION_UPDATE]: ArenaSessionUpdateEvent; + [ArenaEventType.SESSION_COMPLETE]: ArenaSessionCompleteEvent; + [ArenaEventType.SESSION_ERROR]: ArenaSessionErrorEvent; + [ArenaEventType.AGENT_START]: ArenaAgentStartEvent; + [ArenaEventType.AGENT_STATUS_CHANGE]: ArenaAgentStatusChangeEvent; + [ArenaEventType.AGENT_COMPLETE]: ArenaAgentCompleteEvent; + [ArenaEventType.AGENT_ERROR]: ArenaAgentErrorEvent; +} + +/** + * Event emitter for Arena events. 
+ */ +export class ArenaEventEmitter { + private ee = new EventEmitter(); + + on<E extends keyof ArenaEventMap>( + event: E, + listener: (payload: ArenaEventMap[E]) => void, + ): void { + this.ee.on(event, listener as (...args: unknown[]) => void); + } + + off<E extends keyof ArenaEventMap>( + event: E, + listener: (payload: ArenaEventMap[E]) => void, + ): void { + this.ee.off(event, listener as (...args: unknown[]) => void); + } + + emit<E extends keyof ArenaEventMap>( + event: E, + payload: ArenaEventMap[E], + ): void { + this.ee.emit(event, payload); + } + + once<E extends keyof ArenaEventMap>( + event: E, + listener: (payload: ArenaEventMap[E]) => void, + ): void { + this.ee.once(event, listener as (...args: unknown[]) => void); + } + + removeAllListeners(event?: ArenaEvent): void { + if (event) { + this.ee.removeAllListeners(event); + } else { + this.ee.removeAllListeners(); + } + } +} diff --git a/packages/core/src/agents/arena/index.ts b/packages/core/src/agents/arena/index.ts new file mode 100644 index 0000000000..e744250c75 --- /dev/null +++ b/packages/core/src/agents/arena/index.ts @@ -0,0 +1,14 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +// Arena-specific exports +export * from './types.js'; +export * from './arena-events.js'; +export * from './ArenaManager.js'; +export * from './ArenaAgentClient.js'; + +// Re-export shared agent infrastructure for backwards compatibility +export * from '../backends/index.js'; diff --git a/packages/core/src/agents/arena/types.ts b/packages/core/src/agents/arena/types.ts new file mode 100644 index 0000000000..5b9a9ecabf --- /dev/null +++ b/packages/core/src/agents/arena/types.ts @@ -0,0 +1,280 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import type { Content } from '@google/genai'; +import type { WorktreeInfo } from '../../services/gitWorktreeService.js'; +import type { DisplayMode } from '../backends/types.js'; +import type { AgentStatus } from '../runtime/agent-types.js'; + +/** + * Maximum number of concurrent agents allowed in an 
Arena session. + */ +export const ARENA_MAX_AGENTS = 5; + +/** + * Represents the status of an Arena session. + */ +export enum ArenaSessionStatus { + /** Session is being set up */ + INITIALIZING = 'initializing', + /** Session is running */ + RUNNING = 'running', + /** All agents finished their current task and are idle (can accept follow-ups) */ + IDLE = 'idle', + /** Session completed for good (winner selected or explicit end) */ + COMPLETED = 'completed', + /** Session was cancelled */ + CANCELLED = 'cancelled', + /** Session failed during initialization */ + FAILED = 'failed', +} + +/** + * Configuration for a model participating in the Arena. + */ +export interface ArenaModelConfig { + /** Model identifier (e.g., 'qwen-coder-plus', 'gpt-4') */ + modelId: string; + /** Authentication type for this model */ + authType: string; + /** Display name for UI */ + displayName?: string; + /** Optional API key override */ + apiKey?: string; + /** Optional base URL override */ + baseUrl?: string; +} + +/** + * Configuration for an Arena session. + */ +export interface ArenaConfig { + /** Unique identifier for this Arena session */ + sessionId: string; + /** The task/prompt to be executed by all agents */ + task: string; + /** Models participating in the Arena */ + models: ArenaModelConfig[]; + /** Maximum number of rounds per agent (default: 50) */ + maxRoundsPerAgent?: number; + /** Total timeout in seconds for the entire Arena session (default: 600) */ + timeoutSeconds?: number; + /** Approval mode inherited from the main process (e.g., 'auto', 'suggest', etc.) */ + approvalMode?: string; + /** Source repository path */ + sourceRepoPath: string; + /** Chat history from the parent session for agent context seeding. */ + chatHistory?: Content[]; +} + +/** + * Statistics for an individual Arena agent. 
+ */ +export interface ArenaAgentStats { + /** Number of completed rounds */ + rounds: number; + /** Total tokens used */ + totalTokens: number; + /** Input tokens used */ + inputTokens: number; + /** Output tokens used */ + outputTokens: number; + /** Total execution time in milliseconds */ + durationMs: number; + /** Number of tool calls made */ + toolCalls: number; + /** Number of successful tool calls */ + successfulToolCalls: number; + /** Number of failed tool calls */ + failedToolCalls: number; +} + +/** + * Result from a single Arena agent. + */ +export interface ArenaAgentResult { + /** Agent identifier */ + agentId: string; + /** Model configuration used */ + model: ArenaModelConfig; + /** Final status */ + status: AgentStatus; + /** Worktree information */ + worktree: WorktreeInfo; + /** Final text output from the agent */ + finalText?: string; + /** Error message if failed */ + error?: string; + /** Execution statistics */ + stats: ArenaAgentStats; + /** Git diff of changes made */ + diff?: string; + /** Files modified by this agent */ + modifiedFiles?: string[]; + /** Start timestamp */ + startedAt: number; + /** End timestamp */ + endedAt?: number; +} + +/** + * Result from an Arena session. + */ +export interface ArenaSessionResult { + /** Session identifier */ + sessionId: string; + /** Original task */ + task: string; + /** Session status */ + status: ArenaSessionStatus; + /** Results from all agents */ + agents: ArenaAgentResult[]; + /** Start timestamp */ + startedAt: number; + /** End timestamp */ + endedAt?: number; + /** Total duration in milliseconds */ + totalDurationMs?: number; + /** Whether the repository was auto-initialized */ + wasRepoInitialized: boolean; + /** Selected winner (agent ID) if user has chosen */ + selectedWinner?: string; +} + +/** + * Options for starting an Arena session. 
+ */ +export interface ArenaStartOptions { + /** Models to participate (at least 2, max ARENA_MAX_AGENTS) */ + models: ArenaModelConfig[]; + /** The task/prompt for all agents */ + task: string; + /** Maximum rounds per agent */ + maxRoundsPerAgent?: number; + /** Timeout in seconds */ + timeoutSeconds?: number; + /** Approval mode to use for agents (inherited from main process) */ + approvalMode?: string; + /** Initial terminal columns for agent PTYs (default: process.stdout.columns or 120) */ + cols?: number; + /** Initial terminal rows for agent PTYs (default: process.stdout.rows or 40) */ + rows?: number; + /** Display mode preference */ + displayMode?: DisplayMode; + /** + * Optional chat history from the main session to seed each arena agent + * with conversational context. When provided, this history is prepended + * to each agent's chat so they understand the prior conversation. + */ + chatHistory?: Content[]; +} + +/** + * Callback functions for Arena events. + */ +export interface ArenaCallbacks { + /** Called when an agent starts */ + onAgentStart?: (agentId: string, model: ArenaModelConfig) => void; + /** Called when an agent completes */ + onAgentComplete?: (result: ArenaAgentResult) => void; + /** Called when agent stats are updated */ + onAgentStatsUpdate?: ( + agentId: string, + stats: Partial<ArenaAgentStats>, + ) => void; + /** Called when the arena session completes */ + onArenaComplete?: (result: ArenaSessionResult) => void; + /** Called on arena error */ + onArenaError?: (error: Error) => void; +} + +/** + * File format for per-agent status (child → main process). + * Written atomically by ArenaAgentClient to + * `/agents/.json`. + */ +export interface ArenaStatusFile { + agentId: string; + status: AgentStatus; + updatedAt: number; + rounds: number; + currentActivity?: string; + stats: ArenaAgentStats; + finalSummary: string | null; + error: string | null; +} + +/** + * File format for the arena session config file (`config.json`). 
+ * + * Initially written by GitWorktreeService with static config fields + * (arenaSessionId, sourceRepoPath, worktreeNames, baseBranch, createdAt). + * Dynamically updated by ArenaManager with agent status data during polling. + */ +export interface ArenaConfigFile { + /** Arena session identifier */ + arenaSessionId: string; + /** Source repository path */ + sourceRepoPath: string; + /** Names of worktrees created */ + worktreeNames: string[]; + /** Base branch used for worktrees */ + baseBranch?: string; + /** Timestamp when the session was created */ + createdAt: number; + /** Timestamp of the last status update (set by ArenaManager polling) */ + updatedAt?: number; + /** Per-agent status data, keyed by agentId (set by ArenaManager polling) */ + agents?: Record<string, ArenaStatusFile>; +} + +/** + * Control signal format for control.json (main → child process). + * Written by ArenaManager, consumed (read + deleted) by ArenaAgentClient. + */ +export interface ArenaControlSignal { + type: 'shutdown' | 'cancel'; + reason: string; + timestamp: number; +} + +/** + * Convert an agentId (e.g. "arena-xxx/qwen-coder-plus") to a filename-safe + * string by replacing path-unsafe characters with "--". + */ +export function safeAgentId(agentId: string): string { + return agentId.replace(/[/\\:*?"<>|]/g, '--'); +} + +/** + * Internal state for tracking an Arena agent during execution. 
+ */ +export interface ArenaAgentState { + /** Agent identifier */ + agentId: string; + /** Model configuration */ + model: ArenaModelConfig; + /** Current status */ + status: AgentStatus; + /** Worktree information */ + worktree: WorktreeInfo; + /** Abort controller for cancellation */ + abortController: AbortController; + /** Current statistics */ + stats: ArenaAgentStats; + /** Start timestamp */ + startedAt: number; + /** Accumulated text output */ + accumulatedText: string; + /** Promise for the agent execution */ + executionPromise?: Promise; + /** Error if failed */ + error?: string; + /** Unique session ID for this agent (for telemetry correlation) */ + agentSessionId: string; + /** Flush latest counters into `stats` (set by in-process event bridge) */ + syncStats?: () => void; +} diff --git a/packages/core/src/agents/backends/ITermBackend.test.ts b/packages/core/src/agents/backends/ITermBackend.test.ts new file mode 100644 index 0000000000..124df85ee5 --- /dev/null +++ b/packages/core/src/agents/backends/ITermBackend.test.ts @@ -0,0 +1,569 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import type { AgentSpawnConfig } from './types.js'; + +// ─── Hoisted mocks for iterm-it2 ──────────────────────────────── +const hoistedVerifyITerm = vi.hoisted(() => vi.fn()); +const hoistedItermSplitPane = vi.hoisted(() => vi.fn()); +const hoistedItermRunCommand = vi.hoisted(() => vi.fn()); +const hoistedItermSendText = vi.hoisted(() => vi.fn()); +const hoistedItermFocusSession = vi.hoisted(() => vi.fn()); +const hoistedItermCloseSession = vi.hoisted(() => vi.fn()); + +vi.mock('./iterm-it2.js', () => ({ + verifyITerm: hoistedVerifyITerm, + itermSplitPane: hoistedItermSplitPane, + itermRunCommand: hoistedItermRunCommand, + itermSendText: hoistedItermSendText, + itermFocusSession: hoistedItermFocusSession, + itermCloseSession: 
hoistedItermCloseSession, +})); + +// ─── Hoisted mocks for node:fs/promises ───────────────────────── +const hoistedFsMkdir = vi.hoisted(() => vi.fn()); +const hoistedFsReadFile = vi.hoisted(() => vi.fn()); +const hoistedFsRm = vi.hoisted(() => vi.fn()); + +vi.mock('node:fs/promises', () => ({ + mkdir: hoistedFsMkdir, + readFile: hoistedFsReadFile, + rm: hoistedFsRm, +})); + +// Mock debug logger +vi.mock('../../utils/debugLogger.js', () => ({ + createDebugLogger: () => ({ + info: vi.fn(), + error: vi.fn(), + warn: vi.fn(), + }), +})); + +import { ITermBackend } from './ITermBackend.js'; + +function makeConfig( + agentId: string, + overrides?: Partial<AgentSpawnConfig>, +): AgentSpawnConfig { + return { + agentId, + command: '/usr/bin/node', + args: ['agent.js'], + cwd: '/tmp/test', + ...overrides, + }; +} + +function setupDefaultMocks(): void { + hoistedVerifyITerm.mockResolvedValue(undefined); + hoistedItermSplitPane.mockResolvedValue('sess-new-1'); + hoistedItermRunCommand.mockResolvedValue(undefined); + hoistedItermSendText.mockResolvedValue(undefined); + hoistedItermFocusSession.mockResolvedValue(undefined); + hoistedItermCloseSession.mockResolvedValue(undefined); + hoistedFsMkdir.mockResolvedValue(undefined); + // Default: marker file doesn't exist yet (agent still running) + hoistedFsReadFile.mockRejectedValue(new Error('ENOENT')); + hoistedFsRm.mockResolvedValue(undefined); +} + +describe('ITermBackend', () => { + let backend: ITermBackend; + let savedItermSessionId: string | undefined; + + beforeEach(() => { + vi.useFakeTimers(); + savedItermSessionId = process.env['ITERM_SESSION_ID']; + delete process.env['ITERM_SESSION_ID']; + setupDefaultMocks(); + backend = new ITermBackend(); + }); + + afterEach(async () => { + await backend.cleanup(); + vi.restoreAllMocks(); + vi.useRealTimers(); + if (savedItermSessionId !== undefined) { + process.env['ITERM_SESSION_ID'] = savedItermSessionId; + } else { + delete process.env['ITERM_SESSION_ID']; + } + }); + + // ─── Initialization 
───────────────────────────────────────── + + it('throws if spawnAgent is called before init', async () => { + await expect(backend.spawnAgent(makeConfig('a1'))).rejects.toThrow( + 'not initialized', + ); + }); + + it('init verifies iTerm availability', async () => { + await backend.init(); + expect(hoistedVerifyITerm).toHaveBeenCalled(); + }); + + it('init creates exit marker directory', async () => { + await backend.init(); + expect(hoistedFsMkdir).toHaveBeenCalledWith( + expect.stringContaining('agent-iterm-exit-'), + { recursive: true }, + ); + }); + + it('init is idempotent', async () => { + await backend.init(); + await backend.init(); + expect(hoistedVerifyITerm).toHaveBeenCalledTimes(1); + }); + + // ─── Spawning ───────────────────────────────────────────── + + it('spawns first agent using ITERM_SESSION_ID when set', async () => { + process.env['ITERM_SESSION_ID'] = 'leader-sess'; + backend = new ITermBackend(); + await backend.init(); + + await backend.spawnAgent(makeConfig('agent-1')); + + expect(hoistedItermSplitPane).toHaveBeenCalledWith('leader-sess'); + expect(hoistedItermRunCommand).toHaveBeenCalledWith( + 'sess-new-1', + expect.any(String), + ); + expect(backend.getActiveAgentId()).toBe('agent-1'); + }); + + it('spawns first agent without ITERM_SESSION_ID', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('agent-1')); + + expect(hoistedItermSplitPane).toHaveBeenCalledWith(undefined); + expect(backend.getActiveAgentId()).toBe('agent-1'); + }); + + it('spawns subsequent agent from last session', async () => { + await backend.init(); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('agent-1')); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-2'); + await backend.spawnAgent(makeConfig('agent-2')); + + // Second split should use the first agent's session as source + expect(hoistedItermSplitPane).toHaveBeenLastCalledWith('sess-1'); + }); + + it('rejects duplicate agent IDs', 
async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('dup')); + + await expect(backend.spawnAgent(makeConfig('dup'))).rejects.toThrow( + 'already exists', + ); + }); + + it('registers failed agent and fires exit callback on spawn error', async () => { + await backend.init(); + hoistedItermSplitPane.mockRejectedValueOnce(new Error('split failed')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + await backend.spawnAgent(makeConfig('fail')); + + expect(exitCallback).toHaveBeenCalledWith('fail', 1, null); + }); + + // ─── buildShellCommand (env key validation) ──────────────── + + it('rejects invalid environment variable names', async () => { + await backend.init(); + + await expect( + backend.spawnAgent(makeConfig('bad-env', { env: { 'FOO BAR': 'baz' } })), + ).rejects.toThrow('Invalid environment variable name'); + }); + + it('rejects env key starting with a digit', async () => { + await backend.init(); + + await expect( + backend.spawnAgent(makeConfig('bad-env', { env: { '1VAR': 'baz' } })), + ).rejects.toThrow('Invalid environment variable name'); + }); + + it('accepts valid environment variable names', async () => { + await backend.init(); + + await expect( + backend.spawnAgent( + makeConfig('good-env', { + env: { MY_VAR_123: 'hello', _PRIVATE: 'world' }, + }), + ), + ).resolves.toBeUndefined(); + }); + + // ─── buildShellCommand (atomic marker write) ────────────── + + it('builds command with atomic exit marker write', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const cmdArg = hoistedItermRunCommand.mock.calls[0]![1] as string; + // Should contain write-then-rename pattern + expect(cmdArg).toMatch(/echo \$\? 
> .+\.tmp.+ && mv .+\.tmp/); + }); + + it('builds command with cd and quoted args', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const cmdArg = hoistedItermRunCommand.mock.calls[0]![1] as string; + expect(cmdArg).toContain("cd '/tmp/test'"); + expect(cmdArg).toContain("'/usr/bin/node'"); + expect(cmdArg).toContain("'agent.js'"); + }); + + it('includes env vars in command when provided', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a', { env: { NODE_ENV: 'test' } })); + + const cmdArg = hoistedItermRunCommand.mock.calls[0]![1] as string; + expect(cmdArg).toContain("NODE_ENV='test'"); + expect(cmdArg).toContain('env '); + }); + + // ─── Navigation ─────────────────────────────────────────── + + it('switchTo changes active agent and focuses session', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-2'); + await backend.spawnAgent(makeConfig('b')); + + backend.switchTo('b'); + expect(backend.getActiveAgentId()).toBe('b'); + expect(hoistedItermFocusSession).toHaveBeenCalledWith('sess-2'); + }); + + it('switchTo throws for unknown agent', async () => { + await backend.init(); + expect(() => backend.switchTo('ghost')).toThrow('not found'); + }); + + it('switchToNext and switchToPrevious cycle correctly', async () => { + await backend.init(); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-2'); + await backend.spawnAgent(makeConfig('b')); + + expect(backend.getActiveAgentId()).toBe('a'); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('b'); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('a'); + backend.switchToPrevious(); + expect(backend.getActiveAgentId()).toBe('b'); + }); + + it('switchToNext does nothing 
with a single agent', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('solo')); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('solo'); + }); + + it('switchToPrevious does nothing with a single agent', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('solo')); + backend.switchToPrevious(); + expect(backend.getActiveAgentId()).toBe('solo'); + }); + + // ─── Stop & Cleanup ────────────────────────────────────── + + it('stopAgent closes session and fires exit callback', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + backend.stopAgent('a'); + + expect(hoistedItermCloseSession).toHaveBeenCalledWith('sess-1'); + expect(exitCallback).toHaveBeenCalledWith('a', 1, null); + }); + + it('stopAgent is a no-op for already-stopped agent', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + backend.stopAgent('a'); + hoistedItermCloseSession.mockClear(); + + backend.stopAgent('a'); + expect(hoistedItermCloseSession).not.toHaveBeenCalled(); + }); + + it('stopAgent is a no-op for unknown agent', async () => { + await backend.init(); + backend.stopAgent('ghost'); + expect(hoistedItermCloseSession).not.toHaveBeenCalled(); + }); + + it('stopAll closes all sessions and resets activeAgentId', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + hoistedItermSplitPane.mockResolvedValueOnce('sess-2'); + await backend.spawnAgent(makeConfig('b')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + backend.stopAll(); + + expect(hoistedItermCloseSession).toHaveBeenCalledTimes(2); + expect(exitCallback).toHaveBeenCalledTimes(2); + expect(backend.getActiveAgentId()).toBeNull(); + }); + + it('cleanup closes 
sessions and removes exit marker directory', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + await backend.cleanup(); + + expect(hoistedItermCloseSession).toHaveBeenCalledWith('sess-1'); + expect(hoistedFsRm).toHaveBeenCalledWith( + expect.stringContaining('agent-iterm-exit-'), + { recursive: true, force: true }, + ); + expect(backend.getActiveAgentId()).toBeNull(); + }); + + it('cleanup tolerates session close errors', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + hoistedItermCloseSession.mockRejectedValueOnce(new Error('session gone')); + + // Should not throw + await expect(backend.cleanup()).resolves.toBeUndefined(); + }); + + it('cleanup tolerates exit marker removal errors', async () => { + await backend.init(); + hoistedFsRm.mockRejectedValueOnce(new Error('ENOENT')); + + // Should not throw + await expect(backend.cleanup()).resolves.toBeUndefined(); + }); + + // ─── Exit Detection ───────────────────────────────────────── + + it('marks agent as exited when marker file appears', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + // Simulate marker file appearing with exit code 0 + hoistedFsReadFile.mockResolvedValue('0\n'); + + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledWith('a', 0, null); + }); + + it('preserves non-zero exit codes from marker', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + hoistedFsReadFile.mockResolvedValue('42\n'); + + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledWith('a', 42, null); + }); + + it('defaults to exit code 1 when marker contains NaN', async 
() => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + hoistedFsReadFile.mockResolvedValue('garbage\n'); + + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledWith('a', 1, null); + }); + + it('does not fire callback twice for the same agent', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + hoistedFsReadFile.mockResolvedValue('0\n'); + + await vi.advanceTimersByTimeAsync(600); + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledTimes(1); + }); + + it('stops polling once all agents have exited', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + hoistedFsReadFile.mockResolvedValue('0\n'); + + await vi.advanceTimersByTimeAsync(600); + + // Reset to track future reads + hoistedFsReadFile.mockClear(); + + // Advance more — should not poll anymore + await vi.advanceTimersByTimeAsync(2000); + expect(hoistedFsReadFile).not.toHaveBeenCalled(); + }); + + // ─── waitForAll ───────────────────────────────────────────── + + it('waitForAll resolves immediately when no agents exist', async () => { + await backend.init(); + const result = await backend.waitForAll(); + expect(result).toBe(true); + }); + + it('waitForAll resolves when all agents exit', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + hoistedFsReadFile.mockResolvedValue('0\n'); + + const waitPromise = backend.waitForAll(); + await vi.advanceTimersByTimeAsync(600); + + const result = await waitPromise; + expect(result).toBe(true); + }); + + it('waitForAll returns false on timeout', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + + // Marker never appears (readFile keeps throwing) + const waitPromise = backend.waitForAll(1000); + await 
vi.advanceTimersByTimeAsync(1100); + + const result = await waitPromise; + expect(result).toBe(false); + }); + + // ─── Input ───────────────────────────────────────────────── + + it('writeToAgent sends text via itermSendText', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + const result = backend.writeToAgent('a', 'hello'); + expect(result).toBe(true); + expect(hoistedItermSendText).toHaveBeenCalledWith('sess-1', 'hello'); + }); + + it('writeToAgent returns false for unknown agent', async () => { + await backend.init(); + expect(backend.writeToAgent('ghost', 'hello')).toBe(false); + }); + + it('writeToAgent returns false for stopped agent', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + backend.stopAgent('a'); + + expect(backend.writeToAgent('a', 'hello')).toBe(false); + }); + + it('forwardInput delegates to active agent', async () => { + await backend.init(); + hoistedItermSplitPane.mockResolvedValueOnce('sess-1'); + await backend.spawnAgent(makeConfig('a')); + + const result = backend.forwardInput('hello'); + expect(result).toBe(true); + expect(hoistedItermSendText).toHaveBeenCalledWith('sess-1', 'hello'); + }); + + it('forwardInput returns false with no active agent', async () => { + await backend.init(); + expect(backend.forwardInput('hello')).toBe(false); + }); + + // ─── Snapshots ────────────────────────────────────────────── + + it('getActiveSnapshot returns null', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + expect(backend.getActiveSnapshot()).toBeNull(); + }); + + it('getAgentSnapshot returns null', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + expect(backend.getAgentSnapshot('a')).toBeNull(); + }); + + it('getAgentScrollbackLength returns 0', async () => { + await backend.init(); + await backend.spawnAgent(makeConfig('a')); + 
expect(backend.getAgentScrollbackLength('a')).toBe(0); + }); + + // ─── getAttachHint ────────────────────────────────────────── + + it('getAttachHint returns null', async () => { + await backend.init(); + expect(backend.getAttachHint()).toBeNull(); + }); + + // ─── resizeAll ────────────────────────────────────────────── + + it('resizeAll is a no-op', async () => { + await backend.init(); + // Should not throw + backend.resizeAll(80, 24); + }); + + // ─── type ─────────────────────────────────────────────────── + + it('has type "iterm2"', () => { + expect(backend.type).toBe('iterm2'); + }); +}); diff --git a/packages/core/src/agents/backends/ITermBackend.ts b/packages/core/src/agents/backends/ITermBackend.ts new file mode 100644 index 0000000000..7ff24c44b0 --- /dev/null +++ b/packages/core/src/agents/backends/ITermBackend.ts @@ -0,0 +1,431 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview ITermBackend implements Backend using the it2 CLI + * (iTerm2 Python API). + * + * Each agent runs in its own iTerm2 split pane. The backend manages pane + * creation, exit detection (via exit marker file polling), and cleanup. + * + * Exit detection uses a file-based marker approach: each agent's command is + * wrapped to write its exit code to a temp file on completion, which the backend + * polls to detect exits. 
+ */ + +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import * as os from 'node:os'; +import { createDebugLogger } from '../../utils/debugLogger.js'; +import type { AnsiOutput } from '../../utils/terminalSerializer.js'; +import { DISPLAY_MODE } from './types.js'; +import type { AgentSpawnConfig, AgentExitCallback, Backend } from './types.js'; +import { + verifyITerm, + itermSplitPane, + itermRunCommand, + itermSendText, + itermFocusSession, + itermCloseSession, +} from './iterm-it2.js'; + +const debugLogger = createDebugLogger('ITERM_BACKEND'); + +/** Polling interval for exit detection (ms) */ +const EXIT_POLL_INTERVAL_MS = 500; + +interface ITermAgentSession { + agentId: string; + sessionId: string; + exitMarkerPath: string; + status: 'running' | 'exited'; + exitCode: number; +} + +export class ITermBackend implements Backend { + readonly type = DISPLAY_MODE.ITERM2; + + /** Directory for exit marker files */ + private exitMarkerDir: string; + /** Session ID of the last agent pane (split source) */ + private lastSplitSessionId: string | null = null; + + private sessions: Map<string, ITermAgentSession> = new Map(); + private agentOrder: string[] = []; + private activeAgentId: string | null = null; + private onExitCallback: AgentExitCallback | null = null; + private exitPollTimer: NodeJS.Timeout | null = null; + private initialized = false; + /** Number of agents currently being spawned asynchronously */ + private pendingSpawns = 0; + /** Queue to serialize spawn operations (prevents split race conditions) */ + private spawnQueue: Promise<void> = Promise.resolve(); + + constructor() { + this.exitMarkerDir = path.join( + os.tmpdir(), + `agent-iterm-exit-${Date.now().toString(36)}`, + ); + } + + async init(): Promise<void> { + if (this.initialized) return; + + await verifyITerm(); + + // Create the exit marker directory + await fs.mkdir(this.exitMarkerDir, { recursive: true }); + + this.initialized = true; + debugLogger.info('ITermBackend initialized'); + } + + // ─── Agent
Lifecycle ──────────────────────────────────────── + + async spawnAgent(config: AgentSpawnConfig): Promise<void> { + if (!this.initialized) { + throw new Error('ITermBackend not initialized. Call init() first.'); + } + if (this.sessions.has(config.agentId)) { + throw new Error(`Agent "${config.agentId}" already exists.`); + } + + const exitMarkerPath = path.join(this.exitMarkerDir, config.agentId); + await fs.mkdir(path.dirname(exitMarkerPath), { recursive: true }); + const cmd = this.buildShellCommand(config, exitMarkerPath); + + this.pendingSpawns++; + const spawnPromise = this.spawnQueue.then(() => + this.spawnAgentAsync(config.agentId, cmd, exitMarkerPath), + ); + this.spawnQueue = spawnPromise; + await spawnPromise; + } + + private async spawnAgentAsync( + agentId: string, + cmd: string, + exitMarkerPath: string, + ): Promise<void> { + try { + let sessionId: string; + + if (this.sessions.size === 0) { + // First agent: split from ITERM_SESSION_ID if present, else active session + const leaderSessionId = process.env['ITERM_SESSION_ID'] || undefined; + sessionId = await itermSplitPane(leaderSessionId); + await itermRunCommand(sessionId, cmd); + } else { + // Subsequent agents: split from last agent session, else active session + sessionId = await itermSplitPane(this.lastSplitSessionId || undefined); + await itermRunCommand(sessionId, cmd); + } + + const agentSession: ITermAgentSession = { + agentId, + sessionId, + exitMarkerPath, + status: 'running', + exitCode: 0, + }; + + this.sessions.set(agentId, agentSession); + this.agentOrder.push(agentId); + this.lastSplitSessionId = sessionId; + + if (this.activeAgentId === null) { + this.activeAgentId = agentId; + } + + this.startExitPolling(); + + debugLogger.info(`Spawned agent "${agentId}" in session ${sessionId}`); + } catch (error) { + debugLogger.error(`Failed to spawn agent "${agentId}":`, error); + this.sessions.set(agentId, { + agentId, + sessionId: '', + exitMarkerPath, + status: 'exited', + exitCode: 1, + }); +
this.agentOrder.push(agentId); + this.onExitCallback?.(agentId, 1, null); + } finally { + this.pendingSpawns--; + } + } + + stopAgent(agentId: string): void { + const session = this.sessions.get(agentId); + if (!session || session.status !== 'running') return; + itermCloseSession(session.sessionId).catch((e) => + debugLogger.error(`Failed to close session for agent "${agentId}": ${e}`), + ); + session.status = 'exited'; + session.exitCode = 1; + this.onExitCallback?.(agentId, 1, null); + debugLogger.info(`Closed iTerm2 session for agent "${agentId}"`); + } + + stopAll(): void { + for (const session of this.sessions.values()) { + if (session.status === 'running') { + itermCloseSession(session.sessionId).catch((e) => + debugLogger.error( + `Failed to close session for agent "${session.agentId}": ${e}`, + ), + ); + session.status = 'exited'; + session.exitCode = 1; + this.onExitCallback?.(session.agentId, 1, null); + } + } + this.activeAgentId = null; + } + + async cleanup(): Promise<void> { + this.stopExitPolling(); + + // Close all iTerm2 sessions we created + for (const session of this.sessions.values()) { + if (!session.sessionId) continue; + try { + await itermCloseSession(session.sessionId); + } catch (error) { + debugLogger.error('Session cleanup error (ignored):', error); + } + } + + // Clean up exit marker files + try { + await fs.rm(this.exitMarkerDir, { + recursive: true, + force: true, + }); + } catch (error) { + debugLogger.error('Exit marker cleanup error (ignored):', error); + } + + this.sessions.clear(); + this.agentOrder = []; + this.activeAgentId = null; + this.lastSplitSessionId = null; + } + + setOnAgentExit(callback: AgentExitCallback): void { + this.onExitCallback = callback; + } + + async waitForAll(timeoutMs?: number): Promise<boolean> { + if (this.allExited()) return true; + + return new Promise((resolve) => { + let timeoutHandle: NodeJS.Timeout | undefined; + + const checkInterval = setInterval(() => { + if (this.allExited()) { +
clearInterval(checkInterval); + if (timeoutHandle) clearTimeout(timeoutHandle); + resolve(true); + } + }, EXIT_POLL_INTERVAL_MS); + + if (timeoutMs !== undefined) { + timeoutHandle = setTimeout(() => { + clearInterval(checkInterval); + resolve(false); + }, timeoutMs); + } + }); + } + + // ─── Active Agent & Navigation ────────────────────────────── + + switchTo(agentId: string): void { + if (!this.sessions.has(agentId)) { + throw new Error(`Agent "${agentId}" not found.`); + } + const session = this.sessions.get(agentId)!; + this.activeAgentId = agentId; + itermFocusSession(session.sessionId).catch((e) => + debugLogger.error(`Failed to focus session for agent "${agentId}": ${e}`), + ); + } + + switchToNext(): void { + if (this.agentOrder.length <= 1) return; + const currentIndex = this.agentOrder.indexOf(this.activeAgentId ?? ''); + const nextIndex = (currentIndex + 1) % this.agentOrder.length; + this.switchTo(this.agentOrder[nextIndex]!); + } + + switchToPrevious(): void { + if (this.agentOrder.length <= 1) return; + const currentIndex = this.agentOrder.indexOf(this.activeAgentId ?? 
''); + const prevIndex = + (currentIndex - 1 + this.agentOrder.length) % this.agentOrder.length; + this.switchTo(this.agentOrder[prevIndex]!); + } + + getActiveAgentId(): string | null { + return this.activeAgentId; + } + + // ─── Screen Capture ───────────────────────────────────────── + + getActiveSnapshot(): AnsiOutput | null { + // iTerm2 manages rendering — snapshots not supported + return null; + } + + getAgentSnapshot( + _agentId: string, + _scrollOffset: number = 0, + ): AnsiOutput | null { + return null; + } + + getAgentScrollbackLength(_agentId: string): number { + return 0; + } + + // ─── Input ────────────────────────────────────────────────── + + forwardInput(data: string): boolean { + if (!this.activeAgentId) return false; + return this.writeToAgent(this.activeAgentId, data); + } + + writeToAgent(agentId: string, data: string): boolean { + const session = this.sessions.get(agentId); + if (!session || session.status !== 'running') return false; + itermSendText(session.sessionId, data).catch((e) => + debugLogger.error(`Failed to send text to agent "${agentId}": ${e}`), + ); + return true; + } + + // ─── Resize ───────────────────────────────────────────────── + + resizeAll(_cols: number, _rows: number): void { + // iTerm2 manages pane sizes automatically + } + + getAttachHint(): string | null { + // iTerm2 panes are visible directly, no attach needed + return null; + } + + // ─── Private ──────────────────────────────────────────────── + + /** + * Build the shell command with exit marker wrapping. + * + * The command is wrapped so that its exit code is written to a temp file + * when it completes. This allows the backend to detect agent exit via + * file polling, since iTerm2 `write text` runs commands inside a shell + * (the shell stays alive after the command exits). 
+ */ + private buildShellCommand( + config: AgentSpawnConfig, + exitMarkerPath: string, + ): string { + const envParts: string[] = []; + if (config.env) { + for (const [key, value] of Object.entries(config.env)) { + if (!VALID_ENV_KEY.test(key)) { + throw new Error( + `Invalid environment variable name: "${key}". Names must match /^[A-Za-z_][A-Za-z0-9_]*$/.`, + ); + } + envParts.push(`${key}=${shellQuote(value)}`); + } + } + + const cmdParts = [ + shellQuote(config.command), + ...config.args.map(shellQuote), + ]; + + // Build: cd <cwd> && [env K=V] command args; echo $? > <marker> + const parts = [`cd ${shellQuote(config.cwd)}`]; + if (envParts.length > 0) { + parts.push(`env ${envParts.join(' ')} ${cmdParts.join(' ')}`); + } else { + parts.push(cmdParts.join(' ')); + } + + const mainCmd = parts.join(' && '); + // Write exit code to a temp file first, then atomically rename it + // to the marker path. This prevents the polling loop from reading + // a partially-written file. + const tmpMarker = shellQuote(exitMarkerPath + '.tmp'); + const finalMarker = shellQuote(exitMarkerPath); + return `${mainCmd}; echo $?
> ${tmpMarker} && mv ${tmpMarker} ${finalMarker}`; + } + + private allExited(): boolean { + if (this.pendingSpawns > 0) return false; + if (this.sessions.size === 0) return true; + for (const session of this.sessions.values()) { + if (session.status === 'running') return false; + } + return true; + } + + private startExitPolling(): void { + if (this.exitPollTimer) return; + + this.exitPollTimer = setInterval(() => { + void this.pollExitStatus(); + }, EXIT_POLL_INTERVAL_MS); + this.exitPollTimer.unref(); + } + + private stopExitPolling(): void { + if (this.exitPollTimer) { + clearInterval(this.exitPollTimer); + this.exitPollTimer = null; + } + } + + private async pollExitStatus(): Promise { + for (const agent of this.sessions.values()) { + if (agent.status !== 'running') continue; + + try { + const content = await fs.readFile(agent.exitMarkerPath, 'utf8'); + const exitCode = parseInt(content.trim(), 10); + agent.status = 'exited'; + agent.exitCode = isNaN(exitCode) ? 1 : exitCode; + + debugLogger.info( + `Agent "${agent.agentId}" exited with code ${agent.exitCode}`, + ); + + this.onExitCallback?.(agent.agentId, agent.exitCode, null); + } catch { + // File doesn't exist yet — command still running + } + } + + if (this.allExited()) { + this.stopExitPolling(); + } + } +} + +/** Regex for valid POSIX environment variable names */ +const VALID_ENV_KEY = /^[A-Za-z_][A-Za-z0-9_]*$/; + +/** + * Simple shell quoting for building command strings. + * Wraps value in single quotes, escaping any internal single quotes. 
+ */ +function shellQuote(value: string): string { + return `'${value.replace(/'/g, "'\\''")}'`; +} diff --git a/packages/core/src/agents/backends/InProcessBackend.test.ts b/packages/core/src/agents/backends/InProcessBackend.test.ts new file mode 100644 index 0000000000..83bf1cacad --- /dev/null +++ b/packages/core/src/agents/backends/InProcessBackend.test.ts @@ -0,0 +1,564 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import { InProcessBackend } from './InProcessBackend.js'; +import { DISPLAY_MODE } from './types.js'; +import type { AgentSpawnConfig } from './types.js'; +import { AgentCore } from '../runtime/agent-core.js'; +import { createContentGenerator } from '../../core/contentGenerator.js'; + +// Mock createContentGenerator to avoid real API client setup +const mockContentGenerator = { + generateContentStream: vi.fn(), +}; +vi.mock('../../core/contentGenerator.js', () => ({ + createContentGenerator: vi.fn().mockResolvedValue({ + generateContentStream: vi.fn(), + }), +})); + +// Mock AgentCore and AgentInteractive to avoid real model calls +vi.mock('../runtime/agent-core.js', () => ({ + AgentCore: vi.fn().mockImplementation(() => ({ + subagentId: 'mock-id', + name: 'mock-agent', + eventEmitter: { + on: vi.fn(), + off: vi.fn(), + emit: vi.fn(), + }, + stats: { + start: vi.fn(), + getSummary: vi.fn().mockReturnValue({}), + }, + createChat: vi.fn().mockResolvedValue({}), + prepareTools: vi.fn().mockReturnValue([]), + runReasoningLoop: vi.fn().mockResolvedValue({ + text: 'Done', + terminateMode: null, + turnsUsed: 1, + }), + getEventEmitter: vi.fn().mockReturnValue({ + on: vi.fn(), + off: vi.fn(), + emit: vi.fn(), + }), + getExecutionSummary: vi.fn().mockReturnValue({}), + })), +})); + +function createMockToolRegistry() { + return { + getFunctionDeclarations: vi.fn().mockReturnValue([]), + getAllTools: vi.fn().mockReturnValue([]), + getAllToolNames: 
vi.fn().mockReturnValue([]), + registerTool: vi.fn(), + copyDiscoveredToolsFrom: vi.fn(), + stop: vi.fn().mockResolvedValue(undefined), + }; +} + +function createMockConfig() { + const registry = createMockToolRegistry(); + return { + getModel: vi.fn().mockReturnValue('test-model'), + getToolRegistry: vi.fn().mockReturnValue(registry), + getSessionId: vi.fn().mockReturnValue('test-session'), + getWorkingDir: vi.fn().mockReturnValue('/tmp'), + getTargetDir: vi.fn().mockReturnValue('/tmp'), + createToolRegistry: vi.fn().mockResolvedValue(createMockToolRegistry()), + getContentGenerator: vi.fn().mockReturnValue(mockContentGenerator), + getContentGeneratorConfig: vi.fn().mockReturnValue({ + model: 'test-model', + authType: 'openai', + apiKey: 'parent-key', + baseUrl: 'https://parent.example.com', + }), + getAuthType: vi.fn().mockReturnValue('openai'), + } as never; +} + +function createSpawnConfig(agentId: string): AgentSpawnConfig { + return { + agentId, + command: 'node', + args: [], + cwd: '/tmp', + inProcess: { + agentName: `Agent ${agentId}`, + initialTask: 'Do something', + runtimeConfig: { + promptConfig: { systemPrompt: 'You are a helpful assistant.' 
}, + modelConfig: { model: 'test-model' }, + runConfig: { max_turns: 10 }, + }, + }, + }; +} + +describe('InProcessBackend', () => { + let backend: InProcessBackend; + + beforeEach(() => { + backend = new InProcessBackend(createMockConfig()); + }); + + it('should have IN_PROCESS type', () => { + expect(backend.type).toBe(DISPLAY_MODE.IN_PROCESS); + }); + + it('should init without error', async () => { + await expect(backend.init()).resolves.toBeUndefined(); + }); + + it('should throw when spawning without inProcess config', async () => { + const config: AgentSpawnConfig = { + agentId: 'test', + command: 'node', + args: [], + cwd: '/tmp', + }; + + await expect(backend.spawnAgent(config)).rejects.toThrow( + 'InProcessBackend requires inProcess config', + ); + }); + + it('should spawn an agent with inProcess config', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + expect(backend.getActiveAgentId()).toBe('agent-1'); + expect(backend.getAgent('agent-1')).toBeDefined(); + }); + + it('should set first spawned agent as active', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + await backend.spawnAgent(createSpawnConfig('agent-2')); + + expect(backend.getActiveAgentId()).toBe('agent-1'); + }); + + it('should navigate between agents', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + await backend.spawnAgent(createSpawnConfig('agent-2')); + await backend.spawnAgent(createSpawnConfig('agent-3')); + + expect(backend.getActiveAgentId()).toBe('agent-1'); + + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('agent-2'); + + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('agent-3'); + + // Wraps around + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('agent-1'); + + backend.switchToPrevious(); + expect(backend.getActiveAgentId()).toBe('agent-3'); + }); + + it('should switch to a 
specific agent', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + await backend.spawnAgent(createSpawnConfig('agent-2')); + + backend.switchTo('agent-2'); + expect(backend.getActiveAgentId()).toBe('agent-2'); + }); + + it('should forward input to active agent', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + const result = backend.forwardInput('hello'); + expect(result).toBe(true); + }); + + it('should return false for forwardInput with no active agent', () => { + expect(backend.forwardInput('hello')).toBe(false); + }); + + it('should write to specific agent', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + expect(backend.writeToAgent('agent-1', 'hello')).toBe(true); + expect(backend.writeToAgent('nonexistent', 'hello')).toBe(false); + }); + + it('should return null for screen capture methods', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + expect(backend.getActiveSnapshot()).toBeNull(); + expect(backend.getAgentSnapshot('agent-1')).toBeNull(); + expect(backend.getAgentScrollbackLength('agent-1')).toBe(0); + }); + + it('should return null for attach hint', () => { + expect(backend.getAttachHint()).toBeNull(); + }); + + it('should stop a specific agent', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + + backend.stopAgent('agent-1'); + // Agent should eventually reach cancelled state + }); + + it('should stop all agents', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + await backend.spawnAgent(createSpawnConfig('agent-2')); + + backend.stopAll(); + // Both agents should be aborted + }); + + it('should cleanup all agents', async () => { + await backend.init(); + await 
backend.spawnAgent(createSpawnConfig('agent-1')); + + await backend.cleanup(); + + expect(backend.getActiveAgentId()).toBeNull(); + expect(backend.getAgent('agent-1')).toBeUndefined(); + }); + + it('should fire exit callback when agent completes', async () => { + await backend.init(); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + await backend.spawnAgent(createSpawnConfig('agent-1')); + + // The mock agent stays idle after processing initialTask. + // Trigger a graceful shutdown to make it complete. + const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + await agent!.shutdown(); + + // Wait for the exit callback to fire + await vi.waitFor(() => { + expect(exitCallback).toHaveBeenCalledWith( + 'agent-1', + expect.any(Number), + null, + ); + }); + }); + + it('should pass per-agent cwd to AgentCore via config proxy', async () => { + const parentConfig = createMockConfig(); + const backendWithParentCwd = new InProcessBackend(parentConfig); + await backendWithParentCwd.init(); + + const agentCwd = '/worktree/agent-1'; + const config = createSpawnConfig('agent-1'); + config.cwd = agentCwd; + + await backendWithParentCwd.spawnAgent(config); + + const MockAgentCore = AgentCore as unknown as ReturnType<typeof vi.fn>; + const lastCall = MockAgentCore.mock.calls.at(-1); + expect(lastCall).toBeDefined(); + + // Second arg is the runtime context (Config) + const agentContext = lastCall![1] as { + getWorkingDir: () => string; + getTargetDir: () => string; + getToolRegistry: () => unknown; + }; + expect(agentContext.getWorkingDir()).toBe(agentCwd); + expect(agentContext.getTargetDir()).toBe(agentCwd); + expect(agentContext.getToolRegistry()).toBeDefined(); + }); + + it('should propagate runConfig limits to AgentInteractive', async () => { + await backend.init(); + + const config = createSpawnConfig('agent-1'); + config.inProcess!.runtimeConfig.runConfig = { + max_turns: 5, + max_time_minutes: 10, + }; + + await backend.spawnAgent(config); +
+ const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + expect(agent!.config.maxTurnsPerMessage).toBe(5); + expect(agent!.config.maxTimeMinutesPerMessage).toBe(10); + }); + + it('should default limits to undefined when runConfig omits them', async () => { + await backend.init(); + + const config = createSpawnConfig('agent-1'); + config.inProcess!.runtimeConfig.runConfig = {}; + + await backend.spawnAgent(config); + + const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + expect(agent!.config.maxTurnsPerMessage).toBeUndefined(); + expect(agent!.config.maxTimeMinutesPerMessage).toBeUndefined(); + }); + + it('should give each agent its own cwd even when sharing a backend', async () => { + await backend.init(); + + const config1 = createSpawnConfig('agent-1'); + config1.cwd = '/worktree/agent-1'; + const config2 = createSpawnConfig('agent-2'); + config2.cwd = '/worktree/agent-2'; + + await backend.spawnAgent(config1); + await backend.spawnAgent(config2); + + const MockAgentCore = AgentCore as unknown as ReturnType<typeof vi.fn>; + const calls = MockAgentCore.mock.calls; + + const ctx1 = calls.at(-2)![1] as { + getWorkingDir: () => string; + getTargetDir: () => string; + }; + const ctx2 = calls.at(-1)![1] as { + getWorkingDir: () => string; + getTargetDir: () => string; + }; + + expect(ctx1.getWorkingDir()).toBe('/worktree/agent-1'); + expect(ctx1.getTargetDir()).toBe('/worktree/agent-1'); + expect(ctx2.getWorkingDir()).toBe('/worktree/agent-2'); + expect(ctx2.getTargetDir()).toBe('/worktree/agent-2'); + }); + + it('should throw when spawning a duplicate agent ID', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + await expect( + backend.spawnAgent(createSpawnConfig('agent-1')), + ).rejects.toThrow('Agent "agent-1" already exists.'); + }); + + it('should fire exit callback with code 1 when start() throws', async () => { + // Make createChat throw for this test + const MockAgentCore =
AgentCore as unknown as ReturnType<typeof vi.fn>; + MockAgentCore.mockImplementationOnce(() => ({ + subagentId: 'mock-id', + name: 'mock-agent', + eventEmitter: { + on: vi.fn(), + off: vi.fn(), + emit: vi.fn(), + }, + stats: { + start: vi.fn(), + getSummary: vi.fn().mockReturnValue({}), + }, + createChat: vi.fn().mockRejectedValue(new Error('Auth failed')), + prepareTools: vi.fn().mockReturnValue([]), + getEventEmitter: vi.fn().mockReturnValue({ + on: vi.fn(), + off: vi.fn(), + emit: vi.fn(), + }), + getExecutionSummary: vi.fn().mockReturnValue({}), + })); + + await backend.init(); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + // spawnAgent should NOT throw — it catches the error internally + await expect( + backend.spawnAgent(createSpawnConfig('agent-fail')), + ).resolves.toBeUndefined(); + + // Exit callback should have been fired with exit code 1 + expect(exitCallback).toHaveBeenCalledWith('agent-fail', 1, null); + }); + + it('should return true immediately from waitForAll after cleanup', async () => { + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + + await backend.cleanup(); + + // waitForAll should return immediately after cleanup + const result = await backend.waitForAll(5000); + expect(result).toBe(true); + }); + + describe('chat history', () => { + it('should pass chatHistory to AgentInteractive config', async () => { + await backend.init(); + + const chatHistory = [ + { role: 'user' as const, parts: [{ text: 'prior question' }] }, + { role: 'model' as const, parts: [{ text: 'prior answer' }] }, + ]; + const config = createSpawnConfig('agent-1'); + config.inProcess!.chatHistory = chatHistory; + + await backend.spawnAgent(config); + + const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + expect(agent!.config.chatHistory).toEqual(chatHistory); + }); + + it('should leave chatHistory undefined when not provided', async () => { + await backend.init(); + await
backend.spawnAgent(createSpawnConfig('agent-1')); + + const agent = backend.getAgent('agent-1'); + expect(agent).toBeDefined(); + expect(agent!.config.chatHistory).toBeUndefined(); + }); + }); + + describe('auth isolation', () => { + it('should create per-agent ContentGenerator when authOverrides is provided', async () => { + await backend.init(); + + const config = createSpawnConfig('agent-1'); + config.inProcess!.authOverrides = { + authType: 'anthropic', + apiKey: 'agent-key-123', + baseUrl: 'https://agent.example.com', + }; + + await backend.spawnAgent(config); + + const mockCreate = createContentGenerator as ReturnType<typeof vi.fn>; + expect(mockCreate).toHaveBeenCalledWith( + expect.objectContaining({ + authType: 'anthropic', + apiKey: 'agent-key-123', + baseUrl: 'https://agent.example.com', + model: 'test-model', + }), + expect.anything(), + ); + }); + + it('should override getContentGenerator on per-agent config', async () => { + const agentGenerator = { generateContentStream: vi.fn() }; + const mockCreate = createContentGenerator as ReturnType<typeof vi.fn>; + mockCreate.mockResolvedValueOnce(agentGenerator); + + await backend.init(); + + const config = createSpawnConfig('agent-1'); + config.inProcess!.authOverrides = { + authType: 'anthropic', + apiKey: 'agent-key', + }; + + await backend.spawnAgent(config); + + const MockAgentCore = AgentCore as unknown as ReturnType<typeof vi.fn>; + const lastCall = MockAgentCore.mock.calls.at(-1); + const agentContext = lastCall![1] as { + getContentGenerator: () => unknown; + getAuthType: () => string | undefined; + getModel: () => string; + }; + + expect(agentContext.getContentGenerator()).toBe(agentGenerator); + expect(agentContext.getAuthType()).toBe('anthropic'); + }); + + it('should not create per-agent ContentGenerator without authOverrides', async () => { + const mockCreate = createContentGenerator as ReturnType<typeof vi.fn>; + mockCreate.mockClear(); + + await backend.init(); + await backend.spawnAgent(createSpawnConfig('agent-1')); + +
expect(mockCreate).not.toHaveBeenCalled(); + }); + + it('should fall back to parent ContentGenerator if per-agent creation fails', async () => { + const mockCreate = createContentGenerator as ReturnType<typeof vi.fn>; + mockCreate.mockRejectedValueOnce(new Error('Auth failed')); + + await backend.init(); + + const config = createSpawnConfig('agent-1'); + config.inProcess!.authOverrides = { + authType: 'anthropic', + apiKey: 'bad-key', + }; + + // Should not throw — falls back gracefully + await expect(backend.spawnAgent(config)).resolves.toBeUndefined(); + + const MockAgentCore = AgentCore as unknown as ReturnType<typeof vi.fn>; + const lastCall = MockAgentCore.mock.calls.at(-1); + const agentContext = lastCall![1] as { + getContentGenerator: () => unknown; + }; + + // Falls back to parent's content generator + expect(agentContext.getContentGenerator()).toBe(mockContentGenerator); + }); + + it('should give different agents different ContentGenerators', async () => { + const gen1 = { generateContentStream: vi.fn() }; + const gen2 = { generateContentStream: vi.fn() }; + const mockCreate = createContentGenerator as ReturnType<typeof vi.fn>; + mockCreate.mockResolvedValueOnce(gen1).mockResolvedValueOnce(gen2); + + await backend.init(); + + const config1 = createSpawnConfig('agent-1'); + config1.inProcess!.authOverrides = { + authType: 'openai', + apiKey: 'key-1', + baseUrl: 'https://api1.example.com', + }; + const config2 = createSpawnConfig('agent-2'); + config2.inProcess!.authOverrides = { + authType: 'anthropic', + apiKey: 'key-2', + baseUrl: 'https://api2.example.com', + }; + + await backend.spawnAgent(config1); + await backend.spawnAgent(config2); + + const MockAgentCore = AgentCore as unknown as ReturnType<typeof vi.fn>; + const calls = MockAgentCore.mock.calls; + + const ctx1 = calls.at(-2)![1] as { + getContentGenerator: () => unknown; + }; + const ctx2 = calls.at(-1)![1] as { + getContentGenerator: () => unknown; + }; + + expect(ctx1.getContentGenerator()).toBe(gen1); +
expect(ctx2.getContentGenerator()).toBe(gen2); + expect(ctx1.getContentGenerator()).not.toBe(ctx2.getContentGenerator()); + }); + }); +}); diff --git a/packages/core/src/agents/backends/InProcessBackend.ts b/packages/core/src/agents/backends/InProcessBackend.ts new file mode 100644 index 0000000000..c53892cbcb --- /dev/null +++ b/packages/core/src/agents/backends/InProcessBackend.ts @@ -0,0 +1,472 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview InProcessBackend — Backend implementation that runs agents + * in the current process using AgentInteractive instead of PTY subprocesses. + * + * This enables Arena to work without tmux or any external terminal multiplexer. + */ + +import { createDebugLogger } from '../../utils/debugLogger.js'; +import type { Config } from '../../config/config.js'; +import { + type AuthType, + type ContentGenerator, + type ContentGeneratorConfig, + createContentGenerator, +} from '../../core/contentGenerator.js'; +import { AUTH_ENV_MAPPINGS } from '../../models/constants.js'; +import { AgentStatus, isTerminalStatus } from '../runtime/agent-types.js'; +import { AgentCore } from '../runtime/agent-core.js'; +import { AgentEventEmitter } from '../runtime/agent-events.js'; +import { ContextState } from '../runtime/agent-headless.js'; +import { AgentInteractive } from '../runtime/agent-interactive.js'; +import type { + Backend, + AgentSpawnConfig, + AgentExitCallback, + InProcessSpawnConfig, +} from './types.js'; +import { DISPLAY_MODE } from './types.js'; +import type { AnsiOutput } from '../../utils/terminalSerializer.js'; +import { WorkspaceContext } from '../../utils/workspaceContext.js'; +import { FileDiscoveryService } from '../../services/fileDiscoveryService.js'; +import type { ToolRegistry } from '../../tools/tool-registry.js'; + +const debugLogger = createDebugLogger('IN_PROCESS_BACKEND'); + +/** + * InProcessBackend runs agents in the current Node.js process. 
+ * + * Instead of spawning PTY subprocesses, it creates AgentCore + AgentInteractive + * instances that execute in-process. Screen capture returns null (the UI reads + * messages directly from AgentInteractive). + */ +export class InProcessBackend implements Backend { + readonly type = DISPLAY_MODE.IN_PROCESS; + + private readonly runtimeContext: Config; + private readonly agents = new Map<string, AgentInteractive>(); + private readonly agentRegistries: ToolRegistry[] = []; + private readonly agentOrder: string[] = []; + private activeAgentId: string | null = null; + private exitCallback: AgentExitCallback | null = null; + /** Whether cleanup() has been called */ + private cleanedUp = false; + + constructor(runtimeContext: Config) { + this.runtimeContext = runtimeContext; + } + + // ─── Backend Interface ───────────────────────────────────── + + async init(): Promise<void> { + debugLogger.info('InProcessBackend initialized'); + } + + async spawnAgent(config: AgentSpawnConfig): Promise<void> { + const inProcessConfig = config.inProcess; + if (!inProcessConfig) { + throw new Error( + `InProcessBackend requires inProcess config for agent ${config.agentId}`, + ); + } + + if (this.agents.has(config.agentId)) { + throw new Error(`Agent "${config.agentId}" already exists.`); + } + + const { promptConfig, modelConfig, runConfig, toolConfig } = + inProcessConfig.runtimeConfig; + + const eventEmitter = new AgentEventEmitter(); + + // Build a per-agent runtime context with isolated working directory, + // target directory, workspace context, tool registry, and (optionally) + // a dedicated ContentGenerator for per-agent auth isolation.
+ const agentContext = await createPerAgentConfig( + this.runtimeContext, + config.cwd, + inProcessConfig.runtimeConfig.modelConfig.model, + inProcessConfig.authOverrides, + ); + + this.agentRegistries.push(agentContext.getToolRegistry()); + + const core = new AgentCore( + inProcessConfig.agentName, + agentContext, + promptConfig, + modelConfig, + runConfig, + toolConfig, + eventEmitter, + ); + + const interactive = new AgentInteractive( + { + agentId: config.agentId, + agentName: inProcessConfig.agentName, + initialTask: inProcessConfig.initialTask, + maxTurnsPerMessage: runConfig.max_turns, + maxTimeMinutesPerMessage: runConfig.max_time_minutes, + chatHistory: inProcessConfig.chatHistory, + }, + core, + ); + + this.agents.set(config.agentId, interactive); + this.agentOrder.push(config.agentId); + + // Set first agent as active + if (this.activeAgentId === null) { + this.activeAgentId = config.agentId; + } + + try { + const context = new ContextState(); + await interactive.start(context); + + // Watch for completion and fire exit callback — but only for + // truly terminal statuses. IDLE means the agent is still alive + // and can accept follow-up messages. + void interactive.waitForCompletion().then(() => { + const status = interactive.getStatus(); + if (!isTerminalStatus(status)) { + return; + } + const exitCode = + status === AgentStatus.COMPLETED + ? 0 + : status === AgentStatus.FAILED + ? 
1 + : null; + this.exitCallback?.(config.agentId, exitCode, null); + }); + + debugLogger.info(`Spawned in-process agent: ${config.agentId}`); + } catch (error) { + debugLogger.error( + `Failed to start in-process agent "${config.agentId}":`, + error, + ); + this.exitCallback?.(config.agentId, 1, null); + } + } + + stopAgent(agentId: string): void { + const agent = this.agents.get(agentId); + if (agent) { + agent.abort(); + debugLogger.info(`Stopped agent: ${agentId}`); + } + } + + stopAll(): void { + for (const agent of this.agents.values()) { + agent.abort(); + } + debugLogger.info('Stopped all in-process agents'); + } + + async cleanup(): Promise<void> { + this.cleanedUp = true; + + for (const agent of this.agents.values()) { + agent.abort(); + } + // Wait for loops to settle, but cap at 3s so CLI exit isn't blocked + // if an agent's reasoning loop doesn't terminate promptly after abort. + const CLEANUP_TIMEOUT_MS = 3000; + const promises = Array.from(this.agents.values()).map((a) => + a.waitForCompletion().catch(() => {}), + ); + let timerId: ReturnType<typeof setTimeout>; + const timeout = new Promise((resolve) => { + timerId = setTimeout(resolve, CLEANUP_TIMEOUT_MS); + }); + await Promise.race([Promise.allSettled(promises), timeout]); + clearTimeout(timerId!); + + // Stop per-agent tool registries so tools like TaskTool can release + // listeners registered on shared managers (e.g. SubagentManager).
+ for (const registry of this.agentRegistries) { + await registry.stop().catch(() => {}); + } + this.agentRegistries.length = 0; + + this.agents.clear(); + this.agentOrder.length = 0; + this.activeAgentId = null; + debugLogger.info('InProcessBackend cleaned up'); + } + + setOnAgentExit(callback: AgentExitCallback): void { + this.exitCallback = callback; + } + + async waitForAll(timeoutMs?: number): Promise<boolean> { + if (this.cleanedUp) return true; + + const promises = Array.from(this.agents.values()).map((a) => + a.waitForCompletion(), + ); + + if (timeoutMs === undefined) { + await Promise.allSettled(promises); + return true; + } + + let timerId: ReturnType<typeof setTimeout>; + const timeout = new Promise<'timeout'>((resolve) => { + timerId = setTimeout(() => resolve('timeout'), timeoutMs); + }); + + const result = await Promise.race([ + Promise.allSettled(promises).then(() => 'done' as const), + timeout, + ]); + + clearTimeout(timerId!); + return result === 'done'; + } + + // ─── Navigation ──────────────────────────────────────────── + + switchTo(agentId: string): void { + if (this.agents.has(agentId)) { + this.activeAgentId = agentId; + } + } + + switchToNext(): void { + this.activeAgentId = this.navigate(1); + } + + switchToPrevious(): void { + this.activeAgentId = this.navigate(-1); + } + + getActiveAgentId(): string | null { + return this.activeAgentId; + } + + // ─── Screen Capture (no-op for in-process) ───────────────── + + getActiveSnapshot(): AnsiOutput | null { + return null; + } + + getAgentSnapshot( + _agentId: string, + _scrollOffset?: number, + ): AnsiOutput | null { + return null; + } + + getAgentScrollbackLength(_agentId: string): number { + return 0; + } + + // ─── Input ───────────────────────────────────────────────── + + forwardInput(data: string): boolean { + if (!this.activeAgentId) return false; + return this.writeToAgent(this.activeAgentId, data); + } + + writeToAgent(agentId: string, data: string): boolean { + const agent = this.agents.get(agentId); + if
(!agent) return false; + + agent.enqueueMessage(data); + return true; + } + + // ─── Resize (no-op) ─────────────────────────────────────── + + resizeAll(_cols: number, _rows: number): void { + // No terminals to resize in-process + } + + // ─── External Session ────────────────────────────────────── + + getAttachHint(): string | null { + return null; + } + + // ─── Extra: Direct Access ────────────────────────────────── + + /** + * Get an AgentInteractive instance by agent ID. + * Used by ArenaManager for direct event subscription. + */ + getAgent(agentId: string): AgentInteractive | undefined { + return this.agents.get(agentId); + } + + // ─── Private ─────────────────────────────────────────────── + + private navigate(direction: 1 | -1): string | null { + if (this.agentOrder.length === 0) return null; + if (!this.activeAgentId) return this.agentOrder[0] ?? null; + + const currentIndex = this.agentOrder.indexOf(this.activeAgentId); + if (currentIndex === -1) return this.agentOrder[0] ?? null; + + const nextIndex = + (currentIndex + direction + this.agentOrder.length) % + this.agentOrder.length; + return this.agentOrder[nextIndex] ?? 
null; + } +} + +/** + * Create a per-agent Config that delegates to the shared base Config but + * overrides key methods to provide per-agent isolation: + * + * - `getWorkingDir()` / `getTargetDir()` → agent's worktree cwd + * - `getWorkspaceContext()` → WorkspaceContext rooted at agent's cwd + * - `getFileService()` → FileDiscoveryService rooted at agent's cwd + * (so .qwenignore checks resolve against the agent's worktree) + * - `getToolRegistry()` → per-agent tool registry with core tools bound to + * the agent Config (so tools resolve paths against the agent's worktree) + * - `getContentGenerator()` / `getContentGeneratorConfig()` / `getAuthType()` + * → per-agent ContentGenerator when `authOverrides` is provided, enabling + * agents to target different model providers in the same Arena session + * + * Uses prototypal delegation so all other Config methods/properties resolve + * against the original instance transparently. + */ +async function createPerAgentConfig( + base: Config, + cwd: string, + modelId?: string, + authOverrides?: InProcessSpawnConfig['authOverrides'], +): Promise<Config> { + // eslint-disable-next-line @typescript-eslint/no-explicit-any + const override = Object.create(base) as any; + + override.getWorkingDir = () => cwd; + override.getTargetDir = () => cwd; + override.getProjectRoot = () => cwd; + + const agentWorkspace = new WorkspaceContext(cwd); + override.getWorkspaceContext = () => agentWorkspace; + + const agentFileService = new FileDiscoveryService(cwd); + override.getFileService = () => agentFileService; + + // Build a per-agent tool registry: core tools are constructed with + // the per-agent Config so they resolve paths against cwd. Discovered + // (MCP/command) tools are copied from the parent registry as-is.
+ const agentRegistry: ToolRegistry = await override.createToolRegistry( + undefined, + { skipDiscovery: true }, + ); + agentRegistry.copyDiscoveredToolsFrom(base.getToolRegistry()); + override.getToolRegistry = () => agentRegistry; + + // Build a per-agent ContentGenerator when auth overrides are provided. + // This enables Arena agents to use different providers (OpenAI, Anthropic, + // Gemini, etc.) than the parent process. + if (authOverrides?.authType) { + try { + const agentGeneratorConfig = buildAgentContentGeneratorConfig( + base, + modelId, + authOverrides, + ); + const agentGenerator = await createContentGenerator( + agentGeneratorConfig, + override as Config, + ); + override.getContentGenerator = (): ContentGenerator => agentGenerator; + override.getContentGeneratorConfig = (): ContentGeneratorConfig => + agentGeneratorConfig; + override.getAuthType = (): AuthType | undefined => + agentGeneratorConfig.authType; + override.getModel = (): string => agentGeneratorConfig.model; + + debugLogger.info( + `Created per-agent ContentGenerator: authType=${authOverrides.authType}, model=${agentGeneratorConfig.model}`, + ); + } catch (error) { + debugLogger.error( + 'Failed to create per-agent ContentGenerator, falling back to parent:', + error, + ); + } + } + + return override as Config; +} + +/** + * Build a ContentGeneratorConfig for a per-agent ContentGenerator. + * Inherits operational settings (timeout, retries, proxy, sampling, etc.) + * from the parent's config and overlays the agent-specific auth fields. + * + * For cross-provider agents the parent's API key / base URL are invalid, + * so we resolve credentials from the provider-specific environment + * variables (e.g. ANTHROPIC_API_KEY, ANTHROPIC_BASE_URL). This mirrors + * what a PTY subprocess does during its own initialization. 
+ */ +function buildAgentContentGeneratorConfig( + base: Config, + modelId: string | undefined, + authOverrides: NonNullable<InProcessSpawnConfig['authOverrides']>, +): ContentGeneratorConfig { + const parentConfig = base.getContentGeneratorConfig(); + const sameProvider = authOverrides.authType === parentConfig.authType; + + const resolvedApiKey = resolveCredentialField( + authOverrides.apiKey, + sameProvider ? parentConfig.apiKey : undefined, + authOverrides.authType, + 'apiKey', + ); + + const resolvedBaseUrl = resolveCredentialField( + authOverrides.baseUrl, + sameProvider ? parentConfig.baseUrl : undefined, + authOverrides.authType, + 'baseUrl', + ); + + return { + ...parentConfig, + model: modelId ?? parentConfig.model, + authType: authOverrides.authType as AuthType, + apiKey: resolvedApiKey, + baseUrl: resolvedBaseUrl, + }; +} + +/** + * Resolve a credential field (apiKey or baseUrl) with the following + * priority: explicit override → same-provider parent value → env var. + */ +function resolveCredentialField( + explicitValue: string | undefined, + inheritedValue: string | undefined, + authType: string, + field: 'apiKey' | 'baseUrl', +): string | undefined { + if (explicitValue) return explicitValue; + if (inheritedValue) return inheritedValue; + + const envMapping = + AUTH_ENV_MAPPINGS[authType as keyof typeof AUTH_ENV_MAPPINGS]; + if (!envMapping) return undefined; + + for (const envKey of envMapping[field]) { + const value = process.env[envKey]; + if (value) return value; + } + return undefined; +} diff --git a/packages/core/src/agents/backends/TmuxBackend.test.ts b/packages/core/src/agents/backends/TmuxBackend.test.ts new file mode 100644 index 0000000000..39a96785df --- /dev/null +++ b/packages/core/src/agents/backends/TmuxBackend.test.ts @@ -0,0 +1,482 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest'; +import type { AgentSpawnConfig } from './types.js'; + +// ───
Hoisted mocks for tmux-commands ──────────────────────────── +const hoistedVerifyTmux = vi.hoisted(() => vi.fn()); +const hoistedTmuxCurrentPaneId = vi.hoisted(() => vi.fn()); +const hoistedTmuxCurrentWindowTarget = vi.hoisted(() => vi.fn()); +const hoistedTmuxHasSession = vi.hoisted(() => vi.fn()); +const hoistedTmuxHasWindow = vi.hoisted(() => vi.fn()); +const hoistedTmuxNewSession = vi.hoisted(() => vi.fn()); +const hoistedTmuxNewWindow = vi.hoisted(() => vi.fn()); +const hoistedTmuxSplitWindow = vi.hoisted(() => vi.fn()); +const hoistedTmuxSendKeys = vi.hoisted(() => vi.fn()); +const hoistedTmuxSelectPane = vi.hoisted(() => vi.fn()); +const hoistedTmuxSelectPaneTitle = vi.hoisted(() => vi.fn()); +const hoistedTmuxSelectPaneStyle = vi.hoisted(() => vi.fn()); +const hoistedTmuxSelectLayout = vi.hoisted(() => vi.fn()); +const hoistedTmuxListPanes = vi.hoisted(() => vi.fn()); +const hoistedTmuxSetOption = vi.hoisted(() => vi.fn()); +const hoistedTmuxRespawnPane = vi.hoisted(() => vi.fn()); +const hoistedTmuxKillPane = vi.hoisted(() => vi.fn()); +const hoistedTmuxKillSession = vi.hoisted(() => vi.fn()); +const hoistedTmuxResizePane = vi.hoisted(() => vi.fn()); +const hoistedTmuxGetFirstPaneId = vi.hoisted(() => vi.fn()); + +vi.mock('./tmux-commands.js', () => ({ + verifyTmux: hoistedVerifyTmux, + tmuxCurrentPaneId: hoistedTmuxCurrentPaneId, + tmuxCurrentWindowTarget: hoistedTmuxCurrentWindowTarget, + tmuxHasSession: hoistedTmuxHasSession, + tmuxHasWindow: hoistedTmuxHasWindow, + tmuxNewSession: hoistedTmuxNewSession, + tmuxNewWindow: hoistedTmuxNewWindow, + tmuxSplitWindow: hoistedTmuxSplitWindow, + tmuxSendKeys: hoistedTmuxSendKeys, + tmuxSelectPane: hoistedTmuxSelectPane, + tmuxSelectPaneTitle: hoistedTmuxSelectPaneTitle, + tmuxSelectPaneStyle: hoistedTmuxSelectPaneStyle, + tmuxSelectLayout: hoistedTmuxSelectLayout, + tmuxListPanes: hoistedTmuxListPanes, + tmuxSetOption: hoistedTmuxSetOption, + tmuxRespawnPane: hoistedTmuxRespawnPane, + tmuxKillPane: 
hoistedTmuxKillPane, + tmuxKillSession: hoistedTmuxKillSession, + tmuxResizePane: hoistedTmuxResizePane, + tmuxGetFirstPaneId: hoistedTmuxGetFirstPaneId, +})); + +// Mock the debug logger +vi.mock('../../utils/debugLogger.js', () => ({ + createDebugLogger: () => ({ + info: vi.fn(), + error: vi.fn(), + warn: vi.fn(), + }), +})); + +import { TmuxBackend } from './TmuxBackend.js'; + +function makeConfig( + agentId: string, + overrides?: Partial<AgentSpawnConfig>, +): AgentSpawnConfig { + return { + agentId, + command: '/usr/bin/node', + args: ['agent.js'], + cwd: '/tmp/test', + ...overrides, + }; +} + +/** + * Spawn an agent with fake timers active. The `sleep()` inside + * `spawnAgentAsync` uses `setTimeout`, so we must advance fake timers + * while the spawn promise is pending. + */ +async function spawnWithTimers( + backend: TmuxBackend, + config: AgentSpawnConfig, +): Promise<void> { + const promise = backend.spawnAgent(config); + // Advance past INTERNAL_LAYOUT_SETTLE_MS (200) / EXTERNAL_LAYOUT_SETTLE_MS (120) + // and the 100ms triggerMainProcessRedraw timeout + await vi.advanceTimersByTimeAsync(300); + await promise; +} + +function setupDefaultMocks(): void { + hoistedVerifyTmux.mockResolvedValue(undefined); + hoistedTmuxHasSession.mockResolvedValue(false); + hoistedTmuxHasWindow.mockResolvedValue(false); + hoistedTmuxNewSession.mockResolvedValue(undefined); + hoistedTmuxNewWindow.mockResolvedValue(undefined); + hoistedTmuxGetFirstPaneId.mockResolvedValue('%0'); + hoistedTmuxRespawnPane.mockResolvedValue(undefined); + hoistedTmuxSplitWindow.mockResolvedValue('%1'); + hoistedTmuxSetOption.mockResolvedValue(undefined); + hoistedTmuxSelectPaneTitle.mockResolvedValue(undefined); + hoistedTmuxSelectPaneStyle.mockResolvedValue(undefined); + hoistedTmuxSelectLayout.mockResolvedValue(undefined); + hoistedTmuxSelectPane.mockResolvedValue(undefined); + hoistedTmuxResizePane.mockResolvedValue(undefined); + hoistedTmuxListPanes.mockResolvedValue([]); +
hoistedTmuxSendKeys.mockResolvedValue(undefined); + hoistedTmuxKillPane.mockResolvedValue(undefined); + hoistedTmuxKillSession.mockResolvedValue(undefined); + hoistedTmuxCurrentPaneId.mockResolvedValue('%0'); + hoistedTmuxCurrentWindowTarget.mockResolvedValue('main:0'); +} + +describe('TmuxBackend', () => { + let backend: TmuxBackend; + let savedTmuxEnv: string | undefined; + + beforeEach(() => { + vi.useFakeTimers(); + savedTmuxEnv = process.env['TMUX']; + // Default: running outside tmux + delete process.env['TMUX']; + setupDefaultMocks(); + backend = new TmuxBackend(); + }); + + afterEach(async () => { + await backend.cleanup(); + vi.restoreAllMocks(); + vi.useRealTimers(); + if (savedTmuxEnv !== undefined) { + process.env['TMUX'] = savedTmuxEnv; + } else { + delete process.env['TMUX']; + } + }); + + // ─── Initialization ───────────────────────────────────────── + + it('throws if spawnAgent is called before init', async () => { + await expect(backend.spawnAgent(makeConfig('a1'))).rejects.toThrow( + 'not initialized', + ); + }); + + it('init verifies tmux availability', async () => { + await backend.init(); + expect(hoistedVerifyTmux).toHaveBeenCalled(); + }); + + it('init is idempotent', async () => { + await backend.init(); + await backend.init(); + expect(hoistedVerifyTmux).toHaveBeenCalledTimes(1); + }); + + // ─── Spawning (outside tmux) ────────────────────────────── + + it('spawns first agent outside tmux by respawning the initial pane', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('agent-1')); + + expect(hoistedTmuxNewSession).toHaveBeenCalled(); + expect(hoistedTmuxRespawnPane).toHaveBeenCalledWith( + '%0', + expect.any(String), + expect.any(String), + ); + expect(backend.getActiveAgentId()).toBe('agent-1'); + }); + + it('spawns second agent outside tmux by splitting', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('agent-1')); + + // For second agent, list-panes returns the first 
agent pane + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%2'); + + await spawnWithTimers(backend, makeConfig('agent-2')); + + expect(hoistedTmuxSplitWindow).toHaveBeenCalled(); + }); + + it('rejects duplicate agent IDs', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('dup')); + + await expect(backend.spawnAgent(makeConfig('dup'))).rejects.toThrow( + 'already exists', + ); + }); + + // ─── Spawning (inside tmux) ─────────────────────────────── + + it('spawns first agent inside tmux by splitting from main pane', async () => { + process.env['TMUX'] = '/tmp/tmux-1000/default,12345,0'; + backend = new TmuxBackend(); + await backend.init(); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%1'); + + await spawnWithTimers(backend, makeConfig('agent-1')); + + // Should have split horizontally with firstSplitPercent + expect(hoistedTmuxSplitWindow).toHaveBeenCalledWith( + '%0', + expect.objectContaining({ horizontal: true, percent: 70 }), + ); + // Should refocus on main pane (inside tmux, no server name arg) + expect(hoistedTmuxSelectPane).toHaveBeenCalledWith('%0'); + }); + + // ─── Navigation ─────────────────────────────────────────── + + it('switchTo changes active agent', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%2'); + await spawnWithTimers(backend, makeConfig('b')); + + backend.switchTo('b'); + expect(backend.getActiveAgentId()).toBe('b'); + }); + + it('switchTo throws for unknown agent', async () => { + await backend.init(); + expect(() => backend.switchTo('ghost')).toThrow('not found'); + }); + + it('switchToNext and switchToPrevious cycle correctly', async 
() => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%2'); + await spawnWithTimers(backend, makeConfig('b')); + + expect(backend.getActiveAgentId()).toBe('a'); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('b'); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('a'); + backend.switchToPrevious(); + expect(backend.getActiveAgentId()).toBe('b'); + }); + + it('switchToNext does nothing with a single agent', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('solo')); + backend.switchToNext(); + expect(backend.getActiveAgentId()).toBe('solo'); + }); + + // ─── Stop & Cleanup ────────────────────────────────────── + + it('stopAgent kills the pane', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + backend.stopAgent('a'); + expect(hoistedTmuxKillPane).toHaveBeenCalledWith('%0', expect.any(String)); + }); + + it('stopAll kills all running panes', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%2'); + await spawnWithTimers(backend, makeConfig('b')); + + backend.stopAll(); + // Should have killed both panes + expect(hoistedTmuxKillPane).toHaveBeenCalledTimes(2); + }); + + it('cleanup kills panes and the external session', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + await backend.cleanup(); + + expect(hoistedTmuxKillPane).toHaveBeenCalledWith('%0', expect.any(String)); + expect(hoistedTmuxKillSession).toHaveBeenCalled(); + expect(backend.getActiveAgentId()).toBeNull(); + }); + + it('cleanup does not kill session when running inside tmux', async () => { + 
process.env['TMUX'] = '/tmp/tmux-1000/default,12345,0'; + backend = new TmuxBackend(); + await backend.init(); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + hoistedTmuxSplitWindow.mockResolvedValue('%1'); + await spawnWithTimers(backend, makeConfig('a')); + + hoistedTmuxKillSession.mockClear(); + await backend.cleanup(); + + expect(hoistedTmuxKillSession).not.toHaveBeenCalled(); + }); + + // ─── Exit Detection (Bug #1: missing pane → exited) ────── + + it('marks agent as exited when pane disappears from tmux', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + // Polling returns no panes → agent's pane is gone + hoistedTmuxListPanes.mockResolvedValue([]); + + // Advance timer to trigger poll + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledWith('a', 1, null); + }); + + it('marks agent as exited when pane reports dead', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + // Polling returns the pane as dead with exit code 42 + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: true, deadStatus: 42 }, + ]); + + await vi.advanceTimersByTimeAsync(600); + + expect(exitCallback).toHaveBeenCalledWith('a', 42, null); + }); + + // ─── waitForAll (Bug #3: cleanup resolves waiters) ──────── + + it('waitForAll resolves when all agents exit', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: true, deadStatus: 0 }, + ]); + + const waitPromise = backend.waitForAll(); + + await vi.advanceTimersByTimeAsync(600); + + const result = await waitPromise; + expect(result).toBe(true); + }); + + it('waitForAll resolves after cleanup is called', async () => 
{ + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + // Pane stays alive — without cleanup, waitForAll would hang + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + + const waitPromise = backend.waitForAll(); + + // Advance a bit (poll runs but agent still alive) + await vi.advanceTimersByTimeAsync(600); + + // Now cleanup + await backend.cleanup(); + + // Advance again so the waitForAll interval fires + await vi.advanceTimersByTimeAsync(600); + + const result = await waitPromise; + // The key thing is the promise resolves instead of hanging forever. + // allExited() returns true since panes were cleared in cleanup. + expect(result).toBe(true); + }); + + it('waitForAll returns false on timeout', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + // Pane stays alive + hoistedTmuxListPanes.mockResolvedValue([ + { paneId: '%0', dead: false, deadStatus: 0 }, + ]); + + const waitPromise = backend.waitForAll(1000); + + await vi.advanceTimersByTimeAsync(1100); + + const result = await waitPromise; + expect(result).toBe(false); + }); + + // ─── Input ──────────────────────────────────────────────── + + it('forwardInput sends literal keys to active agent pane', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + + const result = backend.forwardInput('hello'); + expect(result).toBe(true); + expect(hoistedTmuxSendKeys).toHaveBeenCalledWith( + '%0', + 'hello', + { literal: true }, + expect.any(String), + ); + }); + + it('forwardInput returns false with no active agent', async () => { + await backend.init(); + expect(backend.forwardInput('hello')).toBe(false); + }); + + // ─── Snapshots ──────────────────────────────────────────── + + it('getActiveSnapshot returns null (tmux handles rendering)', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + 
expect(backend.getActiveSnapshot()).toBeNull(); + }); + + it('getAgentScrollbackLength returns 0', async () => { + await backend.init(); + await spawnWithTimers(backend, makeConfig('a')); + expect(backend.getAgentScrollbackLength('a')).toBe(0); + }); + + // ─── getAttachHint ──────────────────────────────────────── + + it('returns attach command when outside tmux', async () => { + await backend.init(); + const hint = backend.getAttachHint(); + expect(hint).toMatch(/^tmux -L arena-server-\d+ a$/); + }); + + it('returns null when inside tmux', async () => { + process.env['TMUX'] = '/tmp/tmux-1000/default,12345,0'; + backend = new TmuxBackend(); + await backend.init(); + expect(backend.getAttachHint()).toBeNull(); + }); + + // ─── Spawn failure handling ─────────────────────────────── + + it('registers failed agent and fires exit callback on spawn error', async () => { + await backend.init(); + + // Make the external session setup fail + hoistedTmuxHasSession.mockRejectedValueOnce(new Error('tmux exploded')); + + const exitCallback = vi.fn(); + backend.setOnAgentExit(exitCallback); + + await spawnWithTimers(backend, makeConfig('fail')); + + expect(exitCallback).toHaveBeenCalledWith('fail', 1, null); + }); +}); diff --git a/packages/core/src/agents/backends/TmuxBackend.ts b/packages/core/src/agents/backends/TmuxBackend.ts new file mode 100644 index 0000000000..adc75593fc --- /dev/null +++ b/packages/core/src/agents/backends/TmuxBackend.ts @@ -0,0 +1,813 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview TmuxBackend implements Backend using tmux split-pane. + * + * Layout (inside tmux): main process on the left (leader pane ~30%), + * agent panes on the right, arranged via `main-vertical`. 
+ * + * ┌────────────┬──────────────────────────────────┐ + * │ │ Agent 1 │ + * │ Leader ├──────────────────────────────────┤ + * │ (30%) │ Agent 2 │ + * │ ├──────────────────────────────────┤ + * │ │ Agent 3 │ + * └────────────┴──────────────────────────────────┘ + * + * Outside tmux: a dedicated tmux server is created and panes are arranged + * using `tiled` layout in a separate session/window. + */ + +import { createDebugLogger } from '../../utils/debugLogger.js'; +import type { AnsiOutput } from '../../utils/terminalSerializer.js'; +import { DISPLAY_MODE } from './types.js'; +import type { AgentSpawnConfig, AgentExitCallback, Backend } from './types.js'; +import { + verifyTmux, + tmuxCurrentWindowTarget, + tmuxCurrentPaneId, + tmuxHasSession, + tmuxHasWindow, + tmuxNewSession, + tmuxNewWindow, + tmuxSplitWindow, + tmuxSendKeys, + tmuxSelectPane, + tmuxSelectPaneTitle, + tmuxSelectPaneStyle, + tmuxSelectLayout, + tmuxListPanes, + tmuxSetOption, + tmuxRespawnPane, + tmuxKillPane, + tmuxKillSession, + tmuxResizePane, + tmuxGetFirstPaneId, + type TmuxPaneInfo, +} from './tmux-commands.js'; + +const debugLogger = createDebugLogger('TMUX_BACKEND'); + +/** Polling interval for exit detection (ms) */ +const EXIT_POLL_INTERVAL_MS = 500; + +/** Default tmux server name prefix (for -L) when running outside tmux. + * Actual name is `${prefix}-${process.pid}` so each leader process is isolated. 
*/ +const TMUX_SERVER_PREFIX = 'arena-server'; +/** Default tmux session name when running outside tmux */ +const DEFAULT_TMUX_SESSION = 'arena-view'; +/** Default tmux window name when running outside tmux */ +const DEFAULT_TMUX_WINDOW = 'arena-view'; +/** Default leader pane width percent (main pane) */ +const DEFAULT_LEADER_WIDTH_PERCENT = 30; +/** Default first split percent (right side) */ +const DEFAULT_FIRST_SPLIT_PERCENT = 70; +/** Default pane border format */ +const DEFAULT_PANE_BORDER_FORMAT = '#{pane_title}'; +/** Layout settle delays */ +const INTERNAL_LAYOUT_SETTLE_MS = 200; +const EXTERNAL_LAYOUT_SETTLE_MS = 120; + +interface TmuxAgentPane { + agentId: string; + paneId: string; + status: 'running' | 'exited'; + exitCode: number; +} + +interface ResolvedTmuxOptions { + serverName: string; + sessionName: string; + windowName: string; + paneTitle: string; + paneBorderStyle?: string; + paneActiveBorderStyle?: string; + paneBorderFormat: string; + paneBorderStatus?: 'top' | 'bottom' | 'off'; + leaderPaneWidthPercent: number; + firstSplitPercent: number; +} + +export class TmuxBackend implements Backend { + readonly type = DISPLAY_MODE.TMUX; + + /** The pane ID where the main process runs (left side) */ + private mainPaneId = ''; + /** Window target (session:window) */ + private windowTarget = ''; + /** Whether we are running inside tmux */ + private insideTmux = false; + /** External tmux server name (when outside tmux) */ + private serverName: string | null = null; + /** External tmux session name (when outside tmux) */ + private sessionName: string | null = null; + /** External tmux window name (when outside tmux) */ + private windowName: string | null = null; + + private panes: Map<string, TmuxAgentPane> = new Map(); + private agentOrder: string[] = []; + private activeAgentId: string | null = null; + private onExitCallback: AgentExitCallback | null = null; + private exitPollTimer: NodeJS.Timeout | null = null; + private initialized = false; + /** Whether cleanup() has been
called */ + private cleanedUp = false; + /** Number of agents currently being spawned asynchronously */ + private pendingSpawns = 0; + /** Queue to serialize spawn operations (prevents race conditions) */ + private spawnQueue: Promise<void> = Promise.resolve(); + async init(): Promise<void> { + if (this.initialized) return; + + // Verify tmux is available and version is sufficient + await verifyTmux(); + + this.insideTmux = Boolean(process.env['TMUX']); + + if (this.insideTmux) { + // Get the current pane ID (this is where the main process runs) + this.mainPaneId = await tmuxCurrentPaneId(); + this.windowTarget = await tmuxCurrentWindowTarget(); + debugLogger.info( + `Initialized inside tmux: pane ${this.mainPaneId}, window ${this.windowTarget}`, + ); + } else { + debugLogger.info( + 'Initialized outside tmux; will use external tmux server', + ); + } + + this.initialized = true; + } + + // ─── Agent Lifecycle ──────────────────────────────────────── + + async spawnAgent(config: AgentSpawnConfig): Promise<void> { + if (!this.initialized) { + throw new Error('TmuxBackend not initialized. Call init() first.'); + } + if (this.panes.has(config.agentId)) { + throw new Error(`Agent "${config.agentId}" already exists.`); + } + + // Build the shell command string for the agent + const cmd = this.buildShellCommand(config); + + // Track pending spawn so waitForAll/allExited don't return + // prematurely before the pane is registered. + this.pendingSpawns++; + + // Chain spawn operations to ensure they run sequentially. + // This prevents race conditions where multiple agents all see + // panes.size === 0 and try to split from mainPaneId.
+ const spawnPromise = this.spawnQueue.then(() => + this.spawnAgentAsync(config, cmd), + ); + this.spawnQueue = spawnPromise; + + // Wait for this specific spawn to complete + await spawnPromise; + } + + private async spawnAgentAsync( + config: AgentSpawnConfig, + cmd: string, + ): Promise<void> { + const { agentId } = config; + const options = this.resolveTmuxOptions(config); + + debugLogger.info( + `[spawnAgentAsync] Starting spawn for agent "${agentId}", mainPane="${this.mainPaneId}", currentPanesCount=${this.panes.size}`, + ); + try { + let paneId = ''; + if (this.insideTmux) { + paneId = await this.spawnInsideTmux(cmd, options); + } else { + paneId = await this.spawnOutsideTmux(config, cmd, options); + } + + const serverName = this.getServerName(); + + // Set remain-on-exit so we can detect when the process exits + await tmuxSetOption(paneId, 'remain-on-exit', 'on', serverName); + + // Apply pane title/border styling + await this.applyPaneDecorations(paneId, options, serverName); + + if (this.insideTmux) { + await this.applyInsideLayout(options); + await this.sleep(INTERNAL_LAYOUT_SETTLE_MS); + // Keep focus on the main pane + await tmuxSelectPane(this.mainPaneId); + this.triggerMainProcessRedraw(); + } else { + await this.applyExternalLayout(serverName); + await this.sleep(EXTERNAL_LAYOUT_SETTLE_MS); + } + + const agentPane: TmuxAgentPane = { + agentId, + paneId, + status: 'running', + exitCode: 0, + }; + + this.panes.set(agentId, agentPane); + this.agentOrder.push(agentId); + + // First agent becomes active + if (this.activeAgentId === null) { + this.activeAgentId = agentId; + } + + // Start exit polling if not already running + this.startExitPolling(); + + debugLogger.info( + `[spawnAgentAsync] Spawned agent "${agentId}" in pane ${paneId} — SUCCESS`, + ); + } catch (error) { + debugLogger.error( + `[spawnAgentAsync] Failed to spawn agent "${agentId}":`, + error, + ); + // Still register the agent as failed so exit callback fires + this.panes.set(agentId, { +
agentId, + paneId: '', + status: 'exited', + exitCode: 1, + }); + this.agentOrder.push(agentId); + this.onExitCallback?.(agentId, 1, null); + } finally { + this.pendingSpawns--; + } + } + + /** + * Trigger terminal redraw in main process after pane layout changes. + * Uses multiple methods to ensure Ink picks up the new terminal size. + */ + private triggerMainProcessRedraw(): void { + if (!this.insideTmux) return; + // Small delay to let tmux finish the resize operation + setTimeout(() => { + try { + // Method 1: Emit resize event on stdout (Ink listens to this) + if (process.stdout.isTTY) { + process.stdout.emit('resize'); + debugLogger.info( + '[triggerMainProcessRedraw] Emitted stdout resize event', + ); + } + + // Method 2: Send SIGWINCH signal + process.kill(process.pid, 'SIGWINCH'); + debugLogger.info('[triggerMainProcessRedraw] Sent SIGWINCH'); + } catch (error) { + debugLogger.info(`[triggerMainProcessRedraw] Failed: ${error}`); + } + }, 100); + } + + stopAgent(agentId: string): void { + const pane = this.panes.get(agentId); + if (!pane || pane.status !== 'running') return; + // Kill the pane outright — a single Ctrl-C only cancels the current + // turn in interactive CLI agents and does not reliably exit the process. 
+ if (pane.paneId) { + void tmuxKillPane(pane.paneId, this.getServerName()); + } + pane.status = 'exited'; + debugLogger.info(`Killed pane for agent "${agentId}"`); + } + + stopAll(): void { + for (const [agentId, pane] of this.panes.entries()) { + if (pane.status === 'running') { + if (pane.paneId) { + void tmuxKillPane(pane.paneId, this.getServerName()); + } + pane.status = 'exited'; + debugLogger.info(`Killed pane for agent "${agentId}"`); + } + } + } + + async cleanup(): Promise<void> { + this.cleanedUp = true; + this.stopExitPolling(); + + // Kill all agent panes (but not the main pane) + for (const pane of this.panes.values()) { + if (pane.paneId) { + try { + await tmuxKillPane(pane.paneId, this.getServerName()); + debugLogger.info(`Killed agent pane ${pane.paneId}`); + } catch (_error) { + // Pane may already be gone + debugLogger.info( + `Failed to kill pane ${pane.paneId} (may already be gone)`, + ); + } + } + } + + // Kill the external tmux session/server if we created one + if (!this.insideTmux && this.sessionName && this.serverName) { + try { + await tmuxKillSession(this.sessionName, this.serverName); + debugLogger.info( + `Killed external tmux session "${this.sessionName}" on server "${this.serverName}"`, + ); + } catch (_error) { + debugLogger.info( + `Failed to kill external tmux session (may already be gone)`, + ); + } + } + + this.panes.clear(); + this.agentOrder = []; + this.activeAgentId = null; + this.serverName = null; + this.sessionName = null; + this.windowName = null; + this.windowTarget = ''; + this.mainPaneId = ''; + } + + setOnAgentExit(callback: AgentExitCallback): void { + this.onExitCallback = callback; + } + + async waitForAll(timeoutMs?: number): Promise<boolean> { + if (this.allExited() || this.cleanedUp) return this.allExited(); + + return new Promise<boolean>((resolve) => { + let timeoutHandle: NodeJS.Timeout | undefined; + + const checkInterval = setInterval(() => { + if (this.allExited() || this.cleanedUp) { + clearInterval(checkInterval); + if
(timeoutHandle) clearTimeout(timeoutHandle); + resolve(this.allExited()); + } + }, EXIT_POLL_INTERVAL_MS); + + if (timeoutMs !== undefined) { + timeoutHandle = setTimeout(() => { + clearInterval(checkInterval); + resolve(false); + }, timeoutMs); + } + }); + } + + // ─── Active Agent & Navigation ────────────────────────────── + + switchTo(agentId: string): void { + if (!this.panes.has(agentId)) { + throw new Error(`Agent "${agentId}" not found.`); + } + const pane = this.panes.get(agentId)!; + this.activeAgentId = agentId; + void tmuxSelectPane(pane.paneId, this.getServerName()); + } + + switchToNext(): void { + if (this.agentOrder.length <= 1) return; + const currentIndex = this.agentOrder.indexOf(this.activeAgentId ?? ''); + const nextIndex = (currentIndex + 1) % this.agentOrder.length; + this.switchTo(this.agentOrder[nextIndex]!); + } + + switchToPrevious(): void { + if (this.agentOrder.length <= 1) return; + const currentIndex = this.agentOrder.indexOf(this.activeAgentId ?? ''); + const prevIndex = + (currentIndex - 1 + this.agentOrder.length) % this.agentOrder.length; + this.switchTo(this.agentOrder[prevIndex]!); + } + + getActiveAgentId(): string | null { + return this.activeAgentId; + } + + // ─── Screen Capture ───────────────────────────────────────── + + getActiveSnapshot(): AnsiOutput | null { + if (!this.activeAgentId) return null; + return this.getAgentSnapshot(this.activeAgentId); + } + + getAgentSnapshot( + agentId: string, + _scrollOffset: number = 0, + ): AnsiOutput | null { + // tmux panes are rendered by tmux itself. capture-pane is available + // but returns raw text. For the progress bar we don't need snapshots; + // full rendering is handled by tmux directly. + // Return null — the UI doesn't use snapshots for split-pane backends. 
+ return null; + } + + getAgentScrollbackLength(_agentId: string): number { + // Scrollback is managed by tmux, not by us + return 0; + } + + // ─── Input ────────────────────────────────────────────────── + + forwardInput(data: string): boolean { + if (!this.activeAgentId) return false; + return this.writeToAgent(this.activeAgentId, data); + } + + writeToAgent(agentId: string, data: string): boolean { + const pane = this.panes.get(agentId); + if (!pane || pane.status !== 'running') return false; + void tmuxSendKeys( + pane.paneId, + data, + { literal: true }, + this.getServerName(), + ); + return true; + } + + // ─── Resize ───────────────────────────────────────────────── + + resizeAll(_cols: number, _rows: number): void { + // tmux manages pane sizes automatically based on the terminal window + } + + // ─── External Session Info ───────────────────────────────── + + getAttachHint(): string | null { + if (this.insideTmux) { + return null; + } + // When outside tmux, the server name is determined at init time + // (per-process unique). Return the attach command even before + // ensureExternalSession runs, since the server name is deterministic. + const server = this.serverName ?? `${TMUX_SERVER_PREFIX}-${process.pid}`; + return `tmux -L ${server} a`; + } + + // ─── Private ──────────────────────────────────────────────── + + private resolveTmuxOptions(config: AgentSpawnConfig): ResolvedTmuxOptions { + const opts = config.backend?.tmux ?? {}; + return { + serverName: opts.serverName ?? `${TMUX_SERVER_PREFIX}-${process.pid}`, + sessionName: opts.sessionName ?? DEFAULT_TMUX_SESSION, + windowName: opts.windowName ?? DEFAULT_TMUX_WINDOW, + paneTitle: opts.paneTitle ?? config.agentId, + paneBorderStyle: opts.paneBorderStyle, + paneActiveBorderStyle: opts.paneActiveBorderStyle, + paneBorderFormat: opts.paneBorderFormat ?? DEFAULT_PANE_BORDER_FORMAT, + paneBorderStatus: + opts.paneBorderStatus ?? (this.insideTmux ? 
undefined : 'top'), + leaderPaneWidthPercent: + opts.leaderPaneWidthPercent ?? DEFAULT_LEADER_WIDTH_PERCENT, + firstSplitPercent: opts.firstSplitPercent ?? DEFAULT_FIRST_SPLIT_PERCENT, + }; + } + + private getServerName(): string | undefined { + return this.insideTmux ? undefined : (this.serverName ?? undefined); + } + + private async ensureExternalSession( + config: AgentSpawnConfig, + options: ResolvedTmuxOptions, + ): Promise<void> { + if ( + this.windowTarget && + this.serverName && + this.sessionName && + this.windowName + ) { + return; + } + + this.serverName = options.serverName; + this.sessionName = options.sessionName; + this.windowName = options.windowName; + + const serverName = this.serverName; + const sessionExists = await tmuxHasSession(this.sessionName, serverName); + + if (!sessionExists) { + await tmuxNewSession( + this.sessionName, + { + cols: config.cols, + rows: config.rows, + windowName: this.windowName, + }, + serverName, + ); + } + + const windowExists = sessionExists + ?
await tmuxHasWindow(this.sessionName, this.windowName, serverName) + : true; + + if (!windowExists) { + await tmuxNewWindow(this.sessionName, this.windowName, serverName); + } + + this.windowTarget = `${this.sessionName}:${this.windowName}`; + + if (!this.mainPaneId) { + this.mainPaneId = await tmuxGetFirstPaneId(this.windowTarget, serverName); + } + } + + private async spawnInsideTmux( + cmd: string, + options: ResolvedTmuxOptions, + ): Promise<string> { + if (!this.windowTarget) { + throw new Error('Tmux window target not initialized.'); + } + + const panes = await tmuxListPanes(this.windowTarget); + const paneCount = panes.length; + if (paneCount === 1) { + debugLogger.info( + `[spawnInsideTmux] First agent — split -h -l ${options.firstSplitPercent}% from ${this.mainPaneId}`, + ); + return await tmuxSplitWindow(this.mainPaneId, { + horizontal: true, + percent: options.firstSplitPercent, + command: cmd, + }); + } + + const splitTarget = this.pickMiddlePane(panes).paneId; + const horizontal = this.shouldSplitHorizontally(paneCount); + debugLogger.info( + `[spawnInsideTmux] Split from middle pane ${splitTarget} (${paneCount} panes, ${horizontal ?
'horizontal' : 'vertical'})`, + ); + return await tmuxSplitWindow(splitTarget, { + horizontal, + command: cmd, + }); + } + + private async spawnOutsideTmux( + config: AgentSpawnConfig, + cmd: string, + options: ResolvedTmuxOptions, + ): Promise<string> { + await this.ensureExternalSession(config, options); + if (!this.windowTarget) { + throw new Error('External tmux window target not initialized.'); + } + + const serverName = this.getServerName(); + + if (this.panes.size === 0) { + const firstPaneId = await tmuxGetFirstPaneId( + this.windowTarget, + serverName, + ); + this.mainPaneId = firstPaneId; + debugLogger.info( + `[spawnOutsideTmux] First agent — respawn in pane ${firstPaneId}`, + ); + await tmuxRespawnPane(firstPaneId, cmd, serverName); + return firstPaneId; + } + + const panes = await tmuxListPanes(this.windowTarget, serverName); + const splitTarget = this.pickMiddlePane(panes).paneId; + const horizontal = this.shouldSplitHorizontally(panes.length); + debugLogger.info( + `[spawnOutsideTmux] Split from middle pane ${splitTarget} (${panes.length} panes, ${horizontal ?
'horizontal' : 'vertical'})`, + ); + return await tmuxSplitWindow( + splitTarget, + { horizontal, command: cmd }, + serverName, + ); + } + + private pickMiddlePane(panes: TmuxPaneInfo[]): TmuxPaneInfo { + if (panes.length === 0) { + throw new Error('No panes available to split.'); + } + return panes[Math.floor(panes.length / 2)]!; + } + + private shouldSplitHorizontally(paneCount: number): boolean { + return paneCount % 2 === 1; + } + + private async applyPaneDecorations( + paneId: string, + options: ResolvedTmuxOptions, + serverName?: string, + ): Promise<void> { + if (!this.windowTarget) return; + + if (options.paneBorderStatus) { + await tmuxSetOption( + this.windowTarget, + 'pane-border-status', + options.paneBorderStatus, + serverName, + ); + } + + if (options.paneBorderFormat) { + await tmuxSetOption( + this.windowTarget, + 'pane-border-format', + options.paneBorderFormat, + serverName, + ); + } + + if (options.paneBorderStyle) { + await tmuxSetOption( + this.windowTarget, + 'pane-border-style', + options.paneBorderStyle, + serverName, + ); + await tmuxSelectPaneStyle(paneId, options.paneBorderStyle, serverName); + } + + if (options.paneActiveBorderStyle) { + await tmuxSetOption( + this.windowTarget, + 'pane-active-border-style', + options.paneActiveBorderStyle, + serverName, + ); + } + + await tmuxSelectPaneTitle(paneId, options.paneTitle, serverName); + } + + private async applyInsideLayout(options: ResolvedTmuxOptions): Promise<void> { + if (!this.windowTarget || !this.mainPaneId) return; + await tmuxSelectLayout(this.windowTarget, 'main-vertical'); + await tmuxResizePane(this.mainPaneId, { + width: `${options.leaderPaneWidthPercent}%`, + }); + } + + private async applyExternalLayout(serverName?: string): Promise<void> { + if (!this.windowTarget) return; + await tmuxSelectLayout(this.windowTarget, 'tiled', serverName); + } + + private async sleep(ms: number): Promise<void> { + await new Promise((resolve) => setTimeout(resolve, ms)); + } + + private buildShellCommand(config:
AgentSpawnConfig): string { + // Build env prefix + command + args + const envParts: string[] = []; + if (config.env) { + for (const [key, value] of Object.entries(config.env)) { + envParts.push(`${key}=${shellQuote(value)}`); + } + } + + const cmdParts = [ + shellQuote(config.command), + ...config.args.map(shellQuote), + ]; + + // cd to the working directory first + const parts = [`cd ${shellQuote(config.cwd)}`]; + if (envParts.length > 0) { + parts.push(`env ${envParts.join(' ')} ${cmdParts.join(' ')}`); + } else { + parts.push(cmdParts.join(' ')); + } + + const fullCommand = parts.join(' && '); + debugLogger.info( + `[buildShellCommand] agentId=${config.agentId}, command=${config.command}, args=${JSON.stringify(config.args)}, cwd=${config.cwd}`, + ); + debugLogger.info(`[buildShellCommand] full shell command: ${fullCommand}`); + return fullCommand; + } + + private allExited(): boolean { + if (this.pendingSpawns > 0) return false; + if (this.panes.size === 0) return true; + for (const pane of this.panes.values()) { + if (pane.status === 'running') return false; + } + return true; + } + + private startExitPolling(): void { + if (this.exitPollTimer) return; + + this.exitPollTimer = setInterval(() => { + void this.pollPaneStatus(); + }, EXIT_POLL_INTERVAL_MS); + } + + private stopExitPolling(): void { + if (this.exitPollTimer) { + clearInterval(this.exitPollTimer); + this.exitPollTimer = null; + } + } + + private async pollPaneStatus(): Promise<void> { + let paneInfos: TmuxPaneInfo[]; + const serverName = this.getServerName(); + try { + if (!this.windowTarget) return; + // List panes in the active window + paneInfos = await tmuxListPanes(this.windowTarget, serverName); + } catch (err) { + // Window may have been killed externally + debugLogger.info( + `[pollPaneStatus] Failed to list panes for window "${this.windowTarget}": ${err}`, + ); + return; + } + + // Build a lookup: paneId → TmuxPaneInfo + const paneMap = new Map<string, TmuxPaneInfo>(); + for (const info of paneInfos) { +
paneMap.set(info.paneId, info); + } + + // Log all pane statuses for debugging (only when there are agent panes) + if (this.panes.size > 0) { + debugLogger.info( + `[pollPaneStatus] paneCount=${paneInfos.length}, agentPanes=${JSON.stringify( + Array.from(this.panes.values()).map((p) => { + const info = paneMap.get(p.paneId); + return { + agentId: p.agentId, + paneId: p.paneId, + status: p.status, + dead: info?.dead, + deadStatus: info?.deadStatus, + }; + }), + )}`, + ); + } + + for (const agent of this.panes.values()) { + if (agent.status !== 'running') continue; + + const info = paneMap.get(agent.paneId); + if (!info) { + // Pane was killed externally — treat as exited + agent.status = 'exited'; + agent.exitCode = 1; + debugLogger.info( + `[pollPaneStatus] Agent "${agent.agentId}" pane ${agent.paneId} not found in tmux list — marking as exited`, + ); + this.onExitCallback?.(agent.agentId, 1, null); + continue; + } + + if (info.dead) { + agent.status = 'exited'; + agent.exitCode = info.deadStatus; + + debugLogger.info( + `[pollPaneStatus] Agent "${agent.agentId}" (pane ${agent.paneId}) detected as DEAD with exit code ${info.deadStatus}`, + ); + + this.onExitCallback?.(agent.agentId, info.deadStatus, null); + } + } + + // Stop polling if all agents have exited + if (this.allExited()) { + this.stopExitPolling(); + } + } +} + +/** + * Simple shell quoting for building command strings. + * Wraps value in single quotes, escaping any internal single quotes. 
*/ +function shellQuote(value: string): string { + return `'${value.replace(/'/g, "'\\''")}'`; +} diff --git a/packages/core/src/agents/backends/detect.ts b/packages/core/src/agents/backends/detect.ts new file mode 100644 index 0000000000..f94d8c41d1 --- /dev/null +++ b/packages/core/src/agents/backends/detect.ts @@ -0,0 +1,88 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { createDebugLogger } from '../../utils/debugLogger.js'; +import type { Config } from '../../config/config.js'; +// import { TmuxBackend } from './TmuxBackend.js'; +import { InProcessBackend } from './InProcessBackend.js'; +import { type Backend, DISPLAY_MODE, type DisplayMode } from './types.js'; +// import { isTmuxAvailable } from './tmux-commands.js'; + +const debugLogger = createDebugLogger('BACKEND_DETECT'); + +export interface DetectBackendResult { + backend: Backend; + warning?: string; +} + +/** + * Detect and create the appropriate Backend. + * + * Detection priority (disabled for now; InProcessBackend is always used): + * 1. User explicit preference (--display=in-process|tmux|iterm2) + * 2. Auto-detect: + * - inside tmux: TmuxBackend + * - other terminals: tmux external session mode when tmux is available + * - fallback to InProcessBackend + * + * @param preference - Optional display mode preference + * @param runtimeContext - Runtime config for in-process fallback + */ +export async function detectBackend( + preference: DisplayMode | undefined, + runtimeContext: Config, +): Promise<DetectBackendResult> { + // Currently only in-process mode is supported. Other backends (tmux, + // iterm2) are kept in the codebase but not wired up as entry points. + const warning = + preference && preference !== DISPLAY_MODE.IN_PROCESS + ? `Display mode "${preference}" is not currently supported.
Using in-process mode instead.` + : undefined; + debugLogger.info('Using InProcessBackend'); + return { backend: new InProcessBackend(runtimeContext), warning }; + + // --- Disabled backends (kept for future use) --- + // // 1. User explicit preference + // if (preference === DISPLAY_MODE.IN_PROCESS) { + // debugLogger.info('Using InProcessBackend (user preference)'); + // return { backend: new InProcessBackend(runtimeContext) }; + // } + // + // if (preference === DISPLAY_MODE.ITERM2) { + // throw new Error( + // `Arena display mode "${DISPLAY_MODE.ITERM2}" is not implemented yet. Please use "${DISPLAY_MODE.TMUX}" or "${DISPLAY_MODE.IN_PROCESS}".`, + // ); + // } + // + // if (preference === DISPLAY_MODE.TMUX) { + // debugLogger.info('Using TmuxBackend (user preference)'); + // return { backend: new TmuxBackend() }; + // } + // + // // 2. Auto-detect + // if (process.env['TMUX']) { + // debugLogger.info('Detected $TMUX — attempting TmuxBackend'); + // return { backend: new TmuxBackend() }; + // } + // + // // Other terminals (including iTerm2): use tmux external session mode if available. + // if (isTmuxAvailable()) { + // debugLogger.info( + // 'tmux is available — using TmuxBackend external session mode', + // ); + // return { backend: new TmuxBackend() }; + // } + // + // // Fallback: use InProcessBackend + // debugLogger.info( + // 'No PTY backend available — falling back to InProcessBackend', + // ); + // return { + // backend: new InProcessBackend(runtimeContext), + // warning: + // 'tmux is not available. 
Using in-process mode (no split-pane terminal view).', + // }; +} diff --git a/packages/core/src/agents/backends/index.ts b/packages/core/src/agents/backends/index.ts new file mode 100644 index 0000000000..6105fe45c6 --- /dev/null +++ b/packages/core/src/agents/backends/index.ts @@ -0,0 +1,19 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +export { DISPLAY_MODE } from './types.js'; +export type { + Backend, + DisplayMode, + AgentSpawnConfig, + AgentExitCallback, + TmuxBackendOptions, + InProcessSpawnConfig, +} from './types.js'; +export { TmuxBackend } from './TmuxBackend.js'; +export { ITermBackend } from './ITermBackend.js'; +export { InProcessBackend } from './InProcessBackend.js'; +export { detectBackend, type DetectBackendResult } from './detect.js'; diff --git a/packages/core/src/agents/backends/iterm-it2.test.ts b/packages/core/src/agents/backends/iterm-it2.test.ts new file mode 100644 index 0000000000..7232536954 --- /dev/null +++ b/packages/core/src/agents/backends/iterm-it2.test.ts @@ -0,0 +1,318 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, beforeEach, vi } from 'vitest'; + +// ─── Hoisted mocks for shell-utils ────────────────────────────── +const hoistedExecCommand = vi.hoisted(() => vi.fn()); +const hoistedIsCommandAvailable = vi.hoisted(() => vi.fn()); + +vi.mock('../../utils/shell-utils.js', () => ({ + execCommand: hoistedExecCommand, + isCommandAvailable: hoistedIsCommandAvailable, +})); + +vi.mock('../../utils/debugLogger.js', () => ({ + createDebugLogger: () => ({ + info: vi.fn(), + error: vi.fn(), + warn: vi.fn(), + }), +})); + +import { + isIt2Available, + ensureIt2Installed, + verifyITerm, + itermSplitPane, + itermRunCommand, + itermFocusSession, + itermSendText, + itermCloseSession, +} from './iterm-it2.js'; + +describe('iterm-it2', () => { + beforeEach(() => { + vi.clearAllMocks(); + }); + + // ─── 
isIt2Available ───────────────────────────────────────── + + describe('isIt2Available', () => { + it('returns true when it2 is on PATH', () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + expect(isIt2Available()).toBe(true); + expect(hoistedIsCommandAvailable).toHaveBeenCalledWith('it2'); + }); + + it('returns false when it2 is not on PATH', () => { + hoistedIsCommandAvailable.mockReturnValue({ available: false }); + expect(isIt2Available()).toBe(false); + }); + }); + + // ─── ensureIt2Installed ────────────────────────────────────── + + describe('ensureIt2Installed', () => { + it('does nothing if it2 is already available', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + await ensureIt2Installed(); + expect(hoistedExecCommand).not.toHaveBeenCalled(); + }); + + it('installs via uv when uv is available', async () => { + // isIt2Available() → false; uv available; install succeeds; recheck → true + hoistedIsCommandAvailable + .mockReturnValueOnce({ available: false }) // isIt2Available() initial + .mockReturnValueOnce({ available: true }); // uv available + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + // After install, it2 is available + hoistedIsCommandAvailable.mockReturnValueOnce({ available: true }); + + await ensureIt2Installed(); + + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'uv', + ['tool', 'install', 'it2'], + expect.any(Object), + ); + }); + + it('falls back to pipx when uv is unavailable', async () => { + hoistedIsCommandAvailable + .mockReturnValueOnce({ available: false }) // isIt2Available() + .mockReturnValueOnce({ available: false }) // uv not available + .mockReturnValueOnce({ available: true }); // pipx available + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + hoistedIsCommandAvailable.mockReturnValueOnce({ available: true }); // recheck + + await ensureIt2Installed(); + + 
expect(hoistedExecCommand).toHaveBeenCalledWith( + 'pipx', + ['install', 'it2'], + expect.any(Object), + ); + }); + + it('falls back to pip when uv and pipx are unavailable', async () => { + hoistedIsCommandAvailable + .mockReturnValueOnce({ available: false }) // isIt2Available() + .mockReturnValueOnce({ available: false }) // uv + .mockReturnValueOnce({ available: false }) // pipx + .mockReturnValueOnce({ available: true }); // pip available + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + hoistedIsCommandAvailable.mockReturnValueOnce({ available: true }); // recheck + + await ensureIt2Installed(); + + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'pip', + ['install', '--user', 'it2'], + expect.any(Object), + ); + }); + + it('throws if no installer succeeds', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: false }); + + await expect(ensureIt2Installed()).rejects.toThrow( + 'it2 is not installed', + ); + }); + }); + + // ─── verifyITerm ────────────────────────────────────────────── + + describe('verifyITerm', () => { + it('succeeds when session list returns code 0', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: 'session1\n', + stderr: '', + }); + + await expect(verifyITerm()).resolves.toBeUndefined(); + }); + + it('throws Python API error when stderr mentions "api"', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + hoistedExecCommand.mockResolvedValue({ + code: 1, + stdout: '', + stderr: 'Python API not enabled', + }); + + await expect(verifyITerm()).rejects.toThrow('Python API not enabled'); + }); + + it('throws Python API error when stderr mentions "connection refused"', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + hoistedExecCommand.mockResolvedValue({ + code: 1, + stdout: '', + stderr: 'Connection refused to iTerm2', + }); + + 
await expect(verifyITerm()).rejects.toThrow('Python API not enabled'); + }); + + it('throws generic error for unrecognized failures', async () => { + hoistedIsCommandAvailable.mockReturnValue({ available: true }); + hoistedExecCommand.mockResolvedValue({ + code: 1, + stdout: '', + stderr: 'some unknown error', + }); + + await expect(verifyITerm()).rejects.toThrow('it2 session list failed'); + }); + }); + + // ─── itermSplitPane ────────────────────────────────────────── + + describe('itermSplitPane', () => { + it('splits vertically without session ID', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: 'Created new pane: w0t1p2\n', + stderr: '', + }); + + const paneId = await itermSplitPane(); + expect(paneId).toBe('w0t1p2'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'split', '-v'], + expect.any(Object), + ); + }); + + it('passes -s flag when session ID is provided', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: 'Created new pane: w0t1p3\n', + stderr: '', + }); + + await itermSplitPane('sess-123'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'split', '-v', '-s', 'sess-123'], + expect.any(Object), + ); + }); + + it('throws if pane ID cannot be parsed from output', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: 'Unexpected output\n', + stderr: '', + }); + + await expect(itermSplitPane()).rejects.toThrow('Unable to parse'); + }); + + it('throws on non-zero exit code', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 1, + stdout: '', + stderr: 'split failed', + }); + + await expect(itermSplitPane()).rejects.toThrow('split failed'); + }); + }); + + // ─── itermRunCommand ────────────────────────────────────────── + + describe('itermRunCommand', () => { + it('calls it2 session run with correct args', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + + await 
itermRunCommand('sess-1', 'ls -la'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'run', '-s', 'sess-1', 'ls -la'], + expect.any(Object), + ); + }); + }); + + // ─── itermFocusSession ──────────────────────────────────────── + + describe('itermFocusSession', () => { + it('calls it2 session focus with correct args', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + + await itermFocusSession('sess-1'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'focus', 'sess-1'], + expect.any(Object), + ); + }); + }); + + // ─── itermSendText ───────────────────────────────────────────── + + describe('itermSendText', () => { + it('calls it2 session send with correct args', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + + await itermSendText('sess-1', 'hello world'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'send', '-s', 'sess-1', 'hello world'], + expect.any(Object), + ); + }); + }); + + // ─── itermCloseSession ──────────────────────────────────────── + + describe('itermCloseSession', () => { + it('calls it2 session close with correct args', async () => { + hoistedExecCommand.mockResolvedValue({ + code: 0, + stdout: '', + stderr: '', + }); + + await itermCloseSession('sess-1'); + expect(hoistedExecCommand).toHaveBeenCalledWith( + 'it2', + ['session', 'close', '-s', 'sess-1'], + expect.any(Object), + ); + }); + }); +}); diff --git a/packages/core/src/agents/backends/iterm-it2.ts b/packages/core/src/agents/backends/iterm-it2.ts new file mode 100644 index 0000000000..cf550b9125 --- /dev/null +++ b/packages/core/src/agents/backends/iterm-it2.ts @@ -0,0 +1,141 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Type-safe async wrappers for iTerm2 it2 CLI commands. + * + * The it2 CLI talks to iTerm2's Python API. 
We use it2 directly and avoid + * AppleScript to match the Team design spec. + */ + +import { execCommand, isCommandAvailable } from '../../utils/shell-utils.js'; +import { createDebugLogger } from '../../utils/debugLogger.js'; + +const debugLogger = createDebugLogger('ITERM_IT2'); + +// ─── Helpers ──────────────────────────────────────────────────── + +async function it2Result( + args: string[], +): Promise<{ stdout: string; stderr: string; code: number }> { + debugLogger.info(`it2 ${args.join(' ')}`); + const result = await execCommand('it2', args, { + preserveOutputOnError: true, + }); + if (result.code !== 0 && result.stderr.trim()) { + debugLogger.error(`it2 error: ${result.stderr.trim()}`); + } + return result; +} + +async function it2(args: string[]): Promise<string> { + const result = await it2Result(args); + if (result.code !== 0) { + const message = result.stderr.trim() || result.stdout.trim(); + throw new Error(message || 'it2 command failed'); + } + return result.stdout; +} + +function parseCreatedPaneId(output: string): string { + const match = output.match(/Created new pane:\s*(\S+)/); + if (!match?.[1]) { + throw new Error(`Unable to parse it2 split output: ${output.trim()}`); + } + return match[1]; +} + +// ─── Installation & Verification ─────────────────────────────── + +export function isIt2Available(): boolean { + return isCommandAvailable('it2').available; +} + +async function tryInstallIt2( + command: string, + args: string[], +): Promise<boolean> { + if (!isCommandAvailable(command).available) return false; + const result = await execCommand(command, args, { + preserveOutputOnError: true, + }); + return result.code === 0; +} + +export async function ensureIt2Installed(): Promise<void> { + if (isIt2Available()) return; + + const installers: Array<{ cmd: string; args: string[] }> = [ + { cmd: 'uv', args: ['tool', 'install', 'it2'] }, + { cmd: 'pipx', args: ['install', 'it2'] }, + { cmd: 'pip', args: ['install', '--user', 'it2'] }, + ]; + + for (const installer of
installers) { + const installed = await tryInstallIt2(installer.cmd, installer.args); + if (installed && isIt2Available()) return; + } + + throw new Error( + 'it2 is not installed. Install it2 via "uv tool install it2", "pipx install it2", or "pip install --user it2".', + ); +} + +export async function verifyITerm(): Promise<void> { + await ensureIt2Installed(); + + const result = await it2Result(['session', 'list']); + if (result.code === 0) return; + + const combined = `${result.stdout}\n${result.stderr}`.toLowerCase(); + if ( + combined.includes('api') || + combined.includes('python') || + combined.includes('connection refused') || + combined.includes('not enabled') + ) { + throw new Error( + 'iTerm2 Python API not enabled. Enable it in iTerm2 → Settings → General → Magic → Enable Python API, then restart iTerm2.', + ); + } + + throw new Error( + `it2 session list failed: ${result.stderr.trim() || result.stdout.trim()}`, + ); +} + +// ─── Public API ───────────────────────────────────────────────── + +export async function itermSplitPane(sessionId?: string): Promise<string> { + const args = ['session', 'split', '-v']; + if (sessionId) { + args.push('-s', sessionId); + } + const output = await it2(args); + return parseCreatedPaneId(output); +} + +export async function itermRunCommand( + sessionId: string, + command: string, +): Promise<void> { + await it2(['session', 'run', '-s', sessionId, command]); +} + +export async function itermFocusSession(sessionId: string): Promise<void> { + await it2(['session', 'focus', sessionId]); +} + +export async function itermSendText( + sessionId: string, + text: string, +): Promise<void> { + await it2(['session', 'send', '-s', sessionId, text]); +} + +export async function itermCloseSession(sessionId: string): Promise<void> { + await it2(['session', 'close', '-s', sessionId]); +} diff --git a/packages/core/src/agents/backends/tmux-commands.test.ts b/packages/core/src/agents/backends/tmux-commands.test.ts new file mode 100644 index 0000000000..8e4a790bac --- 
/dev/null +++ b/packages/core/src/agents/backends/tmux-commands.test.ts @@ -0,0 +1,60 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect } from 'vitest'; +import { parseTmuxListPanes } from './tmux-commands.js'; + +describe('parseTmuxListPanes', () => { + it('parses a single running pane', () => { + const output = '%0 0 0\n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([{ paneId: '%0', dead: false, deadStatus: 0 }]); + }); + + it('parses a single dead pane with exit code', () => { + const output = '%1 1 42\n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([{ paneId: '%1', dead: true, deadStatus: 42 }]); + }); + + it('parses multiple panes with mixed statuses', () => { + const output = '%0 0 0\n%1 1 1\n%2 0 0\n%3 1 137\n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([ + { paneId: '%0', dead: false, deadStatus: 0 }, + { paneId: '%1', dead: true, deadStatus: 1 }, + { paneId: '%2', dead: false, deadStatus: 0 }, + { paneId: '%3', dead: true, deadStatus: 137 }, + ]); + }); + + it('returns empty array for empty output', () => { + expect(parseTmuxListPanes('')).toEqual([]); + }); + + it('returns empty array for whitespace-only output', () => { + expect(parseTmuxListPanes(' \n \n')).toEqual([]); + }); + + it('skips lines with insufficient fields', () => { + const output = '%0\n%1 1 0\n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([{ paneId: '%1', dead: true, deadStatus: 0 }]); + }); + + it('defaults deadStatus to 0 when missing', () => { + // tmux might omit the third field when pane is alive + const output = '%0 0\n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([{ paneId: '%0', dead: false, deadStatus: 0 }]); + }); + + it('handles extra whitespace gracefully', () => { + const output = ' %5 1 99 \n'; + const result = parseTmuxListPanes(output); + expect(result).toEqual([{ 
paneId: '%5', dead: true, deadStatus: 99 }]); + }); +}); diff --git a/packages/core/src/agents/backends/tmux-commands.ts b/packages/core/src/agents/backends/tmux-commands.ts new file mode 100644 index 0000000000..6400a72da8 --- /dev/null +++ b/packages/core/src/agents/backends/tmux-commands.ts @@ -0,0 +1,503 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Type-safe async wrappers for tmux CLI commands. + * + * All functions use `execCommand('tmux', [...args])` from shell-utils, + * avoiding shell injection by passing arguments as arrays (execFile). + */ + +import { execCommand, isCommandAvailable } from '../../utils/shell-utils.js'; +import { createDebugLogger } from '../../utils/debugLogger.js'; + +const debugLogger = createDebugLogger('TMUX_CMD'); + +/** + * Information about a tmux pane, parsed from `list-panes`. + */ +export interface TmuxPaneInfo { + /** Pane ID (e.g., '%0', '%1') */ + paneId: string; + /** Whether the pane's process has exited */ + dead: boolean; + /** Exit status of the pane's process (only valid when dead=true) */ + deadStatus: number; +} + +/** + * Information about a tmux window. + */ +export interface TmuxWindowInfo { + /** Window name */ + name: string; + /** Window ID (e.g., '@1') */ + id: string; +} + +/** + * Minimum tmux version required for split-pane support. + */ +const MIN_TMUX_VERSION = '3.0'; + +// ─── Helpers ──────────────────────────────────────────────────── + +async function tmuxResult( + args: string[], + serverName?: string, +): Promise<{ stdout: string; stderr: string; code: number }> { + const fullArgs = serverName ? 
['-L', serverName, ...args] : args; + debugLogger.info(`tmux ${fullArgs.join(' ')}`); + const result = await execCommand('tmux', fullArgs, { + preserveOutputOnError: true, + }); + if (result.code !== 0 && result.stderr.trim()) { + debugLogger.error(`tmux error: ${result.stderr.trim()}`); + } + return result; +} + +async function tmux(args: string[], serverName?: string): Promise<string> { + const result = await tmuxResult(args, serverName); + if (result.code !== 0) { + throw new Error( + `tmux ${args[0]} failed (exit ${result.code}): ${result.stderr.trim() || result.stdout.trim()}`, + ); + } + return result.stdout; +} + +function parseVersion(versionStr: string): number[] { + // "tmux 3.4" → [3, 4] + const match = versionStr.match(/(\d+)\.(\d+)/); + if (!match) return [0, 0]; + return [parseInt(match[1]!, 10), parseInt(match[2]!, 10)]; +} + +function isVersionAtLeast(current: string, minimum: string): boolean { + const [curMajor = 0, curMinor = 0] = parseVersion(current); + const [minMajor = 0, minMinor = 0] = parseVersion(minimum); + if (curMajor !== minMajor) return curMajor > minMajor; + return curMinor >= minMinor; +} + +// ─── Public API ───────────────────────────────────────────────── + +/** + * Check if tmux is available on the system. + */ +export function isTmuxAvailable(): boolean { + return isCommandAvailable('tmux').available; +} + +/** + * Get tmux version string (e.g., "tmux 3.4"). + */ +export async function tmuxVersion(): Promise<string> { + const output = await tmux(['-V']); + return output.trim(); +} + +/** + * Verify tmux is available and meets minimum version requirement. + * + * @throws Error if tmux is not available or version is too old. + */ +export async function verifyTmux(): Promise<void> { + if (!isTmuxAvailable()) { + throw new Error( + 'tmux is not installed.
Install tmux (version 3.0+) for split-pane mode.', + ); + } + + const version = await tmuxVersion(); + if (!isVersionAtLeast(version, MIN_TMUX_VERSION)) { + throw new Error( + `tmux version ${MIN_TMUX_VERSION}+ required for split-pane mode (found: ${version}).`, + ); + } +} + +/** + * Get the current tmux session name (when running inside tmux). + */ +export async function tmuxCurrentSession(): Promise<string> { + const output = await tmux(['display-message', '-p', '#{session_name}']); + return output.trim(); +} + +/** + * Get the current tmux pane ID (when running inside tmux). + */ +export async function tmuxCurrentPaneId(): Promise<string> { + const output = await tmux(['display-message', '-p', '#{pane_id}']); + return output.trim(); +} + +/** + * Get the current tmux window target (session:window_index). + */ +export async function tmuxCurrentWindowTarget(): Promise<string> { + const output = await tmux([ + 'display-message', + '-p', + '#{session_name}:#{window_index}', + ]); + return output.trim(); +} + +/** + * Check if a tmux session exists. + */ +export async function tmuxHasSession( + name: string, + serverName?: string, +): Promise<boolean> { + const result = await tmuxResult(['has-session', '-t', name], serverName); + return result.code === 0; +} + +/** + * List windows in a session. + */ +export async function tmuxListWindows( + sessionName: string, + serverName?: string, +): Promise<TmuxWindowInfo[]> { + const output = await tmux( + ['list-windows', '-t', sessionName, '-F', '#{window_name} #{window_id}'], + serverName, + ); + const windows: TmuxWindowInfo[] = []; + for (const line of output.trim().split('\n')) { + if (!line.trim()) continue; + const [name, id] = line.trim().split(/\s+/, 2); + if (!name || !id) continue; + windows.push({ name, id }); + } + return windows; +} + +/** + * Check if a tmux window exists within a session.
+ */ +export async function tmuxHasWindow( + sessionName: string, + windowName: string, + serverName?: string, +): Promise<boolean> { + const windows = await tmuxListWindows(sessionName, serverName); + return windows.some((w) => w.name === windowName); +} + +/** + * Create a new detached tmux session. + */ +export async function tmuxNewSession( + name: string, + opts?: { cols?: number; rows?: number; windowName?: string }, + serverName?: string, +): Promise<void> { + const args = ['new-session', '-d', '-s', name]; + if (opts?.windowName) args.push('-n', opts.windowName); + if (opts?.cols) args.push('-x', String(opts.cols)); + if (opts?.rows) args.push('-y', String(opts.rows)); + await tmux(args, serverName); +} + +/** + * Create a new window in an existing session. + */ +export async function tmuxNewWindow( + targetSession: string, + windowName: string, + serverName?: string, +): Promise<void> { + // -t session: (with trailing colon) means "create window in this session" + // -t session (without colon) means "create at window index = session", which fails if index exists + await tmux( + ['new-window', '-t', `${targetSession}:`, '-n', windowName], + serverName, + ); +} + +/** + * Split a window/pane and return the new pane ID. + * + * @param target - Target pane/window (e.g., session:window or pane ID) + * @param opts.horizontal - Split horizontally (left/right) if true, vertically (top/bottom) if false + * @param opts.percent - Size of the new pane as a percentage (e.g., 70 for 70%) + * @param opts.command - Shell command to execute directly in the new pane. + * When provided, the command becomes the pane's process (not a shell), + * so `#{pane_dead}` is set when the command exits.
+ * @returns The pane ID of the newly created pane (e.g., '%5') + */ +export async function tmuxSplitWindow( + target: string, + opts?: { horizontal?: boolean; percent?: number; command?: string }, + serverName?: string, +): Promise<string> { + const args = ['split-window', '-t', target]; + if (opts?.horizontal) { + args.push('-h'); + } + if (opts?.percent !== undefined) { + args.push('-l', `${opts.percent}%`); + } + // -P -F: print new pane info in the specified format + args.push('-P', '-F', '#{pane_id}'); + if (opts?.command) { + args.push(opts.command); + } + const output = await tmux(args, serverName); + return output.trim(); +} + +/** + * Send keys to a tmux pane. + * + * @param paneId - Target pane ID + * @param keys - Keys to send + * @param opts.literal - If true, use -l flag (send keys literally, don't interpret) + */ +export async function tmuxSendKeys( + paneId: string, + keys: string, + opts?: { literal?: boolean; enter?: boolean }, + serverName?: string, +): Promise<void> { + const args = ['send-keys', '-t', paneId]; + if (opts?.literal) { + args.push('-l'); + } + args.push(keys); + if (opts?.enter) { + args.push('Enter'); + } + await tmux(args, serverName); +} + +/** + * Select (focus) a tmux pane. + */ +export async function tmuxSelectPane( + paneId: string, + serverName?: string, +): Promise<void> { + await tmux(['select-pane', '-t', paneId], serverName); +} + +/** + * Set a pane title. + */ +export async function tmuxSelectPaneTitle( + paneId: string, + title: string, + serverName?: string, +): Promise<void> { + await tmux(['select-pane', '-t', paneId, '-T', title], serverName); +} + +/** + * Set a pane border style via select-pane -P. + */ +export async function tmuxSelectPaneStyle( + paneId: string, + style: string, + serverName?: string, +): Promise<void> { + await tmux(['select-pane', '-t', paneId, '-P', style], serverName); +} + +/** + * Set the layout for a target window.
+ * + * @param target - Target window (e.g., session:window) + * @param layout - Layout name: 'tiled', 'even-horizontal', 'even-vertical', etc. + */ +export async function tmuxSelectLayout( + target: string, + layout: string, + serverName?: string, +): Promise<void> { + await tmux(['select-layout', '-t', target, layout], serverName); +} + +/** + * Capture the content of a pane (including ANSI escape codes). + * + * @returns The captured pane content as a string. + */ +export async function tmuxCapturePaneContent( + paneId: string, + serverName?: string, +): Promise<string> { + // -p: output to stdout, -e: include escape sequences + return await tmux(['capture-pane', '-t', paneId, '-p', '-e'], serverName); +} + +/** + * List panes in a target window/session and return parsed info. + * + * @param target - Target window (e.g., session:window) + * @returns Array of pane information. + */ +export async function tmuxListPanes( + target: string, + serverName?: string, +): Promise<TmuxPaneInfo[]> { + const output = await tmux( + [ + 'list-panes', + '-t', + target, + '-F', + '#{pane_id} #{pane_dead} #{pane_dead_status}', + ], + serverName, + ); + return parseTmuxListPanes(output); +} + +/** + * Parse the output of `tmux list-panes -F '#{pane_id} #{pane_dead} #{pane_dead_status}'`. + */ +export function parseTmuxListPanes(output: string): TmuxPaneInfo[] { + const panes: TmuxPaneInfo[] = []; + for (const line of output.trim().split('\n')) { + if (!line.trim()) continue; + const parts = line.trim().split(/\s+/); + if (parts.length < 2) continue; + panes.push({ + paneId: parts[0]!, + dead: parts[1] === '1', + deadStatus: parts[2] ? parseInt(parts[2], 10) : 0, + }); + } + return panes; +} + +/** + * Set a tmux option on a target pane/window. + */ +export async function tmuxSetOption( + target: string, + option: string, + value: string, + serverName?: string, +): Promise<void> { + await tmux(['set-option', '-t', target, option, value], serverName); +} + +/** + * Respawn a pane with a new command.
+ * + * Kills the current process in the pane and starts a new one. + * The command becomes the pane's direct process, so `#{pane_dead}` + * is set when the command exits. + * + * @param paneId - Target pane ID + * @param command - Shell command to execute + */ +export async function tmuxRespawnPane( + paneId: string, + command: string, + serverName?: string, +): Promise<void> { + await tmux(['respawn-pane', '-k', '-t', paneId, command], serverName); +} + +/** + * Break a pane into a target session (detaches from current window). + */ +export async function tmuxBreakPane( + paneId: string, + targetSession: string, + serverName?: string, +): Promise<void> { + await tmux(['break-pane', '-s', paneId, '-t', targetSession], serverName); +} + +/** + * Join a pane into a target window. + */ +export async function tmuxJoinPane( + paneId: string, + target: string, + serverName?: string, +): Promise<void> { + await tmux(['join-pane', '-s', paneId, '-t', target], serverName); +} + +/** + * Kill a tmux pane. + */ +export async function tmuxKillPane( + paneId: string, + serverName?: string, +): Promise<void> { + await tmux(['kill-pane', '-t', paneId], serverName); +} + +/** + * Resize a tmux pane. + * + * @param paneId - Target pane ID + * @param opts.height - Height (number for lines, or string like '50%') + * @param opts.width - Width (number for columns, or string like '50%') + */ +export async function tmuxResizePane( + paneId: string, + opts: { height?: number | string; width?: number | string }, + serverName?: string, +): Promise<void> { + const args = ['resize-pane', '-t', paneId]; + if (opts.height !== undefined) { + args.push('-y', String(opts.height)); + } + if (opts.width !== undefined) { + args.push('-x', String(opts.width)); + } + await tmux(args, serverName); +} + +/** + * Kill a tmux session. + */ +export async function tmuxKillSession( + name: string, + serverName?: string, +): Promise<void> { + await tmux(['kill-session', '-t', name], serverName); +} + +/** + * Kill a tmux window.
+ */ +export async function tmuxKillWindow( + target: string, + serverName?: string, +): Promise<void> { + await tmux(['kill-window', '-t', target], serverName); +} + +/** + * Get the first pane ID of a target window. + */ +export async function tmuxGetFirstPaneId( + target: string, + serverName?: string, +): Promise<string> { + const output = await tmux( + ['list-panes', '-t', target, '-F', '#{pane_id}'], + serverName, + ); + const firstLine = output.trim().split('\n')[0]; + if (!firstLine) { + throw new Error(`No panes found in target: ${target}`); + } + return firstLine.trim(); +} diff --git a/packages/core/src/agents/backends/types.ts b/packages/core/src/agents/backends/types.ts new file mode 100644 index 0000000000..98678fd0f5 --- /dev/null +++ b/packages/core/src/agents/backends/types.ts @@ -0,0 +1,276 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Shared types for multi-agent systems (Arena, Team, Swarm) + * and the Backend abstraction layer. + * + * These types are used across different agent orchestration modes. + */ + +import type { Content } from '@google/genai'; +import type { AnsiOutput } from '../../utils/terminalSerializer.js'; +import type { + PromptConfig, + ModelConfig, + RunConfig, + ToolConfig, +} from '../runtime/agent-types.js'; + +/** + * Canonical display mode values shared across core and CLI. + */ +export const DISPLAY_MODE = { + IN_PROCESS: 'in-process', + TMUX: 'tmux', + ITERM2: 'iterm2', +} as const; + +/** + * Supported display mode values. + */ +export type DisplayMode = (typeof DISPLAY_MODE)[keyof typeof DISPLAY_MODE]; + +/** + * Configuration for spawning an agent subprocess.
+ */ +export interface AgentSpawnConfig { + /** Unique identifier for this agent */ + agentId: string; + /** Command to execute (e.g., the CLI binary path) */ + command: string; + /** Arguments to pass to the command */ + args: string[]; + /** Working directory for the subprocess */ + cwd: string; + /** Additional environment variables (merged with process.env) */ + env?: Record<string, string>; + /** Terminal columns (default: 120) */ + cols?: number; + /** Terminal rows (default: 40) */ + rows?: number; + /** + * Backend-specific options (optional). + * These are ignored by backends that do not support them. + */ + backend?: { + tmux?: TmuxBackendOptions; + }; + + /** + * In-process spawn configuration (optional). + * When provided, InProcessBackend uses this to create an AgentInteractive + * instead of launching a PTY subprocess. + */ + inProcess?: InProcessSpawnConfig; +} + +/** + * Configuration for spawning an in-process agent (no PTY subprocess). + */ +export interface InProcessSpawnConfig { + /** Human-readable agent name for display. */ + agentName: string; + /** Optional initial task to start working on immediately. */ + initialTask?: string; + /** Runtime configuration for the AgentCore. */ + runtimeConfig: { + promptConfig: PromptConfig; + modelConfig: ModelConfig; + runConfig: RunConfig; + toolConfig?: ToolConfig; + }; + /** + * Per-agent auth/provider overrides. When present, a dedicated + * ContentGenerator is created for this agent instead of inheriting + * the parent process's. This enables Arena agents to target different + * model providers (OpenAI, Anthropic, Gemini, etc.) in the same session. + */ + authOverrides?: { + authType: string; + apiKey?: string; + baseUrl?: string; + }; + /** + * Optional chat history from the parent session. When provided, this + * history is prepended to the agent's chat so it has conversational + * context from the session that spawned it. + */ + chatHistory?: Content[]; +} + +/** + * Callback for agent exit events.
+ */ +export type AgentExitCallback = ( + agentId: string, + exitCode: number | null, + signal: number | null, +) => void; + +/** + * Backend abstracts the display/pane management layer for multi-agent systems. + * + * Each display mode (in-process / tmux / iTerm2) implements this interface. The orchestration + * layer (Arena, Team, etc.) delegates all pane operations through the backend, + * making the display mode transparent. + */ +export interface Backend { + /** Backend type identifier. */ + readonly type: DisplayMode; + + /** + * Initialize the backend. + * - in-process: runs in the current process (not yet implemented) + * - tmux: verifies tmux availability, creates session + * - iTerm2: verifies iTerm2 is running + */ + init(): Promise<void>; + + // ─── Agent Lifecycle ──────────────────────────────────────── + + /** + * Spawn a new agent subprocess. + * + * @param config - Agent spawn configuration (command, args, cwd, env, etc.) + * @returns Promise that resolves when the agent's pane/PTY is created and ready. + */ + spawnAgent(config: AgentSpawnConfig): Promise<void>; + + /** + * Stop a specific agent. + */ + stopAgent(agentId: string): void; + + /** + * Stop all running agents. + */ + stopAll(): void; + + /** + * Clean up all resources (kill processes, destroy panes/sessions). + */ + cleanup(): Promise<void>; + + /** + * Register a callback for agent exit events. + */ + setOnAgentExit(callback: AgentExitCallback): void; + + /** + * Wait for all agents to exit, with an optional timeout. + * + * @returns true if all agents exited, false if timeout was reached. + */ + waitForAll(timeoutMs?: number): Promise<boolean>; + + // ─── Active Agent & Navigation ────────────────────────────── + + /** + * Switch the active agent for screen capture and input routing. + */ + switchTo(agentId: string): void; + + /** + * Switch to the next agent in order. + */ + switchToNext(): void; + + /** + * Switch to the previous agent in order.
+ */ + switchToPrevious(): void; + + /** + * Get the ID of the currently active agent. + */ + getActiveAgentId(): string | null; + + // ─── Screen Capture ───────────────────────────────────────── + + /** + * Get the screen snapshot for the currently active agent. + * + * @returns AnsiOutput or null if no active agent or not supported. + */ + getActiveSnapshot(): AnsiOutput | null; + + /** + * Get the screen snapshot for a specific agent. + * + * @param agentId - Agent to capture + * @param scrollOffset - Lines to scroll back from viewport (default: 0) + * @returns AnsiOutput or null if not found or not supported. + */ + getAgentSnapshot(agentId: string, scrollOffset?: number): AnsiOutput | null; + + /** + * Get the maximum scrollback length for an agent's terminal buffer. + * + * @returns Number of scrollable lines, or 0 if not supported. + */ + getAgentScrollbackLength(agentId: string): number; + + // ─── Input ────────────────────────────────────────────────── + + /** + * Forward input to the currently active agent's PTY stdin. + * + * @returns true if input was forwarded, false otherwise. + */ + forwardInput(data: string): boolean; + + /** + * Write input to a specific agent's PTY stdin. + * + * @returns true if input was written, false otherwise. + */ + writeToAgent(agentId: string, data: string): boolean; + + // ─── Resize ───────────────────────────────────────────────── + + /** + * Resize all agent terminals/panes. + */ + resizeAll(cols: number, rows: number): void; + + // ─── External Session Info ───────────────────────────────── + + /** + * Get a user-facing hint for how to attach to the external display session. + * + * When the backend runs in external mode (e.g., a detached tmux server), + * this returns a shell command the user can run to view the agent panes. + * Returns null if not applicable (e.g., running inside tmux or iTerm2). + */ + getAttachHint(): string | null; +} + +/** + * Optional tmux backend configuration. 
+ */ +export interface TmuxBackendOptions { + /** tmux server name for -L (when running outside tmux) */ + serverName?: string; + /** tmux session name to use/create (when running outside tmux) */ + sessionName?: string; + /** tmux window name to use/create (when running outside tmux) */ + windowName?: string; + /** Pane title for this agent */ + paneTitle?: string; + /** Border style for inactive panes (tmux style string, e.g. "fg=blue") */ + paneBorderStyle?: string; + /** Border style for active pane (tmux style string, e.g. "fg=green,bold") */ + paneActiveBorderStyle?: string; + /** Pane border format (default: "#{pane_title}") */ + paneBorderFormat?: string; + /** Pane border status location */ + paneBorderStatus?: 'top' | 'bottom' | 'off'; + /** Leader pane width percentage (default: 30) */ + leaderPaneWidthPercent?: number; + /** First split percent when inside tmux (default: 70) */ + firstSplitPercent?: number; +} diff --git a/packages/core/src/agents/index.ts b/packages/core/src/agents/index.ts new file mode 100644 index 0000000000..d29d4dc09b --- /dev/null +++ b/packages/core/src/agents/index.ts @@ -0,0 +1,18 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Multi-agent infrastructure shared across Arena, Team, and Swarm modes. 
+ * + * This module provides the common building blocks for managing multiple concurrent + * agent subprocesses: + * - Backend: Display abstraction (tmux, iTerm2) + * - Shared types for agent spawning and lifecycle + */ + +export * from './backends/index.js'; +export * from './arena/index.js'; +export * from './runtime/index.js'; diff --git a/packages/core/src/agents/runtime/agent-core.ts b/packages/core/src/agents/runtime/agent-core.ts new file mode 100644 index 0000000000..fb63cb530f --- /dev/null +++ b/packages/core/src/agents/runtime/agent-core.ts @@ -0,0 +1,1049 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentCore — the shared execution engine for subagents. + * + * AgentCore encapsulates the model reasoning loop, tool scheduling, stats, + * and event emission. It is composed by both AgentHeadless (one-shot tasks) + * and AgentInteractive (persistent interactive agents). + * + * AgentCore is stateless per-call: it does not own lifecycle or termination + * logic. The caller (executor/collaborator) controls when to start, stop, + * and how to interpret the results. 
+ */ + +import { reportError } from '../../utils/errorReporting.js'; +import type { Config } from '../../config/config.js'; +import { type ToolCallRequestInfo } from '../../core/turn.js'; +import { + CoreToolScheduler, + type ToolCall, + type ExecutingToolCall, + type WaitingToolCall, +} from '../../core/coreToolScheduler.js'; +import type { + ToolConfirmationOutcome, + ToolCallConfirmationDetails, +} from '../../tools/tools.js'; +import { getInitialChatHistory } from '../../utils/environmentContext.js'; +import type { + Content, + Part, + FunctionCall, + GenerateContentConfig, + FunctionDeclaration, + GenerateContentResponseUsageMetadata, +} from '@google/genai'; +import { GeminiChat } from '../../core/geminiChat.js'; +import type { + PromptConfig, + ModelConfig, + RunConfig, + ToolConfig, +} from './agent-types.js'; +import { AgentTerminateMode } from './agent-types.js'; +import type { + AgentRoundEvent, + AgentRoundTextEvent, + AgentToolCallEvent, + AgentToolResultEvent, + AgentToolOutputUpdateEvent, + AgentUsageEvent, + AgentHooks, +} from './agent-events.js'; +import { type AgentEventEmitter, AgentEventType } from './agent-events.js'; +import { AgentStatistics, type AgentStatsSummary } from './agent-statistics.js'; +import { TaskTool } from '../../tools/task.js'; +import { DEFAULT_QWEN_MODEL } from '../../config/models.js'; +import { type ContextState, templateString } from './agent-headless.js'; + +/** + * Result of a single reasoning loop invocation. + */ +export interface ReasoningLoopResult { + /** The final model text response (empty if terminated by abort/limits). */ + text: string; + /** Why the loop ended. null = normal text completion (no tool calls). */ + terminateMode: AgentTerminateMode | null; + /** Number of model round-trips completed. */ + turnsUsed: number; +} + +/** + * Options for configuring a reasoning loop invocation. + */ +export interface ReasoningLoopOptions { + /** Maximum number of turns before stopping. 
*/ + maxTurns?: number; + /** Maximum wall-clock time in minutes before stopping. */ + maxTimeMinutes?: number; + /** Start time in ms (for timeout calculation). Defaults to Date.now(). */ + startTimeMs?: number; +} + +/** + * Options for chat creation. + */ +export interface CreateChatOptions { + /** + * When true, omits the "non-interactive mode" system prompt suffix. + * Used by AgentInteractive for persistent interactive agents. + */ + interactive?: boolean; + /** + * Optional conversation history from a parent session. When provided, + * this history is prepended to the chat so the agent has prior + * conversational context (e.g., from the main session that spawned it). + */ + extraHistory?: Content[]; +} + +/** + * Legacy execution stats maintained for backward compatibility. + */ +export interface ExecutionStats { + startTimeMs: number; + totalDurationMs: number; + rounds: number; + totalToolCalls: number; + successfulToolCalls: number; + failedToolCalls: number; + inputTokens?: number; + outputTokens?: number; + totalTokens?: number; +} + +/** + * AgentCore — shared execution engine for model reasoning and tool scheduling. + * + * This class encapsulates: + * - Chat/model session creation (`createChat`) + * - Tool list preparation (`prepareTools`) + * - The inner reasoning loop (`runReasoningLoop`) + * - Tool call scheduling and execution (`processFunctionCalls`) + * - Statistics tracking and event emission + * + * It does NOT manage lifecycle (start/stop/terminate), abort signals, + * or final result interpretation — those are the caller's responsibility. 
+ */ +export class AgentCore { + readonly subagentId: string; + readonly name: string; + readonly runtimeContext: Config; + readonly promptConfig: PromptConfig; + readonly modelConfig: ModelConfig; + readonly runConfig: RunConfig; + readonly toolConfig?: ToolConfig; + readonly eventEmitter?: AgentEventEmitter; + readonly hooks?: AgentHooks; + readonly stats = new AgentStatistics(); + + /** + * Legacy execution stats maintained for aggregate tracking. + */ + executionStats: ExecutionStats = { + startTimeMs: 0, + totalDurationMs: 0, + rounds: 0, + totalToolCalls: 0, + successfulToolCalls: 0, + failedToolCalls: 0, + inputTokens: 0, + outputTokens: 0, + totalTokens: 0, + }; + /** + * The prompt token count from the most recent model response. + * Exposed so UI hooks can seed initial state without waiting for events. + */ + lastPromptTokenCount = 0; + + private toolUsage = new Map< + string, + { + count: number; + success: number; + failure: number; + lastError?: string; + totalDurationMs?: number; + averageDurationMs?: number; + } + >(); + + constructor( + name: string, + runtimeContext: Config, + promptConfig: PromptConfig, + modelConfig: ModelConfig, + runConfig: RunConfig, + toolConfig?: ToolConfig, + eventEmitter?: AgentEventEmitter, + hooks?: AgentHooks, + ) { + const randomPart = Math.random().toString(36).slice(2, 8); + this.subagentId = `${name}-${randomPart}`; + this.name = name; + this.runtimeContext = runtimeContext; + this.promptConfig = promptConfig; + this.modelConfig = modelConfig; + this.runConfig = runConfig; + this.toolConfig = toolConfig; + this.eventEmitter = eventEmitter; + this.hooks = hooks; + } + + // ─── Chat Creation ──────────────────────────────────────── + + /** + * Creates a GeminiChat instance configured for this agent. + * + * @param context - Context state for template variable substitution. + * @param options - Chat creation options. + * - `interactive`: When true, omits the "non-interactive mode" system prompt suffix. 
+ * @returns A configured GeminiChat, or undefined if initialization fails. + */ + async createChat( + context: ContextState, + options?: CreateChatOptions, + ): Promise<GeminiChat | undefined> { + if (!this.promptConfig.systemPrompt && !this.promptConfig.initialMessages) { + throw new Error( + 'PromptConfig must have either `systemPrompt` or `initialMessages` defined.', + ); + } + if (this.promptConfig.systemPrompt && this.promptConfig.initialMessages) { + throw new Error( + 'PromptConfig cannot have both `systemPrompt` and `initialMessages` defined.', + ); + } + + const envHistory = await getInitialChatHistory(this.runtimeContext); + + const startHistory = [ + ...envHistory, + ...(options?.extraHistory ?? []), + ...(this.promptConfig.initialMessages ?? []), + ]; + + const systemInstruction = this.promptConfig.systemPrompt + ? this.buildChatSystemPrompt(context, options) + : undefined; + + try { + const generationConfig: GenerateContentConfig & { + systemInstruction?: string | Content; + } = { + temperature: this.modelConfig.temp, + topP: this.modelConfig.top_p, + }; + + if (systemInstruction) { + generationConfig.systemInstruction = systemInstruction; + } + + return new GeminiChat( + this.runtimeContext, + generationConfig, + startHistory, + ); + } catch (error) { + await reportError( + error, + 'Error initializing chat session.', + startHistory, + 'startChat', + ); + return undefined; + } + } + + // ─── Tool Preparation ───────────────────────────────────── + + /** + * Prepares the list of tools available to this agent. + * + * If no explicit toolConfig or it contains "*" or is empty, + * inherits all tools (excluding TaskTool to prevent recursion).
+ */ + prepareTools(): FunctionDeclaration[] { + const toolRegistry = this.runtimeContext.getToolRegistry(); + const toolsList: FunctionDeclaration[] = []; + + if (this.toolConfig) { + const asStrings = this.toolConfig.tools.filter( + (t): t is string => typeof t === 'string', + ); + const hasWildcard = asStrings.includes('*'); + const onlyInlineDecls = this.toolConfig.tools.filter( + (t): t is FunctionDeclaration => typeof t !== 'string', + ); + + if (hasWildcard || asStrings.length === 0) { + toolsList.push( + ...toolRegistry + .getFunctionDeclarations() + .filter((t) => t.name !== TaskTool.Name), + ); + } else { + toolsList.push( + ...toolRegistry.getFunctionDeclarationsFiltered(asStrings), + ); + } + toolsList.push(...onlyInlineDecls); + } else { + // Inherit all available tools by default when not specified. + toolsList.push( + ...toolRegistry + .getFunctionDeclarations() + .filter((t) => t.name !== TaskTool.Name), + ); + } + + return toolsList; + } + + // ─── Reasoning Loop ─────────────────────────────────────── + + /** + * Runs the inner model reasoning loop. + * + * This is the core execution cycle: + * send messages → stream response → collect tool calls → execute tools → repeat. + * + * The loop terminates when: + * - The model produces a text response without tool calls (normal completion) + * - maxTurns is reached + * - maxTimeMinutes is exceeded + * - The abortController signal fires + * + * @param chat - The GeminiChat session to use. + * @param initialMessages - The first messages to send (e.g., user task prompt). + * @param toolsList - Available tool declarations. + * @param abortController - Controls cancellation of the current loop. + * @param options - Optional limits (maxTurns, maxTimeMinutes). + * @returns ReasoningLoopResult with the final text, terminate mode, and turns used. 
+ */ + async runReasoningLoop( + chat: GeminiChat, + initialMessages: Content[], + toolsList: FunctionDeclaration[], + abortController: AbortController, + options?: ReasoningLoopOptions, + ): Promise<ReasoningLoopResult> { + const startTime = options?.startTimeMs ?? Date.now(); + let currentMessages = initialMessages; + let turnCounter = 0; + let finalText = ''; + let terminateMode: AgentTerminateMode | null = null; + + while (true) { + // Check abort before starting a new round — prevents unnecessary API + // calls after processFunctionCalls was unblocked by an abort signal. + if (abortController.signal.aborted) { + terminateMode = AgentTerminateMode.CANCELLED; + break; + } + + // Check termination conditions. + if (options?.maxTurns && turnCounter >= options.maxTurns) { + terminateMode = AgentTerminateMode.MAX_TURNS; + break; + } + + let durationMin = (Date.now() - startTime) / (1000 * 60); + if (options?.maxTimeMinutes && durationMin >= options.maxTimeMinutes) { + terminateMode = AgentTerminateMode.TIMEOUT; + break; + } + + // Create a new AbortController per round to avoid listener accumulation + // in the model SDK. The parent abortController propagates abort to it.
+ const roundAbortController = new AbortController(); + const onParentAbort = () => roundAbortController.abort(); + abortController.signal.addEventListener('abort', onParentAbort); + if (abortController.signal.aborted) { + roundAbortController.abort(); + } + + const promptId = `${this.runtimeContext.getSessionId()}#${this.subagentId}#${turnCounter++}`; + + const messageParams = { + message: currentMessages[0]?.parts || [], + config: { + abortSignal: roundAbortController.signal, + tools: [{ functionDeclarations: toolsList }], + }, + }; + + const roundStreamStart = Date.now(); + const responseStream = await chat.sendMessageStream( + this.modelConfig.model || + this.runtimeContext.getModel() || + DEFAULT_QWEN_MODEL, + messageParams, + promptId, + ); + this.eventEmitter?.emit(AgentEventType.ROUND_START, { + subagentId: this.subagentId, + round: turnCounter, + promptId, + timestamp: Date.now(), + } as AgentRoundEvent); + + const functionCalls: FunctionCall[] = []; + let roundText = ''; + let roundThoughtText = ''; + let lastUsage: GenerateContentResponseUsageMetadata | undefined = + undefined; + let currentResponseId: string | undefined = undefined; + + for await (const streamEvent of responseStream) { + if (roundAbortController.signal.aborted) { + abortController.signal.removeEventListener('abort', onParentAbort); + return { + text: finalText, + terminateMode: AgentTerminateMode.CANCELLED, + turnsUsed: turnCounter, + }; + } + + // Handle retry events + if (streamEvent.type === 'retry') { + continue; + } + + // Handle chunk events + if (streamEvent.type === 'chunk') { + const resp = streamEvent.value; + // Track the response ID for tool call correlation + if (resp.responseId) { + currentResponseId = resp.responseId; + } + if (resp.functionCalls) functionCalls.push(...resp.functionCalls); + const content = resp.candidates?.[0]?.content; + const parts = content?.parts || []; + for (const p of parts) { + const txt = p.text; + const isThought = p.thought ?? 
false; + if (txt && isThought) roundThoughtText += txt; + if (txt && !isThought) roundText += txt; + if (txt) + this.eventEmitter?.emit(AgentEventType.STREAM_TEXT, { + subagentId: this.subagentId, + round: turnCounter, + text: txt, + thought: isThought, + timestamp: Date.now(), + }); + } + if (resp.usageMetadata) lastUsage = resp.usageMetadata; + } + } + + if (roundText || roundThoughtText) { + this.eventEmitter?.emit(AgentEventType.ROUND_TEXT, { + subagentId: this.subagentId, + round: turnCounter, + text: roundText, + thoughtText: roundThoughtText, + timestamp: Date.now(), + } as AgentRoundTextEvent); + } + + this.executionStats.rounds = turnCounter; + this.stats.setRounds(turnCounter); + + durationMin = (Date.now() - startTime) / (1000 * 60); + if (options?.maxTimeMinutes && durationMin >= options.maxTimeMinutes) { + abortController.signal.removeEventListener('abort', onParentAbort); + terminateMode = AgentTerminateMode.TIMEOUT; + break; + } + + // Update token usage if available + if (lastUsage) { + this.recordTokenUsage(lastUsage, turnCounter, roundStreamStart); + } + + if (functionCalls.length > 0) { + currentMessages = await this.processFunctionCalls( + functionCalls, + roundAbortController, + promptId, + turnCounter, + toolsList, + currentResponseId, + ); + } else { + // No tool calls — treat this as the model's final answer. + if (roundText && roundText.trim().length > 0) { + finalText = roundText.trim(); + // Emit ROUND_END for the final round so all consumers see it. + // Previously this was skipped, requiring AgentInteractive to + // compensate with an explicit flushStreamBuffers() call. 
+ this.eventEmitter?.emit(AgentEventType.ROUND_END, { + subagentId: this.subagentId, + round: turnCounter, + promptId, + timestamp: Date.now(), + } as AgentRoundEvent); + // Clean up before breaking + abortController.signal.removeEventListener('abort', onParentAbort); + // null terminateMode = normal text completion + break; + } + // Otherwise, nudge the model to finalize a result. + currentMessages = [ + { + role: 'user', + parts: [ + { + text: 'Please provide the final result now and stop calling tools.', + }, + ], + }, + ]; + } + + this.eventEmitter?.emit(AgentEventType.ROUND_END, { + subagentId: this.subagentId, + round: turnCounter, + promptId, + timestamp: Date.now(), + } as AgentRoundEvent); + + // Clean up the per-round listener before the next iteration + abortController.signal.removeEventListener('abort', onParentAbort); + } + + return { + text: finalText, + terminateMode, + turnsUsed: turnCounter, + }; + } + + // ─── Tool Execution ─────────────────────────────────────── + + /** + * Processes a list of function calls via CoreToolScheduler. + * + * Validates each call against the allowed tools list, schedules authorized + * calls, collects results, and emits events for each call/result. + */ + async processFunctionCalls( + functionCalls: FunctionCall[], + abortController: AbortController, + promptId: string, + currentRound: number, + toolsList: FunctionDeclaration[], + responseId?: string, + ): Promise<Content[]> { + const toolResponseParts: Part[] = []; + + // Build allowed tool names set for filtering + const allowedToolNames = new Set(toolsList.map((t) => t.name)); + + // Filter unauthorized tool calls before scheduling + const authorizedCalls: FunctionCall[] = []; + for (const fc of functionCalls) { + const callId = fc.id ??
`${fc.name}-${Date.now()}`; + + if (!allowedToolNames.has(fc.name)) { + const toolName = String(fc.name); + const errorMessage = `Tool "${toolName}" not found. Tools must use the exact names provided.`; + + // Emit TOOL_CALL event for visibility + this.eventEmitter?.emit(AgentEventType.TOOL_CALL, { + subagentId: this.subagentId, + round: currentRound, + callId, + name: toolName, + args: fc.args ?? {}, + description: `Tool "${toolName}" not found`, + isOutputMarkdown: false, + timestamp: Date.now(), + } as AgentToolCallEvent); + + // Build function response part (used for both event and LLM) + const functionResponsePart = { + functionResponse: { + id: callId, + name: toolName, + response: { error: errorMessage }, + }, + }; + + // Emit TOOL_RESULT event with error + this.eventEmitter?.emit(AgentEventType.TOOL_RESULT, { + subagentId: this.subagentId, + round: currentRound, + callId, + name: toolName, + success: false, + error: errorMessage, + responseParts: [functionResponsePart], + resultDisplay: errorMessage, + durationMs: 0, + timestamp: Date.now(), + } as AgentToolResultEvent); + + // Record blocked tool call in stats + this.recordToolCallStats(toolName, false, 0, errorMessage); + + // Add function response for LLM + toolResponseParts.push(functionResponsePart); + continue; + } + authorizedCalls.push(fc); + } + + // Build scheduler + const responded = new Set(); + let resolveBatch: (() => void) | null = null; + const emittedCallIds = new Set(); + // pidMap: callId → PTY PID, populated by onToolCallsUpdate when a shell + // tool spawns a PTY. Shared with outputUpdateHandler via closure so the + // PID is included in TOOL_OUTPUT_UPDATE events for interactive shell support. 
+ const pidMap = new Map(); + const scheduler = new CoreToolScheduler({ + config: this.runtimeContext, + outputUpdateHandler: (callId, outputChunk) => { + this.eventEmitter?.emit(AgentEventType.TOOL_OUTPUT_UPDATE, { + subagentId: this.subagentId, + round: currentRound, + callId, + outputChunk, + pid: pidMap.get(callId), + timestamp: Date.now(), + } as AgentToolOutputUpdateEvent); + }, + onAllToolCallsComplete: async (completedCalls) => { + for (const call of completedCalls) { + if (emittedCallIds.has(call.request.callId)) continue; + emittedCallIds.add(call.request.callId); + + const toolName = call.request.name; + const duration = call.durationMs ?? 0; + const success = call.status === 'success'; + const errorMessage = + call.status === 'error' || call.status === 'cancelled' + ? call.response.error?.message + : undefined; + + // Record stats + this.recordToolCallStats(toolName, success, duration, errorMessage); + + // Emit tool result event + this.eventEmitter?.emit(AgentEventType.TOOL_RESULT, { + subagentId: this.subagentId, + round: currentRound, + callId: call.request.callId, + name: toolName, + success, + error: errorMessage, + responseParts: call.response.responseParts, + resultDisplay: call.response.resultDisplay, + durationMs: duration, + timestamp: Date.now(), + } as AgentToolResultEvent); + + // post-tool hook + await this.hooks?.postToolUse?.({ + subagentId: this.subagentId, + name: this.name, + toolName, + args: call.request.args, + success, + durationMs: duration, + errorMessage, + timestamp: Date.now(), + }); + + // Append response parts + const respParts = call.response.responseParts; + if (respParts) { + const parts = Array.isArray(respParts) ? 
respParts : [respParts]; + for (const part of parts) { + if (typeof part === 'string') { + toolResponseParts.push({ text: part }); + } else if (part) { + toolResponseParts.push(part); + } + } + } + } + // Signal that this batch is complete (all tools terminal) + resolveBatch?.(); + }, + onToolCallsUpdate: (calls: ToolCall[]) => { + for (const call of calls) { + // Track PTY PIDs so TOOL_OUTPUT_UPDATE events can carry them. + if (call.status === 'executing') { + const pid = (call as ExecutingToolCall).pid; + if (pid !== undefined) { + const isNewPid = !pidMap.has(call.request.callId); + pidMap.set(call.request.callId, pid); + // Emit immediately so the UI can offer interactive shell + // focus (Ctrl+F) before the tool produces its first output. + if (isNewPid) { + this.eventEmitter?.emit(AgentEventType.TOOL_OUTPUT_UPDATE, { + subagentId: this.subagentId, + round: currentRound, + callId: call.request.callId, + outputChunk: (call as ExecutingToolCall).liveOutput ?? '', + pid, + timestamp: Date.now(), + } as AgentToolOutputUpdateEvent); + } + } + } + + if (call.status !== 'awaiting_approval') continue; + const waiting = call as WaitingToolCall; + + // Emit approval request event for UI visibility + try { + const { confirmationDetails } = waiting; + const { onConfirm: _onConfirm, ...rest } = confirmationDetails; + this.eventEmitter?.emit(AgentEventType.TOOL_WAITING_APPROVAL, { + subagentId: this.subagentId, + round: currentRound, + callId: waiting.request.callId, + name: waiting.request.name, + description: this.getToolDescription( + waiting.request.name, + waiting.request.args, + ), + confirmationDetails: rest, + respond: async ( + outcome: ToolConfirmationOutcome, + payload?: Parameters< + ToolCallConfirmationDetails['onConfirm'] + >[1], + ) => { + if (responded.has(waiting.request.callId)) return; + responded.add(waiting.request.callId); + await waiting.confirmationDetails.onConfirm(outcome, payload); + }, + timestamp: Date.now(), + }); + } catch { + // ignore UI 
event emission failures + } + } + }, + getPreferredEditor: () => undefined, + onEditorClose: () => {}, + }); + + // Prepare requests and emit TOOL_CALL events + const requests: ToolCallRequestInfo[] = authorizedCalls.map((fc) => { + const toolName = String(fc.name || 'unknown'); + const callId = fc.id ?? `${fc.name}-${Date.now()}`; + const args = (fc.args ?? {}) as Record<string, unknown>; + const request: ToolCallRequestInfo = { + callId, + name: toolName, + args, + isClientInitiated: true, + prompt_id: promptId, + response_id: responseId, + }; + + const description = this.getToolDescription(toolName, args); + const isOutputMarkdown = this.getToolIsOutputMarkdown(toolName); + this.eventEmitter?.emit(AgentEventType.TOOL_CALL, { + subagentId: this.subagentId, + round: currentRound, + callId, + name: toolName, + args, + description, + isOutputMarkdown, + timestamp: Date.now(), + } as AgentToolCallEvent); + + // pre-tool hook + void this.hooks?.preToolUse?.({ + subagentId: this.subagentId, + name: this.name, + toolName, + args, + timestamp: Date.now(), + }); + + return request; + }); + + if (requests.length > 0) { + // Create a per-batch completion promise + const batchDone = new Promise<void>((resolve) => { + resolveBatch = () => { + resolve(); + resolveBatch = null; + }; + }); + + // Auto-resolve on abort so processFunctionCalls doesn't block forever + // when tools are awaiting approval or executing without abort support.
+ const onAbort = () => { + resolveBatch?.(); + for (const req of requests) { + if (emittedCallIds.has(req.callId)) continue; + emittedCallIds.add(req.callId); + + const errorMessage = 'Tool call cancelled by user abort.'; + this.recordToolCallStats(req.name, false, 0, errorMessage); + + this.eventEmitter?.emit(AgentEventType.TOOL_RESULT, { + subagentId: this.subagentId, + round: currentRound, + callId: req.callId, + name: req.name, + success: false, + error: errorMessage, + responseParts: [ + { + functionResponse: { + id: req.callId, + name: req.name, + response: { error: errorMessage }, + }, + }, + ], + resultDisplay: errorMessage, + durationMs: 0, + timestamp: Date.now(), + } as AgentToolResultEvent); + } + }; + abortController.signal.addEventListener('abort', onAbort, { once: true }); + + // If already aborted before the listener was registered, resolve + // immediately to avoid blocking forever. + if (abortController.signal.aborted) { + onAbort(); + } + + await scheduler.schedule(requests, abortController.signal); + await batchDone; + + abortController.signal.removeEventListener('abort', onAbort); + } + + // If all tool calls failed, inform the model so it can re-evaluate. + if (functionCalls.length > 0 && toolResponseParts.length === 0) { + toolResponseParts.push({ + text: 'All tool calls failed. Please analyze the errors and try an alternative approach.', + }); + } + + return [{ role: 'user', parts: toolResponseParts }]; + } + + // ─── Stats & Events ─────────────────────────────────────── + + getEventEmitter(): AgentEventEmitter | undefined { + return this.eventEmitter; + } + + getExecutionSummary(): AgentStatsSummary { + return this.stats.getSummary(); + } + + /** + * Returns legacy execution statistics and per-tool usage.
+ */ + getStatistics(): { + successRate: number; + toolUsage: Array<{ + name: string; + count: number; + success: number; + failure: number; + lastError?: string; + totalDurationMs?: number; + averageDurationMs?: number; + }>; + } & ExecutionStats { + const total = this.executionStats.totalToolCalls; + const successRate = + total > 0 ? (this.executionStats.successfulToolCalls / total) * 100 : 0; + return { + ...this.executionStats, + successRate, + toolUsage: Array.from(this.toolUsage.entries()).map(([name, v]) => ({ + name, + ...v, + })), + }; + } + + /** + * Safely retrieves the description of a tool by attempting to build it. + * Returns an empty string if any error occurs during the process. + */ + getToolDescription(toolName: string, args: Record): string { + try { + const toolRegistry = this.runtimeContext.getToolRegistry(); + const tool = toolRegistry.getTool(toolName); + if (!tool) { + return ''; + } + + const toolInstance = tool.build(args); + return toolInstance.getDescription() || ''; + } catch { + return ''; + } + } + + private getToolIsOutputMarkdown(toolName: string): boolean { + try { + const toolRegistry = this.runtimeContext.getToolRegistry(); + return toolRegistry.getTool(toolName)?.isOutputMarkdown ?? false; + } catch { + return false; + } + } + + /** + * Records tool call statistics for both successful and failed tool calls. 
+ */ + recordToolCallStats( + toolName: string, + success: boolean, + durationMs: number, + errorMessage?: string, + ): void { + // Update aggregate stats + this.executionStats.totalToolCalls += 1; + if (success) { + this.executionStats.successfulToolCalls += 1; + } else { + this.executionStats.failedToolCalls += 1; + } + + // Per-tool usage + const tu = this.toolUsage.get(toolName) || { + count: 0, + success: 0, + failure: 0, + totalDurationMs: 0, + averageDurationMs: 0, + }; + tu.count += 1; + if (success) { + tu.success += 1; + } else { + tu.failure += 1; + tu.lastError = errorMessage || 'Unknown error'; + } + tu.totalDurationMs = (tu.totalDurationMs || 0) + durationMs; + tu.averageDurationMs = tu.count > 0 ? tu.totalDurationMs / tu.count : 0; + this.toolUsage.set(toolName, tu); + + // Update statistics service + this.stats.recordToolCall( + toolName, + success, + durationMs, + this.toolUsage.get(toolName)?.lastError, + ); + } + + // ─── Private Helpers ────────────────────────────────────── + + /** + * Builds the system prompt with template substitution and optional + * non-interactive instructions suffix. + */ + private buildChatSystemPrompt( + context: ContextState, + options?: CreateChatOptions, + ): string { + if (!this.promptConfig.systemPrompt) { + return ''; + } + + let finalPrompt = templateString(this.promptConfig.systemPrompt, context); + + // Only add non-interactive instructions when NOT in interactive mode + if (!options?.interactive) { + finalPrompt += ` + +Important Rules: + - You operate in non-interactive mode: do not ask the user questions; proceed with available context. + - Use tools only when necessary to obtain facts or make changes. 
+ - When the task is complete, return the final result as a normal model response (not a tool call) and stop.`; + } + + // Append user memory (QWEN.md + output-language.md) to ensure subagent respects project conventions + const userMemory = this.runtimeContext.getUserMemory(); + if (userMemory && userMemory.trim().length > 0) { + finalPrompt += `\n\n---\n\n${userMemory.trim()}`; + } + + return finalPrompt; + } + + /** + * Records token usage from model response metadata. + */ + private recordTokenUsage( + usage: GenerateContentResponseUsageMetadata, + turnCounter: number, + roundStreamStart: number, + ): void { + const inTok = Number(usage.promptTokenCount || 0); + const outTok = Number(usage.candidatesTokenCount || 0); + const thoughtTok = Number(usage.thoughtsTokenCount || 0); + const cachedTok = Number(usage.cachedContentTokenCount || 0); + const totalTok = Number(usage.totalTokenCount || 0); + // Prefer totalTokenCount (prompt + output) for context usage — the + // output from this round becomes history for the next, matching + // the approach in geminiChat.ts. + const contextTok = isFinite(totalTok) && totalTok > 0 ? totalTok : inTok; + if (isFinite(contextTok) && contextTok > 0) { + this.lastPromptTokenCount = contextTok; + } + if ( + isFinite(inTok) || + isFinite(outTok) || + isFinite(thoughtTok) || + isFinite(cachedTok) + ) { + this.stats.recordTokens( + isFinite(inTok) ? inTok : 0, + isFinite(outTok) ? outTok : 0, + isFinite(thoughtTok) ? thoughtTok : 0, + isFinite(cachedTok) ? cachedTok : 0, + isFinite(totalTok) ? totalTok : 0, + ); + // Mirror legacy fields for compatibility + this.executionStats.inputTokens = + (this.executionStats.inputTokens || 0) + (isFinite(inTok) ? inTok : 0); + this.executionStats.outputTokens = + (this.executionStats.outputTokens || 0) + + (isFinite(outTok) ? outTok : 0); + this.executionStats.totalTokens = + (this.executionStats.totalTokens || 0) + + (isFinite(totalTok) ? 
totalTok : 0); + } + this.eventEmitter?.emit(AgentEventType.USAGE_METADATA, { + subagentId: this.subagentId, + round: turnCounter, + usage, + durationMs: Date.now() - roundStreamStart, + timestamp: Date.now(), + } as AgentUsageEvent); + } +} diff --git a/packages/core/src/agents/runtime/agent-events.ts b/packages/core/src/agents/runtime/agent-events.ts new file mode 100644 index 0000000000..4626bb0cd3 --- /dev/null +++ b/packages/core/src/agents/runtime/agent-events.ts @@ -0,0 +1,260 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Agent event types, emitter, and lifecycle hooks. + * + * Defines the observation/notification contracts for the agent runtime: + * - Event types emitted during agent execution (streaming, tool calls, etc.) + * - AgentEventEmitter — typed wrapper around EventEmitter + * - Lifecycle hooks (pre/post tool use, stop) for synchronous callbacks + */ + +import { EventEmitter } from 'events'; +import type { + ToolCallConfirmationDetails, + ToolConfirmationOutcome, + ToolResultDisplay, +} from '../../tools/tools.js'; +import type { Part, GenerateContentResponseUsageMetadata } from '@google/genai'; +import type { AgentStatus } from './agent-types.js'; + +// ─── Event Types ──────────────────────────────────────────── + +export type AgentEvent = + | 'start' + | 'round_start' + | 'round_end' + | 'round_text' + | 'stream_text' + | 'tool_call' + | 'tool_result' + | 'tool_output_update' + | 'tool_waiting_approval' + | 'usage_metadata' + | 'finish' + | 'error' + | 'status_change'; + +export enum AgentEventType { + START = 'start', + ROUND_START = 'round_start', + ROUND_END = 'round_end', + /** Complete round text, emitted once after streaming before tool calls. 
*/ + ROUND_TEXT = 'round_text', + STREAM_TEXT = 'stream_text', + TOOL_CALL = 'tool_call', + TOOL_RESULT = 'tool_result', + TOOL_OUTPUT_UPDATE = 'tool_output_update', + TOOL_WAITING_APPROVAL = 'tool_waiting_approval', + USAGE_METADATA = 'usage_metadata', + FINISH = 'finish', + ERROR = 'error', + STATUS_CHANGE = 'status_change', +} + +// ─── Event Payloads ───────────────────────────────────────── + +export interface AgentStartEvent { + subagentId: string; + name: string; + model?: string; + tools: string[]; + timestamp: number; +} + +export interface AgentRoundEvent { + subagentId: string; + round: number; + promptId: string; + timestamp: number; +} + +export interface AgentRoundTextEvent { + subagentId: string; + round: number; + text: string; + thoughtText: string; + timestamp: number; +} + +export interface AgentStreamTextEvent { + subagentId: string; + round: number; + text: string; + /** Whether this text is reasoning/thinking content (as opposed to regular output) */ + thought?: boolean; + timestamp: number; +} + +export interface AgentUsageEvent { + subagentId: string; + round: number; + usage: GenerateContentResponseUsageMetadata; + durationMs?: number; + timestamp: number; +} + +export interface AgentToolCallEvent { + subagentId: string; + round: number; + callId: string; + name: string; + args: Record; + description: string; + /** Whether the tool's output should be rendered as markdown. */ + isOutputMarkdown?: boolean; + timestamp: number; +} + +export interface AgentToolResultEvent { + subagentId: string; + round: number; + callId: string; + name: string; + success: boolean; + error?: string; + responseParts?: Part[]; + resultDisplay?: ToolResultDisplay; + /** Path to the temp file where oversized output was saved. 
*/ + outputFile?: string; + durationMs?: number; + timestamp: number; +} + +export interface AgentToolOutputUpdateEvent { + subagentId: string; + round: number; + callId: string; + /** Latest accumulated output for this tool call (replaces previous). */ + outputChunk: ToolResultDisplay; + /** PTY process PID — present when the tool runs in an interactive shell. */ + pid?: number; + timestamp: number; +} + +export interface AgentApprovalRequestEvent { + subagentId: string; + round: number; + callId: string; + name: string; + description: string; + confirmationDetails: Omit & { + type: ToolCallConfirmationDetails['type']; + }; + respond: ( + outcome: ToolConfirmationOutcome, + payload?: Parameters[1], + ) => Promise; + timestamp: number; +} + +export interface AgentFinishEvent { + subagentId: string; + terminateReason: string; + timestamp: number; + rounds?: number; + totalDurationMs?: number; + totalToolCalls?: number; + successfulToolCalls?: number; + failedToolCalls?: number; + inputTokens?: number; + outputTokens?: number; + totalTokens?: number; +} + +export interface AgentErrorEvent { + subagentId: string; + error: string; + timestamp: number; +} + +export interface AgentStatusChangeEvent { + agentId: string; + previousStatus: AgentStatus; + newStatus: AgentStatus; + /** True when the transition to IDLE was caused by user cancelling the round. */ + roundCancelledByUser?: boolean; + timestamp: number; +} + +// ─── Event Map ────────────────────────────────────────────── + +/** + * Maps each event type to its payload type for type-safe emit/on. 
+ */ +export interface AgentEventMap { + [AgentEventType.START]: AgentStartEvent; + [AgentEventType.ROUND_START]: AgentRoundEvent; + [AgentEventType.ROUND_END]: AgentRoundEvent; + [AgentEventType.ROUND_TEXT]: AgentRoundTextEvent; + [AgentEventType.STREAM_TEXT]: AgentStreamTextEvent; + [AgentEventType.TOOL_CALL]: AgentToolCallEvent; + [AgentEventType.TOOL_RESULT]: AgentToolResultEvent; + [AgentEventType.TOOL_OUTPUT_UPDATE]: AgentToolOutputUpdateEvent; + [AgentEventType.TOOL_WAITING_APPROVAL]: AgentApprovalRequestEvent; + [AgentEventType.USAGE_METADATA]: AgentUsageEvent; + [AgentEventType.FINISH]: AgentFinishEvent; + [AgentEventType.ERROR]: AgentErrorEvent; + [AgentEventType.STATUS_CHANGE]: AgentStatusChangeEvent; +} + +// ─── Event Emitter ────────────────────────────────────────── + +export class AgentEventEmitter { + private ee = new EventEmitter(); + + on<E extends keyof AgentEventMap>( + event: E, + listener: (payload: AgentEventMap[E]) => void, + ): void { + this.ee.on(event, listener as (...args: unknown[]) => void); + } + + off<E extends keyof AgentEventMap>( + event: E, + listener: (payload: AgentEventMap[E]) => void, + ): void { + this.ee.off(event, listener as (...args: unknown[]) => void); + } + + emit<E extends keyof AgentEventMap>( + event: E, + payload: AgentEventMap[E], + ): void { + this.ee.emit(event, payload); + } +} + +// ─── Lifecycle Hooks ──────────────────────────────────────── + +export interface PreToolUsePayload { + subagentId: string; + name: string; // subagent name + toolName: string; + args: Record<string, unknown>; + timestamp: number; +} + +export interface PostToolUsePayload extends PreToolUsePayload { + success: boolean; + durationMs: number; + errorMessage?: string; +} + +export interface AgentStopPayload { + subagentId: string; + name: string; // subagent name + terminateReason: string; + summary: Record<string, unknown>; + timestamp: number; +} + +export interface AgentHooks { + preToolUse?(payload: PreToolUsePayload): Promise<void> | void; + postToolUse?(payload: PostToolUsePayload): Promise<void> | void; + onStop?(payload: AgentStopPayload): Promise<void> | void; +} 
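The typed-emitter pattern in `agent-events.ts` (an event map keyed by event name, with `on`/`off`/`emit` generic over that key) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the project's implementation: `DemoEventMap`, `DemoEventEmitter`, and the two payload interfaces are hypothetical names, and the sketch keeps listeners in a plain object instead of wrapping Node's `EventEmitter` so that it runs without any imports.

```typescript
// Hypothetical, trimmed-down payloads; the real interfaces carry more fields.
interface DemoToolCallEvent {
  subagentId: string;
  callId: string;
  name: string;
  timestamp: number;
}

interface DemoFinishEvent {
  subagentId: string;
  terminateReason: string;
  timestamp: number;
}

// Maps each event name to its payload type (the same idea as AgentEventMap).
interface DemoEventMap {
  tool_call: DemoToolCallEvent;
  finish: DemoFinishEvent;
}

type UntypedListener = (payload: unknown) => void;

class DemoEventEmitter {
  // Assumption: a plain listener table stands in for Node's EventEmitter.
  private listeners: { [event: string]: UntypedListener[] } = {};

  on<E extends keyof DemoEventMap>(
    event: E,
    listener: (payload: DemoEventMap[E]) => void,
  ): void {
    const current = this.listeners[event] ?? [];
    this.listeners[event] = current.concat(listener as UntypedListener);
  }

  off<E extends keyof DemoEventMap>(
    event: E,
    listener: (payload: DemoEventMap[E]) => void,
  ): void {
    const target = listener as UntypedListener;
    const current = this.listeners[event] ?? [];
    this.listeners[event] = current.filter((l) => l !== target);
  }

  emit<E extends keyof DemoEventMap>(
    event: E,
    payload: DemoEventMap[E],
  ): void {
    (this.listeners[event] ?? []).forEach((l) => l(payload));
  }
}

// Usage: the listener parameter type is inferred from the event name.
const emitter = new DemoEventEmitter();
const seenTools: string[] = [];
const onToolCall = (e: DemoToolCallEvent): void => {
  seenTools.push(e.name);
};

emitter.on('tool_call', onToolCall);
emitter.emit('tool_call', {
  subagentId: 'agent-1',
  callId: 'c1',
  name: 'read_file',
  timestamp: Date.now(),
});

// After off(), further emissions no longer reach the listener.
emitter.off('tool_call', onToolCall);
emitter.emit('tool_call', {
  subagentId: 'agent-1',
  callId: 'c2',
  name: 'write_file',
  timestamp: Date.now(),
});
```

Passing a `finish`-shaped payload to `emit('tool_call', ...)` is rejected at compile time, which is the benefit of keying the payload type on the event name rather than accepting `unknown` arguments.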
diff --git a/packages/core/src/subagents/subagent.test.ts b/packages/core/src/agents/runtime/agent-headless.test.ts similarity index 86% rename from packages/core/src/subagents/subagent.test.ts rename to packages/core/src/agents/runtime/agent-headless.test.ts index 0286d11c85..7271eb0945 100644 --- a/packages/core/src/subagents/subagent.test.ts +++ b/packages/core/src/agents/runtime/agent-headless.test.ts @@ -21,39 +21,39 @@ import { vi, type Mock, } from 'vitest'; -import { Config, type ConfigParameters } from '../config/config.js'; -import { DEFAULT_QWEN_MODEL } from '../config/models.js'; +import { Config, type ConfigParameters } from '../../config/config.js'; +import { DEFAULT_QWEN_MODEL } from '../../config/models.js'; import { createContentGenerator, createContentGeneratorConfig, resolveContentGeneratorConfigWithSources, AuthType, -} from '../core/contentGenerator.js'; -import { GeminiChat } from '../core/geminiChat.js'; -import { executeToolCall } from '../core/nonInteractiveToolExecutor.js'; -import type { ToolRegistry } from '../tools/tool-registry.js'; -import { type AnyDeclarativeTool } from '../tools/tools.js'; -import { ContextState, SubAgentScope } from './subagent.js'; +} from '../../core/contentGenerator.js'; +import { GeminiChat } from '../../core/geminiChat.js'; +import { executeToolCall } from '../../core/nonInteractiveToolExecutor.js'; +import type { ToolRegistry } from '../../tools/tool-registry.js'; +import { type AnyDeclarativeTool } from '../../tools/tools.js'; +import { ContextState, AgentHeadless } from './agent-headless.js'; import { - SubAgentEventEmitter, - SubAgentEventType, - type SubAgentStreamTextEvent, - type SubAgentToolCallEvent, - type SubAgentToolResultEvent, -} from './subagent-events.js'; + AgentEventEmitter, + AgentEventType, + type AgentStreamTextEvent, + type AgentToolCallEvent, + type AgentToolResultEvent, +} from './agent-events.js'; import type { ModelConfig, PromptConfig, RunConfig, ToolConfig, -} from './types.js'; 
-import { SubagentTerminateMode } from './types.js'; +} from './agent-types.js'; +import { AgentTerminateMode } from './agent-types.js'; -vi.mock('../core/geminiChat.js'); -vi.mock('../core/contentGenerator.js', async (importOriginal) => { +vi.mock('../../core/geminiChat.js'); +vi.mock('../../core/contentGenerator.js', async (importOriginal) => { const actual = - await importOriginal(); - const { DEFAULT_QWEN_MODEL } = await import('../config/models.js'); + await importOriginal(); + const { DEFAULT_QWEN_MODEL } = await import('../../config/models.js'); return { ...actual, createContentGenerator: vi.fn().mockResolvedValue({ @@ -77,7 +77,7 @@ vi.mock('../core/contentGenerator.js', async (importOriginal) => { }), }; }); -vi.mock('../utils/environmentContext.js', () => ({ +vi.mock('../../utils/environmentContext.js', () => ({ getEnvironmentContext: vi.fn().mockResolvedValue([{ text: 'Env Context' }]), getInitialChatHistory: vi.fn(async (_config, extraHistory) => [ { @@ -91,11 +91,11 @@ vi.mock('../utils/environmentContext.js', () => ({ ...(extraHistory ?? 
[]), ]), })); -vi.mock('../core/nonInteractiveToolExecutor.js'); -vi.mock('../ide/ide-client.js'); -vi.mock('../core/client.js'); +vi.mock('../../core/nonInteractiveToolExecutor.js'); +vi.mock('../../ide/ide-client.js'); +vi.mock('../../core/client.js'); -vi.mock('../skills/skill-manager.js', () => { +vi.mock('../../skills/skill-manager.js', () => { const SkillManagerMock = vi.fn(); SkillManagerMock.prototype.startWatching = vi .fn() @@ -107,7 +107,7 @@ vi.mock('../skills/skill-manager.js', () => { return { SkillManager: SkillManagerMock }; }); -vi.mock('./subagent-manager.js', () => { +vi.mock('../../subagents/subagent-manager.js', () => { const SubagentManagerMock = vi.fn(); SubagentManagerMock.prototype.loadSessionSubagents = vi.fn(); SubagentManagerMock.prototype.addChangeListener = vi @@ -226,7 +226,7 @@ describe('subagent.ts', () => { }); }); - describe('SubAgentScope', () => { + describe('AgentHeadless', () => { let mockSendMessageStream: Mock; const defaultModelConfig: ModelConfig = { @@ -299,16 +299,16 @@ describe('subagent.ts', () => { describe('create (Tool Validation)', () => { const promptConfig: PromptConfig = { systemPrompt: 'Test prompt' }; - it('should create a SubAgentScope successfully with minimal config', async () => { + it('should create a AgentHeadless successfully with minimal config', async () => { const { config } = await createMockConfig(); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, defaultModelConfig, defaultRunConfig, ); - expect(scope).toBeInstanceOf(SubAgentScope); + expect(scope).toBeInstanceOf(AgentHeadless); }); it('should not block creation when a tool may require confirmation', async () => { @@ -331,7 +331,7 @@ describe('subagent.ts', () => { const toolConfig: ToolConfig = { tools: ['risky_tool'] }; - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -339,7 +339,7 @@ 
describe('subagent.ts', () => { defaultRunConfig, toolConfig, ); - expect(scope).toBeInstanceOf(SubAgentScope); + expect(scope).toBeInstanceOf(AgentHeadless); }); it('should succeed if tools do not require confirmation', async () => { @@ -357,7 +357,7 @@ describe('subagent.ts', () => { const toolConfig: ToolConfig = { tools: ['safe_tool'] }; - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -365,7 +365,7 @@ describe('subagent.ts', () => { defaultRunConfig, toolConfig, ); - expect(scope).toBeInstanceOf(SubAgentScope); + expect(scope).toBeInstanceOf(AgentHeadless); }); it('should allow creation regardless of tool parameter requirements', async () => { @@ -390,7 +390,7 @@ describe('subagent.ts', () => { const toolConfig: ToolConfig = { tools: ['tool_with_params'] }; - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -399,13 +399,13 @@ describe('subagent.ts', () => { toolConfig, ); - expect(scope).toBeInstanceOf(SubAgentScope); + expect(scope).toBeInstanceOf(AgentHeadless); // Ensure build was not called during creation expect(mockToolWithParams.build).not.toHaveBeenCalled(); }); }); - describe('runNonInteractive - Initialization and Prompting', () => { + describe('execute - Initialization and Prompting', () => { it('should correctly template the system prompt and initialize GeminiChat', async () => { const { config } = await createMockConfig(); @@ -421,7 +421,7 @@ describe('subagent.ts', () => { // Model stops immediately mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -429,7 +429,7 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(context); + await scope.execute(context); // Check if GeminiChat was initialized correctly by the 
subagent expect(GeminiChat).toHaveBeenCalledTimes(1); @@ -473,7 +473,7 @@ describe('subagent.ts', () => { mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -481,7 +481,7 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(context); + await scope.execute(context); const generationConfig = getGenerationConfigFromMock(); expect(generationConfig.systemInstruction).toContain( @@ -511,7 +511,7 @@ describe('subagent.ts', () => { mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -519,7 +519,7 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(context); + await scope.execute(context); const generationConfig = getGenerationConfigFromMock(); const sysPrompt = generationConfig.systemInstruction as string; @@ -540,7 +540,7 @@ describe('subagent.ts', () => { mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -548,7 +548,7 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(context); + await scope.execute(context); const generationConfig = getGenerationConfigFromMock(); const sysPrompt = generationConfig.systemInstruction as string; @@ -568,7 +568,7 @@ describe('subagent.ts', () => { // Model stops immediately mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -576,7 +576,7 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(context); + await 
scope.execute(context); const callArgs = vi.mocked(GeminiChat).mock.calls[0]; const generationConfig = getGenerationConfigFromMock(); @@ -602,7 +602,7 @@ describe('subagent.ts', () => { context.set('name', 'Agent'); // 'missing' is not set - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -610,11 +610,11 @@ describe('subagent.ts', () => { defaultRunConfig, ); - // The error from templating causes the runNonInteractive to reject and the terminate_reason to be ERROR. - await expect(scope.runNonInteractive(context)).rejects.toThrow( + // The error from templating causes the execute to reject and the terminate_reason to be ERROR. + await expect(scope.execute(context)).rejects.toThrow( 'Missing context values for the following keys: missing', ); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.ERROR); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.ERROR); }); it('should validate that systemPrompt and initialMessages are mutually exclusive', async () => { @@ -625,7 +625,7 @@ describe('subagent.ts', () => { }; const context = new ContextState(); - const agent = await SubAgentScope.create( + const agent = await AgentHeadless.create( 'TestAgent', config, promptConfig, @@ -633,14 +633,14 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await expect(agent.runNonInteractive(context)).rejects.toThrow( + await expect(agent.execute(context)).rejects.toThrow( 'PromptConfig cannot have both `systemPrompt` and `initialMessages` defined.', ); - expect(agent.getTerminateMode()).toBe(SubagentTerminateMode.ERROR); + expect(agent.getTerminateMode()).toBe(AgentTerminateMode.ERROR); }); }); - describe('runNonInteractive - Execution and Tool Use', () => { + describe('execute - Execution and Tool Use', () => { const promptConfig: PromptConfig = { systemPrompt: 'Execute task.' 
}; it('should terminate with GOAL if no outputs are expected and model stops', async () => { @@ -648,7 +648,7 @@ describe('subagent.ts', () => { // Model stops immediately mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -657,9 +657,9 @@ describe('subagent.ts', () => { // No ToolConfig, No OutputConfig ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.GOAL); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.GOAL); expect(mockSendMessageStream).toHaveBeenCalledTimes(1); // Check the initial message expect(mockSendMessageStream.mock.calls[0][1].message).toEqual([ @@ -673,7 +673,7 @@ describe('subagent.ts', () => { // Model stops immediately with text response mockSendMessageStream.mockImplementation(createMockStream(['stop'])); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -681,9 +681,9 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.GOAL); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.GOAL); expect(mockSendMessageStream).toHaveBeenCalledTimes(1); }); @@ -744,7 +744,7 @@ describe('subagent.ts', () => { name === 'list_files' ? 
listFilesTool : undefined, ); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -753,7 +753,7 @@ describe('subagent.ts', () => { toolConfig, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); // Check the response sent back to the model (functionResponse part) const secondCallArgs = mockSendMessageStream.mock.calls[1][1]; @@ -764,11 +764,11 @@ describe('subagent.ts', () => { 'file1.txt\nfile2.ts', ); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.GOAL); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.GOAL); }); }); - describe('runNonInteractive - Termination and Recovery', () => { + describe('execute - Termination and Recovery', () => { const promptConfig: PromptConfig = { systemPrompt: 'Execute task.' }; it('should terminate with MAX_TURNS if the limit is reached', async () => { @@ -800,7 +800,7 @@ describe('subagent.ts', () => { ]), ); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -808,10 +808,10 @@ describe('subagent.ts', () => { runConfig, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); expect(mockSendMessageStream).toHaveBeenCalledTimes(2); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.MAX_TURNS); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.MAX_TURNS); }); it.skip('should terminate with TIMEOUT if the time limit is reached during an LLM call', async () => { @@ -835,7 +835,7 @@ describe('subagent.ts', () => { // The LLM call will hang until we resolve the promise. 
mockSendMessageStream.mockReturnValue(streamPromise); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -843,7 +843,7 @@ describe('subagent.ts', () => { runConfig, ); - const runPromise = scope.runNonInteractive(new ContextState()); + const runPromise = scope.execute(new ContextState()); // Advance time beyond the limit (6 minutes) while the agent is awaiting the LLM response. await vi.advanceTimersByTimeAsync(6 * 60 * 1000); @@ -854,7 +854,7 @@ describe('subagent.ts', () => { await runPromise; - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.TIMEOUT); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.TIMEOUT); expect(mockSendMessageStream).toHaveBeenCalledTimes(1); vi.useRealTimers(); @@ -864,7 +864,7 @@ describe('subagent.ts', () => { const { config } = await createMockConfig(); mockSendMessageStream.mockRejectedValue(new Error('API Failure')); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -872,14 +872,14 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await expect( - scope.runNonInteractive(new ContextState()), - ).rejects.toThrow('API Failure'); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.ERROR); + await expect(scope.execute(new ContextState())).rejects.toThrow( + 'API Failure', + ); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.ERROR); }); }); - describe('runNonInteractive - Streaming and Thought Handling', () => { + describe('execute - Streaming and Thought Handling', () => { const promptConfig: PromptConfig = { systemPrompt: 'Execute task.' 
}; // Helper to create a mock stream that yields specific parts @@ -913,13 +913,13 @@ describe('subagent.ts', () => { }) as unknown as GeminiChat, ); - const eventEmitter = new SubAgentEventEmitter(); - const events: SubAgentStreamTextEvent[] = []; - eventEmitter.on(SubAgentEventType.STREAM_TEXT, (...args: unknown[]) => { - events.push(args[0] as SubAgentStreamTextEvent); + const eventEmitter = new AgentEventEmitter(); + const events: AgentStreamTextEvent[] = []; + eventEmitter.on(AgentEventType.STREAM_TEXT, (...args: unknown[]) => { + events.push(args[0] as AgentStreamTextEvent); }); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -929,7 +929,7 @@ describe('subagent.ts', () => { eventEmitter, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); expect(events).toHaveLength(2); expect(events[0]!.text).toBe('Let me think...'); @@ -952,7 +952,7 @@ describe('subagent.ts', () => { }) as unknown as GeminiChat, ); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -960,9 +960,9 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.GOAL); + expect(scope.getTerminateMode()).toBe(AgentTerminateMode.GOAL); expect(scope.getFinalText()).toBe('The final answer.'); }); @@ -1016,7 +1016,7 @@ describe('subagent.ts', () => { }) as unknown as GeminiChat, ); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -1024,16 +1024,16 @@ describe('subagent.ts', () => { defaultRunConfig, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); - expect(scope.getTerminateMode()).toBe(SubagentTerminateMode.GOAL); + 
expect(scope.getTerminateMode()).toBe(AgentTerminateMode.GOAL); expect(scope.getFinalText()).toBe('Actual output.'); // Should have been called twice: first with thought-only, then nudged expect(mockSendMessageStream).toHaveBeenCalledTimes(2); }); }); - describe('runNonInteractive - Tool Restriction Enforcement (Issue #1121)', () => { + describe('execute - Tool Restriction Enforcement (Issue #1121)', () => { const promptConfig: PromptConfig = { systemPrompt: 'Execute task.' }; it('should NOT execute tools that are not in the allowed tools list', async () => { @@ -1142,19 +1142,19 @@ describe('subagent.ts', () => { ); // Track emitted events - const toolCallEvents: SubAgentToolCallEvent[] = []; - const toolResultEvents: SubAgentToolResultEvent[] = []; + const toolCallEvents: AgentToolCallEvent[] = []; + const toolResultEvents: AgentToolResultEvent[] = []; // Create event emitter BEFORE the scope and subscribe to events - const eventEmitter = new SubAgentEventEmitter(); - eventEmitter.on(SubAgentEventType.TOOL_CALL, (event: unknown) => { - toolCallEvents.push(event as SubAgentToolCallEvent); + const eventEmitter = new AgentEventEmitter(); + eventEmitter.on(AgentEventType.TOOL_CALL, (event: unknown) => { + toolCallEvents.push(event as AgentToolCallEvent); }); - eventEmitter.on(SubAgentEventType.TOOL_RESULT, (event: unknown) => { - toolResultEvents.push(event as SubAgentToolResultEvent); + eventEmitter.on(AgentEventType.TOOL_RESULT, (event: unknown) => { + toolResultEvents.push(event as AgentToolResultEvent); }); - const scope = await SubAgentScope.create( + const scope = await AgentHeadless.create( 'test-agent', config, promptConfig, @@ -1164,7 +1164,7 @@ describe('subagent.ts', () => { eventEmitter, ); - await scope.runNonInteractive(new ContextState()); + await scope.execute(new ContextState()); // 1. 
Only allowed tool should be executed expect(executedTools).toContain('read_file'); diff --git a/packages/core/src/agents/runtime/agent-headless.ts b/packages/core/src/agents/runtime/agent-headless.ts new file mode 100644 index 0000000000..ac02f80dfb --- /dev/null +++ b/packages/core/src/agents/runtime/agent-headless.ts @@ -0,0 +1,360 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentHeadless — one-shot task execution wrapper around AgentCore. + * + * AgentHeadless manages + * the lifecycle of a single headless task: start → run → finish. + * It delegates all model reasoning and tool scheduling to AgentCore. + * + * For persistent interactive agents, see AgentInteractive (Phase 2). + */ + +import type { Config } from '../../config/config.js'; +import { createDebugLogger } from '../../utils/debugLogger.js'; +import type { + AgentEventEmitter, + AgentStartEvent, + AgentErrorEvent, + AgentFinishEvent, + AgentHooks, +} from './agent-events.js'; +import { AgentEventType } from './agent-events.js'; +import type { AgentStatsSummary } from './agent-statistics.js'; +import type { + PromptConfig, + ModelConfig, + RunConfig, + ToolConfig, +} from './agent-types.js'; +import { AgentTerminateMode } from './agent-types.js'; +import { logSubagentExecution } from '../../telemetry/loggers.js'; +import { SubagentExecutionEvent } from '../../telemetry/types.js'; +import { AgentCore } from './agent-core.js'; +import { DEFAULT_QWEN_MODEL } from '../../config/models.js'; + +const debugLogger = createDebugLogger('SUBAGENT'); + +// ─── Utilities (unchanged, re-exported for consumers) ──────── + +/** + * Manages the runtime context state for the subagent. + * This class provides a mechanism to store and retrieve key-value pairs + * that represent the dynamic state and variables accessible to the subagent + * during its execution. 
+ */ +export class ContextState { + private state: Record<string, unknown> = {}; + + /** + * Retrieves a value from the context state. + * + * @param key - The key of the value to retrieve. + * @returns The value associated with the key, or undefined if the key is not found. + */ + get(key: string): unknown { + return this.state[key]; + } + + /** + * Sets a value in the context state. + * + * @param key - The key to set the value under. + * @param value - The value to set. + */ + set(key: string, value: unknown): void { + this.state[key] = value; + } + + /** + * Retrieves all keys in the context state. + * + * @returns An array of all keys in the context state. + */ + get_keys(): string[] { + return Object.keys(this.state); + } +} + +/** + * Replaces `${...}` placeholders in a template string with values from a context. + * + * This function identifies all placeholders in the format `${key}`, validates that + * each key exists in the provided `ContextState`, and then performs the substitution. + * + * @param template The template string containing placeholders. + * @param context The `ContextState` object providing placeholder values. + * @returns The populated string with all placeholders replaced. + * @throws {Error} if any placeholder key is not found in the context. + */ +export function templateString( + template: string, + context: ContextState, +): string { + const placeholderRegex = /\$\{(\w+)\}/g; + + // First, find all unique keys required by the template. + const requiredKeys = new Set( + Array.from(template.matchAll(placeholderRegex), (match) => match[1]), + ); + + // Check if all required keys exist in the context. + const contextKeys = new Set(context.get_keys()); + const missingKeys = Array.from(requiredKeys).filter( + (key) => !contextKeys.has(key), + ); + + if (missingKeys.length > 0) { + throw new Error( + `Missing context values for the following keys: ${missingKeys.join( + ', ', + )}`, + ); + } + + // Perform the replacement using a replacer function. 
+ return template.replace(placeholderRegex, (_match, key) => + String(context.get(key)), + ); +} + +// ─── AgentHeadless ────────────────────────────────────────── + +/** + * AgentHeadless — one-shot task executor. + * + * Takes a task, runs it through AgentCore's reasoning loop, and returns + * the result. + * + * Lifecycle: Born → execute() → die. + */ +export class AgentHeadless { + private readonly core: AgentCore; + private finalText: string = ''; + private terminateMode: AgentTerminateMode = AgentTerminateMode.ERROR; + + private constructor(core: AgentCore) { + this.core = core; + } + + /** + * Creates a new AgentHeadless instance. + * + * @param name - The name for the subagent, used for logging and identification. + * @param runtimeContext - The shared runtime configuration and services. + * @param promptConfig - Configuration for the subagent's prompt and behavior. + * @param modelConfig - Configuration for the generative model parameters. + * @param runConfig - Configuration for the subagent's execution environment. + * @param toolConfig - Optional configuration for tools available to the subagent. + * @param eventEmitter - Optional event emitter for streaming events to UI. + * @param hooks - Optional lifecycle hooks. + */ + static async create( + name: string, + runtimeContext: Config, + promptConfig: PromptConfig, + modelConfig: ModelConfig, + runConfig: RunConfig, + toolConfig?: ToolConfig, + eventEmitter?: AgentEventEmitter, + hooks?: AgentHooks, + ): Promise<AgentHeadless> { + const core = new AgentCore( + name, + runtimeContext, + promptConfig, + modelConfig, + runConfig, + toolConfig, + eventEmitter, + hooks, + ); + return new AgentHeadless(core); + } + + /** + * Executes the task in headless mode. + * + * This method orchestrates the subagent's execution lifecycle: + * 1. Creates a chat session + * 2. Prepares tools + * 3. Runs the reasoning loop until completion/termination + * 4. Emits start/finish/error events + * 5. 
Records telemetry + * + * @param context - The current context state containing variables for prompt templating. + * @param externalSignal - Optional abort signal for external cancellation. + */ + async execute( + context: ContextState, + externalSignal?: AbortSignal, + ): Promise<void> { + const chat = await this.core.createChat(context); + + if (!chat) { + this.terminateMode = AgentTerminateMode.ERROR; + return; + } + + // Set up abort signal propagation + const abortController = new AbortController(); + const onExternalAbort = () => { + abortController.abort(); + }; + if (externalSignal) { + externalSignal.addEventListener('abort', onExternalAbort); + } + if (externalSignal?.aborted) { + abortController.abort(); + } + + const toolsList = this.core.prepareTools(); + + const initialTaskText = String( + (context.get('task_prompt') as string) ?? 'Get Started!', + ); + const initialMessages = [ + { role: 'user' as const, parts: [{ text: initialTaskText }] }, + ]; + + const startTime = Date.now(); + this.core.executionStats.startTimeMs = startTime; + this.core.stats.start(startTime); + + try { + // Emit start event + this.core.eventEmitter?.emit(AgentEventType.START, { + subagentId: this.core.subagentId, + name: this.core.name, + model: + this.core.modelConfig.model || + this.core.runtimeContext.getModel() || + DEFAULT_QWEN_MODEL, + tools: (this.core.toolConfig?.tools || ['*']).map((t) => + typeof t === 'string' ? 
t : t.name, + ), + timestamp: Date.now(), + } as AgentStartEvent); + + // Log telemetry for subagent start + const startEvent = new SubagentExecutionEvent(this.core.name, 'started'); + logSubagentExecution(this.core.runtimeContext, startEvent); + + // Delegate to AgentCore's reasoning loop + const result = await this.core.runReasoningLoop( + chat, + initialMessages, + toolsList, + abortController, + { + maxTurns: this.core.runConfig.max_turns, + maxTimeMinutes: this.core.runConfig.max_time_minutes, + startTimeMs: startTime, + }, + ); + + this.finalText = result.text; + this.terminateMode = result.terminateMode ?? AgentTerminateMode.GOAL; + } catch (error) { + debugLogger.error('Error during subagent execution:', error); + this.terminateMode = AgentTerminateMode.ERROR; + this.core.eventEmitter?.emit(AgentEventType.ERROR, { + subagentId: this.core.subagentId, + error: error instanceof Error ? error.message : String(error), + timestamp: Date.now(), + } as AgentErrorEvent); + + throw error; + } finally { + if (externalSignal) { + externalSignal.removeEventListener('abort', onExternalAbort); + } + this.core.executionStats.totalDurationMs = Date.now() - startTime; + const summary = this.core.stats.getSummary(Date.now()); + this.core.eventEmitter?.emit(AgentEventType.FINISH, { + subagentId: this.core.subagentId, + terminateReason: this.terminateMode, + timestamp: Date.now(), + rounds: summary.rounds, + totalDurationMs: summary.totalDurationMs, + totalToolCalls: summary.totalToolCalls, + successfulToolCalls: summary.successfulToolCalls, + failedToolCalls: summary.failedToolCalls, + inputTokens: summary.inputTokens, + outputTokens: summary.outputTokens, + totalTokens: summary.totalTokens, + } as AgentFinishEvent); + + const completionEvent = new SubagentExecutionEvent( + this.core.name, + this.terminateMode === AgentTerminateMode.GOAL ? 
'completed' : 'failed', + { + terminate_reason: this.terminateMode, + result: this.finalText, + execution_summary: this.core.stats.formatCompact( + 'Subagent execution completed', + ), + }, + ); + logSubagentExecution(this.core.runtimeContext, completionEvent); + + await this.core.hooks?.onStop?.({ + subagentId: this.core.subagentId, + name: this.core.name, + terminateReason: this.terminateMode, + summary: summary as unknown as Record<string, unknown>, + timestamp: Date.now(), + }); + } + } + + // ─── Accessors ───────────────────────────────────────────── + + /** + * Provides access to the underlying AgentCore for advanced use cases. + * Used by AgentInteractive and InProcessBackend. + */ + getCore(): AgentCore { + return this.core; + } + + get executionStats() { + return this.core.executionStats; + } + + set executionStats(value) { + this.core.executionStats = value; + } + + getEventEmitter() { + return this.core.getEventEmitter(); + } + + getStatistics() { + return this.core.getStatistics(); + } + + getExecutionSummary(): AgentStatsSummary { + return this.core.getExecutionSummary(); + } + + getFinalText(): string { + return this.finalText; + } + + getTerminateMode(): AgentTerminateMode { + return this.terminateMode; + } + + get name(): string { + return this.core.name; + } + + get runtimeContext(): Config { + return this.core.runtimeContext; + } +} diff --git a/packages/core/src/agents/runtime/agent-interactive.test.ts b/packages/core/src/agents/runtime/agent-interactive.test.ts new file mode 100644 index 0000000000..5560b665f7 --- /dev/null +++ b/packages/core/src/agents/runtime/agent-interactive.test.ts @@ -0,0 +1,620 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect, vi, beforeEach } from 'vitest'; +import { AgentInteractive } from './agent-interactive.js'; +import type { AgentCore } from './agent-core.js'; +import { AgentEventEmitter, AgentEventType } from './agent-events.js'; +import { ContextState } 
from './agent-headless.js'; +import type { AgentInteractiveConfig } from './agent-types.js'; +import { AgentStatus } from './agent-types.js'; + +function createMockChat() { + return { + sendMessageStream: vi.fn(), + }; +} + +function createMockCore( + overrides: { + chatValue?: unknown; + nullChat?: boolean; + loopResult?: { text: string; terminateMode: null; turnsUsed: number }; + } = {}, +) { + const emitter = new AgentEventEmitter(); + const chatReturnValue = overrides.nullChat + ? undefined + : overrides.chatValue !== undefined + ? overrides.chatValue + : createMockChat(); + const core = { + subagentId: 'test-agent-abc123', + name: 'test-agent', + eventEmitter: emitter, + stats: { + start: vi.fn(), + getSummary: vi.fn().mockReturnValue({ + rounds: 1, + totalDurationMs: 100, + totalToolCalls: 0, + successfulToolCalls: 0, + failedToolCalls: 0, + inputTokens: 0, + outputTokens: 0, + totalTokens: 0, + }), + setRounds: vi.fn(), + recordToolCall: vi.fn(), + recordTokens: vi.fn(), + }, + createChat: vi.fn().mockResolvedValue(chatReturnValue), + prepareTools: vi.fn().mockReturnValue([]), + runReasoningLoop: vi.fn().mockResolvedValue( + overrides.loopResult ?? 
{ + text: 'Done', + terminateMode: null, + turnsUsed: 1, + }, + ), + getEventEmitter: () => emitter, + getExecutionSummary: vi.fn().mockReturnValue({ + rounds: 1, + totalDurationMs: 100, + totalToolCalls: 0, + successfulToolCalls: 0, + failedToolCalls: 0, + inputTokens: 0, + outputTokens: 0, + totalTokens: 0, + }), + } as unknown as AgentCore; + + return { core, emitter }; +} + +function createConfig( + overrides: Partial<AgentInteractiveConfig> = {}, +): AgentInteractiveConfig { + return { + agentId: 'agent-1', + agentName: 'Test Agent', + ...overrides, + }; +} + +describe('AgentInteractive', () => { + let context: ContextState; + + beforeEach(() => { + context = new ContextState(); + }); + + // ─── Lifecycle ────────────────────────────────────────────── + + it('should initialize and complete cleanly without initialTask', async () => { + const { core } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + // No initialTask → agent is waiting on queue, status is still initializing. + // Shutdown drains queue, loop exits normally → completed. 
+ await agent.shutdown(); + expect(agent.getStatus()).toBe('completed'); + }); + + it('should process initialTask immediately on start', async () => { + const { core } = createMockCore(); + const config = createConfig({ initialTask: 'Do something' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + expect(core.runReasoningLoop).toHaveBeenCalledOnce(); + expect(agent.getMessages().length).toBeGreaterThan(0); + expect(agent.getMessages()[0]?.role).toBe('user'); + expect(agent.getMessages()[0]?.content).toBe('Do something'); + + await agent.shutdown(); + expect(agent.getStatus()).toBe('completed'); + }); + + it('should process enqueued messages', async () => { + const { core } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + + agent.enqueueMessage('Hello'); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + expect(core.runReasoningLoop).toHaveBeenCalledOnce(); + + await agent.shutdown(); + }); + + it('should set status to failed when chat creation fails', async () => { + const { core } = createMockCore({ nullChat: true }); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + + expect(agent.getStatus()).toBe('failed'); + expect(agent.getError()).toBe('Failed to create chat session'); + }); + + // ─── Error Recovery ──────────────────────────────────────── + + it('should survive round errors and recover', async () => { + const { core } = createMockCore(); + + let callCount = 0; + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + callCount++; + if (callCount === 1) { + return Promise.reject(new Error('Model error')); + } + return Promise.resolve({ + text: 'Recovered', + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig(); + 
const agent = new AgentInteractive(config, core); + + await agent.start(context); + + agent.enqueueMessage('cause error'); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('failed'); + expect(callCount).toBe(1); + }); + + // Error recorded as info message with error level + const messages = agent.getMessages(); + const errorMsg = messages.find( + (m) => + m.role === 'info' && + m.content.includes('Model error') && + m.metadata?.['level'] === 'error', + ); + expect(errorMsg).toBeDefined(); + + // Second message works fine + agent.enqueueMessage('recover'); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + expect(callCount).toBe(2); + }); + + await agent.shutdown(); + }); + + // ─── Cancellation ────────────────────────────────────────── + + it('should cancel current round without killing the agent', async () => { + const { core } = createMockCore(); + let resolveLoop: () => void; + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => + new Promise<{ text: string; terminateMode: string; turnsUsed: number }>( + (resolve) => { + resolveLoop = () => + resolve({ text: '', terminateMode: 'cancelled', turnsUsed: 0 }); + }, + ), + ); + + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + + agent.enqueueMessage('long task'); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('running'); + }); + + agent.cancelCurrentRound(); + resolveLoop!(); + + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + await agent.shutdown(); + }); + + it('should abort immediately', async () => { + const { core } = createMockCore(); + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => + new Promise((resolve) => { + setTimeout( + () => + resolve({ + text: '', + terminateMode: 'cancelled', + turnsUsed: 0, + }), + 50, + ); + }), + ); + + const config = createConfig({ initialTask: 'long task' }); + const agent = new AgentInteractive(config, 
core); + + await agent.start(context); + agent.abort(); + + await agent.waitForCompletion(); + expect(agent.getStatus()).toBe('cancelled'); + }); + + // ─── Accessors ───────────────────────────────────────────── + + it('should provide stats via getStats()', async () => { + const { core } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + const stats = agent.getStats(); + expect(stats).toBeDefined(); + expect(stats.rounds).toBe(1); + }); + + it('should provide core via getCore()', () => { + const { core } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + expect(agent.getCore()).toBe(core); + }); + + // ─── Message Recording ───────────────────────────────────── + + it('should record assistant text from ROUND_TEXT events', async () => { + const { core, emitter } = createMockCore(); + + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + emitter.emit(AgentEventType.ROUND_TEXT, { + subagentId: 'test', + round: 1, + text: 'Hello from round', + thoughtText: '', + timestamp: Date.now(), + }); + return Promise.resolve({ + text: 'Hello from round', + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig({ initialTask: 'test' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + const assistantMsgs = agent + .getMessages() + .filter((m) => m.role === 'assistant' && !m.thought); + expect(assistantMsgs).toHaveLength(1); + expect(assistantMsgs[0]?.content).toBe('Hello from round'); + + await agent.shutdown(); + }); + + it('should not cross-contaminate text across messages', async () => { + const { core, emitter } = createMockCore(); + + let runCount = 0; + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + runCount++; + emitter.emit(AgentEventType.ROUND_TEXT, { + subagentId: 'test', + 
round: 1, + text: `response-${runCount}`, + thoughtText: '', + timestamp: Date.now(), + }); + return Promise.resolve({ + text: `response-${runCount}`, + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig({ initialTask: 'first message' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + agent.enqueueMessage('second message'); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + expect(runCount).toBe(2); + }); + + const messages = agent.getMessages(); + const assistantMessages = messages.filter( + (m) => m.role === 'assistant' && !m.thought, + ); + const corrupted = assistantMessages.find( + (m) => + m.content.includes('response-1') && m.content.includes('response-2'), + ); + expect(corrupted).toBeUndefined(); + + await agent.shutdown(); + }); + + it('should capture thinking text as assistant messages with thought=true', async () => { + const { core, emitter } = createMockCore(); + + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + emitter.emit(AgentEventType.ROUND_TEXT, { + subagentId: 'test', + round: 1, + text: 'Here is the answer', + thoughtText: 'Let me think...', + timestamp: Date.now(), + }); + return Promise.resolve({ + text: 'Here is the answer', + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig({ initialTask: 'think about this' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + const messages = agent.getMessages(); + const thoughtMsg = messages.find( + (m) => m.role === 'assistant' && m.thought === true, + ); + const textMsg = messages.find((m) => m.role === 'assistant' && !m.thought); + + expect(thoughtMsg).toBeDefined(); + expect(thoughtMsg?.content).toBe('Let me think...'); + expect(textMsg).toBeDefined(); + 
expect(textMsg?.content).toBe('Here is the answer'); + + await agent.shutdown(); + }); + + it('should record tool_call and tool_result with correct roles', async () => { + const { core, emitter } = createMockCore(); + + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + emitter.emit(AgentEventType.ROUND_TEXT, { + subagentId: 'test', + round: 1, + text: 'I will read the file', + thoughtText: '', + timestamp: Date.now(), + }); + emitter.emit(AgentEventType.TOOL_CALL, { + subagentId: 'test', + round: 1, + callId: 'call-1', + name: 'read_file', + args: { path: 'test.ts' }, + description: 'Read test.ts', + timestamp: Date.now(), + }); + emitter.emit(AgentEventType.TOOL_RESULT, { + subagentId: 'test', + round: 1, + callId: 'call-1', + name: 'read_file', + success: true, + timestamp: Date.now(), + }); + return Promise.resolve({ + text: '', + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig({ initialTask: 'read a file' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + const messages = agent.getMessages(); + const toolCall = messages.find((m) => m.role === 'tool_call'); + const toolResult = messages.find((m) => m.role === 'tool_result'); + + expect(toolCall).toBeDefined(); + expect(toolCall?.metadata?.['toolName']).toBe('read_file'); + expect(toolCall?.metadata?.['callId']).toBe('call-1'); + + expect(toolResult).toBeDefined(); + expect(toolResult?.metadata?.['success']).toBe(true); + + await agent.shutdown(); + }); + + it('should place text before tool_call to preserve temporal ordering', async () => { + const { core, emitter } = createMockCore(); + + (core.runReasoningLoop as ReturnType<typeof vi.fn>).mockImplementation( + () => { + emitter.emit(AgentEventType.ROUND_TEXT, { + subagentId: 'test', + round: 1, + text: 'Let me check', + thoughtText: '', + timestamp: Date.now(), + }); + 
emitter.emit(AgentEventType.TOOL_CALL, { + subagentId: 'test', + round: 1, + callId: 'call-1', + name: 'read_file', + args: {}, + description: '', + timestamp: Date.now(), + }); + emitter.emit(AgentEventType.TOOL_RESULT, { + subagentId: 'test', + round: 1, + callId: 'call-1', + name: 'read_file', + success: true, + timestamp: Date.now(), + }); + return Promise.resolve({ + text: '', + terminateMode: null, + turnsUsed: 1, + }); + }, + ); + + const config = createConfig({ initialTask: 'task' }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + await vi.waitFor(() => { + expect(agent.getStatus()).toBe('idle'); + }); + + const messages = agent.getMessages(); + const nonUser = messages.filter((m) => m.role !== 'user'); + + const textIdx = nonUser.findIndex( + (m) => m.role === 'assistant' && m.content === 'Let me check', + ); + const toolIdx = nonUser.findIndex((m) => m.role === 'tool_call'); + expect(textIdx).toBeLessThan(toolIdx); + + await agent.shutdown(); + }); + + // ─── Chat History ──────────────────────────────────────────── + + it('should pass chatHistory as extraHistory to createChat', async () => { + const { core } = createMockCore(); + const chatHistory = [ + { role: 'user' as const, parts: [{ text: 'earlier question' }] }, + { role: 'model' as const, parts: [{ text: 'earlier answer' }] }, + ]; + const config = createConfig({ chatHistory }); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + + expect(core.createChat).toHaveBeenCalledWith(context, { + interactive: true, + extraHistory: chatHistory, + }); + + await agent.shutdown(); + }); + + it('should add info message when chatHistory is present', async () => { + const { core } = createMockCore(); + const chatHistory = [ + { role: 'user' as const, parts: [{ text: 'earlier question' }] }, + { role: 'model' as const, parts: [{ text: 'earlier answer' }] }, + ]; + const agent = new AgentInteractive(createConfig({ chatHistory }), core); + + 
await agent.start(context); + + const messages = agent.getMessages(); + expect(messages).toHaveLength(1); + expect(messages[0]).toMatchObject({ + role: 'info', + content: 'History context from parent session included (2 messages)', + }); + + await agent.shutdown(); + }); + + it('should not add info message when chatHistory is absent', async () => { + const { core } = createMockCore(); + const agent = new AgentInteractive(createConfig(), core); + + await agent.start(context); + + expect(agent.getMessages()).toHaveLength(0); + + await agent.shutdown(); + }); + + it('should pass undefined extraHistory when chatHistory is not set', async () => { + const { core } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + await agent.start(context); + + expect(core.createChat).toHaveBeenCalledWith(context, { + interactive: true, + extraHistory: undefined, + }); + + await agent.shutdown(); + }); + + // ─── Events ──────────────────────────────────────────────── + + it('should emit status_change events', async () => { + const { core, emitter } = createMockCore(); + const config = createConfig(); + const agent = new AgentInteractive(config, core); + + const statuses: AgentStatus[] = []; + emitter.on(AgentEventType.STATUS_CHANGE, (payload) => { + statuses.push(payload.newStatus); + }); + + await agent.start(context); + await agent.shutdown(); + + expect(statuses).toContain(AgentStatus.COMPLETED); + }); +}); diff --git a/packages/core/src/agents/runtime/agent-interactive.ts b/packages/core/src/agents/runtime/agent-interactive.ts new file mode 100644 index 0000000000..42e9dedce1 --- /dev/null +++ b/packages/core/src/agents/runtime/agent-interactive.ts @@ -0,0 +1,512 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview AgentInteractive — persistent interactive agent. + * + * Composes AgentCore with on-demand message processing. 
Builds conversation + * state (messages, pending approvals, live outputs) that the UI reads. + */ + +import { createDebugLogger } from '../../utils/debugLogger.js'; +import { type AgentEventEmitter, AgentEventType } from './agent-events.js'; +import type { + AgentRoundTextEvent, + AgentToolCallEvent, + AgentToolResultEvent, + AgentToolOutputUpdateEvent, + AgentApprovalRequestEvent, +} from './agent-events.js'; +import type { AgentStatsSummary } from './agent-statistics.js'; +import type { AgentCore } from './agent-core.js'; +import type { ContextState } from './agent-headless.js'; +import type { GeminiChat } from '../../core/geminiChat.js'; +import type { FunctionDeclaration } from '@google/genai'; +import { + ToolConfirmationOutcome, + type ToolCallConfirmationDetails, + type ToolResultDisplay, +} from '../../tools/tools.js'; +import { AsyncMessageQueue } from '../../utils/asyncMessageQueue.js'; +import { + AgentTerminateMode, + AgentStatus, + isTerminalStatus, + type AgentInteractiveConfig, + type AgentMessage, +} from './agent-types.js'; + +const debugLogger = createDebugLogger('AGENT_INTERACTIVE'); + +/** + * AgentInteractive — persistent interactive agent that processes + * messages on demand. 
+ * + * Three-level cancellation: + * - `cancelCurrentRound()` — abort the current reasoning loop only + * - `shutdown()` — graceful: stop accepting messages, wait for cycle + * - `abort()` — immediate: master abort, set cancelled + */ +export class AgentInteractive { + readonly config: AgentInteractiveConfig; + private readonly core: AgentCore; + private readonly queue = new AsyncMessageQueue<string>(); + private readonly messages: AgentMessage[] = []; + + private status: AgentStatus = AgentStatus.INITIALIZING; + private error: string | undefined; + private lastRoundError: string | undefined; + private executionPromise: Promise<void> | undefined; + private masterAbortController = new AbortController(); + private roundAbortController: AbortController | undefined; + private chat: GeminiChat | undefined; + private toolsList: FunctionDeclaration[] = []; + private processing = false; + private roundCancelledByUser = false; + + // Pending tool approval requests. Keyed by callId. + // Populated by TOOL_WAITING_APPROVAL, removed by TOOL_RESULT or when + // the user responds. The UI reads this to show confirmation dialogs. + private readonly pendingApprovals = new Map< + string, + ToolCallConfirmationDetails + >(); + + // Live streaming output for currently-executing tools. Keyed by callId. + // Populated by TOOL_OUTPUT_UPDATE (replaces previous), cleared on TOOL_RESULT. + // The UI reads this via getLiveOutputs() to show real-time stdout. + private readonly liveOutputs = new Map<string, ToolResultDisplay>(); + + // PTY PIDs for currently-executing shell tools. Keyed by callId. + // Populated by TOOL_OUTPUT_UPDATE when pid is present, cleared on TOOL_RESULT. + // The UI reads this via getShellPids() to enable interactive shell input. + private readonly shellPids = new Map<string, number>(); + + constructor(config: AgentInteractiveConfig, core: AgentCore) { + this.config = config; + this.core = core; + this.setupEventListeners(); + } + + // ─── Lifecycle ────────────────────────────────────────────── + + /** + * Start the agent. 
Initializes the chat session, then kicks off + * processing if an initialTask is configured. + */ + async start(context: ContextState): Promise<void> { + this.setStatus(AgentStatus.INITIALIZING); + + this.chat = await this.core.createChat(context, { + interactive: true, + extraHistory: this.config.chatHistory, + }); + if (!this.chat) { + this.error = 'Failed to create chat session'; + this.setStatus(AgentStatus.FAILED); + return; + } + + this.toolsList = this.core.prepareTools(); + this.core.stats.start(Date.now()); + + if (this.config.chatHistory?.length) { + this.addMessage( + 'info', + `History context from parent session included (${this.config.chatHistory.length} messages)`, + ); + } + + if (this.config.initialTask) { + this.queue.enqueue(this.config.initialTask); + this.executionPromise = this.runLoop(); + } + } + + /** + * Run loop: process all pending messages, then settle status. + * Exits when the queue is empty or the agent is aborted. + */ + private async runLoop(): Promise<void> { + this.processing = true; + try { + let message = this.queue.dequeue(); + while (message !== null && !this.masterAbortController.signal.aborted) { + this.addMessage('user', message); + await this.runOneRound(message); + message = this.queue.dequeue(); + } + + if (this.masterAbortController.signal.aborted) { + this.setStatus(AgentStatus.CANCELLED); + } else { + this.settleRoundStatus(); + } + } catch (err) { + this.error = err instanceof Error ? err.message : String(err); + this.setStatus(AgentStatus.FAILED); + debugLogger.error('AgentInteractive processing failed:', err); + } finally { + this.processing = false; + } + } + + /** + * Run a single reasoning round for one message. + * Creates a per-round AbortController so cancellation is scoped. 
+ */ + private async runOneRound(message: string): Promise<void> { + if (!this.chat) return; + + this.setStatus(AgentStatus.RUNNING); + this.lastRoundError = undefined; + this.roundCancelledByUser = false; + this.roundAbortController = new AbortController(); + + // Propagate master abort to round + const onMasterAbort = () => this.roundAbortController?.abort(); + this.masterAbortController.signal.addEventListener('abort', onMasterAbort); + if (this.masterAbortController.signal.aborted) { + this.roundAbortController.abort(); + } + + try { + const initialMessages = [ + { role: 'user' as const, parts: [{ text: message }] }, + ]; + + const result = await this.core.runReasoningLoop( + this.chat, + initialMessages, + this.toolsList, + this.roundAbortController, + { + maxTurns: this.config.maxTurnsPerMessage, + maxTimeMinutes: this.config.maxTimeMinutesPerMessage, + }, + ); + + // Surface non-normal termination as a visible info message and as + // lastRoundError so Arena can distinguish limit stops from successes. + if ( + result.terminateMode && + result.terminateMode !== AgentTerminateMode.GOAL + ) { + const msg = terminateModeMessage(result.terminateMode); + if (msg) { + this.addMessage('info', msg.text, { metadata: { level: msg.level } }); + } + this.lastRoundError = `Terminated: ${result.terminateMode}`; + } + } catch (err) { + // User-initiated cancellation already logged by cancelCurrentRound(). + if (this.roundCancelledByUser) return; + // Agent survives round errors — log and settle status in runLoop. + const errorMessage = err instanceof Error ? 
err.message : String(err); + this.lastRoundError = errorMessage; + debugLogger.error('AgentInteractive round error:', err); + this.addMessage('info', errorMessage, { metadata: { level: 'error' } }); + } finally { + this.masterAbortController.signal.removeEventListener( + 'abort', + onMasterAbort, + ); + this.roundAbortController = undefined; + } + } + + // ─── Cancellation ────────────────────────────────────────── + + /** + * Cancel only the current reasoning round. + * Adds a visible "cancelled" info message and clears pending approvals. + */ + cancelCurrentRound(): void { + this.roundCancelledByUser = true; + this.roundAbortController?.abort(); + this.pendingApprovals.clear(); + this.addMessage('info', 'Agent round cancelled.', { + metadata: { level: 'warning' }, + }); + } + + /** + * Graceful shutdown: stop accepting messages and wait for current + * processing to finish. + */ + async shutdown(): Promise<void> { + this.queue.drain(); + if (this.executionPromise) { + await this.executionPromise; + } + // If no processing cycle ever ran (no initialTask, no messages), + // ensure the agent reaches a terminal status. + if (!isTerminalStatus(this.status)) { + this.setStatus(AgentStatus.COMPLETED); + } + } + + /** + * Immediate abort: cancel everything and set status to cancelled. + */ + abort(): void { + this.masterAbortController.abort(); + this.queue.drain(); + this.pendingApprovals.clear(); + } + + // ─── Message Queue ───────────────────────────────────────── + + /** + * Enqueue a message for the agent to process. 
+ */ + enqueueMessage(message: string): void { + this.queue.enqueue(message); + if (!this.processing) { + this.executionPromise = this.runLoop(); + } + } + + // ─── State Accessors ─────────────────────────────────────── + + getMessages(): readonly AgentMessage[] { + return this.messages; + } + + getStatus(): AgentStatus { + return this.status; + } + + getError(): string | undefined { + return this.error; + } + + getLastRoundError(): string | undefined { + return this.lastRoundError; + } + + getStats(): AgentStatsSummary { + return this.core.getExecutionSummary(); + } + + /** The prompt token count from the most recent model call. */ + getLastPromptTokenCount(): number { + return this.core.lastPromptTokenCount; + } + + getCore(): AgentCore { + return this.core; + } + + getEventEmitter(): AgentEventEmitter | undefined { + return this.core.getEventEmitter(); + } + + /** + * Returns tool calls currently awaiting user approval. + * Keyed by callId → full ToolCallConfirmationDetails (with onConfirm). + * The UI reads this to render confirmation dialogs inside ToolGroupMessage. + */ + getPendingApprovals(): ReadonlyMap { + return this.pendingApprovals; + } + + /** + * Returns live output for currently-executing tools. + * Keyed by callId → latest ToolResultDisplay (replaces on each update). + * Entries are cleared when TOOL_RESULT arrives for the call. + */ + getLiveOutputs(): ReadonlyMap { + return this.liveOutputs; + } + + /** + * Returns PTY PIDs for currently-executing interactive shell tools. + * Keyed by callId → PID. Populated from TOOL_OUTPUT_UPDATE when pid is + * present; cleared when TOOL_RESULT arrives. The UI uses this to enable + * interactive shell input via HistoryItemDisplay's activeShellPtyId prop. + */ + getShellPids(): ReadonlyMap { + return this.shellPids; + } + + /** + * Wait for the run loop to finish (used by InProcessBackend). 
+ */ + async waitForCompletion(): Promise<void> { + if (this.executionPromise) { + await this.executionPromise; + } + } + + // ─── Private Helpers ─────────────────────────────────────── + + /** + * Settle status after the run loop empties. + * On success → IDLE (agent stays alive for follow-up messages). + * On error → FAILED (terminal). + */ + private settleRoundStatus(): void { + if (this.lastRoundError && !this.roundCancelledByUser) { + this.setStatus(AgentStatus.FAILED); + } else { + this.setStatus(AgentStatus.IDLE); + } + } + + private setStatus(newStatus: AgentStatus): void { + const previousStatus = this.status; + if (previousStatus === newStatus) return; + + this.status = newStatus; + + this.core.eventEmitter?.emit(AgentEventType.STATUS_CHANGE, { + agentId: this.config.agentId, + previousStatus, + newStatus, + roundCancelledByUser: this.roundCancelledByUser || undefined, + timestamp: Date.now(), + }); + } + + private addMessage( + role: AgentMessage['role'], + content: string, + options?: { thought?: boolean; metadata?: Record }, + ): void { + const message: AgentMessage = { + role, + content, + timestamp: Date.now(), + }; + if (options?.thought) { + message.thought = true; + } + if (options?.metadata) { + message.metadata = options.metadata; + } + this.messages.push(message); + } + + private setupEventListeners(): void { + const emitter = this.core.eventEmitter; + if (!emitter) return; + + emitter.on(AgentEventType.ROUND_TEXT, (event: AgentRoundTextEvent) => { + if (event.thoughtText) { + this.addMessage('assistant', event.thoughtText, { thought: true }); + } + if (event.text) { + this.addMessage('assistant', event.text); + } + }); + + emitter.on(AgentEventType.TOOL_CALL, (event: AgentToolCallEvent) => { + this.addMessage('tool_call', `Tool call: ${event.name}`, { + metadata: { + callId: event.callId, + toolName: event.name, + args: event.args, + description: event.description, + renderOutputAsMarkdown: event.isOutputMarkdown, + round: event.round, + }, + }); + 
}); + + emitter.on( + AgentEventType.TOOL_OUTPUT_UPDATE, + (event: AgentToolOutputUpdateEvent) => { + this.liveOutputs.set(event.callId, event.outputChunk); + if (event.pid !== undefined) { + this.shellPids.set(event.callId, event.pid); + } + }, + ); + + emitter.on(AgentEventType.TOOL_RESULT, (event: AgentToolResultEvent) => { + this.liveOutputs.delete(event.callId); + this.shellPids.delete(event.callId); + this.pendingApprovals.delete(event.callId); + + const statusText = event.success ? 'succeeded' : 'failed'; + const summary = event.error + ? `Tool ${event.name} ${statusText}: ${event.error}` + : `Tool ${event.name} ${statusText}`; + this.addMessage('tool_result', summary, { + metadata: { + callId: event.callId, + toolName: event.name, + success: event.success, + resultDisplay: event.resultDisplay, + outputFile: event.outputFile, + round: event.round, + }, + }); + }); + + emitter.on( + AgentEventType.TOOL_WAITING_APPROVAL, + (event: AgentApprovalRequestEvent) => { + const fullDetails = { + ...event.confirmationDetails, + onConfirm: async ( + outcome: Parameters[0], + payload?: Parameters[1], + ) => { + this.pendingApprovals.delete(event.callId); + // Nudge the UI to re-render so the tool transitions visually + // from Confirming → Executing without waiting for the first + // real TOOL_OUTPUT_UPDATE from the tool's execution. + this.core.eventEmitter?.emit(AgentEventType.TOOL_OUTPUT_UPDATE, { + subagentId: this.core.subagentId, + round: event.round, + callId: event.callId, + outputChunk: '', + timestamp: Date.now(), + } as AgentToolOutputUpdateEvent); + await event.respond(outcome, payload); + // When the user denies a tool, cancel the round immediately + // so the agent doesn't waste a turn "acknowledging" the denial. 
+ if (outcome === ToolConfirmationOutcome.Cancel) { + this.cancelCurrentRound(); + } + }, + } as ToolCallConfirmationDetails; + + this.pendingApprovals.set(event.callId, fullDetails); + }, + ); + } +} + +/** + * Map a non-GOAL terminate mode to a visible status message for the UI, + * or return null to suppress the message entirely. + * + * CANCELLED is suppressed here because cancelCurrentRound() already emits + * its own warning. SHUTDOWN is suppressed as a normal lifecycle end. + */ +function terminateModeMessage( + mode: AgentTerminateMode, +): { text: string; level: 'info' | 'warning' | 'error' } | null { + switch (mode) { + case AgentTerminateMode.MAX_TURNS: + return { + text: 'Agent stopped: maximum turns reached.', + level: 'warning', + }; + case AgentTerminateMode.TIMEOUT: + return { text: 'Agent stopped: time limit reached.', level: 'warning' }; + case AgentTerminateMode.ERROR: + return { text: 'Agent stopped due to an error.', level: 'error' }; + case AgentTerminateMode.CANCELLED: + case AgentTerminateMode.SHUTDOWN: + return null; + default: + return null; + } +} diff --git a/packages/core/src/subagents/subagent-statistics.test.ts b/packages/core/src/agents/runtime/agent-statistics.test.ts similarity index 92% rename from packages/core/src/subagents/subagent-statistics.test.ts rename to packages/core/src/agents/runtime/agent-statistics.test.ts index 39ba70aa40..ec9f6e9905 100644 --- a/packages/core/src/subagents/subagent-statistics.test.ts +++ b/packages/core/src/agents/runtime/agent-statistics.test.ts @@ -5,14 +5,14 @@ */ import { describe, it, expect, beforeEach } from 'vitest'; -import { SubagentStatistics } from './subagent-statistics.js'; +import { AgentStatistics } from './agent-statistics.js'; -describe('SubagentStatistics', () => { - let stats: SubagentStatistics; +describe('AgentStatistics', () => { + let stats: AgentStatistics; const baseTime = 1000000000000; // Fixed timestamp for consistent testing beforeEach(() => { - stats = new 
SubagentStatistics(); + stats = new AgentStatistics(); }); describe('basic statistics tracking', () => { @@ -57,7 +57,23 @@ describe('SubagentStatistics', () => { const summary = stats.getSummary(); expect(summary.thoughtTokens).toBe(10); expect(summary.cachedTokens).toBe(5); - expect(summary.totalTokens).toBe(165); // 100 + 50 + 10 + 5 + // cachedTokens is a subset of inputTokens, not additive + expect(summary.totalTokens).toBe(160); // 100 + 50 + 10 + }); + + it('should use API-provided totalTokenCount when available', () => { + stats.recordTokens(100, 50, 10, 5, 170); + + const summary = stats.getSummary(); + expect(summary.totalTokens).toBe(170); + }); + + it('should accumulate API totalTokenCount across rounds', () => { + stats.recordTokens(100, 50, 0, 0, 150); + stats.recordTokens(200, 80, 0, 0, 280); + + const summary = stats.getSummary(); + expect(summary.totalTokens).toBe(430); // 150 + 280 }); }); @@ -109,7 +125,7 @@ describe('SubagentStatistics', () => { expect(result).toContain('📋 Task Completed: Test task'); expect(result).toContain('🔧 Tool Usage: 1 calls, 100.0% success'); expect(result).toContain('⏱️ Duration: 5.0s | 🔁 Rounds: 2'); - expect(result).toContain('🔢 Tokens: 1,530 (in 1000, out 500)'); + expect(result).toContain('🔢 Tokens: 1,520 (in 1000, out 500)'); }); it('should handle zero tool calls', () => { diff --git a/packages/core/src/subagents/subagent-statistics.ts b/packages/core/src/agents/runtime/agent-statistics.ts similarity index 95% rename from packages/core/src/subagents/subagent-statistics.ts rename to packages/core/src/agents/runtime/agent-statistics.ts index 72308c6332..55c16f529d 100644 --- a/packages/core/src/subagents/subagent-statistics.ts +++ b/packages/core/src/agents/runtime/agent-statistics.ts @@ -14,7 +14,7 @@ export interface ToolUsageStats { averageDurationMs: number; } -export interface SubagentStatsSummary { +export interface AgentStatsSummary { rounds: number; totalDurationMs: number; totalToolCalls: number; @@ -26,11 
+26,10 @@ export interface SubagentStatsSummary { thoughtTokens: number; cachedTokens: number; totalTokens: number; - estimatedCost: number; toolUsage: ToolUsageStats[]; } -export class SubagentStatistics { +export class AgentStatistics { private startTimeMs = 0; private rounds = 0; private totalToolCalls = 0; @@ -40,6 +39,7 @@ export class SubagentStatistics { private outputTokens = 0; private thoughtTokens = 0; private cachedTokens = 0; + private apiTotalTokens = 0; private toolUsage = new Map(); start(now = Date.now()) { @@ -83,14 +83,16 @@ export class SubagentStatistics { output: number, thought: number = 0, cached: number = 0, + total: number = 0, ) { this.inputTokens += Math.max(0, input || 0); this.outputTokens += Math.max(0, output || 0); this.thoughtTokens += Math.max(0, thought || 0); this.cachedTokens += Math.max(0, cached || 0); + this.apiTotalTokens += Math.max(0, total || 0); } - getSummary(now = Date.now()): SubagentStatsSummary { + getSummary(now = Date.now()): AgentStatsSummary { const totalDurationMs = this.startTimeMs ? now - this.startTimeMs : 0; const totalToolCalls = this.totalToolCalls; const successRate = @@ -98,11 +100,9 @@ export class SubagentStatistics { ? (this.successfulToolCalls / totalToolCalls) * 100 : 0; const totalTokens = - this.inputTokens + - this.outputTokens + - this.thoughtTokens + - this.cachedTokens; - const estimatedCost = this.inputTokens * 3e-5 + this.outputTokens * 6e-5; + this.apiTotalTokens > 0 + ? 
this.apiTotalTokens + : this.inputTokens + this.outputTokens + this.thoughtTokens; return { rounds: this.rounds, totalDurationMs, @@ -115,7 +115,6 @@ export class SubagentStatistics { thoughtTokens: this.thoughtTokens, cachedTokens: this.cachedTokens, totalTokens, - estimatedCost, toolUsage: Array.from(this.toolUsage.values()), }; } @@ -217,7 +216,7 @@ export class SubagentStatistics { return `${h}h ${m}m`; } - private generatePerformanceTips(stats: SubagentStatsSummary): string[] { + private generatePerformanceTips(stats: AgentStatsSummary): string[] { const tips: string[] = []; const totalCalls = stats.totalToolCalls; const sr = diff --git a/packages/core/src/agents/runtime/agent-types.ts b/packages/core/src/agents/runtime/agent-types.ts new file mode 100644 index 0000000000..d1204098a3 --- /dev/null +++ b/packages/core/src/agents/runtime/agent-types.ts @@ -0,0 +1,198 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Agent runtime types. + * + * Contains the canonical definitions for agent configuration (prompt, model, + * run, tool), termination modes, and interactive agent types. + */ + +import type { Content, FunctionDeclaration } from '@google/genai'; + +// ─── Agent Configuration ───────────────────────────────────── + +/** + * Configures the initial prompt for an agent. + */ +export interface PromptConfig { + /** + * A single system prompt string that defines the agent's persona and instructions. + * Note: You should use either `systemPrompt` or `initialMessages`, but not both. + */ + systemPrompt?: string; + + /** + * An array of user/model content pairs to seed the chat history for few-shot prompting. + * Note: You should use either `systemPrompt` or `initialMessages`, but not both. + */ + initialMessages?: Content[]; +} + +/** + * Configures the generative model parameters for an agent. 
+ */ +export interface ModelConfig { + /** + * The name or identifier of the model to be used (e.g., 'qwen3-coder-plus'). + * + * TODO: In the future, this needs to support 'auto' or some other string to support routing use cases. + */ + model?: string; + /** The temperature for the model's sampling process. */ + temp?: number; + /** The top-p value for nucleus sampling. */ + top_p?: number; +} + +/** + * Configures the execution environment and constraints for an agent. + * + * TODO: Consider adding max_tokens as a form of budgeting. + */ +export interface RunConfig { + /** The maximum execution time for the agent in minutes. */ + max_time_minutes?: number; + /** + * The maximum number of conversational turns (a user message + model response) + * before the execution is terminated. Helps prevent infinite loops. + */ + max_turns?: number; +} + +/** + * Configures the tools available to an agent during its execution. + */ +export interface ToolConfig { + /** + * A list of tool names (from the tool registry) or full function declarations + * that the agent is permitted to use. + */ + tools: Array<string | FunctionDeclaration>; +} + +/** + * Describes the possible termination modes for an agent. + * This enum provides a clear indication of why an agent's execution ended. + */ +export enum AgentTerminateMode { + /** The agent's execution terminated due to an unrecoverable error. */ + ERROR = 'ERROR', + /** The agent's execution terminated because it exceeded the maximum allowed working time. */ + TIMEOUT = 'TIMEOUT', + /** The agent's execution successfully completed all its defined goals. */ + GOAL = 'GOAL', + /** The agent's execution terminated because it exceeded the maximum number of turns. */ + MAX_TURNS = 'MAX_TURNS', + /** The agent's execution was cancelled via an abort signal. */ + CANCELLED = 'CANCELLED', + /** The agent was gracefully shut down (e.g., arena/team session ended). 
*/ + SHUTDOWN = 'SHUTDOWN', +} + +// ─── Agent Status ──────────────────────────────────────────── + +/** + * Canonical lifecycle status for any agent (headless, interactive, arena). + * + * State machine: + * INITIALIZING → RUNNING → IDLE ⇄ RUNNING → … → COMPLETED / FAILED / CANCELLED + * + * - INITIALIZING: Setting up (creating chat, loading tools). + * - RUNNING: Actively processing (model thinking / tool execution). + * - IDLE: Finished current work, waiting — can accept new messages. + * - COMPLETED: Finished for good (explicit shutdown). No further interaction. + * - FAILED: Finished with error (API failure, process crash, etc.). + * - CANCELLED: Cancelled by user or system. + */ +export enum AgentStatus { + INITIALIZING = 'initializing', + RUNNING = 'running', + IDLE = 'idle', + COMPLETED = 'completed', + FAILED = 'failed', + CANCELLED = 'cancelled', +} + +/** True for COMPLETED, FAILED, CANCELLED — agent is done for good. */ +export const isTerminalStatus = (s: AgentStatus): boolean => + s === AgentStatus.COMPLETED || + s === AgentStatus.FAILED || + s === AgentStatus.CANCELLED; + +/** True for IDLE or COMPLETED — agent finished its work successfully. */ +export const isSuccessStatus = (s: AgentStatus): boolean => + s === AgentStatus.IDLE || s === AgentStatus.COMPLETED; + +/** True for terminal statuses OR IDLE — agent has settled (not actively working). */ +export const isSettledStatus = (s: AgentStatus): boolean => + s === AgentStatus.IDLE || isTerminalStatus(s); + +/** + * Lightweight configuration for an AgentInteractive instance. + * Carries only interactive-specific parameters; the heavy runtime + * configs (prompt, model, run, tools) live on AgentCore. + */ +export interface AgentInteractiveConfig { + /** Unique identifier for this agent. */ + agentId: string; + /** Human-readable name for display. */ + agentName: string; + /** Optional initial task to start working on immediately. 
*/ + initialTask?: string; + /** Max model round-trips per enqueued message (default: unlimited). */ + maxTurnsPerMessage?: number; + /** Max wall-clock minutes per enqueued message (default: unlimited). */ + maxTimeMinutesPerMessage?: number; + /** + * Optional conversation history from a parent session to seed the + * agent's chat with prior context. + */ + chatHistory?: Content[]; +} + +/** + * A message exchanged with or produced by an interactive agent. + * + * This is a UI-oriented data model (not the Gemini API Content type). + * AgentInteractive is the sole writer; the UI reads via getMessages(). + */ +export interface AgentMessage { + /** Discriminator for the message kind. */ + role: 'user' | 'assistant' | 'tool_call' | 'tool_result' | 'info'; + /** The text content of the message. */ + content: string; + /** When the message was created (ms since epoch). */ + timestamp: number; + /** + * Whether this assistant message contains thinking/reasoning content. + * Mirrors AgentStreamTextEvent.thought. Only meaningful when role is 'assistant'. + */ + thought?: boolean; + /** + * Optional metadata. + * + * For role='info': metadata.level?: 'info' | 'warning' | 'success' | 'error' + * Controls which status message component is rendered. Defaults to 'info'. + * For role='tool_call': callId, toolName, args, description, renderOutputAsMarkdown, round + * For role='tool_result': callId, toolName, success, resultDisplay, outputFile, round + * For role='assistant' with error: error=true + */ + metadata?: Record; +} + +/** + * Snapshot of in-progress streaming state for UI mid-switch handoff. + * Returned by AgentInteractive.getInProgressStream(). + */ +export interface InProgressStreamState { + /** Accumulated non-thought text so far in the current round. */ + text: string; + /** Accumulated thinking text so far in the current round. */ + thinking: string; + /** The reasoning-loop round number being streamed. 
*/ + round: number; +} diff --git a/packages/core/src/agents/runtime/index.ts b/packages/core/src/agents/runtime/index.ts new file mode 100644 index 0000000000..93ef0e5a3a --- /dev/null +++ b/packages/core/src/agents/runtime/index.ts @@ -0,0 +1,17 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Runtime barrel — re-exports agent execution primitives. + */ + +export * from './agent-types.js'; +export * from './agent-core.js'; +export * from './agent-headless.js'; +export * from './agent-interactive.js'; +export * from './agent-events.js'; +export * from './agent-statistics.js'; +export { AsyncMessageQueue } from '../../utils/asyncMessageQueue.js'; diff --git a/packages/core/src/config/config.ts b/packages/core/src/config/config.ts index e655169252..b2a35229d9 100644 --- a/packages/core/src/config/config.ts +++ b/packages/core/src/config/config.ts @@ -21,6 +21,8 @@ import type { ContentGeneratorConfigSources } from '../core/contentGenerator.js' import type { MCPOAuthConfig } from '../mcp/oauth-provider.js'; import type { ShellExecutionConfig } from '../services/shellExecutionService.js'; import type { AnyToolInvocation } from '../tools/tools.js'; +import type { ArenaManager } from '../agents/arena/ArenaManager.js'; +import { ArenaAgentClient } from '../agents/arena/ArenaAgentClient.js'; // Core import { BaseLlmClient } from '../core/baseLlmClient.js'; @@ -284,6 +286,26 @@ export interface SandboxConfig { image: string; } +/** + * Settings shared across multi-agent collaboration features + * (Arena, Team, Swarm). 
+ */ +export interface AgentsCollabSettings { + /** Display mode for multi-agent sessions ('in-process' | 'tmux' | 'iterm2') */ + displayMode?: string; + /** Arena-specific settings */ + arena?: { + /** Custom base directory for Arena worktrees (default: ~/.qwen/arena) */ + worktreeBaseDir?: string; + /** Preserve worktrees and state files after session ends */ + preserveArtifacts?: boolean; + /** Maximum rounds (turns) per agent. No limit if unset. */ + maxRoundsPerAgent?: number; + /** Total timeout in seconds for the Arena session. No limit if unset. */ + timeoutSeconds?: number; + }; +} + export interface ConfigParameters { sessionId?: string; sessionData?: ResumedSessionData; @@ -381,6 +403,8 @@ export interface ConfigParameters { channel?: string; /** Model providers configuration grouped by authType */ modelProvidersConfig?: ModelProvidersConfig; + /** Multi-agent collaboration settings (Arena, Team, Swarm) */ + agents?: AgentsCollabSettings; /** Enable hook system for lifecycle events */ enableHooks?: boolean; /** Hooks configuration from settings */ @@ -516,6 +540,12 @@ export class Config { private readonly shouldUseNodePtyShell: boolean; private readonly skipNextSpeakerCheck: boolean; private shellExecutionConfig: ShellExecutionConfig; + private arenaManager: ArenaManager | null = null; + private arenaManagerChangeCallback: + | ((manager: ArenaManager | null) => void) + | null = null; + private readonly arenaAgentClient: ArenaAgentClient | null; + private readonly agentsSettings: AgentsCollabSettings; private readonly skipLoopDetection: boolean; private readonly skipStartupContext: boolean; private readonly warnings: string[]; @@ -651,6 +681,8 @@ export class Config { this.inputFormat = params.inputFormat ?? InputFormat.TEXT; this.fileExclusions = new FileExclusions(this); this.eventEmitter = params.eventEmitter; + this.arenaAgentClient = ArenaAgentClient.create(); + this.agentsSettings = params.agents ?? 
{}; if (params.contextFileName) { setGeminiMdFilename(params.contextFileName); } @@ -1183,6 +1215,8 @@ export class Config { if (this.toolRegistry) { await this.toolRegistry.stop(); } + + await this.cleanupArenaRuntime(); } catch (error) { // Log but don't throw - cleanup should be best-effort this.debugLogger.error('Error during Config shutdown:', error); @@ -1335,6 +1369,50 @@ export class Config { this.geminiMdFileCount = count; } + getArenaManager(): ArenaManager | null { + return this.arenaManager; + } + + setArenaManager(manager: ArenaManager | null): void { + this.arenaManager = manager; + this.arenaManagerChangeCallback?.(manager); + } + + /** + * Register a callback invoked whenever the arena manager changes. + * Pass `null` to unsubscribe. Only one subscriber is supported. + */ + onArenaManagerChange( + cb: ((manager: ArenaManager | null) => void) | null, + ): void { + this.arenaManagerChangeCallback = cb; + } + + getArenaAgentClient(): ArenaAgentClient | null { + return this.arenaAgentClient; + } + + getAgentsSettings(): AgentsCollabSettings { + return this.agentsSettings; + } + + /** + * Clean up Arena runtime. When `force` is true (e.g., /arena select --discard), + * always removes worktrees regardless of preserveArtifacts. 
+ */ + async cleanupArenaRuntime(force?: boolean): Promise<void> { + const manager = this.arenaManager; + if (!manager) { + return; + } + if (!force && this.agentsSettings.arena?.preserveArtifacts) { + await manager.cleanupRuntime(); + } else { + await manager.cleanup(); + } + this.setArenaManager(null); + } + getApprovalMode(): ApprovalMode { return this.approvalMode; } @@ -1808,6 +1886,7 @@ export class Config { async createToolRegistry( sendSdkMcpMessage?: SendSdkMcpMessage, + options?: { skipDiscovery?: boolean }, ): Promise<ToolRegistry> { const registry = new ToolRegistry( this, @@ -1897,7 +1976,9 @@ export class Config { registerCoreTool(LspTool, this); } - await registry.discoverAllTools(); + if (!options?.skipDiscovery) { + await registry.discoverAllTools(); + } this.debugLogger.debug( `ToolRegistry created: ${JSON.stringify(registry.getAllToolNames())} (${registry.getAllToolNames().length} tools)`, ); diff --git a/packages/core/src/config/storage.ts b/packages/core/src/config/storage.ts index 3293280a88..5de57ab0c4 100644 --- a/packages/core/src/config/storage.ts +++ b/packages/core/src/config/storage.ts @@ -17,6 +17,7 @@ const BIN_DIR_NAME = 'bin'; const PROJECT_DIR_NAME = 'projects'; const IDE_DIR_NAME = 'ide'; const DEBUG_DIR_NAME = 'debug'; +const ARENA_DIR_NAME = 'arena'; export class Storage { private readonly targetDir: string; @@ -77,6 +78,10 @@ export class Storage { return path.join(Storage.getGlobalQwenDir(), BIN_DIR_NAME); } + static getGlobalArenaDir(): string { + return path.join(Storage.getGlobalQwenDir(), ARENA_DIR_NAME); + } + getQwenDir(): string { return path.join(this.targetDir, QWEN_DIR); } diff --git a/packages/core/src/core/client.test.ts b/packages/core/src/core/client.test.ts index 2d197fa20e..8d81a5bbba 100644 --- a/packages/core/src/core/client.test.ts +++ b/packages/core/src/core/client.test.ts @@ -358,7 +358,9 @@ describe('Gemini Client (client.ts)', () => { getSkipLoopDetection: vi.fn().mockReturnValue(false), getChatRecordingService: 
vi.fn().mockReturnValue(undefined), getResumedSessionData: vi.fn().mockReturnValue(undefined), + getArenaAgentClient: vi.fn().mockReturnValue(null), getEnableHooks: vi.fn().mockReturnValue(false), + getArenaManager: vi.fn().mockReturnValue(null), getMessageBus: vi.fn().mockReturnValue(undefined), } as unknown as Config; diff --git a/packages/core/src/core/client.ts b/packages/core/src/core/client.ts index c9f67e072f..ee87e39cdf 100644 --- a/packages/core/src/core/client.ts +++ b/packages/core/src/core/client.ts @@ -23,6 +23,7 @@ const debugLogger = createDebugLogger('CLIENT'); import type { ContentGenerator } from './contentGenerator.js'; import { GeminiChat } from './geminiChat.js'; import { + getArenaSystemReminder, getCoreSystemPrompt, getCustomSystemPrompt, getPlanModeSystemReminder, @@ -239,6 +240,7 @@ export class GeminiClient { }, history, this.config.getChatRecordingService(), + uiTelemetryService, ); } catch (error) { await reportError( @@ -582,6 +584,19 @@ export class GeminiClient { this.forceFullIdeContext = false; } + // Check for arena control signal before starting a new turn + const arenaAgentClient = this.config.getArenaAgentClient(); + if (arenaAgentClient) { + const controlSignal = await arenaAgentClient.checkControlSignal(); + if (controlSignal) { + debugLogger.info( + `Arena control signal received: ${controlSignal.type} - ${controlSignal.reason}`, + ); + await arenaAgentClient.reportCancelled(); + return new Turn(this.getChat(), prompt_id); + } + } + const turn = new Turn(this.getChat(), prompt_id); // append system reminders to the request @@ -606,6 +621,18 @@ export class GeminiClient { ); } + // add arena system reminder if an arena session is active + const arenaManager = this.config.getArenaManager(); + if (arenaManager) { + try { + const sessionDir = arenaManager.getArenaSessionDir(); + const configPath = `${sessionDir}/config.json`; + systemReminders.push(getArenaSystemReminder(configPath)); + } catch { + // Arena config not yet 
initialized — skip + } + } + requestToSent = [...systemReminders, ...requestToSent]; } @@ -618,11 +645,27 @@ export class GeminiClient { if (!this.config.getSkipLoopDetection()) { if (this.loopDetector.addAndCheck(event)) { yield { type: GeminiEventType.LoopDetected }; + if (arenaAgentClient) { + await arenaAgentClient.reportError('Loop detected'); + } return turn; } } + // Update arena status on Finished events — stats are derived + // automatically from uiTelemetryService by the reporter. + if (arenaAgentClient && event.type === GeminiEventType.Finished) { + await arenaAgentClient.updateStatus(); + } + yield event; if (event.type === GeminiEventType.Error) { + if (arenaAgentClient) { + const errorMsg = + event.value instanceof Error + ? event.value.message + : 'Unknown error'; + await arenaAgentClient.reportError(errorMsg); + } return turn; } } @@ -687,6 +730,10 @@ export class GeminiClient { if (!turn.pendingToolCalls.length && signal && !signal.aborted) { if (this.config.getSkipNextSpeakerCheck()) { + // Report completed before returning — agent has no more work to do + if (arenaAgentClient) { + await arenaAgentClient.reportCompleted(); + } return turn; } @@ -715,9 +762,17 @@ export class GeminiClient { options, boundedTurns - 1, ); + } else if (arenaAgentClient) { + // No continuation needed — agent completed its task + await arenaAgentClient.reportCompleted(); } } + // Report cancelled to arena when user cancelled mid-stream + if (signal?.aborted && arenaAgentClient) { + await arenaAgentClient.reportCancelled(); + } + return turn; } diff --git a/packages/core/src/core/geminiChat.test.ts b/packages/core/src/core/geminiChat.test.ts index 8422968e7d..2f9e2d1076 100644 --- a/packages/core/src/core/geminiChat.test.ts +++ b/packages/core/src/core/geminiChat.test.ts @@ -124,7 +124,13 @@ describe('GeminiChat', async () => { // Disable 429 simulation for tests setSimulate429(false); // Reset history for each test by creating a new instance - chat = new 
GeminiChat(mockConfig, config, []); + chat = new GeminiChat( + mockConfig, + config, + [], + undefined, + uiTelemetryService, + ); }); afterEach(() => { diff --git a/packages/core/src/core/geminiChat.ts b/packages/core/src/core/geminiChat.ts index 13eae7e5b8..4983b5754c 100644 --- a/packages/core/src/core/geminiChat.ts +++ b/packages/core/src/core/geminiChat.ts @@ -34,7 +34,7 @@ import { ContentRetryEvent, ContentRetryFailureEvent, } from '../telemetry/types.js'; -import { uiTelemetryService } from '../telemetry/uiTelemetry.js'; +import type { UiTelemetryService } from '../telemetry/uiTelemetry.js'; const debugLogger = createDebugLogger('QWEN_CODE_CHAT'); @@ -235,12 +235,16 @@ export class GeminiChat { * @param history - Optional initial conversation history. * @param chatRecordingService - Optional recording service. If provided, chat * messages will be recorded. + * @param telemetryService - Optional UI telemetry service. When provided, + * prompt token counts are reported on each API response. Pass `undefined` + * for sub-agent chats to avoid overwriting the main agent's context usage. */ constructor( private readonly config: Config, private readonly generationConfig: GenerateContentConfig = {}, private history: Content[] = [], private readonly chatRecordingService?: ChatRecordingService, + private readonly telemetryService?: UiTelemetryService, ) { validateHistory(history); } @@ -652,8 +656,8 @@ export class GeminiChat { usageMetadata = chunk.usageMetadata; const lastPromptTokenCount = usageMetadata.totalTokenCount ?? 
usageMetadata.promptTokenCount; - if (lastPromptTokenCount) { - uiTelemetryService.setLastPromptTokenCount(lastPromptTokenCount); + if (lastPromptTokenCount && this.telemetryService) { + this.telemetryService.setLastPromptTokenCount(lastPromptTokenCount); } } diff --git a/packages/core/src/core/prompts.ts b/packages/core/src/core/prompts.ts index 178372b48e..b2799b79bd 100644 --- a/packages/core/src/core/prompts.ts +++ b/packages/core/src/core/prompts.ts @@ -865,6 +865,16 @@ Plan mode is active. The user indicated that they do not want you to execute yet `; } +/** + * Generates a system reminder about an active Arena session. + * + * @param configFilePath - Absolute path to the arena session's `config.json` + * @returns A formatted system reminder string wrapped in XML tags + */ +export function getArenaSystemReminder(configFilePath: string): string { + return `An Arena session is active. For details, read: ${configFilePath}. This message is for internal use only. Do not mention this to user in your response.`; +} + // ============================================================================ // Insight Analysis Prompts // ============================================================================ diff --git a/packages/core/src/index.ts b/packages/core/src/index.ts index e1fe65d2ff..21a511eac3 100644 --- a/packages/core/src/index.ts +++ b/packages/core/src/index.ts @@ -100,6 +100,7 @@ export * from './services/chatRecordingService.js'; export * from './services/fileDiscoveryService.js'; export * from './services/fileSystemService.js'; export * from './services/gitService.js'; +export * from './services/gitWorktreeService.js'; export * from './services/sessionService.js'; export * from './services/shellExecutionService.js'; @@ -175,13 +176,14 @@ export { } from './telemetry/types.js'; // ============================================================================ -// Extensions, Skills & Subagents +// Extensions, Skills, Subagents & Agents // 
============================================================================ export * from './extension/index.js'; export * from './prompts/mcp-prompts.js'; export * from './skills/index.js'; export * from './subagents/index.js'; +export * from './agents/index.js'; // ============================================================================ // Utilities @@ -191,6 +193,7 @@ export * from './utils/browser.js'; export * from './utils/configResolver.js'; export * from './utils/debugLogger.js'; export * from './utils/editor.js'; +export * from './utils/environmentContext.js'; export * from './utils/errorParsing.js'; export * from './utils/errors.js'; export * from './utils/fileUtils.js'; diff --git a/packages/core/src/services/gitWorktreeService.test.ts b/packages/core/src/services/gitWorktreeService.test.ts new file mode 100644 index 0000000000..f34eb1ca20 --- /dev/null +++ b/packages/core/src/services/gitWorktreeService.test.ts @@ -0,0 +1,503 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import { beforeEach, describe, expect, it, vi } from 'vitest'; +import type { Mock } from 'vitest'; +import type * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import { GitWorktreeService } from './gitWorktreeService.js'; +import { isCommandAvailable } from '../utils/shell-utils.js'; + +const hoistedMockSimpleGit = vi.hoisted(() => vi.fn()); +const hoistedMockCheckIsRepo = vi.hoisted(() => vi.fn()); +const hoistedMockInit = vi.hoisted(() => vi.fn()); +const hoistedMockAdd = vi.hoisted(() => vi.fn()); +const hoistedMockCommit = vi.hoisted(() => vi.fn()); +const hoistedMockRevparse = vi.hoisted(() => vi.fn()); +const hoistedMockRaw = vi.hoisted(() => vi.fn()); +const hoistedMockBranch = vi.hoisted(() => vi.fn()); +const hoistedMockDiff = vi.hoisted(() => vi.fn()); +const hoistedMockMerge = vi.hoisted(() => vi.fn()); +const hoistedMockStash = vi.hoisted(() => vi.fn()); + +vi.mock('simple-git', () => ({ + 
simpleGit: hoistedMockSimpleGit, + CheckRepoActions: { IS_REPO_ROOT: 'is-repo-root' }, +})); + +vi.mock('../utils/shell-utils.js', () => ({ + isCommandAvailable: vi.fn(), +})); + +const hoistedMockGetGlobalQwenDir = vi.hoisted(() => vi.fn()); +vi.mock('../config/storage.js', () => ({ + Storage: { + getGlobalQwenDir: hoistedMockGetGlobalQwenDir, + }, +})); + +const hoistedMockFsMkdir = vi.hoisted(() => vi.fn()); +const hoistedMockFsAccess = vi.hoisted(() => vi.fn()); +const hoistedMockFsWriteFile = vi.hoisted(() => vi.fn()); +const hoistedMockFsReaddir = vi.hoisted(() => vi.fn()); +const hoistedMockFsStat = vi.hoisted(() => vi.fn()); +const hoistedMockFsRm = vi.hoisted(() => vi.fn()); +const hoistedMockFsReadFile = vi.hoisted(() => vi.fn()); + +vi.mock('node:fs/promises', async (importOriginal) => { + const actual = await importOriginal(); + return { + ...actual, + mkdir: hoistedMockFsMkdir, + access: hoistedMockFsAccess, + writeFile: hoistedMockFsWriteFile, + readdir: hoistedMockFsReaddir, + stat: hoistedMockFsStat, + rm: hoistedMockFsRm, + readFile: hoistedMockFsReadFile, + }; +}); + +describe('GitWorktreeService', () => { + beforeEach(() => { + vi.clearAllMocks(); + + hoistedMockGetGlobalQwenDir.mockReturnValue('/mock-qwen'); + (isCommandAvailable as Mock).mockReturnValue({ available: true }); + + hoistedMockSimpleGit.mockImplementation(() => ({ + checkIsRepo: hoistedMockCheckIsRepo, + init: hoistedMockInit, + add: hoistedMockAdd, + commit: hoistedMockCommit, + revparse: hoistedMockRevparse, + raw: hoistedMockRaw, + branch: hoistedMockBranch, + diff: hoistedMockDiff, + merge: hoistedMockMerge, + stash: hoistedMockStash, + })); + + hoistedMockCheckIsRepo.mockResolvedValue(true); + hoistedMockInit.mockResolvedValue(undefined); + hoistedMockAdd.mockResolvedValue(undefined); + hoistedMockCommit.mockResolvedValue(undefined); + hoistedMockRevparse.mockResolvedValue('main\n'); + hoistedMockRaw.mockResolvedValue(''); + hoistedMockBranch.mockResolvedValue({ branches: {} 
}); + hoistedMockDiff.mockResolvedValue(''); + hoistedMockMerge.mockResolvedValue(undefined); + hoistedMockStash.mockResolvedValue(''); + + hoistedMockFsMkdir.mockResolvedValue(undefined); + hoistedMockFsAccess.mockRejectedValue({ code: 'ENOENT' }); + hoistedMockFsWriteFile.mockResolvedValue(undefined); + hoistedMockFsReaddir.mockResolvedValue([]); + hoistedMockFsStat.mockResolvedValue({ birthtimeMs: 123 }); + hoistedMockFsRm.mockResolvedValue(undefined); + hoistedMockFsReadFile.mockResolvedValue('{}'); + }); + + it('checkGitAvailable should return an error when git is unavailable', async () => { + (isCommandAvailable as Mock).mockReturnValue({ available: false }); + const service = new GitWorktreeService('/repo'); + + await expect(service.checkGitAvailable()).resolves.toEqual({ + available: false, + error: 'Git is not installed. Please install Git.', + }); + }); + + it('isGitRepository should fallback to checkIsRepo() when root check throws', async () => { + hoistedMockCheckIsRepo + .mockRejectedValueOnce(new Error('root check failed')) + .mockResolvedValueOnce(true); + const service = new GitWorktreeService('/repo'); + + await expect(service.isGitRepository()).resolves.toBe(true); + expect(hoistedMockCheckIsRepo).toHaveBeenNthCalledWith(1, 'is-repo-root'); + expect(hoistedMockCheckIsRepo).toHaveBeenNthCalledWith(2); + }); + + it('isGitRepository should detect subdirectory inside an existing repo', async () => { + // IS_REPO_ROOT returns false for a subdirectory, but checkIsRepo() + // (without params) returns true because we're inside a repo. 
+ hoistedMockCheckIsRepo + .mockResolvedValueOnce(false) + .mockResolvedValueOnce(true); + const service = new GitWorktreeService('/repo/subdir'); + + await expect(service.isGitRepository()).resolves.toBe(true); + expect(hoistedMockCheckIsRepo).toHaveBeenNthCalledWith(1, 'is-repo-root'); + expect(hoistedMockCheckIsRepo).toHaveBeenNthCalledWith(2); + }); + + it('createWorktree should create a sanitized branch and worktree path', async () => { + const service = new GitWorktreeService('/repo'); + + const result = await service.createWorktree('s1', 'Model A'); + + const expectedPath = path.join( + '/mock-qwen', + 'worktrees', + 's1', + 'worktrees', + 'model-a', + ); + expect(result.success).toBe(true); + expect(result.worktree?.branch).toBe('main-s1-model-a'); + expect(result.worktree?.path).toBe(expectedPath); + expect(hoistedMockRaw).toHaveBeenCalledWith([ + 'worktree', + 'add', + '-b', + 'main-s1-model-a', + expectedPath, + 'main', + ]); + }); + + it('setupWorktrees should fail early for colliding sanitized names', async () => { + const service = new GitWorktreeService('/repo'); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['Model A', 'model_a'], + }); + + expect(result.success).toBe(false); + expect(result.errors).toHaveLength(1); + expect(result.errors[0]?.error).toContain('collides'); + expect(isCommandAvailable).not.toHaveBeenCalled(); + }); + + it('setupWorktrees should return system error when git is unavailable', async () => { + (isCommandAvailable as Mock).mockReturnValue({ available: false }); + const service = new GitWorktreeService('/repo'); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['model-a'], + }); + + expect(result.success).toBe(false); + expect(result.errors).toEqual([ + { + name: 'system', + error: 'Git is not installed. 
Please install Git.', + }, + ]); + }); + + it('setupWorktrees should cleanup session after partial creation failure', async () => { + const service = new GitWorktreeService('/repo'); + vi.spyOn(service, 'isGitRepository').mockResolvedValue(true); + vi.spyOn(service, 'createWorktree') + .mockResolvedValueOnce({ + success: true, + worktree: { + id: 's1/a', + name: 'a', + path: '/w/a', + branch: 'worktrees/s1/a', + isActive: true, + createdAt: 1, + }, + }) + .mockResolvedValueOnce({ + success: false, + error: 'boom', + }); + const cleanupSpy = vi.spyOn(service, 'cleanupSession').mockResolvedValue({ + success: true, + removedWorktrees: [], + removedBranches: [], + errors: [], + }); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['a', 'b'], + }); + + expect(result.success).toBe(false); + expect(result.errors).toContainEqual({ name: 'b', error: 'boom' }); + expect(cleanupSpy).toHaveBeenCalledWith('s1'); + }); + + it('listWorktrees should return empty array when session dir does not exist', async () => { + const err = new Error('missing') as NodeJS.ErrnoException; + err.code = 'ENOENT'; + hoistedMockFsReaddir.mockRejectedValue(err); + const service = new GitWorktreeService('/repo'); + + await expect(service.listWorktrees('missing')).resolves.toEqual([]); + }); + + it('removeWorktree should fallback to fs.rm + worktree prune when git remove fails', async () => { + hoistedMockRaw + .mockRejectedValueOnce(new Error('remove failed')) + .mockResolvedValueOnce(''); + const service = new GitWorktreeService('/repo'); + + const result = await service.removeWorktree('/w/a'); + + expect(result.success).toBe(true); + expect(hoistedMockFsRm).toHaveBeenCalledWith('/w/a', { + recursive: true, + force: true, + }); + expect(hoistedMockRaw).toHaveBeenNthCalledWith(2, ['worktree', 'prune']); + }); + + it('cleanupSession should remove branches from listed worktrees', async () => { + const service = new 
GitWorktreeService('/repo'); + vi.spyOn(service, 'listWorktrees').mockResolvedValue([ + { + id: 's1/a', + name: 'a', + path: '/w/a', + branch: 'main-s1-a', + isActive: true, + createdAt: Date.now(), + }, + { + id: 's1/b', + name: 'b', + path: '/w/b', + branch: 'main-s1-b', + isActive: true, + createdAt: Date.now(), + }, + ]); + vi.spyOn(service, 'removeWorktree').mockResolvedValue({ success: true }); + + const result = await service.cleanupSession('s1'); + + expect(result.success).toBe(true); + expect(result.removedBranches).toEqual(['main-s1-a', 'main-s1-b']); + expect(hoistedMockBranch).toHaveBeenCalledWith(['-D', 'main-s1-a']); + expect(hoistedMockBranch).toHaveBeenCalledWith(['-D', 'main-s1-b']); + expect(hoistedMockRaw).toHaveBeenCalledWith(['worktree', 'prune']); + }); + + it('getWorktreeDiff should return staged raw diff without creating commits', async () => { + const service = new GitWorktreeService('/repo'); + hoistedMockDiff.mockResolvedValue('diff --git a/a.ts b/a.ts'); + + const diff = await service.getWorktreeDiff('/w/a', 'main'); + + expect(diff).toBe('diff --git a/a.ts b/a.ts'); + expect(hoistedMockAdd).toHaveBeenCalledWith(['--all']); + expect(hoistedMockDiff).toHaveBeenCalledWith([ + '--binary', + '--cached', + 'main', + ]); + expect(hoistedMockCommit).not.toHaveBeenCalled(); + }); + + it('applyWorktreeChanges should apply raw patch via git apply', async () => { + const service = new GitWorktreeService('/repo'); + // resolveBaseline returns the baseline commit SHA + hoistedMockRaw + .mockResolvedValueOnce('baseline-sha\n') // resolveBaseline log --grep + .mockResolvedValueOnce('') // reset (from withStagedChanges) + .mockResolvedValueOnce(''); // git apply + hoistedMockDiff.mockResolvedValueOnce('diff --git a/a.ts b/a.ts'); + + const result = await service.applyWorktreeChanges('/w/a', '/repo'); + + expect(result.success).toBe(true); + expect(hoistedMockAdd).toHaveBeenCalledWith(['--all']); + // Should diff against the baseline commit, not 
merge-base + expect(hoistedMockDiff).toHaveBeenCalledWith([ + '--binary', + '--cached', + 'baseline-sha', + ]); + + const applyCall = hoistedMockRaw.mock.calls.find( + (call) => Array.isArray(call[0]) && call[0][0] === 'apply', + ); + expect(applyCall).toBeDefined(); + // When baseline is used, --3way is omitted (target working tree + // matches the pre-image, so plain apply works cleanly). + expect(applyCall?.[0]?.slice(0, 2)).toEqual([ + 'apply', + '--whitespace=nowarn', + ]); + expect(hoistedMockFsWriteFile).toHaveBeenCalled(); + expect(hoistedMockFsRm).toHaveBeenCalledWith( + expect.stringContaining('.worktree-apply-'), + { force: true }, + ); + }); + + it('applyWorktreeChanges should skip apply when patch is empty', async () => { + const service = new GitWorktreeService('/repo'); + // resolveBaseline returns baseline commit + hoistedMockRaw.mockResolvedValueOnce('baseline-sha\n'); + hoistedMockDiff.mockResolvedValueOnce(' \n'); + + const result = await service.applyWorktreeChanges('/w/a', '/repo'); + + expect(result.success).toBe(true); + const applyCall = hoistedMockRaw.mock.calls.find( + (call) => Array.isArray(call[0]) && call[0][0] === 'apply', + ); + expect(applyCall).toBeUndefined(); + expect(hoistedMockFsWriteFile).not.toHaveBeenCalled(); + }); + + it('applyWorktreeChanges should return error when git apply fails', async () => { + const service = new GitWorktreeService('/repo'); + // resolveBaseline returns baseline commit + hoistedMockRaw + .mockResolvedValueOnce('baseline-sha\n') // resolveBaseline + .mockResolvedValueOnce('') // reset from withStagedChanges + .mockRejectedValueOnce(new Error('apply failed')); + hoistedMockDiff.mockResolvedValueOnce('diff --git a/a.ts b/a.ts'); + + const result = await service.applyWorktreeChanges('/w/a', '/repo'); + + expect(result.success).toBe(false); + expect(result.error).toContain('apply failed'); + expect(hoistedMockFsRm).toHaveBeenCalledWith( + expect.stringContaining('.worktree-apply-'), + { force: true }, + 
); + }); + + describe('dirty state propagation', () => { + function makeWorktreeInfo( + name: string, + sessionId: string, + ): { + id: string; + name: string; + path: string; + branch: string; + isActive: boolean; + createdAt: number; + } { + return { + id: `${sessionId}/${name}`, + name, + path: `/mock-qwen/worktrees/${sessionId}/worktrees/${name}`, + branch: `worktrees/${sessionId}/${name}`, + isActive: true, + createdAt: 1, + }; + } + + it('setupWorktrees should apply dirty state snapshot to each worktree', async () => { + hoistedMockStash.mockResolvedValue('snapshot-sha\n'); + const service = new GitWorktreeService('/repo'); + vi.spyOn(service, 'isGitRepository').mockResolvedValue(true); + vi.spyOn(service, 'createWorktree') + .mockResolvedValueOnce({ + success: true, + worktree: makeWorktreeInfo('a', 's1'), + }) + .mockResolvedValueOnce({ + success: true, + worktree: makeWorktreeInfo('b', 's1'), + }); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['a', 'b'], + }); + + expect(result.success).toBe(true); + expect(hoistedMockStash).toHaveBeenCalledWith(['create']); + // stash apply should be called once per worktree + const stashApplyCalls = hoistedMockRaw.mock.calls.filter( + (call: unknown[]) => + Array.isArray(call[0]) && + call[0][0] === 'stash' && + call[0][1] === 'apply', + ); + expect(stashApplyCalls).toHaveLength(2); + expect(stashApplyCalls[0]![0]).toEqual([ + 'stash', + 'apply', + 'snapshot-sha', + ]); + }); + + it('setupWorktrees should skip stash apply when working tree is clean', async () => { + hoistedMockStash.mockResolvedValue('\n'); + const service = new GitWorktreeService('/repo'); + vi.spyOn(service, 'isGitRepository').mockResolvedValue(true); + vi.spyOn(service, 'createWorktree').mockResolvedValue({ + success: true, + worktree: makeWorktreeInfo('a', 's1'), + }); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: 
['a'], + }); + + expect(result.success).toBe(true); + const stashApplyCalls = hoistedMockRaw.mock.calls.filter( + (call: unknown[]) => + Array.isArray(call[0]) && + call[0][0] === 'stash' && + call[0][1] === 'apply', + ); + expect(stashApplyCalls).toHaveLength(0); + }); + + it('setupWorktrees should still succeed when stash apply fails', async () => { + hoistedMockStash.mockResolvedValue('snapshot-sha\n'); + hoistedMockRaw.mockRejectedValue(new Error('stash apply conflict')); + const service = new GitWorktreeService('/repo'); + vi.spyOn(service, 'isGitRepository').mockResolvedValue(true); + vi.spyOn(service, 'createWorktree').mockResolvedValue({ + success: true, + worktree: makeWorktreeInfo('a', 's1'), + }); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['a'], + }); + + // Setup should still succeed — dirty state failure is non-fatal + expect(result.success).toBe(true); + expect(result.errors).toHaveLength(0); + }); + + it('setupWorktrees should still succeed when stash create fails', async () => { + hoistedMockStash.mockRejectedValue(new Error('stash create failed')); + const service = new GitWorktreeService('/repo'); + vi.spyOn(service, 'isGitRepository').mockResolvedValue(true); + vi.spyOn(service, 'createWorktree').mockResolvedValue({ + success: true, + worktree: makeWorktreeInfo('a', 's1'), + }); + + const result = await service.setupWorktrees({ + sessionId: 's1', + sourceRepoPath: '/repo', + worktreeNames: ['a'], + }); + + // Setup should still succeed — stash create failure is non-fatal + expect(result.success).toBe(true); + expect(result.errors).toHaveLength(0); + }); + }); +}); diff --git a/packages/core/src/services/gitWorktreeService.ts b/packages/core/src/services/gitWorktreeService.ts new file mode 100644 index 0000000000..6ceebf11e3 --- /dev/null +++ b/packages/core/src/services/gitWorktreeService.ts @@ -0,0 +1,826 @@ +/** + * @license + * Copyright 2025 Qwen Team + * 
SPDX-License-Identifier: Apache-2.0 + */ + +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import { execSync } from 'node:child_process'; +import { simpleGit, CheckRepoActions } from 'simple-git'; +import type { SimpleGit } from 'simple-git'; +import { Storage } from '../config/storage.js'; +import { isCommandAvailable } from '../utils/shell-utils.js'; +import { isNodeError } from '../utils/errors.js'; + +/** + * Commit message used for the baseline snapshot in worktrees. + * After overlaying the user's dirty state (tracked changes + untracked files), + * a commit with this message is created so that later diffs only capture the + * agent's changes — not the pre-existing local edits. + */ +export const BASELINE_COMMIT_MESSAGE = 'baseline (dirty state overlay)'; + +/** + * Default directory and branch-prefix name used for worktrees. + * Changing this value affects the on-disk layout (`~/.qwen//`) + * **and** the default git branch prefix (`//…`). + */ +export const WORKTREES_DIR = 'worktrees'; + +export interface WorktreeInfo { + /** Unique identifier for this worktree */ + id: string; + /** Display name (e.g., model name) */ + name: string; + /** Absolute path to the worktree directory */ + path: string; + /** Git branch name for this worktree */ + branch: string; + /** Whether the worktree is currently active */ + isActive: boolean; + /** Creation timestamp */ + createdAt: number; +} + +export interface WorktreeSetupConfig { + /** Session identifier */ + sessionId: string; + /** Source repository path (project root) */ + sourceRepoPath: string; + /** Names/identifiers for each worktree to create */ + worktreeNames: string[]; + /** Base branch to create worktrees from (defaults to current branch) */ + baseBranch?: string; + /** Extra metadata to persist alongside the session config */ + metadata?: Record; +} + +export interface CreateWorktreeResult { + success: boolean; + worktree?: WorktreeInfo; + error?: string; +} + +export 
interface WorktreeSetupResult { + success: boolean; + sessionId: string; + worktrees: WorktreeInfo[]; + worktreesByName: Record; + errors: Array<{ name: string; error: string }>; +} + +/** + * Minimal session config file written to disk. + * Callers can extend via the `metadata` field in WorktreeSetupConfig. + */ +interface SessionConfigFile { + sessionId: string; + sourceRepoPath: string; + worktreeNames: string[]; + baseBranch?: string; + createdAt: number; + [key: string]: unknown; +} + +/** + * Service for managing git worktrees. + * + * Git worktrees allow multiple working directories to share a single repository, + * enabling isolated environments without copying the entire repo. + */ +export class GitWorktreeService { + private sourceRepoPath: string; + private git: SimpleGit; + private readonly customBaseDir?: string; + + constructor(sourceRepoPath: string, customBaseDir?: string) { + this.sourceRepoPath = path.resolve(sourceRepoPath); + this.git = simpleGit(this.sourceRepoPath); + this.customBaseDir = customBaseDir; + } + + /** + * Gets the directory where worktrees are stored. + * @param customDir - Optional custom base directory override + */ + static getBaseDir(customDir?: string): string { + if (customDir) { + return path.resolve(customDir); + } + return path.join(Storage.getGlobalQwenDir(), WORKTREES_DIR); + } + + /** + * Gets the directory for a specific session. + * @param customBaseDir - Optional custom base directory override + */ + static getSessionDir(sessionId: string, customBaseDir?: string): string { + return path.join(GitWorktreeService.getBaseDir(customBaseDir), sessionId); + } + + /** + * Gets the worktrees directory for a specific session. 
+ * @param customBaseDir - Optional custom base directory override + */ + static getWorktreesDir(sessionId: string, customBaseDir?: string): string { + return path.join( + GitWorktreeService.getSessionDir(sessionId, customBaseDir), + WORKTREES_DIR, + ); + } + + /** + * Instance-level base dir, using the custom dir if provided at construction. + */ + getBaseDirForInstance(): string { + return GitWorktreeService.getBaseDir(this.customBaseDir); + } + + /** + * Checks if git is available on the system. + */ + async checkGitAvailable(): Promise<{ available: boolean; error?: string }> { + const { available } = isCommandAvailable('git'); + if (!available) { + return { + available: false, + error: 'Git is not installed. Please install Git.', + }; + } + return { available: true }; + } + + /** + * Checks if the source path is a git repository. + */ + async isGitRepository(): Promise { + try { + const isRoot = await this.git.checkIsRepo(CheckRepoActions.IS_REPO_ROOT); + if (isRoot) { + return true; + } + } catch { + // IS_REPO_ROOT check failed — fall through to the general check + } + // Not the root (or root check threw) — check if we're inside a git repo + try { + return await this.git.checkIsRepo(); + } catch { + return false; + } + } + + /** + * Initializes the source directory as a git repository. + * Returns true if initialization was performed, false if already a repo. + */ + async initializeRepository(): Promise<{ + initialized: boolean; + error?: string; + }> { + const isRepo = await this.isGitRepository(); + if (isRepo) { + return { initialized: false }; + } + + try { + await this.git.init(false, { '--initial-branch': 'main' }); + + // Create initial commit so we can create worktrees + await this.git.add('.'); + await this.git.commit('Initial commit', { + '--allow-empty': null, + }); + + return { initialized: true }; + } catch (error) { + return { + initialized: false, + error: `Failed to initialize git repository: ${error instanceof Error ? 
error.message : 'Unknown error'}`, + }; + } + } + + /** + * Gets the current branch name. + */ + async getCurrentBranch(): Promise<string> { + const branch = await this.git.revparse(['--abbrev-ref', 'HEAD']); + return branch.trim(); + } + + /** + * Gets the current commit hash. + */ + async getCurrentCommitHash(): Promise<string> { + const hash = await this.git.revparse(['HEAD']); + return hash.trim(); + } + + /** + * Creates a single worktree. + */ + async createWorktree( + sessionId: string, + name: string, + baseBranch?: string, + ): Promise<CreateWorktreeResult> { + try { + const worktreesDir = GitWorktreeService.getWorktreesDir( + sessionId, + this.customBaseDir, + ); + await fs.mkdir(worktreesDir, { recursive: true }); + + // Sanitize name for use as branch and directory name + const sanitizedName = this.sanitizeName(name); + const worktreePath = path.join(worktreesDir, sanitizedName); + + // Check if worktree already exists + const exists = await this.pathExists(worktreePath); + if (exists) { + return { + success: false, + error: `Worktree already exists at ${worktreePath}`, + }; + } + + // Determine base branch + const base = baseBranch || (await this.getCurrentBranch()); + const shortSession = sessionId.slice(0, 6); + const branchName = `${base}-${shortSession}-${sanitizedName}`; + + // Create the worktree with a new branch + await this.git.raw([ + 'worktree', + 'add', + '-b', + branchName, + worktreePath, + base, + ]); + + const worktree: WorktreeInfo = { + id: `${sessionId}/${sanitizedName}`, + name, + path: worktreePath, + branch: branchName, + isActive: true, + createdAt: Date.now(), + }; + + return { success: true, worktree }; + } catch (error) { + return { + success: false, + error: `Failed to create worktree for "${name}": ${error instanceof Error ? error.message : 'Unknown error'}`, + }; + } + } + + /** + * Sets up all worktrees for a session. + * This is the main entry point for worktree creation. 
+ */ + async setupWorktrees( + config: WorktreeSetupConfig, + ): Promise<WorktreeSetupResult> { + const result: WorktreeSetupResult = { + success: false, + sessionId: config.sessionId, + worktrees: [], + worktreesByName: {}, + errors: [], + }; + + // Validate worktree names early (before touching git) + const sanitizedNames = new Map(); + for (const name of config.worktreeNames) { + const sanitized = this.sanitizeName(name); + if (!sanitized) { + result.errors.push({ + name, + error: 'Worktree name becomes empty after sanitization', + }); + continue; + } + const existing = sanitizedNames.get(sanitized); + if (existing) { + result.errors.push({ + name, + error: `Worktree name collides with "${existing}" after sanitization`, + }); + continue; + } + sanitizedNames.set(sanitized, name); + } + if (result.errors.length > 0) { + return result; + } + + // Check git availability + const gitCheck = await this.checkGitAvailable(); + if (!gitCheck.available) { + result.errors.push({ name: 'system', error: gitCheck.error! }); + return result; + } + + // Ensure source is a git repository + const isRepo = await this.isGitRepository(); + if (!isRepo) { + result.errors.push({ + name: 'repository', + error: 'Source path is not a git repository.', + }); + return result; + } + + // Create session directory + const sessionDir = GitWorktreeService.getSessionDir( + config.sessionId, + this.customBaseDir, + ); + await fs.mkdir(sessionDir, { recursive: true }); + + // Save session config for later reference + const configPath = path.join(sessionDir, 'config.json'); + const configFile: SessionConfigFile = { + sessionId: config.sessionId, + sourceRepoPath: config.sourceRepoPath, + worktreeNames: config.worktreeNames, + baseBranch: config.baseBranch, + createdAt: Date.now(), + ...config.metadata, + }; + await fs.writeFile(configPath, JSON.stringify(configFile, null, 2)); + + // Capture the current dirty state (tracked: staged + unstaged changes) + // without modifying the source working tree or index. 
+ // NOTE: `git stash create` does NOT support --include-untracked; + // untracked files are handled separately below via file copy. + let dirtyStateSnapshot = ''; + try { + dirtyStateSnapshot = (await this.git.stash(['create'])).trim(); + } catch { + // Ignore — proceed without dirty state if stash create fails + } + + // Discover untracked files so they can be copied into each worktree. + // `git ls-files --others --exclude-standard` is read-only and safe. + let untrackedFiles: string[] = []; + try { + const raw = await this.git.raw([ + 'ls-files', + '--others', + '--exclude-standard', + ]); + untrackedFiles = raw.trim().split('\n').filter(Boolean); + } catch { + // Non-fatal: proceed without untracked files + } + + // Create worktrees for each entry + for (const name of config.worktreeNames) { + const createResult = await this.createWorktree( + config.sessionId, + name, + config.baseBranch, + ); + + if (createResult.success && createResult.worktree) { + result.worktrees.push(createResult.worktree); + result.worktreesByName[name] = createResult.worktree; + } else { + result.errors.push({ + name, + error: createResult.error || 'Unknown error', + }); + } + } + + // If any worktree failed, clean up all created resources and fail + if (result.errors.length > 0) { + try { + await this.cleanupSession(config.sessionId); + } catch (error) { + result.errors.push({ + name: 'cleanup', + error: `Failed to cleanup after partial worktree creation: ${error instanceof Error ? error.message : 'Unknown error'}`, + }); + } + result.success = false; + return result; + } + + // Success only if all worktrees were created + result.success = result.worktrees.length === config.worktreeNames.length; + + // Overlay the source repo's dirty state onto each worktree so agents + // see the same files the user currently has on disk. + if (result.success) { + for (const worktree of result.worktrees) { + const wtGit = simpleGit(worktree.path); + + // 1. 
Apply tracked dirty changes (staged + unstaged) + if (dirtyStateSnapshot) { + try { + await wtGit.raw(['stash', 'apply', dirtyStateSnapshot]); + } catch { + // Non-fatal: worktree still usable with committed state only + } + } + + // 2. Copy untracked files into the worktree + for (const relPath of untrackedFiles) { + try { + const src = path.join(this.sourceRepoPath, relPath); + const dst = path.join(worktree.path, relPath); + await fs.mkdir(path.dirname(dst), { recursive: true }); + await fs.copyFile(src, dst); + } catch { + // Non-fatal: skip files that can't be copied + } + } + + // 3. Create a baseline commit capturing the full starting state + // (committed + dirty + untracked). This allows us to later diff + // only the agent's changes, excluding the pre-existing dirty state. + try { + await wtGit.add(['--all']); + await wtGit.commit(BASELINE_COMMIT_MESSAGE, { + '--allow-empty': null, + '--no-verify': null, + }); + } catch { + // Non-fatal: diff will fall back to merge-base if baseline is missing + } + } + } + + return result; + } + + /** + * Lists all worktrees for a session. 
+ */ + async listWorktrees(sessionId: string): Promise<WorktreeInfo[]> { + const worktreesDir = GitWorktreeService.getWorktreesDir( + sessionId, + this.customBaseDir, + ); + + try { + const entries = await fs.readdir(worktreesDir, { withFileTypes: true }); + const worktrees: WorktreeInfo[] = []; + + for (const entry of entries) { + if (entry.isDirectory()) { + const worktreePath = path.join(worktreesDir, entry.name); + + // Read the actual branch from the worktree + let branchName = ''; + try { + branchName = execSync('git rev-parse --abbrev-ref HEAD', { + cwd: worktreePath, + encoding: 'utf8', + stdio: ['pipe', 'pipe', 'pipe'], + }).trim(); + } catch { + // Fallback if git command fails + } + + // Try to get stats for creation time + let createdAt = Date.now(); + try { + const stats = await fs.stat(worktreePath); + createdAt = stats.birthtimeMs; + } catch { + // Ignore stat errors + } + + worktrees.push({ + id: `${sessionId}/${entry.name}`, + name: entry.name, + path: worktreePath, + branch: branchName, + isActive: true, + createdAt, + }); + } + } + + return worktrees; + } catch (error) { + if (isNodeError(error) && error.code === 'ENOENT') { + return []; + } + throw error; + } + } + + /** + * Removes a single worktree. + */ + async removeWorktree( + worktreePath: string, + ): Promise<{ success: boolean; error?: string }> { + try { + // Remove the worktree from git + await this.git.raw(['worktree', 'remove', worktreePath, '--force']); + return { success: true }; + } catch (error) { + // Try to remove the directory manually if git worktree remove fails + try { + await fs.rm(worktreePath, { recursive: true, force: true }); + // Prune worktree references + await this.git.raw(['worktree', 'prune']); + return { success: true }; + } catch (_rmError) { + return { + success: false, + error: `Failed to remove worktree: ${error instanceof Error ? error.message : 'Unknown error'}`, + }; + } + } + } + + /** + * Cleans up all worktrees and branches for a session. 
+ */ + async cleanupSession(sessionId: string): Promise<{ + success: boolean; + removedWorktrees: string[]; + removedBranches: string[]; + errors: string[]; + }> { + const result = { + success: true, + removedWorktrees: [] as string[], + removedBranches: [] as string[], + errors: [] as string[], + }; + + // Collect actual branch names from worktrees before removing them + const worktrees = await this.listWorktrees(sessionId); + const worktreeBranches = new Set( + worktrees.map((w) => w.branch).filter(Boolean), + ); + + // Remove all worktrees + for (const worktree of worktrees) { + const removeResult = await this.removeWorktree(worktree.path); + if (removeResult.success) { + result.removedWorktrees.push(worktree.name); + } else { + result.errors.push( + removeResult.error || `Failed to remove ${worktree.name}`, + ); + result.success = false; + } + } + + // Remove session directory + const sessionDir = GitWorktreeService.getSessionDir( + sessionId, + this.customBaseDir, + ); + try { + await fs.rm(sessionDir, { recursive: true, force: true }); + } catch (error) { + result.errors.push( + `Failed to remove session directory: ${error instanceof Error ? error.message : 'Unknown error'}`, + ); + } + + // Clean up branches that belonged to the worktrees + try { + for (const branchName of worktreeBranches) { + try { + await this.git.branch(['-D', branchName]); + result.removedBranches.push(branchName); + } catch { + // Branch might already be deleted, ignore + } + } + } catch { + // Ignore branch listing/deletion errors + } + + // Prune worktree references + try { + await this.git.raw(['worktree', 'prune']); + } catch { + // Ignore prune errors + } + + return result; + } + + /** + * Gets the diff between a worktree and its baseline state. + * Prefers the baseline commit (which includes the dirty state overlay) + * so the diff only shows the agent's changes. Falls back to the base branch + * when no baseline commit exists. 
+ */ + async getWorktreeDiff( + worktreePath: string, + baseBranch?: string, + ): Promise<string> { + const worktreeGit = simpleGit(worktreePath); + + const base = + (await this.resolveBaseline(worktreeGit)) ?? + baseBranch ?? + (await this.getCurrentBranch()); + + try { + return await this.withStagedChanges(worktreeGit, () => + worktreeGit.diff(['--binary', '--cached', base]), + ); + } catch (error) { + return `Error getting diff: ${error instanceof Error ? error.message : 'Unknown error'}`; + } + } + + /** + * Applies raw changes from a worktree back to the target working directory. + * + * Diffs from the baseline commit (which already includes the user's + * dirty state) so the patch only contains the agent's new changes. + * Falls back to merge-base when no baseline commit exists. + */ + async applyWorktreeChanges( + worktreePath: string, + targetPath?: string, + ): Promise<{ success: boolean; error?: string }> { + const target = targetPath || this.sourceRepoPath; + const worktreeGit = simpleGit(worktreePath); + const targetGit = simpleGit(target); + + try { + // Prefer the baseline commit (created during worktree setup after + // overlaying dirty state) so the patch excludes pre-existing edits.
+ let base = await this.resolveBaseline(worktreeGit); + const hasBaseline = !!base; + + if (!base) { + // Fallback: diff from merge-base + const targetHead = (await targetGit.revparse(['HEAD'])).trim(); + base = ( + await worktreeGit.raw(['merge-base', 'HEAD', targetHead]) + ).trim(); + } + + const patch = await this.withStagedChanges(worktreeGit, () => + worktreeGit.diff(['--binary', '--cached', base]), + ); + + if (!patch.trim()) { + return { success: true }; + } + + const patchFile = path.join( + this.getBaseDirForInstance(), + `.worktree-apply-${Date.now()}-${Math.random().toString(16).slice(2)}.patch`, + ); + await fs.mkdir(path.dirname(patchFile), { recursive: true }); + await fs.writeFile(patchFile, patch, 'utf-8'); + + try { + // When using the baseline, the target working tree already matches the + // patch pre-image (both have the dirty state), so a plain apply works. + // --3way is only needed for the merge-base fallback path where the + // pre-image may not match the working tree; it falls back to index + // blob lookup which would fail on baseline-relative patches. + const applyArgs = hasBaseline + ? ['apply', '--whitespace=nowarn', patchFile] + : ['apply', '--3way', '--whitespace=nowarn', patchFile]; + await targetGit.raw(applyArgs); + } finally { + await fs.rm(patchFile, { force: true }); + } + + return { success: true }; + } catch (error) { + return { + success: false, + error: `Failed to apply worktree changes: ${error instanceof Error ? error.message : 'Unknown error'}`, + }; + } + } + + /** + * Lists all sessions stored in the worktree base directory. 
+ */ + static async listSessions(customBaseDir?: string): Promise< + Array<{ + sessionId: string; + createdAt: number; + sourceRepoPath: string; + worktreeCount: number; + }> + > { + const baseDir = GitWorktreeService.getBaseDir(customBaseDir); + const sessions: Array<{ + sessionId: string; + createdAt: number; + sourceRepoPath: string; + worktreeCount: number; + }> = []; + + try { + const entries = await fs.readdir(baseDir, { withFileTypes: true }); + + for (const entry of entries) { + if (entry.isDirectory()) { + const configPath = path.join(baseDir, entry.name, 'config.json'); + try { + const configContent = await fs.readFile(configPath, 'utf-8'); + const config = JSON.parse(configContent) as SessionConfigFile; + + const worktreesDir = path.join(baseDir, entry.name, WORKTREES_DIR); + let worktreeCount = 0; + try { + const worktreeEntries = await fs.readdir(worktreesDir); + worktreeCount = worktreeEntries.length; + } catch { + // Ignore if worktrees dir doesn't exist + } + + sessions.push({ + sessionId: entry.name, + createdAt: config.createdAt || Date.now(), + sourceRepoPath: config.sourceRepoPath || '', + worktreeCount, + }); + } catch { + // Ignore sessions without valid config + } + } + } + + return sessions.sort((a, b) => b.createdAt - a.createdAt); + } catch { + return []; + } + } + + /** + * Finds the baseline commit in a worktree, if one exists. + * Returns the commit SHA, or null if not found. + */ + private async resolveBaseline( + worktreeGit: SimpleGit, + ): Promise<string | null> { + try { + const sha = ( + await worktreeGit.raw([ + 'log', + '--grep', + BASELINE_COMMIT_MESSAGE, + '--format=%H', + '-1', + ]) + ).trim(); + return sha || null; + } catch { + return null; + } + } + + /** Stages all changes, runs a callback, then resets the index.
*/ + private async withStagedChanges<T>( + git: SimpleGit, + fn: () => Promise<T>, + ): Promise<T> { + await git.add(['--all']); + try { + return await fn(); + } finally { + try { + await git.raw(['reset']); + } catch { + // Best-effort: ignore reset failures + } + } + } + + private sanitizeName(name: string): string { + // Replace invalid characters with hyphens + return name + .toLowerCase() + .replace(/[^a-z0-9-]/g, '-') + .replace(/-+/g, '-') + .replace(/^-|-$/g, ''); + } + + private async pathExists(p: string): Promise<boolean> { + try { + await fs.access(p); + return true; + } catch { + return false; + } + } +} diff --git a/packages/core/src/subagents/index.ts b/packages/core/src/subagents/index.ts index 17c62a200a..c05c386971 100644 --- a/packages/core/src/subagents/index.ts +++ b/packages/core/src/subagents/index.ts @@ -5,18 +5,11 @@ */ /** - * @fileoverview Subagents Phase 1 implementation - File-based configuration layer + * @fileoverview Subagents — file-based configuration layer. * * This module provides the foundation for the subagents feature by implementing - * a file-based configuration system that builds on the existing SubAgentScope - * runtime system. It includes: + * a file-based configuration system that builds on the agent runtime. * - * - Type definitions for file-based subagent configurations - * - Validation system for configuration integrity - * - Runtime conversion functions integrated into the manager - * - Manager class for CRUD operations on subagent files - * - * The implementation follows the Markdown + YAML frontmatter format , with storage at both project and user levels.
*/ // Core types and interfaces @@ -40,36 +33,3 @@ export { SubagentValidator } from './validation.js'; // Main management class export { SubagentManager } from './subagent-manager.js'; - -// Re-export existing runtime types for convenience -export type { - PromptConfig, - ModelConfig, - RunConfig, - ToolConfig, - SubagentTerminateMode, -} from './types.js'; - -export { SubAgentScope } from './subagent.js'; - -// Event system for UI integration -export type { - SubAgentEvent, - SubAgentStartEvent, - SubAgentRoundEvent, - SubAgentStreamTextEvent, - SubAgentUsageEvent, - SubAgentToolCallEvent, - SubAgentToolResultEvent, - SubAgentFinishEvent, - SubAgentErrorEvent, - SubAgentApprovalRequestEvent, -} from './subagent-events.js'; - -export { SubAgentEventEmitter, SubAgentEventType } from './subagent-events.js'; - -// Statistics and formatting -export type { - SubagentStatsSummary, - ToolUsageStats, -} from './subagent-statistics.js'; diff --git a/packages/core/src/subagents/subagent-events.ts b/packages/core/src/subagents/subagent-events.ts deleted file mode 100644 index 5de09a3c27..0000000000 --- a/packages/core/src/subagents/subagent-events.ts +++ /dev/null @@ -1,145 +0,0 @@ -/** - * @license - * Copyright 2025 Qwen - * SPDX-License-Identifier: Apache-2.0 - */ - -import { EventEmitter } from 'events'; -import type { - ToolCallConfirmationDetails, - ToolConfirmationOutcome, - ToolResultDisplay, -} from '../tools/tools.js'; -import type { Part, GenerateContentResponseUsageMetadata } from '@google/genai'; - -export type SubAgentEvent = - | 'start' - | 'round_start' - | 'round_end' - | 'stream_text' - | 'tool_call' - | 'tool_result' - | 'tool_waiting_approval' - | 'usage_metadata' - | 'finish' - | 'error'; - -export enum SubAgentEventType { - START = 'start', - ROUND_START = 'round_start', - ROUND_END = 'round_end', - STREAM_TEXT = 'stream_text', - TOOL_CALL = 'tool_call', - TOOL_RESULT = 'tool_result', - TOOL_WAITING_APPROVAL = 'tool_waiting_approval', - USAGE_METADATA = 
'usage_metadata', - FINISH = 'finish', - ERROR = 'error', -} - -export interface SubAgentStartEvent { - subagentId: string; - name: string; - model?: string; - tools: string[]; - timestamp: number; -} - -export interface SubAgentRoundEvent { - subagentId: string; - round: number; - promptId: string; - timestamp: number; -} - -export interface SubAgentStreamTextEvent { - subagentId: string; - round: number; - text: string; - /** Whether this text is reasoning/thinking content (as opposed to regular output) */ - thought?: boolean; - timestamp: number; -} - -export interface SubAgentUsageEvent { - subagentId: string; - round: number; - usage: GenerateContentResponseUsageMetadata; - durationMs?: number; - timestamp: number; -} - -export interface SubAgentToolCallEvent { - subagentId: string; - round: number; - callId: string; - name: string; - args: Record; - description: string; - timestamp: number; -} - -export interface SubAgentToolResultEvent { - subagentId: string; - round: number; - callId: string; - name: string; - success: boolean; - error?: string; - responseParts?: Part[]; - resultDisplay?: ToolResultDisplay; - durationMs?: number; - timestamp: number; -} - -export interface SubAgentApprovalRequestEvent { - subagentId: string; - round: number; - callId: string; - name: string; - description: string; - confirmationDetails: Omit & { - type: ToolCallConfirmationDetails['type']; - }; - respond: ( - outcome: ToolConfirmationOutcome, - payload?: Parameters[1], - ) => Promise; - timestamp: number; -} - -export interface SubAgentFinishEvent { - subagentId: string; - terminateReason: string; - timestamp: number; - rounds?: number; - totalDurationMs?: number; - totalToolCalls?: number; - successfulToolCalls?: number; - failedToolCalls?: number; - inputTokens?: number; - outputTokens?: number; - totalTokens?: number; -} - -export interface SubAgentErrorEvent { - subagentId: string; - error: string; - timestamp: number; -} - -export class SubAgentEventEmitter { - private 
ee = new EventEmitter(); - - on(event: SubAgentEvent, listener: (...args: unknown[]) => void) { - this.ee.on(event, listener); - } - - off(event: SubAgentEvent, listener: (...args: unknown[]) => void) { - this.ee.off(event, listener); - } - - emit(event: SubAgentEvent, payload: unknown) { - this.ee.emit(event, payload); - } -} diff --git a/packages/core/src/subagents/subagent-hooks.ts b/packages/core/src/subagents/subagent-hooks.ts deleted file mode 100644 index f3bf997bfe..0000000000 --- a/packages/core/src/subagents/subagent-hooks.ts +++ /dev/null @@ -1,33 +0,0 @@ -/** - * @license - * Copyright 2025 Qwen - * SPDX-License-Identifier: Apache-2.0 - */ - -export interface PreToolUsePayload { - subagentId: string; - name: string; // subagent name - toolName: string; - args: Record; - timestamp: number; -} - -export interface PostToolUsePayload extends PreToolUsePayload { - success: boolean; - durationMs: number; - errorMessage?: string; -} - -export interface SubagentStopPayload { - subagentId: string; - name: string; // subagent name - terminateReason: string; - summary: Record; - timestamp: number; -} - -export interface SubagentHooks { - preToolUse?(payload: PreToolUsePayload): Promise | void; - postToolUse?(payload: PostToolUsePayload): Promise | void; - onStop?(payload: SubagentStopPayload): Promise | void; -} diff --git a/packages/core/src/subagents/subagent-manager.ts b/packages/core/src/subagents/subagent-manager.ts index 0552fa60c4..21ad851299 100644 --- a/packages/core/src/subagents/subagent-manager.ts +++ b/packages/core/src/subagents/subagent-manager.ts @@ -19,14 +19,20 @@ import type { SubagentLevel, ListSubagentsOptions, CreateSubagentOptions, +} from './types.js'; +import type { PromptConfig, ModelConfig, RunConfig, ToolConfig, -} from './types.js'; +} from '../agents/runtime/agent-types.js'; import { SubagentError, SubagentErrorCode } from './types.js'; import { SubagentValidator } from './validation.js'; -import { SubAgentScope } from 
'./subagent.js'; +import { AgentHeadless } from '../agents/runtime/agent-headless.js'; +import type { + AgentEventEmitter, + AgentHooks, +} from '../agents/runtime/agent-events.js'; import type { Config } from '../config/config.js'; import { createDebugLogger } from '../utils/debugLogger.js'; import { normalizeContent } from '../utils/textUtils.js'; @@ -579,24 +585,24 @@ export class SubagentManager { } /** - * Creates a SubAgentScope from a subagent configuration. + * Creates an AgentHeadless from a subagent configuration. * * @param config - Subagent configuration * @param runtimeContext - Runtime context - * @returns Promise resolving to SubAgentScope + * @returns Promise resolving to AgentHeadless */ - async createSubagentScope( + async createAgentHeadless( config: SubagentConfig, runtimeContext: Config, options?: { - eventEmitter?: import('./subagent-events.js').SubAgentEventEmitter; - hooks?: import('./subagent-hooks.js').SubagentHooks; + eventEmitter?: AgentEventEmitter; + hooks?: AgentHooks; }, - ): Promise<SubAgentScope> { + ): Promise<AgentHeadless> { try { const runtimeConfig = this.convertToRuntimeConfig(config); - return await SubAgentScope.create( + return await AgentHeadless.create( config.name, runtimeContext, runtimeConfig.promptConfig, @@ -609,7 +615,7 @@ export class SubagentManager { } catch (error) { if (error instanceof Error) { throw new SubagentError( - `Failed to create SubAgentScope: ${error.message}`, + `Failed to create AgentHeadless: ${error.message}`, SubagentErrorCode.INVALID_CONFIG, config.name, ); @@ -620,10 +626,10 @@ export class SubagentManager { /** * Converts a file-based SubagentConfig to runtime configuration - * compatible with SubAgentScope.create(). + * compatible with AgentHeadless.create().
* * @param config - File-based subagent configuration - * @returns Runtime configuration for SubAgentScope + * @returns Runtime configuration for AgentHeadless */ convertToRuntimeConfig(config: SubagentConfig): SubagentRuntimeConfig { // Build prompt configuration diff --git a/packages/core/src/subagents/subagent.ts b/packages/core/src/subagents/subagent.ts deleted file mode 100644 index 613bc80441..0000000000 --- a/packages/core/src/subagents/subagent.ts +++ /dev/null @@ -1,1010 +0,0 @@ -/** - * @license - * Copyright 2025 Qwen - * SPDX-License-Identifier: Apache-2.0 - */ - -import { reportError } from '../utils/errorReporting.js'; -import type { Config } from '../config/config.js'; -import { createDebugLogger } from '../utils/debugLogger.js'; - -const debugLogger = createDebugLogger('SUBAGENT'); -import { type ToolCallRequestInfo } from '../core/turn.js'; -import { - CoreToolScheduler, - type ToolCall, - type WaitingToolCall, -} from '../core/coreToolScheduler.js'; -import type { - ToolConfirmationOutcome, - ToolCallConfirmationDetails, -} from '../tools/tools.js'; -import { getInitialChatHistory } from '../utils/environmentContext.js'; -import type { - Content, - Part, - FunctionCall, - GenerateContentConfig, - FunctionDeclaration, - GenerateContentResponseUsageMetadata, -} from '@google/genai'; -import { GeminiChat } from '../core/geminiChat.js'; -import type { - PromptConfig, - ModelConfig, - RunConfig, - ToolConfig, -} from './types.js'; -import { SubagentTerminateMode } from './types.js'; -import type { - SubAgentFinishEvent, - SubAgentRoundEvent, - SubAgentStartEvent, - SubAgentToolCallEvent, - SubAgentToolResultEvent, - SubAgentErrorEvent, - SubAgentUsageEvent, -} from './subagent-events.js'; -import { - type SubAgentEventEmitter, - SubAgentEventType, -} from './subagent-events.js'; -import { - SubagentStatistics, - type SubagentStatsSummary, -} from './subagent-statistics.js'; -import type { SubagentHooks } from './subagent-hooks.js'; -import { 
logSubagentExecution } from '../telemetry/loggers.js'; -import { SubagentExecutionEvent } from '../telemetry/types.js'; -import { TaskTool } from '../tools/task.js'; -import { DEFAULT_QWEN_MODEL } from '../config/models.js'; - -/** - * @fileoverview Defines the configuration interfaces for a subagent. - * - * These interfaces specify the structure for defining the subagent's prompt, - * the model parameters, and the execution settings. - */ - -interface ExecutionStats { - startTimeMs: number; - totalDurationMs: number; - rounds: number; - totalToolCalls: number; - successfulToolCalls: number; - failedToolCalls: number; - inputTokens?: number; - outputTokens?: number; - totalTokens?: number; - estimatedCost?: number; -} - -/** - * Manages the runtime context state for the subagent. - * This class provides a mechanism to store and retrieve key-value pairs - * that represent the dynamic state and variables accessible to the subagent - * during its execution. - */ -export class ContextState { - private state: Record = {}; - - /** - * Retrieves a value from the context state. - * - * @param key - The key of the value to retrieve. - * @returns The value associated with the key, or undefined if the key is not found. - */ - get(key: string): unknown { - return this.state[key]; - } - - /** - * Sets a value in the context state. - * - * @param key - The key to set the value under. - * @param value - The value to set. - */ - set(key: string, value: unknown): void { - this.state[key] = value; - } - - /** - * Retrieves all keys in the context state. - * - * @returns An array of all keys in the context state. - */ - get_keys(): string[] { - return Object.keys(this.state); - } -} - -/** - * Replaces `${...}` placeholders in a template string with values from a context. - * - * This function identifies all placeholders in the format `${key}`, validates that - * each key exists in the provided `ContextState`, and then performs the substitution. 
- * - * @param template The template string containing placeholders. - * @param context The `ContextState` object providing placeholder values. - * @returns The populated string with all placeholders replaced. - * @throws {Error} if any placeholder key is not found in the context. - */ -function templateString(template: string, context: ContextState): string { - const placeholderRegex = /\$\{(\w+)\}/g; - - // First, find all unique keys required by the template. - const requiredKeys = new Set( - Array.from(template.matchAll(placeholderRegex), (match) => match[1]), - ); - - // Check if all required keys exist in the context. - const contextKeys = new Set(context.get_keys()); - const missingKeys = Array.from(requiredKeys).filter( - (key) => !contextKeys.has(key), - ); - - if (missingKeys.length > 0) { - throw new Error( - `Missing context values for the following keys: ${missingKeys.join( - ', ', - )}`, - ); - } - - // Perform the replacement using a replacer function. - return template.replace(placeholderRegex, (_match, key) => - String(context.get(key)), - ); -} - -/** - * Represents the scope and execution environment for a subagent. - * This class orchestrates the subagent's lifecycle, managing its chat interactions, - * runtime context, and the collection of its outputs. 
- */ -export class SubAgentScope { - executionStats: ExecutionStats = { - startTimeMs: 0, - totalDurationMs: 0, - rounds: 0, - totalToolCalls: 0, - successfulToolCalls: 0, - failedToolCalls: 0, - inputTokens: 0, - outputTokens: 0, - totalTokens: 0, - estimatedCost: 0, - }; - private toolUsage = new Map< - string, - { - count: number; - success: number; - failure: number; - lastError?: string; - totalDurationMs?: number; - averageDurationMs?: number; - } - >(); - private eventEmitter?: SubAgentEventEmitter; - private finalText: string = ''; - private terminateMode: SubagentTerminateMode = SubagentTerminateMode.ERROR; - private readonly stats = new SubagentStatistics(); - private hooks?: SubagentHooks; - private readonly subagentId: string; - - /** - * Constructs a new SubAgentScope instance. - * @param name - The name for the subagent, used for logging and identification. - * @param runtimeContext - The shared runtime configuration and services. - * @param promptConfig - Configuration for the subagent's prompt and behavior. - * @param modelConfig - Configuration for the generative model parameters. - * @param runConfig - Configuration for the subagent's execution environment. - * @param toolConfig - Optional configuration for tools available to the subagent. - */ - private constructor( - readonly name: string, - readonly runtimeContext: Config, - private readonly promptConfig: PromptConfig, - private readonly modelConfig: ModelConfig, - private readonly runConfig: RunConfig, - private readonly toolConfig?: ToolConfig, - eventEmitter?: SubAgentEventEmitter, - hooks?: SubagentHooks, - ) { - const randomPart = Math.random().toString(36).slice(2, 8); - this.subagentId = `${this.name}-${randomPart}`; - this.eventEmitter = eventEmitter; - this.hooks = hooks; - } - - /** - * Creates and validates a new SubAgentScope instance. 
- * This factory method ensures that all tools provided in the prompt configuration - * are valid for non-interactive use before creating the subagent instance. - * @param {string} name - The name of the subagent. - * @param {Config} runtimeContext - The shared runtime configuration and services. - * @param {PromptConfig} promptConfig - Configuration for the subagent's prompt and behavior. - * @param {ModelConfig} modelConfig - Configuration for the generative model parameters. - * @param {RunConfig} runConfig - Configuration for the subagent's execution environment. - * @param {ToolConfig} [toolConfig] - Optional configuration for tools. - * @returns {Promise} A promise that resolves to a valid SubAgentScope instance. - * @throws {Error} If any tool requires user confirmation. - */ - static async create( - name: string, - runtimeContext: Config, - promptConfig: PromptConfig, - modelConfig: ModelConfig, - runConfig: RunConfig, - toolConfig?: ToolConfig, - eventEmitter?: SubAgentEventEmitter, - hooks?: SubagentHooks, - ): Promise { - return new SubAgentScope( - name, - runtimeContext, - promptConfig, - modelConfig, - runConfig, - toolConfig, - eventEmitter, - hooks, - ); - } - - /** - * Runs the subagent in a non-interactive mode. - * This method orchestrates the subagent's execution loop, including prompt templating, - * tool execution, and termination conditions. - * @param {ContextState} context - The current context state containing variables for prompt templating. - * @returns {Promise} A promise that resolves when the subagent has completed its execution. 
- */ - async runNonInteractive( - context: ContextState, - externalSignal?: AbortSignal, - ): Promise { - const chat = await this.createChatObject(context); - - if (!chat) { - this.terminateMode = SubagentTerminateMode.ERROR; - return; - } - - // Track the current round's AbortController for external signal propagation - let currentRoundAbortController: AbortController | null = null; - const onExternalAbort = () => { - currentRoundAbortController?.abort(); - }; - if (externalSignal) { - externalSignal.addEventListener('abort', onExternalAbort); - } - - const toolRegistry = this.runtimeContext.getToolRegistry(); - - // Prepare the list of tools available to the subagent. - // If no explicit toolConfig or it contains "*" or is empty, inherit all tools. - const toolsList: FunctionDeclaration[] = []; - if (this.toolConfig) { - const asStrings = this.toolConfig.tools.filter( - (t): t is string => typeof t === 'string', - ); - const hasWildcard = asStrings.includes('*'); - const onlyInlineDecls = this.toolConfig.tools.filter( - (t): t is FunctionDeclaration => typeof t !== 'string', - ); - - if (hasWildcard || asStrings.length === 0) { - toolsList.push( - ...toolRegistry - .getFunctionDeclarations() - .filter((t) => t.name !== TaskTool.Name), - ); - } else { - toolsList.push( - ...toolRegistry.getFunctionDeclarationsFiltered(asStrings), - ); - } - toolsList.push(...onlyInlineDecls); - } else { - // Inherit all available tools by default when not specified. - toolsList.push( - ...toolRegistry - .getFunctionDeclarations() - .filter((t) => t.name !== TaskTool.Name), - ); - } - - const initialTaskText = String( - (context.get('task_prompt') as string) ?? 
'Get Started!', - ); - let currentMessages: Content[] = [ - { role: 'user', parts: [{ text: initialTaskText }] }, - ]; - - const startTime = Date.now(); - this.executionStats.startTimeMs = startTime; - this.stats.start(startTime); - let turnCounter = 0; - try { - // Emit start event - this.eventEmitter?.emit(SubAgentEventType.START, { - subagentId: this.subagentId, - name: this.name, - model: - this.modelConfig.model || - this.runtimeContext.getModel() || - DEFAULT_QWEN_MODEL, - tools: (this.toolConfig?.tools || ['*']).map((t) => - typeof t === 'string' ? t : t.name, - ), - timestamp: Date.now(), - } as SubAgentStartEvent); - - // Log telemetry for subagent start - const startEvent = new SubagentExecutionEvent(this.name, 'started'); - logSubagentExecution(this.runtimeContext, startEvent); - while (true) { - // Create a new AbortController for each round to avoid listener accumulation - const roundAbortController = new AbortController(); - currentRoundAbortController = roundAbortController; - - // If external signal already aborted, cancel immediately - if (externalSignal?.aborted) { - roundAbortController.abort(); - } - - // Check termination conditions. 
- if ( - this.runConfig.max_turns && - turnCounter >= this.runConfig.max_turns - ) { - this.terminateMode = SubagentTerminateMode.MAX_TURNS; - break; - } - let durationMin = (Date.now() - startTime) / (1000 * 60); - if ( - this.runConfig.max_time_minutes && - durationMin >= this.runConfig.max_time_minutes - ) { - this.terminateMode = SubagentTerminateMode.TIMEOUT; - break; - } - - const promptId = `${this.runtimeContext.getSessionId()}#${this.subagentId}#${turnCounter++}`; - - const messageParams = { - message: currentMessages[0]?.parts || [], - config: { - abortSignal: roundAbortController.signal, - tools: [{ functionDeclarations: toolsList }], - }, - }; - - const roundStreamStart = Date.now(); - const responseStream = await chat.sendMessageStream( - this.modelConfig.model || - this.runtimeContext.getModel() || - DEFAULT_QWEN_MODEL, - messageParams, - promptId, - ); - this.eventEmitter?.emit(SubAgentEventType.ROUND_START, { - subagentId: this.subagentId, - round: turnCounter, - promptId, - timestamp: Date.now(), - } as SubAgentRoundEvent); - - const functionCalls: FunctionCall[] = []; - let roundText = ''; - let lastUsage: GenerateContentResponseUsageMetadata | undefined = - undefined; - let currentResponseId: string | undefined = undefined; - for await (const streamEvent of responseStream) { - if (roundAbortController.signal.aborted) { - this.terminateMode = SubagentTerminateMode.CANCELLED; - return; - } - - // Handle retry events - if (streamEvent.type === 'retry') { - continue; - } - - // Handle chunk events - if (streamEvent.type === 'chunk') { - const resp = streamEvent.value; - // Track the response ID for tool call correlation - if (resp.responseId) { - currentResponseId = resp.responseId; - } - if (resp.functionCalls) functionCalls.push(...resp.functionCalls); - const content = resp.candidates?.[0]?.content; - const parts = content?.parts || []; - for (const p of parts) { - const txt = p.text; - const isThought = p.thought ?? 
false; - if (txt && !isThought) roundText += txt; - if (txt) - this.eventEmitter?.emit(SubAgentEventType.STREAM_TEXT, { - subagentId: this.subagentId, - round: turnCounter, - text: txt, - thought: isThought, - timestamp: Date.now(), - }); - } - if (resp.usageMetadata) lastUsage = resp.usageMetadata; - } - } - this.executionStats.rounds = turnCounter; - this.stats.setRounds(turnCounter); - - durationMin = (Date.now() - startTime) / (1000 * 60); - if ( - this.runConfig.max_time_minutes && - durationMin >= this.runConfig.max_time_minutes - ) { - this.terminateMode = SubagentTerminateMode.TIMEOUT; - break; - } - - // Update token usage if available - if (lastUsage) { - const inTok = Number(lastUsage.promptTokenCount || 0); - const outTok = Number(lastUsage.candidatesTokenCount || 0); - const thoughtTok = Number(lastUsage.thoughtsTokenCount || 0); - const cachedTok = Number(lastUsage.cachedContentTokenCount || 0); - if ( - isFinite(inTok) || - isFinite(outTok) || - isFinite(thoughtTok) || - isFinite(cachedTok) - ) { - this.stats.recordTokens( - isFinite(inTok) ? inTok : 0, - isFinite(outTok) ? outTok : 0, - isFinite(thoughtTok) ? thoughtTok : 0, - isFinite(cachedTok) ? cachedTok : 0, - ); - // mirror legacy fields for compatibility - this.executionStats.inputTokens = - (this.executionStats.inputTokens || 0) + - (isFinite(inTok) ? inTok : 0); - this.executionStats.outputTokens = - (this.executionStats.outputTokens || 0) + - (isFinite(outTok) ? outTok : 0); - this.executionStats.totalTokens = - (this.executionStats.inputTokens || 0) + - (this.executionStats.outputTokens || 0) + - (isFinite(thoughtTok) ? thoughtTok : 0) + - (isFinite(cachedTok) ? 
cachedTok : 0); - this.executionStats.estimatedCost = - (this.executionStats.inputTokens || 0) * 3e-5 + - (this.executionStats.outputTokens || 0) * 6e-5; - } - this.eventEmitter?.emit(SubAgentEventType.USAGE_METADATA, { - subagentId: this.subagentId, - round: turnCounter, - usage: lastUsage, - durationMs: Date.now() - roundStreamStart, - timestamp: Date.now(), - } as SubAgentUsageEvent); - } - - if (functionCalls.length > 0) { - currentMessages = await this.processFunctionCalls( - functionCalls, - roundAbortController, - promptId, - turnCounter, - toolsList, - currentResponseId, - ); - } else { - // No tool calls — treat this as the model's final answer. - if (roundText && roundText.trim().length > 0) { - this.finalText = roundText.trim(); - this.terminateMode = SubagentTerminateMode.GOAL; - break; - } - // Otherwise, nudge the model to finalize a result. - currentMessages = [ - { - role: 'user', - parts: [ - { - text: 'Please provide the final result now and stop calling tools.', - }, - ], - }, - ]; - } - this.eventEmitter?.emit(SubAgentEventType.ROUND_END, { - subagentId: this.subagentId, - round: turnCounter, - promptId, - timestamp: Date.now(), - } as SubAgentRoundEvent); - } - } catch (error) { - debugLogger.error('Error during subagent execution:', error); - this.terminateMode = SubagentTerminateMode.ERROR; - this.eventEmitter?.emit(SubAgentEventType.ERROR, { - subagentId: this.subagentId, - error: error instanceof Error ? 
error.message : String(error), - timestamp: Date.now(), - } as SubAgentErrorEvent); - - throw error; - } finally { - if (externalSignal) { - externalSignal.removeEventListener('abort', onExternalAbort); - } - // Clear the reference to allow GC - currentRoundAbortController = null; - this.executionStats.totalDurationMs = Date.now() - startTime; - const summary = this.stats.getSummary(Date.now()); - this.eventEmitter?.emit(SubAgentEventType.FINISH, { - subagentId: this.subagentId, - terminateReason: this.terminateMode, - timestamp: Date.now(), - rounds: summary.rounds, - totalDurationMs: summary.totalDurationMs, - totalToolCalls: summary.totalToolCalls, - successfulToolCalls: summary.successfulToolCalls, - failedToolCalls: summary.failedToolCalls, - inputTokens: summary.inputTokens, - outputTokens: summary.outputTokens, - totalTokens: summary.totalTokens, - } as SubAgentFinishEvent); - - const completionEvent = new SubagentExecutionEvent( - this.name, - this.terminateMode === SubagentTerminateMode.GOAL - ? 'completed' - : 'failed', - { - terminate_reason: this.terminateMode, - result: this.finalText, - execution_summary: this.stats.formatCompact( - 'Subagent execution completed', - ), - }, - ); - logSubagentExecution(this.runtimeContext, completionEvent); - - await this.hooks?.onStop?.({ - subagentId: this.subagentId, - name: this.name, - terminateReason: this.terminateMode, - summary: summary as unknown as Record, - timestamp: Date.now(), - }); - } - } - - /** - * Processes a list of function calls, executing each one and collecting their responses. - * This method iterates through the provided function calls, executes them using the - * `executeToolCall` function (or handles `self.emitvalue` internally), and aggregates - * their results. It also manages error reporting for failed tool executions. - * @param {FunctionCall[]} functionCalls - An array of `FunctionCall` objects to process. 
- * @param {ToolRegistry} toolRegistry - The tool registry to look up and execute tools. - * @param {AbortController} abortController - An `AbortController` to signal cancellation of tool executions. - * @param {string} responseId - Optional API response ID for correlation with tool calls. - * @returns {Promise} A promise that resolves to an array of `Content` parts representing the tool responses, - * which are then used to update the chat history. - */ - private async processFunctionCalls( - functionCalls: FunctionCall[], - abortController: AbortController, - promptId: string, - currentRound: number, - toolsList: FunctionDeclaration[], - responseId?: string, - ): Promise { - const toolResponseParts: Part[] = []; - - // Build allowed tool names set for filtering - const allowedToolNames = new Set(toolsList.map((t) => t.name)); - - // Filter unauthorized tool calls before scheduling - const authorizedCalls: FunctionCall[] = []; - for (const fc of functionCalls) { - const callId = fc.id ?? `${fc.name}-${Date.now()}`; - - if (!allowedToolNames.has(fc.name)) { - const toolName = String(fc.name); - const errorMessage = `Tool "${toolName}" not found. Tools must use the exact names provided.`; - - // Emit TOOL_CALL event for visibility - this.eventEmitter?.emit(SubAgentEventType.TOOL_CALL, { - subagentId: this.subagentId, - round: currentRound, - callId, - name: toolName, - args: fc.args ?? 
{}, - description: `Tool "${toolName}" not found`, - timestamp: Date.now(), - } as SubAgentToolCallEvent); - - // Build function response part (used for both event and LLM) - const functionResponsePart = { - functionResponse: { - id: callId, - name: toolName, - response: { error: errorMessage }, - }, - }; - - // Emit TOOL_RESULT event with error (include responseParts for UI rendering) - this.eventEmitter?.emit(SubAgentEventType.TOOL_RESULT, { - subagentId: this.subagentId, - round: currentRound, - callId, - name: toolName, - success: false, - error: errorMessage, - responseParts: [functionResponsePart], - resultDisplay: errorMessage, - durationMs: 0, - timestamp: Date.now(), - } as SubAgentToolResultEvent); - - // Record blocked tool call in stats - this.recordToolCallStats(toolName, false, 0, errorMessage); - - // Add function response for LLM - toolResponseParts.push(functionResponsePart); - continue; - } - authorizedCalls.push(fc); - } - - // Build scheduler - const responded = new Set(); - let resolveBatch: (() => void) | null = null; - const scheduler = new CoreToolScheduler({ - config: this.runtimeContext, - outputUpdateHandler: undefined, - onAllToolCallsComplete: async (completedCalls) => { - for (const call of completedCalls) { - const toolName = call.request.name; - const duration = call.durationMs ?? 0; - const success = call.status === 'success'; - const errorMessage = - call.status === 'error' || call.status === 'cancelled' - ? call.response.error?.message - : undefined; - - // Record stats - this.recordToolCallStats(toolName, success, duration, errorMessage); - - // Emit tool result event - this.eventEmitter?.emit(SubAgentEventType.TOOL_RESULT, { - subagentId: this.subagentId, - round: currentRound, - callId: call.request.callId, - name: toolName, - success, - error: errorMessage, - responseParts: call.response.responseParts, - resultDisplay: call.response.resultDisplay - ? typeof call.response.resultDisplay === 'string' - ? 
call.response.resultDisplay - : JSON.stringify(call.response.resultDisplay) - : undefined, - durationMs: duration, - timestamp: Date.now(), - } as SubAgentToolResultEvent); - - // post-tool hook - await this.hooks?.postToolUse?.({ - subagentId: this.subagentId, - name: this.name, - toolName, - args: call.request.args, - success, - durationMs: duration, - errorMessage, - timestamp: Date.now(), - }); - - // Append response parts - const respParts = call.response.responseParts; - if (respParts) { - const parts = Array.isArray(respParts) ? respParts : [respParts]; - for (const part of parts) { - if (typeof part === 'string') { - toolResponseParts.push({ text: part }); - } else if (part) { - toolResponseParts.push(part); - } - } - } - } - // Signal that this batch is complete (all tools terminal) - resolveBatch?.(); - }, - onToolCallsUpdate: (calls: ToolCall[]) => { - for (const call of calls) { - if (call.status !== 'awaiting_approval') continue; - const waiting = call as WaitingToolCall; - - // Emit approval request event for UI visibility - try { - const { confirmationDetails } = waiting; - const { onConfirm: _onConfirm, ...rest } = confirmationDetails; - this.eventEmitter?.emit(SubAgentEventType.TOOL_WAITING_APPROVAL, { - subagentId: this.subagentId, - round: currentRound, - callId: waiting.request.callId, - name: waiting.request.name, - description: this.getToolDescription( - waiting.request.name, - waiting.request.args, - ), - confirmationDetails: rest, - respond: async ( - outcome: ToolConfirmationOutcome, - payload?: Parameters< - ToolCallConfirmationDetails['onConfirm'] - >[1], - ) => { - if (responded.has(waiting.request.callId)) return; - responded.add(waiting.request.callId); - await waiting.confirmationDetails.onConfirm(outcome, payload); - }, - timestamp: Date.now(), - }); - } catch { - // ignore UI event emission failures - } - - // UI now renders inline confirmation via task tool live output. 
- } - }, - getPreferredEditor: () => undefined, - onEditorClose: () => {}, - }); - - // Prepare requests and emit TOOL_CALL events - const requests: ToolCallRequestInfo[] = authorizedCalls.map((fc) => { - const toolName = String(fc.name || 'unknown'); - const callId = fc.id ?? `${fc.name}-${Date.now()}`; - const args = (fc.args ?? {}) as Record; - const request: ToolCallRequestInfo = { - callId, - name: toolName, - args, - isClientInitiated: true, - prompt_id: promptId, - response_id: responseId, - }; - - const description = this.getToolDescription(toolName, args); - this.eventEmitter?.emit(SubAgentEventType.TOOL_CALL, { - subagentId: this.subagentId, - round: currentRound, - callId, - name: toolName, - args, - description, - timestamp: Date.now(), - } as SubAgentToolCallEvent); - - // pre-tool hook - void this.hooks?.preToolUse?.({ - subagentId: this.subagentId, - name: this.name, - toolName, - args, - timestamp: Date.now(), - }); - - return request; - }); - - if (requests.length > 0) { - // Create a per-batch completion promise, resolve when onAllToolCallsComplete fires - const batchDone = new Promise((resolve) => { - resolveBatch = () => { - resolve(); - resolveBatch = null; - }; - }); - await scheduler.schedule(requests, abortController.signal); - await batchDone; // Wait for approvals + execution to finish - } - // If all tool calls failed, inform the model so it can re-evaluate. - if (functionCalls.length > 0 && toolResponseParts.length === 0) { - toolResponseParts.push({ - text: 'All tool calls failed. Please analyze the errors and try an alternative approach.', - }); - } - - return [{ role: 'user', parts: toolResponseParts }]; - } - - getEventEmitter() { - return this.eventEmitter; - } - - getStatistics() { - const total = this.executionStats.totalToolCalls; - const successRate = - total > 0 ? 
(this.executionStats.successfulToolCalls / total) * 100 : 0; - return { - ...this.executionStats, - successRate, - toolUsage: Array.from(this.toolUsage.entries()).map(([name, v]) => ({ - name, - ...v, - })), - }; - } - - getExecutionSummary(): SubagentStatsSummary { - return this.stats.getSummary(); - } - - getFinalText(): string { - return this.finalText; - } - - getTerminateMode(): SubagentTerminateMode { - return this.terminateMode; - } - - private async createChatObject(context: ContextState) { - if (!this.promptConfig.systemPrompt && !this.promptConfig.initialMessages) { - throw new Error( - 'PromptConfig must have either `systemPrompt` or `initialMessages` defined.', - ); - } - if (this.promptConfig.systemPrompt && this.promptConfig.initialMessages) { - throw new Error( - 'PromptConfig cannot have both `systemPrompt` and `initialMessages` defined.', - ); - } - - const envHistory = await getInitialChatHistory(this.runtimeContext); - - const start_history = [ - ...envHistory, - ...(this.promptConfig.initialMessages ?? []), - ]; - - const systemInstruction = this.promptConfig.systemPrompt - ? this.buildChatSystemPrompt(context) - : undefined; - - try { - const generationConfig: GenerateContentConfig & { - systemInstruction?: string | Content; - } = { - temperature: this.modelConfig.temp, - topP: this.modelConfig.top_p, - }; - - if (systemInstruction) { - generationConfig.systemInstruction = systemInstruction; - } - - return new GeminiChat( - this.runtimeContext, - generationConfig, - start_history, - ); - } catch (error) { - await reportError( - error, - 'Error initializing chat session.', - start_history, - 'startChat', - ); - // The calling function will handle the undefined return. - return undefined; - } - } - - /** - * Safely retrieves the description of a tool by attempting to build it. - * Returns an empty string if any error occurs during the process. - * - * @param toolName The name of the tool to get description for. 
- * @param args The arguments that would be passed to the tool. - * @returns The tool description or empty string if error occurs. - */ - private getToolDescription( - toolName: string, - args: Record, - ): string { - try { - const toolRegistry = this.runtimeContext.getToolRegistry(); - const tool = toolRegistry.getTool(toolName); - if (!tool) { - return ''; - } - - const toolInstance = tool.build(args); - return toolInstance.getDescription() || ''; - } catch { - // Safely ignore all runtime errors and return empty string - return ''; - } - } - - /** - * Records tool call statistics for both successful and failed tool calls. - * This includes updating aggregate stats, per-tool usage, and the statistics service. - */ - private recordToolCallStats( - toolName: string, - success: boolean, - durationMs: number, - errorMessage?: string, - ): void { - // Update aggregate stats - this.executionStats.totalToolCalls += 1; - if (success) { - this.executionStats.successfulToolCalls += 1; - } else { - this.executionStats.failedToolCalls += 1; - } - - // Per-tool usage - const tu = this.toolUsage.get(toolName) || { - count: 0, - success: 0, - failure: 0, - totalDurationMs: 0, - averageDurationMs: 0, - }; - tu.count += 1; - if (success) { - tu.success += 1; - } else { - tu.failure += 1; - tu.lastError = errorMessage || 'Unknown error'; - } - tu.totalDurationMs = (tu.totalDurationMs || 0) + durationMs; - tu.averageDurationMs = tu.count > 0 ? tu.totalDurationMs / tu.count : 0; - this.toolUsage.set(toolName, tu); - - // Update statistics service - this.stats.recordToolCall( - toolName, - success, - durationMs, - this.toolUsage.get(toolName)?.lastError, - ); - } - - private buildChatSystemPrompt(context: ContextState): string { - if (!this.promptConfig.systemPrompt) { - // This should ideally be caught in createChatObject, but serves as a safeguard. 
- return ''; - } - - let finalPrompt = templateString(this.promptConfig.systemPrompt, context); - - // Add general non-interactive instructions. - finalPrompt += ` - -Important Rules: - - You operate in non-interactive mode: do not ask the user questions; proceed with available context. - - Use tools only when necessary to obtain facts or make changes. - - When the task is complete, return the final result as a normal model response (not a tool call) and stop.`; - - // Append user memory (QWEN.md + output-language.md) to ensure subagent respects project conventions - const userMemory = this.runtimeContext.getUserMemory(); - if (userMemory && userMemory.trim().length > 0) { - finalPrompt += `\n\n---\n\n${userMemory.trim()}`; - } - - return finalPrompt; - } -} diff --git a/packages/core/src/subagents/types.ts b/packages/core/src/subagents/types.ts index efa73a7e4d..55e57f61e1 100644 --- a/packages/core/src/subagents/types.ts +++ b/packages/core/src/subagents/types.ts @@ -4,7 +4,19 @@ * SPDX-License-Identifier: Apache-2.0 */ -import type { Content, FunctionDeclaration } from '@google/genai'; +/** + * @fileoverview Subagent configuration types. + * + * Agent runtime types (PromptConfig, ModelConfig, RunConfig, ToolConfig, + * AgentTerminateMode) are canonically defined in agents/runtime/agent-types.ts. + */ + +import type { + ModelConfig, + RunConfig, + PromptConfig, + ToolConfig, +} from '../agents/runtime/agent-types.js'; /** * Represents the storage level for a subagent configuration. @@ -24,7 +36,7 @@ export type SubagentLevel = /** * Core configuration for a subagent as stored in Markdown files. * This interface represents the file-based configuration that gets - * converted to runtime configuration for SubAgentScope. + * converted to runtime configuration for AgentHeadless. 
*/ export interface SubagentConfig { /** Unique name identifier for the subagent */ @@ -82,20 +94,20 @@ export interface SubagentConfig { } /** - * Runtime configuration that converts file-based config to existing SubAgentScope. + * Runtime configuration that converts file-based config to AgentHeadless. * This interface maps SubagentConfig to the existing runtime interfaces. */ export interface SubagentRuntimeConfig { - /** Prompt configuration for SubAgentScope */ + /** Prompt configuration for AgentHeadless */ promptConfig: PromptConfig; - /** Model configuration for SubAgentScope */ + /** Model configuration for AgentHeadless */ modelConfig: ModelConfig; - /** Runtime execution configuration for SubAgentScope */ + /** Runtime execution configuration for AgentHeadless */ runConfig: RunConfig; - /** Optional tool configuration for SubAgentScope */ + /** Optional tool configuration for AgentHeadless */ toolConfig?: ToolConfig; } @@ -176,97 +188,3 @@ export const SubagentErrorCode = { export type SubagentErrorCode = (typeof SubagentErrorCode)[keyof typeof SubagentErrorCode]; - -/** - * Describes the possible termination modes for a subagent. - * This enum provides a clear indication of why a subagent's execution might have ended. - */ -export enum SubagentTerminateMode { - /** - * Indicates that the subagent's execution terminated due to an unrecoverable error. - */ - ERROR = 'ERROR', - /** - * Indicates that the subagent's execution terminated because it exceeded the maximum allowed working time. - */ - TIMEOUT = 'TIMEOUT', - /** - * Indicates that the subagent's execution successfully completed all its defined goals. - */ - GOAL = 'GOAL', - /** - * Indicates that the subagent's execution terminated because it exceeded the maximum number of turns. - */ - MAX_TURNS = 'MAX_TURNS', - /** - * Indicates that the subagent's execution was cancelled via an abort signal. - */ - CANCELLED = 'CANCELLED', -} - -/** - * Configures the initial prompt for the subagent. 
- */ -export interface PromptConfig { - /** - * A single system prompt string that defines the subagent's persona and instructions. - * Note: You should use either `systemPrompt` or `initialMessages`, but not both. - */ - systemPrompt?: string; - - /** - * An array of user/model content pairs to seed the chat history for few-shot prompting. - * Note: You should use either `systemPrompt` or `initialMessages`, but not both. - */ - initialMessages?: Content[]; -} - -/** - * Configures the tools available to the subagent during its execution. - */ -export interface ToolConfig { - /** - * A list of tool names (from the tool registry) or full function declarations - * that the subagent is permitted to use. - */ - tools: Array; -} - -/** - * Configures the generative model parameters for the subagent. - * This interface specifies the model to be used and its associated generation settings, - * such as temperature and top-p values, which influence the creativity and diversity of the model's output. - */ -export interface ModelConfig { - /** - * The name or identifier of the model to be used (e.g., 'qwen3-coder-plus'). - * - * TODO: In the future, this needs to support 'auto' or some other string to support routing use cases. - */ - model?: string; - /** - * The temperature for the model's sampling process. - */ - temp?: number; - /** - * The top-p value for nucleus sampling. - */ - top_p?: number; -} - -/** - * Configures the execution environment and constraints for the subagent. - * This interface defines parameters that control the subagent's runtime behavior, - * such as maximum execution time, to prevent infinite loops or excessive resource consumption. - * - * TODO: Consider adding max_tokens as a form of budgeting. - */ -export interface RunConfig { - /** The maximum execution time for the subagent in minutes. */ - max_time_minutes?: number; - /** - * The maximum number of conversational turns (a user message + model response) - * before the execution is terminated. 
Helps prevent infinite loops. - */ - max_turns?: number; -} diff --git a/packages/core/src/subagents/validation.ts b/packages/core/src/subagents/validation.ts index ac45a37969..15fb312690 100644 --- a/packages/core/src/subagents/validation.ts +++ b/packages/core/src/subagents/validation.ts @@ -5,12 +5,8 @@ */ import { SubagentError, SubagentErrorCode } from './types.js'; -import type { - ModelConfig, - RunConfig, - SubagentConfig, - ValidationResult, -} from './types.js'; +import type { SubagentConfig, ValidationResult } from './types.js'; +import type { ModelConfig, RunConfig } from '../agents/runtime/agent-types.js'; /** * Validates subagent configurations to ensure they are well-formed diff --git a/packages/core/src/telemetry/constants.ts b/packages/core/src/telemetry/constants.ts index 8149dfc474..6de60015b1 100644 --- a/packages/core/src/telemetry/constants.ts +++ b/packages/core/src/telemetry/constants.ts @@ -39,6 +39,11 @@ export const EVENT_SKILL_LAUNCH = 'qwen-code.skill_launch'; export const EVENT_AUTH = 'qwen-code.auth'; export const EVENT_USER_FEEDBACK = 'qwen-code.user_feedback'; +// Arena Events +export const EVENT_ARENA_SESSION_STARTED = 'qwen-code.arena_session_started'; +export const EVENT_ARENA_AGENT_COMPLETED = 'qwen-code.arena_agent_completed'; +export const EVENT_ARENA_SESSION_ENDED = 'qwen-code.arena_session_ended'; + // Performance Events export const EVENT_STARTUP_PERFORMANCE = 'qwen-code.startup.performance'; export const EVENT_MEMORY_USAGE = 'qwen-code.memory.usage'; diff --git a/packages/core/src/telemetry/index.ts b/packages/core/src/telemetry/index.ts index cc21d7716c..596db3fa12 100644 --- a/packages/core/src/telemetry/index.ts +++ b/packages/core/src/telemetry/index.ts @@ -49,6 +49,9 @@ export { logAuth, logSkillLaunch, logUserFeedback, + logArenaSessionStarted, + logArenaAgentCompleted, + logArenaSessionEnded, } from './loggers.js'; export type { SlashCommandEvent, ChatCompressionEvent } from './types.js'; export { @@ -72,8 +75,18 @@ 
export { SkillLaunchEvent, UserFeedbackEvent, UserFeedbackRating, + makeArenaSessionStartedEvent, + makeArenaAgentCompletedEvent, + makeArenaSessionEndedEvent, } from './types.js'; export { makeSlashCommandEvent, makeChatCompressionEvent } from './types.js'; +export type { + ArenaSessionStartedEvent, + ArenaAgentCompletedEvent, + ArenaSessionEndedEvent, + ArenaSessionEndedStatus, + ArenaAgentCompletedStatus, +} from './types.js'; export type { TelemetryEvent } from './types.js'; export { SpanStatusCode, ValueType } from '@opentelemetry/api'; export { SemanticAttributes } from '@opentelemetry/semantic-conventions'; @@ -100,6 +113,10 @@ export { recordPerformanceRegression, recordBaselineComparison, isPerformanceMonitoringActive, + // Arena metrics functions + recordArenaSessionStartedMetrics, + recordArenaAgentCompletedMetrics, + recordArenaSessionEndedMetrics, // Performance monitoring types PerformanceMetricType, MemoryMetricType, diff --git a/packages/core/src/telemetry/loggers.ts b/packages/core/src/telemetry/loggers.ts index e2bf6b1e55..0a7842f38e 100644 --- a/packages/core/src/telemetry/loggers.ts +++ b/packages/core/src/telemetry/loggers.ts @@ -41,6 +41,9 @@ import { EVENT_SKILL_LAUNCH, EVENT_EXTENSION_UPDATE, EVENT_USER_FEEDBACK, + EVENT_ARENA_SESSION_STARTED, + EVENT_ARENA_AGENT_COMPLETED, + EVENT_ARENA_SESSION_ENDED, } from './constants.js'; import { recordApiErrorMetrics, @@ -54,6 +57,9 @@ import { recordSubagentExecutionMetrics, recordTokenUsageMetrics, recordToolCallMetrics, + recordArenaSessionStartedMetrics, + recordArenaAgentCompletedMetrics, + recordArenaSessionEndedMetrics, } from './metrics.js'; import { QwenLogger } from './qwen-logger/qwen-logger.js'; import { isTelemetrySdkInitialized } from './sdk.js'; @@ -92,6 +98,9 @@ import type { AuthEvent, SkillLaunchEvent, UserFeedbackEvent, + ArenaSessionStartedEvent, + ArenaAgentCompletedEvent, + ArenaSessionEndedEvent, } from './types.js'; import type { UiEvent } from './uiTelemetry.js'; import { 
uiTelemetryService } from './uiTelemetry.js'; @@ -968,3 +977,86 @@ export function logUserFeedback( }; logger.emit(logRecord); } + +export function logArenaSessionStarted( + config: Config, + event: ArenaSessionStartedEvent, +): void { + QwenLogger.getInstance(config)?.logArenaSessionStartedEvent(event); + if (!isTelemetrySdkInitialized()) return; + + const attributes: LogAttributes = { + ...getCommonAttributes(config), + ...event, + model_ids: JSON.stringify(event.model_ids), + 'event.name': EVENT_ARENA_SESSION_STARTED, + 'event.timestamp': new Date().toISOString(), + }; + + const logger = logs.getLogger(SERVICE_NAME); + const logRecord: LogRecord = { + body: `Arena session started. Agents: ${event.model_ids.length}.`, + attributes, + }; + logger.emit(logRecord); + recordArenaSessionStartedMetrics(config); +} + +export function logArenaAgentCompleted( + config: Config, + event: ArenaAgentCompletedEvent, +): void { + QwenLogger.getInstance(config)?.logArenaAgentCompletedEvent(event); + if (!isTelemetrySdkInitialized()) return; + + const attributes: LogAttributes = { + ...getCommonAttributes(config), + ...event, + 'event.name': EVENT_ARENA_AGENT_COMPLETED, + 'event.timestamp': new Date().toISOString(), + }; + + const logger = logs.getLogger(SERVICE_NAME); + const logRecord: LogRecord = { + body: `Arena agent ${event.agent_model_id} ${event.status}. Duration: ${event.duration_ms}ms. 
Tokens: ${event.total_tokens}.`, + attributes, + }; + logger.emit(logRecord); + recordArenaAgentCompletedMetrics( + config, + event.agent_model_id, + event.status, + event.duration_ms, + event.input_tokens, + event.output_tokens, + ); +} + +export function logArenaSessionEnded( + config: Config, + event: ArenaSessionEndedEvent, +): void { + QwenLogger.getInstance(config)?.logArenaSessionEndedEvent(event); + if (!isTelemetrySdkInitialized()) return; + + const attributes: LogAttributes = { + ...getCommonAttributes(config), + ...event, + 'event.name': EVENT_ARENA_SESSION_ENDED, + 'event.timestamp': new Date().toISOString(), + }; + + const logger = logs.getLogger(SERVICE_NAME); + const logRecord: LogRecord = { + body: `Arena session ended: ${event.status}.${event.winner_model_id ? ` Winner: ${event.winner_model_id}.` : ''}`, + attributes, + }; + logger.emit(logRecord); + recordArenaSessionEndedMetrics( + config, + event.status, + event.display_backend, + event.duration_ms, + event.winner_model_id, + ); +} diff --git a/packages/core/src/telemetry/metrics.ts b/packages/core/src/telemetry/metrics.ts index 0ab499e0f1..f71498c365 100644 --- a/packages/core/src/telemetry/metrics.ts +++ b/packages/core/src/telemetry/metrics.ts @@ -23,6 +23,14 @@ const CONTENT_RETRY_FAILURE_COUNT = `${SERVICE_NAME}.chat.content_retry_failure. 
const MODEL_SLASH_COMMAND_CALL_COUNT = `${SERVICE_NAME}.slash_command.model.call_count`; export const SUBAGENT_EXECUTION_COUNT = `${SERVICE_NAME}.subagent.execution.count`; +// Arena Metrics +const ARENA_SESSION_COUNT = `${SERVICE_NAME}.arena.session.count`; +const ARENA_SESSION_DURATION = `${SERVICE_NAME}.arena.session.duration`; +const ARENA_AGENT_COUNT = `${SERVICE_NAME}.arena.agent.count`; +const ARENA_AGENT_DURATION = `${SERVICE_NAME}.arena.agent.duration`; +const ARENA_AGENT_TOKENS = `${SERVICE_NAME}.arena.agent.tokens`; +const ARENA_RESULT_SELECTED = `${SERVICE_NAME}.arena.result.selected`; + // Performance Monitoring Metrics const STARTUP_TIME = `${SERVICE_NAME}.startup.duration`; const MEMORY_USAGE = `${SERVICE_NAME}.memory.usage`; @@ -345,6 +353,14 @@ let performanceScoreGauge: Histogram | undefined; let regressionDetectionCounter: Counter | undefined; let regressionPercentageChangeHistogram: Histogram | undefined; let baselineComparisonHistogram: Histogram | undefined; +// Arena Metrics +let arenaSessionCounter: Counter | undefined; +let arenaSessionDurationHistogram: Histogram | undefined; +let arenaAgentCounter: Counter | undefined; +let arenaAgentDurationHistogram: Histogram | undefined; +let arenaAgentTokensCounter: Counter | undefined; +let arenaResultSelectedCounter: Counter | undefined; + let isMetricsInitialized = false; let isPerformanceMonitoringEnabled = false; @@ -373,6 +389,37 @@ export function initializeMetrics(config: Config): void { valueType: ValueType.INT, }); + // Arena metrics + arenaSessionCounter = meter.createCounter(ARENA_SESSION_COUNT, { + description: 'Counts arena sessions by status and display backend.', + valueType: ValueType.INT, + }); + arenaSessionDurationHistogram = meter.createHistogram( + ARENA_SESSION_DURATION, + { + description: 'Duration of arena sessions in milliseconds.', + unit: 'ms', + valueType: ValueType.INT, + }, + ); + arenaAgentCounter = meter.createCounter(ARENA_AGENT_COUNT, { + description: 'Counts arena 
agent completions by status and model.', + valueType: ValueType.INT, + }); + arenaAgentDurationHistogram = meter.createHistogram(ARENA_AGENT_DURATION, { + description: 'Duration of arena agent execution in milliseconds.', + unit: 'ms', + valueType: ValueType.INT, + }); + arenaAgentTokensCounter = meter.createCounter(ARENA_AGENT_TOKENS, { + description: 'Token usage by arena agents.', + valueType: ValueType.INT, + }); + arenaResultSelectedCounter = meter.createCounter(ARENA_RESULT_SELECTED, { + description: 'Counts arena result selections by model.', + valueType: ValueType.INT, + }); + Object.entries(HISTOGRAM_DEFINITIONS).forEach( ([name, { description, unit, valueType, assign }]) => { assign(meter.createHistogram(name, { description, unit, valueType })); @@ -747,3 +794,85 @@ export function recordSubagentExecutionMetrics( subagentExecutionCounter.add(1, attributes); } + +// ─── Arena Metric Recording Functions ─────────────────────────── + +export function recordArenaSessionStartedMetrics(config: Config): void { + if (!isMetricsInitialized) return; + arenaSessionCounter?.add(1, { + ...baseMetricDefinition.getCommonAttributes(config), + status: 'started', + }); +} + +export function recordArenaAgentCompletedMetrics( + config: Config, + modelId: string, + status: string, + durationMs: number, + inputTokens: number, + outputTokens: number, +): void { + if (!isMetricsInitialized) return; + + const common = baseMetricDefinition.getCommonAttributes(config); + + arenaAgentCounter?.add(1, { + ...common, + status, + model_id: modelId, + }); + + arenaAgentDurationHistogram?.record(durationMs, { + ...common, + model_id: modelId, + }); + + if (inputTokens > 0) { + arenaAgentTokensCounter?.add(inputTokens, { + ...common, + model_id: modelId, + type: 'input', + }); + } + + if (outputTokens > 0) { + arenaAgentTokensCounter?.add(outputTokens, { + ...common, + model_id: modelId, + type: 'output', + }); + } +} + +export function recordArenaSessionEndedMetrics( + config: Config, + 
status: string, + displayBackend?: string, + durationMs?: number, + winnerModelId?: string, +): void { + if (!isMetricsInitialized) return; + + const common = baseMetricDefinition.getCommonAttributes(config); + + arenaSessionCounter?.add(1, { + ...common, + status, + ...(displayBackend ? { display_backend: displayBackend } : {}), + }); + + if (durationMs !== undefined && arenaSessionDurationHistogram) { + arenaSessionDurationHistogram.record(durationMs, { + ...common, + status, + }); + } + + if (winnerModelId) { + arenaResultSelectedCounter?.add(1, { + ...common, + model_id: winnerModelId, + }); + } +} diff --git a/packages/core/src/telemetry/qwen-logger/qwen-logger.ts b/packages/core/src/telemetry/qwen-logger/qwen-logger.ts index 81cf7efbfe..b0bb22bb0d 100644 --- a/packages/core/src/telemetry/qwen-logger/qwen-logger.ts +++ b/packages/core/src/telemetry/qwen-logger/qwen-logger.ts @@ -46,6 +46,9 @@ import type { RipgrepFallbackEvent, EndSessionEvent, ExtensionUpdateEvent, + ArenaSessionStartedEvent, + ArenaAgentCompletedEvent, + ArenaSessionEndedEvent, } from '../types.js'; import type { RumEvent, @@ -937,6 +940,61 @@ export class QwenLogger { this.flushIfNeeded(); } + // arena events + logArenaSessionStartedEvent(event: ArenaSessionStartedEvent): void { + const rumEvent = this.createActionEvent('arena', 'arena_session_started', { + properties: { + arena_session_id: event.arena_session_id, + model_ids: JSON.stringify(event.model_ids), + task_length: event.task_length, + }, + }); + + this.enqueueLogEvent(rumEvent); + this.flushIfNeeded(); + } + + logArenaAgentCompletedEvent(event: ArenaAgentCompletedEvent): void { + const rumEvent = this.createActionEvent('arena', 'arena_agent_completed', { + properties: { + arena_session_id: event.arena_session_id, + agent_session_id: event.agent_session_id, + agent_model_id: event.agent_model_id, + status: event.status, + duration_ms: event.duration_ms, + rounds: event.rounds, + total_tokens: event.total_tokens, + input_tokens: 
event.input_tokens, + output_tokens: event.output_tokens, + tool_calls: event.tool_calls, + successful_tool_calls: event.successful_tool_calls, + failed_tool_calls: event.failed_tool_calls, + }, + }); + + this.enqueueLogEvent(rumEvent); + this.flushIfNeeded(); + } + + logArenaSessionEndedEvent(event: ArenaSessionEndedEvent): void { + const rumEvent = this.createActionEvent('arena', 'arena_session_ended', { + properties: { + arena_session_id: event.arena_session_id, + status: event.status, + duration_ms: event.duration_ms, + display_backend: event.display_backend, + agent_count: event.agent_count, + completed_agents: event.completed_agents, + failed_agents: event.failed_agents, + cancelled_agents: event.cancelled_agents, + winner_model_id: event.winner_model_id, + }, + }); + + this.enqueueLogEvent(rumEvent); + this.flushIfNeeded(); + } + getProxyAgent() { const proxyUrl = this.config?.getProxy(); if (!proxyUrl) return undefined; diff --git a/packages/core/src/telemetry/types.ts b/packages/core/src/telemetry/types.ts index e25e937e4a..39b6c5c48a 100644 --- a/packages/core/src/telemetry/types.ts +++ b/packages/core/src/telemetry/types.ts @@ -877,7 +877,128 @@ export type TelemetryEvent = | ModelSlashCommandEvent | AuthEvent | SkillLaunchEvent - | UserFeedbackEvent; + | UserFeedbackEvent + | ArenaSessionStartedEvent + | ArenaAgentCompletedEvent + | ArenaSessionEndedEvent; + +// ─── Arena Telemetry Events ──────────────────────────────────── + +export interface ArenaSessionStartedEvent extends BaseTelemetryEvent { + 'event.name': 'arena_session_started'; + arena_session_id: string; + model_ids: string[]; + task_length: number; +} + +export function makeArenaSessionStartedEvent({ + arena_session_id, + model_ids, + task_length, +}: Omit): ArenaSessionStartedEvent { + return { + 'event.name': 'arena_session_started', + 'event.timestamp': new Date().toISOString(), + arena_session_id, + model_ids, + task_length, + }; +} + +export type ArenaAgentCompletedStatus = 'completed' 
| 'failed' | 'cancelled'; + +export interface ArenaAgentCompletedEvent extends BaseTelemetryEvent { + 'event.name': 'arena_agent_completed'; + arena_session_id: string; + agent_session_id: string; + agent_model_id: string; + status: ArenaAgentCompletedStatus; + duration_ms: number; + rounds: number; + total_tokens: number; + input_tokens: number; + output_tokens: number; + tool_calls: number; + successful_tool_calls: number; + failed_tool_calls: number; +} + +export function makeArenaAgentCompletedEvent({ + arena_session_id, + agent_session_id, + agent_model_id, + status, + duration_ms, + rounds, + total_tokens, + input_tokens, + output_tokens, + tool_calls, + successful_tool_calls, + failed_tool_calls, +}: Omit): ArenaAgentCompletedEvent { + return { + 'event.name': 'arena_agent_completed', + 'event.timestamp': new Date().toISOString(), + arena_session_id, + agent_session_id, + agent_model_id, + status, + duration_ms, + rounds, + total_tokens, + input_tokens, + output_tokens, + tool_calls, + successful_tool_calls, + failed_tool_calls, + }; +} + +export type ArenaSessionEndedStatus = + | 'selected' + | 'discarded' + | 'failed' + | 'cancelled'; + +export interface ArenaSessionEndedEvent extends BaseTelemetryEvent { + 'event.name': 'arena_session_ended'; + arena_session_id: string; + status: ArenaSessionEndedStatus; + duration_ms: number; + display_backend?: string; + agent_count: number; + completed_agents: number; + failed_agents: number; + cancelled_agents: number; + winner_model_id?: string; +} + +export function makeArenaSessionEndedEvent({ + arena_session_id, + status, + duration_ms, + display_backend, + agent_count, + completed_agents, + failed_agents, + cancelled_agents, + winner_model_id, +}: Omit): ArenaSessionEndedEvent { + return { + 'event.name': 'arena_session_ended', + 'event.timestamp': new Date().toISOString(), + arena_session_id, + status, + duration_ms, + display_backend, + agent_count, + completed_agents, + failed_agents, + cancelled_agents, + 
winner_model_id, + }; +} export class ExtensionDisableEvent implements BaseTelemetryEvent { 'event.name': 'extension_disable'; diff --git a/packages/core/src/tools/read-file.ts b/packages/core/src/tools/read-file.ts index 215ae5c36b..2e0a1c2283 100644 --- a/packages/core/src/tools/read-file.ts +++ b/packages/core/src/tools/read-file.ts @@ -188,18 +188,21 @@ export class ReadFileTool extends BaseDeclarativeTool< const globalTempDir = Storage.getGlobalTempDir(); const projectTempDir = this.config.storage.getProjectTempDir(); const userSkillsDir = this.config.storage.getUserSkillsDir(); + const arenaDir = Storage.getGlobalArenaDir(); const resolvedFilePath = path.resolve(filePath); const osTempDir = os.tmpdir(); const isWithinTempDir = isSubpath(projectTempDir, resolvedFilePath) || isSubpath(globalTempDir, resolvedFilePath) || isSubpath(osTempDir, resolvedFilePath); + const isWithinArenaDir = isSubpath(arenaDir, resolvedFilePath); const isWithinUserSkills = isSubpath(userSkillsDir, resolvedFilePath); if ( !workspaceContext.isPathWithinWorkspace(filePath) && !isWithinTempDir && - !isWithinUserSkills + !isWithinUserSkills && + !isWithinArenaDir ) { const directories = workspaceContext.getDirectories(); return `File path must be within one of the workspace directories: ${directories.join( diff --git a/packages/core/src/tools/task.test.ts b/packages/core/src/tools/task.test.ts index 458b026b69..3100a771dc 100644 --- a/packages/core/src/tools/task.test.ts +++ b/packages/core/src/tools/task.test.ts @@ -10,11 +10,12 @@ import type { PartListUnion } from '@google/genai'; import type { ToolResultDisplay, TaskResultDisplay } from './tools.js'; import type { Config } from '../config/config.js'; import { SubagentManager } from '../subagents/subagent-manager.js'; +import type { SubagentConfig } from '../subagents/types.js'; +import { AgentTerminateMode } from '../agents/runtime/agent-types.js'; import { - type SubagentConfig, - SubagentTerminateMode, -} from 
'../subagents/types.js'; -import { type SubAgentScope, ContextState } from '../subagents/subagent.js'; + type AgentHeadless, + ContextState, +} from '../agents/runtime/agent-headless.js'; import { partToString } from '../utils/partUtils.js'; // Type for accessing protected methods in tests @@ -34,7 +35,7 @@ type TaskToolWithProtectedMethods = TaskTool & { // Mock dependencies vi.mock('../subagents/subagent-manager.js'); -vi.mock('../subagents/subagent.js'); +vi.mock('../agents/runtime/agent-headless.js'); const MockedSubagentManager = vi.mocked(SubagentManager); const MockedContextState = vi.mocked(ContextState); @@ -80,7 +81,7 @@ describe('TaskTool', () => { mockSubagentManager = { listSubagents: vi.fn().mockResolvedValue(mockSubagents), loadSubagent: vi.fn(), - createSubagentScope: vi.fn(), + createAgentHeadless: vi.fn(), addChangeListener: vi.fn((listener: () => void) => { changeListeners.push(listener); return () => { @@ -293,14 +294,14 @@ describe('TaskTool', () => { }); describe('TaskToolInvocation', () => { - let mockSubagentScope: SubAgentScope; + let mockSubagentScope: AgentHeadless; let mockContextState: ContextState; beforeEach(() => { mockSubagentScope = { - runNonInteractive: vi.fn().mockResolvedValue(undefined), + execute: vi.fn().mockResolvedValue(undefined), result: 'Task completed successfully', - terminateMode: SubagentTerminateMode.GOAL, + terminateMode: AgentTerminateMode.GOAL, getFinalText: vi.fn().mockReturnValue('Task completed successfully'), formatCompactResult: vi .fn() @@ -317,7 +318,6 @@ describe('TaskTool', () => { inputTokens: 1000, outputTokens: 500, totalTokens: 1500, - estimatedCost: 0.045, toolUsage: [ { name: 'grep', @@ -344,8 +344,8 @@ describe('TaskTool', () => { successfulToolCalls: 3, failedToolCalls: 0, }), - getTerminateMode: vi.fn().mockReturnValue(SubagentTerminateMode.GOAL), - } as unknown as SubAgentScope; + getTerminateMode: vi.fn().mockReturnValue(AgentTerminateMode.GOAL), + } as unknown as AgentHeadless; 
mockContextState = { set: vi.fn(), @@ -356,7 +356,7 @@ describe('TaskTool', () => { vi.mocked(mockSubagentManager.loadSubagent).mockResolvedValue( mockSubagents[0], ); - vi.mocked(mockSubagentManager.createSubagentScope).mockResolvedValue( + vi.mocked(mockSubagentManager.createAgentHeadless).mockResolvedValue( mockSubagentScope, ); }); @@ -376,12 +376,12 @@ describe('TaskTool', () => { expect(mockSubagentManager.loadSubagent).toHaveBeenCalledWith( 'file-search', ); - expect(mockSubagentManager.createSubagentScope).toHaveBeenCalledWith( + expect(mockSubagentManager.createAgentHeadless).toHaveBeenCalledWith( mockSubagents[0], config, expect.any(Object), // eventEmitter parameter ); - expect(mockSubagentScope.runNonInteractive).toHaveBeenCalledWith( + expect(mockSubagentScope.execute).toHaveBeenCalledWith( mockContextState, undefined, // signal parameter (undefined when not provided) ); @@ -416,7 +416,7 @@ describe('TaskTool', () => { }); it('should handle execution errors gracefully', async () => { - vi.mocked(mockSubagentManager.createSubagentScope).mockRejectedValue( + vi.mocked(mockSubagentManager.createAgentHeadless).mockRejectedValue( new Error('Creation failed'), ); diff --git a/packages/core/src/tools/task.ts b/packages/core/src/tools/task.ts index e811dde0df..430d25a65f 100644 --- a/packages/core/src/tools/task.ts +++ b/packages/core/src/tools/task.ts @@ -18,22 +18,20 @@ import type { } from './tools.js'; import type { Config } from '../config/config.js'; import type { SubagentManager } from '../subagents/subagent-manager.js'; +import type { SubagentConfig } from '../subagents/types.js'; +import { AgentTerminateMode } from '../agents/runtime/agent-types.js'; +import { ContextState } from '../agents/runtime/agent-headless.js'; import { - type SubagentConfig, - SubagentTerminateMode, -} from '../subagents/types.js'; -import { ContextState } from '../subagents/subagent.js'; -import { - SubAgentEventEmitter, - SubAgentEventType, -} from 
'../subagents/subagent-events.js'; + AgentEventEmitter, + AgentEventType, +} from '../agents/runtime/agent-events.js'; import type { - SubAgentToolCallEvent, - SubAgentToolResultEvent, - SubAgentFinishEvent, - SubAgentErrorEvent, - SubAgentApprovalRequestEvent, -} from '../subagents/subagent-events.js'; + AgentToolCallEvent, + AgentToolResultEvent, + AgentFinishEvent, + AgentErrorEvent, + AgentApprovalRequestEvent, +} from '../agents/runtime/agent-events.js'; import { createDebugLogger } from '../utils/debugLogger.js'; export interface TaskParams { @@ -54,6 +52,7 @@ export class TaskTool extends BaseDeclarativeTool { private subagentManager: SubagentManager; private availableSubagents: SubagentConfig[] = []; + private readonly removeChangeListener: () => void; constructor(private readonly config: Config) { // Initialize with a basic schema first @@ -89,7 +88,7 @@ export class TaskTool extends BaseDeclarativeTool { ); this.subagentManager = config.getSubagentManager(); - this.subagentManager.addChangeListener(() => { + this.removeChangeListener = this.subagentManager.addChangeListener(() => { void this.refreshSubagents(); }); @@ -97,6 +96,10 @@ export class TaskTool extends BaseDeclarativeTool { this.refreshSubagents(); } + dispose(): void { + this.removeChangeListener(); + } + /** * Asynchronously initializes the tool by loading available subagents * and updating the description and schema. 
@@ -262,7 +265,7 @@ assistant: "I'm going to use the Task tool to launch the with the greeting-respo } class TaskToolInvocation extends BaseToolInvocation { - private readonly _eventEmitter: SubAgentEventEmitter; + readonly eventEmitter: AgentEventEmitter = new AgentEventEmitter(); private currentDisplay: TaskResultDisplay | null = null; private currentToolCalls: TaskResultDisplay['toolCalls'] = []; @@ -272,11 +275,6 @@ class TaskToolInvocation extends BaseToolInvocation { params: TaskParams, ) { super(params); - this._eventEmitter = new SubAgentEventEmitter(); - } - - get eventEmitter(): SubAgentEventEmitter { - return this._eventEmitter; } /** @@ -304,12 +302,12 @@ class TaskToolInvocation extends BaseToolInvocation { private setupEventListeners( updateOutput?: (output: ToolResultDisplay) => void, ): void { - this.eventEmitter.on(SubAgentEventType.START, () => { + this.eventEmitter.on(AgentEventType.START, () => { this.updateDisplay({ status: 'running' }, updateOutput); }); - this.eventEmitter.on(SubAgentEventType.TOOL_CALL, (...args: unknown[]) => { - const event = args[0] as SubAgentToolCallEvent; + this.eventEmitter.on(AgentEventType.TOOL_CALL, (...args: unknown[]) => { + const event = args[0] as AgentToolCallEvent; const newToolCall = { callId: event.callId, name: event.name, @@ -327,33 +325,30 @@ class TaskToolInvocation extends BaseToolInvocation { ); }); - this.eventEmitter.on( - SubAgentEventType.TOOL_RESULT, - (...args: unknown[]) => { - const event = args[0] as SubAgentToolResultEvent; - const toolCallIndex = this.currentToolCalls!.findIndex( - (call) => call.callId === event.callId, - ); - if (toolCallIndex >= 0) { - this.currentToolCalls![toolCallIndex] = { - ...this.currentToolCalls![toolCallIndex], - status: event.success ? 
'success' : 'failed', - error: event.error, - responseParts: event.responseParts, - }; + this.eventEmitter.on(AgentEventType.TOOL_RESULT, (...args: unknown[]) => { + const event = args[0] as AgentToolResultEvent; + const toolCallIndex = this.currentToolCalls!.findIndex( + (call) => call.callId === event.callId, + ); + if (toolCallIndex >= 0) { + this.currentToolCalls![toolCallIndex] = { + ...this.currentToolCalls![toolCallIndex], + status: event.success ? 'success' : 'failed', + error: event.error, + responseParts: event.responseParts, + }; - this.updateDisplay( - { - toolCalls: [...this.currentToolCalls!], - }, - updateOutput, - ); - } - }, - ); + this.updateDisplay( + { + toolCalls: [...this.currentToolCalls!], + }, + updateOutput, + ); + } + }); - this.eventEmitter.on(SubAgentEventType.FINISH, (...args: unknown[]) => { - const event = args[0] as SubAgentFinishEvent; + this.eventEmitter.on(AgentEventType.FINISH, (...args: unknown[]) => { + const event = args[0] as AgentFinishEvent; this.updateDisplay( { status: event.terminateReason === 'GOAL' ? 
'completed' : 'failed', @@ -363,8 +358,8 @@ class TaskToolInvocation extends BaseToolInvocation { ); }); - this.eventEmitter.on(SubAgentEventType.ERROR, (...args: unknown[]) => { - const event = args[0] as SubAgentErrorEvent; + this.eventEmitter.on(AgentEventType.ERROR, (...args: unknown[]) => { + const event = args[0] as AgentErrorEvent; this.updateDisplay( { status: 'failed', @@ -376,9 +371,9 @@ class TaskToolInvocation extends BaseToolInvocation { // Indicate when a tool call is waiting for approval this.eventEmitter.on( - SubAgentEventType.TOOL_WAITING_APPROVAL, + AgentEventType.TOOL_WAITING_APPROVAL, (...args: unknown[]) => { - const event = args[0] as SubAgentApprovalRequestEvent; + const event = args[0] as AgentApprovalRequestEvent; const idx = this.currentToolCalls!.findIndex( (c) => c.callId === event.callId, ); @@ -506,7 +501,7 @@ class TaskToolInvocation extends BaseToolInvocation { if (updateOutput) { updateOutput(this.currentDisplay); } - const subagentScope = await this.subagentManager.createSubagentScope( + const subagent = await this.subagentManager.createAgentHeadless( subagentConfig, this.config, { eventEmitter: this.eventEmitter }, @@ -517,13 +512,13 @@ class TaskToolInvocation extends BaseToolInvocation { contextState.set('task_prompt', this.params.prompt); // Execute the subagent (blocking) - await subagentScope.runNonInteractive(contextState, signal); + await subagent.execute(contextState, signal); // Get the results - const finalText = subagentScope.getFinalText(); - const terminateMode = subagentScope.getTerminateMode(); - const success = terminateMode === SubagentTerminateMode.GOAL; - const executionSummary = subagentScope.getExecutionSummary(); + const finalText = subagent.getFinalText(); + const terminateMode = subagent.getTerminateMode(); + const success = terminateMode === AgentTerminateMode.GOAL; + const executionSummary = subagent.getExecutionSummary(); if (signal?.aborted) { this.updateDisplay( diff --git 
a/packages/core/src/tools/tool-registry.ts b/packages/core/src/tools/tool-registry.ts index 5fccddb4b3..e2110810bb 100644 --- a/packages/core/src/tools/tool-registry.ts +++ b/packages/core/src/tools/tool-registry.ts @@ -209,6 +209,22 @@ export class ToolRegistry { this.tools.set(tool.name, tool); } + /** + * Copies discovered (non-core) tools from another registry into this one. + * Used to share MCP/command-discovered tools with per-agent registries + * that were built with skipDiscovery. + */ + copyDiscoveredToolsFrom(source: ToolRegistry): void { + for (const tool of source.getAllTools()) { + if ( + (tool instanceof DiscoveredTool || tool instanceof DiscoveredMCPTool) && + !this.tools.has(tool.name) + ) { + this.tools.set(tool.name, tool); + } + } + } + private removeDiscoveredTools(): void { for (const tool of this.tools.values()) { if (tool instanceof DiscoveredTool || tool instanceof DiscoveredMCPTool) { @@ -527,10 +543,20 @@ export class ToolRegistry { } /** - * Stops all MCP clients and cleans up resources. + * Stops all MCP clients, disposes tools, and cleans up resources. * This method is idempotent and safe to call multiple times. 
*/ async stop(): Promise<void> { + for (const tool of this.tools.values()) { + if ('dispose' in tool && typeof tool.dispose === 'function') { + try { + tool.dispose(); + } catch (error) { + debugLogger.error(`Error disposing tool ${tool.name}:`, error); + } + } + } + + try { await this.mcpClientManager.stop(); } catch (error) { diff --git a/packages/core/src/tools/tools.ts b/packages/core/src/tools/tools.ts index 05b488d123..2605a01056 100644 --- a/packages/core/src/tools/tools.ts +++ b/packages/core/src/tools/tools.ts @@ -9,7 +9,7 @@ import { ToolErrorType } from './tool-error.js'; import type { DiffUpdateResult } from '../ide/ide-client.js'; import type { ShellExecutionConfig } from '../services/shellExecutionService.js'; import { SchemaValidator } from '../utils/schemaValidator.js'; -import { type SubagentStatsSummary } from '../subagents/subagent-statistics.js'; +import { type AgentStatsSummary } from '../agents/runtime/agent-statistics.js'; import type { AnsiOutput } from '../utils/terminalSerializer.js'; /** @@ -447,7 +447,7 @@ export interface TaskResultDisplay { status: 'running' | 'completed' | 'failed' | 'cancelled'; terminateReason?: string; result?: string; - executionSummary?: SubagentStatsSummary; + executionSummary?: AgentStatsSummary; // If the subagent is awaiting approval for a tool call, // this contains the confirmation details for inline UI rendering. 
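Review note on the `ToolRegistry.stop()` hunk in `tool-registry.ts`: the new loop finds disposable tools by duck-typing (`'dispose' in tool`) and wraps each call in its own `try`/`catch`, so one failing tool does not stop the others from being cleaned up. Below is a minimal sketch of that pattern. `SketchTool`, `SketchRegistry`, and the `errors` array are hypothetical names invented for illustration; they are not part of this PR or of the real `ToolRegistry` API.

```typescript
// Hypothetical stand-ins for illustration only; not the real ToolRegistry.
interface SketchTool {
  name: string;
  dispose?: () => void;
}

class SketchRegistry {
  private readonly tools = new Map<string, SketchTool>();
  // Collected instead of logged so the behaviour is easy to inspect.
  readonly errors: string[] = [];

  register(tool: SketchTool): void {
    this.tools.set(tool.name, tool);
  }

  // Dispose every tool that exposes dispose(); a throwing tool is recorded
  // and skipped so the remaining tools are still disposed.
  stop(): void {
    for (const tool of this.tools.values()) {
      if (typeof tool.dispose === 'function') {
        try {
          tool.dispose();
        } catch (error) {
          this.errors.push(
            `Error disposing tool ${tool.name}: ${String(error)}`,
          );
        }
      }
    }
  }
}

const disposed: string[] = [];
const registry = new SketchRegistry();
registry.register({ name: 'task', dispose: () => { disposed.push('task'); } });
registry.register({ name: 'read-file' }); // no dispose(): skipped
registry.register({
  name: 'broken',
  dispose: () => {
    throw new Error('boom');
  },
});
registry.register({ name: 'late', dispose: () => { disposed.push('late'); } });
registry.stop();
```

In this sketch `late` is still disposed even though `broken` throws before it, which is the property the per-tool `try`/`catch` is there to provide.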
diff --git a/packages/core/src/utils/asyncMessageQueue.test.ts b/packages/core/src/utils/asyncMessageQueue.test.ts new file mode 100644 index 0000000000..fe54210333 --- /dev/null +++ b/packages/core/src/utils/asyncMessageQueue.test.ts @@ -0,0 +1,75 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +import { describe, it, expect } from 'vitest'; +import { AsyncMessageQueue } from './asyncMessageQueue.js'; + +describe('AsyncMessageQueue', () => { + it('should dequeue items in FIFO order', () => { + const queue = new AsyncMessageQueue(); + queue.enqueue('a'); + queue.enqueue('b'); + queue.enqueue('c'); + + expect(queue.dequeue()).toBe('a'); + expect(queue.dequeue()).toBe('b'); + expect(queue.dequeue()).toBe('c'); + }); + + it('should return null when empty', () => { + const queue = new AsyncMessageQueue(); + expect(queue.dequeue()).toBeNull(); + }); + + it('should return remaining items then null after drain()', () => { + const queue = new AsyncMessageQueue(); + queue.enqueue('x'); + queue.enqueue('y'); + + queue.drain(); + + expect(queue.dequeue()).toBe('x'); + expect(queue.dequeue()).toBe('y'); + expect(queue.dequeue()).toBeNull(); + }); + + it('should silently drop items enqueued after drain()', () => { + const queue = new AsyncMessageQueue(); + queue.drain(); + queue.enqueue('dropped'); + + expect(queue.size).toBe(0); + }); + + it('should track size accurately', () => { + const queue = new AsyncMessageQueue(); + expect(queue.size).toBe(0); + + queue.enqueue(1); + queue.enqueue(2); + expect(queue.size).toBe(2); + + queue.dequeue(); + expect(queue.size).toBe(1); + }); + + it('should report isDrained correctly', () => { + const queue = new AsyncMessageQueue(); + expect(queue.isDrained).toBe(false); + + queue.drain(); + expect(queue.isDrained).toBe(true); + }); + + it('should handle multiple sequential enqueue-dequeue cycles', () => { + const queue = new AsyncMessageQueue(); + + for (let i = 0; i < 5; i++) { + 
queue.enqueue(i); + expect(queue.dequeue()).toBe(i); + } + }); +}); diff --git a/packages/core/src/utils/asyncMessageQueue.ts b/packages/core/src/utils/asyncMessageQueue.ts new file mode 100644 index 0000000000..3268718ef0 --- /dev/null +++ b/packages/core/src/utils/asyncMessageQueue.ts @@ -0,0 +1,54 @@ +/** + * @license + * Copyright 2025 Qwen + * SPDX-License-Identifier: Apache-2.0 + */ + +/** + * @fileoverview Generic non-blocking message queue. + * + * Simple FIFO queue for producer/consumer patterns. Dequeue is + * non-blocking — returns null when empty. The consumer decides + * when and how to process items. + */ + +/** + * A generic non-blocking message queue. + * + * - `enqueue(item)` adds an item. Silently dropped after `drain()`. + * - `dequeue()` returns the next item, or `null` if empty. + * - `drain()` signals that no more items will be enqueued. + */ +export class AsyncMessageQueue<T> { + private items: T[] = []; + private drained = false; + + /** Add an item to the queue. Dropped silently after drain. */ + enqueue(item: T): void { + if (this.drained) return; + this.items.push(item); + } + + /** Remove and return the next item, or null if empty. */ + dequeue(): T | null { + if (this.items.length > 0) { + return this.items.shift()!; + } + return null; + } + + /** Signal that no more items will be enqueued. */ + drain(): void { + this.drained = true; + } + + /** Number of items currently in the queue. */ + get size(): number { + return this.items.length; + } + + /** Whether `drain()` has been called. 
*/ + get isDrained(): boolean { + return this.drained; + } +} diff --git a/packages/core/src/utils/atomicFileWrite.test.ts b/packages/core/src/utils/atomicFileWrite.test.ts new file mode 100644 index 0000000000..7d30caed0c --- /dev/null +++ b/packages/core/src/utils/atomicFileWrite.test.ts @@ -0,0 +1,63 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as fs from 'node:fs/promises'; +import * as path from 'node:path'; +import * as os from 'node:os'; +import { describe, it, expect, beforeEach, afterEach } from 'vitest'; +import { atomicWriteJSON } from './atomicFileWrite.js'; + +describe('atomicWriteJSON', () => { + let tmpDir: string; + + beforeEach(async () => { + tmpDir = await fs.mkdtemp(path.join(os.tmpdir(), 'atomic-write-test-')); + }); + + afterEach(async () => { + await fs.rm(tmpDir, { recursive: true, force: true }); + }); + + it('should write valid JSON to the target file', async () => { + const filePath = path.join(tmpDir, 'test.json'); + const data = { hello: 'world', count: 42 }; + + await atomicWriteJSON(filePath, data); + + const content = await fs.readFile(filePath, 'utf-8'); + expect(JSON.parse(content)).toEqual(data); + }); + + it('should pretty-print with 2-space indent', async () => { + const filePath = path.join(tmpDir, 'test.json'); + await atomicWriteJSON(filePath, { a: 1 }); + + const content = await fs.readFile(filePath, 'utf-8'); + expect(content).toBe(JSON.stringify({ a: 1 }, null, 2)); + }); + + it('should overwrite existing file atomically', async () => { + const filePath = path.join(tmpDir, 'test.json'); + await atomicWriteJSON(filePath, { version: 1 }); + await atomicWriteJSON(filePath, { version: 2 }); + + const content = await fs.readFile(filePath, 'utf-8'); + expect(JSON.parse(content)).toEqual({ version: 2 }); + }); + + it('should not leave temp files on success', async () => { + const filePath = path.join(tmpDir, 'test.json'); + await atomicWriteJSON(filePath, { ok: true 
}); + + const files = await fs.readdir(tmpDir); + expect(files).toEqual(['test.json']); + }); + + it('should throw if parent directory does not exist', async () => { + const filePath = path.join(tmpDir, 'nonexistent', 'test.json'); + await expect(atomicWriteJSON(filePath, {})).rejects.toThrow(); + }); +}); diff --git a/packages/core/src/utils/atomicFileWrite.ts b/packages/core/src/utils/atomicFileWrite.ts new file mode 100644 index 0000000000..e79a057382 --- /dev/null +++ b/packages/core/src/utils/atomicFileWrite.ts @@ -0,0 +1,72 @@ +/** + * @license + * Copyright 2025 Qwen Team + * SPDX-License-Identifier: Apache-2.0 + */ + +import * as crypto from 'node:crypto'; +import * as fs from 'node:fs/promises'; +import { isNodeError } from './errors.js'; + +export interface AtomicWriteOptions { + /** Number of rename retries on EPERM/EACCES (default: 3). */ + retries?: number; + /** Base delay in ms for exponential backoff (default: 50). */ + delayMs?: number; +} + +/** + * Atomically write a JSON value to a file. + * + * Writes to a temporary file first, then renames it to the target path. + * On POSIX `fs.rename` is atomic, so readers never see a partial file. + * On Windows the rename can fail with EPERM under concurrent access, + * so we retry with exponential backoff. + * + * The parent directory of `filePath` must already exist. + */ +export async function atomicWriteJSON( + filePath: string, + data: unknown, + options?: AtomicWriteOptions, +): Promise<void> { + const retries = options?.retries ?? 3; + const delayMs = options?.delayMs ?? 
50; + + const tmpPath = `${filePath}.${crypto.randomBytes(4).toString('hex')}.tmp`; + try { + await fs.writeFile(tmpPath, JSON.stringify(data, null, 2), 'utf-8'); + await renameWithRetry(tmpPath, filePath, retries, delayMs); + } catch (error) { + try { + await fs.unlink(tmpPath); + } catch { + // Ignore cleanup errors + } + throw error; + } +} + +async function renameWithRetry( + src: string, + dest: string, + retries: number, + delayMs: number, +): Promise<void> { + for (let attempt = 0; attempt <= retries; attempt++) { + try { + await fs.rename(src, dest); + return; + } catch (error: unknown) { + const isRetryable = + isNodeError(error) && + (error.code === 'EPERM' || error.code === 'EACCES'); + if (!isRetryable || attempt === retries) { + throw error; + } + await new Promise((resolve) => + setTimeout(resolve, delayMs * 2 ** attempt), + ); + } + } +} diff --git a/packages/core/src/utils/environmentContext.test.ts b/packages/core/src/utils/environmentContext.test.ts index 0b24a9b018..6c2258c78c 100644 --- a/packages/core/src/utils/environmentContext.test.ts +++ b/packages/core/src/utils/environmentContext.test.ts @@ -18,6 +18,7 @@ import { getEnvironmentContext, getDirectoryContextString, getInitialChatHistory, + stripStartupContext, } from './environmentContext.js'; import type { Config } from '../config/config.js'; import { getFolderStructure } from './getFolderStructure.js'; @@ -223,3 +224,76 @@ describe('getInitialChatHistory', () => { expect(history).toEqual([]); }); }); + +describe('stripStartupContext', () => { + it('should strip the env context + model ack from the start of history', () => { + const history: Content[] = [ + { role: 'user', parts: [{ text: 'This is the Qwen Code...' }] }, + { + role: 'model', + parts: [{ text: 'Got it. Thanks for the context!' 
}], + }, + { role: 'user', parts: [{ text: 'Hello' }] }, + { role: 'model', parts: [{ text: 'Hi there' }] }, + ]; + + const result = stripStartupContext(history); + expect(result).toEqual([ + { role: 'user', parts: [{ text: 'Hello' }] }, + { role: 'model', parts: [{ text: 'Hi there' }] }, + ]); + }); + + it('should return history unchanged when no startup context is present', () => { + const history: Content[] = [ + { role: 'user', parts: [{ text: 'Hello' }] }, + { role: 'model', parts: [{ text: 'Hi there' }] }, + ]; + + const result = stripStartupContext(history); + expect(result).toEqual(history); + }); + + it('should return empty array when history is only the startup context', () => { + const history: Content[] = [ + { role: 'user', parts: [{ text: 'This is the Qwen Code...' }] }, + { + role: 'model', + parts: [{ text: 'Got it. Thanks for the context!' }], + }, + ]; + + const result = stripStartupContext(history); + expect(result).toEqual([]); + }); + + it('should return history unchanged when it has fewer than 2 entries', () => { + expect(stripStartupContext([])).toEqual([]); + expect( + stripStartupContext([{ role: 'user', parts: [{ text: 'Hello' }] }]), + ).toEqual([{ role: 'user', parts: [{ text: 'Hello' }] }]); + }); + + it('should round-trip with getInitialChatHistory', async () => { + const mockConfig = { + getSkipStartupContext: vi.fn().mockReturnValue(false), + getWorkspaceContext: vi.fn().mockReturnValue({ + getDirectories: vi.fn().mockReturnValue(['/test/dir']), + }), + getFileService: vi.fn(), + }; + + const conversation: Content[] = [ + { role: 'user', parts: [{ text: 'Hello' }] }, + { role: 'model', parts: [{ text: 'Hi' }] }, + ]; + + const withStartup = await getInitialChatHistory( + mockConfig as unknown as Config, + conversation, + ); + const stripped = stripStartupContext(withStartup); + + expect(stripped).toEqual(conversation); + }); +}); diff --git a/packages/core/src/utils/environmentContext.ts 
b/packages/core/src/utils/environmentContext.ts index 4f5c03209d..4d6fe0ab7b 100644 --- a/packages/core/src/utils/environmentContext.ts +++ b/packages/core/src/utils/environmentContext.ts @@ -69,6 +69,8 @@ ${directoryContext} return [{ text: context }]; } +const STARTUP_CONTEXT_MODEL_ACK = 'Got it. Thanks for the context!'; + export async function getInitialChatHistory( config: Config, extraHistory?: Content[], @@ -87,8 +89,26 @@ export async function getInitialChatHistory( }, { role: 'model', - parts: [{ text: 'Got it. Thanks for the context!' }], + parts: [{ text: STARTUP_CONTEXT_MODEL_ACK }], }, ...(extraHistory ?? []), ]; } + +/** + * Strip the leading startup context (env-info user message + model ack) + * from a chat history. Used when forwarding a parent session's history + * to a child agent that will generate its own startup context for its + * own working directory. + */ +export function stripStartupContext(history: Content[]): Content[] { + if (history.length < 2) return history; + + const secondEntry = history[1]; + const ackText = secondEntry?.parts?.[0]?.text; + if (secondEntry?.role === 'model' && ackText === STARTUP_CONTEXT_MODEL_ACK) { + return history.slice(2); + } + + return history; +} diff --git a/packages/core/src/utils/terminalSerializer.ts b/packages/core/src/utils/terminalSerializer.ts index 7bcd2a4ce6..e12fe25aa5 100644 --- a/packages/core/src/utils/terminalSerializer.ts +++ b/packages/core/src/utils/terminalSerializer.ts @@ -131,17 +131,26 @@ class Cell { } } -export function serializeTerminalToObject(terminal: Terminal): AnsiOutput { +export function serializeTerminalToObject( + terminal: Terminal, + scrollOffset: number = 0, +): AnsiOutput { const buffer = terminal.buffer.active; - const cursorX = buffer.cursorX; - const cursorY = buffer.cursorY; const defaultFg = ''; const defaultBg = ''; + // Clamp scrollOffset to valid range [0, viewportY] + const clampedOffset = Math.max(0, Math.min(scrollOffset, buffer.viewportY)); + const startRow 
= buffer.viewportY - clampedOffset; + + // Only show cursor when viewing the live viewport (no scroll) + const cursorX = clampedOffset === 0 ? buffer.cursorX : -1; + const cursorY = clampedOffset === 0 ? buffer.cursorY : -1; + const result: AnsiOutput = []; for (let y = 0; y < terminal.rows; y++) { - const line = buffer.getLine(buffer.viewportY + y); + const line = buffer.getLine(startRow + y); const currentLine: AnsiLine = []; if (!line) { result.push(currentLine); diff --git a/packages/vscode-ide-companion/schemas/settings.schema.json b/packages/vscode-ide-companion/schemas/settings.schema.json index 8f56a9c89a..1d3b33692d 100644 --- a/packages/vscode-ide-companion/schemas/settings.schema.json +++ b/packages/vscode-ide-companion/schemas/settings.schema.json @@ -564,6 +564,51 @@ "type": "object", "additionalProperties": true }, + "agents": { + "description": "Settings for multi-agent collaboration features (Arena, Team, Swarm).", + "type": "object", + "properties": { + "displayMode": { + "description": "Display mode for multi-agent sessions. Currently only \"in-process\" is supported. Options: in-process", + "enum": [ + "in-process" + ] + }, + "arena": { + "description": "Settings for Arena (multi-model competitive execution).", + "type": "object", + "properties": { + "worktreeBaseDir": { + "description": "Custom base directory for Arena worktrees. Defaults to ~/.qwen/arena.", + "type": "string" + }, + "preserveArtifacts": { + "description": "When enabled, Arena worktrees and session state files are preserved after the session ends or the main agent exits.", + "type": "boolean", + "default": false + }, + "maxRoundsPerAgent": { + "description": "Maximum number of rounds (turns) each agent can execute. No limit if unset.", + "type": "number" + }, + "timeoutSeconds": { + "description": "Total timeout in seconds for the Arena session. 
No limit if unset.", + "type": "number" + } + } + }, + "team": { + "description": "Settings for Agent Team (role-based collaborative execution). Reserved for future use.", + "type": "object", + "additionalProperties": true + }, + "swarm": { + "description": "Settings for Agent Swarm (parallel sub-agent execution). Reserved for future use.", + "type": "object", + "additionalProperties": true + } + } + }, "hooksConfig": { "description": "Hook configurations for intercepting and customizing agent behavior.", "type": "object", @@ -718,6 +763,11 @@ } } }, + "experimental": { + "description": "Setting to enable experimental features", + "type": "object", + "properties": {} + }, "$version": { "type": "number", "description": "Settings schema version for migration tracking.",