Skip to content

fix(cli): show full plan for gate failures#5077

Merged
wenshao merged 1 commit into
QwenLM:mainfrom
he-yufeng:fix/exit-plan-gate-display
Jun 13, 2026
Merged

fix(cli): show full plan for gate failures#5077
wenshao merged 1 commit into
QwenLM:mainfrom
he-yufeng:fix/exit-plan-gate-display

Conversation

@he-yufeng

Copy link
Copy Markdown
Contributor

What this PR does

Keeps the submitted plan visible when the autonomous plan approval gate does not approve execution. The blocked, needs_user, cap_escalation, and unavailable outcomes now return the same structured plan_summary display shape used by approved plans, with rejected: true and the gate details appended after the original plan.

Why it's needed

Fixes #5075. The approved path already sends { type: 'plan_summary', message, plan }, so the TUI can render the full plan. The non-approved gate paths returned only a short string such as Plan gate: blocked (1 finding(s)), which left users unable to read the plan that had just been rejected or blocked.

Reviewer Test Plan

How to verify

Trigger exit_plan_mode from an AUTO or YOLO pre-plan session where the plan gate returns a non-approved outcome. The CLI should still show the submitted plan body, followed by the gate findings or questions, instead of only showing a one-line gate status.

Evidence (Before & After)

Before: exitPlanMode.ts returned plain string returnDisplay values for gate blocked, needs_user, cap_escalation, and unavailable, so PlanSummaryDisplay never received the plan. After: exitPlanMode.test.ts covers all four non-approved gate outcomes and asserts that returnDisplay is a rejected plan_summary containing both the original plan text and the gate details.

Tested on

OS Status
🍏 macOS N/A
🪟 Windows ✅ tested
🐧 Linux N/A

Environment (optional)

Windows PowerShell, Node/npm from the repository dev setup. Validation commands: npm run test --workspace=packages/core -- src/tools/exitPlanMode.test.ts; npx prettier --check packages/core/src/tools/exitPlanMode.ts packages/core/src/tools/exitPlanMode.test.ts; npx eslint packages/core/src/tools/exitPlanMode.ts packages/core/src/tools/exitPlanMode.test.ts; npm run typecheck --workspace=packages/core; npm run build --workspace=packages/core.

Risk & Scope

  • Main risk or tradeoff: the rejected plan display now includes the gate details below a separator inside the plan summary body, so users see both the original plan and why the gate stopped it in one panel.
  • Not validated / out of scope: no live TUI screenshot was captured; this PR validates the tool result shape at the core layer.
  • Breaking changes / migration notes: none.

Linked Issues

Fixes #5075

中文说明

这个 PR 做了什么

当自动 plan approval gate 没有批准执行时,仍然保留并展示用户刚提交的完整 plan。blockedneeds_usercap_escalationunavailable 四种结果现在都会返回和 approved 路径一致的结构化 plan_summary,并设置 rejected: true,同时把 gate 的发现或问题追加在原 plan 后面。

为什么需要

修复 #5075。approved 路径已经返回 { type: 'plan_summary', message, plan },所以 TUI 能渲染完整 plan。但 gate 非批准路径之前只返回一行字符串,例如 Plan gate: blocked (1 finding(s)),导致用户看不到刚被阻止的完整 plan。

Reviewer Test Plan

如何验证

在 AUTO 或 YOLO pre-plan session 中触发 exit_plan_mode,并让 plan gate 返回非批准结果。CLI 应该展示完整 plan,并在后面展示 gate findings 或 questions,而不是只显示一行 gate 状态。

Before / After 证据

Before:exitPlanMode.tsblockedneeds_usercap_escalationunavailable 返回纯字符串 returnDisplayPlanSummaryDisplay 收不到 plan。After:exitPlanMode.test.ts 覆盖四种非批准 gate 结果,断言 returnDisplay 是 rejected plan_summary,且包含原始 plan 和 gate 细节。

测试环境

Windows PowerShell,本地 Node/npm 开发环境。已运行:npm run test --workspace=packages/core -- src/tools/exitPlanMode.test.tsnpx prettier --check packages/core/src/tools/exitPlanMode.ts packages/core/src/tools/exitPlanMode.test.tsnpx eslint packages/core/src/tools/exitPlanMode.ts packages/core/src/tools/exitPlanMode.test.tsnpm run typecheck --workspace=packages/corenpm run build --workspace=packages/core

风险与范围

  • 主要风险或取舍:rejected plan display 会在同一个 plan summary 面板里用分隔线追加 gate 细节,让用户同时看到原 plan 和被阻止原因。
  • 未验证 / 不在范围内:没有录制真实 TUI 截图;本 PR 在 core 层验证 tool result shape。
  • 破坏性变更 / 迁移说明:无。

@wenshao wenshao left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found. LGTM! ✅ — qwen3.7-max via Qwen Code /review

case 'unavailable':
case 'unavailable': {
const llmContent = formatUnavailableResponse(decision);
const message = `Plan gate: unavailable - ${decision.reason}`;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Suggestion] The unavailable message uses a plain hyphen (-, U+002D) where the original code used an em-dash (, U+2014). The old returnDisplay was `Plan gate: unavailable — ${decision.reason}` and formatUnavailableResponse() in planApprovalGate.ts still uses **unavailable** — ${decision.reason} with an em-dash, so the header and body now disagree on punctuation.

Suggested change
const message = `Plan gate: unavailable - ${decision.reason}`;
const message = `Plan gate: unavailable ${decision.reason}`;

Also update the test expectation for the unavailable case from 'Plan gate: unavailable - review model timed out' to 'Plan gate: unavailable — review model timed out'.

— qwen3.7-max via Qwen Code /review

@wenshao

wenshao commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

@qwen-code /triage

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Thanks for the PR!

Template looks good ✓ — all required headings present, bilingual body included.

On direction: this fixes a clear UX gap — when the plan gate blocks execution, the user is left staring at a one-liner with no way to read the plan that was just rejected. Directly aligned with the CLI's core experience. Linked issue #5075 confirms it's a real user problem.

On approach: scope is tight — 2 files, one extracted helper (buildRejectedGateDisplay), four call sites updated uniformly. The rejected: true flag on PlanResultDisplay is already wired into PlanSummaryDisplay (yellow vs green message color). No over-engineering, no speculative additions. This is about as minimal as it gets.

Moving on to code review. 🔍

中文说明

感谢贡献!

模板完整 ✓ — 所有必需标题齐全,包含中英双语正文。

方向:修复了一个明确的 UX 缺陷——plan gate 阻止执行后,用户只能看到一行摘要,无法阅读刚被拒绝的完整 plan。与 CLI 核心体验直接相关,关联 issue #5075 确认了这是真实用户问题。

方案:范围紧凑——2 个文件,提取一个 helper(buildRejectedGateDisplay),四个调用点统一更新。rejected: true 已在 PlanSummaryDisplay 中处理(黄色 vs 绿色消息颜色)。没有过度工程,没有多余添加。已经是最小化的方案了。

进入代码审查 🔍

Qwen Code · qwen3.7-max

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

Code Review

Clean, focused change. Two files, one extracted helper, four call sites updated uniformly.

What I verified:

  • Tests (27/27 exitPlanMode, 34/34 plan-gate, 36/36 ToolMessage): all pass. The new it.each parameterized tests cover all four non-approved outcomes (blocked, needs_user, cap_escalation, unavailable) and assert the plan_summary shape, rejected: true, plan content inclusion, gate detail inclusion, and that savePlan is never called.
  • Type safety: rejected?: boolean already exists on PlanResultDisplay — no type changes needed. PlanSummaryDisplay already renders yellow for rejected: true.
  • ESLint + Prettier + TypeScript: all clean.
  • Diff quality: buildRejectedGateDisplay is a straightforward helper that concatenates the plan with gate details behind a --- separator. No over-abstraction, no speculative code.

One minor observation (not a blocker): the unavailable case message changed from em dash () to hyphen-minus (-) in Plan gate: unavailable - ${reason}. Cosmetic only — the test validates the reason substring, not the exact separator character.

Real-Scenario Testing

Attempted to exercise the plan gate path via tmux with --approval-mode auto and -i (interactive). The CLI started correctly and the model loaded enter_plan_mode, but the plan gate requires an AI review model to produce a non-approved decision — this is non-deterministic and can't be forced in a live session without modifying source code.

Plan mode is also unavailable in headless/non-interactive mode, which rules out -o stream-json piped testing.

What we can confirm from tests instead:

$ cd packages/core && npx vitest run src/tools/exitPlanMode.test.ts

 ✓ src/tools/exitPlanMode.test.ts (27 tests) 19ms

 Test Files  1 passed (1)
      Tests  27 passed (27)
   Duration  6.72s

$ cd packages/core && npx vitest run src/plan-gate/

 ✓ src/plan-gate/gateReviewAgents.test.ts (12 tests) 9ms
 ✓ src/plan-gate/planApprovalGate.test.ts (22 tests) 14ms

 Test Files  2 passed (2)
      Tests  34 passed (34)
   Duration  7.24s

$ cd packages/cli && npx vitest run src/ui/components/messages/ToolMessage.test.tsx

 ✓ src/ui/components/messages/ToolMessage.test.tsx (36 tests) 217ms

 Test Files  1 passed (1)
      Tests  36 passed (36)
   Duration  10.49s

The PR author acknowledged this limitation: "no live TUI screenshot was captured; this PR validates the tool result shape at the core layer." Given that the change is purely at the data layer (structuring returnDisplay) and the rendering layer (PlanSummaryDisplay) already handles rejected: true, the unit test coverage is sufficient.

中文说明

代码审查

简洁、聚焦的改动。两个文件,提取一个 helper,四个调用点统一更新。

已验证:

  • 测试(exitPlanMode 27/27,plan-gate 34/34,ToolMessage 36/36): 全部通过。新增的 it.each 参数化测试覆盖四种非批准结果,断言 plan_summary 结构、rejected: true、plan 内容包含、gate 细节包含、以及 savePlan 未被调用。
  • 类型安全: rejected?: boolean 已存在于 PlanResultDisplay,无需类型变更。PlanSummaryDisplay 已对 rejected: true 渲染黄色。
  • ESLint + Prettier + TypeScript: 全部通过。
  • Diff 质量: buildRejectedGateDisplay 是直白的 helper,用 --- 分隔线连接 plan 和 gate 详情。无过度抽象,无多余代码。

一个非阻塞的小观察: unavailable 的 message 从 em dash()变成了 hyphen-minus(-)。纯视觉差异,测试验证的是 reason 子串而非精确分隔符。

真实场景测试

尝试通过 tmux 在 --approval-mode auto + -i 交互模式下触发 plan gate 路径。CLI 正常启动,模型加载了 enter_plan_mode,但 plan gate 需要 AI 审查模型做出非批准决策——这是非确定性的,无法在不修改源码的情况下强制触发。

Plan mode 在 headless/非交互模式下也不可用,排除了 -o stream-json 管道测试。

PR 作者也承认了这个限制:"没有录制真实 TUI 截图;本 PR 在 core 层验证 tool result shape。" 鉴于改动纯粹在数据层(构造 returnDisplay),而渲染层(PlanSummaryDisplay)已处理 rejected: true,单元测试覆盖已经足够。

Qwen Code · qwen3.7-max

@qwen-code-ci-bot

Copy link
Copy Markdown
Collaborator

This is a clean, well-scoped fix. Here's what I'm thinking after reviewing everything:

The PR's approach matches what I would have done independently — extract a helper, update four call sites, keep the change at the data layer. No surprises.

What convinces me:

  • The fix directly addresses the reported problem (ExitPlanMode 时,plan gate 报错,导致看不到 plan #5075): users can now read the full plan that was just rejected, instead of a one-liner.
  • The PlanSummaryDisplay component already handles rejected: true (yellow message color), so no TUI changes are needed — this PR just makes the data flow consistent.
  • 97 related tests pass, including 4 new parameterized tests covering every non-approved gate outcome.
  • The diff is +168/-11 across 2 files. For a fix that touches 4 switch cases, that's tight.
  • No security concerns, no breaking changes, no speculative additions.

The only thing I couldn't do is live tmux testing of a gate failure — the plan gate requires AI review model decisions that can't be forced deterministically. But the unit test coverage is thorough and the rendering layer was already prepared for this data shape. The risk of this change breaking something in production is very low.

Approving. ✅

中文说明

这是一个干净、范围合理的修复。

PR 的方案和我独立提出的方案一致——提取 helper、更新四个调用点、把改动限制在数据层。没有意外。

让我确信的原因:

  • 修复直接解决了报告的问题(ExitPlanMode 时,plan gate 报错,导致看不到 plan #5075):用户现在能看到刚被拒绝的完整 plan,而不是一行摘要。
  • PlanSummaryDisplay 组件已经处理 rejected: true(黄色消息),不需要 TUI 改动——本 PR 只是让数据流一致。
  • 97 个相关测试通过,包括 4 个新的参数化测试覆盖每种非批准 gate 结果。
  • Diff 为 +168/-11,跨 2 个文件。对于修改 4 个 switch case 的修复来说,很紧凑。
  • 无安全问题,无破坏性变更,无多余添加。

唯一没能做的是 gate 失败的真实 tmux 测试——plan gate 需要 AI 审查模型的决策,无法确定性触发。但单元测试覆盖充分,渲染层已经为这个数据形状做好了准备。这个改动在生产环境出问题的风险很低。

批准 ✅

Qwen Code · qwen3.7-max

@qwen-code-ci-bot qwen-code-ci-bot left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, looks ready to ship. ✅

@wenshao

wenshao commented Jun 13, 2026

Copy link
Copy Markdown
Collaborator

✅ Local verification report — real TUI render via tmux

Verdict: safe to merge. I reproduced the fix end-to-end on a real terminal, filling the exact gap the PR description flagged as out of scope ("no live TUI screenshot was captured; this PR validates the tool result shape at the core layer"). The core layer was already proven by the unit tests; what was missing was proof that the TUI consumer actually renders the new { type: 'plan_summary', rejected: true } shape as a full plan instead of choking on the string→object change. It does.

Environment

  • Branch fix/exit-plan-gate-display @ 9cc4a2ab9b
  • macOS (Darwin 25.5.0), Node v22.22.2, tmux 3.6a
  • Fills the 🍏 macOS — N/A row in the PR's test matrix.

Method

  1. Real producer. Drove the real ExitPlanModeTool.execute() through all four non-approved gate outcomes (blocked / needs_user / cap_escalation / unavailable) using the same mock seam as the PR's own test (runPlanApprovalGate — the gate logic is untouched by this PR), and dumped the exact returnDisplay objects the tool emits.
  2. Real consumer in a real TTY. Rendered the real PlanSummaryDisplay component (packages/cli/src/ui/components/PlanSummaryDisplay.tsx) with those exact objects inside a tmux pane (a genuine TTY, so ink paints as it would in production), then capture-pane'd the result. Dispatch path confirmed at ToolMessage.tsx:158-168 (type === 'plan_summary'PlanSummaryDisplay).

Evidence

1) Real render — blocked (the other three are equivalent; all show the full plan):

Plan gate: blocked (2 finding(s))          ← was the ONLY thing shown before this PR

Implementation Plan: add --max-retries flag

 1. Parse the flag in packages/cli/src/config/cli.ts (yargs .option).
 2. Thread it through Config.getMaxRetries() in core.
 3. Apply in the request retry loop in contentGenerator.ts.
 4. Add regression tests in cli.test.ts and contentGenerator.test.ts.
 5. Update docs in docs/cli/configuration.md.

---

Plan Approval Gate: blocked. The following issues must be resolved before the plan can be executed:

 - GF-1 [P2]: The plan omits the rollback path for a failed migration.
  Rationale: Autonomous execution would not know how to recover safely.
  Suggested fix: Add an explicit rollback step before exiting plan mode.
 - GF-2 [P3]: No upper bound is specified for --max-retries.
  ...
Revise the plan to address each finding, then call exit_plan_mode again. ...

Before the PR the user saw only the first line; now the full plan + --- separator + gate findings render. ✔ All four outcomes verified identically.

2) rejected: true is actually consumed (not just present). Capturing with ANSI escapes preserved, the header line is wrapped in \033[38;2;255;215;0m = RGB(255,215,0) = #FFD700. The active theme is QwenDark, whose AccentYellow = #FFD700 and AccentGreen = #AAD94C. So the header renders yellow, proving the rejected ? Colors.AccentYellow : Colors.AccentGreen branch (PlanSummaryDisplay.tsx:25) takes the rejected path — an approved plan would be green.

3) Tests + before/after regression proof.

  • exitPlanMode.test.ts: 27/27 pass on the PR branch.
  • Reverting only exitPlanMode.ts to its pre-PR version (HEAD~1) while keeping the new test file makes exactly the 4 new gate cases fail (4 failed | 23 passed); restoring the source → 27 pass again. This proves the new tests genuinely guard the regression rather than tautologically passing.

4) No string→object regression in other consumers. DaemonTuiAdapter.formatToolResultDisplay (DaemonTuiAdapter.ts:262-289) is object-aware and explicitly routes type === 'plan_summary' through sanitizeDaemonValue (the rejected boolean passes through), so the daemon / web-shell path is unaffected. No code path assumes returnDisplay is a string for this tool.

5) Convention consistency. The { type: 'plan_summary', …, rejected: true } shape is already produced by coreToolScheduler.ts:1202-1208 for user-rejected plans. This PR reuses that established shape for the autonomous-gate non-approval paths — it is not inventing a new field.

6) Static checks on macOS: prettier --check, eslint, and tsc --noEmit (core) all pass.

Minor, non-blocking nits

  • Dash inconsistency (unavailable): the new header uses a hyphen — Plan gate: unavailable - <reason> — while the body (formatUnavailableResponse) keeps an em-dash — unavailable — <reason>. Same panel, two different dashes. Purely cosmetic.
  • Verbosity: the user-facing plan now embeds the full model-directed llmContent, including instructions like "call exit_plan_mode again" and AskUserQuestion with metadata { source: … }. Informative, but slightly noisy for an end user. Optional follow-up, not a blocker.

Neither affects correctness. LGTM.

中文版

✅ 本地验证报告 —— 用 tmux 做真实 TUI 渲染

结论:可以合并。 我在真实终端里端到端复现了这个修复,补上了 PR 描述中明确标注的盲区("没有录制真实 TUI 截图;本 PR 在 core 层验证 tool result shape")。core 层已由单测证明,缺的是证明 TUI 消费端真的能把新的 { type: 'plan_summary', rejected: true } 形状渲染成完整 plan,而不是被 string→object 的改动打破。结论是:能。

环境

  • 分支 fix/exit-plan-gate-display @ 9cc4a2ab9b
  • macOS (Darwin 25.5.0)、Node v22.22.2、tmux 3.6a
  • 填补了 PR 测试矩阵里 🍏 macOS — N/A 这一行。

方法

  1. 真实生产端。 用和 PR 自带测试相同的 mock 缝(runPlanApprovalGate,gate 逻辑本 PR 没动)驱动真实的 ExitPlanModeTool.execute() 跑完四种非批准结果(blocked / needs_user / cap_escalation / unavailable),把 tool 实际产出的 returnDisplay 对象原样落盘。
  2. 真实消费端 + 真实 TTY。tmux pane(真 TTY,ink 按生产环境方式绘制)里,用上述真实对象渲染真实的 PlanSummaryDisplay 组件(packages/cli/src/ui/components/PlanSummaryDisplay.tsx),再 capture-pane 抓屏。分发路径在 ToolMessage.tsx:158-168 确认(type === 'plan_summary'PlanSummaryDisplay)。

证据

1) 真实渲染 —— blocked(其余三种等价,都显示完整 plan): 见上方英文 code block。本 PR 之前用户只能看到第一行 Plan gate: blocked (2 finding(s));现在完整 plan + --- 分隔线 + gate findings 全部渲染。四种结果均一致验证通过。

2) rejected: true 确实被消费(不只是存在)。 带 ANSI 转义抓屏,标题行被 \033[38;2;255;215;0m = RGB(255,215,0) = #FFD700 包裹。当前默认主题 QwenDarkAccentYellow = #FFD700AccentGreen = #AAD94C。所以标题渲染成黄色,证明走的是 PlanSummaryDisplay.tsx:25rejected ? AccentYellow : AccentGreen 中 rejected 分支——approved 会是绿色。

3) 单测 + before/after 回归证明。

  • exitPlanMode.test.ts:PR 分支 27/27 通过
  • exitPlanMode.ts 退回 PR 前版本(HEAD~1)、保留新测试文件,结果恰好那 4 个新 gate 用例失败4 failed | 23 passed);恢复源码后又全部通过。证明新测试是真正守护回归,而非恒真空过。

4) 其它消费方无 string→object 回归。 DaemonTuiAdapter.formatToolResultDisplayDaemonTuiAdapter.ts:262-289)是 object-aware 的,显式把 type === 'plan_summary' 交给 sanitizeDaemonValuerejected 布尔透传),daemon / web-shell 路径不受影响。没有任何代码假设此工具的 returnDisplay 是字符串。

5) 约定一致性。 { type: 'plan_summary', …, rejected: true } 这个形状早已coreToolScheduler.ts:1202-1208 用于用户手动拒绝 plan 的场景。本 PR 只是把同一形状扩展到自治 gate 非批准路径,并非新发明字段。

6) macOS 静态检查: prettier --checkeslinttsc --noEmit(core)全部通过。

次要、不阻塞的小问题

  • 破折号不一致(unavailable): 新标题用连字符 —— Plan gate: unavailable - <reason>,而正文(formatUnavailableResponse)仍用破折号 —— unavailable — <reason>。同一面板两种符号,纯外观。
  • 冗长: 面向用户的 plan 现在内嵌了面向模型的完整 llmContent,包含 "call exit_plan_mode again"AskUserQuestion with metadata { source: … } 之类指令。信息量够,但对终端用户略噪。可作为后续可选优化,不阻塞。

两点都不影响正确性。LGTM。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ExitPlanMode 时,plan gate 报错,导致看不到 plan

3 participants