feat(core): surface PreToolUse hook 'ask' as a TUI confirmation#5629
feat(core): surface PreToolUse hook 'ask' as a TUI confirmation#5629LaZzyMan wants to merge 2 commits into
Conversation
A PreToolUse hook returning permissionDecision:'ask' was treated the same as 'deny'. The hook fires in the execution phase (_executeToolCallBody), after the confirmation flow in _schedule has finished, so an 'ask' could only block the tool as EXECUTION_DENIED instead of prompting the user. Bounce the tool from the execution phase back to awaiting_approval when a hook asks: build a synthetic 'info' confirmation whose onConfirm routes through handleConfirmationResponse (ProceedOnce re-executes, Cancel cancels). PreToolUse keeps its "before execution" timing — only the 'ask' branch is new; 'denied'/'stop' keep deny-as-error. A non-interactive CLI or background agent cannot prompt, so 'ask' falls back to deny there. The re-execution after approval skips both the hook re-fire (no infinite re-ask loop) and the non-idempotent path-unescape prelude. A walk-away abort sets a terminal status so the turn cannot hang, and the tool span survives the bounce so it is finalized exactly once. Tests: add coverage for ask->awaiting_approval, approve->execute-once (no re-ask loop), decline->cancelled, non-interactive/background deny, walk-away abort, single span finalize, and no double path-unescape.
Code Coverage Summary
CLI Package - Full Text ReportCore Package - Full Text ReportFor detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
✅ Verification report — real end-to-end test of PR #5629Verdict: behavior matches the PR description exactly. Recommend merge. I built the real Test harness
Results
Key evidencePR build — interactive Merge-base build (A/B control) — same scenario, same hook, no prompt; auto-denied: The A/B swapped a single compiled file ( Non-interactive Coverage notes (honest scope)
No regressions or unexpected behavior observed. The change is well-scoped (one source file; deny/stop paths untouched) and the edge cases called out in the description hold up under a real run. 🇨🇳 中文版验证报告(点击展开)✅ 验证报告 —— 对 PR #5629 的真实端到端测试结论:实际行为与 PR 描述完全一致,建议合并。 我从本 PR 分支构建了真实的 测试装置
结果
关键证据PR 构建 —— 交互式 Merge-base 构建(A/B 对照)—— 同场景同钩子,无弹窗、被自动拒绝: A/B 仅替换单个编译产物( 非交互式 覆盖范围说明(如实标注)
未发现回归或异常行为。改动范围清晰(单一源文件;deny/stop 路径未改),描述中列出的边界情况在真实运行下均成立。 Verified by building the PR branch and driving the real CLI in tmux against a deterministic mock model; A/B against merge-base |
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
Hi @LaZzyMan — thanks for the PR and the detailed description of the behavior and edge cases. The concept here is solid.
However, the PR body doesn't follow the pull request template. The template requires specific headings that help reviewers quickly assess scope, risk, and how to verify:
- What this PR does / Why it's needed — the PR uses "Summary" and "Behavior" instead
- Reviewer Test Plan with "How to verify", "Evidence (Before & After)", and "Tested on" (OS matrix) — the PR has "Tests" but no reproduction steps, no before/after evidence, and no OS test matrix
- Risk & Scope — missing entirely (main risk/tradeoff, out-of-scope items, breaking changes)
- Linked Issues — missing (is there an issue this closes?)
- Chinese translation (
<details>block) — missing
Could you update the PR description to follow the template? It makes review significantly easier, especially the "How to verify" steps and before/after evidence — a hook bouncing a tool back to confirmation is a user-visible TUI change that reviewers need to see working.
Note: @wenshao has already left a substantive code review with important concerns (multi-tool batch hang, orphaned hook events, missing test coverage) that should also be addressed.
中文说明
感谢 PR 以及对行为和边界情况的详细描述,方向是好的。
但 PR 正文没有遵循 PR 模板。模板要求的几个关键部分缺失:
- What this PR does / Why it's needed — 目前用的是 "Summary" 和 "Behavior"
- Reviewer Test Plan 含 "How to verify"、"Evidence (Before & After)"、"Tested on"(操作系统矩阵)— 目前只有 "Tests",缺少复现步骤、前后对比证据和测试平台信息
- Risk & Scope — 完全缺失(主要风险/权衡、不在范围内的内容、破坏性变更)
- Linked Issues — 缺失(是否有关联的 issue?)
- 中文翻译(
<details>块)— 缺失
请按模板更新 PR 描述。特别是 "How to verify" 步骤和前后对比证据——hook 将工具弹回确认是用户可见的 TUI 变更,reviewer 需要看到实际运行效果。
另外,@wenshao 已经留下了实质性的代码审查意见(多工具并发挂起、孤立 hook 事件、测试覆盖缺失),也请一并处理。
— Qwen Code · qwen3.7-max
Round 1 review of #5629 surfaced edge cases in the bounce mechanism: - Multi-tool batch hang: a bounced tool approved while a sibling was still executing stayed stuck in 'scheduled'. attemptExecutionOfScheduledCalls now loops, re-checking for newly-scheduled bounce-approved calls after each batch drains. - Orphaned hook events: the post-approval re-execution generated a fresh tool_use_id, leaving PreToolUse(old)/PostToolUse(new) unpaired. Preserve and reuse the original id across the bounce. - ModifyWithEditor double-unescape: request.args is unescaped in place before the hook fires, so the ModifyWithEditor branch must skip its own unescape for a bounced tool (it would double-strip escaped metacharacters). - Missing signal.aborted re-check before bouncing: mirror the confirmation-phase guard so an aborted signal falls through to deny instead of flashing a confirmation nobody can answer. Tests: multi-tool-hang regression (RED before the loop fix), non-interactive STREAM_JSON and Zed bounce paths, and span-finalize assertions on the walk-away abort test.
Round 1 review addressed — 302f12fThanks for the thorough pass. Code fixes (
Declined as out of scope (this PR makes
PR description updated to follow the template (What / Why, Reviewer Test Plan with @wenshao's A/B evidence, Risk & Scope, Linked Issues #5424, 中文 block). |
qwen-code-ci-bot
left a comment
There was a problem hiding this comment.
— qwen3.7-max via Qwen Code /review
| const { callId, name: toolName } = scheduledCall.request; | ||
| const canonicalName = canonicalToolName(toolName); | ||
|
|
||
| this.bouncedAwaitingApproval.add(callId); |
There was a problem hiding this comment.
[Suggestion] autoApproveCompatiblePendingTools can silently override a hook's 'ask' decision
When a sibling tool is approved with ProceedAlways, autoApproveCompatiblePendingTools (line 4237) auto-approves all awaiting_approval tools — including those bounced by a PreToolUse 'ask'. It does not check bouncedAwaitingApproval, so the bounced tool gets auto-approved and re-executed with the PreToolUse hook entirely skipped (because isPostAskReexecution is true). The hook author's argument-aware 'ask' is silently defeated.
Scenario: hook returns decision:'ask' for bash rm -rf /. User approves a sibling read_file with ProceedAlways. The dangerous bash call is auto-approved, runs without hook re-evaluation.
| this.bouncedAwaitingApproval.add(callId); | |
| this.bouncedAwaitingApproval.add(callId); | |
| // NOTE: autoApproveCompatiblePendingTools (line 4237) must skip | |
| // callIds in bouncedAwaitingApproval to preserve the hook's 'ask' intent. |
— qwen3.7-max via Qwen Code /review
| const { callId, name: toolName } = scheduledCall.request; | ||
| const canonicalName = canonicalToolName(toolName); | ||
|
|
||
| this.bouncedAwaitingApproval.add(callId); |
There was a problem hiding this comment.
[Suggestion] Zero logging across the bounce lifecycle
bounceToAwaitingApprovalForAsk, the isPostAskReexecution consumption in _executeToolCallBody, and the bouncedAwaitingApproval cleanup in finalizeToolSpan have no debugLogger calls. The surrounding code has 29+ debugLogger calls across the file (e.g., autoApproveCompatiblePendingTools at line 4269, permission notification at line 3120). A tool stuck in awaiting_approval from a bounce is indistinguishable from a normal confirmation hold in logs.
Consider adding debugLogger.info at three points: (1) bounce creation (callId, toolName, reason), (2) marker consumption on re-execution, (3) terminal cleanup of bounce markers.
— qwen3.7-max via Qwen Code /review
| // the original deny-as-error behavior. | ||
| if ( | ||
| preHookResult.blockType === 'ask' && | ||
| !signal.aborted && |
There was a problem hiding this comment.
[Suggestion] Untested branch: !signal.aborted guard before bounce
The !signal.aborted check here ensures the code falls through to deny-as-error when the signal is already aborted. The existing "cancels a pending ask" test aborts after the bounce has happened — no test covers the case where the signal is aborted before the hook response arrives.
If this guard is accidentally removed, a tool could bounce into awaiting_approval on a dead signal, flashing a confirmation prompt nobody can answer. Consider adding a test that pre-aborts the AbortController before scheduling a tool whose hook returns 'ask', and asserts the tool ends as 'error' without entering awaiting_approval.
— qwen3.7-max via Qwen Code /review
| // completes the batch and the turn hangs. setStatusInternal | ||
| // no-ops on already-terminal/cleared calls, so this only affects | ||
| // a genuine walk-away-then-abort. | ||
| this.setStatusInternal( |
There was a problem hiding this comment.
[Suggestion] Redundant UI notifications on abort for already-terminal calls
setStatusInternal fires for every callId in the batch, including calls already in terminal states (success, error, cancelled). While setStatusInternal no-ops on the status mutation, it still calls notifyToolCallsUpdate() (triggering UI re-renders) and checkAndNotifyCompletion() on every invocation. For a batch of N tools where only 1 was bounced, this produces N-1 redundant UI updates during abort.
| this.setStatusInternal( | |
| const call = this.toolCalls.find(c => c.request.callId === callId); | |
| if (call && call.status !== 'success' && call.status !== 'error' && call.status !== 'cancelled') { | |
| this.setStatusInternal( | |
| callId, | |
| 'cancelled', | |
| 'Tool call cancelled by user.', | |
| ); | |
| } |
— qwen3.7-max via Qwen Code /review
| (normalizedArgs as Record<string, unknown>)[key] = unescapePath( | ||
| String(normalizedArgs[key]).trim(), | ||
| ); | ||
| if (!this.bouncedAwaitingApproval.has(callId)) { |
There was a problem hiding this comment.
[Suggestion] ModifyWithEditor after bounce path is untested
The bouncedAwaitingApproval.has(callId) guard that prevents double-unescaping of path args in the ModifyWithEditor flow has no corresponding test. The test suite covers the normal approval path and direct bounce approval, but not the combination of: PreToolUse 'ask' bounce → user selects ModifyWithEditor → tool has path args with escaped metacharacters.
Consider adding a test: schedule a tool with file_path: 'a\\\\ b', let the hook return 'ask', then approve with ToolConfirmationOutcome.ModifyWithEditor. Assert the path is unescaped exactly once.
— qwen3.7-max via Qwen Code /review
✅ Local tmux verification —
|
| Scenario | Before (a233394e5) |
After (this PR) |
|---|---|---|
| Interactive | ask → auto-denied, no prompt (x ListFiles) |
confirmation box → approve runs once / decline cancels |
Non-interactive -p |
denied | falls back to deny, exit 0, no hang |
1) Interactive — ask now renders a confirmation; approve runs the tool exactly once
╭─────────────────────────────────────────────────────────────────╮
│ ? ListFiles . ← │
│ │
│ Hook gate: please confirm running this tool before it executes. │
│ │
│ Do you want to proceed? │
│ │
│ › 1. Yes, allow once │
│ 2. No, suggest changes (esc) │
╰─────────────────────────────────────────────────────────────────╯
⠏ Waiting for user confirmation...
- The hook
permissionDecisionReasonis surfaced as the prompt body. - Only Yes, allow once / No, suggest changes — no “always allow” (
hideAlwaysAllowhonored, since the hook re-evaluates every call).
After Yes, allow once → tool executes once and the model answers:
✓ ListFiles .
Listed 2 item(s)
✦ ALL_DONE_SENTINEL
The mock received the directory listing as a tool result exactly once — the re-execution skips the PreToolUse hook, so there is no re-ask loop.
2) Interactive — decline cancels the tool
Selecting No, suggest changes on the same box:
- ListFiles .
- is the Canceled glyph (vs ✓ Success / x Error). The tool never ran and the TUI returned to the prompt with no hang.
3) Non-interactive (-p) — ask falls back to deny (control isolates the hook)
-p -y run |
tool result the model received | tool ran? | exit |
|---|---|---|---|
with ask hook |
Hook gate: please confirm running this tool… (deny) |
No | 0, no hang |
| control, no hook | Listed 2 item(s) … [DIR] .qwen, hook-ask.sh |
Yes | 0 |
stderr (hooked run): Tool "list_directory" requires user approval but cannot execute in non-interactive mode. The control proves the deny is caused by the hook ask, not by the tool itself.
4) Before/after (merge-base) — confirms the behavior change
Same hook + same mock on a233394e5:
x ListFiles {"path":"…/project"}
Hook gate: please confirm running this tool before it executes.
✦ ALL_DONE_SENTINEL
x = Error: ask was silently auto-denied with no prompt (the hook reason became the failure message; the tool never ran). This PR turns that into the interactive confirmation above.
Verdict
Every behavior in the Reviewer Test Plan reproduces on a real built CLI: interactive ask → native confirmation (approve runs once, no loop; decline cancels), and non-interactive ask → graceful deny with no hang. The merge-base A/B confirms the change is exactly the intended “ask was deny → now prompts”. LGTM for merge from a behavioral standpoint.
中文版(合并参考)
✅ 本地 tmux 真实验证 —— PreToolUse 的 ask → TUI 确认框
在真实构建的 CLI(npm ci && npm run build,head 302f12f0c,独立 worktree,Linux)上通过 tmux 驱动验证,配合确定性 mock OpenAI 服务器和一个真实的、返回 permissionDecision: "ask" 的 PreToolUse command hook,并与 merge-base a233394e5(无 ask-bounce 代码)做了 A/B 对比。
环境
- Hook —— 项目
.qwen/settings.json,matcher*,已预信任,输出:
{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"ask","permissionDecisionReason":"Hook gate: please confirm running this tool before it executes."}} - 模型 —— mock OpenAI:第一轮用户消息返回一个
list_directorytool call,随后返回最终文本(ALL_DONE_SENTINEL)。 - 使用
-y运行,使工具自身的确认被自动通过,让 hook 的ask成为唯一闸门(正常交互会话中getShouldAvoidPermissionPrompts()为false,所以-y不会抑制 bounce)。
结论一览
| 场景 | 改动前(a233394e5) |
改动后(本 PR) |
|---|---|---|
| 交互模式 | ask → 静默拒绝、无弹窗(x ListFiles) |
弹出确认框 → 批准执行一次 / 拒绝则取消 |
非交互 -p |
拒绝 | 回退为 deny,exit 0,不挂起 |
1) 交互模式 —— ask 现在弹出确认框,批准后工具恰好执行一次
确认框显示 hook 的 permissionDecisionReason,选项仅有 Yes, allow once / No, suggest changes,没有 “always allow”(hideAlwaysAllow 生效,因为 hook 每次都会重新评估)。选择 Yes, allow once 后工具执行一次(✓ ListFiles . / Listed 2 item(s)),mock 仅收到一次目录列表结果 —— 重执行会跳过 PreToolUse hook,没有重复询问循环。
2) 交互模式 —— 拒绝则取消
在同一确认框选择 No, suggest changes 后,工具显示 -(Canceled)—— 未执行,TUI 正常回到输入框,不挂起。
3) 非交互(-p)—— ask 回退为 deny
带 hook 时工具被拒绝(模型收到的 tool result 即 hook 文案,工具未执行),exit 0、不挂起;无 hook 的对照组工具正常执行(返回目录列表)。对照组证明该拒绝来自 hook ask,而非工具本身。
4) 改动前后对比(merge-base)
在 a233394e5 上用相同 hook + mock:工具显示 x(Error),ask 被静默自动拒绝、无任何弹窗(hook 文案变成失败信息,工具从未执行)。本 PR 将其改为上面的交互式确认。
结论
Reviewer Test Plan 中的所有行为都在真实构建的 CLI 上复现:交互式 ask → 原生确认(批准只执行一次、无循环;拒绝则取消),非交互 ask → 优雅 deny 且不挂起。merge-base A/B 印证了改动正是预期的“ask 原本=deny → 现在弹确认”。从行为角度 LGTM,可以合并。
Verified by @wenshao on Linux via a real built CLI + mock OpenAI server driven over tmux (head 302f12f0c vs merge-base a233394e5).
What this PR does
A
PreToolUsehook returningpermissionDecision: "ask"now surfaces a native TUI confirmation before the tool runs, instead of being treated like"deny". The hook fires in the execution phase (_executeToolCallBody), after the confirmation flow has already finished, so previously an"ask"could only block the tool as anEXECUTION_DENIEDerror. This change bounces the tool from the execution phase back into the existingawaiting_approvalflow: a syntheticinfoconfirmation whoseonConfirmroutes throughhandleConfirmationResponse(approve → re-execute, decline → cancelled).PreToolUsekeeps its "before execution" timing — only the"ask"branch is new;"denied"/"stop"keep deny-as-error, and the separatePermissionRequesthook is untouched. Non-interactive CLI and background agents cannot prompt, so"ask"falls back to deny there.Why it's needed
permissionDecision: "ask"is already part of the hook schema (PreToolUseHookOutput.isAsk()) and mirrors Claude Code, where anasklets the user confirm a tool in the TUI. In qwen-code it was silently equivalent todeny— the confirmation never appeared. This is the human-in-the-loop gate that external-injection / CI-driven workflows asked for (see #5424): push a proposed action and have the human approve it in the TUI before the agent acts.Reviewer Test Plan
How to verify
Configure a
PreToolUsecommand hook (matcher*) returning{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"ask","permissionDecisionReason":"confirm running this tool"}}, then drive the CLI so the model calls any tool. Interactive: a confirmation box appears showing the hook reason withYes, allow once/No, suggest changes(no "always allow", since the hook re-evaluates every call); approving runs the tool once, declining cancels it. Non-interactive (-p/ background agent): theaskfalls back to deny (tool_resultis_error), no hang.Evidence (Before & After)
@wenshao independently built the PR branch and ran a real tmux A/B against the merge-base with a deterministic mock model: #5629 (comment) — Before (merge-base): the same hook produced no prompt and the tool was auto-denied; After (this PR): the interactive
askrenders a confirmation, approve runs the tool exactly once (no re-ask loop), and non-interactive falls back to deny with exit 0.Tested on
macOS: unit tests (
coreToolScheduler.test.ts,toolHookTriggers.test.ts) plus @wenshao's real-CLI tmux run. Windows/Linux: covered by CI (cross-platform unit tests green).Environment (optional)
Unit tests only locally; @wenshao's run used a real built CLI + mock OpenAI server over tmux.
Risk & Scope
coreToolScheduler.ts); the bounce reuses the existingawaiting_approvalmachinery rather than adding a parallel path. Deny/stop and thePermissionRequesthook are unchanged.reasoncarries the message) — a separable enhancement.askpreviously denied; it now prompts (interactive) or still denies (non-interactive).Linked Issues
Relates to #5424.
中文说明
这个 PR 做了什么
PreToolUsehook 返回permissionDecision: "ask"时,现在会在工具执行前弹出原生 TUI 确认,而不再等同于"deny"。hook 在执行阶段(_executeToolCallBody)触发,此时确认流程已结束,所以之前"ask"只能把工具当成EXECUTION_DENIED拦掉。本改动把工具从执行阶段打回现有的awaiting_approval流程:构造一个info确认,其onConfirm走handleConfirmationResponse(批准→重执行,拒绝→取消)。PreToolUse保持"执行前"的触发时机——只有"ask"分支是新增的,"denied"/"stop"仍是 deny-as-error,独立的PermissionRequesthook 不受影响。非交互 CLI 和后台 agent 无法弹确认,"ask"在那里回退为 deny。为什么需要
permissionDecision: "ask"本就是 hook schema 的一部分(PreToolUseHookOutput.isAsk()),对齐 Claude Code——ask让用户在 TUI 里确认工具。但在 qwen-code 里它静默等同于deny,确认框从不出现。这正是外部注入 / CI 驱动工作流所需的人在环路闸门(见 #5424):推送一个待执行动作,让人在 TUI 批准后 agent 才执行。Reviewer Test Plan
如何验证
配置一个
PreToolUsecommand hook(matcher*)返回{"hookSpecificOutput":{"hookEventName":"PreToolUse","permissionDecision":"ask","permissionDecisionReason":"confirm running this tool"}},然后驱动 CLI 让模型调用任意工具。交互模式:弹出确认框显示 hook reason,选项Yes, allow once/No, suggest changes(无"始终允许",因为 hook 每次重新评估);批准后工具执行一次,拒绝则取消。非交互(-p/ 后台 agent):ask回退为 deny(tool_resultis_error),不挂起。证据(前后对比)
@wenshao 独立构建本 PR 分支,用真实 tmux + 确定性 mock 模型对 merge-base 做了 A/B:#5629 (comment) —— 前(merge-base):同样的 hook 没有弹窗,工具被自动拒绝;后(本 PR):交互式
ask弹出确认,批准后工具恰好执行一次(无重复询问循环),非交互回退为 deny 且 exit 0。测试平台
macOS:单元测试(
coreToolScheduler.test.ts、toolHookTriggers.test.ts)加 @wenshao 的真实 CLI tmux 运行。Windows/Linux:由 CI 覆盖(跨平台单测绿)。环境(可选)
本地仅单元测试;@wenshao 的运行用了真实构建的 CLI + mock OpenAI 服务器,通过 tmux 驱动。
风险与范围
coreToolScheduler.ts);bounce 复用现有awaiting_approval机制而非新增并行路径。Deny/stop 与PermissionRequesthook 不变。reason已承载文案)——属于可单独提的增强。ask之前是拒绝;现在交互模式弹确认、非交互仍拒绝。关联 Issue
Relates to #5424.