Skip to content

feat: stream by default + response metadata footer#342

Merged
qhkm merged 5 commits intomainfrom
feat/stream-ux
Mar 13, 2026
Merged

feat: stream by default + response metadata footer#342
qhkm merged 5 commits intomainfrom
feat/stream-ux

Conversation

@qhkm
Copy link
Copy Markdown
Owner

@qhkm qhkm commented Mar 12, 2026

Summary

  • Streaming by default: CLI now streams responses by default. Use --no-stream to disable. Config agents.defaults.streaming changed from false to true.
  • Response metadata footer: After each response, prints a compact footer showing token count, tool calls, and wall time (e.g. ⠿ 1,247 tokens · 3 tool calls · 2.1s)

Changes

  • src/cli/mod.rs: Replace --stream flag with --no-stream
  • src/cli/agent.rs: Invert streaming logic, add metadata footer via MetricsCollector
  • src/cli/shimmer.rs: Add print_metadata_footer() and format_number_with_commas()
  • src/config/types.rs: Change streaming default to true

Test plan

  • All 3126 lib tests pass
  • Clippy clean
  • cargo fmt -- --check clean
  • Manual: zeptoclaw agent -m "Hello" streams by default
  • Manual: zeptoclaw agent --no-stream -m "Hello" disables streaming
  • Manual: metadata footer shows after response

Closes #341

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Conversations can pause for user clarifications via a new pause-for-input capability.
    • Added a response metadata footer showing tokens, tool calls, and elapsed time.
  • Changes

    • Streaming enabled by default; use --no-stream to disable.
    • Batch mode now requires explicit streaming flag to enable streaming.
    • Agent flow updated to signal final response readiness even in non-streaming paths.
  • Other

    • Safety checks narrowed for certain file-write/edit operations; docs and examples updated.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 12, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ec5d0111-ce8a-448c-8936-d32933fbe340

📥 Commits

Reviewing files that changed from the base of the PR and between 1537aea and 0f03a21.

📒 Files selected for processing (11)
  • CLAUDE.md
  • docs/superpowers/plans/2026-03-13-deep-research-and-clarification.md
  • docs/superpowers/specs/2026-03-13-deep-research-skill-and-clarification-tool-design.md
  • src/agent/loop.rs
  • src/cli/agent.rs
  • src/cli/batch.rs
  • src/cli/mod.rs
  • src/cli/shimmer.rs
  • src/config/types.rs
  • src/kernel/gate.rs
  • tests/integration.rs

📝 Walkthrough

Walkthrough

Adds a pause-for-input flag to tool outputs and wiring through tooling and the agent loop; introduces CLI streaming-on-by-default with a --no-stream flag and a response metadata footer; updates safety scan inputs for file tools; documentation and tests adjusted accordingly.

Changes

Cohort / File(s) Summary
Tooling types & builders
src/tools/mod.rs, src/tools/...
Adds public pause_for_input: bool to ToolOutput, ensures constructors set it false, and adds pub fn with_pause(mut self) -> Self to enable it.
Clarification tool & docs
docs/superpowers/specs/..., docs/superpowers/plans/..., src/tools/...
Spec and plan docs updated/added for deep-research/clarification tool; references adjusted (e.g., src/cli/batch.rs).
Agent loop & control flow
src/agent/loop.rs, src/kernel/gate.rs
AgentLoop streaming default now honors config; emits ResponseReady signal in non-streaming path; safety pre-scan for file tools changed to scan only path (write) or path + old_text (edit).
CLI entrypoints & flags
src/cli/mod.rs, src/cli/agent.rs, src/cli/batch.rs
Swaps stream: boolno_stream: bool in CLI surface and agent handler; batch mode no longer inherits config default (only explicit --stream enables streaming).
CLI shimmer & metadata footer
src/cli/shimmer.rs, CLAUDE.md
Adds pub fn print_metadata_footer(total_tokens, tool_calls, elapsed) and format_number_with_commas helper; updates examples/docs to use --no-stream.
Config & tests
src/config/types.rs, tests/..., tests/integration.rs
Flips AgentDefaults.streaming default to true; updates unit/integration tests to reflect new default and toggling behavior.
Manifest/docs references
docs/.../CLAUDE.md, manifest files
Small documentation reference and formatting tweaks in plans and CLAUDE.md; no public API changes beyond tooling flag and CLI signature edits.

Sequence Diagram

sequenceDiagram
    participant Agent
    participant ToolRegistry
    participant AskClarificationTool
    participant User

    Agent->>ToolRegistry: execute tool(name, params, ToolContext)
    ToolRegistry->>AskClarificationTool: run(params, context)
    AskClarificationTool->>AskClarificationTool: analyze if clarification needed
    alt clarification required
        AskClarificationTool-->>Agent: ToolOutput(..., pause_for_input=true)
        Agent->>Agent: detect pause_for_input -> pause processing
        Agent->>User: emit clarification prompt
        User->>Agent: send clarification input
        Agent->>ToolRegistry: resume execution with clarification
    else no clarification
        AskClarificationTool-->>Agent: ToolOutput(..., pause_for_input=false)
        Agent->>Agent: continue normal processing (stream or non-stream)
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I nibble at tokens, one by one,
If meaning blurs, I'll halt — not run.
A footer hums the count and time,
Then we resume the answer's rhyme.
Clarify once, then hop to fun.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Out of Scope Changes check ❓ Inconclusive Multiple changes beyond the stated objectives appear in the PR: deep-research skill documentation, ask_clarification tool implementation, tool output modifications, and safety scanning logic updates unrelated to streaming/footer requirements. Clarify whether deep-research skill, ask_clarification tool, and safety scanning changes are intentional parts of this PR or should be separated into distinct pull requests.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: stream by default + response metadata footer' accurately and concisely summarizes the two main changes: switching streaming to default and adding a metadata footer display.
Linked Issues check ✅ Passed The pull request implements all primary coding requirements from issue #341: streaming enabled by default with --no-stream flag, response metadata footer displaying tokens/tool calls/time, and proper use of existing MetricsCollector data.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/stream-ux
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/config/types.rs (1)

699-721: ⚠️ Potential issue | 🟠 Major

Shared default flip also turns on batch streaming.

This default is consumed by src/cli/batch.rs via stream || config.agents.defaults.streaming, so changing it to true makes zeptoclaw batch take the streaming path even when the user did not pass --stream. That changes batch behavior outside the agent CLI and leaves the batch flag/help semantics inconsistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/config/types.rs` around lines 699 - 721, AgentDefaults::default enables
streaming by setting streaming: true which unintentionally turns on streaming
for the global batch path (used via stream || config.agents.defaults.streaming
in src/cli/batch.rs); change the default to streaming: false in the
AgentDefaults::default implementation (types: AgentDefaults, method: default) so
batch behavior only follows an explicit --stream flag, and update or add a brief
comment near the streaming field explaining that the batch CLI relies on an
explicit flag rather than this default.
src/cli/agent.rs (1)

489-529: ⚠️ Potential issue | 🟠 Major

Interactive responses still never print the metadata footer.

The new footer is only emitted in the single-message branch; the REPL path at Lines 491-529 still finishes each turn without any metrics line. That misses the “after each response” behavior for the main interactive CLI flow.

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/superpowers/plans/2026-03-13-deep-research-and-clarification.md`:
- Around line 218-225: The fenced code block that shows the test command (the
block starting with ``` and containing "cargo nextest run --lib
test_tool_output_default_pause_false test_tool_output_with_pause
test_tool_output_split_with_pause" and "Expected: PASS") needs a language tag to
satisfy markdownlint MD040; update the opening fence to include a tag such as
```text or ```markdown so the block becomes e.g. ```text and the rest of the
content remains unchanged.
- Around line 240-242: Update the plan to point to the correct batch CLI
entrypoint by replacing the incorrect reference "src/batch.rs" with
"src/cli/batch.rs" so readers are directed to the actual implementation; ensure
any occurrences in the document (the modified files list and any later
references) consistently use "src/cli/batch.rs" to avoid confusion.

In
`@docs/superpowers/specs/2026-03-13-deep-research-skill-and-clarification-tool-design.md`:
- Around line 149-172: The spec snippet would overwrite existing semantics;
instead, add a new field pause_for_input: bool to the existing ToolOutput struct
(which already includes is_error and is_async) and ensure Default and all
existing constructors/builders (e.g., ToolOutput::llm_only, ToolOutput::new,
ToolOutput::error(), ToolOutput::async_task(), ToolOutput::split()) initialize
pause_for_input = false; also add a builder method ToolOutput::with_pause(self)
-> Self that sets pause_for_input = true and returns self so callers can opt in
without losing is_error/is_async semantics.

---

Outside diff comments:
In `@src/config/types.rs`:
- Around line 699-721: AgentDefaults::default enables streaming by setting
streaming: true which unintentionally turns on streaming for the global batch
path (used via stream || config.agents.defaults.streaming in src/cli/batch.rs);
change the default to streaming: false in the AgentDefaults::default
implementation (types: AgentDefaults, method: default) so batch behavior only
follows an explicit --stream flag, and update or add a brief comment near the
streaming field explaining that the batch CLI relies on an explicit flag rather
than this default.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9c514303-073c-4993-be58-a7597ccb25aa

📥 Commits

Reviewing files that changed from the base of the PR and between 397f4ec and 3ef4e16.

📒 Files selected for processing (6)
  • docs/superpowers/plans/2026-03-13-deep-research-and-clarification.md
  • docs/superpowers/specs/2026-03-13-deep-research-skill-and-clarification-tool-design.md
  • src/cli/agent.rs
  • src/cli/mod.rs
  • src/cli/shimmer.rs
  • src/config/types.rs

Comment on lines +149 to +172
### ToolOutput Change

**File:** `src/tools/mod.rs` (or `src/tools/types.rs`, wherever `ToolOutput` is defined)

Add field:
```rust
pub struct ToolOutput {
pub for_llm: String,
pub for_user: Option<String>,
pub pause_for_input: bool, // NEW — default false
}
```

Add builder method:
```rust
impl ToolOutput {
pub fn with_pause(mut self) -> Self {
self.pause_for_input = true;
self
}
}
```

Ensure `Default` and existing constructors (`llm_only`, `new`, etc.) set `pause_for_input: false`.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

The ToolOutput example is incomplete enough to mislead implementation.

This snippet looks like a full replacement, but the real type also carries is_error and is_async, and its existing builders include error(), async_task(), and split(). Please phrase this as “add pause_for_input to the existing struct/builders” or show the full current shape so nobody drops those semantics by following the spec literally.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/superpowers/specs/2026-03-13-deep-research-skill-and-clarification-tool-design.md`
around lines 149 - 172, The spec snippet would overwrite existing semantics;
instead, add a new field pause_for_input: bool to the existing ToolOutput struct
(which already includes is_error and is_async) and ensure Default and all
existing constructors/builders (e.g., ToolOutput::llm_only, ToolOutput::new,
ToolOutput::error(), ToolOutput::async_task(), ToolOutput::split()) initialize
pause_for_input = false; also add a builder method ToolOutput::with_pause(self)
-> Self that sets pause_for_input = true and returns self so callers can opt in
without losing is_error/is_async semantics.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/kernel/gate.rs (1)

125-442: Missing test coverage for the new write_file/edit_file path-only scanning behavior.

The change introduces special-case behavior for file-writing tools, but no test verifies:

  1. That write_file content containing shell patterns (e.g., $(cmd)) is not blocked.
  2. That write_file path containing malicious patterns is blocked.
  3. That other tools still scan full input.
🧪 Suggested test to add
#[tokio::test]
async fn test_write_file_scans_path_only_not_content() {
    let registry = setup_registry();
    let metrics = MetricsCollector::new();
    let ctx = ToolContext::default();
    let mut safety_config = SafetyConfig::default();
    safety_config.enabled = true;
    let safety = SafetyLayer::new(safety_config);

    // Content with shell injection pattern should NOT be blocked
    // (path-only scanning for write_file)
    let result = execute_tool(
        &registry,
        "write_file",
        json!({
            "path": "/tmp/script.sh",
            "content": "#!/bin/bash\necho $(whoami)"
        }),
        &ctx,
        Some(&safety),
        &metrics,
        None,
    )
    .await
    .unwrap();

    // Should not be blocked by safety (tool may fail for other reasons)
    assert!(
        !result.for_llm.contains("blocked by safety"),
        "write_file content should not trigger shell_injection policy"
    );
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/gate.rs` around lines 125 - 442, Tests are missing for the new
write_file path-only safety scanning; add unit tests exercising execute_tool to
verify (1) write_file does NOT block when content contains shell patterns (e.g.,
"$(cmd)"), (2) write_file DOES block when the path contains malicious patterns,
and (3) other tools (e.g., "echo" or "shell_execute") still scan full input; use
SafetyLayer::new(SafetyConfig::default()) with enabled=true, call execute_tool
with the appropriate tool name and json payloads (checking "path" vs "content"),
unwrap the Ok result and assert on result.for_llm contains/does-not-contain
"blocked by safety" and that metrics (MetricsCollector) still increment as
expected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/kernel/gate.rs`:
- Around line 58-62: The code currently builds the path with
input.get("path").and_then(|v| v.as_str()).unwrap_or("").to_string(), which
silently yields an empty string for missing/non-string values; update the
handling in the function that constructs the "path" (the input.get("path")
chain) to detect None or non-string results and emit a warning (e.g., via
log::warn! or tracing::warn!) that the tool input shape is unexpected, and
either propagate an Option/Result or explicitly abort/return an error instead of
proceeding with an empty string so the safety pre-check cannot be trivially
bypassed.

---

Nitpick comments:
In `@src/kernel/gate.rs`:
- Around line 125-442: Tests are missing for the new write_file path-only safety
scanning; add unit tests exercising execute_tool to verify (1) write_file does
NOT block when content contains shell patterns (e.g., "$(cmd)"), (2) write_file
DOES block when the path contains malicious patterns, and (3) other tools (e.g.,
"echo" or "shell_execute") still scan full input; use
SafetyLayer::new(SafetyConfig::default()) with enabled=true, call execute_tool
with the appropriate tool name and json payloads (checking "path" vs "content"),
unwrap the Ok result and assert on result.for_llm contains/does-not-contain
"blocked by safety" and that metrics (MetricsCollector) still increment as
expected.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c858048a-48e9-47c7-8284-2922d09ff0fa

📥 Commits

Reviewing files that changed from the base of the PR and between 3ef4e16 and 3abb147.

📒 Files selected for processing (2)
  • src/cli/agent.rs
  • src/kernel/gate.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/cli/agent.rs

Comment on lines +58 to +62
input
.get("path")
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Edge case: missing "path" field results in empty-string safety scan.

If input.get("path") returns None or a non-string value, unwrap_or("") produces an empty string that trivially passes all safety checks. While the tool execution will fail later (missing required argument), the safety pre-check is effectively skipped.

This is low-risk since the tool would reject the input anyway, but consider logging a warning for unexpected tool input shape.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/gate.rs` around lines 58 - 62, The code currently builds the path
with input.get("path").and_then(|v| v.as_str()).unwrap_or("").to_string(), which
silently yields an empty string for missing/non-string values; update the
handling in the function that constructs the "path" (the input.get("path")
chain) to detect None or non-string results and emit a warning (e.g., via
log::warn! or tracing::warn!) that the tool input shape is unexpected, and
either propagate an Option/Result or explicitly abort/return an error instead of
proceeding with an empty string so the safety pre-check cannot be trivially
bypassed.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cli/batch.rs (1)

18-25: ⚠️ Potential issue | 🟡 Minor

Include cargo test --doc output before merge

All four pre-push checks are required per coding guidelines. While this file has no doc tests, cargo test --doc must still be run and results shown. Add the output alongside fmt/clippy/lib test results.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/batch.rs` around lines 18 - 25, The PR is missing the required "cargo
test --doc" results; run cargo test --doc in the repository root and capture its
output, then add that output alongside the existing fmt/clippy/lib test results
in the PR (or CI artifact) before merging; ensure the CI workflow that reports
fmt/clippy/lib also runs cargo test --doc so future PRs include these results
automatically. Reference: cmd_batch function is unchanged, this change is only
to the PR/CI reporting (no code edits required to cmd_batch).
♻️ Duplicate comments (1)
src/kernel/gate.rs (1)

58-69: ⚠️ Potential issue | 🟡 Minor

Malformed path/old_text still collapses to empty/partial scan input.

unwrap_or("") for missing/non-string fields weakens the pre-check by silently scanning an empty/partial string. This concern was previously raised and is still present.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/gate.rs` around lines 58 - 69, The current scan builds input with
input.get(...).and_then(|v| v.as_str()).unwrap_or("") which lets missing or
non-string "path" or "old_text" collapse to empty/partial strings; change the
"edit_file" branch to validate both fields explicitly (e.g., check
input.get("path").and_then(|v| v.as_str()) and
input.get("old_text").and_then(|v| v.as_str()) return Some), and if either is
missing or not a string, skip scanning (or return an explicit error/None)
instead of using empty defaults; update the variables referenced in this branch
(path, old_text, and the format!("{} {}", path, old_text) construction) so
scanning only occurs when both values are present and valid.
🧹 Nitpick comments (2)
src/cli/shimmer.rs (1)

186-215: Extract footer string-building into a pure formatter for easier testing.

print_metadata_footer currently combines formatting + I/O; splitting these improves testability and keeps this path composable.

Refactor sketch
+fn format_metadata_footer(total_tokens: u64, tool_calls: u64, elapsed: std::time::Duration) -> Option<String> {
+    if total_tokens == 0 && tool_calls == 0 {
+        return None;
+    }
+
+    let mut parts = Vec::with_capacity(3);
+    if total_tokens > 0 {
+        parts.push(format!("{} tokens", format_number_with_commas(total_tokens)));
+    }
+    if tool_calls > 0 {
+        let label = if tool_calls == 1 { "tool call" } else { "tool calls" };
+        parts.push(format!("{} {}", format_number_with_commas(tool_calls), label));
+    }
+    parts.push(fmt_elapsed(elapsed.as_millis() as u64));
+    Some(parts.join(" \x1b[38;5;240m·\x1b[38;5;245m "))
+}
+
 pub fn print_metadata_footer(total_tokens: u64, tool_calls: u64, elapsed: std::time::Duration) {
-    // Skip footer entirely when there is nothing meaningful to report
-    if total_tokens == 0 && tool_calls == 0 {
-        return;
-    }
-
-    let mut parts = Vec::with_capacity(3);
-    ...
-    eprintln!();
-    eprintln!(
-        "  \x1b[38;5;245m⠿ {}\x1b[0m",
-        parts.join(" \x1b[38;5;240m·\x1b[38;5;245m ")
-    );
+    if let Some(line) = format_metadata_footer(total_tokens, tool_calls, elapsed) {
+        eprintln!();
+        eprintln!("  \x1b[38;5;245m⠿ {}\x1b[0m", line);
+    }
 }

As per coding guidelines: "Keep Rust changes minimal and focused; prefer small composable functions over large blocks."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cli/shimmer.rs` around lines 186 - 215, The print_metadata_footer
function mixes string construction and I/O, making it hard to test; extract the
footer string-building into a pure function (e.g., format_metadata_footer) that
accepts total_tokens: u64, tool_calls: u64, elapsed: Duration and returns the
formatted footer String; reuse existing helpers format_number_with_commas and
fmt_elapsed inside that function to build the same parts vector and join with
the same separators, then have print_metadata_footer call format_metadata_footer
and only perform the eprintln! calls (including the leading blank line and color
codes) so the formatting logic can be unit tested independently.
src/kernel/gate.rs (1)

47-73: Add regression tests for write/edit scan coverage to prevent future safety regressions.

Given this is security-gate logic, please add tests that assert which fields are included in pre-exec scan for write_file and edit_file (including malformed input shapes).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/gate.rs` around lines 47 - 73, Add unit tests that exercise the
pre-exec scanning branch around the safety layer: construct calls that set
safety = Some(... ) and invoke the logic paths for "write_file" and "edit_file"
to assert the exact scan_input passed to safety_layer.scan (for "write_file"
only the "path" field string; for "edit_file" the concatenation of "path" and
"old_text" and explicitly excluding "new_text"). Include malformed shapes
(missing "path", missing "old_text", non-string fields) to verify fallback
behavior (empty string or default serialization) and assert the
CheckDirection::Input scan is invoked with the expected string; reference the
scan_input variable, the "write_file"/"edit_file" branches, safety_layer.scan,
and CheckDirection::Input when locating code to test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/kernel/gate.rs`:
- Around line 53-70: The scan currently omits the file body for "write_file" and
"edit_file" (only scanning "path" and "old_text"), which lets dangerous payloads
bypass checks; update the match arms for name == "write_file" and "edit_file"
(and the local variables path/old_text) to also include the file-body field
(e.g., "content" or "new_text" as used by callers) in the constructed scan_input
string so the actual data being written is scanned (for example, append
input.get("new_text").and_then(|v| v.as_str()).unwrap_or("") to the string in
both branches), keeping existing behavior for path/old_text while adding the
body to the scanned input.

---

Outside diff comments:
In `@src/cli/batch.rs`:
- Around line 18-25: The PR is missing the required "cargo test --doc" results;
run cargo test --doc in the repository root and capture its output, then add
that output alongside the existing fmt/clippy/lib test results in the PR (or CI
artifact) before merging; ensure the CI workflow that reports fmt/clippy/lib
also runs cargo test --doc so future PRs include these results automatically.
Reference: cmd_batch function is unchanged, this change is only to the PR/CI
reporting (no code edits required to cmd_batch).

---

Duplicate comments:
In `@src/kernel/gate.rs`:
- Around line 58-69: The current scan builds input with
input.get(...).and_then(|v| v.as_str()).unwrap_or("") which lets missing or
non-string "path" or "old_text" collapse to empty/partial strings; change the
"edit_file" branch to validate both fields explicitly (e.g., check
input.get("path").and_then(|v| v.as_str()) and
input.get("old_text").and_then(|v| v.as_str()) return Some), and if either is
missing or not a string, skip scanning (or return an explicit error/None)
instead of using empty defaults; update the variables referenced in this branch
(path, old_text, and the format!("{} {}", path, old_text) construction) so
scanning only occurs when both values are present and valid.

---

Nitpick comments:
In `@src/cli/shimmer.rs`:
- Around line 186-215: The print_metadata_footer function mixes string
construction and I/O, making it hard to test; extract the footer string-building
into a pure function (e.g., format_metadata_footer) that accepts total_tokens:
u64, tool_calls: u64, elapsed: Duration and returns the formatted footer String;
reuse existing helpers format_number_with_commas and fmt_elapsed inside that
function to build the same parts vector and join with the same separators, then
have print_metadata_footer call format_metadata_footer and only perform the
eprintln! calls (including the leading blank line and color codes) so the
formatting logic can be unit tested independently.

In `@src/kernel/gate.rs`:
- Around line 47-73: Add unit tests that exercise the pre-exec scanning branch
around the safety layer: construct calls that set safety = Some(... ) and invoke
the logic paths for "write_file" and "edit_file" to assert the exact scan_input
passed to safety_layer.scan (for "write_file" only the "path" field string; for
"edit_file" the concatenation of "path" and "old_text" and explicitly excluding
"new_text"). Include malformed shapes (missing "path", missing "old_text",
non-string fields) to verify fallback behavior (empty string or default
serialization) and assert the CheckDirection::Input scan is invoked with the
expected string; reference the scan_input variable, the "write_file"/"edit_file"
branches, safety_layer.scan, and CheckDirection::Input when locating code to
test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d9c6438e-fa87-4d83-9e5d-74428aca8be9

📥 Commits

Reviewing files that changed from the base of the PR and between 6534308 and 1537aea.

📒 Files selected for processing (5)
  • CLAUDE.md
  • src/agent/loop.rs
  • src/cli/batch.rs
  • src/cli/shimmer.rs
  • src/kernel/gate.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/agent/loop.rs

Comment on lines +53 to +70
let scan_input = match name {
"write_file" => {
// Only scan the "path" field — file body text legitimately
// contains code patterns (backticks, `$(...)`, etc.) that
// would false-positive on shell_injection.
input
.get("path")
.and_then(|v| v.as_str())
.unwrap_or("")
.to_string()
}
"edit_file" => {
// Scan path + old_text (user-supplied search input) but skip
// new_text (file body text that legitimately contains code patterns).
let path = input.get("path").and_then(|v| v.as_str()).unwrap_or("");
let old_text = input.get("old_text").and_then(|v| v.as_str()).unwrap_or("");
format!("{} {}", path, old_text)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Excluding content/new_text from safety scan creates a real bypass.

For write_file/edit_file, the gate no longer scans the actual file body being written. That skips policy/leak/injection checks on the highest-risk part of the payload and allows unsafe content to pass pre-exec screening.

🔧 Proposed fix
-        let scan_input = match name {
-            "write_file" => {
-                input
-                    .get("path")
-                    .and_then(|v| v.as_str())
-                    .unwrap_or("")
-                    .to_string()
-            }
-            "edit_file" => {
-                let path = input.get("path").and_then(|v| v.as_str()).unwrap_or("");
-                let old_text = input.get("old_text").and_then(|v| v.as_str()).unwrap_or("");
-                format!("{} {}", path, old_text)
-            }
-            _ => serde_json::to_string(&input).unwrap_or_default(),
-        };
+        let scan_input = serde_json::to_string(&input).unwrap_or_default();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/kernel/gate.rs` around lines 53 - 70, The scan currently omits the file
body for "write_file" and "edit_file" (only scanning "path" and "old_text"),
which lets dangerous payloads bypass checks; update the match arms for name ==
"write_file" and "edit_file" (and the local variables path/old_text) to also
include the file-body field (e.g., "content" or "new_text" as used by callers)
in the constructed scan_input string so the actual data being written is scanned
(for example, append input.get("new_text").and_then(|v|
v.as_str()).unwrap_or("") to the string in both branches), keeping existing
behavior for path/old_text while adding the body to the scanned input.

@qhkm
Copy link
Copy Markdown
Owner Author

qhkm commented Mar 13, 2026

Responses to CodeRabbit findings

1. MD040 bare fenced blocks → Fixed ✅

Added text language tags.

2. Wrong batch path → Fixed ✅

src/batch.rssrc/cli/batch.rs.

3. ToolOutput example incomplete → Fixed ✅

Clarified spec that the change is additive to the existing struct.

4. Missing path field warning → Fixed ✅

Added tracing::warn when write_file/edit_file input is missing the path field.

5. "Critical: content bypass" → Intentional, not a bug

This is by design. The safety gate's shell_injection policy regex matches $(...) and backtick patterns — which are perfectly legitimate in code files being written by the agent. Before this fix, write_file was blocked from writing Python/JS code that contained $(command) or `backtick` patterns.

The threat model:

  • File content is LLM-generated, not user input — prompt injection is checked at the inbound message level (agent loop), not at write time
  • Path traversal is still scanned — the path field goes through the full safety pipeline
  • Leak detection still runs on output (step 4) — secrets in tool output are caught post-execution
  • Taint tracking (step 2) independently blocks execution of tainted content

Reverting this would re-break the agent's ability to write code files containing shell-like syntax.

qhkm and others added 5 commits March 13, 2026 17:11
- Flip CLI from --stream (opt-in) to streaming-by-default with --no-stream
- Change AgentDefaults.streaming default from false to true
- Add compact metadata footer after responses: tokens, tool calls, wall time
- Add format_number_with_commas() helper for readable token counts

Closes #341

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two fixes uncovered by the coding benchmark:

1. CLI agent now overrides workspace to CWD when the default
   (~/.zeptoclaw/workspace) is unchanged, so file tools resolve relative
   paths against the directory the user ran the command from.

2. Safety input scanning for write_file/edit_file now only checks the
   "path" field, not file body content. Code containing patterns like
   `$(...)` or backticks was false-positiving on the shell_injection
   policy rule.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
AgentLoop constructors hardcoded streaming to false instead of reading
config.agents.defaults.streaming. Updated both constructors and tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Emit ResponseReady feedback in streaming path (was missing, causing
  shimmer spinner to never clear after tool use)
- Batch mode no longer inherits streaming default (avoids session-save
  race conditions; only streams when explicitly --stream)
- Safety scan now includes old_text for edit_file (path-only scan let
  user-supplied search input bypass policy engine)
- Gate metadata footer on non-zero tokens/tool_calls
- Update CLAUDE.md --stream → --no-stream docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add language tags to bare fenced code blocks (MD040)
- Fix wrong batch file path: src/batch.rs → src/cli/batch.rs
- Clarify ToolOutput spec is additive, not a replacement
- Add tracing::warn for missing path field in safety gate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@qhkm qhkm merged commit 2aaa0c8 into main Mar 13, 2026
8 checks passed
@qhkm qhkm deleted the feat/stream-ux branch March 13, 2026 09:15
taqtiqa-mark pushed a commit to taqtiqa-mark/zeptoclaw that referenced this pull request Mar 25, 2026
## Summary
- **Streaming by default**: CLI now streams responses by default. Use
`--no-stream` to disable. Config `agents.defaults.streaming` changed
from `false` to `true`.
- **Response metadata footer**: After each response, prints a compact
footer showing token count, tool calls, and wall time (e.g. `⠿ 1,247
tokens · 3 tool calls · 2.1s`)

## Changes
- `src/cli/mod.rs`: Replace `--stream` flag with `--no-stream`
- `src/cli/agent.rs`: Invert streaming logic, add metadata footer via
`MetricsCollector`
- `src/cli/shimmer.rs`: Add `print_metadata_footer()` and
`format_number_with_commas()`
- `src/config/types.rs`: Change `streaming` default to `true`

## Test plan
- [x] All 3126 lib tests pass
- [x] Clippy clean
- [x] `cargo fmt -- --check` clean
- [ ] Manual: `zeptoclaw agent -m "Hello"` streams by default
- [ ] Manual: `zeptoclaw agent --no-stream -m "Hello"` disables
streaming
- [ ] Manual: metadata footer shows after response

Closes qhkm#341

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Conversations can pause to request user clarifications via a new
clarification tool.
* Added a response metadata footer showing tokens, tool calls, and
elapsed time.

* **Changes**
  * Streaming enabled by default; use --no-stream to disable.
* Agent loop and tooling now respect a pause-for-input signal and
explicit batch-mode behavior.
  * Deep-research skill and system-prompt guidance added.

* **Other**
  * Safety pre-checks narrowed for certain file-writing/editing tools.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: streaming by default + response metadata footer

1 participant