dsarno · Sep 7, 2025
diff --git a/.claude/prompts/nl-unity-suite-full-additive.md b/.claude/prompts/nl-unity-suite-full-additive.md
@@ -51,11 +51,11 @@ CI provides:
 - Do not restate tool JSON; summarize in ≤ 2 short lines.
 - Never paste full file contents. For matches, include only the matched line and ±1 line.
 - Prefer `mcp__unity__find_in_file` for targeting; avoid `mcp__unity__read_resource` unless strictly necessary. If needed, limit to `head_bytes ≤ 256` or `tail_lines ≤ 10`.
-- Per‑test `system-out` ≤ 400 chars: brief status + latest SHA only.
-- Console evidence: fetch the last 10 lines and include ≤ 3 lines in the fragment.
+- Per‑test `system-out` ≤ 400 chars: brief status only (no SHA).
+- Console evidence: fetch the last 10 lines with `include_stacktrace:false` and include ≤ 3 lines in the fragment.
 - Avoid quoting multi‑line diffs; reference markers instead.
-— Console scans: perform two reads — last 10 `log/info` lines and up to 3 `error` entries; include ≤ 3 lines total in the fragment; if no errors, state "no errors".
-— Final check is folded into T‑J: perform an errors‑only scan and include a single "no errors" line or up to 3 error lines within the T‑J fragment.
+- Console scans: perform two reads: the last 10 `log/info` lines and up to 3 `error` entries (use `include_stacktrace:false`); include ≤ 3 lines total in the fragment; if no errors, state "no errors".
+- Final check is folded into T‑J: perform an errors‑only scan (with `include_stacktrace:false`) and include a single "no errors" line or up to 3 error lines within the T‑J fragment.
 
 ---
 
@@ -85,7 +85,7 @@ STRICT OP GUARDRAILS
 
 **State Tracking:**
 - Track file SHA after each test (`mcp__unity__get_sha`) and use it as a precondition
-  for `apply_text_edits` in T‑F/T‑G/T‑I to exercise `stale_file` semantics.
+  for `apply_text_edits` in T‑F/T‑G/T‑I to exercise `stale_file` semantics. Do not include SHA values in report fragments.
 - Use content signatures (method names, comment markers) to verify expected state
 - Validate structural integrity after each major change
 
@@ -138,7 +138,7 @@ STRICT OP GUARDRAILS
 - Perform a targeted scan for errors/exceptions (type: errors), up to 3 entries
 - Validate no compilation errors from previous operations
 - **Expected final state**: State C (unchanged)
-- **IMMEDIATELY** write clean XML fragment to `reports/NL-4_results.xml` (no extra text). The `<testcase name>` must start with `NL-4`. Include at most 3 lines total across both reads, or simply state "no errors; console OK" (≤ 400 chars), plus the latest SHA.
+- **IMMEDIATELY** write clean XML fragment to `reports/NL-4_results.xml` (no extra text). The `<testcase name>` must start with `NL-4`. Include at most 3 lines total across both reads, or simply state "no errors; console OK" (≤ 400 chars).
 
 ### T-A. Temporary Helper Lifecycle (Returns to State C)
 **Goal**: Test insert → verify → delete cycle for temporary code
@@ -149,6 +149,9 @@ STRICT OP GUARDRAILS
 - Delete helper method via structured delete operation
 - **Expected final state**: Return to State C (helper removed, other changes intact)
 
+### Late-Test Editing Rule
+- When modifying a method body, use `mcp__unity__script_apply_edits`. If the method is expression-bodied (`=>`), convert it to a block or replace the whole method definition. After the edit, run `mcp__unity__validate_script` and roll back on error. Use `//` comments in inserted code.
+
 ### T-B. Method Body Interior Edit (Additive State D)
 **Goal**: Edit method interior without affecting structure, on modified file
 **Actions**:
@@ -172,7 +175,7 @@ STRICT OP GUARDRAILS
 - Use smart anchor matching to find current class-ending brace (after NL-3 tail comments)
 - Insert permanent helper before class brace: `private void TestHelper() { /* placeholder */ }`
 - Validate with `mcp__unity__validate_script(level:"standard")`
-- **IMMEDIATELY** write clean XML fragment to `reports/T-D_results.xml` (no extra text). The `<testcase name>` must start with `T-D`. Include brief evidence and the latest SHA in `system-out`.
+- **IMMEDIATELY** write clean XML fragment to `reports/T-D_results.xml` (no extra text). The `<testcase name>` must start with `T-D`. Include brief evidence in `system-out`.
 - **Expected final state**: State E + TestHelper() method before class end
 
 ### T-E. Method Evolution Lifecycle (Additive State G)
@@ -193,7 +196,7 @@ STRICT OP GUARDRAILS
   3. Add final class comment: `// end of test modifications`
 - All edits computed from same file snapshot, applied atomically
 - **Expected final state**: State G + three coordinated comments
-- After applying the atomic edits, run `validate_script(level:"standard")` and emit a clean fragment to `reports/T-F_results.xml` with a short summary and the latest SHA.
+- After applying the atomic edits, run `validate_script(level:"standard")` and emit a clean fragment to `reports/T-F_results.xml` with a short summary.
 
 ### T-G. Path Normalization Test (No State Change)
 **Goal**: Verify URI forms work equivalently on modified file
@@ -203,15 +206,15 @@ STRICT OP GUARDRAILS
 - Second should return `stale_file`, retry with updated SHA
 - Verify both URI forms target same file
 - **Expected final state**: State H (no content change, just path testing)
-- Emit `reports/T-G_results.xml` showing evidence of stale SHA handling and final SHA.
+- Emit `reports/T-G_results.xml` showing evidence of stale SHA handling.
 
 ### T-H. Validation on Modified File (No State Change)
 **Goal**: Ensure validation works correctly on heavily modified file
 **Actions**:
 - Run `validate_script(level:"standard")` on current state
 - Verify no structural errors despite extensive modifications
 - **Expected final state**: State H (validation only, no edits)
-- Emit `reports/T-H_results.xml` confirming validation OK and including the latest SHA.
+- Emit `reports/T-H_results.xml` confirming validation OK.
 
 ### T-I. Failure Surface Testing (No State Change)
 **Goal**: Test error handling on real modified file
@@ -220,7 +223,7 @@ STRICT OP GUARDRAILS
 - Attempt edit with stale SHA (should fail cleanly) 
 - Verify error responses are informative
 - **Expected final state**: State H (failed operations don't modify file)
-- Emit `reports/T-I_results.xml` capturing error evidence and final SHA; file must contain one `<testcase>`.
+- Emit `reports/T-I_results.xml` capturing error evidence; file must contain one `<testcase>`.
 
 ### T-J. Idempotency on Modified File (Additive State I)
 **Goal**: Verify operations behave predictably when repeated
@@ -232,7 +235,7 @@ STRICT OP GUARDRAILS
 - **Remove again** (same `regex_replace`) → expect `no_op: true`.
 - `mcp__unity__validate_script(level:"standard")`
 - Perform a final console scan for errors/exceptions (errors only, up to 3); include "no errors" if none
-- **IMMEDIATELY** write clean XML fragment to `reports/T-J_results.xml` with evidence of both `no_op: true` outcomes and the console result. The `<testcase name>` must start with `T-J` and include the latest SHA.
+- **IMMEDIATELY** write clean XML fragment to `reports/T-J_results.xml` with evidence of both `no_op: true` outcomes and the console result. The `<testcase name>` must start with `T-J`.
 - **Expected final state**: State H + verified idempotent behavior
 
 ---
@@ -299,7 +302,7 @@ BAN ON EXTRA TOOLS AND DIRS
 
 ## XML Fragment Templates (T-F .. T-J)
 
-Use these skeletons verbatim as a starting point. Replace the bracketed placeholders with your evidence and the latest SHA. Ensure each file contains exactly one `<testcase>` element and that the `name` begins with the exact test id.
+Use these skeletons verbatim as a starting point. Replace the bracketed placeholders with your evidence. Ensure each file contains exactly one `<testcase>` element and that the `name` begins with the exact test id.
 
 ```xml
 <testcase name="T-F — Atomic Multi-Edit" classname="UnityMCP.NL-T">
@@ -309,7 +312,6 @@ Applied 3 non-overlapping edits in one atomic call:
 - ApplyBlend(): added "// safe animation"
 - End-of-class: added "// end of test modifications"
 validate_script: OK
-SHA: [sha-here]
   ]]></system-out>
 </testcase>
 ```
@@ -319,7 +321,6 @@ SHA: [sha-here]
   <system-out><![CDATA[
 Read Unity console (INFO): OK.
 No compilation errors detected.
-SHA: [sha-here]
   ]]></system-out>
 </testcase>
 ```
@@ -328,8 +329,7 @@ SHA: [sha-here]
 <testcase name="T-G — Path Normalization Test" classname="UnityMCP.NL-T">
   <system-out><![CDATA[
 Edit via unity://path/... succeeded.
-Same edit via Assets/... returned stale_file, retried with updated SHA: OK.
-Final SHA: [sha-here]
+Same edit via Assets/... returned stale_file, retried with updated hash: OK.
   ]]></system-out>
 </testcase>
 ```
@@ -338,7 +338,6 @@ Final SHA: [sha-here]
 <testcase name="T-H — Validation on Modified File" classname="UnityMCP.NL-T">
   <system-out><![CDATA[
 validate_script(level:"standard"): OK on the modified file.
-SHA: [sha-here]
   ]]></system-out>
 </testcase>
 ```
@@ -347,8 +346,8 @@ SHA: [sha-here]
 <testcase name="T-I — Failure Surface Testing" classname="UnityMCP.NL-T">
   <system-out><![CDATA[
 Overlapping edit: failed cleanly (error captured).
-Stale SHA edit: failed cleanly (error captured).
-File unchanged; final SHA: [sha-here]
+Stale hash edit: failed cleanly (error captured).
+File unchanged.
   ]]></system-out>
 </testcase>
 ```
@@ -361,7 +360,6 @@ Insert same marker again: no_op: true.
 regex_remove marker: OK.
 regex_remove again: no_op: true.
 validate_script: OK.
-SHA: [sha-here]
   ]]></system-out>
 </testcase>
 ```
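Note on the fragment rules above: the prompt asks for exactly one `<testcase>` per file, a `name` that begins with the test id, and a `system-out` of at most 400 characters with no SHA values. A minimal sketch of a helper that enforces those constraints might look like the following. `build_fragment` and `MAX_SYSTEM_OUT` are hypothetical names used only for illustration; nothing in this PR defines them, and the plain hyphen in the generated name is a choice made here, not the template's separator.

```python
from xml.sax.saxutils import quoteattr

MAX_SYSTEM_OUT = 400  # per-test cap stated in the prompt


def build_fragment(test_id: str, title: str, evidence_lines: list[str]) -> str:
    """Return one <testcase> fragment whose name starts with test_id."""
    # Drop blank lines and join the evidence into a single body.
    body = "\n".join(line.strip() for line in evidence_lines if line.strip())
    # Enforce the 400-character cap on the evidence text.
    if len(body) > MAX_SYSTEM_OUT:
        body = body[: MAX_SYSTEM_OUT - 3] + "..."
    # "]]>" would close the CDATA section early, so split it across two sections.
    body = body.replace("]]>", "]]]]><![CDATA[>")
    name = quoteattr(f"{test_id} - {title}")
    return (
        f'<testcase name={name} classname="UnityMCP.NL-T">\n'
        f"  <system-out><![CDATA[\n{body}\n  ]]></system-out>\n"
        f"</testcase>\n"
    )
```

Calling `build_fragment("T-H", "Validation on Modified File", ['validate_script(level:"standard"): OK on the modified file.'])` yields a fragment shaped like the T-H template, without a SHA line.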
diff --git a/.github/workflows/claude-nl-suite.yml b/.github/workflows/claude-nl-suite.yml
@@ -202,13 +202,15 @@ jobs:
               manual_args=(-manualLicenseFile "/root/.local/share/unity3d/Unity/Unity_lic.ulf")
             fi
 
+            mkdir -p "$RUNNER_TEMP/unity-status"
             docker rm -f unity-mcp >/dev/null 2>&1 || true
             docker run -d --name unity-mcp --network host \
               -e HOME=/root \
               -e UNITY_MCP_ALLOW_BATCH=1 \
               -e UNITY_MCP_STATUS_DIR=/root/.unity-mcp \
               -e UNITY_MCP_BIND_HOST=127.0.0.1 \
               -v "${{ github.workspace }}:/workspace" -w /workspace \
+              -v "$RUNNER_TEMP/unity-status:/root/.unity-mcp" \
               -v "$RUNNER_TEMP/unity-config:/root/.config/unity3d:ro" \
               -v "$RUNNER_TEMP/unity-local:/root/.local/share/unity3d:ro" \
               "$UNITY_IMAGE" /opt/unity/Editor/Unity -batchmode -nographics -logFile - \
@@ -238,7 +240,7 @@ jobs:
               logs="$(docker logs unity-mcp 2>&1 || true)"
 
               # 1) Primary: status JSON exposes TCP port
-              port="$(docker exec unity-mcp bash -lc 'shopt -s nullglob; for f in /root/.unity-mcp/unity-mcp-status-*.json; do grep -ho "\"unity_port\"[[:space:]]*:[[:space:]]*[0-9]\+" "$f"; done | sed -E "s/.*: *([0-9]+).*/\1/" | head -n1' 2>/dev/null || true)"
+              port="$(jq -r '.unity_port // empty' "$RUNNER_TEMP"/unity-status/unity-mcp-status-*.json 2>/dev/null | head -n1 || true)"
               if [[ -n "${port:-}" ]] && timeout 1 bash -lc "exec 3<>/dev/tcp/127.0.0.1/$port"; then
                 echo "Bridge ready on port $port"
                 exit 0
@@ -288,12 +290,39 @@ jobs:
                   "env": {
                     "PYTHONUNBUFFERED": "1",
                     "MCP_LOG_LEVEL": "debug",
-                    "UNITY_PROJECT_ROOT": "$GITHUB_WORKSPACE/TestProjects/UnityMCPTests"
+                    "UNITY_PROJECT_ROOT": "$GITHUB_WORKSPACE/TestProjects/UnityMCPTests",
+                    "UNITY_MCP_STATUS_DIR": "$RUNNER_TEMP/unity-status",
+                    "UNITY_MCP_HOST": "127.0.0.1"
                   }
                 }
               }
             }
             JSON
+
+        - name: Pin Claude tool permissions (.claude/settings.json)
+          run: |
+            set -eux
+            mkdir -p .claude
+            cat > .claude/settings.json <<'JSON'
+            {
+              "permissions": {
+                "allow": [
+                  "mcp__unity",
+                  "Edit(reports/**)"
+                ],
+                "deny": [
+                  "Bash",
+                  "MultiEdit",
+                  "WebFetch",
+                  "WebSearch",
+                  "Task",
+                  "TodoWrite",
+                  "NotebookEdit",
+                  "NotebookRead"
+                ]
+              }
+            }
+            JSON
 
         # ---------- Reports & helper ----------
         - name: Prepare reports and dirs
@@ -314,32 +343,65 @@ jobs:
             </testsuite></testsuites>
             XML
             printf '# Unity NL/T Editing Suite Test Results\n\n' > "$MD_OUT"
+
+        - name: Verify Unity bridge status/port
+          run: |
+            set -euxo pipefail
+            ls -la "$RUNNER_TEMP/unity-status" || true
+            jq -r . "$RUNNER_TEMP"/unity-status/unity-mcp-status-*.json | sed -n '1,80p' || true
+
+            shopt -s nullglob
+            status_files=("$RUNNER_TEMP"/unity-status/unity-mcp-status-*.json)
+            if ((${#status_files[@]})); then
+              port="$(grep -hEo '"unity_port"[[:space:]]*:[[:space:]]*[0-9]+' "${status_files[@]}" \
+                | sed -E 's/.*: *([0-9]+).*/\1/' | head -n1 || true)"
+            else
+              port=""
+            fi
+
+            echo "unity_port=$port"
+            if [[ -n "$port" ]]; then
+              timeout 1 bash -lc "exec 3<>/dev/tcp/127.0.0.1/$port" && echo "TCP OK"
+            fi
 
         # (removed) Revert helper and baseline snapshot are no longer used
 
 
-        # ---------- Run suite ----------
-        - name: Run Claude NL suite (single pass)
+        # ---------- Run suite in two passes ----------
+        - name: Run Claude NL pass
+          uses: anthropics/claude-code-base-action@beta
+          if: steps.detect.outputs.anthropic_ok == 'true' && steps.detect.outputs.unity_ok == 'true'
+          continue-on-error: true
+          with:
+            use_node_cache: false
+            prompt_file: .claude/prompts/nl-unity-suite-full-additive.md
+            mcp_config: .claude/mcp.json
+            settings: .claude/settings.json
+            allowed_tools: "mcp__unity,Edit(reports/**)"
+            disallowed_tools: "Bash,MultiEdit,WebFetch,WebSearch,Task,TodoWrite,NotebookEdit,NotebookRead"
+            model: claude-3-7-sonnet-20250219
+            append_system_prompt: |
+              You are running the NL pass only. Do not run any T-* tests.
+              Emit only NL-0..NL-4 fragments and stop.
+            timeout_minutes: "30"
+            anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+
+        - name: Run Claude T pass
           uses: anthropics/claude-code-base-action@beta
           if: steps.detect.outputs.anthropic_ok == 'true' && steps.detect.outputs.unity_ok == 'true'
           continue-on-error: true
           with:
             use_node_cache: false
             prompt_file: .claude/prompts/nl-unity-suite-full-additive.md
             mcp_config: .claude/mcp.json
-            allowed_tools: >-
-              Write,
-              mcp__unity__manage_editor,
-              mcp__unity__list_resources,
-              mcp__unity__read_resource,
-              mcp__unity__apply_text_edits,
-              mcp__unity__script_apply_edits,
-              mcp__unity__validate_script,
-              mcp__unity__find_in_file,
-              mcp__unity__read_console,
-              mcp__unity__get_sha
-            disallowed_tools: TodoWrite,Task,Bash
-            model: claude-3-7-sonnet-latest
+            settings: .claude/settings.json
+            allowed_tools: "mcp__unity,Edit(reports/**)"
+            disallowed_tools: "Bash,MultiEdit,WebFetch,WebSearch,Task,TodoWrite,NotebookEdit,NotebookRead"
+            model: claude-3-5-haiku-20241022
+            fallback_model: claude-3-7-sonnet-20250219
+            append_system_prompt: |
+              You are running the T pass only. Do not run any NL-* tests.
+              Emit only T-A..T-J fragments and stop.
             timeout_minutes: "30"
             anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
 

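Note on the port discovery in the workflow above: the status directory is now bind-mounted to `$RUNNER_TEMP/unity-status`, so the port can be read on the host instead of through `docker exec`. For readers who want the same logic outside the shell, here is a rough Python equivalent of the `jq` lookup plus the `/dev/tcp` probe. The function names `find_unity_port` and `port_is_open` are hypothetical, and the only assumption about the status file is the top-level `unity_port` key that the workflow itself reads.

```python
import glob
import json
import os
import socket
from typing import Optional


def find_unity_port(status_dir: str) -> Optional[int]:
    """Return the first usable unity_port found in unity-mcp-status-*.json."""
    pattern = os.path.join(status_dir, "unity-mcp-status-*.json")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path, encoding="utf-8") as fh:
                port = json.load(fh).get("unity_port")
        except (OSError, ValueError, AttributeError):
            # Unreadable file, invalid JSON, or a non-object top level: skip it.
            continue
        if isinstance(port, int) and not isinstance(port, bool) and 0 < port < 65536:
            return port
    return None


def port_is_open(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Mirror the workflow's 1-second TCP probe against the bridge."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Like the `nullglob` branch in the verify step, `find_unity_port` returns `None` when no status file exists, so callers can fall through to a secondary detection path.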
diff --git a/README-DEV.md b/README-DEV.md
@@ -46,6 +46,27 @@ Restores original files from backup.
 2. Allows you to select which backup to restore
 3. Restores both Unity Bridge and Python Server files
 
+### `prune_tool_results.py`
+Compacts large `tool_result` blobs in conversation JSON into concise one-line summaries.
+
+**Usage:**
+```bash
+python3 prune_tool_results.py < reports/claude-execution-output.json > reports/claude-execution-output.pruned.json
+```
+
+The script reads a conversation from `stdin` and writes the pruned version to `stdout`, making logs much easier to inspect or archive.
+
+### Lean Tool Responses
+To keep live conversations small, server tools now emit minimal payloads by default:
+
+* `find_in_file` – first match positions only (`startLine/Col`, `endLine/Col`).
+* `read_console` – full entries by default; pass `include_stacktrace=False` to trim to `{level, message}` (use `count` to limit).
+* `validate_script` – diagnostics summarized as `{warnings, errors}` counts.
+* `get_sha` – `{sha256, lengthBytes}` only.
+* `read_resource` – returns only `metadata.sha256` and byte length unless `include_text` or window arguments are provided.
+
+These defaults dramatically cut token usage without affecting essential information.
+
 ## Finding Unity Package Cache Path
 
 Unity stores Git packages under a version-or-hash folder. Expect something like:
@@ -70,10 +91,12 @@ Note: In recent builds, the Python server sources are also bundled inside the pa
 
 We provide a CI job to run a Natural Language Editing mini-suite against the Unity test project. It spins up a headless Unity container and connects via the MCP bridge.
 
-- Trigger: Workflow dispatch (`Claude NL suite (Unity live)`).
-- Image: `UNITY_IMAGE` (UnityCI) pulled by tag; the job resolves a digest at runtime. Logs are sanitized.
-- Reports: JUnit at `reports/junit-nl-suite.xml`, Markdown at `reports/junit-nl-suite.md`.
-- Publishing: JUnit is normalized to `reports/junit-for-actions.xml` and published; artifacts upload all files under `reports/`.
+ - Trigger: Workflow dispatch (`Claude NL suite (Unity live)`).
+ - Image: `UNITY_IMAGE` (UnityCI) pulled by tag; the job resolves a digest at runtime. Logs are sanitized.
+ - Execution: runs in two passes (NL then T) so each session stays lean.
+ - Tool permissions are pinned via `.claude/settings.json`, allowing Unity MCP tools and edits under `reports/` only.
+ - Reports: JUnit at `reports/junit-nl-suite.xml`, Markdown at `reports/junit-nl-suite.md`.
+ - Publishing: JUnit is normalized to `reports/junit-for-actions.xml` and published; artifacts upload all files under `reports/`.
 
 ### Test target script
 - The repo includes a long, standalone C# script used to exercise larger edits and windows:

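Note on `prune_tool_results.py` in the README hunk above: the script itself is not part of the visible diff, so the following is only a sketch of the idea the README describes (collapsing each `tool_result` blob into a one-line summary). It assumes the conversation JSON contains blocks shaped like `{"type": "tool_result", "content": ...}`, where `content` is a string or a list of text parts; that shape, the 120-character cap, and the names `summarize`, `prune`, and `main` are all assumptions made for illustration.

```python
import json
import sys

MAX_SUMMARY = 120  # hypothetical cap on the one-line summary


def summarize(content) -> str:
    """Collapse a tool_result payload into a single short line."""
    if isinstance(content, list):
        text = " ".join(
            str(part.get("text", "")) for part in content if isinstance(part, dict)
        )
    elif isinstance(content, str):
        text = content
    else:
        text = json.dumps(content)
    text = " ".join(text.split())  # fold newlines and runs of whitespace
    if len(text) > MAX_SUMMARY:
        text = text[:MAX_SUMMARY] + f"... [{len(text)} chars]"
    return text


def prune(node):
    """Walk the JSON tree and replace tool_result content with its summary."""
    if isinstance(node, list):
        return [prune(item) for item in node]
    if isinstance(node, dict):
        if node.get("type") == "tool_result" and "content" in node:
            pruned = dict(node)
            pruned["content"] = summarize(node["content"])
            return pruned
        return {key: prune(value) for key, value in node.items()}
    return node


def main() -> None:
    """Read a conversation from stdin and write the pruned version to stdout."""
    json.dump(prune(json.load(sys.stdin)), sys.stdout, indent=2)
```

Calling `main()` from a script entry point would give the same stdin-to-stdout usage the README shows. Non-`tool_result` blocks pass through unchanged.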
diff --git a/UnityMcpBridge/Editor/Tools/ManageScript.cs b/UnityMcpBridge/Editor/Tools/ManageScript.cs
@@ -1347,6 +1347,10 @@ private static object EditScript(
                     appliedCount = replacements.Count;
                 }
 
+                // Guard against structural imbalance before validation
+                if (!CheckBalancedDelimiters(working, out int lineBal, out char expectedBal))
+                    return Response.Error("unbalanced_braces", new { status = "unbalanced_braces", line = lineBal, expected = expectedBal.ToString() });
+
                 // No-op guard for structured edits: if text unchanged, return explicit no-op
                 if (string.Equals(working, original, StringComparison.Ordinal))
                 {