-
Notifications
You must be signed in to change notification settings - Fork 0
Add tool_result pruning utility and split NL/T workflow into two passes #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tool_result pruning utility and split NL/T workflow into two passes #80
Conversation
|
You have run out of free Bugbot PR reviews for this billing cycle. This will reset on October 7. To receive Bugbot reviews on all of your PRs, please upgrade to Bugbot Pro by visiting the Cursor dashboard. Your first 14 days will be free to try! |
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR implements a comprehensive tool result pruning system for the Unity MCP (Multi-Client Protocol) integration to optimize token usage in AI conversations. The changes focus on making tool responses more compact by default while maintaining essential functionality.
The core addition is prune_tool_results.py, a utility that processes AI conversation JSON files to replace verbose tool outputs with concise summaries. This script handles common Unity MCP patterns like file validation, console logs, search results, and file operations by extracting key metrics (counts, positions, hash values) while discarding verbose details.
Several MCP tools have been modified to return filtered responses by default:
read_resourcenow returns only metadata (SHA256 hash and byte length) unless explicitly requested viainclude_textor windowing parametersvalidate_scriptreturns diagnostic counts (warnings/errors) instead of full diagnostic arraysget_shareturns only essential hash and size informationread_consolemaintains full output by default but supports a truncated mode viainclude_stacktrace=falseparameterfind_in_filereturns structured position data with reduced default results limit
The GitHub workflow has been restructured into a two-pass approach, splitting Natural Language (NL) and Technical (T) tests into separate executions with different Claude models. This includes improved Unity bridge status detection using JSON parsing instead of shell globbing, and better port management through dedicated status directories.
Extensive test coverage has been added with new unit tests validating the pruning behavior, truncation modes, and metadata-only responses. Documentation has been updated to reflect the new "Lean Tool Responses" behavior and the pruning utility usage.
These changes integrate seamlessly with the existing Unity MCP architecture by maintaining backward compatibility while defaulting to more efficient response formats. The system now supports both verbose (legacy) and compact (default) modes, allowing clients to opt into full responses when needed.
Important Files Changed
Changed Files
| Filename | Score | Overview |
|---|---|---|
| prune_tool_results.py | 4/5 | New utility script that summarizes verbose tool results into compact one-line summaries for token efficiency |
| UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py | 3/5 | Modified read_resource to return metadata-only by default, requires explicit opt-in for text content |
| UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py | 3/5 | Changed defaults to focus on errors only and added stacktrace truncation functionality |
| .github/workflows/claude-nl-suite.yml | 3/5 | Split workflow into two-pass execution with improved Unity bridge status detection |
| UnityMcpBridge/UnityMcpServer~/src/tools/manage_script.py | 4/5 | Added response filtering to return only essential data for validate_script and get_sha tools |
| UnityMcpBridge/Editor/Tools/ManageScript.cs | 4/5 | Added balanced delimiter validation guard clause to prevent structural code errors |
| .claude/prompts/nl-unity-suite-full-additive.md | 5/5 | Updated test suite prompt to remove SHA output and clarify console behavior |
| tests/test_validate_script_summary.py | 4/5 | New test validating diagnostic count summarization in validate_script tool |
| tests/test_read_console_truncate.py | 4/5 | New test verifying both full and truncated console output modes |
| tests/test_read_resource_minimal.py | 4/5 | New test validating metadata-only response behavior for resource reading |
| tests/test_find_in_file_minimal.py | 4/5 | New test for find_in_file tool position information functionality |
| tests/test_get_sha.py | 4/5 | Updated test to validate response filtering for essential hash information only |
| README-DEV.md | 5/5 | Added documentation for pruning utility and lean tool responses behavior |
Confidence score: 3/5
- This PR introduces significant behavioral changes that could break existing clients expecting verbose tool responses by default
- Score reflects the broad scope of changes across multiple critical tools with new default behaviors that represent breaking API changes
- Pay close attention to UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py and read_console.py as these change default response formats
Sequence Diagram
sequenceDiagram
participant User as "GitHub Workflow"
participant Claude as "Claude AI Agent"
participant Unity as "Unity Editor"
participant MCP as "MCP Server"
participant Tools as "Tool Scripts"
Note over User: PR adds tool_result pruning utility
User->>Claude: "Start NL/T Suite (Two-Pass Mode)"
Note over Claude: Pass 1: NL Tests (NL-0 to NL-4)
Claude->>MCP: "mcp__unity__read_console(include_stacktrace=false)"
MCP->>Unity: "read_console with truncated output"
Unity-->>MCP: "Console entries (level + message only)"
MCP-->>Claude: "Compact console data"
Claude->>MCP: "mcp__unity__validate_script(level='standard')"
MCP->>Unity: "validate_script request"
Unity-->>MCP: "Validation diagnostics"
MCP-->>Claude: "Summary: warnings=N, errors=M"
Claude->>MCP: "mcp__unity__get_sha(uri)"
MCP->>Unity: "get_sha request"
Unity-->>MCP: "SHA + metadata only"
MCP-->>Claude: "Minimal hash response"
Note over Claude: Pass 2: T Tests (T-A to T-J)
Claude->>MCP: "mcp__unity__apply_text_edits"
MCP->>Unity: "Apply atomic edits"
Unity-->>MCP: "Edit results"
MCP-->>Claude: "Compact edit confirmation"
Claude->>Tools: "Write test results to reports/"
Tools-->>Claude: "XML fragments written"
Note over User: After both passes complete
User->>Tools: "Run prune_tool_results.py"
Tools->>Tools: "Process claude-execution-output.json"
Note over Tools: Convert large tool_result content<br/>to one-line summaries
Tools-->>User: "Pruned conversation JSON"
User->>User: "Merge NL + T test fragments"
User->>User: "Generate JUnit report"
User->>User: "Upload artifacts"
13 files reviewed, 4 comments
| @@ -0,0 +1,58 @@ | |||
| #!/usr/bin/env python3 | |||
| import sys, json, re | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: re module imported but never used
| import sys, json, re | |
| import sys, json |
| loop = asyncio.new_event_loop() | ||
| try: | ||
| resp = loop.run_until_complete( | ||
| read_resource(uri="unity://path/Assets/A.txt", ctx=None, project_root=str(proj)) | ||
| ) | ||
| finally: | ||
| loop.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider using asyncio.run() instead of manually managing event loop lifecycle for cleaner async test execution
| ignore_case: bool | None = True, | ||
| project_root: str | None = None, | ||
| max_results: int | None = 200, | ||
| max_results: int | None = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Reducing max_results default from 200 to 1 is a significant change that could break existing workflows expecting multiple search results.
| return {"success": True, "data": {"text": text, "metadata": {"sha256": sha}}} | ||
| want_text = ( | ||
| bool(include_text) | ||
| or (head_bytes is not None and head_bytes >= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The condition head_bytes >= 0 allows head_bytes=0 to trigger text inclusion, which may not be intended behavior.
Summary
read_consoledefault to include stack traces while trimming to level and message wheninclude_stacktrace:falseTesting
pytesthttps://chatgpt.com/codex/tasks/task_e_68bcbbf232cc832791102f8596bd5683