-
Notifications
You must be signed in to change notification settings - Fork 0
Add tool_result pruning utility and split NL/T workflow into two passes #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tool_result pruning utility and split NL/T workflow into two passes #70
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Summary
This PR implements a comprehensive optimization strategy for the Unity MCP (Multi-Client Protocol) system focused on reducing tool output verbosity and improving conversation efficiency. The changes introduce a tool result pruning utility (prune_tool_results.py) that condenses large JSON tool outputs into concise one-line summaries, making conversation logs more manageable and reducing token usage in AI interactions.
The core changes modify several Unity MCP tools to return "lean" responses by default: find_in_file now returns only match positions instead of full text excerpts and limits results to 1 match, validate_script provides warning/error counts rather than detailed diagnostics, get_sha returns only SHA256 hash and byte length, read_console defaults to error-only messages with trimmed fields, and read_resource returns metadata-only responses unless text content is explicitly requested. These modifications maintain full functionality through optional parameters while encouraging more efficient usage patterns.
Additionally, the GitHub CI workflow has been restructured to split the Claude NL/T test suite into two separate passes: an NL (Natural Language) pass using Claude 3.5 Sonnet for read-only operations, and a T (Transform) pass using Claude 3.5 Haiku for editing operations. This separation allows for more focused tool usage and keeps individual conversation sessions smaller and more manageable. The changes are well-documented in the README-DEV.md file and include comprehensive test coverage for the new trimmed behaviors.
Changed Files
| Filename | Score | Overview |
|---|---|---|
.github/workflows/claude-nl-suite.yml |
4/5 | Split workflow into separate NL and T passes using different Claude models and tool allowlists |
UnityMcpBridge/UnityMcpServer~/src/tools/resource_tools.py |
4/5 | Modified read_resource and find_in_file to return minimal metadata by default with optional full content |
UnityMcpBridge/UnityMcpServer~/src/tools/read_console.py |
4/5 | Changed defaults to return only error messages in JSON format with trimmed field output |
README-DEV.md |
5/5 | Added comprehensive documentation for pruning utility, lean tool responses, and workflow changes |
tests/test_read_resource_minimal.py |
4/5 | New test validating read_resource returns only SHA256 and byte length metadata by default |
UnityMcpBridge/UnityMcpServer~/src/tools/manage_script.py |
5/5 | Modified validate_script and get_sha to return condensed summaries instead of full data |
tests/test_find_in_file_minimal.py |
4/5 | New test ensuring find_in_file returns only position coordinates for matches |
tests/test_read_console_trimmed.py |
4/5 | New test verifying read_console removes time fields and returns essential data only |
tests/test_validate_script_summary.py |
4/5 | New test validating validate_script returns warning/error counts instead of full diagnostics |
tests/test_get_sha.py |
4/5 | Updated existing test to verify get_sha returns only SHA256 and byte length fields |
prune_tool_results.py |
4/5 | New utility script that condenses tool_result JSON outputs into one-line summaries for conversation management |
Confidence score: 4/5
- This PR is safe to merge with minimal risk as it focuses on output optimization without changing core functionality
- Score reflects well-structured changes with comprehensive test coverage and clear documentation of the optimization strategy
- Pay attention to the workflow file changes and ensure the split NL/T execution model works as expected in CI
Sequence Diagram
sequenceDiagram
participant User
participant GitHub as "GitHub Actions"
participant Unity as "Unity Container"
participant MCP as "MCP Server"
participant Tools as "Unity Tools"
User->>GitHub: "Trigger workflow dispatch"
GitHub->>GitHub: "Setup Python/uv environment"
GitHub->>Unity: "Start Unity container with MCP bridge"
Unity->>MCP: "Initialize MCP connection on port"
GitHub->>GitHub: "Write MCP config (.claude/mcp.json)"
Note over GitHub: NL Pass
GitHub->>MCP: "Run Claude NL pass (read-only tools)"
MCP->>Tools: "find_in_file (returns positions only)"
Tools-->>MCP: "{startLine, startCol, endLine, endCol}"
MCP->>Tools: "validate_script (returns counts)"
Tools-->>MCP: "{warnings: N, errors: N}"
MCP->>Tools: "get_sha (returns hash + length)"
Tools-->>MCP: "{sha256, lengthBytes}"
MCP->>Tools: "read_console (trimmed entries)"
Tools-->>MCP: "[{level, message}]"
MCP-->>GitHub: "NL results (minimal payloads)"
Note over GitHub: T Pass
GitHub->>MCP: "Run Claude T pass (editing tools)"
MCP->>Tools: "apply_text_edits"
Tools-->>MCP: "Edit results"
MCP->>Tools: "script_apply_edits"
Tools-->>MCP: "Script modification results"
MCP-->>GitHub: "T results"
GitHub->>GitHub: "Canonicalize testcase names (NL/T prefixes)"
GitHub->>GitHub: "Backfill missing tests as failures"
GitHub->>GitHub: "Merge fragments into JUnit XML"
GitHub->>GitHub: "Build markdown summary"
GitHub->>GitHub: "Publish JUnit report and artifacts"
GitHub->>Unity: "Stop Unity container"
11 files reviewed, 5 comments
| loop = asyncio.new_event_loop() | ||
| try: | ||
| resp = loop.run_until_complete( | ||
| read_resource(uri="unity://path/Assets/A.txt", ctx=None, project_root=str(proj)) | ||
| ) | ||
| finally: | ||
| loop.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider using asyncio.run() instead of manual event loop management for cleaner async test execution
| loop = asyncio.new_event_loop() | |
| try: | |
| resp = loop.run_until_complete( | |
| read_resource(uri="unity://path/Assets/A.txt", ctx=None, project_root=str(proj)) | |
| ) | |
| finally: | |
| loop.close() | |
| resp = asyncio.run( | |
| read_resource(uri="unity://path/Assets/A.txt", ctx=None, project_root=str(proj)) | |
| ) |
| loop = asyncio.new_event_loop() | ||
| try: | ||
| resp = loop.run_until_complete( | ||
| find_in_file(uri="unity://path/Assets/A.txt", pattern="world", ctx=None, project_root=str(proj)) | ||
| ) | ||
| finally: | ||
| loop.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Consider using asyncio.run() instead of manually managing event loop lifecycle for cleaner code
| {"severity": "error"}, | ||
| {"severity": "fatal"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The test expects 'fatal' severity to be counted as an error, but this behavior should be verified against the actual implementation
| data = obj.get("data", {}) or {} | ||
| msg = obj.get("message") or obj.get("status") or "" | ||
| # Common tool shapes | ||
| if "sha256" in str(data): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Using str(data) for SHA256 detection could match unrelated strings containing 'sha256'. Consider checking data.get('sha256') directly.
| if "sha256" in str(data): | |
| if data.get("sha256"): |
| lines = data["lines"] or [] | ||
| lvls = {"info":0,"warning":0,"error":0} | ||
| for L in lines: | ||
| lvls[L.get("level","" ).lower()] = lvls.get(L.get("level","" ).lower(),0)+1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Dictionary lookup pattern lvls.get(L.get('level','').lower(),0) is inefficient. Consider using defaultdict(int) or .setdefault().
Summary
prune_tool_results.pyto condense tool_result JSON outputs into one-line summariesfind_in_filereturns first match positions only,validate_script&get_shaprovide condensed counts and hashes,read_consoletrims entries, andread_resourcereturns only hash and byte length unless text is requestedTesting
pytesthttps://chatgpt.com/codex/tasks/task_e_68bcbbf232cc832791102f8596bd5683