fix: add tool_info schema discovery for WASM tools by henrypark133 · Pull Request #1086 · nearai/ironclaw

henrypark133 · 2026-03-12T22:44:01Z

Summary

This PR adds an on-demand schema discovery path for tools and fixes the prompt-size regression for WASM tools by separating the schema sent in the normal tools array from the richer schema exposed through tool_info.

The final design is intentionally simpler than the intermediate version on this branch:

- WASM tools keep a compact advertised schema for normal tool definitions
- WASM tools expose a richer discovery schema through tool_info
- execution can still use a transient runtime fallback when registration-time discovery is permissive
- there is no cached coercion schema and no per-turn hint dedup state anymore

What Changed

- add a new built-in tool_info tool that returns:
  - compact tool info by default: name, description, and parameter names
  - full typed JSON Schema when called with include_schema: true
- extend the Tool trait with discovery_schema() so tools can expose richer discovery data than the compact parameters_schema() used in the normal tools array
- keep WASM tool definitions compact by default:
  - parameters_schema() returns the advertised schema used in the main prompt
  - discovery_schema() returns the richer registration-time discovery schema used by tool_info
- introduce a small internal WasmToolSchemas helper to hold the two immutable schema roles:
  - advertised for the normal tools array
  - discovery for tool_info and typed coercion when available
- preserve execution-time coercion without adding mutable schema state:
  - if the discovery schema is typed, execution uses it directly
  - if the discovery schema is still permissive, execution makes a transient call to the WASM schema() export for that invocation only
- always emit the short tool_info(name: ..., include_schema: true) retry hint on WASM tool-level errors
- remove the now-unnecessary per-turn WASM hint reset plumbing from the shared agentic loop and delegates
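
The split between the two schema roles can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Schema` is a plain `String` standing in for `serde_json::Value`, and the `WasmTool` and `EchoTool` structs and their fields are assumptions made for the example. Only the names `Tool`, `parameters_schema()`, `discovery_schema()`, and `WasmToolSchemas` come from the description above.

```rust
/// Stand-in for `serde_json::Value` (assumption: keeps the sketch std-only).
type Schema = String;

trait Tool {
    fn name(&self) -> &str;

    /// Compact schema advertised in the normal tools array.
    fn parameters_schema(&self) -> Schema;

    /// Richer schema served through `tool_info`.
    /// Defaults to the advertised schema, so existing tools need no changes.
    fn discovery_schema(&self) -> Schema {
        self.parameters_schema()
    }
}

/// Holds the two immutable schema roles for a WASM tool.
struct WasmToolSchemas {
    advertised: Schema,
    discovery: Schema,
}

/// Hypothetical WASM tool wrapper, reduced to the fields the sketch needs.
struct WasmTool {
    name: String,
    schemas: WasmToolSchemas,
}

impl Tool for WasmTool {
    fn name(&self) -> &str {
        &self.name
    }

    fn parameters_schema(&self) -> Schema {
        self.schemas.advertised.clone()
    }

    fn discovery_schema(&self) -> Schema {
        self.schemas.discovery.clone()
    }
}

/// Hypothetical non-WASM tool that relies on the default `discovery_schema()`.
struct EchoTool;

impl Tool for EchoTool {
    fn name(&self) -> &str {
        "echo"
    }

    fn parameters_schema(&self) -> Schema {
        r#"{"type":"object","properties":{"text":{"type":"string"}}}"#.to_string()
    }
}
```

In this sketch a WASM tool can advertise a permissive `{"type":"object"}` schema while still returning the typed schema from `discovery_schema()`, and a tool such as `EchoTool` returns the same schema from both methods.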

Why This Approach

This keeps the common path cheap while still letting the model discover precise schemas when it needs them:

- the normal tools array stays small and stable
- the model can call tool_info on demand for the full schema
- execution still gets typed coercion when schema information is available
- the implementation stays mostly immutable and easier to reason about

The simplification also removes branch-specific state that no longer pays for itself:

- no cached coercion schema to keep in sync
- no hint_sent flag
- no loop lifecycle hook whose only job was resetting that flag once per turn

Review Fixes Included

This PR addresses the review issues raised on the branch:

- keep extracted WASM schemas out of the main tools array unless a sidecar explicitly overrides the advertised schema
- make the tool_info retry hint behavior consistent across chat, job, and container flows by removing the once-per-turn dedup mechanism entirely and always emitting the short hint
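
A stateless version of the retry hint might look like the sketch below. The function name `with_retry_hint` and the exact hint wording are assumptions for illustration. The description above only fixes the `tool_info(name: ..., include_schema: true)` shape and the fact that the hint is always emitted.

```rust
/// Formats a WASM tool-level error with the short `tool_info` retry hint.
///
/// Hypothetical helper: there is no `hint_sent` flag and no per-turn state,
/// so repeated failures simply repeat the same hint.
fn with_retry_hint(tool_name: &str, error: &str) -> String {
    format!(
        "{} (hint: call tool_info(name: \"{}\", include_schema: true) for the full schema)",
        error, tool_name
    )
}
```

Because the function depends only on its arguments, chat, job, and container flows would produce the same text for the same error without any reset hook.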

Testing

cargo fmt
cargo clippy --all-targets --all-features -- -D warnings
cargo test test_advertised_schema_stays_permissive_until_sidecar_override -- --nocapture
cargo test test_tool_intent_nudge_fires_and_caps -- --nocapture


Copilot

Pull request overview

This PR introduces on-demand tool schema discovery via a new tool_info built-in tool, keeping the normal LLM tool advertisement compact (especially for large WASM tool schemas) while still enabling rich schema access for coercion and targeted retries.

Changes:

- Add tool_info built-in to fetch compact tool info by default and full typed JSON Schema on demand (include_schema: true).
- Split WASM tool schemas into a compact advertised schema vs a full discovery/coercion schema, with per-turn hint dedup reset wired into the shared agentic loop lifecycle.
- Add regression/e2e coverage for tool discovery behavior, per-turn reset hook behavior, and advertised-vs-discovery schema handling.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.


| File | Description |
| --- | --- |
| tools-src/web-search/web-search-tool.capabilities.json | Adds explicit description + parameters schema in sidecar for a WASM tool. |
| tests/fixtures/llm_traces/tools/tool_info_discovery.json | Adds an LLM trace fixture exercising `tool_info` default vs full-schema mode. |
| tests/e2e_builtin_tool_coverage.rs | Adds an e2e coverage test validating `tool_info` discovery behavior/results. |
| src/worker/job.rs | Hooks `notify_turn_start()` into job loop turns via `LoopDelegate::on_turn_start`. |
| src/worker/container.rs | Hooks `notify_turn_start()` into container loop turns via `LoopDelegate::on_turn_start`. |
| src/tools/wasm/wrapper.rs | Splits advertised vs discovery schema, adds lazy coercion schema caching + per-turn hint dedup, and updates WASM error hinting to point to `tool_info`. |
| src/tools/wasm/runtime.rs | Switches registration-time metadata extraction to `extract_wasm_metadata()` with limits/timeouts and fallback behavior. |
| src/tools/wasm/mod.rs | Narrows public re-exports (removes trap-related exports). |
| src/tools/wasm/limits.rs | Removes unused limiter bookkeeping fields. |
| src/tools/wasm/error.rs | Removes unused trap structs/enums; updates hint formatting expectations in tests. |
| src/tools/tool.rs | Extends `Tool` trait with `on_turn_start()` and `discovery_schema()`. |
| src/tools/registry.rs | Adds `notify_turn_start()` and `register_tool_info()`; protects `tool_info` name. |
| src/tools/builtin/tool_info.rs | Implements the `tool_info` built-in tool + unit tests. |
| src/tools/builtin/mod.rs | Wires `ToolInfoTool` into the built-in module exports. |
| src/app.rs | Registers `tool_info` during app tool initialization. |
| src/agent/dispatcher.rs | Calls `notify_turn_start()` at the start of chat turns via delegate hook. |
| src/agent/agentic_loop.rs | Adds `LoopDelegate::on_turn_start()` and calls it once per run; adds tests for once-per-loop behavior. |


src/tools/wasm/wrapper.rs

zmanian

Reviewed the full diff. This is well-structured and solves a real problem (reducing token usage from full WASM schema dumps).

Looks good:

- tool_info correctly added to PROTECTED_TOOL_NAMES
- Two-tier schema system (advertised vs discovery) is clean
- Good test coverage: unit tests for both modes, WasmToolSchemas behavior, and E2E trace test
- Dead code cleanup (TrapInfo, TrapCode, unused fields) is welcome
- Backward-compatible discovery_schema() trait method with sensible default

Minor suggestions (non-blocking):

- WasmToolSchemas::advertised() clones the schema on every call via parameters_schema(), which runs on every LLM request. Consider returning &serde_json::Value instead.
- The hint text manipulation (append_schema_hint_if_permissive / strip_schema_hint) using contains/find/truncate is fragile. A separate schema_hint: Option<String> field composed at display time would be more robust.
- ToolInfoTool holds Arc<ToolRegistry> while the registry holds ToolInfoTool -- circular Arc reference. Fine for app-lifetime objects, but Weak<ToolRegistry> would be cleaner if this ever gets refactored.
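
The circular-reference point can be illustrated with a small std-only sketch. The struct layouts, the `tools` map, and the `registry_alive` helper are assumptions for the example. Only the `ToolRegistry`, `ToolInfoTool`, and `register_tool_info` names and the use of `Weak` come from this thread.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock, Weak};

/// Hypothetical, reduced registry: it owns its tools through `Arc`.
struct ToolRegistry {
    tools: RwLock<HashMap<String, Arc<ToolInfoTool>>>,
}

/// The discovery tool only holds a `Weak` back-reference, so the
/// registry -> tool -> registry chain does not keep the registry alive.
struct ToolInfoTool {
    registry: Weak<ToolRegistry>,
}

impl ToolInfoTool {
    /// Returns true while the registry can still be upgraded and queried.
    fn registry_alive(&self) -> bool {
        self.registry.upgrade().is_some()
    }
}

impl ToolRegistry {
    fn new() -> Arc<Self> {
        Arc::new(Self {
            tools: RwLock::new(HashMap::new()),
        })
    }

    /// Registers `tool_info`, handing it a downgraded reference to the registry.
    fn register_tool_info(self: &Arc<Self>) {
        let tool = ToolInfoTool {
            registry: Arc::downgrade(self),
        };
        self.tools
            .write()
            .unwrap()
            .insert("tool_info".to_string(), Arc::new(tool));
    }
}
```

With a strong `Arc<ToolRegistry>` in the tool, the registry's strong count would never reach zero. With `Weak`, dropping the last external handle frees the registry and later `upgrade()` calls return `None`.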

LGTM

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.


src/tools/registry.rs

+    /// Register the `tool_info` discovery tool.
+    ///
+    /// Requires `Arc<Self>` so the tool can query the registry for other tools'
+    /// schemas at runtime. Call after `register_builtin_tools()`.
+    pub fn register_tool_info(self: &Arc<Self>) {
+        use crate::tools::builtin::ToolInfoTool;
+        let tool = ToolInfoTool::new(Arc::downgrade(self));
+        self.register_sync(Arc::new(tool));
+        tracing::debug!("Registered tool_info discovery tool");
+    }


src/tools/wasm/mod.rs


 // Core types
-pub use error::{TrapCode, TrapInfo, WasmError};
+pub use error::WasmError;


src/tools/builtin/tool_info.rs

+        let schema = tool.discovery_schema();
+
+        // Extract just param names from the schema's "properties" keys
+        let param_names: Vec<&str> = schema
+            .get("properties")
+            .and_then(|p| p.as_object())
+            .map(|props| props.keys().map(|k| k.as_str()).collect())
+            .unwrap_or_default();
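
The two detail levels of tool_info can be sketched in isolation as below. This is an assumption-heavy simplification: the schema is modelled as a `BTreeMap` from property name to JSON type rather than a `serde_json::Value`, and `ToolEntry`, `ToolInfo`, and the free function `tool_info` are hypothetical names introduced for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a JSON Schema: property name -> JSON type.
struct Schema {
    properties: BTreeMap<String, String>,
}

/// Hypothetical registry entry for one tool.
struct ToolEntry {
    name: String,
    description: String,
    discovery: Schema,
}

/// The two response shapes: compact by default, full schema on request.
#[derive(Debug, PartialEq)]
enum ToolInfo {
    Compact {
        name: String,
        description: String,
        parameters: Vec<String>,
    },
    Full {
        name: String,
        description: String,
        schema: BTreeMap<String, String>,
    },
}

/// Returns parameter names only, unless `include_schema` is set.
fn tool_info(tool: &ToolEntry, include_schema: bool) -> ToolInfo {
    if include_schema {
        ToolInfo::Full {
            name: tool.name.clone(),
            description: tool.description.clone(),
            schema: tool.discovery.properties.clone(),
        }
    } else {
        ToolInfo::Compact {
            name: tool.name.clone(),
            description: tool.description.clone(),
            // Mirrors the excerpt above: keep just the "properties" keys.
            parameters: tool.discovery.properties.keys().cloned().collect(),
        }
    }
}
```

The compact form keeps the default response small, and the full form is only paid for when the model explicitly asks for it.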


src/tools/wasm/wrapper.rs

+    /// Append a tool_info hint to the description when the schema is permissive
+    /// (no typed properties), so the LLM knows to call tool_info for the full schema.
+    fn append_schema_hint_if_permissive(&mut self) {
+        if self.schemas.is_advertised_permissive() && !self.description.contains("tool_info") {
+            self.description
+                .push_str(" (call tool_info for parameter schema)");
+        }


zmanian

Design Validations

The core architecture here is solid and well-motivated:

- Two-tier schema system (advertised vs discovery) is the right approach. Keeping the tools array compact while exposing full schemas on demand directly addresses the prompt-size regression. The WasmToolSchemas struct makes the two roles explicit and immutable.
- discovery_schema() trait method with the default delegation to parameters_schema() is backward-compatible and clean. Non-WASM tools don't need to care about it.
- tool_info in PROTECTED_TOOL_NAMES is correct -- prevents WASM tools from shadowing it.
- Removing per-turn hint dedup is the right call. Repeated hints on repeated failures give the LLM useful signal about the failure pattern. The old once-per-turn mechanism was hiding information.
- Dead code cleanup (TrapInfo, TrapCode, unused limiter fields) is welcome. Confirmed these types have zero consumers outside their own module and tests.
- E2E test with LLM trace fixture covers both detail levels and verifies the tool is wired end-to-end.

Actionable Feedback

1. Description hint mutation is fragile (blocking)

append_schema_hint_if_permissive() / strip_schema_hint() mutate the stored description string using contains / find / truncate. This couples presentation concerns to stored state and breaks if a tool description naturally contains "tool_info".

Suggestion: Store a schema_hint_needed: bool field on WasmToolSchemas (it already knows whether advertised is permissive). Then compose the hint in the Tool::schema() method which returns an owned ToolSchema, or in description() if you change the return type. The raw description stays clean and the hint is purely a presentation concern. This also addresses Copilot's comment about the hint text not mentioning include_schema: true -- you can fix the hint text in one place.
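
One way to read this suggestion is sketched below. The struct shapes, the `advertised_is_permissive` flag, and the hint wording are assumptions for illustration. The point is only that the stored description is never mutated and the hint is added when the owned `ToolSchema` is built.

```rust
/// Owned view of a tool as sent to the model (hypothetical, simplified).
struct ToolSchema {
    name: String,
    description: String,
}

/// Hypothetical WASM tool wrapper, reduced to what the sketch needs.
struct WasmTool {
    name: String,
    /// Raw description; never mutated after construction.
    description: String,
    /// True when the advertised schema has no typed properties.
    advertised_is_permissive: bool,
}

impl WasmTool {
    /// Always returns the raw description, with no hint attached.
    fn description(&self) -> &str {
        &self.description
    }

    /// Composes the `tool_info` hint at display time only.
    fn schema(&self) -> ToolSchema {
        let description = if self.advertised_is_permissive {
            format!(
                "{} (call tool_info(name: \"{}\", include_schema: true) for the parameter schema)",
                self.description, self.name
            )
        } else {
            self.description.clone()
        };
        ToolSchema {
            name: self.name.clone(),
            description,
        }
    }
}
```

Since nothing searches the description for "tool_info", a tool whose own description mentions tool_info still gets the hint, and there is no strip step to keep in sync.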

2. `extract_wasm_metadata` does redundant work (non-blocking, but worth addressing)

The new extract_wasm_metadata instantiates each WASM module during prepare() to call description() and schema() exports. But when a sidecar capabilities.json exists (which is the common case), the loader immediately overrides both via with_description() / with_schema(), throwing away the extraction results.

Suggestion: Either:

- Pass optional description/schema into prepare() so it can skip WASM instantiation when the sidecar already provides them
- Or make extraction lazy -- only instantiate when discovery_schema() is first called and no sidecar override was set

For registries with many WASM tools, this avoids doubling the instantiation cost at startup.

3. `effective_for_coercion` transient fallback needs safety documentation (non-blocking)

When the discovery schema is permissive, effective_for_coercion calls tool_iface.call_schema(&mut store) on the already-running WASM instance between store setup and call_execute(). This has two risks:

- State contamination: The WASM module's linear memory is already initialized. If schema() reads or writes mutable state, calling it mid-execution could corrupt the environment before call_execute().
- Inconsistency: If schema() returns different results depending on internal state (e.g., post-initialization), the coercion schema could differ from the load-time extraction.

In practice, well-behaved tools return a static string, so this is likely safe. But please either:

- Add a comment documenting this assumption ("schema() is assumed to be a pure function with no side effects")
- Or preferably, use the load-time extracted schema from PreparedModule.schema instead of re-calling the export. PreparedModule already holds it and is available via self.prepared.
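
The preferred option can be sketched as a pure selection between two already-available schemas. The `Schema` enum and the function signature are assumptions for the example. The thread only states that `PreparedModule` holds a load-time `schema` and that `effective_for_coercion` should use it instead of re-calling the export.

```rust
/// Hypothetical schema model: either permissive or carrying typed content.
#[derive(Debug, PartialEq)]
enum Schema {
    Permissive,
    Typed(String),
}

/// Reduced stand-in for the prepared module with its load-time schema.
struct PreparedModule {
    schema: Schema,
}

/// Picks the schema used for argument coercion without touching the
/// running WASM instance: a typed discovery schema wins, otherwise the
/// schema extracted at load time is used.
fn effective_for_coercion<'a>(
    discovery: &'a Schema,
    prepared: &'a PreparedModule,
) -> &'a Schema {
    match discovery {
        Schema::Typed(_) => discovery,
        Schema::Permissive => &prepared.schema,
    }
}
```

Because the function only reads data captured before execution, it cannot disturb linear memory between store setup and call_execute().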

4. `register_tool_info` is easy to forget (non-blocking)

Copilot flagged this and it's valid. register_tool_info() is a separate call from register_builtin_tools(), which means test harnesses and other call sites can forget it. Then WASM tool descriptions say "call tool_info" but the tool isn't registered.

Consider either folding it into a register_builtin_tools(self: &Arc<Self>) that takes Arc<Self>, or adding a register_all_builtins(self: &Arc<Self>) convenience method.

5. Minor: `WasmToolSchemas::advertised()` clones on every call

parameters_schema() is called on every LLM request for every tool. advertised() does a full serde_json::Value clone each time. Consider returning &serde_json::Value or wrapping in Arc<serde_json::Value>.
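
Both alternatives can be sketched side by side. As before, `Schema` is a `String` standing in for `serde_json::Value`, and the accessor names other than `advertised()` are assumptions made for the example.

```rust
use std::sync::Arc;

/// Stand-in for `serde_json::Value` (assumption: keeps the sketch std-only).
type Schema = String;

/// Hypothetical variant of the helper that shares schemas instead of cloning.
struct WasmToolSchemas {
    advertised: Arc<Schema>,
    discovery: Arc<Schema>,
}

impl WasmToolSchemas {
    fn new(advertised: Schema, discovery: Schema) -> Self {
        Self {
            advertised: Arc::new(advertised),
            discovery: Arc::new(discovery),
        }
    }

    /// Borrowing accessor: no allocation, caller cannot outlive `self`.
    fn advertised(&self) -> &Schema {
        &self.advertised
    }

    /// Shared accessor: a reference-count bump instead of a deep clone.
    fn advertised_shared(&self) -> Arc<Schema> {
        Arc::clone(&self.advertised)
    }

    fn discovery(&self) -> &Schema {
        &self.discovery
    }
}
```

The borrowing accessor is the cheapest option when the caller only reads the schema. The `Arc` accessor fits call sites that need an owned handle, for example when the value is moved into a request struct.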

…rcion safety

Two fixes from the review of #1086 (tool_info schema discovery):

1. Replace fragile description string mutation (append_schema_hint_if_permissive / strip_schema_hint) with composition at display time. The raw description stays clean; the tool_info hint is composed in the Tool::schema() override only when the advertised schema is permissive. This also includes the tool name and `include_schema: true` in the hint for better LLM guidance.
2. Make effective_for_coercion use the load-time extracted schema from PreparedModule instead of re-calling the WASM schema() export on the already-running instance mid-execution. This avoids potential state contamination from calling schema() after linear memory is initialized for execution.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


* fix: add tool_info schema discovery for WASM tools
* refactor: simplify WASM schema and hint state
* refactor: store tool_info registry reference as Weak


fix: add tool_info schema discovery for WASM tools

ff5575f

Copilot AI review requested due to automatic review settings March 12, 2026 22:44

Copilot started reviewing on behalf of henrypark133 March 12, 2026 22:44

Copilot AI reviewed Mar 12, 2026

src/tools/wasm/wrapper.rs

refactor: simplify WASM schema and hint state

042f353

zmanian previously approved these changes Mar 12, 2026


refactor: store tool_info registry reference as Weak

9ba2cb5

Copilot AI review requested due to automatic review settings March 12, 2026 23:27

henrypark133 dismissed zmanian’s stale review via 9ba2cb5 March 12, 2026 23:27

Copilot started reviewing on behalf of henrypark133 March 12, 2026 23:27

nickpismenkov approved these changes Mar 12, 2026


henrypark133 merged commit 8a60fa2 into staging Mar 12, 2026
12 checks passed

henrypark133 deleted the fix/wasm-tool-info-discovery branch March 12, 2026 23:30

Copilot AI reviewed Mar 12, 2026


zmanian reviewed Mar 12, 2026


zmanian mentioned this pull request Mar 13, 2026

fix(wasm): address #1086 review followups -- description hint and coercion safety #1092

Merged

6 tasks

This was referenced Mar 13, 2026

🦞 OpenClaw Ecosystem Daily Report 2026-03-13 gsscsd/big_model_radar#28

Open

🦞 OpenClaw Ecosystem Daily Bulletin 2026-03-13 compasify/agents-radar#36

Open

ironclaw-ci bot mentioned this pull request Mar 13, 2026

chore: promote staging to staging-promote/e2eb340c-22999151534 (2026-03-13 03:36 UTC) #1096

Merged

This was referenced Mar 13, 2026

chore: promote staging to staging-promote/3c619b62-23035039465 (2026-03-13 04:32 UTC) #1101

Closed

chore: promote staging to staging-promote/3c619b62-23035039465 (2026-03-13 04:35 UTC) #1102

Merged

chore: release v0.19.0 #973

Merged

henrypark133 merged 3 commits into staging from fix/wasm-tool-info-discovery


4 participants
