fix(extensions): fix lifecycle bugs + comprehensive E2E tests #1070

Merged: henrypark133 merged 12 commits into staging from feature/e2e-extensions-tools on Mar 12, 2026

@henrypark133 (Collaborator) commented Mar 12, 2026

Summary

  • Fix extension lifecycle cleanup bugs across install, activate, remove, and reinstall for WASM tools and channels.
  • Unify auth UX so OAuth and token-based setup from chat, the Extensions tab, gateway flows, and channel flows all use the same interface.
  • Preserve saved secrets across uninstall so reinstall can reuse shared credentials and auto-reactivate when appropriate.

What Changed

Backend lifecycle fixes

  • Guard activation when required auth is not configured.
  • Evict compiled WASM modules on remove so reinstall does not reuse stale runtime state.
  • Clear activation_errors, pending OAuth state, and pending auth listeners when removing WASM tools.
  • Clear stale activation_errors and delete channel artifacts when removing WASM channels.
  • Emit an auth_completed failure event when an OAuth callback arrives after the flow has expired.
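
As a rough illustration of the cleanup rules listed above, here is a minimal Python sketch of a toy in-memory manager. It is not the real ExtensionManager in src/extensions/manager.rs: the class name ToyExtensionManager, its fields, and the "needs setup" string are hypothetical and only model the intended invariants (activation is guarded on auth, remove evicts cached modules and stale state, saved secrets survive removal).

```python
from dataclasses import dataclass, field


@dataclass
class ToyExtensionManager:
    """Hypothetical in-memory model; not the real ExtensionManager."""

    installed: dict = field(default_factory=dict)          # name -> WASM bytes on disk
    module_cache: dict = field(default_factory=dict)       # name -> "compiled" module
    activation_errors: dict = field(default_factory=dict)  # name -> last activation error
    pending_oauth: dict = field(default_factory=dict)      # name -> in-progress flow state
    secrets: dict = field(default_factory=dict)            # name -> saved credential

    def install(self, name, wasm_bytes):
        self.installed[name] = wasm_bytes

    def activate(self, name, requires_auth=True):
        # Guard: refuse to activate while required auth is not configured.
        if requires_auth and name not in self.secrets:
            self.activation_errors[name] = "needs setup"
            return False
        self.activation_errors.pop(name, None)
        # Compile only on a cache miss, so a stale cache entry would win
        # over new bytes if remove() did not evict it.
        if name not in self.module_cache:
            self.module_cache[name] = ("compiled", self.installed[name])
        return True

    def remove(self, name):
        self.installed.pop(name, None)
        self.module_cache.pop(name, None)       # evict compiled module
        self.activation_errors.pop(name, None)  # clear stale errors
        self.pending_oauth.pop(name, None)      # drop pending OAuth state
        # Saved secrets are intentionally left in place so reinstall can reuse them.
```

With this model, removing an extension and reinstalling it with replacement bytes compiles the new bytes, while the saved credential is still available for reactivation.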

Web UX improvements

  • Replace the chat-scoped auth card with a global auth modal/overlay.
  • Route install, configure, and activate flows that return auth_url through the same auth experience.
  • Only dismiss auth/configure UI for the matching extension when auth_completed arrives.
  • Keep the user in a recoverable state on auth failure with consistent toasts and Extensions reload behavior.

Regression coverage

  • Add Rust regression tests for WASM remove cleanup and expired OAuth callback behavior.
  • Extend E2E coverage for:
    • removing an extension during a pending OAuth flow
    • reinstall after remove with replacement WASM bytes
    • unified auth prompt handling across install/configure/activate flows
    • matching-extension modal dismissal and auth failure UX

CX Decisions Captured In This Branch

  • remove preserves saved secrets; reinstall can reuse credentials instead of forcing users to recreate tokens or re-enter shared keys.
  • Auth, OAuth, installation, and configuration flows should feel the same regardless of whether they start from chat, the Extensions tab, gateway integrations, or channel setup.

Test Plan

  • cargo test --lib remove_wasm_
  • cargo test --lib oauth_callback_expired_flow
  • uv run --project tests/e2e python -m pytest tests/e2e/scenarios/test_extensions.py -q
  • uv run --project tests/e2e python -m pytest tests/e2e -q (120 passed, 1 skipped)
  • CI passes

Copilot AI review requested due to automatic review settings March 12, 2026 19:48
@github-actions bot added labels on Mar 12, 2026: scope: agent (Agent core: agent loop, router, scheduler); scope: channel/cli (TUI / CLI channel); scope: channel/web (Web gateway channel); scope: tool/builtin (Built-in tools); scope: tool/wasm (WASM tool sandbox); scope: tool/mcp (MCP client); scope: extensions (Extension management); scope: setup (Onboarding / setup); size: XL (500+ changed lines); risk: high (Safety, secrets, auth, or critical infrastructure); contributor: core (20+ merged PRs)
@henrypark133 henrypark133 changed the base branch from main to staging March 12, 2026 19:50
@gemini-code-assist (Contributor)

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the stability, security, and testability of the extension management system. It addresses several critical bugs in the extension lifecycle, particularly around authentication and cleanup, which previously led to stale states and blocked operations. A major refactoring of the authentication and configuration APIs provides a more unified and robust mechanism for managing extension secrets and activation. Complementing these code changes, a comprehensive suite of new end-to-end tests has been added, ensuring the reliability of the extension system and providing a solid foundation for future development. Additionally, HTTP request security has been bolstered with advanced SSRF protections, and the web UI's HTML sanitization has been upgraded to a more secure library.

Highlights

  • Extension Lifecycle Refactoring: Refactored extension authentication and configuration logic for improved clarity and robustness, unifying secret handling and activation processes.
  • Critical Bug Fixes: Implemented 5 critical bug fixes addressing issues in extension activation, stale module caching, uncleared activation errors, OAuth flow cleanup, and missing SSE notifications for expired OAuth.
  • Comprehensive E2E Testing: Introduced 53 new end-to-end (E2E) tests across 4 test files, covering comprehensive extension lifecycle, OAuth flows, tool execution, and pairing requests.
  • Enhanced E2E Test Infrastructure: Enhanced E2E test infrastructure with OAuth mock endpoints, WASM fixtures, and API helpers to facilitate robust testing.
  • Improved HTTP Security: Improved security of HTTP requests by implementing per-request DNS pinning and stricter SSRF protections, including checks for IPv4-mapped IPv6 addresses.
  • Web UI HTML Sanitization Upgrade: Replaced regex-based HTML sanitization in the web UI with DOMPurify for enhanced XSS prevention and improved security.
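
The IPv4-mapped IPv6 check mentioned in the HTTP security highlight can be sketched in Python as follows. This is an illustration only, not the Rust code in src/tools/builtin/http.rs; the function name is_blocked_address and the exact set of blocked ranges are assumptions.

```python
import ipaddress


def is_blocked_address(raw: str) -> bool:
    """Return True if `raw` is an address an SSRF filter would typically refuse.

    Hypothetical helper: IPv4-mapped IPv6 addresses such as ::ffff:10.0.0.1 are
    unwrapped to their IPv4 form first, so they cannot bypass the IPv4 checks.
    """
    addr = ipaddress.ip_address(raw)
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_unspecified
        or addr.is_multicast
    )
```

Unwrapping before classification matters because a filter that only inspects the IPv6 form may treat every mapped address the same way, regardless of the embedded IPv4 address.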

Changelog

  • channels-src/telegram/telegram.capabilities.json
    • Added a validation endpoint for Telegram bot tokens.
  • src/agent/thread_ops.rs
    • Refactored extension authentication to use configure_token and handle validation errors.
  • src/channels/web/handlers/chat.rs
    • Updated chat authentication handler to use configure_token and manage validation failures.
  • src/channels/web/server.rs
    • Added an SSE notification for expired OAuth flows to update the UI.
    • Updated chat authentication handler to use configure_token and manage validation failures.
    • Refactored extension install, activate, and setup handlers to use the new configure and auth methods.
  • src/channels/web/static/app.js
    • Replaced custom HTML sanitization with DOMPurify for enhanced security.
  • src/channels/web/static/index.html
    • Included the DOMPurify library via CDN.
  • src/channels/web/ws.rs
    • Updated WebSocket authentication to use configure_token and broadcast AuthRequired on validation errors.
  • src/cli/mcp.rs
    • Updated MCP client creation to use process manager and factory for transport.
  • src/extensions/manager.rs
    • Removed the SetupResult struct and introduced ConfigureResult.
    • Refactored the auth method to be a read-only status check, removing the token parameter.
    • Implemented cleanup for in-progress OAuth flows and eviction of WASM module cache/activation errors during extension remove().
    • Renamed save_setup_secrets to configure, centralizing secret handling, validation, auto-generation, and activation dispatch.
    • Added a new configure_token method for single-token configuration.
    • Introduced a check in activate_wasm_tool to prevent activation if setup secrets are missing.
  • src/extensions/mod.rs
    • Defined a new ConfigureResult struct and added a ValidationFailed error variant for token validation.
  • src/setup/prompts.rs
    • Added event draining to prevent stale key events in terminal prompts.
  • src/tools/builtin/extension_tools.rs
    • Updated extension tool authentication and activation to use the new read-only auth method.
  • src/tools/builtin/http.rs
    • Implemented per-request DNS pinning for HTTP client.
    • Added redirect following for simple GET requests with per-hop validation.
    • Improved IP validation to correctly identify IPv4-mapped IPv6 addresses in private ranges.
  • src/tools/mcp/client.rs
    • Added an idempotency flag for MCP client initialization and streamlined initialization calls.
    • Modified send_request to handle JSON-RPC notifications as fire-and-forget.
  • src/tools/mcp/stdio_transport.rs
    • Modified stdio transport to handle JSON-RPC notifications as fire-and-forget.
  • src/tools/mcp/unix_transport.rs
    • Modified Unix socket transport to handle JSON-RPC notifications as fire-and-forget.
  • src/tools/wasm/capabilities_schema.rs
    • Added description and parameters fields to WASM capabilities schema.
  • src/tools/wasm/loader.rs
    • Enhanced WASM tool loader to parse and validate description and parameter schemas from capabilities files.
  • src/tools/wasm/runtime.rs
    • Updated comments to clarify that tool descriptions and schemas are primarily loaded from capabilities files.
  • tests/e2e/conftest.py
    • Added WASM build symlink support for worktree compatibility.
    • Added WASM tool/channel directories and mock OAuth configuration for E2E tests.
  • tests/e2e/helpers.py
    • Added API helper functions for authenticated GET and POST requests in E2E tests.
  • tests/e2e/mock_llm.py
    • Enhanced mock LLM server with tool call handling, tool results, streaming, and OAuth exchange.
  • tests/e2e/scenarios/test_extension_oauth.py
    • Added new E2E tests for the extension OAuth flow.
  • tests/e2e/scenarios/test_pairing.py
    • Added new E2E tests for the extension pairing request lifecycle.
  • tests/e2e/scenarios/test_tool_execution.py
    • Added new E2E tests for tool execution via chat API.
  • tests/e2e/scenarios/test_wasm_lifecycle.py
    • Added new comprehensive E2E tests for WASM extension lifecycle.

Activity

  • The author, henrypark133, implemented 5 bug fixes related to extension lifecycle: missing auth guard, stale module cache, uncleared activation errors, uncleaned OAuth flows, and missing SSE notification for expired OAuth.
  • The author added 53 new E2E tests across 4 files, covering WASM lifecycle, OAuth, tool execution, and pairing.
  • Test infrastructure was enhanced with OAuth mock endpoints, WASM fixtures, and API helpers.
  • The author confirmed all 44 E2E tests pass (114 passed, 1 skipped in full regression).
  • cargo clippy --all --all-features resulted in zero warnings.
  • cargo fmt reported a clean format.

@gemini-code-assist bot (Contributor) left a comment

Code Review

This pull request provides several important bug fixes for the extension lifecycle, addressing issues like missing auth guards, stale caches, and incomplete cleanup on removal. The changes are well-implemented and logical. The addition of a comprehensive suite of 53 new E2E tests is a fantastic improvement, ensuring these lifecycle flows are robust and preventing future regressions. The test infrastructure enhancements are also well done.

Copilot AI (Contributor) left a comment

Pull request overview

This PR hardens extension lifecycle behavior (WASM + OAuth + MCP) based on issues found during end-to-end testing, and adds new E2E scenarios + supporting test infrastructure to prevent regressions.

Changes:

  • Unifies extension secret configuration via a new ExtensionManager::configure() / configure_token() flow and tightens activation/auth lifecycle behavior (cleanup on remove, auth gating, clearer error signaling).
  • Improves WASM and MCP runtime behaviors (WASM capabilities description/schema overrides; MCP JSON-RPC notification handling and initialization idempotency guard).
  • Adds comprehensive E2E tests and test harness support (OAuth exchange mock endpoint, WASM lifecycle scenarios, tool execution, pairing).
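
To make the MCP notification change concrete: in JSON-RPC 2.0 a request object without an "id" member is a notification, and the server sends no response to it, so a transport that waits for one will hang. The Python sketch below models that rule; ToyTransport and its fields are hypothetical and do not mirror the Rust transports in src/tools/mcp/.

```python
import json


def is_notification(message: dict) -> bool:
    """JSON-RPC 2.0: a request object without an "id" member is a notification."""
    return "id" not in message


class ToyTransport:
    """Hypothetical transport that records writes and tracks awaited ids."""

    def __init__(self):
        self.sent = []        # serialized messages written to the peer
        self.waiting = set()  # request ids that still expect a response

    def send(self, message: dict):
        self.sent.append(json.dumps(message))
        if is_notification(message):
            # Fire-and-forget: no response will ever arrive, so do not wait.
            return None
        self.waiting.add(message["id"])
        return message["id"]
```

A request such as tools/list registers its id as awaiting a response, while notifications/initialized is written and immediately returns.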

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/e2e/scenarios/test_wasm_lifecycle.py | New WASM tool lifecycle E2E coverage (registry→install→setup→activate→remove→reinstall + UI check). |
| tests/e2e/scenarios/test_tool_execution.py | New E2E tests for tool-calling through the chat UI using the mock LLM. |
| tests/e2e/scenarios/test_pairing.py | New E2E tests for pairing API auth/error handling. |
| tests/e2e/scenarios/test_extension_oauth.py | New E2E tests for OAuth round-trip using gateway callback mode + mock exchange endpoint. |
| tests/e2e/mock_llm.py | Enhances mock LLM with tool-calling + OAuth exchange endpoint + additional routing compatibility. |
| tests/e2e/helpers.py | Adds authenticated API helper functions for E2E scenarios. |
| tests/e2e/conftest.py | E2E infra: WASM dirs, worktree symlink support for build artifacts, OAuth env wiring. |
| src/tools/wasm/runtime.rs | Clarifies tool description/schema extraction as fallback-only (prefer capabilities sidecar). |
| src/tools/wasm/loader.rs | Loads tool description + parameters schema from capabilities.json with validation + warnings. |
| src/tools/wasm/capabilities_schema.rs | Extends capabilities schema with description + parameters and adds unit tests. |
| src/tools/mcp/unix_transport.rs | Avoids waiting for responses to JSON-RPC notifications (id-less requests). |
| src/tools/mcp/stdio_transport.rs | Same notification handling fix for stdio transport. |
| src/tools/mcp/client.rs | Adds local initialization guard, always initializes before list/call, and updates tests accordingly. |
| src/tools/builtin/http.rs | Refactors SSRF/DNS handling to single-resolution + DNS pinning; adds redirect handling constraints and tests. |
| src/tools/builtin/extension_tools.rs | Updates extension tool auth calls to the new manager API signature. |
| src/setup/prompts.rs | Improves terminal prompt robustness on Windows (drain queued events; Press-only key handling). |
| src/extensions/mod.rs | Introduces ConfigureResult and a distinct ValidationFailed error variant. |
| src/extensions/manager.rs | Major lifecycle updates: auth becomes read-only, new configure/configure_token entrypoints, remove cleanup, activation guard, token validation via validation_endpoint. |
| src/cli/mcp.rs | Uses MCP client factory for transport dispatch in mcp test. |
| src/channels/web/ws.rs | Switches WS auth flow to configure_token() and handles validation errors distinctly. |
| src/channels/web/static/index.html | Adds DOMPurify dependency for safer markdown rendering. |
| src/channels/web/static/app.js | Replaces regex sanitizer with DOMPurify-based sanitizer. |
| src/channels/web/server.rs | Sends SSE notification on expired OAuth flows; switches setup submit to configure() and auth checks to new API. |
| src/channels/web/handlers/chat.rs | Updates chat auth token handler to use configure_token() and surface validation failures. |
| src/agent/thread_ops.rs | Updates auth-mode token handling to configure_token() and aligns status signaling. |
| channels-src/telegram/telegram.capabilities.json | Adds validation_endpoint for Telegram token validation. |

Comments suppressed due to low confidence (1)

src/extensions/manager.rs:3593

  • Token validation uses validate_fetch_url() (which only validates the URL string) but then performs the request with a plain reqwest::Client. This reintroduces a DNS-rebinding window where an attacker-controlled domain could resolve to a public IP during validation and then to a private IP at request time. Consider reusing the same DNS-pinning approach as the HTTP tool (single resolution + SSRF check + resolve_to_addrs) or making the existing safe-fetch helper reusable here.
                let url = endpoint_template.replace(&format!("{{{}}}", secret_def.name), &encoded);
                // SSRF defense: block private IPs, localhost, cloud metadata endpoints
                crate::tools::builtin::skill_tools::validate_fetch_url(&url)
                    .map_err(|e| ExtensionError::Other(format!("SSRF blocked: {}", e)))?;
                let resp = reqwest::Client::builder()
                    .timeout(std::time::Duration::from_secs(10))
                    .build()
                    .map_err(|e| ExtensionError::Other(e.to_string()))?
                    .get(&url)
                    .send()
                    .await

@github-actions bot added the risk: medium label (Business logic, config, or moderate-risk modules) and removed the risk: high label (Safety, secrets, auth, or critical infrastructure) on Mar 12, 2026
Copilot AI review requested due to automatic review settings March 12, 2026 20:58
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.


henrypark133 and others added 8 commits March 12, 2026 14:14
Refactors the extension lifecycle to eliminate the divergence between
chat and gateway paths that caused Telegram setup via chat to fail
(missing webhook secret auto-generation, no token validation).

Key changes:
- Rename save_setup_secrets() → configure(): single entrypoint for
  providing secrets to any extension (WasmChannel, WasmTool, MCP).
  Validates, stores, auto-generates, and activates.
- Add configure_token(): convenience wrapper for single-token callers
  (chat auth card, WebSocket, agent auth mode).
- Refactor auth() to pure status check: remove token parameter,
  delete token-storing branches from auth_mcp/auth_wasm_tool,
  rename auth_wasm_channel → auth_wasm_channel_status.
- Add ConfigureResult/MissingSecret types for structured responses.
- Replace hardcoded Telegram token validation with generic
  validation_endpoint from capabilities.json.
- Update all callers (9 files) to use the new interface.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Replace brittle msg.contains("Invalid token") checks with a proper
ExtensionError::ValidationFailed variant. configure() now returns
this variant for token validation failures, and callers match on it
directly instead of parsing error message strings.
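
A Python analogue of this change, for readers less familiar with Rust enums: a dedicated exception subclass plays the role of the ExtensionError::ValidationFailed variant, so callers branch on the type rather than on message text. The configure and handle functions, the "valid-" prefix rule, and the reachable flag are hypothetical stand-ins for illustration.

```python
class ExtensionError(Exception):
    """Base error, standing in for the generic ExtensionError::Other case."""


class ValidationFailed(ExtensionError):
    """Typed error for rejected tokens, standing in for ValidationFailed."""


def configure(token: str, reachable: bool = True) -> str:
    # Hypothetical validation rule, used only to drive the example.
    if not reachable:
        # Transport problems are not validation failures.
        raise ExtensionError("validation endpoint unreachable")
    if not token.startswith("valid-"):
        raise ValidationFailed("Invalid token")
    return "activated"


def handle(token: str, reachable: bool = True) -> str:
    try:
        return configure(token, reachable)
    except ValidationFailed:
        # Matched by type, so rewording the message cannot break this branch.
        return "auth_required"
    except ExtensionError:
        return "error"
```

Matching on the type keeps the re-prompt path stable even if the error message is later reworded or localized.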

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

…election, WS auth

1. SSRF: call validate_fetch_url() before validation_endpoint HTTP request
2. Transport errors map to ExtensionError::Other (not ValidationFailed)
3. configure_token() picks first *missing* secret, not first non-optional
4. WebSocket error path re-emits AuthRequired on ValidationFailed
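
Item 3 can be pictured with a small Python sketch. It assumes secrets are declared in order and that a secret counts as missing when its name is not yet stored; the function name pick_secret_for_token and the secret names bot_token and webhook_secret are hypothetical.

```python
def pick_secret_for_token(secret_defs, stored_names):
    """Return the name of the first declared secret that is not stored yet.

    Returns None when every declared secret already has a value.
    """
    for definition in secret_defs:
        if definition["name"] not in stored_names:
            return definition["name"]
    return None


# Usage: a multi-secret channel can be configured one token at a time.
defs = [{"name": "bot_token"}, {"name": "webhook_secret"}]
stored = set()
first = pick_secret_for_token(defs, stored)   # first missing secret
stored.add(first)
second = pick_secret_for_token(defs, stored)  # moves on to the next missing one
```

Selecting by "missing" rather than "first non-optional" means a second token submission fills the next gap instead of overwriting the secret that was just saved.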

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

- test_configure_token_picks_first_missing_secret: verifies multi-secret
  channels can be configured one secret at a time (commit ce106f4)
- test_auth_is_read_only_for_wasm_channel: verifies auth() has no side
  effects and doesn't store secrets (commit 47f8eb6)
- test_validation_failed_is_distinct_error_variant: verifies the typed
  error variant can be pattern-matched (commit a318161)

Co-Authored-By: Claude Opus 4.6 <[email protected]>

…onsolidation

- Fix configure() fallthrough bug: dispatch activation by ExtensionKind
  instead of unconditionally calling activate_wasm_channel() for all
  non-WasmTool types (MCP servers and channel relays now use their
  correct activation methods)
- Remove dead MissingSecret struct and missing_secrets field (never
  populated, flagged by reviewer)
- Consolidate capabilities file parsing in configure(): parse once
  and reuse for allowed names, validation_endpoint, and auto-generation
- Fix auth() doc comment: note MCP OAuth side effects
- Fix stale save_setup_secrets reference in server.rs comment
- Add regression test for activation dispatch bug

Co-Authored-By: Claude Opus 4.6 <[email protected]>

Bug fixes in src/extensions/manager.rs:
- Add auth guard to activate_wasm_tool() blocking activation when secrets
  are missing (NeedsSetup), matching activate_wasm_channel() behavior
- Evict WasmToolRuntime module cache on remove() so reinstall uses fresh binary
- Clear activation_errors on remove() for both WasmTool and WasmChannel
- Clean up in-progress OAuth flows on remove() (abort TCP listener, purge
  pending flow entries)

Bug fix in src/channels/web/server.rs:
- Broadcast AuthCompleted SSE event on expired OAuth callback so web UI
  doesn't stay stuck showing "auth required"

E2E test coverage:
- test_wasm_lifecycle.py: 35 tests covering install/configure/activate/
  remove/reinstall lifecycle with regression tests for bugs 1 and 3
- test_extension_oauth.py: 9 tests covering OAuth round-trip flow
- test_tool_execution.py: 5 tests for tool invocation via chat
- test_pairing.py: 4 tests for pairing request lifecycle
- Enhanced conftest.py, helpers.py, mock_llm.py for OAuth mock support

[skip-regression-check]

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@henrypark133 force-pushed the feature/e2e-extensions-tools branch from 109b3b1 to efcdd54 on March 12, 2026 21:23
Copilot AI review requested due to automatic review settings March 12, 2026 21:32
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 5 comments.


Comment on lines 43 to 47
# Temp directory for WASM tools (pre-populated from dev builds)
_WASM_TOOLS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-tools-")
_WASM_CHANNELS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-channels-")


Copilot AI, Mar 12, 2026

The comment says the WASM tools temp dir is "pre-populated from dev builds", but the fixtures intentionally keep WASM_TOOLS_DIR empty at boot and rely on the install pipeline to write files there. Please update this comment to reflect the actual behavior so future changes don’t misinterpret the intent.

Suggested change
- # Temp directory for WASM tools (pre-populated from dev builds)
- _WASM_TOOLS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-tools-")
- _WASM_CHANNELS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-channels-")
+ # Temp directory for WASM tools; starts empty and is populated by the install
+ # pipeline during tests (fixtures do not pre-populate from dev builds)
+ _WASM_TOOLS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-tools-")
+ _WASM_CHANNELS_TMPDIR = tempfile.TemporaryDirectory(prefix="ironclaw-e2e-wasm-channels-")

Comment on lines +3071 to +3080
let ext_mgr = Arc::new(ExtensionManager::new(
    mcp_sm,
    Arc::new(crate::tools::mcp::process::McpProcessManager::new()),
    secrets.clone(),
    tool_registry,
    None,
    None,
    std::path::PathBuf::from("/tmp/wasm_tools"),
    std::path::PathBuf::from("/tmp/wasm_channels"),
    None,
Copilot AI, Mar 12, 2026

This test hard-codes /tmp/wasm_tools and /tmp/wasm_channels for the ExtensionManager directories. That can be non-portable (e.g., Windows) and can also create cross-test interference if multiple test processes run concurrently. Prefer using a tempfile::tempdir() and passing those paths into ExtensionManager::new.

Comment on lines +3106 to +3109
created_at: std::time::Instant::now()
.checked_sub(std::time::Duration::from_secs(600))
.expect("System uptime is too low to run expired flow test"),
};
Copilot AI, Mar 12, 2026

created_at is set using Instant::now().checked_sub(Duration::from_secs(600)).expect(...), which will panic on systems with very low monotonic uptime (e.g., freshly-booted environments). To avoid a potential CI-only flake, set created_at relative to oauth_defaults::OAUTH_FLOW_EXPIRY (plus a small delta) and handle the None case by skipping the test or choosing a smaller subtraction that cannot underflow in practice.

@github-actions bot added the scope: ci label (CI/CD workflows) on Mar 12, 2026
Copilot AI review requested due to automatic review settings March 12, 2026 22:51
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.


@nickpismenkov (Contributor) left a comment

LGTM

@henrypark133 henrypark133 merged commit 9fbdd42 into staging Mar 12, 2026
15 checks passed
@henrypark133 henrypark133 deleted the feature/e2e-extensions-tools branch March 12, 2026 23:36
bkutasi pushed a commit to bkutasi/ironclaw that referenced this pull request Mar 28, 2026
…#1070)

* feat(extensions): unify auth and configure into single entrypoint

* fix: use ValidationFailed error variant instead of string matching

* fix: address review — SSRF protection, error typing, missing-secret selection, WS auth

* test: add regression tests for extension lifecycle refactoring

* fix: address review comments — activation dispatch, dead code, caps consolidation

* fix(extensions): fix 5 extension lifecycle bugs found during E2E testing

* fix(web): unify extension auth UX and add lifecycle regressions

* test: fix pending oauth flow fixtures after rebase

* test(e2e): fix playwright route ordering for extensions reloads

* test: address e2e review follow-ups

* test: address remaining PR review comments

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
Labels

contributor: core (20+ merged PRs); risk: medium (Business logic, config, or moderate-risk modules); scope: agent (Agent core: agent loop, router, scheduler); scope: channel/cli (TUI / CLI channel); scope: channel/web (Web gateway channel); scope: ci (CI/CD workflows); scope: extensions (Extension management); scope: setup (Onboarding / setup); scope: tool/builtin (Built-in tools); scope: tool/mcp (MCP client); scope: tool/wasm (WASM tool sandbox); size: XL (500+ changed lines)
