RuFlo Web UI (flo.ruv.io) — capabilities, roadmap, and known limits #1689

@ruvnet

Overview

The RuFlo Web UI (Beta) shipped in PR #1687 — a multi-model chat surface deployed to Cloud Run that talks to the same MCP backbone as the CLI, with no install required.

Live URLs (all map to one Cloud Run service): `flo.ruv.io`, `ruflo.ruv.io`, `ruvocal.ruv.io`

Source: `ruflo/src/ruvocal/` (SvelteKit fork of HF chat-ui v0.20.0, with WASM-MCP layer from `ruvnet/RuVector/ui/ruvocal`).

ADR: ruflo/docs/adr/ADR-033-RUVOCAL-WASM-MCP-INTEGRATION.md


✅ Shipped capabilities

Models

  • Default: Claude Sonnet 4.6 via OpenRouter (best tool-calling reliability)
  • Available: Claude Sonnet 4.6, Claude Opus 4.7, Claude Haiku 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash, GPT-4o
  • All flagged `supportsTools: true` and `multimodal: true`
  • Routing: `OPENAI_BASE_URL=https://openrouter.ai/api/v1` with `OPENAI_API_KEY` mapped from the `OPENROUTER_API_KEY` Secret Manager secret

MCP Tooling (~210 tools across 6 servers)

| Server | Source | Tools | Notes |
|---|---|---|---|
| Core | built-in (`mcp-bridge`) | 3 | `search`, `web_research`, `guidance` (Gemini-grounded search) |
| Intelligence | `ruvector mcp start` | 49 | Pattern learning, routing, AST analysis, security scan, RAG |
| Agents | `ruflo mcp start` | 50 | `agent_spawn`, `swarm_init`, `hive-mind_*`, `task_create` |
| Memory | `ruflo mcp start` | 35 | `memory_store`, `memory_search`, `memory_retrieve`, AgentDB |
| DevTools | `ruflo mcp start` | 73 | `system_status`, `performance_`, `analyze_diff`, `github_*`, `terminal_execute` |
| WASM Gallery | in-browser `rvagent_wasm.{js,wasm}` (588 KB) | 18 | IndexedDB-persisted templates, no server roundtrip |

The bridge is deployed as the Cloud Run service `mcp-bridge` with `min-instances=1` (keeps backends warm) and a `120s` MCP-init timeout for cold-start tolerance.

Parallel tool calling (verified)

Server logs from the deployed instance:
```
[mcp] tools executed; toolMsgCount: 4 ← single Claude response → 4 parallel tools
[mcp] tools executed; toolMsgCount: 6 ← single Claude response → 6 parallel tools
```

The chat-ui dispatches all `tool_calls` from a single response via `Promise.all` (`src/lib/server/textGeneration/mcp/toolInvocation.ts`). Each tool's result streams back as a separate card in the UI.
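
The dispatch pattern described above can be sketched as follows. This is a minimal illustration, not the code in `toolInvocation.ts`: the `ToolCall`, `ToolResult`, and `ToolExecutor` shapes and the `runToolCalls` name are assumptions made for the example. It also catches errors per call, which the real implementation may or may not do.

```typescript
// Hypothetical shapes; the real types in toolInvocation.ts may differ.
interface ToolCall {
  id: string;
  name: string;
  arguments: Record<string, unknown>;
}

interface ToolResult {
  id: string;
  name: string;
  ok: boolean;
  output: string;
}

type ToolExecutor = (call: ToolCall) => Promise<string>;

// Start every tool call from one model response at once and wait for all of them.
// Each call handles its own failure so a single error does not reject the batch.
async function runToolCalls(
  calls: ToolCall[],
  execute: ToolExecutor
): Promise<ToolResult[]> {
  return Promise.all(
    calls.map(async (call): Promise<ToolResult> => {
      try {
        const output = await execute(call);
        return { id: call.id, name: call.name, ok: true, output };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return { id: call.id, name: call.name, ok: false, output: message };
      }
    })
  );
}
```

`Promise.all` preserves input order in its result array, so results can be matched back to their `tool_calls` by index or by `id` regardless of which tool finishes first.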

Persistent memory

  • AgentDB (sql.js + HNSW) for cross-session recall
  • 384-dim ONNX all-MiniLM-L6-v2 embeddings
  • Surfaced via `ruflo__memory_*` and `ruvector__hooks_remember/recall` tools

UX additions

  • Help icon in sidebar → "RuFlo Capabilities" modal listing all 6 tool groups with per-tool descriptions and quick-start
  • 8 RuFlo-themed example prompts on the welcome screen (replacing HF defaults): swarm, memory, route, analyze diff, system health, WASM gallery, GOAP, neural training
  • RuFlo branding throughout (`PUBLIC_APP_NAME=RuFlo`)
  • Chat input "MCP (6)" pill showing all enabled servers; expandable per-server tool toggles

Infrastructure

  • Embedded MongoDB via `INCLUDE_DB=true` (multi-stage Dockerfile copies `mongod` from `mongo:7`); ephemeral on cold starts but acceptable for chat session storage
  • Custom domains via Cloudflare DNS (CNAME unproxied so Google manages the cert): `flo.ruv.io`, `ruflo.ruv.io`, `ruvocal.ruv.io`
  • Cloud Build pipeline with `DOCKER_BUILDKIT=1` for `COPY --link` syntax
  • `.gcloudignore` prevents `.gitignore`'s `models/*` rule from stripping route directories during upload

🛠️ Roadmap / open work

Web Workers for WASM MCP (P1)

Why: `src/lib/wasm/index.ts` (1,213 lines) and `src/lib/stores/wasmMcp.ts` (454 lines) load and execute the WASM MCP server on the main thread. The 588 KB `rvagent_wasm_bg.wasm` compile + init takes ~300ms and freezes the UI on first load.

Scope:

  • New `src/lib/wasm/wasm.worker.ts` that owns the WASM module and IndexedDB persistence
  • `postMessage` RPC bridge replacing direct calls in `wasmMcp.ts`
  • Async wrappers for the 13 currently-exported functions (`callMcp`, `executeTool`, `listTools`, `listGalleryTemplates`, `searchGalleryTemplates`, `loadGalleryTemplate`, etc.)
  • Type-safe message schema (request id, method, params / result, error)
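
One possible shape for the message schema and RPC bridge in the scope above is sketched here. Everything in it is an assumption for illustration: the `RpcRequest`/`RpcResponse` types, the `RpcTransport` interface, and `createRpcClient` do not exist in the repo, and only three of the exported function names are included as methods.

```typescript
// Hypothetical schema; method names mirror a few of the exports listed in the scope.
type RpcMethod = "callMcp" | "executeTool" | "listTools";

interface RpcRequest {
  id: number;
  method: RpcMethod;
  params: unknown[];
}

type RpcResponse =
  | { id: number; ok: true; result: unknown }
  | { id: number; ok: false; error: string };

// Minimal transport so the same client works with a real Worker or a test double.
interface RpcTransport {
  post(message: RpcRequest): void;
  onMessage(handler: (message: RpcResponse) => void): void;
}

function createRpcClient(transport: RpcTransport) {
  let nextId = 1;
  const pending = new Map<
    number,
    { resolve: (value: unknown) => void; reject: (reason: Error) => void }
  >();

  // Match each response to its pending request by id.
  transport.onMessage((message) => {
    const entry = pending.get(message.id);
    if (!entry) return; // unknown or already-settled id
    pending.delete(message.id);
    if (message.ok) entry.resolve(message.result);
    else entry.reject(new Error(message.error));
  });

  // Async wrapper: send a request and resolve when the matching response arrives.
  return function call(method: RpcMethod, ...params: unknown[]): Promise<unknown> {
    const id = nextId++;
    return new Promise((resolve, reject) => {
      pending.set(id, { resolve, reject });
      transport.post({ id, method, params });
    });
  };
}
```

In the browser, the transport would be a thin wrapper over `worker.postMessage(message)` and `worker.addEventListener("message", (event) => handler(event.data))`. Keeping the transport abstract also lets the bridge be unit-tested without spawning a worker.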

Estimate: 4–8 hours including svelte-check and a smoke test that compares wall-clock first-paint with/without the worker.

Persistent MongoDB (P1)

Current: `INCLUDE_DB=true` runs `mongod` inside the Cloud Run container. Data is lost on cold starts.

Options (in increasing complexity):

  1. MongoDB Atlas M0 (free tier) — provision externally, store URI as `ruvocal-mongodb-url` secret, stop using `INCLUDE_DB`. Cleanest for a beta.
  2. Cloud Run multi-container with mongo sidecar + Cloud Storage volume — keep self-contained, gain persistence between cold starts. Needs the GA "Cloud Storage volume mounts" feature and a service-YAML rewrite.
  3. MongoDB on Compute Engine — full control, ops overhead.
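
For option 1, the switch from the embedded database to an external one could be guarded at startup roughly as below. This is a sketch under assumptions: it assumes the external URI reaches the app as `MONGODB_URL` (for example, injected from the `ruvocal-mongodb-url` secret) and that the embedded `mongod` listens on `localhost:27017`. `resolveDbConfig` is a hypothetical helper, not an existing function in the fork.

```typescript
type Env = Record<string, string | undefined>;

interface DbConfig {
  url: string;
  persistent: boolean; // false means data is lost on cold starts
}

// Assumption: MONGODB_URL carries the external URI and takes precedence
// over the embedded mongod enabled by INCLUDE_DB=true.
function resolveDbConfig(env: Env): DbConfig {
  const external = env.MONGODB_URL?.trim();
  if (external) {
    return { url: external, persistent: true };
  }
  if (env.INCLUDE_DB === "true") {
    // Assumed default address of the in-container mongod.
    return { url: "mongodb://localhost:27017", persistent: false };
  }
  throw new Error("No database configured: set MONGODB_URL or INCLUDE_DB=true");
}
```

Exposing a `persistent` flag makes it easy to log a warning, or surface a banner in the UI, while the deployment is still running on the ephemeral embedded database.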

Authentication (P1)

Current: `AUTOMATIC_LOGIN=false`, `OPENID_CLIENT_ID=""` — anonymous sessions only. Conversations persist by `sessionId` cookie.

Wanted: Google OAuth via OpenID Connect so we can show "Settings → Application" admin diagnostics, attach memory namespaces to a user, and enable per-user usage caps.

UI: parallel tool-call visualization (P2)

The server already runs tools in parallel. The UI renders each tool-call card as it lands but doesn't visually group a single "step" the way Claude Code's task panel does (with collapsed thumbnails, durations, and parallel lanes). Worth a UX pass on `ChatMessage.svelte` and `ToolUpdate.svelte`.
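
As a starting point for that UX pass, the grouping itself could look something like the sketch below. It assumes each tool-call record carries a `responseId` identifying the model response that emitted it, plus start and finish timestamps. Those fields, and the `groupIntoSteps` helper, are hypothetical and are not taken from `ChatMessage.svelte` or `ToolUpdate.svelte`.

```typescript
interface ToolCallRecord {
  responseId: string; // hypothetical: id of the model response that emitted the call
  name: string;
  startedAt: number; // milliseconds
  finishedAt: number; // milliseconds
}

interface ToolStep {
  responseId: string;
  calls: ToolCallRecord[]; // one lane per call when rendered
  wallClockMs: number; // earliest start to latest finish within the step
}

// Group calls emitted by the same model response into one visual "step".
function groupIntoSteps(records: ToolCallRecord[]): ToolStep[] {
  const byResponse = new Map<string, ToolCallRecord[]>();
  for (const record of records) {
    const bucket = byResponse.get(record.responseId) ?? [];
    bucket.push(record);
    byResponse.set(record.responseId, bucket);
  }
  return Array.from(byResponse.entries()).map(([responseId, calls]) => {
    const start = Math.min(...calls.map((c) => c.startedAt));
    const end = Math.max(...calls.map((c) => c.finishedAt));
    return { responseId, calls, wallClockMs: end - start };
  });
}
```

Because the calls in a step run in parallel, the step's wall-clock time is the span from the earliest start to the latest finish rather than the sum of the individual durations.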

LLM router / Omni mode (P3)

`LLM_ROUTER_ARCH_BASE_URL` is empty so the auto-routing alias model isn't created. Re-enable when a smart router (Arch-Router) is hosted somewhere reachable.

Bridge backend reliability (P2)

  • The mcp-bridge cold-start issue (45–60s for ruflo / ruvector to boot) is mitigated by `min-instances=1` but not eliminated. Investigate whether the npm globals can be pre-warmed at image-build time more aggressively, or switch to npx with a global cache mount.
  • Some optional groups (`security`, `browser`, `neural`, `agentic-flow`, `claude-code`, `gemini`, `codex`) ship in the Dockerfile but are gated by `MCP_GROUP_*=true` env vars and not enabled in the deploy. Audit and turn on the safe ones.
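
For the audit of optional groups, a small helper that reports which groups a given environment enables may be useful. The sketch assumes the env var name is `MCP_GROUP_` followed by the upper-cased group name with hyphens mapped to underscores (for example `MCP_GROUP_AGENTIC_FLOW`). That mapping is a guess at the bridge's convention, and `enabledGroups` is a hypothetical function.

```typescript
// Group names as listed in this issue.
const OPTIONAL_GROUPS = [
  "security",
  "browser",
  "neural",
  "agentic-flow",
  "claude-code",
  "gemini",
  "codex",
] as const;

// Assumed naming convention: MCP_GROUP_<NAME>, hyphens replaced by underscores.
function envKey(group: string): string {
  return "MCP_GROUP_" + group.toUpperCase().replace(/-/g, "_");
}

// Return the optional groups whose env var is exactly the string "true".
function enabledGroups(env: Record<string, string | undefined>): string[] {
  return OPTIONAL_GROUPS.filter((group) => env[envKey(group)] === "true");
}
```

Running this against the deployed service's environment would give a quick list of which groups are on before and after the audit.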

Documentation (P2)

  • README has the new section + capability table row (commit `5cb4ca4f9`).
  • `ADR-033` is comprehensive but doesn't cover the OpenRouter provider switch as a follow-up; appendix needed.
  • No user-facing docs page yet on `docs.ruv.io` or similar.

🐛 Known issues

  1. Gemini 2.5 Flash sometimes opts out of tool calls and returns empty content as the conversation grows. The logs show `final answer emitted (no tool_calls)` followed by `No output generated after streaming`. Workaround: select Claude Sonnet 4.6 (now the default) or Gemini 2.5 Pro.
  2. `/api/v2/debug/config` returns 403 for non-admin users. By design — that endpoint is admin-gated. Only visible from Settings → Application. Cosmetic console error.
  3. First-time visitors see a 30+ s cold start when Cloud Run has scaled a service to zero. The bridge is pinned at `min-instances=1`, but the chat-ui side has `min-instances=0`; consider raising it to 1 if user volume justifies it.
  4. MCP "Sorry, something went wrong" error on the Gemini OpenAI-compatible endpoint when tool results are sent back as conversation history (`Error: 400 status code (no body)`). Worked around by switching the default to Claude.
  5. WASM gallery is currently the mock implementation (`[WASM] Using mock rvagent-wasm implementation`). The real `rvagent_wasm.{js,wasm}` ships but the mock fallback runs because the actual init script isn't wired into `app.html`. Tracking separately.
  6. Help modal still says "default: Gemini 2.5 Flash" in the Quick Start text. One-line copy fix.

📂 Source layout reference

```
ruflo/src/ruvocal/ # SvelteKit fork
├── src/
│ ├── lib/
│ │ ├── wasm/ # WASM MCP loader, IDB, tests
│ │ ├── components/wasm/ # GalleryPanel
│ │ ├── components/RufloHelpModal.svelte
│ │ ├── stores/wasmMcp.ts # Svelte store (main-thread for now)
│ │ ├── stores/mcpServers.ts # Multi-server registry
│ │ └── constants/{mcpExamples,routerExamples}.ts # RuFlo prompts
│ └── routes/... # SvelteKit routes
├── mcp-bridge/ # Express + spawn(npx ruflo|ruvector)
│ ├── index.js # Routes /mcp/, /mcp-servers
│ ├── mcp-stdio-kernel.js # Kept for future RVF tunnel
│ └── cloudbuild.yaml
├── static/wasm/rvagent_wasm.{js,wasm}
├── cloudbuild.yaml # ruvocal image build + Cloud Run deploy
├── .gcloudignore # CRITICAL — see ADR-033
└── Dockerfile # Multi-stage, INCLUDE_DB support
```


Acceptance criteria for closing this issue

  • Web Workers wrap WASM MCP (P1)
  • Persistent MongoDB (Atlas or Cloud Run volume) (P1)
  • Google OAuth login enabled (P1)
  • Parallel tool-call visualization parity with Claude Code (P2)
  • Optional MCP groups (`security`, `browser`, `neural`, `gemini`, `codex`) enabled and tested (P2)
  • Real `rvagent_wasm` wired into `app.html` (replace mock) (P2)
  • Help-modal copy fixed ("default: Claude Sonnet 4.6") (P3)
  • LLM router (Omni) enabled (P3)

cc @ruvnet
