RuFlo's multi-model AI chat with built-in Model Context Protocol (MCP) tool calling. Live at flo.ruv.io.
RuVocal is the SvelteKit web app that lets you chat with Qwen, Claude, Gemini, or OpenAI while RuFlo invokes the same ~210 MCP tools the CLI uses — agent orchestration, persistent memory, swarm coordination, code review, GitHub ops — all directly from chat. No install, no API key needed to try the hosted demo.
It started as a fork of the HuggingFace chat-ui v0.20.0 and has been extended with a WASM-MCP integration layer, parallel tool execution, an in-browser tool gallery, and a "RuFlo Capabilities" tour modal. See ADR-033 for the architecture.
| 🛠️ ~210 MCP tools, prefixed | Five RuFlo server groups (Core, Intelligence, Agents, Memory, DevTools) plus a 18-tool in-browser WASM gallery |
| ⚡ Parallel tool calls | One model turn fires 4–6+ tools at once via Promise.all. The UI shows a Step N — X tools completed badge per turn |
| 📘 RuFlo Capabilities modal | Question-mark icon → multi-section tour: models, tools, architecture, shortcuts |
| 💾 AgentDB-backed memory | "Remember my favorite color is indigo" → recalled weeks later via HNSW vector search |
| 🧠 6 curated frontier models | Qwen 3.6 Max (default), Claude Sonnet 4.6, Claude Haiku 4.5, Gemini 2.5 Pro, Gemini 2.5 Flash, OpenAI — via OpenRouter |
| 🔌 Bring-your-own MCP servers | Add HTTP/SSE/stdio endpoints from the chat input; they join the parallel-execution flow |
| 🦾 ruvLLM ready | Native support for ruvLLM — RuFlo's self-improving local model layer |
| 🏠 Self-hostable | Multi-stage Dockerfile (INCLUDE_DB=true builds in MongoDB), cloudbuild.yaml for Google Cloud Run |
flo.ruv.io — pick a model, type a question. That's it.
git clone https://github.com/ruvnet/ruflo
cd ruflo/ruflo/src/ruvocal
cp .env .env.local # then edit .env.local — see below
npm install
npm run dev # → http://localhost:5173Minimum .env.local to use OpenRouter (matches the hosted setup):
OPENAI_BASE_URL=https://openrouter.ai/api/v1
OPENAI_API_KEY=sk-or-v1-...
# Curated RuFlo model list (optional — defaults to /models from the base URL)
TASK_MODEL=qwen/qwen3.6-max-preview
PUBLIC_APP_NAME=RuFlo
PUBLIC_APP_DESCRIPTION="Intelligent workflow automation assistant powered by Claude/Gemini/Qwen and MCP tools."Any OpenAI-compatible endpoint works (vLLM, Ollama, LM Studio, llama.cpp, Together, Groq, self-hosted, …):
| Provider | OPENAI_BASE_URL |
Key |
|---|---|---|
| OpenRouter | https://openrouter.ai/api/v1 |
sk-or-v1-... |
| Hugging Face router | https://router.huggingface.co/v1 |
hf_xxx |
| llama.cpp server | http://127.0.0.1:8080/v1 |
any string |
| Ollama (OAI bridge) | http://127.0.0.1:11434/v1 |
ollama |
| Poe | https://api.poe.com/v1 |
pk_... |
docker build -t ruvocal --build-arg INCLUDE_DB=true .
docker run -p 3000:3000 \
-e OPENAI_BASE_URL=https://openrouter.ai/api/v1 \
-e OPENAI_API_KEY=sk-or-v1-... \
-v ruvocal-data:/data \
ruvocalcloudbuild.yaml does a multi-stage build with INCLUDE_DB=true, pushes to Artifact Registry, and deploys to Cloud Run. Hosted demo at flo.ruv.io runs from this exact pipeline. See cloudbuild.yaml and the deploy notes in ADR-033.
Chat history, users, settings, files, and stats live in MongoDB. Three options:
- Embedded (zero-config) — omit
MONGODB_URL; the app usesMongoMemoryServerand persists to./db. Good for local dev and theINCLUDE_DB=trueDocker path. - MongoDB Atlas (managed) — free cluster at mongodb.com, allow-list your IP, set
MONGODB_URLto the connection string. - Local container —
docker run -d -p 27017:27017 --name mongo-ruvocal mongo:latestthenMONGODB_URL=mongodb://localhost:27017.
MONGODB_DB_NAME defaults to chat-ui (kept for upstream compatibility); change per environment.
RuVocal calls tools exposed by Model Context Protocol servers and feeds results back to the model via OpenAI function calling. Configure trusted servers via env, let users add their own, and the router auto-selects tools-capable models when needed.
MCP_SERVERS=[
{"name":"RuFlo Core","url":"https://mcp-bridge-...run.app/mcp/core","transport":"sse"},
{"name":"RuFlo Intelligence","url":"https://mcp-bridge-...run.app/mcp/intelligence","transport":"sse"},
{"name":"RuFlo Agents","url":"https://mcp-bridge-...run.app/mcp/agents","transport":"sse"},
{"name":"RuFlo Memory","url":"https://mcp-bridge-...run.app/mcp/memory","transport":"sse"},
{"name":"RuFlo DevTools","url":"https://mcp-bridge-...run.app/mcp/devtools","transport":"sse"},
{"name":"π Shared Brain","url":"https://mcp.pi.ruv.io","transport":"streamable-http"}
]In the chat UI: MCP (n) pill in the chat input → Add Server to drop in any HTTP/SSE/stdio endpoint. Run a local MCP server on localhost:3000 and it just works.
When a model calls a tool, the message shows a compact card with parameters, a progress bar while running, and the result. Multiple tools in the same turn render as a parallel-execution group.
RuVocal can do server-side smart routing using katanemo/Arch-Router-1.5B without a separate router service. Selecting "Omni" in the model picker:
- Calls Arch once (non-streaming) to pick the best route for recent turns
- Emits
RouterMetadataso the UI shows route + selected model - Streams from the selected model via
OPENAI_BASE_URL; on errors, tries route fallbacks
Shortcut paths bypass Arch:
- Multimodal —
LLM_ROUTER_ENABLE_MULTIMODAL=true+ image attached → usesLLM_ROUTER_MULTIMODAL_MODEL - Tools/Agentic —
LLM_ROUTER_ENABLE_TOOLS=true+ ≥1 MCP server enabled → usesLLM_ROUTER_TOOLS_MODEL
Configure via LLM_ROUTER_ROUTES_PATH (JSON array of route entries with name, description, primary_model, optional fallback_models), LLM_ROUTER_ARCH_BASE_URL, LLM_ROUTER_ARCH_MODEL, LLM_ROUTER_OTHER_ROUTE, LLM_ROUTER_FALLBACK_MODEL, LLM_ROUTER_ARCH_TIMEOUT_MS.
Display: PUBLIC_LLM_ROUTER_ALIAS_ID (default omni), PUBLIC_LLM_ROUTER_DISPLAY_NAME (default Omni), PUBLIC_LLM_ROUTER_LOGO_URL.
PUBLIC_APP_NAME=RuFlo
PUBLIC_APP_ASSETS=chatui
PUBLIC_APP_DESCRIPTION="Intelligent workflow automation assistant powered by Claude/Gemini/Qwen and MCP tools."
PUBLIC_APP_DATA_SHARING=PUBLIC_APP_ASSETS picks the logo/favicon directory under static/$PUBLIC_APP_ASSETS.
npm run build # production bundle
npm run preview # preview the build locally- SvelteKit 2 + Svelte 5 runes (
$state,$derived,$effect) - MongoDB persistence with
MongoMemoryServerfallback - TailwindCSS with
scrollbar-customand Tailwind class sorting - OpenAI-compatible model registry pulled from
${OPENAI_BASE_URL}/models - MCP bridge in
src/lib/server/mcp/; each server group exposes its own SSE endpoint - WASM tool gallery via Web Worker (
src/lib/wasm/wasm.worker.ts), opt-in via?worker=1 - Parallel tool calls —
Promise.allinsrc/lib/server/tools/toolInvocation.ts - Capabilities modal —
src/lib/components/RufloHelpModal.svelte - Dynamic follow-ups — tool-call-aware suggested next prompts in
ChatWindow.svelte
For deeper internals, see CLAUDE.md (Claude Code agent guide) and ADR-033.
- 🏠 Parent project — RuFlo
- 🎯 Goal Planner UI — goal.ruv.io · /agents
- 📖 ADR-033 — RuVocal/WASM-MCP integration architecture
- 📋 Roadmap — issue #1689
- 🍴 Upstream — huggingface/chat-ui (RuVocal is forked from v0.20.0)
MIT — same as the parent RuFlo project.