An interactive web application demonstrating the power of Microsoft Foundry Model Router - an intelligent routing system that automatically selects the optimal language model for each request based on complexity, reasoning requirements, and task type.
Compare intelligent routing vs fixed model deployments in real-time!
✨ NEW FEATURES:
- ✏️ Custom Prompt Input - Test your own prompts to validate routing decisions
- 📊 All Three Routing Modes - Compare Balanced, Cost-Optimized, and Quality-Optimized
- 📈 Real Benchmark Data - Measured 5.5-7% cost savings across modes
- 🎯 Visual Analytics - See routing distribution across 4+ models
Select prompts, choose routing modes, and run comparisons in a clean, intuitive interface. Now includes custom prompt input for testing your own use cases!
Test your own prompts to validate routing decisions. When you click Use This Prompt, the benchmark automatically runs and compares Router vs Standard:
Enter any prompt to test routing behavior

Custom prompt benchmarks execute automatically - showing routing decisions, latency, and cost comparison
See instant comparisons between Model Router and standard deployments with live benchmark data:
Watch the router intelligently distribute requests across different models based on complexity:
Balanced Mode - 7% cost savings with optimal quality:

Cost-Optimized Mode - 5.5% savings prioritizing efficiency:

Quality-Optimized Mode - Routes to premium models for maximum accuracy:

- 🔀 Intelligent Model Routing - Watch as Model Router selects the best model for each prompt (GPT-5, GPT-4.1, O4-mini, etc.)
- 📊 Real-time Comparison - Run prompts through both router and standard deployments side-by-side
- 💰 Cost Analytics - Track estimated costs and see potential savings with smart routing
- ⚡ Performance Metrics - Monitor latency, token usage, and model distribution
- 🎯 Routing Modes - Test Balanced, Cost-Optimized, and Quality-Optimized routing strategies
- 📈 Visual Analytics - Charts showing model distribution and comparative statistics
- 🔍 Comprehensive Testing - Run individual prompts or batch test entire prompt sets
- ✏️ Custom Prompts - Test your own prompts to see how the router handles them
Model Router is a trained language model that intelligently routes your prompts in real-time to the most suitable large language model (LLM). Think of it as a smart dispatcher that:
- 🧠 Analyzes prompt complexity in real-time (reasoning, task type, attributes)
- 💡 Selects optimal models from a pool of 18+ underlying models
- 💵 Optimizes costs by using smaller models when sufficient, larger models when needed
- ⚡ Reduces latency while maintaining comparable quality
- 🎯 Supports multiple modes: Balanced (default), Cost, Quality
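Calling the router looks the same as calling any other Azure OpenAI chat completions deployment; you simply target the `model-router` deployment name. A minimal TypeScript sketch (the endpoint, key, and `api-version` below are placeholders; check your resource for supported versions):

```typescript
// Minimal sketch: the router is called like any other Azure OpenAI
// chat completions deployment, just with the model-router deployment
// name in the URL. Endpoint, key, and api-version are placeholders.
const endpoint = "https://your-resource.cognitiveservices.azure.com";
const apiKey = "<your-api-key>"; // use environment variables in real code

async function askRouter(prompt: string): Promise<void> {
  const res = await fetch(
    `${endpoint}/openai/deployments/model-router/chat/completions?api-version=2024-10-21`,
    {
      method: "POST",
      headers: { "Content-Type": "application/json", "api-key": apiKey },
      body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
    },
  );
  const data = await res.json();
  // The `model` field of the response reveals which underlying model
  // the router selected for this particular prompt.
  console.log(data.model, data.choices[0].message.content);
}

void askRouter("Summarize the plot of Hamlet in two sentences.");
```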
The latest Model Router supports 18 underlying models including:
- OpenAI Models: GPT-5, GPT-5-mini, GPT-5-nano, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, O4-mini
- Reasoning Models: GPT-5-chat, Grok-4, Grok-4-fast-reasoning
- Open Source Models: DeepSeek-V3.1, GPT-OSS-120B, Llama-4-Maverick
- Anthropic Claude: Claude-Haiku-4-5, Claude-Opus-4-1, Claude-Sonnet-4-5
- Node.js 18+ and npm
- Microsoft Foundry account with:
- Model Router deployment
- At least one standard model deployment (for comparison)
- API keys for both deployments
```bash
git clone <repository-url>
cd router-demo-app
npm install
```

Copy the example environment file:

```bash
cp .env.example .env.local
```

Edit `.env.local` with your Azure credentials:
```env
# Azure Model Router Deployment
VITE_ROUTER_ENDPOINT=https://your-resource.cognitiveservices.azure.com
VITE_ROUTER_API_KEY=your-api-key-here
VITE_ROUTER_DEPLOYMENT=model-router

# Standard Model Deployment (for comparison)
VITE_STANDARD_ENDPOINT=https://your-resource.cognitiveservices.azure.com
VITE_STANDARD_API_KEY=your-api-key-here
VITE_STANDARD_DEPLOYMENT=gpt-4.1
```
⚠️ Security Note: Never commit `.env.local`; it's already in `.gitignore`.
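In the app itself, these values are read through Vite's `import.meta.env`, which exposes only variables carrying the `VITE_` prefix to client code. A sketch of what that wiring might look like (the actual config lives in `src/config/endpoints.ts` and may differ):

```typescript
// Sketch of how the app can read these variables (Vite exposes only
// VITE_-prefixed variables to client code via import.meta.env).
// The actual wiring lives in src/config/endpoints.ts and may differ.
export const routerConfig = {
  endpoint: import.meta.env.VITE_ROUTER_ENDPOINT as string,
  apiKey: import.meta.env.VITE_ROUTER_API_KEY as string,
  deployment: import.meta.env.VITE_ROUTER_DEPLOYMENT as string,
};

export const standardConfig = {
  endpoint: import.meta.env.VITE_STANDARD_ENDPOINT as string,
  apiKey: import.meta.env.VITE_STANDARD_API_KEY as string,
  deployment: import.meta.env.VITE_STANDARD_DEPLOYMENT as string,
};
```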
- Navigate to your Azure OpenAI resource
- Go to Keys and Endpoints
- Copy the base URL (e.g., `https://your-resource.cognitiveservices.azure.com`)
- Copy one of the API keys
- Note your deployment names from the Deployments tab
- Go to ai.azure.com
- Open your project
- Navigate to Deployments
- Confirm you have `model-router` deployed
- Get connection details from Project Settings
```bash
npm run dev
```

The app will be available at http://localhost:5173 (or the next available port).
- Select a Prompt - Choose from pre-configured prompts in the left sidebar (categorized by complexity) or click ✏️ Custom to test your own prompts
- Choose Action:
- 🔀 Run Router - Test model router only
- 📌 Run Standard - Test standard deployment only
- ⚡ Run Both - Compare side-by-side
- 🚀 Run All Prompts - Batch test all prompts
- Review Results - Analyze model selection, latency, costs in the results table
- Compare Metrics - Check stats cards and distribution charts
The custom prompt feature allows you to test any prompt and automatically run benchmarks when activated:
- Click the ✏️ Custom button in the prompt selector
- Enter your prompt text (any length, any complexity)
- Click ✓ Use This Prompt - benchmarks run automatically!
- View instant comparison results between Router and Standard deployments
This feature is perfect for validating how the router handles your specific use cases before deploying to production.
Example Custom Prompt:

```
Explain the concept of quantum computing to a 10-year-old child,
using simple analogies and examples they can relate to.
```
The router will analyze the prompt complexity and select the most appropriate model. In our test, it routed to gpt-5-nano-2025-08-07 (the efficient model for explanatory content) with:
- Latency: ~7099ms
- Tokens: 1055
- Cost: $0.00310
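The metrics above come from simple client-side measurement. A hypothetical sketch of how a single benchmark run could capture them (the app's real logic lives in `src/hooks/useCompletion.ts`; the field names here are illustrative, not the actual types):

```typescript
// Hypothetical sketch of capturing the three metrics for one run;
// the real logic lives in src/hooks/useCompletion.ts and the field
// names here are illustrative.
interface RunMetrics {
  chosenModel: string;
  latencyMs: number;
  totalTokens: number;
}

async function runOnce(url: string, apiKey: string, prompt: string): Promise<RunMetrics> {
  const start = performance.now();
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json", "api-key": apiKey },
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  });
  const data = await res.json();
  return {
    chosenModel: data.model,              // e.g. "gpt-5-nano-2025-08-07"
    latencyMs: performance.now() - start, // wall-clock round trip
    totalTokens: data.usage.total_tokens, // prompt + completion tokens
  };
}
```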
Test different routing strategies using the Routing Mode dropdown:
- 🎯 Balanced (Default) - Optimal balance of cost and quality (1-2% quality range)
  - Result: 7.0% cost savings, 7506ms avg latency
  - Use case: General production workloads
- 💰 Cost-Optimized - Maximize cost savings (5-6% quality range)
  - Result: 5.5% cost savings, 6528ms avg latency
  - Use case: High-volume, budget-conscious applications
- 💎 Quality-Optimized - Prioritize maximum accuracy (ignores cost)
  - Result: Routes to premium models for best quality
  - Use case: Critical accuracy scenarios, compliance requirements
📝 Note: The routing mode is passed to the API, but the actual routing behavior is configured in the Microsoft Foundry Portal.
All three routing modes tested with 10 diverse prompts:
| Mode | Cost Savings | Avg Latency (Router) | Best For |
|---|---|---|---|
| Balanced | 7.0% | 7506ms | General production (recommended) |
| Cost-Optimized | 5.5% | 6528ms | High-volume, budget-conscious |
| Quality-Optimized | Varies | 5927ms | Critical accuracy scenarios |
Real testing with all 10 prompts in Balanced Mode shows:
Performance Metrics:
- Average Latency (Router): 7506ms
- Average Latency (Standard): 6125ms
- Total Cost (Router): $0.0276
- Total Cost (Standard): $0.0297
- Cost Savings: 7.0%
Model Distribution:
- gpt-5-nano-2025-08-07: 8 requests (simple tasks)
- gpt-5-mini-2025-08-07: 5 requests (medium complexity)
- gpt-4.1-mini-2025-04-14: 1 request (complex reasoning)
- gpt-oss-120b: 2 requests (specialized tasks)
The router intelligently distributed requests across 4 different models, achieving cost savings while maintaining quality.
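The per-model counts behind the distribution chart can be derived by tallying how often each model appears in the router results. A small illustrative sketch (the field name is an assumption, not the app's actual type):

```typescript
// Illustrative: derive per-model request counts for the distribution
// chart from a list of router results (field name is an assumption).
type RouterResult = { chosenModel: string };

function modelDistribution(results: RouterResult[]): Record<string, number> {
  return results.reduce<Record<string, number>>((acc, r) => {
    acc[r.chosenModel] = (acc[r.chosenModel] ?? 0) + 1;
    return acc;
  }, {});
}

// e.g. { "gpt-5-nano-2025-08-07": 8, "gpt-5-mini-2025-08-07": 5, ... }
```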
| Column | Description |
|---|---|
| Prompt | The input text sent to the model |
| Path | Router vs Standard deployment |
| Routing Mode | The routing strategy used (Balanced/Cost/Quality) |
| Chosen Model | The actual model selected (reveals routing decisions) |
| Latency | Response time in milliseconds |
| Tokens | Total tokens used (prompt + completion) |
| Est. Cost | Calculated cost based on model pricing |
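These columns map naturally onto a per-row record. An illustrative shape (the real definitions live in `src/types/index.ts` and may differ):

```typescript
// Illustrative shape for one row of the results table; the actual
// definitions live in src/types/index.ts and may differ.
interface ResultRow {
  prompt: string;                               // input text sent to the model
  path: "router" | "standard";                  // which deployment handled it
  routingMode: "balanced" | "cost" | "quality"; // routing strategy used
  chosenModel: string;                          // model actually selected
  latencyMs: number;                            // response time in milliseconds
  totalTokens: number;                          // prompt + completion tokens
  estimatedCost: number;                        // USD, from model pricing
}
```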
- Router rows (blue): Shows which underlying model was selected
- Standard rows (gray): Always uses the same fixed model
- Model variety: 4 different models used = intelligent optimization
- Cost savings: 7% reduction with balanced mode, scalable at volume
- Smart routing: Simple prompts → nano, complex → premium models
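The Est. Cost column follows from a straightforward calculation: token counts multiplied by per-model rates, such as those in `src/config/pricing.ts`. A hedged sketch with placeholder rates (not real prices):

```typescript
// Hedged sketch of the Est. Cost calculation: token counts multiplied
// by per-million-token rates from a pricing table like the one in
// src/config/pricing.ts. The rates below are placeholders, not real prices.
const pricing: Record<string, { inputPerM: number; outputPerM: number }> = {
  "gpt-5-nano-2025-08-07": { inputPerM: 0.05, outputPerM: 0.4 }, // placeholder rates
};

function estimateCost(model: string, promptTokens: number, completionTokens: number): number {
  const rates = pricing[model];
  if (!rates) return 0; // unknown model: no estimate
  return (promptTokens * rates.inputPerM + completionTokens * rates.outputPerM) / 1_000_000;
}
```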
```
router-demo-app/
├── src/
│   ├── components/                # React components
│   │   ├── DistributionChart.tsx  # Model distribution visualization
│   │   ├── MetadataBadge.tsx      # Config display
│   │   ├── PromptSelector.tsx     # Prompt selection UI
│   │   ├── ResultsTable.tsx       # Results display
│   │   ├── RunControls.tsx        # Action buttons
│   │   └── StatsCards.tsx         # Aggregate statistics
│   ├── config/                    # Configuration files
│   │   ├── endpoints.ts           # API endpoints config
│   │   ├── pricing.ts             # Model pricing data
│   │   └── prompts.ts             # Test prompt sets
│   ├── hooks/                     # Custom React hooks
│   │   ├── useCompletion.ts       # API call logic
│   │   └── useResults.ts          # Results management
│   ├── types/                     # TypeScript types
│   │   └── index.ts               # Type definitions
│   ├── App.tsx                    # Main application
│   └── main.tsx                   # Entry point
├── .env.example                   # Environment template
├── .gitignore                     # Git ignore rules
├── package.json                   # Dependencies
├── tsconfig.json                  # TypeScript config
├── vite.config.ts                 # Vite config
└── README.md                      # This file
```
```bash
npm run build
```

Output will be in the `dist/` directory.
Preview the production build:

```bash
npm run preview
```

Lint the code:

```bash
npm run lint
```

- React 19.2 - UI framework with latest features
- TypeScript 5.9 - Type-safe development
- Vite 7.2 - Lightning-fast build tool
- Tailwind CSS 4.1 - Utility-first styling
- Recharts 3.7 - Chart visualizations
- Azure OpenAI API - Model Router & completions
✅ Implemented Security Measures:
- API keys stored in `.env.local` (gitignored)
- No hardcoded credentials in source code
- Environment variables prefixed with `VITE_` as Vite requires
- `.env.local` excluded from version control

Best practices:

- Never commit `.env.local` to version control
- Rotate API keys regularly in the Azure Portal
- Use separate keys for development and production
- Monitor API usage in the Azure Portal
Problem: UI appears unresponsive, buttons are disabled

Solution:
- Verify `.env.local` has correct base URLs only (no paths)
  - ✅ Correct: `https://your-resource.cognitiveservices.azure.com`
  - ❌ Wrong: `https://.../openai/deployments/.../chat/completions`
- Restart the dev server: `Ctrl+C`, then `npm run dev`
Problem: Getting authentication errors

Solution:
- Verify API keys in `.env.local` are correct
- Check keys are active in the Azure Portal
- Ensure there are no extra spaces or quotes around keys
Problem: Deployment not found errors

Solution:
- Verify deployment names in `.env.local` match the Azure Portal exactly
- Check deployments are in the same region/resource
- Confirm Model Router is deployed (version 2025-11-18 recommended)
Problem: Cross-origin request blocked

Solution: This shouldn't happen with Azure OpenAI, but if it does:
- Verify you're using correct endpoints
- Check Azure OpenAI resource settings
Problem: Changes to `.env.local` not reflected

Solution:
- Restart the dev server (Vite doesn't hot-reload env vars)
- Clear the browser cache (`Ctrl+Shift+R`)
- Verify variables are prefixed with `VITE_`
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Open a Pull Request
This project is licensed under the MIT License.
- Built with Microsoft Foundry Model Router
- Powered by OpenAI, Anthropic, DeepSeek, and Meta models
- UI components styled with Tailwind CSS
- Charts powered by Recharts
For issues related to:
- This demo app: Open a GitHub issue
- Microsoft Foundry: Check Microsoft Docs
- Azure Support: Contact Azure Support
🌟 Star this repo if you find it helpful!
Made with ❤️ for the Azure AI community


