leestott/router-demo-app

🔀 Microsoft Foundry Model Router Demo

An interactive web application demonstrating the power of Microsoft Foundry Model Router - an intelligent routing system that automatically selects the optimal language model for each request based on complexity, reasoning requirements, and task type.

Compare intelligent routing vs fixed model deployments in real time!

✨ NEW FEATURES:

  • ✏️ Custom Prompt Input - Test your own prompts to validate routing decisions
  • 📊 All Three Routing Modes - Compare Balanced, Cost-Optimized, and Quality-Optimized
  • 📈 Real Benchmark Data - Measured 5.5-7% cost savings across modes
  • 🎯 Visual Analytics - See routing distribution across 4+ models


📸 Screenshots

Application Overview

Select prompts, choose routing modes, and run comparisons in a clean, intuitive interface. Now includes custom prompt input for testing your own use cases!

Application Interface

Custom Prompt Feature

Test your own prompts to validate routing decisions. When you click Use This Prompt, the benchmark automatically runs and compares Router vs Standard:

Custom Prompt Input Enter any prompt to test routing behavior

Custom Prompt Results
Custom prompt benchmarks execute automatically - showing routing decisions, latency, and cost comparison

Real-time Results & Analytics

See instant comparisons between Model Router and standard deployments with live benchmark data:

Results Comparison

Model Distribution Visualization

Watch the router intelligently distribute requests across different models based on complexity:

Full Results Distribution

Routing Mode Comparisons

Balanced Mode - 7% cost savings with optimal quality: Balanced Mode Results

Cost-Optimized Mode - 5.5% savings prioritizing efficiency: Cost Mode Results

Quality-Optimized Mode - Routes to premium models for maximum accuracy: Quality Mode Results


🌟 Features

  • 🔀 Intelligent Model Routing - Watch as Model Router selects the best model for each prompt (GPT-5, GPT-4.1, O4-mini, etc.)
  • 📊 Real-time Comparison - Run prompts through both router and standard deployments side-by-side
  • 💰 Cost Analytics - Track estimated costs and see potential savings with smart routing
  • ⚡ Performance Metrics - Monitor latency, token usage, and model distribution
  • 🎯 Routing Modes - Test Balanced, Cost-Optimized, and Quality-Optimized routing strategies
  • 📈 Visual Analytics - Charts showing model distribution and comparative statistics
  • 🔍 Comprehensive Testing - Run individual prompts or batch test entire prompt sets
  • ✏️ Custom Prompts - Test your own prompts to see how the router handles them

🚀 What is Microsoft Foundry Model Router?

Model Router is a trained language model that intelligently routes your prompts in real-time to the most suitable large language model (LLM). Think of it as a smart dispatcher that:

  • 🧠 Analyzes prompt complexity in real-time (reasoning, task type, attributes)
  • 💡 Selects optimal models from a pool of 18+ underlying models
  • 💵 Optimizes costs by using smaller models when sufficient, larger models when needed
  • ⚡ Reduces latency while maintaining comparable quality
  • 🎯 Supports multiple modes: Balanced (default), Cost, Quality
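Under the hood, a router deployment is called like any other Azure OpenAI chat-completions deployment, and the response's `model` field reveals which underlying model actually answered. A minimal sketch, not the app's actual code — the endpoint, deployment name, and api-version below are illustrative:

```typescript
// Sketch: calling a Model Router deployment through the standard Azure
// OpenAI chat-completions API. All names here are placeholders.

interface RouterConfig {
  endpoint: string;   // base URL, e.g. https://your-resource.cognitiveservices.azure.com
  apiKey: string;
  deployment: string; // e.g. "model-router"
}

// Build the chat-completions URL for a given deployment.
function completionsUrl(cfg: RouterConfig, apiVersion = "2024-10-21"): string {
  return `${cfg.endpoint}/openai/deployments/${cfg.deployment}` +
         `/chat/completions?api-version=${apiVersion}`;
}

async function askRouter(cfg: RouterConfig, prompt: string) {
  const res = await fetch(completionsUrl(cfg), {
    method: "POST",
    headers: { "api-key": cfg.apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ messages: [{ role: "user", content: prompt }] }),
  });
  const data = await res.json();
  // For a router deployment, `model` shows which underlying model was selected.
  return { text: data.choices[0].message.content, chosenModel: data.model };
}
```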

Supported Models (2025-11-18 version)

The latest Model Router supports 18 underlying models including:

  • OpenAI Models: GPT-5, GPT-5-mini, GPT-5-nano, GPT-4.1, GPT-4.1-mini, GPT-4.1-nano, O4-mini
  • Reasoning Models: GPT-5-chat, Grok-4, Grok-4-fast-reasoning
  • Open Source Models: DeepSeek-V3.1, GPT-OSS-120B, Llama-4-Maverick
  • Anthropic Claude: Claude-Haiku-4-5, Claude-Opus-4-1, Claude-Sonnet-4-5

📋 Prerequisites

  • Node.js 18+ and npm
  • Microsoft Foundry account with:
    • Model Router deployment
    • At least one standard model deployment (for comparison)
    • API keys for both deployments

🛠️ Setup Instructions

1. Clone the Repository

git clone <repository-url>
cd router-demo-app

2. Install Dependencies

npm install

3. Configure Environment Variables

Copy the example environment file:

cp .env.example .env.local

Edit .env.local with your Azure credentials:

# Azure Model Router Deployment
VITE_ROUTER_ENDPOINT=https://your-resource.cognitiveservices.azure.com
VITE_ROUTER_API_KEY=your-api-key-here
VITE_ROUTER_DEPLOYMENT=model-router

# Standard Model Deployment (for comparison)
VITE_STANDARD_ENDPOINT=https://your-resource.cognitiveservices.azure.com
VITE_STANDARD_API_KEY=your-api-key-here
VITE_STANDARD_DEPLOYMENT=gpt-4.1

⚠️ Security Note: Never commit .env.local - it's already in .gitignore

4. Get Azure Credentials

From Azure Portal:

  1. Navigate to your Azure OpenAI resource
  2. Go to Keys and Endpoints
  3. Copy the base URL (e.g., https://your-resource.cognitiveservices.azure.com)
  4. Copy one of the API Keys
  5. Note your deployment names from the Deployments tab

From Microsoft Foundry Portal:

  1. Go to ai.azure.com
  2. Open your project
  3. Navigate to Deployments
  4. Confirm you have model-router deployed
  5. Get connection details from Project Settings

5. Run Development Server

npm run dev

The app will be available at http://localhost:5173 (or next available port)


🎮 Usage Guide

Quick Start

  1. Select a Prompt - Choose from pre-configured prompts in the left sidebar (categorized by complexity) or click ✏️ Custom to test your own prompts
  2. Choose Action:
    • 🔀 Run Router - Test model router only
    • 📌 Run Standard - Test standard deployment only
    • ⚡ Run Both - Compare side-by-side
    • 🚀 Run All Prompts - Batch test all prompts
  3. Review Results - Analyze model selection, latency, costs in the results table
  4. Compare Metrics - Check stats cards and distribution charts

Testing Custom Prompts

The custom prompt feature allows you to test any prompt and automatically run benchmarks when activated:

  1. Click the ✏️ Custom button in the prompt selector
  2. Enter your prompt text (any length, any complexity)
  3. Click ✓ Use This Prompt - benchmarks run automatically!
  4. View instant comparison results between Router and Standard deployments

This feature is perfect for validating how the router handles your specific use cases before deploying to production.

Example Custom Prompt:

Explain the concept of quantum computing to a 10-year-old child, 
using simple analogies and examples they can relate to.

The router will analyze the prompt complexity and select the most appropriate model. In our test, it routed to gpt-5-nano-2025-08-07 (the efficient model for explanatory content) with:

  • Latency: ~7099ms
  • Tokens: 1055
  • Cost: $0.00310
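The cost figure above is derived from per-model token pricing. A sketch of the estimate — the rates below are made-up placeholders, not real Azure pricing (the app keeps its actual figures in src/config/pricing.ts):

```typescript
// Illustrative cost estimate: USD per 1K tokens, split into input/output rates.
type Pricing = { inputPer1K: number; outputPer1K: number };

// Placeholder rates for demonstration only.
const examplePricing: Record<string, Pricing> = {
  "gpt-5-nano-2025-08-07": { inputPer1K: 0.0001, outputPer1K: 0.0004 },
};

function estimateCost(model: string, promptTokens: number, completionTokens: number): number {
  const p = examplePricing[model];
  if (!p) throw new Error(`no pricing for ${model}`);
  return (promptTokens / 1000) * p.inputPer1K +
         (completionTokens / 1000) * p.outputPer1K;
}
```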

Routing Modes

Test different routing strategies using the Routing Mode dropdown:

  • 🎯 Balanced (Default) - Optimal balance of cost and quality (1-2% quality range)

    • Result: 7.0% cost savings, 7506ms avg latency
    • Use case: General production workloads
  • 💰 Cost-Optimized - Maximize cost savings (5-6% quality range)

    • Result: 5.5% cost savings, 6528ms avg latency
    • Use case: High-volume, budget-conscious applications
  • 💎 Quality-Optimized - Prioritize maximum accuracy (ignores cost)

    • Result: Routes to premium models for best quality
    • Use case: Critical accuracy scenarios, compliance requirements

📝 Note: The routing mode is passed to the API, but the actual routing behavior is configured in the Microsoft Foundry Portal.


📊 Understanding the Results

Routing Mode Comparison

All three routing modes tested with 10 diverse prompts:

| Mode | Cost Savings | Avg Latency (Router) | Best For |
| --- | --- | --- | --- |
| Balanced | 7.0% | 7506ms | General production (recommended) |
| Cost-Optimized | 5.5% | 6528ms | High-volume, budget-conscious |
| Quality-Optimized | Varies | 5927ms | Critical accuracy scenarios |

Benchmark Results

Real testing with all 10 prompts in Balanced Mode shows:

Performance Metrics:

  • Average Latency (Router): 7506ms
  • Average Latency (Standard): 6125ms
  • Total Cost (Router): $0.0276
  • Total Cost (Standard): $0.0297
  • Cost Savings: 7.0%
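The savings percentage shown in the stats cards is simply the relative reduction of the router's total cost versus the standard deployment's total cost. A sketch:

```typescript
// Cost savings: how much cheaper the router run was, relative to standard.
function costSavingsPct(routerTotal: number, standardTotal: number): number {
  return ((standardTotal - routerTotal) / standardTotal) * 100;
}

// With the totals above: costSavingsPct(0.0276, 0.0297) ≈ 7.07, shown as 7.0%.
```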

Model Distribution:

  • gpt-5-nano-2025-08-07: 8 requests (simple tasks)
  • gpt-5-mini-2025-08-07: 5 requests (medium complexity)
  • gpt-4.1-mini-2025-04-14: 1 request (complex reasoning)
  • gpt-oss-120b: 2 requests (specialized tasks)

The router intelligently distributed requests across 4 different models, achieving cost savings while maintaining quality.
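The distribution data can be derived by counting which underlying model served each router run. A sketch under an assumed result shape (`RunResult` is illustrative, not the app's actual type from src/types):

```typescript
// Count router runs per chosen underlying model; standard rows are skipped
// because a standard deployment always answers with the same fixed model.
interface RunResult {
  path: "router" | "standard";
  chosenModel: string;
}

function modelDistribution(results: RunResult[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const r of results) {
    if (r.path !== "router") continue;
    counts[r.chosenModel] = (counts[r.chosenModel] ?? 0) + 1;
  }
  return counts;
}
```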

Results Table

| Column | Description |
| --- | --- |
| Prompt | The input text sent to the model |
| Path | Router vs Standard deployment |
| Routing Mode | The routing strategy used (Balanced/Cost/Quality) |
| Chosen Model | The actual model selected (reveals routing decisions) |
| Latency | Response time in milliseconds |
| Tokens | Total tokens used (prompt + completion) |
| Est. Cost | Calculated cost based on model pricing |

Key Insights

  • Router rows (blue): Shows which underlying model was selected
  • Standard rows (gray): Always uses the same fixed model
  • Model variety: 4 different models used = intelligent optimization
  • Cost savings: 7% reduction with balanced mode, scalable at volume
  • Smart routing: Simple prompts → nano, complex → premium models

🏗️ Project Structure

router-demo-app/
├── src/
│   ├── components/          # React components
│   │   ├── DistributionChart.tsx    # Model distribution visualization
│   │   ├── MetadataBadge.tsx        # Config display
│   │   ├── PromptSelector.tsx       # Prompt selection UI
│   │   ├── ResultsTable.tsx         # Results display
│   │   ├── RunControls.tsx          # Action buttons
│   │   └── StatsCards.tsx           # Aggregate statistics
│   ├── config/              # Configuration files
│   │   ├── endpoints.ts             # API endpoints config
│   │   ├── pricing.ts               # Model pricing data
│   │   └── prompts.ts               # Test prompt sets
│   ├── hooks/               # Custom React hooks
│   │   ├── useCompletion.ts         # API call logic
│   │   └── useResults.ts            # Results management
│   ├── types/               # TypeScript types
│   │   └── index.ts                 # Type definitions
│   ├── App.tsx              # Main application
│   └── main.tsx             # Entry point
├── .env.example             # Environment template
├── .gitignore               # Git ignore rules
├── package.json             # Dependencies
├── tsconfig.json            # TypeScript config
├── vite.config.ts           # Vite config
└── README.md                # This file

🔧 Development

Build for Production

npm run build

Output will be in the dist/ directory.

Preview Production Build

npm run preview

Linting

npm run lint

📚 Key Technologies

  • React 19.2 - UI framework with latest features
  • TypeScript 5.9 - Type-safe development
  • Vite 7.2 - Lightning-fast build tool
  • Tailwind CSS 4.1 - Utility-first styling
  • Recharts 3.7 - Chart visualizations
  • Azure OpenAI API - Model Router & completions

🔐 Security Best Practices

Implemented Security Measures:

  • API keys stored in .env.local (gitignored)
  • No hardcoded credentials in source code
  • Environment variables prefixed with VITE_ as required by Vite — note that Vite embeds these in the client bundle, so this setup is suitable for local demos only
  • .env.local excluded from version control

⚠️ Important Reminders:

  • Never commit .env.local to version control
  • Rotate API keys regularly in Azure Portal
  • Use separate keys for development and production
  • Monitor API usage in Azure Portal

🐛 Troubleshooting

Application Not Loading / Buttons Disabled

Problem: UI appears unresponsive, buttons are disabled
Solution:

  1. Verify .env.local has correct base URLs only (no paths)
    • ✅ Correct: https://your-resource.cognitiveservices.azure.com
    • ❌ Wrong: https://.../openai/deployments/.../chat/completions
  2. Restart dev server: Ctrl+C then npm run dev
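Step 1 above can also be checked programmatically. A sketch of a base-URL validator — a hypothetical helper, not part of the app:

```typescript
// Accept only a bare https base URL with no path; a common misconfiguration
// is pasting the full /openai/deployments/.../chat/completions URL instead.
function isBaseEndpoint(value: string): boolean {
  try {
    const u = new URL(value);
    return u.protocol === "https:" && (u.pathname === "/" || u.pathname === "");
  } catch {
    return false; // not a parseable URL at all
  }
}
```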

API Errors (401 Unauthorized)

Problem: Getting authentication errors
Solution:

  1. Verify API keys in .env.local are correct
  2. Check keys are active in Azure Portal
  3. Ensure no extra spaces or quotes around keys

API Errors (404 Not Found)

Problem: Deployment not found errors
Solution:

  1. Verify deployment names in .env.local match Azure Portal exactly
  2. Check deployments are in the same region/resource
  3. Confirm Model Router is deployed (version 2025-11-18 recommended)

CORS Errors

Problem: Cross-origin request blocked
Solution: This shouldn't happen with Azure OpenAI, but if it does:

  1. Verify you're using correct endpoints
  2. Check Azure OpenAI resource settings

Environment Variables Not Working

Problem: Changes to .env.local are not reflected
Solution:

  1. Restart the dev server (Vite doesn't hot-reload env vars)
  2. Clear browser cache (Ctrl+Shift+R)
  3. Verify variables are prefixed with VITE_

📖 Additional Resources

Official Documentation

Learn More


🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License.


🙏 Acknowledgments

  • Built with Microsoft Foundry Model Router
  • Powered by OpenAI, Anthropic, DeepSeek, and Meta models
  • UI components styled with Tailwind CSS
  • Charts powered by Recharts

📧 Support

For issues related to:


🌟 Star this repo if you find it helpful!

Made with ❤️ for the Azure AI community
