Transcriptor MCP exposes Prometheus metrics on both the REST API and the MCP HTTP server. You can scrape them with Prometheus and visualize them in Grafana.
For error monitoring with stack traces and grouping, see Sentry (optional, sentry.io Cloud).
Add Prometheus and Grafana to your deployment:

    cp docker-compose.example.yml docker-compose.yml
    docker compose up -d

This starts:

- Prometheus at `http://localhost:9090`, which scrapes `transcriptor-mcp-api:3000/metrics` and `transcriptor-mcp:4200/metrics`
- Grafana at `http://localhost:3001` (login: `admin`/`admin`; change on first login)

The Grafana Prometheus datasource is provisioned automatically.
| Service | Metrics | Failures list |
|---|---|---|
| REST API (port 3000) | `GET /metrics` | `GET /failures` |
| MCP HTTP (port 4200) | `GET /metrics` | `GET /failures` |
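To check by hand what a `/metrics` endpoint returns, a minimal sketch like the following may help. It assumes the endpoints serve the standard Prometheus text exposition format; `parse_metrics` and `fetch_metrics` are hypothetical helper names, not part of Transcriptor MCP.

```python
import re
from urllib.request import urlopen

# Matches sample lines such as: http_requests_total{method="GET"} 12
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)')


def parse_metrics(text: str) -> dict:
    """Map 'name{labels}' to its value, skipping # HELP / # TYPE lines."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match:
            name, labels, value = match.groups()
            samples[name + (labels or "")] = float(value)
    return samples


def fetch_metrics(base_url: str) -> dict:
    """Fetch and parse <base_url>/metrics (base_url is an assumption, e.g. http://localhost:3000)."""
    with urlopen(f"{base_url}/metrics", timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

For example, `fetch_metrics("http://localhost:3000")` would return the REST API samples, assuming the service is reachable on that port from where the script runs.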
REST API metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `http_requests_total` | Counter | `method`, `route`, `status_code` | Total HTTP requests |
| `http_request_duration_seconds` | Histogram | `method`, `route` | Request latency |
| `http_request_errors_total` | Counter | - | Total 4xx/5xx responses |
| `cache_hits_total` | Counter | - | Cache hits |
| `cache_misses_total` | Counter | - | Cache misses |
| `subtitles_extraction_failures_total` | Counter | - | Videos where subtitles could not be obtained (neither YouTube nor Whisper) |
| `whisper_requests_total` | Counter | `mode` | Requests to Whisper (transcription attempts; `mode` is `local` or `api`) |
| `whisper_background_jobs_active` | Gauge | - | In-flight deduplicated background Whisper jobs |
MCP HTTP server metrics:

| Metric | Type | Labels | Description |
|---|---|---|---|
| `mcp_tool_calls_total` | Counter | `tool` | Successful MCP tool calls |
| `mcp_tool_errors_total` | Counter | `tool` | Failed MCP tool calls |
| `mcp_session_total` | Gauge | `type` (`streamable` or `sse`) | Active MCP sessions |
| `mcp_request_duration_seconds` | Histogram | `endpoint` | MCP request latency |
| `mcp_quota_checks_total` | Counter | `result`, `tier` | Quota checks before `tools/call` (`result`: `allowed`, `exceeded`, `rejected_no_key`, `rejected_invalid_key`; `tier`: `registered`, `default`, `anonymous`) |
| `mcp_quota_exceeded_total` | Counter | `tier`, `key_id` | Limit exceeded (`key_id` from registry, or `none`) |
| `mcp_quota_tool_calls_blocked_total` | Counter | `tool` | Tool calls blocked by quota |
| `mcp_quota_http_429_total` | Counter | `route` | HTTP 429 returned for quota before MCP session |
| `mcp_quota_check_duration_seconds` | Histogram | - | Latency of one quota check (Redis or in-memory) |
| `subtitles_extraction_failures_total` | Counter | - | Same as API |
| `whisper_requests_total` | Counter | `mode` | Same as API |
| `whisper_background_jobs_active` | Gauge | - | Same as API |
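The Histogram metrics listed above are exported as cumulative `_bucket` series keyed by an `le` upper bound, and percentiles are estimated from those buckets. The sketch below is a simplified illustration of that estimate (linear interpolation inside the bucket containing the target rank). It is not Prometheus's own code, and the bucket bounds in the usage note are made-up values.

```python
import math


def histogram_quantile(q: float, buckets: list) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets."""
    buckets = sorted(buckets)
    total = buckets[-1][1]  # the highest bucket (normally +Inf) holds the total count
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # Rank falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            span = count - lower_count
            if span == 0:
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            return lower_bound + (upper_bound - lower_bound) * (rank - lower_count) / span
        lower_bound, lower_count = upper_bound, count
    return lower_bound
```

With hypothetical buckets `[(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]`, the 95th percentile lands halfway through the 0.5 to 1.0 second bucket. This is why the precision of a reported p95 depends on how finely the bucket bounds are spaced.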
`GET /failures` returns a JSON list of URLs where subtitle extraction failed (YouTube subtitles and Whisper fallback both returned nothing):

    {
      "failures": [
        { "url": "https://youtube.com/watch?v=xxx", "timestamp": "2025-02-13T12:00:00.000Z" }
      ],
      "total": 42
    }

- Only records failures when Whisper fallback was enabled and attempted.
- Stores the last 100 failures per process in memory (reset on restart).
- API and MCP each maintain their own list.
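Because the API and MCP processes keep separate lists, a combined view has to be assembled by the caller. The following is a minimal sketch of one way to do that, assuming both endpoints return the JSON shape shown above. `fetch_failures` and `merge_failures` are hypothetical helpers, and the base URLs are assumptions about your deployment.

```python
import json
from urllib.request import urlopen


def fetch_failures(base_url: str) -> dict:
    """Fetch <base_url>/failures and decode the JSON payload."""
    with urlopen(f"{base_url}/failures", timeout=5) as response:
        return json.load(response)


def merge_failures(*payloads: dict) -> list:
    """Combine failure lists, keeping the most recent entry per URL."""
    latest = {}
    for payload in payloads:
        for item in payload.get("failures", []):
            seen = latest.get(item["url"])
            # ISO 8601 UTC timestamps in one format compare correctly as strings.
            if seen is None or item["timestamp"] > seen["timestamp"]:
                latest[item["url"]] = item
    # Newest failures first.
    return sorted(latest.values(), key=lambda item: item["timestamp"], reverse=True)
```

A call such as `merge_failures(fetch_failures("http://localhost:3000"), fetch_failures("http://localhost:4200"))` would then give one deduplicated list. Since each process holds its list in memory only, the result reflects failures since the last restart of each service.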
Example PromQL queries:

    # Request rate (API)
    rate(http_requests_total{service="api"}[5m])

    # Error rate
    rate(http_request_errors_total[5m])

    # Latency p95 (API)
    histogram_quantile(0.95, rate(http_request_duration_seconds_bucket{service="api"}[5m]))

    # Cache hit rate
    rate(cache_hits_total[5m]) / (rate(cache_hits_total[5m]) + rate(cache_misses_total[5m]))

    # Subtitles extraction failures
    increase(subtitles_extraction_failures_total[1h])

    # Whisper requests (rate and total by mode)
    rate(whisper_requests_total[5m])
    increase(whisper_requests_total[1h])

    # MCP tool calls by tool
    rate(mcp_tool_calls_total{service="mcp"}[5m])

    # Active MCP sessions
    mcp_session_total{service="mcp"}

    # MCP quota (per-client API keys / default tier)
    sum by (tier) (rate(mcp_quota_exceeded_total{service="mcp"}[5m]))
    histogram_quantile(0.95, sum(rate(mcp_quota_check_duration_seconds_bucket{service="mcp"}[5m])) by (le))
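The queries above can also be run from a script through Prometheus's HTTP API (`GET /api/v1/query`), which is handy for smoke tests outside Grafana. This is a rough sketch under the assumption that Prometheus is reachable at the URL you pass in; the helper names are illustrative only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(prometheus_url: str, expr: str) -> str:
    """Build the instant-query URL for a PromQL expression."""
    return f"{prometheus_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def extract_values(response: dict) -> dict:
    """Flatten an instant-vector response into {'label=value,...': float}."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error')}")
    values = {}
    for series in response["data"]["result"]:
        labels = ",".join(f"{k}={v}" for k, v in sorted(series["metric"].items()))
        # Each sample is [unix_timestamp, "value-as-string"].
        values[labels] = float(series["value"][1])
    return values


def run_query(prometheus_url: str, expr: str) -> dict:
    """Run one instant query (prometheus_url is an assumption, e.g. http://localhost:9090)."""
    with urlopen(instant_query_url(prometheus_url, expr), timeout=5) as response:
        return extract_values(json.load(response))
```

For instance, `run_query("http://localhost:9090", 'sum by (tier) (rate(mcp_quota_exceeded_total{service="mcp"}[5m]))')` would return one entry per tier, provided the metric has been scraped at least twice within the window.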
Provisioned dashboard JSON: `monitoring/grafana/provisioning/dashboards/files/mcp-quota.json`. Mount Grafana provisioning so that:

- `monitoring/grafana/provisioning` maps to `/etc/grafana/provisioning`
- The datasource URL in `datasources.yml` points at your Prometheus service (e.g. `http://prometheus:9090`).
Panels cover: exceeded rate by tier, share of checks with result=exceeded, top key_id (registered tier), MCP tool call rate vs quota exceeded, p95 quota check latency, HTTP 429 by route, blocked tool call rate.
Prometheus scrape config is in monitoring/prometheus.yml. Grafana datasource is provisioned from monitoring/grafana/provisioning/datasources/datasources.yml.
For a custom setup (e.g. existing Prometheus), add scrape targets:

    scrape_configs:
      - job_name: 'transcriptor-mcp-api'
        static_configs:
          - targets: ['<api-host>:3000']
        metrics_path: /metrics
      - job_name: 'transcriptor-mcp'
        static_configs:
          - targets: ['<mcp-host>:4200']
        metrics_path: /metrics
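After adding the scrape targets, one way to confirm that Prometheus can reach them is its `GET /api/v1/targets` endpoint, which reports a `health` value per active target. The sketch below assumes that endpoint is reachable from where the script runs; `unhealthy_targets` and `check_targets` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def unhealthy_targets(response: dict) -> list:
    """Return (job, scrape_url, last_error) for active targets whose health is not 'up'."""
    problems = []
    for target in response["data"]["activeTargets"]:
        if target.get("health") != "up":
            problems.append((
                target.get("labels", {}).get("job", ""),
                target.get("scrapeUrl", ""),
                target.get("lastError", ""),
            ))
    return problems


def check_targets(prometheus_url: str) -> list:
    """Query Prometheus (URL is an assumption, e.g. http://localhost:9090) for failing targets."""
    with urlopen(f"{prometheus_url.rstrip('/')}/api/v1/targets", timeout=5) as response:
        return unhealthy_targets(json.load(response))
```

An empty list from `check_targets(...)` suggests both jobs are being scraped. A non-empty list carries the last scrape error, which usually points at a wrong host, port, or `metrics_path`.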