Include metrics from upstream chat requests #361
- Remove metrics_middleware.go, as this wrapper is no longer needed. This also eliminates double body parsing for the modelID.
- Move metrics parsing to be part of MetricsMonitor.
- Refactor how metrics are recorded in ProxyManager.
Walkthrough

The diff removes the Gin middleware-based metrics capture file and refactors metrics into an unexported metricsMonitor with a handler-wrapping approach; ProxyManager is updated to use the new metrics wrapping for relevant POST proxy paths, and route-level middleware is removed.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant ProxyManager
    participant MetricsMonitor
    participant Upstream
    Client->>ProxyManager: HTTP POST request
    alt metricsMonitor present
        ProxyManager->>MetricsMonitor: wrapHandler(modelID, wrappedWriter, req)
        MetricsMonitor->>MetricsMonitor: wrap response writer (responseBodyCopier)
        MetricsMonitor->>ProxyManager: call next handler with wrapped writer
        ProxyManager->>Upstream: forward request
        Upstream-->>ProxyManager: Response (JSON or SSE)
        ProxyManager-->>MetricsMonitor: response flows through responseBodyCopier
        MetricsMonitor->>MetricsMonitor: parse metrics (streaming or non-streaming)
        MetricsMonitor->>MetricsMonitor: record TokenMetrics & emit event
        MetricsMonitor-->>Client: proxied response returned
    else no metricsMonitor
        ProxyManager->>Upstream: forward request directly
        Upstream-->>Client: proxied response returned
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
proxy/proxymanager.go (1)
488-488: Typo in error message. Use "read" instead of "ready".

```diff
- pm.sendErrorResponse(c, http.StatusBadRequest, "could not ready request body")
+ pm.sendErrorResponse(c, http.StatusBadRequest, "could not read request body")
```

proxy/metrics_monitor.go (1)
63-75: Avoid emitting events under the write lock (deadlock risk). Handlers may reenter MetricsMonitor (e.g., GetMetrics), causing self-deadlock. Emit after unlocking.

```diff
 func (mp *MetricsMonitor) addMetrics(metric TokenMetrics) {
-	mp.mu.Lock()
-	defer mp.mu.Unlock()
-
-	metric.ID = mp.nextID
-	mp.nextID++
-	mp.metrics = append(mp.metrics, metric)
-	if len(mp.metrics) > mp.maxMetrics {
-		mp.metrics = mp.metrics[len(mp.metrics)-mp.maxMetrics:]
-	}
-	event.Emit(TokenMetricsEvent{Metrics: metric})
+	mp.mu.Lock()
+	metric.ID = mp.nextID
+	mp.nextID++
+	mp.metrics = append(mp.metrics, metric)
+	if len(mp.metrics) > mp.maxMetrics {
+		mp.metrics = mp.metrics[len(mp.metrics)-mp.maxMetrics:]
+	}
+	mp.mu.Unlock()
+	event.Emit(TokenMetricsEvent{Metrics: metric})
 }
```
🧹 Nitpick comments (5)
proxy/proxymanager.go (2)
659-665: Consider metrics for multipart endpoints too. proxyOAIPostFormHandler doesn't record metrics. If upstream returns usage/timings for audio/transcriptions, wrap this path as well.

```diff
-	if err := processGroup.ProxyRequest(realModelName, c.Writer, modifiedReq); err != nil {
+	if pm.metricsMonitor != nil {
+		if err := pm.metricsMonitor.WrapHandler(realModelName, c.Writer, modifiedReq, processGroup.ProxyRequest); err != nil {
+			if !c.Writer.Written() {
+				pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error proxying metrics wrapped request: %s", err.Error()))
+			}
+			pm.proxyLogger.Errorf("Error Proxying Metrics Wrapped Request for processGroup %s and model %s: %v", processGroup.id, realModelName, err)
+			return
+		}
+		return
+	}
+	if err := processGroup.ProxyRequest(realModelName, c.Writer, modifiedReq); err != nil {
 		pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error proxying request: %s", err.Error()))
 		pm.proxyLogger.Errorf("Error Proxying Request for processGroup %s and model %s", processGroup.id, realModelName)
 		return
 	}
```
402-404: CORS correctness: add Vary: Origin. Responses vary by Origin; add Vary to enable proper caching behavior.

```diff
 	if origin := c.GetHeader("Origin"); origin != "" {
 		c.Header("Access-Control-Allow-Origin", origin)
+		c.Header("Vary", "Origin")
 	}
```
244-251: Capture WriteString too, otherwise some writes bypass the buffer. Upstream may call WriteString; without overriding it, body capture can be incomplete.

```diff
 func (w *ResponseBodyCopier) Write(b []byte) (int, error) {
 	if w.start.IsZero() {
 		w.start = time.Now()
 	}
 	// Single write operation that writes to both the response and buffer
 	return w.tee.Write(b)
 }
+
+func (w *ResponseBodyCopier) WriteString(s string) (int, error) {
+	if w.start.IsZero() {
+		w.start = time.Now()
+	}
+	return w.tee.Write([]byte(s))
+}
```
134-178: Memory risk: buffering entire SSE response. ResponseBodyCopier stores full streaming bodies; long chats can exhaust memory. Consider incremental SSE parsing (track only the last valid `data:` line) or apply a bounded ring buffer for SSE content.
If helpful, I can prototype a minimal rolling window parser that processes chunks as they stream and retains only the last data: JSON block.
proxy/metrics_monitor_test.go (1)
345-364: Update expectations if WrapHandler stops surfacing parse errors. If you adopt the non-breaking WrapHandler behavior (only propagate next() errors), change these cases to assert no error and no metrics recorded.
Examples:
- Lines 345-364 “invalid JSON”: replace assert.Error with assert.NoError and assert.Len(mm.GetMetrics(), 0).
- Lines 651-654 “no valid JSON data found in stream”: same.
- Lines 385-405 “response without usage or timings”: same.
Also applies to: 651-654, 385-405
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- proxy/metrics_middleware.go (0 hunks)
- proxy/metrics_monitor.go (2 hunks)
- proxy/metrics_monitor_test.go (1 hunks)
- proxy/proxymanager.go (3 hunks)
💤 Files with no reviewable changes (1)
- proxy/metrics_middleware.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go
📄 CodeRabbit inference engine (CLAUDE.md)
Fix all staticcheck-reported issues in Go code
Files:
- proxy/proxymanager.go
- proxy/metrics_monitor.go
- proxy/metrics_monitor_test.go
🧬 Code graph analysis (1)
proxy/metrics_monitor_test.go (3)
proxy/config/config.go (1)
Config (112-132)
proxy/metrics_monitor.go (4)
NewMetricsMonitor (49-60), TokenMetrics (20-30), TokenMetricsEvent (33-35), NewMetricsBodyRecorder (235-242)
event/default.go (1)
On (16-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: run-tests
- GitHub Check: run-tests
This PR watches the /upstream/{model}/... endpoints for chat requests and includes them on the activity page.