
Conversation

@mostlygeek mostlygeek commented Oct 25, 2025

This PR watches the /upstream/{model}/... endpoints for chat requests and includes their metrics on the activity page.

Summary by CodeRabbit

  • Refactor
    • Centralized and streamlined metrics collection and wrapping logic for proxy requests, reducing per-route middleware usage.
    • Tightened internal visibility and simplified metrics buffering/retention for more efficient runtime behavior.
    • Improved handling of streaming and non‑streaming responses to keep metrics recording consistent.
  • Tests
    • Expanded metrics unit and integration tests; a couple of legacy middleware tests were removed to match the new approach.

- remove metrics_middleware.go as this wrapper is no longer needed. This
  also eliminates double body parsing for the modelID
- move metrics parsing to be part of MetricsMonitor
- refactor how metrics are recorded in ProxyManager
coderabbitai bot commented Oct 25, 2025

Walkthrough

The diff removes the Gin middleware-based metrics capture file and refactors metrics into an unexported metricsMonitor with a handler-wrapping approach; ProxyManager is updated to use the new metrics wrapping for the relevant POST proxy paths, and the route-level middleware is removed.
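
As a rough illustration of that handler-wrapping shape, here is a minimal sketch; the names (metricsMonitor, responseBodyCopier, wrapHandler) follow the walkthrough, but the signatures and bodies are assumptions rather than the actual llama-swap code:

package proxy

import (
    "bytes"
    "io"
    "net/http"
)

// metricsMonitor and responseBodyCopier here are stand-ins for the real,
// unexported types; only the overall shape is meant to match.
type metricsMonitor struct{}

// responseBodyCopier tees everything written to the client into a buffer so
// metrics can be parsed after the upstream response has completed.
type responseBodyCopier struct {
    http.ResponseWriter
    buf bytes.Buffer
}

func (c *responseBodyCopier) Write(b []byte) (int, error) {
    return io.MultiWriter(c.ResponseWriter, &c.buf).Write(b)
}

// wrapHandler runs the real proxy handler with the copier in place, then
// parses and records metrics once the handler returns.
func (m *metricsMonitor) wrapHandler(
    modelID string,
    w http.ResponseWriter,
    r *http.Request,
    next func(model string, w http.ResponseWriter, r *http.Request) error,
) error {
    copier := &responseBodyCopier{ResponseWriter: w}
    if err := next(modelID, copier, r); err != nil {
        return err
    }
    // In the real code, parseMetrics/addMetrics would run here against
    // copier.buf.Bytes() (a JSON body or a captured SSE stream).
    return nil
}

Since the wrapping happens at the proxy call site where the modelID is already known, the request body no longer needs a second parse just to recover it, which is the double parsing the PR notes eliminating.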

Changes

  • Metrics middleware removal (proxy/metrics_middleware.go): File deleted; removes the Gin middleware and associated types/functions (MetricsMiddleware, MetricsRecorder, MetricsResponseWriter, and their helpers) that previously captured and parsed streaming and non-streaming metrics.
  • Metrics monitor refactor (proxy/metrics_monitor.go): MetricsMonitor renamed to the unexported metricsMonitor; constructor and API made unexported (newMetricsMonitor, getMetrics, getMetricsJSON); response-capture and parsing logic moved/renamed to unexported helpers (e.g., responseBodyCopier, newBodyCopier, parseMetrics, processStreamingResponse); wrapHandler retained as the metrics-wrapping entry point on the unexported type.
  • Metrics monitor tests added (proxy/metrics_monitor_test.go): New comprehensive tests covering metrics addition/eviction, JSON serialization, event emission, wrapHandler behavior (streaming, non-streaming, and edge cases), response body capture, concurrency, parsing precedence rules, and benchmarks.
  • ProxyManager integration (proxy/proxymanager.go, proxy/proxymanager_api.go): ProxyManager now holds an unexported *metricsMonitor initialized via newMetricsMonitor(...); route-level MetricsMiddleware removed; proxy paths are conditionally wrapped with metricsMonitor.wrapHandler for POST flows; calls to GetMetrics/GetMetricsJSON replaced with getMetrics/getMetricsJSON.
  • Tests removal (proxy/proxymanager_test.go): Two tests asserting middleware-emitted metrics for non-streaming and streaming flows were removed.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ProxyManager
    participant MetricsMonitor
    participant Upstream

    Client->>ProxyManager: HTTP POST request
    alt metricsMonitor present
        ProxyManager->>MetricsMonitor: wrapHandler(modelID, wrappedWriter, req)
        MetricsMonitor->>MetricsMonitor: wrap response writer (responseBodyCopier)
        MetricsMonitor->>ProxyManager: call next handler with wrapped writer
        ProxyManager->>Upstream: forward request
        Upstream-->>ProxyManager: Response (JSON or SSE)
        ProxyManager-->>MetricsMonitor: response flows through responseBodyCopier
        MetricsMonitor->>MetricsMonitor: parse metrics (streaming or non-streaming)
        MetricsMonitor->>MetricsMonitor: record TokenMetrics & emit event
        MetricsMonitor-->>Client: proxied response returned
    else no metricsMonitor
        ProxyManager->>Upstream: forward request directly
        Upstream-->>Client: proxied response returned
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Attention areas:
    • Verify behavioral parity between the removed middleware and the new wrapHandler parsing (timings vs. usage precedence, streaming SSE parsing); see the sketch after this list.
    • Confirm ProxyManager error propagation and status/header preservation when using wrapped handlers.
    • Inspect concurrency and eviction semantics in the unexported metricsMonitor and its tests.
    • Validate that the visibility changes (exported -> unexported) do not break callers relying on the previously exported symbols.
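
On the first point, the sketch below shows one way a timings-vs-usage precedence rule could look; the payload field names (llama.cpp-style timings, OpenAI-style usage) and the tokenCounts helper are assumptions for illustration, not the project's parseMetrics:

package proxy

import "encoding/json"

// upstreamStats models the two places token counts may appear in an upstream
// response body; both shapes here are assumed, not taken from llama-swap.
type upstreamStats struct {
    Usage *struct {
        PromptTokens     int `json:"prompt_tokens"`
        CompletionTokens int `json:"completion_tokens"`
    } `json:"usage"`
    Timings *struct {
        PromptN            int     `json:"prompt_n"`
        PredictedN         int     `json:"predicted_n"`
        PredictedPerSecond float64 `json:"predicted_per_second"`
    } `json:"timings"`
}

// tokenCounts resolves the precedence one possible way: timings win when both
// are present, since they also carry generation-speed data.
func tokenCounts(raw []byte) (prompt, completion int, ok bool) {
    var s upstreamStats
    if err := json.Unmarshal(raw, &s); err != nil {
        return 0, 0, false
    }
    switch {
    case s.Timings != nil:
        return s.Timings.PromptN, s.Timings.PredictedN, true
    case s.Usage != nil:
        return s.Usage.PromptTokens, s.Usage.CompletionTokens, true
    default:
        return 0, 0, false
    }
}

Whichever source wins in the real parseMetrics, the parity check is that responses containing both fields produce the same recorded metrics as they did under the old middleware.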


Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 6.67%, which is below the required threshold of 80.00%. Run @coderabbitai generate docstrings to improve coverage.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): The pull request title "Include metrics from upstream chat requests" directly aligns with the stated PR objective "Refactor metrics to include upstream stats." The changes demonstrate that the core goal is to capture and record metrics data from upstream responses, which the title accurately describes. While the implementation involves significant architectural changes (deleting the metrics middleware, making MetricsMonitor unexported, and restructuring how metrics collection occurs at the handler level), these changes serve the stated objective of including upstream metrics. The title appropriately focuses on the intended outcome rather than implementation details.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
proxy/proxymanager.go (1)

488-488: Typo in error message.

Use “read” instead of “ready”.

-        pm.sendErrorResponse(c, http.StatusBadRequest, "could not ready request body")
+        pm.sendErrorResponse(c, http.StatusBadRequest, "could not read request body")
proxy/metrics_monitor.go (1)

63-75: Avoid emitting events under the write lock (deadlock risk).

Event handlers may re-enter the MetricsMonitor (e.g., by calling GetMetrics), causing a self-deadlock. Emit after unlocking.

 func (mp *MetricsMonitor) addMetrics(metric TokenMetrics) {
-    mp.mu.Lock()
-    defer mp.mu.Unlock()
-
-    metric.ID = mp.nextID
-    mp.nextID++
-    mp.metrics = append(mp.metrics, metric)
-    if len(mp.metrics) > mp.maxMetrics {
-        mp.metrics = mp.metrics[len(mp.metrics)-mp.maxMetrics:]
-    }
-    event.Emit(TokenMetricsEvent{Metrics: metric})
+    mp.mu.Lock()
+    metric.ID = mp.nextID
+    mp.nextID++
+    mp.metrics = append(mp.metrics, metric)
+    if len(mp.metrics) > mp.maxMetrics {
+        mp.metrics = mp.metrics[len(mp.metrics)-mp.maxMetrics:]
+    }
+    mp.mu.Unlock()
+    event.Emit(TokenMetricsEvent{Metrics: metric})
 }
🧹 Nitpick comments (5)
proxy/proxymanager.go (2)

659-665: Consider metrics for multipart endpoints too.

proxyOAIPostFormHandler doesn’t record metrics. If upstream returns usage/timings for audio/transcriptions, wrap this path as well.

-    if err := processGroup.ProxyRequest(realModelName, c.Writer, modifiedReq); err != nil {
+    if pm.metricsMonitor != nil {
+        if err := pm.metricsMonitor.WrapHandler(realModelName, c.Writer, modifiedReq, processGroup.ProxyRequest); err != nil {
+            if !c.Writer.Written() {
+                pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error proxying metrics wrapped request: %s", err.Error()))
+            }
+            pm.proxyLogger.Errorf("Error Proxying Metrics Wrapped Request for processGroup %s and model %s: %v", processGroup.id, realModelName, err)
+            return
+        }
+        return
+    }
+    if err := processGroup.ProxyRequest(realModelName, c.Writer, modifiedReq); err != nil {
         pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error proxying request: %s", err.Error()))
         pm.proxyLogger.Errorf("Error Proxying Request for processGroup %s and model %s", processGroup.id, realModelName)
         return
     }

402-404: CORS correctness: add Vary: Origin.

Responses vary by Origin; add Vary to enable proper caching behavior.

     if origin := c.GetHeader("Origin"); origin != "" {
         c.Header("Access-Control-Allow-Origin", origin)
+        c.Header("Vary", "Origin")
     }
proxy/metrics_monitor.go (2)

244-251: Capture WriteString too, otherwise some writes bypass the buffer.

Upstream may call WriteString; without overriding it, body capture can be incomplete.

 func (w *ResponseBodyCopier) Write(b []byte) (int, error) {
     if w.start.IsZero() {
         w.start = time.Now()
     }
     // Single write operation that writes to both the response and buffer
     return w.tee.Write(b)
 }
 
+func (w *ResponseBodyCopier) WriteString(s string) (int, error) {
+    if w.start.IsZero() {
+        w.start = time.Now()
+    }
+    return w.tee.Write([]byte(s))
+}

134-178: Memory risk: buffering entire SSE response.

ResponseBodyCopier stores full streaming bodies; long chats can exhaust memory. Consider incremental SSE parsing (track only the last valid data: line) or a bounded ring buffer for SSE content.

If helpful, I can prototype a minimal rolling window parser that processes chunks as they stream and retains only the last data: JSON block.
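
For reference, a minimal sketch of that rolling-window idea, assuming plain "data: "-framed SSE lines and skipping the [DONE] sentinel; sseTailWriter is an illustrative name, not the project's responseBodyCopier:

package proxy

import (
    "bytes"
    "net/http"
)

// sseTailWriter forwards every chunk to the client but retains only the most
// recent "data: " payload, so memory stays bounded regardless of stream length.
type sseTailWriter struct {
    http.ResponseWriter
    partial  []byte // carry-over for a line split across chunks
    lastData []byte // most recent SSE data payload seen so far
}

func (w *sseTailWriter) Write(b []byte) (int, error) {
    n, err := w.ResponseWriter.Write(b)
    w.partial = append(w.partial, b[:n]...)
    for {
        i := bytes.IndexByte(w.partial, '\n')
        if i < 0 {
            break // wait for the rest of the line in a later chunk
        }
        line := bytes.TrimSpace(w.partial[:i])
        w.partial = w.partial[i+1:]
        if bytes.HasPrefix(line, []byte("data: ")) {
            payload := bytes.TrimPrefix(line, []byte("data: "))
            if !bytes.Equal(payload, []byte("[DONE]")) {
                // Copy, since payload aliases the scratch buffer.
                w.lastData = append(w.lastData[:0], payload...)
            }
        }
    }
    return n, err
}

When the stream ends, only lastData needs parsing; for llama.cpp-style upstreams that is typically the final chunk carrying timings/usage.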

proxy/metrics_monitor_test.go (1)

345-364: Update expectations if WrapHandler stops surfacing parse errors.

If you adopt the non-breaking WrapHandler behavior (only propagate next() errors), change these cases to assert no error and no metrics recorded.

Examples:

  • Lines 345-364 “invalid JSON”: replace assert.Error with assert.NoError and assert.Len(mm.GetMetrics(), 0).
  • Lines 651-654 “no valid JSON data found in stream”: same.
  • Lines 385-405 “response without usage or timings”: same.

Also applies to: 651-654, 385-405

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d18dc26 and a66bb55.

📒 Files selected for processing (4)
  • proxy/metrics_middleware.go (0 hunks)
  • proxy/metrics_monitor.go (2 hunks)
  • proxy/metrics_monitor_test.go (1 hunks)
  • proxy/proxymanager.go (3 hunks)
💤 Files with no reviewable changes (1)
  • proxy/metrics_middleware.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (CLAUDE.md)

Fix all staticcheck-reported issues in Go code

Files:

  • proxy/proxymanager.go
  • proxy/metrics_monitor.go
  • proxy/metrics_monitor_test.go
🧬 Code graph analysis (1)
proxy/metrics_monitor_test.go (3)
proxy/config/config.go (1)
  • Config (112-132)
proxy/metrics_monitor.go (4)
  • NewMetricsMonitor (49-60)
  • TokenMetrics (20-30)
  • TokenMetricsEvent (33-35)
  • NewMetricsBodyRecorder (235-242)
event/default.go (1)
  • On (16-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run-tests
  • GitHub Check: run-tests

@mostlygeek mostlygeek changed the title Refactor metrics to include upstream stats Include metrics from upstream chat requests Oct 26, 2025
@mostlygeek mostlygeek merged commit e250e71 into main Oct 26, 2025
3 checks passed
@mostlygeek mostlygeek deleted the metrics-for-upstream branch October 26, 2025 00:38
