
Conversation

@mostlygeek (Owner) commented Oct 22, 2025

Problems fixed in calculating average tokens/second:

  • filter out all requests with missing data
  • use sum(tokens generated)/sum(time) instead of an average of averages
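For illustration, the difference between the two averaging strategies can be sketched as follows (a standalone sketch using a hypothetical `Metric` shape, not the actual types from `ui/src/pages/Models.tsx`):

```typescript
interface Metric {
  output_tokens: number;
  duration_ms: number;
}

// An average of per-request averages weights a tiny 1-token request the
// same as a 1,000-token request, skewing the result toward outliers.
function averageOfAverages(metrics: Metric[]): number {
  const rates = metrics.map((m) => m.output_tokens / (m.duration_ms / 1000));
  return rates.reduce((a, b) => a + b, 0) / rates.length;
}

// Summing tokens and time first weights every token equally,
// giving the true aggregate throughput.
function aggregateRate(metrics: Metric[]): number {
  const valid = metrics.filter((m) => m.duration_ms > 0 && m.output_tokens > 0);
  const tokens = valid.reduce((sum, m) => sum + m.output_tokens, 0);
  const seconds = valid.reduce((sum, m) => sum + m.duration_ms, 0) / 1000;
  return seconds > 0 ? tokens / seconds : 0;
}
```

With one long request at 100 tok/s and one short request at 10 tok/s, the average of averages reports 55 tok/s, while the aggregate rate stays near 100 tok/s because almost all tokens came from the fast request.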

Additionally, Vite was upgraded to v6.4.1 to address some minor security issues.

fixes #355

New look with percentiles and a histogram:
(screenshot)

Summary by CodeRabbit

  • New Features
    • Added histogram chart visualization for token metrics with percentile indicators (P50, P95, P99)
    • Enhanced token statistics panel to display tokens per second with detailed percentile breakdown
    • Improved token metrics formatting for better readability

coderabbitai bot commented Oct 22, 2025

Walkthrough

Modified the Models page to add token statistics visualization with percentile metrics (P50, P95, P99) and histogram rendering. Introduced filtering logic to exclude metrics with no output tokens, preventing negative tokens-per-second calculations. Added a new TokenHistogram component for bar chart display with percentile indicators.

Changes

Cohort / File(s): Token Statistics and Histogram Visualization — ui/src/pages/Models.tsx

Summary: Added HistogramData interface and TokenHistogram component for responsive histogram rendering. Updated StatsPanel useMemo to compute percentiles (P50, P95, P99) from valid metrics, build histogram data, and return formatted token statistics. Implemented filtering to exclude metrics where duration_ms <= 0 or output_tokens <= 0. Enhanced UI table with "Token Stats (tokens/sec)" column displaying percentiles and embedded histogram. Applied Intl.NumberFormat for token value formatting.

Sequence Diagram(s)

sequenceDiagram
    participant SM as StatsPanel useMemo
    participant FM as Filter & Compute
    participant TS as Token Stats
    participant HD as Histogram Data
    participant UI as UI Render

    SM->>FM: Receive metrics array
    FM->>FM: Filter: duration_ms > 0 && output_tokens > 0
    alt Valid metrics exist
        FM->>TS: Calculate tokens/sec for each metric
        TS->>TS: Sort values & compute P50, P95, P99
        FM->>HD: Build bins from tokens/sec range
        TS->>UI: Return formatted percentiles + histogram
    else No valid metrics
        FM->>TS: Return placeholder values (0.00, 0.00, 0.00)
        FM->>HD: Return null histogram
        TS->>UI: Render zero stats
    end
    UI->>UI: Display percentiles & TokenHistogram component

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve new component introduction, statistical computation logic, and filtering to address the negative tokens-per-second issue. While contained to a single file, the modification spans multiple concerns (data filtering, percentile calculation, histogram generation, and UI updates), requiring verification of edge cases and mathematical correctness.

Suggested labels

enhancement, UI

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

Out of Scope Changes Check — ⚠️ Warning
Explanation: While the core calculation fix (filtering invalid metrics and using sum-based computation) directly addresses issue #355's requirements, several UI enhancements appear out of scope: the TokenHistogram component with percentile line rendering, the new "Token Stats (tokens/sec)" column with nested percentile displays, and formatting updates with Intl.NumberFormat were not mentioned as requirements in the linked issue. Issue #355 specifically requests preventing negative values and excluding non-token-producing models, not visualization enhancements or histogram rendering. Additionally, the PR objectives mention upgrading Vite to v6.4.1 for security, which is unrelated to the models page token calculation fix.
Resolution: Review whether the histogram visualization, percentile rendering, and new table column are necessary supporting changes for the calculation fix or should be addressed in a separate UI enhancement PR. Confirm whether the Vite upgrade mentioned in the PR objectives is actually included in this changeset or belongs in a separate dependency update PR. Consider narrowing this PR to focus on fixing the token/sec calculation logic (the core requirement from issue #355) and moving UI enhancements to a follow-up PR if they're not essential for displaying the fix correctly.

Docstring Coverage — ⚠️ Warning
Explanation: Docstring coverage is 0.00%, which is below the required threshold of 80.00%.
Resolution: Run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (3 passed)

Description Check — ✅ Passed
Check skipped — CodeRabbit's high-level summary is enabled.

Title Check — ✅ Passed
The pull request title "ui: fix avg token/sec calculation on models page" directly relates to the primary objective of fixing the average tokens-per-second calculation on the models tab, which is the core change highlighted in the PR objectives. The title accurately identifies the main fix without needing to cover implementation details like percentile calculations or histogram rendering, which are supporting changes. The title is clear, concise, and specific enough for a developer scanning the history to understand the primary change.

Linked Issues Check — ✅ Passed
The code changes address all primary coding requirements from linked issue #355: filtering out metrics where output_tokens is zero (excluding embedding models that don't produce tokens), computing tokens-per-second using sum(tokens generated) / sum(time) instead of averaging per-request values, and preventing negative values by excluding invalid data. The changes to derive percentiles (P50, P95, P99) and build histogram data provide proper handling of the corrected calculation and support the visualization of token statistics.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch negative-stats-355


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
ui/src/pages/Models.tsx (2)

219-219: Move NumberFormat instantiation outside the component.

Creating a new Intl.NumberFormat() instance on every render is inefficient. Move it outside the component body or wrap it in useMemo.

Apply this diff to move the formatter outside:

+const nf = new Intl.NumberFormat();
+
 function StatsPanel() {
   const { metrics } = useAPI();
 
   const [totalRequests, totalInputTokens, totalOutputTokens, avgTokensPerSecond] = useMemo(() => {
     // ... calculation logic
   }, [metrics]);
 
-  const nf = new Intl.NumberFormat();
-
   return (

Alternatively, if you need it to be reactive to locale changes:

+  const nf = useMemo(() => new Intl.NumberFormat(), []);

236-238: Number formatting looks good; nullish coalescing is unnecessary.

Using nf.format() for token counts improves readability and internationalization. However, the ?? 0 at line 238 is unnecessary since avgTokensPerSecond is always defined (either 0 from line 209 or a string from line 214).

You can simplify line 238:

-              <td>{avgTokensPerSecond ?? 0}</td>
+              <td>{avgTokensPerSecond}</td>
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c07179d and 9db1d65.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (1)
  • ui/src/pages/Models.tsx (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: run-tests
🔇 Additional comments (2)
ui/src/pages/Models.tsx (2)

197-201: LGTM! Early return now correctly returns 4 values.

The extended return value matches the destructured array and properly handles the new avgTokensPerSecond field.


204-216: Excellent fix for the negative tokens/sec issue!

The new calculation correctly:

  • Filters out invalid metrics (zero duration or zero output tokens)
  • Uses sum-of-tokens / sum-of-time instead of averaging per-request rates (which incorrectly weights all requests equally)
  • Prevents division by zero

This properly addresses the objectives from issue #355.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (2)
ui/src/pages/Models.tsx (2)

388-388: Consider adjusting minimum bin count for small datasets.

When there are very few valid metrics (e.g., 1-5), the minimum bin count of 10 creates a sparse histogram with mostly empty bins. Consider reducing the minimum or using Math.min(binCount, tokensPerSecond.length) for better visualization.

-const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5)));
+const binCount = Math.min(30, Math.max(5, Math.floor(tokensPerSecond.length / 3)));

420-420: Move NumberFormat instantiation outside render cycle.

Creating new Intl.NumberFormat() inside the component body causes it to be recreated on every render, which is wasteful.

Apply this diff to optimize:

+const nf = new Intl.NumberFormat();
+
 function StatsPanel() {
   const { metrics } = useAPI();
 
   const [totalRequests, totalInputTokens, totalOutputTokens, tokenStats, histogramData] = useMemo(() => {
     // ...
   }, [metrics]);
 
-  const nf = new Intl.NumberFormat();
-
   return (
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 175761e and a6d7ba5.

📒 Files selected for processing (1)
  • ui/src/pages/Models.tsx (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ui/src/pages/Models.tsx (1)
ui/src/contexts/APIProvider.tsx (1)
  • useAPI (240-246)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: run-tests
  • GitHub Check: run-tests
🔇 Additional comments (5)
ui/src/pages/Models.tsx (5)

194-202: LGTM!

The interface structure appropriately captures histogram data and percentile metrics.


204-351: LGTM with assumption of valid input data.

The histogram rendering is well-structured with responsive SVG, percentile indicators, and hover tooltips. The component assumes that bins is non-empty and that max > min (non-zero range). These assumptions should hold given the upstream filtering, but it's worth verifying edge cases are handled properly in the calling code.


364-383: Verify that percentile-based approach aligns with PR objectives.

The PR description states "Using sum(tokens generated) / sum(time) instead of averaging per-request averages," but the implementation computes individual per-request token rates and derives percentiles (P50, P95, P99) rather than a single aggregate rate.

The current approach provides a richer distribution view and is arguably better for performance analysis. However, please confirm this matches the intended design.

For reference, a sum-based aggregate would be:

const totalOutputTokens = validMetrics.reduce((sum, m) => sum + m.output_tokens, 0);
const totalDurationSeconds = validMetrics.reduce((sum, m) => sum + m.duration_ms, 0) / 1000;
const avgTokensPerSecond = totalOutputTokens / totalDurationSeconds;

364-369: Excellent fix for the token calculation issue!

The filtering logic correctly addresses the PR objectives by:

  • Excluding metrics with invalid duration (duration_ms > 0)
  • Filtering out requests with no output tokens (output_tokens > 0), such as embedding model calls
  • Preventing negative or undefined tokens-per-second calculations

This ensures the statistics are meaningful and accurate.


422-493: LGTM!

The updated UI effectively presents token statistics with:

  • Clear column headers and data organization
  • Readable number formatting with thousand separators
  • Well-structured percentile display with visual badges
  • Conditional histogram rendering with proper null checks
  • Dark mode support throughout

Comment on lines +388 to +395
const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5))); // Adaptive bin count
const binSize = (max - min) / binCount;

const bins = Array(binCount).fill(0);
tokensPerSecond.forEach((value) => {
const binIndex = Math.min(Math.floor((value - min) / binSize), binCount - 1);
bins[binIndex]++;
});

⚠️ Potential issue | 🔴 Critical

Division by zero when all token rates are identical.

If all tokensPerSecond values are the same, max === min and binSize becomes 0 (line 389). This causes division by zero on line 393: (value - min) / binSize, resulting in NaN for binIndex.
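A minimal standalone reproduction of this failure mode (not the component code itself):

```typescript
// When every sample is identical, max === min, so binSize === 0.
const tokensPerSecond = [42, 42, 42];
const min = Math.min(...tokensPerSecond);
const max = Math.max(...tokensPerSecond);
const binSize = (max - min) / 10; // 0

// (42 - min) / 0 is 0/0, which is NaN; Math.floor(NaN) and
// Math.min(NaN, 9) both stay NaN, so bins[NaN]++ would silently
// fail to count anything.
const binIndex = Math.min(Math.floor((42 - min) / binSize), 9);
console.log(Number.isNaN(binIndex)); // true
```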

Apply this diff to handle the edge case:

 const min = Math.min(...tokensPerSecond);
 const max = Math.max(...tokensPerSecond);
+
+// Handle case where all values are identical
+if (min === max) {
+  const histogramData = {
+    bins: [tokensPerSecond.length],
+    min: min,
+    max: min,
+    binSize: 0,
+    p99,
+    p95,
+    p50,
+  };
+  return [
+    totalRequests,
+    totalInputTokens,
+    totalOutputTokens,
+    {
+      p99: p99.toFixed(2),
+      p95: p95.toFixed(2),
+      p50: p50.toFixed(2),
+    },
+    histogramData,
+  ];
+}
+
 const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5)));
 const binSize = (max - min) / binCount;

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In ui/src/pages/Models.tsx around lines 388 to 395, the computation of binSize
can be zero when max === min causing division by zero and NaN bin indices;
detect this edge case and handle it by setting a safe fallback: if max === min
(or computed binSize === 0) populate a single bin (e.g., put all counts into bin
0 or the middle bin) instead of computing indices, otherwise compute binSize and
binIndex as before; ensure you still use Math.min to clamp indices and avoid
dividing by zero.

@mostlygeek mostlygeek merged commit 8357714 into main Oct 24, 2025
3 checks passed


Development

Successfully merging this pull request may close these issues.

Negative tokens per second on models tab

2 participants