ui: fix avg token/sec calculation on models page #357
Conversation
Problems fixed in calculating average tokens/second:

- filter out all requests with missing data
- use sum(tokens generated) / sum(time) instead of an average of averages

Additionally, vite was upgraded to v6.4.1 to account for some small security issues.

fixes #355
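The difference between the two approaches can be sketched as follows. This is an illustrative sketch with a hypothetical `Metric` shape, not the actual PR code; the point is that averaging per-request rates weights a 1-second request the same as a 10-second one, while sum/sum weights by actual time spent.

```typescript
// Hypothetical metric shape for illustration only.
interface Metric {
  duration_ms: number;
  output_tokens: number;
}

// Old approach: average of per-request averages (weights all requests equally).
function avgOfAverages(metrics: Metric[]): number {
  const rates = metrics
    .filter((m) => m.duration_ms > 0 && m.output_tokens > 0)
    .map((m) => m.output_tokens / (m.duration_ms / 1000));
  return rates.length ? rates.reduce((a, b) => a + b, 0) / rates.length : 0;
}

// Fixed approach: sum(tokens) / sum(time), weighted by actual duration.
function aggregateRate(metrics: Metric[]): number {
  const valid = metrics.filter((m) => m.duration_ms > 0 && m.output_tokens > 0);
  const tokens = valid.reduce((s, m) => s + m.output_tokens, 0);
  const seconds = valid.reduce((s, m) => s + m.duration_ms, 0) / 1000;
  return seconds > 0 ? tokens / seconds : 0;
}
```

For example, a 1 s request at 100 tok/s plus a 10 s request at 10 tok/s averages to 55 tok/s the old way, but the true throughput is 200 tokens over 11 seconds, about 18 tok/s.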
Walkthrough

Modified the Models page to add token statistics visualization with percentile metrics (P50, P95, P99) and histogram rendering. Introduced filtering logic to exclude metrics with no output tokens, preventing negative tokens-per-second calculations. Added a new TokenHistogram component.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant SM as StatsPanel useMemo
    participant FM as Filter & Compute
    participant TS as Token Stats
    participant HD as Histogram Data
    participant UI as UI Render
    SM->>FM: Receive metrics array
    FM->>FM: Filter: duration_ms > 0 && output_tokens > 0
    alt Valid metrics exist
        FM->>TS: Calculate tokens/sec for each metric
        TS->>TS: Sort values & compute P50, P95, P99
        FM->>HD: Build bins from tokens/sec range
        TS->>UI: Return formatted percentiles + histogram
    else No valid metrics
        FM->>TS: Return placeholder values (0.00, 0.00, 0.00)
        FM->>HD: Return null histogram
        TS->>UI: Render zero stats
    end
    UI->>UI: Display percentiles & TokenHistogram component
```
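The "Sort values & compute P50, P95, P99" step in the diagram can be sketched like this. This is a simplified nearest-rank illustration; the PR's exact percentile interpolation may differ.

```typescript
// Nearest-rank percentile over an ascending sorted array (assumes non-empty input).
function percentile(sorted: number[], p: number): number {
  const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length));
  return sorted[idx];
}

// Hypothetical per-request tokens/sec values.
const tokensPerSecond = [12.1, 45.3, 30.2, 28.7, 51.0].sort((a, b) => a - b);
const p50 = percentile(tokensPerSecond, 50); // middle of the distribution
const p95 = percentile(tokensPerSecond, 95); // upper tail
const p99 = percentile(tokensPerSecond, 99);
```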
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

The changes involve new component introduction, statistical computation logic, and filtering to address the negative tokens-per-second issue. While contained to a single file, the modification spans multiple concerns (data filtering, percentile calculation, histogram generation, and UI updates), requiring verification of edge cases and mathematical correctness.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ui/src/pages/Models.tsx (2)
219-219: Move NumberFormat instantiation outside the component.

Creating a new `Intl.NumberFormat()` instance on every render is inefficient. Move it outside the component body or wrap it in `useMemo`. Apply this diff to move the formatter outside:

```diff
+const nf = new Intl.NumberFormat();
+
 function StatsPanel() {
   const { metrics } = useAPI();
   const [totalRequests, totalInputTokens, totalOutputTokens, avgTokensPerSecond] = useMemo(() => {
     // ... calculation logic
   }, [metrics]);
-  const nf = new Intl.NumberFormat();
-
   return (
```

Alternatively, if you need it to be reactive to locale changes:

```diff
+  const nf = useMemo(() => new Intl.NumberFormat(), []);
```
236-238: Number formatting looks good; nullish coalescing is unnecessary.

Using `nf.format()` for token counts improves readability and internationalization. However, the `?? 0` at line 238 is unnecessary since `avgTokensPerSecond` is always defined (either `0` from line 209 or a string from line 214). You can simplify line 238:

```diff
- <td>{avgTokensPerSecond ?? 0}</td>
+ <td>{avgTokensPerSecond}</td>
```
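For reference, a hoisted formatter is constructed once and then reused across renders. This is an illustrative snippet with the locale pinned to "en-US" so the output is deterministic; the component presumably uses the default locale.

```typescript
// Module-scope formatter: created once, not on every render.
const nf = new Intl.NumberFormat("en-US");

// Thousand separators for token counts, e.g. 1234567 -> "1,234,567" under en-US.
const formatted = nf.format(1234567);
```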
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`ui/package-lock.json` is excluded by `!**/package-lock.json`
📒 Files selected for processing (1)
`ui/src/pages/Models.tsx` (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: run-tests
🔇 Additional comments (2)
ui/src/pages/Models.tsx (2)
197-201: LGTM! Early return now correctly returns 4 values.

The extended return value matches the destructured array and properly handles the new `avgTokensPerSecond` field.
204-216: Excellent fix for the negative tokens/sec issue!The new calculation correctly:
- Filters out invalid metrics (zero duration or zero output tokens)
- Uses sum-of-tokens / sum-of-time instead of averaging per-request rates (which incorrectly weights all requests equally)
- Prevents division by zero
This properly addresses the objectives from issue #355.
Actionable comments posted: 1
🧹 Nitpick comments (2)
ui/src/pages/Models.tsx (2)
388-388: Consider adjusting minimum bin count for small datasets.

When there are very few valid metrics (e.g., 1-5), the minimum bin count of 10 creates a sparse histogram with mostly empty bins. Consider reducing the minimum or using `Math.min(binCount, tokensPerSecond.length)` for better visualization.

```diff
-const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5)));
+const binCount = Math.min(30, Math.max(5, Math.floor(tokensPerSecond.length / 3)));
```
420-420: Move NumberFormat instantiation outside render cycle.

Creating `new Intl.NumberFormat()` inside the component body causes it to be recreated on every render, which is wasteful. Apply this diff to optimize:

```diff
+const nf = new Intl.NumberFormat();
+
 function StatsPanel() {
   const { metrics } = useAPI();
   const [totalRequests, totalInputTokens, totalOutputTokens, tokenStats, histogramData] = useMemo(() => {
     // ...
   }, [metrics]);
-  const nf = new Intl.NumberFormat();
-
   return (
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
`ui/src/pages/Models.tsx` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
ui/src/pages/Models.tsx (1)
ui/src/contexts/APIProvider.tsx (1)
`useAPI` (240-246)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: run-tests
- GitHub Check: run-tests
🔇 Additional comments (5)
ui/src/pages/Models.tsx (5)
194-202: LGTM!

The interface structure appropriately captures histogram data and percentile metrics.
204-351: LGTM with assumption of valid input data.

The histogram rendering is well-structured with responsive SVG, percentile indicators, and hover tooltips. The component assumes that `bins` is non-empty and that `max > min` (non-zero range). These assumptions should hold given the upstream filtering, but it's worth verifying that edge cases are handled properly in the calling code.
364-383: Verify that the percentile-based approach aligns with the PR objectives.

The PR description states "use sum(tokens generated) / sum(time) instead of an average of averages," but the implementation computes individual per-request token rates and derives percentiles (P50, P95, P99) rather than a single aggregate rate. The current approach provides a richer distribution view and is arguably better for performance analysis. However, please confirm this matches the intended design.

For reference, a sum-based aggregate would be:

```tsx
const totalOutputTokens = validMetrics.reduce((sum, m) => sum + m.output_tokens, 0);
const totalDurationSeconds = validMetrics.reduce((sum, m) => sum + m.duration_ms, 0) / 1000;
const avgTokensPerSecond = totalOutputTokens / totalDurationSeconds;
```
364-369: Excellent fix for the token calculation issue!

The filtering logic correctly addresses the PR objectives by:

- Excluding metrics with invalid duration (`duration_ms > 0`)
- Filtering out requests with no output tokens (`output_tokens > 0`), such as embedding model calls
- Preventing negative or undefined tokens-per-second calculations

This ensures the statistics are meaningful and accurate.
422-493: LGTM!

The updated UI effectively presents token statistics with:
- Clear column headers and data organization
- Readable number formatting with thousand separators
- Well-structured percentile display with visual badges
- Conditional histogram rendering with proper null checks
- Dark mode support throughout
```tsx
const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5))); // Adaptive bin count
const binSize = (max - min) / binCount;

const bins = Array(binCount).fill(0);
tokensPerSecond.forEach((value) => {
  const binIndex = Math.min(Math.floor((value - min) / binSize), binCount - 1);
  bins[binIndex]++;
});
```
Division by zero when all token rates are identical.

If all `tokensPerSecond` values are the same, `max === min` and `binSize` becomes 0 (line 389). This causes division by zero on line 393, `(value - min) / binSize`, resulting in `NaN` for `binIndex`.
Apply this diff to handle the edge case:

```diff
 const min = Math.min(...tokensPerSecond);
 const max = Math.max(...tokensPerSecond);
+
+// Handle case where all values are identical
+if (min === max) {
+  const histogramData = {
+    bins: [tokensPerSecond.length],
+    min: min,
+    max: min,
+    binSize: 0,
+    p99,
+    p95,
+    p50,
+  };
+  return [
+    totalRequests,
+    totalInputTokens,
+    totalOutputTokens,
+    {
+      p99: p99.toFixed(2),
+      p95: p95.toFixed(2),
+      p50: p50.toFixed(2),
+    },
+    histogramData,
+  ];
+}
+
 const binCount = Math.min(30, Math.max(10, Math.floor(tokensPerSecond.length / 5)));
 const binSize = (max - min) / binCount;
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In ui/src/pages/Models.tsx around lines 388 to 395, the computation of binSize
can be zero when max === min causing division by zero and NaN bin indices;
detect this edge case and handle it by setting a safe fallback: if max === min
(or computed binSize === 0) populate a single bin (e.g., put all counts into bin
0 or the middle bin) instead of computing indices, otherwise compute binSize and
binIndex as before; ensure you still use Math.min to clamp indices and avoid
dividing by zero.
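Putting the suggested guard together, a standalone binning helper might look like this. This is a hypothetical sketch following the review's fallback (all counts in a single bin when every value is identical), not the committed fix:

```typescript
// Build histogram bins for an array of tokens/sec values.
function buildBins(values: number[]): number[] {
  const min = Math.min(...values);
  const max = Math.max(...values);
  // Guard: all values identical -> binSize would be 0, so use one bin.
  if (max === min) return [values.length];
  const binCount = Math.min(30, Math.max(10, Math.floor(values.length / 5)));
  const binSize = (max - min) / binCount;
  const bins = Array(binCount).fill(0);
  for (const v of values) {
    // Clamp so the maximum value lands in the last bin instead of overflowing.
    bins[Math.min(Math.floor((v - min) / binSize), binCount - 1)]++;
  }
  return bins;
}
```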
Problems fixed in calculating average tokens/second:

- filter out all requests with missing data
- use sum(tokens generated) / sum(time) instead of an average of averages

Additionally, vite was upgraded to v6.4.1 to account for some small security issues.

fixes #355
New look with percentiles and a histogram:
