Add metrics logging and an Activity page to show requests #195
Merged
Commits (96 total; this view shows changes from 89 commits)
f669140
feat: add activity page and handler
g2mt f26779b
feat: add metrics logging and UI display
g2mt 822e2cf
feat: add metrics parser and model-specific metrics endpoint
g2mt 198646d
run go fmt
g2mt e9edd96
refactor: remove GetLatestMetrics and update apiGetMetrics to collect…
g2mt 0d74cf0
refactor: remove ParseLogData from MetricsParser and refactor Process…
g2mt fc3ca90
refactor: update metrics parsing and API to use input/output tokens
g2mt 67fd770
Remove setInterval
g2mt 8b49999
Rename table column
g2mt d3f0147
Remove colors, hide token count if zero
g2mt b274f25
use - for empty token count column
g2mt 75b2cdf
Add metricsMaxInMemory to config
g2mt 4a26d32
Fix whitespace
g2mt ffd8dae
Remove getSummary
g2mt bacad51
refactor: remove model-specific metrics parsing and API endpoints
g2mt b3f5d2b
update tests
g2mt 7ee133b
Run fetchMetrics on mount
g2mt e9a4156
refactor: update metrics parser to simplify method signatures
g2mt 9f22155
Rename addMetric
g2mt fea861d
feat: add config-based metrics parser initialization
g2mt 99b3eb2
remove newline
g2mt 39908a1
feat: add metrics persistence to file
g2mt 2fee6b8
document metricsLogPath in example config
g2mt 4d30155
Add MetricsMaxInMemory to windows test
g2mt 749ace4
Check if pm.metricsParser is nil in apiGetMetrics
g2mt fd7f626
feat: add metricsUseServerResponse config and update proxy logic
g2mt 91b7efe
chore: add metricsUseServerResponse config option
g2mt 9749b69
feat: add useServerResponse to MetricsParser
g2mt 158a202
fix
g2mt f5b60a0
correct comment
g2mt 6a84eab
Merge remote-tracking branch 'fork/activity-page' into activity-page
g2mt 55efb27
refactor: update generation speed calculation
g2mt 0e79f64
refactor: unify stdout and stderr handling in process.go
g2mt 52436fd
feat: add streaming response handling in proxyOAIHandler
g2mt 4f0ee68
feat: add log event subscription management
g2mt 09e1e95
remove import
g2mt f473788
Use bufio.NewScanner to parse stdout lines
g2mt 6d7bca3
use custom response recorder
g2mt f33222d
add bufio import
g2mt 5cfbb84
refactor metrics debug logging
g2mt cdbc196
Remove metricsLogPath
g2mt c8be9ad
Merge branch 'activity-page-remove-httptest' into activity-page
g2mt 8746468
Move responserecorder to another file
g2mt f574b6c
Merge branch 'activity-page-stream' into activity-page
g2mt e516610
move StreamingResponseRecorder to separate file
g2mt 037e7d9
Rename responserecorder.go, remove NewResponseRecorder
g2mt 3616243
Add Activity streaming
g2mt 9508b7c
add fmt
g2mt 814b533
Remove first fetch
g2mt b6b8046
fix missing !
g2mt f36cda1
Rename to StreamingResponseRecorder
g2mt 0c44cfd
Refactor response recorder functions into a single middleware
g2mt 2aa1c38
Rename metrics parser to MetricsMonitor
g2mt 724d270
Rename to metricsDataCancel
g2mt cc8a1bb
Rename ResponseMiddleware to MetricsMiddleware, process HTTP request …
g2mt 55516c6
Move log parsing to SubscribeToProcessLogs
g2mt 9286fd9
Remove unused log
g2mt 0c92922
Extract metrics recording into method
g2mt 7a9a413
Add comment to OutputTokens
g2mt df3fbdb
Rename generationSpeed
g2mt 990e8bd
Add comments for regexes
g2mt 0e250a4
Add an ID for token metrics
g2mt 252b451
Refactor into GetMetricsJSONByLines
g2mt 1f5f850
cleanup
g2mt c80557c
add back toLocaleString
g2mt 7c875e6
Merge branch 'mostlygeek:main' into activity-page
g2mt a2c6451
revert change
g2mt 7a07a09
Remove GetMetricsJSONByLines
g2mt 1d260f9
Remove apiStreamMetrics and move streaming into /api/events
g2mt 1c6543c
Fix switch case variable declaration scope issue.
g2mt 2e37d38
Fix loading state logic to handle empty metrics.
g2mt af4b4dc
Remove loading state from Activity
g2mt 464d91c
Remove subtitle
g2mt ef97c68
fix style
g2mt 6fd771b
remove debug logger
g2mt 5ce9abd
Remove metricsMonitor event bus
g2mt 6ed6dc8
remove nil checks
g2mt a0d1161
refactor MetricsMiddleware
g2mt e836ebc
Clean up ResponseWriter
g2mt 93af520
refactor metrics middleware
g2mt 741ca5a
add comment
g2mt 8f2d75b
Remove flush
g2mt ce27be5
remove mm from irrelevant endpoints
g2mt b5ad4d9
move requested model parsing to ls-requested-model key
g2mt 0813157
Remove import
g2mt 2de9250
rm
g2mt 82522b6
Add MiddlewareWritesMetrics tests
g2mt bab8b79
Rename metrics parser
g2mt f4288bb
record modelName for metrics in proxyOAIHandler
g2mt cd9dc5b
Add streaming to simple-responder.go
g2mt 3a94a96
hide stream behind url query
g2mt 19ca3a8
wrong test
g2mt 4b0e94f
Convert to interface{}
g2mt a245674
fix test
g2mt 05509e6
Get realModelName in middleware
g2mt 242da36
add startTime
g2mt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,6 @@ | ||
| healthCheckTimeout: 300 | ||
| logRequests: true | ||
| metricsMaxInMemory: 1000 | ||
|
|
||
| models: | ||
| "qwen2.5": | ||
|
|
||
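The `metricsMaxInMemory` key added in this hunk bounds how many request metrics are kept in memory. The sketch below shows how such a value could be resolved, mirroring the fallback to 1000 that `NewMetricsMonitor` applies later in this PR. The `config` struct and `resolveMaxMetrics` helper are hypothetical names used only for illustration; they are not part of the diff.

```go
package main

import "fmt"

// config is a hypothetical stand-in that holds only the field relevant here.
type config struct {
	MetricsMaxInMemory int
}

// defaultMaxMetrics matches the fallback value used in NewMetricsMonitor.
const defaultMaxMetrics = 1000

// resolveMaxMetrics returns the configured cap, or the default when the
// key is missing (zero value) or not positive.
func resolveMaxMetrics(c config) int {
	if c.MetricsMaxInMemory <= 0 {
		return defaultMaxMetrics
	}
	return c.MetricsMaxInMemory
}

func main() {
	fmt.Println(resolveMaxMetrics(config{}))                        // key absent: falls back to the default
	fmt.Println(resolveMaxMetrics(config{MetricsMaxInMemory: 250})) // explicit value is kept
}
```

With this shape, leaving the key out of the YAML file and setting it to `0` behave the same way.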
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,145 @@ | ||
| package proxy | ||
|
|
||
| import ( | ||
| "bytes" | ||
| "io" | ||
| "net/http" | ||
| "time" | ||
|
|
||
| "github.com/gin-gonic/gin" | ||
| "github.com/tidwall/gjson" | ||
| ) | ||
|
|
||
| // MetricsMiddleware sets up the MetricsResponseWriter for capturing upstream responses | ||
| func MetricsMiddleware(pm *ProxyManager) gin.HandlerFunc { | ||
| return func(c *gin.Context) { | ||
| bodyBytes, err := io.ReadAll(c.Request.Body) | ||
| if err != nil { | ||
| pm.sendErrorResponse(c, http.StatusBadRequest, "could not read request body") | ||
| return | ||
| } | ||
| c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes)) | ||
|
|
||
| requestedModel := gjson.GetBytes(bodyBytes, "model").String() | ||
| if requestedModel == "" { | ||
| pm.sendErrorResponse(c, http.StatusBadRequest, "missing or invalid 'model' key") | ||
| return | ||
| } | ||
| c.Set("ls-requested-model", requestedModel) | ||
|
|
||
| writer := &MetricsResponseWriter{ | ||
| ResponseWriter: c.Writer, | ||
| metricsRecorder: &MetricsRecorder{ | ||
| metricsMonitor: pm.metricsMonitor, | ||
| modelName: requestedModel, // will be updated in proxyOAIHandler | ||
| isStreaming: gjson.GetBytes(bodyBytes, "stream").Bool(), | ||
| startTime: time.Now(), // request start; parseAndRecordMetrics measures duration from here | ||
| }, | ||
| } | ||
| c.Writer = writer | ||
| c.Next() | ||
|
|
||
| rec := writer.metricsRecorder | ||
| rec.processBody(writer.body) | ||
| } | ||
| } | ||
|
|
||
| type MetricsRecorder struct { | ||
| metricsMonitor *MetricsMonitor | ||
| modelName string | ||
| isStreaming bool | ||
| startTime time.Time | ||
| } | ||
|
|
||
| // processBody handles response processing after request completes | ||
| func (rec *MetricsRecorder) processBody(body []byte) { | ||
| if rec.isStreaming { | ||
| rec.processStreamingResponse(body) | ||
| } else { | ||
| rec.processNonStreamingResponse(body) | ||
| } | ||
| } | ||
|
|
||
| func (rec *MetricsRecorder) parseAndRecordMetrics(jsonData gjson.Result) { | ||
| if !jsonData.Get("usage").Exists() { | ||
| return | ||
| } | ||
|
|
||
| outputTokens := int(jsonData.Get("usage.completion_tokens").Int()) | ||
| inputTokens := int(jsonData.Get("usage.prompt_tokens").Int()) | ||
|
|
||
| if outputTokens > 0 { | ||
| duration := time.Since(rec.startTime) | ||
| tokensPerSecond := float64(inputTokens+outputTokens) / duration.Seconds() | ||
|
|
||
| metrics := TokenMetrics{ | ||
| Timestamp: time.Now(), | ||
| Model: rec.modelName, | ||
| InputTokens: inputTokens, | ||
| OutputTokens: outputTokens, | ||
| TokensPerSecond: tokensPerSecond, | ||
| DurationMs: int(duration.Milliseconds()), | ||
| } | ||
| rec.metricsMonitor.addMetrics(metrics) | ||
| } | ||
| } | ||
|
|
||
| func (rec *MetricsRecorder) processStreamingResponse(body []byte) { | ||
| lines := bytes.Split(body, []byte("\n")) | ||
| for _, line := range lines { | ||
| line = bytes.TrimSpace(line) | ||
| if len(line) == 0 { | ||
| continue | ||
| } | ||
|
|
||
| // Check for SSE data prefix | ||
| if bytes.HasPrefix(line, []byte("data: ")) { | ||
| data := bytes.TrimSpace(line[6:]) | ||
| if len(data) == 0 { | ||
| continue | ||
| } | ||
| if bytes.Equal(data, []byte("[DONE]")) { | ||
| break | ||
| } | ||
|
|
||
| // Parse JSON to look for usage data | ||
| if gjson.ValidBytes(data) { | ||
| rec.parseAndRecordMetrics(gjson.ParseBytes(data)) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| func (rec *MetricsRecorder) processNonStreamingResponse(body []byte) { | ||
| if len(body) == 0 { | ||
| return | ||
| } | ||
|
|
||
| // Parse JSON to extract usage information | ||
| if gjson.ValidBytes(body) { | ||
| rec.parseAndRecordMetrics(gjson.ParseBytes(body)) | ||
| } | ||
| } | ||
|
|
||
| // MetricsResponseWriter captures the entire response body (streaming and non-streaming) | ||
| type MetricsResponseWriter struct { | ||
| gin.ResponseWriter | ||
| body []byte | ||
| metricsRecorder *MetricsRecorder | ||
| } | ||
|
|
||
| func (w *MetricsResponseWriter) Write(b []byte) (int, error) { | ||
| n, err := w.ResponseWriter.Write(b) | ||
| if err != nil { | ||
| return n, err | ||
| } | ||
| w.body = append(w.body, b...) | ||
| return n, nil | ||
| } | ||
|
|
||
| func (w *MetricsResponseWriter) WriteHeader(statusCode int) { | ||
| w.ResponseWriter.WriteHeader(statusCode) | ||
| } | ||
|
|
||
| func (w *MetricsResponseWriter) Header() http.Header { | ||
| return w.ResponseWriter.Header() | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,82 @@ | ||
| package proxy | ||
|
|
||
| import ( | ||
| "encoding/json" | ||
| "sync" | ||
| "time" | ||
|
|
||
| "github.com/mostlygeek/llama-swap/event" | ||
| ) | ||
|
|
||
| // TokenMetrics represents token statistics recorded for a single request | ||
| type TokenMetrics struct { | ||
| ID int `json:"id"` | ||
| Timestamp time.Time `json:"timestamp"` | ||
| Model string `json:"model"` | ||
| InputTokens int `json:"input_tokens"` | ||
| OutputTokens int `json:"output_tokens"` | ||
| TokensPerSecond float64 `json:"tokens_per_second"` | ||
| DurationMs int `json:"duration_ms"` | ||
| } | ||
|
|
||
| // TokenMetricsEvent represents a token metrics event | ||
| type TokenMetricsEvent struct { | ||
| Metrics TokenMetrics | ||
| } | ||
|
|
||
| func (e TokenMetricsEvent) Type() uint32 { | ||
| return TokenMetricsEventID // defined in events.go | ||
| } | ||
|
|
||
| // MetricsMonitor keeps the most recent token metrics in memory | ||
| type MetricsMonitor struct { | ||
| mu sync.RWMutex | ||
| metrics []TokenMetrics | ||
| maxMetrics int | ||
| nextID int | ||
| } | ||
|
|
||
| func NewMetricsMonitor(config *Config) *MetricsMonitor { | ||
| maxMetrics := config.MetricsMaxInMemory | ||
| if maxMetrics <= 0 { | ||
| maxMetrics = 1000 // Default fallback | ||
| } | ||
|
|
||
| mp := &MetricsMonitor{ | ||
| maxMetrics: maxMetrics, | ||
| } | ||
|
|
||
| return mp | ||
| } | ||
|
|
||
| // addMetrics adds a new metric to the collection and publishes an event | ||
| func (mp *MetricsMonitor) addMetrics(metric TokenMetrics) { | ||
| mp.mu.Lock() | ||
| defer mp.mu.Unlock() | ||
|
|
||
| metric.ID = mp.nextID | ||
| mp.nextID++ | ||
| mp.metrics = append(mp.metrics, metric) | ||
| if len(mp.metrics) > mp.maxMetrics { | ||
| mp.metrics = mp.metrics[len(mp.metrics)-mp.maxMetrics:] | ||
| } | ||
|
|
||
| event.Emit(TokenMetricsEvent{Metrics: metric}) | ||
| } | ||
|
|
||
| // GetMetrics returns a copy of the current metrics | ||
| func (mp *MetricsMonitor) GetMetrics() []TokenMetrics { | ||
| mp.mu.RLock() | ||
| defer mp.mu.RUnlock() | ||
|
|
||
| result := make([]TokenMetrics, len(mp.metrics)) | ||
| copy(result, mp.metrics) | ||
| return result | ||
| } | ||
|
|
||
| // GetMetricsJSON returns metrics as JSON | ||
| func (mp *MetricsMonitor) GetMetricsJSON() ([]byte, error) { | ||
| mp.mu.RLock() | ||
| defer mp.mu.RUnlock() | ||
| return json.Marshal(mp.metrics) | ||
| } |
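The trimming in `addMetrics` can be seen in isolation with a small sketch: entries receive increasing IDs, and once the slice exceeds its cap the oldest entries are dropped by reslicing. The `entry` and `history` types below are hypothetical stand-ins for `TokenMetrics` and `MetricsMonitor`; locking and event emission are left out to keep the example short.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for TokenMetrics; only the ID matters here.
type entry struct {
	ID int
}

// history keeps at most max entries, dropping the oldest first.
type history struct {
	items  []entry
	max    int
	nextID int
}

// add assigns the next ID, appends the entry, and trims the front of the
// slice when it grows past the cap.
func (h *history) add() {
	h.items = append(h.items, entry{ID: h.nextID})
	h.nextID++
	if len(h.items) > h.max {
		h.items = h.items[len(h.items)-h.max:]
	}
}

func main() {
	h := &history{max: 3}
	for i := 0; i < 5; i++ {
		h.add()
	}
	fmt.Println(h.items) // the three newest entries, with IDs 2, 3, and 4
}
```

IDs keep counting up after older entries are dropped, so a consumer such as the Activity page can use them as stable keys. Reslicing leaves the old backing array in place until a later `append` reallocates, which should be acceptable for caps of this size.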