Inaccurate tok/sec in Activity Page #198

@mostlygeek

Describe the bug

The tokens/second values reported on the Activity page (#195) are incorrect. They are calculated from the time elapsed since the start of the request, and the token count sums input and output tokens:

func (rec *MetricsRecorder) parseAndRecordMetrics(jsonData gjson.Result) {
	if !jsonData.Get("usage").Exists() {
		return
	}

	outputTokens := int(jsonData.Get("usage.completion_tokens").Int())
	inputTokens := int(jsonData.Get("usage.prompt_tokens").Int())

	if outputTokens > 0 {
		duration := time.Since(rec.startTime)
		tokensPerSecond := float64(inputTokens+outputTokens) / duration.Seconds()

		metrics := TokenMetrics{
			Timestamp:       time.Now(),
			Model:           rec.realModelName,
			InputTokens:     inputTokens,
			OutputTokens:    outputTokens,
			TokensPerSecond: tokensPerSecond,
			DurationMs:      int(duration.Milliseconds()),
		}
		rec.metricsMonitor.addMetrics(metrics)
	}
}
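
To make the discrepancy concrete, here is a minimal sketch with made-up numbers. The request shape (1000 prompt tokens, 100 generated tokens, 10 seconds of wall-clock time of which 5 seconds are spent generating) and both helper names are hypothetical and only illustrate how far the request-based figure can drift from the generation speed.

```go
package main

import "fmt"

// naiveTokensPerSecond mirrors the calculation in the snippet above:
// input plus output tokens divided by wall-clock time since the request started.
func naiveTokensPerSecond(inputTokens, outputTokens int, totalSeconds float64) float64 {
	return float64(inputTokens+outputTokens) / totalSeconds
}

// generationTokensPerSecond divides only the generated tokens by the time
// spent generating them (hypothetical helper, for comparison only).
func generationTokensPerSecond(outputTokens int, generationSeconds float64) float64 {
	return float64(outputTokens) / generationSeconds
}

func main() {
	// Hypothetical request: 1000 prompt tokens, 100 generated tokens,
	// 10s total wall-clock time, 5s of which is token generation.
	fmt.Printf("request-based: %.1f tok/s\n", naiveTokensPerSecond(1000, 100, 10))
	fmt.Printf("generation:    %.1f tok/s\n", generationTokensPerSecond(100, 5))
}
```

With these assumed numbers the request-based value is 110 tok/s while the generation speed is 20 tok/s, so the two figures measure different things.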

Expected behaviour

  • The tokens/second value accurately matches what the upstream server reports
  • Better to have no data than totally inaccurate data

Changes

  • look for llama.cpp's timings JSON in the response and extract the information from there
  • do not try to calculate tokens/second; use -1 when the data is not available
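
The changes above could look roughly like the following sketch. It is not the project's implementation: `tokensPerSecondFromBody` and the struct types are hypothetical names, the `timings.predicted_per_second` field is assumed to be how the llama.cpp server reports generation speed, and the standard library's `encoding/json` stands in for gjson so the example is self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upstreamTimings holds the one field of the assumed llama.cpp "timings"
// object that this sketch needs. A pointer distinguishes "missing" from 0.
type upstreamTimings struct {
	PredictedPerSecond *float64 `json:"predicted_per_second"`
}

// responseBody models only the part of the upstream response we inspect.
type responseBody struct {
	Timings *upstreamTimings `json:"timings"`
}

// tokensPerSecondFromBody returns the upstream-reported generation speed,
// or -1 when the body cannot be parsed or carries no usable timings data.
// It never tries to derive a rate from wall-clock time.
func tokensPerSecondFromBody(body []byte) float64 {
	var resp responseBody
	if err := json.Unmarshal(body, &resp); err != nil {
		return -1
	}
	if resp.Timings == nil || resp.Timings.PredictedPerSecond == nil {
		return -1
	}
	return *resp.Timings.PredictedPerSecond
}

func main() {
	withTimings := []byte(`{"usage":{"completion_tokens":10},"timings":{"predicted_per_second":42.5}}`)
	withoutTimings := []byte(`{"usage":{"completion_tokens":10}}`)

	fmt.Println(tokensPerSecondFromBody(withTimings))
	fmt.Println(tokensPerSecondFromBody(withoutTimings))
}
```

Returning -1 as a sentinel lets the Activity page show "no data" for upstream servers that do not include timings, instead of a misleading number.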

Labels

bug: Something isn't working
