@NikolaBorisov
Contributor

Adding some more metrics to Prometheus.

Total Number of requests
Total Number of input tokens
Total Number of output tokens

Avg time per output token
Max time per output token
Avg time per input token
Max time per input token

It is useful to have the avg time per output token in order to make sure the server is generating at a reasonable speed and the batch size is not too large.
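As an illustration of how the counters and per-token timings listed above relate to each other, here is a minimal, library-independent sketch. It is not the code from this PR: the class name `TokenTimingStats`, its fields, and the `record` method are hypothetical, and it assumes that the "max" values are the largest per-request average time per token, while the "avg" values are weighted across all tokens seen so far.

```python
from dataclasses import dataclass


@dataclass
class TokenTimingStats:
    """Hypothetical running totals for the metrics listed above."""

    total_requests: int = 0
    total_input_tokens: int = 0
    total_output_tokens: int = 0
    input_seconds: float = 0.0   # total time spent processing input tokens
    output_seconds: float = 0.0  # total time spent generating output tokens
    max_time_per_input_token: float = 0.0
    max_time_per_output_token: float = 0.0

    def record(self, input_tokens: int, output_tokens: int,
               input_seconds: float, output_seconds: float) -> None:
        """Fold one finished request into the running totals."""
        self.total_requests += 1
        self.total_input_tokens += input_tokens
        self.total_output_tokens += output_tokens
        self.input_seconds += input_seconds
        self.output_seconds += output_seconds
        # Assumption: "max" tracks the worst per-request average, not the
        # slowest individual token.
        if input_tokens > 0:
            self.max_time_per_input_token = max(
                self.max_time_per_input_token, input_seconds / input_tokens)
        if output_tokens > 0:
            self.max_time_per_output_token = max(
                self.max_time_per_output_token, output_seconds / output_tokens)

    @property
    def avg_time_per_input_token(self) -> float:
        """Token-weighted average; 0.0 until any input token is recorded."""
        if self.total_input_tokens == 0:
            return 0.0
        return self.input_seconds / self.total_input_tokens

    @property
    def avg_time_per_output_token(self) -> float:
        """Token-weighted average; 0.0 until any output token is recorded."""
        if self.total_output_tokens == 0:
            return 0.0
        return self.output_seconds / self.total_output_tokens
```

A rising `avg_time_per_output_token` under load would be the signal described above that the batch size may be too large. In a real exporter the three totals would map naturally to Prometheus counters and the timing values to gauges or histograms, but that mapping is a design choice and not something this sketch prescribes.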

@NikolaBorisov NikolaBorisov marked this pull request as draft December 30, 2023 01:18
@NikolaBorisov NikolaBorisov marked this pull request as ready for review December 30, 2023 01:30
@NikolaBorisov
Contributor Author

@WoosukKwon can you take a look?

@NikolaBorisov
Contributor Author

@simon-mo Can you take a look?

Fix up the avg stats. Add more counters.
Added total request and token counters
@hmellor
Member

hmellor commented Mar 12, 2024

Closing this as stale because some of these metrics were added in #2316 and we no longer use aioprometheus as of #2730.

@hmellor hmellor closed this Mar 12, 2024

4 participants