feat: add additional labels to the existing scheduler_attempts_total metric #2545
Signed-off-by: Lionel Villard <villard@us.ibm.com>
Pull request overview
This PR extends the existing inference_extension_scheduler_attempts_total Prometheus counter to include endpoint- and model-related labels, enabling per-endpoint dispatch rate calculations for model-based optimizers.
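The per-endpoint dispatch rate mentioned above can be sketched as the change in the counter between two scrapes divided by the elapsed time. This is a minimal illustration only, not the optimizer's actual code: `EndpointKey`, `Snapshot`, and `DispatchRates` are hypothetical names, and the counter-reset handling is an assumption modeled loosely on how rate calculations usually treat a decreasing counter.

```go
package main

import "fmt"

// EndpointKey is a hypothetical key built from the four new labels.
type EndpointKey struct {
	TargetModelName string
	PodName         string
	Namespace       string
	Port            string
}

// Snapshot maps each label combination to the counter value read at one scrape.
type Snapshot map[EndpointKey]float64

// DispatchRates returns requests/sec per endpoint between two snapshots.
// Endpoints with no earlier sample are skipped because no delta exists yet.
func DispatchRates(prev, curr Snapshot, elapsedSeconds float64) map[EndpointKey]float64 {
	rates := make(map[EndpointKey]float64, len(curr))
	if elapsedSeconds <= 0 {
		return rates
	}
	for key, now := range curr {
		before, seen := prev[key]
		if !seen {
			continue
		}
		delta := now - before
		if delta < 0 {
			// Assumption: a decrease means the counter restarted from zero.
			delta = now
		}
		rates[key] = delta / elapsedSeconds
	}
	return rates
}

func main() {
	key := EndpointKey{
		TargetModelName: "example-model",
		PodName:         "pod-a",
		Namespace:       "default",
		Port:            "8000",
	}
	prev := Snapshot{key: 100}
	curr := Snapshot{key: 160}
	for k, r := range DispatchRates(prev, curr, 30) {
		fmt.Printf("%s/%s:%s model=%s rate=%.2f req/s\n",
			k.Namespace, k.PodName, k.Port, k.TargetModelName, r)
	}
}
```

Keying on all four labels keeps replicas of the same model apart, which is what makes a per-replica arrival rate computable at all.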
Changes:
- Add `target_model_name`, `pod_name`, `namespace`, and `port` labels to `scheduler_attempts_total`.
- Update scheduler instrumentation to pass scheduling result/model into the metric recorder.
- Add/update golden metric tests and testdata for the new label set.
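As a rough illustration of the label set described in the changes above, the sketch below assembles label values for one scheduling attempt. It is not the PR's code: `Endpoint` and `AttemptLabels` are hypothetical, the leading `status` value and the empty endpoint labels on failure are assumptions, and only the rule that the target model name is recorded even for failed attempts comes from the commit history.

```go
package main

import (
	"fmt"
	"strconv"
)

// Endpoint is a hypothetical stand-in for the selected endpoint's metadata.
type Endpoint struct {
	PodName   string
	Namespace string
	Port      int
}

// AttemptLabels returns label values in the order
// status, target_model_name, pod_name, namespace, port.
// A nil endpoint models a failed attempt: the model name is still recorded,
// while the endpoint labels are left empty (an assumption of this sketch).
func AttemptLabels(targetModel string, selected *Endpoint) []string {
	if selected == nil {
		return []string{"failure", targetModel, "", "", ""}
	}
	return []string{
		"success",
		targetModel,
		selected.PodName,
		selected.Namespace,
		strconv.Itoa(selected.Port),
	}
}

func main() {
	ok := AttemptLabels("example-model", &Endpoint{PodName: "pod-a", Namespace: "default", Port: 8000})
	failed := AttemptLabels("example-model", nil)
	fmt.Println(ok)
	fmt.Println(failed)
}
```

With a Prometheus counter vector, values like these would typically be passed in the same order as the declared label names.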
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| site-src/guides/metrics-and-observability.md | Document the updated scheduler attempts metric and its labels. |
| pkg/epp/scheduling/scheduler.go | Pass TargetModel and scheduling result into scheduler attempt metric recording. |
| pkg/epp/metrics/metrics.go | Add new label set to the metric and record endpoint metadata on success. |
| pkg/epp/metrics/metrics_test.go | Expand test scenarios and compare against new golden outputs. |
| pkg/epp/metrics/testdata/* | Add/update golden metric outputs for new label dimensions. |
/ok-to-test
CC: @JeffLuoo
/lgtm
@nirrozenbaum can you PTAL?
/approve
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: ahg-g, lionelvillard
…metric (kubernetes-sigs#2545)
* Add selected endpoint info to SchedulerAttemptsTotal metric
* add model_name label to scheduler_attempts_total
* add unit tests
* document metric
* fix typo
* check primaryResults
* add targetModelName even when attempt failed
* fix golden file
Signed-off-by: Lionel Villard <villard@us.ibm.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds 4 new labels to the `scheduler_attempts_total` metric: `target_model_name`, `pod_name`, `namespace`, `port`.
The llm-d WVA model-based optimizer uses `scheduler_attempts_total` to compute the dispatch rate per endpoint (requests/sec), which represents the arrival rate to each replica for a specific model. It needs the additional labels to achieve this.
Which issue(s) this PR fixes:
Fixes #
Does this PR introduce a user-facing change?: