
feat: add additional labels to the existing scheduler_attempts_total metric #2545

Merged
k8s-ci-robot merged 8 commits into kubernetes-sigs:main from lionelvillard:metric_replica_info on Mar 14, 2026

Conversation

@lionelvillard
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR adds 4 new labels to the scheduler_attempts_total metric: target_model_name, pod_name, namespace, port.

The llm-d WVA model-based optimizer uses scheduler_attempts_total to compute the dispatch rate per endpoint (requests/sec), which represents the arrival rate at each replica for a specific model. It needs the additional labels to do so.
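As an illustration of the dispatch-rate calculation described above, here is a minimal sketch in Go that derives requests/sec per endpoint from two scrapes of a cumulative counter. The `EndpointKey` type, the `DispatchRates` function, and the sample values are hypothetical and are not taken from this PR or from the llm-d optimizer; the sketch only assumes that each label combination carries a monotonically increasing count.

```go
package main

import "fmt"

// EndpointKey identifies one replica serving one model. Its fields mirror the
// four labels added in this PR; the struct itself is hypothetical.
type EndpointKey struct {
	TargetModelName string
	PodName         string
	Namespace       string
	Port            string
}

// DispatchRates returns requests/sec per endpoint between two scrapes of a
// cumulative counter taken elapsedSeconds apart.
// Assumptions: a series missing from prev starts at zero, and a decrease is
// treated as a counter reset (the current value is used as the delta).
func DispatchRates(prev, curr map[EndpointKey]float64, elapsedSeconds float64) map[EndpointKey]float64 {
	rates := make(map[EndpointKey]float64, len(curr))
	if elapsedSeconds <= 0 {
		return rates // no meaningful interval, so no rates
	}
	for key, now := range curr {
		delta := now - prev[key]
		if delta < 0 {
			delta = now // counter restarted since the previous scrape
		}
		rates[key] = delta / elapsedSeconds
	}
	return rates
}

func main() {
	// Made-up sample data for one endpoint, scraped 30 seconds apart.
	ep := EndpointKey{TargetModelName: "example-model", PodName: "pod-a", Namespace: "default", Port: "8000"}
	prev := map[EndpointKey]float64{ep: 100}
	curr := map[EndpointKey]float64{ep: 160}

	for key, rate := range DispatchRates(prev, curr, 30) {
		fmt.Printf("%s/%s:%s model=%s rate=%.2f req/s\n",
			key.Namespace, key.PodName, key.Port, key.TargetModelName, rate)
	}
}
```

Without the per-endpoint labels, all attempts would collapse into a single series and only an aggregate rate could be computed, which is presumably why the optimizer needs them.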

Which issue(s) this PR fixes:

Fixes #

Does this PR introduce a user-facing change?:

Added `target_model_name`, `pod_name`, `namespace`, `port` labels to the `scheduler_attempts_total` metric.

Signed-off-by: Lionel Villard <villard@us.ibm.com>
Copilot AI review requested due to automatic review settings March 10, 2026 20:40
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 10, 2026
@netlify

netlify Bot commented Mar 10, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 1744439
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69b0a64029d4bc0008cecf87
😎 Deploy Preview https://deploy-preview-2545--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 10, 2026
@k8s-ci-robot
Contributor

Welcome @lionelvillard!

It looks like this is your first PR to kubernetes-sigs/gateway-api-inference-extension 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api-inference-extension has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 10, 2026
@k8s-ci-robot
Contributor

Hi @lionelvillard. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 10, 2026

Copilot AI left a comment

Pull request overview

This PR extends the existing inference_extension_scheduler_attempts_total Prometheus counter to include endpoint- and model-related labels, enabling per-endpoint dispatch rate calculations for model-based optimizers.

Changes:

  • Add target_model_name, pod_name, namespace, and port labels to scheduler_attempts_total.
  • Update scheduler instrumentation to pass scheduling result/model into the metric recorder.
  • Add/update golden metric tests and testdata for the new label set.
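To make the recording behaviour in the list above concrete, the following Go sketch uses a toy in-memory counter rather than the real Prometheus client. It assumes, based on the commit titles in this PR, that the endpoint labels are filled only when an endpoint was selected and that the target model name is recorded even for failed attempts. The `status` label, the type names, and the `Record` signature are hypothetical and do not reflect the actual code in pkg/epp/metrics/metrics.go.

```go
package main

import (
	"fmt"
	"sync"
)

// AttemptLabels is a hypothetical label set for the attempts counter.
// Status is an assumed label; the other four mirror the labels added here.
type AttemptLabels struct {
	Status          string // assumed values: "success" or "failure"
	TargetModelName string
	PodName         string
	Namespace       string
	Port            string
}

// Endpoint is a hypothetical stand-in for the selected endpoint's metadata.
type Endpoint struct {
	PodName   string
	Namespace string
	Port      string
}

// AttemptCounter is a toy stand-in for a labeled counter: one integer per
// distinct label combination, guarded by a mutex.
type AttemptCounter struct {
	mu     sync.Mutex
	counts map[AttemptLabels]int
}

func NewAttemptCounter() *AttemptCounter {
	return &AttemptCounter{counts: make(map[AttemptLabels]int)}
}

// Record increments the series for one scheduling attempt. A nil endpoint
// means the attempt failed: the model name is still recorded, while the
// endpoint labels stay empty.
func (c *AttemptCounter) Record(targetModel string, selected *Endpoint) {
	labels := AttemptLabels{Status: "failure", TargetModelName: targetModel}
	if selected != nil {
		labels.Status = "success"
		labels.PodName = selected.PodName
		labels.Namespace = selected.Namespace
		labels.Port = selected.Port
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[labels]++
}

// Count returns the current value of the series with the given labels.
func (c *AttemptCounter) Count(labels AttemptLabels) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[labels]
}

func main() {
	c := NewAttemptCounter()
	c.Record("example-model", &Endpoint{PodName: "pod-a", Namespace: "default", Port: "8000"})
	c.Record("example-model", nil) // failed attempt: no endpoint selected

	fmt.Println(c.Count(AttemptLabels{Status: "failure", TargetModelName: "example-model"}))
}
```

One consequence of this design is that every distinct pod, namespace, port, and model combination becomes its own series, so the label set trades higher cardinality for per-replica visibility.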

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| site-src/guides/metrics-and-observability.md | Document the updated scheduler attempts metric and its labels. |
| pkg/epp/scheduling/scheduler.go | Pass TargetModel and scheduling result into scheduler attempt metric recording. |
| pkg/epp/metrics/metrics.go | Add new label set to the metric and record endpoint metadata on success. |
| pkg/epp/metrics/metrics_test.go | Expand test scenarios and compare against new golden outputs. |
| pkg/epp/metrics/testdata/* | Add/update golden metric outputs for new label dimensions. |

Comment thread pkg/epp/metrics/metrics.go Outdated
Comment thread pkg/epp/metrics/metrics.go Outdated
Comment thread pkg/epp/metrics/metrics.go Outdated
Comment thread site-src/guides/metrics-and-observability.md
Signed-off-by: Lionel Villard <villard@us.ibm.com>
@nirrozenbaum
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 10, 2026
Signed-off-by: Lionel Villard <villard@us.ibm.com>
@kfswain
Collaborator

kfswain commented Mar 11, 2026

CC: @JeffLuoo

@JeffLuoo
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 11, 2026
@lionelvillard
Contributor Author

@nirrozenbaum can you PTAL?

@ahg-g
Contributor

ahg-g commented Mar 14, 2026

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, lionelvillard

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 14, 2026
@k8s-ci-robot k8s-ci-robot merged commit 1f392ec into kubernetes-sigs:main Mar 14, 2026
9 checks passed
BizerNotNull pushed a commit to BizerNotNull/gateway-api-inference-extension that referenced this pull request Mar 15, 2026
…metric (kubernetes-sigs#2545)

* Add selected endpoint info to SchedulerAttemptsTotal metric

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* add model_name label to scheduler_attempts_total

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* add unit tests

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* document metric

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* fix typo

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* check primaryResults

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* add targetModelName even when attempt failed

Signed-off-by: Lionel Villard <villard@us.ibm.com>

* fix golden file

Signed-off-by: Lionel Villard <villard@us.ibm.com>

---------

Signed-off-by: Lionel Villard <villard@us.ibm.com>
elevran pushed a commit to llm-d/llm-d-inference-scheduler that referenced this pull request Apr 23, 2026
…metric (kubernetes-sigs/gateway-api-inference-extension#2545)


Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
kind/feature: Categorizes issue or PR as related to a new feature.
lgtm: "Looks good to me", indicates that a PR is ready to be merged.
ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet


7 participants