Add flowcontrol queue length in bytes metric #2044

Merged
k8s-ci-robot merged 4 commits into kubernetes-sigs:main from RyanRosario:bytemetric on Jan 22, 2026

RyanRosario (Contributor) commented Dec 30, 2025

What type of PR is this?
/kind documentation
/kind feature

What this PR does / why we need it:

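It adds a new observability metric for flow control that reports the total size, in bytes, of the requests currently buffered in the Flow Control layer.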

Which issue(s) this PR fixes:
Related to #1708

Does this PR introduce a user-facing change?:

Added `inference_extension_flow_control_queue_bytes` metric to track the total size (in bytes) of requests currently buffered in the Flow Control layer.
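
To make the gauge semantics concrete, here is a minimal sketch of the bookkeeping such a metric implies: bytes are added when a request enters the buffer and subtracted exactly once when it reaches a final outcome. This is an illustration only, not the EPP implementation. It swaps in a standard-library atomic counter for a Prometheus gauge so that it runs without dependencies, and the names `queueBytesGauge`, `request`, and `track` are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// queueBytesGauge is a hypothetical stand-in for a Prometheus gauge.
// It holds the total payload size of requests currently buffered.
type queueBytesGauge struct {
	bytes atomic.Int64
}

// Value returns the current number of buffered bytes.
func (g *queueBytesGauge) Value() int64 { return g.bytes.Load() }

// request is a hypothetical buffered request; only its size matters here.
type request struct {
	payload []byte
}

// track adds the request's size to the gauge and returns a release
// function. Calling release subtracts the same size; sync.Once makes
// repeated calls harmless, so the gauge cannot be decremented twice.
func (g *queueBytesGauge) track(r request) (release func()) {
	size := int64(len(r.payload))
	g.bytes.Add(size)
	var once sync.Once
	return func() {
		once.Do(func() { g.bytes.Add(-size) })
	}
}

func main() {
	g := &queueBytesGauge{}

	releaseA := g.track(request{payload: make([]byte, 100)})
	releaseB := g.track(request{payload: make([]byte, 50)})
	fmt.Println(g.Value()) // 150

	releaseA()
	releaseA() // second call is a no-op
	fmt.Println(g.Value()) // 50

	releaseB()
	fmt.Println(g.Value()) // 0
}
```

Returning a release closure keeps the increment and decrement paired, which is the property a gauge of "currently buffered" bytes depends on. A real implementation would presumably call the gauge's add and subtract methods at the same two points.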

k8s-ci-robot added the following labels on Dec 30, 2025: do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress), kind/documentation (categorizes issue or PR as related to documentation), kind/feature (categorizes issue or PR as related to a new feature).
netlify Bot commented Dec 30, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | 04c958d |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69716124a7513300089389d8 |
| 😎 Deploy Preview | https://deploy-preview-2044--gateway-api-inference-extension.netlify.app |

k8s-ci-robot added the following labels on Dec 30, 2025: cncf-cla: yes (indicates the PR's author has signed the CNCF CLA), needs-ok-to-test (indicates a PR that requires an org member to verify it is safe to test).
k8s-ci-robot (Contributor) commented

Hi @RyanRosario. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 30, 2025
RyanRosario changed the title from "[WIP] Add additional observability metrics for flow control" to "Add flowcontrol queue length in bytes metric" on Jan 9, 2026
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 9, 2026
kfswain (Collaborator) commented Jan 12, 2026

/ok-to-test

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Jan 12, 2026.
ahg-g (Contributor) commented Jan 14, 2026

/assign @LukeAVanDrie

k8s-ci-robot (Contributor) commented

@ahg-g: GitHub didn't allow me to assign the following users: LukeAVanDrie.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide


In response to this:

/assign @LukeAVanDrie


LukeAVanDrie (Contributor) left a comment

Thanks, Ryan! I have left a few minor inline comments, but I have no blocking concerns.
This LGTM!

/assign @ahg-g

Comment thread pkg/epp/flowcontrol/controller/controller.go Outdated
Comment thread pkg/epp/metrics/metrics_test.go Outdated
LukeAVanDrie (Contributor) commented

Oh, @RyanRosario, since we are adding a new public metric that operators will use, this is a user-facing change. Please update the release note section in your PR description (on this and the other metrics PRs):

E.g.,

Added `inference_extension_flow_control_queue_bytes` metric to track the total size (in bytes) of requests currently buffered in the Flow Control layer.

Comment thread pkg/epp/metrics/metrics.go Outdated
prometheus.GaugeOpts{
    Subsystem: InferenceExtension,
    Name: "flow_control_queue_bytes",
    Help: metricsutil.HelpMsgWithStability("Current number of bytes associated with requests actively managed by the EPP flow control layer, from the start of the EnqueueAndWait call until a final outcome is reached.", compbasemetrics.ALPHA),
Contributor commented:

Please remove the reference to EnqueueAndWait, as it is an internal detail, and describe the metric in abstract terms.

Contributor commented:

Ryan is mirroring my description for the flow_control_queue_size (len) metric I already added. Ryan, do you mind updating it there as well?

RyanRosario (Contributor Author) commented:

Will do. Thank you both for catching that.

Contributor commented:

This is not addressed yet.

Contributor commented:

Ah, I looked through the last commit and saw that two were updated (queue_size and queue_duration). We still need queue_bytes.

RyanRosario (Contributor Author) commented:

Sorry about that. This is addressed.

RyanRosario (Contributor Author) commented

All feedback addressed. Ready for final review.

LukeAVanDrie (Contributor) commented

/approve
/lgtm

@ahg-g

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 20, 2026
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2026
RyanRosario (Contributor Author) commented

/assign @LukeAVanDrie

ahg-g (Contributor) commented Jan 22, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 22, 2026
k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, LukeAVanDrie, RyanRosario

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 22, 2026
@k8s-ci-robot k8s-ci-robot merged commit c5cf00c into kubernetes-sigs:main Jan 22, 2026
11 checks passed
elevran pushed a commit to llm-d/llm-d-inference-scheduler that referenced this pull request Apr 23, 2026
…-api-inference-extension#2044)

* Add flow_control_queue_bytes metric

* Add flow_control_queue_bytes metric and documentation

* Address reviewer feedback

* Update comment to remove internal function name

---------

Co-authored-by: Ryan Rosario <6713180+RyanRosario@users.noreply.github.com>

Labels

- `approved`: Indicates a PR has been approved by an approver from all required OWNERS files.
- `cncf-cla: yes`: Indicates the PR's author has signed the CNCF CLA.
- `kind/documentation`: Categorizes issue or PR as related to documentation.
- `kind/feature`: Categorizes issue or PR as related to a new feature.
- `lgtm`: "Looks good to me", indicates that a PR is ready to be merged.
- `ok-to-test`: Indicates a non-member PR verified by an org member that is safe to test.
- `size/M`: Denotes a PR that changes 30-99 lines, ignoring generated files.


5 participants