@nirrozenbaum
Contributor

This PR updates the metrics and logging of the various extension points.
The following changes are included:

  • Plugin metrics are now published with the triplet (extension_point, plugin_type, plugin_name). With the recent changes that added TypedName for plugins, publishing only the extension point and plugin type is no longer enough. Adding the plugin name gives finer resolution while still allowing aggregation by plugin type and/or extension point.
  • Removed the two functions RecordSchedulerPlugin.. and RecordRequestControlPlugin... in favor of a single RecordPlugin... function. There is no need to differentiate between the functions because the extension point is now part of the published triplet.
  • More consistent logging across plugin invocations, which improves debuggability.
  • Updated tests accordingly.

@netlify

netlify bot commented Jul 26, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit c74cd06
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/68851d91f4f834000834545e
😎 Deploy Preview https://deploy-preview-1235--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot requested review from ahg-g and danehans July 26, 2025 18:25
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 26, 2025
@kfswain
Collaborator

kfswain commented Jul 29, 2025

CC: @JeffLuoo

@kfswain
Collaborator

kfswain commented Jul 29, 2025

/approve

Looks great to me. Will let Jeff give final stamp.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [kfswain,nirrozenbaum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines -185 to -195
SchedulerPluginProcessingLatencies = prometheus.NewHistogramVec(
	prometheus.HistogramOpts{
		Subsystem: InferenceExtension,
		Name:      "scheduler_plugin_duration_seconds",
		Help:      metricsutil.HelpMsgWithStability("Scheduler plugin processing latency distribution in seconds for each plugin type and plugin name.", compbasemetrics.ALPHA),
		Buckets: []float64{
			0.0001, 0.0002, 0.0005, 0.001, 0.002, 0.005, 0.01, 0.02, 0.05, 0.1,
		},
	},
	[]string{"plugin_type", "plugin_name"},
)
Contributor

@JeffLuoo JeffLuoo Jul 29, 2025

The PR LGTM, but we should be careful about deleting an existing metric once we promote metrics to Beta in the future: https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-metric

At that point, a deprecated metric should be kept for 2 releases or 8 months. Since the current metric is still Alpha, we are fine removing it directly. We need to track the progress of promoting metrics to Beta for metric stability. I will create a tracker for it.

Contributor Author

@JeffLuoo what is the expectation when adding new metrics? Use Alpha for a while and then promote to Beta?
Is there also a promotion from Beta to "v1" or similar?
Additionally, how long does a metric need to stay in Alpha before it can be promoted to Beta?
I expect more metrics to be added as we continue to make progress, and I think clear documentation of these points would be useful (e.g., a metrics management guide).

Contributor

v1 in Kubernetes corresponds to Stable. Hence the cycle is Alpha -> Beta -> Stable.

There is no rigid requirement that a metric be promoted to the next stage within xx months. It is up to the project owners to decide when to promote. A higher stability level just means fewer breaking changes, so external dependencies such as alerts and dashboards can rely on the metric.

+1, a "metrics management guide" is recommended for managing the lifecycle of metrics.
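
As a rough sketch of what such a guide might encode, the lifecycle discussed in this thread can be modeled in a few lines of Go. Everything here is hypothetical (the `stability` type, `promote`, and `canRemoveDirectly` are not part of any Kubernetes library); it only captures the two points made above: the order Alpha -> Beta -> Stable, and that only Alpha metrics can be removed without a deprecation period.

```go
package main

import "fmt"

// stability is a hypothetical type modeling the levels named in this thread.
type stability int

const (
	alpha stability = iota
	beta
	stable
)

// String returns the conventional upper-case name of the level.
func (s stability) String() string {
	switch s {
	case alpha:
		return "ALPHA"
	case beta:
		return "BETA"
	case stable:
		return "STABLE"
	}
	return "UNKNOWN"
}

// promote returns the next level in Alpha -> Beta -> Stable.
// Stable is terminal, so promoting it returns Stable again.
func promote(s stability) stability {
	if s < stable {
		return s + 1
	}
	return stable
}

// canRemoveDirectly reports whether a metric at this level may be deleted
// without a deprecation period. Per the thread, only Alpha qualifies.
func canRemoveDirectly(s stability) bool {
	return s == alpha
}

func main() {
	for s := alpha; s <= stable; s++ {
		fmt.Printf("%s: next=%s removeDirectly=%t\n", s, promote(s), canRemoveDirectly(s))
	}
}
```

The sketch deliberately leaves out promotion timing, since there is no fixed schedule and the decision rests with the project owners.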

Contributor Author

@JeffLuoo would you be interested in taking an action item (AI) to create this metrics management guide?

Contributor

@JeffLuoo JeffLuoo left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2025
@k8s-ci-robot k8s-ci-robot merged commit 7624058 into kubernetes-sigs:main Jul 29, 2025
9 checks passed
@nirrozenbaum nirrozenbaum deleted the metrics-and-logging branch July 29, 2025 21:36
kfswain pushed a commit to kfswain/llm-instance-gateway that referenced this pull request Jul 31, 2025
