Read metrics Datasource configuration from config file #2441

Merged
k8s-ci-robot merged 6 commits into kubernetes-sigs:main from Mohamedma96:metrics-datasource-config-params
Mar 4, 2026

@Mohamedma96
Contributor

@Mohamedma96 Mohamedma96 commented Mar 1, 2026

Fixes:
issue #2440

What this PR does
Previously, the metrics-data-source DataLayer plugin read its configuration (scheme, path, insecureSkipVerify) from CLI flags (--model-server-metrics-scheme, --model-server-metrics-path, --model-server-metrics-https-insecure-skip-verify) via flag.Lookup(). This tightly coupled the plugin to the CLI layer.

This PR decouples the plugin from CLI flags by reading its configuration from the plugin's parameters field in the config.

Changes:

  • Removed direct pflag / flag.Lookup() usage.
  • Deprecated the CLI flags (they are still supported).
  • Added default values for settings that are missing from the config.
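
A minimal sketch of the "parameters with defaults" idea described above, not the PR's actual code. The struct name, function names, and JSON encoding of the parameters are assumptions made for illustration; the default values ("http", "/metrics", true) are taken from the Helm template defaults quoted later in the review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// metricsParams is a hypothetical parameter struct. Its fields mirror the
// scheme, path, and insecureSkipVerify settings named in the PR description.
type metricsParams struct {
	Scheme             string `json:"scheme"`
	Path               string `json:"path"`
	InsecureSkipVerify bool   `json:"insecureSkipVerify"`
}

// defaultMetricsParams returns the assumed defaults.
func defaultMetricsParams() metricsParams {
	return metricsParams{Scheme: "http", Path: "/metrics", InsecureSkipVerify: true}
}

// parseMetricsParams overlays the raw plugin parameters on the defaults.
// Keys absent from raw keep their default because json.Unmarshal only
// writes the fields that appear in the input.
func parseMetricsParams(raw []byte) (metricsParams, error) {
	params := defaultMetricsParams()
	if len(raw) == 0 {
		return params, nil // no parameters block: defaults only
	}
	if err := json.Unmarshal(raw, &params); err != nil {
		return metricsParams{}, fmt.Errorf("invalid metrics data source parameters: %w", err)
	}
	return params, nil
}

func main() {
	// Only the scheme is overridden; path and insecureSkipVerify fall back to defaults.
	params, err := parseMetricsParams([]byte(`{"scheme": "https"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", params)
}
```

Starting from a populated defaults struct and unmarshalling on top of it avoids a separate "is this field missing?" check for each setting.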

TODO:

  • Add specific test case to check this behavior.

@netlify

netlify Bot commented Mar 1, 2026

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 68eb26f
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69a6e1467f84e100082e5bdd
😎 Deploy Preview https://deploy-preview-2441--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot requested review from kfswain and robscott March 1, 2026 02:14
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 1, 2026
@k8s-ci-robot
Contributor

Hi @Mohamedma96. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 1, 2026
@nirrozenbaum
Contributor

/ok-to-test
/cc @elevran

@k8s-ci-robot k8s-ci-robot requested a review from elevran March 1, 2026 07:33
@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 1, 2026
@shmuelk
Contributor

shmuelk commented Mar 1, 2026

This PR looks quite good. I have two questions:

  1. Usually when something is deprecated, it still works, just gives a message, usually for a release. This PR removes the old command line argument support. Why wasn't the support left in?
  2. Why were the go.mod and go.sum files updated?

@elevran
Contributor

elevran commented Mar 1, 2026

nit: @Mohamedma96 I think the PR description should say Fixes or Fix, not Solves, if you want the issue to be closed automatically.

Usually when something is deprecated, it still works, just gives a message, usually for a release. This PR removes the old command line argument support. Why wasn't the support left in?

The command line options are marked deprecated in options.go, so I agree with @shmuelk that we should still refer to them.
The precedence should be default < cli < config file.

Why were the go.mod and go.sum files updated?

I presume it is not intentional. @Mohamedma96 ?
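
The precedence suggested above (default < cli < config file) can be sketched as follows. This is an illustrative helper under stated assumptions, not code from the PR: resolveString is a hypothetical name, and a nil pointer stands in for "this layer did not provide a value".

```go
package main

import "fmt"

// resolveString applies the precedence default < CLI flag < config file.
// A nil pointer means the corresponding layer did not provide a value.
func resolveString(def string, cli, cfg *string) string {
	value := def
	if cli != nil {
		value = *cli // an explicitly set CLI flag overrides the default
	}
	if cfg != nil {
		value = *cfg // the config file overrides both
	}
	return value
}

func main() {
	cli := "https"
	cfg := "http"
	fmt.Println(resolveString("http", nil, nil))   // only the default is available
	fmt.Println(resolveString("http", &cli, nil))  // CLI flag set, no config value
	fmt.Println(resolveString("http", &cli, &cfg)) // config value wins over the CLI flag
}
```

The same pattern would apply to the boolean insecureSkipVerify setting, with a *bool in place of *string.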

scheme: {{ .Values.inferenceExtension.metricsDataSource.scheme | default "http" | quote }}
path: {{ .Values.inferenceExtension.metricsDataSource.path | default "/metrics" | quote }}
insecureSkipVerify: {{ .Values.inferenceExtension.metricsDataSource.insecureSkipVerify | default true }}
- type: model-server-protocol-metrics
Contributor

@elevran should we rename the extractor plugin to metrics-extractor?

Contributor

It is for a well-scoped, specific set of metrics, so I think the MSP prefix makes sense.
If we need other metrics extracted, they should use a different Extractor over the same source and set the attributes on the endpoint.

Contributor

@ahg-g ahg-g Mar 1, 2026

At the least, the naming should be consistent with all the other plugins, which are suffixed with their main function, in this case "metrics-extractor". But model-server-protocol-metrics-extractor would be too long, so how about core-metrics-extractor?

Contributor

yeah, +1 for consistency. Either core-metrics-extractor or msp-metrics-extractor.

Contributor

msp is not a well-established or widely known term, and we should generally avoid abbreviations in UX, so I would lean towards "core" as the prefix, with good documentation.

Contributor

I'm fine with core

Contributor

@Mohamedma96 can we pls change the plugin name here in the config and the code from model-server-protocol-metrics to core-metrics-extractor

Contributor Author

Done.
Please re-review.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 1, 2026
Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
@Mohamedma96 Mohamedma96 force-pushed the metrics-datasource-config-params branch from ef5c1c6 to 5450baa Compare March 2, 2026 13:32
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 2, 2026
@Mohamedma96 Mohamedma96 requested a review from ahg-g March 2, 2026 13:38
Comment thread pkg/epp/framework/plugins/datalayer/source/metrics/datasource.go Outdated
Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
@ahg-g
Contributor

ahg-g commented Mar 2, 2026

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 2, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, Mohamedma96

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahg-g
Contributor

ahg-g commented Mar 2, 2026

/hold

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 2, 2026
Contributor

@ahg-g ahg-g Mar 2, 2026

this got reverted already

… set/unset

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2026
Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
@Mohamedma96 Mohamedma96 requested a review from ahg-g March 3, 2026 13:25
Contributor

@elevran elevran left a comment

left two comments, neither is a blocker. Can be done subsequently from my point of view.

@@ -115,7 +115,7 @@ func defaultDataSourceConfigParams() (*metricsDatasourceParams, error) {
// The second return value is false when the flag is not registered; no error is returned in that case.
Contributor

nit: comment should be changed to reflect ... false when the flag is not registered or not explicitly set by the user?

Contributor Author

done
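
The "registered and explicitly set" distinction discussed in this thread can be sketched as below. This is an analogue written against the standard library flag package so that it runs without extra dependencies; it is not the PR's code. The PR itself uses pflag, where the commit list mentions an f.Changed check for the same purpose. The helper names stringFlagIfSet and newDemoFlagSet are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// stringFlagIfSet returns the flag's value and true only when the flag is
// registered on fs and was explicitly set on the command line.
func stringFlagIfSet(fs *flag.FlagSet, name string) (string, bool) {
	f := fs.Lookup(name)
	if f == nil {
		return "", false // the flag is not registered
	}
	set := false
	// Visit walks only the flags that were set, unlike VisitAll.
	fs.Visit(func(v *flag.Flag) {
		if v.Name == name {
			set = true
		}
	})
	if !set {
		return "", false // registered, but left at its default
	}
	return f.Value.String(), true
}

// newDemoFlagSet registers two of the deprecated flags on a private FlagSet
// and parses args, so the global flag.CommandLine is never touched.
func newDemoFlagSet(args []string) (*flag.FlagSet, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.String("model-server-metrics-scheme", "http", "deprecated: metrics scheme")
	fs.String("model-server-metrics-path", "/metrics", "deprecated: metrics path")
	return fs, fs.Parse(args)
}

func main() {
	fs, err := newDemoFlagSet([]string{"--model-server-metrics-scheme=https"})
	if err != nil {
		panic(err)
	}
	for _, name := range []string{"model-server-metrics-scheme", "model-server-metrics-path", "not-registered"} {
		value, ok := stringFlagIfSet(fs, name)
		fmt.Printf("%s: value=%q set=%v\n", name, value, ok)
	}
}
```

Returning false for an unset flag is what lets a deprecated flag's default value stay out of the way of the plugin's own defaults.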

"github.com/spf13/pflag"
)

func resetFlags() {
Contributor

The global pflag.CommandLine is shared. If one test modifies it, subsequent tests will see those changes. It is cleaner to localize the flagset and revert it (e.g., using t.Cleanup()).

```go
func TestSomeFlags(t *testing.T) {
	oldCommandLine := pflag.CommandLine // save the global FlagSet
	t.Cleanup(func() {                  // restore it at the end of the test
		pflag.CommandLine = oldCommandLine
	})

	// create a fresh FlagSet for this test
	pflag.CommandLine = pflag.NewFlagSet("test", pflag.ContinueOnError)

	// ... define flags, parse them and validate using test logic
}
```

Contributor Author

done

@elevran
Contributor

elevran commented Mar 4, 2026

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2026
@ahg-g
Contributor

ahg-g commented Mar 4, 2026

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 4, 2026
@k8s-ci-robot k8s-ci-robot merged commit 1e9ea89 into kubernetes-sigs:main Mar 4, 2026
9 checks passed
RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Mar 9, 2026
…gs#2441)

* Read metrics Datasource configuration from config file

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

* Add MetricsDataSource to values, update go.mod

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

* rename plugin, keep supporting command line flags for deprecated flags

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

* revert pflag import alias

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

* Add test for config precedence, add f.Changed check to check cli flag set/unset

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

* rename factory and New method to match new naming

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>

---------

Signed-off-by: mohamedmahameed <mohamed.mahameed@ibm.com>
janghyukjin added a commit to janghyukjin/gateway-api-inference-extension that referenced this pull request Apr 2, 2026
  Allow configuring a separate port for scraping model server metrics via
  EndpointPickerConfig plugin parameters. This addresses the gap left by
  the deprecation of --model-server-metrics-port (PR kubernetes-sigs#1886, kubernetes-sigs#2441) which
  had no replacement for the port configuration.

  When metricsPort is set in the metrics-data-source plugin parameters,
  it overrides the inference port encoded in the endpoint's MetricsHost.
  This enables deployments where model servers expose metrics on a
  separate port (e.g., vLLM with --metrics-port) from inference traffic,
  which is required in Istio mTLS STRICT environments.

  Related: kubernetes-sigs#1396, kubernetes-sigs#1556
janghyukjin added a commit to janghyukjin/gateway-api-inference-extension that referenced this pull request Apr 2, 2026
--model-server-metrics-port was fully removed from options.go in kubernetes-sigs#2441,
so pflag.Lookup would always return nil — making fromIntFlag a no-op.
metricsPort is now configured exclusively via EndpointPickerConfig parameters.

Also tighten NewHTTPDataSource comment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
janghyukjin added a commit to janghyukjin/gateway-api-inference-extension that referenced this pull request Apr 2, 2026
Adds MetricsPort to metricsDatasourceParams so the metrics-data-source
plugin can scrape a port different from the inference port. When metricsPort
is non-zero in EndpointPickerConfig plugin parameters, it overrides the port
derived from the endpoint's MetricsHost; zero (default) preserves existing
behavior.

This restores functionality that was available via --model-server-metrics-port
but lost when that flag was removed in kubernetes-sigs#2441 without an equivalent plugin
config parameter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
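
The port override described in this commit message can be sketched as follows. This is a hedged illustration, not the referenced commit's code: metricsAddress is a hypothetical helper, and it assumes the endpoint's MetricsHost is a "host:port" string.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// metricsAddress returns the host:port to scrape. A zero metricsPort keeps
// the original address unchanged; any other value replaces only the port.
func metricsAddress(metricsHost string, metricsPort int) (string, error) {
	if metricsPort == 0 {
		return metricsHost, nil // default: preserve existing behavior
	}
	host, _, err := net.SplitHostPort(metricsHost)
	if err != nil {
		return "", fmt.Errorf("parsing metrics host %q: %w", metricsHost, err)
	}
	// JoinHostPort re-adds brackets around IPv6 literals when needed.
	return net.JoinHostPort(host, strconv.Itoa(metricsPort)), nil
}

func main() {
	for _, port := range []int{0, 9090} {
		addr, err := metricsAddress("10.0.0.12:8000", port)
		if err != nil {
			panic(err)
		}
		fmt.Println(addr)
	}
}
```

Treating zero as "not configured" matches the default described in the message, at the cost of not being able to request port 0 explicitly.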
elevran pushed a commit to llm-d/llm-d-inference-scheduler that referenced this pull request Apr 23, 2026
…gs/gateway-api-inference-extension#2441)


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.


6 participants