Saturation check should become an extension point #1405

@nirrozenbaum

The current saturation detector decides whether the system is saturated based on two main per-pod metrics: the waiting queue size and the KV cache utilization.
More details here:
https://github.com/kubernetes-sigs/gateway-api-inference-extension/blob/main/pkg/epp/saturationdetector/saturationdetector.go
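
For readers who have not opened the linked file, here is a minimal sketch of a check built on these two metrics. It is not the code from the repository: the type names, field names, and threshold values are hypothetical, and the aggregation rule (the system counts as saturated when no pod is under both thresholds) is an assumption of this sketch.

```go
package main

import "fmt"

// PodMetrics is a hypothetical, simplified view of the two per-pod metrics
// named in this issue.
type PodMetrics struct {
	Name               string
	WaitingQueueSize   int
	KVCacheUtilization float64 // fraction in [0, 1]
}

// Thresholds holds hypothetical limits; the values used below are illustrative.
type Thresholds struct {
	MaxQueueDepth  int
	MaxKVCacheUtil float64
}

// hasCapacity reports whether a single pod is under both thresholds.
func hasCapacity(p PodMetrics, th Thresholds) bool {
	return p.WaitingQueueSize <= th.MaxQueueDepth &&
		p.KVCacheUtilization <= th.MaxKVCacheUtil
}

// IsSaturated returns true when no pod has capacity (assumed aggregation rule).
// An empty pod list is therefore treated as saturated.
func IsSaturated(pods []PodMetrics, th Thresholds) bool {
	for _, p := range pods {
		if hasCapacity(p, th) {
			return false
		}
	}
	return true
}

func main() {
	th := Thresholds{MaxQueueDepth: 5, MaxKVCacheUtil: 0.8}
	pods := []PodMetrics{
		{Name: "pod-a", WaitingQueueSize: 9, KVCacheUtilization: 0.95},
		{Name: "pod-b", WaitingQueueSize: 2, KVCacheUtilization: 0.40},
	}
	fmt.Println("saturated:", IsSaturated(pods, th))
}
```

In this sketch a single pod with spare capacity is enough to consider the system not saturated; other criteria are exactly what the proposal below would make pluggable.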

Users of IGW may want to define different criteria for saturation that are not necessarily based on these metrics.

To make the saturation check flexible, it should become an extension point, and the current code can become one implementation of that extension point (we may ship it as the default plugin).
This change will also clean up the saturation config.go file, which defines env vars for setting the thresholds used by the current saturation check; those will become parameters of the plugin.
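
One possible shape for such an extension point is sketched below. Everything here is hypothetical: the `SaturationDetector` interface, its method set, and both plugin types are illustrative names, not an existing API of the project. The sketch only shows how thresholds could move from env vars to plugin parameters and how a user-defined criterion could sit next to the default one.

```go
package main

import "fmt"

// PodMetrics is a hypothetical, simplified per-pod metrics view.
type PodMetrics struct {
	WaitingQueueSize   int
	KVCacheUtilization float64 // fraction in [0, 1]
}

// SaturationDetector is a hypothetical extension point: any plugin that can
// decide whether the system is saturated.
type SaturationDetector interface {
	Name() string
	IsSaturated(pods []PodMetrics) bool
}

// ThresholdDetector mirrors the current behavior as a default plugin.
// Its thresholds are plugin parameters rather than env vars.
type ThresholdDetector struct {
	QueueDepthThreshold  int
	KVCacheUtilThreshold float64
}

func (d ThresholdDetector) Name() string { return "threshold-detector" }

// IsSaturated returns true when no pod is under both thresholds.
func (d ThresholdDetector) IsSaturated(pods []PodMetrics) bool {
	for _, p := range pods {
		if p.WaitingQueueSize <= d.QueueDepthThreshold &&
			p.KVCacheUtilization <= d.KVCacheUtilThreshold {
			return false
		}
	}
	return true
}

// TotalQueueDetector is an example of a custom criterion: the system is
// saturated when the summed queue size across pods exceeds a limit.
type TotalQueueDetector struct {
	MaxTotalQueue int
}

func (d TotalQueueDetector) Name() string { return "total-queue-detector" }

func (d TotalQueueDetector) IsSaturated(pods []PodMetrics) bool {
	total := 0
	for _, p := range pods {
		total += p.WaitingQueueSize
	}
	return total > d.MaxTotalQueue
}

func main() {
	pods := []PodMetrics{
		{WaitingQueueSize: 9, KVCacheUtilization: 0.95},
		{WaitingQueueSize: 2, KVCacheUtilization: 0.40},
	}
	detectors := []SaturationDetector{
		ThresholdDetector{QueueDepthThreshold: 5, KVCacheUtilThreshold: 0.8},
		TotalQueueDetector{MaxTotalQueue: 10},
	}
	for _, d := range detectors {
		fmt.Printf("%s: saturated=%v\n", d.Name(), d.IsSaturated(pods))
	}
}
```

With the same pod metrics, the two plugins can disagree, which is the flexibility this issue asks for: the caller depends only on the interface and the deployment picks the criterion.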

Labels

triage/accepted: Indicates an issue or PR is ready to be actively worked on.
