Training server ensemble by kaushikmitr · Pull Request #2473 · kubernetes-sigs/gateway-api-inference-extension

kaushikmitr · 2026-03-03T22:32:10Z

This pull request introduces a new ensemble (gated) model training approach to the latency predictor, enabling the system to train and save separate models for "noqueue" and "queued" regimes, and wraps them into a single serializable object for easier deployment and prediction logic. The changes also include configuration options for ensemble mode, new metrics, and API support for downloading and listing these ensemble models.

Ensemble Model Training and Management

Added ensemble (gated) model training logic, splitting training data into "noqueue" and "queued" regimes, training separate sub-models for each, and wrapping them in a new QueueGatedModel class. The ensemble is activated only if sufficient samples exist for all sub-models. [1] [2] [3]
Ensemble models are now saved as single files (ttft_gated.joblib, tpot_gated.joblib) and included in the model listing, download, and info API endpoints. [1] [2] [3] [4]
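
To make the gating idea above concrete, here is a minimal sketch of how a queue-gated wrapper could route predictions and how training could fall back when a regime has too few samples. It is not the code from this PR. The feature name `num_request_waiting`, the `MeanModel` stand-in regressor, the `train_gated` helper, and the "queue depth equal to zero means noqueue" split rule are all assumptions made for illustration. Only the `QueueGatedModel` name and the two regimes come from the description.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Dict, List, Optional

Row = Dict[str, float]


@dataclass
class MeanModel:
    """Stand-in regressor (hypothetical): always predicts the training mean."""
    mean: float

    def predict(self, rows: List[Row]) -> List[float]:
        return [self.mean for _ in rows]


@dataclass
class QueueGatedModel:
    """Single object wrapping one sub-model per regime."""
    noqueue_model: MeanModel
    queued_model: MeanModel
    queue_feature: str = "num_request_waiting"  # assumed feature name

    def predict(self, rows: List[Row]) -> List[float]:
        out = []
        for row in rows:
            # Assumed gate: any waiting request puts the row in the "queued" regime.
            queued = row[self.queue_feature] > 0
            model = self.queued_model if queued else self.noqueue_model
            out.append(model.predict([row])[0])
        return out


def train_gated(
    rows: List[Row],
    targets: List[float],
    min_samples: int,
    queue_feature: str = "num_request_waiting",
) -> Optional[QueueGatedModel]:
    """Split by regime and train both sub-models, or return None if either is too small."""
    noqueue = [t for r, t in zip(rows, targets) if r[queue_feature] == 0]
    queued = [t for r, t in zip(rows, targets) if r[queue_feature] > 0]
    if len(noqueue) < min_samples or len(queued) < min_samples:
        return None  # ensemble not activated; caller keeps its non-gated model
    return QueueGatedModel(MeanModel(fmean(noqueue)), MeanModel(fmean(queued)), queue_feature)
```

Because the wrapper is one plain object holding both sub-models, it can be serialized as a single file, which matches the single-file artifacts (ttft_gated.joblib, tpot_gated.joblib) described above.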

Configuration and Settings

Added new configuration options to Settings for enabling ensemble mode, specifying minimum samples for ensemble split, and setting paths for gated ensemble model files. The default maximum training data size per bucket was also reduced.
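
As a rough illustration of what such options might look like, the sketch below loads them from environment variables. Every variable name, field name, and default value here is hypothetical. The description only states that options exist for enabling ensemble mode, for the minimum samples per split, and for the gated model file paths.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EnsembleSettings:
    """Hypothetical subset of Settings covering ensemble mode only."""
    ensemble_mode: bool
    ensemble_min_samples: int
    ttft_gated_model_path: str
    tpot_gated_model_path: str


def load_ensemble_settings(env: Mapping[str, str] = os.environ) -> EnsembleSettings:
    """Read ensemble options from a mapping (defaults are illustrative, not from the PR)."""
    return EnsembleSettings(
        ensemble_mode=env.get("ENSEMBLE_MODE", "false").lower() in ("1", "true", "yes"),
        ensemble_min_samples=int(env.get("ENSEMBLE_MIN_SAMPLES", "100")),
        ttft_gated_model_path=env.get("TTFT_GATED_MODEL_PATH", "/tmp/models/ttft_gated.joblib"),
        tpot_gated_model_path=env.get("TPOT_GATED_MODEL_PATH", "/tmp/models/tpot_gated.joblib"),
    )
```

Passing the environment as a mapping keeps the loader easy to test, for example `load_ensemble_settings({"ENSEMBLE_MODE": "true"})`.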

Metrics and Monitoring

Exposed new metrics for ensemble mode and ensemble activation status in the Prometheus metrics output.
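
For readers unfamiliar with the Prometheus text format, the sketch below renders two gauges of this kind by hand. The metric names `ensemble_mode_enabled` and `ensemble_active` and the `model` label are assumptions. The actual metric names added by this PR are not given in the description.

```python
from typing import Dict


def render_ensemble_metrics(ensemble_mode: bool, active: Dict[str, bool]) -> str:
    """Render hypothetical ensemble gauges in the Prometheus text exposition format."""
    lines = [
        "# HELP ensemble_mode_enabled Whether ensemble (gated) mode is configured.",
        "# TYPE ensemble_mode_enabled gauge",
        f"ensemble_mode_enabled {int(ensemble_mode)}",
        "# HELP ensemble_active Whether the gated ensemble is currently in use per model.",
        "# TYPE ensemble_active gauge",
    ]
    # One sample per model (for example ttft and tpot), sorted for stable output.
    for model_name in sorted(active):
        lines.append(f'ensemble_active{{model="{model_name}"}} {int(active[model_name])}')
    return "\n".join(lines) + "\n"
```

Separating "mode enabled" from "ensemble active" reflects the point made earlier in the description: the ensemble is only activated when every sub-model has enough samples, so the two values can differ.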

API Enhancements

The /data_status API now reports sample counts for each regime and ensemble configuration details, aiding monitoring and debugging of the ensemble training process. [1] [2]
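
A status payload of this shape could be assembled roughly as follows. The field names, the `build_data_status` helper, and the rule that a queue depth of zero counts as "noqueue" are assumptions for illustration. The real response schema is defined in the PR's diff, not in this description.

```python
from typing import Any, Dict, List


def build_data_status(queue_depths: List[int], ensemble_mode: bool, min_samples: int) -> Dict[str, Any]:
    """Summarize per-regime sample counts and ensemble configuration (hypothetical schema)."""
    # Assumes non-negative queue depths: zero is "noqueue", anything else is "queued".
    noqueue = sum(1 for depth in queue_depths if depth == 0)
    queued = len(queue_depths) - noqueue
    return {
        "regime_samples": {"noqueue": noqueue, "queued": queued},
        "ensemble": {
            "mode_enabled": ensemble_mode,
            "min_samples": min_samples,
            # Ready only when the mode is on and both regimes meet the threshold.
            "ready": ensemble_mode and min(noqueue, queued) >= min_samples,
        },
    }
```

Reporting the per-regime counts next to the threshold makes it straightforward to see why an ensemble has not activated yet.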

These changes collectively enable more robust and flexible latency prediction by accounting for queueing effects, and improve observability and manageability of the model lifecycle.

netlify · 2026-03-03T22:32:16Z

✅ Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | `a5e6642` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/gateway-api-inference-extension/deploys/69a8b56545da7900083dc81a |
| 😎 Deploy Preview | https://deploy-preview-2473--gateway-api-inference-extension.netlify.app |

To edit notification comments on pull requests, go to your Netlify project configuration.

BenjaminBraunDev

lgtm

BenjaminBraunDev · 2026-03-04T17:09:50Z

/lgtm

kaushikmitr · 2026-03-04T21:54:46Z

@danehans @ahg-g
This PR was failing a test due to a missing boilerplate comment in hack/release-staging-digests.sh (broken in #2479). I added it in this PR.

kaushikmitr · 2026-03-04T22:02:30Z

Actually, it seems someone already added it (#2484); I just need to rebase once it lands.

k8s-ci-robot · 2026-03-04T22:42:50Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenjaminBraunDev, kaushikmitr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~latencypredictor/OWNERS~~ [kaushikmitr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

BenjaminBraunDev · 2026-03-04T23:45:39Z

/lgtm

k8s-ci-robot requested review from ahg-g and kfswain March 3, 2026 22:32

k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Mar 3, 2026

BenjaminBraunDev approved these changes Mar 4, 2026

View reviewed changes

k8s-ci-robot assigned BenjaminBraunDev Mar 4, 2026

k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 4, 2026

kaushikmitr force-pushed the training-server-ensemble branch 2 times, most recently from a4ee705 to 5486d30 Compare March 4, 2026 21:27

k8s-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 4, 2026

latencypredictor: ensemble mode (queue-gated sub-models)

a5e6642

kaushikmitr force-pushed the training-server-ensemble branch from 5486d30 to a5e6642 Compare March 4, 2026 22:42

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 4, 2026

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 4, 2026

k8s-ci-robot merged commit b32374e into kubernetes-sigs:main Mar 5, 2026
11 checks passed

RyanRosario pushed a commit to RyanRosario/gateway-api-inference-extension that referenced this pull request Mar 9, 2026

latencypredictor: ensemble mode (queue-gated sub-models) (kubernetes-…

8d4aea5

…sigs#2473)

k8s-ci-robot merged 1 commit into kubernetes-sigs:main from tomatillo-and-multiverse:training-server-ensemble

3 participants
