
Update troubleshooting guide to include remediation for incorrect pre…#2040

Merged
k8s-ci-robot merged 4 commits into kubernetes-sigs:main from BenjaminBraunDev:vertex-troubleshooting-guide
Jan 21, 2026


@BenjaminBraunDev
Contributor

Customers have hit TTFT (time to first token) spikes caused by an incorrectly configured prefix cache scorer. This PR adds that case to the troubleshooting guide so that users can more easily diagnose and remediate similar issues.

In this case, the prefix cache configuration was not identified as the cause of the TTFT spikes until we noticed that its parameters did not match the model being served.

Does this PR introduce a user-facing change?:

NONE

@netlify

netlify Bot commented Dec 23, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit 9044848
🔍 Latest deploy log https://app.netlify.com/projects/gateway-api-inference-extension/deploys/696a96c98aba08000831aa90
😎 Deploy Preview https://deploy-preview-2040--gateway-api-inference-extension.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 23, 2025
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 23, 2025
@liu-cong
Contributor

liu-cong commented Dec 24, 2025

Since v1.2, the plugin auto-tunes these settings from the model server metrics, so no manual tuning is required (#1748). We should recommend that users run v1.2 or later, and highlight that manual tuning is only required before v1.2.

@BenjaminBraunDev
Contributor Author

> Since v1.2, the plugin auto-tunes these settings from the model server metrics, so no manual tuning is required (#1748). We should recommend that users run v1.2 or later, and highlight that manual tuning is only required before v1.2.

Done. The guide now specifies that autotuning is supported from v1.2 onward, as long as the model server exposes the required metrics, as vLLM does.
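
For readers unfamiliar with what "tuning from model server metrics" can look like, here is a minimal sketch, not the plugin's actual implementation. It assumes the model server exposes a Prometheus info metric shaped like vLLM's `vllm:cache_config_info`, with `block_size` and `num_gpu_blocks` labels. The sample values, the helper names, and the derived setting names are all hypothetical, so check your own server's `/metrics` output and the plugin documentation for the real names.

```python
import re

# Hypothetical sample of Prometheus text exposition from a model server.
# The metric name and labels are assumed to resemble vLLM's cache config
# info metric; the values are made up for illustration.
SAMPLE_METRICS = """
# HELP vllm:cache_config_info Information of the LLMEngine CacheConfig
# TYPE vllm:cache_config_info gauge
vllm:cache_config_info{block_size="16",num_gpu_blocks="8192"} 1.0
"""


def parse_labels(metrics_text, metric_name):
    """Return the label dict of the first sample of metric_name, or None."""
    pattern = re.compile(r"^" + re.escape(metric_name) + r"\{(.*)\}\s+\S+$")
    for line in metrics_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            # Labels look like key="value", separated by commas.
            return dict(re.findall(r'(\w+)="([^"]*)"', match.group(1)))
    return None


def derive_prefix_cache_settings(labels):
    """Derive illustrative prefix cache settings from the server's labels.

    The returned keys are hypothetical names, not the plugin's config fields.
    """
    block_size = int(labels["block_size"])
    num_blocks = int(labels["num_gpu_blocks"])
    return {
        "block_size_tokens": block_size,
        "capacity_blocks": num_blocks,
        "capacity_tokens": block_size * num_blocks,
    }


def find_mismatches(configured, derived):
    """List keys where a manually configured value differs from the derived one."""
    return sorted(k for k in derived if k in configured and configured[k] != derived[k])
```

Comparing a hand-written configuration against values derived this way, for example with `find_mismatches({"block_size_tokens": 64}, derived)`, is one way to spot the kind of mismatch described in this PR before it shows up as TTFT spikes.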

@kfswain
Collaborator

kfswain commented Jan 21, 2026

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 21, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenjaminBraunDev, kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 21, 2026
@k8s-ci-robot k8s-ci-robot merged commit 55b3e09 into kubernetes-sigs:main Jan 21, 2026
10 checks passed

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files.
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
lgtm "Looks good to me", indicates that a PR is ready to be merged.
size/M Denotes a PR that changes 30-99 lines, ignoring generated files.


4 participants