AEP-8818: InPlace Update Mode (#8818)
Skipping CI for Draft Pull Request.

/kind api-review

/label kind/api-review

/label api-review

/lgtm @omerap12 If you have a draft PR that makes sense to review, then let me know, Omer.

@iamzili: changing LGTM is restricted to collaborators

I'll take a look at this next week @omerap12, sorry for the delay 🥲
### Infeasible Attempt Tracking

VPA must track `infeasible` resize attempts to prevent infinite retry loops. This is necessary because infeasibility can be detected at different points depending on the Kubernetes version:

| Kubernetes Version | When Infeasibility Is Detected | `spec.resources` After Attempt | How VPA Learns |
|--------------------|--------------------------------|--------------------------------|----------------|
| < 1.36 (or later if the KEP slips) | After the patch succeeds, kubelet reports status | Updated to attempted value | Resize status = Infeasible |
| >= 1.36 (targeted, not guaranteed) | At patch time, the API server rejects | Unchanged (old value) | Patch error response |

> **Note:** The Kubernetes sig-node team is *targeting* admission-time feasibility checks for 1.36, but this timeline is not guaranteed and may slip to a later release. VPA implements version-agnostic detection that works correctly regardless of which version introduces this change.
It may be worth linking to the issue/PR that discusses this.
Also, is this technically out of scope of the InPlace mode?
It's a problem we'll face for InPlaceOrRecreate also
I will link the PR.

> Also, is this technically out of scope of the InPlace mode?

I don't think so - this AEP should define a clear solution for it.

> It's a problem we'll face for InPlaceOrRecreate as well.

Agreed, but since `InPlaceOrRecreate` can recreate Pods, we should take a different approach there.

Good catch flagging this. We should adjust it before it's merged into k/k. SIG Node will likely put this behind a feature gate, but we'll need to account for that as well (e.g. someone enabling the feature gate).
> It's a problem we'll face for InPlaceOrRecreate as well.
> Agreed, but since `InPlaceOrRecreate` can recreate Pods, we should take a different approach there.

Yeah, good point, I guess it needs to handle the error rather than saving it. Maybe that can happen in the AEP for the `InPlaceOrRecreate` mode.
When thinking about the data structure that will store the infeasible attempts, one situation came to mind that I think we haven't discussed yet: what if we record an infeasible attempt for a pod, and then for some reason the pod is moved to another node (e.g. due to a node drain or node-pressure eviction), where the required resources suddenly become available? In that case, the updater would not try to apply the previously infeasible attempt to the pod, right?

I had thought about this, and had decided that pods can't move nodes. My assumption was that a pod can be destroyed and a new one recreated on a new node; however, maybe I'm wrong? EDIT: oh, what about StatefulSets? Those names don't really change. Maybe we need to store the UID rather than the pod?

we are fine with

Not sure I understand: a Pod named "my-first-statefulset-0" can be recreated and land on a new (larger) node, and still be called "my-first-statefulset-0", in which case we may not resize it, since it's still stored in the map (i.e. the problem you described in a previous comment).

ah, now I understand what you meant 🙂 Good point! You are right - this issue that I described, and that you also thought about, is really a problem for

We can rely on the pod's UID, which is guaranteed to be unique.
/kind api-change (Is this correct?)
Periodically, VPA removes entries from the `infeasibleAttempts` map for pods that no longer exist. This prevents memory leaks from accumulating stale entries. This cleanup behavior is targeted for beta.
**Key Difference from `InPlaceOrRecreate`**: In `InPlace` mode, `Deferred`, `Infeasible`, and `InProgress` statuses all result in waiting: VPA never falls back to eviction. In contrast, `InPlaceOrRecreate` mode may fall back to eviction after a timeout. This ensures that `InPlace` mode pods are never evicted, regardless of how long they remain in a non-updatable state.
> all result in waiting: VPA never falls back to eviction

We are thinking about creating a proposal for IPPR-triggered scheduler eviction, so I'd like to confirm my understanding of the Deferred condition handling here. What I am envisioning is something like this:

- VPA has a recommendation that could theoretically fit on the node, but that space is currently being occupied by some lower-priority pods. It initiates an in-place pod resize request (assume `InPlaceOnly` mode).
- Kubelet marks the resize as `Deferred` in the pod status.
- The scheduler watches for `Deferred` resizes, and preempts a lower-priority pod accordingly.
- After the lower-priority pod is removed, kubelet triggers a retry of the deferred resize and it succeeds.

From VPA's perspective, does this sound like a use case that the current InPlaceOnly proposal already supports?

cc @tallclair
Yes. The main point is this: with `Deferred`, we are only waiting (with `Infeasible`, there is more logic involved, e.g. remembering earlier recommendations).

> We are thinking about creating a proposal for IPPR-triggered scheduler eviction

Has the KEP been created already? I would like to review it :)
Thanks! I will CC you in the KEP.
Can you also CC me? Thanks!
soltysh left a comment:
@omerap12 and @adrianmoisey asked me to check this from API review pov, and from that perspective this looks good 👍
```go
cr, present := ip.podToReplicaCreatorMap[getPodID(pod)]
if present {
```
Nit: I'd suggest inverting the conditions here (yes, I'm aware this is just a design doc 😉 but I won't likely review the actual PR, so I'm leaving that comment here); that will make the code easier to understand. The same thing applies to the identical check a few lines below. In other words:

```go
cr, present := ip.podToReplicaCreatorMap[getPodID(pod)]
if !present {
	return utils.InPlaceDeferred
}
...
singleGroupStats, present := ip.creatorToSingleGroupStatsMap[cr]
if pod.Status.Phase == apiv1.PodPending {
	return utils.InPlaceDeferred
}
if !present {
	return utils.InPlaceDeferred
}
...
```

Thanks for the review. Fixed :)
Signed-off-by: Omer Aplatony <[email protected]>
/lgtm Thanks! This is amazing

/unhold

/approve

[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: adrianmoisey, omerap12, soltysh.
Signed-off-by: Omer Aplatony <[email protected]>
What type of PR is this?
/kind documentation
What this PR does / why we need it:
AEP for #8720
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: