Breaking conversion issues from v1beta1 → v1beta2 in CAPI 1.11 for KubeadmControlPlaneTemplate #12605

@LingyanCao


What steps did you take and what happened?

In CAPI v1.11, v1beta2 is now the storage version for several resources, including KubeadmControlPlaneTemplate.
When creating a v1beta1 resource, the conversion webhook converts it to v1beta2 before persisting it to etcd.

I found two breaking changes where conversion fails, or produces an object that later fails validation, because required fields or validation behavior differ between v1beta1 and v1beta2.

Case 1 – rolloutStrategy.type becomes required
In KubeadmControlPlaneTemplate v1beta1, the rolloutStrategy.type field is optional:

// RolloutStrategy describes how to replace existing machines with new ones.
type RolloutStrategy struct {
    // Type of rollout. Currently the only supported strategy is "RollingUpdate".
    // Default is RollingUpdate.
    // +optional
    Type RolloutStrategyType `json:"type,omitempty"`
}

In v1beta2, the equivalent field is required:

type KubeadmControlPlaneRolloutStrategy struct {
    // Type of rollout. Currently the only supported strategy is "RollingUpdate".
    // Default is RollingUpdate.
    // +required
    Type KubeadmControlPlaneRolloutStrategyType `json:"type"`
}

If I create a v1beta1 CR without type: RollingUpdate explicitly set, the conversion webhook fails because v1beta2 requires the field.

Example:

spec:
  rolloutStrategy:
    rollingUpdate:
      maxSurge: 1
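
One way the conversion could tolerate a missing type is to fall back to the documented default. The following is only a minimal sketch under stated assumptions: the struct names, the plain string field, and the convertRolloutStrategy helper are hypothetical stand-ins, not the actual Cluster API types or conversion functions.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the v1beta1 and v1beta2 structs
// quoted in this issue. They are NOT the real Cluster API types.
type RolloutStrategyV1Beta1 struct {
	Type string // optional in v1beta1, so it may be empty
}

type RolloutStrategyV1Beta2 struct {
	Type string // required in v1beta2
}

// defaultRolloutType is the default named in the field's doc comment.
const defaultRolloutType = "RollingUpdate"

// convertRolloutStrategy (hypothetical helper) copies Type and fills in the
// documented default when the v1beta1 object leaves it unset, so the v1beta2
// result always carries the required field.
func convertRolloutStrategy(in RolloutStrategyV1Beta1) RolloutStrategyV1Beta2 {
	out := RolloutStrategyV1Beta2{Type: in.Type}
	if out.Type == "" {
		out.Type = defaultRolloutType
	}
	return out
}

func main() {
	// A v1beta1 strategy that only sets rollingUpdate.maxSurge has an empty Type.
	converted := convertRolloutStrategy(RolloutStrategyV1Beta1{})
	fmt.Println(converted.Type)
}
```

An explicitly set type passes through unchanged; only the empty value is defaulted.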

Case 2 – timeoutForControlPlane validation mismatch
If the v1beta1 CR sets timeoutForControlPlane in the apiServer section, conversion to v1beta2 can produce an invalid bootstrap config.

Example v1beta1 snippet:

apiServer:
  timeoutForControlPlane: 20m0s

Conversion leads to:

failed to generate bootstrap config: failed to create bootstrap configuration:
admission webhook "validation.kubeadmconfig.bootstrap.cluster.x-k8s.io" denied the request:
KubeadmConfig.bootstrap.cluster.x-k8s.io "..." is invalid:
[spec.initConfiguration.timeouts.controlPlaneComponentHealthCheckSeconds: Invalid value: "1200":
controlPlaneComponentHealthCheckSeconds must be set to the same value both in initConfiguration.timeouts (1200)
and in joinConfiguration.timeouts (unset),
spec.joinConfiguration.timeouts.controlPlaneComponentHealthCheckSeconds: Invalid value: "unset":
controlPlaneComponentHealthCheckSeconds must be set to the same value both in initConfiguration.timeouts (1200)
and in joinConfiguration.timeouts (unset)],
failed to cleanup generated resources: expected pointer, but got nil

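This may be a bug in how the conversion populates the new v1beta2 field controlPlaneComponentHealthCheckSeconds: per the error above, the value is set in initConfiguration.timeouts (1200) but left unset in joinConfiguration.timeouts.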
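
The validation error requires the same controlPlaneComponentHealthCheckSeconds value in both initConfiguration.timeouts and joinConfiguration.timeouts. As a rough illustration of a conversion that satisfies this, the sketch below turns the v1beta1 duration into seconds and writes it to both places. The Timeouts and KubeadmConfigSpec types and the convertTimeout helper are hypothetical simplifications, not the real Cluster API code.

```go
package main

import (
	"fmt"
	"time"
)

// Hypothetical, simplified stand-ins; NOT the real Cluster API types.
type Timeouts struct {
	ControlPlaneComponentHealthCheckSeconds *int32
}

type KubeadmConfigSpec struct {
	InitTimeouts Timeouts // stands in for initConfiguration.timeouts
	JoinTimeouts Timeouts // stands in for joinConfiguration.timeouts
}

// convertTimeout (hypothetical helper) parses a v1beta1-style duration such
// as "20m0s" and sets the equivalent number of seconds in BOTH the init and
// join timeouts, which is what the validation webhook's message asks for.
func convertTimeout(timeoutForControlPlane string) (KubeadmConfigSpec, error) {
	d, err := time.ParseDuration(timeoutForControlPlane)
	if err != nil {
		return KubeadmConfigSpec{}, fmt.Errorf("parsing timeoutForControlPlane: %w", err)
	}
	seconds := int32(d.Seconds())
	// Use separate variables so the two pointers do not alias each other.
	initSeconds, joinSeconds := seconds, seconds
	return KubeadmConfigSpec{
		InitTimeouts: Timeouts{ControlPlaneComponentHealthCheckSeconds: &initSeconds},
		JoinTimeouts: Timeouts{ControlPlaneComponentHealthCheckSeconds: &joinSeconds},
	}, nil
}

func main() {
	spec, err := convertTimeout("20m0s")
	if err != nil {
		panic(err)
	}
	fmt.Println(
		*spec.InitTimeouts.ControlPlaneComponentHealthCheckSeconds,
		*spec.JoinTimeouts.ControlPlaneComponentHealthCheckSeconds,
	)
}
```

With the 20m0s value from the example, both fields end up as 1200, matching the number reported in the error message.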

What did you expect to happen?

The conversion webhook should handle these cases gracefully, so that existing valid v1beta1 resources convert to v1beta2 successfully.

Cluster API version

v1.11.0-beta.2

Kubernetes version

1.32

Anything else you would like to add?

I’m not entirely sure if this qualifies as a breaking change in the conversion webhook, but if the issue remains unresolved, existing valid v1beta1 CRs will fail to convert to v1beta2 once v1beta1 is removed. This would block customers from upgrading without manual manifest changes, causing potential disruption in production.

Label(s) to be applied

/kind bug
area/api

Labels

kind/bug, needs-priority, needs-triage