
feat(t8s-cluster/management-cluster): switch to hcp #1759

Merged
cwrau merged 1 commit into main from feat/t8s-cluster/switch-to-hcp
Nov 20, 2025

@cwrau cwrau commented Oct 22, 2025

Summary by CodeRabbit

  • New Features

    • Hosted control plane deployment option with dedicated hosted templates
    • Node pool autoscaling support
    • Containerd GPU/runtime and registry mirror support
    • Cilium CNI connectivity test job
  • Improvements

    • Simplified bootstrapping and kubelet behavior
    • Streamlined node scheduling, tolerations and cloud-controller-manager handling
    • Worker timeouts adjusted (drain/deletion)
  • Chores

    • Removed legacy audit configuration
    • Control-plane template reorganizations and schema tweaks (flavor optional)

Copilot AI review requested due to automatic review settings October 22, 2025 13:34
@cwrau cwrau enabled auto-merge October 22, 2025 13:34

coderabbitai bot commented Oct 22, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Refactors control-plane templates toward a hosted control plane, removes Kubernetes audit policy and its plumbing, adds containerd plugin config, forces Kubeadm bootstrap usage, introduces a hosted control plane spec and Cilium test Job, tightens OpenStack flavor validation, adjusts kubelet/timeout/nodeSelector logic, and adds an autoscaling values file.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Autoscaling Configuration | charts/t8s-cluster/ci/autoscaling-values.yaml | Added values file enabling nodePools.autoscaling with flavor standard.2.4096 and replicas min: 1, max: 3. |
| Audit Policy Removal | charts/t8s-cluster/files/audit-config.yaml | Deleted Kubernetes audit policy file and removed related static/dynamic file references and API server audit plumbing. |
| Helper Templates & Containerd | charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl | Added t8s-cluster.clusterClass.containerdConfig.plugins; removed audit-config path/file helpers and removed default API server arg injections. |
| Bootstrap Templates | .../bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml, .../_k0smotronConfigTemplateSpec.yaml | Made bootstrap kind a fixed KubeadmConfigTemplate (removed hosted branching); removed k0smotron-specific k0smotron.spec template. |
| Control Plane Migration | charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml, .../k0smotronControlPlaneTemplate/*, .../hostedControlPlaneTemplate/* | Removed k0smotron control plane templates; added hosted control plane templates/helpers; controlPlane.apiVersion now dynamic (v1alpha1 if hosted, else v1beta1); hosted path uses HostedControlPlaneTemplate. |
| Hosted Control Plane Spec | charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml | New template t8s-cluster.clusterClass.hostedControlPlaneTemplate.spec rendering hosted control plane spec (apiServer, audit webhook/policy, controllerManager, scheduler, gateway, dynamic file merging). |
| Hosted Helper Rename | charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl | Renamed helper from k0smotronControlPlaneTemplate.specHash to hostedControlPlaneTemplate.specHash and updated include references. |
| Worker / Timeouts | charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml | nodeDrainTimeout made a duration (8m) and added nodeDeletionTimeout: 15m; worker bootstrap.template always KubeadmConfigTemplate. |
| OpenStack Flavor Validation | charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml | flavor now required via ` |
| Kubelet Patch Logic | charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl | Removed semver gate; apply imagePulls patch whenever maxParallelImagePulls > 1. |
| etcd-defrag Inclusion | charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml | Removed hosted-specific inclusion of etcd-defrag template. |
| Cloud Controller / CSI Adjustments | charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml, charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml | Reworked hosted/version guards: unconditional nodeSelector removal under hosted; removed hosted-only nodePlugin block for CSI nodePlugin. |
| CNI Test Job | charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml | Added Helm test Job for Cilium connectivity (init containers for kubeconfig and connectivity test, plus cleanup container). |
| Schema Changes | charts/t8s-cluster/values.schema.json | Removed controlPlane.flavor from the required array (flavor no longer required at that level). |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Helm as Helm Render
    participant CC as ClusterClass templates
    participant Hosted as HostedControlPlaneTemplate
    participant Kubeadm as KubeadmConfigTemplate

    Helm->>CC: render clusterClass.yaml
    CC->>CC: evaluate .Values.controlPlane.hosted
    alt hosted == true
        CC->>Hosted: include hostedControlPlaneTemplate.spec
        Hosted->>Hosted: merge static/dynamic files, build apiServer/audit/webhook
        Hosted->>Kubeadm: reference KubeadmConfigTemplate for worker bootstrap
    else hosted == false
        CC->>Kubeadm: use Kubeadm/KubeadmControlPlaneTemplate paths (v1beta1)
    end
sequenceDiagram
    autonumber
    participant Job as Cilium Test Job
    participant InitKube as init: test-kubeconfig
    participant InitConn as init: connectivity-test
    participant Cleaner as main: delete-namespace

    Job->>InitKube: mount workload-kubeconfig, run kubectl to prepare
    InitKube-->>Job: kubeconfig ready
    Job->>InitConn: run cilium connectivity test
    InitConn-->>Job: test result
    Job->>Cleaner: delete test namespace (cleanup)
    Cleaner->>Cleaner: cleanup completes
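
The second diagram can be read as a Helm test Job along the following lines. This is a minimal sketch, not the chart's actual template: the container names come from the diagram, while the Job name, the images, the test namespace `cilium-test`, and the kubeconfig Secret name are assumptions made for illustration. The Secret layout (`<cluster>-kubeconfig` with a `value` key) follows the usual Cluster API convention.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: {{ .Release.Name }}-test-cni-cilium   # hypothetical name
  annotations:
    helm.sh/hook: test                        # run only on `helm test`
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      volumes:
        - name: workload-kubeconfig
          secret:
            secretName: {{ .Release.Name }}-kubeconfig   # assumed Secret name
      initContainers:
        # Step 1: confirm the workload cluster is reachable with the kubeconfig.
        - name: test-kubeconfig
          image: example.invalid/kubectl:latest          # placeholder image
          command: ["kubectl", "version"]
          env:
            - name: KUBECONFIG
              value: /kubeconfig/value
          volumeMounts:
            - name: workload-kubeconfig
              mountPath: /kubeconfig
              readOnly: true
        # Step 2: run the connectivity suite in a dedicated namespace.
        - name: connectivity-test
          image: example.invalid/cilium-cli:latest       # placeholder image
          command: ["cilium", "connectivity", "test", "--test-namespace", "cilium-test"]
          env:
            - name: KUBECONFIG
              value: /kubeconfig/value
          volumeMounts:
            - name: workload-kubeconfig
              mountPath: /kubeconfig
              readOnly: true
      containers:
        # Step 3: clean up the test namespace once the init containers succeed.
        - name: delete-namespace
          image: example.invalid/kubectl:latest          # placeholder image
          command: ["kubectl", "delete", "namespace", "cilium-test", "--ignore-not-found"]
          env:
            - name: KUBECONFIG
              value: /kubeconfig/value
          volumeMounts:
            - name: workload-kubeconfig
              mountPath: /kubeconfig
              readOnly: true
```

One consequence of this layout is that the cleanup container only starts if both init containers exit successfully, so a failed connectivity test would leave the test namespace in place for inspection.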

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested reviewers

  • tasches
  • marvinWolff
  • teutonet-bot

Poem

🐰 I hop from k0s to hosted, tidy and spry,
Audit files tucked, containerd tunes nigh,
Cilium tests tumble, timeouts stretched wide,
Flavors now required where checks must abide,
A rabbit's small cheer — charts refactored with pride! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The pull request title "feat(t8s-cluster/management-cluster): switch to hcp" directly corresponds to the primary objective of this changeset, which involves migrating the management cluster from a K0smotron-based control plane architecture to a Hosted Control Plane (HCP) architecture. The changes comprehensively demonstrate this transition through the removal of K0smotronControlPlaneTemplate resources, deletion of K0smotron-related template specifications, and introduction of new HostedControlPlaneTemplate resources and helpers. The title is concise, specific, and clearly communicates the main change without vague terminology or unnecessary noise. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b19c4ef and aa16abe.

📒 Files selected for processing (18)
  • charts/t8s-cluster/ci/autoscaling-values.yaml (1 hunks)
  • charts/t8s-cluster/files/audit-config.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (3 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (1 hunks)
  • charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml (1 hunks)
  • charts/t8s-cluster/values.schema.json (0 hunks)
💤 Files with no reviewable changes (7)
  • charts/t8s-cluster/values.schema.json
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml
  • charts/t8s-cluster/files/audit-config.yaml
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml
🚧 Files skipped from review as they are similar to previous changes (5)
  • charts/t8s-cluster/ci/autoscaling-values.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl
  • charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml

[error] 6-6: syntax error: expected '', but found '{'

(syntax)

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: check licenses
  • GitHub Check: lint helm chart (t8s-cluster)
🔇 Additional comments (12)
charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1)

1-11: LGTM: Clean implementation of HostedControlPlaneTemplate rendering.

The conditional rendering and resource structure are well-implemented. The use of $.Release.Name (root context) on lines 5 and 10 is correct for supporting .Files.Get operations as noted in the comments.

Note: The YAMLlint syntax error is a false positive - it doesn't understand Helm template syntax.

charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (3)

32-32: LGTM: Dynamic apiVersion selection is appropriate.

The ternary operator correctly selects v1alpha1 for hosted mode (HostedControlPlaneTemplate) and v1beta1 for non-hosted mode (KubeadmControlPlaneTemplate).


33-39: LGTM: Control plane template selection is well-structured.

The conditional logic correctly selects between HostedControlPlaneTemplate and KubeadmControlPlaneTemplate based on the hosted flag, with appropriate spec hash references for each mode.


165-169: LGTM: Simplified bootstrap configuration.

Forcing KubeadmConfigTemplate for all worker nodes simplifies the bootstrap logic and aligns with the PR's objective to standardize on Kubeadm bootstrap.

charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (2)

38-65: LGTM: Well-structured containerd plugin configuration.

The containerd plugin configuration is properly structured with:

  • Conditional registry mirror support
  • Proper runc runtime configuration with SystemdCgroup enabled
  • Conditional NVIDIA runtime for GPU workloads

The comment on line 38 about containerd 2.0.0 compatibility is helpful for future maintenance.


127-134: LGTM: Simplified shared arguments template.

The removal of default argument injections (authorization-always-allow-paths, bind-address) aligns with the broader audit configuration cleanup in this PR.

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (4)

7-20: LGTM: API server configuration is well-structured.

The API server deployment configuration properly:

  • Mounts configuration from a ConfigMap
  • Merges static and dynamic files with validation
  • Uses appropriate helper templates for args and resources

21-27: Verify hardcoded audit webhook configuration.

The audit webhook configuration contains hardcoded values:

  • Line 24: Wazuh server URL with a specific domain
  • Line 27: Secret namespace capi-hosted-control-plane-system

These hardcoded values may be intentional for your infrastructure, but consider whether they should be configurable to support different environments or deployments.

If these values should be configurable, consider extracting them to values.yaml and using them as template variables.


28-59: LGTM: Comprehensive and well-designed audit policy.

The audit policy follows Kubernetes best practices by:

  • Excluding long-running requests at RequestReceived stage
  • Filtering out noisy system components and high-volume resources
  • Protecting secret data by logging only metadata
  • Capturing all mutation operations for auditability
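
For readers unfamiliar with the format, a policy with the properties listed above could look roughly like this. It is a sketch in the standard `audit.k8s.io/v1` schema, not a copy of the policy in this PR; the specific users, resources, and verbs are assumptions chosen to match the bullet points.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# Skip the RequestReceived stage for every rule to avoid duplicate events.
omitStages:
  - RequestReceived
rules:
  # Drop high-volume, low-value resources.
  - level: None
    resources:
      - group: coordination.k8s.io
        resources: ["leases"]
      - group: ""
        resources: ["events"]
  # Drop routine traffic from core control-plane components.
  - level: None
    users:
      - system:kube-controller-manager
      - system:kube-scheduler
      - system:apiserver
  # Record who touched secrets, but never the secret payload.
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Record every mutating request at metadata level.
  - level: Metadata
    verbs: ["create", "update", "patch", "delete", "deletecollection"]
```

Rules are evaluated in order and the first match wins, which is why the `None` rules for noisy sources come before the broader `Metadata` rules.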

60-71: LGTM: Controller manager, scheduler, and gateway configuration.

The control plane components are properly configured with:

  • Appropriate args from helper templates
  • Resource requests from common helpers
  • Single replica per component (appropriate for hosted control plane)

Note: The gateway namespace on line 69 is also hardcoded to capi-hosted-control-plane-system, similar to the audit webhook configuration.

charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (2)

65-69: Approve the toleration configuration for hosted control planes.

The tolerations are correctly configured to allow the CCM pods to tolerate uninitialized node taints, enabling them to schedule on hosted control plane nodes. Both the cluster-level (node.cluster.x-k8s.io/uninitialized) and cloud-provider-level (node.cloudprovider.kubernetes.io/uninitialized) taints are handled.


63-70: The review comment is based on a misunderstanding of the conditional logic.

The tolerations in lines 63-70 are applied only when controlPlane.hosted=true, not removed from non-hosted deployments. The code shows:

  • {{- if .Values.controlPlane.hosted }} wraps both the postRenderer (line 48) and the tolerations (line 63)
  • Non-hosted deployments (controlPlane.hosted=false) remain the default configuration and continue using upstream chart defaults
  • No breaking change occurs to non-hosted deployments; they are unaffected by this block


Copilot AI left a comment


Pull Request Overview

This PR switches the management cluster from k0smotron to HCP (Hosted Control Plane) for hosted control plane deployments. The main changes include removing k0smotron-specific templates and configuration, introducing HCP templates, and simplifying cloud controller manager tolerations while removing version-specific logic.

Key Changes:

  • Replaced K0smotronControlPlaneTemplate with HostedControlPlaneTemplate for hosted control planes
  • Removed k0smotron-specific bootstrap configuration and worker config templates
  • Consolidated cloud controller manager tolerations and removed Kubernetes version checks

Reviewed Changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| charts/t8s-cluster/values.schema.json | Removed required constraint on flavor field |
| charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml | Added new Cilium CNI connectivity test job |
| charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml | Simplified tolerations configuration and removed version checks |
| charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml | Removed k0s-specific kubelet directory workaround |
| charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml | Removed etcd defrag configuration for hosted control planes |
| charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl | Removed Kubernetes version check for image pull configuration |
| charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml | Added explicit validation for required flavor field |
| charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml | Removed k0smotron control plane template |
| charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml | Removed k0smotron control plane spec |
| charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml | Added new HCP template |
| charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml | Added HCP specification with audit webhook configuration |
| charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl | Updated helpers for HCP spec hash generation |
| charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml | Updated to reference HCP templates and changed bootstrap to use KubeadmConfigTemplate |
| charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml | Removed k0smotron bootstrap config spec |
| charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml | Simplified to always use KubeadmConfigTemplate |
| charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl | Removed audit config helpers and shared args configuration |
| charts/t8s-cluster/files/audit-config.yaml | Removed standalone audit config file |
| charts/t8s-cluster/ci/autoscaling-values.yaml | Added autoscaling test configuration |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (1)

50-55: Update TODO comments to reference the correct issue.

The TODO comments reference containerd issue #5837, but that issue is about config merge via imports, not SystemdCgroup. The SystemdCgroup = true setting is a legitimate requirement for containerd + runc when cgroup v2 is in use, not a workaround. Update the TODO comments to either remove them or reference the correct containerd documentation/issue about cgroup v2 and runc configuration requirements.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5b6ead9 and 0d313fc.

📒 Files selected for processing (18)
  • charts/t8s-cluster/ci/autoscaling-values.yaml (1 hunks)
  • charts/t8s-cluster/files/audit-config.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (3 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (1 hunks)
  • charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml (1 hunks)
  • charts/t8s-cluster/values.schema.json (0 hunks)
💤 Files with no reviewable changes (7)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml
  • charts/t8s-cluster/values.schema.json
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml
  • charts/t8s-cluster/files/audit-config.yaml
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml

[error] 6-6: syntax error: expected '', but found '{'

(syntax)

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

🔇 Additional comments (18)
charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (2)

65-69: Verify toleration matches node taint structure and consistency.

The first toleration (line 65–66) lacks an explicit value and operator, which means it will tolerate any taint with the key node.cluster.x-k8s.io/uninitialized regardless of the taint's value. The second toleration (line 67–69) explicitly specifies value: "true". This inconsistency could lead to scheduling mismatches if:

  1. The actual node taints have specific values that don't align with the toleration definitions.
  2. The first toleration's intent differs from the second (one is a "flag," the other a keyed value).

To verify the actual node taints and ensure the tolerations are correct, please check:

  • The node taints for both node.cluster.x-k8s.io/uninitialized and node.cloudprovider.kubernetes.io/uninitialized keys
  • Whether the first toleration should explicitly include value: "" and operator: Equal for clarity and consistency

For consistency, consider aligning the toleration definitions:

      - key: node.cluster.x-k8s.io/uninitialized
+       operator: Equal
+       value: ""
        effect: NoSchedule
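
When weighing this suggestion, it may help to recall how the Kubernetes API treats the `operator` field: if it is omitted it defaults to `Equal`, and only `operator: Exists` matches a taint regardless of its value. The sketch below contrasts the three forms; the first two keys are the ones discussed in this comment, and the third key is a hypothetical placeholder.

```yaml
tolerations:
  # operator omitted, so it defaults to Equal; value omitted, so it is "".
  # Matches a NoSchedule taint with this key and an empty value.
  - key: node.cluster.x-k8s.io/uninitialized
    effect: NoSchedule
  # Explicit value with the default Equal operator.
  # Matches only a NoSchedule taint with this key whose value is "true".
  - key: node.cloudprovider.kubernetes.io/uninitialized
    value: "true"
    effect: NoSchedule
  # Exists ignores the taint value entirely; no value may be set here.
  - key: example.invalid/any-value   # hypothetical key, for illustration only
    operator: Exists
    effect: NoSchedule
```

Under that reading, adding `operator: Equal` and `value: ""` to the first entry would make the defaults explicit without changing which taints it matches.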

63-70: Confirm backward compatibility for non-hosted deployments.

The tolerations block is now conditional on .Values.controlPlane.hosted (line 63). This means non-hosted deployments receive no tolerations at all. According to the PR summary, a previous block "added a control-plane specific toleration and nodeSelector when not hosted"—implying non-hosted deployments may have previously had tolerations.

Verify that:

  1. Non-hosted deployments do not need any tolerations for the CCM to function correctly.
  2. Removing these tolerations from non-hosted deployments is intentional and tested.
  3. Existing non-hosted clusters will not experience scheduling failures after this change.
charts/t8s-cluster/ci/autoscaling-values.yaml (1)

1-6: Configuration looks good.

The autoscaling setup is well-formed and provides a reasonable test range (1–3 replicas) for CI validation of the standard.2.4096 flavor.
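
Based on the description in the walkthrough (nodePools.autoscaling, flavor standard.2.4096, replicas min 1 and max 3), the six-line values file presumably resembles the following. The exact key nesting is an assumption inferred from that summary, not copied from the file.

```yaml
# Sketch of charts/t8s-cluster/ci/autoscaling-values.yaml (layout assumed)
nodePools:
  autoscaling:              # node pool name used for the CI case
    flavor: standard.2.4096
    replicas:
      min: 1                # lower bound for the autoscaler
      max: 3                # upper bound for the autoscaler
```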

charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml (1)

1-1: YAML lint error is a false positive.

The YAMLlint error about syntax is a false positive caused by Helm template syntax not being recognized by the linter. The Go template conditional is valid Helm syntax once the boolean negation syntax is corrected (see above).

charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml (1)

7-7: Clarify the intent of the required constraint with placeholder fallback.

The required "flavor is required" constraint is combined with a ternary that provides "compute-plane-placeholder" as a fallback for non-control-plane machines. This means:

  • For control-plane: The required check validates .Values.controlPlane.flavor (intended behavior).
  • For non-control-plane: The placeholder value always satisfies the required check, making it ineffective.

If the intention is to only validate control-plane flavor, this is working correctly. However, if all machine templates should have a valid flavor, the placeholder approach should be reconsidered.
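
The behaviour described here can be reproduced with a short Helm expression. This is a sketch only: the condition `eq .name "control-plane"` and the `.name` context key are hypothetical stand-ins for however the template distinguishes control-plane machines, while `ternary` (Sprig) and `required` (Helm) are real template functions.

```yaml
# ternary returns its first argument when the condition is true, else the second.
# required fails rendering only when the piped value is empty.
flavor: {{ ternary .Values.controlPlane.flavor "compute-plane-placeholder" (eq .name "control-plane") | required "flavor is required" }}
```

With this shape, an unset `.Values.controlPlane.flavor` aborts rendering for control-plane machines, while every other machine template receives the non-empty placeholder and therefore always passes the check.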

charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml (1)

4-11: LGTM! Simplification aligns with hosted control plane migration.

The removal of host-based conditional logic and the switch to always use KubeadmConfigTemplate with a fixed spec path simplifies the template and aligns with the PR's objective to standardize on hosted control planes.

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1)

1-11: LGTM! Proper implementation of hosted control plane template.

The template is correctly gated by .Values.controlPlane.hosted and follows the immutable template pattern using a spec hash suffix. The YAML structure properly references the spec definition with normalization via fromYaml | toYaml.

Note: The YAMLlint syntax error is a false positive—it doesn't understand Helm template delimiters.
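
A template following the pattern described above might look like the sketch below. The helper names are the ones mentioned in the walkthrough; the `spec.template.spec` nesting, the name format, and the arguments passed to the helpers are assumptions and may differ from the actual file.

```yaml
{{- if .Values.controlPlane.hosted }}
apiVersion: controlplane.cluster.x-k8s.io/v1alpha1
kind: HostedControlPlaneTemplate
metadata:
  # The hash suffix changes whenever the rendered spec changes, so a spec
  # update produces a new object instead of mutating the existing template.
  name: {{ printf "%s-%s" $.Release.Name (include "t8s-cluster.clusterClass.hostedControlPlaneTemplate.specHash" $) }}
  namespace: {{ $.Release.Namespace }}
spec:
  template:
    # fromYaml | toYaml round-trips the helper output to normalize formatting.
    spec: {{- include "t8s-cluster.clusterClass.hostedControlPlaneTemplate.spec" $ | fromYaml | toYaml | nindent 6 }}
{{- end }}
```

Passing the root context `$` keeps `.Files` and `.Release` available inside the included helpers, which is consistent with the earlier remark about `$.Release.Name`.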

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl (1)

1-5: LGTM! Consistent renaming for hosted control plane.

The helper has been properly renamed from k0smotronControlPlaneTemplate to hostedControlPlaneTemplate, with all internal references updated accordingly. This aligns with the broader migration to hosted control planes.

charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (3)

32-39: LGTM! Proper API version and kind selection for hosted vs non-hosted control planes.

The dynamic selection between:

  • controlplane.cluster.x-k8s.io/v1alpha1 with HostedControlPlaneTemplate for hosted
  • controlplane.cluster.x-k8s.io/v1beta1 with KubeadmControlPlaneTemplate for non-hosted

correctly aligns with CAPI conventions and the PR's migration strategy.
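
A minimal sketch of that selection, assuming the reference lives under `spec.controlPlane.ref` as in a v1beta1 ClusterClass and using Sprig's `ternary` as the comment on line 32 suggests (the `name` field is left out because its exact construction is not shown in this review):

```yaml
spec:
  controlPlane:
    ref:
      # ternary picks the first string when hosted is true, else the second.
      apiVersion: controlplane.cluster.x-k8s.io/{{ .Values.controlPlane.hosted | ternary "v1alpha1" "v1beta1" }}
      kind: {{ .Values.controlPlane.hosted | ternary "HostedControlPlaneTemplate" "KubeadmControlPlaneTemplate" }}
      # name: omitted here; per the review it references the spec hash of the selected template
```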


168-169: LGTM! Bootstrap template simplification.

The removal of conditional logic to always use KubeadmConfigTemplate is consistent with the broader hosted control plane migration and matches the changes in _bootstrapConfigTemplate.yaml.


150-151: Based on my verification of the CAPI ClusterClass schema and the codebase patterns:

Kubernetes metav1.Duration fields are serialized in YAML as strings formatted according to Go's time.ParseDuration function. The unquoted format (8m, 15m) used at lines 150-151 is valid and consistent with how duration values are handled throughout the codebase—all other timeout fields use the same unquoted format (600s, 5m, 10m0s, etc.).

The change from quoted format ("8m") to unquoted format (8m) aligns with standard Kubernetes and CAPI practices. No formatting correction is needed.
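
For illustration, the two fields would sit on a worker class roughly as follows. The class name is hypothetical; `nodeDrainTimeout` and `nodeDeletionTimeout` are the v1beta1 ClusterClass fields named in the walkthrough.

```yaml
spec:
  workers:
    machineDeployments:
      - class: example-workers      # hypothetical class name
        # Unquoted 8m and 15m are still parsed as YAML strings and then
        # decoded as Go durations by the API server.
        nodeDrainTimeout: 8m
        nodeDeletionTimeout: 15m
```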

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (4)

21-27: Review hardcoded namespace in audit webhook configuration.

The audit webhook configuration uses a hardcoded namespace:

  • secretNamespace: capi-hosted-control-plane-system (line 27)
  • server URL includes dynamic namespace: .Release.Namespace (line 24)

While the webhook server URL dynamically uses the release namespace, the authentication secret namespace is hardcoded. This could cause issues if:

  • Deployments use a different namespace convention
  • Multi-tenancy requires namespace isolation

Consider making the secret namespace configurable via .Values or aligning it with .Release.Namespace if appropriate.


28-59: LGTM! Well-structured audit policy.

The audit policy is comprehensive and follows best practices:

  • Filters out high-volume, low-value events (leases, events)
  • Reduces noise from system components (controller-manager, scheduler, apiserver)
  • Logs metadata for secrets (not full content)
  • Captures mutating operations at Metadata level
  • Uses appropriate omitStages to avoid duplicate RequestReceived events

68-70: Review hardcoded gateway namespace.

The gateway configuration uses a hardcoded namespace:

namespace: capi-hosted-control-plane-system

Similar to the audit webhook secret namespace, this may need to be configurable for deployments that don't follow this namespace convention. Consider parameterizing this via .Values for flexibility.


7-20: LGTM! Proper API server deployment configuration.

The API server configuration correctly:

  • Mounts config from a dynamically named ConfigMap
  • Aggregates static and dynamic files with validation
  • References args and resources via includes
  • Uses proper path resolution

The file aggregation logic with mustMerge and required ensures all files have the necessary fileName attribute.

charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (3)

131-134: Verify removed API server arguments are handled by HCP.

The authorization-always-allow-paths and bind-address arguments have been removed from the shared configuration. Ensure that:

  1. The hosted control plane (HCP) handles these configurations appropriately
  2. Authorization bypass paths (if needed) are configured at the HCP level
  3. API server binding is correctly managed by the HCP infrastructure

199-222: Audit logging now lives in the hosted control plane template; no action needed.

The audit-config references removed here are replaced by the audit webhook and audit policy defined in hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (lines 21-59), so the chart still configures audit logging. The comparison with managed services (GKE, EKS, AKS), which expose control-plane audit logs through the provider's own logging system, does not apply here: the HCP in this PR is configured through chart templates, not through a cloud provider's console. The removal from the shared helpers is therefore correct as long as the hosted template's audit settings are in place.


48-48: The review comment is incorrect and based on a misunderstanding of gpu-operator behavior.

The design is actually correct: NVIDIA gpu-operator (v1.7+) automatically creates the nvidia RuntimeClass and can create runtime classes like nvidia-cdi and nvidia-legacy.

The default_runtime_name = "runc" should remain as-is—it is intentionally not set to "nvidia" because:

  • Only GPU workloads should use the nvidia runtime
  • GPU pods request resources (e.g., nvidia.com/gpu), and the gpu-operator/device-plugin automatically handles runtime selection
  • Setting the default globally to "nvidia" would incorrectly affect all pods, breaking non-GPU workloads

The conditional NVIDIA runtime configuration (lines 57–63) correctly uses the standard /usr/local/nvidia/toolkit/nvidia-container-runtime path, which the gpu-operator provisions on GPU-enabled nodes. No manual runtime class configuration or explicit pod-level runtimeClassName specifications are needed—the gpu-operator handles this automatically.

Likely an incorrect or invalid review comment.

@cwrau cwrau force-pushed the feat/t8s-cluster/switch-to-hcp branch 2 times, most recently from 68ba1cd to b19c4ef Compare October 22, 2025 14:26
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (4)
charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml (2)

1-1: Fix Go template syntax for boolean negation.

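The conditional passes .Release.IsUpgrade and a literal false as separate arguments to and. The template parses, but and returns its first empty argument or its last argument, so with a trailing false the condition can never be true and the Job would never render. Use the not function to check that an upgrade is NOT occurring.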

-{{- if and (eq (include "t8s-cluster.cni" .) "cilium") .Release.IsUpgrade false -}}
+{{- if and (eq (include "t8s-cluster.cni" .) "cilium") (not .Release.IsUpgrade) -}}

45-67: Resolve namespace mismatch between test and cleanup.

The connectivity test runs with --test-namespace=cilium-test (line 55), but cleanup targets cilium-test-1 (line 67). This will leave the test namespace orphaned and attempt to delete a non-existent namespace.

Ensure both reference the same namespace:

-            - cilium-test-1
+            - cilium-test

Alternatively, if cilium-test-1 is correct, update line 55 instead:

-            - --test-namespace=cilium-test
+            - --test-namespace=cilium-test-1
charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl (1)

4-4: The Kubernetes version check is still missing.

As flagged in the previous review, the maxParallelImagePulls field requires Kubernetes 1.27+. Without a version check, this patch will cause errors on older clusters.
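A version guard could look roughly like the fragment below. It is a sketch under assumptions: `$kubernetesVersion` is a placeholder for however the chart exposes the cluster version, and the value 4 is illustrative, not taken from the patch. `semverCompare` is the Sprig function that Helm ships.

```yaml
{{- /* Placeholder: the chart would supply its real version value here. */}}
{{- $kubernetesVersion := "1.29.3" }}
{{- if semverCompare ">=1.27.0-0" $kubernetesVersion }}
# Only emitted for Kubernetes 1.27 and newer.
maxParallelImagePulls: 4 # illustrative value
{{- end }}
```

The `-0` suffix in the constraint lets pre-release versions of 1.27 and later match as well.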

charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (1)

150-151: Quote the duration values for clarity.

Duration strings like 8m and 15m should be quoted to ensure consistent parsing and match Kubernetes conventions.

Apply this diff:

-        nodeDrainTimeout: 8m
-        nodeDeletionTimeout: 15m
+        nodeDrainTimeout: "8m"
+        nodeDeletionTimeout: "15m"
🧹 Nitpick comments (1)
charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (1)

24-24: Minor: Add space before closing braces for consistency.

While this doesn't affect parsing, add a space before }} to match the formatting on line 69.

Apply this diff:

-          - server: https://k8s.master.wazuh.teuto.net/{{ .Release.Namespace}}/{{ .Release.Name }}
+          - server: https://k8s.master.wazuh.teuto.net/{{ .Release.Namespace }}/{{ .Release.Name }}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d313fc and b19c4ef.

📒 Files selected for processing (18)
  • charts/t8s-cluster/ci/autoscaling-values.yaml (1 hunks)
  • charts/t8s-cluster/files/audit-config.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (3 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml (0 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/patches/_kubelet.tpl (1 hunks)
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml (0 hunks)
  • charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (1 hunks)
  • charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml (1 hunks)
  • charts/t8s-cluster/values.schema.json (0 hunks)
💤 Files with no reviewable changes (7)
  • charts/t8s-cluster/templates/workload-cluster/cinder-csi-plugin/cinder-csi-plugin.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/_k0smotronControlPlaneTemplateSpec.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_k0smotronConfigTemplateSpec.yaml
  • charts/t8s-cluster/templates/management-cluster/etcd-defrag.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/k0smotronControlPlaneTemplate/k0smotronControlPlaneTemplate.yaml
  • charts/t8s-cluster/values.schema.json
  • charts/t8s-cluster/files/audit-config.yaml
🚧 Files skipped from review as they are similar to previous changes (3)
  • charts/t8s-cluster/templates/management-cluster/clusterClass/openStackMachineTemplates/_openstackMachineTemplateSpec.yaml
  • charts/t8s-cluster/ci/autoscaling-values.yaml
  • charts/t8s-cluster/templates/management-cluster/clusterClass/_helpers.tpl
🧰 Additional context used
🪛 YAMLlint (1.37.1)
charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml

[error] 6-6: syntax error: expected '', but found '{'

(syntax)

charts/t8s-cluster/templates/workload-cluster/tests/cni-cilium.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml

[error] 1-1: syntax error: expected the node content, but found '-'

(syntax)

🔇 Additional comments (10)
charts/t8s-cluster/templates/management-cluster/clusterClass/bootstrapConfigTemplate/_bootstrapConfigTemplate.yaml (1)

4-11: LGTM! Bootstrap template simplified.

The hardcoded KubeadmConfigTemplate approach is cleaner and aligns with the hosted control plane migration.

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/hostedControlPlaneTemplate.yaml (1)

1-11: LGTM! Hosted control plane template structure is sound.

The conditional rendering, spec hash generation, and YAML normalization are all correctly implemented.

charts/t8s-cluster/templates/management-cluster/clusterClass/clusterClass.yaml (2)

32-39: LGTM! Dynamic control plane apiVersion correctly implemented.

The ternary logic properly selects between v1alpha1 for hosted and v1beta1 for standard control planes, with appropriate template references.


168-169: LGTM! Bootstrap template reference simplified.

Hardcoding KubeadmConfigTemplate is consistent with the HCP migration and removes unnecessary conditional complexity.

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_helpers.tpl (1)

1-5: LGTM! Template rename aligns with HCP migration.

The rename from k0smotronControlPlaneTemplate to hostedControlPlaneTemplate is consistent with the broader architectural change.

charts/t8s-cluster/templates/management-cluster/clusterClass/hostedControlPlaneTemplate/_hostedControlPlaneTemplateSpec.yaml (3)

21-27: Verify the hardcoded audit webhook endpoint.

The audit webhook target uses a hardcoded URL https://k8s.master.wazuh.teuto.net/{{ .Release.Namespace}}/{{ .Release.Name }}. Ensure this external dependency is appropriate and that the endpoint will be available for all hosted control planes.


28-59: LGTM! Audit policy is well-structured.

The audit rules appropriately filter system noise while capturing meaningful events at the Metadata level for secrets and mutations.


60-70: LGTM! Component configurations are appropriate.

The controller manager, scheduler, and gateway configurations are correctly structured for a hosted control plane deployment.

charts/t8s-cluster/templates/workload-cluster/cloud-controller-manager.yaml (2)

63-70: Toleration structure and hosted behavior.

The tolerations are correctly structured as a list with two entries. The change from a minor-version conditional to a hosted conditional is appropriate for the hosted control plane model. However, verify that non-hosted deployments have access to necessary tolerations elsewhere, or confirm that running without these specific tolerations is acceptable for non-hosted scenarios.


58-59: Verify minimum Kubernetes version compatibility for hosted control plane patches.

The op: remove operation at line 59 unconditionally removes /spec/template/spec/nodeSelector for all hosted control planes. According to RFC 6902, the remove operation requires the target path to exist; if it doesn't, the patch will fail.

The dynamic version selection openstack-cloud-controller-manager 2.${minorVersion}.x can map to older chart versions (e.g., 2.28.x for Kubernetes 1.28). The enriched summary notes that the conditional guard previously restricting this to minorVersion > 28 was removed. Confirm that all supported chart versions (especially pre-1.29) include nodeSelector on the DaemonSet before this change is deployed, or add version-conditional logic to only apply the patch for compatible chart versions.
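If version-conditional logic is the chosen route, it could be sketched as follows. `$hosted` and `$minorVersion` are placeholders defined inline so the fragment renders on its own; the chart would substitute the values it already computes, and the threshold of 28 mirrors the guard that was removed.

```yaml
{{- /* Placeholder definitions; not the chart's actual variable names. */}}
{{- $hosted := true }}
{{- $minorVersion := 29 }}
{{- if and $hosted (gt (int $minorVersion) 28) }}
# Remove the nodeSelector only where the target chart version is known to set one.
- op: remove
  path: /spec/template/spec/nodeSelector
{{- end }}
```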

@cwrau cwrau force-pushed the feat/t8s-cluster/switch-to-hcp branch from b19c4ef to aa16abe Compare October 24, 2025 09:51
@cwrau cwrau added this pull request to the merge queue Nov 20, 2025
Merged via the queue into main with commit 303b0b6 Nov 20, 2025
32 checks passed
@cwrau cwrau deleted the feat/t8s-cluster/switch-to-hcp branch November 20, 2025 11:43
github-merge-queue bot pushed a commit that referenced this pull request Jan 15, 2026
🤖 I have created a release *beep* *boop*
---


## [9.5.0](t8s-cluster-v9.4.1...t8s-cluster-v9.5.0) (2026-01-15)


### Features

* **t8s-cluster/artifacthub:** use centralised helmRepositories template
([#1846](#1846))
([73a41f9](73a41f9))
* **t8s-cluster/cilium:** enable kubeProxy replacement
([#1815](#1815))
([b3c412d](b3c412d))
* **t8s-cluster/management-cluster:** add cluster-autoscaler deployment
([#1756](#1756))
([5b6ead9](5b6ead9))
* **t8s-cluster/management-cluster:** enable ImageVolume feature flag
([#1786](#1786))
([9676ee0](9676ee0))
* **t8s-cluster/management-cluster:** set apiServerLoadBalancer.provider
via TeutonetesCloud
([#1898](#1898))
([6bf8889](6bf8889))
* **t8s-cluster/management-cluster:** switch to hcp
([#1759](#1759))
([303b0b6](303b0b6))
* **t8s-cluster/management-cluster:** use new
KubeletEnsureSecretPulledImages feature gate
([#1858](#1858))
([40d7bef](40d7bef))
* **t8s-cluster:** migrate to CAPI v1beta2
([#1685](#1685))
([dc5f071](dc5f071))


### Bug Fixes

* **t8s-cluster/autoscaler:** these names are inside the workload
cluster
([#1877](#1877))
([f345cea](f345cea))
* **t8s-cluster/management-cluster:** leave out protocol if `nil`
([#1837](#1837))
([f370dac](f370dac))
* **t8s-cluster:** only allow nodePools with valid k8s names
([#1851](#1851))
([b9431c5](b9431c5))


### Miscellaneous Chores

* **t8s-cluster/dependencies:** update common docker tag to v1.6.0
([#1811](#1811))
([b3b4c94](b3b4c94))
* **t8s-cluster/dependencies:** update common docker tag to v1.7.0
([#1873](#1873))
([71e062f](71e062f))
* **t8s-cluster/dependencies:** update helm release cilium to v1.18.6
([#1894](#1894))
([e1adc88](e1adc88))
* **t8s-cluster/dependencies:** update helm release cluster-autoscaler
to v9.53.0
([#1856](#1856))
([dc67fcd](dc67fcd))
* **t8s-cluster/dependencies:** update helm release
openstack-cloud-controller-manager to v2.34.1
([#1553](#1553))
([e984d19](e984d19))
* **t8s-cluster/dependencies:** update registry.k8s.io/etcd docker tag
to v3.5.24
([#1793](#1793))
([a5098e3](a5098e3))
* **t8s-cluster/dependencies:** update registry.k8s.io/etcd docker tag
to v3.6.6
([#1813](#1813))
([e07ffa7](e07ffa7))
* **t8s-cluster/dependencies:** update registry.k8s.io/etcd docker tag
to v3.6.7
([#1895](#1895))
([cf1d3b4](cf1d3b4))
* **t8s-cluster/flux:** use centralised HelmRepositories instead of
per-instance
([#1758](#1758))
([3deff65](3deff65))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions <41898282+github-actions[bot]@users.noreply.github.com>