
Conversation

@Karthik-K-N
Contributor

What this PR does / why we need it:

This PR adds scale from 0 support for CAPD

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #12505

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area PR is missing an area label labels Aug 1, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 1, 2025
@Karthik-K-N
Contributor Author

/hold for testing

During initial testing with the autoscaler and the CAPI v1beta2 APIs, it seems the autoscaler needs to be updated for CAPI's change from apiVersion to apiGroup in v1beta2. Will update more based on the test progress.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 1, 2025
@Karthik-K-N
Contributor Author

/cc @sbueringer

@k8s-ci-robot k8s-ci-robot requested a review from sbueringer August 1, 2025 14:49
@sbueringer
Member

sbueringer commented Aug 1, 2025

During initial testing with the autoscaler and the CAPI v1beta2 APIs, it seems the autoscaler needs to be updated for CAPI's change from apiVersion to apiGroup in v1beta2. Will update more based on the test progress.

This should work for now as we are pinning the CAPI_VERSION here:

(until autoscaler is adjusted)

There is this other issue here though: kubernetes/autoscaler#7908 (comment)

So overall, this should work at the moment with autoscaler v1.32.1.

We have to make the following changes to get the corresponding test coverage.

AutoscalerVersion: "v1.33.0",

Should be

AutoscalerVersion:                     "v1.32.1",
ScaleToAndFromZero:                    true,

(Let's make that change in this PR, it's fine to downgrade for now until kubernetes/autoscaler#7908 (comment) is fixed)

(autoscaler test is part of pull-cluster-api-e2e-main)

@sbueringer sbueringer added the area/provider/infrastructure-docker Issues or PRs related to the docker infrastructure provider label Aug 1, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/needs-area PR is missing an area label label Aug 1, 2025
@elmiko
Contributor

elmiko commented Aug 1, 2025

I think the approach here looks solid, although one thing I would want to double-check is that the system resources reported by the container runtime are correct. In the past, I have seen the inside-container resources looking very similar to the system-wide resources, and this can cause issues when multiple containers are started as kubelets with the same resource capacity as the host.

I know Docker and Podman allow limiting the resource capacity of a container, but I haven't seen it working as I would expect with Kubernetes.

@Karthik-K-N
Contributor Author

I think the approach here looks solid, although one thing I would want to double-check is that the system resources reported by the container runtime are correct. In the past, I have seen the inside-container resources looking very similar to the system-wide resources, and this can cause issues when multiple containers are started as kubelets with the same resource capacity as the host.

I know Docker and Podman allow limiting the resource capacity of a container, but I haven't seen it working as I would expect with Kubernetes.

Hey, thanks for the feedback. I will update based on my observations during testing. But do you recommend any other way of fetching system resources apart from using the runtime?

@sbueringer
Member

sbueringer commented Aug 4, 2025

The current behavior is expected in my opinion.

If you run CAPD and look at Node allocatable resources etc., it will show the entire host's resources for every single Node. We want the capacity information on the DockerMachineTemplate to match that.

We do not want to introduce limiting/reserving of CAPD Machine memory/CPU as part of enabling CAPD for autoscaling from/to 0.

Let's please also not forget that CAPD exists for the sole purpose of testing core CAPI. If we started enforcing memory reservations/limits on CAPD containers so the actual available resources are perfectly split up, we would not be able to run our e2e tests anymore (where we run a huge number of basically empty CAPD Machines at the same time).
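To make the point above concrete, here is a minimal sketch (hypothetical types and made-up numbers, not CAPD's actual code) of the behavior being described: the scale-from-zero capacity simply mirrors whatever the Node reports as allocatable, with no limiting or reserving introduced.

```go
package main

import "fmt"

// NodeResources is a hypothetical stand-in for what a CAPD Node reports as
// allocatable; the values below are illustrative, not from a real cluster.
type NodeResources struct {
	CPUMilli    int64
	MemoryBytes int64
}

// capacityForTemplate returns the allocatable resources unchanged, matching
// the behavior described above: the template capacity mirrors the Node.
func capacityForTemplate(allocatable NodeResources) NodeResources {
	return allocatable
}

func main() {
	host := NodeResources{CPUMilli: 8000, MemoryBytes: 32 << 30}
	capacity := capacityForTemplate(host)
	fmt.Println(capacity.CPUMilli, capacity.MemoryBytes) // prints: 8000 34359738368
}
```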

@sbueringer
Member

/test pull-cluster-api-e2e-main

(to run the autoscaler test)

@Karthik-K-N
Contributor Author

A general question I wanted to record before I miss it: what's the CAPI-recommended way of logging a CRD name in logs?
Say, for example, we have DockerMachineTemplate; how should that be logged?

log.Info("Calculating capacity for Docker Machine Template")

or

log.Info("Calculating capacity for DockerMachineTemplate")

or

log.Info("Calculating capacity for docker machine template")

@sbueringer
Member

The second option

@Karthik-K-N
Contributor Author

The second option

Thanks. Should that be recorded here: https://cluster-api.sigs.k8s.io/developer/core/logging#log-messages?

@sbueringer
Member

The second option

Thanks. Should that be recorded here: cluster-api.sigs.k8s.io/developer/core/logging#log-messages?

Makes sense. Feel free to open a PR. It should be something along the lines of: "If kinds are mentioned in log messages, always use the literal kind." Maybe with an example.

@Karthik-K-N
Contributor Author

Will debug and update on the failure; will try to run locally. But from the autoscaler logs I could see:

I0804 08:29:28.274247       1 actuator.go:166] Scale-down: removing empty node "autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr"
I0804 08:29:28.274883       1 actuator.go:286] Scale-down: waiting 5s before trying to delete nodes
W0804 08:29:33.288013       1 warnings.go:70] cluster.x-k8s.io/v1beta1 Machine is deprecated; use cluster.x-k8s.io/v1beta2 Machine
W0804 08:29:33.343132       1 warnings.go:70] cluster.x-k8s.io/v1beta1 Machine is deprecated; use cluster.x-k8s.io/v1beta2 Machine
W0804 08:29:39.092767       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:29:39.111636       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:29:50.702257       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:29:50.720390       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:30:02.303995       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:30:02.320868       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:30:13.914242       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:30:13.933645       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:30:25.519404       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:30:25.537134       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:30:37.127578       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:30:37.147262       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:30:48.930771       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:30:48.947725       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
W0804 08:31:00.743607       1 clusterstate.go:657] Nodegroup is nil for docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr
W0804 08:31:00.765473       1 static_autoscaler.go:821] No node group for node docker:////autoscaler-d4p5q9-md-0-gqk4j-gv24j-h5gcr, skipping
I0804 08:31:02.124780       1 actuator.go:166] Scale-down: removing empty node "autoscaler-d4p5q9-md-0-gqk4j-gv24j-jhdhf"
I0804 08:31:02.125125       1 actuator.go:166] Scale-down: removing empty node "autoscaler-d4p5q9-md-0-gqk4j-gv24j-xkjcp"
I0804 08:31:02.125546       1 actuator.go:286] Scale-down: waiting 5s before trying to delete nodes
I0804 08:31:02.136484       1 actuator.go:259] Scale-down: removing node autoscaler-d4p5q9-md-0-gqk4j-gv24j-c9mx7, utilization: {0.00625 0.00039140503889954515 0 0 cpu 0.00625}, pods to reschedule: cluster-autoscaler-7cc5775ccc-6x94r
I0804 08:31:02.136816       1 actuator.go:286] Scale-down: waiting 5s before trying to delete nodes
W0804 08:31:07.139129       1 warnings.go:70] cluster.x-k8s.io/v1beta1 Machine is deprecated; use cluster.x-k8s.io/v1beta2 Machine

I hope that the autoscaler did not remove the scaled-up nodes; just trying to better read the logs in the test artifacts.

@sbueringer
Member

Hm yeah. Who knows what the problem is. In CAPV we are running the autoscaler on the management cluster for test setup reasons. So there might be an additional problem when running the autoscaler on the workload cluster that we weren't aware of yet.

@elmiko
Contributor

elmiko commented Aug 4, 2025

Hey, thanks for the feedback. I will update based on my observations during testing. But do you recommend any other way of fetching system resources apart from using the runtime?

Not specifically; I more wanted to highlight the behavior. I absolutely defer to @sbueringer though, and this makes sense to me:

We do not want to introduce limiting/reserving CAPD Machine memory/CPU as part of enabling CAPD for autoscaling from/to 0.

on the autoscaler failure

Will debug and update on the failure; will try to run locally. But from the autoscaler logs I could see:

It looks like the autoscaler removed the nodes due to them being underutilized. This is where knowing the capacity of nodes created on CAPD will be important. By default, the autoscaler will want to remove nodes that are below 50% resource utilization, as calculated by comparing the requests of the pods on the node against the allocatable and capacity resources of the node. So, if the node is taking its resource capacity from the host, it could be much larger than expected, and we need to account for that either by lowering the threshold for removal or by adjusting our test workloads to use higher requests.
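The utilization rule described above can be sketched numerically. This is an illustrative restatement with made-up numbers, not values from the actual test cluster:

```go
package main

import "fmt"

// utilization is the ratio described above: the sum of pod requests on a
// node divided by the node's allocatable amount of that resource.
func utilization(sumRequestsMilli, allocatableMilli int64) float64 {
	return float64(sumRequestsMilli) / float64(allocatableMilli)
}

func main() {
	// If a CAPD node advertises the whole host's 8 CPUs while the pods on it
	// only request 200m, the node sits at 2.5% utilization, far below the
	// default 50% scale-down threshold, so the autoscaler would remove it.
	u := utilization(200, 8000)
	fmt.Printf("%.3f below-threshold=%v\n", u, u < 0.5) // prints: 0.025 below-threshold=true
}
```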

@sbueringer
Member

sbueringer commented Aug 4, 2025

Just for context. We are already testing CAPD with the autoscaler. The only part we did not test before was scale from/to 0.

The way this works today in CAPD is that the Node comes up with some fantasy number for memory and sets it on Node.status.Allocatable.Memory. We then create a Deployment that takes 60% of that fantasy number and sets it as requested memory:

memoryRequired := int64(float64(memory.Value()) * 0.6)

So my assumption would be that this should continue to work

(In reality neither the Deployment nor the Machine will use anything close to the memory values we see)

@sbueringer
Member

sbueringer commented Aug 4, 2025

The test currently fails at the following point:

  • Scale that Deployment with the 60% requested memory down to 0
  • Checking the MachineDeployment finished scaling down to zero

So we should have no Pods of that Deployment anymore in the workload cluster, but the autoscaler currently does not scale the MD to 0, it stays at 3

@Karthik-K-N
Contributor Author

Just another update:

  1. Created a management cluster using Tilt with autoscaler support
  2. Created a ClusterClass from the Tilt UI and created a development cluster with 0 workers
  3. Edited the MachineDeployment and added the required annotations
  4. Created a workload with the following YAML:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: busybox
  name: busybox-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      containers:
        - command:
            - sh
            - -c
            - echo Container 1 is Running ; sleep 3600
          image: busybox
          imagePullPolicy: IfNotPresent
          name: busybox
          resources:
            requests:
              cpu: "0.2"
              memory: 3G

  5. Cloned autoscaler and checked out the cluster-autoscaler-1.32.1 branch
  6. Built the binary and started the autoscaler with the following args:
export CAPI_VERSION=v1beta1

./cluster-autoscaler \
--cloud-provider=clusterapi \
--v=5 \
--namespace=default \
--max-nodes-total=30 \
--scale-down-delay-after-add=100s \
--scale-down-delay-after-delete=10s \
--scale-down-delay-after-failure=10s \
--scale-down-unneeded-time=5m \
--max-node-provision-time=30m \
--balance-similar-node-groups \
--expander=random \
--kubeconfig=/root/karthik-workspace/cluster-api/cmd/clusterctl/workload.conf \
--cloud-config=/root/karthik-workspace/cluster-api/cmd/clusterctl/management.conf

Observing machines getting provisioned and deleted too quickly:

kubectl get machines -w
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Pending        0s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Pending        0s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Pending        0s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   0s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Provisioning   1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Deleting       1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Deleting       1s      v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080                                                                              Deleting       11s     v1.33.0
development-19080-md-0-qgr98-f7tn6-m9hpw   development-19080

@sbueringer sbueringer left a comment (Member)

Did a quick review, but I think I found nothing that explains the test failure. I'll take a look.

@sbueringer
Member

sbueringer commented Aug 5, 2025

@Karthik-K-N Probably this is the issue: #12572 (comment)

I used make tilt-up and then ran the autoscaler e2e test against it
(xref: https://cluster-api.sigs.k8s.io/developer/core/testing#test-execution-via-ide-1 including tips, not sure if it's 100% up-to-date)

@Karthik-K-N Karthik-K-N force-pushed the autoscale branch 2 times, most recently from 1ea99d3 to 1c0c9f1 Compare August 5, 2025 17:15
@sbueringer
Member

/test pull-cluster-api-e2e-main

@sbueringer sbueringer left a comment (Member)

@sbueringer
Member

/cherry-pick release-1.11

@k8s-infra-cherrypick-robot

@sbueringer: once the present PR merges, I will cherry-pick it on top of release-1.11 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@sbueringer sbueringer added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Aug 6, 2025
@sbueringer
Member

/test pull-cluster-api-e2e-main
/lgtm
/approve
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 6, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 6, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Details: Git tree hash: 3fa8f57fa8318a6e8cc39fb93bbae5f9797a78e1

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 6, 2025
@sbueringer
Member

@Karthik-K-N Thank you very much! I think this will help a lot with finding compatibility issues between Cluster API and cluster-autoscaler sooner.

@sbueringer sbueringer changed the title ✨ Add scale from 0 support for CAPD ✨ Add scale from/to 0 support for CAPD Aug 6, 2025
@k8s-ci-robot k8s-ci-robot merged commit bfb09ac into kubernetes-sigs:main Aug 6, 2025
24 of 25 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.11 milestone Aug 6, 2025
@k8s-infra-cherrypick-robot

@sbueringer: new pull request created: #12591

Details

In response to this:

/cherry-pick release-1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


Development

Successfully merging this pull request may close these issues.

Extend CAPD to support autoscale from/to 0

5 participants