feat: improve cse bootstrap latency by deferring non-critical work #8105

Merged
awesomenix merged 1 commit into main from nishp/tinyimprovements
Mar 25, 2026

Conversation

awesomenix commented Mar 16, 2026

Contributor

Summary

  • reduce the CSE bootstrap critical path by starting containerd/kubelet asynchronously and deferring non-critical work until after kubelet startup
  • add explicit containerd and kubelet readiness checks before running work that depends on them
  • update supporting tests and VHD builder scripts to match the new startup ordering and latency measurement flow

What changed

  • in cse_main.sh, skip container runtime installation on Azure Linux OS Guard unless an explicit containerd override is provided, pre-warm kubelet, move containerd ulimit configuration earlier, and defer ensureNoDupOnPromiscuBridge plus non-GPU cleanup until after ensureKubelet
  • in cse_config.sh, switch containerd startup to the non-blocking helper, wait for containerd before artifact streaming / pause image / GPU-driver work, load nf_conntrack, and record TLS bootstrap start time before kubelet so the latency service measures the full kubelet bootstrap window
  • in cse_helpers.sh and kubelet.service, add checkServiceHealth, waitForContainerdReady, and an ExecStartPre wait on the containerd socket so later work runs only after services are actually ready
  • in measure-tls-bootstrapping-latency.sh, emit the completed event using the recorded kubelet start timestamp even when kubeconfig already exists or is created during the race window before inotifywait starts listening
  • in cse_install.sh and packer scripts, move downloaded kubelet/kubectl binaries into place with normalized ownership/permissions, restart containerd where packer validation needs it, and disable containerd during VHD cleanup so images do not carry the service enabled unexpectedly
  • add an AzureLinux V3 ARM64 e2e scenario, tag the DCGM exporter compatibility scenario as GPU, and update the ShellSpec coverage for the TLS bootstrapping timing changes
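
The race-window fix described for measure-tls-bootstrapping-latency.sh can be illustrated with a check-before-wait pattern. The following is a minimal sketch, not the script from this PR: the function name, its arguments, and the file paths are assumptions, and the real script blocks on inotifywait where this sketch only reports "pending".

```bash
#!/usr/bin/env bash
# Hypothetical sketch: names and paths are illustrative, not taken from the PR.

# Report TLS bootstrap status using a start timestamp that was recorded
# before kubelet was started. Because elapsed time is computed from the
# recorded start, the result is the same whether the kubeconfig appeared
# before or after a watcher began listening.
#   $1: file holding the recorded start time (epoch seconds)
#   $2: kubeconfig path whose existence marks completion
#   $3: current time in epoch seconds (optional, defaults to now)
bootstrap_status() {
  local start_file=$1 kubeconfig=$2 now=${3:-$(date +%s)}
  if [ -f "$kubeconfig" ]; then
    # Covers the case where the file already exists on entry.
    echo "completed elapsed=$(( now - $(cat "$start_file") ))s"
  else
    # The real script would block on inotifywait at this point.
    echo "pending"
  fi
}
```

Checking for the file first means a kubeconfig created before the watcher starts still yields a completed event instead of waiting for a notification that never arrives.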

Timings

Before

  • CSE start: +0.000s
  • kubelet started: +25.000s
  • node registered: +26.270s
  • NodeReady: +26.486s

After

  • CSE start: +0.000s
  • ensureKubelet done: +6.489s
  • kubelet started: +7.000s
  • first kubelet log: +8.151s
  • node registered: +9.591s
  • NodeReady: +9.738s
  • CSE finish: +19.000s
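
For reference, the relative improvement implied by these numbers can be worked out from the before/after pairs listed above (kubelet started 18.000s earlier, NodeReady reached 16.748s earlier). A small sketch; the helper name pct_reduction is illustrative and not part of the PR:

```bash
#!/usr/bin/env bash
# Percentage reduction between a "before" and an "after" timing, in seconds.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

pct_reduction 25.000 7.000    # kubelet started   -> 72.0
pct_reduction 26.270 9.591    # node registered   -> 63.5
pct_reduction 26.486 9.738    # NodeReady         -> 63.2
```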

Copilot AI review requested due to automatic review settings March 16, 2026 23:37
awesomenix changed the title from "Improve CSE bootstrap latency by deferring non-critical work" to "feat: improve cse bootstrap latency by deferring non-critical work" Mar 16, 2026

Copilot AI left a comment

Contributor

Pull request overview

This PR reduces Linux CSE bootstrap critical-path work by deferring non-essential steps until after ensureKubelet, and updates generated pkg/agent/testdata snapshots to reflect the new CSE/custom data output.

Changes:

  • Defers ensureNoDupOnPromiscuBridge, enableLocalDNS, and non-GPU driver cleanup until after ensureKubelet in cse_main.sh.
  • Optimizes provisioning/runtime setup by switching kube binary activation to mv + chmod, and reloading only a targeted sysctl file instead of sysctl --system.
  • Updates VHD cleanup to disable containerd and regenerates pkg/agent/testdata CustomData snapshots.

Reviewed changes

Copilot reviewed 18 out of 75 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| vhdbuilder/packer/cleanup-vhd.sh | Disables containerd during VHD cleanup to avoid shipping images with it enabled. |
| parts/linux/cloud-init/artifacts/cse_main.sh | Defers some non-critical steps until after ensureKubelet; skips container runtime install for golden images/OSGuard. |
| parts/linux/cloud-init/artifacts/cse_install.sh | Changes kubelet/kubectl “activation” to mv + chmod to avoid redundant copy work. |
| parts/linux/cloud-init/artifacts/cse_config.sh | Uses targeted sysctl -p and starts kubelet before the TLS bootstrapping latency measurement service. |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Regenerated snapshot for updated CSE/custom data output. |
| pkg/agent/testdata/CustomizedImage/CustomData | Regenerated snapshot for updated CSE/custom data output. |

Comment thread vhdbuilder/packer/cleanup-vhd.sh Outdated
mv "/opt/bin/kubelet-${KUBERNETES_VERSION}" /opt/bin/kubelet
mv "/opt/bin/kubectl-${KUBERNETES_VERSION}" /opt/bin/kubectl

chmod a+x /opt/bin/kubelet /opt/bin/kubectl

Contributor Author

this was what was before, keeping it as is

Collaborator

Why this change? I don't understand: install was cleaner, but slower?

Collaborator

also, why not force the access level ?

Contributor

also curious about both

Contributor Author

install does a copy and not a move.

Operation: It copies the file to the destination. A key difference from cp is that install unlinks (removes) the destination file first if it already exists, which can prevent issues (like an ETXTBSY "Text file busy" error) when replacing a running executable.

Contributor Author

I kept the operation as it was before chewi made the change, to avoid a regression. I'm not sure whether it was better or worse, but it is guaranteed to work with no regression.

https://github.com/Azure/AgentBaker/pull/7125/changes#diff-ff0e92780b2c7f35348b62de54b815b2c9919cfd4f6612f43808aace9dc0a134R638

Contributor

Regression? My change was merged two months ago. There are important reasons to use install over cp, including the one stated above. There are cases where the destination will be an existing symlink, and it is crucial that we replace the symlink, not its target. mv will do that, but I can't remember if there was some other reason why I didn't stick with mv.

Contributor Author

@chewi would you be OK with moving it back to mv? It definitely seems faster in this case.
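
The symlink point raised in this thread can be checked in a scratch directory. The sketch below compares how cp, mv, and install treat a destination that is an existing symlink. File names are illustrative, and the install result shown assumes GNU coreutils; other implementations may differ.

```bash
#!/usr/bin/env bash
# Sketch: how cp, mv, and install treat a destination that is a symlink.
# Everything runs in a scratch directory; file names are illustrative.
work=$(mktemp -d)
cd "$work" || exit 1

setup_case() {
  rm -f target dest new
  echo old > target    # file the symlink points at
  ln -s target dest    # destination path is a symlink
  echo new > new       # stands in for the freshly downloaded binary
}

describe() {
  local kind=regular
  [ -L dest ] && kind=symlink
  echo "dest=$kind target=$(cat target)"
}

setup_case; cp new dest;              cp_result=$(describe)
setup_case; mv new dest;              mv_result=$(describe)
setup_case; install -m 0755 new dest; install_result=$(describe)

echo "cp:      $cp_result"       # dest=symlink target=new (wrote through the link)
echo "mv:      $mv_result"       # dest=regular target=old (replaced the link)
echo "install: $install_result"  # dest=regular target=old (GNU coreutils)
```

Both mv and install leave the symlink's old target untouched; the difference discussed above is that mv renames within the same filesystem instead of copying the data.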

Comment thread parts/linux/cloud-init/artifacts/cse_config.sh
Comment thread parts/linux/cloud-init/artifacts/cse_config.sh Outdated

Copilot AI left a comment

Contributor

Pull request overview

This PR aims to reduce Linux CSE bootstrap critical-path latency by deferring non-critical steps until after ensureKubelet, avoiding redundant work (targeted sysctl reload, moving kube binaries), and adjusting VHD build/runtime behaviors around containerd.

Changes:

  • Reorders CSE provisioning steps so kubelet starts earlier; starts kubelet before measure-tls-bootstrapping-latency.service.
  • Optimizes provisioning work (targeted sysctl -p, mv+chmod for kube binaries, skip runtime install when golden image already contains it).
  • Adjusts VHD build scripts/tests to ensure containerd is started when needed and disabled during image cleanup; regenerates pkg/agent/testdata.

Reviewed changes

Copilot reviewed 18 out of 77 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| vhdbuilder/packer/trivy-scan.sh | Sources provision helpers and ensures containerd is started before Trivy operations. |
| vhdbuilder/packer/test/linux-vhd-content-test.sh | Starts containerd before executing VHD validation tests. |
| vhdbuilder/packer/cleanup-vhd.sh | Disables containerd during VHD cleanup. |
| parts/linux/cloud-init/artifacts/cse_main.sh | Defers non-critical steps until after ensureKubelet; skips container runtime install on golden images. |
| parts/linux/cloud-init/artifacts/cse_install.sh | Uses mv + chmod when activating downloaded kubelet/kubectl. |
| parts/linux/cloud-init/artifacts/cse_config.sh | Uses targeted sysctl -p and starts kubelet before the TLS bootstrapping latency measurement service. |
| pkg/agent/testdata/MarinerV2+Kata/CustomData | Regenerated snapshot output for MarinerV2+Kata CustomData. |
| pkg/agent/testdata/CustomizedImage/CustomData | Regenerated snapshot output for CustomizedImage CustomData. |


Comment thread vhdbuilder/packer/test/linux-vhd-content-test.sh Outdated

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 23 out of 85 changed files in this pull request and generated 2 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_helpers.sh
Comment thread parts/linux/cloud-init/artifacts/cse_config.sh

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 1 comment.

Comment on lines 524 to +540
@@ -536,21 +537,37 @@ systemctlEnableAndStartNoBlock() {
systemctl status $service --no-pager -l > /var/log/azure/$service-status.log || true
return 1
fi
}

Copilot AI Mar 23, 2026


systemctlEnableAndStartNoBlock uses systemctl restart --no-block, which only enqueues the job and can return success even if the unit immediately transitions to failed afterward. ensureKubelet relies on this returning non-zero to detect startup failures, so this can mask real failures (kubelet/measure-tls/containerd). Consider adding a short post-start health check (e.g., systemctl is-failed/ActiveState after a delay) or reintroducing an optional delay parameter so callers can fail fast when the unit enters failed.
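
The post-start check suggested here could look roughly like the following. This is a hedged sketch, not the checkServiceHealth helper added in the PR: the function name and the delay parameter are assumptions, and only standard systemctl subcommands (enable, restart --no-block, is-failed, status) are used.

```bash
#!/usr/bin/env bash
# Hypothetical helper: not the PR's implementation.
# Start a unit without blocking, then confirm after a short delay that it has
# not entered the "failed" state.
#   $1: unit name
#   $2: seconds to wait before checking (optional, defaults to 1)
start_no_block_and_verify() {
  local service=$1 delay=${2:-1}
  systemctl enable "$service" || return 1
  # --no-block only enqueues the job, so success here does not prove the
  # unit actually started.
  systemctl restart --no-block "$service" || return 1
  sleep "$delay"
  # `systemctl is-failed` exits 0 when the unit IS in the failed state.
  if systemctl is-failed --quiet "$service"; then
    systemctl status "$service" --no-pager -l >&2 || true
    return 1
  fi
  return 0
}

# Example (requires systemd and root):
# start_no_block_and_verify kubelet 2 || echo "kubelet failed to start"
```

A fixed delay is a trade-off: it catches units that fail immediately but adds that delay to the critical path, which is why a separate readiness check placed just before the dependent work may be preferable.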

awesomenix force-pushed the nishp/tinyimprovements branch from fff45dc to b938c10 Compare March 23, 2026 23:15
Copilot AI review requested due to automatic review settings March 23, 2026 23:33
awesomenix force-pushed the nishp/tinyimprovements branch from b938c10 to e9962a7 Compare March 23, 2026 23:33

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment thread parts/linux/cloud-init/artifacts/cse_config.sh
Comment thread parts/linux/cloud-init/artifacts/cse_main.sh

Devinwong left a comment

Collaborator

Thanks for walking through the changes. LGTM

ExecStartPre=-/sbin/iptables -t nat --numeric --list

ExecStartPre=/bin/bash /opt/azure/containers/validate-kubelet-credentials.sh
ExecStartPre=/bin/sh -c 'until [ -S /run/containerd/containerd.sock ]; do sleep 0.1; done'

Contributor

is this to avoid kubelet going into a bad exponential back-off or something?
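
The ExecStartPre line above polls for the containerd socket with no bound of its own inside the loop. For comparison, a bounded variant of the same wait might look like the sketch below; the helper name wait_until and the attempt count are assumptions, not code from the PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or attempts run out.
#   $1: maximum number of attempts
#   $2: seconds to sleep between attempts
#   $3...: command to run
wait_until() {
  local max_attempts=$1 interval=$2
  shift 2
  local attempt=1
  until "$@"; do
    # Give up once the last allowed attempt has failed.
    [ "$attempt" -ge "$max_attempts" ] && return 1
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 0
}

# Example: wait roughly 30s (300 attempts x 0.1s) for the containerd socket.
# wait_until 300 0.1 test -S /run/containerd/containerd.sock
```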

#!/bin/bash -eux

systemctl daemon-reload
systemctl disable --now containerd

Contributor

nit: would prefer we do this in post-install-dependencies.sh. This script is also run by the image builder service after optimization is complete (it shouldn't break anything if this is also run by the image builder service, though I think it's cleaner to have this scoped to our build-specific logic).

exit $VALIDATION_ERR
fi

checkServiceHealth containerd || exit $ERR_SYSTEMCTL_START_FAIL

Contributor

nit: it would be better if we had specific exit codes here, that way we can always pinpoint exactly which call to checkServiceHealth is failing
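
One way to read this suggestion is a thin wrapper that takes the exit code as a parameter, so each call site fails with its own code. The sketch below is illustrative only: the wrapper name, the constant names, and the numeric values are placeholders, not codes defined in the repository. It relies on checkServiceHealth being callable with a unit name, as in the hunk above.

```bash
#!/usr/bin/env bash
# Hypothetical exit codes: names and values are placeholders, not the ones
# defined in the repository.
ERR_CONTAINERD_HEALTH_CHECK_FAIL=201
ERR_KUBELET_HEALTH_CHECK_FAIL=202

# Run the health check and exit with a call-site-specific code on failure.
#   $1: unit name passed to checkServiceHealth
#   $2: exit code to use if the check fails
check_or_exit() {
  local service=$1 exit_code=$2
  checkServiceHealth "$service" || exit "$exit_code"
}

# Example call sites:
# check_or_exit containerd "$ERR_CONTAINERD_HEALTH_CHECK_FAIL"
# check_or_exit kubelet    "$ERR_KUBELET_HEALTH_CHECK_FAIL"
```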

6 participants