Change cgroup driver to systemd #593
reegnz wants to merge 1 commit into awslabs:master from reegnz:change_cgroup_driver
Kubernetes documentation indicates that, for stability reasons, one should run Kubernetes with the systemd cgroup driver if the init system itself is systemd. See https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers Fixes #490
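As a rough illustration of the rule quoted above, the following sketch reports which cgroup driver would match the host's init system. The function name `recommended_cgroup_driver` and its optional file argument are assumptions made for this example (the argument exists only so the check can be exercised against a sample file); they are not part of this PR or the AMI.

```shell
#!/bin/sh
# Sketch only: pick the cgroup driver that matches the init system.
# Assumption: PID 1's command name is readable from /proc/1/comm (Linux).
# The optional first argument is a hypothetical override used for testing.
recommended_cgroup_driver() {
    comm_file="${1:-/proc/1/comm}"
    if [ -r "$comm_file" ] && [ "$(cat "$comm_file")" = "systemd" ]; then
        # systemd is the init system, so the systemd driver is the suggested match
        echo "systemd"
    else
        # otherwise fall back to the cgroupfs driver
        echo "cgroupfs"
    fi
}

# Print the suggestion for the current host.
recommended_cgroup_driver
```

On a host where systemd is PID 1 this would suggest `systemd`; it does not inspect what kubelet or the container runtime are actually configured to use.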
|
#587 reverted this change earlier. Now that eksctl has been made compatible with this change in eksctl-io/eksctl#2962, it should be OK to merge again. |
|
This is another run at #490 |
|
Hi, now that eksctl has released a change, we're looking into how to make this release possible. There's still a condition where, if we release an AMI with this change and a user is using an older version of eksctl, this could still lead to failures. We will update this issue with more details once we have a plan to pick up this change. |
|
@abeer91 I do understand the concern of breaking downstream tooling, so I might look into how that could be handled, although I can't promise anything (thinking about maybe using a path unit to keep the kubelet config in sync with the How long do you think we should wait for the newer eksctl to be adopted? |
|
Hi, upon cluster creation with GPU instances I could not fetch nodes with
Will the above fix solve this issue? @abeer91, when can we expect this fix? |
|
@reegnz you might want to add systemd cgroup support to containerd in this PR now it's been released. @abeer91 if this is still being blocked due to concerns about |
|
note: systemd cgroup support to containerd is not correct. We are in runc v2 not in v1
|
|
@josephprem I can't find any official documentation; the best I found is containerd/containerd#4203 (specifically containerd/containerd#4203 (comment)). |
Yes it is:
cat /etc/containerd/config.toml
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
[grpc]
address = "/run/dockershim.sock"
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"
|
@josephprem I re-read the config and that's why I struck the comment through about v2 being enabled. I'm still interested if you have any better documentation than the comment I found and linked above. |
https://github.com/containerd/cri/blob/master/docs/config.md |
|
I have added this in my user-data:
conf=/etc/eks/containerd/containerd-config.toml
# EKS bootstrap script moves $conf to /etc/eks/containerd/config.toml
section="\[plugins.\"io.containerd.grpc.v1.cri\".containerd.runtimes.runc.options\]"
key="SystemdCgroup"
grep -q $section $conf || sed -i "/^runtime_type.*/a $section" $conf
grep -q $key $conf || sed -i "/$section.*/a $key = true" $conf
Followed by enabling --cgroup-driver=systemd in kubelet (note the service file patched):
sed -i 's/KUBELET_EXTRA_ARGS/KUBELET_EXTRA_ARGS $EXTENDED_KUBELET_ARGS/' /etc/eks/containerd/kubelet-containerd.service
cat << EOF > /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf
[Service]
Environment='EXTENDED_KUBELET_ARGS=--cgroup-driver=systemd'
EOF
systemctl daemon-reload
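Since the user-data above touches two separate files, a quick consistency check may help catch a node where only one side was switched. The sketch below is a hypothetical helper (`cgroup_drivers_agree` is not an existing tool); it only greps for the two settings mentioned in this comment and assumes the file paths are passed in by the caller.

```shell
#!/bin/sh
# Sketch only: succeed when both containerd and kubelet ask for the systemd
# cgroup driver. Hypothetical helper, not part of the AMI or of eksctl.
#   $1: containerd config file (TOML)
#   $2: kubelet systemd drop-in file
cgroup_drivers_agree() {
    containerd_conf="$1"
    kubelet_dropin="$2"

    # containerd side: expect "SystemdCgroup = true" (leading whitespace allowed)
    grep -Eq '^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*true' "$containerd_conf" || return 1

    # kubelet side: expect the --cgroup-driver=systemd flag
    grep -q -- '--cgroup-driver=systemd' "$kubelet_dropin" || return 1

    return 0
}
```

With the paths used in this comment, a call could look like `cgroup_drivers_agree /etc/containerd/config.toml /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf && echo "drivers agree"`. The check is textual, so it would not notice a setting placed under the wrong TOML table.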
|
@josephprem I assume that you could modify /etc/kubernetes/kubelet/kubelet-config.json like the changes in this PR instead of setting |
I guess so, but my preference for pushing variables is through systemd drop-ins.
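For comparison with the drop-in approach, here is a minimal sketch of the config-file alternative discussed above. It assumes the kubelet config JSON already contains a `"cgroupDriver"` key and simply rewrites its value; `set_kubelet_cgroup_driver` is a hypothetical helper name, and a JSON-aware tool would be sturdier than `sed` for anything beyond this simple case.

```shell
#!/bin/sh
# Sketch only: rewrite the value of an existing "cgroupDriver" key in a
# kubelet config JSON file. Hypothetical helper, shown for illustration.
#   $1: path to the kubelet config JSON
#   $2: driver name (defaults to systemd)
set_kubelet_cgroup_driver() {
    config="$1"
    driver="${2:-systemd}"

    # Refuse to guess where the key should go if it is not present.
    if ! grep -q '"cgroupDriver"' "$config"; then
        echo "cgroupDriver key not found in $config" >&2
        return 1
    fi

    scratch="$(mktemp)" || return 1
    if sed "s/\"cgroupDriver\"[[:space:]]*:[[:space:]]*\"[^\"]*\"/\"cgroupDriver\": \"$driver\"/" "$config" > "$scratch"; then
        # Write back in place so ownership and permissions are kept.
        cat "$scratch" > "$config"
        status=$?
    else
        status=1
    fi
    rm -f "$scratch"
    return "$status"
}
```

A call such as `set_kubelet_cgroup_driver /etc/kubernetes/kubelet/kubelet-config.json` would then be followed by a kubelet restart; the drop-in approach keeps the shipped file untouched, which is the trade-off being discussed here.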
|
Hey folks 👋
Please note that we are no longer making changes to the legacy path. To address this point:
We emphatically do not support older versions of |
@Callisto13 thanks for emphasizing that! @abeer91 that being said, what is still blocking the merge of this PR? |
|
@reegnz I think what @Callisto13 was saying is that the latest version of |
|
@stevehipwell The legacy flow seems to be supported for custom AMIs, not the official EKS AMI. The official EKS AMI uses /etc/eks/bootstrap.sh, so there is no legacy code path there. We're talking about enabling the systemd cgroup driver for the official AMI, not custom AMIs. The custom AMI problem has been handled on their side. |
|
@reegnz I think the concern was that someone using legacy flow with a custom AMI built from this AMI would be broken after this PR was merged? |
|
I want to verify that there are no outstanding concerns/issues, and I'll try to get this merged this week. |
|
I'm going to close this; we've made |
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.