Skip to content

Change cgroup driver to systemd#593

Closed
reegnz wants to merge 1 commit intoawslabs:masterfrom
reegnz:change_cgroup_driver
Closed

Change cgroup driver to systemd#593
reegnz wants to merge 1 commit intoawslabs:masterfrom
reegnz:change_cgroup_driver

Conversation

@reegnz
Copy link
Contributor

@reegnz reegnz commented Jan 6, 2021

Kubernetes documentation indicates that for stability reasons
one should run kubernetes with the systemd cgroup driver if the
init system itself is systemd.

https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Kubernetes documentation indicates that for stability reasons
one should run kubernetes with the systemd cgroup driver if the
init system itself is systemd.

https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers

Fixes #490
@reegnz
Copy link
Contributor Author

reegnz commented Jan 6, 2021

#587 reverted this change earlier, now that eksctl was made compatible with this change in eksctl-io/eksctl#2962 it should be OK to merge again.

@reegnz
Copy link
Contributor Author

reegnz commented Jan 8, 2021

This is another run at #490

@abeer91
Copy link
Contributor

abeer91 commented Jan 11, 2021

Hi, now that eksctl has released a change, we're looking into how to make this release possible. There's still a condition where if we release an AMI with this change and a user is using an older version of eksctl, this could still lead to failures. We will be update this issue with more details, once we have a plan to pick this change.

@reegnz
Copy link
Contributor Author

reegnz commented Jan 12, 2021

@abeer91 I do understand the concern of breaking downstream tooling, so I might look into how that could be handled, although I can't promise anything (thinking about maybe using a path unit to keep the kubelet config in sync with the daemon.json but that is very hacky).

How long do you think we should wait for the newer eksctl to be adopted?
Does it warrant a changelog note that this might break some tooling downstream (tools that assume only kubelet config needs to be changed)?

@jagadeeshi2i
Copy link

Hi, upon cluster creation with GPU instances i could not fetch nodes with kubectl get nodes.
I see the below error on kubelet logs

failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
eksctl version : 0.51.0
eks version: 1.20
kubectl version: v1.21.0
AMI: ami-0b450512a684db582

Will the above fix solves this issue. @abeer91 when can we expect this fix.

@stevehipwell
Copy link
Contributor

@reegnz you might want to add systemd cgroup support to containerd in this PR now it's been released.

@abeer91 if this is still being blocked due to concerns about eksctl how about a containerd only PR (see aws/containers-roadmap#1210 (comment))?

@josephprem
Copy link

note: systemd cgroup support to containerd is not correct. We are in runc v2 not in v1

@reegnz you might want to add systemd cgroup support to containerd in this PR now it's been released.

@abeer91 if this is still being blocked due to concerns about eksctl how about a containerd only PR (see aws/containers-roadmap#1210 (comment))?

@stevehipwell
Copy link
Contributor

stevehipwell commented Jul 28, 2021

@josephprem I can see that runc cgroup v2 is supported but is it actually enabled in the AMI?

I can't find any official documentation with the best being reading containerd/containerd#4203 (specifically containerd/containerd#4203 (comment))

@josephprem
Copy link

@josephprem I can see that runc cgroup v2 is supported but is it actually enabled in the AMI?

I can't find any official documentation with the best being reading containerd/containerd#4203 (specifically containerd/containerd#4203 (comment))

yes it is

cat /etc/containerd/config.toml
version = 2
root = "/var/lib/containerd"
state = "/run/containerd"

[grpc]
address = "/run/dockershim.sock"

[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"

[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/opt/cni/bin"
conf_dir = "/etc/cni/net.d"

@stevehipwell
Copy link
Contributor

@josephprem I re-read the config and that's why I struck the comment through about v2 being enabled. I'm still interested if you have any better documentation than the comment I found and linked above.

@josephprem
Copy link

@josephprem I re-read the config and that's why I struck the comment through about v2 being enabled. I'm still interested if you have any better documentation than the comment I found and linked above.

https://github.com/containerd/cri/blob/master/docs/config.md

@josephprem
Copy link

josephprem commented Jul 28, 2021

I have added this in my user-data

conf=/etc/eks/containerd/containerd-config.toml 
# EKS bootstrap script moves $conf to /etc/eks/containerd/config.toml 
section="\[plugins.\"io.containerd.grpc.v1.cri\".containerd.runtimes.runc.options\]"
key="SystemdCgroup"
grep -q $section $conf || sed -i "/^runtime_type.*/a $section" $conf
grep -q $key $conf || sed -i "/$section.*/a $key = true" $conf

Followed by enabling --cgroup-driver=systemd in kubelet ( note the service file patched )

sed -i 's/KUBELET_EXTRA_ARGS/KUBELET_EXTRA_ARGS $EXTENDED_KUBELET_ARGS/' /etc/eks/containerd/kubelet-containerd.service
cat << EOF > /etc/systemd/system/kubelet.service.d/9999-extended-kubelet-args.conf
[Service]
Environment='EXTENDED_KUBELET_ARGS=--cgroup-driver=systemd'
EOF
systemctl daemon-reload

@stevehipwell
Copy link
Contributor

@josephprem I assume that you could modify /etc/kubernetes/kubelet/kubelet-config.json like the changes in this PR instead of setting --cgroup-driver=systemd?

@josephprem
Copy link

/etc/kubernetes/kubelet/kubelet-config.json

I guess so , but my preference for pushing variables is through systemd Drop-Ins

@Callisto13
Copy link

Hey folks 👋 eksctl here. A couple of things to note on our side:

  • We have changed the way that eksctl bootstraps: we no longer roll our own, and simply call the /etc/eks/bootstrap.sh to delegate to whatever is going on in the AMIs. Therefore eksctl will not need to do anything with cgroup stuff. (See details of this decision here Use built-in AMI bootstrapping scripts for AL2/Ubuntu unmanaged nodes eksctl-io/eksctl#3564)
  • HOWEVER: The above switch comes with a breaking change for those using Unmanaged Nodegroups with Custom AMIs, therefore the legacy codepath is still in place for a little while longer. (See notice here (Future) Breaking: overrideBootstrapCommand soon to be required with Custom AMIs for AL2 and Ubuntu unmanaged nodegroups eksctl-io/eksctl#3563.) On the legacy path we have removed previously added changes which set the cgroup driver. This was because we kept experiencing breakages every time the AMI folks moved something around (this pain was actually one of the reasons why we removed our own bootstrapper). This could mean that if the custom AMI is based on one which sets the docker driver to be systemd, the kubelet will fail because eksctl's legacy bootstrapper will override the kubelet config, setting the driver to cgroupfs.
    • BUT, users can get by this by using the overrideBootstrapCommand to set their own userdata.

Please note that we are no longer making changes to the legacy path.

To address this point:

There's still a condition where if we release an AMI with this change and a user is using an older version of eksctl, this could still lead to failures.

We emphatically do not support older versions of eksctl. If users are running any version of eksctl which is not latest, they are doing it wrong 🙃 .

@reegnz
Copy link
Contributor Author

reegnz commented Aug 3, 2021

We emphatically do not support older versions of eksctl. If users are running any version of eksctl which is not latest, they are doing it wrong 🙃 .

@Callisto13 thanks for emphasizing that!

@abeer91 so now that being said, what is still blocking the merging of this PR?

@stevehipwell
Copy link
Contributor

@reegnz I think what @Callisto13 was saying is that the latest version of eksctl still supports a legacy flow for compatibility and that is what's currently blocking this? This is just an observation; personally I don't think a higher level project should dictate the direction of a dependency, this is why strong APIs/contracts are essential.

@reegnz
Copy link
Contributor Author

reegnz commented Aug 3, 2021

@stevehipwell Legacy flow seems to be supported for Custom AMIs, not the official EKS AMI, for the official EKS AMI they use the /etc/eks/bootstrap.sh, no legacy code-path there. We're talking about enabling systemd cgroup driver for the official AMI, not custom AMI-s. The custom AMI problem has been handled on their side.

@stevehipwell
Copy link
Contributor

@reegnz I think the concern was that someone using legacy flow with a custom AMI built from this AMI would be broken after this PR was merged?

@mmerkes
Copy link
Contributor

mmerkes commented Nov 2, 2021

I want to verify that there are no outstanding concerns/issues, and I'll try to get this merged this week.

@cartermckinnon
Copy link
Contributor

I'm going to close this; we've made systemd the cgroup driver for containerd, and we don't plan to touch docker until we remove it in 1.25.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants