Description
What would you like to be added:
Change the EKS-optimized AMI to use the `systemd` cgroup driver by default for both kubelet and Docker.
Why is this needed:
AL2 uses systemd as its init system and cgroup manager. When kubelet and Docker use the `cgroupfs` driver instead, systemd is unaware of the resources those cgroups allocate, which can destabilize the node and, in certain cases, crash the system.
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#cgroup-drivers
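To see which driver Docker is currently configured with on a node, you can inspect `/etc/docker/daemon.json` (or run `docker info --format '{{.CgroupDriver}}'`). A minimal sketch; an inline sample stands in for the real file so the snippet is self-contained:

```shell
# Sketch: find the configured cgroup driver in a daemon.json-style file.
# /tmp/daemon-sample.json stands in for the real /etc/docker/daemon.json.
cat > /tmp/daemon-sample.json <<'EOF'
{ "exec-opts": ["native.cgroupdriver=systemd"] }
EOF
grep -o 'native.cgroupdriver=[a-z]*' /tmp/daemon-sample.json
```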
I have tested this by making the following config changes and rejoining the node to the cluster. In my testing the node was marked Ready and I was able to create pods on it.
### Cordoned and drained a worker node:
```
k drain ip-192-168-0-171.us-west-2.compute.internal --ignore-daemonsets
node/ip-192-168-0-171.us-west-2.compute.internal already cordoned
WARNING: ignoring DaemonSet-managed Pods: kube-system/aws-node-jfnzw, kube-system/kube-proxy-sdx2q
node/ip-192-168-0-171.us-west-2.compute.internal drained
```
### Removed the node entry from EKS so that the node rejoins as a new entity:
```
k delete no ip-192-168-0-171.us-west-2.compute.internal
```
### Stopped kubelet and Docker:
```
[ec2-user@ip-192-168-0-171 ~]$ sudo systemctl stop kubelet docker
```
### Edited the kubelet and Docker config files to set systemd as the cgroup manager:
```
[ec2-user@ip-192-168-0-171 ~]$ cat /etc/docker/daemon.json
{
  "bridge": "none",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "10"
  },
  "live-restore": true,
  "max-concurrent-downloads": 10
}
```
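For automation (e.g. in instance user data), the same `exec-opts` change can be scripted rather than edited by hand. A sketch, assuming python3 is available as it is on AL2; `/tmp/daemon.json` stands in for `/etc/docker/daemon.json`:

```shell
# Sketch: add the systemd cgroup driver to an existing daemon.json.
# /tmp/daemon.json stands in for /etc/docker/daemon.json.
cat > /tmp/daemon.json <<'EOF'
{ "bridge": "none", "log-driver": "json-file" }
EOF
python3 - <<'EOF'
import json

path = "/tmp/daemon.json"
with open(path) as f:
    cfg = json.load(f)
# Insert the driver setting, preserving whatever else is in the file.
cfg["exec-opts"] = ["native.cgroupdriver=systemd"]
with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF
cat /tmp/daemon.json
```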
### Since I'm using eksctl to create my cluster and nodegroup, I modified the following file:
```
[root@ip-192-168-66-171 ec2-user]# cat /etc/eksctl/kubelet.yaml
address: 0.0.0.0
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/eksctl/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
clusterDNS:
- 10.100.0.10
clusterDomain: cluster.local
featureGates:
  RotateKubeletServerCertificate: true
kind: KubeletConfiguration
kubeReserved:
  cpu: 70m
  ephemeral-storage: 1Gi
  memory: 200Mi
systemReserved:
  cpu: 1000m
  ephemeral-storage: 1Gi
  memory: 2Gi
serverTLSBootstrap: true
```
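Before rejoining the node, it is worth sanity-checking that the KubeletConfiguration actually requests the systemd driver. A sketch; the inline sample stands in for `/etc/eksctl/kubelet.yaml`:

```shell
# Sketch: verify a KubeletConfiguration file requests the systemd driver.
# The inline sample stands in for /etc/eksctl/kubelet.yaml.
cat > /tmp/kubelet-sample.yaml <<'EOF'
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF
grep '^cgroupDriver:' /tmp/kubelet-sample.yaml
```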
### Ran bootstrap.sh for the node to rejoin the cluster:
```
sudo /etc/eks/bootstrap.sh myclustername
```
### Found that the new node came up healthy and successfully ran some nginx test pods:
```
k get no
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-0-171.us-west-2.compute.internal   Ready    <none>   10m   v1.15.11-eks-af3caf
ip-192-168-35-63.us-west-2.compute.internal   Ready    <none>   99m   v1.15.11-eks-af3caf

k get po -owide | grep 171 -c
8
```
Can we move to the `systemd` driver for the EKS-optimized AMIs?
Note: I also found the following GitHub issue, where the kube-reserved/system-reserved memory was not taken into account when calculating the kubepods.slice MemoryLimit; it used the node's total memory instead:
kubernetes/kubernetes#88197
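For reference, with the reserves from the kubelet.yaml above, the kubepods cgroup memory limit should be node capacity minus kube-reserved and system-reserved, not the full node memory as in the linked bug. A sketch of that arithmetic, assuming a hypothetical 8 GiB node:

```shell
# Sketch: expected kubepods.slice memory limit = capacity - reserves.
node_capacity_mi=8192    # hypothetical 8 GiB node
kube_reserved_mi=200     # kubeReserved memory: 200Mi from kubelet.yaml above
system_reserved_mi=2048  # systemReserved memory: 2Gi from kubelet.yaml above
expected_limit_mi=$((node_capacity_mi - kube_reserved_mi - system_reserved_mi))
echo "expected kubepods limit: ${expected_limit_mi} MiB"
```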