How to increase Pods limit per worker node in Kubernetes

Kubelet is close to pod limit


Alertmanager reported that the kubelets on our production nodes were running too many pods, close to the default limit of 110 pods per node.


First, look at the Grafana dashboards to check whether the nodes can host more pods in terms of infrastructure resources (CPU, RAM, etc.), then use the average resource utilization to determine a new pod limit for each node.
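As a quick complement to the dashboards, you can also count how many pods are currently scheduled on a given node straight from kubectl (here `node-hostname` is a placeholder for your node's name):

```shell
# Count pods currently scheduled on the node (header line suppressed).
kubectl get pods --all-namespaces --no-headers \
  --field-selector spec.nodeName=node-hostname | wc -l
```

Comparing this count to the 110 limit tells you how close each node actually is.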


Suppose the chosen pod limit per node is 200; you can apply it with the following steps:

    1.    Connect:

Connect to the affected worker node and become root:
$ ssh -i /path/to/your/ssh-key/id_rsa node-user@node-hostname
node-user@node-hostname:~$ sudo -i

    2.    Edit: 

Change the value of KUBELET_MAX_PODS in /etc/default/kubelet from 110 to 200 (use your favorite file editor). You can display the current content with:
root@node-hostname:~# cat /etc/default/kubelet
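If you prefer a one-liner over an editor, a sed substitution can do it in place. This sketch assumes the file contains a line of the form `KUBELET_MAX_PODS=110`:

```shell
# Back up the file, then replace the existing value with 200.
cp /etc/default/kubelet /etc/default/kubelet.bak
sed -i 's/^KUBELET_MAX_PODS=.*/KUBELET_MAX_PODS=200/' /etc/default/kubelet
grep KUBELET_MAX_PODS /etc/default/kubelet
```

The final grep lets you eyeball the new value before restarting anything.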


    3.    Restart & check:

Restart the kubelet service on the node and check that it started properly:
root@node-hostname:~# systemctl restart kubelet.service
root@node-hostname:~# systemctl status kubelet.service
● kubelet.service - Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2019-03-22 13:52:42 UTC; 5min ago
  Process: 63755 ExecStartPre=/sbin/iptables -t nat --list (code=exited, status=0/SUCCESS)
  Process: 63750 ExecStartPre=/sbin/ebtables -t nat --list (code=exited, status=0/SUCCESS)
  Process: 63747 ExecStartPre=/sbin/sysctl -w net.ipv4.tcp_retries2=8 (code=exited, status=0/SUCCESS)
  Process: 63742 ExecStartPre=/bin/mount --make-shared /var/lib/kubelet (code=exited, status=0/SUCCESS)
  Process: 63734 ExecStartPre=/bin/bash -c if [ $(mount | grep "/var/lib/kubelet" | wc -l) -le 0 ] ; then /bin/mount --bind /var/lib/kubelet /var/lib/kubelet ; fi (code=exited, status=0/SUCCESS)
  Process: 63728 ExecStartPre=/bin/mkdir -p /var/lib/kubelet (code=exited, status=0/SUCCESS)
  Process: 63722 ExecStartPre=/bin/bash /opt/azure/containers/ (code=exited, status=0/SUCCESS)
 Main PID: 70046 (docker)
    Tasks: 7
   Memory: 4.0M
      CPU: 945ms
   CGroup: /system.slice/kubelet.service
           └─70046 /usr/bin/docker run --net=host --pid=host --privileged --rm --volume=/dev:/dev --volume=/sys:/sys:ro --volume=/var/run:/var/run:rw --volume=/var/lib/docker/:/var/lib/docker:rw --volume=/var/lib/kubelet/:/var/lib/kubelet
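Before moving on, you can also check that the restarted kubelet actually picked up the new value. On this setup the kubelet runs inside Docker, so — assuming the systemd unit passes KUBELET_MAX_PODS through as a `--max-pods` flag — the value should appear in the process arguments:

```shell
# Look for the max-pods flag in the running kubelet command line.
# The [k] trick keeps the grep process itself out of the results.
ps -ef | grep '[k]ubelet' | grep -o 'max-pods=[0-9]*'
```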

    4.    Double check:

Back on your workstation, verify that the node's Capacity.pods value is now equal to 200:
me@My-computer$ kubectl describe node node-hostname
Name:               node-hostname
Roles:              agent
Labels:             agentpool=poolX
Taints:             <none>
CreationTimestamp:  Wed, 11 Jul 2018 11:57:04 +0200
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 11 Jul 2018 11:57:26 +0200   Wed, 11 Jul 2018 11:57:26 +0200   RouteCreated                 RouteController created a route
  OutOfDisk            False   Fri, 22 Mar 2019 16:39:55 +0100   Thu, 21 Mar 2019 20:22:16 +0100   KubeletHasSufficientDisk     kubelet has sufficient disk space available
  MemoryPressure       False   Fri, 22 Mar 2019 16:39:55 +0100   Thu, 21 Mar 2019 20:22:16 +0100   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Fri, 22 Mar 2019 16:39:55 +0100   Thu, 21 Mar 2019 20:22:16 +0100   KubeletHasNoDiskPressure     kubelet has no disk pressure
  Ready                True    Fri, 22 Mar 2019 16:39:55 +0100   Fri, 22 Mar 2019 15:04:50 +0100   KubeletReady                 kubelet is posting ready status
  Hostname:    k8s-homelab-worker-pool2-3
Capacity:
 cpu:                             32
 memory:                          132017808Ki
 pods:                            200
Allocatable:
 cpu:                             32
 memory:                          131915408Ki
 pods:                            200
System Info:
 Machine ID:                 aef45egerv0a378fe845rtg6a2acef156871e
 System UUID:                DCDF8472-a378-78fe8-tg6a2-tg6a2acef1568
 Boot ID:                    5egerv0a-a378-nr6e-78fe8-386a5cda4dfd
 Kernel Version:             4.4.0-134-generic
 OS Image:                   Debian GNU/Linux 8 (jessie)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://1.12.6
 Kubelet Version:            v1.7.7-3
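If you only need the single value rather than the full describe output, a jsonpath query keeps the check to one line (again, `node-hostname` is a placeholder):

```shell
# Print only the pod capacity of the node.
kubectl get node node-hostname -o jsonpath='{.status.capacity.pods}'
```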

    5.    Repeat steps 1 to 4 on every node affected by this issue.
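To avoid checking node by node afterwards, a small loop can report the pod capacity of every node in one pass (a sketch; it assumes your kubeconfig points at the right cluster):

```shell
# List each node with its pod capacity.
for node in $(kubectl get nodes -o jsonpath='{.items[*].metadata.name}'); do
  echo "$node: $(kubectl get node "$node" -o jsonpath='{.status.capacity.pods}') pods"
done
```

Any node still reporting 110 is one you have not yet updated.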

