
Sina Tavakkol


40 Days Of Kubernetes (14/40)

Day 14/40

Taints and Tolerations in Kubernetes

Video Link
@piyushsachdeva
Git Repository
My Git Repo

We're going to look at taints and tolerations. When a node carries a taint, it repels workloads: only pods that declare a matching toleration can be scheduled on it, and all other pods are kept off that node.

We taint a node and tell a pod to tolerate that taint so it can be scheduled on that node.
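The general command form is simple; the node name, key, value, and effect below are placeholders:

```bash
# Add a taint: <key>=<value>:<effect>, e.g. gpu=true:NoSchedule
kubectl taint nodes <node-name> <key>=<value>:<effect>

# Remove the same taint later by appending "-"
kubectl taint nodes <node-name> <key>=<value>:<effect>-
```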
(Diagram from the video)

"Tolerations are applied to pods. Tolerations allow the scheduler to schedule pods with matching taints. Tolerations allow scheduling but don't guarantee scheduling: the scheduler also evaluates other parameters as part of its function.

Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints are applied to a node; this marks that the node should not accept any pods that do not tolerate the taints."
Note: There are two special cases (see the sketch after this list):

  • An empty key with operator Exists matches all keys, values and effects which means this will tolerate everything.

  • An empty effect matches all effects with key key1.
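As a minimal sketch of the first special case, a toleration with only `operator: Exists` (no key, no value) tolerates every taint:

```yaml
tolerations:
- operator: "Exists"   # matches all keys, values and effects
```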

The allowed values for the effect field are:

  • NoExecute > affects new and already-running pods; running pods that do not tolerate the taint are evicted
  • NoSchedule > affects only newly scheduled pods; existing pods keep running
  • PreferNoSchedule > a "soft" NoSchedule; the scheduler tries to avoid the node, but it is not guaranteed

source
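With NoExecute you can also set tolerationSeconds, which limits how long a pod may keep running on the node after the taint is added; the defaults Kubernetes injects for not-ready/unreachable nodes show up later in the kubectl describe pod output. A small sketch:

```yaml
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300   # evicted 300 seconds after the taint appears
```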

A toleration is essentially the counter to a taint, allowing a pod to ignore taints applied to a node. A toleration is defined in the pod specification and must match the key, value, and effect of the taint it intends to tolerate.
Toleration Operators: While matching taints, tolerations can use operators like Equal and Exists.
The Equal operator requires an exact match of key, value, and effect, whereas the Exists operator matches a taint based on the key alone, disregarding the value.

For instance:

```yaml
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"
```

source
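For comparison, a sketch of the same toleration written with the Exists operator (no value is needed, and none should be set):

```yaml
tolerations:
- key: "key"
  operator: "Exists"
  effect: "NoSchedule"
```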


1. Taint the node

```bash
root@localhost:~# kubectl get nodes
NAME                       STATUS   ROLES           AGE   VERSION
lucky-luke-control-plane   Ready    control-plane   8d    v1.30.0
lucky-luke-worker          Ready    <none>          8d    v1.30.0
lucky-luke-worker2         Ready    <none>          8d    v1.30.0
root@localhost:~# kubectl taint node lucky-luke-worker gpu=true:NoSchedule
node/lucky-luke-worker tainted
root@localhost:~# kubectl taint node lucky-luke-worker2 gpu=true:NoSchedule
node/lucky-luke-worker2 tainted
root@localhost:~# kubectl describe node lucky-luke-worker | grep -i taints
Taints:             gpu=true:NoSchedule
```
  • Let's schedule a pod
```bash
root@localhost:~# kubectl run nginx --image=nginx
pod/nginx created
root@localhost:~# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          6s
```

The pod is stuck in Pending status, so let's look at its events for the error message:

```bash
root@localhost:~# kubectl describe pod nginx
Name:             nginx
Namespace:        default
Priority:         0
Service Account:  default
Node:             <none>
Labels:           run=nginx
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Containers:
  nginx:
    Image:        nginx
    Port:         <none>
    Host Port:    <none>
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-4xh8p (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  kube-api-access-4xh8p:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  89s   default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) had untolerated taint {gpu: true}. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
```

And the message is clear to us :)

```text
0/3 nodes are available:
  1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }
  2 node(s) had untolerated taint {gpu: true}
preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
```
  • We need to create a toleration on the pod so it can be scheduled
```bash
root@localhost:~# kubectl run redis --image=redis --dry-run=client -o yaml > redis_day14.yaml
root@localhost:~# vim redis_day14.yaml
```

Add the tolerations to the YAML file:

```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: redis
  name: redis
spec:
  containers:
  - image: redis
    name: redis
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  tolerations:
  - key: "gpu"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
status: {}
```

Apply the file:

```bash
root@localhost:~# kubectl apply -f redis_day14.yaml
pod/redis created
root@localhost:~# kubectl get pods
NAME    READY   STATUS    RESTARTS   AGE
nginx   0/1     Pending   0          10m
redis   1/1     Running   0          5s
root@localhost:~# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
nginx   0/1     Pending   0          10m   <none>        <none>               <none>           <none>
redis   1/1     Running   0          17s   10.244.2.12   lucky-luke-worker2   <none>           <none>
```

Let's remove the taint from one node (appending `-` to the taint spec deletes it) and see what happens to our pending pod:

```bash
root@localhost:~# kubectl taint node lucky-luke-worker gpu=true:NoSchedule-
node/lucky-luke-worker untainted
root@localhost:~# kubectl get pods -o wide
NAME    READY   STATUS    RESTARTS   AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
nginx   1/1     Running   0          22m   10.244.1.14   lucky-luke-worker    <none>           <none>
redis   1/1     Running   0          11m   10.244.2.12   lucky-luke-worker2   <none>           <none>
```
  • By default, the control-plane node has a NoSchedule taint
```bash
root@localhost:~# kubectl get nodes
NAME                       STATUS   ROLES           AGE   VERSION
lucky-luke-control-plane   Ready    control-plane   8d    v1.30.0
lucky-luke-worker          Ready    <none>          8d    v1.30.0
lucky-luke-worker2         Ready    <none>          8d    v1.30.0
root@localhost:~# kubectl describe node lucky-luke-control-plane | grep Taint
Taints:             node-role.kubernetes.io/control-plane:NoSchedule
```
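If you ever need a pod to land on the control plane (for example in a single-node lab cluster), a toleration along these lines should work, assuming no other scheduling constraints get in the way:

```yaml
tolerations:
- key: "node-role.kubernetes.io/control-plane"
  operator: "Exists"
  effect: "NoSchedule"
```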

Node Selector

Instead of the node deciding which pods it accepts, a nodeSelector hands the decision to the pod: the pod declares, via node labels, which node it can be deployed on.

  • Let's try:
```bash
root@localhost:~# kubectl run nginx2 --image=nginx --dry-run=client -o yaml > nginx2-day14.yaml
root@localhost:~# vim nginx2-day14.yaml
```
```yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    run: nginx2
  name: nginx2
spec:
  containers:
  - image: nginx
    name: nginx2
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  nodeSelector:
    gpu: "false"
status: {}
```
```bash
root@localhost:~# kubectl apply -f nginx2-day14.yaml
pod/nginx2 created
root@localhost:~# kubectl get pods -o wide
NAME     READY   STATUS    RESTARTS   AGE   IP            NODE                 NOMINATED NODE   READINESS GATES
nginx    1/1     Running   0          46m   10.244.1.14   lucky-luke-worker    <none>           <none>
nginx2   0/1     Pending   0          8s    <none>        <none>               <none>           <none>
redis    1/1     Running   0          35m   10.244.2.12   lucky-luke-worker2   <none>           <none>
```
  • Label one node and see what happens:
```bash
root@localhost:~# kubectl label node lucky-luke-worker gpu="false"
node/lucky-luke-worker labeled
root@localhost:~# kubectl get pods -o wide
NAME     READY   STATUS    RESTARTS   AGE     IP            NODE                 NOMINATED NODE   READINESS GATES
nginx    1/1     Running   0          49m     10.244.1.14   lucky-luke-worker    <none>           <none>
nginx2   1/1     Running   0          3m21s   10.244.1.15   lucky-luke-worker    <none>           <none>
redis    1/1     Running   0          38m     10.244.2.12   lucky-luke-worker2   <none>           <none>
```
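To double-check which nodes carry the label, or to undo it afterwards, something like this should do (the key `gpu` is the one used above):

```bash
# List the nodes carrying the gpu=false label
kubectl get nodes -l gpu=false

# Remove the label again by appending "-" to the key
kubectl label node lucky-luke-worker gpu-
```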
