I am facing a difficulty with this; maybe the answer is simple, so if someone knows it, please comment here.
I have created an EKS cluster using the following manifest.
```yaml
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: test-cluster
  region: us-west-2
  version: "1.29"

vpc:
  subnets:
    public:
      us-west-2a: { id: subnet-094d01de2dd2148c0 }
      us-west-2b: { id: subnet-04429e132a1f42826 }
      us-west-2c: { id: subnet-028a738bdafc344c6 }

nodeGroups:
  - name: ng-spot
    instanceType: t3.medium
    labels: { role: builders }
    desiredCapacity: 2
    minSize: 2
    maxSize: 4
    volumeSize: 30
    ssh:
      allow: true
      publicKeyName: techies
    tags:
      Name: ng-spot
    maxPodsPerNode: 110
```

This cluster is for testing purposes, so I am using t3.medium instances with the maximum pod limit set to 110.
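For completeness, this is roughly how I create the cluster from that manifest (the file name `cluster.yaml` is just my local name for it):

```bash
# Create the cluster from the manifest above
eksctl create cluster -f cluster.yaml

# Confirm the nodegroup came up
eksctl get nodegroup --cluster test-cluster --region us-west-2
```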
The nodes come up Ready and report 110 allocatable pods:

```
arun@ArunLAL555:~$ k get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
ip-192-168-37-0.us-west-2.compute.internal    Ready    <none>   26m   v1.29.0-eks-5e0fdde
ip-192-168-86-42.us-west-2.compute.internal   Ready    <none>   26m   v1.29.0-eks-5e0fdde

arun@ArunLAL555:~$ kubectl get nodes -o jsonpath='{.items[*].status.allocatable.pods}{"\n"}'
110 110
```

This confirms that each node reports an allocatable limit of 110 pods, so I expect to be able to create 110 pods per node.
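While debugging, I also looked at the raw ENI limits for the instance type; I am not sure whether this ENI-based cap, rather than `maxPodsPerNode`, is what actually limits IP assignment here, so I am including it only for context:

```bash
# ENI count and IPv4 addresses per ENI for t3.medium
aws ec2 describe-instance-types --instance-types t3.medium \
  --query 'InstanceTypes[0].NetworkInfo.[MaximumNetworkInterfaces,Ipv4AddressesPerInterface]' \
  --output text
# Returns 3 and 6. The usual ENI-based capacity formula is
# ENIs * (IPv4 per ENI - 1) + 2 = 3 * (6 - 1) + 2 = 17 pods per node
# (without prefix delegation).
```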
I then created a test deployment with 50 replicas:

```
arun@ArunLAL555:~$ k create deployment test-deploy --image nginx --replicas 50
deployment.apps/test-deploy created

arun@ArunLAL555:~$ k get po
NAME                           READY   STATUS              RESTARTS   AGE
test-deploy-859f95ffcc-2c5k6   0/1     ContainerCreating   0          19s
test-deploy-859f95ffcc-2p9rh   1/1     Running             0          19s
test-deploy-859f95ffcc-468wm   0/1     ContainerCreating   0          18s
.
.
test-deploy-859f95ffcc-xxm7z   0/1     ContainerCreating   0          18s
test-deploy-859f95ffcc-z88x6   1/1     Running             0          19s
```

Here, the remaining pods are not getting IPs and stay stuck in ContainerCreating.
The events for one of the stuck pods show a CNI failure:

```
arun@ArunLAL555:~$ k events po test-deploy-859f95ffcc-xxm7z
1s (x5 over 55s)   Warning   FailedCreatePodSandBox   Pod/test-deploy-859f95ffcc-m7t62   (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "528eaad224c5578435db12a57a8fa7063a03423b28d57c681bab742cc8389a1a": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: failed to assign an IP address to container
```

The following are the subnets and their IP availability:
```
arun@ArunLAL555:~$ aws eks describe-cluster --name test-cluster --query "cluster.resourcesVpcConfig.subnetIds"
[
    "subnet-094d01de2dd2148c0",
    "subnet-04429e132a1f42826",
    "subnet-028a738bdafc344c6"
]

arun@ArunLAL555:~$ aws ec2 describe-subnets --subnet-ids subnet-094d01de2dd2148c0 subnet-04429e132a1f42826 subnet-028a738bdafc344c6 --query 'Subnets[*].[SubnetId,AvailableIpAddressCount]' --output text
subnet-028a738bdafc344c6        8167
subnet-094d01de2dd2148c0        8185
subnet-04429e132a1f42826        8168
```

So the subnets themselves have plenty of free IPs.
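I also dumped the aws-node (VPC CNI) container environment to see whether prefix delegation or any warm-IP settings are configured; I am not sure which of these matter here, so this is just for context:

```bash
# List the VPC CNI env vars, e.g. ENABLE_PREFIX_DELEGATION, WARM_IP_TARGET
kubectl get daemonset aws-node -n kube-system \
  -o jsonpath='{range .spec.template.spec.containers[0].env[*]}{.name}={.value}{"\n"}{end}'
```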
I then updated the VPC CNI add-on:

```
arun@ArunLAL555:~$ kubectl describe daemonset aws-node --namespace kube-system | grep amazon-k8s-cni: | cut -d : -f 3
v1.16.0-eksbuild.1

arun@ArunLAL555:~$ aws eks create-addon --cluster-name test-cluster --addon-name vpc-cni --addon-version v1.17.1-eksbuild.1 \
> --service-account-role-arn arn:aws:iam::111122223333:role/AmazonEKSVPCCNIRole
{
    "addon": {
        "addonName": "vpc-cni",
        "clusterName": "test-cluster",
        "status": "CREATING",
        "addonVersion": "v1.17.1-eksbuild.1",
        "health": {
            "issues": []
        },
        "addonArn": "arn:aws:eks:us-west-2:111122223333:addon/test-cluster/vpc-cni/fec7333d-c1fc-c2fc-1287-c14beaa883f8",
        "createdAt": "2024-03-22T19:35:54.685000+05:30",
        "modifiedAt": "2024-03-22T19:35:54.703000+05:30",
        "serviceAccountRoleArn": "arn:aws:iam::111122223333:role/AmazonEKSVPCCNIRole",
        "tags": {}
    }
}

arun@ArunLAL555:~$ aws eks describe-addon --cluster-name test-cluster --addon-name vpc-cni --query addon.addonVersion --output text
v1.17.1-eksbuild.1
```
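In case it is relevant, the rollout of the updated DaemonSet and the add-on health can be checked like this (commands only; I have not pasted their output here):

```bash
# Wait for the updated aws-node DaemonSet to finish rolling out
kubectl rollout status daemonset/aws-node -n kube-system

# Check whether EKS reports any health issues for the add-on
aws eks describe-addon --cluster-name test-cluster --addon-name vpc-cni \
  --query 'addon.health.issues'
```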
After that, I terminated the existing instances; since then, the new nodes are not getting Ready:

```
arun@ArunLAL555:~$ k get nodes
NAME                                           STATUS     ROLES    AGE     VERSION
ip-192-168-40-177.us-west-2.compute.internal   NotReady   <none>   86s     v1.29.0-eks-5e0fdde
ip-192-168-83-11.us-west-2.compute.internal    NotReady   <none>   3m29s   v1.29.0-eks-5e0fdde

arun@ArunLAL555:~$ k describe nodes ip-192-168-40-177.us-west-2.compute.internal
Conditions:
  Type             Status   LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------   -----------------                 ------------------                ------                       -------
  MemoryPressure   False    Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False    Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False    Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False    Fri, 22 Mar 2024 19:45:20 +0530   Fri, 22 Mar 2024 19:44:49 +0530   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
```

I would like to know why this is happening; if someone knows the answer, please comment.
- First, why did the pods not get IPs even though the pod limit was set to the maximum of 110?
- Second, why are the nodes NotReady after updating the VPC CNI plugin?
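If it helps with diagnosis, these are the commands I intend to run next to inspect the CNI pods on the new nodes (I have not captured their output yet):

```bash
# Check whether the aws-node (VPC CNI) pods are running on the new nodes
kubectl get pods -n kube-system -l k8s-app=aws-node -o wide

# Tail the CNI container logs for initialization errors
kubectl logs -n kube-system -l k8s-app=aws-node -c aws-node --tail=50
```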