I'm quite new to Kubernetes, even if it doesn't feel like it after the dozens of hours I've spent trying to set up a working cluster.
The key parameters:
- 1 master and 3 nodes
- set up using kubeadm
- Kubernetes 1.12.1, Calico 3.2
- Primary IP addresses of the hosts are 192.168.1.21x (relevant because this collides with Calico's default pod subnet; because of this I set --pod-network-cidr=10.10.0.0/16)
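For reference, the kubeadm call was roughly this (a sketch; I may have passed additional flags):

# init with a pod CIDR that does not overlap the 192.168.1.x host network
kubeadm init --pod-network-cidr=10.10.0.0/16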
Installation using kubeadm init and joining the nodes worked so far. All pods are running; only coredns keeps crashing, but that is not relevant here.
Installation of Calico
Then I tried both documented variants: installing with the etcd datastore, and installing with the Kubernetes API datastore (50 nodes or less).
Variant A (etcd datastore):

kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/rbac.yaml
curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/calico.yaml -O
# modify calico.yaml
# Here I feel a lack of documentation: which etcd is needed? The one of Kubernetes or a new one? See below.
kubectl apply -f calico.yaml

Variant B (Kubernetes API datastore):

kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
curl https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml -O
# modify calico.yaml (here I have to change the range of CALICO_IPV4POOL_CIDR)
sed -i 's/192.168.0.0/10.10.0.0/' calico.yaml
kubectl apply -f calico.yaml

Test
Now, I use the following definition for testing:
apiVersion: v1
kind: Pod
metadata:
  name: www1
  labels:
    service: testwww
spec:
  containers:
  - name: meinserver
    image: erkules/nginxhostname
    ports:
    - containerPort: 80
---
apiVersion: v1
kind: Pod
metadata:
  name: www2
  labels:
    service: testwww
spec:
  containers:
  - name: meinserver
    image: erkules/nginxhostname
---
kind: Service
apiVersion: v1
metadata:
  name: www-np
spec:
  type: NodePort
  selector:
    service: testwww
  ports:
  - name: http1
    protocol: TCP
    nodePort: 30333
    port: 8080
    targetPort: 80

How I test:
curl http://192.168.1.211:30333   # master, no success
curl http://192.168.1.212:30333   # node, no success
curl http://192.168.1.213:30333   # node, works only ~50% of the time, namely when www1 answers (it runs on this node)
curl http://192.168.1.214:30333   # node, works only ~50% of the time, namely when www2 answers (it runs on this node)

The above commands only succeed if the (randomly chosen) pod is running on the node that owns the specified IP address. I expected a 100% success rate on all nodes.
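The success rates are measured with a simple loop (the same one shown in the details below); an 'x' means the request succeeded, a '-' means it timed out:

for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n - ; done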
I had more success when pointing Calico at the etcd server of Kubernetes itself (pod/etcd-master1): in that case all the above commands worked. But pod/calico-kube-controllers didn't start then, because it was scheduled on a worker node and thus had no access to etcd.
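Presumably one could pin the controller to the master with a nodeSelector plus toleration, roughly like this in the Deployment's pod template (a sketch, not something that ships by default and not something I have tried):

spec:
  template:
    spec:
      # assumption: schedule calico-kube-controllers onto the kubeadm master
      nodeSelector:
        node-role.kubernetes.io/master: ""
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule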
In the getting started guide, I found an instruction to install an extra etcd:
kubectl apply -f https://docs.projectcalico.org/v3.2/getting-started/kubernetes/installation/hosted/etcd.yaml

It's weird: this line appears only in the "getting started" guide, but not under "installation". But the default calico.yaml already contains the correct clusterIP of exactly this etcd server (by the way, how is this IP static? Is it generated from a hash?). Anyway: with this, all Calico nodes came up without errors, but I had the behaviour described above where not all NodePorts were working. And I'm also concerned that etcd is open to everyone this way, which is not what I want.
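My guess (not verified) is that etcd.yaml simply hard-codes the ClusterIP in the Service it creates, roughly like the following sketch, which would explain why calico.yaml can reference it statically (the values match the calico-etcd Service visible in the details below):

apiVersion: v1
kind: Service
metadata:
  name: calico-etcd
  namespace: kube-system
spec:
  clusterIP: 10.96.232.136   # assumption: pinned explicitly in the manifest, not generated
  selector:
    k8s-app: calico-etcd
  ports:
  - port: 6666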
So, these are my main questions:
- What's the correct etcd server to use? A separate one or the one of Kubernetes?
- If it should be the one of Kubernetes, why isn't pod/calico-kube-controllers configured by default to run on the master where it has access to etcd?
- If I should run a separate etcd for Calico, why isn't it documented under "installation", and why do I have these NodePort problems?
Btw: I saw the answers that recommend changing the iptables default rule from DROP to ACCEPT. But this is an ugly hack and probably bypasses all of Calico's security features.
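For reference, the workaround those answers suggest is, as far as I understand it, essentially changing the default policy of the FORWARD chain on every node:

# commonly suggested workaround, which I'd rather avoid
sudo iptables -P FORWARD ACCEPT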
Requested details (Variant with extra etcd)
$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide

NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default pod/www1 1/1 Running 0 8s 192.168.104.9 node2 <none>
default pod/www2 1/1 Running 0 8s 192.168.166.136 node1 <none>
kube-system pod/calico-etcd-46g2q 1/1 Running 0 22m 192.168.1.211 master1 <none>
kube-system pod/calico-kube-controllers-f4dcbf48b-88795 1/1 Running 10 23h 192.168.1.212 node0 <none>
kube-system pod/calico-node-956lj 2/2 Running 6 21h 192.168.1.213 node1 <none>
kube-system pod/calico-node-mhtvg 2/2 Running 5 21h 192.168.1.211 master1 <none>
kube-system pod/calico-node-s9njn 2/2 Running 6 21h 192.168.1.214 node2 <none>
kube-system pod/calico-node-wjqlk 2/2 Running 6 21h 192.168.1.212 node0 <none>
kube-system pod/coredns-576cbf47c7-4tcx6 0/1 CrashLoopBackOff 15 24h 192.168.137.86 master1 <none>
kube-system pod/coredns-576cbf47c7-hjpgv 0/1 CrashLoopBackOff 15 24h 192.168.137.85 master1 <none>
kube-system pod/etcd-master1 1/1 Running 17 24h 192.168.1.211 master1 <none>
kube-system pod/kube-apiserver-master1 1/1 Running 2 24h 192.168.1.211 master1 <none>
kube-system pod/kube-controller-manager-master1 1/1 Running 3 24h 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-22mb9 1/1 Running 2 23h 192.168.1.212 node0 <none>
kube-system pod/kube-proxy-96tn7 1/1 Running 2 23h 192.168.1.213 node1 <none>
kube-system pod/kube-proxy-vb4pq 1/1 Running 2 24h 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-vq7qj 1/1 Running 2 23h 192.168.1.214 node2 <none>
kube-system pod/kube-scheduler-master1 1/1 Running 2 24h 192.168.1.211 master1 <none>
kube-system pod/kubernetes-dashboard-77fd78f978-h8czs 1/1 Running 2 23h 192.168.180.9 node0 <none>

NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 24h <none>
default service/www-np NodePort 10.99.149.53 <none> 8080:30333/TCP 8s service=testwww
kube-system service/calico-etcd ClusterIP 10.96.232.136 <none> 6666/TCP 21h k8s-app=calico-etcd
kube-system service/calico-typha ClusterIP 10.105.199.162 <none> 5473/TCP 23h k8s-app=calico-typha
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 24h k8s-app=kube-dns
kube-system service/kubernetes-dashboard ClusterIP 10.96.235.235 <none> 443/TCP 23h k8s-app=kubernetes-dashboard

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/calico-etcd 1 1 1 1 1 node-role.kubernetes.io/master= 21h calico-etcd quay.io/coreos/etcd:v3.3.9 k8s-app=calico-etcd
kube-system daemonset.apps/calico-node 4 4 4 4 4 beta.kubernetes.io/os=linux 23h calico-node,install-cni quay.io/calico/node:v3.2.3,quay.io/calico/cni:v3.2.3 k8s-app=calico-node
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 24h kube-proxy k8s.gcr.io/kube-proxy:v1.12.1 k8s-app=kube-proxy

NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
kube-system deployment.apps/calico-kube-controllers 1 1 1 1 23h calico-kube-controllers quay.io/calico/kube-controllers:v3.2.3 k8s-app=calico-kube-controllers
kube-system deployment.apps/calico-typha 0 0 0 0 23h calico-typha quay.io/calico/typha:v3.2.3 k8s-app=calico-typha
kube-system deployment.apps/coredns 2 2 2 0 24h coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns
kube-system deployment.apps/kubernetes-dashboard 1 1 1 1 23h kubernetes-dashboard k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0 k8s-app=kubernetes-dashboard

NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
kube-system replicaset.apps/calico-kube-controllers-f4dcbf48b 1 1 1 23h calico-kube-controllers quay.io/calico/kube-controllers:v3.2.3 k8s-app=calico-kube-controllers,pod-template-hash=f4dcbf48b
kube-system replicaset.apps/calico-typha-5f646c475c 0 0 0 23h calico-typha quay.io/calico/typha:v3.2.3 k8s-app=calico-typha,pod-template-hash=5f646c475c
kube-system replicaset.apps/coredns-576cbf47c7 2 2 0 24h coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns,pod-template-hash=576cbf47c7
kube-system replicaset.apps/kubernetes-dashboard-77fd78f978 1 1 1 23h kubernetes-dashboard k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0 k8s-app=kubernetes-dashboard,pod-template-hash=77fd78f978

NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master1 Ready master 24h v1.12.0 192.168.1.211 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node0 Ready <none> 23h v1.12.0 192.168.1.212 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node1 Ready <none> 23h v1.12.0 192.168.1.213 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node2 Ready <none> 23h v1.12.0 192.168.1.214 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n - ;done
x---x-x-x--x-xx-x---

Requested details (Variant with existing etcd)
$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide

NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default pod/www1 1/1 Running 0 9m27s 10.10.2.3 node1 <none>
default pod/www2 1/1 Running 0 9m27s 10.10.3.3 node2 <none>
kube-system pod/calico-kube-controllers-f4dcbf48b-qrqnc 0/1 CreateContainerConfigError 1 18m 192.168.1.212 node0 <none>
kube-system pod/calico-node-j8cwr 2/2 Running 2 17m 192.168.1.212 node0 <none>
kube-system pod/calico-node-qtq9m 2/2 Running 2 17m 192.168.1.214 node2 <none>
kube-system pod/calico-node-qvf6w 2/2 Running 2 17m 192.168.1.211 master1 <none>
kube-system pod/calico-node-rdt7k 2/2 Running 2 17m 192.168.1.213 node1 <none>
kube-system pod/coredns-576cbf47c7-6l9wz 1/1 Running 2 21m 10.10.0.11 master1 <none>
kube-system pod/coredns-576cbf47c7-86pxp 1/1 Running 2 21m 10.10.0.10 master1 <none>
kube-system pod/etcd-master1 1/1 Running 19 20m 192.168.1.211 master1 <none>
kube-system pod/kube-apiserver-master1 1/1 Running 2 20m 192.168.1.211 master1 <none>
kube-system pod/kube-controller-manager-master1 1/1 Running 1 20m 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-28qct 1/1 Running 1 20m 192.168.1.212 node0 <none>
kube-system pod/kube-proxy-8ltpd 1/1 Running 1 21m 192.168.1.211 master1 <none>
kube-system pod/kube-proxy-g9wmn 1/1 Running 1 20m 192.168.1.213 node1 <none>
kube-system pod/kube-proxy-qlsxc 1/1 Running 1 20m 192.168.1.214 node2 <none>
kube-system pod/kube-scheduler-master1 1/1 Running 5 19m 192.168.1.211 master1 <none>

NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21m <none>
default service/www-np NodePort 10.106.27.58 <none> 8080:30333/TCP 9m27s service=testwww
kube-system service/calico-typha ClusterIP 10.99.14.62 <none> 5473/TCP 17m k8s-app=calico-typha
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 21m k8s-app=kube-dns

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/calico-node 4 4 4 4 4 beta.kubernetes.io/os=linux 18m calico-node,install-cni quay.io/calico/node:v3.2.3,quay.io/calico/cni:v3.2.3 k8s-app=calico-node
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 21m kube-proxy k8s.gcr.io/kube-proxy:v1.12.1 k8s-app=kube-proxy

NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
kube-system deployment.apps/calico-kube-controllers 1 1 1 0 18m calico-kube-controllers quay.io/calico/kube-controllers:v3.2.3 k8s-app=calico-kube-controllers
kube-system deployment.apps/calico-typha 0 0 0 0 17m calico-typha quay.io/calico/typha:v3.2.3 k8s-app=calico-typha
kube-system deployment.apps/coredns 2 2 2 2 21m coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns

NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
kube-system replicaset.apps/calico-kube-controllers-f4dcbf48b 1 1 0 18m calico-kube-controllers quay.io/calico/kube-controllers:v3.2.3 k8s-app=calico-kube-controllers,pod-template-hash=f4dcbf48b
kube-system replicaset.apps/calico-typha-5f646c475c 0 0 0 17m calico-typha quay.io/calico/typha:v3.2.3 k8s-app=calico-typha,pod-template-hash=5f646c475c
kube-system replicaset.apps/coredns-576cbf47c7 2 2 2 21m coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns,pod-template-hash=576cbf47c7

NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master1 Ready master 21m v1.12.0 192.168.1.211 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node0 Ready <none> 20m v1.12.0 192.168.1.212 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node1 Ready <none> 20m v1.12.0 192.168.1.213 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node2 Ready <none> 20m v1.12.0 192.168.1.214 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n - ;done
xxxxxxxxxxxxxxxxxxxx

Update: Variant with flannel
I just tried flannel: surprisingly, the result is the same as with the extra etcd (pods only answer if they run on the queried node). This brings me to the question: could there be something wrong with my OS? Ubuntu 18.04 with latest updates, installed using debootstrap. No firewall...
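To back up the "no firewall" statement, this is roughly how I would double-check the forwarding setup on each node (a sketch):

# default policy of the FORWARD chain (Docker is known to set this to DROP)
sudo iptables -S FORWARD | head -n 1
# kernel IP forwarding should be enabled (= 1)
sysctl net.ipv4.ip_forward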
How I installed it:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

Result:
$ kubectl get all --all-namespaces=true -o wide; kubectl get nodes -o wide

NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE
default pod/www1 1/1 Running 0 3m40s 10.10.2.2 node1 <none>
default pod/www2 1/1 Running 0 3m40s 10.10.3.2 node2 <none>
kube-system pod/coredns-576cbf47c7-64wxp 1/1 Running 3 21m 10.10.1.3 node0 <none>
kube-system pod/coredns-576cbf47c7-7zvqs 1/1 Running 3 21m 10.10.1.2 node0 <none>
kube-system pod/etcd-master1 1/1 Running 0 21m 192.168.1.211 master1 <none>
kube-system pod/kube-apiserver-master1 1/1 Running 0 20m 192.168.1.211 master1 <none>
kube-system pod/kube-controller-manager-master1 1/1 Running 0 21m 192.168.1.211 master1 <none>
kube-system pod/kube-flannel-ds-amd64-brnmq 1/1 Running 0 8m22s 192.168.1.214 node2 <none>
kube-system pod/kube-flannel-ds-amd64-c6v67 1/1 Running 0 8m22s 192.168.1.213 node1 <none>
kube-system pod/kube-flannel-ds-amd64-gchmv 1/1 Running 0 8m22s 192.168.1.211 master1 <none>
kube-system pod/kube-flannel-ds-amd64-l9mpl 1/1 Running 0 8m22s 192.168.1.212 node0 <none>
kube-system pod/kube-proxy-5pmtc 1/1 Running 0 21m 192.168.1.213 node1 <none>
kube-system pod/kube-proxy-7ctp5 1/1 Running 0 21m 192.168.1.212 node0 <none>
kube-system pod/kube-proxy-9zfhl 1/1 Running 0 21m 192.168.1.214 node2 <none>
kube-system pod/kube-proxy-hcs4g 1/1 Running 0 21m 192.168.1.211 master1 <none>
kube-system pod/kube-scheduler-master1 1/1 Running 0 20m 192.168.1.211 master1 <none>

NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default service/kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 22m <none>
default service/www-np NodePort 10.101.213.118 <none> 8080:30333/TCP 3m40s service=testwww
kube-system service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP 22m k8s-app=kube-dns

NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE CONTAINERS IMAGES SELECTOR
kube-system daemonset.apps/kube-flannel-ds-amd64 4 4 4 4 4 beta.kubernetes.io/arch=amd64 8m22s kube-flannel quay.io/coreos/flannel:v0.10.0-amd64 app=flannel,tier=node
kube-system daemonset.apps/kube-flannel-ds-arm 0 0 0 0 0 beta.kubernetes.io/arch=arm 8m22s kube-flannel quay.io/coreos/flannel:v0.10.0-arm app=flannel,tier=node
kube-system daemonset.apps/kube-flannel-ds-arm64 0 0 0 0 0 beta.kubernetes.io/arch=arm64 8m22s kube-flannel quay.io/coreos/flannel:v0.10.0-arm64 app=flannel,tier=node
kube-system daemonset.apps/kube-flannel-ds-ppc64le 0 0 0 0 0 beta.kubernetes.io/arch=ppc64le 8m21s kube-flannel quay.io/coreos/flannel:v0.10.0-ppc64le app=flannel,tier=node
kube-system daemonset.apps/kube-flannel-ds-s390x 0 0 0 0 0 beta.kubernetes.io/arch=s390x 8m21s kube-flannel quay.io/coreos/flannel:v0.10.0-s390x app=flannel,tier=node
kube-system daemonset.apps/kube-proxy 4 4 4 4 4 <none> 22m kube-proxy k8s.gcr.io/kube-proxy:v1.12.1 k8s-app=kube-proxy

NAMESPACE NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
kube-system deployment.apps/coredns 2 2 2 2 22m coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns

NAMESPACE NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
kube-system replicaset.apps/coredns-576cbf47c7 2 2 2 21m coredns k8s.gcr.io/coredns:1.2.2 k8s-app=kube-dns,pod-template-hash=576cbf47c7

NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master1 Ready master 22m v1.12.1 192.168.1.211 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node0 Ready <none> 21m v1.12.1 192.168.1.212 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node1 Ready <none> 21m v1.12.1 192.168.1.213 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce
node2 Ready <none> 21m v1.12.1 192.168.1.214 <none> Ubuntu 18.04 LTS 4.15.0-20-generic docker://17.12.1-ce

$ for i in $(seq 20); do timeout 1 curl -so/dev/null http://192.168.1.214:30333 && echo -n x || echo -n - ;done
-x--xxxxx-x-x---xxxx