|
| 1 | +# AKS Cheat Sheet |
| 2 | + |
| 3 | +> Unofficial AKS Cheat Sheet |
| 4 | +
|
| 5 | +The official AKS FAQ is [here](https://docs.microsoft.com/bs-cyrl-ba/azure/aks/faq). |
| 6 | + |
| 7 | +<!-- TOC --> |
| 8 | +- [AKS Cheat Sheet](#aks-cheat-sheet) |
| 9 | + - [Azure CLI Commands](#azure-cli-commands) |
| 10 | + - [Reference Architecture](#reference-architecture) |
| 11 | + - [AKS Features](#aks-features) |
| 12 | + - [Service Principal](#service-principal) |
| 13 | + - [Authn and Authz](#authn-and-authz) |
| 14 | + - [Cluster Security](#cluster-security) |
| 15 | + - [Data Volume](#data-volume) |
| 16 | + - [Network Plugin](#network-plugin) |
| 17 | + - [Network Policy](#network-policy) |
| 18 | + - [Load Balancer](#load-balancer) |
| 19 | + - [Ingress](#ingress) |
| 20 | + - [Egress](#egress) |
| 21 | + - [DNS](#dns) |
| 22 | + - [GPU nodes](#gpu-nodes) |
| 23 | + - [Quota and Limits for AKS](#quota-and-limits-for-aks) |
| 24 | + - [Troubleshooting](#troubleshooting) |
| 25 | + - [Azure Container Registry (ACR)](#azure-container-registry-acr) |
| 26 | + |
| 27 | +## Azure CLI Commands |
| 28 | +- Get Node Resource Group |
| 29 | + ``` |
| 30 | + az aks show --resource-group $RESOURCE_GROUP --name $CLUSTER_NAME --query nodeResourceGroup -o tsv |
| 31 | + ``` |
| 32 | +- Check egress IP |
| 33 | + ``` |
| 34 | + kubectl run -it --rm runtest --image=debian --restart=Never |
| 35 | + pod# apt-get update && apt-get install curl -y |
| 36 | + pod# curl -s checkip.dyndns.org |
| 37 | + ``` |
| 38 | +## Reference Architecture |
| 39 | + |
| 40 | +- [Microservices architecture on Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/microservices/aks) |
| 41 | +- https://github.com/mspnp/microservices-reference-implementation |
| 42 | +- [Building microservices on Azure](https://docs.microsoft.com/en-us/azure/architecture/microservices/index) |
| 43 | +
|
| 44 | +## AKS Features |
| 45 | +### Service Principal |
| 46 | +- About Service Principal |
| 47 | + - https://docs.microsoft.com/en-us/azure/aks/kubernetes-service-principal |
| 48 | +- Update Service Principal |
| 49 | + - https://docs.microsoft.com/en-us/azure/aks/update-credentials |
| 50 | +
|
| 51 | +### Authn and Authz |
| 52 | +- Three options for managing access and identity in AKS clusters |
| 53 | + - [Azure AD integration to control access to AKS](https://docs.microsoft.com/en-us/azure/aks/aad-integration) |
| 54 | + ``` |
| 55 | + 1. The developer authenticates with Azure AD (AAD). |
| 56 | + 2. The AAD token issuance endpoint issues the access token. |
| 57 | + 3. The developer performs an action using the AAD token, such as kubectl create pod. |
| 58 | + 4. k8s validates the token with AAD and fetches the developer's group memberships. |
| 59 | + 5. k8s RBAC and cluster policies are applied. |
| 60 | + 6. The developer's request succeeds or fails based on the previous validation of AAD group membership and on k8s RBAC and policies. |
| 61 | + ``` |
| 62 | + From [Best practices for authentication and authorization in AKS](https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-identity) |
| 63 | + - Kubernetes RBAC |
| 64 | + - [Using RBAC Authorization@k8s.io](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) |
| 65 | + - Roles, ClusterRoles, RoleBindings, ClusterRoleBindings |
| 66 | + - Pod Identities |
| 67 | + - Use managed identities for pods in AKS to access Azure resources |
| 68 | + - Managed identities let pods automatically request access to services through Azure AD. You don't manually define credentials for pods; instead, they request an access token in real time (see [azure doc](https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-identity#use-pod-identities)) |
| 69 | + - [Use Pod Identities (Managed Identity)](https://github.com/Azure/aad-pod-identity) |
| 70 | +
|
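For the Kubernetes RBAC option above, here is a minimal sketch of a namespaced Role and a RoleBinding that grants it to an Azure AD group. The namespace `dev`, the object names, and the all-zero group object ID are placeholders chosen for illustration; they do not come from this cheat sheet.

```YAML
# Hypothetical example: read-only access to pods in the "dev" namespace.
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader            # placeholder name
  namespace: dev              # placeholder namespace
rules:
- apiGroups: [""]             # "" is the core API group
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: pod-reader-binding    # placeholder name
  namespace: dev
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  # With Azure AD integration, a group is referenced by its AAD object ID (placeholder below).
  name: "00000000-0000-0000-0000-000000000000"
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```

A ClusterRole and ClusterRoleBinding follow the same shape without the `namespace` field when the permission should apply cluster-wide.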
| 71 | +### Cluster Security |
| 72 | +- [Cluster security and upgrades](https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-cluster-security) |
| 73 | + - Securing access to the API server, limiting container access, and managing upgrades and node reboots. |
| 74 | +- [Container image management and security](https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-container-image-management) |
| 75 | + - Securing the image and runtimes, using trusted registries, and automated builds on base image updates. |
| 76 | +- [Pod security](https://docs.microsoft.com/en-us/azure/aks/developer-best-practices-pod-security) |
| 77 | + - Securing access to resources, limiting credential exposure, and using [pod identities](https://docs.microsoft.com/en-us/azure/aks/operator-best-practices-identity#use-pod-identities) and [Azure Key Vault](https://docs.microsoft.com/en-us/azure/aks/developer-best-practices-pod-security#use-azure-key-vault-with-flexvol) |
| 78 | + - [Key Vault with FlexVolume @GitHub](https://github.com/Azure/kubernetes-keyvault-flexvol) |
| 79 | +
|
| 80 | +### Data Volume |
| 81 | +- Data Volume Options |
| 82 | + - Azure Disk ([Dynamic](https://docs.microsoft.com/en-us/azure/aks/azure-disks-dynamic-pv) / [Static](https://docs.microsoft.com/en-us/azure/aks/azure-disk-volume)) |
| 83 | + - Azure Files ([Dynamic](https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv) / [Static](https://docs.microsoft.com/en-us/azure/aks/azure-files-volume)) |
| 84 | +
|
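As a rough illustration of the dynamic Azure Disk option above, a PersistentVolumeClaim along these lines asks AKS to provision a managed disk on demand. The claim name and size are placeholders, and `managed-premium` is assumed to be one of the storage classes available in the cluster (check with `kubectl get sc`).

```YAML
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: azure-managed-disk            # placeholder name
spec:
  accessModes:
  - ReadWriteOnce                     # an Azure Disk is mounted by a single node at a time
  storageClassName: managed-premium   # assumed storage class; verify it exists in your cluster
  resources:
    requests:
      storage: 5Gi                    # placeholder size
```

For Azure Files the claim looks the same apart from the storage class, and `ReadWriteMany` becomes possible because a file share can be mounted by several nodes at once.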
| 85 | +### Network Plugin |
| 86 | +- [kubenet](https://docs.microsoft.com/en-us/azure/aks/configure-kubenet) (default plugin) |
| 87 | + - az aks create --network-plugin option: `kubenet` |
| 88 | + - see also [@k8s.io](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/#kubenet) |
| 89 | +- [Azure CNI](https://docs.microsoft.com/en-us/azure/aks/configure-azure-cni) |
| 90 | + - az aks create --network-plugin option: `azure` |
| 91 | +
|
| 92 | +### Network Policy |
| 93 | +- Kubernetes version: `1.12+` |
| 94 | +- [Network Policy Recipes](https://github.com/ahmetb/kubernetes-network-policy-recipes) |
| 95 | +- [Network policy Options in AKS](https://docs.microsoft.com/en-us/azure/aks/use-network-policies) |
| 96 | + - 1. `Azure Network Policies` - the Azure CNI sets up a bridge in the VM host for intra-node networking. The filtering rules are applied when the packets pass through the bridge. |
| 97 | + - `az aks create --network-plugin azure --network-policy azure` |
| 98 | + - 2. `Calico Network Policies` - the Azure CNI sets up local kernel routes for the intra-node traffic. The policies are applied on the pod’s network interface. |
| 99 | + - see [the difference between the two](https://docs.microsoft.com/en-us/azure/aks/use-network-policies) |
| 100 | + - `az aks create --network-plugin azure --network-policy calico` |
| 101 | +
|
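To make the options above concrete, here is a minimal sketch of a NetworkPolicy that only lets frontend pods reach backend pods. The namespace and the `app`/`role` labels are assumptions made up for this example. The policy is only enforced when the cluster was created with a network policy option enabled.

```YAML
kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: backend-policy       # placeholder name
  namespace: development     # placeholder namespace
spec:
  # Pods this policy applies to (hypothetical labels).
  podSelector:
    matchLabels:
      app: webapp
      role: backend
  # Only ingress from pods with these labels is allowed; all other ingress is dropped.
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: webapp
          role: frontend
```

The same manifest works with either the Azure or the Calico implementation, since both consume the standard Kubernetes NetworkPolicy resource.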
| 102 | +### Load Balancer |
| 103 | +- Service: type=`LoadBalancer` (not `ClusterIP` or `NodePort`) |
| 104 | +- Default: external load balancer |
| 105 | +- Assign a static IP to the LB (see [azure doc](https://docs.microsoft.com/en-us/azure/aks/static-ip)) |
| 106 | + ```YAML |
| 107 | + apiVersion: v1 |
| 108 | + kind: Service |
| 109 | + metadata: |
| 110 | +   name: servicename |
| 111 | + spec: |
| 112 | +   loadBalancerIP: 41.222.222.66 |
| 113 | +   type: LoadBalancer |
| 114 | + ``` |
| 115 | +- [Internal Load balancer](https://docs.microsoft.com/en-us/azure/aks/internal-lb) - Only accessible from the same VNET |
| 116 | + - Annotation for Internal LB |
| 117 | + ```YAML |
| 118 | + apiVersion: v1 |
| 119 | + kind: Service |
| 120 | + metadata: |
| 121 | +   name: servicename |
| 122 | +   annotations: |
| 123 | +     service.beta.kubernetes.io/azure-load-balancer-internal: "true" |
| 124 | + spec: |
| 125 | +   type: LoadBalancer |
| 126 | +   ... |
| 127 | + ``` |
| 128 | + - You can specify an IP address for the LB: `loadBalancerIP: XX.XX.XX.XX` |
| 129 | + - You can specify a subnet for the LB with an additional annotation |
| 130 | + ```YAML |
| 131 | + annotations: |
| 132 | +   service.beta.kubernetes.io/azure-load-balancer-internal: "true" |
| 133 | +   service.beta.kubernetes.io/azure-load-balancer-internal-subnet: "apps-subnet" |
| 134 | + ``` |
| 135 | +
|
| 136 | +### Ingress |
| 137 | +- Ingress Controllers provided by Azure (Not [nginx ingress](https://github.com/kubernetes/ingress-nginx) or others) |
| 138 | + - [HTTP application routing add-on](https://docs.microsoft.com/en-us/azure/aks/http-application-routing) |
| 139 | + - [Application Gateway Kubernetes Ingress](https://github.com/Azure/application-gateway-kubernetes-ingress) |
| 140 | +- TLS Termination Configuration |
| 141 | + - [Your Certificates](https://docs.microsoft.com/en-us/azure/aks/ingress-own-tls) |
| 142 | + - [Let's Encrypt](https://docs.microsoft.com/en-us/azure/aks/ingress-tls) |
| 143 | +- Ingress for an internal VNET, using a service with an [Internal LB](https://docs.microsoft.com/en-us/azure/aks/internal-lb) |
| 144 | +
|
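For the TLS termination item above, an Ingress resource roughly like the following terminates TLS with a certificate stored in a Kubernetes secret. This is a sketch under several assumptions: the hostname, secret name, backend service, and ingress class are all hypothetical, and the `networking.k8s.io/v1` API shown here requires a newer Kubernetes version than the 1.12-era clusters mentioned elsewhere in this sheet.

```YAML
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hello-ingress            # placeholder name
spec:
  ingressClassName: nginx        # assumed ingress class; depends on the installed controller
  tls:
  - hosts:
    - demo.example.com           # placeholder hostname
    secretName: demo-tls         # placeholder secret of type kubernetes.io/tls
  rules:
  - host: demo.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: hello-service  # placeholder backend service
            port:
              number: 80
```

The secret can hold your own certificate or one issued through Let's Encrypt; the Ingress manifest is the same in both cases.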
| 145 | +### Egress |
| 146 | +- Static IP for egress traffic |
| 147 | + - See [azure doc](https://docs.microsoft.com/en-us/azure/aks/egress) |
| 148 | + - Default: egress IP from AKS is randomly assigned |
| 149 | + > Once a Kubernetes service of type LoadBalancer is created, agent nodes are added to an Azure Load Balancer pool. For outbound flow, Azure translates it to the first public IP address configured on the load balancer. This public IP address is only valid for the lifespan of that resource. If you delete the Kubernetes LoadBalancer service, the associated load balancer and IP address are also deleted. |
| 150 | + - Procedures |
| 151 | + - 1. Create a static IP in the AKS node resource group |
| 152 | + - 2. Create a service with the static IP (set the static IP in the `loadBalancerIP` property) |
| 153 | +
|
| 154 | +### DNS |
| 155 | +- Kubernetes >= 1.12.x: `CoreDNS` |
| 156 | + - [Customize CoreDNS](https://docs.microsoft.com/en-us/azure/aks/coredns-custom) |
| 157 | +- Kubernetes < 1.12.x: `kube-dns` |
| 158 | + - [Customize kube-dns](https://www.danielstechblog.io/using-custom-dns-server-for-domain-specific-name-resolution-with-azure-kubernetes-service/) |
| 159 | +
|
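As an illustration of the CoreDNS customization linked above, a ConfigMap along these lines forwards one domain to a custom DNS server. It assumes the AKS convention of a `coredns-custom` ConfigMap in `kube-system` whose keys end in `.server`. The domain `example.internal` and the resolver address `10.0.0.10` are placeholders.

```YAML
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-custom       # name AKS is assumed to watch for custom CoreDNS config
  namespace: kube-system
data:
  # The key name is arbitrary but is assumed to need the ".server" suffix.
  example.server: |
    example.internal:53 {
        errors
        cache 30
        forward . 10.0.0.10
    }
```

After applying the ConfigMap, the CoreDNS pods typically need to be restarted before the new server block takes effect.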
| 160 | +### GPU nodes |
| 161 | +- https://docs.microsoft.com/en-us/azure/aks/gpu-cluster |
| 162 | +
|
| 163 | +### Quota and Limits for AKS |
| 164 | +- https://docs.microsoft.com/en-us/azure/aks/container-service-quotas |
| 165 | +- Default limit |
| 166 | + - max clusters per subscription: `100` |
| 167 | + - max nodes per cluster: `100` |
| 168 | + - max pods per node setting for AKS |
| 169 | + - Basic networking with Kubenet: `110` |
| 170 | + - Advanced networking with Azure CNI: `30` - Portal Deploy, `110` - ARM template / Azure CLI deploy |
| 171 | +- [Region availability](https://docs.microsoft.com/en-us/azure/aks/container-service-quotas#region-availability) |
| 172 | +- [Provisioned Infrastructure](https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits) |
| 173 | +- [Supported k8s versions](https://docs.microsoft.com/en-us/azure/aks/supported-kubernetes-versions) |
| 174 | + ``` |
| 175 | + az aks get-versions --location $REGION --output table |
| 176 | + ``` |
| 177 | +
|
| 178 | +
|
| 179 | +### Troubleshooting |
| 180 | +- [Official troubleshooting Guide @k8s.io](https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/) |
| 181 | +- https://docs.microsoft.com/en-us/azure/aks/troubleshooting |
| 182 | +- [Kubernetes Troubleshooting @GitHub](https://github.com/feiskyer/kubernetes-handbook/blob/master/en/troubleshooting/index.md) |
| 183 | +- https://docs.microsoft.com/en-us/azure/aks/kube-advisor-tool |
| 184 | +- [SSH login to k8s nodes](https://github.com/yokawasa/kubectl-plugin-ssh-jump) |
| 185 | +
|
| 186 | +
|
| 187 | +## Azure Container Registry (ACR) |
| 188 | +- VNET & Firewall Rule |
| 189 | + - https://docs.microsoft.com/en-us/azure/container-registry/container-registry-vnet |
| 190 | +- ACR Task - Automate OS and framework patching |
| 191 | + - http://aka.ms/acr/tasks |
| 192 | + - https://docs.microsoft.com/en-us/azure/container-registry/container-registry-tasks-multi-step |
| 193 | +- Repo & Tag Locking |
| 194 | + - http://aka.ms/acr/tag-locking |
| 195 | +- Helm Chart Repositories |
| 196 | + - https://docs.microsoft.com/en-us/azure/container-registry/container-registry-helm-repos |