DEV Community

Chabane R. for Onepoint x Stack Labs


Combining IAM Roles for Service Accounts with Pod level Security Groups for a defense-in-depth strategy

In the previous part we created our RDS instance. In this part, we'll put it all together and deploy Metabase to Kubernetes. Our objectives are to:

  • Enable IAM Roles for Service Accounts.
  • Create an IAM role allowed to connect to the RDS instance. It will be attached to the Metabase service account.
  • Enable security groups for pods by attaching the managed policy AmazonEKSVPCResourceController to the Amazon EKS cluster role.
  • Create a security group that allows inbound traffic to RDS. It will be assigned to the Metabase pods.
  • Upgrade the VPC CNI to the latest version. Version 1.7.7 or later is required to enable security groups for pods in the EKS cluster.
  • Enable pod ENIs in the aws-node DaemonSet.
  • Deploy and test our Kubernetes manifests.


Enabling IAM Roles for Service Accounts

To assign an IAM role to a pod, we need:

  • To create an IAM OIDC provider for the cluster. The cluster has an OpenID Connect issuer URL associated with it.
  • To create the IAM role and attach an IAM policy to it with the rds-db:connect permission that the service account needs:

Complete infra/plan/eks-cluster.tf with:

data "tls_certificate" "cert" {
  url = aws_eks_cluster.eks.identity[0].oidc[0].issuer
}

resource "aws_iam_openid_connect_provider" "openid" {
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.cert.certificates[0].sha1_fingerprint]
  url             = aws_eks_cluster.eks.identity[0].oidc[0].issuer
}

data "aws_iam_policy_document" "web_identity_assume_role_policy" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]
    effect  = "Allow"

    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.openid.url, "https://", "")}:sub"
      values   = ["system:serviceaccount:metabase:metabase"]
    }

    condition {
      test     = "StringEquals"
      variable = "${replace(aws_iam_openid_connect_provider.openid.url, "https://", "")}:aud"
      values   = ["sts.amazonaws.com"]
    }

    principals {
      identifiers = [aws_iam_openid_connect_provider.openid.arn]
      type        = "Federated"
    }
  }
}

resource "aws_iam_role" "web_identity_role" {
  assume_role_policy = data.aws_iam_policy_document.web_identity_assume_role_policy.json
  name               = "web-identity-role-${var.env}"
}

By combining the OpenID Connect (OIDC) identity provider and Kubernetes service account annotations, we will be able to use IAM roles at the pod level.

Inside EKS, there is an admission controller that injects AWS session credentials into pods according to the role referenced by the annotation on the Service Account used by the pod. The credentials are exposed through the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables. [3]
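For illustration, this is roughly the shape of the configuration the EKS webhook injects into a mutated pod spec (the role ARN and account ID below are made-up example values):

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: metabase
      env:
        # Injected by the admission controller, derived from the
        # service account annotation
        - name: AWS_ROLE_ARN
          value: arn:aws:iam::123456789012:role/web-identity-role-dev
        - name: AWS_WEB_IDENTITY_TOKEN_FILE
          value: /var/run/secrets/eks.amazonaws.com/serviceaccount/token
      volumeMounts:
        - name: aws-iam-token
          mountPath: /var/run/secrets/eks.amazonaws.com/serviceaccount
          readOnly: true
  volumes:
    # Projected service account token scoped to the STS audience
    - name: aws-iam-token
      projected:
        sources:
          - serviceAccountToken:
              audience: sts.amazonaws.com
              expirationSeconds: 86400
              path: token
```

AWS SDKs that support web identity federation pick up these two environment variables automatically and call sts:AssumeRoleWithWebIdentity on our behalf.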

For a detailed explanation of this capability, see the AWS blog post introducing fine-grained IAM roles for service accounts.

Now we can create the IAM role that allows access to the RDS instance from Kubernetes pods:

Complete infra/plan/eks-cluster.tf with:

resource "aws_iam_role_policy" "rds_access_from_k8s_pods" {
  name = "rds-access-from-k8s-pods-${var.env}"
  role = aws_iam_role.web_identity_role.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = [
          "rds-db:connect",
        ]
        Effect   = "Allow"
        Resource = "arn:aws:rds-db:${var.region}:${data.aws_caller_identity.current.account_id}:dbuser:${aws_db_instance.postgresql.resource_id}/metabase"
      }
    ]
  })
}
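Later, the patching step will read this role's ARN via terraform output rds-access-role-arn. If your plan does not already expose that output, here is a minimal sketch of it, assuming it lives alongside the role definition:

```hcl
# Expose the role ARN so the Kustomize patch step can read it
# with `terraform output rds-access-role-arn`
output "rds-access-role-arn" {
  value = aws_iam_role.web_identity_role.arn
}
```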

Pod Security Group

To enable security groups for pods, we need to attach the managed policy AmazonEKSVPCResourceController to the EKS cluster IAM role. It allows the role to manage network interfaces, their private IP addresses, and their attachment and detachment to and from instances.

Complete infra/plan/eks-cluster.tf with:

resource "aws_iam_role_policy_attachment" "eks-AmazonEKSVPCResourceController" {
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController"
  role       = aws_iam_role.eks.name
}

Now let's create our pod security group.

Complete infra/plan/eks-node-group.tf with:

resource "aws_security_group" "rds_access" {
  name        = "rds-access-from-pod-${var.env}"
  description = "Allow RDS Access from Kubernetes Pods"
  vpc_id      = aws_vpc.main.id

  ingress {
    from_port = 3000
    to_port   = 3000
    protocol  = "tcp"
    self      = true
  }

  ingress {
    from_port       = 53
    to_port         = 53
    protocol        = "tcp"
    security_groups = [aws_eks_cluster.eks.vpc_config[0].cluster_security_group_id]
  }

  ingress {
    from_port       = 53
    to_port         = 53
    protocol        = "udp"
    security_groups = [aws_eks_cluster.eks.vpc_config[0].cluster_security_group_id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name        = "rds-access-from-pod-${var.env}"
    Environment = var.env
  }
}

To allow the pod to access the Amazon RDS instance, we need to allow the pod security group as the source of inbound / outbound traffic on the RDS port.

Update the VPC security group aws_security_group.sg in infra/plan/rds.tf with the following ingress / egress rules:

ingress {
  from_port       = var.rds_port
  to_port         = var.rds_port
  protocol        = "tcp"
  security_groups = [aws_security_group.rds_access.id]
}

egress {
  from_port       = 1025
  to_port         = 65535
  protocol        = "tcp"
  security_groups = [aws_security_group.rds_access.id]
}

Add the following outputs:

output "sg-eks-cluster" {
  value = aws_eks_cluster.eks.vpc_config[0].cluster_security_group_id
}

output "sg-rds-access" {
  value = aws_security_group.rds_access.id
}

Let's deploy our modifications:

cd infra/envs/dev
terraform apply ../../plan/

Kubernetes configuration

Let's connect to the EKS cluster:

aws eks --region $REGION update-kubeconfig --name $EKS_CLUSTER_NAME 

Now we need to enable pods to receive their own network interfaces. Before doing that, use the following command to print your cluster's CNI version:

kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2 

The Amazon EKS cluster must be running Kubernetes version 1.17 and Amazon EKS platform version eks.3 or later.
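Both values can be checked with the AWS CLI, for example (assuming $REGION and $EKS_CLUSTER_NAME are set as in the previous step):

```shell
# Print the Kubernetes version and EKS platform version of the cluster
aws eks describe-cluster \
  --region $REGION \
  --name $EKS_CLUSTER_NAME \
  --query 'cluster.{version: version, platformVersion: platformVersion}' \
  --output table
```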

Upgrade your CNI version [1]

curl -o aws-k8s-cni.yaml https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.7.9/config/v1.7/aws-k8s-cni.yaml
sed -i "s/us-west-2/$REGION/g" aws-k8s-cni.yaml
kubectl apply -f aws-k8s-cni.yaml

Enable the CNI plugin to manage network interfaces for pods by setting the ENABLE_POD_ENI variable to true in the aws-node DaemonSet. Once this setting is set to true, for each node in the cluster the plugin adds a label with the value vpc.amazonaws.com/has-trunk-attached=true. The VPC resource controller creates and attaches one special network interface called a trunk network interface with the description aws-k8s-trunk-eni [2].

kubectl set env daemonset -n kube-system aws-node ENABLE_POD_ENI=true 

You can see which of your nodes have the vpc.amazonaws.com/has-trunk-attached label set to true with the following command.

$ kubectl get nodes -o wide -l vpc.amazonaws.com/has-trunk-attached=true
NAME                                       STATUS   ROLES    AGE   VERSION              INTERNAL-IP   EXTERNAL-IP     OS-IMAGE         KERNEL-VERSION                  CONTAINER-RUNTIME
ip-10-0-3-109.eu-west-1.compute.internal   Ready    <none>   56m   v1.18.9-eks-d1db3c   10.0.3.109    <none>          Amazon Linux 2   4.14.219-164.354.amzn2.x86_64   docker://19.3.13
ip-10-0-7-157.eu-west-1.compute.internal   Ready    <none>   56m   v1.18.9-eks-d1db3c   10.0.7.157    34.253.89.183   Amazon Linux 2   4.14.219-164.354.amzn2.x86_64   docker://19.3.13

Testing metabase connection to the RDS Instance

We deploy our Kubernetes manifests using Kustomize. Add the following manifests to the folder config/base.

config/base/service-account.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    app: metabase
  name: metabase

config/base/security-group-policy.yaml

apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: metabase
spec:
  serviceAccountSelector:
    matchLabels:
      app: metabase

config/base/database-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: metabase
type: Opaque
data:
  password: metabase

config/base/deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: metabase
  labels:
    app: metabase
spec:
  selector:
    matchLabels:
      app: metabase
  replicas: 1
  template:
    metadata:
      labels:
        app: metabase
    spec:
      containers:
        - name: metabase
          image: metabase/metabase
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              memory: "1Gi"
              cpu: "512m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
          livenessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 100
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /api/health
              port: 3000
            initialDelaySeconds: 60
            periodSeconds: 10

config/base/service.yaml

apiVersion: v1
kind: Service
metadata:
  name: metabase
  labels:
    app: metabase
spec:
  type: LoadBalancer
  ports:
    - port: 8000
      targetPort: 3000
      protocol: TCP
  selector:
    app: metabase

And finally our config/base/kustomization.yaml file

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: metabase
resources:
  - security-group-policy.yaml
  - service-account.yaml
  - deployment.yaml
  - service.yaml
  - database-secret.yaml

Now that we have our Kustomize base, we can patch the manifests with the values provided as Terraform outputs.

Create config/envs/$ENV/service-account.patch.yaml. We annotate the service account with the IAM role created earlier for RDS access.

apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: <RDS_ACCESS_ROLE_ARN>
  labels:
    app: metabase
  name: metabase

Create config/envs/$ENV/security-group-policy.patch.yaml.

The SecurityGroupPolicy CRD specifies which security groups to assign to pods. Within a namespace, we can select pods based on pod labels, or based on labels of the service account associated with a pod. We define the security group IDs to be applied.

apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: metabase
spec:
  serviceAccountSelector:
    matchLabels:
      app: metabase
  securityGroups:
    groupIds:
      - <POD_SECURITY_GROUP_ID>
      - <EKS_CLUSTER_SECURITY_GROUP_ID>

Create config/envs/$ENV/database-secret.patch.yaml

apiVersion: v1
kind: Secret
metadata:
  name: metabase
type: Opaque
data:
  password: <MB_DB_PASS>

Create config/envs/$ENV/deployment.patch.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: metabase
  labels:
    app: metabase
spec:
  selector:
    matchLabels:
      app: metabase
  replicas: 1
  template:
    metadata:
      labels:
        app: metabase
    spec:
      serviceAccountName: metabase
      containers:
        - name: metabase
          image: metabase/metabase
          imagePullPolicy: IfNotPresent
          env:
            - name: MB_DB_TYPE
              value: postgres
            - name: MB_DB_HOST
              value: <MB_DB_HOST>
            - name: MB_DB_PORT
              value: "5432"
            - name: MB_DB_DBNAME
              value: metabase
            - name: MB_DB_USER
              value: metabase
            - name: MB_DB_PASS
              valueFrom:
                secretKeyRef:
                  name: metabase
                  key: password
      nodeSelector:
        type: private

And the config/envs/$ENV/kustomization.yaml file

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: metabase
resources:
  - ../../base
patchesStrategicMerge:
  - security-group-policy.patch.yaml
  - service-account.patch.yaml
  - database-secret.patch.yaml
  - deployment.patch.yaml

Let's replace the placeholders with real values:

cd config/envs/dev

# Generate the DB auth token and base64-encode it for the Secret
METABASE_PWD=$(aws rds generate-db-auth-token --hostname $(terraform output private-rds-endpoint) --port 5432 --username metabase --region $REGION)
METABASE_PWD=$(echo -n $METABASE_PWD | base64 -w 0)

# Use ',' as the sed delimiter here: base64 output and ARNs may contain '/'
sed -i "s,<MB_DB_PASS>,$METABASE_PWD,g" database-secret.patch.yaml
sed -i "s/<POD_SECURITY_GROUP_ID>/$(terraform output sg-rds-access)/g; s/<EKS_CLUSTER_SECURITY_GROUP_ID>/$(terraform output sg-eks-cluster)/g" security-group-policy.patch.yaml
sed -i "s,<RDS_ACCESS_ROLE_ARN>,$(terraform output rds-access-role-arn),g" service-account.patch.yaml
sed -i "s/<MB_DB_HOST>/$(terraform output private-rds-endpoint)/g" deployment.patch.yaml

Apply the manifests:

kubectl create namespace metabase
kubectl config set-context --current --namespace=metabase
kustomize build . | kubectl apply -f -

Let's see if it worked

$ kubectl get pods
NAME                        READY   STATUS    RESTARTS   AGE
metabase-6d47d7b94b-796sx   1/1     Running   2          98s
$ kubectl describe pods metabase-6d47d7b94b-796sx
Name:         metabase-6d47d7b94b-796sx
Namespace:    metabase
Priority:     0
Node:         ip-10-0-3-109.eu-west-1.compute.internal/10.0.3.109
[..]
Labels:       app=metabase
              pod-template-hash=6d47d7b94b
Annotations:  kubernetes.io/psp: eks.privileged
              vpc.amazonaws.com/pod-eni:
                [{"eniId":"eni-054df22ad2b1b89c3","ifAddress":"02:3b:a8:a7:9c:f5","privateIp":"10.0.3.128","vlanId":1,"subnetCidr":"10.0.2.0/23"}]
Status:       Running
IP:           10.0.3.128
IPs:
  IP:  10.0.3.128
[..]
Node-Selectors:  type=private
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason                  Age  From                     Message
  ----    ------                  ---- ----                     -------
  Normal  Scheduled               32s  default-scheduler        Successfully assigned metabase/metabase-6d47d7b94b-796sx to ip-10-0-3-109.eu-west-1.compute.internal
  Normal  SecurityGroupRequested  32s  vpc-resource-controller  Pod will get the following Security Groups [sg-0c0195a69b1b8bdc3 sg-0d4b509bad15ec963]
  Normal  ResourceAllocated       31s  vpc-resource-controller  Allocated [{"eniId":"eni-054df22ad2b1b89c3","ifAddress":"02:3b:a8:a7:9c:f5","privateIp":"10.0.3.128","vlanId":1,"subnetCidr":"10.0.2.0/23"}] to the pod
  Normal  Pulled                  31s  kubelet                  Container image "metabase/metabase" already present on machine
  Normal  Created                 31s  kubelet                  Created container metabase
  Normal  Started                 31s  kubelet                  Started container metabase

As we can see, both security groups have been attached to the pod.
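You can also pull the annotation directly from the pod (using the pod name from our example):

```shell
# Show the pod-eni annotation injected by the VPC resource controller
kubectl get pod metabase-6d47d7b94b-796sx -o yaml | grep -A 1 "vpc.amazonaws.com/pod-eni"
```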

$ kubectl logs metabase-6d47d7b94b-796sx
[..]
2021-03-20 13:22:35,660 INFO metabase.core :: Setting up and migrating Metabase DB. Please sit tight, this may take a minute...
2021-03-20 13:22:35,663 INFO db.setup :: Verifying postgres Database Connection ...
2021-03-20 13:22:40,245 INFO db.setup :: Successfully verified PostgreSQL 12.5 application database connection. ✅
2021-03-20 13:22:40,246 INFO db.setup :: Running Database Migrations...
2021-03-20 13:22:40,387 INFO db.setup :: Setting up Liquibase...
2021-03-20 13:22:40,502 INFO db.setup :: Liquibase is ready.
2021-03-20 13:22:40,503 INFO db.liquibase :: Checking if Database has unrun migrations...
2021-03-20 13:22:42,900 INFO db.liquibase :: Database has unrun migrations. Waiting for migration lock to be cleared...
2021-03-20 13:22:42,980 INFO db.liquibase :: Migration lock is cleared. Running migrations...
2021-03-20 13:22:48,068 INFO db.setup :: Database Migrations Current ... ✅
[..]
2021-03-20 13:23:13,054 INFO metabase.core :: Metabase Initialization COMPLETE

Note: if the deployment is created before the SecurityGroupPolicy, you will get a connect timed out error. Delete and recreate the deployment.

Now, let's delete the security group policy and recreate the deployment to check that the connection fails.

$ kubectl delete -f security-group-policy.patch.yaml
$ kubectl delete -f deployment.patch.yaml
$ kubectl apply -f deployment.patch.yaml
$ kubectl logs metabase-6d47d7b94b-wbn4r
2021-03-20 13:31:32,993 INFO db.setup :: Verifying postgres Database Connection ...
2021-03-20 13:31:43,052 ERROR metabase.core :: Metabase Initialization FAILED clojure.lang.ExceptionInfo: Unable to connect to Metabase postgres DB.
[..]
Caused by: java.net.SocketTimeoutException: connect timed out
[..]
2021-03-20 13:31:43,072 INFO metabase.core :: Metabase Shutting Down ...
2021-03-20 13:31:43,077 INFO metabase.server :: Shutting Down Embedded Jetty Webserver
2021-03-20 13:31:43,088 INFO metabase.core :: Metabase Shutdown COMPLETE

As you can see, Metabase is no longer authorized to access the RDS instance at the network level.

As a last check, let's add the SecurityGroupPolicy again and remove the service account annotation that attaches the IAM role to the pod.

$ kubectl annotate sa metabase eks.amazonaws.com/role-arn-
$ kubectl apply -f security-group-policy.patch.yaml
$ kubectl delete -f deployment.patch.yaml
$ kubectl apply -f deployment.patch.yaml

2021-03-20 13:43:42,329 INFO db.setup :: Verifying postgres Database Connection ...
2021-03-20 13:43:42,710 ERROR metabase.core :: Metabase Initialization FAILED clojure.lang.ExceptionInfo: Unable to connect to Metabase postgres DB.
[..]
Caused by: org.postgresql.util.PSQLException: FATAL: PAM authentication failed for user "metabase"
[..]

As you can see, without the IAM role Metabase can no longer authenticate as the database user "metabase", even though the network path is open.

Conclusion

In this long workshop, we:

  • Created an isolated network to host our Amazon RDS instance
  • Configured an Amazon EKS cluster with fine-grained access control to Amazon RDS
  • Tested the connectivity between a Kubernetes container and an RDS database

That's it!

Clean up

kustomize build . | kubectl delete -f -
cd ../../../infra/envs/$ENV
terraform destroy ../../plan/

Final Words

The source code is available on Gitlab.

If you have any questions or feedback, please feel free to leave a comment.

Otherwise, I hope I have helped you answer some of the hard questions about connecting Amazon EKS to Amazon RDS and providing a pod-level defense-in-depth security strategy at both the networking and authentication layers.

By the way, do not hesitate to share with peers 😊

Thanks for reading!

Documentation

[1] https://docs.aws.amazon.com/eks/latest/userguide/cni-upgrades.html
[2] https://docs.aws.amazon.com/eks/latest/userguide/security-groups-for-pods.html
[3] https://eksctl.io/usage/iamserviceaccounts/#how-it-works
