I am a DevOps engineer at Cloudify.co and I will share in this post my experience related to automation of Vault backup creation using Kubernetes CronJob.
This post is a continuation of the previous post: https://igorzhivilo.com/vault/scheduled-backup-vault-secrets
The repository with all the code: https://github.com/warolv/vault-backup
What is HashiCorp's Vault?
Vault is a tool for securely accessing secrets. A secret is anything that you want to tightly control access to, such as API keys, passwords, certificates, and more. Vault provides a unified interface to any secret while providing tight access control and recording a detailed audit log.
My Setup
- EKS Kubernetes cluster
- Vault runs on EKS cluster
What you will learn from this post?
How to create a scheduled backup for Vault secrets with CronJob of Kubernetes.
How to add Prometheus alerts for failed jobs.
You can find all the code presented in my repository: https://github.com/warolv/vault-backup
Let's start.
Building the docker container
First, need to build a docker container based on python3 and include the code of vault_handler.py
Need clone the repo first with Docker file: 'git clone https://github.com/warolv/vault-backup'
Docker file:
FROM python:3 COPY requirements.txt / RUN pip install -r requirements.txt COPY vault_handler.py / CMD [ "python", "./vault_handler.py" ] Building image
# login to dockerhub $ docker login -u YOUR_USERNAME -p YOUR_PASSWORD # Build Docker $ docker build -t vault-backup . Validate docker container working properly
$ docker run --name test-vault-backup --rm vault-backup Specify one of the commands below print print-dump dump populate It's working, we got a list of commands from vault-backup:-)
Pushing vault-backup docker container to docker hub
$ docker tag vault-backup <Your Docker ID>/vault-backup:latest $ docker push <Your Docker ID>/vault-backup:latest In my case, it's 'warolv/vault-backup:latest', you can find an already built image there.
CronJob to run vault-backup on a daily basis
https://github.com/warolv/vault-backup/blob/main/examples/cronjob/cronjob.yaml
apiVersion: batch/v1beta1 kind: CronJob metadata: name: vault-backup spec: schedule: "0 1 * * *" jobTemplate: spec: template: spec: restartPolicy: Never nodeSelector: instance-type: spot containers: - name: awscli image: amazon/aws-cli:latest command: - "aws" - "s3" - "cp" - "/data/vault_secrets.enc" - "s3://jenkins-backups/vault_secrets.enc" imagePullPolicy: Always envFrom: - secretRef: name: aws-creds-secret volumeMounts: - name: backup-dir mountPath: /data initContainers: - name: vault-backup image: warolv/vault-backup command: - "python3" - "vault_handler.py" - "dump" - "-dp" - "/data/vault_secrets.enc" imagePullPolicy: Always envFrom: - secretRef: name: vault-backup-secret volumeMounts: - name: backup-dir mountPath: /data volumes: - name: backup-dir emptyDir: {} Explanation
First 'vault_backup.py' script will run from InitContainer and secrets dump will be created (vault_secrets.enc) and saved to /data folder which is a shared folder for both containers.
The second will run 'awscli' container which will be used to push the secrets dump to a private S3 bucket (AWS CLI is used to copy the secrets dump to the privare S3 bucket). Of course, S3 private bucket must exist.
Credentials for AWS CLI (AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY) and for vault_backup script exported to the environment as k8s secrets.
In this example, I am copying the dump to 's3://jenkins-backups/vault_secrets.enc', in the production use case I suggest adding a timestamp to dump of secrets to be something like vault_secrets_${timestamp}.enc
Creating secrets for CronJob
# create k8s secret for AWS $ kubectl create secret generic aws-creds-secret \ --from-literal=AWS_ACCESS_KEY_ID=YOUR_AWS_ACCESS_KEY_ID \ --from-literal=AWS_SECRET_ACCESS_KEY=YOUR_AWS_SECRET_ACCESS_KEY # create k8s secret with all needed data for vault-backup $ kubectl create secret generic vault-backup-secret \ --from-literal=VAULT_ADDR=http://vault.vault.svc.cluster.local:8200 \ --from-literal=ROLE_ID=YOUR_ROLE_ID \ --from-literal=SECRET_ID=YOUR_SECRET_ID \ --from-literal=ENCRYPTION_KEY=ENCRYPTION_KEY \ --from-literal=VAULT_PREFIX=jenkins It's only an example, you need to put real values.
Deploy vault-backup cronjob
$ kubectl apply -f examples/cronjob/cronjob.yaml How to trigger a Job from CronJob?
In case you want to test your job is working properly:
$ kubectl create job --from=cronjob/vault-backup vault-backup-001 Adding alerts to Prometheus
I am using kube-state-metrics with Prometheus and we have these metrics available: https://github.com/kubernetes/kube-state-metrics/blob/master/docs/cronjob-metrics.md
Let's add an alert for 'failed job' and for cronjob which 'takes too much time', of course, it's only an example to give you an idea.
groups: - name: cronjob.rules rules: - alert: SlowCronJob expr: time()-kube_cronjob_next_schedule_time > 1800 for: 30m labels: severity: warning annotations: description: CronJob {{$labels.namespaces}}/{{$labels.cronjob}} is taking more than 30m to complete summary: CronJob taking more than 30m - alert: FailedJob expr: kube_job_status_failed > 0 for: 30m labels: severity: warning annotations: description: Job {{$labels.namespaces}}/{{$labels.job}} failed summary: Job failure In this post, I described how to automate Vault backup creation using Kubernetes CronJob and a simple python script that I built.
Thank you for reading, I hope you enjoyed it, see you in the next post.
Original story on my blog: https://igorzhivilo.com/vault/scheduled-backup-vault-cronjob/
If you want to be notified when the next post of this tutorial is published, please follow me on Twitter @warolv.
Instagram: @warolv
Top comments (0)