Hi there 👋, let's see how to deploy HarperDB on EKS and then test it with an API call using curl. You can get the Kubernetes manifests that we create in this post from this link.
I assume you are already familiar with topics such as Deployments, LoadBalancer Services, Secrets and PersistentVolumeClaims.
Ensure you have the required IAM permissions, have installed the aws, eksctl and kubectl CLI tools, and have set up the AWS config and credentials.
For me the config is as follows.
$ cat ~/.aws/config
[default]
region=us-east-1
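Before creating the cluster, it can help to confirm that the three CLI tools are actually on the PATH. The following is a minimal sketch; the helper name missing_tools is hypothetical and not part of any of these tools.

```shell
# Print the name of every given CLI tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the three tools this post relies on.
missing=$(missing_tools aws eksctl kubectl)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found"
fi
```

If any tool is reported as missing, install it before continuing with the steps below.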
Cluster
We can now create an EKS cluster with eksctl. You can watch this video for cluster creation from the CLI.
$ eksctl create cluster --name eks-cluster --zones=us-east-1a,us-east-1b
This took around 20 minutes for me. Once it's done, we can update the kubeconfig.
$ aws eks update-kubeconfig --name eks-cluster
Docker hub
We can visit the Docker Hub page of HarperDB to get an idea of the ports, environment variables, volume path, etc.
It gives an example docker command as below.
docker run -d \
  -v /host/directory:/opt/harperdb/hdb \
  -e HDB_ADMIN_USERNAME=HDB_ADMIN \
  -e HDB_ADMIN_PASSWORD=password \
  -p 9925:9925 \
  harperdb/harperdb
This tells us that the volume mount path in the container is /opt/harperdb/hdb, that there are 2 environment variables for the username and password, and that the container port is 9925. Finally, the image is harperdb/harperdb.
We now have enough info to start writing our Kubernetes manifests.
Kubernetes manifests
I am going to create a directory named harperdb to keep all the manifests.
$ mkdir harperdb
$ cd harperdb
Let's begin with the environment variables. We can put both the username and the password in a Secret object.
$ cat <<EOF > secret.yaml
---
apiVersion: v1
kind: Secret
metadata:
  name: harperdb
  namespace: harperdb
stringData:
  HDB_ADMIN_USERNAME: admin
  HDB_ADMIN_PASSWORD: password12345
...
EOF
Next comes a persistent volume claim, which can dynamically provision a 5Gi EBS volume in AWS.
$ cat <<EOF > pvc.yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: harperdb
  namespace: harperdb
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
...
EOF
Then comes the deployment manifest, where we define the container image, refer to the secret for the env vars, and refer to the PVC for the volume. Note that the volume mount path matches the one in the docker command.
$ cat <<EOF > deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: harperdb
  namespace: harperdb
spec:
  selector:
    matchLabels:
      app: harperdb
  template:
    metadata:
      labels:
        app: harperdb
    spec:
      containers:
      - name: harperdb
        image: harperdb/harperdb
        envFrom:
        - secretRef:
            name: harperdb
        volumeMounts:
        - name: data
          mountPath: /opt/harperdb/hdb
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: harperdb
...
EOF
Finally, we have to expose the deployment with a service. We know from the docker command that the container port is 9925.
$ cat <<EOF > svc.yaml
---
apiVersion: v1
kind: Service
metadata:
  name: harperdb
  namespace: harperdb
spec:
  selector:
    app: harperdb
  type: LoadBalancer
  ports:
  - name: http
    port: 8080
    targetPort: 9925
...
EOF
Note that we have used 8080 as the service port.
Workloads
Create a namespace named harperdb, where we can create our objects.
$ kubectl create ns harperdb
namespace/harperdb created
We are now ready to create the objects from the 4 manifests.
$ ls
deploy.yaml  pvc.yaml  secret.yaml  svc.yaml
$ kubectl create -f .
deployment.apps/harperdb created
persistentvolumeclaim/harperdb created
secret/harperdb created
service/harperdb created
Fix PVC
At this point the PVC should be in Pending status.
$ kubectl get pvc -n harperdb
NAME       STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
harperdb   Pending                                      gp2            7m3s
Please follow this link to add the IAM role in AWS and the EBS CSI objects on the cluster. This should fix the PVC issue.
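For orientation, here is one common eksctl-based route to installing the EBS CSI driver. It may differ from the steps in the linked guide, so treat it as a sketch under assumptions: the role name AmazonEKS_EBS_CSI_DriverRole and the placeholder account ID 111122223333 are illustrative, and the run helper is hypothetical. By default the script only prints the commands instead of executing them.

```shell
# Assumed names: the cluster "eks-cluster" comes from this post; the role name
# and the placeholder account id are illustrative, not from the linked guide.
CLUSTER="eks-cluster"
ROLE_NAME="AmazonEKS_EBS_CSI_DriverRole"
ACCOUNT_ID="${ACCOUNT_ID:-111122223333}"

# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

ebs_csi_steps() {
  # 1. Associate an IAM OIDC provider with the cluster.
  run eksctl utils associate-iam-oidc-provider --cluster "$CLUSTER" --approve
  # 2. Create the IAM role for the CSI controller's service account.
  run eksctl create iamserviceaccount \
    --name ebs-csi-controller-sa --namespace kube-system \
    --cluster "$CLUSTER" --role-name "$ROLE_NAME" --role-only \
    --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
    --approve
  # 3. Install the EBS CSI driver as an EKS add-on that uses that role.
  run eksctl create addon --name aws-ebs-csi-driver --cluster "$CLUSTER" \
    --service-account-role-arn "arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}" --force
}

ebs_csi_steps
```

To run it for real, the account ID could be taken from `aws sts get-caller-identity --query Account --output text` and the script invoked with DRY_RUN=0.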
Once done, the PVC should be bound to a persistent volume (PV).
$ kubectl get pvc -n harperdb
NAME       STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
harperdb   Bound    pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7   5Gi        RWO            gp2            9s
And the pv should be mapped to an EBS volume.
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM               STORAGECLASS   REASON   AGE
pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7   5Gi        RWO            Delete           Bound    harperdb/harperdb   gp2                     64s
$ kubectl describe pv pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7 | grep VolumeID
VolumeID: vol-0bbca736346f02aa1
Note that a persistent volume is a cluster level object and not bound to a namespace. We can check the volume details from the aws cli.
$ aws ec2 describe-volumes --volume-ids vol-0bbca736346f02aa1 --query "Volumes[0].Size"
5
$ aws ec2 describe-volumes --volume-ids vol-0bbca736346f02aa1 --query "Volumes[0].Tags"
[
    {
        "Key": "ebs.csi.aws.com/cluster",
        "Value": "true"
    },
    {
        "Key": "CSIVolumeName",
        "Value": "pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7"
    },
    {
        "Key": "kubernetes.io/created-for/pv/name",
        "Value": "pvc-7c83e38c-b00a-4194-8c67-ba5c9c1118e7"
    },
    {
        "Key": "kubernetes.io/created-for/pvc/namespace",
        "Value": "harperdb"
    },
    {
        "Key": "kubernetes.io/created-for/pvc/name",
        "Value": "harperdb"
    }
]
Volume permission fix
So the pvc seems good. Let's check our application status.
$ kubectl get po -n harperdb
NAME                        READY   STATUS             RESTARTS      AGE
harperdb-79694c8b75-6ckn7   0/1     CrashLoopBackOff   4 (80s ago)   3m25s
The application was crashing, even though the volume was getting mounted and the env vars were fine. To narrow it down, I commented out volumeMounts and volumes and updated the deployment.
$ grep '#' deploy.yaml
#volumeMounts:
#- name: data
#mountPath: /opt/harperdb/hdb
#volumes:
#- name: data
#persistentVolumeClaim:
#claimName: harperdb
$ kubectl apply -f deploy.yaml
With the volume removed, the pod was running, so I checked the permissions of the directory where we need to mount the volume, and then the ID of the running user's group.
$ kubectl exec -it deploy/harperdb -n harperdb -- bash
ubuntu@harperdb-858cc7967d-5jcqm:~$ ls -l /opt/harperdb
total 0
drwxr-xr-x 11 ubuntu ubuntu 155 Jan 9 06:59 hdb
ubuntu@harperdb-858cc7967d-5jcqm:~$ id
uid=1000(ubuntu) gid=1000(ubuntu) groups=1000(ubuntu)
ubuntu@harperdb-858cc7967d-5jcqm:~$ exit
The group ID of the running user is 1000, so we can set this as the group owner of the volume directory with the fsGroup option. If we don't specify it, the mount path is owned by root (user) and root (group) by default, and the running user ubuntu has no permission to create new files there. This video has information about fsGroup.
We have to change the deployment as follows. We have restored the volume sections and added the security context with the fsGroup.
$ cat deploy.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: harperdb
  namespace: harperdb
spec:
  selector:
    matchLabels:
      app: harperdb
  template:
    metadata:
      labels:
        app: harperdb
    spec:
      securityContext:
        fsGroup: 1000
      containers:
      - name: harperdb
        image: harperdb/harperdb
        envFrom:
        - secretRef:
            name: harperdb
        volumeMounts:
        - name: data
          mountPath: /opt/harperdb/hdb
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: harperdb
...
Alternatively, we could set mountPath to just /opt/harperdb, in which case we wouldn't have to set the securityContext. But I thought this was a good use case for learning about fsGroup.
Update the deployment.
$ kubectl apply -f deploy.yaml
Check the workloads.
$ kubectl get all -n harperdb
NAME                           READY   STATUS    RESTARTS   AGE
pod/harperdb-cc4f49dfc-m7d5p   1/1     Running   0          55s

NAME               TYPE           CLUSTER-IP     EXTERNAL-IP                                                              PORT(S)          AGE
service/harperdb   LoadBalancer   10.100.54.78   a0ba701c9c5a4463bb636551c79b4158-169592876.us-east-1.elb.amazonaws.com   8080:31819/TCP   55s

NAME                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/harperdb   1/1     1            1           57s

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/harperdb-cc4f49dfc   1         1         1       57s
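Once the pod is running, the effect of fsGroup could be checked by looking at the numeric owner of the mount path. Below is a small sketch of such a check; the helper dir_owner is hypothetical, and it is demonstrated on a local scratch directory because the pod is not reachable from a standalone script.

```shell
# Print the numeric "uid:gid" owner of a directory.
# ls -n prints numeric ids; fields 3 and 4 are the owner and the group.
dir_owner() {
  ls -ldn "$1" | awk '{print $3 ":" $4}'
}

# Local demonstration on a scratch directory.
scratch=$(mktemp -d)
dir_owner "$scratch"
rmdir "$scratch"
```

Against the cluster, the same listing would come from `kubectl exec deploy/harperdb -n harperdb -- ls -ldn /opt/harperdb/hdb`, where the group column would be expected to show 1000 after the fsGroup change.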
API call
Send a request with curl to test schema creation. The endpoint comes from the EXTERNAL-IP column of the service, combined with the service port 8080. You can check this video to see how to obtain the curl command for HarperDB.
$ HDB_API_ENDPOINT=http://a0ba701c9c5a4463bb636551c79b4158-169592876.us-east-1.elb.amazonaws.com:8080
$ curl --location --request POST ${HDB_API_ENDPOINT} \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' \
  --data-raw '{ "operation": "create_schema", "schema": "qa" }'
{"message":"schema 'qa' successfully created"}
All good, it's working...
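The value in the Authorization header of the call above is the Base64 encoding of username:password, using the credentials from secret.yaml. A minimal sketch of deriving it on the command line follows; the function name basic_auth_header is hypothetical.

```shell
# Credentials as set in secret.yaml.
HDB_USER='admin'
HDB_PASS='password12345'

# Basic auth is base64("user:password"); printf avoids a trailing newline,
# which would otherwise change the encoded value.
basic_auth_header() {
  printf 'Authorization: Basic %s\n' "$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header "$HDB_USER" "$HDB_PASS"
# → Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==
```

Passing `--user admin:password12345` to curl should produce the same header without encoding it by hand. The load balancer hostname could likewise be read with `kubectl get svc harperdb -n harperdb -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'` instead of copying it from the table.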
Persistence
Test persistence by deleting the pod.
$ kubectl delete po -n harperdb -l app=harperdb
pod "harperdb-cc4f49dfc-m7d5p" deleted
This should launch a new pod.
$ kubectl get po -n harperdb
NAME                       READY   STATUS    RESTARTS   AGE
harperdb-cc4f49dfc-c6vnc   1/1     Running   0          57s
We can try sending the same API call again.
$ curl --location --request POST ${HDB_API_ENDPOINT} \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic YWRtaW46cGFzc3dvcmQxMjM0NQ==' \
  --data-raw '{ "operation": "create_schema", "schema": "qa" }'
{"error":"Schema 'qa' already exists"}
It does not create a new schema, because the existing schema was loaded from the attached volume when the new pod started. Hence, the data is persistent.
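If this check were scripted, the two responses seen in this post could be told apart by matching on their text. The sketch below does that; the function name schema_state is hypothetical, and it relies only on the two message strings shown above, so other HarperDB responses fall through to "unknown".

```shell
# Classify a create_schema response body by its message text.
schema_state() {
  case "$1" in
    *"successfully created"*) echo "created" ;;
    *"already exists"*)       echo "exists" ;;
    *)                        echo "unknown" ;;
  esac
}

# The two responses from this post, before and after the pod was deleted.
schema_state "{\"message\":\"schema 'qa' successfully created\"}"
schema_state "{\"error\":\"Schema 'qa' already exists\"}"
```

A result of "exists" after the pod restart is what indicates that the schema survived on the volume.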
Clean up
Let's do the clean up...
Delete all the objects that were created via manifests.
$ kubectl delete -f .
deployment.apps "harperdb" deleted
persistentvolumeclaim "harperdb" deleted
secret "harperdb" deleted
service "harperdb" deleted
Then delete the namespace.
$ kubectl delete ns harperdb
namespace "harperdb" deleted
Delete the folder.
$ cd ..
$ rm -rf harperdb
Finally delete the cluster.
$ eksctl delete cluster --name eks-cluster
That's it for the post. Thank you for reading!