CREATING HIGHLY-AVAILABLE MONGODB MICROSERVICES WITH DOCKER CONTAINERS AND KUBERNETES Marco Bonezzi Senior Technical Services Engineer, MongoDB @marcobonezzi marco@mongodb.com
About the speaker: I am Marco Bonezzi, Senior TSE at MongoDB (TSE = helping customers be successful with MongoDB). Based in Dublin, Ireland. Experience in databases, distributed systems, high availability and containers.
#MDBW17 MICROSERVICES AND CONTAINERS
#MDBW17 THERE ARE COMMON PROBLEMS WHEN USING CONTAINERS: capacity, connectivity, state, isolation, affinity. How do we manage all these?
#MDBW17 THIS IS WHAT WE WILL LEARN TODAY 1. Deploy MongoDB on containers using Kubernetes 2. Build a StatefulSet for our MongoDB deployment 3. Production-like recommendations for replica sets on GCE 4. High-availability considerations for a microservice application
#MDBW17 1. MONGODB ON CONTAINERS USING KUBERNETES
• Why MongoDB is a good fit for microservices: flexible data model, redundancy, scalability, resiliency, alignment (API), time to market
• Benefits of using Kubernetes: orchestration, persistency, monitoring; automates the distribution and scheduling of MongoDB containers across a cluster in a more efficient way
#MDBW17 KUBERNETES BUILDING BLOCKS
Node: provides capacity (CPU, memory)
o Worker machine for pods (and their containers)
o Virtual or physical
o Can be grouped in a cluster, managed by the master
#MDBW17 Pod: consumes capacity
Group of one or more application containers + shared resources for those containers (volumes, IP, image, ports)
#MDBW17 Container: unit of packaging
Isolated process, based on the Linux kernel:
• Namespaces: what a process can see
• cgroups: what a process can use
Application + dependencies + shared kernel + libraries
Example: mongod --dbpath /data/db --port 27017
#MDBW17 Service: allows your application to receive traffic
o Logical set of Pods and a policy by which to access them
o LabelSelector to define the pods targeted
o Types available: ClusterIP, NodePort, LoadBalancer, ExternalName or Headless Service
#MDBW17 BASIC REPLICA SET EXAMPLE: a single node running three mongod containers (one primary, two secondaries), using minikube: https://github.com/kubernetes/minikube
#MDBW17 BASIC REPLICA SET EXAMPLE https://github.com/sisteming/mongo-kube/tree/master/demo1
#MDBW17 2. BUILDING OUR MONGODB STATEFULSET
• StatefulSet: designed for applications that require:
‒ Stable, unique network identifiers (mongo-1 … mongo-n)
‒ Stable, persistent storage
‒ Ordered, graceful deployment and scaling
‒ Ordered, graceful deletion and termination
• Components required: Persistent Volume Claim, Headless Service, StatefulSet
#MDBW17 STATEFULSET (BETA FEATURE FROM KUBERNETES 1.5)
• Provides a unique identity to its Pods, comprising an ordinal
• Identity stays with the Pod, regardless of which node it is scheduled on
• Hostnames: Pods created, scaled and deleted sequentially (i.e. mongo-{0..N-1}: mongo-0, mongo-1, mongo-2)
#MDBW17 HEADLESS SERVICE
• Similar to other Kubernetes services but without any load balancing
‒ Combined with StatefulSets: unique DNS addresses to access pods
‒ Template for the DNS name is <pod-name>.<service-name> (e.g. mongo-1.mongo, mongo-2.mongo)
Replica set configuration then becomes:
rs.initiate()
rs.add("mongo-1.mongo:27017")
rs.add("mongo-2.mongo:27017")
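A minimal sketch of the headless service that enables those DNS names (the mongo name and the role: mongo selector are assumptions, kept consistent with the StatefulSet shown later in this deck):

    apiVersion: v1
    kind: Service
    metadata:
      name: mongo              # pods resolve as <pod-name>.mongo
      labels:
        name: mongo
    spec:
      clusterIP: None          # "headless": no virtual IP, no load balancing
      ports:
        - port: 27017
          targetPort: 27017
      selector:
        role: mongo            # must match the labels on the StatefulSet pods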
#MDBW17 PERSISTENT STORAGE VOLUMES (STABLE FROM KUBERNETES 1.6)
• Storage: critical component for stateful containers
• Dynamic volume provisioning:
‒ Before: provision new storage, then create the volume in Kubernetes
‒ Now: dynamic provisioning using the defined provisioner (see the sketch below)
Physical storage → Persistent Volume → Persistent Volume Claim → Pod (StatefulSet)
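On GCE, dynamic provisioning is driven by a StorageClass; a minimal sketch (the fast name and the SSD disk type are assumptions, matching the SSD volumes shown later):

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: fast
    provisioner: kubernetes.io/gce-pd   # GCE persistent disk provisioner
    parameters:
      type: pd-ssd                      # SSD-backed persistent disks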
#MDBW17 MONGODB STATEFUL SET (YAML configuration, shown as screenshots on the original slides)
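Since the original slides showed the YAML only as screenshots, here is a minimal sketch of what such a manifest could look like on Kubernetes 1.5/1.6 (the names mongo, rs0 and the fast storage class are assumptions carried over from the other examples in this deck, and the image tag is illustrative):

    apiVersion: apps/v1beta1          # StatefulSet API group in Kubernetes 1.5/1.6
    kind: StatefulSet
    metadata:
      name: mongo                     # pods become mongo-0, mongo-1, mongo-2
    spec:
      serviceName: mongo              # the headless service defined earlier
      replicas: 3
      template:
        metadata:
          labels:
            role: mongo               # matched by the headless service selector
        spec:
          terminationGracePeriodSeconds: 10
          containers:
            - name: mongod
              image: mongo:3.4        # illustrative image tag
              command: ["mongod", "--replSet", "rs0", "--bind_ip", "0.0.0.0"]
              ports:
                - containerPort: 27017
              volumeMounts:
                - name: data
                  mountPath: /data/db # the mongod --dbpath
      volumeClaimTemplates:           # one PVC per pod, kept across rescheduling
        - metadata:
            name: data
            annotations:
              volume.beta.kubernetes.io/storage-class: fast
          spec:
            accessModes: ["ReadWriteOnce"]
            resources:
              requests:
                storage: 10Gi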
#MDBW17 MONGODB STATEFUL SET
Architecture: three pods (mongo-0, mongo-1, mongo-2), each running a mongod container backed by its own SSD volume, exposed to the application through the headless service (*.mongo, port 27017).
#MDBW17 SUMMARY: WHAT MAKES STATEFUL SETS GREAT FOR MONGODB
• Unique pod identity: easier to manage
• Stable network: known and predictable hostnames
• Stable storage: persistency resilient to rescheduling
• Scaling: scale application reads
#MDBW17 DEMO 2: REPLICA SET AS A STATEFULSET
mongo-watch: https://github.com/sisteming/mongo-kube/tree/master/mongo-watch
https://github.com/sisteming/mongo-kube/tree/master/demo2
#MDBW17 3. ORCHESTRATING AND DEPLOYING A PRODUCTION-LIKE STATEFULSET
• Replica sets are about high availability: the rs0 members should be spread across Node 0, Node 1 and Node 2
• How do we ensure all containers are evenly distributed?
#MDBW17 • Kubernetes cluster:
o Coordinates a cluster of computers connected to work as a single unit
o Applications decoupled from individual hosts
o Automates the distribution and scheduling of application containers across the cluster
• Two main resources:
o Master: schedules applications, maintains state, handles scaling and rolling updates
o Node (worker machine): Kubelet as the agent (API) + Docker/rkt as the container runtime
#MDBW17 POD SCHEDULING
• Master controls nodes (minions) via the Scheduler
• Responsible for tracking all resources and pods
• Takes care of: resource requirements, constraints, affinity/anti-affinity
#MDBW17 SCHEDULING OUR REPLICA SET MEMBERS (BETA FEATURES IN KUBERNETES 1.6)
• nodeSelector: a node is eligible if it has each of the key-value pairs as labels (e.g. env: prod, rs: rs0 vs. env: test, rs: rs-t1)
• Node labels: standard set of labels (beta.kubernetes.io): hostname, OS, instance type, arch
• Affinity: soft/hard constraints on nodes and pods; nodeAffinity matches node labels, pod affinity/anti-affinity matches labels on pods currently running (see the sketch after this list)
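As a sketch, the hard pod anti-affinity rule described above would sit in the StatefulSet's pod spec roughly like this (the role: mongo label is an assumption consistent with the earlier manifests):

    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:   # hard constraint
            - labelSelector:
                matchExpressions:
                  - key: role
                    operator: In
                    values: ["mongo"]
              topologyKey: kubernetes.io/hostname           # at most one matching pod per node

With this rule the scheduler refuses to place two mongo pods on the same node, which is exactly the even distribution asked for above.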
#MDBW17 DISTRIBUTION OVER MULTIPLE NODES
One pod per node: mongo-0 / rs0, mongo-1 / rs0 and mongo-2 / rs0 spread across Node 0, Node 1 and Node 2.
#MDBW17 COMPUTE RESOURCES FOR CONTAINERS
We can define how much CPU and memory each container needs:
• Requests: how much I'd like to get (scheduling)
• Limits: the most I can get (contention)
CPU units: spec.containers[].resources.requests.cpu / spec.containers[].resources.limits.cpu
Memory bytes: spec.containers[].resources.requests.memory / spec.containers[].resources.limits.memory
Scheduler: ensures that the sum of the resource requests of the scheduled containers is less than the capacity of the node (sketch below).
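A sketch of these fields on the mongod container (the numbers are purely illustrative); note, as the speaker notes later point out, that WiredTiger sizes its cache from the node's total memory rather than the container limit, so the cache is capped explicitly here:

    containers:
      - name: mongod
        image: mongo:3.4
        # WiredTiger sees the node's memory, not the container limit,
        # so cap the cache explicitly (~50% of the memory limit below).
        command: ["mongod", "--replSet", "rs0", "--wiredTigerCacheSizeGB", "1"]
        resources:
          requests:             # what the scheduler uses to place the pod
            memory: "2Gi"
            cpu: "500m"
          limits:               # the most the container can get at runtime
            memory: "2Gi"
            cpu: "1"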
#MDBW17 HIGH AVAILABILITY IN OUR STATEFULSET
• MongoDB Replica Set: replicated copy of our data, automatic failover, scalability of read operations
• Kubernetes + StatefulSets: container restart (same node), container reschedule (different node), persistent volumes
#MDBW17 4. MICROSERVICE APPLICATIONS WITH MONGODB ON KUBERNETES STATEFULSET
Connecting our microservice application (within Kubernetes):
mongodb://mongo-0.mongo,mongo-1.mongo,mongo-2.mongo:27017/APP_?
Cases to be handled:
o High availability:
1. Failover (change of primary in the replica set)
2. Pod killed (i.e. pod is killed and comes back up)
3. Pod rescheduled (i.e. node is down and the pod gets scheduled to a different node)
o Geographically distributed reads:
1. Read from the nearest secondary using ReadPreference + replica set tags
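For illustration, a fully spelled-out connection string for the read-from-nearest case might look like the following (the rs0 replica set name and the app database name are assumptions):

    mongodb://mongo-0.mongo,mongo-1.mongo,mongo-2.mongo:27017/app?replicaSet=rs0&readPreference=nearest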
#MDBW17 DEMO: Q&A LIST
Scenarios:
1. Replica set failover
2. Pod with the primary mongod is killed
3. Node with the primary mongod is down
#MDBW17 DEMO 3: MONGODB REPLICA SET STATEFULSET ON GOOGLE COMPUTE ENGINE https://github.com/sisteming/mongo-kube/tree/master/demo3
#MDBW17 SUMMARY
Key elements for a successful MongoDB StatefulSet on Kubernetes: resource requests/limits, persistent state, isolation, affinity, scheduling, unique network identifiers, high availability, labels.
Further improvements: monitoring, liveness probes, disruption budget (sketch below).
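A sketch of two of the further improvements mentioned above, under the same assumptions as the earlier manifests (the probe command and the budget value are illustrative):

    # Liveness probe, inside the mongod container spec:
    livenessProbe:
      exec:
        command: ["mongo", "--eval", "db.adminCommand('ping')"]
      initialDelaySeconds: 30
      periodSeconds: 10
    ---
    # Disruption budget: keep a replica set majority during voluntary disruptions
    apiVersion: policy/v1beta1
    kind: PodDisruptionBudget
    metadata:
      name: mongo-pdb
    spec:
      minAvailable: 2             # 2 of 3 members = majority
      selector:
        matchLabels:
          role: mongo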
#MDBW17 THANK YOU! https://github.com/sisteming/mongo-kube
Editor's Notes

  • #2 Thank you everybody for being here today for this talk. It is really exciting to be speaking today about these technologies: MongoDB, Docker containers and Kubernetes. We get many questions on how to combine these technologies successfully, and in this talk you'll learn some useful information that will help you build successful microservices using MongoDB on Kubernetes.
  • #3 Let me introduce myself, my name is Marco Bonezzi, I'm a Senior TSE (or Technical Services Engineer) at MongoDB. What TSE really means is that we help customers to be successful with MongoDB by assisting them to make the most of MongoDB for their particular solution. I am based in Dublin, Ireland. My main experience is in databases, distributed systems and high availability solutions with different database technologies.
  • #4 In the last few years, microservices architectures have become more and more frequent, even for traditional organisations. You might be familiar with the 12-factor app: codebase, dependencies, config, backing services, build-release-run, processes. All data which needs to be stored must use a stateful backing service, usually a database. And from our experience, MongoDB is a database frequently used to build such microservice architectures.
  • #5 This also means that we have seen many cases and articles covering the problems or defects of Docker containers: How to manage and distribute containers? How to combine container technologies with a MongoDB highly available architecture? Should I even use containers for stateful applications and store any data? This doesn't mean that containers are bad or shouldn't be used. But it means that there are important things to consider when implementing our application on a microservice architecture. It's easy to find examples of sample apps deployed on containers. But to be successful with containers, we need to manage elements like capacity, affinity, isolation, state… How do we manage this when we only want to deploy our application and make sure it will be resilient and highly available?
  • #6 We will see how to deploy MongoDB containers in Kubernetes; how we move to the next step, what a StatefulSet is and how to implement a StatefulSet for our MongoDB replica set; how we manage and orchestrate it using Kubernetes; and how a toy microservice application will interact with MongoDB running on a StatefulSet in terms of high availability.
  • #7 MongoDB is a common datastore frequently used for microservices, due to its features for scalability, high availability or flexible data model. When talking about microservices, Kubernetes is becoming the default solution to manage and orchestrate containers, and for our MongoDB cluster it can help us correctly orchestrate and schedule our containers, and also help us with the persistency for each of them.
  • #8 You may already be familiar with these concepts, but we will cover the basic Kubernetes building blocks required for our MongoDB cluster. The first element are the nodes, which provide capacity to run our containers or pods. These are the worker machines that will provide resources like memory and CPU for our containers. In common configurations, nodes are grouped in a cluster, which is managed by a Kubernetes master.
  • #9 Pods are logical groups of one or more containers, plus resources that can be shared between them, like volumes or IP addresses. They will use the capacity provided by the nodes to run the processes running inside each container.
  • #10 You are all probably aware of what containers are at this point. Containers are generally based on the Docker runtime engine, where a process in userspace is run in isolation from the rest of the processes in the system. A different definition can also say that "container" is just a term people use to describe a combination of Linux namespaces and cgroups. The confusion with containers occurs when we assume that they fulfill the same use case as VMs, zones or jails, which they do not. Containers allow for a flexibility and control that is not possible with jails, zones, or VMs. And THAT IS A FEATURE. These features are what is pushing the use of containers as a basic unit of packaging for our applications.
  • #11 Once we have our containers running, the next step is to allow our application to receive traffic. This is done by using services, where we define a logical set of pods and a policy by which we will access each of them. For instance, there are different types like NodePort, LoadBalancer or ClusterIP and the decision on which one we’ll use is based on the access pattern required for the type of application we deploy.
  • #12 Now that we know the basic elements in a Kubernetes deployment, we will start by deploying a really simple replica set with 3 pods, each running the mongod process, on a single Kubernetes node using minikube. Minikube runs a single-node Kubernetes cluster inside a VM on your laptop for users looking to try out Kubernetes or develop with it day-to-day. With these basic steps, we will see how to connect each container to configure the replica set so that we will have a primary node and two secondaries.
  • #13 You can find the details and steps for this first test in the link above. In this example, we are using the IP for each pod as by default the hostname won’t be reachable by others.
  • #14 Ok so we've seen how to deploy a replica set in Kubernetes, but this was a very simple and manual approach. The next step in our journey is now to build a StatefulSet (previously known as PetSets), which is a special Kubernetes controller designed for applications that need: unique network identifiers; persistent storage; ordered, graceful deployment, scaling and termination. To create a StatefulSet, there are some elements we need to create in advance: Persistent Volumes, a Headless Service (special for this use case), and then the StatefulSet itself.
  • #15 StatefulSets were added as a beta feature in Kubernetes 1.5, and they are becoming a critical solution for Kubernetes to handle stateful, distributed applications such as MongoDB clusters. One of the key points of StatefulSets is the use of a unique identity for each pod, including an ordinal. This means that the hostname for each pod will be based on the StatefulSet name + an ordinal, and pods will be created, scaled and deleted sequentially based on this identity.
  • #16 Headless Services in Kubernetes are similar to other services, but without any load balancing, which for our case, makes sense. Kubernetes schedules a DNS Pod and Service on the cluster. Every Service is assigned a DNS name. Unlike normal Services, this resolves to the set of IPs of the pods selected by the Service. This means that each pod will be able to contact the rest of pods in the statefulset just by using the statefulset name (mongo), the ordinal (0,1,2) and the headless service (mongo). And this can be quite handy not only for inter-node connectivity (which is based on an overlay network), but also for easier replica set configuration.
  • #17 StatefulSets aim to provide a solution to maintain the state for stateful applications. This solution is based on persistent storage and dynamic volume provisioning, included as stable in Kubernetes 1.6. We have physical storage defined, from which Kubernetes will create Persistent Volumes to be used by the Persistent Volume Claims. From the Pod's point of view, the claim is a volume. The difference between PV and PVC is that we can have a PV from which we create multiple PVCs with smaller sizes. For example, when using minikube, we will have to manually create both PV and PVC. But when deploying on GCE, Azure and AWS, Kubernetes will already have the required provisioner to dynamically provision these elements.
  • #18 This is an example of a StatefulSet YAML configuration file. We can see here the definition of the Service, replicas: 3, labels, specs and the affinity rules we'll see next, the container definition and command, volume mounts and volume claim templates, and persistent storage.
  • #20 Now that we covered the main components for our MongoDB replica set as a StatefulSet, we can see here the diagram of the architecture for the StatefulSet deployment. Each container, running the mongod process, will have a unique hostname with an identifier. Each will also be attached to an SSD volume based on persistent storage on the node where the pod is running. The headless service then, defined on the default port for MongoDB, means that we will be able to access each node as mongo-0.mongo on port 27017. Our application can then connect with these details.
  • #21 StatefulSets are a great addition to Kubernetes, as they provide better options for distributed applications like MongoDB: easier management of each MongoDB pod; configuring and scaling based on predictable hostnames; persistent volumes bound to each pod, regardless of the node where they run; and increased availability and scaling of your application's reads with multiple secondaries.
  • #22 Now that we learned what StatefulSets are and how they can benefit MongoDB running on Kubernetes, we can see how we can implement them. With StatefulSets and Kubernetes, we can use Kubernetes features like init containers or jobs to configure the MongoDB replica set. In this case, I wrote a simple shell script executed by a Kubernetes job that will monitor the replica set members, configure each of them as primary and secondaries, and eventually also add new members to the replica set if a new container is detected, using the hostname convention.
  • #23 When deploying MongoDB in production, we always want to deploy a replica set, ideally a primary and 2 or more secondaries. When using containers, it is important to realise that running MongoDB on 3 or more containers for a replica set is not enough. We also need to ensure that each member of the replica set, each container, is located on a different node from the rest of the members of the same replica set. How can we do this with Kubernetes?
  • #24 For a production deployment of Kubernetes, we need to build a Kubernetes cluster with 3 or more nodes. They will work as a single unit and automate the distribution and scheduling of applications thanks to the master node and its integrated scheduler. The two main resources for a Kubernetes cluster: Master => responsible for scheduling the applications onto each node, maintaining state, and scaling or performing rolling updates in a given deployment. Worker nodes => they each have the Kubelet (agent using the API to talk to the API server) and then Docker or rkt as the container runtime engine responsible for running the containers.
  • #25 A critical element for every Orchestration tool like Swarm or Kubernetes, is the ability and flexibility to schedule containers to the available worker nodes. This work is done by the Scheduler process running on the master, which is responsible for tracking all resources and pods. For example, when we run a pod in Kubernetes, the Scheduler on the master will check which nodes are eligible to meet: Any resource requirements from the pod Any constraints defined in the pod definition (like labels for the environment type) Affinity or anti-affinity requested by the pod, both inter-node and inter-pod
  • #26 Once we know we have to distribute each replica set member, each container, across different nodes, we need to think of ways of doing this in an automated way. We have different options to do this, but the most relevant are: nodeSelector: filter eligible nodes based on labels (like replica set or type of environment); node labels: hostname, instance type or OS (for example, avoid a specific hostname or OS); affinity: soft or hard constraints on nodes and pods running on nodes (for example, if a given pod is running on a given node, don't pick that node). This last concept of affinity or anti-affinity is actually really useful and can help us with our purpose: we can define a hard affinity constraint so that the scheduler won't schedule a pod for rs1 on a node where there is already a pod for rs1 running. https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  • #27 Based on this logic, we can see how it is fairly easy to define this type of pod affinity constraint just based on key-value expressions. In this example, we have a cluster created on Google Cloud, where each node is located in a different zone within europe-west. With this approach, we can deploy each container for rs1 (mongo-0, mongo-1 and mongo-2) onto each of these nodes. We can see with get pod -o wide the actual node where the pods are running and how we achieved our goal of having each pod on a different node. This also means that, when using such a hard affinity constraint, if, say, node1 becomes unavailable, it won't be possible to schedule a pod for rs1 on the 2 remaining nodes. Tradeoff: the scheduler will report: Warning FailedScheduling No nodes are available that match all of the following predicates:: NoVolumeZoneConflict (2).
  • #28 A key element to be aware of is the resources used by each container. By default there will be no resource request or limit set. But it is a good practice, in particular with databases such as MongoDB, to define the expected boundaries for resource utilization, as this will help to prevent performance issues due to resource contention (multiple containers on the same node). In Kubernetes, we can define requests (how much the container would like to get) and limits (how much the container can get). An important point here is that the WiredTiger cache size in MongoDB is based on the total memory on the system, not on the container. It is critical then to set the cache size manually to 50% of the memory limit we define. The Kubernetes scheduler compares the resource requests to the capacity of the available nodes, and will select the target node also based on this capacity. It is important to understand application behaviour when reaching, for example, the memory limit, as this can trigger unexpected behaviours or even cause the pod to be terminated.
  • #29 Availability of our microservice application is important, so we can see the benefits from MongoDB and Kubernetes. The use of a replica set guarantees high availability by providing: a replicated copy of our data, automatic failover, and further scalability of read operations by scaling the number of secondary nodes. In addition to this, Kubernetes StatefulSets provide us with increased availability: containers killed/deleted are restarted on the same node; in case of a node becoming unavailable, containers will be rescheduled to different available nodes; persistent volumes are bound to the container, so even when a node becomes unavailable, the container rescheduled on a different node will be able to use its previously used persistent volume.
  • #30 At this point, we have a fully configured replica set running on a Kubernetes StatefulSet. With this, we can then connect our microservice application by providing the connection string of the replica set, with all nodes involved. There is a great talk from last year's MongoDB World by my colleague Jesse: MongoDB World 2016: Smart Strategies for Resilient Applications. In summary, the cases we need to handle are: HA: failover (change of primary), a pod gets killed, a pod is rescheduled to a different node; Scalability: scale the reads with multiple secondaries.
  • #31 MongoDB World 2016: Smart Strategies for Resilient Applications
  • #33 In summary, we have seen the various building blocks of Kubernetes, and how we can implement a MongoDB replica set using StatefulSets in Kubernetes. While this might seem complex, we can see how we handled scheduling, affinity, and resource requests and limits thanks to cgroups; how StatefulSets provide us with unique network identifiers and persistent state for our MongoDB replica set; and how we can handle isolation and high availability by using labels and affinity rules. Possible improvements can include: liveness probes: tests to detect if the service from a container is alive; disruption budget: the minimal configuration in terms of pods and nodes that our application or service can afford. In summary, we can see how the use of containers and Kubernetes is not only about getting our application to run on containers; once we are there, there are multiple features that we can use to improve the availability and resiliency of our applications, which at the end of the day can help our lives as engineers but also the business of our companies.