Scaling MongoDB with Docker and cgroups Marco Bonezzi Technical Services Engineer, MongoDB marco@mongodb.com @marcobonezzi
#MDBW16 About the speaker I am Marco Bonezzi, TSE at MongoDB TSE = helping customers with a variety of issues Based in Dublin, Ireland Experience in distributed systems, high availability:
#MDBW16 How many of you have ever... 1. … manually deployed a MongoDB replica set or sharded cluster? 2. ... had issues with resource allocation? 3. ... used Docker? 4. … used MongoDB running on Docker?
#MDBW16 We know how it feels… Different architectures in Development and Production Co-located MongoDB processes Production != docker run mongodb
#MDBW16 What are the pain points? Deployment Complex architectures and deployment patterns Resource Contention Multiple instances competing for resources MongoDB & Docker Combining both: network configuration, container location, other
#MDBW16 How to solve this? Deployment Using predefined cluster patterns Replicating environments Resource Control Setting limits to key resources MongoDB & Docker Create once, deploy everywhere Deploy patterns, not processes Orchestration Resource Management Automate for scaling
#MDBW16 About this talk Patterns for successful deployments Difference between success and failure Orchestrating MongoDB with Docker MongoDB cluster on AWS with containers Patterns with Swarm and Compose Managing container resources with cgroups Benefits of cgroups in a MongoDB cluster
#MDBW16 Deployment patterns: Replica Set and Sharded Clusters Redundancy and fault tolerance Deploy an odd number of voting members Members ⇔ Majority required ⇔ Fault tolerance High availability and resource colocation Single member of a replica set / server Shards as Replica Set Ideally: primary / secondary / secondary [Diagram: Server 1, Server 2 and Server 3, each running a mongos, a config server (cfgsvr1 to cfgsvr3) and one member of each of RS1, RS2 and RS3; each server hosts the primary of a different replica set]
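The relationship between members, required majority and fault tolerance on the slide above can be sketched with a little arithmetic. This is a minimal illustration, assuming every member is a voting member; the function names are made up for this example.

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary: more than half of the voting members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """Members that can be lost while a majority can still be reached."""
    return voting_members - majority(voting_members)


if __name__ == "__main__":
    print("members  majority  fault tolerance")
    for n in range(3, 8):
        print(f"{n:7d}  {majority(n):8d}  {fault_tolerance(n):15d}")
```

Going from 3 to 4 members raises the required majority without raising the fault tolerance, which is one way to see why an odd number of voting members is recommended.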
#MDBW16 Docker •  Noun: a person employed in a port to load and unload ships (from “what is docker” on Google) Containers: Isolated process in userspace Application + dependencies Shared kernel and libraries Can run on any infrastructure (or cloud) www.docker.com
#MDBW16 44% of orgs adopting microservices Why use Docker? 41% want application portability 13x improvement in release frequency 62% MTTR on software issues 60% Using Docker to migrate to cloud Reason to run containers: SPEED Microservices architectures Efficiency Cloud (The Docker Survey, 2016)
#MDBW16 Good news: Docker can help with this! Orchestration Coordinate MongoDB containers to deploy a recommended deployment pattern Resource Control Sizing each instance (and each cluster) to avoid resource contention issues
#MDBW16 Orchestrating MongoDB with Docker How can we use Docker for MongoDB deployments? How can we deploy these patterns using Docker containers? Why should we use Docker? Our recipe:
#MDBW16 Docker ecosystem Docker Machine: provisioning and managing your Dockerized hosts Docker Swarm: native clustering that turns a pool of Docker hosts into a single, virtual Docker host Docker Compose: define a multi-container application with all of its dependencies in a single file
#MDBW16 Why Docker Swarm? 5x faster than Kubernetes to spin up a new container 7x faster than Kubernetes to list all running containers Evaluating Container Platforms at Scale 1000 EC2 instances in a cluster What is their performance at scale? Can they operate at scale? What does it take to support them at scale? https://medium.com/on-docker/evaluating-container-platforms-at-scale-5e7b44d93f2c#.k2fxds8c2 https://www.docker.com/survey-2016
#MDBW16 Swarm multi-host networking (swarm-node-1, swarm-node-2, swarm-node-3) How does each mongod process connect to other processes outside of the host? • Swarm overlay container-to-container network • Using the hostname defined in the Compose file
#MDBW16 Swarm filters to build our patterns Constraint filters Mark each mongod container with a label: “role=mongod” “replset=rs1”
#MDBW16 Affinity filters for container distribution Prevent multiple RS members on the same host: "affinity:replset!=rs1" (swarm-node-1, swarm-node-2, swarm-node-3)
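The labels and the affinity filter from the last two slides can be combined into one Compose service definition per replica set member. The sketch below builds that structure as a plain Python dictionary. It assumes the standalone Swarm convention of passing filters as environment entries; the image tag `mongo:3.2` and the helper name `mongod_service` are assumptions made for this example, not taken from the talk.

```python
import json


def mongod_service(name: str, replset: str) -> dict:
    """Describe one replica set member as a Compose-style service (illustrative)."""
    return {
        "image": "mongo:3.2",  # assumed image tag
        "hostname": name,  # other containers reach this member by hostname
        "command": ["mongod", "--replSet", replset],
        # Constraint side: mark the container so filters can refer to it.
        "labels": ["role=mongod", f"replset={replset}"],
        # Affinity side: avoid nodes that already run a member of this set.
        "environment": [f"affinity:replset!={replset}"],
    }


services = {name: mongod_service(name, "rs1") for name in ("rs1a", "rs1b", "rs1c")}

if __name__ == "__main__":
    print(json.dumps(services, indent=2))
```

With three members and three Swarm nodes, the intended effect of the filter is that each member lands on a different node.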
#MDBW16 Road to container success Deploying containers to the right node is not enough… Next step: Resource control on each swarm cluster node using cgroups Maritime New Zealand
#MDBW16 Resource control with cgroups and Docker Simple parameters to add to docker run or compose: --cpu-shares --cpuset-cpus --memory --blkio-weight --net
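To make the parameters above concrete, here is a small sketch that assembles a `docker run` command line with some of these limits. The flags are the ones listed on the slide; the container name, the image and the chosen values are placeholders, and the helper `docker_run_args` is hypothetical.

```python
from typing import List, Optional


def docker_run_args(
    name: str,
    image: str,
    memory: str,
    cpuset_cpus: str,
    cpu_shares: Optional[int] = None,
    blkio_weight: Optional[int] = None,
) -> List[str]:
    """Build the argument list for a resource-limited `docker run` (illustrative)."""
    args = ["docker", "run", "-d", "--name", name]
    args += ["--memory", memory]  # hard memory limit for the container
    args += ["--cpuset-cpus", cpuset_cpus]  # cores the container may run on
    if cpu_shares is not None:
        args += ["--cpu-shares", str(cpu_shares)]  # relative CPU weight
    if blkio_weight is not None:
        args += ["--blkio-weight", str(blkio_weight)]  # relative block IO weight
    args.append(image)
    return args


if __name__ == "__main__":
    # Placeholder values: a 4 GB limit pinned to cores 0 and 1.
    print(" ".join(docker_run_args("rs1a", "mongo:3.2", "4g", "0,1", cpu_shares=512)))
```

Returning a list rather than a string keeps the arguments safe to hand to `subprocess.run` without shell quoting.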
#MDBW16 MongoDB Memory usage in 3.2 with WiredTiger MongoDB Memory: mongod process: connections, aggregations, mapReduce, etc WiredTiger cache: (0.6 x total memory) – 1 GB Total = mem(mongod) + mem(WiredTiger cache) WiredTiger cache mongod mongod memory
#MDBW16 Process memory with containers and cgroups [Diagram: the cgroup memory_limit encloses the mongod memory and the WiredTiger cache, inside the total memory seen from the mongod process] Inside the container • Can see total memory and not memory limit WiredTiger cache: • memory_limit * 0.6
#MDBW16 cgroups: Memory and CPU limits MEM = (TOTAL_MEM - OS_MEM) / NUM_MONGOD WiredTiger cache = (MEM * 0.6) CPUSET = 0, 1, …, MAX_CPU_CORES [Diagram: several mongod containers sharing one host]
#MDBW16 MongoDB on Docker + cgroups: Memory WiredTiger cache: 60% of the container memory limit (for each mongod) Compose: mem_limit Docker Engine: --memory [Diagram: host memory split into OS memory plus four container regions, each holding mongod memory and a WiredTiger cache]
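The sizing formulas from the last two slides, MEM = (TOTAL_MEM - OS_MEM) / NUM_MONGOD and WiredTiger cache = MEM * 0.6, can be worked through as follows. This is a sketch under stated assumptions: sizes are in megabytes, the cache is rounded down to whole gigabytes (a conservative choice made here, in case the mongod version in use only accepts whole numbers for `--wiredTigerCacheSizeGB`), and the host figures are invented.

```python
def memory_plan(total_mem_mb: int, os_mem_mb: int, num_mongod: int) -> dict:
    """Split host memory evenly between mongod containers (illustrative)."""
    if num_mongod < 1:
        raise ValueError("need at least one mongod container")
    available_mb = total_mem_mb - os_mem_mb
    if available_mb <= 0:
        raise ValueError("no memory left after reserving the OS share")
    mem_limit_mb = available_mb // num_mongod  # MEM on the slide
    cache_mb = mem_limit_mb * 60 // 100  # 60% of the container limit
    cache_gb = cache_mb // 1024  # rounded down to whole GB
    if cache_gb < 1:
        raise ValueError("container limit too small for a 1 GB cache")
    return {
        "mem_limit": f"{mem_limit_mb}m",  # value for Compose mem_limit or --memory
        "mongod_args": ["mongod", "--wiredTigerCacheSizeGB", str(cache_gb)],
    }


if __name__ == "__main__":
    # Invented host: 64 GB of RAM, 4 GB kept for the OS, four mongod containers.
    print(memory_plan(total_mem_mb=65536, os_mem_mb=4096, num_mongod=4))
```

For this invented host each container gets a 15360 MB limit, and 60% of that (9216 MB) rounds to a 9 GB cache setting.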
#MDBW16 MongoDB on Docker + cgroups: CPU mongod (and mongos) mapped to 1+ core Compose: cpuset Docker Engine: --cpuset-cpus --cpuset-mems (NUMA) [Diagram: cores core0 to core7 assigned to mongod rs1a, mongod rs2b, mongod cfgsrv -c, mongos and the OS]
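One way to arrive at the core mapping on this slide is to hand out cores in order, keeping a few back for the OS. The sketch below does that and produces strings in the comma-separated form accepted by `--cpuset-cpus` and the Compose `cpuset` key. The allocation policy (lowest-numbered cores reserved for the OS, equal cores per container) is an assumption for illustration, not the layout used in the talk.

```python
from typing import Dict, List


def assign_cpusets(
    containers: List[str],
    total_cores: int,
    cores_each: int = 1,
    reserved_for_os: int = 0,
) -> Dict[str, str]:
    """Give each container its own set of cores (illustrative policy)."""
    needed = reserved_for_os + cores_each * len(containers)
    if needed > total_cores:
        raise ValueError(f"need {needed} cores but the host has {total_cores}")
    plan = {}
    next_core = reserved_for_os  # cores below this index stay with the OS
    for name in containers:
        cores = range(next_core, next_core + cores_each)
        plan[name] = ",".join(str(core) for core in cores)
        next_core += cores_each
    return plan


if __name__ == "__main__":
    # Invented 8-core host: 2 cores for the OS, then 2 cores per mongod.
    print(assign_cpusets(["rs1a", "rs2b"], total_cores=8, cores_each=2, reserved_for_os=2))
```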
#MDBW16 Resource usage with Docker Understanding resource usage: • docker top rs1a • docker stats rs1a Container stats available via Docker remote API: GET /containers/(id)/stats Also available from docker-py: http://docker-py.readthedocs.org/en/latest/api/#stats
#MDBW16 Resource usage with Docker Multiple statistics for each container: Memory limit and usage, CPU (per core level), Network, Disk Useful to combine with MongoDB metrics (like db.serverStatus())
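As a rough sketch of what can be done with one sample from the stats endpoint, the functions below derive CPU and memory percentages from a stats document. The field names follow the stats JSON as commonly returned by the Docker remote API at the time, and the calculation mirrors the usual approach of dividing the container's CPU delta by the system delta; treat both as assumptions to verify against your Docker version. The sample document is invented.

```python
def cpu_percent(stats: dict) -> float:
    """CPU usage of the container since the previous sample, in percent."""
    cpu, pre = stats["cpu_stats"], stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    # Scale by the number of cores reported for the container.
    num_cores = len(cpu["cpu_usage"].get("percpu_usage") or []) or 1
    return cpu_delta / system_delta * num_cores * 100.0


def memory_percent(stats: dict) -> float:
    """Memory usage relative to the container's memory limit, in percent."""
    mem = stats["memory_stats"]
    return mem["usage"] / mem["limit"] * 100.0


# Invented sample with the assumed field layout.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 400, "percpu_usage": [250, 150]},
        "system_cpu_usage": 2000,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 200, "percpu_usage": [120, 80]},
        "system_cpu_usage": 1000,
    },
    "memory_stats": {"usage": 512, "limit": 2048},
}

if __name__ == "__main__":
    print(f"cpu {cpu_percent(sample):.1f}%  memory {memory_percent(sample):.1f}%")
```

Figures like these can then be lined up against MongoDB's own metrics, for example the output of db.serverStatus().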
#MDBW16 Creating a Swarm cluster on AWS to deploy MongoDB
#MDBW16 Creating a Swarm cluster on AWS to deploy MongoDB DEMO! Configure docker-machine with ec2 driver (AWS) Deploy discovery service for Swarm Master Deploy AWS instances for: •  Swarm master •  Swarm worker nodes Connect to the Swarm master Define compose file for deployment Define Swarm filters and constraints and cgroup limits Deploy the environment with a single command using the compose file Configure our MongoDB sharded cluster using Cloud Manager API
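The instance-creation steps above can be sketched as a set of `docker-machine create` command lines. This assumes the `amazonec2` driver and the standalone Swarm options of docker-machine (`--swarm`, `--swarm-master`, `--swarm-discovery`); the discovery address is a placeholder, AWS credentials are assumed to come from the environment, and the machine names are illustrative. The commands are only printed, not executed.

```python
from typing import List

# Placeholder address of the discovery service (Consul in the talk's demo).
DISCOVERY = "consul://CONSUL_HOST:8500"


def machine_create_args(name: str, master: bool = False, discovery: str = DISCOVERY) -> List[str]:
    """Arguments to create one Swarm member on AWS with docker-machine (illustrative)."""
    args = ["docker-machine", "create", "--driver", "amazonec2", "--swarm"]
    if master:
        args.append("--swarm-master")  # this machine runs the Swarm manager
    args += ["--swarm-discovery", discovery, name]
    return args


def cluster_commands(workers: int = 3) -> List[List[str]]:
    """One command for the master, then one per worker node."""
    commands = [machine_create_args("swarm-master", master=True)]
    for index in range(1, workers + 1):
        commands.append(machine_create_args(f"swarm-node-{index}"))
    return commands


if __name__ == "__main__":
    for command in cluster_commands():
        print(" ".join(command))
```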
#MDBW16 What our dockerized sharded cluster looks like… [Diagram: hosts dockerC1, dockerC2 and dockerC3 running replica set members rs1a to rs1c, rs2a to rs2c and rs3a to rs3c, config servers cfgsvr -a, cfgsvr -b and cfgsvr -c, and mongos; legend: PRIMARY, SECONDARY, MONGOS]
Summary
#MDBW16 Advantages of using MongoDB with Docker Speed: testing and deploying cluster patterns easily Build once, deploy everywhere Control: Resource control and utilization Key to success with containers Agility: Microservices architectures Making change less expensive Flexibility: Multi vendor cloud opportunities AWS, Azure, Google, IBM, CloudFoundry
#MDBW16 How successful customers use MongoDB with Docker • Case Studies @ https://www.mongodb.com/blog • Whitepaper: “Enabling Microservices – Containers & Orchestration Explained” https://www.mongodb.com/collateral/microservices-containers-and-orchestration-explained
#MDBW16 Now it’s YOUR turn Share with us your use case of MongoDB & Docker: http://bit.do/DockerMongoDB @marcobonezzi You can actually try this at home: https://github.com/sisteming/mongo-swarm
Thank You! Marco Bonezzi marco@mongodb.com @marcobonezzi

MongoDB World 2016: Scaling MongoDB with Docker and cGroups


Editor's Notes

  • #2 Thank you everybody for being here today for this talk. It is really exciting to be speaking today about these two technologies, Docker and MongoDB. We get many questions on how to use both of these technologies successfully, and in this talk I will show you how running our MongoDB clusters on Docker containers can be really useful in some situations
  • #3 Let me introduce myself: my name is Marco Bonezzi, I’m a TSE (or Technical Services Engineer) at MongoDB. What TSE really means is that we help customers to be successful with MongoDB by assisting them with a variety of issues. I am based in Dublin, in sunny Ireland. My main experience is in databases, distributed systems and high availability solutions with different database technologies
  • #4 Before we start, I’d like to ask you if you (or maybe not you but some friend or someone else at a different company you may know about) have ever been in one of the situations we are about to see. So let’s start with some questions: For the first one, you are all here at MongoDB World so I’m pretty sure that you all have manually deployed a MongoDB replica set or sharded cluster. Who has heard of issues with resource allocation when running multiple mongod processes on the same server? In terms of your Docker experience, how many of you have been using Docker in production for the last year or two? Ok, that’s great. So this one is for you: How many of you are running MongoDB on Docker? Well, this is interesting. Some of you might be familiar with the following situations, and hopefully you will all learn some tricks to improve your MongoDB deployment running on Docker
  • #5 If you have seen these issues closer than you would have liked, it’s fine. Some of our customers have had these issues. We know that sometimes our issues come from having different architectures in development and production, or also just from adding an extra MongoDB process on the same server, which starts consuming resources dedicated to our main process. And generally when we work with our customers, we check that they apply our production notes. In many cases, we hear that they will deploy their cluster on Docker containers, if it is not already deployed that way. This is really great, as this will give us the opportunity to help them understand the key concepts to successfully implement these technologies. KISS
  • #6 3 – THIS IS OUR STORY Based on our interaction with different customers, we identified several pain points, and today I will show you how Docker can be really useful for our MongoDB deployments. Deployment: Deploying the architectures for our clusters is time consuming. Sometimes this also affects how we deploy our clusters, for example in terms of high availability. Resource Contention: It’s great to add more and more processes to our servers, but what about resource contention with multiple instances? Docker: How do we set up MongoDB on Docker? How do we set up a replica set between containers? Where is my data? Can I access the mongod log file as I currently do?
  • #7 3 – THIS IS OUR STORY So how can we solve these issues? We will see how we can deploy faster our replica set or sharded clusters, making sure that we use recommended highly available patterns by using orchestration Using resource management is the key for successful and reliable architectures, so that we will avoid obscure issues and we can have almost predictable resource utilization Also by automating our MongoDB deployments with Docker we can make sure that we have a scalable configuration and that we deploy patterns and not just independent processes AN IMPORTANT POINT TO BEAR IN MIND IS ALSO THE TIME SAVED WITH THIS APPROACH COMPARED TO MORE TRADITIONAL TOOLS OR DEVELOPMENT TECHNIQUES
  • #8 SHOULD BE HERE AT 5 MINUTES In this talk today, I will show you briefly how to define highly available patterns for our MongoDB deployments Once we define these patterns, we’ll see how we can orchestrate our MongoDB containers to build these patterns by using Docker Swarm and compose on AWS Lastly, we will see how to use the cgroup implementation in Docker and the key points to define limits for memory or cpu for each MongoDB container of our cluster These three ingredients will help you not only to implement your MongoDB deployment with docker containers, but to deploy successful patterns that can be configured to your needs also in terms of resource utilisation
  • #9 When speaking about replica sets, and this is a general MongoDB best practice, we generally suggest a primary / secondary / secondary configuration as the recommended pattern. With two secondaries and the primary replicating all operations to them, we have two additional copies of our data, so in case of losing one member we still have a replicated copy. We always recommend having an odd number of members, and this is strictly related to fault tolerance: with more members we get a higher majority and higher resiliency of our environment. It is also essential to understand why we want to have each replica set member on a different server or instance: losing one of the servers won’t mean that our replica set will be unavailable.
  • #10 5 – THIS IS WHAT WE REALIZED So probably most of you already know and use Docker, but in case there is someone that does not, I’ll quickly introduce it. If you ask Google – what is Docker – Google will tell you that a docker is a person employed in a port to load and unload ships. While this might seem unrelated, it is quite related, as our ships will be our servers, and Docker will allow us to load and unload containers on them. So these containers are basically isolated processes in userspace, they have a shared kernel, and in these isolated processes we can deploy applications and their dependencies. The idea is that we can containerize a single application, including its dependencies, and this will allow us to run it everywhere. We just need to have a Docker daemon running, whether on our laptop, in a cloud infrastructure like AWS or Azure, or even on our Raspberry Pi.
  • #12 4 – UNTIL WHEN SOME OF OUR SUCCESSFUL CUSTOMERS DISCOVER DOCKER The good news is that Docker can help us with this! So for example we can easily coordinate our containers to deploy a recommended highly available pattern, and we can then size each container and therefore its instance and each cluster to avoid any resource allocation issue
  • #13 So now we know what a highly available MongoDB pattern looks like, and we will dive into how we can use Docker to implement these patterns. As we will see shortly, in our recipe today we will run MongoDB on Docker containers, and we will use Docker Compose, docker-machine, Swarm and the Cloud Manager API to orchestrate and automate the deployment of a sharded cluster
  • #14 Generally, when we say Docker we are referring to the Docker daemon, but Docker has an ecosystem of tools to help us manage containers. So for example with docker-machine we can provision and manage our Docker hosts under different providers, like AWS, VirtualBox or others. We then have Docker Swarm, which provides us with a clustering solution to have multiple nodes running the Docker daemon, and then we can have filters and rules to orchestrate the deployment of the containers onto each node of the cluster. Docker Compose instead can be used to define our patterns and multi-container deployments with just a YAML description. This way we can define services for replica sets or sharded clusters and easily deploy this type of deployment, i.e. deploy a replica set with a single command. All these tools have in common the use of the Docker API, so any tool that works with Docker can use Swarm to scale to multiple hosts
  • #16 One of the interesting things about Docker Swarm is the possibility of having multi-host networking. As we have containers in each of the swarm nodes, they need to be able to connect with each other. For this reason, Swarm automatically creates an overlay container-to-container network. The underlying technology is based on the Docker Swarm master and its service discovery (in this example we are using Consul as a discovery service). With this, by referring to the hostname associated with each container, we can reach the containers located on a different swarm node. This is a key concept for deploying our MongoDB containers in a swarm cluster, but once understood it is really easy to use and to connect all the containers with each other.
  • #17 As previously mentioned, Swarm also offers interesting features to orchestrate and schedule the deployment of containers to the swarm cluster nodes. There are different types of filters that can be used; we can for instance establish some constraints or affinity rules. In this slide we can see how we will be marking each mongod container with a label, so that it will say that this is a mongod process that is part of replica set rs1. Once each container is marked, we will see how to use this label.
  • #18 Based on the label previously defined, we will define affinity filters to ensure that we will only have at most one container from replica set rs1 in the same swarm node. So the effect of this rule will be that with 3 containers for a replica set, each will deploy to a different node
  • #19 Now we know what a recommended pattern looks like and how we can make sure that we follow these patterns and rules when deploying Docker containers to a swarm cluster. So of course deploying containers to the right node is very important, so that we ensure their availability, but in addition to this, for a successful deployment the next step is to use Resource Management on each swarm node by using the Docker implementation of cgroups.
  • #20 Resource control with Docker and cgroups is quite easy: it can be done just from the docker run command or also by defining the limits in the compose file. The idea for using cgroups is that, thanks to the cgroups feature included in the Linux kernel, we can define limits for the key resources of a process, like memory, CPU, network, or disk.
  • #21 As you probably know, MongoDB 3.2 with the WiredTiger storage engine uses a memory region called the WiredTiger cache. Other than this region, we also have memory outside of the WiredTiger cache, which includes the kernel filesystem cache and the memory used for other elements required by the mongod process, like connections, aggregations or mapReduce operations. Based on this, the total memory for the mongod process running WiredTiger is the sum of the WiredTiger cache, whether defined by default or manually, plus the memory required for the mongod process. While this is a general overview of the memory usage with WiredTiger, there are some additional caveats to consider if running MongoDB on Docker containers.
  • #22 As mentioned in the previous slide, with WiredTiger we have the WiredTiger cache, which is used for compressed indexes and uncompressed data. By default, the cache is set to 60% of the total memory on the server minus 1 GB. However, when running in a container, the mongod process (like any process) does not have an easy way to check a memory limit defined for a container. As the process only reads the total memory on the server, it is really important to manually set the WiredTiger cache for each container according to the memory limit defined for that container, so that it will always be at most 60% of the container limit.
  • #23 In terms of memory limit, with the memory option (or memory limit in compose) we can define a memory limit for each container. This is really important mainly for two reasons. First, Docker is often used on larger servers with large volumes of RAM, so we want to correctly size the environment to avoid any resource contention issue. Second, if running MongoDB with WiredTiger in a Docker container, we need to set the WiredTiger cache to 60% of the memory limit, as previously mentioned
  • #24 To see this in a more visual way, here we have a server or our swarm node, and we will have: a memory region left available for the OS memory. The rest of the available memory would be split into smaller memory sub-regions for each container, as the idea is that we define the memory limit for each container. Then based on this, we will set the WiredTiger cache to 60% of the size of the memory limit previously defined for the container. As you can see, this is a sizing exercise that may require some planning, but in the long run this will ensure the stability and correct resource utilisation for our MongoDB cluster.
  • #25 This is also a visual representation of how we can pin different containers to specific cores, so we could have each mongod associated with one core, and the same for mongos or config servers.
  • #26 Lastly, as we speak of resource management and key metrics like CPU or memory, it is also important to understand how to monitor these resources when running on Docker. Docker has some commands like docker top or stats which will show us the command line or other stats like the CPU % or the memory usage and memory limit. Additionally, we can also use the Docker remote API to get these metrics or we can also use other clients or libraries as for instance docker-py, which allow us to retrieve these stats into python and work with them from there. DEMO GRAPHS
  • #28 To be able to easily deploy our MongoDB patterns with Docker, we will use a Swarm cluster. In this slide we can see the architecture of the example we are using today in this talk. This is a simplified version where we have: A swarm manager node, which manages the swarm cluster, is responsible for scheduling where each container will be deployed, and will check any rules or filters defined. We will then have a number of swarm nodes, which will already have the Docker daemon installed so that we can start deploying our MongoDB containers straight away. These four elements are all deployed using docker-machine, which allows us to create these instances on AWS just by running the docker-machine command from our terminal. Docker-machine will take care of setting up ssh keys and deploying the instance type we define, and the instance will have Ubuntu (or another distribution) with the Docker daemon already installed. Once we have this architecture deployed, we will define the pattern and configuration of our MongoDB deployment using compose yml files, and through the docker command we will run them against the swarm manager. This will then schedule each container according to any filter defined, and in the end each swarm node will be running a number of MongoDB containers.
  • #29 SWAAARM SHOULD BE AT 20 min here In summary, creating a swarm cluster on AWS to deploy our MongoDB cluster is not that complicated and can be done in a small series of steps: Use docker-machine with the ec2 driver. Deploy service discovery (Consul) and the Docker Swarm master instance. Deploy multiple swarm worker nodes. Connect to the swarm master with Docker tools. Use compose and swarm to define affinity filters and constraints. Deploy our containers. Configure our cluster with the Cloud Manager API. I will now show this in more detail with a small demo. The files I am using during this demo are publicly available and you can actually try this at home. Firstly we use docker-machine with the ec2 driver to enable docker-machine to create AWS instances. With this, we will create a discovery service container that will then be used by the Swarm master. Once we deploy an instance for the swarm master, we will also create 3 or more swarm worker nodes. Once we have these nodes, we can then connect to the swarm master, which is the main point to interact with our Docker cluster. We can define filters and constraints in the Docker Compose description and once we have this, we can deploy all the containers across the swarm cluster. At this point we only need to enable the sharding or replica set configuration, and this can be done by using manual scripts to automate this task or, as in this case, we can use the Cloud Manager API to enable our MongoDB cluster running on Docker in Cloud Manager with all the benefits of monitoring or automation. What I will show you now is an example of a swarm cluster up and running. We have our worker nodes and we will deploy MongoDB containers to them, to build a sharded cluster. We will see how the compose files are defined, how we can define labels and use them to build our cluster. Then we will see how running a
  • #32 In summary, what are the advantages of using MongoDB on Docker for you? As we have seen today, one is having predefined patterns of deployment, so that we can deploy reliable architectures and clusters easily, just as a single element. Docker also allows us to deploy faster, fail faster, and test and deploy complex architectures, for example to reproduce an issue that occurred in production. For instance, some capabilities of Docker cgroups could be quite interesting if trying to simulate a specific configuration where the disk of one of the containers is much slower. Additionally, resource management with Docker and cgroups is easier to implement than just a plain cgroup configuration, and this will also encourage more people
  • #33 6 – THESE ARE THE ADVANTAGES FOR YOU In summary, these are some recommendations from our side on why Docker can be really useful for some cases. Through our partnership with different customers enabling these technologies, you can now also learn how successful customers are using Docker for MongoDB in their environments.
  • #34 6 – THESE ARE THE ADVANTAGES FOR YOU