An Introduction to Using PostgreSQL with Docker & Kubernetes
Jonathan S. Katz
July 19, 2018
Los Angeles PostgreSQL User Group

About Crunchy Data
• Leading provider of trusted open source PostgreSQL and PostgreSQL-related technologies, support, and training to enterprises
• We're hiring!
• crunchydata.com
• @crunchydb

About Me
• Director of Communications, Crunchy Data
• Previously: Engineering leadership in startups
• Longtime PostgreSQL community contributor
• Advocacy & various committees for PGDG
• @postgresql + .org content
• Director, PgUS
• Co-Organizer, NYCPUG
• Conference organization + speaking
• @jkatz05

Outline
• Containers: A Brief History
• Containers + PostgreSQL
• Setting up PostgreSQL with Containers
• Deploying! - Container Orchestration
• Look Ahead: Trends in the Container World

What Are Containers?
• Containers are processes that encapsulate all the requirements to execute an application
• Sandbox for applications, similar to a virtual machine but with increased density on a single host
Source: Docker

Container Glossary
• Container Image - The file that describes how to build a container
• Container Engine - Prepares a container to be executed by the container runtime by collecting container images, accepting user input, preparing mount points, etc. Examples: docker, CRI-O, RKT, LXD
• Container Runtime - Takes information passed from the container engine and sets up the containerized process. The Open Container Initiative (OCI) is helping to standardize on runc
• Container - The runtime instantiation of a Container Image, i.e. a process!
Source: https://developers.redhat.com/blog/2018/02/22/container-terminology-practical-introduction/

Why Containers?
• Lightweight
  • Compared to virtual machines, use less disk, RAM, CPU
• Sandboxed
  • Container runtime is isolated from other processes
• Portability
  • Containers can be run on different platforms as long as a container engine is available
• Convenience
  • Requirements for running applications bundled together
  • Prevents messy dependency overlaps

Example: Basic Web Application

Example: Production Setup for Web Application

Example: Upgrading a Web Application

Containers & PostgreSQL
• Containers provide several advantages to running PostgreSQL:
  • Setup & distribution for developer environments
  • Ease of packaging extensions & minor upgrades
  • Separate out secondary applications (monitoring, administration)
  • Automation and scale for provisioning and creating replicas, backups

Containers & PostgreSQL
• Containers also introduce several challenges:
  • Administrator needs to understand and select appropriate storage options
  • Configuration for individual database specifications and user access
  • Managing 100s - 1000s of containers requires appropriate orchestration (more on that later)
  • Still a database within the container; standard DBA tuning applies
• However, these are challenges you will find in most database environments

Getting Started With Containers & PostgreSQL
• We will use the Crunchy Container Suite
  • PostgreSQL (+ PostGIS): our favorite database; option to add our favorite geospatial extension
  • pgpool + pgbouncer: connection pooling, load balancing
  • pgbackrest: terabyte-scale backup management
  • Monitoring: Prometheus + export
  • Scheduling: "crunchy-dba"
  • pgadmin4: UX-driven management
• Open source!
  • Apache 2.0 license
  • Support for Docker 1.12+, Kubernetes 1.5+
  • Actively maintained and updated
https://github.com/CrunchyData/crunchy-containers

Getting Started With Containers & PostgreSQL

Demo: Creating & Working With Containerized PostgreSQL

mkdir postgres
cd postgres

# Named volume for the data directory and a bridge network shared by all demo containers
docker volume create --driver local --name=pgvolume
docker network create --driver bridge pgnetwork

# Environment variables passed to the PostgreSQL container
cat << EOF > pg-env.list
PG_MODE=primary
PG_PRIMARY_USER=postgres
PG_PRIMARY_PASSWORD=password
PG_DATABASE=whales
PG_USER=jkatz
PG_PASSWORD=password
PG_ROOT_PASSWORD=password
PG_PRIMARY_PORT=5432
EOF

# Start PostgreSQL in the background, published on host port 5432
docker run --publish 5432:5432 \
  --volume=pgvolume:/pgdata \
  --env-file=pg-env.list \
  --name="postgres" \
  --hostname="postgres" \
  --network="pgnetwork" \
  --detach \
  crunchydata/crunchy-postgres:centos7-10.4-2.0.0

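A quick way to confirm the container is usable is to connect from the host with psql. The following is a minimal sketch under stated assumptions, not part of the original demo: `env_lookup` and `pg_uri_from_env` are hypothetical helper names, psql is assumed to be installed on the host, and port 5432 is assumed to be published as in the `docker run` command above. The helper only reads `PG_USER`, `PG_PASSWORD`, `PG_DATABASE`, and `PG_PRIMARY_PORT` from the env file.

```shell
# Hypothetical helpers (not part of the Crunchy Container Suite).

# Print the value of KEY from a KEY=value env file (last match wins).
env_lookup() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Build a libpq connection URI from the values in an env file.
# Arguments: env file (default pg-env.list), host (default localhost).
pg_uri_from_env() {
  env_file=${1:-pg-env.list}
  host=${2:-localhost}
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$(env_lookup "$env_file" PG_USER)" \
    "$(env_lookup "$env_file" PG_PASSWORD)" \
    "$host" \
    "$(env_lookup "$env_file" PG_PRIMARY_PORT)" \
    "$(env_lookup "$env_file" PG_DATABASE)"
}

# Example (needs the running container from the demo and psql on the host):
#   psql "$(pg_uri_from_env pg-env.list)" -c 'SELECT version();'
```

The helper does not URL-encode its inputs, so a password containing characters such as `@` or `/` would need encoding before it is placed in the URI.
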
Demo: Adding in pgadmin4

docker volume create --driver local --name=pga4volume

cat << EOF > pgadmin4-env.list
PGADMIN_SETUP_EMAIL=jonathan.katz@crunchydata.com
PGADMIN_SETUP_PASSWORD=securepassword
SERVER_PORT=5050
EOF

docker run --publish 5050:5050 \
  --volume=pga4volume:/var/lib/pgadmin \
  --env-file=pgadmin4-env.list \
  --name="pgadmin4" \
  --hostname="pgadmin4" \
  --network="pgnetwork" \
  --detach \
  crunchydata/crunchy-pgadmin4:centos7-10.4-2.0.0

Demo: Adding Monitoring

# 1. Set up the metric collector
cat << EOF > collect-env.list
DATA_SOURCE_NAME=postgresql://postgres:password@postgres:5432/postgres?sslmode=disable
EOF

docker run \
  --env-file=collect-env.list \
  --network=pgnetwork \
  --name=collect \
  --hostname=collect \
  --detach \
  crunchydata/crunchy-collect:centos7-10.4-2.0.0

# 2. Set up prometheus to store metrics
docker volume create --driver local --name=prometheus

cat << EOF > prometheus-env.list
COLLECT_HOST=collect
SCRAPE_INTERVAL=5s
SCRAPE_TIMEOUT=5s
EOF

docker run --publish 9090:9090 \
  --env-file=prometheus-env.list \
  --volume prometheus:/data \
  --network=pgnetwork \
  --name=prometheus \
  --hostname=prometheus \
  --detach \
  crunchydata/crunchy-prometheus:centos7-10.4-2.0.0

# 3. Set up grafana to visualize
docker volume create --driver local --name=grafana

cat << EOF > grafana-env.list
ADMIN_USER=jkatz
ADMIN_PASS=password
PROM_HOST=prometheus
PROM_PORT=9090
EOF

docker run --publish 3000:3000 \
  --env-file=grafana-env.list \
  --volume grafana:/data \
  --network=pgnetwork \
  --name=grafana \
  --hostname=grafana \
  --detach \
  crunchydata/crunchy-grafana:centos7-10.4-2.0.0

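After the monitoring containers start, it can take a few seconds before their web endpoints answer. A minimal sketch of a readiness check follows. `wait_for_http` is a hypothetical helper, it assumes curl is installed on the host, and the health paths shown in the comments (`/-/healthy` for Prometheus, `/api/health` for Grafana) are assumptions about the versions bundled in these images rather than something the demo confirms.

```shell
# Hypothetical helper: poll a URL until it returns a successful HTTP
# status, or give up after a number of attempts.
# Arguments: url, retries (default 30), delay in seconds (default 2).
wait_for_http() {
  url=$1
  retries=${2:-30}
  delay=${3:-2}
  attempt=0
  while [ "$attempt" -lt "$retries" ]; do
    # --fail makes curl exit non-zero on HTTP errors (4xx / 5xx).
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Example (ports as published in the demo; paths are assumptions):
#   wait_for_http http://localhost:9090/-/healthy && echo "prometheus is up"
#   wait_for_http http://localhost:3000/api/health && echo "grafana is up"
```

The function returns a non-zero status when the endpoint never answers, so it can gate later steps in a setup script.
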
Recap
• Explored what / why / how of containers
• Set up a PostgreSQL 10 instance
• Set up pgadmin4 to manage our PostgreSQL instance
• Set up monitoring to analyze performance of our system
• Of course, the next question naturally is:

How do I manage these things at scale?

Kubernetes: Container Orchestration
• "Open-source system for automating deployment, scaling, and management of containerized applications."
• Manage the full lifecycle of a container
• Assists with scheduling, scaling, failover, high-availability, and more
Source: https://kubernetes.io

When to Use Kubernetes
• Value of Kubernetes increases exponentially as number of containers increases
• Due to statefulness of databases, Kubernetes requires more knowledge to successfully operate a standard database workload:
  • Avoid scheduling and availability issues for longer-running database containers
  • Data continues to exist even if container does not

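The last point, that data has to outlive the container, can be tried on a single Docker host before involving Kubernetes at all. The sketch below reuses the names from the earlier demo (`postgres`, `pgvolume`, `pgnetwork`, `pg-env.list`). `run` and `recreate_postgres` are hypothetical helpers, and the sketch assumes the image reuses an existing data directory on the volume when it starts, which the demo does not confirm.

```shell
# Print commands instead of running them when DRY_RUN is set.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: remove the container, show that the named volume
# still exists, then start a new container on the same volume.
recreate_postgres() {
  run docker rm --force postgres
  run docker volume inspect pgvolume
  run docker run --publish 5432:5432 \
    --volume=pgvolume:/pgdata \
    --env-file=pg-env.list \
    --name=postgres --hostname=postgres \
    --network=pgnetwork --detach \
    crunchydata/crunchy-postgres:centos7-10.4-2.0.0
}

# Preview the commands without touching Docker:
#   DRY_RUN=1 recreate_postgres
```

Calling `recreate_postgres` without `DRY_RUN` executes the commands. The container is disposable; the volume is what carries the database from one container to the next.
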
Kubernetes Glossary Important for PostgreSQL
• Node: A Kubernetes "worker" machine that is able to run pods
• Pod: One or more running containers; the "atomic" unit of Kubernetes
• Service: The access point to a set of Pods
• ReplicaSet: Ensures that a specified number of replica Pods are running at a given time
• Deployment: A controller that ensures all running Pods / ReplicaSets match the desired state of the execution environment (total number of pods, resources, etc.)
• Persistent Volume (PV): A storage API that enables information to persist after a Pod has terminated
• Persistent Volume Claim (PVC): Enables a PV to be mounted to a container, includes information such as amount of storage. Used for dynamic provisioning
Source: https://kubernetes.io/docs/reference/glossary/?fundamental=true&storage=true

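To see how these objects relate, here is a minimal sketch of a manifest that ties a PVC, a Deployment, and a Service together around the PostgreSQL image from the Docker demo. Every object name (`postgres-demo`, `postgres-demo-pvc`, `pg-env`) and the 1Gi storage size are made up for illustration. The manifest has not been validated against this image, which may need further settings (for example a security context) to run on a real cluster, so treat it as a reading aid rather than a deployment recipe.

```shell
# Write an illustrative manifest to the current directory.
cat << 'EOF' > postgres-demo.yaml
# PVC: requests storage that outlives any single Pod.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-demo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Deployment: keeps one PostgreSQL Pod running and mounts the claim.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: postgres-demo
  template:
    metadata:
      labels:
        app: postgres-demo
    spec:
      containers:
        - name: postgres
          image: crunchydata/crunchy-postgres:centos7-10.4-2.0.0
          ports:
            - containerPort: 5432
          envFrom:
            - secretRef:
                name: pg-env   # assumed Secret holding the PG_* variables
          volumeMounts:
            - name: pgdata
              mountPath: /pgdata
      volumes:
        - name: pgdata
          persistentVolumeClaim:
            claimName: postgres-demo-pvc
---
# Service: stable access point for the Pod(s) selected by label.
apiVersion: v1
kind: Service
metadata:
  name: postgres-demo
spec:
  selector:
    app: postgres-demo
  ports:
    - port: 5432
      targetPort: 5432
EOF
```

Against a cluster you control, the Secret could be created from the demo's env file with `kubectl create secret generic pg-env --from-env-file=pg-env.list`, and the manifest applied with `kubectl apply -f postgres-demo.yaml`.
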
PostgreSQL in a Kubernetes World
• Kubernetes provides the gateway to run your own "database-as-a-service":
  • Mass apply database commands:
    • Updates
    • Backups / Restores
    • ACL rule changes
  • Scale up / down replicas
  • Failover

PostgreSQL in a Kubernetes World
• Kubernetes is "turnkey" for stateless applications
  • e.g. web servers
• Databases do maintain state: permanent storage
  • Persistent Volumes (PV)
  • Persistent Volume Claims (PVC)

Crunchy PostgreSQL Operator
• Utilizes Operator framework initially launched by CoreOS to help capture nuances of managing complex applications that maintain state, e.g. databases
• Allows an administrator to run PostgreSQL-specific commands to manage database clusters, including:
  • Creating / Deleting a cluster (your own DBaaS)
  • Scaling up / down replicas
  • Failover
  • Apply user policies to PostgreSQL instances
  • Define what container resources to use (RAM, CPU, etc.)
  • Smart pod deployments to nodes
• REST API
https://github.com/CrunchyData/postgres-operator

Why Use An Operator With PostgreSQL?
• Automation: Complex, multi-step DBA tasks reduced to one-line commands
• Standardization: Many customizations, same workflow
• Ease-of-Use: Simple CLI; UI in beta
• Scale
  • Provision & manage clusters quickly amongst thousands of instances
  • Load balancing, disaster recovery, security policies, deployment specifications
• Security: Sandboxed environments, RBAC, mass grant/revoke policies

Demo: Perhaps Videos.

Demo: Exploring the Operator User Interface

Demo (Alternative): Exploring the Operator User Interface

Containerized PostgreSQL: Looking Ahead
• Containers are no longer "new" - orchestration technologies have matured
• Debate with containers + databases: storage & management
• No different than virtual machines + databases
• Databases are still databases: need expertise to manage
• StatefulSets vs. Deployments
• Database deployment automation flexibility
• Deploy your architecture to any number of clouds
• Monitoring: A new frontier

Conclusion
• Containers + PostgreSQL gives you:
  • Easy-to-setup development environments
  • Your own production database-as-a-service
  • Tools to automate management of thousands of instances in short order

Thank You!
Jonathan S. Katz
jonathan.katz@crunchydata.com
@jkatz05
