Deploying MariaDB databases with containers at Nokia
Deploying MariaDB solutions in containerized environments in Nokia Networks
Rick Lane
27-02-2019
© 2018 Nokia

CMDB - MariaDB
Common Software Foundation (CSF) Component MariaDB (CMDB)

Helm/Kubernetes/Container tradeoffs
Pros
• Fully separates services from kernel/other services
• Extremely lightweight and portable
• Containers are disposable
  o Kill and recreate pod as new
  o Readiness/liveness probes automate recovery
• Deploy an application with multiple services in one command/click (helm umbrella charts)
• Deployment time significantly faster than VM/ansible (4 minutes compared to 40 minutes)
Cons
• Containers are disposable
  o Recreated with new IP (looks like new server)
  o Failure root cause is difficult to find: logs disappear with the container (pod stdout or persistent storage)
• Umbrella charts introduce other difficult problems
  o Can deploy new service instance with helm upgrade of parent chart

Nokia Container Management Service
Helm/Kubernetes Deployment Model
[Diagram: controller nodes, worker nodes, and edge nodes. Labels: deploy chart, cmdb helm chart, deploy pods, External Connections]

Security / Affinity
Helm/Kubernetes Deployment Model
Security
• RBAC fully supported
  o All containers must run as non-root user
  o Kubernetes RBAC ServiceAccount and Role/RoleBindings limit container privileges
• Password security
  o All user-supplied passwords are loaded into a kubernetes secret during pre-install-job
  o Secret is used to propagate passwords to maxscale/mariadb pods
  o Password secret is deleted on post-terminate
  o User must provide a secret with old/new password to update passwords
Affinity
• podAntiAffinity
  o hard (default): all pods must be scheduled on separate nodes or deployment will fail
  o soft: try to schedule pods on separate nodes, but deploy anyway if that is not possible
• nodeAffinity
  o mariadb pods are forced to deploy on worker nodes
  o maxscale pods deploy on edge nodes by default (can be configured to deploy on worker nodes)

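The hard/soft distinction above maps naturally onto the two standard Kubernetes pod anti-affinity fields. The following is a minimal sketch, assuming the chart renders something along these lines; the function name, the `app` label, and the weight of 100 are illustrative assumptions and are not taken from the CMDB chart.

```python
def pod_anti_affinity(mode, app_label):
    """Build a podAntiAffinity stanza for the given mode ('hard' or 'soft').

    The 'app' label key and the weight are hypothetical choices for this sketch.
    """
    term = {
        "labelSelector": {"matchLabels": {"app": app_label}},
        # One pod per node: the topology domain is the node hostname.
        "topologyKey": "kubernetes.io/hostname",
    }
    if mode == "hard":
        # Scheduler refuses to place two matching pods on the same node.
        return {"podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]}}
    if mode == "soft":
        # Scheduler prefers separate nodes but will co-locate if it must.
        return {"podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": 100, "podAffinityTerm": term}]}}
    raise ValueError("mode must be 'hard' or 'soft', got %r" % mode)
```

With `hard`, a three-pod cluster on a two-node environment leaves one pod unschedulable, which matches the "deployment will fail" behaviour described on the slide.
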
Containers
CMDB - MariaDB
cmdb/mariadb (FROM centos-7.6 os base image)
MariaDB database container supporting deployment of all configurations (simplex, Galera, Master/Master, Master/Slave)
• MariaDB-10.3.11 (client, server, backup, etc.)
• Galera
• SDC/etcd client RPMs
• CSF CMDB deployment, configuration and management RPMs
cmdb/maxscale (FROM centos-7.6 os base image)
MaxScale proxy container supporting deployment of data center configuration
• Maxscale-2.2.19
• SDC/etcd client RPMs
• CSF CMDB deployment, configuration and management RPMs
cmdb/admin (FROM kubectl base image)
Kubernetes/Helm Job Administration container supporting all life cycle events (install, upgrade, delete, etc.)
• MariaDB-10.3.11-client
• SDC/etcd client RPMs
• Python job orchestrator and Python classes to implement configuration-specific job tasks

Helm chart (services and admin)
CMDB - MariaDB

    ## Image Registry
    global:
      registry: "csf-docker-delivered.repo.internal.nokia.com"
      registry1: "registry1-docker-io.repo.internal.nokia.com"

    rbac_enabled: true
    nodeAntiAffinity: hard

    cluster_name: "my-cluster"
    ## Topology master-slave, master-master, galera, simplex
    cluster_type: "master-slave"

    ## Values on how to expose services
    ## ClusterIP will expose only within cluster, NodePort to expose externally
    services:
      ## MySQL service exposes the mysql database service (mariadb or maxscale)
      mysql:
        type: ClusterIP
      ## MariaDB Master exposes the pod that is master
      mariadb_master:
        type: NodePort
      ## Maxscale exposes the administrative interface of Maxscale
      maxscale:
        type: NodePort
      ## Maxctrl (optional) exposes the maxctrl administrative interface of Maxscale
      maxctrl:
        enabled: false
        type: ClusterIP
        port: 8989

    admin:
      image:
        name: "cmdb/admin"
        tag: "4.5-1"
        pullPolicy: IfNotPresent
      ## A recovery flag. If changed, will trigger a heal of the database to occur
      #recovery: none
      quickInstall: ""
      ## If set, administrative jobs will be more verbose to stdout (kubectl logs)
      debug: false
      ## Exposes the hook-delete-policy. By default, this is set to delete the
      ## hooks only upon success. In helm v2.9+, this should be set to
      ## before-hook-creation. This can also be unset to avoid hook deletion
      ## for troubleshooting and debugging purposes
      hook_delete_policy: "hook-succeeded"

Helm chart (mariadb and maxscale)
CMDB - MariaDB

    mariadb:
      image:
        name: "cmdb/mariadb"
        tag: "4.5-1"
        pullPolicy: IfNotPresent
      ## The number of MariaDB pods to create
      count: 3
      heuristic_recover: rollback
      use_tls: true
      ## Enable persistence using Persistent Volume Claims
      persistence:
        enabled: true
        accessMode: ReadWriteOnce
        size: 20Gi
        storageClass: ""
        resourcePolicy: delete
        preserve_pvc: false
      ## MariaDB server customized configuration
      mysqld_site_conf: |-
        [mysqld]
        userstat = on
      ## metrics
      metrics:
        enabled: false
      ## Grafana dashboard
      dashboard:
        enabled: false

    maxscale:
      image:
        name: "cmdb/maxscale"
        tag: "4.5-1"
        pullPolicy: IfNotPresent
      ## The number of MaxScale pods
      count: 2
      ## MaxScale customized configuration
      maxscale_site_conf: |-
        [maxscale]
        threads = 2
        query_retries = 2
        query_retry_timeout = 10
        [MariaDB-Monitor]
        monitor_interval = 1000
        failcount = 4
      ## MaxScale promotion/demotion SQL
      sql:
        ## Mariadb Node promoted to master
        promotion: []
        ## Mariadb Node demoted to slave
        demotion: []
      ## leader-elector
      elector:
        image:
          name: "googlecontainer/leader-elector"
          tag: 0.5
          pullPolicy: IfNotPresent

Events
Life Cycle Management
• Kubernetes native events
  o install = deploy chart and create resources
  o delete = terminate chart and delete all resources created by install
  o upgrade = make any changes to mariadb/maxscale resources (configuration, etc.); special code handles heal and scale-in/out events
• Nokia plugin events
  o heal = also implemented with kubernetes upgrade of the admin.recovery value
  o scale-in/scale-out = also implemented with kubernetes upgrade of mariadb.count or maxscale.count
  o backup/restore = implemented with Backup/Restore policy

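Because heal and scale events are driven through a helm upgrade of specific values, the command lines can be sketched as below. This is only an illustration: the release name, chart reference, and recovery token are hypothetical, and the use of `--reuse-values` is an assumption about how an operator might keep the other values unchanged. Only the value names `admin.recovery`, `mariadb.count`, and `maxscale.count` come from the slides.

```python
def helm_upgrade_args(release, chart, overrides):
    """Build an argv list for 'helm upgrade' that changes only the given values."""
    args = ["helm", "upgrade", release, chart, "--reuse-values"]
    for key in sorted(overrides):
        args += ["--set", "%s=%s" % (key, overrides[key])]
    return args


def heal_args(release, chart, recovery_token):
    # Any change of admin.recovery triggers a heal; the token itself is arbitrary.
    return helm_upgrade_args(release, chart, {"admin.recovery": recovery_token})


def scale_args(release, chart, component, count):
    # component is 'mariadb' or 'maxscale', matching mariadb.count / maxscale.count.
    if component not in ("mariadb", "maxscale"):
        raise ValueError("unknown component %r" % component)
    return helm_upgrade_args(release, chart, {component + ".count": count})
```

For example, `scale_args("my-db", "cmdb", "mariadb", 5)` produces the arguments for a scale-out from the default three pods to five; the list can be passed to `subprocess.run` on a host where helm is installed.
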
Kubernetes Resources
Galera Cluster
• Deploy mariadb-statefulset with 3+ pods (odd number). Each MariaDB pod contains:
  o Mariadb container
    - Configures mariadb in Galera configuration automatically at deploy based on IP advertisements
    - If pod restarts, configured to always come back to join the existing cluster
    - Persistent Volume Claim mounted for database storage
  o Backup/Restore container for scheduling routine mariadb container backups
  o Optional mysqld_exporter container for metrics collection (if metrics enabled)
• Mysql Service created to provide access to all DB nodes (all DB nodes added to service as endpoints)
• Metrics Service created to provide access to DB nodes from Grafana dashboard (if metrics enabled)

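The preference for an odd pod count follows from majority quorum arithmetic. The sketch below assumes all Galera nodes carry equal weight, so that a partition stays primary only while it holds more than half of the nodes; the function names are illustrative.

```python
def quorum_size(nodes):
    """Smallest number of nodes that is strictly more than half of the cluster."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many nodes can be lost while a majority still remains."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "nodes: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
```

Three nodes and four nodes both tolerate a single failure, so the fourth pod adds cost without adding fault tolerance; five nodes tolerate two.
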
Galera Cluster
[Diagram: mysql service load-balances across pods mariadb-0, mariadb-1, and mariadb-2; each pod contains mariadb, metrics, and BR (Backup/Restore) containers and a volume]

Kubernetes Resources
Master/Slave with HA MaxScale
• Deploy maxscale-statefulset with 1 to 3 pods. Each MaxScale pod contains:
  o Maxscale container
    - Configures maxscale using helm values and mariadb container advertised IPs (via etcd)
    - Monitors http://localhost:4040 for leader-elector changes (setting maxscale passive mode)
  o Leader-elector container for managing HA
    - Configured to manage kubernetes endpoint with lease for election of leader in cluster
    - Starts small web server to publish elected leader to port 4040
• Deploy mariadb-statefulset with 2+ pods (3+, odd number preferred). Each MariaDB pod contains:
  o Mariadb container
    - Configures mariadb in Master/Slave/Slave configuration automatically at deploy based on IP advertisements
    - If pod restarts, configured to always come back as a Slave
    - Persistent Volume Claim mounted for database storage
  o Backup/Restore container for scheduling routine mariadb container backups
  o Optional mysqld_exporter container for metrics collection (if metrics enabled)
• Mysql Service created to provide access to all maxscale nodes (all maxscale nodes added to service as endpoints)
• Maxctrl Service created to provide REST API access to "active" maxscale node (labeled with 'maxscale-leader')

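The MaxScale container's reaction to the leader-elector can be sketched as a small decision function. This is an illustration under stated assumptions: it assumes the elector's web server on port 4040 returns a JSON body of the form `{"name": "<leader pod>"}`, and that passive mode is toggled with `maxctrl alter maxscale passive <true|false>`. Neither detail appears on the slide, and the function names are hypothetical.

```python
import json


def should_be_passive(elector_body, my_pod_name):
    """Return True when this pod is not the elected leader.

    elector_body is the raw text fetched from http://localhost:4040.
    A missing or unparsable leader name is treated as "not leader" to stay safe.
    """
    try:
        leader = json.loads(elector_body).get("name")
    except (ValueError, AttributeError):
        return True
    return leader != my_pod_name


def passive_command(passive):
    # Assumed maxctrl invocation; adjust to the MaxScale version in use.
    return ["maxctrl", "alter", "maxscale", "passive", "true" if passive else "false"]
```

A polling loop would fetch the body periodically and run the command only when the result of `should_be_passive` changes, so that exactly one MaxScale instance performs failover actions.
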
Master/Slave Cluster with HA MaxScale
[Diagram: pods maxscale-0 (passive) and maxscale-1 (active), each with maxscale and elector containers; electors watch and manage the endpoint; mysql service load-balances to the maxscale pods; maxctrl service points to maxscale-1; pods mariadb-0, mariadb-1, and mariadb-2 (Master, Slave, Slave), each with mariadb, metrics, and BR containers and a volume]

Pod IP Advertisements / Single Pod Failure
Pod failures result in the re-created pod being re-deployed with a new IP address (looks like a new cluster server).

etcd server:
    cmdb/my-cluster/services/attributes/mariadb-0 = {"role": "RM", "ip": "172.16.0.35"}
    cmdb/my-cluster/services/attributes/mariadb-1 = {"role": "RS", "ip": "172.16.0.104"}
    cmdb/my-cluster/services/attributes/mariadb-2 = {"role": "RS", "ip": "172.16.0.97"}
    cmdb/my-cluster/services/attributes/maxscale-0 = {"role": "MXS", "ip": "172.16.0.39"}
    cmdb/my-cluster/services/attributes/maxscale-1 = {"role": "MXS", "ip": "172.16.0.52"}

After mariadb-2 is re-created with a new IP:
    cmdb/my-cluster/services/attributes/mariadb-2 = {"role": "RS", "ip": "172.16.0.201"}

[Diagram: pods mariadb-0 (172.16.0.35), mariadb-1 (172.16.0.104), mariadb-2 (172.16.0.97, re-created as 172.16.0.201), maxscale-0 (172.16.0.39), and maxscale-1 (172.16.0.52). Labels: Advertise IP, Audit advertisements]

    maxadmin alter server mariadb-2 address=172.16.0.201

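The advertisement audit can be sketched as a comparison between the addresses currently configured and the attributes advertised in etcd. This is a minimal sketch: the function name and the idea of passing both sides in as plain dictionaries are assumptions, and reading from etcd and executing the command are left out. The key layout and the `maxadmin alter server` command are taken from the slide.

```python
import json


def audit_advertisements(configured_ips, etcd_entries):
    """Return maxadmin commands for mariadb servers whose advertised IP changed.

    configured_ips: server name -> IP currently configured in MaxScale.
    etcd_entries:   etcd key -> JSON string as advertised by each pod.
    """
    commands = []
    for key in sorted(etcd_entries):
        name = key.rsplit("/", 1)[-1]
        advertised_ip = json.loads(etcd_entries[key])["ip"]
        # Only already-known database servers are altered; new servers are
        # handled by the scale-out flow, and maxscale pods are not servers.
        if name in configured_ips and configured_ips[name] != advertised_ip:
            commands.append(
                "maxadmin alter server %s address=%s" % (name, advertised_ip))
    return commands
```

Applied to the values on the slide, only mariadb-2 differs (172.16.0.97 configured, 172.16.0.201 advertised), so a single `alter server` command is produced.
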
Galera Cluster Heal
Admin container heal operation (helm upgrade of admin.recovery value), run by the admin post-upgrade-job.
(1) Write wait_role action
(2) Kill all mariadb pods
(3) Pods advertise recovery_pos seqno values
(4) Find pod with largest seqno
(5) Largest pod starts cluster, rest join
(6) Pods detect role and re-deploy

etcd server (action and advertised recovery positions):
    cmdb/my-cluster/actions/wait_role = {"advertise": "recovery_pos"}
    cmdb/my-cluster/services/recovery_pos/mariadb-0 = {"seqno": "527"}
    cmdb/my-cluster/services/recovery_pos/mariadb-1 = {"seqno": "528"}
    cmdb/my-cluster/services/recovery_pos/mariadb-2 = {"seqno": "527"}

etcd server (assigned roles):
    cmdb/my-cluster/mariadb-0/config/role = "--cluster=join:SST"
    cmdb/my-cluster/mariadb-1/config/role = "--cluster=new"
    cmdb/my-cluster/mariadb-2/config/role = "--cluster=join:SST"

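Steps (4) and (5) amount to picking the pod with the highest recovered seqno as the bootstrap node. A minimal sketch follows, assuming the advertised seqno values have already been read from etcd into a dictionary; the function name and the tie-break by pod name are illustrative choices, not something the slide specifies.

```python
def assign_heal_roles(recovery_pos):
    """Map each pod to a role string given its advertised seqno.

    recovery_pos: pod name -> seqno (string or int, as advertised in etcd).
    The pod with the largest seqno bootstraps a new cluster; the rest join.
    Ties are broken by the lowest pod name (an assumption of this sketch).
    """
    pods = sorted(recovery_pos)
    # max() returns the first maximal element, so sorting first fixes the tie-break.
    bootstrap = max(pods, key=lambda pod: int(recovery_pos[pod]))
    return {
        pod: "--cluster=new" if pod == bootstrap else "--cluster=join:SST"
        for pod in pods
    }
```

With the seqno values shown on the slide (527, 528, 527), mariadb-1 is chosen to start the cluster and the other two pods join with SST, matching the role keys listed above.
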
Galera Cluster Scale-Out
Admin container scale-out operation (helm upgrade of mariadb.count)
(1) Set new pods roles (admin pre-upgrade-job)
(2) New pods created
(3) Notify existing pods of new cluster size (admin post-upgrade-job)

etcd server:
    cmdb/my-cluster/mariadb-3/config/role = "--cluster=join:SST"
    cmdb/my-cluster/mariadb-4/config/role = "--cluster=join:SST"

[Diagram: existing pods mariadb-0, mariadb-1, mariadb-2; new pods mariadb-3, mariadb-4]

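Step (1) can be sketched as computing the etcd role keys for the pods that the StatefulSet is about to create. The function name is hypothetical and writing to etcd is omitted; the key layout and role string follow the slide, and the sketch relies on StatefulSet pods being numbered from zero upward.

```python
def scale_out_role_keys(cluster, old_count, new_count, role="--cluster=join:SST"):
    """Return etcd key -> role for the pods added by a scale-out.

    StatefulSet pods are numbered 0..count-1, so the new pods are
    old_count..new_count-1.
    """
    if new_count <= old_count:
        raise ValueError("scale-out requires new_count > old_count")
    return {
        "cmdb/%s/mariadb-%d/config/role" % (cluster, index): role
        for index in range(old_count, new_count)
    }
```

Scaling `my-cluster` from three to five pods yields exactly the two keys shown above, for mariadb-3 and mariadb-4.
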
Galera Cluster Scale-In
Admin container scale-in operation (helm upgrade of mariadb.count)
(1) Verify new cluster size (admin pre-upgrade-job)
(2) Pods deleted
(3) Notify remaining pods of new cluster size (admin post-upgrade-job)

[Diagram: remaining pods mariadb-0, mariadb-1, mariadb-2; deleted pods mariadb-3, mariadb-4]

MaxScale Cluster Heal
Maxscale will auto-heal MariaDB cluster when all database pods fail.

[Diagram: Master, Slave, Slave pods and a Remote Data Center]

Topology Audit (no audit if event < 15 seconds)
• After all pods restart:
  o Original master will be replicating from remote DC (Slave, Running)
  o Original slaves will still be replicating from old master (Running)
• Expected_master = first server replicating to remote DC
• If all other servers are replicating to the same server (old master):
  o For all servers (except expected_master): CHANGE MASTER TO expected_master
  o Run promotion.sql

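The topology audit can be sketched as a small planning function. This is one possible reading of the slide, not the actual MaxScale logic: it assumes the expected master is recognised as the first server (in name order) whose replication source is the remote data center, and it models each server's replication source as a plain string. The `REMOTE_DC` marker and the function name are hypothetical.

```python
REMOTE_DC = "remote-dc"  # hypothetical marker for "replicates from the remote DC"


def plan_topology_heal(replicates_from):
    """Return (expected_master, servers_to_repoint).

    replicates_from: server name -> name of its replication source,
    or REMOTE_DC when the source is the remote data center.
    The repoint list is empty when the topology does not match the
    expected "all others follow one old master" pattern.
    """
    servers = sorted(replicates_from)
    candidates = [s for s in servers if replicates_from[s] == REMOTE_DC]
    if not candidates:
        return None, []
    expected_master = candidates[0]
    others = [s for s in servers if s != expected_master]
    sources = {replicates_from[s] for s in others}
    if len(sources) != 1:
        # Slaves disagree about their master (or there are none): do nothing.
        return expected_master, []
    return expected_master, others
```

Each server in the returned list would then receive a `CHANGE MASTER TO` pointing at the expected master, followed by running the promotion SQL.
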
MaxScale Cluster Scale-Out
Admin container scale-out operation (helm upgrade of mariadb.count)
(1) Make sure master exists (admin pre-upgrade-job)
(2) Write wait_role action
(3) New pods created
(4) Backup Master (0)
(5) As ready pods detected, restore from master backup and advertise pod role
(6) Notify existing pods of new cluster size (admin post-upgrade-job)

etcd server (action and ready advertisements):
    cmdb/my-cluster/actions/wait_role = {"advertise": "ready"}
    cmdb/my-cluster/services/ready/mariadb-3 = 'true'
    cmdb/my-cluster/services/ready/mariadb-4 = 'true'

etcd server (assigned roles):
    cmdb/my-cluster/mariadb-3/config/role = "--replicate=slave"
    cmdb/my-cluster/mariadb-4/config/role = "--replicate=slave"

Maxscale:
    maxadmin create server <server> …
    maxadmin add server <server>

[Diagram: existing pods mariadb-0 (Master), mariadb-1, mariadb-2; new pods mariadb-3, mariadb-4]

MaxScale Cluster Scale-In
Admin container scale-in operation (helm upgrade of mariadb.count)
(1) Verify new cluster size (admin pre-upgrade-job)
(2) Switchover Master via MaxScale if necessary
(3) Pods deleted
(4) Notify remaining pods of new cluster size (admin post-upgrade-job)

Maxscale:
    maxadmin remove server <server>
    maxadmin destroy server <server>

[Diagram: remaining pods mariadb-0, mariadb-1, mariadb-2; deleted pods mariadb-3, mariadb-4]

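The scale-in steps can be sketched as a planning function that lists the pods to be removed, decides whether a switchover is needed first, and emits the MaxScale cleanup commands in the form shown on the slide. The function name and return structure are assumptions; the sketch relies on StatefulSets removing the highest-numbered pods first, and the command strings are reproduced as abbreviated on the slide, so a real invocation may need further arguments.

```python
def plan_scale_in(old_count, new_count, current_master):
    """Plan a MariaDB scale-in behind MaxScale.

    Returns a dict with the pods to remove (highest ordinal first),
    whether the master must be switched over beforehand, and the
    maxadmin commands as written on the slide.
    """
    if not 0 < new_count < old_count:
        raise ValueError("scale-in requires 0 < new_count < old_count")
    removed = ["mariadb-%d" % i for i in range(old_count - 1, new_count - 1, -1)]
    commands = []
    for server in removed:
        commands.append("maxadmin remove server %s" % server)
        commands.append("maxadmin destroy server %s" % server)
    return {
        "removed": removed,
        # Step (2): only needed when the master is one of the pods being deleted.
        "needs_switchover": current_master in removed,
        "commands": commands,
    }
```

Scaling from five pods to three while mariadb-4 is master returns `needs_switchover` as true, so the master role would be moved to a surviving pod before the pods are deleted.
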
Future Work
• Additional enhancements to prevent data loss
  o Supporting semi-sync replication in Master/Slave/Slave cluster with MaxScale
  o Implement preStop hook to trigger switchover if Master is being deleted (e.g., for migration)
• Kubernetes Horizontal Pod Autoscaling (HPA)