Multi-Cluster Kubernetes and Service Mesh Patterns
Christian Posta, Field CTO – Solo.io
2 | Copyright © 2020 CHRISTIAN POSTA Global Field CTO, Solo.io @christianposta christian@solo.io https://blog.christianposta.com https://slideshare.net/ceposta
3 | Challenges
• Improve velocity of teams building and delivering code
• Decentralized implementations vs centralized operations
• Connect and include existing systems and investments
• Improve security posture
• Stay within regulations and compliance
4 | More, smaller clusters
• High availability
• Compliance
• Isolation / Autonomy
• Scale
• Data locality, cost
• Public/DMZ/Private networks
5 | Multiple clusters
• Exact replicas of each other, same fleet?
• Separate, non-uniform deployments?
• Single operational/administrative control
• Segmented by network? Segmented by team?
• Independent administration?
6 | Cluster federation
• Autonomous clusters
• Different organizational/network/administrative boundaries
• Share pieces of configuration
• For those shared pieces, treat union as a single unit
• Uses an orchestrator to stitch together policies for federation
7 | Example: Kubefed
[Diagram: the Kubefed control plane (Kubefed CP) in Cluster 0 watches Federated Resources and federates them to Cluster 1 and Cluster 2]
https://github.com/kubernetes-sigs/kubefed
8 | Example: Kubefed
    apiVersion: types.kubefed.io/v1beta1
    kind: FederatedService
    metadata:
      name: echo-server
    spec:
      placement:
        clusterSelector:
          matchLabels: {}
      template:
        metadata:
          labels:
            app: echo-server
        spec:
          ports:
          - name: http
            port: 8080
          selector:
            app: echo-server
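
The FederatedService above federates only the Service object. As a rough sketch of how the matching workload might be federated alongside it, the manifest below uses a FederatedDeployment with an explicit cluster list and a per-cluster override. The cluster names (cluster1, cluster2), the namespace, the image, and the replica counts are illustrative assumptions and do not come from the talk.

```yaml
# Sketch only: cluster names, namespace, image, and replica counts are assumptions.
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
  name: echo-server
  namespace: demo              # hypothetical namespace; it must itself be federated
spec:
  placement:
    clusters:                  # explicit list instead of the clusterSelector used above
    - name: cluster1
    - name: cluster2
  template:                    # an ordinary Deployment body
    metadata:
      labels:
        app: echo-server
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: echo-server
      template:
        metadata:
          labels:
            app: echo-server
        spec:
          containers:
          - name: echo-server
            image: example.com/echo-server:latest   # placeholder image
            ports:
            - containerPort: 8080
  overrides:                   # per-cluster differences applied on top of the template
  - clusterName: cluster2
    clusterOverrides:
    - path: "/spec/replicas"
      value: 4
```

An empty matchLabels selector, as in the slide, targets every registered cluster; the explicit list here is one way to narrow placement.
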
9 | Demo: Simple Kubernetes federation
10 | Services need to communicate with each other
11 | Pattern: flat network across pods
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2]
12 | Pattern: Different network, expose all services
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2]
13 | Pattern: Different network, controlled gateway
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2]
14 | Forces to balance
• Security (authz/authn/encryption/identity)
• Service discovery
• Failover / traffic shifting / transparent routing
• Observability
• Separate networks
• Well-defined fault domains
• Building for scale
15 | Could you build these patterns just using Kubernetes?
16 | Service Mesh can help
17 | Envoy is the magic behind service mesh
http://envoyproxy.io
18 | Envoy implements:
• zone aware, priority/locality load balancing
• circuit breaking, outlier detection
• timeouts, retries, retry budgets
• traffic shadowing
• request racing
• rate limiting
• RBAC, TLS origination/termination
• access logging, statistics collection
19 | Envoy to do application networking heavy lifting
[Diagram: the Account service and other workloads communicating over mTLS]
• Transparent client-side routing decisions
• TLS orig/termination
• Circuit breaking
• Stats collection
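
To make the circuit breaking and outlier detection bullets concrete, here is a minimal sketch of an Envoy (v3 API) cluster definition. The cluster name, the upstream address, and every threshold value are assumptions picked for illustration; in a mesh such as Istio the control plane would normally generate this configuration rather than an operator writing it by hand.

```yaml
# Sketch only: names, address, and thresholds are illustrative assumptions.
static_resources:
  clusters:
  - name: products
    type: STRICT_DNS
    connect_timeout: 1s
    lb_policy: ROUND_ROBIN
    circuit_breakers:
      thresholds:
      - priority: DEFAULT
        max_connections: 100        # cap on upstream connections
        max_pending_requests: 50    # cap on requests queued for a connection
        max_requests: 200           # cap on concurrent requests
        max_retries: 3              # cap on concurrent retries
    outlier_detection:
      consecutive_5xx: 5            # eject a host after 5 consecutive 5xx responses
      interval: 10s                 # how often ejection analysis runs
      base_ejection_time: 30s       # ejection time grows with repeated ejections
      max_ejection_percent: 50      # never eject more than half of the hosts
    load_assignment:
      cluster_name: products
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: products.service
                port_value: 80
```
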
20 | Envoy as backbone for multi-cluster communication federation
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2]
21 | Other key Envoy proxying features
• Request hedging
• Retry Budgets
• Load balancing priorities
• Locality weighted load balancing
• Zone aware routing
• Degraded endpoints (fallback)
• Aggregated clusters
22 | Exploring Envoy failover routing capabilities: Request racing
[Diagram: Account calls http://products.service/, with workloads in us-west-1 and us-west-2. On timeout, Envoy races a second request; the first to return is the response to the caller]
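
Request racing corresponds to Envoy's hedge policy on a route. The fragment below is a sketch of a route configuration, assuming a cluster named products and arbitrary timeout values. Hedging on a per-try timeout only takes effect when the route also has a retry policy that retries at least one error code and sets a retry count, which is why both appear together.

```yaml
# Sketch only: route name, cluster name, and timeouts are illustrative assumptions.
route_config:
  name: local_route
  virtual_hosts:
  - name: products
    domains: ["products.service"]
    routes:
    - match:
        prefix: "/"
      route:
        cluster: products
        timeout: 3s                 # overall request deadline
        retry_policy:
          retry_on: 5xx
          num_retries: 1
          per_try_timeout: 0.5s     # when this fires, a second request is raced
        hedge_policy:
          hedge_on_per_try_timeout: true   # keep the original request in flight
```

With this in place the first upstream response to complete is returned to the caller, at the cost of extra upstream load whenever a try is slow.
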
23 | Exploring Envoy failover routing capabilities: Zone aware routing (Envoy decides)
[Diagram: Account calls http://products.service/, with workloads in us-west-1 and us-west-2. When there are not enough healthy hosts in the same zone, traffic spills over to another zone]
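
In zone aware routing the proxy itself works out how much traffic can stay in its own zone. The bootstrap fragment below sketches the pieces Envoy needs for that: the proxy's own locality, a local cluster name, and zone aware settings on the upstream cluster. The node id, the region/zone values, and the use of ADS for endpoint discovery are assumptions, and the definition of the local cluster (account) is left out for brevity.

```yaml
# Sketch only: a fragment, not a complete bootstrap. Names and values are assumptions.
node:
  id: account-1
  cluster: account
  locality:
    region: us-west
    zone: us-west-1               # the zone this proxy runs in
cluster_manager:
  local_cluster_name: account     # must match a defined cluster (omitted here)
static_resources:
  clusters:
  - name: products
    type: EDS                     # endpoints and their zones arrive via discovery
    connect_timeout: 1s
    eds_cluster_config:
      eds_config:
        resource_api_version: V3
        ads: {}
    common_lb_config:
      zone_aware_lb_config:
        routing_enabled:
          value: 100              # percent of requests eligible for zone aware routing
        min_cluster_size: 6       # below this upstream size, zone awareness is skipped
```
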
24 | Exploring Envoy failover routing capabilities: Locality aware (Control plane decides)
[Diagram: Account calls http://products.service/, with workloads in us-west-1 and us-west-2. When there are not enough healthy hosts in the same zone, traffic spills over to another zone. Weights shown: W=1, W=1, W=1, W=5, W=5]
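
With locality weighted load balancing the control plane, not the proxy, assigns the weights. The sketch below shows what such an assignment could look like as a static Envoy cluster. The weights 5 and 1 echo the labels on the slide, but which locality gets which weight, the region name, and the endpoint addresses are all assumptions; a real control plane would deliver this through EDS.

```yaml
# Sketch only: weight placement, region, and addresses are illustrative assumptions.
static_resources:
  clusters:
  - name: products
    type: STATIC
    connect_timeout: 1s
    common_lb_config:
      locality_weighted_lb_config: {}   # pick a locality by weight, then a host within it
    load_assignment:
      cluster_name: products
      endpoints:
      - locality:
          region: us-west
          zone: us-west-1
        load_balancing_weight: 5        # preferred locality
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 10.0.1.10
                port_value: 8080
      - locality:
          region: us-west
          zone: us-west-2
        load_balancing_weight: 1        # receives a smaller share of traffic
        lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 10.0.2.10
                port_value: 8080
```
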
25 | Exploring Envoy failover routing capabilities: Aggregate Cluster (for routing to gateways)
[Diagram: Account calls http://products.service/. Labels: Edge gw, us-west-1, us-west-2, EDS, Strict DNS]
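
An aggregate cluster lets Envoy treat several clusters as one, trying them in listed order. The sketch below pairs an EDS cluster for local endpoints with a Strict DNS cluster that points at a remote edge gateway, which is one reading of the EDS and Strict DNS labels on the slide. The cluster names, the gateway hostname, and the port are assumptions.

```yaml
# Sketch only: cluster names, gateway hostname, and port are illustrative assumptions.
static_resources:
  clusters:
  - name: products-local            # local endpoints, discovered via EDS
    type: EDS
    connect_timeout: 1s
    eds_cluster_config:
      eds_config:
        resource_api_version: V3
        ads: {}
  - name: products-via-gateway      # remote cluster reached through its edge gateway
    type: STRICT_DNS
    connect_timeout: 1s
    load_assignment:
      cluster_name: products-via-gateway
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: edge-gw.us-west-2.example.com   # hypothetical hostname
                port_value: 15443
  - name: products                  # the cluster that routes actually reference
    connect_timeout: 1s
    lb_policy: CLUSTER_PROVIDED
    cluster_type:
      name: envoy.clusters.aggregate
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.clusters.aggregate.v3.ClusterConfig
        clusters:                   # order sets priority: local first, gateway as fallback
        - products-local
        - products-via-gateway
```
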
26 | Multi-cluster examples
Service mesh examples using Envoy Proxy
27 | Istio shared control plane, flat network
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2, with a single Istiod]
28 | Thoughts about shared control plane/flat network
• Simplest set up for Istio multi-cluster
• No special Envoy routing (though may use zone-aware)
• Shared control plane increases the failure domain to multiple clusters
• Use flat networking if possible (simpler) but may not have/want that option
• No special considerations for identity (identity domain is shared)
• Still need to federate telemetry collection
29 | Istio shared control plane, separate networks
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2, with a single Istiod]
30 | Thoughts about shared control plane/separate network
• Uses a gateway to allow communication between networks
• Uses Envoy Locality Weighted LB (for the gateway endpoints). Istio calls this “split horizon EDS”.
• Shares same failure domain across all clusters
• Use the gateways to facilitate communication AND control plane
• Slight increase in burden on operator to label networks and gateway endpoints correctly so Istio has that information
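
The cross-network gateway in this setup is typically an Istio Gateway that passes mutual TLS traffic through by SNI instead of terminating it. The manifest below is a sketch modeled on the Istio multi-network guides of that era; the resource name, the selector label, and the port number are assumptions that depend on how the ingress gateway was installed.

```yaml
# Sketch only: name, selector label, and port depend on the installation.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: cluster-aware-gateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway       # assumed label on the gateway pods
  servers:
  - port:
      number: 443
      name: tls
      protocol: TLS
    tls:
      mode: AUTO_PASSTHROUGH    # route on SNI without terminating the workloads' mTLS
    hosts:
    - "*.local"
```

Because the gateway does not terminate TLS, workload identity is preserved end to end across the network boundary.
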
31 | Istio separate control planes, separate networks
[Diagram: Account, User, Products, and History services across Cluster 1 and Cluster 2, each cluster with its own Istiod]
32 | Thoughts about separate control plane/separate network
• Uses a gateway to allow communication between networks
• Uses Istio’s ServiceEntry mechanism to enable cross-network discovery
• Independent control planes
• Separate, independent failure domains
• Doesn’t solve where trust domains MUST be separate (with federation at the boundaries)
• Increase burden on operator to maintain service discovery, identity federation, and multi-cluster configuration across meshes
33 | Example multi-cluster routing with ServiceEntry
[Diagram: Account and User in Cluster 1, User in Cluster 2, each cluster with its own Istiod. The local User service is http://users.default.svc.cluster.local; the remote one is http://users.default.cluster-2, made discoverable by a ServiceEntry for users.default.cluster-2]
34 | ServiceEntry for service discovery
    apiVersion: networking.istio.io/v1alpha3
    kind: ServiceEntry
    metadata:
      name: users-cluster2
    spec:
      hosts:
      - users.default.cluster2
      location: MESH_INTERNAL
      ports:
      - name: http1
        number: 8000
        protocol: http
      resolution: DNS
      addresses:
      - 240.0.0.2
      endpoints:
      - address: 10.0.2.5
        ports:
          http1: 15443
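
Once the remote service is discoverable through a ServiceEntry like the one above, traffic shifting between the local and remote copies can be expressed with an ordinary VirtualService. The sketch below is one possible shape, not something shown in the talk: the 90/10 split, the resource name, and the assumption that the local users service also listens on port 8000 are all illustrative.

```yaml
# Sketch only: weights, name, and the local port are illustrative assumptions.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: users
  namespace: default
spec:
  hosts:
  - users.default.svc.cluster.local    # callers keep using the local name
  http:
  - route:
    - destination:
        host: users.default.svc.cluster.local
        port:
          number: 8000
      weight: 90
    - destination:
        host: users.default.cluster2   # host declared by the ServiceEntry
        port:
          number: 8000
      weight: 10
```
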
35 | Forces to balance
• Security (authz/authn/encryption/identity)
• Service discovery
• Failover / traffic shifting / transparent routing
• Observability
• Separate networks
• Well-defined fault domains
• Building for scale
36 | What to do about the added burden for the operator?
37 | [Diagram: Service Mesh Hub as a management plane over Cluster 1 and Cluster 2, each with its own Istiod, Ingress Gateway, and workloads]
38 | Demo: Service Mesh Hub
39 | [Diagram: Service Mesh Hub management plane and a remote cluster, each side with Istiod, an Ingress Gateway, and workloads]
40 | [Diagram: Service Mesh Hub management plane and a remote cluster, each side with Istiod, an Ingress Gateway, workloads, and a CSR agent. Steps shown: create cert/key and CSR; sign cert w/ shared root]
41 | [Diagram: the same topology after signing. Each CSR agent's cluster now holds a chain with the same shared root]
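
The talk does not show how Service Mesh Hub stores the signed certificates. For comparison, the manual way to give an Istio control plane an intermediate CA chained to a shared root is Istio's plug-in CA secret, sketched below. The certificate bodies are placeholders, and whether Service Mesh Hub produces exactly this secret is an assumption.

```yaml
# Sketch only: Istio's plug-in CA secret; all certificate contents are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: cacerts                 # the name Istiod looks for
  namespace: istio-system
type: Opaque
stringData:
  ca-cert.pem: |
    <this cluster's intermediate CA certificate>
  ca-key.pem: |
    <private key for the intermediate CA>
  root-cert.pem: |
    <shared root certificate, identical in every cluster>
  cert-chain.pem: |
    <intermediate certificate followed by the root>
```

When every cluster's intermediate chains to the same root, workloads in different clusters can validate each other's mTLS certificates.
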
42 | THANK YOU FOR ATTENDING!
@christianposta
christian@solo.io
https://blog.christianposta.com
https://slideshare.net/ceposta
43 | Links
• https://solo.io
• https://slack.solo.io
• https://gloo.solo.io
• https://envoyproxy.io
• https://istio.io
• https://webassemblyhub.io
• https://servicemeshhub.io
• https://blog.christianposta.com


Editor's Notes

  • #3 How does Solo help do this?
    • Help pick the right tech when it’s warranted (Envoy)
    • Hedge when the market is still volatile (SMH)
    • Simplify adoption
    • Enterprise focus (security, heterogeneous)
    • Solve the problem everywhere, regardless of technology, infrastructure, or footprint
    • On prem/public cloud/hybrid
    • Any service mesh technology
    • VMs, containers, et al.
  • #11 Kubernetes is the de facto way to build and deploy containerized microservices … but not everything runs in Kubernetes, and not everything will run on premises
  • #18, #20, #23, #24, #25, #26 Need a way to automate handling of explosive numbers of workloads (microservices)
    • Placement of workloads, AKA deployments
    • Autoscale, health check, start/stop, rebalance, scale up/down
    • Building applications for Kubernetes (or any cloud native platform) is fundamentally different
    • Why Kubernetes won: community; right level of API; extensible; declarative configuration model; foundation of DevOps and automation model
    • Adopting microservices to go fast!