1 Introduction to Confluent Operator to establish a Cloud-Native Confluent Platform and provide a Kafka Operator for Kubernetes Kai Waehner Technology Evangelist contact@kai-waehner.de LinkedIn @KaiWaehner www.confluent.io www.kai-waehner.de
2 Agenda ● Cloud Native vs. SaaS / Serverless Kafka ● The Emergence of Kubernetes ● Kafka on K8s Deployment Challenges ● Confluent Operator as Kafka Operator ● Q&A Confluent Operator
4 Business Digitalization Trends are Driving the Need to Process Events at a whole new Scale, Speed and Efficiency Mobile Cloud Microservices Internet of Things Machine Learning The world has changed!
5 Best-of-breed Platforms, Partners and Services for Multi-cloud Streams ● Private Cloud: Deploy on bare metal, VMs, containers or Kubernetes in your datacenter with Confluent Platform and Confluent Operator ● Public Cloud: Implement self-managed in the public cloud or adopt a fully managed service with Confluent Cloud ● Hybrid Cloud: Build a persistent bridge between datacenter and cloud with Confluent Replicator (Diagram labels: VM, Self-Managed, Fully Managed)
6 Software as a Service (SaaS) ● A software distribution model in which a third-party provider hosts applications and makes them available to customers over the Internet ● Provides SLAs such as uptime guarantees, throughput, latency, etc. ● Depending on your definition, some also call this Serverless (BaaS, Backend as a Service) for infrastructure components
7 Confluent Cloud: Cloud-Native Confluent Platform as a Fully-Managed Service ● Available on the leading public clouds with mission-critical SLAs ● Serverless Kafka characteristics: pay-as-you-go, elastic auto-scaling, abstracting infrastructure (topics, not brokers)
8 Confluent Cloud: What does Fully-Managed Mean? ● Scaling ● Upgrades (latest stable version of Kafka) ● Patching ● Maintenance ● Sizing (retention, latency, throughput, storage, etc.) ● Data balancing for optimal performance ● Performance tuning for real-time and latency requirements ● Fixing Kafka bugs ● Uptime monitoring and proactive remediation of issues ● Recovery support from data corruption ● Scaling the cluster as needed ● Data balancing the cluster as nodes are added ● Support for any Kafka issue with a response time of less than 60 minutes (Diagram labels: Infrastructure management (commodity), Infra-as-a-Service, Kafka-specific management, Platform-as-a-Service, Harness full power of Kafka, Evolve as you need, Future-proof, Mission-critical reliability) Most Kafka-as-a-Service offerings are only partially managed
9 What is Cloud Native? Benefits • Scalable • Flexible • Agile • Elastic • Automated • Etc.
10 The Twelve-Factor App What is Cloud Native? https://12factor.net/
11 What is Cloud Native? 10 key characteristics (one of many definitions) ● Packaged as lightweight containers ● Developed with best-of-breed languages and frameworks ● Designed as loosely coupled microservices ● Centered around APIs for interaction and collaboration ● Architected with a clean separation of stateless and stateful services ● Isolated from server and operating system dependencies ● Deployed on self-service, elastic, cloud infrastructure ● Managed through agile DevOps processes ● Automated capabilities ● Defined, policy-driven resource allocation https://thenewstack.io/10-key-attributes-of-cloud-native-applications/
12 Agenda ● Cloud Native vs. SaaS / Serverless Kafka ● The Emergence of Kubernetes ● Kafka on K8s Deployment Challenges ● Confluent Operator as Kafka Operator ● Q&A Confluent Operator
13 Cloud-Native Platforms in the last 5 years
14 Kubernetes won the battle!
15 Kubernetes innovation is not stopping… ● Cloud providers provide Kubernetes as a service ○ AWS, Azure, GCP, … ○ Not all are mature yet ● Stateful deployments leverage the Kubernetes Operator pattern ○ For many infrastructure components, like databases, messaging, etc. ○ Community projects, vendor solutions (open source vs. proprietary) ● Service Mesh ○ Envoy, Istio, Linkerd, ... ○ Pull request for Kafka protocol support in Envoy / Istio accepted recently
16 Evolution of Kafka DevOps: Shell scripts → Ansible/Chef → Docker → Kubernetes
17 Agenda ● Cloud Native vs. SaaS / Serverless Kafka ● The Emergence of Kubernetes ● Kafka on K8s Deployment Challenges ● Confluent Operator as Kafka Operator ● Q&A Confluent Operator
18 Kafkaesque world of Kafka on Kubernetes
19 Kafka on Kubernetes – It’s tricky • Translating an existing architecture to Kubernetes • Failover handling and data balancing • Communication between ZooKeeper, Kafka Brokers, Clients (Java, REST, Connect, KSQL), Schema Registry, etc. • External access from / to outside the Kubernetes cluster • Persistent storage options on-prem and in the cloud • Security configuration • Rolling upgrades • Etc.
20 Agenda ● Cloud Native vs. SaaS / Serverless Kafka ● The Emergence of Kubernetes ● Kafka on K8s Deployment Challenges ● Confluent Operator as Kafka Operator ● Q&A Confluent Operator
21 Apache Kafka: The De-facto Standard for Real-Time Event Streaming ● Global-scale ● Real-time ● Persistent Storage ● Stream Processing (Diagram labels: Edge, Cloud, Data Lake, Databases, Datacenter, IoT, SaaS Apps, Mobile, Microservices, Machine Learning)
22 Real-Time Inventory Real-Time Fraud Detection Real-Time Customer 360 Machine Learning Models Real-Time Data Transformation ... Contextual Event-Driven Applications Universal Event Pipeline Data Stores Logs 3rd Party Apps Custom Apps/Microservices STREAMS CONNECT CLIENTS
23 Confluent establishes Freedom of Choice ● SaaS or Self-Managed or Hybrid ● Confluent’s vision is to introduce cloud-native capabilities to Confluent Platform and enable users who want a cloud-native experience in on-premises and self-managed cloud environments ● Introducing Confluent Operator
24 Confluent Platform Operations and Security Development & Stream Processing Support, services, training & partners Apache Kafka Security plugins | Role-Based Access Control Control Center | Replicator | Auto Data Balancer | Operator Connectors Clients | REST Proxy MQTT Proxy | Schema Registry KSQL Connect Continuous Commit Log Streams Complete Event Streaming Platform Mission-critical Reliability Freedom of Choice Datacenter Public Cloud Confluent Cloud Self-Managed Software Fully-Managed Service
25 Confluent Operator ● Deployment and management automation for Confluent Platform on Kubernetes ● Including Apache Kafka, Zookeeper, Schema Registry, Connect, Control Center, Replicator, KSQL ● For organizations standardized on Kubernetes as platform runtime ● Operationalizes years of experience running Kafka on Kubernetes on the leading public clouds Confluent Platform Confluent Operator Kubernetes AWS Azure GCP RH OpenShift Mesosphere Pivotal On-Premises Cloud Docker Images Automate Deployment of Confluent Platform on Kubernetes on Any Platform at Any Scale
26 Cloud-Native Deployment of Kafka and Confluent Platform. Confluent Operator enables you to: ● Automate provisioning of Kafka pods in minutes ● Monitor SLAs through Confluent Control Center or Prometheus ● Scale Kafka elastically and automate rolling updates ● Built on our first-hand knowledge of running Confluent at scale
27 Confluent’s Kubernetes Journey ● 2016: Confluent Cloud Development ● 05/2017: Confluent Cloud Early Access ● 11/2017: Confluent Cloud GA (AWS) ● 2019: Confluent Cloud GA on AWS, GCP, Azure ● 07/2019: Confluent Operator GA (Confluent Platform)
28 Confluent Operator: A custom Kubernetes Controller ● Nodes and pods are where applications run on Kubernetes ● Applications use objects like StatefulSets, ConfigMaps, PVs ● Custom Controllers create custom resources that provide unique application functionality: Upgrades, Elasticity, Kafka Operational Logic (Diagram labels: Master Node, API Server, Scheduler, Controllers & Custom Controllers, Worker Nodes, Pods, PVs, ConfigMaps, StatefulSets, Custom Resources)
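The controller idea on this slide can be illustrated with a minimal reconcile function: compare a desired state with an observed state and emit the actions that close the gap. This is only a sketch of the general Kubernetes controller pattern, not Confluent Operator's actual logic. The `ClusterSpec` and `ClusterStatus` types, their fields, and the `kafka-<n>` pod names are hypothetical, and no real Kubernetes API is called.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSpec:
    """Desired state, as a custom resource might declare it (hypothetical fields)."""
    replicas: int
    version: str


@dataclass
class ClusterStatus:
    """Observed state: broker pod name -> running version (hypothetical shape)."""
    pods: dict = field(default_factory=dict)


def reconcile(spec, status):
    """Return the actions that move the observed state toward the desired state."""
    actions = []
    # Walk the desired ordinals: create missing pods, upgrade outdated ones.
    for i in range(spec.replicas):
        name = f"kafka-{i}"
        if name not in status.pods:
            actions.append(("create", name))
        elif status.pods[name] != spec.version:
            actions.append(("upgrade", name))
    # Delete pods whose ordinal is beyond the desired replica count.
    for name in sorted(status.pods):
        if int(name.rsplit("-", 1)[1]) >= spec.replicas:
            actions.append(("delete", name))
    return actions


spec = ClusterSpec(replicas=3, version="5.3")
status = ClusterStatus(pods={"kafka-0": "5.3", "kafka-1": "5.2", "kafka-3": "5.3"})
actions = reconcile(spec, status)
print(actions)
```

A real controller would run such a function repeatedly in response to watch events from the API server and apply the actions through the Kubernetes API.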
29 Helm - The Confluent Operator Package Manager ● Confluent Operator leverages Helm Charts to deploy, upgrade and uninstall Confluent Platform custom resources and pods ● Configuration front end for users to specify how a Confluent Platform Cluster is deployed: ○ # of replicas for Kafka, ZooKeeper ○ Security and Authentication configuration ○ Persistent Storage configuration ● Cluster configuration edits are also performed using Helm (Diagram labels: Operator, Helm Charts - yaml)
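As a rough illustration of how a configuration edit might be driven through Helm, the sketch below assembles a `helm upgrade --install` command line. The `upgrade --install`, `--namespace`, `--values` and `--set` options are standard Helm CLI features. The release name, chart path, namespace, values file and the `kafka.replicas` key are hypothetical placeholders, not names taken from the Confluent Operator charts. The command is only printed, never executed.

```python
import shlex


def helm_upgrade_command(release, chart, namespace, values_file, overrides):
    """Build the argv for `helm upgrade --install` with optional --set overrides."""
    argv = ["helm", "upgrade", "--install", release, chart,
            "--namespace", namespace, "--values", values_file]
    # Sort the keys so the generated command is deterministic.
    for key in sorted(overrides):
        argv += ["--set", f"{key}={overrides[key]}"]
    return argv


cmd = helm_upgrade_command(
    release="kafka",                     # hypothetical release name
    chart="./helm/confluent-operator",   # hypothetical chart path
    namespace="operator",                # hypothetical namespace
    values_file="./my-values.yaml",      # hypothetical values file
    overrides={"kafka.replicas": 3},     # hypothetical value key
)
print(shlex.join(cmd))
```

Keeping the cluster description in a values file and changing it through `helm upgrade` matches the slide's point that configuration edits go through Helm rather than through manual changes to pods.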
30 Confluent Operator Deployment (Diagram labels: Kubernetes Cluster, K8s Nodes, Operator, Kafka Pod, ZK Pod, Replicator Pod, C3 Pod, SR Pod, KSQL Pod, REST Proxy Pod, Persistent Volumes (AWS EBS, GCE Persistent Disk, Local Persistent Volume, etc.), External Access: Load Balancers, Configurations: ConfigMaps)
31 Confluent Operator - Automated Provisioning
32 Confluent Operator - Automated Security Configuration ● SASL PLAIN, SASL_SSL, TLS with Mutual Authentication ● Automate configuration of truststores and keystores with secret objects ● Automate configuration of Kafka and all Confluent Platform Components
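To make the security options more concrete, here is a sketch of the client-side properties a Kafka Java client typically needs when a broker listener uses SASL_SSL with the PLAIN mechanism. The property names are standard Kafka client configuration keys. The helper functions, credentials and truststore path are hypothetical, and this does not show what Confluent Operator itself generates.

```python
def sasl_ssl_plain_client_properties(username, password,
                                     truststore_path, truststore_password):
    """Return Kafka client properties for SASL_SSL with the PLAIN mechanism."""
    jaas = (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="{username}" password="{password}";'
    )
    return {
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.jaas.config": jaas,
        "ssl.truststore.location": truststore_path,
        "ssl.truststore.password": truststore_password,
    }


def to_properties_text(props):
    """Render a dict as Java .properties lines (no escaping, for illustration)."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


props = sasl_ssl_plain_client_properties(
    username="demo-user",                       # hypothetical credentials
    password="demo-secret",
    truststore_path="/mnt/secrets/truststore.jks",  # hypothetical mount path
    truststore_password="changeit",
)
print(to_properties_text(props))
```

In a Kubernetes deployment, values like these would normally come from Secret objects mounted into the pod rather than from literals in code.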
33 Confluent Operator - Scale Horizontally ● Automate scaling: spin up new brokers and Connect workers easily ● Distribute partitions to new brokers: ○ Determine balancing plan ○ Execute balancing plan ○ Monitor resources
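The "determine balancing plan" step can be pictured with a deliberately simplified sketch: count partitions per broker and move partitions from the busiest broker to the idlest one until the counts differ by at most one. This is a toy greedy heuristic under strong assumptions (one replica per partition, equal partition sizes, every assigned broker appears in `brokers`). It is not the algorithm used by Confluent Operator or Auto Data Balancer.

```python
def balancing_plan(assignment, brokers):
    """Return (partition, from_broker, to_broker) moves that even out partition counts.

    assignment maps partition name -> broker id; brokers lists all broker ids,
    including newly added, still empty ones.
    """
    load = {broker: [] for broker in brokers}
    for partition in sorted(assignment):
        load[assignment[partition]].append(partition)

    moves = []
    while True:
        busiest = max(brokers, key=lambda b: len(load[b]))
        idlest = min(brokers, key=lambda b: len(load[b]))
        # Stop once no broker holds more than one partition above another.
        if len(load[busiest]) - len(load[idlest]) <= 1:
            return moves
        partition = load[busiest].pop()
        load[idlest].append(partition)
        moves.append((partition, busiest, idlest))


# Two brokers hold three partitions each; broker 2 was just added.
assignment = {"t-0": 0, "t-1": 0, "t-2": 0, "t-3": 1, "t-4": 1, "t-5": 1}
moves = balancing_plan(assignment, brokers=[0, 1, 2])
print(moves)
```

A production balancer would also weigh partition size, replica placement and network throughput, and would throttle the data movement while the plan executes.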
34 Confluent Operator - Rolling Upgrade of all components Automated Rolling Upgrades of all components - Kafka Brokers, Zookeeper, Connect, Control Center Kafka Broker Upgrades: 1. Stop the broker, upgrade Kafka 2. Wait for Partition Leader reassignment 3. Start the upgraded broker 4. Wait for zero under-replicated partitions 5. Upgrade the next broker
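The five broker upgrade steps above can be sketched as a loop over brokers with two wait conditions. The `FakeCluster` class is an in-memory stand-in invented for this example; its method names are hypothetical and do not correspond to any Kafka or Confluent API. Only the ordering of the steps is taken from the slide.

```python
import time


class FakeCluster:
    """In-memory stand-in for a Kafka cluster (all methods are hypothetical)."""

    def __init__(self, brokers, version):
        self.versions = {broker: version for broker in brokers}
        self.running = set(brokers)
        self.log = []

    def stop(self, broker):
        self.running.discard(broker)
        self.log.append(("stop", broker))

    def upgrade(self, broker, version):
        self.versions[broker] = version
        self.log.append(("upgrade", broker))

    def start(self, broker):
        self.running.add(broker)
        self.log.append(("start", broker))

    def leaders_on(self, broker):
        # In this fake, leadership leaves a broker as soon as it is stopped.
        return 1 if broker in self.running else 0

    def under_replicated_partitions(self):
        # In this fake, every stopped broker causes one under-replicated partition.
        return len(self.versions) - len(self.running)


def wait_until(condition, timeout_s=5.0, poll_s=0.01):
    """Poll until condition() is true, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while not condition():
        if time.monotonic() > deadline:
            raise TimeoutError("condition not met in time")
        time.sleep(poll_s)


def rolling_upgrade(cluster, brokers, version):
    for broker in brokers:                                  # 5. next broker
        cluster.stop(broker)                                # 1. stop the broker,
        cluster.upgrade(broker, version)                    #    upgrade Kafka
        wait_until(lambda: cluster.leaders_on(broker) == 0)             # 2.
        cluster.start(broker)                               # 3. start it again
        wait_until(lambda: cluster.under_replicated_partitions() == 0)  # 4.


cluster = FakeCluster(brokers=[0, 1, 2], version="5.2")
rolling_upgrade(cluster, brokers=[0, 1, 2], version="5.3")
print(cluster.versions)
```

Upgrading one broker at a time and waiting for zero under-replicated partitions before moving on is what keeps the remaining replicas available throughout the upgrade.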
Agenda ● Cloud Native vs. SaaS / Serverless Kafka ● The Emergence of Kubernetes ● Kafka on K8s Deployment Challenges ● Confluent Operator as Kafka Operator ● Q&A Confluent Operator
37 Kai Waehner Technology Evangelist contact@kai-waehner.de @KaiWaehner www.confluent.io www.kai-waehner.de LinkedIn Questions? Feedback? Let’s connect!

Confluent Operator as Cloud-Native Kafka Operator for Kubernetes
