What’s New in Confluent Platform 6.0
Self-Balancing Kafka, Cluster Linking, Tiered Storage, and much more

Kai Waehner, Technology Evangelist
contact@kai-waehner.de | LinkedIn | @KaiWaehner | www.confluent.io | www.kai-waehner.de

Confluent Platform 6.0

DEVELOPER: Unrestricted Developer Productivity
• Event Streaming Database: ksqlDB
• Rich Pre-built Ecosystem: Connectors | Hub | Schema Registry
• Multi-language Development: Non-Java Clients | REST APIs

OPERATOR: Efficient Operations at Scale
• Dynamic Performance & Elasticity: Self-Balancing Clusters | Tiered Storage
• Flexible DevOps Automation: Operator | Ansible
• GUI-driven Mgmt & Monitoring: Control Center

ARCHITECT: Production-stage Prerequisites
• Global Resilience: Multi-region Clusters | Cluster Linking
• Data Compatibility: Schema Registry | Schema Validation
• Enterprise-grade Security: RBAC | Secrets | Audit Logs

Freedom of Choice: Fully Managed Cloud Service | Self-managed Software
Committer-driven Expertise: Training | Partners | Enterprise Support | Professional Services
Apache Kafka: Open Source | Community licensed

Tiered Storage
Tiered Storage enables infinite data retention and elastic scalability by decoupling the compute and storage layers in Kafka.

Event streaming is storage-intensive:
[Figure: microservices, SFDC app, Splunk, device logs, object storage, mainframe, Hadoop, data stores, 3rd-party apps, custom apps / microservices, and logs all exchanging events]

Tiered Storage allows Kafka to recognize two layers of storage:
• Brokers
• Cost-effective object storage
Old data is offloaded from the brokers to the object store.

Tiered Storage delivers three primary benefits that revolutionize the way our customers experience Kafka:
• Infinite data retention: reimagine what event streaming apps can do
• Reduced infrastructure costs: offload data to cost-effective object storage
• Platform elasticity: scale compute and storage independently

Confluent Tiered Storage for Kafka

Use Cases for Reprocessing Historical Events
"Give me all events from time A to time B"

• New consumer application
• Error-handling
• Compliance / regulatory processing
• Query and analyze existing events
• Model training

[Figure: timeline with a real-time producer, a real-time consumer, and a consumer of historical data]

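The "time A to time B" request can be sketched in plain Python. This is only an illustrative model, not Kafka client code: `Record` and `replay_range` are hypothetical names, the log is an in-memory list, and timestamps are assumed to be non-decreasing within the partition (which Kafka does not guarantee for producer-assigned timestamps). The lookup mirrors the usual "first offset whose timestamp is at or after A" semantics.

```python
from bisect import bisect_left
from typing import List, NamedTuple


class Record(NamedTuple):
    """One event in a single partition (hypothetical model, not a Kafka type)."""
    offset: int
    timestamp_ms: int
    value: str


def replay_range(log: List[Record], start_ms: int, end_ms: int) -> List[Record]:
    """Return all records with start_ms <= timestamp_ms <= end_ms.

    Assumes `log` is ordered by offset and timestamps never decrease.
    """
    timestamps = [r.timestamp_ms for r in log]
    # First position whose timestamp is >= start_ms ("time A").
    first = bisect_left(timestamps, start_ms)
    selected = []
    for record in log[first:]:
        if record.timestamp_ms > end_ms:  # past "time B": stop reading
            break
        selected.append(record)
    return selected


# Ten events, 100 ms apart, starting at t=1000.
log = [Record(i, 1000 + i * 100, f"event-{i}") for i in range(10)]
replayed = replay_range(log, start_ms=1250, end_ms=1600)
print([r.offset for r in replayed])
```

With long retention, a consumer of historical data can run such a replay against events that have already left the brokers' local disks.
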
Tiered Storage in Control Center
Self-Balancing Clusters
Self-Balancing Clusters automate partition rebalances to improve Kafka’s performance, elasticity, and ease of operations.

Rebalances are required regularly to optimize cluster performance:
• Expansion
• Shrinkage
• Uneven load

Manual rebalance process:

    $ cat partitions-to-move.json
    {
      "partitions": [{
        "topic": "foo",
        "partition": 1,
        "replicas": [1, 2, 4]
      }, ...],
      "version": 1
    }
    $ kafka-reassign-partitions ...

Confluent Platform with Self-Balancing: no complex math, no risk of human error.

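To give a feel for the manual step, here is a minimal sketch that generates a reassignment file in the JSON shape shown on the slide. The function name `build_reassignment` and the naive round-robin placement are assumptions made for illustration; this is not the algorithm Self-Balancing Clusters uses, and a real plan would also weigh partition size and broker load.

```python
import json
from typing import Dict, List


def build_reassignment(topic: str, partition_count: int,
                       broker_ids: List[int], replication_factor: int) -> Dict:
    """Spread replicas round-robin over the given brokers (illustrative only)."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for partition in range(partition_count):
        # Replica i of partition p goes to the broker at position (p + i), wrapping around.
        replicas = [broker_ids[(partition + i) % len(broker_ids)]
                    for i in range(replication_factor)]
        partitions.append({"topic": topic, "partition": partition,
                           "replicas": replicas})
    return {"partitions": partitions, "version": 1}


# Example: topic "foo" with 4 partitions after a fourth broker joined.
plan = build_reassignment("foo", partition_count=4,
                          broker_ids=[1, 2, 3, 4], replication_factor=3)
print(json.dumps(plan, indent=2))
```

The printed JSON could be saved to a file and passed to `kafka-reassign-partitions` via `--reassignment-json-file` together with `--execute`. Every expansion, shrinkage, or load shift would need a new plan, which is the toil the slide refers to.
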
Self-Balancing Kafka in Control Center
Tiered Storage and Self-Balancing Clusters make Confluent Platform far more elastically scalable

Inelastic scaling process:
Adding a new Broker 4 to Brokers 1 to 3 means a lengthy, manual process for reassigning large topic partitions.

Scaling with Tiered Storage and Self-Balancing Clusters:
Adding a new Broker 4 to Brokers 1 to 3 is a fast, automated process for reassigning small topic partitions, with older data held in the object store.

ksqlDB
CP 6.0 ships with ksqlDB 0.10 and makes pull queries and embedded connectors generally available to simplify stream processing architectures.

Building event streaming applications on top of Kafka offers modern, real-time experiences to customers.

Event streaming applications require a complex, heavyweight architecture:
[Figure: an architecture of DBs, connectors, and apps, with components numbered 1 to 4]

ksqlDB makes stream processing more accessible by simplifying that architecture to just two components:
[Figure: ksqlDB providing connectors, stream processing, state stores, and push and pull queries, alongside apps and DBs]

Admin REST APIs
Confluent Platform introduces REST APIs for administrative operations to simplify Kafka management.

Confluent offers several options to run admin operations, including Control Center, the CLI, and Kafka clients. Admin REST APIs add even greater flexibility in how you manage Kafka:
• Describe, list, and configure brokers
• Create, delete, describe, list, and configure topics
• Delete, describe, and list consumer groups
• Create, delete, describe, and list ACLs
• List partition reassignments

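As a rough sketch of what an admin call over HTTP can look like, the snippet below builds (but does not send) two requests with Python's standard library. The base URL, the cluster ID, and the helper names are placeholders invented for this example. The `/v3/clusters/{cluster_id}/topics` path and the `topic_name`, `partitions_count`, and `replication_factor` fields follow the REST API v3 as I understand it, so check them against the documentation for your version.

```python
import json
from urllib.request import Request

# Placeholder values for illustration; replace with your own deployment's details.
BASE_URL = "http://localhost:8090/kafka"  # assumed address of the REST endpoint
CLUSTER_ID = "my-cluster-id"              # hypothetical cluster ID


def list_topics_request(base_url: str, cluster_id: str) -> Request:
    """Build a GET request that lists the topics of a cluster."""
    return Request(f"{base_url}/v3/clusters/{cluster_id}/topics", method="GET")


def create_topic_request(base_url: str, cluster_id: str, topic: str,
                         partitions: int, replication_factor: int) -> Request:
    """Build a POST request that creates a topic."""
    body = json.dumps({
        "topic_name": topic,
        "partitions_count": partitions,
        "replication_factor": replication_factor,
    }).encode("utf-8")
    return Request(f"{base_url}/v3/clusters/{cluster_id}/topics",
                   data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


list_req = list_topics_request(BASE_URL, CLUSTER_ID)
create_req = create_topic_request(BASE_URL, CLUSTER_ID, "orders", 6, 3)
print(list_req.get_method(), list_req.full_url)
print(create_req.get_method(), create_req.full_url, create_req.data.decode())
```

Passing either request to `urllib.request.urlopen` would send it. Because this is plain HTTP plus JSON, the same calls work from any language or from `curl`, which is the point of the "Multi-language Development" pillar.
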
Confluent REST API offers client APIs over HTTP

Management plane
• Brokers: describe cluster; alter configs
• Topics: create, delete, list, and describe; alter configs
• Consumer groups: delete, list, and describe; describe offsets
• ACLs: create, delete, list, and describe

Data plane
• Produce: stateless producer; serializer (Protobuf, JSON, Avro) with Schema Registry integration; compression
• Consume: stateless (fetch-like) and stateful (consumer group offsets) consumer; deserialize and decompress

Confluent REST API is ubiquitous

• Self-managed, dedicated node (Kafka REST Proxy): Dedicated REST Proxy nodes isolate the workload and can be used with Confluent Server and Apache Kafka.
• Self-managed, broker plugin (Confluent Server REST): The Confluent Server REST Plugin provides an out-of-the-box REST interface for Confluent Server clusters.
• Fully managed, Confluent Cloud (Confluent Cloud REST): A fully managed REST interface extends Confluent Cloud with the same elasticity and availability guarantees.

Cluster Linking (preview)
Cluster Linking simplifies hybrid-cloud and multi-cloud deployments for Kafka.

Hybrid-cloud and multi-cloud strategies offer significant benefits to businesses:
• Remove data silos and ensure data exists wherever your business needs it
• Leverage best-of-breed solutions across different public cloud providers
• Offload data infrastructure to fully managed services, like Confluent Cloud
• Avoid vendor lock-in and utilize the most cost-effective vendors

Sharing data between independent clusters or migrating clusters presents two challenges:
1. Requires deploying a separate Connect cluster
2. Offsets are not preserved, so messages are at risk of being skipped or reread

[Figure: Topic 1 in DC 1 and Topic 1 in DC 2 hold the same messages at different offsets (0 1 2 3 4 ... versus 4 5 6 7 8 ...)]

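The offset problem can be shown with a small, deliberately simplified model. A partition is a plain dict from offset to message, and the two helper functions are hypothetical names for this sketch. Neither reflects how a Connect-based replicator or Cluster Linking is actually implemented; the sketch only contrasts "re-produce the messages" with "keep the offsets".

```python
from typing import Dict


def copy_by_reproducing(source: Dict[int, str]) -> Dict[int, str]:
    """Copy messages by producing them again: the destination assigns new offsets from 0."""
    ordered_values = [source[offset] for offset in sorted(source)]
    return dict(enumerate(ordered_values))


def mirror_preserving_offsets(source: Dict[int, str]) -> Dict[int, str]:
    """Copy messages so that each one keeps the offset it had in the source."""
    return dict(source)


# Source partition after retention removed offsets 0 to 3.
source = {4: "e4", 5: "e5", 6: "e6", 7: "e7", 8: "e8"}
committed_offset = 6  # the consumer's next message to read in the source cluster

reproduced = copy_by_reproducing(source)
mirrored = mirror_preserving_offsets(source)

print(source[committed_offset])          # message the consumer expects next
print(reproduced.get(committed_offset))  # nothing lives at offset 6 in the copy
print(mirrored[committed_offset])        # same message, same offset
```

In the re-produced copy, the committed offset no longer points at the expected message, so a migrated consumer would either skip or reread data unless offsets are translated separately.
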
Cluster Linking requires no additional infrastructure and preserves offsets:
• Migrate Apache Kafka clusters to Confluent Cloud

Cluster Linking also offers a cost-effective, secure, and performant transport layer between public clouds:
• Without Cluster Linking: high networking costs, complex management
• With Cluster Linking: low networking costs, move once, read many

Confluent Platform 6.0 launches with the latest Apache Kafka 2.6 version
• Performance improvements
• Better scalability
• Security updates
• New features
• Work on ZooKeeper removal

Check out the release notes for more updates (e.g. about Audit Logs and RBAC):
https://docs.confluent.io/current/release-notes/index.html

Questions? Feedback? Let’s connect!
Kai Waehner, Technology Evangelist
contact@kai-waehner.de | LinkedIn | @KaiWaehner | www.kai-waehner.de | www.confluent.io

New Features in Confluent Platform 6.0 / Apache Kafka 2.6
