Agenda
1. Use Cases for Event Streaming
2. Apache Kafka as Mission-Critical Infrastructure
3. Streamsheets
4. Live Demo
Confluent Customers across Industries
Business Value per Use Case
Business value: Improve Customer Experience (CX) | Increase Revenue (make money) | Decrease Costs (save money) | Mitigate Risk (protect money)
Core Business Platform | Increase Operational Efficiency | Migrate to Cloud
Slide row labels: Key Drivers, Strategic Objectives (sample), Example Use Cases, Example Case Studies (of many); $↑ $↓ $
Example use cases:
• Fraud Detection
• IoT sensor ingestion
• Digital replatforming / Mainframe Offload
• Customer 360
• Faster transactional processing / analysis incl. Machine Learning / AI
• Microservices Architecture
• Online Fraud Detection
• Online Security (syslog, log aggregation, Splunk replacement)
• Middleware replacement
• Regulatory
• Digital Transformation
• Website / Core Operations (Central Nervous System)
• Real-time app updates
Example case studies (of many):
• Connected Car: Navigation & improved in-car experience: Audi
• Simplifying Omni-channel Retail at Scale: Target
• Mainframe Offload: RBC
• Application Modernization: Multiple Examples
• The [Silicon Valley] Digital Natives: LinkedIn, Netflix, Uber, Yelp...
• Predictive Maintenance: Audi
• Streaming Platform in a regulated environment (e.g. Electronic Medical Records): Celmatix
• Real Time Streaming Platform for Communications and Beyond: Capital One
• Developer Velocity - Building Stateful Financial Applications with Kafka Streams: Funding Circle
• Detect Fraud & Prevent Fraud in Real Time: PayPal
• Kafka as a Service - A Tale of Security and Multi-Tenancy: Apple
Event Streaming Platform – The Commit Log
[Figure: a commit log along a time axis, written by P and read independently by C1, C2 and C3]
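The commit-log idea on this slide can be sketched in a few lines of Python. This is only an in-memory illustration, assuming that P stands for a producer and C1 to C3 for consumers; the class names `CommitLog` and `Consumer` are hypothetical and are not part of any Kafka client API.

```python
from dataclasses import dataclass, field


@dataclass
class CommitLog:
    """Append-only log: records are only ever added at the end."""
    records: list = field(default_factory=list)

    def append(self, record):
        """Producer side: append a record and return its offset."""
        self.records.append(record)
        return len(self.records) - 1

    def read(self, offset, max_records=10):
        """Consumer side: read up to max_records starting at offset."""
        return self.records[offset:offset + max_records]


@dataclass
class Consumer:
    """Each consumer keeps its own offset, so readers do not affect each other."""
    name: str
    offset: int = 0

    def poll(self, log, max_records=10):
        batch = log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch


log = CommitLog()
for event in ["e0", "e1", "e2", "e3"]:
    log.append(event)

c1, c2, c3 = Consumer("C1"), Consumer("C2"), Consumer("C3", offset=2)
first = c1.poll(log, max_records=1)  # C1 reads slowly, one record at a time
second = c2.poll(log)                # C2 reads everything available
third = c3.poll(log)                 # C3 started later in the log, at offset 2
```

Reading does not remove anything from the log, which is why three consumers at different positions can share the same data.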
Event Streaming Platform – A Distributed System for 24/7 and Zero Data Loss
[Figure: Brokers 1 to 4 hosting Topic1; each of partition1 to partition4 is stored on three brokers, one replica marked Leader and the others Follower]
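The replica layout in the figure can be approximated with a small Python sketch. It assumes a plain round-robin placement and promotes the first surviving replica on failure; both are simplifications for illustration, not Kafka's actual assignment or leader-election logic, and the function names are hypothetical.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition on `replication_factor` brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment = {}
    for p in range(num_partitions):
        replicas = [brokers[(p + r) % len(brokers)] for r in range(replication_factor)]
        # The first replica acts as leader, the rest as followers.
        assignment[p + 1] = {"leader": replicas[0], "followers": replicas[1:]}
    return assignment


def fail_over(assignment, failed_broker):
    """Drop the failed broker and promote the first surviving replica if needed."""
    result = {}
    for partition, placement in assignment.items():
        replicas = [placement["leader"], *placement["followers"]]
        survivors = [b for b in replicas if b != failed_broker]
        result[partition] = {"leader": survivors[0], "followers": survivors[1:]}
    return result


brokers = ["broker1", "broker2", "broker3", "broker4"]
layout = assign_replicas(num_partitions=4, brokers=brokers, replication_factor=3)
after_failure = fail_over(layout, "broker1")
```

With three copies of every partition, losing one broker still leaves each partition with a leader and at least one follower.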
An Event Streaming Platform is the Underpinning of an Event-driven Architecture
[Figure labels: Producers and Consumers around streams of real-time events (database changes, microservices events, SaaS data, customer experiences); Microservices, Mainframes, SaaS apps, Mobile, Customer 360, Real-time fraud detection, Data warehouse; Connectors; Stream processing apps]
Apache Kafka at Scale at Tech Giants
• > 7 trillion messages / day
• > 6 Petabytes / day
• “You name it”
* Kafka is not just used by tech giants
** Kafka is not just used for big data
Domain-Driven Design and Decoupled Applications
[Figure labels: Event Streaming Platform (Kafka Cluster, Kafka Connect, Schema Registry, Audit Logs, RBAC, etc.); CRM Domain (CRM Integration); Legacy (Legacy Integration, Mainframe Connector); Payment Domain; Fraud Domain; Custom Application (Java / C++ / Go / KSQL / Streamsheets)]
Streaming Analytics for Fraud Detection at Scale
[Figure labels: Payment App, Other Components, Integration Layer, Streaming Platform, Big Data, Batch Analytics Platform, BI Dashboard, Real Time Alerting System, Human Intelligence; flows: Ingest Data, All Data, Alert]
www.kai-waehner.de | @KaiWaehner
Disaster Recovery – RPO and RTO
• RPO = Recovery Point Objective
• RTO = Recovery Time Objective
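As a rough illustration of the two terms: RPO bounds how much data (measured as a time window) may be lost in a disaster, and RTO bounds how long the service may be down. The sketch below compares a hypothetical incident against both objectives; the function names and all timestamps are made up for the example.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_replicated_at, failure_at, service_restored_at):
    """Return (data-loss window, downtime) for one incident."""
    data_loss_window = failure_at - last_replicated_at
    downtime = service_restored_at - failure_at
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo, rto):
    """An incident is within objectives if it stays within both RPO and RTO."""
    return data_loss_window <= rpo and downtime <= rto


failure = datetime(2020, 1, 1, 12, 0, 0)
loss, downtime = recovery_metrics(
    last_replicated_at=failure - timedelta(seconds=30),  # async replication lag
    failure_at=failure,
    service_restored_at=failure + timedelta(minutes=5),
)

ok_relaxed = meets_objectives(loss, downtime, rpo=timedelta(minutes=1), rto=timedelta(minutes=10))
ok_zero = meets_objectives(loss, downtime, rpo=timedelta(0), rto=timedelta(0))
```

An incident with 30 seconds of unreplicated data and five minutes of downtime passes the relaxed objectives but fails RPO=0 and RTO=0.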
Kafka Clusters can Stretch over Regions
Multi-Region Cluster (only available in Confluent Platform)
• Zero downtime + zero data loss (RPO=0 and RTO=0)
• E.g. stretched over US East + Mid + West
• Automate disaster recovery
• Sync or async replication per topic
• Offset preserving
• Automated client failover without custom code
Example of a Multi-Region Cluster in a Bank
Large FinServ Customer (only available in Confluent Platform)
[Figure: topics Payment, Log and Location in two regions; Payment replicated synchronously, Log and Location asynchronously]
● ‘Payment’ transactions enter from us-east and us-west with fully synchronous replication
● ‘Log’ and ‘Location’ information in the same cluster use async replication, optimized for latency
● Automated disaster recovery (zero downtime, zero data loss)
Result: Clearing time from ‘deposit’ to ‘available’ goes from 5 days to 5 seconds (including security checks)
Global Event Streaming
• Aggregate Small Footprint Edge Deployments with Replication (Aggregation)
• Simplify Disaster Recovery Operations with Multi-Region Clusters with RPO=0 and RTO=0
• Stream Data Globally with Replication and Cluster Linking
Confluent Platform
Freedom of choice: Fully Managed Cloud Service | Self Managed Software
Committer-driven expertise: Partners | Training | Professional Services | Enterprise Support
Apache Kafka (legend: Open Source | Community licensed)
Developer: unrestricted developer productivity
• Multi-language Development: non-Java clients | REST Proxy
• Rich Pre-built Ecosystem: Connectors | Hub | Schema Registry
• SQL-based Stream Processing: KSQL (ksqlDB)
Operator: efficient operations at scale
• GUI-driven Mgmt & Monitoring: Control Center
• Flexible DevOps Automation: Operator | Ansible
• Dynamic Performance & Elasticity: Auto Data Balancer | Tiered Storage
Architect: production-stage prerequisites
• Enterprise-grade Security: RBAC | Secrets | Audit logs
• Data Compatibility: Schema Registry | Schema Validation
• Global Resilience: Multi-Region Clusters | Replicator
Executive Buyer: partnership for business success
• Complete Engagement Model
• Revenue / Cost / Risk Impact
• TCO / ROI
Stream processing & IoT for everybody
Eclipse Streamsheets: Anybody who knows how to use a spreadsheet can quickly build server-based, real-time applications for any purpose. No programming required.
About Cedalo
• IoT & Stream-Processing Startup
• Founded 2017, Open-Source & Premium offerings
• Cedalo is the main sponsor of 2 of the 6 key Eclipse IoT projects:
  • Eclipse Streamsheets: anybody with spreadsheet skills can quickly build server-based, real-time applications, for any purpose.
  • Eclipse Mosquitto: devices and apps communicate in real-time, based on Pub/Sub and the most popular broker technology in the world
Typical Streamsheet Use Cases in the Industry
Most of our user and customer base is rooted in the industrial sector
[Figure panels: Open-Source users; Commercial customers]
How Streamsheets fit in the Kafka ecosystem
Streamsheets consume and produce Kafka events
• Subscribe to Kafka topic
• Publish to Kafka topic
• They execute all sheet formulas for each arriving event.
• Streamsheets use standard spreadsheet formulas and many new stream-specific formulas for aggregating, edge detection, time series, …
• Streamsheets run as Docker-based microservices 24/7
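The processing model described above (recalculate every formula for each arriving event, then publish the result) can be mimicked with a short Python sketch. It is not Streamsheets code and does not talk to Kafka: the inbox and outbox are plain in-memory collections standing in for a subscribed and a published topic, and `run_sheet` and the formula names are hypothetical.

```python
from collections import deque


def run_sheet(inbox, formulas):
    """Evaluate every formula once per arriving event and collect the results."""
    outbox = []                                # stands in for the published topic
    state = {"count": 0, "previous": None}     # minimal state for stream formulas
    while inbox:
        event = inbox.popleft()                # next event from the subscribed topic
        state["count"] += 1
        cells = {name: formula(event, state) for name, formula in formulas.items()}
        state["previous"] = event
        outbox.append(cells)
    return outbox


formulas = {
    # Ordinary spreadsheet-style calculations on the current event
    "celsius": lambda e, s: e["temp_c"],
    "fahrenheit": lambda e, s: e["temp_c"] * 9 / 5 + 32,
    # Stream-specific calculations that look at earlier events
    "rising": lambda e, s: s["previous"] is not None and e["temp_c"] > s["previous"]["temp_c"],
    "events_seen": lambda e, s: s["count"],
}

inbox = deque([{"temp_c": 20}, {"temp_c": 25}, {"temp_c": 25}])
published = run_sheet(inbox, formulas)
```

The `rising` formula is a simple stand-in for edge detection: it is true only when the value increases from one event to the next.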
Streamcharts in Streamsheets (Standard & Time-Series)
Customer Story How the Freiburg University Hospital uses Kafka and Streamsheets to monitor the utilization of clinical assets in real-time
Freiburg University Hospital uses Eclipse Streamsheets to connect data sources to Kafka, process streams and visualize insights in real-time.
[Figure labels: Reception, Labs, Point of Care, IBM Integration Bus, Pre-processing service (MQTT -> FHIR -> Kafka), Patient mgmt. system (PMDS), 3rd-Party Microservices, Real-time dashboards]
• Patient data is created in many different places throughout the hospital.
• This data is transformed, standardized and streamed into Kafka.
• The PMDS connects directly to the cluster.
• Streamsheet dashboards consume aggregated real-time data to visualize the current capacities of devices and beds per department.
15-Minute Live Demonstration
Building a credit card fraud detection dashboard and alerter, based on Confluent Cloud, ksqlDB and Cedalo Cloud, just using spreadsheet formulas.
Fraud-Detection: Contactless Credit-Card Payment
• With contactless payment there is a risk of fraud if thieves get near your wallet
• In March 2020 the limit for contactless payments was doubled to 50 Euros
• Use-Case Assumption: Thieves will try to "optimize" the contactless robbery and transfer in the range of 40 to 50 €
• We are building a Kafka / KSQL / Streamsheet application to detect suspicious accounts, alert in real time via Slack and publish Top 3 suspicious accounts back to Kafka
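The detection logic outlined in these bullets can be sketched in plain Python. This is not the demo's implementation (which uses ksqlDB and spreadsheet formulas); the alert threshold of three payments, the field names and the sample transactions are assumptions made for illustration, and the sketch neither sends Slack messages nor publishes to Kafka.

```python
from collections import Counter

SUSPICIOUS_MIN, SUSPICIOUS_MAX = 40, 50  # range from the use-case assumption (EUR)
ALERT_THRESHOLD = 3                      # assumed: alert at 3 or more suspicious payments


def suspicious_counts(transactions):
    """Count, per receiving account, the payments in the suspicious range."""
    counts = Counter()
    for tx in transactions:
        if SUSPICIOUS_MIN <= tx["amount"] <= SUSPICIOUS_MAX:
            counts[tx["account"]] += 1
    return counts


def top_suspects(counts, n=3):
    """Rank accounts by count (descending), breaking ties by account name."""
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


def accounts_to_alert(counts, threshold=ALERT_THRESHOLD):
    """Accounts that would trigger a real-time alert."""
    return sorted(acc for acc, c in counts.items() if c >= threshold)


transactions = [
    {"account": "A", "amount": 45}, {"account": "A", "amount": 49},
    {"account": "A", "amount": 41}, {"account": "B", "amount": 12},
    {"account": "B", "amount": 50}, {"account": "C", "amount": 40},
    {"account": "C", "amount": 44}, {"account": "D", "amount": 75},
    {"account": "E", "amount": 42},
]

counts = suspicious_counts(transactions)
top3 = top_suspects(counts)
to_alert = accounts_to_alert(counts)
```

In the demo setup, the ranked list would be what gets published back to Kafka and the alert list what gets posted to Slack.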
How Streamsheets fit in the Kafka ecosystem Streamsheets ksqlDB
ksqlDB Table Query on Confluent Cloud

    CREATE TABLE RECEIVES AS
      SELECT ACCOUNT,
             COUNT(CASE WHEN AMOUNT BETWEEN 40 AND 50 THEN 1 ELSE NULL END) AS POSS_FRAUD_C,
             COUNT(CASE WHEN AMOUNT < 40 THEN 1 ELSE NULL END) AS POSS_SMALLER_C,
             COUNT(CASE WHEN AMOUNT > 50 THEN 1 ELSE NULL END) AS POSS_LARGER_C,
             WINDOWSTART AS WINDOW_START,
             WINDOWEND AS WINDOW_END
      FROM TRANSACTIONS
      WINDOW TUMBLING (SIZE 120 SECONDS)
      WHERE TYPE = 'RECEIVE'
      GROUP BY ACCOUNT
      EMIT CHANGES;

This is what you will see in the Streamsheet later.
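To make the windowing in this query easier to follow, here is a rough Python equivalent of the aggregation. It is only a sketch: timestamps are assumed to be integer seconds (ksqlDB itself works with epoch milliseconds), the record field names `ts`, `account`, `amount` and `type` are hypothetical, and the result is computed in one batch rather than emitted continuously.

```python
from collections import defaultdict

WINDOW_SIZE_S = 120  # matches WINDOW TUMBLING (SIZE 120 SECONDS)


def window_start(ts, size=WINDOW_SIZE_S):
    """Tumbling windows are fixed, non-overlapping buckets aligned to the size."""
    return ts - ts % size


def aggregate(transactions):
    """Count amounts per (account, window) in the three ranges used by the query."""
    table = defaultdict(lambda: {"POSS_FRAUD_C": 0, "POSS_SMALLER_C": 0, "POSS_LARGER_C": 0})
    for tx in transactions:
        if tx["type"] != "RECEIVE":            # WHERE TYPE = 'RECEIVE'
            continue
        start = window_start(tx["ts"])
        row = table[(tx["account"], start, start + WINDOW_SIZE_S)]  # GROUP BY ACCOUNT per window
        if 40 <= tx["amount"] <= 50:
            row["POSS_FRAUD_C"] += 1
        elif tx["amount"] < 40:
            row["POSS_SMALLER_C"] += 1
        else:
            row["POSS_LARGER_C"] += 1
    return dict(table)


transactions = [
    {"ts": 10, "account": "A", "amount": 45, "type": "RECEIVE"},
    {"ts": 100, "account": "A", "amount": 48, "type": "RECEIVE"},
    {"ts": 119, "account": "A", "amount": 10, "type": "RECEIVE"},
    {"ts": 120, "account": "A", "amount": 42, "type": "RECEIVE"},
    {"ts": 130, "account": "B", "amount": 60, "type": "RECEIVE"},
    {"ts": 140, "account": "B", "amount": 45, "type": "SEND"},
]

result = aggregate(transactions)
```

Each key of `result` is an (account, window start, window end) triple, mirroring the ACCOUNT, WINDOW_START and WINDOW_END columns of the table.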
Mission critical - Enterprise ready
• Streamsheets are built in the client's web browser in a spreadsheet-like user interface, but during run-time the grid as well as the row and column headers disappear, and formulas are locked. → No spreadsheet look at run-time
• Streamsheets run 24/7 as real-time apps and eliminate the well-known "Spreadsheet Hell", as they rely on the Kafka Cluster, Kafka Connect, ksqlDB and Schema Registry.
[Figure panels: Build-time; Run-time; Spreadsheet Hell]
Get started on www.cedalo.com
Get Started! Open-Source, Premium or Cloud
• Managed Service: The easiest way to get started with the Premium Edition is our managed service in the Cedalo Cloud.
• On-Premises: You can run both the Community and the Premium Edition on-premises on your local servers or private cloud.
• On-Devices: Our solutions need little CPU power. It is also possible to run them directly on edge devices & IoT controllers.
Questions? Contact us
Kai Waehner • contact@kai-waehner.de • www.kai-waehner.de • @KaiWaehner
Kristian Raue • kristian.raue@cedalo.com • http://www.cedalo.com • @cedalo_com

Streamsheets and Apache Kafka – Interactively build real-time Dashboards and Streaming Apps just using your Spreadsheet Skills
