Apache Pulsar with MQTT for Edge Computing Timothy Spann | Developer Advocate Big Mountain Data and Dev Conference
Tim Spann Developer Advocate ● https://www.datainmotion.dev/ ● https://github.com/tspannhw/SpeakerProfile ● https://dev.to/tspannhw ● https://sessionize.com/tspann/ DZone Zone Leader and Big Data MVB Data DJay
Founded by the original developers of Apache Pulsar and Apache BookKeeper, StreamNative builds a cloud-native event streaming platform that enables enterprises to easily access data as real-time event streams.
Apache Pulsar
Apache is an open source, cloud-native distributed messaging and streaming platform.
What are the Benefits of Pulsar? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
Apache Pulsar
A Unified Messaging Platform Message Queuing Data Streaming
Pulsar is built for easy scale-out. *Illustrations by Jack Vanlightly
Key Milestones 2012 2016 2017 2018 2019 2020 Originally developed inside Yahoo! as “Cloud Messaging Service” Pulsar is committed to Open Source Pulsar is accepted into the Apache Software Foundation Pulsar becomes a Top-Level Project ● StreamNative is founded and seed round raised. ● Tencent adopts Pulsar for payment processing platform. ● BestPay adopts Pulsar for payment processing. ● Pulsar hits 200 contributors. ● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees ● Pulsar hits 340 contributors ● StreamNative and OVHCloud launch Kafka on Pulsar (KoP) ● StreamNative + China Mobile launch AMQP on Pulsar (AoP) ● Pulsar Ecosystem expands - StreamNative Hub launches ● StreamNative Cloud launches on GCP and Alibaba Cloud ● StreamNative customer adoption continues - new customers include Flipkart and Applied Materials ● Pulsar 2.7 + Transactions ● Pulsar Flink Connector 2.7 Major increase in adoption following TLP designation in 2018 2021 ● 3 global Pulsar conferences ● StreamNative hits 400 contributors (June). ● Pulsar surpasses Kafka in monthly active contributors. ● Pulsar 2.8 + Exactly-Once semantics ● StreamNative Platform launches
Apache Pulsar Overview Enable Geo-Replicated Messaging ● Pub-Sub ● Geo-Replication ● Pulsar Functions ● Horizontal Scalability ● Multi-tenancy ● Tiered Persistent Storage ● Pulsar Connectors ● REST API ● CLI ● Many clients available ● Four Different Subscription Types ● Multi-Protocol Support ○ MQTT ○ AMQP ○ JMS ○ Kafka ○ ...
Pulsar’s Publish-Subscribe model Broker Subscription Consumer 1 Consumer 2 Consumer 3 Topic Producer 1 Producer 2 ● Producers send messages. ● Topics are an ordered, named channel that producers use to transmit messages to subscribed consumers. ● Messages belong to a topic and contain an arbitrary payload. ● Brokers handle connections and routes messages between producers / consumers. ● Subscriptions are named configuration rules that determine how messages are delivered to consumers. ● Consumers receive messages.
What is the Pulsar Ecosystem? ● Functions and Connectors ○ Functions: Lightweight stream processing ○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink” APIs ■ Files, Databases, Data tools, Cloud Services, etc ● Protocol Handlers ○ Allows Pulsar to handle additional protocols by an extendable API running in the broker ■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
Topics Tenants (Compliance) Tenants (Data Services) Namespace (Microservices) Topic-1 (Cust Auth) Topic-1 (Location Resolution) Topic-2 (Demographics) Topic-1 (Budgeted Spend) Topic-1 (Acct History) Topic-1 (Risk Detection) Namespace (ETL) Namespace (Campaigns) Namespace (ETL) Tenants (Marketing) Namespace (Risk Assessment) Pulsar Instance Pulsar Cluster
Pulsar subscription modes Different subscription modes have different semantics: Exclusive/Failover - guaranteed order, single active consumer Shared - multiple active consumers, no order Key_Shared - multiple active consumers, order for given key Producer 1 Producer 2 Pulsar Topic Subscription D Consumer D-1 Consumer D-2 Key-Shared < K 1, V 10 > < K 1, V 11 > < K 1, V 12 > < K 2 ,V 2 0 > < K 2 ,V 2 1> < K 2 ,V 2 2 > Subscription C Consumer C-1 Consumer C-2 Shared < K 1, V 10 > < K 2, V 21 > < K 1, V 12 > < K 2 ,V 2 0 > < K 1, V 11 > < K 2 ,V 2 2 > Subscription A Consumer A Exclusive Subscription B Consumer B-1 Consumer B-2 In case of failure in Consumer B-1 Failover
Reader and Batch API Pulsar IO/Connectors Stream Processor Applications Prebuilt Connectors Custom Connectors Microservices or Event-Driven Architecture Pub/Sub API Publisher Subscriber Admin API Operators & Administrators Teams Tenant Pulsar API Design
What is the Pulsar Ecosystem? (cont’d) ● Processing Engines ○ Supports modern processing engines ■ Flink and Spark, as well as Pulsar SQL (Presto/Trino) ● Offloaders ○ Allows data to be offloaded to cloud storage and used with existing Pulsar APIs ■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage (in Pulsar 2.7.0)
MQTT on Pulsar (MoP)
MQTT on Pulsar (MoP) Configuration messagingProtocols=mqtt # directory protocolHandlerDirectory=./protocols #mqtt 3.1.1 - port / ip mqttListeners=mqtt://127.0.0.1:1883 advertisedAddress=127.0.0.1
Kafka-on-Pulsar (Kop)
Pulsar Functions Provides a simple API to: ● Receive a message (consume) ● Process the message using your own code ● Send a message (produce) Takes care of the boilerplate code so there is no need to create producers and consumers.
Moving Data In and Out of Pulsar IO/Connectors are a simple way to integrate with external systems and move data in and out of Pulsar. ● Built on top of Pulsar Functions ● Built-in connectors - hub.streamnative.io Source Sink
Use Azure BlobStore offloader with Pulsar https://pulsar.apache.org/docs/en/tiered-storage-azure/
24 streamnative.io Apache Pulsar - Other Sinks https://hub.streamnative.io/connectors/cloud-storage-sink/2.5.1/ mongoDB AWS Lambda redis AWS S3 GCS
Ingesting IoT Data via Java Pulsar https://github.com/tspannhw/StreamingAnalyticsUsingFlinkSQL/
Ingesting IoT Data via Java Pulsar
Pulsar SQL Presto/Trino workers can read segments directly from bookies (or offloaded storage) in parallel. Bookie 1 Segment 1 Producer Consumer Broker 1 Topic1-Part1 Broker 2 Topic1-Part2 Broker 3 Topic1-Part3 Segment 2 Segment 3 Segment 4 Segment X Segment 1 Segment 1 Segment 1 Segment 3 Segment 3 Segment 3 Segment 2 Segment 2 Segment 2 Segment 4 Segment 4 Segment 4 Segment X Segment X Segment X Bookie 2 Bookie 3 Query Coordinator ... ... SQL Worker SQL Worker SQL Worker SQL Worker Query Topic Metadata
Query Your Topics with Pulsar SQL (Trino)
StreamNative Cloud
Powered by Apache Pulsar, StreamNative provides a cloud-native, real-time messaging and streaming platform to support multi-cloud and hybrid cloud strategies. Built for Containers Cloud Native StreamNative Cloud Flink SQL
Best Practice Architectures
StreamNative Hub StreamNative Cloud Unified Batch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores Tiered Storage Pulsar --- KoP --- MoP --- Websocket --- HTTP Pulsar Sink Pulsar Sink Streaming Edge Gateway Protocols End-to-End Streaming FLiP(N) Apps
Demo
NVIDIA Device
MQTT from Python pip3 install paho-mqtt import paho.mqtt.client as mqtt client = mqtt.Client("rpi4-iot") row = { } row['gasKO'] = str(readings) json_string = json.dumps(row) json_string = json_string.strip() client.connect("pulsar-server.com", 1883, 180) client.publish("persistent://public/default/mqtt-2", payload=json_string, qos=0, retain=True)
37 Using NVIDIA Jetson Devices With Pulsar https://dev.to/tspannhw/unboxing-the-most-amazing-edge-ai-devic e-part-1-of-3-nvidia-jetson-xavier-nx-595k https://github.com/tspannhw/minifi-xaviernx/ https://github.com/tspannhw/minifi-jetson-nano https://github.com/tspannhw/Flip-iot https://www.datainmotion.dev/2020/10/flank-streaming-edgeai-on- new-nvidia.html https://github.com/tspannhw/FLiP-Mobile/blob/30bcc1ec98fc31e0 39b51a06180d98545c1e0542/python3/enviro.py
Demo Walkthrough {"entriesAddedCounter":1,"numberOfEntrie s":1,"totalSize":651,"currentLedgerEntries":1," currentLedgerSize":651,"lastLedgerCreated Timestamp":"2021-09-13T16:13:06.6-04:00"," waitingCursorsCount":0,"pendingAddEntri esCount":0,"lastConfirmedEntry":"7076:0"," state":"LedgerOpened","ledgers":[{"ledgerId ":7076,"entries":0,"size":0,"offloaded":false,"u nderReplicated":false}],"cursors":{},"schema Ledgers":[],"compactedLedger":{"ledgerId": -1,"entries":-1,"size":-1,"offloaded":false,"unde rReplicated":false}}
Wrap-Up
Connect with the Community & Stay Up-To-Date ● Join the Pulsar Slack channel - Apache-Pulsar.slack.com ● Follow @streamnativeio and @apache_pulsar on Twitter ● Subscribe to Monthly Pulsar Newsletter for major news, events, project updates, and resources in the Pulsar community
● https://github.com/tspannhw/FLiP-Energy/ ● https://github.com/tspannhw/Flip-iot ● https://github.com/tspannhw/Flip-jetson ● https://github.com/streamnative/pulsar-flink ● https://github.com/streamnative/mop ● https://www.linkedin.com/pulse/2021-schedule-tim-spann/ ● https://github.com/tspannhw/FLiP-InfluxDB/blob/main/README.md ● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud ● https://docs.streamnative.io/cloud/stable/compute/flink-sql Deeper Content @PaasDev https://www.pulsardeveloper.com/ timothyspann
Interested In Learning More? Flink SQL Cookbook The Github Source for Flink SQL Demo The GitHub Source for Demo Manning's Apache Pulsar in Action O’Reilly Book [10/21] Trino Summit Resources Free eBooks Upcoming Events
43 Pulsar Summit Asia November 20-21, 2021 Contact us at partners@pulsar-summit.org to become a sponsor or partner
Let’s Keep in Touch! Speaker Name Speaker title @PassDev https://www.linkedin.com/in/timothyspann https://github.com/tspannhw
Questions
Event Partners Thank you to our Sponsors Premier Sponsors

Big mountain data and dev conference apache pulsar with mqtt for edge computing

  • 1.
    Apache Pulsar withMQTT for Edge Computing Timothy Spann | Developer Advocate Big Mountain Data and Dev Conference
  • 2.
    Tim Spann Developer Advocate ●https://www.datainmotion.dev/ ● https://github.com/tspannhw/SpeakerProfile ● https://dev.to/tspannhw ● https://sessionize.com/tspann/ DZone Zone Leader and Big Data MVB Data DJay
  • 3.
    Founded by theoriginal developers of Apache Pulsar and Apache BookKeeper, StreamNative builds a cloud-native event streaming platform that enables enterprises to easily access data as real-time event streams.
  • 4.
  • 5.
    Apache is anopen source, cloud-native distributed messaging and streaming platform.
  • 6.
    What are theBenefits of Pulsar? Data Durability Scalability Geo-Replication Multi-Tenancy Unified Messaging Model
  • 7.
  • 8.
    A Unified MessagingPlatform Message Queuing Data Streaming
  • 9.
    Pulsar is builtfor easy scale-out. *Illustrations by Jack Vanlightly
  • 10.
    Key Milestones 2012 20162017 2018 2019 2020 Originally developed inside Yahoo! as “Cloud Messaging Service” Pulsar is committed to Open Source Pulsar is accepted into the Apache Software Foundation Pulsar becomes a Top-Level Project ● StreamNative is founded and seed round raised. ● Tencent adopts Pulsar for payment processing platform. ● BestPay adopts Pulsar for payment processing. ● Pulsar hits 200 contributors. ● 2 global Pulsar conferences, 80+ speakers, 1,500+ attendees ● Pulsar hits 340 contributors ● StreamNative and OVHCloud launch Kafka on Pulsar (KoP) ● StreamNative + China Mobile launch AMQP on Pulsar (AoP) ● Pulsar Ecosystem expands - StreamNative Hub launches ● StreamNative Cloud launches on GCP and Alibaba Cloud ● StreamNative customer adoption continues - new customers include Flipkart and Applied Materials ● Pulsar 2.7 + Transactions ● Pulsar Flink Connector 2.7 Major increase in adoption following TLP designation in 2018 2021 ● 3 global Pulsar conferences ● StreamNative hits 400 contributors (June). ● Pulsar surpasses Kafka in monthly active contributors. ● Pulsar 2.8 + Exactly-Once semantics ● StreamNative Platform launches
  • 11.
    Apache Pulsar Overview EnableGeo-Replicated Messaging ● Pub-Sub ● Geo-Replication ● Pulsar Functions ● Horizontal Scalability ● Multi-tenancy ● Tiered Persistent Storage ● Pulsar Connectors ● REST API ● CLI ● Many clients available ● Four Different Subscription Types ● Multi-Protocol Support ○ MQTT ○ AMQP ○ JMS ○ Kafka ○ ...
  • 12.
    Pulsar’s Publish-Subscribe model Broker Subscription Consumer1 Consumer 2 Consumer 3 Topic Producer 1 Producer 2 ● Producers send messages. ● Topics are an ordered, named channel that producers use to transmit messages to subscribed consumers. ● Messages belong to a topic and contain an arbitrary payload. ● Brokers handle connections and routes messages between producers / consumers. ● Subscriptions are named configuration rules that determine how messages are delivered to consumers. ● Consumers receive messages.
  • 13.
    What is thePulsar Ecosystem? ● Functions and Connectors ○ Functions: Lightweight stream processing ○ Connectors: Part of “Pulsar IO”, includes “Source” and “Sink” APIs ■ Files, Databases, Data tools, Cloud Services, etc ● Protocol Handlers ○ Allows Pulsar to handle additional protocols by an extendable API running in the broker ■ AoP (AMQP), KoP (Kafka), MoP (MQTT)
  • 14.
    Topics Tenants (Compliance) Tenants (Data Services) Namespace (Microservices) Topic-1 (Cust Auth) Topic-1 (LocationResolution) Topic-2 (Demographics) Topic-1 (Budgeted Spend) Topic-1 (Acct History) Topic-1 (Risk Detection) Namespace (ETL) Namespace (Campaigns) Namespace (ETL) Tenants (Marketing) Namespace (Risk Assessment) Pulsar Instance Pulsar Cluster
  • 15.
    Pulsar subscription modes Differentsubscription modes have different semantics: Exclusive/Failover - guaranteed order, single active consumer Shared - multiple active consumers, no order Key_Shared - multiple active consumers, order for given key Producer 1 Producer 2 Pulsar Topic Subscription D Consumer D-1 Consumer D-2 Key-Shared < K 1, V 10 > < K 1, V 11 > < K 1, V 12 > < K 2 ,V 2 0 > < K 2 ,V 2 1> < K 2 ,V 2 2 > Subscription C Consumer C-1 Consumer C-2 Shared < K 1, V 10 > < K 2, V 21 > < K 1, V 12 > < K 2 ,V 2 0 > < K 1, V 11 > < K 2 ,V 2 2 > Subscription A Consumer A Exclusive Subscription B Consumer B-1 Consumer B-2 In case of failure in Consumer B-1 Failover
  • 16.
    Reader and Batch API Pulsar IO/Connectors StreamProcessor Applications Prebuilt Connectors Custom Connectors Microservices or Event-Driven Architecture Pub/Sub API Publisher Subscriber Admin API Operators & Administrators Teams Tenant Pulsar API Design
  • 17.
    What is thePulsar Ecosystem? (cont’d) ● Processing Engines ○ Supports modern processing engines ■ Flink and Spark, as well as Pulsar SQL (Presto/Trino) ● Offloaders ○ Allows data to be offloaded to cloud storage and used with existing Pulsar APIs ■ S3, GCP Cloud Storage, HDFS, File (NFS), Azure Blob Storage (in Pulsar 2.7.0)
  • 18.
  • 19.
    MQTT on Pulsar(MoP) Configuration messagingProtocols=mqtt # directory protocolHandlerDirectory=./protocols #mqtt 3.1.1 - port / ip mqttListeners=mqtt://127.0.0.1:1883 advertisedAddress=127.0.0.1
  • 20.
  • 21.
    Pulsar Functions Provides asimple API to: ● Receive a message (consume) ● Process the message using your own code ● Send a message (produce) Takes care of the boilerplate code so there is no need to create producers and consumers.
  • 22.
    Moving Data Inand Out of Pulsar IO/Connectors are a simple way to integrate with external systems and move data in and out of Pulsar. ● Built on top of Pulsar Functions ● Built-in connectors - hub.streamnative.io Source Sink
  • 23.
    Use Azure BlobStoreoffloader with Pulsar https://pulsar.apache.org/docs/en/tiered-storage-azure/
  • 24.
    24 streamnative.io Apache Pulsar -Other Sinks https://hub.streamnative.io/connectors/cloud-storage-sink/2.5.1/ mongoDB AWS Lambda redis AWS S3 GCS
  • 25.
    Ingesting IoT Datavia Java Pulsar https://github.com/tspannhw/StreamingAnalyticsUsingFlinkSQL/
  • 26.
    Ingesting IoT Datavia Java Pulsar
  • 27.
    Pulsar SQL Presto/Trino workers canread segments directly from bookies (or offloaded storage) in parallel. Bookie 1 Segment 1 Producer Consumer Broker 1 Topic1-Part1 Broker 2 Topic1-Part2 Broker 3 Topic1-Part3 Segment 2 Segment 3 Segment 4 Segment X Segment 1 Segment 1 Segment 1 Segment 3 Segment 3 Segment 3 Segment 2 Segment 2 Segment 2 Segment 4 Segment 4 Segment 4 Segment X Segment X Segment X Bookie 2 Bookie 3 Query Coordinator ... ... SQL Worker SQL Worker SQL Worker SQL Worker Query Topic Metadata
  • 28.
    Query Your Topicswith Pulsar SQL (Trino)
  • 29.
  • 30.
    Powered by ApachePulsar, StreamNative provides a cloud-native, real-time messaging and streaming platform to support multi-cloud and hybrid cloud strategies. Built for Containers Cloud Native StreamNative Cloud Flink SQL
  • 32.
  • 33.
    StreamNative Hub StreamNative Cloud UnifiedBatch and Stream COMPUTING Batch (Batch + Stream) Unified Batch and Stream STORAGE Offload (Queuing + Streaming) Apache Flink - Apache Pulsar - Apache NiFi <-> Events <-> Azure Data Stores Tiered Storage Pulsar --- KoP --- MoP --- Websocket --- HTTP Pulsar Sink Pulsar Sink Streaming Edge Gateway Protocols End-to-End Streaming FLiP(N) Apps
  • 34.
  • 35.
  • 36.
    MQTT from Python pip3install paho-mqtt import paho.mqtt.client as mqtt client = mqtt.Client("rpi4-iot") row = { } row['gasKO'] = str(readings) json_string = json.dumps(row) json_string = json_string.strip() client.connect("pulsar-server.com", 1883, 180) client.publish("persistent://public/default/mqtt-2", payload=json_string, qos=0, retain=True)
  • 37.
    37 Using NVIDIA JetsonDevices With Pulsar https://dev.to/tspannhw/unboxing-the-most-amazing-edge-ai-devic e-part-1-of-3-nvidia-jetson-xavier-nx-595k https://github.com/tspannhw/minifi-xaviernx/ https://github.com/tspannhw/minifi-jetson-nano https://github.com/tspannhw/Flip-iot https://www.datainmotion.dev/2020/10/flank-streaming-edgeai-on- new-nvidia.html https://github.com/tspannhw/FLiP-Mobile/blob/30bcc1ec98fc31e0 39b51a06180d98545c1e0542/python3/enviro.py
  • 38.
  • 39.
  • 40.
    Connect with theCommunity & Stay Up-To-Date ● Join the Pulsar Slack channel - Apache-Pulsar.slack.com ● Follow @streamnativeio and @apache_pulsar on Twitter ● Subscribe to Monthly Pulsar Newsletter for major news, events, project updates, and resources in the Pulsar community
  • 41.
    ● https://github.com/tspannhw/FLiP-Energy/ ● https://github.com/tspannhw/Flip-iot ●https://github.com/tspannhw/Flip-jetson ● https://github.com/streamnative/pulsar-flink ● https://github.com/streamnative/mop ● https://www.linkedin.com/pulse/2021-schedule-tim-spann/ ● https://github.com/tspannhw/FLiP-InfluxDB/blob/main/README.md ● https://streamnative.io/en/blog/release/2021-04-20-flink-sql-on-streamnative-cloud ● https://docs.streamnative.io/cloud/stable/compute/flink-sql Deeper Content @PaasDev https://www.pulsardeveloper.com/ timothyspann
  • 42.
    Interested In LearningMore? Flink SQL Cookbook The Github Source for Flink SQL Demo The GitHub Source for Demo Manning's Apache Pulsar in Action O’Reilly Book [10/21] Trino Summit Resources Free eBooks Upcoming Events
  • 43.
    43 Pulsar Summit Asia November20-21, 2021 Contact us at partners@pulsar-summit.org to become a sponsor or partner
  • 45.
    Let’s Keep in Touch! SpeakerName Speaker title @PassDev https://www.linkedin.com/in/timothyspann https://github.com/tspannhw
  • 46.
  • 47.
    Event Partners Thank youto our Sponsors Premier Sponsors