Pulsar Summit Asia 2021 Function Mesh Complex Streaming Jobs Made Simple on Cloud Rui Fu @ StreamNative
• 付睿 • @freeznet • Software Engineer @ StreamNative • Apache Pulsar committer
Agenda I. Pulsar Function II. Function Mesh III. Demo
Pulsar Function
Data Processing With Apache Pulsar
Pulsar Functions Pulsar Functions are lightweight compute processes that: ● consume messages from Pulsar topics ● apply a user-supplied processing logic to each message ● publish results to another Pulsar topic
Pulsar Functions Overview
● ETL Jobs ● Real-time Aggregation ● Microservices ● Event Routing Pulsar Functions Use Case
Pulsar Functions IS: ● Lambda-style processing unit that are specifically designed to integrate with Pulsar Pulsar Functions IS NOT: ● Another Full-Power Streaming Processing Engine Pulsar Functions IS & IS NOT
Pulsar Connectors are special form of Pulsar Functions that: ● Include two types: source and sink ● Enable user to easily integrate with external data systems ○ Source: feed data from external system into Pulsar ○ Sink: send data from Pulsar to external system Pulsar Connectors
Pulsar Functions API Hello Hello! Pulsar Function
Pulsar Functions Semantics ● ATMOST_ONCE ○ Message is ACKed to Pulsar once received ● ATLEAST_ONCE ○ Message is ACKed to Pulsar after the function completes -- Default ● EFFECTIVELY_ONCE ○ Utilizes Pulsar’s Effectively Once Semantics ● EXACTLY_ONCE [TO_BE_ADDED] ○ Txn from Pulsar 2.8.0
Pulsar Functions Runtime ● Thread: Invoke functions threads in functions worker. ● Process: Invoke functions in processes forked by functions worker. ● Kubernetes: Submit functions as Kubernetes StatefulSets by functions worker.
Pulsar Functions Deployment
Pulsar Functions Summary ● Developer productivity ○ Intuitive API: `func process(input) output {}` ○ Multiple Languages Support: Java, Python, Golang ● Operational simplicity ○ Fully Integrated with Apache Pulsar ○ No Extra System/Service Setup Needed ● Easy troubleshooting ○ Convenient Local Runtime ○ Easy to Use log topics
Function Mesh
Function Mesh IS & IS NOT Function Mesh IS a Kubernetes Framework for: ● Integrating separate functions together to process data ● Utilizing Kubernetes native resources and scheduling capability Function Mesh IS NOT full power Streaming Engines
Function Mesh WHY • Inconsistency in Pulsar functions’ Kubernetes runtime • Metadata topics may cause Broker crash loop • Cannot use functions across Pulsar clusters • Hard to use Kubernetes features • A lot of manual works to do complex jobs
Function Mesh Internals ● Kubernetes Operator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
Function Mesh Custom Resource Definition ● Function ● Source ● Sink ● FunctionMesh (a.k.a Mesh)
Function Mesh Overview
Function Mesh Working with Pulsar
Function Mesh Auto-Scaling ● Horizontal Pod Autoscaler (HPA) ○ API version: autoscaling/v2beta2 ○ Custom metrics ● MaxReplicas ● BuiltinAutoscaler ● AutoScalingMetrics ● AutoScalingBehavior
Function Mesh Kubernetes Resources ● Secrets ● ConfigMaps ● Volumes ● InitContainers ● Sidecars ● ServiceAccountName ● Node Selector ● Affinity / Tolerations ● SecurityContext ● ...
Function Mesh Internals ● Kubernetes Operator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
Function Mesh Function Runner ● Function runner images ○ Rootless ○ streamnative/pulsar-functions-java-runner ○ streamnative/pulsar-functions-python-runner ○ streamnative/pulsar-functions-go-runner ● Connector images ○ StreamNative Hub (https://hub.streamnative.io/)
Function Mesh Internals ● Kubernetes Operator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
Function Mesh Mesh Worker Service ● Support Functions Worker APIs ● Work with Pulsar 2.8+ ● Customizable built-in connectors ● Custom configurations
Function Mesh Mesh Worker Service APIs Apache Pulsar Functions Function Mesh Worker Service Create ✅ ✅ Delete ✅ ✅ Update ✅ ✅ Start / Stop / Restart ✅ ❌ Trigger ✅ ❌ Stats ✅ ✅ Status ✅ ✅ State Get / Put ✅ 🔧
Function Mesh Internals ● Kubernetes Operator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
Function Mesh Pulsar Functions Feature Matrix PIP 108: Pulsar Feature Matrix (Client and Function) Features Apache Pulsar Function Mesh Schema ✅ ✅ Custom SerDe ✅ ✅ End to end encryption ✅ ✅ Resources limit ✅ ✅ Metrics ✅ ✅ User Config ✅ ✅ Secrets ✅ ✅ Stateful ✅ 🔧
Function Mesh Summary ● Eases the management ● Utilizes the full power of Kubernetes Scheduler ● Function as a First Class Citizen in Cloud Environment ● Open the potential talking to different messaging system
Function Mesh Future ● Improve the capability level of the Function Mesh operator. ● Feature parity with Pulsar Functions. ● Support additional runtime based on self-contained function runtime, such as web-assembly. ● Develop better tools/frontend to manage and inspect Function Meshes. ● Group individual functions together to improve latency and reduce cost. ● Auto-scaling to/from zero
Get Function Mesh ● Website https://functionmesh.io/ ● Open Sourced https://github.com/streamnative/function-mesh https://github.com/streamnative/function-mesh-worker-service
Demo
Demo -- Connector On Cloud
We’re hiring Build Pulsar with the team that builds Pulsar ✓ Work with the creators of Pulsar ✓ Exciting, growth-stage company ✓ Open and collaborative environment ✓ Competitive compensation and benefits ✓ Best teammates on earth https://streamnative.io/careers
Thank You
Function Mesh Architecture

Function Mesh for Apache Pulsar, the Way for Simple Streaming Solutions

  • 1.
    Pulsar Summit Asia2021 Function Mesh Complex Streaming Jobs Made Simple on Cloud Rui Fu @ StreamNative
  • 2.
    • 付睿 • @freeznet •Software Engineer @ StreamNative • Apache Pulsar committer
  • 3.
    Agenda I. Pulsar Function II.Function Mesh III. Demo
  • 4.
  • 5.
    Data Processing WithApache Pulsar
  • 6.
    Pulsar Functions Pulsar Functionsare lightweight compute processes that: ● consume messages from Pulsar topics ● apply a user-supplied processing logic to each message ● publish results to another Pulsar topic
  • 7.
  • 8.
    ● ETL Jobs ●Real-time Aggregation ● Microservices ● Event Routing Pulsar Functions Use Case
  • 9.
    Pulsar Functions IS: ●Lambda-style processing unit that are specifically designed to integrate with Pulsar Pulsar Functions IS NOT: ● Another Full-Power Streaming Processing Engine Pulsar Functions IS & IS NOT
  • 10.
    Pulsar Connectors arespecial form of Pulsar Functions that: ● Include two types: source and sink ● Enable user to easily integrate with external data systems ○ Source: feed data from external system into Pulsar ○ Sink: send data from Pulsar to external system Pulsar Connectors
  • 11.
  • 12.
    Pulsar Functions Semantics ● ATMOST_ONCE ○Message is ACKed to Pulsar once received ● ATLEAST_ONCE ○ Message is ACKed to Pulsar after the function completes -- Default ● EFFECTIVELY_ONCE ○ Utilizes Pulsar’s Effectively Once Semantics ● EXACTLY_ONCE [TO_BE_ADDED] ○ Txn from Pulsar 2.8.0
  • 13.
    Pulsar Functions Runtime ● Thread:Invoke functions threads in functions worker. ● Process: Invoke functions in processes forked by functions worker. ● Kubernetes: Submit functions as Kubernetes StatefulSets by functions worker.
  • 14.
  • 15.
    Pulsar Functions Summary ● Developerproductivity ○ Intuitive API: `func process(input) output {}` ○ Multiple Languages Support: Java, Python, Golang ● Operational simplicity ○ Fully Integrated with Apache Pulsar ○ No Extra System/Service Setup Needed ● Easy troubleshooting ○ Convenient Local Runtime ○ Easy to Use log topics
  • 16.
  • 17.
    Function Mesh IS &IS NOT Function Mesh IS a Kubernetes Framework for: ● Integrating separate functions together to process data ● Utilizing Kubernetes native resources and scheduling capability Function Mesh IS NOT full power Streaming Engines
  • 18.
    Function Mesh WHY • Inconsistencyin Pulsar functions’ Kubernetes runtime • Metadata topics may cause Broker crash loop • Cannot use functions across Pulsar clusters • Hard to use Kubernetes features • A lot of manual works to do complex jobs
  • 19.
    Function Mesh Internals ● KubernetesOperator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
  • 20.
    Function Mesh Custom ResourceDefinition ● Function ● Source ● Sink ● FunctionMesh (a.k.a Mesh)
  • 21.
  • 22.
  • 23.
    Function Mesh Auto-Scaling ● HorizontalPod Autoscaler (HPA) ○ API version: autoscaling/v2beta2 ○ Custom metrics ● MaxReplicas ● BuiltinAutoscaler ● AutoScalingMetrics ● AutoScalingBehavior
  • 24.
    Function Mesh Kubernetes Resources ●Secrets ● ConfigMaps ● Volumes ● InitContainers ● Sidecars ● ServiceAccountName ● Node Selector ● Affinity / Tolerations ● SecurityContext ● ...
  • 25.
    Function Mesh Internals ● KubernetesOperator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
  • 26.
    Function Mesh Function Runner ●Function runner images ○ Rootless ○ streamnative/pulsar-functions-java-runner ○ streamnative/pulsar-functions-python-runner ○ streamnative/pulsar-functions-go-runner ● Connector images ○ StreamNative Hub (https://hub.streamnative.io/)
  • 27.
    Function Mesh Internals ● KubernetesOperator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
  • 28.
    Function Mesh Mesh WorkerService ● Support Functions Worker APIs ● Work with Pulsar 2.8+ ● Customizable built-in connectors ● Custom configurations
  • 29.
    Function Mesh Mesh WorkerService APIs Apache Pulsar Functions Function Mesh Worker Service Create ✅ ✅ Delete ✅ ✅ Update ✅ ✅ Start / Stop / Restart ✅ ❌ Trigger ✅ ❌ Stats ✅ ✅ Status ✅ ✅ State Get / Put ✅ 🔧
  • 30.
    Function Mesh Internals ● KubernetesOperator ● Function Runner ● Mesh Worker Service ● Pulsar Functions Feature Matrix
  • 31.
    Function Mesh Pulsar FunctionsFeature Matrix PIP 108: Pulsar Feature Matrix (Client and Function) Features Apache Pulsar Function Mesh Schema ✅ ✅ Custom SerDe ✅ ✅ End to end encryption ✅ ✅ Resources limit ✅ ✅ Metrics ✅ ✅ User Config ✅ ✅ Secrets ✅ ✅ Stateful ✅ 🔧
  • 32.
    Function Mesh Summary ● Easesthe management ● Utilizes the full power of Kubernetes Scheduler ● Function as a First Class Citizen in Cloud Environment ● Open the potential talking to different messaging system
  • 33.
    Function Mesh Future ● Improvethe capability level of the Function Mesh operator. ● Feature parity with Pulsar Functions. ● Support additional runtime based on self-contained function runtime, such as web-assembly. ● Develop better tools/frontend to manage and inspect Function Meshes. ● Group individual functions together to improve latency and reduce cost. ● Auto-scaling to/from zero
  • 34.
    Get Function Mesh ●Website https://functionmesh.io/ ● Open Sourced https://github.com/streamnative/function-mesh https://github.com/streamnative/function-mesh-worker-service
  • 35.
  • 36.
  • 37.
    We’re hiring Build Pulsarwith the team that builds Pulsar ✓ Work with the creators of Pulsar ✓ Exciting, growth-stage company ✓ Open and collaborative environment ✓ Competitive compensation and benefits ✓ Best teammates on earth https://streamnative.io/careers
  • 38.
  • 39.