How to build an event driven architecture with Kafka & Kafka Connect Nov 12, 2020 Lợi Nguyễn - Technical Architect @ VeXeRe 1
Vietnam’s largest online bus booking systemvexere.com VeXeRe.com is a Vietnamese online bus ticket booking system that operates through many transportation companies. 2
Vietnam’s largest online bus booking systemvexere.com 3
Vietnam’s largest online bus booking systemvexere.com 4
Vietnam’s largest online bus booking systemvexere.com ● Why event driven architecture? ● What is “Event Driven” architecture? ○ Event-carried State Transfer ○ Event Sourcing ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Postgresql / WAL ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ■ MSSQL Source Connector ○ Transform ○ Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer ● Experience / Tools / Troubleshoot ○ Tools: (kafka manager, kafka tool) ○ Troubleshoot connector ○ Monitoring ○ Domain Event vs Event Sourcing Event ● Demo ● Q & A & Discussion 5
Vietnam’s largest online bus booking systemvexere.com 6
Vietnam’s largest online bus booking systemvexere.com 7 Phase Challenges Launch ● Business Idea Profitability ● Limit resource: ○ Time ○ Technology ○ Money Growth/Expansion ● Increasing Customers ● Increasing Feature ● Adding New Products/BU
Vietnam’s largest online bus booking systemvexere.com Name: Nguyễn Văn Lợi Company: ● Vexere ● Chotot ● Start-up ● Softfoundry Industries: ● VoIP ● Big Data ● Ecommerce ● Logistic 8
Vietnam’s largest online bus booking systemvexere.com ● Event-carried State Transfer ● Event Sourcing 9
Vietnam’s largest online bus booking systemvexere.com 10
Vietnam’s largest online bus booking systemvexere.com 11
Vietnam’s largest online bus booking systemvexere.com 12
Vietnam’s largest online bus booking systemvexere.com ● Two representation of the world: ○ Application State: the current representation of the world, and ○ log of all the events: that changed that world ● The test definition of Event Sourcing: ○ at any time we can blow away the application state and confidently rebuild it from the log. ● Benefit: ○ Audits ○ Debugging 13
Vietnam’s largest online bus booking systemvexere.com ● What is “Event Driven” architecture? ○ Event-carried State Transfer ○ Event Sourcing 14
Vietnam’s largest online bus booking systemvexere.com ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Mysql / binlog ○ Postgresql / WAL ○ MongoDB / oplog 15
Vietnam’s largest online bus booking systemvexere.com ● DB transaction log ○ Db use transaction log for replicate and restore purpose ● Git/Subversion ● Blockchain 16
Vietnam’s largest online bus booking systemvexere.com 17
Vietnam’s largest online bus booking systemvexere.com 18
Vietnam’s largest online bus booking systemvexere.com ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Postgresql / WAL 19
Vietnam’s largest online bus booking systemvexere.com ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ○ Transform ○ How we use Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer 20
Vietnam’s largest online bus booking systemvexere.com ● topic ● producer ● consumer ● broker ● partition ● consumer group 21
Vietnam’s largest online bus booking systemvexere.com 22
Vietnam’s largest online bus booking systemvexere.com 23
Vietnam’s largest online bus booking systemvexere.com 24
Vietnam’s largest online bus booking systemvexere.com 25
Vietnam’s largest online bus booking systemvexere.com 26
Vietnam’s largest online bus booking systemvexere.com 27
Vietnam’s largest online bus booking systemvexere.com Kafka Connect is a framework to stream data into and out of Apache Kafka ● Connectors – the high level abstraction that coordinates data streaming by managing tasks ● Tasks – the implementation of how data is copied to or from Kafka ● Workers – the running processes that execute connectors and tasks ● Converters – the code used to translate data between Connect and the system sending or receiving data ● Transforms – simple logic to alter each message produced by or sent to a connector ● Dead Letter Queue – how Connect handles connector errors 28
Vietnam’s largest online bus booking systemvexere.com No coding required, just json config: 29
Vietnam’s largest online bus booking systemvexere.com 30
Vietnam’s largest online bus booking systemvexere.com 31
Vietnam’s largest online bus booking systemvexere.com 32
Vietnam’s largest online bus booking systemvexere.com 33
Vietnam’s largest online bus booking systemvexere.com 34
Vietnam’s largest online bus booking systemvexere.com 35
Vietnam’s largest online bus booking systemvexere.com Pros Cons ● Many Connectors (source/sink) ● No coding required ● Simple transform only ● Hard to customize or write your own connector 36
Vietnam’s largest online bus booking systemvexere.com ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ■ MSSQL Source Connector ○ Transform ○ How we use Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer 37
Vietnam’s largest online bus booking systemvexere.com ● Monitor kafka connect job ● AlwaysOn Cluster Config ● Database schema evolution 38
Vietnam’s largest online bus booking systemvexere.com 39
Vietnam’s largest online bus booking systemvexere.com 40 ● Database schema evolution ○ Hot update ○ Cold update ○ Capture table schema is static ○ Must disable/enable table cdc again to reflect new schema to cdc table ■ Create a new capture table for the update source table using sys.sp_cdc_enable_table procedure with a unique value for parameter @capture_instance ■ Why? Because source connector only sent new schema to schema changes topic (for later comparison), then use this to diff check with message schema
Vietnam’s largest online bus booking systemvexere.com Always On Cluster 41
Vietnam’s largest online bus booking systemvexere.com 42 ● Datatype: ○ VARCHAR(max) is not supported in CDC table ( cannot record before value, only have record with value after update )
Vietnam’s largest online bus booking systemvexere.com In Kafka Connect, task is being killed and will not recover until manually restarted Solution: ● Cronjob to check task status, then restart task by calling restful to task api ● Dead Letter Queue to handle error in: ○ Convert ○ Transform 43
Vietnam’s largest online bus booking systemvexere.com ● Experience / Tools / Troubleshoot ○ Tools: (kafka manager, kafka tool) ○ Troubleshoot connector ○ Monitoring 44
Vietnam’s largest online bus booking systemvexere.com 45
Vietnam’s largest online bus booking systemvexere.com 46
Vietnam’s largest online bus booking systemvexere.com 47
Vietnam’s largest online bus booking systemvexere.com 48 Reference: https://www.enterpriseintegrationpatterns.com/patterns/messaging/index.html
Vietnam’s largest online bus booking systemvexere.com ● https://martinfowler.com/articles/201701-event-driven.html ● https://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latenc y-data-pipelines/ ● https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change- data-capture-sql-server?view=sql-server-2017 ● https://docs.confluent.io/current/connect/concepts.html ● https://www.slideshare.net/ConfluentInc/from-zero-to-hero-with-kafka-connect ● https://www.innoq.com/en/blog/domain-events-versus-event-sourcing/#eventsfromeven tsourcing%E2%89%A0domainevents 49
Vietnam’s largest online bus booking systemvexere.com 50
Vietnam’s largest online bus booking systemvexere.com 51

How to build an event driven architecture with kafka and kafka connect

  • 1.
    How to buildan event driven architecture with Kafka & Kafka Connect Nov 12, 2020 Lợi Nguyễn - Technical Architect @ VeXeRe 1
  • 2.
    Vietnam’s largest onlinebus booking systemvexere.com VeXeRe.com is a Vietnamese online bus ticket booking system that operates through many transportation companies. 2
  • 3.
    Vietnam’s largest onlinebus booking systemvexere.com 3
  • 4.
    Vietnam’s largest onlinebus booking systemvexere.com 4
  • 5.
    Vietnam’s largest onlinebus booking systemvexere.com ● Why event driven architecture? ● What is “Event Driven” architecture? ○ Event-carried State Transfer ○ Event Sourcing ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Postgresql / WAL ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ■ MSSQL Source Connector ○ Transform ○ Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer ● Experience / Tools / Troubleshoot ○ Tools: (kafka manager, kafka tool) ○ Troubleshoot connector ○ Monitoring ○ Domain Event vs Event Sourcing Event ● Demo ● Q & A & Discussion 5
  • 6.
    Vietnam’s largest onlinebus booking systemvexere.com 6
  • 7.
    Vietnam’s largest onlinebus booking systemvexere.com 7 Phase Challenges Launch ● Business Idea Profitability ● Limit resource: ○ Time ○ Technology ○ Money Growth/Expansion ● Increasing Customers ● Increasing Feature ● Adding New Products/BU
  • 8.
    Vietnam’s largest onlinebus booking systemvexere.com Name: Nguyễn Văn Lợi Company: ● Vexere ● Chotot ● Start-up ● Softfoundry Industries: ● VoIP ● Big Data ● Ecommerce ● Logistic 8
  • 9.
    Vietnam’s largest onlinebus booking systemvexere.com ● Event-carried State Transfer ● Event Sourcing 9
  • 10.
    Vietnam’s largest onlinebus booking systemvexere.com 10
  • 11.
    Vietnam’s largest onlinebus booking systemvexere.com 11
  • 12.
    Vietnam’s largest onlinebus booking systemvexere.com 12
  • 13.
    Vietnam’s largest onlinebus booking systemvexere.com ● Two representation of the world: ○ Application State: the current representation of the world, and ○ log of all the events: that changed that world ● The test definition of Event Sourcing: ○ at any time we can blow away the application state and confidently rebuild it from the log. ● Benefit: ○ Audits ○ Debugging 13
  • 14.
    Vietnam’s largest onlinebus booking systemvexere.com ● What is “Event Driven” architecture? ○ Event-carried State Transfer ○ Event Sourcing 14
  • 15.
    Vietnam’s largest onlinebus booking systemvexere.com ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Mysql / binlog ○ Postgresql / WAL ○ MongoDB / oplog 15
  • 16.
    Vietnam’s largest onlinebus booking systemvexere.com ● DB transaction log ○ Db use transaction log for replicate and restore purpose ● Git/Subversion ● Blockchain 16
  • 17.
    Vietnam’s largest onlinebus booking systemvexere.com 17
  • 18.
    Vietnam’s largest onlinebus booking systemvexere.com 18
  • 19.
    Vietnam’s largest onlinebus booking systemvexere.com ● Event Sourcing in real world ○ What is 2 phase write? ○ MSSQL / transaction log ○ Postgresql / WAL 19
  • 20.
    Vietnam’s largest onlinebus booking systemvexere.com ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ○ Transform ○ How we use Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer 20
  • 21.
    Vietnam’s largest onlinebus booking systemvexere.com ● topic ● producer ● consumer ● broker ● partition ● consumer group 21
  • 22.
    Vietnam’s largest onlinebus booking systemvexere.com 22
  • 23.
    Vietnam’s largest onlinebus booking systemvexere.com 23
  • 24.
    Vietnam’s largest onlinebus booking systemvexere.com 24
  • 25.
    Vietnam’s largest onlinebus booking systemvexere.com 25
  • 26.
    Vietnam’s largest onlinebus booking systemvexere.com 26
  • 27.
    Vietnam’s largest onlinebus booking systemvexere.com 27
  • 28.
    Vietnam’s largest onlinebus booking systemvexere.com Kafka Connect is a framework to stream data into and out of Apache Kafka ● Connectors – the high level abstraction that coordinates data streaming by managing tasks ● Tasks – the implementation of how data is copied to or from Kafka ● Workers – the running processes that execute connectors and tasks ● Converters – the code used to translate data between Connect and the system sending or receiving data ● Transforms – simple logic to alter each message produced by or sent to a connector ● Dead Letter Queue – how Connect handles connector errors 28
  • 29.
    Vietnam’s largest onlinebus booking systemvexere.com No coding required, just json config: 29
  • 30.
    Vietnam’s largest onlinebus booking systemvexere.com 30
  • 31.
    Vietnam’s largest onlinebus booking systemvexere.com 31
  • 32.
    Vietnam’s largest onlinebus booking systemvexere.com 32
  • 33.
    Vietnam’s largest onlinebus booking systemvexere.com 33
  • 34.
    Vietnam’s largest onlinebus booking systemvexere.com 34
  • 35.
    Vietnam’s largest onlinebus booking systemvexere.com 35
  • 36.
    Vietnam’s largest onlinebus booking systemvexere.com Pros Cons ● Many Connectors (source/sink) ● No coding required ● Simple transform only ● Hard to customize or write your own connector 36
  • 37.
    Vietnam’s largest onlinebus booking systemvexere.com ● What is Kafka & Kafka Connect? ○ Connector/Task/Worker ■ MSSQL Source Connector ○ Transform ○ How we use Kafka and Kafka Connect @ vexere ○ Pros/Cons of Kafka Connect vs Custom Producer 37
  • 38.
    Vietnam’s largest onlinebus booking systemvexere.com ● Monitor kafka connect job ● AlwaysOn Cluster Config ● Database schema evolution 38
  • 39.
    Vietnam’s largest onlinebus booking systemvexere.com 39
  • 40.
    Vietnam’s largest onlinebus booking systemvexere.com 40 ● Database schema evolution ○ Hot update ○ Cold update ○ Capture table schema is static ○ Must disable/enable table cdc again to reflect new schema to cdc table ■ Create a new capture table for the update source table using sys.sp_cdc_enable_table procedure with a unique value for parameter @capture_instance ■ Why? Because source connector only sent new schema to schema changes topic (for later comparison), then use this to diff check with message schema
  • 41.
    Vietnam’s largest onlinebus booking systemvexere.com Always On Cluster 41
  • 42.
    Vietnam’s largest onlinebus booking systemvexere.com 42 ● Datatype: ○ VARCHAR(max) is not supported in CDC table ( cannot record before value, only have record with value after update )
  • 43.
    Vietnam’s largest onlinebus booking systemvexere.com In Kafka Connect, task is being killed and will not recover until manually restarted Solution: ● Cronjob to check task status, then restart task by calling restful to task api ● Dead Letter Queue to handle error in: ○ Convert ○ Transform 43
  • 44.
    Vietnam’s largest onlinebus booking systemvexere.com ● Experience / Tools / Troubleshoot ○ Tools: (kafka manager, kafka tool) ○ Troubleshoot connector ○ Monitoring 44
  • 45.
    Vietnam’s largest onlinebus booking systemvexere.com 45
  • 46.
    Vietnam’s largest onlinebus booking systemvexere.com 46
  • 47.
    Vietnam’s largest onlinebus booking systemvexere.com 47
  • 48.
    Vietnam’s largest onlinebus booking systemvexere.com 48 Reference: https://www.enterpriseintegrationpatterns.com/patterns/messaging/index.html
  • 49.
    Vietnam’s largest onlinebus booking systemvexere.com ● https://martinfowler.com/articles/201701-event-driven.html ● https://www.confluent.io/blog/announcing-kafka-connect-building-large-scale-low-latenc y-data-pipelines/ ● https://docs.microsoft.com/en-us/sql/relational-databases/track-changes/about-change- data-capture-sql-server?view=sql-server-2017 ● https://docs.confluent.io/current/connect/concepts.html ● https://www.slideshare.net/ConfluentInc/from-zero-to-hero-with-kafka-connect ● https://www.innoq.com/en/blog/domain-events-versus-event-sourcing/#eventsfromeven tsourcing%E2%89%A0domainevents 49
  • 50.
    Vietnam’s largest onlinebus booking systemvexere.com 50
  • 51.
    Vietnam’s largest onlinebus booking systemvexere.com 51