Location Analytics – Real-Time Geofencing using Kafka
Berlin Buzzwords 2019
Guido Schmutz (guido.schmutz@trivadis.com)
Twitter: gschmutz
http://guidoschmutz.wordpress.com

Agenda
1. Introduction & Motivation
2. Implementing using KSQL
3. Implementing using Tile38
4. Visualization using Arcadia Data
5. Summary

Guido Schmutz
• Working at Trivadis for more than 22 years
• Oracle Groundbreaker Ambassador & Oracle ACE Director
• Consultant, Trainer, Software Architect for Java, AWS, Azure, Oracle Cloud, SOA and Big Data / Fast Data
• Platform Architect & Head of Trivadis Architecture Board
• More than 30 years of software development experience
• Contact: guido.schmutz@trivadis.com
• Blog: http://guidoschmutz.wordpress.com
• Slideshare: http://www.slideshare.net/gschmutz
• Twitter: gschmutz

Introduction

Geofencing – What is it?
The use of GPS or RFID technology to create a virtual geographic boundary, enabling software to trigger a response when an object/device enters or leaves a particular area.
Possible Events
• OUTSIDE
• INSIDE
• ENTER
• EXIT
Source: https://tile38.com

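The four events above can be related to each other with a small amount of state. The following is a minimal sketch, not taken from the talk: it assumes that ENTER and EXIT are simply transitions between consecutive INSIDE/OUTSIDE observations of one object against one fence. The class `FenceEventSketch` and its method `detect` are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: derives ENTER/EXIT from consecutive inside/outside observations. */
final class FenceEventSketch {
    // Last known "inside" flag per "objectId|fenceId" key.
    private final Map<String, Boolean> lastInside = new HashMap<>();

    /** Returns ENTER, EXIT, INSIDE or OUTSIDE for the latest observation. */
    String detect(String objectId, String fenceId, boolean insideNow) {
        Boolean before = lastInside.put(objectId + "|" + fenceId, insideNow);
        // Assumption: an object that was never seen before counts as previously outside.
        boolean wasInside = before != null && before;
        if (insideNow && !wasInside) {
            return "ENTER";
        }
        if (!insideNow && wasInside) {
            return "EXIT";
        }
        return insideNow ? "INSIDE" : "OUTSIDE";
    }

    public static void main(String[] args) {
        FenceEventSketch sketch = new FenceEventSketch();
        boolean[] observations = {false, true, true, false};
        for (boolean inside : observations) {
            System.out.println(sketch.detect("10", "3", inside));
        }
    }
}
```

In a streaming setup the map would have to live in a state store that survives restarts; the in-memory map here only illustrates the transition logic.
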
Geofencing – What can we do with it?
• On-Demand and Delivery Services - assign orders to an area's designated service provider
• On-Demand Transportation - track Electronic Transportation Devices and their distance from charging stations
• Transportation Management - track flow of people using public transport systems
• Commercial Real Estate - identify how many people drive or walk by a specific location
• Retail Shopper Guidance - guide customers to a specific product once they are in your store
• Property Security - open or lock doors as individuals with designated devices approach or leave a building or vehicle
• Property Control - restrict vehicles, such as drones or construction equipment, to be operational only inside a geofenced area

Geo-Processing
• Well-known text (WKT) is a text markup language for representing vector geometry objects on a map
• GeoTools is a free software GIS toolkit for developing standards-compliant solutions

Apache Kafka – A Streaming Platform
[Diagram: Source Connector, Kafka Broker (topic trucking_driver), Stream Processing, Sink Connector]

High Level Overview of Use Case
[Diagram: Vehicle Position and Geofence Mgmt publish to the object position and geofence topics; "Join Position & Geofences" produces pos & geofences; "Geo fencing" produces geofence status, which feeds a Dashboard. A Weather Service also appears in the diagram.]
Sample vehicle position message (key=10):
    { "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Sample geofence message (key=3):
    {"id":3,"name":"Berlin, Germany","geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))","last_update":1560607149015}

Implementing using KSQL

KSQL - Overview
• Stream
  • unbounded sequence of structured data ("facts")
  • Facts in a stream are immutable
• Table
  • collected state of a stream
  • Latest value for each key in a stream
  • Facts in a table are mutable
• Stream Processing with zero coding using SQL-like language
• Stream and Table as first-class citizens
[Diagram: KSQL CLI sends commands to the KSQL Engine (Kafka Streams), which works against the Kafka Broker (topic trucking_driver)]

KSQL – Streams and Tables
[Diagram: vehicle position (Stream), geofence (Table)]
    CREATE STREAM vehicle_position_s
      (id VARCHAR,
       latitude DOUBLE,
       longitude DOUBLE)
    WITH (KAFKA_TOPIC='vehicle_position',
          VALUE_FORMAT='DELIMITED');

    CREATE TABLE geo_fence_t
      (id BIGINT,
       name VARCHAR,
       geometry_wkt VARCHAR)
    WITH (KAFKA_TOPIC='geo_fence',
          VALUE_FORMAT='JSON',
          KEY = 'id');
KSQL Geofencing

How to determine "inside" or "outside" geofence?
• Only one standard UDF for geo processing in KSQL: GEO_DISTANCE
• Implement custom UDF using functionality from GeoTools Java library
    public String geo_fence(final double latitude, final double longitude, final String geometryWKT) { .. }
    public List<String> geo_fence_bulk(final double latitude, final double longitude, List<String> idGeometryListWKT) { .. }

    ksql> SELECT latitude, longitude,
            geo_fence(latitude, longitude, 'POLYGON ((13.297920227050781 52.56195151687443, 13.2440185546875 52.530216577830124, ...))')
          FROM test_geo_udf_s;
    52.4497 | 13.3096 | OUTSIDE
    52.4556 | 13.3178 | INSIDE

Custom UDF to determine if Point is inside a geometry
    // Needs io.confluent.ksql.function.udf.Udf, org.geotools.geometry.jts.JTSFactoryFinder
    // and the JTS geometry and WKT classes (their package name depends on the JTS version).
    @Udf(description = "determines if a lat/long is inside or outside the geometry passed as the 3rd parameter as WKT encoded ...")
    public String geo_fence(final double latitude, final double longitude, final String geometryWKT) {
        GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory();
        WKTReader reader = new WKTReader(geometryFactory);
        try {
            // WKTReader.read throws a checked ParseException on malformed WKT
            Polygon polygon = (Polygon) reader.read(geometryWKT);
            // JTS coordinates are (x, y), i.e. (longitude, latitude)
            Coordinate coord = new Coordinate(longitude, latitude);
            Point point = geometryFactory.createPoint(coord);
            return point.within(polygon) ? "INSIDE" : "OUTSIDE";
        } catch (ParseException e) {
            throw new IllegalArgumentException("Invalid WKT: " + geometryWKT, e);
        }
    }

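To make the inside/outside test itself more tangible, here is a minimal, dependency-free sketch. It does not use GeoTools or JTS; it swaps in the classic ray-casting (even-odd) test and a deliberately naive parser that reads only the outer ring of a single WKT POLYGON. The class `PointInPolygonSketch`, its methods and the square test polygon in `main` are hypothetical and exist only for illustration; points exactly on the boundary are not handled specially.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the geo_fence UDF logic, using ray casting instead of JTS. */
final class PointInPolygonSketch {

    /** Parses the outer ring of "POLYGON ((x y, x y, ...))" into [x, y] pairs; holes are ignored. */
    static List<double[]> parseOuterRing(String wkt) {
        int start = wkt.indexOf("((");
        int end = wkt.indexOf(')', start);
        if (start < 0 || end < 0) {
            throw new IllegalArgumentException("Not a WKT polygon: " + wkt);
        }
        List<double[]> ring = new ArrayList<>();
        for (String pair : wkt.substring(start + 2, end).split(",")) {
            String[] xy = pair.trim().split("\\s+");
            ring.add(new double[] {Double.parseDouble(xy[0]), Double.parseDouble(xy[1])});
        }
        return ring;
    }

    /** Returns "INSIDE" or "OUTSIDE"; x is longitude and y is latitude, as in WKT. */
    static String geoFence(double latitude, double longitude, String wkt) {
        List<double[]> ring = parseOuterRing(wkt);
        boolean inside = false;
        // Count how often a ray going east from the point crosses a polygon edge.
        for (int i = 0, j = ring.size() - 1; i < ring.size(); j = i++) {
            double xi = ring.get(i)[0], yi = ring.get(i)[1];
            double xj = ring.get(j)[0], yj = ring.get(j)[1];
            boolean crosses = (yi > latitude) != (yj > latitude)
                    && longitude < (xj - xi) * (latitude - yi) / (yj - yi) + xi;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside ? "INSIDE" : "OUTSIDE";
    }

    public static void main(String[] args) {
        // Hypothetical square fence, not one of the polygons used in the talk.
        String square = "POLYGON ((13.0 52.0, 14.0 52.0, 14.0 53.0, 13.0 53.0, 13.0 52.0))";
        System.out.println(geoFence(52.5, 13.5, square));
        System.out.println(geoFence(52.5, 15.0, square));
    }
}
```

A library such as JTS remains the better choice for real data, since it also covers holes, multi-polygons and boundary cases.
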
1) Using Cross Join
[Diagram: vehicle position (Stream) and geofence (Table) go into "Join Position & Geofences", which produces pos & geofences (Stream)]
    CREATE STREAM vp_join_gf_s
    AS
    SELECT vp.id, vp.latitude, vp.longitude, gf.geometry_wkt
    FROM vehicle_position_s AS vp
    CROSS JOIN geo_fence_t AS gf
There is no Cross Join in KSQL!

2) INNER Join
[Diagram: geofence (Stream) goes through "Enrich Group" into geofences by group (Table); vehicle position (Stream) goes through "Enrich Group" into position by group (Stream); both go into "Join Position & Geofences", which produces pos & geofences (Stream). The constant group is "1".]
Sample position by group message:
    { "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Sample geofences by group messages:
    { "group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
    { "group":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
Cannot insert into Table from Stream:
    > INSERT INTO geo_fence_t
    > SELECT '1' AS group_id, geof.id, …
    > FROM geo_fence_s geof;
    INSERT INTO can only be used to insert into a stream. A02_GEO_FENCE_T is a table.

3) Geofences aggregated in one group
[Diagram: geofence (Stream) goes through "Enrich With Group-1" into geofences by group (Stream), which is aggregated into "Geofences aggby group" (Table); vehicle position (Stream) goes through "Enrich With Group-1" into position by group (Stream); "Join Position & Geofences" calls geo_fence_bulk and produces geofence status (Stream).]
Sample position by group message:
    { "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Sample geofences by group messages:
    { "group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
    { "group":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

3) Geofences aggregated in one group
    CREATE TABLE a03_geo_fence_aggby_group_t
    AS
    SELECT group_id,
           collect_set(id + ':' + geometry_wkt) AS id_geometry_wkt_list
    FROM a03_geo_fence_by_group_s geof
    GROUP BY group_id;

    CREATE STREAM a03_vehicle_position_by_group_s
    AS
    SELECT '1' group_id, vehp.id, vehp.latitude, vehp.longitude
    FROM vehicle_position_s vehp
    PARTITION BY group_id;

3) Geofences aggregated in one group
    CREATE STREAM a03_geo_fence_status_s
    AS
    SELECT vehp.id, vehp.latitude, vehp.longitude,
           geo_fence_bulk(vehp.latitude, vehp.longitude, geofagg.id_geometry_wkt_list) AS geofence_status
    FROM a03_vehicle_position_by_group_s vehp
    LEFT JOIN a03_geo_fence_aggby_group_t geofagg
    ON vehp.group_id = geofagg.group_id;

    ksql> SELECT * FROM a03_geo_fence_status_s;
    46 | 52.47546 | 13.34851 | [1:OUTSIDE, 3:INSIDE]
    46 | 52.47521 | 13.34881 | [1:OUTSIDE, 3:INSIDE]
    ...
The status list has as many entries as there are geo-fences.

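The body of `geo_fence_bulk` is not shown in the slides. The sketch below is one plausible reading of its contract, inferred only from the `id + ':' + geometry_wkt` input and the `[1:OUTSIDE, 3:INSIDE]` output above: split each entry at the first colon and check the geometry. The class `GeoFenceBulkSketch` is hypothetical, and the actual containment test is passed in as a predicate (an assumption made to keep the example self-contained) instead of calling GeoTools.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of the bulk check: "id:WKT" entries in, "id:STATUS" entries out. */
final class GeoFenceBulkSketch {

    /**
     * @param idGeometryListWkt entries of the form "id:WKT", as built by collect_set above
     * @param containsPoint     returns true if the given WKT geometry contains the current
     *                          position (the position is captured by the caller)
     */
    static List<String> geoFenceBulk(List<String> idGeometryListWkt, Predicate<String> containsPoint) {
        List<String> result = new ArrayList<>();
        for (String entry : idGeometryListWkt) {
            int separator = entry.indexOf(':');
            if (separator < 0) {
                continue; // skip malformed entries without an id prefix
            }
            String id = entry.substring(0, separator);
            String wkt = entry.substring(separator + 1);
            result.add(id + ":" + (containsPoint.test(wkt) ? "INSIDE" : "OUTSIDE"));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder geometries; the stub predicate only pretends to do a geometric test.
        List<String> fences = Arrays.asList("1:POLYGON A", "3:POLYGON B");
        System.out.println(geoFenceBulk(fences, wkt -> wkt.endsWith("B")));
    }
}
```

Because every position is checked against every fence in the group, the cost per position grows linearly with the number of fences.
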
Geo Hash for a better distribution
Geohash is a geocoding system that encodes a geographic location into a short string of letters and digits.
    Length   Area (width x height)
    1        5,009.4km x 4,992.6km
    2        1,252.3km x 624.1km
    3        156.5km x 156km
    4        39.1km x 19.5km
    12       3.7cm x 1.9cm
http://geohash.gofreerange.com/

Geo Hash Custom UDF
    public String geohash(final double latitude, final double longitude, int length)
    public List<String> neighbours(String geohash)
    public String adjacentHash(String geohash, String directionString)
    public List<String> coverBoundingBox(String geometryWKT, int length)

    ksql> SELECT latitude, longitude, geo_hash(latitude, longitude, 3)
    > FROM test_geo_udf_s;
    38.484769753492536 | -90.23345947265625 | 9yz

    ksql> SELECT geometry_wkt, geo_hash(geometry_wkt, 5)
    > FROM test_geo_udf_s;
    POLYGON ((-90.23345947265625 38.484769753492536, -90.25886535644531 38.47455675836861, ...)) | [9yzf6, 9yzf7, 9yzfd, 9yzfe, 9yzff, 9yzfg, 9yzfk, 9yzfs, 9yzfu]

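How the UDF computes the hash is not shown in the slides. As a minimal sketch, the standard geohash encoding can be written without any library: longitude and latitude bits are interleaved, starting with longitude, and every five bits become one base-32 character. The class name `GeoHashSketch` and the method `encode` are hypothetical; the two sample coordinates in `main` are taken from the query outputs in this deck.

```java
/** Hypothetical, library-free sketch of standard geohash encoding. */
final class GeoHashSketch {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double latitude, double longitude, int length) {
        double latMin = -90.0, latMax = 90.0;
        double lonMin = -180.0, lonMax = 180.0;
        StringBuilder hash = new StringBuilder();
        boolean useLongitude = true; // bits alternate, starting with longitude
        int bitCount = 0;
        int value = 0;
        while (hash.length() < length) {
            if (useLongitude) {
                double mid = (lonMin + lonMax) / 2;
                if (longitude >= mid) {
                    value = (value << 1) | 1;
                    lonMin = mid;
                } else {
                    value <<= 1;
                    lonMax = mid;
                }
            } else {
                double mid = (latMin + latMax) / 2;
                if (latitude >= mid) {
                    value = (value << 1) | 1;
                    latMin = mid;
                } else {
                    value <<= 1;
                    latMax = mid;
                }
            }
            useLongitude = !useLongitude;
            if (++bitCount == 5) { // five bits make one base-32 character
                hash.append(BASE32.charAt(value));
                bitCount = 0;
                value = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(38.484769753492536, -90.23345947265625, 3)); // 9yz
        System.out.println(encode(52.3906, 13.1599, 3)); // u33
    }
}
```

Points that are close together usually share a prefix, which is what makes the hash usable as a partitioning key; points on either side of a cell border do not, which is why helpers such as `neighbours` exist.
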
4) Geofences aggregated by GeoHash
[Diagram: geofence (Table, geofences by id) goes through "Enrich with GeoHash" (geo_hash()) into geofences & geohash (Stream), which is grouped into "Geofences gpby geohash" (Table); vehicle position (Stream) goes through "Enrich with GeoHash" (geo_hash()) into position & geohash (Stream); "Join Position & Geofences" calls geo_fence_bulk() and produces geofence status (Stream).]
Sample position & geohash message:
    { "geohash":"9yz", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
Sample geofences & geohash messages:
    { "geohash":"9yz", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015}
    { "geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

4) Geofences aggregated by GeoHash
    CREATE STREAM a04_geo_fence_by_geohash_s
    AS
    SELECT geo_hash(geometry_wkt, 3)[0] geo_hash, id, name, geometry_wkt
    FROM a04_geo_fence_s
    PARTITION BY geo_hash;

    INSERT INTO a04_geo_fence_by_geohash_s
    SELECT geo_hash(geometry_wkt, 3)[1] geo_hash, id, name, geometry_wkt
    FROM a04_geo_fence_s
    WHERE geo_hash(geometry_wkt, 3)[1] IS NOT NULL
    PARTITION BY geo_hash;

    INSERT INTO a04_geo_fence_by_geohash_s
    SELECT ...
There is no explode() functionality in KSQL!
https://github.com/confluentinc/ksql/issues/527

4) Geofences aggregated by GeoHash
    CREATE TABLE a04_geo_fence_by_geohash_t
    AS
    SELECT geo_hash,
           COLLECT_SET(id + ':' + geometry_wkt) AS id_geometry_wkt_list,
           COLLECT_SET(id) id_list
    FROM a04_geo_fence_by_geohash_s
    GROUP BY geo_hash;

    CREATE STREAM a04_vehicle_position_by_geohash_s
    AS
    SELECT vp.id, vp.latitude, vp.longitude,
           geo_hash(vp.latitude, vp.longitude, 3) geo_hash
    FROM vehicle_position_s vp
    PARTITION BY geo_hash;

4) Geofences aggregated by GeoHash
    CREATE STREAM a04_geo_fence_status_s
    AS
    SELECT vp.geo_hash, vp.id, vp.latitude, vp.longitude,
           geo_fence_bulk(vp.latitude, vp.longitude, gf.id_geometry_wkt_list) AS fence_status
    FROM a04_vehicle_position_by_geohash_s vp
    LEFT JOIN a04_geo_fence_by_geohash_t gf
    ON (vp.geo_hash = gf.geo_hash);

    ksql> SELECT * FROM a04_geo_fence_status_s;
    u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
    u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
    9yz | 12 | 38.34409 | -90.15034 | [2:OUTSIDE, 1:OUTSIDE]
    ...
The status list has as many entries as there are geo-fences in the geohash.

4a) Geofences aggregated by GeoHash
[Diagram: same pipeline as in 4): geofence (Table, geofences by id) and vehicle position (Stream) are each enriched with geo_hash(); geofences are grouped into "Geofences gpby geohash" (Table); "Join Position & Geofences" calls geo_fence_bulk() and produces udf status (Stream), from which geofence status is derived.]
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

4b) Geofences aggregated by GeoHash
[Diagram: geofence (Table) and vehicle position (Stream) are each enriched with geo_hash(); geofences are grouped into "Geofences gpby geohash"; "Join Position & Geofences" produces position & geofence (Stream); "Explode Geofences" calls geo_fence() and produces geofence status (Stream).]
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

4b) Geofences aggregated by GeoHash
    CREATE STREAM a04b_geofence_udf_status_s
    AS
    SELECT id, latitude, longitude,
           id_list[0] AS geofence_id,
           geo_fence(latitude, longitude, geometry_wkt_list[0]) AS geofence_status
    FROM a04_vehicle_position_by_geohash_s vp
    LEFT JOIN a04_geo_fence_by_geohash_t gf
    ON (vp.geo_hash = gf.geo_hash);

    INSERT INTO a04b_geofence_udf_status_s
    SELECT id, latitude, longitude,
           id_list[1] geofence_id,
           geo_fence(latitude, longitude, geometry_wkt_list[1]) AS geofence_status
    FROM a04_vehicle_position_by_geohash_s vp
    LEFT JOIN a04_geo_fence_by_geohash_t gf
    ON (vp.geo_hash = gf.geo_hash)
    WHERE id_list[1] IS NOT NULL;

Implementing using Tile38

Tile38
https://tile38.com
• Open Source Geospatial Database & Geofencing Server
• Real Time Geofencing
• Roaming Geofencing
• Fast Spatial Indices
• Pluggable Event Notifications

Tile38 – How does it work?
Define a geofence channel and subscribe to it:
    > SETCHAN berlin WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
    > SUBSCRIBE berlin
    {"ok":true,"command":"subscribe","channel":"berlin","num":1,"elapsed":"5.85 µs"}
    . . .
    {"command":"set","group":"5d07581689807d000193ac33","detect":"outside","hook":"berlin","key":"vehicle","time":"2019-06-17T09:06:30.624923584Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
Position update that triggers the notification:
    SET vehicle 10 POINT 52.4497 13.3096

Tile38 – How does it work?
Define a hook that publishes geofence events to a Kafka topic:
    > SETHOOK berlin_hook kafka://broker-1:9092/tile38_geofence_status WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
Consume the events from Kafka:
    bigdata@bigdata:~$ kafkacat -b localhost -t tile38_geofence_status
    % Auto-selecting Consumer mode (use -P or -C to override)
    {"command":"set","group":"5d07581689807d000193ac34","detect":"outside","hook":"berlin_hook","key":"vehicle","time":"2019-06-17T09:12:00.488599119Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
Position update that triggers the notification:
    SET vehicle 10 POINT 52.4497 13.3096

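The geofence messages in this deck carry their geometry as WKT, while the SETCHAN and SETHOOK commands above take a GeoJSON object. The slides do not show how the two are bridged, so the following is only a minimal sketch under that assumption: a naive string conversion of the outer ring of a single WKT POLYGON into the GeoJSON form used in the commands. The class `WktToGeoJsonSketch` and its method are hypothetical, and the triangle in `main` is a made-up example.

```java
/** Hypothetical sketch: converts "POLYGON ((x y, ...))" to a GeoJSON Polygon string. */
final class WktToGeoJsonSketch {

    /** Only the outer ring of a single polygon is handled; holes are ignored. */
    static String polygonToGeoJson(String wkt) {
        int start = wkt.indexOf("((");
        int end = wkt.indexOf(')', start);
        if (start < 0 || end < 0) {
            throw new IllegalArgumentException("Not a WKT polygon: " + wkt);
        }
        StringBuilder json = new StringBuilder("{\"type\":\"Polygon\",\"coordinates\":[[");
        String[] pairs = wkt.substring(start + 2, end).split(",");
        for (int i = 0; i < pairs.length; i++) {
            // WKT and GeoJSON both use x y order, i.e. longitude first.
            String[] xy = pairs[i].trim().split("\\s+");
            if (i > 0) {
                json.append(',');
            }
            json.append('[').append(xy[0]).append(',').append(xy[1]).append(']');
        }
        return json.append("]]}").toString();
    }

    public static void main(String[] args) {
        // Hypothetical triangle, not one of the polygons used in the talk.
        System.out.println(polygonToGeoJson("POLYGON ((13.0 52.0, 14.0 52.0, 14.0 53.0, 13.0 52.0))"));
    }
}
```

The resulting string could then be appended after `FENCE OBJECT` when a fence is registered; a geometry library is the safer route for anything beyond simple polygons.
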
1) Enrich with GeoFences – aggregated by geohash
[Diagram: geofence (Stream) goes through "Invoke UDF", which calls set_fence() on the Geofence Service; vehicle position (Stream) goes through "Invoke UDF", which calls set_pos() on the Geofence Service and produces udf status (Stream); the Geofence Service publishes geofence status.]
Sample vehicle position message:
    { "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

2) Using Custom Kafka Connector for Tile38
[Diagram: the geofence and vehicle position topics are each read by a kafka-to-tile38 connector that writes to the Geofence Service, which publishes geofence status.]
Sample vehicle position message:
    { "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311}
[Rating sliders: Scalable, Latency, "Code Smell" (low / medium / high)]

2) Using Custom Kafka Connector for Tile38
    curl -X PUT \
      /api/kafka-connect-1/connectors/Tile38SinkConnector/config \
      -H 'Content-Type: application/json' \
      -H 'Accept: application/json' \
      -d '{
        "connector.class": "com.trivadis.geofence.kafka.connect.Tile38SinkConnector",
        "topics": "vehicle_position",
        "tasks.max": "1",
        "tile38.key": "vehicle",
        "tile38.operation": "SET",
        "tile38.hosts": "tile38:9851"
      }'
Currently only supports SET command

Visualization using Arcadia Data

Arcadia Data
https://www.arcadiadata.com/

Summary

Outlook
• Geofencing is doable using Kafka and KSQL
• KSQL is similar to SQL, but don't think relational
• UDFs and UDAFs are a powerful way to extend KSQL
• Use Geo Hashes to partition work
• Outlook
  • Performance Tests
  • Cleanup code of UDFs and UDAFs
  • Implement Kafka Source Connector for Tile38

Technology on its own won't help you. You need to know how to use it properly.

Location Analytics - Real Time Geofencing using Apache Kafka

  • 1.
    gschmutz Location Analytics –Real-Time Geofencing using Kafka Berlin Buzzwords 2019 Guido Schmutz (guido.schmutz@trivadis.com) gschmutz http://guidoschmutz.wordpress.com
  • 2.
    gschmutz Agenda Location Analytics –Real-Time Geofencing using Kafka 1. Introduction & Motivation 2. Implementing using KSQL 3. Implementing using Tile38 4. Visualization using ArcadiaData 5. Summary
  • 3.
    gschmutz Guido Schmutz Location Analytics– Real-Time Geofencing using Kafka Working at Trivadis for more than 22 years Oracle Groundbreaker Ambassador & Oracle ACE Director Consultant, Trainer, Software Architect for Java, AWS, Azure, Oracle Cloud, SOA and Big Data / Fast Data Platform Architect & Head of Trivadis Architecture Board More than 30 years of software development experience Contact: guido.schmutz@trivadis.com Blog: http://guidoschmutz.wordpress.com Slideshare: http://www.slideshare.net/gschmutz Twitter: gschmutz 155th edition
  • 4.
    gschmutzLocation Analytics –Real-Time Geofencing using Kafka Introduction
  • 5.
    gschmutz Geofencing – Whatis it? Location Analytics – Real-Time Geofencing using Kafka the use of GPS or RFID technology to create a virtual geographic boundary, enabling software to trigger a response when a object/device enters or leaves a particular area Possible Events • OUTSIDE • lNSIDE • ENTER • EXIT Source: https://tile38.com
  • 6.
    gschmutz Geofencing – Whatcan we do with it? Location Analytics – Real-Time Geofencing using Kafka • On-Demand and Delivery Services - assign orders to an area's designated service provider • On-Demand Transportation - track Electronic Transportation Devices and their distance from charging stations • Transportation Management - track flow of people using public transport systems • Commercial Real Estate - Identify how many people drive or walk by a specific location • Retail Shopper Guidance - Guide customer to a specific product once they are in your store • Property Security - Open or lock doors as individuals with designated devices approach or leave a building or vehicle. • Property Control - restrict vehicles to be operational only inside a geofenced area – like drones or construction equipment
  • 7.
    gschmutz Geo-Processing Location Analytics –Real-Time Geofencing using Kafka Well-known text (WKT) is a text markup language for representing vector geometry objects on a map GeoTools is a free software GIS toolkit for developing standards compliant solutions
  • 8.
    gschmutz Apache Kafka –A Streaming Platform Source Connector trucking_ driver Kafka Broker Sink Connector Stream Processing Location Analytics – Real-Time Geofencing using Kafka
  • 9.
    gschmutz Dash board High Level Overviewof Use Case Location Analytics – Real-Time Geofencing using Kafka geofence Join Position & Geofences Vehicle Position object position pos & geofences Geo fencing geofence status key=10 { "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311} key=3 {"id":3,"name":"Berlin, Germany","geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))","last_update":1560607149015} Geofence Mgmt Vehicle Position Weather Service
  • 10.
    gschmutzLocation Analytics –Real-Time Geofencing using Kafka Implementing using KSQL
  • 11.
    gschmutz KSQL - Overview •Stream • unbounded sequence of structured data ("facts") • Facts in a stream are immutable • Table • collected state of a stream • Latest value for each key in a stream • Facts in a table are mutable • Stream Processing with zero coding using SQL-like language • Stream and Table as first-class citizens trucking_ driver Kafka Broker KSQL Engine Kafka Streams KSQL CLI Commands Location Analytics – Real-Time Geofencing using Kafka
  • 12.
    gschmutz KSQL – Streamsand Tabless Location Analytics – Real-Time Geofencing using Kafka geofence Table vehicle position Stream CREATE STREAM vehicle_position_s (id VARCHAR, latitude DOUBLE, longitude DOUBLE) WITH (KAFKA_TOPIC='vehicle_position', VALUE_FORMAT='DELIMITED'); CREATE TABLE geo_fence_t (id BIGINT, name VARCHAR, geometry_wkt VARCHAR) WITH (KAFKA_TOPIC='geo_fence', VALUE_FORMAT='JSON', KEY = 'id');KSQL Geofencing
  • 13.
    gschmutz How to determine"inside" or "outside" geofence? Location Analytics – Real-Time Geofencing using Kafka Only one standard UDF for geo processing in KSQL: GEO_DISTANCE Implement custom UDF using functionality from GeoTools Java library public String geo_fence(final double latitude, final double longitude, final String geometryWKT){ .. } public List<String> geo_fence_bulk(final double latitude , final double longitude, List<String> idGeometryListWKT) { .. } ksql> SELECT geo_fence(latitude, longitude, ' POLYGON ((13.297920227050781 52.56195151687443, 13.2440185546875 52.530216577830124, ...))') FROM test_geo_udf_s; 52.4497 | 13.3096 | OUTSIDE 52.4556 | 13.3178 | INSIDE
  • 14.
    gschmutz Custom UDF todetermine if Point is inside a geometry Location Analytics – Real-Time Geofencing using Kafka @Udf(description = "determines if a lat/long is inside or outside the geometry passed as the 3rd parameter as WKT encoded ...") public String geo_fence(final double latitude, final double longitude, final String geometryWKT) { String status = ""; GeometryFactory geometryFactory = JTSFactoryFinder.getGeometryFactory(); WKTReader reader = new WKTReader(geometryFactory); Polygon polygon = (Polygon) reader.read(geometryWKT); Coordinate coord = new Coordinate(longitude, latitude); Point point = geometryFactory.createPoint(coord); if (point.within(polygon)) { status = "INSIDE"; } else { status = "OUTSIDE"; } return status; }
  • 15.
    gschmutz 1) Using CrossJoin Location Analytics – Real-Time Geofencing using Kafka geofence Table Join Position & Geofences vehicle position Stream Stream pos & geofences CREATE STREAM vp_join_gf_s AS SELECT vp.id, vp.latitude, vp.longitude, gf.geometry_wkt FROM vehicle_position_s AS vp CROSS JOIN geo_fence_t AS gf There is no Cross Join in KSQL!
  • 16.
    gschmutz 2) INNER Join LocationAnalytics – Real-Time Geofencing using Kafka geofence Stream Join Position & Geofences { "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311} vehicle position Stream Stream pos & geofences { "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} { "group":1", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015} Enrich Group Table geofences by group 1 Enrich Group Stream postion by group 1 Cannot insert into Table from Stream >INSERT INTO geo_fence_t >SELECT '1' AS group_id, geof.id, … >FROM geo_fence_s geof; INSERT INTO can only be used to insert into a stream. A02_GEO_FENCE_T is a table.
  • 17.
    gschmutz 3) Geofences aggregatedin one group Location Analytics – Real-Time Geofencing using Kafka Join Position & Geofences Stream geofence status Geofences aggby group Table { "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} {"vehicle_id":10", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015} geo_fence_bulk geofence Stream vehicle position Stream { "group":1", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} { "group":1", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015} Enrich With Group-1 Stream geofences by group 1 Enrich With Group-1 Stream postion by group 1 geofences by id { "group":"1", "id" : "10", "latitude" : 38.35821, "longitude" : -90.15311} high low low high low high Scalable Latency "Code Smell" medium medium medium
  • 18.
    gschmutz 3) Geofences aggregatedin one group Location Analytics – Real-Time Geofencing using Kafka CREATE TABLE a03_geo_fence_aggby_group_t AS SELECT group_id , collect_set(id + ':' + geometry_wkt) AS id_geometry_wkt_list FROM a03_geo_fence_by_group_s geof GROUP BY group_id; CREATE STREAM a03_vehicle_position_by_group_s AS SELECT '1' group_id, vehp.id, vehp.latitude, vehp.longitude FROM vehicle_position_s vehp PARTITION BY group_id;
  • 19.
    gschmutz 3) Geofences aggregatedin one group Location Analytics – Real-Time Geofencing using Kafka ksql> SELECT * FROM a03_geo_fence_status_s; 46 | 52.47546 | 13.34851 | [1:OUTSIDE, 3:INSIDE] 46 | 52.47521 | 13.34881 | [1:OUTSIDE, 3:INSIDE] ... CREATE STREAM a03_geo_fence_status_s AS SELECT vehp.id, vehp.latitude, vehp.longitude, geo_fence_bulk(vehp.latitude, vehp.longitude, geofaggid_geometry_wkt_list) AS geofence_status FROM a03_vehicle_position_by_group_s vehp LEFT JOIN a03_geo_fence_aggby_group_t geofagg ON vehp.group_id = geofagg.group_id; As many as there are geo-fences
  • 20.
    gschmutz Geo Hash fora better distribution Geohash is a geocoding which encodes a geographic location into a short string of letters and digits Length Area width x height 1 5,009.4km x 4,992.6km 2 1,252.3km x 624.1km 3 156.5km x 156km 4 39.1km x 19.5km 12 3.7cm x 1.9cm http://geohash.gofreerange.com/ Location Analytics – Real-Time Geofencing using Kafka
  • 21.
    gschmutz Geo Hash CustomUDF Location Analytics – Real-Time Geofencing using Kafka ksql> SELECT latitude, longitude, geo_hash(latitude, longitude, 3) >FROM test_geo_udf_s; 38.484769753492536 | -90.23345947265625 | 9yz public String geohash(final double latitude, final double longitude, int length) public List<String> neighbours(String geohash) public String adjacentHash(String geohash, String directionString) public List<String> coverBoundingBox(String geometryWKT, int length) ksql> SELECT geometry_wkt, geo_hash(geometry_wkt, 5) >FROM test_geo_udf_s; POLYGON ((-90.23345947265625 38.484769753492536, -90.25886535644531 38.47455675836861, ...)) | [9yzf6, 9yzf7, 9yzfd, 9yzfe, 9yzff, 9yzfg, 9yzfk, 9yzfs, 9yzfu]
  • 22.
    gschmutz 4) Geofences aggregatedby GeoHash Location Analytics – Real-Time Geofencing using Kafka Join Position & Geofences Stream geofence status Geofences gpby geohash Table { "geohash":9yz", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} {"geohash":"u33", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015} geo_fence_bulk() geofence Table vehicle position Stream { "geohash":9yz", "name":"St. Louis", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} { "group":1", "name":"Berlin", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536,…))", "last_update":1560607149015} Enrich with GeoHash Stream geofences & geohash Enrich with GeoHash Stream position & geohash geofences by id geo_hash() geo_hash() { "geohash":"u33", "id" : "10", "latitude" : 38.35821, "longitude" : - 90.15311} high low low high low high Scalable Latency "Code Smell" medium medium medium
  • 23.
    gschmutz 4) Geofences aggregatedby GeoHash Location Analytics – Real-Time Geofencing using Kafka CREATE STREAM a04_geo_fence_by_geohash_s AS SELECT geo_hash(geometry_wkt, 3)[0] geo_hash, id, name, geometry_wkt FROM a04_geo_fence_s PARTITION by geo_hash; INSERT INTO a04_geo_fence_by_geohash_s SELECT geo_hash(geometry_wkt, 3)[1] geo_hash, id, name, geometry_wkt FROM a04_geo_fence_s WHERE geo_hash(geometry_wkt, 3)[1] IS NOT NULL PARTITION BY geo_hash;s INSERT INTO a04_geo_fence_by_geohash_s SELECT ... There is no explode() functionality in KSQL! https://github.com/confluentinc/ksql/issues/527
  • 24.
    gschmutz 4) Geofences aggregatedby GeoHash Location Analytics – Real-Time Geofencing using Kafka CREATE TABLE a04_geo_fence_by_geohash_t AS SELECT geo_hash, COLLECT_SET(id + ':' + geometry_wkt) AS id_geometry_wkt_list, COLLECT_SET(id) id_list FROM a04_geo_fence_by_geohash_s GROUP BY geo_hash; CREATE STREAM a04_vehicle_position_by_geohash_s AS SELECT vp.id, vp.latitude, vp.longitude, geo_hash(vp.latitude, vp.longitude, 3) geo_hash FROM vehicle_position_s vp PARTITION BY geo_hash;
  • 25.
    4) Geofences aggregated by GeoHash
    CREATE STREAM a04_geo_fence_status_s AS
      SELECT vp.geo_hash, vp.id, vp.latitude, vp.longitude,
             geo_fence_bulk(vp.latitude, vp.longitude, gf.id_geometry_wkt_list) AS fence_status
      FROM a04_vehicle_position_by_geohash_s vp
      LEFT JOIN a04_geo_fence_by_geohash_t gf
        ON (vp.geo_hash = gf.geo_hash);
    ksql> SELECT * FROM a04_geo_fence_status_s;
    u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
    u33 | 46 | 52.3906 | 13.1599 | [3:OUTSIDE]
    9yz | 12 | 38.34409 | -90.15034 | [2:OUTSIDE, 1:OUTSIDE]
    ...
    The fence_status list has as many entries as there are geofences in the geohash.
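    The geo_fence_bulk() UDF itself is not shown on the slides. The following is only a sketch of the idea, assuming each list entry has the form "id:WKT polygon" (as produced by COLLECT_SET(id + ':' + geometry_wkt)) and using a plain ray-casting test; the helper names `parse_wkt_polygon` and `contains` are hypothetical, and only INSIDE / OUTSIDE are reported:

```python
import re
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude), the WKT axis order


def parse_wkt_polygon(wkt: str) -> List[Point]:
    """Parse the outer ring of a simple 'POLYGON ((x y, x y, ...))' string."""
    match = re.match(r"\s*POLYGON\s*\(\(([^()]*)\)", wkt, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported WKT: " + wkt)
    ring = []
    for pair in match.group(1).split(","):
        x, y = pair.split()
        ring.append((float(x), float(y)))
    return ring


def contains(ring: List[Point], lon: float, lat: float) -> bool:
    """Ray casting: count how often a ray to the east crosses the ring."""
    inside = False
    j = len(ring) - 1
    for i in range(len(ring)):
        xi, yi = ring[i]
        xj, yj = ring[j]
        if (yi > lat) != (yj > lat):
            x_cross = xi + (lat - yi) * (xj - xi) / (yj - yi)
            if lon < x_cross:
                inside = not inside
        j = i
    return inside


def geo_fence_bulk(lat: float, lon: float, id_geometry_wkt_list: List[str]) -> List[str]:
    """Return one 'id:STATUS' entry per geofence in the list."""
    statuses = []
    for entry in id_geometry_wkt_list:
        fence_id, wkt = entry.split(":", 1)
        status = "INSIDE" if contains(parse_wkt_polygon(wkt), lon, lat) else "OUTSIDE"
        statuses.append(fence_id + ":" + status)
    return statuses


# Made-up square fences for illustration only.
fences = [
    "1:POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))",
    "2:POLYGON ((20 20, 30 20, 30 30, 20 30, 20 20))",
]
result = geo_fence_bulk(5.0, 5.0, fences)
print(result)
```

    Because only the geofences of one geohash cell are passed in, the per-position work stays proportional to the number of fences in that cell rather than to all fences.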
  • 26.
    4a) Geofences aggregated by GeoHash
    [Diagram] Elements: geofence Table (geofences by id); vehicle position Stream; geo_hash() (Enrich with GeoHash) on both inputs; Stream geofences & geohash; Stream position & geohash; Table geofences grouped by geohash; Join Position & Geofences; geo_fence_bulk(); Stream udf status; geofence status.
    Sample geofence records: {"group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} {"geohash":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"geohash":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
    Sample position record: {"geohash":"9yz", "id":"10", "latitude":38.35821, "longitude":-90.15311}
    [Rating scales, each low / medium / high: Scalable, Latency, "Code Smell"]
  • 27.
    4b) Geofences aggregated by GeoHash
    [Diagram] Elements: geofence Table; vehicle position Stream; geo_hash() (Enrich with GeoHash) on both inputs; Stream geofences & geohash; Stream position & geohash; Table geofences grouped by geohash; Join Position & Geofences; Stream position & geofence; Explode Geofences; geo_fence(); Stream geofence status.
    Sample geofence records: {"group":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015} {"geohash":"1", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"group":"1", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
    Sample position record: {"geohash":"9yz", "id":"10", "latitude":38.35821, "longitude":-90.15311}
    [Rating scales, each low / medium / high: Scalable, Latency, "Code Smell"]
  • 28.
    4b) Geofences aggregated by GeoHash
    CREATE STREAM a04b_geofence_udf_status_s AS
      SELECT id, latitude, longitude,
             id_list[0] AS geofence_id,
             geo_fence(latitude, longitude, geometry_wkt_list[0]) AS geofence_status
      FROM a04_vehicle_position_by_geohash_s vp
      LEFT JOIN a04_geo_fence_by_geohash_t gf
        ON (vp.geo_hash = gf.geo_hash);
    INSERT INTO a04b_geofence_udf_status_s
      SELECT id, latitude, longitude,
             id_list[1] geofence_id,
             geo_fence(latitude, longitude, geometry_wkt_list[1]) AS geofence_status
      FROM a04_vehicle_position_by_geohash_s vp
      LEFT JOIN a04_geo_fence_by_geohash_t gf
        ON (vp.geo_hash = gf.geo_hash)
      WHERE id_list[1] IS NOT NULL;
  • 29.
    Implementing using Tile38
  • 30.
    Tile38 (https://tile38.com)
    Open Source Geospatial Database & Geofencing Server
    • Real Time Geofencing
    • Roaming Geofencing
    • Fast Spatial Indices
    • Pluggable Event Notifications
  • 31.
    Tile38 – How does it work?
    > SETCHAN berlin WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
    > SUBSCRIBE berlin
    {"ok":true,"command":"subscribe","channel":"berlin","num":1,"elapsed":"5.85µs"}
    . . .
    {"command":"set","group":"5d07581689807d000193ac33","detect":"outside","hook":"berlin","key":"vehicle","time":"2019-06-17T09:06:30.624923584Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
    Command that triggers the notification: SET vehicle 10 POINT 52.4497 13.3096
  • 32.
    Tile38 – How does it work?
    > SETHOOK berlin_hook kafka://broker-1:9092/tile38_geofence_status WITHIN vehicle FENCE OBJECT {"type":"Polygon","coordinates":[[[13.297920227050781,52.56195151687443],[13.2440185546875,52.530216577830124],[13.267364501953125,52.45998421679598],[13.35113525390625,52.44826791583386],[13.405036926269531,52.44952338289473],[13.501167297363281,52.47148826410652], ...]]}
    bigdata@bigdata:~$ kafkacat -b localhost -t tile38_geofence_status
    % Auto-selecting Consumer mode (use -P or -C to override)
    {"command":"set","group":"5d07581689807d000193ac34","detect":"outside","hook":"berlin_hook","key":"vehicle","time":"2019-06-17T09:12:00.488599119Z","id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}
    Command that triggers the notification: SET vehicle 10 POINT 52.4497 13.3096
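    A consumer of the tile38_geofence_status topic has to interpret these JSON notifications. As a minimal sketch (the `GeofenceEvent` type and `parse_event` function are hypothetical, and the Kafka consumer itself is left out), the message value can be decoded like this; note that the GeoJSON point is [longitude, latitude], while the SET command takes latitude first:

```python
import json
from typing import NamedTuple


class GeofenceEvent(NamedTuple):
    hook: str
    object_id: str
    detect: str
    latitude: float
    longitude: float


def parse_event(payload: str) -> GeofenceEvent:
    """Decode one Tile38 geofence notification (the Kafka message value)."""
    doc = json.loads(payload)
    lon, lat = doc["object"]["coordinates"]  # GeoJSON order: [longitude, latitude]
    return GeofenceEvent(doc["hook"], doc["id"], doc["detect"], lat, lon)


# Notification as shown on the slide.
payload = (
    '{"command":"set","group":"5d07581689807d000193ac34","detect":"outside",'
    '"hook":"berlin_hook","key":"vehicle","time":"2019-06-17T09:12:00.488599119Z",'
    '"id":"10","object":{"type":"Point","coordinates":[13.3096,52.4497]}}'
)
event = parse_event(payload)
print(event)
```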
  • 33.
    1) Enrich with GeoFences – aggregated by geohash
    [Diagram] geofence Stream -> Invoke UDF set_fence() -> Geofence Service. vehicle position Stream -> Invoke UDF set_pos() -> Geofence Service. Further elements: Stream udf status; geofence status.
    Sample geofence records: {"vehicle_id":"10", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
    Sample position record: {"id":"10", "latitude":38.35821, "longitude":-90.15311}
    [Rating scales, each low / medium / high: Scalable, Latency, "Code Smell"]
  • 34.
    2) Using Custom Kafka Connector for Tile38
    [Diagram] geofence -> kafka-to-tile38 -> Geofence Service. vehicle position -> kafka-to-tile38 -> Geofence Service. Geofence Service -> geofence status.
    Sample geofence records: {"vehicle_id":"10", "name":"St. Louis", "geometry_wkt":"POLYGON ((-90.23345947265625 38.484769753492536, …))", "last_update":1560607149015} {"vehicle_id":"10", "name":"Berlin", "geometry_wkt":"POLYGON ((13.297920227050781 52.56195151687443, …))", "last_update":1560607149015}
    Sample position record: {"id":"10", "latitude":38.35821, "longitude":-90.15311}
    [Rating scales, each low / medium / high: Scalable, Latency, "Code Smell"]
  • 35.
    2) Using Custom Kafka Connector for Tile38
    curl -X PUT /api/kafka-connect-1/connectors/Tile38SinkConnector/config \
      -H 'Content-Type: application/json' \
      -H 'Accept: application/json' \
      -d '{
        "connector.class": "com.trivadis.geofence.kafka.connect.Tile38SinkConnector",
        "topics": "vehicle_position",
        "tasks.max": "1",
        "tile38.key": "vehicle",
        "tile38.operation": "SET",
        "tile38.hosts": "tile38:9851"
      }'
    The connector currently supports only the SET command.
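    The connector code is not part of the slides. Conceptually, a sink for the vehicle_position topic has to turn each record into a Tile38 SET command. The following sketch shows only that mapping, assuming the record's "id" field becomes the Tile38 object id and "tile38.key" the collection name; the function `to_tile38_set` is hypothetical and no connection to Tile38 is made:

```python
import json


def to_tile38_set(record_value: str, key: str = "vehicle") -> str:
    """Build the Tile38 command 'SET <key> <id> POINT <lat> <lon>' for one record."""
    record = json.loads(record_value)
    return "SET {} {} POINT {} {}".format(
        key, record["id"], record["latitude"], record["longitude"]
    )


# Position record as shown on the slide.
command = to_tile38_set('{"id": "10", "latitude": 38.35821, "longitude": -90.15311}')
print(command)  # SET vehicle 10 POINT 38.35821 -90.15311
```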
  • 36.
    Visualization using Arcadia Data
  • 37.
    Arcadia Data https://www.arcadiadata.com/
  • 38.
    Summary
  • 39.
    Outlook
    • Geofencing is doable using Kafka and KSQL
    • KSQL is similar to SQL, but don't think relational
    • UDFs and UDAFs are a powerful way to extend KSQL
    • Use GeoHashes to partition work
    • Outlook
      • Performance tests
      • Clean up the code of the UDFs and UDAFs
      • Implement a Kafka Source Connector for Tile38
  • 40.
    Technology on its own won't help you. You need to know how to use it properly.