Data Federation with Apache Spark

Data Federation with Spark
Dan Marshall
danmarshall07@gmail.com
06/13/2017

HBase Source

hbase(main):004:0* create 'hb_dept','cf1'
=> Hbase::Table - hb_dept
hbase(main):008:0* put 'hb_dept','M1','cf1:dept_name','Maintenance'
hbase(main):009:0> put 'hb_dept','F1','cf1:dept_name','Entertainment'
hbase(main):010:0> put 'hb_dept','S2','cf1:dept_name','Sports'
hbase(main):012:0* scan 'hb_dept'
ROW     COLUMN+CELL
 F1     column=cf1:dept_name, timestamp=1496621309775, value=Entertainment
 M1     column=cf1:dept_name, timestamp=1496621309741, value=Maintenance
 S2     column=cf1:dept_name, timestamp=1496621309863, value=Sports
3 row(s) in 0.0590 seconds
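One way to expose `hb_dept` to Spark SQL is the Hortonworks Spark-HBase Connector (SHC). The slides shown here do not say which connector was used, so treat the connector choice as an assumption. The DataFrame column name `dept_id` for the row key and the `load_hb_dept` helper are hypothetical names chosen for illustration. A minimal sketch:

```python
import json

# SHC catalog that maps the HBase table 'hb_dept' to DataFrame columns.
# Assumption: the row key (M1, F1, S2) is exposed as a string column
# named dept_id; that column name is illustrative, not from the slides.
HB_DEPT_CATALOG = json.dumps({
    "table": {"namespace": "default", "name": "hb_dept"},
    "rowkey": "key",
    "columns": {
        "dept_id": {"cf": "rowkey", "col": "key", "type": "string"},
        "dept_name": {"cf": "cf1", "col": "dept_name", "type": "string"},
    },
})


def load_hb_dept(spark):
    """Read hb_dept through SHC and register it as a temp view.

    `spark` is an existing SparkSession that has the SHC jar and the
    HBase client configuration available.
    """
    df = (spark.read
          .options(catalog=HB_DEPT_CATALOG)
          .format("org.apache.spark.sql.execution.datasources.hbase")
          .load())
    df.createOrReplaceTempView("hb_dept")
    return df
```

Once the view is registered, `spark.sql("SELECT * FROM hb_dept")` should return the three department rows listed in the scan above.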

Cassandra Source

Connected to Test Cluster at cassandra:9042.
[cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> use mykeyspace;
cqlsh:mykeyspace> create table bonus_table (userid int primary key, bonus_amount decimal);
cqlsh:mykeyspace> insert into bonus_table (userid, bonus_amount) values (1, 500.00);
cqlsh:mykeyspace> insert into bonus_table (userid, bonus_amount) values (4, 1000.00);
cqlsh:mykeyspace> select * from bonus_table;

 userid | bonus_amount
--------+--------------
      1 |       500.00
      4 |      1000.00

(2 rows)

Use SQL on DataFrame from Cassandra Source
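The code for this slide did not survive in the text. The following is a minimal sketch of the idea, assuming the DataStax spark-cassandra-connector is on the classpath and `spark.cassandra.connection.host` points at the Cassandra node. The view name `bonus`, the example query, and the helper names are hypothetical.

```python
def cassandra_options(keyspace, table):
    """Options accepted by the spark-cassandra-connector data source."""
    return {"keyspace": keyspace, "table": table}


# Example query against the registered view; the threshold is illustrative.
BONUS_SQL = "SELECT userid, bonus_amount FROM bonus WHERE bonus_amount >= 1000"


def query_bonus(spark):
    """Load mykeyspace.bonus_table as a DataFrame and query it with SQL.

    `spark` is an existing SparkSession configured for Cassandra.
    """
    df = (spark.read
          .format("org.apache.spark.sql.cassandra")
          .options(**cassandra_options("mykeyspace", "bonus_table"))
          .load())
    # Registering a temp view is what makes the DataFrame visible to spark.sql.
    df.createOrReplaceTempView("bonus")
    return spark.sql(BONUS_SQL)
```

Calling `query_bonus(spark).show()` runs the SQL inside Spark; the Cassandra table itself is only read, never modified.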

Join – HBase, PostgreSQL, Cassandra
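The PostgreSQL side of this join is not shown in this part of the deck. A PostgreSQL table can be pulled into the same Spark session through Spark's built-in JDBC data source. In the sketch below the host, database, table, credentials, and view name are all placeholders supplied by the caller, not values from the slides, and port 5432 is assumed as the PostgreSQL default.

```python
def postgres_jdbc_options(host, database, table, user, password):
    """Build options for Spark's JDBC data source (PostgreSQL driver).

    Assumes PostgreSQL listens on its default port 5432.
    """
    return {
        "url": "jdbc:postgresql://{}:5432/{}".format(host, database),
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def load_postgres_table(spark, view_name, **conn):
    """Read one PostgreSQL table and register it as a temp view.

    `spark` is an existing SparkSession with the PostgreSQL JDBC jar
    available; `conn` carries the keyword arguments of
    postgres_jdbc_options.
    """
    df = (spark.read
          .format("jdbc")
          .options(**postgres_jdbc_options(**conn))
          .load())
    df.createOrReplaceTempView(view_name)
    return df
```

With the HBase, PostgreSQL, and Cassandra tables each registered as a temp view, a single `spark.sql` statement can join across all three.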

JSON Source

{"dept":"F1"}
{"dept":"S2"}
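The two records above are in JSON Lines form (one object per line), which is what `spark.read.json` expects by default. A minimal sketch of loading them, where the file path, the view name `json_dept`, and the helper names are hypothetical:

```python
import json

# The two records from the slide, one JSON object per line.
DEPT_FILTER_LINES = ['{"dept":"F1"}', '{"dept":"S2"}']


def parse_locally(lines):
    """Check that every line parses as a standalone JSON object and
    return the dept codes, without needing a Spark session."""
    return [json.loads(line)["dept"] for line in lines]


def load_json_dept(spark, path):
    """Read a JSON Lines file with Spark and register it as a temp view.

    `spark` is an existing SparkSession; `path` points at a file that
    holds the records above.
    """
    df = spark.read.json(path)
    df.createOrReplaceTempView("json_dept")
    return df
```

Spark infers a single string column `dept` from these records, so the view can act as a department filter in later joins.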

Join – HBase, PostgreSQL, Cassandra, JSON
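The query for this slide is not preserved in the text. As a sketch of what a four-source join can look like, assume each source is already registered as a temp view: `hb_dept(dept_id, dept_name)` from HBase, `bonus(userid, bonus_amount)` from Cassandra, `json_dept(dept)` from the JSON file, and `pg_emp(userid, name, dept)` from PostgreSQL. All view names, and the entire `pg_emp` schema, are hypothetical.

```python
# Views the join expects to find in the Spark session (names are illustrative).
REQUIRED_VIEWS = ("pg_emp", "hb_dept", "bonus", "json_dept")

# The inner join against json_dept keeps only departments listed in the JSON file.
FEDERATED_SQL = """
SELECT e.userid, e.name, d.dept_name, b.bonus_amount
FROM pg_emp e
JOIN hb_dept d   ON e.dept = d.dept_id
JOIN bonus b     ON e.userid = b.userid
JOIN json_dept j ON e.dept = j.dept
"""


def missing_views(registered):
    """Return the required view names absent from `registered`."""
    return [v for v in REQUIRED_VIEWS if v not in registered]


def federated_join(spark):
    """Run the cross-source join once every view is registered.

    `spark` is an existing SparkSession.
    """
    registered = {t.name for t in spark.catalog.listTables()}
    missing = missing_views(registered)
    if missing:
        raise ValueError("views not registered: " + ", ".join(missing))
    return spark.sql(FEDERATED_SQL)
```

Because every source is just a temp view by this point, Spark plans the join as one query; the source systems only see the reads issued by their connectors.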
