Commit f5ac023

better READ ME
1 parent eaf48e9 commit f5ac023

1 file changed: +19 -1 lines changed

README.md

Lines changed: 19 additions & 1 deletion
@@ -1,5 +1,6 @@
 # Kafka / Cassandra / Spark Structured Streaming Example
-Stream the number of time **Drake is broadcasted** on each radio
+Stream the number of time **Drake is broadcasted** on each radio.
+And also, see how easy is Spark Structured Streaming to use using SparkSQL
 
 ## Input data
 Coming from radio stations stored inside a parquet file, the stream is emulated with ` .option("maxFilesPerTrigger", 1)` option.
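
Review note on the "Input data" context line: the README emulates a stream by letting each trigger pick up one parquet file. The sketch below mimics that idea in plain Python, without Spark, to show how a per-radio count for one artist grows trigger by trigger. The `Record` shape `(radio, artist, title)`, the function names, and the dict-of-files input are all assumptions made for illustration; none of them come from the repository.

```python
from collections import Counter
from typing import Dict, Iterator, List, Tuple

# Hypothetical record shape: (radio, artist, title). Not taken from the repo.
Record = Tuple[str, str, str]


def micro_batches(files: List[str], max_files_per_trigger: int = 1) -> Iterator[List[str]]:
    """Yield file names in chunks, one chunk per simulated trigger."""
    for start in range(0, len(files), max_files_per_trigger):
        yield files[start:start + max_files_per_trigger]


def simulate(
    files: Dict[str, List[Record]],
    max_files_per_trigger: int = 1,
    artist: str = "Drake",
) -> List[Dict[str, int]]:
    """Return one snapshot per trigger of the cumulative plays of `artist` per radio."""
    totals: Counter = Counter()
    snapshots: List[Dict[str, int]] = []
    # Files are processed in sorted name order so the result is deterministic.
    for chunk in micro_batches(sorted(files), max_files_per_trigger):
        for name in chunk:
            for radio, record_artist, _title in files[name]:
                if record_artist == artist:
                    totals[radio] += 1
        # Copy the running totals so later triggers do not mutate this snapshot.
        snapshots.append(dict(totals))
    return snapshots
```

With `max_files_per_trigger=1` each input file produces its own snapshot, which is the behaviour the README relies on; a larger value folds several files into one trigger and yields fewer snapshots.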
@@ -15,6 +16,7 @@ Cassandra's Sinks uses the [ForeachWriter](https://spark.apache.org/docs/latest/
 ### Kafka topic
 topic:test
 ### Cassandra Table
+A table for the ForeachWriter
 ```
 CREATE TABLE test.radio (
 radio varchar,
@@ -25,6 +27,17 @@ CREATE TABLE test.radio (
 );
 ```
 
+A second sink to test the other writer.
+```
+CREATE TABLE test.radioOtherSink (
+radio varchar,
+title varchar,
+artist varchar,
+count bigint,
+PRIMARY KEY (radio, title, artist)
+);
+```
+
 
 ```
 cqlsh> SELECT * FROM test.radio;
@@ -43,6 +56,11 @@ cqlsh> SELECT * FROM test.radio;
 
 ```
 
+## Useful links
+* https://databricks.com/blog/2017/04/04/real-time-end-to-end-integration-with-apache-kafka-in-apache-sparks-structured-streaming.html
+* https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#using-foreach
+* https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#output-modes
+
 ## Requirements
 * Cassandra 3.10
 * Kafka 0.10+ (with Zookeeper)
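
Review note on the Cassandra tables: the diff says the sink goes through a ForeachWriter and adds `test.radioOtherSink` with `PRIMARY KEY (radio, title, artist)`. The sketch below is a minimal in-memory stand-in for that kind of writer, meant only to show the `open` / `process` / `close` lifecycle and why repeated writes for the same key leave a single row. The class name, the dict-based row format, and the in-memory storage are assumptions; the repository's actual writer is not shown in this commit.

```python
from typing import Dict, Optional, Tuple

# (radio, title, artist), mirroring the PRIMARY KEY of test.radioOtherSink.
Key = Tuple[str, str, str]


class InMemoryRadioSink:
    """Hypothetical stand-in for a Cassandra-backed writer; keeps rows in a dict."""

    def __init__(self) -> None:
        self.rows: Dict[Key, int] = {}
        self.is_open = False

    def open(self, partition_id: int, epoch_id: int) -> bool:
        # Returning True signals that rows for this partition should be processed.
        self.is_open = True
        return True

    def process(self, row: Dict[str, object]) -> None:
        if not self.is_open:
            raise RuntimeError("process() called before open()")
        key: Key = (str(row["radio"]), str(row["title"]), str(row["artist"]))
        # A write for an existing key replaces the stored count, the same way a
        # CQL INSERT on an existing primary key overwrites the row.
        self.rows[key] = int(row["count"])  # type: ignore[arg-type]

    def close(self, error: Optional[Exception]) -> None:
        # A real writer would release its session here; this one only flips a flag.
        self.is_open = False
```

Because the count is overwritten rather than added, this shape fits a query that emits the full running total on every trigger. Spark's documentation describes `foreach` as accepting an object with these three methods, but whether this exact class plugs into a query unchanged has not been checked here.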
