Commit 7b6ffea

Update gen-avro-parquet.sh
1 parent b56b6da commit 7b6ffea

1 file changed: +1 -1 lines changed

gen-avro-parquet.sh

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 #this script transforms the HDFS stored CSV files to AVRO and Parquet formats using spark-shell and a small scala script
 cd /tmp
-#gets the ETL script into /tmp, the saved file will be called etl.scala
+#gets the ETL script into /tmp, the saved file will be called avro-parquet.scala
 wget -q https://raw.githubusercontent.com/academyofdata/clusterdock/master/avro-parquet.scala
 #runs spark-shell providing the script as input
 HADOOP_USER_NAME=spark spark-shell --packages com.databricks:spark-csv_2.10:1.5.0 com.databricks:spark-avro_2.10:3.2.0 -i avro-parquet.scala
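
Note on the unchanged last line: Spark documents `--packages` as taking a single comma-separated list of Maven coordinates, while the script separates the two coordinates with a space, so the spark-avro coordinate may not be picked up as a package. A minimal sketch of the comma-separated form follows. The variable names `CSV_PKG`, `AVRO_PKG`, and `PACKAGES` are illustrative and not part of the commit, and the command is only printed (a dry run) rather than executed.

```shell
#!/bin/sh
# Illustrative variable names; the coordinates are copied from the script in the diff.
CSV_PKG="com.databricks:spark-csv_2.10:1.5.0"
AVRO_PKG="com.databricks:spark-avro_2.10:3.2.0"

# Assumption: --packages expects one comma-separated list with no spaces.
PACKAGES="$CSV_PKG,$AVRO_PKG"

# Dry run: print the command instead of launching spark-shell.
echo "HADOOP_USER_NAME=spark spark-shell --packages $PACKAGES -i avro-parquet.scala"
```

To run it for real, drop the `echo` and the surrounding quotes on that last line, keeping `"$PACKAGES"` quoted as a single argument.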
