Installing Spark on Windows 10

Shantanu Sharma
Department of Computer Science, Ben-Gurion University, Israel.
sharmas@cs.bgu.ac.il

1. Install Scala: Download Scala from the link: http://downloads.lightbend.com/scala/2.11.8/scala-2.11.8.msi
   a. Set environment variables:
      i. User variable:
         Variable: SCALA_HOME
         Value: C:\Program Files (x86)\scala
      ii. System variable:
         Variable: PATH
         Value: C:\Program Files (x86)\scala\bin
   b. Check it on cmd, for example by running scala -version.
2. Install Java 8: Download Java 8 from the link: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
   a. Set environment variables:
      i. User variable:
         Variable: JAVA_HOME
         Value: C:\Program Files\Java\jdk1.8.0_91
      ii. System variable:
         Variable: PATH
         Value: C:\Program Files\Java\jdk1.8.0_91\bin
   b. Check it on cmd, for example by running java -version.
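The variable checks in steps 1 and 2 can also be scripted once the JDK is installed. The following is a minimal sketch and not part of the original guide: the class name EnvCheck and its helper methods are hypothetical, and it assumes that each tool's executables live in a bin folder under its home directory, as in the values above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: reports unset *_HOME variables and bin folders missing from PATH.
final class EnvCheck {
    // Variables set in steps 1 and 2; other names can be passed as arguments.
    static final List<String> DEFAULT_NAMES = Arrays.asList("SCALA_HOME", "JAVA_HOME");

    // Returns the names from 'required' that are absent or blank in 'env'.
    static List<String> missing(Map<String, String> env, List<String> required) {
        List<String> result = new ArrayList<>();
        for (String name : required) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                result.add(name);
            }
        }
        return result;
    }

    // True if 'dir' is one of the entries of 'path' (entries split on 'separator').
    // The comparison ignores case and trailing slashes, as Windows does.
    static boolean onPath(String path, String dir, String separator) {
        if (path == null || dir == null) {
            return false;
        }
        String wanted = normalize(dir);
        for (String entry : path.split(Pattern.quote(separator))) {
            if (normalize(entry).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    private static String normalize(String p) {
        String s = p.trim();
        while (s.length() > 1 && (s.endsWith("\\") || s.endsWith("/"))) {
            s = s.substring(0, s.length() - 1);
        }
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<String> names = args.length > 0 ? Arrays.asList(args) : DEFAULT_NAMES;
        Map<String, String> env = new LinkedHashMap<>();
        for (String name : names) {
            env.put(name, System.getenv(name));
        }
        for (String name : missing(env, names)) {
            System.out.println(name + " is not set");
        }
        String path = System.getenv("PATH");
        for (String name : names) {
            String home = env.get(name);
            if (home == null || home.trim().isEmpty()) {
                continue; // already reported as not set
            }
            String bin = home + File.separator + "bin";
            boolean found = onPath(path, bin, File.pathSeparator);
            System.out.println(bin + (found ? " is on PATH" : " is NOT on PATH"));
        }
    }
}
```

Compile and run it with javac EnvCheck.java and java EnvCheck. After completing the later steps, other variable names such as SPARK_HOME or MAVEN_HOME can be passed as arguments.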
3. Install Eclipse Mars: Download it from the link: https://eclipse.org/downloads/ and extract it into the C drive.
   a. Set environment variables:
      i. User variable:
         Variable: ECLIPSE_HOME
         Value: C:\eclipse
      ii. System variable:
         Variable: PATH
         Value: C:\eclipse\bin
4. Install Spark 1.6.1: Download it from the link: http://spark.apache.org/downloads.html and extract it into the D drive, such as D:\spark.
   a. Set environment variables:
      i. User variable:
         Variable: SPARK_HOME
         Value: D:\spark\spark-1.6.1-bin-hadoop2.6
      ii. System variable:
         Variable: PATH
         Value: D:\spark\spark-1.6.1-bin-hadoop2.6\bin
5. Download Windows Utilities: Download them from the link: https://github.com/steveloughran/winutils/tree/master/hadoop-2.6.0/bin and paste them into D:\spark\spark-1.6.1-bin-hadoop2.6\bin
6. Execute Spark on cmd, for example by running spark-shell.
7. Install Maven 3.3: Download Apache-Maven-3.3.9 from the link: http://apache.mivzakim.net/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.zip and extract it into the D drive, such as D:\apache-maven-3.3.9
   a. Set environment variables:
      i. User variable:
         Variable: MAVEN_HOME
         Value: D:\apache-maven-3.3.9
      ii. System variable:
         Variable: PATH
         Value: D:\apache-maven-3.3.9\bin
   b. Check it on cmd, for example by running mvn -version.
8. Create the first WordCount project.
   a. Open Eclipse and choose File > New > Project > Maven Project.
   b. Enter the Group Id and Artifact Id, and click Finish.
   c. Edit pom.xml. Paste the following code.

      <project xmlns="http://maven.apache.org/POM/4.0.0"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <groupId>sparkWCexample</groupId>
        <artifactId>spWCexample</artifactId>
        <version>1.0-SNAPSHOT</version>
        <dependencies>
          <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.2.0</version>
          </dependency>
        </dependencies>
        <build>
          <plugins>
            <plugin>
              <groupId>org.apache.maven.plugins</groupId>
              <artifactId>maven-compiler-plugin</artifactId>
              <version>3.3</version>
            </plugin>
          </plugins>
        </build>
      </project>

   d. Write your own code, or copy the given WordCount code from D:\spark\spark-1.6.1-bin-hadoop2.6\examples\src\main\java\org\apache\spark\examples
   e. Now add the external jar from D:\spark\spark-1.6.1-bin-hadoop2.6\lib to the project and set Java 8 for compilation.
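Before building the Spark job, it can help to know what counts to expect for a small test file. The following is a minimal plain-Java sketch of the word-count logic only; it does not use Spark and is not the WordCount example shipped with the distribution. The class name LocalWordCount is hypothetical, and splitting words on whitespace is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical Spark-free reference: counts whitespace-separated words in a text file.
final class LocalWordCount {

    // Returns word -> number of occurrences, sorted by word.
    static Map<String, Integer> count(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank line yields one empty token
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the input text file
        Map<String, Integer> counts =
                count(Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8));
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Run it on the same demo file that is later passed to spark-submit and compare the counts with the Spark output.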
   f. Build the project: On cmd, go to the location where the project is stored, here D:\hadoop\examples\spWCexample, and run mvn package.
   g. Execute the project: On cmd, go to D:\spark\spark-1.6.1-bin-hadoop2.6\bin and run a command of the following form:

      spark-submit --class groupid.artifactid.classname --master local[2] <path to the jar file created using Maven> <path to a demo test file> <path to output directory>

      For example:

      spark-submit --class sparkWCexample.spWCexample.WC --master local[2] /hadoop/examples/spWCexample/target/spWCexample-1.0-SNAPSHOT.jar /hadoop/examples/spWCexample/how.txt /hadoop/examples/spWCexample/anwer.txt
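The pieces of the spark-submit command follow from the Maven coordinates: the class name is written as groupid.artifactid.classname, and by default Maven names the jar artifactId-version.jar inside the project's target folder. The following is a minimal sketch that assembles the command from those parts. The class SubmitCommand and its methods are hypothetical helpers and not part of Spark or Maven, and the sketch assumes the main class sits in the package groupid.artifactid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the spark-submit argument list used in step g.
final class SubmitCommand {

    // Default location of the jar produced by "mvn package".
    static String jarPath(String projectDir, String artifactId, String version) {
        return projectDir + "/target/" + artifactId + "-" + version + ".jar";
    }

    // Assembles: spark-submit --class <g.a.c> --master <master> <jar> <input> <output>
    static List<String> build(String groupId, String artifactId, String className,
                              String master, String jar, String input, String output) {
        List<String> cmd = new ArrayList<>();
        cmd.add("spark-submit");
        cmd.add("--class");
        cmd.add(groupId + "." + artifactId + "." + className);
        cmd.add("--master");
        cmd.add(master);
        cmd.add(jar);
        cmd.add(input);
        cmd.add(output);
        return cmd;
    }

    public static void main(String[] args) {
        String project = "/hadoop/examples/spWCexample";
        List<String> cmd = build("sparkWCexample", "spWCexample", "WC", "local[2]",
                jarPath(project, "spWCexample", "1.0-SNAPSHOT"),
                project + "/how.txt", project + "/anwer.txt");
        // Prints the command as a single line for pasting into cmd.
        System.out.println(String.join(" ", cmd));
    }
}
```

With the values from pom.xml above, the printed line is the same as the example command in step g.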
   h. While the job is running, you can also check its progress at: http://localhost:4040/jobs/
   i. Finally, get the answers from the output directory.
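How the answers are stored depends on the WordCount code. If the job writes its result with saveAsTextFile (an assumption; the guide does not show the code), Spark creates the output directory and puts the result in one or more files named part-00000, part-00001, and so on. The following is a minimal sketch for printing them in order; the class name ReadOutput is hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: concatenates the part-* files of an output directory.
final class ReadOutput {

    // Reads all lines from files whose names start with "part-", in name order.
    static List<String> readParts(Path outputDir) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path p : stream) {
                parts.add(p);
            }
        }
        Collections.sort(parts); // directory listing order is not guaranteed
        List<String> lines = new ArrayList<>();
        for (Path p : parts) {
            lines.addAll(Files.readAllLines(p, StandardCharsets.UTF_8));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: the output directory passed to spark-submit
        for (String line : readParts(Paths.get(args[0]))) {
            System.out.println(line);
        }
    }
}
```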
