In this article we are going to review the classic Hadoop word count example, customizing it a little bit. As usual, I suggest using Eclipse with Maven to create a project that can be modified, compiled, and easily executed on the cluster. First of all, download the Maven boilerplate project from here: https://github.com/H4ml3t/maven-hadoop-java-wordcount-template
$ git clone git@github.com:H4ml3t/maven-hadoop-java-wordcount-template.git
If you want to compile it directly, then you can run:
$ cd maven-hadoop-java-wordcount-template
$ mvn package
The resulting fat jar will be found in the target folder with the name “maven-hadoop-java-wordcount-template-0.0.1-SNAPSHOT-jar-with-dependencies.jar”.
Alternatively, if you want to modify the code (as we are about to do now), open Eclipse and go to [File] -> [Import] -> [Existing Maven Projects] -> Browse for the directory …
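Before modifying the code, it may help to recall what word count actually computes. The following is only a minimal sketch in plain Java: it does not use the Hadoop API and is not the code from the template project. The class name WordCountSketch and its methods are hypothetical, chosen for illustration. It imitates the two phases in memory: a "map" step that emits a (word, 1) pair for every word in a line, and a "reduce" step that sums the pairs for each distinct word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
public class WordCountSketch {

    // "Map" phase: split one line on whitespace and emit a (word, 1) pair per word.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) { // a blank line yields one empty token; skip it
                pairs.add(Map.entry(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: sum the counts emitted for each distinct word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word for stable output
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    // Runs both phases over all input lines.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        List<String> lines = List.of("to be or not to be", "to count words");
        // prints {be=2, count=1, not=1, or=1, to=3, words=1}
        System.out.println(countWords(lines));
    }
}
```

On a real cluster the two phases run as separate distributed tasks and the framework groups the pairs by word between them; here everything happens in a single process, which is enough to see where a customization (for example, a different tokenization rule) would go.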
