DEV Community

Zaw Htut Win


Installing Hadoop single node cluster in AWS EC2

Environment: Ubuntu 18 on an m3.large instance (8 GB of memory)

Install OpenJDK 8 (the full JDK, not just the JRE)

sudo apt-get install openjdk-8-jdk
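
Since the post asks for the JDK rather than the JRE, a quick check that the compiler is on the PATH can catch a runtime-only install. The following is a minimal sketch; the function name `jdk_status` is hypothetical and not part of any package.

```shell
# Report whether a full JDK (javac) or only a runtime (java) is on PATH.
jdk_status() {
  if command -v javac >/dev/null 2>&1; then
    echo "jdk"
  elif command -v java >/dev/null 2>&1; then
    echo "jre-only"
  else
    echo "missing"
  fi
}

jdk_status
```

If this prints `jre-only` or `missing`, the install step above probably did not complete.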

Download Hadoop 2.9.0
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.9.0/hadoop-2.9.0.tar.gz
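
It can be worth verifying the download before extracting it. The sketch below compares a file's SHA-256 digest with a value you supply; the helper name `verify_sha256` and the `EXPECTED` variable are hypothetical, and the real digest has to be copied from the Apache site by hand.

```shell
# Compare a file's SHA-256 digest with an expected value.
# Returns 0 on a match and 1 otherwise. Hex case is ignored.
verify_sha256() {
  expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  actual=$(sha256sum "$1" | awk '{print $1}')
  [ "$actual" = "$expected" ]
}

# Usage (EXPECTED is a placeholder for the published digest):
# verify_sha256 hadoop-2.9.0.tar.gz "$EXPECTED" && echo "checksum OK"
```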

Extract the archive in your home folder

tar -xvf hadoop-2.9.0.tar.gz

Create a folder for Hadoop

sudo mkdir /usr/lib/hadoop

Move the extracted Hadoop folder to /usr/lib/hadoop (sudo is needed because the folder was created as root)

sudo mv hadoop-2.9.0 /usr/lib/hadoop/

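Find the JDK 8 path and note it down as an export line, as follows

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64

Open ~/.bashrc and put the following lines at the end of the file. HADOOP_HOME and PATH are set here as well, because later steps use $HADOOP_HOME and run hdfs directly.

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/usr/lib/hadoop/hadoop-2.9.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin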

Load the environment variables

source ~/.bashrc
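
After reloading the shell, it helps to confirm that the JDK path actually resolves. This sketch assumes the path was exported under the name JAVA_HOME, which is the variable Hadoop's scripts read; the helper `check_dir_var` is a hypothetical name.

```shell
# Succeed when the named variable is set and points to an existing directory.
check_dir_var() {
  name=$1
  eval "value=\${$name:-}"
  if [ -n "$value" ] && [ -d "$value" ]; then
    echo "$name=$value"
  else
    echo "$name is unset or not a directory" >&2
    return 1
  fi
}

check_dir_var JAVA_HOME || true
```

The same function can be called with any other variable name that should hold a directory path.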

Generate an SSH key pair and authorize it for passwordless login to localhost

ssh-keygen -t rsa

cd ~

cat .ssh/id_rsa.pub >> .ssh/authorized_keys

ssh-copy-id -i .ssh/id_rsa.pub ubuntu@localhost
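
Running the key steps more than once can leave duplicate entries in authorized_keys, and sshd may reject the file if its permissions are too open. A minimal sketch of an idempotent alternative follows; `authorize_key` is a hypothetical helper name.

```shell
# Append a public key to an authorized_keys file only if it is not
# already there, then apply restrictive permissions.
authorize_key() {
  pub=$1
  auth=$2
  mkdir -p "$(dirname "$auth")"
  touch "$auth"
  if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
    cat "$pub" >> "$auth"
  fi
  chmod 700 "$(dirname "$auth")"
  chmod 600 "$auth"
}

# Usage:
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
# ssh -o BatchMode=yes localhost true && echo "passwordless SSH works"
```

`BatchMode=yes` makes ssh fail instead of prompting, so the second usage line is a quick test that no password is required.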

Create a hadoopdata folder in your home directory
cd ~

mkdir hadoopdata

Go to the directory that holds the XML configuration files

cd /usr/lib/hadoop/hadoop-2.9.0/etc/hadoop

core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/data</value>
  </property>
</configuration>

mapred-site.xml (Hadoop 2.9.0 ships only mapred-site.xml.template, so copy it to mapred-site.xml first)

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
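
Once the four files are saved, the values Hadoop actually resolves can be read back with `hdfs getconf -confKey`. The sketch below assumes the Hadoop bin directory is on the PATH, which the original steps do not set up explicitly; `show_conf` is a hypothetical helper name.

```shell
# Print the HDFS settings edited above as Hadoop resolves them.
show_conf() {
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs is not on PATH" >&2
    return 1
  fi
  for key in fs.defaultFS dfs.replication \
             dfs.namenode.name.dir dfs.datanode.data.dir; do
    printf '%s = %s\n' "$key" "$(hdfs getconf -confKey "$key")"
  done
}

show_conf || true
```

A value that differs from what was typed into the XML usually points to a typo in a property name or to a file saved in the wrong directory.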

Format the NameNode
hdfs namenode -format
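
Formatting is meant to be done once; running it again on a directory that already holds metadata discards the existing HDFS namespace. A hedged sketch of a guard follows. It relies on the format step writing a `current/VERSION` file under the dfs.namenode.name.dir configured earlier, and `format_if_needed` is a hypothetical helper name.

```shell
# Format the NameNode only when its metadata directory is still empty.
format_if_needed() {
  name_dir=$1
  if [ -f "$name_dir/current/VERSION" ]; then
    echo "already formatted: $name_dir"
    return 0
  fi
  if ! command -v hdfs >/dev/null 2>&1; then
    echo "hdfs is not on PATH" >&2
    return 1
  fi
  hdfs namenode -format
}

# Usage (path taken from hdfs-site.xml above):
# format_if_needed /home/ubuntu/hadoopdata/hdfs/name
```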

Go to Hadoop's sbin directory:
cd $HADOOP_HOME/sbin

Start the NameNode
./hadoop-daemon.sh start namenode

Start HDFS components
./start-dfs.sh

Stop all daemons
./stop-all.sh

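Start all daemons (HDFS and YARN). In Hadoop 2.x this script is deprecated in favor of start-dfs.sh and start-yarn.sh, but it still works.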
./start-all.sh
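
To confirm that everything came up, the JDK's `jps` tool lists running Java processes. The sketch below reads `jps` output and prints any expected daemon that is absent; the list of five daemon names is an assumption for a single-node setup with YARN, and `missing_daemons` is a hypothetical helper name.

```shell
# Read `jps` output on stdin and print each expected daemon that is absent.
# Matching is on the whole process name, so SecondaryNameNode does not
# count as NameNode.
missing_daemons() {
  awk '
    { seen[$2] = 1 }
    END {
      n = split("NameNode DataNode SecondaryNameNode ResourceManager NodeManager", want, " ")
      for (i = 1; i <= n; i++)
        if (!(want[i] in seen)) print want[i]
    }'
}

# Usage:
# jps | missing_daemons
```

No output means all five daemons are running; any name printed is a daemon whose log under the Hadoop logs directory is worth checking.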

Then access the Hadoop web UIs at the following addresses. The EC2 security group must allow inbound traffic on these ports.

NameNode – aws_ip_address:50070

DataNode – aws_ip_address:50075

SecondaryNameNode – aws_ip_address:50090

ResourceManager – aws_ip_address:8088
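
The four pages can also be probed from a terminal, which helps tell a stopped daemon apart from a blocked port. This is a minimal sketch using the ports listed above; `ui_urls` and `probe_uis` are hypothetical helper names, and `aws_ip_address` stays a placeholder for the instance's address.

```shell
# Print the web UI URLs for a host, using the ports listed above.
ui_urls() {
  host=$1
  for port in 50070 50075 50090 8088; do
    echo "http://$host:$port/"
  done
}

# Request each URL and print the HTTP status code curl reports.
probe_uis() {
  ui_urls "$1" | while read -r url; do
    code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url" || true)
    echo "$url -> $code"
  done
}

# Usage:
# probe_uis aws_ip_address
```

Running `probe_uis localhost` on the instance itself and then again from your own machine shows whether a failure is on the Hadoop side or in the network path.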

In the next tutorial, we will install Sqoop.
https://dev.to/zawhtutwin/installing-sqoop-on-hadoop-14n8
