Environment: Ubuntu 18 on an AWS m3.large instance, 8 GB memory.
Install OpenJDK 8 (the full JDK, not just the JRE):
sudo apt-get install openjdk-8-jdk
Download Hadoop 2.9.0:
wget https://archive.apache.org/dist/hadoop/core/hadoop-2.9.0/hadoop-2.9.0.tar.gz
Extract Hadoop in the home folder:
tar -xvf hadoop-2.9.0.tar.gz
Create folder for Hadoop
sudo mkdir /usr/lib/hadoop
Move the extracted Hadoop folder to /usr/lib/hadoop. sudo is needed because the directory was created by root; the chown then gives the ubuntu user ownership so Hadoop can write its logs there.
sudo mv hadoop-2.9.0 /usr/lib/hadoop/
sudo chown -R ubuntu:ubuntu /usr/lib/hadoop/hadoop-2.9.0
Find the JDK 8 path and note it down; here it is /usr/lib/jvm/java-1.8.0-openjdk-amd64.
Open ~/.bashrc and append the following lines at the end of the file. HADOOP_HOME and PATH are set as well, since later steps (hdfs, $HADOOP_HOME/sbin) rely on them.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
export HADOOP_HOME=/usr/lib/hadoop/hadoop-2.9.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Reload the environment:
source ~/.bashrc
Generate an SSH key for passwordless login to localhost (the Hadoop start scripts depend on it):
ssh-keygen -t rsa
cd ~
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
Equivalently, ssh-copy-id can install the key:
ssh-copy-id -i .ssh/id_rsa.pub ubuntu@localhost
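The start scripts log in to localhost over SSH, so the key setup above must work without a password prompt. A quick sanity check is that the public key appears verbatim in authorized_keys. The sketch below demonstrates the key-plus-authorize flow against a throwaway temp directory so the real ~/.ssh files are untouched; on the instance itself, run the grep against ~/.ssh instead.

```shell
# Demonstrate the key setup in a throwaway directory (the real files live in ~/.ssh).
d=$(mktemp -d)
ssh-keygen -t rsa -N '' -f "$d/id_rsa" -q
cat "$d/id_rsa.pub" >> "$d/authorized_keys"
# The check: the exact public-key line must be present in authorized_keys.
grep -qxF "$(cat "$d/id_rsa.pub")" "$d/authorized_keys" && echo "key authorized"
```

If the grep fails on the real ~/.ssh files, the append step above was skipped or wrote to the wrong file.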
Create a hadoopdata folder in the home directory:
cd ~
mkdir hadoopdata
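hdfs-site.xml below points the NameNode and DataNode at subdirectories of hadoopdata (/home/ubuntu/hadoopdata/hdfs/name and .../data). Hadoop creates them on first use, but creating them up front keeps the ownership with the ubuntu user; here ~ stands for /home/ubuntu:

```shell
# Pre-create the HDFS name and data directories referenced in hdfs-site.xml
mkdir -p ~/hadoopdata/hdfs/name ~/hadoopdata/hdfs/data
```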
Go to the Hadoop configuration directory and edit the following XML files:
cd /usr/lib/hadoop/hadoop-2.9.0/etc/hadoop
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/ubuntu/hadoopdata/hdfs/data</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
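A malformed edit to any of these files makes the daemons die at startup with a parse error, so a quick well-formedness check before formatting the NameNode is cheap insurance. This is a sketch, not part of the original tutorial: check_site_xml is a hypothetical helper name, and it assumes python3 is available (it ships with Ubuntu 18).

```shell
# Hypothetical helper: verify that a Hadoop *-site.xml file is well-formed XML.
check_site_xml() {
  python3 -c 'import sys, xml.dom.minidom as m; m.parse(sys.argv[1])' "$1"
}

# On the instance, run it against the four files edited above, e.g.:
#   cd /usr/lib/hadoop/hadoop-2.9.0/etc/hadoop
#   for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do
#     check_site_xml "$f" && echo "$f OK"
#   done

# Self-contained demonstration against a throwaway file:
printf '<configuration><property><name>dfs.replication</name><value>1</value></property></configuration>' > /tmp/demo-site.xml
check_site_xml /tmp/demo-site.xml && echo "demo-site.xml OK"
```

A non-zero exit (with a Python traceback) points at the file and position of the syntax error.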
Format the name node
hdfs namenode -format
Go to the sbin directory of Hadoop:
cd $HADOOP_HOME/sbin
Start the name node
./hadoop-daemon.sh start namenode
Start HDFS components
./start-dfs.sh
Stop all
./stop-all.sh
Start all
./start-all.sh
Then access the Hadoop web UIs at the following addresses (replace aws_ip_address with the instance's public IP; the ports must be open in the instance's security group):
NameNode – aws_ip_address:50070
DataNode – aws_ip_address:50075
SecondaryNameNode – aws_ip_address:50090
ResourceManager – aws_ip_address:8088
In the next tutorial, we will install sqoop.
https://dev.to/zawhtutwin/installing-sqoop-on-hadoop-14n8