
Lesson-4

Hadoop installation
Some notes while using the CLI or terminal:
1. Always use the number keys on the top row (under the function keys), not the numeric keypad
2. Always copy from Google Drive or mail with Ctrl+C
3. Always paste by right-clicking and choosing Paste
4. While copying, select precisely; no more than required
5. Anything starting with # or -- or enclosed in ( ) is a comment only, not to be copied
6. In nano, Ctrl+X, then Y, then ENTER closes and saves the file
7. .xml files are configuration files
8. ~/.bashrc holds environment variables; after editing it, run the source command
Hadoop 2.8.4 installation step by step

Update repositories
Remain as the root user (by default you are logged in as root)
sudo apt update
sudo apt upgrade -y

Install Java 8
sudo apt purge openjdk*
sudo apt update
sudo apt install -y openjdk-8-jdk-headless

# (The webupd8team/java PPA was for Oracle Java and has been discontinued;
# openjdk-8-jdk-headless is available from the standard Ubuntu repositories.)

Check the Java 8 installation


java -version

# The installed version will appear in the terminal

Create JAVA_HOME environment variable


sudo nano /etc/profile

# append the line below, then Ctrl+X, then Y, and finally ENTER
export JAVA_HOME=/usr

source /etc/profile
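Setting JAVA_HOME=/usr works here because, with openjdk-8-jdk-headless installed from the Ubuntu repositories, the java binary resolves under /usr/bin. A quick illustrative sketch of the expected layout (not a definitive check):

```shell
# Illustrative: with JAVA_HOME=/usr, the JVM is expected at
# $JAVA_HOME/bin/java, i.e. /usr/bin/java on this layout.
export JAVA_HOME=/usr
echo "$JAVA_HOME/bin/java"   # prints /usr/bin/java
```

You can also run `readlink -f "$(which java)"` to see where the java command actually lives on your machine.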

Disable IPv6
sudo nano /etc/sysctl.conf # then append the whole block below

# Disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

sudo reboot # (or apply the settings without rebooting: sudo sysctl -p)

Configure SSH keys


sudo addgroup hadoopgroup
sudo adduser --ingroup hadoopgroup hadoopuser

sudo apt install ssh


sudo systemctl enable ssh
sudo systemctl start ssh

Switch to the Hadoop user with the following command


su - hadoopuser

ssh-keygen -t rsa -P "" # (hit ENTER to accept the default file location)


cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
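The chmod 600 step matters because sshd refuses to use an authorized_keys file that is readable or writable by anyone other than its owner. A small illustration of what that mode looks like, using a throwaway temp file rather than your real key file:

```shell
# Create a throwaway file, apply the same mode used for
# authorized_keys, and print its octal permissions.
f=$(mktemp)
chmod 600 "$f"
stat -c %a "$f"   # prints 600 (owner read/write only)
rm -f "$f"
```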
su -

Install Hadoop
Switch to the root user
su - <your root user name> # use your own root name; check your prompt
sudo rm -rf hadoop-2.8.4.tar.gz # optional: remove any old copy of the archive

sudo wget https://archive.apache.org/dist/hadoop/core/hadoop-2.8.4/hadoop-2.8.4.tar.gz

# (hit ENTER; the download takes some time)


sudo tar xzf hadoop-2.8.4.tar.gz

sudo mv hadoop-2.8.4 /usr/local
sudo ln -sf /usr/local/hadoop-2.8.4/ /usr/local/hadoop
sudo chown -R hadoopuser:hadoopgroup /usr/local/hadoop-2.8.4/

Switch to Hadoop user


su - hadoopuser

nano ~/.bashrc #(or editor ~/.bashrc)

Append the whole block below

# Hadoop config
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# Native path
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native"
# Java path
export JAVA_HOME="/usr"
# OS path
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_INSTALL=/usr/local/hadoop

# After editing .bashrc, reload it with the source command


source ~/.bashrc
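The PATH line is what makes the hadoop, hdfs, and start-*.sh commands resolvable from any directory. A minimal sketch of the effect, with values hard-coded for illustration:

```shell
# Append the Hadoop bin and sbin directories to PATH, then print the
# last two PATH entries to confirm they were added.
HADOOP_HOME=/usr/local/hadoop
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
echo "$PATH" | tr ':' '\n' | tail -n 2
# prints:
# /usr/local/hadoop/bin
# /usr/local/hadoop/sbin
```

After sourcing ~/.bashrc you can confirm the real setup with `hadoop version`.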

Switch to root user


su - <your root user name> # (your root name, from the prompt)
sudo editor /usr/local/hadoop/etc/hadoop/hadoop-env.sh

#(can use nano instead of editor)

Append the line below


export JAVA_HOME="/usr"

Configure Hadoop
cd /usr/local/hadoop/etc/hadoop

sudo nano core-site.xml

# Copy only the <property> block below in between the existing <configuration> tags.


<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
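fs.default.name still works on Hadoop 2.8.4, but it is a deprecated alias; the current property name is fs.defaultFS, so the same setting can equivalently be written as:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```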

sudo nano hdfs-site.xml


# Copy only the <property> blocks below in between the existing <configuration> tags.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>

<property>
<name>dfs.name.dir</name>
<value>file:/usr/local/hadoop/hadoopdata/hdfs/namenode</value>
</property>

<property>
<name>dfs.data.dir</name>
<value>file:/usr/local/hadoop/hadoopdata/hdfs/datanode</value>
</property>
</configuration>
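As with fs.default.name, the dfs.name.dir and dfs.data.dir keys are deprecated aliases on Hadoop 2.x; the current names are dfs.namenode.name.dir and dfs.datanode.data.dir. The same paths in current form:

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/hadoopdata/hdfs/datanode</value>
</property>
```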

sudo nano mapred-site.xml.template

# Append the block below. When exiting with Ctrl+X, press Y to save, delete
# ".template" from the file name at the prompt, hit ENTER, then press Y again
# to confirm saving under the new name mapred-site.xml.
# (Alternatively: sudo cp mapred-site.xml.template mapred-site.xml, then edit that file.)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

sudo nano yarn-site.xml


# Append the below
<configuration>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>

Format namenode
Switch to Hadoop user
su - hadoopuser

hdfs namenode -format # ("hadoop namenode -format" also works but is deprecated)

Start Services
start-dfs.sh

start-yarn.sh

Check Services
jps
All 6 services should appear:
# (Jps / SecondaryNameNode / NodeManager / DataNode / ResourceManager / NameNode)
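If one of the daemons is missing from the jps output, check its log under /usr/local/hadoop/logs. As a sketch of how the output can be checked mechanically, the snippet below runs the filter over made-up sample data (the PIDs are invented, not a live jps run):

```shell
# Count the expected Hadoop daemons in a sample of jps output.
sample='2101 NameNode
2245 DataNode
2398 SecondaryNameNode
2551 ResourceManager
2704 NodeManager
2857 Jps'
echo "$sample" | grep -cE 'NameNode|DataNode|ResourceManager|NodeManager'
# prints 5 (every daemon line except Jps itself)
```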

Open localhost:50070 for the HDFS web UI and localhost:8088 for the YARN web UI

Review and QA
