
SRM INSTITUTE OF SCIENCE AND TECHNOLOGY: VADAPALANI CAMPUS

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

Prerequisite

Ubuntu 16.04 (running on an Amazon EC2 instance)

Enable password authentication on the EC2 instance:

Set a password for the default ubuntu user on the EC2 Ubuntu image: sudo passwd ubuntu

Step 1: JAVA 8 INSTALLATION

1. sudo add-apt-repository ppa:webupd8team/java


2. sudo apt-get update
3. sudo apt-get install oracle-java8-installer
4. sudo apt-get install oracle-java8-set-default
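
To confirm that the JDK was installed and set as the default, you can check the version (a quick sanity check; the exact build number will vary):

java -version

The output should report java version "1.8.0_...".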

Step 2: SSH SERVER INSTALLATION

5. sudo apt-get install openssh-server

6. sudo sed -i -e 's/PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config
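
If the directive in /etc/ssh/sshd_config is commented out (a leading #) on your image, the sed above will not match anything; you can check the current setting first (a quick sanity check):

sudo grep -n 'PasswordAuthentication' /etc/ssh/sshd_config

If it still shows no, or only a commented line, edit the file with sudo nano /etc/ssh/sshd_config and set PasswordAuthentication yes before restarting ssh.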

7. ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

// an RSA key is used here because OpenSSH on Ubuntu 16.04 disables DSA (ssh-dss) keys by default

8. sudo service ssh restart

9. ssh localhost
// should log in without asking for a password
10. exit

Step 3: Download the Hadoop package

https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz

11. wget https://archive.apache.org/dist/hadoop/core/hadoop-2.7.3/hadoop-2.7.3.tar.gz

12. sudo tar -xzvf hadoop-2.7.3.tar.gz


sudo mkdir -p /usr/local/hadoop

sudo mv hadoop-2.7.3/* /usr/local/hadoop/

13. sudo chown -R ubuntu:ubuntu /usr/local/hadoop
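
As an optional check, listing the directory should now show ubuntu as owner and group:

ls -ld /usr/local/hadoop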

// create a working directory for Hadoop; with the configuration below, the NameNode and DataNode keep their data under hadoop.tmp.dir

14. sudo mkdir -p /app/hadoop/tmp

Set permissions:

15. sudo chown -R ubuntu /app/hadoop/tmp
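
Optionally, you can also create the NameNode and DataNode directories up front; with the default settings Hadoop places them under hadoop.tmp.dir (dfs/name and dfs/data) and creates them automatically, so this step is not strictly required:

sudo mkdir -p /app/hadoop/tmp/dfs/name /app/hadoop/tmp/dfs/data
sudo chown -R ubuntu /app/hadoop/tmp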

Step 4: Configure Hadoop:


Check where your Java is installed:

16. readlink -f /usr/bin/java

If you get something like /usr/lib/jvm/java-8-oracle/jre/bin/java, then /usr/lib/jvm/java-8-oracle is what you should use for JAVA_HOME.

Add the following to the ~/.bashrc file:

17. sudo nano ~/.bashrc


export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"

Reload the ~/.bashrc file:


18. source ~/.bashrc
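
To verify that the new environment variables are in effect, you can run the following (the version banner should mention Hadoop 2.7.3):

echo $JAVA_HOME
hadoop version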
Modify JAVA_HOME in

19. sudo nano /usr/local/hadoop/etc/hadoop/hadoop-env.sh

export JAVA_HOME=/usr/lib/jvm/java-8-oracle

Modify
20. sudo nano /usr/local/hadoop/etc/hadoop/core-site.xml

to have something like:


<configuration>
...
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>

<property>

<name>hadoop.tmp.dir</name>

<value>/app/hadoop/tmp</value>

<description>A base for other temporary directories.</description>

</property>

...
</configuration>
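
Since fs.default.name points at hdfs://master:9000, the hostname master must resolve on this machine. For this single-node setup, one simple option (an assumption of this guide, not a requirement of Hadoop) is to map master to the loopback address in /etc/hosts:

echo "127.0.0.1 master" | sudo tee -a /etc/hosts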

Modify
21. sudo nano /usr/local/hadoop/etc/hadoop/yarn-site.xml

to have something like:


<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
</configuration>

Create /usr/local/hadoop/etc/hadoop/mapred-site.xml from the template:

22. cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

Modify
23. sudo nano /usr/local/hadoop/etc/hadoop/mapred-site.xml

to have something like:


<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

Modify
24. sudo nano /usr/local/hadoop/etc/hadoop/hdfs-site.xml

to have something like:


<configuration>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>

</configuration>

Format the file system:


25. hdfs namenode -format
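
If the format succeeds, the tail of the output should contain lines similar to the following (hostname and timestamps will differ):

INFO common.Storage: Storage directory /app/hadoop/tmp/dfs/name has been successfully formatted.
INFO util.ExitUtil: Exiting with status 0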

Start Hadoop:
26. start-dfs.sh
27. start-yarn.sh
You might be asked to accept the machine's SSH host key.
Check if everything is running:
28. jps

You should get something like the following (each entry will be preceded by its process ID):


Jps
NodeManager
NameNode
ResourceManager
DataNode
SecondaryNameNode
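
If any daemon is missing from the jps output, check its log under the Hadoop installation; for example (the log file name includes your user name and hostname, so the exact name will differ):

ls /usr/local/hadoop/logs/
tail -n 50 /usr/local/hadoop/logs/hadoop-ubuntu-namenode-*.log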

OPEN IN A WEB BROWSER:


29. http://localhost:8088/cluster
30. http://localhost:50070/
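
From the instance itself you can also do a quick command-line check (a simple sanity check; the NameNode UI should return an HTML page):

curl -s http://localhost:50070/ | head

To reach these web UIs from outside the instance, replace localhost with the EC2 public DNS name and allow inbound traffic on ports 8088 and 50070 in the instance's security group.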

HADOOP SINGLE-NODE CLUSTER INSTALLED SUCCESSFULLY ON AMAZON EC2
