Professional Documents
Culture Documents
Go to the below link and download the image of ubuntu 12.04. Extract/UnZip
downloaded Zip file in some folder.
http://www.traffictool.net/vmware/ubuntu1204t.html
Open VMware Player and click open virtual machine and select path where you have
extracted image of Ubuntu. After that select the .vmx file and click ok.
Username : user
Password : password
Open a Terminal
Update the repository:
Command:
sudo apt-get update
Command:
sudo apt-get install openjdk-6-jdk
After Java has been Installed, To check whether Java is installed on your
system or not give the below command:
Command:
java -version
Install openssh-server:
Command:
sudo apt-get install openssh-server
Download and extract Hadoop:
Command:
wget http://archive.apache.org/dist/hadoop/core/hadoop-1.2.0/hadoop-
1.2.0.tar.gz
Note : Ensure to make changes as per the hadoop version you are installing
Command:
tar -xvf hadoop-1.2.0.tar.gz
Create an alias (soft link) for Hadoop folder as below so hadoop folder hadoop-
1.2.0 can be referred as hadoop.
Command:
ln -s hadoop-1.2.0 hadoop
Configuration
Add JAVA_HOME in hadoop-env.sh file:
Command:
sudo gedit hadoop/conf/hadoop-env.sh
Type:
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk-i386
Uncomment the below shown export and add the below the path to your
JAVA HOME:
Now create another instance of Ubuntu VM and start it on VMPlayer
Command:
ifconfig
Command:
Add both the IP addresses in the both the instances in /etc/hosts file as shown
separated by TAB
example
Edit core-site.xml:
Command:
<property>
<name>fs.default.name</name>
<value>hdfs://nn1.cluster.com:8020</value>
</property>
Edit hdfs-site.xml:
Command:
gedit hadoop/conf/hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
Edit mapred-site.xml:
Command:
<property>
<name>mapred.job.tracker</name>
<value>nn1.cluster.com:8021</value>
</property>
Now there is a slight difference in slaves and masters file for
both VM.
On slave node, master file is blank and slaves file contains slave
VMs ip address. See the image below.
Create a ssh key:
Command:
ssh-keygen -t rsa -P ""
Note: If this command is not working. Please check whether hyphen is appearing correctly or
not. Otherwise try deleting it and re-type - only.
Command:
ssh-copy-id -i $HOME/.ssh/id_rsa.pub user@nn1.cluster.com
Command:
ssh-copy-id -i $HOME/.ssh/id_rsa.pub user@dn1.cluster.com
Run the below commands on Master node.
Format the name node
Command:
bin/hadoop namenode format
export HADOOP_HOME=/home/user/hadoop
PATH=$PATH:$HADOOP_HOME/bin
export PATH