Step 2. Install the Java JDK (you may install it directly from the Oracle website):
http://www.oracle.com/technetwork/java/javase/downloads/index.html
Step 3. Download the 32-bit or 64-bit Linux "compressed binary file" - it has a
".tar.gz" file extension. After installing, verify the version:
$ java -version
Step 4. We will use a dedicated Hadoop user account for running Hadoop.
While not required, this is recommended because it helps separate the
Hadoop installation from other software applications and user accounts
running on the same machine (for example: security, permissions,
backups, etc.).
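The commands for creating this account are not shown in the original; on Ubuntu the user and group described below are typically created like this (the names hduser and hadoop are taken from the text):

```shell
# Create a dedicated group and user for Hadoop (run as a sudo-capable user)
sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser
```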
This will add the user hduser and the group hadoop to your local machine.
$ su - hduser
Install ssh and rsync, and set up key-based ssh to its own account. To do
this, execute the following commands:
$ sudo apt-get install ssh rsync
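The key-based ssh setup mentioned above is not shown in the original; a common way to enable passwordless ssh to the same account (run as hduser) is:

```shell
# Generate an RSA key pair with an empty passphrase
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# Authorize the key for logins to this same account
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
# Verify that ssh to localhost works without a password prompt
ssh localhost exit
```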
Step 5. Add the Hadoop environment variables to your ~/.bashrc (the paths
below assume the install locations used later in this post):
$ gedit ~/.bashrc
#Hadoop Variables
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_92
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_PREFIX:$HADOOP_PREFIX/sbin:$HADOOP_PREFIX/bin
export HADOOP_LOG_DIR=$HADOOP_HOME/logs
$ source ~/.bashrc
Disabling IPv6
One problem with IPv6 on Ubuntu is that using 0.0.0.0 for the various
networking-related Hadoop configuration options will result in Hadoop
binding to the IPv6 addresses of your Ubuntu machine. There is no
practical point in enabling IPv6 on a box that is not connected to any
IPv6 network. To disable IPv6, add the following lines to the end of
/etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Save the file and reboot your machine (or run "sudo sysctl -p" to apply
the settings immediately) for the changes to take effect.
Then check whether IPv6 is disabled on your machine with the following
command:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
A return value of 0 means IPv6 is enabled; a value of 1 means it is
disabled (that's what we want).
Step 6. Download and install Hadoop:
Download Apache Hadoop and untar it. You may also use git to clone and
download the latest version.
$ wget -c http://mirror.olnevhost.net/pub/apache/hadoop/common/current/hadoop-2.6.0.tar.gz
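The untar and install commands are not shown in the original; assuming the /usr/local/hadoop location used in the configuration below, a typical sequence is:

```shell
# Unpack the downloaded archive
tar -xzf hadoop-2.6.0.tar.gz
# Move it to the install location referenced by $HADOOP_HOME
sudo mv hadoop-2.6.0 /usr/local/hadoop
```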
Add the following to $HADOOP_HOME/etc/hadoop/core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>hdfs</value>
</property>
</configuration>
Add the following to $HADOOP_HOME/etc/hadoop/mapred-site.xml (copy it
from mapred-site.xml.template if it does not exist):
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Add the following to $HADOOP_HOME/etc/hadoop/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hadoop_data/hdfs/datanode</value>
</property>
</configuration>
Add the following to $HADOOP_HOME/etc/hadoop/yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
$ cd
Step 13. Create directories for the NameNode and DataNode, and take
ownership of $HADOOP_PREFIX so that MapReduce programs have permission
to write:
$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/namenode
$ sudo mkdir -p /usr/local/hadoop/hadoop_data/hdfs/datanode
$ sudo chown hduser:hadoop -R /usr/local/hadoop
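The original jumps from directory creation straight to the jps check; before any daemons can appear in jps, the HDFS filesystem must be formatted and the daemons started. With the PATH set in ~/.bashrc earlier, the standard Hadoop 2.x commands are:

```shell
# One-time formatting of the HDFS filesystem (destroys any existing HDFS data)
hdfs namenode -format
# Start the HDFS daemons: NameNode, DataNode, SecondaryNameNode
start-dfs.sh
# Start the YARN daemons: ResourceManager, NodeManager
start-yarn.sh
```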
As a sanity check, issue the jps command to verify that all the services
(namely SecondaryNameNode, NameNode, and DataNode) are running:
$ jps
Step 17: Verify the Running Services Using the Web Interface
Both HDFS and the YARN ResourceManager have a web interface. These
interfaces are a convenient way to browse many aspects of your Hadoop
installation. To monitor HDFS, enter the following:
$ firefox http://localhost:50070 &
Connecting to port 50070 will bring up the web interface.
A web interface for the Resource Manager can be viewed by entering the
following:
$ firefox http://localhost:8088 &
The SecondaryNameNode (port 50090) and DataNode (port 50075) web
interfaces are available in the same way:
$ firefox http://localhost:50090/ &
$ firefox http://localhost:50075/ &
Step 18. Store an input file in HDFS to be used for the WordCount
application. As a quick way to generate some text, capture the jps output:
$ jps >> testing.txt
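Copying the file into HDFS is not shown in the original; assuming the testing.txt created by the previous command, the standard commands are (the /user/hduser/input path is illustrative):

```shell
# Create a home directory for hduser in HDFS, plus an input directory
hdfs dfs -mkdir -p /user/hduser/input
# Copy the local file into HDFS
hdfs dfs -put testing.txt /user/hduser/input
# Confirm the file is there
hdfs dfs -ls /user/hduser/input
```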
Step 19. Download & install Eclipse (Kepler) as mentioned in the previous
post.
$ ./eclipse &