Had Oop Installation 1

RicMa.co – Install Apache Hadoop 2.7 (on *buntu ... https://ricma.co/install-apache-hadoop-27-on-bunt...
Riccardo Macoratti
Geeky developer · Linux lover
Install Apache Hadoop 2.7 (on *buntu 16.04)
Posted on Tue 27 September 2016 in tutorials
If you are interested in Hadoop, read more here.
For this tutorial, I'll use a VM running the 64-bit version of Ubuntu Server
16.04, with VirtualBox 5.1.4 handling the virtualization.
The VM is set up with:
All 2 cores of my i5-6200U
4096 MB of RAM (although 1024 MB should be enough)
A dynamically allocated 10 GB VDI hard disk (5 GB is the minimum)
Ubuntu Server 16.04 x64 ISO file (but any *buntu flavour should be fine)
Notes
When you read a line like this:
jdoe@farlands ~ $ echo "Hello, world!"
I mean a bash prompt without root privileges, where jdoe is the username and
farlands is the hostname.
On the other hand, when the line is like this:
farlands % echo "Hello, world!"
I mean a bash prompt with root privileges.
OK, let's start: run the guest OS installation with the default values, then
let's jump to the Hadoop headaches.
Update the guest system

Open a terminal and run these commands to update the repositories and upgrade
the guest system.
farlands % apt update
farlands % apt upgrade -y
Java 8
Open the usual terminal and enter:
farlands % apt purge 'openjdk*'
farlands % add-apt-repository -y ppa:webupd8team/java
farlands % apt update
farlands % apt install -y oracle-java8-installer
You can verify the Java version by typing:
farlands % java -version
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
If you see similar output, this step is complete.
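If you would rather check the version from a script than by eye, something along these lines could work. It is only a sketch: java_major_version is a made-up helper name, and it assumes the version appears in double quotes on the first matching line, as in the output above.

```shell
# Hypothetical helper: extract the major Java version from the text that
# `java -version` prints (passed in as a single string argument).
java_major_version() {
    # Pull out the quoted version string, e.g. 1.8.0_60
    version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
    case "$version" in
        1.*) # legacy numbering: 1.8.0_60 -> 8
            printf '%s\n' "$version" | cut -d. -f2 ;;
        *)   # newer numbering: 11.0.2 -> 11
            printf '%s\n' "$version" | cut -d. -f1 ;;
    esac
}

# Sample line taken from the output shown above.
java_major_version 'java version "1.8.0_60"'   # prints 8
```

On a real machine you would feed it the live output with java_major_version "$(java -version 2>&1)", because java writes the version banner to standard error.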
Next, we need to set the JAVA_HOME environment variable so that Hadoop can
find the Java executables.
farlands % echo "export JAVA_HOME=/usr" >> /etc/profile
farlands % source /etc/profile
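Why /usr? Hadoop's scripts launch $JAVA_HOME/bin/java, so any directory that contains an executable bin/java will do, and /usr/bin/java fits. Here is a small sketch of that check. java_home_ok is a name I made up, and the demo runs on a throwaway directory instead of the real system.

```shell
# Hypothetical check: succeeds when DIR/bin/java exists and is executable.
java_home_ok() {
    [ -x "$1/bin/java" ]
}

# Demonstration on a temporary directory, so nothing on the system is touched.
demo=$(mktemp -d)
mkdir -p "$demo/bin"
printf '#!/bin/sh\n' > "$demo/bin/java"   # stand-in for a real java binary
chmod +x "$demo/bin/java"

if java_home_ok "$demo"; then echo "usable JAVA_HOME"; fi
if ! java_home_ok "$demo/missing"; then echo "not a JAVA_HOME"; fi
rm -rf "$demo"
```

On the VM, java_home_ok "$JAVA_HOME" should succeed once the variable is exported.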
Disable IPv6
Apache Hadoop supports only IPv4, so let's disable IPv6 in the kernel
parameters.
Open the file /etc/sysctl.conf:
farlands % editor /etc/sysctl.conf
And append to the end:
# Disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
Then reboot so that the new kernel parameters take effect:
farlands % reboot
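A typo in one of the three keys is easy to miss, so here is a sketch of a check you could run on the file before rebooting. ipv6_disabled_in is a hypothetical helper, and the demo works on a temporary copy of the lines above, not on the real /etc/sysctl.conf.

```shell
# Hypothetical helper: succeed only if FILE sets all three keys to 1.
ipv6_disabled_in() {
    for key in all default lo; do
        grep -Eq "^[[:space:]]*net\.ipv6\.conf\.$key\.disable_ipv6[[:space:]]*=[[:space:]]*1[[:space:]]*$" "$1" \
            || return 1
    done
}

# Demo on a temporary file holding the same lines as in the tutorial.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# Disable IPv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
ipv6_disabled_in "$demo" && echo "all three keys are set"
rm -f "$demo"

# After the reboot, the kernel itself can be asked (not run here):
#   cat /proc/sys/net/ipv6/conf/all/disable_ipv6    # 1 means disabled
```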
Configure SSH keys
We want to run our setup as a separate, dedicated user, so we will create a
hadoopuser user and a hadoopgroup group.
farlands % addgroup hadoopgroup
farlands % adduser --ingroup hadoopgroup hadoopuser
We need SSH access to our machine, so let's install and start an OpenSSH server.
farlands % apt install ssh
farlands % systemctl enable ssh
farlands % systemctl start ssh
Now we need to set up passwordless SSH by means of cryptographic keys. First we
switch to the hadoopuser account, then we create an RSA key pair with an empty
passphrase, and finally we authorize the key for the current user.
farlands % su - hadoopuser
hadoopuser@farlands ~ $ ssh-keygen -t rsa -P ""
hadoopuser@farlands ~ $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
hadoopuser@farlands ~ $ chmod 600 ~/.ssh/authorized_keys
hadoopuser@farlands ~ $ ssh-copy-id -i ~/.ssh/id_rsa.pub localhost
hadoopuser@farlands ~ $ ssh localhost
If you were not asked for a password on SSH login, you have successfully
configured passwordless SSH for the user hadoopuser.
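If ssh still asks for a password, file permissions are a common culprit: with its default StrictModes setting, sshd ignores keys whose files are too open. The sketch below shows the modes I would expect, on a throwaway directory. mode_of is a made-up helper and it assumes GNU stat, which is what Ubuntu ships.

```shell
# Hypothetical helper: print the octal mode of a path (GNU stat syntax).
mode_of() {
    stat -c '%a' "$1"
}

# Recreate the layout of ~/.ssh in a temporary directory.
demo=$(mktemp -d)
mkdir "$demo/.ssh"
: > "$demo/.ssh/authorized_keys"
chmod 700 "$demo/.ssh"
chmod 600 "$demo/.ssh/authorized_keys"

echo "$(mode_of "$demo/.ssh") $(mode_of "$demo/.ssh/authorized_keys")"   # prints 700 600
rm -rf "$demo"

# A non-interactive login test (not run here): BatchMode makes ssh fail
# instead of prompting for a password.
#   ssh -o BatchMode=yes localhost true && echo "passwordless login works"
```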
Install Hadoop
We are ready to install Hadoop. Unfortunately, it does not come prepackaged, so
we have to download it, extract it and move it to /usr/local.
farlands % wget http://it.apache.contactlab.it/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.
farlands % mv hadoop-2.7.3 /usr/local
farlands % ln -sf /usr/local/hadoop-2.7.3/ /usr/local/hadoop
farlands % chown -R hadoopuser:hadoopgroup /usr/local/hadoop-2.7.3/
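The /usr/local/hadoop symlink is what lets every later path stay version-independent. One thing to keep in mind if you ever repoint it: when the link already exists and leads to a directory, GNU ln -sf follows it and drops the new link inside that directory. Adding -n avoids this. Below is a sketch on a temporary directory. repoint is a made-up name, and the second directory is just a stand-in for some later release.

```shell
# Hypothetical helper: (re)point LINK at TARGET, replacing LINK itself
# even if it is currently a symlink to a directory.
repoint() {
    ln -sfn "$1" "$2"
}

demo=$(mktemp -d)
mkdir "$demo/hadoop-2.7.3" "$demo/hadoop-next"   # hadoop-next is a stand-in

repoint "$demo/hadoop-2.7.3" "$demo/hadoop"
repoint "$demo/hadoop-next" "$demo/hadoop"       # switch to the other tree
basename "$(readlink "$demo/hadoop")"            # prints hadoop-next
rm -rf "$demo"
```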
Now we need to configure some environment variables for the hadoopuser
account. Switch to that account and edit ~/.bashrc:
hadoopuser@farlands ~ $ editor ~/.bashrc
Append at the end:
# Hadoop config
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# Native path
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib/native"
# Java path
export JAVA_HOME="/usr"
# OS path
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
Next, source ~/.bashrc to apply changes.
hadoopuser@farlands ~ $ source ~/.bashrc
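With this many variables, a mistyped name is easy to overlook, because the shell expands an undefined variable to nothing without complaining. Here is a sketch of a sanity check you could run after sourcing the file. missing_vars is a made-up helper, and the demo uses dummy variable names so it does not depend on your real environment.

```shell
# Hypothetical helper: print each of the given variable names that is
# unset or empty in the current shell.
missing_vars() {
    for name in "$@"; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Demo with dummy names: one is set, one is not.
DEMO_HADOOP_HOME=/usr/local/hadoop
unset DEMO_TYPO_HOME
missing_vars DEMO_HADOOP_HOME DEMO_TYPO_HOME   # prints DEMO_TYPO_HOME
```

On the VM the call would look like missing_vars HADOOP_HOME HADOOP_CONF_DIR JAVA_HOME YARN_HOME, and no output means everything is set.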
Now we need to edit /usr/local/hadoop/etc/hadoop/hadoop-env.sh:
farlands % editor /usr/local/hadoop/etc/hadoop/hadoop-env.sh
And add this at the end:
export JAVA_HOME="/usr"
Hadoop configuration is a bit laborious, because it is spread over several
config files. We need to navigate to /usr/local/hadoop/etc/hadoop and edit
these files:
core-site.xml
hdfs-site.xml
mapred-site.xml (needs to be copied from mapred-site.xml.template)
yarn-site.xml
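The copy for mapred-site.xml can be done by hand, but a guarded version avoids overwriting a file you have already edited. This is only a sketch: ensure_mapred_site is a made-up name, and the demo runs in a temporary directory with a dummy template.

```shell
# Hypothetical helper: create mapred-site.xml from its template, but only
# if it does not exist yet.
ensure_mapred_site() {
    conf_dir=$1
    if [ ! -f "$conf_dir/mapred-site.xml" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

# Demo with a dummy template in a throwaway directory.
demo=$(mktemp -d)
echo '<configuration></configuration>' > "$demo/mapred-site.xml.template"
ensure_mapred_site "$demo"
ls "$demo"
rm -rf "$demo"
```

On the VM the call would be ensure_mapred_site /usr/local/hadoop/etc/hadoop.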
They are all XML files with a top-level <configuration> node. For clarity, we
show only the configuration node.
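Every one of these files boils down to name/value pairs wrapped in <property> elements. To make that shape explicit, here is a sketch that prints a single-property configuration node. site_xml is a made-up helper, not something Hadoop provides.

```shell
# Hypothetical helper: print a <configuration> node holding one property.
site_xml() {
    cat <<EOF
<configuration>
  <property>
    <name>$1</name>
    <value>$2</value>
  </property>
</configuration>
EOF
}

# Example with the core-site.xml setting used in this tutorial.
site_xml fs.default.name hdfs://localhost:9000
```

Once the real files are in place, hdfs getconf -confKey fs.default.name should echo the configured value back, which is a handy way to confirm Hadoop reads what you wrote.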
core-site.xml
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>file:/usr/local/hadoop/hadoopdata/hdfs/namenode</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>file:/usr/local/hadoop/hadoopdata/hdfs/datanode</value>
  </property>
</configuration>
mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
Format namenode
Next, we need to format the namenode filesystem with the following command:
hadoopuser@farlands ~ $ hdfs namenode -format
Look through the output. If you can find a line like this:
INFO common.Storage: Storage directory /usr/local/hadoop/hadoopdata/hdfs/namenode has b
then it's done.
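Formatting is a one-time step: doing it again on a namenode that already holds data would throw away the HDFS metadata. A formatted namenode directory contains a current/VERSION file, so a script can use that as a guard. The sketch below demonstrates the idea on a temporary directory. namenode_formatted is a made-up helper name.

```shell
# Hypothetical guard: succeeds when DIR already looks like a formatted
# namenode directory (it contains current/VERSION).
namenode_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Demo on a throwaway directory instead of the real namenode path.
demo=$(mktemp -d)
namenode_formatted "$demo" || echo "not formatted yet"
mkdir -p "$demo/current" && : > "$demo/current/VERSION"
namenode_formatted "$demo" && echo "already formatted"
rm -rf "$demo"
```

On the VM you could write namenode_formatted /usr/local/hadoop/hadoopdata/hdfs/namenode || hdfs namenode -format.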
Start and stop services

Now, the last thing to do is to start the Hadoop services:
hadoopuser@farlands ~ $ start-dfs.sh
hadoopuser@farlands ~ $ start-yarn.sh
To check the status of the services use the jps command:
hadoopuser@farlands ~ $ jps
26899 Jps
26216 SecondaryNameNode
25912 NameNode
26041 DataNode
26378 ResourceManager
26494 NodeManager
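Checking the list by eye works, but it is easy to overlook a daemon that died during startup. Here is a sketch that compares jps output against the five daemons listed above and prints whichever are absent. missing_daemons is a made-up helper, and the sample input is the output shown in this tutorial.

```shell
# Hypothetical helper: given jps output as one string, print the expected
# Hadoop daemons that do not appear in it.
missing_daemons() {
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Compare the whole second column, so that NameNode does not
        # accidentally match SecondaryNameNode.
        printf '%s\n' "$1" | awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' \
            || printf '%s\n' "$daemon"
    done
}

sample='26899 Jps
26216 SecondaryNameNode
25912 NameNode
26041 DataNode
26378 ResourceManager
26494 NodeManager'

missing_daemons "$sample"   # prints nothing: all five daemons are running
```

On the VM, missing_daemons "$(jps)" does the same against the live process list.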
To stop the services, run these commands:
hadoopuser@farlands ~ $ stop-dfs.sh
hadoopuser@farlands ~ $ stop-yarn.sh
Congratulations, you made it!
© Riccardo Macoratti 2016 - This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike
4.0 International License