
INSTALL OPEN JDK

Enrique Davila Big Data Instructor


enrique.davila@gmail.com
10/24/2016

Installing Hadoop on
Ubuntu 16

Install Java

Do I have Java? Type in the terminal:

java -version

If you see the output below, Java is not installed; follow the
instructions on the next slide
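
The same check can be scripted. This is only a sketch: have_cmd is a hypothetical helper name (not part of Ubuntu or Java), and it only looks for a java command on the PATH.

```shell
# Return success if the given command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd java; then
    # Java is present: show which version is installed.
    java -version
else
    echo "Java is not installed; install it before continuing."
fi
```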


Install Java

Type:

sudo apt-get install openjdk-8-jdk

Type Y to continue the installation process (it will take a while to
complete the installation)


Do I have Java?

To confirm Java is installed on your Ubuntu system, type:

java -version

You will see the output below


Install OpenSSH

It is mandatory to install the OpenSSH server:

sudo apt-get install openssh-server

Once the SSH server is installed, generate keys with the command below:

ssh-keygen -t rsa

Enter file, press Enter

Enter passphrase, press Enter

Enter same passphrase again, press Enter


SSH Keys

Now we will copy the key to the user and host; in my case the user is
hadoop and the host is hadoopdev

ssh-copy-id hadoop@hadoopdev


DOWNLOAD HADOOP FROM APACHE WEB PAGE


Download and Install Hadoop

Download Apache Hadoop

Type the following command in the terminal to create a new folder within
your Linux home folder, in this case /home/hadoop/:

mkdir hadoop_install

Then go into this new folder:

cd hadoop_install

And run the command below:

wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz


Download Apache Hadoop

You will see a window showing the progress of the download
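
Before unpacking, it can help to confirm the download is not truncated. A minimal sketch: is_valid_tarball is a hypothetical helper name, and it only checks that tar can list the archive; it does not verify a checksum or signature.

```shell
# Return success if the file exists and tar can list it as a gzip archive.
is_valid_tarball() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

# Usage, run inside hadoop_install after the download finishes:
#   is_valid_tarball hadoop-2.7.3.tar.gz && echo "archive looks complete"
```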


Unzip Hadoop folder

Once the download is complete, type the following command:

tar -xvf hadoop-2.7.3.tar.gz

Now you will see 2 entries: the downloaded archive and the new directory, called hadoop-2.7.3:


Setup bashrc

This is the java location (very important for next steps):

Edit bashrc

Type:

sudo gedit ~/.bashrc


Setup ~/.bashrc

Add these lines to the .bashrc

Please note that the Java path is displayed on the previous slide; JAVA_HOME
must point to the actual Java path

#HADOOP VARIABLES START

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64

export HADOOP_INSTALL=/home/hadoop/hadoop_install/hadoop-2.7.3

export PATH=$PATH:$HADOOP_INSTALL/bin

export PATH=$PATH:$HADOOP_INSTALL/sbin
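
If you are not sure what the actual Java path is on your machine, one way to find a candidate is to resolve the java binary and strip the trailing part. This is a sketch under assumptions: java_home_from_bin is a hypothetical helper, and it assumes the binary sits under .../bin/java or .../jre/bin/java, as in the usual OpenJDK layout.

```shell
# Derive a JAVA_HOME candidate from the resolved path of the java binary.
# Strips a trailing /bin/java, and then a trailing /jre if present.
java_home_from_bin() {
    java_bin=$1
    home=${java_bin%/bin/java}
    home=${home%/jre}
    printf '%s\n' "$home"
}

# Usage on a system where java is installed:
#   java_home_from_bin "$(readlink -f "$(command -v java)")"
```

Compare the printed folder with the value you set for JAVA_HOME in ~/.bashrc.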


Testing hadoop installation

Type the following command to apply the ~/.bashrc changes (no need to
restart):

source ~/.bashrc

Type the command below (if at this point you see an output like
this, you're doing well):

hadoop version


Setup single node


Point the Hadoop conf file to your Java

Go to the path:

/home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop

Edit the file:

sudo gedit hadoop-env.sh


Modifying hadoop-env.sh

Modify the value of JAVA_HOME in the file hadoop-env.sh
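
A sketch of the edited line, assuming the same Java path that is used in ~/.bashrc on the earlier slide (adjust it if your Java is installed elsewhere):

```shell
# In etc/hadoop/hadoop-env.sh, replace the default line
#   export JAVA_HOME=${JAVA_HOME}
# with an explicit path. The path below is an assumption taken from the
# ~/.bashrc slide; use the actual Java location on your system.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-amd64
```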


Modify core-site.xml

Create a folder called tmp in /home/hadoop/hadoop_install

Add the following text to core-site.xml; the file is on the path:

/home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop

<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop_install/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system.</description>
</property>
</configuration>


Modify mapred-site.xml

By default there is a file called mapred-site.xml.template; it needs to be
renamed to mapred-site.xml, and then the code below added:

File is on path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs at.</description>
</property>
</configuration>


Modify hdfs-site.xml

We need to create 2 new folders, which will contain the name node and the
data node:

I placed these 2 folders in: /home/hadoop/hadoop_install/
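
The two folders can be created from the terminal. A minimal sketch: make_hdfs_dirs is a hypothetical helper, and the folder names namenode and datanode are taken from the hdfs-site.xml paths used in this guide.

```shell
# Create the name node and data node storage folders under a base folder.
make_hdfs_dirs() {
    base=$1
    # -p creates missing parent folders and does not fail if they exist.
    mkdir -p "$base/namenode" "$base/datanode"
}

# Usage with the base folder from this guide:
#   make_hdfs_dirs /home/hadoop/hadoop_install
```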


Modify hdfs-site.xml
Add the code below to the file hdfs-site.xml; the paths for the namenode and datanode are the 2 new
folders you just created on the previous slide.
<configuration>

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoop_install/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///home/hadoop/hadoop_install/datanode</value>
</property>
</configuration>
#hdfs-site.xml is located on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop

Format the namenode

Run the following command:

hadoop namenode -format


Format the namenode part 2

If everything is OK, you will see the message below:


Running Hadoop Single node

Run the command:

start-all.sh

Then execute the command:

jps

You will see the following output
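
To check the jps output without reading it by eye, something like the sketch below can be used. Assumptions: missing_daemons is a hypothetical helper, and the list of expected daemons (NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager) is what a Hadoop 2.x single node normally starts with start-all.sh; adjust it if your setup differs.

```shell
# Print each expected daemon that does not appear in the given jps output.
missing_daemons() {
    jps_output=$1
    for name in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # jps prints "<pid> <class name>"; compare the second field exactly,
        # so NameNode does not match SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage on a running node (prints nothing when all daemons are up):
#   missing_daemons "$(jps)"
```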


Stop Cluster

To stop the cluster, run:

stop-all.sh


Web Interface: localhost:50070

In the browser go to: localhost:50070


Applies for:

This installation runs under:

Ubuntu 16

Hadoop 2.7.3

Virtual Machine:

2 processors

2 GB RAM

2 network interfaces: the first as Bridge, the second as NAT


Need help?

Contact name:

Enrique Davila Gutierrez

Enrique.davila@Gmail.com
