
Hadoop Installation Process

Download the Resource First, Then Extract

https://drive.google.com/drive/folders/1F0CJw-UQNZGzQ3igbO_mvOUFVhn-s721?usp=sharing

Inside this zip file:


1. bin.zip
2. CLI Command.txt
3. Configuration file.txt
4. Hadoop-3.2.1.tar.gz
5. Hadoop-hdfs-3.2.1.jar
6. Jdk-8u311-windows-x64.exe

Java Installation
Step 1:
Install the Java JDK in the default location; just keep
pressing the Next button.

Step 2:
To install the JRE files, you first have to create a
new folder in the C drive.
Step 3:
Now create a new folder named "Java" by clicking on
"Make New Folder".

Step 4:
After clicking Next you can see that the installation
process has started.

Step 5:
Now open Program Files.

There you will find a folder named Java.

Inside the Java folder you will find the JDK folder.

Step 6:
Now move this JDK folder into the "Java" folder you
created in the C drive.
Step 7:

The next step is to set up the environment variables for Java.

Go to the Windows Settings and click on System.
Inside System, type "environment variables".

Then select "Edit the system environment variables"

and you will get a dialog box.
Step 8:
Then select Environment Variables and you will get
another dialog box.
Inside the environment variables you need to set
JAVA_HOME as well as the Path entry for Java.

Clicking on New you will get a window where you
have to write the variable name and variable value.
Variable name is "JAVA_HOME".
Variable value is the JDK installation folder
"C:\java\jdk1.8.0_311"
(note: JAVA_HOME should point to the JDK root, not the
bin subfolder; the bin folder is added to Path in the
next step).
Step 9:
Then go to the System variables.
Inside that, select Path.
Step 10:
Then click Edit.

Create a new Path entry with the JDK bin location,
"C:\java\jdk1.8.0_311\bin".

Click OK on all the dialogs.

Java has now been successfully installed on our local
system.
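The dialog steps above can also be done from an elevated cmd prompt using setx. This is only a sketch assuming the JDK path used in this guide; the dialog route is the safer one, because setx merges the system and user Path when %Path% is expanded:

```bat
:: Set JAVA_HOME to the JDK root folder (path assumed from this guide)
setx JAVA_HOME "C:\java\jdk1.8.0_311"

:: Append the JDK bin folder to the user Path
:: (caution: %Path% here expands to the combined system+user Path)
setx Path "%Path%;C:\java\jdk1.8.0_311\bin"
```

Open a new cmd window afterwards so the updated variables are picked up.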

Step 11:
Now let's check that Java is functioning correctly.

Open cmd.

Just type javac.

If a list of compiler options appears in your terminal, it
means that Java is working properly.

Now check the version of Java installed on our local
system.
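The two checks above look like this in cmd (the exact version string depends on your JDK build):

```bat
:: Prints the compiler's usage and option list if Java is set up correctly
javac

:: Prints the installed Java version
java -version
```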
Hadoop Installation
Hadoop download Homepage:
https://hadoop.apache.org/releases.html

Step 1:

Move the Hadoop-3.2.1.tar.gz file into the C drive, then
extract it using the WinRAR application.
Rename the extracted folder to just hadoop.

Step 2:
The next step is to set the environment variable for
Hadoop. Before that we need to set the configuration
of Hadoop.

Inside the hadoop folder you will see a folder named etc.

Inside etc there is another folder named hadoop.

Now inside hadoop you can see that there are a lot of
files. Among them, four files are important.
They are
i) core-site.xml
ii) hdfs-site.xml
iii) mapred-site.xml
iv) yarn-site.xml
We need to edit these four files. After editing these four
files we need to edit one more file, which is hadoop-env.cmd.
Step 3:
Now edit core-site.xml

Set property and value

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
Here we have added one property, fs.default.name, which
sets the default file system to HDFS running on
localhost at port 9000.
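Note that the property block must be placed inside the file's existing <configuration> element. A complete core-site.xml for this setup would look like the sketch below (fs.default.name is the older spelling of fs.defaultFS; Hadoop 3.2.1 still accepts it):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```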

Step 4:
Now edit mapred-site.xml
Here we also need to add some properties
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>

<value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
</property>

Step 5:
Now update yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>

<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:8032</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>localhost:8088</value>
</property>
<property>

<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>localhost:8033</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>

<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
Step 6:

Now edit hdfs-site.xml.

Before editing it we need to create a folder named
"data" inside the hadoop folder.

Inside the data folder we need to create two new folders
named "namenode" and "datanode".

Step 7:
Now copy the locations of the namenode and datanode folders.
The datanode location is C:\hadoop\data\datanode.
The namenode location is C:\hadoop\data\namenode.

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<!-- <value>file:///DIRECTORY 1 HERE</value> -->
<value>C:\hadoop\data\namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<!-- <value>file:///DIRECTORY 2 HERE</value> -->
<value>C:\hadoop\data\datanode</value>
</property>
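The commented-out lines in the snippet hint at the alternative URI form: on Windows the same directories may also be written as file:/// URIs with forward slashes, for example:

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///C:/hadoop/data/namenode</value>
</property>
```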

Now we have successfully edited all four files.

Step 8:

Now edit hadoop-env.cmd.

Edit the Java home location.

Go to the environment variables and copy the JDK location.

This location is the value for JAVA_HOME.
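Inside hadoop-env.cmd the Java home is set with a plain `set` line. Assuming the JDK path used earlier in this guide, the edited line would read:

```bat
:: In C:\hadoop\etc\hadoop\hadoop-env.cmd
set JAVA_HOME=C:\java\jdk1.8.0_311
```

This is also why the JDK was moved out of Program Files earlier: an unquoted JAVA_HOME containing spaces tends to break hadoop-env.cmd.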


All the files are edited.

Step 9:
Now go to environment variables for setting home
and path for hadoop

Select new and write the variable name


HADOOP_HOME
Variable value = C:\hadoop\bin
Now set path for hadoop files
C:\hadoop\bin

You need to create another path variable name sbin


C:\hadoop\sbin is the location or path value for
sbin
Now we need to fix the Windows binary files.
Step 10:

Now extract the downloaded bin.zip file and use it to
replace Hadoop's main bin folder:

delete the hadoop bin folder,

and paste the new bin folder in its place.

Step 11:

The final step is to replace a jar file located in
( C:\hadoop\share\hadoop\hdfs ).
The file name is: hadoop-hdfs-3.2.1.jar.

Replace it with the newly downloaded hadoop-hdfs-3.2.1.jar
file.
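Before formatting anything, it is worth a quick sanity check that the variables and the new bin folder are picked up. In a new cmd window:

```bat
:: Should print the Hadoop home folder you configured
echo %HADOOP_HOME%

:: Should print the Hadoop version banner for 3.2.1
hadoop version
```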

Hadoop is now set up.

Let's check whether Hadoop is functioning properly.

Open cmd in administrator mode and type

"hdfs namenode -format"

If a lot of log output appears, Hadoop is set up
successfully.
The log should say that the namenode has been formatted
successfully.

If you get this message then congratulations, your Hadoop
is properly installed.

To start all the daemons, just type in cmd:

"start-all.cmd"

All the daemons will be started.

To check whether the daemons are running, just type "jps" in
cmd.

If you find that 4 daemons are running, then the Hadoop
configuration is OK.
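The whole verification sequence from an administrator cmd is sketched below. On this single-node setup, jps typically lists the four daemons NameNode, DataNode, ResourceManager and NodeManager (plus the Jps process itself):

```bat
:: One-time only: format the HDFS namenode
hdfs namenode -format

:: Start the HDFS and YARN daemons
start-all.cmd

:: List running Java processes; expect the four Hadoop daemons
jps
```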

Other CLI commands can be found in the CLI Command.txt file.
