
Commands in Hadoop

1. ls:

This command is used to list the files and directories at a given path. Use -ls -R (the older -lsr) for a recursive listing; it is useful when we
want the hierarchy of a folder.

Syntax:

hadoop fs -ls <path>

Example:

hadoop fs -ls /user

It will print all the files and directories present in the /user directory.
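
If a recursive listing is needed, newer Hadoop releases use the -R flag of ls. A minimal sketch, assuming the /user directory from the example above exists:

hadoop fs -ls -R /user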

2. mkdir:

To create a directory. In HDFS there is no home directory by default, so let's first create it.

Syntax:

hadoop fs -mkdir <folder name>

creating home directory:

hadoop fs -mkdir /user
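
If the parent directories do not exist yet, the -p flag creates them as well. A small sketch; /user/data is just an illustrative path:

hadoop fs -mkdir -p /user/data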

3. touchz:

It creates an empty file.

Syntax:

hadoop fs -touchz <file_path>

Example:

hadoop fs -touchz /user/myfile.txt

4. copyFromLocal (or) put:

To copy files/folders from the local file system to HDFS. This is the most important command.
The local file system means the files present on the OS, outside HDFS.

Syntax:

hadoop fs -copyFromLocal <local file path> <dest(present on hdfs)>

Example: Suppose we have a file AI.txt on the Desktop which we want to copy to the folder /user
present on HDFS.

hadoop fs -copyFromLocal ../Desktop/AI.txt /user

OR

hadoop fs -put ../Desktop/AI.txt /user

5. cat:

To print file contents.

Syntax:

hadoop fs -cat <path>

Example:

// print the content of AI.txt present

// inside the /user folder.

hadoop fs -cat /user/AI.txt

6. copyToLocal (or) get:

To copy files/folders from hdfs store to local file system.

Syntax:

hadoop fs -copyToLocal <src file (on hdfs)> <local file dest>

Example:

hadoop fs -copyToLocal /user/data.txt ../Desktop

OR

hadoop fs -get /user/data.txt ../Desktop

7. cp:

This command is used to copy files within HDFS. Let's copy the folder /user to /user_copied.

Syntax:

hadoop fs -cp <src(on hdfs)> <dest(on hdfs)>

Example:

hadoop fs -cp /user /user_copied

8. mv:

This command is used to move files within HDFS. Let's cut-paste the file myfile.txt from the /user folder
to /user_copied.

Syntax:

hadoop fs -mv <src(on hdfs)> <dest(on hdfs)>

Example:

hadoop fs -mv /user/myfile.txt /user_copied

9. du:

It will give the size of each file in the directory.

Syntax:

hadoop fs -du <dirName>

Example:

hadoop fs -du /user

10. dus:

This command will give the total size of a directory/file.

Syntax:
hadoop fs -dus <dirName>

Example:

hadoop fs -dus /user
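
On newer Hadoop releases -dus is deprecated; the same summary is produced with the -s flag of du, for example:

hadoop fs -du -s /user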


MapReduce word count program: steps

Step 1: Create a file with the name word_count_data.txt and add some data to it.
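
For example, the file can be created from the shell; the sentences below are only placeholder data, any text will do:

echo "hello world hello hadoop" > word_count_data.txt
echo "hadoop streaming word count example" >> word_count_data.txt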

Step 2: Create a mapper.py file that implements the mapper logic. It will read data from
STDIN, split each line into words, and emit each word together with a count of 1.
#!/usr/bin/env python
# import sys because we need to read and write data to STDIN and STDOUT
import sys

# reading entire line from STDIN (standard input)
for line in sys.stdin:
    # to remove leading and trailing whitespace
    line = line.strip()
    # split the line into words
    words = line.split()
    for word in words:
        print('%s\t%s' % (word, 1))
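
The mapper can be sanity-checked locally before involving Hadoop by piping a line of text into it. This is only a quick local test and assumes python is available on the PATH:

echo "hello world hello" | python mapper.py
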
Step 3: Create a reducer.py file that implements the reducer logic. It will read the output of
mapper.py from STDIN (standard input), aggregate the occurrences of each word, and
write the final output to STDOUT. Hadoop sorts the mapper output by key before it reaches the
reducer, which is why comparing each word with the previous one is enough to aggregate the counts.

#!/usr/bin/env python
from operator import itemgetter
import sys

current_word = None
current_count = 0

# the mapper output arrives sorted by word, so equal words are adjacent
for line in sys.stdin:
    line = line.strip()
    word, count = line.split('\t')
    count = int(count)
    if current_word == word:
        current_count += count
    else:
        if current_word:
            # a new word has started; emit the count of the previous one
            print('%s\t%s' % (current_word, current_count))
        current_count = count
        current_word = word

# emit the count for the last word
if current_word == word:
    print('%s\t%s' % (current_word, current_count))
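
The full map-sort-reduce pipeline can also be simulated locally. The sort step between the two scripts mimics Hadoop's shuffle phase, which is what the reducer relies on. Again, this assumes python is on the PATH and is only a local check:

echo "hello world hello" | python mapper.py | sort -k1,1 | python reducer.py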

Step 4: Now let's start all our Hadoop daemons with the below command (on Windows, use start-all.cmd instead).

start-all.sh
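
One way to confirm the daemons are up (assuming a JDK is installed so that jps is available) is to list the running Java processes; NameNode, DataNode, ResourceManager, and NodeManager should appear:

jps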

Now make a directory word_count_in_python in the HDFS root directory that will
store our word_count_data.txt file, with the below command.
hdfs dfs -mkdir /word_count_in_python

Copy word_count_data.txt to this folder in our HDFS with the help of the copyFromLocal command.

The syntax to copy a file from your local file system to HDFS is given below:

hdfs dfs -copyFromLocal /path1 /path2 .... /pathn /destination
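
For instance, assuming word_count_data.txt was saved under /home/dikshant/Documents (an illustrative path, adjust it to wherever the file actually lives):

hdfs dfs -copyFromLocal /home/dikshant/Documents/word_count_data.txt /word_count_in_python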

Let's give executable permission to our mapper.py and reducer.py with the help of the below commands.

cd Documents/

chmod 777 mapper.py reducer.py  # read, write, and execute permission for user, group, and others

Step 5: Now download the latest hadoop-streaming jar file. Then place this hadoop-streaming
jar in a location from which you can easily access it.

Now let's run our Python files with the help of the Hadoop streaming utility as shown below.

Run command:

hadoop jar /home/dikshant/Documents/hadoop-streaming-2.7.3.jar \
    -input /word_count_in_python/word_count_data.txt \
    -output /word_count_in_python/output \
    -mapper mapper.py \
    -reducer reducer.py
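
Once the job finishes, the result can be inspected directly from HDFS. The part-00000 file name is the usual default for a single reducer and may differ if the job is configured otherwise:

hdfs dfs -cat /word_count_in_python/output/part-00000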
