PREREQUISITES
● Download the ‘FileSystemOperationsTest.java’ file.
● Please ensure that you have installed the following on your Windows machine:
1. WinSCP tool.
2. Notepad++.
IMPORTANT INSTRUCTIONS
● The following notation has been used while running the Java API code:
[ec2-user@ip-10-0-0-14 ~]$ hadoop command
NOTE: Before starting with the document below, it is necessary to have created the
EC2 instance with Cloudera installed on it and to have connected to it. If not,
kindly go through Video 1 and Video 2 before getting started with this document.
STEPS TO ACCESS HDFS USING THE JAVA API ON AMAZON EC2
● Set the Java environment variables as follows:
1. Open the ‘/etc/profile’ file in the vi editor:
vi /etc/profile
2. Add the following at the end of the file as shown below. Please change to the
insert mode by pressing i before pasting the following lines.
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera/
export JRE_HOME=/usr/java/jdk1.7.0_67-cloudera/jre/
export PATH=$JAVA_HOME/bin:$PATH
3. Now, save and exit the file. It is important to exit from the insert mode and
enter the following in the command mode while using the vi editor:
:wq!
● Now run the following commands as shown below:
[root@ip-10-0-0-14 ~]# source /etc/profile
[root@ip-10-0-0-14 ~]# echo $JAVA_HOME
/usr/java/jdk1.7.0_67-cloudera/
[root@ip-10-0-0-14 ~]# javac -version
javac 1.7.0_67
● Next, transfer the ‘FileSystemOperationsTest.java’ file from your Windows machine to
the EC2 instance:
1. Open WinSCP.
● To verify that the file has been transferred, use the following commands:
[ec2-user@ip-10-0-0-14 ~]$ sudo -i (helps shift from the ec2-user to the root user)
[root@ip-10-0-0-14 ~]# ls
FileSystemOperationsTest.java test.txt
You can see that the file has been copied to the root user home directory.
● Now, create a directory ‘testapi’ using the ‘mkdir’ command and copy the file
‘FileSystemOperationsTest.java’ into it:
[root@ip-10-0-0-14 ~]# mkdir testapi
[root@ip-10-0-0-14 ~]# cp FileSystemOperationsTest.java testapi/
[root@ip-10-0-0-14 ~]# cd testapi
[root@ip-10-0-0-14 testapi]# ls
FileSystemOperationsTest.java
You can see that the file is present.
● Now, create a new directory ‘testapi_classes’ using the ‘mkdir’ command to store
the class files after the compilation of the Java code.
[root@ip-10-0-0-14 testapi]# mkdir testapi_classes
● Set the environment variable for the Hadoop classpath using the command below:
[root@ip-10-0-0-14 testapi]# export HADOOP_CLASSPATH=$(hadoop classpath)
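The compile step appears to have been dropped from this page during extraction. A typical javac invocation, assuming the class files are to be placed in the ‘testapi_classes’ directory created above, would be:

```shell
# Hypothetical compile command: javac needs the Hadoop client jars
# on its classpath, which $HADOOP_CLASSPATH (set above) provides.
javac -classpath "$HADOOP_CLASSPATH" -d testapi_classes FileSystemOperationsTest.java
```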
[root@ip-10-0-0-14 testapi]# cd testapi_classes
[root@ip-10-0-0-14 testapi_classes]# ls
FileSystemOperationsDemo.class FileSystemOperationsTest.class
You can see that a file FileSystemOperationsDemo.class has been created.
● Navigate back to the ‘testapi’ folder using the ‘cd ..’ command.
[root@ip-10-0-0-14 testapi_classes]# cd ..
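The command that produced the ‘added manifest’ output below also seems to have been lost in extraction. A typical jar invocation, assuming the archive is to be named ‘test.jar’ (as used later in this document), would be:

```shell
# Hypothetical packaging command: bundle the compiled classes
# from testapi_classes into test.jar.
jar cvf test.jar -C testapi_classes .
```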
added manifest
adding: FileSystemOperationsDemo.class(in = 4394) (out= 2062)(deflated 53%)
adding: FileSystemOperationsTest.class(in = 2655) (out= 1332)(deflated 49%)
● Create a sample file ‘file1.txt’ with a few lines of text, for example using ‘cat’
(press Ctrl+D to save):
[root@ip-10-0-0-14 testapi]# cat > file1.txt
qwe
qwer
Running the Java API Code
● Run the Java API code using the command shown below:
[root@ip-10-0-0-14 testapi]# hadoop jar test.jar FileSystemOperationsTest
● Now let us remove the file file1.txt from our local directory using the ‘rm -rf file1.txt’
command.
[root@ip-10-0-0-14 testapi]# rm -rf file1.txt
[root@ip-10-0-0-14 testapi]# ls
FileSystemOperationsTest.java testapi_classes test.jar
● Running the Java API code again and entering 2 (HDFS to local) copies the file from
HDFS back to the local directory. We can verify that the file is present again using the
‘ls’ command.
[root@ip-10-0-0-14 testapi]# ls
file1.txt FileSystemOperationsTest.java testapi_classes test.jar
You can see that file1.txt exists.
● Run the Java API code
[root@ip-10-0-0-14 testapi]# hadoop jar test.jar FileSystemOperationsTest
Enter 1 for Local to HDFS
Enter 2 for HDFS to local
Enter 3 for HDFS to HDFS
Enter 4 for splitting a file
Enter 5 for deletion from HDFS
Enter 6 for making a directory
Enter 7 for exit…
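For reference, the operations behind this menu correspond roughly to the following calls on Hadoop's FileSystem API. This is an illustrative sketch only (the class name and paths are placeholders, not the actual contents of ‘FileSystemOperationsTest.java’), and it needs the Hadoop client libraries on the classpath to compile:

```java
// Illustrative sketch only; NOT the actual contents of
// FileSystemOperationsTest.java. Assumes the Hadoop client
// libraries (supplied via $HADOOP_CLASSPATH above) are available.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class HdfsMenuSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf); // handle to HDFS

        // Option 1: Local to HDFS
        fs.copyFromLocalFile(new Path("file1.txt"), new Path("/user/root/file1.txt"));

        // Option 2: HDFS to local
        fs.copyToLocalFile(new Path("/user/root/file1.txt"), new Path("file1.txt"));

        // Option 3: HDFS to HDFS (false = do not delete the source)
        FileUtil.copy(fs, new Path("/user/root/file1.txt"),
                      fs, new Path("/user/root/av.txt"), false, conf);

        // Option 4 (splitting a file) is omitted in this sketch.

        // Option 5: deletion from HDFS (second argument: recursive)
        fs.delete(new Path("/user/root/file1_1.txt"), false);

        // Option 6: making a directory
        fs.mkdirs(new Path("/user/root/aa"));

        fs.close();
    }
}
```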
● If the user enters 3:
Enter HDFS source and destination…
/user/root/file1.txt
/user/root/av.txt
18/02/13 11:37:08 INFO Configuration.deprecation: fs.default.name is deprecated.
Instead, use fs.defaultFS
● We can verify whether the file has been copied from HDFS to HDFS using the
command shown below. As you can see, the file av.txt is present, and it has the same
contents as file1.txt.
[root@ip-10-0-0-14 testapi]# hadoop fs -ls /user/root/
Found 3 items
drwxr-xr-x - root supergroup 0 2018-02-13 11:37 /user/root/av.txt
-rw-r--r-- 3 root supergroup 9 2018-02-13 10:42 /user/root/file1.txt
-rw-r--r-- 6 root supergroup 27 2018-02-13 07:50 /user/root/test.txt
● You can also verify the same using ‘Hue’.
To access Hue, type your public IP followed by :8888 in your browser:
<Public IP>:8888
Click on the icon in the top left corner followed by ‘files’, then click on ‘hdfs’
followed by ‘root’, and verify whether the file av.txt exists.
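The copy-and-verify idea behind option 3 can be illustrated with plain Java on the local filesystem. This is a self-contained stand-in using java.nio, with hypothetical names, not the document's actual HDFS code:

```java
import java.nio.file.*;
import java.util.Arrays;

public class CopyVerifySketch {
    // Copy src to dst and return true when both files end up
    // byte-for-byte identical, which is the check the
    // 'hadoop fs -ls' step above performs informally.
    public static boolean copyAndVerify(Path src, Path dst) throws Exception {
        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
        return Arrays.equals(Files.readAllBytes(src), Files.readAllBytes(dst));
    }

    public static void main(String[] args) throws Exception {
        Path src = Files.createTempFile("file1", ".txt");
        Files.write(src, "qwe\nqwer\n".getBytes());
        Path dst = Files.createTempFile("av", ".txt");
        System.out.println(copyAndVerify(src, dst)); // prints: true
        Files.delete(src);
        Files.delete(dst);
    }
}
```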
● Run the Java API code
[root@ip-10-0-0-14 testapi]# hadoop jar test.jar FileSystemOperationsTest
Enter 1 for Local to HDFS
Enter 2 for HDFS to local
Enter 3 for HDFS to HDFS
Enter 4 for splitting a file
Enter 5 for deletion from HDFS
Enter 6 for making a directory
Enter 7 for exit…
● If the user enters 4:
Enter HDFS source…
/user/root/file1.txt
18/02/13 11:40:41 INFO Configuration.deprecation: fs.default.name is deprecated.
Instead, use fs.defaultFS
File :- file1_1.txt created!!!!
● You can verify the same using the command shown below. As you can see, the file has
been split into only a single part. This is because the default block size in HDFS is 128
MB; since the file is smaller than 128 MB, it results in just one split file.
[root@ip-10-0-0-14 testapi]# hadoop fs -ls /user/root/
Found 3 items
drwxr-xr-x - root supergroup 0 2018-02-13 11:37 /user/root/av.txt
-rw-r--r-- 3 root supergroup 9 2018-02-13 11:40 /user/root/file1_1.txt
-rw-r--r-- 6 root supergroup 27 2018-02-13 07:50 /user/root/test.txt
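The splitting behaviour described above can be sketched in plain Java. This is a local, self-contained stand-in (the class and method names are hypothetical, and a byte array stands in for the HDFS file) showing why a file smaller than the block size yields exactly one part:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitSketch {
    // Split data into chunks of at most blockSize bytes, mirroring
    // how a file smaller than the HDFS block size (128 MB by default)
    // produces exactly one part file.
    public static List<byte[]> split(byte[] data, int blockSize) {
        List<byte[]> parts = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
            int end = Math.min(off + blockSize, data.length);
            parts.add(Arrays.copyOfRange(data, off, end));
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] small = new byte[9];   // like the 9-byte file1.txt above
        byte[] large = new byte[300]; // would span three 128-byte blocks
        System.out.println(split(small, 128).size()); // prints: 1
        System.out.println(split(large, 128).size()); // prints: 3
    }
}
```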
● Run the Java API code
[root@ip-10-0-0-14 testapi]# hadoop jar test.jar FileSystemOperationsTest
Enter 1 for Local to HDFS
Enter 2 for HDFS to local
Enter 3 for HDFS to HDFS
Enter 4 for splitting a file
Enter 5 for deletion from HDFS
Enter 6 for making a directory
Enter 7 for exit…
● If the user enters 5:
Enter HDFS source to be deleted…
/user/root/file1_1.txt
18/02/13 11:45:44 INFO Configuration.deprecation: fs.default.name is deprecated.
Instead, use fs.defaultFS
Enter 1 for Local to HDFS
Enter 2 for HDFS to local
Enter 3 for HDFS to HDFS
Enter 4 for splitting a file
Enter 5 for deletion from HDFS
Enter 6 for making a directory
Enter 7 for exit…
Enter 7
● You can verify the same using the command shown below. You can see that the file
file1_1.txt has been deleted.
[root@ip-10-0-0-14 testapi]# hadoop fs -ls /user/root/
Found 2 items
drwxr-xr-x - root supergroup 0 2018-02-13 11:37 /user/root/av.txt
-rw-r--r-- 6 root supergroup 27 2018-02-13 07:50 /user/root/test.txt
● Run the Java API code once more and enter 6 for making a directory, giving the name
of the directory to be created; then enter 7 to exit.
● You can verify whether the directory has been created or not using the following
command. It can be seen that the directory ‘aa’ has now been created.
[root@ip-10-0-0-14 testapi]# hadoop fs -ls /user/root/
Found 3 items
drwxr-xr-x - root supergroup 0 2018-02-13 11:54 /user/root/aa
drwxr-xr-x - root supergroup 0 2018-02-13 11:37 /user/root/av.txt
-rw-r--r-- 6 root supergroup 27 2018-02-13 07:50 /user/root/test.txt
You can verify the same using Hue as described above.