
S. No. | Pre-requisite software | Question | Marks
1. Cloudera Hadoop with Hive & Sqoop. Topic: Sqoop with MySQL
   1. Make a new directory called Exam in HDFS using the Hadoop file system shell.
   2. Create a local file called DEPT, which contains department information, and load it into the HDFS Exam directory using the Hadoop file system shell.
   3. Create a MySQL table Employee and import the same table from the relational database into the HDFS Exam directory.

2. Cloudera Hadoop with Hive & Sqoop. Topic: Hive
   1. Create the Product table. The product structure is as follows:
      Product file - id, userid, prod_name, pur_mon, pur_year
   2. Load the product file into the Hive table using the load command.
   3. Perform an overwrite operation on the Product table.
   4. Create a partition on the Product table using the column deptno.
   5. Load data into the partitioned table and check whether the warehouse partition is created.

3. Cloudera Hadoop with Hive. Topic: Hive
   Movie Dataset Analysis using Hive
   Note: refer to the Movie data set.
   1. Create the Movie table in Hive.
   2. Load data into the Hive table.
   3. Write HiveQL for the use cases below:
      - List the movies that have a rating greater than 4.
      - List the movie names and their durations in minutes.
      - List the years and the number of movies released each year.
      - List all the movies in ascending order of year.
      - List all the movies in descending order of year.
      - List the movies that were released between 1950 and 1960.
      - List the movies that start with the letter A.
      - List the movies that have a duration greater than 2 hours.
      - List the movies that have a rating between 3 and 4.

Q1. Write a query to display the department name and the number of workers in each department.
Sort the result in ascending order of dept_name.

Q2. Write a query to display the worker name and job title of those workers who are either managers or
executives and have served in these positions for at least 5 years.

No. 1
1. Make a new directory called Exam in HDFS using the Hadoop file system shell.

hadoop fs -mkdir /user/cloudera/Exam

2. Create a local file called DEPT which contains department information and load
into HDFS Exam directory using the Hadoop file system shell.
vi DEPT
hadoop fs -put DEPT /user/cloudera/Exam
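
As an alternative to typing the file in vi, the DEPT file can be written from the shell. The following is a minimal sketch: the three rows and the deptno,dname,location layout are hypothetical sample data, not part of the exam, and the upload only runs when the `hadoop` client is available.

```shell
# Hypothetical sample rows (deptno,dname,location); replace with the real data.
cat > DEPT <<'EOF'
10,ACCOUNTING,NEW YORK
20,RESEARCH,DALLAS
30,SALES,CHICAGO
EOF

# Upload only when the Hadoop client is on PATH (e.g. inside the Cloudera VM).
if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -mkdir -p /user/cloudera/Exam          # no error if it already exists
    hadoop fs -put -f DEPT /user/cloudera/Exam       # -f overwrites an earlier copy
    hadoop fs -cat /user/cloudera/Exam/DEPT          # print the uploaded file
fi
```

The final `hadoop fs -cat` prints the uploaded file so the load can be checked directly.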

3. Create a MySQL table Employee and import the same table from the relational
database into the HDFS Exam directory.
mysql> create table Employee(
    -> empid int,
    -> firstname varchar(20),
    -> lastname varchar(20),
    -> phoneno varchar(15));
mysql> insert into Employee
    -> values (1901040063, 'Nguyen', 'Duc', '0387772001');
mysql> insert into Employee
    -> values (1901040231, 'Mai', 'Uyen', '123456789');
mysql> insert into Employee
    -> values (1801040777, 'Trinh', 'Thien', '987654321');

sqoop import -m 1 --connect jdbc:mysql://quickstart:3306/mydb \
  --username root --password cloudera --table Employee \
  --target-dir /user/cloudera/Exam/Employee
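
To confirm that the import worked, the target directory can be listed and its contents printed. This is a sketch, assuming the import ran with `-m 1` as above: with a single mapper Sqoop writes one output file, named part-m-00000. The variable names are illustrative only.

```shell
TARGET_DIR=/user/cloudera/Exam/Employee
# One mapper (-m 1) means a single output file, part-m-00000.
PART_FILE="$TARGET_DIR/part-m-00000"

if command -v hadoop >/dev/null 2>&1; then
    hadoop fs -ls "$TARGET_DIR"      # list the files written by the import
    hadoop fs -cat "$PART_FILE"      # print the imported rows as comma-separated text
else
    echo "hadoop not found; run this inside the Cloudera VM to check $PART_FILE"
fi
```

Sqoop refuses to import into a target directory that already exists, so remove the directory (or pass `--delete-target-dir`) before rerunning the import.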
No. 2
1. Create the Product table. The product structure is as follows:

Product file - id, userid, prod_name, pur_mon, pur_year

hive> create table Product
    > (id int, userid string, prod_name string, pur_mon string, pur_year int)
    > row format delimited
    > fields terminated by '|'
    > stored as textfile;

2. Load the product file into the Hive table using the load command.

vi Product
cat Product
hadoop fs -put Product /user/cloudera/Exam

hive> load data inpath '/user/cloudera/Exam/Product' into table Product;
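
Steps 3 to 5 of this question (overwrite, partition, load into the partition) have no answer recorded here. One possible approach is sketched below, under several assumptions: the local file Product is in the current directory, a separate table named Product_part (a hypothetical name) holds the partitioned data because deptno is not a column of the product file, deptno=10 is an arbitrary example value, and the Hive warehouse is at the default /user/hive/warehouse.

```shell
# Write the HiveQL for steps 3-5 to a script file.
cat > product_steps.hql <<'EOF'
-- Step 3: OVERWRITE replaces the current contents of Product
LOAD DATA LOCAL INPATH 'Product' OVERWRITE INTO TABLE Product;

-- Step 4: deptno is not in the file, so it is declared as a partition column
CREATE TABLE IF NOT EXISTS Product_part
  (id INT, userid STRING, prod_name STRING, pur_mon STRING, pur_year INT)
PARTITIONED BY (deptno INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|'
STORED AS TEXTFILE;

-- Step 5: load the file into one partition and list the partitions
LOAD DATA LOCAL INPATH 'Product' INTO TABLE Product_part PARTITION (deptno=10);
SHOW PARTITIONS Product_part;
EOF

# Run the script only when Hive is available (e.g. inside the Cloudera VM).
if command -v hive >/dev/null 2>&1; then
    hive -f product_steps.hql
    # A deptno=10 subdirectory here means the partition was created.
    hadoop fs -ls /user/hive/warehouse/product_part
fi
```

`LOAD DATA LOCAL INPATH` copies from the local file system, so the same file can be loaded twice. The non-LOCAL form used in step 2 moves the HDFS file into the warehouse instead, so it would have to be uploaded again before each load.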


No. 3
1. Create the Movie table in Hive.

hive> create table Movie
    > (movieid int, moviename string, year int, rating float, duration bigint)
    > row format delimited
    > fields terminated by '|'
    > stored as textfile;
2. Load data into the Hive table.

vi Movie
cat Movie
hadoop fs -put Movie /user/cloudera/Exam

hive> load data inpath '/user/cloudera/Exam/Movie' into table Movie;


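3. Write HiveQL for the use cases below. The duration column is taken to be in seconds.

-- Movies with a rating greater than 4
select * from Movie where rating > 4;
-- Movie names and their duration in minutes
select moviename, round(duration/60) as duration_minutes from Movie;
-- Number of movies released each year
select year, count(*) as nomovie from Movie group by year;
-- All movies in ascending, then descending, order of year
select * from Movie order by year asc;
select * from Movie order by year desc;
-- Movies released between 1950 and 1960
select * from Movie where year between 1950 and 1960;
-- Movies whose name starts with the letter A
select * from Movie where moviename like 'A%';
-- Movies longer than 2 hours
select * from Movie where (duration/3600) > 2;
-- Movies with a rating between 3 and 4
select * from Movie where rating between 3 and 4;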
