Apache Hadoop PDF
Module 2
Agenda
• Installation
• Administering Hadoop
• Hive
• Pig
• Sqoop
• HBase
Installation
Administrative tool
After Hadoop has been installed successfully on the system, open cmd, change directory to "C:\Hadoop*\sbin", and run "start-all.cmd" to start the Hadoop daemons.
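Once the daemons are started, one way to confirm they are up is to look for them in the output of jps, the JDK tool that lists running Java processes. The sketch below assumes a bash-compatible shell (for example Git Bash on Windows, or Linux) with jps on the PATH. The helper name check_hadoop_daemons and the list of expected daemons are assumptions for a single-node setup, not something Hadoop ships.

```shell
#!/usr/bin/env bash
# Sketch: report which of the expected Hadoop daemons appear in `jps` output.
# check_hadoop_daemons is a hypothetical helper; the daemon list is an
# assumption for a single-node cluster.
check_hadoop_daemons() {
  local running missing=0 daemon
  running="$(jps)"
  for daemon in NameNode DataNode ResourceManager NodeManager; do
    # -w matches whole words, so SecondaryNameNode does not count as NameNode.
    if printf '%s\n' "$running" | grep -qw "$daemon"; then
      echo "$daemon: running"
    else
      echo "$daemon: NOT running"
      missing=1
    fi
  done
  # Exit status is 0 only when every expected daemon was found.
  return "$missing"
}

# Example usage (after start-all has been run):
#   check_hadoop_daemons
```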
SAFE MODE
Safe mode gives the datanodes time to check in with the namenode so that the filesystem can run effectively.
When you start a freshly formatted HDFS cluster, the namenode does not go into safe mode, since there are
no blocks in the system.
Safe Mode Commands:
• hdfs dfsadmin -safemode get: Checks whether the namenode is in safe mode.
• hdfs dfsadmin -safemode wait: Waits for the namenode to exit safe mode; useful when a command should
run only after safe mode has ended.
• hdfs dfsadmin -safemode enter: Enters safe mode.
• hdfs dfsadmin -safemode leave: Leaves safe mode.
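The safe mode commands above might be combined in a script along these lines. This is a minimal sketch assuming a bash shell with hdfs on the PATH and a running cluster; run_after_safemode is a hypothetical helper name, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Sketch: wait for the namenode to leave safe mode, then run a command.
# run_after_safemode is a hypothetical helper name.
run_after_safemode() {
  # Print the current state of safe mode.
  hdfs dfsadmin -safemode get
  # Block until the namenode has exited safe mode.
  hdfs dfsadmin -safemode wait || return 1
  # Run whatever command was passed as arguments.
  "$@"
}

# Example usage (requires a running cluster):
#   run_after_safemode hdfs dfs -ls /
```

Using wait instead of leave lets the namenode exit safe mode on its own rather than forcing it out.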
Hive
DDL Commands
1 Create
2 Alter
3 Drop
4 Show
5 Truncate
6 Delete
DDL Command:
CREATE:
CREATE DATABASE DigitalVidhya;
OR
CREATE DATABASE IF NOT EXISTS DigitalVidhya;
CUSTOM LOCATION
CREATE DATABASE DigitalVidhya
LOCATION "/your/preferred/path/in/HDFS";
DATABASE PROPERTIES:
CREATE DATABASE DigitalVidhya
COMMENT 'This comment is added for Hadoop Tutorial!'
WITH DBPROPERTIES("owner"="ManishBhagchandani");
DESCRIBE:
DESCRIBE DATABASE DigitalVidhya;
DDL Command (continued)
SHOW:
Ex.
SHOW DATABASES;
SHOW TABLES;
• Practical Aspects
• LOAD DATA LOCAL INPATH '/home/hive/emp.csv' INTO TABLE employee;
• SELECT * FROM employee;
ALTERING:
Renaming tables
Modifying columns
Deleting columns
Changing table properties
Adding partitions
Altering storage properties
Altering database properties
Ex.
ALTER TABLE stud RENAME TO student;
ALTER TABLE student CHANGE COLUMN sname student_name STRING;
ALTER TABLE student REPLACE COLUMNS (sname STRING, grade STRING, city STRING);
ALTER TABLE student REPLACE COLUMNS (sname STRING, grade STRING);
ALTER TABLE student ADD COLUMNS (city STRING);
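The ALTER cases listed above that have no example yet (table properties, partitions, storage properties, database properties) could look roughly like the following. This is a sketch, not a tested migration: the property keys and values, the partition column enroll_year, and the helper name print_alter_examples_hql are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: print a few further ALTER statements as a HiveQL script.
# All property names/values and the enroll_year partition column are
# hypothetical examples.
print_alter_examples_hql() {
  cat <<'EOF'
-- Change table properties.
ALTER TABLE student SET TBLPROPERTIES ('notes' = 'renamed from stud');

-- Add a partition; this only works if student was created with
-- PARTITIONED BY (enroll_year INT), which is an assumption here.
ALTER TABLE student ADD IF NOT EXISTS PARTITION (enroll_year = 2020);

-- Alter storage properties; this changes table metadata only and does
-- not convert data files that already exist.
ALTER TABLE student SET FILEFORMAT ORC;

-- Alter database properties.
ALTER DATABASE DigitalVidhya SET DBPROPERTIES ('edited-by' = 'admin');
EOF
}

# Example usage (requires Hive):
#   print_alter_examples_hql > alter_examples.hql && hive -f alter_examples.hql
```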
DROP:
Ex.
DROP TABLE IF EXISTS student;
DROP DATABASE IF EXISTS DigitalVidhya;
Select Statements
• LOAD DATA LOCAL INPATH '/home/hive/emp-gujarat.csv'
INTO TABLE employee;
• select name, salary from employee;
• select e.name, e.salary from employee e;
• select name, technology from employee;
• select symbol, `price.*` from employee;
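The statements above load data into and query an employee table, which has to exist before LOAD DATA can run. A minimal sketch of that setup follows; the column names come from the SELECT statements above, while the column types, the comma delimiter, and the helper name print_employee_hql are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: print a HiveQL script that creates the employee table, loads the
# CSV file, and runs one of the queries.
# Column types and the ',' delimiter are assumptions about emp.csv.
print_employee_hql() {
  cat <<'EOF'
CREATE TABLE IF NOT EXISTS employee (
  name STRING,
  salary DOUBLE,
  technology STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/home/hive/emp.csv' INTO TABLE employee;

SELECT name, salary FROM employee;
EOF
}

# Example usage (requires Hive):
#   print_employee_hql > employee.hql && hive -f employee.hql
```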
Pig
• Pig raises the level of abstraction for processing large datasets.
• Pig is made up of two pieces:
– The language used to express data flows, called Pig Latin.
– The execution environment to run Pig Latin programs.
• A Pig Latin program is made up of a series of operations, or
transformations, that are applied to the input data to produce
output.
• Pig is a scripting language for exploring large datasets.
• Pig was designed to be extensible.
Data type
Query
• Load
• Foreach
• Filter
• Dump
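The four operations listed above can be chained into one small Pig Latin data flow. The sketch below is illustrative only: the input path, the schema, the salary threshold, and the helper name print_pig_script are assumptions.

```shell
#!/usr/bin/env bash
# Sketch: print a Pig Latin script that uses LOAD, FILTER, FOREACH and DUMP.
# The input path, schema and threshold are hypothetical.
print_pig_script() {
  cat <<'EOF'
-- LOAD: read the CSV into a relation with a declared schema.
emp = LOAD '/home/hive/emp.csv' USING PigStorage(',')
      AS (name:chararray, salary:double, technology:chararray);

-- FILTER: keep only the rows that satisfy a condition.
well_paid = FILTER emp BY salary > 50000.0;

-- FOREACH: project the fields of interest from each remaining row.
result = FOREACH well_paid GENERATE name, salary;

-- DUMP: print the relation to the console.
DUMP result;
EOF
}

# Example usage (requires Pig; -x local runs against the local filesystem):
#   print_pig_script > emp.pig && pig -x local emp.pig
```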
Sqoop
• Sqoop is a command-line tool that runs in bash or zsh. The steps below outline how to import
data from MySQL into HDFS with Sqoop.
• MariaDB is used here instead of MySQL. It is an open-source fork of MySQL, so the commands
used here are the same in both.
• Practical Aspects
– Create Database
– Use Database
– Create table
– Insert rows
– View Table
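The practical steps above, followed by the import itself, might look like this. It is a sketch under assumptions: the database name sqoop_demo, the table layout, the sample rows, the JDBC host and port, the HDFS target directory, and both helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: SQL for the MySQL/MariaDB steps, plus a Sqoop import of the table.
# Database name, schema, rows, host/port and target directory are hypothetical.
print_mysql_setup_sql() {
  cat <<'EOF'
-- Create Database
CREATE DATABASE IF NOT EXISTS sqoop_demo;
-- Use Database
USE sqoop_demo;
-- Create table
CREATE TABLE IF NOT EXISTS employee (
  id INT PRIMARY KEY,
  name VARCHAR(50),
  salary DECIMAL(10, 2)
);
-- Insert rows (made-up sample values)
INSERT INTO employee (id, name, salary) VALUES
  (1, 'Asha', 52000.00),
  (2, 'Ravi', 48000.00);
-- View Table
SELECT * FROM employee;
EOF
}

sqoop_import_employee() {
  # $1 is the database user; -P makes Sqoop prompt for the password.
  sqoop import \
    --connect jdbc:mysql://localhost:3306/sqoop_demo \
    --username "$1" -P \
    --table employee \
    --target-dir /user/hadoop/employee \
    --num-mappers 1
}

# Example usage (requires MySQL or MariaDB, Hadoop and Sqoop):
#   print_mysql_setup_sql | mysql -u root -p
#   sqoop_import_employee root
```

A single mapper keeps the output in one file, which is convenient for a small demonstration table.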
HBASE
Thank you