
Distributed storage & horizontal scalability

Increasing the number of systems & operating them in parallel.

Vertical scalability

Increasing the disk size & RAM of a single system.


HDFS -> used for storage -> a distributed file system -> built from one NameNode [master] & multiple
DataNodes [slaves], with the number of DataNodes growing with the data size -> HDFS fault tolerance is
based on 2 factors: the Replication factor & the Block size -> the Block size by default is 64MB (128MB in
Hadoop 2 & later) -> the total blocks required is (File Size)/(Block Size), rounded up, since a partial
last block still occupies a block -> each block is distributed according to the Replication factor -> the
same block is replicated over N DataNodes, where N is the Replication factor (see the arithmetic sketch
below).
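
A minimal sketch of the block arithmetic above. The 1GB file size, 64MB block size & replication factor
of 3 are illustrative assumptions, not values read from a real cluster:

// Sketch of the HDFS block arithmetic described above.
public class HdfsBlockMath {
    public static void main(String[] args) {
        long fileSize = 1024L * 1024 * 1024;   // 1 GB file (assumption)
        long blockSize = 64L * 1024 * 1024;    // 64 MB default block size
        int replicationFactor = 3;             // common default (assumption)

        // Blocks required: file size / block size, rounded up,
        // because a partial final block still occupies one block.
        long blocks = (fileSize + blockSize - 1) / blockSize;

        // Each block is copied to N DataNodes, so raw storage is N times larger.
        long rawBytes = fileSize * replicationFactor;

        System.out.println("Blocks required: " + blocks);                          // 16
        System.out.println("Total block replicas: " + blocks * replicationFactor); // 48
        System.out.println("Raw storage used (MB): " + rawBytes / (1024 * 1024));  // 3072
    }
}

For this input the division is exact (16 blocks); the rounding up only matters when the file size is not
an exact multiple of the block size.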
MAP REDUCE -> framework used for processing data -> native support for Java -> a Mapper & Reducer
combination -> the Mapper performs parallel processing of the input supplied to the MapReduce
framework -> the framework distributes the instruction set among the DataNodes for parallel
processing -> the Reducer merges the results obtained from the parallel processing on the different
DataNodes & aggregates them (see the WordCount sketch below).

HIVE -> SQL-like query (HiveQL) support for analytics
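
Hive queries are usually run from the Hive shell, but HiveServer2 also exposes HiveQL over JDBC. A
hedged Java sketch, assuming a HiveServer2 on localhost:10000 & a hypothetical sales table (the Hive
JDBC driver jar must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveQueryExample {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC endpoint; host, port, database & the "sales"
        // table are illustrative assumptions.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url, "hive", "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT region, COUNT(*) FROM sales GROUP BY region")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
            }
        }
    }
}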

PIG -> scripting (Pig Latin) with user-defined function support for analytics

SQOOP -> for importing/exporting data between DBMS/RDBMS systems & HDFS

FLUME -> for importing streaming data into HDFS

HBASE -> a NoSQL database -> column-based storage -> the database of the Hadoop ecosystem, running
directly on top of HDFS
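
A short sketch of the HBase Java client API, showing the column-based addressing (row key + column
family + qualifier). The users table & its info column family are illustrative assumptions & must
already exist in the cluster:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("users"))) {

            // Write: one cell addressed by row key + column family + qualifier.
            Put put = new Put(Bytes.toBytes("user1"));
            put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("city"), Bytes.toBytes("Pune"));
            table.put(put);

            // Read the cell back by row key.
            Result result = table.get(new Get(Bytes.toBytes("user1")));
            byte[] city = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("city"));
            System.out.println("city = " + Bytes.toString(city));
        }
    }
}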

APACHE OOZIE -> a workflow scheduler used to coordinate & chain all of the above processes

Overview of the Hadoop ecosystem
