
What is HADOOP?

Agenda

Introduction to Distributed Computing

An overview of Super Computing and its challenges

Hadoop as a solution

History of Hadoop

One man = One day

Adding the Numbers

Input file (1 GB)

Sum total = 10000

Time taken = 50 seconds


Parallel Processing – Faster Computing!

Partial sums: 6000 + 4000 = 10000 (sum total)

Time taken = 30 sec
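The slide's idea can be made concrete with a small sketch. The following is a minimal, illustrative Python example (assumed code, not part of the original deck): the input numbers are split into chunks, each chunk is summed by a separate worker process, and the partial sums are combined into the grand total, mirroring the 6000 + 4000 = 10000 picture above.

# Minimal sketch: summing numbers sequentially vs. in parallel.
# The input values and worker count are illustrative assumptions,
# not taken from the original slides.
from multiprocessing import Pool

def partial_sum(chunk):
    # Each worker sums its own chunk, like one machine in a cluster.
    return sum(chunk)

if __name__ == "__main__":
    numbers = list(range(1, 100_001))          # stand-in for the numbers in the 1 GB file

    # Sequential: one worker adds everything (the "50 seconds" case).
    sequential_total = sum(numbers)

    # Parallel: split the data into chunks, sum each chunk in a separate
    # process, then combine the partial sums (the "30 seconds" case).
    chunks = [numbers[i::4] for i in range(4)]  # 4 workers
    with Pool(processes=4) as pool:
        partial_sums = pool.map(partial_sum, chunks)
    parallel_total = sum(partial_sums)

    assert sequential_total == parallel_total
    print(sequential_total, parallel_total)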
SUPER COMPUTER – a cluster of computers

https://en.wikipedia.org/wiki/History_of_supercomputing
Uses of Super Computers

Used in weather forecasting, seismic data analysis, etc.

Used by university research labs for studies related to computational fluid dynamics, bioinformatics, etc.

Used by departments of defense for nuclear research.

Challenges With Super Computing
• A general-purpose, operating-system-like framework for parallel computing needs did not exist
• Companies procuring super computers were locked in to specific vendors for hardware support
• High initial cost of the hardware
• Custom software had to be developed for individual use cases
• High cost of software maintenance and upgrades, which had to be handled in house by the organizations using a super computer
• Not simple to scale horizontally
HADOOP Comes to the Rescue
• A general-purpose, operating-system-like framework for parallel computing needs

• It is free, open-source software, and upgrades are free as well

• Opens up the power of distributed computing to a wider audience

• Mid-sized organizations need not be locked in to specific vendors for hardware support – Hadoop works on commodity hardware

• Organizations no longer face the challenge of having to write their own proprietary software

The BUZZZZ…WORD is
HADOOOPPPP

Late 90s to Early 2000s – the Dot-com Bubble!

This was the age of the INTERNET BOOM

The need of the hour was a scalable web search engine

There were several major players in the internet search engine business back then

Evolution of Hadoop From Internet Search Engines

The need of the hour was a scalable search engine for the growing internet

Internet Archive search director Doug Cutting and University of Washington graduate student Mike Cafarella set out to build a search engine, and the project, named NUTCH, started around 2001–2002

Google's distributed file system paper came out in 2003, and its first MapReduce paper came out in 2004

In 2006 Doug Cutting joined YAHOO and created an open-source framework called HADOOP (named after his son's toy elephant). HADOOP traces its roots back to NUTCH, Google's distributed file system, and the MapReduce processing engine

It went on to become a full-fledged Apache project, and a stable version of Hadoop was in use at Yahoo in 2008
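To connect the history back to the adding-numbers example, here is a minimal, hypothetical sketch of the map-reduce idea in plain Python (an illustration of the model only, not the actual Hadoop API): a map step turns each input record into key/value pairs, and a reduce step combines all values that share a key.

# Minimal sketch of the map-reduce processing model in plain Python.
# Illustrative only; real Hadoop jobs are typically written against its Java API.
from collections import defaultdict

def map_phase(lines):
    # Map step: emit a ("sum", number) pair for every number in the input.
    for line in lines:
        yield ("sum", int(line))

def reduce_phase(pairs):
    # Shuffle/sort: group values by key, then reduce each group with sum().
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

if __name__ == "__main__":
    input_lines = ["6000", "4000"]    # stand-in for records read from the input file
    print(reduce_phase(map_phase(input_lines)))   # {'sum': 10000}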
Summary

Overview of parallel processing and its challenges

Hadoop as a rescue tool for those challenges

What is Hadoop and its evolution?
