Welcome to Scribd!

Mapreduce: Simple Programming For Big Results

Uploaded by

0% found this document useful (0 votes)

9 views28 pages

MapReduce provides a simplified programming model for processing large datasets in parallel across clusters of computers. It involves two steps - Map and Reduce. In Map, a function is applied to all elements to generate intermediate key-value pairs. In Reduce, a summary operation is performed on all intermediate values with the same key. This allows for massively parallel processing without requiring expertise in threads and locks. A common example is WordCount, where words are counted by mapping words to keys and reducing the counts of identical keys. MapReduce abstracts parallelization details and is well-suited for applications with independent data-parallel tasks on large datasets.

Original Description:

Original Title

MapReduce

Copyright

Available Formats

PDF, TXT or read online from Scribd

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Report this Document

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

0% found this document useful (0 votes)

9 views28 pages

Mapreduce: Simple Programming For Big Results

Uploaded by

Kosuru ratnasai

Copyright:

Available Formats

Download as PDF, TXT or read online from Scribd

Flag for inappropriate content

Jump to Page

You are on page 1of 28

Search inside document

MapReduce:

Simple Programming for

Big Results
After this video you will be able to..
• Explain how MapReduce simplifies
creating parallel programs

• Design a WordCount application using the

MapReduce programming model
MapReduce = Programming
Model for Hadoop Ecosystem

Hive Pig
Giraph

Spark
Storm

Flink
MapReduce

HBase

Cassandra

MongoDB
Zookeeper

YARN

HDFS
Parallel Programming = Requires Expertise

Semaphores
Threads Monitors
Message
Shared
Passing
Memory
Locks
MapReduce = Only Map and Reduce!

Semaphores
Threads Monitors
Message
Shared
Passing
Memory
Locks
Based on Functional Programming

Map = apply operation f (x) = y

to all elements

Reduce = summarize
operation on elements
Example MapReduce Application: WordCount

File 1
Result
File 2 WordCount
File

File N
Step 0: File is stored in HDFS
Step 1: Map on each node
My apple is red and my rose is blue....
…

You are the apple of my eye....

…

…
Map generates
My apple is red and my rose is blue.... key-value pairs
…
my, my  (my, 1), (my, 1)
apple  (apple, 1)
is, is  (is, 1), (is, 1)
red  (red, 1)
and  (and, 1)
rose  (rose, 1)
blue  (blue, 1)
Map generates
You are the apple of my eye.... key-value pairs
…
You  (You, 1)
are  (are, 1)
the  (the, 1)
apple  (apple, 1)
of  (of, 1)
my  (my, 1)
eye  (eye, 1)
Step 2: Sort and Shuffle
Pairs with same key
moved to same node
(You, 1) Step 2: Sort and Shuffle
(apple, 1) Pairs with same key
moved to same node
(apple, 1)

(is, 1)
(is, 1)

(rose, 1)
(red, 1)
Step 3: Reduce Add values for same keys
Step 3: Reduce Add values for same keys
(You, 1) (You, 1)
(apple, 1), (apple, 1) (apple, 2)

(my, 1), (my, 1),

(my, 3)
(my, 1)
(red, 1) (red, 1)
(rose, 1) (rose, 1)
Shuffle
Map Reduce
and Sort

Represents a large
number of applications.
Sort and Shuffle (You, http://you1.fake)
(apple, http://apple1.fake)
(apple, http://apple2.fake)

(is, http://apple2.fake)
(is, http://apple2.fake)

(rose, http://apple2.fake)
(red, http://apple2.fake)
Reduce Results for “apple”

(apple -> http://apple1.fake,

http://apple2.fake)
Reduce Results for “apple”

Key Value
(apple -> http://apple1.fake,
http://apple2.fake)

apple
Shuffle
Map Reduce
and Sort
Shuffle
Map Reduce
and Sort

Parallelization
over the input
Shuffle
Map Reduce
and Sort

Parallelization
Parallelization
over the input
data sorting
Shuffle
Map Reduce
and Sort

Parallelization Parallelization
Parallelization over
over the input intermediate data over data groups
MapReduce is bad for:
MapReduce is bad for:

Frequently changing data

MapReduce is bad for:

Frequently changing data

Dependent tasks
MapReduce is bad for:

Frequently changing data

Dependent tasks
Interactive analysis
MapReduce

Simplified parallel Applications with

programming independent data-
parallel tasks

Increase Youtube Subscribers
Document30 pages
Increase Youtube Subscribers
4ktazekahve
No ratings yet
Crystal+Essentials a+Beginners+Guide+to+Crystals FINAL
Document7 pages
Crystal+Essentials a+Beginners+Guide+to+Crystals FINAL
Gretchen
No ratings yet
Ninja® Foodi™ Smart XL 6-In-1 Indoor Grill Quick Start Guide
Document2 pages
Ninja® Foodi™ Smart XL 6-In-1 Indoor Grill Quick Start Guide
hawkjohn
No ratings yet
Financial Analysis and Decision Making Powerpoint
Document28 pages
Financial Analysis and Decision Making Powerpoint
ashvina321
No ratings yet
Finding Dory Ed Guide
Document43 pages
Finding Dory Ed Guide
MirelaCojocaru
No ratings yet
Escrow Agreement
Document3 pages
Escrow Agreement
madelyn sarmienta
No ratings yet
Spark Using Scala - Developer Training 3.0
Document102 pages
Spark Using Scala - Developer Training 3.0
ahmed_sft
No ratings yet
Personal Grooming PDF
Document8 pages
Personal Grooming PDF
ayu
No ratings yet
Hadoop Beginner's Guide
From Everand
Hadoop Beginner's Guide
Garry Turkington
Rating: 4 out of 5 stars
4/5 (7)
Spark Using Scala - Developer Training 1.0
Document106 pages
Spark Using Scala - Developer Training 1.0
ahmed_sft
No ratings yet
Probability Distributions Summary - Exam P
Document1 page
Probability Distributions Summary - Exam P
roy_getty
No ratings yet
Autonomy of Art or Dignity of The Artwork? - Agnes Heller
Document20 pages
Autonomy of Art or Dignity of The Artwork? - Agnes Heller
ProfrFer
100% (1)
8th Sem BE VTU Seminar Format
Document6 pages
8th Sem BE VTU Seminar Format
SourabhAdike
80% (15)
UNIT 2-tt1
Document7 pages
UNIT 2-tt1
avm5439
No ratings yet
What Is MapReduce in Hadoop - Architecture - Example
Document7 pages
What Is MapReduce in Hadoop - Architecture - Example
jppn33
No ratings yet
Map Reduce
Document44 pages
Map Reduce
K Anantha Krishnan
No ratings yet
Chapter 3MapReduce
Document30 pages
Chapter 3MapReduce
Komal
No ratings yet
Assignment 11 DSBDA
Document4 pages
Assignment 11 DSBDA
DARSHAN JADHAV
No ratings yet
BigData Spark Sparklyr
Document80 pages
BigData Spark Sparklyr
michelle
No ratings yet
Lecture 2 - Mapreduce: Cpe 458 - Parallel Programming, Spring 2009
Document26 pages
Lecture 2 - Mapreduce: Cpe 458 - Parallel Programming, Spring 2009
rhshriva
No ratings yet
Hadoop (Mapreduce)
Document43 pages
Hadoop (Mapreduce)
Nisrine Mofakir
No ratings yet
What Is MapReduce in Hadoop
Document5 pages
What Is MapReduce in Hadoop
Rakesh Shaw
No ratings yet
Distributed and Cloud Computing
Document58 pages
Distributed and Cloud Computing
18JE0254 CHIRAG JAIN
No ratings yet
Large Scale Etl With Hadoop
Document76 pages
Large Scale Etl With Hadoop
禹范
No ratings yet
Hadoop Spark
Document34 pages
Hadoop Spark
klogeswaran.it
No ratings yet
Hadoop MapReduce2.0 (Part-I)
Document18 pages
Hadoop MapReduce2.0 (Part-I)
Anonymous Saitama
No ratings yet
MapReduce Flow
Document8 pages
MapReduce Flow
Chilaka Kumar
No ratings yet
Data Science Presentation
Document20 pages
Data Science Presentation
yadvendra dhakad
No ratings yet
Unit 3 MapReduce Part 1
Document12 pages
Unit 3 MapReduce Part 1
Ruparel Education Pvt. Ltd.
No ratings yet
CPS216: Advanced Database Systems (Data-Intensive Computing Systems)
Document15 pages
CPS216: Advanced Database Systems (Data-Intensive Computing Systems)
Abhishek Ranjan
No ratings yet
Intro To Mapreduce
Document15 pages
Intro To Mapreduce
Aruna Samanthapudi
No ratings yet
Big Data Analysis
Document26 pages
Big Data Analysis
Lakshayjeet swami
No ratings yet
Bda Expt7 - 60002190056
Document3 pages
Bda Expt7 - 60002190056
kr
No ratings yet
Map Reduce
Document18 pages
Map Reduce
Soni Nsit
No ratings yet
Chapter 05 Google Re
Document69 pages
Chapter 05 Google Re
William
No ratings yet
2011 Hyunjung Park VLDB Map-Reduce, RAMP, Big Data
Document4 pages
2011 Hyunjung Park VLDB Map-Reduce, RAMP, Big Data
Sunil Thalia
No ratings yet
1c MR YARN Transcript
Document4 pages
1c MR YARN Transcript
NetF feree
No ratings yet
DSBDA Manual Assignment 11
Document6 pages
DSBDA Manual Assignment 11
kartiknikumbh11
No ratings yet
Big Data and Analytics and MapReduce 29052023 054155pm
Document35 pages
Big Data and Analytics and MapReduce 29052023 054155pm
Talha Mughal
No ratings yet
Unit 3 - Big Data Technologies
Document42 pages
Unit 3 - Big Data Technologies
prakash N
No ratings yet
Unit 2 - From Hadoop Streaming PDF
Document20 pages
Unit 2 - From Hadoop Streaming PDF
Gopal Agarwal
No ratings yet
Map Reduce
Document8 pages
Map Reduce
Vishakha Sharma
No ratings yet
A Brief On MapReduce Performance
Document6 pages
A Brief On MapReduce Performance
IJIRAE
No ratings yet
How To Make The Best Use of Live Sessions
Document39 pages
How To Make The Best Use of Live Sessions
venkatraji719568
No ratings yet
Hadoop Karunesh
Document14 pages
Hadoop Karunesh
Mukul Mishra
No ratings yet
DS Tools Lec 03
Document24 pages
DS Tools Lec 03
Youmna Eid
No ratings yet
Big Data Infrastructure: Week 2: Mapreduce Algorithm Design (2/2)
Document55 pages
Big Data Infrastructure: Week 2: Mapreduce Algorithm Design (2/2)
Lindsey Hoffman
No ratings yet
Map Reduce PDF
Document29 pages
Map Reduce PDF
Mahalakshmi G
No ratings yet
Hadoop Ecosystem
Document26 pages
Hadoop Ecosystem
ain
No ratings yet
CS-702 (D) BigData
Document61 pages
CS-702 (D) BigData
garima bh
No ratings yet
CPS216: Advanced Database Systems (Data-Intensive Computing Systems)
Document15 pages
CPS216: Advanced Database Systems (Data-Intensive Computing Systems)
sumit_pmb7608
No ratings yet
Hadoop Interview Questions Author: Pappupass Learning Resource
Document16 pages
Hadoop Interview Questions Author: Pappupass Learning Resource
Dheeraj Reddy
No ratings yet
Example - (Map Function in Word Count)
Document6 pages
Example - (Map Function in Word Count)
Dileep Reddy
No ratings yet
Map Reduced B Seminar
Document17 pages
Map Reduced B Seminar
niharika sunkara
No ratings yet
Word Count Program With MapReduce and Java
Document6 pages
Word Count Program With MapReduce and Java
sheenu george
No ratings yet
12 13 14 Map Reduce
Document57 pages
12 13 14 Map Reduce
Manjula Annamalai
No ratings yet
Big Data
Document43 pages
Big Data
ABHIJEETH kumar
No ratings yet
Hadoop Interview Questions Faq
Document14 pages
Hadoop Interview Questions Faq
mihirhota
No ratings yet
Adobe Scan Dec 05, 2023
Document7 pages
Adobe Scan Dec 05, 2023
dk singh
No ratings yet
5-HDFS Commands and MapReduce
Document22 pages
5-HDFS Commands and MapReduce
rajveer shah
No ratings yet
Hadoop Overview-Tutorial-20081128 PDF
Document31 pages
Hadoop Overview-Tutorial-20081128 PDF
TrurlScribd
No ratings yet
Hadoop: A Seminar Report On
Document28 pages
Hadoop: A Seminar Report On
Roshni Khairnar
No ratings yet
Map-Reduce For Parallel Computing: Amit Jain
Document72 pages
Map-Reduce For Parallel Computing: Amit Jain
Nate Austin
No ratings yet
HADOOP
Document35 pages
HADOOP
Ekapop Verasakulvong
100% (1)
Big Data Management
Document38 pages
Big Data Management
sehun twin
No ratings yet
Map-Reduce Is A Programming Model That Is Mainly Divided Into Two
Document2 pages
Map-Reduce Is A Programming Model That Is Mainly Divided Into Two
Fahim Muntasir
No ratings yet
Parlab Parallel Boot Camp: Cloud Computing With Mapreduce and Hadoop
Document55 pages
Parlab Parallel Boot Camp: Cloud Computing With Mapreduce and Hadoop
Yogesh Bansal
No ratings yet
MapReduce Tutorial
Document32 pages
MapReduce Tutorial
ShivanshuSingh
No ratings yet
Hadoop: A Report Writing On
Document13 pages
Hadoop: A Report Writing On
dilip kodmour
No ratings yet
Hadoop: Data Processing and Modelling
From Everand
Hadoop: Data Processing and Modelling
Garry Turkington
No ratings yet
Module 4 Introduction To Spreadsheets - Models v2
Document12 pages
Module 4 Introduction To Spreadsheets - Models v2
Kosuru ratnasai
No ratings yet
The Moving Average Models MA (1) and MA (2) : Al Nosedal University of Toronto
Document47 pages
The Moving Average Models MA (1) and MA (2) : Al Nosedal University of Toronto
Kosuru ratnasai
No ratings yet
Practical Mvo
Document30 pages
Practical Mvo
Kosuru ratnasai
No ratings yet
Why Is Big Data Processing Different?
Document23 pages
Why Is Big Data Processing Different?
Kosuru ratnasai
No ratings yet
When To Reconsider Hadoop ?
Document11 pages
When To Reconsider Hadoop ?
Kosuru ratnasai
No ratings yet
Getting Started: Why Hadoop?
Document13 pages
Getting Started: Why Hadoop?
Kosuru ratnasai
No ratings yet
MachineGeneratedData Part2 Altintas Final2 PDF
Document15 pages
MachineGeneratedData Part2 Altintas Final2 PDF
Kosuru ratnasai
No ratings yet
Cloud Computing: An Important Big Data Enabler
Document23 pages
Cloud Computing: An Important Big Data Enabler
Kosuru ratnasai
No ratings yet
The Hadoop Ecosystem: So Much Free Stuff!
Document21 pages
The Hadoop Ecosystem: So Much Free Stuff!
Kosuru ratnasai
No ratings yet
The Relational Data Model
Document10 pages
The Relational Data Model
Kosuru ratnasai
No ratings yet
Cat-Eht Brake New PDF
Document4 pages
Cat-Eht Brake New PDF
Lingaraj Suresh Lingaian
No ratings yet
Progress Audio Script 2
Document1 page
Progress Audio Script 2
gronigan
No ratings yet
GP-100 Software User Guide This Software Only Supports Windows System
Document1 page
GP-100 Software User Guide This Software Only Supports Windows System
John Henry
No ratings yet
Chapter One
Document36 pages
Chapter One
Jeremiah Alhassan
100% (1)
Bioshock Audio Diaries
Document6 pages
Bioshock Audio Diaries
George Curioso
No ratings yet
2.2 Crime Scene Investigation and
Document1 page
2.2 Crime Scene Investigation and
Fatima Sarpina Hinay
No ratings yet
Piyush Mulkar: Cleveland, OH
Document6 pages
Piyush Mulkar: Cleveland, OH
Radha Ramineni
No ratings yet
Singtel Annual Report 2023 Higher Res
Document275 pages
Singtel Annual Report 2023 Higher Res
JL Chua
No ratings yet
Development Letter
Document3 pages
Development Letter
tan balan
No ratings yet
BMI Middle East and Africa Oil and Gas Insight March 2016
Document10 pages
BMI Middle East and Africa Oil and Gas Insight March 2016
Anonymous pPpTQrpvbr
No ratings yet
42 Different Types of Wrenches Explained in Detail Notes PDF
Document24 pages
42 Different Types of Wrenches Explained in Detail Notes PDF
Muhammad Zain
No ratings yet
Latihan Soal
Document7 pages
Latihan Soal
julia
No ratings yet
Hazyl Delerosario@ctu Edu PH
Document4 pages
Hazyl Delerosario@ctu Edu PH
Dulce Amor del Rosario
No ratings yet
Inside Job PDF
Document87 pages
Inside Job PDF
Ariel Marascalco
No ratings yet
A Guide To Facts and Fiction About Climate Change
Document19 pages
A Guide To Facts and Fiction About Climate Change
ANDI Agencia de Noticias do Direito da Infancia
100% (1)
Help Utf8
Document8 pages
Help Utf8
Jorge Diaz Lastra
No ratings yet
EMPOWERMENT TECHNOLOGIES QUARTER 1 MODULE 3 Final 2 PDF
Document13 pages
EMPOWERMENT TECHNOLOGIES QUARTER 1 MODULE 3 Final 2 PDF
Jason Gulla
No ratings yet
Multi Car Parking
Document33 pages
Multi Car Parking
Tanvi Khurana
No ratings yet
Principles of Health Education
Document3 pages
Principles of Health Education
Junah Dayaganon
No ratings yet
Recognia Intraday Trader Infosheet
Document2 pages
Recognia Intraday Trader Infosheet
Dipesh Pawaiya
No ratings yet