
MapReduce

Instructor: Mohamed MANAA
Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling



Introduction to MapReduce
 Driving principles
– Data is stored across the entire cluster
– Programs are brought to the data, not the data to the program

 Data is stored across the entire cluster (the DFS)


– The entire cluster participates in the file system
– Blocks of a single file are distributed across the cluster
– A given block is typically replicated as well for resiliency

[Figure: a logical file is split into blocks (1-4); the blocks, with replicas, are distributed across the nodes of the cluster.]
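To make the block layout concrete, it can be inspected through Hadoop's FileSystem API. Below is a minimal sketch (the class name ShowBlocks is an illustrative assumption) that prints, for each block of a file, its offset, its length, and the hosts holding its replicas:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlocks {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus status = fs.getFileStatus(new Path(args[0]));
    // One BlockLocation per block of the file, with the hosts holding replicas
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset=" + block.getOffset()
          + " length=" + block.getLength()
          + " hosts=" + java.util.Arrays.toString(block.getHosts()));
    }
  }
}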

MapReduce Explained
 Hadoop computation model
– Data stored in a distributed file system spanning many inexpensive computers
– Bring function to the data
– Distribute application to the compute resources where the data is stored
 Scalable to thousands of nodes and petabytes of data

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public static class TokenizerMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(val.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);   // emit <word, 1> for every token
    }
  }
}

public static class IntSumReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {

  private IntWritable result = new IntWritable();

  public void reduce(Text key, Iterable<IntWritable> val, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : val) {
      sum += v.get();             // add up all the 1s for this word
    }
    result.set(sum);
    context.write(key, result);
  }
}

[Figure: the MapReduce application distributes map tasks to the Hadoop data nodes. 1. Map phase (break the job into small parts); 2. Shuffle (transfer interim output for final processing); 3. Reduce phase (boil all output down to a single result set); a single result set is returned to the client.]
MapReduce Engine
 Master / Slave architecture
– Single master (JobTracker) controls job execution on multiple slaves
(TaskTrackers).
 JobTracker
– Accepts MapReduce jobs submitted by clients
– Pushes map and reduce tasks out to TaskTracker nodes
– Keeps the work as physically close to the data as possible
– Monitors tasks and TaskTracker status
 TaskTracker
– Runs map and reduce tasks
– Reports status to JobTracker
– Manages storage and transmission of intermediate output

[Figure: a five-computer cluster; the JobTracker runs on Computer 1 and a TaskTracker runs on each of Computers 2-5.]
The MapReduce Programming Model

 "Map" step:
– Input split into pieces
– Worker nodes process individual pieces in parallel (under
global control of the JobTracker node)
– Each worker node stores its result in its local file system
where a reducer is able to access it

 "Reduce" step:
– Data is aggregated ("reduced" from the map steps) by
worker nodes (under control of the JobTracker)
– Multiple reduce tasks can parallelize the aggregation

MapReduce Overview

[Figure: data stored in blocks in the distributed file system (HDFS) flows through the Map, Shuffle, and Reduce stages; results can be written to HDFS or a database.]

Agenda

 MapReduce Introduction
 MapReduce Tasks
– Map
– Shuffle
– Reduce
– Combiner
 WordCount Example
 Splits
 Execution
 Scheduling



MapReduce – Map Phase
 Mappers
– Small program (typically), distributed across the cluster, local to data
– Handed a portion of the input data (called a split)
– Each mapper parses, filters, or transforms its input
– Produces grouped <key,value> pairs
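As an example of a mapper that filters rather than tokenizes, the following sketch (the class name and the "ERROR" marker are illustrative assumptions; imports as in the WordCount example) keeps only log lines containing the marker and drops everything else:

public static class ErrorFilterMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);

  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    // Filter: only lines containing "ERROR" are passed on as <line, 1> pairs
    if (val.toString().contains("ERROR")) {
      context.write(val, one);
    }
  }
}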

Map Phase

[Figure: Map phase highlighted. Each split (1-4) of the logical input file is handed to a map task; map output is sorted, copied to the reducers, merged, and reduced into logical output files written to the DFS.]

MapReduce – The Shuffle
 The output of each mapper is locally grouped together by key
 One node is chosen to process data for each unique key
 All of this movement (shuffle) of data is transparently orchestrated by
MapReduce
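Which node handles a given key is decided by the job's partitioner. A minimal sketch mirroring Hadoop's default hash partitioning (the class name WordPartitioner is an illustrative assumption):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public static class WordPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    // The same key always hashes to the same reducer, so all of its
    // values end up on one node after the shuffle
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}

It would be registered on the job with job.setPartitionerClass(WordPartitioner.class).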

Shuffle

[Figure: Shuffle highlighted. The same pipeline as above, with the sorted map outputs being copied and merged across nodes so that all values for a given key arrive at the same reducer.]

MapReduce – Reduce Phase
 Reducers
– Small programs (typically) that aggregate all of the values for the key that
they are responsible for
– Each reducer writes output to its own file
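The number of reducers (and therefore the number of output files) is set on the job; with the new API each reducer writes a part-r-NNNNN file in the job's output directory. For example:

job.setNumReduceTasks(2);   // output: part-r-00000 and part-r-00001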

Reduce Phase

[Figure: Reduce phase highlighted. The merged, key-grouped data is processed by the reduce tasks, and each reducer writes its own logical output file to the DFS.]

Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling



How does Hadoop run MapReduce jobs?
[Figure: anatomy of a job run, numbered steps:]
1. The MapReduce program runs the job through a JobClient (client JVM, on the client node).
2. The JobClient gets a new job ID from the JobTracker.
3. The JobClient copies the job resources to the distributed file system (e.g. HDFS).
4. The JobClient submits the job to the JobTracker (on the jobtracker node).
5. The JobTracker initializes the job.
6. The JobTracker retrieves the input splits from the distributed file system.
7. Each TaskTracker sends a periodic heartbeat to the JobTracker, which returns a task.
8. The TaskTracker retrieves the job resources from the distributed file system.
9. The TaskTracker launches a child JVM.
10. The child JVM runs the task (a MapTask or a ReduceTask) on the tasktracker node.

 Client: submits MapReduce jobs
 JobTracker: coordinates the job run; breaks the job down into map and reduce tasks for the nodes of the cluster to work on
 TaskTracker: executes the map and reduce functions

Fault tolerance
[Figure: three failure points in the MapReduce engine. (1) A task fails inside the child JVM (MapTask or ReduceTask). (2) A TaskTracker fails, which the JobTracker detects through missing heartbeats. (3) The JobTracker itself fails, on the JobTracker node.]

Fault tolerance
• Task Failure
• If a child task fails, the child JVM reports the failure to the TaskTracker before it
exits. The attempt is marked as failed, freeing up the slot for another task.
• If the child task hangs, it is killed. The JobTracker reschedules the task
on another machine.
• If the task continues to fail (four attempts by default), the whole job is failed.

Fault tolerance
• TaskTracker Failure
• The JobTracker receives no heartbeat
• It removes the TaskTracker from the pool of TaskTrackers to schedule tasks
on, and reschedules that TaskTracker's in-progress tasks elsewhere.

Fault tolerance

• JobTracker Failure
• The JobTracker is a single point of failure: if it fails, the job fails.

Configuration
 mapred-site.xml
– Configuration settings for MapReduce (property names prefixed with mapred.)
– These settings are global; some of them can be changed per map task

Parameter                               Description
jobtracker.taskScheduler                Scheduler used by the JobTracker. In BigInsights changed to
                                        com.ibm.biginsights.scheduler.WorkflowScheduler
tasktracker.map.tasks.maximum           Maximum number of map/reduce tasks per TaskTracker.
tasktracker.reduce.tasks.maximum        Set according to the memory and CPUs in the system.
child.java.opts                         Maximum memory of the JVM for each task
map.tasks.speculative.execution         Start redundant (speculative) tasks
reduce.tasks.speculative.execution
io.sort.mb                              Parameters for merging map output in the reducer
io.sort.factor                          (memory cache size and number of files to merge at a time)

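As a concrete illustration, a mapred-site.xml fragment setting two of these parameters could look like the following (the values are arbitrary examples, not recommendations; note the full mapred.-prefixed property names):

<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>   <!-- example: tune to the node's CPUs and memory -->
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>   <!-- example heap limit for each task JVM -->
  </property>
</configuration>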
Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling
– FIFO scheduler (with priorities)
– Fair scheduler



Scheduling – FIFO scheduler (with priorities)
• Each job uses the whole cluster, so jobs wait their turn.
• Can set priorities for the jobs in the queue (five priority levels: very low, low, normal, high, very high)
• Preemption is not supported

[Figure: jobs J1-J7 waiting in the queue under five priority levels, from Very Low to Very High.]

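A job's priority can be set through its configuration before submission; a minimal sketch using the classic Hadoop 1 property mapred.job.priority:

Configuration conf = new Configuration();
conf.set("mapred.job.priority", "HIGH");   // VERY_LOW, LOW, NORMAL, HIGH, VERY_HIGH
Job job = new Job(conf, "high-priority job");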
Scheduling – Fair scheduler
• Jobs are assigned to pools (1 pool per user by default)
• A user's job pool is a set of task slots assigned to that user's tasks
• Each pool gets the same number of task slots by default

[Figure: a job pool for user Mary, a set of task slots that her jobs' tasks are scheduled into as she submits many jobs.]

Scheduling – Fair scheduler
• Multiple users can run jobs on the cluster at the same time
• Example:
• Mary, John, Peter submit jobs that demand 80, 80, and 120 tasks
respectively
• Say the cluster can allocate at most 60 tasks
• Default behavior: distribute tasks fairly among the 3 users (each gets 20)

Maximum number of tasks that can be allocated in this cluster: 60

Mary: 20 tasks (demand: 80)    John: 20 tasks (demand: 80)    Peter: 20 tasks (demand: 120)
Scheduling – Fair scheduler
• Minimum share can be set for a pool
• In the previous example, say Mary has a minimum share of 40
• Mary would be allocated 40 tasks; the rest is distributed evenly among the other
pools

Maximum number of tasks that can be allocated in this cluster: 60

Mary: 40 tasks (demand: 80, minimum share: 40)    John: 10 tasks (demand: 80)    Peter: 10 tasks (demand: 120)

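With the classic Hadoop 1 fair scheduler, a pool's minimum share is declared in its allocations file. A sketch matching the example above (the pool name and the even split of the 40 slots between map and reduce slots are illustrative assumptions):

<allocations>
  <pool name="mary">
    <minMaps>20</minMaps>        <!-- assumed split of Mary's minimum share of 40 -->
    <minReduces>20</minReduces>  <!-- across map and reduce task slots -->
  </pool>
</allocations>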
Questions?
