
MapReduce

Instructor: Mohamed MANAA
Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling



Introduction to MapReduce
 Driving principles
– Data is stored across the entire cluster
– Programs are brought to the data, not the data to the program

 Data is stored across the entire cluster (the DFS)


– The entire cluster participates in the file system
– Blocks of a single file are distributed across the cluster
– A given block is typically replicated as well for resiliency

[Figure: a logical file is split into blocks (1-4); the blocks, with replicas, are distributed across the nodes of the cluster.]
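To make the block layout concrete, it can be inspected through Hadoop's FileSystem API. Below is a minimal sketch (the class name ShowBlocks is an illustrative assumption) that prints, for each block of a file, its offset, its length, and the hosts holding its replicas:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowBlocks {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FileStatus status = fs.getFileStatus(new Path(args[0]));
    // One BlockLocation per block of the file, with the hosts holding replicas
    BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      System.out.println("offset=" + block.getOffset()
          + " length=" + block.getLength()
          + " hosts=" + java.util.Arrays.toString(block.getHosts()));
    }
  }
}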

MapReduce Explained
 Hadoop computation model
– Data stored in a distributed file system spanning many inexpensive computers
– Bring function to the data
– Distribute application to the compute resources where the data is stored
 Scalable to thousands of nodes and petabytes of data

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public static class TokenizerMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    StringTokenizer itr = new StringTokenizer(val.toString());
    while (itr.hasMoreTokens()) {
      word.set(itr.nextToken());
      context.write(word, one);   // emit <word, 1> for every token
    }
  }
}

public static class IntSumReducer
    extends Reducer<Text, IntWritable, Text, IntWritable> {

  private IntWritable result = new IntWritable();

  public void reduce(Text key, Iterable<IntWritable> val, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : val) {
      sum += v.get();             // add up all the 1s for this word
    }
    result.set(sum);
    context.write(key, result);
  }
}

[Figure: the MapReduce application distributes map tasks to the Hadoop data nodes. 1. Map phase (break the job into small parts); 2. Shuffle (transfer interim output for final processing); 3. Reduce phase (boil all output down to a single result set); a single result set is returned to the client.]
MapReduce Engine
 Master / Slave architecture
– Single master (JobTracker) controls job execution on multiple slaves
(TaskTrackers).
 JobTracker
– Accepts MapReduce jobs submitted by clients
– Pushes map and reduce tasks out to TaskTracker nodes
– Keeps the work as physically close to the data as possible
– Monitors tasks and TaskTracker status
 TaskTracker
– Runs map and reduce tasks
– Reports status to JobTracker
– Manages storage and transmission of intermediate output

[Figure: a five-computer cluster; the JobTracker runs on Computer 1 and a TaskTracker runs on each of Computers 2-5.]
The MapReduce Programming Model

 "Map" step:
– Input split into pieces
– Worker nodes process individual pieces in parallel (under
global control of the JobTracker node)
– Each worker node stores its result in its local file system
where a reducer is able to access it

 "Reduce" step:
– Data is aggregated ("reduced" from the map steps) by
worker nodes (under control of the JobTracker)
– Multiple reduce tasks can parallelize the aggregation

MapReduce Overview

[Figure: data stored in blocks in the distributed file system (HDFS) flows through the Map, Shuffle, and Reduce stages; results can be written to HDFS or a database.]

Agenda

 MapReduce Introduction
 MapReduce Tasks
– Map
– Shuffle
– Reduce
– Combiner
 WordCount Example
 Splits
 Execution
 Scheduling



MapReduce – Map Phase
 Mappers
– Small program (typically), distributed across the cluster, local to data
– Handed a portion of the input data (called a split)
– Each mapper parses, filters, or transforms its input
– Produces grouped <key,value> pairs
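As an example of a mapper that filters rather than tokenizes, the following sketch (the class name and the "ERROR" marker are illustrative assumptions; imports as in the WordCount example) keeps only log lines containing the marker and drops everything else:

public static class ErrorFilterMapper
    extends Mapper<Object, Text, Text, IntWritable> {

  private final static IntWritable one = new IntWritable(1);

  public void map(Object key, Text val, Context context)
      throws IOException, InterruptedException {
    // Filter: only lines containing "ERROR" are passed on as <line, 1> pairs
    if (val.toString().contains("ERROR")) {
      context.write(val, one);
    }
  }
}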

Map Phase

[Figure: Map phase highlighted. Each split (1-4) of the logical input file is handed to a map task; map output is sorted, copied to the reducers, merged, and reduced into logical output files written to the DFS.]

MapReduce – The Shuffle
 The output of each mapper is locally grouped together by key
 One node is chosen to process data for each unique key
 All of this movement (shuffle) of data is transparently orchestrated by
MapReduce
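Which node handles a given key is decided by the job's partitioner. A minimal sketch mirroring Hadoop's default hash partitioning (the class name WordPartitioner is an illustrative assumption):

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public static class WordPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    // The same key always hashes to the same reducer, so all of its
    // values end up on one node after the shuffle
    return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
  }
}

It would be registered on the job with job.setPartitionerClass(WordPartitioner.class).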

Shuffle

[Figure: Shuffle highlighted. The same pipeline as above, with the sorted map outputs being copied and merged across nodes so that all values for a given key arrive at the same reducer.]

MapReduce – Reduce Phase
 Reducers
– Small programs (typically) that aggregate all of the values for the key that
they are responsible for
– Each reducer writes output to its own file
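The number of reducers (and therefore the number of output files) is set on the job; with the new API each reducer writes a part-r-NNNNN file in the job's output directory. For example:

job.setNumReduceTasks(2);   // output: part-r-00000 and part-r-00001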

Reduce Phase

[Figure: Reduce phase highlighted. The merged, key-grouped data is processed by the reduce tasks, and each reducer writes its own logical output file to the DFS.]

Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling



How does Hadoop run MapReduce jobs?
[Figure: anatomy of a job run, numbered steps:]
1. The MapReduce program runs the job through a JobClient (client JVM, on the client node).
2. The JobClient gets a new job ID from the JobTracker.
3. The JobClient copies the job resources to the distributed file system (e.g. HDFS).
4. The JobClient submits the job to the JobTracker (on the jobtracker node).
5. The JobTracker initializes the job.
6. The JobTracker retrieves the input splits from the distributed file system.
7. Each TaskTracker sends a periodic heartbeat to the JobTracker, which returns a task.
8. The TaskTracker retrieves the job resources from the distributed file system.
9. The TaskTracker launches a child JVM.
10. The child JVM runs the task (a MapTask or a ReduceTask) on the tasktracker node.

 Client: submits MapReduce jobs
 JobTracker: coordinates the job run; breaks the job down into map and reduce tasks for the nodes of the cluster to work on
 TaskTracker: executes the map and reduce functions

Fault tolerance
[Figure: three failure points in the MapReduce engine. (1) A task fails inside the child JVM (MapTask or ReduceTask). (2) A TaskTracker fails, which the JobTracker detects through missing heartbeats. (3) The JobTracker itself fails, on the JobTracker node.]

Fault tolerance
• Task Failure
• If a child task fails, the child JVM reports the failure to the TaskTracker before it
exits. The attempt is marked as failed, freeing up the slot for another task.
• If the child task hangs, it is killed. The JobTracker reschedules the task
on another machine.
• If the task continues to fail (four attempts by default), the whole job is failed.

Fault tolerance
• TaskTracker Failure
• The JobTracker receives no heartbeat
• It removes the TaskTracker from the pool of TaskTrackers to schedule tasks
on, and reschedules that TaskTracker's in-progress tasks elsewhere.

Fault tolerance

• JobTracker Failure
• The JobTracker is a single point of failure: if it fails, the job fails.

Configuration
 mapred-site.xml
– Configuration settings for MapReduce (property names prefixed with mapred.)
– These settings are global; some of them can be changed per map task

Parameter                               Description
jobtracker.taskScheduler                Scheduler used by the JobTracker. In BigInsights changed to
                                        com.ibm.biginsights.scheduler.WorkflowScheduler
tasktracker.map.tasks.maximum           Maximum number of map/reduce tasks per TaskTracker.
tasktracker.reduce.tasks.maximum        Set according to the memory and CPUs in the system.
child.java.opts                         Maximum memory of the JVM for each task
map.tasks.speculative.execution         Start redundant (speculative) tasks
reduce.tasks.speculative.execution
io.sort.mb                              Parameters for merging map output in the reducer
io.sort.factor                          (memory cache size and number of files to merge at a time)

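As a concrete illustration, a mapred-site.xml fragment setting two of these parameters could look like the following (the values are arbitrary examples, not recommendations; note the full mapred.-prefixed property names):

<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>   <!-- example: tune to the node's CPUs and memory -->
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m</value>   <!-- example heap limit for each task JVM -->
  </property>
</configuration>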
Agenda

 MapReduce Introduction
 MapReduce Tasks
 WordCount Example
 Splits
 Execution
 Scheduling
– FIFO scheduler (with priorities)
– Fair scheduler



Scheduling – FIFO scheduler (with priorities)
• Each job uses the whole cluster, so jobs wait their turn.
• Can set priorities for the jobs in the queue (five priority levels: very low, low, normal, high, very high)
• Preemption is not supported

[Figure: jobs J1-J7 waiting in the queue under five priority levels, from Very Low to Very High.]

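A job's priority can be set through its configuration before submission; a minimal sketch using the classic Hadoop 1 property mapred.job.priority:

Configuration conf = new Configuration();
conf.set("mapred.job.priority", "HIGH");   // VERY_LOW, LOW, NORMAL, HIGH, VERY_HIGH
Job job = new Job(conf, "high-priority job");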
Scheduling – Fair scheduler
• Jobs are assigned to pools (1 pool per user by default)
• A user's job pool is a set of task slots assigned to that user's tasks
• Each pool gets the same number of task slots by default

[Figure: a job pool for user Mary, a set of task slots that her jobs' tasks are scheduled into as she submits many jobs.]

Scheduling – Fair scheduler
• Multiple users can run jobs on the cluster at the same time
• Example:
• Mary, John, Peter submit jobs that demand 80, 80, and 120 tasks
respectively
• Say the cluster can allocate at most 60 tasks
• Default behavior: distribute tasks fairly among the 3 users (each gets 20)

Maximum number of tasks that can be allocated in this cluster: 60

Mary: 20 tasks (demand: 80)    John: 20 tasks (demand: 80)    Peter: 20 tasks (demand: 120)
Scheduling – Fair scheduler
• Minimum share can be set for a pool
• In the previous example, say Mary has a minimum share of 40
• Mary would be allocated 40 tasks; the rest is distributed evenly among the other
pools

Maximum number of tasks that can be allocated in this cluster: 60

Mary: 40 tasks (demand: 80, minimum share: 40)    John: 10 tasks (demand: 80)    Peter: 10 tasks (demand: 120)

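With the classic Hadoop 1 fair scheduler, a pool's minimum share is declared in its allocations file. A sketch matching the example above (the pool name and the even split of the 40 slots between map and reduce slots are illustrative assumptions):

<allocations>
  <pool name="mary">
    <minMaps>20</minMaps>        <!-- assumed split of Mary's minimum share of 40 -->
    <minReduces>20</minReduces>  <!-- across map and reduce task slots -->
  </pool>
</allocations>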
Questions?
