Microsoft PowerPoint - 6 - MapReduceAlgorithmDesign
MapReduce Algorithm Design
Contents
• Combiner and in‐mapper combining
• Complex keys and values
• Secondary Sorting
Combiner and in‐mapper combining
• Purpose
Carry out local aggregation before shuffle and sort phase
Reduce the communication volume between map and reduce stages
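As a rough illustration of the saving, the sketch below counts how many intermediate pairs one map task would hand to the shuffle with and without local aggregation. The input split and the function names (`map_words`, `aggregate_locally`) are hypothetical and not taken from the slides.

```python
from collections import Counter

def map_words(lines):
    """Emit one (word, 1) pair per token, as a plain word-count mapper would."""
    return [(word, 1) for line in lines for word in line.split()]

def aggregate_locally(pairs):
    """Sum the values per key on the map side, before anything is shuffled."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

# Hypothetical input split handled by a single map task.
split = ["a b a", "b a c"]

raw_pairs = map_words(split)
local_pairs = aggregate_locally(raw_pairs)

print(len(raw_pairs), "pairs shuffled without local aggregation")
print(len(local_pairs), "pairs shuffled with local aggregation")
```

The more often a key repeats within one map task's output, the larger the reduction.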
Combiner example (word count)
• Use the reducer as the combiner
Integer addition is both associative and commutative
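The word count example can be sketched as a small in-memory simulation in which the reducer function doubles as the combiner. This is not Hadoop code: `run_job`, `group_by_key`, and the sample splits are assumptions made for illustration only.

```python
from collections import defaultdict

def mapper(_, line):
    """Emit (word, 1) for every word in the line; the input key is ignored."""
    for word in line.split():
        yield word, 1

def reducer(word, counts):
    """Sum all counts received for one word."""
    yield word, sum(counts)

# Reusing the reducer is valid here because integer addition is
# associative and commutative, so partial sums can be summed again.
combiner = reducer

def group_by_key(pairs):
    """Collect the values of each key into a list (a stand-in for shuffle and sort)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def run_job(splits, combiner=None):
    """Simulate one job; each element of `splits` plays the role of one map task."""
    shuffled = []
    for split in splits:
        pairs = [kv for offset, line in enumerate(split) for kv in mapper(offset, line)]
        if combiner is not None:
            # Local aggregation over this map task's output only.
            pairs = [kv for k, vs in group_by_key(pairs).items() for kv in combiner(k, vs)]
        shuffled.extend(pairs)
    result = {}
    for key, values in group_by_key(shuffled).items():
        for out_key, total in reducer(key, values):
            result[out_key] = total
    return result

# Hypothetical input: two map tasks.
splits = [["the cat sat", "the cat"], ["the dog"]]
with_combiner = run_job(splits, combiner=combiner)
without_combiner = run_job(splits)
print(with_combiner)
```

Both runs produce the same counts; only the number of pairs crossing the simulated shuffle differs.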
MapReduce with Combiner
• Function signatures with a combiner
map: (k1, v1) -> [(k2, v2)]
combine: (k2, [v2]) -> [(k2, v2)]
reduce: (k2, [v2]) -> [(k3, v3)]
• The combiner's input and output key-value types must both match the
mapper's output key-value types
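One way to read the type constraint is through type aliases. The sketch below writes the map and combine signatures with Python's `typing` module and then checks, for a hypothetical word-count mapper and combiner, that both emit the same pair types. The alias and function names are illustrative assumptions, not part of any MapReduce API.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

K1 = TypeVar("K1")
V1 = TypeVar("V1")
K2 = TypeVar("K2")
V2 = TypeVar("V2")

# map:     (k1, v1)   -> [(k2, v2)]
Mapper = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]
# combine: (k2, [v2]) -> [(k2, v2)]   (same k2 and v2 as the mapper output)
Combiner = Callable[[K2, List[V2]], Iterable[Tuple[K2, V2]]]

def word_mapper(offset: int, line: str) -> Iterable[Tuple[str, int]]:
    """Here k1 = int, v1 = str, k2 = str, v2 = int."""
    return [(word, 1) for word in line.split()]

def sum_combiner(word: str, counts: List[int]) -> Iterable[Tuple[str, int]]:
    """Consumes (str, [int]) and emits (str, int), matching the mapper output."""
    return [(word, sum(counts))]

mapped = list(word_mapper(0, "to be or not to be"))
combined = list(sum_combiner("to", [1, 1]))

# Collect the runtime (key type, value type) pairs emitted by each stage.
mapped_types = {(type(k), type(v)) for k, v in mapped}
combined_types = {(type(k), type(v)) for k, v in combined}
print(mapped_types == combined_types)
```

Because the framework may skip the combiner, the reducer has to accept the mapper's output directly, which is why the combiner cannot change the pair types.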
Combiner is an optimization, not a requirement
• The combiner is optional
A particular implementation of the MapReduce framework may choose to
execute the combine method many times or not at all
Calling the combine method zero, one, or many times should produce the
same output from the reducer
• The correctness of a MapReduce program should not rely on the
assumption that the combiner is always executed
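A quick way to check this property for a given combiner is to run the same job with a varying number of combine passes and compare the reducer output. The sketch below does this for summation; `run`, `combine`, and the sample map outputs are hypothetical names and data chosen for illustration.

```python
from collections import defaultdict

def group(pairs):
    """Collect the values of each key into a list."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def combine(pairs):
    """One combiner pass: replace the values of each key by their sum."""
    return [(key, sum(values)) for key, values in group(pairs).items()]

def reduce_all(pairs):
    """Final reduce step: sum whatever values arrive for each key."""
    return {key: sum(values) for key, values in group(pairs).items()}

def run(map_outputs, combine_passes):
    """Apply the combiner `combine_passes` times to each map task's output, then reduce."""
    shuffled = []
    for pairs in map_outputs:
        for _ in range(combine_passes):
            pairs = combine(pairs)
        shuffled.extend(pairs)
    return reduce_all(shuffled)

# Hypothetical outputs of two map tasks.
map_outputs = [[("x", 1), ("y", 1), ("x", 1)], [("x", 1), ("y", 1)]]

# Zero, one, and several combiner passes.
results = [run(map_outputs, n) for n in (0, 1, 3)]
print(results)
```

All three runs agree because summing partial sums gives the same total as summing the original values; a combiner for which this comparison fails would make the job's result depend on how often the framework chooses to call it.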