MapReduce Programming Model and Design Patterns

Andrea Lottarini January 17, 2012

Contents

1 Introduction
  1.1 Functional Programming Heritage

2 MapReduce Programming Model
  2.1 The Word Count Example
  2.2 Advanced Features
    2.2.1 Combiner
    2.2.2 Partitioner

3 Design Patterns
  3.1 Matrix Vector multiplication
    3.1.1 Secondary Sort
    3.1.2 Generic Objects and Sequence Files
    3.1.3 In Memory Multiplication
  3.2 Relational Algebra
    3.2.1 Selection
    3.2.2 Projection
    3.2.3 Union
    3.2.4 Intersection
    3.2.5 Difference
    3.2.6 Natural Join
    3.2.7 Group by and aggregation functions

4 Conclusions

1 Introduction

This document is based on Dr. Nicola Tonellotto's lectures for the Complements of Distributed Enabling Platforms course. The objective of these notes is to introduce the MapReduce programming model and its most used design patterns. MapReduce serves mainly to process large data sets in a massively parallel manner and is based on the following fundamental concepts [4]:

1. Input elements are accessed in a sequential fashion.
2. Each piece of input is treated and processed as a key/value pair.
3. Intermediate values are grouped using the key.
4. Each group is reduced using a specific function.

The programming model is very simple (Figure 1). Apache Hadoop [6][9][8] is the de facto standard implementation of the MapReduce programming model. Given a large, distributed network of commodity computers, the Hadoop framework handles most of the non-functional problems, such as the distribution of input data and tasks and fault tolerance; this is in fact one of the biggest strengths of the MapReduce model and of the Hadoop implementation. We will not consider how the framework implements the model; instead we will analyze only the programming model and present several examples implemented in Hadoop.

[Figure 1 shows the simple programming model: input splits I1-I5 are processed by map tasks, the intermediate values are aggregated by key, and the reduce tasks produce the outputs O1-O3.]

Figure 1: Schema of a MapReduce computation. Notice how the input and output data are read and written in a massively parallel fashion.


1.1 Functional Programming Heritage

The MapReduce model is strongly influenced, if not derived, by the map/fold primitives which can be found in many functional languages. We will now examine an example of a map and fold computation in Haskell. The map function multiplies every number in a list by 2:

map ((*) 2) [1,2,3]
[2,4,6]

Here, ((*) 2) denotes multiplication by 2; expressed in lambda notation this is equivalent to λx. (2 * x). Then, the Reduce (fold) function sums all the numbers in the list:

foldl (+) 0 [2,4,6]
12

Here, the expression (+) denotes addition and 0 is the initial value of the sum. foldl is left associative, so the operation performed is (((0 + 2) + 4) + 6) = 12. In the next section we are going to show how the actual model of MapReduce is an extension of this scheme.

2 MapReduce Programming Model

Conceptually, MapReduce programs create an output list of elements from an input list of elements (Figure 1); this is done by using the functions Map and Reduce in a way similar to Section 1.1. The computation is divided in three steps:

1. Map: In this phase, the input files are divided in chunks called InputSplits (by default every file is divided in chunks of 64MB, since this is the default block size of the Hadoop distributed file system; if the file is smaller, padding is added). Every InputSplit is then assigned to a worker using Rack Awareness, i.e. data which is stored locally on a given machine is usually processed on the same node in order to avoid communication. This mechanism is implemented by the framework without the programmer's intervention. Every element of an InputSplit is processed by the assigned mapper (a new instance of Mapper is instantiated in a separate Java process for each map task, i.e. for each InputSplit that makes up part of the total job input) and, in general, a new element is emitted.

2. Shuffle & Sort: This phase is entirely handled by the framework; the output of every mapper is sorted by means of the key.

3. Reduce: Finally, all the runs of sorted elements associated to the different keys are assigned to different reducers; each reducer applies a specific function and produces the output elements.

At this point we should formally define the phases just introduced. In the Map phase the input is a list of pairs of the form ⟨Key, Value⟩, and another list of pairs ⟨Key1, Value1⟩, ⟨Key2, Value2⟩, ... is produced by applying a specific function; we are not implying that the same number of pairs is produced as output. In the Reduce stage a single key with the list of its values is received, i.e. a multiset of the form ⟨Key, [v1, v2, v3, ...]⟩, and a result pair is produced. We are not imposing that the reduce function is associative or commutative, as in the case of the foldl command (Section 1.1) or of the MPI_REDUCE command [1].

During the shuffle & sort stage the framework sorts all the values using the key and assigns every sorted run to a specific machine; this is transparent to the programmer. Notice that every phase can be easily parallelized, but the three phases that compose a computation are indeed sequential: there is no possibility to start the reduce phase before all the elements are sorted. Consider also that shuffle & sort is the only phase where communication is performed; communication is not allowed between mapper or reducer instances. This is necessary to avoid synchronization among a large number of nodes, an operation which would affect the scalability of the whole application.

It may seem that the model imposes some excessive constraints: every element is immutable, we want to be oblivious to data types and consider everything as a String, and we want to be able to split the input in chunks and process them separately. All these constraints state that a MapReduce computation should be purely functional and statically typed (exactly like a computation in Haskell). The Hadoop implementation, however, permits the user to define and use different datatypes under the constraint that they can be serialized as a string.
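The signatures just described can be restated compactly as follows (this is only a summary of the definitions above, using k and v for keys and values):

\[
\text{map}: \langle k_1, v_1 \rangle \longrightarrow [\langle k_2, v_2 \rangle, \langle k_2', v_2' \rangle, \ldots]
\qquad
\text{reduce}: \langle k_2, [v_2, v_2', \ldots] \rangle \longrightarrow \langle k_2, v_3 \rangle
\]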

2.1 The Word Count Example

This is possibly one of the simplest examples of a computation implemented in the MapReduce model: the program just computes how many times different words appear in a set of files. Every element is immutable and of type String [3], so pairs of strings are read from the input and pairs of strings are produced as intermediate output. Given the input files:

dog.txt: this is the dog file
cat.txt: this is the cat file

we should expect the output file to look like this:

this 2
is 2
the 2
dog 1
cat 1
file 2

and the code which actually implements it will have this form:

mapper (filename, file contents):
    for each word in file contents:
        emit (word, 1)

reducer (word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit (word, sum)

Listing 1: Pseudo code of the wordcount application

We can analyze the input/output behavior: every mapper will receive a list of key/value pairs. We can assume that the key is the filename while the value is the content of a whole line of text. For every word in the file the mapper will produce a pair ⟨word, 1⟩. In the shuffle phase the output of every mapper is collected and sorted, and a Reducer is then instantiated for every different key. The reducer associated to the word "this" will receive the pair ⟨this, [1, 1]⟩ and produce the pair ⟨this, 2⟩, while the reducer for the word "dog" will receive the pair ⟨dog, [1]⟩ and produce ⟨dog, 1⟩. Notice how the mapper changes datatypes in this process: starting from pairs of type ⟨filename, string⟩ it produces pairs of type ⟨string, int⟩. Notice also how there is no state associated to mapper or reducer processes, i.e. the whole computation is purely functional and the input files can be split without affecting the correctness of the computation. The corresponding Hadoop code for this example is the following:

public class WordCount {

  public static class MyMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(MyMapper.class);
    job.setReducerClass(MyReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Listing 2: Code of the wordcount application. No relevant computation is performed in the main, apart from specifying the configuration details of the job. Notice how the association of the files (or InputSplits) to the Mappers is performed by the framework.

2.2 Advanced Features

We have already seen how Hadoop permits the use of custom datatypes in order to relieve the programmer from the burden of working with text only. Hadoop has other advanced features that somewhat break the purely functional characteristic of MapReduce in order to obtain better performance and usability.

2.2.1 Combiner

By analyzing the wordcount example in Section 2.1, we can notice that producing a pair ⟨word, 1⟩ for every word in the text can be inefficient. A possible solution is to keep a dictionary of words with the associated number of occurrences and flush it when the input is completely scanned; however, this breaks the functional constraint by adding a state to the mapper. What we really want is that the mapper performs a preliminary reduce of the values of its input split in order to reduce the size of communications. This operation is instead efficiently performed by combiners (Figure 2).

This is done by simply adding to the previous code a single instruction that tells the framework to perform the combiner phase (Listing 3):

job.setMapperClass(MyMapper.class);
job.setCombinerClass(MyReducer.class);
job.setReducerClass(MyReducer.class);

Listing 3: Combiner setup in the job configuration. Notice that a reference to the reducer class is given as the combiner.

[Figure 2 shows the complete programming model: each map task is followed by a combine and a partition step before the values are aggregated by key and passed to the reduce tasks.]

Figure 2: Schema of a MapReduce computation comprising a combiner and partition phase.

As an example, consider the word count application and an InputSplit which contains three instances of the word "this". Without the combiner these pairs will be produced:

⟨this, 1⟩  ⟨this, 1⟩  ⟨this, 1⟩

By using the combiner, the output of the mapper instances running on the same node will be redirected to the node combiner, which will directly produce the pair ⟨this, 3⟩. This is obviously advantageous, considering that the amount of data to be transferred is greatly reduced. It is important to notice that the programmer has no control over the execution of the combiner: the framework keeps in memory sorted runs of the mapper output and periodically flushes these sorted runs using the combiner. The same operation performed by the reducer is thus performed by the combiner in memory, and it is left to the framework to decide whether it is convenient to perform the local reduction or not.

Nonetheless, the programmer can overcome this by performing the reduction by hand, which is possible in Hadoop: it is necessary to rewrite the mapper code so that it keeps a custom local aggregator with its state in memory (Figure 3). This gives the programmer the ability to control the combine phase and to minimize the creation and destruction of objects during execution, and it might be convenient for applications where the mappers produce lots of data with many repetitions. Besides the coding overhead, however, it has two major drawbacks:

1. It breaks the functional programming assumption: the mapper process now has a state.

2. It breaks the stream behavior, and a memory footprint is also necessary, since the auxiliary data structures used to represent the state may grow in size very rapidly.

Whether it is a "real" improvement therefore depends on the application.

Figure 3: Pseudo code of the stateful in-mapper combining pattern (a custom local aggregator with state in memory). Taken from "Data-Intensive Text Processing with MapReduce", Jimmy Lin and Chris Dyer, Morgan & Claypool Publishers, 2010, page 44.
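Figure 3 is not reproduced in these notes; as an illustration only, a minimal sketch of what such a stateful mapper could look like for the wordcount application is shown below. The class name InMapperCombiningMapper and the choice of flushing everything in cleanup() are our own assumptions, not part of the original listing.

// Requires java.util.{HashMap, Map, StringTokenizer} plus the usual
// org.apache.hadoop.io and org.apache.hadoop.mapreduce imports.
public static class InMapperCombiningMapper extends Mapper<Object, Text, Text, IntWritable> {
  // In-memory state: partial counts accumulated across map() calls of the same InputSplit.
  private Map<String, Integer> counts;

  @Override
  protected void setup(Context context) {
    counts = new HashMap<String, Integer>();
  }

  @Override
  public void map(Object key, Text value, Context context) {
    StringTokenizer itr = new StringTokenizer(value.toString());
    while (itr.hasMoreTokens()) {
      String w = itr.nextToken();
      Integer c = counts.get(w);
      counts.put(w, (c == null) ? 1 : c + 1);
    }
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    // Emit the locally aggregated counts once the whole InputSplit has been scanned.
    Text word = new Text();
    IntWritable sum = new IntWritable();
    for (Map.Entry<String, Integer> e : counts.entrySet()) {
      word.set(e.getKey());
      sum.set(e.getValue());
      context.write(word, sum);
    }
  }
}

In a real application the dictionary could also be flushed periodically, instead of only in cleanup(), to bound the memory footprint mentioned in the second drawback above.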

Even more important is the case when combiners cannot simply be reducers executed in memory. In this case, the user has to define a case-specific combiner. Consider a modification of the wordcount example where the output consists of pairs ⟨word, boolean⟩, where the boolean value indicates whether more than ten occurrences of the associated word are present in the set of analyzed documents. This requires a modification of the reducer:

public static class NewReducer extends Reducer<Text, IntWritable, Text, BooleanWritable> {
  private BooleanWritable result = new BooleanWritable();

  public void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable val : values) {
      sum += val.get();
    }
    result.set(sum > 10);
    context.write(key, result);
  }
}

Listing 4: Reducer of the new wordcount application

The mapper is the same presented in Listing 2. The operation performed by the reducer in Listing 4 is not associative and it modifies the data types. The constraints on a combiner are that it should perform a commutative and associative operation and, obviously, it should not change the data types between its input and output; therefore, an ad hoc combiner must be implemented. In this case the combiner has to compute the sum of the values produced by the mapper instances, which is an operation that is both commutative and associative, so it is possible to use the reducer class of the first wordcount application (Listing 2) as the combiner. Notice that it is necessary to make a small modification to the main and to specify the datatypes output in each phase:

public static void main(String[] args) throws Exception {
  Configuration conf = new Configuration();
  Job job = new Job(conf, "wordcount");
  job.setJarByClass(WordCountNew.class);
  job.setMapperClass(MyMapper.class);
  job.setCombinerClass(MyReducer.class);
  job.setReducerClass(NewReducer.class);
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(BooleanWritable.class);
  FileInputFormat.addInputPath(job, new Path(args[0]));
  FileOutputFormat.setOutputPath(job, new Path(args[1]));
  System.exit(job.waitForCompletion(true) ? 0 : 1);
}

Listing 5: Job configuration of the new wordcount application

2.2.2 Partitioner

Another important feature of the Hadoop framework is the Partitioner. Partitioning is the process of determining which reducer instance will receive which intermediate keys and values [5]. The default behavior of Hadoop is to use the HashPartitioner class as a partitioner: it uses the hashCode() function of the key in order to divide the keys evenly among the different reducers, applying the function

j = hashCode(K_i) mod numberOfReducers

so that the reducer R_j is associated with the key K_i. The default behavior is perfectly fine for evenly distributed keys; consider, however, a more realistic case where the number of values per key is unbalanced and you have statistical information about the key/value distribution. In the first case the default partitioner works well, while in the second the user can define a specific partition function. In order to define a partitioner, the user has to implement the Partitioner interface (e.g. MyPartitioner implements Partitioner<K, V>) and specify, in the job configuration, its implementation as the partitioner class using job.setPartitionerClass(MyPartitioner.class).
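As an illustration, a minimal sketch of a custom partitioner for the wordcount job is shown below. In the org.apache.hadoop.mapreduce API the Partitioner is an abstract class to extend; the class name and the idea of reserving a partition for a known heavy key are our own assumptions, not taken from the original notes.

public static class MyPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numPartitions) {
    if (numPartitions == 1) {
      return 0;
    }
    // Hypothetical skew handling: send a known very frequent key to its own reducer
    // and spread all the other keys over the remaining partitions.
    if (key.toString().equals("the")) {
      return 0;
    }
    return 1 + (key.hashCode() & Integer.MAX_VALUE) % (numPartitions - 1);
  }
}

// registered in the job configuration:
// job.setPartitionerClass(MyPartitioner.class);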

3 Design Patterns

We have already seen some advanced features of Hadoop, and we presented the in-mapper combiner, which is an important design pattern. We will now present two other relevant examples of MapReduce design patterns:

• Matrix vector multiplication
• Relational algebra operations [2]

The first one is very important since we will show how many advanced features of Hadoop are used in order to produce an efficient implementation of a MapReduce computation. Examples of matrix matrix multiplication implemented in Hadoop can also be found at [7].

3.1 Matrix Vector multiplication

Consider a very common operation in numerical methods: multiplying a sparse matrix A with a vector v. This operation is performed by the power method (among others) to compute the greatest eigenvalue of a matrix. Only the non-zero elements of the matrix are stored in order to reduce space utilization; every non-zero element is stored as ⟨row, column, value⟩. The simplest way to perform this operation is to use two steps of map reduce:

• Map 1: The mapper receives either a chunk of elements from the matrix or from the vector. In the first case it emits ⟨j, A_i,j⟩ (i indicates the row index, while j indicates the column index), while in the second case it emits ⟨j, v_j⟩. By doing so, elements with the same j, i.e. the same column index, will be sent to the same reducer.

• Reduce 1: The reducer assigned to the key n will receive a single element from the vector v, precisely the element in position n. It will also receive all the non-zero elements of the n-th column of the matrix A. It will emit ⟨i, A_i,n * v_n⟩ for every non-zero A_i,n received (Figure 5).

Figure 4: Matrix Vector multiplication (schema omitted).

• Map 2: An identity mapper (it reads elements from the input and outputs them without modifications).

• Reduce 2: It receives all the temporary elements with the same i, i.e. the same row index. It performs the sum of these elements and emits ⟨i, Σ_j A_i,j * v_j⟩, the final value of the output vector for the row i (Figure 6).

[Figure 5 shows reducer n receiving the vector element v(n) together with the column A(:,n) and emitting the partial products A(:,n)*v(n).]

Figure 5: First Phase of Map Reduce.
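The two phases can be summarized as follows, in the same notation used above (w denotes the result vector):

\[
w_i = \sum_j A_{i,j}\, v_j
\]
\[
\text{phase 1 (grouped by column } j\text{)}: \quad \langle j, A_{i,j}\rangle,\ \langle j, v_j\rangle \;\longrightarrow\; \langle i, A_{i,j}\, v_j\rangle
\]
\[
\text{phase 2 (grouped by row } i\text{)}: \quad \langle i, [A_{i,1} v_1, A_{i,2} v_2, \ldots]\rangle \;\longrightarrow\; \langle i, \textstyle\sum_j A_{i,j} v_j\rangle = \langle i, w_i\rangle
\]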

[Figure 6 shows reducer n1 receiving the partial products of row n1 and emitting Σ_j A_n1,j * v_j, while reducer n2 does the same for row n2.]

Figure 6: Second Phase of Map Reduce.

It does not appear to be a complex operation to implement in Hadoop, but it requires some attention in order to be implemented correctly. The first step in particular is somewhat tricky; let us analyze it in detail:

1. The mapper should distinguish between matrix elements and vector elements, and it has to emit data structured differently in the two cases.

2. Similarly, the reducer has to distinguish between an element coming from the vector and an element coming from the matrix. Consider the output of the first reducer, ⟨i, A_i,n * v_n⟩: if elements from A and v are not distinguished, the computation will be incorrect.

3. We would like to receive v_n before the elements of the matrix, since v_n is used in every multiplication performed by the reducer.

4. We would like to minimize the size of the data transferred.

5. We want to have well organized and maintainable code: we are using two steps of map reduce in sequence, and the information is not inherently text (the elements are numbers).

The first two concerns are functional. First, we consider a preliminary straightforward implementation:

public class Phase1 {

  public static class MyMapper extends Mapper<LongWritable, Text, IntWritable, Text> {
    private static boolean V = false;

    @Override
    protected void setup(Context context) throws IOException {
      String chunkName = ((FileSplit) context.getInputSplit()).getPath().getName();
      if (chunkName.startsWith("V")) {
        V = true;
      } else if (!chunkName.startsWith("A")) {
        throw new IOException("File name not correct");
      }
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      /* The vector is stored one element per line in the form row#value.
       * The matrix is stored one element per line in the form column#row%value. */
      String[] values = value.toString().split("#");
      if (V) {
        context.write(new IntWritable(Integer.parseInt(values[0])), new Text("V" + values[1]));
      } else {
        /* The sparse matrix element must be emitted */
        IntWritable column = new IntWritable(Integer.parseInt(values[0]));
        context.write(column, new Text("a" + values[1]));
      }
    }
  }

  public static class MyReducer extends Reducer<IntWritable, Text, IntWritable, Text> {
    @Override
    public void reduce(IntWritable key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      ArrayList<Text> matrixValues = new ArrayList<Text>();
      double vectorValue = 0;
      Text val;
      Iterator<Text> iter = values.iterator();
      while (iter.hasNext()) {
        val = iter.next();
        if (val.charAt(0) == 'a') {
          matrixValues.add(new Text(val)); // copy: the framework reuses the Text instance
        } else {
          vectorValue = Double.parseDouble(val.toString().substring(1));
        }
      }

      /* Elements are emitted by scanning the list of values received from the matrix */
      double outputValue;
      iter = matrixValues.iterator();
      while (iter.hasNext()) {
        val = iter.next();
        String[] rowValue = val.toString().substring(1).split("%");
        outputValue = vectorValue * Double.parseDouble(rowValue[1]);
        context.write(new IntWritable(Integer.parseInt(rowValue[0])), new Text("" + outputValue));
      }
    }
  }
}

Listing 6: Code of the first Map Reduce phase.

We assume that the vector is saved in one or more files whose names start with V, stored in a dedicated folder, and similarly that the matrix A is stored in files whose names start with A. We solved the problem of distinguishing between A and v in the mapper by using the filename of the chunk; similarly, we solved the problem in the reducer by adding an annotation (a one-character prefix) to the intermediate values. We are also using text to store the intermediate values, so we can arrange the structure of the data as we want. In this way mappers and reducers can distinguish between data coming from the vector and data coming from the matrix.

The order of the values received by a reducer is, however, random. This solution is inefficient, since the reducer has to store every element received from the matrix. A possible optimization is to buffer elements from A only until the element from the vector is received and then flush the buffer, outputting the final elements in a streaming fashion; in the worst case, this is equivalent to the previous solution (Listing 6). We can overcome this problem by using the shuffle & sort phase to ensure that the vector element is the first in the list of values of every reduce operation. This problem of having the values (not only the keys) sorted is known in the literature as Secondary Sort.

3.1.1 Secondary Sort

What we want to accomplish with secondary sort is to have the runs of values associated to a key ordered depending on our needs. There are many possible ways to implement a secondary sort; in this specific case a very convenient solution is to define a specific object to be used as a key. In order to use a user defined object as a key in Hadoop, it must implement the WritableComparable interface: the object must be writable, in order to be serialized using the methods write(DataOutput out) and readFields(DataInput in), and it must be comparable, so that a list of keys can be sorted using the compareTo method. Here we can extend IntWritable (Listing 7) in order to solve the problem in a very concise way:

public class IntAndIdWritable extends IntWritable {

  private char id;

  @Override
  public void readFields(DataInput in) throws IOException {
    super.readFields(in);
    this.id = in.readChar();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    super.write(out);
    out.writeChar(id);
  }

  /** Compares two IntAndIdWritables: first on the integer value, then on the id. */
  @Override
  public int compareTo(Object o) {
    int compare_value = super.compareTo(o);
    return (compare_value == 0) ? this.id - ((IntAndIdWritable) o).id : compare_value;
  }

  /** A Comparator optimized for IntAndIdWritable, working directly on the serialized bytes. */
  public static class Comparator extends WritableComparator {
    public Comparator() {
      super(IntAndIdWritable.class);
    }

    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
      int thisValue = readInt(b1, s1);
      int thatValue = readInt(b2, s2);
      /* chars are in UTF -> 2 bytes */
      int charComparison = compareBytes(b1, s1 + l1 - 2, 2, b2, s2 + l2 - 2, 2);
      return (thisValue < thatValue ? -1 : (thisValue == thatValue ? charComparison : 1));
    }
  }

  static { // register this comparator
    WritableComparator.define(IntAndIdWritable.class, new Comparator());
  }
}

Listing 7: IntAndIdWritable class. We omitted constructors and getter/setter methods for clarity.

There is a lot going on in this class. It implements the methods of both the Writable and the Comparable interfaces. To be more efficient, the framework gives the programmer the possibility of implementing the comparison directly on the bytes of the serialized objects, in order to avoid serialization and deserialization during the shuffle & sort phase; the inner class Comparator implements such a compare on bytes. Consider the three mechanisms that come into play during shuffle & sort:

1. Sort: elements are sorted, based exclusively on the key. In our class we added an id and implemented a new compareTo method so that elements are placed in the right order, i.e. the element coming from the vector precedes the elements coming from the matrix that have the same index.

2. Partition: elements are partitioned among the workers. Sorting alone is not enough, because elements from the vector and from the matrix have different keys (the id differs), so they could be partitioned to different machines; we must ensure that they are sent to the same reducer task. This is done by defining an ad hoc partition function, which in this particular case can be inherited from the superclass (IntWritable) without any modification: IntAndIdWritable inherits the hashCode method from IntWritable and the default HashPartitioner is used, so only the integer part of the object is used for partitioning.

3. Group: elements are grouped together into a single reduce() invocation, based on the key and, if necessary, on the value. Still, this is not enough: runs of elements with different keys (same integer part, different id) would be assigned to the same reducer but reduced separately. The user can override the comparator used in this phase and group together elements with different keys; such a grouping comparator tells the framework when to collapse different runs (with different keys) into a single run. In this case we can declare the comparator of IntWritable as the grouping comparator of IntAndIdWritable, so elements with the same integer part will be grouped together, ignoring the id.

We have thus ensured that the elements from the matrix and from the vector having the same index are sent to the same reducer, and that they are sorted in such a way that the element from the vector is the first of the run. Notice that every object can have different comparators for the sort and group phases. The whole implementation of a secondary sort is not so simple, since it requires the redefinition of these different mechanisms; here we managed to implement it with minimal modifications, using the default partitioner and inheriting the other mechanisms from IntWritable. The reader may actually benefit from trying to implement the secondary sort from scratch, redefining all three interfaces (all the methods should work on raw binary data for better performance).
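To make the three mechanisms explicit, this is how they could all be wired in the job configuration. This is only a sketch: in the actual implementation below (Listing 8) only the grouping comparator is registered explicitly, since the sort order comes from the comparator registered statically in Listing 7 and the partitioner is the default HashPartitioner.

// Sort: which comparator orders the keys (vector element before matrix elements).
job.setSortComparatorClass(IntAndIdWritable.Comparator.class);
// Partition: which reducer receives a key; here the default HashPartitioner, set explicitly.
job.setPartitionerClass(HashPartitioner.class);
// Group: which keys are collapsed into a single reduce() call (compare only the integer part).
job.setGroupingComparatorClass(IntWritable.Comparator.class);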

3.1.2 Generic Objects and Sequence Files

Now consider a completely different problem of the naive implementation (Listing 6): we work with integers and double values, and therefore using text for the input/output and for the intermediate values is probably inefficient and leads to complex code. Hadoop permits the use of any type as key or value under the constraint that it implements the WritableComparable interface. Hadoop also offers the possibility of using serialized (binary) data as input and output of a map reduce job; this is done via the SequenceFile record reader and writer. This mechanism produces more readable and efficient code, since key/value pairs are stored directly as objects, without need for conversion.

In our case the input key corresponds to the column index of either the matrix or the vector. However, the values read from the input and emitted by the mappers can be either a sparse matrix element (value and coordinate) or a double value (a single vector element). Consider that it is not possible to exploit polymorphism: every object read or written by mappers and reducers must have the exact type specified in its type declaration. For example, given a Mapper<IntWritable, DoubleWritable, IntWritable, DoubleWritable>, it is not possible to read an IntAndIdWritable (which inherits from IntWritable) as a key, nor to write an IntAndIdWritable as an output key; the framework does not permit such operations. Even if this breaks the Liskov substitution principle (which can be quite shocking for some of the readers of this report), it is done in order to ensure better performance: in order to use polymorphism, every serialized element should contain a serialized id of its class, increasing the amount of data transferred. This is what the implementation of ObjectOutputStream in Java does, and it greatly increases the size of the serialized data.

The framework has a specific solution to this problem: the GenericWritable interface. It provides the developer with a writable wrapper for different types of objects with a minimal serialization overhead: we do not want to be able to serialize every possible type of object, but only a small, fixed set of different types. In conclusion, Secondary Sort, Sequence Files and GenericWritable elements are used together in order to ensure maximum performance and to produce concise and maintainable code.
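The listings below use a GenericElement wrapper and a SparseVectorElement type whose definitions are not reproduced in these notes. Assuming that GenericElement is the usual GenericWritable subclass enumerating the two allowed types, a minimal sketch could look like the following (the field layout of SparseVectorElement, a coordinate plus a value, is inferred from its use in Listing 8):

public class GenericElement extends GenericWritable {
  // The closed set of types this wrapper is allowed to carry.
  private static Class<? extends Writable>[] TYPES = new Class[] {
      DoubleWritable.class,       // a single vector element
      SparseVectorElement.class   // a sparse matrix element (coordinate + value)
  };

  @Override
  protected Class<? extends Writable>[] getTypes() {
    return TYPES;
  }
}

A wrapped value is stored with set(...) and recovered with get(), which is how the reducer below unwraps the vector and matrix elements.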

The final implementation of the first phase is the following; similarly to Listing 6, vector and matrix elements are distinguished using the name of the folder containing the input file:

public class Phase1 {

  private static boolean V = false;

  public static class MyMapper extends Mapper<IntWritable, GenericElement, IntAndIdWritable, GenericElement> {
    IntAndIdWritable out = new IntAndIdWritable();

    @Override
    protected void setup(Context context) throws IOException {
      String folderName = ((FileSplit) context.getInputSplit()).getPath().getParent().getName();
      if (folderName.startsWith("V")) {
        V = true;
      } else if (!folderName.startsWith("A")) {
        throw new IOException("File name not correct");
      }
    }

    @Override
    public void map(IntWritable key, GenericElement value, Context context)
        throws IOException, InterruptedException {
      if (V) {
        /* A vector element must be emitted */
        out.set(key.get(), 'W'); // set(int, char) is one of the omitted setters of IntAndIdWritable
        context.write(out, value);
      } else {
        /* The sparse matrix element must be emitted */
        out.set(key.get(), 'a');
        context.write(out, value);
      }
    }
  }

  public static class MyReducer extends Reducer<IntAndIdWritable, GenericElement, IntWritable, DoubleWritable> {
    IntWritable out = new IntWritable();
    DoubleWritable emit = new DoubleWritable();

    @Override
    public void reduce(IntAndIdWritable key, Iterable<GenericElement> values, Context context)
        throws IOException, InterruptedException {
      DoubleWritable vectorValue = null;
      SparseVectorElement val = null;
      double vv = 0;
      Iterator<GenericElement> iter = values.iterator();
      if (iter.hasNext()) {
        // thanks to the secondary sort, the first element of the run is the vector element
        GenericElement g = iter.next();
        // its correct type can be inferred and the generic element is unwrapped
        vectorValue = (DoubleWritable) g.get();
        vv = vectorValue.get();
      }
      while (iter.hasNext()) {
        val = (SparseVectorElement) iter.next().get();
        if (val.getValue() != 0.0) {
          emit.set(vv * val.getValue());
          out.set(val.getCoordinate());
          context.write(out, emit);
        }
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "Phase 1");
    job.setJarByClass(Phase1.class);
    job.setMapperClass(MyMapper.class);
    job.setReducerClass(MyReducer.class);
    job.setMapOutputKeyClass(IntAndIdWritable.class);
    job.setMapOutputValueClass(GenericElement.class);
    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(DoubleWritable.class);
    job.setGroupingComparatorClass(IntWritable.Comparator.class);
    // these should be added in order to use sequence files
    job.setInputFormatClass(SequenceFileInputFormat.class);
    job.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileInputFormat.addInputPath(job, new Path(args[1]));
    FileOutputFormat.setOutputPath(job, new Path(args[2]));
    job.waitForCompletion(true);
  }
}

Listing 8: First phase of matrix vector multiplication (Figure 5). Notice how objects are reused in order to reduce the overhead of garbage collection; this overhead might be significant for very large computations (especially if JVM reuse is enabled).

The second phase of the matrix vector multiplication is instead very simple: the temporary values of every row of the output vector are grouped together, summed, and the final values are emitted. We show its implementation without any further comment; only the reducer is shown, since no computation is performed in the map phase.

public class Phase2 {

  public static class MyReducer extends Reducer<IntWritable, GenericElement, IntWritable, DoubleWritable> {

    @Override
    public void reduce(IntWritable key, Iterable<GenericElement> values, Context context)
        throws IOException, InterruptedException {
      /* Sum all the temporary products received for this row of the output vector */
      DoubleWritable mv;
      DoubleWritable result = null;
      Iterator<GenericElement> iterator = values.iterator();
      if (iterator.hasNext()) {
        result = (DoubleWritable) iterator.next().get();
      }
      while (iterator.hasNext()) {
        mv = (DoubleWritable) iterator.next().get();
        result.set(result.get() + mv.get());
      }
      context.write(key, result);
    }
  }
}

Listing 9: Second phase of matrix vector multiplication (Figure 6). Only the reducer is shown, since no computation is performed in the map phase.

3.1.3 In Memory Multiplication

We will now examine the case of a vector small enough to fit in memory.

[Figure 7 shows each reducer holding the whole vector v(:) in memory and receiving entire rows of the matrix: reducer n1 receives A(n1,:) and emits Σ_j A_n1,j * v_j, reducer n2 receives A(n2,:) and emits Σ_j A_n2,j * v_j.]

Figure 7: In Memory Multiplication Map Reduce.

In this case it is possible to compute the multiplication employing only one phase of map reduce (a sketch is given after the list):

• Map: identity mapper.

• Shuffle & Sort: the elements of the matrix are sorted using the row index as key. Therefore, every reducer will receive entire rows of the matrix A (secondary sort can be used to obtain runs sorted also on the column index).

• Reduce:
  1. Every reducer task stores the vector v in its memory (this can be done efficiently using HDFS and Hadoop mechanisms such as the DistributedCache).
  2. For each A_i,j received, A_i,j * v_j is computed and added to the previous results; the elements from the matrix are accessed in the usual streaming pattern.
  3. When every element of the same row of the matrix has been received, i.e. at the end of the reduce call for that row, the final value is emitted.

The whole operation is similar to the second stage of the previous computation (Figure 6), but now all the reducers have to access the vector. We should expect this implementation to be faster, since it requires only one phase of map reduce. However, it is clear that the first implementation scales better, since its reducers do not have to store the entire vector in memory and do not have to access it concurrently from the file system. In conclusion, this second solution would probably perform better for a small vector v and a small number of nodes (and in both cases it is questionable to use MapReduce to solve the problem at all).
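No listing is given in the original notes for this variant; a minimal sketch of such a reducer is shown below. The value format ("column,value" as Text), the configuration keys vector.path and vector.size, and the class name are our own assumptions; loading the vector through the DistributedCache, as suggested above, would work equally well.

// Requires java.io.{BufferedReader, InputStreamReader, IOException} and the usual
// org.apache.hadoop.conf, fs, io and mapreduce imports.
public static class InMemoryReducer extends Reducer<IntWritable, Text, IntWritable, DoubleWritable> {

  private double[] v;                              // the whole input vector, kept in memory
  private final DoubleWritable result = new DoubleWritable();

  @Override
  protected void setup(Context context) throws IOException {
    Configuration conf = context.getConfiguration();
    v = new double[conf.getInt("vector.size", 0)];
    // Read the vector from a side file, one "index value" pair per line.
    Path path = new Path(conf.get("vector.path"));
    FileSystem fs = FileSystem.get(conf);
    BufferedReader reader = new BufferedReader(new InputStreamReader(fs.open(path)));
    String line;
    while ((line = reader.readLine()) != null) {
      String[] parts = line.split("\\s+");
      v[Integer.parseInt(parts[0])] = Double.parseDouble(parts[1]);
    }
    reader.close();
  }

  @Override
  public void reduce(IntWritable row, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    // The values are all the non-zero elements of this row, streamed as "column,value".
    double sum = 0;
    for (Text element : values) {
      String[] parts = element.toString().split(",");
      sum += Double.parseDouble(parts[1]) * v[Integer.parseInt(parts[0])];
    }
    result.set(sum);
    context.write(row, result);   // the final value of the output vector for this row
  }
}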

3.2 Relational Algebra

We now briefly introduce how to implement relational algebra operations in Hadoop.

3.2.1 Selection

Select from a relation R the tuples satisfying a condition C:

• Map: emit only the tuples that satisfy condition C.

3.2.2 Projection

For each tuple in relation R select only specific attributes and remove duplicates:

• Map: from a tuple t, create a new tuple t' containing only the selected attributes and emit (t', t').
• Reduce: receive a list of identical tuples (t', [t', t', ...]) and emit t' only once.

3.2.3 Union

Given relations R and S, output the union of their tuples:

• Map: for each tuple t emit (t, t).
• Reduce: emit each tuple t once.

3.2.4 Intersection

Given relations R and S, output the tuples appearing in both:

• Map: for each tuple t emit (t, "R") if t comes from relation R, (t, "S") otherwise.
• Reduce: emit tuple t if and only if (t, ["R", "S"]) is received.

3.2.5 Difference

Given relations R and S, output the tuples of R that do not appear in S:

• Map: for each tuple t emit (t, "R") if t comes from relation R, (t, "S") otherwise.
• Reduce: emit tuple t if and only if (t, ["R"]) is received.

3.2.6 Natural Join

Given relations R(A, B) and S(B, C), find the tuples that have the same B attribute:

• Map: given a tuple (a, b) ∈ R emit (b, ("R", a)); given a tuple (b, c) ∈ S emit (b, ("S", c)).
• Sort: tuples with the same B attribute will be shuffled to the same reducer.
• Reduce: given the received list (b, [("R", a1), ("R", a2), ("S", c1), ...]), produce all the pairs and output them as (b, pair1), (b, pair2), (b, pair3), ... We should expect the number of produced pairs to be limited, since we have already grouped together the tuples with the same B attribute.
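As an illustration of the last pattern, a sketch of a reduce-side natural join is shown below. The tuple encoding (comma-separated text, with the relation inferred from the input file name, as already done for the matrix and the vector in Listing 6) and the class names are our own assumptions.

// Requires java.util.{ArrayList, List} plus the usual Hadoop io/mapreduce imports.
public class NaturalJoin {

  public static class JoinMapper extends Mapper<LongWritable, Text, Text, Text> {
    private boolean fromR;   // true if this split belongs to R(A,B), false for S(B,C)
    private final Text outKey = new Text();
    private final Text outValue = new Text();

    @Override
    protected void setup(Context context) {
      String name = ((FileSplit) context.getInputSplit()).getPath().getName();
      fromR = name.startsWith("R");
    }

    @Override
    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String[] fields = value.toString().split(",");   // R lines: "a,b"; S lines: "b,c"
      if (fromR) {
        outKey.set(fields[1]);                          // join attribute b
        outValue.set("R" + fields[0]);                  // tag the payload with its relation
      } else {
        outKey.set(fields[0]);
        outValue.set("S" + fields[1]);
      }
      context.write(outKey, outValue);
    }
  }

  public static class JoinReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    public void reduce(Text key, Iterable<Text> values, Context context)
        throws IOException, InterruptedException {
      List<String> rValues = new ArrayList<String>();
      List<String> sValues = new ArrayList<String>();
      for (Text v : values) {
        String s = v.toString();
        if (s.charAt(0) == 'R') {
          rValues.add(s.substring(1));
        } else {
          sValues.add(s.substring(1));
        }
      }
      // Emit a pair (a, c) for every combination with the same join attribute b.
      Text pair = new Text();
      for (String a : rValues) {
        for (String c : sValues) {
          pair.set(a + "," + c);
          context.write(key, pair);
        }
      }
    }
  }
}

A secondary sort could be used to stream one of the two relations instead of buffering it, exactly as done for the matrix and the vector above.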

3.2.7 Group by and aggregation functions

Given a relation R(A, B, C), we group together the elements with the same A attribute and apply an aggregation function f on the related B elements:

• Map: given (a, b, c) ∈ R, output (a, b).
• Sort: elements with the same a will be sent to the same reducer.
• Reduce: given the list (a, [b1, b2, ...]), apply the function f on the list, f(b1, b2, ...) = x, and emit (a, x).

4 Conclusions

This short essay on the MapReduce model and its design patterns should have given the reader a basic understanding of the programming model and presented some of its most important design patterns. In particular, it is important to understand the various passages from a basic implementation to an implementation that fully exploits the potential of the framework. The example of matrix vector multiplication is explained in great detail in order to present the reader many important aspects of Hadoop and to show the usage of its advanced mechanisms.

References

[1] MPI_Reduce man page. http://www.open-mpi.org/doc/v1.5/man3/MPI_Reduce.3.php.
[2] A. Dasdan, R.-L. Hsiao, and D. Stott Parker. Map-Reduce-Merge: simplified relational data processing on large clusters.
[3] H. Karloff, S. Suri, and S. Vassilvitskii. A model of computation for MapReduce.
[4] R. Lämmel. Google's MapReduce programming model.
[5] Yahoo Developer Network. Advanced MapReduce.
[6] Yahoo Developer Network. MapReduce.
[7] J. Norstad. A MapReduce algorithm for matrix multiplication.
[8] J. Venner. Pro Hadoop.
[9] T. White. Hadoop: The Definitive Guide.