
Name: Krittika Roy SAP: 60002190056 BE E21

Semester: VII
Subject: Big Data Analytics
Matrix Multiplication using Hadoop MapReduce Framework

Aim: To implement matrix multiplication using MapReduce.


Theory:
Matrix Multiplication with One MapReduce Step:
There often is more than one way to use MapReduce to solve a problem. You may wish to
use only a single MapReduce pass to perform matrix multiplication P = MN. It is possible
to do so if we put more work into the two functions. Start by using the Map function to create
the sets of matrix elements that are needed to compute each element of the answer P. Notice
that an element of M or N contributes to many elements of the result, so one input element
will be turned into many key-value pairs. The keys will be pairs (i, k), where i is a row of M
and k is a column of N. Here is a synopsis of the Map and Reduce functions.
The Map Function: For each element m_ij of M, produce all the key-value pairs ((i, k), (M, j,
m_ij)) for k = 1, 2, ..., up to the number of columns of N. Similarly, for each element n_jk of
N, produce all the key-value pairs ((i, k), (N, j, n_jk)) for i = 1, 2, ..., up to the number of rows
of M. Here the M and N inside the values are just one-bit tags that tell which of the two
matrices a value comes from.
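The Map step described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the experiment: the record layout (tag, row, column, value) and the names map_element, rows_of_m and cols_of_n are assumptions made for the example, and indices start at 1 as in the text.

```python
from typing import Iterator, Tuple

# Assumed record layout: (matrix_tag, row, col, value), with tag "M" or "N".
Record = Tuple[str, int, int, float]
KeyValue = Tuple[Tuple[int, int], Tuple[str, int, float]]


def map_element(record: Record, rows_of_m: int, cols_of_n: int) -> Iterator[KeyValue]:
    """Replicate one matrix element to every output cell (i, k) that needs it."""
    tag, row, col, value = record
    if tag == "M":
        i, j = row, col  # the element is m_ij
        for k in range(1, cols_of_n + 1):
            yield (i, k), ("M", j, value)
    else:
        j, k = row, col  # the element is n_jk
        for i in range(1, rows_of_m + 1):
            yield (i, k), ("N", j, value)


# One element of M fans out to every column of N (3 pairs here),
# and one element of N fans out to every row of M (2 pairs here).
pairs_from_m = list(map_element(("M", 1, 2, 5.0), rows_of_m=2, cols_of_n=3))
pairs_from_n = list(map_element(("N", 2, 3, 7.0), rows_of_m=2, cols_of_n=3))
```

Each element is copied once per output cell it contributes to, which is why one input element turns into many key-value pairs.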
The Reduce Function: Each key (i, k) will have an associated list with all the values (M, j,
m_ij) and (N, j, n_jk), for all possible values of j. The Reduce function needs to connect the
two values on the list that have the same value of j, for each j. An easy way to do this step is
to sort by j the values that begin with M and sort by j the values that begin with N, in separate
lists. The jth values on each list must have their third components, m_ij and n_jk, extracted and
multiplied. Then, these products are summed and the result is paired with (i, k) in the output
of the Reduce function.
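The sort-and-pair approach for the Reduce step might look like the sketch below. It assumes dense matrices, so both sorted lists contain every value of j and their jth entries line up; the function name reduce_cell and the tuple layout (tag, j, element) are illustrative choices, not part of the experiment's code.

```python
from typing import Iterable, Tuple

Value = Tuple[str, int, float]  # assumed layout: (matrix_tag, j, element)


def reduce_cell(key: Tuple[int, int], values: Iterable[Value]) -> Tuple[Tuple[int, int], float]:
    """Compute one output element p_ik from the values grouped under key (i, k)."""
    values = list(values)
    # Split by matrix tag and sort each list by j.
    from_m = sorted((j, x) for tag, j, x in values if tag == "M")
    from_n = sorted((j, x) for tag, j, x in values if tag == "N")
    # Dense assumption: the jth entries of both lists share the same j,
    # so pairing them position by position matches m_ij with n_jk.
    total = sum(m * n for (_, m), (_, n) in zip(from_m, from_n))
    return key, total


# Values arrive in arbitrary order after the shuffle.
cell = reduce_cell((1, 1), [("N", 2, 7.0), ("M", 1, 1.0), ("M", 2, 2.0), ("N", 1, 5.0)])
# 1.0 * 5.0 + 2.0 * 7.0 = 19.0
```

For sparse inputs, where some j may be missing from one list, pairing by position would no longer be safe and the values would need to be matched on j explicitly.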
You may notice that if a row of the matrix M or a column of the matrix N is so large that it
will not fit in main memory, then the Reduce tasks will be forced to use an external sort to
order the values associated with a given key (i, k). However, in that case, the matrices
themselves are so large, perhaps 10^20 elements, that it is unlikely we would attempt this
calculation if the matrices were dense. If they are sparse, then we would expect many fewer
values to be associated with any one key, and it would be feasible to do the sum of products
in main memory.
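To make the memory argument concrete, the dense case can be counted directly. The sketch below assumes M has rows_m rows, N has cols_n columns, and shared is their common dimension (columns of M, rows of N); the function name replication_cost is hypothetical.

```python
from typing import Tuple


def replication_cost(rows_m: int, shared: int, cols_n: int) -> Tuple[int, int]:
    """Count key-value pairs for a dense one-step matrix multiplication."""
    # Each of the rows_m * shared elements of M is emitted once per column of N,
    # and each of the shared * cols_n elements of N once per row of M.
    pairs_emitted = rows_m * shared * cols_n + shared * cols_n * rows_m
    # Every key (i, k) receives one value per j from M and one per j from N.
    values_per_key = 2 * shared
    return pairs_emitted, values_per_key


cost = replication_cost(rows_m=2, shared=3, cols_n=4)
# 24 pairs from M plus 24 pairs from N; 6 values under each key
```

The list held by one reducer grows with the shared dimension only, which is why a single very long row or column is what forces an external sort.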
PseudoCode:
map(key, value):
    // A is an m x n matrix and B is an n x p matrix
    // value is ("A", i, j, a_ij) or ("B", j, k, b_jk)
    if value[0] == "A":
        i = value[1]
        j = value[2]
        a_ij = value[3]
        for k = 1 to p:
            emit((i, k), ("A", j, a_ij))
    else:
        j = value[1]
        k = value[2]
        b_jk = value[3]
        for i = 1 to m:
            emit((i, k), ("B", j, b_jk))

reduce(key, values):
    // key is (i, k)
    // values is a list of ("A", j, a_ij) and ("B", j, b_jk)
    hash_A = {j: a_ij for (x, j, a_ij) in values if x == "A"}
    hash_B = {j: b_jk for (x, j, b_jk) in values if x == "B"}
    result = 0
    for j = 1 to n:
        result += hash_A[j] * hash_B[j]
    emit(key, result)
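As a quick check of the pseudocode, the whole pipeline can be simulated in a single Python process. This is only a local sketch under stated assumptions: matrices are small dense lists of lists, indices are 0-based (unlike the 1-based pseudocode), a dictionary stands in for Hadoop's shuffle, and the name multiply_via_mapreduce is made up for the example.

```python
from collections import defaultdict


def multiply_via_mapreduce(A, B):
    """Simulate map, shuffle and reduce for A (m x n) times B (n x p)."""
    m, n, p = len(A), len(B), len(B[0])
    groups = defaultdict(list)  # stand-in for the shuffle: key (i, k) -> values

    # Map phase: replicate each element to every output cell that needs it.
    for i in range(m):
        for j in range(n):
            for k in range(p):
                groups[(i, k)].append(("A", j, A[i][j]))
    for j in range(n):
        for k in range(p):
            for i in range(m):
                groups[(i, k)].append(("B", j, B[j][k]))

    # Reduce phase: match values on j and sum the products.
    result = [[0] * p for _ in range(m)]
    for (i, k), values in groups.items():
        hash_a = {j: v for tag, j, v in values if tag == "A"}
        hash_b = {j: v for tag, j, v in values if tag == "B"}
        # .get(j, 0) treats a missing entry as zero, which also suits sparse input.
        result[i][k] = sum(hash_a[j] * hash_b.get(j, 0) for j in hash_a)
    return result


product = multiply_via_mapreduce([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# product is [[19, 22], [43, 50]]
```

On a real cluster the map and reduce parts would run as separate tasks and the grouping would be done by the framework; only the logic inside each phase carries over.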

Result:

Creating new directory and JAR files. Also checking the list of files.

Creating a Hadoop directory and putting the file in it. Also printing the content in it.

Storing and logging the sorted result.


Conclusion:

In this experiment, we learnt how to implement matrix multiplication with MapReduce and
run it using Hadoop commands.
