Map Reduce Algorithm for Statistical Functions
By Dr. R. Satya Krishna Sharma
Mean
• Input: 10 numbers (x1, x2, ..., x10)
• Map:
◦ Input: List of 10 numbers
◦ Output: (key, value) pairs where the key is a constant (e.g., 1) and the value is the corresponding number from the input list.
◦ Input: [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]
◦ Map Output: [(1, 4), (1, 8), ..., (1, 7)]
• Sort: [(1, 3), (1, 4), (1, 5), ..., (1, 15)]
• Shuffle: [(1, (3, 4, 5, 6, 7, 8, 9, 11, 12, 15))]
• Reduce: [1, (3 + 4 + 5 + ... + 12 + 15) / 10]
= [1, 80 / 10]
= [1, 8]
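The Map, Sort, Shuffle, and Reduce steps above can be sketched in plain Python. This is a minimal in-memory illustration, not a Hadoop job; the function names `map_phase`, `shuffle_phase`, and `reduce_mean` are hypothetical and chosen only for this example.

```python
from collections import defaultdict

def map_phase(numbers):
    """Emit (1, x) for every number; the constant key sends all values to one reducer."""
    return [(1, x) for x in numbers]

def shuffle_phase(pairs):
    """Sort the pairs, then group the values under their key."""
    grouped = defaultdict(list)
    for key, value in sorted(pairs):
        grouped[key].append(value)
    return dict(grouped)

def reduce_mean(key, values):
    """Sum the grouped values and divide by their count."""
    return key, sum(values) / len(values)

numbers = [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]
grouped = shuffle_phase(map_phase(numbers))
mean_result = [reduce_mean(key, values) for key, values in grouped.items()]
print(mean_result)  # [(1, 8.0)]
```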
Standard Deviation
• Let's consider the numbers: [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]
• 1st Map Output: [(1, 4), (1, 8), ..., (1, 7)]
• 1st Sort: [(1, 3), (1, 4), (1, 5), ..., (1, 15)]
• 1st Shuffle: [(1, (3, 4, 5, 6, ..., 15))]
• 1st Reduce: [1, ((3 + 4 + 5 + 6 + ... + 15), 10)]
= [1, (80, 10)]
This first pass is equivalent to importing the Map Reduce of the mean: it yields the sum 80 and the count 10, so the mean is 80 / 10 = 8.
• 2nd Map and Sort Output: [(1, (3 - 8)), (1, (4 - 8)), ..., (1, (15 - 8))]
squared: [(1, (3 - 8)^2), (1, (4 - 8)^2), ..., (1, (15 - 8)^2)]
• 2nd Shuffle: [(1, (3 - 8)^2), (1, (4 - 8)^2), ..., (1, (15 - 8)^2)]
grouped: [(1, ((3 - 8)^2, (4 - 8)^2, ..., (15 - 8)^2))]
• 2nd Reduce: [1, (25 + 16 + 9 + ... + 49)]
= [1, SQRT(130 / (10 - 1))]
≈ [1, 3.80]
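The two passes above can be sketched as two small in-memory jobs. This is an illustrative sketch only: `run_job` is a hypothetical helper standing in for a real MapReduce framework, and the divisor (n - 1) follows the sample standard deviation used in the slides.

```python
import math
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one in-memory map -> sort/shuffle -> reduce pass."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return {key: reducer(sorted(values)) for key, values in sorted(grouped.items())}

numbers = [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]

# Pass 1: reduce to (sum, count) under the constant key 1, as in the mean job.
total, count = run_job(numbers, lambda x: [(1, x)], lambda vs: (sum(vs), len(vs)))[1]
mean = total / count

# Pass 2: map each number to its squared deviation, then sum in the reducer.
sum_sq = run_job(numbers, lambda x: [(1, (x - mean) ** 2)], sum)[1]
sd = math.sqrt(sum_sq / (count - 1))

print(total, count, sum_sq, round(sd, 4))  # 80 10 130.0 3.8006
```

The mean has to be known before the second map can run, which is why the computation is split into two jobs rather than one.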
Skewness
• Let's consider the numbers: [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]
• Import the Map Reduce of Standard Deviation (mean = 8, SD = SQRT(130 / (10 - 1)))
• 2nd Map and Sort Output: [(1, (3 - 8)), (1, (4 - 8)), ..., (1, (15 - 8))]
cubed: [(1, (3 - 8)^3), (1, (4 - 8)^3), ..., (1, (15 - 8)^3)]
• 2nd Shuffle: [(1, (3 - 8)^3), (1, (4 - 8)^3), ..., (1, (15 - 8)^3)]
grouped: [(1, ((3 - 8)^3, (4 - 8)^3, ..., (15 - 8)^3))]
• 2nd Reduce: [1, (-125 - 64 - 27 - ... + 343)]
= [1, 210 / ((10 - 1) * SD^3)]
≈ [1, 0.425]
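As a rough check of the skewness steps above, the sketch below reuses the same pattern: one pass for the mean, one for the squared deviations (to get SD), and one for the cubed deviations. `run_job` and `power_sum` are hypothetical helpers for this illustration, and the formula sum((x - mean)^3) / ((n - 1) * SD^3) is the one shown in the slides.

```python
import math
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Run one in-memory map -> shuffle -> reduce pass."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    return {key: reducer(values) for key, values in grouped.items()}

def power_sum(numbers, mean, p):
    """One pass: map x to (1, (x - mean)^p), reduce by summing."""
    return run_job(numbers, lambda x: [(1, (x - mean) ** p)], sum)[1]

numbers = [4, 8, 12, 6, 5, 3, 9, 11, 15, 7]
n = len(numbers)

# Mean job, then the standard deviation job it feeds.
mean = run_job(numbers, lambda x: [(1, x)], lambda vs: sum(vs) / len(vs))[1]
sd = math.sqrt(power_sum(numbers, mean, 2) / (n - 1))

# Skewness job: sum of cubed deviations, scaled by (n - 1) * SD^3.
cube_sum = power_sum(numbers, mean, 3)
skewness = cube_sum / ((n - 1) * sd ** 3)

print(cube_sum, round(skewness, 3))  # 210.0 0.425
```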
Matrix multiplication
First matrix (3 x 4):
1   2   3   4
5   6   7   8
9  10  11  12

Second matrix (4 x 3):
13  14  15
16  17  18
19  20  21
22  23  24

Map Output:
Key      Value
(0, 0)   [(1, 13), (2, 16), (3, 19), (4, 22)]
(0, 1)   [(1, 14), (2, 17), (3, 20), (4, 23)]
(0, 2)   [(1, 15), (2, 18), (3, 21), (4, 24)]
Shuffle and Sort:
Group the key-value pairs by key, so all values for a specific key are sent to the same Reducer.

Reduce Step:
For each key (i, j), calculate the sum of the products of the corresponding values.

Reduce Input (for key (0, 0)):
Key      Value
(0, 0)   [(1, 13), (2, 16), (3, 19), (4, 22)]

Reduce Output:
Key      Value
(0, 0)   1*13 + 2*16 + 3*19 + 4*22 = 190

Result:
190  200  210
470  496  522
750  792  834
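The map, shuffle, and reduce steps for the matrix product can be sketched as follows. This is a simplified in-memory sketch that assumes the mapper can see both matrices at once, which matches the (row element, column element) pairs shown in the Map Output; the names `A`, `B`, `map_phase`, `shuffle_phase`, and `reduce_cell` are hypothetical.

```python
from collections import defaultdict

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
B = [[13, 14, 15], [16, 17, 18], [19, 20, 21], [22, 23, 24]]

def map_phase(a, b):
    """Emit ((i, j), (a[i][k], b[k][j])) for every output cell (i, j) and inner index k."""
    for i in range(len(a)):
        for j in range(len(b[0])):
            for k in range(len(b)):
                yield (i, j), (a[i][k], b[k][j])

def shuffle_phase(pairs):
    """Group all value pairs that share the same output cell key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

def reduce_cell(values):
    """Sum the products of the paired elements for one output cell."""
    return sum(x * y for x, y in values)

grouped = shuffle_phase(map_phase(A, B))
cells = {key: reduce_cell(values) for key, values in grouped.items()}
result = [[cells[(i, j)] for j in range(len(B[0]))] for i in range(len(A))]

print(grouped[(0, 0)])  # [(1, 13), (2, 16), (3, 19), (4, 22)]
print(result)           # [[190, 200, 210], [470, 496, 522], [750, 792, 834]]
```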
Left outer Join
Customers:
customer_id   customer_name
101           John Doe
102           Jane Smith
103           Keyne

Orders:
order_id   customer_id   order_amount
1          101           50
2          102           75
3          101           30
4          103           40

1. Map (Orders):
• Input: (key, value) pairs where the key is the customer_id and the value is the order information.
• Emit key-value pairs as (customer_id, ("order", order_info)).
2. Map (Customers):
• Input: (key, value) pairs where the key is the customer_id and the value is the customer information.
• Emit key-value pairs as (customer_id, ("customer", customer_info)).

Shuffle and Sort:
• Data is shuffled and sorted based on the common key (customer_id).
• Reduce:
◦ Input: (customer_id, [list of values]).
◦ For each key, check the values to see if there are both "order" and "customer" entries.
◦ If an "order" entry is found, combine the order and customer information. With Orders as the left table, an order that has no matching "customer" entry is still emitted, with the customer fields left empty.
◦ Emit key-value pairs as (order_id, combined_info).
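The reduce-side join described above can be sketched in plain Python using the sample tables. This is a hedged illustration: it assumes Orders is the left table (since the output is keyed by order_id), represents a missing customer as `None`, and uses hypothetical function names (`map_orders`, `map_customers`, `shuffle_phase`, `reduce_join`).

```python
from collections import defaultdict

customers = [(101, "John Doe"), (102, "Jane Smith"), (103, "Keyne")]
orders = [(1, 101, 50), (2, 102, 75), (3, 101, 30), (4, 103, 40)]

def map_orders(rows):
    """Tag each order and key it by customer_id."""
    for order_id, customer_id, amount in rows:
        yield customer_id, ("order", (order_id, amount))

def map_customers(rows):
    """Tag each customer and key it by customer_id."""
    for customer_id, name in rows:
        yield customer_id, ("customer", name)

def shuffle_phase(pairs):
    """Group tagged values by customer_id, sorted by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(sorted(grouped.items()))

def reduce_join(customer_id, values):
    """Emit one (order_id, combined_info) per order; name is None if no customer matched."""
    names = [payload for tag, payload in values if tag == "customer"]
    name = names[0] if names else None
    for tag, payload in values:
        if tag == "order":
            order_id, amount = payload
            yield order_id, (customer_id, name, amount)

def left_join(order_rows, customer_rows):
    pairs = list(map_orders(order_rows)) + list(map_customers(customer_rows))
    grouped = shuffle_phase(pairs)
    return sorted(out for cid, vals in grouped.items() for out in reduce_join(cid, vals))

joined = left_join(orders, customers)
for row in joined:
    print(row)
```

Calling `left_join(orders + [(5, 999, 10)], customers)` would additionally emit `(5, (999, None, 10))`, which is the behavior that distinguishes a left outer join from an inner join.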
