Map Reduce

MapReduce is a framework for processing large data sets in parallel across clusters, utilizing a two-step process of mapping and reducing data into key/value pairs. CGL-MapReduce enhances this by using streaming instead of a file system, improving efficiency for scientific data processing. Both frameworks support parallel programming and can handle large data volumes, with CGL-MapReduce facilitating iterative computations more effectively.

Uploaded by

Prasath SivaSubramanian

MAP REDUCE

MapReduce is a framework for writing applications that process huge amounts of data in parallel,
on large clusters of commodity hardware, in a reliable manner.

MapReduce is a processing technique and a programming model for distributed computing; the widely
used Hadoop implementation is written in Java. The MapReduce algorithm contains two important tasks,
namely Map and Reduce. The map task takes a set of data and converts it into another set of data, in
which individual elements are broken down into tuples (key/value pairs). The reduce task takes the
output of a map as its input and combines those data tuples into a smaller set of tuples. As the
order of the names in MapReduce implies, the reduce task is always performed after the map task.

MAP FUNCTION IS GIVEN AS:

REDUCE FUNCTION IS GIVEN AS:
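In the commonly used formulation, the map function has the shape (k1, v1) -> list(k2, v2) and the
reduce function has the shape (k2, list(v2)) -> list(v2). The following is a minimal single-process
sketch of these two functions in Python, using word counting as the example. The names map_fn,
reduce_fn and run_mapreduce are hypothetical and chosen only for illustration; a real framework
would distribute the map and reduce calls across a cluster.

```python
from collections import defaultdict


def map_fn(key, value):
    """Map: (k1, v1) -> list of (k2, v2).

    Here k1 is a line number and v1 is a line of text (illustrative choice).
    """
    return [(word.lower(), 1) for word in value.split()]


def reduce_fn(key, values):
    """Reduce: (k2, list of v2) -> list of v2. Sums the counts for one word."""
    return [sum(values)]


def run_mapreduce(records, map_fn, reduce_fn):
    """Hypothetical in-memory driver: map, group by key, then reduce."""
    intermediate = defaultdict(list)
    for k1, v1 in records:
        for k2, v2 in map_fn(k1, v1):
            intermediate[k2].append(v2)
    return {k2: reduce_fn(k2, v2s) for k2, v2s in sorted(intermediate.items())}


lines = [(0, "the quick brown fox"), (1, "the lazy dog"), (2, "The fox")]
counts = run_mapreduce(lines, map_fn, reduce_fn)
```

Grouping the intermediate pairs by key between the two phases is what lets each reduce call see all
the values for one key at once.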

Key features

•Supports parallel programming

•Fast

•Can handle a large amount of data

CGL MAP REDUCE:

CGL-MapReduce is another MapReduce runtime, developed by Ekanayake et al. Unlike MapReduce,
CGL-MapReduce uses streaming instead of a file system for communication, which eliminates the
overheads associated with file-based communication: the intermediate results from the map functions
are sent directly to the reduce functions. The runtime was primarily created for and tested on large
amounts of scientific data, and was compared with MapReduce. Its architecture is shown in the
following figure.

Fig: CGL-MapReduce

1. Map worker: A map worker is responsible for performing the map operation.

2. Reduce worker: A reduce worker is responsible for performing the reduce operation.

3. Content Dissemination Network: The content dissemination network handles all the communication
between the components.

4. MRDriver: MRDriver is the master worker and controls the other workers based on the instructions
given by the user program.

CGL-MapReduce differs from MapReduce mainly in that it avoids the file system and uses streaming.
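The interaction between the four components can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual CGL-MapReduce API: a Python queue.Queue stands
in for the content dissemination network, threads stand in for the workers, and the names
map_worker, reduce_worker and mr_driver are hypothetical. The point it shows is that map outputs
travel to the reduce worker as a stream, with no intermediate files.

```python
import queue
import threading
from collections import defaultdict

SENTINEL = None  # marks the end of one map worker's output stream


def map_worker(records, network):
    """Map worker: emits (key, value) pairs straight onto the network."""
    for record in records:
        for word in record.split():
            network.put((word, 1))
    network.put(SENTINEL)


def reduce_worker(network, n_mappers, results):
    """Reduce worker: receives streamed pairs, then reduces once all maps finish."""
    groups = defaultdict(list)
    finished = 0
    while finished < n_mappers:
        item = network.get()
        if item is SENTINEL:
            finished += 1
            continue
        key, value = item
        groups[key].append(value)
    for key, values in groups.items():
        results[key] = sum(values)


def mr_driver(partitions):
    """Stand-in for MRDriver: starts the workers and waits for them to finish."""
    network = queue.Queue()  # stand-in for the content dissemination network
    results = {}
    reducer = threading.Thread(
        target=reduce_worker, args=(network, len(partitions), results)
    )
    mappers = [
        threading.Thread(target=map_worker, args=(part, network))
        for part in partitions
    ]
    reducer.start()
    for m in mappers:
        m.start()
    for m in mappers:
        m.join()
    reducer.join()
    return results


streamed = mr_driver([["a b a"], ["b c"]])
```

Each map worker sends one sentinel when it is done, so the reduce worker knows that all map output
has arrived before it runs the reduce step.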

The stages of a CGL-MapReduce computation are as follows.

a. Initializing stage: The first step involves starting the MapReduce worker nodes and configuring
the MapReduce tasks. Performing this configuration up front, before the computation begins, is one of
the improvements of CGL-MapReduce and facilitates efficient iterative MapReduce computations.

b. Map stage: After the initialization step, MRDriver starts the map computation upon the instruction
of the user program. It does so by passing the variable data to the map tasks; this data is relayed
to the workers, which invoke the configured map tasks. The same mechanism allows the results of one
iteration to be passed to the next. Finally, the outputs of the map tasks are transferred directly to
the reduce workers over the content dissemination network.

c. Reduce stage: As soon as all the map tasks are completed, their outputs are transferred to the
reduce workers, which start executing the reduce tasks once they have been initialized by the
MRDriver. The output of the reduce function is sent directly to the user application.

d. Combine stage: In this stage, all the results obtained in the reduce stage are combined. In a
single-pass MapReduce computation, the results are combined directly; in an iterative computation,
they are combined in a form that allows the next iteration to proceed.

e. Termination stage: This is the final stage, in which the user program gives the command for
termination. At this stage, all the workers are terminated.
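The five stages above can be sketched as a small iterative loop. This is a single-process
illustration under stated assumptions, not CGL-MapReduce itself: one-dimensional k-means is used as
an example of an iterative computation (an illustrative choice), and every function name here is
hypothetical. It shows the static data being configured once, the variable data (the centroids)
being passed to the map stage on each iteration, and the combine stage producing the input for the
next iteration.

```python
def initialize(partitions):
    """Initializing stage: configure the map tasks once with their static data."""
    return [{"points": list(part)} for part in partitions]


def map_stage(task, centroids):
    """Map stage: receives the variable data (centroids) for this iteration."""
    out = []
    for x in task["points"]:
        nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
        out.append((nearest, (x, 1)))
    return out


def reduce_stage(pairs):
    """Reduce stage: sums the points and counts per centroid index."""
    sums = {}
    for key, (x, n) in pairs:
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + x, count + n)
    return sums


def combine_stage(sums, centroids):
    """Combine stage: builds the next centroids from the reduce outputs."""
    return [
        sums[i][0] / sums[i][1] if i in sums else c
        for i, c in enumerate(centroids)
    ]


def run(partitions, centroids, max_iters=20, tol=1e-9):
    tasks = initialize(partitions)  # done once, reused by every iteration
    for _ in range(max_iters):
        pairs = [kv for task in tasks for kv in map_stage(task, centroids)]
        new_centroids = combine_stage(reduce_stage(pairs), centroids)
        converged = all(
            abs(a - b) <= tol for a, b in zip(new_centroids, centroids)
        )
        centroids = new_centroids
        if converged:
            break  # termination stage: the user program stops the iteration
    return centroids


final = run([[1.0, 2.0], [9.0, 10.0, 11.0]], centroids=[0.0, 5.0])
```

Only the centroids change between iterations, so the task configuration built in initialize is
never repeated inside the loop.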

Key features

•Uses streaming for communication

•Supports parallelization

•Iterative in nature

•Can handle a large amount of data
