Exp 4 Word Count

The document describes a MapReduce WordCount program that counts the number of occurrences of each word in a given text input. The MapReduce algorithm contains map and reduce tasks - the map task breaks down data into key-value pairs, and the reduce task combines the output of the map into smaller sets of pairs. The WordCount program implements mapper and reducer classes to count word frequencies, with the mapper emitting <word, 1> pairs and reducer summing the counts for each word.

Uploaded by

munish kumar agarwal

Assignment 4

Objective: Run a basic Word Count MapReduce program to understand the MapReduce paradigm.
Description:
MapReduce is a processing technique and a programming model for distributed computing; the
Hadoop implementation used here is written in Java. The MapReduce algorithm contains two
important tasks, namely Map and Reduce. The map task takes a set of data and converts it into
another set of data, where individual elements are broken down into tuples (key/value pairs).
The reduce task takes the output of the map task as its input and combines those tuples into a
smaller set of tuples. As the order of the names in MapReduce implies, the reduce task is always
performed after the map task. WordCount is a simple program that counts the number of
occurrences of each word in a given text input set.
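The map, group-by-key, and reduce steps described above can be tried without a cluster. The following is only a minimal sketch in plain Java, assuming an in-memory list of lines; the class name LocalWordCount and its methods are hypothetical and are not part of Hadoop or of the assignment code.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone class: it imitates the MapReduce data flow
// in memory and does not use or require Hadoop.
class LocalWordCount {

    // Map step: emit one (word, 1) pair for every non-empty token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.split(" ")) {
            if (word.length() > 0) {
                pairs.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Group step: collect all values that share a key, with keys kept sorted.
    static Map<String, List<Integer>> group(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: sum the values collected for a single key.
    static int reduce(List<Integer> values) {
        int count = 0;
        for (int v : values) {
            count += v;
        }
        return count;
    }

    // Runs map, group, and reduce in sequence over all input lines.
    static Map<String, Integer> count(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : group(pairs).entrySet()) {
            result.put(e.getKey(), reduce(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Welcome everyone.",
                "Welcome to Hadoop lab.");
        for (Map.Entry<String, Integer> e : count(lines).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In a real job the map and reduce steps run as separate tasks on different nodes and the framework performs the grouping between them; here all three steps run in one process purely for illustration.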
Program:
Mapper Class:
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class WCMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    // Map function: emits <word, 1> for every word in the input line
    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter rep) throws IOException
    {
        String line = value.toString();

        // Splitting the line on spaces
        for (String word : line.split(" "))
        {
            // Skip empty tokens produced by consecutive spaces
            if (word.length() > 0)
            {
                output.collect(new Text(word), new IntWritable(1));
            }
        }
    }
}
Reducer:
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class WCReducer extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {

    // Reduce function: receives one word and all the counts emitted for it
    public void reduce(Text key, Iterator<IntWritable> values,
                       OutputCollector<Text, IntWritable> output,
                       Reporter rep) throws IOException
    {
        int count = 0;

        // Counting the frequency of each word
        while (values.hasNext())
        {
            IntWritable i = values.next();
            count += i.get();
        }

        output.collect(key, new IntWritable(count));
    }
}
Driver code:
import java.io.IOException;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class WCDriver extends Configured implements Tool {

    public int run(String[] args) throws IOException
    {
        // Expect two arguments: the input path and the output path
        if (args.length < 2)
        {
            System.out.println("Please give valid inputs");
            return -1;
        }

        // Start from the configuration that ToolRunner has already populated
        JobConf conf = new JobConf(getConf(), WCDriver.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        conf.setMapperClass(WCMapper.class);
        conf.setReducerClass(WCReducer.class);
        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(IntWritable.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        // Submit the job and wait for it to finish
        JobClient.runJob(conf);
        return 0;
    }

    // Main Method
    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new WCDriver(), args);
        System.out.println(exitCode);
        System.exit(exitCode);
    }
}
Output:
Input File:
Welcome everyone.
Welcome to Hadoop lab.
Today we are going to work on Hadoop MapReduce concept.

Output File:
Hadoop 2
MapReduce 1
Today 1
Welcome 2
are 1
concept. 1
everyone. 1
going 1
lab. 1
on 1
to 2
we 1
work 1
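
Because the mapper splits only on spaces, tokens such as "lab." and "concept." keep their trailing punctuation in the output above. One possible refinement, shown here only as a sketch and not as part of the assignment code, is to strip non-alphanumeric characters from both ends of each token before emitting it. The class TokenNormalizer and its method tokens are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, independent of Hadoop: shows one way a mapper
// could clean tokens before emitting <word, 1> pairs.
class TokenNormalizer {

    // Splits a line on runs of whitespace and trims characters that are
    // neither letters nor digits from the start and end of each token.
    static List<String> tokens(String line) {
        List<String> words = new ArrayList<>();
        for (String raw : line.split("\\s+")) {
            String word = raw.replaceAll("^[^\\p{L}\\p{N}]+|[^\\p{L}\\p{N}]+$", "");
            // Tokens made up only of punctuation become empty and are skipped.
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Welcome to Hadoop lab."));
    }
}
```

Letter case is left unchanged in this sketch, so "Welcome" and "welcome" would still be counted as different words; whether to lowercase tokens as well is a separate design choice.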
