
Quantitative Methods in Management
Term II
4 credits
MGT 408
DAY -2
Business Statistics
A First Course
David M. Levine
Kathryn A. Szabat
David F. Stephan
P. K. Viswanathan
Pearson Publications, 7e
Subject Outline
• Introduction (ch. 1)
• Data collection, classification and presentation (ch. 2)
• Measures of central tendency and dispersion (ch. 3)
• Correlation and regression analysis (ch. 12)
• Probability concepts (ch. 4)
• Probability distributions: binomial and Poisson (ch. 5)
• Probability distribution: normal (ch. 6)
• Sampling techniques (ch. 7)
• Estimation and inferential statistics (ch. 8)
• Testing of hypothesis, nonparametric (chi-square) (ch. 9, 11)
• Bayesian analysis and decision theory (ch. 15)
Recap -1
• Introduction
• Definition
• Importance and limitations
• Applications
• Scale of measurement
• Type of variables
– Qualitative, quantitative
– Time series and cross sectional
• Population, sample, parameter and
statistic
Types of statistics

Descriptive
inferential
Types of Statistics
• Statistics
– The branch of mathematics that transforms data into useful information for decision makers.
• Descriptive statistics
– Collecting, summarizing, and describing data
• Inferential statistics
– Drawing conclusions and/or making decisions concerning a population based only on sample data
Descriptive Statistics

• Collect data
– e.g., Survey

• Present data
– e.g., Tables and graphs

• Characterize data
– e.g., Sample mean: $\bar{X} = \dfrac{\sum X_i}{n}$
Inferential Statistics
• Estimation
– e.g., Estimate the
population mean
weight using the
sample mean weight
• Hypothesis testing
– e.g., Test the claim that
the population mean
weight is 120 pounds

Drawing conclusions about a large group of individuals based on a subset of the large group.
Descriptive Statistics

Most of the statistical information in newspapers, magazines, company reports, and other publications consists of data that are summarized and presented in a form that is easy to understand.

Such summaries of data, which may be tabular, graphical, or numerical, are referred to as descriptive statistics.
Probability
• “Inverse” of statistics
[Diagram: statistics points from what you see (the data) to the world; probability points from the world to what you see.]
– Statistics: generalizes from data to the world
– Probability: “What if …” Assuming you know how the world works, what data are you likely to see?
• Examples of probability:
– Flip coin, stock market, future sales, IRS audit, …
• Foundation for statistical inference
Statistical Inference
Statistical inference is the process of making an estimate,
prediction, or decision about a population based on a
sample.
[Diagram: a sample is drawn from the population; inference runs from the sample statistic back to the population parameter.]

What can we infer about a population’s parameters based on a sample’s statistics?
Process of Inferential Statistics
• Select a random sample from the population.
• Calculate $\bar{x}$ (the statistic) from the sample.
• Use $\bar{x}$ to estimate $\mu$ (the population parameter).
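The process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 10,000 weights, the seed, and the sample size of 100 are hypothetical and do not come from the slides.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 weights in pounds (illustrative values only)
population = [random.gauss(120, 15) for _ in range(10_000)]
mu = statistics.mean(population)   # parameter (usually unknown in practice)

# Select a simple random sample without replacement
sample = random.sample(population, 100)
x_bar = statistics.mean(sample)    # statistic used to estimate mu

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

In practice only the sample is observed, so the sample mean stands in for the unknown population mean.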
Sources of data collection
Collecting Data Correctly Is A Critical Task
DCOVA
• Need to avoid data flawed by biases, ambiguities, or other types of errors.
• Results from flawed data will be suspect or in error.
• Even the most sophisticated statistical methods are not very useful when the data are flawed.
Developing Operational Definitions Is Crucial To Avoid Confusion / Errors
• An operational definition is a clear and precise statement that provides a common understanding of meaning.
• In the absence of an operational definition, miscommunications and errors are likely to occur.
• Arriving at operational definition(s) is a key part of the Define step of DCOVA.
Why Collect Data?
• A marketing research analyst needs to assess the effectiveness of a new television advertisement.
• A pharmaceutical manufacturer needs to determine whether a new drug is more effective than those currently in use.
• An operations manager wants to monitor a manufacturing process to find out whether the quality of the product being manufactured is conforming to company standards.
• An auditor wants to review the financial transactions of a company in order to determine whether the company is in compliance with generally accepted accounting principles.
Sources of Data
• Primary Sources: The data collector is the one using the data for analysis
– Data from a political survey
– Data collected from an experiment
– Observed data
– Production data from your factory
– Your firm’s marketing studies

• Secondary Sources: The person performing data analysis is not the data collector
– Analyzing census data
– Examining data from print journals or data published on the internet
– Government data: economics and demographics
– Media reports: TV, newspapers, Internet
– Companies that specialize in gathering data
Sources of data fall into five
categories
• Data distributed by an organization or an
individual
• The outcomes of a designed experiment
• The responses from a survey
• The results of conducting an
observational study
• Data collected by ongoing business
activities
Examples Of Data Distributed By
Organizations or Individuals
• Financial data on a company provided by investment services.
• Industry or market data from market research firms and trade associations.
• Stock prices, weather conditions, and sports statistics in daily newspapers.
Examples of Data From A Designed
Experiment
• Consumer testing of different versions of a product to help determine which product should be pursued further.
• Material testing to determine which supplier’s material should be used in a product.
• Market testing on alternative product promotions to determine which promotion to use more broadly.
Data Sources
• Statistical Studies - Experimental

In experimental studies the variable of interest is first identified. Then one or more other variables are identified and controlled so that data can be obtained about how they influence the variable of interest.

The largest experimental study ever conducted is believed to be the 1954 Public Health Service experiment for the Salk polio vaccine. Nearly two million U.S. children (grades 1-3) were selected.
Examples of Survey Data
• A survey asking people which laundry detergent has the best stain-removing abilities.
• Political polls of registered voters during political campaigns.
• People being surveyed to determine their satisfaction with a recent product or service experience.
Examples of Data Collected
From Observational Studies
• Market researchers utilizing focus groups to elicit unstructured responses to open-ended questions.
• Measuring the time it takes for customers to be served in a fast food establishment.
• Measuring the volume of traffic through an intersection to determine if some form of advertising at the intersection is justified.
Data Sources

• Statistical Studies - Observational

In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest. A survey is a good example.

Studies of smokers and nonsmokers are observational studies because researchers do not determine or control who will smoke and who will not smoke.
Examples of Data Collected From
Ongoing Business Activities
• A bank studies years of financial transactions to help it identify patterns of fraud.
• Economists utilize data on searches done via Google to help forecast future economic conditions.
• Marketing companies use tracking data to evaluate the effectiveness of a web site.
Structured Data Follows An Organizing
Principle & Unstructured Data Does Not
• A Stock Ticker Provides Structured Data:
– The stock ticker repeatedly reports a company name, the
number of shares last traded, the bid price, and the
percent change in the stock price.
• Due to their inherent structure, data from tables
and forms are structured data.
• E-mails from five people concerning stock trades are
an example of unstructured data.
– In these e-mails you cannot count on the information
being shared in a specific order or format.
• This book deals exclusively with structured data
All Of The Methods In our study Deal
With Structured Data
• To use the techniques in this book on unstructured data you need to convert the unstructured data into structured data.
• For many of the questions you might want to answer, the starting point can / will be tabular data.
Data Can Be Formatted and / or
Encoded In More Than One Way
• Some electronic formats are more readily usable than others.
• Different encodings can impact the precision of numerical variables and can also impact data compatibility.
• As you identify and choose sources of data you need to consider / deal with these issues.
Data Cleaning Is Often A Necessary
Activity When Collecting Data
• Often find “irregularities” in the data
– Typographical or data entry errors
– Values that are impossible or undefined
– Missing values
– Outliers
• When found, these irregularities should be reviewed / addressed
• Both Excel & Minitab can be used to address irregularities
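The slide names Excel and Minitab as tools. As an alternative illustration, a check for two of the irregularities listed above (missing values and impossible values) might look like the following Python sketch. The records, the function name `find_irregularities`, and the plausible age range of 0 to 120 are all assumptions made for the example.

```python
# Hypothetical survey records: (respondent id, age in years); None marks a missing value
records = [(1, 34), (2, None), (3, -5), (4, 41), (5, 230), (6, 29)]

def find_irregularities(rows, low=0, high=120):
    """Return ids with missing ages and ids with ages outside [low, high]."""
    missing = [rid for rid, age in rows if age is None]
    impossible = [rid for rid, age in rows
                  if age is not None and not (low <= age <= high)]
    return missing, impossible

missing_ids, impossible_ids = find_irregularities(records)
print("missing:", missing_ids)        # ids to follow up on
print("impossible:", impossible_ids)  # ids to review or correct
```

Flagged records are only candidates for review; whether to correct, drop, or keep them is a judgment the analyst still has to make.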
After Collection It Is Often Helpful To
Recode Some Variables
• Recoding a variable can either supplement
or replace the original variable.
• Recoding a categorical variable involves
redefining categories.
• Recoding a quantitative variable involves
changing this variable into a categorical
variable.
• When recoding be sure that the new
categories are mutually exclusive (categories
do not overlap) and collectively exhaustive
(categories cover all possible values).
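A minimal sketch of recoding a quantitative variable into categories, assuming three hypothetical age groups (the cut points 30 and 50 and the function name `recode_age` are illustrative, not from the slides). Each valid age falls into exactly one branch, so the categories are mutually exclusive and collectively exhaustive.

```python
def recode_age(age):
    """Recode a numerical age into one of three hypothetical categories."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age < 30:
        return "under 30"
    if age < 50:
        return "30 to under 50"
    return "50 or over"

ages = [23, 30, 49, 50, 74]
age_groups = [recode_age(a) for a in ages]
print(age_groups)
```

Keeping `ages` alongside `age_groups` corresponds to supplementing the original variable; overwriting it would correspond to replacing it.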
Data Acquisition Considerations

Time Requirement
• Searching for information can be time consuming.
• Information may no longer be useful by the time it
is available.

Cost of Acquisition
• Organizations often charge for information even
when it is not their primary business activity.

Data Errors
• Using any data that happen to be available or were
acquired with little care can lead to misleading
information.
PRACTICE
Examples of Types of Variables
Question                                                        Responses         Variable Type
Do you have a Facebook profile?                                 Yes or No         Categorical (qualitative)
How many text messages have you sent in the past three days?   ---------------   Numerical (discrete)
How long did the mobile app update take to download?           ---------------   Numerical (continuous)
• For each of the following variables, determine whether the
variable is categorical or numerical. If the variable is
numerical, determine whether the variable is discrete or
continuous.
– Number of cellphones in the household.
– Monthly data usage (in MB)
– Number of text messages exchanged per month
– Voice usage per month (in minutes)
– Whether the cellphone is used for email
– Name of the internet service provider
– Time, in hours, spent surfing the internet per week
– Whether the individual uses a mobile phone to connect to the
internet
– Number of online purchases made in a month
• A survey in which customers taste five
different brands of ice cream, and rank their
favorites from 1 to 5, would be an example of
which type of scale of measurement?
– Ordinal
– Nominal
– Interval
– Ratio
• State whether the answer to the following question
is qualitative or quantitative data, and
indicate the appropriate measurement scale:
What is your age?
– Qualitative, ratio
– Quantitative, ratio
– Qualitative, nominal
– Quantitative, ordinal
• Abel Alonzo, Director of Human Resources, is
exploring the causes of employee absenteeism at
Batesville Bottling during the last operating year
(January 1, 1999 through December 31, 1999). For
this study, the set of all employees who worked at
Batesville Bottling during the last operating year is
a ____________________.
– Population
– Statistic
– Parameter
– Sample
• A student makes an 82 on the first test in a
statistics course. From this, she assumes that her
average at the end of the semester (after other
tests) will be about 82. This is an example of
____________________.
– Descriptive statistics
– Inferential statistics
– Nonparametric statistics
– Wishful thinking
• A statistics instructor collects information about the
background of his students. About 30% have taken
economics and about 40% have taken accounting.
There are 23 male students and 27 female students in
this class. This is an example of ____________________.
– Descriptive statistics
– Inferential statistics
– Nominal data
– Nonparametric statistics
• Abel Alonzo, Director of Human Resources, is
exploring the causes of employee absenteeism
at Batesville Bottling during the last operating
year (January 1, 1999 through December 31,
1999). The average number of absences per
employee, computed from the personnel data of
all employees, is a ____________________.
– Parameter (marked as selected in the source)
– Population
– Sample
– Statistic
• Pinky Bauer, Chief Financial Officer of Harrison
Haulers, Inc., suspects irregularities in the payroll
system, and orders an inspection of "each and
every payroll voucher issued since January 1,
1991". Five percent of the payroll vouchers
contained material errors. This is an example of ____________________.
– Nonparametric statistics
– Nominal data
– Inferential statistics
– Descriptive statistics (marked as selected in the source)
Data Mining
• Search for patterns in large data sets
– Business data: marketing, finance, production ...
– Collected for some purpose, often useful for others
– From government or private companies
• Makes use of
– Statistics: all the basic activities, plus prediction, classification, clustering
– Computer science: efficient algorithms (instructions) for collecting, maintaining, organizing, analyzing data
– Optimization: calculations to achieve a goal, i.e., maximize or minimize (e.g. sales or costs)
Census Bureau County Data
• Over 1,000 counties with demographic, social,
economic, and housing data available for
mining
Clusters of Households
• Identified through data mining (A Classification of Residential Neighborhoods)
[Diagram: households are divided into summary groups, and each summary group into segments; only part of the tree is shown.]
– Summary group: Affluent Families
– Segments: Top One Percent, Wealthy Seaboard Suburbs, Upper Income Empty Nesters, Successful Suburbanites, Prosperous Baby Boomers, Semirural Lifestyle
– Summary group: Young Mobile Adults
– Segments: Twentysomethings, College Campuses, Military Proximity
Computers and Statistical Analysis

• Statisticians often use computer software to perform the statistical computations required with large amounts of data.
• To facilitate computer usage, many of the data sets in this book are available on the website that accompanies the text.
• The data files may be downloaded in either Minitab or Excel formats.
• Also, the Excel add-in StatTools can be downloaded from the website.
• Chapter-ending appendices cover the step-by-step procedures for using Minitab, Excel, and StatTools.
Statistical software
• MS- EXCEL
• Minitab
• SAS
• SPSS
• StatTools

(Chapter 1, pages 1-32)


Organizing and visualizing
variables
Chapter 2
Pg. 33-98
CLASSIFICATION OF DATA

• Qualitative
• Quantitative
• Geographical
• Chronological
– Time series: a set of observations collected at usually discrete and equally spaced time intervals, e.g., the daily closing price of a certain stock recorded over the last six weeks
– Cross-sectional: observations from different individuals or groups at a single point in time, e.g., the inventory of all ice creams in stock at a particular store
PRESENTATION OF DATA

TABULAR
DIAGRAMS
GRAPHS
• TABULATION

SPECIMEN OF A TABLE

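[Table layout: a stub heading, column captions, and a Total column across the top; stub entries down the left beside the body of the table; a Total row ending in the grand total; footnote and sources beneath the table.]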
DESCRIPTIVE STATISTICS:
ORGANIZING AND VISUALIZING
VARIABLES
CHAPTER 2
Descriptive Statistics:
Tabular and Graphical
Presentations
• Summarizing Categorical Data
• Summarizing Quantitative Data

Categorical data use labels or names to identify categories of like items.

Quantitative data are numerical values that indicate how much or how many.
Categorical Data Are Organized By Utilizing Tables
Categorical data are tallied in one of two ways:
• One categorical variable: summary table
• Two categorical variables: contingency table
Organizing Categorical Data: Summary Table
• A summary table tallies the frequencies or percentages of items in a set of categories so that you can see differences between categories.

Main Reason Young Adults Shop Online

Reason For Shopping Online? Percent


Better Prices 37%
Avoiding holiday crowds or hassles 29%
Convenience 18%
Better selection 13%
Ships directly 3%

Source: Data extracted and adapted from “Main Reason Young Adults Shop Online?”
USA Today, December 5, 2012, p. 1A.
Frequency Distribution

A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several non-overlapping classes.

The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
Relative Frequency Distribution

The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.

A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
Percent Frequency Distribution
The percent frequency of a class is the relative frequency multiplied by 100.

A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
Frequency Distribution…
Example: 4 soft drinks, 15 households
Coke Pepsi 7 Up Coke Mirinda
Coke 7 Up 7 Up Coke Coke
Mirinda 7 Up Coke Mirinda Coke
Drink Frequency
Coke 7
Pepsi 1
Mirinda 3
7 Up 4
Total 15
Frequency Distribution…
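Soft Drink   Frequency   Relative frequency   Percent frequency
Coke         7           0.47                 47
Pepsi        1           0.07                 7
Mirinda      3           0.20                 20
7 Up         4           0.27                 27
Total        15          1.00                 100

(7/15 = 0.4667 rounds to 0.47. The rounded entries add to 1.01 and 101; the unrounded values add to exactly 1 and 100.)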
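The tallying in this example can be reproduced with a short Python sketch using `collections.Counter` from the standard library. The variable names are illustrative. Note that the unrounded relative frequencies sum to 1, while values rounded to two decimals may not.

```python
from collections import Counter

# The 15 household responses from the example
responses = [
    "Coke", "Pepsi", "7 Up", "Coke", "Mirinda",
    "Coke", "7 Up", "7 Up", "Coke", "Coke",
    "Mirinda", "7 Up", "Coke", "Mirinda", "Coke",
]

freq = Counter(responses)                       # frequency of each drink
n = len(responses)
rel_freq = {drink: count / n for drink, count in freq.items()}
pct_freq = {drink: 100 * r for drink, r in rel_freq.items()}

for drink in ["Coke", "Pepsi", "Mirinda", "7 Up"]:
    print(f"{drink:8s} {freq[drink]:2d} {rel_freq[drink]:.2f} {pct_freq[drink]:.0f}")
```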
Frequency Distribution
• Example: Marada Inn

Guests staying at Marada Inn were asked to rate the quality of their
accommodations as being excellent, above average, average, below
average, or poor. The ratings provided by a sample of 20 guests are:

Below Average Average Above Average


Above Average Above Average Above Average
Above Average Below Average Below Average
Average Poor Poor
Above Average Excellent Above Average
Average Above Average Average
Above Average Average
Frequency Distribution

• Example: Marada Inn

Rating Frequency
Poor 2
Below Average 3
Average 5
Above Average 9
Excellent 1
Total 20
Relative Frequency and Percent Frequency Distributions
• Example: Marada Inn

Rating          Relative Frequency   Percent Frequency
Poor            .10                  10
Below Average   .15                  15
Average         .25                  25
Above Average   .45                  45
Excellent       .05                  5
Total           1.00                 100

For example, the relative frequency of Excellent is 1/20 = .05, and the percent frequency of Poor is .10(100) = 10.
A Contingency Table Helps Organize Two or More
Categorical Variables
• Used to study patterns that may exist between the responses of two or more categorical variables
• Cross tabulates or tallies jointly the responses of the categorical variables
• For two variables the tallies for one variable are located in the rows and the tallies for the second variable are located in the columns
Contingency Table - Example
• A random sample of 400 invoices is drawn.
• Each invoice is categorized as a small, medium, or large amount.
• Each invoice is also examined to identify if there are any errors.
• These data are then organized in the contingency table below.

Contingency Table Showing Frequency of Invoices Categorized By Size and The Presence Of Errors

                 No Errors   Errors   Total
Small Amount     170         20       190
Medium Amount    100         40       140
Large Amount     65          5        70
Total            335         65       400
Contingency Table Based On Percentage Of Overall Total
Each count is divided by the overall total of 400:
42.50% = 170 / 400
25.00% = 100 / 400
16.25% = 65 / 400

                 No Errors   Errors    Total
Small Amount     42.50%      5.00%     47.50%
Medium Amount    25.00%      10.00%    35.00%
Large Amount     16.25%      1.25%     17.50%
Total            83.75%      16.25%    100.0%

83.75% of sampled invoices have no errors and 47.50% of sampled invoices are for small amounts.
Contingency Table Based On Percentage of Row Totals
Each count is divided by its row total:
89.47% = 170 / 190
71.43% = 100 / 140
92.86% = 65 / 70

                 No Errors   Errors    Total
Small Amount     89.47%      10.53%    100.0%
Medium Amount    71.43%      28.57%    100.0%
Large Amount     92.86%      7.14%     100.0%
Total            83.75%      16.25%    100.0%

Medium invoices have a larger chance (28.57%) of having errors than small (10.53%) or large (7.14%) invoices.
Contingency Table Based On Percentage Of Column
Totals
Each count is divided by its column total:
50.75% = 170 / 335
30.77% = 20 / 65

                 No Errors   Errors    Total
Small Amount     50.75%      30.77%    47.50%
Medium Amount    29.85%      61.54%    35.00%
Large Amount     19.40%      7.69%     17.50%
Total            100.0%      100.0%    100.0%

There is a 61.54% chance that invoices with errors are of medium size.
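The three percentage views (overall total, row totals, column totals) differ only in the denominator. A minimal Python sketch using the invoice counts from the example follows; the dictionary layout and the helper name `pct` are choices made for this illustration.

```python
# Observed counts from the invoice example: size -> (no errors, errors)
counts = {
    "Small": (170, 20),
    "Medium": (100, 40),
    "Large": (65, 5),
}

grand_total = sum(sum(row) for row in counts.values())
col_totals = [sum(row[j] for row in counts.values()) for j in range(2)]

def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)

# Same counts, three different denominators
overall = {size: tuple(pct(c, grand_total) for c in row) for size, row in counts.items()}
by_row = {size: tuple(pct(c, sum(row)) for c in row) for size, row in counts.items()}
by_col = {size: tuple(pct(c, col_totals[j]) for j, c in enumerate(row))
          for size, row in counts.items()}

print(overall["Small"])
print(by_row["Medium"])
print(by_col["Medium"])
```

Which view to use depends on the question: row percentages compare error rates across invoice sizes, while column percentages describe the size mix within each error category.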
Quantitative Data

Quantitative data indicate how many or how much:
• discrete, if measuring how many
• continuous, if measuring how much

Quantitative data are always numeric.

Ordinary arithmetic operations are meaningful for quantitative data.
Ungrouped Versus Grouped Data

• Ungrouped data
• have not been summarized in any way
• are also called raw data
• Grouped data
• have been organized into a frequency
distribution
Tables Used For Organizing
Numerical Data
Numerical data can be organized in three kinds of table:
• Ordered array
• Frequency distributions
• Cumulative distributions
Organizing Numerical Data:
Ordered Array
• An ordered array is a sequence of data, in rank order, from the smallest value to the largest value.
• Shows range (minimum value to maximum value)
• May help identify outliers (unusual observations)

Age of Surveyed College Students
Day Students:     16 17 17 18 18 18 19 19 20 20 21 22 22 25 27 32 38 42
Night Students:   18 18 19 19 20 21 23 28 32 33 41 45
Organizing Numerical Data:
Frequency Distribution
• The frequency distribution is a summary table in which the data are arranged into numerically ordered classes.
• You must give attention to selecting the appropriate number of class groupings for the table, determining a suitable width of a class grouping, and establishing the boundaries of each class grouping to avoid overlapping.
• The number of classes depends on the number of values in the data. With a larger number of values, typically there are more classes. In general, a frequency distribution should have at least 5 but no more than 15 classes.
• To determine the width of a class interval, you divide the range (highest value - lowest value) of the data by the number of class groupings desired.
Organizing Numerical Data:
Frequency Distribution Example
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:

24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Organizing Numerical Data:
Frequency Distribution Example
• Sort raw data in ascending order:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
• Find range: 58 - 12 = 46
• Select number of classes: 5 (usually between 5 and 15)
• Compute class interval (width): 10 (46/5 = 9.2, then round up)
• Determine class boundaries (limits):
– Class 1: 10 but less than 20
– Class 2: 20 but less than 30
– Class 3: 30 but less than 40
– Class 4: 40 but less than 50
– Class 5: 50 but less than 60
• Compute class midpoints: 15, 25, 35, 45, 55
• Count observations & assign to classes
Organizing Numerical Data: Frequency Distribution
Example
Data in ordered array:
12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58

Class                 Midpoint   Frequency
10 but less than 20   15         3
20 but less than 30   25         6
30 but less than 40   35         5
40 but less than 50   45         4
50 but less than 60   55         2
Total                            20
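The steps of this example (range, class width, boundaries, midpoints, counts) can be sketched in Python as follows. Rounding the width up to the next multiple of 10, and starting the first class at a multiple of the width, are assumptions that happen to match the class boundaries used above; other conventions for a "convenient" width are possible.

```python
import math

temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

data_range = max(temps) - min(temps)        # 58 - 12 = 46
num_classes = 5
raw_width = data_range / num_classes        # 9.2
width = math.ceil(raw_width / 10) * 10      # assumed convention: next multiple of 10
start = (min(temps) // width) * width       # lower boundary of the first class

freq = [0] * num_classes
for t in temps:
    # class k covers start + k*width up to, but not including, start + (k+1)*width
    freq[(t - start) // width] += 1

for k, f in enumerate(freq):
    lower = start + k * width
    print(f"{lower} but less than {lower + width}: "
          f"midpoint {lower + width / 2:.0f}, frequency {f}")
```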
Organizing Numerical Data: Relative & Percent
Frequency Distribution Example
Class                 Frequency   Relative Frequency   Percentage
10 but less than 20   3           .15                  15%
20 but less than 30   6           .30                  30%
30 but less than 40   5           .25                  25%
40 but less than 50   4           .20                  20%
50 but less than 60   2           .10                  10%
Total                 20          1.00                 100%

Relative Frequency = Frequency / Total, e.g. 0.10 = 2 / 20


Organizing Numerical Data: Cumulative
Frequency Distribution Example
Class                 Frequency   Percentage   Cumulative Frequency   Cumulative Percentage
10 but less than 20   3           15%          3                      15%
20 but less than 30   6           30%          9                      45%
30 but less than 40   5           25%          14                     70%
40 but less than 50   4           20%          18                     90%
50 but less than 60   2           10%          20                     100%
Total                 20          100%

Cumulative Percentage = Cumulative Frequency / Total * 100, e.g. 45% = 100 * 9 / 20
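A cumulative distribution is a running total of the class frequencies. One way to sketch it in Python is with `itertools.accumulate` from the standard library, applied to the class frequencies of the temperature example:

```python
from itertools import accumulate

freq = [3, 6, 5, 4, 2]   # class frequencies from the temperature example
total = sum(freq)

cum_freq = list(accumulate(freq))                # running totals
cum_pct = [100 * c / total for c in cum_freq]    # cumulative percentage per class

print(cum_freq)
print(cum_pct)
```

The last cumulative frequency equals the number of observations, and the last cumulative percentage equals 100.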


• LESS THAN CUMULATIVE FREQUENCY SERIES

HOURS          NO. OF WORKERS

LESS THAN 10 5
LESS THAN 30 15
LESS THAN 60 30
LESS THAN 90 50
• MORE THAN CUMULATIVE FREQUENCY SERIES

PROFITS (RS. IN LAKHS) NO. OF COMPANIES

MORE THAN 100 150


MORE THAN 150 90
MORE THAN 200 40
MORE THAN 250 5
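Both cumulative series can be converted back to ordinary class frequencies by taking differences of consecutive entries. The sketch below uses the figures from the two series above; the function names are hypothetical, and the first class of the "less than" series and the last class of the "more than" series remain open-ended.

```python
def class_frequencies_from_less_than(cumulative):
    """Differences of consecutive 'less than' cumulative counts."""
    return [c - p for p, c in zip([0] + cumulative[:-1], cumulative)]

def class_frequencies_from_more_than(cumulative):
    """Differences of consecutive 'more than' cumulative counts."""
    return [c - n for c, n in zip(cumulative, cumulative[1:] + [0])]

less_than = [5, 15, 30, 50]     # workers with less than 10, 30, 60, 90 hours
more_than = [150, 90, 40, 5]    # companies with profits above 100, 150, 200, 250 (Rs. lakhs)

workers_per_class = class_frequencies_from_less_than(less_than)
companies_per_class = class_frequencies_from_more_than(more_than)

print(workers_per_class)     # classes: under 10, 10-30, 30-60, 60-90
print(companies_per_class)   # classes: 100-150, 150-200, 200-250, above 250
```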
• INCLUSIVE CLASS INTERVAL

CLASS INTERVAL FREQUENCY

10 – 19 17
20 – 29 15
30 – 39 12
40 – 49 10
• EXCLUSIVE CLASS INTERVAL

REVENUE (RS.)   NO. OF PRODUCTS
100 – 200 15
200 – 300 20
300 – 400 10
400 – 500 5
TOTAL 50
• OPEN END CLASS INTERVAL

SALARY (RS.) NO. OF CLERKS


LESS THAN 1500 10
1500 – 1700 25
1700 – 1900 45
1900 – 2100 11
MORE THAN 2100 9
TOTAL 100
Why Use a Frequency Distribution?
• It condenses the raw data into a more
useful form
• It allows for a quick visual interpretation
of the data
• It enables the determination of the
major characteristics of the data set
including where the data are
concentrated / clustered
Frequency Distributions:
Some Tips
• Different class boundaries may provide different pictures for the same data (especially for smaller data sets)
• Shifts in data concentration may show up when different class boundaries are chosen
• As the size of the data set increases, the impact of alterations in the selection of class boundaries is greatly reduced
• When comparing two or more groups with different sample sizes, you must use either a relative frequency or a percentage distribution
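To illustrate the last tip, the sketch below compares the day students (18 observations) and night students (12 observations) from the ordered-array example on a percentage basis. The class boundaries 10, 20, 30, 50 and the function name `percent_distribution` are assumptions chosen for this illustration.

```python
day = [16, 17, 17, 18, 18, 18, 19, 19, 20, 20, 21, 22, 22, 25, 27, 32, 38, 42]
night = [18, 18, 19, 19, 20, 21, 23, 28, 32, 33, 41, 45]

def percent_distribution(values, edges):
    """Percentage of values in each class from edges[i] up to, not including, edges[i+1]."""
    counts = [sum(lo <= v < hi for v in values) for lo, hi in zip(edges, edges[1:])]
    return [round(100 * c / len(values), 1) for c in counts]

edges = [10, 20, 30, 50]   # assumed classes: under 20, 20 to under 30, 30 to under 50
day_pct = percent_distribution(day, edges)
night_pct = percent_distribution(night, edges)

print("day:  ", day_pct)
print("night:", night_pct)
```

Raw counts would make the day group look larger in every class simply because it has more students; percentages put the two groups on the same scale.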
FEW MORE EXAMPLES
Frequency Distribution

• Example: Hudson Auto Repair


Sample of Parts Cost($) for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Relative Frequency and Percent Frequency Distributions
• Example: Hudson Auto Repair

Parts Cost ($)   Relative Frequency   Percent Frequency
50-59            .04                  4
60-69            .26                  26
70-79            .32                  32
80-89            .14                  14
90-99            .14                  14
100-109          .10                  10
Total            1.00                 100

For the 50-59 class, the relative frequency is 2/50 = .04 and the percent frequency is .04(100) = 4. Percent frequency is the relative frequency multiplied by 100.
Relative Frequency and Percent Frequency Distributions
• Example: Hudson Auto Repair
Insights Gained from the % Frequency Distribution:
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third)
of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
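The insights above can be checked directly from the 50 parts costs. This is a minimal Python sketch; keying each inclusive class (50-59, 60-69, ...) by its lower limit is a choice made for the illustration.

```python
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

# Inclusive classes 50-59, 60-69, ..., 100-109, keyed by lower limit
counts = {lower: 0 for lower in range(50, 110, 10)}
for c in costs:
    counts[(c // 10) * 10] += 1

pct = {lower: 100 * k / len(costs) for lower, k in counts.items()}
under_70 = pct[50] + pct[60]     # share of parts costs below $70
at_least_100 = pct[100]          # share of parts costs of $100 or more

print(pct)
print(under_70, at_least_100)
```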
Example of Ungrouped Data
Ages of a Sample of Managers from Urban Child Care Centers in the United States

42 26 32 34 57
30 58 37 50 30
53 40 30 47 49
50 40 32 31 40
52 28 23 35 25
30 36 32 26 50
55 30 58 64 52
49 33 43 46 32
61 31 30 40 60
74 37 29 43 54
Frequency Distribution of Child Care Managers’ Ages

Class Interval Frequency


20-under 30 6
30-under 40 18
40-under 50 11
50-under 60 11
60-under 70 3
70-under 80 1
Data Range
Range = Largest - Smallest = 74 - 23 = 51
(In the sample of managers’ ages, the smallest value is 23 and the largest is 74.)
Number of Classes and Class Width
• The number of classes should be between 5 and 15.
• Fewer than 5 classes cause excessive
summarization.
• More than 15 classes leave too much detail.
• Class Width
• Divide the range by the number of classes for an
approximate class width
• Round up to a convenient number
$\text{Approximate class width} = \dfrac{51}{6} = 8.5$
Class Width = 10
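The class-width calculation for the managers’ ages can be sketched as follows. Treating "a convenient number" as the next multiple of 10 is an assumption that matches the width of 10 used here; the resulting counts can be compared with the frequency distribution of the ages.

```python
import math

ages = [
    42, 26, 32, 34, 57, 30, 58, 37, 50, 30,
    53, 40, 30, 47, 49, 50, 40, 32, 31, 40,
    52, 28, 23, 35, 25, 30, 36, 32, 26, 50,
    55, 30, 58, 64, 52, 49, 33, 43, 46, 32,
    61, 31, 30, 40, 60, 74, 37, 29, 43, 54,
]

data_range = max(ages) - min(ages)                 # 74 - 23 = 51
num_classes = 6
approx_width = data_range / num_classes            # 8.5
class_width = math.ceil(approx_width / 10) * 10    # assumed convention: next multiple of 10

start = (min(ages) // class_width) * class_width   # lower boundary of the first class
freq = [0] * num_classes
for a in ages:
    freq[(a - start) // class_width] += 1          # classes: 20-under 30, 30-under 40, ...

print(data_range, approx_width, class_width)
print(freq)
```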
Relative Frequency
Relative
Class Interval Frequency Frequency
20-under 30 6 .12
30-under 40 18 .36
40-under 50 11 .22
50-under 60 11 .22
60-under 70 3 .06
70-under 80 1 .02
Total 50 1.00
Cumulative Frequency
Cumulative
Class Interval Frequency Frequency
20-under 30 6 6
30-under 40 18 24
40-under 50 11 35
50-under 60 11 46
60-under 70 3 49
70-under 80 1 50
Total 50
Class Midpoints, Relative Frequencies, and
Cumulative Frequencies

Class Interval   Frequency   Midpoint   Relative Frequency   Cumulative Frequency
20-under 30      6           25         .12                  6
30-under 40      18          35         .36                  24
40-under 50      11          45         .22                  35
50-under 60      11          55         .22                  46
60-under 70      3           65         .06                  49
70-under 80      1           75         .02                  50
Total            50                     1.00
Cumulative Relative Frequencies

Class Interval   Frequency   Relative Frequency   Cumulative Frequency   Cumulative Relative Frequency
20-under 30      6           .12                  6                      .12
30-under 40      18          .36                  24                     .48
40-under 50      11          .22                  35                     .70
50-under 60      11          .22                  46                     .92
60-under 70      3           .06                  49                     .98
70-under 80      1           .02                  50                     1.00
Total            50          1.00
Cumulative Distributions

• The last entry in a cumulative frequency distribution always equals the total number of observations.
• The last entry in a cumulative relative frequency distribution always equals 1.00.
• The last entry in a cumulative percent frequency distribution always equals 100.
PRACTICE QUESTIONS
• The response to a question has three
alternatives: A, B and C. A sample of 120
responses provides 60 A, 24 B and 36 C. Show
the frequency and relative frequency
distributions.
A 60 0.5
B 24
C 36
• A partial relative frequency distribution is given .

Class   Relative frequency
A       0.22
B       0.18
C       0.40
D

• What is the relative frequency of class D?


• The total sample size is 200. What is the frequency of class D?
• Show the frequency distribution.
• Show the percent frequency distribution
• A ________________is a tabular summary of
data showing the number of items in each of
several non-overlapping classes.
– Frequency distribution
– Relative frequency
– Probability distribution
– Cumulative distribution
• When studying the simultaneous responses to
two categorical questions, you should set up a
• a) contingency table.
• b) frequency distribution table.
• c) cumulative percentage distribution table.
• d) histogram.
• In a cumulative relative frequency distribution,
the last class will have a cumulative relative
frequency equal to
• a. one
• b. zero
• c. the total number of elements in the data set
• d. None of these alternatives is correct.
