You are on page 1of 49

Chapter 3

Data Description

1
Chapter 3 Overview
Introduction
◼ 3-1 Measures of Central Tendency
◼ 3-2 Measures of Variation
◼ 3-3 Measures of Position
◼ 3-4 Exploratory Data Analysis

Bluman, Chapter 3 2
Chapter 3 Objectives
1. Summarize data using measures of
central tendency.
2. Describe data using measures of
variation.
3. Identify the position of a data value in a
data set.
4. Use boxplots and five-number
summaries to discover various aspects
of data.
Bluman, Chapter 3 3
Introduction
Traditional Statistics
◼ Average
◼ Variation
◼ Position

Bluman, Chapter 3 4
3.1 Measures of Central Tendency
◼ A statistic is a characteristic or measure
obtained by using the data values from a
sample.
◼ A parameter is a characteristic or
measure obtained by using all the data
values for a specific population.

Bluman, Chapter 3 5
Measures of Central Tendency
General Rounding Rule
The basic rounding rule is that rounding
should not be done until the final answer is
calculated. Use of parentheses on
calculators or use of spreadsheets help to
avoid early rounding error.

Bluman, Chapter 3 6
Measures of Central Tendency
What Do We Mean By Average?
Mean

Median

Mode

Midrange

Weighted Mean

Bluman, Chapter 3 7
Measures of Central Tendency:
Mean
◼ The mean is the quotient of the sum of
the values and the total number of values.
◼ The symbol X is used for sample mean.
X=
X1 + X 2 + X 3 + + Xn
=
 X
n n
◼ For a population, the Greek letter μ (mu)
is used for the mean.
=
X1 + X 2 + X 3 + + XN
=
 X
N N
Bluman, Chapter 3 8
Chapter 3
Data Description

Section 3-1
Example 3-1
Page #106

Bluman, Chapter 3 9
Example 3-1: Days Off per Year
The data represent the number of days off per
year for a sample of individuals selected from
nine different countries. Find the mean.
20, 26, 40, 36, 23, 42, 35, 24, 30

X=
X1 + X 2 + X 3 + + Xn
=
 X
n n
20 + 26 + 40 + 36 + 23 + 42 + 35 + 24 + 30 276
X= = = 30.7
9 9

The mean number of days off is 30.7 days.

Bluman, Chapter 3 10
Rounding Rule: Mean
The mean should be rounded to one more
decimal place than occurs in the raw data.
The mean, in most cases, is not an actual
data value.

Bluman, Chapter 3 11
Measures of Central Tendency:
Mean for Grouped Data
◼ The mean for grouped data is calculated
by multiplying the frequencies and
midpoints of the classes.

X=
 f X m

Bluman, Chapter 3 12
Chapter 3
Data Description

Section 3-1
Example 3-3
Page #107

Bluman, Chapter 3 13
Example 3-3: Miles Run
Below is a frequency distribution of miles
run per week. Find the mean.
Class Boundaries Frequency
5.5 - 10.5 1
10.5 - 15.5 2
15.5 - 20.5 3
20.5 - 25.5 5
25.5 - 30.5 4
30.5 - 35.5 3
35.5 - 40.5 2
f = 20
Bluman, Chapter 3 14
Example 3-3: Miles Run
Class Frequency, f Midpoint, Xm f ·Xm
5.5 - 10.5 1 8 8
10.5 - 15.5 2 13 26
15.5 - 20.5 3 18 54
20.5 - 25.5 5 23 115
25.5 - 30.5 4 28 112
30.5 - 35.5 3 33 99
35.5 - 40.5 2 38 76
f = 20  f ·Xm = 490

X=
 f X m
=
490
= 24.5 miles
n 20
Bluman, Chapter 3 15
Measures of Central Tendency:
Weighted Mean
◼ The weighted mean
Let the values X 1, X 2 , ,X n
with the corresponding weight w 1, w 2 , ,w n .
The weighted mean is given by

X=
w1 X 1 + w2 X 2 + + wn X n
=
 wX
w1 + w2 + + wn w
Bluman, Chapter 3 16
Example 3-17:Page #115 Grade Point Average
A student received the following grades. Find
the corresponding GPA.
Course Credits, w Grade, X
English Composition 3 A (4 points)
Introduction to Psychology 3 C (2 points)
Biology 4 B (3 points)
Physical Education 2 D (1 point)

X=  wX
=
3  4 + 3  2 + 4  3 + 2 1 32
= = 2.7
w 3+3+ 4 + 2 12

The grade point average is 2.7.


Bluman, Chapter 3 17
Exercise :No 26 Page #120.
◼ Find the weighted mean price of three models
of automobiles sold. The number and price of
each model sold are shown in this list.
Model Number, w Price, X
A 8 $ 10,000
B 10 $ 12,000
C 12 $ 8,000

X=  wX =
(10000)(8) + (12000)(10) + (8000)(12)
w 8 + 10 + 12
296000
= = 9866.67$
30
Bluman, Chapter 3 18
Chapter 3
Exercises (3-1)

Number 27, 28, 32,33


Page #120

Bluman, Chapter 3 19
Measures of Central Tendency:
Median
◼ The median is the midpoint of the data
array. The symbol for the median is MD.
◼ The median will be one of the data values
if there is an odd number of values.
◼ The median will be the average of two
data values if there is an even number of
values.
Bluman, Chapter 3 20
Chapter 3
Data Description

Section 3-1
Example 3-4
Page #110

Bluman, Chapter 3 21
Example 3-4: Hotel Rooms
The number of rooms in the seven hotels in
downtown Pittsburgh is 713, 300, 618, 595,
311, 401, and 292. Find the median.

Arrange the data in order.


292, 300, 311, 401, 596, 618, 713

Select the middle value.


MD = 401

The median is 401 rooms.


Bluman, Chapter 3 22
Chapter 3
Data Description

Section 3-1
Example 3-6
Page #110

Bluman, Chapter 3 23
Example 3-6: Tornadoes in the U.S.
The number of tornadoes that have
occurred in the United States over an 8-
year period follows. Find the median.
684, 764, 656, 702, 856, 1133, 1132, 1303

Arrange the data in order


656, 684, 702, 764, 856, 1132, 1133, 1303

Find the average of the two middle values


764 + 856 1620
MD = = = 810
2 2
Bluman, Chapter 3 24
Example: Find MD 5.40 1.10 0.42 0.73 0.48 1.10
Solution 0.42 0.48 0.73 1.10 1.10 5.40

(in order - even number of values – no exact middle


shared by two numbers)

0.73 + 1.10
2
MEDIAN is 0.915

Example: Find MD 5.40 1.10 0.42 0.73 0.48 1.10 0.66


Solution 0.42 0.48 0.66 0.73 1.10 1.10 5.40
(in order - odd number of values)

exact middle MEDIAN is 0.73


Bluman, Chapter 3 25
Measures of Central Tendency: Median for
Grouped Data

Bluman, Chapter 3 26
n
− C .f
MD = 2 w + Lm
f

Where n = sum of frequencies


c.f = cumulative frequency of class
immediately preceding the median class
w = width of median class
f =frequency of median class
Lm= lower boundary of median class

Bluman, Chapter 3 27
Example
Below is a frequency distribution of miles
run per week. Find the median.
Class Boundaries Frequency
5.5 - 10.5 1
10.5 - 15.5 2
15.5 - 20.5 3
20.5 - 25.5 5
25.5 - 30.5 4
30.5 - 35.5 3
35.5 - 40.5 2
f = 20
Bluman, Chapter 3 28
Measures of Central Tendency:
Mode
◼ The mode is the value that occurs most
often in a data set.
◼ It is sometimes said to be the most typical
case.
◼ There may be no mode, one mode
(unimodal), two modes (bimodal), or many
modes (multimodal).

Bluman, Chapter 3 29
Chapter 3
Data Description

Section 3-1
Example 3-9
Page #111

Bluman, Chapter 3 30
Example 3-9: NFL Signing Bonuses
Find the mode of the signing bonuses of
eight NFL players for a specific year. The
bonuses in millions of dollars are
18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10

You may find it easier to sort first.


10, 10, 10, 11.3, 12.4, 14.0, 18.0, 34.5

Select the value that occurs the most.

The mode is 10 million dollars.

Bluman, Chapter 3 31
Chapter 3
Data Description

Section 3-1
Example 3-10
Page #112

Bluman, Chapter 3 32
Example 3-10: Coal Employees in PA
Find the mode for the number of coal employees
per county for 10 selected counties in
southwestern Pennsylvania.
110, 731, 1031, 84, 20, 118, 1162, 1977, 103, 752

No value occurs more than once.

There is no mode.

Bluman, Chapter 3 33
Chapter 3
Data Description

Section 3-1
Example 3-11
Page #112

Bluman, Chapter 3 34
Example 3-11: Licensed Nuclear
Reactors
The data show the number of licensed nuclear
reactors in the United States for a recent 15-year
period. Find the mode.
104 104 104 104 104 107 109 109 109 110
109 111 112 111 109

104 and 109 both occur the most. The data set
is said to be bimodal.

The modes are 104 and 109.


Bluman, Chapter 3 35
Chapter 3
Data Description

Section 3-1
Example 3-12
Page #111

Bluman, Chapter 3 36
Example 3-12: Miles Run per Week
Find the modal class for the frequency distribution
of miles that 20 runners ran in one week.
Class Frequency
5.5 – 10.5 1
The modal class is
10.5 – 15.5 2
20.5 – 25.5.
15.5 – 20.5 3
20.5 – 25.5 5
25.5 – 30.5 4 The mode, the midpoint
30.5 – 35.5 3 of the modal class, is
35.5 – 40.5 2 23 miles per week.

Determine the modal class.


Modal class is the class with the largest frequency
Bluman, Chapter 3 37
Exercise3-1 No. 12 Page 119
Frequency,
Class Limit Midpoint, Xm f ·Xm
f
2.48-7.48 7 4.98 34.86
7.49–12.49 3 9.99 29.97
12.50- 17.50 1 15 15
17.51 - 22.51 7 20.01 140.07
22.52 - 27.52 5 25.02 125.1
27.53 - 32.53 5 30.03 150.15

n=f = 28  f ·Xm = 495.15

X =
 f X m
=
495.15
= 17.68 $
n 28
Bluman, Chapter 3 38
Exercise3-1 No. 12 Page 119
For the median
median value n = 28 =14
2 2
median class is 17.51-22.51, C.f=11, f=7, w=5.01, L m =17.505
n
− C.f
2 (14-11)
MD= w+L m = 5.01+17.505
f 7
15.03
= +17.505=2.147+17.505=19.6524$
7
For the mode
modal classes are 2.48-7.48 and 17.51-22.51
modes are 4.98 $ and 20.01$

Bluman, Chapter 3 39
Measures of Central Tendency:
Midrange
◼ The midrange is the average of the
lowest and highest values in a data set.

Lowest + Highest
MR =
2

Bluman, Chapter 3 40
Chapter 3
Data Description

Section 3-1
Example 3-15
Page #114

Bluman, Chapter 3 41
Example 3-15: Water-Line Breaks
In the last two winter seasons, the city of
Brownsville, Minnesota, reported these
numbers of water-line breaks per month.
Find the midrange.
2, 3, 6, 8, 4, 1

1+ 8 9
MR = = = 4.5
2 2

The midrange is 4.5.

Bluman, Chapter 3 42
Properties of the Mean
➢ Uses all data values.
➢ Used in computing other statistics, such as
the variance
➢ Unique, usually not one of the data values
➢ Cannot be used with open-ended classes
➢ Affected by extremely high or low values,
called outliers

Bluman, Chapter 3 43
Properties of the Median
➢ Gives the midpoint
➢ Used when it is necessary to find out
whether the data values fall into the upper
half or lower half of the distribution.
➢ Can be used for an open-ended
distribution.
➢ Affected less than the mean by extremely
high or extremely low values.

Bluman, Chapter 3 44
Properties of the Mode
-Used when the most typical case is desired
-Easiest average to compute
-The mode can be used when the data are
nominal or categorical, such as religious
preference, gender, or political affiliation
-Not always unique or may not exist

Bluman, Chapter 3 45
Properties of the Midrange
➢ Easy to compute.
➢ Gives the midpoint.
➢ Affected by extremely high or low values in
a data set

Bluman, Chapter 3 46
Distributions

Bluman, Chapter 3 47
Bluman, Chapter 3 48
H.W
Exercises (3-1)
(1) Find (1) mean (2) median (3) mode
for the exercises Num. 12, 13 and 14
Page 119
(2) Solve the exercises, Num. 27, 28,
32,33 page 121

Bluman, Chapter 3 49

You might also like