
Copyright © 2005 Pearson Education, Inc.

Slide 13-1
SEVENTH EDITION and EXPANDED SEVENTH EDITION
Chapter 13

Statistics



13.1

Sampling Techniques



Statistics

• Statistics is the art and science of gathering, analyzing, and making inferences from numerical information (data) obtained in an experiment.
• Statistics is divided into two main branches.
  ◦ Descriptive statistics is concerned with the collection, organization, and analysis of data.
  ◦ Inferential statistics is concerned with making generalizations or predictions from the data collected.


Statisticians

• A statistician’s interest lies in drawing conclusions about possible outcomes through observations of only a few particular events.
• The population consists of all items or people of interest.
• The sample includes some of the items in the population.
• When a statistician draws a conclusion from a sample, there is always the possibility that the conclusion is incorrect.


Types of Sampling

• Random sampling occurs when a sample is drawn in such a way that each time an item is selected, each item has an equal chance of being drawn.
• When a sample is obtained by drawing every nth item on a list or production line, the sample is a systematic sample.
• A cluster sample is sometimes referred to as an area sample because it is often applied on a geographical basis.


Types of Sampling continued

• Stratified sampling involves dividing the population by characteristics such as gender, race, religion, or income, and then drawing a random sample from each group (stratum).
• Convenience sampling uses data that is easily obtained and can be extremely biased.
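The sampling techniques above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed random seed, and the `students` list are hypothetical and do not come from the slides.

```python
import random

def simple_random_sample(population, k, rng):
    """Random sampling: every item has an equal chance of being drawn."""
    return rng.sample(population, k)

def systematic_sample(population, n, start=0):
    """Systematic sampling: take every nth item, beginning at index `start`."""
    return population[start::n]

def stratified_sample(population, key, k, rng):
    """Stratified sampling: group items by `key`, then draw k at random per group."""
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    return {label: rng.sample(members, k) for label, members in strata.items()}

def cluster_sample(clusters, k, rng):
    """Cluster sampling: pick k whole clusters at random and keep every member."""
    chosen = rng.sample(sorted(clusters), k)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(0)  # fixed seed (an assumption) so the sketch is repeatable
# Hypothetical population: 60 (name, grade) pairs spread over grades 1 to 6.
students = [(f"student{i}", i % 6 + 1) for i in range(60)]

by_grade = stratified_sample(students, key=lambda s: s[1], k=3, rng=rng)
every_sixth = systematic_sample(students, 6)
```

Convenience sampling has no counterpart here because it involves no selection rule at all: it simply takes whatever data is easiest to collect, such as `students[:20]`.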


Example: Identifying Sampling Techniques

Identify the sampling technique used in each of the following.
a) A raffle ticket is drawn by a blindfolded person at a festival to win a grand prize.
b) Students at an elementary school are classified according to their present grade level. Then a random sample of three students from each grade is chosen to represent their class.
c) Every sixth car on a highway is stopped for a vehicle inspection.
d) Voters are classified based on their polling location. A random sample of four polling locations is selected. All the voters from the selected locations are included in the sample.
e) The first 20 people entering a water park are asked if they are wearing sunscreen.

Solution:
a) Random
b) Stratified
c) Systematic
d) Cluster
e) Convenience


13.2

The Misuses of Statistics



Misuses of Statistics

• Many individuals, businesses, and advertising firms misuse statistics to their own advantage.
• When examining statistical information, consider the following:
  ◦ Was the sample used to gather the statistical data unbiased and of sufficient size?
  ◦ Is the statistical statement ambiguous; that is, could it be interpreted in more than one way?


Example: Misleading Statistics

• An advertisement says, “Fly Speedway Airlines and Save 20%.”
  ◦ Here there is not enough information given.
  ◦ The “Save 20%” could be off the original ticket price, the ticket price when you buy two tickets, or another airline’s ticket price.
• A help wanted ad read, “Salesperson wanted for Ryan’s Furniture Store. Average Salary: $32,000.”
  ◦ The word “average” can be very misleading.
  ◦ If most of the salespeople earn $20,000 to $25,000 and the owner earns $76,000, this “average salary” is not a fair representation.
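To see how an "average salary" of $32,000 can arise in the furniture store example, here is a small sketch. The individual salaries are hypothetical: they were chosen only so that the salespeople fall in the $20,000 to $25,000 range, the owner earns $76,000, and the mean matches the advertised figure.

```python
from statistics import mean, median

# Hypothetical payroll (an assumption): four salespeople plus the owner.
salaries = [20_000, 20_000, 21_000, 23_000, 76_000]

advertised_average = mean(salaries)   # pulled upward by the owner's salary
typical_salary = median(salaries)     # the middle value of the ranked data

print(advertised_average, typical_salary)
```

With these made-up figures the mean is $32,000 while the median is $21,000, which is much closer to what a new salesperson might expect to earn.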


Charts and Graphs

• Charts and graphs can also be misleading.
• Even though the data is displayed correctly, adjusting the vertical scale of a graph can give a different impression.
• A circle graph can be misleading if the parts of the graph do not add up to 100%.


Example: Misleading Graphs

While each graph presents identical information, the vertical scales have been altered.

[Figure: two graphs titled "Sales", Dollars (in thousands) versus Years (1999 to 2004). One vertical scale runs from 0 to 175 in steps of 25; the other runs from 100 to 500 in steps of 100.]


13.3

Frequency Distributions



Example

• The number of pets per family is recorded for 30 families surveyed. Construct a frequency distribution of the following data:

0 0 0 0 0 0
1 1 1 1 1 1
1 1 1 1 2 2
2 2 2 2 2 2
3 3 3 3 4 4


Solution

Number of Pets    Frequency
0                 6
1                 10
2                 8
3                 4
4                 2
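As a cross-check, the same frequency distribution can be tallied with Python's standard `collections.Counter`. The variable names are illustrative only.

```python
from collections import Counter

# The 30 survey responses from the example, written compactly.
pets = [0] * 6 + [1] * 10 + [2] * 8 + [3] * 4 + [4] * 2

frequency = Counter(pets)  # maps each observed value to how often it occurs

for value in sorted(frequency):
    print(value, frequency[value])
```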


Rules for Data Grouped by Classes

• The classes should be of the same “width.”
• The classes should not overlap.
• Each piece of data should belong to only one class.


Definitions

Classes
 0 – 4
 5 – 9
10 – 14
15 – 19
20 – 24
25 – 29

The first number in each class (0, 5, 10, 15, 20, 25) is the lower class limit; the second number (4, 9, 14, 19, 24, 29) is the upper class limit.

• The midpoint of a class is found by adding the lower and upper class limits and dividing the sum by 2.


Example

• The following set of data represents the distance, in miles, that 15 randomly selected second grade students live from school.

6.8 5.3 9.7 3.8 8.7
0.5 5.9 0.8 5.7 1.3
4.8 9.6 1.5 7.4 0.2

Construct a frequency distribution with the first class 0 – 2.


Solution

• First, rearrange the data from lowest to highest.

0.2 0.5 0.8
1.3 1.5 3.8
4.8 5.3 5.7
5.9 6.8 7.4
8.7 9.6 9.7

# of miles from school    Frequency
0 - 2                     5
2.1 - 4.1                 1
4.2 - 6.2                 4
6.3 - 8.3                 2
8.4 - 10.4                3
Total                     15
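The grouping step can also be sketched in code. The class limits below are the ones from the solution table; the function name `grouped_frequencies` is a hypothetical helper, and each value is assumed to fall inside exactly one class.

```python
distances = [6.8, 5.3, 9.7, 3.8, 8.7,
             0.5, 5.9, 0.8, 5.7, 1.3,
             4.8, 9.6, 1.5, 7.4, 0.2]

# (lower class limit, upper class limit) pairs, all with the same width.
classes = [(0.0, 2.0), (2.1, 4.1), (4.2, 6.2), (6.3, 8.3), (8.4, 10.4)]

def grouped_frequencies(data, classes):
    """Count how many data values fall within each class's limits."""
    counts = {c: 0 for c in classes}
    for x in data:
        for lower, upper in classes:
            if lower <= x <= upper:
                counts[(lower, upper)] += 1
                break  # classes do not overlap, so stop at the first match
    return counts

table = grouped_frequencies(distances, classes)
for (lower, upper), count in table.items():
    print(f"{lower} - {upper}: {count}")
```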


13.4

Statistical Graphs



Circle Graphs

• Circle graphs (also known as pie charts) are often used to compare one or more parts of a whole to the whole.


Example

• According to a recent hospital survey of 200 patients, the following table indicates how often hospitals used four different kinds of painkillers. Use the information to construct a circle graph illustrating the percent each painkiller was used.

Aspirin          56
Ibuprofen        104
Acetaminophen    16
Other            24
Total            200


Solution

• Determine the measure of the corresponding central angle.

Painkiller       Number of Patients    Percent of Total         Measure of Central Angle
Aspirin          56                    56/200 × 100 = 28%       0.28 × 360° = 100.8°
Ibuprofen        104                   104/200 × 100 = 52%      0.52 × 360° = 187.2°
Acetaminophen    16                    16/200 × 100 = 8%        0.08 × 360° = 28.8°
Other            24                    24/200 × 100 = 12%       0.12 × 360° = 43.2°
Total            200                   100%                     360°
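The percent and central-angle columns follow the same two formulas for every row, so they are easy to reproduce. The sketch below assumes a hypothetical helper named `sector` and rounds to one decimal place to match the table.

```python
counts = {"Aspirin": 56, "Ibuprofen": 104, "Acetaminophen": 16, "Other": 24}
total = sum(counts.values())  # 200 patients

def sector(count, total):
    """Return (percent of total, central angle in degrees), rounded to 0.1."""
    percent = count / total * 100
    angle = count / total * 360
    return round(percent, 1), round(angle, 1)

sectors = {name: sector(count, total) for name, count in counts.items()}

for name, (percent, angle) in sectors.items():
    print(f"{name}: {percent}% of total, {angle} degrees")
```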


Solution continued

• Use a protractor to construct a circle graph and label it properly.

[Circle graph: Hospital Painkiller Use. Ibuprofen 52%, Aspirin 28%, Other 12%, Acetaminophen 8%]

Histogram

• A histogram is a graph with observed values on its horizontal scale and frequencies on its vertical scale.
• Example: Construct a histogram of the frequency distribution.

# of pets    Frequency
0            6
1            10
2            8
3            4
4            2


Solution

[Histogram: Number of Pets per Family. Horizontal axis: Number of Pets (0 to 4); vertical axis: Frequency (0 to 12). Bar heights: 6, 10, 8, 4, 2]


Frequency Polygon

[Frequency polygon: Number of Pets per Family. Horizontal axis: Number of Pets (0 to 4); vertical axis: Frequency (0 to 12)]


Stem-and-Leaf Display

• A stem-and-leaf display is a tool that organizes and groups the data while allowing us to see the actual values that make up the data.
• The left group of digits is called the stem.
• The right group of digits is called the leaf.


Example

• The table below indicates the number of miles 20 workers have to drive to work. Construct a stem-and-leaf display.

12 18 3 8 12
25 21 3 15 4
17 27 43 21 16
12 26 35 14 9


Solution

Data
12 18 3 8 12
25 21 3 15 4
17 27 43 21 16
12 26 35 14 9

Stem | Leaves
0 | 3 3 4 8 9
1 | 2 2 2 4 5 6 7 8
2 | 1 1 5 6 7
3 | 5
4 | 3
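For two-digit data like this, the stem is the tens digit and the leaf is the units digit, which `divmod(value, 10)` returns directly. The following is a minimal sketch under that assumption; the function name is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Map each stem (tens digit) to its sorted list of leaves (units digits)."""
    display = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return dict(display)

miles = [12, 18, 3, 8, 12, 25, 21, 3, 15, 4,
         17, 27, 43, 21, 16, 12, 26, 35, 14, 9]

for stem, leaves in stem_and_leaf(miles).items():
    print(stem, "|", " ".join(map(str, leaves)))
```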


13.5

Measures of Central Tendency



Definitions

• An average is a number that is representative of a group of data.
• The arithmetic mean, or simply the mean, is symbolized by $\bar{x}$ or by the Greek letter mu, $\mu$.


Mean

• The mean, $\bar{x}$, is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is

$$\bar{x} = \frac{\sum x}{n}$$

• where $\sum x$ represents the sum of all the data and $n$ represents the number of pieces of data.


Example: Find the Mean

• Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows: $327 $465 $672 $150 $230

$$\bar{x} = \frac{\sum x}{n} = \frac{\$327 + \$465 + \$672 + \$150 + \$230}{5} = \frac{\$1844}{5} = \$368.80$$


Median

• The median is the value in the middle of a set of ranked data.
• Example: Determine the median of $327 $465 $672 $150 $230.
Rank the data from smallest to largest.
$150 $230 $327 $465 $672
The middle value, $327, is the median.


Example: Median (even data)

• Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4.
Rank the data:
3 4 4 6 7 8 9 11 12 15
There are 10 pieces of data, so the median will lie halfway between the two middle pieces, 7 and 8. The median is (7 + 8)/2 = 7.5.


Mode

• The mode is the piece of data that occurs most frequently.
• Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15.
• The mode is 4 since it occurs twice and the other values occur only once.


Midrange

• The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data.

$$\text{Midrange} = \frac{\text{lowest value} + \text{highest value}}{2}$$

• Example: Find the midrange of the data set $327, $465, $672, $150, $230.

$$\text{Midrange} = \frac{\$150 + \$672}{2} = \$411$$


Example

• The weights of eight Labrador retrievers rounded to the nearest pound are 85, 92, 88, 75, 94, 88, 84, and 101. Determine the
  a) mean
  b) median
  c) mode
  d) midrange
  e) rank the measures of central tendency from lowest to highest.


Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101

• Mean

$$\bar{x} = \frac{85 + 92 + 88 + 75 + 94 + 88 + 84 + 101}{8} = \frac{707}{8} = 88.375$$

• Median: rank the data.
  75, 84, 85, 88, 88, 92, 94, 101
  The two middle values are both 88, so the median is 88.


Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101

• Mode: the number that occurs most frequently. The mode is 88.
• Midrange = (L + H)/2 = (75 + 101)/2 = 88
• Rank the measures from lowest to highest: the median, mode, and midrange are all 88, followed by the mean, 88.375.
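The four measures for the dog weights can be verified with Python's standard `statistics` module. The module has no midrange function, so `midrange` below is a small helper written for this sketch.

```python
from statistics import mean, median, mode

weights = [85, 92, 88, 75, 94, 88, 84, 101]

def midrange(data):
    """Value halfway between the lowest and highest data values."""
    return (min(data) + max(data)) / 2

summary = {
    "mean": mean(weights),
    "median": median(weights),
    "mode": mode(weights),
    "midrange": midrange(weights),
}

# Rank the measures from lowest to highest.
for name, value in sorted(summary.items(), key=lambda item: item[1]):
    print(name, value)
```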


Measures of Position

• Measures of position are often used to make comparisons.
• Two measures of position are percentiles and quartiles.


To Find the Quartiles of a Set of Data

• Order the data from smallest to largest.
• Find the median, or 2nd quartile (Q2), of the set of data. If there is an odd number of pieces of data, the median is the middle value. If there is an even number of pieces of data, the median will be halfway between the two middle pieces of data.
• The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2.
• The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.


Example: Quartiles

• The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3.

170 210 270 270 280
330 80 170 240 270
225 225 215 310 50
75 160 130 74 81
95 172 190


Example: Quartiles continued

• Order the data:

50 74 75 80 81 95 130
160 170 170 172 190 210 215
225 225 240 270 270 270 280
310 330

Q2 is the median of the entire data set, which is 190 (the 12th of the 23 values).
Q1 is the median of the 11 numbers from 50 to 172, which is 95.
Q3 is the median of the 11 numbers from 210 to 330, which is 270.
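The quartile procedure can be sketched as follows. This helper splits the ranked data by position (leaving out the middle value when the count is odd), which matches the "less than Q2" and "greater than Q2" description whenever no other value ties with the median. The function name is an assumption, and other quartile conventions exist that can give slightly different answers.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    half = len(ordered) // 2
    lower = ordered[:half]
    # Skip the middle value itself when the number of data values is odd.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

bills = [170, 210, 270, 270, 280, 330, 80, 170, 240, 270, 225, 225,
         215, 310, 50, 75, 160, 130, 74, 81, 95, 172, 190]

q1, q2, q3 = quartiles(bills)
print(q1, q2, q3)
```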


13.6

Measures of Dispersion



Measures of Dispersion

• Measures of dispersion are used to indicate the spread of the data.
• The range is the difference between the highest and lowest values; it indicates the total spread of the data.


Example: Range

• Nine different employees were selected and the amount of their salary was recorded. Find the range of the salaries.
$24,000 $32,000 $26,500
$56,000 $48,000 $27,000
$28,500 $34,500 $56,750
• Range = $56,750 − $24,000 = $32,750


Standard Deviation

• The standard deviation measures how much the data differ from the mean.

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$


To Find the Standard Deviation of a Set of Data

1. Find the mean of the set of data.
2. Make a chart having three columns:
   Data    Data − Mean    (Data − Mean)²
3. List the data vertically under the column marked Data.
4. Subtract the mean from each piece of data and place the difference in the Data − Mean column.
5. Square the values obtained in the Data − Mean column and record these values in the (Data − Mean)² column.
6. Determine the sum of the values in the (Data − Mean)² column.
7. Divide the sum obtained in step 6 by n − 1, where n is the number of pieces of data.
8. Determine the square root of the number obtained in step 7. This number is the standard deviation of the set of data.


Example

• Find the standard deviation of the following prices of selected washing machines:
$280, $217, $665, $684, $939, $299

Find the mean.

$$\bar{x} = \frac{\sum x}{n} = \frac{665 + 217 + 684 + 280 + 939 + 299}{6} = \frac{3084}{6} = 514$$


Example continued, mean = 514

Data     Data − Mean    (Data − Mean)²
217      −297           (−297)² = 88,209
280      −234           54,756
299      −215           46,225
665      151            22,801
684      170            28,900
939      425            180,625
Total    0              421,516


Example continued, mean = 514

$$s = \sqrt{\frac{421{,}516}{6 - 1}} = \sqrt{\frac{421{,}516}{5}} \approx 290.35$$

• The standard deviation is $290.35.
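The eight steps translate almost line for line into code. The sketch below uses a hypothetical helper named `sample_std` and compares its result with `statistics.stdev`, which also divides by n − 1.

```python
from math import sqrt
from statistics import stdev

def sample_std(data):
    """Standard deviation with n - 1 in the denominator, following the steps above."""
    n = len(data)
    mean = sum(data) / n                          # step 1
    squared = [(x - mean) ** 2 for x in data]     # steps 4 and 5
    return sqrt(sum(squared) / (n - 1))           # steps 6 to 8

prices = [280, 217, 665, 684, 939, 299]
s = sample_std(prices)

print(round(s, 2))
print(round(stdev(prices), 2))  # library result for comparison
```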


13.7

The Normal Curve



Types of Distributions

• Rectangular distribution
• J-shaped distribution
• Bimodal
• Skewed to right
• Skewed to left
• Normal

[Figures: one graph of Frequency versus Values for each type of distribution]


Normal Distribution

• In a normal distribution, the mean, median, and mode all have the same value.
• Z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution.

$$z = \frac{\text{value of piece of data} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$$


Example: z-scores

• A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values.
  a) 55   b) 60   c) 43

a)
$$z = \frac{\text{value of piece of data} - \text{mean}}{\text{standard deviation}}$$

$$z_{55} = \frac{55 - 50}{5} = \frac{5}{5} = 1$$

A score of 55 is one standard deviation above the mean.


Example: z-scores continued

b)
$$z_{60} = \frac{60 - 50}{5} = \frac{10}{5} = 2$$

A score of 60 is 2 standard deviations above the mean.

c)
$$z_{43} = \frac{43 - 50}{5} = \frac{-7}{5} = -1.4$$

A score of 43 is 1.4 standard deviations below the mean.
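The z-score formula is a one-line function. The sketch below (with an illustrative function name) reproduces the three results from the example; a negative result means the value lies below the mean.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

scores = {x: z_score(x, mean=50, sd=5) for x in (55, 60, 43)}

for x, z in scores.items():
    side = "above" if z > 0 else "below"
    print(f"{x} is {abs(z)} standard deviations {side} the mean")
```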


To Find the Percent of Data Between any Two Values

1. Draw a diagram of the normal curve, indicating the area or percent to be determined.
2. Use the formula to convert the given values to z-scores. Indicate these z-scores on the diagram.
3. Look up the percent that corresponds to each z-score in Table 13.
4.
  a) When finding the percent of data between two z-scores on opposite sides of the mean (when one z-score is positive and the other is negative), find the sum of the individual percents.
  b) When finding the percent of data between two z-scores on the same side of the mean (when both z-scores are positive or both are negative), subtract the smaller percent from the larger percent.
  c) When finding the percent of data to the right of a positive z-score or to the left of a negative z-score, subtract the percent of data between 0 and z from 50%.
  d) When finding the percent of data to the left of a positive z-score or to the right of a negative z-score, add the percent of data between 0 and z to 50%.


Example

• Assume that the waiting times for customers at a popular restaurant before being seated for lunch are normally distributed with a mean of 12 minutes and a standard deviation of 3 minutes.
  a) Find the percent of customers who wait for at least 12 minutes before being seated.
  b) Find the percent of customers who wait between 9 and 18 minutes before being seated.
  c) Find the percent of customers who wait at least 17 minutes before being seated.
  d) Find the percent of customers who wait less than 8 minutes before being seated.


Solution

a) Wait for at least 12 minutes
Since 12 minutes is the mean, half, or 50%, of customers wait at least 12 minutes before being seated.

b) Wait between 9 and 18 minutes

$$z_{9} = \frac{9 - 12}{3} = -1.00 \qquad z_{18} = \frac{18 - 12}{3} = 2.00$$

Use table 13.7 page 801.
34.1% + 47.7% = 81.8%


Solution continued

c) Wait at least 17 minutes

$$z_{17} = \frac{17 - 12}{3} \approx 1.67$$

Use table 13.7 page 801.
45.3% is between the mean and 1.67.
50% − 45.3% = 4.7%
Thus, 4.7% of customers wait at least 17 minutes.

d) Wait less than 8 minutes

$$z_{8} = \frac{8 - 12}{3} \approx -1.33$$

Use table 13.7 page 801.
40.8% is between the mean and −1.33.
50% − 40.8% = 9.2%
Thus, 9.2% of customers wait less than 8 minutes.
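As an alternative to looking up z-scores in a table, the same percents can be approximated with `statistics.NormalDist` from Python's standard library. This is a cross-check, not the method used in the slides, and the variable names are illustrative.

```python
from statistics import NormalDist

# Waiting times: mean 12 minutes, standard deviation 3 minutes.
wait = NormalDist(mu=12, sigma=3)

# cdf(t) is the fraction of customers who wait less than t minutes.
at_least_12 = 1 - wait.cdf(12)
between_9_and_18 = wait.cdf(18) - wait.cdf(9)
at_least_17 = 1 - wait.cdf(17)
less_than_8 = wait.cdf(8)

for label, p in [("a", at_least_12), ("b", between_9_and_18),
                 ("c", at_least_17), ("d", less_than_8)]:
    print(f"{label}) {p:.1%}")
```

The results may differ from the table-based answers by about 0.1 percentage point, because the table method rounds each z-score to two decimal places and each percent to one.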


13.8

Linear Correlation and Regression


Linear Correlation

• Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.
• The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.
  ◦ If the value is positive, as one variable increases, the other tends to increase.
  ◦ If the value is negative, as one variable increases, the other tends to decrease.
  ◦ The value of r will always be between –1 and 1, inclusive.


Scatter Diagrams

• A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).
  ◦ The independent variable, x, generally is a quantity that can be controlled.
  ◦ The dependent variable, y, is the other variable.
• The value of r is a measure of how far a set of points varies from a straight line.
  ◦ The greater the spread, the weaker the correlation and the closer the r value is to 0.


Correlation

[Figures not reproduced]


Linear Correlation Coefficient

• The formula to calculate the correlation coefficient (r) is as follows:

$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}$$


Example: Words Per Minute versus Mistakes

There are five applicants applying for a job as a medical transcriptionist. The following shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

Applicant    Words per Minute    Mistakes
Ellen        24                  8
George       67                  11
Phillip      53                  12
Kendra       41                  10
Nancy        34                  9


Solution

• We will call the words typed per minute, x, and the mistakes, y.
• List the values of x and y and calculate the necessary sums.

WPM (x)     Mistakes (y)    x²              y²           xy
24          8               576             64           192
67          11              4489            121          737
53          12              2809            144          636
41          10              1681            100          410
34          9               1156            81           306
Σx = 219    Σy = 50         Σx² = 10,711    Σy² = 510    Σxy = 2,281


Solution continued

• The n in the formula represents the number of pieces of data. Here n = 5.

$$r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{n\left(\sum x^2\right) - \left(\sum x\right)^2}\,\sqrt{n\left(\sum y^2\right) - \left(\sum y\right)^2}}$$

$$r = \frac{5(2281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}}$$

$$= \frac{11{,}405 - 10{,}950}{\sqrt{5(10{,}711) - 47{,}961}\,\sqrt{5(510) - 2500}}$$

$$= \frac{455}{\sqrt{53{,}555 - 47{,}961}\,\sqrt{2550 - 2500}}$$

$$= \frac{455}{\sqrt{5594}\,\sqrt{50}} \approx 0.86$$


Solution continued

• Since 0.86 is fairly close to 1, there is a fairly strong positive correlation.
• This result suggests that, for these applicants, the more words typed per minute, the more mistakes tend to be made.
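The correlation coefficient formula can be checked with a short function that builds the same five sums used in the solution table. The function and variable names are illustrative and not part of the slides.

```python
from math import sqrt

def correlation(xs, ys):
    """Linear correlation coefficient r computed from the sums in the formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    return numerator / denominator

wpm = [24, 67, 53, 41, 34]        # x: words per minute
mistakes = [8, 11, 12, 10, 9]     # y: number of mistakes

r = correlation(wpm, mistakes)
print(round(r, 2))
```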


Linear Regression

• Linear regression is the process of determining the linear relationship between two variables.
• The line of best fit (line of regression or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points is a minimum.


The Line of Best Fit

• Equation: y = mx + b, where

$$m = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2}, \quad \text{and} \quad b = \frac{\sum y - m\left(\sum x\right)}{n}$$


Example

• Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart.
• Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.


Solution

• From the previous results, we know that

$$m = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{n\left(\sum x^2\right) - \left(\sum x\right)^2} = \frac{5(2{,}281) - (219)(50)}{5(10{,}711) - 219^2} = \frac{455}{5594} \approx 0.081$$

• Now we find the y-intercept, b.

$$b = \frac{\sum y - m\left(\sum x\right)}{n} = \frac{50 - 0.081(219)}{5} = \frac{32.261}{5} \approx 6.452$$

Therefore the line of best fit is y = 0.081x + 6.452.


Solution continued

• To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

x     y
10    7.262
20    8.072
30    8.882
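The slope and intercept formulas for the line of best fit can be sketched the same way. The function name `line_of_best_fit` is an assumption made for this illustration.

```python
def line_of_best_fit(xs, ys):
    """Return (m, b) for the line y = mx + b using the summation formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b

wpm = [24, 67, 53, 41, 34]        # x: words per minute
mistakes = [8, 11, 12, 10, 9]     # y: number of mistakes

m, b = line_of_best_fit(wpm, mistakes)
print(round(m, 3), round(b, 3))

# Points for graphing, as in the table of x and y values.
for x in (10, 20, 30):
    print(x, round(m * x + b, 3))
```

Because the code carries the unrounded slope into the intercept, it gives b ≈ 6.437 rather than the 6.452 obtained in the worked solution, where m was first rounded to 0.081. The plotted points differ slightly for the same reason.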

