
Chapter 1
Data and Statistics

■ Applications in Business and Economics
■ Data
■ Data Sources
■ Descriptive Statistics
■ Statistical Inference

© 2003 Thomson/South-Western Slide


Applications in Business and Economics
■ Accounting
Public accounting firms use statistical sampling procedures when conducting audits for their clients.
■ Finance
Financial analysts use a variety of statistical information, including price-earnings ratios and dividend yields, to guide their investment recommendations.
■ Marketing
Electronic point-of-sale scanners at retail checkout counters are being used to collect data for a variety of marketing research applications.

■ Production
A variety of statistical quality control charts are used to monitor the output of a production process.
■ Economics
Economists use statistical information in making forecasts about the future of the economy or some aspect of it.

Data
■ Elements, Variables, and Observations
■ Scales of Measurement
■ Qualitative and Quantitative Data
■ Cross-Sectional and Time Series Data

Data and Data Sets
■ Data are the facts and figures that are collected, summarized, analyzed, and interpreted.
■ The data collected in a particular study are referred to as the data set.

Elements, Variables, and Observations
■ The elements are the entities on which data are collected.
■ A variable is a characteristic of interest for the elements.
■ The set of measurements collected for a particular element is called an observation.
■ The total number of data values in a data set is the number of elements multiplied by the number of variables.

Data, Data Sets, Elements, Variables, and Observations

Company        Stock Exchange   Annual Sales ($M)   Earn/Sh. ($)
Dataram        AMEX              73.10               0.86
EnergySouth    OTC               74.00               1.67
Keystone       NYSE             365.70               0.86
LandCare       NYSE             111.40               0.33
Psychemedics   AMEX              17.60               0.13

Each company (row) is an element; Stock Exchange, Annual Sales, and Earn/Sh. are the variables; each row of measurements is an observation; a single value is a datum. The five elements and three variables give 15 data values in this data set.

Scales of Measurement
■ Scales of measurement include:
• Nominal
• Ordinal
• Interval
• Ratio
■ The scale determines the amount of information contained in the data.
■ The scale indicates the data summarization and statistical analyses that are most appropriate.

Scales of Measurement
■ Nominal
• Data are labels or names used to identify an attribute of the element.
• A nonnumeric label or a numeric code may be used.

Scales of Measurement
■ Nominal
• Example:
Students of a university are classified by the school in which they are enrolled using a nonnumeric label such as Business, Humanities, Education, and so on. Alternatively, a numeric code could be used for the school variable (e.g. 1 denotes Business, 2 denotes Humanities, 3 denotes Education, and so on).

Scales of Measurement
■ Ordinal
• The data have the properties of nominal data and the order or rank of the data is meaningful.
• A nonnumeric label or a numeric code may be used.

Scales of Measurement
■ Ordinal
• Example:
Students of a university are classified by their class standing using a nonnumeric label such as Freshman, Sophomore, Junior, or Senior. Alternatively, a numeric code could be used for the class standing variable (e.g. 1 denotes Freshman, 2 denotes Sophomore, and so on).

Scales of Measurement
■ Interval
• The data have the properties of ordinal data and the interval between observations is expressed in terms of a fixed unit of measure.
• Interval data are always numeric.

Scales of Measurement
■ Interval
• Example:
Melissa has an SAT score of 1205, while Kevin has an SAT score of 1090. Melissa scored 115 points more than Kevin.

Scales of Measurement
■ Ratio
• The data have all the properties of interval data and the ratio of two values is meaningful.
• Variables such as distance, height, weight, and time use the ratio scale.
• This scale must contain a zero value that indicates that nothing exists for the variable at the zero point.

Scales of Measurement
■ Ratio
• Example:
Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows 72 credit hours earned. Kevin has twice as many credit hours earned as Melissa.

Qualitative and Quantitative Data
■ Data can be further classified as being qualitative or quantitative.
■ The statistical analysis that is appropriate depends on whether the data for the variable are qualitative or quantitative.
■ In general, there are more alternatives for statistical analysis when the data are quantitative.

Qualitative Data
■ Qualitative data are labels or names used to identify an attribute of each element.
■ Qualitative data use either the nominal or ordinal scale of measurement.
■ Qualitative data can be either numeric or nonnumeric.
■ The statistical analyses available for qualitative data are rather limited.

Quantitative Data
■ Quantitative data indicate either how many or how much.
• Quantitative data that measure how many are discrete.
• Quantitative data that measure how much are continuous because there is no separation between the possible values for the data.
■ Quantitative data are always numeric.
■ Ordinary arithmetic operations are meaningful only with quantitative data.

Cross-Sectional and Time Series Data
■ Cross-sectional data are collected at the same or approximately the same point in time.
• Example: data detailing the number of building permits issued in June 2000 in each of the counties of Texas
■ Time series data are collected over several time periods.
• Example: data detailing the number of building permits issued in Travis County, Texas in each of the last 36 months

Data Sources
■ Existing Sources
• Data needed for a particular application might already exist within a firm. Detailed information is often kept on customers, suppliers, and employees, for example.
• Substantial amounts of business and economic data are available from organizations that specialize in collecting and maintaining data.

Data Sources
■ Existing Sources
• Government agencies are another important source of data.
• Data are also available from a variety of industry associations and special-interest organizations.

Data Sources
■ Internet
• The Internet has become an important source of data.
• Most government agencies, like the Bureau of the Census (www.census.gov), make their data available through a web site.
• More and more companies are creating web sites and providing public access to them.
• A number of companies now specialize in making information available over the Internet.

Data Sources
■ Statistical Studies
• Statistical studies can be classified as either experimental or observational.
• In experimental studies the variables of interest are first identified. Then one or more factors are controlled so that data can be obtained about how the factors influence the variables.
• In observational (nonexperimental) studies no attempt is made to control or influence the variables of interest; an example is a survey.

Data Acquisition Considerations
■ Time Requirement
• Searching for information can be time consuming.
• Information might no longer be useful by the time it is available.
■ Cost of Acquisition
• Organizations often charge for information even when it is not their primary business activity.
■ Data Errors
• Using any data that happen to be available, or that were acquired with little care, can lead to poor and misleading information.

Descriptive Statistics
■ Descriptive statistics are the tabular, graphical, and numerical methods used to summarize data.

Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed below.

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

Example: Hudson Auto Repair
■ Tabular Summary (Frequencies and Percent Frequencies)

Parts Cost ($)   Frequency   Percent Frequency
50-59                 2               4
60-69                13              26
70-79                16              32
80-89                 7              14
90-99                 7              14
100-109               5              10
Total                50             100

Example: Hudson Auto Repair
■ Graphical Summary (Histogram)
[Histogram: Frequency (0-18) on the vertical axis vs. Parts Cost ($), 50-110, on the horizontal axis]

Example: Hudson Auto Repair
■ Numerical Descriptive Statistics
• The most common numerical descriptive statistic is the average (or mean).
• Hudson’s average cost of parts, based on the 50 tune-ups studied, is $79 (found by summing the 50 cost values and then dividing by 50).

Statistical Inference
■ Statistical inference is the process of using data obtained from a small group of elements (the sample) to make estimates and test hypotheses about the characteristics of a larger group of elements (the population).

Example: Hudson Auto Repair
■ Process of Statistical Inference
1. The population consists of all tune-ups. The average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average cost of $79 per tune-up.
4. The value of the sample average is used to make an estimate of the population average.

Chapter 2
Descriptive Statistics: Tabular and Graphical Methods
■ Summarizing Qualitative Data
■ Summarizing Quantitative Data
■ Exploratory Data Analysis
■ Crosstabulations and Scatter Diagrams

Summarizing Qualitative Data
■ Frequency Distribution
■ Relative Frequency Distribution
■ Percent Frequency Distribution
■ Bar Graph
■ Pie Chart

Frequency Distribution
■ A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several nonoverlapping classes.
■ The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.

Example: Marada Inn
■ Frequency Distribution

Rating           Frequency
Poor                  2
Below Average         3
Average               5
Above Average         9
Excellent             1
Total                20

Percent Frequency Distribution
■ The percent frequency of a class is the relative frequency multiplied by 100.
■ A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.

Bar Graph
■ A bar graph is a graphical device for depicting qualitative data.
■ On the horizontal axis we specify the labels that are used for each of the classes.
■ A frequency, relative frequency, or percent frequency scale can be used for the vertical axis.
■ Using a bar of fixed width drawn above each class label, we extend the height appropriately.
■ The bars are separated to emphasize the fact that each class is a separate category.

Example: Marada Inn
Guests staying at Marada Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are shown below.

Below Average   Average         Above Average
Above Average   Above Average   Above Average
Above Average   Below Average   Below Average
Average         Poor            Poor
Above Average   Excellent       Above Average
Average         Above Average   Average
Above Average   Average

Relative Frequency Distribution
■ The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
■ A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.

Example: Marada Inn
■ Relative Frequency and Percent Frequency Distributions

Rating           Relative Frequency   Percent Frequency
Poor                   .10                  10
Below Average          .15                  15
Average                .25                  25
Above Average          .45                  45
Excellent              .05                   5
Total                 1.00                 100
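The Marada Inn tallies above can be reproduced with a short Python sketch (not part of the original slides; the variable names are illustrative):

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn example.
ratings = (
    ["Below Average", "Average"] + ["Above Average"] * 5 +
    ["Below Average"] * 2 +
    ["Average", "Poor", "Poor", "Above Average", "Excellent",
     "Above Average", "Average", "Above Average", "Average",
     "Above Average", "Average"]
)

freq = Counter(ratings)                               # frequency distribution
n = len(ratings)
rel_freq = {r: c / n for r, c in freq.items()}        # relative frequencies
pct_freq = {r: 100 * c / n for r, c in freq.items()}  # percent frequencies

print(freq["Above Average"], rel_freq["Above Average"], pct_freq["Above Average"])
# 9 0.45 45.0
```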

Example: Marada Inn
■ Bar Graph
[Bar graph: Frequency (0-9) on the vertical axis vs. Rating (Poor, Below Average, Average, Above Average, Excellent) on the horizontal axis]

Pie Chart
■ The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
■ First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
■ Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
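The degree calculation generalizes to every class; a minimal Python sketch (illustrative, not from the slides):

```python
# Relative frequencies from the Marada Inn example.
rel_freq = {"Poor": .10, "Below Average": .15, "Average": .25,
            "Above Average": .45, "Excellent": .05}

# Each sector's angle is its relative frequency times the 360-degree circle.
degrees = {rating: rf * 360 for rating, rf in rel_freq.items()}

print(degrees["Average"])  # 90.0
```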

Example: Marada Inn
■ Pie Chart
[Pie chart of quality ratings: Above Average 45%, Average 25%, Below Average 15%, Poor 10%, Excellent 5%]

Example: Marada Inn
■ Insights Gained from the Preceding Pie Chart
• One-half of the customers surveyed gave Marada a quality rating of “above average” or “excellent” (looking at the left side of the pie). This might please the manager.
• For each customer who gave an “excellent” rating, there were two customers who gave a “poor” rating (looking at the top of the pie). This should displease the manager.

Summarizing Quantitative Data
■ Frequency Distribution
■ Relative Frequency and Percent Frequency Distributions
■ Dot Plot
■ Histogram
■ Cumulative Distributions
■ Ogive

Example: Hudson Auto Repair
The manager of Hudson Auto would like to get a better picture of the distribution of costs for engine tune-up parts. A sample of 50 customer invoices has been taken and the costs of parts, rounded to the nearest dollar, are listed below.

91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73

Frequency Distribution
■ Guidelines for Selecting the Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements usually require a larger number of classes.
• Smaller data sets usually require fewer classes.

Frequency Distribution
■ Guidelines for Selecting the Width of Classes
• Use classes of equal width.
• Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes

Example: Hudson Auto Repair
■ Frequency Distribution
If we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5, which we round up to 10.

Cost ($)    Frequency
50-59            2
60-69           13
70-79           16
80-89            7
90-99            7
100-109          5
Total           50
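A sketch of the class-width rule and the binning in Python, assuming the 50 invoice values from the earlier slide (not part of the original deck):

```python
import math

costs = [91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
         71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
         104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
         85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
         62, 82, 98, 101, 79, 105, 79, 69, 62, 73]

n_classes = 6
# (109 - 52) / 6 = 9.5, rounded up to 10.
width = math.ceil((max(costs) - min(costs)) / n_classes)

# Count items in the classes 50-59, 60-69, ..., 100-109.
freq = [sum(1 for c in costs if low <= c <= low + 9)
        for low in range(50, 110, 10)]

print(width, freq)  # 10 [2, 13, 16, 7, 7, 5]
```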

Example: Hudson Auto Repair
■ Relative Frequency and Percent Frequency Distributions

Cost ($)    Relative Frequency   Percent Frequency
50-59             .04                   4
60-69             .26                  26
70-79             .32                  32
80-89             .14                  14
90-99             .14                  14
100-109           .10                  10
Total            1.00                 100

Example: Hudson Auto Repair
■ Insights Gained from the Percent Frequency Distribution
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32%, or almost one-third) of the parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.

Dot Plot
■ One of the simplest graphical summaries of data is a dot plot.
■ A horizontal axis shows the range of data values.
■ Then each data value is represented by a dot placed above the axis.

Example: Hudson Auto Repair
■ Dot Plot
[Dot plot: each of the 50 parts costs shown as a dot above a horizontal Cost ($) axis running from 50 to 110]

Histogram
■ Another common graphical presentation of quantitative data is a histogram.
■ The variable of interest is placed on the horizontal axis.
■ A rectangle is drawn above each class interval with its height corresponding to the interval’s frequency, relative frequency, or percent frequency.
■ Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.

Example: Hudson Auto Repair
■ Histogram
[Histogram: Frequency (0-18) on the vertical axis vs. Parts Cost ($), 50-110, on the horizontal axis]
Cumulative Distributions

■ Cumulative frequency distribution: shows the number of items with values less than or equal to the upper limit of each class.
■ Cumulative relative frequency distribution: shows the proportion of items with values less than or equal to the upper limit of each class.
■ Cumulative percent frequency distribution: shows the percentage of items with values less than or equal to the upper limit of each class.

Example: Hudson Auto Repair
■ Cumulative Distributions

            Cumulative   Cumulative Relative   Cumulative Percent
Cost ($)    Frequency    Frequency             Frequency
≤ 59             2            .04                    4
≤ 69            15            .30                   30
≤ 79            31            .62                   62
≤ 89            38            .76                   76
≤ 99            45            .90                   90
≤ 109           50           1.00                  100
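All three cumulative distributions can be derived from the class frequencies with a running sum; a Python sketch (illustrative, not from the slides):

```python
from itertools import accumulate

freq = [2, 13, 16, 7, 7, 5]   # class frequencies for 50-59, ..., 100-109
n = sum(freq)                 # 50 invoices

cum_freq = list(accumulate(freq))          # cumulative frequencies
cum_rel = [f / n for f in cum_freq]        # cumulative relative frequencies
cum_pct = [100 * f / n for f in cum_freq]  # cumulative percent frequencies

print(cum_freq)  # [2, 15, 31, 38, 45, 50]
```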

Ogive
■ An ogive is a graph of a cumulative distribution.
■ The data values are shown on the horizontal axis.
■ Shown on the vertical axis are the:
• cumulative frequencies, or
• cumulative relative frequencies, or
• cumulative percent frequencies
■ The frequency (one of the above) of each class is plotted as a point.
■ The plotted points are connected by straight lines.

Example: Hudson Auto Repair
■ Ogive
• Because the class limits for the parts-cost data are 50-59, 60-69, and so on, there appear to be one-unit gaps from 59 to 60, 69 to 70, and so on.
• These gaps are eliminated by plotting points halfway between the class limits.
• Thus, 59.5 is used for the 50-59 class, 69.5 is used for the 60-69 class, and so on.

Example: Hudson Auto Repair
■ Ogive with Cumulative Percent Frequencies
[Ogive: Cumulative Percent Frequency (0-100) on the vertical axis vs. Parts Cost ($), 50-110, on the horizontal axis]

Exploratory Data Analysis
■ The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
■ One such technique is the stem-and-leaf display.

Stem-and-Leaf Display
■ A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
■ It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
■ The first digits of each data item are arranged to the left of a vertical line.
■ To the right of the vertical line we record the last digit for each item in rank order.
■ Each line in the display is referred to as a stem.
■ Each digit on a stem is a leaf.

Example: Hudson Auto Repair
■ Stem-and-Leaf Display

 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
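A minimal Python sketch of the stem/leaf split for the 50 parts costs (tens digits as stem, units digit as leaf; not part of the original slides):

```python
costs = [91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
         71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
         104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
         85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
         62, 82, 98, 101, 79, 105, 79, 69, 62, 73]

stems = {}
for c in sorted(costs):
    # Leading digit(s) form the stem; the last digit is the leaf.
    stems.setdefault(c // 10, []).append(c % 10)

for stem in sorted(stems):
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in stems[stem])}")
```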

Stretched Stem-and-Leaf Display
■ If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit (or digits).
■ Whenever a stem value is stated twice, the first corresponds to leaf values of 0-4, and the second corresponds to leaf values of 5-9.

Example: Hudson Auto Repair
■ Stretched Stem-and-Leaf Display

 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9

Stem-and-Leaf Display
■ Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.

Example: Leaf Unit = 0.1
If we have data with values such as
8.6 11.7 9.4 9.1 10.2 11.0 8.8
a stem-and-leaf display of these data will be

Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7

Example: Leaf Unit = 10
If we have data with values such as
1806 1717 1974 1791 1682 1910 1838
a stem-and-leaf display of these data will be

Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7

Crosstabulations and Scatter Diagrams
■ Thus far we have focused on methods that are used to summarize the data for one variable at a time.
■ Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
■ Crosstabulation and the scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.

Crosstabulation
■ Crosstabulation is a tabular method for summarizing the data for two variables simultaneously.
■ Crosstabulation can be used when:
• One variable is qualitative and the other is quantitative
• Both variables are qualitative
• Both variables are quantitative
■ The left and top margin labels define the classes for the two variables.

Example: Finger Lakes Homes
■ Crosstabulation
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.

Price Range    Colonial   Ranch   Split   A-Frame   Total
≤ $99,000          18        6      19       12       55
> $99,000          12       14      16        3       45
Total              30       20      35       15      100

Example: Finger Lakes Homes
■ Insights Gained from the Preceding Crosstabulation
• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
• Only three homes in the sample are an A-Frame style and priced at more than $99,000.

Crosstabulation: Row or Column Percentages
■ Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.

Example: Finger Lakes Homes
■ Row Percentages

Price Range    Colonial   Ranch   Split   A-Frame   Total
≤ $99,000        32.73    10.91   34.55    21.82     100
> $99,000        26.67    31.11   35.56     6.67     100

Note: row totals are actually 100.01 due to rounding.
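The row percentages can be checked with a short Python sketch (the dictionary keys are illustrative labels, not from the slides):

```python
# Frequencies from the Finger Lakes crosstabulation,
# in style order: Colonial, Ranch, Split, A-Frame.
rows = {"<= $99,000": [18, 6, 19, 12],
        ">  $99,000": [12, 14, 16, 3]}

# Divide each cell by its row total and express as a percentage.
row_pct = {price: [round(100 * f / sum(counts), 2) for f in counts]
           for price, counts in rows.items()}

print(row_pct["<= $99,000"])  # [32.73, 10.91, 34.55, 21.82]
```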

Example: Finger Lakes Homes
■ Column Percentages

Price Range    Colonial   Ranch   Split   A-Frame
≤ $99,000        60.00    30.00   54.29    80.00
> $99,000        40.00    70.00   45.71    20.00
Total           100      100     100      100

Scatter Diagram
■ A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
■ One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
■ The general pattern of the plotted points suggests the overall relationship between the variables.

Scatter Diagram
■ A Positive Relationship
[Scatter plot: y tends to increase as x increases]

Scatter Diagram
■ A Negative Relationship
[Scatter plot: y tends to decrease as x increases]

Scatter Diagram
■ No Apparent Relationship
[Scatter plot: no apparent pattern between x and y]

Example: Panthers Football Team
■ Scatter Diagram
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.

x = Number of    y = Number of
Interceptions    Points Scored
      1               14
      3               24
      2               18
      1               17
      3               27

Example: Panthers Football Team
■ Scatter Diagram
[Scatter plot: Number of Points Scored (0-30) on the vertical axis vs. Number of Interceptions (0-3) on the horizontal axis]
Example: Panthers Football Team

■ The preceding scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
■ Higher points scored are associated with a higher number of interceptions.
■ The relationship is not perfect; not all of the plotted points in the scatter diagram lie on a straight line.

Tabular and Graphical Procedures

Data
  Qualitative Data
    Tabular Methods: Frequency Distribution, Rel. Freq. Dist., % Freq. Dist., Crosstabulation
    Graphical Methods: Bar Graph, Pie Chart
  Quantitative Data
    Tabular Methods: Frequency Distribution, Rel. Freq. Dist., Cum. Freq. Dist., Cum. Rel. Freq. Dist., Stem-and-Leaf Display, Crosstabulation
    Graphical Methods: Dot Plot, Histogram, Ogive, Scatter Diagram

Chapter 3
Descriptive Statistics: Numerical Methods
Part A
■ Measures of Location
■ Measures of Variability

Measures of Location
■ Mean
■ Median
■ Mode
■ Percentiles
■ Quartiles

Example: Apartment Rents
Given below is a sample of monthly rent values ($) for 70 one-bedroom apartments in a particular city. The data are presented in ascending order.

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Mean
■ The mean of a data set is the average of all the data values.
■ If the data are from a sample, the mean is denoted by x̄:
x̄ = Σxᵢ / n
■ If the data are from a population, the mean is denoted by μ (mu):
μ = Σxᵢ / N

Example: Apartment Rents
■ Mean
x̄ = Σxᵢ / n = 34,356 / 70 = 490.80

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
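A quick check of the sample mean in Python (the list reproduces the 70 rents above; not part of the original slides):

```python
rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

mean = sum(rents) / len(rents)  # 34,356 / 70
print(round(mean, 2))  # 490.8
```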

Median
■ The median is the measure of location most often reported for annual income and property value data.
■ A few extremely large incomes or property values can inflate the mean.

Median
■ The median of a data set is the value in the middle when the data items are arranged in ascending order.
■ For an odd number of observations, the median is the middle value.
■ For an even number of observations, the median is the average of the two middle values.

Example: Apartment Rents
■ Median
Median = 50th percentile
i = (p/100)n = (50/100)70 = 35
Since i is an integer, average the 35th and 36th data values:
Median = (475 + 475)/2 = 475

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Mode
■ The mode of a data set is the value that occurs with greatest frequency.
■ The greatest frequency can occur at two or more different values.
■ If the data have exactly two modes, the data are bimodal.
■ If the data have more than two modes, the data are multimodal.

Example: Apartment Rents
■ Mode
450 occurred most frequently (7 times)
Mode = 450

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
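The mode can be found with collections.Counter; a Python sketch using the 70 rents (illustrative, not from the slides):

```python
from collections import Counter

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

# most_common(1) returns the (value, count) pair with the greatest frequency.
value, count = Counter(rents).most_common(1)[0]
print(value, count)  # 450 7
```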

Percentiles
■ A percentile provides information about how the data are spread over the interval from the smallest value to the largest value.
■ Admission test scores for colleges and universities are frequently reported in terms of percentiles.

Percentiles
■ The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.
• Arrange the data in ascending order.
• Compute the index i, the position of the pth percentile: i = (p/100)n
• If i is not an integer, round up. The pth percentile is the value in the ith position.
• If i is an integer, the pth percentile is the average of the values in positions i and i + 1.
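The four-step rule above translates directly into Python; a sketch (the function name is illustrative, and the rent data come from the earlier slide):

```python
import math

def percentile(data, p):
    """pth percentile by the textbook rule: i = (p/100)*n; round i up
    if fractional, else average the values in positions i and i + 1."""
    xs = sorted(data)                    # step 1: ascending order
    n = len(xs)
    i = p * n / 100                      # step 2: the index i
    if i != int(i):                      # step 3: not an integer -> round up
        return xs[math.ceil(i) - 1]      # 1-based position -> 0-based index
    i = int(i)                           # step 4: integer -> average i, i+1
    return (xs[i - 1] + xs[i]) / 2

rents = [
    425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
    440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
    450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
    465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
    480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
    510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
    575, 575, 580, 590, 600, 600, 600, 600, 615, 615,
]

print(percentile(rents, 90))  # 585.0
```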

Example: Apartment Rents
■ 90th Percentile
i = (p/100)n = (90/100)70 = 63
Since i is an integer, average the 63rd and 64th data values:
90th Percentile = (580 + 590)/2 = 585

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Quartiles
■ Quartiles are specific percentiles.
■ First Quartile = 25th Percentile
■ Second Quartile = 50th Percentile = Median
■ Third Quartile = 75th Percentile

Example: Apartment Rents
■ Third Quartile
Third quartile = 75th percentile
i = (p/100)n = (75/100)70 = 52.5, which rounds up to 53
Third quartile = 525 (the value in the 53rd position)

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615

Measures of Variability
■ It is often desirable to consider measures of variability (dispersion), as well as measures of location.
■ For example, in choosing supplier A or supplier B we might consider not only the average delivery time for each, but also the variability in delivery time for each.

Measures of Variability
■ Range
■ Interquartile Range
■ Variance
■ Standard Deviation
■ Coefficient of Variation

Range
■ The range of a data set is the difference between the largest and smallest data values.
■ It is the simplest measure of variability.
■ It is very sensitive to the smallest and largest data values.
Example: Apartment Rents

 Range
Range = largest value - smallest value
Range = 615 - 425 = 190
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Interquartile Range

 The interquartile range of a data set is the difference


between the third quartile and the first quartile.
 It is the range for the middle 50% of the data.
 It overcomes the sensitivity to extreme data values.
Example: Apartment Rents

 Interquartile Range
3rd Quartile (Q3) = 525
1st Quartile (Q1) = 445
Interquartile Range = Q3 - Q1 = 525 - 445 = 80
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
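The quartile and interquartile-range computations can be checked with the same percentile rule; the helper is redefined here so the snippet stands alone.

```python
import math

def percentile(data, p):
    # Slide rule: i = (p/100)n; round up if fractional,
    # else average the values in positions i and i+1.
    xs = sorted(data)
    i = (p / 100) * len(xs)
    if i != int(i):
        return xs[math.ceil(i) - 1]
    i = int(i)
    return (xs[i - 1] + xs[i]) / 2

rents = [425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
         440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
         450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
         465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
         480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
         510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
         575, 575, 580, 590, 600, 600, 600, 600, 615, 615]

q1 = percentile(rents, 25)   # 445
q3 = percentile(rents, 75)   # 525
iqr = q3 - q1                # 80: the range of the middle 50% of the rents
print(q1, q3, iqr)
```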
Variance

 The variance is a measure of variability that utilizes


all the data.
 It is based on the difference between the value of
each observation (xi) and the mean (x̄ for a sample, μ
for a population).
Variance

 The variance is the average of the squared differences


between each data value and the mean.
 If the data set is a sample, the variance is denoted by
s2.
s² = Σ(xi − x̄)² / (n − 1)
 If the data set is a population, the variance is denoted
by  2.
σ² = Σ(xi − μ)² / N
Standard Deviation

 The standard deviation of a data set is the positive


square root of the variance.
 It is measured in the same units as the data, making
it easier to compare to the mean than the variance is.
 If the data set is a sample, the standard deviation is
denoted s.
s = √s²
 If the data set is a population, the standard deviation
is denoted  (sigma).
σ = √σ²
Coefficient of Variation

 The coefficient of variation indicates how large the


standard deviation is in relation to the mean.
 If the data set is a sample, the coefficient of variation
is computed as follows:
(s / x̄)(100)
 If the data set is a population, the coefficient of
variation is computed as follows:
(σ / μ)(100)
Example: Apartment Rents

 Variance
s² = Σ(xi − x̄)² / (n − 1) = 2,996.16
 Standard Deviation
s = √s² = √2,996.16 = 54.74
 Coefficient of Variation
(s / x̄)(100) = (54.74 / 490.80)(100) = 11.15
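The variance, standard deviation, and coefficient of variation for the rent sample can be reproduced directly from the formulas on the preceding slides:

```python
rents = [425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
         440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
         450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
         465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
         480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
         510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
         575, 575, 580, 590, 600, 600, 600, 600, 615, 615]

n = len(rents)
xbar = sum(rents) / n                                 # sample mean, 490.80
s2 = sum((x - xbar) ** 2 for x in rents) / (n - 1)    # sample variance
s = s2 ** 0.5                                         # sample standard deviation
cv = (s / xbar) * 100                                 # coefficient of variation
print(round(s2, 2), round(s, 2), round(cv, 2))        # 2996.16 54.74 11.15
```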
Chapter 3
Descriptive Statistics: Numerical Methods
Part B
 Measures of Relative Location and Detecting Outliers
 Exploratory Data Analysis
 Measures of Association Between Two Variables


 The Weighted Mean and
Working with Grouped Data
Measures of Relative Location
and Detecting Outliers
 z-Scores
 Chebyshev’s Theorem
 Empirical Rule
 Detecting Outliers
z-Scores

 The z-score is often called the standardized value.


 It denotes the number of standard deviations a data
value xi is from the mean.
xi  x
zi 
s
 A data value less than the sample mean will have a z-
score less than zero.
 A data value greater than the sample mean will have
a z-score greater than zero.
 A data value equal to the sample mean will have a z-
score of zero.
Example: Apartment Rents

 z-Score of Smallest Value (425)


xi  x 425  490.80
z   1. 20
s 54. 74
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
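The z-score formula can be sketched with the sample mean and standard deviation from the rent example:

```python
xbar, s = 490.80, 54.74   # sample mean and standard deviation from the slides

def z_score(x):
    # number of standard deviations x lies from the sample mean
    return (x - xbar) / s

print(round(z_score(425), 2))   # -1.2, the smallest rent
print(round(z_score(615), 2))   # 2.27, the largest rent
```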
Chebyshev’s Theorem

At least (1 - 1/z2) of the items in any data set will be


within z standard deviations of the mean, where z is
any value greater than 1.
• At least 75% of the items must be within
z = 2 standard deviations of the mean.
• At least 89% of the items must be within
z = 3 standard deviations of the mean.
• At least 94% of the items must be within
z = 4 standard deviations of the mean.
Example: Apartment Rents

 Chebyshev’s Theorem

Let z = 1.5 with x̄ = 490.80 and s = 54.74
At least (1 − 1/(1.5)²) = 1 − 0.44 = 0.56, or 56%,
of the rent values must be between
x̄ − z(s) = 490.80 − 1.5(54.74) = 409
and
x̄ + z(s) = 490.80 + 1.5(54.74) = 573
Example: Apartment Rents

 Chebyshev’s Theorem (continued)


Actually, 86% of the rent values
are between 409 and 573.
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
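Chebyshev's guarantee for z = 1.5 can be checked against the actual rent data: the theorem promises at least 56% of values inside x̄ ± 1.5s, and the data deliver about 86%.

```python
rents = [425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
         440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
         450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
         465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
         480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
         510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
         575, 575, 580, 590, 600, 600, 600, 600, 615, 615]

xbar, s, z = 490.80, 54.74, 1.5
lo, hi = xbar - z * s, xbar + z * s           # about 409 to 573
within = sum(lo <= x <= hi for x in rents)    # how many rents fall inside
guarantee = 1 - 1 / z ** 2                    # Chebyshev lower bound: .56
print(within, round(within / len(rents), 2))  # 60 rents -> 0.86
```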
Empirical Rule

For data having a bell-shaped distribution:

• Approximately 68% of the data values will be


within one standard deviation of the mean.
Empirical Rule

For data having a bell-shaped distribution:

• Approximately 95% of the data values will be


within two standard deviations of the mean.
Empirical Rule

For data having a bell-shaped distribution:

• Almost all (99.7%) of the items will be


within three standard deviations of the mean.
Example: Apartment Rents

 Empirical Rule
Interval % in Interval
Within +/- 1s 436.06 to 545.54 48/70 = 69%
Within +/- 2s 381.32 to 600.28 68/70 = 97%
Within +/- 3s 326.58 to 655.02 70/70 = 100%
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
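The empirical-rule counts on the slide can be reproduced by counting rents within one, two, and three standard deviations of the mean:

```python
rents = [425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
         440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
         450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
         465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
         480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
         510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
         575, 575, 580, 590, 600, 600, 600, 600, 615, 615]

xbar, s = 490.80, 54.74
# count of rents with |x - xbar| <= k*s for k = 1, 2, 3
counts = [sum(abs(x - xbar) <= k * s for x in rents) for k in (1, 2, 3)]
print(counts)   # [48, 68, 70] -> 69%, 97%, 100% of the 70 rents
```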
Detecting Outliers

 An outlier is an unusually small or unusually large


value in a data set.
 A data value with a z-score less than -3 or greater
than +3 might be considered an outlier.
 It might be:
• an incorrectly recorded data value
• a data value that was incorrectly included in the
data set
• a correctly recorded data value that belongs in the
data set
Example: Apartment Rents

 Detecting Outliers
The most extreme z-scores are -1.20 and 2.27.
Using |z| > 3 as the criterion for an outlier,
there are no outliers in this data set.
Standardized Values for Apartment Rents
-1.20 -1.11 -1.11 -1.02 -1.02 -1.02 -1.02 -1.02 -0.93 -0.93
-0.93 -0.93 -0.93 -0.84 -0.84 -0.84 -0.84 -0.84 -0.75 -0.75
-0.75 -0.75 -0.75 -0.75 -0.75 -0.56 -0.56 -0.56 -0.47 -0.47
-0.47 -0.38 -0.38 -0.34 -0.29 -0.29 -0.29 -0.20 -0.20 -0.20
-0.20 -0.11 -0.01 -0.01 -0.01 0.17 0.17 0.17 0.17 0.35
0.35 0.44 0.62 0.62 0.62 0.81 1.06 1.08 1.45 1.45
1.54 1.54 1.63 1.81 1.99 1.99 1.99 1.99 2.27 2.27
Exploratory Data Analysis

 Five-Number Summary
 Box Plot
Five-Number Summary

 Smallest Value
 First Quartile
 Median
 Third Quartile
 Largest Value
Example: Apartment Rents

 Five-Number Summary
Lowest Value = 425 First Quartile = 445
Median = 475
Third Quartile = 525 Largest Value = 615
425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Box Plot

 A box is drawn with its ends located at the first and


third quartiles.
 A vertical line is drawn in the box at the location of
the median.
 Limits are located (not drawn) using the interquartile
range (IQR).
• The lower limit is located 1.5(IQR) below Q1.
• The upper limit is located 1.5(IQR) above Q3.
• Data outside these limits are considered outliers.
… continued
Box Plot (Continued)

 Whiskers (dashed lines) are drawn from the ends of


the box to the smallest and largest data values inside
the limits.
 The location of each outlier is shown with the
symbol *.
Example: Apartment Rents

 Box Plot
Lower Limit: Q1 - 1.5(IQR) = 445 - 1.5(80) = 325
Upper Limit: Q3 + 1.5(IQR) = 525 + 1.5(80) = 645
There are no outliers.
(Figure: box plot drawn over an axis scaled from 375 to 625)
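The five-number summary and box-plot limits can be sketched as below. Note that the textbook percentile rule gives Q1 = 445 for the rent data, so the fences are 445 − 1.5(80) = 325 and 525 + 1.5(80) = 645.

```python
import math

def percentile(data, p):
    # Slide rule: i = (p/100)n; round up if fractional, else average i and i+1.
    xs = sorted(data)
    i = (p / 100) * len(xs)
    if i != int(i):
        return xs[math.ceil(i) - 1]
    i = int(i)
    return (xs[i - 1] + xs[i]) / 2

rents = [425, 430, 430, 435, 435, 435, 435, 435, 440, 440,
         440, 440, 440, 445, 445, 445, 445, 445, 450, 450,
         450, 450, 450, 450, 450, 460, 460, 460, 465, 465,
         465, 470, 470, 472, 475, 475, 475, 480, 480, 480,
         480, 485, 490, 490, 490, 500, 500, 500, 500, 510,
         510, 515, 525, 525, 525, 535, 549, 550, 570, 570,
         575, 575, 580, 590, 600, 600, 600, 600, 615, 615]

five = (min(rents), percentile(rents, 25), percentile(rents, 50),
        percentile(rents, 75), max(rents))        # five-number summary
q1, q3 = five[1], five[3]
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr     # box-plot limits
outliers = [x for x in rents if x < lower or x > upper]
print(five, outliers)   # no rent falls outside the limits
```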
Measures of Association
Between Two Variables
 Covariance
 Correlation Coefficient
Covariance

 The covariance is a measure of the linear association


between two variables.
 Positive values indicate a positive relationship.
 Negative values indicate a negative relationship.
Covariance

 If the data sets are samples, the covariance is denoted


by sxy.
 ( xi  x )( yi  y )
sxy 
n 1
 If the data sets are populations, the covariance is
denoted by  xy .
 ( xi   x )( yi   y )
 xy 
N
Correlation Coefficient

 The coefficient can take on values between -1 and +1.


 Values near -1 indicate a strong negative linear
relationship.
 Values near +1 indicate a strong positive linear
relationship.
 If the data sets are samples, the coefficient is rxy.
rxy = sxy / (sx sy)
 If the data sets are populations, the coefficient is ρxy.
ρxy = σxy / (σx σy)
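A minimal sketch of the sample covariance and correlation formulas; `x` and `y` are small hypothetical samples invented for illustration, not data from the slides.

```python
x = [1, 2, 3, 4, 5]   # hypothetical sample of one variable
y = [2, 4, 5, 4, 5]   # hypothetical sample of a second variable

n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
# sample covariance: sum of cross-deviations divided by n - 1
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (n - 1)
sx = (sum((xi - xbar) ** 2 for xi in x) / (n - 1)) ** 0.5
sy = (sum((yi - ybar) ** 2 for yi in y) / (n - 1)) ** 0.5
rxy = sxy / (sx * sy)   # sample correlation coefficient
print(round(sxy, 2), round(rxy, 4))   # 1.5 0.7746 -> positive linear relationship
```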
The Weighted Mean and
Working with Grouped Data
 Weighted Mean
 Mean for Grouped Data
 Variance for Grouped Data
 Standard Deviation for Grouped Data
Weighted Mean

 When the mean is computed by giving each data


value a weight that reflects its importance, it is
referred to as a weighted mean.
 In the computation of a grade point average (GPA),
the weights are the number of credit hours earned for
each grade.
 When data values vary in importance, the analyst
must choose the weight that best reflects the
importance of each value.
Weighted Mean

x =  wi xi
 wi

where:
xi = value of observation i
wi = weight for observation i
Grouped Data

 The weighted mean computation can be used to


obtain approximations of the mean, variance, and
standard deviation for the grouped data.
 To compute the weighted mean, we treat the
midpoint of each class as though it were the mean of
all items in the class.
 We compute a weighted mean of the class midpoints
using the class frequencies as weights.
 Similarly, in computing the variance and standard
deviation, the class frequencies are used as weights.
Mean for Grouped Data

 Sample Data
x̄ = Σ fi Mi / Σ fi
 Population Data
μ = Σ fi Mi / N
where:
fi = frequency of class i
Mi = midpoint of class i
Example: Apartment Rents

Given below is the previous sample of monthly rents


for one-bedroom apartments presented here as grouped
data in the form of a frequency distribution.
Rent ($) Frequency
420-439 8
440-459 17
460-479 12
480-499 8
500-519 7
520-539 4
540-559 2
560-579 4
580-599 2
600-619 6
Example: Apartment Rents

 Mean for Grouped Data


Rent ($)    fi     Mi      fi Mi
420-439      8    429.5    3436.0
440-459     17    449.5    7641.5
460-479     12    469.5    5634.0
480-499      8    489.5    3916.0
500-519      7    509.5    3566.5
520-539      4    529.5    2118.0
540-559      2    549.5    1099.0
560-579      4    569.5    2278.0
580-599      2    589.5    1179.0
600-619      6    609.5    3657.0
Total       70            34525.0

x̄ = 34,525/70 = 493.21
This approximation differs by $2.41 from
the actual sample mean of $490.80.
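The grouped-data approximations can be sketched from the frequency distribution above, treating each class midpoint Mi as if it were the mean of its class:

```python
# class midpoint Mi -> frequency fi, from the rent frequency distribution
freq = {429.5: 8, 449.5: 17, 469.5: 12, 489.5: 8, 509.5: 7,
        529.5: 4, 549.5: 2, 569.5: 4, 589.5: 2, 609.5: 6}

n = sum(freq.values())                                      # 70
xbar = sum(f * M for M, f in freq.items()) / n              # 34,525/70
s2 = sum(f * (M - xbar) ** 2 for M, f in freq.items()) / (n - 1)
print(round(xbar, 2), round(s2, 2), round(s2 ** 0.5, 2))    # 493.21 3017.89 54.94
```

Both approximations land close to the exact sample values ($490.80 and $54.74).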
Variance for Grouped Data

 Sample Data
2  f i ( Mi  x ) 2
s 
n 1
 Population Data
σ² = Σ fi (Mi − μ)² / N
Example: Apartment Rents

 Variance for Grouped Data


s2  3, 017.89

 Standard Deviation for Grouped Data

s  3, 017.89  54. 94
This approximation differs by only $.20
from the actual standard deviation of $54.74.
Chapter 7
Sampling and Sampling Distributions
 Simple Random Sampling
 Point Estimation
 Introduction to Sampling Distributions
 Sampling Distribution of x̄
 Sampling Distribution of p̄
 Sampling Methods
Statistical Inference

 The purpose of statistical inference is to obtain


information about a population from information
contained in a sample.
 A population is the set of all the elements of interest.
 A sample is a subset of the population.
 The sample results provide only estimates of the
values of the population characteristics.
 A parameter is a numerical characteristic of a
population.
 With proper sampling methods, the sample results
will provide “good” estimates of the population
characteristics.
Simple Random Sampling

 Finite Population
• A simple random sample from a finite population
of size N is a sample selected such that each
possible sample of size n has the same probability
of being selected.
• Replacing each sampled element before selecting
subsequent elements is called sampling with
replacement.
Simple Random Sampling

 Finite Population
• Sampling without replacement is the procedure
used most often.
• In large sampling projects, computer-generated
random numbers are often used to automate the
sample selection process.
Simple Random Sampling

 Infinite Population
• A simple random sample from an infinite
population is a sample selected such that the
following conditions are satisfied.
• Each element selected comes from the same
population.
• Each element is selected independently.
Simple Random Sampling

 Infinite Population
• The population is usually considered infinite if it
involves an ongoing process that makes listing or
counting every element impossible.
• The random number selection procedure cannot
be used for infinite populations.
Point Estimation

 In point estimation we use the data from the sample


to compute a value of a sample statistic that serves as
an estimate of a population parameter.
 We refer to x̄ as the point estimator of the population
mean μ.
 s is the point estimator of the population standard
deviation σ.
 p̄ is the point estimator of the population proportion
p.
Point Estimation

 When the expected value of a point estimator is equal


to the population parameter, the point estimator is
said to be unbiased.
Sampling Error

 The absolute difference between an unbiased point


estimate and the corresponding population
parameter is called the sampling error.
 Sampling error is the result of using a subset of the
population (the sample), and not the entire
population to develop estimates.
 The sampling errors are:
| x for
 | sample mean
|s   | for sample standard deviation
| p for
p| sample proportion
Example: St. Andrew’s

St. Andrew’s College receives 900 applications


annually from prospective students. The application
forms contain a variety of information including the
individual’s scholastic aptitude test (SAT) score and
whether or not the individual desires on-campus
housing.
Example: St. Andrew’s

The director of admissions would like to know the


following information:
• the average SAT score for the applicants, and
• the proportion of applicants that want to live on
campus.
Example: St. Andrew’s

We will now look at three alternatives for obtaining


the desired information.
• Conducting a census of the entire 900 applicants
• Selecting a sample of 30 applicants, using a
random number table
• Selecting a sample of 30 applicants, using
computer-generated random numbers
Example: St. Andrew’s

 Taking a Census of the 900 Applicants


• SAT Scores
• Population Mean
μ = Σ xi / 900 = 990
• Population Standard Deviation
σ = √( Σ(xi − μ)² / 900 ) = 80
Example: St. Andrew’s

 Taking a Census of the 900 Applicants


• Applicants Wanting On-Campus Housing
• Population Proportion
p = 648/900 = .72
Example: St. Andrew’s

 Take a Sample of 30 Applicants


Using a Random Number Table
Since the finite population has 900 elements, we
will need 3-digit random numbers to randomly select
applicants numbered from 1 to 900.
We will use the last three digits of the 5-digit
random numbers in the third column of the
textbook’s random number table.
Example: St. Andrew’s

 Take a Sample of 30 Applicants


Using a Random Number Table
The numbers we draw will be the numbers of the
applicants we will sample unless
• the random number is greater than 900 or
• the random number has already been used.
We will continue to draw random numbers until we
have selected 30 applicants for our sample.
Example: St. Andrew’s

 Use of Random Numbers for Sampling


3-Digit Applicant
Random Number Included in Sample
744 No. 744
436 No. 436
865 No. 865
790 No. 790
835 No. 835
902 Number exceeds 900
190 No. 190
436 Number already used
etc. etc.
Example: St. Andrew’s

 Sample Data
Random
No. Number Applicant SAT Score On-Campus
1 744 Connie Reyman 1025 Yes
2 436 William Fox 950 Yes
3 865 Fabian Avante 1090 No
4 790 Eric Paxton 1120 Yes
5 835 Winona Wheeler 1015 No
. . . . .
30 685 Kevin Cossack 965 No
Example: St. Andrew’s

 Take a Sample of 30 Applicants


Using Computer-Generated Random Numbers
• Excel provides a function for generating random
numbers in its worksheet.
• 900 random numbers are generated, one for each
applicant in the population.
• Then we choose the 30 applicants corresponding
to the 30 smallest random numbers as our sample.
• Each of the 900 applicants has the same
probability of being included.
Using Excel to Select
a Simple Random Sample
 Formula Worksheet
A B C D
Applicant SAT On-Campus Random
1 Number Score Housing Number
2 1 1008 Yes =RAND()
3 2 1025 No =RAND()
4 3 952 Yes =RAND()
5 4 1090 Yes =RAND()
6 5 1127 Yes =RAND()
7 6 1015 No =RAND()
8 7 965 Yes =RAND()
9 8 1161 No =RAND()

Note: Rows 10-901 are not shown.
Using Excel to Select
a Simple Random Sample
 Value Worksheet
A B C D
Applicant SAT On-Campus Random
1 Number Score Housing Number
2 1 1008 Yes 0.41327
3 2 1025 No 0.79514
4 3 952 Yes 0.66237
5 4 1090 Yes 0.00234
6 5 1127 Yes 0.71205
7 6 1015 No 0.18037
8 7 965 Yes 0.71607
9 8 1161 No 0.90512

Note: Rows 10-901 are not shown.
Using Excel to Select
a Simple Random Sample
 Value Worksheet (Sorted)
A B C D
Applicant SAT On-Campus Random
1 Number Score Housing Number
2 12 1107 No 0.00027
3 773 1043 Yes 0.00192
4 408 991 Yes 0.00303
5 58 1008 No 0.00481
6 116 1127 Yes 0.00538
7 185 982 Yes 0.00583
8 510 1163 Yes 0.00649
9 394 1008 No 0.00667

Note: Rows 10-901 are not shown.
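The Excel procedure above (attach a random number to each applicant, then keep the applicants with the 30 smallest numbers) can be mirrored in Python; the seed is fixed only so the sketch is reproducible.

```python
import random

random.seed(42)                       # fixed seed for a reproducible illustration
N, n = 900, 30
# Give each applicant a random number, as =RAND() does in the worksheet.
keyed = [(random.random(), applicant) for applicant in range(1, N + 1)]
# Sorting by the random number and keeping the first n mimics the sorted worksheet.
sample = sorted(applicant for _, applicant in sorted(keyed)[:n])
print(len(sample))   # 30 distinct applicant numbers between 1 and 900
```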
Example: St. Andrew’s

 Point Estimates
• x̄ as Point Estimator of μ
x̄ = Σ xi / 30 = 29,910/30 = 997
• s as Point Estimator of σ
s = √( Σ(xi − x̄)² / 29 ) = √(163,996/29) = 75.2
• p̄ as Point Estimator of p
p̄ = 20/30 = .67
Example: St. Andrew’s

 Point Estimates
Note: Different random numbers would have
identified a different sample which would have
resulted in different point estimates.
Sampling Distribution of x̄

 Process of Statistical Inference
A population exists with mean μ = ?
A simple random sample of n elements is selected
from the population.
The sample data provide a value for the sample
mean x̄.
The value of x̄ is used to make inferences about
the value of μ.
Sampling Distribution of x̄

 The sampling distribution of x̄ is the probability
distribution of all possible values of the sample
mean x̄.
 Expected Value of x̄
E(x̄) = μ
where:
μ = the population mean
Sampling Distribution of x̄

 Standard Deviation of x̄
Finite Population: σx̄ = (σ/√n)·√((N − n)/(N − 1))
Infinite Population: σx̄ = σ/√n
• A finite population is treated as being
infinite if n/N < .05.
• √((N − n)/(N − 1)) is the finite correction factor.
• σx̄ is referred to as the standard error of the mean.
Sampling Distribution of x̄

 If we use a large (n > 30) simple random sample, the
central limit theorem enables us to conclude that the
sampling distribution of x̄ can be approximated by a
normal probability distribution.
 When the simple random sample is small (n < 30), the
sampling distribution of x̄ can be considered normal
only if we assume the population has a normal
probability distribution.
Example: St. Andrew’s

 Sampling Distribution of x̄ for the SAT Scores
E(x̄) = μ = 990
σx̄ = σ/√n = 80/√30 = 14.6
Example: St. Andrew’s

 Sampling Distribution of x̄ for the SAT Scores
What is the probability that a simple random
sample of 30 applicants will provide an estimate of
the population mean SAT score that is within plus or
minus 10 of the actual population mean μ?
In other words, what is the probability that x̄ will
be between 980 and 1000?
Example: St. Andrew’s

 Sampling Distribution of x̄ for the SAT Scores
(Figure: sampling distribution of x̄ centered at 990;
shaded area between 980 and 1000 = ?)
Example: St. Andrew’s

 Sampling Distribution of x̄ for the SAT Scores
Using the standard normal probability table with
z = 10/14.6 = .68, we have area = (.2518)(2) = .5036.
The probability is .5036 that the sample mean will be
within +/-10 of the actual population mean.
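The standard-error calculation and the probability above can be sketched numerically; using the unrounded z gives roughly .506, in line with the table value .5036 obtained with z rounded to .68.

```python
import math

sigma, n = 80, 30
se = sigma / math.sqrt(n)             # standard error of the mean, about 14.61
z = 10 / se                            # about 0.68
prob = math.erf(z / math.sqrt(2))      # P(-z <= Z <= z) for a standard normal
print(round(se, 1), round(prob, 3))
```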
Example: St. Andrew’s

 Sampling Distribution of x̄ for the SAT Scores
(Figure: sampling distribution of x̄ centered at 990;
shaded area between 980 and 1000 = 2(.2518) = .5036)
Sampling Distribution of p̄

 The sampling distribution of p̄ is the probability
distribution of all possible values of the sample
proportion p̄.
 Expected Value of p̄
E(p̄) = p
where:
p = the population proportion
Sampling Distribution of p̄

 Standard Deviation of p̄
Finite Population: σp̄ = √(p(1 − p)/n)·√((N − n)/(N − 1))
Infinite Population: σp̄ = √(p(1 − p)/n)
• σp̄ is referred to as the standard error of the
proportion.
Sampling Distribution of p̄

 The sampling distribution of p̄ can be approximated
by a normal probability distribution whenever the
sample size is large.
 The sample size is considered large whenever these
conditions are satisfied:

np > 5
and
n(1 – p) > 5
Sampling Distribution of p̄

 For values of p near .50, sample sizes as small as 10


permit a normal approximation.
 With very small (approaching 0) or large
(approaching 1) values of p, much larger samples are
needed.
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
The normal probability distribution is an
acceptable approximation because:
np = 30(.72) = 21.6 > 5
and
n(1 - p) = 30(.28) = 8.4 > 5.
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
E(p̄) = p = .72
σp̄ = √( .72(1 − .72)/30 ) = .082
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
What is the probability that a simple random
sample of 30 applicants will provide an estimate of
the population proportion of applicants desiring on-
campus housing that is within plus or minus .05 of
the actual population proportion?
In other words, what is the probability that p̄
will be between .67 and .77?
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
(Figure: sampling distribution of p̄ centered at .72;
shaded area between .67 and .77 = ?)
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
For z = .05/.082 = .61, the area = (.2291)(2) = .4582.
The probability is .4582 that the sample proportion
will be within +/-.05 of the actual population
proportion.
Example: St. Andrew’s

 Sampling Distribution of p̄ for On-Campus Housing
(Figure: sampling distribution of p̄ centered at .72;
shaded area between .67 and .77 = 2(.2291) = .4582)
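The same calculation for the sample proportion: standard error √(p(1 − p)/n) and the normal probability of landing within ±.05 of p = .72.

```python
import math

p, n = 0.72, 30
se = math.sqrt(p * (1 - p) / n)       # standard error of the proportion, about .082
z = 0.05 / se                          # about .61
prob = math.erf(z / math.sqrt(2))      # P(-z <= Z <= z), near the table value .4582
print(round(se, 3), round(prob, 3))
```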
Sampling Methods

 Stratified Random Sampling


 Cluster Sampling
 Systematic Sampling
 Convenience Sampling
 Judgment Sampling
Stratified Random Sampling

 The population is first divided into groups of


elements called strata.
 Each element in the population belongs to one and
only one stratum.
 Best results are obtained when the elements within
each stratum are as much alike as possible (i.e.
homogeneous group).
 A simple random sample is taken from each stratum.
 Formulas are available for combining the stratum
sample results into one population parameter
estimate.
Stratified Random Sampling

 Advantage: If strata are homogeneous, this method


is as “precise” as simple random sampling but with a
smaller total sample size.
 Example: The basis for forming the strata might be
department, location, age, industry type, etc.
Cluster Sampling

 The population is first divided into separate groups


of elements called clusters.
 Ideally, each cluster is a representative small-scale
version of the population (i.e. heterogeneous group).
 A simple random sample of the clusters is then taken.
 All elements within each sampled (chosen) cluster
form the sample.
… continued
Cluster Sampling

 Advantage: The close proximity of elements can be


cost effective (i.e., many sample observations can be
obtained in a short time).
 Disadvantage: This method generally requires a
larger total sample size than simple or stratified
random sampling.
 Example: A primary application is area sampling,
where clusters are city blocks or other well-defined
areas.
Systematic Sampling

 If a sample size of n is desired from a population
containing N elements, we might sample one element
for every N/n elements in the population.
 We randomly select one of the first N/n elements
from the population list.
 We then select every (N/n)th element that follows in
the population list.
 This method has the properties of a simple random
sample, especially if the list of the population
elements is a random ordering.
… continued
Systematic Sampling

 Advantage: The sample usually will be easier to


identify than it would be if simple random sampling
were used.
 Example: Selecting every 100th listing in a telephone
book after the first randomly selected listing.
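The systematic-sampling procedure can be sketched as follows, assuming N is a multiple of n so the interval k = N/n is a whole number:

```python
import random

random.seed(7)                        # fixed seed for a reproducible illustration
N, n = 900, 30
k = N // n                             # sampling interval: one element every k = 30
start = random.randrange(1, k + 1)     # random start among the first k elements
sample = list(range(start, N + 1, k))  # then every kth element that follows
print(len(sample))                     # 30 elements, evenly spaced through the list
```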
Convenience Sampling

 It is a nonprobability sampling technique. Items are


included in the sample without known probabilities
of being selected.
 The sample is identified primarily by convenience.
 Advantage: Sample selection and data collection are
relatively easy.
 Disadvantage: It is impossible to determine how
representative of the population the sample is.
 Example: A professor conducting research might use
student volunteers to constitute a sample.
Judgment Sampling

 The person most knowledgeable on the subject of the


study selects elements of the population that he or
she feels are most representative of the population.
 It is a nonprobability sampling technique.
 Advantage: It is a relatively easy way of selecting a
sample.
 Disadvantage: The quality of the sample results
depends on the judgment of the person selecting the
sample.
 Example: A reporter might sample three or four
senators, judging them as reflecting the general
opinion of the senate.
Chapter 9
Hypothesis Testing
 Developing Null and Alternative Hypotheses
 Type I and Type II Errors
 One-Tailed Tests About a Population Mean:
Large-Sample Case
 Two-Tailed Tests About a Population Mean:
Large-Sample Case
 Tests About a Population Mean:
Small-Sample Case
 Tests About a Population Proportion
Developing Null and Alternative Hypotheses

 Hypothesis testing can be used to determine whether


a statement about the value of a population
parameter should or should not be rejected.
 The null hypothesis, denoted by H0 , is a tentative
assumption about a population parameter.
 The alternative hypothesis, denoted by Ha, is the
opposite of what is stated in the null hypothesis.
Developing Null and Alternative Hypotheses

 Testing Research Hypotheses


• Hypothesis testing is proof by contradiction.
• The research hypothesis should be expressed as
the alternative hypothesis.
• The conclusion that the research hypothesis is true
comes from sample data that contradict the null
hypothesis.
Developing Null and Alternative Hypotheses

 Testing the Validity of a Claim


• Manufacturers’ claims are usually given the
benefit of the doubt and stated as the null
hypothesis.
• The conclusion that the claim is false comes from
sample data that contradict the null hypothesis.
Developing Null and Alternative Hypotheses

 Testing in Decision-Making Situations


• A decision maker might have to choose between
two courses of action, one associated with the null
hypothesis and another associated with the
alternative hypothesis.
• Example: Accepting a shipment of goods from a
supplier or returning the shipment of goods to the
supplier.
A Summary of Forms for Null and Alternative
Hypotheses about a Population Mean
 The equality part of the hypotheses always appears
in the null hypothesis.
 In general, a hypothesis test about the value of a
population mean  must take one of the following
three forms (where 0 is the hypothesized value of
the population mean).
H0:  > 0 H0:  < 0 H0:  = 0
Ha:  < 0 Ha:  > 0 H :  
a 0

One-tailed One-tailed Two-tailed

Example: Metro EMS

 Null and Alternative Hypotheses


A major west coast city provides one of the most
comprehensive emergency medical services in the
world. Operating in a multiple hospital system with
approximately 20 mobile medical units, the service
goal is to respond to medical emergencies with a
mean time of 12 minutes or less.
The director of medical services wants to
formulate a hypothesis test that could use a sample of
emergency response times to determine whether or
not the service goal of 12 minutes or less is being
achieved.

Example: Metro EMS

 Null and Alternative Hypotheses


Hypotheses        Conclusion and Action
H0: μ ≤ 12        The emergency service is meeting
                  the response goal; no follow-up
                  action is necessary.
Ha: μ > 12        The emergency service is not
                  meeting the response goal;
                  appropriate follow-up action is
                  necessary.
where: μ = mean response time for the population
           of medical emergency requests.

Type I and Type II Errors

 Since hypothesis tests are based on sample data, we


must allow for the possibility of errors.
 A Type I error is rejecting H0 when it is true.
 The person conducting the hypothesis test specifies
the maximum allowable probability of making a
Type I error, denoted by α and called the level of
significance.

Type I and Type II Errors

 A Type II error is accepting H0 when it is false.


 Generally, we cannot control for the probability of
making a Type II error, denoted by β.
 Statisticians avoid the risk of making a Type II error
by concluding “do not reject H0” rather than “accept H0”.

Example: Metro EMS

 Type I and Type II Errors

                        Population Condition
                     H0 True (μ ≤ 12)   Ha True (μ > 12)
Conclusion

Accept H0            Correct            Type II
(Conclude μ ≤ 12)    Conclusion         Error

Reject H0            Type I             Correct
(Conclude μ > 12)    Error              Conclusion

Using the Test Statistic

 The test statistic z has a standard normal probability


distribution.
 We can use the standard normal probability
distribution table to find the z-value with an area of α
in the lower (or upper) tail of the distribution.
 The value of the test statistic that establishes the
boundary of the rejection region is called the critical
value for the test.
 The rejection rule is:
• Lower tail: Reject H0 if z < -zα.
• Upper tail: Reject H0 if z > zα.

Using the p-Value

 The p-value is the probability of obtaining a sample


result that is at least as unlikely as what is observed.
 The p-value can be used to make the decision in a
hypothesis test by noting that:
• if the p-value is less than the level of significance
α, the value of the test statistic is in the rejection
region.
• if the p-value is greater than or equal to α, the
value of the test statistic is not in the rejection
region.
 Reject H0 if the p-value < α.

Steps of Hypothesis Testing

1. Determine the null and alternative hypotheses.


2. Specify the level of significance α.
3. Select the test statistic that will be used to test the
hypothesis.
Using the Test Statistic
4. Use α to determine the critical value for the test
statistic and state the rejection rule for H0.
5. Collect the sample data and compute the value of
the test statistic.
6. Use the value of the test statistic and the rejection
rule to determine whether to reject H0.
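The six steps can be sketched as a small Python helper (illustrative only; the function name and the use of the normal distribution for the large-sample case are our choices, with the Metro EMS figures from the example that follows used as a check):

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_z_test(xbar, mu0, s, n, alpha=0.05):
    """Hypothetical helper, not from the slides: steps 2-6 for an
    upper-tail test about a population mean (large n)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)   # step 4: critical value z_alpha
    z = (xbar - mu0) / (s / sqrt(n))           # step 5: test statistic
    return z, z_crit, z > z_crit               # step 6: apply the rejection rule

# Metro EMS: x-bar = 13.25, mu0 = 12, s = 3.2, n = 40
z, z_crit, reject = upper_tail_z_test(13.25, 12, 3.2, 40)
```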

Steps of Hypothesis Testing

Using the p-Value


4. Collect the sample data and compute the value of
the test statistic.
5. Use the value of the test statistic to compute the p-
value.
6. Reject H0 if p-value < α.

One-Tailed Tests about a Population Mean:
Large-Sample Case (n > 30)
 Hypotheses
H0: μ ≤ μ0          or          H0: μ ≥ μ0
Ha: μ > μ0                      Ha: μ < μ0

 Test Statistic
σ Known                         σ Unknown
z = (x̄ - μ0)/(σ/√n)             z = (x̄ - μ0)/(s/√n)

 Rejection Rule
Reject H0 if z > zα             Reject H0 if z < -zα

Example: Metro EMS

 One-Tailed Test about a Population Mean: Large n


Let  = P(Type I Error) = .05

Sampling distribution
of x (assuming H0 is
true and  = 12) Reject H0

Do Not Reject H0 

1.645 x
x
12 c
(Critical value)

Example: Metro EMS

 One-Tailed Test about a Population Mean: Large n


Let n = 40, x̄ = 13.25 minutes, s = 3.2 minutes
(The sample standard deviation s can be used to
estimate the population standard deviation σ.)

z = (x̄ - μ)/(s/√n) = (13.25 - 12)/(3.2/√40) = 2.47

Since 2.47 > 1.645, we reject H0.
Conclusion: We are 95% confident that Metro EMS
is not meeting the response goal of 12 minutes;
appropriate action should be taken to improve
service.

Example: Metro EMS

 Using the p-value to Test the Hypothesis


Recall that z = 2.47 for x̄ = 13.25. Then p-value = .0068.
Since p-value < α, that is .0068 < .05, we reject H0.
[Figure: the p-value is the upper-tail area beyond
z = 2.47, which lies inside the rejection region
beginning at z = 1.645]
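This p-value can be reproduced with Python's statistics module (a sketch; the tiny difference from the table value .0068 comes from rounding z to 2.47):

```python
from math import sqrt
from statistics import NormalDist

z = (13.25 - 12) / (3.2 / sqrt(40))   # ≈ 2.47
p_value = 1 - NormalDist().cdf(z)     # upper-tail area ≈ .0068
```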

Two-Tailed Tests about a Population Mean:
Large-Sample Case (n > 30)
 Hypotheses
H0: μ = μ0
Ha: μ ≠ μ0

 Test Statistic
σ Known                         σ Unknown
z = (x̄ - μ0)/(σ/√n)             z = (x̄ - μ0)/(s/√n)

 Rejection Rule
Reject H0 if |z| > zα/2

Example: Glow Toothpaste

 Two-Tailed Tests about a Population Mean: Large n


The production line for Glow toothpaste is
designed to fill tubes of toothpaste with a mean
weight of 6 ounces.
Periodically, a sample of 30 tubes will be selected
in order to check the filling process. Quality
assurance procedures call for the continuation of the
filling process if the sample results are consistent with
the assumption that the mean filling weight for the
population of toothpaste tubes is 6 ounces; otherwise
the filling process will be stopped and adjusted.

Example: Glow Toothpaste

 Two-Tailed Tests about a Population Mean: Large n


A hypothesis test about the population mean can
be used to help determine when the filling process
should continue operating and when it should be
stopped and corrected.
• Hypotheses
H0: μ = 6
Ha: μ ≠ 6
• Rejection Rule
Assuming a .05 level of significance,
Reject H0 if z < -1.96 or if z > 1.96

Example: Glow Toothpaste

 Two-Tailed Test about a Population Mean: Large n


[Figure: sampling distribution of x̄ assuming H0 is true
and μ = 6; rejection regions of area α/2 in each tail,
beyond z = -1.96 and z = 1.96]

Example: Glow Toothpaste

 Two-Tailed Test about a Population Mean: Large n


Assume that a sample of 30 toothpaste tubes
provides a sample mean of 6.1 ounces and standard
deviation of 0.2 ounces.
Let n = 30, x̄ = 6.1 ounces, s = .2 ounces

z = (x̄ - μ0)/(s/√n) = (6.1 - 6)/(.2/√30) = 2.74

Since 2.74 > 1.96, we reject H0.

Example: Glow Toothpaste

 Two-Tailed Test about a Population Mean: Large n


Conclusion: We are 95% confident that the mean
filling weight of the toothpaste tubes is not 6
ounces. The filling process should be stopped and
the filling mechanism adjusted.

Example: Glow Toothpaste

 Using the p-Value for a Two-Tailed Hypothesis Test


Suppose we define the p-value for a two-tailed test
as double the area found in the tail of the distribution.
With z = 2.74, the standard normal probability
table shows there is a .5000 - .4969 = .0031 probability
of a difference larger than .1 in the upper tail of the
distribution.
Considering the same probability of a larger
difference in the lower tail of the distribution, we have
p-value = 2(.0031) = .0062
The p-value .0062 is less than α = .05, so H0 is rejected.
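A quick check of the two-tailed p-value in Python (illustrative; the table-based value .0062 differs only by rounding):

```python
from math import sqrt
from statistics import NormalDist

z = (6.1 - 6) / (0.2 / sqrt(30))          # ≈ 2.74
p_value = 2 * (1 - NormalDist().cdf(z))   # two-tailed: double the tail area
```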

Confidence Interval Approach to a
Two-Tailed Test about a Population Mean
 Select a simple random sample from the population
and use the value of the sample mean x̄ to develop
the confidence interval for the population mean .
 If the confidence interval contains the hypothesized
value 0, do not reject H0. Otherwise, reject H0.

Example: Glow Toothpaste

 Confidence Interval Approach to a Two-Tailed


Hypothesis Test
The 95% confidence interval for μ is

x̄ ± zα/2 · s/√n = 6.1 ± 1.96(.2/√30) = 6.1 ± .0716

or 6.0284 to 6.1716
Since the hypothesized value for the population
mean, μ0 = 6, is not in this interval, the hypothesis-
testing conclusion is that the null hypothesis,
H0: μ = 6, can be rejected.

Tests about a Population Mean:
Small-Sample Case (n < 30)
 Test Statistic
σ Known                         σ Unknown
t = (x̄ - μ0)/(σ/√n)             t = (x̄ - μ0)/(s/√n)

This test statistic has a t distribution with n - 1
degrees of freedom.

Tests about a Population Mean:
Small-Sample Case (n < 30)
 Rejection Rule
H0: μ ≤ μ0        Reject H0 if t > tα
H0: μ ≥ μ0        Reject H0 if t < -tα
H0: μ = μ0        Reject H0 if |t| > tα/2

p -Values and the t Distribution

 The format of the t distribution table provided in


most statistics textbooks does not have sufficient
detail to determine the exact p-value for a hypothesis
test.
 However, we can still use the t distribution table to
identify a range for the p-value.
 An advantage of computer software packages is that
the computer output will provide the p-value for the
t distribution.

Example: Highway Patrol

 One-Tailed Test about a Population Mean: Small n


A State Highway Patrol periodically samples
vehicle speeds at various locations on a particular
roadway. The sample of vehicle speeds is used to
test the hypothesis
H0: μ ≤ 65.
The locations where H0 is rejected are deemed the
best locations for radar traps.
At Location F, a sample of 16 vehicles shows a
mean speed of 68.2 mph with a standard deviation of
3.8 mph. Use α = .05 to test the hypothesis.

Example: Highway Patrol

 One-Tailed Test about a Population Mean: Small n

[Figure: t distribution with 15 d.f.; the rejection region
of area α = .05 lies to the right of the critical value
t = 1.753]

Example: Highway Patrol

 One-Tailed Test about a Population Mean: Small n


Let n = 16, x̄ = 68.2 mph, s = 3.8 mph
α = .05, d.f. = 16 - 1 = 15, tα = 1.753

t = (x̄ - μ0)/(s/√n) = (68.2 - 65)/(3.8/√16) = 3.37

Since 3.37 > 1.753, we reject H0.
Conclusion: We are 95% confident that the mean
speed of vehicles at Location F is greater than 65
mph. Location F is a good candidate for a radar trap.
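The t statistic is easy to verify in Python; since the standard library has no t distribution, the critical value 1.753 is taken from a t table (15 d.f., α = .05), as in the slides:

```python
from math import sqrt

t = (68.2 - 65) / (3.8 / sqrt(16))   # one-sample t statistic
t_crit = 1.753                       # t.05 with 15 d.f., from a t table
reject = t > t_crit
```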

Summary of Test Statistics to be Used in a
Hypothesis Test about a Population Mean
[Decision tree]
n > 30?
  Yes: σ known?
    Yes: z = (x̄ - μ0)/(σ/√n)
    No: use s to estimate σ; z = (x̄ - μ0)/(s/√n)
  No: population approximately normal?
    Yes: σ known?
      Yes: z = (x̄ - μ0)/(σ/√n)
      No: use s to estimate σ; t = (x̄ - μ0)/(s/√n)
    No: increase n to > 30

A Summary of Forms for Null and Alternative
Hypotheses about a Population Proportion
 The equality part of the hypotheses always appears
in the null hypothesis.
 In general, a hypothesis test about the value of a
population proportion p must take one of the
following three forms (where p0 is the hypothesized
value of the population proportion).
H0: p ≥ p0        H0: p ≤ p0        H0: p = p0
Ha: p < p0        Ha: p > p0        Ha: p ≠ p0
One-tailed        One-tailed        Two-tailed

Tests about a Population Proportion:
Large-Sample Case (np ≥ 5 and n(1 - p) ≥ 5)
 Test Statistic

z = (p̄ - p0)/σp̄

where:

σp̄ = √( p0(1 - p0)/n )

Tests about a Population Proportion:
Large-Sample Case (np ≥ 5 and n(1 - p) ≥ 5)
 Rejection Rule
H0: p ≤ p0        Reject H0 if z > zα
H0: p ≥ p0        Reject H0 if z < -zα
H0: p = p0        Reject H0 if |z| > zα/2

Example: NSC

 Two-Tailed Test about a Population Proportion:


Large n
For a Christmas and New Year’s week, the
National Safety Council estimated that 500 people
would be killed and 25,000 injured on the nation’s
roads. The NSC claimed that 50% of the accidents
would be caused by drunk driving.
A sample of 120 accidents showed that 67 were
caused by drunk driving. Use these data to test the
NSC’s claim with α = .05.

Example: NSC

 Two-Tailed Test about a Population Proportion:


Large n
• Hypotheses
H0: p = .5
Ha: p ≠ .5
• Test Statistic

σp̄ = √( p0(1 - p0)/n ) = √( .5(1 - .5)/120 ) = .045644

z = (p̄ - p0)/σp̄ = ((67/120) - .5)/.045644 = 1.278

Example: NSC

 Two-Tailed Test about a Population Proportion:


Large n
• Rejection Rule
Reject H0 if z < -1.96 or z > 1.96
• Conclusion
Do not reject H0.
For z = 1.278, the p-value is .201. If we reject
H0, we exceed the maximum allowed risk of
committing a Type I error (p-value > .050).
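The NSC test can be replayed in Python (illustrative; the two-tailed p-value uses the standard normal CDF from the statistics module):

```python
from math import sqrt
from statistics import NormalDist

p0, n, successes = 0.5, 120, 67
sigma_pbar = sqrt(p0 * (1 - p0) / n)           # ≈ .045644
z = (successes / n - p0) / sigma_pbar          # ≈ 1.278
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-tailed ≈ .201
```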

Chapter 10
Comparisons Involving Means
 Estimation of the Difference between the Means of
Two Populations: Independent Samples
 Hypothesis Tests about the Difference between the
Means of Two Populations: Independent Samples
 Inferences about the Difference between the Means
of Two Populations: Matched Samples
 Introduction to Analysis of Variance (ANOVA)
 ANOVA: Testing for the Equality of k Population
Means

2 ?
1 =
ANOVA
© 2003 Thomson/South-Western Slide
238
Estimation of the Difference Between the Means
of Two Populations: Independent Samples
 Point Estimator of the Difference between the Means
of Two Populations
 Sampling Distribution of x̄1 - x̄2
 Interval Estimate of μ1 - μ2: Large-Sample Case
 Interval Estimate of μ1 - μ2: Small-Sample Case

Point Estimator of the Difference Between
the Means of Two Populations
 Let 1 equal the mean of population 1 and 2 equal
the mean of population 2.
 The difference between the two population means is
1 - 2.
 To estimate 1 - 2, we will select a simple random
sample of size n1 from population 1 and a simple
random sample of size n2 from population 2.
x1 x2
 Let equal the mean of sample 1 and equal the
mean of sample 2.
 The point estimator of the difference between the
x1  x2
means of the populations 1 and 2 is .

Sampling Distribution of x̄1 - x̄2

 Properties of the Sampling Distribution of x̄1 - x̄2

• Expected Value
E(x̄1 - x̄2) = μ1 - μ2

Sampling Distribution of x̄1 - x̄2

 Properties of the Sampling Distribution of x̄1 - x̄2

• Standard Deviation

σx̄1-x̄2 = √( σ1²/n1 + σ2²/n2 )

where: σ1 = standard deviation of population 1
       σ2 = standard deviation of population 2
       n1 = sample size from population 1
       n2 = sample size from population 2

Interval Estimate of 1 - 2:
Large-Sample Case (n1 > 30 and n2 > 30)
 Interval Estimate with 1 and 2 Known

x1  x2  z / 2  x1  x2

where:
1 -  is the confidence coefficient

Interval Estimate of 1 - 2:
Large-Sample Case (n1 > 30 and n2 > 30)
 Interval Estimate with 1 and 2 Unknown
x1  x2  z / 2 sx1  x2

where:

s12 s22
sx1  x2  
n1 n2

Example: Par, Inc.

 Interval Estimate of 1 - 2: Large-Sample Case


Par, Inc. is a manufacturer of golf equipment and
has developed a new golf ball that has been designed
to provide “extra distance.” In a test of driving
distance using a mechanical driving device, a sample of
Par golf balls was compared with a sample of golf balls
made by Rap, Ltd., a competitor.
The sample statistics appear on the next slide.

Example: Par, Inc.

 Interval Estimate of 1 - 2: Large-Sample Case


• Sample Statistics
Sample #1 Sample #2
Par, Inc. Rap, Ltd.
Sample Size n1 = 120 balls n2 = 80 balls
Mean = 235xyards
1 = 218xyards
2

Standard Dev. s1 = 15 yards s2 = 20 yards

Example: Par, Inc.

 Point Estimate of the Difference Between Two


Population Means
μ1 = mean distance for the population of
Par, Inc. golf balls
μ2 = mean distance for the population of
Rap, Ltd. golf balls
Point estimate of μ1 - μ2 = x̄1 - x̄2 = 235 - 218 = 17 yards.

Point Estimator of the Difference Between
the Means of Two Populations

Population 1                   Population 2
Par, Inc. Golf Balls           Rap, Ltd. Golf Balls
μ1 = mean driving              μ2 = mean driving
     distance of Par                distance of Rap
     golf balls                     golf balls

μ1 - μ2 = difference between the mean distances

Simple random sample           Simple random sample
of n1 Par golf balls           of n2 Rap golf balls
x̄1 = sample mean distance      x̄2 = sample mean distance
     for the Par sample             for the Rap sample

x̄1 - x̄2 = Point Estimate of μ1 - μ2

Example: Par, Inc.

 95% Confidence Interval Estimate of the Difference
Between Two Population Means: Large-Sample Case,
σ1 and σ2 Unknown
Substituting the sample standard deviations for the
population standard deviations:

x̄1 - x̄2 ± zα/2 √( s1²/n1 + s2²/n2 ) = 17 ± 1.96 √( (15)²/120 + (20)²/80 )

= 17 ± 5.14 or 11.86 yards to 22.14 yards.


We are 95% confident that the difference between the
mean driving distances of Par, Inc. balls and Rap, Ltd.
balls lies in the interval of 11.86 to 22.14 yards.
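A short Python check of the interval (a sketch of the computation above, not part of the text):

```python
from math import sqrt

# 17-yard point estimate plus/minus z.025 times the estimated standard error
margin = 1.96 * sqrt(15**2 / 120 + 20**2 / 80)
lo, hi = 17 - margin, 17 + margin   # ≈ 11.86 to 22.14 yards
```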

Interval Estimate of 1 - 2:
Small-Sample Case (n1 < 30 and/or n2 < 30)
 Interval Estimate with  2 Known

x1  x2  z / 2  x1  x2
where:
1 1
2
 x1  x2   (  )
n1 n2

Interval Estimate of 1 - 2:
Small-Sample Case (n1 < 30 and/or n2 < 30)
 Interval Estimate with  2 Unknown
x1  x2  t / 2 sx1  x2
where:
2 2
2 1 1 ( n  1) s  ( n  1) s
sx1  x2  s (  ) s2  1 1 2 2
n1 n2 n1  n2  2

Example: Specific Motors

Specific Motors of Detroit has developed a new


automobile known as the M car. 12 M cars and 8 J cars
(from Japan) were road tested to compare miles-per-
gallon (mpg) performance. The sample statistics are:
                     Sample #1        Sample #2
                     M Cars           J Cars
Sample Size          n1 = 12 cars     n2 = 8 cars
Mean                 x̄1 = 29.8 mpg    x̄2 = 27.3 mpg
Standard Deviation   s1 = 2.56 mpg    s2 = 1.81 mpg

Example: Specific Motors

 Point Estimate of the Difference Between Two


Population Means
μ1 = mean miles-per-gallon for the population of
M cars
μ2 = mean miles-per-gallon for the population of
J cars
Point estimate of μ1 - μ2 = x̄1 - x̄2 = 29.8 - 27.3 = 2.5 mpg.

Example: Specific Motors

 95% Confidence Interval Estimate of the Difference


Between Two Population Means: Small-Sample Case
We will make the following assumptions:
• The miles per gallon rating must be normally
distributed for both the M car and the J car.
• The variance in the miles per gallon rating must
be the same for both the M car and the J car.

Example: Specific Motors

 95% Confidence Interval Estimate of the Difference


Between Two Population Means: Small-Sample Case
Using the t distribution with n1 + n2 - 2 = 18 degrees
of freedom, the appropriate t value is t.025 = 2.101.
We will use a weighted average of the two sample
variances as the pooled estimator of σ².

Example: Specific Motors

 95% Confidence Interval Estimate of the Difference


Between Two Population Means: Small-Sample Case
s² = ( (n1 - 1)s1² + (n2 - 1)s2² ) / (n1 + n2 - 2)
   = ( 11(2.56)² + 7(1.81)² ) / (12 + 8 - 2) = 5.28

x̄1 - x̄2 ± t.025 √( s²(1/n1 + 1/n2) ) = 2.5 ± 2.101 √( 5.28(1/12 + 1/8) )

= 2.5 ± 2.2 or .3 to 4.7 miles per gallon.
We are 95% confident that the difference between the
mean mpg ratings of the two car types is from .3 to
4.7 mpg (with the M car having the higher mpg).
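The pooled-variance interval can be verified in Python; the critical value 2.101 is t.025 with 18 d.f., taken from a t table as in the slides:

```python
from math import sqrt

n1, n2, s1, s2 = 12, 8, 2.56, 1.81
# pooled estimate of the common variance
s2_pooled = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
margin = 2.101 * sqrt(s2_pooled * (1 / n1 + 1 / n2))
lo, hi = 2.5 - margin, 2.5 + margin   # ≈ .3 to 4.7 mpg
```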

Hypothesis Tests About the Difference
between the Means of Two Populations:
Independent Samples
 Hypotheses
H0: μ1 - μ2 ≤ 0     H0: μ1 - μ2 ≥ 0     H0: μ1 - μ2 = 0
Ha: μ1 - μ2 > 0     Ha: μ1 - μ2 < 0     Ha: μ1 - μ2 ≠ 0

 Test Statistic
Large-Sample:  z = ((x̄1 - x̄2) - (μ1 - μ2)) / √( σ1²/n1 + σ2²/n2 )
Small-Sample:  t = ((x̄1 - x̄2) - (μ1 - μ2)) / √( s²(1/n1 + 1/n2) )

Example: Par, Inc.

 Hypothesis Tests About the Difference between the


Means of Two Populations: Large-Sample Case
Par, Inc. is a manufacturer of golf equipment and has
developed a new golf ball that has been designed to
provide “extra distance.” In a test of driving distance
using a mechanical driving device, a sample of Par
golf balls was compared with a sample of golf balls
made by Rap, Ltd., a competitor. The sample
statistics appear on the next slide.

Example: Par, Inc.

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Large-Sample Case
• Sample Statistics
                  Sample #1         Sample #2
                  Par, Inc.         Rap, Ltd.
Sample Size       n1 = 120 balls    n2 = 80 balls
Mean              x̄1 = 235 yards    x̄2 = 218 yards
Standard Dev.     s1 = 15 yards     s2 = 20 yards

Example: Par, Inc.

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Large-Sample Case
Can we conclude, using a .01 level of significance,
that the mean driving distance of Par, Inc. golf balls is
greater than the mean driving distance of Rap, Ltd.
golf balls?

Example: Par, Inc.

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Large-Sample Case
μ1 = mean distance for the population of Par, Inc.
golf balls
μ2 = mean distance for the population of Rap, Ltd.
golf balls
• Hypotheses    H0: μ1 - μ2 ≤ 0
                Ha: μ1 - μ2 > 0

Example: Par, Inc.

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Large-Sample Case
• Rejection Rule
Reject H0 if z > 2.33
• Test Statistic

z = ((x̄1 - x̄2) - (μ1 - μ2)) / √( s1²/n1 + s2²/n2 )
  = ((235 - 218) - 0) / √( (15)²/120 + (20)²/80 )
  = 17/2.62 = 6.49

Example: Par, Inc.

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Large-Sample Case
• Conclusion

Reject H0. We are at least 99% confident that the


mean driving distance of Par, Inc. golf balls is
greater than the mean driving distance of Rap, Ltd.
golf balls.
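A Python check of the test statistic (illustrative; the slides report 6.49 because they round the denominator to 2.62, while the unrounded value is about 6.48):

```python
from math import sqrt

# two-sample z statistic under H0: mu1 - mu2 = 0
z = ((235 - 218) - 0) / sqrt(15**2 / 120 + 20**2 / 80)
reject = z > 2.33   # z.01 critical value from the normal table
```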

Example: Specific Motors

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Small-Sample Case
Can we conclude, using a .05 level of significance,
that the miles-per-gallon (mpg) performance of M
cars is greater than the miles-per-gallon performance
of J cars?

Example: Specific Motors

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Small-Sample Case
μ1 = mean mpg for the population of M cars
μ2 = mean mpg for the population of J cars
• Hypotheses    H0: μ1 - μ2 ≤ 0
                Ha: μ1 - μ2 > 0

Example: Specific Motors

 Hypothesis Tests About the Difference Between the


Means of Two Populations: Small-Sample Case
• Rejection Rule
Reject H0 if t > 1.734
(α = .05, d.f. = 18)
• Test Statistic

t = ((x̄1 - x̄2) - (μ1 - μ2)) / √( s²(1/n1 + 1/n2) )

where:

s² = ( (n1 - 1)s1² + (n2 - 1)s2² ) / (n1 + n2 - 2)

Inference About the Difference between the
Means of Two Populations: Matched Samples
 With a matched-sample design each sampled item
provides a pair of data values.
 The matched-sample design can be referred to as
blocking.
 This design often leads to a smaller sampling error
than the independent-sample design because
variation between sampled items is eliminated as a
source of sampling error.

Example: Express Deliveries

 Inference About the Difference between the Means of


Two Populations: Matched Samples
A Chicago-based firm has documents that must
be quickly distributed to district offices throughout
the U.S. The firm must decide between two delivery
services, UPX (United Parcel Express) and INTEX
(International Express), to transport its documents.
In testing the delivery times of the two services, the
firm sent two reports to a random sample of ten
district offices with one report carried by UPX and
the other report carried by INTEX.
Do the data that follow indicate a difference in
mean delivery times for the two services?

Example: Express Deliveries

Delivery Time (Hours)


District Office    UPX    INTEX    Difference
Seattle            32     25         7
Los Angeles        30     24         6
Boston             19     15         4
Cleveland          16     15         1
New York           15     13         2
Houston            18     15         3
Atlanta            14     15        -1
St. Louis          10      8         2
Milwaukee           7      9        -2
Denver             16     11         5

Example: Express Deliveries

 Inference About the Difference between the Means of


Two Populations: Matched Samples
Let d = the mean of the difference values for the
two delivery services for the population of
district offices

• Hypotheses
H0: d = 0, Ha: d 

Example: Express Deliveries

 Inference About the Difference between the Means of


Two Populations: Matched Samples
• Rejection Rule
Assuming the population of difference values is
approximately normally distributed, the t
distribution with n - 1 degrees of freedom applies.
With  = .05, t.025 = 2.262 (9 degrees of freedom).
Reject H0 if t < -2.262 or if t > 2.262

Example: Express Deliveries

 Inference About the Difference between the Means of


Two Populations: Matched Samples
d̄ = Σdi/n = (7 + 6 + ... + 5)/10 = 2.7

sd = √( Σ(di - d̄)²/(n - 1) ) = √( 76.1/9 ) = 2.9

t = (d̄ - μd)/(sd/√n) = (2.7 - 0)/(2.9/√10) = 2.94

Example: Express Deliveries

 Inference About the Difference between the Means of


Two Populations: Matched Samples
• Conclusion
Reject H0.
There is a significant difference between the mean
delivery times for the two services.
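The paired-difference computation can be verified directly from the table of differences (illustrative Python, using the sample standard deviation from the statistics module):

```python
from math import sqrt
from statistics import mean, stdev

d = [7, 6, 4, 1, 2, 3, -1, 2, -2, 5]   # UPX minus INTEX, per district office
dbar = mean(d)                         # 2.7
sd = stdev(d)                          # ≈ 2.9
t = (dbar - 0) / (sd / sqrt(len(d)))   # ≈ 2.94
reject = abs(t) > 2.262                # t.025 with 9 d.f., from a t table
```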

Introduction to Analysis of Variance

 Analysis of Variance (ANOVA) can be used to test


for the equality of three or more population means
using data obtained from observational or
experimental studies.
 We want to use the sample results to test the
following hypotheses.
H0: 1=2=3=. . .= k
Ha: Not all population means are equal

Introduction to Analysis of Variance

 If H0 is rejected, we cannot conclude that all


population means are different.
 Rejecting H0 means that at least two population
means have different values.

Assumptions for Analysis of Variance

 For each population, the response variable is


normally distributed.
 The variance of the response variable, denoted σ², is
the same for all of the populations.
 The observations must be independent.

Analysis of Variance:
Testing for the Equality of k Population Means
 Between-Treatments Estimate of Population Variance
 Within-Treatments Estimate of Population Variance
 Comparing the Variance Estimates: The F Test
 The ANOVA Table

Between-Treatments Estimate
of Population Variance
 A between-treatments estimate of σ² is called the
mean square treatment and is denoted MSTR.

MSTR = Σ nj(x̄j - x̿)² / (k - 1)    (sum over j = 1, ..., k)

 The numerator of MSTR is called the sum of squares
treatment and is denoted SSTR.
 The denominator of MSTR represents the degrees of
freedom associated with SSTR.

Within-Samples Estimate
of Population Variance
 The estimate of  2 based on the variation of the
sample observations within each sample is called the
mean square error and is denoted by MSE.

 j
( n
j 1
 1) s 2
j

MSE 
nT  k

 The numerator of MSE is called the sum of squares


error and is denoted by SSE.
 The denominator of MSE represents the degrees of
freedom associated with SSE.

Comparing the Variance Estimates: The F Test

 If the null hypothesis is true and the ANOVA


assumptions are valid, the sampling distribution of
MSTR/MSE is an F distribution with MSTR d.f. equal
to k - 1 and MSE d.f. equal to nT - k.
 If the means of the k populations are not equal, the
value of MSTR/MSE will be inflated because MSTR
overestimates σ².
 Hence, we will reject H0 if the resulting value of
MSTR/MSE appears to be too large to have been
selected at random from the appropriate F
distribution.

Test for the Equality of k Population Means

 Hypotheses
H0: μ1 = μ2 = μ3 = . . . = μk
Ha: Not all population means are equal
 Test Statistic
F = MSTR/MSE
 Rejection Rule
Reject H0 if F > Fα
where the value of Fα is based on an F distribution
with k - 1 numerator degrees of freedom and nT - k
denominator degrees of freedom.

Sampling Distribution of MSTR/MSE

 The figure below shows the rejection region
associated with a level of significance equal to α,
where Fα denotes the critical value.
[Figure: F distribution of MSTR/MSE; the rejection
region of area α lies to the right of the critical
value Fα]

ANOVA Table

Source of    Sum of     Degrees of    Mean
Variation    Squares    Freedom       Square    F
Treatment    SSTR       k - 1         MSTR      MSTR/MSE
Error        SSE        nT - k        MSE
Total        SST        nT - 1

SST divided by its degrees of freedom nT - 1 is simply
the overall sample variance that would be obtained if
we treated the entire nT observations as one data set.

SST = Σj Σi (xij - x̿)² = SSTR + SSE

Example: Reed Manufacturing

 Analysis of Variance
J. R. Reed would like to know if the mean number
of hours worked per week is the same for the
department managers at her three manufacturing
plants (Buffalo, Pittsburgh, and Detroit).
A simple random sample of 5 managers from
each of the three plants was taken and the number of
hours worked by each manager for the previous
week is shown on the next slide.

Example: Reed Manufacturing

 Analysis of Variance
                  Plant 1    Plant 2       Plant 3
Observation       Buffalo    Pittsburgh    Detroit
1                 48         73            51
2                 54         63            63
3                 57         66            61
4                 54         64            54
5                 62         74            56
Sample Mean       55         68            57
Sample Variance   26.0       26.5          24.5

Example: Reed Manufacturing

 Analysis of Variance
• Hypotheses
H0: 1=2=3
Ha: Not all the means are equal
where:
1 = mean number of hours worked per
week by the managers at Plant 1
2 = mean number of hours worked per
week by the managers at Plant 2
3 = mean number of hours worked per
week by the managers at Plant 3

Example: Reed Manufacturing

 Analysis of Variance
• Mean Square Treatment
Since the sample sizes are all equal:
x̿ = (55 + 68 + 57)/3 = 60
SSTR = 5(55 - 60)² + 5(68 - 60)² + 5(57 - 60)² = 490
MSTR = 490/(3 - 1) = 245
• Mean Square Error
SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308
MSE = 308/(15 - 3) = 25.667
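The whole ANOVA computation for equal sample sizes fits in a few lines of Python (a sketch using the summary statistics above, not part of the slides):

```python
means = [55, 68, 57]             # sample means, 5 managers per plant
variances = [26.0, 26.5, 24.5]   # sample variances
n, k = 5, 3                      # observations per plant, number of plants

grand_mean = sum(means) / k      # valid because the sample sizes are equal
sstr = sum(n * (m - grand_mean)**2 for m in means)   # 490
mstr = sstr / (k - 1)                                # 245
sse = sum((n - 1) * v for v in variances)            # 308
mse = sse / (n * k - k)                              # 25.667
f_stat = mstr / mse                                  # ≈ 9.55
```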

Example: Reed Manufacturing

 Analysis of Variance
• F - Test
If H0 is true, the ratio MSTR/MSE should be near
1 since both MSTR and MSE are estimating σ². If
Ha is true, the ratio should be significantly larger
than 1 since MSTR tends to overestimate σ².

© 2003 Thomson/South-Western Slide


288
Example: Reed Manufacturing

 Analysis of Variance
• Rejection Rule
Assuming α = .05, F.05 = 3.89 (2 d.f. numerator,
12 d.f. denominator). Reject H0 if F > 3.89
• Test Statistic
F = MSTR/MSE = 245/25.667 = 9.55

© 2003 Thomson/South-Western Slide


289
Example: Reed Manufacturing

 Analysis of Variance
• ANOVA Table

Source of Sum of Degrees of Mean


Variation Squares Freedom Square F
Treatments 490 2 245 9.55
Error 308 12 25.667
Total 798 14

© 2003 Thomson/South-Western Slide


290
Example: Reed Manufacturing

 Analysis of Variance
• Conclusion
F = 9.55 > F.05 = 3.89, so we reject H0. The mean
number of hours worked per week by department
managers is not the same at each plant.

© 2003 Thomson/South-Western Slide


291
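The Reed Manufacturing ANOVA above can be reproduced by hand with a short script. This is a sketch using the plant data transcribed from the slides, not the deck's own software output:

```python
# One-way ANOVA for the Reed Manufacturing example (data from the slides).
buffalo    = [48, 54, 57, 54, 62]
pittsburgh = [73, 63, 66, 64, 74]
detroit    = [51, 63, 61, 54, 56]
groups = [buffalo, pittsburgh, detroit]

k = len(groups)                        # number of treatments
nT = sum(len(g) for g in groups)       # total observations
grand_mean = sum(sum(g) for g in groups) / nT

means = [sum(g) / len(g) for g in groups]
# Between-treatments sum of squares
sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
# Within-treatments (error) sum of squares
sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

mstr = sstr / (k - 1)
mse = sse / (nT - k)
F = mstr / mse
print(sstr, sse, round(F, 2))   # 490.0 308.0 9.55
```

The computed F = 9.55 matches the ANOVA table on the next slides.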
Chapter 11
Comparisons Involving Proportions
and a Test of Independence
 Inference about the Difference Between the
Proportions of Two Populations
 A Hypothesis Test for Proportions of a
Multinomial Population
 Test of Independence: Contingency Tables

H0: p1 - p2 = 0
Ha: p1 - p2 ≠ 0

© 2003 Thomson/South-Western Slide


292
Inferences About the Difference between the
Proportions of Two Populations
 Sampling Distribution of p̄1 - p̄2
 Interval Estimation of p1 - p2
 Hypothesis Tests about p1 - p2

© 2003 Thomson/South-Western Slide


293
Sampling Distribution of p̄1 - p̄2

 Expected Value
E(p̄1 - p̄2) = p1 - p2

 Standard Deviation

σ_p̄1-p̄2 = √( p1(1 - p1)/n1 + p2(1 - p2)/n2 )

where: n1 = size of sample taken from population 1


n2 = size of sample taken from population 2

© 2003 Thomson/South-Western Slide


294
Sampling Distribution of p̄1 - p̄2

 Distribution Form
If the sample sizes are large (n1p1, n1(1 - p1), n2p2,
and n2(1 - p2) are all greater than or equal to 5), the
sampling distribution of p̄1 - p̄2 can be approximated
by a normal probability distribution.

© 2003 Thomson/South-Western Slide


295
Sampling Distribution of p̄1 - p̄2

p1 (1  p1 ) p2 (1  p2 )
 p1  p2  
n1 n2

p1  p2
p1 – p2

© 2003 Thomson/South-Western Slide


296
Interval Estimation of p1 - p2

 Interval Estimate

p̄1 - p̄2 ± zα/2 σ_p̄1-p̄2

 Point Estimator of σ_p̄1-p̄2

s_p̄1-p̄2 = √( p̄1(1 - p̄1)/n1 + p̄2(1 - p̄2)/n2 )

© 2003 Thomson/South-Western Slide


297
Example: MRA

MRA (Market Research Associates) is conducting


research to evaluate the effectiveness of a client’s new
advertising campaign. Before the new campaign
began, a telephone survey of 150 households in the test
market area showed 60 households “aware” of the
client’s product. The new campaign has been initiated
with TV and newspaper advertisements running for
three weeks.

© 2003 Thomson/South-Western Slide


298
Example: MRA

A survey conducted immediately after the new


campaign showed 120 of 250 households “aware” of
the client’s product.
Does the data support the position that the
advertising campaign has provided an increased
awareness of the client’s product?

© 2003 Thomson/South-Western Slide


299
Example: MRA

 Point Estimator of the Difference Between the


Proportions of Two Populations
p̄1 - p̄2 = 120/250 - 60/150 = .48 - .40 = .08

p1 = proportion of the population of households
“aware” of the product after the new campaign
p2 = proportion of the population of households
“aware” of the product before the new campaign
p̄1 = sample proportion of households “aware” of the
product after the new campaign
p̄2 = sample proportion of households “aware” of the
product before the new campaign

© 2003 Thomson/South-Western Slide


300
Example: MRA

 Interval Estimate of p1 - p2: Large-Sample Case

For = .05, z.025 = 1.96:


. 48(. 52 ) . 40(. 60)
. 48. 40  1. 96 
250 150

.08 + 1.96(.0510)
.08 + .10

-.02 to +.18

© 2003 Thomson/South-Western Slide


301
Example: MRA

 Interval Estimate of p1 - p2: Large-Sample Case


• Conclusion
At a 95% confidence level, the interval estimate
of the difference between the proportion of
households aware of the client’s product before and
after the new advertising campaign is -.02 to +.18.

© 2003 Thomson/South-Western Slide


302
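The interval estimate above can be checked with a few lines of code. This is a hedged sketch using the sample values quoted on the slides:

```python
from math import sqrt

# 95% interval for p1 - p2 in the MRA example (values from the slides).
p1, n1 = 120 / 250, 250   # sample proportion after the campaign
p2, n2 = 60 / 150, 150    # sample proportion before the campaign

point = p1 - p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
margin = 1.96 * se        # z.025 = 1.96 for 95% confidence
print(round(point, 2), round(margin, 2))   # 0.08 0.1
```

Point estimate .08 with margin of error .10 reproduces the slide's interval of -.02 to +.18.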
Hypothesis Tests about p1 - p2

 Hypotheses
H0: p1 - p2 ≤ 0
Ha: p1 - p2 > 0

 Test Statistic

z = ( (p̄1 - p̄2) - (p1 - p2) ) / σ_p̄1-p̄2

© 2003 Thomson/South-Western Slide


303
Hypothesis Tests about p1 - p2

 Point Estimator of σ_p̄1-p̄2 when p1 = p2

s_p̄1-p̄2 = √( p̄(1 - p̄)(1/n1 + 1/n2) )

where:
p̄ = (n1 p̄1 + n2 p̄2) / (n1 + n2)

© 2003 Thomson/South-Western Slide


304
Example: MRA

 Hypothesis Tests about p1 - p2


Can we conclude, using a .05 level of significance,
that the proportion of households aware of the client’s
product increased after the new advertising campaign?

© 2003 Thomson/South-Western Slide


305
Example: MRA

 Hypothesis Tests about p1 - p2


• Hypotheses
H0: p1 - p2 ≤ 0
Ha: p1 - p2 > 0

p1 = proportion of the population of households


“aware” of the product after the new campaign
p2 = proportion of the population of households
“aware” of the product before the new campaign

© 2003 Thomson/South-Western Slide


306
Example: MRA

 Hypothesis Tests about p1 - p2


• Rejection Rule
Reject H0 if z > 1.645
• Test Statistic

p̄ = (250(.48) + 150(.40)) / (250 + 150) = 180/400 = .45

s_p̄1-p̄2 = √( .45(.55)(1/250 + 1/150) ) = .0514

z = ((.48 - .40) - 0) / .0514 = .08/.0514 = 1.56

© 2003 Thomson/South-Western Slide


307
Example: MRA

 Hypothesis Tests about p1 - p2


• Conclusion
z = 1.56 < 1.645. Do not reject H0. We cannot
conclude, with at least 95% confidence, that the
proportion of households aware of the client’s
product increased after the new advertising
campaign.

© 2003 Thomson/South-Western Slide


308
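The pooled z test above can be sketched directly from the slide's numbers (a check of the hand computation, not the deck's software output):

```python
from math import sqrt

# Pooled z test for H0: p1 - p2 <= 0 in the MRA example.
p1, n1 = .48, 250   # after the campaign
p2, n2 = .40, 150   # before the campaign

p_pool = (n1 * p1 + n2 * p2) / (n1 + n2)             # pooled proportion
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)) # pooled standard error
z = (p1 - p2) / se
print(round(p_pool, 2), round(se, 4), round(z, 2))   # 0.45 0.0514 1.56
```

Since z = 1.56 < 1.645, H0 is not rejected, as the slide concludes.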
Hypothesis (Goodness of Fit) Test
for Proportions of a Multinomial Population
1. Set up the null and alternative hypotheses.
2. Select a random sample and record the observed
frequency, fi , for each of the k categories.
3. Assuming H0 is true, compute the expected
frequency, ei , in each category by multiplying the
category probability by the sample size.

© 2003 Thomson/South-Western Slide


309
Hypothesis (Goodness of Fit) Test
for Proportions of a Multinomial Population
4. Compute the value of the test statistic.

χ² = Σ_{i=1}^{k} (fi - ei)² / ei

5. Reject H0 if χ² > χ²α (where α is the significance level
and there are k - 1 degrees of freedom).

© 2003 Thomson/South-Western Slide


310
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


Finger Lakes Homes manufactures four models of
prefabricated homes, a two-story colonial, a ranch, a
split-level, and an A-frame. To help in production
planning, management would like to determine if
previous customer purchases indicate that there is a
preference in the style selected.

© 2003 Thomson/South-Western Slide


311
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


The number of homes sold of each model for 100
sales over the past two years is shown below.

Model Colonial Ranch Split-Level A-Frame


# Sold 30 20 35 15

© 2003 Thomson/South-Western Slide


312
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


Let:
pC = population proportion that purchase a colonial
pR = population proportion that purchase a ranch
pS = population proportion that purchase a split-level
pA = population proportion that purchase an A-frame

© 2003 Thomson/South-Western Slide


313
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


• Hypotheses
H0: pC = pR = pS = pA = .25
Ha: The population proportions are not pC = .25,
pR = .25, pS = .25, and pA = .25

© 2003 Thomson/South-Western Slide


314
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


• Rejection Rule
With α = .05 and k - 1 = 4 - 1 = 3 degrees of
freedom, χ².05 = 7.815

Reject H0 if χ² > 7.815; otherwise do not reject H0.

© 2003 Thomson/South-Western Slide


315
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


• Expected Frequencies
e1 = .25(100) = 25 e2 = .25(100) = 25
e3 = .25(100) = 25 e4 = .25(100) = 25

• Test Statistic

χ² = (30 - 25)²/25 + (20 - 25)²/25 + (35 - 25)²/25 + (15 - 25)²/25
= 1 + 1 + 4 + 4
= 10

© 2003 Thomson/South-Western Slide


316
Example: Finger Lakes Homes (A)

 Multinomial Distribution Goodness of Fit Test


• Conclusion
χ² = 10 > 7.815. We reject the assumption that
there is no home style preference, at the .05 level
of significance.

© 2003 Thomson/South-Western Slide


317
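The goodness-of-fit statistic above is easy to verify in code. A minimal sketch, using the sales counts from the slides:

```python
# Chi-square goodness-of-fit test for the Finger Lakes Homes example.
observed = [30, 20, 35, 15]        # Colonial, Ranch, Split-Level, A-Frame
n = sum(observed)                  # 100 sales
expected = [0.25 * n] * 4          # equal preference under H0

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
print(chi2)   # 10.0, which exceeds 7.815, so H0 is rejected at alpha = .05
```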
Test of Independence: Contingency Tables

1. Set up the null and alternative hypotheses.


2. Select a random sample and record the observed
frequency, fij , for each cell of the contingency table.
3. Compute the expected frequency, eij , for each cell.

(Row i Total)(Column j Total)


eij 
Sample Size

© 2003 Thomson/South-Western Slide


318
Test of Independence: Contingency Tables

4. Compute the test statistic.

χ² = Σi Σj (fij - eij)² / eij

5. Reject H0 if χ² > χ²α (where α is the significance level
and, with n rows and m columns, there are
(n - 1)(m - 1) degrees of freedom).

© 2003 Thomson/South-Western Slide


319
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


Each home sold can be classified according to
price and to style. Finger Lakes Homes’ manager
would like to determine if the price of the home and
the style of the home are independent variables.

© 2003 Thomson/South-Western Slide


320
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


The number of homes sold for each model and
price for the past two years is shown below. For
convenience, the price of the home is listed as either
$99,000 or less or more than $99,000.

Price Colonial Ranch Split-Level A-Frame


< $99,000 18 6 19 12
> $99,000 12 14 16 3

© 2003 Thomson/South-Western Slide


321
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


• Hypotheses
H0: Price of the home is independent of the style
of the home that is purchased
Ha: Price of the home is not independent of the
style of the home that is purchased

© 2003 Thomson/South-Western Slide


322
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


• Observed Frequencies and Totals

Price      Colonial  Ranch  Split-Level  A-Frame  Total
< $99K        18       6        19         12       55
> $99K        12      14        16          3       45
Total         30      20        35         15      100

• Expected Frequencies, eij = (Row i Total)(Column j Total)/100

< $99K      16.50   11.00      19.25       8.25
> $99K      13.50    9.00      15.75       6.75

© 2003 Thomson/South-Western Slide


323
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


• Rejection Rule
With α = .05 and (2 - 1)(4 - 1) = 3 d.f., χ².05 = 7.81
Reject H0 if χ² > 7.81

© 2003 Thomson/South-Western Slide


324
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


• Test Statistic

χ² = (18 - 16.5)²/16.5 + (6 - 11)²/11 + . . . + (3 - 6.75)²/6.75

= .1364 + 2.2727 + . . . + 2.0833 = 9.1486

© 2003 Thomson/South-Western Slide


325
Example: Finger Lakes Homes (B)

 Contingency Table (Independence) Test


• Conclusion
2 = 9.15 > 7.81, so we reject H0, the assumption
that the price of the home is independent of the
style of the home that is purchased.

© 2003 Thomson/South-Western Slide


326
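The full independence test above, including the expected-frequency computation, can be sketched from the observed counts on the slides:

```python
# Chi-square test of independence for the Finger Lakes Homes (B) example.
observed = [[18, 6, 19, 12],    # price <= $99,000
            [12, 14, 16, 3]]    # price >  $99,000

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, f in enumerate(row):
        e = row_totals[i] * col_totals[j] / n   # expected frequency
        chi2 += (f - e) ** 2 / e
print(round(chi2, 4))   # 9.1486, which exceeds 7.81, so H0 is rejected
```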
Chapter 12
Simple Linear Regression
 Simple Linear Regression Model
 Least Squares Method
 Coefficient of Determination
 Model Assumptions
 Testing for Significance
 Using the Estimated Regression Equation
for Estimation and Prediction
 Computer Solution
 Residual Analysis: Validating Model Assumptions

© 2003 Thomson/South-Western Slide


327
Simple Linear Regression Model

 The equation that describes how y is related to x and


an error term is called the regression model.
 The simple linear regression model is:

y = β0 + β1x + ε

• β0 and β1 are called parameters of the model.
• ε is a random variable called the error term.

© 2003 Thomson/South-Western Slide


328
Simple Linear Regression Equation

 The simple linear regression equation is:

E(y) = 0 + 1x

• Graph of the regression equation is a straight line.


• b0 is the y intercept of the regression line.
• b1 is the slope of the regression line.
• E(y) is the expected value of y for a given x value.

© 2003 Thomson/South-Western Slide


329
Simple Linear Regression Equation

 Positive Linear Relationship

[Graph: E(y) vs. x — regression line with y intercept β0 and positive slope β1]

© 2003 Thomson/South-Western Slide


330
Simple Linear Regression Equation

 Negative Linear Relationship

[Graph: E(y) vs. x — regression line with y intercept β0 and negative slope β1]

© 2003 Thomson/South-Western Slide


331
Simple Linear Regression Equation

 No Relationship

[Graph: E(y) vs. x — horizontal regression line with y intercept β0 and slope β1 = 0]

© 2003 Thomson/South-Western Slide


332
Estimated Simple Linear Regression Equation

 The estimated simple linear regression equation is:

ŷ = b0 + b1x

• The graph is called the estimated regression line.


• b0 is the y intercept of the line.
• b1 is the slope of the line.
• ŷ is the estimated value of y for a given x value.

© 2003 Thomson/South-Western Slide


333
Estimation Process

Regression Model: y = β0 + β1x + ε
Regression Equation: E(y) = β0 + β1x
Unknown Parameters: β0, β1

Sample Data: (x1, y1), (x2, y2), . . . , (xn, yn)

b0 and b1 provide estimates of β0 and β1

Estimated Regression Equation: ŷ = b0 + b1x
Sample Statistics: b0, b1

© 2003 Thomson/South-Western Slide


334
Least Squares Method

 Least Squares Criterion

min Σ(yi - ŷi)²

where:
yi = observed value of the dependent variable
for the ith observation
y^i = estimated value of the dependent variable
for the ith observation

© 2003 Thomson/South-Western Slide


335
The Least Squares Method

 Slope for the Estimated Regression Equation

b1 = ( Σxiyi - (Σxi)(Σyi)/n ) / ( Σxi² - (Σxi)²/n )

© 2003 Thomson/South-Western Slide


336
The Least Squares Method

 y-Intercept for the Estimated Regression Equation

b0 = ȳ - b1x̄
where:
xi = value of independent variable for ith observation
yi = value of dependent variable for ith observation
x̄ = mean value for independent variable
ȳ = mean value for dependent variable
n = total number of observations

© 2003 Thomson/South-Western Slide


337
Example: Reed Auto Sales

 Simple Linear Regression


Reed Auto periodically has a special week-
long sale. As part of the advertising campaign Reed
runs one or more television commercials during the
weekend preceding the sale. Data from a sample of 5
previous sales are shown on the next slide.

© 2003 Thomson/South-Western Slide


338
Example: Reed Auto Sales

 Simple Linear Regression

Number of TV Ads Number of Cars Sold


1 14
3 24
2 18
1 17
3 27

© 2003 Thomson/South-Western Slide


339
Example: Reed Auto Sales

 Slope for the Estimated Regression Equation


b1 = (220 - (10)(100)/5) / (24 - (10)²/5) = 5
 y-Intercept for the Estimated Regression Equation
b0 = 20 - 5(2) = 10
 Estimated Regression Equation
ŷ = 10 + 5x

© 2003 Thomson/South-Western Slide


340
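The least squares computation above can be reproduced with the textbook formulas. A sketch using the Reed Auto data from the slides:

```python
# Least squares slope and intercept for the Reed Auto example.
x = [1, 3, 2, 1, 3]          # TV ads
y = [14, 24, 18, 17, 27]     # cars sold
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(a * b for a, b in zip(x, y))   # sum of xi*yi = 220
sxx = sum(a * a for a in x)              # sum of xi^2 = 24

b1 = (sxy - sx * sy / n) / (sxx - sx ** 2 / n)
b0 = sy / n - b1 * (sx / n)              # b0 = y-bar - b1 * x-bar
print(b1, b0)    # 5.0 10.0, giving the estimated equation y-hat = 10 + 5x
```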
Example: Reed Auto Sales

 Scatter Diagram

[Scatter diagram: Cars Sold (y) vs. TV Ads (x) for the five observations,
with the estimated regression line ŷ = 10 + 5x]

© 2003 Thomson/South-Western Slide


341
The Coefficient of Determination

 Relationship Among SST, SSR, SSE

SST = SSR + SSE

Σ(yi - ȳ)² = Σ(ŷi - ȳ)² + Σ(yi - ŷi)²

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

© 2003 Thomson/South-Western Slide


342
The Coefficient of Determination

 The coefficient of determination is:

r2 = SSR/SST

where:
SST = total sum of squares
SSR = sum of squares due to regression

© 2003 Thomson/South-Western Slide


343
Example: Reed Auto Sales

 Coefficient of Determination
r2 = SSR/SST = 100/114 = .8772
The regression relationship is very strong because
about 88% of the variation in number of cars sold can be
explained by the linear relationship between the
number of TV ads and the number of cars sold.

© 2003 Thomson/South-Western Slide


344
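The sums of squares behind r² = .8772 can be verified in a few lines. A sketch, assuming the fitted line ŷ = 10 + 5x from the earlier slide:

```python
# SST, SSR, SSE and r^2 for the Reed Auto fit (y-hat = 10 + 5x).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 27]
y_hat = [10 + 5 * xi for xi in x]
y_bar = sum(y) / len(y)

sst = sum((yi - y_bar) ** 2 for yi in y)                 # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # regression
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))    # error

r2 = ssr / sst
print(sst, ssr, sse, round(r2, 4))   # 114.0 100.0 14 0.8772
```

SST = SSR + SSE holds (114 = 100 + 14), consistent with the relationship stated above.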
The Correlation Coefficient

 Sample Correlation Coefficient

rxy = (sign of b1) √(Coefficient of Determination)

rxy = (sign of b1) √r²

where:
b1 = the slope of the estimated regression
equation yˆ  b0  b1 x

© 2003 Thomson/South-Western Slide


345
Example: Reed Auto Sales

 Sample Correlation Coefficient

rxy = (sign of b1) √r²
The sign of b1 in the equation ŷ = 10 + 5x is “+”.

rxy = +√.8772
rxy = +.9366

© 2003 Thomson/South-Western Slide


346
Model Assumptions

 Assumptions About the Error Term ε

1. The error ε is a random variable with mean of
zero.
2. The variance of ε, denoted by σ², is the same for
all values of the independent variable.
3. The values of ε are independent.
4. The error ε is a normally distributed random
variable.

© 2003 Thomson/South-Western Slide


347
Testing for Significance

 To test for a significant regression relationship, we
must conduct a hypothesis test to determine whether
the value of β1 is zero.
 Two tests are commonly used
• t Test
• F Test
 Both tests require an estimate of σ², the variance of ε
in the regression model.

© 2003 Thomson/South-Western Slide


348
Testing for Significance

 An Estimate of σ²
The mean square error (MSE) provides the estimate
of σ², and the notation s² is also used.

s² = MSE = SSE/(n - 2)

where:

SSE = Σ(yi - ŷi)² = Σ(yi - b0 - b1xi)²

© 2003 Thomson/South-Western Slide


349
Testing for Significance

 An Estimate of σ
• To estimate σ we take the square root of s².
• The resulting s is called the standard error of the
estimate.

s = √MSE = √( SSE/(n - 2) )

© 2003 Thomson/South-Western Slide


350
Testing for Significance: t Test

 Hypotheses

H0: β1 = 0
Ha: β1 ≠ 0

 Test Statistic

t = b1 / sb1

© 2003 Thomson/South-Western Slide


351
Testing for Significance: t Test

 Rejection Rule

Reject H0 if t < -tα/2 or t > tα/2

where: tα/2 is based on a t distribution
with n - 2 degrees of freedom

© 2003 Thomson/South-Western Slide


352
Example: Reed Auto Sales

 t Test
• Hypotheses
H0: β1 = 0
Ha: β1 ≠ 0

• Rejection Rule
For α = .05 and d.f. = 3, t.025 = 3.182
Reject H0 if |t| > 3.182

© 2003 Thomson/South-Western Slide


353
Example: Reed Auto Sales

 t Test
• Test Statistics
t = 5/1.08 = 4.63
• Conclusions
t = 4.63 > 3.182, so reject H0

© 2003 Thomson/South-Western Slide


354
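The t statistic above can be reconstructed from quantities on the earlier slides. A sketch, assuming SSE = 14 for the Reed Auto fit (the slides quote s_b1 ≈ 1.08 but do not show its computation):

```python
from math import sqrt

# t test for beta1 in the Reed Auto example.
x = [1, 3, 2, 1, 3]
n, b1, sse = len(x), 5.0, 14.0

s = sqrt(sse / (n - 2))                 # standard error of the estimate
x_bar = sum(x) / n
# Standard deviation of b1: s / sqrt(sum of (xi - x-bar)^2)
s_b1 = s / sqrt(sum((xi - x_bar) ** 2 for xi in x))
t = b1 / s_b1
print(round(s_b1, 2), round(t, 2))   # 1.08 4.63
```

Since t = 4.63 > 3.182, H0: β1 = 0 is rejected, as on the slide.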
Confidence Interval for 1

 We can use a 95% confidence interval for β1 to test the
hypotheses just used in the t test.
 H0 is rejected if the hypothesized value of β1 is not
included in the confidence interval for β1.

© 2003 Thomson/South-Western Slide


355
Confidence Interval for 1

 The form of a confidence interval for β1 is:

b1 ± tα/2 sb1

where b1 is the point estimate,

tα/2 sb1 is the margin of error, and

tα/2 is the t value providing an area
of α/2 in the upper tail of a
t distribution with n - 2 degrees
of freedom

© 2003 Thomson/South-Western Slide


356
Example: Reed Auto Sales

 Rejection Rule
Reject H0 if 0 is not included in
the confidence interval for β1.
 95% Confidence Interval for β1
b1 ± tα/2 sb1 = 5 ± 3.182(1.08) = 5 ± 3.44

or 1.56 to 8.44
 Conclusion
0 is not included in the confidence interval.
Reject H0

© 2003 Thomson/South-Western Slide


357
Testing for Significance: F Test

 Hypotheses

H0 : 1 = 0
Ha : 1 = 0

 Test Statistic

F = MSR/MSE

© 2003 Thomson/South-Western Slide


358
Testing for Significance: F Test

 Rejection Rule

Reject H0 if F > Fα

where: Fα is based on an F distribution
with 1 d.f. in the numerator and
n - 2 d.f. in the denominator

© 2003 Thomson/South-Western Slide


359
Example: Reed Auto Sales

 F Test
• Hypotheses
H0 : 1 = 0
Ha: 1 = 0
• Rejection Rule
For  = .05 and d.f. = 1, 3: F.05 = 10.13
Reject H0 if F > 10.13.

© 2003 Thomson/South-Western Slide


360
Example: Reed Auto Sales

 F Test
• Test Statistic
F = MSR/MSE = 100/4.667 = 21.43
• Conclusion
F = 21.43 > 10.13, so we reject H0.

© 2003 Thomson/South-Western Slide


361
Some Cautions about the
Interpretation of Significance Tests
 Rejecting H0: β1 = 0 and concluding that the
relationship between x and y is significant does not
enable us to conclude that a cause-and-effect
relationship is present between x and y.
 Just because we are able to reject H0: β1 = 0 and
demonstrate statistical significance does not enable
us to conclude that there is a linear relationship
between x and y.

© 2003 Thomson/South-Western Slide


362
Using the Estimated Regression Equation
for Estimation and Prediction
 Confidence Interval Estimate of E(yp)

ŷp ± tα/2 sŷp

 Prediction Interval Estimate of yp

ŷp ± tα/2 sind

where: the confidence coefficient is 1 - α and
tα/2 is based on a t distribution
with n - 2 degrees of freedom

© 2003 Thomson/South-Western Slide


363
Example: Reed Auto Sales

 Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean
number of cars sold to be:

y^ = 10 + 5(3) = 25 cars

© 2003 Thomson/South-Western Slide


364
Example: Reed Auto Sales

 Confidence Interval for E(yp)


95% confidence interval estimate of the mean number
of cars sold when 3 TV ads are run is:

25 ± 4.61 = 20.39 to 29.61 cars

© 2003 Thomson/South-Western Slide


365
Example: Reed Auto Sales

 Prediction Interval for yp


95% prediction interval estimate of the number of
cars sold in one particular week when 3 TV ads are
run is:

25 ± 8.28 = 16.72 to 33.28 cars

© 2003 Thomson/South-Western Slide


366
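The margins 4.61 and 8.28 above are stated without their computation. A sketch reconstructing them, assuming s = √(14/3) and t.025 = 3.182 (with 3 d.f.) from the earlier slides and the standard interval formulas:

```python
from math import sqrt

# Confidence and prediction interval margins at xp = 3 TV ads (Reed Auto).
x = [1, 3, 2, 1, 3]
n, xp, t = len(x), 3, 3.182
s = sqrt(14 / 3)                     # standard error of the estimate

x_bar = sum(x) / n
ssx = sum((xi - x_bar) ** 2 for xi in x)
h = 1 / n + (xp - x_bar) ** 2 / ssx  # leverage-style term for xp

margin_mean = t * s * sqrt(h)        # margin for the mean E(yp)
margin_ind = t * s * sqrt(1 + h)     # margin for an individual yp
print(round(margin_mean, 2), round(margin_ind, 2))   # 4.61 8.28
```

The extra 1 under the radical is why the prediction interval is always wider than the confidence interval at the same x value.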
Residual Analysis

 Residual for Observation i

yi - ŷi

 Standardized Residual for Observation i

(yi - ŷi) / s(yi - ŷi)

where: s(yi - ŷi) = s √(1 - hi)

© 2003 Thomson/South-Western Slide


367
Example: Reed Auto Sales

 Residuals
Observation Predicted Cars Sold Residuals
1 15 -1
2 25 -1
3 20 -2
4 15 2
5 25 2

© 2003 Thomson/South-Western Slide


368
Example: Reed Auto Sales

 Residual Plot

[Residual plot: residuals vs. TV Ads for the five observations;
points lie between -2 and +2 with no apparent pattern]

© 2003 Thomson/South-Western Slide


369
Residual Analysis

 Residual Plot

y  yˆ
Good Pattern
Residual

© 2003 Thomson/South-Western Slide


370
Residual Analysis

 Residual Plot

y  yˆ
Nonconstant Variance
Residual

© 2003 Thomson/South-Western Slide


371
Residual Analysis

 Residual Plot

y  yˆ
Model Form Not Adequate
Residual

© 2003 Thomson/South-Western Slide


372
Chapter 13
Multiple Regression
 Multiple Regression Model
 Least Squares Method
 Multiple Coefficient of Determination
 Model Assumptions
 Testing for Significance
 Using the Estimated Regression Equation
for Estimation and Prediction
 Qualitative Independent Variables

© 2003 Thomson/South-Western Slide


373
Multiple Regression Model

 The equation that describes how the dependent


variable y is related to the independent variables x1,
x2, . . . xp and an error term is called the multiple
regression model.
 The multiple regression model is:

y = β0 + β1x1 + β2x2 + . . . + βpxp + ε

• β0, β1, β2, . . . , βp are the parameters.
• ε is a random variable called the error term.

© 2003 Thomson/South-Western Slide


374
Multiple Regression Equation

 The equation that describes how the mean value of y


is related to x1, x2, . . . xp is called the multiple
regression equation.
 The multiple regression equation is:

E(y) = 0 + 1x1 + 2x2 + . . . + pxp

© 2003 Thomson/South-Western Slide


375
Estimated Multiple Regression Equation

 A simple random sample is used to compute sample
statistics b0, b1, b2, . . . , bp that are used as the point
estimators of the parameters β0, β1, β2, . . . , βp.

 The estimated multiple regression equation is:

ŷ = b0 + b1x1 + b2x2 + . . . + bpxp

© 2003 Thomson/South-Western Slide


376
Estimation Process

Multiple Regression Model:
y = β0 + β1x1 + β2x2 + . . . + βpxp + ε
Multiple Regression Equation:
E(y) = β0 + β1x1 + β2x2 + . . . + βpxp
Unknown parameters are β0, β1, β2, . . . , βp

Sample Data: (x1, x2, . . . , xp, y) for each observation

b0, b1, b2, . . . , bp provide estimates of β0, β1, β2, . . . , βp

Estimated Multiple Regression Equation:
ŷ = b0 + b1x1 + b2x2 + . . . + bpxp
b0, b1, b2, . . . , bp are sample statistics

© 2003 Thomson/South-Western Slide


377
Least Squares Method

 Least Squares Criterion


min Σ(yi - ŷi)²
 Computation of Coefficients Values
The formulas for the regression coefficients b0, b1,
b2, . . . bp involve the use of matrix algebra. We will
rely on computer software packages to perform the
calculations.

© 2003 Thomson/South-Western Slide


378
Least Squares Method

 A Note on Interpretation of Coefficients


bi represents an estimate of the change in y
corresponding to a one-unit change in xi when all
other independent variables are held constant.

© 2003 Thomson/South-Western Slide


379
Multiple Coefficient of Determination

 Relationship Among SST, SSR, SSE

SST = SSR + SSE

Σ(yi - ȳ)² = Σ(ŷi - ȳ)² + Σ(yi - ŷi)²

© 2003 Thomson/South-Western Slide


380
Multiple Coefficient of Determination

 Multiple Coefficient of Determination

R 2 = SSR/SST

 Adjusted Multiple Coefficient of Determination

n1
Ra2  1  ( 1  R 2 )
np1

© 2003 Thomson/South-Western Slide


381
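The adjusted R² formula above can be illustrated with the programmer salary example developed later in this chapter. A sketch, assuming the ANOVA sums of squares printed on that example's Minitab slide (SSR = 500.33, SST = 599.79, n = 20, p = 2):

```python
# R^2 and adjusted R^2 from the programmer salary ANOVA table.
ssr, sst = 500.33, 599.79
n, p = 20, 2

r2 = ssr / sst
r2_adj = 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(r2, 3), round(r2_adj, 3))   # 0.834 0.815
```

These match the Minitab output R-sq = 83.4% and R-sq(adj) = 81.5%.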
Model Assumptions

 Assumptions About the Error Term ε

1. The error ε is a random variable with mean of
zero.
2. The variance of ε, denoted by σ², is the same for
all values of the independent variables.
3. The values of ε are independent.
4. The error ε is a normally distributed random
variable reflecting the deviation between the y
value and the expected value of y given by
β0 + β1x1 + β2x2 + . . . + βpxp

© 2003 Thomson/South-Western Slide


382
Example: Programmer Salary Survey

A software firm collected data for a sample of 20


computer programmers. A suggestion was made that
regression analysis could be used to determine if salary
was related to the years of experience and the score on
the firm’s programmer aptitude test.
The years of experience, score on the aptitude test,
and corresponding annual salary ($1000s) for a sample
of 20 programmers is shown on the next slide.

© 2003 Thomson/South-Western Slide


383
Example: Programmer Salary Survey

Exper. Score Salary Exper. Score Salary


4 78 24 9 88 38
7 100 43 2 73 26.6
1 86 23.7 10 75 36.2
5 82 34.3 5 81 31.6
8 86 35.8 6 74 29
10 84 38 8 87 34
0 75 22.2 4 79 30.1
1 80 23.1 6 94 33.9
6 83 30 3 70 28.2
6 91 33 3 89 30

© 2003 Thomson/South-Western Slide


384
Example: Programmer Salary Survey

 Multiple Regression Model


Suppose we believe that salary (y) is related to the
years of experience (x1) and the score on the
programmer aptitude test (x2) by the following
regression model:
y = 0 + 1x1 + 2x2 + 

where
y = annual salary ($000)
x1 = years of experience
x2 = score on programmer aptitude test

© 2003 Thomson/South-Western Slide


385
Example: Programmer Salary Survey

 Solving for the Estimates of β0, β1, β2

Input Data: x1, x2, y for each of the 20 programmers
→ Computer Package for Solving Multiple
Regression Problems
→ Least Squares Output: b0, b1, b2, R², etc.

© 2003 Thomson/South-Western Slide


386
Example: Programmer Salary Survey

 Minitab Computer Output

The regression equation is
Salary = 3.174 + 1.404 Exper + 0.251 Score

Predictor    Coef       Stdev      t-ratio    p
Constant     3.174      6.156      .52        .613
Exper        1.4039     .1986      7.07       .000
Score        .25089     .07735     3.24       .005

s = 2.419   R-sq = 83.4%   R-sq(adj) = 81.5%

© 2003 Thomson/South-Western Slide


387
Example: Programmer Salary Survey

 Estimated Regression Equation


SALARY = 3.174 + 1.404(EXPER) + 0.2509(SCORE)
Note: Predicted salary will be in thousands of dollars

© 2003 Thomson/South-Western Slide


388
Testing for Significance

 In simple linear regression, the F and t tests provide


the same conclusion.
 In multiple regression, the F and t tests have different
purposes.

© 2003 Thomson/South-Western Slide


389
Testing for Significance: F Test

 The F test is used to determine whether a significant


relationship exists between the dependent variable
and the set of all the independent variables.
 The F test is referred to as the test for overall
significance.

© 2003 Thomson/South-Western Slide


390
Testing for Significance: t Test

 If the F test shows an overall significance, the t test is


used to determine whether each of the individual
independent variables is significant.
 A separate t test is conducted for each of the
independent variables in the model.
 We refer to each of these t tests as a test for
individual significance.

© 2003 Thomson/South-Western Slide


391
Testing for Significance: F Test

 Hypotheses
H0: β1 = β2 = . . . = βp = 0
Ha: One or more of the parameters
is not equal to zero.
 Test Statistic
F = MSR/MSE
 Rejection Rule
Reject H0 if F > Fα
where Fα is based on an F distribution with p d.f. in
the numerator and n - p - 1 d.f. in the denominator.

© 2003 Thomson/South-Western Slide


392
Testing for Significance: t Test

 Hypotheses
H0: βi = 0
Ha: βi ≠ 0
 Test Statistic

t = bi / sbi

 Rejection Rule
Reject H0 if t < -tα/2 or t > tα/2
where tα/2 is based on a t distribution with
n - p - 1 degrees of freedom.

© 2003 Thomson/South-Western Slide


393
Example: Programmer Salary Survey

 Minitab Computer Output (continued)


Analysis of Variance
SOURCE DF SS MS F P
Regression 2 500.33 250.16 42.76 0.000
Error 17 99.46 5.85
Total 19 599.79

© 2003 Thomson/South-Western Slide


394
Example: Programmer Salary Survey

 F Test
• Hypotheses

H0 :  1 = 2 = 0
Ha: One or both of the parameters
is not equal to zero.
• Rejection Rule
For  = .05 and d.f. = 2, 17:
F.05 = 3.59
Reject H0 if F > 3.59.

© 2003 Thomson/South-Western Slide


395
Example: Programmer Salary Survey

 F Test
• Test Statistic

F = MSR/MSE
= 250.16/5.85 = 42.76
• Conclusion

F = 42.76 > 3.59, so we can reject H0.

© 2003 Thomson/South-Western Slide


396
Example: Programmer Salary Survey

 t Test for Significance of Individual Parameters


• Hypotheses

H0: i = 0
Ha: i = 0
• Rejection Rule
For  = .05 and d.f. = 17:
t.025 = 2.11
Reject H0 if t > 2.11

© 2003 Thomson/South-Western Slide


397
Example: Programmer Salary Survey

 t Test for Significance of Individual Parameters


• Test Statistics

t = b1/sb1 = 1.4039/.1986 = 7.07
t = b2/sb2 = .25089/.07735 = 3.24

• Conclusions
Reject H0: β1 = 0 and reject H0: β2 = 0.
Both independent variables are significant.

© 2003 Thomson/South-Western Slide


398
Testing for Significance: Multicollinearity

 The term multicollinearity refers to the correlation


among the independent variables.
 When the independent variables are highly
correlated (say, |r | > .7), it is not possible to
determine the separate effect of any particular
independent variable on the dependent variable.

© 2003 Thomson/South-Western Slide


399
Testing for Significance: Multicollinearity

 If the estimated regression equation is to be used


only for predictive purposes, multicollinearity is
usually not a serious problem.
 Every attempt should be made to avoid including
independent variables that are highly correlated.

© 2003 Thomson/South-Western Slide


400
Using the Estimated Regression Equation
for Estimation and Prediction
 The procedures for estimating the mean value of y
and predicting an individual value of y in multiple
regression are similar to those in simple regression.
 We substitute the given values of x1, x2, . . . , xp into
the estimated regression equation and use the
corresponding value of ŷ as the point estimate.
 The formulas required to develop interval estimates
for the mean value of y and for an individual value
of y are beyond the scope of the text.
 Software packages for multiple regression will often
provide these interval estimates.

© 2003 Thomson/South-Western Slide


401
Qualitative Independent Variables

 In many situations we must work with qualitative


independent variables such as gender (male, female),
method of payment (cash, check, credit card), etc.
 For example, x2 might represent gender where x2 = 0
indicates male and x2 = 1 indicates female.
 In this case, x2 is called a dummy or indicator
variable.

© 2003 Thomson/South-Western Slide


402
Qualitative Independent Variables

 If a qualitative variable has k levels, k - 1 dummy variables are required, with each dummy variable being coded as 0 or 1.
 For example, a variable with levels A, B, and C
would be represented by x1 and x2 values of (0, 0),
(1, 0), and (0,1), respectively.
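The A, B, C coding above can be sketched in a few lines of Python (the helper function name is ours, not a standard library routine):

```python
def dummy_code(level, levels=("A", "B", "C")):
    """Return the k - 1 dummy variables (x1, x2) for one observation.

    The first level is the baseline and gets all zeros.
    """
    _, *rest = levels  # baseline level is dropped; k - 1 dummies remain
    return tuple(1 if level == lv else 0 for lv in rest)

print(dummy_code("A"))  # (0, 0)
print(dummy_code("B"))  # (1, 0)
print(dummy_code("C"))  # (0, 1)
```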

Example: Programmer Salary Survey (B)

As an extension of the problem involving the computer programmer salary survey, suppose that
management also believes that the annual salary is
related to whether or not the individual has a graduate
degree in computer science or information systems.
The years of experience, the score on the programmer
aptitude test, whether or not the individual has a
relevant graduate degree, and the annual salary ($000)
for each of the sampled 20 programmers are shown on
the next slide.

Example: Programmer Salary Survey (B)

Exp. Score Degr. Salary Exp. Score Degr. Salary
4 78 No 24 9 88 Yes 38
7 100 Yes 43 2 73 No 26.6
1 86 No 23.7 10 75 Yes 36.2
5 82 Yes 34.3 5 81 No 31.6
8 86 Yes 35.8 6 74 No 29
10 84 Yes 38 8 87 Yes 34
0 75 No 22.2 4 79 No 30.1
1 80 No 23.1 6 94 Yes 33.9
6 83 No 30 3 70 No 28.2
6 91 Yes 33 3 89 No 30
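As a check, an ordinary least squares fit on these 20 observations (here with NumPy standing in for Minitab) reproduces the coefficient estimates shown in the Minitab output that follows:

```python
import numpy as np

# The 20 observations from the table above (Degr.: No = 0, Yes = 1).
years = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
         88, 73, 75, 81, 74, 87, 79, 94, 70, 89]
degree = [0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0]
salary = [24, 43, 23.7, 34.3, 35.8, 38, 22.2, 23.1, 30, 33,
          38, 26.6, 36.2, 31.6, 29, 34, 30.1, 33.9, 28.2, 30]

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones(20), years, score, degree])
b, *_ = np.linalg.lstsq(X, np.array(salary), rcond=None)

# b ≈ (7.945, 1.1476, 0.19694, 2.280), matching the Minitab output.
print(np.round(b, 3))
```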

Example: Programmer Salary Survey (B)

 Multiple Regression Equation
E(y) = β0 + β1x1 + β2x2 + β3x3
 Estimated Regression Equation
ŷ = b0 + b1x1 + b2x2 + b3x3
where
y = annual salary ($000)
x1 = years of experience
x2 = score on programmer aptitude test
x3 = 0 if individual does not have a grad. degree
     1 if individual does have a grad. degree
Note: x3 is referred to as a dummy variable.

Example: Programmer Salary Survey (B)

 Minitab Computer Output
The regression equation is
Salary = 7.95 + 1.15 Exp + 0.197 Score + 2.28 Deg

Predictor Coef Stdev t-ratio p
Constant 7.945 7.381 1.08 .298
Exp 1.1476 .2976 3.86 .001
Score .19694 .0899 2.19 .044
Deg 2.280 1.987 1.15 .268

s = 2.396 R-sq = 84.7% R-sq(adj) = 81.8%

Example: Programmer Salary Survey (B)

 Minitab Computer Output (continued)


Analysis of Variance
SOURCE DF SS MS F P
Regression 3 507.90 169.30 29.48 0.000
Error 16 91.89 5.74
Total 19 599.79

Example: Programmer Salary Survey (B)

 Interpreting the Parameters


• b1 = 1.15
Salary is expected to increase by $1,150 for each
additional year of experience (when all other
independent variables are held constant)

Example: Programmer Salary Survey (B)

 Interpreting the Parameters


• b2 = 0.197
Salary is expected to increase by $197 for each
additional point scored on the programmer
aptitude test (when all other independent
variables are held constant)

Example: Programmer Salary Survey (B)

 Interpreting the Parameters


• b3 = 2.28
Salary is expected to be $2,280 higher for an
individual with a graduate degree than one
without a graduate degree (when all other
independent variables are held constant)
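Numerically, the b3 interpretation can be verified by predicting the salary of two otherwise-identical programmers; a small Python sketch using the estimates above (the example experience and score values are ours):

```python
# Coefficient estimates from the Minitab output above.
b0, b1, b2, b3 = 7.945, 1.1476, 0.19694, 2.280

# Same experience (5 years) and test score (85); only the degree differs.
with_degree = b0 + b1 * 5 + b2 * 85 + b3 * 1
without_degree = b0 + b1 * 5 + b2 * 85 + b3 * 0

# The predicted salaries differ by exactly b3 = 2.28, i.e. $2,280.
print(round(with_degree - without_degree, 2))  # 2.28
```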

