
Applied Statistics

for the Six Sigma


Green Belt

Bhisham C. Gupta
H. Fred Walker

ASQ Quality Press


Milwaukee, Wisconsin

American Society for Quality, Quality Press, Milwaukee 53203


© 2005 by American Society for Quality
All rights reserved. Published 2005
Printed in the United States of America
12 11 10 09 08 07 06 05

5 4 3 2 1

Library of Congress Cataloging-in-Publication Data


Gupta, Bhisham C., 1942–
Applied statistics for the Six Sigma Green Belt / Bhisham C. Gupta,
H. Fred Walker. 1st ed.
p. cm.
Includes bibliographical references and index.
ISBN 0-87389-642-4 (hardcover : alk. paper)
1. Six sigma (Quality control standard) 2. Production management.
3. Quality control. I. Walker, H. Fred, 1963– II. Title.
TS156.G8673 2005
658.4'013--dc22
2004029760
ISBN 0-87389-642-4
No part of this book may be reproduced in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without the
prior written permission of the publisher.
Publisher: William A. Tony
Acquisitions Editor: Annemieke Hytinen
Project Editor: Paul O'Mara
Production Administrator: Randall Benson
ASQ Mission: The American Society for Quality advances individual,
organizational, and community excellence worldwide through learning,
quality improvement, and knowledge exchange.
Attention Bookstores, Wholesalers, Schools, and Corporations: ASQ
Quality Press books, videotapes, audiotapes, and software are available
at quantity discounts with bulk purchases for business, educational, or
instructional use. For information, please contact ASQ Quality Press at
800-248-1946, or write to ASQ Quality Press, P.O. Box 3005, Milwaukee,
WI 53201-3005.
To place orders or to request a free copy of the ASQ Quality Press
Publications Catalog, including ASQ membership information, call
800-248-1946. Visit our Web site at www.asq.org or
http://qualitypress.asq.org.
Printed on acid-free paper

Introduction

Whenever a process is not producing products or services at a desired level of quality, an investigation is launched to better understand and improve the process. In many instances such investigations are launched to rapidly identify and correct underlying problems as part of a problem-solving methodology; one such methodology is commonly known as root cause analysis. Many problem-solving methodologies, such as root cause analysis, rely on the study of numerical (quantitative) or non-numerical (qualitative) data as a means of discovering the true cause of one or more problems negatively impacting product or service quality. These problem-solving methodologies, however, are all too commonly applied to problems that need a quick solution and thus are not afforded the time or resources needed for a particularly detailed or in-depth analysis. Further, problem-solving methodologies are also all too commonly applied without sufficient analysis of the costs associated with a given problem, such as lost profit or opportunity, the human resources needed to investigate the problem, and so forth. Let us not have the wrong impression of problem-solving methodologies such as root cause analysis! Each of these methodologies has a proper place in quality and process improvement; however, the scope or size of the problem also needs to be considered. In this context, when problems are smaller and easier to understand, we can effectively use less rigorous, complicated, and thorough problem-solving methodologies. When problems become large, complex, and expensive, a more detailed and robust problem-solving methodology is needed, and that methodology is Six Sigma. While it is beyond the intended scope of this book to discuss the Six Sigma methodology as an approach to problem solving in detail, it is the explicit intent of this book to describe the concepts and application of the tools and techniques used to support the Six Sigma methodology. Next we give a brief description of the topics discussed in the book, followed by where in the Six Sigma methodology you can expect to use these tools and techniques.
In Chapter 1 we introduce the concept of Six Sigma from both statistical and quality perspectives. We briefly describe what we need for converting data into information. In statistical applications we come across various types of data that require specific analyses depending upon the type of data we are working with. It is therefore important to distinguish between different types of data.
In Chapter 2 we discuss and provide examples of different types of data. In addition, such terms as population and sample are introduced.
In Chapter 3 we introduce several graphical methods found in descriptive
statistics. These graphical methods are some of the basic tools of statistical
quality control (SQC). These methods are also very helpful in understanding
the pertinent information contained in very large and complex datasets.
In Chapter 4 we learn about the numerical methods of descriptive statistics. Numerical methods, applicable to both sample and population data, provide us with quantitative or numerical measures. Such measures further enlighten us about the information contained in the data.
In Chapter 5 we proceed to study the basic concepts of probability theory and see how probability theory relates to applied statistics. We also introduce the random experiment and define sample space and events. In addition, we study certain rules of probability and conditional probability.
In Chapter 6 we introduce the concept of a random variable, which is a
vehicle used to assign some numerical values to all the possible outcomes of
a random experiment. We also study probability distributions and define the mean and standard deviation of random variables. Specifically, we study
some special probability distributions of discrete random variables such as
Bernoulli, binomial, hypergeometric, and Poisson distributions, which are
encountered frequently in many statistical applications. Finally, we discuss
under what conditions (e.g., the Poisson process) these probability models
are applicable.
In Chapter 7 we continue studying probability distributions of random
variables. We introduce the continuous random variable and study its probability distribution. We specifically examine the uniform, normal, exponential,
and Weibull continuous probability distributions. The normal distribution is
the backbone of statistics and is extensively used in achieving Six Sigma
quality characteristics. The exponential and Weibull distributions form an
important part of reliability theory. The hazard or failure rate function is also
introduced.
Having discussed probability distributions of data as they apply to discrete and continuous random variables in Chapters 6 and 7, in Chapter 8 we
expand our study to the probability distributions of sample statistics. In particular, we study the probability distribution of the sample mean and sample
proportion. We then study Student's t, chi-square, and F distributions. These
distributions are an essential part of inferential statistics and, therefore, of
applied statistics.
Estimation is an important component of inferential statistics. In Chapter 9 we discuss point estimation and interval estimation of a population mean and of the difference between two population means, both when the sample size is large and when it is small. Then we study point estimation and interval estimation of a population proportion and of the difference between two population proportions when the sample size is large. Finally, we study the estimation of a population variance, a population standard deviation, the ratio of two population variances, and the ratio of two population standard deviations.

Table 1   Applied statistics and the Six Sigma methodology.

Six Sigma Phase   Tool or Technique                     Where in this book?
Define            Descriptive Statistics                Chapter 2
                  Graphical Methods                     Chapter 3
                  Numerical Descriptions                Chapter 4
Measure           Sampling                              Chapter 8
                  Point & Interval Estimation           Chapter 9
Analyze           Probability                           Chapter 5
                  Discrete & Continuous Distributions   Chapters 6 & 7
                  Hypothesis Testing                    Chapter 10
Improve
Control

In Chapter 10 we study another component of inferential statistics: the testing of statistical hypotheses. The primary aim of testing statistical hypotheses is either to refute or to support the existing theory, that is, what is believed to be true, based upon the information contained in sample data. This, in turn, reinforces good procedures. In this chapter we discuss the techniques of testing statistical hypotheses for one population mean and for differences between two population means, both when sample sizes are large and when they are small. We also discuss techniques of testing hypotheses for one population proportion and for differences between two population proportions when sample sizes are large. Finally, we discuss testing of statistical hypotheses for one population variance and for the ratio of two population variances under the assumption that the populations are normal. The results of Chapter 9 and this chapter are frequently used in statistical quality control (SQC) and design of experiments (DOE).
In Chapter 11 we consider computer-based tools for applied statistical support. Computing resources were purposefully included at the end of the book so as to encourage readers not to rely on computers until after they have gained a mastery of the statistical content presented in the preceding chapters.
But where in the Six Sigma methodology do we use these tools and techniques? The answer is throughout the methodology! Let's take a closer look. The information contained in Table 1 will help us better relate specific tools and techniques to phases of the Six Sigma methodology as they relate to the intended scope and purpose of this book: a basic level of applied statistics. Additional topics will be discussed in later books in this series. As topics are discussed in later books, they will be added to the content of Table 1, and readers can use the table to help associate specific tools with the Six Sigma methodology.
The array of topics as they relate to the Six Sigma methodology is helpful in understanding where you may use these tools and techniques. It is important to note, however, that any of these tools and techniques may come into play in more than one phase of the Six Sigma methodology and, in fact, should be expected to do so. What is presented in Table 1 is the first point in the methodology at which you may expect to use these tools and techniques. From here it's time to get started! Enjoy!

Preface

Applied Statistics for the Six Sigma Green Belt was written as a desk reference and instructional aid for individuals involved with Six Sigma project teams. As Six Sigma team members, green belts will help select appropriate statistical tools, collect data for those tools, and assist with data interpretation within the context of the Six Sigma methodology. Composed of steps or phases titled Define, Measure, Analyze, Improve, and Control (DMAIC), the Six Sigma methodology calls for the use of many more statistical tools than is reasonable to address in one large book. Accordingly, the intent of this book is to provide green belts with the benefit of a thorough discussion of the underlying concepts of basic statistics. More advanced topics of a statistical nature will be discussed in three other books that, together with this book, will comprise a four-book series. The other books in the series will discuss statistical quality control, introductory design of experiments and regression analysis, and advanced design of experiments.
While it is beyond the scope of this book and series to cover the DMAIC
methodology specifically, we do focus this book and series on concepts,
applications, and interpretations of the statistical tools used during, and as part
of, the DMAIC methodology. Of particular interest in this book, and indeed
the other books in this series, is an applied approach to the topics covered
while providing a detailed discussion of the underlying concepts. This level
of detail in providing the underlying concepts is particularly important for
individuals lacking a recent study of applied statistics as well as for individuals who may never have had any formal education or training in statistics.
In fact, one very controversial aspect of Six Sigma training is that, in many cases, this training is targeted at the Six Sigma Black Belt and is all too commonly delivered to large groups of people with the assumption that all trainees have a fluent command of the underlying statistical concepts and theory. In practice this assumption commonly leads to a good deal of concern and discomfort for trainees, who quickly find it difficult to keep up with and successfully complete black belt-level training. This concern and discomfort become even more serious when individuals involved with Six Sigma training are expected to pass the written and/or computer-based examination that so commonly accompanies this type of training.
So if you are beginning to learn about Six Sigma and are either preparing for training or supporting a Six Sigma team, the question is: How do I get up to speed with applied statistics as quickly as possible so I can get the most from training or add the most value to my Six Sigma team? The answer to this question is simple and straightforward: get access to a book that provides a thorough and systematic discussion of applied statistics, a book that uses the plain language of application rather than abstract theory, and a book that emphasizes learning by example. Applied Statistics for the Six Sigma Green Belt has been designed to be just that book.
This book was organized so as to expose readers to applied statistics in
a thorough and systematic manner. We begin by discussing concepts that are
the easiest to understand and that will provide you with a solid foundation
upon which to build further knowledge. As we proceed with our discussion,
and as the complexity of the statistical tools increases, we fully intend that
our readers will be able to follow the discussion by understanding that the
use of any given statistical tool, in many cases, enables us to use additional
and more powerful statistical tools. The order of presentation of these tools
in our discussion then will help you understand how these tools relate to,
mutually support, and interact with one another. We will continue this logic
of the order in which we present topics in the remaining books in this series.
Getting the most benefit from this book, and in fact from the complete series of books, is consistent with how many of us learn most effectively: start at
the beginning with less complex topics, proceed with our discussion of new
and more powerful statistical tools once we learn the basics, be sure to
cover all the statistical tools needed to support Six Sigma, and emphasize
examples and applications throughout the discussion.
So let us take a look together at Applied Statistics for the Six Sigma Green Belt. What you will learn is that statistics aren't mysterious, they aren't scary, and they aren't overly difficult to understand. As in learning any topic, once you learn the basics it is easy to build on that knowledge; trying to start without a knowledge of the basics, however, is generally the beginning of a difficult situation!

Contents

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii


List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xviii
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiv
Chapter 1 Setting the Context for Six Sigma . . . . . . . . . . . . . . . .  1
1.1 Six Sigma Defined as a Statistical Concept . . . . . . . . . . . . . . .  1
1.2 Now, Six Sigma Explained as a Statistical Concept . . . . . . . . . . .  2
1.3 Six Sigma as a Comprehensive Approach and Methodology
for Problem Solving and Process Improvement . . . . . . . . . . . . . . . .  3
1.4 Understanding the Role of the Six Sigma Green Belt as
Part of the Bigger Picture . . . . . . . . . . . . . . . . . . . . . . . . .  5
1.5 Converting Data into Useful Information . . . . . . . . . . . . . . . .  6

Chapter 2 Getting Started with Statistics . . . . . . . . . . . . . . . . . .  9
2.1 What Is Statistics? . . . . . . . . . . . . . . . . . . . . . . . . . . .  9
2.2 Populations and Samples . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Classification of Various Types of Data . . . . . . . . . . . . . . . . 11
2.3.1 Nominal Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Ordinal Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.3 Interval Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3.4 Ratio Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Chapter 3 Describing Data Graphically . . . . . . . . . . . . . . . . . . . 15
3.1 Frequency Distribution Table . . . . . . . . . . . . . . . . . . . . . . 15
3.1.1 Qualitative Data . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.1.2 Quantitative Data . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.2 Graphical Representation of a Data Set . . . . . . . . . . . . . . . . . 20
3.2.1 Dot Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.2.2 Pie Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.2.3 Bar Chart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.2.4 Histograms and Related Graphs . . . . . . . . . . . . . . . . . . . . 27


3.2.5 Line Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33


3.2.6 Stem and Leaf Diagram . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2.7 Measure of Association . . . . . . . . . . . . . . . . . . . . . . . . . 39
Chapter 4 Describing Data Numerically . . . . . . . . . . . . . . . . . . . 45
4.1 Numerical Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.2 Measures of Centrality . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.2.1 Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.2.2 Median . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.2.3 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.3 Measures of Dispersion . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.3.1 Range . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.2 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.3.3 Standard Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . 55
4.3.4 Coefficient of Variation . . . . . . . . . . . . . . . . . . . . . . . 57
4.4 Measures of Central Tendency and Dispersion for
Grouped Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.4.1 Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.4.2 Median . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.4.3 Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.4.4 Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.5 Empirical Rule (Normal Distribution) . . . . . . . . . . . . . . . . . . 60
4.6 Certain Other Measures of Location and Dispersion . . . . . . . . . . . 63
4.6.1 Percentiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.6.2 Quartiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.6.3 Interquartile Range . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.7 Box Whisker Plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.7.1 Construction of a Box Plot . . . . . . . . . . . . . . . . . . . . . . 66
4.7.2 How to Use the Box Plot . . . . . . . . . . . . . . . . . . . . . . . 67

Chapter 5 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.1 Probability and Applied Statistics . . . . . . . . . . . . . . . . . . . 71
5.2 The Random Experiment . . . . . . . . . . . . . . . . . . . . . . . . . 72
5.3 Sample Space, Simple Events, and Events of Random
Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
5.4 Representation of Sample Space and Events Using Diagrams . . . . . . . 75
5.4.1 Tree Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.2 Permutation and Combination . . . . . . . . . . . . . . . . . . . . . 77
5.5 Defining Probability Using Relative Frequency . . . . . . . . . . . . . 83
5.6 Axioms of Probability . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.7 Conditional Probability . . . . . . . . . . . . . . . . . . . . . . . . 88

Chapter 6 Discrete Random Variables and Their Probability
Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.1 Discrete Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Mean and Standard Deviation of a Discrete Random
Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2.1 Interpretation of the Mean and the Standard
Deviation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101


6.3 The Bernoulli Trials and the Binomial Distribution . . . . . . . . . 101


6.3.1 Mean and Standard Deviation of a Bernoulli
Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3.2 The Binomial Distribution . . . . . . . . . . . . . . . . . . . . . . . 102
6.3.3 Binomial Probability Tables . . . . . . . . . . . . . . . . . . . . . . 105
6.4 The Hypergeometric Distribution . . . . . . . . . . . . . . . . . . . . . . 107
6.4.1 Mean and Standard Deviation of a Hypergeometric
Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
6.5 The Poisson Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Chapter 7 Continuous Random Variables and Their
Probability Distributions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.1 Continuous Random Variables . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2 The Uniform Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.2.1 Mean and Standard Deviation of the Uniform
Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.3 The Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.3.1 Standard Normal Distribution Table . . . . . . . . . . . . . . . . 123
7.4 The Exponential Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 129
7.4.1 Mean and Standard Deviation of an Exponential
Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.4.2 Distribution Function F(x) of the Exponential
Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
7.5 The Weibull Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.5.1 Mean and Variance of the Weibull Distribution . . . . . . . . 133
7.5.2 Distribution Function F(t) of Weibull . . . . . . . . . . . . . . . 133
Chapter 8 Sampling Distributions . . . . . . . . . . . . . . . . . . . . . . . . 137
8.1 Sampling Distribution of Sample Mean . . . . . . . . . . . . . . . . . . 138
8.2 The Central Limit Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
8.2.1 Sampling Distribution of Sample Proportion . . . . . . . . . . 147
8.3 Chi-Square Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
8.4 The Student's t-Distribution . . . . . . . . . . . . . . . . . . . . . . 153
8.5 Snedecor's F-Distribution . . . . . . . . . . . . . . . . . . . . . . . 155
8.6 The Poisson Approximation to the Binomial Distribution . . . . . 158
8.7 The Normal Approximation to the Binomial Distribution . . . . . 159
Chapter 9 Point and Interval Estimation . . . . . . . . . . . . . . . . . . . 165
9.1 Point Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
9.1.1 Properties of Point Estimators . . . . . . . . . . . . . . . . . . . . 167
9.2 Interval Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
9.2.1 Interpretation of a Confidence Interval . . . . . . . . . . . . . . . 172
9.3 Confidence Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . 172
9.3.1 Confidence Interval for Population Mean μ When
the Sample Size Is Large . . . . . . . . . . . . . . . . . . . . . . . . . . 173
9.3.2 Confidence Interval for Population Mean μ When
the Sample Size Is Small . . . . . . . . . . . . . . . . . . . . . . . . . . 177
9.4 Confidence Interval for the Difference between Two
Population Means . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
9.4.1 Large Sample Confidence Interval for the
Difference between Two Population Means . . . . . . . . . . . . . . . . . . 180
9.4.2 Small Sample Confidence Interval for the
Difference between Two Population Means . . . . . . . . . . . . . . . . . . 183
9.5 Confidence Intervals for Population Proportions When
Sample Sizes Are Large . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
9.5.1 Confidence Interval for p, the Population Proportion . . . . . . . . 188
9.5.2 Confidence Interval for the Difference of
Two Population Proportions . . . . . . . . . . . . . . . . . . . . . . . . . 189
9.6 Determination of Sample Size . . . . . . . . . . . . . . . . . . . . . 192
9.7 Confidence Interval for Population Variances . . . . . . . . . . . . . 195
9.7.1 Confidence Interval for a Population Variance . . . . . . . . . . . . 196
Chapter 10 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . 201
10.1 Basic Concepts of Testing Statistical Hypotheses . . . . . . . . . . . 201
10.2 Testing Statistical Hypotheses about One Population Mean
When Sample Size Is Large . . . . . . . . . . . . . . . . . . . . . . . . . . 208
10.2.1 Population Variance Is Known . . . . . . . . . . . . . . . . . . . 208
10.2.2 Population Variance Is Unknown . . . . . . . . . . . . . . . . . 213
10.3 Testing Statistical Hypotheses about the Difference Between
Two Population Means When the Sample Sizes Are Large . . . . 216
10.3.1 Population Variances Are Known . . . . . . . . . . . . . . . . . 216
10.3.2 Population Variances Are Unknown . . . . . . . . . . . . . . . 219
10.4 Testing Statistical Hypotheses about One Population Mean
When Sample Size Is Small . . . . . . . . . . . . . . . . . . . . . . . . . . 222
10.4.1 Population Variance Is Known . . . . . . . . . . . . . . . . . . . 223
10.4.2 Population Variance Is Unknown . . . . . . . . . . . . . . . . . 226
10.5 Testing Statistical Hypotheses about the Difference Between
Two Population Means When Sample Sizes Are Small . . . . . . . 229
10.5.1 Population Variances σ1² and σ2² Are Known . . . . . . . . . . . . . 230
10.5.2 Population Variances σ1² and σ2² Are Unknown
But σ1² = σ2² = σ² . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
10.5.3 Population Variances σ1² and σ2² Are Unknown
and σ1² ≠ σ2² . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
10.6 Paired t-Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
10.7 Testing Statistical Hypotheses about Population Proportions . . . 240
10.7.1 Testing of Statistical Hypotheses about One
Population Proportion When Sample Size Is
Large . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
10.7.2 Testing of Statistical Hypotheses about the
Difference Between Two Population Proportions
When Sample Sizes Are Large . . . . . . . . . . . . . . . . . . . 242
10.8 Testing Statistical Hypotheses about Population Variances . . . . 244
10.8.1 Testing Statistical Hypotheses about One
Population Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
10.8.2 Testing Statistical Hypotheses about the Two
Population Variances . . . . . . . . . . . . . . . . . . . . . . . . . . 247
10.9 An Alternative Technique for Testing of Statistical Hypotheses
Using Confidence Intervals . . . . . . . . . . . . . . . . . . . . . . . . . 250


Chapter 11 Computing Resources to Support Applied
Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
11.1 Using MINITAB, Version 14 . . . . . . . . . . . . . . . . . . . . . . . . . 255
11.1.1 Getting Started . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
11.1.2 Calculating Descriptive Statistics . . . . . . . . . . . . . . . . . 258
11.1.3 Probability Distributions . . . . . . . . . . . . . . . . . . . . . . . 269
11.1.4 Estimation and Testing of Hypotheses about
Population Mean and Proportion . . . . . . . . . . . . . . . . . 273
11.1.5 Estimation and Testing of Hypotheses about Two
Population Means and Proportions . . . . . . . . . . . . . . . . 276
11.1.6 Estimation and Testing of Hypotheses about Two
Population Variances . . . . . . . . . . . . . . . . . . . . . . . . . . 280
11.1.7 Testing Normality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282
11.2 Using JMP, Version 5.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
11.2.1 Getting Started with JMP . . . . . . . . . . . . . . . . . . . . . . . 284
11.2.2 Calculating Descriptive Statistics . . . . . . . . . . . . . . . . . 286
11.2.3 Estimation and Testing of Hypotheses about One
Population Mean . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
11.2.4 Estimation and Testing of Hypotheses about Two
Population Variances . . . . . . . . . . . . . . . . . . . . . . . . . . 300
11.2.5 Normality Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
11.3 Web-based Computing Resources . . . . . . . . . . . . . . . . . . . . . . 303
Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
Table I Binomial probabilities . . . . . . . . . . . . . . . . . . . . . . . 312
Table II Poisson probabilities . . . . . . . . . . . . . . . . . . . . . . . 315
Table III Standard normal distribution . . . . . . . . . . . . . . . . . . . 317
Table IV Critical values of χ² with ν degrees of freedom . . . . . . . . . . 318
Table V Critical values of t with ν degrees of freedom . . . . . . . . . . . 320
Table VI Critical values of F with numerator and
denominator degrees of freedom ν1, ν2
respectively (α = 0.10) . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
About the Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

1
Setting the Context
for Six Sigma

It is important to begin our discussion of applied statistics by recognizing that Six Sigma (6σ) has come to refer simultaneously to two related but different ideas. The first idea is the technical definition as a statistical concept; this technical definition will be provided and explained in sections 1.1 and 1.2, respectively. The second idea is that of a comprehensive approach and methodology for problem solving and process improvement; this comprehensive approach and methodology will be briefly outlined in section 1.3, although a thorough discussion of the 6σ approach and methodology is beyond the scope of this book. The remainder of the chapter will be devoted to describing how the green belt contributes to 6σ efforts (section 1.4) and how the green belt goes about the task of converting data into useful information (section 1.5).

1.1 Six Sigma Defined as a Statistical Concept


Six Sigma is a measure of process quality wherein the distance between a target value and the upper or lower specification limit is at least six standard deviations. The most widely publicized consequence of a 6σ process is that there are 3.4 defects per million opportunities (DPMO). DPMO is defined not as a count of defects alone, but rather as a ratio of defects to the number of opportunities for defects to occur.
Since most operators, service providers, technicians, engineers, and managers are trained to think in terms of counting total defects, the concept of comparing defects to opportunities for defects to occur is counterintuitive. In fact, determining what constitutes an opportunity for a defect to occur has, in some circles, become controversial. Combining these ideas (3.4 DPMO; defects compared with opportunities for those defects to occur; a definition of an opportunity that is not universally agreed upon) means we have a statistical concept (i.e., 6σ) that is difficult for a great many people to understand, even for professionals with advanced levels of statistical training and education!
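To make this ratio concrete, here is a minimal sketch (ours, not the book's; the counts and the function name are hypothetical) of how DPMO is computed from inspection results:

    def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
        """Defects per million opportunities: the defect-to-opportunity
        ratio scaled to one million opportunities."""
        return defects / (units * opportunities_per_unit) * 1_000_000

    # Hypothetical inspection: 57 defects found across 1,000 invoices,
    # each invoice having 12 fields where a defect could occur
    print(dpmo(defects=57, units=1_000, opportunities_per_unit=12))  # 4750.0

Note that the same 57 defects would yield 57,000 DPMO if each invoice were counted as a single opportunity, which is one reason the definition of an opportunity matters so much.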

Not to worry! We can readily understand the meaning of this 6σ concept if we avoid the unnecessary rigor of a theoretical discussion and focus on its application.

1.2 Now, Six Sigma Explained as a Statistical Concept

In its purest statistical form, 6σ refers to six standard deviations and describes the variability of a process in what is commonly referred to as a measure of dispersion. In this case, three standard deviations would be located above some measure of location such as a mean or average, and three standard deviations would be located below the same measure of location, as illustrated in Figure 1.1.
As you can see from Figure 1.1, the standard deviations are combined to form the boundaries of what is referred to as a normal distribution; this normal distribution is also commonly referred to as the bell-shaped curve. It is important to note that a much more detailed discussion of the topics identified above, and related topics, will be provided where appropriate later in this book. For now, let us continue with our explanation of 6σ.
As was stated above, 6σ refers to a defect rate equivalent to 3.4 DPMO; this is where understanding the term and concept of 6σ can become unnecessarily difficult. And while some people take great satisfaction in being able to explain 6σ at an excruciating level of technical detail, such detail is not necessary to grasp a general understanding of the concept.
To avoid an unnecessary level of complexity, while still being able to understand the concept, let us think of 6σ as illustrated in Figure 1.2.
In Figure 1.2 we can readily see there is a normal or bell-shaped distribution. What makes the distribution interesting is that the width of the distribution that describes variability is quite narrow compared to some limits, for example, specification limits. These specification limits are generally provided

by customers in the form of tolerances and describe the values to which products or services must conform to be considered good or acceptable.

Figure 1.1   The normal distribution. [Figure: a normal curve with three standard deviations on either side of its center.]

Figure 1.2   Six Sigma (Motorola definition). [Figure: a normal distribution centered between the lower and upper specification limits (LSL and USL), with the mean allowed to shift ±1.5σ, Cp = 2, Cpk = 1.5, 3.4 DPMO beyond each limit, and 6σ from the center to both the LSL and the USL.]
There is more to this explanation, however. Again looking at Figure 1.2, we see that because the width of the distribution is so much smaller than the width of the limits, it is possible for the location of the distribution to move around, or vary, within the limits. This movement or natural variation is inherent in any process, and so anticipating the movement is exactly what we want to be able to do! In fact, we want as much room as possible for the distribution to move within the limits so we do not risk the distribution moving outside these limits.
Now someone may ask, "Why would the distribution move around within the limits?" and "How much movement would we expect?" Both are interesting questions, and both questions help us better understand this concept called 6σ as it refers to quality.
When a process is operating, whether that process involves manufacturing operations or service delivery, variation within that process is to be expected. Variation occurs both in terms of measures of dispersion (i.e., the width of a process) and measures of location (i.e., where the center of that process lies). During normal operation we would expect the location of a process (described numerically by the measure of location) to vary or move ±1.5 standard deviations. Herein lies the explanation of 6σ.
Our goal is to reduce the variability of any process, as compared to the process limits, to a point where there is room for a ±1.5 standard deviation move, accounting for the natural variability of the process while containing all that variability within the limits. Such a case is referred to as a 6σ level of quality, wherein no more than 3.4 DPMO would be expected to fall outside the limits.
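The 3.4 DPMO figure follows directly from normal tail areas: with specification limits six standard deviations from the target and the mean allowed to drift 1.5 standard deviations toward one limit, the nearer limit sits only 4.5 standard deviations from the shifted mean. The following sketch (ours, using only Python's standard library) verifies the arithmetic:

    import math

    def normal_tail(z: float) -> float:
        """Upper-tail probability P(Z > z) for a standard normal variable."""
        return 0.5 * math.erfc(z / math.sqrt(2.0))

    def dpmo_at_sigma_level(level: float, shift: float = 1.5) -> float:
        """Expected DPMO when the specification limits sit `level` standard
        deviations from target and the mean drifts `shift` standard
        deviations toward one of them."""
        near = normal_tail(level - shift)  # tail beyond the nearer limit
        far = normal_tail(level + shift)   # tail beyond the farther limit (negligible)
        return (near + far) * 1_000_000

    print(f"{dpmo_at_sigma_level(6.0):.1f} DPMO")  # ~3.4
    print(f"{dpmo_at_sigma_level(3.0):.0f} DPMO")  # ~66,811 for a 3-sigma process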

1.3 Six Sigma as a Comprehensive Approach and Methodology for Problem Solving and Process Improvement

Having been mystified or confused by the technical definition of 6σ, many people never fully develop an understanding that 6σ really refers more to a comprehensive approach and methodology for problem solving and process improvement than to a statistical concept. Developing such an understanding is necessary sooner rather than later, because implementation of 6σ is based on the use of a wide variety of tools and techniques, some statistical in nature and some not, applied where they are appropriate to support each of several phases of the methodology.
While originally developed as a phased approach to problem solving and process improvement, 6σ started as a sequential progression of phases titled Measure, Analyze, Improve, and Control (MAIC). Six Sigma was later expanded to include a Define phase, as it became apparent that more attention was needed to identify, understand, and adequately describe problems or opportunities. In what is now known as the DMAIC approach and methodology, 6σ continues to be improved upon, and the addition of new phases as formal components of the methodology is being discussed in various venues. In its current form of implementation, however, Six Sigma is practiced as identified in Figure 1.3.

Figure 1.3   Current Six Sigma implementation flow chart. [Figure: commitment is made to implement Six Sigma; a champion team is formed; potential projects are identified and evaluated; projects that meet the selection criteria are chartered and worked through the Define, Measure, Analyze, Improve, and Control phases, with a review at each phase; completed projects are verified against financial payback criteria; projects that fail the criteria are discontinued or the selection criteria are reconsidered.]
However, as 6σ evolves, it is clear that several levels of stakeholders, participants, and team members will be needed to apply the tools and techniques as they are called for within the methodology. And as a percentage of the total number of people involved with 6σ efforts, green belts will continue to represent one of the largest groups of stakeholders, participants, and team members.

1.4 Understanding the Role of the Six Sigma Green Belt as Part of the Bigger Picture

Green belts constitute one of the largest groups of contributors to 6σ efforts, as highlighted in Figure 1.4.
As seen in Figure 1.4, green belts are close to process operations and work directly with shop floor operators and service delivery personnel. Green belts most commonly collect data, make initial interpretations, and begin to formulate recommendations that are fed to black belts. Black belts then perform more thorough analyses, generally with additional data and input from other sources, and make recommendations to master black belts and project champions.
The flow of involvement and responsibilities described above is the essence of how 6σ has been implemented to date. What is interesting, though, is not how 6σ has been implemented to date, but how the implementation of 6σ is changing. A current trend, consistent with the administration of quality and certain management functions, is to push responsibility to lower levels within organizations. As this applies to the implementation of 6σ, greater responsibility for problem or opportunity identification, data collection, analysis, and corrective action is being levied on green belts.
To support that trend, many consultants providing 6σ training now include green belts and black belts in the same classes. This means that, in many cases, green belts receive training on all the tools and techniques, as do black belts, and the expectation is that green belts will assume more

responsibility for day-to-day operation of 6σ efforts. So we see a redefinition of responsibilities wherein green belts no longer simply collect data as prescribed by black belts; rather, green belts are rapidly being tasked with collecting data and, more importantly, with converting those data into useful information.

Figure 1.4   Six Sigma support personnel. [Figure: a pyramid with 6σ Master Black Belts at the top, followed by 6σ Black Belts and 6σ Green Belts, resting on process operators and service delivery personnel at the base.]

1.5 Converting Data into Useful Information


What does this mean, converting data into useful information? It implies that data and information are somehow different things; they are! Data represent raw facts, and raw facts by themselves do not convey much meaning. Consider Table 1.1.
Table 1.1   Process step completion times.

24   22   29   27   29
21   26   20   28   24
28   25   27   24   31
30   29   24   26   23
20   23   27   25   26

Table 1.1 has several rows and columns of numbers. These numbers correspond to measurements of the average time to complete a process step. As a collection of numbers, the data in Table 1.1 do not help us understand much about the process. To really understand the process, we need to convert the data into information, and to convert the data we use appropriate tools and techniques. In this case we can use simple descriptive statistics to help us quantify certain parameters, and we can use graphics to help us visually convert the data into information, as shown in Table 1.2 and Figure 1.5, respectively.

Table 1.2   Descriptive statistics.

Mean              25.52
Std Dev            3.0430248
Std Err Mean       0.608605
upper 95% Mean    26.776099
lower 95% Mean    24.263901
N                 25

Figure 1.5   Histogram. [Figure: a histogram of the Table 1.1 data over the range 17.5 to 32.5, roughly bell-shaped and centered near 25.]

Table 1.2 indicates the mean (or average) is 25.52 and the standard deviation is 3.0430248. Now that the data have been processed to give us a pair of quantitative values, we can better understand the process. Figure 1.5 indicates that the data appear to be distributed in a manner that looks like the normal distribution: a bell-shaped curve. And while we do gain some understanding of any given process by converting data into information such as the mean and standard deviation, we generally also gain very useful information by presenting the same data graphically. And so begins the job of the 6σ green belt: converting data into information.
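The summary in Table 1.2 can be reproduced from the raw data of Table 1.1 in a few lines. The sketch below is ours (the book itself turns to statistical software in Chapter 11); the 95% limits use mean ± t · (standard error), where 2.0639, the 0.975 quantile of the t-distribution with n − 1 = 24 degrees of freedom, is hard-coded to keep the example dependency-free:

    import math
    import statistics

    # Process step completion times from Table 1.1
    times = [24, 22, 29, 27, 29,
             21, 26, 20, 28, 24,
             28, 25, 27, 24, 31,
             30, 29, 24, 26, 23,
             20, 23, 27, 25, 26]

    n = len(times)
    mean = statistics.mean(times)       # 25.52
    std_dev = statistics.stdev(times)   # 3.0430248 (sample standard deviation)
    std_err = std_dev / math.sqrt(n)    # 0.608605

    t_crit = 2.0639  # 0.975 quantile of the t-distribution, 24 df
    print(f"Mean            {mean:.2f}")
    print(f"Std Dev         {std_dev:.7f}")
    print(f"Std Err Mean    {std_err:.6f}")
    print(f"upper 95% Mean  {mean + t_crit * std_err:.6f}")
    print(f"lower 95% Mean  {mean - t_crit * std_err:.6f}")
    print(f"N               {n}")

The printed values match Table 1.2 to rounding.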
As a final thought in this chapter, it is worth noting that not all information is useful information. You will read about many tools and techniques in the following chapters. It is important to note that these tools and techniques are what we call "blind" to mistakes and misinterpretations. This is to say that the tools and techniques will not tell you whether the information you create is good or bad. Nor will the tools and techniques give you guidance on how to interpret the information; for that you will have to learn the lessons contained in this book and be careful about what information to use, how to use that information, and when.

2
Getting Started with Statistics

Having established a context for Six Sigma (6σ) in Chapter 1, it is time to move forward with a discussion of selected statistical topics as they relate to 6σ. More specifically, in this chapter we will examine what is meant by the term statistics, differentiate between samples and populations, learn how to classify types of data, and then begin to look at what is meant by the term descriptive statistics.

2.1 What Is Statistics?


The term statistics is commonly used in two senses. In the first sense, we use the term statistics in our day-to-day communication when we refer to collections of numbers or facts. What follows are several examples of statistics:
1. In 1980, the salaries of CEOs from 10 selected companies
ranged from $200,000 to $500,000.
2. On average, the starting salary of engineers is 40% higher than
for technicians.
3. This year our company is expected to produce one million units
of production.
4. In 2002, the customer service department responded to its
highest level of inquiries ever.
5. The operating budget for our Cleveland, Ohio, facility is 50%
larger than the operating budgets for the rest of our 10 facilities
combined.
6. Our company is doing more work in repair services than in
direct manufacturing.
7. The demand for our services is lower when the economy is
recessional.


8. The R&D budget of our pharmaceutical division is higher than


the R&D budget of our biomedical division.
9. The number of years our products are expected to be useful is
decreasing.
10. In 1998, more than 30% of the services we offered were financially related.
In the second sense, we define statistics as a scientific subject that provides the techniques of collecting, organizing, summarizing, analyzing, and interpreting information as input for making appropriate decisions. Accordingly, the subject of statistics may be divided into two parts:

Descriptive statistics
Inferential statistics

Descriptive statistics uses techniques to organize, summarize, present, and interpret a data set in order to draw conclusions that do not go beyond the boundaries of the data set. In inferential statistics, a variety of techniques is used to generalize results obtained from a sample to the population and to evaluate their reliability. In this book we shall discuss descriptive as well as inferential statistics.
Before we study descriptive and inferential statistics in more detail, it is important that we discuss key terminology and definitions we will need to support our discussions.

2.2 Populations and Samples


In our day-to-day life we think of a population as a collection of things, such as all the people with whom we work or all the people working within a particular type of industry or business. In statistics, however, when referring to a population, we may mean the group of people who share the same professional responsibilities, the group of people who work at a specific site or facility, or simply a set of numbers. We may, therefore, define a population as follows:
Definition 2.1 A population is a collection of all conceivable individuals, elements, numbers, or entities that possess a characteristic of interest.
For example, if we are interested in the ability of employees of a certain company with a specific job title or classification to perform specific job functions, the population may be defined as all employees with that job title working across all of the sites and locations of the company. If, however, we are interested in the ability of employees with a specific job title or classification to perform specific job functions at a particular location, the population may be defined as all employees with the specific job title working only at the selected site or location. Populations, therefore, are shaped by the point or level of interest.

Populations can be finite or infinite. A population where all the elements are easily identifiable is considered finite, and a population where all the elements are not easily identifiable is considered infinite. For example, a batch or lot of production is normally considered a finite population, whereas all the production that may be produced from a certain manufacturing line would normally be considered infinite.
It is important to note that in statistical applications the term infinite is used in the relative sense. For instance, if we are interested in studying the products produced or service delivery iterations occurring over a given period of time, the population may be considered finite or infinite depending upon one's frame of reference.
In any given statistical application, studying each element of a population is not only time-consuming and expensive but may even be impossible. For example, if we are interested in studying the average life of a particular kind of electric bulb manufactured by a company, then obviously we cannot study the whole population without destroying each bulb of that particular kind manufactured by that company. Simply put, in almost all studies we end up studying only a small portion, called a sample, of the population. More formally, we define a sample as follows:
Definition 2.2 A portion of a population selected for study is called a sample.
The population from which a sample is selected is called the sampled population, and the population being studied is called the target population. Normally these two populations coincide with each other, since every effort is made to ensure that the sampled population is the same as the target population. However, situations do arise when the sampled population does not cover the whole target population. In such cases, conclusions made about the sampled population are not usually applicable to the target population.
In almost all statistical studies, conclusions about a population are made based upon the information drawn from a sample. One must keep in mind that such conclusions are valid only if the sample selected is a representative sample, that is, a sample that possesses all the characteristics of the population under investigation. One way to achieve this goal is by taking a random sample. A sample is called a random sample if each element of the population has the same chance of being included in the sample. There are several techniques for selecting a random sample, but the concept that each element of the population has the same chance of being included in the sample forms the basis of all random sampling. In volume II of this series of books, we will dedicate one full chapter to the study of four techniques of random sampling, namely, simple random sampling, systematic random sampling, stratified random sampling, and cluster random sampling.
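As a minimal illustration (ours, with a hypothetical lot of bulbs; not an example from the book), Python's random.sample implements exactly this idea for a finite population, giving every element the same chance of being included:

    import random

    # Hypothetical finite population: serial numbers of a lot of 500 bulbs
    population = [f"bulb-{i:03d}" for i in range(500)]

    random.seed(7)  # fixed seed only so the example is reproducible
    sample = random.sample(population, k=20)  # simple random sample of size 20
    print(sample[:5])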

2.3 Classification of Various Types of Data


In our professional lives we daily collect a large amount of non-numerical and/or numerical data. For example, we may collect data concerning customer satisfaction, thoughts of employees, or perceptions of suppliers, or we may track the number of employees in various departments of a company, weekly production volume in units produced, or sales dollars per unit of time. All the data we collect, however, cannot be treated the same way, as there are differences in types of data. Accordingly, statistical data can normally be divided into two major categories:

Qualitative
Quantitative

Each of these categories can be further subdivided into two subcategories. The two subcategories of qualitative data are nominal and ordinal, and the two subcategories of quantitative data are interval and ratio. We may summarize this classification of statistical data as shown in Figure 2.1.
Figure 2.1   Classifications of statistical data. [Figure: a tree diagram with Statistical Data at the top, branching into Qualitative (Nominal and Ordinal) and Quantitative (Interval and Ratio).]

The classification of data as nominal, ordinal, interval, and ratio is arranged in order of strength. In other words, nominal data are the weakest and ratio data are the strongest in terms of the amount of information they can provide.
2.3.1 Nominal Data
As mentioned above, nominal data contain the least amount of information. Numbers representing nominal data are merely symbols used to label categories of a population. For example, production part numbers with a 2003 prefix are nominal data wherein the 2003 prefix indicates only that the parts were produced in 2003 (in this case, the year 2003 serves as the category). No arithmetic operation, such as addition, subtraction, multiplication, or division, can be performed on numbers representing nominal data. For example, similar categories are used to represent the years in which parts were produced, such as 2000, 2001, and 2002; adding the first two numbers and comparing the sum with the third number makes no sense. Other examples of nominal data are ID numbers of workers, account numbers used by finance/accounting departments, zip codes, and telephone numbers.
2.3.2 Ordinal Data
Ordinal data are stronger than nominal data. When the ordering of categories becomes important, the data collected are called ordinal. Using ordinal data, certain companies can be ranked according to the quality of their products or according to their annual revenues, or burn injuries can be ordered by severity so that 1 represents a very serious injury, 2 serious, 3 moderate, and 4 minor. No arithmetic operations can be performed on ordinal data.
Some other examples of ordinal data are geographical regions designated as A, B, C, and D for shipping purposes, preference rankings of vendors who can be called upon for service, and skill ratings of certain workers of a company.
2.3.3 Interval Data
Interval data are numerical data, stronger than nominal and ordinal data but weaker than ratio data. Typical examples of interval data are temperature, flow rate, and the index number of hardness of a metal. The arithmetic operations of addition and subtraction are applicable, but multiplication and division are not. For example, suppose the temperatures of three consecutive parts A, B, and C during a selected step in a manufacturing process are 22°F, 28°F, and 23.5°F, respectively. We can say the temperature difference between parts A and B is different from the difference between parts B and C. We can say that part B is warmer than part A and that part C is warmer than part A but cooler than part B. However, it would be meaningless to say that part B is nearly 20% warmer than part A and nearly 15% warmer than part C. Moreover, in interval data zero does not have its conventional meaning; it is just an arbitrary point on the scale of measurement. For instance, 0°F and 0°C have different meanings and are arbitrary points on different scales of measurement.
2.3.4 Ratio Data
Ratio data have the potential to produce the most meaningful information of all the data types. All arithmetic operations are applicable to this type of data. Examples of ratio data are height, weight, length of rods, diameter of a ball bearing, RPM of a motor, number of employees in a company, and hourly wages.

3
Describing Data Graphically

In Chapter 2 we introduced descriptive statistics. In this and the next chapter we take a detailed look at the various methods that come under the umbrella of descriptive statistics.
Commonly, practitioners applying statistics in a professional environment become overwhelmed by the large data sets they have collected. Occasionally, practitioners even have difficulty understanding data sets because too many, or too few, of the factors that influence a response variable of interest are included in the data sets. In other cases, practitioners may doubt whether the proper statistical technique was used to collect the data. Consequently, the information present in a selected data set may be biased or incomplete.
To avoid the situations described above, it is important to stay focused on the purpose or need for collecting the data. By staying focused on the purpose or need, it is much easier to ensure the use of appropriate data collection techniques and the selection of appropriate factors. Descriptive statistics are commonly used in applied statistics to help us understand the information contained in large and complex data sets. Next, we continue our discussion of descriptive statistics by considering an important tool called the frequency distribution table.

3.1 Frequency Distribution Table


Graphical methods allow us to visualize characteristics of the data, as well as
to summarize pertinent information contained in the data. The frequency distribution table is a powerful tool that helps summarize both quantitative and
qualitative data, enabling us to prepare additional types of graphics discussed
in this chapter.
3.1.1 Qualitative Data
A frequency distribution table for qualitative data consists of two or more
categories along with the data points that belong to each category. The number
of data points that belong to any particular category is called the frequency of
that category. For illustration, let us consider the following example.
Example 3.1 Consider a random sample of 110 small to midsize companies located in the midwestern United States. Classify them according to
their annual revenues (in millions of dollars).
Solution: We can classify the annual revenues into five categories:
Category 1: Annual revenue is under $250 million.
Category 2: Annual revenue is at least $250 million but less than $500 million.
Category 3: Annual revenue is at least $500 million but less than $750 million.
Category 4: Annual revenue is at least $750 million but less than $1,000 million.
Category 5: Annual revenue is $1,000 million or more.
The data collected are given in Table 3.1.
After tallying the data, we find that of the 110 companies, 30 belong in
the rst category, 25 in the second category, 20 in the third category, 15 in
the fourth category, and 20 in the fth category. The frequency distribution
table for these data is shown in Table 3.2.
Notes:
1. While preparing the frequency distribution table, we must
ensure that no data point belongs to more than one category and
that no data point is omitted from the count. In other words,
each data point must belong to only one category.
2. The total frequency is always equal to the total number of data
points in the data set. In the above example, the total frequency
is equal to 110.
Definition 3.1 A variable is a characteristic of the data under consideration. For example, in the Example 3.1 data, a company's annual revenues are under consideration, so revenue is a variable.
The information provided in the frequency distribution table (Table 3.2) can be
expanded if we include two more columns: a column of relative frequencies
and a column of cumulative frequencies.

Table 3.1 Annual revenues of 110 small to midsize companies in the midwestern
United States.

1 4 3 5 3 4 1 2 3 4 3 1 5 3 4 2 1 1 4 5 5 3 5 2 1 2 1 2 3 3 2 1 5 3 2 1
1 1 2 2 4 5 5 3 3 1 1 2 1 4 1 1 1 4 4 5 2 4 1 4 4 2 4 3 1 1 4 4 1 1 2 1 5
3 1 1 2 5 2 3 1 1 2 1 1 2 2 5 3 2 2 5 2 5 3 5 5 3 2 3 5 2 3 5 5 2 3 2 5


Table 3.2 Frequency distribution table for 110 small to midsize companies in the
midwestern United States.

Category number   Tally                                  Category/Class frequency
1                 ///// ///// ///// ///// ///// /////              30
2                 ///// ///// ///// ///// /////                    25
3                 ///// ///// ///// /////                          20
4                 ///// ///// /////                                15
5                 ///// ///// ///// /////                          20
Total                                                             110

Table 3.3 Complete frequency distribution table for the 110 small to midsize
companies in the midwestern United States.

Category   Tally                                  Frequency   Relative    Percentage   Cumulative
number                                                        frequency                frequency
1          ///// ///// ///// ///// ///// /////       30        30/110       27.27          30
2          ///// ///// ///// ///// /////             25        25/110       22.73          55
3          ///// ///// ///// /////                   20        20/110       18.18          75
4          ///// ///// /////                         15        15/110       13.64          90
5          ///// ///// ///// /////                   20        20/110       18.18         110
Total                                                110                   100%

The column of relative frequencies is obtained by dividing the frequency of
each class by the total frequency. The column of cumulative frequencies is
obtained by adding the frequency of each class to the frequencies of all the
preceding classes, so that the last entry in this column is equal to the total
frequency. Some practitioners like to use a column of percentages instead of,
or in addition to, the relative frequency column. The percentage column is
easy to obtain: just multiply each entry in the relative frequency column by
100. For example, the expanded (or complete) version of Table 3.2 is shown
in Table 3.3.
Sometimes a data set is such that it consists of only a few distinct observations, which occur repeatedly. This kind of data is normally treated in the
same way as the categorical data. The categories are represented by the distinct observations. We illustrate this scenario with the following example.
Example 3.2 The following data show the number of coronary artery
bypass graft surgeries performed at a hospital in 24-hour periods during the
past 50 days. Bypass surgeries are usually performed when a patient has
multiple blockages or when the left main coronary artery is blocked.

1 2 1 5 4 2 3 1 5 4 3 4 6 2 3 3 2 2 3 5 2 5 3 4 3
1 3 2 2 4 2 6 1 2 6 6 1 4 5 4 1 4 2 1 2 5 2 2 4 3
Construct a complete frequency distribution table for these data.


Table 3.4 Complete frequency distribution table for the data in Example 3.2.

Category   Tally              Frequency   Relative    Percentage   Cumulative
number                                    frequency                frequency
1          ///// ///              8         8/50          16            8
2          ///// ///// ///       13        13/50          26           21
3          ///// /////           10        10/50          20           31
4          ///// ////             9         9/50          18           40
5          ///// /                6         6/50          12           46
6          ////                   4         4/50           8           50
Total                            50                      100%

Solution: In this example the variable of interest is the number of bypass
surgeries performed at a hospital in a period of 24 hours. Following the
discussion in Example 3.1, we obtain the frequency distribution table for the
data in this example, shown in Table 3.4.
The frequency distribution table in Table 3.4 is usually called a
single-valued frequency distribution table.
Interpretation of a Frequency Distribution Table In Table 3.4 the entries in
row 2, for example, refer to category 2. Entries in row 2 and in column 1
indicate that the number of bypass surgeries performed in 24 hours is two.
Entries in column 2 count the number of days when two bypass surgeries are
performed. Column 3 indicates that on 13 days two bypass surgeries are performed. Column 4 indicates the proportion of days (13 out of 50) on which
two bypass surgeries are performed. Column 5 indicates that on 26% of the
days two bypass surgeries are performed. Column 6 indicates that on 21 days
the number of bypass surgeries performed is one or two.
3.1.2 Quantitative Data
In the preceding section we studied frequency distribution tables for qualitative data. In this section we will discuss frequency distribution tables for
quantitative data.
Let x1, x2, ..., xn be a set of quantitative data. We would like to construct
a frequency distribution table for this data set. In order to prepare such a table
we need to go through the following steps.
Step 1 Find the range of the data, which is defined as

Range (R) = largest data point − smallest data point    (3.1)

Step 2 Divide the data set into an appropriate number of classes/categories.
The appropriate number of classes/categories is commonly defined as the
variable m, which is determined in one of two ways: (1) as a rule of thumb,
take m between 5 and 20 classes or categories such that the average number
of data points in each class or category is about six or seven, or (2) use
Sturges' formula:

m = 1 + 3.3 log n    (3.2)

where n is the total number of data points in the given data set.


Step 3 Determine the width of the classes as follows:

Class width = R/m    (3.3)

Step 4 Finally, prepare the frequency distribution table by assigning each
data point to an appropriate class or category. While assigning these data
points to a class, we must be particularly careful to ensure that each data
point is assigned to only one class and that the whole set is included in the
table. It is also important that the class on the lowest end of the scale must
be started with a number less than or equal to the smallest data point and that
the class on the highest end of the scale must end with a number greater than
or equal to the largest data point in the data set.
Notes:
1. Quite often when we determine the class width, the number
obtained by dividing R by m is not an easy number to work
with. In such cases we should always round this number up,
preferably to a whole number. Never round it down.
2. If we use Sturges' formula to find the number of classes, then
usually the value of m is not a whole number. In that case one
must round it up or down to a whole number, since the number
of classes can only be a whole number.
Example 3.3 The following data define the lengths (in millimeters) of 40
randomly selected rods manufactured by a company:
145 140 120 110 135 150 130 132 137 115
142 115 130 124 139 133 118 127 144 143
131 120 117 129 148 130 121 136 133 147
147 128 142 147 152 122 120 145 126 151
Prepare a frequency distribution table for these data.
Solution: Following the steps described above, we have
1. Range (R) = 152 − 110 = 42
2. Number of classes m = 1 + 3.3 log 40 = 6.29, which, by
rounding, becomes 6.
3. Class width = R/m = 42/6 = 7
The six classes we use to prepare the frequency distribution table are 110
to under 117, 117 to under 124, 124 to under 131, 131 to under 138, 138 to
under 145, and 145 to 152.
Note that in the case of quantitative data, each class is defined by two
numbers. The smaller of the two numbers is usually called the lower limit
and the larger is called the upper limit.

Table 3.5 Frequency table for the data on rod lengths.

Classes     Tally          Frequency   Relative    Percentage   Cumulative
                                       frequency                frequency
[110-117)   ///                3         3/40          7.5           3
[117-124)   ///// //           7         7/40         17.5          10
[124-131)   ///// ///          8         8/40         20.0          18
[131-138)   ///// //           7         7/40         17.5          25
[138-145)   ///// /            6         6/40         15.0          31
[145-152]   ///// ////         9         9/40         22.5          40
Total                         40                     100%

Note that except for the last class, the upper limit does not belong to the
class. This means, for example, that the data point 117 will be assigned to
class 2 and not class 1. This way no two classes have any common point,
which ensures that each data point belongs to only one class. For
simplification we will use mathematical notation to denote the above classes as

[110-117), [117-124), [124-131), [131-138), [138-145), [145-152]

where customarily the symbol [ implies that the end point belongs to the class
and ) implies that the end point does not belong to the class. The frequency
distribution table for the data in this example is then as shown in Table 3.5.
Once data are placed in a frequency distribution table, they are referred to
as grouped data. Once the data are grouped, it is not possible to retrieve the
original data; it is important to note that when grouping data, some
information is lost. As we shall see in the next chapter, by using grouped data
we cannot expect to get as accurate a result as we might by using
ungrouped data. In the next chapter we will also see that in order to calculate
certain quantities, such as the mean and variance, using grouped data we
need to define another quantity, the class mark or class midpoint, which is
defined as the average of the upper and lower limits of the class. For example,
the midpoint of class 1 in the above example is

Midpoint of class 1 = (110 + 117)/2 = 113.5    (3.4)
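The four steps above are easy to automate. The following Python sketch is our own illustration, not part of the original text; the function name freq_table and the use of base-10 logarithms in Sturges' formula are assumptions of this sketch. Applied to the rod-length data of Example 3.3, it reproduces the class frequencies of Table 3.5.

```python
import math
from collections import Counter

def freq_table(data):
    """Build class limits and frequencies following Steps 1-4
    (range, Sturges' formula, class width, class assignment)."""
    n = len(data)
    r = max(data) - min(data)               # Step 1: range
    m = round(1 + 3.3 * math.log10(n))      # Step 2: Sturges' formula
    width = math.ceil(r / m)                # Step 3: always round width up
    lower = min(data)
    classes = [(lower + i * width, lower + (i + 1) * width) for i in range(m)]
    freq = Counter()
    for x in data:                          # Step 4: assign each point once
        for lo, hi in classes:
            # the last class is closed on the right: [lo, hi]
            if lo <= x < hi or (x == hi and (lo, hi) == classes[-1]):
                freq[(lo, hi)] += 1
                break
    return classes, freq

rods = [145, 140, 120, 110, 135, 150, 130, 132, 137, 115,
        142, 115, 130, 124, 139, 133, 118, 127, 144, 143,
        131, 120, 117, 129, 148, 130, 121, 136, 133, 147,
        147, 128, 142, 147, 152, 122, 120, 145, 126, 151]

classes, freq = freq_table(rods)
for lo, hi in classes:
    print(f"[{lo}-{hi})", freq[(lo, hi)])   # 3, 7, 8, 7, 6, 9, as in Table 3.5
```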

3.2 Graphical Representation of a Data Set


Graphical representation of a data set is a powerful tool that provides us with
good visual and instantaneous information that helps us analyze the data.
3.2.1 Dot Plot
A dot plot is one of the easiest graphs to construct. In a dot plot each observation is plotted on a real line. For illustration we consider the following
example.


DOT PLOT
DESCRIPTION: A graphical tool used to provide visual information about the distribution of a single variable.
USE: Used to assess the distribution of a single variable. Two side-by-side dot plots can be used to compare the distributions of two different data sets. Pioneer statisticians used dot plots to compare the results of different experiments when other sophisticated techniques to analyze data were not available.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Data may have an unusually large range or may be concentrated at only a few points.
SPECIAL COMMENTS/CONCERNS: The simplest graph to study, particularly for small quantitative data sets. Does not provide particularly sound information for analysis.
RELATED TOOLS: A similar graph for studying two variables simultaneously is the scatter plot.

Example 3.4 The following data give the number of defective motors
received in 20 shipments:

8 12 10 16 6 25 21 15 17 5
26 21 29 8 10 21 10 17 15 13

Construct a dot plot for these data.
Solution: To construct a dot plot, first draw a horizontal line whose scale
begins at the smallest observation (5 in this case) or smaller and ends
with the largest observation (29 in this case) or larger (see Figure 3.1).
Then plot each observation as a dot above its value on the line.
Dot plots usually are more useful when the sample size is small. A dot
plot gives us, for example, information about how far the data are scattered
and where most of the observations are concentrated. For instance, in the
above example, we see that the minimum number of defective motors and the
maximum number of defective motors received in any shipment was 5 and
29, respectively. Also, we can see that in 75% of the shipments the number of
defective motors was between 8 and 21, and so on.

Figure 3.1 Dot plot for the data on defective motors received in 20 shipments (horizontal axis: number of defective motors).
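As a rough illustration (our own, not from the original text), the following Python sketch prints a rotated text version of a dot plot; the helper name text_dot_plot is hypothetical. Each row lists a value with one dot per observation, which is the dot plot of Figure 3.1 turned on its side.

```python
from collections import Counter

def text_dot_plot(data):
    """Print a rough text dot plot: one row per value on the scale,
    with one dot per observation of that value."""
    counts = Counter(data)
    for value in range(min(data), max(data) + 1):
        print(f"{value:3d} {'.' * counts[value]}")

shipments = [8, 12, 10, 16, 6, 25, 21, 15, 17, 5,
             26, 21, 29, 8, 10, 21, 10, 17, 15, 13]
text_dot_plot(shipments)
```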


3.2.2 Pie Chart

PIE CHART
DESCRIPTION: A graphical tool used to study a population when it is divided into different categories. Each category is represented by a slice of the pie, with the angle at the center of the pie proportional to the frequency of the corresponding category.
USE: Used to study budget distributions and demographic data, or to study various aspects of manufacturing processes, such as the percentage of different sizes of parts in inventory or of parts produced at different plants or in different shifts.
TYPE OF DATA: Categorical (qualitative) data.
DESIGN/APPLICATION CONSIDERATIONS: The pie chart shows how the total quantity (population) is divided and allocated to different categories.
SPECIAL COMMENTS/CONCERNS: A simple visual technique to summarize a categorical data set.
RELATED TOOLS: Bar chart.

Pie charts are commonly used to represent the categories of a population that are
created by a characteristic of interest of that population. Examples include
allocation of the federal budget by sector, revenues of a large manufacturing
company by region, and technicians in a large corporation by qualification, that is,
high school diploma, associate degree, undergraduate degree, or graduate
degree, and so on. The pie chart helps us better understand at a glance the
composition of the population with respect to the characteristic of interest.
To construct a pie chart, divide a circle into slices such that each slice
represents a category proportional to the size of that category. Remember, the
total angle of the circle is 360 degrees. The angle of a slice corresponding to
a given category is determined as follows:

Angle of a slice = (relative frequency of the given category) × 360
We illustrate the construction of a pie chart using the data in Example 3.5.
Example 3.5 In a manufacturing operation we are interested in better
understanding defect rates as a function of our various process steps. The
inspection points are initial cutoff, turning, drilling, and assembly. These
data are shown in Table 3.6. Construct a pie chart for these data.
Solution: The pie chart for these data is constructed by dividing the circle
into four slices. The angle of each slice is given in the last column of Table
3.6. The pie chart appears in Figure 3.2.


Table 3.6 Understanding defect rates as a function of various process steps.

Process steps    Frequency   Relative frequency   Angle size
Initial cutoff       86            86/361             85.75
Turning             182           182/361            181.50
Drilling             83            83/361             82.75
Assembly             10            10/361             10.00
Total               361             1.000             360.00

Figure 3.2 Pie chart for defects associated with manufacturing process steps: turning 50.1%, initial cutoff 23.8%, drilling 23.0%, assembly 2.8%.

Clearly the pie chart in Figure 3.2 gives us a better understanding at a glance about the rate of defects occurring at different process steps.
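To see where the angle-size column of Table 3.6 comes from, here is a minimal Python sketch of the slice-angle formula above; it is our own illustration, and the variable names are our own choices. The printed angles and shares agree with Table 3.6 and Figure 3.2 up to rounding.

```python
# Defect counts by process step (Table 3.6)
steps = {"Initial cutoff": 86, "Turning": 182, "Drilling": 83, "Assembly": 10}
total = sum(steps.values())              # 361 defects in all

for name, f in steps.items():
    angle = (f / total) * 360            # relative frequency times 360 degrees
    share = (f / total) * 100            # percentage shown on the pie chart
    print(f"{name}: {angle:.2f} degrees, {share:.1f}%")
```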
3.2.3 Bar Chart
Bar charts are commonly used to study one or more populations classified
into various categories, such as by sector, by region, or over different periods.
For example, we may want to know more about the sales of our company by
sector, by region, or over different periods. A bar chart is constructed by
creating categories represented by intervals of equal length on a horizontal
axis, or x-axis. Within each category we indicate the observations as the
frequency of the corresponding category, which is represented by a bar of
length proportional to the frequency. We illustrate the construction of a bar
chart in the examples that follow.

BAR CHART
DESCRIPTION: A graphical tool in which the frequency of each category of qualitative data is represented by a bar of height proportional to the frequency of the corresponding category.
USE: Provides a visual display of qualitative data. It is also used to compare two or more sets of qualitative data.
TYPE OF DATA: Categorical (qualitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Units used in each category must be identical. For example, parts manufactured by different companies must have identical specifications in order to be compared. Categories must be isolated from each other.
SPECIAL COMMENTS/CONCERNS: A bar chart can be presented vertically or horizontally. It differs from the pie chart in the sense that it can be used to compare two or more populations, whereas the pie chart can be used to study only one population.
RELATED TOOLS: The Pareto chart, which is a powerful tool in SQC, is a special case of the bar chart.

Example 3.6 The following data give the annual revenues (in millions of
dollars) of a company over a period of five years (1998-2002).

78 92 95 94 102
Construct a bar chart for these data.
Solution: We construct a bar chart that is shown in Figure 3.3.
Example 3.7 A company that manufactures auto parts is interested in
studying the types of defects that occur in parts produced at a particular
plant. The following data show the types of defects that occurred over a certain period.
BACABAEDCABCDCAEABAB
CEDCAEADBCBABEDBDBEA
CABCECBABE
Construct a bar chart for the types of defects found in the auto parts.
Solution: To construct a bar chart, we first need to prepare a frequency
distribution table. The data in this example are qualitative, and the categories are
the types of defects, namely, A, B, C, D, and E. The frequency distribution
table is shown in Table 3.7.


Figure 3.3 Bar graph for annual revenues of a company over the period of five years.

Table 3.7 Frequency distribution table for the data in Example 3.7.

Categories   Tally               Frequency   Relative    Cumulative
                                             frequency   frequency
A            ///// ///// ////        14        14/50         14
B            ///// ///// ///         13        13/50         27
C            ///// ////               9         9/50         36
D            ///// //                 7         7/50         43
E            ///// //                 7         7/50         50
Total                                50         1.00

To construct the bar chart, we label intervals of equal length on the x-axis
with the types of defects and then, within each interval, indicate the frequency
of observations associated with that defect. The bar heights are taken to
be equal to the frequencies of the corresponding categories. The desired bar
graph appears in Figure 3.4, which shows that defects of type A occur
most frequently, type B second most frequently, type C third most frequently,
and so on.
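Since a frequency distribution table for qualitative data is just a tally of category counts, it can be generated with Python's collections.Counter. The sketch below is our own illustration; it counts defect types in a string of observations like the one in Example 3.7, printing each category's frequency and relative frequency.

```python
from collections import Counter

# A string of defect-type observations, one letter per inspected part
defects = "BACABAEDCABCDCAEABABCEDCAEADBCBABEDBDBEACABCECBABE"

counts = Counter(defects)
n = len(defects)                       # total frequency (50 parts here)
for category in sorted(counts):
    f = counts[category]
    print(category, f, f"{f}/{n}")     # frequency and relative frequency
```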
Example 3.8 The following data give the frequencies of the defect types of
Example 3.7 for auto parts manufactured over the same period in two plants
that have the same manufacturing capacity.

Defect type    A    B    C    D    E    Total
Plant I       14   13    9    7    7       50
Plant II      12   18              12      52


Construct a bar chart comparing the types of defects occurring in auto parts manufactured in the two plants.
Solution: The bar chart for the data in this example appears in Figure 3.5.
Bar charts such as those shown in Figure 3.5 are commonly used to compare two
or more populations. We can visually observe that defect types B, C, and E
occur less frequently, and defect types A and D occur more frequently, in
Plant I than in Plant II.
Figure 3.4 Bar graph for the data in Example 3.7 (x-axis: defect type; y-axis: frequency).

Figure 3.5 Bar charts for types of defects in auto parts manufactured in Plant I (P1) and Plant II (P2).


HISTOGRAM
DESCRIPTION: A graphical tool consisting of bars representing the frequencies or relative frequencies of classes or categories of quantitative data. The height of each bar is equal to the frequency or relative frequency of the corresponding class.
USE: Used to assess visually the probability distribution of quantitative data.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Used to verify the normality assumption and to see whether any transformation of the data is needed to meet the normality condition.
SPECIAL COMMENTS/CONCERNS: The information provided by a histogram is valid when the data set is sufficiently large. Each bar should be of the same width, and there should not be any gap between bars.
RELATED TOOLS: Frequency polygon, stem and leaf diagram.

3.2.4 Histograms and Related Graphs
Histograms are popular graphs used to represent quantitative data graphically.
Histograms usually provide us with useful information about a data set, for
example, information about trends, patterns, the location/center of the data,
and the dispersion of the data. Such information is not particularly apparent
from raw data.
Construction of a histogram involves two major steps:
Step 1 Prepare a frequency distribution table for the given data.
Step 2 Use the frequency distribution table prepared in step 1
to construct the histogram. From here the steps involved in
constructing a histogram are exactly the same as steps taken
to construct a bar chart except that in a histogram there is
no gap between the intervals marked on the x-axis. We
illustrate the construction of a histogram with the following
example.
Note that a histogram is called a frequency histogram or a relative frequency histogram depending upon whether the heights of rectangles erected
over the intervals marked on the x-axis are proportional to the frequencies or
to the relative frequencies. In both types of histograms the width of rectangles is equal to the class width. In fact, the only difference between the two
histograms is that the scales used on the y-axes are different. This point
should become clear from the following example.


Example 3.9 The following data give the survival time (in hours) of 50
parts involved in a field test under extreme operating conditions:
60 100 130 100 115 30 60 145 75 80 89 57 64 92 87 110 180 195 175 179
159 155 146 157 167 174 87 67 73 109 123 135 129 141 154 166 179 37 49
68 74 89 87 109 119 125 56 39 49 190
a. Construct a frequency distribution table for the above data.
b. Construct frequency and relative frequency histograms for the
above data.
Solution:
a.
1. Find the range of the data:
R = 195 − 30 = 165
2. Determine the number of classes:
m = 1 + 3.3 log 50 = 6.57
By rounding this value we take the number of classes to be equal to 7.
3. Compute the class width:
Class width = R/m = 165/7 = 23.57
By rounding up this number we have class width equal to 24. As noted
earlier, we always round up the class width to a whole number or to any
other number that may be easy to work with. Note that if we round down
the class width, then some of the observations may be left out of our count
and not belong to any class. Consequently the total frequency will be less
than n. The frequency distribution table for the data in this example is as
shown in Table 3.8.

Table 3.8 Frequency distribution table for the survival time of parts.

Classes      Tally           Frequency   Relative    Cumulative
                                         frequency   frequency
[30-54)      /////               5         5/50          5
[54-78)      ///// /////        10        10/50         15
[78-102)     ///// ////          9         9/50         24
[102-126)    ///// //            7         7/50         31
[126-150)    ///// /             6         6/50         37
[150-174)    ///// /             6         6/50         43
[174-198]    ///// //            7         7/50         50
Total                           50


Figure 3.6 Frequency histogram for survival time of parts under extreme operating conditions.

b. Having completed the frequency distribution table, we are now
ready to construct the histograms. To construct the frequency
histogram, we mark the classes on the x-axis and the
frequencies on the y-axis. Remember that when we mark the
classes on the x-axis we must make sure there is no gap
between the classes. Then on each class marked on the x-axis,
place a rectangle where the height of each rectangle is
proportional to the frequency of the corresponding class. The
frequency histogram for the data with a frequency distribution
given in Table 3.8 is shown in Figure 3.6.
To construct the relative frequency histogram, just change the scale on
the y-axis in Figure 3.6 so that instead of plotting the frequencies, we can
plot relative frequencies. The resulting graph shown in Figure 3.7 will be the
relative frequency histogram for the data, with a relative frequency distribution given in Table 3.8.
As described earlier, it is interesting to note that the two histograms
shown in Figures 3.6 and 3.7 are identical except for the scale on the y-axis.

Figure 3.7 Relative frequency histogram for survival time of parts under extreme operating conditions.

It is important to remember that quite often we encounter data where the
last few observations are very sparse, so that the last few classes have hardly
any observations in them. For instance, in Example 3.9 it could happen
that a few parts survived 200 or more hours. In such cases we usually
keep the last class open at the upper end and include all sparse observations
in that class. For example, if such sparse observations were present in
Example 3.9, then the last class in Table 3.8 would become [174-).
Similarly, if that kind of scenario occurs at the lower end, we keep the lower
end of class 1 open and include in class 1 all the sparse observations that
may be present in the data.
Another graph that becomes the basis of probability distributions that we
will study in later chapters is called the frequency polygon or relative frequency polygon, depending upon which histogram is used to construct this
graph.
To construct the frequency or relative frequency polygon, first mark the
midpoints on the top ends of the rectangles of the corresponding histogram
and then simply join these midpoints. Note that we include classes with zero
frequencies at the lower as well as at the upper end of the histogram so that
we can connect the polygon with the x-axis. The curves obtained by joining
the midpoints are called the frequency or relative frequency polygons, as the
case may be. The frequency polygon and the relative frequency polygon for
the data in Example 3.9 are shown in Figure 3.8 and Figure 3.9, respectively.

Figure 3.8 Frequency polygon for the data in Example 3.9.

Figure 3.9 Relative frequency polygon for the data in Example 3.9.

Sometimes a data set consists of a very large number of observations,
which results in a large number of classes of very small widths. In
such cases frequency polygons or relative frequency polygons become
smooth curves. For example, Figure 3.10 shows one such smooth curve.

Figure 3.10 A typical frequency distribution curve.

Such curves are usually called frequency distribution curves and represent
the probability distributions of continuous random variables. We will
study the probability distributions of continuous random variables in Chapter 7.
Our comments on this topic indicate the importance of histograms, as they
eventually become the basis of probability distributions.


Figure 3.11 Three types of frequency distribution curves.

Figure 3.12 Cumulative frequency histogram for the data in Example 3.9.

The shape of the frequency distribution curve depends on the shape of
its corresponding histogram, which, in turn, depends on the given set of
data. The shape of a frequency distribution curve can be of any type, but in
general we have three types of frequency distribution curves, as shown in
Figure 3.11.
Cumulative Frequency Histogram If we now use cumulative frequencies
instead of simple frequencies or relative frequencies, we get what is known
as a cumulative frequency histogram. The cumulative frequency histogram
for the data in Example 3.9 on survival time of parts is shown in Figure 3.12.
Now, if we add another class with zero frequency at the lower end
of the cumulative frequency histogram and join the end points, instead of the
midpoints, we get an ogive curve. We show the ogive curve for the survival
data in Example 3.9 in Figure 3.13.
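As a quick sketch (our own, not from the text), the running totals behind Figures 3.12 and 3.13 can be produced with itertools.accumulate; the class frequencies below are taken from Table 3.8.

```python
from itertools import accumulate

# Class frequencies for the survival-time data (Table 3.8)
freq = [5, 10, 9, 7, 6, 6, 7]
cum = list(accumulate(freq))   # running totals for the cumulative histogram
print(cum)                     # [5, 15, 24, 31, 37, 43, 50]
```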


Figure 3.13 Ogive curve for the survival data in Example 3.9 (the horizontal scale begins at 6, the lower end of the added zero-frequency class).

3.2.5 Line Graph

LINE GRAPH
DESCRIPTION: A graphical tool used to display time series data.
USE: Used to assess the trend or trends that might occur over time in quantitative data. It may also be used to study the effect of one or more factors on a response variable.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Any trends in the data may violate the conditions of the underlying model.
SPECIAL COMMENTS/CONCERNS: The results displayed by the line graph should be verified by using numerical methods.
RELATED TOOLS: Control charts, frequency polygon.

A line graph, also known as a time series graph, is commonly used to study
any changes that take place over time in the variable of interest. In a line
graph, time is marked on the horizontal axis or x-axis and the variable on the
vertical axis or y-axis. For illustration we use the data in Example 3.10.


Example 3.10 The following data give the number of lawn mowers sold
(LMS) by a garden shop in one year:

Month  Jan.  Feb.  Mar.  April  May  June  July  Aug.  Sept.  Oct.  Nov.  Dec.
LMS       2     1    10     57   62    64    69    68     52    40    15    10
Prepare a line graph for the above data.
Solution: To prepare the line graph, plot the above data using the x-axis for
the months and the y-axis for the number of lawn mowers sold, and then join
the plotted points with a freehand curve. The line graph for the data in this
example is shown in Figure 3.14.
From the line graph in Figure 3.14 we can see that the sale of lawn mowers
is seasonal, since more mowers are sold in the summer months. Another
point worth noting is that a good number of lawn mowers are sold in
September, when summer is winding down. This may be explained by the
fact that many stores offer sales to clear out such items when the season is
about to end. Any mower sales during the winter months may be because of
discounted prices, or the store may be located where winters are mild and
there is still a need for mowers, though at a much lower frequency.
3.2.6 Stem and Leaf Diagram
A stem and leaf diagram is a powerful tool that is used to summarize quantitative data. The stem and leaf diagram has numerous advantages over the
frequency distribution table and the frequency histogram. One major advantage of the stem and leaf diagram over the frequency distribution table is that
from a frequency distribution table we cannot retrieve the original data, but
from a stem and leaf diagram we can easily retrieve data in its original form.

Figure 3.14 Line graph (time series plot of lawn mowers sold) for the data given in Example 3.10.


STEM AND LEAF DIAGRAM
DESCRIPTION: A graphical tool used to display data using the actual data values. Each value is split into two parts: the part with the leading digits is called the stem, and the rest is called the leaf.
USE: Used very commonly for exploratory data analysis.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: In certain situations the normality assumption may not be valid, and a data transformation may be necessary.
SPECIAL COMMENTS/CONCERNS: Unlike the histogram, the stem and leaf diagram does not lose the identity of the actual value of each data point. Quite effective when the data set is small.
RELATED TOOLS: Frequency distribution table, histogram.

Table 3.9 Data on survival time (in hours) in Example 3.9.

 60 100 130 100 115  30  60 145  75  80  89  57  64  92  87 110 180
195 175 179 159 155 146 157 167 174  87  67  73 109 123 135 129 141
154 166 179  37  49  68  74  89  87 109 119 125  56  39  49 190

In other words, by preparing a stem and leaf diagram we do not lose any
information. We illustrate the construction of the stem and leaf diagram with
the following example.
Example 3.11 Use the data on survival time of parts in Example 3.9. The
data of Example 3.9 are reproduced in Table 3.9.
Solution: To create a stem and leaf diagram, we split each observation in
the data set into two parts, called the stem and the leaf. For this example we
split each observation at the units place, so that the digit in the units place is a
leaf and the part to its left is a stem. For example, for the observations 60 and
100 the stems and leaves are

Stem   Leaf
  6     0
 10     0

To construct a complete stem and leaf diagram, list all the stems, without
repeating them, in a column, and then list all the leaves in a row against the
corresponding stem. For the data in this example, we have the diagram
shown in Figure 3.15.


(a)
      Stem   Leaf
  3     3    0 7 9
  5     4    9 9
  7     5    7 6
 12     6    0 0 4 7 8
 15     7    5 3 4
 21     8    0 9 7 7 9 7
 22     9    2
 (4)   10    0 0 9 9
 24    11    5 0 9
 21    12    3 9 5
 18    13    0 5
 16    14    5 6 1
 13    15    9 5 7 4
  9    16    7 6
  7    17    5 9 4 9
  3    18    0
  2    19    5 0

(b)
      Stem   Leaf
  3     3    0 7 9
  5     4    9 9
  7     5    6 7
 12     6    0 0 4 7 8
 15     7    3 4 5
 21     8    0 7 7 7 9 9
 22     9    2
 (4)   10    0 0 9 9
 24    11    0 5 9
 21    12    3 5 9
 18    13    0 5
 16    14    1 5 6
 13    15    4 5 7 9
  9    16    6 7
  7    17    4 5 9 9
  3    18    0
  2    19    0 5

Figure 3.15 Ordinary (a) and ordered (b) stem and leaf diagrams for the data on survival time for parts in extreme operating conditions in Example 3.9.

Note that in Figure 3.15(a) leaves occur in the same order as observations in the raw data. In Figure 3.15(b) leaves appear in the ascending order,
and that is why it is called an ordered stem and leaf diagram. By rotating
the stem and leaf diagram counterclockwise through 90 degrees, we see the
diagram can serve the same purpose as a histogram with stems serving the
role of classes, leaves as class frequencies, and rows of leaves as rectangles.
Unlike the frequency distribution table and the histogram, the stem and leaf
diagram can be used to determine, for example, what percentage of parts
survived between 90 and 145 hours: using the stem and leaf diagram we can
see that 15 out of 50, or 30%, of the parts survived between 90 and 145
hours. Using either the frequency table or the histogram this question cannot
be answered, since the interval from 90 to 145 covers only parts of classes 3
and 5. The first column in the diagram counts, from the top and from the
bottom respectively, the number of parts that survived up to and beyond a
certain number of hours. For example, the entry in the fifth row from the
top indicates that 15 parts survived less than 80 hours, whereas the entry
in the fifth row from the bottom indicates that 13 parts survived at least 150
hours. The entry in parentheses marks the row that contains the median
value of the data. It is clear that we can easily retrieve the original data
from the stem and leaf diagram. Thus, for example, the first row in Figure
3.15 consists of the data points 30, 37, and 39. Ultimately, we can see that
the stem and leaf diagram is usually more informative than a frequency
distribution table or a histogram.
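A basic ordered stem and leaf diagram is also easy to generate in software. The sketch below is our own illustration (the function name stem_and_leaf is hypothetical): it splits each observation at the units place, exactly as in Example 3.11, and prints the stems and ordered leaves of Figure 3.15(b), though without the depth-count column.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Print an ordered stem and leaf diagram, splitting each observation
    at the units place (stem = value // 10, leaf = value % 10)."""
    stems = defaultdict(list)
    for x in sorted(data):
        stems[x // 10].append(x % 10)
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems[stem])
        print(f"{stem:3d} | {leaves}")

survival = [60, 100, 130, 100, 115, 30, 60, 145, 75, 80, 89, 57, 64, 92, 87,
            110, 180, 195, 175, 179, 159, 155, 146, 157, 167, 174, 87, 67, 73,
            109, 123, 135, 129, 141, 154, 166, 179, 37, 49, 68, 74, 89, 87,
            109, 119, 125, 56, 39, 49, 190]
stem_and_leaf(survival)
```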
Breaking Each Stem into Two or Five Parts Quite often we deal with a large
data set that is spread over a narrow range. When we prepare a stem and leaf
diagram for data that do not indicate much variability, it becomes difficult to
interpret, as there are too many leaves on the same stem. Stem and leaf
diagrams in which stems have too many leaves tend to be less informative,
since they are not as clear as those in which the stems do not have too many
leaves. To illustrate this scenario, we consider the stem and leaf diagram in
Example 3.12.
Example 3.12 A manufacturing company has been awarded a huge contract by the Defense Department to supply spare parts. In order to provide
these parts on schedule, the company needs to hire a large number of new
workers. To estimate how many workers to hire, representatives of the human
resources department took a random sample of 80 workers and found the
number of parts each worker produces per week. The data collected are
shown in Table 3.10.
Table 3.10 Number of parts produced by each worker per week.

66 71 73 70 68 79 84 85 77 75 61 69 74 80 83 82 86 87 78 81
82 74 73 69 68 87 85 86 87 89 90 92 71 93 67 66 65 68 73 72
75 76 74 89 86 91 92 65 64 62 67 63 69 73 69 71 76 77 84 83
68 81 87 93 92 81 80 70 63 65 62 69 74 76 83 85 91 89 90 85
Prepare a stem and leaf diagram for the data in Table 3.10.
Solution: The stem and leaf diagram for the data in Table 3.10 appears in
Figure 3.16.
The stem and leaf diagram in Figure 3.16 is not as informative as it
would be if it had more stems so that each stem had fewer leaves. We can
modify the diagram by breaking each stem into two parts so that the first part
carries leaves 0 through 4 and the second part carries leaves 5 through 9. The
modified stem and leaf diagram is shown in Figure 3.17.
Sometimes even a two-stem and leaf diagram is insufficient for illustrating
the desired information, since some of the stems still have too many
leaves. In such cases we can break each stem into five parts so that the stems
carry leaves 0-1, 2-3, 4-5, 6-7, and 8-9, respectively. The new stem and leaf
diagram appears in Figure 3.18.
      Stem   Leaf
 22     6    1223345556677888899999
(23)    7    00111233334444556667789
 35     8    00111223334455556667777999
  9     9    001122233

Figure 3.16 Ordered stem and leaf diagram for the data in Table 3.10.


      Stem   Leaf
  6     6*   122334
 22     6.   5556677888899999
 36     7*   00111233334444
 (9)    7.   556667789
 35     8*   001112233344
 23     8.   55556667777999
  9     9*   001122233

Figure 3.17 Ordered two-stem and leaf diagram for the data in Table 3.10.

      Stem   Leaf
  1     6*   1
  5     6t   2233
  9     6f   4555
 13     6s   6677
 22     6.   888899999
 27     7*   00111
 32     7t   23333
 38     7f   444455
 (5)    7s   66677
 37     7.   89
 35     8*   00111
 30     8t   22333
 25     8f   445555
 19     8s   6667777
 12     8.   999
  9     9*   0011
  5     9t   22233

Figure 3.18 Ordered five-stem and leaf diagram for the data in Table 3.10.

Note that the stem and leaf diagram in Figure 3.18 has become much
simpler and accordingly more informative. It is also interesting to note
that the labels t, f, and s used to denote the stems have real meanings: in
the leaves assigned to them, t stands for two and three, f stands for four and
five, and s stands for six and seven.


3.2.7 Measure of Association


So far in this chapter we have dedicated our discussion to univariate statistics because we were interested in studying only a single characteristic of a
subject of concern. The variable of interest was either qualitative or quantitative. We now divert our attention to cases involving two variables; that is,
we simultaneously examine two characteristics of a subject of concern. The
two variables of interest could be either qualitative or quantitative. We will
study only variables that are quantitative in the remainder of this volume.
Qualitative variables will be visited in a later volume.
When studying two variables simultaneously, the data obtained for such
a study are known as bivariate data. In examining bivariate data, the first
question is whether there is any association between the two variables of
interest. One effective way to investigate such an association is to prepare
a graph by plotting one variable along the horizontal scale (x-axis) and
the second variable along the vertical scale (y-axis). Each pair of observations
(x, y) is then plotted as a point in the xy-plane. The graph so prepared is
called a scatter plot. A scatter plot depicts the nature and strength of the
association between two variables. To illustrate, we consider the following
example.

SCATTER PLOT
DESCRIPTION: A graphical tool used to plot and compare one variable against another variable.
USE: Used to assess whether there is any relationship between two variables.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Several relationships may be present between two variables, including positive correlation, possible positive correlation, no correlation, possible negative correlation, and negative correlation.
SPECIAL COMMENTS/CONCERNS: The scatter plot does not indicate cause in a cause-and-effect relationship.
RELATED TOOLS: Line graph. A similar graph for a single variable is the dot plot.

Example 3.13 The cholesterol levels and the systolic blood pressures of 30
randomly selected U.S. men in the age group of 40 to 50 years are given in
Table 3.11. Construct a scatter plot of these data and determine whether there
is any association between the cholesterol levels and systolic blood pressures.


Table 3.11 Cholesterol levels and systolic BP of 30 randomly selected U.S. males.

Subject            1    2    3    4    5    6    7    8    9   10
Cholesterol (x)  195  180  220  160  200  220  200  183  139  155
Systolic BP (y)  130  128  138  122  140  148  142  127  116  123

Subject           11   12   13   14   15   16   17   18   19   20
Cholesterol (x)  153  164  171  143  159  167  162  165  178  145
Systolic BP (y)  119  130  128  120  121  124  118  121  124  115

Subject           21   22   23   24   25   26   27   28   29   30
Cholesterol (x)  245  198  156  175  171  167  142  187  158  142
Systolic BP (y)  145  126  122  124  117  122  112  131  122  120
Solution: Figure 3.19(a) shows the scatter plot of the data in Table 3.11.
This scatter plot clearly indicates a fairly strong upward linear trend. Also, if
we draw a straight line through the data points, we can see that the data points
are concentrated around the straight line within a narrow band. The upward
trend indicates a positive association between the two variables, whereas the
narrow width of the band indicates that the association is very strong. As the
association between the two variables gets stronger, the band enclosing the
plotted points becomes narrower and narrower; a downward trend would
indicate a negative association between the two variables. A numerical
measure of association between two numerical variables is called the Pearson
correlation coefficient, named after the English statistician Karl Pearson
(1857-1936). The correlation coefficient between two numerical variables in
sample data is usually denoted by r. The Greek letter ρ (rho) denotes the
corresponding measure of association, that is, the correlation coefficient, for
population data. The correlation coefficient is defined as

$$
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}
  = \frac{\sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}}{\sqrt{\left(\sum x_i^2 - \dfrac{(\sum x_i)^2}{n}\right)\left(\sum y_i^2 - \dfrac{(\sum y_i)^2}{n}\right)}}
\qquad (3.5)
$$

The correlation coefficient is a unitless measure, which can attain any
value in the interval [−1, +1]. As the strength of the association between the
two variables grows, the absolute value of r approaches 1. Thus, when there
is a perfect association between the two variables, r = 1 or r = −1, depending
upon whether the association is positive or negative. In other words, r = 1 if
the two variables are moving in the same direction (both increasing or both
decreasing) and r = −1 if they are moving in opposite directions.
Perfect association means that if we know the value of one variable, then
the value of the other variable can be determined without any error. The other
special case is r = 0, which means that there is no association between
the two variables. As a rule of thumb, the association is weak, moderate, or
strong when the absolute value of r is less than 0.3, between 0.3 and 0.7, or
greater than 0.7, respectively. Figure 3.19 depicts scatter plots for data with
strong, perfect, moderate, and weak correlation.
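Equation (3.5) translates directly into code. The Python sketch below is our own illustration (the helper name pearson_r is an assumption); it evaluates the computational form of the formula on the Table 3.11 data and should reproduce, up to rounding, the value r = .891 reported for Figure 3.19(a).

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient, computational form of equation (3.5)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (sxy - sx * sy / n) / math.sqrt((sxx - sx**2 / n) * (syy - sy**2 / n))

# Cholesterol (x) and systolic BP (y) for the 30 subjects in Table 3.11
chol = [195, 180, 220, 160, 200, 220, 200, 183, 139, 155,
        153, 164, 171, 143, 159, 167, 162, 165, 178, 145,
        245, 198, 156, 175, 171, 167, 142, 187, 158, 142]
bp   = [130, 128, 138, 122, 140, 148, 142, 127, 116, 123,
        119, 130, 128, 120, 121, 124, 118, 121, 124, 115,
        145, 126, 122, 124, 117, 122, 112, 131, 122, 120]

print(round(pearson_r(chol, bp), 3))   # about .891, per Figure 3.19(a)
```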

Figure 3.19 MINITAB displays depicting eight degrees of correlation: (a) strong positive correlation (r = .891), (b) strong negative correlation (r = −.891), (c) perfect positive correlation (r = 1), (d) perfect negative correlation (r = −1), (e) moderate positive correlation (r = .518), (f) moderate negative correlation (r = −.518), (g) weak positive correlation (r = .212), and (h) weak negative correlation (r = −.212).

4
Describing Data Numerically

In Chapter 3 we studied graphical methods to organize and summarize
data. We saw that graphical methods provide us with powerful tools to
visualize the information contained in data. Perhaps you have a new
appreciation for the saying that a picture is worth a thousand words.
In Chapter 4 we extend our knowledge from graphical methods to
numerical methods. Numerical methods provide us with what are commonly
known as quantitative or numerical measures. The numerical methods that we
are about to study are applicable to both sample and population data.

4.1 Numerical Measures


Numerical methods can be used to analyze sample as well as population
data.
Definition 4.1 Numerical measures computed by using population
data are referred to as parameters.
Definition 4.2 Numerical measures computed by using sample data
are referred to as statistics.
In statistics it is a standard practice to denote parameters by letters of the
Greek alphabet and statistics by letters of the English alphabet.
We divide numerical measures into two major categories: (1) measures
of centrality and (2) measures of dispersion. Measures of centrality give us information about the center of the data whereas measures of dispersion give
information about the variation within the data.


4.2 Measures of Centrality

MEASURES OF CENTRALITY
DESCRIPTION: Measures of centrality include several numerical measures. The more commonly used are the mean, median, and mode.
USE: Used to assess the location of the center of a data set and to check the shape of the distribution. An important tool to assess the relative position of two or more distributions.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: In many applications different treatments or different actions on a process may affect the measures of location. In such cases the most commonly used measure is the mean.
SPECIAL COMMENTS/CONCERNS: The choice of any measure of location should be made carefully. For example, the median is a better measure of location if the data contain a few very small or very large values. Normally, measures of location should be used together with measures of dispersion, such as the variance. A distinction must be made between the sample and population measures.
RELATED TOOLS: Measures of dispersion.

Measures of centrality are also known as measures of central tendency.


Whether referring to measures of centrality or central tendency, the following measures are of primary importance:
1. Mean
2. Median
3. Mode
The mean is also sometimes referred to as the arithmetic mean, and it is the
most useful and most commonly used measure of centrality. The median is
the second most used, and mode is the least used measure of centrality.
4.2.1 Mean
The mean of sample or population data is calculated by dividing the sum of
the data measurements by the number of measurements in the sample or the
population data. The mean of a sample is called the sample mean and is

denoted by X̄ (read as "X bar"), and the population mean is denoted by the
Greek letter μ (read as "mu"). These terms are defined numerically as follows:

$$ \text{Population mean: } \mu = \frac{X_1 + X_2 + \cdots + X_N}{N} = \frac{\sum X_i}{N} \qquad (4.1) $$

$$ \text{Sample mean: } \bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{\sum X_i}{n} \qquad (4.2) $$

where Σ (read as "sigma") denotes summation over all the measurements, and where N and n denote the population and sample sizes, respectively.
Example 4.1 The following data give the hourly wages (in dollars) of some
randomly selected workers in a manufacturing company:
8, 6, 9, 10, 8, 7, 11, 9, 8
Find the mean hourly wage of these workers.
Solution: Since the wages listed in these data are only for some of the
workers in the company, the data represent a sample. We have

n = 9
Σxi = 8 + 6 + 9 + 10 + 8 + 7 + 11 + 9 + 8 = 76

So the sample mean is

$$ \bar{X} = \frac{\sum x_i}{n} = \frac{76}{9} = 8.44 $$

In this example, the mean hourly wage of these employees is $8.44 an hour.
Example 4.2 The following data give the ages of all the employees in a city
hardware store:

22, 25, 26, 36, 26, 29, 26, 26

Find the mean age of the employees of the hardware store.
Solution: Since the data give the ages of all the employees of the hardware
store, we are interested in a population. Thus, we have

N = 8
Σxi = 22 + 25 + 26 + 36 + 26 + 29 + 26 + 26 = 216

So the population mean is

$$ \mu = \frac{\sum x_i}{N} = \frac{216}{8} = 27 \text{ years} $$

In this example, the mean age of the employees in the hardware store is 27
years.
Note that even though the formulas for calculating the sample mean and the
population mean are similar, it is important to make a clear distinction between
the sample mean X̄ and the population mean μ for all application purposes.
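As a minimal sketch (ours, not from the text), the same computation serves both cases in Python; only the interpretation of the result as X̄ or μ differs:

```python
from statistics import fmean

wages = [8, 6, 9, 10, 8, 7, 11, 9, 8]     # a sample of workers (Example 4.1)
ages = [22, 25, 26, 36, 26, 29, 26, 26]   # the whole population (Example 4.2)

print(round(fmean(wages), 2))  # sample mean X-bar = 8.44
print(fmean(ages))             # population mean mu = 27.0
```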


Sometimes a data set may include a few observations or measurements
that are very small or very large. For example, the salaries of a group of
engineers in a big corporation may include the salary of its CEO, who also
happens to be an engineer and whose salary is much larger than that of the
other engineers in the group. When some very small and/or very large
observations are present, these values are referred to as extreme values. If
extreme values are present in a data set, the mean is not an appropriate
measure of centrality. It is important to note that any extreme values,
large or small, adversely affect the mean value. In such cases the median is
a better measure of centrality, since the median is unaffected by a few
extreme values. Next we discuss the method for calculating the median of a
data set.
4.2.2 Median
We denote the median of a data set by Md. To determine the median of a data
set, we take the following steps.
Step 1. Arrange the measurements in the data set in ascending
order and rank them from 1 to n.
Step 2. Find the rank of the median, which is equal to (n + 1)/2.
Step 3. Find the value corresponding to the rank (n + 1)/2. This
value represents the median of the data set.
Example 4.3 To illustrate this method we consider a simple example. The
following data give the lengths of alignment pins for a printer shaft in a
batch of production:

30, 24, 34, 28, 32, 35, 29, 26, 36, 30, 33

Find the median alignment pin length.
Solution:
Step 1. Write the data in ascending order and rank them from 1
to 11, since n = 11.

Observations in ascending order: 24 26 28 29 30 30 32 33 34 35 36
Ranks:                            1  2  3  4  5  6  7  8  9 10 11

Step 2. Find the rank of the median:
Rank of the median = (n + 1)/2 = (11 + 1)/2 = 6
Step 3. Find the value corresponding to rank 6 (the rank of the median).
The value corresponding to rank 6 is 30.
Thus, the median alignment pin length is Md = 30. This means that at
most 50% of the alignment pins have lengths less than 30 and at most 50%
have lengths greater than 30.


Example 4.4 The following data describe the sales (in thousands of dollars)
of 16 randomly selected sales personnel distributed throughout the
United States:

10, 8, 15, 12, 17, 7, 20, 19, 22, 25, 16, 15, 18, 250, 300, 12

Find the median sales of these individuals.
Solution:
Step 1. Observations in ascending order:

7 8 10 12 12 15 15 16 17 18 19 20 22 25 250 300
Ranks: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Step 2. Rank of the median = (16 + 1)/2 = 8.5
Step 3. Find the value corresponding to the 8.5th rank. Since
the rank of the median is not a whole number, the median is
defined as the average of the values that correspond to ranks 8
and 9 (since rank 8.5 falls between ranks 8 and 9).
The median of the above data is Md = (16 + 17)/2 = 16.5.
Thus, the median sales figure for these individuals is 16.5 thousand dollars.
It is important to note that the median does not have to be one of the values
of the data set. Whenever the sample size is odd, the median is the center
value, and whenever it is even, the median is the average of the two
middle values, where the data are arranged in ascending order.
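Python's statistics.median follows exactly this rule (the middle ranked value for odd n, the average of the two middle values for even n), so the results of Examples 4.3 and 4.4 can be checked quickly; the sketch below is our own illustration.

```python
from statistics import median

pins = [30, 24, 34, 28, 32, 35, 29, 26, 36, 30, 33]                       # Example 4.3
sales = [10, 8, 15, 12, 17, 7, 20, 19, 22, 25, 16, 15, 18, 250, 300, 12]  # Example 4.4

print(median(pins))    # 30   (odd n: the middle ranked value)
print(median(sales))   # 16.5 (even n: average of the values at ranks 8 and 9)
```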
Finally, note that the data in the above example contain two values, 250
thousand and 300 thousand dollars, that appear to be the sales of top-performing
sales personnel. These two large values may be considered extreme values.
In this case, the mean of these data is given by

X̄ = (7 + 8 + 10 + 12 + 12 + 15 + 15 + 16 + 17 + 18 + 19 + 20 + 22 + 25 + 250 + 300)/16 = 47.875

Since the mean of 47.875 is so much larger than the median of 16.5, it is
obvious that the mean of these data has been adversely affected by the
extreme values. Since the mean does not adequately represent the center of
the data set, the median more accurately identifies the center of the data in
this example.
Furthermore, if we replace the extreme values of 250 and 300 with, for
example, 25 and 30, the median will not change, although the mean becomes
$16,937. The new data obtained by replacing 250 and 300 with 25 and 30 do
not contain any extreme values, so the new mean value is more consistent
with the true average sales.


Example 4.5 Elizabeth took five courses in a semester, with 5, 4, 3, 3, and
2 credit hours. The grade points she earned in these courses at the end of the
semester were 3.7, 4.0, 3.3, 3.7, and 4.0, respectively. Find her GPA for that
semester.
Solution: Note that in this example the data points 3.7, 4.0, 3.3, 3.7, and
4.0 have different weights attached to them, that is, the credit hours for each
course. Thus, to find Elizabeth's GPA we cannot simply use the arithmetic
mean. In this case we find the mean called the weighted mean, which is
defined as
$$ \bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum w_i X_i}{\sum w_i} \qquad (4.3) $$

where w1, w2, ..., wn are the weights attached to X1, X2, ..., Xn, respectively.
In this example, the GPA is given by

$$ \bar{X}_w = \frac{5(3.7) + 4(4.0) + 3(3.3) + 3(3.7) + 2(4.0)}{5 + 4 + 3 + 3 + 2} = 3.735 $$
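A short Python check of this weighted mean (our own sketch of equation (4.3)):

```python
grades = [3.7, 4.0, 3.3, 3.7, 4.0]   # grade points earned
credits = [5, 4, 3, 3, 2]            # weights: credit hours per course

gpa = sum(w * x for w, x in zip(credits, grades)) / sum(credits)
print(round(gpa, 3))                 # 3.735
```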

4.2.3 Mode
The mode of a data set is the value that occurs most frequently. Mode is the
least used measure of centrality. When products are produced via mass production, for example, clothes of certain sizes or rods of certain lengths, the
modal value is of great interest. Note that in any data set there may be no
mode or, conversely, there may be multiple modes. We denote the mode of a
data set by M0.
Example 4.6 Find the mode for the following data set:
3, 8, 5, 6, 10, 17, 19, 20, 3, 2, 11
Solution: In the given data set each value occurs once except 3, which occurs twice. Thus, the mode for this set is:
M0 = 3
Example 4.7 Find the mode for the following data set:
1, 7, 19, 23, 11, 12, 1, 12, 19, 7, 11, 23
Solution: Note that in this data set, each value occurs the same number of
times. Thus, in this data set there is no mode.
Example 4.8 Find modes for the following data set:
5, 7, 12, 13, 14, 21, 7, 21, 23, 26, 5
Solution: In this data set, 5, 7, and 21 occur twice and the rest of the values occur only once. Thus, in this example there are three modes, that is,
M0 = 5, 7, and 21
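The three cases in Examples 4.6 through 4.8 (one mode, no mode, several modes) can be captured in a short Python sketch; the modes function below is a hypothetical helper that follows the book's convention that a data set has no mode when every value occurs equally often:

from collections import Counter

def modes(data):
    counts = Counter(data)
    high, low = max(counts.values()), min(counts.values())
    if high == low:
        return []   # every value occurs equally often: no mode
    return [v for v, c in counts.items() if c == high]

print(modes([3, 8, 5, 6, 10, 17, 19, 20, 3, 2, 11]))        # [3]
print(modes([1, 7, 19, 23, 11, 12, 1, 12, 19, 7, 11, 23]))  # []
print(modes([5, 7, 12, 13, 14, 21, 7, 21, 23, 26, 5]))      # [5, 7, 21]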
Figure 4.1 Frequency distributions showing the shape and location of measures of centrality: symmetric (Mean = Median = Mode), left-skewed (Mean < Median < Mode), and right-skewed (Mode < Median < Mean).
Note that as such there is no mathematical relationship between the mean, mode, and median. That is, if we are given any one or two of these measures (i.e., mean, median, or mode), it is not possible for us to find the missing value(s) without using the data values. However, the values of the mean, mode, and median do provide us important information about the potential type or shape of the frequency distribution of the data, since their location depends upon the shape of the frequency distribution. Although the shape of the frequency distribution of a data set could be of any type, most frequently we see the following three types of frequency distributions (Figure 4.1).
Definition 4.3 A data set is symmetric when the values in the data set
that lie equidistant from the mean, on either side, occur with equal
frequency.
Definition 4.4 A data set is left-skewed when values in the data set
that are greater than the median occur with relatively higher frequency than those values that are smaller than the median. The values smaller than the median are scattered far from the median.
Definition 4.5 A data set is right-skewed when values in the data set
that are smaller than the median occur with relatively higher frequency than those values that are greater than the median. The values greater than the median are scattered far from the median.
4.3 Measures of Dispersion

In the previous section we discussed measures of central tendency; these measures provide us information about the location of the center of the frequency distributions of the data sets under consideration. For example, consider the frequency distribution curves shown in Figure 4.2.
MEASURES OF DISPERSION

DESCRIPTION: Measures of dispersion include several numerical measures. The more commonly used measures are variance, standard deviation, range, and interquartile range.

USE: Used to assess the spread of the distribution and the variability within a data set or between data sets.

TYPE OF DATA: Numerical (quantitative) data.

DESIGN/APPLICATION CONSIDERATIONS: An important measure that should be monitored closely and controlled. While designing any process or experiment, care should be taken to keep the measures of dispersion as small as possible.

SPECIAL COMMENTS/CONCERNS: Measures of dispersion are scale sensitive. Therefore, when comparing such measures for two or more data sets it is very important to keep in mind that the units used in the different sets are the same. A distinction must be made between the sample and population measures.

RELATED TOOLS: Coefficient of variation.
Measures of central tendency do not portray the whole picture of any data set. For example, as can be seen in Figure 4.2, the two frequency distributions have the same mean, median, and mode. Interestingly, however, the two distributions are significantly different. The major difference is in the variation among the values associated with each distribution. It is important for us to know about the variation among the values of the data set. Information about variation is provided by measures known as measures of dispersion. In this section we will study three measures of dispersion: the range, variance, and standard deviation.
Figure 4.2 Two frequency distribution curves with equal mean, median, and mode values.
4.3.1 Range
The range of a data set is the easiest measure of dispersion to calculate.
Range is defined as follows:
Range = Largest value − Smallest value    (4.4)
Range is not a very efficient measure of dispersion since it takes into consideration only the largest and the smallest values and none of the remaining observations. For example, if a data set has 100 distinct observations, it uses only two observations and ignores the remaining 98. As a rule of thumb, if the data set contains 10 or fewer observations, the range is considered a reasonably good measure of dispersion. For data sets larger than 10 observations, the range is not considered to be a very efficient measure of dispersion.
Example 4.9 The following data give the tensile strength (in psi) of a material sample submitted for inspection:
8538.24, 8450.16, 8494.27, 8317.34, 8443.99,
8368.04, 8368.94, 8424.41, 8427.34, 8517.64
Find the range for this data set.
Solution: The largest and the smallest values in the data set are 8538.24
and 8317.34, respectively. Therefore, the range for this data set is:
Range = 8538.24 − 8317.34 = 220.90
4.3.2 Variance
One of the most interesting pieces of information associated with any data is
how the values in the data set vary from one another. Of course, range can
give us some idea of variability. Unfortunately, range does not help us understand centrality. To better understand variability, we rely on more powerful

54

Chapter Four

indicators such as variance, which is a value that focuses on how much individual observations within the data set deviate from their mean.
For example, if the values in the data set are x1, x2, x3, ..., xn and the mean is x̄, then x1 − x̄, x2 − x̄, x3 − x̄, ..., xn − x̄ are the deviations from the mean.
It is then natural to find the sum of these deviations and argue that if this sum is large, the values differ too much from each other, and if this sum is small, they do not differ from each other too much. Unfortunately, this argument does not hold, since the sum of the deviations is always zero, no matter how much the values in the data set differ. This is true because some of the deviations are positive and some are negative, and when we take their sum they cancel each other.
Since we don't get any useful information from two sets of measures (i.e., one positive and one negative) that cancel each other, we can square these deviations and then take their sum. By taking the square we get rid of the negative deviations in the sense that they become positive. The variance then becomes the average value of the sum of the squared deviations from the mean x̄. If the data set represents a population, the deviations are taken from the population mean μ. Thus, the population variance, denoted by σ² (read as sigma squared), is defined as:
σ² = (1/N) Σ(Xi − μ)²    (4.5)

And the sample variance, denoted by S², is defined as:

S² = (1/n) Σ(Xi − X̄)²    (4.6)
It is important to note, however, that for a reason to be discussed in Chapter 9, the formula used in practice to calculate the sample variance S² is

S² = (1/(n − 1)) Σ(Xi − X̄)²    (4.7)
For computational purposes, we give the simplified forms of the above formulas for the population and sample variances:

σ² = (1/N)[ΣXi² − (ΣXi)²/N]    (4.8)

S² = (1/(n − 1))[ΣXi² − (ΣXi)²/n]    (4.9)
Note that one difficulty in using the variance as the measure of dispersion is that the units for measuring the variance are not the same as those used for the data values. Rather, variance is expressed as the square of the units used for the data values. For example, if the data values are dollar amounts, then the variance will be expressed in squared dollars, which, in this case, becomes meaningless. For application purposes, therefore, we define another measure of dispersion, called the standard deviation, that is directly related to the variance. Standard deviation is measured in the same units as the data values.
4.3.3 Standard Deviation
Standard deviation is obtained by taking the positive square root of the variance. The population standard deviation σ and the sample standard deviation S are defined as follows:

σ = +√{(1/N)[ΣXi² − (ΣXi)²/N]}    (4.10)

S = +√{(1/(n − 1))[ΣXi² − (ΣXi)²/n]}    (4.11)
Note: In general, random variables are denoted by uppercase letters and their
values by the corresponding lowercase letters.
Example 4.10 The following data give the length (in millimeters) of material chips removed during a machining operation:
4, 2, 5, 1, 3, 6, 2, 4, 3, 5
Calculate the variance and the standard deviation for the data.
Solution: There are three simple steps involved in calculating the variance
of any data set.
Step 1. Calculate Σxi, the sum of all the data values. Thus we have
Σxi = 4 + 2 + 5 + 1 + 3 + 6 + 2 + 4 + 3 + 5 = 35
Step 2. Calculate ΣXi², the sum of the squares of all the observations, that is,
ΣXi² = 4² + 2² + 5² + 1² + 3² + 6² + 2² + 4² + 3² + 5² = 145
Step 3. Since the sample size is n = 10, inserting the values Σxi and ΣXi² calculated in Step 1 and Step 2 into Formula 4.9, we have
S² = (1/(10 − 1))[145 − (35)²/10] = (1/9)(145 − 122.5) = 2.5
The standard deviation is obtained by taking the square root of the variance,
that is,
S = √2.5 = 1.58
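These steps translate directly into Python; the following sketch simply re-checks Example 4.10 with the computational formula (4.9):

from math import sqrt

chips = [4, 2, 5, 1, 3, 6, 2, 4, 3, 5]     # chip lengths in millimeters
n = len(chips)
sum_x = sum(chips)                          # 35
sum_x2 = sum(x * x for x in chips)          # 145
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
print(variance, round(sqrt(variance), 2))   # 2.5 1.58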
Notes:
1. It is important to remember that the value of S2, and therefore of
S, is always greater than zero, except where all the data values
are equal, in which case it is zero.
2. Sometimes the data values are so large that the calculations for computing the variance become quite cumbersome. In such cases one can code the data values by adding to (or subtracting from) each data value a constant, say C, and then calculate the variance/standard deviation of the coded data, since the variance/standard deviation does not change. This can easily be seen from the following discussion.
Let X1, X2, ..., Xn be a data set and let C ≠ 0 be any constant. Let Y1 = X1 + C, Y2 = X2 + C, ..., Yn = Xn + C. Then clearly we have Ȳ = X̄ + C. This means that the deviations of the Xi's from X̄ are the same as the deviations of the Yi's from Ȳ. Thus, the variance/standard deviation of the X's is the same as the variance/standard deviation of the Y's (S²y = S²x and Sy = Sx). This result implies that any shift in location of the data set does not affect the variance/standard deviation of the data.
Example 4.11 Find the variance and the standard deviation of the following data:
53, 60, 58, 64, 57, 56, 54, 55, 51, 61, 63
Solution: We now compute the variance and the standard deviation of the
new data set 3, 10, 8, 14, 7, 6, 4, 5, 1, 11, 13, which is obtained by subtracting 50 from each value of the original data. This will also be the variance and
the standard deviation of the original data set. Thus, we have
Σxi = 3 + 10 + 8 + 14 + 7 + 6 + 4 + 5 + 1 + 11 + 13 = 82
Σxi² = 3² + 10² + 8² + 14² + 7² + 6² + 4² + 5² + 1² + 11² + 13² = 786
So that
S² = (1/(11 − 1))[786 − (82)²/11] = (1/10)(786 − 611.27) = 17.473
and
S = √17.473 = 4.18
The variance and the standard deviation of the original data set are 17.473 and 4.18, respectively.
3. Any change in scale of the data does affect the variance/standard deviation. That is, if Yi = CXi (C ≠ 0), then Ȳ = CX̄. Therefore, it can be seen that S²y = C²S²x and Sy = |C|Sx.
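A small numerical check of Notes 2 and 3 is easy in Python; the constant C = 3 below is chosen purely for illustration:

from statistics import variance

data = [53, 60, 58, 64, 57, 56, 54, 55, 51, 61, 63]
shifted = [x - 50 for x in data]    # coded data from Example 4.11
scaled = [3 * x for x in data]      # Y_i = C * X_i with C = 3

print(round(variance(data), 3))     # 17.473
print(round(variance(shifted), 3))  # 17.473 -- a shift changes nothing
print(round(variance(scaled), 3))   # 157.255 -- C**2 times the original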
4.3.4 Coefficient of Variation

The coefficient of variation is usually denoted by cv and is defined as the ratio of the standard deviation to the mean, expressed as a percentage:

cv = (standard deviation / mean) × 100%    (4.12)
The coefficient of variation is a relative comparison of a standard deviation to its mean, and it has no units. The cv is commonly used to compare the variability in two populations having different units. For example, we might want to compare the disparity of earnings for technicians who work for the same employer but in two different countries. In this case, we compare the coefficients of variation of the two populations rather than comparing the variances, which would be an invalid comparison. The population with the greater coefficient of variation has more variability than the other. As another illustration we consider the following example.
Example 4.12 A company uses two measuring instruments, one to measure the diameters of ball bearings and the other to measure the lengths of rods it manufactures. The quality control department wants to find which instrument is more precise. A quality control engineer takes several measurements of a ball bearing using one instrument and finds the mean and standard deviation to be 3.84 mm and 0.02 mm, respectively. Then she takes several measurements of a rod using the other instrument and finds the mean and the standard deviation to be 29.5 cm and 0.035 cm, respectively. Calculate the coefficient of variation for the two sets of measurements.
Solution: By using Formula 4.12, we have
cv1 = (0.02 / 3.84) × 100% ≈ 0.52%
cv2 = (0.035 / 29.5) × 100% ≈ 0.12%
Thus, the measurements of the length of the rod are relatively less variable
than the diameter of the ball bearing. Therefore, we can say that instrument
2 is more precise than instrument 1.
4.4 Measures of Central Tendency and Dispersion for Grouped Data
So far in this chapter we have learned how to compute measures of central
tendency and dispersion for ungrouped data. In this section we will learn
how to compute these measures for grouped data.
In Chapter 3, we saw that by grouping data we always lose some information from the original data. The same is true for measurements that are
obtained using grouped data as they only approximate values of measurements obtained from the original data. The actual approximation will, of
course, depend upon the nature of the data. In certain cases values may be
close to their actual values, but in other cases they may be far apart. A word
of caution: measurements obtained by using grouped data should be used
only when it is not possible to retrieve the original data.
4.4.1 Mean
In order to compute the mean of a grouped data set, the first step is to find the midpoint m, also known as the class mark, for each class, which is defined as:
m = (Lower limit + Upper limit) / 2
Then the population mean μG and the sample mean X̄G are defined as follows:

μG = (Σfi mi) / N    (4.13)

X̄G = (Σfi mi) / n    (4.14)

where mi = midpoint of the ith class, fi = frequency of the ith class, N = population size, and n = sample size.
Example 4.13 Find the mean of the grouped data given in Table 4.1.
Note: From the entries in Table 4.1 one can observe that the difference
between the midpoints of any two consecutive classes is always the same as
the class width.
Solution: Using Formula 4.14, we have
X̄G = (Σfi mi) / n = 1350 / 40 = 33.75
4.4.2 Median
To compute the median MG of grouped data, follow these steps:
Step 1. Determine the rank of the median that is given by
Rank of MG = (n + 1) / 2
Table 4.1 Age distribution of a group of 40 people watching a basketball game.

Class          Frequency (f)   Class midpoint (m)     f × m
10–under 20    8               (10 + 20)/2 = 15       120
20–under 30    10              (20 + 30)/2 = 25       250
30–under 40    6               (30 + 40)/2 = 35       210
40–under 50    11              (40 + 50)/2 = 45       495
50–under 60    5               (50 + 60)/2 = 55       275
               n = Σf = 40                            Σfi mi = 1350
Step 2. Determine the class in which the rank (n + 1) / 2 falls. In order to find such a class, proceed as follows: add the frequencies of the classes, starting from class 1, and stop as soon as the sum becomes greater than or equal to (n + 1) / 2. The class at which you stop is the class that contains the median.
Step 3. Once we know the class where the rank of the median
falls, the median is given by
MG = L + (c / f) × w    (4.15)

where
L = lower limit of the class containing the median
c = (n + 1) / 2 − [sum of the frequencies of all classes preceding the class containing the median]
f = frequency of the class containing the median
w = class width
Example 4.14 Find the median of the grouped data in Table 4.1, Example 4.13.
Solution:
Step 1. Rank of the median = (40 + 1) / 2 = 20.5
Step 2. Add the frequencies until the sum becomes greater than or equal to 20.5, that is,
8 + 10 + 6 = 24 ≥ 20.5
The class containing the median is (30–under 40).
Step 3.
MG = 30 + ((20.5 − (8 + 10)) / 6) × 10 = 30 + (2.5 / 6) × 10 = 34.17
4.4.3 Mode
Finding the mode of grouped data is a simple exercise: just find the class with the highest frequency. The mode of the grouped data is equal to the midpoint of that class. Note that if there is more than one class with the highest, but equal, frequency, there is more than one mode, and those modes are equal to the midpoints of such classes.
In Example 4.13, the mode is equal to the midpoint of the class (40–under 50), since it has the highest frequency, 11. Thus,
Mode = (40 + 50) / 2 = 45
4.4.4 Variance
The population and the sample variance of grouped data are computed by using the following formulas:

Population variance: σ²G = (1/N)[Σfi mi² − (Σfi mi)²/N]    (4.16)

Sample variance: S²G = (1/(n − 1))[Σfi mi² − (Σfi mi)²/n]    (4.17)
where f, m, N, and n are as defined earlier in this section.
Example 4.15 Determine the variance of the grouped data in Table 4.1, Example 4.13.
Solution: From the data in Table 4.1, we have
Σfi mi² = 8(15)² + 10(25)² + 6(35)² + 11(45)² + 5(55)² = 52,800
Σfi mi = 8(15) + 10(25) + 6(35) + 11(45) + 5(55) = 1350
n = 40
Substituting these values into Formula 4.17, we have
S²G = (1/(40 − 1))[52800 − (1350)²/40] = (1/39)(52800 − 45562.5) = (1/39)(7237.5) = 185.577
The population and the sample standard deviations are found by taking the square root of the corresponding variances. For example, the standard deviation for the data in Example 4.13 is
SG = √185.577 = 13.62
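All of the grouped-data results from Examples 4.13 through 4.15 can be reproduced from the midpoints and frequencies in Table 4.1; the Python sketch below is illustrative only:

from math import sqrt

midpoints = [15, 25, 35, 45, 55]
frequencies = [8, 10, 6, 11, 5]
n = sum(frequencies)                                              # 40

sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))       # 1350
sum_fm2 = sum(f * m * m for f, m in zip(frequencies, midpoints))  # 52800

mean_g = sum_fm / n                               # 33.75 (Formula 4.14)
var_g = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)     # 185.577 (Formula 4.17)
print(mean_g, round(var_g, 3), round(sqrt(var_g), 2))  # 33.75 185.577 13.62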
4.5 Empirical Rule (Normal Distribution)

If data have a distribution that is approximately bell-shaped, the following rule, known as the empirical rule, can be used to compute the percentage of data that will fall within k standard deviations of the mean (k = 1, 2, 3):
1. About 68% of the data will fall within one standard deviation of the mean, that is, between μ − 1σ and μ + 1σ.
2. About 95% of the data will fall within two standard deviations of the mean, that is, between μ − 2σ and μ + 2σ.
3. About 99.7% of the data will fall within three standard deviations of the mean, that is, between μ − 3σ and μ + 3σ.
Figure 4.3 Application of the empirical rule.
Figure 4.3 illustrates the empirical rule.
Note: The empirical rule is applicable for population data as well as for sample data. In other words, the above rule is valid if we replace μ with X̄ and σ with S.
Example 4.16 A soft-drink filling machine is used to fill 16-ounce soft-drink bottles. Since the amount of beverage varies slightly from bottle to bottle, it is believed that the actual amounts of beverage in the bottles form a bell-shaped distribution with a mean of 15.8 ounces and a standard deviation of 0.15 ounces. Use the empirical rule to find what percentage of bottles contain between 15.5 ounces and 16.1 ounces of beverage.
Solution: From the information provided to us in this problem, we have μ = 15.8 oz and σ = 0.15 oz.
We are interested in finding the percentage of bottles that contain between 15.5 ounces and 16.1 ounces of beverage. Comparing Figure 4.4 with Figure 4.3, it is obvious that approximately 95% of the bottles contain between 15.5 ounces and 16.1 ounces, since 15.5 and 16.1 are two standard deviations away from the mean.
Example 4.17 At the end of every fiscal year a manufacturer writes off or
adjusts its financial records to reflect the number of units of bad production
distribution with mean X̄ = $35,700 and standard deviation S = $2,500. Find the percentage of units of bad production that have a dollar value between $28,200 and $43,200.
Figure 4.4 Amount of soft drink contained in a bottle (μ = 15.8 oz; 15.5 oz and 16.1 oz lie 2σ = 0.3 oz from the mean).

Figure 4.5 Dollar value of units of bad production (X̄ = $35,700; $28,200 and $43,200 lie 3S = $7,500 from the mean).
Solution: From the information provided to us, we have X̄ = $35,700 and S = $2,500.
Since the limits $28,200 and $43,200 are three standard deviations away from the mean, comparing Figure 4.5 with Figure 4.3, we see that approximately 99.7% of the units of bad production have a dollar value between $28,200 and $43,200.
4.6 Certain Other Measures of Location and Dispersion
In this section we will study certain measures that help us locate the place of
any data value in the whole data set. Then we will study another measure that
gives us the range of the middle 50% of the data values.
4.6.1 Percentiles
Percentiles divide the data into 100 equal parts and they are numbered from
1 to 99. The median of a data set is the 50th percentile, which divides the data
into two equal parts, that is, at most 50% of the data fall below the median
and at most 50% of the data fall above it. The procedure for determining the
other percentiles is similar to the procedure used for determining the median. We compute the percentiles as follows:
Step 1. Write the data values in the ascending order and rank
them from 1 to n.
Step 2. Find the rank of the pth percentile (p = 1, 2, ..., 99), which is given by
Rank of the pth percentile = p[(n + 1) / 100]
Step 3. Find the data value that corresponds to the rank of the
pth percentile. We illustrate this procedure with the following
example.
Example 4.18 The following data give the salaries (in thousands of dollars) of 15 engineers of a corporation:
62, 48, 52, 63, 85, 51, 95, 76, 72, 51, 69, 73, 58, 55, 54
Find the 70th percentile for these data.
Solution: Write the data values in the ascending order and rank them from
1 to 15, since n is equal to 15.
Step 1. Salaries: 48, 51, 51, 52, 54, 55, 58, 62, 63, 69, 72, 73, 76, 85, 95
Rank: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
Step 2. Find the rank of the 70th percentile, which is given by
Rank of the 70th percentile = 70[(15 + 1) / 100] = 11.2
Figure 4.6 Salary data (rank 11.2 lies between rank 11, value 72, and rank 12, value 73; the interpolation weights are .8 and .2).
Step 3. Find the data value that corresponds to the rank 11.2,
which will be the 70th percentile. From Figure 4.6, we can
easily see that the value of the 70th percentile is given by
70th percentile = 72(.8) + 73(.2) = 72.2
Thus, the 70th percentile of the salary data is $72,200. That is, at most 70% of the engineers are making less than $72,200 and at most 30% of the engineers are making more than $72,200.
In our above discussion we determined the value x of a given percentile p. Now we would like to find the percentile p corresponding to a given value x. This can be done by using the following formula:

p = [(number of data values ≤ x) / (n + 1)] × 100    (4.18)
For example, to find the percentile in Example 4.18 corresponding to a salary of $60,000, we have
p = (7 / (15 + 1)) × 100 ≈ 44
In this case, the engineer who is making a salary of $60,000 is at the 44th percentile, which means that at most 44% of the engineers are making less than she is and at most 56% are making more.
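Both directions of the percentile calculation can be sketched in a few lines of Python; percentile_value and percentile_of below are hypothetical helper names implementing the book's rank-interpolation procedure:

salaries = [62, 48, 52, 63, 85, 51, 95, 76, 72, 51, 69, 73, 58, 55, 54]

def percentile_value(data, p):
    ordered = sorted(data)
    rank = p * (len(ordered) + 1) / 100   # rank of the pth percentile
    low = int(rank)                       # whole part (ranks are 1-based)
    frac = rank - low                     # fractional part
    if frac == 0:
        return ordered[low - 1]
    return ordered[low - 1] * (1 - frac) + ordered[low] * frac

def percentile_of(data, x):               # Formula 4.18
    return 100 * sum(1 for v in data if v <= x) / (len(data) + 1)

print(round(percentile_value(salaries, 70), 2))  # 72.2, i.e., $72,200
print(round(percentile_of(salaries, 60)))        # 44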
4.6.2 Quartiles
In the previous discussion we studied the percentiles that divide the data into 100 equal parts. Some of the percentiles have special importance: the 25th, 50th, and 75th percentiles, known as the first, second, and third quartiles (denoted by Q1, Q2, and Q3). These quartiles are sometimes also known as the lower, middle, and upper quartiles. Also note that the second quartile is the same as the median. To determine the values of the different quartiles, one just finds the 25th, 50th, and 75th percentiles (see Figure 4.7).
4.6.3 Interquartile Range
Often we are more interested in finding information about the middle 50% of a population. A measure of dispersion relative to the middle 50% of the population or sample data is known as the interquartile range. This range is obtained by trimming 25% of the values from the bottom and 25% from the top.
Figure 4.7 Quartiles and percentiles (Q1, Q2, and Q3 correspond to the 25th, 50th, and 75th percentiles, dividing the data into four parts of 25% each).
The interquartile range (IQR) is defined as

IQR = Q3 − Q1    (4.19)

Example 4.19 Find the interquartile range for the salary data in Example 4.18:
Salaries: 48, 51, 51, 52, 54, 55, 58, 62, 63, 69, 72, 73, 76, 85, 95
Solution: In order to find the interquartile range, we need to find the quartiles Q1 and Q3 or, equivalently, the 25th percentile and the 75th percentile. We can easily see that the ranks of the 25th and 75th percentiles are:
Rank of 25th percentile = (25 / 100)(15 + 1) = 4
Rank of 75th percentile = (75 / 100)(15 + 1) = 12
Thus, in this case, Q1 = 52 and Q3 = 73.
This means that the middle 50% of the engineers earn a salary between $52,000 and $73,000. The interquartile range in this example is
IQR = $73,000 − $52,000 = $21,000
Notes:
1. The interquartile range gives the range of variation among the
middle 50% of the population.
2. The interquartile range is potentially a more meaningful measure of dispersion than the range, as it is not affected by any extreme values that may be present in the data. By trimming 25% of the data from the bottom and 25% from the top, we eliminate any extreme values that may be present in the data set. The interquartile range is used quite often as a measure for comparing two or more data sets from similar studies.
4.7 Box-Whisker Plot

In this and the previous chapter we have mentioned extreme values. So at some point we must know which values in a data set are extreme values, also known as outliers. A box-whisker plot (or, simply, a box plot) helps us answer this question. Figure 4.8 illustrates a box plot for any data set.
BOX-WHISKER PLOT

DESCRIPTION: A graphical tool that uses summary statistics: the first quartile, second quartile, third quartile, and extreme data values that are located within a certain range.

USE: Used to assess the skewness of the distribution. The most important role of the box-whisker plot is to detect any outliers that may be present in the data. Also used for visual comparison of two or more data sets.

TYPE OF DATA: Numerical (quantitative) data.

DESIGN/APPLICATION CONSIDERATIONS: Any data set may contain outliers, which if not corrected could invalidate the final results. A distribution's departure from symmetry may violate the assumptions of the underlying model.

SPECIAL COMMENTS/CONCERNS: The box-whisker plot may be presented in horizontal or vertical form. The box in the box plot contains the middle 50% of the data values.

RELATED TOOLS: Conceptually related to the empirical rule.
4.7.1 Construction of a Box Plot

Step 1. For a given data set, first find the quartiles Q1, Q2, and Q3.
Step 2. Draw a box with its outer lines standing at the first quartile (Q1) and the third quartile (Q3), and then draw a line at the second quartile (Q2). The line at Q2 divides the box into two boxes, which may or may not be of equal size.
Step 3. From the center of the outer lines, draw straight lines extending outward up to three times the interquartile range (IQR) and mark them as shown in Figure 4.8. Note that each distance between points A and B, B and C, D and E, and E and F is equal to one and one-half times the distance between points A and D, which is equal to the interquartile range (IQR). The points S and L are respectively the smallest and largest data points that fall within the inner fences. The lines from A to S and D to L are called the whiskers.
Figure 4.8 Box-whisker plot (the inner fences lie 1.5 × IQR and the outer fences 3 × IQR beyond Q1 and Q3; the whiskers run to the smallest and largest values within the inner fences).
4.7.2 How to Use the Box Plot

About the outliers:
1. Any data points that fall beyond the lower and upper outer fences are extreme outliers. These points are usually excluded from the analysis.
2. Any data points between the inner and outer fences are mild outliers. These points are excluded from the analysis only if we are convinced that they are in error.
About the shape of the distribution:
1. If the second quartile (median) is close to the center of the box and each of the whiskers is of approximately equal length, the distribution is symmetric.
2. If the right box is substantially larger than the left box and/or the right whisker is much longer than the left whisker, the distribution is right-skewed.
3. If the left box is substantially larger than the right box and/or the left whisker is much longer than the right whisker, the distribution is left-skewed.
Example 4.20 The following data give the noise levels, measured in decibels (a normal conversation by humans produces a noise level of about 75 decibels), produced by different machines in a large manufacturing plant:
85, 80, 88, 95, 115, 110, 105, 104, 89, 87, 96, 140, 75, 79, 99
Construct a box plot and see whether the data set contains any outliers.
Solution:
First we arrange the data in the ascending order and rank them from 1 to 15
(n  15)
Data values: 75, 79, 80, 85, 88, 89, 95, 96, 97, 99, 104, 105, 110, 115, 140
Ranks: 11, 12, 13, 14, 15, 16, 17, 18, 19, 10, 111, 112, 113, 114, 115
We now nd the ranks of the quartiles Q1, Q2, and Q3. Thus, we have
Rank of Q1  (25 / 100)(15  1)  4
Rank of Q2  (50 / 100)(15  1)  8
Rank of Q3  (75 / 100)(15  1)  12
Therefore the values of Q1, Q2, and Q3 are
Q1 = 85, Q2 = 95, Q3 = 105
The interquartile range is
IQR = Q3 − Q1 = 105 − 85 = 20
and
(1.5)IQR = (1.5)(20) = 30
Figure 4.9 shows the box plot for the above data. The figure shows that the data include one outlier. In this case, action should be taken to reduce the noise of the machine that produces a level of 140 decibels.
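The fence arithmetic behind Figure 4.9 is easy to script; the following Python sketch flags mild and extreme outliers for the noise data:

noise = [85, 80, 88, 95, 115, 110, 105, 104, 89, 87, 96, 140, 75, 79, 99]
ordered = sorted(noise)

# Quartiles by the book's rank method: ranks 4, 8, and 12 when n = 15.
q1, q3 = ordered[3], ordered[11]                 # 85 and 105
iqr = q3 - q1                                    # 20
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)         # (55, 135)
outer = (q1 - 3.0 * iqr, q3 + 3.0 * iqr)         # (25, 165)

mild = [x for x in ordered
        if (outer[0] <= x < inner[0]) or (inner[1] < x <= outer[1])]
extreme = [x for x in ordered if x < outer[0] or x > outer[1]]
print(mild, extreme)   # [140] [] -- one mild outlier, no extreme outliers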
Figure 4.9 Example box plot for the noise data (inner fences at 55 and 135, outer fences at 25 and 165; the whiskers end at 75 and 115, and 140 is a mild outlier).
Example 4.21 The following data give the number of persons who take the
bus during the off-peak time schedule from Grand Central Station to Lower
Manhattan in New York:
12 12 12 14 15 16 16 16 16 17 17 18 18 18 19 19 20 20 20 20
20 20 20 20 21 21 21 22 22 23 23 23 24 24 25 26 26 28 28 28
1. Find the mean, mode, and median for these data.
2. Prepare the box plot for the data.
3. Using the results of parts (1) and (2), verify whether the data are symmetric or skewed. Examine whether the conclusions made using the two methods about the shape of the distribution are the same or not.
4. Using the box plot, determine whether the data contains any
outliers.
5. If in part (3) the conclusion is that the data are at least
approximately symmetric, find the standard deviation and
verify whether the empirical rule holds.
Solution:
1. The sample size in this problem is n = 40. Thus, we have
Mean X̄ = Σxi / n = 800 / 40 = 20
Mode = 20
Median = 20
2. To prepare the box plot, we first find the quartiles Q1, Q2, and Q3.
Rank of Q1 = (25 / 100)(40 + 1) = 10.25
Rank of Q2 = (50 / 100)(40 + 1) = 20.5
Rank of Q3 = (75 / 100)(40 + 1) = 30.75
Since the data presented in this problem are already in ascending order, we
can easily see that the quartiles Q1, Q2, and Q3 are
Q1 = 17, Q2 = 20, Q3 = 23
The interquartile range is
IQR = Q3 − Q1 = 23 − 17 = 6
1.5(IQR) = 1.5(6) = 9
The box plot for the data is shown in Figure 4.10.
3. Both parts (1) and (2) lead us to the same conclusion: the data are symmetric.
Figure 4.10 Box plot for the bus-ridership data (the whiskers end at the smallest and largest values within the inner fences, 12 and 28).
4. From the box plot in Figure 4.10, we see that the data do not
contain any outliers.
5. In part (3) we concluded that the data are symmetric. We proceed to calculate the standard deviation and then verify whether the empirical rule holds.
S² = (1/(40 − 1))[(12² + ... + 28²) − (12 + ... + 28)²/40] = 18.1538
Thus, the standard deviation is S = 4.26.
It can be seen that the intervals:
(X̄ − S, X̄ + S) = (15.74, 24.26) contains 72.5% of the data
(X̄ − 2S, X̄ + 2S) = (11.48, 28.52) contains 100% of the data
The data are slightly more clustered around the mean. But for all practical purposes we can say the empirical rule does hold.
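Part 5 of this example can be verified with a short Python sketch that counts how many observations fall within k standard deviations of the mean:

from statistics import mean, stdev

riders = [12, 12, 12, 14, 15, 16, 16, 16, 16, 17, 17, 18, 18, 18, 19, 19,
          20, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 22, 22, 23, 23, 23,
          24, 24, 25, 26, 26, 28, 28, 28]

xbar, s = mean(riders), stdev(riders)      # 20 and about 4.26
for k in (1, 2):
    within = sum(1 for x in riders if xbar - k * s < x < xbar + k * s)
    print(k, 100 * within / len(riders))   # k=1: 72.5, k=2: 100.0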
5
Probability
In Chapters 1 through 4 we started our discussion of applied statistics by setting a context for Six Sigma. We identified some foundational concepts to get us started, and we explored how to describe data graphically and numerically. Now it is time to extend our knowledge by looking at the concept of probability and studying in section 5.1 how probability theory relates to applied statistics. We will continue our discussion of probability by introducing the random experiment in section 5.2, defining sample spaces and simple events in section 5.3, and using Venn diagrams in section 5.4. We will complete our discussion by reviewing probability rules, and conditional probability in particular, in sections 5.5, 5.6, and 5.7.
5.1 Probability and Applied Statistics

While many people who first study probability struggle to understand the basic concepts, it need not be overly difficult. As with any subject or topic, understandability, or lack thereof, rests in the approach and conduct of the discussion. As such discussion relates to probability, the most common approach is to delve rather deeply into the mechanics of formula derivation, which, in many cases, leads people to begin struggling without the benefit of understanding the basic concepts! We will use a different approach in this chapter: we will begin with an explanation of the basic concepts first, and then we will follow with several applications of the topic in examples.
To begin, probability is nothing more than a way to describe chance. Chance in this context means there is a possibility, or probability, that some sort of event will or will not occur. More specifically, for any given event someone may be interested in, there is some level of probability the event will occur as well as some level of probability the event will not occur.
In quantitative terms, we describe probability as a decimal value between 0 and 1. The sum of the probability that an event occurs plus the probability that the event does not occur equals 1. In common use, we convert decimal values to percent values by multiplying the decimal value by 100.
Now, how do we relate probability to applied statistics? Statistics includes the study of probability associated with events of interest. In fact, we spend much of our effort in the application of statistics attempting to understand when, where, or how certain events will occur or not occur. And the mechanism we frequently use to study probability is the random experiment.
5.2 The Random Experiment

An experiment is a planned investigation of one or more phenomena wherein the person conducting the experiment purposefully manipulates the conditions under which the experiment is conducted in order to observe the results. In practice, we conduct experiments in accordance with the scientific method in order to ensure consistent and reproducible results. Accordingly, experiments produce one of two types of results or outcomes:
1. The outcome is unique.
2. The outcome is not unique; it is one of several possible outcomes.
Experiments that result in unique outcomes are known as deterministic experiments, while experiments that do not result in unique outcomes are known as random experiments.
Anytime a random experiment is used, we are typically interested in knowing which possible outcome will occur. Normally the experimenter has a certain belief that a particular outcome will occur. For example, a design engineer may have a certain belief that a product will perform within certain parameters when used under certain conditions for a certain length of time. The term probability indicates the measure of one's belief in the occurrence of the desired outcome in a random experiment.
We now consider some examples of random experiments and list all possible outcomes in each case.
Example 5.1 As an experiment a Six Sigma Green Belt engineer wants to
test a computer chip that has come off the production line. In the past, both
defective and nondefective chips have been produced. Determine the sample
space associated with this experiment.
Solution: Since the possible outcomes in this example are either a defective (D) or nondefective (N) chip, the sample space associated with the experiment is S = {D, N}.
Example 5.2 Now consider a random experiment in which the engineer
wants to test two chips off the production line and list all the possible outcomes.
Solution: In this case the possible outcomes of the random experiment are {DD, DN, ND, NN}, where DD means that both chips are defective, DN means that the first chip is defective and the second is nondefective, and so on.
Example 5.3 A box contains identical parts produced by six manufacturers 1, 2, 3, 4, 5, and 6. A part is selected randomly and examined to find its
manufacturer. List all the possible outcomes in this experiment.
Solution: In this case there are six possible outcomes, which we list as {1,
2, 3, 4, 5, 6}.
Example 5.4 If in Example 5.3 two parts are selected, list all the possible outcomes.
Solution: In this case there are 6 × 6 = 36 possible outcomes, that is, {(1,1), (1,2), ..., (1,6), ..., (6,6)}. For example, the outcome (1,2) represents that the first part is produced by manufacturer 1 and the second part by manufacturer 2. Note that in this example ordered pairs represent the possible outcomes.
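Small sample spaces like this one can be enumerated directly; the sketch below uses itertools.product to list the 36 ordered pairs:

from itertools import product

manufacturers = [1, 2, 3, 4, 5, 6]
sample_space = list(product(manufacturers, repeat=2))  # ordered pairs
print(len(sample_space))   # 36
print(sample_space[:3])    # [(1, 1), (1, 2), (1, 3)]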
5.3 Sample Space, Simple Events, and Events of Random Experiments
In any discussion of probability there must be a careful description of all possible outcomes of the random experiment we want to study. This description
must be complete and unambiguous, that is, every possible outcome of the
experiment must be listed, and careful attention must be devoted to ensuring
the list includes only one entry for each possible outcome.
Definition 5.1 The set of all possible outcomes of a random experiment is called the sample space of the experiment, and is usually
denoted by the letter S.
Conceptually, the sample space S may be regarded as a set whose elements are all the possible outcomes of an experiment. An element of S is usually called a sample point or a simple event, and it is denoted by the letter e.
Let us consider once again the experiments of Examples 5.1 through 5.4
and describe a sample space for each of them. The sample spaces for each
example are as follows:
Example 5.1 S = {e1, e2}, where e1 = D, e2 = N.
Example 5.2 S = {e1, e2, e3, e4}, where e1 = DD, e2 = DN, e3 = ND, e4 = NN; one may simply write the sample space as S = {DD, DN, ND, NN}.
Example 5.3 S = {1, 2, 3, 4, 5, 6}.
Example 5.4 S = {(i, j); i = 1, 2, 3, 4, 5, 6; j = 1, 2, 3, 4, 5, 6}.
Example 5.5 Suppose an experiment involves the nomination of three workers to be appointed to the negotiation team. We have determined that one nominee is male (M) and two nominees are female (F). We are interested in observing the order in which nominees are selected.
Solution: In this example there are three possible outcomes, so the sample space may be written as follows:
S = {(MFF), (FMF), (FFM)}
Note that any change in the experiment results in another sample space, since
the sample space must include all the possible outcomes of the new experiment. For instance, if in the above example we do not specify the gender of
the workers nominated, then the sample space will consist of eight possible outcomes:
S = {(MMM), (MMF), (MFM), (FMM), (MFF), (FMF), (FFM), (FFF)}
The sample spaces in all the examples considered so far are finite, which means each sample space contains a finite number of elements or sample points.
Definition 5.2 A set C is said to be countably infinite when there is a one-to-one correspondence between the set C and the set of all non-negative integers.
Many examples in our day-to-day life involve sample spaces that consist of a countably infinite number of elements.
Example 5.6 A lightbulb manufacturing company agrees to destroy all
bulbs produced until it produces a set quantity, say n, of bulbs at a defined
level of quality. Such an agreement is made in the context of negotiating a
sales contract wherein the purchaser wishes to ensure product of insufficient
quality does not enter the supply stream.
Solution: In this case, we observe the number of bulbs the company has to destroy. The sample space for this situation consists of a countably infinite number of sample points, S = {0, 1, 2, ...}, until a set number of bulbs produced meets or exceeds the desired level of quality.
The sample space S in this example contains the element or sample point zero, since it is possible that the first n bulbs produced by the company meet the desired quality level or standard, in which case the company does not need to destroy any bulbs.
Example 5.7 In Example 5.1 suppose that the Six Sigma Green Belt engineer decides to test the chips until she finds a defective chip. Determine the
sample space for this experiment.
Solution: The sample space S in this example will be S = {1, 2, 3, ...}. The defective chip may be found in the first trial or in the second trial or in the third trial, and so on.
In Examples 5.6 and 5.7, the sample spaces consist of a countably infinite number of elements.
Definition 5.3 A sample space S is considered discrete if it consists of either a finite or a countably infinite number of elements.
Definition 5.4 Any collection of sample points of a sample space S, i.e., any subset of S, is called an event.
Example 5.8 Consider the sample space S in Example 5.2 and list all possible events in S.
Solution: The sample space S in Example 5.2 is
S = {DD, DN, ND, NN}
The possible subsets of S, and consequently the possible events in this sample space S, are
[{}, {DD}, {DN}, {ND}, {NN}, {DD, DN}, {DD, ND}, {DD, NN}, {DN, ND}, {DN, NN}, {ND, NN}, {DD, DN, ND}, {DD, DN, NN}, {DD, ND, NN}, {DN, ND, NN}, {DD, DN, ND, NN}]
There are 16 total possible events in this sample space. In general, if a sample space consists of n sample points, then there are 2ⁿ possible events in the sample space. Each sample space contains two special events: φ = {}, which does not contain any elements of S, an empty set known as the null event, and the event represented by the whole sample space S itself, which is known as the sure event.
In Example 5.8 we encourage you to note that the simple events {DD}, {DN}, {ND}, and {NN} are also listed as events. By definition, all simple events are also events; however, not all events are simple events. Events in a sample space S are usually denoted by the capital letters A, B, C, D, and so on. Event A is said to have occurred if the outcome of a random experiment is an element of A.
Example 5.9 Suppose in Example 5.3, a part is randomly selected and the manufacturer of the part is found. Determine whether a given event has occurred.
Solution: In this case, the sample space S is {1, 2, 3, 4, 5, 6}. Let the given event in S be A = {1, 4, 5, 6}. Now we can say event A has occurred if the manufacturer of the part is 1, 4, 5, or 6. Otherwise we say event A has not occurred.
5.4 Representation of Sample Space and Events Using Diagrams
Sometimes the number of sample points in a sample space becomes so large that the manual description of every sample point is difficult if not impossible. Since it is common for sample spaces to become very large, we must introduce some techniques that help us systematically describe all the sample points in a sample space S.
5.4.1 Tree Diagram
A tree diagram is a tool that is useful not only in describing the sample
points, but also in listing them in a systematic way. We illustrate this technique with the help of an example.
Example 5.10 Continuing the random experiments in Examples 5.2 and
5.3, suppose that an experiment consists of three trials. The first trial is
testing a chip off the production line, the second is randomly selecting a
part from a box containing parts produced by six manufacturers, and the
third trial is again testing a chip off the production line. We are interested
in describing and listing the sample points in the sample space of the
experiment.
Solution: We use the tree diagram technique to describe and list the sample points in the sample space of the experiment in this example. The first trial in this experiment can result in only two outcomes (D, N); the second in six outcomes (1, 2, 3, 4, 5, or 6); and the third, again, can result in two possible outcomes (D, N). The tree diagram associated with the experiment is shown in Figure 5.1.
Figure 5.1 Tree diagram for an experiment of testing a chip, randomly selecting a part, and testing another chip.
The tree diagram in Figure 5.1 is constructed as follows: Starting from point 0, draw two straight-line segments indicating the number of possible outcomes in the first trial. We call these line segments first-order branches. From the endpoint of each first-order branch, draw six straight-line segments, called second-order branches, indicating the number of possible outcomes in the second trial. Then from the end of each second-order branch, draw two straight-line segments, called third-order branches, indicating the number of possible outcomes in the third trial. Note that each branch is marked by the corresponding possible outcome; for example, the first-order branches are marked with D and N.
The number of sample points in a sample space is equal to the number of final-order branches. For instance, in the present example, the number of sample points in the sample space is equal to the number of third-order branches, which, in this case, is 24. To list all the sample points, just start counting from 0 along the paths of all possible connecting branches until you reach the end of the final-order branches, listing the sample points in the same order as the various branches are covered. In Example 5.10, the sample space S is given as follows:
S = {D1D, D1N, D2D, D2N, D3D, D3N, D4D, D4N, D5D, D5N, D6D, D6N, N1D, N1N, N2D, N2N, N3D, N3N, N4D, N4N, N5D, N5N, N6D, N6N}
The tree diagram technique for describing the number of sample points is extendable to an experiment with a large number of trials where each trial has several possible outcomes. For example, if an experiment has n trials and the ith trial has mi possible outcomes (i = 1, 2, 3, ..., n), then the technique of the tree diagram may be extended as described next.
There will be m1 branches at the starting point 0, m2 branches at the end of each of the m1 branches, m3 branches at the end of each of the m1 × m2 branches, and so on. The total number of branches at the end would be m1 × m2 × m3 × ... × mn, which represents all the sample points in the sample space S of the experiment. This rule of describing the total number of sample points is known as the multiplication rule.
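The multiplication rule is exactly what itertools.product computes; this sketch reproduces the 2 × 6 × 2 = 24 sample points of Example 5.10:

from itertools import product

trial1 = ["D", "N"]                       # first chip tested
trial2 = ["1", "2", "3", "4", "5", "6"]   # manufacturer of the selected part
trial3 = ["D", "N"]                       # second chip tested

sample_space = ["".join(e) for e in product(trial1, trial2, trial3)]
print(len(sample_space))   # 24
print(sample_space[:4])    # ['D1D', 'D1N', 'D2D', 'D2N']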
5.4.2 Permutation and Combination
In this section we introduce the concepts of permutation and combination. These concepts are also helpful in describing the sample points in a sample space.
Permutation
Definition 5.5 A permutation of a set of objects is an arrangement of that set in some specific order. For example, consider the set of three objects A, B, and C. The possible arrangements for these distinct objects are
ABC, ACB, BAC, BCA, CAB, and CBA.
In this case, there are six arrangements. In other words, the number of permutations of three distinct objects is six. Note that we can achieve this result without writing out these arrangements, as described next.
Suppose that to arrange these objects we allocate three slots so that each slot can accommodate only one object. The first slot can be filled with any of the three objects, or we can say that there are three ways to fill the first slot. After filling the first slot we are left with only two objects, which can be either A, B; A, C; or B, C. Now the second slot can be filled with either of the two remaining objects, meaning that there are two ways to fill the second slot. Once the first two slots are filled, we are left with only one object, so the third slot can be filled in only one way. Then, all three slots can be filled in 3 × 2 × 1 = 6 ways. This concept can be extended to any number of objects; that is, n distinct objects can be arranged in n × (n − 1) × (n − 2) × ... × 3 × 2 × 1 ways. The number n × (n − 1) × (n − 2) × ... × 3 × 2 × 1 is particularly important in mathematics and applied statistics and is denoted by the special symbol n! (read as n-factorial).
Definition 5.6 The n-factorial is the product of all the integers from n down to 1. That is,
n! = n × (n − 1) × (n − 2) × ... × 3 × 2 × 1    (5.1)
Note also that 0! = 1.
Example 5.11 Compute the value of:
1. 8!
2. 10!
3. (13 − 5)!
Solution:
1. 8! = 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 40,320
2. 10! = 10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1 = 3,628,800
3. (13 − 5)! = 8! = 40,320
The number of permutations of n distinct objects is denoted by Pₙⁿ. If we are interested in arranging only r of the n objects (r ≤ n), the number of permutations is denoted by Pᵣⁿ. From our discussion above, we can see that

Pₙⁿ = n × (n − 1) × (n − 2) × ... × 3 × 2 × 1 = n!    (5.2)

Pᵣⁿ = n × (n − 1) × (n − 2) × ... × (n − r + 1) = n! / (n − r)!    (5.3)
Example 5.12 An access code for a security system consists of four positive digits (1 through 9). How many access codes are possible if each digit
can be used only once?
Solution: Since each digit can be used only once and different orders of the four digits give different access codes, the total number of access codes is equal to the number of arrangements of four digits chosen from nine digits. That is,
Number of access codes = P₄⁹ = 9! / (9 − 4)! = 9! / 5! = (9 × 8 × 7 × 6 × 5!) / 5! = 3024
Combinations
If we want to select r objects from a set of n objects, without giving any importance to the order in which these objects are selected, the total number of possible ways that r objects can be selected is called the number of combinations. It is usually denoted by Cᵣⁿ and sometimes by (n choose r). Clearly each of the Cᵣⁿ combinations of r objects can be arranged in r! ways. Thus, the total number of permutations of n objects taken r at a time is r! × Cᵣⁿ. Thus, from Equation (5.3), we have

r! × Cᵣⁿ = n! / (n − r)!

that is,

Cᵣⁿ = n! / [r!(n − r)!]    (5.4)
Example 5.13 The management team of a particular company is interested in selecting three members from their team of 10 managers for a special project. Find the number of possible groups.
Solution: The order in which the managers are selected is not important. In this case, the total number of combinations from the 10-person team is given by:
C₃¹⁰ = 10! / (3! × 7!) = (10 × 9 × 8 × 7!) / (3! × 7!) = (10 × 9 × 8) / (3 × 2 × 1) = 120
Said another way, the management team can select three managers out of the
10 possible managers in 120 possible ways.
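Python's math module provides these counts directly, so Examples 5.12 and 5.13 can be checked in three lines (math.perm and math.comb require Python 3.8 or later):

import math

print(math.factorial(8))  # 40320, from Example 5.11
print(math.perm(9, 4))    # 3024 access codes (order matters)
print(math.comb(10, 3))   # 120 project groups (order does not matter)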
So far we have covered the problem of describing sample points in a sample space and in events belonging to that sample space. Now we need to study how to combine two or more events and describe the associated sample points. When we look more closely at probability theory in the next section, we will see that quite often we are interested in calculating the probability of events that are, in fact, combinations of two or more events. The combinations of events are completed by special operations known as unions, intersections, and complements. To define these operations we rely on Venn diagrams.
We will now describe how to represent a sample space and events in that sample space with the help of a Venn diagram. In Venn diagrams, the sample space is represented by a rectangle, whereas events are represented by regions or parts of regions within the rectangle. Note that a region representing an event encloses all the sample points in that event. For example, suppose S = {1, 2, 3, 4, 5, 6} and A = {2, 4, 5}. Then a Venn diagram, as shown in Figure 5.2, is drawn to represent the sample space S and the event A. Note that the region representing the event A encloses the sample points 2, 4, and 5.
Definition 5.7 An event in a sample space S is called a null event if it does not contain any sample points, in which case it is usually denoted by the Greek letter φ (read as phi).
Figure 5.2 Venn diagram representing the sample space S and the event A in S (the region for A encloses the sample points 2, 4, and 5).
Example 5.14 Consider a group of 10 shop floor workers in a production/manufacturing operation. Suppose the group consists of nine men and one woman. Let S be a sample space that consists of the set of all possible groups of three workers. Determine an event A in S containing all groups of three workers that have two women and one man.
Solution: Clearly no group of three workers can have two women, since there is only one woman in the bigger group. Thus, the event A is the null event, that is, A = φ.
Definition 5.8 Let S be a sample space and let A be an event in S.
Then the event A is called a sure event if it consists of all the sample
points in the sample space S.
Example 5.15 Let S be a sample space associated with a random experiment E. Then determine a sure event in the sample space S.
Solution: By definition a sure event must contain all the sample points that are in S. Thus, the only sure event is the sample space S itself.
As we saw above, a sample space and events can be represented by using set notation. To more fully develop the basic concepts of probability theory in an orderly fashion, it is important to first study some basic operations of set theory.
Basic Operations of Set Theory
Let S be a sample space and let A and B be any two events in S. Then the basic operations of set theory are union, intersection, and complement:
Definition 5.9 The union of events A and B, denoted by A ∪ B (read as A union B), is defined as an event containing all the sample points that are in either A or B or both A and B, as illustrated in Figure 5.3.
Definition 5.10 The intersection of events A and B, denoted by A ∩ B (read as A intersection B), is defined as an event containing all the sample points that are in both A and B, as illustrated in Figure 5.4.
Definition 5.11 The complement of an event A in the sample space S, denoted by Ā (read as complement of A), is defined as an event containing all the sample points that are in S but not in A, as illustrated in Figure 5.5.
Figure 5.3 Venn diagram representing the union of events A and B (shaded area).

Figure 5.4 Venn diagram representing the intersection of events A and B (shaded area).

Figure 5.5 Venn diagram representing the complement of an event A (shaded area).
Example 5.16 Let a sample space S and two events A and B in S be defined as follows: S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, A = {1, 4, 6, 7, 8}, B = {5, 7, 9, 10}. Then determine A ∪ B, A ∩ B, Ā, and B̄.
Solution: Clearly from Figure 5.6, we have
A ∪ B = {1, 4, 5, 6, 7, 8, 9, 10}, A ∩ B = {7}, Ā = {2, 3, 5, 9, 10}, B̄ = {1, 2, 3, 4, 6, 8}
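Python's built-in set type implements these operations directly, so Example 5.16 can be verified as follows:

S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
A = {1, 4, 6, 7, 8}
B = {5, 7, 9, 10}

print(sorted(A | B))   # union: [1, 4, 5, 6, 7, 8, 9, 10]
print(A & B)           # intersection: {7}
print(sorted(S - A))   # complement of A: [2, 3, 5, 9, 10]
print(sorted(S - B))   # complement of B: [1, 2, 3, 4, 6, 8]

# Mutually exclusive events (Definition 5.12) have an empty intersection.
print({1, 3, 5, 7, 9}.isdisjoint({2, 4, 6, 8, 10}))   # True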
Figure 5.6 Venn diagrams representing A ∪ B = {1, 4, 5, 6, 7, 8, 9, 10}, A ∩ B = {7}, Ā = {2, 3, 5, 9, 10}, and B̄ = {1, 2, 3, 4, 6, 8}.
Figure 5.7 Two mutually exclusive events, A and B.
Definition 5.12 Let S be a sample space and let A and B be any two events in S. Then the events A and B are called mutually exclusive if the event A ∩ B is a null event, that is, A ∩ B = φ (see Figure 5.7).
Example 5.17 Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, A = {1, 3, 5, 7, 9}, B = {2, 4, 6, 8, 10}. Determine whether events A and B are mutually exclusive.
Solution: Clearly events A and B do not have any sample points in common, that is, A ∩ B = φ. Therefore, events A and B are mutually exclusive. This means that events A and B cannot occur together.
5.5 Defining Probability Using Relative Frequency

In the preceding section we defined sample spaces and events with various methods of describing the sample points. In this section we will define probability using relative frequency.
Consider a random experiment E and let S be the sample space associated with this experiment. Let e1, e2, ..., en be the sample points in the sample space S. Then we have the following definition.
Definition 5.13 The sample points e1, e2, ..., en in a sample space S are called equally likely whenever no particular sample point can occur in preference to any other sample point.
Example 5.18 The CEO of a large corporation randomly selects an engineer from a total pool of 150 Six Sigma Green Belt engineers for the position
of first line manager. Determine the sample space for this problem.
Solution: Since the engineer was randomly selected, it means any of the
150 engineers had the same chance of being selected. Thus, in this example
the sample space S consists of 150 sample points that are equally likely.
Example 5.19 Find the sample space for an experiment that consists of
tossing three balanced coins and observing whether a head or a tail appears.
Solution: This experiment consists of three steps, and in each step there are two possible outcomes, each of which has the same chance of occurring. Thus, there are eight (2 × 2 × 2) equally likely sample points in the sample space S associated with this experiment, that is,

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
Example 5.20 Consider an experiment E of rolling two balanced dice and observing the numbers that appear on the uppermost faces. Determine the sample space associated with the experiment E.
Solution: The sample space S in this experiment consists of 36 (6 × 6) equally likely sample points, that is,

S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
     (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
     (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
     (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
     (5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
     (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
Example 5.21 Repeat the experiment in Example 5.20, but instead of
observing only the numbers that appear on the uppermost faces, observe the
sum of the numbers that appear on both dice. Then determine the sample
space S associated with the new experiment.
Solution: The sample space S associated with the new experiment is

S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}

The sample points in this sample space correspond to the sample points in the sample space of Example 5.20:

2 = {(1, 1)}
3 = {(1, 2), (2, 1)}
4 = {(1, 3), (2, 2), (3, 1)}
5 = {(1, 4), (2, 3), (3, 2), (4, 1)}
6 = {(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)}
7 = {(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)}
8 = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}
9 = {(3, 6), (4, 5), (5, 4), (6, 3)}
10 = {(4, 6), (5, 5), (6, 4)}
11 = {(5, 6), (6, 5)}
12 = {(6, 6)}
Note that the equations here express equality of events.
From the description of the sample points, we can see that the sample
points in the sample space are not equally likely. In fact, 3 is twice as likely to
appear as 2, 4 is three times as likely as 2 or 1.5 times as likely as 3, and so on.
Definition 5.14 Consider a random experiment E with sample
space S. The sample points that describe event A are said to be
favorable to the event A.


Definition 5.15 Let S be a sample space associated with a random experiment consisting of n equally likely sample points. Of the n sample points, suppose na (na ≤ n) are favorable to the happening of an event A. In this case, the probability of the event A, denoted by P(A), is defined as

P(A) = na / n    (5.5)

Defining probability in this way is known as the relative frequency approach. Now, if we let n become as large as possible, so that n approaches infinity, then P(A) = na/n approaches a constant p, which is called the theoretical probability. For example, if we toss a coin n times and let na be the number of times that a head appears, then as n becomes infinitely large, na/n converges to a constant p, where p = 1/2, or 50%. We do provide a cautionary note, however: we should not expect that whenever we toss a coin, 50% of the time a head will appear and the other 50% of the time a tail will appear, since the limiting value of na/n is not an actual probability but merely a theoretical probability that may or may not hold in any particular case. This definition, however, leads us to a more modern approach to defining probability, known as the axiomatic approach, which we will discuss in the following section. In the remainder of this section, we discuss a few more examples using the relative frequency approach.
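To make the relative frequency idea concrete, here is a minimal simulation sketch (ours, not from the text; the sample sizes and random seed are arbitrary illustrative choices). As n grows, the observed ratio na/n settles near the theoretical probability p = 1/2:

```python
# Relative frequency sketch: toss a fair coin n times and watch
# n_a / n (heads over tosses) approach the theoretical value 1/2.
import random

random.seed(42)  # fixed seed so the illustration is reproducible

for n in (100, 10_000, 1_000_000):
    n_a = sum(random.random() < 0.5 for _ in range(n))  # count heads
    print(f"n = {n:>9,}  n_a/n = {n_a / n:.4f}")
```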
Example 5.22 Suppose that in Example 5.3 the box contains only six parts, one part produced by each of six manufacturers. We select one part randomly. Find the probability of the event A that the part selected is produced by manufacturer 2, 4, or 6.
Solution: Since the box contains exactly six parts, one part produced by each of the six manufacturers, the sample space S consists of six equally likely sample points, that is, S = {1, 2, 3, 4, 5, 6}. The sample points favorable to the event A are A = {2, 4, 6}. Thus, we have n = 6 and na = 3. In this case, the probability of the event A is P(A) = 3/6 = 1/2.
Example 5.23 The production department of a manufacturing company wants to hire an engineer. In response to an advertisement, it received 60 applications, of which 20 were sent by women. To judge the quality of the pool of applicants, the manager of the production department randomly selects one application. Find the probability that the application is a woman's.
Solution: The sample space S consists of 60 equally likely sample points. Let A be the event that the application is a woman's. The number of sample points favorable to the event A is 20. Thus, in this case, we have n = 60 and na = 20. Therefore,

P(A) = na / n = 20/60 = 1/3


Example 5.24 Roll two balanced dice and observe the sum of the two numbers that appear on the uppermost faces. Let A be the event that the sum of the numbers on the two dice is 7. Find the probability of the event A.
Solution: From Examples 5.20 and 5.21 we can see that n = 36 and na = 6. Thus, P(A) = na/n = 6/36 = 1/6.

Example 5.25 In Example 5.20, find the probability that the two dice show
the same number.
Solution: Let A be the event that the two dice show the same number. Then A = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)}, that is, na = 6. The probability that the two dice show the same number is

P(A) = na / n = 6/36 = 1/6

Example 5.26 Consider a group of six workers, all of whom are born during the same nonleap year. Find the probability that no two workers have
the same birthday.
Solution: We represent the workers as W1, W2, W3, W4, W5, and W6 and
their birthdays as D1, D2, D3, D4, D5, and D6, respectively. The sample point
that represents the birthdays of these workers may be represented by (D1, D2,
D3, D4, D5, D6). Since each birthday could be any day of the year, using the
multiplication rule, the total number of sample points in the sample space is
given by
n = 365 × 365 × 365 × 365 × 365 × 365 = 365^6
Now suppose that E is the event where no two of the six workers are born on the same day. Event E will occur if the first worker's birthday falls on any of the 365 days, the second worker's birthday falls on any of the remaining 364 days, the third worker's birthday falls on any of the remaining 363 days, and so on. The total number of sample points favorable to the event E, using the multiplication rule, is
na = 365 × 364 × 363 × 362 × 361 × 360

Thus, in this example, we have

P(E) = na / n = (365 × 364 × 363 × 362 × 361 × 360) / 365^6 = 0.959538
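This ratio is easy to verify by direct computation; the short sketch below is our own check of the arithmetic, not part of the text:

```python
# Check of Example 5.26: probability that six workers have six
# distinct birthdays, (365 * 364 * ... * 360) / 365**6.
from math import prod

n_a = prod(range(360, 366))   # 365 * 364 * 363 * 362 * 361 * 360
n = 365 ** 6
print(n_a / n)                # -> approximately 0.959538
```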

5.6 Axioms of Probability

In the preceding section we defined probability using the relative frequency approach. In the early 1930s this approach was largely abandoned in favor of the modern axiomatic approach to probability theory, which we will study in this section.
Consider a random experiment E and let S be the sample space associated with this experiment. To every event A in S, there corresponds a certain number P(A), called the probability of A, which satisfies the following axioms.
Axiom 1. 0 ≤ P(A) ≤ 1

Axiom 2. P(S) = 1

Axiom 3. For any sequence {An} of mutually exclusive events (Ai ∩ Aj = ∅ for i ≠ j),

P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An)

Please note that the first axiom states that the probability of an event A always assumes a value between 0 and 1 (inclusive). Axiom 2 states that the event S is sure to happen; in other words, it is certain that the outcome of the experiment E will be a sample point in the sample space S. Axiom 3 states that the probability of the occurrence of one or more of the mutually exclusive events A1, A2, ..., An is just the sum of their respective probabilities.
Several important results that follow from these axioms simplify the computation of probabilities of complex events. Here we state just a few of them. The proofs of these results are beyond the scope of this book.
Theorem 5.1 Let S be a sample space, and let A be any event in S. The sum of the probabilities of the event A and its complement Ā is one, that is,

P(A) + P(Ā) = 1    (5.6)

From this it follows that

P(A) = 1 − P(Ā)  or  P(Ā) = 1 − P(A)    (5.7)

Theorem 5.2 Let S be a sample space, and let ∅ be the null event. Then

P(∅) = 0    (5.8)

Theorem 5.3 Let S be a sample space. Let A and B be any two events (which may or may not be mutually exclusive) in S. Then we have

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)    (5.9)

Figure 5.8 Venn diagram showing the phenomenon of P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

A nonmathematical proof of Equation (5.9) follows. From Figure 5.8, we can easily see that P(A ∩ B) is included in both P(A) and P(B). Thus, when we add P(A) and P(B), we are adding the probability P(A ∩ B) twice; therefore, we must subtract P(A ∩ B) from the sum P(A) + P(B).
If the events A and B are mutually exclusive, then P(A ∩ B) = P(∅) = 0. Therefore,

P(A ∪ B) = P(A) + P(B)    (5.10)

The result in Theorem 5.3 can easily be extended to more than two events. For example, for any three events A, B, and C we have

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C)
Example 5.27 Suppose a manufacturing plant has 100 workers, some of whom are working on two projects, project 1 and project 2. Suppose 60 workers are working on project 1, 30 are working on project 2, and 20 are working on both projects. Suppose a worker is selected randomly. What is the probability that he or she is working on at least one of the projects?
Solution: Let A be the event that the selected worker is working on project 1 and B be the event that the selected worker is working on project 2. We are interested in finding the probability that the worker is working on at least one project, that is, either on project 1, or on project 2, or on both projects. This is equivalent to finding the probability P(A ∪ B). From the information provided to us, we have

P(A) = 60/100, P(B) = 30/100, and P(A ∩ B) = 20/100

Therefore, from Formula (5.9), we have

P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 60/100 + 30/100 − 20/100 = 70/100 = 7/10
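As a quick sanity check, the inclusion-exclusion step of Equation (5.9) can be reproduced in a couple of lines; this is our own sketch using the counts from the example, with variable names of our choosing:

```python
# Example 5.27 via Equation (5.9): P(A or B) = P(A) + P(B) - P(A and B).
p_a, p_b, p_ab = 60 / 100, 30 / 100, 20 / 100
p_a_or_b = p_a + p_b - p_ab
print(p_a_or_b)   # -> 0.7
```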

5.7 Conditional Probability

Let S be a sample space and let A and B be any two events in the sample space S. So far we have been interested in finding the probabilities of events A, B, or some combination of A and B, such as (A ∪ B), (A ∩ B), Ā, or B̄. In this section we are interested in finding the probability of an event, say A, when we know that the event B has already happened. This probability is called the conditional probability of the event A given that the event B has already happened, and it is denoted by P(A | B). (Note that A | B is read as "A given B"; it should not be confused with A / B, which means A divided by B.) The conditional probability may be defined as follows:
Definition 5.16 Let S be a sample space and let A and B be any two events in the sample space S. The conditional probability of the event A, given that the event B has already occurred, is

P(A | B) = P(A ∩ B) / P(B),  if P(B) ≠ 0    (5.11)

Similarly,

P(B | A) = P(A ∩ B) / P(A),  if P(A) ≠ 0    (5.12)

Example 5.28 The manufacturing department of a company hires technicians who are college graduates as well as technicians who are not college
graduates. Under the diversity program, the manager of any given department is very careful to hire both male and female technicians. The data in
Table 5.1 shows a classification of all technicians in a selected department
by qualification and gender.
In this case, the manager promotes one of the technicians to a supervisory position. If it is known that the promoted technician is a woman, then
what is the probability that she is a nongraduate? Find the probability that
the promoted technician is a nongraduate when it is not known that the promoted technician is a woman.
Solution: Let S be the sample space associated with this problem and let A and B be two events defined as follows:
A: the promoted technician is a nongraduate
B: the promoted technician is a woman
We are interested in finding the conditional probability P(A | B). Since any of the 100 technicians could be promoted, the sample space S consists of 100 equally likely sample points. The sample points favorable to the event A number 65, and those favorable to the event B number 44. Also, the sample points favorable to both events A and B are all the women who are nongraduates, which number 29. To describe this situation we have

P(A) = 65/100, P(B) = 44/100, and P(A ∩ B) = 29/100

Therefore,
Table 5.1 Classification of technicians by qualification and gender.

          Graduates   Nongraduates   Total
Male      20          36             56
Female    15          29             44
Total     35          65             100

P(A | B) = P(A ∩ B) / P(B) = (29/100) / (44/100) = 29/44

Note that the probability P(A), sometimes known as the absolute or nonconditional probability, is the probability that the promoted technician is a nongraduate when it is not known that the promoted technician is a woman; it is different from the conditional probability P(A | B).
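The same numbers fall out directly from the counts in Table 5.1. The sketch below is our own illustration of Equation (5.11), with hypothetical variable names:

```python
# Example 5.28 from the Table 5.1 counts, using Equation (5.11):
# P(A | B) = P(A and B) / P(B).
n_total = 100   # all technicians
n_B = 44        # women
n_AB = 29       # nongraduate women

p_A_given_B = (n_AB / n_total) / (n_B / n_total)
print(p_A_given_B)      # -> 0.659... = 29/44
print(65 / n_total)     # nonconditional P(A) = 0.65, a different number
```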
When the conditional probability P(A | B) is the same as the nonconditional probability P(A), that is, P(A | B) = P(A), the two events A and B are said to be independent. From the results in Equations (5.11) and (5.12) we can easily see that

P(A ∩ B) = P(A | B) P(B)    (5.13)

and

P(A ∩ B) = P(B | A) P(A)    (5.14)

Now, using the results in Equations (5.13) and (5.14), if P(A | B) = P(A) or P(B | A) = P(B), that is, if events A and B are independent, we can easily see that

P(A ∩ B) = P(A) P(B)

A consequence of this result is that we have the following definition.
Definition 5.17 Let S be a sample space, and let A and B be any two events in S. The events A and B are independent if and only if any one of the following is true:

1. P(A | B) = P(A)    (5.15)
2. P(B | A) = P(B)    (5.16)
3. P(A ∩ B) = P(A) P(B)    (5.17)

The conditions in Equations (5.15), (5.16), and (5.17) are equivalent in the sense that if one is true then the other two are also true.
Note that the results in Equations (5.13) and (5.14) are known as the multiplication rule.
Earlier in this chapter we learned about mutually exclusive events. Although it may seem that mutually exclusive events are the same as independent events, we encourage you to be aware that the two concepts are entirely different. Independence is a property that relates to the probabilities of events, whereas mutual exclusivity relates to the composition of events, that is, to the sample points present in the events. For example, if the events A and B are mutually exclusive and P(A) > 0, P(B) > 0, then P(A ∩ B) = P(∅) = 0 ≠ P(A)P(B), so the events are not independent.
Table 5.2 Classification of technicians by qualification and gender.

          Graduates   Nongraduates   Total
Male      27          33             60
Female    18          22             40
Total     45          55             100

Another method of calculating conditional probability, without using the formulas in Equations (5.11) and (5.12), is to determine a new sample space, called the induced sample space, that takes into consideration the information about the event that has already occurred. For instance, in Example 5.28, if we use the information that a woman has already been promoted, then the technician who was promoted cannot be a man; therefore, the new sample space, or induced sample space, consists only of 44 sample points (the total number of women). Out of the 44 women technicians, 29 are nongraduates. The conditional probability P(A | B) is the probability that a nongraduate technician has been promoted given that a female technician has been promoted, and is now found as

P(A | B) = (number of nongraduate women technicians) / (number of all women technicians) = 29/44
This is, of course, the same result as found in Example 5.28.
To elaborate further on the concept of independence, we consider the following example.
Example 5.29 Suppose that in Example 5.28 a new manager in the manufacturing department has made changes in the hiring policy. As a consequence
of the changes, the new classification of the technicians is as in Table 5.2.
The new manager has promoted a technician to a foreman's position. Find the following probabilities:
a. The conditional probability that the promoted technician is a nongraduate given that the technician is a woman.
b. The nonconditional probability that the promoted technician is a nongraduate.
Solution:
(a) Let the events A and B be defined as in Example 5.28, that is,
A: The promoted technician is a nongraduate.
B: The promoted technician is a woman.
Again, we are interested in determining the conditional probability P(A | B). Using the new classification, we can see, as in Example 5.28, that:

P(A) = 55/100 = 11/20
P(B) = 40/100 = 2/5
P(A ∩ B) = 22/100 = 11/50
Therefore,

P(A | B) = P(A ∩ B) / P(B) = (11/50) / (2/5) = (11/50)(5/2) = 11/20

(b) In this part we are interested only in finding the probability P(A). This probability, as it turns out, we already calculated in part (a); that is, P(A) = 55/100 = 11/20. Since P(A | B) = P(A), under the new classification the events A and B are independent.
6
Discrete Random Variables and Their Probability Distributions

In Chapter 5 we studied sample spaces, events, and basic concepts, including certain axioms of probability theory. We saw that the sample space associated with a random experiment E describes all possible outcomes of the experiment. In many applications such a description of outcomes is not sufficient to extract full information about the possible outcomes of the experiment. In such cases it is always useful to assign a certain numerical value to each of the possible outcomes. Defining a variable known as a random variable does the assigning of numerical values to all the possible outcomes.
In this chapter we define random variables and study their probability distributions, means, and standard deviations. Then we study some special probability distributions that are commonly encountered in various statistical applications.

6.1 Discrete Random Variables

In most applications we deal with two types of random variables: discrete random variables and continuous random variables. In this chapter we study discrete random variables, and in Chapter 7 we shall study continuous random variables.
Definition 6.1 A random variable is a vehicle (in mathematical
language we call it a function) that assigns a real numerical value to
each sample point (or outcome) in the sample space of a random
experiment.
A random variable is usually denoted by uppercase letters at the end
of the English alphabet, such as X, Y, or Z. The corresponding lowercase letter usually denotes the value that a random variable assigns
to the sample point.
Note: A random variable may assign the same numerical value to various outcomes, but it will never assign more than one value to any one outcome.


Definition 6.2 A random variable that assumes a finite (or countably infinite) number of values is called a discrete random variable.
Definition 6.3 A random variable that assumes an uncountably infinite number of values is called a continuous random variable.
Examples of discrete random variables include the number of cars sold by a dealer, the number of new employees hired by a company, the number of parts produced by a machine, the number of defects in the engine of a car, the number of patients admitted to a hospital, the number of telephone calls answered by a receptionist, the number of applications sent out by a job seeker, the number of games played by a batter before he makes a home run, and so on.
To elaborate the concept that a random variable assigns numerical values
to all the possible outcomes, we consider a simple example of rolling two
dice. Remember that experiments of rolling dice or tossing coins were used
in this text to introduce the concept of probability theory.
Example 6.1 Roll two fair dice and observe the numbers that show up on
the upper faces. Find the sample space of this experiment and then define a
random variable that assigns a numerical value to each sample point equal
to the sum of the points that show up on upper faces. Find all the values that
such a random variable assumes.
Solution: Obviously, when a fair die is rolled any one of the six possible
numbers (1, 2, 3, 4, 5, 6) can come up. Thus, the sample space when two dice
are rolled is as follows:
S  {(1,1), (1,2), (1,3), (1,4), (1,5), (1,6), (2,1), (2,2), ..., (2,6), ..., (6,6)}
Let X be a random variable that assigns a numerical value to each sample point equal to the sum of the two numbers. We have

X(1, 1) = 2
X(1, 2) = X(2, 1) = 3
X(1, 3) = X(2, 2) = X(3, 1) = 4
X(1, 4) = X(2, 3) = X(3, 2) = X(4, 1) = 5
X(1, 5) = X(2, 4) = X(3, 3) = X(4, 2) = X(5, 1) = 6
X(1, 6) = X(2, 5) = X(3, 4) = X(4, 3) = X(5, 2) = X(6, 1) = 7
X(2, 6) = X(3, 5) = X(4, 4) = X(5, 3) = X(6, 2) = 8
X(3, 6) = X(4, 5) = X(5, 4) = X(6, 3) = 9
X(4, 6) = X(5, 5) = X(6, 4) = 10
X(5, 6) = X(6, 5) = 11
X(6, 6) = 12

The random variable X assumes the values 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. X assumes only a finite number of values; therefore, it is a discrete random variable.
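For readers who like to verify such enumerations by machine, the short sketch below is our own illustration: it tallies the 36 equally likely outcomes and recovers the values and probabilities that appear in Table 6.2:

```python
# Enumerate the 36 outcomes of two dice, assign X = sum of the faces,
# and tally how many outcomes map to each value of X (Example 6.1).
from collections import Counter
from fractions import Fraction

counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
for x in sorted(counts):
    print(x, Fraction(counts[x], 36))   # 2 -> 1/36, ..., 7 -> 1/6, ..., 12 -> 1/36
```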


Definition 6.4 The set of all possible values of a random variable X, denoted by R = {x1, x2, ..., xn, ...}, is usually called the range space.

To each value xi in the range space R, we assign a number pi = P(X = xi), where pi satisfies the following two conditions:

1. pi ≥ 0 for all i    (6.1)
2. Σ pi = 1    (6.2)

The function P defined above is known as the probability function of the random variable X. The set of pairs (xi, pi), i = 1, 2, ..., n, written in tabular form as shown in Table 6.1, is called the probability distribution of the random variable X. Note that in order for any probability function to be a probability distribution, it must satisfy the properties described in Equations (6.1) and (6.2).
To illustrate the concept of a probability distribution, we determine the probability distribution of the random variable X defined in Example 6.1.
To determine this probability distribution, we use the original sample space S. In Example 6.1 the sample space S consists of 36 equally likely sample points. Using this fact about the sample space and the relative frequency definition of probability introduced in Chapter 5, we have
P(X = 2) = P{(1, 1)} = 1/36
P(X = 3) = P{(1, 2), (2, 1)} = 2/36
P(X = 4) = P{(1, 3), (2, 2), (3, 1)} = 3/36
P(X = 5) = P{(1, 4), (2, 3), (3, 2), (4, 1)} = 4/36
P(X = 6) = P{(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)} = 5/36
P(X = 7) = P{(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)} = 6/36
...
P(X = 12) = 1/36

Table 6.1 Probability distribution of a random variable X.

X = x       x1    x2    x3    x4    ...    xn
P(X = x)    p1    p2    p3    p4    ...    pn

Table 6.2 Probability distribution of the random variable X defined in Example 6.1.

X = x       2      3      4      5      6      7      8      9      10     11     12
P(X = x)    1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

Figure 6.1 Graphical representation of the probability function in Table 6.2.

We can easily verify that the probabilities in Table 6.2 satisfy the properties given in Equations (6.1) and (6.2). Thus, the probability function P(X = x) defined above is a probability distribution of the random variable X. Note that it is quite common to denote the probability function P(X = x) by f(x).
The probability function f(x) = P(X = x) is graphically represented in Figure 6.1.
Let us consider, once again, the discrete random variable X and its range space R = {x1, x2, ..., xn}. Let A be any subset of R. Then we have

P(A) = Σ f(xi) = Σ P(X = xi)    (6.3)

where the summation is taken over all xi in A. Thus, for example, let A in Example 6.1 be such that

A = {2, 4, 5, 8}

Then, from Table 6.2, we have

P(A) = P(X = 2) + P(X = 4) + P(X = 5) + P(X = 8)
     = 1/36 + 3/36 + 4/36 + 5/36 = 13/36

In particular, if we define the subset A as

A = {x | x ≤ 6}

then, again using Table 6.2, we have

P(A) = P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5) + P(X = 6)
     = 1/36 + 2/36 + 3/36 + 4/36 + 5/36 = 15/36 = 5/12
In general, if we define A as

A = {xi | xi ≤ x}

then

P(A) = Σ_{xi ≤ x} P(X = xi) = Σ_{xi ≤ x} f(xi)    (6.4)

Equation (6.4) is clearly the sum of the probabilities for all xi that are less than or equal to x. The probability P(A) defined in Equation (6.4) is commonly known as a cumulative probability.
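The cumulative sum in Equation (6.4) is mechanical enough to script; here is a minimal sketch of our own, reusing the Table 6.2 distribution:

```python
# Cumulative probability per Equation (6.4): sum f(x_i) over x_i <= x.
from fractions import Fraction

pmf = {2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1}

def cumulative(x):
    return sum(Fraction(c, 36) for xi, c in pmf.items() if xi <= x)

print(cumulative(6))   # -> 5/12, matching P(X <= 6) above
```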


Definition 6.5 The cumulative distribution function (cdf), also known simply as the distribution function, of a discrete random variable X, denoted by F(x), is defined as follows:

F(x) = P(X ≤ x) = Σ_{xi ≤ x} f(xi)    (6.5)

Example 6.2 Consider a random experiment of tossing a coin twice. Let X be a discrete random variable defined as the number of heads that appear in the two tosses. Determine the probability function and the distribution function of the random variable X. Give the graphical representation of the probability function and the distribution function.
Solution: Let H and T denote a head and a tail. Then the sample space S is

S = {HH, HT, TH, TT}

Let the random variable X denote the number of heads, so that

X(HH) = 2
X(HT) = X(TH) = 1
X(TT) = 0

Thus, the range space R is

R = {0, 1, 2}

Let f(x) be the probability function of the random variable X (see Table 6.3). The graphical representation of the probability function f(x) in Table 6.3 is as shown in Figure 6.2.

Table 6.3 Probability function of X.

X = x              0     1     2
f(x) = P(X = x)    1/4   1/2   1/4

Figure 6.2 Graphical representation of the probability function f(x) in Table 6.3.


Figure 6.3 Graphical representation of the distribution function F(x) in Example 6.2.

We can now easily find the distribution function F(x) as

F(x) = 0      for x < 0
     = 1/4    for 0 ≤ x < 1
     = 3/4    for 1 ≤ x < 2
     = 1      for 2 ≤ x

The graphical representation of F(x) is as shown in Figure 6.3.
The distribution function F(x) of a discrete random variable possesses the following properties:

1. 0 ≤ F(x) ≤ 1    (6.6)
2. F(x1) ≤ F(x2), for x1 < x2    (6.7)

Example 6.3 Consider a random experiment of rolling two dice. Let X be a discrete random variable defined as in Example 6.1, and let x1 = 4 and x2 = 5. Show that 0 ≤ F(x) ≤ 1 and F(x1) ≤ F(x2), where F(x) is the distribution function of the random variable X.
Solution: Using Definition 6.5, it can easily be shown that

F(x) = 0       for x < 2
     = 1/36    for 2 ≤ x < 3
     = 3/36    for 3 ≤ x < 4
     = 6/36    for 4 ≤ x < 5
     = 10/36   for 5 ≤ x < 6
     ...
     = 35/36   for 11 ≤ x < 12
     = 1       for 12 ≤ x

This clearly shows that 0 ≤ F(x) ≤ 1. Furthermore, it is obvious that

F(x1) = F(4) = 6/36  and  F(x2) = F(5) = 10/36

Therefore, F(x1) ≤ F(x2).


6.2 Mean and Standard Deviation of a Discrete Random Variable

The mean and the standard deviation of a discrete random variable X are two measures that are usually used to summarize the probability distribution of the random variable. The mean, denoted by μ, is sometimes also known as the expected value. It is denoted by E(X), and it is a measure of the center of the probability distribution. The standard deviation, denoted by σ, is a measure of the variability or dispersion of the probability distribution. In the absence of complete information about the probability distribution of a random variable, the mean and the standard deviation give us summarized information about that probability distribution.
Let X be a discrete random variable and let f(x) be the probability function of X. Then the mean and the standard deviation of the random variable X are defined as follows:
Definition 6.6 The mean or expected value of a discrete random variable X is defined as

μ = E(X) = Σ x f(x)    (6.8)

that is, each value of the random variable X is multiplied by the corresponding probability, and the products are summed over all the possible values of the random variable X.

Definition 6.7 The variance of a discrete random variable X is defined as

σ² = V(X) = Σ (x − μ)² f(x)    (6.9)

The standard deviation of a discrete random variable X is defined as

σ = √( Σ (x − μ)² f(x) )    (6.10)

Equivalently, we can show that

σ = √( Σ x² f(x) − μ² )    (6.11)

Example 6.4 Let a random variable X denote the number of defective parts produced per eight-hour shift by a machine. Experience shows that it produces between 0 and 4 defectives (inclusive) with the following probabilities:

X = x              0     1     2     3     4
f(x) = P(X = x)    0.1   0.4   0.25  0.20  0.05

a. Find the mean and the standard deviation of the random variable X.


b. Find the probability that X falls in the interval (μ − 2σ, μ + 2σ).

Solution:
a. Using Definition 6.6, the mean of the random variable X is

μ = E(X) = Σ x f(x)
  = 0(0.1) + 1(0.4) + 2(0.25) + 3(0.20) + 4(0.05)
  = 0 + 0.40 + 0.50 + 0.60 + 0.20 = 1.70

Now, using Equation (6.11), the standard deviation of the random variable X is

σ = √( Σ x² f(x) − μ² )
  = √( 0²(0.1) + 1²(0.4) + 2²(0.25) + 3²(0.20) + 4²(0.05) − (1.70)² )
  = √( 0 + 0.40 + 1.00 + 1.80 + 0.80 − 2.89 )
  = 1.05

b. Using the values of μ and σ calculated in part (a), and as shown in Figure 6.4, we have

(μ − 2σ, μ + 2σ) = (1.70 − 2(1.05), 1.70 + 2(1.05)) = (−0.4, 3.80)

Thus, the probability that X falls in the interval (−0.4, 3.80) is

P(−0.4 ≤ X ≤ 3.80) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)
                   = 0.1 + 0.4 + 0.25 + 0.20 = 0.95

This result tells us that the machine produces a number of defective parts within two standard deviations of the mean with probability 95%.
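These computations are easy to mirror in a few lines; the sketch below is our own illustration of Equations (6.8) and (6.11) applied to the distribution of Example 6.4:

```python
# Mean, standard deviation, and P(mu - 2*sigma < X < mu + 2*sigma)
# for the defective-parts distribution of Example 6.4.
from math import sqrt

f = {0: 0.10, 1: 0.40, 2: 0.25, 3: 0.20, 4: 0.05}
mu = sum(x * p for x, p in f.items())                        # 1.70
sigma = sqrt(sum(x * x * p for x, p in f.items()) - mu**2)   # ~1.05
lo, hi = mu - 2 * sigma, mu + 2 * sigma
prob = sum(p for x, p in f.items() if lo < x < hi)           # ~0.95
print(round(mu, 2), round(sigma, 2), round(prob, 2))
```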

Figure 6.4 Location of the mean μ and the endpoints of the interval (μ − 2σ, μ + 2σ).


6.2.1 Interpretation of the Mean and the Standard Deviation

Mean: In Example 6.4 the mean is the number of defective parts that the machine is expected to produce per eight-hour shift. Note that the mean value in this example is 1.70, not a whole number. Obviously, the machine cannot produce 1.70 defectives, but this is okay because μ = 1.70 does not mean that in every shift the machine is going to produce exactly 1.70 defective parts. It simply means that if we observe the number of defective parts produced by this machine over many eight-hour shifts, the machine will produce different numbers of defective parts in different shifts, but the average number of defective parts produced by the machine in one shift is approximately 1.7.
The physical interpretation of the mean is that if we put weights of 0.1, 0.4, 0.25, 0.20, and 0.05 units on a rod at the points 0, 1, 2, 3, and 4, respectively, then the mean is the center of gravity, or the balance point, of the rod.

Standard Deviation: From part (b) of Example 6.4, it is very clear that the standard deviation is a measure of dispersion; that is, it tells us how far the points with probability greater than zero are scattered from the mean. In this example, we saw that points carrying 95% of the probability fall within two standard deviations of the mean.
The physical interpretation of the variance, which is the square of the standard deviation, is that if we place the weights on a rod as mentioned earlier, then the variance is the moment of inertia about a perpendicular axis through the mean. The moment of inertia is the constant of proportionality used in Newton's second law for rotational motion about a fixed axis.

6.3 The Bernoulli Trials and the Binomial Distribution

Let us consider a random experiment E consisting of repeated trials where each trial has only two possible outcomes, referred to as success S and failure F. Then a sequence of independent trials (repetitions), such that the probability of success on each trial remains a constant p and the probability of failure is q = 1 − p, is called a sequence of Bernoulli trials. For example, if we toss a coin repeatedly, we have Bernoulli trials, since in each trial the probability of a head as well as of a tail remains fixed.
Let X be a random variable denoting the number of successes in a Bernoulli trial. Clearly, if we set X = 1 or X = 0 according as the trial is a success or a failure, and

P(X = x) = p if x = 1, and P(X = x) = 1 − p = q if x = 0    (6.12)

then the probability function of the Bernoulli random variable X is given by

X = x       0    1
P(X = x)    q    p


which may also be written as

P(X = x) = p^x q^(1−x),  x = 0, 1    (6.13)

The function P in Equation (6.13) is a probability function, since

P(X = 0) ≥ 0, P(X = 1) ≥ 0

and

P(X = 0) + P(X = 1) = q + p = 1
Definition 6.8 A random variable X is said to be distributed as a Bernoulli distribution if its probability function is defined as

P(X = x) = p^x q^(1−x),  x = 0, 1;  p + q = 1

Here p is the parameter of the distribution.

Sometimes the Bernoulli distribution is also known as a point binomial distribution.
6.3.1 Mean and Standard Deviation of a Bernoulli Distribution

The mean and the standard deviation of a Bernoulli distribution are given by

μ = p  and  σ = √(pq)    (6.14)

respectively, where p is the probability of success and q is the probability of failure.
6.3.2 The Binomial Distribution

The binomial distribution is one of the most commonly used discrete probability distributions. It is applied whenever an experiment possesses the following characteristics:
1. The experiment consists of n independent trials.
2. Each trial has two possible outcomes, usually called success and failure.
3. The probability p of success in each trial is constant throughout the experiment. Consequently, the probability q = 1 − p of failure is also constant throughout the experiment.
For example, consider the following scenarios in which the binomial distribution is applicable.
1. A machine is producing 5% defective parts. Let the random variable X denote the number of defective parts in the next 60 parts produced by that machine.
2. Babies are being born in a given hospital. Let X denote the number of boys among the next 15 babies born in that hospital.
3. Let the probability that a worker in a company will join the union be 0.4. Let X denote the number of workers out of the 50 workers in that company who will join the union.
4. The probability that a computer in a lab has a virus is 0.25. Let X denote the number of computers out of a total of 30 computers that have a virus.
5. A job applicant has a 35% chance that he or she will be called for an interview. Let X denote the number of applicants out of a total of 18 applicants who will be called for an interview.
6. The probability that a consignment will be shipped on time is 0.70. Let X denote the number of consignments out of a total of 25 consignments that will be shipped on time.
Note: In each of the above examples, X is the sum of independent Bernoulli
random variables.
Definition 6.9 A random variable X is said to be distributed as a binomial distribution if its probability function is given by

P(X = x) = C(n, x) p^x q^(n−x),  x = 0, 1, 2, ..., n;  p + q = 1    (6.15)

where n > 0 is the number of trials, x is the number of successes, and p is the probability of success. Practitioners sometimes call this probability distribution a binomial model. The n and p in Equation (6.15) are the parameters of the distribution.
Recall from Equation (5.4) that C(n, x), pronounced "n choose x," is the number of combinations when x items are selected from a total of n items, and is equal to

C(n, x) = n! / (x! (n − x)!)

For example,

C(6, 4) = 6! / (4! (6 − 4)!) = (6 × 5 × 4 × 3 × 2 × 1) / ((4 × 3 × 2 × 1)(2 × 1)) = 15
Example 6.5 The probability is 0.80 that a randomly selected technician
will finish his or her project successfully. Let X be the number of technicians
among a randomly selected group of 10 technicians who will finish their
projects successfully. Find the probability distribution of the random variable X. Also, represent this probability distribution graphically.


Solution: It is clear that the random variable X in this example is distributed as binomial with n = 10 and p = 0.8. Thus, using the binomial probability function, we have

f(0) = C(10, 0)(0.80)^0 (.20)^10 = .0000
f(1) = C(10, 1)(0.80)^1 (.20)^9 = .0000
f(2) = C(10, 2)(0.80)^2 (.20)^8 = .0001
f(3) = C(10, 3)(0.80)^3 (.20)^7 = .0008
f(4) = C(10, 4)(0.80)^4 (.20)^6 = .0055
f(5) = C(10, 5)(0.80)^5 (.20)^5 = .0264
f(6) = C(10, 6)(0.80)^6 (.20)^4 = .0881
f(7) = C(10, 7)(0.80)^7 (.20)^3 = .2013
f(8) = C(10, 8)(0.80)^8 (.20)^2 = .3020
f(9) = C(10, 9)(0.80)^9 (.20)^1 = .2684
f(10) = C(10, 10)(0.80)^10 (.20)^0 = .1074

The graphical representation of the probability distribution is shown in Figure 6.5.
Example 6.6 Using the information in Example 6.5, find the following probabilities:
1. At least three technicians will finish their project successfully.
2. At most five technicians will finish their project successfully.
3. Between four and six (inclusive) technicians will finish their project successfully.


Figure 6.5 Binomial probability distribution with n = 10, p = 0.80.

Solution:
1. In this part we are interested in finding the cumulative probability P(X ≥ 3). Using the probabilities obtained in Example 6.5, we get

P(X ≥ 3) = P(X = 3) + P(X = 4) + ... + P(X = 10)
         = .0008 + .0055 + ... + .1074 = .9999

2. In this part we want to find the probability P(X ≤ 5). Again, using the probabilities in Example 6.5, we get

P(X ≤ 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4) + P(X = 5)
         = .0000 + .0000 + .0001 + .0008 + .0055 + .0264 = .0328

3. Now we want to find the probability P(4 ≤ X ≤ 6). Thus, we get

P(4 ≤ X ≤ 6) = P(X = 4) + P(X = 5) + P(X = 6) = .0055 + .0264 + .0881 = .1200
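If a binomial table is not at hand, the same probabilities can be computed directly from Equation (6.15). The following is a minimal sketch of our own using Python's math.comb:

```python
# Binomial probabilities for Examples 6.5-6.6, Equation (6.15).
from math import comb

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.80
print(sum(binom_pmf(x, n, p) for x in range(3, 11)))  # P(X >= 3) ~ .9999
print(sum(binom_pmf(x, n, p) for x in range(0, 6)))   # P(X <= 5) ~ .0328
print(sum(binom_pmf(x, n, p) for x in (4, 5, 6)))     # P(4<=X<=6) ~ .1200
```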
6.3.3 Binomial Probability Tables

The tables of binomial probabilities for n = 1 to 15 and for some selected values of p are given in Table I of the Appendix. We illustrate the use of these tables with the following example.
Example 6.8 The probability that the Food and Drug Administration (FDA) will approve a new drug is 0.60. Suppose that five new drugs are submitted to the FDA for approval. Find the following probabilities:
1. Exactly three drugs are approved.
2. At most three drugs are approved.
3. At least three drugs are approved.
4. Between two and four (inclusive) drugs are approved.


Table 6.4 Portion of Table I of the Appendix for n = 5.

        p
x       .05     .60
0       .774    .010
1       .203    .077
2       .022    .230
3       .001    .346
4       .000    .259
5       .000    .078

Solution: To use the appropriate table, one must first determine the value of n. Then, the probability for given values of n and p is found at the intersection of the row corresponding to the given value of x and the column corresponding to the given value of p. In this example the probability for n = 5, p = 0.60, and x = 3 is shown in Table 6.4.
Thus, from Table 6.4, we have
1. P(X = 3) = .346
2. P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = .010 + .077 + .230 + .346 = .663
3. P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) = .346 + .259 + .078 = .683
4. In this part we want to find the probability P(2 ≤ X ≤ 4). Thus, from Table 6.4, we get P(2 ≤ X ≤ 4) = P(X = 2) + P(X = 3) + P(X = 4) = .230 + .346 + .259 = .835
Mean and Standard Deviation of a Binomial Distribution

Definition 6.10 The mean and standard deviation of a binomial random variable X are

Mean: μ = E(X) = np    (6.16)
Standard deviation: σ = √V(X) = √(npq)    (6.17)

where n is the total number of trials, p is the probability of success in each trial, q = 1 − p is the probability of failure in each trial, and V(X) = npq is the variance of the random variable X.
Example 6.9 In Example 6.5 find the mean and the standard deviation of
the random variable X, which represents the number of technicians who will
finish their project successfully.
Solution: In Example 6.5, we have
n = 10, p = 0.8, and q = 1 − p = 1 − 0.8 = 0.2


Thus, using Equations (6.16) and (6.17), we have

μ = np = 10(0.8) = 8
σ = √(npq) = √(10(0.8)(0.2)) = √1.60 = 1.26
Example 6.10 The probability that a shopper entering a department store will make a purchase is 0.30. Let X be the number of shoppers, out of a total of 30 shoppers who enter that department store within a certain period, who make a purchase. Use formulas (6.16) and (6.17) to find the mean and the standard deviation of the random variable X.
Solution: Using formulas (6.16) and (6.17), we have

μ = np = 30(.30) = 9
σ = √(npq) = √(30(.30)(.70)) = √6.30 = 2.51

6.4 The Hypergeometric Distribution

In earlier chapters we introduced the term population and the concept of sampling with and without replacement. In section 6.3 we introduced the concept of independent trials. If the sampling is done with replacement, and if we consider selecting an object from the population as a trial, we can easily see that these trials are independent. However, if the sampling is done without replacement, these trials are not independent; that is, the outcome of any trial will depend upon what happened in the previous trial(s).
Consider now a special kind of population consisting of two categories, such as a population of males and females, defectives and nondefectives, salaried and nonsalaried workers, healthy and nonhealthy, successes and failures, and so on. Such populations are generally known as dichotomized populations. If sampling with replacement is done from a dichotomized population, we can answer all our questions by using the binomial probability distribution. However, if sampling is done without replacement, the trials will not be independent, and therefore the binomial probability distribution will not be applicable. For example, suppose a box contains 100 parts, of which five are defective. Randomly select two parts, one at a time, from this box. The probability that the first part will be defective is 5/100. However, the probability of the second part being defective depends upon whether the first part was defective or not. If the first part was defective, the probability of the second part being defective is 4/99, but if the first part was not defective this probability will be 5/99. Because the two probabilities are different, we cannot use the binomial probability distribution. Thus, if we randomly select 10 parts from this box to find the probability that exactly one of these parts is defective, we cannot use the binomial distribution.


However, this probability can easily be found by using another probability distribution, known as the hypergeometric distribution.

Definition 6.11 A random variable X is said to be distributed as hypergeometric if

P(X = x) = C(r, x) C(N − r, n − x) / C(N, n),  x = a, a + 1, ..., min(r, n)    (6.18)

where
a = max(0, n − N + r)
N = total number of objects (successes and failures) in the population
r = number of objects in the category of interest (successes) in the population
N − r = number of failures in the population
n = number of trials
x = number of successes in n trials
n − x = number of failures in n trials
Example 6.11 A Six Sigma Green Belt randomly selects two parts from a box containing 5 defective and 15 nondefective parts. He discards the box if one or both parts drawn are defective. What is the probability that he will:
1. Draw one defective part
2. Draw two defective parts
3. Reject the box
Solution:
1. In this problem we have N = 20, r = 5, n = 2, and x = 1. Thus, we have

P(X = 1) = C(5, 1) C(15, 1) / C(20, 2) = (5 × 15) / 190 = 75/190 = .3947

2. P(X = 2) = C(5, 2) C(15, 0) / C(20, 2) = (10 × 1) / 190 = .0526

3. The probability that he rejects the box is

P(X = 1) + P(X = 2) = .3947 + .0526 = .4473
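The hypergeometric arithmetic is a natural fit for math.comb; the sketch below is our own illustration reproducing Example 6.11:

```python
# Hypergeometric probabilities for Example 6.11, Equation (6.18):
# N = 20 parts, r = 5 defective, n = 2 drawn without replacement.
from math import comb

def hypergeom_pmf(x, N, r, n):
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

N, r, n = 20, 5, 2
p1 = hypergeom_pmf(1, N, r, n)   # one defective  -> ~.3947
p2 = hypergeom_pmf(2, N, r, n)   # two defectives -> ~.0526
print(p1, p2, p1 + p2)           # P(reject box)  -> ~.4474
```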
Example 6.12 A manufacturer ships parts in lots of 100 parts. The quality control department of the receiving company agrees to a sampling plan under which it selects a random sample of five parts without replacement. The lot will be accepted if the sample does not contain any defective part. What is the probability that a lot will be accepted if:
1. The lot contains 10 defective parts?
2. The lot contains four defective parts?
Solution:
1. In this problem, we have N = 100, r = 10, n = 5, and x = 0. Therefore, the probability that the lot is accepted, that is, that the sample does not contain any defective part, is

P(X = 0) = C(10, 0) C(90, 5) / C(100, 5)
         = (90 × 89 × 88 × 87 × 86) / (100 × 99 × 98 × 97 × 96)
         = .5837

2. In this case, we have N = 100, r = 4, n = 5, and x = 0. Therefore, the probability that the lot will be accepted is

P(X = 0) = C(4, 0) C(96, 5) / C(100, 5)
         = (96 × 95 × 94 × 93 × 92) / (100 × 99 × 98 × 97 × 96)
         = .8119


6.4.1 Mean and Standard Deviation of a Hypergeometric Distribution

The mean and standard deviation of a hypergeometric distribution are

Mean: μ = np    (6.19)
Standard deviation: σ = √( ((N − n)/(N − 1)) npq )    (6.20)

where
N = total number of objects in the population
r = total number of objects in the category of interest
n = the sample size
p = r/N
q = 1 − p = (N − r)/N

Example 6.13 A shipment of 250 computers contains eight computers with defective CPUs. A sample of size 20 is selected without replacement. Let X be a random variable that denotes the number of computers with defective CPUs in the sample. Find the mean and the standard deviation of the random variable X.
Solution: Clearly the shipment is a dichotomized population and the sampling is done without replacement. Thus, the probability distribution of the random variable X is hypergeometric. Here the category of interest is computers with defective CPUs. Using formulas (6.19) and (6.20), we have

μ = np = 20(8/250) = 0.64

σ = √( ((N − n)/(N − 1)) npq ) = √( ((250 − 20)/(250 − 1)) × 20 × (8/250) × (242/250) ) = .7565

6.5 The Poisson Distribution

The Poisson distribution is used whenever we are interested in finding the probability of rare events, for example, the number of occurrences of a particular event over a specified period of time, over a specified length (such as the length of an electric wire), over a specified area, or in a specified volume, when the probability of such an event happening is very small. For instance, we may be interested in finding the probability of a certain number of accidents occurring in a manufacturing plant over a specified period of time, the number of patients admitted to a hospital, the number of cars passing through a toll booth, the number of customers entering a bank, or the number of telephone calls received by a receptionist over a specified period of time. Similarly, we may be interested in finding the probability of an electric wire of a certain length having a particular kind of defect, the number of scratches over a specified area of a smooth surface, the number of holes in a roll of paper, or the number of radioactive particles in a specified volume of air. All these examples have one thing in common: the random variable X denoting the number of occurrences over a specified period of time, length, area, or volume must satisfy the conditions of a process called the Poisson process for us to be able to use the Poisson distribution.
Poisson Process

Let X(t) denote the number of times a particular event occurs randomly in a time period t. Then these events are said to form a Poisson process having rate λ, λ > 0 (so that for t = 1 the average value of X(t) is λ), if:
1. X(0) = 0.
2. The numbers of events that occur in any two nonoverlapping intervals are independent.
3. The average number of events occurring in any interval is proportional to the size of the interval and does not depend upon when the interval occurs.
4. The probability of precisely one occurrence in a very small interval (t, t + Δt) of time is equal to λ(Δt), and the probability of more than one occurrence in such a small interval is zero.
The number of events occurring over a certain length, over an area, or in a volume will form a Poisson process only if they possess all of the above characteristics.
Poisson Distribution

Definition 6.12 A random variable X that is equal to the number of events occurring according to a Poisson process is said to have a Poisson distribution if its probability function is given by

f(x) = P(X = x) = e^(−λ) λ^x / x!,  x = 0, 1, 2, ...    (6.21)

where λ > 0 (the Greek letter lambda) is the only parameter of the distribution and e ≈ 2.71828.
The binomial distribution, when p is very small and n is very large such that λ = np remains fixed as n → ∞, can be approximated by the Poisson distribution. As a rule of thumb, the approximation is good when λ = np < 10. We shall revisit this concept in Chapter 8. The Poisson distribution is also known as a distribution that deals with rare events, that is, events that occur with a very small probability.
It can easily be shown that Equation (6.21) satisfies both properties of a probability function, that is,

P(X = x) = f(x) ≥ 0

and

Σ_x P(X = x) = Σ_x f(x) = 1
Example 6.14 It is known from experience that 4% of the parts manufactured at a plant of a manufacturing company are defective. Use the Poisson approximation to the binomial distribution to find the probability that in a lot of 200 parts manufactured at that plant, seven parts will be defective.
Solution: Since n = 200 and p = .04, we have λ = np = 200(.04) = 8 (< 10), so the Poisson approximation should give a satisfactory result. From formula (6.21), we get

P(X = 7) = f(7) = e^(−8) (8)^7 / 7! = (.000335)(2097152) / 5040 = .1396

Example 6.15 The number of breakdowns of a machine is a random variable having the Poisson distribution with λ = 2.2 breakdowns per month. Find the probability that the machine will work during any given month with:
(a) No breakdown
(b) One breakdown
(c) Two breakdowns
(d) At least two breakdowns
Solution: In this example we are given the number of breakdowns of the machine per unit time, where the unit time is one month. The probabilities we want to find also concern the number of breakdowns in one month, so the parameter remains the same, that is, λ = 2.2. The probabilities in parts (a) through (d) can easily be found by using the probability function given in formula (6.21).

(a) P(X = 0) = e^(−2.2) (2.2)^0 / 0! = e^(−2.2) = 0.1108   (since 0! = 1)

(b) P(X = 1) = e^(−2.2) (2.2)^1 / 1! = (0.1108)(2.2) = 0.2438

(c) P(X = 2) = e^(−2.2) (2.2)^2 / 2! = 0.2681

(d) P(X ≥ 2) = 1 − P(X < 2)   (since Σ_x P(X = x) = 1)
             = 1 − P(X = 0) − P(X = 1) = 1 − 0.1108 − 0.2438 = 0.6454
Note that if in the above example we were interested in finding the probabilities of a certain number of breakdowns over a period of t months, the value of the parameter would change to λt.
Example 6.16 In Example 6.15, find the probabilities that the machine will work with:
(a) Four breakdowns in two months
(b) Five breakdowns in two and a half months
Solution:
(a) We are interested in finding the probability of breakdowns over an interval equal to two times the unit time. Thus, λt = 2(2.2) = 4.4, so that

P(X = 4) = e^(−4.4) (4.4)^4 / 4! = (.012277)(374.8096) / 24 = .1917

(b) In this part we want to find the probability of the number of breakdowns in 2.5 times the unit time; therefore, λt = (2.5)(2.2) = 5.5. Thus, the desired probability is

P(X = 5) = e^(−5.5) (5.5)^5 / 5! = (.00408677)(5032.84375) / 120 = .1714
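Equation (6.21) and the λt scaling are both one-liners in code; here is a minimal sketch of our own covering Examples 6.15 and 6.16:

```python
# Poisson probabilities, Equation (6.21); for a t-month interval the
# parameter becomes lambda * t (Examples 6.15 and 6.16).
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 2.2                                              # breakdowns/month
print(poisson_pmf(0, lam))                             # ~ 0.1108
print(1 - poisson_pmf(0, lam) - poisson_pmf(1, lam))   # P(X >= 2) ~ 0.6454
print(poisson_pmf(4, 2 * lam))                         # 4 in 2 months   ~ .1917
print(poisson_pmf(5, 2.5 * lam))                       # 5 in 2.5 months ~ .1714
```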


Table 6.5 Portion of Table II of the Appendix.

        λ
x       1.1     1.9     2.0
0       .333    .150    .135
1       .366    .284    .271
2       .201    .270    .271
3       .074    .171    .180
4       .020    .081    .090
5       .005    .031    .036
6       .001    .010    .012
7       .000    .003    .004
8       .000    .000    .001
9       .000    .000    .000

Mean and Standard Deviation of a Poisson Distribution

Mean: μ = E(X) = λ    (6.22)
Variance: σ² = V(X) = λ    (6.23)
Standard deviation: σ = √V(X) = √λ    (6.24)

Poisson Probability Tables

The tables of Poisson probabilities for various values of λ are given in Table II of the Appendix. We illustrate the use of these tables with the following example.

Example 6.17 The average number of accidents occurring in a manufacturing plant over a period of one year is equal to two. Find the probability that during any given year five accidents will occur.
Solution: To use the Poisson table, first find the values of x and λ for which the probability is being sought. Then the desired probability is the value at the intersection of the row corresponding to x and the column corresponding to λ. Thus, the desired probability P(X = 5) at λ = 2 is 0.036.

7
Continuous Random Variables and Their Probability Distributions

In Chapter 6 we studied discrete random variables and their probability distributions. In section 6.1 we saw that a discrete random variable can assume only a finite or countably infinite number of values. In real life, however, situations often arise in which a random variable can assume an infinite and uncountable number of values. For example, a quality engineer may want to determine how long the bulb of an overhead projector lasts in normal operation, or a production engineer may want to determine how long a worker will take to assemble a motor. In both these examples the random variable X is the time, which can assume any value in an interval, and an interval contains an uncountably infinite number of values. A random variable that can assume any value in one or more intervals is called a continuous random variable. In this chapter, we discuss some of the more commonly used probability distributions of continuous random variables.

7.1 Continuous Random Variables

In this section, we introduce the general concept of probability distributions of continuous random variables. Then, in the next several sections, we will discuss specific probability distributions.

Definition 7.1 A random variable is called continuous if it can assume any value over one or more intervals. Since by definition the number of values contained in any interval is infinite, the possible number of values that a continuous random variable can assume is also infinite and uncountable.

More examples of continuous random variables are: the time taken by a technician to complete a certain job, the length of a rod, and the diameter of a ball bearing.
In Chapter 6, we saw that one of the properties of a probability distribution is that the total probability is always equal to 1. This property also holds for probability distributions of continuous random variables. Recall also that continuous random variables assume an uncountably infinite number of values.

Combining the above property of probability distributions with this characteristic of continuous random variables, we see that we cannot assign a nonzero probability to a continuous random variable taking an individual value, for otherwise it would not be possible to keep the total probability equal to 1. Consequently, unlike discrete random variables, the probability that a continuous random variable assumes any individual value is always zero. In the case of continuous random variables, we are therefore interested in finding the probability of the random variable taking a value in an interval rather than taking an individual value. For example, in the problem of the time taken by a technician to finish a job, let a random variable X denote the time (in hours) taken by the technician to finish a given job. Then we would be interested in finding, for example, the probability P(3.0 ≤ X ≤ 3.5), that is, the probability that she takes between 3 and 3.5 hours to finish the job, rather than the probability of her taking exactly 3 hours, 10 minutes, 15 seconds to finish the job. The chance of the event associated with completing the job in exactly three hours, ten minutes, and fifteen seconds is very remote and, therefore, the probability of such an event is zero.
The probability function of a continuous random variable X, denoted by f(x), is usually known as the density function and is represented by a smooth curve. A typical density function curve of a continuous random variable is shown in Figure 7.1.
The density function of a continuous random variable satisfies the following properties:

(i) f(x) ≥ 0
(ii) ∫ f(x) dx = 1, where the integral is taken over the entire range of x    (7.1)

Note that the mathematical expression in property (ii) represents the total area enclosed by the probability density curve and the x-axis. All it says is that the total area, which represents the total probability, is equal to 1. For example, the probability that the random variable X falls in an interval (a, b) is

P(a ≤ X ≤ b) = the shaded area in Figure 7.1

Figure 7.1 An illustration of a density function of a continuous random variable X.

Note that if in Figure 7.1 we take a = b, then the shaded area will be 0. This implies that P(X = a) = P(X = b) = 0, which confirms the point we made earlier, that the probability of a continuous random variable taking any individual value is 0. This fact leads us to another important result: it does not matter whether the endpoints of the interval are included when calculating the probability. In other words, if X is a continuous random variable,

P(a ≤ X ≤ b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a < X < b)    (7.2)
The cumulative distribution function, denoted by F(x), is defined as

    F(x) = P(X ≤ x)    (7.3)

Graphically, Equation (7.3) may be represented as shown in Figure 7.2.

The distribution function F(x) of a continuous random variable X satisfies the following properties:

    (i) 0 ≤ F(x) ≤ 1    (7.4)

    (ii) If x1 < x2, then F(x1) ≤ F(x2)    (7.5)

    (iii) F(−∞) = 0, F(∞) = 1    (7.6)

So far, we have had a general discussion about the probability distributions of continuous random variables. In the remainder of the chapter we are going to discuss some special continuous probability distributions that we encounter frequently in applied statistics.

Figure 7.2 Graphical representation of F(x) = P(X ≤ x).

7.2 The Uniform Distribution


The uniform distribution, sometimes also known because of its shape as the rectangular distribution, is perhaps the simplest continuous probability distribution.

Definition 7.2 A random variable X is said to be uniformly distributed over an interval (a, b) if its probability density function is given by:

    f(x) = 1/(b − a)   for a ≤ x ≤ b
         = 0           otherwise    (7.7)

Note that the density function f(x) in Equation (7.7) is constant for all values of x in the interval (a, b). Figure 7.3 shows the graphical representation of a uniform distribution of the random variable X distributed over the interval (a, b), where a < b.

The probability that the random variable X takes values in an interval (x1, x2), where a ≤ x1 < x2 ≤ b, is the shaded area in Figure 7.4 and is equal to

    P(x1 ≤ X ≤ x2) = (x2 − x1)/(b − a)    (7.8)

Example 7.1 Let a random variable X be the time taken by a technician to complete a project. In this example, the time can be anywhere between two and six months, so the random variable X is uniformly distributed over the interval (2, 6). Find the following probabilities:

(a) P(3 ≤ X ≤ 5) (b) P(X ≤ 4) (c) P(X ≥ 5)

Solution:
(a) To find the probability P(3 ≤ X ≤ 5) we use the result given in Equation (7.8), where we have a = 2, b = 6, x1 = 3, and x2 = 5. Thus, we have

    P(3 ≤ X ≤ 5) = (5 − 3)/(6 − 2) = 2/4 = 0.5.

Figure 7.3 Uniform distribution over the interval (a, b).

Figure 7.4 Probability P(x1 ≤ X ≤ x2).

(b) In this part, we want to find the probability P(X ≤ 4). This is equivalent to finding the probability P(2 ≤ X ≤ 4), since the probability for any interval that falls below the point x = 2 is zero. Again, using the result in Equation (7.8), we have

    P(X ≤ 4) = P(2 ≤ X ≤ 4) = (4 − 2)/(6 − 2) = 2/4 = 0.5.

(c) By the same argument as in part (b), we have P(X ≥ 5) = P(5 ≤ X ≤ 6), and we have

    P(X ≥ 5) = P(5 ≤ X ≤ 6) = (6 − 5)/(6 − 2) = 1/4 = 0.25.

Example 7.2 Suppose a delay in starting production due to an unexpected mechanical failure is anywhere from 0 to 30 minutes. Find the following probabilities:

(a) Production will be delayed by less than 10 minutes.
(b) Production will be delayed by more than 20 minutes.
(c) Production will be delayed by 12 to 22 minutes.

Solution: Let X be a random variable denoting the time by which production is delayed. From the given information we can see that the random variable X is uniformly distributed over the interval (0, 30). Using this information, the desired probabilities are found as follows:

(a) P(X ≤ 10) = P(0 ≤ X ≤ 10) = (10 − 0)/(30 − 0) = 10/30 = 1/3

(b) P(X ≥ 20) = P(20 ≤ X ≤ 30) = (30 − 20)/(30 − 0) = 10/30 = 1/3

(c) P(12 ≤ X ≤ 22) = (22 − 12)/(30 − 0) = 10/30 = 1/3


Note that in each case, the probability turned out to be the same (i.e., 1/3).
This shows that, in a uniform distribution, the probability depends upon the
length of the interval and not on the location of the interval. In each case the
length of the interval was equal to 10.
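The interval-length property above is easy to confirm numerically. The following short sketch is an illustrative addition (it assumes Python with scipy available and is not part of the original text); it reproduces the three probabilities of Example 7.2 using scipy.stats.uniform, which parameterizes the distribution by loc = a and scale = b − a.

    from scipy.stats import uniform

    # Delay is uniform on (0, 30): loc = a = 0, scale = b - a = 30
    delay = uniform(loc=0, scale=30)

    # Each probability is the difference of the CDF at the interval endpoints.
    p_a = delay.cdf(10) - delay.cdf(0)    # P(X <= 10)
    p_b = delay.cdf(30) - delay.cdf(20)   # P(X >= 20)
    p_c = delay.cdf(22) - delay.cdf(12)   # P(12 <= X <= 22)

    print(p_a, p_b, p_c)   # each prints 0.3333..., i.e., 1/3

All three intervals have length 10, so all three probabilities agree, exactly as the argument above predicts.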
7.2.1 Mean and Standard Deviation of the Uniform Distribution

Let X be a random variable distributed uniformly over an interval (a, b). The mean μ and the standard deviation σ of the random variable X are given by

    μ = (a + b)/2    (7.9)

    σ = (b − a)/√12    (7.10)

The distribution function F(x) of a random variable X distributed uniformly over an interval (a, b) is defined as

    F(x) = P(X ≤ x) = P(a ≤ X ≤ x) = (x − a)/(b − a)    (7.11)

Example 7.3 Let a random variable X denote the coffee break (in minutes) that a technician takes every morning. Let the random variable X be uniformly distributed over the interval (0, 16). Find the mean μ and the standard deviation σ of the distribution.

Solution: Using Equations (7.9) and (7.10), we get

    μ = (a + b)/2 = (0 + 16)/2 = 8

    σ = (b − a)/√12 = (16 − 0)/√12 = 4.619.

Example 7.4 In Example 7.3, find the following values of the distribution function of the random variable X:

(a) F(3) (b) F(5) (c) F(12)

Solution: Using the result of Equation (7.11), we get

(a) F(3) = (x − a)/(b − a) = (3 − 0)/(16 − 0) = 3/16

(b) F(5) = (x − a)/(b − a) = (5 − 0)/(16 − 0) = 5/16

(c) F(12) = (x − a)/(b − a) = (12 − 0)/(16 − 0) = 12/16 = 3/4
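As a quick numerical check of Equations (7.9) through (7.11), the following sketch (an illustrative addition assuming Python, not part of the original text) recomputes the results of Examples 7.3 and 7.4 directly from the formulas:

    import math

    a, b = 0, 16

    mu = (a + b) / 2                  # Equation (7.9): mean = 8
    sigma = (b - a) / math.sqrt(12)   # Equation (7.10): std dev = 4.619

    def F(x):
        # Uniform CDF, Equation (7.11)
        return (x - a) / (b - a)

    print(mu, round(sigma, 3))        # 8.0 4.619
    print(F(3), F(5), F(12))          # 0.1875 0.3125 0.75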


7.3 The Normal Distribution


The normal distribution forms the basis of modern statistical theory. The
normal distribution is the most widely used probability distribution in
applied statistics. In fact, it is very hard to find many situations in real life
where the normal distribution is not used in one way or the other. For
example, the tensile strength of paper, survival time of a part, time taken
by a programmer to neutralize a new computer virus, volume of a chemical compound in a 1-pound container, compressive strength of a concrete
block, length of rods, time taken by a commuter to travel from home to
work, heights and weights of people; all can be modeled by a normal distribution. Another striking application, which we shall discuss in the next chapter, is the central limit theorem, which is based on the concept that the mean X̄ of a random sample is approximately normally distributed. In applied statistics the normal distribution and the central limit theorem are of paramount importance. Some of these applications we shall see in subsequent chapters. Now we will discuss the normal probability distribution.
Definition 7.3 A random variable X is said to have a normal probability distribution if the density function of X is given by

    f(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)),  −∞ < x < ∞    (7.12)

where −∞ < μ < ∞ and σ > 0 are the two parameters of the distribution, π ≈ 3.1416 and e ≈ 2.71828. Also, note that μ and σ are the mean and standard deviation of the distribution. A random variable X having a normal distribution with mean μ and a standard deviation σ is usually written as X ~ N(μ, σ).
Some of the characteristics of the normal density function are the following:

1. The normal density function curve is bell shaped and completely symmetric about its mean μ. For this reason the normal distribution is also known as a bell-shaped distribution.

2. The specific shape of the curve, whether it is more or less tall, is determined by its standard deviation σ.

3. The tails of the density function curve extend from −∞ to ∞.

4. The total area under the curve is 1.0. However, 99.74% of the area falls within three standard deviations of the mean μ.

5. The area under the normal curve to the right of μ is 0.5 and to the left of μ is also 0.5.

Figure 7.5 shows the normal density function curve of a random variable X with mean μ and standard deviation σ.

Figure 7.5 The normal density function curve with mean μ and standard deviation σ.

Figure 7.6 Curves representing the normal density function with different means (μ = 1, 3, 5, 7), but with the same standard deviation.

Since 99.74% of the probability of a normal random variable with mean μ and standard deviation σ falls between μ − 3σ and μ + 3σ, the distance 6σ between μ − 3σ and μ + 3σ is usually considered the range of the normal distribution. Figures 7.6 and 7.7 show that as the mean μ and the standard deviation σ change, the location and the shape of the normal curve change.

Figure 7.7 Curves representing the normal density function with different standard deviations (σ = 1, 2, 3), but with the same mean.

From Figure 7.7 we can observe an important phenomenon of the normal distribution: as the standard deviation σ becomes smaller and smaller, the probability is concentrated more and more around the mean μ. We will see later that this property of the normal distribution is very useful in making inferences about populations.

Using integral calculus one can find the probability of the normal random variable X falling in any interval. Since the use of integral calculus is beyond the scope of this book, in order to find such probabilities we need to introduce a new random variable, called the standardized random variable. A standard normal random variable, denoted by Z, is defined as follows:

    Z = (X − μ)/σ    (7.13)

The new random variable Z is also distributed normally, but with mean 0 and standard deviation 1. The distribution of the random variable Z is generally known as the standard normal distribution.

Figure 7.8 The standard normal density function curve.
Definition 7.4 The normal distribution with mean 0 and standard
deviation 1 is known as the standard normal distribution and is usually written as N(0,1).
The values of the standard normal random variable Z, denoted by the
lower case letter z, are called the z-scores. For example, in Figure 7.8 the
points marked on the x-axis are the z-scores. The probability of the random
variable Z falling in an interval (a, b) is shown by the shaded area under the
standard normal curve in Figure 7.9. This probability is determined by using
a standard normal distribution table (see Table III of the appendix).
7.3.1 Standard Normal Distribution Table

The standard normal distribution table, Table III of the appendix, lists the probabilities of the random variable Z for its values between z = 0.00 and z = 3.09. A small portion of this table is reproduced below in Table 7.1. The entries in the body of the table are the probabilities P(0 ≤ Z ≤ z), where z is some point in the interval (0, 3.09). These probabilities are also shown by the shaded area under the normal curve given at the top of Table III of the appendix.

Figure 7.9 Probability P(a ≤ Z ≤ b) under the standard normal curve.

Table 7.1 A portion of standard normal distribution Table III of the appendix.

    z      .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
    0.0   .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
    0.1   .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
    1.0   .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
    1.1   .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
    1.9   .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
    2.0   .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817

To read this table we mark the row corresponding to the value of z to one decimal point and the column corresponding to the second decimal point. Then the entry at the intersection of that row and column is the probability P(0 ≤ Z ≤ z). For example, the probability P(0 ≤ Z ≤ 2.09) is found by marking the row corresponding to z = 2.0 and the column corresponding to z = .09 (note that z = 2.09 = 2.0 + .09) and locating the entry at the intersection of the marked row and column, which in this case is equal to .4817.

The probabilities for negative values of z are found, by the symmetric property of the normal distribution, from the probabilities of the corresponding positive values of z. For example, P(−1.54 ≤ Z ≤ −0.50) = P(0.50 ≤ Z ≤ 1.54).
Example 7.5 Use the standard normal distribution table, Table III of the appendix, to find the following probabilities:

(a) P(1.0 ≤ Z ≤ 2.0) (b) P(−1.50 ≤ Z ≤ 0) (c) P(−2.2 ≤ Z ≤ −1.0)

Solution:
(a) From Figure 7.10 it is clear that

    P(1.0 ≤ Z ≤ 2.0) = P(0 ≤ Z ≤ 2.0) − P(0 ≤ Z ≤ 1.0)
                     = .4772 − .3413 = 0.1359

Figure 7.10 Shaded area equal to P(1 ≤ Z ≤ 2).

Figure 7.11 Two shaded areas showing P(−1.50 ≤ Z ≤ 0) = P(0 ≤ Z ≤ 1.50).

Figure 7.12 Two shaded areas showing P(−2.2 ≤ Z ≤ −1.0) = P(1.0 ≤ Z ≤ 2.2).

(b) Since the normal distribution is symmetric about the mean, which in this case is 0, the probability of Z falling between −1.5 and 0 is the same as the probability of Z falling between 0 and 1.5. Figure 7.11 also supports this assertion. Using Table III, we have the following:

    P(−1.50 ≤ Z ≤ 0) = P(0 ≤ Z ≤ 1.50) = 0.4332

(c) By using the same argument as in part (b) and using Table III of the appendix (also, see Figure 7.12), we get

    P(−2.2 ≤ Z ≤ −1.0) = P(1.0 ≤ Z ≤ 2.2)
                       = P(0 ≤ Z ≤ 2.2) − P(0 ≤ Z ≤ 1.0)
                       = .4861 − .3413 = 0.1448
Example 7.6 Use Table III of the appendix on page 317 to determine the following probabilities:

(a) P(−1.50 ≤ Z ≤ 0.80) (b) P(Z ≤ 0.70) (c) P(Z ≥ −1.0)

Solution:
(a) Since the standard normal distribution table (Table III of the appendix) gives the probabilities of z-values starting from zero to positive z-values, we have to break the interval −1.5 to 0.8 into two parts, that is, −1.5 to 0 plus 0 to 0.8 (see Figure 7.13), so that we get

    P(−1.50 ≤ Z ≤ 0.80) = P(−1.50 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80)

Figure 7.13 Showing P(−1.50 ≤ Z ≤ 0.80) = P(−1.50 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80).

Figure 7.14 Shaded area showing P(Z ≤ 0.70).

Figure 7.15 Shaded area showing P(Z ≥ −1.0).

Thus, we have

    P(−1.50 ≤ Z ≤ 0.80) = P(−1.50 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80)
                        = P(0 ≤ Z ≤ 1.50) + P(0 ≤ Z ≤ 0.80)
                        = .4332 + .2881 = 0.7213

(b) The probability P(Z ≤ 0.70) is shown by the shaded area in Figure 7.14. This area is equal to the sum of the area to the left of z = 0 and the area between z = 0 and z = 0.7, which implies that:

    P(Z ≤ 0.70) = P(Z ≤ 0) + P(0 ≤ Z ≤ 0.7) = 0.5 + .2580 = 0.7580

(c) By using the same argument as in part (b) and Figure 7.15, we get

    P(Z ≥ −1.0) = P(−1.0 ≤ Z ≤ 0) + P(Z ≥ 0)
                = P(0 ≤ Z ≤ 1.0) + P(Z ≥ 0)
                = .3413 + .5 = 0.8413
Example 7.7 Use Table III of the appendix to find the following probabilities:

(a) P(Z ≥ 2.15) (b) P(Z ≤ −2.15)

Solution:
(a) The desired probability P(Z ≥ 2.15) is equal to the shaded area under the normal curve to the right of z = 2.15, shown in Figure 7.16. This area is equal to the area to the right of z = 0 minus the area between z = 0 and z = 2.15, and the area to the right of z = 0 is 0.5. Thus, we have

    P(Z ≥ 2.15) = 0.5 − P(0 ≤ Z ≤ 2.15) = 0.5 − 0.4842 = 0.0158

Figure 7.16 Shaded area showing P(Z ≥ 2.15).

Figure 7.17 Shaded area showing P(Z ≤ −2.15).

(b) Using the symmetric property of the normal distribution (see Figure 7.17) and part (a), we have

    P(Z ≤ −2.15) = P(Z ≥ 2.15) = 0.0158
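In practice these table lookups are usually replaced by a library call. The sketch below is an illustrative supplement (assuming Python with scipy, not part of the original text); it recomputes the probabilities of Examples 7.5 through 7.7 with scipy.stats.norm, whose cdf gives P(Z ≤ z), so the tabulated quantity P(0 ≤ Z ≤ z) equals cdf(z) − 0.5.

    from scipy.stats import norm

    Z = norm(0, 1)  # the standard normal distribution N(0,1)

    print(Z.cdf(2.0) - Z.cdf(1.0))    # P(1.0 <= Z <= 2.0)    ~ 0.1359
    print(Z.cdf(0.0) - Z.cdf(-1.5))   # P(-1.50 <= Z <= 0)    ~ 0.4332
    print(Z.cdf(0.8) - Z.cdf(-1.5))   # P(-1.50 <= Z <= 0.80) ~ 0.7213
    print(1 - Z.cdf(2.15))            # P(Z >= 2.15)          ~ 0.0158
    print(Z.cdf(-2.15))               # P(Z <= -2.15)         ~ 0.0158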
So far in this section, we have considered the problem of finding probabilities of the standard normal variable Z, that is, a normal random variable with mean μ = 0 and standard deviation σ = 1. Now we consider problems where μ ≠ 0 and σ ≠ 1.
Example 7.8 Let X be a random variable distributed normally with μ = 6 and σ = 4. Determine the following probabilities:

(a) P(8.0 ≤ X ≤ 14.0) (b) P(2.0 ≤ X ≤ 10.0) (c) P(0 ≤ X ≤ 4.0)

Solution:
(a) In order to find the probability P(8.0 ≤ X ≤ 14.0), we first need to transform the random variable X into the standard normal variable Z, which is done by subtracting the mean μ throughout the inequality and dividing by the standard deviation σ. Thus, as shown in Figures 7.18 and 7.19, we get

    P(8.0 ≤ X ≤ 14.0) = P((8 − 6)/4 ≤ (X − 6)/4 ≤ (14 − 6)/4)
                      = P(0.5 ≤ Z ≤ 2.0)
                      = P(0 ≤ Z ≤ 2.0) − P(0 ≤ Z ≤ 0.50)
                      = 0.4772 − 0.1915 = 0.2857.

Figure 7.18 Converting normal N(6,4) to standard normal N(0,1).

Figure 7.19 Shaded area showing P(0.5 ≤ Z ≤ 2.0).

Figure 7.20 Shaded area showing P(−1.0 ≤ Z ≤ 1.0).

(b) Proceeding in the same manner as in part (a) and using Figure 7.20, we have

    P(2.0 ≤ X ≤ 10.0) = P((2 − 6)/4 ≤ (X − 6)/4 ≤ (10 − 6)/4)
                      = P(−1.0 ≤ Z ≤ 1.0)
                      = P(−1.0 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 1.0)
                      = 2P(0 ≤ Z ≤ 1.0) = 2(0.3413) = 0.6826.

(c) Again, transforming X into Z and using Figure 7.21, we get

    P(0 ≤ X ≤ 4.0) = P((0 − 6)/4 ≤ (X − 6)/4 ≤ (4 − 6)/4)
                   = P(−1.50 ≤ Z ≤ −0.50) = P(0.5 ≤ Z ≤ 1.50)
                   = P(0 ≤ Z ≤ 1.50) − P(0 ≤ Z ≤ 0.50)
                   = 0.4332 − 0.1915 = 0.2417
Example 7.9 Suppose a quality characteristic of a product is normally distributed with mean μ = 18 and standard deviation σ = 1.5. The specification limits furnished by the customer are (15, 21). Determine what percentage of the product meets the specifications set by the customer.

Figure 7.21 Shaded area showing P(−1.50 ≤ Z ≤ −0.50).

Solution: Let the random variable X denote the quality characteristic of interest. Then X is normally distributed with mean μ = 18 and standard deviation σ = 1.5. We are interested in finding the percentage of product with the characteristic of interest within the limits (15, 21), which is given by

    100P(15 ≤ X ≤ 21) = 100P((15 − 18)/1.5 ≤ (X − 18)/1.5 ≤ (21 − 18)/1.5)
                      = 100P(−2.0 ≤ Z ≤ 2.0)
                      = 100[P(−2.0 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 2.0)]
                      = 100 × 2P(0 ≤ Z ≤ 2.0)
                      = 100 × 2(.4772) = 95.44%.

In this case, the percentage of product that will meet the specifications set by the customer is 95.44%.
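Example 7.9 is the prototypical specification-limit calculation in Six Sigma work, so a coded version is worth seeing. The sketch below (an illustrative addition assuming Python with scipy, not part of the original text) standardizes the limits by hand and also shows the equivalent direct call using loc and scale arguments:

    from scipy.stats import norm

    mu, sigma = 18.0, 1.5
    LSL, USL = 15.0, 21.0        # customer specification limits

    # Method 1: standardize, then use the N(0,1) CDF
    z_low = (LSL - mu) / sigma   # -2.0
    z_high = (USL - mu) / sigma  #  2.0
    pct = 100 * (norm.cdf(z_high) - norm.cdf(z_low))

    # Method 2: let the library handle the location and scale
    pct2 = 100 * (norm.cdf(USL, loc=mu, scale=sigma)
                  - norm.cdf(LSL, loc=mu, scale=sigma))

    # Both print 95.45; the text's 95.44% reflects the four-decimal
    # rounding of the table entry .4772.
    print(round(pct, 2), round(pct2, 2))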

7.4 The Exponential Distribution


In Chapter 6, we studied the Poisson probability distribution, which describes
the phenomenon of random events that occur in a Poisson process. The events
in the Poisson process occur randomly. This means, for example, the time
between the occurrences of any two consecutive events is a random variable
T (say). Then, the random variable T is distributed as exponential. As another
example, the distance between two defects in a telephone or an electric wire
is distributed as exponential. The exponential distribution has a wide range of
applications in any process that does not take the aging/anti-aging factor into
account. For example, if a machine is always as good as new, we use the exponential distribution to study its reliability.
Definition 7.5 A random variable X is said to be distributed as exponential if its probability density function is defined as

    f(x) = λe^(−λx)   for x ≥ 0
         = 0          otherwise    (7.14)

where λ > 0 is the only parameter of this distribution and e ≈ 2.71828. It is important to note that λ is the number of events occurring per unit of time, per unit length, per unit area, or per unit volume in a Poisson process.

Figure 7.22 Graphs of the exponential density function for λ = 0.1, 0.5, 1.0, and 2.0.

The shape of the density function of an exponential distribution changes as the value of λ changes. Figure 7.22 shows the density function of the exponential distribution for some selected values of λ.
7.4.1 Mean and Standard Deviation of an Exponential Distribution

    μ = E(X) = 1/λ  and  σ = √V(X) = 1/λ    (7.15)

where V(X) = 1/λ² is the variance of the exponential distribution.


7.4.2 Distribution Function F(x) of the Exponential Distribution

    F(x) = P(X ≤ x) = 1 − e^(−λx)    (7.16)

From Equation (7.16), it follows that

    P(X > x) = 1 − P(X ≤ x) = 1 − F(x) = 1 − (1 − e^(−λx)) = e^(−λx)    (7.17)

Equation (7.17) leads us to an important property known as the memoryless property of the exponential distribution. As an illustration of this property, we consider the following example.
Example 7.10 Let the breakdowns of a machine follow the Poisson process
so that the random variable X denoting the number of breakdowns per unit
of time is distributed as Poisson distribution. Then, the time between any two
consecutive failures is also a random variable (say) T, which is distributed

as exponential with parameter λ. Assuming λ = 0.1, determine the following probabilities:
(a) P(T > t), that is, the probability that the machine will function for at least time t before it breaks down again.

(b) P(T > t + t1 | T > t1); that is, it is known that the machine has already operated for time t1 after a breakdown. Find the probability that it will function for at least another time t from time t1. In other words, find the probability that the machine will function for a total time of at least t + t1 before its next breakdown, given that it has already worked for time t1 since the previous breakdown.
Solution:
(a) Given that λ = 0.1, we want to find the probability P(T > t). From Equation (7.17), this probability is given by

    P(T > t) = e^(−λt) = e^(−(0.1)t) = e^(−t/10)

(b) Here we are interested in finding the conditional probability P(T > t + t1 | T > t1). Using the definition of conditional probability, we have

    P(T > t + t1 | T > t1) = P(T > t + t1, T > t1) / P(T > t1)

where P(T > t + t1, T > t1) means the probability that T > t + t1 and T > t1. When T > t + t1, T is automatically greater than t1. Thus, we have

    P(T > t + t1, T > t1) = P(T > t + t1)

so that

    P(T > t + t1 | T > t1) = P(T > t + t1) / P(T > t1)

Now using Equation (7.17), we get

    P(T > t + t1 | T > t1) = e^(−λ(t + t1)) / e^(−λt1) = e^(−λt) = e^(−(0.1)t) = e^(−t/10), since λ = 0.1

Therefore, the probability P(T > t + t1 | T > t1) is the same as the probability P(T > t). This means that under the exponential model, the probability P(T > t) remains the same no matter from what point we measure the time t. In other words, the model does not remember when the machine had its last breakdown. For this reason, the exponential distribution is said to have the memoryless property.
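The memoryless property can be verified numerically as well. The following sketch is an illustrative addition (assuming Python; the particular values of t and t1 are arbitrary choices, not from the original text) comparing P(T > t + t1 | T > t1) with P(T > t) for the λ = 0.1 model of Example 7.10:

    import math

    lam = 0.1   # breakdown rate per unit time

    def survival(t):
        # P(T > t) for an exponential distribution, Equation (7.17)
        return math.exp(-lam * t)

    t, t1 = 7.0, 12.0
    conditional = survival(t + t1) / survival(t1)   # P(T > t + t1 | T > t1)

    print(conditional)    # 0.4966...
    print(survival(t))    # 0.4966... -- identical: the history t1 drops out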
From the above discussion we can see that it does not matter when we start observing the system, since the exponential model does not take an aging factor into account. That is, whether the machine is brand new or 20 years old, we get the same result as long as we model the system using the same Poisson process (or exponential model). In practice, however, this assumption is not always valid. For example, if we are investigating how tollbooths function during rush hours and nonrush hours and we model both with the same Poisson process, then the results may not be very valid. When there is a very clear distinction between two scenarios, it makes more sense to model them by two different processes.

7.5 The Weibull Distribution


The Weibull distribution, named after its inventor, is used extensively in quality and reliability engineering. In particular, it is widely used in reliability problems. Unlike the exponential distribution, the Weibull distribution does take into account an aging/anti-aging factor. In fact, the exponential distribution is a special case of the Weibull distribution. This point is better explained with a function called the failure rate function or hazard rate function.

Definition 7.6 A hazard rate function, denoted by h(t), is the conditional probability that a system will fail instantaneously some time after time t, given that it has survived up to time t.

In a Weibull model, the hazard rate function can be increasing, decreasing, or constant. When the hazard rate is constant, the Weibull model reduces to an exponential model, and when it is increasing/decreasing, the aging/anti-aging factor is being taken into account. For example, the hazard rate function for a system that is getting old and is not well maintained may increase; it may decrease after the system is reconditioned; and it may remain constant if the system is updated/maintained frequently. In human populations, the hazard rate function is defined as the proportion of individuals alive at the age of t years who will die in the age interval (t, t + t1). The failure rate curve for humans is almost bathtub shaped: first it is decreasing, then almost flat, and finally increasing. Figure 7.23 shows three kinds of hazard rate functions, h1(t), h2(t), and h3(t), that are respectively increasing, decreasing, and constant.

Figure 7.23 Curves of three different hazard rate functions.

Definition 7.7 A random variable T is said to be distributed as Weibull if its probability density function is given by

    f(t) = λβ(λt)^(β−1) e^(−(λt)^β)   for t ≥ 0
         = 0                          otherwise    (7.18)

where λ > 0 and β > 0 are the two parameters of the distribution. The Weibull model of the hazard rate function with these parameters is

    h(t) = βλ^β t^(β−1)    (7.19)

Note that if β = 1, then h(t) is a constant and Equation (7.18) reduces to Equation (7.14). That is, the Weibull density function reduces to the exponential density function with parameter λ. When β > 1, the hazard function h(t) increases, and when β < 1, h(t) decreases. Figure 7.24 presents graphs of the hazard function for λ = 1 and β = 0.5, 1, 2. Note that at β = 0.5, 1, 2 the hazard function is respectively decreasing, constant, and increasing. For these values of λ and β, the graphs of the Weibull density function are shown in Figure 7.25.
7.5.1 Mean and Variance of the Weibull Distribution

    μ = (1/λ) Γ(1 + 1/β)    (7.20)

    σ² = (1/λ²) [Γ(1 + 2/β) − Γ²(1 + 1/β)]    (7.21)

where Γ(·) is the gamma function. In particular, when the argument is a natural number n, Γ(n) = (n − 1)!.
7.5.2 Distribution Function F(t) of the Weibull Distribution

    F(t) = P(T ≤ t) = 1 − e^(−(λt)^β)    (7.22)

Figure 7.24 Hazard function h(t) with λ = 1; β = 0.5, 1, 2.

Figure 7.25 Weibull density function with (a) λ = 1, β = 0.5; (b) λ = 1, β = 1; (c) λ = 1, β = 2.

From Equation (7.22), it follows that

    P(T > t) = 1 − F(t) = e^(−(λt)^β)    (7.23)

Example 7.11 From data on a system, the parameters of a Weibull distribution are estimated to be λ = 0.00025 and β = 0.5, where t is measured in hours. Determine:

(a) The mean time before the system breaks down
(b) The probability P(T > 5,000)
(c) The probability P(T > 10,000)
(d) The probability P(T ≤ 10,000)

Solution:
(a) Using the expression in Equation (7.20) for the mean, we have

    μ = (1/λ) Γ(1 + 1/β) = (1/0.00025) Γ(1 + 1/0.5)
      = (4,000) Γ(3) = (4,000)(2!) = 8,000 hours

(b) Using the expression in Equation (7.23), we have

    P(T > 5,000) = e^(−((0.00025)(5,000))^0.5) = e^(−(1.25)^0.5) = e^(−1.118) = 0.3269

(c) Again, using the expression in Equation (7.23), we have

    P(T > 10,000) = e^(−((0.00025)(10,000))^0.5) = e^(−1.5811) = 0.2057

(d) The probability P(T ≤ 10,000) can be found using the result in part (c), that is,

    P(T ≤ 10,000) = 1 − P(T > 10,000) = 1 − 0.2057 = 0.7943
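The calculations of Example 7.11 reduce to a few lines of code. The sketch below is an illustrative addition (assuming Python, not part of the original text) that uses the parameterization of Definition 7.7 and math.gamma for Equation (7.20):

    import math

    lam, beta = 0.00025, 0.5   # Weibull parameters from Example 7.11 (t in hours)

    mean = (1 / lam) * math.gamma(1 + 1 / beta)   # Equation (7.20)

    def survival(t):
        # P(T > t) = exp(-(lam*t)**beta), Equation (7.23)
        return math.exp(-((lam * t) ** beta))

    print(mean)                   # 8000.0 hours
    print(survival(5_000))        # ~0.3269
    print(survival(10_000))       # ~0.2057
    print(1 - survival(10_000))   # ~0.7943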

8
Sampling Distributions

In Chapters 6 and 7 we discussed distributions of data as they apply to discrete and continuous random variables. Now we will turn our attention to sampling distributions, or the probability distributions of functions of observations from a random sample. The sampling distributions we are going to discuss below are frequently encountered in applied statistics.
One of the key functions of statistics is to draw conclusions about the
population based upon information contained in a random sample. To use
such sample information for drawing conclusions about a population, it is
necessary that we understand the relationship between numerical measures
of sample data (statistics) and numerical measures of population data
(parameters). The purpose of this and the next chapter is to establish and discuss the importance of these relationships between statistics and parameters.
As an illustration, consider that we have a population with an unknown mean μ and we would like to gain some knowledge about that population. It seems quite plausible that we could take a random sample of n observations (X1, X2, ..., Xn) from this population and use the sample mean as a substitute (called an estimator) of the population mean μ, based on the following:

    X̄ = (1/n) Σ Xi, the sum running over i = 1, ..., n    (8.1)

Once we decide to use the sample mean as an estimator of the population mean, we have an immediate question about how well any particular estimate actually describes the true value of μ. Since the value of X̄ changes from sample to sample, how well the estimate describes the true value of μ depends upon the behavior of the sample mean X̄, which is a random variable known as a statistic. To see how this random variable X̄ behaves, we need to study its probability distribution. The probability distribution of X̄ is known as the sampling distribution of the sample mean. Generally speaking, a sampling distribution is the probability distribution of a statistic. For example, the probability distributions of a sample median, sample proportion, and sample variance are called the sampling distributions of the sample median, sample proportion, and sample variance, respectively.


8.1 Sampling Distribution of Sample Mean


We start this section with an example that will illustrate the concept of a
sampling distribution of sample mean.
Example 8.1 Consider an experiment of rolling a fair die and observing
the number that appears on the uppermost face. The population associated
with this experiment and its probability distribution is shown in Table 8.1.
Take a sample of size 2 from this population and develop the concept of a
sampling distribution of sample mean.
Solution: The population mean and the population variance respectively are as follows:

    μ = (1/N) ΣXi = (1/6)(1 + 2 + 3 + 4 + 5 + 6) = 3.5

    σ² = (1/N) ΣXi² − μ² = (1/6)(1 + 4 + 9 + 16 + 25 + 36) − (3.5)²
       = (1/6)(91) − (3.5)² ≈ 2.917

    σ = √2.917 = 1.708
Now suppose instead of considering the whole population we want to consider a random sample of size 2 to draw conclusions about the entire population. To achieve this goal we must determine all possible samples of size 2. In this case, there are 15 possible samples when sampling is done without replacement (under this sampling scheme any population element can appear in a given sample only once). We list these samples with their respective means in Table 8.2.

Note that some of the samples have the same mean. The sample we draw will be selected randomly from the population, which means that each one of the 15 samples has the same chance (1/15) of being drawn. Our chances, or probability, of getting different sample means will vary depending upon the actual drawing of the sample. For example, the probability of getting a sample mean of 2.5 is 2/15, whereas the probability of getting a sample mean of either 1.5 or 2.0 is only 1/15 each. We list the different sample means with their respective probabilities in Table 8.3.
Table 8.1 Population with its distribution for the experiment of rolling a fair die.

    x       1     2     3     4     5     6
    p(x)    1/6   1/6   1/6   1/6   1/6   1/6

Table 8.2 All possible samples of size 2 with their respective means.

    Sample number    Sample    Sample mean
     1               1, 2      1.5
     2               1, 3      2.0
     3               1, 4      2.5
     4               1, 5      3.0
     5               1, 6      3.5
     6               2, 3      2.5
     7               2, 4      3.0
     8               2, 5      3.5
     9               2, 6      4.0
    10               3, 4      3.5
    11               3, 5      4.0
    12               3, 6      4.5
    13               4, 5      4.5
    14               4, 6      5.0
    15               5, 6      5.5

Table 8.3 Different sample means with their respective probabilities.

    x̄       1.5    2.0    2.5    3.0    3.5    4.0    4.5    5.0    5.5
    p(x̄)    1/15   1/15   2/15   2/15   3/15   2/15   2/15   1/15   1/15

Table 8.3 gives the sampling distribution of X̄. If we select a random sample of size 2 from the population given in Table 8.1, we may draw any of the 15 possible samples. The sample mean X̄ can then assume any of the nine values listed in Table 8.3 with their corresponding probabilities.


We now find the mean and variance of the sampling distribution of X̄ given in Table 8.3, as follows:

    μ_x̄ = E(X̄) = Σ x̄ p(x̄)    (8.2)
        = 1.5(1/15) + 2.0(1/15) + 2.5(2/15) + 3.0(2/15) + 3.5(3/15)
          + 4.0(2/15) + 4.5(2/15) + 5.0(1/15) + 5.5(1/15)
        = (1/15)(1.5 + 2.0 + 5.0 + 6.0 + 10.5 + 8 + 9 + 5 + 5.5) = 52.5/15 = 3.5

    σ²_x̄ = Σ (x̄ − μ_x̄)² p(x̄)    (8.3)
         = Σ x̄² p(x̄) − (μ_x̄)²    (8.4)
         = (1.5)²(1/15) + (2.0)²(1/15) + (2.5)²(2/15) + (3.0)²(2/15)
           + (3.5)²(3/15) + (4.0)²(2/15) + (4.5)²(2/15) + (5.0)²(1/15)
           + (5.5)²(1/15) − (3.5)²
         = (1/15)(2.25 + 4 + 12.5 + 18 + 36.75 + 32 + 40.5 + 25 + 30.25) − 12.25
         = 1.167
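The enumeration behind Tables 8.2 and 8.3 lends itself to a brute-force check. The following sketch is an illustrative addition (assuming Python, not part of the original text); it lists all 15 without-replacement samples of size 2, tabulates the sampling distribution of X̄, and confirms μ_x̄ = 3.5 and σ²_x̄ ≈ 1.167:

    from itertools import combinations
    from collections import Counter
    from fractions import Fraction

    population = [1, 2, 3, 4, 5, 6]

    # All 15 samples of size 2, drawn without replacement (Table 8.2)
    means = [Fraction(a + b, 2) for a, b in combinations(population, 2)]

    # Sampling distribution of the sample mean (Table 8.3)
    dist = {m: Fraction(c, len(means)) for m, c in Counter(means).items()}

    mu = sum(m * p for m, p in dist.items())               # Equation (8.2)
    var = sum(m**2 * p for m, p in dist.items()) - mu**2   # Equation (8.4)

    print(mu, float(var))   # 7/2 (= 3.5) and 1.1667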


From this example we can see that the mean of the sample mean is equal to the population mean. Note that this is not a coincidence; it is true in general. That is,

    μ_x̄ = E(X̄) = μ    (8.5)

The variance of X̄, however, is not equal to the variance of the population. The relationship between the two variances (i.e., the variance of the population and the variance of X̄) depends upon the size of the population and whether the sample has been taken with replacement or without replacement. If the population is finite, say of size N, and the sample of size n is taken without replacement, then we have the following formula:

    σ²_x̄ = ((N − n)/(N − 1)) (σ²/n)    (8.6)

where the fraction (N − n)/(N − 1) is called the finite population correction factor. The standard deviation of X̄, usually known as the standard error of the sample mean, is found by taking the square root of the variance in Equation (8.6). If the population is infinite, or if the population is finite and the sample has been taken with replacement, we have the following formula:

    σ²_x̄ = σ²/n    (8.7)

Also, if the population is finite and the sample of size n is taken without replacement such that

    n/N ≤ 0.05    (8.8)

then the finite population correction factor is approximately equal to 1 and is ignored. In this case, we again have

    σ²_x̄ = σ²/n    (8.9)

From these formulas we see that the variance of the sample mean is smaller than the population variance and, therefore, the spread of the sampling distribution of X̄ is smaller than the spread of the corresponding population distribution. This also implies that as n increases, the values of X̄ become more concentrated about the mean of X̄, which is the same as the population mean. This grouping of sample means about the population mean is the reason that, whenever possible, we like to take as large a sample as possible when we want to estimate the population mean.

In the above discussion we have not said anything about the shape of the probability distribution of the sample mean. The concept of shape as related to a probability distribution is very interesting and occupies an important place in statistical applications.
If the distribution of a population is normal, then the distribution of the sample mean X̄ is also normal, regardless of sample size. That is, when the population is N(μ, σ), then X̄ is N(μ, σ/√n). If the population is not normally distributed, then the sampling distribution of the sample mean X̄ provides us another important benefit. We describe this benefit in the context of a theorem, as shown in section 8.2.

8.2 The Central Limit Theorem


Theorem 8.1 Let X1, X2, ..., Xn be a random sample from a population with mean μ and variance σ², and let X̄ be the sample mean. Then, for large n (n ≥ 30), the sampling distribution of the sample mean X̄ is approximately normal with mean μ and variance σ²/n. In other words, the probability distribution of the random variable (X̄ − μ)/(σ/√n) is approximately standard normal.
It is important to note that there is a clear distinction between the two cases when the population is normal and when it is not normal. When the population is normal, there is no restriction on the sample size, and the distribution of the sample mean X̄ is exactly normal. When the population is not normally distributed, the distribution of X̄ is approximately normal only if the sample size is large (n ≥ 30). In addition, as discussed in section 8.1, if the population is finite and sampling is done without replacement, we use the finite population correction factor to calculate the variance σ²_x̄. That is, for a large sample size, X̄ is approximately normally distributed with mean μ and variance ((N − n)/(N − 1))(σ²/n).
Example 8.2 The mean weight of a food entrée is μ = 190 g with a standard deviation of 14 g. If a sample of 49 entrées is selected, then find:

(a) The probability that the sample mean weight will fall between 186 g and 194 g.
(b) The probability that the sample mean will be greater than 192 g.

Solution:
(a) Let X̄ be the sample mean. Since the sample size n = 49 is large, the central limit theorem tells us that X̄ is approximately normal with mean μ = 190 g and standard deviation σ_x̄ = σ/√n = 14/√49 = 2 g. Thus, from Figure 8.1, we have

    P(186 ≤ X̄ ≤ 194) = P((186 − 190)/2 ≤ (X̄ − 190)/2 ≤ (194 − 190)/2)
                      = P(−2 ≤ Z ≤ 2) = 2P(0 ≤ Z ≤ 2) = 2(.4772) = .9544

Figure 8.1 Shaded area showing P(−2 ≤ Z ≤ 2).

Figure 8.2 Shaded area showing P(Z ≥ 1).

(b) Using the same argument as in part (a) and Figure 8.2, we have the following:

    P(X̄ > 192) = P((X̄ − 190)/2 > (192 − 190)/2) = P(Z > 1) = 0.1587

Example 8.3 Repeat Example 8.2 when the sample size is increased from 49 to 64.

Solution:
(a) In this example X̄ will be approximately normal with mean μ = 190 g and standard deviation σ_x̄ = 14/√64 = 14/8 = 1.75 g. Continuing Example 8.2 and using Figure 8.3, we have

    P(186 ≤ X̄ ≤ 194) = P((186 − 190)/1.75 ≤ (X̄ − 190)/1.75 ≤ (194 − 190)/1.75)
                      = P(−2.28 ≤ Z ≤ 2.28) = 2P(0 ≤ Z ≤ 2.28)
                      = 2(0.4887) = 0.9774

Figure 8.3 Shaded area showing P(−2.28 ≤ Z ≤ 2.28).

Figure 8.4 Shaded area showing P(Z ≥ 1.14).

(b) Using Figure 8.4, we have

    P(X̄ > 192) = P((X̄ − 190)/1.75 > (192 − 190)/1.75) = P(Z > 1.14)
                = 0.5 − P(0 ≤ Z ≤ 1.14) = 0.5 − 0.3729 = 0.1271

From Examples 8.2 and 8.3, we see that as the sample size increases from 49 to 64, the probability that the sample mean falls within four units of the population mean increases, while the probability that the sample mean falls beyond two units from the population mean decreases. These examples support our previous assertion that as the sample size increases, the variance of the sample mean decreases and, therefore, the sample means are more concentrated about the population mean. This fact makes X̄ the best candidate to be used for estimating the population mean μ. We will study this concept in more detail in the next chapter.
Example 8.4 A random sample of 36 reinforcement rods is taken from a production plant that produces these rods with a mean length of μ = 80 cm and a standard deviation of 0.6 cm. Find the approximate probability that the sample mean of the 36 rods falls between 79.85 cm and 80.15 cm.

Solution: Let X̄ be the sample mean. Then we are interested in finding the probability P(79.85 ≤ X̄ ≤ 80.15).

Figure 8.5 Shaded area showing P(−1.5 ≤ Z ≤ 1.5).

Figure 8.6 Shaded area showing P(−1.6 ≤ Z ≤ 1.6).

Figure 8.7 Shaded area showing P(−2 ≤ Z ≤ 2).

Since the sample size is large, the central limit theorem tells us that X̄ is approximately normally distributed with mean μ_x̄ = 80 cm and standard deviation σ_x̄ = σ/√n = 0.6/√36 = 0.1 cm. Thus, using Figure 8.5, we have

    P(79.85 ≤ X̄ ≤ 80.15) = P((79.85 − 80)/0.1 ≤ (X̄ − 80)/0.1 ≤ (80.15 − 80)/0.1)
                          = P(−0.15/0.1 ≤ Z ≤ 0.15/0.1) = P(−1.5 ≤ Z ≤ 1.5)
                          = 2(0.4332) = 0.8664
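A small simulation makes the central limit theorem concrete. The sketch below is an illustrative addition (assuming Python with numpy, not part of the original text); the choice of a uniform parent population is an assumption made so that the population is clearly non-normal while keeping μ = 80 and σ = 0.6, as in Example 8.4:

    import numpy as np

    rng = np.random.default_rng(1)

    mu, sigma, n = 80.0, 0.6, 36
    half_width = sigma * np.sqrt(3.0)   # uniform(mu-h, mu+h) has std dev h/sqrt(3)

    # 100,000 samples of 36 rods each from a non-normal (uniform) population
    samples = rng.uniform(mu - half_width, mu + half_width, size=(100_000, n))
    xbar = samples.mean(axis=1)

    # Proportion of sample means falling in (79.85, 80.15)
    print(np.mean((xbar > 79.85) & (xbar < 80.15)))   # close to the CLT value 0.8664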


Example 8.5 Suppose the mean hourly wage of all employees in a large semiconductor manufacturing facility is $50 with a standard deviation of $10. Let X̄ be the mean hourly wage of certain employees selected randomly from all the employees of this manufacturing facility. Find the approximate probability that the mean hourly wage X̄ falls between $48 and $52 when the number of selected employees is (a) 64, (b) 100.
Solution:
(a) The sample size 64 is large, so by use of the central limit theorem we know that X̄ is approximately normally distributed with mean and standard deviation as follows:

    μ_x̄ = μ = 50

    σ_x̄ = σ/√n = 10/√64 = 1.25

Now we are interested in finding the probability P(48 ≤ X̄ ≤ 52). Using the mean and the standard deviation of the sample mean X̄ derived above and Figure 8.6, we have the following:

    P(48 ≤ X̄ ≤ 52) = P((48 − 50)/1.25 ≤ (X̄ − 50)/1.25 ≤ (52 − 50)/1.25)
                    = P(−1.6 ≤ Z ≤ 1.6) = 2(.4452) = .8904

(b) In this case the sample size is 100, which is again large. Using the same technique as in part (a), we have the following:

    μ_x̄ = μ = 50 and σ_x̄ = 10/√100 = 1

Then, the desired probability (see Figure 8.7) is given by

    P(48 ≤ X̄ ≤ 52) = P((48 − 50)/1 ≤ (X̄ − 50)/1 ≤ (52 − 50)/1)
                    = P(−2 ≤ Z ≤ 2) = 2(.4772) = .9544


Example 8.6 Repeat Example 8.5, assuming that the total number of employees in the company is only 500.

Solution: In this case, the sample size is large, but the population is not very large. Before applying the central limit theorem, we must determine whether n ≤ (0.05)N, because otherwise, to calculate the standard deviation of the sample mean, we have to use the finite population correction factor.

(a) In this case, we have

    n/N = 64/500 = 0.128

Thus, the sample size is greater than 5% of the population. By using the finite population correction factor, we have the following:

    μ_x̄ = μ = 50 and σ_x̄ = (σ/√n)√((N − n)/(N − 1)) = (10/8)√((500 − 64)/(500 − 1))
                          = 1.25√(436/499) = 1.168

Therefore, the desired probability (see Figure 8.8) is given by

    P(48 ≤ X̄ ≤ 52) = P((48 − 50)/1.168 ≤ (X̄ − 50)/1.168 ≤ (52 − 50)/1.168)
                    = P(−1.71 ≤ Z ≤ 1.71) = 2(.4564) = .9128

(b) Again, in this example, the sample size is greater than 5% of the population size, since

    n/N = 100/500 = 0.20

Thus, by using the finite population correction factor, we have the following:

    μ_x̄ = μ = 50 and σ_x̄ = (10/√100)√((500 − 100)/(500 − 1)) = (1)√(400/499) = 0.895

Therefore, the desired probability (see Figure 8.9) is given by

    P(48 ≤ X̄ ≤ 52) = P((48 − 50)/0.895 ≤ (X̄ − 50)/0.895 ≤ (52 − 50)/0.895)
                    = P(−2.23 ≤ Z ≤ 2.23) = 2(.4871) = .9742

Figure 8.8 Shaded area showing P(−1.71 ≤ Z ≤ 1.71).

Figure 8.9 Shaded area showing P(−2.23 ≤ Z ≤ 2.23).
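The arithmetic of Example 8.6 generalizes into a small helper. The following sketch is an illustrative addition (assuming Python with scipy; the function name standard_error is hypothetical, not from the original text); it applies the finite population correction of Equation (8.6) whenever n/N > 0.05 and then computes the desired probabilities:

    import math
    from scipy.stats import norm

    def standard_error(sigma, n, N=None):
        # Standard error of the sample mean; applies the finite
        # population correction factor when n/N > 0.05 (Equation (8.6)).
        se = sigma / math.sqrt(n)
        if N is not None and n / N > 0.05:
            se *= math.sqrt((N - n) / (N - 1))
        return se

    for n in (64, 100):
        se = standard_error(10, n, N=500)
        p = norm.cdf(52, loc=50, scale=se) - norm.cdf(48, loc=50, scale=se)
        print(n, round(se, 3), round(p, 4))
    # 64  1.168  ~0.913  (the text's table rounding gives .9128)
    # 100 0.895  ~0.975  (text: .9742)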

8.2.1 Sampling Distribution of Sample Proportion

Let X1, X2, ..., Xn be a random sample, where each Xi represents the outcome of the ith Bernoulli trial, that is,

    Xi = 1 with probability p
       = 0 with probability q = 1 − p    (8.10)

where p is the probability of success and q the probability of failure in a Bernoulli trial. Thus, the sample mean is

    X̄ = (X1 + X2 + ... + Xn)/n = m/n    (8.11)

where m is the total number of successes in n Bernoulli trials. Thus, X̄ represents the proportion of successes in a sample of n trials. In other words, we may look upon X̄ as the sample proportion, which is usually denoted by p̂ (read as "p hat"). Thus, the sampling distribution of the sample proportion is just the sampling distribution of the sample mean when the sample is taken from a Bernoulli population with mean p and variance pq.

From the above discussion and the result of the central limit theorem, it follows that for large n the sampling distribution of the sample proportion p̂ (the sample mean of a sample from a Bernoulli population) is approximately normal with mean p and variance pq/n.

It is important to note, however, that when the sample is taken from a Bernoulli population, the central limit theorem holds only when np ≥ 5 and nq ≥ 5. If p = 1/2, for example, the sampling distribution of the sample proportion is approximately normal even when n is as small as 10.
Example 8.7 Let X be a random variable distributed as binomial with parameters n and p, B(n, p), where n is the number of trials and p is the probability of success. Find the sampling distribution of the sample proportion p̂ when (a) n = 100, p = 0.25, and (b) n = 64, p = 0.5.

Solution:
(a) We have the following:

    np = 100(.25) = 25 > 5
    nq = 100(1 − .25) = 100(.75) = 75 > 5

so the central limit theorem holds. Thus, p̂ is approximately normally distributed with mean and variance given as follows:

    μ_p̂ = p = 0.25

    σ²_p̂ = pq/n = (.25)(.75)/100 = .001875

(b) In this case, again, we have

    np = (64)(.5) = 32 > 5
    nq = (64)(1 − .5) = 64(.5) = 32 > 5

Thus, p̂ is approximately normally distributed with mean and variance as follows:

    μ_p̂ = p = 0.5

    σ²_p̂ = pq/n = (.5)(.5)/64 = .25/64 = 0.0039
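The normal approximation for p̂ is easy to package. The sketch below is an illustrative addition (assuming Python; the function name phat_distribution is hypothetical, not from the original text); it checks the np ≥ 5 and nq ≥ 5 conditions and returns the approximating normal parameters for the two cases of Example 8.7:

    def phat_distribution(n, p):
        # Return (mean, variance) of the approximating normal for the
        # sample proportion, after checking np >= 5 and nq >= 5.
        q = 1 - p
        if n * p < 5 or n * q < 5:
            raise ValueError("normal approximation not justified")
        return p, p * q / n

    print(phat_distribution(100, 0.25))   # (0.25, 0.001875)
    print(phat_distribution(64, 0.5))     # (0.5, 0.00390625)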

8.3 Chi-Square Distribution

In this section we study a new distribution known as the chi-square distribution, usually written as the χ²-distribution. The symbol χ is the Greek letter chi. We encounter chi-square distributions very frequently in statistical applications, and the distribution occupies a very important place in applied statistics. The chi-square distribution is related to the normal distribution as described in the following theorem.

Theorem 8.2 Let Z1, Z2, ..., Zn be a random sample from a standard normal population N(0,1). Let a new random variable X be defined as follows:

    X = Z1² + Z2² + ... + Zn²    (8.12)

The variable X is said to be distributed as chi-square with n degrees of freedom and is written as χ²_n. The frequency distribution of chi-square with n degrees of freedom is given by

    f(x) = (1/(2^(n/2) Γ(n/2))) e^(−x/2) x^((n/2)−1)   for x > 0
         = 0                                            otherwise    (8.13)

where x is a value of χ². A simple explanation of degrees of freedom is that it represents the number of variables in the sample that can independently vary.
The chi-square distribution has only one parameter n, the degrees of freedom. The shape of the distribution changes as the degrees of freedom change. For example, the shapes of the chi-square distribution for n = 4, 6, 8, and 10 are shown in Figure 8.10.

Figure 8.10 Chi-square distribution with different degrees of freedom.

Figure 8.11 Chi-square distribution with upper-tail area α.
From Figure 8.10 we can see that the random variable χ² assumes nonnegative values only. The entire frequency distribution curve falls to the right of the origin and is skewed to the right. The mean and variance of the chi-square distribution are respectively equal to the degrees of freedom and twice the degrees of freedom. That is,

    μ = n    (8.14)

    σ² = 2n    (8.15)

From Table V of the appendix we can find the value χ²_{n,α} of the variable χ² such that, as shown in Figure 8.11, we have

    P(χ²_n ≥ χ²_{n,α}) = α    (8.16)

For example, if the random variable χ² is distributed with n = 18 degrees of freedom and α = 0.05, which we write as χ²_{18,0.05}, then

    χ²_{18,0.05} = 28.8693

where the value 28.8693 is found from Table V of the appendix by first locating 18 in the first column of degrees of freedom and then locating 0.05 in the top row of the table. The value 28.8693 is the entry at the intersection of the row corresponding to 18 degrees of freedom and the column corresponding to the value α = 0.05. Thus, as shown in Figure 8.12, P(χ²_18 ≥ 28.8693) = 0.05.

Figure 8.12 Chi-square distribution with upper-tail area α = 0.05.

Figure 8.13 Chi-square distribution with lower-tail area α.
Note that if we are interested in finding the value of the variable χ² such that the lower-tail area, as shown in Figure 8.13, is α, then in Table V of the appendix we look for the value of χ² such that

    P(χ²_n ≥ χ²) = 1 − α    (8.17)

That is, we first find the corresponding area under the upper tail (the total area under the χ² curve minus α, which equals 1 − α), since Table V lists the values of the variable χ² only for certain values of upper-tail areas.

For example, if the random variable χ² is distributed with n = 10 degrees of freedom and α = 0.10, which we write as χ²_{10,1−0.10} or as χ²_{10,0.90} (see Figure 8.14), then from Table V of the appendix we have

    χ²_{10,0.90} = 4.86518 and P(χ²_10 ≤ 4.86518) = 0.10.
Example 8.8 Let a random variable χ² be distributed as chi-square with 20 degrees of freedom. Find the value of χ² such that (a) P(χ²_20 ≥ χ²) = 0.05, (b) P(χ²_20 ≤ χ²) = 0.025.

Figure 8.14 Chi-square distribution with lower-tail area α = 0.10.

Solution:
(a) In this case we are given the area under the upper tail. Thus we can read the value of χ² directly from the table with n = 20 and α = 0.05. That is,

    χ²_{20,0.05} = 31.410

(b) In this case, we are given the area under the lower tail. We first find the corresponding area under the upper tail, which is given by

    1 − α = 1 − 0.025 = 0.975

Then, the value of χ² can be read directly from Table V of the appendix with n = 20 and α = 0.975. That is,

    χ²_{20,1−0.025} = χ²_{20,0.975} = 9.591
In applications, we are commonly interested in finding the distribution of the sample variance S² when a random sample is taken from a normal population. We state this important result in the following theorem.

Theorem 8.3 Let X1, X2, ..., Xn be a random sample from a normal population with mean μ and variance σ². Let

    S² = (1/(n − 1)) Σ (Xi − X̄)²    (8.18)

Then

    (n − 1)S²/σ²    (8.19)

is distributed as χ² with (n − 1) degrees of freedom. This follows from the fact that X̄ and S² are independent random variables and, as seen earlier, X̄ is distributed as normal with mean μ and variance σ²/n. In this case, for a given X̄, only (n − 1) of the variables X1, X2, ..., Xn can vary freely; therefore, the degrees of freedom of χ² is (n − 1). The actual proof of this theorem is beyond the scope of this book.
Example 8.9 Suppose a tea packaging machine is calibrated so that the amount of tea it discharges is normally distributed with mean μ = 1 pound (16 oz) and a standard deviation σ of 1.0 oz. Suppose we randomly select 21 packages and weigh the amount of tea in each package. If the variance of these 21 weights is denoted by S², then it may be of interest to find the values of c1 and c2 such that

    P(c1 ≤ S² ≤ c2) = 0.95

The solution to this problem would enable us to calibrate the machine such that the value of the sample variance would be expected to fall between certain values with a very high probability.

Solution: From Theorem 8.3, we have the following:

    (n − 1)S²/σ² ~ χ²_20

Thus, we have

    P((n − 1)c1/σ² ≤ (n − 1)S²/σ² ≤ (n − 1)c2/σ²) = 0.95

or

    P((n − 1)c1/σ² ≤ χ²_20 ≤ (n − 1)c2/σ²) = 0.95

Now n = 21 and σ = 1. Thus, we have

    P(20c1 ≤ χ²_20 ≤ 20c2) = 0.95

Now by assigning probability 0.025 under each tail, or equivalently selecting the middle 0.95 area, we have from Table V of the appendix

    P(9.591 ≤ χ²_20 ≤ 34.170) = 0.95

Thus,

    20c1 = 9.591 and 20c2 = 34.170

or

    c1 = 0.479 and c2 = 1.708
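With a chi-square quantile function available, the calibration limits of Example 8.9 come directly from code, with chi2.ppf playing the role of Table V. An illustrative sketch (assuming Python with scipy, not part of the original text):

    from scipy.stats import chi2

    n, sigma = 21, 1.0
    df = n - 1

    # Middle 0.95 area: probability 0.025 under each tail
    lower = chi2.ppf(0.025, df)   # 9.591
    upper = chi2.ppf(0.975, df)   # 34.170

    c1 = sigma**2 * lower / df
    c2 = sigma**2 * upper / df
    print(round(c1, 3), round(c2, 3))   # 0.48 1.708 (text rounds c1 to 0.479)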


8.4 The Student's t-Distribution

Consider two independent random variables X and Y such that X is distributed as standard normal and Y is distributed as χ²_n. Then we define another random variable, that is,

    T = X / √(Y/n)    (8.20)

The random variable T is said to be distributed as a Student's t-distribution with n degrees of freedom. W. S. Gosset introduced and named this distribution while writing under the pen name "Student."

The frequency distribution function of the random variable T is given by

    f(t) = (1/(√n B(1/2, n/2))) (1 + t²/n)^(−(n+1)/2)    (8.21)
where B(a, b) is called the beta function and is equal to Γ(a)Γ(b)/Γ(a + b). Like the standard normal distribution, the t-distribution is unimodal and symmetric about t = 0. The mean and the variance of the t-distribution with n degrees of freedom are

    μ = 0,  provided n > 1    (8.22)

and

    σ² = n/(n − 2),  provided n > 2    (8.23)

respectively.
Figure 8.15 gives a comparison of the frequency distribution function of the Student's t-distribution and the standard normal distribution. Note that as the degrees of freedom increase, the t-distribution tends to become more like the standard normal distribution. In most applications, when n becomes large (n ≥ 30), we use the standard normal distribution rather than the t-distribution. We will see this substitution of the normal distribution for the Student's t-distribution in Chapter 9 and subsequent chapters.

From Theorem 8.3 and the definition of the t-distribution, we have an important result that is used quite frequently in applications.
Theorem 8.4 Let X1, X2, ..., Xn be a random sample from a normal population with mean μ and an unknown variance σ². Let X̄ and S² respectively be the sample mean and sample variance. Then the random variable

    T = (X̄ − μ) / (S/√n)    (8.24)

is distributed as a Student's t-distribution with (n − 1) degrees of freedom. It is quite common to denote the distribution of T by t_(n−1).

Figure 8.15 Frequency distribution function of the t-distribution with, say, n = 15 degrees of freedom, and the standard normal distribution.

Figure 8.16 t-distribution with shaded area under the two tails equal to P(T ≥ t_{n,α}) = P(T ≤ −t_{n,α}) = α.
A frequently asked question relates to the concept of degrees of freedom. As discussed earlier, we define the degrees of freedom as the number of observations that can be selected freely. For example, in Theorem 8.4 we have n observations, but the knowledge of X̄ imposes one restriction on these observations: their sum must be equal to nX̄. Thus, we can select only (n − 1) variables freely, and the nth observation must be such that the total of all the observations equals nX̄, so the degrees of freedom for the t-distribution in Theorem 8.4 is (n − 1).

Like the standard normal distribution tables, we have tables for the t-distributions. Let the random variable T be distributed as a t-distribution with n degrees of freedom. We then define the quantity t_{n,α} as

    P(T ≥ t_{n,α}) = α    (8.25)

Table IV of the appendix lists the values of the quantity t_{n,α} for various values of n and α.

To find the value of t_{n,α} from Table IV of the appendix, first locate the value of n in the column of degrees of freedom and then locate the value of α (the probability represented by the shaded area under the upper tail) at the top of Table IV. Then the value of t_{n,α} is the entry found at the intersection of the row of n degrees of freedom and the column of the probability α.

Also, from Figure 8.16 we can see that P(T ≥ t_{n,α}) = P(T ≤ −t_{n,α}) = α.
Example 8.10 Find the value of t_{15,0.05}.

Solution: A small portion of the t-table is reproduced in Table 8.4.

Table 8.4 A portion of the t-table giving the value of t_{n,α} for certain values of n and α.

    n     .10     .05     .025    .01
    14    1.345   1.761   2.145   2.624
    15    1.341   1.753   2.131   2.602
    16    1.337   1.746   2.120   2.583
    17    1.333   1.740   2.110   2.567
    18    1.330   1.734   2.101   2.552

Using the technique described above, we mark the row corresponding to 15 degrees of freedom and the column corresponding to the value α = 0.05. The entry at the intersection of the marked row and column is 1.753. Thus, we have t_{15,0.05} = 1.753.

If we now want the value of t_{n,α} such that the probability under the left tail is 0.05, then it is equal to −t_{n,0.05}, since the t-distribution is symmetric about the origin.
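Table IV lookups correspond to the t-distribution's inverse survival function. The sketch below is an illustrative addition (assuming Python with scipy, not part of the original text); it reproduces t_{15,0.05} and the corresponding left-tail value:

    from scipy.stats import t

    t_right = t.isf(0.05, df=15)   # upper-tail value t_{15,0.05}
    t_left = t.ppf(0.05, df=15)    # left-tail value, equals -t_{15,0.05}

    print(round(t_right, 3))   # 1.753
    print(round(t_left, 3))    # -1.753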

8.5 Snedecor's F-Distribution

Let the random variables X1 and X2 be distributed as chi-square with ν1 and ν2 degrees of freedom respectively, and let X1 and X2 be independent. Then the random variable

    F = (X1/ν1) / (X2/ν2)    (8.26)

is said to be distributed as Snedecor's F-distribution with ν1 and ν2 degrees of freedom and is usually written as F_{ν1,ν2}. Here ν1 and ν2 are respectively the numerator and denominator degrees of freedom.

The probability density function of the random variable F, which is distributed as Snedecor's F-distribution with ν1 and ν2 degrees of freedom, is given by

    f(x) = (ν1^(ν1/2) ν2^(ν2/2) x^((ν1/2)−1)) / (B(ν1/2, ν2/2) (ν1 x + ν2)^((ν1+ν2)/2))   for x > 0
         = 0                                                                              otherwise    (8.27)
The mean and variance of the F-distribution are

    μ = ν2/(ν2 − 2),  provided ν2 > 2    (8.28)

    σ² = 2ν2²(ν1 + ν2 − 2) / (ν1(ν2 − 2)²(ν2 − 4)),  provided ν2 > 4    (8.29)

respectively.

Note that the mean of the F-distribution depends only on the degrees of freedom of the denominator.
Figure 8.17 shows the curve of the probability density function of a typical F-distribution with ν1 and ν2 degrees of freedom. Like the χ² random variable, the F random variable is also non-negative, and its distribution is skewed to the right. The shape of the distribution changes as the degrees of freedom change.

Now consider two random samples X11, X12, ..., X1n1 and X21, X22, ..., X2n2 from two independent normal distributions with variances σ1² and σ2², respectively. Let S1² and S2² be the sample variances of the samples coming from these normal populations. From our previous discussion of Theorem 8.3, we know that X1 = (n1 − 1)S1²/σ1² and X2 = (n2 − 1)S2²/σ2² are independently distributed as chi-square with ν1 = n1 − 1 and ν2 = n2 − 1 degrees of freedom. In this case, we have the following theorem.
Theorem 8.5 Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be two independent random samples from two normal populations N(μ1, σ1) and N(μ2, σ2). Let S1² and S2² be the sample variances, and let a new random variable X be defined as

    X = (S1²/σ1²) / (S2²/σ2²)    (8.30)

Then the random variable X is distributed as F_{ν1,ν2}, where ν1 = n1 − 1 and ν2 = n2 − 1.

Figure 8.17 A typical probability density function curve of F_{ν1,ν2}.

Figure 8.18 Probability density function curve of F_{ν1,ν2} with upper-tail area α.

Figure 8.19 Probability density function curve of F_{ν1,ν2} with lower-tail area α.

Like tables for some other distributions, we also have tables for the Fdistribution. Table VI of the appendix lists values of F1, 2;  for various values of 1, 2 and  (see Figure 8.18), such that
P(F  F1, 2; )  

(8.31)

The following example illustrates the technique to read F-distribution


from Table VI of the appendix.
Example 8.11

Find the value of F15,20; 0.05

Solution: Locate and mark the column and row corresponding to the
numerator degrees of freedom (1  15) and the denominator degrees of
freedom (2  20). The entry at the intersection of the marked column and
row corresponding to the value   0.05 is the desired value of F15,20; 0.05. In
this case, we have
F15,20; 0.05  2.20
Note that entries F1, 2,  in the Table correspond only to the upper-tail areas.
To nd the entries corresponding to the lower-tail areas which we denote by
F1, 2, 1   such that (see Figure 8.19)
(F F1, 2, 1   )  


we use the following rule:

    F_{ν1,ν2;1−α} = 1 / F_{ν2,ν1;α}                                (8.32)

For example, the value of F15,20;1−0.05 is given by

    F15,20;1−0.05 = 1/F20,15;0.05 = 1/2.33 = 0.429
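The reciprocal rule in Equation (8.32) is easy to check numerically. Here is a minimal sketch in Python; the use of the scipy library is our assumption — the book itself works from the printed Table VI, and any statistical package with an F quantile function would do:

    # A quick numerical check of Equation (8.32): the lower-tail quantile of
    # an F-distribution is the reciprocal of the upper-tail quantile with the
    # degrees of freedom swapped.
    from scipy.stats import f

    # Upper-tail value F(15, 20; 0.05): the point with area 0.05 to its right.
    upper = f.ppf(1 - 0.05, dfn=15, dfd=20)              # about 2.20

    # Lower-tail value F(15, 20; 1-0.05) via the reciprocal rule (8.32).
    lower_by_rule = 1 / f.ppf(1 - 0.05, dfn=20, dfd=15)  # 1/2.33, about 0.429

    # Direct computation of the same lower-tail quantile for comparison.
    lower_direct = f.ppf(0.05, dfn=15, dfd=20)

    print(round(upper, 2), round(lower_by_rule, 3), round(lower_direct, 3))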

As an application of Theorem 8.5, consider the following example.


Example 8.12 Consider two random samples of sizes 10 and 12 from two independent normal populations N(μ1, σ) and N(μ2, σ). Let S1² and S2² be their respective sample variances. Find the value of λ such that

    P(S1²/S2² ≥ λ) = 0.05

Solution: Since σ1² = σ2² = σ², from Theorem 8.5 it follows that S1²/S2² is distributed as F9,11. Thus, we have

    P(S1²/S2² ≥ λ) = P(F9,11 ≥ λ)

So we need to find the value of λ = F9,11;0.05. From Table VI of the appendix, we have

    λ = F9,11;0.05 = 2.90
This means the probability that the variance of the first sample is greater than or equal to 2.90 times the variance of the second sample is only 0.05. We encourage the reader to verify that the probability that the variance of the first sample is less than or equal to F9,11;1−0.05 = 1/F11,9;0.05 ≈ 0.32 times the variance of the second sample is also 0.05.
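The verification suggested above is another application of the reciprocal rule. A short sketch (again assuming scipy is available) reads both tail quantiles of F9,11 directly:

    # Verifying the tail quantiles used in Example 8.12.
    from scipy.stats import f

    print(round(f.ppf(0.95, 9, 11), 2))   # upper 5% point, about 2.90
    print(round(f.ppf(0.05, 9, 11), 3))   # lower 5% point, about 0.32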

8.6 The Poisson Approximation to the Binomial Distribution

In Chapter 6, we saw that when X is a binomial random variable, the probability that it takes a certain value depends upon the values of the parameters (n, p). The probability P(X = x), where x = 0, 1, 2, ..., n is the number of successes in n independent trials, is given by

    P(X = x) = C(n, x) p^x (1 − p)^(n−x),  x = 0, 1, 2, ..., n     (8.33)


In Chapter 6, we also noted that for n ≤ 20 and certain values of p, the probability P(X = x) can readily be found by using the tables (Table I of the appendix). But if these tables are not available for a given value of n or p, or both, we need to use an advanced scientific calculator or a computer to calculate these probabilities. In this section we will see that when p becomes very small and n becomes very large, a binomial distribution can be approximated by using the Poisson distribution with λ = np, which, relatively speaking, is much easier to work with than the binomial distribution. The actual proof showing that the binomial distribution can be approximated with the Poisson distribution is beyond the scope of this book. We illustrate the use of the Poisson distribution as an approximation to the binomial distribution with the following example.
Example 8.13 Experience shows that the probability that an insurance company pays out against damages of car engine fires in any given year is 0.0001. Suppose that the insurance company has 50,000 persons who are insured for such damages. Find the probability that during any given year the insurance company will pay against this kind of damage to at most four persons.
Solution: From the given information it is clear that if X denotes the number of persons who are seeking compensation for damages due to engine fires, X is a random variable that is distributed as binomial with n = 50,000 and p = 0.0001, and we are interested in finding the probability P(X ≤ 4). Finding this probability by using the binomial distribution formula would be an almost insurmountable amount of work. But by using the Poisson approximation to the binomial distribution, we have

    λ = np = (50,000)(0.0001) = 5

and, therefore, we have

    P(X ≤ 4) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
             = e⁻⁵5⁰/0! + e⁻⁵5¹/1! + e⁻⁵5²/2! + e⁻⁵5³/3! + e⁻⁵5⁴/4!
             = 0.0067 + 0.0337 + 0.0842 + 0.1404 + 0.1755 = 0.4405
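The same comparison can be done directly on a computer. The sketch below (assuming scipy, which the book does not use) computes both the Poisson approximation and the exact binomial probability to show how close they are:

    # Poisson approximation to the binomial for Example 8.13.
    # X ~ Binomial(n = 50,000, p = 0.0001), approximated by Poisson(lambda = np = 5).
    from scipy.stats import binom, poisson

    n, p = 50_000, 0.0001
    lam = n * p                        # 5

    approx = poisson.cdf(4, lam)       # P(X <= 4) under the Poisson model
    exact = binom.cdf(4, n, p)         # exact binomial probability

    print(round(approx, 4), round(exact, 4))   # both about 0.44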

8.7 The Normal Approximation to the Binomial Distribution
In the previous section we studied how to use the Poisson distribution as an approximation to the binomial distribution. In this section we study how we use the normal distribution to approximate the binomial distribution. From our discussions of the binomial distribution in Chapter 6 and of the normal distribution in Chapter 7, we know that the binomial distribution is a distribution of a discrete random variable, while the normal distribution is a distribution of a continuous random variable. Further, the normal distribution is completely symmetric about its mean μ, whereas the shape of the binomial distribution depends upon the value of p.

Figure 8.20 Comparison of histograms for various binomial distributions (n = 15; p = 0.2, 0.3, 0.4, 0.5).

For example, in Figure 8.20 we can see that the binomial distribution becomes more symmetric about its mean μ = np as p approaches 0.5. Also, from Figure 8.21 (page 163) we can see that the approximation to the binomial distribution by the normal distribution with mean μ = np and variance σ² = npq seems very reasonable. Further, as we should expect, the normal approximation to the binomial distribution improves significantly as n increases and p approaches 0.5 (see Table 8.5). The empirical results show that, in general, the approximation to the binomial distribution using the normal distribution is quite good whenever n is large and p is such that np ≥ 5 and n(1 − p) ≥ 5.
As mentioned earlier, the normal and binomial distributions are, respectively, the distributions of continuous and discrete random variables. This means we are using the distribution of a continuous random variable to approximate the distribution of a discrete random variable. In this case, it is natural to expect that we would need to make a correction in order to convert the discrete random variable to a continuous random variable. In fact, the correction is made by adding or subtracting 0.5 to or from the values of the discrete random variable. This correction is usually known as the continuity correction factor. Table 8.6 shows the correction factor under different scenarios.
Let X be a binomial random variable with parameters (n, p). Then for some n and p such that np ≥ 5 and n(1 − p) ≥ 5, the binomial distribution is approximated using the normal distribution with mean μ = np and variance σ² = npq and by applying the appropriate continuity correction factor. We summarize the results in Tables 8.5 and 8.6 and illustrate the technique with the following example.

Example 8.14 Let X be a binomial random variable with n = 20 and p = 0.4.

Table 8.5 Comparison of approximate probabilities to the exact probabilities (n = 15; p = 0.4, 0.5)

          p = 0.4                    p = 0.5
  x   Exact prob.  Normal app.   Exact prob.  Normal app.
  0     0.0005       0.0015        0.0000       0.0001
  1     0.0047       0.0070        0.0005       0.0008
  2     0.0219       0.0237        0.0032       0.0040
  3     0.0634       0.0613        0.0139       0.0145
  4     0.1268       0.1208        0.0417       0.0412
  5     0.1859       0.1814        0.0916       0.0902
  6     0.2066       0.2079        0.1527       0.1519
  7     0.1771       0.1815        0.1964       0.1973
  8     0.1181       0.1207        0.1964       0.1972
  9     0.0612       0.0613        0.1527       0.1519
 10     0.0245       0.0237        0.0916       0.0902
 11     0.0074       0.0070        0.0417       0.0412
 12     0.0016       0.0015        0.0139       0.0145
 13     0.0003       0.0003        0.0032       0.0040
 14     0.0000       0.0000        0.0005       0.0008
 15     0.0000       0.0000        0.0000       0.0001

Table 8.6 Use of the continuity correction factor under different scenarios.

  Probability using binomial distribution    Probability using normal approximation
  P(a ≤ X ≤ b)                               P(a − 0.5 ≤ X ≤ b + 0.5)
  P(a < X ≤ b)                               P(a + 0.5 ≤ X ≤ b + 0.5)
  P(a ≤ X < b)                               P(a − 0.5 ≤ X ≤ b − 0.5)
  P(a < X < b)                               P(a + 0.5 ≤ X ≤ b − 0.5)
  P(X ≤ a)                                   P(X ≤ a + 0.5)
  P(X < a)                                   P(X ≤ a − 0.5)
  P(X = a)                                   P(a − 0.5 ≤ X ≤ a + 0.5)
  P(X ≥ a)                                   P(X ≥ a − 0.5)
  P(X > a)                                   P(X ≥ a + 0.5)

(a) Find the probability P(7 ≤ X ≤ 12) by using the binomial tables (Table I of the appendix).
(b) Find the probability P(7 ≤ X ≤ 12) by using the normal approximation to the binomial distribution. Find the difference between the approximate probability calculated in part (b) and the exact probability calculated in part (a).


Solution:
(a) From Table I of the appendix, we have

    P(7 ≤ X ≤ 12) = P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) + P(X = 11) + P(X = 12)
                  = 0.1659 + 0.1797 + 0.1597 + 0.1170 + 0.0710 + 0.0355
                  = 0.7289

(b) First we check whether we can use the normal approximation to the binomial distribution by verifying the conditions np ≥ 5 and n(1 − p) ≥ 5. From the given information, we have

    np = 20(0.4) = 8 ≥ 5
    n(1 − p) = 20(1 − 0.4) = 12 ≥ 5

Both conditions necessary for using the normal approximation to the binomial distribution are satisfied. Thus, we can now proceed to calculate the approximate probability as follows:

    μ = np = 20(0.4) = 8
    σ² = npq = 20(0.4)(0.6) = 4.8
    σ = √4.8 = 2.19

Using the continuity correction factor, we have

    P(7 ≤ X ≤ 12)                         binomial random variable
    ≈ P(7 − 0.5 ≤ X ≤ 12 + 0.5)           normal random variable
    = P(6.5 ≤ X ≤ 12.5)
    = P((6.5 − 8)/2.19 ≤ Z ≤ (12.5 − 8)/2.19)
    = P(−0.68 ≤ Z ≤ 2.05) = P(−0.68 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 2.05)
    = 0.2517 + 0.4798 = 0.7315

The difference between the approximate and the exact probability is 0.7315 − 0.7289 = 0.0026, which is clearly not very significant.
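The same calculation is straightforward on a computer. The following sketch (assuming scipy) applies the continuity correction from Table 8.6 and compares the result with the exact binomial probability:

    # Normal approximation with continuity correction for Example 8.14:
    # X ~ Binomial(n = 20, p = 0.4).
    from math import sqrt
    from scipy.stats import binom, norm

    n, p = 20, 0.4
    mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 8, sigma about 2.19

    # Exact P(7 <= X <= 12) from the binomial distribution.
    exact = binom.cdf(12, n, p) - binom.cdf(6, n, p)

    # Continuity-corrected normal approximation: P(6.5 <= X <= 12.5).
    approx = norm.cdf((12.5 - mu) / sigma) - norm.cdf((6.5 - mu) / sigma)

    print(round(exact, 4), round(approx, 4))   # both about 0.73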
Example 8.15 A fair coin is tossed 12 times. Find the exact and approximate probabilities of getting seven heads, and compare the two probabilities.
Solution: Let X be a random variable that denotes the number of heads. In this case, X is distributed as binomial with n = 12 and p = 0.5. Using the binomial tables, the exact probability of getting seven heads is

    P(X = 7) = 0.1934

Now we calculate the approximate probability by using the normal approximation. Clearly both conditions np ≥ 5 and n(1 − p) ≥ 5 are satisfied, and


Figure 8.21 (a) Showing the normal approximation to the binomial. (b) Replacing the shaded area contained in the rectangles by the shaded area under the normal curve.

    μ = np = 12(0.5) = 6
    σ² = npq = 12(0.5)(0.5) = 3
    σ = √3 = 1.73

Thus, we have

    P(X = 7) = P(7 ≤ X ≤ 7)
             ≈ P(7 − 0.5 ≤ X ≤ 7 + 0.5)
             = P(6.5 ≤ X ≤ 7.5)
             = P((6.5 − 6)/1.73 ≤ Z ≤ (7.5 − 6)/1.73)
             = P(0.29 ≤ Z ≤ 0.87)
             = P(0 ≤ Z ≤ 0.87) − P(0 ≤ Z ≤ 0.29) = 0.3078 − 0.1141 = 0.1937

In this case, the exact and approximate probabilities of getting 7 heads in 12 trials are almost equal.

9
Point and Interval Estimation

In Chapters 5 and 6 we studied probability distributions (or probability models) that describe various types of populations. Each of these probability distributions is characterized by one or more numerical descriptive measures called parameters. If the parameters of a probability distribution are not known, our knowledge of that distribution is incomplete. In practice, it is common that the parameters of a probability distribution that describes a population are unknown. One of the goals of statistics is to make some inference about the unknown parameters of a population based on the information contained in a sample taken from the population under consideration. Methods of statistical inference can be divided into two parts: parameter estimation and testing of statistical hypotheses. In this chapter we consider the problem of parameter estimation, and in Chapter 10 we will consider certain techniques of testing statistical hypotheses.
Estimation of statistical characteristics has many practical applications. For example, a paper mill might be interested in knowing the percentage of paper it has to discard due to wrinkles, holes, or other defects. A Six Sigma Black Belt may want to know if a production process meets required specifications, or he or she may want to find the process-capability ratio. A reliability engineer may believe the survival time of product units in actual use is normally distributed and may want to estimate the mean and the standard deviation, which are not known.
The estimation process is divided into two parts. In the first part, the information contained in a sample is used to arrive at a single number that closely depicts the actual characteristic of interest. For instance, we may estimate that 25% of the paper produced on a specific machine is discarded due to wrinkles, holes, or other defects. This part of estimation is called point estimation, and the single number we identify is called the point estimate. The second part of the estimation process uses the information contained in a sample to identify two numbers that represent an interval (a, b), which we believe encloses the unknown parameter with a certain probability. This second part of the estimation process is called interval estimation, and the interval itself is called the interval estimate or confidence interval. In both point estimation and interval estimation we work with a statistic (or a function of sample values) that allows

us to determine a point estimate and an interval estimate of the parameter under consideration. First, we will take a closer look at point estimation, and in later sections we will look at interval estimation.

9.1 Point Estimation

POINT ESTIMATION
DESCRIPTION: A method to find a single number, based upon the information contained in a sample, that comes close to an unknown parameter value.
USE: Used to assess characterization of a population by taking a sample from the population.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Seeking a sample that contains the pertinent information about the population.
SPECIAL COMMENTS/CONCERNS: Information extracted from a sample usually varies from sample to sample. An appropriate sample size should be taken so that the variation in the information obtained from sample to sample is minimal.
RELATED TOOLS: Interval estimation.

Point estimation is a method to determine a suitable statistic, called an estimator. The estimator tells us how to arrive at a single value, called a point estimate, based on the observations contained in a sample. For example,

    X̄ = (1/n) Σ Xi                                                 (9.1)

is a point estimator of the population mean μ. Note that X̄ is one of the many possible estimators of the population mean μ and is usually denoted by μ̂ (read as "mu hat"). This estimator tells us how to use the sample data to arrive at a single value x̄, a point estimate of μ. To calculate x̄, we add all the observations in the sample and then divide the sum by n, the number of observations in the sample.
Since it is always possible to find many point estimators of a parameter, our immediate problem is to decide which estimator is good. A good estimator is one that would give, based on the information contained in a sample, an estimate that falls close to the true value of the parameter to be estimated. But the question at hand is how to identify a good estimator. There are certain properties associated with a good estimator. An estimator that possesses more of these properties is considered a better estimator. Here we will examine only


a couple of such properties, as the rest of them are beyond the scope of this
book.
9.1.1 Properties of Point Estimators
There are various properties of a good point estimator, such as being unbiased, having minimum variance, relative efficiency, consistency, and sufficiency. The properties that we are going to discuss in some detail are the following:
1. Unbiased
2. Minimum variance
Let f(x, θ) be the probability distribution of a population of interest with a parameter θ, which is unknown. Let X1, X2, ..., Xn be a random sample from the population of interest. Let θ̂ = θ̂(X1, X2, ..., Xn) be a point estimator of the unknown parameter θ.

Definition 9.1 The point estimator θ̂ = θ̂(X1, X2, ..., Xn) is said to be an unbiased estimator of θ if and only if E(θ̂) (i.e., the mean of θ̂) is equal to θ. If E(θ̂) ≠ θ, then θ̂ is a biased estimator of θ.

Note that θ̂ in Definition 9.1 is a statistic.
Example 9.1 Find an unbiased estimator of the population mean μ.
Solution: In Chapter 8, we saw that the mean of the sample mean is equal to the population mean, that is, E(X̄) = μ. Therefore, the sample mean X̄ is always an unbiased estimator of the population mean μ.
Example 9.2 Find an unbiased estimator of the population proportion p.
Solution: In Chapter 8, we saw that the mean of the sample proportion p̂ is equal to the population proportion p, that is, E(p̂) = p. Thus, the sample proportion p̂ is an unbiased estimator of the population proportion p.
If an estimator is not unbiased, the bias of the estimator is equal to the difference between E(θ̂) and θ. If E(θ̂) > θ, θ̂ is said to be positively biased, and if E(θ̂) < θ, it is said to be negatively biased.
Having said this, there still remains one question to be answered. If the mean μ of a population is unknown and we take a random sample from this population, find its mean x̄, and use it as an estimate of μ, how do we know how close our estimate x̄ is to the true value of μ? The answer is simple, but it depends upon the population size, the probability distribution of the population, and the sample size. Before discussing the answer to this question, however, we would like to discuss the estimation of the population variance with the following example.
Example 9.3 Consider a population with probability distribution f(x, μ, σ²), where μ and σ² are the unknown population mean and variance, respectively. Find an unbiased estimator of σ².
Solution: Let X1, X2, ..., Xn be a random sample from the population f(x, μ, σ²) and let S² be the sample variance. Then S² is an estimator of σ².


However, whether this estimator is unbiased or biased depends upon how S² is defined. If it is defined as

    S² = (1/n) Σ (Xi − X̄)²                                         (9.2)

it is not an unbiased estimator of σ². If, on the other hand, it is defined as

    S² = (1/(n − 1)) Σ (Xi − X̄)²                                   (9.3)

it is an unbiased estimator of σ², since it can be shown that E(S²) = σ², where S² is defined as in Equation (9.3). This is the reason why we usually use Equation (9.3) instead of (9.2) to calculate the sample variance S² (see Chapter 4).
Getting back to the question we raised earlier, let E be the maximum absolute difference between the estimate x̄ and the actual (unknown) value of μ, and let 0 < α < 1. (In practice, α is usually taken to be 0.01, 0.05, or 0.1.) Now if either the population is normal (with no restriction on the sample size) or the sample size is large (n ≥ 30) and is chosen from an infinite population with known σ, we can say with probability 1 − α that

    E = z_{α/2} σ/√n                                               (9.4)

The quantity E is usually known as the margin of error or bound on the error of estimation. The result in Equation (9.4) is still valid if the population is finite and the sampling is done with replacement, or the sample size is less than 5% (see Equation (8.8)) of the population size. If the population is finite and the sample size relative to the population size is not small, we use the finite correction factor, and the maximum difference between the estimate x̄ and the true value of μ is

    E = z_{α/2} (σ/√n) √((N − n)/(N − 1))                          (9.5)

where N and n are the population and the sample size, respectively. If σ is not known in Equation (9.4) or (9.5), we replace it by the sample standard deviation S, which is an estimator of σ. Thus, in this case the maximum differences between the estimate x̄ and the true value of μ in Equations (9.4) and (9.5), respectively, are given by

    E = z_{α/2} S/√n                                               (9.6)

and

    E = z_{α/2} (S/√n) √((N − n)/(N − 1))                          (9.7)

Note that the margin of error E given in Equations (9.4)-(9.7) is attained with probability 1 − α.
Example 9.4 A manufacturing engineer wants to use the mean of a random sample of size n = 64 to estimate the average length of the rods being manufactured. If it is known that σ = 0.5 cm, find the margin of error with 95% probability.
Solution: Since the sample size is large, and assuming that the total number of rods manufactured at the given facility is quite large, from Equation (9.4) it follows that

    E = z_{α/2} σ/√n = 1.96 (0.5/√64) = 1.96 (0.5/8) = 0.1225

From Equations (8.6) and (9.4), or (8.7) and (9.5), it follows that as the sample size increases, the variance of the estimator of the parameter and the margin of error E both decrease. Thus, if the variance is minimal, the margin of error will also be minimal. In general, an unbiased estimator with minimum variance is a better estimator because it will result in an estimate that is closer to the true value of the parameter. This makes the minimum variance property of an estimator desirable.
Definition 9.2 Consider a population with probability density function f(x, θ), where θ is an unknown parameter. Let θ̂1, θ̂2, ..., θ̂n be unbiased estimators of θ. Then an estimator θ̂i is said to be a minimum variance unbiased estimator of θ if the variance of θ̂i is smaller than the variance of any other unbiased estimator.
There are techniques to find the minimum variance unbiased estimator, if it exists, but these techniques are beyond the scope of this book. So we shall limit ourselves to the following rule:
If we have more than one unbiased estimator (not necessarily all possible unbiased estimators) of θ, choose from these estimators the one that has the smallest standard error (the standard error is simply the standard deviation of the sampling distribution of the estimator).

Example 9.5 Let X1, X2, ..., Xn be a random sample from an infinite population with an unknown mean μ, and let X̄ and Md be the sample mean and the sample median, respectively. Then both X̄ and Md are unbiased estimators of μ, but X̄ is a better unbiased estimator of μ.
Solution: It can be shown that the expected value of both X̄ and Md is equal to μ, and therefore both are unbiased estimators of μ. Furthermore, the variance of Md for large samples is approximately

    σ²_Md ≈ (1.25)² σ²/n = (1.25)² σ²_X̄                            (9.8)


which implies that the variance of the sample median is larger than the variance of the sample mean. Thus, between the two unbiased estimators of the population mean μ, we shall choose the sample mean X̄ as the better estimator of μ.
It is very interesting to note that if the population is normal, then X̄ is the minimum variance unbiased estimator of μ.
Example 9.6 In order to evaluate a new catalyst in a chemical production process, a chemist uses the catalyst in 30 batches. The final yield of the chemical in each batch is recorded as follows:

    72 74 71 78 84 80 79 75 77 76 74 78 88 78 70
    72 84 82 80 75 73 76 78 84 83 85 81 79 76 72

(a) Find a point estimate of the final mean yield of the chemical.
(b) Find the standard error of the point estimator calculated in part (a).
(c) Find, with 95% probability, the margin of error.
Solution: Since the sample size is large, all the results discussed above are applicable to this problem. Also, note that when the population size is not known, as in this case, we assume that the population is very large, or at least large enough that the sample size is less than 5% of the population size.
(a) To find a point estimate of the final mean yield we find the sample mean, which is a point estimate of the final mean yield of the chemical. Thus, we have

    μ̂ = X̄ = (72 + 74 + 71 + 78 + ... + 72)/30 = 77.8

(b) To find the standard error of the point estimate calculated in part (a), we first need to determine the sample standard deviation S, which is given by

    S = √( (1/(n − 1)) Σ (Xi − X̄)² )                               (9.9)

Substituting n = 30 and the values of Xi and X̄ in Equation (9.9), we get

    S = 4.6416

so that the standard error of the point estimate is

    S/√n = 4.6416/√30 = 0.8474

(c) Since we want to find the margin of error with 95% probability, α = 0.05, and the population standard deviation σ is not known. Thus, substituting the values z_{α/2} = z0.025 = 1.96, S = 4.6416, and n = 30 in Equation (9.6), we get the margin of error

    E = 1.96 (4.6416/√30) = 1.6609


The value of the margin of error E shows that our estimate of the final mean yield of the chemical is quite good.
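All three parts of Example 9.6 can be reproduced with a few lines of plain Python (no statistical library is needed here; the critical value 1.96 is read from the normal table as in the text):

    # Point estimate, standard error, and 95% margin of error for Example 9.6.
    from math import sqrt

    yields = [72, 74, 71, 78, 84, 80, 79, 75, 77, 76, 74, 78, 88, 78, 70,
              72, 84, 82, 80, 75, 73, 76, 78, 84, 83, 85, 81, 79, 76, 72]

    n = len(yields)
    xbar = sum(yields) / n                                      # 77.8
    s = sqrt(sum((x - xbar) ** 2 for x in yields) / (n - 1))    # about 4.64
    se = s / sqrt(n)                                            # about 0.85
    E = 1.96 * se                                               # about 1.66

    print(round(xbar, 1), round(s, 4), round(se, 4), round(E, 4))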

9.2 Interval Estimation

INTERVAL ESTIMATION
DESCRIPTION: A method to find two numbers forming an interval, based upon the information contained in a sample, that would contain with certain probability the true value of an unknown parameter.
USE: Used to assess characterization of a population by taking a sample from the population.
TYPE OF DATA: Numerical (quantitative) data.
DESIGN/APPLICATION CONSIDERATIONS: Seeking a sample that contains the pertinent information about the population.
SPECIAL COMMENTS/CONCERNS: The size of an interval formed by using the information extracted from a sample usually varies from sample to sample. An appropriate sample should be taken so that the size of the interval under the given conditions is as small as possible.
RELATED TOOLS: Point estimation, hypothesis testing.

In the preceding section we found a single number to replace the unknown parameter. But in applications, a practitioner sometimes is more interested in finding a range of values for the unknown parameter rather than finding a single value, because it may be either difficult to attain that specific value or such an estimate may not make much sense. For example, it may not be possible to estimate precisely the cost of a project as $189,547.30, or the life of an electric bulb as exactly 500.25 hours. In this and the following sections, instead of finding a single value we will find a pair of values that serve as the end points of an interval such that the interval contains the true value of the unknown parameter with a desired probability. The interval that we find is usually called an interval estimate or the confidence interval. The probability with which it contains the true value of the parameter is known as the confidence coefficient.
Consider a population with probability distribution f(x, θ), where θ is an unknown parameter. Then we are interested in finding two values θℓ and θu such that the interval (θℓ, θu) contains the unknown parameter θ with probability, say, 1 − α; that is,

    P(θℓ ≤ θ ≤ θu) = 1 − α                                         (9.10)


Then θℓ and θu are commonly called the lower confidence limit (LCL) and upper confidence limit (UCL), respectively. The probability (1 − α) is the confidence coefficient. The difference θu − θℓ is called the width or the size of the confidence interval.

9.2.1 Interpretation of a Confidence Interval
Note that the lower and upper confidence limits θℓ and θu are statistics and therefore random variables, which means the interval (θℓ, θu) will be different for different samples. Thus, it is possible that some of these confidence intervals may contain the true value of the unknown parameter while some others may not contain that value. This leads us to the following interpretation of the confidence interval with confidence coefficient 1 − α: if we take a large number of samples and for each sample we determine the confidence interval (θℓ, θu), then the relative frequency with which these intervals contain the true value of θ, say θ = θ0, is 1 − α.
Figure 9.1 shows that 2 out of 50 confidence intervals with confidence coefficient 0.95 (α = 0.05) do not contain θ0, the true value of θ.

Figure 9.1 An interpretation of a confidence interval.

9.3 Confidence Intervals

A general method to determine a confidence interval for an unknown parameter θ is to use a random variable or a statistic called a pivotal quantity.

Definition 9.3 Let X1, X2, ..., Xn be a random sample from a population with an unknown parameter θ. Then a statistic ψ(X1, X2, ..., Xn) is called a pivotal quantity if it possesses the following properties:
1. It is a function of sample values and some parameters, including the unknown parameter θ.
2. Among all the parameters it contains, θ is the only unknown parameter.
3. The probability distribution of the random variable ψ does not depend upon the unknown parameter θ.
Example 9.7 Let X1, X2, ..., Xn be a random sample from a normal population with an unknown mean μ and known variance σ². Find a pivotal quantity for the unknown mean μ.
Solution: Clearly (X̄ − μ)/(σ/√n) is a pivotal quantity: it is a function of the sample values and the parameters μ and σ, μ is the only unknown parameter, and its probability distribution does not depend upon μ.

9.3.1 Confidence Interval for Population Mean μ When the Sample Size Is Large
9.3.1.1 Population Standard Deviation σ Known. Let X1, X2, ..., Xn be a random sample of size n (≥ 30) from a population having a probability distribution with unknown mean μ and variance σ², where σ² is known. We want to find a confidence interval for μ with confidence coefficient (1 − α). To find this confidence interval we first determine a pivotal quantity, which in this case is (X̄ − μ)/(σ/√n). Since the sample of size n ≥ 30 is large, it follows from the central limit theorem (see Chapter 8) that, irrespective of the form of the population probability distribution, the pivotal quantity (X̄ − μ)/(σ/√n) is distributed as standard normal (i.e., with mean 0 and standard deviation 1). Thus, we have (see also Figure 9.2)

    P(−z_{α/2} ≤ (X̄ − μ)/(σ/√n) ≤ z_{α/2}) = 1 − α                 (9.11)

After applying some simple mathematical operations, Equation (9.11) can be written as

    P(X̄ − z_{α/2} σ/√n ≤ μ ≤ X̄ + z_{α/2} σ/√n) = 1 − α             (9.12)

Equation (9.12) gives the confidence interval for the mean μ with confidence coefficient 1 − α. It is very interesting to note that the areas under the two tails shown in Figure 9.2 are each equal to α/2. In other words, we have divided α into two equal parts, taking half of α under one tail and the other half under the second tail. Technically speaking, we may divide α into two parts as we wish; for example, we may take one-third of α under one tail and the remaining two-thirds under the other tail. But traditionally we always divide α into two equal parts unless we have a very strong reason to do otherwise. Moreover, for a symmetric distribution, by dividing α into two equal parts we get a slightly smaller confidence interval, which is one of the nicest properties of a confidence interval (the smaller the better).


Figure 9.2 Standard normal curve with tail areas equal to α/2.

Thus, we have a confidence interval (μℓ, μu) with confidence coefficient (1 − α), where μℓ = X̄ − z_{α/2} σ/√n and μu = X̄ + z_{α/2} σ/√n are, respectively, the lower confidence limit (LCL) and upper confidence limit (UCL). The confidence interval (μℓ, μu) is known as a two-sided confidence interval.
As discussed above, suppose we decide not to divide α at all; that is, suppose α lies completely under one tail or the other, as shown in Figure 9.3(a) and Figure 9.3(b). Then we get confidence intervals that are known as one-sided confidence intervals. For example, from Figure 9.3(a), we have

    P((X̄ − μ)/(σ/√n) ≥ −z_α) = 1 − α                               (9.13)

Again, after applying some simple mathematical operations, Equation (9.13) can be written as

    P(μ ≤ X̄ + z_α σ/√n) = 1 − α                                    (9.14)

This is equivalent to saying that the probability that the interval (−∞, μu), where μu = X̄ + z_α σ/√n, contains the true value of μ is 1 − α. In other words, the probability that the upper bound of μ is equal to X̄ + z_α σ/√n is 1 − α. Hence, the interval (−∞, μu) is an upper one-sided confidence interval with confidence coefficient 1 − α. Similarly, we can determine a lower one-sided confidence interval with confidence coefficient (1 − α) as

    (μℓ, ∞) = (X̄ − z_α σ/√n, ∞)                                    (9.15)

9.3.1.2 Population Standard Deviation σ Unknown. Since in this case σ is not known, the quantity (X̄ − μ)/(σ/√n) no longer satisfies the desired properties of

Figure 9.3 (a) Standard normal curve with lower-tail area equal to α. (b) Standard normal curve with upper-tail area equal to α.

being a pivotal quantity, because besides μ it contains another unknown parameter, σ. Thus, under these circumstances we replace σ by its estimator S. When the population standard deviation σ is unknown, the quantity that we consider as a pivotal quantity is (X̄ − μ)/(S/√n).
Since the sample size is large, it can be shown that the pivotal quantity (X̄ − μ)/(S/√n) is still approximately distributed as standard normal. Thus, the confidence interval for μ with confidence coefficient (1 − α) can be obtained by following the same procedure as when σ is known. In other words, when the sample size is large, the confidence interval for μ can be


obtained by simply replacing σ with S in the corresponding confidence interval for μ when σ is known. Thus, we have the following:
The two-sided confidence interval with confidence coefficient 1 − α is (μℓ, μu), where

    μℓ = X̄ − z_{α/2} S/√n  and  μu = X̄ + z_{α/2} S/√n              (9.16)

The lower one-sided confidence interval with confidence coefficient 1 − α is (μℓ, ∞), where

    μℓ = X̄ − z_α S/√n                                              (9.17)

The upper one-sided confidence interval with confidence coefficient 1 − α is (−∞, μu), where

    μu = X̄ + z_α S/√n                                              (9.18)

Example 9.8 A manufacturing engineer decides to check the efficiency of a new technician hired by the company. She records the time taken by the technician to complete 100 randomly selected jobs and finds that the average time taken per job was 10 hours with a standard deviation of 2 hours. Find a 95% confidence interval for μ, the average time taken to complete one job.
Solution: In this example we do not know σ, but we are given that

    X̄ = 10 and S = 2

Also, the sample size n = 100 is large. Thus, we use the confidence interval (μℓ, μu), where

    μℓ = X̄ − z_{α/2} S/√n  and  μu = X̄ + z_{α/2} S/√n

With z_{α/2} = z0.025 = 1.96, we have


    μℓ = 10 − 1.96 (2/√100) = 9.608
    μu = 10 + 1.96 (2/√100) = 10.392

Thus, a 95% confidence interval for the average time taken by the technician to complete one job is (9.608, 10.392) hours.
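For readers who prefer to compute rather than read the normal table, here is a minimal sketch of Equation (9.16) in Python (the scipy library, used only for the normal quantile, is our assumption):

    # Large-sample two-sided confidence interval for the mean, Equation (9.16),
    # applied to Example 9.8: n = 100, xbar = 10, s = 2.
    from math import sqrt
    from scipy.stats import norm

    def z_interval(xbar, s, n, conf=0.95):
        """xbar +/- z_{alpha/2} * s / sqrt(n) for large n."""
        z = norm.ppf(1 - (1 - conf) / 2)   # 1.96 for 95%
        half = z * s / sqrt(n)
        return xbar - half, xbar + half

    print(z_interval(10, 2, 100))   # about (9.608, 10.392)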
Example 9.9 Suppose that it is known that the standard deviation σ of workers' hourly wages in the auto industry is $5. A random sample of 64 workers had an average hourly wage of $30. Find a 99% confidence interval for the mean hourly wage μ.
Solution: In this case the sample size n = 64 is large, and the population standard deviation is σ = 5. Also, we are given that X̄ = 30. Thus, a confidence interval for the mean hourly wage is (μℓ, μu), where

    μℓ = X̄ − z_{α/2} σ/√n = 30 − 2.575 (5/√64) = 30 − 1.61 = 28.39
    μu = X̄ + z_{α/2} σ/√n = 30 + 2.575 (5/√64) = 30 + 1.61 = 31.61

Thus, a 99% confidence interval for the mean hourly wage is (28.39, 31.61) dollars.
Note: It is important to remember that the size of a confidence interval, which is defined as μu − μℓ, will increase or decrease as the sample size decreases or increases.
9.3.2 Confidence Interval for Population Mean μ When the Sample Size Is Small
The large-sample procedure for finding a confidence interval for the population mean that we discussed in the previous section does not make any assumptions about the distribution of the sampled population, except that the mean and the standard deviation of the population are μ and σ, respectively. The confidence interval (μℓ, μu) was obtained using the fact that, for large sample size, by the central limit theorem (see Chapter 8) X̄ is approximately normally distributed with mean μ and standard deviation σ/√n. When the sample size is small, we cannot apply the central limit theorem, so we assume that the sampled population is normally distributed. Note that when the population is normal, then irrespective of the sample size, the sample mean X̄ is exactly normal with mean μ and standard deviation σ/√n.
Under the present scenario we consider two cases: one where σ is known and the other where σ is not known.
9.3.2.1 Population Standard Deviation σ Is Known. The case where σ is known is dealt with in exactly the same manner as in section 9.3.1, where the sample size was large. Since in the present case the sampled population is normally distributed with mean μ and standard deviation σ, the sample mean X̄ for any sample size is also normally distributed with mean μ and standard deviation σ/√n. Thus, a confidence interval for μ with confidence coefficient 1 − α is (μℓ, μu), where

    μℓ = X̄ − z_{α/2} σ/√n  and  μu = X̄ + z_{α/2} σ/√n              (9.19)

Lower and upper one-sided confidence intervals with confidence coefficient 1 − α are the same as in Equations (9.17) and (9.18), respectively, with S replaced by σ.

9.3.2.2 Population Standard Deviation σ Is Unknown. When the population is normal and the sample size is small, the pivotal quantity (X̄ − μ)/(S/√n), obtained from (X̄ − μ)/(σ/√n) by replacing the population standard deviation σ with its estimator S, is no longer normally distributed. In Chapter 8 (Theorem 8.4), we saw that the pivotal quantity (X̄ − μ)/(S/√n) is distributed as Student's t-distribution with n − 1 degrees of freedom. Thus, from Figure 9.4, we have
    P(−t_{n−1, α/2} ≤ (X̄ − μ)/(S/√n) ≤ t_{n−1, α/2}) = 1 − α

or

    P(−t_{n−1, α/2} S/√n ≤ X̄ − μ ≤ t_{n−1, α/2} S/√n) = 1 − α

or

    P(X̄ − t_{n−1, α/2} S/√n ≤ μ ≤ X̄ + t_{n−1, α/2} S/√n) = 1 − α

Thus, a two-sided small-sample confidence interval for μ with confidence coefficient 1 − α when σ is not known is (μℓ, μu), where

    μℓ = X̄ − t_{n−1, α/2} S/√n  and  μu = X̄ + t_{n−1, α/2} S/√n    (9.20)

Lower and upper one-sided confidence intervals are (μℓ, ∞) and (−∞, μu), respectively, where

    μℓ = X̄ − t_{n−1, α} S/√n  and  μu = X̄ + t_{n−1, α} S/√n        (9.21)

Figure 9.4 Student's t-distribution with tail areas equal to α/2.


Example 9.10 A random sample of 16 technicians between the ages of 35 and 40 years working in a large manufacturing company was taken, and their cholesterol levels were checked. It was found that the mean cholesterol level for this sample was 175 mg/100 ml, with a standard deviation of 15 mg/100 ml. Assuming that the cholesterol levels of all technicians in that company between the ages of 35 and 40 years are normally distributed, find a 95% confidence interval for the population mean μ.
Solution: From the information provided to us, we have

    n = 16, X̄ = 175 mg/100 ml, and S = 15 mg/100 ml

Using the two-sided small-sample confidence interval for μ when σ is unknown (see Equation (9.20)), we get

    μℓ = X̄ − t_{n−1, α/2} S/√n = 175 − t15,0.025 (15/√16) = 175 − (2.131)(3.75) = 175 − 7.99 = 167.01
    μu = X̄ + t_{n−1, α/2} S/√n = 175 + t15,0.025 (15/√16) = 175 + (2.131)(3.75) = 175 + 7.99 = 182.99

Thus, a 95% confidence interval for the population mean μ is (167.01, 182.99).
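Equation (9.20) is also a one-liner on a computer. Here is a sketch in Python (scipy is assumed, for the t quantile only) that reproduces Example 9.10:

    # Small-sample confidence interval using Student's t, Equation (9.20),
    # applied to Example 9.10: n = 16, xbar = 175, s = 15.
    from math import sqrt
    from scipy.stats import t

    def t_interval(xbar, s, n, conf=0.95):
        """xbar +/- t_{n-1, alpha/2} * s / sqrt(n), for a normal population."""
        tcrit = t.ppf(1 - (1 - conf) / 2, df=n - 1)   # t_{15, 0.025}, about 2.131
        half = tcrit * s / sqrt(n)
        return xbar - half, xbar + half

    print(t_interval(175, 15, 16))   # about (167.01, 182.99)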
Example 9.11 A random sample of size 25 of a certain kind of lightbulb yielded an average lifetime of 1875 hours and a standard deviation of 100 hours. From past experience it is known that the lifetime of this kind of bulb is normally distributed with mean μ and standard deviation σ. Find a 99% confidence interval for the population mean μ. Also find 99% lower and upper one-sided confidence intervals for the population mean μ.
Solution: We have

    n = 25, X̄ = 1875, and S = 100

Using the small-sample two-sided confidence interval (see Equation (9.20)), we get

    μℓ = X̄ − t_{n−1, α/2} S/√n = 1875 − t24,0.005 (100/√25) = 1875 − 2.797(20) = 1875 − 55.94 = 1819.06
    μu = X̄ + t_{n−1, α/2} S/√n = 1875 + t24,0.005 (100/√25) = 1875 + 2.797(20) = 1875 + 55.94 = 1930.94

Thus, a small-sample two-sided 99% confidence interval for μ is (1819.06, 1930.94).


The lower and upper one-sided 99% confidence limits are

    μℓ = X̄ − t_{n−1, α} S/√n = 1875 − t24,0.01 (100/√25) = 1875 − 2.492(20) = 1875 − 49.84 = 1825.16
    μu = X̄ + t_{n−1, α} S/√n = 1875 + t24,0.01 (100/√25) = 1875 + 2.492(20) = 1875 + 49.84 = 1924.84

Thus, lower and upper one-sided small-sample confidence intervals for the population mean μ with confidence coefficient 99% are (1825.16, ∞) and (0, 1924.84), respectively (note that the lifetime of a bulb cannot be negative, so the lower limit is zero instead of −∞).
We remind the reader that the use of Student's t-distribution to find a confidence interval for the population mean μ is applicable only when the following are true:
1. The population is normal.
2. The sample size is small (n < 30).
3. The population variance is not known.

9.4 Confidence Interval for the Difference between Two Population Means

We first give an important result in the form of a theorem. We are going to use the result of this theorem throughout this section.

Theorem 9.1 Let X1 and X2 be independent random variables distributed as normal with means μ1 and μ2 and variances σ1², σ2², respectively. Then the random variable X = X1 − X2 is also normally distributed, with mean μ1 − μ2 and variance σ1² + σ2². (The proof of this theorem is beyond the scope of this book.)

9.4.1 Large-Sample Confidence Interval for the Difference between Two Population Means
9.4.1.1 Population Variances σ1² and σ2² Are Known. Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be random samples from two independent populations having probability distributions with means μ1 and μ2 and variances σ1² and σ2², respectively, and let n1 ≥ 30 and n2 ≥ 30. Let X̄1 and X̄2 be the sample means of the samples X11, X12, ..., X1n1 and X21, X22, ..., X2n2, respectively. Since X̄1 and X̄2 are unbiased estimators of μ1 and μ2, respectively, X̄1 − X̄2 is an unbiased estimator of μ1 − μ2. The pivotal quantity that we use to determine a confidence interval for μ1 − μ2 is

    ((X̄1 − X̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2)                   (9.22)


Using the central limit theorem and the result of Theorem 9.1, it can easily be shown that the pivotal quantity in Equation (9.22) is distributed as standard normal (i.e., with mean 0 and standard deviation 1). Thus, we have

    P(−z_{α/2} ≤ ((X̄1 − X̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) ≤ z_{α/2}) = 1 − α

or

    P(−z_{α/2} √(σ1²/n1 + σ2²/n2) ≤ (X̄1 − X̄2) − (μ1 − μ2) ≤ z_{α/2} √(σ1²/n1 + σ2²/n2)) = 1 − α

or

    P((X̄1 − X̄2) − z_{α/2} √(σ1²/n1 + σ2²/n2) ≤ μ1 − μ2 ≤ (X̄1 − X̄2) + z_{α/2} √(σ1²/n1 + σ2²/n2)) = 1 − α

The large-sample two-sided confidence interval for μ1 − μ2 with confidence coefficient 1 − α is therefore as given in Equation (9.23):

    (X̄1 − X̄2 − z_{α/2} √(σ1²/n1 + σ2²/n2), X̄1 − X̄2 + z_{α/2} √(σ1²/n1 + σ2²/n2))    (9.23)

Lower and upper one-sided confidence intervals with confidence coefficient 1 − α are

    (X̄1 − X̄2 − z_α √(σ1²/n1 + σ2²/n2), ∞)  and  (−∞, X̄1 − X̄2 + z_α √(σ1²/n1 + σ2²/n2))    (9.24)

respectively.
Example 9.12 Suppose two independent random samples, one of 64 mechanical engineers and the other of 100 electrical engineers, showed that the mean starting salaries of mechanical and electrical engineers are $36,250 and $40,760, respectively. Suppose it is known that the standard deviations of starting salaries of mechanical and electrical engineers are $2240 and $3000, respectively. Find a two-sided 95% confidence interval for μ1 − μ2. Also find the upper and lower one-sided 95% confidence intervals for μ1 − μ2.
Solution: From the given information, we have

    n1 = 64, X̄1 = 36,250, σ1 = 2240
    n2 = 100, X̄2 = 40,760, σ2 = 3000

Using Equation (9.23), a two-sided confidence interval for μ1 − μ2 with 95% confidence coefficient is given by

    (36,250 − 40,760 − z0.025 √((2240)²/64 + (3000)²/100), 36,250 − 40,760 + z0.025 √((2240)²/64 + (3000)²/100))


    = (−4510 − 1.96(410.36), −4510 + 1.96(410.36))
    = (−5314.30, −3705.70)

Thus, a 95% confidence interval for μ1 − μ2 is (−5314.30, −3705.70). Note that both the lower and upper confidence limits are negative, which indicates with 95% probability that the electrical engineers' starting salary is higher than that of the mechanical engineers.
An upper one-sided 95% confidence interval for μ1 − μ2 is given by

    (−∞, X̄1 − X̄2 + z0.05 √((2240)²/64 + (3000)²/100)) = (−∞, 36,250 − 40,760 + 1.645(410.36))
    = (−∞, −4510 + 675) = (−∞, −3835)

Similarly, a lower one-sided 95% confidence interval for μ1 − μ2 is (−5185, ∞).
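A two-sample interval such as Equation (9.23) is a natural candidate for a small helper function. The sketch below (assuming scipy for the normal quantile) reproduces the two-sided interval of Example 9.12:

    # Two-sided CI for mu1 - mu2 with known variances, Equation (9.23),
    # applied to Example 9.12.
    from math import sqrt
    from scipy.stats import norm

    def two_sample_z_interval(x1, s1, n1, x2, s2, n2, conf=0.95):
        z = norm.ppf(1 - (1 - conf) / 2)
        half = z * sqrt(s1**2 / n1 + s2**2 / n2)
        return (x1 - x2) - half, (x1 - x2) + half

    print(two_sample_z_interval(36250, 2240, 64, 40760, 3000, 100))
    # about (-5314.3, -3705.7)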


9.4.1.2 Population Variances 12 and 22 Are Unknown When the population variances are not known and the sample sizes are large, the condence
interval for 1  2 with condence coefcient 1   is obtained by replacing 12and 22 in Equation (9.23) by the sample variances S12 and S22
respectively. Thus, a two-sided condence interval for 1  2 with condence coefcient 1   is
S12 S2 2
S12 S2 2
X
X
z
( X1 X 2 z 2
+
,
+
)
(9.25)
1 2 + 2
n1
n2
n1
n2
LCL
UCL
Lower and upper one-sided condence intervals with condence coefcient
1   are
( X1 X 2 z

S2 S2
S12 S2 2
+
, ) and (, X1 X 2 + z 1 + 2 ) (9.26)
n1 n2
n1
n2

Example 9.13 Two types of copper wire used in manufacturing electrical cables are tested for tensile strength. Two random samples, one of each type, of sizes n1 = 40 and n2 = 40 produced the following summary statistics:

    X̄1 = 150, S1 = 13;  X̄2 = 120, S2 = 12

Find a 99% confidence interval for μ1 − μ2, the difference in the mean tensile strengths of the two types of wire.
Solution: Since the population variances in this case are not known, a 99% confidence interval for μ1 − μ2 is obtained by using Equation (9.25). Thus, we have

    (X̄1 − X̄2 − z_{α/2} √(S1²/n1 + S2²/n2), X̄1 − X̄2 + z_{α/2} √(S1²/n1 + S2²/n2))

    = (150 − 120 − 2.575 √(13²/40 + 12²/40), 150 − 120 + 2.575 √(13²/40 + 12²/40))
    = (30 − 2.575(2.797), 30 + 2.575(2.797))
    = (30 − 7.20, 30 + 7.20) = (22.80, 37.20)

Thus, a 99% confidence interval for μ1 − μ2 is (22.80, 37.20). In this case both the lower and upper confidence limits are positive, which indicates with 99% probability that type 1 wires have the higher mean tensile strength.
9.4.2 Small-Sample Confidence Interval for the Difference between Two Population Means
In this section we discuss how to find a confidence interval for μ1 − μ2, the difference between two population means, when the sample sizes are small (at least one of n1 and n2 is less than 30).
As in the one-population case, when the sample size is small we assume that the two sampled populations are normally distributed with means μ1 and μ2 and variances σ1² and σ2², respectively. Under the small-sample case we shall consider here the following three different scenarios:
1. Both variances σ1² and σ2² are known.
2. Both variances σ1² and σ2² are unknown, but σ1² and σ2² can be assumed to be equal.
3. Both variances σ1² and σ2² are unknown, and σ1² and σ2² cannot be assumed to be equal.

9.4.2.1 Both Variances σ1² and σ2² Are Known. In this case the pivotal quantity we use to find a confidence interval for μ1 − μ2 is

    ((X̄1 − X̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2)                   (9.27)

which is distributed as standard normal N(0, 1), that is, normal with mean 0 and standard deviation 1. So in this case the pivotal quantity is the same as when the sample size is large and the variances are known. In the large-sample case the pivotal quantity is distributed as standard normal by the central limit theorem, and here it is because of the normality assumption. Thus, the confidence interval for μ1 − μ2 with confidence coefficient (1 − α) is exactly the same as in Equation (9.23). That is,

    (X̄1 − X̄2 − z_{α/2} √(σ1²/n1 + σ2²/n2), X̄1 − X̄2 + z_{α/2} √(σ1²/n1 + σ2²/n2))    (9.28)

Lower and upper one-sided confidence intervals with confidence coefficient (1 − α) are

    (X̄1 − X̄2 − z_α √(σ1²/n1 + σ2²/n2), ∞)                         (9.29)


and

    (−∞, X̄1 − X̄2 + z_α √(σ1²/n1 + σ2²/n2))                        (9.30)

respectively.
Example 9.14 A manager of a company wants to evaluate the technicians at two of its plants. She took two samples, one from each plant, of sizes n1 = 13 and n2 = 17 technicians. Then she looked at the number of jobs each technician performed during a fixed period of time. From experience, the numbers of jobs performed by the technicians at the two plants are known to be normally distributed with variances σ1² = 21 and σ2² = 18. The data collected produced the following summary statistics:

    X̄1 = 27, σ1² = 21;  X̄2 = 24, σ2² = 18

Find a 95% confidence interval for μ1 − μ2, the mean difference of the number of jobs performed by the technicians at the two plants.
Solution: The populations are normally distributed with known variances and the sample sizes are small. To find the desired confidence interval we use Equation (9.28), that is,

    (X̄1 − X̄2 − z_{α/2} √(σ1²/n1 + σ2²/n2), X̄1 − X̄2 + z_{α/2} √(σ1²/n1 + σ2²/n2))
    = (27 − 24 − 1.96 √(21/13 + 18/17), 27 − 24 + 1.96 √(21/13 + 18/17))
    = (3 − 3.205, 3 + 3.205) = (−0.205, 6.205)

Thus, a 95% confidence interval for μ1 − μ2 is (−0.205, 6.205).


9.4.2.2 Both Variances σ1² and σ2² Are Unknown but Can Be Assumed to Be Equal. Under this scenario we assume that the variances are unknown but can still be assumed to be equal, that is, σ1² = σ2² = σ². This situation looks somewhat strange, in that we are assuming the variances to be equal even though we do not know what the variances are. This assumption can, however, easily be verified by using the techniques of testing hypotheses that we are going to learn in the next chapter.
Since the two populations have the same variance σ², from the variance point of view it is quite reasonable to treat the two populations as identical. We can enhance the efficiency of the estimator of the unknown variance σ² by pooling the two samples drawn from the two populations. We denote such an estimator of σ² by Sp², which is defined as

    Sp² = ((n1 − 1)S1² + (n2 − 1)S2²) / (n1 + n2 − 2)              (9.31)

where S1² and S2² are the sample variances of the samples drawn from the two populations. In this case, the pivotal quantity that we use to find a confidence interval for μ1 − μ2 with confidence coefficient 1 − α is

    ((X̄1 − X̄2) − (μ1 − μ2)) / (Sp √(1/n1 + 1/n2))                 (9.32)

It can easily be shown that the pivotal quantity in Equation (9.32) is distributed as Student's t-distribution with n1 + n2 − 2 degrees of freedom. Thus, the two-sided confidence interval for μ1 − μ2 with confidence coefficient 1 − α is given by

    (X̄1 − X̄2 − t_{n1+n2−2, α/2} Sp √(1/n1 + 1/n2), X̄1 − X̄2 + t_{n1+n2−2, α/2} Sp √(1/n1 + 1/n2))    (9.33)

Lower and upper one-sided confidence intervals for μ1 − μ2 with confidence coefficient 1 − α are given by

    (X̄1 − X̄2 − t_{n1+n2−2, α} Sp √(1/n1 + 1/n2), ∞)               (9.34)

and

    (−∞, X̄1 − X̄2 + t_{n1+n2−2, α} Sp √(1/n1 + 1/n2))              (9.35)

respectively.
9.4.2.3 Both Variances σ1² and σ2² Are Unknown and Cannot Be Assumed to Be Equal. Under this scenario the population variances are again unknown, but they cannot be assumed to be equal (again, this assumption can be verified by using techniques discussed in Chapter 10). In this case, the pivotal quantity we use to find a confidence interval for μ1 − μ2 with confidence coefficient 1 − α is

    ((X̄1 − X̄2) − (μ1 − μ2)) / √(S1²/n1 + S2²/n2)                   (9.36)

The pivotal quantity in Equation (9.36) can be shown to be approximately distributed as Student's t-distribution with m degrees of freedom, where

    m = (S1²/n1 + S2²/n2)² / [ (S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1) ]    (9.37)


Since the degrees of freedom must be a whole number, it is usually necessary to round the value of m in Equation (9.37). Thus, the two-sided confidence interval for μ1 − μ2 with confidence coefficient (1 − α) is given by

    (X̄1 − X̄2 − t_{m, α/2} √(S1²/n1 + S2²/n2), X̄1 − X̄2 + t_{m, α/2} √(S1²/n1 + S2²/n2))    (9.38)

Lower and upper one-sided confidence intervals for μ1 − μ2 with confidence coefficient (1 − α) are given by

    (X̄1 − X̄2 − t_{m, α} √(S1²/n1 + S2²/n2), ∞)                    (9.39)

and

    (−∞, X̄1 − X̄2 + t_{m, α} √(S1²/n1 + S2²/n2))                   (9.40)

respectively.
Example 9.15 A pharmaceutical company sets two machines to fill 15-oz bottles with cough syrup. Two random samples of n1 = 16 bottles from machine 1 and n2 = 12 bottles from machine 2 are selected. The two samples yield the following sample statistics:

    X̄1 = 15.24, S1² = 0.64;  X̄2 = 14.96, S2² = 0.36

Find a 95% confidence interval for μ1 − μ2, the mean difference of the amount of cough syrup filled in the bottles by the two machines. Assume that the two population variances are equal.
Solution: Since in this case the population variances are unknown but assumed to be equal, we first find the pooled estimate of the common variance. That is,

    Sp² = ((n1 − 1)S1² + (n2 − 1)S2²) / (n1 + n2 − 2)
        = ((16 − 1)(0.64) + (12 − 1)(0.36)) / (16 + 12 − 2) = 0.5215

    Sp = 0.722

Now, determining the desired confidence interval using Equation (9.33), we have

    (X̄1 − X̄2 − t_{n1+n2−2, α/2} Sp √(1/n1 + 1/n2), X̄1 − X̄2 + t_{n1+n2−2, α/2} Sp √(1/n1 + 1/n2))
    = (15.24 − 14.96 − 2.056(0.722) √(1/16 + 1/12), 15.24 − 14.96 + 2.056(0.722) √(1/16 + 1/12))
    = (0.28 − 0.56, 0.28 + 0.56) = (−0.28, 0.84)

Thus, a 95% confidence interval for μ1 − μ2 is (−0.28, 0.84).
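The pooled-variance computation of Equations (9.31) and (9.33) translates directly into code. Here is a sketch in Python (scipy assumed, for the t quantile) that reproduces Example 9.15:

    # Pooled-variance CI for mu1 - mu2, Equations (9.31) and (9.33),
    # applied to Example 9.15.
    from math import sqrt
    from scipy.stats import t

    def pooled_t_interval(x1, v1, n1, x2, v2, n2, conf=0.95):
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)  # pooled variance
        tcrit = t.ppf(1 - (1 - conf) / 2, df=n1 + n2 - 2)      # t_{26, 0.025}
        half = tcrit * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
        return (x1 - x2) - half, (x1 - x2) + half

    print(pooled_t_interval(15.24, 0.64, 16, 14.96, 0.36, 12))
    # about (-0.28, 0.84)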


Example 9.16 Repeat Example 9.15, this time assuming that the population variances are not equal.
Solution: When the sample sizes are small and the variances are not equal, the pivotal quantity used to determine the desired confidence interval is

    ((X̄1 − X̄2) − (μ1 − μ2)) / √(S1²/n1 + S2²/n2)

which is approximately distributed as Student's t-distribution with m degrees of freedom, m in this example being equal to

    m = (0.64/16 + 0.36/12)² / [ (0.64/16)²/15 + (0.36/12)²/11 ] ≈ 26

Note that in this case the degrees of freedom turned out to be the same as in Example 9.15, but this is not always the case. Thus, using the confidence interval in Equation (9.38), we get

    (X̄1 − X̄2 − t_{m, α/2} √(S1²/n1 + S2²/n2), X̄1 − X̄2 + t_{m, α/2} √(S1²/n1 + S2²/n2))
    = (15.24 − 14.96 − 2.056 √(0.64/16 + 0.36/12), 15.24 − 14.96 + 2.056 √(0.64/16 + 0.36/12))
    = (0.28 − 0.54, 0.28 + 0.54) = (−0.26, 0.82)

Thus, a 95% confidence interval for μ1 − μ2 is (−0.26, 0.82).
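The approximate degrees of freedom in Equation (9.37) are tedious by hand but trivial in code. The following sketch (again assuming scipy) reproduces both m and the interval of Example 9.16:

    # Approximate degrees of freedom, Equation (9.37), and the resulting
    # interval of Equation (9.38), applied to Example 9.16.
    from math import sqrt
    from scipy.stats import t

    v1, n1 = 0.64, 16
    v2, n2 = 0.36, 12

    a, b = v1 / n1, v2 / n2
    m = (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))   # about 26

    tcrit = t.ppf(0.975, df=round(m))
    half = tcrit * sqrt(a + b)
    print(round(m), (round(0.28 - half, 2), round(0.28 + half, 2)))
    # 26, about (-0.26, 0.82)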

9.5 Confidence Intervals for Population Proportions When Sample Sizes Are Large

Let X1, X2, ..., Xn be a random sample of size n from a Bernoulli population (see section 8.3.1) with parameter p. Then from our earlier discussion in this chapter we know that p̂ = X/n is an unbiased estimator of p, where X is the total number of successes in n Bernoulli trials. In Chapter 8, we also saw that the statistic (p̂ − p)/√(p(1 − p)/n) is approximately normally distributed with mean 0 and standard deviation 1. This makes (p̂ − p)/√(p(1 − p)/n) a good candidate to be considered a pivotal quantity for estimating p, because it possesses all the characteristics of a pivotal quantity. Having said this, we are now ready to study the technique of finding a confidence interval for p. Note that throughout this section we are going to assume that n is large (np ≥ 5, n(1 − p) ≥ 5).

9.5.1 Confidence Interval for p, the Population Proportion

Let X1, X2, ..., Xn be a random sample of size n from a Bernoulli population with parameter p, and let X = X1 + X2 + ... + Xn. We are interested in finding a confidence interval for p with confidence coefficient 1 − α. As discussed above, we consider the quantity

    (p̂ − p) / √(p(1 − p)/n)                                        (9.41)

as a pivotal quantity for finding the confidence interval for p. Also, from our discussion in Chapter 8, we know that the pivotal quantity in Equation (9.41) is distributed approximately as standard normal N(0, 1). Thus, we have

    P(−z_{α/2} ≤ (p̂ − p)/√(p(1 − p)/n) ≤ z_{α/2}) = 1 − α

or

    P(−z_{α/2} √(p(1 − p)/n) ≤ p̂ − p ≤ z_{α/2} √(p(1 − p)/n)) = 1 − α

or

    P(p̂ − z_{α/2} √(p(1 − p)/n) ≤ p ≤ p̂ + z_{α/2} √(p(1 − p)/n)) = 1 − α    (9.42)

where the two end points are the LCL and UCL, respectively.

p(1 p )
in LCL and UCL is the standard error  p ,
n
which is unknown, since p is not known. But a good approximation of the
standard error is found by substituting p for p. Thus, we have

Note that the quantity

P ( p z

p (1 p )
p p + z
n

p (1 p )
) = 1
n

(9.43)


Therefore, a confidence interval for p with confidence coefficient (1 − α) is (p̂ℓ, p̂u), where

    p̂ℓ = p̂ − z_{α/2} √(p̂(1 − p̂)/n)                                 (9.44)

    p̂u = p̂ + z_{α/2} √(p̂(1 − p̂)/n)                                 (9.45)

Example 9.17 A random sample of 400 computer chips is taken from a large lot of chips, and 50 of them are found defective. Find a 95% confidence interval for p, the proportion of defective chips contained in the lot.
Solution: From the given information, we have n = 400 and x = 50. Thus, we have

    p̂ = x/n = 50/400 = 1/8

Since we are interested in finding a 95% confidence interval, we have α = 0.05, α/2 = 0.025, and z0.025 = 1.96. Hence the 95% confidence interval for p is (p̂ℓ, p̂u), where

    p̂ℓ = p̂ − z_{α/2} √(p̂(1 − p̂)/n) = 1/8 − 1.96 √((1/8)(7/8)/400) = 0.1250 − 0.0324 = 0.0926
    p̂u = p̂ + z_{α/2} √(p̂(1 − p̂)/n) = 1/8 + 1.96 √((1/8)(7/8)/400) = 0.1250 + 0.0324 = 0.1574

Thus, a 95% confidence interval for p is (0.0926, 0.1574).
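Equations (9.44) and (9.45) are easy to package as a small function. This sketch (assuming scipy for the normal quantile) reproduces Example 9.17:

    # Large-sample CI for a proportion, Equations (9.44)-(9.45),
    # applied to Example 9.17: x = 50 defectives in n = 400 chips.
    from math import sqrt
    from scipy.stats import norm

    def prop_interval(x, n, conf=0.95):
        phat = x / n
        z = norm.ppf(1 - (1 - conf) / 2)
        half = z * sqrt(phat * (1 - phat) / n)
        return phat - half, phat + half

    print(prop_interval(50, 400))   # about (0.0926, 0.1574)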


9.5.2 Confidence Interval for the Difference of Two Population Proportions
Quite often we are interested in finding a confidence interval for the difference of two population proportions. For example, we may be interested in estimating the true difference (p1 − p2) between the failure rates of a product manufactured by two independent companies. One way to know which company's product is better is to find a confidence interval for (p1 − p2) with a certain confidence coefficient.
Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be random samples of sizes n1 and n2 from two independent binomial populations with parameters p1 and p2, respectively. Then from our earlier discussion we know that

    p̂1 = (Σ_{i=1}^{n1} X1i)/n1  and  p̂2 = (Σ_{j=1}^{n2} X2j)/n2    (9.46)

are unbiased estimators of p1 and p2, respectively. Therefore, p̂1 − p̂2 is an unbiased estimator of p1 − p2. Moreover, for large sample sizes (n1p1 ≥ 5, n1(1 − p1) ≥ 5 and n2p2 ≥ 5, n2(1 − p2) ≥ 5), we know that p̂1 and p̂2 are normally distributed with means p1 and p2 and variances p1(1 − p1)/n1 and p2(1 − p2)/n2, respectively.
Now, using the result of Theorem 9.1, it follows that p̂1 − p̂2 is approximately normally distributed with mean p1 − p2 and variance p1(1 − p1)/n1 + p2(1 − p2)/n2. That is,

    ((p̂1 − p̂2) − (p1 − p2)) / √(p1(1 − p1)/n1 + p2(1 − p2)/n2)     (9.47)

is approximately distributed as standard normal N(0, 1). Thus, using the statistic in Equation (9.47) as the pivotal quantity for estimating p1 − p2, we have

    P(−z_{α/2} ≤ ((p̂1 − p̂2) − (p1 − p2)) / √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≤ z_{α/2}) = 1 − α

or

    P(−z_{α/2} √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≤ (p̂1 − p̂2) − (p1 − p2) ≤ z_{α/2} √(p1(1 − p1)/n1 + p2(1 − p2)/n2)) = 1 − α

or

    P((p̂1 − p̂2) − z_{α/2} √(p1(1 − p1)/n1 + p2(1 − p2)/n2) ≤ p1 − p2 ≤ (p̂1 − p̂2) + z_{α/2} √(p1(1 − p1)/n1 + p2(1 − p2)/n2)) = 1 − α

Note that the quantity √(p1(1 − p1)/n1 + p2(1 − p2)/n2) in the LCL and UCL is unknown, since p1 and p2 are not known. Also note that this quantity is the standard error of p̂1 − p̂2. Thus, we estimate the standard error of p̂1 − p̂2


by replacing p1 and p2 with p̂1 and p̂2, respectively. Therefore, a confidence interval for p1 − p2 with confidence coefficient 1 − α is given by

    ((p̂1 − p̂2) − z_{α/2} √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2), (p̂1 − p̂2) + z_{α/2} √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2))    (9.48)

Lower and upper one-sided confidence intervals for p1 − p2 with confidence coefficient 1 − α are given by

    ((p̂1 − p̂2) − z_α √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2), 1)        (9.49)

and

    (−1, (p̂1 − p̂2) + z_α √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2))       (9.50)

respectively.
Example 9.18 Companies A and B each claim that their new type of lightbulb has a lifetime of more than 5,000 hours. In a random sample of 400 bulbs manufactured by company A, 60 bulbs burned out before the guarantee period ended, and in a random sample of 500 bulbs manufactured by company B, 100 bulbs burned out before the guarantee period ended. Find a point estimate and a 95% confidence interval for the true value of the difference (p1 − p2), where p1 and p2 are the proportions of the bulbs manufactured by company A and company B, respectively, that burn out before the guarantee period, that is, 5,000 hours.
Solution: From the given information, we have

$$\hat{p}_1 = \frac{60}{400} = \frac{3}{20} \qquad \text{and} \qquad \hat{p}_2 = \frac{100}{500} = \frac{1}{5}$$

Thus, the point estimate of (p1 − p2) is

$$\hat{p}_1 - \hat{p}_2 = \frac{3}{20} - \frac{1}{5} = -0.05$$

We now want to find a 95% confidence interval for (p1 − p2). From Equation (9.48), we have

$$\left((\hat{p}_1 - \hat{p}_2) - z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}},\;\; (\hat{p}_1 - \hat{p}_2) + z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}\right)$$

Substituting the values of p̂1 and p̂2 in this relation and z_{α/2} = 1.96 since α = 0.05, we have

$$LCL = \left(\frac{3}{20} - \frac{1}{5}\right) - 1.96\sqrt{\frac{(3/20)(17/20)}{400} + \frac{(1/5)(4/5)}{500}} = -0.05 - 0.0495 = -0.0995$$

$$UCL = \left(\frac{3}{20} - \frac{1}{5}\right) + 1.96\sqrt{\frac{(3/20)(17/20)}{400} + \frac{(1/5)(4/5)}{500}} = -0.05 + 0.0495 = -0.0005$$

Thus, a 95% confidence interval for (p1 − p2) is (−0.0995, −0.0005).
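The same interval can be checked with a short Python sketch analogous to the one above (again an added illustration that assumes scipy is available):

import math
from scipy.stats import norm

n1, x1 = 400, 60
n2, x2 = 500, 100
p1, p2 = x1 / n1, x2 / n2            # sample proportions 0.15 and 0.20
z = norm.ppf(0.975)                  # 1.96 for a 95% interval
se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)  # estimated standard error
print(p1 - p2 - z * se, p1 - p2 + z * se)  # approximately (-0.0995, -0.0005)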

9.6 Determination of Sample Size

In this section we want to determine the sample size needed to estimate a parameter θ with a desired margin of error E (see section 9.1.1), where θ may be the population mean μ, the difference of two population means μ1 − μ2, the population proportion p, or the difference of two population proportions p1 − p2.
Thus, for example, let X1, X2, ..., Xn be a random sample from a population with probability distribution f(x, θ), where θ is an unknown parameter. Let θ̂ = θ̂(x1, x2, ..., xn) be an estimate of θ, where X1 = x1, X2 = x2, ..., Xn = xn. Obviously, we cannot expect θ̂ to be exactly equal to the true value of the parameter. The difference between θ̂ and θ is the error of estimation. The maximum value of the error of estimation is called the margin of error or bound on the error of estimation, where θ = μ, μ1 − μ2, p, or p1 − p2. The margin of error, denoted by E, from Equation (9.6) with probability 1 − α is given by

$$E = z_{\alpha/2}\,\sigma_{\hat\theta}$$

Note that E is also equal to half the width of the confidence interval for θ with confidence coefficient 1 − α. If we predetermine the size of the margin of error E, we would then like to determine the sample size needed to attain this value of the margin of error.
Case 1
Let θ = μ; then the margin of error with probability 1 − α is given by

$$E = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where σ is the population standard deviation. By squaring both sides and doing some algebraic manipulation, we get

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2} \tag{9.51}$$
Example 9.19 Suppose a manufacturing engineer wants to estimate the number of defective parts produced by a machine during each shift. An earlier study on a similar machine shows that the number of defective parts produced by the machine varies from shift to shift, with standard deviation equal to 12. How large a sample should the engineer take so that with 95% probability the estimate is within 3 parts of the true value of μ, the average number of defective parts produced by the machine in each shift?

Solution: From the given information, we have

$$z_{\alpha/2} = z_{0.025} = 1.96, \quad \sigma = 12, \quad E = 3$$

Thus, the desired sample size is

$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{E^2} = \frac{(1.96)^2(12)^2}{3^2} = 61.46$$

The engineer should take a sample of size 62 to achieve the goal. Note that the value of n is always rounded up.
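This rounding-up step is easy to mechanize. The sketch below (an added illustration, assuming scipy) implements Equation (9.51):

import math
from scipy.stats import norm

def sample_size_mean(sigma, E, conf=0.95):
    """Sample size to estimate a mean within margin E, per Equation (9.51)."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil((z * sigma / E) ** 2)   # always round up

print(sample_size_mean(12, 3))   # 62, matching Example 9.19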
Case 2
Let θ = μ1 − μ2. To determine the sample size in this case we assume that the sample sizes taken from the two populations are equal, that is, n1 = n2 = n. Then the margin of error with probability 1 − α is given by

$$E = z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n} + \frac{\sigma_2^2}{n}} = z_{\alpha/2}\sqrt{\frac{\sigma_1^2 + \sigma_2^2}{n}}$$

Now by squaring both sides and doing some algebraic manipulation, we get

$$n = \frac{z_{\alpha/2}^2\,(\sigma_1^2 + \sigma_2^2)}{E^2} \tag{9.52}$$

where σ1² and σ2² are the variances of the populations under consideration.

Example 9.20 Suppose that we want to estimate the difference between two population means μ1 and μ2. Further suppose that we know σ1 = 2.0 and σ2 = 2.5. How large a sample should be taken so that with probability 99% our estimate is within 1.2 units of the true value of μ1 − μ2?

Solution: From the given information, we have

$$z_{\alpha/2} = z_{0.005} = 2.575, \quad \sigma_1 = 2.0, \quad \sigma_2 = 2.5, \quad E = 1.2$$

Using Equation (9.52), the desired sample size is

$$n = \frac{(2.575)^2\,(2^2 + (2.5)^2)}{(1.2)^2} = 47.197, \text{ so } n = 48.$$

Notes: In practice it is quite common that we do not know the population variance. In such cases we replace the population variance by the sample variance. It is interesting to note here that if we do not know the population variance, we will have to find the sample variance, for which we would need to have a sample. But to have a sample we must know the sample size we are trying to find. Thus it becomes a vicious circle. To solve this problem we use one of two possible solutions.
a. We use some existing data on the same kind of study to calculate the sample variance. Then we use the value of the sample variance to determine the sample size n.
b. We take a preliminary sample, say of size n1, to calculate the value of the sample variance. Then we use this value of the sample variance to determine the sample size n. Since we already have a sample of size n1, we now take a supplemental sample of size n − n1 and combine the two samples in order to get a full sample of size n.
Case 3
Let θ = p. In this case the margin of error E is given by

$$E = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$

Now squaring both sides and doing some algebraic manipulation, we get

$$n = \frac{z_{\alpha/2}^2\,p(1-p)}{E^2} \tag{9.53}$$

Example 9.21 Suppose we select a random sample of eligible voters from some district to estimate the proportion of voters who would favor the incumbent candidate. How large a sample should be taken in order to estimate the proportion with a margin of error of 3% with 95% probability?

Solution: From the information available to us, we have

$$z_{\alpha/2} = z_{0.025} = 1.96, \quad E = 3\% = 0.03$$

Since we do not have any prior information about p, in order to make certain that our margin of error is no more than 3% we use p = 0.5. Note that this choice gives us the largest possible sample needed to attain the given margin of error. Using Equation (9.53), the sample size is

$$n = \frac{(1.96)^2(0.5)(1-0.5)}{(0.03)^2} = 1067.11, \text{ so } n = 1068.$$

Case 4
Let θ = p1 − p2. In this case, we assume that the sample sizes taken from the two Bernoulli populations are equal, that is, n1 = n2 = n. Then the margin of error E is given by

$$E = z_{\alpha/2}\sqrt{\frac{p_1(1-p_1)}{n} + \frac{p_2(1-p_2)}{n}}$$

Again, squaring both sides and doing some algebraic manipulation, we get the desired sample size n needed to have a margin of error no more than E with probability 1 − α:

$$n = \frac{z_{\alpha/2}^2\,[\,p_1(1-p_1) + p_2(1-p_2)\,]}{E^2} \tag{9.54}$$

Example 9.22 A marketing specialist of a car manufacturing company wants to estimate the difference between the proportions of customers who prefer a domestic car and those who prefer an imported car. How large a sample should she take from those who prefer domestic cars and those who prefer imported cars in order to have a margin of error of 2.5% with a probability of 99%? It is known that not very long ago 60% of the customers preferred domestic cars and 40% preferred imported cars.

Solution: From the given information, we have

$$p_1 = 0.6, \quad p_2 = 0.4, \quad z_{\alpha/2} = z_{0.005} = 2.575, \quad E = 2.5\% = 0.025$$

Substituting these values in Equation (9.54), we get

$$n = \frac{(2.575)^2\,((0.6)(0.4) + (0.4)(0.6))}{(0.025)^2} = 5092.32, \text{ so } n = 5093.$$
Notes: In cases 1 and 2 we are confronted with the problem of estimating the unknown population variances. Similarly, in cases 3 and 4 we are confronted with the problem of estimating the unknown population proportions. In this scenario we give three possible solutions (a computational sketch follows the list).
1. Use some old data on this kind of study to estimate p, and then use this estimate to find the desired sample size.
2. Take a preliminary sample of size n1 to estimate p, and then use this estimate to determine the sample size n. Then take a supplemental sample of size n − n1 and combine the two samples to get the full sample.
3. Take p = 0.5 and use this value to determine the sample size. When p = 0.5 the quantity p(1 − p) is largest. Therefore, in this case we get the largest possible sample needed to attain the margin of error E with probability 1 − α.
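The following Python sketch (an added illustration, assuming scipy) implements Equations (9.53) and (9.54), including the conservative p = 0.5 choice of solution 3:

import math
from scipy.stats import norm

def sample_size_proportion(E, conf, p=0.5):
    """Equation (9.53); p defaults to the conservative value 0.5."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / E ** 2)

def sample_size_two_proportions(E, conf, p1, p2):
    """Equation (9.54), with equal sample sizes n1 = n2 = n."""
    z = norm.ppf(1 - (1 - conf) / 2)
    return math.ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / E ** 2)

print(sample_size_proportion(0.03, 0.95))                  # 1068 (Example 9.21)
print(sample_size_two_proportions(0.025, 0.99, 0.6, 0.4))  # 5096; the text's 5093 uses z rounded to 2.575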

9.7 Confidence Interval for Population Variances

So far in this chapter we have considered the problem of point estimation of the population mean, population proportion, and population variance. We have also considered the problem of interval estimation of the population mean, the difference of two population means, the population proportion, and the difference of two population proportions. Quite often we also need to find a confidence interval for a population variance or for the ratio of two population variances. In this section we consider the problem of interval estimation of population variances under the assumption that the sampled populations are normally distributed.


9.7.1 Confidence Interval for a Population Variance

From Theorem 8.3 we know that when the sampled population is normally distributed with mean μ and variance σ², the random variable

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} \tag{9.55}$$

is distributed as chi-square with (n − 1) degrees of freedom.
Clearly the random variable χ² is a function of the sample values and the unknown parameter σ². From Theorem 8.2 we can also see that its probability distribution is free of σ². Thus, the random variable χ² has all the characteristics of a pivotal quantity for estimating σ², and we have

$$P\left(\chi^2_{n-1,\,1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,\,\alpha/2}\right) = 1 - \alpha$$
or

$$P\left(\sigma^2\,\chi^2_{n-1,\,1-\alpha/2} \le (n-1)S^2 \le \sigma^2\,\chi^2_{n-1,\,\alpha/2}\right) = 1 - \alpha$$

Doing some further algebraic manipulation, we get

$$P\left(\underbrace{\frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha/2}}}_{LCL} \le \sigma^2 \le \underbrace{\frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha/2}}}_{UCL}\right) = 1 - \alpha \tag{9.56}$$

Thus, a two-sided confidence interval for σ² with confidence coefficient 1 − α is (σ̂²ℓ, σ̂²u), where

$$\hat\sigma^2_\ell = \frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha/2}} \qquad \text{and} \qquad \hat\sigma^2_u = \frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha/2}}$$

Similarly, using the probability relations

$$P\left(\frac{(n-1)S^2}{\sigma^2} \le \chi^2_{n-1,\,\alpha}\right) = 1 - \alpha \qquad \text{and} \qquad P\left(\frac{(n-1)S^2}{\sigma^2} \ge \chi^2_{n-1,\,1-\alpha}\right) = 1 - \alpha$$

we get lower and upper one-sided confidence intervals for σ² with confidence coefficient 1 − α as

$$\left(\frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha}},\; \infty\right) \qquad \text{and} \qquad \left(0,\; \frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha}}\right) \tag{9.58}$$

respectively. Note that a confidence interval for the population standard deviation σ with confidence coefficient 1 − α is obtained by taking the square root of the corresponding confidence interval for σ². Thus, for example, a two-sided confidence interval for σ with confidence coefficient 1 − α is (σ̂ℓ, σ̂u), where

$$\hat\sigma_\ell = \sqrt{\frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha/2}}} \qquad \text{and} \qquad \hat\sigma_u = \sqrt{\frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha/2}}} \tag{9.59}$$

Example 9.23 The time taken by a worker in a car manufacturing company to finish a paint job on a car is normally distributed with mean μ and variance σ². A random sample of 15 paint jobs is selected and assigned to that worker, and the time taken by the worker to finish each job is recorded. These data yield a sample standard deviation of S = 2.5 hours. Find 95% two-sided and one-sided lower and upper confidence intervals for the population standard deviation σ.

Solution: From the given information and using the chi-square distribution table (Table V of the appendix) and Figure 9.5, we have

$$S = 2.5, \quad \alpha = 0.05, \quad n - 1 = 14$$
$$\chi^2_{14,\,1-0.025} = \chi^2_{14,\,0.975} = 5.629, \qquad \chi^2_{14,\,0.025} = 26.119$$

Thus, a two-sided confidence interval for σ² with confidence coefficient 95% is (σ̂²ℓ, σ̂²u), where

$$\hat\sigma^2_\ell = \frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha/2}} = \frac{(15-1)(2.5)^2}{26.119} = 3.35$$

$$\hat\sigma^2_u = \frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha/2}} = \frac{(15-1)(2.5)^2}{5.629} = 15.54$$

Therefore, a 95% two-sided confidence interval for σ² is (3.35, 15.54). Now taking the square roots of the lower and upper confidence limits for σ², we get a 95% confidence interval for the population standard deviation, which is (1.83, 3.94), because σ̂ℓ = √3.35 = 1.83 and σ̂u = √15.54 = 3.94.

Figure 9.5 Chi-square distribution with two tail areas each equal to 0.025 (critical points 5.629 and 26.119).


To find a one-sided confidence interval, note that the value of χ² changes, since the whole α now falls under only one tail. Thus, for example, we have

$$\hat\sigma^2_\ell = \frac{(n-1)S^2}{\chi^2_{n-1,\,\alpha}} = \frac{(15-1)(2.5)^2}{\chi^2_{14,\,0.05}} = \frac{87.5}{23.685} = 3.69$$

$$\hat\sigma^2_u = \frac{(n-1)S^2}{\chi^2_{n-1,\,1-\alpha}} = \frac{(15-1)(2.5)^2}{\chi^2_{14,\,0.95}} = \frac{87.5}{6.571} = 13.32$$

Therefore, one-sided lower and upper confidence intervals for σ² are (3.69, ∞) and (0, 13.32), respectively. The corresponding confidence intervals for the population standard deviation are found just by taking square roots; that is, one-sided lower and upper 95% confidence intervals for the population standard deviation are (1.92, ∞) and (0, 3.65), respectively.
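As an added computational check (a sketch assuming the scipy library; note that scipy's chi2.ppf takes a cumulative probability, while the text's subscripts denote upper-tail areas), the two-sided limits of Example 9.23 can be reproduced as follows:

import math
from scipy.stats import chi2

n, s = 15, 2.5
df = n - 1
ss = df * s ** 2                      # (n-1)S^2 = 87.5

# two-sided 95% limits for sigma^2, Equation (9.56)
lo = ss / chi2.ppf(0.975, df)         # 87.5 / 26.119 = 3.35
hi = ss / chi2.ppf(0.025, df)         # 87.5 / 5.629  = 15.54
print(math.sqrt(lo), math.sqrt(hi))   # (1.83, 3.94) for sigma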
9.7.2 Confidence Interval for the Ratio of
Two Population Variances

In this section we consider two normal populations with unknown variances σ1² and σ2². We want to find a confidence interval for σ1²/σ2² with confidence coefficient 1 − α. Let X11, X12, ..., X1n1 and X21, X22, ..., X2n2 be random samples from the two independent populations, and let S1² and S2² be the corresponding sample variances. Then, from Theorem 8.5, it follows that the random variable

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \tag{9.60}$$

is distributed as an F-distribution with ν1 and ν2 degrees of freedom, where ν1 = n1 − 1 and ν2 = n2 − 1.
From section 8.5, we can see that the probability distribution of the random variable F is free of the unknown parameters σ1² and σ2². Furthermore, the random variable F is a function of the sample values and of the unknown parameters σ1² and σ2² only; it has all the characteristics of a pivotal quantity for estimating the ratio of the two variances σ1² and σ2². Thus, we have

$$P\left(F_{\nu_1,\nu_2,\,1-\alpha/2} \le \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \le F_{\nu_1,\nu_2,\,\alpha/2}\right) = 1 - \alpha$$

or

$$P\left(F_{\nu_1,\nu_2,\,1-\alpha/2} \le \frac{S_1^2\,\sigma_2^2}{S_2^2\,\sigma_1^2} \le F_{\nu_1,\nu_2,\,\alpha/2}\right) = 1 - \alpha$$

Now, using

$$F_{\nu_1,\nu_2,\,1-\alpha/2} = \frac{1}{F_{\nu_2,\nu_1,\,\alpha/2}}$$

and doing some algebraic manipulation, we get

$$P\left(F_{\nu_2,\nu_1,\,1-\alpha/2}\,\frac{S_1^2}{S_2^2} \le \frac{\sigma_1^2}{\sigma_2^2} \le F_{\nu_2,\nu_1,\,\alpha/2}\,\frac{S_1^2}{S_2^2}\right) = 1 - \alpha \tag{9.61}$$

From Equation (9.61) it follows that a confidence interval for σ1²/σ2² with confidence coefficient 1 − α is

$$\left(F_{\nu_2,\nu_1,\,1-\alpha/2}\,\frac{S_1^2}{S_2^2},\;\; F_{\nu_2,\nu_1,\,\alpha/2}\,\frac{S_1^2}{S_2^2}\right) \tag{9.62}$$

The corresponding confidence interval for the ratio of the population standard deviations is found by taking the square roots of the confidence limits in Equation (9.62). Thus, a confidence interval for σ1/σ2 with confidence coefficient 1 − α is

$$\left(\sqrt{F_{\nu_2,\nu_1,\,1-\alpha/2}\,\frac{S_1^2}{S_2^2}},\;\; \sqrt{F_{\nu_2,\nu_1,\,\alpha/2}\,\frac{S_1^2}{S_2^2}}\right) \tag{9.63}$$

Lower and upper one-sided confidence intervals for the ratio of the population variances with confidence coefficient 1 − α are

$$\left(F_{\nu_2,\nu_1,\,1-\alpha}\,\frac{S_1^2}{S_2^2},\; \infty\right) \qquad \text{and} \qquad \left(0,\; F_{\nu_2,\nu_1,\,\alpha}\,\frac{S_1^2}{S_2^2}\right) \tag{9.64}$$

and for the ratio of the population standard deviations

$$\left(\sqrt{F_{\nu_2,\nu_1,\,1-\alpha}\,\frac{S_1^2}{S_2^2}},\; \infty\right) \qquad \text{and} \qquad \left(0,\; \sqrt{F_{\nu_2,\nu_1,\,\alpha}\,\frac{S_1^2}{S_2^2}}\right) \tag{9.65}$$

respectively.
Example 9.24 Two random samples of sizes 13 and 16 are selected from a group of patients with hypertension. The patients in the two samples are independently treated with drugs A and B. After a full course of treatment, these patients are evaluated. The data collected at the time of evaluation yield sample standard deviations S1 = 6.5 mm Hg and S2 = 7.5 mm Hg. Assume that the two sets of data come from independent normal populations with variances σ1² and σ2², respectively. Determine 95% two-sided and one-sided confidence intervals for σ1²/σ2² and σ1/σ2.


Figure 9.6 F-distribution curves: (a) shaded area under the two tails each equal to 0.025; (b) shaded area under the left tail equal to 0.05; (c) shaded area under the right tail equal to 0.05.

Solution: From the given information, we have

$$\alpha = 0.05, \quad \nu_1 = 12, \quad \nu_2 = 15, \quad S_1 = 6.5, \quad S_2 = 7.5$$

Thus, a two-sided confidence interval for the ratio σ1²/σ2² of the two population variances is determined by substituting these values in Equation (9.62) (see Figure 9.6(a)); that is,

$$\left(F_{\nu_2,\nu_1,\,0.975}\,\frac{S_1^2}{S_2^2},\;\; F_{\nu_2,\nu_1,\,0.025}\,\frac{S_1^2}{S_2^2}\right) = (0.2537,\; 2.3884)$$

and for the ratio of the standard deviations σ1/σ2 the two-sided confidence interval is found by taking square roots, which gives

(0.5037, 1.5454)

Using Equations (9.64) and (9.65) and Figures 9.6(b) and 9.6(c), we get 95% lower and upper one-sided confidence intervals for σ1²/σ2² and σ1/σ2 as

(0.3028, ∞), (0, 1.9679) and (0.5502, ∞), (0, 1.4028)

respectively.
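A short Python sketch (added for illustration, assuming scipy; scipy's f.ppf takes a cumulative probability, so the upper 0.025 point of the text's notation is f.ppf(0.975, ...)) reproduces the two-sided limits:

from scipy.stats import f

s1, s2 = 6.5, 7.5
nu1, nu2 = 12, 15
ratio = s1 ** 2 / s2 ** 2            # S1^2 / S2^2 = 0.7511

# two-sided 95% interval for sigma1^2/sigma2^2, Equation (9.62)
lo = f.ppf(0.025, nu2, nu1) * ratio  # F_{15,12,0.975} * ratio, about 0.254
hi = f.ppf(0.975, nu2, nu1) * ratio  # F_{15,12,0.025} * ratio, about 2.39
print(lo, hi)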

10
Hypothesis Testing

As discussed earlier, one of the aims of statistics is to make inferences about the unknown parameters of a population based upon the information contained in a sample selected from this population. The goal of making such inferences may be achieved by estimating the unknown parameters and then testing hypotheses about the plausible values of these unknown parameters. In Chapter 9 we considered the problem of estimating the unknown parameters. Here we will consider certain aspects of testing hypotheses.
Testing of hypotheses is a phenomenon that we deal with in everyday life. For example, a pharmaceutical company may like to test a hypothesis about a new drug used to treat patients with high cholesterol, breast cancer, or coronary artery disease. Amtrak, a train transportation service company, may like to test whether an existing track can be used to introduce a new train service for a particular route that covers a certain distance in a given period of time. A Six Sigma Green Belt in a paper mill may like to test a hypothesis that a new machine will produce no more than 10% of paper with defects. A civil engineer may like to test a hypothesis that a new bridge can withstand a weight of 80 tons. The U.S. Congress may like to test a hypothesis that new economic measures can reduce the unemployment rate by one full point. We could list any number of possible hypotheses. For all such hypotheses we are obliged to collect some data and test their validity. In this chapter we discuss some commonly used tests that help us either establish or contradict, with a certain desired probability, the validity of such hypotheses.

10.1 Basic Concepts of Testing Statistical Hypotheses

HYPOTHESIS TESTING

DESCRIPTION: A decision-making process, based upon the information contained in a sample, about whether an unknown population parameter can take some assigned value.

USE: Used to assess characterization of a population by taking a sample from the population.

TYPE OF DATA: Numerical (quantitative) data.

DESIGN/APPLICATION CONSIDERATIONS: Seeking a sample that contains the pertinent information about the population.

SPECIAL COMMENTS/CONCERNS: In deciding whether an unknown population parameter can have an assigned value based on information contained in a sample, we commit two types of errors with certain probabilities. To keep these probabilities at the desired levels, an appropriate sample size should be determined. If the data are qualitative, then such decisions are made by using special methods known as nonparametric methods.

RELATED TOOLS: Point estimation, interval estimation.

The first step toward testing a statistical hypothesis is to identify an appropriate probability model for the population under study and to identify the parameter for which the hypothesis is being formulated. Thus, if we identify a normal probability model as an appropriate model for the population under study, we formulate hypotheses about the mean μ and/or the standard deviation σ, since μ and σ are the only parameters of a normal probability model. Once an appropriate probability model is selected and the hypothesis is formulated, the next step is to collect data and proceed to test the hypothesis that we have formulated. After doing so, we will either support or discredit (with a certain desirable probability) the hypothesis that we have formulated about the unknown parameter. These steps are usually enough to identify the probability model that fully describes the population under investigation.
Generally speaking, a statistical hypothesis consists of a pair of statements about the unknown parameter. One of these statements describes someone's belief or the existing theory and is known as the null hypothesis, denoted by H0. The second statement is usually an assertion based upon some new information; it is known as the research hypothesis, or alternative hypothesis, and is denoted by H1 or Ha. Then, based on the information contained in a sample, we either reject the null hypothesis H0 in favor of the alternative hypothesis H1, or we do not reject H0. Rejecting the null hypothesis H0 means that the sample data support our assertion. However, not rejecting the null hypothesis H0 means the sample data do not support our assertion, or, in other words, the sample data do support the existing theory. This procedure of rejecting or not rejecting the null hypothesis H0 is called the testing of a statistical hypothesis.

Now suppose we have a population with a probability model f(x, θ), where θ is an unknown parameter. Then we may formulate a statistical hypothesis as

$$H_0: \theta = \theta_0 \qquad \text{versus} \qquad H_1: \theta < \theta_0 \tag{10.1}$$

where θ0 is known. Thus, under the null hypothesis it is believed that θ takes some known value θ0, whereas under the alternative hypothesis, our assertion, based on some theory or some new information, is that θ takes a value less than θ0. Should we have some different information, that could lead us to another alternative hypothesis, namely

$$H_1: \theta > \theta_0 \qquad \text{or} \qquad H_1: \theta \ne \theta_0 \tag{10.2}$$

Note that under the null hypothesis we have a specified value θ0 of θ, whereas under the alternative hypotheses we do not have any specified value of θ. A hypothesis that assigns a specified value to an unknown parameter is usually known as a simple hypothesis, and one that does not assign a specified value to the unknown parameter is known as a composite hypothesis. In the above scenario, the null hypothesis is a simple hypothesis, whereas the alternative hypotheses are composite hypotheses. The hypothesis

$$H_1: \theta < \theta_0 \qquad \text{or} \qquad H_1: \theta > \theta_0 \tag{10.3}$$

is called a one-tail alternative, while

$$H_1: \theta \ne \theta_0 \tag{10.4}$$

is called a two-tail alternative. These names for the alternative hypotheses are based on reasons that will become clear as we move forward. Having defined these terminologies, the question now is, how do we test these hypotheses? The most logical answer that comes to mind: to make any decision we are going to use the information contained in a sample that has been drawn from the population with probability model f(x, θ), where θ is unknown. We consider some statistic, called the test statistic, say θ̂, which, for example, may be an estimator of θ. Using the sample data, we calculate the value of the test statistic. Then for certain values of the test statistic we may favor the alternative hypothesis H1 and reject the null hypothesis H0, whereas for other values of the test statistic we do not reject the null hypothesis H0. Thus, for example, consider the following hypothesis:

$$H_0: \theta = \theta_0 \qquad \text{versus} \qquad H_1: \theta < \theta_0$$

It seems quite reasonable to consider that if the value of the test statistic θ̂ turns out to be too small, then we should favor the alternative hypothesis H1


and reject the null hypothesis H0; otherwise we should not reject H0. As for the decision of how small a value of θ̂ is too small, that can be made by considering the sample space of θ̂ and dividing it into two regions, so that if the value of θ̂ falls in the lower region, the shaded region in Figure 10.1(a), we reject H0; otherwise we do not reject H0. The region for which we reject H0 is usually known as the rejection region or critical region, and the region for which we do not reject the null hypothesis H0 is known as the acceptance region. The point separating these two regions is called the critical point. Using the same argument, we can easily see that for testing the alternatives

$$H_1: \theta > \theta_0 \qquad \text{and} \qquad H_1: \theta \ne \theta_0$$

the hypothesis H1: θ > θ0 is favored for large values of θ̂, while the hypothesis H1: θ ≠ θ0 is favored when θ̂ is either very small or very large. Thus, the rejection regions will fall in the upper region, and in both the lower and upper regions, respectively. These regions are shown in Figures 10.1(b) and 10.1(c), respectively.
Now it should be clear why we call the alternatives θ < θ0 or θ > θ0 one-tail and the alternative θ ≠ θ0 two-tail: it is because of the location of the rejection regions. In the first two cases the rejection region is located on only one side, while in the third case it is located on both sides.
We have developed the above procedure of using the information contained in a sample, by means of a statistic, to make a decision about the unknown parameters and, consequently, about the population itself. Having done this, the next question one might ask is whether there are any risks

Figure 10.1 Critical points dividing the sample space of θ̂ into two regions, the rejection region and the acceptance region: (a) H1: θ < θ0, (b) H1: θ > θ0, (c) H1: θ ≠ θ0.

of committing errors while making such decisions. The answer is yes. There are two risks. One occurs when the null hypothesis is true but, based on the information contained in the sample, we end up rejecting it. This type of error is called type I error. The second kind of error occurs when the null hypothesis is false, or the alternative hypothesis is true, but still we do not reject the null hypothesis. This kind of error is called type II error. Note that these errors cannot be eliminated, but they certainly can be minimized; for that we will have to pay some price. We shall study this aspect of the problem a little later in this chapter.
Certain probabilities are associated with committing type I and type II errors, which we denote by α and β, respectively. We may define α and β as follows:

$$\alpha = P(\text{rejecting } H_0 \mid H_0 \text{ is true}) \tag{10.5}$$

$$\beta = P(\text{not rejecting } H_0 \mid H_0 \text{ is false}) \tag{10.6}$$

We summarize the discussion of type I and type II errors in Table 10.1.

Table 10.1 Presenting the view of type I and type II errors.

                        H0 is true           H0 is false
Reject H0               Type I error (α)     Correct decision
Do not reject H0        Correct decision     Type II error (β)

The probability α is also known as the level of significance, while the probability β is known as the probability of type II error. (In quality control, α and β are called the producer's risk and the consumer's risk, respectively.) The complement of β, 1 − β, is known as the power of the test.
Notes: It is clear that the determination of the rejection regions depends upon the following:
1. The alternative hypothesis, which determines whether the rejection region falls under the left tail, the right tail, or both tails.
2. The level of significance α, which determines the size of the rejection region.
3. The value of α is always predetermined, but that is not true for β.
The value of β depends upon the alternative hypothesis, and it is always determined at a specific value of θ under the alternative hypothesis. For example, if the alternative hypothesis is H1: θ < θ0, the specific value, say θ1, should be such that θ1 is less than θ0. Similarly, if the alternative hypothesis is H1: θ > θ0 or H1: θ ≠ θ0, then θ1 should be such that θ1 > θ0 or θ1 ≠ θ0, respectively. Determining the value of β may sometimes be cumbersome. Quite often, however, the population and the parameter under investigation are such that the probability distribution of the test statistic is normal. In such cases the value of β can easily be obtained by using one of the following formulas.

$$\beta = P\left(Z > \frac{\theta_0 - \theta_1}{\sigma_{\hat\theta}} - z_\alpha\right), \quad \text{if } H_1: \theta < \theta_0 \tag{10.7}$$

$$\beta = P\left(Z < \frac{\theta_0 - \theta_1}{\sigma_{\hat\theta}} + z_\alpha\right), \quad \text{if } H_1: \theta > \theta_0 \tag{10.8}$$

$$\beta = P\left(\frac{\theta_0 - \theta_1}{\sigma_{\hat\theta}} - z_{\alpha/2} < Z < \frac{\theta_0 - \theta_1}{\sigma_{\hat\theta}} + z_{\alpha/2}\right), \quad \text{if } H_1: \theta \ne \theta_0 \tag{10.9}$$

From these formulas we can easily see that, under the alternative hypothesis, if we set θ1 = θ0 then β will always be (1 − α) and the power of the test will be α.
If we plot a graph of the values of β versus different values of θ1 under the alternative hypothesis, we get a curve known as the operating characteristic curve, or simply the OC-curve. The OC-curves for different alternative hypotheses are shown in Figure 10.2.

Figure 10.2 OC-curves for different alternative hypotheses.

If we now plot a graph of the power (1 − β) versus different values of θ1 under the alternative hypothesis, we get a curve known as the power curve. Power curves under different alternative hypotheses are shown in Figure 10.3.

Figure 10.3 Power curves for different hypotheses.

It is quite clear that although the value of α is predetermined, the value of β is determined later, and it depends upon the alternative hypothesis. Remember, when the value of β is lower, the power of the test is higher, and the test is therefore better. At this juncture one might ask whether one can ever assign some predetermined value to β as well. The answer is yes, but at a certain cost. What cost? The only cost is that the appropriate sample size

needed to achieve this goal may turn out to be quite large. For given values of α and β, the sample size n should be such that

$$n \ge \frac{(z_\alpha + z_\beta)^2\,\sigma^2}{(\theta_1 - \theta_0)^2} \quad \text{for a one-tail test} \tag{10.10}$$

$$n \ge \frac{(z_{\alpha/2} + z_\beta)^2\,\sigma^2}{(\theta_1 - \theta_0)^2} \quad \text{for a two-tail test} \tag{10.11}$$
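As an added illustration (assuming scipy; the choice β = 0.10 below is an assumed target, not from the text), Equations (10.10) and (10.11) are easy to evaluate:

import math
from scipy.stats import norm

def n_for_power(alpha, beta, sigma, theta0, theta1, two_tail=False):
    """Sample size per Equations (10.10)/(10.11)."""
    z_a = norm.ppf(1 - (alpha / 2 if two_tail else alpha))
    z_b = norm.ppf(1 - beta)
    return math.ceil((z_a + z_b) ** 2 * sigma ** 2 / (theta1 - theta0) ** 2)

# e.g., detecting a shift from 980 to 950 with sigma = 120, alpha = 0.01,
# and an assumed beta = 0.10
print(n_for_power(0.01, 0.10, 120, 980, 950))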

So far we have learned some general concepts of testing statistical hypotheses. Now we are ready to study the actual techniques of testing specific statistical hypotheses. An important technique for testing these hypotheses is to adhere to the following six steps.
Step 1 State the null hypothesis and the alternative hypothesis clearly.
Step 2 Assign an appropriate value to the level of significance, that is, α. It is common to assign one of the following values to α: 0.01, 0.05, or 0.10.
Step 3 Determine a suitable test statistic. For the statistical hypotheses that we are going to discuss in this book, the pivotal quantity, discussed in Chapter 9, for the parameter under investigation is usually used as a test statistic.
Step 4 Determine the probability distribution of the test statistic designated in step 3.
Step 5 Locate the rejection region(s) and determine the critical point.
Note that, as remarked earlier, the location of the rejection region always depends on the alternative hypothesis, while the critical point

or the size of the rejection region depends on the value assigned to α, the size of the type I error.
Step 6 Calculate the value of the test statistic and make the decision; that is, take a random sample from the population in question, calculate the value of the test statistic, and determine whether it falls in the rejection region. If it falls in the rejection region, we reject the null hypothesis H0. Otherwise, we do not reject H0.

10.2 Testing Statistical Hypotheses about One
Population Mean When Sample Size Is Large

10.2.1 Population Variance Is Known

Let X1, X2, X3, ..., Xn be a random sample from a population with probability density function f(x) with mean μ and variance σ². Let X̄ and S² be the sample mean and the sample variance, respectively. Then we want to test each of the three hypotheses, proposed in Equation (10.12), about the population mean μ.
Step 1 Define the null and alternative hypotheses:
(i) H0: μ = μ0 versus H1: μ < μ0,
(ii) H0: μ = μ0 versus H1: μ > μ0, or        (10.12)
(iii) H0: μ = μ0 versus H1: μ ≠ μ0
Step 2 Assign some suitable value to α, say α = 0.05.
Step 3 Determine a suitable test statistic. We consider the pivotal quantity

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \tag{10.13}$$

for μ as a test statistic for testing hypotheses (i), (ii), or (iii) about μ.
Step 4 Determine the probability distribution of the test statistic. Since we are assuming the sample size to be large (n ≥ 30), by the central limit theorem we know that the test statistic Z is distributed as standard normal, that is, normal with mean 0 and standard deviation 1.
Step 5 Find the rejection regions. Since the location of the rejection region depends upon the alternative hypothesis and its size depends upon the size of the type I error, set

in this case at 0.05, the rejection regions for all three hypotheses are as shown in Figure 10.4.

Figure 10.4 Rejection regions for hypotheses (i), (ii), and (iii): critical points −1.645 for (i), 1.645 for (ii), and ±1.96 for (iii).

Note that because of the location of the rejection regions, the hypotheses (i), (ii), and (iii) are sometimes known as lower-tail, upper-tail, and two-tail hypotheses, respectively.
Step 6 Calculate the value of the test statistic and make the decision. We now take a random sample from the given population and calculate the value of the test statistic

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Note that in the test statistic σ and n are known, and X̄ is calculated using the sample data. The value of μ is always taken equal to μ0, since we always test a hypothesis under the assumption that the null hypothesis is true. Then, if the value of the test statistic falls in the rejection region, we contradict our assumption and reject H0. Otherwise, we do not reject H0.
Example 10.1 A random sample of 36 pieces of copper wire produced in a plant of a wire manufacturing company yields a mean tensile strength of X̄ = 950 psi. Suppose that the population of tensile strengths of all copper wires produced in that plant is distributed with mean μ and standard deviation σ = 120 psi. Test the statistical hypothesis

H0: μ = 980 versus H1: μ < 980

at the α = 0.01 level of significance.

Solution:
Step 1 H0: μ = 980 versus H1: μ < 980
Step 2 α = 0.01

Step 3 The test statistic is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Step 4 Since the sample size n = 36 (≥ 30) is large, it follows from the central limit theorem that the test statistic Z is distributed as standard normal.
Step 5 Since the test is a lower-tail test, using the standard normal distribution table it can easily be found that the rejection region is as shown in Figure 10.5.
Step 6 The value of the test statistic is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{950 - 980}{120/\sqrt{36}} = -1.5$$

This value does not fall in the rejection region, so we do not reject the null hypothesis H0. In other words, the data seem to support the hypothesis that the mean tensile strength of the copper wires manufactured in that plant is 980 psi.
Sometimes, instead of determining the rejection region (in step 5) and then verifying whether the value of the test statistic falls in the rejection region, we use a different method to make our decision. This method makes use of a quantity called the p-value.
Definition 10.1 The p-value of a test is the smallest value of α for which the null hypothesis H0 is rejected.
Given that the sample has been taken and the value z of the test statistic Z has been computed, the p-value may be determined as follows:

$$\text{p-value} = P(Z \le z), \quad \text{if } H_1: \mu < \mu_0 \tag{10.14}$$

$$\text{p-value} = P(Z \ge z), \quad \text{if } H_1: \mu > \mu_0 \tag{10.15}$$

$$\text{p-value} = 2P(Z \ge |z|), \quad \text{if } H_1: \mu \ne \mu_0 \tag{10.16}$$

Figure 10.5 Lower-tail rejection region with α = 0.01 (critical point −2.33).

Rules for Using the p-Value
1. If the p-value is less than or equal to α, reject H0.
2. If the p-value is greater than α, do not reject H0.
In Example 10.1, the p-value (see Equation (10.14)) is given by

p-value = P(Z ≤ −1.50) = 0.0668,

which is obviously greater than α = 0.01. Thus, we do not reject the null hypothesis.
Note: Whether one uses the rejection region method or the p-value method, one should always arrive at the same decision.
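For completeness, a short Python sketch (an added illustration, assuming scipy) carries out the test of Example 10.1 and its p-value:

import math
from scipy.stats import norm

xbar, mu0, sigma, n, alpha = 950, 980, 120, 36, 0.01
z = (xbar - mu0) / (sigma / math.sqrt(n))   # -1.5
p_value = norm.cdf(z)                       # lower-tail test, Equation (10.14)
print(z, p_value, p_value <= alpha)         # -1.5, about 0.0668, False -> do not reject H0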
Example 10.2 Using the information given in Example 10.1, test the statistical hypothesis

H0: μ = 980 versus H1: μ ≠ 980

Solution: The only difference between Examples 10.1 and 10.2 is that in Example 10.1 the test was a lower-tail test, but here it is a two-tail test. The first four steps in this example are exactly the same as in Example 10.1, so we start here with step 5.
Step 5 The level of significance is 0.01 and the test is a two-tail test. The rejection regions shown in Figure 10.6 are found by using the normal distribution table.
Step 6 Since the only change from Example 10.1 is in the alternative hypothesis, the value of the test statistic is not affected. In this example the value of the test statistic is again z = −1.5, which obviously does not fall in the rejection region. Therefore, we do not reject the null hypothesis H0.
Now we proceed to find the p-value in this example (see Equation (10.16)) and determine whether we arrive at the same conclusion:

p-value = 2P(Z ≥ |−1.50|) = 2(0.0668) = 0.1336

It is greater than α = 0.01, so we do not reject the null hypothesis H0.

Figure 10.6 Two-tail rejection regions with α = 0.01 (critical points ±2.575).


Example 10.3 Construct the power curve for the test in Example 10.2.

Solution: Since the hypothesis in Example 10.2 is a two-tail test, we calculate the value of β at μ = 860, 880, 900, 925, 950, 975, 985, 1010, 1035, 1060, 1080, and 1100. Note that we have selected these values in such a way that some of them are smaller and some are larger than the value of μ under the null hypothesis. This clearly satisfies the requirement of the alternative hypothesis that μ ≠ μ0, since μ0 = 980. Now using the formula given in Equation (10.9) for calculating the value of β, we have

At μ = 860:

$$1 - \beta = 1 - P\left(\frac{980 - 860}{120/\sqrt{36}} - 2.575 < Z < \frac{980 - 860}{120/\sqrt{36}} + 2.575\right) = 1 - P(3.425 < Z < 8.575) \approx 1.0$$

At μ = 880:

$$1 - \beta = 1 - P\left(\frac{980 - 880}{120/\sqrt{36}} - 2.575 < Z < \frac{980 - 880}{120/\sqrt{36}} + 2.575\right) = 1 - P(2.425 < Z < 7.575) \approx 0.9923$$

At μ = 900:

$$1 - \beta = 1 - P\left(\frac{980 - 900}{120/\sqrt{36}} - 2.575 < Z < \frac{980 - 900}{120/\sqrt{36}} + 2.575\right) = 1 - P(1.425 < Z < 6.575) \approx 0.9229$$

Similarly, we can calculate the values of the power 1 − β at the other values of μ:

μ = 925:  1 − β = 0.4305
μ = 950:  1 − β = 0.1412
μ = 975:  1 − β = 0.0124
μ = 985:  1 − β = 0.0124
μ = 1010: 1 − β = 0.1412
μ = 1035: 1 − β = 0.4305
μ = 1060: 1 − β = 0.9229
μ = 1080: 1 − β = 0.9923
μ = 1100: 1 − β = 1.0

Now the power curve for the test in Example 10.2 is obtained by plotting the values of μ versus the values of 1 − β. The power curve is shown in Figure 10.7.
It is important to remember that there is no analytic relationship between the type I error (α) and the type II error (β); that is, there does not exist any function φ such that β = φ(α). However, one can easily see that if everything except α and β is fixed, then as α increases β decreases, and as α decreases β increases.

Figure 10.7 Power curve for the test in Example 10.2.
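The whole curve can be generated with a few lines of Python (an added illustration, assuming scipy); the function below implements Equation (10.9):

import math
from scipy.stats import norm

def power_two_tail(mu1, mu0=980, sigma=120, n=36, alpha=0.01):
    """Power 1 - beta of the two-tail z-test, per Equation (10.9)."""
    d = (mu0 - mu1) / (sigma / math.sqrt(n))
    z = norm.ppf(1 - alpha / 2)                  # about 2.575 for alpha = 0.01
    beta = norm.cdf(d + z) - norm.cdf(d - z)
    return 1 - beta

for mu in (860, 900, 950, 985, 1035, 1100):
    print(mu, round(power_two_tail(mu), 4))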

10.2.2 Population Variance Is Unknown

Let X1, X2, X3, ..., Xn be a random sample from a population with probability density function f(x) with unknown mean μ and unknown variance σ². Let X̄ be the sample mean and S² be the sample variance. Then we want to test one of the hypotheses defined below in step 1 at the α level of significance, say α = 0.05, assuming that the sample size is large (n ≥ 30).
Step 1
(i) H0: μ = μ0 versus H1: μ < μ0,
(ii) H0: μ = μ0 versus H1: μ > μ0, or
(iii) H0: μ = μ0 versus H1: μ ≠ μ0
Step 2 α = 0.05
Step 3 We consider as a test statistic the pivotal quantity for μ

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{10.17}$$

Note that in section 10.2.1 we considered (X̄ − μ)/(σ/√n) as a pivotal quantity, since in that case we knew σ. But in the present case we do not know σ and, therefore, (X̄ − μ)/(σ/√n) is not a pivotal quantity, since a pivotal quantity must not contain any unknown parameter other than the one under consideration, which in this case is μ.


Step 4 Since the sample size n is large, it can be shown that when σ is unknown the pivotal quantity (X̄ − μ)/(S/√n) is approximately distributed as standard normal N(0, 1), that is, normal with mean 0 and standard deviation 1.
Step 5 Find the rejection regions. Since the location of the rejection region depends upon the alternative hypothesis and its size depends upon the size of the type I error, set in this case at 0.05, the rejection regions for all three hypotheses are as shown in Figure 10.8.
As mentioned earlier, because of the location of the rejection regions the hypotheses (i), (ii), and (iii) are sometimes known as lower-tail, upper-tail, and two-tail hypotheses, respectively.
Step 6 Calculate the value of the test statistic and make the decision. We now take a random sample from the given population and calculate the value of the test statistic (X̄ − μ)/(S/√n).
Note that in the test statistic n is known, X̄ and S are calculated using the sample data, and the value of μ is always taken equal to μ0, that is, the value of μ under the null hypothesis. Then, if the value of the test statistic falls in the rejection region, we contradict our assertion and reject H0. Otherwise we do not reject H0.

Figure 10.8 Rejection regions for hypotheses (i), (ii), and (iii): critical points −1.645 for (i), 1.645 for (ii), and ±1.96 for (iii).

Example 10.4 A tire manufacturing company claims that its top-of-the-line tire lasts, on average, at least 61,000 miles. A consumer group tested 64 of these tires to check the claim. The data collected by this group yielded X̄ = 60,000 miles and standard deviation S = 4,000 miles. Test at the α = 0.05 level of significance the validity of the company's claim. Find the p-value. Also, find the size of the type II error, β, at μ1 = 60,500 miles.

Solution:
Step 1 H0: μ ≥ 61,000 versus H1: μ < 61,000
This is equivalent to testing

H0: μ = 61,000 versus H1: μ < 61,000

since if we favor H1: μ < 61,000 and reject H0: μ = 61,000, we will certainly reject H0: μ ≥ 61,000.
Step 2 α = 0.05
Step 3 Since the population standard deviation is unknown, we use

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

as the test statistic.
Step 4 Since the sample size is 64 (≥ 30), the test statistic Z is distributed as standard normal.
Step 5 Since the test is a lower-tail test and α = 0.05, the rejection region is as shown in Figure 10.9.
Step 6 The value of the test statistic is

$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{60{,}000 - 61{,}000}{4{,}000/\sqrt{64}} = -2.0,$$

which falls in the rejection region. Thus, we reject the null hypothesis H0.
The p-value for the test is given by

p-value = P(Z ≤ z) = P(Z ≤ −2.0) = 0.0228,

Figure 10.9 Rejection region under the lower tail with α = 0.05 (critical point −1.645).


and the type II error β at μ1 = 60,500 miles, using Equation (10.7), is given by

$$\beta = P\left(Z > \frac{\mu_0 - \mu_1}{S/\sqrt{n}} - z_\alpha\right) = P\left(Z > \frac{61{,}000 - 60{,}500}{4{,}000/\sqrt{64}} - 1.645\right) = P(Z > -0.645) \approx 0.7405.$$
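The same β can be verified numerically (an added Python sketch, assuming scipy):

import math
from scipy.stats import norm

mu0, mu1, s, n, alpha = 61000, 60500, 4000, 64, 0.05
d = (mu0 - mu1) / (s / math.sqrt(n))          # 1.0
beta = 1 - norm.cdf(d - norm.ppf(1 - alpha))  # P(Z > d - z_alpha), Equation (10.7)
print(beta)                                   # about 0.7405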

10.3 Testing Statistical Hypotheses about the
Difference between Two Population Means When
the Sample Sizes Are Large

10.3.1 Population Variances Are Known

Consider two populations, I and II, having probability density functions f1(x) and f2(x) with unknown means μ1 and μ2 and known variances σ1² and σ2², respectively. Let X11, X12, X13, ..., X1n1 and X21, X22, X23, ..., X2n2 be random samples from the two populations, where n1, n2 ≥ 30, that is, the sample sizes are large. Let X̄1 and X̄2 be the sample means of the samples from populations I and II, respectively. Then we are interested in testing one of the hypotheses defined below, in step 1, at the α level of significance.
To test these hypotheses when the variances are known, we go through the same six steps that we used in section 10.2.1 for testing hypotheses about the mean of one population. Thus, we have
Step 1
(i) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0,
(ii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0, or        (10.18)
(iii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0
Step 2 Assign a predetermined value to α, the type I error. Suppose α = 0.05.
Step 3 Determine a suitable test statistic. We consider, again, the pivotal quantity for μ1 − μ2,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \tag{10.19}$$

as the test statistic.
Step 4 Determine the sampling distribution of the test statistic.

Since the sample sizes are large, using the central limit theorem and Theorem 9.1, we can easily show that the test statistic in Equation (10.19) is distributed as standard normal, that is, normal with mean 0 and standard deviation 1.
Step 5 Find the rejection regions. As explained in section 10.2, the location of the rejection regions is determined by the alternative hypothesis, and their size is determined by the size of the type I error α. Using α = 0.05, the rejection region for each of the above hypotheses is shown in Figure 10.10.

Figure 10.10 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance.
Step 6 Now take two samples, one from each of populations I and II, and calculate the sample means. Then calculate the observed value of the test statistic; if it falls in the rejection region, reject the null hypothesis H0. Otherwise do not reject H0.

Example 10.5 Suppose two random samples, one from each of population I and population II, with known variances σ1² = 23.4 and σ2² = 20.6, yielded the following sample statistics:

n1 = 50, X̄1 = 38.5
n2 = 45, X̄2 = 35.8

Test at the α = 0.05 level of significance the hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0.

Solution:
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0
Step 2 α = 0.05


Step 3 The test statistic for testing the hypothesis in step 1 is

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

Step 4 Since the sample sizes are large, using the central limit theorem we can easily show that the test statistic is distributed as standard normal, that is, normal with mean 0 and standard deviation 1.
Step 5 The hypothesis in this example is clearly an upper-tail hypothesis. The rejection region is as shown in Figure 10.11.

Figure 10.11 Rejection region under the upper tail with α = 0.05 (critical point 1.645).

Step 6 Substituting the value of X1 and X2, 12 and 12, and the
value of 1  2  0, under the null hypothesis, in the test
statistic, the observed value of the test statistic is
Z=

( 38.5 35.8 ) 0
= 2.806
23.4 20.6
+
50
45

Clearly this value falls in the rejection region. Thus, we reject the null
hypothesis of equal means. In other words, based upon the given information, we can conclude, at the   0.05 level of signicance, that population
means are not equal.
The p-value in this example (see Equation (10.15)) can be found, using
the normal tables.
p-value  P(Z z)
 P(Z 2.806)  0.0026
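A compact Python sketch (an added illustration, assuming scipy) reproduces this test:

import math
from scipy.stats import norm

n1, xbar1, var1 = 50, 38.5, 23.4
n2, xbar2, var2 = 45, 35.8, 20.6
z = (xbar1 - xbar2) / math.sqrt(var1 / n1 + var2 / n2)  # 2.806
p_value = 1 - norm.cdf(z)            # upper-tail test, Equation (10.15)
print(z, p_value)                    # 2.806, about 0.0025 (tables give 0.0026)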
Example 10.6 A supplier furnishes two types of filament, type I and type II, to a manufacturer of electric bulbs. Suppose an electrical engineer in the manufacturing company wants to compare the average resistance of the two types of filament. In order to do so, he takes two samples, of size n1 = 36 of filament type I and size n2 = 40 of filament type II. The two samples yield sample means X̄1 = 7.35 ohms and X̄2 = 7.65 ohms, respectively. If from experience it is known that the standard deviations of the two filaments are σ1 = 0.50 ohms and σ2 = 0.64 ohms, respectively, test at the α = 0.05 level of significance the
hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0. Find the p-value of the test.

Solution:
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0
Step 2 α = 0.05
Step 3 The test statistic for testing the above hypothesis is the pivotal quantity for μ1 − μ2, that is,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$$

Step 4 Since the sample sizes n1 = 36 and n2 = 40 are large, we can easily show that the test statistic in step 3 is distributed as standard normal, that is, normal with mean 0 and standard deviation 1.
Step 5 Since the test is a two-tail test and α = 0.05, the rejection regions are as shown in Figure 10.12.

Figure 10.12 Rejection regions under the two tails with α = 0.05 (critical points ±1.96).
Step 6 Using the information provided by the two samples and the fact that under the null hypothesis μ1 − μ2 = 0, the observed value of the test statistic is

$$Z = \frac{(7.35 - 7.65) - 0}{\sqrt{\dfrac{(0.50)^2}{36} + \dfrac{(0.64)^2}{40}}} = -2.29$$

The observed value of the test statistic falls in the rejection region, and we reject the null hypothesis H0. In other words, we conclude at the α = 0.05 level of significance that the two filaments have different resistances. The p-value of this two-tail test is 2P(Z ≥ |−2.29|) = 2(0.0110) = 0.0220.
10.3.2 Population Variances Are Unknown

Consider two populations with probability distribution models f1(x) and f2(x), with means μ1 and μ2 and variances σ1² and σ2², respectively. Let X11, X12, X13, ..., X1n1 and X21, X22, X23, ..., X2n2 be random samples from populations I and II, respectively. Let X̄1 and X̄2 be the sample means and S1² and S2² be the sample variances of the two samples. Then we are interested in testing one of the hypotheses defined below in step 1 at the α level of significance.
The method for testing these hypotheses when the variances are unknown is exactly the same as when the variances are known, discussed in section 10.3.1, except that the population variances are replaced with the sample variances. We proceed as follows:
Step 1 Define the null and alternative hypotheses:
(i) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0,
(ii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0, or        (10.20)
(iii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0
Step 2 Assign the predetermined value to α, the level of significance. Suppose α = 0.05.
Step 3 Determine a suitable test statistic. Since the population variances are unknown, we consider as the test statistic the pivotal quantity for μ1 − μ2, that is,

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}} \tag{10.21}$$

Step 4 Determine the probability distribution of the test statistic. Since the sample sizes are large, using the central limit theorem and the result of Theorem 9.4, we can show that the test statistic Z in Equation (10.21) is distributed as standard normal, that is, normal with mean 0 and standard deviation 1. Note that when the sample sizes are large, the distribution of the test statistic continues to be approximately normal when the population variances are replaced by the sample variances. Therefore the probability distribution of the test statistic in step 3 is the same as the one considered in section 10.3.1, where the population variances were known.
Step 5 Using the same argument as in section 10.3.1, it can easily be seen that the rejection regions are as shown in Figure 10.13.
Step 6 Now take one sample from each of the two populations. Calculate the sample means and the sample variances, then calculate the observed value of the test statistic. If the calculated value of the test statistic falls in the rejection region, we reject the null hypothesis H0. Otherwise do not reject H0.

Figure 10.13 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance.

Example 10.7 Rotor shafts of the same diameter are being manufactured at two different facilities of a manufacturing company. A random sample of size n1 = 72 rotor shafts from one facility produced a mean diameter of 0.536 inch with a standard deviation of 0.007 inch, while a sample of size n2 = 60 from the second facility produced a mean diameter of 0.540 inch with a standard deviation of 0.01 inch.
(i) Test the null hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0 at the α = 0.05 level of significance.
(ii) Find the p-value for the test in part (i).
(iii) Find the size of the type II error β and the power of the test if the true value of μ1 − μ2 is −0.002.
Solution:
(i)
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0
Step 2 α = 0.05
Step 3 The test statistic (see Equation (10.21)) for testing the hypothesis in step 1 is

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$$

Step 4 Since the sample sizes are large, it can easily be seen that the test statistic Z in step 3 is distributed as standard normal N(0, 1).
Step 5 Since the test is a two-tail test and the level of significance is α = 0.05, the rejection regions are as shown in Figure 10.14.
Step 6 Substituting the values of the sample means, the sample standard deviations, and the value μ1 − μ2 = 0 into the test

Figure 10.14 Rejection regions for a two-tail test with α = 0.05 (critical points ±1.96).

statistic given in Equation (10.21), the observed value of the test statistic is

$$Z = \frac{(0.536 - 0.540) - 0}{\sqrt{(0.007)^2/72 + (0.01)^2/60}} = -2.61,$$

which obviously falls in the rejection region. Thus, we reject the null hypothesis H0.
(ii) Since the test is a two-tail test, the p-value of the test is given by

p-value = 2P(Z ≤ −2.61) = 2(0.0045) = 0.009

(iii) Using Formula (10.9) for calculating the type II error β, with the standard error of (X̄1 − X̄2) estimated by √(S1²/n1 + S2²/n2), we get

$$\beta = P\left(\frac{(\mu_1 - \mu_2)_0 - (\mu_1 - \mu_2)_1}{\hat\sigma_{(\bar{X}_1 - \bar{X}_2)}} - z_{\alpha/2} \le Z \le \frac{(\mu_1 - \mu_2)_0 - (\mu_1 - \mu_2)_1}{\hat\sigma_{(\bar{X}_1 - \bar{X}_2)}} + z_{\alpha/2}\right)$$

$$= P\left(\frac{0 - (-0.002)}{\sqrt{(0.007)^2/72 + (0.01)^2/60}} - 1.96 \le Z \le \frac{0 - (-0.002)}{\sqrt{(0.007)^2/72 + (0.01)^2/60}} + 1.96\right)$$

$$= P(1.30 - 1.96 \le Z \le 1.30 + 1.96) = P(-0.66 \le Z \le 3.26) \approx 0.745.$$

Thus, the power of the test is 1 − β = 1 − 0.745 = 0.255.

10.4 Testing Statistical Hypotheses about One
Population Mean When Sample Size Is Small

Quite often it is not possible to take large samples, perhaps due to budgetary problems or time constraints. Other times it happens because sampling and testing means destroying the sampled product, and the product may be so expensive that it may not be economical to take large samples. Under such circumstances the experimenter may have to evaluate whether it is more beneficial to take a larger sample or to take a smaller sample and accept somewhat less accurate results. So she may choose to take a smaller sample, or she may have no choice other than to take a smaller sample. In this and the next two sections we work with the problem of testing hypotheses when sample sizes are small.
In this section, we assume that the sample is drawn from a population distributed normally with an unknown mean μ and a variance σ² that may or may not be known.
10.4.1 Population Variance Is Known

Let X1, X2, X3, ..., Xn be a random sample from a normal population with an unknown mean μ and a variance σ² that is known. Let X̄ be the sample mean. We would like to test one of the hypotheses defined in step 1 at the α level of significance. As before, we use the six-step method to test these hypotheses.
Step 1 Define the null and alternative hypotheses:
(i) H0: μ = μ0 versus H1: μ < μ0,
(ii) H0: μ = μ0 versus H1: μ > μ0, or        (10.22)
(iii) H0: μ = μ0 versus H1: μ ≠ μ0
Step 2 Assign a predetermined value to the type I error α, say α = 0.05.
Step 3 Determine a suitable test statistic. Since the variance is known and the population is normally distributed, we consider the pivotal quantity for μ,

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}},$$

as the test statistic.
Step 4 Find the probability distribution of the test statistic. Since the sample has been drawn from a normal population with known variance, the test statistic in step 3 is distributed as standard normal N(0, 1).
Step 5 Find the rejection region. Using the arguments discussed in sections 10.2 and 10.3, it can be shown that the rejection regions are as shown in Figure 10.15.
Step 6 Calculate the observed value of the test statistic and make a decision. Use the sample data to calculate the sample mean X̄. Then calculate the value of the test statistic under the null hypothesis H0: μ = μ0. If the observed value of the test statistic falls in the rejection region, then we reject the null hypothesis H0. Otherwise do not reject H0.


Figure 10.15 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance.

Example 10.8 The workers' union of a large corporation located in a big city demands that each worker be compensated for traveling time to work, since it takes each worker, on average, at least 75 minutes to travel to the job. In order to verify the union's claim, the director of human resources took a random sample of 16 workers and found that the average traveling time for these workers was 68 minutes. Assume that from experience the director knows that traveling times are normally distributed (in applications this condition must be verified) with a standard deviation σ = 10 minutes. Do these data provide sufficient evidence to support the union's claim? Use α = 0.05. Find the p-value. Find the size of the type II error β if the true travel time is μ1 = 72 minutes.
Solution:
Step 1 H0: μ = 75 versus H1: μ < 75
It may seem more reasonable to consider the null hypothesis H0: μ ≥ 75 instead of H0: μ = 75. But note that this is not necessary, since if H0: μ = 75 is rejected in favor of H1: μ < 75, it is clear that we shall certainly reject H0: μ ≥ 75.
Step 2 α = 0.05
Step 3 The test statistic for testing the above hypothesis is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Step 4 Since the population is assumed to be normal, the test statistic Z is distributed as standard normal N(0, 1).
Step 5 Since the test is a lower-tail test, the rejection region is as shown in Figure 10.16.
Step 6 Under the null hypothesis μ = 75 and, from the given information, we have X̄ = 68, n = 16, and σ = 10. Thus, the value of the test statistic is

$$Z = \frac{68 - 75}{10/\sqrt{16}} = \frac{-7}{2.5} = -2.8,$$
Figure 10.16 Rejection region under the lower tail with α = 0.05 (critical point −1.645).

which falls in the rejection region. Thus, we reject the null hypothesis H0 and conclude that, based upon the data, the travel time is less than 75 minutes.
The p-value is given by

p-value = P(Z ≤ z) = P(Z ≤ −2.8) = 0.0026

To find the type II error, we proceed as follows:

$$\beta = P\left(Z > \frac{\mu_0 - \mu_1}{\sigma/\sqrt{n}} - z_\alpha\right) = P\left(Z > \frac{75 - 72}{10/\sqrt{16}} - 1.645\right) = P(Z > -0.445) \approx 0.6718$$


Example 10.9 The advising office at a university claims that freshmen spend on average 10 hours per week watching television. A random sample of 25 freshmen showed that these students spend an average of 10.5 hours watching television per week. From experience it is known that the time freshmen spend watching television is normally distributed with standard deviation σ = 2.5 hours. Test at the α = 0.05 level of significance whether there is sufficient evidence to indicate the validity of the advising office's claim. Find the p-value.

Solution:
Step 1 H0: μ = 10 versus H1: μ ≠ 10
Step 2 α = 0.05
Step 3 The test statistic is

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

Step 4 Since the population is normal with a known standard deviation, the test statistic in step 3 is distributed as standard normal N(0, 1).


Figure 10.17 Rejection regions under the two tails with α = 0.05 (critical points ±1.96).

Step 5 Since the test is a two-tail test and α = 0.05, the rejection regions are as shown in Figure 10.17.
Step 6 Under the null hypothesis μ = 10, and from the information provided to us we have X̄ = 10.5, n = 25, and σ = 2.5, so the value of the test statistic is

    Z = (10.5 − 10)/(2.5/√25) = 0.5/0.5 = 1.0,

which does not fall in the rejection region. Thus, we do not reject the null hypothesis H0. In other words, the data support the advising office's claim that μ = 10 at the α = 0.05 level of significance.
The p-value is given by

    p-value = P(Z ≤ −z) + P(Z ≥ z) = 2P(Z ≥ |z|) = 2(0.1587) = 0.3174.
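For this two-tail test only the p-value computation changes from the previous sketch; in Python with SciPy (again our own code):

    from scipy.stats import norm

    z = (10.5 - 10) / (2.5 / 25 ** 0.5)       # observed statistic: 1.0
    p_value = 2 * (1 - norm.cdf(abs(z)))      # two-tail p-value: 2(0.1587) = 0.3174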
10.4.2 Population Variance Is Unknown
In this section, as in Chapter 9, we shall invoke the use of Student's t-distribution with (n − 1) degrees of freedom. Note that we use the t-distribution only when all of the following conditions hold:
(i) The sampled population is at least approximately normal.
(ii) The sample size is small (n < 30).
(iii) The population variance is unknown.
Let X1, X2, X3, . . . , Xn be a random sample from a normal population with unknown mean μ and unknown variance σ². Let X̄ and S be the sample mean and the sample standard deviation, respectively. Then we want to test the following hypotheses about the mean μ, defined in step 1, at the α level of significance.
The six-step method to test these hypotheses is as follows:
Step 1 Define the null and alternative hypotheses.

(i) H0: μ = μ0 versus H1: μ > μ0,
(ii) H0: μ = μ0 versus H1: μ < μ0,
or
(iii) H0: μ = μ0 versus H1: μ ≠ μ0.    (10.23)

Step 2 Assign a predetermined value to α, the level of significance; say α = 0.05.
Step 3 Since the population variance σ² is unknown, we consider the pivotal quantity for μ as a test statistic, that is,

    T = \frac{\bar{X} - \mu}{S/\sqrt{n}}    (10.24)

Step 4 Since the sample has been drawn from a normal population with an unknown variance σ², it can be shown that the test statistic in step 3 is distributed as Student's t-distribution with (n − 1) degrees of freedom.
Step 5 Since from step 4 we know that the test statistic T is distributed as Student's t-distribution with (n − 1) degrees of freedom, the rejection regions at the α level of significance are as shown in Figure 10.18.
Step 6 Use the sample data to calculate the sample mean X̄ and the sample standard deviation S. Then, using these values of X̄ and S and the value of μ under the null hypothesis, calculate the observed value of the test statistic. If the observed value of the test statistic falls in the rejection region, we reject H0 at the α level of significance. Otherwise do not reject H0.
Figure 10.18 Rejection regions for testing hypotheses (i), (ii), and (iii) at the given α: (i) t > t_{n−1,α}; (ii) t < −t_{n−1,α}; (iii) |t| > t_{n−1,α/2}.

Example 10.10 A tool assembling company believes that a worker should take no more than 30 minutes to assemble a particular tool. A sample of 16 workers who assembled that tool showed that the average time was X̄ = 33 minutes with a standard deviation S = 6 minutes. Test at the α = 0.05 level of significance if the data provide sufficient evidence to indicate the validity


of the company's belief. Assume that the assembly times are normally distributed. Find the p-value.
Solution:
Step 1 H0: μ = 30 versus H1: μ > 30
Step 2 α = 0.05
Step 3 The test statistic is

    T = (X̄ − μ)/(S/√n)

Step 4 Since the population is normal with an unknown standard deviation, the test statistic in step 3 is distributed as Student's t-distribution with 15 degrees of freedom, since n = 16.
Step 5 Since the test is an upper-tail test and α = 0.05, the rejection region is as shown in Figure 10.19.
Step 6 We find the observed value of the test statistic, which is given by

    T = (33 − 30)/(6/√16) = 2.0

Since the value of the test statistic T = 2.0 falls in the rejection region, we reject the null hypothesis H0.

Figure 10.19 Rejection region under the upper tail with α = 0.05 (critical value t15,0.05 = 1.753).

To find the exact p-value, we would have to integrate the density function of the t-distribution with 15 degrees of freedom between the limits 2.0 and ∞. But this is beyond the scope of this book. Thus, we content ourselves with finding a pair of values within which the p-value falls, which can be done simply by using the t-distribution table (Table IV of the appendix). To achieve this goal, we proceed as follows.
We find two entries in the t-distribution table with 15 degrees of freedom such that one value is just smaller and the other is just larger than the observed

value of the test statistic. In this case these entries are 1.753 and 2.131. This implies that

    P(T ≥ 2.131) < P(T ≥ 2.0) < P(T ≥ 1.753)

or

    0.025 < p-value < 0.05

since the values 1.753 and 2.131 correspond to upper-tail areas of 0.05 and 0.025, respectively. Note that sometimes the value of the test statistic is either so small or so large that it is not possible to find two table entries that enclose it. Using the case above, suppose the observed value of the test statistic is t = 3.0. We find only one entry, 2.947, which is just smaller than t = 3.0, and there is no larger entry. In this case the p-value is

    p-value = P(T ≥ 3.0) < P(T ≥ 2.947) = 0.005

That is, the p-value is less than 0.005.
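The bracketing step above exists only because exact t-distribution tail areas are hard to obtain from printed tables. Software returns them directly; a minimal Python/SciPy sketch for Example 10.10 (our own variable names):

    from scipy.stats import t

    n, xbar, s, mu0 = 16, 33, 6, 30
    T = (xbar - mu0) / (s / n ** 0.5)       # observed statistic: 2.0
    p_value = 1 - t.cdf(T, df=n - 1)        # exact upper-tail area, about 0.032
    t_crit = t.ppf(0.95, df=n - 1)          # 1.753; reject H0 since T > t_crit

The exact p-value 0.032 indeed falls inside the interval (0.025, 0.05) found from the table.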

10.5 Testing Statistical Hypotheses about the Difference between Two Population Means When Sample Sizes Are Small

In this section we shall assume that both sampled populations are normally distributed with means μ1 and μ2 and variances σ1² and σ2², respectively, where the means are unknown but the variances may or may not be known.
Let X11, X12, X13, ..., X1n1 and X21, X22, X23, ..., X2n2 be random samples from populations I and II, respectively. Let X̄1 and X̄2 be the sample means and S1² and S2² the sample variances of the samples from populations I and II, respectively. We are interested in testing, at the α level of significance, one of the following hypotheses about the difference of the two population means:

(i) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0,
(ii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0,
or
(iii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0.    (10.25)

We shall consider three possible scenarios:
1. Population variances σ1² and σ2² are known.
2. Population variances σ1² and σ2² are unknown, but we can assume that they are equal, that is, σ1² = σ2² = σ².
3. Population variances σ1² and σ2² are unknown, but we cannot assume that they are equal, that is, σ1² ≠ σ2².


For testing each of these hypotheses, we shall consider the pivotal quantity for μ1 − μ2 as the test statistic; its form will depend upon whether the population variances are known. Since under the last two scenarios we shall be using Student's t-distribution, it is quite important to review the conditions under which we use Student's t-distribution.
When to use Student's t-distribution under scenarios 2 and 3:
1. The sampled populations are at least approximately normal.
2. The samples are independent and at least one of the sample sizes is small.
3. The population variances σ1² and σ2² are unknown.
Note that under scenario 2 we assume that σ1² = σ2² = σ², which implies that, as far as the variance is concerned, the two populations are identical. Thus, as discussed in Chapter 9, to estimate the common unknown variance σ² we use the information from both samples. Such an estimator, denoted by Sp² and usually known as the pooled estimator of σ², is defined as

    S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}    (10.26)

Having said all this, we now proceed to consider the three scenarios one by one.

10.5.1 Population Variances σ1² and σ2² Are Known

We have two normal populations with unknown means μ1 and μ2 but with known variances σ1² and σ2². Thus, if we take

    Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}    (10.27)

the pivotal quantity for μ1 − μ2, as the test statistic, then, as in Chapter 9, we can show that the sampling distribution of the test statistic is standard normal N(0,1). This means that this case turns out to be exactly the same as the one we studied in section 10.4.1. We further illustrate this case with the following example.
Example 10.11 Two brands of motor fuel are being tested for their octane number. From experience it is known that the octane numbers of each brand are normally distributed with σ1 = 1.5 and σ2 = 1.5. Further suppose that two random samples of sizes n1 = 12 and n2 = 16 from the two brands produced mean octane numbers X̄1 = 92.8 and X̄2 = 90.1. Test at the α = 0.01 level of significance the hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0. Find the p-value. Find the power of the test if the true mean difference is μ1 − μ2 = 2.


Solution:
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0
Step 2 α = 0.01
Step 3 The test statistic is

    Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}

Step 4 Since the populations are normal with known standard deviations, the test statistic given in step 3 is distributed as standard normal N(0,1).
Step 5 Since the test is an upper-tail test and α = 0.01, the rejection region is as shown in Figure 10.20.
Step 6 Under the null hypothesis μ1 − μ2 = 0, and from the given information we have X̄1 = 92.8, X̄2 = 90.1, and σ1 = σ2 = 1.5, so the value of the test statistic is

    Z = ((92.8 − 90.1) − 0)/√((1.5)²/12 + (1.5)²/16) = 2.7/√0.328 = 4.71

This value of the test statistic clearly falls in the rejection region. Thus, we reject the null hypothesis H0.

    p-value = P(Z ≥ z) = P(Z ≥ 4.71) ≈ 0

Figure 10.20 Rejection region under the upper tail with α = 0.01 (critical value 2.33).


To find the power of the test, we first find β, the type II error (see Equation (10.8)), which is given by

    β = P(Z < ((μ1 − μ2)_0 − (μ1 − μ2)_1)/√(σ1²/n1 + σ2²/n2) + z_{0.01})
      = P(Z < (0 − 2)/√((1.5)²/12 + (1.5)²/16) + 2.33)
      = P(Z < −3.49 + 2.33)
      = P(Z < −1.16) = 0.1230

Thus, the power of the test is 1 − β = 1 − 0.1230 = 0.8770.
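A short Python/SciPy sketch of both the test and the power calculation for this example (our own code; small differences from the text's values reflect table rounding):

    from scipy.stats import norm

    n1, n2, sigma1, sigma2 = 12, 16, 1.5, 1.5
    xbar1, xbar2, alpha = 92.8, 90.1, 0.01
    se = (sigma1 ** 2 / n1 + sigma2 ** 2 / n2) ** 0.5   # about 0.573
    z = ((xbar1 - xbar2) - 0) / se                      # about 4.71
    p_value = 1 - norm.cdf(z)                           # essentially 0
    # power against the true difference mu1 - mu2 = 2
    beta = norm.cdf((0 - 2) / se + norm.ppf(1 - alpha))
    power = 1 - beta                                    # about 0.88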

10.5.2 Population Variances σ1² and σ2² Are Unknown but σ1² = σ2² = σ²

Consider two normal populations with unknown means μ1 and μ2 and an unknown common variance σ². We want to test hypotheses about μ1 − μ2, the difference between the two population means, at the α level of significance. To achieve this goal we proceed as follows:
Step 1
(i) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0,
(ii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0,
or
(iii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0.    (10.28)
Step 2 Assign a suitable predetermined value to α.
Step 3 We consider the pivotal quantity for μ1 − μ2 as the test statistic, that is,

    T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}    (10.29)

where the pooled estimator Sp² of the common variance σ² is given by

    S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}    (10.30)
Figure 10.21 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α level of significance: (i) t > t_{n1+n2−2,α}; (ii) t < −t_{n1+n2−2,α}; (iii) |t| > t_{n1+n2−2,α/2}.

Step 4 Since the populations under consideration are normal with common variance σ², and Sp² is an estimator of σ², the test statistic given in step 3 is distributed as t-distribution with (n1 + n2 − 2) degrees of freedom.
Step 5 The rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance are as shown in Figure 10.21.
Step 6 Now take two random samples, one from each population. Calculate the sample means X̄1 and X̄2 and the pooled estimator Sp² of the populations' common variance σ². Calculate the observed value of the test statistic given in step 3 by substituting the values of X̄1, X̄2, the pooled estimator Sp², and the value of μ1 − μ2 under the null hypothesis H0. If the observed value of the test statistic falls in the rejection region, then we reject the null hypothesis H0. Otherwise do not reject H0.
Example 10.12 Suppose that in Example 10.11 the only information we have about the populations is that they are normally distributed with a common variance, but we have no information about the value of that variance. Two random samples, one from each population, of sizes n1 = 14 and n2 = 16 produce mean octane numbers X̄1 = 92.7 and X̄2 = 89.8 with sample standard deviations S1 = 1.6 and S2 = 1.5. Test at the α = 0.025 level of significance the hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0. Find the p-value.
Solution:
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0
Step 2 α = 0.025
Step 3 The test statistic is

    T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{S_p\sqrt{1/n_1 + 1/n_2}}

Step 4 Since the sampled populations are normal with unknown but equal variances, the test statistic given in step 3 is distributed as t-distribution with (n1 + n2 − 2) = 28 degrees of freedom.
Step 5 The test in this problem is an upper-tail test. Thus, the rejection region with α = 0.025 under the upper tail is as shown in Figure 10.22.

Figure 10.22 Rejection region under the upper tail with α = 0.025 (critical value t28,0.025 = 2.048).
Step 6 The pooled estimator Sp² of σ² is given by

    S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} = \frac{13(1.6)^2 + 15(1.5)^2}{14 + 16 - 2} = 2.39

Using Sp² = 2.39 (so Sp ≈ 1.55), X̄1 = 92.7, X̄2 = 89.8, and the value of μ1 − μ2 under the null hypothesis, that is, μ1 − μ2 = 0, we have the observed value of the test statistic as

    T = ((92.7 − 89.8) − 0)/(1.55 √(1/14 + 1/16)) ≈ 5.13

Clearly this value of the test statistic falls in the rejection region. Thus, we reject the null hypothesis H0. In other words, we can say at the α = 0.025 level of significance that fuel one has a higher octane number.

    p-value = P(t ≥ 5.13) < P(t ≥ 2.763) = 0.005

Thus, the p-value of the test is less than 0.005.
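SciPy's ttest_ind_from_stats performs the pooled two-sample t-test directly from summary statistics such as these; a sketch (our own code, not the book's):

    from scipy.stats import ttest_ind_from_stats

    # pooled (equal-variance) two-sample t-test from summary statistics
    res = ttest_ind_from_stats(mean1=92.7, std1=1.6, nobs1=14,
                               mean2=89.8, std2=1.5, nobs2=16,
                               equal_var=True)
    t_stat = res.statistic        # about 5.13
    p_one_sided = res.pvalue / 2  # halve the two-sided p-value for H1: mu1 - mu2 > 0

The one-sided p-value is far below 0.005, consistent with the table bound above.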


10.5.3 Population Variances σ1² and σ2² Are Unknown and σ1² ≠ σ2²

Consider two normal populations with unknown means μ1 and μ2 and unknown variances σ1² and σ2², respectively. In this case the variances σ1² and σ2² cannot be assumed to be equal. We want to test hypotheses about μ1 − μ2, the difference between the two population means, at the α level of significance. To achieve this goal we proceed as follows:
Step 1
(i) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 > 0,
(ii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 < 0,
or
(iii) H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0.    (10.31)

Step 2 Assign a suitable predetermined value to α.
Step 3 We consider the pivotal quantity for μ1 − μ2 as the test statistic, that is,

    T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}    (10.32)

Step 4 The two populations are normal with unknown variances that cannot be assumed to be equal. Thus, in this case the test statistic T given in step 3 is distributed as Student's t-distribution with approximately m degrees of freedom, where

    m = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}}    (10.33)

Step 5 The rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance are as shown in Figure 10.23.
Step 6 Substituting the values X̄1, X̄2, S1², S2², and μ1 − μ2 = 0 (since under the null hypothesis μ1 − μ2 = 0), we find the observed value t of the test statistic T given in step 3. If the observed value falls in the rejection region, we reject the null hypothesis H0. Otherwise we do not reject H0.


Figure 10.23 Rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance: (i) t > t_{m,α}; (ii) t < −t_{m,α}; (iii) |t| > t_{m,α/2}.

Example 10.13 A new weight control company, A, claims that persons who use its program regularly for a certain period of time lose the same amount of weight as those who use the program of a well-established company, B, for the same period of time. A random sample of n1 = 12 persons who used company A's program lost on average 20 pounds with a standard deviation of 4 pounds, while another sample of n2 = 10 persons who used company B's program for the same period lost on average 22 pounds with a standard deviation of 3 pounds. Determine at the α = 0.01 level of significance whether the data provide sufficient evidence to support company A's claim. Find the p-value of the test. We assume that the two population variances are not equal.
Solution:
Step 1 H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0.
Step 2 Assign a suitable predetermined value to α; say α = 0.01.
Step 3 We consider the pivotal quantity for μ1 − μ2 to be the test statistic, that is,

    T = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}

Step 4 The two populations are normal with unknown variances, which cannot be assumed to be equal. Thus, in this case the test statistic given in step 3 is distributed as t-distribution with approximately m degrees of freedom, where

    m = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\frac{(S_1^2/n_1)^2}{n_1 - 1} + \frac{(S_2^2/n_2)^2}{n_2 - 1}} = \frac{4.988}{0.2516} \approx 20

Figure 10.24 Rejection regions under the two tails with α = 0.01 (critical values ±2.845).

Step 5 Since the test is a two-tail test and α = 0.01, the rejection regions are as shown in Figure 10.24.
Step 6 Substituting the values X̄1, X̄2, S1², S2², and μ1 − μ2 = 0 (since under the null hypothesis μ1 − μ2 = 0), we find the observed value of the test statistic given in step 3:

    T = ((20 − 22) − 0)/√(16/12 + 9/10) ≈ −1.34,

which does not fall in the rejection region. Thus, we do not reject the null hypothesis H0. In other words, at the α = 0.01 level of significance, the data support the claim of company A.
Since the test is a two-tail test, the p-value is given by

    p-value = 2P(T ≤ −1.34) = 2P(T ≥ 1.34)

But

    P(T ≥ 1.725) < P(T ≥ 1.34) < P(T ≥ 1.325)

or

    0.05 < P(T ≥ 1.34) < 0.10

or

    2(0.05) < 2P(T ≥ 1.34) < 2(0.10)

or

    0.10 < p-value < 0.20

That is, the p-value of the test is somewhere between 10% and 20%.
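The same SciPy routine used for the pooled test handles this unequal-variance (Welch) case when equal_var=False; a sketch for Example 10.13 (our own code):

    from scipy.stats import ttest_ind_from_stats

    # Welch's (unequal-variance) t-test from summary statistics
    res = ttest_ind_from_stats(mean1=20, std1=4, nobs1=12,
                               mean2=22, std2=3, nobs2=10,
                               equal_var=False)
    print(res.statistic, res.pvalue)   # about -1.34 and a two-sided p near 0.20

The exact two-sided p-value of roughly 0.20 agrees with the bracket found from the table.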

10.6 Paired t-Test

In section 10.5 we studied hypotheses about the difference of two population means when we had access to two independent random samples, one from each population. Quite often, for various reasons, experiments are designed in such a way that the data are collected in pairs, that is, two observations are taken on the same subject, and consequently the samples are not independent. We encounter this kind of data in fields such as medicine, psychology, the chemical industry, and engineering. For example, a civil engineer may divide each specimen of concrete into two parts and apply two drying techniques, one technique to each part; a nurse collecting blood samples to test the serum-cholesterol level may divide each sample into two parts and send one part to one lab and the second part to another lab; a psychologist treating patients with some mental disorder may take two observations on each patient, one before the treatment and the other after the treatment; or a production engineer who wants to increase the productivity of some product by adjusting a machine may measure the productivity before and after the adjustment. Data collected in this manner are usually known as paired data. If the techniques of testing hypotheses discussed in section 10.5 are applied to paired data, our results may turn out to be inaccurate, since the samples are not independent.
These kinds of data are also sometimes known as before-and-after data. To compare the two means in such cases, we use a test known as the paired t-test.
Let (X11, X21), (X12, X22), ..., (X1n, X2n) be a set of n paired observations on n randomly selected individuals or items, where (X1i, X2i) is the pair of observations on the ith individual or item. We assume that the samples (X11, X12, X13, ..., X1n) and (X21, X22, X23, ..., X2n) come from populations with means μ1 and μ2 and variances σ1² and σ2², respectively. Clearly these two samples are not independent, since each pair (X1i, X2i) consists of two observations on the same individual. Finally, we assume that the sample of differences between the pairs, (d1, d2, d3, ..., dn), where di = X1i − X2i for i = 1, 2, 3, ..., n, comes from a normal population with mean μd = μ1 − μ2 and variance σd², where σd² is unknown. We are then interested in testing one of the following hypotheses:

(i) H0: μd = 0 versus H1: μd > 0,
(ii) H0: μd = 0 versus H1: μd < 0,
or
(iii) H0: μd = 0 versus H1: μd ≠ 0.    (10.34)

Recall the discussion in section 10.4.2 concerning testing hypotheses about one population mean μ. The problem of testing hypotheses about the mean μd falls in the same framework as the problem of testing hypotheses about the mean of a normal population with unknown variance. It follows that the test statistic for testing any one of the above hypotheses is

    T = \frac{\bar{X}_d - \mu_d}{S_d/\sqrt{n}}    (10.35)


where X̄d and Sd are, respectively, the sample mean and the sample standard deviation of the sample of differences (d1, d2, d3, ..., dn). Assuming that the population of differences is normal, it follows that the test statistic T is distributed as Student's t-distribution with (n − 1) degrees of freedom. We further illustrate this method with the following example.
Example 10.14 A manager of a manufacturing company wants to evaluate the effectiveness of a training program by measuring the productivity of those workers who went through that training. The following data show the productivity scores, before and after the training, of 10 randomly selected workers.

    Worker            1    2    3    4    5    6    7    8    9   10
    Before           75   78   76   80   79   83   70   72   72   74
    After            79   77   80   85   80   84   78   76   70   80
    di = X1i − X2i   −4    1   −4   −5   −1   −1   −8   −4    2   −6

Do the data provide sufficient evidence to indicate that the training is effective? Use α = 0.05. Find the p-value of the test.
Solution: We first calculate the sample statistics needed for testing the desired hypothesis, that is,

    X̄d = (Σdi)/n = −30/10 = −3

    S_d^2 = \frac{1}{n-1}\left(\sum d_i^2 - \frac{(\sum d_i)^2}{n}\right) = \frac{1}{9}(180 - 90) = 10

    Sd = √10 = 3.162

To test the hypothesis we again follow the six-step technique.
Step 1 H0: μd = 0 versus H1: μd < 0
Step 2 Assign a suitable predetermined value to α; say α = 0.05.
Step 3 From the above discussion, the test statistic that we use is

    T = (X̄d − μd)/(Sd/√n)

Step 4 The test statistic is distributed as Student's t-distribution (we encourage readers to check the conditions needed to use the t-distribution) with n − 1 = 9 degrees of freedom.

Figure 10.25 Rejection region under the lower tail with α = 0.05 (critical value −1.833).

Step 5 Since the test is a lower-tail test with α = 0.05, the rejection region is as shown in Figure 10.25.
Step 6 Substituting the values X̄d = −3, Sd = 3.162, and μd = 0 in the test statistic T given in step 3, we have

    T = (−3 − 0)/(√10/√10) = −3

Clearly the observed value of the test statistic falls in the rejection region. Thus, we reject the null hypothesis. In other words, the test does indicate, at the α = 0.05 level of significance, that the training program is effective.
The p-value of the test is given by

    p-value = P(T ≤ −3) = P(T ≥ 3) < 0.01.
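A sketch of this paired t-test using SciPy's ttest_rel (our own code; the routine works on the raw before and after lists and internally forms the differences):

    from scipy.stats import ttest_rel

    before = [75, 78, 76, 80, 79, 83, 70, 72, 72, 74]
    after  = [79, 77, 80, 85, 80, 84, 78, 76, 70, 80]

    res = ttest_rel(before, after)   # paired t-test on before - after
    t_stat = res.statistic           # -3.0
    p_one_sided = res.pvalue / 2     # valid here since t_stat is negative; about 0.007

The one-sided p-value of about 0.007 is indeed below 0.01, matching the table bound.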

10.7 Testing Statistical Hypotheses about Population Proportions

So far in this chapter we have discussed methods of testing hypotheses about population means. In this section we discuss techniques for testing hypotheses about population proportions. In applications it is quite common to test such hypotheses. For example, we may be interested in the percentage of defective product manufactured by a company, the percentage of a country's population infected with HIV, the proportion of a company's employees who are unhappy with their health insurance, the proportion of students in a class who have made honors, or the proportion of drivers exceeding the posted speed limit on a given highway. We first discuss testing a hypothesis about one population proportion; later in this section we discuss methods of testing hypotheses about the difference of two population proportions, that is, comparing the proportions of two populations.

10.7.1 Testing of Statistical Hypotheses about One Population Proportion When Sample Size Is Large
Let X1, X2, X3, ..., Xn be a random sample from a dichotomous population, that is, a population of Bernoulli trials with parameter p. Let X = ΣXi be the total


number of elements in the sample that possess the desired characteristic. Then from Chapter 8 we know that X/n is a point estimator of p, that is, p̂ = X/n. We also know that for large n (np ≥ 5, n(1 − p) ≥ 5), the estimator p̂ is distributed approximately as normal with mean p and variance p(1 − p)/n.
Having said that, we are now ready to discuss the method of testing a hypothesis about the population proportion p. Under the assumption that the sample size is large, we consider the following hypotheses about the population proportion:

(i) H0: p = p0 versus H1: p > p0,
(ii) H0: p = p0 versus H1: p < p0,
or
(iii) H0: p = p0 versus H1: p ≠ p0.    (10.36)

Since the method of testing these hypotheses follows the same six-step technique that we used to test hypotheses about the population mean, we illustrate the method with the following example.
Example 10.15 Environmentalists believe that sport utility vehicles (SUVs) consume an excessive amount of gasoline and are among the biggest polluters of our environment. An environmental agency wants to find what proportion of vehicles on U.S. highways are SUVs. Suppose that a random sample of 500 vehicles collected from highways in various parts of the country showed that 120 of the 500 vehicles were SUVs. Do these data provide sufficient evidence that 25% of the vehicles driven in the United States are SUVs? Use the α = 0.05 level of significance. Find the p-value of the test.
Solution: From the given information we have n = 500 and X = ΣXi = 120, thus

    p̂ = X/n = 120/500 = 0.24

Now to test the desired hypothesis we proceed as follows:
Step 1 H0: p = p0 versus H1: p ≠ p0, where p0 = 0.25
Step 2 α = 0.05
Step 3 We consider the pivotal quantity for p as the test statistic, that is,

    Z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}}    (10.37)

Step 4 Since np̂ = 500(0.24) = 120 ≥ 5 and n(1 − p̂) = 500(1 − 0.24) = 380 ≥ 5, the sample size is large. Thus, the test statistic Z in step 3 is distributed approximately as standard normal N(0,1).

Figure 10.26 Rejection regions under the two tails with α = 0.05 (critical values ±1.96).

Step 5 Since the test is a two-tail test and α = 0.05, the rejection regions are as shown in Figure 10.26.
Step 6 Since under the null hypothesis p = 0.25, and p̂ = 0.24, the value of the test statistic is

    Z = (0.24 − 0.25)/√(0.25(1 − 0.25)/500) = −0.516,

which does not fall in the rejection region. Thus, we do not reject the null hypothesis H0.
Since the test is a two-tail test, the p-value is given by

    p-value = 2P(Z ≤ z) = 2P(Z ≤ −0.516) = 2(0.3030) = 0.6060
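A minimal Python/SciPy sketch of this one-proportion test (our own code and variable names):

    from scipy.stats import norm

    n, x, p0 = 500, 120, 0.25
    p_hat = x / n                                       # 0.24
    z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5       # about -0.516
    p_value = 2 * norm.cdf(-abs(z))                     # about 0.61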
10.7.2 Testing of Statistical Hypotheses about the Difference between Two Population Proportions When Sample Sizes Are Large
Consider two binomial populations with parameters (n1, p1) and (n2, p2), respectively. Then we are usually interested in testing hypotheses such as

(i) H0: p1 = p2 versus H1: p1 < p2,
(ii) H0: p1 = p2 versus H1: p1 > p2,
or
(iii) H0: p1 = p2 versus H1: p1 ≠ p2.    (10.38)

The hypotheses in (10.38) may equivalently be written as

(i) H0: p1 − p2 = 0 versus H1: p1 − p2 < 0,
(ii) H0: p1 − p2 = 0 versus H1: p1 − p2 > 0,
or
(iii) H0: p1 − p2 = 0 versus H1: p1 − p2 ≠ 0.    (10.39)


We illustrate the method of testing the above hypotheses with the following example.
Example 10.16 A computer assembling company gets all its chips from two suppliers. The company has experienced that both suppliers supply a certain proportion of defective chips. The company wants to test three hypotheses: (i) supplier I supplies a smaller proportion of defective chips, (ii) supplier I supplies a higher proportion of defective chips, or (iii) the two suppliers do not supply the same proportion of defective chips. To achieve this goal the company took a random sample from each supplier. It was found that in one sample of 500 chips, 12 were defective, and in the second sample of 600 chips, 20 were defective. For each of the above hypotheses use the α = 0.05 level of significance. Find the p-value for each test.
Solution: From the given data we have

    n1 = 500, X1 = 12, p̂1 = X1/n1 = 12/500 = 0.024
    n2 = 600, X2 = 20, p̂2 = X2/n2 = 20/600 = 0.033

where X1 and X2 are the numbers of defective chips in samples 1 and 2, respectively. Now to test the desired hypotheses we proceed as follows:
Step 1
(i) H0: p1 − p2 = 0 versus H1: p1 − p2 < 0,
(ii) H0: p1 − p2 = 0 versus H1: p1 − p2 > 0,
or
(iii) H0: p1 − p2 = 0 versus H1: p1 − p2 ≠ 0
Step 2 α = 0.05
Step 3 We consider the pivotal quantity for p1 − p2 as the test statistic, that is,

    Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{p_1(1-p_1)/n_1 + p_2(1-p_2)/n_2}}    (10.40)

Step 4 Since n1p̂1 = 500(0.024) = 12 ≥ 5 and n1(1 − p̂1) = 500(1 − 0.024) = 488 ≥ 5, the sample size n1 is large. Similarly, we can verify that the sample size n2 is large. Thus, the test statistic Z in step 3 is distributed approximately as standard normal N(0,1).
Step 5 Since the test statistic is approximately normally distributed, the rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance are as shown in Figure 10.27.
Step 6 Since under the null hypothesis p1 − p2 = 0, we substitute the values of p̂1, p̂2, and p1 − p2 = 0 in the numerator, and the values of p1 and p2 in the denominator.

244

Chapter Ten

(i)

1.645

(ii)

1.645

(iii)

1.96

1.96

Figure 10.27 Rejection regions for testing hypotheses (i), (ii), and (iii) at the
  0.05 level of significance.

Note, however, that p1 and p2 are unknown, but under the null hypothesis p1 = p2 = p, say. Thus, we estimate p1 and p2 (or, for that matter, p) by pooling the two samples, that is,

    p̄ = (X1 + X2)/(n1 + n2).    (10.41)

We then replace p1 and p2 in the denominator with p̄. In this example we have

    p̄ = (12 + 20)/(500 + 600) = 0.029

Thus, the value of the test statistic under the null hypothesis is given by

    Z = ((0.024 − 0.033) − 0)/√((0.029)(0.971)/500 + (0.029)(0.971)/600) = −0.8857

Clearly, in all cases, the value of the test statistic does not fall in the rejection region. Thus, in each of the cases (i), (ii), and (iii), we do not reject the null hypothesis at the α = 0.05 level of significance. In other words, the data imply, at the α = 0.05 level of significance, that both suppliers supply the same proportion of defective chips.
As a final remark, it is quite interesting to note that whether we test hypothesis (i), (ii), or (iii), all the steps, including the value of the test statistic, are exactly the same except for the rejection regions. The p-values, however, differ from one hypothesis to another. We now calculate these p-values:

(i) p-value = P(Z ≤ −0.8857) = 0.1880
(ii) p-value = P(Z ≥ −0.8857) = 0.8120
(iii) p-value = 2P(Z ≤ −0.8857) = 2(0.1880) = 0.3760.
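A Python/SciPy sketch for this example (our own code). Note that working with unrounded proportions gives z ≈ −0.92; the text's −0.8857 reflects rounding p̂2 to 0.033 and p̄ to 0.029:

    from scipy.stats import norm

    x1, n1, x2, n2 = 12, 500, 20, 600
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)                        # pooled estimate, about 0.029
    se = (p_bar * (1 - p_bar) * (1 / n1 + 1 / n2)) ** 0.5
    z = (p1 - p2) / se                                   # about -0.92 unrounded
    p_lower = norm.cdf(z)            # H1: p1 < p2
    p_upper = 1 - norm.cdf(z)        # H1: p1 > p2
    p_two   = 2 * norm.cdf(-abs(z))  # H1: p1 != p2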

10.8 Testing Statistical Hypotheses about Population Variances

So far in this chapter we have used population variances and sample variances as important tools for testing hypotheses about population means and population proportions. In this section we study methods of testing hypotheses about population variances themselves. First we shall consider the case of one population variance and then that of two population variances. As in Chapter 9, we assume that the population(s) is normally distributed with mean μ and unknown variance σ².
10.8.1 Testing Statistical Hypotheses about One Population Variance
Let X1, X2, X3, ..., Xn be a random sample from a population distributed normally with mean μ and unknown variance σ². Let X̄ be the sample mean and S² the sample variance. Then from Chapter 8, under the assumption that the population is normal, we know that the pivotal quantity for σ², that is,

    \chi^2 = \frac{(n-1)S^2}{\sigma^2}    (10.42)

is distributed as chi-square with (n − 1) degrees of freedom. In Chapter 9 we used this pivotal quantity to find confidence intervals for the population variance. Here we use it as a test statistic for testing hypotheses about the population variance σ². We consider the hypotheses about σ² given in Equation (10.43); to test them we again use the six-step method used earlier.

(i) H0: σ² = σ0² versus H1: σ² > σ0²,
(ii) H0: σ² = σ0² versus H1: σ² < σ0²,
or
(iii) H0: σ² = σ0² versus H1: σ² ≠ σ0²    (10.43)

where σ0² > 0 is known. As described earlier, the location of the rejection region depends upon the alternative hypothesis. The rejection regions for testing hypotheses (i), (ii), and (iii) at the α level of significance are as shown in Figure 10.28.

Figure 10.28 Rejection regions under the chi-square distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance.


We illustrate the method of testing a hypothesis about the population variance with the following example.
Example 10.17 The production manager of a light bulb manufacturer believes that the lifespan of the 14 W bulb with light output of 800 lumens is 6000 hours. A random sample of 25 bulbs produced a sample mean of 6180 hours and a sample standard deviation of 178 hours. Test at the 5% level of significance whether the population standard deviation is less than 200 hours. Assume that the lifespan of these bulbs is normally distributed with mean μ and unknown standard deviation σ. Find the p-value for the test.
Solution: Using the six-step method, we have
Step 1 H0: σ = 200 versus H1: σ < 200
Step 2 α = 0.05
Step 3 The test statistic is the pivotal quantity for σ² (see Equation (10.42)), that is,

    χ² = (n − 1)S²/σ²

Step 4 The test statistic in step 3 is distributed as χ²(n−1), that is, chi-square with 24 degrees of freedom, since n = 25.
Step 5 Since the test is a lower-tail test, the rejection region is as shown in Figure 10.29.
Step 6 Since under the null hypothesis σ = 200, and from the given sample we have n = 25 and S = 178, the value of the test statistic is

    χ² = (25 − 1)(178)²/(200)² = 19.0104

This value clearly does not fall in the rejection region, so we do not reject the null hypothesis H0.

Figure 10.29 Rejection region under the lower tail with α = 0.05 (critical value χ²24,0.95 = 13.848).


The p-value for the test is given by

    p-value = P(χ²24 ≤ 19.0104) > 0.10
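A Python/SciPy sketch of this chi-square test (our own code); unlike the table, chi2.cdf returns the exact lower-tail p-value:

    from scipy.stats import chi2

    n, s, sigma0, alpha = 25, 178, 200, 0.05
    chi2_stat = (n - 1) * s ** 2 / sigma0 ** 2    # 19.0104
    crit = chi2.ppf(alpha, df=n - 1)              # lower 5% point, 13.848
    p_value = chi2.cdf(chi2_stat, df=n - 1)       # about 0.25, consistent with > 0.10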
10.8.2 Testing Statistical Hypotheses about Two Population Variances
Consider two populations distributed normally with means μ1 and μ2 and variances σ1² and σ2², respectively. We are interested in testing the hypotheses about the ratio of the two population variances given in Equation (10.44):

(i) H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² > 1,
(ii) H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² < 1,
or
(iii) H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² ≠ 1.    (10.44)

Let X11, X12, X13, ..., X1n1 and X21, X22, X23, ..., X2n2 be random samples from two independent normal populations N(μ1, σ1²) and N(μ2, σ2²), respectively. Let X̄1 and X̄2 be the sample means and S1² and S2² the sample variances of the two samples. From Chapter 8 we know that

    F = (S1²/σ1²)/(S2²/σ2²)

is distributed as F-distribution with (n1 − 1) and (n2 − 1) degrees of freedom. Since under the null hypothesis σ1² = σ2², the ratio

    F = S1²/S2²    (10.45)

is distributed as F-distribution with (n1 − 1) and (n2 − 1) degrees of freedom. Thus, using (10.45) as a test statistic we can test any of the hypotheses in Equation (10.44). The rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance are as shown in Figure 10.30.
We illustrate the method of testing a hypothesis about the ratio of two population variances with the following example.

Figure 10.30 Rejection regions under the F-distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance.


Example 10.18 The quality of any process depends on the amount of variability present in the process, which we measure in terms of the variance of the quality characteristic. For example, if we have to choose between two similar processes, we would prefer the one with the smaller variance: a process with smaller variance is more dependable and more predictable. In fact, one of the most important criteria used to improve the quality of a process, or to achieve 6σ quality, is to reduce the variance of the quality characteristic in the process. In practice, comparing the variances of two processes is common. Suppose the following is the summary of samples from two independent processes. We assume that the quality characteristics in the two processes are normally distributed as N(μ1, σ1²) and N(μ2, σ2²), respectively.

    n1 = 21, X̄1 = 15.4, S1² = 24.6
    n2 = 16, X̄2 = 17.2, S2² = 16.4

Test at the α = 0.05 level of significance the hypothesis H0: σ1² = σ2² versus H1: σ1² ≠ σ2², which is equivalent to testing H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² ≠ 1. Find the p-value for the test.
Solution:
Step 1 H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² ≠ 1
Step 2 α = 0.05
Step 3 As explained in the introductory paragraph, the test statistic for testing the hypothesis in step 1 is F = S1²/S2².
Step 4 The test statistic in step 3 is distributed as F with (n1 − 1) and (n2 − 1) degrees of freedom or, in this case, F20,15.
Step 5 Since the test is a two-tail test, the rejection regions are as shown in Figure 10.31. As noted in Chapter 8, the critical point under the right tail is F20,15;0.025, which can be found directly from the F tables (Table VI of the appendix). However, the critical point under the lower tail is F20,15;1−0.025, or F20,15;0.975, which cannot be found directly from the F tables. To find this value we use the following relation (see Equation (8.32)):

    F_{\nu_1,\nu_2,1-\alpha} = \frac{1}{F_{\nu_2,\nu_1,\alpha}}    (10.46)

so that, in the present case, we have

    F20,15;0.975 = 1/F15,20;0.025 = 1/2.57 = 0.389

Step 6 Substituting the values of S1² and S2² in the test statistic S1²/S2², we get

    f = 24.6/16.4 = 1.5

Figure 10.31 Rejection regions under the two tails with α = 0.05 (critical values 0.389 and 2.76).

Figure 10.32 Rejection region under the right tail with α = 0.05 (critical value 2.33).

Clearly this value does not fall in the rejection region. Thus, we do not reject the null hypothesis H0.
Note that for tests about variances we can only find a range for the p-value from the tables. Thus, in this case we have

    p-value = 2P(F ≥ f) = 2P(F ≥ 1.5) > 0.20.
Example 10.19 Use the data of Example 10.18 to test the following hypothesis:

    H0: σ1²/σ2² = 1 versus H1: σ1²/σ2² > 1.

Solution: The only difference between this example and Example 10.18 is the alternative hypothesis. The only change that occurs is in the rejection region; everything else, including the value of the test statistic, is exactly the same. The rejection region in this case is only under the right tail, and its critical point can be determined directly from Table VI of the appendix. Thus, the rejection region is as shown in Figure 10.32.
Since the value of the test statistic does not fall in the rejection region, we do not reject the null hypothesis H0.
The p-value for the test in this example is given by

    p-value = P(F ≥ f) = P(F ≥ 1.5) > 0.10.
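A Python/SciPy sketch of this F-test (our own code), including the lower critical point obtained through the reciprocal relation in Equation (10.46):

    from scipy.stats import f

    n1, n2, s1_sq, s2_sq, alpha = 21, 16, 24.6, 16.4, 0.05
    F = s1_sq / s2_sq                                   # 1.5
    upper = f.ppf(1 - alpha / 2, n1 - 1, n2 - 1)        # about 2.76
    lower = 1 / f.ppf(1 - alpha / 2, n2 - 1, n1 - 1)    # about 0.389
    p_two_sided = 2 * min(f.cdf(F, n1 - 1, n2 - 1),
                          1 - f.cdf(F, n1 - 1, n2 - 1)) # about 0.4, consistent with > 0.20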


10.9 An Alternative Technique for Testing Statistical Hypotheses Using Confidence Intervals

In Chapter 9 we studied techniques for constructing confidence intervals for population parameters such as population means, population proportions, and population variances. In this chapter we have studied techniques for testing hypotheses about these parameters. From our discussion in this chapter and in Chapter 9, it may seem that the two techniques are independent of each other but, in fact, it is quite the contrary. The two techniques are closely knit together, in the sense that all the tests of hypotheses we have carried out in this chapter could have been done using appropriate confidence intervals. We explore this idea by redoing some of the examples from earlier in this chapter.
Example 10.20 Referring back to Example 10.9, we have that the sampled population is normally distributed with an unknown mean μ and known standard deviation σ = 2.5. Furthermore, we are given the sample summary n = 25, X̄ = 10.5. We want to test the hypothesis H0: μ = 10 versus H1: μ ≠ 10 at the α = 0.05 level of significance by using a confidence interval for μ with confidence coefficient 1 − α, that is, 95%.
Solution: Recall that the test statistic used in Example 10.9 for testing the hypothesis

    H0: μ = 10 versus H1: μ ≠ 10

was Z = (X̄ − μ)/(σ/√n). It is clear that we do not reject the null hypothesis H0 if the value of the test statistic under the null hypothesis H0: μ = μ0 is such that

    -z_{\alpha/2} < \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} < z_{\alpha/2}    (10.47)

or

    -z_{\alpha/2}\,\sigma/\sqrt{n} < \bar{X} - \mu_0 < z_{\alpha/2}\,\sigma/\sqrt{n}

or

    \bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n} < \mu_0 < \bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}    (10.48)

From Equation (10.48) it follows that we do not reject the null hypothesis H0 if the value μ0 of μ under the null hypothesis falls in the interval (X̄ − z_{α/2}σ/√n, X̄ + z_{α/2}σ/√n). This is equivalent to saying that we do not reject the null hypothesis H0 if the confidence interval (X̄ − z_{α/2}σ/√n, X̄ + z_{α/2}σ/√n) for μ with confidence coefficient 1 − α contains the value μ0 of μ under the null hypothesis. Now, using the information contained in the sample summary, the confidence interval for μ with confidence coefficient 1 − α (in our case 95%) is

    (X̄ − z_{α/2}σ/√n, X̄ + z_{α/2}σ/√n)


    = (10.5 − 1.96(2.5/√25), 10.5 + 1.96(2.5/√25))
    = (9.52, 11.48)

This interval clearly contains 10, the value of μ under the null hypothesis. Thus, we do not reject the null hypothesis H0, which was the conclusion we reached in Example 10.9.
We now consider a one-tail test.
Example 10.21 Referring back to Example 10.10, we have that the sampled population is normally distributed with an unknown mean μ and unknown standard deviation σ. Also, the sample size is small, with sample summary n = 16, X̄ = 33, S = 6. We want to test the hypothesis H0: μ = 30 versus H1: μ > 30 at the α = 0.05 level of significance, using a confidence interval with confidence coefficient 1 − α, that is, 95%.
Solution: Recall that the test statistic used in Example 10.10 to test the hypothesis

    H0: μ = 30 versus H1: μ > 30

was T = (X̄ − μ)/(S/√n). Thus, it is clear that we do not reject the null hypothesis H0 if the value of the test statistic under the null hypothesis H0: μ = μ0 is such that

    \frac{\bar{X} - \mu_0}{S/\sqrt{n}} < t_{n-1,\alpha}    (10.49)

or

    \bar{X} - \mu_0 < t_{n-1,\alpha}\,S/\sqrt{n}

or

    \bar{X} - t_{n-1,\alpha}\,S/\sqrt{n} < \mu_0

In other words, we do not reject the null hypothesis H0 if the lower one-sided confidence interval

    (\bar{X} - t_{n-1,\alpha}\,S/\sqrt{n},\ \infty)    (10.50)

with confidence coefficient 1 − α contains the value μ0 of μ under the null hypothesis.
Now, using the information contained in the sample and Equation (10.50), the lower one-sided confidence interval for μ with confidence coefficient 95% is

    (X̄ − t_{n−1,α}S/√n, ∞) = (33 − 1.753(6/√16), ∞) = (30.3705, ∞)


This confidence interval clearly does not contain 30, the value of μ under the null hypothesis. Thus, we reject the null hypothesis H0, which was the conclusion we reached in Example 10.10.
Having discussed these two examples, we now give the rule (see Equations (10.48) and (10.50)) and the confidence intervals to be used for testing the various hypotheses discussed earlier in this chapter.
Rule: Do not reject the null hypothesis H0 at the α level of significance if the corresponding confidence interval with confidence coefficient 1 − α, given in Table 10.2, contains the value of the parameter under the null hypothesis H0.
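Before turning to Table 10.2, here is the rule in code for Example 10.20, a minimal Python/SciPy sketch (our own code and variable names):

    from scipy.stats import norm

    n, xbar, sigma, alpha, mu0 = 25, 10.5, 2.5, 0.05, 10
    half_width = norm.ppf(1 - alpha / 2) * sigma / n ** 0.5   # 1.96 * 0.5 = 0.98
    ci = (xbar - half_width, xbar + half_width)               # (9.52, 11.48)
    reject = not (ci[0] < mu0 < ci[1])                        # False: do not reject H0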

Table 10.2 Confidence intervals for testing various hypotheses.

Large sample size (means)

H0: μ = μ0 vs. H1: μ < μ0:
  (−∞, X̄ + z_α σ/√n) if σ is known; (−∞, X̄ + z_α S/√n) if σ is unknown
H0: μ = μ0 vs. H1: μ > μ0:
  (X̄ − z_α σ/√n, ∞) if σ is known; (X̄ − z_α S/√n, ∞) if σ is unknown
H0: μ = μ0 vs. H1: μ ≠ μ0:
  (X̄ − z_{α/2} σ/√n, X̄ + z_{α/2} σ/√n) if σ is known; (X̄ − z_{α/2} S/√n, X̄ + z_{α/2} S/√n) if σ is unknown
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 < 0:
  (−∞, X̄1 − X̄2 + z_α √(σ1²/n1 + σ2²/n2)) if σ1, σ2 are known;
  (−∞, X̄1 − X̄2 + z_α √(S1²/n1 + S2²/n2)) if σ1, σ2 are unknown
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 > 0:
  (X̄1 − X̄2 − z_α √(σ1²/n1 + σ2²/n2), ∞) if σ1, σ2 are known;
  (X̄1 − X̄2 − z_α √(S1²/n1 + S2²/n2), ∞) if σ1, σ2 are unknown
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 ≠ 0:
  (X̄1 − X̄2 − z_{α/2} √(σ1²/n1 + σ2²/n2), X̄1 − X̄2 + z_{α/2} √(σ1²/n1 + σ2²/n2)) if σ1, σ2 are known;
  (X̄1 − X̄2 − z_{α/2} √(S1²/n1 + S2²/n2), X̄1 − X̄2 + z_{α/2} √(S1²/n1 + S2²/n2)) if σ1, σ2 are unknown

Normal populations with small sample sizes (means)

H0: μ = μ0 vs. H1: μ < μ0:
  (−∞, X̄ + z_α σ/√n) if σ is known; (−∞, X̄ + t_{n−1,α} S/√n) if σ is unknown
H0: μ = μ0 vs. H1: μ > μ0:
  (X̄ − z_α σ/√n, ∞) if σ is known; (X̄ − t_{n−1,α} S/√n, ∞) if σ is unknown
H0: μ = μ0 vs. H1: μ ≠ μ0:
  (X̄ − z_{α/2} σ/√n, X̄ + z_{α/2} σ/√n) if σ is known; (X̄ − t_{n−1,α/2} S/√n, X̄ + t_{n−1,α/2} S/√n) if σ is unknown
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 < 0:
  (−∞, X̄1 − X̄2 + z_α √(σ1²/n1 + σ2²/n2)) if σ1, σ2 are known;
  (−∞, X̄1 − X̄2 + t_{n1+n2−2,α} Sp √(1/n1 + 1/n2)) if σ1, σ2 are unknown and σ1 = σ2;
  (−∞, X̄1 − X̄2 + t_{m,α} √(S1²/n1 + S2²/n2)) if σ1, σ2 are unknown and σ1 ≠ σ2 (*)
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 > 0:
  (X̄1 − X̄2 − z_α √(σ1²/n1 + σ2²/n2), ∞) if σ1, σ2 are known;
  (X̄1 − X̄2 − t_{n1+n2−2,α} Sp √(1/n1 + 1/n2), ∞) if σ1, σ2 are unknown and σ1 = σ2;
  (X̄1 − X̄2 − t_{m,α} √(S1²/n1 + S2²/n2), ∞) if σ1, σ2 are unknown and σ1 ≠ σ2 (*)
H0: μ1 − μ2 = 0 vs. H1: μ1 − μ2 ≠ 0:
  (X̄1 − X̄2 − z_{α/2} √(σ1²/n1 + σ2²/n2), X̄1 − X̄2 + z_{α/2} √(σ1²/n1 + σ2²/n2)) if σ1, σ2 are known;
  (X̄1 − X̄2 − t_{n1+n2−2,α/2} Sp √(1/n1 + 1/n2), X̄1 − X̄2 + t_{n1+n2−2,α/2} Sp √(1/n1 + 1/n2)) if σ1, σ2 are unknown and σ1 = σ2;
  (X̄1 − X̄2 − t_{m,α/2} √(S1²/n1 + S2²/n2), X̄1 − X̄2 + t_{m,α/2} √(S1²/n1 + S2²/n2)) if σ1, σ2 are unknown and σ1 ≠ σ2 (*)

Large sample size (proportions)

H0: p = p0 vs. H1: p < p0: (0, p̂ + z_α √(p0 q0/n)), where q0 = 1 − p0
H0: p = p0 vs. H1: p > p0: (p̂ − z_α √(p0 q0/n), 1)
H0: p = p0 vs. H1: p ≠ p0: (p̂ − z_{α/2} √(p0 q0/n), p̂ + z_{α/2} √(p0 q0/n))
H0: p1 − p2 = 0 vs. H1: p1 − p2 < 0: (−1, (p̂1 − p̂2) + z_α √(p̄q̄(1/n1 + 1/n2))), where p̄ = (n1 p̂1 + n2 p̂2)/(n1 + n2) and q̄ = 1 − p̄
H0: p1 − p2 = 0 vs. H1: p1 − p2 > 0: ((p̂1 − p̂2) − z_α √(p̄q̄(1/n1 + 1/n2)), 1)
H0: p1 − p2 = 0 vs. H1: p1 − p2 ≠ 0: ((p̂1 − p̂2) − z_{α/2} √(p̄q̄(1/n1 + 1/n2)), (p̂1 − p̂2) + z_{α/2} √(p̄q̄(1/n1 + 1/n2)))

Normal population, no restriction on sample size (one variance)

H0: σ² = σ0² vs. H1: σ² > σ0²: ((n − 1)S²/χ²_{n−1,α}, ∞)
H0: σ² = σ0² vs. H1: σ² < σ0²: (0, (n − 1)S²/χ²_{n−1,1−α})
H0: σ² = σ0² vs. H1: σ² ≠ σ0²: ((n − 1)S²/χ²_{n−1,α/2}, (n − 1)S²/χ²_{n−1,1−α/2})

Normal populations with small sample sizes (two variances)

H0: σ1²/σ2² = 1 vs. H1: σ1²/σ2² < 1: (0, F_{n2−1,n1−1,α} S1²/S2²)
H0: σ1²/σ2² = 1 vs. H1: σ1²/σ2² > 1: (F_{n2−1,n1−1,1−α} S1²/S2², ∞)
H0: σ1²/σ2² = 1 vs. H1: σ1²/σ2² ≠ 1: (F_{n2−1,n1−1,1−α/2} S1²/S2², F_{n2−1,n1−1,α/2} S1²/S2²)

(*) m = (S1²/n1 + S2²/n2)² / [(S1²/n1)²/(n1 − 1) + (S2²/n2)²/(n2 − 1)]

11
Computer Resources to
Support Applied Statistics
Using MINITAB and JMP Statistical Software

In the past two decades, the use of technology to analyze complicated data has increased substantially, which not only has made the analysis very simple but also has reduced the time required to complete it. To facilitate statistical analysis, many companies have acquired personal computer-based statistical software. Several PC-based software packages are available, including BMDP, JMP, MINITAB, SAS, SPSS, and SYSTAT. A great deal of effort has been expended in the development of these software packages to create graphical user interfaces that allow users to complete statistical analyses without having to know a programming or scripting language.
We believe that publishing a book discussing applied statistics without acknowledging and addressing the importance and usefulness of statistical software would simply not be in our readers' best interests. Accordingly, in this chapter we briefly discuss two popular statistical packages, MINITAB and JMP. It is our explicit intent not to endorse either software package. Each package has its strengths and weaknesses.

11.1 Using MINITAB, Version 14


MINITAB offers the option of using commands from the menu bar, typing
in session commands, or using both. As shown in Figure 11.1, in the
Windows environment it has the look and feel of most other applications
where the menu options help you navigate through the package.
Once in the MINITAB environment, you will see the heading
MINITAB-Untitled and three windows:
1. The Data window (Worksheet) is used to enter data in
columns denoted by C1, C2, C3, ..., C4000.
2. The Session window displays the output and also allows the
user to enter commands when using the command language.
3. The Project Manager window (minimized at startup) displays
project folders; navigate through them and manipulate as
necessary.

Figure 11.1 The screen that appears first in the MINITAB environment, with the menu commands, Session window, Data window, and Project Manager window labeled.

11.1.1 Getting Started

In this chapter we discuss briefly how to use the MINITAB pull-down menus to analyze statistical data. Once you log on to your computer and open MINITAB, you will see the picture in Figure 11.1 on your screen. The pull-down menus appear at the top of the screen:

Menu commands: File, Edit, Data, Calc, Stat, Graph, Editor, Tools, Window, Help

By clicking any of the menu commands, we arrive at the options included in that command. For example, if we click on the File menu we get the drop-down menu shown in Figure 11.2. The first option, New, allows us to create a new worksheet.
Creating a New Worksheet
Creating a worksheet means entering new data in the Data window. The Data window consists of 4000 columns, which are labeled C1, C2, ..., C4000. The data can be entered in one or more columns depending upon the setup of the problem.


Figure 11.2 MINITAB window showing the menu command options. From the menu bar select File > New; this gives two options, create a new worksheet or a new project. Selecting MINITAB Worksheet opens a new worksheet (an empty Data window) that is added to the current project. Selecting MINITAB Project opens a new project and closes the existing project.

In each column, immediately below the labels C1, C2, ..., there is one cell that is not labeled, whereas the rest of the cells are labeled 1, 2, 3, ... . In the unlabeled cell you can enter a variable name, such as part name, shift, lot number, and so on. In the labeled cells you enter data, using one cell for each data point. If a numerical observation is missing, MINITAB will replace the missing value with a star (*).
Saving a Data File
The command File > Save Current Worksheet As allows saving the current data file. When you enter this command, a dialog box titled Save Worksheet As appears. Type the file name in the box next to File Name, select the drive location for the file, and click Save.
Retrieving a Saved MINITAB Data File
Using the command File > Open Worksheet will prompt the dialog box Open Worksheet to appear. Select the drive and directory where the file was saved by clicking the down arrow next to the Look in box, enter the file name in the box next to File Name, and then click Open. The data will appear in the same format you had entered earlier.
Saving a MINITAB Project
Using the command File > Save Project saves the ongoing project in a MINITAB Project (MPJ) file to the designated directory with the name you chose. Saving the project saves all windows opened in the project, along with the contents of each window.
Print Options
To print the contents of any specific window, first make that window active by clicking on it, then use the command File > Print Session Window... (Graph..., Worksheet...).
If you want to print multiple graphs on a single page, highlight the graphs in the Graph folder in the Project Manager window, right-click, and choose Print. The Print Multiple Graphs dialog box appears. To


change the page orientation of the multiple graphs, use File > Page Setup to adjust the printing options.
11.1.2 Calculating Descriptive Statistics
Column Statistics
First enter the desired data in the Data window. Then, from the menu bar select Calc > Column Statistics. The statistics that can be displayed for the selected columns are Sum, Mean, Standard deviation, Minimum, Maximum, Range, Median, Sum of squares, N total, N nonmissing, and N missing. All these choices appear in the dialog box shown in Figure 11.3, which appears immediately after you select Calc > Column Statistics. Note that using this command you can choose only one statistic at a time.
Example 11.1 Use the following steps to calculate any one of the statistics listed in the Column Statistics dialog box, using the following data:

    8 9 7 6 5 6 8 9 8 9

Solution:
1. Enter the data in column C1 of the Data window.
2. Select Calc from the menu bar.
3. Click Column Statistics from the pull-down menu in the Calc command menu.
4. In the Column Statistics dialog box, check the circle next to the desired statistic; here we will use standard deviation.
5. Enter C1 in the box next to Input variable.
6. Click OK. The MINITAB output will appear in the Session window, as shown in Figure 11.3.
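Readers without MINITAB can reproduce such summary statistics with a few lines of code; a sketch using Python's standard statistics module (our own code, not part of MINITAB):

    import statistics

    data = [8, 9, 7, 6, 5, 6, 8, 9, 8, 9]
    print(statistics.mean(data))    # 7.5
    print(statistics.stdev(data))   # sample standard deviation, about 1.43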
Row Statistics
From the menu bar select Calc > Row Statistics. The statistics that can be displayed for the selected rows are Sum, Mean, Standard deviation, Minimum, Maximum, Range, Median, Sum of squares, N total, N nonmissing, and N missing. Note that Column Statistics and Row Statistics give you exactly the same choices. Use the appropriate command, Column Statistics or Row Statistics, depending upon whether your data are arranged in columns or rows.
Descriptive Statistics
From the menu bar select Stat > Basic Statistics > Display Descriptive Statistics. Statistics available for display are Mean, SE of mean, Standard deviation, Variance, Coefficient of variation, Trimmed mean, Sum, Minimum, Maximum, Range, N nonmissing, N missing, N total, Cumulative


Figure 11.3 MINITAB window showing input and output for Column Statistics.

N, Percent, Cumulative percent, First quartile, Median, Third quartile, Interquartile range, Sum of squares, Skewness, Kurtosis, and MSSD.
The benefit of the command Stat > Basic Statistics > Display Descriptive Statistics over the commands Column Statistics and Row Statistics is that it provides all the statistics listed above in one step rather than one at a time.
Example 11.2 Use the following steps to calculate any of the statistics listed in the Display Descriptive Statistics dialog box, using the data from Example 11.1:

    8 9 7 6 5 6 8 9 8 9

Solution: Enter the data in column C1 of the Data window.
1. Select Stat from the menu bar.
2. Click Basic Statistics > Display Descriptive Statistics from the pull-down menus available in the Stat command menus (Figure 11.4).
3. Enter C1 in the box below Variables and click the Statistics button to choose the statistics to be calculated.
4. Click OK. The MINITAB output will appear in the Session window, as shown in Figure 11.4.

Figure 11.4 MINITAB window showing various options available under Stat
command.

Graphs
From the menu bar select Graph > and then the graph of choice. Some of the choices include Scatterplot, Histogram, Dotplot, Boxplot, Bar Chart, Stem-and-Leaf, Time Series Plot, and Pie Chart. We discuss some of these graphs below.
Histogram
First enter the data in one or more columns of the worksheet, depending upon whether you have data on one or more variables. For each variable use only one column. Then use the menu command Graph > Histogram. This prompts a Histograms dialog box, which has four options. Choose the desired option and click OK. For example, choose the option Simple and click OK. Another dialog box then appears, titled Histogram-Simple. In this dialog box, enter one or more variables under Graph variables. If you have not entered the names of the variables in the data columns, then under Graph variables just enter C1, C2, and so forth, and click OK. A separate graph is displayed for each variable. To display more than one graph, select the Multiple Graphs option and choose the desired display option.
Example 11.3 Prepare a histogram for the following data:

    23 25 20 16 19 18 42 25 28 29 36 26 27 35 41
    18 20 24 29 26 37 38 24 26 34 36 38 39 32 33

Solution: Use the following steps to draw any one of the graphs listed in the pull-down menu of the Graph command.
1. Enter the data in column C1 of the Data window.
2. Select Graph from the menu bar.
3. Click Histogram from the pull-down menus available in the Graph command menu.
4. Enter C1 into the Graph variables box and click OK.
5. The MINITAB output will appear in the Graph window, as shown in Figure 11.5.

Figure 11.5 MINITAB display of histogram for the data given in Example 11.3.


Figure 11.6 MINITAB window showing Edit Bars dialog box.

Sometimes we are interested in constructing a histogram with a particular number of classes or intervals, say 5. Right-click on one of the bars in the
default histogram to bring up the Edit Bars dialog box shown in Figure 11.6.
In the Edit Bars dialog box select Binning tab and then under Interval
Type check the circle next to Midpoint. Under Interval Definition check
the circle next to Number of Intervals and enter the number of desired intervals in the box next to it, 5 in this example. Click OK. The output will appear
in the graph window as shown in Figure 11.7.
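The same five-class frequency count can be verified outside MINITAB. Here is a minimal Python sketch using numpy (our own cross-check; it uses equal-width classes over the data range, so the exact class boundaries may differ from MINITAB's midpoint-based binning):

    import numpy as np

    data = [23, 25, 20, 16, 19, 18, 42, 25, 28, 29, 36, 26, 27, 35, 41,
            18, 20, 24, 29, 26, 37, 38, 24, 26, 34, 36, 38, 39, 32, 33]

    # Five intervals spanning the data range, in the spirit of
    # MINITAB's "Number of Intervals = 5" binning option.
    counts, edges = np.histogram(data, bins=5)
    for c, lo, hi in zip(counts, edges[:-1], edges[1:]):
        print(f"[{lo:5.1f}, {hi:5.1f})  frequency = {c}")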
Dotplot
First enter the data in one or more columns of the worksheet, depending
upon how many variables you have. For each variable use only one column.
Then use the menu command Graph > Dotplot. These commands prompt
a dialog box titled Dotplots to appear; it has seven options. Choose the
desired graph option and click OK. Then another dialog box appears,
Dotplot - One Y, Simple. Enter one or more variables into Graph variables.
If you have not entered the names of the variables in the data columns, under
Graph variables just enter C1, C2, and so forth and click OK. A separate
graph is displayed for each variable. To display more than one graph, select
the Multiple Graphs option and choose the desired display option.
Example 11.4

Prepare a dot plot for the following data:

23 25 20 16 19 18 42 25 28 29 36 26 27 35 41
18 20 24 29 26 37 38 24 26 34 36 38 39 32 33

Figure 11.7 MINITAB display of histogram with 5 classes for the data in Example 11.3 (graph titled "Histogram of data with five groups"; Frequency on the vertical axis, Data on the horizontal axis).

Solution:
1. Enter the data in column C1 of the Data window (same as
Example 11.3).
2. Select Graph from the Menu command.
3. Click Dotplot from the pull-down menus available in the
Graph command menu.
4. Select the Simple dotplot.
5. Enter C1 into the Graph variables box and click OK.
6. The MINITAB output will appear in the Graph window, as
shown in Figure 11.8.
Scatterplot
Enter the data in one or more columns of the worksheet, depending upon
how many variables you have. For each variable use only one column. Use
the menu command Graph > Scatterplot. These commands prompt a dialog box titled Scatterplots, which has seven options. Choose the desired graph option and click OK. Another dialog box appears, titled Scatterplot - Simple. Enter the names of the variables under y variable and x variable. If
you have not entered the names of the variables in the data columns, then do
so under y variable and x variable (enter the columns where you have entered
the data, say C1, C2, etc.) and then click OK. A separate graph is displayed
for each set of variables. To display more than one graph, select the Multiple
Graphs option and choose the desired option.
Example 11.5 The following data shows the test scores (x) and the job
evaluation scores (y) of 16 Six Sigma Green Belts. Prepare a scatter plot for
these data and interpret the result you observe in this graph.

x:  45   47   40   35   43   40   49   46   38   39   45   41   48   46   42   40
y:  9.2  8.2  8.5  7.3  8.2  7.5  8.2  7.3  7.4  7.5  7.7  7.5  8.8  9.2  9.0  8.1

Figure 11.8 MINITAB output of Dotplot for the data in Example 11.4.

Solution:
1. Enter the data in columns C1 and C2 of the Data window.
2. Select Graph from the Menu command.
3. Click Scatterplot from the pull-down menus available in the
Graph command menu.
4. Select the simple scatterplot and click OK.
5. Enter C2 and C1 under the y variable and x variable
respectively and click OK.
6. The MINITAB output will appear in the Graph window, as
shown in Figure 11.9.
The graph in Figure 11.9 shows that although the plotted points are not tightly clustered around a straight line, they do fall roughly along one. This indicates that there is a moderate correlation between the test scores and the job evaluation scores of the Six Sigma Green Belts.
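The visual impression of moderate correlation can be quantified. As a cross-check outside MINITAB, the Pearson correlation coefficient for these 16 pairs can be computed with a short Python sketch using scipy (our own verification code, not part of the MINITAB procedure):

    from scipy import stats

    x = [45, 47, 40, 35, 43, 40, 49, 46, 38, 39, 45, 41, 48, 46, 42, 40]
    y = [9.2, 8.2, 8.5, 7.3, 8.2, 7.5, 8.2, 7.3,
         7.4, 7.5, 7.7, 7.5, 8.8, 9.2, 9.0, 8.1]

    r, p = stats.pearsonr(x, y)            # correlation and its p-value
    print(f"Pearson correlation r = {r:.3f} (p-value = {p:.4f})")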
Box Whisker Plot
First, enter the data in one or more columns of the worksheet depending
upon how many variables you have. For each variable use only one column.
Then use the menu command Graph > Boxplot or Stat > EDA > Boxplot. These commands prompt a dialog box titled Boxplot to appear with four


Figure 11.9 MINITAB output of Scatterplot for the data given in Example 11.5.

graph options. Choose the desired option and click OK. Then another dialog
box appears, titled Boxplot - One Y, Simple. Enter one or more variables
into Graph variables. If you have not entered the names of the variables in
the data columns, then under Graph variables just enter C1, C2, and so
forth and click OK. A separate graph is displayed for each variable. To display more than one graph select the Multiple Graphs option and choose the
desired display option. The Box Plot will appear in the graph window.
Example 11.6

Prepare a box plot for the following data:

23 25 20 16 19 55 42 25 28 29 36 26 27 35 41
55 20 24 29 26 37 38 24 26 34 36 38 39 32 33

Solution:
1. Enter the data in column C1 of the Data window (same as
Example 11.3).
2. Select Graph from the Menu command.
3. Click Boxplot from the pull-down menus available in the
Graph command menu.


4. Select the Simple boxplot.


5. Enter C1 into the Graph variables box and click OK.
6. The MINITAB output will appear in the Graph window, as
shown in Figure 11.10.
The box plot for the data in this example will appear in the graph window as shown in Figure 11.10(a). Figure 11.10(b) shows the box plot rotated through 90° so that the whiskers are horizontal.

Figure 11.10 MINITAB display of box plot for the data in Example 11.6: (a) as shown in MINITAB output; (b) rotated through 90° to illustrate the skewness.
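The quantities behind the box plot (quartiles, interquartile range, and the outlier cutoffs) can be verified with a short Python sketch using numpy (our own cross-check; MINITAB's quartile convention may differ slightly, so small discrepancies are possible):

    import numpy as np

    data = [23, 25, 20, 16, 19, 55, 42, 25, 28, 29, 36, 26, 27, 35, 41,
            55, 20, 24, 29, 26, 37, 38, 24, 26, 34, 36, 38, 39, 32, 33]

    q1, med, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    # Points beyond 1.5 * IQR from the quartiles are drawn individually
    # as potential outliers in a box plot.
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in data if x < lo or x > hi]
    print("Q1 =", q1, " Median =", med, " Q3 =", q3, " IQR =", iqr)
    print("Whisker limits:", lo, "to", hi, " Potential outliers:", outliers)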

Graphical Summary
First enter the data in one or more columns of the worksheet, depending upon how many variables you have. For each variable use only one column. Then from the menu select Stat > Basic Statistics > Graphical Summary.
These commands prompt a dialog box titled Graphical Summary to appear.
Enter the names of the variables you want summarized under Variables. If
you have not entered the names of the variables in the data columns, then
under Variables just enter C1, C2, and so forth. In the box next to
Confidence Level enter the appropriate value of the confidence level and
click OK. This option provides both graphical and numerical descriptive statistics. A separate graph and summary statistics are displayed for each variable.
Example 11.7

Prepare the graphical summary for the following data:

23 25 20 16 19 55 42 25 28 29 36 26 27 35 41
55 20 24 29 26 37 38 24 26 34 36 38 39 32 33


Figure 11.11 MINITAB display of graphical summary for the data in Example 11.7.

Solution:
1. Enter the data in column C1 of the Data window (same as
Example 11.3).
2. Select Stat from the Menu command.
3. Click Basic Statistics and then Graphical Summary from the
pull-down menus available in the Stat command menu.
4. Enter C1 into the Variables box and click OK.
The MINITAB output will appear in the Summary window, as shown in
Figure 11.11.
Bar Chart
Enter the data containing categories and frequencies in columns C1 and C2.
Or if the categories and frequencies are not given, enter all the categorical
data in column C1 for the following example. From the Menu command
select Graph > Bar Chart. In the Bar Charts dialog box are three options
under Bars represent. Select option 1, Counts of unique values, if you
have one or more columns of categorical data (as you will in the following
example); select 2, A function of a variable, if you have one or more
columns of measurement data; or select option 3, Values from a table, if you
have one or more columns of summary data. For each of these options there
are several other options about the representation of the graph. Choose an
appropriate option and click OK. Now another dialog box appears, titled Bar Chart - [description]. Enter the variable name(s) under Categorical variables. If you have not entered the names of the variables in the data columns,
under Graph variables just enter C1, C2, and so forth and then click OK. A
separate graph is displayed for each variable. To display more than one graph
select Multiple Graphs and choose the desired display option.
Example 11.8

Prepare a bar graph for the following categorical data:

Solution:
1. Enter the data in column C1 of the Data window.
2. Select Graph from the Menu command.
3. Click Bar Chart from the pull-down menus available in the
Graph command menu.
4. Select Counts of Unique Values and Simple from the options.
5. Click OK.
6. Enter C1 into the Categorical variables box.
7. Click OK.
8. The MINITAB output will appear in the graph window, as
shown in Figure 11.12.
Pie Chart
Enter the data containing categories and frequencies in columns C1 and C2.
If the categories and frequencies are not given, enter the categorical data in
column C1, as in the following example. From the Menu command select
Graph > Pie Chart. Choose Chart raw data when each row in the column
represents a single observation and Chart values from a table if the categories and frequencies are given. A slice in the pie is proportional to the
number of occurrences of a value in the column or the frequency of each category. Enter column C1 in the box next to Categorical variables. A separate
pie chart for each column is displayed, on the same graph. To display more
than one graph select the Multiple Graphs option and choose the required
display option. When category names exist in one column and summary data
exist in another column, use the Chart values from a table option. Enter
columns for Categorical variable and Summary variables.


Figure 11.12 MINITAB display of bar graph for the data Example 11.8.

Example 11.9

Prepare a pie chart for the following categorical data:

Solution:
1. Enter the data in column C1 of the Data window.
2. Select Graph from the Menu command.
3. Click Pie Chart from the pull-down menus available in the
Graph command menu.
4. Select Chart raw data from the options.
5. Enter C1 into the Categorical variables box and click OK.
6. The MINITAB output will appear in the graph window, as
shown in Figure 11.13.
11.1.3 Probability Distributions
To calculate various probabilities, select from the Menu command Calc > Probability Distributions and then the distribution of choice. This will bring
up a dialog box where the choice of how the probabilities are calculated, such


Figure 11.13 MINITAB display of pie chart for the data in Example 11.9.

as Probability density, Cumulative probability, or Inverse cumulative probability, can be selected. Based on the probability distribution being calculated, appropriate
parameter entries need to be made. The choices of probability distribution
include Uniform, Binomial, t, Chi-Square, Normal, F, Poisson, Exponential,
and others. We discuss a couple of these examples below. The technique for
other distributions is quite similar and self-explanatory.
Normal Distribution
Using MINITAB we can calculate three quantities related to the normal distribution:
1. Probability density, which means finding the value of the normal density function f(x) at a given x.
2. Cumulative probability, which means finding the area under the normal distribution curve to the left of a given x.
3. Inverse cumulative probability, which means that the area under the normal distribution curve below x is given, and we want to find the corresponding value of x.
To calculate any of the above probabilities proceed as follows:
From the Menu bar select Calc > Probability Distributions > Normal. This will prompt a dialog box titled Normal Distribution to appear. Click one of the options, which are Probability density, Cumulative probability, or Inverse cumulative probability. Enter the value of the Mean and the Standard deviation to define the normal distribution. Check the circle next to Input column (if you have more than one value of x to enter in one of the data columns, say, C1) and enter C1 in the box next to it. Or select the Input constant field if you have only one value of x, and enter the value of that constant in the box next to it. If desired, in


the Optional storage enter the column in which you want to store the output. Then click OK.
Example 11.10 Let a random variable be distributed as normal with mean μ = 6 and standard deviation σ = 4. Determine the probability P(8.0 ≤ X ≤ 14.0).
Solution: In order to determine the probability P(8.0 ≤ X ≤ 14.0), we have to first find the probabilities P(X ≤ 8.0) and P(X ≤ 14.0). Then P(8.0 ≤ X ≤ 14.0) = P(X ≤ 14.0) − P(X ≤ 8.0). To find the probabilities P(X ≤ 8.0) and P(X ≤ 14.0) using MINITAB, we proceed as follows:
1. Enter the test values of 8 and 14 in column C1.
2. From the Menu bar select Calc > Probability Distributions > Normal.
3. In the dialog box that appears, click the circle next to
Cumulative probability.
4. Enter 6 (the value of the mean) in the box next to Mean and 4
(the value of the standard deviation) in the box next to
Standard deviation.
5. Click the circle next to Input column and type C1 in the box
next to it.
6. Click OK.
7. In the session window, text will appear as follows, indicating values of P(X ≤ 14.0) = 0.977250 and P(X ≤ 8.0) = 0.691462. Thus, P(8.0 ≤ X ≤ 14.0) = P(X ≤ 14.0) − P(X ≤ 8.0) = 0.977250 − 0.691462 = 0.285788.

Normal with mean = 6 and standard deviation = 4

   x    P( X <= x )
   8    0.691462
  14    0.977250
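The same cumulative probabilities can be verified outside MINITAB. For example, a minimal Python sketch using scipy.stats (our own cross-check):

    from scipy import stats

    X = stats.norm(loc=6, scale=4)       # normal with mean 6, standard deviation 4
    p_8  = X.cdf(8.0)                    # P(X <= 8.0)  = 0.691462...
    p_14 = X.cdf(14.0)                   # P(X <= 14.0) = 0.977250...
    print(p_8, p_14, p_14 - p_8)         # difference   = 0.285788...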

Binomial Distribution
For binomial probability distributions, the same options are available as for the normal probability distribution: Probability, Cumulative probability, and Inverse cumulative probability.
From the Menu bar select Calc > Probability Distributions > Binomial. This will prompt a dialog box titled Binomial Distribution to appear. Click one of the options, which are Probability, Cumulative probability, or Inverse cumulative probability. Enter the Number of trials and Probability of success (0 ≤ p ≤ 1) to define the binomial distribution. Check the circle next to Input column (if you have more


than one value of x that you must enter in one of the data columns, say, C1)
and enter C1 in the box next to it. Or you may select the Input constant field
if you have only one value of x and enter that value in the box next to it. If
desired, use the Optional storage to enter the column in which you want to
store the output. Then click OK.
Example 11.11 The probability is 0.80 that a randomly selected Six Sigma
Green Belt will finish a project successfully. Let X be the number of Green
Belts who will finish successfully from a randomly selected group of 10
Green Belts. Find the probability distribution of the random variable X.
Solution: In order to find the probability distribution of the random variable X we need to find the probability of X = 0, 1, 2, ..., 10. To find these probabilities using MINITAB we proceed as follows:
1. Enter the values 0, 1, 2, ...,10 in column C1.
2. From the Menu bar select Calc > Probability Distributions > Binomial.
3. In the dialog box that appears, click the circle next to
Probability.
4. Enter 10 (the number of trials) in the box next to Number of
trials and 0.80 (the probability of success) in the box next to
Probability of success.
5. Click the circle next to Input column and type C1 in the box
next to it.
6. Click OK.
The desired probabilities will show up in the session window as:
Binomial with n = 10 and p = 0.8

   x    P( X = x )
   0    0.000000
   1    0.000004
   2    0.000074
   3    0.000786
   4    0.005505
   5    0.026424
   6    0.088080
   7    0.201327
   8    0.301990
   9    0.268435
  10    0.107374
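This table is easy to reproduce outside MINITAB as well; a minimal Python sketch using scipy.stats (our own cross-check):

    from scipy import stats

    n, p = 10, 0.80
    for x in range(n + 1):
        print(x, round(stats.binom.pmf(x, n, p), 6))   # P(X = x)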


11.1.4 Estimation and Testing of Hypotheses about Population Mean and Proportion
1-Sample Z
From the Menu bar select Stat > Basic Statistics > 1-Sample Z. This will prompt a dialog box titled 1-Sample Z (Test and Confidence Interval). Check the circle next to Samples in columns if you have entered raw data in columns. In the box below, select the columns containing the sample data. Choose Summarized data if you have summary values for the sample size and mean. Enter the Standard deviation and Test mean. If you selected Summarized data, you must also enter Sample size and sample Mean. Select Options, then the Confidence Level and Alternative. The results also provide the confidence intervals.
Example 11.12 Consider the following data from a population with an unknown mean μ and standard deviation σ, which may or may not be known:

23 25 20 16 19 35 42 25 28 29 36 26 27 35 41 30
20 24 29 26 37 38 24 26 34 36 38 39 32 33 25 30

(a) Find a 95% confidence interval for the mean. (b) Test the hypothesis H0: μ = 30 versus H1: μ ≠ 30 at the 5% level of significance.
Solution: Since the sample size n = 32 is greater than 30, it is considered to be a large sample. Therefore, either to find a confidence interval or to test a hypothesis we use the Z-statistic. Also, when the sample size is large, the population standard deviation σ, if it is unknown as in this example, can be replaced with the sample standard deviation S. Note that we can find the confidence interval and test the hypothesis by using the one procedure given below.
1. Enter the data in column C1 of the Data window (Worksheet window).
2. Since the population standard deviation is not known, calculate the sample standard deviation of these data using one of the MINITAB procedures discussed earlier. You will find S = 6.83.
3. Select the Stat command and then click Basic Statistics > 1-Sample Z in the pull-down menu. This will prompt a dialog box titled 1-Sample Z (Test and Confidence Interval).
4. Enter C1 in the box below Samples in columns. (If you had summary statistics, sample mean and sample size, check the circle next to Summarized data and enter the appropriate values in the boxes next to it.) Enter the value of the standard deviation (6.83 from Step 2).


5. Enter the value of the test mean under the null hypothesis, in this case 30. (If you are not testing any hypothesis, leave it empty.)
6. Check Options, which will prompt another dialog box to appear. Enter the confidence level 95% in the box next to Confidence level. Finally, next to Alternative, select one of the three options: less than, not equal, or greater than. Click OK.
The MINITAB output will show up in the session window as:
One-Sample Z: C1
Test of mu = 30 vs not = 30
The assumed standard deviation = 6.83

Variable   N    Mean     StDev    SE Mean   95% CI               Z      P
C1         32   29.6250  6.8285   1.2074    (27.2586, 31.9914)   -0.31  0.756

Since the p-value for the test is 0.756, which is much greater than the 5% level of significance, we do not reject the null hypothesis. The 95% confidence interval is given as (27.2586, 31.9914). Also, note that since the 95% confidence interval contains the value of μ we were testing for, we do not reject the null hypothesis at the 5% [(100 − 95)%] level of significance.
1-Sample t
In Chapters 9 and 10, we saw that if a small sample is taken from a normal population with an unknown variance, we use the t-statistic for finding a confidence interval and testing a hypothesis about the mean. The MINITAB procedure for 1-Sample t is similar to the one for 1-Sample Z.
From the Menu bar select Stat > Basic Statistics > 1-Sample t. This will prompt a dialog box titled 1-Sample t (Test and Confidence Interval) to appear. Check the circle next to Samples in columns if you have entered the raw data in columns. In the box below, select the columns containing the sample data. Check the circle next to Summarized data if you have summary values for the sample, that is, sample size, sample mean, and sample standard deviation, and enter those values. Enter the hypothesized mean in the box next to Test mean. Select Options to prompt another dialog box, where you enter the confidence level and the form of the alternative hypothesis in the boxes next to Confidence Level and Alternative, respectively. Click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.13 Consider the following data from a population with an unknown mean μ and unknown standard deviation:

23 25 20 16 19 35 42 25 28 29 36 26
27 35 41 30 20 24 29 26 37 38 24 26


(a) Find a 95% confidence interval for the mean μ. (b) Test the hypothesis H0: μ = 28 versus H1: μ ≠ 28 at the 5% level of significance.
Solution: Follow the procedure discussed in the preceding paragraph and
use the same steps as in 1-Sample Z procedure. The MINITAB output will
appear in the session window as:
One-Sample T: C1
Test of mu = 28 vs not = 28

Variable   N    Mean     StDev    SE Mean   95% CI               T      P
C1         24   28.3750  7.0761   1.4444    (25.3870, 31.3630)   0.26   0.797

Since the p-value for the test is 0.797, which is much greater than the 5% level of significance, we do not reject the null hypothesis. The 95% confidence interval is (25.3870, 31.3630).
Also, note that since the 95% confidence interval contains the test value of μ, we do not reject the null hypothesis at the 5% [(100 − 95)%] level of significance.
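The t-test and interval can likewise be reproduced directly. A minimal Python sketch using scipy (our own cross-check of the MINITAB output above):

    import numpy as np
    from scipy import stats

    data = [23, 25, 20, 16, 19, 35, 42, 25, 28, 29, 36, 26,
            27, 35, 41, 30, 20, 24, 29, 26, 37, 38, 24, 26]

    t, pval = stats.ttest_1samp(data, popmean=28)
    xbar, se = np.mean(data), stats.sem(data)
    lo, hi = stats.t.interval(0.95, len(data) - 1, loc=xbar, scale=se)
    print(f"t = {t:.2f}, p = {pval:.3f}, 95% CI = ({lo:.4f}, {hi:.4f})")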
1 Proportion
For testing a hypothesis about one population proportion and for finding confidence intervals, use the following MINITAB procedure.
From the Menu bar select Stat > Basic Statistics > 1 Proportion. This will prompt a dialog box entitled 1 Proportion (Test and Confidence Interval) to appear. In the dialog box, check the circle next to Samples in columns if you have entered the raw data in columns. Then, in the box below, select the columns containing the sample data. Check the circle next to Summarized data if you have sample summary values for the number of trials and successes (events) and enter those values. Select Options to prompt another dialog box to appear. Enter the confidence level, the value of the proportion under the null hypothesis, and the alternative hypothesis in the boxes next to Confidence level, Test proportion, and Alternative, respectively. Check the box next to Use test and interval based on normal distribution if the sample size is large, that is, if np and nq are greater than or equal to 5. If the sample size is not large, do not check this box. Click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.14 Several studies show that many industrial accidents can be
avoided if all safety precautions are strictly enforced. One such study showed
that 35 out of 50 accidents in one kind of industry could have been avoided
if all safety precautions were taken. If p denotes the proportion of accidents
that could be avoided by taking all safety precautions, find a 95% confidence
interval for p and test the hypothesis H0: p = 0.85 versus H1: p ≠ 0.85 at the
5% level of significance.
Solution: In this problem, sample summary values are given. Following
the procedure discussed in the above paragraph enter the appropriate values


in the boxes next to Number of trials, Number of events, Confidence level, Test proportion, and Alternative. Click OK in both dialog boxes. The MINITAB output shown below will appear in the session window.
1. Select the Stat command and then click Basic Statistics > 1 Proportion in the pull-down menu. This will prompt a dialog box titled 1 Proportion (Test and Confidence Interval) to appear.
2. In this example, we have summarized data, so click on the circle next to Summarized Data and enter the values for the Number of trials (50) and the Number of events (35).
3. Check Options, which will prompt another dialog box to appear. Enter the confidence level 95% in the box next to Confidence level. Next to Alternative, select one of the three options: less than, not equal, or greater than. Enter the Test proportion value (.85). Because this is a large sample, check the box to use a normal distribution. Click OK twice.
The MINITAB output will show up in the session window as:

Test and CI for One Proportion

Test of p = 0.85 vs p not = 0.85

Sample   X    N    Sample p   95% CI                 Z-Value   P-Value
1        35   50   0.700000   (0.572980, 0.827020)   -2.97     0.003

Since the p-value is 0.003, which is much smaller than the 5% level of significance, we reject the null hypothesis. The 95% confidence interval is (0.572980, 0.827020). Also, note that since the 95% confidence interval does not contain the hypothesized value of p, we reject the null hypothesis at the 5% [(100 − 95)%] level of significance.
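Because this procedure is based on the normal approximation, the underlying arithmetic is easy to reproduce. A minimal Python sketch (our own cross-check, not a MINITAB feature):

    from math import sqrt
    from scipy import stats

    x, n, p0 = 35, 50, 0.85
    phat = x / n
    z = (phat - p0) / sqrt(p0 * (1 - p0) / n)     # test statistic under H0
    pval = 2 * stats.norm.sf(abs(z))
    half = stats.norm.ppf(0.975) * sqrt(phat * (1 - phat) / n)
    print(f"z = {z:.2f}, p = {pval:.3f}, "
          f"95% CI = ({phat - half:.6f}, {phat + half:.6f})")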
11.1.5 Estimation and Testing of Hypotheses about Two
Population Means and Proportions
2-Sample t
In Chapters 9 and 10 we saw that if small samples are taken from two normal populations with unknown variances, we use the t-statistic for finding a confidence interval for the difference of two population means and for testing a hypothesis about the two population means. We can achieve this goal by using the MINITAB procedure for 2-Sample t.
From the Menu bar select Stat > Basic Statistics > 2-Sample t. This will prompt a dialog box titled 2-Sample t (Test and Confidence Interval)
to appear. Check the circle next to Samples in columns if you have entered


the raw data in a single column, differentiated by subscript values in a second column. Enter columns C1 and C2 in the boxes next to Samples and Subscripts, respectively. Check the circle next to Samples in different columns if the data for the two samples are entered in two separate columns. Enter C1 and C2 in the boxes next to First and Second. If you have summary data, check the circle next to Summarized data and enter for the two samples the sample size, sample mean, and sample standard deviation. Check the box next to Assume equal variances only if the variances of the two populations can be assumed to be equal. Then select Options to prompt another dialog box to appear. Enter the confidence level and the value of the mean difference under the null hypothesis in the boxes next to Confidence level and Test difference, and, depending on the alternative hypothesis, choose less than, greater than, or not equal to in the box next to Alternative. Then click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.15 The following data give the summary statistics of scores on productivity for two groups, one group who are Six Sigma Green Belts and another group who are not:

Group                        Sample Size   Sample Mean   Sample Standard Deviation
Six Sigma Green Belts        25            93            2.1
Non-Six Sigma Green Belts    27            87            3.7

Find a 95% confidence interval for the difference between the two population means. Test the hypothesis H0: μ1 − μ2 = 0 versus H1: μ1 − μ2 ≠ 0 at the 5% level of significance. Assume that the variances of the two populations are equal.
Solution: In this problem sample summary values are given. Following the
procedure discussed in the above paragraph, enter the appropriate values in
boxes of the dialog boxes. Then click OK in both dialog boxes. The overall
procedure is similar to that in Example 11.14. The MINITAB output will
appear in the session window as:

Two-Sample T-Test and CI

Sample   N    Mean    StDev   SE Mean
1        25   93.00   2.10    0.42
2        22   87.00   3.70    0.79

Difference = mu (1) - mu (2)
Estimate for difference: 6.00000

95% CI for difference: (4.25931, 7.74069)
T-Test of difference = 0 (vs not =): T-Value = 6.94  P-Value = 0.000  DF = 45
Both use Pooled StDev = 2.9565
This output gives the point estimate and a confidence interval for μ1 − μ2. The p-value for the test is 0.000, which is less than the 5% level of significance. Thus, we reject the null hypothesis that the two groups have equal productivity. Furthermore, both the lower and upper limits of the confidence interval are positive, which implies that the average productivity score of the group of Six Sigma Green Belts is significantly higher than the average score of the other group.
Paired t
In Chapter 10, we saw that if we have a pair of data points on each individual, the samples are not independent. Therefore, in such problems, the 2-sample t procedure discussed above cannot be used. In these situations, we use a special test called the paired t-test.
From the Menu bar select Stat > Basic Statistics > Paired t. This will prompt a dialog box titled Paired t (Test and Confidence Interval). In the dialog box, check the circle next to Samples in columns if you have the raw data entered in two columns. Enter the column names in the boxes next to First Sample and Second Sample. If you have summary data, check the circle next to Summarized data (differences) and enter the sample size, sample mean, and sample standard deviation. Select Options to prompt another dialog box, where you enter the confidence level and the value of the mean difference under the null hypothesis, and where, depending upon the alternative hypothesis, you choose less than, greater than, or not equal to in the box next to Alternative. Click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.16 The following data give the test scores before and after a
two-week training of a group of 15 Six Sigma Green Belts:

Before   83  87  83  78  76  89  79  83  86  90  81  77  74  86  89
After    87  85  89  77  79  84  90  88  83  92  87  81  83  79  85

Test the hypothesis H0: μd = 0 versus H1: μd ≠ 0 at the 5% level of significance. Find a 95% confidence interval for the population mean difference between before and after test scores.
Solution: Since we have two test scores for each Green Belt, the two
samples are not independent. To test the hypothesis and to find the desired


confidence interval we use the paired t-statistic. Following the procedure


discussed in the above paragraph, and similar to the 2-Sample t, enter the
appropriate values in boxes of the dialog boxes. The biggest difference
here is that we are using raw data and must make references to the appropriate columns. The MINITAB output will appear in the session window
as:

Paired T-Test and CI: C1, C2

Paired T for C1 - C2

             N    Mean       StDev     SE Mean
C1           15   82.7333    5.1056    1.3182
C2           15   84.6000    4.3556    1.1246
Difference   15   -1.86667   5.31664   1.37275

95% CI for mean difference: (-4.81092, 1.07759)
T-Test of mean difference = 0 (vs not = 0): T-Value = -1.36  P-Value = 0.195
This output gives a 95% confidence interval (-4.81092, 1.07759) for the population mean difference between before and after test scores. The p-value for the test is 0.195, which is greater than the 5% level of significance. Thus, we do not reject the null hypothesis: the average scores before and after the training are not significantly different. In other words, the training program is not very effective.
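The paired t-test is also easy to reproduce directly from the raw scores; a minimal Python sketch using scipy (our own cross-check):

    from scipy import stats

    before = [83, 87, 83, 78, 76, 89, 79, 83, 86, 90, 81, 77, 74, 86, 89]
    after  = [87, 85, 89, 77, 79, 84, 90, 88, 83, 92, 87, 81, 83, 79, 85]

    # Paired t-test on the within-pair differences (before - after)
    t, pval = stats.ttest_rel(before, after)
    print(f"t = {t:.2f}, p = {pval:.3f}")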
2 Proportions
For testing a hypothesis about two population proportions and finding a confidence interval for the difference between the two population proportions, use the following MINITAB procedure.
From the Menu bar select Stat > Basic Statistics > 2 Proportions. This will prompt a dialog box titled 2 Proportions (Test and Confidence Interval) to appear. Check the circle next to Samples in columns if you have entered raw data into a single column with a second column of subscripts identifying the sample. Enter columns C1 and C2 in the boxes next to Samples and Subscripts, respectively. Check the circle next to Samples in different columns if the data for the two samples are entered in two separate columns, and enter the appropriate column reference next to First and Second. If you have summary data, check the circle next to Summarized data and enter Number of trials and Number of events (successes) in the appropriate spaces. Select Options to prompt another dialog box to appear. Enter the confidence level, the value of the difference between the two proportions under the null hypothesis, and, depending upon the alternative hypothesis, choose less than, greater than, or not equal to in the box next to Alternative. Check the box next to Use pooled estimate of p for test only if, under the null hypothesis, the two population proportions are equal. (If we reject the null

hypothesis of equal proportions, to find the confidence interval do not check the box for Use pooled estimate of p for test.) Then click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.17 A manufacturing company has two plants A and B where
it manufactures motors for passenger cars. A random sample of 120 motors
from plant A revealed that 8 did not meet the specifications, while a random
sample of 150 motors from plant B revealed that 12 did not meet the specifications. Let p1 and p2 denote the true proportions of motors manufactured at plants A and B, respectively, that do not meet the specifications. Test H0: p1 − p2 = 0 versus H1: p1 − p2 ≠ 0 at the 1% level of significance. Find a 99% confidence interval for p1 − p2.
Solution: We are given the sample summary data. Following the procedure
discussed above, enter the appropriate values in boxes of the dialog boxes.
Because we are testing the null hypothesis that both proportions are the
same, be certain to check the box for Use pooled estimate of p for test.
Click OK in both dialog boxes. The MINITAB output will appear in the session window as:

Test and CI for Two Proportions

Sample   X    N     Sample p
1        8    120   0.066667
2        12   150   0.080000

Difference = p (1) - p (2)
Estimate for difference: -0.0133333
99% CI for difference: (-0.0951614, 0.0684948)
Test for difference = 0 (vs not = 0): Z = -0.42  P-Value = 0.678
This output gives the point estimate and a 99% confidence interval for p1 − p2. The p-value for testing the hypothesis is 0.678, which is greater than 1%, the level of significance. Thus, we do not reject the null hypothesis and conclude that the proportions of motors manufactured below specifications at the two plants are not significantly different.
11.1.6 Estimation and Testing of Hypotheses about Two
Population Variances
In many applications dealing with two normal populations, quite frequently we assume that the variances of the two populations are equal. In such situations, it is important that we verify whether that assumption is valid. Below, we discuss a MINITAB procedure to test whether the two variances are equal.


From the Menu bar select Stat > Basic Statistics > 2 Variances. This will prompt a dialog box titled 2 Variances to appear. Check the circle next to Samples in one column if you have entered data into a single column, with a second column of subscripts identifying the samples, and enter the column references for the data and the subscripts in those boxes. Check the circle next to Samples in different columns if the data for the two samples are entered in two separate columns, and enter those column references next to First and Second. If you have summary data, check the circle next to Summarized data and enter the sample sizes and sample variances in the appropriate boxes. Select Options to prompt another dialog box to appear. In this dialog box, enter the confidence level (any number between 0 and 100; by default, it is 95%). Then click OK in both dialog boxes. The MINITAB output, which provides the confidence intervals and the p-value for the hypothesis testing, will appear in the session window.
Example 11.18 Let σ1² and σ2² denote the variances of the serum cholesterol levels of elderly and young American men. For a sample of 32 elderly American men, the sample standard deviation of serum cholesterol was 32.4; for 36 young American men the sample standard deviation of serum cholesterol was 21.7. Do these data support, at the 5% level of significance, the assumption that the variations of cholesterol levels in the two populations are the same?
Solution: We are given the sample summaries. Following the procedure discussed above, enter the appropriate values in the boxes of the dialog boxes. Notice that the problem provides sample standard deviations rather than variances. Click OK in both dialog boxes. The MINITAB output will appear in the session window as:

Test for Equal Variances

95% Bonferroni confidence intervals for standard deviations

Sample   N    Lower     StDev   Upper
1        32   25.1974   32.4    44.9883
2        36   17.0998   21.7    29.4730

F-Test (normal distribution)
Test statistic = 2.23, p-value = 0.023

This MINITAB procedure not only tests a hypothesis of equal variances, but also provides both numerical and graphical support (see Figure 11.14) in the form of Bonferroni confidence intervals for the two standard deviations at the desired level of confidence. In the above output, since the p-value for the test is 0.023, which is smaller than 5%, the given level of significance, we reject the null hypothesis of equal variances. This means that the variations of cholesterol levels in the two populations are significantly different.
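The F-statistic and its two-sided p-value can be verified from the summary statistics alone; a minimal Python sketch using scipy (our own cross-check):

    from scipy import stats

    s1, n1 = 32.4, 32      # elderly men: sample std dev, sample size
    s2, n2 = 21.7, 36      # young men:   sample std dev, sample size

    F = (s1 ** 2) / (s2 ** 2)              # ratio of sample variances
    # Two-sided p-value for the F-test of equal variances
    pval = 2 * min(stats.f.cdf(F, n1 - 1, n2 - 1),
                   stats.f.sf(F, n1 - 1, n2 - 1))
    print(f"F = {F:.2f}, p = {pval:.3f}")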

Figure 11.14 MINITAB printout of 95% Bonferroni confidence intervals for standard deviations (horizontal axis scaled from 20 to 45).

11.1.7 Testing Normality


It is common to assume that the selected sample comes from a normal population. Whenever we make this assumption, however, it becomes necessary to verify it, because if the assumption is not true, any conclusions drawn from the data may not be valid.
From the Menu bar select Stat > Basic Statistics > Normality Test. This will prompt a dialog box titled Normality Test to appear. Enter C1 in the box next to Variable. Under Percentile lines check the circle next to None. Then under Normality Test check one of the circles next to Anderson-Darling, Ryan-Joiner, or Kolmogorov-Smirnov. Click OK. The normal probability graph will appear in the graph window.
Example 11.19 Test at the 5% level of significance if the following sample
comes from a normal population:

23 25 20 16 19 55 42 25 28 29 36 26 27 35 41
55 20 24 29 26 37 38 24 26 34 36 38 39 32 33


Solution: Enter the data in column C1 of the data window, then follow the steps discussed in the Normality Test procedure. The MINITAB output, the normal probability graph shown in Figure 11.15, will appear in the graph window. If all the data points fall almost on a straight line, we can conclude that the sample comes from a normal population. The decision whether all the points fall almost on a straight line is somewhat subjective. Imagine a 10-year-old child putting a finger on the straight line: if all the points are hidden under the finger, we can say the data pass the normality test. In this example, all the data points except two fall on a straight line, so we can assume that the data pass the normality test. Moreover, MINITAB also provides the p-value for whichever of the tests mentioned above was selected. For instance, we determined the p-value for the Anderson-Darling normality test. Clearly the p-value is greater than 5%, the level of significance. Therefore, we can assume the given data come from a normal population.
The normal probability plot can usually be created in two steps:
1. Arrange the data in ascending order and rank them 1, 2, 3, ..., n, where n is the sample size.
2. Plot the ith-ranked observation against 100(i − 0.5)/n on special graph paper, called normal-probability graph paper (see Figure 11.15). If the plotted points fall on a straight line, it implies that the sample comes from a normal population.
Note that the horizontal axis contains the data values and the vertical axis contains the values of 100(i − 0.5)/n.

Figure 11.15 MINITAB display of normal probability graph for the data in Example
11.19.
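An Anderson-Darling check of the same data can be run outside MINITAB as well; a minimal Python sketch using scipy (our own cross-check; note that scipy reports the test statistic with critical values rather than a p-value, so the conclusion is drawn by comparison):

    from scipy import stats

    data = [23, 25, 20, 16, 19, 55, 42, 25, 28, 29, 36, 26, 27, 35, 41,
            55, 20, 24, 29, 26, 37, 38, 24, 26, 34, 36, 38, 39, 32, 33]

    result = stats.anderson(data, dist='norm')
    print("A-squared =", round(result.statistic, 3))
    # Normality is rejected at a given level only if the statistic
    # exceeds the corresponding critical value.
    for crit, sig in zip(result.critical_values, result.significance_level):
        print(f"  {sig:4.1f}% level: critical value = {crit:.3f}")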


11.2 Using JMP, Version 5.1


JMP has the option of using commands from a menu bar, typing in script in
a script editor, or using a combination. As shown in Figure 11.16, JMP provides a look and feel familiar to users of other Windows-based products. As
in all Windows-based products, menus are used extensively to help you navigate through the package and select features/options.
Once in the JMP environment, you will see a window with the heading JMP - JMP Starter, as displayed in Figure 11.16.
The JMP Starter window is simply a tool to help new users become
familiar with the capabilities and features of the software. Brief text descriptions, as well as icons of features and options, are provided in the JMP
Starter but are not as readily available in the main menu bars. The tabs on
the JMP starter window include File, Basic, Model, Multivariate,
Survival, Graph, Surface, QC, Tables, and Index. The File tab contains
the following options: New Data Table, Open Data Table, Open Database
Table, New Script, Open Script, Open Journal, and Preferences. The
JMP Starter may be turned off or disabled by the user under File > Preferences.
11.2.1 Getting Started with JMP
In this part of the chapter, we discuss briefly how to use the JMP pull-down
menus to analyze statistical data. Once you log on to your personal computer

Figure 11.16 The screen that appears first in the JMP environment.

Figure 11.17 JMP menu command options. (From the Menu bar select File > New; this gives the option to create either a new Data Table or a script file. Selecting Data Table opens a new table, an empty Data window; selecting Script opens a new Script window for entering SAS code.)

and get into the JMP environment, you will see the image in Figure 11.16 on
your screen. The pull-down menus appear at the top of the screen.
Menu commands include
File Edit Tables Rows Cols DOE Analyze Graph Tools View Window
Help
By clicking any of these menu commands, we arrive at options included in that command. For example, if we click on the File menu, we get the drop-down menu shown in Figure 11.17. The first option, New, allows us to create a data table as displayed in Figure 11.17.
Creating a Data Table
New data are entered in a Data Table. The strength of JMP as a statistical
analysis software package is derived from its ability to process data in
columns, more so than in rows. Data can be entered in one or more columns
depending upon the setup of the problem. By default, one active column
appears in the Data Table window. To add more columns, double-click on the
blank space to the right of the last column that was created. The first column
on the far left serves as an index of the number of cells. Labels can be entered
for each column by double-clicking on the top cell of each column and entering a label such as Part Name, Shift, Lot Number, Operator, or Machine. In
the labeled cells you can enter data using a single cell for a single data point.
Saving a Data Table
Using the command File > Save As allows users to save the current Data Table. When you enter this command, a dialog box entitled Save JMP File As appears. Type the file name in the box next to File Name and then

click Save. The file will be saved in the drive that you must choose before you click Save.
Retrieving a Saved JMP Data Table
Using the command File > Open Data Table will prompt the dialog box Open Data Table to appear. Select the drive and directory where the file was saved by clicking the down arrow next to the Look in box, enter the file name in the box next to File Name, and click Open. The data will appear in
the same format it was last saved in.
Importing Data from the Internet
Using the command File > Internet Open... will prompt the dialog box Internet Open Window to appear. The default protocol is HTTP; if you are importing data from the Internet using an FTP site, then select ftp from the drop-down menu. In the URL box type in the website address that contains
the data you want to import into JMP for analysis.
11.2.2 Calculating Descriptive Statistics
Column Statistics
First, enter the desired data in the active Data Table window. Then from the
Menu command select Tables > Summary. Select one or more data columns
in the box located on the left side of the dialog box to calculate descriptive
statistics. Then click on the Statistics button. A drop-down menu appears
with various options available to compute statistics for selected columns such
as the sum, mean, standard deviation, minimum, maximum, range, median,
sum of squares, N total, N nonmissing, and N missing. All these choices of
statistics appear in the dialog box shown in Figure 11.18. Then click OK.
Example 11.20 Use the following steps to calculate any one of the statistics listed in the dialog box titled Column Statistics, using the following data:

8 9 7 6 5 6 8 9 8 9
Solution:
1. Open a new Data Table under File > New > Data Table.
2. Enter the data in column 1 of the Data Table window.
3. Select Tables from the Menu command.
4. Click Summary from the pull-down menu available in the
Tables command menu.
5. Select column 1 in the box located on the left side of the dialog
box.
6. Under the Statistics option on the right side of the window,
choose the statistics that you want included in the summary.
Repeat this step until you have selected all the statistics that you
would like to be included in the summary.
7. Click OK.

Figure 11.18 JMP window showing input and output for Column Statistics.

A Summary Table will appear with the summary of the statistics as selected.
Graphs
JMP allows two ways of generating graphs: one under Graph > Chart and the second under Analyze > Distribution. Choices under the Graph menu include Chart, Overlay Plot, Spinning Plot, Contour Plot, Control Chart, Variability/Gage Chart, Pareto Plot, and Ternary Plot. Next, we discuss some of these graphs.
Histogram
First, enter the data in the Data Table. For each variable use only one column. Then use the menu command Analyze > Distribution. These commands prompt a dialog box titled Distribution to appear as shown in Figure
11.19, which has three sections: Select Columns, Cast Selected Columns
into Roles, and Action. An option from each category must be selected.
Choose the appropriate variables to form a distribution by clicking on the
desired columns from the Select Columns Category. Then select Y,
Columns from Cast Selected Columns into Roles, and select the appropriate action from the Action category. Click OK.
For example, choose column 1 from Select Columns and click on Y,
Columns from Cast Selected Columns into Roles. Click on OK from
Action. This results in a new window with the output, displaying the histogram, outlier box plot, quantiles, and summary statistics.
Example 11.21

Prepare a histogram for the following data:

23 25 20 16 19 18 42 25 28 29 36 26 27 35 41
18 20 24 29 26 37 38 24 26 34 36 38 39 32 33

Figure 11.19 JMP Distribution dialog box.


Figure 11.20 JMP display of histogram for the data given in Example 11.21.

Solution: Take the following steps to generate a histogram from the data set
in Example 11.21.
1. Enter the data in column 1 of the Data Table.
2. Select Analyze from the Menu command.
3. Click Distribution from the pull-down menu.
4. Select column 1 in the Select Columns box.
5. Select Y, Columns from Cast Selected Columns into Roles.
6. Click OK from Action.
To make the graph layout horizontal, click on the red arrow to the left of the column 1 title in the histogram window, shown in Figure 11.20. Select Display Options > Horizontal Layout.
Stem and Leaf
Prepare a stem and leaf diagram for the data in Example 11.21.
1. Enter the data in column 1 of the Data Table.
2. Select Analyze > Distribution from the Menu command.
3. Click the column name under Select Columns.
4. Select Y, Columns from Cast Selected Columns into Roles.
5. Click OK from Action.
6. Select the red arrow to the left of the column name.
7. Select Stem and Leaf from the drop-down menu.


Figure 11.21 JMP printout of stem and leaf for the data given in Example 11.21.

The stem and leaf diagram for the data in Example 11.21 is shown in Figure 11.21.
Box Whisker Plot
Extending our discussion of the data in Example 11.21, JMP is capable of
generating a box whisker plot along with the histogram. In JMP, there are
two types of box plots: Outlier Box Plot and Quantile Box Plot. Outlier
Box Plot is generated by default. To generate the Quantile Box Plot, take
the following steps:
1. Right-click anywhere outside the histogram graph, shown in
Figure 11.20.
2. Select the Quantile Box Plot Option.
To set the graph in a horizontal position as shown in Figure 11.22, left-click the red arrow to the left of the title Distributions above the graph and select the Stack option.
Displayed above the histogram, we see both outlier and quantile box
plots. Data points that are considered outliers are presented as individual
points in the tails of the graphic while the interquartile range is indicated by
the width of the box. The single vertical line within the box represents the

Figure 11.22 JMP display of box plot (outlier box plot and quantile box plot) with summary statistics for Example 11.21.

median of the data. The center of the diamond displays the mean, and the
width of the diamond indicates a 95% confidence interval for the mean. JMP
lists by default the quantiles and other related statistics (moments) such as
mean and standard deviation to the right of the graph.
Graphical Summary
First, enter the data in one or more columns of the Data Table depending
upon whether you have data on one or more variables. For each variable use
only one column. Then using the menu command, select Tables > Summary. These commands prompt a dialog box titled JMP: Summary to appear. Select the appropriate columns by highlighting them. Under the Statistics option select the appropriate statistics to display. Once all the statistics to be displayed are selected, select OK. This option provides both graphical and numerical descriptive statistics. A separate graph and summary of statistics is displayed for each variable.
To fit a distribution to the histogram, click on the red arrow to the left of Column 1 in the Distribution dialog box. This action provides a drop-down menu. From the drop-down menu, select Fit Distribution > Normal (to fit a normal distribution to the data, or choose any other desired distribution such as Weibull, Exponential, or Poisson). Click the red arrow by the side of Fitted Normal, and more options become available, such as Goodness of Fit Test and Density Curve.
Example 11.22

Prepare the graphical summary for the following data:

23 25 20 16 19 55 42 25 28 29 36 26 27 35 41 55
20 24 29 26 37 38 24 26 34 36 38 39 32 33 25


Solution: Take the following steps to generate the graphical summary as shown in Figure 11.23.
1. Enter the data in column 1 of the Data Table.
2. Under the Command Menu select Analyze.
3. Then select Distribution.
4. Select column 1 in the Select Columns box.
5. Select Y, Columns from Cast Selected Columns into Roles.
6. Click OK from Action.
7. To fit a distribution to the data, right-click anywhere outside the graph. That prompts a drop-down menu.
8. Select Fit Distribution.
9. Then select the desired distribution from the list of distributions available.
Note that from the Goodness-of-Fit Test we see that the p-value is quite large, which indicates that the data fit the normal distribution well.
Bar Chart
Enter the data containing categories and frequencies in column 1 and column 2, or, if the categories and frequencies are not given, enter the categorical data in column 1. Then from the Menu commands select Graph > Chart, which brings up a dialog box titled Chart. Three options are provided in the Options box. To construct a bar chart, from the Options box select Bar Chart. In the Cast Selected Columns into Roles box, select N from the Statistics pull-down menu. Then select the X, Level option and click OK. Note that the type of data for the bar chart should be qualitative.

Figure 11.23 JMP display of graphical summary for the data in Example 11.22.
Example 11.23

Prepare a bar graph for the following categorical data:

Solution: Take the following steps to generate a bar chart as shown in Figure
11.24.
1. Enter the data in a column.
2. Under the Command Menu select Graph > Chart.
3. A dialog box titled Chart appears.
4. Select the desired column from the Select Columns box.
5. Under the Options box in this window, select Bar Chart from
the drop-down menu.

Figure 11.24 JMP display of bar graph for the data in Example 11.23.


6. Under Cast Selected Columns into Roles, click on the Statistics button.
7. Select N.
8. Then select the X, Level option.
9. Then click OK.
Pie Chart
Enter the data containing categories and frequencies in column 1 and column 2, or, if the categories and frequencies are not given, enter categorical data in column 1. From the menu command, select Graph > Chart, and a dialog box titled Chart appears. Under the Select Columns box, select the columns for which you want a pie chart formed. Then, in the Options box, click on Vertical, and then select Pie. Select N from the Statistics drop-down menu in the Cast Selected Columns into Roles box. Then select the X, Level option and click OK. Note that the type of data for the pie chart should be qualitative.
Example 11.24

Prepare a pie chart for the following categorical data:

Solution: Take the following steps to generate a pie chart as shown in Figure
11.25.
1. Enter the data in a column.
2. Under the Command Menu select Graph > Chart.
3. A dialog box entitled Chart appears.
4. Select the desired column from the Select Columns box.
5. Under the Options box in this window, select Pie Chart from
the drop-down menu.
6. Under Cast Selected Columns into Roles, click on the Statistics button.
7. Select N from the Statistics drop-down menu.
8. Then select the X, Level option.
9. Click OK.


Figure 11.25 JMP printout of pie chart for the data in Example 11.24.

11.2.3 Estimation and Testing of Hypotheses about One Population Mean
1-Sample t
In Chapters 9 and 10, we saw that if a small sample is taken from a normal population with an unknown variance, we can use the t-statistic for finding a confidence interval and testing a hypothesis about the mean. We now discuss the JMP procedure for 1-Sample t.
From the Menu bar select Analyze > Distribution. This will prompt a dialog box titled Distribution to appear. Select the desired column from the Columns option, if you have entered the raw data in columns. Then select the Y, Columns option from the Cast Selected Columns into Roles and click OK. Select the red arrow to the left of Column 1. Then select Test mean. In the Specify Hypothesized Mean box, enter the hypothesized value of the mean under the null hypothesis. The JMP output is shown in Figure 11.26.
Example 11.25 Consider the following data from a population with an unknown mean μ and unknown standard deviation σ:

23 25 20 16 19 35 42 25 28 29 36 26
27 35 41 30 20 24 29 26 37 38 24 26

Figure 11.26 JMP printout of 1 sample t-test for the data in Example 11.25.

(a) Find a 95% confidence interval for the mean μ.
(b) Test the hypothesis H0: μ = 28 versus H1: μ ≠ 28 at the 5% level of significance.
Solution: Take the following steps to perform the t-test.
1. Enter the data in column 1 of the Data Table.
2. Select Analyze > Distribution from the Menu command.
3. Click the column name under Select Columns.
4. Select Y, Columns from Cast Selected Columns into Roles.
5. Click OK from Action.
6. Select the red arrow to the left of the column name.
7. Select Test Mean from the drop-down menu.
8. Enter the value of the mean under the null hypothesis in the
dialog box that appears.
9. Then click OK.
Since the p-value for the test is 0.797, which is much greater than the 5% level of significance, we do not reject the null hypothesis. The 95% confidence interval is (25.38704, 31.36296).
1-Sample z
If the sample size is large (n ≥ 30), we use the z-statistic instead of the t-statistic. To perform a z-test in JMP, follow the steps for the t-test discussed earlier.
Select the red arrow to the left of Column 1 in the Distribution box. Select Test mean. In the Specify Hypothesized Mean box, enter the value of the mean under the null hypothesis. Then enter the value of the standard deviation in the box next to Enter True Standard Deviation to do a z-test rather than a t-test. The t-test is specified by default.
Example 11.26 Consider the following data from a population with an unknown mean μ and standard deviation σ, which may or may not be known:
23 25 20 16 19 35 42 25 28 29 36 26 27 35 41 30
20 24 29 26 37 38 24 26 34 36 38 39 32 33 25 30
(a) Find a 95% confidence interval for the mean μ.

(b) Test the hypothesis H0: μ = 30 versus H1: μ ≠ 30 at the 5% level of significance.
Solution: Since the sample size (n = 32) is greater than 30, it is considered to be a large sample. Therefore, we use the z-statistic. Also, when the sample size is large, the population standard deviation σ, if it is unknown, as in this example, can be replaced with the sample standard deviation S. The output is shown in Figure 11.27.

Figure 11.27 JMP printout of 1-sample z-test for the data in Example 11.26.

Since the p-value for the test is 0.756, which is much greater than the level of significance 5%, we do not reject the null hypothesis. The 95% confidence interval is given as (27.1630, 32.0869). Also, note that since the 95% confidence interval contains the value of μ we were testing for, we do not reject the null hypothesis at the 5% [(100 − 95)%] level of significance.
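If JMP is not at hand, the large-sample z-procedure is also easy to carry out directly. A minimal sketch, again in Python with scipy/numpy (an assumption of this illustration, not the book's software), substituting S for σ as described above:

    import numpy as np
    from scipy import stats

    x = np.array([23, 25, 20, 16, 19, 35, 42, 25, 28, 29, 36, 26, 27, 35, 41, 30,
                  20, 24, 29, 26, 37, 38, 24, 26, 34, 36, 38, 39, 32, 33, 25, 30])

    n, xbar, s = len(x), x.mean(), x.std(ddof=1)   # s replaces sigma (n = 32 > 30)
    z = (xbar - 30) / (s / np.sqrt(n))             # test H0: mu = 30
    p_value = 2 * stats.norm.sf(abs(z))            # two-sided p-value
    ci = (xbar - 1.96 * s / np.sqrt(n), xbar + 1.96 * s / np.sqrt(n))

    print(f"z = {z:.3f}, p = {p_value:.3f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")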
2-Sample t
For a 2-sample t-test, enter the data in one column by stacking. In the second column, identify the sample for each observation that was entered in the first column. From the JMP starter window, select the 2-sample t-test. From the Select Columns option in the dialog box, select the column containing the response, and then click on the Y, Response option in the Cast Selected Columns into Roles box. Then select the second column, which contains the sample identification, from the Select Columns option in the same dialog box, and then click on the X, Grouping option in the Cast Selected Columns into Roles box. Click OK. For more options, such as displaying quantiles, click on the red arrow to the left of Oneway Analysis of Column 1 By Column 2.
Example 11.27 A company bought resistors from two suppliers, 26 resistors from the first supplier and 19 from the second supplier. The following data show the coded values of the resistance of the resistors bought by the company. Assuming that the samples come from two normal populations, determine whether the mean resistance of one population is significantly different from the other. Use α = 0.05. Also, find a 95% confidence interval for the difference of the two population means.

Sample 1 7.366 7.256 4.537 5.784 5.604 7.987 10.996 6.743 8.739 4.963 9.065 6.451 7.028
6.924 6.525 9.346 5.157 6.372 9.286 3.818 3.221 11.073 6.775 7.779 4.295 5.964

Sample 2 7.730 5.366 4.365 3.234 5.334 6.870 4.268 5.886 7.040 5.434 4.370 4.239 3.875
4.154 5.798 5.995 5.324 5.190 5.330

Solution:
1. Enter the data in a column in a stacked form.
2. In the second column, identify the sample for each observation that was entered in the first column.
3. Under the JMP starter window, click the two sample t-test.
4. In Select Columns, select the data column, then click Y, Response.
5. Select the sample identification column from Select Columns, and then click X, Grouping.
6. Click OK.
The JMP output is as shown in Figure 11.28. The output gives a 95% confidence interval (0.66840, 2.59939) for the difference of the two population means. The p-value for the test is 0.0014, which is less than the 5% level of significance. Thus, we reject the null hypothesis and conclude that the mean resistances of the resistors supplied by the two suppliers are significantly different.

Figure 11.28 JMP printout of 2-sample t-test for the data in Example 11.27.
Paired t
From the Menu bar select Analyze > Matched Pairs. Choose Samples in columns if you have entered the raw data in two columns. Once the columns have been selected in the Select Columns box (select more than one column by holding the Ctrl key), click on the Y, Paired Response button, and click OK.
A graphical display is provided for the paired t-test. Paired t evaluates the first sample minus the second sample. The results also provide the confidence intervals.
Example 11.28 The following data give the test scores before and after a two-week training of a group of 15 Six Sigma Green Belts:

Before: 83 87 83 78 76 89 79 83 86 90 81 77 74 86 89
After:  87 85 89 77 79 84 90 88 83 92 87 81 83 79 85
Test the hypothesis H0: μd = 0 versus H1: μd ≠ 0 at the 5% level of significance. Find a 95% confidence interval for the difference in before and after population means.
Solution:
1. Enter the before and after scores in two separate columns of the Data Table.
2. From the Menu bar select Analyze > Matched Pairs.
3. In the Select Columns box, select both columns, then click Y, Paired Response.
4. Click OK.


Figure 11.29 JMP printout of paired t-test for the data in Example 11.28.

The output is as shown in Figure 11.29.
This output gives a 95% confidence interval (−4.81092, 1.07759) for the difference between the before and after population mean test scores. The p-value for the test is 0.195, which is greater than the 5% level of significance. Thus, we do not reject the null hypothesis; the average scores before and after the training are not significantly different. In other words, the training program is not very effective.
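A minimal paired-t sketch for the before/after scores of Example 11.28, again in Python with scipy (assumed for illustration):

    import numpy as np
    from scipy import stats

    before = np.array([83, 87, 83, 78, 76, 89, 79, 83, 86, 90, 81, 77, 74, 86, 89])
    after  = np.array([87, 85, 89, 77, 79, 84, 90, 88, 83, 92, 87, 81, 83, 79, 85])

    # Paired t-test evaluates the differences before - after
    t_stat, p_value = stats.ttest_rel(before, after)
    d = before - after
    ci = stats.t.interval(0.95, df=len(d) - 1, loc=d.mean(), scale=stats.sem(d))

    print(f"t = {t_stat:.3f}, p-value = {p_value:.3f}")          # p ~ 0.195
    print(f"95% CI for mean difference: ({ci[0]:.5f}, {ci[1]:.5f})")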
11.2.4 Estimation and Testing of Hypotheses about Two Population Variances
In many applications dealing with two normal populations, we frequently assume that the variances of the two populations are equal. In such situations, it is important that we verify whether that assumption is valid. Below, we discuss a JMP procedure to test whether the two variances are equal.
Example 11.29 Consider the data from Example 11.27. Determine
whether the two population variances are equal.
Sample 1 7.366 7.256 4.537 5.784 5.604 7.987 10.996 6.743 8.739 4.963 9.065 6.451 7.028
6.924 6.525 9.346 5.157 6.372 9.286 3.818 3.221 11.073 6.775 7.779 4.295 5.964

Sample 2 7.730 5.366 4.365 3.234 5.334 6.870 4.268 5.886 7.040 5.434 4.370 4.239 3.875
4.154 5.798 5.995 5.324 5.190 5.330

Solution:
1. Enter the data in a column in a stacked form.
2. In the second column, identify the sample for each observation that was entered in the first column.
3. Under the JMP starter window, click the two sample t-test.
4. In Select Columns, select the data column, then click Y, Response.
5. Select the sample identification column from Select Columns, and then click X, Grouping.
6. Click OK.
7. Click the red arrow to the left of Oneway Analysis of Column 1 By Column 2.
8. Select UnEqual Variances.
The output is shown in Figure 11.30.

Figure 11.30 JMP printout of test of equal variances in Example 11.29.

Since the p-value of the F Test 2-sided is 0.0167, which is less than 5%, the level of significance, we reject the null hypothesis of equal variances.
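The two-sided F test that JMP reports can also be reproduced by hand or in code. A minimal sketch (Python with scipy, assumed for illustration) for Example 11.29:

    import numpy as np
    from scipy import stats

    sample1 = np.array([7.366, 7.256, 4.537, 5.784, 5.604, 7.987, 10.996, 6.743,
                        8.739, 4.963, 9.065, 6.451, 7.028, 6.924, 6.525, 9.346,
                        5.157, 6.372, 9.286, 3.818, 3.221, 11.073, 6.775, 7.779,
                        4.295, 5.964])
    sample2 = np.array([7.730, 5.366, 4.365, 3.234, 5.334, 6.870, 4.268, 5.886,
                        7.040, 5.434, 4.370, 4.239, 3.875, 4.154, 5.798, 5.995,
                        5.324, 5.190, 5.330])

    # F statistic: ratio of the two sample variances
    F = sample1.var(ddof=1) / sample2.var(ddof=1)
    df1, df2 = len(sample1) - 1, len(sample2) - 1

    # Two-sided p-value for H0: sigma1^2 = sigma2^2
    p_one_sided = stats.f.sf(F, df1, df2) if F > 1 else stats.f.cdf(F, df1, df2)
    print(f"F = {F:.3f}, two-sided p-value = {2 * p_one_sided:.4f}")  # ~ 0.0167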

11.2.5 Normality Test

In most applications it is common to assume that the selected sample comes from a normal population. It then becomes necessary to verify that assumption, because if the assumption is not true, any conclusions drawn may not be valid.
Example 11.30 Determine whether the following sample comes from a
normal population:

23 25 20 16 19 55 42 25 28 29 36 26 27 35 41
55 20 24 29 26 37 38 24 26 34 36 38 39 32 33

Solution: We use the normal quantile plot to verify whether the sample comes from a normal population. To construct the normal quantile plot, take the following steps.
1. Enter the data in a column.
2. Under the Command Menu select Analyze.
3. Then select Distribution.
4. In Select Columns, select the data column, then click Y, Columns.
5. Click OK.
6. Click the red arrow to the left of Column 1.
7. Select Normal Quantile Plot.
The output is shown in Figure 11.31.
The output is shown in Figure 11.31.
The interpretation of the plot is as follows:
1. If the data points fall on the red straight line, it indicates that the
normality assumption is satisfactory.
2. If the points fall outside the dotted red curves, it indicates a significant departure from the normality assumption.
3. The slope of the red line represents the standard deviation of the
distribution.

Quantile Plot
55

Dotted
straight line

50
45
40

Confidence interval for


normality

35
30

Dotted red
curves

25
20
15

Slope represents
standard deviation

-3

-2

-1

Normal quantile

Figure 11.31 JMP display of normal quantile graph for the data in Example 11.30.
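Beyond the visual check, a formal test can supplement the quantile plot. A minimal sketch (Python with scipy, assumed for illustration; the book itself uses the JMP plot only) for the data of Example 11.30:

    import numpy as np
    from scipy import stats

    x = np.array([23, 25, 20, 16, 19, 55, 42, 25, 28, 29, 36, 26, 27, 35, 41,
                  55, 20, 24, 29, 26, 37, 38, 24, 26, 34, 36, 38, 39, 32, 33])

    # Shapiro-Wilk test: a small p-value indicates departure from normality
    w_stat, p_value = stats.shapiro(x)
    print(f"W = {w_stat:.4f}, p-value = {p_value:.4f}")

    # Normal quantile (probability) plot coordinates; the fitted slope
    # estimates the standard deviation, as in the JMP display
    (osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")
    print(f"slope (approx. std. dev.) = {slope:.3f}")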


11.3 Web-based Computing Resources

We have devoted a substantial portion of this book to a discussion of PC-based software for statistical support, namely JMP and MINITAB. We realize that not all readers of this book may have access to this software, as inexpensive as it is intended to be. It is important to note that the calculations discussed throughout this book can be completed by hand or with an inexpensive calculator. The reality of actually applying statistics in an operational industrial environment, however, is that hand calculations take too much time and are prone to data entry or procedural errors. The question becomes, are there computer-based resources available to support statistical analysis that are free, easy to access, and simple to use? The answer is, fortunately, yes!
A search of the Internet using a phrase such as "statistical support" will turn up thousands of sources related to applications of statistics. While many of these sites would prove not to be useful in your efforts to complete statistical analyses or Six Sigma work, you would quickly find a great many sites that would be invaluable, as they meet all the criteria previously mentioned (i.e., free, easy to access, simple to use).
Online resources are roughly grouped into three primary categories: statistics calculated from public or private data (these types of resources are not applicable for our purposes in this book), interactive calculators, and texts/references. While there are more than 28 million statistics-related sites online, you need only find one or two sites that are well prepared, stable, and of course meet your needs.
One excellent example of an interactive calculator is at www.statpages.net. Statpages.net offers more than 380 interactive statistical tools/techniques/procedures for anyone to use. The statistical procedures are all applied in nature in that they are active calculators, many of which offer insights into assumptions, precautions, and interpretations associated with the procedures. Statpages.net also offers free statistical support software, statistics textbooks/references, statistical demonstrations/tutorials, and links to other statistics-related pages.
An excellent example of a web-based text/reference is www.statsoftinc.com. Statsoftinc.com is primarily a reference manual for explaining statistical terms, procedures, and related concepts. All the topics at this site are linked to a comprehensive table of contents and index.
It is important to say that there are many, many more computer-based resources available than we have the time or page count to address in this book. In fact, there are so many web-based resources that information overload quickly becomes a consideration when directing anyone to these resources; hence our advice is to limit your interest in these types of resources to those few that meet your needs. Our most sincere advice regarding the use of computing resources to support applied statistics is to find one or two tools that you like and that meet your needs, learn these tools and resources as well as you can, and DO NOT keep looking for additional resources.

About the Authors

Bhisham C. Gupta, M.A., M.S., Ph.D.

Bhisham C. Gupta is a professor of statistics at the University of Southern Maine in the United States. Bhisham has taught numerous topics of statistics, including probability theory, statistical inference, biostatistics, linear models, regression analysis, design of experiments, analysis of variance, sampling techniques, quality control, nonparametric methods, and applied statistics for engineers at undergraduate and graduate levels in the United States, Canada, Brazil, and India.
Bhisham developed undergraduate and graduate programs at the Federal University of Rio de Janeiro, Brazil, and at the University of Southern Maine. He also organized the first Brazilian National Symposium of Probability and Statistics.
Bhisham has published 35 research papers in well-reputed international journals. His research interests include sampling theory, design of experiments, and statistical quality control. Bhisham has also co-authored or authored Regression and Analysis of Variance Techniques and Elements of Probability Theory (in Portuguese), published by the Federal University of Rio de Janeiro, Brazil. In addition, Bhisham has consulted in the semiconductor industry, pulp and paper industry, and medical community.

H. Fred Walker, Ph.D., CQMgr, CQE, CQA, CSIT

H. Fred Walker is the department chair and graduate coordinator in the Department of Technology at the University of Southern Maine. Fred develops and teaches graduate courses in applied research methods, engineering economy, quality systems, statistical quality control, quality engineering, design of experiment applications in manufacturing, manufacturing strategies, and project management. He also develops and teaches undergraduate courses in quality, industrial statistics, statistical quality control, cost analysis and control, human resource management, project management, and technical writing.
Fred's research agenda is focused on enhancing the competitiveness of manufacturers through appropriate technology and operating practices. In support of his research agenda, Fred has written nearly 30 refereed articles published in national and international journals. Fred also co-edited and co-authored the Certified Quality Engineering Handbook (2nd ed.) and co-authored the Certified Quality Technician Handbook, both published by ASQ Press.
Fred's industrial experience includes 12 years with airborne weapons systems integration and automation, supervision and administration, project management, and program management in countries around the Pacific Rim, Australia, and Africa. Fred's consulting experience includes 25 years of continual involvement with international manufacturers in the semiconductor, pulp and paper, biomedical equipment, food processing, printing, farm implement, and machined component industries. This experience has enabled Fred to earn Certified Quality Manager, Certified Quality Engineer, Certified Quality Auditor, Certified Manufacturing Technologist, and Certified Senior Industrial Technologist designations.
Fred is a Six Sigma Black Belt and continues to support Six Sigma implementation and training needs in several companies.

Acknowledgments

We would like to thank Professors John Brunette and Cheng Peng of the University of Southern Maine, and Ramesh Gupta and Pushpa Gupta of the University of Maine, Orono, for reading the final draft line-by-line. Their comments and suggestions have proven to be invaluable. We would like to thank Professor Joel Irish of the University of Southern Maine for help in writing a computer program in Mathematica that was used to prepare all the figures in this book. We thank graduate students Mohamad Ibourk, Seetha Shetty, and Melanie Thompson for help preparing the chapter on computer resources, as well as Mary Ellen Costello and Stacie Santomango for general manuscript preparation. Also, we thank Laurie McDermott, administrative assistant of the Department of Mathematics and Statistics of the University of Southern Maine, for help in typing the various drafts of the manuscript. We would like to thank the several anonymous reviewers whose constructive suggestions greatly improved the presentations. We also want to thank Annemieke Hytinen, acquisitions editor, and Paul O'Mara, project editor, of ASQ Quality Press for their patience and cooperation throughout the preparation of this project.
We thank Minitab Inc. for permitting us to print MINITAB screen shots in this book. MINITAB and the MINITAB logo are registered trademarks of Minitab Inc.
We also thank SAS Institute Inc., of Cary, North Carolina, for permitting us to reprint screen shots of JMP v. 5.1 (© 2004 SAS Institute Inc. SAS, JMP, and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration).
Most of all, the authors would like to thank their families. Bhisham is grateful to his wife, Swarn, daughters Anita and Anjali, and son, Shiva, for their deep love and support. He is grateful to his son-in-law, Mark, for his expressed curiosity. Last but not least, he is grateful to his first grandchild, Priya, for reminding him that there is always time for play. Fred would like to sincerely thank his wife, Julie, and sons, Carl and George, for their love, support, and patience as he worked on this and two previous books. Without their encouragement, such projects would not be possible or meaningful.

Bhisham C. Gupta
H. Fred Walker

Bibliography

Selected Books on the Theory of Probability and Statistics

Freund, John E. (1992). Mathematical Statistics, 5th ed. Englewood Cliffs, NJ: Prentice Hall.
Hogg, Robert V., & Tanis, Elliot A. (1993). Probability and Statistical Inference, 4th ed. New York: Macmillan.
Hoel, Paul G., Port, Sidney C., & Stone, Charles J. (1971). Introduction to Probability Theory. Boston: Houghton Mifflin.
Hoel, Paul G., Port, Sidney C., & Stone, Charles J. (1971). Introduction to Statistical Theory. Boston: Houghton Mifflin.
Hogg, Robert V., & Craig, Allen T. (1978). Introduction to Mathematical Statistics, 4th ed. New York: Macmillan.
Ross, Sheldon. (2002). A First Course in Probability, 6th ed. Upper Saddle River, NJ: Prentice Hall.
Ross, Sheldon M. (1996). Introductory Statistics. New York: McGraw-Hill.
Wackerly, D. D., Mendenhall, W., III, & Scheaffer, R. L. (2002). Mathematical Statistics with Applications, 6th ed. Pacific Grove, CA: Duxbury & Thomson Learning.

Selected Books on Engineering Statistics/Special Topics

Daniel, Wayne W. (1990). Applied Nonparametric Statistics, 2nd ed. Pacific Grove, CA: Duxbury Thomson Learning.
Devore, J. L. (2004). Probability and Statistics for Engineering and the Sciences, 6th ed. Belmont, CA: Duxbury & Brooks/Cole.
Guttman, I., Wilks, S. S., & Hunter, J. S. (1982). Introductory Engineering Statistics, 3rd ed. New York: John Wiley and Sons.
Johnson, Richard A. (2000). Miller & Freund's Probability and Statistics for Engineers, 6th ed. Upper Saddle River, NJ: Prentice Hall.
Montgomery, D. C., & Runger, G. C. (2003). Applied Statistics and Probability for Engineers, 3rd ed. New York: John Wiley and Sons.
Ross, S. (2000). Introduction to Probability and Statistics for Engineers and Scientists, 2nd ed. San Diego: Academic Press.
Sall, J., Creighton, L., & Lehman, A. (2005). JMP Start Statistics, 3rd ed. Belmont, CA: Duxbury & Brooks/Cole.
Walpole, R. E., Myers, R. H., & Myers, S. L. (2002). Probability and Statistics for Engineers and Scientists, 7th ed. Upper Saddle River, NJ: Prentice Hall.

Practice Problem Solutions


Chapter 2
1. (a) quantitative (b) qualitative (c) quantitative (d) quantitative (e) qualitative (f) qualitative
2. (a) quantitative (b) qualitative (c) quantitative (d) qualitative (e) quantitative
3. (a) ordinal (b) nominal (c) nominal (d) ordinal (e) nominal

4. (a) interval (b) ratio (c) ratio (d) ratio
5. (a) nominal (b) ordinal (c) nominal (d) interval (e) ratio (f) ratio
6. Answers may vary:
(a) nominal: Types of apples in an apple orchard
(b) ordinal: Guide book rating of local restaurants: poor, fair, good
(c) interval: Water temperature of Lake Michigan
(d) ratio: Cost of tickets to a baseball game
7. (a) nominal (b) ratio (c) ratio (d) ordinal (e) nominal (f) ratio (g) interval

8. (a) descriptive (b) descriptive (c) inferential (d) inferential (e) inferential (f) descriptive
9. Population: A collection of all conceivable individuals, elements, numbers, or entities which possess a characteristic of interest.
Sample: A portion of a population selected for study.
Random Sample: A sample in which every element of the population has an equal chance of being selected.
Representative Sample: A sample that has approximately the same distribution of characteristics as the population from which it was drawn.
Descriptive Statistics: A branch of statistics that uses techniques to organize, summarize, present, and interpret a data set to draw conclusions that do not go beyond the boundaries of the data set.
10. (a) inferential (b) descriptive (c) inferential (d) inferential (e) descriptive
11. (a) population (b) sample (c) sample (d) sample (e) population
12. Population: All patients who were admitted to the hospital
Sample: 200 patients who were admitted to the hospital over the past several months
13. (a) descriptive (b) inferential
14. (a) population (b) sample (c) population
15. (a) no (b) no (c) no (d) yes (e) yes (f) no, since the people listed in the telephone book may not all be eligible voters.

Chapter 3
1. (a) & (b)
Number of classes m = 1 + 3.3 log(30) = 5.87 ≈ 6
Class width = Range/m = 20/6 = 3.33 ≈ 4

Class Limit   Freq.   Rel. Freq.   Percent
[40-44)       5       0.1667       16.67%
[44-48)       7       0.2333       23.33%
[48-52)       5       0.1667       16.67%
[52-56)       6       0.2000       20.00%
[56-60)       6       0.2000       20.00%
[60-64)       1       0.0333        3.33%

(c) Histogram of Annual Salaries of Six Sigma Green Belt Workers (frequencies plotted against class boundaries 39.5, 43.5, 47.5, 51.5, 55.5, 59.5, 63.5).

2. (a)
Class Limit   Freq.
2             9
3             11
4             12
5             5
6             3

(b) Bar Chart of the Number of Defective Items (frequency versus number of defective items).
(c) Dotplot of the Number of Defective Items.
(d) 50% of the shipments contained at least 4 defective items.


3. (a) & (b)
Number of classes m = 1 + 3.3 log(30) = 5.87 ≈ 6
Class width = Range/m = 30/6 = 5

Class Limit   Freq.   Rel. Freq.   Percent
[40-45)       5       0.1667       16.67%
[45-50)       8       0.2667       26.67%
[50-55)       4       0.1333       13.33%
[55-60)       4       0.1333       13.33%
[60-65)       4       0.1333       13.33%
[65-70)       5       0.1667       16.67%

(c) Histogram of the Number of Computers Assembled (frequencies plotted over classes from 40 to 70).
(d) Frequency Polygon of the Number of Computers Assembled (frequencies plotted at class marks 42.5, 47.5, 52.5, 57.5, 62.5, 67.5).

4. (a) Bar Chart of R&D Budget (budget in millions of dollars for facility locations Chicago, Detroit, Houston, New York, St. Louis).
(b)
Class      Budget   Percentage          No. of Degrees
Chicago    3.5      3.5/27.1 = 12.9%    0.129 × 360 ≈ 46
Detroit    5.4      19.9%               72
Houston    4.2      15.5%               56
New York   8.5      31.4%               113
St. Louis  5.5      20.3%               73

Pie Chart of R&D Budget: Chicago 12.9%, Detroit 19.9%, Houston 15.5%, New York 31.4%, St. Louis 20.3%.

5. (a) & (b)
Class Limit   Freq.   Rel. Freq.   Percent
A             9       0.225        22.5%
B             12      0.300        30.0%
C             11      0.275        27.5%
D             8       0.200        20.0%

6. (a)
Class Limit   Freq.
1             10
2             10
3             13
4             17

(b) 60% of the households own three or more TV sets.
(c) 20% of the households own only one TV set.
7. Stem and Leaf Diagram of the Number of Computers Assembled (key: 4|0 = 40 assembled computers)
4 | 0013355777799
5 | 01226689
6 | 33345679
7 | 0
There were 17 days where 50 or more computers were assembled.

8. Stem and Leaf Diagram of the Diameter of Ball Bearings (key: 3|1 = 31 mm)
3 | 11112233345666778999
4 | 0123344456799
5 | 0001111122233345555666799
6 | 00
65% of the ball bearings have a diameter greater than 40 mm.

9. (a) & (b)
Class Limit   Freq.   Cum. Freq.
[30-35)       10      10
[35-40)       10      20
[40-45)       8       28
[45-50)       5       33
[50-55)       15      48
[55-60)       12      60

(c) Cumulative Frequency Histogram of the Diameter of Ball Bearings (cumulative frequency versus diameter, 30 to 60 mm).
(d) Ogive of the Diameter of Ball Bearings (cumulative frequency versus diameter, class marks 32 to 57 mm).

10. Stem and Leaf Diagram of the Amount of Gasoline Sold (in Gallons) (key: 50|1 = 501 gallons)
50 | 11355
51 | 002233579
52 | 0234579
53 | 2344445699
54 | 01456
55 | 024566
56 | 012356788
57 | 022246679

11. The ordered data are:
501 501 503 505 505 510 510 512 512 513
513 515 517 519 520 522 523 524 525 527
529 532 533 534 534 534 534 535 536 539
539 540 541 544 545 546 550 552 554 555
556 556 560 561 562 563 565 566 567 568
568 570 572 572 572 574 576 576 577 579

12. Stem and Leaf Diagram #1 (key: 3|0 = 30)
3 | 01111111122233344
3 | 556666777788888999
4 | 000000012222344444
4 | 555557788888999
5 | 00000111111122222222333333444444

Stem and Leaf Diagram #2 (key: 3|0 = 30)
3 | 011111111
3 | 222333
3 | 4455
3 | 66667777
3 | 88888999
4 | 00000001
4 | 22223
4 | 4444455555
4 | 77
4 | 88888999
5 | 000001111111
5 | 22222222333333
5 | 444444

Chapter 4

1. (a) mean = 120.02, median = 120.10
(b) standard deviation = 1.84
2. (a) mean = 12.026, variance = 0.289, standard deviation = 0.537
(b) Q1 = 11.753, median = 12.07, Q3 = 12.32, IQR = 12.32 − 11.753 = 0.567
3. (a) mean = 6.769, median = 6.79
(b) variance = 0.0837, standard deviation = 0.2894
(c) The difference between the mean tread depth from our sample (6.769 mm) and the desired tread depth (7 mm) is 0.231 mm. Therefore, the sample mean tread depth is less than one standard deviation away from the desired tread depth, indicating that the quality of the tires appears to be adequate.
4. (a) mean = 49.56, median = 48, mode = 58
(b) The distribution is right skewed (see the Boxplot of the Number of Parts Manufactured).
5. The data set contains 3 outliers: 56, 58, and 59 (see the Boxplot of the Length of Rods in cm).

6. (a) 60th percentile = 73.0
(b) 75th percentile = 75.0
(c) 5 data points fall between the 60th and 75th percentiles.
7. (a) x̄1 = 25.667, s1 = 2.82, x̄2 = 51.194, s2 = 5.966
(b) cv1 = 10.99, cv2 = 11.65
(c) The second set of data has a larger relative variability.
8. mean = x̄ = 68.42, st. dev. = s = 8.18
x̄ ± s = (68.42 − 8.18, 68.42 + 8.18) = (60.24, 76.60): data points = 25, expected number of data points = 24
x̄ ± 2s = (68.42 − 16.36, 68.42 + 16.36) = (52.06, 84.78): data points = 35, expected number of data points = 34
x̄ ± 3s = (68.42 − 24.54, 68.42 + 24.54) = (43.88, 92.96): data points = 35, expected number of data points = 36
The number of data points that fell within one, two, and three standard deviations of the mean is 25, 35, and 35, respectively. Using the empirical rule, the expected number of data points that should fall within one, two, and three standard deviations of the mean is 24, 34, and 36, respectively. This being similar to the actual number of data points that fell within one, two, and three standard deviations, we can say that the shape of the distribution is approximately symmetrical.
9. Number of classes m = 1 + 3.3 log(36) = 6.13 ≈ 6
Class width = Range/m = 20/6 = 3.33 ≈ 4

Class Limit   Class Mark   Freq.
[40-44)       42           11
[44-48)       46           5
[48-52)       50           6
[52-56)       54           2
[56-60)       58           11
[60-64)       62           1

actual mean = 49.56, standard deviation = 7.00
mean of the grouped data = 49.5, standard deviation of the grouped data = 6.90
The mean and standard deviation of the grouped data are slightly lower, and therefore more conservative, than the actual mean and standard deviation of the data.
10. 68% of the salaries will fall between $51,100 and $60,100
95% of the salaries will fall between $46,600 and $64,600
99.7% of the salaries will fall between $42,100 and $69,100
11. The median would be a better measure of central tendency in this case due to the extreme values (140 & 281) in the data set. Unlike the mean, the median does not change if the extreme values change.
12. (a) mean = 43.25, median = 44, standard deviation = 9.66, coefficient of variation = 22.33
(b) Boxplot of the Ages of Six Sigma Green Belt Employees (ages in years, roughly 25 to 60).
(c) 60% of the ages are within one standard deviation of the mean
100% of the ages are within two standard deviations of the mean
100% of the ages are within three standard deviations of the mean
13. (a) mean = 37.65, median = 38, standard deviation = 1.748, coefficient of variation = 4.64
(b) Boxplot of the Gestational Ages of 40 Children Born (ages in weeks, 35 to 40).
(c) 65% of all the ages are within one standard deviation of the mean
100% of all the ages are within two standard deviations of the mean
100% of all the ages are within three standard deviations of the mean
(d) The distribution appears to be skewed left.
14.
Categories   Freq.
35           5
36           8
37           6
38           7
39           5
40           9

mean of the grouped data = 37.65, standard deviation of the grouped data = 1.7475. In this case note that the grouped mean and grouped standard deviation are equal to the actual mean and standard deviation.
15. The box plots of the two data sets allow us to visually see the range of the data sets, the shape of their distributions, the median, and their quartile values, while also allowing us to draw both comparative and individual conclusions. For example, the box plot of data set #1 shows that the data set has a range of about 10, a slightly left-skewed distribution, the median lies at about 26, while Q1 and Q3 are about 24 and 28, respectively. The box plot of data set #2 shows that the data set has a range of about 18, the shape of the distribution is somewhat left skewed, the median lies at about 51, while Q1 and Q3 are about 46 and 57, respectively.

Boxplot of Data Set #1
Boxplot of Data Set #2

Chapter 5
1. (1) A ∪ B
(2) A ∩ B
(3) Aᶜ ∩ Bᶜ
(4) (A ∩ Bᶜ) ∪ (Aᶜ ∩ B)
(5) (A ∩ B)ᶜ

2. (1) S = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT}
(2) S = {1H, 1T, 2H, 2T, 3H, 3T, 4H, 4T, 5H, 5T, 6H, 6T}
(3) S = {(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)}
(4) S = {MMM, FMM, MFM, MMF, FFM, FMF, MFF, FFF}
(5) S = {0, 1, 2, 3, ..., 10}
3. P(sum of 10, given different) = 0.066
4. P(3 or 5, given odd) = 0.667
5. P(2H, given 1H) = 0.428
6. P(soccer, given boy) = 0.7272
7. P(A1|B) = P(A1)P(B|A1) / [P(A1)P(B|A1) + ... + P(A4)P(B|A4)] = (0.2)(0.4)/[(0.2)(0.4) + ... + (0.3)(0.2)] = 0.08/0.39 = 0.2051
P(A2|B) = 0.05
P(A3|B) = 0.6
P(A4|B) = 0.15
8. P(W|m) = P(W)P(m|W) / [P(D)P(m|D) + ... + P(W)P(m|W)] = (0.35)(0.2)/[(0.4)(0.1) + ... + (0.35)(0.2)] = 0.07/0.1475 = 0.4746
9. P(A2|lost) = P(A2)P(lost|A2) / [P(A1)P(lost|A1) + ... + P(A4)P(lost|A4)] = (0.25)(0.3)/[(0.4)(0.15) + ... + (0.1)(0.4)] = 0.075/0.225 = 0.3333

10. (i) P(C2|H) = P(C2)P(H|C2) / [P(C1)P(H|C1) + ... + P(C5)P(H|C5)] = (0.25)(0.75)/[(0.25)(0.9) + ... + (0.25)(0.5)] = 0.1875/0.6875 = 0.2727
(ii) P(C4|T) = P(C4)P(T|C4) / [P(C1)P(T|C1) + ... + P(C5)P(T|C5)] = (0.25)(0.5)/[(0.25)(0.1) + ... + (0.25)(0.5)] = 0.125/0.3125 = 0.4000

11. (i) P(m|favor) = P(m)P(favor|m) / [P(m)P(favor|m) + P(f)P(favor|f)] = (0.55)(0.75)/[(0.55)(0.75) + (0.45)(0.4)] = 0.4125/0.5925 = 0.6962
(ii) P(f|no opinion) = 0.6207
(iii) P(f|not favor) = 0.6716


12. Let the event "a worker does not perform his/her job" be denoted by np; then we have
P(M1|np) = P(M1)P(np|M1) / [P(M1)P(np|M1) + ... + P(M3)P(np|M3)] = (0.5)(0.1)/[(0.5)(0.1) + ... + (0.22)(0.15)] = 0.05/0.097 = 0.5155
P(M2|np) = 0.1443
13. Let 0A = no accident, 1A = one accident, 2A = two or more accidents, A = accident:
P(0A|A) = P(0A)P(A|0A) / [P(0A)P(A|0A) + ... + P(2A)P(A|2A)] = (0.6)(0.01)/[(0.6)(0.01) + ... + (0.15)(0.1)] = 0.006/0.0285 = 0.2105
14. W = White, AA = African American, H = Hispanic, A = Asian, Sci = Science Major
(i) P(W|Sci) = P(W)P(Sci|W) / [P(W)P(Sci|W) + ... + P(A)P(Sci|A)] = (0.4)(0.5)/[(0.4)(0.5) + ... + (0.15)(0.75)] = 0.2/0.4225 = 0.4734
(ii) P(A|Sci) = 0.2623
(iii) P(AA|Sci) = 0.1420
15. (i) P(A|D) = P(A)P(D|A) / [P(A)P(D|A) + ... + P(C)P(D|C)] = (0.45)(0.05)/[(0.45)(0.05) + ... + (0.30)(0.03)] = 0.0225/0.0365 = 0.6164
(ii) P(B|D) = 0.1370
(iii) P(C|D) = 0.2466

16. C1 = fair, C2 = fair, C3 = fair, C4 = two-headed, C5 = two-tailed
P(C4|2H) = P(C4)P(2H|C4) / [P(C1)P(2H|C1) + ... + P(C5)P(2H|C5)] = (0.2)(1.0)/[(0.2)(0.25) + ... + (0.2)(0.0)] = 0.2/0.35 = 0.5714
17. P(A|Dᶜ) = P(A)P(Dᶜ|A) / [P(A)P(Dᶜ|A) + ... + P(C)P(Dᶜ|C)] = (0.4)(0.98)/[(0.4)(0.98) + ... + (0.35)(0.96)] = 0.392/0.968 = 0.4050

Chapter 6
1. Use MINITAB or JMP. n = 20, p = 0.80, q = 0.20
(a) P(X > 12) = 1 − P(X ≤ 12) = 1 − P(12) − P(11) − ... − P(1) − P(0) = 0.9679
(b) P(X < 15) = P(14) + P(13) + ... + P(1) + P(0) = 0.1958
(c) P(X = 14) = 0.1091
(d) Let n = 20, p = 0.2, q = 0.80, since the probability of a car being from Maine is 0.2. P(X = 8) = 0.0222

2. n = 12, p = 0.40, q = 0.60
(a) P(4 ≤ X ≤ 6) = P(X = 4) + P(X = 5) + P(X = 6) = 0.617
(b) P(X > 5) = 1 − P(X ≤ 5) = 1 − P(5) − P(4) − ... − P(1) − P(0) = 0.3348
(c) P(X < 8) = P(7) + P(6) + ... + P(1) + P(0) = 0.9427
(d) P(X = 0) = 0.0022
3. mean = μ = np = 25 × 0.35 = 8.75
variance = σ² = npq = 25(0.35)(0.65) = 5.6875
standard deviation = σ = √(npq) = √5.6875 = 2.3848
4. n = 15, p = 0.70, q = 0.30
(a) P(X > 10) = 1 − P(X ≤ 10) = 1 − P(10) − P(9) − ... − P(1) − P(0) = 0.5155
(b) P(X < 8) = P(7) + P(6) + ... + P(1) + P(0) = 0.0500
(c) P(10 ≤ X ≤ 12) = P(X ≤ 12) − P(X ≤ 9) = 0.8720 − 0.2731 = 0.5989

5. N = 100, n = 10, r = 8
(a) P(X ≥ 1) = 1 − P(X = 0) = 1 − C(8,0)C(92,10)/C(100,10) = 1 − 0.4166 = 0.5834
(b) P(X = 10) = 0.0000
(c) P(X = 9) = 0.0000
(d) P(X = 0) = 0.4166
6. mean = μ = np = 10(8/100) = 0.80
variance = σ² = [(N − n)/(N − 1)]npq = [(100 − 10)/(100 − 1)] × 10 × (8/100)(92/100) = 0.6691
standard deviation = σ = √{[(N − n)/(N − 1)]npq} = √0.6691 = 0.8180

7. N = 12, n = 5, a = 5, b = 7
(a) P(X ≥ 2) = 1 − P(0) − P(1) = 1 − C(5,0)C(7,5)/C(12,5) − C(5,1)C(7,4)/C(12,5) = 1 − 0.0265 − 0.2210 = 0.7525
(b) P(X ≥ 3) = 1 − P(0) − P(1) − P(2) = 1 − C(7,0)C(5,5)/C(12,5) − C(7,1)C(5,4)/C(12,5) − C(7,2)C(5,3)/C(12,5) = 1 − 0.0013 − 0.0442 − 0.2652 = 0.6893
(c) P(X ≤ 2) = P(0) + P(1) + P(2) = 0.0013 + 0.0442 + 0.2652 = 0.3107
(d) P(X = 0) = 0.0265
8. λ = (2000)(3/1000) = 6.0
(a) P(X ≥ 4) = 1 − P(X < 4) = 1 − P(3) − P(2) − P(1) − P(0) = 1 − 0.0892 − 0.0446 − 0.0149 − 0.0025 = 0.8488
(b) P(X ≤ 10) = P(10) + P(9) + ... + P(1) + P(0) = 0.9574
(c) P(5 ≤ X ≤ 8) = P(X ≤ 8) − P(X ≤ 4) = 0.874 − 0.446 = 0.428
(d) P(X < 2) = P(1) + P(0) = 0.0174
(e) P(X > 2) = 1 − P(2) − P(1) − P(0) = 1 − 0.0446 − 0.0149 − 0.0025 = 0.9380

9. λ = (5)(2/1) = 10.0
(a) P(X < 8) = P(7) + P(6) + ... + P(1) + P(0) = 0.2202
(b) P(X ≥ 4) = 1 − P(X < 4) = 1 − P(3) − P(2) − P(1) − P(0) = 0.9897
(c) P(3 ≤ X ≤ 5) = P(X ≤ 5) − P(X ≤ 2) = 0.067 − 0.003 = 0.063
(d) P(X > 1) = 1 − P(X ≤ 1) = 1 − P(1) − P(0) = 0.9995
10. λ = (25)(4/10) = 10.0
(a) P(X ≥ 5) = 1 − P(X < 5) = 1 − P(4) − P(3) − P(2) − P(1) − P(0) = 0.9707
(b) P(X ≤ 2) = P(2) + P(1) + P(0) = 0.0028
(c) P(2 ≤ X ≤ 6) = P(X ≤ 6) − P(X ≤ 1) = 0.130 − 0.000 = 0.130
(d) P(X < 6) = P(5) + P(4) + ... + P(1) + P(0) = 0.067

11. (a) This experiment can be studied using a binomial model because it satisfies all of the necessary conditions: there are a fixed number of trials, trials are independent, each trial has only two possible outcomes, and in each trial the probability of success is the same.
(b) This experiment cannot be studied using a binomial model.
(c) This experiment cannot be studied using a binomial model. By the definition of a binomial experiment, one of the conditions that needs to be satisfied is that each trial has only two possible outcomes, success and failure. This particular experiment observes only the number that appears on the die, which means there are six possible outcomes.
(d) This experiment cannot be studied using a binomial model because it fails to satisfy the condition of having a fixed number of trials, since we are not told from how many companies we are selecting the manufacturing company.
12. n = 20, p = 0.60, q = 0.40
(a) P(X ≥ 5) = 1 − P(X < 5) = 1 − P(4) − P(3) − ... − P(1) − P(0) = 0.9987
(b) P(X ≤ 7) = P(7) + P(6) + ... + P(1) + P(0) = 0.021
(c) P(5 < X < 10) = P(X ≤ 9) − P(X ≤ 5) = 0.128 − 0.057 = 0.071
(d) P(X = 8) = 0.0355
(e) P(X ≤ 9) = P(9) + P(8) + ... + P(1) + P(0) = 0.128

13. λ = (1000)(2/500) = 4.0
(a) P(X = 5) = e^(−λ)λ⁵/5! = e^(−4)4⁵/5! = (0.01832)(1024)/120 = 0.1563
(b) (i) P(X > 5) = 1 − P(X ≤ 5) = 1 − P(5) − P(4) − ... − P(1) − P(0) = 0.2149
(ii) P(X ≤ 6) = P(6) + P(5) + ... + P(1) + P(0) = 0.8893
(iii) P(4 ≤ X ≤ 8) = P(X ≤ 8) − P(X < 4) = 0.97864 − 0.42247 = 0.5562

14. N = 500, n = 20, r = 60 (One can also find this probability very easily by using MINITAB or JMP.)
P(X ≤ 3) = P(0) + P(1) + P(2) + P(3)
= [C(60,0)C(440,20) + C(60,1)C(440,19) + C(60,2)C(440,18) + C(60,3)C(440,17)]/C(500,20)
= 0.07354 + 0.20961 + 0.27840 + 0.22904 = 0.7906
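As the solution notes, software handles these hypergeometric sums easily. A minimal sketch (Python with scipy, assumed for illustration):

    from scipy import stats

    # Hypergeometric: population M = 500, r = 60 successes, sample of N = 20
    rv = stats.hypergeom(M=500, n=60, N=20)
    print(f"P(X <= 3) = {rv.cdf(3):.4f}")   # ~ 0.7906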

15. (a) λ = 5
P(X = 4) = 0.1755
(b) λ = (2)(5) = 10.0
P(X > 7) = 1 − P(X ≤ 7) = 1 − P(7) − P(6) − ... − P(1) − P(0) = 0.7798
(c) λ = (90)(5/60) = 7.5
P(X ≥ 8) = 1 − P(X ≤ 7) = 1 − [P(7) + ... + P(1) + P(0)] = 1 − 0.220 = 0.780
(d) λ = (2)(5) = 10.0
P(5 < X < 10) = P(X ≤ 9) − P(X ≤ 5) = 0.4579 − 0.0671 = 0.3908
16. n = 10, p = 0.60, q = 0.40
P(X = 6) = C(n,x)p^x q^(n−x) = C(10,6)(0.6)⁶(0.4)⁴ = 0.2508
Chapter 7
1. μ = 20, σ = 2
(a) P(17 < X < 23) = P((17 − 20)/2 < (X − 20)/2 < (23 − 20)/2) = P(−1.5 < Z < 1.5) = P(Z < 1.5) − P(Z ≤ −1.5) = 0.9332 − 0.0668 = 0.8664
(b) P(18 < X < 22) = P((18 − 20)/2 < Z < (22 − 20)/2) = P(−1 < Z < 1) = P(Z < 1) − P(Z ≤ −1) = 0.8413 − 0.1587 = 0.6826
(c) P(15 < X < 25) = P((15 − 20)/2 < Z < (25 − 20)/2) = P(−2.5 < Z < 2.5) = P(Z < 2.5) − P(Z ≤ −2.5) = 0.9938 − 0.0062 = 0.9876
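Each of these normal probabilities can be checked with a distribution function rather than a table. A minimal sketch (Python with scipy, assumed for illustration) for part (a):

    from scipy import stats

    # X ~ N(mu = 20, sigma = 2); P(17 < X < 23)
    p = stats.norm.cdf(23, loc=20, scale=2) - stats.norm.cdf(17, loc=20, scale=2)
    print(f"P(17 < X < 23) = {p:.4f}")   # 0.8664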
2. μ = 15, σ = 1.5
(a) P(X > 17) = P((X − 15)/1.5 > (17 − 15)/1.5) = P(Z > 1.33) = 0.5 − 0.4082 = 0.0918
(b) P(X > 14) = P((X − 15)/1.5 > (14 − 15)/1.5) = P(Z > −0.67) = 0.5 + 0.2486 = 0.7486
(c) P(X < 15) = P((X − 15)/1.5 < (15 − 15)/1.5) = P(Z < 0) = 0.5000
3. μ = 33, σ = 3
(a) P(27 < X < 30) = P((27 − 30)/3 < Z < (30 − 30)/3) = P(−1 < Z < 0) = P(Z < 0) − P(Z ≤ −1) = 0.5000 − 0.1587 = 0.3413
(b) P(27 < X < 35) = P((27 − 30)/3 < Z < (35 − 30)/3) = P(−1 < Z < 1.67) = P(Z < 1.67) − P(Z ≤ −1) = 0.9525 − 0.1587 = 0.7938
(c) P(32 < X < 39) = P((32 − 33)/3 < Z < (39 − 33)/3) = P(−0.33 < Z < 2) = P(Z < 2) − P(Z ≤ −0.33) = 0.9772 − 0.3707 = 0.6065
4. μ = 0, σ = 1
(a) P(−1 ≤ Z ≤ 1) = P(Z ≤ 1) − P(Z ≤ −1) = 0.8413 − 0.1587 = 0.6826
(b) P(−2 ≤ Z ≤ 2) = P(Z ≤ 2) − P(Z < −2) = 0.9772 − 0.0228 = 0.9544
(c) P(−3 ≤ Z ≤ 3) = P(Z ≤ 3) − P(Z < −3) = 0.9987 − 0.0013 = 0.9974
5. From the empirical rule we know that approximately 68% of the data values will fall within one standard deviation of the mean, 95% will fall within two standard deviations of the mean, and 99.7% will fall within three standard deviations of the mean. It is clear to see that the results obtained in problem 4 are similar to the results obtained using the empirical rule.
6. (a) P(Z ≤ 2.11) = 0.9826
(b) P(Z ≥ −1.2) = 1 − P(Z < −1.2) = 0.8849
(c) P(−1.58 ≤ Z ≤ 2.40) = P(Z ≤ 2.40) − P(Z < −1.58) = 0.9918 − 0.0559 = 0.9359
(d) P(Z ≥ 1.96) = 1 − P(Z < 1.96) = 1 − 0.9750 = 0.0250
(e) P(Z ≤ −1.96) = 0.0250
7. λ = 1.5
(a) P(X > 2) = e^(−λx) = e^(−(1.5)(2)) = e^(−3) = 0.0498
(b) P(X < 4) = 1 − e^(−4.5) = 0.9889
(c) P(2 < X < 4) = P(X < 4) − P(X ≤ 2) = 0.9889 − 0.9502 = 0.0387
(d) P(X < 0) = 0.0000
8. Mean = 12, which means λ = 1/12
(a) 0.3679
(b) 0.1353
(c) 0.2326
(d) 1.0000

9. mean = μ = (1/λ)Γ(1 + 1/α) = (1/0.01)Γ(1 + 1/0.5) = (100)(2) = 200
variance = σ² = (1/λ²)[Γ(1 + 2/α) − Γ²(1 + 1/α)] = (1/0.01²)[Γ(1 + 2/0.5) − Γ²(1 + 1/0.5)] = (10000)(20) = 200000
10. (a) P(X > 4500) = e^(−(λx)^α) = e^(−(0.01 × 4500)^(1/2)) = 0.0012
(b) P(X > 7000) = e^(−(λx)^α) = e^(−(0.01 × 7000)^(1/2)) = 0.00023
11. mean = μ = (1/λ)Γ(1 + 1/α) = (1/0.001)Γ(1 + 1/2) = (1000)(0.886226) = 886.226
variance = σ² = (1/λ²)[Γ(1 + 2/α) − Γ²(1 + 1/α)] = (1/0.001²)[Γ(1 + 2/2) − Γ²(1 + 1/2)] = (1000000)(0.2146034769) = 214603.4769
12. (a) P(X < 800) = P(X ≤ 799) = 1 − e^(−(λx)^α) = 1 − e^(−(0.001 × 799)²) = 0.4719
(b) P(X > 1000) = e^(−(λx)^α) = e^(−(0.001 × 1000)²) = 0.3679
(c) P(1000 < X < 1500) = P(X ≤ 1500) − P(X ≤ 1000) = [1 − e^(−(0.001 × 1500)²)] − [1 − e^(−(0.001 × 1000)²)] = 0.8943 − 0.6321 = 0.2622

13. λ = 0.2
(a) P(X ≥ 7) = e^(−λx) = e^(−(0.2)(7)) = e^(−1.4) = 0.2466
(b) P(7 < X < 10) = P(X < 10) − P(X ≤ 7) = 0.8647 − 0.7534 = 0.1113
(c) P(X ≥ 8 | X > 5) = P(X ≥ 8 and X > 5)/P(X > 5) = P(X ≥ 8)/P(X > 5) = 0.2019/0.36788 = 0.5488
(d) P(X < 7) = 1 − P(X ≥ 7) = 1 − 0.2466 = 0.7534

14. P(X ≥ 5 + 9 | X ≥ 5) = P(X ≥ 9) = 0.4065 (by the memoryless property of the exponential distribution)
15. λ = 0.00125
(a) P(X = 700) = 0.00, since the probability for a continuous random variable to be equal to any number is always zero.
(b) P(X > 850) = 0.3456
(c) P(600 < X < 900) = P(X ≤ 900) − P(X ≤ 600) = 0.6754 − 0.5274 = 0.1480
(d) P(X ≥ 650) = 0.4437
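These exponential tail probabilities follow directly from the survival function S(x) = e^(−λx). A minimal sketch (Python with scipy, assumed for illustration) verifying problem 15(b):

    from scipy import stats

    # Exponential with rate lambda = 0.00125 (scipy uses scale = 1/lambda)
    p = stats.expon.sf(850, scale=1 / 0.00125)
    print(f"P(X > 850) = {p:.4f}")   # ~ 0.3456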
Chapter 8
1. Because the sample size is large, the sampling distribution of x̄ is approximately normal with mean μx̄ = 28 and standard deviation σx̄ = 9/√36 = 1.5.
2. mean = μx̄ = μ = 18
variance = σ²x̄ = (σ²/n)((N − n)/(N − 1)) = (25/5)((50 − 5)/(50 − 1)) = 4.59. Note that the population is finite and the sample size is > 5% of the population, so the finite population correction factor is used.
standard deviation = 2.1424
3. (a) the standard error will decrease from σ/√6 to σ/√8
(b) the standard error will decrease from σ/√10 to σ/√20
(c) the standard error will decrease from σ/√9 to σ/√18
(d) the standard error will decrease from σ/√16 to σ/√24
4. μx̄ = 3000, σx̄ = 100/√16 = 25
P(2960 < X̄ < 3040) = P((2960 − 3000)/25 < (X̄ − 3000)/25 < (3040 − 3000)/25) = P(−1.6 < Z < 1.6) = 0.9452 − 0.0548 = 0.8904
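The same sampling-distribution calculation can be scripted directly. A minimal sketch (Python with scipy, assumed for illustration):

    from scipy import stats

    mu, sigma, n = 3000, 100, 16
    se = sigma / n ** 0.5                        # standard error = 25
    p = stats.norm.cdf(3040, mu, se) - stats.norm.cdf(2960, mu, se)
    print(f"P(2960 < X-bar < 3040) = {p:.4f}")   # 0.8904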

5. μx̄ = 140, σx̄ = 35/√49 = 5
(i) P(X̄ > 145) = P((X̄ − 140)/5 > (145 − 140)/5) = P(Z > 1) = 1 − P(Z ≤ 1) = 1 − 0.8413 = 0.1587
(ii) P(X̄ < 140) = P(Z < 0) = 0.5000
(iii) P(132 < X̄ < 148) = P(−1.6 < Z < 1.6) = 0.9452 − 0.0548 = 0.8904
6. μx̄ = 120, σx̄ = 10/√36 = 1.6667
(i) P(X̄ > 122) = P((X̄ − 120)/1.6667 > (122 − 120)/1.6667) = P(Z > 1.20) = 1 − P(Z ≤ 1.20) = 1 − 0.8849 = 0.1151
(ii) P(X̄ < 115) = P(Z < −3.00) = 0.0013
(iii) P(116 < X̄ < 123) = P(−2.40 < Z < 1.80) = 0.9641 − 0.0082 = 0.9559
7. μx̄ = 70, σx̄ = 4/√36 = 0.6667
(i) P(X̄ > 75) = P((X̄ − 70)/0.6667 > (75 − 70)/0.6667) = P(Z > 7.50) = 0.0000
(ii) P(X̄ < 70) = P(Z < 0.00) = 0.5000
(iii) P(70 < X̄ < 80) = P(0.00 < Z < 15.00) = P(Z ≤ 15.00) − P(Z ≤ 0.00) = 0.5000

8. (i) Since np > 5 and nq > 5, the sample proportion is approximately normal with mean = μp̂ = p = 0.4 and standard deviation = σp̂ = √(pq/n) = √((0.4)(0.6)/40) = 0.0775.
(ii) Since np > 5 and nq > 5, the sample proportion is approximately normal with mean = μp̂ = p = 0.2 and standard deviation = σp̂ = √((0.2)(0.8)/50) = 0.0566.
(iii) Since np > 5 and nq > 5, the sample proportion is approximately normal with mean = μp̂ = p = 0.1 and standard deviation = σp̂ = √((0.1)(0.9)/80) = 0.0335.
9. (i) In this problem we are given n = 100 and p = 0.5. Thus, we have np > 5 and nq > 5; therefore, the sample proportion p̂ is approximately normal with mean = μp̂ = p = 0.5 and standard deviation σp̂ = √(pq/n) = √((0.5)(0.5)/100) = 0.05.
(ii) P(p̂ > 0.60) = P((p̂ − 0.5)/0.05 > (0.6 − 0.5)/0.05) = P(Z > 2.00) = 0.0228
10. (i) In this problem we are given n = 500 and p = 0.8. Thus, we have np > 5 and nq > 5; therefore, the sample proportion is approximately normal with mean μp̂ = p = 0.8 and standard deviation σp̂ = √(pq/n) = √((0.8)(0.2)/500) = 0.0179.
(ii) P(p̂ ≥ 0.75) = P((p̂ − 0.8)/0.0179 ≥ (0.75 − 0.8)/0.0179) = P(Z ≥ −2.80) = 0.9974
11. (i) In this problem we have n = 100 and p = 0.5. Thus, we have np > 5 and nq > 5; therefore, the sample proportion p̂ is approximately normal with mean = μp̂ = p = 0.5 and standard deviation σp̂ = √(pq/n) = √((0.5)(0.5)/100) = 0.05.
(ii) P(p̂ > 0.60) = P((p̂ − 0.5)/0.05 > (0.6 − 0.5)/0.05) = P(Z > 2.00) = 0.0228
12. (i) P(χ²15 ≥ 24.9958) = 0.05
(ii) P(χ²15 ≥ 6.2621) = 0.975
(iii) P(χ²15 ≤ 6.2621) = 0.025
(iv) P(χ²15 ≥ 7.2609) = 0.95
(v) P(χ²15 ≤ 7.2609) = 0.05
13. (i.) t18,0.025 = 2.101


(ii.) t20,0.05 = 1.725
(iii.) t15,0.01 = 2.602
25

(iv.) t10,0.10 = 1.372


(v.) t12,0.005 = 3.055
14. (i) F6,8,0.05 = 3.58
(ii) F8,10,0.01 = 5.06
(iii) F6,10,0.05 = 3.22
(iv) F10,11,0.025 = 3.53
15. (i) F10,12,0.95 =

1
F12,10,0.05

(ii) F8,10,0.975 =
(iii) F15,20,0.95 =
(iv) F20,15,0.99 =

= 0.3436

1
F10,8,0.025
1
F20,15,0.05
1
F15,20,0.01

= 0.2326
= 0.4292
= 0.3236

Chapter 9
1. margin of error E = z0.025 σ/√n = 1.96(1.5/√36) = 0.49
2. (a) Since E(X̄) = μ, the sample mean X̄ is always an unbiased estimator of the population mean μ. Therefore, the point estimate of the population mean wage is 25.
(b) the standard error of the point estimate is σ/√n = 4/√49 = 0.5714
(c) margin of error E = z0.025 σ/√n = 1.96(4/√49) = 1.12
3. x̄ = 12, s = 0.6, zα/2 = z0.005 = 2.58, n = 64
99% confidence interval for the population mean:
(x̄ − zα/2 s/√n, x̄ + zα/2 s/√n) = (12 − 2.58(0.6/√64), 12 + 2.58(0.6/√64)) = (11.8065 ≤ μ ≤ 12.1935)
4. x̄1 = 203, x̄2 = 240, σ1 = 6, σ2 = 8.5, n1 = 36, n2 = 49
90% confidence interval for the difference of two population means:
(x̄1 − x̄2) ± zα/2 √(σ1²/n1 + σ2²/n2) = (203 − 240) ± 1.645 √(6²/36 + 8.5²/49) = −37 ± 2.5877 = (−39.5877 ≤ μ1 − μ2 ≤ −34.4123)
99% confidence interval for the difference of two population means:
(x̄1 − x̄2) ± zα/2 √(σ1²/n1 + σ2²/n2) = (203 − 240) ± 2.58 √(6²/36 + 8.5²/49) = −37 ± 4.0585 = (−41.0585 ≤ μ1 − μ2 ≤ −32.9415)
5. x̄1 = 295,000, x̄2 = 305,000, s1 = 10,600, s2 = 12,800, n1 = 100, n2 = 121
98% confidence interval for the difference of two population means:
(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) = (295,000 − 305,000) ± 2.33 √(10,600²/100 + 12,800²/121) = −10,000 ± 3667.5485 = (−13,667.5485 ≤ μ1 − μ2 ≤ −6,332.4515)
6. x̄1 = 10.17, x̄2 = 12.34, s1 = 1.209, s2 = 0.848, n1 = 16, n2 = 25
s²p = [(n1 − 1)s1² + (n2 − 1)s2²]/(n1 + n2 − 2) = [(16 − 1)(1.209)² + (25 − 1)(0.848)²]/(16 + 25 − 2) = 1.0047
95% confidence interval for the difference between two population means:
(x̄1 − x̄2) ± t(n1+n2−2, α/2) Sp √(1/n1 + 1/n2) = (10.17 − 12.34) ± 2.021(1.0023)√(1/16 + 1/25) = −2.17 ± 0.6485 = (−2.8185 ≤ μ1 − μ2 ≤ −1.5215)
7. 95% confidence interval for the difference between two population means:
(x̄1 − x̄2) ± zα/2 √(s1²/n1 + s2²/n2) = (79 − 86) ± 1.96 √(30/49 + 40/36) = −7 ± 2.573 = (−9.573 ≤ μ1 − μ2 ≤ −4.427)
98% lower confidence interval for the difference between two population means:
(x̄1 − x̄2) − zα √(s1²/n1 + s2²/n2) = 79 − 86 − 2.06 √(30/49 + 40/36) = −7 − 2.704 = −9.704, giving (−9.704 ≤ μ1 − μ2 ≤ ∞)
98% upper confidence interval for the difference between two population means:
(x̄1 − x̄2) + zα √(s1²/n1 + s2²/n2) = 79 − 86 + 2.06 √(30/49 + 40/36) = −7 + 2.704 = −4.296, giving (−∞ ≤ μ1 − μ2 ≤ −4.296)

8. 95% confidence interval for the difference between two population means:
(x̄1 − x̄2) ± t(m, α/2) √(s1²/n1 + s2²/n2), where
m = (s1²/n1 + s2²/n2)² / [(s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1)] = (1.21²/16 + 0.85²/25)² / [(1.21²/16)²/(16 − 1) + (0.85²/25)²/(25 − 1)] = 24.44 ≈ 24
t(m, α/2) = t24,0.025 = 2.069
Then,
(10.17 − 12.34) ± 2.069 √(1.21²/16 + 0.85²/25) = −2.17 ± 0.7179 = (−2.8879 ≤ μ1 − μ2 ≤ −1.4521)

9. A point estimate of p is p̂ = 18/900 = 0.02.
95% confidence interval for the population proportion:
p̂ ± zα/2 √(p̂q̂/n) = 0.02 ± 1.96 √((0.02)(0.98)/900) = 0.02 ± 0.009 = (0.011 ≤ p ≤ 0.029)
95% lower confidence limit for the population proportion:
p̂ − zα √(p̂q̂/n) = 0.02 − 1.645 √((0.02)(0.98)/900) = 0.02 − 0.00767 = (0.01233 ≤ p ≤ 1)

10. p̂ = 22/400 = 0.055, q̂ = 0.945, n = 400
95% confidence interval for the population proportion:
p̂ ± zα/2 √(p̂q̂/n) = 0.055 ± 1.96 √((0.055)(0.945)/400) = 0.055 ± 0.0223 = (0.0327 ≤ p ≤ 0.0773)
95% lower confidence interval for the population proportion:
p̂ − zα √(p̂q̂/n) = 0.055 − 1.645 √((0.055)(0.945)/400) = 0.055 − 0.0187 = (0.0363 ≤ p ≤ 1)
95% upper confidence interval for the population proportion:
p̂ + zα √(p̂q̂/n) = 0.055 + 1.645 √((0.055)(0.945)/400) = 0.055 + 0.0187 = (0 ≤ p ≤ 0.0737)
11. p̂1 = 40/800 = 0.05, q̂1 = 0.95, p̂2 = 50/600 = 0.083, q̂2 = 0.917

95% confidence interval for the difference between two population proportions:
(p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) = (0.05 − 0.083) ± 1.96 √((0.05)(0.95)/800 + (0.083)(0.917)/600) = −0.033 ± 0.0267 = (−0.0597 ≤ p1 − p2 ≤ −0.0063)
12. p̂1 = 72/120 = 0.6, q̂1 = 0.40, p̂2 = 110/150 = 0.73, q̂2 = 0.27

95% confidence interval for the difference between two population proportions:
(p̂1 − p̂2) ± zα/2 √(p̂1q̂1/n1 + p̂2q̂2/n2) = (0.6 − 0.73) ± 1.96 √((0.6)(0.4)/120 + (0.73)(0.27)/150) = −0.13 ± 0.1128 = (−0.2428 ≤ p1 − p2 ≤ −0.0205)
13. n = z²α/2 p̂q̂/E² = (1.96²)(0.02)(0.98)/(0.025)² = 120.47 ≈ 121
14. n = z²α/2 p̂q̂/E² = (1.96²)(0.02)(0.98)/(0.05)² = 30.12 ≈ 31
When we increase the margin of error from 0.025 to 0.05, we see that the sample size decreases by 90, from 121 to 31.
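The sample-size formula n = z²α/2 p̂q̂/E² used in problems 13 and 14 is convenient to wrap in a small helper. A minimal sketch (Python with scipy, assumed for illustration; the function name is ours, not the book's):

    import math
    from scipy import stats

    def sample_size_proportion(p_hat: float, margin: float, conf: float = 0.95) -> int:
        """n = z^2 * p*q / E^2, rounded up to the next whole unit."""
        z = stats.norm.ppf(1 - (1 - conf) / 2)
        return math.ceil(z ** 2 * p_hat * (1 - p_hat) / margin ** 2)

    print(sample_size_proportion(0.02, 0.025))  # 121
    print(sample_size_proportion(0.02, 0.05))   # 31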

15. E = width/2 = 20/2 = 10, zα/2 = z0.02/2 = 2.33
n = z²α/2 σ²/E² = (2.33²)(30²)/10² = 48.86 ≈ 49
16. When we decrease the confidence coefficient from 98% to 95%, our new sample size is
n = z²α/2 σ²/E² = (1.96²)(30²)/10² = 34.57 ≈ 35
When we increase the standard deviation from 30 to 50, our new sample size is
n = z²α/2 σ²/E² = (1.96²)(50²)/10² = 96.04 ≈ 97
17. (a) n = z²α/2 [p̂1q̂1 + p̂2q̂2]/E² = (1.96²)(0.5(0.5) + 0.5(0.5))/(0.04)² = 1200.5 ≈ 1201
(b) n = z²α/2 [p̂1q̂1 + p̂2q̂2]/E² = (1.96²)(0.5(0.5) + 0.5(0.5))/(0.035)² = 1568
(c) n = z²α/2 [p̂1q̂1 + p̂2q̂2]/E² = (1.96²)(0.5(0.5) + 0.5(0.5))/(0.03)² = 2134.22 ≈ 2135
18. (i) n = z²α σ²/E² = (1.645²)(40²)/20² = 10.82 ≈ 11
(ii) n = z²α σ²/E² = (1.645²)(40²)/35² = 3.53 ≈ 4
(iii) n = z²α σ²/E² = (1.645²)(40²)/45² = 2.14 ≈ 3
(iv) n = z²α σ²/E² = (1.645²)(40²)/65² = 1.02 ≈ 2
19. n = z²α/2 (s1² + s2²)/E² = (2.58²)(30 + 40)/2² = 116.49 ≈ 117
20. F(n2−1, n1−1, α) = F14,9,0.05 = 3.03
95% upper confidence interval for the ratio of two variances:
(0, (s1²/s2²)F(n2−1, n1−1, α)) = (0, (30²/38²)(3.03)) = (0, 1.8885)

21. 95% confidence interval for the population variance:
((n − 1)s²/χ²(n−1, α/2) ≤ σ² ≤ (n − 1)s²/χ²(n−1, 1−α/2)) = ((25 − 1)(5)²/39.3641 ≤ σ² ≤ (25 − 1)(5)²/12.4011) = (15.2423 ≤ σ² ≤ 48.3828)
95% confidence interval for the population standard deviation:
(√((n − 1)s²/χ²(n−1, α/2)) ≤ σ ≤ √((n − 1)s²/χ²(n−1, 1−α/2))) = (√15.2423 ≤ σ ≤ √48.3828) = (3.9041 ≤ σ ≤ 6.9557)
95% lower confidence limit for the population standard deviation:
√((n − 1)s²/χ²(n−1, α)) = √((25 − 1)(5)²/36.4151) = 4.0591
95% upper confidence limit for the population standard deviation:
√((n − 1)s²/χ²(n−1, 1−α)) = √((25 − 1)(5)²/13.8484) = 6.5823
The lower one-sided confidence limit, 4.0591, is greater than the lower two-sided confidence limit, 3.9041. Also, the upper one-sided confidence limit, 6.5823, is less than the upper two-sided confidence limit, 6.9557. The difference in these confidence limits has to do with the way that alpha was used in each of the equations. For the two-sided confidence interval we used α/2 on each side, and for the one-sided confidence limits we used α.
22. x̄1 = 27.5, s1 = 1.96, n1 = 10
95% confidence interval for the population variance:
((n1 − 1)s1²/χ²(n1−1, α/2) ≤ σ1² ≤ (n1 − 1)s1²/χ²(n1−1, 1−α/2)) = ((10 − 1)(1.96)²/19.0228 ≤ σ1² ≤ (10 − 1)(1.96)²/2.7004) = (1.8175 ≤ σ1² ≤ 12.8034)
99% lower confidence interval for the population variance:
((n1 − 1)s1²/χ²(n1−1, α) ≤ σ1² ≤ ∞) = ((10 − 1)(1.96)²/21.666 ≤ σ1² ≤ ∞) = (1.5958 ≤ σ1² ≤ ∞)
99% upper confidence interval for the population variance:
(0 ≤ σ1² ≤ (n1 − 1)s1²/χ²(n1−1, 1−α)) = (0 ≤ σ1² ≤ (10 − 1)(1.96)²/2.0879) = (0 ≤ σ1² ≤ 16.5594)

23. x̄2 = 31.2, s2 = 2.21, n2 = 15
95% confidence interval for the population variance:
((n2 − 1)s2²/χ²(n2−1, α/2) ≤ σ2² ≤ (n2 − 1)s2²/χ²(n2−1, 1−α/2)) = ((15 − 1)(2.21)²/26.119 ≤ σ2² ≤ (15 − 1)(2.21)²/5.6287) = (2.618 ≤ σ2² ≤ 12.148)
99% lower confidence interval for the population variance:
((n2 − 1)s2²/χ²(n2−1, α) ≤ σ2² ≤ ∞) = ((15 − 1)(2.21)²/29.1413 ≤ σ2² ≤ ∞) = (2.3464 ≤ σ2² ≤ ∞)
99% upper confidence interval for the population variance:
(0 ≤ σ2² ≤ (n2 − 1)s2²/χ²(n2−1, 1−α)) = (0 ≤ σ2² ≤ (15 − 1)(2.21)²/4.6604) = (0 ≤ σ2² ≤ 14.672)

24. F(n2−1, n1−1, α/2) = F14,9,0.025 = 3.80; F(n2−1, n1−1, 1−α/2) = 1/F(n1−1, n2−1, α/2) = 1/F9,14,0.025 = 0.3115
95% confidence interval for the ratio of two variances:
((s1²/s2²)F(n2−1, n1−1, 1−α/2) ≤ σ1²/σ2² ≤ (s1²/s2²)F(n2−1, n1−1, α/2)) = ((1.96²/2.21²)(0.3115) ≤ σ1²/σ2² ≤ (1.96²/2.21²)(3.80)) = (0.2450 ≤ σ1²/σ2² ≤ 2.988)
25. 95% upper confidence interval for the ratio of two variances:
F(n2−1, n1−1, α) = F14,9,0.05 = 3.03
(0, (s1²/s2²)F(n2−1, n1−1, α)) = (0, (1.96²/2.21²)(3.03)) = (0, 2.3832)
95% lower confidence interval for the ratio of two variances:
F(n2−1, n1−1, 1−α) = 1/F(n1−1, n2−1, α) = 1/F9,14,0.05 = 0.3773
((s1²/s2²)F(n2−1, n1−1, 1−α), ∞) = ((1.96²/2.21²)(0.3773), ∞) = (0.2967, ∞)
Chapter 10

1. (a) upper-tail test
(b) lower-tail test
(c) two-tail test
2. Rejection Region: RR = {Z ≥ zα} = {Z ≥ z0.01} = {Z ≥ 2.33}
Test Statistic:
Z = (X̄ − μ)/(s/√n) = (7.5 − 7.0)/(1.4/√49) = 2.5
Since 2.5 > 2.33, we reject the null hypothesis H0.
} {

} {

3. (a.) Rejection Region: RR = Z  z = Z  z0.05 = Z  1.645


Test Statistic:
X  450  500
Z=
=
= 6.0
s n
50 36
Since 6.0 < 1.645 , we reject the null hypothesis H 0

(b.) Rejection Region: RR = Z  z

} = { Z > z } = { Z > 1.96}


0.025

Test Statistic:
X  500  450
Z=
=
= 6.0
s n
50 36
Since 6.0 > 1.96 , we reject the null hypothesis H 0 .

4. (a) p-value = P(Z ≤ z) = P(Z ≤ −6.0) = 0.0000. Since the p-value = 0.0000 < 0.05 = α, we reject the null hypothesis H0.
(b) p-value = 2P(Z ≥ |z|) = 2P(Z ≥ 6.0) = 2(0.0000) = 0.0000. Note that we multiply by 2 because the test is a two-tail test. Since the p-value = 0.0000 < 0.05 = α, we reject the null hypothesis H0.

5. (a) Rejection Region: RR = {Z ≤ −zα} = {Z ≤ −z0.01} = {Z ≤ −2.33}
Test Statistic:
Z = (X̄ − μ)/(s/√n) = (58.7 − 60)/(2.5/√100) = −5.2
Since −5.2 < −2.33, we reject the null hypothesis H0.
p-value = P(Z < z) = P(Z < −5.2) = 0.0000
Since the p-value = 0.0000 < 0.01 = α, we reject the null hypothesis H0.
(b) Rejection Region: RR = {|Z| ≥ zα/2} = {|Z| ≥ z0.005} = {|Z| ≥ 2.58}
Test Statistic:
Z = (X̄ − μ)/(s/√n) = (58.7 − 60)/(2.5/√100) = −5.2
Since |−5.2| > 2.58, we reject the null hypothesis H0.
p-value = 2P(Z ≤ z) = 2P(Z ≤ −5.2) = 2(0.0000) = 0.0000. Note that we multiply by 2 because the test is a two-tail test. Since the p-value = 0.0000 < 0.01 = α, we reject the null hypothesis H0.

6. Rejection Region: RR = {|Z| ≥ zα/2} = {|Z| ≥ z0.025} = {|Z| ≥ 1.96}
Test Statistic:
Z = (X̄ − μ)/(s/√n) = (18.2 − 18)/(1.2/√64) = 1.33
Since 1.33 < 1.96, we do not reject the null hypothesis H0.
p-value = 2P(Z ≥ z) = 2P(Z ≥ 1.33) = 2(0.0918) = 0.1836
Since the p-value = 0.1836 > 0.05 = α, we do not reject the null hypothesis H0.
Power of the test:
At μ = 18.5: 1 − β = 1 − P((μ0 − μ)/(s/√n) − zα/2 < Z < (μ0 − μ)/(s/√n) + zα/2)
= 1 − P((18 − 18.5)/(1.2/√64) − 1.96 < Z < (18 − 18.5)/(1.2/√64) + 1.96)
= 1 − P(−5.29 < Z < −1.37) = 0.9147
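The power computation in problem 6 can be scripted directly. A minimal sketch (Python with scipy, assumed for illustration):

    from scipy import stats

    mu0, mu1, s, n, alpha = 18, 18.5, 1.2, 64, 0.05
    se = s / n ** 0.5
    z_half = stats.norm.ppf(1 - alpha / 2)

    # Probability of NOT rejecting H0 when the true mean is mu1 (Type II error)
    lo = (mu0 - mu1) / se - z_half
    hi = (mu0 - mu1) / se + z_half
    beta = stats.norm.cdf(hi) - stats.norm.cdf(lo)
    print(f"power = {1 - beta:.4f}")   # ~ 0.9147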

7. (a) Rejection Region: RR = {|Z| ≥ zα/2} = {|Z| ≥ z0.025} = {|Z| ≥ 1.96}
Test Statistic:
Z = [(X̄1 − X̄2) − (μ1 − μ2)]/√(s1²/n1 + s2²/n2) = [(73.54 − 74.29) − 0]/√(0.2²/50 + 0.15²/50) = −21.2132
Since |−21.2132| > 1.96, we reject the null hypothesis H0.
(b) p-value = 2P(Z ≤ z) = 2P(Z ≤ −21.2132) = 2(0.0000) = 0.0000
(c) Since the p-value = 0.000 < 0.05 = α, we reject the null hypothesis H0.

8. (a) Rejection Region: RR = {Z ≤ −zα} = {Z ≤ −z0.05} = {Z ≤ −1.645}
Test Statistic:
Z = [(X̄1 − X̄2) − (μ1 − μ2)]/√(s1²/n1 + s2²/n2) = [(68.8 − 81.5) − 0]/√(5.1²/49 + 7.4²/49) = −9.8918
Since −9.8918 < −1.645, we reject the null hypothesis H0.
(b) Type II Error: β = P(Z > [(μ1 − μ2)0 − (μ1 − μ2)]/√(s1²/n1 + s2²/n2) − zα) = P(Z > (0 − 5)/√(5.1²/49 + 7.4²/49) − 1.645) = P(Z > −5.5394) = 1.0000
Power of the Test:
At μ1 − μ2 = 5: 1 − β = 1 − P(Z > −5.5394) = 0.0000
9. (a) H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ ≠ 0

Rejection Region: RR = {|Z| ≥ z_{α/2}} = {|Z| ≥ z_{0.025}} = {|Z| ≥ 1.96}

Test Statistic:
Z = ((X̄₁ − X̄₂) − (μ₁ − μ₂)) / √(s₁²/n₁ + s₂²/n₂) = ((68,750 − 74,350) − 0) / √(4,930²/55 + 5,400²/60) = −5.8135

Since |−5.8135| = 5.8135 > 1.96, we reject the null hypothesis H₀. In other words, we conclude at the 0.05
significance level that the two loan officers do not issue loans of equal average value.

(b) H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ < 0

Rejection Region: RR = {Z ≤ −z_α} = {Z ≤ −z_{0.01}} = {Z ≤ −2.33}

Test Statistic: the same statistic as in part (a), Z = −5.8135.

Since −5.8135 < −2.33, we reject the null hypothesis H₀. In other words, we conclude at the 0.01
significance level that, on average, the loans issued by officer one are smaller than the loans
issued by officer two.
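
A sketch of the two-sample z computation in problem 9, under the same Python/SciPy assumption (variable names are illustrative):

    from math import sqrt
    from scipy.stats import norm

    # Two-sample z-test, problem 9(a): H0: mu1 - mu2 = 0 vs H1: mu1 - mu2 != 0
    x1, s1, n1 = 68_750, 4_930, 55
    x2, s2, n2 = 74_350, 5_400, 60

    se = sqrt(s1**2 / n1 + s2**2 / n2)
    z = (x1 - x2) / se
    p_two_sided = 2 * norm.sf(abs(z))

    print(f"z = {z:.4f}, two-sided p-value = {p_two_sided:.2e}")
    # z = -5.8135 -> reject H0 at alpha = 0.05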

10. (a) Rejection Region: RR = {Z ≥ z_α} = {Z ≥ z_{0.01}} = {Z ≥ 2.33}

Test Statistic:
Z = (X̄ − μ₀)/(σ/√n) = (15.5 − 15)/(1.4/√64) = 2.86

Since 2.86 > 2.33, we reject the null hypothesis H₀.

(b) Type II Error:
β = P( Z < (μ₀ − μ₁)/(σ/√n) + z_α ) = P( Z < (15 − 16)/(1.4/√64) + 2.33 ) = P(Z < −3.38) = 0.0004

Power of the Test:
At μ = 16: 1 − β = 1 − P(Z < −3.38) = 0.9996

11. x̄ = 25.4, s = 3.57, n = 10

(a) Rejection Region: RR = {|T| ≥ t_{n−1, α/2}} = {|T| ≥ t_{9,0.005}} = {|T| ≥ 3.25}

Test Statistic:
T = (x̄ − μ₀)/(s/√n) = (25.4 − 15)/(3.57/√10) = 9.21

Since 9.21 > 3.25, we reject the null hypothesis H₀.

(b) p-value = 2P(T ≥ |t|) = 2P(T ≥ 9.21) < 2P(T ≥ 4.781) = 0.001, so the
p-value < 0.001.
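
Software turns the table-bounded p-value of problem 11 into an exact one; a minimal sketch assuming Python with SciPy, working from the summary statistics:

    from math import sqrt
    from scipy.stats import t

    # One-sample t-test from summary statistics, problem 11
    xbar, mu0, s, n = 25.4, 15.0, 3.57, 10

    T = (xbar - mu0) / (s / sqrt(n))
    p_two_sided = 2 * t.sf(abs(T), df=n - 1)

    print(f"T = {T:.2f}, p-value = {p_two_sided:.2e}")   # T = 9.21, p << 0.001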

12. Assume that the sample comes from a normal population.

Rejection Region: RR = {T ≤ −t_{n−1, α}} = {T ≤ −t_{15,0.05}} = {T ≤ −1.753}

Test Statistic:
T = (x̄ − μ₀)/(s/√n) = (4,858 − 5,000)/(575/√16) = −0.99

Since −0.99 > −1.753, we do not reject the null hypothesis H₀.

p-value = P(T ≤ t) = P(T ≤ −0.99) > P(T ≤ −1.341) = 0.100, so the
p-value > 0.100.

13. Rejection Region: RR = {|T| ≥ t_{m, α/2}} = {|T| ≥ t_{24,0.005}} = {|T| ≥ 2.797}

Pooled variance:
S_p² = ((n₁ − 1)s₁² + (n₂ − 1)s₂²)/(n₁ + n₂ − 2) = ((12 − 1)(9)² + (14 − 1)(10)²)/(12 + 14 − 2) = 91.2917, so S_p = 9.5547

Test Statistic:
T = ((X̄₁ − X̄₂) − (μ₁ − μ₂)) / (S_p √(1/n₁ + 1/n₂)) = ((110 − 115) − 0) / (9.5547 √(1/12 + 1/14)) = −1.33

Since |−1.33| = 1.33 < 2.797, we do not reject the null hypothesis H₀.

p-value = 2P(T ≥ |t|) = 2P(T ≥ 1.33). Since P(T ≥ 1.711) < P(T ≥ 1.33) < P(T ≥ 1.318), we have
2(0.05) < 2P(T ≥ 1.33) < 2(0.100), so

0.100 < p-value < 0.200
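
The pooled-variance arithmetic in problem 13 can be reproduced as follows (a sketch, assuming Python with SciPy):

    from math import sqrt
    from scipy.stats import t

    # Pooled two-sample t-test from summary statistics, problem 13
    x1, s1, n1 = 110.0, 9.0, 12
    x2, s2, n2 = 115.0, 10.0, 14

    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)   # pooled variance
    T = (x1 - x2) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
    p = 2 * t.sf(abs(T), df=n1 + n2 - 2)

    print(f"Sp^2 = {sp2:.4f}, T = {T:.2f}, p-value = {p:.3f}")
    # Sp^2 = 91.2917, T = -1.33, p about 0.20 (inside the 0.100-0.200 bracket)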

14. Rejection Region: RR = {|T| ≥ t_{m, α/2}} = {|T| ≥ t_{24,0.005}} = {|T| ≥ 2.797}

Degrees of freedom:
m = (s₁²/n₁ + s₂²/n₂)² / [ (s₁²/n₁)²/(n₁ − 1) + (s₂²/n₂)²/(n₂ − 1) ]
  = (9²/12 + 10²/14)² / [ (9²/12)²/(12 − 1) + (10²/14)²/(14 − 1) ] = 23.93 ≈ 24

Test Statistic:
T = ((X̄₁ − X̄₂) − (μ₁ − μ₂)) / √(s₁²/n₁ + s₂²/n₂) = ((110 − 115) − 0) / √(9²/12 + 10²/14) = −1.34

Since |−1.34| = 1.34 < 2.797, we do not reject the null hypothesis H₀.

p-value = 2P(T ≥ |t|) = 2P(T ≥ 1.34). Since P(T ≥ 1.711) < P(T ≥ 1.34) < P(T ≥ 1.318), we have

0.100 < p-value < 0.200
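
The Satterthwaite degrees-of-freedom formula in problem 14 is easy to get wrong by hand; a minimal sketch of the same calculation in Python (no SciPy needed):

    # Welch/Satterthwaite degrees-of-freedom approximation, problem 14
    s1, n1 = 9.0, 12
    s2, n2 = 10.0, 14

    v1, v2 = s1**2 / n1, s2**2 / n2
    m = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    print(f"m = {m:.2f} -> use df = {round(m)}")   # m = 23.93 -> df = 24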

15. (a) Rejection Region: RR = {|Z| ≥ z_{α/2}} = {|Z| ≥ z_{0.01}} = {|Z| ≥ 2.33}

Test Statistic:
Z = ((X̄₁ − X̄₂) − (μ₁ − μ₂)) / √(σ₁²/n₁ + σ₂²/n₂) = ((30.25 − 41.27) − 0) / √(4.5²/12 + 6.2²/11) = −4.84

Since |−4.84| = 4.84 > 2.33, we reject the null hypothesis H₀.

(b) p-value = 2P(Z ≤ −|z|) = 2P(Z ≤ −4.84) = 2(0.0000) = 0.0000

Since the p-value = 0.0000 ≤ 0.02 = α, we reject the null hypothesis H₀. This is the same
conclusion that we arrived at in part (a).

16. x̄ = 16.05, s = 2.198

(a) H₀: μ = 16 vs. H₁: μ > 16

(b) Rejection Region: RR = {T ≥ t_{n−1, α}} = {T ≥ t_{18,0.05}} = {T ≥ 1.734}

Test Statistic:
T = (x̄ − μ₀)/(s/√n) = (16.05 − 16)/(2.198/√19) = 0.0992

Since 0.0992 < 1.734, we do not reject the null hypothesis H₀.

(c) p-value = P(t₁₈ ≥ 0.0992) > P(t₁₈ ≥ 1.330) = 0.10. Thus, the p-value > 0.10.

17. X̄_d = Σd/n = 70/10 = 7

S_d² = (1/(n − 1)) [ Σdᵢ² − (Σdᵢ)²/n ] = (1/(10 − 1)) [ 540 − (70)²/10 ] = 5.5556

S_d = √S_d² = √5.5556 = 2.357

(a) Hypothesis: H₀: μ_d = 0 vs. H₁: μ_d > 0

Rejection Region: RR = {T ≥ t_{n−1, α}} = {T ≥ t_{9,0.05}} = {T ≥ 1.833}

Test Statistic:
T = (X̄_d − μ_d)/(S_d/√n) = 7/(2.357/√10) = 9.3916

Since 9.3916 > 1.833, we reject the null hypothesis H₀.

(b) p-value = P(t₉ ≥ t) = P(t₉ ≥ 9.3916) < P(t₉ ≥ 4.781) = 0.0005, so the
p-value < 0.0005.
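
Because the paired test reduces to a one-sample t-test on the differences, problem 17 can be checked from its summary statistics alone; a sketch assuming Python with SciPy:

    from math import sqrt
    from scipy.stats import t

    # Paired t-test from the summary statistics of problem 17
    n, sum_d, sum_d2 = 10, 70.0, 540.0

    dbar = sum_d / n
    sd2 = (sum_d2 - sum_d**2 / n) / (n - 1)    # S_d^2 = 5.5556
    T = dbar / (sqrt(sd2) / sqrt(n))
    p_upper = t.sf(T, df=n - 1)                # upper-tail p-value

    print(f"T = {T:.4f}, p-value = {p_upper:.2e}")   # T = 9.3916, p << 0.0005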

18. X̄_d = −3.8, S_d² = 3.5111, S_d = 1.874

(a) Hypothesis: H₀: μ_d = 0 vs. H₁: μ_d < 0

(b) Rejection Region: RR = {T < −t_{n−1, α}} = {T < −t_{9,0.01}} = {T < −2.821}

Test Statistic:
T = (X̄_d − μ_d)/(S_d/√n) = (−3.8 − 0)/(1.874/√10) = −6.4123

Since −6.4123 < −2.821, we reject the null hypothesis H₀.

19. p̂₁ = 250/400 = 0.625, p̂₂ = 380/500 = 0.76

p̄ = (X₁ + X₂)/(n₁ + n₂) = (250 + 380)/(400 + 500) = 0.70

(a) Hypothesis: H₀: p₁ − p₂ = 0 vs. H₁: p₁ − p₂ < 0

Rejection Region: RR = {Z ≤ −z_α} = {Z ≤ −z_{0.05}} = {Z ≤ −1.645}

Test Statistic:
Z = ((p̂₁ − p̂₂) − (p₁ − p₂)) / √( p̄q̄/n₁ + p̄q̄/n₂ ) = ((0.625 − 0.76) − 0) / √( 0.7(0.3)/400 + 0.7(0.3)/500 ) = −4.3916

Since −4.3916 < −1.645, we reject the null hypothesis H₀.

(b) Hypothesis: H₀: p₁ − p₂ = 0 vs. H₁: p₁ − p₂ ≠ 0

Rejection Region: RR = {|Z| ≥ z_{α/2}} = {|Z| ≥ z_{0.01}} = {|Z| ≥ 2.33}

Test Statistic: the same statistic as in part (a), Z = −4.3916.

Since |−4.3916| = 4.3916 > 2.33, we reject the null hypothesis H₀.

20. (a) p-value = P(Z ≤ z) = P(Z ≤ −4.3916) = 0.0000

Since the p-value = 0.0000 ≤ 0.05 = α, we reject the null hypothesis H₀.

(b) p-value = 2P(Z ≤ −|z|) = 2P(Z ≤ −4.3916) = 2(0.0000) = 0.0000

Since the p-value = 0.0000 ≤ 0.01 = α, we reject the null hypothesis H₀.
Yes, we arrive at the same conclusions as in problem 19 using the p-value approach.
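
A sketch of the pooled two-proportion z-test used in problems 19 and 20, assuming Python with SciPy:

    from math import sqrt
    from scipy.stats import norm

    # Two-proportion z-test with pooled estimate, problem 19(a)
    x1, n1 = 250, 400
    x2, n2 = 380, 500

    p1, p2 = x1 / n1, x2 / n2
    pbar = (x1 + x2) / (n1 + n2)                      # pooled proportion = 0.70
    se = sqrt(pbar * (1 - pbar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se

    print(f"z = {z:.4f}, lower-tail p-value = {norm.cdf(z):.2e}")   # z = -4.3916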
21. (a) Hypothesis: H₀: p = 0.5 vs. H₁: p ≠ 0.5

(b) Test Statistic:
Z = (p̂ − p₀)/√(p₀q₀/n) = (0.3 − 0.5)/√(0.5(0.5)/100) = −4.0

Finding the level of significance at which the null hypothesis is rejected is equivalent to finding the
p-value: p-value = 2P(Z ≤ −4) = 2P(Z ≥ 4) = 0.0000. Thus, the null hypothesis will be rejected at any
reasonable level of significance.
22. p̂₁ = 21/300 = 0.07, p̂₂ = 24/425 = 0.056

p̄ = (X₁ + X₂)/(n₁ + n₂) = (21 + 24)/(300 + 425) = 0.062

Hypothesis: H₀: p₁ − p₂ = 0 vs. H₁: p₁ − p₂ ≠ 0

Rejection Region: RR = {|Z| ≥ z_{α/2}} = {|Z| ≥ z_{0.025}} = {|Z| ≥ 1.96}

Test Statistic:
Z = ((p̂₁ − p̂₂) − (p₁ − p₂)) / √( p̄q̄/n₁ + p̄q̄/n₂ ) = ((0.07 − 0.056) − 0) / √( 0.062(0.938)/300 + 0.062(0.938)/425 ) = 0.7818

Since 0.7818 < 1.96, we do not reject the null hypothesis H₀.

p-value = 2P(Z ≥ |z|) = 2P(Z ≥ 0.78) = 2(0.2177) = 0.4354


23. Hypothesis: H₀: p = 0.10 vs. H₁: p > 0.10

Rejection Region: RR = {Z ≥ z_α} = {Z ≥ z_{0.05}} = {Z ≥ 1.645}

Test Statistic:
Z = (p̂ − p₀)/√(p₀q₀/n) = (0.129 − 0.1)/√(0.1(0.9)/140) = 1.14

Since 1.14 < 1.645, we do not reject the null hypothesis H₀.

p-value = P(Z ≥ z) = P(Z ≥ 1.14) = 0.1271


24. Hypothesis: H₀: σ² = 20 vs. H₁: σ² < 20

Rejection Region: RR = {χ² < χ²_{n−1, 1−α}} = {χ² < χ²_{17,0.95}} = {χ² < 8.6718}

Test Statistic:
χ² = (n − 1)s²/σ₀² = (18 − 1)(15.6)/20 = 13.26

Since 13.26 > 8.6718, we do not reject the null hypothesis H₀.
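
A sketch of the chi-square test for a single variance in problem 24, assuming Python with SciPy:

    from scipy.stats import chi2

    # Chi-square test for a variance (lower tail), problem 24
    n, s2, sigma0_sq, alpha = 18, 15.6, 20.0, 0.05

    stat = (n - 1) * s2 / sigma0_sq            # 13.26
    crit = chi2.ppf(alpha, df=n - 1)           # lower 5% point, about 8.6718

    print(f"chi2 = {stat:.2f}, critical value = {crit:.4f}")
    # 13.26 > 8.6718 -> do not reject H0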
25. n = 12, x̄ = 15.8583, s = 0.4757

Rejection Region: RR = {χ² ≤ χ²_{n−1, 1−α/2}} or {χ² ≥ χ²_{n−1, α/2}} = {χ² ≤ χ²_{11,0.995}} or {χ² ≥ χ²_{11,0.005}} = {χ² ≤ 2.6032} or {χ² ≥ 26.7569}

Test Statistic:
χ² = (n − 1)s²/σ₀² = (12 − 1)(0.4757)²/0.2 = 12.446

Since 2.6032 < 12.446 < 26.7569, we do not reject the null hypothesis H₀.

26. Rejection Region: RR = {F ≤ 1/F_{n₂−1, n₁−1, α/2}} or {F ≥ F_{n₁−1, n₂−1, α/2}} = {F ≤ 1/F_{35,24,0.025}} or {F ≥ F_{24,35,0.025}}

Note: Since neither 35 nor 24 can be found in the table, we will use approximate values for F obtained by
a method commonly known as interpolation. Thus, we have

RR = {F ≤ 1/2.15 ≈ 0.4651} or {F ≥ 2.02}

Test Statistic:
F = S₁²/S₂² = 1.475/0.878 = 1.67995

Since 0.4651 < 1.67995 < 2.02, we do not reject the null hypothesis H₀.
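
Exact F critical values remove the need for interpolation; a minimal sketch for problem 26, assuming Python with SciPy (the sample sizes n₁ = 25 and n₂ = 36 are inferred from the degrees of freedom above):

    from scipy.stats import f

    # Two-sided F-test for equal variances, problem 26
    s1_sq, n1 = 1.475, 25
    s2_sq, n2 = 0.878, 36
    alpha = 0.05

    F = s1_sq / s2_sq                                   # 1.67995
    lo = f.ppf(alpha / 2, dfn=n1 - 1, dfd=n2 - 1)       # about 0.47
    hi = f.ppf(1 - alpha / 2, dfn=n1 - 1, dfd=n2 - 1)   # about 2.01

    print(f"F = {F:.5f}, RR: F <= {lo:.4f} or F >= {hi:.4f}")
    # 0.47 < 1.68 < 2.01 -> do not reject H0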

27. Rejection Region: RR = {F ≥ F_{n₁−1, n₂−1, α}} = {F ≥ F_{24,35,0.05}} = {F ≥ 1.82}

Note: Since neither 24 nor 35 can be found in the table, we will use an approximate value by looking up the
value for 25 and 35 degrees of freedom. This gives us a somewhat conservative rejection region.

Test Statistic:
F = S₁²/S₂² = 1.475/0.878 = 1.67995

Since 1.67995 < 1.82, we do not reject the null hypothesis H₀.
28. Sample I: n₁ = 12, x̄₁ = 21.075, s₁ = 1.8626
Sample II: n₂ = 12, x̄₂ = 22.5, s₂ = 1.3837

Rejection Region: RR = {F ≤ 1/F_{n₂−1, n₁−1, α/2}} or {F ≥ F_{n₁−1, n₂−1, α/2}} = {F ≤ 1/F_{11,11,0.025}} or {F ≥ F_{11,11,0.025}} = {F ≤ 0.2882} or {F ≥ 3.47}

Test Statistic:
F = S₁²/S₂² = 1.8626²/1.3837² = 1.8124

Since 0.2882 < 1.8124 < 3.47, we do not reject the null hypothesis H₀.
29. Portfolio I: n₁ = 12, x̄₁ = 1.125, s₁ = 1.5621
Portfolio II: n₂ = 12, x̄₂ = 0.733, s₂ = 2.7050

Rejection Region: RR = {F ≤ 1/F_{n₂−1, n₁−1, α/2}} or {F ≥ F_{n₁−1, n₂−1, α/2}} = {F ≤ 1/F_{11,11,0.025}} or {F ≥ F_{11,11,0.025}} = {F ≤ 0.2882} or {F ≥ 3.47}

Test Statistic:
F = S₁²/S₂² = 1.5621²/2.7050² = 0.3335

Since 0.2882 < 0.3335 < 3.47, we do not reject the null hypothesis H₀.

30. Rejection Region: RR = {F ≤ 1/F_{n₂−1, n₁−1, α}} = {F ≤ 1/F_{11,11,0.05}} = {F ≤ 0.3546}

Test Statistic:
F = S₁²/S₂² = 1.5621²/2.7050² = 0.3335

Since 0.3335 < 0.3546, we reject the null hypothesis H₀.

Chapter 11
1.

Frequency Distribution Table for Problem #1

Classes   Frequency   Rel. Freq.   Cumul. Freq.
34-40         7          7/40            7
40-46         5          5/40           12
46-52         7          7/40           19
52-58         5          5/40           24
58-64         6          6/40           30
64-70         4          4/40           34
70-76         6          6/40           40
Total        40             1

Note: The frequency distribution table can vary depending upon how many classes you choose.

2. Descriptive Statistics for Data from Problem #1

Variable   Mean    StDev   Variance   CoefVar   Median    IQR
Prob. 1    53.80   12.54    157.14     23.30     53.00   22.25

3. (i) Probabilities P(r) for r = 0, 1, 2, ..., 30 when n = 30 and p = 0.2

   r      P(r)
   0      0.001238
   1      0.009285
   2      0.033656
   3      0.078532
   4      0.132522
   5      0.172279
   6      0.179457
   7      0.153821
   8      0.110559
   9      0.067564
  10      0.035471
  11      0.016123
  12      0.006382
  13      0.002209
  14      0.000671
  15      0.000179
  16      0.000042
  17      0.000009
  18      0.000002
  19-30   0.000000

(ii) Cumulative Probabilities for r = 0, 1, 2, ..., 30 when n = 30 and p = 0.2

   r      P(X <= r)
   0      0.00124
   1      0.01052
   2      0.04418
   3      0.12271
   4      0.25523
   5      0.42751
   6      0.60697
   7      0.76079
   8      0.87135
   9      0.93891
  10      0.97438
  11      0.99051
  12      0.99689
  13      0.99910
  14      0.99977
  15      0.99995
  16      0.99999
  17-30   1.00000
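
Both tables come straight from the binomial pmf and cdf; a minimal sketch that regenerates them, assuming Python with SciPy (MINITAB and JMP produce the same output):

    from scipy.stats import binom

    # Binomial probabilities and cumulative probabilities for n = 30, p = 0.2
    n, p = 30, 0.2
    for r in range(n + 1):
        print(f"{r:2d}  {binom.pmf(r, n, p):.6f}  {binom.cdf(r, n, p):.5f}")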

4. (a) Random Sample of size 100 from a Binomial Population with n = 60 and p = 0.7

44 35 44 41 38 45 38 41 41 38
35 42 45 42 43 41 45 45 42 48
41 51 40 38 43 42 38 44 47 48
45 39 39 44 42 42 47 43 43 37
46 45 43 37 46 42 43 46 30 40
38 32 41 45 41 45 39 39 42 41
47 46 43 43 31 45 42 36 37 41
38 43 48 41 43 44 36 45 42 41
43 40 40 47 40 47 35 39 43 41
40 50 44 39 37 36 40 43 42 45

Note: Each time you generate random data using MINITAB, JMP, or, in fact, any other software, the data will be different. You should use the above data to calculate some of the descriptive measures.


(b) Random Sample of size 100 from a Standard Normal Population

-1.21981  -2.37602   1.29775   0.00277   2.36307
 0.75201  -1.78136  -0.15683   0.60029   0.81729
-0.63226  -0.59754   1.43915   1.09275  -0.65072
 1.31089  -1.61429   1.67827  -0.73676   1.44223
-1.13921  -0.42632  -1.32405  -0.89732   0.27543
 0.31205   1.09948  -1.50948  -0.35195  -1.53494
-0.07543  -0.47851  -0.05165   0.73783  -1.01755
-0.40007  -0.33038   1.47401  -1.80159  -1.52018
-1.18923  -0.27756   1.11539  -0.84455   0.74224
-1.45352  -1.63088   0.22896  -0.44742   1.18517
 1.64296   2.03639  -0.66084  -0.51547   2.55221
 0.76959  -0.02838   1.02326   0.04541   1.39261
 0.57836   0.48845  -0.14035   1.03660   0.64137
 0.32206  -0.22875  -0.13656  -1.37834  -0.13024
-0.20022  -0.93577  -2.05991  -1.08695   1.39269
-0.21355  -0.13393   1.13944  -0.13758   0.47625
-1.51160  -2.84154   1.12404   0.83947  -0.39440
-0.47810  -1.18853   0.61019  -0.05268   0.48313
-0.60043  -0.95162  -1.55669  -1.50416  -0.09011
 0.10925   0.17342  -0.07181  -0.11341  -2.11343

Note: Each time you generate random data using MINITAB, JMP, or, in fact, any other software, the data will be different. You should use the above data to calculate some of the descriptive measures.


5. (i) Find cumulative probabilities:

P(z ≤ 1.645) = 0.95
P(z ≤ 1.96) = 0.9750
P(z ≤ 2.575) = 0.9950

(ii) Inverse Cumulative Distribution Function. Find the value of z₀ such that:

P(z ≤ z₀) = 0.025, where z₀ = -1.96
P(z ≤ z₀) = 0.05, where z₀ = -1.645
P(z ≤ z₀) = 0.10, where z₀ = -1.28
P(z ≤ z₀) = 0.90, where z₀ = 1.28
P(z ≤ z₀) = 0.95, where z₀ = 1.6449
P(z ≤ z₀) = 0.975, where z₀ = 1.9600
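
The same cumulative and inverse cumulative values can be obtained outside MINITAB or JMP; a sketch assuming Python with SciPy:

    from scipy.stats import norm

    # Cumulative probabilities, problem 5(i)
    for z in (1.645, 1.96, 2.575):
        print(f"P(Z <= {z}) = {norm.cdf(z):.4f}")

    # Inverse cumulative probabilities, problem 5(ii)
    for prob in (0.025, 0.05, 0.10, 0.90, 0.95, 0.975):
        print(f"P(Z <= z0) = {prob} -> z0 = {norm.ppf(prob):.4f}")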


6. One-Sample Z, since the sample size is 40, which is large (> 30).

(a)
Variable   N    Mean      StDev     SE Mean   90% CI
Prob. #1   40   53.8000   12.5355   1.9820    (50.5398, 57.0602)

Variable   N    Mean      StDev     SE Mean   95% CI
Prob. #1   40   53.8000   12.5355   1.9820    (49.9153, 57.6847)

Variable   N    Mean      StDev     SE Mean   99% CI
Prob. #1   40   53.8000   12.5355   1.9820    (48.6946, 58.9054)

(b)
Variable   N     Mean        StDev      SE Mean    90% CI
Prob. 4b   100   -0.110786   1.098759   0.109876   (-0.291515, 0.069944)

Variable   N     Mean        StDev      SE Mean    95% CI
Prob. 4b   100   -0.110786   1.098759   0.109876   (-0.326139, 0.104568)

Variable   N     Mean        StDev      SE Mean    99% CI
Prob. 4b   100   -0.110786   1.098759   0.109876   (-0.393807, 0.172236)

Note: The confidence intervals in problem 6(b) are strictly for the data set generated in problem 4(b). If you generate another data set, then your confidence intervals will be different from the ones shown above.
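
A sketch of the large-sample z confidence interval behind this output, assuming Python with SciPy and the summary statistics of part (a):

    from math import sqrt
    from scipy.stats import norm

    # Large-sample z confidence intervals for the mean, problem 6(a)
    n, xbar, s = 40, 53.80, 12.5355
    se = s / sqrt(n)

    for conf in (0.90, 0.95, 0.99):
        half = norm.ppf(1 - (1 - conf) / 2) * se
        print(f"{conf:.0%} CI: ({xbar - half:.4f}, {xbar + half:.4f})")
    # 90% CI: (50.5398, 57.0602), and so on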

7. (a) Construct a 95% confidence interval.

Two-Sample T-Test and CI: Sample 1, Sample 2

            N    Mean   StDev   SE Mean
Sample 1    15   4.35   1.03    0.27
Sample 2    15   4.40   0.816   0.21

Difference = mu (Sample 1) - mu (Sample 2)
Point estimate for difference = 4.35 - 4.40 = -0.05
95% CI for difference: (-0.752233, 0.645566)

Interpretation: Since the confidence interval includes 0, we do not reject the null hypothesis that the two
population means are equal, at the 5% level of significance.

(b) Construct a 99% confidence interval.

Two-Sample T-Test and CI: Sample 1, Sample 2

            N    Mean   StDev   SE Mean
Sample 1    15   4.35   1.03    0.27
Sample 2    15   4.40   0.816   0.21

Difference = mu (Sample 1) - mu (Sample 2)
Point estimate for difference = 4.35 - 4.40 = -0.05
99% CI for difference: (-0.998122, 0.891456)

Interpretation: Since the confidence interval includes 0, we do not reject the null hypothesis that the two
population means are equal, at the 1% level of significance.
8. (a) H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ ≠ 0. Let α = 0.05.

Two-Sample T-Test: Sample 1, Sample 2

            N    Mean   StDev   SE Mean
Sample 1    15   4.35   1.03    0.27
Sample 2    15   4.40   0.816   0.21

Difference = mu (Sample 1) - mu (Sample 2)
Point estimate for difference = 4.35 - 4.40 = -0.05
T-Test of H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ ≠ 0: T-Value = -0.16, P-Value = 0.877, DF = 26

Interpretation: Since the p-value = 0.877 > α = 0.05, we cannot reject H₀.

(b) H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ > 0. Let α = 0.05.
(The summary statistics are the same as in part (a).)
T-Test of H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ > 0: T-Value = -0.16, P-Value = 0.562, DF = 26

Interpretation: Since the p-value = 0.562 > α = 0.05, we do not reject H₀.

(c) H₀: μ₁ − μ₂ = 0 versus Hₐ: μ₁ − μ₂ < 0. Let α = 0.05.
(The summary statistics are the same as in part (a).)
T-Test of H₀: μ₁ − μ₂ = 0 vs. H₁: μ₁ − μ₂ < 0: T-Value = -0.16, P-Value = 0.438, DF = 26

Interpretation: Since the p-value = 0.438 > α = 0.05, we do not reject H₀.

9. (a) Random Sample of size 100 from a Binomial population B(40, 0.4)

12 17 11 13 14 14 12 18 13 16
12 10 11 21 15 12 16 16 21 18
18 15 19 20 19 15 19 17 16 18
14 10 15 17 14 16 16 23 17 14
14 12 17 15 14 13 16 19 14 17
13 13 15 15 17 15 15 10 15 16
15 15 15 12 15 19 11 17 16 15
17 20 12 15 15 16 21 16 16 16
10 19 16 19 19 16 16 15 15 15
18 22 21 18 15 15 23 11

(b) 5 10 11 10

(c)
0.92686   1.05933   4.86450   0.89129   1.34490
4.58611   4.23555   0.72925   3.10352   9.80666
1.06287   0.48886   0.78756   1.14476   0.41525
0.99962   0.53098   0.58796   0.87946   6.17721
6.04005   3.52215   0.70621   4.47782   2.82171
2.20250   0.26019   0.49498   3.48382   0.51510
3.94087   0.63957   2.24939   3.60930   0.02941
2.20857   1.86662   5.91489   2.39399   2.11482
0.28320   1.70997   0.20455   1.72291   2.57827
3.19872   6.59576   2.73302   2.44231   0.24167
0.98836   0.85113   2.08334   0.55120   1.56717
2.43075   0.45812   2.49099   3.71435   1.19253
0.61162   0.19957   1.40573   1.93952   0.01378
2.31370   1.07929   1.46698   1.31533   1.86552
1.99617   1.79145   0.87150   0.02049   2.14500
5.15758   3.72229   1.65200   2.35870   2.27768
1.20566   5.05366   4.95891   2.73776   0.11289
2.93519   0.69056   0.20431   1.79161   1.47690
1.53913   0.11944   1.65551   0.94907   1.02282
5.37658   2.80358   0.28814   2.71166   2.00077


Note: Each time you generate random data using MINITAB, JMP, or, in fact, any other software, the data will be different. You should use the above data to calculate some of the descriptive measures.

10. (a) Stem-and-Leaf Display for the Data in Problem #1

Leaf Unit = 1.0

  7   3 | 4 6 7 7 8 9 9
 18   4 | 1 2 2 3 4 6 7 7 8 8 9
 (7)  5 | 1 3 3 4 6 7 8
 15   6 | 0 1 1 3 3 5 6 7 9
  6   7 | 0 2 2 3 5 6

(b) Boxplot of Data from Problem #1
[Boxplot omitted; vertical axis labeled "Data," running from 30 to 80.]

11. Pie Chart of Data from Problem #11
[Pie chart omitted; category percentages recovered from the chart labels:
Category 2: 20.0%, Category 3: 16.7%, Category 4: 23.3%, Category 5: 23.3%, Category 6: 16.7%.]
12. (a) Cumulative Distribution Function
Binomial with n = 21 and p = 0.3:
P(X ≤ 9) = 0.932427

(b) P(5 ≤ X ≤ 15) = P(X ≤ 15) − P(X ≤ 4) = 0.999983 − 0.198381 = 0.801602
Binomial with n = 21 and p = 0.3:
P(X ≤ 15) = 0.999983, P(X ≤ 4) = 0.198381

13. (a) P(X ≤ 12) = 0.995549 with λ = 5.5
Poisson with mean = 5.5:
P(X ≤ 12) = 0.995549

(b) P(7 ≤ X ≤ 25) = P(X ≤ 25) − P(X ≤ 6) = 1.000000 − 0.686036 = 0.313964 with λ = 5.5
Poisson with mean = 5.5:
P(X ≤ 25) = 1.000000, P(X ≤ 6) = 0.686036

14. (a) μ = 20 and σ = 6:
P(7 ≤ X ≤ 33) = P(X ≤ 33) − P(X ≤ 7) = 0.9849 − 0.0151 = 0.9698

(b) μ = 12 and σ = 4:
P(8 ≤ X ≤ 16) = P(X ≤ 16) − P(X ≤ 8) = 0.8413 − 0.1587 = 0.6826

(c) μ = 0 and σ = 1:
P(−1.96 ≤ Z ≤ 1.96) = 0.9750 − 0.0250 = 0.95

(d) μ = 0 and σ = 1:
P(−1.645 ≤ Z ≤ 1.645) = 0.9500 − 0.0500 = 0.90

(e) μ = 0 and σ = 1:
P(−2.575 ≤ Z ≤ 2.575) = 0.9950 − 0.0050 = 0.99

(Each cumulative probability above is read from the software's Cumulative Distribution Function output for the normal distribution with the stated mean and standard deviation.)
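
The interval probabilities in problems 12 through 14 all follow the same cdf-difference pattern; a minimal sketch assuming Python with SciPy:

    from scipy.stats import binom, poisson, norm

    # Problem 12(b): binomial, n = 21, p = 0.3
    print(binom.cdf(15, 21, 0.3) - binom.cdf(4, 21, 0.3))    # 0.8016

    # Problem 13(b): Poisson, mean = 5.5
    print(poisson.cdf(25, 5.5) - poisson.cdf(6, 5.5))        # 0.3140

    # Problem 14(a): normal, mean = 20, sd = 6
    print(norm.cdf(33, 20, 6) - norm.cdf(7, 20, 6))          # 0.9698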

15. (a) Find a 95% confidence interval.

Variable     N    Mean      StDev    SE Mean   95% CI
Problem 15   32   29.6250   6.8285   1.2071    (27.1630, 32.0870)

(b) H₀: μ = 30 versus Hₐ: μ ≠ 30. Let α = 0.05.

One-Sample Z: Problem 15
Test of μ = 30 versus μ ≠ 30

Variable     N    Mean      StDev    SE Mean   95% CI                 Z      P
Problem 15   32   29.6250   6.8285   1.2071    (27.1630, 32.0870)  -0.31  0.758

Interpretation: Since the p-value = 0.758 > α = 0.05, we do not reject the null hypothesis.


16. Two-Sample T-Test and CI

Sample   N    Mean    StDev   SE Mean
1        25   93.00   2.10    0.42
2        27   87.00   3.70    0.71

Difference = μ₁ − μ₂
Estimate for μ₁ − μ₂: 93.00 − 87.00 = 6.00000
95% CI for μ₁ − μ₂: (4.30579, 7.69421)
T-Test of μ₁ − μ₂ = 0 versus μ₁ − μ₂ ≠ 0: T-Value = 7.11, P-Value = 0.000, DF = 50
Both use Pooled StDev = 3.0390

The 95% confidence interval does not include 0; therefore, we reject the null hypothesis that the two
population means are equal.
Also, since the p-value = 0.000 < α = 0.05, we reject the null hypothesis and conclude that the difference
between the population means is not equal to zero.


In loving memory of my parents, Roshan Lal and Sodhan Devi.
Bhisham

In loving memory of my father, Carl Ellsworth Walker.
Fred

Glossary

Alpha (α): Probability of rejecting a null hypothesis when it is true.
alternative hypothesis: A hypothesis formulated using new information.
arithmetic mean: The average of a data set.
bar graph: A graph representing categories in a data set with bars of height equal to the frequencies of the corresponding categories.
Bernoulli trial: A trial that has only two possible outcomes, such as success and failure.
Beta (β): Probability of not rejecting a null hypothesis when the alternative is true.
binomial model or binomial distribution: A discrete distribution that is applicable whenever an experiment consists of n independent Bernoulli trials and the probability of an outcome, say, success, is constant throughout the experiment.
bimodal distribution: A distribution that has two modes; a distribution with more modes is called a multimodal distribution; a distribution with one mode is called unimodal.
bound on error of estimation: Maximum difference between the estimated value and the true value of a parameter.
bimodal data: A data set that has two modes.
box and whisker plot: A plot that helps detect any outliers in the data and shows whether the data are skewed.
central limit theorem: A theorem that states that irrespective of the shape of the distribution of a population, the distribution of sample means is approximately normal when the sample size is large.
Chi-square distribution: Probability distribution of a sum of squares of n independent normal variables.
class: An interval that includes all observations in a quantitative data set within two values (for qualitative data, classes are generally defined by categories).
class frequency: Number of data points of a data set that belong to a certain class.
class limits: The two values that determine a class interval. The smaller value is the lower limit, and the larger value is the upper limit.
class midpoint: The average of the lower and upper class limits; also called class mark.
class width: The difference between the upper and lower class limits.
coefficient of variation (CV): Ratio of the standard deviation to the mean, expressed as a percentage.

complement event: An event that consists of all those sample points that are in the sample space but not in the given event.
conditional probability: The probability of one event happening given that another event has already happened.
contingency table: A two-dimensional table that contains frequencies or count data that belong to various categories according to the classification dictated by two attributes.
continuous distribution: Probability distribution of a continuous random variable.
continuous random variable: A random variable that can take any value in one or more intervals.
correction factor of continuity: A correction made when a discrete distribution is approximated with a continuous distribution.
correlation coefficient: A unitless measure that measures the strength of the association between two numerical variables.
critical point: The point that separates the rejection region from the acceptance region.
cumulative frequency: The frequency of a class that includes the frequencies of that class and of all the preceding classes.
data set: A collection of values collected through experimentation, sample surveys, or observations.
degrees of freedom: Number of variables in a sample that can independently vary.
dependent events: Events when the probability of occurrence of certain events changes when the information about the occurrence of other events is taken into consideration.
descriptive statistics: A group of methods used for organizing, summarizing, and representing data using tables, graphs, and summary statistics.
design of experiment (DOE): A well-structured plan that designs an experiment in which the experimenter can control the factors that might influence the outcome of the experiment.
discrete distribution: Probability distribution of a discrete random variable.
discrete random variable: A random variable that can take a finite or countably infinite number of values.
empirical rule: A rule that gives the percentage of data points that falls within a given number of standard deviations of the mean of a data set that is normally distributed.
equally likely events: Events when no event can occur in preference to any other event.
estimation: Procedure to find an estimator of an unknown parameter.
estimator: A rule (sample statistic) that tells us how to find a single value known as an estimate of an unknown parameter.
error of estimation: The difference between the estimated value and the true value of a parameter.
event: A set of one or more outcomes of a random experiment.
expected value: The weighted average of all the values that a random variable takes.
expected frequencies: Frequencies with which different categories in a contingency table would occur if the given null hypothesis is true.
experiment: A procedure that produces observations.
exponential distribution: Continuous probability distribution of times between random occurrences that follow the Poisson process.
F-distribution: A continuous probability distribution of the ratio of two independent chi-square random variables.
first quartile: The value such that at most 25% of the data values fall below it and at most 75% of the data values fall above it.
frequency distribution: A summary of data presented in a table that lists all the classes or categories along with their corresponding frequencies.
frequency histogram, relative frequency histogram, and cumulative frequency histogram: A graph in which classes are marked on the horizontal axis and frequencies, relative frequencies, or cumulative frequencies are marked on the vertical axis. The frequencies, relative frequencies, or cumulative frequencies are represented by bars or rectangles that are drawn back to back.
frequency polygon: A graph prepared by joining the midpoints at the top of adjacent bars or rectangles of a histogram.
grouped data: Summarized data produced by a frequency distribution.
hypergeometric distribution: Probability distribution of random occurrences when the sampling is done without replacement from a binary population.
independent events: Events when the occurrence of one event does not change the probability of the occurrence of the other event.
independent samples: Samples in which elements of one sample are related to the elements in another sample only by chance.
inferential statistics: A set of techniques used to arrive at a conclusion about a population based upon the information contained in a sample taken from that population.
intersection: The portion that is common to two or more sets.
interval data: Quantitative data in which multiplication and division are not allowed.
interval estimate: An interval that contains the true value of a parameter with a certain probability, called the confidence coefficient.
interquartile range: Difference between the third (upper) and the first (lower) quartile.
level of significance (α): The probability of committing a type I error.
lower inner fence: The value that is used in a box and whisker plot and that is equal to the first quartile minus 1.5 × IQR.
lower outer fence: The value that is used in a box and whisker plot and that is equal to the first quartile minus 3.0 × IQR.
marginal probability: The probability of one variable without taking into account the other variables.
mean square error (MSE): The average of the squared errors (squared residuals).
mean: One of the measures of central tendency, calculated by dividing the sum of all the observations by their number; also known as the expected value.

measures of central tendency: Statistics (e.g., mean, median, mode) used to provide information about the location of the center of a data set.
measures of variability: Statistics (e.g., range, standard deviation) that provide information about the spread of a data set; also known as measures of dispersion.
median: A value such that at most 50% of the data values fall below it and at most 50% fall above it.
mode: The value or values that occur most frequently in a data set.
mutually exclusive events: Events that do not have any common outcomes; therefore, the probability of occurrence of such events together is always zero.
nominal data: Weakest data in which observations are denoted by names or symbols.
nonparametric statistics: Statistical techniques that make no assumption about the distribution of a population.
normal probability distribution: The most widely used continuous probability distribution, with a bell-shaped frequency curve; also known as the bell-shaped or Gaussian distribution.
null hypothesis: A hypothesis formulated without taking into consideration any new information, or formulated based upon the belief that what already exists is true.
observations: Values obtained from outcomes of an experiment or sample survey or simply by observing a phenomenon.
observed level of significance: The smallest value of α for which the null hypothesis is rejected; also known as the p-value.
Ogive curve: The curve obtained by joining the midpoints at the top of adjacent bars or rectangles of a cumulative frequency histogram.
one-tail test: A statistical test when the rejection region falls under one tail only.
operating characteristics curve (OC-curve): Used in testing of statistical hypotheses. It is a graph obtained by plotting the type II errors versus various values of the parameter under the alternative hypothesis.
ordinal data: Data values arranged by rank or order. It is stronger than nominal data but weaker than interval or ratio data.
outliers: Data values that lie far below or above the majority of the values in the data set.
p-value: The observed level of significance.
paired data: Data collected from pairs of subjects that have common characteristics or from the same subject before and after a treatment; also known as before-and-after data.
parameter: A descriptive measure calculated using population data.
percentiles: Measures that divide the ranked data into 100 equal parts (numerical).
pie chart: A circle divided into pizza slices that represent the percentage breakdown of data into different categories.
point value: A value obtained by using sample data, which replaces the true value of an unknown parameter.
Poisson distribution: A discrete probability distribution of rare events.
population: A collection of all conceivable items, objects, or persons of interest.
power: Probability of rejecting a null hypothesis when it is not true.
power curve: A graph used in testing of statistical hypotheses. It is obtained by plotting power (complement of type II errors) versus various values of the parameter under the alternative hypothesis.
process: A series of steps required to complete a procedure.
qualitative data: Data that measure the quality characteristic of a subject; also called categorical data (e.g., nominal data and ordinal data).
quality control: A set of techniques or methods used to maintain the quality of a product.
quantitative data: Data that measure characteristics of a subject quantitatively; also known as numerical data.
quartiles: Measures that divide the ranked data into four equal parts.
random sample: A sample drawn such that each element of the population has the same chance of being included in the sample; also called a simple random sample.
random variable: A variable that assigns some real values to the outcomes of a random experiment.
range: The difference between the largest and the smallest data value in a data set.
rejection region: The region such that the null hypothesis is rejected when the value of the test statistic falls in that region.
relative frequency: The frequency of a class or category divided by the total frequency.
sample: A part of a population of interest.
sample space: A complete listing of all possible outcomes of a random experiment.
sample statistic: A measure calculated by using sample data (e.g., sample mean, sample standard deviation); also simply called a statistic.
sample survey: A technique used to collect data through mail, telephone, or personal interviews.
sampling distribution: Probability distribution of a statistic.
second quartile: A value such that at most 50% of the data values fall below it and at most 50% fall above it; also called the middle quartile or median.
simple event: An event that contains only one possible outcome of a random experiment.
Six Sigma: A measure of process quality wherein the distance between the target value and the upper or lower specification limit is at least six standard deviations.
skewed data: Data whose frequency curve has a longer tail either to the left (left skewed) or to the right (right skewed).
standard deviation: A measure of dispersion that is equal to the positive square root of the variance.
standard error: The standard deviation of a sample statistic (e.g., sample mean or sample proportion).
standard normal distribution: A normal distribution with a mean of 0 and a standard deviation of 1.
stem and leaf diagram: A diagram in which each observation is broken into two parts: a stem and a leaf.
sure event: An event that has a probability of occurrence of 1.
symmetric data: Data whose frequency curve is identical on both sides of its center.
t-distribution: A continuous distribution of the ratio of two independent random variables: a standard normal and a chi-square.
third quartile: A value such that at most 75% of the data values fall below it and at most 25% of the data values fall above it.
test statistic: A statistic used to make a decision whether to reject the null hypothesis.
two-tail test: A statistical test when the rejection region falls under both tails.
type I error: Probability of rejecting a null hypothesis when it is true.
type II error: Probability of not rejecting a null hypothesis when it is false.
uniform distribution: A continuous distribution with its frequency curve shaped as a rectangle; also called the rectangular distribution.
upper inner fence: The value that is used in a box and whisker plot and is equal to the third quartile plus 1.5 × IQR.
upper outer fence: The value that is used in a box and whisker plot and is equal to the third quartile plus 3.0 × IQR.
unbiased estimator: An estimator with its expected value equal to the true value of the parameter that is being estimated.
variable: A characteristic of interest that takes different values for different elements.
variance: A measure of dispersion that is equal to the average of the squared deviations from the mean of the data set.
Venn diagram: A diagram used for representing the sample space and events.
weighted mean: Mean of a data set where each observation is given a relative importance expressed numerically by a set of numbers called weights.
Z distribution: Another name for the standard normal distribution.
z score: The location of any value in a data set relative to the mean, expressed in terms of units of standard deviation.

Figures

Figure 1.1   The normal distribution . . . 2
Figure 1.2   Six Sigma (Motorola definition) . . . 3
Figure 1.3   Current Six Sigma implementation flow chart . . . 4
Figure 1.4   Six Sigma support personnel . . . 6
Figure 1.5   Histogram . . . 7
Figure 2.1   Classifications of statistical data . . . 12
Figure 3.1   Dot plot for the data on defective motors that are received in 20 shipments . . . 21
Figure 3.2   Pie chart for defects associated with manufacturing process steps . . . 23
Figure 3.3   Bar chart for annual revenues of a company over the period of five years . . . 25
Figure 3.4   Bar graph for the data in Example 3.7 . . . 26
Figure 3.5   Bar charts for types of defects in auto parts manufactured in Plant I (P1) and Plant II (P2) . . . 26
Figure 3.6   Frequency histogram for survival time of parts under extreme operating conditions . . . 29
Figure 3.7   Relative frequency histogram for survival time of parts under extreme operating conditions . . . 30
Figure 3.8   Frequency polygon for the data in Example 3.9 . . . 30
Figure 3.9   Relative frequency polygon for the data in Example 3.9 . . . 31
Figure 3.10  A typical frequency distribution curve . . . 31
Figure 3.11  Three types of frequency distribution curves . . . 32
Figure 3.12  Cumulative frequency histogram for the data in Example 3.9 . . . 32
Figure 3.13  Ogive curve for the survival data in Example 3.9 . . . 33
Figure 3.14  Line graph for the data on lawn mowers given in Example 3.10 . . . 34
Figure 3.15  Ordinary and ordered stem and leaf diagram for the data on survival time for parts in extreme operating conditions in Example 3.9 . . . 36
Figure 3.16  Ordered stem and leaf diagram for the data in Table 3.10 . . . 37
Figure 3.17  Ordered two-stem and leaf diagram for the data in Table 3.12 . . . 38
Figure 3.18  Ordered five-stem and leaf diagram . . . 38

Figure 3.19 MINITAB display depicting eight degrees of correlation: (a)


represents strong positive correlation, (b) represents strong negative
correlation, (c) represents positive perfect correlation, (d) represents
negative perfect correlation, (e) represents positive moderate
correlation, (f) represents negative moderate correlation, (g)
represents a positive weak correlation, and (h) represents a negative
weak correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Figure 4.1 Frequency distributions showing the shape and location of
measures of centrality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
Figure 4.2 Two frequency distribution curves with equal mean, median
and mode values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Figure 4.3 Application of the empirical rule . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Figure 4.4 Amount of soft drink contained in a bottle . . . . . . . . . . . . . . . . . . . 62
Figure 4.5 Dollar value of units of bad production . . . . . . . . . . . . . . . . . . . . . 62
Figure 4.6 Salary data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
Figure 4.7 Quartiles and percentiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
Figure 4.8 Box-whisker plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Figure 4.9 Example box plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Figure 4.10 Box plot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
Figure 5.1 Tree diagram for an experiment of testing a chip, randomly
selecting a part, and testing another chip . . . . . . . . . . . . . . . . . . . . 76
Figure 5.2 Venn diagram representing the sample space S and the event
A in S . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Figure 5.3 Venn diagram representing the union of events A and B
(shaded area) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Figure 5.4 Venn diagram representing the intersection of events A and B
(shaded area) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Figure 5.5 Venn diagram representing the complement of an event A
(shaded area) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Figure 5.6 Venn diagram representing A∪B = {1, 4, 5, 6, 7, 8, 9, 10},
A∩B = {7}, Ā = {2, 3, 5, 9, 10}, B̄ = {1, 2, 3, 4, 6, 8} . . . . . . . . 82
Figure 5.7 Two mutually exclusive events, A and B . . . . . . . . . . . . . . . . . . . . 83
Figure 5.8 Venn diagram showing the phenomenon of P(A∪B) =
P(A) + P(B) − P(A∩B) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Figure 6.1 Graphical representation of probability function in Table 6.2 . . . . 96
Figure 6.2 Graphical representation of probability function f(x) in
Table 6.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
Figure 6.3 Graphical representation of the distribution function F(x) in
Example 6.2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Figure 6.4 Location of mean μ and the end points of the interval
(μ − 2σ, μ + 2σ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
Figure 6.5 Binomial probability distribution with n = 10, p = 0.80 . . . . . . . . 105
Figure 7.1 An illustration of a density function of a continuous
random variable X . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Figure 7.2 Graphical representation of F(x) = P(X ≤ x) . . . . . . . . . . . . . . . . 117
Figure 7.3 Uniform distribution over the interval (a, b) . . . . . . . . . . . . . . . . . 118
Figure 7.4 Probability P(x₁ ≤ X ≤ x₂) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Figure 7.5 The normal density function curve with mean  and
standard deviation  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Figure 7.6 Curves representing the normal density function with
different means, but with the same standard deviation . . . . . . . . . . 122
Figure 7.7 Curves representing the normal density function with different
standard deviations, but with the same mean . . . . . . . . . . . . . . . . . 123

Figure 7.8   The standard normal density function curve . . . 123
Figure 7.9   Probability (a ≤ Z ≤ b) under the standard normal curve . . . 124
Figure 7.10  Shaded area equal to P(1 ≤ Z ≤ 2) . . . 125
Figure 7.11  Two shaded areas showing P(−1.50 ≤ Z ≤ 0) = P(0 ≤ Z ≤ 1.50) . . . 125
Figure 7.12  Two shaded areas showing P(−2.2 ≤ Z ≤ −1.0) = P(1.0 ≤ Z ≤ 2.2) . . . 125
Figure 7.13  Showing P(−1.50 ≤ Z ≤ 0.80) = P(−1.50 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.80) . . . 126
Figure 7.14  Shaded area showing P(Z ≥ 0.70) . . . 126
Figure 7.15  Shaded area showing P(Z ≥ −1.0) . . . 126
Figure 7.16  Shaded area showing P(Z ≤ 2.15) . . . 127
Figure 7.17  Shaded area showing P(Z ≤ −2.15) . . . 127
Figure 7.18  Converting normal N(6,4) to standard normal N(0,1) . . . 128
Figure 7.19  Shaded area showing P(0.5 ≤ Z ≤ 2.0) . . . 128
Figure 7.20  Shaded area showing P(−1.0 ≤ Z ≤ 1.0) . . . 128
Figure 7.21  Shaded area showing P(−1.50 ≤ Z ≤ −0.50) . . . 129
Figure 7.22  Graphs of exponential density function for λ = 0.1, 0.5, 1.0, and 2.0 . . . 130
Figure 7.23  Curves of three hazard rate functions . . . 132
Figure 7.24  Hazard function h(t) with α = 1; β = 0.5, 1, 2 . . . 133
Figure 7.25  Weibull density function (a) α = 1, β = 0.5 (b) α = 1, β = 1 (c) α = 1, β = 2 . . . 134
Figure 8.1   Shaded area showing P(−2 ≤ Z ≤ 2) . . . 142
Figure 8.2   Shaded area showing P(Z ≥ 1) . . . 142
Figure 8.3   Shaded area showing P(−2.28 ≤ Z ≤ 2.28) . . . 143
Figure 8.4   Shaded area showing P(Z ≥ 1.14) . . . 143
Figure 8.5   Shaded area showing P(−1.5 ≤ Z ≤ 1.5) . . . 144
Figure 8.6   Shaded area showing P(−1.6 ≤ Z ≤ 1.6) . . . 144
Figure 8.7   Shaded area showing P(−2 ≤ Z ≤ 2) . . . 144
Figure 8.8   Shaded area showing P(−1.71 ≤ Z ≤ 1.71) . . . 146
Figure 8.9   Shaded area showing P(−2.23 ≤ Z ≤ 2.23) . . . 147
Figure 8.10  Chi-square distribution with different degrees of freedom . . . 149
Figure 8.11  Chi-square distribution with upper-tail area α . . . 149
Figure 8.12  Chi-square distribution with upper-tail area α = 0.05 . . . 150
Figure 8.13  Chi-square distribution with lower-tail area α . . . 150
Figure 8.14  Chi-square distribution with lower-tail area α = 0.10 . . . 151
Figure 8.15  Frequency distribution function of t-distribution with, say, n = 15 degrees of freedom and standard normal distribution . . . 154
Figure 8.16  t-distribution with shaded area under the two tails equal to P(T ≤ −t_{n,α/2}) + P(T ≥ t_{n,α/2}) = α . . . 154
Figure 8.17  A typical probability density function curve of F_{ν₁,ν₂} . . . 156
Figure 8.18  Probability density function curve of F_{ν₁,ν₂} with upper-tail area = α . . . 157
Figure 8.19  Probability density function curve of F_{ν₁,ν₂} with lower-tail area = α . . . 157
Figure 8.20  Comparison of histograms for various binomial distributions (n = 15, p = 0.2, 0.3, 0.4, 0.5) . . . 160
Figure 8.21  (a) Showing the normal approximation to the binomial. (b) Replacing the shaded area contained in the rectangles by the shaded area under the normal curve . . . 163
Figure 9.1   An interpretation of a confidence interval . . . 172
Figure 9.2   Standard normal curve with tail areas equal to α/2 . . . 174


Figure 9.3   (a) Standard normal curve with lower-tail area equal to α
             (b) Standard normal curve with upper-tail area equal to α . . . 175
Figure 9.4   Student's t-distribution with tail areas equal to α/2 . . . 178
Figure 9.5   Chi-square distribution with two tail areas each equal to 0.025 . . . 197
Figure 9.6   F-distribution curve (a) shaded area under two tails each equal to 0.025 (b) shaded area under left tail equal to 0.05 (c) shaded area under the right tail equal to 0.05 . . . 200
Figure 10.1  Critical points dividing the sample space into two regions, the rejection region and the acceptance region . . . 204
Figure 10.2  OC-curves for different alternative hypotheses . . . 206
Figure 10.3  Power curves for different hypotheses . . . 207
Figure 10.4  Rejection regions for hypotheses (i), (ii), and (iii) . . . 209
Figure 10.5  Lower-tail rejection region with α = 0.01 . . . 210
Figure 10.6  Two-tail rejection region with α = 0.01 . . . 211
Figure 10.7  Power curve for the test in Example 10.2 . . . 213
Figure 10.8  Rejection regions for hypotheses (i), (ii), and (iii) . . . 214
Figure 10.9  Rejection region under the lower tail with α = 0.05 . . . 215
Figure 10.10 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance . . . 217
Figure 10.11 Rejection region under the upper tail with α = 0.05 . . . 218
Figure 10.12 Rejection regions under the two tails with α = 0.05 . . . 219
Figure 10.13 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance . . . 221
Figure 10.14 Rejection regions for a two-tail test with α = 0.05 . . . 222
Figure 10.15 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance . . . 224
Figure 10.16 Rejection region under the lower tail with α = 0.05 . . . 225
Figure 10.17 Rejection regions under the two tails with α = 0.05 . . . 226
Figure 10.18 Rejection regions for testing hypotheses (i), (ii), and (iii) at the given α . . . 227
Figure 10.19 Rejection region under the upper tail with α = 0.05 . . . 228
Figure 10.20 Rejection region under the upper tail with α = 0.01 . . . 231
Figure 10.21 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α level of significance . . . 233
Figure 10.22 Rejection region under the upper tail with α = 0.025 . . . 234
Figure 10.23 Rejection regions for testing the hypotheses (i), (ii), and (iii) at the α level of significance . . . 236
Figure 10.24 The rejection region under the two tails with α = 0.01 . . . 237
Figure 10.25 Rejection region under the lower tail with α = 0.05 . . . 240
Figure 10.26 Rejection regions under the two tails with α = 0.05 . . . 242
Figure 10.27 Rejection regions for testing hypotheses (i), (ii), and (iii) at the α = 0.05 level of significance . . . 244
Figure 10.28 Rejection region under the chi-square distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance . . . 245
Figure 10.29 Rejection region under the lower tail with α = 0.05 . . . 246
Figure 10.30 Rejection region under the F-distribution curve for testing hypotheses (i), (ii), and (iii) at the α level of significance . . . 247
Figure 10.31 Rejection region under the two tails with α = 0.05 . . . 249
Figure 10.32 Rejection region under the right tail with α = 0.05 . . . 249
Figure 11.1  The screen that appears first in the MINITAB environment . . . 256
Figure 11.2  MINITAB window showing the menu command options . . . 257
Figure 11.3  MINITAB window showing input and output for Column Statistics . . . 259

Figure 11.4  MINITAB window showing various options available under the Stat command . . . 260
Figure 11.5  MINITAB display of histogram for the data given in Example 11.3 . . . 261
Figure 11.6  MINITAB window showing the Edit Bars dialog box . . . 262
Figure 11.7  MINITAB display of histogram with 5 classes for the data in Example 11.3 . . . 263
Figure 11.8  MINITAB output of Dotplot for the data in Example 11.4 . . . 264
Figure 11.9  MINITAB output of Scatterplot for the data given in Example 11.5 . . . 265
Figure 11.10 MINITAB display of box plot for the data in Example 11.6 . . . 266
Figure 11.11 MINITAB display of graphical summary for the data in Example 11.7 . . . 267
Figure 11.12 MINITAB display of bar graph for the data in Example 11.8 . . . 269
Figure 11.13 MINITAB display of pie chart for the data in Example 11.9 . . . 270
Figure 11.14 MINITAB printout of 95% Bonferroni confidence interval for standard deviations . . . 282
Figure 11.15 MINITAB display of normal probability graph for the data in Example 11.19 . . . 283
Figure 11.16 The screen that appears first in the JMP environment . . . 284
Figure 11.17 JMP menu command options . . . 285
Figure 11.18 JMP window showing input and output for Column Statistics . . . 287
Figure 11.19 JMP Distribution dialog box . . . 288
Figure 11.20 JMP display of histogram for the data given in Example 11.21 . . . 289
Figure 11.21 JMP printout of stem and leaf for the data given in Example 11.21 . . . 290
Figure 11.22 JMP display of box plot with summary statistics for Example 11.21 . . . 291
Figure 11.23 JMP display of graphical summary for the data in Example 11.22 . . . 292
Figure 11.24 JMP display of bar graph for the data in Example 11.23 . . . 293
Figure 11.25 JMP printout of pie chart for the data in Example 11.24 . . . 295
Figure 11.26 JMP printout of 1-sample t-test for the data in Example 11.25 . . . 296
Figure 11.27 JMP printout of 1-sample z-test for the data in Example 11.26 . . . 297
Figure 11.28 JMP printout of 2-sample t-test for the data in Example 11.27 . . . 299
Figure 11.29 JMP printout of paired t-test for the data in Example 11.28 . . . 300
Figure 11.30 JMP printout of test of equal variances in Example 11.29 . . . 301
Figure 11.31 JMP display of normal quantile graph for the data in Example 11.30 . . . 302

Tables

Table 1.1   Process step completion times . . . 7
Table 1.2   Descriptive statistics . . . 7
Table 3.1   Annual revenues of 110 small to midsize companies in the midwestern United States . . . 16
Table 3.2   Frequency distribution table for 110 small to midsize companies in the midwestern United States . . . 17
Table 3.3   Complete frequency distribution table for the 110 small to midsize companies in the midwestern United States . . . 17
Table 3.4   Complete frequency distribution table for the data in Example 3.2 . . . 18
Table 3.5   Frequency table for the data on rod lengths . . . 20
Table 3.6   Understanding defect rates as a function of various process steps . . . 23
Table 3.7   Frequency distribution table for the data in Example 3.7 . . . 25
Table 3.8   Frequency distribution table for the survival time of parts . . . 28
Table 3.9   Data on survival time (in hours) in Example 3.9 . . . 35
Table 3.10  Number of parts produced by each worker per week . . . 37
Table 3.11  Cholesterol levels and systolic BP of 30 randomly selected U.S. men . . . 40
Table 4.1   Age distribution of group of 40 people watching a basketball game . . . 58
Table 5.1   Classification of technicians by qualification and gender . . . 89
Table 5.2   Classification of technicians by qualification and gender . . . 91
Table 6.1   Probability distribution of a random variable X . . . 95
Table 6.2   Probability distribution of random variable X defined in Example 6.1 . . . 95
Table 6.3   Probability function of X . . . 97
Table 6.4   Portion of Table I of the appendix for n = 5 . . . 106
Table 6.5   Portion of Table II of the appendix . . . 114
Table 7.1   A portion of standard normal distribution Table III of the appendix . . . 124
Table 8.1   Population with its distribution for the experiment of rolling a fair die . . . 138
Table 8.2   All possible samples of size 2 with their respective means . . . 139
Table 8.3   Different sample means with their respective probabilities . . . 139

xviii

Tables

Table 8.4
Table 8.5
Table 8.6
Table 10.1
Table 10.2
Table I
Table II
Table III
Table IV
Table V
Table VI

A portion of the t-table giving the value of tn, for certain


values of n and  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
Comparison of approximate probabilities to the exact
probabilities (n  5, p  0.4, 0.5) . . . . . . . . . . . . . . . . . . . . . . . . . 161
Showing the use of continuity correction factor under different
scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
Presenting the view of type I and type II errors . . . . . . . . . . . . . . . 205
Condence intervals for testing various hypotheses . . . . . . . . . . . 252
Binomial probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
Poisson probabilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
Standard Normal Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
Critical values of 2 with  degrees of freedom . . . . . . . . . . . . . . . 320
Critical values of t with  degrees of freedom . . . . . . . . . . . . . . . . 322
Critical values of F with numerator and denominator degrees of
freedom 1, 2 respectively (  0.10) . . . . . . . . . . . . . . . . . . . . . 323

xix

THE
NORMAL
LAW OF ERROR
STANDS OUT IN THE
EXPERIENCE OF MANKIND
AS ONE OF THE BROADEST
GENERALIZATIONS OF NATURAL
PHILOSOPHY IT SERVES AS THE
GUIDING INSTRUMENT IN RESEARCHES
IN THE PHYSICAL AND SOCIAL SCIENCES AND
IN MEDICINE AGRICULTURE AND ENGINEERING
IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT

W. J. Youden

Appendix

Table I     Binomial probabilities . . . 312
Table II    Poisson probabilities . . . 315
Table III   Standard normal distribution . . . 317
Table IV    Critical values of χ² with ν degrees of freedom . . . 318
Table V     Critical values of t with ν degrees of freedom . . . 320
Table VI    Critical values of F with numerator and denominator degrees of freedom ν1, ν2, respectively (α = 0.10) . . . 321

Table I   Binomial probabilities.

Tabulated values are P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}.

                                        p
 n   x   .05   .10   .20   .30   .40   .50   .60   .70   .80   .90   .95
 1   0  .950  .900  .800  .700  .600  .500  .400  .300  .200  .100  .050
     1  .050  .100  .200  .300  .400  .500  .600  .700  .800  .900  .950
 2   0  .902  .810  .640  .490  .360  .250  .160  .090  .040  .010  .003
     1  .095  .180  .320  .420  .480  .500  .480  .420  .320  .180  .095
     2  .003  .010  .040  .090  .160  .250  .360  .490  .640  .810  .902
 3   0  .857  .729  .512  .343  .216  .125  .064  .027  .008  .001  .000
     1  .136  .243  .384  .441  .432  .375  .288  .189  .096  .027  .007
     2  .007  .027  .096  .189  .288  .375  .432  .441  .384  .243  .135
     3  .000  .001  .008  .027  .064  .125  .216  .343  .512  .729  .857
 4   0  .815  .656  .410  .240  .130  .062  .025  .008  .002  .000  .000
     1  .171  .292  .410  .412  .346  .250  .154  .076  .026  .004  .001
     2  .014  .048  .154  .265  .345  .375  .346  .264  .154  .048  .014
     3  .000  .004  .025  .075  .154  .250  .346  .412  .409  .292  .171
     4  .000  .000  .001  .008  .025  .063  .129  .240  .409  .656  .815
 5   0  .774  .591  .328  .168  .078  .031  .010  .002  .000  .000  .000
     1  .204  .328  .410  .360  .259  .156  .077  .028  .006  .001  .000
     2  .021  .073  .205  .309  .346  .312  .230  .132  .051  .008  .001
     3  .001  .008  .051  .132  .230  .312  .346  .308  .205  .073  .021
     4  .000  .000  .006  .028  .077  .156  .259  .360  .410  .328  .204
     5  .000  .000  .000  .003  .010  .031  .078  .168  .328  .590  .774
 6   0  .735  .531  .262  .118  .047  .016  .004  .001  .000  .000  .000
     1  .232  .354  .393  .302  .187  .094  .037  .010  .002  .000  .000
     2  .031  .098  .246  .324  .311  .234  .138  .059  .015  .001  .000
     3  .002  .015  .082  .185  .276  .313  .277  .185  .082  .015  .002
     4  .000  .001  .015  .059  .138  .234  .311  .324  .246  .098  .031
     5  .000  .000  .002  .010  .037  .094  .186  .302  .393  .354  .232
     6  .000  .000  .000  .001  .004  .015  .047  .118  .262  .531  .735
 7   0  .698  .478  .210  .082  .028  .008  .002  .000  .000  .000  .000
     1  .257  .372  .367  .247  .131  .055  .017  .004  .000  .000  .000
     2  .041  .124  .275  .318  .261  .164  .077  .025  .004  .000  .000
     3  .004  .023  .115  .227  .290  .273  .194  .097  .029  .003  .000
     4  .000  .003  .029  .097  .194  .273  .290  .227  .115  .023  .004
     5  .000  .000  .004  .025  .077  .164  .261  .318  .275  .124  .041
     6  .000  .000  .000  .004  .017  .055  .131  .247  .367  .372  .257
     7  .000  .000  .000  .000  .002  .008  .028  .082  .210  .478  .698
 8   0  .663  .430  .168  .058  .017  .004  .001  .000  .000  .000  .000
     1  .279  .383  .335  .198  .089  .031  .008  .001  .000  .000  .000
     2  .052  .149  .294  .296  .209  .109  .041  .010  .001  .000  .000
     3  .005  .033  .147  .254  .279  .219  .124  .048  .009  .000  .000
     4  .000  .005  .046  .136  .232  .273  .232  .136  .046  .005  .000
     5  .000  .000  .009  .047  .124  .219  .279  .254  .147  .033  .005
     6  .000  .000  .001  .010  .041  .110  .209  .296  .294  .149  .052
     7  .000  .000  .000  .001  .008  .031  .089  .198  .335  .383  .279
     8  .000  .000  .000  .000  .001  .004  .017  .057  .168  .430  .664
 9   0  .630  .387  .134  .040  .010  .002  .000  .000  .000  .000  .000
     1  .298  .387  .302  .156  .061  .018  .004  .000  .000  .000  .000
     2  .063  .172  .302  .267  .161  .070  .021  .004  .000  .000  .000
     3  .008  .045  .176  .267  .251  .164  .074  .021  .003  .000  .000
     4  .001  .007  .066  .172  .251  .246  .167  .073  .017  .001  .000
     5  .000  .001  .017  .073  .167  .246  .251  .172  .066  .007  .001
     6  .000  .000  .003  .021  .074  .164  .251  .267  .176  .045  .008
     7  .000  .000  .000  .004  .021  .070  .161  .267  .302  .172  .063
     8  .000  .000  .000  .000  .004  .018  .060  .156  .302  .387  .298
     9  .000  .000  .000  .000  .000  .002  .010  .040  .134  .387  .630

10   0  .599  .349  .107  .028  .006  .001  .000  .000  .000  .000  .000
     1  .315  .387  .268  .121  .040  .010  .002  .000  .000  .000  .000
     2  .075  .194  .302  .234  .121  .044  .011  .001  .000  .000  .000
     3  .010  .057  .201  .267  .215  .117  .042  .009  .001  .000  .000
     4  .001  .011  .088  .200  .251  .205  .111  .037  .006  .000  .000
     5  .000  .002  .026  .103  .201  .246  .201  .103  .026  .002  .000
     6  .000  .000  .006  .037  .111  .205  .251  .200  .088  .011  .001
     7  .000  .000  .001  .009  .042  .117  .215  .267  .201  .057  .011
     8  .000  .000  .000  .001  .011  .044  .121  .234  .302  .194  .075
     9  .000  .000  .000  .000  .002  .010  .040  .121  .268  .387  .315
    10  .000  .000  .000  .000  .000  .001  .006  .028  .107  .349  .599
11   0  .569  .314  .086  .020  .004  .001  .000  .000  .000  .000  .000
     1  .329  .384  .236  .093  .027  .005  .001  .000  .000  .000  .000
     2  .087  .213  .295  .200  .089  .027  .005  .001  .000  .000  .000
     3  .014  .071  .222  .257  .177  .081  .023  .004  .000  .000  .000
     4  .001  .016  .111  .220  .237  .161  .070  .017  .002  .000  .000
     5  .000  .003  .039  .132  .221  .226  .147  .057  .010  .000  .000
     6  .000  .000  .010  .057  .147  .226  .221  .132  .039  .003  .000
     7  .000  .000  .002  .017  .070  .161  .237  .220  .111  .016  .001
     8  .000  .000  .000  .004  .023  .081  .177  .257  .222  .071  .014
     9  .000  .000  .000  .001  .005  .027  .089  .200  .295  .213  .087
    10  .000  .000  .000  .000  .001  .005  .027  .093  .236  .384  .329
    11  .000  .000  .000  .000  .000  .001  .004  .020  .086  .314  .569
12   0  .540  .282  .069  .014  .002  .000  .000  .000  .000  .000  .000
     1  .341  .377  .206  .071  .017  .003  .000  .000  .000  .000  .000
     2  .099  .230  .283  .168  .064  .016  .003  .000  .000  .000  .000
     3  .017  .085  .236  .240  .142  .054  .012  .002  .000  .000  .000
     4  .002  .021  .133  .231  .213  .121  .042  .008  .001  .000  .000
     5  .000  .004  .053  .159  .227  .193  .101  .030  .003  .000  .000
     6  .000  .001  .016  .079  .177  .226  .177  .079  .016  .001  .000
     7  .000  .000  .003  .029  .101  .193  .227  .159  .053  .004  .000
     8  .000  .000  .001  .008  .042  .121  .213  .231  .133  .021  .002
     9  .000  .000  .000  .001  .013  .054  .142  .240  .236  .085  .017
    10  .000  .000  .000  .000  .003  .016  .064  .168  .283  .230  .099
    11  .000  .000  .000  .000  .000  .003  .017  .071  .206  .377  .341
    12  .000  .000  .000  .000  .000  .000  .002  .014  .069  .282  .540
13   0  .513  .254  .055  .010  .001  .000  .000  .000  .000  .000  .000
     1  .351  .367  .179  .054  .011  .002  .000  .000  .000  .000  .000
     2  .111  .245  .268  .139  .045  .010  .001  .000  .000  .000  .000
     3  .021  .100  .246  .218  .111  .035  .007  .001  .000  .000  .000
     4  .003  .028  .154  .234  .185  .087  .024  .003  .000  .000  .000
     5  .000  .006  .069  .180  .221  .157  .066  .014  .001  .000  .000
     6  .000  .001  .023  .103  .197  .210  .131  .044  .006  .000  .000
     7  .000  .000  .006  .044  .131  .210  .197  .103  .023  .001  .000
     8  .000  .000  .001  .014  .066  .157  .221  .180  .069  .006  .000
     9  .000  .000  .000  .003  .024  .087  .184  .234  .154  .028  .003
    10  .000  .000  .000  .001  .007  .035  .111  .218  .246  .100  .021
    11  .000  .000  .000  .000  .001  .010  .045  .139  .268  .245  .111
    12  .000  .000  .000  .000  .000  .002  .011  .054  .179  .367  .351
    13  .000  .000  .000  .000  .000  .000  .001  .010  .055  .254  .513
14   0  .488  .229  .044  .007  .001  .000  .000  .000  .000  .000  .000
     1  .359  .356  .154  .041  .007  .001  .000  .000  .000  .000  .000
     2  .123  .257  .250  .113  .032  .006  .001  .000  .000  .000  .000
     3  .026  .114  .250  .194  .085  .022  .003  .000  .000  .000  .000
     4  .004  .035  .172  .229  .155  .061  .014  .001  .000  .000  .000
     5  .000  .008  .086  .196  .207  .122  .041  .007  .000  .000  .000
     6  .000  .001  .032  .126  .207  .183  .092  .023  .002  .000  .000
     7  .000  .000  .009  .062  .157  .210  .157  .062  .010  .000  .000
     8  .000  .000  .002  .023  .092  .183  .207  .126  .032  .001  .000
     9  .000  .000  .000  .007  .041  .122  .207  .196  .086  .008  .000
    10  .000  .000  .000  .001  .014  .061  .155  .229  .172  .035  .004
    11  .000  .000  .000  .000  .003  .022  .085  .194  .250  .114  .026
    12  .000  .000  .000  .000  .001  .006  .032  .113  .250  .257  .123
    13  .000  .000  .000  .000  .000  .001  .007  .041  .154  .356  .359
    14  .000  .000  .000  .000  .000  .000  .001  .007  .044  .229  .488
15   0  .463  .206  .035  .005  .001  .000  .000  .000  .000  .000  .000
     1  .366  .343  .132  .031  .005  .001  .000  .000  .000  .000  .000
     2  .135  .267  .231  .092  .022  .003  .000  .000  .000  .000  .000
     3  .031  .129  .250  .170  .063  .014  .002  .000  .000  .000  .000
     4  .005  .043  .188  .219  .127  .042  .007  .001  .000  .000  .000
     5  .001  .011  .103  .206  .186  .092  .025  .003  .000  .000  .000
     6  .000  .002  .043  .147  .207  .153  .061  .012  .001  .000  .000
     7  .000  .000  .014  .081  .177  .196  .118  .035  .004  .000  .000
     8  .000  .000  .004  .035  .118  .196  .177  .081  .014  .000  .000
     9  .000  .000  .001  .012  .061  .153  .207  .147  .043  .002  .000
    10  .000  .000  .000  .003  .025  .092  .186  .206  .103  .011  .001
    11  .000  .000  .000  .001  .007  .042  .127  .219  .188  .043  .005
    12  .000  .000  .000  .000  .002  .014  .063  .170  .250  .129  .031
    13  .000  .000  .000  .000  .000  .003  .022  .092  .231  .267  .135
    14  .000  .000  .000  .000  .000  .001  .005  .031  .132  .343  .366
    15  .000  .000  .000  .000  .000  .000  .001  .005  .035  .206  .463
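The book generates binomial probabilities with MINITAB and JMP (Chapter 11); the tabulated formula is also easy to spot-check directly. The sketch below uses only the Python standard library, which is an assumption on my part — Python is not part of the book's toolset.

```python
# Spot-check of Table I; requires Python 3.8+ for math.comb.
from math import comb

def binomial_pmf(n: int, x: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p): C(n, x) * p**x * (1 - p)**(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Reproduce the n = 5, p = .20 block of Table I:
for x in range(6):
    print(x, round(binomial_pmf(5, x, 0.20), 3))
# -> 0.328, 0.41, 0.205, 0.051, 0.006, 0.0, matching the .328/.410/.205/.051/.006/.000 column.
```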

Table II   Poisson probabilities.

Tabulated values are P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}.

                              λ
  x   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0
  0  .905  .819  .741  .670  .607  .549  .497  .449  .407  .368
  1  .091  .164  .222  .268  .303  .329  .348  .360  .366  .368
  2  .005  .016  .033  .054  .076  .099  .122  .144  .165  .184
  3  .000  .001  .003  .007  .013  .020  .028  .038  .049  .061
  4  .000  .000  .000  .000  .002  .003  .005  .008  .011  .015
  5  .000  .000  .000  .000  .000  .000  .001  .001  .002  .003
  6  .000  .000  .000  .000  .000  .000  .000  .000  .000  .001
  7  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000

                              λ
  x   1.1   1.2   1.3   1.4   1.5   1.6   1.7   1.8   1.9   2.0
  0  .333  .301  .273  .247  .223  .202  .183  .165  .150  .135
  1  .366  .361  .354  .345  .335  .323  .311  .298  .284  .271
  2  .201  .217  .230  .242  .251  .258  .264  .268  .270  .271
  3  .074  .087  .100  .113  .126  .138  .150  .161  .171  .180
  4  .020  .026  .032  .040  .047  .055  .064  .072  .081  .090
  5  .005  .006  .008  .011  .014  .018  .022  .026  .031  .036
  6  .001  .001  .002  .003  .004  .005  .006  .008  .010  .012
  7  .000  .000  .000  .001  .001  .001  .002  .002  .003  .003
  8  .000  .000  .000  .000  .000  .000  .000  .001  .001  .001
  9  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000

                              λ
  x   2.1   2.2   2.3   2.4   2.5   2.6   2.7   2.8   2.9   3.0
  0  .123  .111  .100  .091  .082  .074  .067  .061  .055  .050
  1  .257  .244  .231  .218  .205  .193  .182  .170  .160  .149
  2  .270  .268  .265  .261  .257  .251  .245  .238  .231  .224
  3  .189  .197  .203  .209  .214  .218  .221  .223  .224  .224
  4  .099  .108  .117  .125  .134  .141  .149  .156  .162  .168
  5  .042  .048  .054  .060  .067  .074  .080  .087  .094  .101
  6  .015  .017  .021  .024  .028  .032  .036  .041  .046  .050
  7  .004  .006  .007  .008  .010  .012  .014  .016  .019  .022
  8  .001  .002  .002  .003  .003  .004  .005  .006  .007  .008
  9  .000  .000  .001  .001  .001  .001  .001  .002  .002  .003
 10  .000  .000  .000  .000  .000  .000  .000  .001  .001  .001
 11  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
 12  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000

                              λ
  x   4.1   4.2   4.3   4.4   4.5   4.6   4.7   4.8   4.9   5.0
  0  .017  .015  .014  .012  .011  .010  .009  .008  .007  .007
  1  .068  .063  .058  .054  .050  .046  .043  .040  .037  .034
  2  .139  .132  .125  .119  .113  .106  .101  .095  .089  .084
  3  .190  .185  .180  .174  .169  .163  .157  .152  .146  .140
  4  .195  .194  .193  .192  .190  .188  .185  .182  .179  .176
  5  .160  .163  .166  .169  .171  .173  .174  .175  .175  .176
  6  .109  .114  .119  .124  .128  .132  .136  .140  .143  .146
  7  .064  .069  .073  .078  .082  .087  .091  .096  .100  .104
  8  .033  .036  .039  .043  .046  .050  .054  .058  .061  .065
  9  .015  .017  .019  .021  .023  .026  .028  .031  .033  .036
 10  .006  .007  .008  .009  .010  .012  .013  .015  .016  .018
 11  .002  .003  .003  .004  .004  .005  .006  .006  .007  .008
 12  .001  .001  .001  .001  .002  .002  .002  .003  .003  .003
 13  .000  .000  .000  .001  .001  .001  .001  .001  .001  .001
 14  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
 15  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000

                              λ
  x   5.1   5.2   5.3   5.4   5.5   5.6   5.7   5.8   5.9   6.0
  0  .006  .006  .005  .005  .004  .004  .003  .003  .003  .002
  1  .031  .029  .027  .024  .022  .021  .019  .018  .016  .015
  2  .079  .075  .070  .066  .062  .058  .054  .051  .048  .045
  3  .135  .129  .124  .119  .113  .108  .103  .099  .094  .089
  4  .172  .168  .164  .160  .156  .152  .147  .143  .138  .134
  5  .175  .175  .174  .173  .171  .170  .168  .166  .163  .161
  6  .149  .151  .154  .156  .157  .158  .159  .160  .161  .161
  7  .109  .113  .116  .120  .123  .127  .130  .133  .135  .138
  8  .069  .073  .077  .081  .085  .089  .093  .096  .100  .103
  9  .039  .042  .045  .049  .052  .055  .059  .062  .065  .069
 10  .020  .022  .024  .026  .029  .031  .033  .036  .039  .041
 11  .009  .010  .012  .013  .014  .016  .017  .019  .021  .023
 12  .004  .005  .005  .006  .007  .007  .008  .009  .010  .011
 13  .002  .002  .002  .002  .003  .003  .004  .004  .005  .005
 14  .001  .001  .001  .001  .001  .001  .002  .002  .002  .002
 15  .000  .000  .000  .000  .000  .000  .001  .001  .001  .001
 16  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
 17  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000

                              λ
  x   6.1   6.2   6.3   6.4   6.5   6.6   6.7   6.8   6.9   7.0
  0  .002  .002  .002  .002  .002  .001  .001  .001  .001  .001
  1  .014  .013  .012  .011  .010  .010  .008  .008  .007  .007
  2  .042  .040  .036  .034  .032  .029  .028  .026  .024  .022
  3  .085  .081  .077  .073  .069  .065  .062  .058  .055  .052
  4  .129  .125  .121  .116  .112  .108  .103  .099  .095  .091
  5  .158  .155  .152  .149  .145  .142  .139  .135  .131  .128
  6  .160  .160  .159  .159  .158  .156  .155  .153  .151  .149
  7  .140  .142  .144  .145  .146  .147  .148  .149  .149  .149
  8  .107  .110  .113  .116  .119  .122  .124  .126  .128  .130
  9  .072  .076  .079  .083  .086  .089  .092  .095  .098  .101
 10  .044  .047  .050  .053  .056  .059  .062  .065  .068  .071
 11  .024  .026  .029  .031  .033  .035  .038  .040  .043  .045
 12  .012  .014  .015  .016  .018  .019  .021  .023  .025  .026
 13  .006  .007  .007  .008  .009  .010  .011  .012  .013  .014
 14  .003  .003  .003  .004  .004  .005  .005  .006  .006  .007
 15  .001  .001  .001  .002  .002  .002  .002  .003  .003  .003
 16  .000  .001  .001  .001  .001  .001  .001  .001  .001  .001
 17  .000  .000  .000  .000  .000  .000  .000  .000  .001  .001
 18  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
 19  .000  .000  .000  .000  .000  .000  .000  .000  .000  .000
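As with Table I, the Poisson entries can be reproduced directly from the formula above. The sketch below again assumes the Python standard library rather than the book's MINITAB/JMP workflow.

```python
# Spot-check of Table II using only the Python standard library.
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam): e**(-lam) * lam**x / x!."""
    return exp(-lam) * lam**x / factorial(x)

# Reproduce the lambda = 0.5 column of Table II:
for x in range(8):
    print(x, round(poisson_pmf(x, 0.5), 3))
# -> 0.607, 0.303, 0.076, 0.013, 0.002, 0.0, 0.0, 0.0, matching the table.
```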

Table III   Standard normal distribution.

Tabulated values are P(0 ≤ Z ≤ z), the shaded area under the standard normal curve between 0 and z.

  z    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09
 0.0  .0000  .0040  .0080  .0120  .0160  .0199  .0239  .0279  .0319  .0359
 0.1  .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
 0.2  .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
 0.3  .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
 0.4  .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879
 0.5  .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
 0.6  .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
 0.7  .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
 0.8  .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
 0.9  .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389
 1.0  .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
 1.1  .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
 1.2  .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
 1.3  .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
 1.4  .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319
 1.5  .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
 1.6  .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
 1.7  .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
 1.8  .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
 1.9  .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767
 2.0  .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
 2.1  .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
 2.2  .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
 2.3  .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
 2.4  .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936
 2.5  .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
 2.6  .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
 2.7  .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
 2.8  .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
 2.9  .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986
 3.0  .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990

For negative values of z the probabilities are found by using the symmetric property.
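A cross-check of the quantity tabulated in Table III can be done with Python's standard library (an assumption here — the book itself uses MINITAB and JMP; statistics.NormalDist requires Python 3.8+).

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_iii(z: float) -> float:
    """P(0 <= Z <= z), the area tabulated in Table III."""
    return Z.cdf(z) - 0.5

print(round(table_iii(1.96), 4))  # 0.475, the familiar two-sided 5% point
# Negative z via the symmetric property: P(-z <= Z <= 0) = P(0 <= Z <= z)
print(round(table_iii(1.00), 4))  # 0.3413, matching the z = 1.0, .00 entry
```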

Table IV   Critical values of χ² with ν degrees of freedom.

Tabulated values are χ²(ν, α), the point with area α in the right tail.

  ν    χ².995   χ².990   χ².975   χ².950   χ².900   χ².100   χ².050   χ².025   χ².010   χ².005
  1   0.00004  0.00016  0.00098  0.00393  0.01589   2.7055   3.8415   5.0239   6.6349   7.8794
  2    0.0100   0.0201   0.0506   0.1026   0.2107   4.6052   5.9915   7.3778   9.2103  10.5966
  3    0.0717   0.1148   0.2158   0.3518   0.5844   6.2514   7.8147   9.3484  11.3449  12.8381
  4    0.2070   0.2971   0.4844   0.7107   1.0636   7.7794   9.4877  11.1433  13.2767  14.8602
  5    0.4117   0.5543   0.8312   1.1455   1.6103   9.2364  11.0705  12.8325  15.0863  16.7496
  6    0.6757   0.8720   1.2373   1.6354   2.2041  10.6446  12.5916  14.4494  16.8119  18.5476
  7    0.9893   1.2390   1.6899   2.1674   2.8331  12.0170  14.0671  16.0128  18.4753  20.2777
  8    1.3444   1.6465   2.1797   2.7326   3.4895  13.3616  15.5073  17.5346  20.0902  21.9550
  9    1.7349   2.0879   2.7004   3.3251   4.1682  14.6837  16.9190  19.0228  21.6660  23.5893
 10    2.1559   2.5582   3.2470   3.9403   4.8652  15.9871  18.3070  20.4831  23.2093  25.1882
 11    2.6032   3.0535   3.8158   4.5748   5.5778  17.2750  19.6751  21.9200  24.7250  26.7569
 12    3.0738   3.5706   4.4038   5.2260   6.3038  18.5494  21.0261  23.3367  26.2170  28.2995
 13    3.5650   4.1069   5.0087   5.8919   7.0415  19.8119  22.3621  24.7356  27.6883  29.8194
 14    4.0747   4.6604   5.6287   6.5706   7.7895  21.0642  23.6848  26.1190  29.1413  31.3193
 15    4.6009   5.2294   6.2621   7.2609   8.5468  22.3072  24.9958  27.4884  30.5779  32.8013
 16    5.1422   5.8122   6.9077   7.9616   9.3122  23.5418  26.2962  28.8454  31.9999  34.2672
 17    5.6972   6.4078   7.5642   8.6718  10.085   24.7690  27.5871  30.1910  33.4087  35.7185
 18    6.2648   7.0149   8.2308   9.3905  10.865   25.9894  28.8693  31.5264  34.8053  37.1564
 19    6.8440   7.6327   8.9066  10.1170  11.6509  27.2036  30.1435  32.8523  36.1908  38.5822
 20    7.4339   8.2604   9.5908  10.8508  12.4426  28.4120  31.4104  34.1696  37.5662  39.9968
 21    8.0337   8.8972  10.2829  11.5913  13.2396  29.6151  32.6705  35.4789  38.9321  41.4010
 22    8.6427   9.5425  10.9823  12.3380  14.0415  30.8133  33.9244  36.7807  40.2894  42.7956
 23    9.2604  10.1957  11.6885  13.0905  14.8479  32.0069  35.1725  38.0757  41.6384  44.1813
 24    9.8862  10.8564  12.4011  13.8484  15.6587  33.1963  36.4151  39.3641  42.9798  45.5585
 25   10.5197  11.5240  13.1197  14.6114  16.4734  34.3816  37.6525  40.6465  44.3141  46.9278
 26   11.1603  12.1981  13.8439  15.3791  17.2919  35.5631  38.8852  41.9232  45.6417  48.2899
 27   11.8076  12.8786  14.5733  16.1513  18.1138  36.7412  40.1133  43.1944  46.9630  49.6449
 28   12.4613  13.5648  15.3079  16.9279  18.9392  37.9159  41.3372  44.4607  48.2782  50.9933
 29   13.1211  14.2565  16.0471  17.7083  19.7677  39.0875  42.5569  45.7222  49.5879  52.3356
 30   13.7867  14.9535  16.7908  18.4926  20.5992  40.2560  43.7729  46.9792  50.8922  53.6720
 40   20.7065  22.1643  24.4331  26.5093  29.0505  51.8050  55.7585  59.3417  63.6907  66.7659
 50   27.9907  29.7067  32.3574  34.7642  37.6886  63.1671  67.5048  71.4202  76.1539  79.4900
 60   35.5346  37.4848  40.4817  43.1879  46.4589  74.3970  79.0819  83.2976  88.3794  91.9517
 70   43.2752  45.4418  48.7576  51.7393  55.3290  85.5271  90.5312  95.0231 100.4251 104.2148
 80   51.1720  53.5400  57.1532  60.3915  64.2778  96.5782 101.8795 106.6285 112.3288 116.3210
 90   59.1963  61.7541  65.6466  69.1260  73.2912 107.5650 113.1452 118.1360 124.1162 128.2290
100   67.3276  70.0648  74.2219  77.9295  82.3581 118.4980 124.3421 129.5613 135.8070 140.1697
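Table IV's entries are upper-tail critical points. A minimal sketch for cross-checking them, assuming SciPy is available (SciPy is not part of the book's toolset):

```python
from scipy.stats import chi2  # assumption: SciPy is installed

# chi2.ppf takes the LEFT-tail area, so the alpha = 0.050 column is ppf(0.95):
print(round(chi2.ppf(0.95, df=10), 4))   # 18.307, the nu = 10, 0.050 entry
# The 0.995 column is the 0.005 left-tail point:
print(round(chi2.ppf(0.005, df=10), 4))  # 2.1559, the nu = 10, 0.995 entry
```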

Table V   Critical values of t with ν degrees of freedom.

Tabulated values are t(ν, α), the point with area α in the right tail.

  ν    t.100   t.050   t.025   t.010   t.005    t.0005
  1   3.078   6.314  12.706  31.821  63.657   636.619
  2   1.886   2.920   4.303   6.965   9.925    31.599
  3   1.638   2.353   3.182   4.541   5.841    12.924
  4   1.533   2.132   2.776   3.747   4.604     8.610
  5   1.476   2.015   2.571   3.365   4.032     6.869
  6   1.440   1.943   2.447   3.143   3.707     5.959
  7   1.415   1.895   2.365   2.998   3.499     5.408
  8   1.397   1.860   2.306   2.896   3.355     5.041
  9   1.383   1.833   2.262   2.821   3.250     4.781
 10   1.372   1.812   2.228   2.764   3.169     4.587
 11   1.363   1.796   2.201   2.718   3.106     4.437
 12   1.356   1.782   2.179   2.681   3.055     4.318
 13   1.350   1.771   2.160   2.650   3.012     4.221
 14   1.345   1.761   2.145   2.624   2.977     4.140
 15   1.341   1.753   2.131   2.602   2.947     4.073
 16   1.337   1.746   2.120   2.583   2.921     4.015
 17   1.333   1.740   2.110   2.567   2.898     3.965
 18   1.330   1.734   2.101   2.552   2.878     3.922
 19   1.328   1.729   2.093   2.539   2.861     3.883
 20   1.325   1.725   2.086   2.528   2.845     3.850
 21   1.323   1.721   2.080   2.518   2.831     3.819
 22   1.321   1.717   2.074   2.508   2.819     3.792
 23   1.319   1.714   2.069   2.500   2.807     3.768
 24   1.318   1.711   2.064   2.492   2.797     3.745
 25   1.316   1.708   2.060   2.485   2.787     3.725
 26   1.315   1.706   2.056   2.479   2.779     3.707
 27   1.314   1.703   2.052   2.473   2.771     3.690
 28   1.313   1.701   2.048   2.467   2.763     3.674
 29   1.311   1.699   2.045   2.462   2.756     3.659
 30   1.310   1.697   2.042   2.457   2.750     3.646
 40   1.303   1.684   2.021   2.423   2.704     3.551
 60   1.296   1.671   2.000   2.390   2.660     3.460
 80   1.292   1.664   1.990   2.374   2.639     3.416
100   1.290   1.660   1.984   2.364   2.626     3.390
120   1.289   1.658   1.980   2.358   2.617     3.373
  ∞   1.282   1.645   1.960   2.326   2.576     3.291

Critical points of t for lower tail areas are found by using the symmetric property.

Table VI   Critical values of F with numerator and denominator degrees of freedom ν1, ν2, respectively (α = 0.10).

 ν2\ν1      1      2      3      4      5      6      7      8      9     10
   1    39.86  49.50  53.59  55.83  57.24  58.20  58.91  59.44  59.86  60.19
   2     8.53   9.00   9.16   9.24   9.29   9.33   9.35   9.37   9.38   9.39
   3     5.54   5.46   5.39   5.34   5.31   5.28   5.27   5.25   5.24   5.23
   4     4.54   4.32   4.19   4.11   4.05   4.01   3.98   3.95   3.94   3.92
   5     4.06   3.78   3.62   3.52   3.45   3.40   3.37   3.34   3.32   3.30
   6     3.78   3.46   3.29   3.18   3.11   3.05   3.01   2.98   2.96   2.94
   7     3.59   3.26   3.07   2.96   2.88   2.83   2.78   2.75   2.72   2.70
   8     3.46   3.11   2.92   2.81   2.73   2.67   2.62   2.59   2.56   2.54
   9     3.36   3.01   2.81   2.69   2.61   2.55   2.51   2.47   2.44   2.42
  10     3.29   2.92   2.73   2.61   2.52   2.46   2.41   2.38   2.35   2.32
  11     3.23   2.86   2.66   2.54   2.45   2.39   2.34   2.30   2.27   2.25
  12     3.18   2.81   2.61   2.48   2.39   2.33   2.28   2.24   2.21   2.19
  13     3.14   2.76   2.56   2.43   2.35   2.28   2.23   2.20   2.16   2.14
  14     3.10   2.73   2.52   2.39   2.31   2.24   2.19   2.15   2.12   2.10
  15     3.07   2.70   2.49   2.36   2.27   2.21   2.16   2.12   2.09   2.06
  16     3.05   2.67   2.46   2.33   2.24   2.18   2.13   2.09   2.06   2.03
  17     3.03   2.64   2.44   2.31   2.22   2.15   2.10   2.06   2.03   2.00
  18     3.01   2.62   2.42   2.29   2.20   2.13   2.08   2.04   2.00   1.98
  19     2.99   2.61   2.40   2.27   2.18   2.11   2.06   2.02   1.98   1.96
  20     2.97   2.59   2.38   2.25   2.16   2.09   2.04   2.00   1.96   1.94
  25     2.92   2.53   2.32   2.18   2.09   2.02   1.97   1.93   1.89   1.87
  30     2.88   2.49   2.28   2.14   2.05   1.98   1.93   1.88   1.85   1.82
  35     2.85   2.46   2.25   2.11   2.02   1.95   1.90   1.85   1.82   1.79
  40     2.84   2.44   2.23   2.09   2.00   1.93   1.87   1.83   1.79   1.76
  45     2.82   2.42   2.21   2.07   1.98   1.91   1.85   1.81   1.77   1.74
  50     2.81   2.41   2.20   2.06   1.97   1.90   1.84   1.80   1.76   1.73
  60     2.79   2.39   2.18   2.04   1.95   1.87   1.82   1.77   1.74   1.71
  70     2.78   2.38   2.16   2.03   1.93   1.86   1.80   1.76   1.72   1.69
  80     2.77   2.37   2.15   2.02   1.92   1.85   1.79   1.75   1.71   1.68
  90     2.76   2.36   2.15   2.01   1.91   1.84   1.78   1.74   1.70   1.67
 100     2.76   2.36   2.14   2.00   1.91   1.83   1.78   1.73   1.69   1.66
   ∞     2.71   2.30   2.08   1.94   1.85   1.77   1.72   1.67   1.63   1.60

 ν2\ν1     11     12     13     14     15     20     25     30     40     50     75    100      ∞
   1    60.47  60.71  60.90  61.07  61.22  61.74  62.05  62.26  62.53  62.69  62.90  63.01  63.33
   2     9.40   9.41   9.41   9.42   9.42   9.44   9.45   9.46   9.47   9.47   9.48   9.48   9.49
   3     5.22   5.22   5.21   5.20   5.20   5.18   5.17   5.17   5.16   5.15   5.15   5.14   5.13
   4     3.91   3.90   3.89   3.88   3.87   3.84   3.83   3.82   3.80   3.80   3.78   3.78   3.76
   5     3.28   3.27   3.26   3.25   3.24   3.21   3.19   3.17   3.16   3.15   3.13   3.13   3.11
   6     2.92   2.90   2.89   2.88   2.87   2.84   2.81   2.80   2.78   2.77   2.75   2.75   2.72
   7     2.68   2.67   2.65   2.64   2.63   2.59   2.57   2.56   2.54   2.52   2.51   2.50   2.47
   8     2.52   2.50   2.49   2.48   2.46   2.42   2.40   2.38   2.36   2.35   2.33   2.32   2.30
   9     2.40   2.38   2.36   2.35   2.34   2.30   2.27   2.25   2.23   2.22   2.20   2.19   2.16
  10     2.30   2.28   2.27   2.26   2.24   2.20   2.17   2.16   2.13   2.12   2.10   2.09   2.06
  11     2.23   2.21   2.19   2.18   2.17   2.12   2.10   2.08   2.05   2.04   2.02   2.01   1.97
  12     2.17   2.15   2.13   2.12   2.10   2.06   2.03   2.01   1.99   1.97   1.95   1.94   1.90
  13     2.12   2.10   2.08   2.07   2.05   2.01   1.98   1.96   1.93   1.92   1.89   1.88   1.85
  14     2.07   2.05   2.04   2.02   2.01   1.96   1.93   1.91   1.89   1.87   1.85   1.83   1.80
  15     2.04   2.02   2.00   1.99   1.97   1.92   1.89   1.87   1.85   1.83   1.80   1.79   1.76
  16     2.01   1.99   1.97   1.95   1.94   1.89   1.86   1.84   1.81   1.79   1.77   1.76   1.72
  17     1.98   1.96   1.94   1.93   1.91   1.86   1.83   1.81   1.78   1.76   1.74   1.73   1.69
  18     1.95   1.93   1.92   1.90   1.89   1.84   1.80   1.78   1.75   1.74   1.71   1.70   1.66
  19     1.93   1.91   1.89   1.88   1.86   1.81   1.78   1.76   1.73   1.71   1.69   1.67   1.63
  20     1.91   1.89   1.87   1.86   1.84   1.79   1.76   1.74   1.71   1.69   1.66   1.65   1.61
  25     1.84   1.82   1.80   1.79   1.77   1.72   1.68   1.66   1.63   1.61   1.58   1.56   1.52
  30     1.79   1.77   1.75   1.74   1.72   1.67   1.63   1.61   1.57   1.55   1.52   1.51   1.46
  35     1.76   1.74   1.72   1.70   1.69   1.63   1.60   1.57   1.53   1.51   1.48   1.47   1.41
  40     1.74   1.71   1.70   1.68   1.66   1.61   1.57   1.54   1.51   1.48   1.45   1.43   1.38
  45     1.72   1.70   1.68   1.66   1.64   1.58   1.55   1.52   1.48   1.46   1.43   1.41   1.35
  50     1.70   1.68   1.66   1.64   1.63   1.57   1.53   1.50   1.46   1.44   1.41   1.39   1.33
  60     1.68   1.66   1.64   1.62   1.60   1.54   1.50   1.48   1.44   1.41   1.38   1.36   1.29
  70     1.66   1.64   1.62   1.60   1.59   1.53   1.49   1.46   1.42   1.39   1.36   1.34   1.27
  80     1.65   1.63   1.61   1.59   1.57   1.51   1.47   1.44   1.40   1.38   1.34   1.32   1.24
  90     1.64   1.62   1.60   1.58   1.56   1.50   1.46   1.43   1.39   1.36   1.33   1.30   1.23
 100     1.64   1.61   1.59   1.57   1.56   1.49   1.45   1.42   1.38   1.35   1.32   1.29   1.21
   ∞     1.57   1.55   1.52   1.50   1.49   1.42   1.38   1.34   1.30   1.26   1.21   1.18   1.00

To find the critical value of F when α is under the lower tail, denoted by F(ν1, ν2, 1−α), use F(ν1, ν2, 1−α) = 1 / F(ν2, ν1, α). Example: F(ν1, ν2, .90) = 1 / F(ν2, ν1, .10).

Table VI   Critical values of F with numerator and denominator degrees of freedom ν1, ν2, respectively (α = 0.05).

 ν2\ν1      1      2      3      4      5      6      7      8      9     10
   1   161.45 199.50 215.71 224.58 230.16 233.99 236.77 238.88 240.54 241.88
   2    18.51  19.00  19.16  19.25  19.30  19.33  19.35  19.37  19.38  19.40
   3    10.13   9.55   9.28   9.12   9.01   8.94   8.89   8.85   8.81   8.79
   4     7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96
   5     6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.77   4.74
   6     5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06
   7     5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.64
   8     5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.35
   9     5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.14
  10     4.97   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.98
  11     4.84   3.98   3.59   3.36   3.20   3.09   3.01   2.95   2.90   2.85
  12     4.75   3.89   3.49   3.26   3.11   3.00   2.91   2.85   2.80   2.75
  13     4.67   3.81   3.41   3.18   3.03   2.92   2.83   2.77   2.71   2.67
  14     4.60   3.74   3.34   3.11   2.96   2.85   2.76   2.70   2.65   2.60
  15     4.54   3.68   3.29   3.06   2.90   2.79   2.71   2.64   2.59   2.54
  16     4.49   3.63   3.24   3.01   2.85   2.74   2.66   2.59   2.54   2.49
  17     4.45   3.59   3.20   2.96   2.81   2.70   2.61   2.55   2.49   2.45
  18     4.41   3.55   3.16   2.93   2.77   2.66   2.58   2.51   2.46   2.41
  19     4.38   3.52   3.13   2.90   2.74   2.63   2.54   2.48   2.42   2.38
  20     4.35   3.49   3.10   2.87   2.71   2.60   2.51   2.45   2.39   2.35
  25     4.24   3.39   2.99   2.76   2.60   2.49   2.40   2.34   2.28   2.24
  30     4.17   3.32   2.92   2.69   2.53   2.42   2.33   2.27   2.21   2.16
  35     4.12   3.27   2.87   2.64   2.49   2.37   2.29   2.22   2.16   2.11
  40     4.09   3.23   2.84   2.61   2.45   2.34   2.25   2.18   2.12   2.08
  45     4.06   3.20   2.81   2.58   2.42   2.31   2.22   2.15   2.10   2.05
  50     4.03   3.18   2.79   2.56   2.40   2.29   2.20   2.13   2.07   2.03
  60     4.00   3.15   2.76   2.53   2.37   2.25   2.17   2.10   2.04   1.99
  70     3.98   3.13   2.74   2.50   2.35   2.23   2.14   2.07   2.02   1.97
  80     3.96   3.11   2.72   2.49   2.33   2.21   2.13   2.06   2.00   1.95
  90     3.95   3.10   2.71   2.47   2.32   2.20   2.11   2.04   1.99   1.94
 100     3.94   3.09   2.70   2.46   2.31   2.19   2.10   2.03   1.97   1.93
   ∞     3.84   3.00   2.60   2.37   2.21   2.10   2.01   1.94   1.88   1.83

 ν2\ν1     11     12     13     14     15     20     25     30     40     50     75    100      ∞
   1    243.0  243.9  244.7  245.4  246.0  248.0  249.3  250.1  251.1  251.8  252.6  253.0  254.3
   2    19.40  19.41  19.42  19.42  19.43  19.45  19.46  19.46  19.47  19.48  19.48  19.49  19.50
   3     8.76   8.74   8.73   8.71   8.70   8.66   8.63   8.62   8.59   8.58   8.56   8.55   8.53
   4     5.94   5.91   5.89   5.87   5.86   5.80   5.77   5.75   5.72   5.70   5.68   5.66   5.63
   5     4.70   4.68   4.66   4.64   4.62   4.56   4.52   4.50   4.46   4.44   4.42   4.41   4.37
   6     4.03   4.00   3.98   3.96   3.94   3.87   3.83   3.81   3.77   3.75   3.73   3.71   3.67
   7     3.60   3.57   3.55   3.53   3.51   3.44   3.40   3.38   3.34   3.32   3.29   3.27   3.23
   8     3.31   3.28   3.26   3.24   3.22   3.15   3.11   3.08   3.04   3.02   2.99   2.97   2.93
   9     3.10   3.07   3.05   3.03   3.01   2.94   2.89   2.86   2.83   2.80   2.77   2.76   2.71
  10     2.94   2.91   2.89   2.86   2.85   2.77   2.73   2.70   2.66   2.64   2.60   2.59   2.54
  11     2.82   2.79   2.76   2.74   2.72   2.65   2.60   2.57   2.53   2.51   2.47   2.46   2.40
  12     2.72   2.69   2.66   2.64   2.62   2.54   2.50   2.47   2.43   2.40   2.37   2.35   2.30
  13     2.63   2.60   2.58   2.55   2.53   2.46   2.41   2.38   2.34   2.31   2.28   2.26   2.21
  14     2.57   2.53   2.51   2.48   2.46   2.39   2.34   2.31   2.27   2.24   2.21   2.19   2.13
  15     2.51   2.48   2.45   2.42   2.40   2.33   2.28   2.25   2.20   2.18   2.14   2.12   2.07
  16     2.46   2.42   2.40   2.37   2.35   2.28   2.23   2.19   2.15   2.12   2.09   2.07   2.01
  17     2.41   2.38   2.35   2.33   2.31   2.23   2.18   2.15   2.10   2.08   2.04   2.02   1.96
  18     2.37   2.34   2.31   2.29   2.27   2.19   2.14   2.11   2.06   2.04   2.00   1.98   1.92
  19     2.34   2.31   2.28   2.26   2.23   2.16   2.11   2.07   2.03   2.00   1.96   1.94   1.88
  20     2.31   2.28   2.25   2.22   2.20   2.12   2.07   2.04   1.99   1.97   1.93   1.91   1.84
  25     2.20   2.16   2.14   2.11   2.09   2.01   1.96   1.92   1.87   1.84   1.80   1.78   1.71
  30     2.13   2.09   2.06   2.04   2.01   1.93   1.88   1.84   1.79   1.76   1.72   1.70   1.62
  35     2.07   2.04   2.01   1.99   1.96   1.88   1.82   1.79   1.74   1.70   1.66   1.63   1.56
  40     2.04   2.00   1.97   1.95   1.92   1.84   1.78   1.74   1.69   1.66   1.61   1.59   1.51
  45     2.01   1.97   1.94   1.92   1.89   1.81   1.75   1.71   1.66   1.63   1.58   1.55   1.47
  50     1.99   1.95   1.92   1.89   1.87   1.78   1.73   1.69   1.63   1.60   1.55   1.52   1.44
  60     1.95   1.92   1.89   1.86   1.84   1.75   1.69   1.65   1.59   1.56   1.51   1.48   1.39
  70     1.93   1.89   1.86   1.84   1.81   1.72   1.66   1.62   1.57   1.53   1.48   1.45   1.35
  80     1.91   1.88   1.84   1.82   1.79   1.70   1.64   1.60   1.54   1.51   1.45   1.43   1.32
  90     1.90   1.86   1.83   1.80   1.78   1.69   1.63   1.59   1.53   1.49   1.44   1.41   1.30
 100     1.89   1.85   1.82   1.79   1.77   1.68   1.62   1.57   1.52   1.48   1.42   1.39   1.28
   ∞     1.79   1.75   1.72   1.69   1.67   1.57   1.51   1.46   1.39   1.35   1.28   1.24   1.00

To find the critical value of F when α is under the lower tail, denoted by F(ν1, ν2, 1−α), use F(ν1, ν2, 1−α) = 1 / F(ν2, ν1, α). Example: F(ν1, ν2, .90) = 1 / F(ν2, ν1, .10).

Table VI   Critical values of F with numerator and denominator degrees of freedom ν1, ν2, respectively (α = 0.025).

 ν2\ν1      1      2      3      4      5      6      7      8      9     10
   1   647.79 799.50 864.16 899.58 921.85 937.11 948.22 956.66 963.28 968.63
   2    38.51  39.00  39.17  39.25  39.30  39.33  39.36  39.37  39.39  39.40
   3    17.44  16.04  15.44  15.10  14.88  14.73  14.62  14.54  14.47  14.42
   4    12.22  10.65   9.98   9.60   9.36   9.20   9.07   8.98   8.90   8.84
   5    10.01   8.43   7.76   7.39   7.15   6.98   6.85   6.76   6.68   6.62
   6     8.81   7.26   6.60   6.23   5.99   5.82   5.70   5.60   5.52   5.46
   7     8.07   6.54   5.89   5.52   5.29   5.12   4.99   4.90   4.82   4.76
   8     7.57   6.06   5.42   5.05   4.82   4.65   4.53   4.43   4.36   4.30
   9     7.21   5.71   5.08   4.72   4.48   4.32   4.20   4.10   4.03   3.96
  10     6.94   5.46   4.83   4.47   4.24   4.07   3.95   3.85   3.78   3.72
  11     6.72   5.26   4.63   4.28   4.04   3.88   3.76   3.66   3.59   3.53
  12     6.55   5.10   4.47   4.12   3.89   3.73   3.61   3.51   3.44   3.37
  13     6.41   4.97   4.35   4.00   3.77   3.60   3.48   3.39   3.31   3.25
  14     6.30   4.86   4.24   3.89   3.66   3.50   3.38   3.29   3.21   3.15
  15     6.20   4.77   4.15   3.80   3.58   3.41   3.29   3.20   3.12   3.06
  16     6.12   4.69   4.08   3.73   3.50   3.34   3.22   3.12   3.05   2.99
  17     6.04   4.62   4.01   3.66   3.44   3.28   3.16   3.06   2.98   2.92
  18     5.98   4.56   3.95   3.61   3.38   3.22   3.10   3.01   2.93   2.87
  19     5.92   4.51   3.90   3.56   3.33   3.17   3.05   2.96   2.88   2.82
  20     5.87   4.46   3.86   3.51   3.29   3.13   3.01   2.91   2.84   2.77
  25     5.69   4.29   3.69   3.35   3.13   2.97   2.85   2.75   2.68   2.61
  30     5.57   4.18   3.59   3.25   3.03   2.87   2.75   2.65   2.57   2.51
  35     5.49   4.11   3.52   3.18   2.96   2.80   2.68   2.58   2.50   2.44
  40     5.42   4.05   3.46   3.13   2.90   2.74   2.62   2.53   2.45   2.39
  45     5.38   4.01   3.42   3.09   2.86   2.70   2.58   2.49   2.41   2.35
  50     5.34   3.97   3.39   3.05   2.83   2.67   2.55   2.46   2.38   2.32
  60     5.29   3.93   3.34   3.01   2.79   2.63   2.51   2.41   2.33   2.27
  70     5.25   3.89   3.31   2.97   2.75   2.59   2.47   2.38   2.30   2.24
  80     5.22   3.86   3.28   2.95   2.73   2.57   2.45   2.35   2.28   2.21
  90     5.20   3.84   3.26   2.93   2.71   2.55   2.43   2.34   2.26   2.19
 100     5.18   3.83   3.25   2.92   2.70   2.54   2.42   2.32   2.24   2.18
   ∞     5.02   3.69   3.12   2.79   2.57   2.41   2.29   2.19   2.11   2.05

 ν2\ν1     11     12     13     14     15     20     25     30     40     50     75    100
   1    973.0  976.7  979.8  982.5  984.9  993.1  998.1 1001.4 1005.6 1008.1 1011.5 1013.2
   2    39.41  39.41  39.42  39.43  39.43  39.45  39.46  39.46  39.47  39.48  39.48  39.49
   3    14.37  14.34  14.30  14.28  14.25  14.17  14.12  14.08  14.04  14.01  13.97  13.96
   4     8.79   8.75   8.71   8.68   8.66   8.56   8.50   8.46   8.41   8.38   8.34   8.32
   5     6.57   6.52   6.49   6.46   6.43   6.33   6.27   6.23   6.18   6.14   6.10   6.08
   6     5.41   5.37   5.33   5.30   5.27   5.17   5.11   5.07   5.01   4.98   4.94   4.92
   7     4.71   4.67   4.63   4.60   4.57   4.47   4.40   4.36   4.31   4.28   4.23   4.21
   8     4.24   4.20   4.16   4.13   4.10   4.00   3.94   3.89   3.84   3.81   3.76   3.74
   9     3.91   3.87   3.83   3.80   3.77   3.67   3.60   3.56   3.51   3.47   3.43   3.40
  10     3.66   3.62   3.58   3.55   3.52   3.42   3.35   3.31   3.26   3.22   3.18   3.15
  11     3.47   3.43   3.39   3.36   3.33   3.23   3.16   3.12   3.06   3.03   2.98   2.96
  12     3.32   3.28   3.24   3.21   3.18   3.07   3.01   2.96   2.91   2.87   2.82   2.80
  13     3.20   3.15   3.12   3.08   3.05   2.95   2.88   2.84   2.78   2.74   2.70   2.67
  14     3.09   3.05   3.01   2.98   2.95   2.84   2.78   2.73   2.67   2.64   2.59   2.56
  15     3.01   2.96   2.92   2.89   2.86   2.76   2.69   2.64   2.59   2.55   2.50   2.47
  16     2.93   2.89   2.85   2.82   2.79   2.68   2.61   2.57   2.51   2.47   2.42   2.40
  17     2.87   2.82   2.79   2.75   2.72   2.62   2.55   2.50   2.44   2.41   2.35   2.33
  18     2.81   2.77   2.73   2.70   2.67   2.56   2.49   2.44   2.38   2.35   2.30   2.27
  19     2.76   2.72   2.68   2.65   2.62   2.51   2.44   2.39   2.33   2.30   2.24   2.22
  20     2.72   2.68   2.64   2.60   2.57   2.46   2.40   2.35   2.29   2.25   2.20   2.17
  25     2.56   2.51   2.48   2.44   2.41   2.30   2.23   2.18   2.12   2.08   2.02   2.00
  30     2.46   2.41   2.37   2.34   2.31   2.20   2.12   2.07   2.01   1.97   1.91   1.88
  35     2.39   2.34   2.30   2.27   2.23   2.12   2.05   2.00   1.93   1.89   1.83   1.80
  40     2.33   2.29   2.25   2.21   2.18   2.07   1.99   1.94   1.88   1.83   1.77   1.74
  45     2.29   2.25   2.21   2.17   2.14   2.03   1.95   1.90   1.83   1.79   1.73   1.69
  50     2.26   2.22   2.18   2.14   2.11   1.99   1.92   1.87   1.80   1.75   1.69   1.66
  60     2.22   2.17   2.13   2.09   2.06   1.94   1.87   1.82   1.74   1.70   1.63   1.60
  70     2.18   2.14   2.10   2.06   2.03   1.91   1.83   1.78   1.71   1.66   1.59   1.56
  80     2.16   2.11   2.07   2.03   2.00   1.88   1.81   1.75   1.68   1.63   1.56   1.53
  90     2.14   2.09   2.05   2.02   1.98   1.86   1.79   1.73   1.66   1.61   1.54   1.50
 100     2.12   2.08   2.04   2.00   1.97   1.85   1.77   1.71   1.64   1.59   1.52   1.48
   ∞     1.99   1.94   1.90   1.87   1.83   1.71   1.63   1.57   1.48   1.43   1.34   1.30

To find the critical value of F when α is under the lower tail, denoted by F(ν1, ν2, 1−α), use F(ν1, ν2, 1−α) = 1 / F(ν2, ν1, α). Example: F(ν1, ν2, .90) = 1 / F(ν2, ν1, .10).

Table VI   Critical values of F with numerator and denominator degrees of freedom ν1, ν2, respectively (α = 0.01).

 ν2\ν1      1      2      3      4      5      6      7      8      9     10
   1     4052   5000   5403   5625   5764   5859   5928   5981   6022   6056
   2    98.50  99.00  99.17  99.25  99.30  99.33  99.36  99.37  99.39  99.40
   3    34.12  30.82  29.46  28.71  28.24  27.91  27.67  27.49  27.35  27.23
   4    21.20  18.00  16.69  15.98  15.52  15.21  14.98  14.80  14.66  14.55
   5    16.26  13.27  12.06  11.39  10.97  10.67  10.46  10.29  10.16  10.05
   6    13.75  10.92   9.78   9.15   8.75   8.47   8.26   8.10   7.98   7.87
   7    12.25   9.55   8.45   7.85   7.46   7.19   6.99   6.84   6.72   6.62
   8    11.26   8.65   7.59   7.01   6.63   6.37   6.18   6.03   5.91   5.81
   9    10.56   8.02   6.99   6.42   6.06   5.80   5.61   5.47   5.35   5.26
  10    10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
  11     9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
  12     9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
  13     9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
  14     8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
  15     8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
  16     8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
  17     8.40   6.11   5.18   4.67   4.34   4.10   3.93   3.79   3.68   3.59
  18     8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
  19     8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
  20     8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
  25     7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
  30     7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
  35     7.42   5.27   4.40   3.91   3.59   3.37   3.20   3.07   2.96   2.88
  40     7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
  45     7.23   5.11   4.25   3.77   3.45   3.23   3.07   2.94   2.83   2.74
  50     7.17   5.06   4.20   3.72   3.41   3.19   3.02   2.89   2.78   2.70
  60     7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
  70     7.01   4.92   4.07   3.60   3.29   3.07   2.91   2.78   2.67   2.59
  80     6.96   4.88   4.04   3.56   3.26   3.04   2.87   2.74   2.64   2.55
  90     6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
 100     6.90   4.82   3.98   3.51   3.21   2.99   2.82   2.69   2.59   2.50
   ∞     6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32

 ν2\ν1     11     12     13     14     15     20     25     30     40     50     75    100
   1     6056   6106   6130   6140   6157   6209   6240   6261   6287   6303   6320   6334
   2    99.41  99.42  99.42  99.43  99.43  99.45  99.46  99.47  99.47  99.48  99.49  99.49
   3    27.13  27.05  26.98  26.92  26.87  26.69  26.58  26.50  26.41  26.35  26.28  26.24
   4    14.45  14.37  14.31  14.25  14.20  14.02  13.91  13.84  13.75  13.69  13.61  13.58
   5     9.96   9.89   9.82   9.77   9.72   9.55   9.45   9.38   9.29   9.24   9.17   9.13
   6     7.79   7.72   7.66   7.61   7.56   7.40   7.30   7.23   7.14   7.09   7.02   6.99
   7     6.54   6.47   6.41   6.36   6.31   6.16   6.06   5.99   5.91   5.86   5.79   5.75
   8     5.73   5.67   5.61   5.56   5.52   5.36   5.26   5.20   5.12   5.07   5.00   4.96
   9     5.18   5.11   5.05   5.01   4.96   4.81   4.71   4.65   4.57   4.52   4.45   4.41
  10     4.77   4.71   4.65   4.60   4.56   4.41   4.31   4.25   4.17   4.12   4.05   4.01
  11     4.46   4.40   4.34   4.29   4.25   4.10   4.01   3.94   3.86   3.81   3.74   3.71
  12     4.22   4.16   4.10   4.05   4.01   3.86   3.76   3.70   3.62   3.57   3.50   3.47
  13     4.02   3.96   3.91   3.86   3.82   3.66   3.57   3.51   3.43   3.38   3.31   3.27
  14     3.86   3.80   3.75   3.70   3.66   3.51   3.41   3.35   3.27   3.22   3.15   3.11
  15     3.73   3.67   3.61   3.56   3.52   3.37   3.28   3.21   3.13   3.08   3.01   2.98
  16     3.62   3.55   3.50   3.45   3.41   3.26   3.16   3.10   3.02   2.97   2.90   2.86
  17     3.52   3.46   3.40   3.35   3.31   3.16   3.07   3.00   2.92   2.87   2.80   2.76
  18     3.43   3.37   3.32   3.27   3.23   3.08   2.98   2.92   2.84   2.78   2.71   2.68
  19     3.36   3.30   3.24   3.19   3.15   3.00   2.91   2.84   2.76   2.71   2.64   2.60
  20     3.29   3.23   3.18   3.13   3.09   2.94   2.84   2.78   2.69   2.64   2.57   2.54
  25     3.06   2.99   2.94   2.89   2.85   2.70   2.60   2.54   2.45   2.40   2.33   2.29
  30     2.91   2.84   2.79   2.74   2.70   2.55   2.45   2.39   2.30   2.25   2.17   2.13
  35     2.80   2.74   2.69   2.64   2.60   2.44   2.35   2.28   2.19   2.13   2.06   2.02
  40     2.73   2.66   2.61   2.56   2.52   2.37   2.27   2.20   2.11   2.06   1.98   1.94
  45     2.67   2.61   2.55   2.51   2.46   2.31   2.21   2.14   2.05   2.00   1.92   1.88
  50     2.63   2.56   2.51   2.46   2.42   2.27   2.17   2.10   2.01   1.95   1.87   1.82
  60     2.56   2.50   2.44   2.39   2.35   2.20   2.10   2.03   1.94   1.88   1.79   1.75
  70     2.51   2.45   2.40   2.35   2.31   2.15   2.05   1.98   1.87   1.83   1.74   1.70
  80     2.48   2.42   2.36   2.31   2.27   2.12   2.01   1.94   1.85   1.79   1.70   1.65
  90     2.45   2.39   2.33   2.27   2.24   2.09   1.99   1.92   1.82   1.76   1.67   1.62
 100     2.43   2.37   2.31   2.27   2.22   2.07   1.97   1.89   1.80   1.74   1.65   1.60
   ∞     2.25   2.18   2.12   2.08   2.04   1.88   1.77   1.70   1.59   1.52   1.42   1.36

To find the critical value of F when α is under the lower tail, denoted by F(ν1, ν2, 1−α), use F(ν1, ν2, 1−α) = 1 / F(ν2, ν1, α). Example: F(ν1, ν2, .90) = 1 / F(ν2, ν1, .10).
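The reciprocal relation in the footnote is easy to verify numerically. A minimal sketch, again assuming SciPy (not part of the book's toolset):

```python
from scipy.stats import f  # assumption: SciPy is installed

nu1, nu2, alpha = 5, 10, 0.10
upper = f.ppf(1 - alpha, dfn=nu1, dfd=nu2)       # F(5, 10, .10)
print(round(upper, 2))                           # 2.52, as in the alpha = 0.10 table
# Lower-tail value via F(nu1, nu2, 1 - alpha) = 1 / F(nu2, nu1, alpha):
print(round(1 / f.ppf(1 - alpha, dfn=nu2, dfd=nu1), 4))
print(round(f.ppf(alpha, dfn=nu1, dfd=nu2), 4))  # same number, computed directly
```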

Index

A
absolute probability

89

acceptance regions

204

aging factors

129

Alpha, defined

305

alternative hypotheses

202

alternatives, two-tail

203

131

203

305

267

292

Analyze phase, tools/techniques


associated
arithmetic mean

xxi
305

See also mean


association, measures of

39

associations, perfect

41

axiomatic approach to probability theory

86

B
bar charts

23

before and after data, hypothesis testing with.


bell-shaped curves

237
2

Bernoulli distribution

102

Bernoulli populations

147

Bernoulli random variable

102

Bernoulli trials

101

Beta, defined

305

beta function

153

bias in point estimators

167

bimodal data

305

bimodal distribution

305

305

310


305


binomial distribution mean


calculating in MINITAB

270

criteria for applying

102

defined

305

mean

106

normal approximation to

159

point

102

Poisson approximation to

111

sampling with replacement and

107

standard deviation

106

binomial probabilities, tables of


bivariate data
black belts, responsibilities of

271

158

105

314

39

41

BMDP software

255

bound on error of estimation

168

192

305

307

66

264

290

305

box-whisker plots

C
categorical data, graphical representation

22

cdf (cumulative distribution function)

97

117

121

141

305

central limit theorem


central tendency, measures of. See measures
of centrality
chance

71

charts. See also JMP; MINITAB


bar charts

23

267

292

305

box-whisker plots

66

264

290

305

categorical data

22

control charts

33

dot plots

20

39

262

frequency distribution tables

15

histograms

27

35

260

307
JMP

291

288


charts. See also JMP; MINITAB (Cont.)


line graphs

33

Pareto chart

24

pie chart

22

probability function

97

scatter plots

21

Six Sigma implementation flow

39

268

309

39

stem and leaf diagrams

27

34

summary information

266

291

time series graph

33

tree diagram

75

Venn diagrams

294

310

310

chi-square critical values table

320

chi-square distributions

148

270

classes

20

306

coefficient of variation (CV)

52

57

combination

77

79

complement

80

306

composite hypotheses

203

conditional probability

88

confidence coefficients

171

305

306

306

confidence intervals
See also interval estimation defined

165

171

differences between two population means

180

hypothesis testing

250

275

for large sample sizes

173

180

one-sided

174

176

pivotal quantities and

172

for population proportions

187

for population variances

195

for ratio of two population variances

198

for small sample sizes

177

Students t-distribution and

180

two-sided

173

187

183

176


320


confidence limits

171

contingency tables

306

continuity correction factor

160

306

continuous distribution

115

306

continuous random variables

94

115

117

306

control charts

33

Control phase, tools/techniques associated

xxi
140

160

168

306

40

306

critical points

204

306

critical regions

204

correction factors
correlation coefficient

cumulative distribution function (cdf)

97

117

cumulative frequencies

16

306

cumulative frequency histograms

32

307

cumulative probabilities

96

curves
bell-shaped

frequency distribution

31

Ogive

32

308

operating characteristic

206

308

power

309

CV (coefficients of variation)

52

57

306

D
data
before and after

237

bimodal

305

bivariate

39

categorical

22

41

converting to information

defined

grouped

20

57

interval

12

307

nominal

12

308

307



data (Cont.)
numerical. See numerical data
ordinal

12

308

paired

237

309

12

15

qualitative

22

309

quantitative
See quantitative data
ratio

12

sets of

15

306

skewed

51

52

symmetric

51

310

types of

11

ungrouped

20

310

defects per million opportunities


(DPMO)

Define, Measure, Analyze, Improve, and


Control (DMAIC)

xvii

Define phase, tools/techniques


associated
degrees of freedom

xxi
148

154

306

dependent events

91

306

descriptive statistics

10

15

306

94

97

99

45

52

60

density functions. See probability


functions

design of experiments (DOE)


deterministic experiments

306
72

diagrams. See charts


dichotomized populations

107

discrete distributions

306

discrete random variables

93
306

discrete sample space


dispersion, measures of

74
2
64



distribution functions
continuous random variables

117

cumulative

97

frequency

153

117

distributions
Bernoulli

102

bimodal

305

binomial. See binomial distributions


calculating in MINITAB

269

chi-square

148

270

continuous

115

306

93

14

306

exponential

129

270

307

F-

155

270

307

hypergeometric

107

10

307

110

14

270

309

39

307

discrete

305

320

normal. See normal distributions


Poisson
probability

95

rectangular distributions

118

of sample mean

140

sampling. See sampling distributions


shapes of

51

skewedsymmetric

67

67

Snedecors F-

155

Students t-

153

230

15

34

tables
uniform

118

Weibull

132

311

DMAIC (Define, Measure, Analyze, Improve,


and Control)

xvii

DOE (design of experiments)

306

dot plots

20

DPMO (defects per million opportunities)

39

262



E
empirical rule

60

equally likely events

66

70

307

305

307

305

307

307

errors
of estimation

168

192

in hypothesis testing.

204

212

margin of

168

192

mean square

308

of point estimation

168

192

standard

140

310

type I

205

212

310

type

11

205

212

estimators

137

307

defined

74

307

dependent

91

306

See also point estimation


events

equally likely

307

independent

89

307

mutually exclusive

83

90

null

75

79

of random experiments

73

rare

308

110

representations of

75

simple

73

75

309

sure

75

80

310

expected frequencies
expected values

307
99

307

experiments
defined

307

deterministic

72

random. See random experiments


exponential distributions

129

270

307


310


exponential models
extreme values

131
48

66

67

270

307

308

F
F critical values table

323

F-distributions

155

failure rate function

132

finite correction factors

168

finite populations
first quartile

11

140

307

flow chart, Six Sigma implementation

freedom, degrees of

148

frequencies, class

306

frequencies, cumulative

16

154

306

frequencies, expected

307

frequencies, relative

16

83

frequency distribution curves


frequency distribution functions

306

309

153

frequency distributions. See distributions


frequency histograms

27

32

34

307

frequency polygons

27

30

33

307

glossary

305

11

Gosset, W. S.

153

graphical representations. See charts


graphs. See charts
green belts, responsibilities of,
grouped data

xvii

20

57

307

H
hazard rate function

132



histograms

27

35

260

288

307
hypergeometric distributions

107

10

307

hypotheses, types of

202

209

305

308

273

See also hypothesis testing


hypothesis testing
before and after data

237

confidence intervals

250

275

errors in

204

212

general concepts

203

in JMP

295

300

large samples

208

240

252

295
in MINITAB

273

normal population

238

253

254

one population mean

208

223

238

250

250

274

296

276

295
one population proportion

240

one population variance

244

paired t-test

237

probability model for

201

purpose

201

small samples

223

steps in

207

two population means

216

229

two population proportions

242

276

two population variances

247

280

300

I
Improve phase, tools/techniques
associated

xxi

independent events

89

independent samples

307

inertia, moments of

101

307



inferential statistics

10

infinite populations

11

information

307

inter-quartile range (IQR)

52

64

intersection

80

307

interval data

12

307

165

171

192

52

64

308

interval estimation

308

See also hypothesis testing; point


estimation
IQR (inter-quartile range)

J
JMP
basic functions

284

calculating statistics

286

displaying bar charts

292

displaying box-whisker plots

290

displaying graphical summaries

291

displaying histograms

288

displaying pie charts

294

displaying stem and leaf diagrams

289

hypothesis testing

295

normality testing

301

paired t-test

298

300

L
LCL (lower confidence limits)

171

left skewed data

51

left skewed distributions

67

level of significance

205

limits, class

306

limits, confidence

171

limits, specification

308


307


line graphs
location, measures of

33

39

lower confidence limits (LCL)

171

lower fences

308

lower-tail hypotheses

209

63

M
MAIC (Measure, Analyze, Improve, and
Control)

margin of error

168

marginal probability

308

marks, class

192

20

mean square error (MSE)

308

mean
arithmetic

305

Bernoulli distribution

102

binomial distribution

106

continuous random variable

120

defined

308

discrete random variable

99

exponential distribution

130

for grouped data

58

hypergeometric distribution

110

Poisson distribution

114

population

138

sample

138

uniform distribution

120

Weibull distribution

133

weighted

311

Measure, Analyze, Improve, and Control


(MAIC)

Measure phase, tools/techniques


associated,
measures of association

xxi
39



measures of centrality
defined

45

for grouped data

57

limitations of

52

308

mean. See mean


median

48

51

58

mode

50

59

308

45

52

63

51

58

measures of dispersion

308

60

64
measures of location
measures of variability
median

2
308
48

memory-less properties
midpoints, class

130
20

306

MINITAB
calculating distributions

269

calculating statistics

258

displaying bar charts

267

displaying box-whisker plots

264

displaying dot plots

262

displaying graphical summaries

266

displaying graphs, generally

260

displaying histograms

260

displaying pie charts

268

displaying scatter plots

263

general use

255

264

265

hypothesis testing about population mean


and proportion

273

hypothesis testing about two population


means and proportions

276

hypothesis testing about two population


variances

280

normality testing

282

paired t-test

278

82


308


mode

50

moment of inertia
Motorola definition of Six Sigma
MSE (mean square error)

59

308

101
3
308

multiplication rule

77

90

mutually exclusive events

83

90

nominal data

12

308

nonconditional probability

89

308

nonparametric statistics

308

normal distribution
calculating in MINITAB

270

chi-square distributions and

148

defined

121

empirical rule

123

308

60

examples

124

generally

121

standard deviation and

Students t-distribution and

153

tables

123

319

normality testing
JMP

301

MINITAB

282

null event
null hypotheses

75

79

202

308

20

27

numerical data
graphical representations
interval estimation and
measures of

66

171
52

point estimation and


numerical measures

166
45

See also measures of


centrality; measures of dispersion

310


O
observations

308

observed level of significance

205

308

206

308

32

308

OC (operating characteristic)
curves
Ogive curves
one-tail alternatives

203

one-tail tests

308

operating characteristic (OC)


curves

206

opportunities for defects


ordered stem and leaf diagrams

308

1
36

See also
stem and leaf diagrams
ordinal data

12

308

outliers

48

66

p-values

210

309

paired data

237

309

paired t-test

237

278

298

parameters

45

137

165

309

Pareto chart

24

294

309

67

308

Pearson correlation
coefficient

40

Pearson, Karl

40

percentiles

63

perfect association

41

permutations

77

pie charts

22

pivotal quantities

172

point binomial distribution

102

309

268



point estimation
See also hypothesis testing;
interval estimation
bias in

167

defined

165

description

166

69

errors of

168

192

examples

169

variance of

167

point value

310

305

307

169

309

Poisson approximation to binomial


distribution

111

158

Poisson distribution

110

270

Poisson probability tables

114

317

Poisson process

111

131

309

population means
confidence intervals for large samples

180

confidence intervals for small samples

183

differences between

216

sample mean and

138

population proportions
confidence intervals

187

difference between two

242

estimating unknown

195

hypothesis testing and

240

273

population variances
confidence intervals

195

formula for

54

for grouped data

60

hypothesis testing with known

208

216

223

229

250
hypothesis testing of one

244

hypothesis testing of two

247

280

300

hypothesis testing with unknown

213

219

226


232


populations
    defined, 10, 309
    types of, 11, 107
power curve, defined, 309
power, defined, 309
power of the test, 205
probability
    absolute, 89
    axiomatic approach, 86
    conditional, 88, 306
    defined, 71, 72
    defining by relative frequency, 83
    marginal, 308
    nonconditional, 89
    random experiments, 72
    statistics and, 72
    theoretical, 85
probability distributions. See distributions
probability functions
    Bernoulli, 102
    binomial, 103
    continuous random variables, 115
    exponential distributions, 129
    formula for, 95
    graphical representations, 97
    hypergeometric, 108
    normal, 121
    Poisson distributions, 111
    Snedecor's F-distributions, 156
    uniform, 118
    Weibull, 132
probability tables, Poisson, 114, 317
problem-solving methodologies, xix
process, defined, 309
process improvement using Six Sigma, 147
Q

qualitative data
    defined, 12, 309
    frequency distribution tables, 15
    graphical representations, 22
quality control, defined, 309
quantitative data
    defined, 12, 309
    frequency distribution tables, 18
    graphical representations, 20, 27, 34
    interval estimation and, 171
    measures of, 52, 66
    point estimation and, 166
quartiles, 64, 309

R

random experiments
    defined, 307
    events of, 73
    probability and, 72
random samples, 11, 309
random variables
    Bernoulli, 102
    continuous, 94, 115, 306
    defined, 93, 309
    discrete, 93, 94, 97, 99, 117, 306
    standard normal, 122
    types, 93
range spaces, 95, 115
range, 52, 53, 309
range, interquartile, 52, 64
rare events, 110
ratio data, 12
rectangular distribution, 118
rejection regions, 204, 205, 309
relative frequencies, 16, 83, 309
relative frequency approach, 85
relative frequency histogram, 27
relative frequency polygon, 30
replacement, sampling and, 107
research hypotheses, 202
right skewed data, 52
right skewed distributions, 67
root cause analysis, xix, 307

S

sample mean, probability distributions of, 140
sample points, 73
sample sizes, determining, 192
sample space, 73, 77
sample statistics, 309
sample survey, 309
sample variance, 54, 60
sampled populations, 11, 309
samples
    defined, 11, 309
    independent, 307
    replacement and, 107
sampling distributions. See also central limit theorem
    defined, 309
    of sample mean, 138
    of sample proportion, 147
    Student's t-distribution, 153
SAS software, 255
scatter plots, 21, 39, 263
second quartile, 309
Set Theory, 80
significance, level of, 205, 308
simple events, 73, 75, 309
simple hypotheses, 203
single-valued frequency distribution tables, 18
Six Sigma
    defined
    implementation flow chart
    methodology, xix
    Motorola definition
    statistical concept
    steps in, xvii
    tools/techniques, xxi
skewed data, 310
Snedecor's F-distribution, 155, 310
software for statistical analysis, 255, 303. See also JMP; MINITAB
specification limits, 2
SPSS software, 255
standard deviation
    Bernoulli distribution, 102
    binomial distribution, 106
    continuous random variable, 120
    defined, 310
    discrete random variable, 99
    exponential distribution, 130
    for grouped data, 60
    hypergeometric distribution, 110
    Poisson distribution, 114
    uniform distribution, 120
standard error, 140, 265, 310
standard normal distribution. See normal distribution
standard normal random variables, 122
statistical tools, 255, 303. See also JMP; MINITAB
statistics
    calculating in JMP, 286
    calculating in MINITAB, 258
    defined, 10, 45, 137
    descriptive, 15, 306
    goals of, 165
    inferential, 10
    nonparametric, 308
    probability and, 72
    sample, 309
Statpages.net, 303
Statsoftinc.com, 303
stem and leaf diagrams, 27, 34, 289, 310
Student's t-distribution, 153, 180, 226, 230, 310
Sturges' formula, 19, 310
sure events, 75
survey, sample, 309
symmetric data, 51
symmetric distribution, 67, 310
SYSTAT software, 255

T

t critical values table, 322
t-distributions, 153, 180, 226, 230
t-test, paired, 237, 278, 298
tables
    binomial probability, 105, 159, 314
    chi-square distribution, 149, 320
    F critical value, 323
    frequency distribution, 15, 34
    normal distribution, 123, 319
    Poisson probability, 114, 317
    Snedecor's F-distribution, 157
    Student's t-distribution, 154
    t critical values, 322
target populations, 11
test statistic, 39, 310
testing statistical hypotheses, 202
tests, types of, 310
theoretical probability, 85
third quartile, 310
time series graphs, 33
tree diagrams, 75
two-tail alternatives, 203
two-tail hypotheses, 209
two-tail tests, 310
type I error, 205, 212, 310
type II error, 205, 212, 310

U

UCL (upper confidence limits), 171
unbiased estimators, 167
ungrouped data, 20, 310
uniform distributions, 118, 270, 310
union, 80
upper confidence limits (UCL), 171
upper fences, 310
upper-tail hypotheses, 209
V

values
    chi-square, 320
    expected, 99, 307
    extreme, 48, 66
    F critical, 323
    p-, 210
    point, 309
    t critical, 322
variability, measures of, 67, 308, 309
variables
    defined, 310
    in frequency distribution tables, 16
variances
    defined, 310
    of point estimators, 167, 169
    population. See population variances
    sample, 54, 60
variation within a process, 3
Venn diagram, 79, 310

W

web-based statistical tools, 303
Weibull distribution, 132
weighted mean, 311
width, class, 306

Z

Z distribution, 311
z-scores, 123, 311