Professional Documents
Culture Documents
Brooks/Cole
10 Davis Drive
Belmont, CA 94002-3098
USA
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
CONTENTS
Preface xiii
Note to the Student xxi
1 What Is Statistics? 1
1.1 Introduction 1
1.2 Characterizing a Set of Measurements: Graphical Methods 3
1.3 Characterizing a Set of Measurements: Numerical Methods 8
1.4 How Inferences Are Made 13
1.5 Theory and Reality 14
1.6 Summary 15
2 Probability 20
2.1 Introduction 20
2.2 Probability and Inference 21
2.3 A Review of Set Notation 23
2.4 A Probabilistic Model for an Experiment: The Discrete Case 26
2.5 Calculating the Probability of an Event: The Sample-Point Method 35
2.6 Tools for Counting Sample Points 40
2.7 Conditional Probability and the Independence of Events 51
2.8 Two Laws of Probability 57
v
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
vi Contents
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Contents vii
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
viii Contents
8 Estimation 390
8.1 Introduction 390
8.2 The Bias and Mean Square Error of Point Estimators 392
8.3 Some Common Unbiased Point Estimators 396
8.4 Evaluating the Goodness of a Point Estimator 399
8.5 Confidence Intervals 406
8.6 Large-Sample Confidence Intervals 411
8.7 Selecting the Sample Size 421
8.8 Small-Sample Confidence Intervals for μ and μ1 − μ2 425
2
8.9 Confidence Intervals for σ 434
8.10 Summary 437
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Contents xi
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xii Contents
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
PREFACE
Our Approach
Talking with students taking or having completed a beginning course in mathematical
statistics reveals a major flaw in many courses. Students can take the course and leave
it without a clear understanding of the nature of statistics. Many see the theory as a
collection of topics, weakly or strongly related, but fail to see that statistics is a theory
of information with inference as its goal. Further, they may leave the course without
an understanding of the important role played by statistics in scientific investigations.
These considerations led us to develop a text that differs from others in three ways:
• First, the presentation of probability is preceded by a clear statement of the
objective of statistics—statistical inference—and its role in scientific research.
As students proceed through the theory of probability (Chapters 2 through 7),
they are reminded frequently of the role that major topics play in statistical
inference. The cumulative effect is that statistical inference is the dominating
theme of the course.
• The second feature of the text is connectivity. We explain not only how major
topics play a role in statistical inference, but also how the topics are related to
xiii
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xiv Preface
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Preface xv
FIGURE 1
Applet illustration of
Bayes’ rule
random variable are equivalent to probabilities associated with the standard normal
distribution. The applet Normal Probabilities (One Tail) provides upper-tail areas as-
sociated with any user-specified, normal distribution and can also be used to establish
the value that cuts off a user-specified area in the upper tail for any normally distributed
random variable. Probabilities and quantiles associated with standard normal random
variables are obtained by selecting the parameter values mean = 0 and standard de-
viation = 1. The beta and gamma distributions are more thoroughly explored in this
chapter. Users can simultaneously graph three gamma (or beta) densities (all with user
selected parameter values) and assess the impact that the parameter values have on
the shapes of gamma (or beta) density functions (see Figure 2). This is accomplished
FIGURE 2
Applet comparison of
three beta densities
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xvi Preface
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Preface xvii
FIGURE 3
Applet illustration of
the central limit
theorem.
works. In addition, it is seen that for large sample sizes the sample variance has
an approximate normal distribution.
The applet Normal Approximation to the Binomial permits the user to assess the quality
of the the (continuous) normal approximation for (discrete) binomial probabilities.
As in previous chapters, a sequence of Applet Exercises leads the user to discover
important and interesting answers and concepts. From a more theoretical perspective,
we establish the independence of the sample mean and sample variance for a sample
of size 2 from a normal distribution. As before, the proof of this result for general
n is contained in an optional exercise. Exercises provide step-by-step derivations of
the mean and variance for random variables with t and F distributions.
Throughout Chapter 8, we have stressed the assumptions associated with confi-
dence intervals based on the t distributions. We have also included a brief discussion
of the robustness of the t procedures and the lack of such for the intervals based
on the χ 2 and F distributions. The applet ConfidenceIntervalP illustrates properties
of large-sample confidence intervals for a population proportion. In Chapter 9, the
applets PointSingle, PointbyPoint, and PointEstimation ultimately lead to a very nice
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xviii Preface
The Exercises
This edition contains more than 350 new exercises. Many of the new exercises use the
applets previously mentioned to guide the user through a series of steps that lead to
more thorough understanding of important concepts. Others use the applets to provide
confidence intervals or p-values that could only be approximated by using tables in the
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Preface xix
appendix. As in previous editions, some of the new exercises are theoretical whereas
others contain data from documented sources that deal with research in a variety of
fields. We continue to believe that exercises based on real data or actual experimental
scenarios permit students to see the practical uses of the various statistical and proba-
bilistic methods presented in the text. As they work through these exercises, students
gain insight into the real-life applications of the theoretical results developed in the
text. This insight makes learning the necessary theory more enjoyable and produces
a deeper understanding of the theoretical methods. As in previous editions, the more
challenging exercises are marked with an asterisk (*). Answers to the odd-numbered
exercises are provided in the back of the book.
Acknowledgments
The authors wish to thank the many colleagues, friends, and students who have made
helpful suggestions concerning the revisions of this text. In particular, we are indebted
to P. V. Rao, J. G. Saw, Malay Ghosh, Andrew Rosalsky, and Brett Presnell for their
technical comments. Gary McClelland, University of Colorado, did an outstanding
job of developing the applets used in the text. Jason Owen, University of Richmond,
wrote the solutions manual. Mary Mortlock, Cal Poly, San Luis Obispo, checked
accuracy.
We wish to thank E. S. Pearson, W. H. Beyer, I. Olkin, R. A. Wilcox, C. W.
Dunnett, and A. Hald. We profited substantially from the suggestions of the review-
ers of the current and previous editions of the text: Roger Abernathy, Arkansas State
University; Elizabeth S. Allman, University of Southern Maine; Robert Berk, Rutgers
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xx Preface
DENNIS D. WACKERLY
WILLIAM MENDENHALL III
RICHARD L. SCHEAFFER
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
NOTE TO THE STUDENT
As the title Mathematical Statistics with Applications implies, this text is concerned
with statistics, in both theory and application, and only deals with mathematics as a
necessary tool to give you a firm understanding of statistical techniques. The following
suggestions for using the text will increase your learning and save your time.
The connectivity of the book is provided by the introductions and summaries in
each chapter. These sections explain how each chapter fits into the overall picture of
statistical inference and how each chapter relates to the preceding ones.
FIGURE 4
Applet calculation of
the probability that a
gamma–distributed
random variable
exceeds its mean
xxi
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
xxii Note to the Student
Within the chapters, important concepts are set off as definitions. These should be
read and reread until they are clearly understood because they form the framework
on which everything else is built. The main theoretical results are set off as theo-
rems. Although it is not necessary to understand the proof of each theorem, a clear
understanding of the meaning and implications of the theorems is essential.
It is also essential that you work many of the exercises—for at least four reasons:
• You can be certain that you understand what you have read only by putting your
knowledge to the test of working problems.
• Many of the exercises are of a practical nature and shed light on the applications
of probability and statistics.
• Some of the exercises present new concepts and thus extend the material covered
in the chapter.
• Many of the applet exercises help build intuition, facilitate understanding of
concepts, and provide answers that cannot (practically) be obtained using tables
in the appendices (see Figure 4).
D. D. W.
W. M.
R. L. S.
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
CHAPTER 1
What Is Statistics?
1.1 Introduction
1.6 Summary
1.1 Introduction
Statistical techniques are employed in almost every phase of life. Surveys are de-
signed to collect early returns on election day and forecast the outcome of an election.
Consumers are sampled to provide information for predicting product preferences.
Research physicians conduct experiments to determine the effect of various drugs and
controlled environmental conditions on humans in order to infer the appropriate treat-
ment for various illnesses. Engineers sample a product quality characteristic and var-
ious controllable process variables to identify key variables related to product quality.
Newly manufactured electronic devices are sampled before shipping to decide whether
to ship or hold individual lots. Economists observe various indices of economic health
over a period of time and use the information to forecast the condition of the economy
in the future. Statistical techniques play an important role in achieving the objective
of each of these practical situations. The development of the theory underlying these
techniques is the focus of this text.
A prerequisite to a discussion of the theory of statistics is a definition of statis-
tics and a statement of its objectives. Webster’s New Collegiate Dictionary defines
statistics as “a branch of mathematics dealing with the collection, analysis, interpre-
tation, and presentation of masses of numerical data.” Stuart and Ord (1991) state:
“Statistics is the branch of the scientific method which deals with the data obtained by
counting or measuring the properties of populations.” Rice (1995), commenting on
experimentation and statistical applications, states that statistics is “essentially con-
cerned with procedures for analyzing data, especially data that in some vague sense
have a random character.” Freund and Walpole (1987), among others, view statistics
as encompassing “the science of basing inferences on observed data and the entire
1
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
2 Chapter 1 What Is Statistics?
problem of making decisions in the face of uncertainty.” And Mood, Graybill, and
Boes (1974) define statistics as “the technology of the scientific method” and add
that statistics is concerned with “(1) the design of experiments and investigations,
(2) statistical inference.” A superficial examination of these definitions suggests a
substantial lack of agreement, but all possess common elements. Each description
implies that data are collected, with inference as the objective. Each requires select-
ing a subset of a large collection of data, either existent or conceptual, in order to
infer the characteristics of the complete set. All the authors imply that statistics is a
theory of information, with inference making as its objective.
The large body of data that is the target of our interest is called the population, and
the subset selected from it is a sample. The preferences of voters for a gubernatorial
candidate, Jones, expressed in quantitative form (1 for “prefer” and 0 for “do not
prefer”) provide a real, finite, and existing population of great interest to Jones. To
determine the true fraction who favor his election, Jones would need to interview
all eligible voters—a task that is practically impossible. The voltage at a particular
point in the guidance system for a spacecraft may be tested in the only three sys-
tems that have been built. The resulting data could be used to estimate the voltage
characteristics for other systems that might be manufactured some time in the future.
In this case, the population is conceptual. We think of the sample of three as being
representative of a large population of guidance systems that could be built using the
same method. Presumably, this population would possess characteristics similar to
the three systems in the sample. Analogously, measurements on patients in a medical
experiment represent a sample from a conceptual population consisting of all patients
similarly afflicted today, as well as those who will be afflicted in the near future. You
will find it useful to clearly define the populations of interest for each of the scenarios
described earlier in this section and to clarify the inferential objective for each.
It is interesting to note that billions of dollars are spent each year by U.S. indus-
try and government for data from experimentation, sample surveys, and other data
collection procedures. This money is expended solely to obtain information about
phenomena susceptible to measurement in areas of business, science, or the arts. The
implications of this statement provide keys to the nature of the very valuable contri-
bution that the discipline of statistics makes to research and development in all areas
of society. Information useful in inferring some characteristic of a population (either
existing or conceptual) is purchased in a specified quantity and results in an inference
(estimation or decision) with an associated degree of goodness. For example, if Jones
arranges for a sample of voters to be interviewed, the information in the sample can be
used to estimate the true fraction of all voters who favor Jones’s election. In addition
to the estimate itself, Jones should also be concerned with the likelihood (chance)
that the estimate provided is close to the true fraction of eligible voters who favor his
election. Intuitively, the larger the number of eligible voters in the sample, the higher
will be the likelihood of an accurate estimate. Similarly, if a decision is made regarding
the relative merits of two manufacturing processes based on examination of samples
of products from both processes, we should be interested in the decision regarding
which is better and the likelihood that the decision is correct. In general, the study of
statistics is concerned with the design of experiments or sample surveys to obtain a
specified quantity of information at minimum cost and the optimum use of this infor-
mation in making an inference about a population. The objective of statistics is to make
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
1.2 Characterizing a Set of Measurements: Graphical Methods 3
Exercises
1.1 For each of the following situations, identify the population of interest, the inferential objective,
and how you might go about collecting a sample.
a A university researcher wants to estimate the proportion of U.S. citizens from
“Generation X” who are interested in starting their own businesses.
b For more than a century, normal body temperature for humans has been accepted to be
98.6◦ Fahrenheit. Is it really? Researchers want to estimate the average temperature of
healthy adults in the United States.
c A city engineer wants to estimate the average weekly water consumption for single-family
dwelling units in the city.
d The National Highway Safety Council wants to estimate the proportion of automobile tires
with unsafe tread among all tires manufactured by a specific company during the current
production year.
e A political scientist wants to determine whether a majority of adult residents of a state favor
a unicameral legislature.
f A medical scientist wants to estimate the average length of time until the recurrence of a
certain disease.
g An electrical engineer wants to determine whether the average length of life of transistors
of a certain type is greater than 500 hours.
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
4 Chapter 1 What Is Statistics?
contract, fixed dollar costs, and the supervisor who is assigned the task of organizing
and conducting the manufacturing operation. The statistician will wish to measure the
response or dependent variable, profit per contract, for several jobs (the sample). Along
with recording the profit, the statistician will obtain measurements on the variables
that might be related to profit—the independent variables. His or her objective is to
use information in the sample to infer the approximate relationship of the independent
variables just described to the dependent variable, profit, and to measure the strength
of this relationship. The manufacturer’s objective is to determine optimum conditions
for maximizing profit.
The population of interest in the manufacturing problem is conceptual and consists
of all measurements of profit (per unit of capital and labor invested) that might be
made on contracts, now and in the future, for fixed values of the independent variables
(size of the contract, measure of competition, etc.). The profit measurements will vary
from contract to contract in a seemingly random manner as a result of variations in
materials, time needed to complete individual segments of the work, and other uncon-
trollable variables affecting the job. Consequently, we view the population as being
represented by a distribution of profit measurements, with the form of the distribution
depending on specific values of the independent variables. Our wish to determine the
relationship between the dependent variable, profit, and a set of independent variables
is therefore translated into a desire to determine the effect of the independent variables
on the conceptual distribution of population measurements.
An individual population (or any set of measurements) can be characterized by
a relative frequency distribution, which can be represented by a relative frequency
histogram. A graph is constructed by subdividing the axis of measurement into inter-
vals of equal width. A rectangle is constructed over each interval, such that the height
of the rectangle is proportional to the fraction of the total number of measurements
falling in each cell. For example, to characterize the ten measurements 2.1, 2.4, 2.2,
2.3, 2.7, 2.5, 2.4, 2.6, 2.6, and 2.9, we could divide the axis of measurement into in-
tervals of equal width (say, .2 unit), commencing with 2.05. The relative frequencies
(fraction of total number of measurements), calculated for each interval, are shown
in Figure 1.1. Notice that the figure gives a clear pictorial description of the entire set
of ten measurements.
Observe that we have not given precise rules for selecting the number, widths,
or locations of the intervals used in constructing a histogram. This is because the
F I G U R E 1.1 Relative
Relative frequency Frequency
histogram .3
.2
.1
0
2.05 2.25 2.45 2.65 2.85 3.05
Axis of
Measurement
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
1.2 Characterizing a Set of Measurements: Graphical Methods 5
selection of these items is somewhat at the discretion of the person who is involved
in the construction.
Although they are arbitrary, a few guidelines can be very helpful in selecting the
intervals. Points of subdivision of the axis of measurement should be chosen so that it is
impossible for a measurement to fall on a point of division. This eliminates a source of
confusion and is easily accomplished, as indicated in Figure 1.1. The second guideline
involves the width of each interval and consequently, the minimum number of intervals
needed to describe the data. Generally speaking, we wish to obtain information on the
form of the distribution of the data. Many times the form will be mound-shaped, as
illustrated in Figure 1.2. (Others prefer to refer to distributions such as these as bell-
shaped, or normal.) Using many intervals with a small amount of data results in little
summarization and presents a picture very similar to the data in their original form.
The larger the amount of data, the greater the number of included intervals can be while
still presenting a satisfactory picture of the data. We suggest spanning the range of the
data with from 5 to 20 intervals and using the larger number of intervals for larger
quantities of data. In most real-life applications, computer software (Minitab, SAS,
R, S+, JMP, etc.) is used to obtain any desired histograms. These computer packages
all produce histograms satisfying widely agreed-upon constraints on scaling, number
of intervals used, widths of intervals, and the like.
Some people feel that the description of data is an end in itself. Histograms are
often used for this purpose, but there are many other graphical methods that provide
meaningful summaries of the information contained in a set of data. Some excellent
references for the general topic of graphical descriptive methods are given in the
references at the end of this chapter. Keep in mind, however, that the usual objective
of statistics is to make inferences. The relative frequency distribution associated with a
data set and the accompanying histogram are sufficient for our objectives in developing
the material in this text. This is primarily due to the probabilistic interpretation that
can be derived from the frequency histogram, Figure 1.1. We have already stated that
the area of a rectangle over a given interval is proportional to the fraction of the total
number of measurements falling in that interval. Let’s extend this idea one step further.
If a measurement is selected at random from the original data set, the probability
that it will fall in a given interval is proportional to the area under the histogram lying
over that interval. (At this point, we rely on the layperson’s concept of probability.
This term is discussed in greater detail in Chapter 2.) For example, for the data used
to construct Figure 1.1, the probability that a randomly selected measurement falls in
the interval from 2.05 to 2.45 is .5 because half the measurements fall in this interval.
Correspondingly, the area under the histogram in Figure 1.1 over the interval from
F I G U R E 1.2 Relative
Relative frequency Frequency
distribution
0
2.05 2.25 2.45 2.65 2.85 3.05
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
6 Chapter 1 What Is Statistics?
2.05 to 2.45 is half of the total area under the histogram. It is clear that this interpreta-
tion applies to the distribution of any set of measurements—a population or a sample.
Suppose that Figure 1.2 gives the relative frequency distribution of profit (in mil-
lions of dollars) for a conceptual population of profit responses for contracts at spec-
ified settings of the independent variables (size of contract, measure of competition,
etc.). The probability that the next contract (at the same settings of the independent
variables) yields a profit that falls in the interval from 2.05 to 2.45 million is given by
the proportion of the area under the distribution curve that is shaded in Figure 1.2.
Exercises
1.2 Are some cities more windy than others? Does Chicago deserve to be nicknamed “The Windy
City”? Given below are the average wind speeds (in miles per hour) for 45 selected U.S. cities:
a Construct a relative frequency histogram for these data. (Choose the class boundaries
without including the value 35.1 in the range of values.)
b The value 35.1 was recorded at Mt. Washington, New Hampshire. Does the geography of
that city explain the magnitude of its average wind speed?
c The average wind speed for Chicago is 10.3 miles per hour. What percentage of the cities
have average wind speeds in excess of Chicago’s?
d Do you think that Chicago is unusually windy?
1.3 Of great importance to residents of central Florida is the amount of radioactive material present
in the soil of reclaimed phosphate mining areas. Measurements of the amount of 238 U in 25 soil
samples were as follows (measurements in picocuries per gram):
.74 6.47 1.90 2.69 .75
.32 9.99 1.77 2.41 1.96
1.66 .70 2.42 .54 3.36
3.59 .37 1.09 8.32 4.06
4.55 .76 2.03 5.70 12.48
Construct a relative frequency histogram for these data.
1.4 The top 40 stocks on the over-the-counter (OTC) market, ranked by percentage of outstanding
shares traded on one day last year are as follows:
11.88 6.27 5.49 4.81 4.40 3.78 3.44 3.11 2.88 2.68
7.99 6.07 5.26 4.79 4.05 3.69 3.36 3.03 2.74 2.63
7.15 5.98 5.07 4.55 3.94 3.62 3.26 2.99 2.74 2.62
7.13 5.91 4.94 4.43 3.93 3.48 3.20 2.89 2.69 2.61
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Exercises 7
c If one of the stocks is selected at random from the 40 for which the preceding data were
taken, what is the probability that it will have traded fewer than 5% of its outstanding shares?
1.5 Given here is the relative frequency histogram associated with grade point averages (GPAs) of
a sample of 30 students:
Relative
Frequency
6/30
3/30
0
1.85 2.05 2.25 2.45 2.65 2.85 3.05 3.25 3.45
Grade Point
Average
a Which of the GPA categories identified on the horizontal axis are associated with the largest
proportion of students?
b What proportion of students had GPAs in each of the categories that you identified?
c What proportion of the students had GPAs less than 2.65?
1.6 The relative frequency histogram given next was constructed from data obtained from a random
sample of 25 families. Each was asked the number of quarts of milk that had been purchased
the previous week.
Relative
.4
Frequency
.3
.2
.1
0
0 1 2 3 4 5
Quarts
a Use this relative frequency histogram to determine the number of quarts of milk purchased
by the largest proportion of the 25 families. The category associated with the largest relative
frequency is called the modal category.
b What proportion of the 25 families purchased more than 2 quarts of milk?
c What proportion purchased more than 0 but fewer than 5 quarts?
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
8 Chapter 1 What Is Statistics?
1.7 The self-reported heights of 105 students in a biostatistics class were used to construct the
histogram given below.
Relative 10/105
frequency
5/105
0
60 63 66 69 72 75
Heights
a Construct a relative frequency histogram to describe the aluminum oxide content of all
26 pottery samples.
b What unusual feature do you see in this histogram? Looking at the data, can you think of
an explanation for this unusual feature?
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
1.3 Characterizing a Set of Measurements: Numerical Methods 9
could be formed from the same set of measurements. To make inferences about a
population based on information contained in a sample and to measure the goodness
of the inferences, we need rigorously defined quantities for summarizing the infor-
mation contained in a sample. These sample quantities typically have mathematical
properties, to be developed in the following chapters, that allow us to make probability
statements regarding the goodness of our inferences.
The quantities we define are numerical descriptive measures of a set of data.
We seek some numbers that have meaningful interpretations and that can be used
to describe the frequency distribution for any set of measurements. We will confine
our attention to two types of descriptive numbers: measures of central tendency and
measures of dispersion or variation.
Probably the most common measure of central tendency used in statistics is the
arithmetic mean. (Because this is the only type of mean discussed in this text, we will
omit the word arithmetic.)
1b n
y= yi .
n i=1
The symbol y, read “y bar,” refers to a sample mean. We usually cannot measure
the value of the population mean, μ; rather, μ is an unknown constant that we may
want to estimate using sample information.
The mean of a set of measurements only locates the center of the distribution
of data; by itself, it does not provide an adequate description of a set of measure-
ments. Two sets of measurements could have widely different frequency distributions
but equal means, as pictured in Figure 1.3. The difference between distributions I
and II in the figure lies in the variation or dispersion of measurements on either
side of the mean. To describe data adequately, we must also define measures of data
variability.
The most common measure of variability used in statistics is the variance, which is a
function of the deviations (or distances) of the sample measurements from their mean.
F I G U R E 1.3
Frequency
distributions with
equal means but
different amounts
of variation
& &
" ""
Copyright 2008 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has
deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.
Another random document with
no related content on Scribd:
DANCE ON STILTS AT THE GIRLS’ UNYAGO, NIUCHI
I see increasing reason to believe that the view formed some time
back as to the origin of the Makonde bush is the correct one. I have
no doubt that it is not a natural product, but the result of human
occupation. Those parts of the high country where man—as a very
slight amount of practice enables the eye to perceive at once—has not
yet penetrated with axe and hoe, are still occupied by a splendid
timber forest quite able to sustain a comparison with our mixed
forests in Germany. But wherever man has once built his hut or tilled
his field, this horrible bush springs up. Every phase of this process
may be seen in the course of a couple of hours’ walk along the main
road. From the bush to right or left, one hears the sound of the axe—
not from one spot only, but from several directions at once. A few
steps further on, we can see what is taking place. The brush has been
cut down and piled up in heaps to the height of a yard or more,
between which the trunks of the large trees stand up like the last
pillars of a magnificent ruined building. These, too, present a
melancholy spectacle: the destructive Makonde have ringed them—
cut a broad strip of bark all round to ensure their dying off—and also
piled up pyramids of brush round them. Father and son, mother and
son-in-law, are chopping away perseveringly in the background—too
busy, almost, to look round at the white stranger, who usually excites
so much interest. If you pass by the same place a week later, the piles
of brushwood have disappeared and a thick layer of ashes has taken
the place of the green forest. The large trees stretch their
smouldering trunks and branches in dumb accusation to heaven—if
they have not already fallen and been more or less reduced to ashes,
perhaps only showing as a white stripe on the dark ground.
This work of destruction is carried out by the Makonde alike on the
virgin forest and on the bush which has sprung up on sites already
cultivated and deserted. In the second case they are saved the trouble
of burning the large trees, these being entirely absent in the
secondary bush.
After burning this piece of forest ground and loosening it with the
hoe, the native sows his corn and plants his vegetables. All over the
country, he goes in for bed-culture, which requires, and, in fact,
receives, the most careful attention. Weeds are nowhere tolerated in
the south of German East Africa. The crops may fail on the plains,
where droughts are frequent, but never on the plateau with its
abundant rains and heavy dews. Its fortunate inhabitants even have
the satisfaction of seeing the proud Wayao and Wamakua working
for them as labourers, driven by hunger to serve where they were
accustomed to rule.
But the light, sandy soil is soon exhausted, and would yield no
harvest the second year if cultivated twice running. This fact has
been familiar to the native for ages; consequently he provides in
time, and, while his crop is growing, prepares the next plot with axe
and firebrand. Next year he plants this with his various crops and
lets the first piece lie fallow. For a short time it remains waste and
desolate; then nature steps in to repair the destruction wrought by
man; a thousand new growths spring out of the exhausted soil, and
even the old stumps put forth fresh shoots. Next year the new growth
is up to one’s knees, and in a few years more it is that terrible,
impenetrable bush, which maintains its position till the black
occupier of the land has made the round of all the available sites and
come back to his starting point.
The Makonde are, body and soul, so to speak, one with this bush.
According to my Yao informants, indeed, their name means nothing
else but “bush people.” Their own tradition says that they have been
settled up here for a very long time, but to my surprise they laid great
stress on an original immigration. Their old homes were in the
south-east, near Mikindani and the mouth of the Rovuma, whence
their peaceful forefathers were driven by the continual raids of the
Sakalavas from Madagascar and the warlike Shirazis[47] of the coast,
to take refuge on the almost inaccessible plateau. I have studied
African ethnology for twenty years, but the fact that changes of
population in this apparently quiet and peaceable corner of the earth
could have been occasioned by outside enterprises taking place on
the high seas, was completely new to me. It is, no doubt, however,
correct.
The charming tribal legend of the Makonde—besides informing us
of other interesting matters—explains why they have to live in the
thickest of the bush and a long way from the edge of the plateau,
instead of making their permanent homes beside the purling brooks
and springs of the low country.
“The place where the tribe originated is Mahuta, on the southern
side of the plateau towards the Rovuma, where of old time there was
nothing but thick bush. Out of this bush came a man who never
washed himself or shaved his head, and who ate and drank but little.
He went out and made a human figure from the wood of a tree
growing in the open country, which he took home to his abode in the
bush and there set it upright. In the night this image came to life and
was a woman. The man and woman went down together to the
Rovuma to wash themselves. Here the woman gave birth to a still-
born child. They left that place and passed over the high land into the
valley of the Mbemkuru, where the woman had another child, which
was also born dead. Then they returned to the high bush country of
Mahuta, where the third child was born, which lived and grew up. In
course of time, the couple had many more children, and called
themselves Wamatanda. These were the ancestral stock of the
Makonde, also called Wamakonde,[48] i.e., aborigines. Their
forefather, the man from the bush, gave his children the command to
bury their dead upright, in memory of the mother of their race who
was cut out of wood and awoke to life when standing upright. He also
warned them against settling in the valleys and near large streams,
for sickness and death dwelt there. They were to make it a rule to
have their huts at least an hour’s walk from the nearest watering-
place; then their children would thrive and escape illness.”
The explanation of the name Makonde given by my informants is
somewhat different from that contained in the above legend, which I
extract from a little book (small, but packed with information), by
Pater Adams, entitled Lindi und sein Hinterland. Otherwise, my
results agree exactly with the statements of the legend. Washing?
Hapana—there is no such thing. Why should they do so? As it is, the
supply of water scarcely suffices for cooking and drinking; other
people do not wash, so why should the Makonde distinguish himself
by such needless eccentricity? As for shaving the head, the short,
woolly crop scarcely needs it,[49] so the second ancestral precept is
likewise easy enough to follow. Beyond this, however, there is
nothing ridiculous in the ancestor’s advice. I have obtained from
various local artists a fairly large number of figures carved in wood,
ranging from fifteen to twenty-three inches in height, and
representing women belonging to the great group of the Mavia,
Makonde, and Matambwe tribes. The carving is remarkably well
done and renders the female type with great accuracy, especially the
keloid ornamentation, to be described later on. As to the object and
meaning of their works the sculptors either could or (more probably)
would tell me nothing, and I was forced to content myself with the
scanty information vouchsafed by one man, who said that the figures
were merely intended to represent the nembo—the artificial
deformations of pelele, ear-discs, and keloids. The legend recorded
by Pater Adams places these figures in a new light. They must surely
be more than mere dolls; and we may even venture to assume that
they are—though the majority of present-day Makonde are probably
unaware of the fact—representations of the tribal ancestress.
The references in the legend to the descent from Mahuta to the
Rovuma, and to a journey across the highlands into the Mbekuru
valley, undoubtedly indicate the previous history of the tribe, the
travels of the ancestral pair typifying the migrations of their
descendants. The descent to the neighbouring Rovuma valley, with
its extraordinary fertility and great abundance of game, is intelligible
at a glance—but the crossing of the Lukuledi depression, the ascent
to the Rondo Plateau and the descent to the Mbemkuru, also lie
within the bounds of probability, for all these districts have exactly
the same character as the extreme south. Now, however, comes a
point of especial interest for our bacteriological age. The primitive
Makonde did not enjoy their lives in the marshy river-valleys.
Disease raged among them, and many died. It was only after they
had returned to their original home near Mahuta, that the health
conditions of these people improved. We are very apt to think of the
African as a stupid person whose ignorance of nature is only equalled
by his fear of it, and who looks on all mishaps as caused by evil
spirits and malignant natural powers. It is much more correct to
assume in this case that the people very early learnt to distinguish
districts infested with malaria from those where it is absent.
This knowledge is crystallized in the
ancestral warning against settling in the
valleys and near the great waters, the
dwelling-places of disease and death. At the
same time, for security against the hostile
Mavia south of the Rovuma, it was enacted
that every settlement must be not less than a
certain distance from the southern edge of the
plateau. Such in fact is their mode of life at the
present day. It is not such a bad one, and
certainly they are both safer and more
comfortable than the Makua, the recent
intruders from the south, who have made USUAL METHOD OF
good their footing on the western edge of the CLOSING HUT-DOOR
plateau, extending over a fairly wide belt of
country. Neither Makua nor Makonde show in their dwellings
anything of the size and comeliness of the Yao houses in the plain,
especially at Masasi, Chingulungulu and Zuza’s. Jumbe Chauro, a
Makonde hamlet not far from Newala, on the road to Mahuta, is the
most important settlement of the tribe I have yet seen, and has fairly
spacious huts. But how slovenly is their construction compared with
the palatial residences of the elephant-hunters living in the plain.
The roofs are still more untidy than in the general run of huts during
the dry season, the walls show here and there the scanty beginnings
or the lamentable remains of the mud plastering, and the interior is a
veritable dog-kennel; dirt, dust and disorder everywhere. A few huts
only show any attempt at division into rooms, and this consists
merely of very roughly-made bamboo partitions. In one point alone
have I noticed any indication of progress—in the method of fastening
the door. Houses all over the south are secured in a simple but
ingenious manner. The door consists of a set of stout pieces of wood
or bamboo, tied with bark-string to two cross-pieces, and moving in
two grooves round one of the door-posts, so as to open inwards. If
the owner wishes to leave home, he takes two logs as thick as a man’s
upper arm and about a yard long. One of these is placed obliquely
against the middle of the door from the inside, so as to form an angle
of from 60° to 75° with the ground. He then places the second piece
horizontally across the first, pressing it downward with all his might.
It is kept in place by two strong posts planted in the ground a few
inches inside the door. This fastening is absolutely safe, but of course
cannot be applied to both doors at once, otherwise how could the
owner leave or enter his house? I have not yet succeeded in finding
out how the back door is fastened.