Modern
Mathematical
Statistics with
Applications
Second Edition

Jay L. Devore
California Polytechnic State University

Kenneth N. Berk
Illinois State University
Jay L. Devore
Statistics Department
California Polytechnic State University
San Luis Obispo, California, USA
jdevore@calpoly.edu

Kenneth N. Berk
Department of Mathematics
Illinois State University
Normal, Illinois, USA
kberk@ilstu.edu

Additional material to this book can be downloaded from http://extras.springer.com

ISBN 978-1-4614-0390-6 e-ISBN 978-1-4614-0391-3


DOI 10.1007/978-1-4614-0391-3
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2011936004

© Springer Science+Business Media, LLC 2012, corrected publication 2018


Contents
Preface

1 Overview and Descriptive Statistics
Introduction
1.1 Populations and Samples
1.2 Pictorial and Tabular Methods in Descriptive Statistics
1.3 Measures of Location
1.4 Measures of Variability

2 Probability
Introduction
2.1 Sample Spaces and Events
2.2 Axioms, Interpretations, and Properties of Probability
2.3 Counting Techniques
2.4 Conditional Probability
2.5 Independence

3 Discrete Random Variables and Probability Distributions
Introduction
3.1 Random Variables
3.2 Probability Distributions for Discrete Random Variables
3.3 Expected Values of Discrete Random Variables
3.4 Moments and Moment Generating Functions
3.5 The Binomial Probability Distribution
3.6 Hypergeometric and Negative Binomial Distributions
3.7 The Poisson Probability Distribution

4 Continuous Random Variables and Probability Distributions
Introduction
4.1 Probability Density Functions and Cumulative Distribution Functions
4.2 Expected Values and Moment Generating Functions
4.3 The Normal Distribution
4.4 The Gamma Distribution and Its Relatives
4.5 Other Continuous Distributions
4.6 Probability Plots
4.7 Transformations of a Random Variable

5 Joint Probability Distributions
Introduction
5.1 Jointly Distributed Random Variables
5.2 Expected Values, Covariance, and Correlation
5.3 Conditional Distributions
5.4 Transformations of Random Variables
5.5 Order Statistics

6 Statistics and Sampling Distributions
Introduction
6.1 Statistics and Their Distributions
6.2 The Distribution of the Sample Mean
6.3 The Mean, Variance, and MGF for Several Variables
6.4 Distributions Based on a Normal Random Sample
Appendix: Proof of the Central Limit Theorem

7 Point Estimation
Introduction
7.1 General Concepts and Criteria
7.2 Methods of Point Estimation
7.3 Sufficiency
7.4 Information and Efficiency

8 Statistical Intervals Based on a Single Sample
Introduction
8.1 Basic Properties of Confidence Intervals
8.2 Large-Sample Confidence Intervals for a Population Mean and Proportion
8.3 Intervals Based on a Normal Population Distribution
8.4 Confidence Intervals for the Variance and Standard Deviation of a Normal Population
8.5 Bootstrap Confidence Intervals

9 Tests of Hypotheses Based on a Single Sample
Introduction
9.1 Hypotheses and Test Procedures
9.2 Tests About a Population Mean
9.3 Tests Concerning a Population Proportion
9.4 P-Values
9.5 Some Comments on Selecting a Test Procedure

10 Inferences Based on Two Samples
Introduction
10.1 z Tests and Confidence Intervals for a Difference Between Two Population Means
10.2 The Two-Sample t Test and Confidence Interval
10.3 Analysis of Paired Data
10.4 Inferences About Two Population Proportions
10.5 Inferences About Two Population Variances
10.6 Comparisons Using the Bootstrap and Permutation Methods

11 The Analysis of Variance
Introduction
11.1 Single-Factor ANOVA
11.2 Multiple Comparisons in ANOVA
11.3 More on Single-Factor ANOVA
11.4 Two-Factor ANOVA with Kij = 1
11.5 Two-Factor ANOVA with Kij > 1

12 Regression and Correlation
Introduction
12.1 The Simple Linear and Logistic Regression Models
12.2 Estimating Model Parameters
12.3 Inferences About the Regression Coefficient β1
12.4 Inferences Concerning μY·x and the Prediction of Future Y Values
12.5 Correlation
12.6 Assessing Model Adequacy
12.7 Multiple Regression Analysis
12.8 Regression with Matrices

13 Goodness-of-Fit Tests and Categorical Data Analysis
Introduction
13.1 Goodness-of-Fit Tests When Category Probabilities Are Completely Specified
13.2 Goodness-of-Fit Tests for Composite Hypotheses
13.3 Two-Way Contingency Tables

14 Alternative Approaches to Inference
Introduction
14.1 The Wilcoxon Signed-Rank Test
14.2 The Wilcoxon Rank-Sum Test
14.3 Distribution-Free Confidence Intervals
14.4 Bayesian Methods

Erratum to: Statistics and Sampling Distributions

Appendix Tables
A.1 Cumulative Binomial Probabilities
A.2 Cumulative Poisson Probabilities
A.3 Standard Normal Curve Areas
A.4 The Incomplete Gamma Function
A.5 Critical Values for t Distributions
A.6 Critical Values for Chi-Squared Distributions
A.7 t Curve Tail Areas
A.8 Critical Values for F Distributions
A.9 Critical Values for Studentized Range Distributions
A.10 Chi-Squared Curve Tail Areas
A.11 Critical Values for the Ryan–Joiner Test of Normality
A.12 Critical Values for the Wilcoxon Signed-Rank Test
A.13 Critical Values for the Wilcoxon Rank-Sum Test
A.14 Critical Values for the Wilcoxon Signed-Rank Interval
A.15 Critical Values for the Wilcoxon Rank-Sum Interval
A.16 β Curves for t Tests

Answers to Odd-Numbered Exercises
Index

Preface
Purpose
Our objective is to provide a postcalculus introduction to the discipline of statistics
that
• Has mathematical integrity and contains some underlying theory.
• Shows students a broad range of applications involving real data.
• Is very current in its selection of topics.
• Illustrates the importance of statistical software.
• Is accessible to a wide audience, including mathematics and statistics majors
(yes, there are a few of the latter), prospective engineers and scientists, and those
business and social science majors interested in the quantitative aspects of their
disciplines.
A number of currently available mathematical statistics texts are heavily
oriented toward a rigorous mathematical development of probability and statistics,
with much emphasis on theorems, proofs, and derivations. The focus is more on
mathematics than on statistical practice. Even when applied material is included,
the scenarios are often contrived (many examples and exercises involving dice,
coins, cards, widgets, or a comparison of treatment A to treatment B).
So in our exposition we have tried to achieve a balance between mathematical
foundations and statistical practice. Some may feel discomfort on grounds that
because a mathematical statistics course has traditionally been a feeder into graduate
programs in statistics, students coming out of such a course must be well
prepared for that path. But that view presumes that the mathematics will provide
the hook to get students interested in our discipline. This may happen for a few
mathematics majors. However, our experience is that the application of statistics to
real-world problems is far more persuasive in getting quantitatively oriented
students to pursue a career or take further coursework in statistics. Let’s first
draw them in with intriguing problem scenarios and applications. Opportunities
for exposing them to mathematical foundations will follow in due course. We
believe it is more important for students coming out of this course to be able to
carry out and interpret the results of a two-sample t test or simple regression
analysis than to manipulate joint moment generating functions or discourse on
various modes of convergence.

Content
The book certainly does include core material in probability (Chapter 2), random
variables and their distributions (Chapters 3–5), and sampling theory (Chapter 6).
But our desire to balance theory with application/data analysis is reflected in the
way the book starts out, with a chapter on descriptive and exploratory statistical
techniques rather than an immediate foray into the axioms of probability and their
consequences. After the distributional infrastructure is in place, the remaining
statistical chapters cover the basics of inference. In addition to introducing core
ideas from estimation and hypothesis testing (Chapters 7–10), there is emphasis on
checking assumptions and examining the data prior to formal analysis. Modern
topics such as bootstrapping, permutation tests, residual analysis, and logistic
regression are included. Our treatment of regression, analysis of variance, and
categorical data analysis (Chapters 11–13) is definitely more oriented to dealing
with real data than with theoretical properties of models. We also show many
examples of output from commonly used statistical software packages, something
noticeably absent in most other books pitched at this audience and level.

Mathematical Level
The challenge for students at this level should lie with mastery of statistical
concepts as well as with mathematical wizardry. Consequently, the mathematical
prerequisites and demands are reasonably modest. Mathematical sophistication and
quantitative reasoning ability are, of course, crucial to the enterprise. Students with
a solid grounding in univariate calculus and some exposure to multivariate calculus
should feel comfortable with what we are asking of them. The several sections
where matrix algebra appears (transformations in Chapter 5 and the matrix approach
to regression in the last section of Chapter 12) can easily be deemphasized or
skipped entirely.
Our goal is to redress the balance between mathematics and statistics by
putting more emphasis on the latter. The concepts, arguments, and notation
contained herein will certainly stretch the intellects of many students. And a solid
mastery of the material will be required in order for them to solve many of the
roughly 1,300 exercises included in the book. Proofs and derivations are included
where appropriate, but we think it likely that obtaining a conceptual understanding
of the statistical enterprise will be the major challenge for readers.

Recommended Coverage
There should be more than enough material in our book for a year-long course.
Those wanting to emphasize some of the more theoretical aspects of the subject
(e.g., moment generating functions, conditional expectation, transformations, order
statistics, sufficiency) should plan to spend correspondingly less time on inferential
methodology in the latter part of the book. We have opted not to mark certain
sections as optional, preferring instead to rely on the experience and tastes of
individual instructors in deciding what should be presented. We would also like
to think that students could be asked to read an occasional subsection or even
section on their own and then work exercises to demonstrate understanding, so that
not everything would need to be presented in class. Remember that there is never
enough time in a course of any duration to teach students all that we’d like them to
know!

Acknowledgments
We gratefully acknowledge the plentiful feedback provided by reviewers and
colleagues. A special salute goes to Bruce Trumbo for going way beyond his
mandate in providing us an incredibly thoughtful review of 40+ pages containing
many wonderful ideas and pertinent criticisms. Our emphasis on real data would
not have come to fruition without help from the many individuals who provided us
with data in published sources or in personal communications. We very much
appreciate the editorial and production services provided by the folks at Springer, in
particular Marc Strauss, Kathryn Schell, and Felix Portnoy.

A Final Thought
It is our hope that students completing a course taught from this book will feel as
passionately about the subject of statistics as we still do after so many years in the
profession. Only teachers can really appreciate how gratifying it is to hear from a
student after he or she has completed a course that the experience had a positive
impact and maybe even affected a career choice.
Jay L. Devore
Kenneth N. Berk
CHAPTER ONE

Overview and Descriptive Statistics

Introduction
Statistical concepts and methods are not only useful but indeed often indispensable
in understanding the world around us. They provide ways of gaining
new insights into the behavior of many phenomena that you will encounter in your
chosen field of specialization.
The discipline of statistics teaches us how to make intelligent judgments
and informed decisions in the presence of uncertainty and variation. Without
uncertainty or variation, there would be little need for statistical methods or statisticians.
If the yield of a crop were the same in every field, if all individuals reacted
the same way to a drug, if everyone gave the same response to an opinion survey,
and so on, then a single observation would reveal all desired information.
An interesting example of variation arises in the course of performing
emissions testing on motor vehicles. The expense and time requirements of the
Federal Test Procedure (FTP) preclude its widespread use in vehicle inspection
programs. As a result, many agencies have developed less costly and quicker tests,
which it is hoped replicate FTP results. According to the journal article “Motor
Vehicle Emissions Variability” (J. Air Waste Manage. Assoc., 1996: 667–675), the
acceptance of the FTP as a gold standard has led to the widespread belief that
repeated measurements on the same vehicle would yield identical (or nearly
identical) results. The authors of the article applied the FTP to seven vehicles
characterized as “high emitters.” Here are the results of four hydrocarbon (HC) and
carbon monoxide (CO) tests on one such vehicle:

HC (g/mile)    13.8    18.3    32.2    32.5
CO (g/mile)    118     149     232     236

The substantial variation in both the HC and CO measurements casts considerable
doubt on conventional wisdom and makes it much more difficult to make precise
assessments about emissions levels.
How can statistical techniques be used to gather information and draw
conclusions? Suppose, for example, that a biochemist has developed a medication
for relieving headaches. If this medication is given to different individuals, variation
in conditions and in the people themselves will result in more substantial
relief for some individuals than for others. Methods of statistical analysis could
be used on data from such an experiment to determine on the average how much
relief to expect.
Alternatively, suppose the biochemist has developed a headache medication
in the belief that it will be superior to the currently best medication. A comparative
experiment could be carried out to investigate this issue by giving the current
medication to some headache sufferers and the new medication to others. This
must be done with care lest the wrong conclusion emerge. For example, perhaps
the two medications are really equally effective. However, the new medication may
be applied to people who have less severe headaches and have less stressful lives.
The investigator would then likely observe a difference between the two medications
attributable not to the medications themselves, but to a poor choice of test
groups. Statistics offers not only methods for analyzing the results of experiments
once they have been carried out but also suggestions for how experiments can
be performed in an efficient manner to lessen the effects of variation and have a
better chance of producing correct conclusions.

1.1 Populations and Samples


We are constantly exposed to collections of facts, or data, both in our professional
capacities and in everyday activities. The discipline of statistics provides methods
for organizing and summarizing data and for drawing conclusions based on information
contained in the data.
An investigation will typically focus on a well-defined collection of
objects constituting a population of interest. In one study, the population might
consist of all gelatin capsules of a particular type produced during a specified
period. Another investigation might involve the population consisting of all individuals
who received a B.S. in mathematics during the most recent academic year.
When desired information is available for all objects in the population, we have
what is called a census. Constraints on time, money, and other scarce resources
usually make a census impractical or infeasible. Instead, a subset of the population
(a sample) is selected in some prescribed manner. Thus we might obtain
a sample of pills from a particular production run as a basis for investigating
whether pills are conforming to manufacturing specifications, or we might select
a sample of last year’s graduates to obtain feedback about the quality of the
curriculum.

We are usually interested only in certain characteristics of the objects in a
population: the amount of vitamin C in the pill, the gender of a mathematics
graduate, the age at which the individual graduated, and so on. A characteristic
may be categorical, such as gender or year in college, or it may be numerical in
nature. In the former case, the value of the characteristic is a category (e.g., female
or sophomore), whereas in the latter case, the value is a number (e.g., age = 23
years or vitamin C content = 65 mg). A variable is any characteristic whose
value may change from one object to another in the population. We shall initially
denote variables by lowercase letters from the end of our alphabet. Examples
include
x = brand of calculator owned by a student
y = number of major defects on a newly manufactured automobile
z = braking distance of an automobile under specified conditions
Data comes from making observations either on a single variable or simultaneously
on two or more variables. A univariate data set consists of observations on a
single variable. For example, we might consider the type of computer, laptop (L)
or desktop (D), for ten recent purchases, resulting in the categorical data set

D L L L D L L D L L

The following sample of lifetimes (hours) of brand D batteries in flashlights is a
numerical univariate data set:

5.6 5.1 6.2 6.0 5.8 6.5 5.8 5.5

We have bivariate data when observations are made on each of two variables.
Our data set might consist of a (height, weight) pair for each basketball player on
a team, with the first observation as (72, 168), the second as (75, 212), and so on.
If a kinesiologist determines the values of x = recuperation time from an injury and
y = type of injury, the resulting data set is bivariate with one variable numerical
and the other categorical. Multivariate data arises when observations are made
on more than two variables. For example, a research physician might determine
the systolic blood pressure, diastolic blood pressure, and serum cholesterol level
for each patient participating in a study. Each observation would be a triple of
numbers, such as (120, 80, 146). In many multivariate data sets, some variables
are numerical and others are categorical. Thus the annual automobile issue of
Consumer Reports gives values of such variables as type of vehicle (small, sporty,
compact, midsize, large), city fuel efficiency (mpg), highway fuel efficiency
(mpg), drive train type (rear wheel, front wheel, four wheel), and so on.
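
The kinds of data sets described above map naturally onto simple data structures. The following Python sketch is illustrative only: the variable names are assumptions, not notation from the text, and the values are the ones quoted in this section.

```python
from collections import Counter
from statistics import mean

# Categorical univariate data: laptop (L) or desktop (D) for ten purchases
computer_types = ["D", "L", "L", "L", "D", "L", "L", "D", "L", "L"]
type_counts = Counter(computer_types)

# Numerical univariate data: battery lifetimes in hours
lifetimes = [5.6, 5.1, 6.2, 6.0, 5.8, 6.5, 5.8, 5.5]
mean_lifetime = mean(lifetimes)

# Bivariate data: one (height, weight) pair per basketball player
height_weight = [(72, 168), (75, 212)]

# Multivariate data: (systolic, diastolic, serum cholesterol) per patient
patients = [(120, 80, 146)]

print("Counts by computer type:", dict(type_counts))
print("Mean battery lifetime (hours):", round(mean_lifetime, 4))
```

Each observation is one element of a list; bivariate and multivariate observations are stored as tuples so that the values measured on the same object stay together.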

Branches of Statistics
An investigator who has collected data may wish simply to summarize and
describe important features of the data. This entails using methods from descriptive
statistics. Some of these methods are graphical in nature; the construction of
histograms, boxplots, and scatter plots are primary examples. Other descriptive
methods involve calculation of numerical summary measures, such as means,
standard deviations, and correlation coefficients. The wide availability of
statistical computer software packages has made these tasks much easier to
carry out than they used to be. Computers are much more efficient than
human beings at calculation and the creation of pictures (once they have
received appropriate instructions from the user!). This means that the investigator
doesn’t have to expend much effort on “grunt work” and will have more
time to study the data and extract important messages. Throughout this book,
we will present output from various packages such as MINITAB, SAS, and R.

Example 1.1 Charity is a big business in the United States. The website charitynavigator.com
gives information on roughly 5500 charitable organizations, and there are
many smaller charities that fly below the navigator’s radar screen. Some charities
operate very efficiently, with fundraising and administrative expenses that are
only a small percentage of total expenses, whereas others spend a high percentage
of what they take in on such activities. Here is data on fundraising expenses as
a percentage of total expenditures for a random sample of 60 charities:

6.1 12.6 34.7 1.6 18.8 2.2 3.0 2.2 5.6 3.8
2.2 3.1 1.3 1.1 14.1 4.0 21.0 6.1 1.3 20.4
7.5 3.9 10.1 8.1 19.5 5.2 12.0 15.8 10.4 5.2
6.4 10.8 83.1 3.6 6.2 6.3 16.3 12.7 1.3 0.8
8.8 5.1 3.7 26.3 6.0 48.0 8.2 11.7 7.2 3.9
15.3 16.6 8.8 12.0 4.7 14.7 6.4 17.0 2.5 16.2

Without any organization, it is difficult to get a sense of the data’s most prominent
features: what a typical (i.e., representative) value might be, whether values
are highly concentrated about a typical value or quite dispersed, whether there
are any gaps in the data, what fraction of the values are less than 20%, and so on.
Figure 1.1 shows a histogram. In Section 1.2 we will discuss construction and
interpretation of this graph. For the moment, we hope you see how it describes the
way the percentages are distributed over the range of possible values from 0 to 100.

Figure 1.1 A MINITAB histogram for the charity fundraising % data [image not reproduced; horizontal axis FundRsng, vertical axis Frequency]

Of the 60 charities, 36 use less than 10% on fundraising, and 18 use between 10%
and 20%. Thus 54 out of the 60 charities in the sample, or 90%, spend less than 20%
of money collected on fundraising. How much is too much? There is a delicate
balance; most charities must spend money to raise money, but then money spent on
fundraising is not available to help beneficiaries of the charity. Perhaps each
individual giver should draw his or her own line in the sand. ■
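
The counts quoted in Example 1.1 can be checked directly from the 60 observations. The short Python sketch below is not part of the original example; the variable names are illustrative, and the class boundaries (at least 10% but less than 20%) are an assumption that does not affect the result, since no observation equals exactly 10 or 20.

```python
# Fundraising expenses as a percentage of total expenditures (Example 1.1)
fundraising_pct = [
    6.1, 12.6, 34.7, 1.6, 18.8, 2.2, 3.0, 2.2, 5.6, 3.8,
    2.2, 3.1, 1.3, 1.1, 14.1, 4.0, 21.0, 6.1, 1.3, 20.4,
    7.5, 3.9, 10.1, 8.1, 19.5, 5.2, 12.0, 15.8, 10.4, 5.2,
    6.4, 10.8, 83.1, 3.6, 6.2, 6.3, 16.3, 12.7, 1.3, 0.8,
    8.8, 5.1, 3.7, 26.3, 6.0, 48.0, 8.2, 11.7, 7.2, 3.9,
    15.3, 16.6, 8.8, 12.0, 4.7, 14.7, 6.4, 17.0, 2.5, 16.2,
]

n = len(fundraising_pct)
under_10 = sum(1 for pct in fundraising_pct if pct < 10)
from_10_to_20 = sum(1 for pct in fundraising_pct if 10 <= pct < 20)
under_20 = under_10 + from_10_to_20
share_under_20 = under_20 / n

print(f"{under_10} of {n} charities spend less than 10%")
print(f"{from_10_to_20} of {n} charities spend between 10% and 20%")
print(f"{under_20} of {n} ({share_under_20:.0%}) spend less than 20%")
```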

Having obtained a sample from a population, an investigator would frequently
like to use sample information to draw some type of conclusion (make an
inference of some sort) about the population. That is, the sample is a means to an
end rather than an end in itself. Techniques for generalizing from a sample to a
population are gathered within the branch of our discipline called inferential
statistics.

Example 1.2 Human measurements provide a rich area of application for statistical methods.
The article “A Longitudinal Study of the Development of Elementary School Children’s
Private Speech” (Merrill-Palmer Q., 1990: 443–463) reported on a study of
children talking to themselves (private speech). It was thought that private speech
would be related to IQ, because IQ is supposed to measure mental maturity, and it
was known that private speech decreases as students progress through the primary
grades. The study included 33 students whose first-grade IQ scores are given here:

82 96 99 102 103 103 106 107 108 108 108 108 109 110 110 111 113
113 113 113 115 115 118 118 119 121 122 122 127 132 136 140 146

Suppose we want an estimate of the average value of IQ for the first graders
served by this school (if we conceptualize a population of all such IQs, we are
trying to estimate the population mean). It can be shown that, with a high degree
of confidence, the population mean IQ is between 109.2 and 118.2; we call this
a confidence interval or interval estimate. The interval suggests that this is an above
average class, because the nationwide IQ average is around 100. ■
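
The example does not say how the interval was computed. As a hedged illustration, the Python sketch below applies the one-sample t interval that the book develops in Chapter 8, under two assumptions that are not stated in the example: a 95% confidence level, and the tabled t critical value 2.037 for 32 degrees of freedom. With those assumptions the result agrees with the interval quoted above to one decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# First-grade IQ scores for the 33 students in Example 1.2
iq_scores = [
    82, 96, 99, 102, 103, 103, 106, 107, 108, 108, 108, 108, 109, 110, 110, 111, 113,
    113, 113, 113, 115, 115, 118, 118, 119, 121, 122, 122, 127, 132, 136, 140, 146,
]

n = len(iq_scores)
x_bar = mean(iq_scores)   # sample mean
s = stdev(iq_scores)      # sample standard deviation (divisor n - 1)

# Assumption: 95% confidence; 2.037 is the tabled t critical value for n - 1 = 32 df
T_CRITICAL = 2.037

half_width = T_CRITICAL * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width

print(f"n = {n}, sample mean = {x_bar:.2f}, sample sd = {s:.2f}")
print(f"interval estimate: ({lower:.1f}, {upper:.1f})")
```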

The main focus of this book is on presenting and illustrating methods of
inferential statistics that are useful in research. The most important types of inferential
procedures (point estimation, hypothesis testing, and estimation by confidence
intervals) are introduced in Chapters 7–9 and then used in more complicated settings
in Chapters 10–14. The remainder of this chapter presents methods from descriptive
statistics that are most used in the development of inference.
Chapters 2–6 present material from the discipline of probability. This material
ultimately forms a bridge between the descriptive and inferential techniques.
Mastery of probability leads to a better understanding of how inferential procedures
are developed and used, how statistical conclusions can be translated into everyday
language and interpreted, and when and where pitfalls can occur in applying the
methods. Probability and statistics both deal with questions involving populations
and samples, but do so in an “inverse manner” to each other.
In a probability problem, properties of the population under study are
assumed known (e.g., in a numerical population, some specified distribution of
the population values may be assumed), and questions regarding a sample taken
from the population are posed and answered. In a statistics problem, characteristics
of a sample are available to the experimenter, and this information enables the
experimenter to draw conclusions about the population. The relationship between
the two disciplines can be summarized by saying that probability reasons from
the population to the sample (deductive reasoning), whereas inferential statistics
reasons from the sample to the population (inductive reasoning). This is illustrated
in Figure 1.2.

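Figure 1.2 The relationship between probability and inferential statistics [diagram not reproduced; an arrow labeled Probability runs from Population to Sample, and an arrow labeled Inferential statistics runs from Sample to Population]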

Before we can understand what a particular sample can tell us about the
population, we should first understand the uncertainty associated with taking a
sample from a given population. This is why we study probability before statistics.
As an example of the contrasting focus of probability and inferential statistics,
consider drivers’ use of manual lap belts in cars equipped with automatic
shoulder belt systems. (The article “Automobile Seat Belts: Usage Patterns in
Automatic Belt Systems,” Hum. Factors, 1998: 126–135, summarizes usage
data.) In probability, we might assume that 50% of all drivers of cars equipped in
this way in a certain metropolitan area regularly use their lap belt (an assumption
about the population), so we might ask, “How likely is it that a sample of 100 such
drivers will include at least 70 who regularly use their lap belt?” or “How many
of the drivers in a sample of size 100 can we expect to regularly use their lap belt?”
On the other hand, in inferential statistics we have sample information available; for
example, a sample of 100 drivers of such cars revealed that 65 regularly use their lap
belt. We might then ask, “Does this provide substantial evidence for concluding that
more than 50% of all such drivers in this area regularly use their lap belt?” In this
latter scenario, we are attempting to use sample information to answer a question
about the structure of the entire population from which the sample was selected.
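
The two probability questions above can be answered numerically once a model is chosen. The Python sketch below is one possible illustration, not the book's own solution. It assumes that the 100 drivers behave independently, each with the same 50% chance of regular lap belt use, so that the number of users follows the binomial distribution introduced in Chapter 3. The function names are hypothetical. The last quantity, the chance of seeing 65 or more users if the 50% assumption were true, is one ingredient a formal test (Chapter 9) would use to address the inferential question.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) when X counts successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for the same binomial model."""
    return sum(binomial_pmf(k, n, p) for k in range(k_min, n + 1))

n, p = 100, 0.5   # sample size and assumed population proportion

expected_users = n * p                     # how many users to expect in the sample
p_at_least_70 = prob_at_least(70, n, p)    # probability question from the text
p_at_least_65 = prob_at_least(65, n, p)    # chance of a result like the observed 65

print(f"Expected number of regular users: {expected_users:.0f}")
print(f"P(at least 70 of 100) = {p_at_least_70:.6f}")
print(f"P(at least 65 of 100) = {p_at_least_65:.5f}")
```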
Suppose, though, that a study involving a sample of 25 patients is carried out
to investigate the efficacy of a new minimally invasive method for rotator cuff
surgery. The amount of time that each individual subsequently spends in physical
therapy is then determined. The resulting sample of 25 PT times is from a population
that does not actually exist. Instead it is convenient to think of the population as
consisting of all possible times that might be observed under similar experimental
conditions. Such a population is referred to as a conceptual or hypothetical population.
There are a number of problem situations in which we fit questions into the
framework of inferential statistics by conceptualizing a population.
Sometimes an investigator must be very cautious about generalizing from
the circumstances under which data has been gathered. For example, a sample of
five engines with a new design may be experimentally manufactured and tested to
investigate efficiency. These five could be viewed as a sample from the conceptual
population of all prototypes that could be manufactured under similar conditions,
but not necessarily as representative of the population of units manufactured once
regular production gets under way. Methods for using sample information to draw
conclusions about future production units may be problematic. Similarly, a new
drug may be tried on patients who arrive at a clinic, but there may be some question
about how typical these patients are. They may not be representative of patients
elsewhere or patients at the clinic next year. A good exposition of these issues is
contained in the article “Assumptions for Statistical Inference” by Gerald Hahn and
William Meeker (Amer. Statist., 1993: 1–11).

Collecting Data
Statistics deals not only with the organization and analysis of data once it has been
collected but also with the development of techniques for collecting the data. If data
is not properly collected, an investigator may not be able to answer the questions
under consideration with a reasonable degree of confidence. One common problem
is that the target population—the one about which conclusions are to be drawn—
may be different from the population actually sampled. For example, advertisers
would like various kinds of information about the television-viewing habits of
potential customers. The most systematic information of this sort comes from
placing monitoring devices in a small number of homes across the United States.
It has been conjectured that placement of such devices in and of itself alters viewing
behavior, so that characteristics of the sample may be different from those of the
target population.
When data collection entails selecting individuals or objects from a list, the
simplest method for ensuring a representative selection is to take a simple random
sample. This is one for which any particular subset of the specified size (e.g., a
sample of size 100) has the same chance of being selected. For example, if the list
consists of 1,000,000 serial numbers, the numbers 1, 2, . . . , up to 1,000,000 could
be placed on identical slips of paper. After placing these slips in a box and
thoroughly mixing, slips could be drawn one by one until the requisite sample
size has been obtained. Alternatively (and much to be preferred), a table of random
numbers or a computer’s random number generator could be employed.
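As a minimal sketch of the computer-based alternative, the following Python fragment draws a simple random sample of serial numbers without replacement. The function name, the seed, and the sample size of 100 are illustrative assumptions rather than anything prescribed by the text; the standard library's `random.sample` gives every subset of the requested size the same chance of selection.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct serial numbers from 1..population_size.

    Hypothetical helper for illustration: each subset of the given size
    is equally likely, which is the defining property of a simple
    random sample.
    """
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(range(1, population_size + 1), sample_size)

# Assumed setup: a list of 1,000,000 serial numbers, sample of size 100.
sample = simple_random_sample(1_000_000, 100, seed=2024)
print(sorted(sample)[:5])  # first few selected serial numbers
```

Sampling from a `range` avoids building a list of a million slips in memory, which is one practical reason the generator is preferred to the box of paper slips.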
Sometimes alternative sampling methods can be used to make the selection
process easier, to obtain extra information, or to increase the degree of confidence
in conclusions. One such method, stratified sampling, entails separating the
population units into nonoverlapping groups and taking a sample from each one.
For example, a manufacturer of DVD players might want information about
customer satisfaction for units produced during the previous year. If three different
models were manufactured and sold, a separate sample could be selected from each
of the three corresponding strata. This would result in information on all three
models and ensure that no one model was over- or underrepresented in the entire
sample.
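The stratified scheme just described might be sketched as follows. The model names, serial-number formats, stratum sizes, and the choice of 50 units per stratum are all hypothetical; the only point illustrated is that a separate simple random sample is taken within each nonoverlapping group.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of size `per_stratum` from each stratum.

    `strata` maps a stratum label to the list of units in that stratum.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, per_stratum)
            for name, units in strata.items()}

# Assumed data: serial numbers for three DVD player models sold last year.
strata = {
    "model_A": [f"A{i:05d}" for i in range(1, 5001)],
    "model_B": [f"B{i:05d}" for i in range(1, 3001)],
    "model_C": [f"C{i:05d}" for i in range(1, 2001)],
}
chosen = stratified_sample(strata, per_stratum=50, seed=1)
for model, units in chosen.items():
    print(model, len(units), units[:3])
```

Fixing the same sample size in every stratum guarantees information on all three models; sampling in proportion to stratum size would be an equally reasonable design choice.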
Frequently a “convenience” sample is obtained by selecting individuals or
objects without systematic randomization. As an example, a collection of bricks
may be stacked in such a way that it is extremely difficult for those in the center to
be selected. If the bricks on the top and sides of the stack were somehow different
from the others, resulting sample data would not be representative of the popula-
tion. Often an investigator will assume that such a convenience sample approx-
imates a random sample, in which case a statistician’s repertoire of inferential
methods can be used; however, this is a judgment call. Most of the methods
discussed herein are based on a variation of simple random sampling described in
Chapter 6.
Researchers often collect data by carrying out some sort of designed
experiment. This may involve deciding how to allocate several different treatments
(such as fertilizers or drugs) to the various experimental units (plots of land or
patients). Alternatively, an investigator may systematically vary the levels or
categories of certain factors (e.g., amount of fertilizer or dose of a drug) and
observe the effect on some response variable (such as corn yield or blood pressure).

Example 1.3 An article in the New York Times (January 27, 1987) reported that heart attack risk
could be reduced by taking aspirin. This conclusion was based on a designed
experiment involving both a control group of individuals, who took a placebo
having the appearance of aspirin but known to be inert, and a treatment group
who took aspirin according to a specified regimen. Subjects were randomly
assigned to the groups to protect against any biases and so that probability-based
methods could be used to analyze the data. Of the 11,034 individuals in the control
group, 189 subsequently experienced heart attacks, whereas only 104 of the 11,037
in the aspirin group had a heart attack. The incidence rate of heart attacks in the
treatment group was only about half that in the control group. One possible
explanation for this result is chance variation, that aspirin really doesn’t have the
desired effect and the observed difference is just typical variation in the same way
that tossing two identical coins would usually produce different numbers of heads.
However, in this case, inferential methods suggest that chance variation by itself
cannot adequately explain the magnitude of the observed difference. ■
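To make the last claim concrete, the sketch below computes the two incidence rates from the counts in Example 1.3 and then applies a pooled two-proportion z test. The z test is one common inferential method chosen here for illustration; the example itself does not say which method was used, and the variable names are assumptions.

```python
from math import sqrt, erfc

# Counts reported in Example 1.3
control_n, control_attacks = 11_034, 189
aspirin_n, aspirin_attacks = 11_037, 104

p_control = control_attacks / control_n   # incidence rate, placebo group
p_aspirin = aspirin_attacks / aspirin_n   # incidence rate, aspirin group
rate_ratio = p_aspirin / p_control        # "about half", per the example

# Pooled two-proportion z statistic (an illustrative choice of method)
pooled = (control_attacks + aspirin_attacks) / (control_n + aspirin_n)
std_err = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / aspirin_n))
z = (p_control - p_aspirin) / std_err

# Two-sided tail area under the standard normal curve
p_value = erfc(abs(z) / sqrt(2))

print(f"control rate = {p_control:.4f}, aspirin rate = {p_aspirin:.4f}")
print(f"rate ratio   = {rate_ratio:.2f}")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.1e}")
```

A z value near 5 corresponds to a very small tail area, which is the sense in which chance variation alone is an inadequate explanation for the observed difference.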

Exercises Section 1.1 (1–9)


1. Give one possible sample of size 4 from each of the following populations:
a. All daily newspapers published in the United States
b. All companies listed on the New York Stock Exchange
c. All students at your college or university
d. All grade point averages of students at your college or university

2. For each of the following hypothetical populations, give a plausible
sample of size 4:
a. All distances that might result when you throw a football
b. Page lengths of books published 5 years from now
c. All possible earthquake-strength measurements (Richter scale) that
might be recorded in California during the next year
d. All possible yields (in grams) from a certain chemical reaction carried
out in a laboratory

3. Consider the population consisting of all DVD players of a certain brand
and model, and focus on whether a DVD player needs service while under
warranty.
a. Pose several probability questions based on selecting a sample of 100
such DVD players.
b. What inferential statistics question might be answered by determining
the number of such DVD players in a sample of size 100 that need
warranty service?

4. a. Give three different examples of concrete populations and three
different examples of hypothetical populations.
b. For one each of your concrete and your hypothetical populations, give
an example of a probability question and an example of an inferential
statistics question.

5. Many universities and colleges have instituted supplemental instruction
(SI) programs, in which a student facilitator meets regularly with a small
group of students enrolled in the course to promote discussion of course
material and enhance subject mastery. Suppose that students in a large
statistics course (what else?) are randomly divided into a control group
that will not participate in SI and a treatment group that will participate.
At the end of the term, each student’s total score in the course is
determined.
a. Are the scores from the SI group a sample from an existing population?
If so, what is it? If not, what is the relevant conceptual population?
b. What do you think is the advantage of randomly dividing the students
into the two groups rather than letting each student choose which group to
join?
c. Why didn’t the investigators put all students in the treatment group?
[Note: The article “Supplemental Instruction: An Effective Component of
Student Affairs Programming” J. Coll. Stud. Dev., 1997: 577–586 discusses
the analysis of data from several SI programs.]

6. The California State University (CSU) system consists of 23 campuses,
from San Diego State in the south to Humboldt State near the Oregon border.
A CSU administrator wishes to make an inference about the average distance
between the hometowns of students and their campuses. Describe and discuss
several different sampling methods that might be employed.

7. A certain city divides naturally into ten district neighborhoods. A real
estate appraiser would like to develop an equation to predict appraised
value from characteristics such as age, size, number of bathrooms, distance
to the nearest school, and so on. How might she select a sample of
single-family homes that could be used as a basis for this analysis?

8. The amount of flow through a solenoid valve in an automobile’s
pollution-control system is an important characteristic. An experiment was
carried out to study how flow rate depended on three factors: armature
length, spring load, and bobbin depth. Two different levels (low and high)
of each factor were chosen, and a single observation on flow was made for
each combination of levels.
a. The resulting data set consisted of how many observations?
b. Does this study involve sampling an existing population or a conceptual
population?

9. In a famous experiment carried out in 1882, Michelson and Newcomb
obtained 66 observations on the time it took for light to travel between two
locations in Washington, D.C. A few of the measurements (coded in a certain
manner) were 31, 23, 32, 36, 22, 26, 27, and 31.
a. Why are these measurements not identical?
b. Does this study involve sampling an existing population or a conceptual
population?

1.2 Pictorial and Tabular Methods in Descriptive Statistics
There are two general types of methods within descriptive statistics. In this section
we will discuss the first of these types—representing a data set using visual
techniques. In Sections 1.3 and 1.4, we will develop some numerical summary
measures for data sets. Many visual techniques may already be familiar to you:
frequency tables, tally sheets, histograms, pie charts, bar graphs, scatter diagrams,
and the like. Here we focus on a selected few of these techniques that are most
useful and relevant to probability and inferential statistics.

Notation
Some general notation will make it easier to apply our methods and formulas to
a wide variety of practical problems. The number of observations in a single
sample, that is, the sample size, will often be denoted by n, so that n = 4 for
the sample of universities {Stanford, Iowa State, Wyoming, Rochester} and also
for the sample of pH measurements {6.3, 6.2, 5.9, 6.5}. If two samples are
simultaneously under consideration, either m and n or n1 and n2 can be used to
denote the numbers of observations. Thus if {3.75, 2.60, 3.20, 3.79} and {2.75,
1.20, 2.45} are grade point averages for students on a mathematics floor and the rest
of the dorm, respectively, then m = 4 and n = 3.

Construct a stem-and-leaf display using repeated stems (see the previous
exercise), and comment on any interesting features of the display.

13. The accompanying data set consists of observations on shower-flow rate
(L/min) for a sample of n = 129 houses in Perth, Australia (“An Application
of Bayes Methodology to the Analysis of Diary Records in a Water Use Study,”
J. Amer. Statist. Assoc., 1987: 705–711):

 4.6 12.3  7.1  7.0  4.0  9.2  6.7  6.9 11.5  5.1
11.2 10.5 14.3  8.0  8.8  6.4  5.1  5.6  9.6  7.5
 7.5  6.2  5.8  2.3  3.4 10.4  9.8  6.6  3.7  6.4
 8.3  6.5  7.6  9.3  9.2  7.3  5.0  6.3 13.8  6.2
 5.4  4.8  7.5  6.0  6.9 10.8  7.5  6.6  5.0  3.3
 7.6  3.9 11.9  2.2 15.0  7.2  6.1 15.3 18.9  7.2
 5.4  5.5  4.3  9.0 12.7 11.3  7.4  5.0  3.5  8.2
 8.4  7.3 10.3 11.9  6.0  5.6  9.5  9.3 10.4  9.7
 5.1  6.7 10.2  6.2  8.4  7.0  4.8  5.6 10.5 14.6
10.8 15.5  7.5  6.4  3.4  5.5  6.6  5.9 15.0  9.6
 7.8  7.0  6.9  4.1  3.6 11.9  3.7  5.7  6.8 11.3
 9.3  9.6 10.4  9.3  6.9  9.8  9.1 10.6  4.5  6.2
 8.3  3.2  4.9  5.0  6.0  8.2  6.3  3.8  6.0

a. Construct a stem-and-leaf display of the data.
b. What is a typical, or representative, flow rate?
c. Does the display appear to be highly concentrated or spread out?
d. Does the distribution of values appear to be reasonably symmetric? If
not, how would you describe the departure from symmetry?
e. Would you describe any observation as being far from the rest of the data
(an outlier)?

14. Do running times of American movies differ somehow from times of French
movies? The authors investigated this question by randomly selecting 25
recent movies of each type, resulting in the following running times:

Am:  94  90  95  93 128  95 125
     91 104 116 162 102  90 110
     92 113 116  90  97 103  95
    120 109  91 138

Fr: 123 116  90 158 122 119 125
     90  96  94 137 102 105 106
     95 125 122 103  96 111  81
    113 128  93  92

Construct a comparative stem-and-leaf display by listing stems in the middle
of your paper and then placing the Am leaves out to the left and the Fr
leaves out to the right. Then comment on interesting features of the display.

15. Temperature transducers of a certain type are shipped in batches of 50.
A sample of 60 batches was selected, and the number of transducers in each
batch not conforming to design specifications was determined, resulting in
the following data:

2 1 2 4 0 1 3 2 0 5 3 3 1 3 2 4 7 0 2 3
0 4 2 1 3 1 1 3 4 1 2 3 2 2 8 4 5 1 3 1
5 0 2 3 2 1 0 6 4 2 1 6 0 3 3 3 6 1 2 3

a. Determine frequencies and relative frequencies for the observed values of
x = number of nonconforming transducers in a batch.
b. What proportion of batches in the sample have at most five nonconforming
transducers? What proportion have fewer than five? What proportion have at
least five nonconforming units?
c. Draw a histogram of the data using relative frequency on the vertical
scale, and comment on its features.

16. In a study of author productivity (“Lotka’s Test,” Collection Manage.,
1982: 111–118), a large number of authors were classified according to the
number of articles they had published during a certain period. The results
were presented in the accompanying frequency distribution:

Number of papers    1    2    3   4   5   6   7   8
Frequency         784  204  127  50  33  28  19  19

Number of papers    9   10   11  12  13  14  15  16  17
Frequency           6    7    6   7   4   4   5   3   3

a. Construct a histogram corresponding to this frequency distribution. What
is the most interesting feature of the shape of the distribution?
b. What proportion of these authors published at least five papers? At least
ten papers? More than ten papers?
c. Suppose the five 15’s, three 16’s, and three 17’s had been lumped into a
single category displayed as “≥15.” Would you be able to draw a histogram?
Explain.
d. Suppose that instead of the values 15, 16, and 17 being listed separately,
they had been combined into a 15–17 category with frequency 11. Would you be
able to draw a histogram? Explain.

17. The article “Ecological Determinants of Herd Size in the Thorncraft’s
Giraffe of Zambia” (Afric. J. Ecol., 2010: 962–971) gave the following data
(read from a graph) on herd size for a sample of 1570 herds over a 34-year
period.

Herd size     1    2    3    4    5   6   7   8
Frequency   589  190  176  157  115  89  57  55

Herd size     9   10   11   12   13  14  15  17
Frequency    33   31   22   10    4  10  11   5

Herd size    18   19   20   22   23  24  26  32
Frequency     2    4    2    2    2   2   1   1
in its natural state water and solid matter in the proportion of 90 parts
of the former to 10 parts of the latter, and if, we suppose, these 10
parts of solid matter to be cholenic acid with 5.87 per cent. of
nitrogen, then 100 parts of bile must contain 0.171 of nitrogen in the
form of taurine, which quantity is contained in .06 parts of theine, or,
in other words, 272 grains of theine can give to an ounce of bile the
quantity of nitrogen it contains in the form of taurine. The action of
the compound in ordinary circumstances is not obvious, but that it
unquestionably exists and exerts itself in both tea and coffee is
proven by the fact that both were originally met with among nations
whose diet was chiefly vegetable. These facts clearly show in what
manner tea proves to the poor a substitute for animal food, and why
it is that females, literary persons and others of sedentary habits or
occupation, who take but little exercise, manifest such a partiality for
tea, and also explain why the numerous attempts made to substitute
other articles in its place have so signally failed.

TEA AS A STIMULANT.
“Life without stimulants would be a dreary waste,” remarks some
modern philosopher, which, if true, the moderate use of good tea,
properly prepared and not too strong, will be found less harmful than
the habitual resort to alcoholic liquors. The impression so long
existing that vinous or alcoholic beverages best excite the brain and
cause it to produce more or better work is rapidly being exploded,
healthier and more beneficial stimulants usurping their place. But
while the claims made in favor of the “wine cup” must be admitted, it
cannot for a moment be denied that as excellent literary work has
been accomplished under the influence of tea, in our own times,
particularly when the poet, the essayist, the historian, the statesman
and the journalist no longer work under the baneful influence of
spirituous stimulants. Mantegazza, an Italian physiologist of high
repute, who has given the action of tea and other stimulants careful
study, confirms this claim by placing tea above all other stimulants,
his enthusiasm for it being almost unbounded, crediting it with “the
power of dispelling weariness and lessening the annoyances of life,
classing it as the greatest friend to the man of letters by enabling him
to work without fatigue, and to society as an aid to conversation,
rendering it more easy and pleasant, reviving the drooping
intellectual activity and the best stimulus to exertion, and finally
pronouncing it to be one of the greatest blessings of Providence to
man.”
Tea was Johnson’s only stimulant; he loved it as much as Porson
loved gin, drinking it at all times and under all circumstances, in bed
and out, with his friends and alone, more particularly while compiling
his famous dictionary. Boswell drank cup after cup, as if it had been
the “Heliconian spring.” Hazlitt, like Johnson, was a prodigious
tea-drinker; Shelley’s favorite beverage was water, but at the same
time tea was always grateful to him. Bulwer’s breakfast was
generally composed of dry toast and cold tea, and De Quincey states
that he invariably drank tea from eight o’clock at night until four in the
morning when engaged in his literary labors, and knew whereof he
spoke when he named tea “the beverage of the intellectual.” Kant
usually had a cup of tea and a pipe of tobacco, on which he worked
eight hours at a stretch, and Motley, the historian, tells us that he
“usually rose at seven, and with the aid of a cup of tea only, wrote
until eleven.” Victor Hugo, as a general rule, used tea freely, but
fortified it with a little brandy. Turning from literature to politics, we
find that Palmerston resorted to tea during the midnight sessions of
Parliament, Cobden declared that “the more work he had to do the more
tea he drank,” and Gladstone himself confesses to drinking large
quantities of tea between midnight and morning during the prolonged
parliamentary sittings, while Clemenceau, the leader of the French
Radicals, admits himself to be “an intemperate tea-drinker” during
the fiery discussions of debate.
In moderation, tea is pre-eminently the beverage of the twilight hour,
when tired humanity seeks repose after a day of wearying labor.
Then the hot infusion with its alluring aroma refreshing and
stimulating, increasing the respiration, elevating the pulse, softening
the temper, producing tranquility in mind and body, and creating a
sense of repose peculiarly grateful to those who have been taxed
and tormented by the rush and routine of business cares and
vexations. What a promoter of sociability, what home comforts does
it not suggest, as, when Cowper, on a winter’s evening, draws a
cheerful picture of the crackling fire, the curtained windows, the
hissing urn and “the cup that cheers?” When, however, tea drinking
ceases to be the amusement of the leisure moment or resorted to in
too large quantities or strong infusions as a means of stimulating the
flagging energies to accomplish the allotted task, whatever it might
be, then distinct danger commences. A breakdown is liable to ensue
in more than one way, as not infrequently the stimulus which tea in
time fails to give is sought in alcoholic or other liquors, and the atonic
dyspepsia which the astringent decoction produced, by overdrawing
induces, helps to drive the victim to seek temporary relief in spirits
chloral or the morphine habit, which is established with extraordinary
rapidity. For it is a truth that as long as a person uses stimulants
simply for their taste he is comparatively safe, but as soon as he
begins to drink them for effect he is running into great danger. This
may be stating the case too forcibly for stimulants, but if this rule was
more closely adhered to we should have fewer cases of educated
people falling into the habit of secret intemperance or morphomania.

TEA AND THE POETS.


The subdued irascibility, the refreshed spirits, and the renewed
energies which the student and the poet so often owed to tea has
been the theme of many an accomplished pen, eminent writers of all
times and all countries considering it no indignity to extol the virtues
of this precious and fascinating beverage. What Bacchanalian and
hunting songs, cavalier and sea songs, rhapsodies and laudations of
other subjects have been to our literature, such was tea to the
writers, poets, artists and musicians of China and Japan, theirs being
confined to the simple subject—Tea. Each plantation was supposed
to possess its own peculiar virtues and excellences, not unlike the
vineyards of the Rhine, the Rhone and the Moselle, each had its
poet to sing its praises in running rhymes. One Chinese bard, who
seemingly was an Anacreon in his way, magnifying the product of
the Woo-e-shan mountains in terms literally translated as follows:—

“One ounce does all disorders cure.
With two your troubles will be fewer,
Three to the bones more vigor give,
With four forever you will live
As young as on your day of birth,
A true immortal on the earth.”

However hyperbolical this testimony may be considered, it at least
serves to show the high estimation in which the plant was held in
China.
The first literary eulogist to espouse the cause of the new drink in
Europe was Edmund Waller, reciting how he became first induced to
taste it. In a poem containing several references to the leaf occurs
the following pregnant allusion to tea:—

“The muses friend doth our fancy aid,
Repress these vapors which the head invade,
Keeping that palace of the soul serene.”

That Queen Anne ranked among its votaries is manifest from Pope’s
celebrated couplet:—

“Though great Anna, whom the realms obey,
Doth sometimes counsel take and—sometimes Tea.”

Johnson did not make verses in its honor, but he has drawn his own
portrait as “a hardened and shameless tea drinker, who for twenty
years diluted his meals with an infusion of this fascinating plant,
whose kettle had scarcely time to cool, who with tea amused the
evening, with tea solaced the night, and with tea welcomed the
morning.” While Brady, in his well-known metrical version of the
psalms, thus illustrates its advantages:—

“Over our tea conversations we employ,
Where with delight instructions we enjoy,
Quaffing without waste of time or wealth
The sovereign drink of pleasure and of health.”

Cowper’s praise of the beverage has been sadly hackneyed;
nevertheless, as the Laureate of the tea table, his lines are worthy of
reproduction here:—

“While the bubbling and loud hissing urn
Throws up a steaming column, and the cup
That cheers, but not inebriates, wait on each,
So let us welcome peaceful evening in.”

That Coleridge, in his younger days, must have liked tea is inferred
from the following stanza:—

“Though all unknown to Greek and Roman song,
The paler Hyson and the dark Souchong,
Which Kieu-lung, imperial poet praised
So high that cent. per cent. its price was raised.”

Gray eulogizing it:—

“Through all the room
From flowing tea exhales a fragrant fume.”

Byron, in his latter years, became an enthusiast on the use of tea,
averring that he “Must have recourse to black Bohea,” still later
pronouncing Green tea to be the “Chinese nymph of tears.” And in
addition to the praises sung to it by English-speaking poets and
essayists, its virtues have also been sounded by Herricken and
Francius in Greek verse, by Pecklin, in Latin epigraphs, by Pierre
Pettit, in a poem of five hundred lines, as well as by a German
versifier, who celebrated, in a fashion of his own, “The burial and
happy resurrection of tea.” In opposition to the “country parson,” who
calls tea “a nerveless and vaporous liquid,” and Balzac, who
describes it as an “insipid and depressing beverage,” the author of
“Eothen” records his testimony to “the cheering, soothing influence of
the steaming cup that Orientals and Europeans alike enjoy.”
CHAPTER IX.

WORLD’S PRODUCTION AND CONSUMPTION.

The first direct importation of tea into England was in 1669, and
consisted of but “100 pounds of the best tea that could be procured.”
In 1678 this order was increased to 4,713 pounds, which appears to
have “glutted the market;” the following six years the total
importations amounting to only 410 pounds during that entire period.
How little was it possible from these figures to have foreseen that tea
would one day become one of the most important articles of foreign
productions consumed.
Up to 1864 China and Japan were practically the only countries
producing teas for commercial purposes. In that year India first
entered the list as an exporter of tea, being subsequently followed by
Java and Ceylon. In 1864, when India first entered the list of tea-
producing countries, China furnished fully 97 per cent. of the world’s
supply and India only 3, the latter increasing at such a marvelous
rate that it now furnishes 57, China declining to 43 per cent. of the
total.

TABLE 1.
ESTIMATED TEA PRODUCTION OF THE WORLD.

Countries.         Production     Exportation
                    (Pounds).       (Pounds).
China,          1,000,000,000     300,000,000
Japan,            100,000,000      50,000,000
India,            100,000,000      95,000,000
Ceylon,            50,000,000      40,000,000
Java,              20,000,000      10,000,000
Singapore,             20,000          10,000
Fiji Islands,          30,000          20,000
South Africa,          50,000          20,000
                -------------     -----------
Total,          1,270,100,000     495,050,000
From these estimates it will be noted that China ranks first in tea-
producing countries, followed by Japan, India, Ceylon and Java in
the order of their priority; the total product of the other countries
having little or no effect as yet on the world’s supply.
This most important food auxiliary is now in daily use as a beverage
by probably over one-half the population of the entire world, civilized
as well as savage, the following being the principal countries of
consumption:—

TABLE 2.
ESTIMATED TEA CONSUMPTION OF THE WORLD.

Countries.               Consumption    Per capita
                           (Pounds).     (Pounds).
Austria,                   1,000,000       0.03
Australia,                18,000,000       4.50
Belgium,                     130,000       0.03
China,                   800,000,000       3.00
Canada,                   23,000,000       4.00
Central Asia,             13,000,000       ...
Denmark,                     850,000       0.37
France,                    1,250,000       0.03
Germany,                   4,000,000       0.09
Holland,                   5,000,000       1.20
Italy,                        60,000       0.01
India,                     5,000,000       ...
Japan,                    50,000,000       4.00
Java,                      5,009,000       1.00
Norway,                      165,000       0.09
New Zealand,               4,500,000       7.50
Portugal,                    600,000       0.12
Russia,                  100,000,000       1.70
Spain,                       275,000       0.02
Sweden,                      150,000       0.03
Switzerland,                 150,000       0.08
South Africa,                600,000       0.80
South America,            12,000,000       0.03
Straits Settlements,       1,000,000       ...
United States,            82,000,000       1.50
United Kingdom,          180,000,000       5.94
West Indies,                 300,000       0.03
                       -------------       ----
Total,                 1,308,039,000       1.67
From these estimates it will be observed that England ranks first in
the list of tea-consuming countries, the United States second, and
Russia third, the Australian colonies and Canada coming next in
order, comparatively little tea being used in France, Germany and
the other European countries. It is rarely used in some parts of the
globe, and is practically unknown in a great many other countries. It
is also apparent that 90 per cent. of the world’s supply is chiefly
consumed by English-speaking people, fully 75 per cent. of this
being used by England and her dependencies alone, the United
States being next in importance as a tea-consuming country. And it
may here be noted that while the world’s production of tea has been
very largely increased during the last quarter of a century in greater
ratio than that of any other of the great staples of commerce, the
production of China and Japan having increased at least 50 per cent.
in that period, to which must be added that of India and Ceylon, from
which countries little or none was received until a few years ago. Yet
it cannot be said that the consumption has increased in anything like
the same proportion, which will account for the great decline in price
in later years, and to prevent prices from going still lower it is evident
that new markets must be opened up for its sale in other countries
where it has not yet been introduced.

TABLE 3.
SUMMARY.

World’s Production,                         1,377,600,000
World’s Consumption,                        1,307,130,000
                                            -------------
Surplus,                                       70,470,000

or

Quantity exported,                            503,100,000
Consumption in non-producing countries,       432,630,000
                                              -----------
Surplus,                                       70,470,000
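The two routes to the surplus in Table 3 can be checked with a few lines of arithmetic. The sketch below uses only the figures printed in the table; the variable names are illustrative, and the quantity labelled `home_consumption` is an inference from those figures, not a number stated in the text.

```python
# Figures from Table 3, in pounds
production = 1_377_600_000
consumption = 1_307_130_000
exported = 503_100_000
consumed_in_non_producing = 432_630_000

# Route 1: everything grown minus everything drunk
surplus_world = production - consumption

# Route 2: everything shipped minus everything drunk outside the
# producing countries
surplus_trade = exported - consumed_in_non_producing

# Implied (not stated in the text): tea retained and consumed within the
# producing countries, which is why both routes give the same surplus.
home_consumption = production - exported

print(surplus_world, surplus_trade, home_consumption)
```

Both subtractions give the same figure because the tea kept at home by the producing countries cancels out of the comparison.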
In England, particularly, the increase in the consumption of tea in late
years borders on the marvelous, the figures for 1890 reaching
upwards of 195,000,000 pounds, which, at the present rate of
increase, will, in all probability, exceed 200,000,000 in 1892, as in
the quarter of a century between 1865 and 1890 the consumption
rose from 3½ to 5 pounds per capita of the population. But as in the
latter half of that period strong India teas were more freely used,
being increased appreciably by the similar Ceylon product in the
closing years of that time largely displacing the lighter liquored teas
of China, a larger consumption is indicated by the number of gallons
of liquid yielded. This is calculated on the moderate estimate formed
in a report to the Board of Custom to the effect that if one pound of
China leaf produces five gallons of liquor of a certain depth of color
and body, one pound of India tea will yield seven and a half gallons
of a similar beverage. Then by allowing for an apparent arrest of the
advancing consumption when the process of displacement was only
commencing, the increase in the consumption of tea in the British
Islands has not only been steady but rapid; thus, from 17 gallons per
head in 1865 to 24 in 1876, 28 in 1886, reaching 33½ gallons per
head per annum in 1890, the figures of last year almost exactly
doubling that of the first year of the series, so that in consequence of
the introduction of the stronger products of India and Ceylon the
people of Britain have been enabled to double their consumption of
the beverage, although the percentage of increase in the leaf has
been only from 3½ to 5 pounds during the same period. Ceylon tea,
which a decade ago was only beginning to intrude itself as a new
and suspiciously regarded competitor in the English market with
products so well known and established as the teas of China and
India, has recently made such rapid progress that in 1890, rated by
home consumption, it occupied third place on the list in the British
market: India teas 52 per cent., China 30 per cent., Ceylon
18 per cent.
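The gallons-per-head reasoning above can be reproduced approximately from the stated yields of five gallons per pound of China leaf and seven and a half per pound of India leaf. The sketch below rests on two assumptions that the text does not state outright: that the tea drunk in 1865 was all China leaf, and that Ceylon tea yields liquor at the India rate, so that the 1890 mix is 30 per cent. China and 70 per cent. India and Ceylon together.

```python
CHINA_GAL_PER_LB = 5.0   # gallons of liquor per pound of China leaf
INDIA_GAL_PER_LB = 7.5   # gallons per pound of India leaf (Ceylon assumed equal)

def gallons_per_head(pounds_per_head, china_share):
    """Estimated gallons of tea liquor per head per annum.

    `china_share` is the fraction of leaf that is China tea; the rest is
    treated as India/Ceylon leaf. Hypothetical helper for illustration.
    """
    strong_share = 1.0 - china_share
    return pounds_per_head * (china_share * CHINA_GAL_PER_LB
                              + strong_share * INDIA_GAL_PER_LB)

g_1865 = gallons_per_head(3.5, 1.0)    # assumed: all China leaf in 1865
g_1890 = gallons_per_head(5.0, 0.30)   # 30 per cent. China in 1890

print(f"1865: {g_1865:.2f} gallons per head")
print(f"1890: {g_1890:.2f} gallons per head")
print(f"ratio: {g_1890 / g_1865:.2f}")
```

Under these assumptions the estimates come out close to the 17 and 33½ gallons quoted in the text, and the ratio is close to two, consistent with the statement that consumption of the beverage roughly doubled while the weight of leaf rose only from 3½ to 5 pounds.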

TABLE 4.

Showing relative positions of kinds of Tea consumed in England, and
increase in pounds of same since 1880:—

Kind.            1880.           1885.          1890.
China,     126,000,000     113,500,000     60,000,000
India,      34,000,000      65,500,000     95,000,000
Ceylon,            ...       3,000,000     24,000,000
In 1868, when the price of tea was reduced in England to an average
of 36 cents per pound, the consumption increased to the heretofore
unprecedented figures of 107,000,000 pounds, while in 1888, when
the average price was again reduced to 20 cents, owing to the
enormous increase in the production of India and Ceylon teas, the
total consumption became augmented to 185,000,000 pounds,
comprised as follows, in round numbers:—
Kinds.                          Pounds.
China teas,                  80,000,000
India and Ceylon teas,      105,000,000
                            -----------
Total,                      185,000,000
The latter, for the first time on record, exceeded that of China teas,
being an almost exact inversion of the figures of 1886 in favor of
India and Ceylon teas, by which it will be seen that China is year by
year becoming of less importance as a source of tea supply to
English consumers. And as the demand becomes greater the
importations from India and Ceylon are constantly expanding, prices
being correspondingly reduced to an unprecedentedly low figure,
tea being now so cheap in the United Kingdom as to be in daily use in
almost every household. The relative positions of China, India and
Ceylon teas in England at the present writing are:

Kind.                 Consumption, Pounds.
India (estimated),             105,000,000
China (estimated),              50,000,000
Ceylon (estimated),             35,000,000
                               -----------
Total,                         180,000,000

The proportion of Black tea consumed in England is about as 5 to 1,
the per capita consumption ranging from 5 to 6 pounds for the entire
population.
Ceylon teas continue to grow in public favor to a marvelous extent in
England and, beyond participating in the natural growth of
consumption, they help fill up the yearly displacement of China teas.
The total production for 1890 was nearly 38,000,000 pounds against
over 30,000,000 pounds for 1889, and 18,500,000 pounds for 1888,
thus showing an increase of 19,500,000 pounds for the two years.
The supply for 1891 is about 40,000,000 pounds, the stock being
increased 3,000,000 pounds, which may be considered very
moderate and quite steady considering the steady all-round demand
there is for Ceylon teas in that country. But there is not the slightest
doubt but that the check which the consumption of China tea
appears to have sustained in England is entirely due to the forced
use of India and Ceylon teas in that country and her dependencies,
there being a positive revulsion of taste in many sections in favor of
the truer, purer, more delicate and richer China teas. Medical
opinions have been recently given to prove that the excessive
quantity of tannin contained in India and Ceylon teas is very injurious
to health, and a revival of the Chinese tea-trade may be confidently
expected in the future.
So far as the English tea-trade is concerned the market for China
and Japan teas is now but a tame affair to what it was only a few
years ago, little interest being taken there in the tea product of these
countries. Year by year since 1885 China and Japan teas have had
less hold upon the English market, and it is remarkable to note how
continuously the consumption of these varieties has been on the
decline there from that time, notwithstanding their superior merits in
drawing and drinking qualities over both Indias and Ceylons. In that
year their consumption in the British Isles amounted to over
113,000,000 pounds, but fell off to less than 105,000,000 pounds in
1886, to about 90,000,000 in 1887, to 80,000,000 in 1888, and to
60,000,000 in 1889. The quantity of China and Japan teas
consumed in the whole United Kingdom declined to about
50,000,000 pounds in 1890, although the prices for them were
exceedingly low during that period. There are two main causes for
this serious reduction which have been in operation simultaneously
and for a length of time. The first was the great competition of India
teas, stimulated for the reasons already named, and the second
the extraordinary favor that Ceylon teas found with English
consumers in 1888, when the quantity imported for use from that
island amounted to 18,500,000 pounds, or nearly double what it
was the preceding year. The quantities cleared for 1889 and 1890
were respectively 28,500,000 pounds and 34,500,000 pounds,
showing an astonishing increase within the short space of three
years, which fully accounts for the decadence of the English
demand for China and Japan teas. The consumption of the latter
varieties has retrograded there, while that for India and Ceylon teas
has increased proportionately, so that, although the market for the
former descriptions has occasionally given signs of revival, they
have been only spasmodic efforts at recovering, the much expected
and promised reaction soon subsiding. And instead of the
phenomenal cheapness of Chinas and Japans being regarded as a
recommendation to consumers it has been used as an argument by
British dealers as an evidence of their unpopularity, and so
completely has the demand been transferred from China and Japan
teas to Indias and Ceylons that it has been no uncommon
occurrence for the latter kinds to be selling at improving rates whilst
the former descriptions have been disposable only at drooping
prices.
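The year-by-year decline described above can be tabulated from the rounded figures quoted in the text. The following Python sketch is only an illustration: the names `consumption` and `yearly_changes` are assumptions introduced here, and the figures are the approximate ones given above ("over 113,000,000," "about 90,000,000," and so on), expressed in millions of pounds.

```python
# Consumption of China and Japan teas in the United Kingdom, in millions
# of pounds, using the rounded figures quoted in the text (approximate).
consumption = {
    1885: 113.0,
    1886: 105.0,
    1887: 90.0,
    1888: 80.0,
    1889: 60.0,
    1890: 50.0,
}


def yearly_changes(series):
    """Return (year, change, percent change) for each consecutive pair of years."""
    years = sorted(series)
    rows = []
    for prev, cur in zip(years, years[1:]):
        change = series[cur] - series[prev]
        rows.append((cur, change, 100.0 * change / series[prev]))
    return rows


for year, change, pct in yearly_changes(consumption):
    print(f"{year}: {change:+.1f} million lb ({pct:+.1f}%)")

# Overall fall between the first and last years quoted.
total_drop = consumption[1885] - consumption[1890]
share_lost = 100.0 * total_drop / consumption[1885]
print(f"1885 to 1890: -{total_drop:.1f} million lb ({share_lost:.1f}% of the 1885 figure)")
```

On these rounded figures the consumption of China and Japan teas fell by rather more than half in five years, the largest single drop being that of 1889.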
The enormous size of the tea estates in India and Ceylon as
compared with the small gardens of China and Japan gives the
growers in the former countries several advantages over those in the
latter as they can be worked more systematically and with less
expense in larger areas. The use of machinery in curing and firing
also lessens the cost of preparation for market, together with a
saving in freight and quicker sale consequent to English preferences
giving a speedier return for the money invested. The advantages
which India and Ceylon tea-growers have over those of China
include a greater command of capital, as in both India and Ceylon tea estates
are generally owned by companies consisting of shareholders whose
living is not dependent on the product of the plantations. The
companies can consequently afford to carry on the business at a
loss for several years, can purchase extensive tea lands, and can
spend large sums on machinery, labor and experiments as well as
on agents to introduce and distribute their teas. The India and Ceylon
tea-growers can obtain loans at a lower rate of interest, borrowing
money at from 4 to 5 per cent., while their Chinese competitors have
to pay from 20 to 30 per cent. for the same accommodations, in
addition to a command of better chemical and agricultural
knowledge. But against these admitted advantages of India and
Ceylon, China possesses one great advantage, that is, that the
Chinese grower, working for himself instead of wages, brings greater
care and more industry to the task. Experience with him takes the
place of science, and he is thus enabled to produce a finer flavored
tea than has yet been, or ever will be, produced in either India or Ceylon.
Again the great decline in the consumption of China teas in England
and her dependencies cannot be attributed, as is so loudly
proclaimed by her statisticians, to any falling off in the quality of
China teas or any inherent merit possessed by those of India or
Ceylon, but simply to the narrow and contracted policy of her
merchants of favoring and forcing the product of her colonies to the
prejudice if not positive exclusion of that of the older tea-growing
countries.
In 1865 China exported over 120,000,000 pounds of tea, in 1870
nearly 170,000,000 pounds, in 1880 over 214,000,000 pounds,
reaching the enormous total of 221,000,000 pounds in 1890, thus
China’s export has also been increasing in a proportionate degree.
But although the figures for 1870 and 1890 show that in twenty years
it has nearly doubled, still it is not such a remarkable increase
relatively when compared with that of India, which during the same
period has increased nearly fourteen fold in quantity. In estimating
the probability of a recovery in the position of China teas in the
markets of the world the following considerations are of interest on
the subject: First, it is well known that the heavy Likin (grower’s tax),
Kutang (transit dues) and export duties levied on tea have
contributed in a great measure to the decadence of the tea-trade in
that country and to the development of that of India and Ceylon,
where the article, at least, starts free and unencumbered. The
Chinese, laboring under this disadvantage at the outset, have
endeavored to compete with India and Ceylon by reducing the cost
of production and lowering their standard of quality with a
consequent deterioration in the grade of the leaf. This changed
condition of the tea-trade may be attributed to these specific causes.
Fifty years ago India and Ceylon produced no tea, as it was not until
1840 that the export from the former began with a small venture of
400 pounds; since that year, however, the increase has been both
rapid and striking. Thus, commencing in 1840, the export has
steadily increased year after year until now, when the average
annual production reaches 100,000,000 pounds, of which England
consumes some 97,000,000 pounds, the balance going to Australia
and others of her colonies. It is contended by the Chinese themselves
that if the Likin and export duties were removed entirely or the export
duty alone reduced to an ad valorem charge of 5 per cent. it would
greatly help those engaged in the China tea-trade in their
competition with the growers and shippers of India and Ceylon,
others holding that a simple reduction of the duty will not
permanently benefit the China tea-trade unless it enables China to
lay down teas in Europe and America at a less price than can be
done by either India or Ceylon.
Russia is now regarded as the main hope of Chinese Congous and
sorts, the British islands consuming Indias and Ceylons almost
exclusively, the United States favoring Oolongs and Japans
principally. The trade in China teas with Russia is increasing
annually, while it is decreasing with England. In former years tea was
first shipped to England and thence to Russia, the Russian tea-
dealers now purchasing direct from China. The Russian demand
seems, in fact, to grow as fast as that from England declines,
constituting a total which is hardly suspected by those who are
interested in the trade, so that, although ousted from her monopoly,
China has still a great market for her produce.
Great quantities of tea are consumed in the domains of the Czar and
it is believed that the Russians use as much tea per capita as the
Chinese themselves. The “Samovar” or tea-urn is always steaming
and the natives never cease sipping tea while there is water left to
make it. It is served at all hours of the day, in palace as well as
hovel, being regarded as much a necessary of life there as bread or
tobacco. Shops abound for its sale in the principal cities; bargains
are made and business transactions sealed over steaming tumblers of
tea.

TABLE 5.

The earliest official record of the importation of Tea into the United
States is in 1790, the order of increase for its importation, value and
consumption in the country by decades since that year being as
follows:

Year.     Imports,        Value.   Consumption  Average Import
           Pounds.                 per capita,          Price,
                                       Pounds.          Cents.
1790,    3,022,983        ......           ...             ...
1800,    5,119,341        ......           ...             ...
1810,    7,708,208        ......           ...             ...
1820,       ......        ......           ...             ...
1830,    8,609,415    $2,425,018          0.53            22.3
1840,   20,006,595     5,427,010          0.99            24.1
1850,   29,872,654     4,719,232          0.87            27.9
1860,   31,696,657     8,915,327          0.84            26.3
1870,   47,408,481    13,863,273          1.10            29.4
1880,   72,162,936    19,782,631          1.39            27.2
1890,   84,627,870    13,360,685          1.40            20.0

The first duty levied on tea by the United States was in 1789, when a
tax of 15 cents was imposed on all Black teas, 22 cents on Imperial
and Gunpowder, and 55 cents on Young Hyson. But in order to
stimulate American shipping these duties were reduced to 8, 13 and
26 cents respectively, the following year, when imported from Europe
in American vessels, and to 6, 10 and 20 cents when imported direct
from China in the same manner. In 1794, however, the rates were
increased 75 per cent. on direct importations, and 100 per cent. on
all teas shipped from Europe, but again reduced to 12, 18 and 32
cents in 1796, the latter rates being doubled during the War of 1812.
In 1828 this tax was again reduced, being entirely removed in 1830,
except when imported in foreign bottoms, when a duty of 10 cents
per pound was collected. The latter rate continued in force up to the
outbreak of the Rebellion in 1861, when a uniform duty of 15 cents
per pound was placed on all teas, which was eventually increased to
20 cents and finally to 25 cents per pound. In January, 1871, this
duty was reduced to 15 cents, being entirely removed in July, 1872,
since which year tea has been uninterruptedly on the free list in the
United States.

TABLE 6.

Showing net imports, value and per capita consumption of tea in the
United States, from 1880 to 1891, inclusive:

Year.  Net Imports,        Value.   Per Capita,
            Pounds.                     Pounds.
1880,    69,894,760   $18,983,368          1.39
1881,    79,130,849    20,225,418          1.54
1882,    77,191,060    18,975,045          1.47
1883,    69,597,945    16,278,894          1.30
1884,    60,061,944    12,313,200          1.09
1885,    65,374,365    13,135,782          1.18
1886,    78,873,151    15,485,265          1.37
1887,    87,481,186    16,365,633          1.49
1888,    83,944,547    13,154,171          1.40
1889,    79,192,253    12,561,812          1.28
1890,    83,494,956    12,219,633          1.33
1891,    82,395,924    13,639,785          1.32
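
The net imports and values in Table 6 imply an average declared value per pound for each year, which makes the fall in tea prices over the period easier to see. The sketch below is a hedged illustration only: the dictionary `table6` and the helper `cents_per_pound` are names assumed here, and the calculation simply divides the printed value by the printed quantity.

```python
# Net imports (pounds) and declared value (dollars) as printed in Table 6.
table6 = {
    1880: (69_894_760, 18_983_368),
    1881: (79_130_849, 20_225_418),
    1882: (77_191_060, 18_975_045),
    1883: (69_597_945, 16_278_894),
    1884: (60_061_944, 12_313_200),
    1885: (65_374_365, 13_135_782),
    1886: (78_873_151, 15_485_265),
    1887: (87_481_186, 16_365_633),
    1888: (83_944_547, 13_154_171),
    1889: (79_192_253, 12_561_812),
    1890: (83_494_956, 12_219_633),
    1891: (82_395_924, 13_639_785),
}


def cents_per_pound(pounds, dollars):
    """Average declared value in cents per pound of tea."""
    return 100.0 * dollars / pounds


for year in sorted(table6):
    pounds, dollars = table6[year]
    print(f"{year}: {cents_per_pound(pounds, dollars):5.1f} cents per pound")

# Years with the highest and lowest implied value per pound.
dearest = max(table6, key=lambda y: cents_per_pound(*table6[y]))
cheapest = min(table6, key=lambda y: cents_per_pound(*table6[y]))
print("highest:", dearest, "lowest:", cheapest)
```

Because the figures are taken as printed, any misprint in the table carries through to the implied prices.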

TABLE 7.

Estimated average annual Quantity and Value of tea imported into
the United States:

Countries.      Quantity,        Value.
                  Pounds.
China,         43,000,000    $7,000,000
Japan,         38,000,000     5,500,000
India,            100,000        20,000
Java,             200,000        30,000
Ceylon,           100,000        20,000
England,        3,000,000       650,000
Ireland,            1,000           500
Scotland,          12,000         2,500
Germany,           10,000         2,000
Russia,               200            60
Belgium,               50            25
Canada,           300,000        50,000
               ----------   -----------
Total,         85,000,000   $13,000,000

The average annual exports range from 1,000,000 to 5,000,000
pounds.
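
The totals in Table 7 are stated in round numbers, so it may help to see how closely the individual entries add up to them. The following Python sketch, in which the name `table7` and the derived variables are assumptions made for illustration, sums the printed rows and compares the results with the printed totals of 85,000,000 pounds and $13,000,000.

```python
# Quantity (pounds) and value (dollars) by country, as printed in Table 7.
table7 = {
    "China": (43_000_000, 7_000_000),
    "Japan": (38_000_000, 5_500_000),
    "India": (100_000, 20_000),
    "Java": (200_000, 30_000),
    "Ceylon": (100_000, 20_000),
    "England": (3_000_000, 650_000),
    "Ireland": (1_000, 500),
    "Scotland": (12_000, 2_500),
    "Germany": (10_000, 2_000),
    "Russia": (200, 60),
    "Belgium": (50, 25),
    "Canada": (300_000, 50_000),
}

total_pounds = sum(pounds for pounds, _ in table7.values())
total_value = sum(value for _, value in table7.values())
print(f"sum of quantities: {total_pounds:,} lb")
print(f"sum of values:     ${total_value:,}")

# Share of the summed quantity supplied by the two largest sources.
china_japan_share = 100.0 * (table7["China"][0] + table7["Japan"][0]) / total_pounds
print(f"China and Japan together: {china_japan_share:.1f}% of the quantity")
```

The rows sum to a little under the printed totals, which is consistent with the totals having been rounded to the nearest million.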

TABLE 8.

Showing varieties most in demand in the United States:

Varieties.               Kinds.                  Quantity,
                                                   Pounds.
Oolong,                  (Formosa),             10,000,000
Oolong,                  (Amoy and Foochow),     8,000,000
Green Teas,              (all kinds),           10,000,000
Japans,                  (all kinds),           38,000,000
Pekoes and Congous,      (China),               10,000,000
India, Java and Ceylon,                          6,000,000
                                                ----------
Total,                                          82,000,000

During the fiscal year ending June 30, 1890, there were imported into
the United States, at all ports, 84,627,870 pounds of tea, of which
43,043,651 pounds were received from China and 37,627,560
pounds from Japan, the balance consisting of imports from India,
Java and Ceylon, received via England and Holland. The United
States official reports show that tea represents 27 per cent. of the
total value of imported merchandise into this country. The gross
trade in the article, however, even at retail prices, does not exceed
$35,000,000, the total annual value of all food products being about
$220,000,000, of which tea only represents a value of $13,000,000,
equivalent to about 6 per cent. of the whole.
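Two of the figures in this paragraph follow by simple arithmetic from the others: the balance of imports not received from China or Japan, and the share that tea bears to the stated total for food products. A minimal Python sketch, with variable names assumed here for illustration and all amounts taken as quoted in the text:

```python
# Imports for the fiscal year ending June 30, 1890, in pounds, as quoted.
total_imports = 84_627_870
from_china = 43_043_651
from_japan = 37_627_560

# Remainder attributed in the text to India, Java and Ceylon teas.
balance = total_imports - from_china - from_japan
print(f"balance from other sources: {balance:,} lb")

# Tea as a share of the stated value of all food products (dollars).
food_products_value = 220_000_000
tea_value = 13_000_000
tea_share = 100.0 * tea_value / food_products_value
print(f"tea share of food products: {tea_share:.1f}%")
```

The share works out to a little under 6 per cent., in agreement with the figure given.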
In round numbers the consumption of tea in the principal importing
countries has increased from 350,000,000 pounds in 1880 to
upwards of 400,000,000 pounds in 1892, to which may be added for
the minor consuming countries another 60,000,000 pounds, in which
case we get a grand total of 460,000,000 pounds. Tea consumption
in India and Ceylon is scarce worth computing, and it is also claimed
that the consumption in China has been greatly exaggerated, for
although the Chinese drink tea constantly much of the liquor is little
different from hot water, so that to credit China and her feudatories
with another 500,000,000 pounds would be an extravagant estimate.
But, admitting it to be near the mark, we may then take in round
numbers 1,000,000,000 pounds of leaf, or say 6,000,000,000
gallons, as the world’s annual consumption of tea. But it is
confidently predicted that, if peace be preserved and wealth and
civilization continue to advance, there will be a much greater increase during
the closing years of the present century and the whole of the
twentieth century, for large portions of mankind are at length
discovering that alcohol with its “borrowed fire” is a deceiver and a
curse. If the civilization of an age or a community can be tested by
the quantity of sulphuric acid which it uses, much more certainly can
the moral status of a time and a people be judged by a comparison
of the quantities of alcoholic and non-alcoholic stimulants it uses.
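The world estimate above is built up from three rounded components, and the conversion to gallons implies a fixed yield of liquor per pound of leaf. The sketch below restates that arithmetic in Python; the variable names are assumptions made here, and the yield of six gallons per pound is only what the two round totals in the text imply, not a separately stated figure.

```python
# Components of the estimate, in millions of pounds, as given in the text.
principal_importers = 400   # "upwards of 400,000,000 pounds in 1892"
minor_consumers = 60        # minor consuming countries
china_and_feudatories = 500 # called an extravagant estimate in the text

importing_total = principal_importers + minor_consumers
world_total = importing_total + china_and_feudatories
print(f"importing countries: {importing_total} million lb")
print(f"with China added:    {world_total} million lb (text rounds to 1,000)")

# Yield implied by the round totals of 1,000,000,000 lb and 6,000,000,000 gallons.
gallons_per_pound = 6_000_000_000 / 1_000_000_000
print(f"implied yield: {gallons_per_pound:.0f} gallons of tea per pound of leaf")
```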
All teas have declined one-half in value during the past ten years,
owing to the increased production of India and Ceylon; the position
of the market at the present time is, however, unique and unusual.
Heretofore the rule has been for the supply to exceed the demand,
particularly of China tea, it being the custom to claim that the market
would never run short of the latter, as the production could be
increased to meet any sudden or excessive demand. Now, however,
the position is entirely different, the shortage in China tea the past
year reaching some 21,000,000 pounds, to which must be added the
increase in consumption of 11,500,000 pounds, due in a measure to
the reduction of the duty in England, against which deficit is to be
placed the increase of production in India of 3,000,000 pounds, and
that of Ceylon of 15,000,000 pounds, but still leaving a shortage of
14,000,000 pounds. This position has led to an advance in China
common grades, part of which is undoubtedly due to speculation.
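The shortage described above is a balance of two deficits against two increases, and can be checked directly. In this Python sketch the variable names are assumptions introduced for illustration and the amounts are those quoted, in millions of pounds. The result comes to 14.5 million, close to the 14,000,000 pounds stated, which may simply reflect rounding in the source figures.

```python
# Amounts in millions of pounds, as quoted in the text.
china_shortage = 21.0        # shortfall in China tea during the past year
consumption_increase = 11.5  # added demand, partly from the reduced English duty
india_increase = 3.0         # increased production in India
ceylon_increase = 15.0       # increased production in Ceylon

deficit = china_shortage + consumption_increase
offset = india_increase + ceylon_increase
net_shortage = deficit - offset

print(f"total deficit:  {deficit:.1f} million lb")
print(f"offset:         {offset:.1f} million lb")
print(f"net shortage:   {net_shortage:.1f} million lb")
```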
