STATISTICAL REASONING
IN PSYCHOLOGY AND EDUCATION
Second Edition
Edward W. Minium
prepared by
Gordon Bear
Ramapo College of New Jersey
Acknowledgments
Ed Minium, whose text has made statistics so easy for me to teach and so
easy for my students to learn, and who also provided a careful critique of
the first three chapters of the workbook;
Jack Burton, my editor, who supplied much valuable guidance and understood
when the work expanded to fill the time available;
Bob Abelson, Barry Collins, and Fred Sheffield, who were my own conscientious
instructors in the subject of statistics;
my many students, from whom I have learned a great deal about the teaching
of statistics and other matters;
John Harsh, who offered to share with me the fruits of his labor on his
mastery-plan workbook;
Bob Worsham, who contributed the data used in the homework for Chapter 9;
Joe Fontanazza, who found the fantastic machine on which I typed both the
rough and the final drafts;
A PERSONAL STATEMENT from the AUTHOR of this WORKBOOK
I want to help you learn statistics and earn a good grade in this course.
You're off to a good start, because your text is the book by my colleague
Ed Minium, and it's an outstanding one. I taught from the first edition of
this book ten times: seven times at the University of Wisconsin at Madison,
where my classes had 60 to 90 students, and three times at Ramapo College of
New Jersey, where my classes had 10 to 30 students. At both schools the book
proved to be excellent for developing a thorough understanding of statistics,
and now the second edition is even better. Many other texts supply only the
bare facts, but Dr. Minium gives you much more. He shows you the overall
structure into which the facts fit, and he adds details that provide insights
into matters treated only superficially in other texts.
The purpose of my workbook is to help you master Dr. Minium's text. You'll
find here:
But please note: You will not be able to "cram" successfully by reading only
the summaries here. The summaries cannot substitute for the text itself. And
the exercises I offer you cannot substitute for those in the text either. In
fact, I have generally constructed exercises that differ markedly from the ones
in the text, to provide you with other kinds of opportunities for learning.
To get the most out of your course in statistics, attend class regularly,
study your text carefully, do the problems there—and then come to this workbook
and let me help you get really good at this stuff. I've tried hard to make this
book something special, and I'd be delighted to know that you used it.
TABLE OF CONTENTS
Author's Statement v
Are You Worried about the Mathematics in this Course? 1
Tips on Buying a Calculator 2
Chapter 1 Introduction 3
Chapter 2 Preliminary Concepts 9
Chapter 3 Frequency Distributions 19
Chapter 4 Graphic Representation 33
Chapter 5 Central Tendency 39
Chapter 6 Variability 47
Chapter 7 The Normal Curve 63
Chapter 8 Derived Scores 71
Chapter 9 Correlation 79
Chapter 10 Factors Influencing the Correlation Coefficient 87
Chapter 11 Regression and Prediction 91
Chapter 12 Interpretive Aspects of Correlation and Regression 97
Chapter 13 The Basis of Statistical Inference 103
Chapter 14 The Basis of Statistical Inference: Further Considerations 113
Chapter 15 Testing Hypotheses About Single Means: Normal Curve Model 119
Chapter 16 Further Considerations in Hypothesis Testing 129
Chapter 17 Testing Hypotheses About Two Means: Normal Curve Model 137
Chapter 18 Estimation of μ and μX - μY 155
ARE YOU WORRIED ABOUT THE MATHEMATICS IN THIS COURSE?
If so, you have plenty of company; many students are worried, too.
Consider this:
•Your text emphasizes the logic of statistics, not the theorems, formulas,
and proofs that mathematicians work with. The title of the
book is "Statistical Reasoning in Psychology and Education," and it's
reasoning, not mathematics, that's important here.
•Your instructor (and your teaching assistant, if you have one) will
go over the material in the book and answer any questions you have.
Moreover:
TIPS ON BUYING A CALCULATOR
You don't need anything fancy. You'll have no use for the special features
of the "scientific" calculators that are meant to replace a slide rule—no use
for the keys for pi (π), logarithms, exponentiation, or the trigonometric
functions sine, cosine, tangent, and cotangent. (I told you this course doesn't
require fancy mathematics.) You do want the following:
•an add-on memory, which permits you to add a number to another number
already in storage. The key for doing so is usually labeled M+. A machine
with add-on memory usually also has keys labeled M-, MR or RM, and MC or
CM, in which case it's said to have a four-function or four-key memory.
•keys that are big enough for you to press easily and accurately. Too much
miniaturization is a liability.
Be a smart shopper:
•Ask about the stores' policies on defective merchandise. What will they do
if your purchase malfunctions after you get it home? The store should agree
that if it breaks within 30 days, they will replace it with a new machine
from their own stock, rather than sending the old one to the factory for
repair. Ask the salesperson to write "30-day exchange" on your receipt and
sign it.
You should be able to get what you want for less than $20.
CHAPTER 1
INTRODUCTION
Now here's a list of the problems and exercises at the end of Chapter 1.
Again, you can keep track of your progress by checking off those you did and
noting which ones you answered correctly, which ones you want to ask your
instructor or your teaching assistant about, and the like.
1. ________
2. ________
3. ________
4. ________
SUMMARY
problems. The techniques fall into two categories: descriptive statistics and
____________ [first sentence of Section 1.1 in the text]. As for inferential
statistics, the object is ____________.

Those who work with statistics might be divided into four classes: (1)
____________, (2) those who must select and apply ____________,
____________. The interest of those in the first two classes is in
statistics itself / their own subject matter [cross out the incorrect
wording]. We might think of them as ____________ (the first two kinds of
persons) ____________ in finding statistical techniques and applying
statistical techniques than in the theory behind them. In contrast, the
fourth kind ____________ a statistical (that is, numerical) property of the
data. Upon applying a statistical technique, ____________; the substantive
conclusion derives entirely / only partly from the statistical conclusion
[1.6]. Thus statistical procedures are tools that enable a researcher to
move from a ____________.
Here are five accusations about the field of applied statistics. Each
contains some truth, but not the whole truth. No one is trying to tell you
what opinions to adopt, but the author of your text and I would like you to
see both sides of each issue. So fill in a counterargument for each
accusation. Those in the text appear in Section 1.8.
groups of people and not with individuals. ____________
[Concept map: statistical techniques consist of Descriptive Statistics and
Inferential Statistics, and an analysis ends with a Substantive Conclusion.]
TOKYO, Oct. 27—This month, the 10th one of the year, the 72-year-old Takeo
Fukuda, this nation's 13th post-war Prime Minister, leads the 113 million
citizens of Japan in marking the fourth anniversary of a very special event in the
official life of Japan—Statistics Day.
But there is likely no nation that [now] ranks higher in its collective
passion for statistics.
In Japan, statistics are the subject of a holiday, local and national
conventions, awards ceremonies and nationwide statistical collection and
graph-drawing contests.
"This year," said Yoshiharu Takahashi, a Government statistician, "we have
almost 30,000 entries. Actually, we had 29,836."
judges, who gave first prize this year to the work of five 7-year-olds. Their
graph creation, titled "Mom, play with us more often," was the result of a
survey of 32 classmates on the frequency that mothers play with their
offspring and the reasons given for not doing so (the most often heard excuse:
"I'm just too busy").
But there is one figure that won't be included: Officials do not yet keep
statistics on the number of statistics they keep. "We don't know," says Mr.
Takahashi, "they are countless."
[Copyright 1977 by The New York Times Company. Reprinted by permission. The
actual date of Statistics Day is October 18.]
Note the sense in which the word statistics is used in this article (in
the sixth paragraph, for example). This is not the kind you're studying in
this course. Any 7-year-old can comprehend the kind the article talks about.
What you're studying is techniques for describing this kind and for drawing
proper inferences from them.
CHAPTER 2
PRELIMINARY CONCEPTS
Here's a list of the sections that make up Chapter 2 in the text. As for
Chapter 1, if you'd like to keep track of your progress, there's space to write
in things like "Okay," "Reread," or "Ask about this."
1 2
3 4
5 6
7 8
9 10
11 12
13 14
SUMMARY
one sense, it refers to the people (or whatever) the investigator is studying,
and the term designates the group about which the investigator wishes to
____________. ____________ is called a(n) ____________ [2.1]. Of the two
definitions of population, the one that
a random sample. For a sample to be random, it must have been selected in such
a way that ____________ [2.2]. Suppose we drew several random samples of the
same size from a given population. It is highly
unlikely that we would get exactly the same collection of elements in each
sample. Thus the characteristics of the sample as a whole will change / stay
the same from sample to sample [2.2]. This phenomenon is called sampling
variation.
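The text's point about sampling variation can be seen directly in a short simulation. This Python sketch is my addition, not part of the workbook; the population and the two sample sizes are invented for illustration:

```python
import random

random.seed(42)

# An invented population: 10,000 "scores" between 0 and 100.
population = [random.uniform(0, 100) for _ in range(10_000)]

def spread_of_sample_means(sample_size, n_samples=500):
    """Draw many random samples of one size and return the standard
    deviation of their means, a measure of sampling variation."""
    means = []
    for _ in range(n_samples):
        sample = random.sample(population, sample_size)
        means.append(sum(sample) / sample_size)
    grand_mean = sum(means) / len(means)
    variance = sum((m - grand_mean) ** 2 for m in means) / len(means)
    return variance ** 0.5

print("spread of means, n = 10: ", round(spread_of_sample_means(10), 2))
print("spread of means, n = 100:", round(spread_of_sample_means(100), 2))
# The second figure is noticeably smaller: larger samples vary less.
```

Each run draws 500 samples of each size; the spread for samples of 100 comes out several times smaller than the spread for samples of 10, just as the summary says.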
The second important property of random samples is the effect of the size
of the sample on the amount of variability that occurs among different samples
of the same size. The larger the samples, the more / the less the variation
[2.2]. A larger sample will provide us with a more / less precise estimate of
what is true about the ____________.
In Section 2.3 the text lists four meanings for the term statistics. In
the first two senses the word is singular and refers to branches of mathematics.
data; this is the kind of statistics that consists of descriptive and inferen¬
the branch of mathematics, owing much to the theory of ____________, that
provides the theory behind the descriptive and the inferential techniques.
The third meaning of the word listed in Section 2.3 is the meaning that
the word takes in common usage. The word has already been defined in this
the word in this sense as a set of ____________, such as averages. In this sense
we may use the word in the singular and speak of a statistic, meaning a single
numerical fact or a single index. The fourth meaning of the word statistics
is a refinement of the third. In this sense the word is again plural, but it
refers not just to any old numerical facts, but to indices that describe a
____________--the college they attend--does not vary from person to person. So
far as our study goes, it is not possible for this characteristic to have other
than a single value.
Other characteristics, like the sex of a subject, can vary from one person
to another; such characteristics are called ____________ [2.4, Paragraph 2],
and an example is a person's sex, which can be either male or female.
The two categories, male and female, differ in kind but not in degree. To
designate one person as male and another as female is to say that they differ,
but the designations do not say that one is more of something than the other.
or sisters) that a person has, which can vary from 0 up to 10 or more. Another
example is a person's height. In each case, people who differ in where they
These two examples, number of siblings and height, illustrate the two
types of quantitative variable. Number of siblings can take only certain values
namely 0, 1, 2, 3, and so on. No person can have 0.6 or 1.2 siblings. Values
of such variables are stepwise in nature, and such variables are said to be
discrete / continuous [2.5]. In contrast, height can take any value within
the range of possible heights. A person can be exactly five feet tall, or just
a little more than five feet, or just a little more than that, and so on.
Whereas a discrete variable has gaps in its scale, this kind of variable has none,
Accuracy in Measurement
recorded an exact number. Numbers lacking this kind of accuracy are known as
variable, such numbers arise if our method of collecting data has ____________
the potential accuracy in the discrete values [2.6]. This happens when we
Now suppose the variable we're working with is quantitative and continu¬
ous, such as height. Even though a variable is continuous in theory, the pro¬
Levels of Measurement
categorizing, ranking, and scoring. (This information is not in the text but
subject to indicate the category into which the subject falls. (The number 1
might be used to indicate the category "male" and the number 2, the category
male" for the variable sex) are said to constitute a(n) nominal / ordinal /
interval / ratio scale [2.7]. All observations placed in the same category
be used here, but if numbers are used to identify the categories of a given
nominal scale, the numbers are simply a substitute for the _ of the
The subject with the greatest magnitude (the subject with the greatest merit
ranked 2; and so on. In this type of measurement, the numbers form a(n)
nominal / ordinal / interval / ratio scale [2.7]. The basic relation expressed
(The 1, for example, says that the workman given this rank is greater in
competence than the workman given the rank 2.) However, nothing is implied about
(The difference in merit between the man ranked first and the man ranked
second may be large or small, and this difference is not necessarily the same as
that between the man ranked second and the one ranked third.) Further, nothing
is implied about the absolute amount of the characteristic possessed [2.7].
(All workmen could be excellent, or all could be quite ordinary,
stands on the variable of interest, without regard for where anyone else stands.
The number is a true score that indicates how much of the variable the sub¬
ject is thought to possess, and the difference between one score and another
the possible scores form an interval scale, a given numerical interval along
[2.7, Paragraph 4]. (The difference of 1 point between the scores of -5 and
-4, for example, represents the same difference as the difference of 1 point
between the scores of, say, 0 and +1 or +4 and +5.)
When measurement is at this level, the level of the interval scale, one
may talk meaningfully about the ____________ between intervals [2.7, Paragraph 5],
ence in competence as the interval of 1 point between +3 and +4, for example,
that a workman rated at +3 has three times the competence as a workman rated
at +1, even though 3 ÷ 1 = 3.) The reason is that the ____________ point is
arbitrarily determined, and does not imply an absence of the characteristic being
If the numbers available for scoring form a ratio scale, they possess all
the desirable features of the interval scale, and in addition the ratio be¬
tween ____________ becomes meaningful [Table 2.1]. On a ratio scale, the
find and repair in a widget rigged with a dozen malfunctions, for example.
This point is obvious. A more subtle point is that numbers are often used
in psychology and education that look like they fall on an interval or even a
ratio scale, but we cannot be sure that they really do. Some authorities
advise the use of statistical techniques appropriate for ordinal scales in such
cases. But as Dr. Minium implies at the end of Section 2.8 in his text, the
weight of the evidence suggests that in most situations, it is okay to treat
the ambiguous numbers as though they came from an interval or even from a ratio
scale.
-- MNEMONIC* TIP --
A statistic (in the fourth sense of the word listed in Section 2.3) is a
characteristic of a sample, and a parameter is a characteristic of a popula¬
tion. You can easily remember the distinction by noting that statistic and
sample both begin with an S, while parameter and population both begin with
a P.
*Mnemonic ("nem-ON-ik"): pertaining to memory.
EXERCISES*
You are one of those bright young people who are making a lot of money these
days by conducting telephone polls for politicians. Governor Grassroots hires
you to tell her what proportion of the adults in her state approve of her work
as the state's chief executive. You get your staff busy making telephone calls
around the state, and they ask each adult they reach, "Do you think the present
governor is doing a good job?" You yourself make the first call, which is an¬
swered by John Q. Public, who says "Yes." In all, your firm completes 500 such
quickie interviews.
1. ____________
2. ____________
What is the sample that you have drawn? Again define it in two senses.
3. ____________
4. ____________
How large is your sample? 5. ________  Mr. Public (or his answer to your question)
*The answers to these and all other exercises here appear at the back of
this book, beginning on p. 227.
Sex  8. ________  9. ________  10. ________
17. Now we come to an ambiguity. Suppose the answers to the question about
the governor's performance are recorded as "Yes," "No," and "No Opinion."
Should the answers be treated as categories making up a nominal scale for the
measurement of a qualitative variable? Why or why not?
18. Here's another question for which the answer is not clear cut. Marketing
researchers often ask their respondents to rate a product by saying "Excellent,"
"Good," "Fair," "Poor," or "Bad," and they score these answers as 5, 4, 3, 2,
and 1, respectively. This scoring is a way of measuring favorability (or
unfavorability) of opinion toward the product. What level of measurement is this?
CHAPTER 3
FREQUENCY DISTRIBUTIONS
In case you're just starting now to use this workbook: the blank lines on
this page are explained on the title pages for Chapters 1 and 2.
1 ____  2 ____  3 ____
4 ____  5 ____  6 ____
7 ____  8 ____  9 ____
10 ____  11 ____  12 ____
13 ____  14 ____  15 ____
16 ____  17 ____  18 ____
19 ____  20 ____  21 ____
22 ____  23 ____  24 ____
SUMMARY
the _est and the _est score values [3.2]. Then list all possible
score values / only the scores that actually occurred , including these two
extremes, in ascending / descending order down the page [3.2]. Finally, add
a second column to the right of the first one. In this column, list for each
Frequency is abbreviated Freq in Tables 3.2 and 3.3 and thereafter symbolized ____.
Grouping Scores
tribution. Grouping makes it easier to display the data and grasp the essen¬
tial facts they contain. In grouping, the various possible scores are collected
into a number of class _ [3.3] . Here are some rules for making
these groupings:
4. The interval containing the highest score should be placed at the top /
5. For most work, there should be not fewer than ____ class intervals and
Disadvantages of Grouping
A grouped frequency distribution does not contain all the information that
the corresponding ungrouped one does, because within a given class interval,
one cannot tell exactly where the scores fell. Because this information is
not given, a problem called grouping error can arise. It is often necessary
to make calculations using the numbers in a grouped frequency distribution
(this very chapter discusses calculations of centile points and centile ranks,
for example), but in doing so, one cannot tell exactly where each score fell
along the scale of possible values, and so it is necessary to make an assump¬
tion about where the scores occurred. Sometimes it is assumed that the scores
within a given class interval are distributed evenly throughout the interval,
and sometimes it is assumed that they fell in any way such that the midpoint
of the interval is the average of the scores in that interval.
it will be in error. Other things being equal, the narrower the class
interval width, the more / less the potentiality for grouping error [3.6].
A set of raw scores results / does not result in a unique set of grouped
scores [3.3]. That is, there is more than one way to construct a grouped fre¬
Relative Frequencies
scores. There are two kinds of relative frequency, proportion and percentage.
To compute a proportional frequency, divide the raw frequency by the total
number of cases in ____________. ____________ one-half ____________ [3.1].
When scores are grouped into class intervals, the limits of a class inter¬
val can be given as score limits or as exact limits. The score limits are
merely the lowest and the highest raw scores that fall into the class inter¬
val. The exact limits are the lower limit of the lowest score and the upper
limit of the highest score. (See Section 3.4, Principle 6.)
The difference between the upper exact limit and the lower exact limit is
the width of the class interval and is symbolized with the letter ____ [3.5].
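For scores recorded to the nearest whole unit, the exact limits lie half a unit below and above the score limits, so the width is just the difference between them. Here is a small Python sketch of that rule (mine, not the text's; the interval 73-75 matches the worked example later in this chapter):

```python
def exact_limits(low_score, high_score, unit=1.0):
    """Exact limits of a class interval whose scores are recorded to
    the nearest `unit`: half a unit below and above the score limits."""
    return low_score - unit / 2, high_score + unit / 2

lower, upper = exact_limits(73, 75)
width = upper - lower          # the interval width
print(lower, upper, width)     # 72.5 75.5 3.0
```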
numbers in the left-hand column are the upper exact limits of the class
intervals. For each class interval, a cumulative frequency distribution shows how
many cases lie above / below the upper exact limit [3.8]. The number of
cases below a given upper exact limit is the cumulative frequency for that
to the total number of cases, n. Again, the relative figure can be a
proportion ____________ [Table 3.6]. To compute one, divide the raw
cumulative frequency by ____ [3.8].
The proportional cumulative frequency for the upper exact limit of the top-
most interval will always equal 1.00 (as Table 3.6 shows). A cumulative
percentage frequency is obtained by multiplying the proportional cumulative
frequency by ____ [3.8]. The cumulative percentage frequency for the upper
exact limit of the topmost class interval will always equal 100.
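The whole bookkeeping in this summary (raw frequencies, cumulative frequencies, proportions, and percentages) can be sketched in a few lines of Python. This is my illustration, not the text's; the five frequencies are invented:

```python
# Invented frequencies for five class intervals, bottom interval first.
freqs = [4, 6, 10, 7, 3]
n = sum(freqs)

cum_f = []
running = 0
for f in freqs:
    running += f          # cases at or below each upper exact limit
    cum_f.append(running)

prop_cum_f = [cf / n for cf in cum_f]       # divide cum f by n
cum_pct_f = [100 * p for p in prop_cum_f]   # multiply the proportion by 100

print(cum_f)        # the last entry equals n
print(cum_pct_f)    # the last entry equals 100, as the summary says
```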
EXERCISES
Here's a table of the kind we've been reviewing. Never mind what the
scores mean; that's irrelevant for now. Just think about the internal logic
of the table. I have given you the top value in the cum f column and four
figures in the column of cum %f's. From these numbers you can determine all
the missing information. Just recall what you know about the topmost value
in the cum f column, and remember how you go about finding the cum %f's.
Give this problem a good, honest try. You'll feel really proud if you
figure it out for yourself. If you need help, though, it's available at the
back of the book.
If you think you can't figure the table out, look at the hint below and then
try again.
[A hint appears here, printed upside down in the original.]
Here are two more problems of the same kind, to help you understand the
connections among the different numbers in tables like these.
Score Limits     f      cum f    cum %f
23 - 27        ____       12      ____
18 - 22        ____      ____       67
13 - 17        ____      ____       50
 8 - 12        ____      ____       33
 3 - 7         ____      ____      ____
Score Limits Exact Limits f Prop. f %f cum f Prop, cum f cum %f
496 - 505      495.5 - 505.5    ____   ____   ____   ____   ____   ____
____ - 485     ____ - 485.5     ____   ____   ____   ____   ____   ____
____ - 474     ____ - 474.5     ____   ____   ____   ____   ____   ____
Here are some principles that describe the connections among the various
parts of tables like these. Note how they're illustrated in the tables above
(when the numbers are correctly filled in).
1. The sum of the f values = n (or N) = the top number in the cum f column.
2.
The top number in the Prop, cum f column is 1.00, and the top number in
the cum %f column is 100.
[Concept map: A Collection of Scores on a Quantitative Variable is well
described by a Frequency Distribution, which shows the frequencies with which
scores occurred, either uncumulated or cumulative, and either as raw numbers
or as relative numbers, with or without grouping. Grouping collects scores
into Class Intervals, which should be mutually exclusive and continuous, and
it can cause Grouping Error.]
SUMMARY, Continued
percentile rank are used instead.) A centile point (sometimes called just a
centile) is a point along the scale of possible scores (which is assumed to
be a continuum), and it falls among the numbers shown on the left side of the
table. (It may be helpful to label the left-most column of Table 3.6 "Centile
Points.") A centile point is named by specifying the percentage of scores in
the distribution that fall below it. Thus the 96th centile point in a given
distribution is a certain point along the scale of scores, namely that point
that cuts off the bottom 96% of the distribution. The symbol for the 96th
centile point is ____. Because a centile point is a point along the scale of
scores, it may have any value that a score may have. The
number of cases in the distribution. Centile ranks fall among the cumulative
(It may be helpful to label the right-most column of Table 3.6 "Centile Ranks.")
As percentages, centile ranks may take values only between ____ and ____ [3.9].
The centile rank for the score 90.5 in Table 3.6 is 96.
EXERCISES
To practice using the concepts of centile point and centile rank, label
the left-most column of Table 3.6 "Centile Points" and the right-most column
"Centile Ranks," if you haven't done so already. Note again that the value
of 90.5 on the left goes with the cum %f of 96 on the right. In terms of
centile points and centile ranks, 90.5 has a centile rank of 96, and the 96th
centile point is 90.5. Now answer these questions about Table 3.6.
5. C28 is ____  6. C52 is ____
In the early 1960s, researchers working for the federal government
approached about 400 young men and 500 young women, asked them to take off their
shoes and step on a scale, and measured their height and weight. Let's focus
on just the heights for the men. The 400-odd heights were cast into a
cumulative percentage frequency distribution like Table 3.6, and certain centile
points were found, namely the 1st, 5th, 10th, 20th, 30th, and so on up to the
90th, 95th, and 99th. Instead of publishing a table of the kind you're now
familiar with, though, the National Center for Health Statistics reported just
these centiles. They're shown on the left below.
To practice using the concepts of interest now, ask yourself what the
50th centile point (C50) for the men's heights is. A centile point is a
point along the scale of scores, remember, so here it will be a height.
The 50th centile point is that score that has 50% of the scores below it.
The table tells you that it's 68.6", so we now know that half the men in
this sample were under 68.6" in height.
A man who stands six feet tall (72.0") has what centile rank in
comparison to these other fellows? Asking for a centile rank is asking for a
percentage, so the answer must be between 0 and 100. The table shows that
the centile rank is somewhat less than 90. For a height of 72.4, which is
the closest we can get to 72.0 using the entries in the table, the centile
rank is exactly 90, and a shorter height would have a smaller centile rank,
of course. So a six-footer is taller than almost 90% of the men in the
sample.
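As a rough cross-check on that reasoning, one can interpolate between the two centile points quoted above (C50 = 68.6" and C90 = 72.4"). Straight-line interpolation between centile points is only an approximation, and this Python sketch is my addition, not the National Center's method:

```python
def estimated_rank(score, low_point, low_rank, high_point, high_rank):
    """Estimate a centile rank by assuming scores are spread evenly
    between two known centile points (an approximation only)."""
    fraction = (score - low_point) / (high_point - low_point)
    return low_rank + fraction * (high_rank - low_rank)

# C50 = 68.6 inches and C90 = 72.4 inches, from the table described above.
rank = estimated_rank(72.0, 68.6, 50, 72.4, 90)
print(round(rank))   # about 86: a six-footer outranks most of the sample
```

The estimate of about 86 agrees with the reasoning in the text: somewhat less than 90, but not by much.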
*****
_ 1. How tall do you have to be, at minimum, to exceed the height
of the shortest 20% of the men in this sample?
____ 5. C60 = ?
____ The middle 20% of the scores in any distribution run from C40
to C60. Half of the 20%, namely 10%, lie between C40 and C50,
which is the score in the very middle. The other half of the
20% lie between C50 and C60. In this distribution, the middle
20% of the scores lie between which two heights?
The middle 90% of the scores lie between which two values?
How many men in this sample were between 65.4" and 66.5" tall?
Enough on male heights. The National Center for Health Statistics treated
the heights of the women, the weights of the men, and the weights of the women
in the way they treated the data you just examined in detail. The other
three distributions are tabled above, and the set of four will enable you to
compare your height and weight with the measurements for people of your own
sex (and for most students, for people of their own age). The samples were
large (for men, n was about 411; for women, n was about 534—the researchers
did not supply the exact figures), and they were drawn to be representative
of the noninstitutionalized population of 18- to 24-year-olds in the lower
48 states. Thus you can be quite confident that the centile ranks shown in
these tables are close to the figures for the full populations.
Students often ask at this point about the correlation between height and
weight. Common observation tells us that taller people tend to be heavier,
so there is in fact a correlation between height and weight for human beings.
These four tables do not, however, provide any evidence on the correlation
for either sex. So far as we can tell from the tables, the 50% of the men
who are shorter than 68.6" could be the same 50% who are heavier than 157 lb.
Ch. 9 will show you what kind of table provides evidence for the existence of
a correlation between one variable and another.
[The tables above were derived from Weight, Height, and Selected Body
Dimensions of Adults, National Center for Health Statistics Series 11, No.
8 (Washington: U.S. Government Printing Office, 1965).]
SUMMARY, Concluded
Look at Table 3.6 again. You should have labeled the left-most column
"Centile Points." Sometimes we find ourselves wondering about a centile point
whose centile rank is not one of those shown in the table. (In Table 3.6, for
example, we might wonder where the 50th centile point falls.) Such a centile
point will not be one of the upper exact limits shown on the left side of the
table, and the table does not directly give its location. It can be estimated,
though, through a procedure called linear ____________ [3.10, footnote]. This
procedure rests on the assumption that the scores in a given class interval
are ____________. Sometimes we wonder instead about the centile rank of a
score that is not one of the upper exact limits shown on the left side of the
table. The centile rank for such a score will not be found on the right in
the column of cumulative percentage frequencies. (In Table 3.6, for example,
we might wonder what the centile rank for a score of 75 is.) It can also be
estimated, provided the assumption we make about how the scores are divided
within a given class interval is the same.
1. Find the class interval in which the desired centile point falls. To
do so, determine the number of cases that constitute X% of the whole. This
number will be the cum f for CX, and it is given by the formula

    cum f for CX = (X/100)(n)

The interval containing CX is the one for which

    cum f for lower exact limit < cum f for CX < cum f for upper exact limit
For example, to find C50 for the data in Table 3.7 of the text, find cum f
for C50, which is (50/100)(80) = 40. The desired class interval, we can now
tell, has the score limits 73 and 75 and the exact limits 72.5 and 75.5 (the
exact limits are not shown in the table), because

    cum f for 72.5 (namely 32) < cum f for C50 (namely 40) < cum f for 75.5 (namely 44)
2. Note again what the cum f for the lower exact limit is. In the example,
it's 32.
3. Determine the number of additional scores that together with this cum f
will equal the cum f for CX. The formula is simple:

    # of additional scores = (cum f for CX) - (cum f for lower exact limit)

In the example, 40 - 32 = 8.
4. Note the f (not the cum f) for the interval and assume that this number
of scores is distributed evenly throughout the interval. In the example, f =
12, and we thus assume that the bottom score in the interval is in the bottom
twelfth of the interval, the next-to-the-bottom score is in the next-to-the-
bottom twelfth, and so on.
5. Find the distance up into the interval that (on the stated assumption)
is occupied by the additional scores needed to equal the cum f for CX.
a. This distance will be a certain fraction of the width of the
interval, and the fraction is given by the formula

    fraction = (# of additional scores) / (f for the interval)
In the example, the fraction is 8/12; the 8 is from Step 3 and the 12 from
Step 4. On the assumption that the 12 scores are distributed evenly through¬
out the interval, the bottom 8 are in the bottom 8/12 of the interval.
b. To find the desired distance into the interval, the distance occupied
by those additional scores needed to equal the cum f for CX, multiply the
fraction of the interval just computed by the width of the interval. The
width is simply

    width = (upper exact limit) - (lower exact limit)
In the example, the width is 75.5 - 72.5 = 3.0 units. Multiplying the fraction
by the width, we have 8/12 times 3.0 units = 2.0 units, and we have thus
determined that on our assumption, the bottom 8/12 of the interval is the
bottom 2 units.
6. Add the distance found in the preceding step to the lower exact limit
of the interval in which you are working. This addition determines the point
along the scale of scores that cuts off a) those scores lying below the lower
exact limit plus b) the additional scores within the interval needed to equal
the cum f for C . This point is C , the point along the scale of scores below
which X% of the^cases fall. In the example, we have 72.5 + 2.0 = 74.5 = the
50th centile point.
Check to be sure your answer is within the interval that you located in
Step 1.
In general,
CX = lower exact limit of the interval + a certain distance up into the interval (as found in Step 5b)
or
CX = lower exact limit + (# of additional scores / f for the interval) x width
In the last formula, the # of additional scores is given by the equation in Step
3, and the width is given by the equation in Step 5b.
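If you want to check the arithmetic by machine, the six steps can be sketched in a few lines of Python (not part of the original workbook; the function name and parameter names are my own):

```python
# A sketch of Steps 1-6 (linear interpolation within a class interval).
# All names are illustrative; the numbers below are the chapter's example.
def centile_point(lower_exact_limit, cum_f_below, f_in_interval, width, n, x):
    """Return Cx, the point below which x% of the n cases fall."""
    target_cum_f = (x / 100) * n              # Step 1: cum f for Cx
    additional = target_cum_f - cum_f_below   # Step 3: scores still needed
    fraction = additional / f_in_interval     # Step 5a: fraction of interval
    distance = fraction * width               # Step 5b: distance into interval
    return lower_exact_limit + distance       # Step 6

# C50 for n = 80 scores; the interval 72.5-75.5 has f = 12, and 32 scores
# lie below its lower exact limit.
print(centile_point(72.5, 32, 12, 3.0, 80, 50))
```

For the chapter's numbers this reproduces the answer worked out above, 74.5.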
The examples offered in Section 3.10 and the problems and exercises for Ch.
3 provide material for practicing the procedure spelled out above. Remember
that this workbook does not attempt to replace those problems and exercises.
CHAPTER 4
GRAPHIC REPRESENTATION
The purpose of the blank lines below is explained on the title pages for
Chapters 1 and 2 of this workbook.
4.1 Introduction
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17
33
34 Chapter 4
SUMMARY
The Histogram
[4.2]. All the information in this table can also be presented in a graph
A graph like this has two axes. The horizontal one is also called the
abscissa / ordinate or the X/Y axis, and the vertical one is called the
_ [4.3].
of scores in that interval [4.4]. These points are then connected with
straight / curved lines [4.4]. (The histogram in Figure 4.1 can be easily
turned into a frequency polygon by making a dot in the middle of the top of
each rectangle and then connecting the dots.)
Graphic Representation 35
If nothing further is done, the zig-zaggy line will not touch the horizontal
axis at either end, but it is conventional practice to bring it down to the axis
of these intervals are plotted at __ frequency, and these two points are
As with the histogram, the vertical axis of a frequency polygon can show
[4.4] .
cent of the area in the bars represents 100 percent of the scores in the
distribution. The percentage of the total area that falls in any given rectangle
represents the same percentage of the scores. In Figure 4.1, for example, the
bar over the interval from 99.5 to 109.5 has 20% of the total area in all the
bars, and it thus indicates that 20% of the scores fall in this interval. This
characteristic called its shape. The shape is determined most easily by look¬
ing at the literal shape of the histogram or frequency polygon graphing the
The bimodal, normal, and rectangular distributions are all symmetrical /
asymmetrical, while the J-shaped and skewed distributions are symmetrical /
asymmetrical [4.11].
36 Chapter 4
There is no such thing as the graph of a given set of data. The same set
of raw scores may be grouped in different ways / only one way [4.10]. And a
graph can be squat or slender depending on the relative scale of the two axes.
Sampling Variation
sample is drawn from such a population, the shape of the sample is likely to
be more irregular. In general, the fewer the cases, the greater / less the
irregularity. This is an instance of the
principle that a larger sample will usually resemble the population more closely
distribution can also be presented in a graph like Figure 4.4, which is called
As with the histogram and the frequency polygon, the horizontal axis shows the
various possible scores, but now the vertical axis shows cumulated frequencies
In a table like 4.2, remember that a given cumulative percentage (say 40.0
on the cum %f scale) corresponds to the upper exact limit of the class interval
across from it (in this case, to the upper exact limit of the interval 70 - 72,
A cumulative percentage curve can be used to find the centile rank for a
given score, or the centile point for a given centile rank. Such graphic
determination of centile points or centile ranks will / will not yield the same
chapter [4.6]. Connecting the points on the cumulative curve with straight
EXERCISES
If you're having trouble with these, try this strategy: Visualize the
table listing the frequency distribution. In its simplest form, it will look
like Table 3.2 or 3.3 on p. 29 of the text, with just two columns. Think what
the numbers in the left-hand column would be. Then guess the pattern in the
frequencies in the right-hand column. Where, for example, would the large
frequency counts fall—at the top, in the middle, or at the bottom of the
column? Where would the small counts fall? Finally, visualize the translation
of the frequency distribution into a histogram or a frequency polygon, and
choose the appropriate word for describing its shape.
dents' heights?
7. Suppose that 523 sixth-graders take a 50-item spelling
test intended for college seniors. What shape will the distribution of the
sixth-graders' scores have?
The study found that average speed records for all types of vehicles on
two-lane rural roads, both night and day, have actually increased since the
last study in 1971. The only indication that people may be heeding pleas to
slow down and conserve gasoline is slightly slower speeds for passenger cars
on interstate highways.
The biennial speed studies were conducted at 25 points on the state trunk
system in May, 1973, to establish average speeds for various types of vehicles,
and to determine trends as well.
Average and 85th percentile speeds are as follows, with comparative 1971
figures in parentheses:
Interstate, daytime (speed limit 70): Wisconsin passenger cars, 85th
percentile speed 74.1 (74.3), average speed 67.9 (68.4). Out-of-state passenger
cars, 85th percentile speed 74.9 (75.0), average speed 69.3 (69.6).
Interstate, nighttime (speed limit 60): All passenger cars, 85th percentile
speed 68.7 (70.7), average speed 62.8 (64.2).
Traffic speeds have been steadily increasing since studies began in 1938-
1939, except for the 1942-45 war years, when there was a uniform 35 mph limit.
Average speed in 1938 was 49.6 with the 85th percentile speed at 61.5.
There was no speed limit.
The lowest speeds recorded were in 1942, when average speed was 37.1 mph
and the 85th percentile speed was 42.9 mph. In 1950 the average speed was
50.9 mph and the 85th percentile speed was 59.9.
CHAPTER 5
CENTRAL TENDENCY
5.1 Introduction
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
19 20 21
22 23
39
40 Chapter 5
SUMMARY
the , and the arithmetic _ [5.1]. Any one of these can properly
be called the average of the scores; the term average is thus vague.
The Mode
If the scores in the collection have been left ungrouped, the mode is the
score that occurs with the greatest _ [5.3]. In grouped data, the
greatest / smallest number of scores [5.3]. The symbol for the mode is _ [5.3]
The Median
If there are only a few scores in a collection (in which case they would
scores are arranged from high down to low, and if there is an odd number, the
median is defined as the score in the middle. When there is an even number of
scores, there is no one middle score, and the median is taken as the point
midway between the two scores that bracket the middle position [5.4].
The formal definition of the median is C50, the 50th centile point, which
is the point along the scale of scores below which ___% of the scores fall [5.4].
For a large number of scores that have been grouped, the median can be calcu¬
The Mean
The mean is the result of summing all the scores and then dividing the sum
the arithmetic mean (there are other measures of central tendency also called
Central Tendency 41
means). The symbol for the mean of a sample of scores collectively called X
is ___, and the symbol for the mean of a population is ___ [5.5]. The latter
When the scores in a set are to be summed, as in finding the mean, the
formed [5.2]. ΣX should be read "___ of X" [5.2]. Using this symbol,
μ = ___________ [Formula 5.1a]
X̄ = ___________ [Formula 5.1b]
Many important properties of the mean can be understood with the aid of a
physical analogy. Picture the scores in a distribution as weights arranged
along a weightless plank, as in Figure 5.1. The plank will balance (become
level) at a certain point, and that point is the mean of the scores.
Here is a basic property of the mean (not noted in the text) that you can
easily understand by thinking of the mean as the balance point of the distribu¬
tion: The mean always falls somewhere between the lowest and the highest score
in the distribution (unless all scores have the same value). In the physical
analogy, the distribution always balances at a point somewhere between the left¬
most and the right-most weights.
You can also understand how the mean is sensitive to the exact location of
each score. The balance point is sensitive to the exact location of each weight,
and if a given weight is moved (corresponding to a change in value), the bal¬
ance point will change. Note especially that an extremely low score pulls the
mean way down, just as a weight well to the left of the others pulls the bal¬
ance point way over to the left. Similarly, an extremely high score pulls the
mean way up, just as a weight well to the right of the others pulls the balance
point way over to the right.
Suppose you subtract the mean from each score in the distribution. The
result for a given score is called the score's deviation from the mean. A
score below the mean will have a negative deviation, and a score above the mean
will have a positive deviation. If you compute the mean and the deviations
correctly, the sum of the deviations, taking into account the fact that some
are negative and some positive, will always be zero. This is the numerical way
of saying that the mean is the balance point of the scores, and it is
illustrated in Figure 5.1.
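You can verify the zero-sum property numerically. A minimal Python sketch (mine, not the workbook's; the scores are illustrative):

```python
# A quick numerical check of the balance-point property: the deviations
# from the mean always sum to zero.
scores = [23, 22, 19, 17, 16, 11]
mean = sum(scores) / len(scores)          # 108/6 = 18.0
deviations = [x - mean for x in scores]   # +5, +4, +1, -1, -2, -7
print(sum(deviations))                    # 0.0: the scores balance at the mean

# Adding a constant (here 10) to every score slides the whole distribution,
# and the balance point moves with it: the new mean is 18.0 + 10 = 28.0.
shifted = [x + 10 for x in scores]
print(sum(shifted) / len(shifted))        # 28.0
```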
42 Chapter 5
Suppose you add a certain number (say 10) to each score in a distribution.
What will happen to the mean? In the physical analogy, this is like sliding
each weight to the right by a certain amount (10 units in the example). The
balance point will obviously move right along with the weights (to a point 10
units to the right of the old one). The other measures of
central tendency discussed in this chapter are / are not affected in the same
way [5.6].
Unfortunately, the analogy between the mean and the balance point does not
help you understand what happens when you multiply or divide every score in a
distribution by a constant. Multiplying or dividing each score by a constant
has the effect of ___ the mean by the same constant [5.6].
The median responds to how many scores lie below (or above) it, and also
to / but not to how far away the scores may be [5.8]. Thus the median is
more / less sensitive than the mean to the presence of a few extreme scores
[5.8]. This fact means that in distributions that are strongly asymmetrical
(or skewed), the median / mean may be the better choice if it is desired to
represent the bulk of the scores and not give undue weight to the relatively
few extreme scores.
For a given distribution, there is only one mean, just as there is only one
point where the weights representing the scores would balance. Likewise, there
is only one median, because there is only one point that divides the top half
of the scores from the bottom half. There may / may not be more than one
mode [5.9]. The mode is the only measure that can be used for data that have
The ___s will vary least among themselves [5.7, final paragraph], and
fluctuation [5.8]. This state of affairs makes the mean the most useful measure
metic and algebraic manipulation in a way that the other measures are not [5.7].
In distributions that are perfectly symmetrical, mean, median, and (if the
distribution is unimodal) mode will yield the same / different values [5.10].
If the mean and median have different values, the distribution cannot be /
may or may not be symmetrical [5.10]. The more skewed, or lopsided, the
distribution is, the greater / lesser the discrepancy between these two
measures [5.10].
score value [5.10]. The _ has been specially affected by the fewer but
relatively extreme scores in the tail, and thus has the lowest value [5.10].
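The contrast between the two measures is easy to demonstrate numerically. A Python sketch (mine, not the workbook's; the scores are made up):

```python
# How an extreme score affects mean vs. median: the mean is pulled toward
# the tail of the distribution; the median barely moves.
from statistics import mean, median

scores = [11, 16, 17, 19, 22, 23]
print(mean(scores), median(scores))   # both 18 for this symmetrical-enough set

skewed = [11, 16, 17, 19, 22, 95]     # one extreme high score in the tail
print(mean(skewed), median(skewed))   # mean jumps to 30; median stays 18
```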
EXERCISES
_____ 5. Responsive only to the number of scores above it or below it, not
to their exact locations?
_ 10. The point about which the sum of negative deviations equals the sum
of positive deviations?
_____ The point along the scale of scores that divides the upper half of
the scores from the lower half?
_ 17. The measure that best reflects the total value of the scores?
SYMBOLISM DRILL
Yes, this is a drill such as you did many of back when you learned your
alphabet and multiplication table. And like those it'll be good for you. Fill
in the blanks. Answers appear on p. 231.
CHAPTER 6
VARIABILITY
6.1 Introduction
47
48 Chapter 6
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
SUMMARY
Measures of Variability
how far any particular score diverges from the center of the group; rather it
is a summary figure that describes the spread of the entire set of scores [6.1,
Paragraph 2]. A measure of variability does / does not
provide information about the level of performance (the central tendency), and
it gives / does not give a clue as to the shape of the distribution [6.1,
Paragraph 3 on p. 82].
The Range
the other measures of variability are also distances, except for the variance,
Every distribution has three quartile points, which are the three score
points which divide the distribution into four parts, each containing an equal
number of scores.
The symbol for the semiinterquartile range is ___ [6.3], and the formula that
defines it is:
[Formula 6.1a or b]
Note that the distance from Q1 to Q3 is the range of the middle 50% of the
scores. Thus the semiinterquartile range is half the range of the middle 50%
of the scores. It may also be thought of as the mean distance between the
mode / median / mean and the two outer quartile points [6.3].
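For ungrouped scores the computation can be sketched in Python (mine, not the workbook's; the scores are illustrative). Note that the library's quantiles function uses its own interpolation rule, so its quartile points can differ slightly from the grouped-data procedure of Chapter 3:

```python
# Semiinterquartile range Q = (Q3 - Q1) / 2 for a small set of raw scores.
from statistics import quantiles

scores = [2, 4, 5, 7, 8, 10, 12, 13, 15, 17, 19, 20]
q1, q2, q3 = quantiles(scores, n=4)   # the three quartile points
semi_iq_range = (q3 - q1) / 2         # half the range of the middle 50%
print(q1, q3, semi_iq_range)
```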
Deviation Scores
The other two measures of variability use the mean of the distribution as a
reference point and indicate how far the scores lie from the mean, on the average.
The distance between a given score and the mean is called the deviation score
for that raw score, and it is found by subtracting the mean from the raw score.
x = X - X̄   and   x = X - μ
And when you write the symbols, you must make them clearly different. I suggest
that you make your capital letter large with straight lines, like this: X , and
your small letter small with one hooked line, like this: x.
The Variance
the foreign letter there is the lower-case Greek letter ___ [6.5, Line 2],
and the symbol is read "sigma squared." The symbol for the variance of a
sample is ___, which is read "es squared" [6.5]. The formulas that define the
variances are:
Variance of a population: ___ = Σ( ___ - ___ )² / ___  or  ___ [Formula 6.2a]
Variance of a sample: ___ = Σ( ___ - ___ )² / ___  or  ___ [Formula 6.2b]
The variance is a most important measure which finds its greatest use in
advanced work. As a descriptive measure, though, it has a flaw: it is expressed
in squared units of measurement [6.5]. We can correct this flaw, though, by
taking the square root of the variance. (To find the square root of a number,
figure out what value has to be squared to equal that number. Thus the square
root of 4, symbolized √4, is 2.)
The square root of the variance is called the standard deviation, and it
serves as a fourth measure of variability. The symbol for the standard
deviation of a population is ___, which is read "sigma," and the symbol for
the standard deviation of a sample is ___.
The standard deviation, like the mean / median / mode , is sensitive to the
range to the presence or absence of scores which lie at the extremes of the
deviation is / may not be the best choice among measures of variability when
the distribution contains a few very extreme scores, or when the distribution is
it contains a few very extreme scores, the semiinterquartile range will /will
not respond to the presence of such scores, and will / but will not give them
thus highly / not very sensitive to the total condition of the distribution
[6.12].
constant from each score, which is equivalent to sliding all the weights to the
left in one big clump, affects / does not affect the measures of variability
[6.8]. When scores are multiplied or divided by a constant, the range, the
There are many occasions on which a researcher will wish to compare the mean
of one distribution with the mean of another. The researcher will subtract one
mean from another, but the difference so obtained usually has little meaning
Measure of Variability
variance
Variability 53
The mean will always fall somewhere between the lowest score in the distri¬
bution and the highest one, as noted on p. 41 of this workbook, so some scores
will be below the mean, and some will be above it. For an illustration, look
at the left-most column in the table below. This is not a frequency
distribution, just a listing of the scores in order from high down to low.
Check the
computation of the mean, and note that it falls between the low of 11 and the
high of 23.
Raw Score, X    Deviation Score, x    Squared Deviation Score, x²
23                  +5                    25
22                  +4                    16
19                  +1                     1
17                  -1                     1
16                  -2                     4
11                  -7                    49
ΣX = 108        Σx = 0                Σx² = 96
n = 6           X̄ = ΣX/n = 18.0       S² = 96/6 = 16.0
                                      S = √16.0 = 4.0
Where are the raw scores in relation to the mean? The deviation scores in
the second column tell you. Raw scores below the mean have negative deviations;
raw scores above the mean have positive deviations; scores right at the mean
would have deviations of zero. The larger the deviation score, ignoring its
sign, the farther the corresponding raw score lies from the mean. Thus the raw
score of 11 is the farthest from the mean, because its deviation of 7 (really
-7) is the greatest of the deviation scores.
Now: the standard deviation of a distribution is just what its name says.
It is a standard, or typical, deviation. How big do the deviation scores for
the distribution run (ignoring their signs)? The standard deviation tells you
how big a typical deviation score is. In fact, the standard deviation is a
kind of average of the deviation scores (when their signs are ignored). It must
fall somewhere between the smallest deviation and the largest one, and the bigger
the deviations are, in general, the bigger the standard deviation will be.
54 Chapter 6
The formula for the standard deviation is easy to remember if you recall
that it is the root mean squared deviation, as the text tells you on p. 86.
That is, it is the square root of the mean of the squared deviation scores.
So to compute the standard deviation, you have to find the deviation scores,
then the squares of the deviation scores, then the mean of the squared
deviations, and then the square root of this mean. These calculations are
shown in this order in the other columns of the table above.
Note that the standard deviation is calculated by doing two operations and
then undoing them, which gets you back to something comparable to what you began
with. You start with deviation scores, and you do two things to them: square
them, and then sum the squares. Then you undo these things: divide by the
number of cases, which undoes the summing; and take the square root, which
undoes the squaring. The result, the standard deviation, is comparable to the
deviation scores with which you began: it is a standard, or typical, deviation.
Look again now at the computations in the table above. On the way to the
standard deviation in the lower right-hand corner, the variance turned up, for
the variance is nothing more than the mean of the squared deviation scores.
Remembering that it's a mean will help you tell if you have a reasonable value
when you've computed a variance. As the mean of the squared deviation scores,
it must fall somewhere between the largest squared deviation and the smallest
one. In the table, note that 16.0 does fall between the high of 49 and the low
of 1.
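The whole recipe (deviations, squares, mean of the squares, square root) can be written out in a few lines of Python (mine, not the workbook's; the scores are those of the table above):

```python
# The root-mean-squared-deviation recipe: square the deviations, sum them,
# divide by n (undoing the summing), take the square root (undoing the
# squaring).
from math import sqrt

scores = [23, 22, 19, 17, 16, 11]
n = len(scores)
mean = sum(scores) / n                          # 108/6 = 18.0
deviations = [x - mean for x in scores]
variance = sum(d * d for d in deviations) / n   # 96/6 = 16.0
sd = sqrt(variance)                             # 4.0
print(variance, sd)
```

Note that the variance, 16.0, does fall between the largest squared deviation (49) and the smallest (1), just as the text says it must.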
Variability 55
EXERCISES
The Mean (X̄): As an average of the raw scores, indicates about how large they
run. Takes an intermediate value, with at least one raw score larger and at
least one raw score smaller. Serves as the reference point for the deviation
scores (xs).
Deviation Scores (xs): Indicate where the corresponding raw score (X) lies in
relation to the mean—whether the raw score is above (+) or below (-) the mean,
and how far from the mean the raw score falls. Taking their signs into account,
Σx = 0 (see Section 5.7 in the text).
Variance (S²): Is the mean of the squared deviation scores (x²s). As an average
of the squared deviation scores, indicates about how large they run, and takes
an intermediate value, with at least one squared deviation larger and at least
one squared deviation smaller.
Standard Deviation (S): Is a typical value for the deviation scores (ignoring
their signs); serves as an average for the deviation scores (xs), indicating
about how large they run. Takes an intermediate value, with at least one
deviation larger and at least one deviation smaller. Is calculated by doing
two operations (squaring the deviation scores and summing them) and then
undoing the operations (dividing the sum by n or N and taking the square root
of the result), which gets you back to something comparable to what you started
with—something comparable to the deviation scores.
13
12
ΣX = ___    Σx = ___    Σx² = ___
n = ___
X̄ = ΣX/n = ___     S² = ___ / ___ = ___     S = ___
56 Chapter 6
Your standard deviation for the little distribution in that table should
have been 4.0, the same as the value for the example on p. 53. Why do the two
distributions have the same standard deviation? Yes, it's because they have
the same amount of variability, but why, exactly, do the two standard deviations
work out to be the same number?
14
13
10
10
ΣX = ___    Σx = ___    Σx² = ___
n = ___
X̄ = ΣX/n = ___     S² = ___ / ___ = ___     S = ___
Note how the generalizations on the preceding page apply to these tables.
14
9
ΣX = ___    Σx = ___    Σx² = ___
n = ___
X̄ = ΣX/n = ___     S² = ___ / ___ = ___     S = ___
Variability 57
Raw Score, X    Deviation Score, x    Squared Deviation Score, x²
ΣX = ___    Σx = ___    Σx² = ___
n = ___
X̄ = ΣX/n = ___     S² = ___ / ___ = ___     S = √___ = ___
MORE EXERCISES
Squared Raw Score, X²    Raw Score, X
529                        23
484                        22
361                        19
289                        17
256                        16
121                        11
ΣX² = 2040    ΣX = 108    n = 6
To make it clear that X² differs from x², the column of X² values is listed
on the left side of the raw scores.
Σx² = ΣX² - (ΣX)²/n
    = 2040 - (108)²/6
    = 2040 - 11664/6
    = 2040 - 1944
    = 96 = the value computed directly on p. 53
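A Python sketch (mine, not the workbook's) confirms that the computing formula and the direct deviation-score method agree for these six scores:

```python
# Checking the raw-score shortcut against the direct method:
# both give the sum of squared deviations, here 96.
scores = [23, 22, 19, 17, 16, 11]
n = len(scores)
mean = sum(scores) / n

direct = sum((x - mean) ** 2 for x in scores)                 # deviation scores
shortcut = sum(x * x for x in scores) - sum(scores) ** 2 / n  # raw scores only
print(direct, shortcut)   # 96.0 and 96.0
```

The shortcut needs only ΣX² and ΣX, so no deviation scores have to be computed along the way.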
7
6
10
8
5
ΣX² = ___    ΣX = ___    n = ___
Σx² = ΣX² - (ΣX)²/n = ___ , which should = the value computed directly on p. 56
Variability 59
9
9
9
7
5
4
ΣX = ___    n = ___    ΣX² = ___
Σx² = ΣX² - (ΣX)²/n
    = ___ - ( ___ )²/ ___
    = ___
That formula you practiced using in the exercise above is worth memorizing.
Write it out several times, taking care to distinguish x from X and ΣX² from
(ΣX)². Note again that the formula is read "The sum of little-x-squared equals
the sum of big-X-squared minus the-sum-of-big-X-the-quantity-squared over n."
Rehearse these words as well as the symbols to which they correspond.
60 Chapter 6
SYMBOLISM DRILL
Symbol    Pronunciation    Meaning
1    ___    "little en"            Number of scores in a ___
2    ___    "big en"               Number of scores in a ___
3    X      "eks" or "big eks"     A raw score, or the set of raw scores
4    ___    ___                    Result of summing quantities of some kind
5    X̄     ___                    ΣX/n; the mean of a ___
6    ___    ___                    ΣX/N; the mean of a ___
9    ___    "little eks"           X - X̄ or X - μ; deviation score
10   ___    "cue"                  (C75 - C25)/2; semiinterquartile range
11   σ²     "sigma squared"        Σx²/N; variance of a population
12   ___    "sigma"                √(Σx²/N); ___ of a ___
13   S²     "es squared"           Σx²/n; ___ of a ___
14   ___    "es"                   √(Σx²/n); ___ of a ___
Variability 61
Statistics in
None        52
1 - 5       68
6 - 10      31
11 - 15      8
16 - 20      8
21 - 25      2
Over 25      7
           176
(Six additional respondents reported "some" inquiries.)
Range 0 - 200. X̄ = 6.81 inquiries per respondent. C25 = 0, C50 = 3.2, C75 = 7.6.
None        49
1 - 5       58
6 - 10      13
11 - 15      4
16 - 20      2
21 - 25      1
Over 25      1
           128
(Two additional respondents reported "some" studies.)
Range 0 - 44. X̄ = 2.97 studies per respondent. C25 = 0, C50 = 2.0, C75 = 4.2.
As you can see, the researchers did not follow standard practice in
constructing their grouped frequency distributions: they have too few class
intervals, the intervals are not of uniform width, and one interval is
open-ended.
Nevertheless, these data illustrate a number of techniques of descriptive
statistics and offer you a chance to review much of what you've learned so far.
62 Chapter 6
5. Why is the mean higher than the median in the first distribution?
6. Why is the mean higher than the median in the second distribution?
10. For the 176 respondents who reported an exact number, what was the total
number of people who consulted with them about the effects of exposure to spray
adhesives?
11. For the 128 respondents who reported an exact number, what was the total
number of people whose chromosomes were studied?
12. The researchers' goal was to determine the impact of the ban and the
warning on the genetic counselors in the U. S. who do diagnostic cytogenetics.
What was the population of interest to them?
13. Why did the researchers employ only descriptive statistics with no
inferential techniques?
Important note: The ban was withdrawn in six months when the purported
correlations between exposure to spray adhesive and chromosomal damage or birth
defects could not be confirmed, and no toxicity could be demonstrated for the
adhesives. In fact, investigators who reexamined the slides that had initially
been believed to show chromosome damage in exposed individuals did not agree
with the original interpretation.
Less important note: Don't be upset if you computed C50 and C75 for the two
distributions and failed to get the figures the researchers reported. I can't
derive the figures from the numbers in the tables either. The researchers must
have computed the quartiles from the ungrouped data.
CHAPTER 7
THE NORMAL CURVE
7.1 Introduction
63
64 Chapter 7
SUMMARY
bution of scores—its shape. Some distributions have a shape for which the
normal curve is a poor model, but there are many real distributions whose
shape is well represented by the normal curve.
The normal curve also functions well as a model for many distributions of
statistics computed on samples, rather than of the scores
that make up the samples [7.5]. Suppose, for example, that we draw a very great
number of random samples from some population. (All the samples should be of
the same size.) We compute the mean of each sample, which is a statistic
characterizing that sample. It
would be found that the shape of the distribution of this large number of means
is close to normal.
The normal curve, as a model for the shape of a distribution, does not
specify the central tendency of that distribution (it does not specify the mean,
for example), nor does the model specify the variability (it does not specify
the standard deviation, for example). Thus different distributions can all
conform to the shape called normal, even though the distributions differ in
their mean, in their standard deviation, or in both. (See Figure 7.1 for an
example.)
rical and unimodal / bimodal [7.2]. (It is thus a specific kind of bell
shape.) Going away from the middle toward either end, the curve gets closer
and closer to the horizontal axis, and it eventually / but it never actually
touches the axis [7.2].
Areas under various segments of the normal curve are of special interest.
The interval from one standard deviation below the mean to one standard
deviation above it contains about two-thirds
(68%) of the total area. In any distribution whose shape is well modeled by
The Normal Curve 65
the normal distribution, then, about two-thirds (68%) of the scores will have a
value within one standard deviation of the mean. The interval μ ± 2σ
contains about ___% of the area, and thus about ___% of the scores fall into this
interval [6.13]. The interval μ ± 3σ contains almost all the area (99.7% of it),
and thus almost all the scores (99.7% of them) fall into this interval. In
general, the model tells approximately what proportion of the
cases in the distribution [7.6, Paragraph 1; see also the last paragraph of
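The 68%, 95%, and 99.7% figures can be checked with the error function from Python's math library (a sketch of mine, not part of the workbook): the proportion of normal-curve area within k standard deviations of the mean is erf(k/√2).

```python
# Proportion of the normal curve's area lying within k standard deviations
# of the mean, computed from the error function.
from math import erf, sqrt

for k in (1, 2, 3):
    proportion = erf(k / sqrt(2))
    print(k, round(100 * proportion, 1))  # 68.3, 95.4, and 99.7 percent
```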
given a certain kind of IQ test, namely the Wechsler Adult Intelligence Scale
(the WAIS), the distribution of their scores will have a mean about 100 and a
standard deviation about 15. Back when the College Entrance Examination Board
Test (the CEEB test, also called the Scholastic Aptitude Test, or SAT) was first
constructed, a large number of high-school seniors would have earned scores with
a mean about 500 and a standard deviation about 100 on both the verbal and the
quantitative parts. (The means for the CEEB scores are somewhat lower now.)
An IQ of 100 from the first distribution corresponds to a CEEB score of 500
from the second distribution: each falls at the mean of its distribution. There
is also a correspondence between an IQ of 115 and a CEEB score of 600:
each falls one standard deviation above the mean of the distribution from which
the score comes. Similarly, an IQ of 85 and a CEEB score of 400 are each one
There is a convenient way of saying where a raw score falls within its
distribution, a way that makes clear the sort of correspondences noted above.
The ___ states the position of the raw score in relation to the ___ of the
distribution. An IQ of 115, for example, has a z score of +1, because it lies
one standard deviation (which here is 15) above the
mean (which is 100). A CEEB score of 600 in the other distribution also has a
z score of +1, because it too lies one standard deviation above the mean, but
here the standard deviation and the mean are those characterizing the
distribution of CEEB scores, namely 100 and 500, respectively. The z score for
an IQ of 85 is likewise -1.
Standard scores can easily be computed in your head if the numbers involved
are simple, like those in the examples above. If you need to use a formula,
it is z = (X - mean) / (standard deviation).
The two distributions in the examples above would both be close to normal
in their shape, as Figure 8.4 shows, but z scores are useful in describing raw
scores in a distribution with any kind of shape.
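The examples above can be sketched as a one-line computation (Python; mine, not the workbook's):

```python
# z = (raw score - mean) / standard deviation, applied to the text's
# IQ and CEEB examples.
def z_score(raw, mean, sd):
    return (raw - mean) / sd

print(z_score(115, 100, 15))    # 1.0: one SD above the mean
print(z_score(600, 500, 100))   # 1.0: the corresponding CEEB score
print(z_score(85, 100, 15))     # -1.0: one SD below the mean
```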
Properties of z Scores
If all the scores in any given distribution are converted to z values, the
mean of the z scores is always ___, and the standard deviation of the z scores
is always ___. The shape of the distribution of z scores is whatever
shape is characteristic of the set of raw scores from which they were derived
[7.6].
ADVICE
The exercises offered in the text for this chapter are especially important.
Be sure to do at least some of the parts of each one. As for the other chapters,
the exercises provided here (following the map of the new concepts) do not
duplicate those in the text.
The Normal Curve 67
z SCORES
variability
certain distributions
EXERCISES
Here's a collection of scores listed in order from high down to low. (This
is not a frequency distribution.) Take them to be a sample called X. Compute
the mean, find the deviation scores, and then calculate the standard deviation
from the squares of the deviation scores. These are the sorts of exercises I
offered you in the preceding chapter.
What's new here is this: Find the z score corresponding to each raw score,
and check to see that the mean of the z scores, z̄ ("zee bar"), is zero. Then
compute the standard deviation of the z scores using a procedure like that by
which you found the standard deviation of the raw scores. This requires
treating the z scores as though they were raw scores and finding the deviation
scores
for them, using their own mean, zero, as a reference point.
All the figures in the table work out to be simple (but not necessarily
whole) numbers, and the standard deviation of the z scores should be exactly one,
of course.
X    x = (X - X̄)    x²    z = x/S    (z - z̄)    (z - z̄)²
13
13
ΣX = ___   Σx = ___   Σx² = ___   Σz = ___   Σ(z - z̄) = ___   Σ(z - z̄)² = ___
X̄ = ΣX/n = ___ / ___          z̄ = Σz/n = ___ / ___
S = ___          Sz = ___
The Normal Curve 69
If you'd like to do more problems of this kind, use the distributions on pp.
53 and 55 - 59 above.
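If you want to check your work by machine, here is a Python sketch of the exercise (mine, not the workbook's; the scores are illustrative, and S is computed with the population-style formula used in the text):

```python
# Converting a set of scores to z values and verifying the two properties
# rehearsed above: the z scores have mean 0 and standard deviation exactly 1.
from math import sqrt

scores = [23, 22, 19, 17, 16, 11]
n = len(scores)
mean = sum(scores) / n
s = sqrt(sum((x - mean) ** 2 for x in scores) / n)   # 4.0

z = [(x - mean) / s for x in scores]
z_mean = sum(z) / n
z_sd = sqrt(sum((v - z_mean) ** 2 for v in z) / n)
print(round(z_mean, 10), round(z_sd, 10))   # 0.0 and 1.0
```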
SYMBOLISM DRILL
Yes, another one. Repetition is what makes a drill a drill and a good tactic
for learning the sort of material in the table below.
11   ___   ___   Σx²/N; ___ of a ___
12   ___   ___   √(Σx²/N); ___ of a ___
13   ___   ___   Σx²/n; ___ of a ___
14   ___   ___   √(Σx²/n); ___ of a ___
PROBLEMS and EXERCISES
1 _____    2 _____
3 _____    4 _____
5 _____    6 _____
7 _____    8 _____
9 _____    10 _____
11 _____   12 _____
13 _____   14 _____
71
72 Chapter 8
SUMMARY
To judge whether a given score indicates a good performance or a poor one,
then, a frame of reference is needed.
A number that indicates where a raw score stands in relation to other raw
scores is called a derived score. There are two major kinds of derived scores:
those like the z score that preserve the proportional relation of interscore
___, and those like the centile rank that do not [8.1, final paragraph].
it also changes/but it does not change the shape of the distribution [8.2].
It is in this sense (leaving the shape unchanged) that z scores preserve the
Chapter 7 used the term standard score to refer to z scores, but there are
distribution with a fixed mean and a fixed standard deviation; this is what is
For example, in WWII the Army transformed raw scores on its General
Classification Test into a type of standard score with a mean of ___ and a
standard deviation of ___ [8.3, Table]. And as this workbook noted in the
summary for
Chapter 7, raw scores on the Wechsler Intelligence Scale are transformed into
scores with a mean of 100 and a standard deviation of 15,
while transformations of raw scores on the CEEB test originally had a mean of
500 and a standard deviation of 100. An important property of these standard
scores is that they, like the z score, locate a raw score by stating how many
___
Another important property of these standard scores is that they, like the z
All of the standard scores considered so far, z scores and the others, are
Centile Scores
Centile ranks (which this chapter occasionally calls centile scores) are
also derived scores. Like the standard scores described so far, the centile
distribution [8.6]. The centile rank does this by indicating what percentage
of the scores fall below it. Centile ranks have a major disadvantage, however:
changes in centile rank [8.6]. When one centile rank is higher than another,
the corresponding raw score of the one is higher than that of the other, but we
do not know by how much. Changes in raw score are accompanied by proportionate
changes in centile rank only when the shape of the distribution of scores is
meaningfully compared with a standard score of the same type from another
distribution. One element necessary for appropriate comparison is that the
reference groups (norm groups) used to generate the standard scores are ______
(that is, similar) [8.7]. Furthermore, standard scores should be used for
comparing scores from two different distributions only if the two distributions
that produce a standard mean and a standard standard deviation, but in addition
raw scores so that it follows the _ curve [8.8, Sentence 2]. Trans¬
with a mean of 5, a standard deviation almost 2, and again a shape that is nor¬
Suppose an instructor has scores for students on a quiz, a midterm exam, and
a final. To compute the final grades, the instructor might simply sum each of
the three scores for each student and then decide which sums are worth an A,
which sums are worth a B, and so on. A naive person would think that this
procedure of simply summing the three scores makes them count equally in
determining the total and thus the final grade. When several scores are summed,
each one does / does not, however, count equally in determining the total [8.9,
Paragraph 2]. If the several scores are independent (i.e., if the size of a
person's score on one variable is / is not predictive of the size of that person's
score on another variable), then the contribution of each score to the total is
This situation can be rectified by assuring that all of the test distributions
from which scores are to be added to form the composite have the same
Derived Scores 75
This is not the end of the problem, however, because it will work only if
the several scores are independent, and usually they are not. The lack of in¬
it is a problem when there are more than _ [8.9, p. 138]. With more than
, the procedure recommended above ensures / does not ensure that the
weights assigned will result in the intended relative importance of the contri¬
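If you have Python handy, you can watch the weighting problem and its remedy with invented data (the text does not use Python; the scores below are made up):

```python
# Invented data: a quiz with a small spread of scores, a final with a large one.
quiz  = [8, 9, 7, 10, 6]
final = [55, 90, 40, 95, 20]

def standardize(xs):
    """Convert raw scores to z scores (mean 0, standard deviation 1)."""
    n = len(xs)
    m = sum(xs) / n
    S = (sum((x - m) ** 2 for x in xs) / n) ** 0.5
    return [(x - m) / S for x in xs]

raw_totals = [q + f for q, f in zip(quiz, final)]
z_totals   = [q + f for q, f in zip(standardize(quiz), standardize(final))]

# In raw_totals the final, with its big standard deviation, dominates the
# ranking; in z_totals the two tests have the same spread and count equally.
print(raw_totals)
print([round(t, 2) for t in z_totals])
```

Giving every distribution the same standard deviation (here, 1) is exactly the equalizing move the summary describes.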
EXERCISES
[Exercise table: convert among z scores, derived scores, and centile ranks.
The given entries include z = +3, 145, 97.7, 650, 50.0, z = -1, 250, and 20;
the remaining cells are to be filled in.]
[Concept map — Test Norms: a RAW SCORE may be converted to derived scores,
which consist of standard scores, which preserve the proportional relation of
interscore distances, and centile ranks, which do not preserve the proportional
relation of interscore distances.]
SYMBOLISM DRILL
1   n    Number of scores in a ______
9   x    X - μ or X - X̄; ______ score
4   Σ    ______
6   μ    Σ__/__; the mean of a ______
5   X̄    Σ__/__; the mean of a ______
11  σ²   Σ__/N; ______ of a ______
12  σ    √Σ__/N; ______ of a ______
13  s²   Σ__/n; ______ of a ______
14  s    √Σ__/n; ______ of a ______
15  z    __/σ or __/S; ______ score
Statistics in Action

Defining Mental Retardation
But many professionals now accept an IQ of 70 as the dividing line, Zigler says.
Zigler suggests that people with IQ between 50 and 70 who are currently
called retarded merely represent "the lower portion of the normal distribution
of intelligence" and are "thus an integral part of the normal population." In
this conception, these people are like those whom we regard as short. Most of
the people in each classification came to be what they are through the usual
genetic and environmental processes, which produce a normal distribution of IQ
scores in the one case and a normal distribution of heights in the other, both
of them just naturally including some scores well below the mean. The truly
retarded, in Zigler's view, are like the dwarfs and midgets who fall far below
the mean in height as a result of processes that are clearly abnormal.
Unfortunately, he notes, there is no emotionally neutral word comparable to
"short" to describe the lower end of the naturally occurring distribution of IQ
scores. The word "retarded" is pejorative, and to apply it to all people with
an IQ between 50 and 70 is unfair and misleading, Zigler believes, just as it
is unfair and misleading to refer to all adults who are, say, five-two or less,
as dwarfs or midgets.
______ 8. What proportion of the young adult men in America are below five
feet, two inches in height? (See the table on p. 27 of this book.)
IQ is not the only criterion that is used in diagnosing retardation. The
person's social competence and the age at which the abnormalities began are also
taken into consideration by many professionals. Social competence, or the
ability to meet the demands of everyday living, is not adequately defined by an IQ
score, Zigler says, and the exact relation between intelligence and social com¬
petence is unclear. There is great need for a measure of social competence
that can be used throughout the lifespan as IQ tests can.
80 Chapter 9
SUMMARY
In the cases considered thus far in your study of statistics, there has
been only one score for a given subject, a score indicating the value of just
one variable (the subject's height, for example). When two scores are available
for each subject, one score for each of two variables (the subject's height
and the subject's weight, say), it is possible to determine the correlation
between the two variables for the subjects on hand. Except in the final
section, Chapter 9 is concerned with the correlation between variables that are
each quantitative and continuous (such as height and weight).
high scores on one variable tend to be associated with both high and low
scores on the other variable, and if low scores on the one variable also tend
to be associated with both high and low scores on the other. Diagram 6 of
In the case of a perfect correlation, all data points in a diagram like those
in Figure 9.1 fall on a straight line. (If the direction is positive, the line
slopes from lower left to upper right; if the direction is negative, the line
slopes the other way.) In the case of less than perfect correlations, the data
points swarm more or less closely about a straight line; the farther away, the
lower the degree of the correlation (whether the direction is positive or
negative).
The direction and the degree of correlation between two continuous quantitative
variables are measured by the correlation coefficient. The Greek letter ρ,
pronounced "______," stands for the
population value, and _ stands for a sample value [9.3]. In this chapter
the symbol is used consistently, but the principles and procedures for
is a tendency for high values of one variable (X) to be associated with high/
low values of the other variable (Y), and low values of the one to be associ¬
ated with high / low values of the other [9.3, Paragraph 2]. A negative value
associated with high / low values of Y, and vice versa [9.3]. The sign of the
line [9.3, p. 145]. This means that if we know the value of X, we can predict
rXY = ______        [Formula 9.1]
Figure 9.2. This diagram is divided into four quadrants by two lines, one
Points located to the right of the vertical line are therefore characterized
values of x [9.4]. Those points lying above / below the horizontal line are
values of y [9.4]. For any point, the xy product may be positive or negative,
depending on the sign of x and the sign of y. The xy products will be positive
for points falling in quadrants _ and _ and will be negative for points
equals the sum of the positive products from quadrants I and III; the
coefficient will be negative when the contributions from quadrants ___ and ___ exceed
those from quadrants ___ and ___; and the coefficient will be positive when the
reverse is true [9.4]. The greater the predominance of the sum of products
bearing one sign over those bearing the other, the greater the magnitude /
from each X score before obtaining the correlation between that variable and Y?
between the altered variable and the remaining variable changes /remains just
and the value of the coefficient changes/ remains unaltered [9.6]. As long
1. Does a correlation coefficient of, say, +.50 mean that there is 50%
ient [9.9].
part, the cause of the other? This is / is not so. Mere association is in¬
other things, on the nature of the measurement of the two variables as well as
the correlation between two variables without taking these factors into con¬
sideration [9.9].
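If you like, you can check the cross-product reasoning numerically. The sketch below uses made-up data and the deviation-score form r = Σxy/(n·SX·SY); the helper name pearson_r is mine, not the text's. It also verifies the point from Section 9.6 that adding a constant to every X leaves r unaltered:

```python
# Pearson r from deviation-score cross-products: r = sum(xy) / (n * Sx * Sy).
def pearson_r(X, Y):
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    dx = [v - mx for v in X]   # deviation scores x
    dy = [v - my for v in Y]   # deviation scores y
    Sx = (sum(d * d for d in dx) / n) ** 0.5
    Sy = (sum(d * d for d in dy) / n) ** 0.5
    return sum(a * b for a, b in zip(dx, dy)) / (n * Sx * Sy)

X = [1, 2, 3, 4, 5]            # made-up scores
Y = [2, 1, 4, 3, 5]

r1 = pearson_r(X, Y)
r2 = pearson_r([v + 100 for v in X], Y)  # add a constant to every X
print(round(r1, 3), round(r2, 3))        # 0.8 0.8
```

Shifting every X by 100 moves the points but leaves every deviation score x = X − X̄ the same, so r is unchanged.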
smaller the range of talent in X and/or Y, the higher / lower the correlation
EXERCISES
For each case described below, use your common sense and your new under¬
standing of correlation to determine whether the correlation between the two
SYMBOLISM DRILL
Symbol    Pronunciation    Meaning

2   ___   "big en"    ______
3   ___               A raw score, or the set of raw scores
5   ___               ΣX/n; the mean of a ______
Correlation 85
11  ___  Σx²/N; ______ of a ______
13  ___  Σx²/n; ______ of a ______
12  ___  √Σx²/N; ______ of a ______
14  ___  √Σx²/n; ______ of a ______
[Concept map — Bivariate Distribution: a bivariate distribution may reveal
positive correlation between X and Y, zero correlation between X and Y, or
negative correlation between X and Y. It permits measuring the strength of the
linear association between X and Y (ρ for a population, r for a sample); r is
appropriate as a measure of correlation only if the relation between X and Y
is linear. The strength of the linear association is shown graphically by the
extent to which scores cluster around the straight line of best fit on the ______.]
CHAPTER 10
10.3 Homoscedasticity
1 ______
2 ______
3 ______
4 ______
5 ______
6 ______
7 ______
SUMMARY
The fact that X and Y vary together is a necessary and also / but not a
between the two variables [10.1]. That is, evidence that two variables vary
shows four of the possibilities that may occur when two variables are correla¬
ted, and in only two of these cases (the first and second) does one of the
Linearity of Regression
In scatter diagrams such as those shown on pp. 144-145 and 165 of the text,
it is helpful to fit a straight line to the swarm of data points. (The next
chapter tells how to do this precisely.) In general, the more closely the
scores hug the straight line of best fit, the higher / lower the value of r
[10.2] . When r is 0, the scores scatter as widely about the line as possible,
and when r is _ (plus or minus), the scores hug the line as closely as possi¬
ble, since they all fall exactly on the line [10.2]. One meaning of this prin¬
ciple is that prediction of Y from knowledge of X can be made with greater accu¬
racy when the correlation is high / low than when it is high / low [10.2].
But in a given set of data, a straight line may or may not reasonably
describe the relationship between the two variables. When a straight line is
[10.2] .
What happens when X and Y are not linearly related? When the correlation
is other than zero and the relationship is nonlinear, as in Figure 10.3, Pearson
tion [10.2] .
Homoscedasticity
the left side of Figure 10.4, the bivariate distribution is said to exhibit the
Factors Influencing the Correlation Coefficient 89
points hug the straight line of best fit, the value obtained for it in such a
When homoscedasticity does not obtain, as on the right side of Figure 10.4, r
will reflect the average degree to which the scores hug the line, but this aver¬
age will properly characterize the degree of relationship for only some values
Discontinuous Distributions
What will happen if distributions that normally would be continuous have been
on one variable? A sample constituted in this manner will / will not yield a
[10.5] . Obtaining the coefficient is often only the first step in analysis,
though. When additional steps are undertaken, the assumption frequently must
Random Variation
two variables; another sample will yield the same / a somewhat different value
[10.6] . In general, large / small samples yield values of r that are similar
from sample to sample, and thus the value obtained from a large / small sample
will probably be close to the population value [10.6]. For very large / small
samples, r is quite unstable from sample to sample, and its value can / cannot
The influence of random sampling variation is the only / but one reason
why the obtained correlation coefficient is not the coefficient between the two
variables under study [10.7]. The degree of association between two variables
depends on (1) how the two variables were measured, (2) who the subjects were,
and (3) under what circumstances the variables operate. If any of these fac¬
tors changes, the extent of the assocation may also change. Consequently, it
SYMBOLISM DRILL
2   N    ______
1   n    ______
3   X    ______
9   x    X - __ or X - __; ______ score
4   Σ    ______
6   μ    Σ__/__; the ______ of a ______
5   X̄    Σ__/__; the ______ of a ______
14  s    √Σ__/__; ______ of a ______
11  σ²   Σ__/__; ______ of a ______
13  s²   Σ__/__; ______ of a ______
12  σ    √Σ__/__; ______ of a ______
15  z    __/σ or __/S; ______ score
16  r    Pearson cor'n coef't for a ______
17  ρ    Pearson cor'n coef't for a ______
CHAPTER 11
1 ______
2 ______
3 ______
4 ______
5 ______
6 ______
7 ______
8 ______
9 ______
SUMMARY
diction of Y with better than chance accuracy, and if the coefficient is ±1, we
can predict Y with almost perfect / perfect accuracy [11.1]. The problem of
ient called r is thus a good measure of the correlation between X and Y. The
chapter shows how to use the value of r and certain other facts to find the
line [11.1, Paragraph 4], and it will correspond to an equation called a regress¬
ion equation. This line or its equation is used to predict the value of Y for
a given value of X.
How shall we judge which of all possible straight lines is the one that
best fits the values of y on hand and permits the best prediction of unknown Y
values? The criterion in use was first proposed by Karl Pearson, the person
such a manner that the _ of the squares of the discrepancies between the
actual and the predicted values of Y is as small as possible [11.2]. One impor¬
tant property of the least-squares solution is that the location of the regress¬
ion line and the value of the correlation coefficient will fluctuate less / more
under the influence of random sampling than would occur if another criterion
The regression line is a "running mean," a line that tells us the mean, or
mean of all Y values in the set, whereas Y' ("Y prime," Y predicted from the

The straight line of best fit to the Y values, which is called the regression
line and is used for predicting unknown Y values from particular values of
raw scores.
z'Y = ______        [Equation 11.1]
where z'Y is the predicted standard score value of __ [11.3]. This form of
the equation makes it easy to see how two important generalizations are true.
First, suppose the value from which prediction was made was the mean of X.
Since the z-score equivalent of the mean is _, the predicted standard score
will be the same, irrespective of the value of r / hold only for certain values
to predict 7 [11.3].
Note, however, that the expression inside the first pair of parentheses is the
mean value of 7 for persons with the given score on X. If the correlation
the predicted value may be expected [11.5]. If the correlation is low / high,
the actual values will cluster more closely about the predicted value [11.5].
Only when the correlation is _ will the actual values regularly and
about the predicted Y score [11.5]. The value of SYX ranges from ______ when
186]. The simplest formula for SYX expresses it in terms of r and SY:
With the help of SyX one can predict not only the mean score on Y for cases
with a given score on X (which is what Y' is), but also the entire distribution
of Y scores for cases with that score on X. The procedure for doing so is
is the line of best fit, or this predicted value may be too high or too low
[11.8].
Regression and Prediction 95
EXERCISES
To understand what you are doing in using the regression equation to predict
a Y score given an X score, and to check to see if the prediction you are making
is a reasonable one, it is helpful to think: Where is X in relation to its mean
X̄? And where is Y predicted to be in relation to its mean Ȳ? Where Y is
predicted to be in relation to Ȳ depends on a) where X is in relation to X̄, and on
b) whether r is positive, zero, or negative. You can figure out for yourself
exactly how this works by using the standard-score form of the regression equa¬
tion (see p. 118).
For example, suppose you want to predict the score on Y for a case whose X
value is above the mean of X. Suppose further that the correlation between X
and Y is positive. The regression equation says that z'Y = rzX, and here r will
be some positive number, while zx will also be positive, because a raw score
above its mean has a positive z score. The product of two positive numbers is
also positive; thus the equation predicts that this case will have a positive
z score on variable Y. A positive z score indicates a raw score above the mean
of Y. Thus if X > X̄ and if r is positive, Y' > Ȳ.
                r positive      r zero        r negative
If X = X̄       Y' __ Ȳ        Y' __ Ȳ       Y' __ Ȳ
If X < X̄       Y' __ Ȳ        Y' __ Ȳ       Y' __ Ȳ
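You can tabulate the whole pattern with a two-line function; this is a sketch of mine, not from the text:

```python
# The standard-score form of the regression equation: z'_Y = r * z_X.
def predict_zy(r, z_x):
    """Predicted z score on Y, given r and the z score on X."""
    return r * z_x

# X above its mean (z_x positive), r positive: Y' is predicted above Y's mean.
print(predict_zy(0.6, 1.5))
# X at its mean (z_x = 0): Y' is at the mean of Y, whatever the value of r.
print(predict_zy(0.6, 0.0))
# r negative reverses the direction: Y' is predicted below the mean of Y.
print(predict_zy(-0.6, 1.5))
```

The sign of the product r·zX is all that determines whether Y' lands above, at, or below Ȳ.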
SYMBOLISM DRILL
1   ___  Number of scores in a sample
11  ___  Σx²/N; ______ of a ______
12  ___  √Σx²/N; ______ of a ______
13  ___  Σx²/n; ______ of a ______
14  ___  √Σx²/n; ______ of a ______
INTERPRETIVE ASPECTS
OF CORRELATION AND REGRESSION
3 ______     4 ______
5 ______     6 ______
7 ______     8 ______
9 ______     10 ______
11 ______    12 ______
13 ______    14 ______
15 ______    16 ______
17 ______    18 ______
19 ______    20 ______
21 ______
98 Chapter 12
SUMMARY
Range of Talent
range or the standard deviation of the distribution, affects the value of r when
r is used as an indicator of the correlation between this variable and any other
The value of r will be smaller / larger in those situations in which the range
of either X or Y (or both) is less, other things being equal [12.1]. This means
that there is no such thing as the correlation between two variables, and that
Other things being equal, the greater the restriction of range in X and/or Y,
Heterogeneity of Samples
On the left side of Figure 12.2 (p. 200) is a scatter diagram showing the
the combined data, however, it will be smaller (though still positive). Why?
When the data are pooled, the scores no longer hug the _ line
(which now must be a line lying amidst the two distributions) as closely as they
samples, in the mean of _ but not in the mean of _ [12.2]. Other types
shows a situation in which a second sample differs in that both the mean of X
and the mean of Y are higher than in the first sample. In this case, the corre¬
lation will be greater / smaller among the pooled data than among the separate
samples [12.2].
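A small simulation (mine, not the text's) makes the range-of-talent effect visible: restrict X to a narrow band and the obtained r drops.

```python
import random

random.seed(1)

def pearson_r(X, Y):
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
    sxx = sum((a - mx) ** 2 for a in X)
    syy = sum((b - my) ** 2 for b in Y)
    return sxy / (sxx * syy) ** 0.5

# Simulated data: Y depends on X plus noise, so r is about .7 overall.
X = [random.gauss(0, 1) for _ in range(2000)]
Y = [x + random.gauss(0, 1) for x in X]

r_full = pearson_r(X, Y)

# Now keep only the cases with X in a narrow band: a restricted range of talent.
pairs = [(x, y) for x, y in zip(X, Y) if -0.5 < x < 0.5]
r_restricted = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])

print(round(r_full, 2), round(r_restricted, 2))  # the restricted r is smaller
```

Nothing about the underlying relationship changed between the two computations; only the range of X did.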
regression equation is cast in standard score form. In this case, the correla¬
in X [12.3] .
When the regression equation for predicting Y from X is stated in raw score
deviation above the mean [12.4]. On the other hand, if parents' intelligence
is two standard deviations below the mean, the predicted intelligence of their
offspring is only one standard deviation above / below the mean [12.4]. To put
it in other words, bright parents will tend to have children who are brighter/
duller than average, but not as bright as they, and dull parents will tend to
have children who are dull / bright , but not as dull as their parents [12.4].
the mean is, of course, characteristic of any relationship in which the correla¬
tion is less than perfect / zero [12.4]. The more extreme the value from which
prediction is made, the greater / lesser the amount of regression toward the
mean [12.4]. The higher the value of r, the less / greater the amount of re¬
gression [12.4].
are selected because of their extreme position (either high or low) on one vari¬
opposite direction, but more / less extreme [12.5]. The two variables could
be test and retest on the same measure (as in the excellent example offered in
the first paragraph of Section 12.5), or they could be two different variables.
Again, it should be noted that the amount of regression will depend on the size/
teristics less extreme than themselves, how is it that, after a few generations,
we do not find everybody at the center? The answer is that regression of pre¬
about the predicted values, and the greater the degree of regression toward the
mean, the greater / lesser the amount of variation [12.6]. Specifically, Y',
the predicted value of Y, is only the predicted of Y for those who ob¬
of X will be distributed about Y' with a standard deviation equal to _____ [12.6]
The lower the value of the correlation coefficient, the greater the value of
_, so that the greater the degree of regression on the mean, the greater/
lesser the variation of obtained Y values about their predicted values [12.6].
might call it, the measure of error of prediction, is given by the standard error
in which case SYX = ______ [12.7]. The ratio of SYX to ______ gives the proportion
of the maximum possible predictive error that characterizes the present
predictive circumstances [12.7], and this ratio turns out to equal √(1 − r²). This
______ [12.7].
When the value of k is close to unity (its maximum value), the magnitude of
predictive error is close to its maximum / minimum [12.7]. On the other hand,
when the value of k is close to zero, most / little of the possible error of
prediction has been eliminated [12.7]. The first two columns of Table 12.1 on
p. 208 show how k changes as r changes and make it clear that a given change in
creases more slowly / rapidly than does the magnitude of the correlation
coefficient [12.8].
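Both quantities are easy to compute; a sketch with a made-up SY (the value 10 is mine, not the text's):

```python
# k = S_YX / S_Y = sqrt(1 - r^2): the proportion of the maximum possible
# predictive error that remains at a given r.
S_Y = 10.0  # made-up standard deviation of Y

for r in (0.0, 0.5, 0.8, 0.9, 1.0):
    k = (1 - r ** 2) ** 0.5    # index of forecasting efficiency's complement
    S_YX = S_Y * k             # standard error of estimate
    print(f"r = {r:.1f}   k = {k:.3f}   S_YX = {S_YX:.2f}")
```

Note how slowly k falls at first: even r = .50 leaves about 87% of the maximum possible error, which is the point the first two columns of Table 12.1 make.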
[12.9]. A somewhat more cheerful outlook may be had by considering the
proportion of correct placements that occur when the regression equation is used to
defined as scoring above the median on the criterion variable (Y), and that
those who are selected as potentially successful are those who score above the
median on the predictor variable (X). The last two columns of Table 12.1 on
SYMBOLISM DRILL

Symbol    Pronunciation    Meaning

3   X                          ______
4   Σ                          ______
6   μ                          ΣX/N; the ______ of a ______
5   X̄                          ΣX/n; the ______ of a ______
9   x                          X - __ or X - __; ______ score
11  σ²                         Σ__/__; ______ of a ______
12  σ                          √Σ__/__; ______ of a ______
13  s²                         Σ__/__; ______ of a ______
14  s                          √Σ__/__; ______ of a ______
15  z                          __/σ or __/S; ______ score
16  r                          Pearson cor'n coef't for a ______
17  ρ                          Pearson cor'n coef't for a ______
18  Y'   "wi prime"            Predicted ______ score on ______
19  z'Y  "zee prime sub wi"    Predicted ______ score on ______
20  SYX  "es sub wi eks"       ______
CHAPTER 13
7 ______     8 ______
9 ______     10 ______
SUMMARY
hypothesize that the value we have in mind characterizes the population / sample
sample results one would expect to obtain if the hypothesis were correct /
incorrect [13.2, next to last paragraph on p. 218]. If the sample outcome is not
in accord with what one would expect, we will accept / reject the hypothesis
[13.2] .
need be stated. Rather, the question is, what is the sample / population value
[13.3] ? To answer the question, a sample is drawn and studied and an inference
are studying will vary / stay the same from sample to sample [13.4]. The key
composed of values (such as the mean) characterizing samples of some one
particular size drawn repeatedly from the same population is known as a sampling
distribution. Such a distribution tells us what will
happen when samples of that size are drawn from that population.
There is just one basic method of sampling that permits sampling distribu-
The Basis of Statistical Inference 105
is known [13.4]. One kind of probability sample is the random sample, which is
a sample so drawn that each possible _ of that size has an equal prob¬
It is the method of selection, and not the particular sample outcome, that
defines a sample as random. If we were given a population and a sample from that
population, it would be possible / impossible to say whether the sample was random
population, but a family of such distributions, one for each possible sample
_ [13.7, p. 224].
completely defined by specifying its mean, standard deviation, and shape. The
mean of any random sampling distribution of means is the same as the mean of
eks bar."
*Actually, there are two random sampling distributions for each such case:
one for sampling with replacement and one for sampling without replacement.
This is a nuance hinted at in the second footnote on p. 223 and treated fully in
Section 14.5.
σX̄ = ______        [Formula 13.2]
The formula shows that (a) means vary more / less than scores do (when sample
size is at least two), (b) means vary more / less where scores vary less, and
(c) means vary more / less when sample size is greater [13.7].
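A sketch with a made-up σ shows all three points at once (the standard error of the mean is σ/√n, as the symbolism drill below records):

```python
# Standard error of the mean: sigma_xbar = sigma / sqrt(n)  (Formula 13.2).
sigma = 12.0  # made-up population standard deviation

for n in (1, 4, 16, 64):
    se = sigma / n ** 0.5
    print(f"n = {n:3d}   standard error of the mean = {se:.2f}")
# (a) for n >= 2 the means vary less than the scores do (se < sigma);
# (b) a smaller sigma would shrink every se; (c) larger n shrinks se further.
```

Quadrupling the sample size cuts the standard error of the mean in half, which is why large samples give such stable means.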
Central Limit Theorem informs us that the random sampling distribution of means
Thus even when the population of scores differs substantially from a normal
were normally distributed when sample size is reasonably small / large [13.7].
[If you feel a need for a summary of the material on probability, look ahead
to the first two paragraphs on p. 114 of this workbook.]
[Concept map: the random sampling distribution of the mean, symbolized with X̄,
gives the probability that a single sample of size n will have a mean that
falls between certain limits, as described by the Central Limit Theorem.]
EXERCISES
To gain some insight into the random sampling distribution of means, turn
to p. 547 of the text. There are 2500 single digits on this page, which we can
take to be a population. What are the characteristics of this population?
1. We are assured that the digits were chosen at random, so each of the ten
possible values (1 through 9 plus 0) occurs about 1/10 of the time in this
population. Let's assume the figure is exactly 1/10 for each digit; the assumption
can't be far wrong. (You're welcome to make a frequency distribution to check
the assumption; it shouldn't take more than a week.) On this assumption, then,
each value has a frequency of 1/10 of 2500, or 250, and the shape of the dis¬
tribution is thus rectangular.
Now, let's approximate the random sampling distribution of means for the
case in which samples of size two are drawn from this population. To draw a
first sample of size two, we should pick two digits in such a way that all
samples of size two have the same chance of occurrence. That's tough to do.
For present purposes, it will be sufficient to close your eyes and put the point
of a pencil down somewhere in the table. Take the digit closest to the point
as the first element of the sample, and take the digit to the right of this one
(or to the left of it, or above it, or below it—whatever you want) as the sec¬
ond element. Record the mean of these two numbers on the next page. Now repeat
this procedure to get another sample of size two, and continue in this way
until you have 20 samples. (To generate the real sampling distribution, you
should continue forever, but then you'd never finish this course.) If you
get tired of closing your eyes each time, just read off pairs of digits starting
any old place on the page; that'll be good enough.
1 ___    6 ___    11 ___    16 ___
2 ___    7 ___    12 ___    17 ___
3 ___    8 ___    13 ___    18 ___
4 ___    9 ___    14 ___    19 ___
5 ___    10 ___   15 ___    20 ___
Putting these 20 means together now will get you a rough approximation of
the random sampling distribution of the mean for samples of size two selected
from that population of 2500 digits.
And now, to gain even more insight, draw 20 samples all of some size greater
than two. Ten is a convenient size, because the division involved in finding
the mean of a sample is then simple. Again record the mean of each sample.
1 ___    6 ___    11 ___    16 ___
2 ___    7 ___    12 ___    17 ___
3 ___    8 ___    13 ___    18 ___
4 ___    9 ___    14 ___    19 ___
5 ___    10 ___   15 ___    20 ___
Again make at least a rough histogram to get the shape of the distribution of
these 20 sample means. It should be even closer to normal than the shape of
the distribution for the case in which sample size was only two. Also
calculate the mean of your collection of means and the standard deviation. The
latter should be less than the standard deviation of the raw scores in the
population again, and less than the standard deviation of the distribution for the
case in which sample size was two. Finally, calculate the theoretical standard
error of the mean for samples of whatever size you used this second time.
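If you'd rather let a machine do the drawing, here is a sketch of the same exercise (my code, not the text's): digits 0 through 9, each equally likely, with 5000 samples each of size 2 and size 10.

```python
import random

random.seed(42)

def sd(xs):
    """Standard deviation (divide by N) of a list of numbers."""
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

# The population: digits 0-9, each with probability 1/10 (rectangular shape).
population_sd = sd(list(range(10)))  # about 2.87

means_2  = [sum(random.choices(range(10), k=2)) / 2   for _ in range(5000)]
means_10 = [sum(random.choices(range(10), k=10)) / 10 for _ in range(5000)]

print(f"SD of the scores     = {population_sd:.2f}")
print(f"SD of means (n = 2)  = {sd(means_2):.2f}   theory: {population_sd / 2 ** 0.5:.2f}")
print(f"SD of means (n = 10) = {sd(means_10):.2f}   theory: {population_sd / 10 ** 0.5:.2f}")
```

The empirical standard deviations come out close to σ/√n, and the n = 10 means cluster far more tightly than the n = 2 means, just as the exercise asks you to observe by hand.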
SYMBOLISM DRILL
Symbol    Pronunciation    Meaning

3   ___    A raw score, or the set of raw scores
4   ___    Result of summing quantities of some kind
6   ___    ΣX/N; the mean of a ______
5   ___    ΣX/n; the mean of a ______
9   ___    X - X̄ or X - μ; deviation score
12  ___    √Σx²/N; ______ of a ______
11  ___    Σx²/N; ______ of a ______
13  ___    Σx²/n; ______ of a ______
14  ___    √Σx²/n; ______ of a ______
15  ___    x/σ or x/S; ______
17  ___    Pearson correlation coefficient for a pop'n
16  ___    Pearson correlation coefficient for a sample
18  ___    Predicted raw score on Y
19  ___    Predicted z score on Y
20  ___    Standard error of estimate of Y on X
14.1 Introduction
1 ______
2 ______
3 ______
4 ______
SUMMARY
Questions of probability arise when a repeatable event occurs and gives rise
to one of two or more possible outcomes. Examples are flipping a coin, which
gives rise to the outcome "heads" or the outcome "tails," and drawing a card from
a deck of playing cards, which gives rise to one of 52 possible outcomes. The
occurrence of such an event is called a trial. What now do we mean when we speak
when trials are repeated over and over again indefinitely: The probability of
infinite series of trials can never be obtained. The best we can do is to repeat
the event of interest some finite number of times and compute the proportion of
trials characterized by A. This gives us what the present chapter calls an
empirical probability.
If we know that the several possible outcomes of an event are equally likely,
we don't have to fuss with empirical probabilities. Instead we can find a prob¬
_ [14.2].
In a roll of a fair die, for example, the six different faces are equally likely
to turn up. Thus the probability of rolling a face with an odd number of spots
is 3/6, or 1/2, because 3 faces yield this characteristic (the faces with 1, 3,
and 5 spots), and there are a total of 6 faces. A probability like this computed
Questions about probability sometimes take the form: what is the probability
of this OR that happening when some event occurs? Such a question can be readily
answered by using the addition theorem of probability, but only if the outcomes
of interest (the "this" and the "that") are mutually exclusive. Outcomes are
of the occurrence of any of the others [14.3]. Another way to say this is that
two outcomes are mutually exclusive if they cannot occur on the same trial. In
drawing a card from a deck of playing cards, for example, the outcomes King and
Queen are mutually exclusive, because no card can be both a King and a Queen.
But the outcomes King and Club are not mutually exclusive, because a card can be
both a King and a Club.
Other questions about probability take the form: what is the probability of
this AND that happening? Such a question can be readily answered by using the
multiplication theorem of probability, but only if the event that might generate
the "this" and the event that might generate the "that" are independent.
Independence of events means that the outcome of one event must have some / no
influence on and in some / no way be related to the outcome of the other event [14.3].
[14.3] .
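Both theorems can be verified by brute-force counting; here is a sketch of mine (the deck and the coins stand in for the text's examples):

```python
from fractions import Fraction

# A 52-card deck, enumerated exactly.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["club", "diamond", "heart", "spade"]
deck = [(r, s) for r in ranks for s in suits]

def p(event):
    """Exact probability of an event (a predicate on cards)."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

king  = lambda c: c[0] == "K"
queen = lambda c: c[0] == "Q"
club  = lambda c: c[1] == "club"

# Mutually exclusive outcomes: the addition theorem holds exactly.
print(p(lambda c: king(c) or queen(c)), "=", p(king) + p(queen))
# Not mutually exclusive: simple addition counts the King of Clubs twice.
print(p(lambda c: king(c) or club(c)), "vs", p(king) + p(club))

# Independent events: the multiplication theorem. Two fair coin tosses:
outcomes = [(a, b) for a in "HT" for b in "HT"]
p_both_heads = Fraction(sum(1 for o in outcomes if o == ("H", "H")), len(outcomes))
print(p_both_heads, "=", Fraction(1, 2) * Fraction(1, 2))
```

Counting every case by hand is exactly what the a priori definition of probability licenses when the outcomes are equally likely.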
are considered together, as in the tossing of two coins or the result of tossing
Random Sampling
[14.4]. The reverse is / is not true; giving equal probability to the elements
does / does not necessarily result in equal probability for samples [14.4].
Although there is only one way to define a random sample, there are two
sampling plans that yield a random sample. One plan is sampling without
replacement. The characteristic of this method is that an / no element may appear more
than once in a sample [14.5]. The other plan is sampling with replacement.
Under this plan it is possible / impossible to draw a sample in which the same
element appears more than once [14.5]. Both of these plans can satisfy the
condition of random sampling, but certain sample outcomes possible when sampling
with/without replacement are not possible under the other method [14.5].
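Python's random module happens to implement both plans; a sketch, not from the text:

```python
import random

random.seed(7)
population = list(range(1, 11))  # a made-up population of ten elements

# Without replacement: no element can appear more than once in the sample.
without = random.sample(population, 5)
# With replacement: the same element may appear more than once.
with_repl = random.choices(population, k=5)

print(without, "-> all distinct:", len(set(without)) == len(without))
print(with_repl)
```

Run the second line a few times and repeats will eventually show up in the with-replacement sample; they can never appear in the first.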
completely: mean, standard deviation, and shape. The first and last of these are
ment [14.5]. The formula given in Chapter 13 for the standard deviation (for the
"standard error of the mean") is strictly correct only if sampling is with re¬
placement. Despite the fact that most sampling in behavioral science is done
without replacement, we typically use the Chapter-13 formula for the standard
error of the mean. No/ Considerable harm is done in the usual case, where the
In the previous chapter, the random sampling distribution of means was
conceived as the result of a(n) large / infinite series of sampling trials [14.6].
Another view is possible: the random sampling distribution of means is the rela¬
a given size that could be formed from a given population [14.6]. This definition
appears on pp. 240 - 243. Among the important insights this example offers is
the point that random sampling results in equal probability of occurrence of any
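The "large series of sampling trials" view of the sampling distribution lends itself to a computer check, though the text itself works the example by hand on pp. 240 - 243. The sketch below (a hypothetical six-score population of my own, not the text's) draws many samples of n = 2 with replacement; the mean of the resulting distribution of sample means converges on the population mean:

```python
import random
import statistics

random.seed(3)

# A small hypothetical population of raw scores.
population = [2, 4, 6, 8, 10, 12]
mu = statistics.mean(population)  # population mean = 7.0

# Approximate the random sampling distribution of means with a long
# series of sampling trials (n = 2, sampling with replacement).
means = [
    statistics.mean(random.choices(population, k=2))
    for _ in range(50_000)
]

# The mean of the sampling distribution converges on mu.
print(statistics.mean(means))  # close to 7.0
```

With more trials the approximation to the theoretical distribution improves; the theoretical distribution itself corresponds to the "all possible samples" view.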
SYMBOLISM DRILL
Symbol      Pronunciation       Meaning
4    Σ
5    X̄          ΣX/n; the ________ of a ________
6    µ          ΣX/N; the ________ of a ________
9    x          X − X̄ or X − µ; ________ score
11   σ²         Σ____/____; ________ of a ________
12   σ          √(Σ____/____); ________ of a ________
13   S²         Σ____/____; ________ of a ________
14   s          √(Σ____/____); ________ of a ________
15   z          x/σ or x/S; ________ score
16   r
17   ρ
18   Y′
19   z′_Y
20   S_YX
21   µ_X̄
22   σ_X̄       Standard ________ of the ________; σ/√n
CHAPTER 15
15.1 Introduction
1 ____   2 ____   3 ____
4 ____   5 ____   6 ____
7 ____   8 ____   9 ____
10 ____  11 ____  12 ____
13 ____  14 ____
119
120 Chapter 15
SUMMARY
This chapter introduces the logic of hypothesis testing and the procedure
for testing a hypothesis about the mean of a single population. The procedure,
strictly speaking, requires knowledge of the standard deviation of the popula¬
tion, but this will rarely be known, so it must usually be estimated from the
sample on hand. Naturally, substituting an estimate for the real thing intro¬
duces some error, but the larger the sample the smaller / larger the error
[15.1]. So, if sample size is large enough (n greater than 40 or so), the error
will be small enough so that the procedure described here is satisfactory. For
this reason, the procedure is sometimes known as the large-sample method. It
takes the normal curve as a model for a certain sampling distribution.
Estimating the Population's Standard Deviation and the Standard Error of the Mean
σ will be unknown, and it must be estimated from the sample. One would think
that ____, the sample standard deviation, would be the proper estimate, but it
is not quite satisfactory; the preferred estimate is symbolized s, and its
formula is:

    s = __________          [Formula 15.1]

The defining formula for S (which should now be read "big es") is ____; the
formula for s differs only in its divisor, n − 1 in place of n [15.4]. The change in the divisor makes s a bit larger / smaller than S [15.4].
[It is now highly important to distinguish "big S" from "little s." One way
to keep them apart in your handwriting is to make the capital letter large, like
this: S, while making the small letter small and of the script variety, like
this: s.]
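The effect of the divisor (n versus n − 1) is easy to see with a computer check, though the workbook expects you to fill in the formulas yourself. The sketch below (a hypothetical five-score sample of my own) computes both "big S" and "little s" from the same sum of squared deviations:

```python
import math

# Hypothetical sample of n = 5 scores.
scores = [3, 5, 7, 9, 11]
n = len(scores)
mean = sum(scores) / n                      # X-bar = 7
ss = sum((x - mean) ** 2 for x in scores)   # sum of squared deviations = 40

S = math.sqrt(ss / n)        # "big S": divisor n       -> sqrt(8)
s = math.sqrt(ss / (n - 1))  # "little s": divisor n - 1 -> sqrt(10)

print(S, s)  # s comes out a bit larger than S
```

The smaller divisor always makes s a bit larger than S, and the difference matters less and less as n grows.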
Substituting s for σ in the formula for the standard error of the mean yields
the working formula for estimating the standard error. The estimate of the standard error is symbolized s_X̄, to distinguish it from σ_X̄. The formula for the
estimate is:

    s_X̄ = __________          [Formula 15.2]
Substituting s for σ in making this estimate takes care of bias that would be
Testing Hypotheses about Single Means: Normal Curve Model 121
introduced if S had been used, but different samples will always yield the same
estimate / still yield different estimates, and so the constant / variable error
introduced by substituting an estimate for the true value remains [15.4]. Procedures described here make / do not make allowance for this error, so we must
remember to use them only when samples are large / small enough [15.4].
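As a check on your answer to Formula 15.2, here is a small computer sketch (not part of the text's procedures) that computes s/√n for a hypothetical sample:

```python
import math

def estimated_standard_error(scores):
    """Estimate the standard error of the mean as s / sqrt(n),
    where s uses the divisor n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))
    return s / math.sqrt(n)

# Hypothetical sample: s = sqrt(10) and n = 5, so s_xbar = sqrt(2).
print(estimated_standard_error([1, 3, 5, 7, 9]))  # about 1.41
```

Notice that the estimate shrinks as n grows: quadrupling the sample size halves the estimated standard error.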
Stating Hypotheses
that the mean has a certain specific value (e.g., µ = 30). Such a statement is
the hypothesis that the researcher will test and will decide to accept or reject.
A nondirectional hypothesis states that the population mean does not equal the
value specified by the null hypothesis (e.g., µ ≠ 30), without saying whether the
mean is less than the value specified by the null or greater. Use of a nondirectional alternative gives attention both to the possibility
that µ is greater than the value hypothesized (by H₀) and to the possibility that
it is less [15.7].
A directional hypothesis states either
that the population mean is less than the value specified by the null hypothesis
(e.g., µ < 30) or that the population mean is greater than this value (e.g., µ
> 30). A directional alternative hypothesis is appropriate when it is only of
interest to learn that the true value of µ differs from the hypothesized value
in a particular direction [15.7].
The alternative hypothesis may thus be either a directional or a nondirectional one. The choice should be determined by the rationale that gave rise to
the study, and should be made before / after the data are gathered [15.7].
if the obtained sample mean is located in an extreme position in just one / either
122 Chapter 15
In testing a null hypothesis, one draws a sample at random from the popula¬
tion of interest and determines the mean of the sample. If the sample mean is
so different from what is expected when H₀ is true that its appearance would be
rarity of occurrence is so great that it seems better to reject the null hypoth-
esis than to accept it? Common research practice is to reject H₀ if the sample
What sample means would occur if H₀ were true? If it were true, the random
sampling distribution of means (for whatever sample size is used) would center
on the value specified by the null hypothesis, because the mean of a random sampling distribution of means, µ_X̄, is equal to the mean of the population of raw
scores, µ. The value that the null hypothesis specifies is symbolized µ_hyp, so if H₀ is true, the distribution is centered on µ_hyp.
One can thus draw a picture of the random sampling distribution of means
(for whatever sample size is used) on the assumption that H₀ is true. Such a
picture appears in Figures 15.2 and 15.3 (which are essentially the same), 15.4,
and 15.5. Each pictured distribution is centered on µ_hyp, which takes the value
30 for the example used in this chapter.
The sampling distribution of means that would occur if H₀ were true is divided into a region of acceptance and one or two regions of rejection. If the
obtained sample mean falls within a region of acceptance, the null hypothesis is
accepted; if it falls within a region of rejection, the null hypothesis is rejec¬
ted.
The region of acceptance always covers the central portion of the distribution
and always includes __________, as Figures 15.2 through 15.5 show.
There are two regions of rejection, one in each tail, if the alternative hy¬
pothesis is nondirectional, as in Figures 15.2, 15.3, and 15.4. There is just
one region of rejection, located in just one of the tails, if the alternative
hypothesis is directional. If the alternative hypothesis specifies a value for
Testing Hypotheses about Single Means: Normal Curve Model 123
y less than that named by the null, the region of rejection lies in the left
(lower) tail, as in the left half of Figure 15.5; if the alternative specifies
a value for y greater than that named by the null, the region of rejection lies
in the right (upper) tail, as in the right half of Figure 15.5.
In every case, the total area of the region or regions of rejection is equal
to α, the level of significance. If there are two regions, each has half of
this amount (half of .05 in Figures 15.2 and 15.3, half of .01 in Figure 15.4).
The base line in the pictures of the random sampling distribution of means
implied by the null hypothesis is divided into regions of acceptance and rejec¬
tion by one or two z scores. These z scores are found by consulting Table C in
Appendix F and are called critical values, symbolized z_crit.
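Table C is a table of areas under the normal curve; with a computer the same critical values can be recovered from the inverse of the normal cumulative distribution (a check, not a substitute for learning to read the table):

```python
from statistics import NormalDist

normal = NormalDist()  # the standard normal curve used as the model here

# Two-tailed test at alpha = .05: half of .05 lies in each tail.
z_crit_two_tailed = normal.inv_cdf(1 - 0.05 / 2)
print(round(z_crit_two_tailed, 2))  # 1.96

# One-tailed test at alpha = .05: all of .05 lies in one tail.
z_crit_one_tailed = normal.inv_cdf(1 - 0.05)
print(round(z_crit_one_tailed, 3))  # 1.645
```

For a lower-tail test the critical value is simply the negative of the upper-tail value, since the normal curve is symmetric.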
To determine whether the obtained sample mean falls within the region of
the mean is the standard deviation [15.6, Paragraph 2]. Consequently, the loca-
Now since σ is unknown, σ_X̄ is also unknown, and the sample mean can /
Concluding a Test
To conclude the test of a null hypothesis about the mean of a single popu¬
lation, the approximate z score that locates the obtained sample mean within
the random sampling distribution of means implied by the null hypothesis is com¬
pared with the critical z value or values. If the obtained sample mean, as in¬
dicated by the approximate z, falls in the region of rejection, the null hypoth-
124 Chapter 15
tance, the null hypothesis is accepted. The decision to "accept" H₀ does not
mean that it is likely that H₀ is true / false, but only that it could be
true / false [15.6, p. 257]. For this reason, some statisticians prefer to say
Also for this reason, if the null hypothesis is accepted, the alternative hypoth¬
Statistical Jargon
that "the outcome was significant at the 5% level"? This usually means that a
decision criterion was α = ____, and the evidence from the sample led to
"not significant" (sometimes abbreviated n.s.) imply that the null hypothesis
could / could not be rejected [15.9]. When a report simply says "not significant" without stating the value of α, it is probably safe to assume that it was
____ [15.9].
means only that the sample value was / was not within the limits of sampling
variation expected under the null hypothesis [15.9]. Whether the difference
impression that their conclusions are general ones, but a little probing will
likely reveal that the conclusions apply only to subjects who are of a particular
and a ___ one (one about the subject matter) [15.11]. The
ing of the study for psychology, or education, or some other discipline [15.11].
[If something in this chapter remains unclear to you, the next chapter may
help. It presents some of the details of the logic and the procedures introduced
here. Don't be afraid to look ahead and search the next chapter for more infor¬
mation on anything you're still puzzled about.]
126 Chapter 15
[Concept map, reconstructed from the original diagram: The Null Hypothesis, symbolized H₀, states the value of µ. The sampling distribution implied by H₀ is centered, if H₀ is true, on µ_hyp, and it is divided into a region of acceptance and a region of rejection. The region of rejection has area equal to α, the level of significance, and appears in one or both tails of the distribution, depending on the alternative hypothesis. H₀ is accepted if the obtained sample mean falls in the region of acceptance and rejected if the obtained sample mean falls in the region of rejection. Locating the sample mean uses s_X̄ = s/√n, where s = √(Σx²/(n−1)). The Alternative Hypothesis, symbolized H_A, may be nondirectional, requiring a two-tailed test of H₀, or directional, requiring a one-tailed test of H₀.]
Testing Hypotheses about Single Means: Normal Curve Model 127
SYMBOLISM DRILL
9    x       X − X̄ or X − µ; ________ score
11   σ²      Σ____/____; variance of a population
15   z       x/σ or x/S; ________ score
19   z′_Y    Predicted z score on Y
16.1 Introduction
16.10 The Problem of Bias in Estimating σ_X̄
1 ____   2 ____
3 ____   4 ____
5 ____   6 ____
7 ____   8 ____
9 ____   10 ____
11 ____  12 ____
13 ____  14 ____
129
130 Chapter 16
SUMMARY
This chapter spells out some of the details of the logic and the procedures
introduced in the previous chapter.
There are three important points to note concerning the null hypothesis, H₀
[16.2]. (b) H₀ is expressed in terms of a point value / range [16.2]; that is,
it states only one particular value for the population parameter of interest.
(c) The decision to accept or reject "the hypothesis" always has reference to
The term "null hypothesis" makes little sense in the case of an hypothesis
about the mean of a single population, but it is appropriate in the case of a
hypothesis about the relationship between the mean of a first population and the
mean of a second one. Here the hypothesis typically states that there is no
difference between the two population means, and the word null means zero.
results, and it is possible to detect a discrepancy between the true value and
the hypothesized value of the parameter irrespective of the direction / for only
finding that the null hypothesis is true and finding that a difference exists in
to use a one-tailed alternative must always flow from the logic of the substan¬
Suppose α is set at .05. When the null hypothesis is true, ____% of the sample
decide to adopt α = .05, we are really saying that we will accept a probability
of .05 that the null hypothesis will be accepted / rejected when it is really
true [16.4].
To reduce the risk, we may set α at a lower / higher level [16.4]. In this
For general use, α = .05 and α = .01 make quite good sense. They tend to
give reasonable assurance that the null hypothesis will not be rejected unless
it really should be. At the same time, they are not so stringent as to raise
decision to reject means that we do not believe the mean of the population to
be what the null says it is. Moreover, the lower / higher the probability of
obtaining a sample mean of the kind that occurred when the null hypothesis is
true, the greater the confidence we have in the correctness of our decision to
On the other hand, accepting the null hypothesis means/ does not mean that
we believe the hypothesis to be true [16.5]. Rather, this decision merely re¬
flects the fact that we do not have sufficient evidence to accept / reject the
null hypothesis [16.5]. To put it another way, the decision to accept means
simply that the hypothesis is a tenable one. Certain other hypotheses that
might have been stated would also have been accepted if subjected to the same
test.
In short, rejecting the null hypothesis means that it does not seem reason¬
able to believe that it is true / false , but accepting the null hypothesis
132 Chapter 16
merely means that we believe that the hypothesis could / must be true [16.5].
It does not mean that it must / could be true, or even that it is probably true,
for there would be many other hypotheses that if tested with the same sample
an estimated z score, "z" ("approximate z"), that indicates where the obtained
sample mean falls within the sampling distribution of means that would occur if
the null hypothesis were true. If "z" is large enough (in the sense of being
far away from zero, either above it or below it), we will reject the null hypoth¬
esis. (If the test is one-tailed, "z" must fall on the appropriate side of the
Now the magnitude of "z" depends both on the quantity in the numerator and
on the quantity in the denominator:

    "z" = (X̄ − µ_hyp) / (s/√n)

Other things being equal, if sample size n is very large, the denominator, s/√n,
will be quite large / small [16.6]. In this event, a relatively large / small
discrepancy between X̄ and µ_hyp may produce a value of "z" large enough to lead
us to reject the null hypothesis [16.6]. In cases of this kind, we may have a
the standard error of the mean will be relatively large / small , and it will be
difficult / easy to discover that the null hypothesis is false, if indeed it is,
or reject the null hypothesis. Either decision may be in error; thus there are
false [16.7]. The probability of committing a Type I error is α, the level of
ations where the null hypothesis is true / false [16.7]. If the null hypothe¬
false [16.7]. The Greek letter ____ (beta, pronounced "bayta") is used to in-
exists only in situations where the null hypothesis is true / false [16.7]. If
error [16.7].
Misuses of α
Some researchers evaluate the outcome of the test of a null hypothesis by
if the null hypothesis were true. For a given outcome they might report, say,
that "p < .03." This probability statement is an expression of the rarity of
the sample outcome if the null were true and nothing more. It can / cannot be
interpreted as the value of α [16.8], which is a statement of the risk the researcher is willing to take of rejecting a true null hypothesis.
Multiple Tests
Suppose that several hypothesis tests are conducted using the same level of
significance, say .05. For each test taken individually, the probability of a
Type I error is ____, but taken as a group, the probability that at least one
from among the several will prove to be a false positive is greater / less than
.05 and continues rising / falling as more tests are made [16.9].
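The rise in the group-wise risk is easy to compute: for k independent tests, the probability of at least one false positive is 1 − (1 − α)^k. A quick sketch (my own illustration, not the text's):

```python
# Probability of at least one Type I error ("false positive") across
# k independent tests, each conducted at alpha = .05.
alpha = 0.05
for k in (1, 2, 5, 10):
    p_at_least_one = 1 - (1 - alpha) ** k
    print(k, p_at_least_one)
```

With five tests the risk already exceeds .22, and with ten it passes .40; this is why the probability keeps rising as more tests are made.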
Bias in Estimating σ_X̄
The standard error of the mean, symbolized σ_X̄, is computable from the formula
σ_X̄ = σ/√n.
When σ is unknown, as it usually is, it must be estimated from a sample.
The basic problem is that the sample variance, S², is a biased estimator of
σ². An estimator is unbiased when the mean value of the
estimates made from all possible samples equals the value of the parameter estimated [16.10]. But the mean value of S², calculated from all possible samples
of any given size that could be drawn from a given population, is a little smaller
than σ².
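The "all possible samples" claim can be verified exhaustively on a tiny hypothetical population (my own example, not the text's): list every sample of size 2 drawn with replacement, and compare the average of the S² values with σ²:

```python
from itertools import product
from statistics import mean, pvariance

# Tiny hypothetical population.
population = [2, 4, 9]
sigma2 = pvariance(population)  # population variance

n = 2
# Every possible sample of size n, sampling with replacement.
samples = list(product(population, repeat=n))

def big_S2(sample):
    """Sample variance with divisor n (the biased "big S squared")."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)

def little_s2(sample):
    """Estimate with divisor n - 1 (the unbiased "little s squared")."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

print(mean(big_S2(t) for t in samples), sigma2)  # mean of S^2 < sigma^2
print(mean(little_s2(t) for t in samples))       # equals sigma^2 exactly
```

The average S² falls short of σ², while the average s² hits it exactly; that is what "unbiased" means.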
The formula for the sample variance is:

    S² = __________          [p. 278]

If we change the divisor and take the square
root of the formula, we have an estimate of the standard deviation of the population, symbolized s:

    s = __________          [p. 278]

If s is then substituted for σ in the formula for the standard error of the
mean, we have:

    s_X̄ = __________          [p. 279]
mean makes for a better estimate on the average, we should recognize that any
[16.10].
Further Considerations in Hypothesis Testing 135
SYMBOLISM DRILL
4    Σ
5    X̄        ΣX/n; the ________ of a ________
6    µ         ΣX/N; the ________ of a ________
9    x         X − X̄ or X − µ; ________ score
11   σ²        Σ____/____; ________ of a ________
13   S²        Σ____/____; ________ of a ________
12   σ         √(Σ____/____); ________ of a ________
14   s         √(Σ____/____); ________ of a ________
15   z         x/σ or x/S; ________ score
16   r
21   µ_X̄
22   σ_X̄      ________ of the ________; σ/√n
32   s²        "little es squared"     Estimate of σ²; Σx²/(n−1)
23   s         Estimate of σ; __________
24   s_X̄      Estimate of ________; s/√n
25   H₀
26   H_A
27   µ_hyp
33   µ_true    True value of µ
28   "z"
29   z_crit
30   α         Risk of Type ____ error; level of ________
17.1 Introduction
137
138 Chapter 17
1 ____   2 ____
3 ____   4 ____
5 ____   6 ____
7 ____   8 ____
9 ____   10 ____
11 ____  12 ____
SUMMARY
Chapter 17 describes the method for testing a hypothesis about the relation
between the mean of a first population and the mean of a second population.
Scores from the first population are called X, those from the second population
are called Y, and the null hypothesis usually states that the two populations
have the same mean, which is to say that the difference between the population
means is zero. In symbols, H₀ says that µ_X − µ_Y = 0. This hypothesis is appro-
priate for many studies in which a variable is measured under two different con¬
ditions. In particular, this is the appropriate null hypothesis for an experiment
in which an experimental condition and a control condition are established and
a sample of scores on some variable is collected in each condition.
The Random Sampling Distribution of the Differences Between Two Sample Means
of scores called Y; its size is symbolized n_Y and its mean Ȳ. Ideally, each
The difference between the two sample means, (X̄ − Ȳ), is the statistic on
which we focus. We ask whether the obtained difference is likely or unlikely to
have occurred if the null hypothesis were true. To answer this question, we must
consult a sampling distribution, in this case the random sampling distribution
of differences between two sample means for samples of the sizes we drew and for
the two populations from which we drew them.
Testing Hypotheses about Two Means: Normal Curve Model 139
(of size n_X) is drawn at random from the population of X scores, and another (of
size n_Y) is drawn from the population of Y scores. The ________ of each is com-
puted, and the difference between these two ________ obtained and recorded
[17.2]. Let the samples be returned to their respective populations and a second
pair of samples be selected in the same way / in a different way [17.2]. The
sample from population X must have size n_X again, and the sample from population
Y must have size n_Y again. Again the two ________ are calculated and the dif-
ference between them noted and recorded [17.2]. If this procedure is repeated
sampled; that is, the characteristics of the sampling distribution would change
ulation means, µ_X − µ_Y [17.4]. This is true regardless of the sample sizes and
regardless of the shapes of the populations. For cases in which the null hypoth¬
between pairs of sample means will be centered on zero if the null is true.
Even when the two populations are not normal, the sampling distribution tends
toward normal, and with bigger sample sizes, the sampling distribution becomes
bar") [17.4]. Its value depends on whether the samples are independent or not.
ments comprising the sample of X scores, and vice versa [17.4]. In ordinary
random selection from two populations, this would / would not be true [17.4].
For this case, the standard error of the difference between two means behaves
Formula 17.1 requires the standard error of the mean of X and of Y and these,
the two population standard deviations must be estimated from the samples. Sub¬
Since s_X̄ = s_X/√n_X and s_Ȳ = s_Y/√n_Y, in practice the formula works out to s_X̄−Ȳ = √(s_X²/n_X + s_Y²/n_Y).
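For independent samples, the estimated standard error of the difference combines the two per-sample standard errors. A computer sketch of that combination, with made-up values of my own rather than data from the text:

```python
import math

def standard_error_of_difference(s_x, n_x, s_y, n_y):
    """Estimated standard error of (X-bar - Y-bar) for two
    INDEPENDENT samples: sqrt(s_x^2/n_x + s_y^2/n_y)."""
    return math.sqrt(s_x ** 2 / n_x + s_y ** 2 / n_y)

# Hypothetical values: s_x = 10 with n_x = 25, s_y = 12 with n_y = 36.
# The per-sample standard errors are 10/5 = 2 and 12/6 = 2,
# so the result is sqrt(2^2 + 2^2) = sqrt(8).
print(standard_error_of_difference(10, 25, 12, 36))  # about 2.83
```

Note that the two squared standard errors add; for dependent samples this formula does not apply, because the correlation between the paired scores must be taken into account.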
sample equals or exceeds __, the error will be small enough that the procedures
There are two basic ways in which dependent samples may be generated: (1)
the same subjects are used for both conditions of the study, and (2) different
subjects are used, but they are __ on some variable related to per¬
formance on the variable being observed [17.9]. When samples are dependent, the
standard error of the difference between means must take account of the
When the parameters are unknown, the formula that estimates σ_X̄−Ȳ is:

    s_X̄−Ȳ = __________          [Formula 17.5]
Testing Hypotheses about Two Means: Normal Curve Model 141
will generally be acceptable, although not entirely accurate, when the number
The alternative hypothesis, H_A, may take one of three forms. The nondirec-
tional form says that the two populations of interest do not have the same mean,
which is to say that the difference between the population means is not
zero. In symbols, this form of H_A says that µ_X − µ_Y ≠ 0. This form gives rise
to a two-tailed test in which the region of rejection is divided between the two
tails of the sampling distribution of differences between sample means, as in
Figure 17.2 on p. 294.
The other possibilities for the alternative hypothesis are directional forms
stating either that the X population has a greater mean than the Y population or
vice versa. In symbols, these forms say either that µ_X − µ_Y > 0 or that µ_X − µ_Y
< 0. These forms give rise to a one-tailed test in which the region of rejection
is located entirely in one tail of the sampling distribution of differences be¬
tween sample means. Figure 17.3 on p. 294 shows the appropriate picture for a
directional alternative of the first kind; for a directional alternative of the
second kind, the region of rejection would be located in the left-hand tail.
No matter what the form of the alternative hypothesis, the region or regions
of rejection have an area equal to α, the level of significance for the test.
As noted above, if the null hypothesis of no difference between the two popu¬
sample means is centered on zero. How deviant is the obtained sample difference
and s_X̄−Ȳ is the estimated ____________ [17.5].
As with problems involving single means, we have an approximate z rather than a
value [17.5]. The use of the symbol "z" will continue to remind us
142 Chapter 17
of this. The formula for the location of (X̄ − Ȳ) in the sampling distribution is:

    "z" = [(X̄ − Ȳ) − (µ_X − µ_Y)_hyp] / s_X̄−Ȳ

Here (µ_X − µ_Y)_hyp ("mew sub-eks minus mew sub-wi, the quantity hype") is the
value of (µ_X − µ_Y) stated in the null hypothesis, which is usually zero.
work involved in finding the correlation r_XY, which is required for the estimate
able .
it is also true that the mean of the population of differences between paired
state of affairs appears below.] If the difference between an X score and its
mean of the population / sample set of difference scores, and inquire whether
duced to a one-sample problem exactly like that treated in the previous two
chapters.
When inference concerns two independent sample means, the samples may be /
must be of different size [17.7]. However, if σ_X and σ_Y are equal, the total
Testing Hypotheses about Two Means: Normal Curve Model 143
[17.7].
The point just noted has to do with the relative size of the two samples.
What about the absolute magnitude of sample size? Other things being equal,
Comparisons between two or more groups may be divided into two categories:
those in which the investigator can assign to each subject any particular treat¬
ment condition, and those in which the investigator cannot. In a study of the
first kind, it is possible for the investigator to assign treatment condition
at random to the subjects, and to do so has important advantages.
among the groups to be compared. (An extraneous influence is one other than the
treatment, which will be one whose effects the investigator would not wish to
entangle with any effect the treatment might have.) Those who are likely to do
well have more / just as much chance of being assigned to one treatment group
than / as they have to another, and the same / opposite is true of those who
this type of experimental control over extraneous influences whether or not they
ment groups guarantees / does not guarantee equality [17.8]. But randomization
to exchange a few subjects from group to group before proceeding with the treatment in order to obtain groups more nearly alike. Such a move improves things /
leads to disaster [17.8]. The standard error formulas are based on the assump¬
tion of randomization, and casual adjustment of this kind makes them more appro¬
Samples (or means) can be dependent for one of two reasons, as noted above:
because one group of subjects appeared in one sample and another group in the
other sample, but the subjects were matched in pairs, one member of each pair
from each sample; or because the same subjects were used in both conditions. In
both cases, randomization can be used to advantage.
taking care to do so independently for each pair of subjects [17.12]. The problem
is more complicated when the same subjects are used for both treatment conditions.
Here, random assignment would mean deciding randomly which treatment the subject
will receive ___________ and which will be given [17.12]. This will
create some problems when the first treatment experience changes the subject in
some way so that she or he performs differently under the second treatment. Prac¬
tice effect and fatigue are two possible influences that might affect a subject's
second performance.
paired observations rather than independent random samples, when a choice is avail¬
dom variation on the differences between means. The standard error measures this
factor. The effect of reducing the standard error of the difference by pairing
is the same as reducing it by increasing sample ________ [17.12]. The less the
error (as measured by the standard error), the less / more likely it is to mask
a true difference between the means of the two populations [17.12]. To put it
I / II error [17.12].
the subjects, the correlation will be higher / lower than otherwise, and the re¬
1. µ_(X̄−Ȳ) ≠ µ_X̄ − Ȳ
("Mu sub eks bar minus wi bar does not equal mu sub eks bar, minus wi bar.")
The quantity on the left here is the mean of a population; that's what the
µ indicates. The population is composed of numbers derived by taking the mean
of a sample of scores called X and subtracting from this mean the mean of a
sample of scores called Y; that's what the subscript X̄−Ȳ indicates. The numbers
described by the subscript are differences between sample means, then, and the
expression µ_(X̄−Ȳ) designates the mean of a sampling distribution of such quantities.
The first paragraph on p. 139 of this workbook describes how such a sampling
distribution could be generated.
The quantity on the right above is a difference, the difference between (a)
the mean (µ) of a population, the elements of which are means (X̄) of samples of
scores called X, and (b) the mean (Ȳ) of a single sample of scores called Y.
You will have no occasion to deal with this bizarre expression in this course,
and probably no occasion to deal with the expression at any other time in your
life, even if you become a professional statistician. Be sure you don't confuse
it with the expression on the left above.
("Sigma sub eks bar minus wi bar does not equal Sigma sub eks Minus Wibar.")
The quantity on the left this time is the standard deviation of a population;
that's what the σ indicates. The population is composed of numbers derived by
taking the mean of a sample of scores called X and subtracting from this mean
the mean of a sample of scores called Y; that's what the subscript X̄−Ȳ indicates.
The numbers described by the subscript are differences between means, then, just
as in the expression µ_(X̄−Ȳ). σ_(X̄−Ȳ) is the standard deviation of a sampling distribution of differences between means, and it has the special name "standard error."
might affect the subject's visual acuity under dim light. The variable that
might be influenced by the independent one is called a dependent variable. It
is not manipulated; rather it is left free to vary, and it is measured for each
subject in each condition of an experiment. In the one described in the text,
the dependent variable is visual acuity under dim light.
If the researcher tested each subject first in one condition and then in the
other condition, though, the mean of the scores for the one condition and the
mean of the scores for the other condition are dependent. Or if the researcher
picked a first subject, looked around for a second one who matched the first in
some way (in visual acuity under normal light for the example on p. 297), flipped
a coin to determine which member of this pair went into which condition, and
continued in this way, matching each subject for one condition with a subject
for the other condition, then again the mean of the scores for the one condition and
the mean of the scores for the other condition are dependent. When means are
dependent, the samples they characterize must be of the same size, and there is
a logical way to pair each score in one sample with a score from the other sample.
In the bottom paragraph on p. 299, the text zips you through the point that
µ_D = 0 if µ_X − µ_Y = 0. If you found that point unclear, this exercise should
help.
Testing Hypotheses about Two Means: Normal Curve Model 147
Dependent means arise when there is some logical way to pair each score in
one condition of a study with a score from the other condition of the study.
Such pairings are shown in Table 17.5 on p. 300 of the text. Here we are asked
to imagine that 20 subjects were chosen at random from some population and given
a preliminary test to determine their reaction time to a white light. Ten pairs
of subjects were then formed on the basis of these reaction times. Within each
pair, the two subjects were equal in the speed of their reaction to the white
light, but some pairs were relatively slow while others were relatively fast.
The reaction times on which the pairings were done do not appear in Table 17.5,
though, and they do not enter into any of the statistical calculations for the
study.
The researcher then flipped a coin or did the equivalent to assign the members
of each pair at random to one condition or the other of the experiment. The pro¬
cedure might have gone like this: Take a pair of subjects; call one of them A
and the other B. If the coin comes up heads, Subject A is tested with the green
light and Subject B with the red light; if the coin comes up tails, it's vice
versa for the subjects.
Reaction times to the colored lights for the ten pairs of subjects are shown
in Table 17.5 in the columns headed X and Y. Each score is the time in milli¬
seconds (thousandths of a second) to respond to a light. The light was green for
one member of each pair, whose score was subsequently called X, and red for the
other member of each pair, whose score was called Y.
Onward to Table 17.6 now. Here the same ten pairs of subjects are listed
in the same order, from Pair #1 down to Pair #10, along with their scores, X or
Y, again. But this time the difference, called D, between the X score and the Y
score for each pair is included. Check the column of D values: 3 = 28 - 25; -1 =
26 - 27; and so on. In general, D = X - Y.
What is D̄? It works out to be +.6, as the upper right corner of the table
indicates. (Check the computation.) So what? Well, X̄ − Ȳ = 27.4 − 26.8 = +.6 too.
This is an instance of the generalization that where difference scores "D" are
computed as X − Y, D̄ = X̄ − Ȳ.
Now construct another such instance yourself. Fill in the missing numbers in
the table below. It's been arranged so that here the mean of the difference
scores works out to be zero.
Pair   X   Y   D = X − Y
  1    6   4      2          ΣX = ____   X̄ = ____ / ____
  2    5   6     ____        ΣY = ____   Ȳ = ____ / ____
  3    7   3     ____        X̄ − Ȳ = ____
  4    5   5     ____        ΣD = ____   D̄ = ____ / ____
If you did your computations correctly, your table will indicate that D̄ =
X̄ − Ȳ.
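If you would like a computer check of the generalization that D̄ = X̄ − Ȳ, the sketch below works it out for a set of hypothetical paired scores of my own (not the workbook's table):

```python
# Hypothetical paired scores: one X and one Y per pair.
pairs = [(28, 25), (26, 27), (30, 26), (24, 24)]

x_scores = [x for x, y in pairs]
y_scores = [y for x, y in pairs]
d_scores = [x - y for x, y in pairs]  # D = X - Y for each pair

x_bar = sum(x_scores) / len(pairs)    # X-bar
y_bar = sum(y_scores) / len(pairs)    # Y-bar
d_bar = sum(d_scores) / len(pairs)    # D-bar

print(d_bar, x_bar - y_bar)  # the two quantities are always equal
assert d_bar == x_bar - y_bar
```

Whatever paired scores you substitute, the two printed values will match, which is the sample-level version of µ_D = µ_X − µ_Y.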
The generalization the text states on p. 299 is just this statement expressed
in terms of population parameters. Corresponding to D̄, the mean of a sample of
difference scores called D, is µ_D. Corresponding to X̄ and Ȳ are µ_X and µ_Y. In
general, µ_D = µ_X − µ_Y. If µ_X − µ_Y = 0, then µ_D = 0 too.
You yourself can prove that this statement is an eternal truth; all it takes
is a very little high-school algebra. If you want to try—and you'll feel good
if you figure out the proof for yourself—the notes below will get you started.
Proof that µ_D = µ_X − µ_Y
In general, a table of the kind under consideration here has the following
form, where N is the total number of pairs of X and Y scores:
Pair   X     Y     D = X − Y
 1     X₁    Y₁    D₁
 2     X₂    Y₂    D₂
 3     X₃    Y₃    D₃
 ⋮     ⋮     ⋮     ⋮
 N     X_N   Y_N   D_N

To prove that µ_D = µ_X − µ_Y, start as follows:

    µ_D = (D₁ + D₂ + D₃ + … + D_N) / N
Testing Hypotheses about Two Means: Normal Curve Model 149
To complete the proof, change that last expression on the right of the =
sign until you have μX - μY. You may or may not need all four of the additional
lines there, or you may need more than four; there's more than one way to do the
proof.
If you get stuck, consult the hint in the middle of this page.
(Hint, printed upside down in the original: Try working backward from your
goal, which is μX - μY. What can you turn this expression into? If a good
try at this tactic doesn't work, there's one more hint available on the bottom
of the next page.)
150 Chapter 17
SYMBOLISM DRILL

Symbol    Pronunciation                       Meaning
32  s²      "little es squared"               Estimate of ___ ; Σ___/( ___ )
16  r       "ar"                              ___
17  ρ       "rho"                             ___
21  μX̄      "mew sub eks bar"                 ___ of ___ of ___
22  σX̄      "sigma sub eks bar"               ___ of the ___
24  sX̄      "little es sub eks bar"           Estimate of ___ ; s/√n
25  H0      "aitch null"                      ___
26  HA      "aitch sub ay"                    ___
27  μhyp    "mew hype"                        Value of ___ stated in ___
33  μtrue   "mew true"                        ___ value of ___
29  zcrit   "zee crit"                        ___
30  α       "alpha"                           Risk of Type ___ error; level of ___
34  μX̄-Ȳ    "mew sub eks bar minus wi bar"    ___ of the ___ of ___ between ___
36  σX̄-Ȳ    "sigma sub eks bar minus wi bar"  ___ of the ___ between two ___
38  D       "dee"                             X - Y; ___ score
A businessman came to see me recently for advice on the analysis of some data
he'd collected. He ran a marketing-research firm, and he'd conducted two studies
testing consumer reaction to several varieties of frozen food. In both studies,
his subjects were shoppers who were approached in public places such as malls and
parking lots. The subjects were asked to taste one or more varieties of a food
(which had been cooked, of course) and to report a judgment of "bad," "poor,"
"fair," "good," or "excellent." In accord with the procedure described in Ques-
tion 18 on p. 17 of this workbook, the fellow had translated these judgments into
numbers, 1 through 5. In the first study, 200 subjects each tasted and judged a
single variety, yielding 200 scores in all.
1. You know how to analyze data of this kind now. (A fine accomplishment, no?)
Say how you would do it.
2. What inferential procedure would you apply? Say whether you would test
a hypothesis about a single population mean or about two population means, and
if the latter, whether the sample means are independent or dependent. State your
null hypothesis and your alternative hypothesis, choosing between a one-tailed
and a two-tailed test. List the calculations you would have to do.
In the second study, there were only 100 subjects, but each subject had tasted
two varieties of a food and judged both of them, so each subject contributed two
scores to the data, and there were again a total of 200 scores.
3. You also know how to analyze data of this kind. Again outline how you would
do it.
4. What inferential procedure would you apply? Spell out the details as for
Question 2 above.
Now we come to the reason why this example is egregious. In his first study,
the businessman had collected 200 judgments, 100 recorded on one page of a note-
book for one variety of a food, and the other 100 recorded on a second page for
a second variety. The fellow had cast each sample of 100 scores into a frequency
distribution, producing two tables looking like this (the frequencies are hypo-
thetical):
Score    f          Score    f
  5     23            5     17
  4     55            4     40
  3     12            3     23
  2      6            2     15
  1      4            1      5
     Σf = 100            Σf = 100
The mean and the other descriptive statistics required for each sample were easy
to calculate from these tables. (The text covers these techniques on pp. 66 and
90.) All this is fine and dandy for the first study.
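If you care to check that arithmetic by machine, here is a short Python sketch of those techniques applied to the first table above (remember, the frequencies are hypothetical):

```python
# Frequency distribution for the first variety (hypothetical frequencies).
freq = {5: 23, 4: 55, 3: 12, 2: 6, 1: 4}

n = sum(freq.values())                       # Sigma f = 100 scores in all
mean = sum(score * f for score, f in freq.items()) / n

# Descriptive variance S^2 = (Sigma f x^2) / n, where x is the deviation
# of each score from the mean; the standard deviation is its square root.
variance = sum(f * (score - mean) ** 2 for score, f in freq.items()) / n
sd = variance ** 0.5

print(n, mean, round(sd, 2))
```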
But this is exactly how the businessman had preserved his data for the second
study too. Two tables of this sort were all that he had to go on.
The data from the second study, then, could not be properly analyzed: the
test for dependent means requires the pair of scores (and hence the difference
score D) for each subject, and that pairing was lost when the scores were cast
into two separate frequency distributions. The
businessman could hardly believe it; the difference between the procedures he'd
followed in the two studies seemed so slight.
Statistical Moral: Plan the statistical techniques you'll use on your data
BEFORE you collect the data.
The businessman confessed to me that he had already written his report on the
foods for the company that was considering marketing them. In the report he had
simply asserted that the variety with the highest mean rating in each study was
significantly higher than the others tested in that study, but he didn't really
know this to be so, not even for the first study.
CHAPTER 18
ESTIMATION OF μX AND μX - μY
18.1 Introduction
1 ___    2 ___
3 ___    4 ___
5 ___    6 ___
7 ___    8 ___
9 ___    10 ___
11 ___   12 ___
13 ___   14 ___
15 ___   16 ___
17 ___   18 ___
19 ___   20 ___
SUMMARY
___ estimates alone are made reluctantly, because they may be considerably in error [18.2].
Other things being equal, if wide limits are set, the likelihood that the limits
will include the population value is low / high , and if narrow limits are set,
there is greater / lesser risk of being wrong [18.2]. Because the option exists
falls within the limits [18.2]. The limits themselves are usually referred to
a ___coefficient [18.2].
Interval Estimates of μ
the mean of a sample drawn (ideally at random) from that population. A certain
quantity is added to the sample mean to set the upper limit of the interval,
and the same quantity is subtracted from the mean to set the lower limit. The
quantity that is added and subtracted is the product of two values, a certain
z score and, if it is known, the standard error of the mean for samples of what-
ever size is on hand:

    X̄ ± zp σX̄

When σX̄ is not known, ___ may be substi-
tuted as an estimate of σX̄ [18.3]. When n ≥ ___ , little error will be intro-
duced by substituting ___ for ___ [18.3].
Once the specific limits are established for a given set of data, the inter¬
val thus obtained either does or does not cover _ [18.3]. The probability
What does it mean to say that we are, say, "95% confident"? We do not know
whether the particular interval covers _, but when intervals are constructed
according to the rule, ___ of every 100 of them (on the average) will include
wide confidence interval, and a large sample in a narrower / wider one [18.3].
Interval Estimates of μX - μY

    (X̄ - Ȳ) ± zp σX̄-Ȳ
case of dependent / independent samples), which are the values needed to obtain
Once again, for a given confidence coefficient, a small / large sample re¬
one [18.4].
Is a given interval wide or narrow? If we are not familiar with the variable
under study, we cannot say. In such a case, one way to add meaning is to inter¬
of the variable rather than in terms of raw-score points [18.5, Paragraph 2].
that it compensates for the fact that the importance of a given interscore dis¬
variable [18.5].
When confidence limits are expressed this way, we need to keep in mind that
the width of the limits must still be considered in the light of the value of the
Interval estimation and hypothesis testing are two sides of the same coin.
estimate contains all values of μhyp that would be accepted / rejected had they
been tested using α = 1 - C [18.7]. But estimation has some important advantages
in many cases:
ter (s). A confidence interval is thus a(n) indirect / direct answer to the
variable depends on two factors: the difference between what was hypothesized
terized by a particular parameter value, and if used in this way, the interval
makes plain all of the values that might characterize the parameter, including,
able to believe that it could be exactly true in any practical encounter. Inter¬
SYMBOLISM DRILL

Symbol    Pronunciation    Meaning
1   n       ___   Number of scores in a ___
2   N       ___   Number of scores in a ___
3   X       ___   ___
4   Σ       ___   ___
5   X̄       ___   Σ___/___ ; the ___ of a ___
6   μ       ___   Σ___/___ ; the ___ of a ___
9   x       ___   X - ___ or X - ___ ; ___ score
11  σ²      ___   Σ___/___ ; ___ of a ___
12  σ       ___   √(Σ___/___) ; ___ of a ___
13  S²      ___   Σ___/___ ; ___ of a ___
14  S       ___   √(Σ___/___) ; ___ of a ___
32  s²      ___   Estimate of ___ ; Σ___/( ___ )
23  s       ___   Estimate of ___ ; √(Σ___/( ___ ))
22  σX̄      ___   ___ of the ___ ; ___/√___
24  sX̄      ___   Estimate of ___ ; ___/√___ ; ___ of the ___
36  σX̄-Ȳ    ___   ___ between ___
37  sX̄-Ȳ    ___   Estimate of ___
25  H0      ___   ___
26  HA      ___   ___
21  μX̄      ___   ___ of ___ of ___
27  μhyp    ___   Value of ___ stated in ___
34  μX̄-Ȳ    ___   ___ between ___
38  D       ___   ___ - ___ ; ___ score
39  C       "see"   ___ coefficient
17  ρ       ___   ___
16  r       ___   ___
CHAPTER 19
INFERENCE ABOUT MEANS AND THE t DISTRIBUTION
1 ___    2 ___
3 ___    4 ___
5 ___    6 ___
7 ___    8 ___
9 ___    10 ___
11 ___   12 ___
13 ___   14 ___
15 ___   16 ___
17 ___   18 ___
SUMMARY
In previous chapters (15, 17, and 18), the text presented techniques for
making inferences about population means. Each technique was described first
in an ideal form in which the appropriate standard error could be calculated
from known population parameters. A modification necessary for practical use
was then introduced, because in practical use the standard error must be esti¬
mated from the sample or samples on hand. Estimation introduces a degree of
error that makes the normal curve the wrong model for the distribution of the
"z" statistic, but the error is tolerable when sample size is large. The pre¬
sent chapter describes a modification of the modification that is necessary
when sample size is small.
procedures, which might lead one to think that the basic issue is one of sample
size. This is not so. The fundamental issue is whether the formulas for the
_the procedures of chapters 15, 17, and 18 / this chapter are exact¬
lation parameters, the procedures of chapters 15, 17, and 18 / this chapter are
t = "z" ≠ z
To locate our obtained sample mean within the sampling distribution that
would occur if the null hypothesis were true, we would like to calculate a z
score according to the formula:
    z = (X̄ - μ) / σX̄
not change the shape of the distribution, and so / nor does dividing by a con¬
if we were to draw repeated random samples from the population of interest, not
only would values of X̄ vary, but so would the estimates of σX̄. The resulting
Because of the presence of the variable quantity in the denominator, this statis-
tic does not follow the normal distribution, though it is close to normal when
sample size is large. The distribution it does follow is called Student's dis¬
tribution, and the statistic itself is called t. (The quoted symbol "z" was
invented by the author of this workbook for a z-like statistic whose
denominator is a variable.)
that are completely ________ to vary [19.4]. One might at first suppose that
this would be the same as the number of scores in the sample (or samples), but
z [19.3]. When samples are large / small , the values of sX̄ will be close to
that of σX̄, and "z" will be much like z [19.3, Paragraph 1]. Its distribution
is, consequently, very nearly normal. When sample size is large / small , the
some ways, and different in others [19.3]. They are alike in that both distribu-
tions have a mean of ___ , are symmetrical / asymmetrical , and are unimodal
more / less leptokurtic than the normal distribution (a leptokurtic curve has a
lesser / greater concentration of area in the center and in the tails than does a
Putting t to Work
When sample size is so small that the procedures presented in the previous
chapters do not work accurately, the same procedures can still be followed with
these changes: (a) "z" should be called t, because t is the conventional name.
(b) In hypothesis testing, an obtained value of t should be evaluated not with
reference to the normal distribution, but with reference to the distribution of
t for whatever degrees of freedom are involved. For inferences regarding a single
population mean, df = n - 1. For inferences regarding two population means, df =
(nX - 1) + (nY - 1) in the case of independent means and df = 1 less than the
number of pairs of scores in the case of dependent means. (c) In interval estima¬
tion, to determine the value of tp (the quantity analogous to zp), not the normal
distribution but the t distribution for the appropriate degrees of freedom should
be used. The rules about degrees of freedom just cited also apply here. (d) For
both hypothesis testing and estimation in the case of two independent means, the
standard error of the difference between two means should be calculated as noted
below.
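For the single-mean case, changes (a) and (b) amount to very little computation. Here is a Python sketch with nine made-up scores and a hypothetical μhyp of 100; only the name of the statistic and the reference distribution change:

```python
import statistics

sample = [102, 98, 105, 110, 97, 103, 108, 99, 104]   # hypothetical scores
mu_hyp = 100                                          # value of mu stated in H0

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # note: n - 1 in the denominator

t = (x_bar - mu_hyp) / (s / n ** 0.5)   # same form as "z", but called t
df = n - 1                              # degrees of freedom for a single mean

print(round(t, 3), df)
```

The obtained t would then be referred to the t distribution for df = 8, not to the normal curve.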
To test the difference between two independent means, the procedure introduced
When sample size is relatively small / large , "z" is very nearly normally
the information from both samples and make a single ___ estimate of the
    sp² = ___    [Formula 19.2]
This quantity may be substituted for ___ and for ___ in the formula for sX̄-Ȳ,
giving √(sp²(1/nX + 1/nY)). The same formula should be used in setting a confidence inter-
val for the difference between two population means.
val for the difference between two population means.
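A Python sketch of the pooling and of the resulting standard error, using two small invented samples (the pooled estimate here is the sum of the two samples' sums of squares over the summed degrees of freedom, the usual form of the pooled estimate):

```python
# Two small independent samples (made-up scores).
X = [12, 15, 11, 14, 13]
Y = [10, 12, 9, 11]

def ss(scores):
    """Sum of squared deviations about the sample's own mean."""
    m = sum(scores) / len(scores)
    return sum((v - m) ** 2 for v in scores)

n_x, n_y = len(X), len(Y)

# Pooled estimate of the common population variance:
s2_p = (ss(X) + ss(Y)) / ((n_x - 1) + (n_y - 1))

# Estimated standard error of the difference between two independent means:
se_diff = (s2_p * (1 / n_x + 1 / n_y)) ** 0.5

df = (n_x - 1) + (n_y - 1)
t = (sum(X) / n_x - sum(Y) / n_y) / se_diff

print(round(s2_p, 3), round(se_diff, 3), round(t, 3), df)
```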
assumption makes less disturbance when samples are large / small than when they
are large/ small [19.14]. As a rule of thumb, it might be hazarded that moder¬
ate departure from homogeneity of variance will have little effect when each
SYMBOLISM DRILL

Symbol    Pronunciation    Meaning
6   ___   ___   ΣX/N; the ___ of a ___
5   ___   ___   ΣX/n; the ___ of a ___
9   ___   ___   X - X̄ or X - μ; ___ score
11  ___   ___   Σx²/N; ___ of a ___
12  ___   ___   √(Σx²/N); ___ of a ___
13  ___   ___   Σx²/n; ___ of a ___
14  ___   ___   √(Σx²/n); ___ of a ___
Inference About Means and the t Distribution 167
32  ___   ___   Estimate of σ²; Σx²/(n-1)
23  ___   ___   Estimate of σ; √(Σx²/(n-1))
22  ___   ___   Standard error of the mean; σ/√n
37  ___   ___   Estimate of σX̄-Ȳ
25  ___   ___   Null hypothesis
26  ___   ___   Alternative hypothesis
33  ___   ___   True value of μ
29  ___   ___   Critical value of z
38  ___   ___   X - Y; ___ score
39  ___   ___   Confidence coefficient
41  ___   "dee ef"   Degrees of freedom
t is a quantity of the form

    X̄, D̄, or (X̄ - Ȳ), minus the mean of quantities of the given kind,
    divided by
    an estimate of the standard error of this sampling distribution.

The sampling distribution of t has:
    a mean of zero;
    a shape that is unimodal, symmetrical, and leptokurtic in comparison to
    the normal distribution, but closer to normal for a larger number of
    degrees of freedom;
    a standard deviation greater than one, but smaller and closer to one for
    a larger number of degrees of freedom.

For quantities of the kind (X̄ - Ȳ), the estimate of the standard error must be
made by pooling information from the two samples (assuming that σX = σY) when
the samples are independent.

Degrees of freedom is the number of scores that are completely free to vary.
CHAPTER 20
INFERENCE ABOUT PEARSON CORRELATION COEFFICIENTS
20.1 Introduction
20.5 Estimating ρ
SUMMARY
from the population, calculate r, return the sample to the population, and repeat
this operation indefinitely, the multitude of sample r's will form the
we might expect, the values of r will vary more / less from sample to sample
distribution of r is symmetrical and nearly normal [20.2]. But, when p has a value
to provide a practical frame for inference. For some problems, the distribu¬
r characterizing a given sample and the number of cases in the sample, n, has a
    t = r √(n - 2) / √(1 - r²)    [Formula 20.1]
of Appendix F [20.3]. The method can be used for a variety of levels of signifi-
cance. But what if a given r permits one to reject the null hypothesis that ρ = 0? In
searchers frequently draw their inferences from a value of r by testing the null
From r to z'

The transformation to z' is accomplished via a formula invented by R. A.
Fisher. One can then
apply inferential techniques to the z' that use the convenient normal-curve model.
But the outcomes of the inferential techniques will still apply to the correlation
coefficients of interest.
In using the z' transformation, reasonable results will obtain unless sample
size (n) is very large / small or ρ is very low / high [20.4, last paragraph].
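Here is a Python sketch of the whole route for an interval estimate of ρ, using the standard result that the standard error of z' is 1/√(n - 3); the r of .50 from n = 30 pairs is invented:

```python
import math

r, n = 0.50, 30              # hypothetical sample coefficient and sample size

z_prime = math.atanh(r)      # Fisher's r-to-z' transformation
se = 1 / math.sqrt(n - 3)    # standard error of z'

z_p = 1.96                   # z for a 95% confidence coefficient

lower, upper = z_prime - z_p * se, z_prime + z_p * se

# Translate the limits back from the z' scale to the r scale:
rho_lower, rho_upper = math.tanh(lower), math.tanh(upper)

print(round(rho_lower, 2), round(rho_upper, 2))
```

Notice how lopsided the limits come out around r = .50; that asymmetry is exactly why the normal-curve model is applied to z' rather than to r itself.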
Estimating ρ
Rather than testing the hypothesis that p is some specific value (usually
zero), it may be desirable to ask within what limits the population coefficient
    z' ± zp σz'    [Formula 20.3]

where z' is the value of z' corresponding to the sample ___ ; zp is the magni-
tude of z for which the probability is ___ of obtaining a value so deviant or
Application of the above rule will result in a lower limit and an upper limit,
test of the hypothesis that pi = p2 is available only if the samples (and popula¬
tions) are independent. That is, there must be no logical way to pair either set
of scores in one sample with either set of scores in the other sample. The hypoth¬
esis is tested with the same procedure used for testing a hypothesis about two
    z = (z'1 - z'2) / σz'1-z'2    [Formula 20.5]

The denominator here is the standard error of the difference between two values of
z':

    σz'1-z'2 = √(1/(n1 - 3) + 1/(n2 - 3))    [Formula 20.6]
However, all of the procedures for inference about coefficients described in this
chapter are based on the assumption that the population of pairs of scores forms
tion, then these procedures for inference must be considered to yield only approx¬
imate results.
SYMBOLISM DRILL

Symbol    Pronunciation    Meaning
1   n        ___   Number of scores in a ___
2   N        ___   Number of scores in a ___
5   X̄        ___   Σ___/___ ; the ___ of a ___
6   μ        ___   Σ___/___ ; the ___ of a ___
9   x        ___   ___ or ___ ; ___ score
11  σ²       ___   Σ___/___ ; ___ of a ___
12  σ        ___   √(Σ___/___) ; ___ of a ___
13  S²       ___   Σ___/___ ; ___ of a ___
14  S        ___   √(Σ___/___) ; ___ of a ___
23  s        ___   Estimate of ___ ; √(Σ___/( ___ ))
32  s²       ___   Estimate of ___ ; Σ___/( ___ )
22  σX̄       ___   ___ of the ___ ; ___/√___
24  sX̄       ___   Estimate of ___ ; ___/√___
36  σX̄-Ȳ     ___   ___
37  sX̄-Ȳ     ___   Estimate of ___
21  μX̄       ___   ___ of ___ of ___
27  μhyp     ___   Value of ___ stated in ___
34  μX̄-Ȳ     ___   ___ of ___ of ___
35  (μX-μY)hyp   ___   Value of ___ stated in ___
25  H0       ___   ___
26  HA       ___   ___
28  z        ___   ___
29  zcrit    ___   ___ of ___
30  α        ___   ___ of ___
31  β        ___   ___ of ___
38  D        ___   ___ ; ___ score
16  r        ___   ___
17  ρ        ___   ___
42  σr       ___   ___ of ___
43  z'       ___   Fisher's transformation of ___
44  σz'      ___   ___ of the ___
39  C        ___   ___ coefficient
CHAPTER 21
SOME ASPECTS OF EXPERIMENTAL DESIGN

21.1 Introduction
1 ___    2 ___
3 ___    4 ___
5 ___    6 ___
7 ___    8 ___
9 ___    10 ___
11 ___   12 ___
SUMMARY
A researcher faces not only statistical problems but also substantive ones,
and the two types of problem are often interrelated. The present chapter treats
some such interrelationships.
Thus a Type II error is committed when a false null hypothesis is accepted. The
opposite occurs when a false null hypothesis is rejected. Since the probability
when a true difference really exists. The probability of doing so, (1 - β), is
There are a number of factors that affect β, and these are listed below.
Since β and power are complementary, it must be remembered that any condition
that decreases β increases / decreases the power of the test, and vice versa [21.3].
Some Aspects of Experimental Design 179
1. The greater the discrepancy between μtrue and μhyp, the greater / less
hypotheses about the difference between the mean of a first population and the
mean of a second, the greater the discrepancy between the true difference and
the hypothesized difference (which is usually zero), the less the probability of
its action in reducing the standard error of the mean. Since the standard error
of the mean is σ/√n, another way to make it smaller is to increase / reduce the
size of σ [21.6]. σ is the standard deviation of the set of measures, and it re-
flects not only variation attributable to the factors of interest, but also vari-
and thus augment / reduce β [21.6]. In comparing means of two groups, the in-
dependent / dependent sample design makes it possible to reduce the standard error
[21.6].
4. β is also related to the choice of α. In general, reducing the risk of
large / small sample, but accuracy suggests a large / small one [21.11]. How
ways:
fering factor constant for every ___ in the study [21.13]. But there
is an important price to be paid for seeking control in this manner: the tighter
the control developed by holding many conditions constant, the more limited the
the two groups on some characteristic, rather than holding the characteristic
assignment of treatment achieves control over differences that subjects may bring
to the study but still limits / without limiting generalization in the way that
would be done by holding these variables constant [21.13]. Although random as-
ence the dependent variable). But it cannot control certain other types of ex¬
traneous influence, namely those factors that vary along with the treatment from
one condition to the other. Thus conducting the study as an experiment with ran¬
meaning of the outcome of the statistical test, and it means / but it does not
mean that the answer to the substantive question posed in the beginning is auto¬
tion by the experimenter. Some are unmanipulable for ethical reasons. Other
large a sample is really needed? To answer this question, we must first decide
what magnitude of discrepancy between the ___ value and ___
The decision as to just how big a discrepancy between the parameter's hypothesized
(1) this discrepancy and (2) the risk (β) we are willing to take of overlooking
a discrepancy of that magnitude, then we can estimate the size of the sample
standard deviation of the variable measured, sample size must be large / small
[21.12].
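As a rough sketch of the reasoning (a normal-curve approximation, not the text's procedure or tables), suppose we must detect a discrepancy of 5 points on a variable with σ = 15, using a two-tailed α of .05 and accepting β = .20. All of these numbers are hypothetical:

```python
import math

sigma = 15.0     # standard deviation of the variable measured
delta = 5.0      # smallest discrepancy between mu_true and mu_hyp we must catch
z_alpha = 1.96   # two-tailed alpha = .05
z_beta = 0.84    # beta = .20, i.e. power = .80

# Normal-approximation sample size for a test of a single mean:
n = math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)

print(n)
```

Halving the discrepancy to be detected quadruples the required n, which is why the decision described above deserves real thought.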
3. If it is acceptable to increase the risk of a Type II error, larger/
required in each of the two samples to achieve the same level of protection
Control in Experimentation
In the classic model of the experiment, all variables are controlled except
the one subject to inquiry. The variable to be studied is manipulated, and the
absent can still / cannot be called experiments [21.13]. The text refers to
them as in situ studies. The important difference between experiment and in situ study
control makes it more / less difficult to interpret the outcome of such studies
[21.13].
The loss of control arises because when individuals are selected according
we may well find that the different "treatment" groups are significantly differ¬
ent with regard to the dependent variable, but the origins of these differences
variety in which matched pairs of subjects are formed with treatment conditions
randomly assigned to the two members of each pair. But trouble arises in the
that exposure to the first treatment condition will change the subject in some
way that affects his or her performance under the treatment condition assigned
of one treatment upon the other is the same as that of the other upon the one,
However, the disturbing order effect will introduce an additional source of vari¬
ation in each set of scores, according to the magnitude of its influence. This
Furthermore, if the influence of one treatment upon the other is not the
same as that of the other upon the one, _ will be introduced as well as
to interpret.
2. If the design utilizes repeated observations on the same subject but as¬
signment of the treatment condition is not random with regard to order, we are
in less/even graver difficulty [21.14]. Any order effect will bias the com¬
parison. Studies of this type may also be subject to another source of bias:
the regression effect. If subjects are selected because of their extreme scores
tions are likely to arise when studying the effect of a nonmanipulable variable
and are susceptible to all of the usual difficulties of such studies plus several
additional hazards:
(a) Matching may increase / reduce , and therefore obscure, the influence
of other important variables associated with the variable on which matching took
place [21.14].
(b) When the two intact populations differ widely on the variable on which
matching is done, it may be possible to form matched pairs only by using subjects
who are unusual relative to others of their own kind. Under these conditions,
(c) When subjects are selected for pairing because of their extreme scores
SYMBOLISM DRILL

Symbol    Pronunciation    Meaning
1   ___   ___   Number of scores in a sample
9   ___   ___   ___ or ___ ; deviation score
11  ___   ___   Σ___ ; variance of a population
13  ___   ___   Σ___ ; variance of a sample
32  ___   ___   Estimate of σ² ; Σ___
23  ___   ___   Estimate of σ ; √Σ___
24  ___   ___   Estimate of σX̄ ; ___/___
37  ___   ___   Estimate of σX̄-Ȳ
25  ___   ___   Null hypothesis
26  ___   ___   Alternative hypothesis
33  ___   ___   True value of μ
29  ___   ___   Critical value of z
38  ___   ___   X - Y; difference score
42  ___   ___   Standard error of r
43  ___   ___   Fisher's transformation of r
39  ___   ___   Confidence coefficient
41  ___   ___   Degrees of freedom
CHAPTER 22
ELEMENTARY ANALYSIS OF VARIANCE

22.1 Introduction
1 ___
2 ___
3 ___
4 ___
5 ___
6 ___
7 ___
8 ___
9 ___
10 ___
11 ___
12 ___
13 ___
14 ___
15 ___
16 ___
17 ___
SUMMARY
Terminology
ysis of variance is appropriate for a study in which there is just one treatment.
The Hypotheses
effect on the variable under observation, then we may expect these subgroup
H0:
[22.2]. Such a distinction still / no longer makes sense when the number of
applied to the special case of two subgroups is identical with that of the t
test. Like the t test, the analysis of variance is suited to samples only of
each score from the sample / population mean, and dividing by the number of
be one less than the number of scores. This general relationship can be summar¬
where the letters ___ stand for "sum of squares (of deviation scores)" [22.5].
Homogeneity of Variance and the Within-Groups Estimate of σ²
The analysis makes the assumption that the several subgroup populations all
from any one of the subgroup samples by taking the sum of ___ of the
deviations of scores in that group from the subgroup / grand mean and dividing
assumption that the subgroup population variances are the same for all subgroups
made by combining information from these several subgroup samples [22.5]. Such
an estimate may be made by pooling the sums of squares of deviation scores from
the several subgroups and dividing by the sum of the degrees of freedom character
*If the difficult material in this chapter has not totally destroyed your
sense of humor, you should have recognized this question as analogous to Groucho
Marx's favorite on You Bet Your Life: "Who is buried in Grant's tomb?" But the
proper answer to Groucho's question is not the obvious one, for the tomb actually
holds both Grant and Grant’s wife. Similarly, the quantity that is analyzed
(that is, decomposed) in the analysis of variance is not exactly a variance; it
is the sum of the squared deviation scores that contributes to an estimate of
the population variance derived from the total set of scores on hand, as the
text explains in Section 22.6.
izing each of the subgroups. This estimate is called the within-groups / among-
groups estimate, and the formula for it is:

    sW² = SSW / dfW    [Formula 22.1]

The numerator of this expression is called the within-groups / among-groups sum
First, these means are treated as though they were raw scores, and the vari¬
ance of the population from which they came (the population of subgroup sample
means) is estimated in the usual fashion: find the mean of all the "scores"
(which here is the mean of the sample means, and this will be the same as the
grand mean of all raw scores); for each "score," find its deviation from this
mean (here, find the deviation between each sample mean and the grand mean);
square each deviation; sum the squares of the deviations; and divide the sum by
one less than the number of "scores" (which is k - 1 here). The symbols describ¬
ing these operations are:
     k
    Σ(X̄ - X̿)²
    ----------
      k - 1
The k over the summation sign indicates that there are k quantities to be summed,
Now the variance estimated in this way is the variance of the population of
subgroup sample means, not the variance that we assume to characterize each of
the subgroup populations of raw scores. But from the former we can estimate the
latter, using the following reasoning:
An estimate of the standard error of a mean, sX̄, is computed from the formula

    sX̄² = s² / n    [Section 22.5]

But sX̄² is the quantity that we just computed; that is, it is the estimate
of the variance of the population of subgroup sample means that we derived from
the means themselves. We computed it directly here, whereas earlier in this
course we always computed it by following the formula s²/n. So we can take our
value for sX̄² and multiply it by n to get an estimate of the variance that we
assume to be common to each population of raw scores.
symbolized by _ [22.5, paragraph 4]. When the subgroup samples are of equal
sizes, the formula for this estimate, in deviation score form, is:
and n is the number of scores in each subgroup sample [22.5]. The numerator of
of _ (dfA) [22.5].
When the subgroup samples are of unequal sizes, a slightly different formula
is required:
where ___ is the number of scores in the ith subgroup sample and ___ is the
about ___ as predicted by the standard error of the mean, and sA² will be an
other hand, if there is a treatment effect, the sum of squares of the deviations
of X̄ about X̿ will tend to be larger / smaller , and sA² will tend to be larger /
Comparing sA² and sW²

To compare the two estimates of σ², sA² and sW², we form them into a quantity
called an F ratio: F = sA²/sW². If the null hypothesis is true, which means that
there is no treatment effect, the top and the bottom of this ratio will have
about the same value, so the value of F will be about one. But if the null is
false and there is a treatment effect, sA² will tend to be larger than sW², as
noted in the paragraph just above, and now the value of F will tend to be greater
than one.
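The whole comparison can be sketched in a few lines of Python; the three subgroups below are invented, with a deliberately high middle group:

```python
# Hypothetical scores for k = 3 subgroup samples of n = 4 scores each.
groups = [[4, 6, 5, 5], [7, 8, 6, 7], [5, 4, 6, 5]]

k = len(groups)
n = len(groups[0])                     # equal n's in this sketch

means = [sum(g) / n for g in groups]
grand_mean = sum(means) / k            # grand mean (equal n's)

# Within-groups estimate of sigma^2: pool the SS about each subgroup mean.
ss_w = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
df_w = k * (n - 1)
s2_w = ss_w / df_w

# Among-groups estimate: variance of the subgroup means, multiplied by n.
s2_a = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
df_a = k - 1

F = s2_a / s2_w
print(round(F, 2), df_a, df_w)
```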
But even if the null hypothesis were true, it would be possible for F to be
greater than one, even considerably greater, just because of sampling variation.
The best we can do is to determine whether a given value of F is likely or
unlikely to occur should the null hypothesis be true.
We thus need to know the sampling distribution of F when the null is true.
This depends on the number of degrees of freedom associated with sA² and on the
number of degrees of freedom associated with sW². Table H in the back of the
text shows selected values from the various members of the family of F distribu-
tions. Hypothetically, the values of F that make up a sampling distribution
could be generated by repeatedly replicating a given experiment: from the same
populations, draw samples, each of whatever size was originally used, and compute
sA², sW², and their ratio, F, for each replication. The null hypothesis must
remain true throughout the replications.
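You can imitate those replications in Python. Below, all three samples always come from one normal population (μ = 50 and σ = 10, both arbitrary), so the null hypothesis is true by construction:

```python
import random

random.seed(1)   # make the simulation reproducible

def f_ratio(groups):
    """F = among-groups estimate of sigma^2 over within-groups estimate."""
    k, n = len(groups), len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    s2_w = sum(sum((x - m) ** 2 for x in g)
               for g, m in zip(groups, means)) / (k * (n - 1))
    s2_a = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    return s2_a / s2_w

# 2000 replications: three samples of five scores, all from the same population.
fs = sorted(f_ratio([[random.gauss(50, 10) for _ in range(5)]
                     for _ in range(3)])
            for _ in range(2000))

median_f = fs[len(fs) // 2]
prop_beyond = sum(f > 3.89 for f in fs) / len(fs)   # 3.89 is about F.95 for df 2, 12

print(round(median_f, 2), round(prop_beyond, 3))
```

The typical F hovers near one, and only roughly five percent of the replications exceed 3.89, as the tabled F distribution for 2 and 12 degrees of freedom predicts.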
lations, we must compare the calculated value of F with the values of F that would
occur through random sampling if the hypothesis were true / false [22.9]. Now
must be satisfied:
1. The subgroup populations are ___ distributed [22.10].
As with the t test for independent means, moderate departure from conditions
specified in the first and fourth requirements will not unduly disturb the outcome
of the test. Resistance to such disturbance is enhanced when sample size rises/
falls [22.10].
If you're now lost in the thicket of details, it may help to walk out of the
glen onto a hill to look at the big picture. You can then go down into the
details again when you see where they fit in.
Each sample is drawn at random from its parent population, which is the hypothet-
ical set of scores for all individuals who could have been subjected to the given
condition. The samples are independent of one another, in that each subject is
observed in only one condition, and there is no logical way to pair the scores
in any condition with the scores in any other condition. The n's may thus be
unequal.
1. H0: μD = μE = μF, and so on. As usual, H0 is a statement about the popu-
lation parameters.
HA: H0 is false. The null can be false in more than one way, depending on
which population means differ, so the distinction
between one-tailed and two-tailed tests no longer makes sense.
3a. Each population is assumed to have the same variance. The variance
common to the populations is called σ², with no subscript. σ measures
5. Assume that H0 is true and ask whether it would then be likely or unlikely
to obtain samples such as those on hand. Hypothetically, the experiment is
replicated again and again; the null hypothesis must remain true throughout.
A value of F is computed for each replication; it will vary from replication to
replication but will be about 1.0 most of the time.
4a. Two estimates of σ², described below, are formed and compared as a ratio.
The bottom estimate is called the within-groups estimate, sW²; the top estimate
is called the among-groups estimate, sA².

    F = (among-groups estimate of σ²) / (within-groups estimate of σ²) = sA² / sW²
4b. The estimate on the bottom of the ratio takes each of the k samples in
turn and looks within it for information about the variability of the scores in
the population from which it was drawn. Since each sample's parent population
has the same variance, according to the assumption in Step 3a above, an especially
good estimate of this variance can be made by pooling the information from within
sample D, the information from within sample E, and so on. This pooling is done
as a fancy kind of averaging: information about the variability of population D ,
which is derived from the data within sample D, is averaged with information
about the variability of population E derived from within sample E, and so on. The
number thus produced is not influenced by the dispersion among the samples; in
particular, it is not influenced by the variation among the X̄'s, because it never
compared the X̄'s.
4c. The estimate on the top of the F ratio completely ignores the information
about σ² that is available within each of the samples; it does not consider the
values of the individual scores (the X's). Rather it looks only at the variability
among the samples, taking the sample means as measures of the locations of the
samples.
From the variability among the sample means, it figures an estimate of the
variance of the population of such means. This variance is not σ²; rather this
is something analogous to σX̄, the standard error of a sampling distribution
composed of sample means. In fact, the estimate of the variance of the population
of sample means can be properly symbolized sX̄².
Now, when the null hypothesis is true, s_X̄² reflects only the variation inherent in each of the populations, σ². Think this through. If the null is true, all populations have the same mean. We are assuming, moreover, that each has the same inherent variation among its scores, variation measured by the number σ², and that each is normally distributed. So if the null is true, we are, in effect, drawing each sample from the same population. Why, then, should the various samples turn out to have different means? Only because of the variation among the scores in the population. This fact, the fact that when the null is true s_X̄² reflects only the variation inherent in a population of scores, means that we can fuss with s_X̄² a bit and turn it into an estimate of the inherent variation, an estimate of σ².
But we must remember that the estimate will be predicated on the assumption that the null hypothesis is true. If the null is false, the several populations
Elementary Analysis of Variance 197
don't all have the same mean; at least one is different from the others. Each sample mean will tend to fall where its parent population mean falls, of course, so when the null is false the spread among the sample means will reflect not only the inherent variation in a population of scores, but also the spread among the population means. So the estimate of σ² derived from the sample means will probably not agree with the within-groups estimate; it will tend to be too large, and the F ratio will yield an F greater than 1.0.
Some details of 4b. The estimate made from the information within a single sample is the familiar quantity Σx²/(n − 1). (The variance of a set of scores is the mean squared deviation, remember.) The symbol SS, for sum of squares, is used for a quantity of the kind Σ(X − X̄)². The squares whose sum this expression is talking about are squares of deviations, of course, the squares of the deviations between each raw score X and its mean, X̄. The n − 1 on the bottom of the formula is described as the degrees of freedom in the estimate of the variance of the population.
Finally, the averaging: From sample D we estimate σ_D², its parent population's variance, by computing Σ(X_D − X̄_D)²/(n_D − 1); from sample E we estimate σ_E² by computing Σ(X_E − X̄_E)²/(n_E − 1); and so on for each sample. Each of the estimates uses only information from within a single sample, note again; only the raw scores within the sample, their mean, and their number go into the estimate. The estimates are then averaged in a fancy way: (a) The tops of the estimates, the quantities of the kind Σ(X − X̄)², are added together. (b) The bottoms of the estimates, the quantities of the kind (n − 1), are added together. (c) The sum of the tops is divided by the sum of the bottoms. These operations are summarized in Formula 22.1 on p. 395 of the text, which also reveals that the sum of the tops is called the within-groups sum of squares, SS_W, while the sum of the bottoms is called the within-groups degrees of freedom, df_W.
Though the text doesn't say so, SS_W = SS_D + SS_E + SS_F, and so on, while df_W = df_D + df_E + df_F, and so on.
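The "fancy averaging" in steps (a) through (c) is mechanical enough to sketch in a few lines of code. The three samples below are hypothetical stand-ins for D, E, and F; the function simply adds the tops, adds the bottoms, and divides.

```python
# A sketch of the pooled within-groups estimate, s_W^2 = SS_W / df_W.
# The three samples are hypothetical stand-ins for D, E, and F.
samples = {
    "D": [4, 6, 5, 7],
    "E": [8, 9, 7, 10],
    "F": [5, 5, 6, 8],
}

def within_groups_estimate(groups):
    ss_w = 0.0  # (a) the tops, Sum(X - mean)^2, added together
    df_w = 0    # (b) the bottoms, (n - 1), added together
    for scores in groups.values():
        mean = sum(scores) / len(scores)
        ss_w += sum((x - mean) ** 2 for x in scores)
        df_w += len(scores) - 1
    return ss_w / df_w  # (c) sum of the tops over sum of the bottoms

s_w2 = within_groups_estimate(samples)
```

Note that each sample's mean enters only through that sample's own deviations; the dispersion among the means never touches the result, just as Section 4b says.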
If you're disoriented again, go back to the big picture on p. 194 and reread
through Section 4c on p. 196, but skip Section 4b this time. Following are some
details of 4c.
Recall the formula σ_X̄ = σ/√n; squaring both sides turns this relation into one concerned with variances rather than standard deviations or errors. Squaring tells us that an estimate of the variance of a population of X̄s can be had by dividing an estimate of the variance of the population of raw scores by the sample size. That is, the estimate of the variance of the raw scores must be scaled down through division by n. We already have an estimate of the variance of a population of X̄s; this is s_X̄². So to derive an estimate of the variance of the raw scores, we run the scaling backwards: we multiply s_X̄² by n.
And when the ns of the several samples are equal, that's all there is to the formula for the among-groups estimate of σ²: n, the common sample size, is used as a multiplier for Σ(X̄ − X̿)²/(k − 1), that is, as a multiplier for our friend s_X̄². This gives Formula 22.2 on p. 395.
When the ns are not all equal, a slight modification is required, as incorporated in Formula 22.3. The example on p. 397 is a case of unequal ns.
In either case, the product of sample size and Σ(X̄ − X̿)² is called the among-groups sum of squares, SS_A, and the rest of the formula, k − 1, is the among-groups degrees of freedom, df_A.
Disoriented again? Back to the big picture on p. 194. It's always there if you get lost in details, but at least you should now see where the details fit.
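As a recap of the whole big picture, here is a sketch of the equal-n case in code: the among-groups estimate scales s_X̄² up by n, the within-groups estimate pools as in Section 4b, and their ratio is F. The data are hypothetical; only the logic follows the text.

```python
# One-way ANOVA F ratio, equal-n case (hypothetical data: k = 3
# groups, n = 4 scores per group).
samples = [[4, 6, 5, 7], [8, 9, 7, 10], [5, 5, 6, 8]]
n = len(samples[0])          # common sample size
k = len(samples)             # number of groups

means = [sum(g) / n for g in samples]
grand_mean = sum(means) / k  # with equal ns, the mean of the means

# Estimated variance of the population of sample means ...
s_xbar2 = sum((m - grand_mean) ** 2 for m in means) / (k - 1)
# ... scaled up by n: the among-groups estimate of sigma^2
s_a2 = n * s_xbar2

# Within-groups estimate: pooled SS over pooled df
ss_w = sum(sum((x - sum(g) / n) ** 2 for x in g) for g in samples)
s_w2 = ss_w / (k * (n - 1))

F = s_a2 / s_w2  # about 1.0 when the null is true; inflated otherwise
```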
SUMMARY Continued
Each cell of the table contains a sample of scores. The substantive question about the interaction between two treatments may be phrased this way: Whatever the difference among the several levels of one treatment, is it the same for each of the ___?
Variance Estimates

Each of the substantive questions gives rise to a statistical question, the plausibility of a null hypothesis. For the first substantive question, the statistical question is the plausibility of a null hypothesis asserting that μ_C1 = μ_C2 = μ_C3, and so on, where C1 names the treatment level represented by the first column in the table, C2 names the treat-
200 Chapter 22
ment level represented by the second column, and so on. For the second substan-
tive question, the statistical question is the plausibility of another null hy¬
pothesis, this one asserting that μ_R1 = μ_R2 = μ_R3, and so on, where R1 names the treatment level represented by the first row, R2 names the treatment level rep-
resented by the second row, and so on. For the third substantive question, the
statistical question is the plausibility of a third null hypothesis, and this one
asserts that there is no interaction between the two treatments in the pattern of
the population means for the individual cells of the table.
1. s_WC² (within-cells estimate), derived from the variation among the scores in the first cell, the variation among the scores in the second cell, and so forth [22.13].
2. s_C² (___ estimate), derived from the variation among the means of the several columns (column ___) [22.13]. If the null hypothesis about the population values of the columns is correct, variation among column means (X̄_C1, X̄_C2, and so on) will be affected only by inherent variation. Under these circumstances, s_C² will estimate the same quantity estimated by ___ [22.13]. If the null is false, s_C² will tend to be larger than otherwise. It is therefore analogous to ___ in one-way ANOVA [22.13].
3. s_R² (___ estimate), derived from the variation among the means of the several rows (row ___), and also any interaction effect, if present [22.13]. If the null hypothesis about the population values of the rows is correct, variation among row means (X̄_R1, X̄_R2, and so on) will be affected only by inherent variation. Under these circumstances, s_R² will estimate the same quantity estimated by s_WC². If the null is false, s_R² will tend to be larger than otherwise.
4. s_RxC² (___ estimate), derived from the discrepancy between the means of the several columns / rows / cells and the values predicted
each in turn in the numerator / denominator and s^c in the numerator / denomi¬
of the presence of the effect specially associated with the kind of estimate
the number of scores within each cell (the n's for the cells are assumed to be equal). Since there are C deviations involved in the computation of s_C², df_C = ___ [22.15]. Similarly, df_R = ___ and df_RxC = ___ [22.15].
important to study the several values of the column / row / cell means when the
COMPARISONS
The F test for a treatment (the test for the one treatment in a one-way
among the subgroup means than to ask the one overall question, though. Some¬
times the logic of the study will suggest the particular comparisons to be made,
and if so we will know in advance what comparisons would interest us. Compar¬
isons chosen this way are called planned / post hoc comparisons [22.19]. On
data. Such comparisons are known as planned / post hoc comparisons [22.19].
The same / A different strategy is desirable for examining post hoc comparisons as / than for evaluating planned comparisons [22.19]. The way the comparison is constructed is the same for both planned and post hoc comparisons; the difference lies in the way the comparison is evaluated.
Constructing a Comparison
3. the total of the positive coefficients equals / exceeds the total of the
    K = ___ + ___ + ___        [Formula 22.14]
[22.20]. If some levels are not included in the comparison, the coefficients of
pendent of each other. From among the k means of the levels of a treatment one
1. Construct the first comparison using two or more / all levels of the
treatment [22.23].
2. The second comparison must be constructed wholly from subgroups that fall
on one side of the first comparison. Again, use two or more / all available
subgroups [22.23].
3. Construct the third comparison by applying the procedure of step 2 to the
    s_K = ___        [Formula 22.15]

where s_error² is the variance estimate that would constitute the numerator / denominator of the overall F test (s_W² in one-way ANOVA; s_WC² in two-way ANOVA); a_A is the coefficient of the X̄ of the subgroup A, etc.; and n_A is the number of cases in subgroup sample A, etc. [22.21]. The number of degrees of freedom associated with this estimated standard error is the number associated with ___ [22.21].
Evaluating a Comparison
A comparison, planned or otherwise, may be evaluated by ___ or by ___ [22.22].
val are given by the rule K ± t(s_K), and this is parallel to the rule in Formula
For post hoc comparisons, the text offers a procedure that, strictly speaking,
treatment has shown significance. If such a test has shown significance, then
there exists at least one comparison for which the null hypothesis will be re¬
comparisons as are desired may be made, whether independent or not. The price
for such flexibility is that each comparison yields narrower / wider limits than
SYMBOLISM DRILL

Symbol            Meaning

1  D, E, F        The several subcategories of a ___
2  X_D, X_E, X_F  Scores in the several subcategories
3  X̄_D, X̄_E, X̄_F  Means of samples / populations in the subcategories
4  μ_D, μ_E, μ_F  Means of samples / populations in the subcategories
6  s_W²           ___-groups estimate of σ²
8  SS_W           ___-groups ___
Symbol            Meaning

   s_A²           ___-groups estimate of ___
   SS_A           ___-groups ___
   F              s_A² / ___
   s_C²           estimate of ___
   s_R²           estimate of ___
   s_RxC²         estimate of ___
   SS_C           ___ for ___
   SS_R           ___ for ___
   SS_RxC         ___ for ___
   SS_WC          within-___ ___
   SS_T           sum of ___
Symbol            Meaning

31  df_C          ___ for ___
32  df_R          ___ for ___
33  df_RxC        ___ for ___
34  df_WC         within-___ ___
35  df_T          degrees of ___
36  F             ___ or ___ or ___

COMPARISONS

39  s_K           Estimate of ___ error of a ___
40  s_error²      s_W² or s_WC²
Look back at the description of the marketing survey on p. 151 of this work¬
book, and note that in his first study, the researcher really tested five versions
of that frozen foodstuff he had been hired to evaluate.
1. What inferential technique should he have used on the full set of data
from the first study?
In the second study, he again tested five versions of a product, but this time
each subject tasted and rated all five versions.
2. Is the inferential technique appropriate for the first study also appropri¬
ate for the second? Why or why not?
INFERENCE ABOUT FREQUENCIES
23.1 Introduction
1. ___
2. ___
3. ___
4. ___
5. ___
6. ___
7. ___
8. ___
9. ___
10. ___
11. ___
12. ___
13. ___
Inference about Frequencies 209
SUMMARY

The techniques of the earlier chapters were designed, of course, with the exception of the bar graph, for observations on quantitative variables. Such observations are numerical scores, and it is scores that are described by the frequency distributions of Chapter 3, by the histogram, frequency polygon, and cumulative frequency curve, and by statistics such as the standard deviation of Chapters 5 and 6. The chi-square techniques of this chapter, in contrast, apply to frequency counts for the categories of qualitative variables.
The simplest case to which the chi-square statistic can be applied is that
in which frequency counts are available for the categories of a single variable.
[23.9]. In the 1 x C table, χ² may be used to test whether the relative frequencies in the population frequency distribution are in accord with the set of such values specified by the hypothesis of interest [23.9]. The hypothesized proportions must / need not be equal [23.9].
H_A, the alternative hypothesis, is simply that the null hypothesis is untrue in some (any) way. Note that the distinction between a directional test and a nondirectional one therefore takes a special form here.
Expected Frequencies
To conduct the test, we must generate expected frequencies, and these are the average frequencies that would occur on infinite repetitions of an experiment such as the one actually done when the null hypothesis is true.
Computing Chi-Square

    χ² = Σ (f_o − f_e)² / f_e        [Formula 23.1]

where f_e is the expected frequency, f_o is the obtained frequency, and summation is over the number of ___ characterizing a given problem [23.3]. Examination of the formula reveals several points of interest about χ²:
1. χ² cannot be positive / zero / negative, since all discrepancies are squared.
2. The larger the discrepancies between the f_e's and their corresponding f_o's, the larger χ² will be [23.3].
3. But it is not the size of the discrepancy alone that accounts for a contribution to the value of χ²; it is the size of the discrepancy relative to the ___. Division by f_e takes this factor into account.
To evaluate an obtained χ², we must also consider the number of ___ involved [23.5], which is the number of ___.
quencies that would occur on infinite repetitions of an experiment when the null hypothesis is true. Even when the hypothesis is true, the several obtained frequencies will vary from their expected values through random sampling, so χ² will be smaller when agreement between f_o's and f_e's is good / poor and larger when it is not [23.4].
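The arithmetic of the statistic itself is a one-liner; the following sketch of Formula 23.1 uses hypothetical frequencies (60 subjects sorted into three categories, with a null hypothesis of equal proportions).

```python
# Chi-square for obtained vs. expected frequencies (Formula 23.1).
f_o = [28, 18, 14]        # obtained frequencies, hypothetical
f_e = [20.0, 20.0, 20.0]  # expected under a null of equal thirds

chi_square = sum((o - e) ** 2 / e for o, e in zip(f_o, f_e))
# Each discrepancy is squared (so chi-square cannot be negative) and
# is taken relative to its own f_e before the terms are summed.
```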
When the hypothesized f_e's are not the true ones, the set of discrepancies between f_o and f_e will tend to be larger / smaller than otherwise, and, consequently, so will the calculated χ². To evaluate an obtained value, then, we must learn what calculated values of χ² would occur under random sampling when the hypothesis is true / false [23.4]. Then we will compare the calculated value with that distribution; if such a value would rarely occur when the hypothesis is true, the hypothesis will be rejected.
When the hypothesis to be tested is true, and when the conditions noted below obtain, the sampling distribution formed by the values of χ² calculated from repeated samples follows a tabled chi-square distribution; the number of degrees of freedom for the problem is always the same as the number of degrees of freedom associated with that particular distribution [23.6].
pling this tends to be true. There are two important ways in which this assumption can be violated; each will result in a degree of error [23.7]. The importance of this discrepancy is, fortunately, minimal unless both n and df are small / large [23.7]. A correction is available for the error involved in comparing calculated values of χ², which form a discrete distribution, with the continuous tabled distribution [23.10].
When the variable under study consists of only two categories, the data fall into a 1 x 2 table. In this case one may instead compute z and compare that value with the critical one-tailed value of normally distributed z; the equivalent χ² may be had by squaring [23.8, 23.10].
Fourth, note that this test may be conceived as a test about a single proportion. The null hypothesis under this conception states the value of the proportion in the population of interest; the population value is symbolized P and the sample value p.
frequency distributions [23.12]. Here there are two variables of interest, and
each subject is simultaneously classified into one category of one variable and
one category of the other variable. The resulting frequency counts are cast
[23.12]. In many ways, such a table is similar to the bivariate frequency dis¬
tributions encountered in the study of correlation (see Chapter 9). Indeed, the
major difference is that here the two variables are both qualitative / quantitative
From such a table we may inquire what cell frequencies would be expected if the two variables are independent of each other in the sample / population [23.12]. Then, chi-square may be used to compare the obtained cell frequencies with the expected ones. When the discrepancies are small / large, χ² will be small, suggesting that the two variables are independent in the population.
cies in the population for any row is the same for all _, or that in
1. Find the column proportions by dividing each column total by the grand total.
2. Multiply each row total by these column proportions; the result in each instance is the expected cell frequency (f_e) for cells in that row.
3. Check to see that the total of the expected frequencies in any row or in any column equals the corresponding obtained total.
The same result could be obtained by finding the row proportions and multiplying each column total by them.
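The three steps can be sketched in code; the 2 x 2 table of obtained frequencies below is hypothetical.

```python
# Expected cell frequencies for a contingency table.
table = [[30, 10],
         [20, 40]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Step 1: column proportions.  Step 2: row total times each proportion.
f_e = [[rt * (ct / grand) for ct in col_totals] for rt in row_totals]

# Step 3: the expected row totals equal the obtained row totals.
row_check = [sum(row) for row in f_e]
```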
Degrees of Freedom
column totals and the row totals are fixed / free and ask how many cell frequencies are free to vary [23.15]. In general, for an R x C contingency table, the number of degrees of freedom is ___. Random sampling when the null hypothesis is true produces values of χ² that are in accord with the tabled distribution of that statistic; when the null hypothesis is false in any way, the calculated value of χ² will tend to be smaller / larger than otherwise [23.15]. As before, then (as for the case of a single variable), the region of rejection is placed in the lower / upper tail of the tabled distribution [23.15].
though χ² may be converted into such a measure, it does not, by itself, serve as one. The test of independence is analogous to the test of the hypothesis, in a correlation table, that the true correlation is ___ [23.15].
plicable to any row or column/ only to the data taken as a whole [23.16]. The
from each cell) composing it. We cannot say for sure whether one group is re¬
We should also remember that when small / large samples are involved, pro¬
in exactly the same way as afforded an R x C table, except that each (f_o − f_e)
                 50 db    87 db
  Help?   Yes      16        3
          No        4       17
(c) State the null hypothesis for a test for a difference between two
proportions.
4. Compute chi-square and draw a conclusion about the null hypothesis, using
the .05 level of significance.
The researchers conducted a parallel experiment in which the stranger did not
wear a cast on his arm, and here the noise level did not significantly affect the
proportion of subjects who helped him, which was 20% at 50 db and 10% at 87 db.
The effect of noise on helpfulness is thus not a simple one; sometimes it matters
and sometimes it doesn't.
24.1 Introduction
1. ___
2. ___
3. ___
4. ___
5. ___
6. ___
7. ___
8. ___
9. ___
10. ___
11. ___
12. ___
13. ___
14. ___
15. ___
16. ___
17. ___
SUMMARY
The t-tests and the one-way analysis of variance are very efficient, in the
sense of providing high power for a given sample size, and they are the tech¬
techniques require that certain assumptions hold true for the distributions of
scores in the populations from which the available data came. For example,
independent samples and also for the one-way analysis of variance [24.1]. The
tests are quite "robust" against violation of such assumptions, in that they
yield results close to correct when the assumptions are wrong. However, a prob¬
lem can arise when the distributional assumptions are materially violated and the samples are small.
All but the Sign Test among the techniques described in this chapter require
that the data be in the form of ranks, so if scores are on hand they must be
rank-ordered. Once ranks are available, though, the techniques that require them
are easy to use, and the Sign Test is exceptionally simple. Thus the techniques
assumptions required for the more efficient techniques may be violated and when
of identical scores that therefore cause ties in rank. Most rank-order proce¬
dures are based on the assumption that the underlying measure is discrete / continuous and that therefore theoretically there are no / only a few ties [24.2].
There are various ways to deal with ties in rank. A simple and reasonably satisfactory one is to assign each of the tied scores the mean of the ranks that would be available to them [24.2]. This procedure usually has little or no effect on the ___ of the entire sample but tends to reduce the
are both in the form of ranks (and there are no ties in rank), calculation of
used on occasion when both sets of measures are in score form. In this case,
each set of measures is translated into rank form, assigning _ to the lowest
score, to the next lowest, etc. [24.3]. When would one do this? Sometimes
2.7 and 2.8. If it can be concluded that what matters is that one score is higher
than another and that how much higher is not really important, translating scores
    r_s = ___        [Formula 24.1]

where D is the ___ between a pair of scores / ranks and n is ___.
Exact procedures have been developed for testing the hypothesis of no correla¬
tion in the population sampled for very small samples, but good results may be
had for n ≥ ___ by finding the critical values required for significance for df
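As a sketch, here is the standard computing formula, r_s = 1 − 6ΣD²/[n(n² − 1)], applied to two hypothetical tie-free rankings (the variable names are mine, not the text's):

```python
# Spearman rank-order correlation from two sets of ranks (no ties).
ranks_x = [1, 2, 3, 4, 5]
ranks_y = [2, 1, 4, 3, 5]

n = len(ranks_x)
d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))
```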
rather than the identity of just the two means, medians, or whatever measure of
1. Label the two groups X and Y; if one group contains fewer cases than the
the rank of 1 to the lowest / highest score, 2 to the next lowest / highest
3. Find ΣR_X, the sum of the ___ of all scores in the X / Y distribution [24.4].
without replacement and only a few / no ties in rank [24.4]. A moderate number
of tied ranks does not substantially disturb the sense of the outcome.
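The three steps can be sketched in code with hypothetical, tie-free scores; X is the smaller group, and ranks run from 1 at the lowest pooled score upward.

```python
# Sum of ranks for the X group (the first steps of the rank-sum test).
x_scores = [12, 15, 19]          # the smaller group is labeled X
y_scores = [14, 17, 20, 22]

pooled = sorted(x_scores + y_scores)
rank_of = {score: i + 1 for i, score in enumerate(pooled)}  # no ties

sum_rx = sum(rank_of[s] for s in x_scores)
```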
to more than two groups [24.5]. Like the __ test (and like
the one- and two-way analysis of variance procedures described in Chapter 22), it
is for dependent / independent groups [24.5]. The null hypothesis states the
the mean / median / mode is probably the best descriptive statistic to use [24.5]
Some Order Statistics (Mostly) 223
computed via Formula 24.2 on p. 461. With three groups and 4 or more cases per
give good approximate results [24.5]. With more than three groups, some groups
[24.5]. The region of rejection lies in the upper / lower tail of the -
distribution [24.5].
As with the _ test for two independent groups, the
effect of ties in rank is not great unless there are many of them [24.5]. As¬
sumptions for the Kruskal-Wallis test are the same as for the -.-
test: random sampling with / without replacement and only a few / no ties in
rank [24.5].
the Sign Test, difference scores are calculated as though one were going to do
a t-test for dependent means using the procedure of Section 17.11, but all posi¬
tive difference scores are assigned the symbol "+", and all negative difference scores are assigned the symbol "−". Under the null hypothesis, we would expect that there would be as many "pluses" as "minuses" in the sample, within the limits
Probably the simplest is to ignore such cases, reduce ___ accordingly, and proceed.
The assumptions required for this test are that the X − Y differences have been randomly drawn from the ___ of difference scores and that sampling is with / without replacement [24.6]. A third assumption is that no difference is exactly zero; in practice, using the method described above will be reasonably satisfactory provided the number of ___ is small [24.6].
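The bookkeeping just described can be sketched as follows, with hypothetical difference scores; the exact binomial tail at the end is one standard way to evaluate the count of pluses, not necessarily the text's own procedure.

```python
# Sign Test bookkeeping on hypothetical X - Y difference scores.
from math import comb

differences = [3, -1, 2, 0, 5, 4, -2, 1, 6, 2]
nonzero = [d for d in differences if d != 0]  # drop zero differences
n = len(nonzero)                              # and reduce n accordingly
pluses = sum(1 for d in nonzero if d > 0)

# Under the null, a plus has probability 1/2, so the two-tailed
# probability of a split at least this lopsided is binomial.
extreme = max(pluses, n - pluses)
p_two_tailed = 2 * sum(comb(n, k) for k in range(extreme, n + 1)) * 0.5 ** n
```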
we may not be willing to make [24.7]: we must assume that differences between
pairs of scores can be placed in rank order. For the test itself, the assump¬
To conduct the test, compute difference scores in the usual way. Then dis¬
regard the sign /size of the differences obtained, and supply ranks to the
We can see immediately why this test is more sensitive than the Sign Test.
That test responded only to the size / direction of the difference between a
pair of scores, whereas the Wilcoxon test uses additional information about the
size / direction of the difference [24.7].
answers
CHAPTER 2
5. 500, of course.
6. An element.
7. A constant so far as your survey is concerned. In a nationwide survey,
state of residence would be a variable.
8. Qualitative. 9. Discrete. 10. Nominal.
11. Quantitative. 12. In theory, continuous; in practice, discrete. 13. Ratio.
is measured on a nominal scale.
228 Answers for Chapter 3
CHAPTER 3
Page 23:

f      prop f    % f
3       .25      25
2       .17      17
4       .33      33
3       .25      25
12     1.00     100

[histogram]
Page 24:

[frequency table; the totals row reads 15  .99  99]
In this latter table, the sums of the proportionate and percentage frequencies
are not quite what they should be because of rounding error.
CHAPTER 3 , continued
Page 26:
1. 98 2. 14
3. 87.5 4. 69.5
10. 28 and 40
11. 52 and 64
12. 96 and 98
Page 28:
1. 66.5 inches.
2. A centile point.
3. Six-one = six feet + one inch = 72 inches + 1 inch = 73 inches. The 95th
centile point is 73.1 inches. Thus 5% of the men are over 73.1 inches, and so
a bit more than 5% are over 73 inches even.
4. Neither. The answer is indeed a percentage, but it's not the percentage of
cases falling below a given point along the scale of scores.
5. 69.3 inches.
8. 64.3 inches and 73.1 inches (which are C5 and C95, respectively).
9. 10%, or about 41 of the 411 or so. The table indicates that 10% were below
65.4 inches in height, and 20% were below 66.5. So going up the scale of heights
from 65.4 to 66.5 raised the cumulative percentage from 10 to 20, getting us an
additional 10% of the cases in the interval in question.
10. 25%, or about 103 of the 411 or so men. The logic behind this answer is
the same as that for Question 9.
CHAPTER 4
Page 37:
2. Skewed, with the tail on the left. Maybe J-shaped, even. It will be J-
shaped if the maximum score, 50, is the score that occurs most often (and is thus
the mode, as you will learn in the next chapter). Note that the size of the
group, 523, is irrelevant to the shape.
Skewed, with the tail on the right. If the test is extremely difficult for the sixth-graders, the shape might even be a backwards J.
CHAPTER 5
Pages 43-44:
1. The mode.
5. The median.
6. The mean.
7. The mode.
8. The mode.
9. The mean.
10. The mean.
11. The mode.
12. The mode.
13. The median.
14. The mode.
15. The mean.
Symbol       Pronunciation                         Meaning

10  Q        "cue"                                 (C75 − C25)/2; semi-interquartile range
26  μ_hyp    "mew hype"                            Value of μ stated in null hypothesis
36  σ_X̄−Ȳ    "sigma sub eks bar minus wi bar"      Standard error of the difference between two means
45  σ_z′1−z′2  "sigma sub zee prime sub one minus zee prime sub two"  Standard error of the difference between two independent z′s
CHAPTER 6
Page 55:
Ex = 48, n = 6, X = 8.0. Ex = 0. lx2 = 96, S2 = 16.0, S = 4.0.
Ex = 60, n = 6, X = 10.0.
II
X
Page 57:
Ex2 = 48, s2 = 4.0, S = 2.0.
M
O
X
II
EX = 48 n = 12, X = 4.0.
•
ZX2 = 480.
lX2 = 654.
IX2 = 520.
Ex2 = 240.
CHAPTER 6, continued
Page 60:
2. The range.
3. The range.
8. The range.
Symbolism Drill:
See p. 231.
Page 62:
3. 3.2.
4. 2.0.
5. The very large scores pull the mean up. See the next-to-the-last para¬
graph on p. 41 of this workbook.
10. Since X̄ = ΣX/n, ΣX = nX̄. Here nX̄ = (176)(6.81) = 1198.56. The total must have been a whole number and could have been either 1198 or 1199; both figures round to 6.81 when divided by 176.
11. Again we must compute nX̄. Here the figures are (128)(2.97) = 380.16, which rounds to 380.
CHAPTER 7
Page 68:
Raw     Deviation   Squared      z        Squared Deviation
Score   Score       Deviation    Score    for z Score
13      +5          25           +1.25    1.5625
13      +5          25           +1.25    1.5625
 9      +1           1           +0.25    0.0625
 6      -2           4           -0.50    0.2500
 4      -4          16           -1.00    1.0000
 3      -5          25           -1.25    1.5625

ΣX = 48          n = 6                   X̄ = ΣX/n = 48/6 = 8.0
Σx² = 96         S² = Σx²/n = 96/6 = 16.0        S = √16.0 = 4.0
Σz = 0.00        z̄ = Σz/n = 0.00/6 = 0.00
S_z² = Σ(z − z̄)²/n = 6.0000/6 = 1.0000          S_z = √1.0000 = 1.0000
Page 69:
For the answers to the symbolism drill, see p. 231
CHAPTER 8
Page 75
z      IQ     CEEB    T score   Centile rank
+2     130    700     70        97.7
 0     100    500     50        50.0
-1      85    400     40        15.9
-3      55    200     20         0.1
Page 77:
Pages 77-78:
4. About 2%.
6. According to Table B of Appendix F, 0.05% of the cases lie beyond a z score of -3.30 (which is the closest we can get to -3.33 in the table). Coming in toward the mean to a z of -2.00 (corresponding to an IQ of 70), we find that 2.28% of the cases lie beyond it. That leaves 2.28 - 0.05 = 2.23% of the cases in the interval between z = -3.30 and z = -2.00. So about 2% of the population is mildly retarded by Zigler's definition.
7.
About 0.05% (actually somewhat less). This is about 1 person in 2000.
8. Five-two = 5 x 12 + 2 = 60 + 2 = 62 inches. For men, the first centile point is 62.6 inches. Thus less than 1% of the men are below 62 inches in height.
9. Between 20 and 30%.
CHAPTER 9
Pages 83-84:
2. Positive and probably at least moderate. The older children will have both longer noses and larger vocabularies. The examples in the first two questions here show that two variables can be correlated even though neither causes the other.
4. Positive and probably high. Note that instead of two scores for a single
subject, here we have two scores for a pair of subjects (a couple). The couple
is thus the equivalent of a single subject, in that it is the unit on which the
5. Still positive and high. The change in social custom would not influence the relationship between the two variables; it would only raise the scores on one variable (husband's age) relative to the scores on the other variable (wife's age). A couple with a high score on husband's age would still tend to have a high score on wife's age, and a couple with a low score on husband's age would still tend to have a low score on wife's age.
Life is full of questions like this one, by the way, questions to which the proper answer is, "That's a stupid question." Stay alert for them.
Symbolism Drill:
See p. 231.
CHAPTER 10
Symbolism Drill:
See p. 231.
CHAPTER 11
If X > X̄, Y′ > Ȳ. If X = X̄, Y′ = Ȳ. If X < X̄, Y′ < Ȳ.
Page 96:
CHAPTER 12
Pages 101-102:
CHAPTER 13
Page 108:
Page 110:
3. The standard error of the mean, which is the standard deviation for the real sampling distribution and not just your approximation to it, is 2.87/√2 = 2.87/1.41 = 2.03.
For n = 10, the standard error of the mean is 2.87/√10 = 2.87/3.16 = 0.91.
Pages 111-112:
CHAPTER 14
Page 117:
CHAPTER 15
Page 127:
CHAPTER 16
Page 135:
CHAPTER 17
Page 147:
Pages 148-149:
D̄ = ΣD/N
  = [(X₁ + X₂ + X₃ + ... + X_N) − (Y₁ + Y₂ + Y₃ + ... + Y_N)]/N
  = (ΣX − ΣY)/N
  = ΣX/N − ΣY/N
  = X̄ − Ȳ
Pages 150-151:
Pages 152-153: _
1. Calling the two samples X and Y, you should compute X̄, Ȳ, s_X, s_Y, and X̄ − Ȳ. The latter should be compared to the average of s_X and s_Y, as Section 6.14 of the text starting on p. 95 tells you, so you can get some idea of how large the difference between the sample means is.
2. You should test a hypothesis about the difference between two population means. The sample means are independent. The null should state that the difference between the two population means of interest is zero, and the alternative should say that it is not zero, which is the two-tailed case. You would have to choose an α level, estimate σ_X̄−Ȳ, and calculate z.
3. You should proceed as in the first study (see Question 1), and in addition you should calculate the correlation coefficient for the two sets of scores.
4. This is a case of dependent means, so you have your choice of the procedures described in Sections 17.10 and 17.11 of the text. The null hypothesis should again declare no difference between the two population means, and the alternative should again be two-tailed. You would have to calculate s_X̄−Ȳ or D̄ and a "z" again.
5. In the second study, each score for Variety A is paired with a score for Variety B, but the two tables of data do not indicate the pairings, making it impossible to compute s_X̄−Ȳ or the difference scores and s_D̄.
CHAPTERS 18 - 21
Pages 159-160
Pages 184-185
CHAPTER 22
Pages 204-206:
1. treatment
3. samples
4. populations
5. Grand
6. Within
7. Sum of squares
15*
16. column
17. row
37. Sample
38. Population
41. F
Page 206:
2. No, because the samples are dependent. There is a kind of one-way analysis
of variance suitable for the data in such a case, but like the t-test for depen¬
dent means, it requires knowledge of how the scores in one sample line up with
the scores in the other sample or samples. This was the information that the
businessman failed to record.
CHAPTER 23
Page 218:
1. Among the subjects tested at 50 db, 80% helped the stranger. Among those tested at 87 db, though, only 15% helped. [80% = 16/(16+4), and 15% = 3/(3+17).]
accepted.
HOMEWORK

On the following pages is homework, one double-sided page for each chapter but the first. The answers to the homework problems appear only in the Instructor's Manual.
Most of the problems in the homework are modeled after ones appearing in the text or in this workbook, to encourage you to do those in the text and the workbook first.
The space for your name is at the bottom of the second side of the homework pages, you will note, and you should write your name there upside down. The person who checks your work is thus unlikely to know who you are until she or he has finished the checking, and there will then be no question of bias in the checking.
Homework for Chapter 2 247
Classify each of the following as a population, a sample, an element, a parameter, or a statistic.

Pop Samp El Par Stat    1. ___ the experimental condition.
Pop Samp El Par Stat    2. The 20 altruism scores in the control condition.
Pop Samp El Par Stat    4. ___ group described in Question 1.

Classify each variable as discrete or continuous.

Nominal Ordinal Interval Ratio    To specify a child's sex, you write "1" for a boy and "2" for a girl.
*Did you hear about the girl who received a report card with "F" written in after
the word "sex"? "'F' in sex!" she cried. "I didn't even know I was taking it."
Think about the level of measurement involved in grading on the scale A-B-C-D-F.
Name _______________
Homework for Chapter 3 249
The question logically next is the one on the back of this page, but it
wouldn't fit on this side. You may wish to do it now.
In the table of selected centile points from the distribution of heights for
_ 7. What percentage of the women were shorter than five feet even?
_ 9. The middle 40% of the distribution lies between what two values?
_ 11. C50 = ?
_ 12. How short can you be and still be taller than half the women in
this sample?
45 44 47 29 28 37 41 34 34 50 47 34 36 17 43 42
40 23 36 28 22 28 33 25 38 49 21 43 25 44 15 37
35 18 32 41 32 42 41 35
Those are real data, so if you took the algebra test, you can meaningfully
compare your score with them. You might want to compute your centile rank in
the distribution. Also, if you won't be getting this page back before you have
to do the homework for the next chapter, you should make a copy of the table.
In the table that answers Question 13 on p. 502 of the text, what are the
following centile points and centile ranks? Use the procedures of Sections 3.10
and 3.11 in your computations.
- The centile rank for 64.5. _ The centile rank for 68.0.
Homework for Chapter 4 251
Be neat, and plan ahead so your graph is as large as possible.
47 34 36 17 43 42
37 41 34 34 50
45 44 47 29 28 37
21 43 25 44 15
22 28 33 25 38 49
40 23 36 28
41 32 42 41 35
35 18 32
252 Homework for Chapter 4
Homework for Chapter 5 253
Here are the scores on the algebra test again.
34 50 47 34 36 17 43 42
47 29 28 37 41 34
45 44
38 49 21 43 25 44 15 37
23 36 28 22 28 33 25
40
32 41 32 42 41 35
35 18
First, leave the data ungrouped. In the space to the left below, arrange the
scores in order of magnitude, as in Table 3.2 of the text, and show your work in
7. What would the mean be? 8. How did you figure out Question 7?
254 Homework for Chapter 5
Now let the data (as originally collected) be grouped into class intervals
of width 3 with 48 - 50 on top. (This is the way you've been grouping the data
in previous homework.) In answering the questions below, don't bother to show
the grouped frequency distribution again, but do show your computations.
In the 1970 census, American women over age 45, who had presumably completed
any childbearing they were going to do, reported the number of children they had
borne. Some said none (about 6% had never married and about 10% of those who did
marry had remained childless), some said one, some said two, and so on. The mean
over all the women in this population was about 2.6 (W. Petersen, Population, 3rd
ed., New York: Macmillan, 1975, p. 533).
What would the mean have been if the following events had happened?
- 12. Each woman had one additional child. (Those who had borne none
in reality would hypothetically have had one.)
- 13. Each woman had two times as many children as she actually did.
(Those who had really borne none would hypothetically have had
2xo=0 still.)
- 14. Each woman had first two times as many children as she actually
did, and then one additional one.
- 15. Each woman had first one child more than she actually did, and
then enough more to double the resulting number.
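Questions 12 through 15 all turn on two rules: adding a constant c to every score adds c to the mean, and multiplying every score by k multiplies the mean by k. If you want to check your reasoning by machine, here is a Python sketch of those rules applied to the reported mean of 2.6 (the code is an aid, not part of the original exercise):

```python
# Rules for the mean under a linear transformation of every score:
#   mean(X + c) = mean(X) + c      (adding a constant shifts the mean)
#   mean(k * X) = k * mean(X)      (multiplying rescales the mean)
mean = 2.6  # mean number of children reported in the 1970 census

q12 = mean + 1          # one additional child each
q13 = 2 * mean          # twice as many children each
q14 = 2 * mean + 1      # first doubled, then one more
q15 = 2 * (mean + 1)    # first one more, then doubled
```

Note how the order of the two operations distinguishes Question 14 from Question 15.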
Homework for Chapter 6 255
34 50 47 34 36 17 43 42
47 29 28 37 41 34
45 44
49 21 43 25 44 15 37
28 22 28 33 25 38
40 23 36
32 41 32 42 41 35
35 18
3. What is Σx?
4. What is (Σx)²?
5. What is Σx²?
6. What is S²?
7. What is S?
Now suppose, as you did in the homework for the last chapter, that the test
Let the data now be grouped in the familiar way, into class intervals of
width 3, with 48 - 50 on top. Show your computations for the questions below,
but don't bother to copy in the grouped frequency distribution.
18. What is S?
In the fall of 1977, the Educational Testing Service reported that 54,903
people had taken the Graduate Record Exam, and that their scores on those items
measuring verbal aptitude had a mean of 503 and a standard deviation of 126.
What would the new mean and the new standard deviation be if the following silly
operations were performed on each of those 54,903 scores?
New Mean        New Standard Deviation
26. 50 points are subtracted from each score, and the resulting value is then
divided by 3.
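The same transformation rules settle Question 26: subtracting a constant shifts the mean but leaves the standard deviation unchanged, while dividing by a constant divides both. A sketch for checking:

```python
# GRE verbal-aptitude scores: mean 503, standard deviation 126.
old_mean, old_sd = 503, 126

# Question 26: subtract 50 from each score, then divide the result by 3.
new_mean = (old_mean - 50) / 3   # subtraction shifts the mean, division rescales it
new_sd = old_sd / 3              # subtraction leaves the spread alone
```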
Homework for Chapter 7 257
1. above z = +0.50?
2. above z = +1.50?
3. below z = -2.50?
4. below z = -3.50?
5. above z = -0.50?
6. above z = -1.50?
7. below z = +2.50?
8. below z = +3.50?
9. between z = … and z = -2.40?
10. between z = … and z = +0.60?
11. between z = -1.80 and z = +0.80?
13. outside the limits z = +0.40 and z = -1.60?
14. outside the limits z = -1.20 and z = …?
IQ scores for the general public on the Wechsler Adult Intelligence Scale
(the WAIS) are normally distributed with a mean of 100 and a standard deviation
of 15. Answer the following questions to 2 decimal places (e.g., 0.12%). What
percentage of the general (adult) public has a Wechsler IQ...
17. below 70?
18. below 85?
19. above 115?
20. above 145?
21. below 130?
22. above 85?
23. between 85 and 115?
24. between 70 and 130?
26. divides the upper 10% of the scores from the remainder?
27. divides the upper 20% of the scores from the remainder?
28. divides the lower 25% of the scores from the remainder?
29. divides the lower 40% of the scores from the remainder?
30. divides the upper 60% of the scores from the remainder?
31. divides the upper 70% of the scores from the remainder?
32. divides the lower 80% of the scores from the remainder?
33. divides the lower 90% of the scores from the remainder?
Answer these questions to the nearest whole number (e.g., 123). If a
distribution of scores on a standardized aptitude test is normal in shape with a
mean of 500 and a standard deviation of 100, what is the raw score (not the z score)...
Again answer to the nearest whole number. In the distribution described just
above, what are the raw scores (not the z scores)...
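All of these normal-curve questions reduce to two operations: raw score → z → area (read from the normal-curve table), and area → z → raw score, read the other way. A Python sketch using the standard-normal CDF in place of the table (the bisection inverse is my own helper, not anything in the text):

```python
import math

def phi(z):
    """Proportion of a standard normal distribution below z (replaces the table)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Area question: percentage of Wechsler IQs (mean 100, SD 15) below 85.
pct_below_85 = 100 * phi((85 - 100) / 15)          # z = -1.00

# Score question: raw score dividing off the lower 25% of an N(500, 100) distribution.
def z_for_proportion(p):
    lo, hi = -6.0, 6.0
    for _ in range(60):                            # bisect on the monotone CDF
        mid = (lo + hi) / 2
        if phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

score_25th = 500 + 100 * z_for_proportion(0.25)    # z is about -0.67
```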
Homework for Chapter 8 259
Fill in the missing values in the table below, noting that the four scores
on a given line would be truly equivalent only if the distributions from which
they came had similar shapes. This exercise is modeled after the one in the
workbook on p. 75.
Where the answer is not a whole number, give it to one decimal place (e.g.,
123.4)—except that you should give the centile ranks in the right-hand column
to two decimal places (e.g., 12.34).
Score where     Score where     Score where     Score where     Centile Rank if
μ=100, σ=15     μ=100, σ=10     μ=500, σ=100    μ=80, σ=20      Shape is Normal
850
125
100
54.50
50.00
94
84
300
10
260 Homework for Chapter 8
In the fall of 1977, a college senior took the Graduate Record Examination
and received the following information from the Educational Testing Service,
which constructs and scores this instrument:
Quantitative Verbal Analytic
Aptitude Aptitude Aptitude
Student's own score: 440 560 585
Mean score for all who took the test: 525 503 513
Standard deviation for all who took the test: 133 126 129
Her centile rank in this group: 25 66 63
4. If the verbal-aptitude scores of all who took the test had been
normally distributed, what would her centile rank have been (to the
nearest whole number)?
6. If the analytic-aptitude scores of all who took the test had been
normally distributed, what would her centile rank have been (to the
nearest whole number)?
Now for each subtest compare the student's actual centile rank with the one
she would have earned in a normal distribution. The comparison doesn't provide
conclusive evidence, but it does permit an informed guess about the actual shape
of the distribution of scores. If the discrepancy between her actual centile rank
and the one she would have earned in a normal distribution is small, we have
little evidence against the most plausible hypothesis, which is that the true
shape is normal. If the discrepancy is large, we do have some good evidence
against the hypothesis of normality, and we can tell whether the shape is skewed
left or skewed right. So for each subtest, indicate your conclusion about the
shape of the distribution of scores. If you infer a skew, spell out your
reasoning about the direction of the skew.
7. Quantitative aptitude:
8. Verbal aptitude:
9. Analytic aptitude:
Homework for Chapter 9 261
38 75
A
54 65
B
62 94
C
67 81
D
67 84
E
72 93
F
77 90
G
77 93
H
82 90
I
85 95
J
How closely is performance on the first test related to performance over the
entire semester? To begin to answer this question, make a scatter plot of the
data in the space below. Do it neatly and as large as possible.
262 Homework for Chapter 9
Don't bother to list the individual values of X² and Y², but do show your
other work. Give all values that are not whole numbers to three decimal places
(as 1.234, e.g.). This is more than the number that would usually be reported
for a mean or a standard deviation or a correlation coefficient, but you'll need
the extra accuracy for future work with these data.
ΣX        ΣY
ΣX²       ΣY²
X̄         Ȳ
S_X       S_Y
ΣXY
Σxy
Now that you've found the means, go back to your scatter plot and add lines
that show the locations of the means, as in the figure on p. 148 of your text.
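Once the sums above are filled in, the correlation follows from the raw-score formula. Here is a Python check on the ten pairs in the table, which you can compare your hand computation against:

```python
import math

# X = % correct on Exam 1, Y = % correct over the semester, for students A-J.
X = [38, 54, 62, 67, 67, 72, 77, 77, 82, 85]
Y = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]
n = len(X)

sum_x, sum_y = sum(X), sum(Y)
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)
sum_xy = sum(x * y for x, y in zip(X, Y))

# Raw-score formula for the Pearson correlation coefficient
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
```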
Homework for Chapter 10 263
2. Suppose you classify all the registered voters in the U. S. by their age.
You group all the 18-year-olds together, all the 19-year-olds together, and so
on. Would the Pearsonian correlation coefficient do a good job of describing
the relationship between a group's age and the proportion of the people in that
group who actually voted in a given election? Why or why not?
The following data (taken from an almanac) indicate the sort of numbers you'd
be working with. These are estimates of the national turn-out for the 1972
presidential election, which was the first such election in which citizens under
21 could vote.
Age Bracket     % of Those Registered Who Voted
18 - 20 48.3
21 - 24 50.7
25 - 29 57.8
30 - 34 61.9
35 - 44 66.3
45 - 54 70.9
55 - 64 70.7
65 - 74 68.1
75 & + 55.6
264 Homework for Chapter 10
4. Have you ever wondered just how the amount of studying a person does on
a given subject relates to the person's mastery of that subject? Suppose you
questioned a variety of your classmates, asking each: a) How much time did you
spend in studying for whatever objective examination you took most recently, and
b) what percentage of the items on the exam did you get correct? Imagine that
the Pearsonian correlation coefficient for these two variables turns out to be
negative in your sample. Say there are 15 people in the sample, and the value
of r is -.32. Would you be tempted to reduce your studying time in the
expectation that your grades would increase? Name at least two reasons why your
finding (r = -.32) provides only very weak evidence that more studying time
causes exam performance to deteriorate.
Homework for Chapter 11 265
Here are the data from the homework for Chapter 9 again.
38 75
A
54 65
B
62 94
C
67 81
D
67 84
E
72 93
F
77 90
G
77 93
H
82 90
I
85 95
J
Show your work.
Y′ =
The student who was at the median on the first exam (and who was
omitted from the table) earned a score of 69% correct on that exam.
Use your regression equation to predict this person's score over
the entire semester (to 2 decimal places). The actual figure was
86% correct. Show your work below.
Now use the regression equation to predict performance over the entire
semester for the 10 students who contributed data to the table. Fill in the
table below, which parallels the one on p. 186 of the text. Give the values of
Y′ to 2 decimal places. Remember that Σ(Y − Y′) should be 0, but it may be a
little off because of rounding error.
X       Y       Y′      (Y − Y′)
A 38 75
B 54 65
C 62 94
D 67 81
E 67 84
F 72 93
G 77 90
H 77 93
I 82 90
J 85 95
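If you want to verify the fitted equation and the completed table by machine, here is a least-squares sketch. Note that it is the sum of the raw residuals Y − Y′ that comes out to zero for a least-squares line; the sum of their squares is merely as small as possible.

```python
X = [38, 54, 62, 67, 67, 72, 77, 77, 82, 85]
Y = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]
n = len(X)

mean_x, mean_y = sum(X) / n, sum(Y) / n

# Least-squares slope and intercept for the line Y' = a + bX
b = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / sum(
    (x - mean_x) ** 2 for x in X
)
a = mean_y - b * mean_x

y_pred_median_student = a + b * 69          # the student who scored 69 on Exam 1
residuals = [y - (a + b * x) for x, y in zip(X, Y)]
```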
Homework for Chapter 12 267
those data?
tion is zero.
course.
8. That company you're working for now develops a set of materials for
teaching reading. (There's a huge market for this kind of thing.) To persuade
potential customers that the materials work, it is necessary to try them out,
collecting data before and after students use the materials. Now the company
has to decide what kind of sample to study: children who are already well above
average for their age in reading ability, children who are average or close to
it for their age, or children who are well below average for their age. There
is an unethical choice you could make here that would virtually guarantee that
the mean reading-ability score of the children in the sample would increase from
before the use of your company's materials to afterwards, even if the materials
were ineffective. Which choice is this, and why will this sample's mean almost
certainly rise from the pretest to the posttest no matter how poor the
instructional materials are?
Homework for Chapter 13 269
Answer these questions to 4 decimal places (as .1234, e.g.), and show your
work.
1. over 105?
2. under 90?
3. more than 5 points away from 100 in either direction (i.e., more
than 100 + 5 or less than 100-5)?
4. more than half a standard deviation away from the population mean
(The standard deviation this question refers to is that of the
population.)
6. over 105?
7. under 90?
9. more than half a standard deviation away from the population mean?
11. Suppose your sample will be quite large, with 100 persons. To 4
decimal places, what is the probability that the 100 people will
have Wechsler IQs whose mean lies within 1 measly point of the
population mean? You may find the answer surprisingly large.
12. With a sample as large as 100, it doesn't much matter whether the
distribution of IQs is normal in the population. Even if it departs considerably
from normality, we can still be quite confident that the answer to Question 11 is
correct. Why is this? There are three "magic words" that name the reason, and
the briefest possible answer to this question (an answer that is still entirely
correct, though) requires no more than those three little words.
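The arithmetic behind Question 11 is worth seeing in one place: the standard error of the mean is the population SD divided by √n, and the question then becomes an ordinary normal-curve area. A sketch (phi stands in for the normal-curve table):

```python
import math

def phi(z):
    """Proportion of a standard normal distribution below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

pop_sd, n = 15, 100

se = pop_sd / math.sqrt(n)                 # standard error of the mean: 1.5 points

# Probability that the mean of a sample of 100 lands within 1 point of 100
p_within_1 = phi(1 / se) - phi(-1 / se)
```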
Homework for Chapter 14 271
An honest die is one whose six faces turn up with equal probability. The
"other" way of looking at probability described in Section 14.2 of the text thus
applies to it.
In answering these questions, assume that the dice are honest, and give the
requested probabilities both as common fractions reduced as far as possible (as
1/6, e.g.) and as decimal fractions to four decimal places (as .1234, e.g.).
You'll do best if you first translate each question into an OR question, an AND
question, or a combination of the two, whichever is appropriate. (See p. 115 of
the workbook.) Show your work below each of the questions.
other?
If you draw a card at random from a standard deck of 52, what is the
probability that it is...
9. a queen OR a heart?
Now suppose you draw a first card at random, look at it, replace it, and
draw again at random. This is sampling (drawing a sample of size 2) with
replacement, and it is equivalent to drawing the first card from one deck and
the second card from a second deck.
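As an illustration of the OR/AND translation (the second example, a queen on the first draw and a heart on the second, is my own and not one of the numbered problems):

```python
from fractions import Fraction

# OR on a single draw: add the probabilities, then subtract the overlap
# (the queen of hearts would otherwise be counted twice).
p_queen = Fraction(4, 52)
p_heart = Fraction(13, 52)
p_overlap = Fraction(1, 52)
p_queen_or_heart = p_queen + p_heart - p_overlap

# AND across two independent draws (sampling with replacement): multiply.
# Illustrative example: a queen on the first draw AND a heart on the second.
p_queen_then_heart = p_queen * p_heart
```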
In a class of the author of your workbook, there were seven women enrolled,
and all of them gave birth to a girl rather than a boy. Is this a rare
occurrence? Assume that the probability of any one's bearing a girl is 1/2.
(Actually, the probability of a boy is slightly greater than the probability of
a girl, about .51 or .52. More boys than girls are conceived, and even though male
sure*!
Homework for Chapter 15 273
The psychologists expected that the students in the PSI course would generally
make their initial try at the exam on a given unit without being fully prepared,
so that most would not meet the criterion of 16 correct on the first try. In
fact, in the students' first tries at the exam on the very first unit of the
course, they earned a mean of 17.93 correct (which is almost 90%). The
psychologists reported the n and the standard deviation for this group: 64 and
2.91, respectively. Thus you can determine whether it is plausible that those 64
scores are a random sample from a population whose mean is only 16. Test the
appropriate hypothesis, using the .05 level of significance and doing a
two-tailed test. Carry all calculations to 3 decimal places, and round your
final answers to 2—but if you need an answer for a later calculation, use the
3-place version.
1. H0 in symbols:
2. Ha in symbols:
3. α:
4. X̄:
5. The value for the standard deviation given above is S. Compute s using the
formula s = √(S²(n)/(n − 1)):
6. s_X̄:
7. z_crit:
8. "z"_calc:
9. Decision on H0: Accept Reject
11. If you had conducted that test at the .01 level of significance, would
your decision on the null hypothesis have been different?
Yes No
12. If you had conducted the test at the .05 level of significance but had
done a one-tailed version in which the alternative hypothesis stated that μ < 16,
would your decision on the null hypothesis have been different?
Yes No
13. If you had conducted the test at the .01 level of significance and had
done a one-tailed version in which the alternative hypothesis stated that μ < 16,
would your decision on the null hypothesis have been different?
Yes No
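As a machine check on this test, using the figures stated in the problem (n = 64, a mean of 17.93, S = 2.91) and the S-to-s conversion given in Question 5:

```python
import math

n, sample_mean, S = 64, 17.93, 2.91
hypothesized_mu = 16

s = math.sqrt(S ** 2 * n / (n - 1))     # convert S to the estimate s
se = s / math.sqrt(n)                   # standard error of the mean

z = (sample_mean - hypothesized_mu) / se
reject_at_05 = abs(z) > 1.96            # two-tailed critical value for alpha = .05
```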
In the lecture course, the mean score on the first exam was only 13.41 (67%
correct). Is it plausible that the scores in this group are a random sample from
a population whose mean is as large as 16? The n for the group was 61, and the
standard deviation (S) was 4.07. Do a two-tailed test at the .05 level of
significance again.
14. H0 in symbols:
15. Ha in symbols:
16. α:
17. X̄:
18. s:
19. s_X̄:
20. z_crit:
21. "z"_calc:
22. Decision on H0: Accept Reject
Over the remaining nine units of the course, the PSI students earned a mean
above 16 on their first tries at every unit except one. The mean of their first
tries at the exam on that unit was only 15.39 (n = 62, S = 2.78). Is it
plausible that those scores were a random sample from a population whose mean
was only 16? Do the appropriate two-tailed test at the .05 level of significance.
27. If you had conducted the test (still two-tailed) at the .10 level of
significance, what would z_crit have been?
You may be interested to know that over all 10 units of the course, even on
their first tries the PSI students earned a mean score higher than the mean for
the students in the lecture version. In the homework for Chapter 17 you will
have a chance to determine whether the differences are statistically significant.
Homework for Chapter 16 275
1. Recall from the homework for Chapter 15 the comparison of PSI with a
traditional lecture course in elementary psychology at Southwest Minnesota
State College. In their first tries at the exam on the first unit of the course,
the 64 PSI students earned a mean of 17.93 correct out of the 20 items, with a
standard deviation of 2.91. The figure 16 out of 20 was of special interest
here, because 16 or better was required for going on to the next unit. Evaluate
the difference between 17.93 and 16.00, following the procedure suggested…
3. The 61 students in the lecture course earned a mean of only 13.41 on the
first exam, with a standard deviation of 4.07. How many standard deviations'
worth is the difference between 13.41 and 16.00?
5. The poorest performance for the PSI students came in their first tries at
the exam on one unit, where they earned a mean of 15.39 with a standard
deviation of 2.78. How many standard deviations' worth is the difference
between 15.39 and 16?
8. How important is this difference according to the standards on p.
276 Homework for Chapter 16
11. Given that the psychologists were interested in discovering the
population mean to be either above 16 or below it, what should their
alternative hypothesis have said?
12. Suppose you test the null hypothesis that the population mean is 16 for
a sample of PSI scores, and you end up rejecting the null. The sample mean was
18, let's say, and the n was 60. You did a one-tailed test at the .01 level of
significance, and a friend who is naive about statistics asks, "Does your result
mean we can be 99% confident that the population mean is above 16?" Explain to
your friend the logic behind your hypothesis test, and say how the figure 99%
enters into things. Work the figures 16, 18, and 60 into your explanation too.
Recall again the comparison between PSI and a conventional lecture course in
elementary psychology. The instructors of the course reported the following data
for scores on the first exam: PSI Lecture
X 17.93 13.41
S 2.91 4.07
n 64 61
Is it plausible that the two groups are random samples from populations with
identical means? Test the appropriate hypothesis at the .01 level of
significance, doing a two-tailed test. Carry your calculations to 3 decimal
places, and round your final answers to 2—but if you need an earlier answer for
a calculation, use the figure with 3 decimal places. Show your work. If you need
to compute s, the formula is s = √(S²(n)/(n − 1)).
1. These samples are (circle one): independent dependent
2. H0 in symbols:
3. Ha in symbols:
4. α:
5. X̄ − Ȳ:
6. s_X̄:
7. s_Ȳ:
8. s_X̄−Ȳ:
9. z_crit:
10. "z"_calc:
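A sketch of the same two-sample test in Python, assuming (as in the one-sample homework) that each S is first converted to s, and that the standard error of the difference is the square root of the sum of the two squared standard errors:

```python
import math

def se_of_mean(S, n):
    """Standard error of a sample mean, converting S to s first."""
    s = math.sqrt(S ** 2 * n / (n - 1))
    return s / math.sqrt(n)

se_psi = se_of_mean(2.91, 64)
se_lecture = se_of_mean(4.07, 61)

# Standard error of the difference between two independent means
se_diff = math.sqrt(se_psi ** 2 + se_lecture ** 2)

z = (17.93 - 13.41) / se_diff
reject_at_01 = abs(z) > 2.58            # two-tailed critical value for alpha = .01
```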
The mean for the PSI students' first attempts at each of the remaining nine
exams was higher than the corresponding mean for the lecture students. The data
on the third unit are of interest, because that was the one on which the lecture
students did the best and the only unit on which they earned a mean over 16:
        PSI     Lecture
X̄      17.06    16.60
S        2.01     2.89
n          64       72
Again, determine whether it is plausible that the parent populations have the
same mean, doing a two-tailed test at the .01 level of significance.
278 Homework for Chapter 17
Also of interest are the data on the eighth unit, because this was the one
on which the PSI students did least well, and the only one on which the mean of
their first tries at the exam was under 16. The PSI class still outperformed
the lecture class, though:
PSI Lecture
X̄      15.39    13.27
S        2.78     3.34
n          62       61
27. Which, if any, of the above three tests would have yielded a different
conclusion about the null hypothesis if it had been conducted at the .05 level
of significance? None—1st 2nd 3rd
28. Suppose you're wondering whether the PSI students' performance on their
first tries on the first unit differed significantly from their performance on
their first tries at the tenth unit (X̄ = 16.49). If you were to do a two-tailed
test of the hypothesis that the parent populations for the two samples of scores
had identical means, using the .01 level of significance again, could you follow
the procedure and use the formulas that you employed for the three problems
above? Why or why not?
29. If you could not follow the same procedure and use the same formulas, say
what you would have to do differently.
Homework for Chapter 18 279
Here are the data again for the first-unit comparison of PSI with a lecture
course covering the same material:
PSI Lecture
X̄      17.93    13.41
S        2.91     4.07
n 64 61
Determine the 95% confidence limits for the PSI population's mean.
1. X̄
2. z_p
3. s_X̄
4. Lower limit
5. Upper limit
6. d1
8. w
9. s
11. Required n
Determine the 99% confidence limits for the lecture population's mean.
12. X̄
13. z_p
14. s_X̄
17. d1
Now find the 95% confidence limits for the difference between the PSI
population's mean and the lecture population's mean.
19. X̄ − Ȳ
20. z_p
21. s_X̄−Ȳ
24. s_av
25. d2
Suppose you wanted the 95% confidence limits for the problem above to be only
1 point wide on the scale of raw scores. Estimate the required size for each
sample (which must be a whole number, of course).
27. w
28. s_av
29. z_p
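For checking the first set of limits: the 95% interval is X̄ ± 1.96 s_X̄. A sketch with the PSI figures:

```python
import math

n, sample_mean, S = 64, 17.93, 2.91

s = math.sqrt(S ** 2 * n / (n - 1))     # convert S to s
se = s / math.sqrt(n)                   # standard error of the mean

lower = sample_mean - 1.96 * se         # 95% confidence limits for the
upper = sample_mean + 1.96 * se         # PSI population's mean
```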
Homework for Chapter 19 281
Here are the data from the homework for Chapter 9 again.
% Correct % Correct
Student
on Exam 1 over Semester
A 38 75
B 54 65
C 62 94
D 67 81
E 67 84
F 72 93
G 77 90
H 77 93
I 82 90
J 85 95
If the students had known the answer to 75% of the questions over the entire
semester, on the average, their mean percentage correct would have been 81.25.
The extra 6.25 percentage points would have come from their guessing correctly
on a quarter of the 25% of the items they didn't know. (With 4-choice items,
the probability of a correct guess is .25.) Is it plausible that the mean of
the population of percentage-correct scores for the entire semester is 81.25?
Do a two-tailed test of the appropriate hypothesis at the .05 level of
significance. (Don't be confused because you previously called the scores Y.)
1. H0 in symbols:
2. Ha in symbols:
3. α:
4. X̄:
5. s_X̄:
Now estimate the population mean for the total-performance scores by finding
11. t_p
19. …
20. df:
21. t_crit:
Here again are the data for the comparison of PSI with a conventional lecture
course in elementary psychology:
PSI Lecture
X 17.93 13.41
S 2.91 4.07
n 64 61
Test the difference between the two sample means, doing a two-tailed test at the
.01 level of significance and using the t statistic. It will be interesting to
compare this test with the one you did on the same data for Chapter 17.
31. df:
Finally, compute the 90% confidence limits for the difference between the two
population means.
1. H0 in symbols
2. Ha in symbols
3. n
4. r
5. t
6. df
7. t_crit
8. Decision on H0
Now compute the 95% confidence limits for the population value, again
showing your work.
10. z′
11. …
12. σ_z′
16. Upper limit expressed as a correlation coefficient
17. Are the limits equidistant from the sample value? If not, which limit
is closer?
284 Homework for Chapter 20
Relevant data have been gathered at Queens College of the City University
of New York (L. H. Seiler, L. D. Weybright, & D. J. Stang, "How Useful Are
Published Evaluation Ratings to Students Selecting Courses and Instructors?"
Teaching of Psychology, 1977, 4, 174-177). Five different Pearsonian
correlation coefficients are available, each describing the relationship between the
mean rating an instructor received for a given course and the mean that instruc¬
tor received for the same course one year later. The n's for the correlations
range from 99 to 183, and the r's range from .58 to .65. Even the largest of
these, which happens to be the one based on the biggest n and is thus the best
single estimate of the true correlation, is disappointingly small.
--- 18. What is the coefficient of alienation for these data? (See
Section 12.7.)
20. H0 in symbols
21. Ha in symbols
22. z′1
23. z′2
24.
25.
26. z_crit
27. Decision on H0
Homework for Chapter 21 285
Look back at the first page of homework for Chapter 15. Note that on their
first tries at the exam on the first unit of the PSI course, the 64 students
correctly answered a mean of almost 18 out of the 20 items, which turned out to
be significantly greater (in the statistical sense) than 16, the minimum needed
for proceeding to the next unit.
To get some idea of the power of the test you did there, assume that the
standard deviation of the scores in the population is 2.40. (The article from
which this example is drawn reports the standard deviation for 10 samples of
first tries at an exam in that PSI course, one for each of the 10 units into
which the course was divided, and the mean of the 10 standard deviations is 2.42.)
Following the procedure illustrated in Section 21.10, determine β for the test
you did (a two-tailed test at the .05 level of significance using a sample of
size 64)—on the assumption that the true mean was 17, just one point higher than
the hypothesized mean. To show your work, construct a neat, carefully labeled
diagram like Figure 21.5 on p. 371. Do a rough draft first on another sheet of
paper.
2. Say in words what this figure means for this particular case.
5. What was the minimum difference between the population means
that was discoverable with a probability of .80? (This is
the difference for which the risk of missing it was .20.)
6. What was the minimum difference that was discoverable with
a probability of 90%? (This is the difference for which the
risk of wrongly accepting the null was 10%.)
Homework for Chapter 22 287
21 16 12
21 15 10
20 15 10
19 15 10
19 14 8
1. X̄_D:   2. X̄_E:   3. X̄_F:
Note that the means are very widely dispersed, whereas within each subgroup the
scores cluster tightly about their mean.
Now compute SS_W.
10. Σx²
11. …
12. SS_W
Onward to s²_A. First use the formula on the bottom of p. 395, showing your
work in computing the numerator of the fraction by filling in the table below.

        X̄       X̄ − X̄_G       (X̄ − X̄_G)²
X̄_D
X̄_E
X̄_F
                                Σ =

13. Σ(X̄ − X̄_G)²
14. n
15. SS_A
16. df_A = k − 1
17. s²_A
288 Homework for Chapter 22
__ 18- The other term in the formula, the one subtracted from the first
_ 20• SSA as computed from Formula 22.6 Compare with previous result
21. F
calc
22. F for a = .05
cnt
23. Decision on null hypothesis stating equality of population means
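The whole analysis above can be checked by machine. A sketch of the one-way analysis of variance for the three samples, computed from deviations about the group and grand means (it should agree with the formula-based values you filled in):

```python
groups = {
    "D": [21, 21, 20, 19, 19],
    "E": [16, 15, 15, 15, 14],
    "F": [12, 10, 10, 10, 8],
}

all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / len(all_scores)
k = len(groups)
n = 5                                   # scores per group

# Within-groups sum of squares: deviations about each group's own mean
ss_within = sum(
    sum((x - sum(g) / n) ** 2 for x in g) for g in groups.values()
)

# Among-groups sum of squares: group means about the grand mean
ss_among = sum(n * (sum(g) / n - grand_mean) ** 2 for g in groups.values())

df_among, df_within = k - 1, k * (n - 1)
F = (ss_among / df_among) / (ss_within / df_within)
```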
For a bit of extra insight, finally, compute SS_T first via the definitional
formula on p. 397 and then via the raw-score formula on p. 399. To use the
definitional formula, fill in the table below, which lists all the scores.
X       X − X̄      (X − X̄)²
Suppose an experimenter assigns 45 men at random
to receive a placebo, a small dose of caffeine, or
a large dose, and then determines their reaction time
in an apparatus simulating the braking of an automobile. The n's are equal
for the three treatment levels. The experimenter then repeats this procedure
with 45 women. The resulting data can be studied in
a two-way analysis of variance. One variable is
dosage of drug, and the other is sex of subject.
26. The other term in Formula 22.7
27. SS_T from Formula 22.7
df_WC
32. F_crit for dosage at α = .05
33. F_crit for sex at α = .05
34. F_crit for interaction at α = .05
Homework for Chapter 23 289
Want to convince people that you can read their minds? Try this
demonstration. Ask a good-sized group of people each to think of a number between six
and ten, inclusive. Each person should make his or her choice individually and
keep it private. Then request that the group think their numbers "at" you, and
announce that you will receive, via telepathy, the number that "comes through"
most strongly, which will be the modal choice. Pretend to receive their thoughts,
and state with confidence that the "loudest" number is seven. You will have an
excellent chance of being correct. Why? Consider the following data, which are
the results of asking 207 introductory-psychology students to choose a number
from six to ten. (The data were reported by Philip Zimbardo in the instructor's
manual for the ninth edition of his text Psychology & Life.)
Choice f
six 24
seven 112
eight 33
nine 25
ten 13
Does it appear plausible, in light of these data, that people in our
contemporary society make those five possible choices in equal proportions? Test the
appropriate hypothesis at the .05 level of significance.
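A machine check of the goodness-of-fit test: under the null hypothesis each of the five choices has expected frequency 207/5, and χ² sums (observed − expected)²/expected over the choices.

```python
observed = {"six": 24, "seven": 112, "eight": 33, "nine": 25, "ten": 13}

total = sum(observed.values())          # 207 students
expected = total / len(observed)        # 41.4 expected per choice under H0

chi2 = sum((f - expected) ** 2 / expected for f in observed.values())

df = len(observed) - 1                  # 4 degrees of freedom
reject_at_05 = chi2 > 9.49              # chi-square critical value for df = 4
```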
Choice f
seven 112
other 95
                 Males   Females
Choice   Luck       6       22
         Skill     22       12
Total              28       34
11. Conceptualizing this study as a test of the difference between two
proportions, state the appropriate null hypothesis in words.
16. State your decision on the null hypothesis and interpret your finding,
specifying the direction of the difference, if any, between the two sexes. Use
the remaining space to copy in (neatly) your computation of χ².
The study is the work of Kay Deaux, Leonard White, and Elizabeth Farris:
"Skill versus Luck: Field and Laboratory Studies of Male and Female Preferences,"
Journal of Personality and Social Psychology, 1975, 32, 629-636.
Homework for Chapter 24 291
1. Here are the data from the homework for Chapter 9 again. Compute Spearman's
rank order correlation coefficient for the two variables, showing all your work.
A 38 75
B 54 65
C 62 94
D 67 81
E 67 84
F 72 93
G 77 90
H 77 93
I 82 90
J 85 95
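One way to check your hand computation: assign each variable average ranks (tied scores share the mean of the ranks they occupy), then apply the Pearson formula to the ranks. With ties handled this way, the result may differ slightly from the Σd² shortcut formula.

```python
import math

X = [38, 54, 62, 67, 67, 72, 77, 77, 82, 85]
Y = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]

def ranks(values):
    """Ranks with 1 = smallest; tied scores share the average of their ranks."""
    ordered = sorted(values)
    out = []
    for v in values:
        first = ordered.index(v) + 1       # rank of the first occurrence
        count = ordered.count(v)
        out.append(first + (count - 1) / 2)
    return out

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return cov / math.sqrt(
        sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)
    )

rho = pearson(ranks(X), ranks(Y))       # Spearman's rank-order correlation
```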
2. Compute χ² for a Sign Test of the difference between the two samples of
scores in the table above. Show your work in the space below.
3. Recall that with 1 df, √χ² = z, and z is comparable to t. Compute z for the
Sign Test. You may be interested to compare it with the value of t that you
found in the homework for Chapter 19 at the top of p. 282.
292 Homework for Chapter 24
Now do Wilcoxon's Signed Ranks Test for the data on the other side. If you
think a bit, you'll see that it's not necessary to find the difference scores.
Show your work in the space below.
4. W₊
5. W₋
8. Decision on H0
Test the difference between the two samples with the Mann-Whitney procedure,
using a two-tailed alternative and the .05 level of significance. Call the
stare condition X, and show your work in the table above.
9. ΣR_X
11. Decision on H0