


WORKBOOK
to accompany

STATISTICAL REASONING
IN PSYCHOLOGY AND EDUCATION
Second Edition

Edward W. Minium

prepared by
Gordon Bear
Ramapo College of New Jersey

John Wiley & Sons, Inc.


New York • Santa Barbara • Chichester • Brisbane • Toronto
Copyright © 1978 by John Wiley & Sons, Inc.

Reproduction or translation of any part of this work beyond that permitted by Sections 107 or 108 of the 1976 United States Copyright Act without the permission of the copyright owner is unlawful. Requests for permission or further information should be addressed to John Wiley & Sons, Inc.

ISBN 0 471 03663 1


Printed in the United States of America

10 9 8 7 6 5 4 3 2 1
Acknowledgments

I am indebted to many fine people for important contributions to this workbook. Thank you to:

Ed Minium, whose text has made statistics so easy for me to teach and so
easy for my students to learn, and who also provided a careful critique of
the first three chapters of the workbook;

Jack Burton, my editor, who supplied much valuable guidance and understood
when the work expanded to fill the time available;

Bob Abelson, Barry Collins, and Fred Sheffield, who were my own conscientious instructors in the subject of statistics;

my teaching assistants at the University of Wisconsin, who worked diligently with me and provided helpful advice for improving my instruction;

my many students, from whom I have learned a great deal about the teaching
of statistics and other matters;

John Harsh, who offered to share with me the fruits of his labor on his
mastery-plan workbook;

Bob Worsham, who contributed the data used in the homework for Chapter 9;

the Faculty Research Committee and the Academic Administration of Ramapo College, who reduced my teaching duties to facilitate my work on this project;

Joe Fontanazza, who found the fantastic machine on which I typed both the
rough and the final drafts;

the classical-music stations of New York City, which nourished my spirit through many long nights of typing;

Cartoon Cat, who provided companionship through those nights.

I happily dedicate this work to the women in my life.

A PERSONAL STATEMENT from the AUTHOR of this WORKBOOK

I want to help you learn statistics and earn a good grade in this course.

You're off to a good start, because your text is the book by my colleague
Ed Minium, and it's an outstanding one. I taught from the first edition of
this book ten times: seven times at the University of Wisconsin at Madison,
where my classes had 60 to 90 students, and three times at Ramapo College of
New Jersey, where my classes had 10 to 30 students. At both schools the book
proved to be excellent for developing a thorough understanding of statistics,
and now the second edition is even better. Many other texts supply only the
bare facts, but Dr. Minium gives you much more. He shows you the overall
structure into which the facts fit, and he adds details that provide insights
into matters treated only superficially in other texts.

The purpose of my workbook is to help you master Dr. Minium's text. You'll
find here:

•Do-it-yourself summaries that direct your attention to the important points
•Maps showing you how the various concepts in statistics connect with each other
•Exercises in which you can practice your newly-learned skills
•Drills to help you become fluent in reading symbols and understanding them
•Tricks for remembering things
•Special help with the most difficult matters
•Examples of statistics at work in psychology, education, and the world at large

But please note: You will not be able to "cram" successfully by reading only
the summaries here. The summaries cannot substitute for the text itself. And
the exercises I offer you cannot substitute for those in the text either. In
fact, I have generally constructed exercises that differ markedly from the ones
in the text, to provide you with other kinds of opportunities for learning.

To get the most out of your course in statistics, attend class regularly,
study your text carefully, do the problems there—and then come to this workbook
and let me help you get really good at this stuff. I've tried hard to make this
book something special, and I'd be delighted to know that you used it.

TABLE OF CONTENTS

Author's Statement v
Are You Worried about the Mathematics in this Course? 1
Tips on Buying a Calculator 2
Chapter 1 Introduction 3
Chapter 2 Preliminary Concepts 9
Chapter 3 Frequency Distributions 19
Chapter 4 Graphic Representation 33
Chapter 5 Central Tendency 39
Chapter 6 Variability 47
Chapter 7 The Normal Curve 63
Chapter 8 Derived Scores 71
Chapter 9 Correlation 79
Chapter 10 Factors Influencing the Correlation Coefficient 87
Chapter 11 Regression and Prediction 91
Chapter 12 Interpretive Aspects of Correlation and Regression 97
Chapter 13 The Basis of Statistical Inference 103
Chapter 14 The Basis of Statistical Inference: Further Considerations 113
Chapter 15 Testing Hypotheses About Single Means: Normal Curve Model 119
Chapter 16 Further Considerations in Hypothesis Testing 129
Chapter 17 Testing Hypotheses About Two Means: Normal Curve Model 137
Chapter 18 Estimation of μ and μX − μY 155

Chapter 19 Inference about Means and the t Distribution 161


Chapter 20 Inference about Pearson Correlation Coefficients 169
Chapter 21 Some Aspects of Experimental Design 177
Chapter 22 Elementary Analysis of Variance 187
Chapter 23 Inference about Frequencies 207
Chapter 24 Some Order Statistics (Mostly) 219
Answers 227
Homework 246
ARE YOU WORRIED ABOUT THE MATHEMATICS IN THIS COURSE?

As of this writing, I have led over 600 students through an introductory
course in statistics. Almost all of them, I'm sure, were initially worried
about the mathematics facing them. One young woman even dreamed of being
attacked by numbers on the night before her first class with me. But 95% of
these students passed the course with a grade of C or better, in spite of my
high standards, and the person who had the nightmare earned a strong A. She
also acquired good self-confidence about mathematics. You can be a success

too.

Consider this:

•Your text emphasizes the logic of statistics, not the theorems, formulas, and proofs that mathematicians work with. The title of the
book is "Statistical Reasoning in Psychology and Education," and it's
reasoning, not mathematics, that's important here.

•The mathematics employed in the text is only simple algebra. You covered this in high school, and if you need to relearn it, you can
do so easily. Appendix A in the text will help.

Furthermore, look what you've got going for you:

•Your text is an exceptionally good book. As I noted above, it presents more than the bare facts. It also provides the big picture,
so you can see how the facts fit together, and the fine details, so
you can gain insight into the facts.

•This workbook offers summaries of the text, exercises to help you in various ways, and interesting examples of the things you're learning.

•Your instructor (and your teaching assistant, if you have one) will
go over the material in the book and answer any questions you have.

Moreover:

•This course itself provides a leisurely review of mathematics. The math in the course begins with counting (tallying up observations). It goes on to proportions and percentages, and more complex matters come up only later. So you can gradually relearn whatever you're uncertain of—and you'll be relearning it in a context that makes it vivid and useful.

The mathematics in this course is thus fully within your comprehension. If you've got some time to devote to the course, you can learn absolutely everything in your text and feel really good about it.

TIPS ON BUYING A CALCULATOR

A miniature calculator would be a good investment for this course, and you'll probably find other uses for it too.

You don't need anything fancy. You'll have no use for the special features of the "scientific" calculators that are meant to replace a slide rule—no use for the keys for pi (π), logarithms, exponentiation, or the trigonometric functions sine, cosine, tangent, and cotangent. (I told you this course doesn't require fancy mathematics.) You do want the following:

•an add-on memory, which permits you to add a number to another number
already in storage. The key for doing so is usually labeled M+. A machine
with add-on memory usually also has keys labeled M-, MR or RM, and MC or
CM, in which case it's said to have a four-function or four-key memory.

•automatic constant for multiplication (for which there's usually no special key), or a key for squaring a number, labeled x². Automatic constant works like this: To square a number (to multiply it by itself), you first enter the number and then hit the x key. Instead of entering the number again, though, you just press the = key. See if this works on any calculator you're trying: 3 x = should get the calculator to read 9. A key labeled x² is even better.
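If you're curious how that trick works inside the machine, here's a toy model written in the Python programming language (a modern aid, not something this course requires, and real calculator firmware certainly varies):

```python
# Toy model of a calculator with "automatic constant" for multiplication.
# After you enter a number and press the x key, pressing = with no second
# number multiplies the display by the remembered constant.
class Calculator:
    def __init__(self):
        self.display = 0
        self.constant = None  # the factor remembered for repeated =

    def enter(self, number):
        self.display = number

    def press_multiply(self):
        self.constant = self.display  # remember the displayed number

    def press_equals(self):
        if self.constant is not None:
            self.display = self.display * self.constant
        return self.display

calc = Calculator()
calc.enter(3)
calc.press_multiply()       # the 3 becomes the constant factor
print(calc.press_equals())  # 3 x = squares the number: prints 9
```

The point of the trick is simply that the machine remembers the first factor, so the = key can reuse it.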

You should also look for a machine with:

•positive-action keys, which click or change in amount of resistance to the touch when they work. Some machines give you no feedback on whether a given key has functioned, and you wind up constantly checking the display, which is a nuisance.

•keys that are big enough for you to press easily and accurately. Too much
miniaturization is a liability.

•a square-root key. It'll be labeled √ or √x.

•a display that you can read easily from a variety of angles.

Be a smart shopper:

•Compare models, guarantees, and prices, and try several stores.

•Ask about the stores' policies on defective merchandise. What will they do if your purchase malfunctions after you get it home? The store should agree that if it breaks within 30 days, they will replace it with a new machine from their own stock, rather than sending the old one to the factory for repair. Ask the salesperson to write "30-day exchange" on your receipt and sign it.

You should be able to get what you want for less than $20.

CHAPTER 1

INTRODUCTION

Here's a list of the sections that make up Chapter 1 in your textbook. If you'd like to keep track of your progress in this course, there's space to write in such things as "Okay," "Reread," or "Ask about this."

___ 1.1 Descriptive Statistics

___ 1.2 Inferential Statistics

___ 1.3 Relationship and Prediction

___ 1.4 Kinds of Statisticians

___ 1.5 For Whom Is This Book Intended?

___ 1.6 The Role of Applied Statistics

___ 1.7 More about the Road Ahead

___ 1.8 Dirty Words about Statistics

___ 1.9 Some Tips on Studying Statistics

Now here's a list of the problems and exercises at the end of Chapter 1.
Again, you can keep track of your progress by checking off those you did and
noting which ones you answered correctly, which ones you want to ask your
instructor or your teaching assistant about, and the like.

1 ___

2 ___

3 ___

4 ___

SUMMARY

To construct a helpful summary of the first chapter in your text, write in the appropriate term or phrase where a blank appears below, and cross out the incorrect wording where you're offered a choice. Following each blank and each choice is a number in brackets that tells you which section of the text has the answer. You might want to treat this as a test, though, and see how well you can do without looking in the book.

What Is (Are?) Statistics?

In ordinary speech, the term statistics refers to facts involving numbers, as in the expression "unemployment statistics." The word is plural in this sense ("The statistics are due to be released soon"). In Ch. 1, however, the text uses the term in a different sense, a sense in which the term is singular. In this sense, statistics is a field within the discipline of mathematics, a field comparable to algebra, geometry, and trigonometry.

As a field of mathematics, statistics consists of techniques for solving problems. The techniques fall into two categories: descriptive statistics and inferential statistics. The primary function of descriptive statistics is to provide meaningful and convenient techniques for ________ [For the answer, see the last sentence of Section 1.1 in the text]. As for inferential statistics, the object of these procedures is to draw an ________ about conditions which exist in a larger set of observations from study of ________ [1.2]. This branch of statistics is also known as ________ statistics [1.2].

Who Uses Statistical Techniques?

Those who work with statistics might be divided into four classes: (1) those who need to know statistics in order to appreciate ________, (2) those who must select and apply statistical treatment in the course of ________, (3) professional ________, and (4) ________ statisticians [1.4]. The main interest of those in the first two classes is in statistics itself / their own subject matter [Cross out the incorrect wording]. We might think of them as amateur / professional statisticians [1.4].

The professional (practicing) statistician acts as a ________ in the process of research by assisting those with ________ questions



(the first two kinds of persons) in finding statistical techniques and applying them to the evidence they gather [1.4].

These three kinds of persons are more interested in applications of statistical techniques than in the theory behind them. In contrast, the fourth kind of person, the mathematical statistician, is primarily interested in the theory. The text is concerned with applied / theoretical statistics, and it is directed to the prospective amateur / professional statistician [1.5].

(For a "map" of the information reviewed here, see p. 6 of this workbook.)

How Do Statistical Techniques Figure into Research?

It is important to distinguish between substantive matters and statistical matters. An investigation typically begins with a ________ question, which is a question of ________ [1.6]. The statistical work comes later and begins with a ________ question, the answer to which is expected to throw light on the ________ question [1.6]. A statistical question differs from a substantive question in that it concerns a statistical (that is, numerical) property of the data. Upon applying a statistical procedure, one arrives at a ________ conclusion, which like the corresponding question concerns a ________ (numerical) property of the data [1.6]. Finally, a ________ conclusion is drawn. This conclusion derives entirely / only partly from the statistical conclusion [1.6]. Thus statistical procedures are tools that enable a researcher to move from a substantive question to a substantive conclusion.

(For a map of this information, see p. 7 below.)

Is It True What They Say About Statistics?

Here are five accusations about the field of applied statistics. Each contains some truth, but not the whole truth. No one is trying to tell you what opinions to adopt, but the author of your text and I would like you to see both sides of each issue. So fill in a counterargument for each accusation. Those in the text appear in Section 1.8.

1. Statistics is a dry field. ________

2. Statistical techniques are depersonalizing, because they deal with groups of people and not with individuals. ________

3. Statistical techniques give misleading results. ________

4. Statistical techniques dictate the kind of research that is done. ("The statistical tail wags the substantive dog.") ________

5. Statistics is too mathematical a field for anyone but an expert to understand. ________

MAP of STATISTICS and THOSE INVOLVED WITH IT

Statistics (as a field of mathematics)
    is developed by → Mathematical Statisticians
    is applied to substantive questions by → Professional Statisticians,
        who serve as consultants to → Amateur Statisticians
    consists of → Descriptive Statistics and Inferential Statistics

Amateur Statisticians consist of:
    those who need to understand statistics in order to understand reports of research in their own fields
    those who use statistical techniques in their own research

MAP of the ROLE of STATISTICS in RESEARCH

Research
    begins with → Substantive Question
    often requires the use of → Statistical Techniques,
        of which we ask → Statistical Question
        and which yield → Statistical Conclusion
            (contributes to but does not fully determine the substantive conclusion)
    ends with → Substantive Conclusion

Statistics in Action

DATA-LOVING JAPANESE REJOICE on STATISTICS DAY

TOKYO, Oct. 27—This month, the 10th one of the year, the 72-year-old Takeo Fukuda, this nation's 13th post-war Prime Minister, leads the 113 million citizens of Japan in marking the fourth anniversary of a very special event in the official life of Japan—Statistics Day.

Japan, a 2,600-year-old nation that consists of 3,937 islands covering 145,267 square miles, was a relative latecomer in the official compilation of numbers used throughout the world today to portray national characteristics.

But there is likely no nation that [now] ranks higher in its collective
passion for statistics.

In Japan, statistics are the subject of a holiday, local and national conventions, awards ceremonies and nationwide statistical collection and graph-drawing contests.
"This year," said Yoshiharu Takahashi, a Government statistician, "we have
almost 30,000 entries. Actually, we had 29,836."

"In a modern society," noted Mr. Takahashi, "statistics have become a


necessity." In addition to the obvious statistical categories, the central
Government now compiles figures on such things as the success rate of the
artificial incubation of chicken eggs, the number of railroad cars produced,
the volume of mail from overseas, the size of children's monthly allowances,
the number of baseball gloves imported, and the frequency of tootbrush usage.

Four years ago, however, the Government began to notice a statistical decline in the cooperation rate of its citizens, many of whom were apparently unconvinced of the numbers' necessity. Thus National Statistics Day was established.
This year's national theme is "Statistics are the beacon for our happy life." Entries in the statistical graph contest were screened three times by judges, who gave first prize this year to the work of five 7-year-olds. Their graph creation, titled "Mom, play with us more often," was the result of a survey of 32 classmates on the frequency that mothers play with their offspring and the reasons given for not doing so (the most often heard excuse: "I'm just too busy").

Tomorrow, 2,500 Government employees involved with statistics will gather in the city of Fukui for Japan's main statistical rally. The highlight will be an address by Prof. Takashi Iga on "The kinds of statistics needed for the economy of the future."

But there is one figure that won't be included: Officials do not yet keep
statistics on the number of statistics they keep. "We don't know," says Mr.
Takahashi, "they are countless."

[Copyright 1977 by The New York Times Company. Reprinted by permission. The
actual date of Statistics Day is October 18.]

Note the sense in which the word statistics is used in this article (in
the sixth paragraph, for example). This is not the kind you're studying in
this course. Any 7-year-old can comprehend the kind the article talks about.
What you're studying is techniques for describing this kind and for drawing
proper inferences from them.
CHAPTER 2

PRELIMINARY CONCEPTS

Here's a list of the sections that make up Chapter 2 in the text. As with Chapter 1, if you'd like to keep track of your progress, there's space to write in things like "Okay," "Reread," or "Ask about this."

___ 2.1 Populations and Samples

___ 2.2 Random Samples

___ 2.3 The Several Meanings of "Statistics"

___ 2.4 Variables and Constants

___ 2.5 Discrete and Continuous Variables

___ 2.6 Accuracy in a Set of Observations

___ 2.7 Levels of Measurement

___ 2.8 Levels of Measurement and Problems of Statistical Treatment

Here's a list of the problems and exercises at the end of Chapter 2. To keep track of your progress, check off those you did and note such things as whether you answered a question correctly or need to ask for help with it.

1 ___   2 ___

3 ___   4 ___

5 ___   6 ___

7 ___   8 ___

9 ___   10 ___

11 ___   12 ___

13 ___   14 ___

SUMMARY

Populations, Samples, and Random Sampling

The term population is used in two senses in the field of statistics. In one sense, it refers to the people (or whatever) the investigator is studying, and the term designates the group about which the investigator wishes to ________ [2.1]. The term sample designates a ________ of a population [2.1]. In the second sense, the term population refers to the ________ set of ________ [2.1]. A sample again is a part of a population. A single observation or measurement (a sample of size one) is called an ________ [2.1]. Of the two definitions of population, the one that usually works better in statistics is the first / second [2.1].

If a sample is drawn in a certain way from its parent population, it is a random sample. For a sample to be random, it must have been selected in such a way that every ________ in the ________ had an equal opportunity of being included in the sample [2.2].

Random samples have two important properties. First, suppose we draw several random samples of the same size from a given population. It is highly unlikely that we would get exactly the same collection of elements in each sample. Thus the characteristics of the sample as a whole will change / stay the same from sample to sample [2.2]. This phenomenon is called sampling variation.

The second important property of random samples is the effect of the size of the sample on the amount of variability that occurs among different samples of the same size. The larger the samples, the more / the less the variation in the characteristics of the samples [2.2]. This fact is of great importance in statistical description / inference because it means that large samples will provide us with a more / less precise estimate of what is true about the population than we can expect from small samples [2.2].
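If you have access to a computer, both properties can be seen in a short simulation written in the Python language (my illustration, not material from the text; the population of scores here is made up):

```python
import random
import statistics

random.seed(1)  # fixed seed so the demonstration is repeatable

# A made-up population of 10,000 test scores.
population = [random.gauss(100, 15) for _ in range(10_000)]

def sample_means(sample_size, n_samples=200):
    """Draw many random samples of one size; return the mean of each one."""
    return [statistics.mean(random.sample(population, sample_size))
            for _ in range(n_samples)]

# Sampling variation: random samples of the same size differ from one another,
# so their means differ too.
small = sample_means(10)    # means of 200 samples of size 10
large = sample_means(400)   # means of 200 samples of size 400

# The spread among the sample means shrinks as the samples get larger.
print("spread of means for n=10: ", round(statistics.stdev(small), 2))
print("spread of means for n=400:", round(statistics.stdev(large), 2))
```

With the larger samples, the means cluster much more tightly around what is true of the population, which is exactly why inference from large samples is more precise.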

More Meanings for the Term Statistics

In Section 2.3 the text lists four meanings for the term statistics. In the first two senses the word is singular and refers to branches of mathematics.

Statistics in the first sense is ________ statistics, which is the science of organizing, describing, and analyzing bodies of quantitative (numerical) data; this is the kind of statistics that consists of descriptive and inferential techniques. Statistics in the second sense is statistical ________, the branch of mathematics, owing much to the theory of ________, that provides the theory behind the descriptive and the inferential techniques.

The third meaning of the word listed in Section 2.3 is the meaning that the word takes in common usage. The word has already been defined in this sense on p. 4 of this workbook as facts involving numbers. The text defines the word in this sense as a set of ________, such as averages. In this sense we may use the word in the singular and speak of a statistic, meaning a single numerical fact or a single index. The fourth meaning of the word statistics is a refinement of the third. In this sense the word is again plural, but it refers not just to any old numerical facts, but to indices that describe a sample. In this sense, then, a statistic is a characteristic of a sample. The comparable word for a characteristic of a population is ________.

Constants and Various Kinds of Variables

Suppose we have decided to study a certain group of people, such as the students at a particular college. The characteristic that defines this group—the college they attend—does not vary from person to person. So far as our study goes, it is not possible for this characteristic to have other than a single value, and it is referred to as a constant / variable [2.4].

Other characteristics, like the sex of a subject, can vary from one person to another within the group we are studying. A characteristic such as this which may take on different values is a constant / variable [2.4].

A variable may be either qualitative or quantitative. A qualitative variable consists of discrete categories that differ in quality, not in quantity. Such a variable is also called a ________ variable [2.7, Paragraph 2], and an example is a person's sex, which can be either male or female. The two categories, male and female, differ in kind but not in degree. To designate one person as male and another as female is to say that they differ, but the designations do not say that one is more of something than the other.

A quantitative variable, on the other hand, consists of values that do differ in quantity. An example is the number of siblings (the number of brothers or sisters) that a person has, which can vary from 0 up to 10 or more. Another example is a person's height. In each case, people who differ in where they stand on the variable possess different quantities of it.

These two examples, number of siblings and height, illustrate the two types of quantitative variable. Number of siblings can take only certain values, namely 0, 1, 2, 3, and so on. No person can have 0.6 or 1.2 siblings. Values of such variables are stepwise in nature, and such variables are said to be discrete / continuous [2.5]. In contrast, height can take any value within the range of possible heights. A person can be exactly five feet tall, or just a little more than five feet, or just a little more than that, and so on. Whereas a discrete variable has gaps in its scale, this kind of variable has none, and it is known as a ________ variable [2.5, Paragraph 2].

A qualitative variable is always a discrete one.

Accuracy in Measurement

Suppose we have a quantitative variable that is discrete, such as number of siblings. If we record the value of such a variable with no error, we have recorded an exact number. Numbers lacking this kind of accuracy are known as ________ numbers [2.6]. In working with a quantitative, discrete variable, such numbers arise if our method of collecting data has ________ the potential accuracy in the discrete values [2.6]. This happens when we estimate a value or round it.

Now suppose the variable we're working with is quantitative and continuous, such as height. Even though a variable is continuous in theory, the process of measurement always reduces it to a ________ one [2.5, Paragraph 4]. Thus any measurement of a continuous variable must be treated as an exact / approximate number [2.6, Paragraph 2].
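A small numerical sketch (hypothetical heights, and again my own modern illustration rather than anything in the text) shows how measuring reduces a continuous variable to a discrete one and why the recorded numbers are approximate:

```python
# Four hypothetical "true" heights in inches: continuous, infinitely fine.
true_heights = [64.231, 64.287, 64.243, 65.013]

# Any measuring procedure reports only to some nearest unit, here a tenth
# of an inch, so the recorded values fall on a discrete, stepwise scale.
recorded = [round(h, 1) for h in true_heights]
print(recorded)  # [64.2, 64.3, 64.2, 65.0] -- distinct true values collapse

# Each recorded number is therefore approximate: a recorded 64.2 stands for
# any true height between 64.15 and 64.25, not for one exact value.
```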

Levels of Measurement

Measurement is the process of assigning a number to a subject (or to whatever a researcher is studying) so as to indicate the value of some variable that characterizes the subject. There are three techniques of measurement: categorizing, ranking, and scoring. (This information is not in the text but should help clarify it.)

Categorizing is the technique of measurement used for a qualitative (categorical) variable (such as sex). In categorizing, a number is assigned to a subject to indicate the category into which the subject falls. (The number 1 might be used to indicate the category "male" and the number 2, the category "female.") The several categories of a qualitative variable ("male" and "female" for the variable sex) are said to constitute a(n) nominal / ordinal / interval / ratio scale [2.7]. All observations placed in the same category are considered to be ________ [2.7]. Numbers do not actually have to be used here, but if numbers are used to identify the categories of a given nominal scale, the numbers are simply a substitute for the ________ of the categories and serve only for purposes of ________ [2.7].

Ranking and scoring are techniques of measurement used for quantitative variables (such as competence as a workman). In ranking, a number from the series 1, 2, 3, and so on is assigned to a subject to indicate the subject's position relative to others in the magnitude of the variable of interest. The subject with the greatest magnitude (the subject with the greatest merit as a workman) is ranked 1; the subject with the second greatest magnitude is ranked 2; and so on. In this type of measurement, the numbers form a(n) nominal / ordinal / interval / ratio scale [2.7]. The basic relation expressed in a series of numbers used in this way is that of ________ [2.7]. (The 1, for example, says that the workman given this rank is greater in competence than the workman given the rank 2.) However, nothing is implied about the ________ of the difference between adjacent steps on the scale. (The difference in merit between the man ranked first and the man ranked second may be large or small, and this difference is not necessarily the same as that between the man ranked second and the one ranked third.) Further, nothing is implied about the ________ of whatever variable is being assessed [2.7]. (All workmen could be excellent, or all could be quite ordinary; the numbers do not indicate which.)

In scoring, a number is assigned to a subject to indicate where the subject stands on the variable of interest, without regard for where anyone else stands. The number is a true score that indicates how much of the variable the subject is thought to possess, and the difference between one score and another is meaningful. (Workmen might be scored for competence on a scale from -5 to +5, for example, with 0 indicating an average degree of competence.) If the possible scores form an interval scale, a given numerical interval along the scale (a difference of 1 point, say) is considered to represent the same difference / varying differences in the characteristic being measured irrespective of the ________ of that interval along the measurement scale [2.7, Paragraph 4]. (The difference of 1 point between the scores of -5 and -4 is considered to represent the same difference in competence as the difference of 1 point between the scores of, say, 0 and +1 or +4 and +5.)

When measurement is at this level, the level of the interval scale, one may talk meaningfully about the ________ between intervals [2.7, Paragraph 5]. (The interval of 2 points between +3 and +5 represents twice as big a difference in competence as the interval of 1 point between +3 and +4, for example, because 2 ÷ 1 = 2.) Nevertheless, it is not possible to speak meaningfully about a ratio between two ________ [2.7]. (It is not meaningful to assert that a workman rated at +3 has three times the competence of a workman rated at +1, even though 3 ÷ 1 = 3.) The reason is that the ________ point is arbitrarily determined, and does not imply an absence of the characteristic being assessed. (A rating of 0 does not indicate a total absence of competence in the example offered here.)
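A little arithmetic (set out here in Python, using the hypothetical -5 to +5 competence ratings from the example above) makes the contrast concrete: differences on an interval scale survive a change of zero point, but ratios do not.

```python
# Hypothetical competence ratings on the -5 to +5 interval scale above.
ratings = {"Smith": 3, "Jones": 1}

# Recode the same scale to run from 0 to 10 (just a shift of the zero point;
# no one's competence changes).
shifted = {name: r + 5 for name, r in ratings.items()}

# Differences survive the change of zero point...
diff_original = ratings["Smith"] - ratings["Jones"]   # 3 - 1 = 2
diff_shifted = shifted["Smith"] - shifted["Jones"]    # 8 - 6 = 2, unchanged

# ...but ratios do not, which is why "three times as competent" is
# meaningless unless zero marks a true absence of the characteristic.
ratio_original = ratings["Smith"] / ratings["Jones"]  # 3 / 1 = 3.0
ratio_shifted = shifted["Smith"] / shifted["Jones"]   # 8 / 6 = 1.33...

print(diff_original == diff_shifted)    # True
print(ratio_original == ratio_shifted)  # False
```

On a ratio scale, where zero is a true absence, no such recoding is legitimate, and that is precisely what makes the ratio meaningful there.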

If the numbers available for scoring form a ratio scale, they possess all

the desirable features of the interval scale, and in addition the ratio be¬

tween ffyftCc .,_ becomes meaningful [Table 2.1]. On a ratio scale, the

number zero indicates a true absence of the characteristic being assessed.

(Competence as a workman might be measured as the number of faults one can

find and repair in a widget rigged with a dozen malfunctions, for example.

A score of zero would then indicate an absence of competence of this kind,

and a man scoring 3 could be meaningfully said to be three times as competent

as a man scoring 1, since 3 ÷ 1 = 3.)


Preliminary Concepts 15

The Effect of Level of Measurement on Statistical Treatment

The level of measurement at which a variable is assessed limits the kind


of statistical treatment applicable to the observations on that variable. For
example, if a variable is measured at the nominal level, by simply categorizing
the subjects, it is not meaningful to find an average. (If some subjects are
categorized as male and each designated as a "1," while others are categorized
as female and each designated as a "2," it would be silly to find the aver¬
age of all the l's and 2's and say that the subjects' sex averaged out to be
1.4 or whatever.)

This point is obvious. A more subtle point is that numbers are often used
in psychology and education that look like they fall on an interval or even a
ratio scale, but we cannot be sure that they really do. Some authorities ad¬
vise the use of statistical techniques appropriate for ordinal scales in such
cases. But as Dr. Minium implies at the end of Section 2.8 in his text, the
weight of the evidence suggests that in most situations, it is okay to treat
the ambiguous numbers as though they came from an interval or even from a ratio
scale.
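To see the silliness in code: here is a tiny Python sketch. The 1-and-2 coding follows the example above; the 7-and-3 coding is an equally arbitrary alternative we made up. Notice that the "average" changes completely when the arbitrary codes change:

```python
# Nominal codes are arbitrary labels, so arithmetic on them is meaningless.
sexes = ["male", "male", "female", "male", "female"]

coding_a = {"male": 1, "female": 2}   # the coding used in the example above
coding_b = {"male": 7, "female": 3}   # an equally arbitrary alternative coding

mean_a = sum(coding_a[s] for s in sexes) / len(sexes)
mean_b = sum(coding_b[s] for s in sexes) / len(sexes)

print(mean_a)   # 1.4 -- the "average sex" under coding A
print(mean_b)   # 5.4 -- a different "average" under coding B, same people
```

The people have not changed, only the labels, yet the two "averages" disagree, which is exactly why averaging nominal data is meaningless.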
16 Chapter 2

MNEMONIC* TIP

A statistic (in the fourth sense of the word listed in Section 2.3) is a
characteristic of a sample, and a parameter is a characteristic of a popula¬
tion. You can easily remember the distinction by noting that statistic and
sample both begin with an S, while parameter and population both begin with
a P.
*Mnemonic ("nem-ON-ik"): pertaining to memory.

EXERCISES*

You are one of those bright young people who's making a lot of money these
days by conducting telephone polls for politicians. Governor Grassroots hires
you to tell her what proportion of the adults in her state approve of her work
as the state's chief executive. You get your staff busy making telephone calls
around the state, and they ask each adult they reach, "Do you think the present
governor is doing a good job?" You yourself make the first call, which is an¬
swered by John Q. Public, who says "Yes." In all, your firm completes 500 such
quickie interviews.

In this example, what is the population of interest to you? Define it in


the two senses of the word noted in Section 2.1 of the text.

1. _______________________________

2. _______________________________

What is the sample that you have drawn? Again define it in two senses.

3. _

4. _______________________________

How large is your sample? 5._ Mr. Public (or his answer to your question)

is technically known as what? 6. _______________________________


7. Is the

state in which your respondents live a constant or a variable? _


Now suppose your staff asks each adult not only the question about the
governor's performance but also his or her age, sex, and political affiliation,
if any. Sex is recorded as "male" or "female," age as the number of the latest
birthday, and political affiliation as "Democrat," "Independent," or "Republican."
Describe these variables by filling in the following table.

*The answers to these and all other exercises here appear at the back of
this book, beginning on p. 227.

Qualitative or Discrete or Measured on Nominal, Ordinal,


Quantitative? Continuous? Interval, or Ratio Scale?

Sex 8. 9. 10.

Age 11. 12. 13.

Affiliation 14. 15. 16.

17. Now we come to an ambiguity. Suppose the answers to the question about
the governor's performance are recorded as "Yes," "No," and "No Opinion."
Should the answers be treated as categories making up a nominal scale for the
measurement of a qualitative variable? Why or why not?

18. Here's another question for which the answer is not clear cut. Marketing
researchers often ask their respondents to rate a product by saying "Excellent,"
"Good," "Fair," "Poor," or "Bad," and they score these answers as 5, 4, 3, 2,
and 1, respectively. This scoring is a way of measuring favorability (or un-
favorability) of opinion toward the product. What level of measurement is this?
CHAPTER 3

FREQUENCY DISTRIBUTIONS
In case you're just starting now to use this workbook: the blank lines on
this page are explained on the title pages for Chapters 1 and 2.

_ 3.1 The Nature of a Score

3.2 A Question of Organizing Data

3.3 Grouped Scores

3.4 Characteristics of Class Intervals

3.5 Constructing a Grouped Data Frequency Distribution

3.6 Grouping Error

3.7 The Relative Frequency Distribution

3.8 The Cumulative Frequency Distribution

3.9 Centiles and Centile Ranks

3.10 Computation of Centiles from Grouped Data

3.11 Computation of Centile Rank

PROBLEMS and EXERCISES

1  2  3
4  5  6
7  8  9
10 11 12
13 14 15
16 17 18
19 20 21
22 23 24


SUMMARY

The Frequency Distribution

Ch. 3 presents the basic technique of descriptive statistics, which is the


frequency distribution. The chapter shows how to apply this technique to a
collection of scores on a quantitative variable. The collection may be either
a sample or a population.

To construct a frequency distribution (with no grouping of scores), locate

the _est and the _est score values [3.2]. Then list all possible

score values / only the scores that actually occurred, including these two

extremes, in ascending / descending order down the page [3.2]. Finally, add

a second column to the right of the first one. In this column, list for each

score value its frequency of occurrence (the number of times it occurred).

Frequency is abbreviated Freq in Tables 3.2 and 3.3 and thereafter symbolized

with the letter _, as in Table 3.4.

The term distribution is appropriate for the collection of scores when it


is cast into a table like this, because the table shows how the scores are
distributed over the range of possible values.
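If you'd like to check your work by computer, the construction just described can be sketched in a few lines of Python (the scores here are hypothetical):

```python
from collections import Counter

scores = [3, 5, 4, 5, 2, 5, 4, 3, 5, 4]  # hypothetical data

counts = Counter(scores)  # frequency of occurrence of each score value

# List all possible score values from the highest down to the lowest,
# with the frequency (f) of each beside it -- zero frequencies included.
for value in range(max(scores), min(scores) - 1, -1):
    print(value, counts.get(value, 0))
```

The printed pairs are exactly the two columns of an ungrouped frequency distribution: possible score values in descending order, each with its f.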

Grouping Scores

Sometimes it is helpful to group the scores before making a frequency dis¬

tribution. Grouping makes it easier to display the data and grasp the essen¬

tial facts they contain. In grouping, the various possible scores are collected

into a number of class __________ [3.3]. Here are some rules for making

these groupings:

1. A set of class intervals should be mutually exclusive. That is, inter¬

vals should be chosen so that _______________________________ [3.4].

2. It is / is not important that all intervals be of the same width [3.4].

3. The intervals should be continuous throughout the distribution.

4. The interval containing the highest score should be placed at the top /

bottom of the column listing the class intervals [3.4].

5. For most work, there should be not fewer than _____ class intervals and

not more than _____ [3.4].
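The grouping rules above can be sketched in Python; the scores, the interval width of 5, and the starting point are all hypothetical choices of ours:

```python
def grouped_frequencies(scores, width, start):
    """Group scores into class intervals of equal width (score limits),
    listed from the highest interval down to the lowest.
    `start` is the lower score limit of the bottommost interval."""
    top = max(scores)
    intervals = []
    low = start
    while low <= top:
        high = low + width - 1
        f = sum(1 for s in scores if low <= s <= high)  # mutually exclusive bins
        intervals.append(((low, high), f))
        low += width
    intervals.reverse()   # highest interval at the top of the column
    return intervals

scores = [3, 7, 8, 12, 13, 14, 17, 21, 22, 26]   # hypothetical data
for (low, high), f in grouped_frequencies(scores, width=5, start=3):
    print(f"{low:2d} - {high:2d}   {f}")
```

Because the intervals are continuous, of equal width, and mutually exclusive, every score lands in exactly one row, so the f values must sum to the number of scores.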

In a grouped frequency distribution (as in an ungrouped one), the total

number of cases in the distribution is found by summing the several values of



_, and is symbolized _ if the distribution is considered as a sample,

or by _ if it is a population [3.5]. All the examples here and in the text

assume that the collection of scores is a sample, so the symbol _ is used.

Disadvantages of Grouping

A grouped frequency distribution does not contain all the information that
the corresponding ungrouped one does, because within a given class interval,
one cannot tell exactly where the scores fell. Because this information is
not given, a problem called grouping error can arise. It is often necessary
to make calculations using the numbers in a grouped frequency distribution
(this very chapter discusses calculations of centile points and centile ranks,
for example), but in doing so, one cannot tell exactly where each score fell
along the scale of possible values, and so it is necessary to make an assump¬
tion about where the scores occurred. Sometimes it is assumed that the scores
within a given class interval are distributed evenly throughout the interval,
and sometimes it is assumed that they fell in any way such that the midpoint
of the interval is the average of the scores in that interval.

To the extent that such an assumption is false, the calculations based on

it will be in error. Other things being equal, the narrower the class inter¬

val width, the more / less the potentiality for grouping error [3.6].

A set of raw scores results / does not result in a unique set of grouped

scores [3.3]. That is, there is more than one way to construct a grouped fre¬

quency distribution for a given set of scores. This state of affairs is a

second liability for the grouped frequency distribution, because a researcher,

knowingly or unknowingly, may select a grouping that is misleading.

Relative Frequencies

It is often helpful to "cook" raw frequencies by transforming them into

relative frequencies, which are frequencies relative to the total number of

scores. There are two kinds of relative frequency, proportion and percentage.

A frequency expressed as a proportion of the total is called a proportional

frequency and is symbolized _ [Table 3.5]. To calculate a propor¬

tional frequency, divide the raw frequency by the total number of cases in

the distribution, or in symbols, calculate __ [3.7]. A frequency ex¬

pressed as a percentage of the total number of cases is called a percentage

frequency and is symbolized _ [Table 3.5]. To calculate a percentage

frequency, multiply the proportional frequency by _ [3.7].
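In code, the two conversions are one division and one multiplication; a sketch with hypothetical f values:

```python
freqs = [3, 3, 2, 1, 3]            # hypothetical raw f values for five intervals
n = sum(freqs)                     # total number of cases: 12

prop_f = [f / n for f in freqs]    # proportional frequency: f divided by n
pct_f = [100 * p for p in prop_f]  # percentage frequency: proportion times 100

print(pct_f[0])                    # 25.0 -- a raw f of 3 out of 12 cases
print(round(sum(pct_f)))           # 100 -- the percentages account for all cases
```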



The Meaning of a Score

Closely related to the frequency distribution is another kind of table


called a cumulative frequency distribution. To understand such a table, it
is necessary to understand a convention often adopted in descriptive statis¬
tics when the data on hand are scores on a quantitative variable. By con¬
vention, the variable is assumed to be continuous, even though the measure¬
ment of the variable yielded discrete data. A given score is then taken to
represent a range of values on the underlying continuum. A score of 18 items
correct on an algebra test, for example, is not treated as though it were a
highly precise measurement of exactly 18.0000.... Instead it is assumed to
represent a score somewhere between 17.5 and 18.5 on the underlying continuum.
The figures 17.5 and 18.5 are the limits of the score that is called 18, and
they are one-half of an integer below 18 and one-half of an integer above it.

In general, the limits of a score are considered to extend from one-half

of the smallest _ below the value of the score to

one-half _______________________________ [3.1].

When scores are grouped into class intervals, the limits of a class inter¬
val can be given as score limits or as exact limits. The score limits are
merely the lowest and the highest raw scores that fall into the class inter¬
val. The exact limits are the lower limit of the lowest score and the upper
limit of the highest score. (See Section 3.4, Principle 6.)

The difference between the upper exact limit and the lower exact limit is

the width of the class interval and is symbolized with the letter _____ [3.5].
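A minimal sketch of this convention in Python (whole-number scores assumed, so the unit of measurement is 1):

```python
def exact_limits(low_score, high_score, unit=1):
    """Exact limits of a score or class interval: one-half of the smallest
    unit of measurement below the lowest score and above the highest."""
    return (low_score - unit / 2, high_score + unit / 2)

print(exact_limits(18, 18))   # (17.5, 18.5) -- the single score "18"
low, high = exact_limits(13, 17)
print(high - low)             # 5.0 -- the width of the interval 13-17
```

The same function handles a single score (low and high the same) and a class interval (low and high its score limits), since a score is just a class interval one unit wide.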

The Cumulative Frequency Distribution

In a cumulative frequency distribution (such as Table 3.6), the important

numbers in the left-hand column are the upper exact limits of the class inter¬

vals. For each class interval, a cumulative frequency distribution shows how

many cases lie above / below the upper exact limit [3.8]. The number of

cases below a given upper exact limit is the cumulative frequency for that

limit, and it is symbolized __________ [3.8]. The cumulative frequencies are

entered starting at the top / bottom [3.8].

A cumulative frequency, like an uncumulated one, can be expressed relative

to the total number of cases, n. Again, the relative figure can be a propor¬

tion or a percentage. A cumulative frequency expressed as a proportion of the

total is called a proportional cumulative frequency and is symbolized _____

[Table 3.6]. To compute one, divide the raw cumulative frequency by _____ [3.8].

The proportional cumulative frequency for the upper exact limit of the topmost

interval will always equal 1.00 (as Table 3.6 shows). A cumulative fre¬

quency expressed as a percentage of the total number of cases is called a

cumulative percentage frequency and is symbolized __________ [Table 3.6]. It

is calculated by multiplying the corresponding proportional cumulative fre¬

quency by _____ [3.8]. The cumulative percentage frequency for the upper ex¬

act limit of the topmost class interval will always equal 100.
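The cumulative columns can be computed mechanically. Here is a Python sketch using hypothetical f values of 3, 3, 2, 1, and 3, listed with the highest interval first as in the tables of this chapter:

```python
freqs = [3, 3, 2, 1, 3]   # hypothetical f values, highest class interval first
n = sum(freqs)            # 12 cases in all

# Accumulate from the bottom interval up: the cum f at an interval's upper
# exact limit counts the scores in that interval plus all those below it.
cum_f = []
running = 0
for f in reversed(freqs):
    running += f
    cum_f.append(running)
cum_f.reverse()           # back to highest-interval-first order

prop_cum_f = [c / n for c in cum_f]
cum_pct_f = [100 * p for p in prop_cum_f]

print(cum_f)              # [12, 9, 6, 4, 3]
print(cum_pct_f[0])       # 100.0 -- the topmost cum %f is always 100
```

Because every score lies below the upper exact limit of the topmost interval, the topmost cum f is n, the topmost proportional cum f is 1.00, and the topmost cum %f is 100.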

[The summary continues after the following exercises.]

EXERCISES

Here's a table of the kind we've been reviewing. Never mind what the
scores mean; that's irrelevant for now. Just think about the internal logic
of the table. I have given you the top value in the cum f column and four
figures in the column of cum %f's. From these numbers you can determine all
the missing information. Just recall what you know about the topmost value
in the cum f column, and remember how you go about finding the cum %f's.

Give this problem a good, honest try. You'll feel really proud if you
figure it out for yourself. If you need help, though, it's available at the
back of the book.

Score Limits   Exact Limits   f   Prop.f   %f   cum f   Prop.cum f   cum %f

23 - 27        22.5 - 27.5   __   ____    __    12      ____         ____

18 - 22        17.5 - 22.5   __   ____    __    __      ____           75

13 - 17        12.5 - 17.5   __   ____    __    __      ____           50

 8 - 12         7.5 - 12.5   __   ____    __    __      ____           33

 3 - 7          2.5 -  7.5   __   ____    __    __      ____           25

If you think you can't figure the table out, look at the hint below and then
try again.

Hint: The topmost cum f value = n. If n = 12, then the cum f for the bottommost
interval is 25% of 12 = 3, and 3 is also the f value for the bottommost interval.

Here are two more problems of the same kind, to help you understand the
connections among the different numbers in tables like these.

Score Limits   Exact Limits   f   Prop. f   %f   cum f   Prop. cum f   cum %f

23 - 27        ____ - ____   __   ____     __    12      1.00          ____

18 - 22        ____ - ____   __   ____     __    __      ____           67

13 - 17        ____ - ____   __   ____     __    __      ____           50

 8 - 12        ____ - ____   __   ____     __    __      ____           33

 3 - 7         ____ - ____   __   ____     __    __      ____          ____
Score Limits   Exact Limits      f   Prop. f   %f   cum f   Prop. cum f   cum %f

496 - 505     495.5 - 505.5    __   ____     __    __      ____          ____

486 - 495     _____ - _____    __   ____     __    __       .67          ____

476 - 485     _____ - _____    __   ____     __    __      ____          ____

466 - 475     _____ - _____    __   ____     __    __       .33          ____

Here are some principles that describe the connections among the various
parts of tables like these. Note how they're illustrated in the tables above
(when the numbers are correctly filled in).

1. The sum of the f values = n (or N) = the top number in the cum f column.

2. The top number in the Prop. cum f column is 1.00, and the top number in
the cum %f column is 100.

3. All figures in the column of f values and cum f values must be whole
numbers (0, 1, 2, 3, and so on), because these figures are counts, counts
of the numbers of scores of various kinds.

4. The figures in the Prop. f and Prop, cum f columns, as proportions,


must be between 0 and 1.

5. The figures in the %f and cum %f columns, as percentages, must be


between 0 and 100.

6. A given figure in a cumulative column is the sum of two numbers: the


cumulative figure immediately below and the corresponding uncumulated value
in its row.
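These rules can be turned into a small checking routine; a Python sketch (the function and the example values, a filling-in consistent with n = 12, are ours):

```python
def check_table(f, cum_f, n):
    """Verify the consistency rules for a cumulative frequency table;
    both columns are listed with the highest interval first."""
    assert sum(f) == n == cum_f[0]                                  # rule 1
    assert all(isinstance(x, int) and x >= 0 for x in f + cum_f)    # rule 3
    for i in range(len(f) - 1):
        assert cum_f[i] == cum_f[i + 1] + f[i]                      # rule 6
    assert cum_f[-1] == f[-1]   # the bottom cum f is just that row's own f
    return True

# One consistent filling-in of a table like the exercises above (n = 12).
print(check_table(f=[3, 3, 2, 1, 3], cum_f=[12, 9, 6, 4, 3], n=12))  # True
```

If any rule is violated the routine raises an error instead of returning True, so it can be used to check a completed exercise table.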

MAP of the FREQUENCY and CUMULATIVE FREQUENCY DISTRIBUTIONS

A Collection of scores
is well described by-—» a Frequency Distribution
on a Quantitative Variable

lists all lists can be recast


possible scores into

Uncumulated Frequencies
with or without with which scores Cumulative
occurred Frequency

Grouping

can cause
collects scores
into

Grouping
Error
Class Intervals

should be
J Raw Numbers Relative Numbers

mutually exclusive

• continuous

in order from high down to low

•between 10 and 20 in number

Total Number of Cases

SUMMARY, Continued

Centile Points and Centile Ranks


The information in a table giving the cumulative percentage distribution
for a given set of scores is often presented in terms of what are called centile
points and centile ranks. (Sometimes the words percentile point and

percentile rank are used instead.) A centile point (sometimes called just a
centile) is a point along the scale of possible scores (which is assumed to
be a continuum), and it falls among the numbers shown on the left side of the
table. (It may be helpful to label the left-most column of Table 3.6 "Centile
Points.") A centile point is named by specifying the percentage of scores in
the distribution that fall below it. Thus the 96th centile point in a given
distribution is a certain point along the scale of scores, namely that point
that cuts off the bottom 96% of the distribution. The symbol for the 96th

centile point is _____ [3.9, Paragraph 2]. Because a centile point is a point

along the scale of scores, it may have any value that a score may have. The

96th centile point in Table 3.6 is _____ [3.9, Paragraph 2].

A centile rank, in contrast, is a percentage, a percentage of the total

number of cases in the distribution. Centile ranks fall among the cumulative

percentage frequencies shown on the right side of the cumulative distribution.

(It may be helpful to label the right-most column of Table 3.6 "Centile Ranks.")

As percentages, centile ranks may take values only between _____ and _____ [3.9].

The centile rank for the score 90.5 in Table 3.6 is 96.

[The summary concludes after the following exercises and examples.]

EXERCISES

To practice using the concepts of centile point arid centile rank, label
the left-most column of Table 3.6 "Centile Points" and the right-most column
"Centile Ranks," if you haven't done so already. Note again that the value
of 90.5 on the left goes with the cum %f of 96 on the right. In terms of
centile points and centile ranks, 90.5 has a centile rank of 96, and the 96th
centile point is 90.5. Now answer these questions about Table 3.6.

1. The centile rank of 93.5 is _____.

2. The centile rank of 69.5 is _____.

3. The 84th centile point is _____. 4. The 14th centile point is _____.

5. C28 is _____. 6. C52 is _____.

7. C50 falls between _____ and _____.

8. C99 falls between _____ and _____. 9. C90 falls between _____ and _____.

10. The centile rank of 77.0 falls between _____ and _____.

11. The centile rank of 82.0 falls between _____ and _____.

12. The centile rank of 92.0 falls between _____ and _____.

Statistics in Action

CENTILE POINTS and CENTILE RANKS of SPECIAL INTEREST

In the early 1960s, researchers working for the federal government ap¬
proached about 400 young men and 500 young women, asked them to take off their
shoes and step on a scale, and measured their height and weight. Let's focus
on just the heights for the men. The 400-odd heights were cast into a cumu¬
lative percentage frequency distribution like Table 3.6, and certain centile
points were found, namely the 1st, 5th, 10th, 20th, 30th, and so on up to the
90th, 95th, and 99th. Instead of publishing a table of the kind you're now
familiar with, though, the National Center for Health Statistics reported just
these centiles. They're shown on the left below.

SELECTED CENTILE POINTS from the DISTRIBUTIONS of


HEIGHTS and WEIGHTS of AMERICAN MEN and WOMEN
AGED 18-24 YEARS

Men's Height Women's Height Men's Weight* Women's Weight*


(Inches) (Inches) (Pounds) (Pounds)
Centile Centile Centile Centile Centile Centile Centile Centile
Point Rank Point Rank Point Rank Point Rank

74.8 99 69.3 99 231 99 218 99

73.1 95 67.9 95 214 95 170 95

72.4 90 66.8 90 193 90 157 90

70.9 80 65.9 80 180 80 145 80

70.1 70 65.0 70 171 70 137 70

69.3 60 64.5 60 164 60 131 60

68.6 50 63.9 50 157 50 126 50

67.9 40 63.0 40 151 40 122 40

67.1 30 62.3 30 145 30 117 30

66.5 20 61.6 20 140 20 111 20

65.4 10 60.7 10 131 10 104 10

64.3 5 60.0 5 124 5 99 5

62.6 1 58.4 1 115 1 91 1

*Weight includes some clothing. Nude weight about 2 lb. less.

To practice using the concepts of interest now, ask yourself what the
50th centile point (C50) for the men's heights is. A centile point is a
point along the scale of scores, remember, so here it will be a height.
The 50th centile point is that score that has 50% of the scores below it.
The table tells you that it's 68.6", so we now know that half the men in
this sample were under 68.6" in height.

A man who stands six feet tall (72.0") has what centile rank in compar¬
ison to these other fellows? Asking for a centile rank is asking for a
percentage, so the answer must be between 0 and 100. The table shows that
the centile rank is somewhat less than 90. For a height of 72.4, which is
the closest we can get to 72.0 using the entries in the table, the centile
rank is exactly 90, and a shorter height would have a smaller centile rank,
of course. So a six-footer is taller than almost 90% of the men in the
sample.
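When the rank you want falls between two tabled centiles, straight-line interpolation between the neighboring entries gives a rough estimate. A Python sketch using the men's-height column of the table above (the function is our own, not part of the published report):

```python
# Selected centile points for men's heights, copied from the table above.
mens_height = [(62.6, 1), (64.3, 5), (65.4, 10), (66.5, 20), (67.1, 30),
               (67.9, 40), (68.6, 50), (69.3, 60), (70.1, 70), (70.9, 80),
               (72.4, 90), (73.1, 95), (74.8, 99)]

def approx_rank(height, table):
    """Rough centile rank by straight-line interpolation between tabled points."""
    for (h1, r1), (h2, r2) in zip(table, table[1:]):
        if h1 <= height <= h2:
            return r1 + (height - h1) / (h2 - h1) * (r2 - r1)
    return None  # outside the tabled range

print(round(approx_rank(72.0, mens_height), 1))  # 87.3 -- "somewhat less than 90"
```

The estimate assumes heights are spread evenly between neighboring tabled points, the same kind of assumption the chapter makes for scores within a class interval.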

Now do these problems on your own:

*****

_____ 1. How tall do you have to be, at minimum, to exceed the height
of the shortest 20% of the men in this sample?

_____ 2. Is the answer to Question 1 a centile point or a centile rank?

_____ 3. Suppose you're six-one. What percentage of this sample was
taller than you?

_____ 4. Is the answer to Question 3 a centile point or a centile rank?

_____ 5. C60 = ?

_____ 6. Is C60 a centile point or a centile rank?

_____ 7. The middle 20% of the scores in any distribution run from C40
to C60. Half of the 20%, namely 10%, lie between C40 and C50,
which is the score in the very middle. The other half of the
20% lie between C50 and C60. In this distribution, the middle
20% of the scores lie between which two heights?

_____ 8. The middle 90% of the scores lie between which two values?

_____ 9. How many men in this sample were between 65.4" and 66.5" tall?

_____ 10. How many men were between 70.1" and 73.1" tall?

Enough on male heights. The National Center for Health Statistics treated
the heights of the women, the weights of the men, and the weights of the women
in the way they treated the data you just examined in detail. The other
three distributions are tabled above, and the set of four will enable you to
compare your height and weight with the measurements for people of your own
sex (and, for most students, for people of their own age). The samples were
large (for men, n was about 411; for women, n was about 534—the researchers
did not supply the exact figures), and they were drawn to be representative
of the noninstitutionalized population of 18- to 24-year-olds in the lower
48 states. Thus you can be quite confident that the centile ranks shown in
these tables are close to the figures for the full populations.


Students often ask at this point about the correlation between height and
weight. Common observation tells us that taller people tend to be heavier,
so there is in fact a correlation between height and weight for human beings.
These four tables do not, however, provide any evidence on the correlation
for either sex. So far as we can tell from the tables, the 50% of the men
who are shorter than 68.6" could be the same 50% who are heavier than 157 lb.
Ch. 9 will show you what kind of table provides evidence for the existence of
a correlation between one variable and another.

[The tables above were derived from Weight, Height, and Selected Body
Dimensions of Adults, National Center for Health Statistics Series 11, No.
8 (Washington: U.S. Government Printing Office, 1965).]

SUMMARY, Concluded

Look at Table 3.6 again. You should have labeled the left-most column

"Centile Points" and the right-most column "Centile Ranks." Sometimes we

find ourselves wondering about a centile point whose centile rank is not one

of the cumulative percentage frequencies shown on the right side of such a

table. (In Table 3.6, for example, we might wonder where the 50th centile

point falls.) Such a centile point will not be one of the upper exact limits

shown on the left side of the table, and the table does not directly give

the value of such a centile point. It can be estimated, though, using a

procedure called linear ____________________ [3.10, footnote]. This pro¬

cedure rests on the assumption that the scores in a given class interval are

____________________ throughout the interval [3.10, Paragraph 2]. (The next

section of the workbook offers help in understanding how to compute a centile

point in this way.)

Similarly, we sometimes find ourselves wondering about the centile rank

of a score that is not one of the upper exact limits shown on the left side

of a cumulative percentage frequency distribution like Table 3.6. The cen¬

tile rank for such a score will not be found on the right in the column of

cumulative percentage frequencies. (In Table 3.6, for example, we might

wonder what the centile rank for a score of 75 is.) It can also be estimated,

though, again using the procedure of linear interpolation. The assumption

about how the scores are divided within a given class interval is the same

as / different from the assumption made for computation of a centile point [3.11].
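A sketch of that estimate as a short Python function; the grouped distribution is hypothetical, and the function builds in the stated assumption that scores are spread evenly within each interval:

```python
def centile_rank(score, intervals):
    """Estimate the centile rank of `score` by linear interpolation.
    `intervals` holds (lower_exact_limit, upper_exact_limit, f) triples,
    bottom interval first; scores are assumed to be distributed evenly
    within each interval."""
    n = sum(f for _, _, f in intervals)
    cum = 0
    for low, high, f in intervals:
        if score <= high:
            fraction = (score - low) / (high - low)  # how far up into the interval
            return 100 * (cum + fraction * f) / n
        cum += f
    return 100.0

data = [(2.5, 7.5, 3), (7.5, 12.5, 1), (12.5, 17.5, 2),
        (17.5, 22.5, 3), (22.5, 27.5, 3)]
print(round(centile_rank(10.0, data), 1))  # 29.2
```

The score 10.0 sits halfway up the 7.5-12.5 interval, so it is credited with the 3 scores below that interval plus half of the interval's single score: 3.5 of 12 cases, a rank of about 29.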



SPECIAL HELP with the COMPUTATION of CENTILE POINTS

Here's a detailed version of the procedure outlined on p. 38 of the text.


The six steps listed here correspond to those in the text. The symbol C_X is
used to designate the centile point with a centile rank of X.

1. Find the class interval in which the desired centile point falls. To
do so, determine the number of cases that constitute X% of the whole. This
number will be the cum f for C_X, and it is given by the formula

    cum f for C_X = (X/100)(total number of cases)

The desired class interval is the one such that

    cum f for lower exact limit < cum f for C_X < cum f for upper exact limit

For example, to find C_50 for the data in Table 3.7 of the text, find cum f
for C_50, which is (50/100)(80) = 40. The desired class interval, we can now
tell, has the score limits 73 and 75 and the exact limits 72.5 and 75.5 (the
exact limits are not shown in the table), because

    cum f for 72.5 (namely 32) < cum f for C_50 (namely 40) < cum f for 75.5 (namely 44)

2. Note again what the cum f for the lower exact limit is. In the example,
it's 32.

3. Determine the number of additional scores that together with this cum f
will equal the cum f for C_X. The formula is simple:

    # of additional scores = cum f for C_X - cum f for lower exact limit

In the example, the number of additional scores needed to equal 40 is 40 - 32 = 8.

4. Note the f (not the cum f) for the interval and assume that this number
of scores is distributed evenly throughout the interval. In the example, f =
12, and we thus assume that the bottom score in the interval is in the bottom
twelfth of the interval, the next-to-the-bottom score is in the next-to-the-
bottom twelfth, and so on.

5. Find the distance up into the interval that (on the stated assumption)
is occupied by the additional scores needed to equal the cum f for C_X.

    a. This distance will be a certain fraction of the width of the interval,
and the fraction is given by the formula

    fraction of interval = (# of additional scores needed to equal the cum f for C_X) / (f for the interval)
Frequency Distributions 31

In the example, the fraction is 8/12; the 8 is from Step 3 and the 12 from
Step 4. On the assumption that the 12 scores are distributed evenly through¬
out the interval, the bottom 8 are in the bottom 8/12 of the interval.

b. To find the desired distance into the interval, the distance occupied
by those additional scores needed to equal the cum f for C_X, multiply the frac¬
tion of the interval just computed by the width of the interval. The width is
simply:

width of interval = upper exact limit - lower exact limit

In the example, the width is 75.5 - 72.5 = 3.0 units. Multiplying the fraction
by the width, we have 8/12 times 3.0 units = 2.0 units, and we have thus deter¬
mined that on our assumption, the bottom 8/12 of the interval is the bottom 2
units.

    c. The general formula for the desired distance is:

    desired distance up into interval = (# of additional scores needed to equal cum f for C_X / f for the interval) × (width of interval)

6. Add the distance found in the preceding step to the lower exact limit
of the interval in which you are working. This addition determines the point
along the scale of scores that cuts off a) those scores lying below the lower
exact limit plus b) the additional scores within the interval needed to equal
the cum f for C_X. This point is C_X, the point along the scale of scores below
which X% of the cases fall. In the example, we have 72.5 + 2.0 = 74.5 = the
50th centile point.

Check to be sure your answer is within the interval that you located in
Step 1.

In general,

    C_X = lower exact limit of interval + a certain distance up into the interval (as found in Step 5c)

or

    C_X = lower exact limit of interval + a certain fraction times the width of the interval

or

    C_X = lower exact limit of interval + (# of additional scores needed to equal cum f for C_X / f for the interval) × (width of interval)

In the last formula, the # of additional scores is given by the equation in
Step 3, and the width is given by the equation in Step 5b.
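The six steps can be collected into one short Python function. Only the key figures of the worked example (the interval 72.5-75.5 with f = 12, the cum f of 32 below it, and n = 80) come from the text; the surrounding intervals are hypothetical filler chosen to reproduce those figures:

```python
def centile_point(x, intervals):
    """C_X by linear interpolation.  `intervals` holds
    (lower_exact_limit, upper_exact_limit, f) triples, bottom interval first."""
    n = sum(f for _, _, f in intervals)
    target = (x / 100) * n               # Step 1: the cum f for C_X
    cum = 0
    for low, high, f in intervals:
        if f > 0 and cum + f >= target:  # C_X falls in this interval
            extra = target - cum         # Step 3: additional scores needed
            width = high - low           # Step 5b: width of the interval
            return low + extra * width / f   # Steps 5-6: fraction of width
        cum += f
    return intervals[-1][1]

# Only 72.5-75.5 (f = 12), the cum f of 32 below it, and n = 80 come from
# the worked example; the other intervals are hypothetical filler.
data = [(66.5, 69.5, 14), (69.5, 72.5, 18), (72.5, 75.5, 12), (75.5, 78.5, 36)]
print(centile_point(50, data))  # 74.5 -- the 50th centile point of the example
```

The target cum f is 40, the loop stops at the interval whose cumulative count first reaches 40, and 8 of that interval's 12 scores are needed, so the answer lies 8/12 of the 3-unit width above 72.5, just as in the worked example.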

The examples offered in Section 3.10 and the problems and exercises for Ch.
3 provide material for practicing the procedure spelled out above. Remember
that this workbook does not attempt to replace those problems and exercises.
CHAPTER 4

GRAPHIC REPRESENTATION

The purpose of the blank lines below is explained on the title pages for
Chapters 1 and 2 of this workbook.

4.1 Introduction

4.2 The Histogram

4.3 The Bar Diagram

4.4 The Frequency Polygon

4.5 The Cumulative Percentage Curve

4.6 Graphic Solution for Centiles and Centile Ranks

4.7 Comparison of Different Distributions

4.8 Histogram versus Frequency Polygon

4.9 The Underlying Distribution and Sampling Variation

4.10 The Mythical Graph

4.11 Possible Shapes of Frequency Distributions

PROBLEMS and EXERCISES

1  2  3
4  5  6
7  8  9
10 11 12
13 14 15
16 17


SUMMARY

The Histogram

Look at Table 4.1 in the text. This table presents a grouped _

distribution for a collection of scores on a certain kind of intelligence test

[4.2]. All the information in this table can also be presented in a graph

called a _, which is shown in Figure 4.1.

A graph like this has two axes. The horizontal one is also called the

abscissa / ordinate or the X/Y axis, and the vertical one is called the

abscissa / ordinate or the X/Y axis [4.2]. It is customary to represent

_ along the horizontal axis, and

_ along the vertical axis [4.2].

The histogram consists of a series of rectangles, one for each class


interval. The width of a rectangle indicates the width of the corresponding
class interval. The left edge of the rectangle rises from the point along the
horizontal axis that represents the lower exact limit of the class interval,
and the right edge of the rectangle rises from the point that represents the
upper exact limit. The height of a rectangle in Figure 4.1 indicates the raw
frequency of the scores in the corresponding class interval.

It is possible for the height of the rectangles in a histogram to indicate


not raw frequency but relative frequency of scores; it is necessary only to
relabel the vertical axis. For the sample of scores shown in Table 4.1 and
Figure 4.1, n = 100. Thus the raw frequency of 20 for the interval from 99.5
to 109.5 is a relative frequency of 20/100, which is .20 or 20%. In order to
show percentage frequency, the vertical axis of Figure 4.1 would only have to
say 20% where it now says 20, and similarly for the other numbers on the axis.
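In code, the relabeling is just a division by n; a sketch (the seven f values are hypothetical, except that they sum to the n of 100 and include the f of 20 discussed above):

```python
raw_heights = [5, 12, 20, 25, 20, 12, 6]   # hypothetical f values; n = 100
n = sum(raw_heights)

# Same bars, but heights read as percentage frequencies instead of raw f's.
pct_heights = [100 * f / n for f in raw_heights]
print(pct_heights[2])   # 20.0 -- a raw frequency of 20 among 100 cases is 20%
```

Because every bar is divided by the same n, the shape of the histogram is unchanged; only the labels on the vertical axis differ.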

The Bar Diagram

The bar diagram is used to graph categorical / quantitative data, and it

is similar to the _, except that space is inserted between the

_ [4.3].

The Frequency Polygon

Data that can be graphed as a histogram can also be represented by a

frequency polygon. In this type of graph, a point is plotted above the _

_ of each class interval at a height representing the _

of scores in that interval [4.4]. These points are then connected with

straight / curved lines [4.4]. (The histogram in Figure 4.1 can be easily
turned into a frequency polygon by making a dot in the middle of the top of
each rectangle and then connecting the dots.)
Graphic Representation 35

If nothing further is done, the zig-zaggy line will not touch the horizontal

axis at either end, but it is conventional practice to bring it down to the axis

on both sides. To do so, identify the two ___________ falling immediately

outside those end class intervals containing scores. The _________

of these intervals are plotted at __ frequency, and these two points are

connected to the graph [4.4].

As with the histogram, the vertical axis of a frequency polygon can show

either raw frequency or frequency, depending on how it is labeled

[4.4] .

Choosing Between a Histogram and a Frequency Polygon

The information in any frequency distribution for a quantitative variable

can be graphed as either a histogram or a frequency polygon. Which is better?

In general, the _ is better [4.8, Paragraph 1]. It is

especially valuable when two or more distributions are to be compared. The

general public, however, seems to find it a little easier to understand the

[4.8, Paragraph 2]. And the area in the bars of a _

is directly representative of relative _______ [4.8]. One hundred percent

of the area in the bars represents 100 percent of the scores in the distribution.

The percentage of the total area that falls in any given rectangle

represents the same percentage of the scores. In Figure 4.1, for example, the

bar over the interval from 99.5 to 109.5 has 20% of the total area in all the

bars, and it thus indicates that 20% of the scores fall in this interval. This

relationship between percentage of area and percentage of scores is only

approximately / exactly true in the frequency polygon [4.8].

Shapes of Frequency Distributions

Every collection of scores on a quantitative variable has a fundamental

characteristic called its shape. The shape is determined most easily by look¬

ing at the literal shape of the histogram or frequency polygon graphing the

distribution of scores. Some commonly occurring shapes appear in Figure 4.11.

The bimodal, normal, and rectangular distributions are all symmetrical / asymmetrical,

while the J-shaped and skewed distributions are symmetrical / asymmetrical [4.11].

The Graph of a Distribution?

There is no such thing as the graph of a given set of data. The same set

of raw scores may be grouped in different ways / only one way [4.10]. And a

graph can be squat or slender depending on the relative scale of the two axes.

Sampling Variation

Frequency distributions resulting from a very large number of scores often

exhibit a pronounced regularity / irregularity of shape [4.9]. But when a

sample is drawn from such a population, the shape of the sample is likely to

be more irregular. In general, the fewer the cases, the greater / less the

irregularity of the shape of the sample [4.9]. This is an illustration of the

principle that a larger sample will usually resemble the population more closely

than a smaller sample will.
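You can see this principle for yourself with a small simulation. The sketch below (Python; the population, sample sizes, and bins are arbitrary choices, not from the text) tallies crude histogram counts for a small and a large sample drawn from a normal population; the small sample's counts are typically much bumpier.

```python
import random

random.seed(1)  # fixed seed so the demonstration is repeatable

def histogram_counts(scores, bins):
    """Count how many scores fall in each [low, high) class interval."""
    return [sum(low <= s < high for s in scores) for (low, high) in bins]

bins = [(b, b + 1) for b in range(-3, 3)]            # six unit-wide intervals
small = [random.gauss(0, 1) for _ in range(20)]      # a small sample
large = [random.gauss(0, 1) for _ in range(2000)]    # a large sample

small_counts = histogram_counts(small, bins)
large_counts = histogram_counts(large, bins)
```

The large sample's counts rise smoothly toward the middle intervals and fall away at the extremes, as the normal population does; the small sample's shape is far less regular.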

The Cumulative Percentage Curve

Look at Table 4.2 on p. 49. The right-hand column shows the

percentage frequency distribution for a collection of 80 scores [4.5]. Such a

distribution can also be presented in a graph like Figure 4.4, which is called

a cumulative percentage frequency _, or __ [Figure 4.4, caption].

As with the histogram and the frequency polygon, the horizontal axis shows the

various possible scores, but now the vertical axis shows cumulated frequencies

(raw or relative) rather than uncumulated ones.

In a table like 4.2, remember that a given cumulative percentage (say 40.0

on the cum %f scale) corresponds to the upper exact limit of the class interval

across from it (in this case, to the upper exact limit of the interval 70 - 72,

which is 72.5). Therefore in constructing cumulative percentage curves, a

given cumulative frequency is plotted at the midpoint / upper exact limit of

the corresponding class interval [4.5].

A cumulative percentage curve can be used to find the centile rank for a

given score, or the centile point for a given centile rank. Such graphic determination

of centile points or centile ranks will / will not yield the same

result as that given by the computational procedures outlined in the previous

chapter [4.6]. Connecting the points on the cumulative curve with straight

lines is the graphic equivalent of ______________________ [4.6].
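Reading a value off the straight-line segments of a cumulative curve amounts to linear interpolation between the plotted points. A sketch (Python; the upper exact limits and cumulative percentages below are hypothetical, not those of Table 4.2):

```python
def centile_rank(score, upper_limits, cum_pcts):
    """Linearly interpolate the centile rank of `score` between the two
    surrounding (upper exact limit, cumulative %) points on the curve."""
    points = list(zip(upper_limits, cum_pcts))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= score <= x1:
            return y0 + (score - x0) / (x1 - x0) * (y1 - y0)
    raise ValueError("score lies outside the plotted curve")

limits = [69.5, 72.5, 75.5]   # upper exact limits of successive intervals
cum = [25.0, 40.0, 70.0]      # cumulative percentages plotted at those limits
```

A score of 72.5 (an upper exact limit) reads off directly as a centile rank of 40; a score of 71.0, halfway through its interval, interpolates to 32.5.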



EXERCISES

To develop your understanding of the concept of the shape of a distribution, try guessing the shape of each distribution described below. Name the shape, choosing from the terms introduced in Section 4.11 and used in the figure of the same number. If you guess that a distribution is skewed or J-shaped, specify whether the long "tail" is on the left or the right.

1. Consider the millions of Americans who earned money


in the form of wages or a salary last year. What is the shape of the distribution of their incomes?

2. Suppose that 523 college seniors take a 50-item arithmetic test intended for sixth graders. What shape will the distribution of the seniors' scores have?

3. In beautiful New Jersey, a person must be at least


17 years old to possess a driver's license. Consider the population of New
Jerseyites who currently hold a license; some are just 17, but most are older.
If we determine for each such person the age at which she or he first acquired
a license, what will be the shape of the distribution of ages?

If you're having trouble with these, try this strategy: Visualize the
table listing the frequency distribution. In its simplest form, it will look
like Table 3.2 or 3.3 on p. 29 of the text, with just two columns. Think what
the numbers in the left-hand column would be. Then guess the pattern in the
frequencies in the right-hand column. Where, for example, would the large
frequency counts fall—at the top, in the middle, or at the bottom of the column? Where would the small counts fall? Finally, visualize the translation
of the frequency distribution into a histogram or a frequency polygon, and
choose the appropriate word for describing its shape.

4. Suppose we assemble all the college professors in


North America who have one and only one child in grade school, and each professor brings his or her child. We time each of them, parents and children alike, while they read the article on Japan's Statistics Day reprinted earlier in this workbook, and for each we determine the reading rate in words per minute. What shape will the distribution of reading rates take?

5. What is the shape of the distribution of the weights


of all students enrolled in statistics courses around the world this semester?

6. What is the shape of the distribution of these students' heights?
7. Suppose that 523 sixth-graders take a 50-item spelling
test intended for college seniors. What shape will the distribution of the
sixth-graders' scores have?

8. Two hundred basketball players are assembled, 100


chosen at random from the population of professionals and 100 chosen at random
from the population of high-school teams. What shape will the distribution of

200 heights take?



Statistics in Action

GAS SHORTAGE ISN'T AFFECTING SPEED

Madison, Wis., Aug. 1, 1973--Gasoline shortages notwithstanding, motorists


are still traveling about as fast as ever, according to the 1973 biennial traffic speed study conducted by the State Department of Transportation.

The study found that average speed records for all types of vehicles on
two-lane rural roads, both night and day, have actually increased over the last
study in 1971. The only indication that people may be heeding pleas to slow
down and conserve gasoline are slightly slower speeds for passenger cars on
interstate highways.

The biennial speed studies were conducted at 25 points on the state trunk
system in May, 1973, to establish average speeds for various types of vehicles,
and to determine trends as well.

Average speeds are somewhat misleading. A more accurate reflection of how


fast traffic moves is the 85th percentile speed, which has long been accepted
by traffic engineers as representative of "reasonable" speeds. The 85th percentile speed is the speed below which 85 percent of vehicles move.

Average and 85th percentile speeds are as follows, with comparative 1971
figures in parentheses:

Interstate, daytime (speed limit 70): Wisconsin passenger cars, 85th percentile speed 74.1 (74.3), average speed 67.9 (68.4). Out-of-state passenger
cars, 85th percentile speed 74.9 (75.0), average speed 69.3 (69.6).

Interstate, nighttime (speed limit 60): All passenger cars, 85th percentile
speed 68.7 (70.7), average speed 62.8 (64.2).

Rural, non-interstate, daytime (speed limit 65): Wisconsin passenger cars,


85th percentile speed 68.1 (66.5), average speed 59.7 (58.5). Out-of-state
passenger cars, 85th percentile speed 69.5 (68.7), average speed 61.1 (61.0).

Rural, non-interstate, nighttime (speed limit 55): All passenger cars,


85th percentile speed 62.3 (61.5), average speed 55.1 (54.7).

Traffic speeds have been steadily increasing since studies began in 1938-1939, except for the 1942-45 war years, when there was a uniform 35 mph limit.

Average speed in 1938 was 49.6 with the 85th percentile speed at 61.5.
There was no speed limit.

The lowest speeds recorded were in 1942, when average speed was 37.1 mph
and the 85th percentile speed was 42.9 mph. In 1950 the average speed was
50.9 mph and the 85th percentile speed was 59.9.

[Excerpted by permission from The Capital Times of August 7, 1973, p. 20.]


The average speed and the 85th percentile speed are used here to summarize
the location of an entire distribution. The next chapter discusses this matter
in detail and notes that there are several kinds of average. (The article does
not state which one the traffic engineers used.)
CHAPTER 5

CENTRAL TENDENCY

5.1 Introduction

5.2 Some Statistical Symbolism

5.3 The Mode

5.4 The Median

5.5 The Arithmetic Mean

5.6 Effect of Score Transformations

5.7 Properties of the Mean

5.8 Properties of the Median

5.9 Properties of the Mode

5.10 Symmetry and Otherwise

5.11 The Mean of Combined Subgroups

5.12 Properties of the Measures of


Central Tendency: Summary

PROBLEMS and EXERCISES

1 2 3

4 5 6

7 8 9

10 11 12

13 14 15

16 17 18

19 20 21

22 23

SUMMARY

Look at the distribution of scores shown in Table 5.1 on p. 66 of the text.


You can see at a glance that the scores range from the 40s up to the 90s. The
column of frequency counts indicates, however, that most of the scores fall not
at either of the extremes, but in the middle of the distribution, in the 60s
and 70s. This is a typical pattern for a collection of scores on a quantitative
variable, whether the scores are a sample or a population: the scores tend to
cluster around some central location. A measure of central tendency is a number that points to this central location, telling you about how large the scores
in general run.

There are three measures of central tendency in common use: the _,

the _______, and the arithmetic _______ [5.1]. Any one of these can properly

be called the average of the scores; the term average is thus vague.

The Mode

If the scores in the collection have been left ungrouped, the mode is the

score that occurs with the greatest _ [5.3]. In grouped data, the

mode is taken as the _ of the class interval that contains the

greatest / smallest number of scores [5.3]. The symbol for the mode is _ [5.3].

The Median

If there are only a few scores in a collection (in which case they would

naturally be left ungrouped), the median is usually defined informally. The

scores are arranged from high down to low, and if there is an odd number, the

median is defined as the score in the middle. When there is an even number of

scores, there is no one middle score, and the median is taken as the point ___way

between the two scores that bracket the middle position [5.4].

The formal definition of the median is C50, the 50th centile point, which

is the point along the scale of scores below which _% of the scores fall [5.4].

For a large number of scores that have been grouped, the median can be calcu¬

lated like any other centile point, using interpolation.

The symbol for the median is _ [5.4].

The Mean

The mean is the result of summing all the scores and then dividing the sum

by the _______ of scores [5.5]. Strictly speaking, this procedure defines

the arithmetic mean (there are other measures of central tendency also called

means). The symbol for the mean of a sample of scores collectively called x

is , and the symbol for the mean of a population is _ [5.5]. The latter

symbol is the lower-case Greek letter _____ [5.5].

When the scores in a set are to be summed, as in finding the mean, the

capital Greek letter _____, Σ, indicates that this operation is to be performed

[5.2]. ΣX should be read "_____ of X" [5.2]. Using this symbol,

the mean of a population called X is defined as follows:

μ = _____ [Formula 5.1a]

The mean of a sample called X is defined as:

X̄ = _____ [Formula 5.1b]

(Be sure you used N and n correctly in these formulas.)
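The defining formula is the same arithmetic whether the collection is a population (divide by N) or a sample (divide by n): sum the scores, then divide by how many there are. A one-line sketch in Python:

```python
def mean(scores):
    """Arithmetic mean: the sum of the scores divided by their number."""
    return sum(scores) / len(scores)
```

For example, mean([2, 4, 6]) is 12/3 = 4.0.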

Properties of the Mean

Many important properties of the mean can be understood with the aid of a
physical analogy. Picture the scores in a distribution as weights arranged
along a weightless plank, as in Figure 5.1. The plank will balance (become
level) at a certain point, and that point is the mean of the scores.

Here is a basic property of the mean (not noted in the text) that you can
easily understand by thinking of the mean as the balance point of the distribution: The mean always falls somewhere between the lowest and the highest score
in the distribution (unless all scores have the same value). In the physical
analogy, the distribution always balances at a point somewhere between the left-most and the right-most weights.

You can also understand how the mean is sensitive to the exact location of
each score. The balance point is sensitive to the exact location of each weight,
and if a given weight is moved (corresponding to a change in value), the balance point will change. Note especially that an extremely low score pulls the mean way down, just as a weight well to the left of the others pulls the balance point way over to the left. Similarly, an extremely high score pulls the
mean way up, just as a weight well to the right of the others pulls the balance
point way over to the right.

Suppose you subtract the mean from each score in the distribution. The
result for a given score is called the score's deviation from the mean. A
score below the mean will have a negative deviation, and a score above the mean
will have a positive deviation. If you compute the mean and the deviations
correctly, the sum of the deviations, taking into account the fact that some
are negative and some positive, will always be zero. This is the numerical way
of saying that the mean is the balance point of the scores, and it is illustrated in Figure 5.1.
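You can verify the zero-sum property with any set of scores. The sketch below (Python) uses the six scores from the worked table in the special-help section of Chapter 6 of this workbook, whose mean is 18.0:

```python
def deviations(scores):
    """Deviation scores: each raw score minus the mean of the set."""
    m = sum(scores) / len(scores)
    return [x - m for x in scores]

devs = deviations([23, 22, 19, 17, 16, 11])  # mean is 108/6 = 18.0
total = sum(devs)                            # positives and negatives cancel
```

The deviations come out +5, +4, +1, -1, -2, and -7, and their sum is 0: the mean balances the distribution.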

Suppose you add a certain number (say 10) to each score in a distribution.

What will happen to the mean? In the physical analogy, this is like sliding

each weight to the right by a certain amount (10 units in the example). The

balance point will obviously move right along with the weights (to a point 10

units to the right in the example). In terms of numbers, if some constant

amount is added to each score in a distribution, the entire distribution is

shifted up / down by that amount, and the mean will be _______________

[5.6]. Similarly, if a constant is subtracted from each score,

the mean will be ___ [5.6]. Other measures of

central tendency discussed in this chapter are / are not affected in the same

way [5.6].

Unfortunately, the analogy between the mean and the balance point does not

help you understand what happens when you multiply or divide every score in a

distribution by a constant (because it is hard to picture what happens to the

weights as a result of the multiplication or the division). Multiplying each

score by a constant _______________ the mean by the amount of the constant,

and dividing each score by a constant has the effect of_the mean

by that amount [5.6].
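Both effects are easy to confirm numerically. A sketch (Python; the scores are made up for illustration):

```python
def mean(scores):
    return sum(scores) / len(scores)

scores = [4, 6, 10, 20]              # hypothetical scores; mean = 10.0
shifted = [x + 10 for x in scores]   # add a constant: mean rises by 10
scaled = [x * 3 for x in scores]     # multiply by a constant: mean is tripled
```

The shifted scores have a mean of 20.0 and the scaled scores a mean of 30.0, just as Section 5.6 says.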

Properties of the Median

The median responds to how many scores lie below (or above) it, and also

to / but not to how far away the scores may be [5.8]. Thus the median is

more / less sensitive than the mean to the presence of a few extreme scores

[5.8]. This fact means that in distributions that are strongly asymmetrical

(or skewed), the median / mean may be the better choice if it is desired to

represent the bulk of the scores and not give undue weight to the relatively

few deviant ones [5.8].
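A quick demonstration of this resistance (Python; statistics.mean and statistics.median are standard-library functions, and the scores are hypothetical):

```python
from statistics import mean, median

scores = [10, 12, 14, 16, 18]          # a tidy symmetric set; median 14
with_outlier = [10, 12, 14, 16, 180]   # one wildly extreme high score

# The median ignores how far the extreme score lies from the rest;
# the mean is dragged upward by it.
```

Replacing 18 with 180 leaves the median at 14 but pulls the mean far above it.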

Properties of the Mode

For a given distribution, there is only one mean, just as there is only one

point where the weights representing the scores would balance. Likewise, there

is only one median, because there is only one point that divides the top half

of the scores from the bottom half. There may / may not be more than one

mode [5.9]. The mode is the only measure that can be used for data that have

the character of a(n) nominal / ordinal / interval / ratio scale [5.9].



Sampling Fluctuation and Inferential Statistics

Suppose there is a population of scores, and you draw a random sample of a


certain size from it. The sample will turn out to have a particular mean, a
particular median, and a particular mode (maybe more than one mode). Now draw
another random sample of the same size. The mean, median, and mode(s) of the
second sample are likely to differ from their counterparts in the first sample.
Draw a third sample of the same size, and continue drawing samples of this size.
The means will fluctuate from sample to sample, and so will the medians, and so
will the modes.

The _______s will vary least among themselves [5.7, final paragraph], and

the _______ stands second in ability to resist the influence of sampling

fluctuation [5.8]. This state of affairs makes the mean the most useful measure

of central tendency in inferential statistics—the most useful, that is, in

techniques for drawing inferences about a population from a sample. Further¬

more, the inferential techniques require various calculations based on a measure

of central tendency, and of the three measures, the _ is amenable to arithmetic

and algebraic manipulation in a way that the other measures are not [5.7].
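A simulation can make the first claim concrete. The sketch below (Python; the population, sample size, number of samples, and seed are all arbitrary choices, not from the text) draws many same-size samples from a roughly normal population and compares how much the sample means and sample medians fluctuate:

```python
import random
from statistics import mean, median, pstdev

random.seed(42)  # fixed seed so the run is repeatable
population = [random.gauss(100, 15) for _ in range(10_000)]

means, medians = [], []
for _ in range(200):                       # draw 200 samples of size 50
    sample = random.sample(population, 50)
    means.append(mean(sample))
    medians.append(median(sample))

spread_of_means = pstdev(means)      # how much the means fluctuate
spread_of_medians = pstdev(medians)  # how much the medians fluctuate
```

For a normal-shaped population the means fluctuate noticeably less than the medians, which is one reason the mean is preferred in inferential work.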

Relative Locations of the Three Measures of Central Tendency

In distributions that are perfectly symmetrical, mean, median, and (if the

distribution is unimodal) mode will yield the same / different values [5.10].

If the mean and median have different values, the distribution cannot be / may

or may not be symmetrical [5.10]. The more skewed, or lopsided, the distribu¬

tion is, the greater / lesser the discrepancy between these two measures [5.10].

In a smooth negatively skewed distribution, the _______ has the highest

score value [5.10]. The _ has been specially affected by the fewer but

relatively extreme scores in the tail, and thus has the lowest value [5.10].

In a positively skewed distribution, the same / opposite situation obtains


[5.10].

EXERCISES

Which measure of central tendency is:

_ 1. The only one suitable for qualitative (nominal or categorical) data?

_ 2. Impossible to compute if the distribution is open-ended?



_ 3. Sensitive to the exact value of each score?

_ 4. The most resistant to sampling fluctuation?

_____ 5. Responsive only to the number of scores above it or below it, not
to their exact locations?

_ 6. Most widely used in advanced statistical procedures?

_ 7. The least resistant to sampling fluctuation?

__ 8. Least useful for advanced statistical procedures?

_ 9. The balance point of the distribution?

_ 10. The point about which the sum of negative deviations equals the sum
of positive deviations?

_ 11. Lowest in value when the distribution is positively skewed?

_ 12. Sometimes not a unique point in the distribution?

_ 13. Lowest in value when the distribution is negatively skewed?

_ 14. Most sensitive to extremely low or extremely high scores?

_ 15. The point along the scale of scores that divides the upper half of
the scores from the lower half?

__ 16. Highest in value when the distribution is negatively skewed?

_ 17. The measure that best reflects the total value of the scores?

_ 18. Highest in value when the distribution is positively skewed?

SYMBOLISM DRILL
Yes, this is a drill such as you did many of back when you learned your
alphabet and multiplication table. And like those it'll be good for you. Fill
in the blanks. Answers appear on p. 231.

Symbol    Pronunciation    Meaning

1. _____  "little en"      Number of scores in a _________
2. _____  "big en"         Number of scores in a _________
3. _____  "eks"            A score, or the set of scores, collectively
4. Σ      ___________      Result of summing quantities of some kind
5. X̄      ___________      ΣX/n; the mean of a _________
6. μ      ___________      ΣX/N; the mean of a _________
7. Mdn    ___________      C50 (may also be defined informally)
8. Mo     ___________      Score or midpoint of class interval with largest f
CHAPTER 6
VARIABILITY

__ 6.1 Introduction

_ 6.2 The Range

_ 6.3 The Semiinterquartile Range

6.4 Deviation Scores

6.5 Deviational Measures: The Variance

_ 6.6 Deviational Measures: The Standard


Deviation

6.7 Calculation of the Standard Deviation:


Raw Score Method

6.8 Score Transformations and Measures of


Variability

6.9 Calculation of the Standard Deviation:


Grouped Scores

6.10 Properties of the Standard Deviation

6.11 Properties of the Semiinterquartile


Range

6.12 Properties of the Range

6.13 Measures of Variability and the Normal


Distribution

6.14 Comparing Means of Two Distributions

6.15 Properties of the Measures of Variability:


Summary

The list of problems and exercises appears on the next page.


1 2 3

4 5 6

7 8 9

10 11 12

13 14 15

16 17 18

SUMMARY

Measures of Variability

Every set of scores has three important properties: a shape, a central


tendency, and a variability. Chapters 3 and 4 introduced terms for describing
various possible shapes, and Chapter 5 presented three measures of central
tendency. This chapter now offers four measures of variability.

The Purpose of Measures of Variability

Measures of variability express quantitatively (in numerical terms) the

extent to which the scores in a set _

_ [6.1]. A measure of variability specifies / does not specify

how far any particular score diverges from the center of the group; rather it

is a summary figure that describes the spread of the entire set of scores [6.1,

Paragraph 3 on p. 82]. Furthermore, a measure of variability provides / does not

provide information about the level of performance (the central tendency), and

it gives / does not give a clue as to the shape of the distribution [6.1, Para¬

graph 3 on p. 82].

The Range

The simplest measure of variability is the range, which is the difference

between the __ and the _ score [6.2] . Note that the

range is a distance, whereas a measure of central tendency is a location. All

the other measures of variability are also distances, except for the variance,

which is the square of a distance.

There is no special symbol for the range.



The Semiinterquartile Range

Every distribution has three quartile points, which are the three score

points which divide the distribution into four parts, each containing an equal

number of cases. These points, symbolized Q1, Q2, and Q3, are C25, C50, and

C75, respectively [6.3]. The semiinterquartile range is defined as one-half

the distance between the _______ and the _______ quartile points [6.3].

The symbol for the semiinterquartile range is ___ [6.3], and the formula that

defines it is:
____________________ [Formula 6.1a or b]

Note that the distance from Q1 to Q3 is the range of the middle 50% of the

scores. Thus the semiinterquartile range is half the range of the middle 50%

of the scores. It may also be thought of as the mean distance between the

mode / median / mean and the two outer quartile points [6.3].
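A sketch of the computation (Python). The centile routine below locates centile points in a small ungrouped set by linear interpolation between the ordered scores; this is one common convention, not identical to the text's interpolation within grouped class intervals, so its results may differ slightly from Chapter 3's method.

```python
def centile(scores, pct):
    """Locate a centile point by linear interpolation between the
    ordered scores (one common convention for ungrouped data)."""
    s = sorted(scores)
    pos = pct / 100 * (len(s) - 1)   # fractional position in the ordered list
    lo = int(pos)
    frac = pos - lo
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + frac * (s[hi] - s[lo])

def semi_iqr(scores):
    """Q = (Q3 - Q1) / 2, half the range of the middle 50% of the scores."""
    return (centile(scores, 75) - centile(scores, 25)) / 2
```

For the scores 1 through 11, this gives Q1 = 3.5 and Q3 = 8.5, so Q = 2.5.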

Deviation Scores

The other two measures of variability use the mean of the distribution as a

reference point and indicate how far the scores lie from the mean, on the average.

The distance between a given score and the mean is called the deviation score

for that raw score, and it is found by subtracting the mean from the raw score.

If a raw score is symbolized X, the deviation score is symbolized x, and the

formula that defines a deviation score is:

x = (X − μ) for scores in a population

and

x = ( ___ − ___ ) for scores in a sample [6.4].

Warning: It is now highly important to distinguish raw scores, symbolized

with a capital X, from deviation scores, symbolized with a little x. Whenever

you see an X or an x, note carefully whether it's a capital or a small letter.

And when you write the symbols, you must make them clearly different. I suggest

that you make your capital letter large with straight lines, like this: X, and

your small letter small with one hooked line, like this: x.

The Variance

A third measure of variability, the variance, is based on the squares of the


deviation scores. The square of a number is the result of multiplying that number by itself. Thus the square of 2, symbolized 2², is 2 × 2, or 4.

The variance is defined as the mean of the _________ of the raw /

deviation scores [6.5]. The symbol for the variance of a population is _;

the foreign letter there is the lower-case Greek letter _ [6.5, Line 2],

and the symbol is read "sigma squared." The symbol for the variance of a sam¬

ple is _, which is read "es squared" [6.5]. The formulas that define the

variance in the two cases are:

Variance of a population: ___ = Σ( ___ − ___ )²/___  or  ______ [Formula 6.2a]

Variance of a sample: ___ = Σ( ___ − ___ )²/___  or  ______ [Formula 6.2b]

The variance is a most important measure which finds its greatest use in

inferential / descriptive statistics [6.5]. At the descriptive level, it has a

fatal flaw: its calculated value is expressed in terms of _ units of

measurement [6.5]. We can correct this flaw, though, by taking the square root

of the variance. (To find the square root of a number, figure out what value

has to be squared to equal that number. Thus the square root of 4, symbolized

√4, is 2, because 2² = 4.)

The Standard Deviation

The square root of the variance is called the standard deviation, and it

serves as a fourth measure of variability. The symbol for the standard devia¬

tion of a population is _, which is read "sigma," and the symbol for the

standard deviation of a sample is _, which is read "es." The defining

formulas, in terms of the deviation scores x, are:

Standard deviation of a population: ___ = ______ [Formula 6.3a]

Standard deviation of a sample: ___ = ______ [Formula 6.3b]

(Special help in understanding the variance and the standard deviation is


offered below.)

Sensitivity to the Scores in a Distribution

The standard deviation, like the mean / median / mode, is sensitive to the

exact location of every _______ in a distribution [6.10]. In particular,

the standard deviation is more / less sensitive than the semiinterquartile



range to the presence or absence of scores which lie at the extremes of the

distribution [6.10]. Because of this characteristic sensitivity, the standard

deviation is / may not be the best choice among measures of variability when

the distribution contains a few very extreme scores, or when the distribution is

badly _ [6.10]. In contrast, if a distribution is badly skewed or if

it contains a few very extreme scores, the semiinterquartile range will / will

not respond to the presence of such scores, and will / but will not give them

undue weight [6.11].

Only the two outermost scores of a distribution affect the range. It is

thus highly / not very sensitive to the total condition of the distribution

[6.12].

Effects of Transformations of the Scores in a Distribution

Think of the scores in a distribution as a set of weights on a plank, as in


Figure 5.1. If we add a constant (say 10) to each score, this is the same as
shifting each weight to the right by the amount of the constant (10 units). The
whole distribution thus slides to the right, and the measures of central ten¬
dency change (they each increase by 10), but the variability among the scores—
where they are in relation to each other, or where they are in relation to their
mean—does not change.

Thus adding a constant to each score in the distribution, or subtracting a

constant from each score, which is equivalent to sliding all the weights to the

left in one big clump, affects / does not affect the measures of variability

[6.8]. When scores are multiplied or divided by a constant, the range, the

semiinterquartile range, and the standard deviation are _

_ by that same constant [6.8].
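Both rules can be checked directly. The sketch below (Python) applies them to the six scores of the worked table in the special-help section later in this chapter, whose standard deviation is 4.0; statistics.pstdev computes a population-style standard deviation (dividing by n):

```python
from statistics import pstdev

scores = [11, 16, 17, 19, 22, 23]    # worked-example scores; S = 4.0
shifted = [x + 10 for x in scores]   # adding a constant: S unchanged
scaled = [x * 2 for x in scores]     # multiplying by 2: S doubles
```

The shifted scores still have S = 4.0, while the scaled scores have S = 8.0.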

Comparing Means of Two Distributions

There are many occasions on which a researcher will wish to compare the mean

of one distribution with the mean of another. The researcher will subtract one

mean from another, but the difference so obtained usually has little meaning

without an adequate frame of reference by which to judge. That frame of refer¬

ence can be provided by comparing the difference to the ___

of the variable [6.14]. As a general guideline, a difference of .1 standard

deviation is negligible / moderate , and a difference of .5 standard deviation

is moderate / of some importance [6.14, p. 96].
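Expressing a difference between two means in standard-deviation units is a one-line calculation. A sketch (Python; the means and standard deviation below are made-up numbers):

```python
def diff_in_sd_units(mean_a, mean_b, sd):
    """Difference between two means, expressed in units of the
    variable's standard deviation."""
    return (mean_a - mean_b) / sd

# Hypothetical example: means of 105.0 and 97.5 on a scale whose
# standard deviation is 15.0 differ by half a standard deviation.
d = diff_in_sd_units(105.0, 97.5, 15.0)
```

Here d comes out 0.5, a difference "of some importance" by the guideline above.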



MAP of MEASURES of VARIABILITY

[A chart appeared here mapping the measures of variability discussed in this chapter: the range, the semiinterquartile range, the variance, and the standard deviation.]

SPECIAL HELP with the VARIANCE and the STANDARD DEVIATION

To understand how the variance and the standard deviation function as


measures of the variability among the scores in a distribution, remember that
they are based on the deviations between the scores and the mean.

The mean will always fall somewhere between the lowest score in the distri¬
bution and the highest one, as noted on p. 41 of this workbook, so some scores
will be below the mean, and some will be above it. For an illustration, look
at the left-most column in the table below. This is not a frequency distribu¬
tion, just a listing of the scores in order from high down to low. Check the
computation of the mean, and note that it falls between the low of 11 and the
high of 23.

Raw Score, X    Deviation Score, x    Squared Deviation Score, x²

23              +5                    25
22              +4                    16
19              +1                     1
17              -1                     1
16              -2                     4
11              -7                    49

ΣX = 108        Σx = 0                Σx² = 96

Mean: X̄ = ΣX/n = 108/6 = 18.0

Variance: S² = Σx²/n = 96/6 = 16.0

Standard deviation: S = √16.0 = 4.0
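The same computation, written as a step-by-step sketch in Python (the root mean squared deviation: deviations, their squares, the mean square, then the square root):

```python
from math import sqrt

scores = [23, 22, 19, 17, 16, 11]
m = sum(scores) / len(scores)                      # mean = 108/6 = 18.0
devs = [x - m for x in scores]                     # deviation scores
variance = sum(d * d for d in devs) / len(scores)  # S squared = 96/6 = 16.0
sd = sqrt(variance)                                # S = 4.0
```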

Where are the raw scores in relation to the mean? The deviation scores in
the second column tell you. Raw scores below the mean have negative deviations;
raw scores above the mean have positive deviations; scores right at the mean
would have deviations of zero. The larger the deviation score, ignoring its
sign, the farther the corresponding raw score lies from the mean. Thus the raw
score of 11 is the farthest from the mean, because its deviation of 7 (really
-7) is the greatest of the deviation scores.

Now: the standard deviation of a distribution is just what its name says.
It is a standard, or typical, deviation. How big do the deviation scores for
the distribution run (ignoring their signs)? The standard deviation tells you
how big a typical deviation score is. In fact, the standard deviation is a
kind of average of the deviation scores (when their signs are ignored). It must
fall somewhere between the smallest deviation and the largest one, and the bigger
the deviations are, in general, the bigger the standard deviation will be.

It is this latter feature that makes the standard deviation a measure of


the variability among the raw scores. What would happen if the raw scores in
the left-hand column of the table were more widely scattered about their mean?
What would happen, that is, if the scores above the mean became even larger in
value and thus moved farther above the mean, while the scores below the mean
became even smaller and thus moved farther below it? The deviations from the
mean would increase, of course, and so would any average of the deviation scores.
In particular, the standard deviation, as an average, would increase. Thus an
increase in the scatter among the raw scores would produce an increase in the
standard deviation.

The formula for the standard deviation is easy to remember if you recall
that it is the root mean squared deviation, as the text tells you on p. 86.
That is, it is the square root of the mean of the squared deviation scores.
So to compute the standard deviation, you have to find the deviation scores,
then the squares of the deviation scores, then the mean of the squared devia¬
tions, and then the square root of this mean. These calculations are shown in
this order in the other columns of the table above.

For this distribution, the standard deviation, S, turns out to be 4.0.


Think again what this number means. The standard deviation is a standard devia¬
tion—that is, a typical deviation. So for this distribution, a typical amount
by which the raw scores deviate from their mean is 4 units. To be a bit more
precise, an average of the deviation scores is 4.0, and thus some deviation scores
must be less than 4.0 while others must be more than 4.0. Note how these things
are true in the table. The deviation scores in the second column (ignoring their
signs) run from 7 and 5 down through 4 and 2 to 1. The value for the standard
deviation, 4.0, does indeed fall between the 7 and the 1, and it is a pretty
good average for those numbers.

Note that the standard deviation is calculated by doing two operations and
then undoing them, which gets you back to something comparable to what you began
with. You start with deviation scores, and you do two things to them: square
them, and then sum the squares. Then you undo these things: divide by the num¬
ber of cases, which undoes the summing; and take the square root, which undoes
the squaring. The result, the standard deviation, is comparable to the devia¬
tion scores with which you began: it is a standard, or typical, deviation.

Look again now at the computations in the table above. On the way to the
standard deviation in the lower right-hand corner, the variance turned up, for
the variance is nothing more than the mean of the squared deviation scores.
Remembering that it's a mean will help you tell if you have a reasonable value
when you've computed a variance. As the mean of the squared deviation scores,
it must fall somewhere between the largest squared deviation and the smallest
one. In the table, note that 16.0 does fall between the high of 49 and the low
of 1.
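If you have a computer handy, you can check the calculations in the table with a few lines of code. Here is a sketch in Python (my addition, not Minium's; the variable names are invented):

```python
# Check the worked example: scores 23, 22, 19, 17, 16, 11 (p. 53).
scores = [23, 22, 19, 17, 16, 11]
n = len(scores)

mean = sum(scores) / n                    # X-bar = sum(X)/n = 108/6 = 18.0
deviations = [x - mean for x in scores]   # x = X - X-bar
squared = [d ** 2 for d in deviations]    # the squared deviation scores

variance = sum(squared) / n               # S squared = 96/6 = 16.0
sd = variance ** 0.5                      # S = square root of 16.0 = 4.0

print(mean, variance, sd)                 # 18.0 16.0 4.0
print(sum(deviations))                    # 0.0: deviations always sum to zero
```

Note that the value 4.0 falls between the smallest deviation (1, ignoring signs) and the largest (7), just as the text says a standard deviation must.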
Variability 55

EXERCISES

To practice conceptualizing the variance and the standard deviation in the
terms introduced above, fill in the tables below, which are like the example
you just went over. THINK WHAT YOU ARE DOING! Here's a list of points to note;
do indeed note how these generalizations are true in each table.

The Mean (X̄): As an average of the raw scores, indicates about how large they
run. Takes an intermediate value, with at least one raw score larger and at
least one raw score smaller. Serves as the reference point for the deviation
scores (xs).

Deviation Scores (xs): Indicate where the corresponding raw score (X) lies in
relation to the mean—whether the raw score is above (+) or below (-) the mean,
and how far from the mean the raw score falls. Taking their signs into account,
Σx = 0 (see Section 5.7 in the text).

Variance (S²): Is the mean of the squared deviation scores (x²s). As an average
of the squared deviation scores, indicates about how large they run, and takes
an intermediate value, with at least one squared deviation larger and at least
one squared deviation smaller.

Standard Deviation (S): Is a typical value for the deviation scores (ignoring
their signs); serves as an average for the deviation scores (xs), indicating
about how large they run. Takes an intermediate value, with at least one
deviation larger and at least one deviation smaller. Is calculated by doing
two operations (squaring the deviation scores and summing them) and then un¬
doing the operations (dividing the sum by n or N and taking the square root
of the results), which gets you back to something comparable to what you started
with—something comparable to the deviation scores.

Raw Score, X    Deviation Score, x    Squared Deviation Score, x²

     13

     12

      9

      7

      6

      1

ΣX =                Σx =              Σx² =

n =

Mean = X̄ = ΣX/n =        /       =    S² = Σx²/n =        /       =

                                      S = √       =
56 Chapter 6

Your standard deviation for the little distribution in that table should
have been 4.0, the same as the value for the example on p. 53. Why do the two
distributions have the same standard deviation? Yes, it's because they have
the same amount of variability, but why, exactly, do the two standard deviations
work out to be the same number?

Raw Score, X    Deviation Score, x    Squared Deviation Score, x²

     14

     13

     10

     10

      8

      5

ΣX =                Σx =              Σx² =

n =

Mean = X̄ = ΣX/n =        /       =    S² = Σx²/n =        /       =

                                      S = √       =

Note how the generalizations on the preceding page apply to these tables.

Raw Score, X    Deviation Score, x    Squared Deviation Score, x²

     14

      9

      9

      9

ΣX =                Σx =              Σx² =

n =

Mean = X̄ = ΣX/n =        /       =    S² = Σx²/n =        /       =

                                      S = √       =
Variability 57

Raw Score, X    Deviation Score, x    Squared Deviation Score, x²

      7

      7

      5

      4

      4

      3

      3

ΣX =                Σx =              Σx² =

n =

Mean = X̄ = ΣX/n =        /       =    S² = Σx²/n =        /       =

                                      S = √       =

MORE EXERCISES

As Minium's Second Law of Statistics implies, computing variances and
standard deviations from deviation scores is usually quite laborious, because
in real life the mean of a distribution is usually some ridiculously un-whole
number like 15.632, and the deviation scores are also decimal fractions. For¬
tunately, there are ways to compute the right answers without fussing with
deviation scores, ways that don't require anything but the raw scores. The
basic formula is:

    Σx² = ΣX² - (ΣX)²/n

(Read this as "The sum of little-x-squared equals the sum of big-X-squared
minus the-sum-of-big-X-the-quantity-squared over n.") Thus whenever you need
Σx², you can substitute the expression on the right of the equals sign. To
practice using this formula and to see that it is trustworthy, compute
Σx² = ΣX² - (ΣX)²/n for each of the distributions above, and look to see if
the result is the same as what you got for Σx². The raw scores are relisted
below for each distribution, and the computations for the data in the table
used as the example on p. 53 have been worked.
58 Chapter 6

Squared Raw Score, X²    Raw Score, X       To make it clear that X² differs from
                                            x², the column of X² values is listed
        529                   23            on the left side of the raw scores.

        484                   22
                                            Σx² = ΣX² - (ΣX)²/n
        361                   19
                                                = 2040 - (108)²/6
        289                   17
                                                = 2040 - 11664/6
        256                   16
                                                = 2040 - 1944
        121                   11
                                                = 96 = the value computed
ΣX² = 2040    ΣX = 108    n = 6                        directly on p. 53
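If you would like a machine check of the shortcut, the following Python sketch (mine, not the text's) computes Σx² both ways for the example data and confirms they agree:

```python
# Sum of squared deviations two ways, for the scores on p. 53.
scores = [23, 22, 19, 17, 16, 11]
n = len(scores)
mean = sum(scores) / n

# Deviation-score method: square each deviation from the mean and sum.
direct = sum((x - mean) ** 2 for x in scores)

# Raw-score formula: sum of X squared, minus (sum of X) squared over n.
shortcut = sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

print(direct, shortcut)   # 96.0 96.0
```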

Squared Raw Score, X²    Raw Score, X

                              13            Σx² = ΣX² - (ΣX)²/n
                              12
                               9                =        - (      )²/
                               7
                               6                =        -
                               1
                                                =        should = the value
ΣX² =         ΣX =        n =                            computed directly
                                                         on p. 55

Squared Raw Score, X²    Raw Score, X

                              14            Σx² = ΣX² - (ΣX)²/n
                              13
                              10                =        - (      )²/
                              10
                               8                =        -
                               5
                                                =        should = the value
ΣX² =         ΣX =        n =                            computed directly
                                                         on p. 56
Variability 59

Squared Raw Score, X²    Raw Score, X

                              14            Σx² = ΣX² - (ΣX)²/n
                               9
                               9                =        - (      )²/
                               9
                                                =        -

                                                =        should = the value
ΣX² =         ΣX =        n =                            computed directly
                                                         on p. 56

Squared Raw Score, X²    Raw Score, X

                               7            Σx² = ΣX² - (ΣX)²/n
                               7
                               5                =        - (      )²/
                               4
                               4                =        -
                               3
                               3                =        should = the value
                                                         computed directly
ΣX² =         ΣX =        n =                            on p. 57

-------------------------------- MNEMONIC TIP --------------------------------

That formula you practiced using in the exercise above is worth memorizing.
Write it out several times, taking care to distinguish x from X and ΣX² from
(ΣX)². Note again that the formula is read "The sum of little-x-squared equals
the sum of big-X-squared minus the-sum-of-big-X-the-quantity-squared over n."
Rehearse these words as well as the symbols to which they correspond.
60 Chapter 6

STILL MORE EXERCISES

Of the three measures of variability useful in descriptive statistics,


namely the range, the semiinterquartile range, and the standard deviation,
which one is:

_ 1. Defined in terms of the scores' deviations from the mean?

_ 2. Unresponsive to the location of scores in the middle of the
     distribution?
_ 3. Poorest in resistance to sampling fluctuation?

_ 4. Best in resistance to sampling fluctuation?

_ 5. Most useful in inferential statistics?

_ 6. Most sensitive to extremely high or extremely low scores?

_ 7. Particularly useful with open-ended distributions?

_ 8. Determined by only two scores?

SYMBOLISM DRILL

This drill includes some review. Fill in the blanks.

Symbol      Pronunciation         Meaning

 1  ___     "little en"           Number of scores in a _________

 2  ___     "big en"              Number of scores in a _________

 3  X       "eks" or "big eks"    A raw score, or the set of raw scores

 4  ___     _________             Result of summing quantities of some kind

 5  X̄       _________             ΣX/n; the mean of a _________

 6  ___     _________             ΣX/N; the mean of a _________

 9  ___     "little eks"          X - X̄ or X - μ; deviation score

10  ___     "cue"                 (C75 - C25)/2; semiinterquartile range

11  σ²      "sigma squared"       Σx²/N; variance of a population

12  ___     "sigma"               √(Σx²/N); _________ of a _________

13  S²      "es squared"          Σx²/n; _________ of a _________

14  ___     "es"                  √(Σx²/n); _________ of a _________
Variability 61

---------------------------- Statistics in Use ----------------------------

DESCRIPTIVE STATISTICS in USE

In 1973, the U. S. Consumer Product Safety Commission received reports


indicating that spray adhesives damage chromosomes and can cause a pregnant
woman to bear a deformed child. The Commission banned the sale of such adhe¬
sives in August of that year and warned all persons who had come into contact
with them, especially pregnant women, to consult a physician for a chromosome
study.

Two researchers subsequently mailed a questionnaire to all the Americans


listed in a directory of professionals who do diagnostic cytogenetics and
genetic counseling. Only five of these individuals failed to respond. The
questionnaire asked for the number of persons who had consulted the respondent
about exposure to spray adhesives and the number whose chromosomes the respon¬
dent had actually studied. The researchers reported the following data:

Number of People Consulting Respondent
about Exposure to Spray Adhesive            f

        None                               52
        1 - 5                              68     Six additional
        6 - 10                             31     respondents re-
        11 - 15                             8     ported "some"
        16 - 20                             8     inquiries.
        21 - 25                             2
        Over 25                             7
                                          ---
                                          176

Range 0 - 200. X̄ = 6.81 inquiries per respondent. C25 = 0, C50 = 3.2, C75 = 7.6.

Number of Persons Whose Chromosomes
were Studied by Respondent                  f

        None                               49
        1 - 5                              58     Two additional
        6 - 10                             13     respondents re-
        11 - 15                             4     ported "some"
        16 - 20                             2     studies.
        21 - 25                             1
        Over 25                             1
                                          ---
                                          128

Range 0 - 44. X̄ = 2.97 studies per respondent. C25 = 0, C50 = 2.0, C75 = 4.2.

As you can see, the researchers did not follow standard practice in con¬
structing their grouped frequency distributions: they have too few class inter¬
vals, the intervals are not of uniform width, and one interval is open-ended.
Nevertheless, these data illustrate a number of techniques of descriptive sta¬
tistics and offer you a chance to review much of what you've learned so far.
62 Chapter 6

1. What is the shape of the first distribution? _

2. What is the shape of the second distribution? _

3. What is the median of the first distribution?

4. What is the median of the second distribution?

5. Why is the mean higher than the median in the first distribution?

6. Why is the mean higher than the median in the second distribution?

7. What is Q for the first distribution? _

8. What is Q for the second distribution? _

9. The researchers actually reported the values of Q for these distributions;
these are the only times I've seen this done. Why is Q better than S as a measure
of variability for these distributions?

10. For the 176 respondents who reported an exact number, what was the total
number of people who consulted with them about the effects of exposure to spray
adhesives?

11. For the 128 respondents who reported an exact number, what was the total
number of people whose chromosomes were studied?

12. The researchers' goal was to determine the impact of the ban and the
warning on the genetic counselors in the U. S. who do diagnostic cytogenetics.
What was the population of interest to them?

13. Why did the researchers employ only descriptive statistics with no infer¬
ential techniques?

Important note: The ban was withdrawn in six months when the purported cor¬
relations between exposure to spray adhesive and chromosomal damage or birth de¬
fects could not be confirmed, and no toxicity could be demonstrated for the ad¬
hesives. In fact, investigators who reexamined the slides that had initially
been believed to show chromosome damage in exposed individuals did not agree with
the original interpretation.

Less important note: Don't be upset if you computed C50 and C75 for the two
distributions and failed to get the figures the researchers reported. I can't
derive the figures from the numbers in the tables either. The researchers must
have computed the quartiles from the ungrouped data.

Reference: E. B. Hook & K. M. Healy, "Consequences of a Nationwide Ban on


Spray Adhesives Alleged to be Human Teratogens and Mutagens," Science, 1976, 191,
566 - 567.
CHAPTER 7

THE NORMAL CURVE

7.1 Introduction

7.2 The Nature of the Normal Curve

7.3 Historical Aspects of the Normal Curve

7.4 The Normal Curve as a Model for Real Variables

7.5 The Normal Curve as a Model for Sampling Distributions

7.6 Standard Scores (z Scores) and the Normal Curve

7.7 Finding Areas when the Score is Known

7.8 Finding Scores when the Area is Known

PROBLEMS and EXERCISES

63
64 Chapter 7

SUMMARY

The Normal Curve as a Model for the Shape of a Distribution

The normal curve is a model, or representation, of one feature of a distri¬

bution of scores—its shape. Some distributions have a shape for which the

normal curve is a poor model. Furthermore, the normal curve best describes

a finite number / an infinity of observations that are on a continuous / dis¬

crete scale of measurement, while in reality recorded observations are dis¬

crete/continuous and finite in number [7.4, p. 111]. Nevertheless, there are

many real distributions whose shape is well represented by the normal curve.

The normal curve also functions well as a model for many distributions of

sample _, even if it is not a good model for the raw observations

that make up the samples [7.5]. Suppose, for example, that we draw a very great

number of random samples from some population. (All the samples should be of

the same size.) We compute the mean of each sample, which is a statistic char¬

acterizing the sample, and we cast these statistics into a distribution. It

would be found that the shape of the distribution of this large number of means

tends to approximate the _ curve [7.5].

The normal curve, as a model for the shape of a distribution, does not
specify the central tendency of that distribution (it does not specify the mean,
for example), nor does the model specify the variability (it does not specify
the standard deviation, for example). Thus different distributions can all
conform to the shape called normal, even though the distributions differ in
their mean, in their standard deviation, or in both. (See Figure 7.1 for an
example.)

Characteristics of the Normal Curve

Exactly what is the shape of the normal curve? It is symmetrical / asymmet¬

rical and unimodal / bimodal [7.2]. (It is thus a specific kind of bell

shape.) Going away from the middle toward either end, the curve gets closer

and closer to the horizontal axis, and it eventually / but it never actually

touches it [7.2, p. 107]. The curve is continuous / discrete [7.2].

Areas under various segments of the normal curve are of special interest.

The interval from one standard deviation below the mean to one standard devia¬

tion above it (the interval from μ - 1σ to μ + 1σ) contains about two-thirds

(68%) of the total area. In any distribution whose shape is well modeled by
The Normal Curve 65

the normal distribution, then, about two-thirds (68%) of the scores will have a

value falling in the interval from μ - 1σ to μ + 1σ. The interval μ ± 2σ con¬

tains about _% of the area, and thus about _% of the scores fall into this

interval [6.13]. The interval μ ± 3σ contains almost all the area (99.7% of it),

and thus almost all the scores (99.7% of them) fall into this interval. In gen¬

eral, relative area under the curve equals _________ of

cases in the distribution [7.6, Paragraph 1; see also the last paragraph of

Section 4.8 and p. 35 of this workbook].
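For readers with a computer, Python's standard math library can reproduce these areas from the error function. This sketch is my addition (the function name is invented), and it assumes a perfectly normal distribution:

```python
import math

def area_within(k):
    # Proportion of a normal distribution lying within k standard
    # deviations of the mean, computed from the error function:
    # P(mu - k*sigma < X < mu + k*sigma) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

print(round(area_within(1), 4))   # 0.6827: about two-thirds
print(round(area_within(2), 4))   # 0.9545: about 95%
print(round(area_within(3), 4))   # 0.9973: almost all the area
```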

Standard Scores (z Scores)

If a large number of people representative of the general population are

given a certain kind of IQ test, namely the Wechsler Adult Intelligence Scale

(the WAIS), the distribution of their scores will have a mean about 100 and a

standard deviation about 15. Back when the College Entrance Examination Board

Test (the CEEB test, also called the Scholastic Aptitude Test, or SAT) was first

constructed, a large number of high-school seniors would have earned scores with

a mean about 500 and a standard deviation about 100 on both the verbal and the

quantitative parts. (The means for the CEEB scores are somewhat lower now.)

Given these states of affairs, there is an important way in which an IQ score

of 100 from the first distribution corresponds to a CEEB score of 500 from the

second distribution: each falls at the mean of its distribution. There is also

an important way in which an IQ score of 115 corresponds to a CEEB score of 600:

each falls one standard deviation above the mean of the distribution from which

the score comes. Similarly, an IQ of 85 and a CEEB score of 400 are each one

standard deviation below the mean. An IQ of 130 corresponds to a CEEB score of

; an IQ of 145 corresponds to a CEEB score of _; an IQ of _ cor¬

responds to a CEEB score of 300; and an IQ of corresponds to a CEEB score

of 200 [Look ahead to Figure 8.4 on p. 132 for the answers].

There is a convenient way of saying where a raw score falls within its dis¬

tribution, a way that makes clear the sort of correspondences noted above. The

raw score is converted to a standard score. A standard score, or _ score,

states the position of the raw score in relation to the _ of the distri¬

bution, using the of the distribution as the unit of


66 Chapter 7

measurement [7.6]. Thus an IQ of 115 in the distribution described above has a

z score of +1, because it is 1 standard deviation (1 15-point unit) above the

mean (which is 100). A CEEB score of 600 in the other distribution also has a

z score of +1, because it too lies one standard deviation above the mean, but

here the standard deviation and the mean are those characterizing the distribu¬

tion of CEEB scores, namely 100 and 500, respectively. The z score for an IQ of

85 or a CEEB score of 400 is _, and the z score for an IQ of 130 or a CEEB

score of 700 is _ [Figure 8.4 on p. 132]. The IQ score with a z value of +3

is _, and the CEEB score with a z value of -3 is _ [Figure 8.4] .

Standard scores can easily be computed in your head if the numbers involved

are simple, like those in the examples above. If you need to use a formula to

find a z score, it will be one of these:

z score in a population called x: z = - [Formula 7.1]

z score in a sample called X: z = - [Formula 7.2]

The two distributions in the examples above would both be close to normal
in their shape, as Figure 8.4 shows, but z scores are useful in describing raw
scores in a distribution with any kind of shape.
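The arithmetic behind these correspondences is a single line. Here is a small Python sketch (my own illustration, not the text's; the function name is invented):

```python
def z_score(raw, mean, sd):
    # z = (X - mean) / sd: the raw score's distance from the mean,
    # expressed in standard-deviation units.
    return (raw - mean) / sd

# WAIS IQ: mean 100, SD 15.  CEEB: mean 500, SD 100 (original norms).
print(z_score(115, 100, 15))    # 1.0: one SD above the mean
print(z_score(600, 500, 100))   # 1.0: the same relative standing
print(z_score(85, 100, 15))     # -1.0: one SD below the mean
```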

Properties of z Scores

If all the scores in any given distribution are converted to z values, the

mean of the z scores is always _, and the standard deviation of the z scores

is always _ [7.6]. A distribution of z scores has a normal shape / whatever

shape is characteristic of the set of raw scores from which they were derived

[7.6].

ADVICE

The exercises offered in the text for this chapter are especially important.
Be sure to do at least some of the parts of each one. As for the other chapters,
the exercises provided here (following the map of the new concepts) do not dupli¬
cate those in the text.
The Normal Curve 67

MAP of z SCORES and the NORMAL DISTRIBUTION

z SCORES

    indicate where corresponding raw scores lie in relation to the mean of
    the distribution, using the standard deviation of the distribution as
    the unit of measurement

    for a complete distribution of raw scores, have a mean of zero and a
    standard deviation of one

    have the shape of the raw scores (not necessarily normal)

    are helpful in solving problems involving areas under the NORMAL CURVE

NORMAL CURVE

    has a particular bell shape, with 68% of the area in the interval μ ± 1σ,
    95% of the area in the interval μ ± 2σ, and 99.7% of the area in the
    interval μ ± 3σ

    provides a model for the shape of certain distributions of raw scores
    and certain distributions of sample statistics

    does not specify the distribution's central tendency or variability


68 Chapter 7

EXERCISES

Here's a collection of scores listed in order from high down to low. (This
is not a frequency distribution.) Take them to be a sample called X. Compute
the mean, find the deviation scores, and then calculate the standard deviation
from the squares of the deviation scores. These are the sorts of exercises I
offered you in the preceding chapter.

What's new here is this: Find the z score corresponding to each raw score,
and check to see that the mean of the z scores, z̄ ("zee bar"), is zero. Then
compute the standard deviation of the z scores using a procedure like that by
which you found the standard deviation of the raw scores. This requires treat¬
ing the z scores as though they were raw scores and finding the deviation scores
for them, using their own mean, zero, as a reference point.

All the figures in the table work out to be simple (but not necessarily
whole) numbers, and the standard deviation of the z scores should be exactly one,
of course.

Raw       Deviation     Squared        z Score    Deviation Score    Squared Deviation
Score     Score         Deviation                 for z Score        Score for z Score
                        Score

X         x = (X - X̄)   x²             z = x/S    (z - z̄)            (z - z̄)²

13

13

ΣX =      Σx =          Σx² =          Σz =       Σ(z - z̄) =         Σ(z - z̄)² =

n =       S² = Σx²/n                   n =        Sz² = Σ(z - z̄)²/n

X̄ = ΣX/n =       /                     z̄ = Σz/n =       /

S = √                                  Sz = √
The Normal Curve 69

Note how the generalizations on p. 55 of this workbook apply to this table.


The generalizations enable you to determine whether your calculations are pro¬
ducing reasonable values. Is your value for X̄, for example, somewhere between
the highest raw score and the lowest one? S² is also a mean and thus must fall
somewhere between the largest x² value and the smallest x² value. The mean and
the variance of the z scores can be checked in analogous ways.

If you'd like to do more problems of this kind, use the distributions on pp.
53 and 55 - 59 above.

SYMBOLISM DRILL

Yes, another one. Repetition is what makes a drill a drill and a good tactic
for learning the sort of material in the table below.

Symbol      Pronunciation         Meaning

    ___     _________             Number of scores in a population

    ___     _________             Number of scores in a sample

    ___     "eks" or "big eks"    A score, or the set of scores

    ___     "little eks"          X - μ or X - X̄; _________ score

    ___     _________             Result of summing quantities of some kind

    ___     "eks bar"             ΣX/n; the mean of a _________

    ___     "mew"                 ΣX/N; the mean of a _________

11  ___     _________             Σx²/N; _________ of a _________

12  ___     _________             √(Σx²/N); _________ of a _________

13  ___     _________             Σx²/n; _________ of a _________

14  ___     _________             √(Σx²/n); _________ of a _________

15  ___     "zee"                 x/σ or x/S; z score


CHAPTER 8

DERIVED SCORES

8.1 The Need for Derived Scores

8.2 Standard Scores: The z Score

8.3 Standard Scores: Other Varieties

8.4 Translating Raw Scores to Standard Scores

8.5 Standard Scores as Linear Transformations of Raw Scores

8.6 Centile Scores

8.7 Comparability of Scores

8.8 Normalized Standard Scores: T Scores and Stanines

8.9 Combining Measures from Different Distributions

PROBLEMS and EXERCISES

 1 _____     2 _____

 3 _____     4 _____

 5 _____     6 _____

 7 _____     8 _____

 9 _____    10 _____

11 _____    12 _____

13 _____    14 _____

71
72 Chapter 8

SUMMARY

The Need for Derived Scores

In psychological and educational measurement, a raw score, by itself, is

typically interpretable / uninterpretable [8.1, Paragraph 2]. To determine

whether a given score indicates a good performance or a poor one, then, a frame

of reference is needed. A distribution of scores obtained by a group with

known characteristics provides such a frame of reference. The group is called

a _ group, and the distribution of their scores on the test provides

the test _ [p. 124, footnote].

Types of Derived Scores

A number that indicates where a raw score stands in relation to other raw

scores is called a derived score. There are two major kinds of derived scores:

those like the z score that preserve the proportional relation of interscore

__, and those like the centile rank that do not [8.1, final para¬

graph].

The z and Other Standard Scores

As noted in Chapter 7, transforming all raw scores in a distribution to z

scores changes the mean to _ and the standard deviation to _, and

it also changes/but it does not change the shape of the distribution [8.2].

It is in this sense (leaving the shape unchanged) that z scores preserve the

proportional relation among the distances between the raw scores.

Chapter 7 used the term standard score to refer to z scores, but there are

other types of derived scores that are also standard. Transformation of an

entire distribution of raw scores to a given type of standard score yields a

distribution with a fixed mean and a fixed standard deviation; this is what is

"standard" about the derived score.

For example, in WWII the Army transformed raw scores on its General Classi¬

fication Test into a type of standard score with a mean of and a standard

deviation of _ [8.3, Table]. And as this workbook noted in the summary for

Chapter 7, raw scores on the Wechsler Intelligence Scale are transformed into

derived scores called IQs with a mean of and a standard deviation of


Derived Scores 73

while transformations of raw scores on the CEEB test originally had a mean of

_ and a standard deviation of [8.3, Table].

The fundamental characteristic of all of these types of standard scores is

that they, like the z score, locate a raw score by stating how many _

__ it lies above or below the _ of the distribution [8.3].

Another important property of these standard scores is that they, like the z

score, preserve the _ of the original distribution of raw scores [8.3].

All of the standard scores considered so far, z scores and the others, are

transformations of the raw scores, and in each case the transforma¬

tion can be expressed by the equation of a _ line [8.5]. A linear

transformation is any transformation of scores performed by adding, subtracting,

multiplying, or dividing by ________ [8.5, first sentence].
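Such a linear transformation can be sketched in a couple of lines of Python (my addition, not the text's; the function name is invented). The sketch converts a raw score to a z score and then rescales it; the target means and standard deviations are the CEEB figures given earlier and the T-score scale (mean 50, SD 10) discussed in Section 8.8:

```python
def to_standard(raw, mean, sd, new_mean, new_sd):
    # A linear transformation: express the raw score as a z score,
    # then rescale to the new mean and standard deviation.
    z = (raw - mean) / sd
    return new_mean + z * new_sd

# An IQ of 130 (mean 100, SD 15) lies two SDs above the mean, so:
print(to_standard(130, 100, 15, 500, 100))   # 700.0 on the CEEB scale
print(to_standard(130, 100, 15, 50, 10))     # 70.0 as a T score
```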

Centile Scores

Centile ranks (which this chapter occasionally calls centile scores) are

also derived scores. Like the standard scores described so far, the centile

rank of a raw score describes its_relative to other scores in a

distribution [8.6]. The centile rank does this by indicating what percentage

of the scores fall below it. Centile ranks have a major disadvantage, however:

Changes in raw scores are / are not ordinarily reflected by proportionate

changes in centile rank [8.6]. When one centile rank is higher than another,

the corresponding raw score of the one is higher than that of the other, but we

do not know by how much. Changes in raw score are accompanied by proportionate

changes in centile rank only when the shape of the distribution of scores is

[8.6], as in the bottom half of Figure 8.3.
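The definition of a centile rank can be put in code as well. This Python sketch is my own (the data and function name are invented), and it ignores the usual refinement of counting half of the scores tied with the given score:

```python
def centile_rank(score, distribution):
    # Percentage of the scores in the distribution that fall below the
    # given score (simplified: tied scores are not split).
    below = sum(1 for x in distribution if x < score)
    return 100 * below / len(distribution)

data = [2, 4, 4, 5, 7, 7, 9, 10]    # hypothetical raw scores
print(centile_rank(7, data))        # 50.0: half the scores fall below 7
print(centile_rank(10, data))       # 87.5
```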

Comparison of Standard Scores from Different Distributions

Under some circumstances, a standard score from one distribution may be

meaningfully compared with a standard score of the same type from another dis¬

tribution. One element necessary for appropriate comparison is that the refer¬

ence groups (norm groups) used to generate the standard scores are _

(that is, similar) [8.7]. Furthermore, standard scores should be used for com¬

paring scores from two different distributions only if the two distributions
74 Chapter 8

have roughly the same _ [8.7]. If the distributions have different

shapes, there are two possible solutions. One possibility is to use

_; they are independent of the shape of the original distributions

[8.7, p. 134]. Another solution is to convert the distributions of raw scores

to distributions of derived scores that have identical _, as well as

identical _ and _ [8.7]. Such transformations

are considered immediately below.

Normalized Standard Scores

Transformations that yield "normalized standard scores" are transformations

that produce a standard mean and a standard standard deviation, but in addition

the process of transformation alters the shape of the original distribution of

raw scores so that it follows the _ curve [8.8, Sentence 2]. Trans¬

formation of raw scores to T scores, for example, produces a distribution with

a mean of _, a standard deviation of _, and one that is normally distri¬

buted [8.8]. Transforming raw scores to _________ produces a distribution

with a mean of 5, a standard deviation almost 2, and again a shape that is nor¬

mal [8.8, Paragraph 3].

Combining Measures from Different Distributions

Suppose an instructor has scores for students on a quiz, a midterm exam, and

a final. To compute the final grades, the instructor might simply sum each of

the three scores for each student and then decide which sums are worth an A,

which sums are worth a B, and so on. A naive person would think that this pro¬

cedure of simply summing the three scores makes them count equally in determin¬

ing the total and thus the final grade. When several scores are summed, each

one does / does not, however, count equally in determining the total [8.9, Par¬

agraph 2]. If the several scores are independent (i.e., if the size of a per¬

son's score on one variable is / is not predictive of the size of that person's

score on another variable), then the contribution of each score to the total is

proportional to the magnitude of the _ of the distri¬

bution from which it came [8.9].

This situation can be rectified by assuring that all of the test distribu¬

tions from which scores are to be added to form the composite have the same
Derived Scores 75

, and one way to accomplish this is to transform

scores from the several distributions to __ scores [8.9]. The

transformed scores may then be summed if equal weight is desired, or multiplied

by desired weights and summed.

This is not the end of the problem, however, because it will work only if

the several scores are independent, and usually they are not. The lack of in¬

dependence is not a problem when only _ measures are to be combined, but

it is a problem when there are more than _ [8.9, p. 138]. With more than

, the procedure recommended above ensures / does not ensure that the

weights assigned will result in the intended relative importance of the contri¬

bution of each to the whole. Nevertheless, it is better to follow the procedure

outlined above than to allow the scores to be weighted by the amount of _

inherent in the distribution of each [8.9].
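To see concretely why raw scores are weighted by their variability, and how standardizing first equalizes the contributions, consider this Python sketch (my own illustration; the data are invented):

```python
# Two hypothetical tests for four students: a quiz with little spread
# and a final with a lot of spread.
def mean(v):
    return sum(v) / len(v)

def sd(v):
    m = mean(v)
    return (sum((x - m) ** 2 for x in v) / len(v)) ** 0.5

def z_scores(v):
    m, s = mean(v), sd(v)
    return [(x - m) / s for x in v]

quiz = [8, 10, 12, 10]       # SD about 1.4
final = [40, 80, 60, 60]     # SD about 14.1: ten times the spread

# Summed as raw scores, the final dominates the composite.  Summed as
# z scores, each test contributes with the same standard deviation (1).
raw_totals = [q + f for q, f in zip(quiz, final)]
z_totals = [q + f for q, f in zip(z_scores(quiz), z_scores(final))]
print(raw_totals)
print([round(t, 2) for t in z_totals])
```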

EXERCISES

To practice locating scores relative to the mean of their distribution,


using their standard deviation as the unit of distance, fill in the missing
values in the table below. Note that the four scores on a line would be truly
equivalent only if the distributions from which they come have similar shapes.
You should be able to do most of the calculations for the first four columns
in your head.

TABLE OF EQUIVALENT SCORES

          Score where    Score where     Score where    Centile Rank if
z Score   μ=100, σ=15    μ=500, σ=100    μ=50, σ=10     Shape is Normal

  +3          145

                                                             97.7

                              650

                                                             50.0

  -1

                              250

                                              20
76 Chapter 8

MAP of DERIVED SCORES

RAW SCORE

    is typically uninterpretable by itself in psychology or education

    is given meaning by a DERIVED SCORE

DERIVED SCORE

    indicates where the raw score stands in relation to Other Raw Scores,
    which constitute the Test Norms

    may be the kind that preserves the shape of the distribution of raw
    scores and thus preserves the proportional relation of interscore
    distances, or the kind that does not preserve the shape of the
    distribution of raw scores and thus alters the proportional relation
    of interscore distances

*These figures hold only for the original norm group. The CEEB scores
earned by high-school seniors in recent years have had lower means and
varying standard deviations.

SYMBOLISM DRILL

     Symbol   Pronunciation   Meaning

 2   N                        Number of scores in a population

 1   n                        Number of scores in a _____

 3   X                        A score, or the set of _____ scores

 9   x                        X − μ or X − X̄; _____ score

 4   Σ                        _____

 6   μ                        Σ__/__; the mean of a _____

 5   X̄                       Σ__/__; the mean of a _____

11   σ²                       Σ__/N; _____ of a _____

12   σ                        √(Σ__/N); _____ of a _____

13   S²                       Σ__/n; _____ of a _____

14   S                        √(Σ__/n); _____ of a _____

15   z                        __/σ or __/S; _____ score

Statistics in Action
Defining Mental Retardation

A psychologist who has conducted important research on mental retardation
recently discussed the issue of how to define retardation. Writing in the
journal Science, Edward Zigler of Yale University notes that in 1959, the Amer-
ican Association on Mental Deficiency, the leading professional organization
concerned with the retarded in America, defined the retarded as those with an
IQ score one or more standard deviations below the mean.

_____ 1. What IQ score is one standard deviation below the mean? (The IQ
tests for children have a standard deviation that is the same as
that for the WAIS or very close to it.)

_____ 2. What percentage of the population was defined as retarded by this
criterion?

But many professionals now accept an IQ of 70 as the dividing line, Zigler says.

_____ 3. How many standard deviations below the mean does an IQ of 70 lie?

_____ 4. By this criterion, what percentage of the population is retarded?

Zigler argues that retardation is a "two-tiered phenomenon" in which the
"mildly retarded," those with IQ between 50 and 70, must be distinguished from
the "more severely retarded," those with IQ less than 50. "Statements or views
concerning one tier are pretty much inapplicable to the other," Zigler believes.

_____ 5. How many standard deviations below the mean does an IQ of 50 lie?

_____ 6. What percentage of the population is "mildly retarded" according to
Zigler's definition?

_____ 7. What percentage of the population is "severely retarded"?

Zigler suggests that people with IQ between 50 and 70 who are currently
called retarded merely represent "the lower portion of the normal distribution
of intelligence" and are "thus an integral part of the normal population." In
this conception, these people are like those whom we regard as short. Most of
the people in each classification came to be what they are through the usual gene-
tic and environmental processes, which produce a normal distribution of IQ scores
in the one case and a normal distribution of heights in the other, both of them
just naturally including some scores well below the mean. The truly retarded,
in Zigler's view, are like the dwarfs and midgets who fall far below the mean
in height as a result of processes that are clearly abnormal. Unfortunately,
he notes, there is no emotionally neutral word comparable to "short" to describe
the lower end of the naturally occurring distribution of IQ scores. The word
"retarded" is pejorative, and to apply it to all people with an IQ between 50
and 70 is unfair and misleading, Zigler believes, just as it is unfair and mis-
leading to refer to all adults who are, say, five-two or less, as dwarfs or
midgets.

_____ 8. What proportion of the young adult men in America are below five
feet, two inches in height? (See the table on p. 27 of this book.)

_____ 9. What proportion of the young women are below five-two?

IQ is not the only criterion that is used in diagnosing retardation. The
person's social competence and the age at which the abnormalities began are also
taken into consideration by many professionals. Social competence, or the abil-
ity to meet the demands of everyday living, is not adequately defined by an IQ
score, Zigler says, and the exact relation between intelligence and social com-
petence is unclear. There is great need for a measure of social competence
that can be used throughout the lifespan as IQ tests can.

Source: Edward Zigler, review of The Mentally Retarded and Society, Science,
1977, 196, 1192-1194.
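One way to check your answers to questions like those above is to compute normal-curve areas directly. The sketch below assumes IQ scores are normally distributed with mean 100 and standard deviation 15, the convention the passage relies on; the helper name is mine.

```python
import math

def pct_below(iq, mu=100.0, sd=15.0):
    """Percentage of a normal(mu, sd) population scoring below the given IQ."""
    z = (iq - mu) / sd
    return 100 * 0.5 * (1 + math.erf(z / math.sqrt(2)))

print(round(pct_below(85), 1))   # ~15.9% fall one SD or more below the mean
print(round(pct_below(70), 1))   # ~2.3% fall below IQ 70
print(round(pct_below(50), 3))   # ~0.043% fall below IQ 50
```

Subtracting the last two figures gives the share of the "mildly retarded" band (IQ 50 to 70) under Zigler's two-tier scheme.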


CHAPTER 9
CORRELATION

9.1 Measurement of Association and


Prediction

9.2 Some Historical Matters

9.3 Association: A Matter of Degree

9.4 A Measure of Correlation

9.5 Computation of r: Raw Score Method

9.6 Score Transformations and the


Correlation Coefficient

9.7 The Scatter Diagram

9.8 Constructing a Scatter Diagram

9.9 The Correlation Coefficient: Some


Cautions and a Preview of Some As¬
pects of Its Interpretation

9.10 Other Ways of Measuring Association

PROBLEMS and EXERCISES


SUMMARY

In the cases considered thus far in your study of statistics, there has
been only one score for a given subject, a score indicating the value of just
one variable (the subject's height, for example). When two scores are avail-
able for each subject, one score for each of two variables (the subject's height
and the subject's weight, say), it is possible to determine the correlation
between the two variables for the subjects on hand. Except in the final sec-
tion, Chapter 9 is concerned with the correlation between variables that are
each quantitative and continuous (such as height and weight).

Correlation: A Matter of Direction

There is no correlation between two continuous quantitative variables if

high scores on one variable tend to be associated with both high and low

scores on the other variable, and if low scores on the one variable also tend

to be associated with both high and low scores on the other. Diagram 6 of

Figure 9.1 on p. 145 illustrates a case of a correlation very close to zero.

Such a graph is called a _____ [Figure 9.1, Caption].

If there is a correlation between two continuous quantitative variables,

it will be either positive or negative. In the case of a positive correlation,
high scores on one variable tend to be associated only with high scores on the
other variable, while low scores on the one variable tend to be associated
only with low scores on the other. There is a positive correlation between
height and weight among human beings, in that people with a high score on
height (tall people) tend to have a high score on weight (tend to be heavy),
while people with a low score on height (short people) tend to have a low score
on weight (tend to be light). Positive correlations are illustrated in Dia-
grams 1 - 5 of Figure 9.1. Note that in these diagrams the data points fall in
a swarm running from lower left to upper right.

In the case of a negative correlation, high scores on one variable tend to

be associated only with low scores on the other variable, and low scores on
the one tend to be associated only with high scores on the other. There is a
negative correlation between height and normal sleeping time per night among
human beings, in that people with a high score on height (adults) tend to have
a low score on normal sleeping time (tend to sleep relatively few hours), while
people with a low score on height (children) tend to have a high score on nor-
mal sleeping time (tend to sleep relatively many hours). Negative correlations
are illustrated in Diagrams 7 and 8 of Figure 9.1. Note that in these diagrams
the data points fall in a swarm running from upper left to lower right.

Correlation: A Matter of Degree

The distinction between a positive and a negative correlation is a distinc-
tion in the direction of the correlation. Correlations also vary in degree.

In the case of a perfect correlation, all data points in a diagram like those
in Figure 9.1 fall on a straight line. (If the direction is positive, the line
slopes from lower left to upper right; if the direction is negative, the line
slopes the other way.) In the case of less than perfect correlations, the data
points swarm more or less closely about a straight line; the farther away, the
lower the degree of the correlation (whether the direction is positive or
negative).

Pearson's r as a Measure of Correlation

The direction and the degree of correlation between two continuous quantita-

tive variables is indicated by a number called the Pearson product-moment corre-

lation coefficient. The Greek letter ρ, pronounced "_____," stands for the

population value, and _____ stands for a sample value [9.3]. In this chapter

the symbol r is used consistently, but the principles and procedures for

calculation described here apply equally to samples and to populations [9.3].

The sign of the coefficient may be positive or negative. A positive value

of r indicates a positive correlation, which, as noted above, occurs when there

is a tendency for high values of one variable (X) to be associated with high /

low values of the other variable (Y), and low values of the one to be associ-

ated with high / low values of the other [9.3, Paragraph 2]. A negative value

of r indicates a negative correlation, which occurs when high values of X are

associated with high / low values of Y, and vice versa [9.3]. The sign of the

coefficient indicates the direction / degree of the association [9.3].

When no relationship exists, its value is _____ [9.3]. When a perfect

relationship exists, its value is _____, plus or minus [9.3].

degree of relationship is represented by an intermediate value of r. Note that

when r = +1.00 or -1.00, every point lies close to / exactly on a straight

line [9.3, p. 145]. This means that if we know the value of X, we can predict

the value of Y with only a little / without any error [9.3].

Pearson's r is defined in the text in terms of deviation scores:

    r = Σxy / (n · Sx · Sy)        [Formula 9.1]

In this formula, x = (X − X̄), y = (Y − Ȳ), n = the number of pairs of scores,
Σxy is the sum of the products of the paired deviation scores, and Sx, Sy
are the _____ of the two distributions [9.4].

To understand how this formula works, visualize a scatter diagram like

Figure 9.2. This diagram is divided into four quadrants by two lines, one

located at the _____ of X and one at the _____ of Y [9.4, p. 147].

Points located to the right of the vertical line are therefore characterized

by positive / negative values of x and those to the left by positive / negative

values of x [9.4]. Those points lying above / below the horizontal line are

characterized by positive values of y, and those above / below by negative

values of y [9.4].

For any point, the xy product may be positive or negative,

depending on the sign of x and the sign of y. The xy products will be positive

for points falling in quadrants _____ and _____ and will be negative for points

falling in quadrants _____ and _____ [9.4].

On examination of the deviation score formula for r, it is apparent that

the _____ of Σxy determines whether the coefficient will

be negative, zero, or positive [9.4]. The correlation coefficient will be

_____ when the sum of the negative xy products from quadrants II and IV

equals the sum of the positive products from quadrants I and III; the coeffic-

ient will be negative when the contributions from quadrants _____ and _____ exceed

those from quadrants _____ and _____; and the coefficient will be positive when the

reverse is true [9.4].

reverse is true [9.4]. The greater the predominance of the sum of products

bearing one sign over those bearing the other, the greater the magnitude /

direction of the coefficient [9.4].
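The deviation-score computation can be sketched in a few lines of Python. The function name and sample data below are mine, and S is computed with n in the denominator, as in the symbolism drills.

```python
import math

def pearson_r(X, Y):
    """Pearson r via the deviation-score formula: r = sum(xy) / (n * Sx * Sy)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    xs = [v - mx for v in X]          # deviation scores x = X - Xbar
    ys = [v - my for v in Y]          # deviation scores y = Y - Ybar
    sx = math.sqrt(sum(v * v for v in xs) / n)
    sy = math.sqrt(sum(v * v for v in ys) / n)
    return sum(a * b for a, b in zip(xs, ys)) / (n * sx * sy)

print(round(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]), 2))   # 1.0: perfect positive
print(round(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]), 2))   # -1.0: perfect negative
```

Notice how the sign of Σxy (the numerator) alone decides the sign of r, just as the quadrant argument above says.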

Effects of Transforming Scores on the Correlation Coefficient

How will the correlation be affected if a constant is added to or subtracted

from each X score before obtaining the correlation between that variable and Y?

What is the effect of multiplying or dividing X by a constant? The correlation

between the altered variable and the remaining variable changes / remains just

as it was [9.6]. It is also possible to transform X in one way and Y in another,

and the value of the coefficient changes / remains unaltered [9.6]. As long

as the translation of either (or both) variable(s) is a _____ one, the

correlation will be unaltered [9.6].
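This invariance is easy to demonstrate numerically. The sketch below (invented data, function name mine) computes r before and after transforming X and Y; multiplying by a negative constant would flip the sign of r, so the demonstration uses positive constants.

```python
import math

def r(X, Y):
    """Pearson r from raw scores (sums of deviation products and squares)."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(X, Y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in X))
    sy = math.sqrt(sum((b - my) ** 2 for b in Y))
    return sxy / (sx * sy)

X = [2, 5, 6, 9, 13]
Y = [1, 4, 3, 8, 9]
base = r(X, Y)
shifted = r([x + 100 for x in X], Y)                 # add a constant to every X
scaled = r([3 * x for x in X], [y / 2 for y in Y])   # multiply X and divide Y by constants
print(round(base, 4), round(shifted, 4), round(scaled, 4))  # all three agree
```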



Cautions Concerning the Correlation Coefficient

1. Does a correlation coefficient of, say, +.50 mean that there is 50%

association between the two variables? The degree of association is / is not

ordinarily interpretable in direct proportion to the magnitude of the coeffic-

ient [9.9].

2. If two variables are substantially correlated, must one be, at least in

part, the cause of the other? This is / is not so. Mere association is

insufficient / sufficient to claim a causal relation between two variables [9.9].

3. The strength of the association between two variables depends, among

other things, on the nature of the measurement of the two variables as well as

on the kind of subjects studied. It is / is not possible, then, to speak of

the correlation between two variables without taking these factors into con-

sideration [9.9].

4. Pearsonian correlation is based on the _____ line of best fit

to the bivariate distribution. Although a _____ line is reasonably

considered to be the line of best fit in many situations, sometimes it is not.

When it is not, the strength of association is likely to be underestimated /

overestimated / accurately estimated by Pearson r [9.9].

5. The correlation coefficient is affected by the range of talent, or vari-

ability, characterizing the measurements of the two variables. In general, the

smaller the range of talent in X and/or Y, the higher / lower the correlation

coefficient, other things being equal [9.9].

6. The correlation coefficient is / is not subject to sampling variation

[9.9]. Depending on the characteristics of a particular sample, the obtained

coefficient may be higher or lower than it would be in a different sample from

the same population.

EXERCISES

For each case described below, use your common sense and your new understanding
of correlation to determine whether the correlation between the two

variables is zero, positive, or negative; and if positive or negative, whether


the degree of correlation is low, medium, high, or perfect.

_____ 1. Among the world's population of human beings with both
legs intact, what is the correlation between the length of a person's left leg
and the length of his or her right leg?

_____ 2. Among grade-school children, what is the correlation
between the length of a child's nose and the number of words in his or her
spoken vocabulary? Remember that the children come from grades one through six.

_____ 3. Among teenagers, what is the correlation between the
number of freckles on a person's face and the person's IQ?

_____ 4. Among married couples in North America, what is the
correlation between the age of the husband and the age of the wife?

_____ 5. Playboy magazine, following a suggestion offered by Plato
2300 years ago, once proposed that men should marry about the age of 30 and
choose a bride about the age of 20. Suppose that North America follows this
advice for 100 years. After the 100 years, married couples are surveyed; some
are relatively young newly-weds, and some are just celebrating their golden
anniversary. Among these couples, what would be the correlation between the
age of the husband and the age of the wife?

_____ 6. What is the correlation between the height of the New
York Yankees baseball players and the height of the New York Jets football
players?

SYMBOLISM DRILL

     Symbol   Pronunciation        Meaning

 1   _____    "little en"          Number of scores in a _____

 2   _____    "big en"             _____

 3   _____                         A raw score, or the set of raw scores

 9   _____    "little eks"         __ − μ or __ − X̄; _____ score

 4   _____    "the sum of"         _____

 6   _____                         ΣX/N; the mean of a _____

 5   _____                         ΣX/n; the mean of a _____

11   _____                         Σx²/N; _____ of a _____

13   _____                         Σx²/n; _____ of a _____

12   _____                         √(Σx²/N); _____ of a _____

14   _____                         √(Σx²/n); _____ of a _____

17   _____    "rho"                Pearson correlation coefficient for a pop'n

16   _____    "ar"                 Pearson correlation coefficient for a sample

15   _____                         x/σ or x/S; _____ score



MAP of CONCEPTS in CHAPTER 9

A Bivariate Distribution consists of pairs of values of X and Y and may be
graphed as a Scatter Diagram. The diagram may reveal:

- Association between high scores on X and high scores on Y, and between low
  scores on X and low scores on Y, which is described as a Positive
  correlation between X and Y;

- No tendency for high scores on X to be associated with either high or low
  scores on Y, and no tendency for low scores on X to be associated with
  either high or low scores on Y, which is described as a Zero correlation
  between X and Y;

- Association between high scores on X and low scores on Y, and between low
  scores on X and high scores on Y, which is described as a Negative
  correlation between X and Y.

A positive or negative correlation permits prediction of standing in Y from
standing in X with better than chance accuracy.

These three cases correspond to positive, zero, and negative values of the
Pearson Correlation Coefficient, which is symbolized ρ for a population and
r for a sample. The coefficient is appropriate as a measure of correlation
only if the relation between X and Y is linear. It indicates by its magnitude
(ignoring its sign) the strength of the linear association between X and Y,
which is shown graphically by the extent to which scores cluster around the
straight line of best fit on the scatter diagram.
CHAPTER 10

FACTORS INFLUENCING THE CORRELATION


COEFFICIENT

10.1 Correlation and Causation

10.2 Linearity of Regression

10.3 Homoscedasticity

10.4 The Correlation Coefficient in


Discontinuous Distributions

10.5 Shape of the Distributions and the


Correlation Coefficient

10.6 Random Variation and the Correlation


Coefficient

10.7 The Correlation Coefficient and the


Circumstances under Which It Was
Obtained

10.8 Other Factors Influencing the


Correlation Coefficient

PROBLEMS and EXERCISES

1 _____

2 _____

3 _____

4 _____

5 _____

6 _____

7 _____


SUMMARY

Correlation and Causation

The fact that X and Y vary together is a necessary and also / but not a

sufficient condition for one to make a statement about a causal relationship

between the two variables [10.1]. That is, evidence that two variables vary

together is / is not necessarily evidence of causation [10.1]. Figure 10.1

shows four of the possibilities that may occur when two variables are correla-

ted, and in only two of these cases (the first and second) does one of the

variables have a causal effect on the other.

Linearity of Regression

In scatter diagrams such as those shown on pp. 144-145 and 165 of the text,

it is helpful to fit a straight line to the swarm of data points. (The next

chapter tells how to do this precisely.) In general, the more closely the

scores hug the straight line of best fit, the higher / lower the value of r

[10.2]. When r is 0, the scores scatter as widely about the line as possible,

and when r is _____ (plus or minus), the scores hug the line as closely as possi-

ble, since they all fall exactly on the line [10.2]. One meaning of this prin-

ciple is that prediction of Y from knowledge of X can be made with greater accu-

racy when the correlation is high / low than when it is high / low [10.2].

But in a given set of data, a straight line may or may not reasonably

describe the relationship between the two variables. When a straight line is

appropriate, X and Y are said to be _____ related [10.2]. More form-

ally, the data are said to exhibit the property of linearity of _____

[10.2].
What happens when X and Y are not linearly related? When the correlation

is other than zero and the relationship is nonlinear, as in Figure 10.3, Pearson

r will underestimate / overestimate / correctly estimate the degree of associa-

tion [10.2].

Homoscedasticity

When the variability in Y is the same throughout the values of X, as in

the left side of Figure 10.4, the bivariate distribution is said to exhibit the

property of homoscedasticity. Since r is a function of the degree to which the

points hug the straight line of best fit, the value obtained for it in such a

case has a meaning of general significance: r describes the closeness of associ-

ation of X with Y irrespective of the specific value of _____ (or _____) [10.3].

When homoscedasticity does not obtain, as on the right side of Figure 10.4, r

will reflect the average degree to which the scores hug the line, but this aver-

age will properly characterize the degree of relationship for only some values

of X and Y. Thus the meaning to be attached to r will / will not depend on

whether the hypothesis of homoscedasticity is appropriate to the data [10.3].

Discontinuous Distributions

What will happen if distributions that normally would be continuous have been

rendered discontinuous because of the exclusion of cases with intermediate values

on one variable? A sample constituted in this manner will / will not yield a

correlation coefficient different from what would be obtained in drawing a sample

so that all elements of the population had an opportunity to be selected [10.4].

Usually, discontinuity results in a coefficient higher than / the same as / lower

than otherwise [10.4].

Shape of the Distributions

If the correlation coefficient is to be calculated purely as a descriptive

measure, there is a / no requirement that the distributions be normal in shape

[10.5]. Obtaining the coefficient is often only the first step in analysis,

though. When additional steps are undertaken, the assumption frequently must

be made that X and Y are _____ distributed [10.5].

Random Variation

When a correlation coefficient has been obtained from a particular set of

observations, it represents / does not represent the correlation between the

two variables; another sample will yield the same / a somewhat different value

[10.6]. In general, large / small samples yield values of r that are similar

from sample to sample, and thus the value obtained from a large / small sample

will probably be close to the population value [10.6]. For very large / small

samples, r is quite unstable from sample to sample, and its value can / cannot

be depended upon to lie close to the population value [10.6].



The Circumstances Under Which the Correlation Coefficient Was Obtained

The influence of random sampling variation is the only / but one reason

why the obtained correlation coefficient is not the coefficient between the two

variables under study [10.7]. The degree of association between two variables

depends on (1) how the two variables were measured, (2) who the subjects were,

and (3) under what circumstances the variables operate. If any of these fac¬

tors changes, the extent of the assocation may also change. Consequently, it

is of utmost importance that a correlation coefficient be interpreted in the

light of the particular conditions by which it was obtained.

SYMBOLISM DRILL

     Symbol   Pronunciation   Meaning

 2   N                        _____

 1   n                        _____

 3   X                        _____

 9   x                        X − __ or X − __; _____ score

 4   Σ                        _____

 6   μ                        Σ__/__; the _____ of a _____

 5   X̄                       Σ__/__; the _____ of a _____

14   S                        √(Σ__/__); _____ of a _____

11   σ²                       Σ__/__; _____ of a _____

13   S²                       Σ__/__; _____ of a _____

12   σ                        √(Σ__/__); _____ of a _____

15   z                        __/σ or __/S; _____ score

16   r                        Pearson cor'n coef't for a _____

17   ρ                        Pearson cor'n coef't for a _____
CHAPTER 11

REGRESSION AND PREDICTION


11.1 The Problem of Prediction

__ 11.2 The Criterion of Best Fit

11.3 The Regression Equation: Standard


Score Form

11.4 The Regression Equation: Raw Score


Form

11.5 Error of Prediction: The Standard


Error of Estimate

__ 11.6 An Alternate (and Preferred)

Formula for S_YX

__ 11.7 Error in Estimating Y from X

11.8 Cautions Concerning Estimation of


Predictive Error

PROBLEMS and EXERCISES

1 _____

2 _____

3 _____

4 _____

5 _____

6 _____

7 _____

8 _____

9 _____


SUMMARY

When the correlation coefficient for two variables is zero, knowledge of X

is of some / no help in predicting Y [11.1, Paragraph 1]. When the coefficient

is other than zero, whether positive or negative, knowledge of X permits pre-

diction of Y with better than chance accuracy, and if the coefficient is ±1, we

can predict Y with almost perfect / perfect accuracy [11.1]. The problem of

prediction is thus closely related to the topic of correlation.

In the situations considered in this chapter, a linear (straight-line) rela-

tionship between X and Y is a reasonable assumption, and the correlation coeffic-

ient called r is thus a good measure of the correlation between X and Y. The

chapter shows how to use the value of r and certain other facts to find the

straight line of best fit to the Y values. Such a line is called a _____

line [11.1, Paragraph 4], and it will correspond to an equation called a regress-

ion equation. This line or its equation is used to predict the value of Y for

a given value of X.

The Criterion of Best Fit

How shall we judge which of all possible straight lines is the one that

best fits the values of Y on hand and permits the best prediction of unknown Y

values? The criterion in use was first proposed by Karl Pearson, the person

who invented the correlation coefficient called r. Karl Pearson's solution to

this problem was to apply the _____ criterion [11.2]. The

_____ criterion calls for the straight line to be laid down in

such a manner that the _____ of the squares of the discrepancies between the

actual and the predicted values of Y is as small as possible [11.2]. One impor-

tant property of the least-squares solution is that the location of the regress-

ion line and the value of the correlation coefficient will fluctuate less / more

under the influence of random sampling than would occur if another criterion

were used [11.2].

The Regression Line as a Mean

The regression line is a "running mean," a line that tells us the mean, or

expected value of _____, for a particular value of _____ [11.2, p. 180]. Ȳ is the

mean of all Y values in the set, whereas Y' ("Y prime," Y predicted from the

regression line) is an estimate of the mean of _____ given the condition that

_____ has a particular value [11.2, p. 180].

The Regression Equation

The straight line of best fit to the Y values, which is called the regress-

ion line and is used for predicting unknown Y values from particular values of

X, corresponds to an equation that takes a particular value of X and predicts

a value of Y. Such an equation is called a regression equation; strictly speak-

ing, it is the equation for the regression of _____ on _____ [Equation 11.1 or

11.2]. The equation may be written in terms of standard scores (z scores) or

raw scores.

In terms of standard scores, the equation is very simple:

    z'_Y = _____        [Equation 11.1]

where z'_Y is the predicted standard score value of _____ [11.3]. This form of

the equation makes it easy to see how two important generalizations are true.

First, suppose the value from which prediction was made was the mean of X.

Since the z-score equivalent of the mean is _____, the predicted standard score

value of Y is zero, or in other words, the _____ of Y [11.3]. This prediction

will be the same, irrespective of the value of r / hold only for certain values

of r [11.3]. Second, if r = 0, then the predicted standard score value of Y

will always be _____ [11.3]. In raw score terms, if the correlation is zero,

the predicted value of Y is the _____ of Y no matter what value of X is used

to predict Y [11.3].

In terms of raw scores, the regression equation is complicated:

    Y' = ( _____ ) X − ( _____ ) X̄ + Ȳ        [Equation 11.2]

Note, however, that the expression inside the first pair of parentheses is the

same as that inside the second pair.
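A sketch of raw-score prediction, with the slope written as r(S_Y/S_X); the data and function name below are mine, for illustration only.

```python
import math

def regression_predict(X, Y, x_new):
    """Predict Y' for x_new with the raw-score regression equation:
    Y' = r * (Sy / Sx) * (x_new - Xbar) + Ybar."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in X) / n)
    sy = math.sqrt(sum((v - my) ** 2 for v in Y) / n)
    r = sum((a - mx) * (b - my) for a, b in zip(X, Y)) / (n * sx * sy)
    return r * (sy / sx) * (x_new - mx) + my

X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
print(regression_predict(X, Y, 3))  # at X = Xbar the prediction is Ybar = 4.0
```

This confirms the standard-score generalization above: predicting from the mean of X always yields the mean of Y, whatever the value of r.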

Measuring Errors of Prediction

A value of Y predicted from a given value of X is only an estimate of the

mean value of Y for persons with the given score on X. If the correlation

between X and Y is low / high, considerable variation of actual values about

the predicted value may be expected [11.5]. If the correlation is low / high,

the actual values will cluster more closely about the predicted value [11.5].

Only when the correlation is _____ will the actual values regularly and

precisely equal the predicted values [11.5].

The magnitude of the errors of prediction is measured by a quantity called

the standard error of estimate of Y on X, which is symbolized _____ [Equation

11.3]. The standard error of estimate is a kind of _____;

it is the _____ of the distribution of obtained Y scores

about the predicted Y score [11.5]. The value of S_YX ranges from _____ when

the correlation is perfect to S_Y when there is no correlation at all [11.5, p.

186]. The simplest formula for S_YX expresses it in terms of r and S_Y:

    S_YX = _____        [Equation 11.4]
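Minium's Equation 11.4 expresses S_YX in terms of r and S_Y; the sketch below uses the standard form S_YX = S_Y·√(1 − r²) (a check on your fill-in answer) and verifies the two endpoints just described.

```python
import math

def std_error_of_estimate(r, sy):
    """Standard error of estimate of Y on X: S_YX = S_Y * sqrt(1 - r^2)."""
    return sy * math.sqrt(1 - r ** 2)

sy = 10.0
print(std_error_of_estimate(1.0, sy))            # 0.0: perfect correlation, no error
print(std_error_of_estimate(0.0, sy))            # 10.0: no correlation, S_YX = S_Y
print(round(std_error_of_estimate(0.5, sy), 2))  # 8.66: r = .50 reduces error only modestly
```

The third line is worth pondering: even a correlation of .50 shrinks the standard error of prediction by less than 14 percent.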

Predicting a Distribution of Y Values

With the help of S_YX one can predict not only the mean score on Y for cases

with a given score on X (which is what Y' is), but also the entire distribution

of Y scores for cases with that score on X. The procedure for doing so is

described in Section 11.7. Correct application of this procedure requires that

several assumptions be satisfied. First, since the regression equation is used

to obtain the predicted value of Y, we must assume that a _____ line

is the line of best fit, or this predicted value may be too high or too low

[11.8]. Second, S_YX is taken as the standard deviation of the distribution of

obtained Y scores about Y', irrespective of the value of _____ from which the

prediction has been made [11.8]. It is therefore necessary to assume that

variability is the same from column to column (the assumption of _____)

[11.8]. Third, the procedure depends on the assumption that the

distribution of obtained Y scores (for a particular value of X) is _____

[11.8].

EXERCISES

To understand what you are doing in using the regression equation to predict
a Y score given an X score, and to check to see if the prediction you are making
is a reasonable one, it is helpful to think: Where is X in relation to its mean
X̄? And where is Y predicted to be in relation to its mean Ȳ? Where Y is pre-
dicted to be in relation to Ȳ depends on a) where X is in relation to X̄, and on
b) whether r is positive, zero, or negative. You can figure out for yourself
exactly how this works by using the standard-score form of the regression equa-
tion (see p. 118).

For example, suppose you want to predict the score on Y for a case whose X
value is above the mean of X. Suppose further that the correlation between X
and Y is positive. The regression equation says that z'_Y = r·z_X, and here r will
be some positive number, while z_X will also be positive, because a raw score
above its mean has a positive z score. The product of two positive numbers is
also positive; thus the equation predicts that this case will have a positive
z score on variable Y. A positive z score indicates a raw score above the mean
of Y. Thus if X > X̄ and if r is positive, Y' > Ȳ.

This generalization is indicated in the upper-left corner of the table below.

Using reasoning like that illustrated above, fill in the other cells of the table,
in each case putting the symbol >, =, or < between the Y' and the Ȳ. Remember:

A raw score above the mean has a positive z score.
A raw score at the mean has a z score of zero.
A raw score below the mean has a negative z score.

The product of two positive numbers is positive.
The product of two negative numbers is positive.
The product of a positive number and a negative number is negative.

            If r is positive   If r is zero   If r is negative

If X > X̄      Y' > Ȳ            Y' __ Ȳ        Y' __ Ȳ

If X = X̄      Y' __ Ȳ           Y' __ Ȳ        Y' __ Ȳ

If X < X̄      Y' __ Ȳ           Y' __ Ȳ        Y' __ Ȳ
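The sign logic behind this table can be made mechanical: in z'_Y = r·z_X, only the signs of r and z_X matter. The small sketch below (function name mine) prints the completed pattern, which you can use to check your entries.

```python
def predicted_side_of_mean(r_sign, x_side):
    """Return '>', '=', or '<' for Y' relative to Ybar, given the sign of r
    and the sign of (X - Xbar): in z'_Y = r * z_X, the product's sign decides."""
    product = r_sign * x_side
    return ">" if product > 0 else ("<" if product < 0 else "=")

# rows: X above / at / below its mean; columns: r positive / zero / negative
for x_side in (+1, 0, -1):
    print([predicted_side_of_mean(r_sign, x_side) for r_sign in (+1, 0, -1)])
# ['>', '=', '<']
# ['=', '=', '=']
# ['<', '=', '>']
```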

SYMBOLISM DRILL

     Symbol   Pronunciation        Meaning

 1   _____                         Number of scores in a sample

 2   _____                         Number of scores in a population

 3   _____                         A raw score, or the set of raw scores

 4   _____                         Result of summing quantities of some kind

 5   _____                         ΣX/n; the mean of a _____

 6   _____                         ΣX/N; the mean of a _____

 9   _____                         X − X̄ or X − μ; deviation score

11   _____                         Σx²/N; _____ of a _____

12   _____                         √(Σx²/N); _____ of a _____

13   _____                         Σx²/n; _____ of a _____

14   _____                         √(Σx²/n); _____ of a _____

15   _____                         x/σ or x/S; _____ score

16   _____                         Pearson correlation coefficient for a sample

17   _____                         Pearson correlation coefficient for a pop'n

18   _____    "wi prime"           Predicted raw score on Y

19   _____    "zee prime sub wi"   Predicted z score on Y

20   _____    "es sub wi eks"      Standard error of estimate of Y on X


CHAPTER 12

INTERPRETIVE ASPECTS
OF CORRELATION AND REGRESSION

12.1 Factors Influencing r: Range of Talent
12.2 Factors Influencing r: Heterogeneity of Samples
12.3 Interpretation of r: The Regression Equation (I)
12.4 Interpretation of r: The Regression Equation (II)
12.5 Regression Problems in Research
12.6 An Apparent Paradox in Regression
12.7 Interpretation of r: k, the Coefficient of Alienation
12.8 Interpretation of r: r², the Coefficient of Determination
12.9 Interpretation of r: Proportion of Correct Placements
12.10 Association and Prediction: Summary

PROBLEMS and EXERCISES

 3 ____    4 ____
 5 ____    6 ____
 7 ____    8 ____
 9 ____   10 ____
11 ____   12 ____
13 ____   14 ____
15 ____   16 ____
17 ____   18 ____
19 ____   20 ____
21 ____


SUMMARY

Chapter 12 offers a wealth of details that provide important insights into some subtle aspects of correlation and prediction.

Range of Talent

The variability among the scores on a given variable, as measured by the range or the standard deviation of the distribution, affects the value of r when r is used as an indicator of the correlation between this variable and any other. In a given situation, restriction of range may take place in X, in Y, or in both. The value of r will be smaller / larger in those situations in which the range of either X or Y (or both) is less, other things being equal [12.1]. This means that there is no such thing as the correlation between two variables, and that the value obtained must be interpreted in the light of the ________ of the two variables in the circumstances in which it was obtained [12.1]. Other things being equal, the greater the restriction of range in X and/or Y, the higher / lower the correlation coefficient [12.1].
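The effect of restriction of range can be seen in a small simulation. Everything below is invented for illustration (the 0.7 slope, the noise level, and the cutoff are arbitrary choices); r is computed from its defining formula:

```python
# Restricting the range of X lowers r, other things being equal.
import random

random.seed(1)

def pearson_r(pairs):
    """Pearson r from its deviation-score definition."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5

# A moderately strong linear relation plus noise.
data = [(x, 0.7 * x + random.gauss(0, 0.7))
        for x in (random.gauss(0, 1) for _ in range(2000))]

r_full = pearson_r(data)
r_restricted = pearson_r([(x, y) for x, y in data if x > 0.5])  # keep only high X
print(r_full, r_restricted)   # the restricted-range r comes out noticeably smaller
```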

Heterogeneity of Samples

On the left side of Figure 12.2 (p. 200) is a scatter diagram showing the correlation between a pair of variables in two different samples. Within each sample, the correlation is positive and moderately strong. If r is computed for the combined data, however, it will be smaller (though still positive). Why? When the data are pooled, the scores no longer hug the ________ line (which now must be a line lying amidst the two distributions) as closely as they do in either distribution considered by itself [12.2].

In the particular case illustrated, the distributions differed, between samples, in the mean of ____ but not in the mean of ____ [12.2]. Other types of differences are quite possible. The second illustration in Figure 12.2 shows a situation in which a second sample differs in that both the mean of X and the mean of Y are higher than in the first sample. In this case, the correlation will be greater / smaller among the pooled data than among the separate samples [12.2].
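Both cases in Figure 12.2 can be imitated with made-up samples. The numbers below are hypothetical, chosen only so that each sample by itself shows a moderate positive r:

```python
# Pooling two samples can lower or raise r, depending on how their means differ.
import random

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(2)
x1 = [random.gauss(0, 1) for _ in range(500)]
y1 = [0.6 * x + random.gauss(0, 0.8) for x in x1]   # sample 1: moderate positive r
r_within = pearson_r(x1, y1)

# First case: the second sample differs only in the mean of X.
x2 = [x + 4 for x in x1]
r_pooled_1 = pearson_r(x1 + x2, y1 + y1)            # pooled r drops

# Second case: the second sample is higher in BOTH means.
y2 = [y + 4 for y in y1]
r_pooled_2 = pearson_r(x1 + x2, y1 + y2)            # pooled r rises

print(r_within, r_pooled_1, r_pooled_2)
```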

The Role of r in the Regression Equation

The role of r in the regression equation is easiest to understand when the regression equation is cast in standard score form. In this case, the correlation coefficient is the ________ of the regression line [12.3]. One interpretation of the correlation coefficient, therefore, is that it states the amount of increase in ____ that accompanies unit increase in ____ when both measures are expressed in standard score units [12.3]. It indicates how much of a ________ ________ Y will increase for an increase of one standard deviation in X [12.3].

When the regression equation for predicting Y from X is stated in raw score form, its slope is r( ____ / ____ ) rather than r [12.3]. This expression is known as the regression ________ [12.3]. It states the amount of increase in ____ that accompanies unit increase in ____ when both measures are expressed in raw score terms [12.3].

Regression on the Mean

The correlation between intelligence of parents and intelligence of offspring has a value close to +.50. If parental intelligence is two standard deviations above the mean, the predicted intelligence of their offspring is ____ standard deviation above the mean [12.4]. On the other hand, if parents' intelligence is two standard deviations below the mean, the predicted intelligence of their offspring is only one standard deviation above / below the mean [12.4]. To put it in other words, bright parents will tend to have children who are brighter / duller than average, but not as bright as they, and dull parents will tend to have children who are dull / bright, but not as dull as their parents [12.4]. Remember that the predicted value is to be thought of as an average value.

This phenomenon is called regression on the mean. The fact of regression on the mean is, of course, characteristic of any relationship in which the correlation is less than perfect / zero [12.4]. The more extreme the value from which prediction is made, the greater / lesser the amount of regression toward the mean [12.4]. The higher the value of r, the less / greater the amount of regression [12.4].
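The arithmetic behind the parent-offspring example is just the standard-score regression equation with r = +.50 (a quick sketch, not from the text):

```python
# Regression on the mean: the predicted z score is always closer to zero
# than zX whenever the correlation is less than perfect.
def predicted_z(r, z_x):
    """Standard-score form of the regression equation: z'Y = r * zX."""
    return r * z_x

r = 0.50                          # approximate parent-offspring correlation
print(predicted_z(r, +2.0))       # parents 2 SDs above the mean -> prediction +1 SD
print(predicted_z(r, -2.0))       # parents 2 SDs below the mean -> prediction -1 SD
print(predicted_z(r, +3.0))       # more extreme parents -> more absolute regression
```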

Regression on the mean is frequently a problem in research. When subjects are selected because of their extreme position (either high or low) on one variable, we expect their position on a correlated variable to be in the same / opposite direction, but more / less extreme [12.5]. The two variables could be test and retest on the same measure (as in the excellent example offered in the first paragraph of Section 12.5), or they could be two different variables. Again, it should be noted that the amount of regression will depend on the size / direction of the correlation between the two variables [12.5].

If parents with extreme characteristics tend to have offspring with characteristics less extreme than themselves, how is it that, after a few generations, we do not find everybody at the center? The answer is that regression of predicted values toward the mean is accompanied by variation of obtained values about the predicted values, and the greater the degree of regression toward the mean, the greater / lesser the amount of variation [12.6]. Specifically, Y', the predicted value of Y, is only the predicted ________ of Y for those who obtain a particular score in X. The obtained Y values corresponding to that value of X will be distributed about Y' with a standard deviation equal to ________ [12.6]. The lower the value of the correlation coefficient, the greater the value of ________, so that the greater the degree of regression on the mean, the greater / lesser the variation of obtained Y values about their predicted values [12.6].

The Coefficient of Alienation

The measure of variability of scores about the regression line, or as we might call it, the measure of error of prediction, is given by the standard error of estimate, SYX. The maximum possible error of prediction occurs when r = ____, in which case SYX = ____ [12.7]. The ratio of SYX to ____ gives the proportion of the maximum possible predictive error that characterizes the present predictive circumstances [12.7], and this ratio turns out to equal √(1 - r²). This quantity is symbolized by the letter ____, and it is called the coefficient of ________ [12.7].

When the value of k is close to unity (its maximum value), the magnitude of predictive error is close to its maximum / minimum [12.7]. On the other hand, when the value of k is close to zero, most / little of the possible error of prediction has been eliminated [12.7]. The first two columns of Table 12.1 on p. 208 show how k changes as r changes and make it clear that a given change in the magnitude of a correlation coefficient has greater consequences when the correlation is high / low than when it is high / low [12.7].
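The first two columns of Table 12.1 can be regenerated from the relation k = √(1 - r²); the particular r values below are arbitrary:

```python
# The coefficient of alienation: the proportion of the maximum possible
# predictive error that remains at a given r.
import math

def coefficient_of_alienation(r):
    return math.sqrt(1 - r ** 2)

for r in (0.00, 0.25, 0.50, 0.75, 0.90, 1.00):
    print(f"r = {r:.2f}   k = {coefficient_of_alienation(r):.3f}")
# k falls slowly at first: even r = .50 still leaves about 87% of the
# maximum possible error, which is why moderate r's are sobering.
```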



The Coefficient of Determination

In a bivariate distribution, total variation in Y may be thought of as having two component parts: variation in Y that is associated with or attributable to changes in ____, and variation in Y that is inherent in Y, and hence independent of changes in ____ [12.8]. r² gives the proportion of Y variance that is associated with changes in ____, and it is called the coefficient of ________ [12.8]. The middle column of Table 12.1 on p. 208 shows that the proportion of Y variance accounted for by variation in ____ increases more slowly / rapidly than does the magnitude of the correlation coefficient [12.8].
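That r² equals the proportion of Y variance attributable to X can be checked numerically on invented data: the variance of the predicted values Y', taken as a share of the total Y variance, reproduces r² exactly:

```python
# Numerical check that r^2 is the proportion of Y variance associated with X.
import random

random.seed(3)
xs = [random.gauss(0, 1) for _ in range(5000)]
ys = [0.8 * x + random.gauss(0, 0.6) for x in xs]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
r = sxy / (sxx * syy) ** 0.5

# Variation in Y attributable to X: the variance of the predicted values Y'.
slope = sxy / sxx
preds = [my + slope * (x - mx) for x in xs]
var_pred = sum((p - my) ** 2 for p in preds) / n
var_y = syy / n
print(r ** 2, var_pred / var_y)   # these two quantities agree
```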

Proportion of Correct Placements

The interpretation of the meaning of r according to k or r² is strongly / certainly not very encouraging as to the value of r's of moderate magnitude [12.9]. A somewhat more cheerful outlook may be had by considering the proportion of correct placements that occur when the regression equation is used to predict "success." Assume a normal bivariate distribution, that success is defined as scoring above the median on the criterion variable (Y), and that those who are selected as potentially successful are those who score above the median on the predictor variable (X). The last two columns of Table 12.1 on p. 208 present, for selected values of r, the proportion of correct placements in excess of chance and the proportion of improvement in correct placements relative to the chance proportion of .50.
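For a bivariate normal distribution, the proportion of correct placements under this scheme has a closed form, 1/2 + arcsin(r)/π. That quadrant-probability formula is a standard result assumed here (it is not given in the workbook), but it reproduces the flavor of the table's last columns:

```python
# Proportion of correct placements when "selected" means above the median on X
# and "success" means above the median on Y, for a bivariate normal with correlation r.
import math

def proportion_correct_placements(r):
    return 0.5 + math.asin(r) / math.pi

for r in (0.00, 0.25, 0.50, 0.75, 1.00):
    p = proportion_correct_placements(r)
    print(f"r = {r:.2f}   correct = {p:.3f}   in excess of chance = {p - 0.5:.3f}")
# Even r = .50 places two-thirds of the cases correctly -- a more cheerful
# reading of a moderate r than k or r^2 suggests.
```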

SYMBOLISM DRILL

Symbol    Pronunciation         Meaning

 3  X                           ________
 4  Σ                           ________
 6  μ                           ΣX/N; the ________ of a ________
 5  X̄                           Σx/n; the ________ of a ________
 9  x                           X - ____ or x - ____; ________ score
11  σ²                          Σ___/___; ________ of a ________
12  σ                           √(Σ___/___); ________ of a ________
13  S²                          Σ___/___; ________ of a ________
14  S                           √(Σ___/___); ________ of a ________
15  z                           ___/σ or ___/S; ________ score
16  r                           Pearson cor'l'n coef't for a ________
17  ρ                           Pearson cor'l'n coef't for a ________
18  Y'    "wi prime"            Predicted ________ score on ____
19  z'Y   "zee prime sub wi"    Predicted ________ score on ____
20  SYX   "es sub wi eks"       ________
CHAPTER 13

THE BASIS OF STATISTICAL INFERENCE

13.1 Introduction: The Plan of Study
13.2 A Problem in Inference: Testing Hypotheses
13.3 A Problem in Inference: Estimation
13.4 Basic Issues in Inference
13.5 Probability and Random Sampling: An Introduction
13.6 The Random Sampling Distribution of Means: An Introduction
13.7 Characteristics of the Random Sampling Distribution of Means
13.8 The Sampling Distribution as Probability Distribution
13.9 The Fundamentals of Inference and the Random Sampling Distribution of Means: A Review
13.10 Putting the Sampling Distribution of Means to Use

PROBLEMS and EXERCISES

 7 ____    8 ____
 9 ____   10 ____

SUMMARY

The Two Types of Inferential Procedures

A basic aim of statistical inference is to form a conclusion about a characteristic of a ________ from study of a ________ taken from that ________ [13.2]. That characteristic may be a proportion, a mean, a median, a standard deviation, a correlation coefficient, or any other of a number of statistical ________ [13.3]. Inference may also be concerned with the difference between ________ with regard to a given parameter [13.3]. There are two types of inferential procedures: ________ testing and ________ [13.2].

In ________ testing, we have a particular value in mind; we hypothesize that the value we have in mind characterizes the population / sample of observations [13.3]. To evaluate a hypothesis, we will ask what type of sample results one would expect to obtain if the hypothesis were correct / incorrect [13.2, next to last paragraph on p. 218]. If the sample outcome is not in accord with what one would expect, we will accept / reject the hypothesis [13.2].

In estimation, on the other hand, no particular sample / population value need be stated. Rather, the question is, what is the sample / population value [13.3]? To answer the question, a sample is drawn and studied and an inference is made about the parameter of interest.

Random Sampling Distributions

The fundamental fact of sampling is that the value of the characteristic we are studying will vary / stay the same from sample to sample [13.4]. The key to any problem in statistical inference is to discover what sample values will occur in repeated sampling, and with what relative frequency. A distribution composed of values (such as the mean) characterizing samples of some one particular size drawn repeatedly from the same population is known as a sampling distribution. We must be able to describe it completely if we are to say what would happen when samples of that size are drawn from that population.

There is just one basic method of sampling that permits sampling distributions to be known. This is to draw probability samples--samples for which the probability of inclusion in the sample of each ________ of the population is known [13.4]. One kind of probability sample is the random sample, which is a sample so drawn that each possible ________ of that size has an equal probability of being selected [13.5].

It is the method of selection, and not the particular sample outcome, that defines a sample as random. If we were given a population and a sample from that population, it would be possible / impossible to say whether the sample was random without knowing the method by which it was selected [13.5].

One important characteristic of a random sample is that every ________ in the population has an equal probability of inclusion in the sample [13.5].
in the population has an equal probability of inclusion in the sample [13.5].

The Random Sampling Distribution of Means

Although inference could be about any one of a number of characteristics of


a population (parameters), this chapter and the next several chapters focus on
inference about means, and thus the random sampling distribution of means is of
special concern. There is one such distribution for every population and every
sample size. Such a distribution can be hypothetically generated as follows:
From the population of interest, draw a random sample of a given size, and find
the mean of the sample. Replace the sample, "stir well," and draw another random
sample of the same size, now finding the mean of this one. Replace this sample,
and continue this procedure indefinitely. The infinitely large collection of
means of samples of the given size drawn at random from the population of interest
is the random sampling distribution of means for this case.*

There is not just one random sampling distribution corresponding to a given population, but a family of such distributions, one for each possible sample ________ [13.7, p. 224].

A random sampling distribution of means, like any other distribution, is completely defined by specifying its mean, standard deviation, and shape. The mean of any random sampling distribution of means is the same as the mean of the ________ of scores [13.7]. The symbol for the mean of a random sampling distribution of means is ________ [Formula 13.1], which is read "mew sub eks bar."

*Actually, there are two random sampling distributions for each such case: one for sampling with replacement and one for sampling without replacement. This is a nuance hinted at in the second footnote on p. 223 and treated fully in Section 14.5.

The standard deviation of a random sampling distribution of means is called the standard ________ of the mean and is symbolized ________ [Formula 13.2], which is read "sigma sub eks bar." It is computed as follows:

    ________ = ________   [Formula 13.2]

The formula shows that (a) means vary more / less than scores do (when sample size is at least two), (b) means vary more / less where scores vary less, and (c) means vary more / less when sample size is greater [13.7].
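A small sketch of those three facts, using the σ/√n form of the standard error (the form given in the symbolism drill at the end of the chapter):

```python
# sigma_xbar = sigma / sqrt(n): how the standard error of the mean behaves.
import math

def standard_error(sigma, n):
    return sigma / math.sqrt(n)

sigma = 10.0
print(standard_error(sigma, 4))     # 5.0  -- means vary less than scores (n >= 2)
print(standard_error(5.0, 4))       # 2.5  -- means vary less where scores vary less
print(standard_error(sigma, 100))   # 1.0  -- means vary less when n is greater
```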

With regard to the shape of the distribution, if the population of scores is normally distributed, the sampling distribution of means will also / not be normally distributed [13.7]. If the population is not normally distributed, the Central Limit Theorem informs us that the random sampling distribution of means tends toward a normal distribution irrespective of the shape of the population of observations sampled and that the approximation to the normal distribution becomes increasingly close with increase / decrease in sample size [13.7]. Thus even when the population of scores differs substantially from a normal distribution, the sampling distribution of means may be treated as though it were normally distributed when sample size is reasonably small / large [13.7].

A random sampling distribution of means may be used to tell what proportion of sample means would fall between certain limits, or alternatively, to give the probability of drawing such a mean in random sampling, as in the examples worked in Sections 13.8 and 13.10.

[If you feel a need for a summary of the material on probability, look ahead
to the first two paragraphs on p. 114 of this workbook.]

MAP of RANDOM SAMPLING DISTRIBUTION of MEANS

A Random Sampling Distribution of Means

consists of:
- a certain mean, symbolized μX̄, equal to μ
- a certain standard deviation, symbolized σX̄, called the standard error of the mean, equal to σ/√n
- a certain shape, not necessarily normal, that is
  - normal if the population is normal in shape
  - an approximation to the normal one, even if the population is not normal, with a larger n bringing a closer approximation (as described by the Central Limit Theorem)

and can be used to compute:
- the probability that a single sample of size n will have a mean that falls between certain limits

EXERCISES

To gain some insight into the random sampling distribution of means, turn
to p. 547 of the text. There are 2500 single digits on this page, which we can
take to be a population. What are the characteristics of this population?

1. We are assured that the digits were chosen at random, so each of the ten
possible values (1 through 9 plus 0) occurs about 1/10 of the time in this pop¬
ulation. Let's assume the figure is exactly 1/10 for each digit; the assumption
can't be far wrong. (You're welcome to make a frequency distribution to check
the assumption; it shouldn't take more than a week.) On this assumption, then,
each value has a frequency of 1/10 of 2500, or 250, and the shape of the dis¬
tribution is thus rectangular.

2. The mean of the population is the same as the mean of a distribution consisting of just one occurrence of each of those ten digits. To see that this is true, think of the scores as weights on a plank. The balance point for the full population, which would appear as 10 stacks of weights each 250 weights high, would be the same as the balance point for the bottom layer of those stacks. So the mean works out to be (you figure it out):

3. The standard deviation of the population can be calculated using the same logic used in figuring the mean: that for the whole distribution is the same as that for a distribution consisting of one occurrence each of those ten values. (This logic works only for a rectangular distribution, please note.) So the standard deviation works out as:

Now, let's approximate the random sampling distribution of means for the
case in which samples of size two are drawn from this population. To draw a
first sample of size two, we should pick two digits in such a way that all
samples of size two have the same chance of occurrence. That's tough to do.
For present purposes, it will be sufficient to close your eyes and put the point
of a pencil down somewhere in the table. Take the digit closest to the point
as the first element of the sample, and take the digit to the right of this one
(or to the left of it, or above it, or below it—whatever you want) as the second element. Record the mean of these two numbers on the next page. Now repeat this procedure to get another sample of size two, and continue in this way until you have 20 samples. (To generate the real sampling distribution, you should continue forever, but then you'd never finish this course.) If you get tired of closing your eyes each time, just read off pairs of digits starting any old place on the page; that'll be good enough.

Sample   Mean of    Sample   Mean of    Sample   Mean of    Sample   Mean of
Number   Sample     Number   Sample     Number   Sample     Number   Sample

  1 ____    6 ____   11 ____   16 ____
  2 ____    7 ____   12 ____   17 ____
  3 ____    8 ____   13 ____   18 ____
  4 ____    9 ____   14 ____   19 ____
  5 ____   10 ____   15 ____   20 ____

Putting these 20 means together now will get you a rough approximation of
the random sampling distribution of the mean for samples of size two selected
from that population of 2500 digits.

1. In the space below, make at least a rough histogram of your distribution of 20 sample means. You should find that the shape is not rectangular like that of the population, but something more bell-shaped, as the Central Limit Theorem says. The width of your class intervals should be small, by the way, maybe half a point. Convenient class intervals are -0.25 to +0.25, 0.25 to 0.75, 0.75 to 1.25, 1.25 to 1.75, and so on.

2. Calculate the mean of your distribution of 20 sample means. It should fall fairly close to the mean of the population. If you had the real sampling distribution, its mean would be exactly the same as the mean of the population.

3. Calculate the standard deviation of the 20 sample means. This is an approximation to the standard deviation of the entire sampling distribution, which is called its standard error and which would equal σ/√n where n = 2. Calculate the standard error too. Note that the standard deviation of your 20 means is (almost certainly) less than the standard deviation of the raw scores in the population.

And now, to gain even more insight, draw 20 samples all of some size greater
than two. Ten is a convenient size, because the division involved in finding
the mean of a sample is then simple. Again record the mean of each sample.

Sample   Mean of    Sample   Mean of    Sample   Mean of    Sample   Mean of
Number   Sample     Number   Sample     Number   Sample     Number   Sample

  1 ____    6 ____   11 ____   16 ____
  2 ____    7 ____   12 ____   17 ____
  3 ____    8 ____   13 ____   18 ____
  4 ____    9 ____   14 ____   19 ____
  5 ____   10 ____   15 ____   20 ____

Again make at least a rough histogram to get the shape of the distribution of these 20 sample means. It should be even closer to normal than the shape of the distribution for the case in which sample size was only two. Also calculate the mean of your collection of means and the standard deviation. The latter should be less than the standard deviation of the raw scores in the population again, and less than the standard deviation of the distribution for the case in which sample size was two. Finally, calculate the theoretical standard error of the mean for samples of whatever size you used this second time.
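The whole exercise can also be run in code. The sketch below substitutes Python's random-number generator for the digit table on p. 547 and, as a simplifying assumption, samples with replacement:

```python
# Approximate the random sampling distribution of means for the digit population.
import math
import random

random.seed(4)
population = list(range(10)) * 250          # 2500 digits, each value appearing 250 times
N = len(population)
mu = sum(population) / N                    # population mean
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

def mean_of_random_sample(n):
    return sum(random.choices(population, k=n)) / n

def sd(values):
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

means_2 = [mean_of_random_sample(2) for _ in range(10000)]
means_10 = [mean_of_random_sample(10) for _ in range(10000)]

print(mu, sigma)                             # 4.5 and about 2.872
print(sd(means_2), sigma / math.sqrt(2))     # observed vs theoretical standard error, n = 2
print(sd(means_10), sigma / math.sqrt(10))   # smaller still for n = 10
```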

SYMBOLISM DRILL

Symbol    Pronunciation            Meaning

 1  ____                           Number of scores in a sample
 2  ____                           Number of scores in a population
 3  ____                           A raw score, or the set of raw scores
 4  ____                           Result of summing quantities of some kind
 6  ____                           ΣX/N; the mean of a ________
 5  ____                           Σx/n; the mean of a ________
 9  ____                           X - X̄ or x - μ; deviation score
12  ____                           √(Σx²/N); ________ of a ________
11  ____                           Σx²/N; ________ of a ________
13  ____                           Σx²/n; ________ of a ________
14  ____                           √(Σx²/n); ________ of a ________
15  ____                           x/σ or x/S; ________ score
17  ____                           Pearson correlation coefficient for a pop'n
16  ____                           Pearson correlation coefficient for a sample
18  ____                           Predicted raw score on Y
19  ____                           Predicted z score on Y
20  ____                           Standard error of estimate of Y on X
21  ____  "mew sub eks bar"        Mean of sampling distribution of means
22  ____  "sigma sub eks bar"      Standard error of the mean; σ/√n


CHAPTER 14

THE BASIS OF STATISTICAL INFERENCE:
FURTHER CONSIDERATIONS

14.1 Introduction
14.2 Another Way of Looking at Probability
14.3 Two Theorems in Probability
14.4 More About Random Sampling
14.5 Two Sampling Plans
14.6 The Random Sampling Distribution of Means: An Alternative Approach
14.7 Using a Table of Random Numbers

PROBLEMS and EXERCISES

 1 ____
 2 ____
 3 ____
 4 ____
10 ____


SUMMARY

Probability, Empirical and Theoretical

Questions of probability arise when a repeatable event occurs and gives rise to one of two or more possible outcomes. Examples are flipping a coin, which gives rise to the outcome "heads" or the outcome "tails," and drawing a card from a deck of playing cards, which gives rise to one of 52 possible outcomes. The occurrence of such an event is called a trial. What now do we mean when we speak of the probability of an outcome of some kind turning up on a given trial? The previous chapter defined the probability of an outcome in terms of what happens when trials are repeated over and over again indefinitely: The probability of the occurrence of A on a single trial is the ________ of trials characterized by A in an infinite series of trials, when each trial is conducted in a like manner [13.5, p. 221].

It is impossible to put this definition into practice, of course, because an infinite series of trials can never be obtained. The best we can do is to repeat the event of interest some finite number of times and compute the proportion of trials characterized by A. This gives us what the present chapter calls an empirical probability. An empirical probability is but an estimate of the true value, and confidence is to be placed in it according to the ________ of observations (trials) on which it is based [14.2, next-to-last paragraph].

If we know that the several possible outcomes of an event are equally likely, we don't have to fuss with empirical probabilities. Instead we can find a probability of interest by using the following theorem:

    Given a population of possible outcomes, each of which is equally likely to occur, the probability of occurrence on a single trial of an outcome characterized by A is equal to the ________ of outcomes yielding A, divided by the total number of ________ ________ [14.2].

In a roll of a fair die, for example, the six different faces are equally likely to turn up. Thus the probability of rolling a face with an odd number of spots is 3/6, or 1/2, because 3 faces yield this characteristic (the faces with 1, 3, and 5 spots), and there are a total of 6 faces. A probability like this computed from the theorem stated above is called a theoretical probability.



The Probability of This OR That

Questions about probability sometimes take the form: what is the probability of this OR that happening when some event occurs? Such a question can be readily answered by using the addition theorem of probability, but only if the outcomes of interest (the "this" and the "that") are mutually exclusive. Outcomes are mutually exclusive when the occurrence of one ________ the possibility of the occurrence of any of the others [14.3]. Another way to say this is that two outcomes are mutually exclusive if they cannot occur on the same trial. In drawing a card from a deck of playing cards, for example, the outcomes King and Queen are mutually exclusive, because no card can be both a King and a Queen. But the outcomes King and Club are not mutually exclusive, because a card can be both a King and a Club. According to the addition theorem:

    The probability of occurrence of any one of several particular outcomes is the sum / product of their individual probabilities, provided that they are ________ [14.3].

The Probability of This AND That

Other questions about probability take the form: what is the probability of this AND that happening? Such a question can be readily answered by using the multiplication theorem of probability, but only if the event that might generate the "this" and the event that might generate the "that" are independent. Independence of events means that the outcome of one event must have some / no influence on and in some / no way be related to the outcome of the other event [14.3]. According to the multiplication theorem:

    The probability of several particular outcomes occurring jointly is the sum / product of their separate probabilities, provided that the events that generate these outcomes are ________ [14.3].

The multiplication theorem applies only to situations where two or more ________ are considered together, as in the tossing of two coins or the result of tossing one coin twice [14.3].
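Both theorems can be illustrated with exact fractions. The card and dice examples are the ones used above; the King-or-Club line uses the general addition rule (subtracting the overlap), which the theorem as stated does not cover:

```python
# Addition theorem (mutually exclusive) and multiplication theorem (independent).
from fractions import Fraction

# P(King OR Queen) on one draw: the outcomes are mutually exclusive, so add.
p_king_or_queen = Fraction(4, 52) + Fraction(4, 52)                     # 2/13

# King and Club are NOT mutually exclusive; simple addition would double-count
# the King of Clubs, so the overlap must be subtracted.
p_king_or_club = Fraction(4, 52) + Fraction(13, 52) - Fraction(1, 52)   # 4/13

# P(odd AND odd) when two fair dice are tossed: independent events, so multiply.
p_both_odd = Fraction(3, 6) * Fraction(3, 6)                            # 1/4

print(p_king_or_queen, p_king_or_club, p_both_odd)
```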

Random Sampling

In the previous chapter, a random sample was defined as a sample so drawn that each possible ________ of that size has an equal probability of selection [14.4]. If a sample is drawn in this way, it is necessarily true that every ________ in the population has an equal chance of being selected [14.4]. The reverse is / is not true; giving equal probability to the elements does / does not necessarily result in equal probability for samples [14.4].

Sampling With and Without Replacement

Although there is only one way to define a random sample, there are two sampling plans that yield a random sample. One plan is sampling without replacement. The characteristic of this method is that an / no element may appear more than once in a sample [14.5]. The other plan is sampling with replacement. Under this plan it is possible / impossible to draw a sample in which the same element appears more than once [14.5]. Both of these plans can satisfy the condition of random sampling, but certain sample outcomes possible when sampling with / without replacement are not possible under the other method [14.5].

There are three characteristics of a sampling distribution that define it completely: mean, standard deviation, and shape. The first and last of these are affected / unaffected by choice of sampling plan [14.5]. The standard deviation of the sampling distribution is smaller / larger when sampling without replacement [14.5]. The formula given in Chapter 13 for the standard deviation (for the "standard error of the mean") is strictly correct only if sampling is with replacement. Despite the fact that most sampling in behavioral science is done without replacement, we typically use the Chapter-13 formula for the standard error of the mean. No / Considerable harm is done in the usual case, where the sample may be thought to be substantially smaller than 5% of the population [14.5].
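A sketch of the difference between the two plans. The finite-population correction factor sqrt((N - n)/(N - 1)) is an assumption here (the workbook only says the without-replacement error is smaller), but it is the standard adjustment:

```python
# Standard error of the mean under the two sampling plans.
import math

def se_with_replacement(sigma, n):
    return sigma / math.sqrt(n)

def se_without_replacement(sigma, n, N):
    # Finite-population correction: shrinks toward 0 as n approaches N.
    return se_with_replacement(sigma, n) * math.sqrt((N - n) / (N - 1))

sigma, n = 15.0, 50
print(se_with_replacement(sigma, n))
print(se_without_replacement(sigma, n, 100000))  # n far under 5% of N: nearly identical
print(se_without_replacement(sigma, n, 200))     # n is 25% of N: clearly smaller
```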

Another Way to Conceptualize the Random Sampling Distribution of Means

In the previous chapter, the random sampling distribution of means was conceived as the result of a(n) large / infinite series of sampling trials [14.6]. Another view is possible: the random sampling distribution of means is the relative frequency distribution of means obtained from n / all possible samples of a given size that could be formed from a given population [14.6]. This definition holds whether sampling is done with or without ________ [14.6].

When the sampling distribution is defined in this way, it is possible to generate an entire such distribution for a population of finite size. An example appears on pp. 240-243. Among the important insights this example offers is the point that random sampling results in equal probability of occurrence of any possible sample / sample mean, not in equal probability of occurrence of any possible sample / sample mean [14.6].

SYMBOLISM DRILL

Symbol    Pronunciation    Meaning

 4  Σ                      ________
 5  ____                   Σx/n; the ________ of a ________
 6  ____                   ΣX/N; the ________ of a ________
 9  x                      ____ - ____ or ____ - ____; ________ score
12  σ                      √(Σ___/___); ________ of a ________
11  σ²                     Σ___/___; ________ of a ________
14  S                      √(Σ___/___); ________ of a ________
13  S²                     Σ___/___; ________ of a ________
15  z                      ___/σ or ___/S; ________ score
16  r                      ________
17  ρ                      ________
18  Y'                     ________
19  z'Y                    ________
20  SYX                    ________
21  μX̄                     ________
22  ____                   Standard ________ of the ________; σ/√n
CHAPTER 15

TESTING HYPOTHESES ABOUT SINGLE MEANS:
NORMAL CURVE MODEL

15.1 Introduction
15.2 Testing an Hypothesis about a Single Mean
15.3 Generality of the Procedure for Hypothesis Testing
15.4 Estimating the Standard Error of the Mean When σ Is Unknown
15.5 Captain Baker's Problem: A Test about μ
15.6 Captain Baker's Problem: Conclusion
15.7 Directional and Nondirectional Alternative Hypotheses
15.8 Summary of Steps in Testing an Hypothesis about a Mean
15.9 Reading Research Reports in Behavioral Science
15.10 Review of Assumptions in Inference about Single Means
15.11 Problems in Selecting a Random Sample and in Drawing Conclusions

PROBLEMS and EXERCISES

 1 ____    2 ____    3 ____
 4 ____    5 ____    6 ____
 7 ____    8 ____    9 ____
10 ____   11 ____   12 ____
13 ____   14 ____


SUMMARY

This chapter introduces the logic of hypothesis testing and the procedure for testing a hypothesis about the mean of a single population. The procedure, strictly speaking, requires knowledge of the standard deviation of the population, but this will rarely be known, so it must usually be estimated from the sample on hand. Naturally, substituting an estimate for the real thing introduces some error, but the larger the sample the smaller / larger the error [15.1]. So, if sample size is large enough (n greater than 40 or so), the error will be small enough so that the procedure described here is satisfactory. For this reason, the procedure is sometimes known as the large-sample method. It takes the normal curve as a model for a certain sampling distribution.

Estimating the Population's Standard Deviation and the Standard Error of the Mean

To test hypotheses about means, it will be necessary to calculate the standard error of the mean. This is symbolized ______ and is equal to ______ [15.4, unnumbered formula]. This formula requires knowledge of σ, the ________ ________ of the population / sample, but in actual practice σ will be unknown, and it must be estimated from the sample. One would think that ______, the sample standard deviation, would be the proper estimate, but it proves to be (on the average) a bit too small / large [15.4].

A statistic called s ("little es") provides a better estimate of σ. Its formula is:

    s = ______ [Formula 15.1]

The defining formula for S (which should now be read "big es") is ______, so it is clear that s differs only in that the divisor is ______ rather than n [15.4]. The change in the divisor makes s a bit larger / smaller than S [15.4].

[It is now highly important to distinguish "big S" from "little s." One way to do it in writing is to print the capital letter and print it large, while making the small letter small and of the script variety.]

Substituting s for σ in the formula for the standard error of the mean yields the working formula for estimating the standard error. The estimate of the standard error is symbolized s_X̄, to distinguish it from σ_X̄. The formula for the estimate is:

    s_X̄ = ______ [Formula 15.2]

Substituting s for σ in making this estimate takes care of bias that would be introduced if S had been used, but different samples will always yield the same estimate / still yield different estimates, and so the constant / variable error introduced by substituting an estimate for the true value remains [15.4]. Procedures described here make / do not make allowance for this error, so we must remember to use them only when samples are large / small enough [15.4].
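[As a quick illustration of these two formulas, the sketch below computes s with the divisor n − 1 and then the estimated standard error s_X̄ = s/√n. The sample values are hypothetical, not from the text.]

```python
from math import sqrt

# Hypothetical sample of n = 8 scores (not from the text).
scores = [31, 28, 35, 30, 26, 33, 29, 32]
n = len(scores)
xbar = sum(scores) / n  # sample mean = 30.5

# s estimates the population standard deviation sigma (Formula 15.1):
# note the divisor n - 1 rather than n.
s = sqrt(sum((x - xbar) ** 2 for x in scores) / (n - 1))

# s_xbar estimates the standard error of the mean (Formula 15.2).
s_xbar = s / sqrt(n)

print(round(s, 3), round(s_xbar, 3))  # prints 2.878 1.018
```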

Stating Hypotheses

In testing an hypothesis about the mean of a population, a researcher states that the mean has a certain specific value (e.g., μ = 30). Such a statement is symbolized H₀ ("aitch null"). H₀ is called the ________ hypothesis [15.5]; it is the hypothesis that the researcher will test and will decide to accept or reject.

The researcher must also state an alternative hypothesis, symbolized H_A ("aitch sub ay"). This alternative may be nondirectional or directional. A nondirectional hypothesis states that the population mean does not equal the value specified by the null hypothesis (e.g., μ ≠ 30), without saying whether the mean is less than the value specified by the null or greater. Use of a nondirectional alternative allows the investigator to reject the null / alternative hypothesis if the evidence points with sufficient strength to the possibility that μ is greater than the value hypothesized (by H₀), or to the possibility that it is less [15.7].

A directional alternative hypothesis takes one of two forms, stating either that the population mean is less than the value specified by the null hypothesis (e.g., μ < 30) or that the population mean is greater than this value (e.g., μ > 30). A directional alternative hypothesis is appropriate when it is only of interest to learn that the true value of μ differs from the hypothesized value (the value specified by the null hypothesis) in a particular direction / either direction [15.7].

One must choose, therefore, between a directional alternative and a nondirectional one. The choice should be determined by the rationale that gave rise to the study, and should be made before / after the data are gathered [15.7].

When a nondirectional alternative hypothesis is stated, the resulting test is referred to as a one-tailed / two-tailed test, because H₀ will be rejected if the obtained sample mean is located in an extreme position in just one / either tail of the sampling distribution [15.7]. Similarly, a directional alternative leads to a one-tailed / two-tailed test [15.7].

The Level of Significance

In testing a null hypothesis, one draws a sample at random from the population of interest and determines the mean of the sample. If the sample mean is so different from what is expected when H₀ is true that its appearance would be unlikely, H₀ should be accepted / rejected [15.5, Paragraph 2]. What degree of rarity of occurrence is so great that it seems better to reject the null hypothesis than to accept it? Common research practice is to reject H₀ if the sample mean is so deviant that its probability of occurrence in random sampling is .05 or less, or alternatively, ______ or less [15.5]. Such a criterion is called the level of ________________ and is symbolized by the Greek letter α, which is pronounced "________" [15.5].

Picturing the Sampling Distribution Implied by H₀

What sample means would occur if H₀ were true? If it were true, the random sampling distribution of means (for whatever sample size is used) would center on the value specified by the null hypothesis, because the mean of a random sampling distribution of means, μ_X̄, is equal to the mean of the population of raw scores, μ. The value that the null hypothesis specifies is symbolized μ_hyp, so in general, we may say μ_X̄ = μ_hyp if H₀ is true.

One can thus draw a picture of the random sampling distribution of means (for whatever sample size is used) on the assumption that H₀ is true. Such a picture appears in Figures 15.2 and 15.3 (which are essentially the same), 15.4, and 15.5. Each pictured distribution is centered on μ_hyp, which takes the value 30 for the example used in this chapter.

Regions of Acceptance and Rejection

The sampling distribution of means that would occur if H₀ were true is divided into a region of acceptance and one or two regions of rejection. If the obtained sample mean falls within a region of acceptance, the null hypothesis is accepted; if it falls within a region of rejection, the null hypothesis is rejected.

The region of acceptance always covers the central portion of the distribution and always includes ______, as Figures 15.2 through 15.5 show.

There are two regions of rejection, one in each tail, if the alternative hypothesis is nondirectional, as in Figures 15.2, 15.3, and 15.4. There is just one region of rejection, located in just one of the tails, if the alternative hypothesis is directional. If the alternative hypothesis specifies a value for μ less than that named by the null, the region of rejection lies in the left (lower) tail, as in the left half of Figure 15.5; if the alternative specifies a value for μ greater than that named by the null, the region of rejection lies in the right (upper) tail, as in the right half of Figure 15.5.

In every case, the total area of the region or regions of rejection is equal to α, the level of significance. If there are two regions, each has half of this amount (half of .05 in Figures 15.2 and 15.3, half of .01 in Figure 15.4).

Using z Scores in Hypothesis Testing

The base line in the pictures of the random sampling distribution of means implied by the null hypothesis is divided into regions of acceptance and rejection by one or two z scores. These z scores are found by consulting Table C in Appendix F and are called critical values, symbolized z_crit.

To determine whether the obtained sample mean falls within the region of acceptance or the region of rejection, it is necessary to express the sample mean as a z score. In general, a z score has the form

    z = (score − mean of distribution) / (standard deviation of distribution)

In the distribution of sample means, the ____________ is the score, the hypothesized population mean is the ________, and the standard ________ of the mean is the standard deviation [15.6, Paragraph 2]. Consequently, the location of the sample mean is expressed by:

    z = ______ [p. 256, unnumbered formula]

Now since σ is unknown, σ_X̄ is also unknown, and the sample mean can / cannot be expressed as a true z [15.6]. The true z can be estimated, though, if s_X̄ is substituted for σ_X̄. From now on a z calculated by substituting such an estimate will be called an approximate z and symbolized ______ [15.6]. The formula for "z" ("zee quotes") is:

    "z" = ______ [Formula 15.4]

Concluding a Test

To conclude the test of a null hypothesis about the mean of a single population, the approximate z score that locates the obtained sample mean within the random sampling distribution of means implied by the null hypothesis is compared with the critical z value or values. If the obtained sample mean, as indicated by the approximate z, falls in the region of rejection, the null hypothesis is rejected and the alternative hypothesis is accepted. If the obtained sample mean, as indicated by its approximate z, falls in the region of acceptance, the null hypothesis is accepted. The decision to "accept" H₀ does not mean that it is likely that H₀ is true / false, but only that it could be true / false [15.6, p. 257]. For this reason, some statisticians prefer to say "fail to ________" or "________" rather than "accept" [15.7, footnote]. Also for this reason, if the null hypothesis is accepted, the alternative hypothesis remains a plausible possibility and cannot be rejected.
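[The whole large-sample procedure can be condensed into a few lines of code. The sketch below is my own illustration with hypothetical numbers (not Captain Baker's data); Python's statistics.NormalDist plays the role of Table C in supplying the critical value.]

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical test (not from the text): H0 says mu = 30, two-tailed, alpha = .05.
mu_hyp, alpha = 30.0, 0.05
xbar, s, n = 31.5, 6.0, 64        # obtained sample mean, estimate s, sample size

s_xbar = s / sqrt(n)              # estimated standard error of the mean (Formula 15.2)
z = (xbar - mu_hyp) / s_xbar      # approximate z, "z" (Formula 15.4)

# Two-tailed critical value from the normal curve (stands in for Table C).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

decision = "reject H0" if abs(z) >= z_crit else "retain H0"
print(round(z, 2), round(z_crit, 2), decision)  # prints 2.0 1.96 reject H0
```

[Here "z" = 2.0 falls beyond the critical value 1.96, so the sample mean lies in a region of rejection.]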

Statistical Jargon

When the outcome of a statistical test is reported in research literature, it is common to see statements about "significance." What does it mean to say that "the outcome was significant at the 5% level"? This usually means that a null hypothesis and a(n) ________________ hypothesis were formulated, the decision criterion was α = ______, and the evidence from the sample led to acceptance / rejection of the null hypothesis [15.9]. Similarly, the words "not significant" (sometimes abbreviated n.s.) imply that the null hypothesis could / could not be rejected [15.9]. When a report simply says "not significant" without stating the value of α, it is probably safe to assume that it was ______ [15.9].

The use of the word significant in connection with statistical outcomes is unfortunate. In common English it implies "important," but in statistics it means only that the sample value was / was not within the limits of sampling variation expected under the null hypothesis [15.9]. Whether the difference between what is hypothesized and what is true is large enough to be important is the same thing / another matter [15.9].

Problems in Selecting a Random Sample and Consequences Thereof

The ideal way to conduct statistical inference is to define carefully the target population, identify each ________ of the population and assign it an identification number, and to draw a ________ sample by use of a table of random numbers [15.11]. In behavioral science, almost always / only occasionally do real problems permit these specifications to be met [15.11]. We need to keep in mind that, strictly speaking, it is possible to generalize the inferential outcome only to a ____________ from which the sample may be considered to be a ________ sample [15.11]. Many research reports give the impression that their conclusions are general ones, but a little probing will likely reveal that the conclusions apply only to subjects who are of a particular sex or age or who are otherwise a sharply limited group.

In Section 1.6, the distinction was made between a statistical conclusion and a ____________ one (one about the subject matter) [15.11]. The statistical conclusion says something about a parameter of the population, such as μ. But the ____________ conclusion says something about the meaning of the study for psychology, or education, or some other discipline [15.11]. Drawing a statistical conclusion can be done as an automatic process once statistics is learned. However, moving from the substantive question to an appropriate ____________ question, and finally from a ____________ conclusion to a substantive one, requires the highest qualities of knowledge and judgment on the part of the investigator [15.11].

[If something in this chapter remains unclear to you, the next chapter may help. It presents some of the details of the logic and the procedures introduced here. Don't be afraid to look ahead and search the next chapter for more information on anything you're still puzzled about.]

MAP of LARGE-SAMPLE PROCEDURE for TESTING a HYPOTHESIS about a SINGLE MEAN

[The original page presents this as a concept map; its links are summarized below.]

Null Hypothesis: is symbolized H₀; states the value of μ_hyp; is accepted if the obtained sample mean falls in the region of acceptance; is rejected if the obtained sample mean falls in the region of rejection.

Random Sampling Distribution of Means (for samples of whatever size is used): is centered, if H₀ is true, on μ_hyp; is divided into a region of acceptance and a region of rejection; has standard error estimated by s_X̄ = s/√n, where s = √(Σx²/(n−1)).

Region of rejection: has area equal to α, the level of significance; appears in one or both tails of the distribution, depending on the alternative hypothesis.

Alternative Hypothesis: is symbolized H_A; may be nondirectional, requiring a two-tailed test of H₀, or directional, requiring a one-tailed test of H₀.

SYMBOLISM DRILL

Symbol   Pronunciation             Meaning

 4  Σ                              Result of summing quantities of some kind
 5  X̄                              ΣX/__; the mean of a sample
 6  μ                              ΣX/__; the mean of a population
 9  x                              X − X̄ or X − μ; ________ score
11  σ²                             Σ__/__; variance of a population
12  σ                              √(Σ__/__); standard deviation of a population
13  S²   "big es squared"          Σ__/__; variance of a sample
14  S    "big es"                  √(Σ__/__); standard deviation of a sample
15  z                              x/σ or x/S; ________ score
16  r                              Pearson correlation coefficient for a sample
17  ρ                              Pearson correlation coefficient for a population
18  Y′                             Predicted raw score on Y
19  z′_Y                           Predicted z score on Y
20  s_YX                           Standard error of estimate of Y on X
21  μ_X̄                            Mean of sampling distribution of means
22  σ_X̄                            Standard error of the mean; σ/√n
23  s    "little es"               Estimate of σ; √(Σx²/(n−1))
24  s_X̄  "little es sub eks bar"   Estimate of σ_X̄; s/√n
25  H₀   "aitch null"              Null hypothesis
26  H_A  "aitch sub ay"            Alternative hypothesis
27  μ_hyp  "mew sub hype"          Value of μ stated in null hypothesis
28  "z"  "zee quotes"              Approximate z score with denominator estimated
29  z_crit  "zee crit"             Critical value of z
CHAPTER 16

FURTHER CONSIDERATIONS IN HYPOTHESIS TESTING

16.1  Introduction
16.2  Statement of the Hypothesis
16.3  Choice of H_A: One-Tailed and Two-Tailed Tests
16.4  The Criterion for Acceptance or Rejection of H₀
16.5  The Statistical Decision
16.6  A Statistically Significant Difference versus a Practically Important Difference
16.7  Error in Hypothesis Testing
16.8  Decision Criterion or Index of Rarity?
16.9  Multiple Tests
16.10 The Problem of Bias in Estimating σ²

PROBLEMS and EXERCISES

1 ____   2 ____
3 ____   4 ____
5 ____   6 ____
7 ____   8 ____
9 ____   10 ____
11 ____  12 ____
13 ____  14 ____

SUMMARY

This chapter spells out some of the details of the logic and the procedures introduced in the previous chapter.

The Null Hypothesis

There are three important points to note concerning the null hypothesis, H₀. (a) H₀ is always a statement about the population ________ (or difference between two or more ________ if more than one population is involved) [16.2]. (b) H₀ is expressed in terms of a point value / range [16.2]; that is, it states only one particular value for the population parameter of interest. (c) The decision to accept or reject "the hypothesis" always has reference to H₀ / H_A [16.2]. It is H₀ / H_A that is the subject of the test [16.2].

The term "null hypothesis" makes little sense in the case of an hypothesis about the mean of a single population, but it is appropriate in the case of a hypothesis about the relationship between the mean of a first population and the mean of a second one. Here the hypothesis typically states that there is no difference between the two population means, and the word null means zero.

The Alternative Hypothesis

The alternative hypothesis, H_A, may be nondirectional or directional. When the alternative hypothesis is nondirectional, a one-tailed / two-tailed test results, and it is possible to detect a discrepancy between the true value and the hypothesized value of the parameter irrespective of the direction / for only one direction of the discrepancy [16.3]. A directional alternative hypothesis is appropriate when there is no / some practical difference in meaning between finding that the null hypothesis is true and finding that a difference exists in a direction opposite to that stated in the directional alternative hypothesis [16.3]. A directional alternative results in a one-tailed test. The decision to use a one-tailed alternative must always flow from the logic of the substantive / statistical question [16.3]. The time to decide on the nature of the alternative hypothesis is therefore at the beginning / end of the study, before / after the data are collected [16.3].

The Level of Significance as a Level of Risk

The decision to accept or reject the null / alternative hypothesis is dependent on the criterion of rarity of occurrence adopted, commonly known as the ________ of significance (α, "alpha") [16.4]. This quantity determines the extent to which we are taking a certain kind of risk in testing an hypothesis. Suppose α is set at .05. When the null hypothesis is true, ____% of the sample means will nevertheless lead us to say that it is false [16.4]. So when we decide to adopt α = .05, we are really saying that we will accept a probability of .05 that the null hypothesis will be accepted / rejected when it is really true [16.4].

To reduce the risk, we may set α at a lower / higher level [16.4]. In this case, we run a substantial risk of accepting the null hypothesis when it is true / false [16.4].

For general use, α = .05 and α = .01 make quite good sense. They tend to give reasonable assurance that the null hypothesis will not be rejected unless it really should be. At the same time, they are not so stringent as to raise unnecessarily the likelihood of accepting true / false null hypotheses [16.4].

Whatever the level of significance adopted, the decision should be made in advance / after the data are in [16.4].

The Statistical Decision and Its Meaning

The statistical decision is the decision about the null hypothesis. A decision to reject means that we do not believe the mean of the population to be what the null says it is. Moreover, the lower / higher the probability of obtaining a sample mean of the kind that occurred when the null hypothesis is true, the greater the confidence we have in the correctness of our decision to reject the hypothesis [16.5].

On the other hand, accepting the null hypothesis means / does not mean that we believe the hypothesis to be true [16.5]. Rather, this decision merely reflects the fact that we do not have sufficient evidence to accept / reject the null hypothesis [16.5]. To put it another way, the decision to accept means simply that the hypothesis is a tenable one. Certain other hypotheses that might have been stated would also have been accepted if subjected to the same test.

In short, rejecting the null hypothesis means that it does not seem reasonable to believe that it is true / false, but accepting the null hypothesis merely means that we believe that the hypothesis could / must be true [16.5]. It does not mean that it must / could be true, or even that it is probably true, for there would be many other hypotheses that if tested with the same sample data would also be accepted [16.5].

Statistically Significant or Practically Important?

To test a null hypothesis about the mean of a single population, we calculate an estimated z score, "z" ("approximate z"), that indicates where the obtained sample mean falls within the sampling distribution of means that would occur if the null hypothesis were true. If "z" is large enough (in the sense of being far away from zero, either above it or below it), we will reject the null hypothesis. (If the test is one-tailed, "z" must fall on the appropriate side of the sampling distribution, of course.)

Now the magnitude of "z" depends both on the quantity in the numerator and on the quantity in the denominator of the ratio:

    "z" = (X̄ − μ_hyp) / (s/√n)

Other things being equal, if sample size n is very large, the denominator, s/√n, will be quite large / small [16.6]. In this event, a relatively large / small discrepancy between X̄ and μ_hyp may produce a value of "z" large enough to lead us to reject the null hypothesis [16.6]. In cases of this kind, we may have a result that is "statistically significant" but in which the difference between μ_true and μ_hyp is so small / large as to be unimportant [16.6].

The end product of statistical inference is a conclusion about descriptors, such as μ. Therefore, the simplest remedy is to return to them to evaluate the importance of a "significant" outcome. Look at how much difference exists between μ_hyp and the sample mean obtained. Is it of a size that matters?

What about the statistical test when sample size is small? In this case, the standard error of the mean will be relatively large / small, and it will be difficult / easy to discover that the null hypothesis is false, if indeed it is, unless the difference between μ_true and μ_hyp is quite large / small [16.6].

Errors in Hypothesis Testing

The statistical conclusion in hypothesis testing is the decision to accept or reject the null hypothesis. Either decision may be in error; thus there are two types of errors one can make.

A Type I error is committed when H₀ is rejected and in fact it is true / false [16.7]. The probability of committing a Type I error is α, the level of significance. The possibility of committing a Type I error exists only in situations where the null hypothesis is true / false [16.7]. If the null hypothesis is true / false, it is impossible to commit this error [16.7].

A Type II error is committed when H₀ is accepted and in fact it is true / false [16.7]. The Greek letter ____ (beta, pronounced "bayta") is used to indicate this probability [16.7]. The possibility of committing a Type II error exists only in situations where the null hypothesis is true / false [16.7]. If the null hypothesis is true / false, it is impossible to commit this kind of error [16.7].
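[β can be computed once a specific true mean is assumed. The sketch below goes a step beyond this chapter and uses entirely hypothetical numbers of my own: if H₀ says μ = 30 but the true mean is actually 32, β is the area of the acceptance region under the sampling distribution that is really centered on 32.]

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical scenario (not from the text): H0: mu = 30, two-tailed alpha = .05,
# sigma = 6 treated as known, n = 36, and suppose the true mean is actually 32.
mu_hyp, mu_true, sigma, n, alpha = 30.0, 32.0, 6.0, 36, 0.05

se = sigma / sqrt(n)                          # standard error of the mean = 1.0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value, about 1.96

# Type II error: probability the sample mean lands inside the acceptance region
# (mu_hyp +/- z_crit * se) when the sampling distribution centers on mu_true.
true_dist = NormalDist(mu_true, se)
beta = true_dist.cdf(mu_hyp + z_crit * se) - true_dist.cdf(mu_hyp - z_crit * se)

print(round(beta, 3))  # roughly .48: accepting H0 here would often be a Type II error
```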

Misuses of α

Some researchers evaluate the outcome of the test of a null hypothesis by showing the probability of obtaining a value as discrepant as the one obtained if the null hypothesis were true. For a given outcome they might report, say, that "p < .03." This probability statement is an expression of the rarity of the sample outcome if the null were true and nothing more. It can / cannot be interpreted as the value of α [16.8], which is a statement of the risk the researcher is willing to take in rejecting a null hypothesis.

Multiple Tests

Suppose that several hypothesis tests are conducted using the same level of significance, say .05. For each test taken individually, the probability of a Type I error is ______, but taken as a group, the probability that at least one from among the several will prove to be a false positive is greater / less than .05 and continues rising / falling as more tests are made [16.9].
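[For tests that are independent of one another, the group-wise risk can be computed directly: the chance that at least one of k tests commits a Type I error is 1 − (1 − α)^k. The independence assumption is mine for the sake of the sketch; the text says only that the probability rises.]

```python
# Probability of at least one false positive among k independent tests,
# each conducted at alpha = .05.
alpha = 0.05
for k in (1, 5, 10, 20):
    p_any = 1 - (1 - alpha) ** k
    print(k, round(p_any, 3))  # rises from .05 (k=1) to about .64 (k=20)
```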

Bias in Estimating σ²

The standard error of the mean, symbolized σ_X̄, is computable from the formula ______. When σ is unknown, as it usually is, it must be estimated from a sample. Intuition suggests substituting S, the sample standard deviation, as an estimate, but S tends to be a little too small / large [16.10].

The basic problem is that the sample variance, S², is a biased estimator of the population variance, σ². When an estimator is unbiased, the ______ of the estimates made from all possible samples equals the value of the parameter estimated [16.10]. But the mean value of S², calculated from all possible samples of any given size that could be drawn from a given population, is a little smaller than σ².

The formula for the sample variance is:

    S² = ______ [p. 278]

The tendency toward underestimation will be corrected if Σ(X − X̄)² is divided by ______ rather than by n [16.10]. This change produces an unbiased estimate of the population variance, symbolized s² ("little es squared"). Taking the square root of the formula, we have an estimate of the standard deviation of the population, symbolized s:

    s = ______ [p. 278]

If s is then substituted for σ in the formula for the standard error of the mean, we have an estimate of the standard error called s_X̄:

    s_X̄ = ______ [p. 279]

Although the correction introduced in estimating the standard error of the mean makes for a better estimate on the average, we should recognize that any particular sample will probably yield an estimate too ______ or too ______ [16.10].
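[The claim that S² is biased while s² is not can be checked exhaustively on a miniature population of my own invention: enumerate all possible samples (drawn with replacement), average each estimator over them, and compare with σ².]

```python
from itertools import product

# Tiny hypothetical population (not from the text).
population = [1, 2, 3, 4, 5]
N = len(population)
mu = sum(population) / N                                  # 3.0
sigma2 = sum((x - mu) ** 2 for x in population) / N       # 2.0

n = 2
# All equally likely samples of size n, sampling with replacement.
samples = list(product(population, repeat=n))

def var(sample, divisor):
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor

mean_S2 = sum(var(s, n) for s in samples) / len(samples)      # divisor n: biased
mean_s2 = sum(var(s, n - 1) for s in samples) / len(samples)  # divisor n-1: unbiased

print(sigma2, mean_S2, mean_s2)  # 2.0 1.0 2.0 -- S2 underestimates on the average
```

[Averaged over every possible sample, S² comes out too small, while s² recovers σ² exactly.]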

SYMBOLISM DRILL

Symbol   Pronunciation      Meaning

 4  Σ                       ____________________________________
 5  X̄                       Σ__/__; the ________ of a ________
 6  μ                       Σ__/__; the ________ of a ________
 9  x                       X − __ or X − __; ________ score
11  σ²                      Σ__/__; ________ of a ________
12  σ                       √(Σ__/__); ________ of a ________
13  S²                      Σ__/__; ________ of a ________
14  S                       √(Σ__/__); ________ of a ________
15  z                       __/σ or __/S; ________ score
16  r                       ____________________________________
21  μ_X̄                     ____________________________________
22  σ_X̄                     ________ ________ of the ________; σ/√n
23  s    ____________       Estimate of σ; ______
24  s_X̄  ____________       Estimate of ______; __/√n
25  H₀   ____________       ____________________________________
26  H_A  ____________       ____________________________________
27  μ_hyp  ____________     ____________________________________
28  "z"  ____________       ____________________________________
29  z_crit  ____________    ____________________________________
30  α                       Risk of Type __ error; level of ________
31  β    "bayta"            Risk of Type __ error
32  s²   "little es squared"  Estimate of σ²; Σx²/(n−1)
33  μ_true                  True value of μ
CHAPTER 17

TESTING HYPOTHESES ABOUT TWO MEANS: NORMAL CURVE MODEL

17.1  Introduction
17.2  The Random Sampling Distribution of the Differences between Two Sample Means
17.3  An Illustration of the Sampling Distribution of Differences between Means
17.4  Properties of the Sampling Distribution of Differences between Means
17.5  Testing the Hypothesis of No Difference between Two Independent Means: the Vitamin A Experiment
17.6  The Conduct of a One-Tailed Test
17.7  Sample Size in Inference about Two Means
17.8  Randomization as Experimental Control
17.9  Comparing Means of Dependent Samples
17.10 Testing the Hypothesis of No Difference between Two Dependent Means
17.11 An Alternative Approach to the Problem of Two Dependent Means
17.12 Some Comments on the Use of Dependent Samples
17.13 Assumptions in Inference about Two Means

PROBLEMS and EXERCISES

1 ____   2 ____
3 ____   4 ____
5 ____   6 ____
7 ____   8 ____
9 ____   10 ____
11 ____  12 ____
SUMMARY

Chapter 17 describes the method for testing a hypothesis about the relation between the mean of a first population and the mean of a second population. Scores from the first population are called X, those from the second population are called Y, and the null hypothesis usually states that the two populations have the same mean, which is to say that the difference between the population means is zero. In symbols, H₀ says that μ_X − μ_Y = 0. This hypothesis is appropriate for many studies in which a variable is measured under two different conditions. In particular, this is the appropriate null hypothesis for an experiment in which an experimental condition and a control condition are established and a sample of scores on some variable is collected in each condition.

The Random Sampling Distribution of the Differences between Two Sample Means

As in the case of testing a hypothesis about the mean of a single population, we must ask whether the data on hand would be likely or unlikely to have turned up, were the null hypothesis true. Here the data on hand are a pair of samples. One came from the population of scores called X; it has a certain size, symbolized n_X, and a certain mean, symbolized X̄. The other sample came from the population of scores called Y; its size is symbolized n_Y and its mean Ȳ. Ideally, each sample was selected at random from its parent population.

The difference between the two sample means, (X̄ − Ȳ), is the statistic on which we focus. We ask whether the obtained difference is likely or unlikely to have occurred if the null hypothesis were true. To answer this question, we must consult a sampling distribution, in this case the random sampling distribution of differences between two sample means for samples of the sizes we drew and for the two populations from which we drew them.

This distribution could be generated, hypothetically, as follows: One sample (of size n_X) is drawn at random from the population of X scores, and another (of size n_Y) is drawn from the population of Y scores. The ______ of each is computed, and the difference between these two ______ obtained and recorded [17.2]. Let the samples be returned to their respective populations and a second pair of samples be selected in the same way / in a different way [17.2]. The sample from population X must have size n_X again, and the sample from population Y must have size n_Y again. Again the two ______ are calculated and the difference between them noted and recorded [17.2]. If this procedure is repeated indefinitely, the differences [values of (X̄ − Ȳ) thus generated] form the random sampling distribution of differences between two sample ______ [17.2]. This distribution is specific to the sample sizes employed and to the populations sampled; that is, the characteristics of the sampling distribution would change if either sample size were changed or if either population were changed.

Three characteristics completely define any distribution: ________, ________, and ________ [17.4]. The mean of the random sampling distribution of differences between pairs of sample means, μ_(X̄−Ȳ) ("mew sub eks-bar-minus-wi-bar"), is the same as the ____________ between the two population means, μ_X − μ_Y [17.4]. This is true regardless of the sample sizes and regardless of the shapes of the populations. For cases in which the null hypothesis says that μ_X − μ_Y = 0, then, the random sampling distribution of differences between pairs of sample means will be centered on zero if the null is true.
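[This generation scheme can be carried out exhaustively for two miniature populations of my own invention: pair every possible X sample with every possible Y sample, record X̄ − Ȳ each time, and the mean of all the differences comes out to μ_X − μ_Y.]

```python
from itertools import product

# Two tiny hypothetical populations (not from the text).
pop_x, pop_y = [2, 4, 6], [1, 3, 5]
mu_x = sum(pop_x) / len(pop_x)   # 4.0
mu_y = sum(pop_y) / len(pop_y)   # 3.0

n_x = n_y = 2
diffs = []
for sx in product(pop_x, repeat=n_x):        # every equally likely X sample
    for sy in product(pop_y, repeat=n_y):    # ...paired with every Y sample
        diffs.append(sum(sx) / n_x - sum(sy) / n_y)

mean_of_diffs = sum(diffs) / len(diffs)
print(len(diffs), mean_of_diffs)  # 81 differences; their mean is 1.0 = mu_x - mu_y
```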

As for shape, the sampling distribution of differences will be normally distributed when the two populations are ____________ [17.4]. Even when the two populations are not normal, the sampling distribution tends toward normal, and with bigger sample sizes, the sampling distribution becomes closer and closer to normal in shape.

The standard deviation of the sampling distribution of differences between two means is called the standard ______ of the ________ between two sample ______, and its symbol is σ_(X̄−Ȳ) ("sigma sub eks-bar-minus-wi-bar") [17.4]. Its value depends on whether the samples are independent or not.

Independent random samples exist when the selection of elements comprising the sample of Y scores is in some / no way influenced by the selection of elements comprising the sample of X scores, and vice versa [17.4]. In ordinary random selection from two populations, this would / would not be true [17.4].

For this case, the standard error of the difference between two means behaves

in accord with the following formula:

°X'-Y = [Formula 17.1]

Formula 17.1 requires the standard error of the mean of X and of Y, and these,
in turn, require that ____ and ____ be known [17.4]. As usual, in practice,
the two population standard deviations must be estimated from the samples.
Substituting in estimates of σ_X̄ and σ_Ȳ produces an estimate of σ_X̄−Ȳ that
is symbolized s_X̄−Ȳ ("little es sub eks-bar-minus-wi-bar"). The formula is:

s_X̄−Ȳ = ____________________ [Formula 17.2a]

Since s_X̄ = s_X/√n_X and s_Ȳ = s_Y/√n_Y, in practice the formula works out to:

s_X̄−Ȳ = ____________________ [Formula 17.2b]

Estimating σ_X̄−Ȳ introduces a degree of error, of course. If the size of each
sample equals or exceeds ____, the error will be small enough that the
procedures described here will be acceptable if not entirely correct [17.4].
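A numerical sketch of this estimate with invented scores: the estimate combines
the two estimated standard errors of the mean, s_X/√n_X and s_Y/√n_Y, as the
square root of the sum of their squares. Note that `statistics.variance` uses
n − 1 in the denominator, as s² requires.

```python
import math
import statistics

# Invented data for two independent samples.
X = [28, 26, 31, 24, 27, 30, 25, 29]
Y = [22, 25, 21, 26, 23, 24, 20, 27]

# Estimated standard error of the difference between two independent means.
se_diff = math.sqrt(statistics.variance(X) / len(X) +
                    statistics.variance(Y) / len(Y))
print(se_diff)   # sqrt(6/8 + 6/8) = sqrt(1.5), about 1.22 for these scores
```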

There are two basic ways in which dependent samples may be generated: (1)
the same subjects are used for both conditions of the study, and (2) different
subjects are used, but they are ____________ on some variable related to
performance on the variable being observed [17.9]. When samples are dependent,
the standard error of the difference between means must take account of the
____________ induced by the existing relationship between the samples [17.9].
The standard error is:

σ_X̄−Ȳ = ____________________ [Formula 17.4]

When the parameters are unknown, the formula that estimates σ_X̄−Ȳ is:

s_X̄−Ȳ = ____________________ [Formula 17.5]
Testing Hypotheses about Two Means: Normal Curve Model 141

Again, error will be introduced in substituting an estimate of σ_X̄−Ȳ for its
true value. Evaluation of the approximate z that is obtained by the procedures
described in this chapter according to the characteristics of the normal curve
will generally be acceptable, although not entirely accurate, when the number
of pairs of scores / total number of scores = 40 or more [17.9].

The Alternative Hypothesis

The alternative hypothesis, H_A, may take one of three forms. The
nondirectional form says that the two populations of interest do not have the
same mean, which is to say that the difference between the population means is
not zero. In symbols, this form of H_A says that μ_X − μ_Y ≠ 0. This form
gives rise to a two-tailed test in which the region of rejection is divided
between the two tails of the sampling distribution of differences between
sample means, as in Figure 17.2 on p. 294.

The other possibilities for the alternative hypothesis are directional forms
stating either that the X population has a greater mean than the Y population
or vice versa. In symbols, these forms say either that μ_X − μ_Y > 0 or that
μ_X − μ_Y < 0. These forms give rise to a one-tailed test in which the region
of rejection is located entirely in one tail of the sampling distribution of
differences between sample means. Figure 17.3 on p. 294 shows the appropriate
picture for a directional alternative of the first kind; for a directional
alternative of the second kind, the region of rejection would be located in the
left-hand tail.

No matter what the form of the alternative hypothesis, the region or regions
of rejection have an area equal to a, the level of significance for the test.

Locating (X̄ − Ȳ) within the Sampling Distribution

As noted above, if the null hypothesis of no difference between the two
population means is true, the random sampling distribution of differences
between sample means is centered on zero. How deviant is the obtained sample
difference from the hypothesized population difference of zero? To answer, the
obtained difference must be expressed in the form of a z score, where z =
(score − mean of distribution)/(standard deviation of distribution). In the
sampling distribution

of differences between sample means, our obtained difference, (X̄ − Ȳ), is the
____________, the hypothesized difference between the population means is the
____________, and s_X̄−Ȳ is the estimated ____________ [17.5].
As with problems involving single means, we have an approximate z rather than a
true z, because an estimate of the standard deviation is substituted for the
____________ value [17.5]. The use of the symbol "z" will continue to remind us

of this. The formula for the location of (X̄ − Ȳ) in the sampling distribution,
expressed as an approximate z, is:

"z" = ____________________ [Formula 17.3]

Here (μ_X − μ_Y)_hyp ("mew sub eks minus mew sub wi, the quantity hype") is the
value of (μ_X − μ_Y) stated in the null hypothesis, which is usually zero.
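Put together, the whole two-independent-means test is only a few lines. The
scores below are invented, and the .05 two-tailed critical value 1.96 is the
familiar normal-curve figure:

```python
import math
import statistics

# Invented independent samples; H0 says mu_X - mu_Y = 0.
X = [28, 26, 31, 24, 27, 30, 25, 29]
Y = [22, 25, 21, 26, 23, 24, 20, 27]
hyp_diff = 0.0

# Estimated standard error of the difference (independent samples).
se = math.sqrt(statistics.variance(X) / len(X) +
               statistics.variance(Y) / len(Y))

# Approximate z: (obtained difference - hypothesized difference) / SE.
z = (statistics.mean(X) - statistics.mean(Y) - hyp_diff) / se

# Two-tailed test at alpha = .05: reject H0 when |"z"| is 1.96 or more.
print(z, abs(z) >= 1.96)
```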

A Way to Simplify Things in the Case of Dependent Means

Calculating "z" is laborious in the case of dependent samples because of the
work involved in finding the correlation r_XY, which is required for the
estimate of the standard error of the difference between the means, s_X̄−Ȳ. An
alternative method that saves computational work while giving an identical
answer is available.

Consider the hypothesis that μ_X − μ_Y = 0. If the hypothesis is true, then
it is also true that the mean of the population of differences between paired
values of X and Y is ____ [17.11]. [An exercise to provide insight into this
state of affairs appears below.] If the difference between an X score and its
paired Y score is designated by D, the initial hypothesis may be restated:
H₀: ____ = ____ [17.11]. The alternative method requires that we find D̄, the
mean of the population / sample set of difference scores, and inquire whether
it differs significantly from the hypothesized mean of the population / sample
of difference scores, μ_D [17.11]. In this method the two-sample problem is
reduced to a one-sample problem exactly like that treated in the previous two
chapters.

To locate the obtained mean of the sample of difference scores, D̄, within
the sampling distribution of quantities of this kind, we must calculate an
approximate z score as follows:

"z" = ____________________ [Formula 17.6]

The Effect of Sample Sizes in the Case of Independent Means

When inference concerns two independent sample means, the samples may be /
must be of different size [17.7]. However, if σ_X and σ_Y are equal, the total
of the sample sizes (n_X + n_Y) is used most efficiently when n_X __ n_Y
[17.7]. This will result in a larger / smaller value for σ_X̄−Ȳ than otherwise
[17.7]. The advantage of a larger / smaller σ_X̄−Ȳ is that, if there is a
difference between μ_X and μ_Y, the probability of claiming it (rejecting ____)
is increased [17.7].

The point just noted has to do with the relative size of the two samples.
What about the absolute magnitude of sample size? Other things being equal,
large samples increase / decrease the probability of detecting a difference
when a difference exists [17.7].

Randomization as Experimental Control

Comparisons between two or more groups may be divided into two categories:
those in which the investigator can assign to each subject any particular treat¬
ment condition, and those in which the investigator cannot. In a study of the
first kind, it is possible for the investigator to assign treatment condition
at random to the subjects, and to do so has important advantages.

The primary experimental (as opposed to statistical) benefit of randomization
lies in the chance (and therefore impartial) assignment of extraneous
influences among the groups to be compared. (An extraneous influence is one
other than the treatment, and thus one whose effects the investigator would not
wish to entangle with any effect the treatment might have.) Those who are
likely to do well have more / just as much chance of being assigned to one
treatment group than / as they have to another, and the same / opposite is true
of those who are likely to do poorly [17.8]. The beauty of randomization is
that it affords this type of experimental control over extraneous influences
whether or not they are known by the experimenter to exist. Random assignment
of subjects to treatment groups guarantees / does not guarantee equality
[17.8]. But randomization tends to produce equality, and that tendency
increases in strength as sample size increases / decreases [17.8].
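That tendency toward equality can be demonstrated with invented numbers: give a
pool of subjects a known extraneous score (an "aptitude" here), split the pool
at random into two groups, and watch the typical gap between the group means
shrink as the groups grow.

```python
import random
import statistics

random.seed(2)

def mean_gap(n_per_group):
    # Invented extraneous scores for a pool of 2n subjects.
    pool = [random.gauss(100, 15) for _ in range(2 * n_per_group)]
    random.shuffle(pool)                          # random assignment
    g1, g2 = pool[:n_per_group], pool[n_per_group:]
    return abs(statistics.mean(g1) - statistics.mean(g2))

# Average gap between group means over many randomizations.
small = statistics.mean(mean_gap(10) for _ in range(500))
large = statistics.mean(mean_gap(250) for _ in range(500))
print(small, large)   # the typical gap is smaller with the larger groups
```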

Inspection of the outcome of randomization sometimes tempts the experimenter
to exchange a few subjects from group to group before proceeding with the
treatment in order to obtain groups more nearly alike. Such a move improves
things / leads to disaster [17.8]. The standard error formulas are based on the
assumption of randomization, and casual adjustment of this kind makes them more
appropriate / inappropriate [17.8].



Aspects of the Use of Dependent Samples

Samples (or means) can be dependent for one of two reasons, as noted above:
because one group of subjects appeared in one sample and another group in the
other sample, but the subjects were matched in pairs, one member of each pair
from each sample; or because the same subjects were used in both conditions. In
both cases, randomization can be used to advantage.

With matched subjects, the benefit of randomization as control can be achieved
by assigning treatment condition ____________ to the members of each pair,
taking care to do so independently for each pair of subjects [17.12]. The
problem is more complicated when the same subjects are used for both treatment
conditions. Here, random assignment would mean deciding randomly which
treatment the subject will receive ________ and which will be given ________
[17.12]. This will create some problems when the first treatment experience
changes the subject in some way so that she or he performs differently under
the second treatment. Practice effect and fatigue are two possible influences
that might affect a subject's second performance.

From a statistical standpoint, there can be an advantage in electing to use
paired observations rather than independent random samples, when a choice is
available. Pairing observations makes possible elimination of an extraneous
source of variation. The effect of doing so is to reduce / increase the
influence of random variation on the differences between means. The standard
error measures this factor. The effect of reducing the standard error of the
difference by pairing is the same as reducing it by increasing sample ______
[17.12]. The less the error (as measured by the standard error), the less /
more likely it is to mask a true difference between the means of the two
populations [17.12]. To put it more formally, a reduction in σ_X̄−Ȳ reduces
the probability of committing a Type I / II error [17.12].

The reduction of σ_X̄−Ȳ induced by pairing observations depends on the value
of the ____________ induced by pairing [17.12]. In general, when pairing is on
the basis of a variable importantly related to performance of the subjects, the
correlation will be higher / lower than otherwise, and the reduction in σ_X̄−Ȳ
will consequently be lesser / greater [17.12].



CAUTIONS CONCERNING CONFUSABLE QUANTITIES

1. μ_(X̄−Ȳ) ≠ μ_X̄ − Ȳ
("Mu sub eks bar minus wi bar does not equal Mu sub eks bar Minus Wibar.")

The quantity on the left here is the mean of a population; that's what the
μ indicates. The population is composed of numbers derived by taking the mean
of a sample of scores called X and subtracting from this mean the mean of a
sample of scores called Y; that's what the subscript X̄−Ȳ indicates. The
numbers described by the subscript are differences between sample means, then,
and the expression μ_(X̄−Ȳ) designates the mean of a sampling distribution of
such quantities.
The first paragraph on p. 139 of this workbook describes how such a sampling
distribution could be generated.

The quantity on the right above is a difference, the difference between (a)
the mean (μ_X̄) of a population, the elements of which are means (X̄) of
samples of scores called X, and (b) the mean (Ȳ) of a single sample of scores
called Y.
You will have no occasion to deal with this bizarre expression in this course,
and probably no occasion to deal with the expression at any other time in your
life, even if you become a professional statistician. Be sure you don't confuse
it with the expression on the left above.

2. σ_(X̄−Ȳ) ≠ σ_X̄ − Ȳ

("Sigma sub eks bar minus wi bar does not equal Sigma sub eks bar Minus
Wibar.")

The quantity on the left this time is the standard deviation of a population;
that's what the σ indicates. The population is composed of numbers derived by
taking the mean of a sample of scores called X and subtracting from this mean
the mean of a sample of scores called Y; that's what the subscript X̄−Ȳ
indicates. The numbers described by the subscript are differences between
means, then, just as in the expression μ_(X̄−Ȳ). σ_(X̄−Ȳ) is the standard
deviation of a sampling distribution of differences between means, and it has
the special name "standard error."

The quantity on the right, in contrast, is a difference, the difference
between (a) the standard deviation (σ_X̄) of a population, the elements of
which are means (X̄) of samples of scores called X, and (b) the mean (Ȳ) of a
single sample of scores called Y. This is another bizarre expression that will
never arise in this course, unless you think it or write it by mistake. Please
don't.

3. The distinction between independent and dependent variables ≠ the
distinction between independent and dependent means.

An independent variable is a variable manipulated by a researcher to see


whether it has some effect on another variable; it is a factor that might have
some influence on this other variable. The text calls it a treatment. In the
experiment described on pp. 285-286 of the text, the independent variable is the
quantity of Vitamin A in the subject's diet, and it is tested as a factor that

might affect the subject's visual acuity under dim light. The variable that
might be influenced by the independent one is called a dependent variable. It
is not manipulated; rather it is left free to vary, and it is measured for each
subject in each condition of an experiment. In the one described in the text,
the dependent variable is visual acuity under dim light.

This distinction has nothing to do with the distinction between independent


means and dependent means, and the similarity in terminology is just an
unfortunate coincidence. The means that may be independent or dependent are
means of
scores on a dependent variable. What determines whether the means are indepen¬
dent or dependent is the researcher's procedure in collecting the data.

If the researcher tested one sample of subjects in one condition of the


experiment and a different sample of subjects in the other condition, without
doing anything more special than this, the mean of the scores for the first
condition is independent of the mean of the scores for the second condition
(and
vice versa). The two samples may be of different sizes in this case, and even
if they are of the same size, there is no logical way to pair a score from one
sample with a score from the other sample.

If the researcher tested each subject first in one condition and then in the
other condition, though, the mean of the scores for the one condition and the
mean of the scores for the other condition are dependent. Or if the researcher
picked a first subject, looked around for a second one who matched the first in
some way (in visual acuity under normal light for the example on p. 297), flipped
a coin to determine which member of this pair went into which condition, and
continued in this way, matching each subject for one condition with a subject
for the other condition, again the mean of the scores for the one condition and
the mean of the scores for the other condition are dependent. When means are
dependent, the samples they characterize must be of the same size, and there is
a logical way to pair each score in one sample with a score from the other sample.

Whether means are independent or dependent, though, and if dependent, whether


they derive from the repeated-measurement or the matching procedure, the scores
contributing to the means are measurements of a dependent variable, and the
researcher is looking to see whether an independent variable (or treatment) is
associated with a statistically significant difference between the means.

In distinguishing independent and dependent means, the text sometimes talks


about independent or dependent samples instead. The terminology is equivalent.

WHY DOES μ_D = 0 IF μ_X − μ_Y = 0?

In the bottom paragraph on p. 299, the text zips you through the point that
μ_D = 0 if μ_X − μ_Y = 0. If you found that point unclear, this exercise
should help.

Dependent means arise when there is some logical way to pair each score in
one condition of a study with a score from the other condition of the study.
Such pairings are shown in Table 17.5 on p. 300 of the text. Here we are asked
to imagine that 20 subjects were chosen at random from some population and given
a preliminary test to determine their reaction time to a white light. Ten pairs
of subjects were then formed on the basis of these reaction times. Within each
pair, the two subjects were equal in the speed of their reaction to the white
light, but some pairs were relatively slow while others were relatively fast.
The reaction times on which the pairings were done do not appear in Table 17.5,
though, and they do not enter into any of the statistical calculations for the
study.

The researcher then flipped a coin or did the equivalent to assign the members
of each pair at random to one condition or the other of the experiment. The
procedure might have gone like this: Take a pair of subjects; call one of them
A
and the other B. If the coin comes up heads, Subject A is tested with the green
light and Subject B with the red light; if the coin comes up tails, it's vice
versa for the subjects.

Reaction times to the colored lights for the ten pairs of subjects are shown
in Table 17.5 in the columns headed X and Y. Each score is the time in
milliseconds (thousandths of a second) to respond to a light. The light was
green for
one member of each pair, whose score was subsequently called X, and red for the
other member of each pair, whose score was called Y.

Note that X̄ = 27.4 while Ȳ = 26.8 milliseconds.

Onward to Table 17.6 now. Here the same ten pairs of subjects are listed
in the same order, from Pair #1 down to Pair #10, along with their scores, X or
Y, again. But this time the difference, called D, between the X score and the Y
score for each pair is included. Check the column of D values: 3 = 28 - 25; -1 =
26 - 27; and so on. In general, D = X - Y.

What is D̄? It works out to be +.6, as the upper right corner of the table
indicates. (Check the computation.) So what? Well, X̄ − Ȳ = 27.4 − 26.8 = +.6
too. This is an instance of the generalization that where difference scores D
are computed as X − Y, D̄ = X̄ − Ȳ.

Now construct another such instance yourself. Fill in the missing numbers in
the table below. It's been arranged so that here the mean of the difference
scores works out to be zero.

Pair    X    Y    D = X − Y
 1      6    4        2
 2      5    6
 3      7    3
 4      5    5
 5      3    6
 6      4    6

ΣX = ____    X̄ = ____
ΣY = ____    Ȳ = ____
X̄ − Ȳ = ____
ΣD = ____    D̄ = ____

Does D̄ = X̄ − Ȳ? If not, you made one or more mistakes.

If you did your computations correctly, your table will indicate that
D̄ = X̄ − Ȳ.

The generalization the text states on p. 299 is just this statement expressed
in terms of population parameters. Corresponding to D̄, the mean of a sample of
difference scores called D, is μ_D. Corresponding to X̄ and Ȳ are μ_X and μ_Y.
In general, μ_D = μ_X − μ_Y. If μ_X − μ_Y = 0, then μ_D = 0 too.

This is an eternal truth such as mathematicians call a theorem, and it comes


out pretty in words: The mean of the difference scores equals the difference
between the means.
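The theorem can also be checked mechanically on the exercise table above:

```python
import statistics

# The paired scores from the exercise table above.
X = [6, 5, 7, 5, 3, 4]
Y = [4, 6, 3, 5, 6, 6]
D = [x - y for x, y in zip(X, Y)]

# The mean of the difference scores equals the difference between the means.
print(statistics.mean(D), statistics.mean(X) - statistics.mean(Y))
```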

You yourself can prove that this statement is an eternal truth; all it takes
is a very little high-school algebra. If you want to try—and you'll feel good
if you figure out the proof for yourself—the notes below will get you started.

Proof that μ_D = μ_X − μ_Y

In general, a table of the kind under consideration here has the following
form, where N is the total number of pairs of X and Y scores:

Pair    X     Y     D = X − Y

 1      X1    Y1       D1
 2      X2    Y2       D2
 3      X3    Y3       D3
 .      .     .        .
 .      .     .        .
 N      XN    YN       DN

To prove that μ_D = μ_X − μ_Y, start as follows:

μ_D = (D1 + D2 + D3 + . . . + DN) / N

    = [(X1 − Y1) + (X2 − Y2) + (X3 − Y3) + . . . + (XN − YN)] / N

To complete the proof, change that last expression on the right of the = sign
until you have μ_X − μ_Y. You may or may not need all four of the additional
lines there, or you may need more than four; there's more than one way to do
the proof.

If you get stuck, consult the hint in the middle of this page.

[Hint, printed upside down in the original: Try working backward from your
goal, which is μ_X − μ_Y. What can you turn this expression into? If a good
try at this tactic doesn't work, there's one more hint available on the bottom
of the next page.]

SYMBOLISM DRILL

     Symbol    Pronunciation              Meaning

 4   Σ         "the sum of"
 5   X̄         "eks bar"                  ΣX/__ ; the ______ of a ______
 6   μ         "mew"                      ΣX/__ ; the ______ of a ______
 9   x         "little eks"               X − __ ; ________ score
11   σ²        "sigma squared"            Σx²/__ ; ________ of a ________
13   S²        "big es squared"           Σx²/__ ; ________ of a ________
14   S         "big es"                   √(Σx²/__) ; ________ of a ________
12   σ         "sigma"                    √(Σx²/__) ; ________ of a ________
23   s         "little es"                Estimate of __ ; √[Σx²/(______)]
32   s²        "little es squared"        Estimate of __ ; Σx²/(______)
16   r         "ar"                       ________________
17   ρ         "rho"                      ________________
21   μ_X̄       "mew sub eks bar"          ______ of ____________ of ______
22   σ_X̄       "sigma sub eks bar"        ________ of the ________
24   s_X̄       "little es sub eks bar"    Estimate of ____ ; s/√__
25   H₀        "aitch null"               ________________
26   H_A       "aitch sub ay"             ________________
27   μ_hyp     "mew hype"                 Value of __ stated in ________
33   μ_true    "mew true"                 ________ value of __
28   "z"       "zee quotes"               Approximate __ score with
                                          estimated ________

[Final hint for the proof starting on p. 148, printed upside down in the
original: μ_X = (X1 + X2 + X3 + . . . + XN)/N, and similarly for μ_Y.]

     Symbol          Pronunciation                Meaning

29   z_crit          "zee crit"                   ________________
30   α               "alpha"                      Risk of Type __ error;
                                                  level of ____________
31   β               "bayta"                      Risk of Type __ error
34   μ_X̄−Ȳ           "mew sub eks bar minus       ______ of ____________ of
                     wi bar"                      ____________ between ________
35   (μ_X − μ_Y)_hyp "mew sub eks minus mew       Value of ____________ stated
                     sub wi, the quantity hype"   in null hypothesis
36   σ_X̄−Ȳ           "sigma sub eks bar minus     ________ of the ____________
                     wi bar"                      between two ________
37   s_X̄−Ȳ           "little es sub eks bar       Estimate of ________
                     minus wi bar"
38   D               "dee"                        X − Y ; ________ score

ANNALS of EGREGIOUS* EXAMPLES

A businessman came to see me recently for advice on the analysis of some data
he'd collected. He ran a marketing-research firm, and he'd conducted two
studies testing consumer reaction to several varieties of frozen food. In both
studies, his subjects were shoppers who were approached in public places such
as malls and parking lots. The subjects were asked to taste one or more
varieties of a food (which had been cooked, of course) and to report a judgment
of "bad," "poor," "fair," "good," or "excellent." In accord with the procedure
described in Question 18 on p. 17 of this workbook, the fellow had translated
these judgments into the numbers 1 through 5.

What is of interest here is a subtle difference between the two studies. To


simplify things a bit, let us suppose that only two varieties of a food were
compared in each study. In the first one, there were a total of 200 subjects.
Half
of them had tasted one variety and half the other variety; a given subject had
tasted only one, so there were a total of 200 scores.

*Egregious ("eh-gree-juss"): conspicuously bad.



You know how to analyze data of this kind now. (A fine accomplishment, no?)
Say how you would do it.

1. What statistics would you calculate to describe the data?

2. What inferential procedure would you apply? Say whether you would test
a hypothesis about a single population mean or about two population means, and
if the latter, whether the sample means are independent or dependent. State your
null hypothesis and your alternative hypothesis, choosing between a one-tailed
and a two-tailed test. List the calculations you would have to do.

In the second study, there were only 100 subjects, but each subject had tasted
two varieties of a food and judged both of them, so each subject contributed
two scores to the data, and there were again a total of 200 scores.

You also know how to analyze data of this kind. Again outline how you would
do it.

3. What statistics would you calculate to describe the data?

4. What inferential procedure would you apply? Spell out the details as for
Question 2 above.

Now we come to the reason why this example is egregious. In his first study,
the businessman had collected 200 judgments, 100 recorded on one page of a
notebook for one variety of a food, and the other 100 recorded on a second page
for a second variety. The fellow had cast each sample of 100 scores into a
frequency distribution, producing two tables looking like this (the frequencies
are hypothetical):

Judgments of Variety A Judgments of Variety B

Score f Score f

5 23 5 17
4 55 4 40
3 12 3 23
2 6 2 15
1 4 1 5

Σf = 100                      Σf = 100

The mean and the other descriptive statistics required for each sample were easy
to calculate from these tables. (The text covers these techniques on pp. 66 and
90.) All this is fine and dandy for the first study.

But this is exactly how the businessman had preserved his data for the second
study too. Two tables of this sort were all that he had to go on.

5. Something is wrong here. What is it?

The data from the second study, then, could not be properly analyzed. The
businessman could hardly believe it; the difference between the procedures he'd
followed in the two studies seemed so slight.

Statistical Moral: Plan the statistical techniques you'll use on your data
BEFORE you collect the data.

The businessman confessed to me that he had already written his report on the
foods for the company that was considering marketing them. In the report he had
simply asserted that the variety with the highest mean rating in each study was
significantly higher than the others tested in that study, but he didn't really
know this to be so, not even for the first study.

Moral Moral: A course in statistics helps to preserve one's honesty.


CHAPTER 18   ESTIMATION OF μ AND μ_X − μ_Y

18.1 Introduction

18.2 The Problem of Estimation

18.3 Interval Estimates of μ

18.4 An Interval Estimate of μ_X − μ_Y

18.5 Evaluating an Interval Estimate

18.6 Sample Size Required for Estimates
of μ and μ_X − μ_Y

18.7 The Relation between Interval


Estimation and Hypothesis Testing

18.8 The Merit of Interval Estimation

PROBLEMS and EXERCISES

1 ________    2 ________

3 ________    4 ________

5 ________    6 ________

7 ________    8 ________

9 ________    10 ________

11 ________   12 ________

13 ________   14 ________

15 ________   16 ________

17 ________   18 ________

19 ________   20 ________

SUMMARY

The techniques of inferential statistics permit us to reach conclusions


about entire populations on the basis of samples drawn from those populations.
These techniques fall into two categories: hypothesis testing and estimation.
Previous chapters have treated hypothesis testing for cases where sample size
is large enough to permit the use of the normal-curve model. The present
chapter deals with estimation in cases where sample size is again large and the
normal-curve model thus still appropriate.

Point Estimates vs. Interval Estimates

Sometimes it is required to state a single value as an estimate of the
population value. Such estimates are called ________ estimates [18.2].
________ estimates alone are made reluctantly, because they may be
considerably in error [18.2]. ____________ estimates are more practical when
conditions permit [18.2]. In ____________ estimation, limits are set within
which it appears reasonable that the population ________ lies [18.2].

Other things being equal, if wide limits are set, the likelihood that the
limits will include the population value is low / high, and if narrow limits
are set, there is greater / lesser risk of being wrong [18.2]. Because the
option exists of setting wider or narrower limits, any statement of limits must
be accompanied by an indication of the degree of ____________ that the
population parameter falls within the limits [18.2]. The limits themselves are
usually referred to as a ____________ interval and the statement of degree of
confidence as a ____________ coefficient [18.2].

Interval Estimates of y

To construct an interval estimate of a population mean μ, one begins with
the mean of a sample drawn (ideally at random) from that population. A certain
quantity is added to the sample mean to set the upper limit of the interval,
and the same quantity is subtracted from the mean to set the lower limit. The
quantity that is added and subtracted is the product of two values, a certain
z score and, if it is known, the standard error of the mean for samples of
whatever size was drawn. In symbols, the limits are:

X̄ ± z_p σ_X̄

where X̄ is the sample / population mean, obtained by sampling;



σ_X̄ is the ________ ________ of the ________; and z_p is the magnitude
of z for which the probability is __ of obtaining a value so deviant or more
so (in either direction) [18.3]. As usual, most frequently ____ must be
substituted as an estimate of σ_X̄ [18.3]. When n ≥ ____, little error will be
introduced by substituting ____ for ____ [18.3].

Once the specific limits are established for a given set of data, the
interval thus obtained either does or does not cover __ [18.3]. The probability
is, at this stage, either __ or __ that the interval covers __; we do not know
which [18.3]. Consequently, it is usual to substitute the term ____________
for probability in speaking of a specific interval [18.3].

What does it mean to say that we are, say, "95% confident"? We do not know
whether the particular interval covers __, but when intervals are constructed
according to the rule, __ of every 100 of them (on the average) will include
__ [18.3]. Remember that it is the interval / μ that varies from estimate
to estimate, and not the value of __ [18.3].

For a given confidence coefficient, a small sample results in a narrow /
wide confidence interval, and a large sample in a narrower / wider one [18.3].
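A sketch of the rule in code, with invented scores, a 95% confidence
coefficient (so z_p = 1.96), and s substituted for the unknown σ as usual; a
real large-sample application would want n at or above the text's cutoff.

```python
import math
import statistics

# Invented sample of scores.
scores = [102, 98, 110, 95, 105, 99, 108, 101, 97, 104,
          100, 96, 109, 103, 107, 94, 106, 98, 102, 100]
n = len(scores)

x_bar = statistics.mean(scores)
s_xbar = statistics.stdev(scores) / math.sqrt(n)   # estimate of sigma_X-bar
z_p = 1.96                                         # C = .95, so p = .05

lower, upper = x_bar - z_p * s_xbar, x_bar + z_p * s_xbar
print(lower, upper)   # the 95% confidence limits for mu
```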

Interval Estimates of μ_X − μ_Y

An interval estimate of the difference between two population means can be
constructed by following the rule:

(X̄ − Ȳ) ± z_p σ_X̄−Ȳ

where (X̄ − Ȳ) is the difference between the two sample ________; σ_X̄−Ȳ is
the ________ ________ of the ____________ between two ________;
z_p is the magnitude of z for which, in the normal distribution, the
probability is __ of obtaining a value so deviant or more so (in either
direction); and p = 1 − C, where C is the ____________ [18.4].

Once again, the procedure is dependent on knowledge of σ_X̄ and σ_Ȳ (and ρ, in
the case of dependent / independent samples), which are the values needed to
obtain σ_X̄−Ȳ [18.4]. If only sample estimates are available (the usual case),
____ must be substituted for σ_X̄−Ȳ [18.4]. When the size of each of the two
samples equals or exceeds ____ for dependent / independent samples, or when the
number of pairs of scores equals or exceeds ____ when samples are dependent /
independent, the error in such a substitution is usually tolerable [18.4].

Once again, for a given confidence coefficient, a small / large sample
results in a wide confidence interval and a small / large sample in a narrower
one [18.4].
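The corresponding computation for two independent samples, again with invented
data and a 95% confidence coefficient:

```python
import math
import statistics

# Invented independent samples.
X = [31, 27, 34, 29, 33, 28, 30, 32, 26, 35]
Y = [25, 29, 24, 28, 23, 27, 26, 22, 30, 21]

diff = statistics.mean(X) - statistics.mean(Y)
se = math.sqrt(statistics.variance(X) / len(X) +
               statistics.variance(Y) / len(Y))   # substitute for the unknown SE
z_p = 1.96                                        # 95% confidence

print(diff - z_p * se, diff + z_p * se)   # limits for mu_X - mu_Y
```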

Interpreting an Interval Estimate

Is a given interval wide or narrow? If we are not familiar with the variable
under study, we cannot say. In such a case, one way to add meaning is to
interpret the interval limits in terms of number of ____________
____________ of the variable rather than in terms of raw-score points [18.5,
Paragraph 2]. One advantage of expressing the outcome of an interval estimate
in this way is that it compensates for the fact that the importance of a given
interscore distance depends on the size of the ____________ of the
variable [18.5].

When confidence limits are expressed this way, we need to keep in mind that
the width of the limits must still be considered in the light of the value of
the confidence ____________ employed, just as we do when the limits are
expressed in score points [18.5].

Determining Sample Size for an Estimate of a Given Width

Sometimes we wish an interval estimate to be of a certain width. After


choosing our confidence coefficient, we can estimate the size of the one or two
samples required to hold the estimate to the desired width—if we can estimate
the standard deviation of the one or two populations of interest. The details
are available in Section 18.6.

Interval Estimation vs. Hypothesis Testing

Interval estimation and hypothesis testing are two sides of the same coin.
For most population parameters or differences between two parameters, an
interval estimate contains all values of H₀ that would be accepted / rejected
had they been tested using α = 1 − C [18.7]. But estimation has some important
advantages in many cases:

1. The final quantitative output of an interval estimate is a statement about
the ____________ or ____________s concerned [18.8]. In hypothesis testing, the
statement is about a derived score, such as z or t, or about a probability, P.

Estimation of μ and μ_X − μ_Y 159

In either form of inference, the question is about the parameter(s). A confidence
interval is thus a(n) indirect / direct answer to the question, whereas hypothesis
testing focuses on a derived variable [18.8].

2. An interval estimate straightforwardly exhibits the influence of random
sampling variation. But in hypothesis testing, the magnitude of the derived
variable depends on two factors: the difference between what was hypothesized
and what is ________, and the amount of sampling variation present, which is a
function of sample size [18.8].

3. Hypothesis testing is subject to an important confusion between a statistically
____________ difference and an important difference [18.8], but this problem
essentially disappears with interval estimation.

4. Interval estimation avoids the error of thinking that "accepting H₀ /
H_A" means that H₀ / H_A is true or probably true [18.8]. Interval estimation
can be applied in most situations to inquire whether the population is characterized
by a particular parameter value, and if used in this way, the interval
makes plain all of the values that might characterize the parameter, including,
possibly, the value inquired about.

5. Since the null hypothesis is a point / range hypothesis, it is unreasonable
to believe that it could be exactly true in any practical encounter. Interval
estimation is therefore more / less realistic [18.8].

SYMBOLISM DRILL

No.  Symbol            Pronunciation   Meaning

 1   n                                 Number of scores in a ________
 2   N                                 Number of scores in a ________
 3   X                                 ____________________
 4   Σ                                 ____________________
 5   X̄                                 Σ___/___; the ______ of a ________
 6   μ                                 Σ___/___; the ______ of a ________
 9   x                                 ___ − ___ or ___ − ___; ________ score
11   σ²                                Σ___²/___; ________ of a ________
12   σ                                 √(Σ___²/___); ________ of a ________
13   S²                                Σ___²/___; ________ of a ________
14   S                                 √(Σ___²/___); ________ of a ________
32   s²                                Estimate of ____; Σx²/(______)
23   s                                 Estimate of ____; √(Σx²/(______))
22   σ_X̄                               ________ ________ of the ________; ___/√___
24   s_X̄                               Estimate of ____; ___/√___
36   σ_X̄−Ȳ                             ________ ________ of the ____________ between ____ ________
37   s_X̄−Ȳ                             Estimate of ________
25   H₀                                ____________________
26   H_A                               ____________________
21   μ_X̄                               ______ of ________ ____________ of ______
27   μ_hyp                             Value of ___ stated in ________ ____________
34   μ_X̄−Ȳ                             ______ of ________ ____________ of ____________ between ____ ________
35   (μ_X − μ_Y)_hyp                   Value of ________ stated in ________ ____________
38   D                                 ___ − ___; ____________ score
39   C                 "see"           ____________ coefficient
17   ρ                                 ____________________
16   r                                 ____________________
CHAPTER 19

INFERENCE ABOUT MEANS AND THE


t DISTRIBUTION
19.1  Introduction
19.2  Inference about a Single Mean when σ Is Known and when It Is Not
19.3  Characteristics of Student's Distribution of t
19.4  Degrees of Freedom and Student's Distribution
19.5  Using Student's Distribution of t
19.6  Application of the Distribution of t to Problems of Inference about Means
19.7  Testing an Hypothesis about a Single Mean
19.8  Testing an Hypothesis about the Difference between Two Independent Means
19.9  Testing Hypotheses about Two Independent Means: An Example
19.10 Testing an Hypothesis about Two Dependent Means
19.11 Interval Estimates of μ
19.12 Interval Estimates of μ_X − μ_Y
19.13 Further Comments on Interval Estimation
19.14 Assumptions Associated with Inference about Means

PROBLEMS and EXERCISES

1 ___    2 ___
3 ___    4 ___
5 ___    6 ___
7 ___    8 ___
9 ___    10 ___
11 ___   12 ___
13 ___   14 ___
15 ___   16 ___
17 ___   18 ___

SUMMARY

In previous chapters (15, 17, and 18), the text presented techniques for
making inferences about population means. Each technique was described first
in an ideal form in which the appropriate standard error could be calculated
from known population parameters. A modification necessary for practical use
was then introduced, because in practical use the standard error must be estimated
from the sample or samples on hand. Estimation introduces a degree of
error that makes the normal curve the wrong model for the distribution of the
"z" statistic, but the error is tolerable when sample size is large. The present
chapter describes a modification of the modification that is necessary
when sample size is small.

The procedures described in this chapter are sometimes called small-sample
procedures, which might lead one to think that the basic issue is one of sample
size. This is not so. The fundamental issue is whether the formulas for the
standard errors are based on population ____________ or on sample estimates
of those ____________ [19.1]. If based on population ____________, the
procedures of chapters 15, 17, and 18 / this chapter are exactly correct
irrespective of sample size [19.1]. If based on estimates of population
parameters, the procedures of chapters 15, 17, and 18 / this chapter are
exactly correct [19.1].

t = "z" ≠ z

To test an hypothesis about the mean of a single population, we draw a random
sample from the population, find its mean, and ask whether the obtained value
would be likely or unlikely to occur if the hypothesis about the population were
true. To determine the likelihood of the obtained value, we consult the sampling
distribution of the mean for samples of whatever size we drew. This distribution
could be generated by combining means of repeated random samples of the given size.

To locate our obtained sample mean within the sampling distribution that
would occur if the null hypothesis were true, we would like to calculate a z
score according to the formula:

    z = (X̄ − μ) / σ_X̄

If the assumptions of Section 15.10 are satisfied, values of z will be ________
distributed as we move from random sample to random sample, but the
values of μ and σ_X̄ will remain fixed [19.2]. Consequently, values of z
may be considered to be formed as follows:

    z = [(normally distributed variable) − ( constant / variable )] / ( constant / variable )   [19.2]

Subtracting a constant from each score in a normal distribution changes / does
not change the shape of the distribution, and so / nor does dividing by a
constant [19.2]. Consequently, z will be normally distributed when ___ is normally
distributed, and the normal distribution is therefore the correct distribution
to which to refer z for evaluation [19.2].

In practice, though, we can only calculate an approximate z, "z", in which
an estimate is substituted for the true value of the denominator. So in reality,
if we were to draw repeated random samples from the population of interest, not
only would values of X̄ vary, but so would the estimates of σ_X̄. The resulting
statistic, "z", may be considered to be formed as follows:

    "z" = [(normally distributed variable) − (constant)] / (variable)

Because of the presence of the variable quantity in the denominator, this statistic
does not follow the normal distribution, though it is close to normal when
sample size is large. The distribution it does follow is called Student's distribution,
and the statistic itself is called t. "z" was invented by the author
of the text as a temporary name for pedagogical purposes.
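A small simulation makes the point concrete. Everything below is illustrative, not from the text: the population values μ = 100 and σ = 15, the sample size n = 5, and the number of repetitions are all invented. With the estimated standard error in the denominator, "z" lands outside ±1.96 noticeably more often than the 5% that the normal curve would promise.

```python
import random
from math import sqrt

# Draw many samples of n = 5 from a normal population, compute
# "z" = (X̄ − μ) / (s/√n) using the ESTIMATED standard error, and count how
# often it falls beyond ±1.96 — the cutoffs that bound 5% for a true z.
random.seed(1)
mu, sigma, n, reps = 100.0, 15.0, 5, 20000
beyond = 0
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    s = sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))  # estimate of σ
    if abs((xbar - mu) / (s / sqrt(n))) > 1.96:
        beyond += 1
print(beyond / reps)  # noticeably more than .05: the tails of "z" are fatter
```

The excess over .05 is exactly what the heavier tails of Student's t for df = 4 predict; with a much larger n the proportion would shrink back toward .05.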

Beginners sometimes think that it is the sampling distribution of means that
becomes nonnormal when σ_X̄ must be estimated from the sample. This is not so.
If the assumptions of Section 15.10 are met, X̄ will be normally distributed
regardless of sample size [19.2]. However, the position of X̄ is not evaluated
directly; rather it is the value of the statistic ___ = (X̄ − μ)/s_X̄ [19.2]. And
although z is normally distributed, the resulting value of t is not, because the
denominator is a variable.

Characteristics of the Family of t Distributions

Student's distribution of t is not a single distribution but rather a
________ of distributions [19.3]. The exact shape of a particular member of that
family depends on sample size, or, more accurately, on the number of ____________
of freedom (df), a quantity closely related to sample size [19.3]. In general,
the number of ____________ of freedom corresponds to the number of observations
that are completely ________ to vary [19.4]. One might at first suppose that
this would be the same as the number of scores in the sample (or samples), but
often conditions exist that impose restrictions so that the number of ____________
of freedom is smaller / larger [19.4]. For example, the number of ____________
of freedom in a problem involving the calculation of s is ______ [19.4, p. 332].


When the number of degrees of ____________ is ____________ (df = ∞),
the distribution of Student's t is exactly the same as that of normally distributed
z [19.3]. As the number of degrees of ____________ decreases, the characteristics
of the t distribution begin to depart from those of normally distributed
z [19.3]. When samples are large / small , the values of s_X̄ will be close to
that of ________, and t will be much like z [19.3, Paragraph 1]. Its distribution
is, consequently, very nearly normal. When sample size is large / small , the
values of s_X̄ will vary substantially about ________ [19.3]. The distribution of t
will then depart significantly from that of normally distributed z.

When the number of degrees of ____________ is less than ____________,
the theoretical distribution of t and the normal distribution of z are alike in
some ways, and different in others [19.3]. They are alike in that both distributions
have a mean of ___, are symmetrical / asymmetrical , and are unimodal /
bimodal [19.3]. The two distributions differ in that the distribution of t is
more / less leptokurtic than the normal distribution (a leptokurtic curve has a
lesser / greater concentration of area in the center and in the tails than does a
normal curve), has a smaller / larger standard deviation (remember that σ_z = ___),
and depends on the number of ____________ of freedom [19.3].

Putting t to Work

When sample size is so small that the procedures presented in the previous
chapters do not work accurately, the same procedures can still be followed with
these changes: (a) "z" should be called t, because t is the conventional name.
(b) In hypothesis testing, an obtained value of t should be evaluated not with
reference to the normal distribution, but with reference to the distribution of
t for whatever degrees of freedom are involved. For inferences regarding a single
population mean, df = n − 1. For inferences regarding two population means, df =
(n_X − 1) + (n_Y − 1) in the case of independent means and df = 1 less than the
number of pairs of scores in the case of dependent means. (c) In interval estimation,
to determine the value of t_p (the quantity analogous to z_p), not the normal
distribution but the t distribution for the appropriate degrees of freedom should
be used. The rules about degrees of freedom just cited also apply here. (d) For
both hypothesis testing and estimation in the case of two independent means, the
standard error of the difference between two means should be calculated as noted
below.
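Steps (a) and (b) for a single mean can be sketched as a short program. The ten scores and the hypothesized mean of 100 are invented for illustration; the critical value 2.262 quoted in the comment is the familiar tabled two-tailed t at α = .05 for df = 9 (the kind of entry the text's Table D supplies).

```python
from math import sqrt

def one_sample_t(scores, mu_hyp):
    """t = (X̄ − μ_hyp) / (s/√n), to be evaluated with df = n − 1."""
    n = len(scores)
    xbar = sum(scores) / n
    s = sqrt(sum((x - xbar) ** 2 for x in scores) / (n - 1))  # s estimates σ
    return (xbar - mu_hyp) / (s / sqrt(n)), n - 1

scores = [102, 98, 110, 105, 95, 108, 112, 99, 104, 107]  # invented data
t, df = one_sample_t(scores, mu_hyp=100)
# Compare |t| with the tabled critical value for df = 9 (2.262 at α = .05,
# two-tailed), not with 1.96 from the normal curve.
print(round(t, 2), df)
```

Here t ≈ 2.30 exceeds 2.262, so H₀: μ = 100 would be rejected at the .05 level; with a t between 1.96 and 2.262, the normal curve and Student's distribution would have disagreed.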

Pooling Variance Estimates When Dealing with Two Independent Means

To test the difference between two independent means, the procedure introduced
in Chapter 17 calls for the computation of an approximate z as follows:

    "z" = ________________________    [19.8, first formula]

The denominator here, s_X̄−Ȳ = √(s_X̄² + s_Ȳ²) = √((s_X²/n_X) + (s_Y²/n_Y)).

When sample size is relatively small / large , "z" is very nearly normally
distributed [19.8]. As sample size increases / decreases , its distribution departs
from the normal, but, unfortunately, neither is it distributed exactly as
Student's t [19.8], except when n_X = n_Y.

The eminent British statistician, Ronald A. Fisher, showed that a slightly
different approach to the computation of s_X̄−Ȳ results in a statistic that is
distributed as Student's t. The change that Fisher introduced was, in effect,
to assume that ___ = ___ [19.8]. This is often called the assumption of
homogeneity of ____________ [19.8]. Under the assumption, s_X² and s_Y² are
estimates of the same population variance. If this is so, then rather than make
two separate estimates, each based on a small sample, it is preferable to combine
the information from both samples and make a single ____________ estimate of the
population variance [19.8]. This estimate is called s_p² and is calculated as:

    s_p² = ________________________    [Formula 19.2]

This quantity may be substituted for ___ and for ___ in the formula for
s_X̄−Ȳ [19.8]. Some algebraic manipulation simplifies the formula to
s_X̄−Ȳ = √(s_p²(1/n_X + 1/n_Y)). The same formula should be used in setting
a confidence interval for the difference between two population means.
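Fisher's pooled approach can be sketched as follows. The two sets of eight scores are invented for illustration; the sketch pools the two sums of squares over the combined degrees of freedom (the standard form of s_p²) and then uses the simplified formula s_X̄−Ȳ = √(s_p²(1/n_X + 1/n_Y)) given just above.

```python
from math import sqrt

def pooled_t(x, y):
    """t for two independent means using the pooled variance estimate s_p²."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)        # Σx² for the first sample
    ssy = sum((v - my) ** 2 for v in y)        # Σy² for the second sample
    sp2 = (ssx + ssy) / ((nx - 1) + (ny - 1))  # single pooled estimate
    se = sqrt(sp2 * (1 / nx + 1 / ny))         # s_X̄−Ȳ
    return (mx - my) / se, (nx - 1) + (ny - 1)

x = [14, 11, 16, 12, 13, 15, 10, 14]  # invented "treatment" scores
y = [10, 9, 13, 8, 11, 12, 9, 10]     # invented "control" scores
t, df = pooled_t(x, y)
print(round(t, 2), df)  # evaluate against Student's t with df = 14
```

With equal n's, as here, the pooled formula gives the same standard error as the Chapter 17 formula; the two differ only when n_X ≠ n_Y.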

Is the assumption of homogeneity of variance usually justified? Experience
suggests that the assumption of homogeneity of variance appears to be reasonably
satisfied in only a few / many cases [19.14]. Furthermore, violation of the
assumption makes less disturbance when samples are large / small than when they
are large / small [19.14]. As a rule of thumb, it might be hazarded that moderate
departure from homogeneity of variance will have little effect when each
sample consists of ______ or more observations [19.14].

The problem created by heterogeneity of variance is minimized / maximized
when the two samples are chosen to be of equal size [19.14].

SYMBOLISM DRILL

No.  Symbol    Pronunciation   Meaning

 2   ______                    Number of scores in a population
 1   ______                    Number of scores in a sample
 3   ______                    A raw score, or the set of raw scores
 4   ______                    Result of summing quantities of some kind
 6   ______                    ΣX/N; the ______ of a ________
 5   ______                    ΣX/n; the ______ of a ________
 9   ______                    X − X̄ or X − μ; ________ score
11   ______                    Σx²/N; ________ of a ________
12   ______                    √(Σx²/N); ________ of a ________
13   ______                    Σx²/n; ________ of a ________
14   ______                    √(Σx²/n); ________ of a ________
32   ______                    Estimate of σ²; Σx²/(n−1)
23   ______                    Estimate of σ; √(Σx²/(n−1))
22   ______                    Standard error of the mean; σ/√n
24   ______                    Estimate of σ_X̄; s/√n
36   ______                    Standard error of the difference between two means
37   ______                    Estimate of σ_X̄−Ȳ
25   ______                    Null hypothesis
26   ______                    Alternative hypothesis
21   ______                    Mean of sampling distribution of means
27   ______                    Value of μ stated in null hypothesis
33   ______                    True value of μ
34   ______                    Mean of sampling distribution of differences between two means
35   ______                    Value of μ_X − μ_Y stated in null hypothesis
28   ______                    Approximate z score with denominator estimated
29   ______                    Critical value of z
40   ______    "tee"           Conventional name for "z"
38   ______                    X − Y; difference score
39   ______                    Confidence coefficient
30   ______                    Level of significance; risk of Type I error
31   ______                    Risk of Type II error
41   ______    "dee ef"        Degrees of freedom

MAP of t and RELATED CONCEPTS

t . . .

- is the conventional name for the quantity hitherto in this text called "z"

- designates [ X̄, D, or (X̄ − Ȳ) ] minus the mean of the sampling distribution
  of quantities of the given kind, divided by an estimate of the standard error
  of this sampling distribution
  (for quantities of the kind (X̄ − Ȳ), that estimate must be made by pooling
  information from the two samples, assuming that σ_X = σ_Y, when the samples
  are independent)

- is distributed under random sampling (if the assumptions reviewed in Section
  19.14 are correct) as a member of the family of Student's distributions of t;
  the sampling distribution has
  - a mean of zero
  - a shape that is unimodal, symmetrical, and leptokurtic in comparison to the
    normal distribution, but closer to normal for a larger number of ____________
  - a standard deviation greater than one, but smaller and closer to one for a
    larger number of degrees of freedom

(The number of degrees of freedom is the number of scores that are completely
free to vary.)
CHAPTER 20

INFERENCE ABOUT PEARSON CORRELATION COEFFICIENTS

20.1 Introduction
20.2 The Random Sampling Distribution of r
20.3 Testing the Hypothesis that ρ = 0
20.4 Fisher's z′ Transformation
20.5 Estimating ρ
20.6 Testing the Hypothesis of No Difference between ρ₁ and ρ₂: Independent Samples
20.7 Testing the Hypothesis of No Difference between ρ₁ and ρ₂: Dependent Samples
20.8 Concluding Comments

PROBLEMS and EXERCISES

SUMMARY

Like other statistics, the Pearsonian correlation coefficient would vary
from sample to sample were samples to be drawn repeatedly from a given population.
Thus the value characterizing a sample cannot be taken as the correlation
coefficient for the parent population, and it might not come even close to the
population value. If only a sample is available, as is usually the case, the
best we can do to learn the population value is to employ one of the techniques of
inferential statistics, hypothesis testing or estimation. These are the subject
of this chapter.

The Random Sampling Distribution of r

Consider a population of pairs of scores (X and Y) that form a bivariate
distribution. The Pearsonian correlation coefficient, calculated from the complete
set of paired scores, is ___ [20.2]. When the coefficient is calculated
from a sample, it is ___ [20.2]. If we draw a sample of given size at random
from the population, calculate r, return the sample to the population, and repeat
this operation indefinitely, the multitude of sample r's will form the ________
sampling ____________ of r for samples of the particular size [20.2]. As
we might expect, the values of r will vary more / less from sample to sample
when sample size is large [20.2].

If the sample values of r formed a normal distribution, we could proceed to
solve problems of inference by methods already familiar. Unfortunately, the
sampling distribution of r is not normal in shape. When ρ = ___, the sampling
distribution of r is symmetrical and nearly normal [20.2]. But, when ρ has a value
other than ___, the sampling distribution is skewed [20.2]. Because the
distribution of sample r's is not normal, alternative solutions must be sought
to provide a practical frame for inference. For some problems, the ________
distribution affords an appropriate model [20.2].

Testing the Hypothesis that ρ = 0

The t distribution (or rather the family of t distributions) is usable in
testing the hypothesis that ρ = 0, because a simple combination of the value of
r characterizing a given sample and the number of cases in the sample, n, has a
sampling distribution that is that for Student's t with n − 2 degrees of freedom
when ρ = 0. The combination is:

    t = ________________________    [Formula 20.1]

where r is the sample / population coefficient and n is the number of ________
of scores in the sample / population [20.3].

To test the hypothesis that ρ = 0, we calculate t according to Formula 20.1
and evaluate it, according to the ____________ level adopted and ________
degrees of freedom, with reference to the values of t found in Table D
of Appendix F [20.3]. The method can be used for a variety of levels of significance
and for one- or two-tailed tests.
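Formula 20.1 is left above for you to copy from the text; the sketch below assumes the standard form of that combination, t = r√(n − 2)/√(1 − r²). The values r = .50 and n = 27 are invented for illustration.

```python
from math import sqrt

def t_from_r(r, n):
    """Standard conversion of a sample r to t, evaluated with n − 2 df."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

t = t_from_r(r=0.50, n=27)
print(round(t, 2))  # compare with the tabled t for df = 25
```

Notice that for a fixed r, t grows with √n, which is why even a small r becomes "significant" once the sample is large enough.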


What if a given r turns out to be significantly different from zero (that is,

what if a given r permits one to reject the null hypothesis that p = 0)? In

terms of the research question posed, the finding of a significant r is only a

preliminary. The next question is whether the correlation is - enough

to be of practical or theoretical use [20.3], With small / large samples, an

unusefully small r may prove to be "statistically significant" [20.3]. Although re¬

searchers frequently draw their inferences from a value of r by testing the null

that p = 0, constructing an interval estimate of the population value offers a

number of advantages. To do so requires transforming values of r to quantities

called z' ("zee prime").

From r to z′

The transformation to z′ is accomplished via a formula invented by R. A.
Fisher, and it yields a quantity with two desirable properties:

1. The sampling distribution of z′ is approximately ____________ irrespective
of the value of ___ [20.4].

2. The standard error of z′, unlike the standard error of r, is essentially
independent of the value of ___ [20.4].

Because z′ has these properties, we can transform a value of r to z′ and then
apply inferential techniques to the z′ that use the convenient normal-curve model.
But the outcomes of the inferential techniques will still apply to the correlation
coefficients of interest.

In using the z′ transformation, reasonable results will obtain unless sample
size (n) is very large / small or ρ is very low / high [20.4, last paragraph].
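The transformation itself is easy to compute: Fisher's z′ = ½ ln[(1 + r)/(1 − r)], which is the inverse hyperbolic tangent of r, so Python supplies it directly. The r of .80 below is an invented example; the hyperbolic tangent carries a z′ back to r, the conversion the text's Table F performs by lookup.

```python
from math import atanh, tanh

r = 0.80
z_prime = atanh(r)           # Fisher's z′ = ½·ln((1 + r)/(1 − r))
print(round(z_prime, 3))     # 1.099
print(round(tanh(z_prime), 2))  # back to 0.8
```

Note that z′ stretches the scale near ±1: r's of .10 and .20 are about .10 apart in z′, while r's of .80 and .90 are about .37 apart, which is what makes the sampling distribution of z′ nearly normal even when ρ is far from zero.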

Estimating ρ

Rather than testing the hypothesis that ρ is some specific value (usually
zero), it may be desirable to ask within what limits the population coefficient
may be found. A confidence interval may be constructed by translating the sample
r to z′, and following the rule:

    z′ ± ________________________    [Formula 20.3]

where z′ is the value of z′ corresponding to the sample ___; z_p is the magnitude
of z for which the probability is ___ of obtaining a value so deviant or
more so (in either direction); p is (1 − C), where C is the ____________
____________; and σ_z′ is the standard ________ of Fisher's z′ [20.5].

The formula for the standard ________ of z′ is:

    σ_z′ = ________________________    [Formula 20.4]

Application of the above rule will result in a lower limit and an upper limit,
both expressed in terms of z′. These must then be converted to ___ by means of
Table F in Appendix F [20.5].
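Formulas 20.3 and 20.4 are left blank above for you to fill in; the sketch below assumes their standard forms, z′ ± z_p·σ_z′ with σ_z′ = 1/√(n − 3), and uses the hyperbolic tangent in place of Table F. The values r = .60 and n = 28 are invented for illustration.

```python
from math import atanh, tanh, sqrt
from statistics import NormalDist

def rho_interval(r, n, confidence=0.95):
    """CI for ρ: transform r to z′, step out z_p·σ_z′ on each side with
    σ_z′ = 1/√(n − 3), then translate both limits back to the r scale."""
    z_prime = atanh(r)
    z_p = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = 1 / sqrt(n - 3)
    return tanh(z_prime - z_p * se), tanh(z_prime + z_p * se)

lo, hi = rho_interval(r=0.60, n=28)
print(round(lo, 2), round(hi, 2))
```

Because the transformation back to r compresses the upper side more than the lower, the resulting interval is not symmetrical about the sample r, which is exactly the skewness of the sampling distribution of r showing through.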

Testing the Hypothesis of No Difference between ρ₁ and ρ₂

Given a sample from each of two bivariate populations, 1 and 2, a satisfactory
test of the hypothesis that ρ₁ = ρ₂ is available only if the samples (and populations)
are independent. That is, there must be no logical way to pair either set
of scores in one sample with either set of scores in the other sample. The hypothesis
is tested with the same procedure used for testing a hypothesis about two
means, except that the test statistic is a z score computed as follows:

    z = ________________________    [Formula 20.5]

The denominator here is the standard error of the difference between two values of
z′, and its formula is:

    σ_z′₁−z′₂ = ________________________    [Formula 20.6]
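Formulas 20.5 and 20.6 are likewise left blank for you to fill in; this sketch assumes their standard forms, z = (z′₁ − z′₂)/σ_z′₁−z′₂ with σ_z′₁−z′₂ = √(1/(n₁ − 3) + 1/(n₂ − 3)). The two r's and the sample sizes are invented for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_r(r1, n1, r2, n2):
    """z test of H0: ρ1 = ρ2 for two INDEPENDENT samples, via Fisher's z′."""
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))      # assumed standard form of σ_z′1−z′2
    z = (atanh(r1) - atanh(r2)) / se
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

z, p = compare_r(r1=0.70, n1=53, r2=0.40, n2=53)
print(round(z, 2), round(p, 3))
```

Because the standard error of z′ depends only on n, two r's must be compared on the z′ scale, never by subtracting the raw r's.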

The Assumption of Bivariate Normality

No assumption about the ________ of the bivariate distribution is required
when the correlation coefficient is used purely as a descriptive index [20.8].
However, all of the procedures for inference about coefficients described in this
chapter are based on the assumption that the population of pairs of scores forms
a ________ bivariate distribution [20.8]. This implies that X is ____________
distributed, Y is ____________ distributed, and that the relation between X and
Y is ____________ [20.8]. If we are not dealing with a normal bivariate population,
then these procedures for inference must be considered to yield only approximate
results.

SYMBOLISM DRILL

No.  Symbol            Pronunciation   Meaning

 1   n                                 Number of scores in a ________
 2   N                                 Number of scores in a ________
 5   X̄                                 Σ___/___; the ______ of a ________
 6   μ                                 Σ___/___; the ______ of a ________
 9   x                                 ___ − ___ or ___ − ___; ________ score
13   S²                                Σ___²/___; ________ of a ________
12   σ                                 √(Σ___²/___); ________ of a ________
11   σ²                                Σ___²/___; ________ of a ________
14   S                                 √(Σ___²/___); ________ of a ________
23   s                                 Estimate of ____; ____________
32   s²                                Estimate of ____; ____________
22   σ_X̄                               ________ ________ of the ________; ___/√___
24   s_X̄                               ________ of ________; ___/√___
36   σ_X̄−Ȳ                             ____________________
37   s_X̄−Ȳ                             ____________________
21   μ_X̄                               ______ of ________ ____________ of ________
27   μ_hyp                             Value of ___ stated in ________ ____________
34   μ_X̄−Ȳ                             ____________________
35   (μ_X − μ_Y)_hyp                   Value of ________ stated in ________ ____________
25   H₀                                ____________________
26   H_A                               ____________________
15   z                                 (value − mean)/(standard deviation)
28   "z"                               ____________________
29   z_crit                            ________ value of ___
30   α                                 ________ of ____________
31   β                                 ________ of ____________
38   D                                 ____________ score
16   r                                 ____________________
17   ρ                                 ____________________
42   σ_r                               ____________________
43   z′                                Fisher's transformation of ___
44   σ_z′₁−z′₂                         ________ ________ of the ____________ between two independent z′'s
39   C                                 ____________________
40   t                                 Conventional name for ________
CHAPTER 21

SOME ASPECTS OF EXPERIMENTAL DESIGN

21.1  Introduction
21.2  Type I Error and Type II Error
21.3  The Power of a Test
21.4  Factors Affecting Type II Error: Discrepancy between the True Mean and the Hypothesized Mean
21.5  Factors Affecting Type II Error: Sample Size
21.6  Factors Affecting Type II Error: (1) Variability of the Measure; (2) Dependent Samples
21.7  Factors Affecting Type II Error: Choice of Level of Significance (α)
21.8  Factors Affecting Type II Error: One-Tailed versus Two-Tailed Tests
21.9  Summary of Factors Affecting Type II Error
21.10 Calculating the Probability of Committing a Type II Error
21.11 Estimating Sample Size for Tests of Hypotheses about Means
21.12 Some Implications of Table 21.1 and Table 21.2
21.13 The Experiment versus the In Situ Study
21.14 Hazards of the Dependent Samples Design
21.15 The Steps of an Investigation

PROBLEMS and EXERCISES

1 ___    2 ___
3 ___    4 ___
5 ___    6 ___
7 ___    8 ___
9 ___    10 ___
11 ___   12 ___

SUMMARY

A researcher faces not only statistical problems but also substantive ones,
and the two types of problem are often interrelated. The present chapter treats
some such interrelationships.

Errors in Hypothesis Testing and the Power of a Test

In testing a null hypothesis, a researcher may go wrong in either of two ways:
by committing a Type I error, or by committing a Type II error. The probability
of committing each is defined as follows:

    For a Type I error,  α = Pr(rejecting ___ | ___ is true / false );
    For a Type II error, β = Pr(accepting ___ | ___ is true / false ) [21.2].

Thus a Type II error is committed when a false null hypothesis is accepted. The
opposite occurs when a false null hypothesis is rejected. Since the probability
of the former is β, the probability of the latter is [21.3]:

    (1 − β) = Pr(rejecting H₀ | H₀ is true / false ) [21.3]

In other words, (1 − β) is the probability of claiming a significant difference
when a true difference really exists. The probability of doing so, (1 − β), is
called the ________ of the test [21.3].

There are a number of factors that affect β, and these are listed below.
Since β and power are complementary, it must be remembered that any condition
that decreases β increases / decreases the power of the test, and vice versa [21.3].

Factors Affecting Type II Error

1. The greater the discrepancy between μ_true and μ_hyp, the greater / less
the probability of falsely accepting the hypothesis [21.4]. This generalization
applies to a test of a hypothesis about the mean of a single population. For
hypotheses about the difference between the mean of a first population and the
mean of a second, the greater the discrepancy between the true difference and
the hypothesized difference (which is usually zero), the less the probability of
falsely accepting the hypothesis.

2. Other things being equal, the larger / smaller the size of the sample(s),
the lower the probability of committing a Type II error [21.5].

3. Increase in sample size reduces the risk of Type II error by reason of
its action in reducing the standard error of the mean. Since the standard error
of the mean is σ/√n, another way to make it smaller is to increase / reduce the
size of σ [21.6]. σ is the standard deviation of the set of measures, and it reflects
not only variation attributable to the factors of interest, but also variation
attributable to extraneous and irrelevant sources. Any source of extraneous
variation tends to increase / decrease σ over what it would be otherwise, so an
effective effort to eliminate such sources will tend to increase / decrease σ
and thus augment / reduce β [21.6]. In comparing means of two groups, the
independent / dependent sample design makes it possible to reduce the standard error
of the ____________ by controlling the influence of extraneous variables [21.6].

4. β is also related to the choice of α. In general, reducing the risk of
a Type I error increases / decreases the risk of committing a Type II error
[21.7]. The primary consideration in selecting α should be the logic of the experiment.
But unthinking conservatism in minimizing α will have an unnecessarily
adverse influence on ___ [21.7].

5. Other things being equal, the probability of committing a Type II error is
greater / less for a one-tailed test than for a two-tailed test [21.8].
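The interplay of these factors can be made concrete by computing the power of a two-tailed z test (the known-σ case). The function and all of the numbers below are invented illustrations, not entries from the text's Tables 21.1 and 21.2.

```python
from math import sqrt
from statistics import NormalDist

def power_two_tailed_z(mu_true, mu_hyp, sigma, n, alpha=0.05):
    """Power of a two-tailed z test of H0: μ = μ_hyp when the mean is really
    μ_true; β is the complement, β = 1 − power."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu_hyp) / (sigma / sqrt(n))  # discrepancy in SE units
    # Probability that "reject" happens in either tail, given the true mean:
    return (1 - nd.cdf(z_crit - shift)) + nd.cdf(-z_crit - shift)

for n in (25, 100):
    print(n, round(power_two_tailed_z(105, 100, 15, n), 2))
```

With these numbers, power rises from about .38 at n = 25 to about .92 at n = 100, illustrating factor 2; raising σ, shrinking the discrepancy, or tightening α would each push power back down, as factors 1, 3, and 4 describe.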

Estimating Sample Size for Testing Hypotheses about Means

In testing hypotheses about means, convenience suggests the desirability of
a large / small sample, but accuracy suggests a large / small one [21.11]. How
large a sample is really needed? To answer this question, we must first decide
what magnitude of discrepancy between the ____________ value and the ____________
value of the parameter is so great that, if one of this size or larger existed,
we would want to be reasonably certain of discovering it [21.11]. The decision
as to just how big a discrepancy between the parameter's hypothesized value and
its true value is important is fundamentally a statistical / substantive question
and not a statistical / substantive one [21.11]. If we can specify (1) this
discrepancy and (2) the risk (β) we are willing to take of overlooking a discrepancy
of that magnitude, then we can estimate the size of the sample or samples we will
need for our research.

Section 21.11 offers two tables for estimating sample size. Close examination
of the tables reveals some important points about the design of research:

1. If it is satisfactory to discover a discrepancy only when it is large,
more / fewer cases are required [21.12].

2. If it is important to discover a discrepancy as small as one-quarter of a
standard deviation of the variable measured, sample size must be large / small
[21.12].

3. If it is acceptable to increase the risk of a Type II error, larger /
smaller sample size is needed [21.12].

4. If α is set at .01 rather than .05, a larger / smaller sample will be
required to maintain the same level of protection for β [21.12].

5. If a one-tailed test is appropriate, a larger / smaller sample will be
required to maintain the same level of protection for β [21.12].

6. If the problem involves the difference between two means rather than a
hypothesis about a single mean, approximately ________ as many cases will be
required in each of the two samples to achieve the same level of protection
against committing a Type II error [21.12].

Control in Experimentation

In the classic model of the experiment, all variables are controlled except
the one subject to inquiry. The variable to be studied is manipulated, and the
effect on the variable under observation is examined. The variable subject to
manipulation is called the independent / dependent variable, and that under
observation is called the independent / dependent variable [21.13, first paragraph].

In the basic two-group experiment, control may be achieved in a number of
ways:

1. One fundamental technique is to hold the condition of a possible interfering
factor constant for every ____________ in the study [21.13]. But there
is an important price to be paid for seeking control in this manner: the tighter
the control developed by holding many conditions constant, the more limited the
____________ of the outcome [21.13].

2. The ____________-groups design (see Section 17.12) equates subjects in
the two groups on some characteristic, rather than holding the characteristic
constant for all subjects [21.13].

3. Randomization provides another most important source of control. Random
assignment of treatment achieves control over differences that subjects may bring
to the study but still limits / without limiting generalization in the way that
would be done by holding these variables constant [21.13]. Although random
assignment of treatment conditions to subjects is a powerful experimental tool in
controlling extraneous factors, it is no cure-all. It can take care of potentially
interfering subject variables (characteristics of the subjects that might influence
the dependent variable). But it cannot control certain other types of extraneous
influence, namely those factors that vary along with the treatment from
one condition to the other. Thus conducting the study as an experiment with random
assignment of treatment conditions can be a great help in interpreting the
meaning of the outcome of the statistical test, and it means / but it does not
mean that the answer to the substantive question posed in the beginning is
automatically provided by the statistical conclusion [21.13].

Limitations of In Situ Studies

Many independent variables of potential interest are not subject to manipulation
by the experimenter. Some are unmanipulable for ethical reasons. Other
variables are unmanipulable because they are ____________ characteristics of
the organism [21.13]. To study the effect of differences in such a variable,
we identify subpopulations possessing the desired differences and compare samples
from the subpopulations.



Studies in which the element of manipulation of the independent variable is absent can still / cannot be called experiments [21.13]. The text refers to them as in situ studies. The important difference between an experiment and an in situ study is that in the latter, a significant degree of control is lost. This loss of control makes it more / less difficult to interpret the outcome of such studies [21.13].

The loss of control arises because when individuals are selected according to differences that they possess in the variable we wish to investigate, they inevitably bring with them associated differences in other dimensions. If differences in these extraneous dimensions are related to the dependent variable, we may well find that the different "treatment" groups are significantly different with regard to the dependent variable, but the origins of these differences may be so entangled that it is extremely difficult or even hopeless to sort them out. In short, it is most difficult to develop statements of causal / correlational relationship in studies of this type [21.13].

But this is not to say that in situ studies are worthless.

Hazards of the Dependent-Samples Design

The dependent-samples design can be used to good effect when it is of the variety in which matched pairs of subjects are formed with treatment conditions randomly assigned to the two members of each pair. But trouble arises in the other three versions of this design:

1. When repeated measurements are made on the same subjects, it is possible that exposure to the first treatment condition will change the subject in some way that affects his or her performance under the treatment condition assigned second. An influence of this sort is called a(n) ____ effect [21.14]. When such an effect is present and it can be assumed that the influence of one treatment upon the other is the same as that of the other upon the one, the outcome of the experiment may be interpretable, if treatment condition has been assigned ____ with regard to the order of treatment [21.14].

However, the disturbing order effect will introduce an additional source of variation in each set of scores, according to the magnitude of its influence. This tends to increase / decrease the standard error and consequently to increase / decrease the power of the test [21.14].

Some Aspects of Experimental Design 183

Furthermore, if the influence of one treatment upon the other is not the same as that of the other upon the one, ____ will be introduced as well as unwanted variation [21.14]. The outcome then becomes difficult or impossible to interpret.
2. If the design utilizes repeated observations on the same subject but assignment of the treatment condition is not random with regard to order, we are in less / even graver difficulty [21.14]. Any order effect will bias the comparison. Studies of this type may also be subject to another source of bias: the regression effect. If subjects are selected because of their extreme scores on some measure, we expect remeasurement on the same variable to yield scores closer to / farther from the mean [21.14].

3. A third troublesome case is that in which the two groups consist of different subjects matched on an extraneous but related variable, and assignment of treatment condition to members of a matched pair is nonrandom. These conditions are likely to arise when studying the effect of a nonmanipulable variable in intact groups. Such investigations fall in the category of in situ studies and are susceptible to all of the usual difficulties of such studies plus several additional hazards:
(a) Matching may increase / reduce, and therefore obscure, the influence of other important variables associated with the variable on which matching took place [21.14].

(b) When the two intact populations differ widely on the variable on which matching is done, it may be possible to form matched pairs only by using subjects who are unusual relative to others of their own kind. Under these conditions, any conclusion reached will be generalizable only to peculiarly constituted subgroups of the two target ____ [21.14].

(c) When subjects are selected for pairing because of their extreme scores on the matching variable, a regression effect may be expected. This is likely to occur in studying two intact populations that differ widely / slightly on the matching variable [21.14].
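The regression effect mentioned in hazard 2 and in point (c) can be demonstrated by simulation: observed scores are true scores plus measurement error, subjects are selected for extreme first scores, and their second scores fall back toward the mean. This is a modern Python sketch, not part of the text; all numbers are invented.

```python
import random
from statistics import mean

random.seed(7)
true_scores = [random.gauss(100, 15) for _ in range(5000)]
test1 = [t + random.gauss(0, 10) for t in true_scores]  # first measurement
test2 = [t + random.gauss(0, 10) for t in true_scores]  # independent remeasurement

# Select subjects with extremely high scores on the first measurement.
chosen = [i for i, score in enumerate(test1) if score > 130]
mean_first = mean(test1[i] for i in chosen)
mean_second = mean(test2[i] for i in chosen)

# The selected group's mean on remeasurement regresses toward 100,
# even though nothing about the subjects has changed.
```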



SYMBOLISM DRILL

 No.  Symbol  Pronunciation  Meaning

  1   ____    ____________   Number of scores in a sample
  2   ____    ____________   Number of scores in a population
  3   ____    ____________   A raw score, or the set of raw scores
  4   ____    ____________   Result of summing quantities of some kind
  5   ____    ____________   The mean of a sample
  6   ____    ____________   The mean of a population
  9   ____    ____________   Deviation score
 11   ____    ____________   Variance of a population
 12   ____    ____________   Standard deviation of a population
 13   ____    ____________   Variance of a sample
 14   ____    ____________   Standard deviation of a sample
 32   ____    ____________   Estimate of σ²
 23   ____    ____________   Estimate of σ
 22   ____    ____________   Standard error of the mean
 24   ____    ____________   Estimate of σX̄
 36   ____    ____________   Standard error of the difference between two means
 37   ____    ____________   Estimate of the standard error of the difference between two means
 25   ____    ____________   Null hypothesis
 26   ____    ____________   Alternative hypothesis
 21   ____    ____________   Mean of sampling distribution of means
 27   ____    ____________   Value of μ stated in null hypothesis
 33   ____    ____________   True value of μ
 34   ____    ____________   Mean of sampling distribution of differences between two means
 35   ____    ____________   Value of μX - μY stated in null hypothesis
 15   ____    ____________   (value - mean)/(standard deviation)
 28   ____    ____________   Approximate z score with denominator estimated
 29   ____    ____________   Critical value of z
 30   ____    ____________   Level of significance; risk of Type I error
 31   ____    ____________   Risk of Type II error
 38   ____    ____________   X - Y; difference score
 16   ____    ____________   Pearson correlation coefficient for a sample
 17   ____    ____________   Pearson correlation coefficient for a pop'n
 42   ____    ____________   Standard error of r
 43   ____    ____________   Fisher's transformation of r
 44   ____    ____________   Standard error of z'
 45   ____    ____________   Standard error of the difference between two independent z''s
 39   ____    ____________   Confidence coefficient
 40   ____    ____________   Conventional name for "z"
 41   ____    ____________   Degrees of freedom
ELEMENTARY ANALYSIS OF VARIANCE

____ 22.1  Introduction
____ 22.2  One-Way Analysis of Variance: The Hypothesis
____ 22.3  The Effect of Differential Treatment on Subgroup Means
____ 22.4  Measures of Variation: Three Sources
____ 22.5  Within-Groups and Among-Groups Variance Estimates
____ 22.6  Partition of Sums of Squares and Degrees of Freedom
____ 22.7  Raw Score Formulas for Analysis of Variance
____ 22.8  The F Distribution
____ 22.9  Comparing sW² and sA² according to the F Test
____ 22.10 Review of Assumptions
____ 22.11 Two-Way Analysis of Variance
____ 22.12 The Problem of Unequal Numbers of Scores
____ 22.13 A Problem in Two-Way Analysis of Variance
____ 22.14 Partition of the Sum of Squares for Two-Way ANOVA
____ 22.15 Degrees of Freedom in Two-Way Analysis of Variance
____ 22.16 Completing the Analysis
____ 22.17 Studying the Outcome of Two-Way ANOVA
____ 22.18 Interaction and the Interpretation of Main Effects
____ 22.19 Alternatives to the General F Test for a Treatment Effect
____ 22.20 Constructing a Comparison
____ 22.21 Standard Error of a Comparison
____ 22.22 Evaluating a Planned Comparison
____ 22.23 Constructing Independent Comparisons
____ 22.24 Evaluating a Post Hoc Comparison
PROBLEMS and EXERCISES

 1 ____________________
 2 ____________________
 3 ____________________
 4 ____________________
 5 ____________________
 6 ____________________
 7 ____________________
 8 ____________________
 9 ____________________
10 ____________________
11 ____________________
12 ____________________
13 ____________________
14 ____________________
15 ____________________
16 ____________________
17 ____________________

SUMMARY

Analysis of variance is a technique of inference applicable to many studies in which quantitative data are collected in two or more conditions. The text describes two varieties, "one-way" and "two-way."

One-Way Analysis of Variance

Terminology

In analysis of variance, an independent variable is known as a ____, and the varied conditions of an independent variable are known as ____ of that ____ [22.11, fourth sentence]. One-way analysis of variance is appropriate for a study in which there is just one treatment. The number of conditions (levels) of that treatment is symbolized k, and k may be 2 or some larger number. Subjects are assigned independently and ____ to the k treatment conditions [22.2, second paragraph]. The conditions are identified as D, E, F, and so on, and the individuals subjected to a given condition, in reality (those in the sample actually studied) or hypothetically (those in the parent population), are called a subgroup.

The Hypotheses

If the different treatment applied to the subgroups has no differential effect on the variable under observation, then we may expect these subgroup population means to be ____ [22.2]. To inquire as to whether variation in treatment made a difference, we therefore test the null hypothesis:

H0: ____________________

against the alternative that they are ____ in some way [22.2]. In testing the hypothesis of no difference between two means, a distinction was made between directional and nondirectional null / alternative hypotheses [22.2]. Such a distinction still / no longer makes sense when the number of subgroups exceeds two [22.2]. In the multigroup analysis of variance, H0 may be false in only one way / in any one of a number of ways [22.2].

In spite of this difference, one-way analysis of variance is very closely related to the t test of the difference between two dependent / independent means [22.2, second paragraph]. In fact, the outcome of analysis of variance applied to the special case of two subgroups is identical with that of the t test. Like the t test, the analysis of variance is suited to samples only of large size / of any size [22.2].

The General Formula for an Unbiased Estimate of a Population Variance

True to its name, analysis of variance is concerned with the ____ as a measure of variability [22.5].* An unbiased estimate of a population variance is made by calculating the sum of the ____ of the deviation of each score from the sample / population mean, and dividing by the number of degrees of ____ associated with that sum of squares [22.5], which will be one less than the number of scores. This general relationship can be summarized by the equation:

    s² = ________

where the letters ____ stand for "sum of squares (of deviation scores)" [22.5].
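To see the relationship s² = SS/df in numbers, here is a small Python check, not part of the text; the scores are invented, and Python's statistics.variance applies the same n - 1 divisor.

```python
from statistics import mean, variance

scores = [4, 7, 5, 9, 10]  # invented sample data

x_bar = mean(scores)
SS = sum((x - x_bar) ** 2 for x in scores)  # sum of squares of deviation scores
df = len(scores) - 1                        # degrees of freedom: one less than n
s2 = SS / df                                # unbiased estimate of the population variance

# statistics.variance uses the same n - 1 divisor.
assert s2 == variance(scores)
```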

Homogeneity of Variance and the Within-Groups Estimate of σ²

The analysis makes the assumption that the several subgroup populations all have the same variance, which is symbolized σ². An estimate of σ² could be made from any one of the subgroup samples by taking the sum of ____ of the deviations of scores in that group from the subgroup / grand mean and dividing by the appropriate number of degrees of ____ [22.5]. However, on the assumption that the subgroup population variances are the same for all subgroups (the assumption of ____ of variance), a better estimate may be made by combining information from these several subgroup samples [22.5]. Such an estimate may be made by pooling the sums of squares of deviation scores from the several subgroups and dividing by the sum of the degrees of freedom characterizing each of the subgroups. This estimate is called the within-groups / among-groups variance estimate; we shall symbolize it by ____ [22.5]. The formula is:

    sW² = [Σ(XD - X̄D)² + Σ(XE - X̄E)² + ...] / [(nD - 1) + (nE - 1) + ...]   [Formula 22.1]

where XD is a score in subgroup sample D, etc.; X̄D is the ____ of subgroup sample D, etc.; and nD is the ____ of elements in subgroup D, etc. [22.5]. The numerator of this expression is called the within-groups / among-groups sum of ____ (SSW) and the denominator the within-groups / among-groups degrees of ____ (dfW) [22.5].

In practice, sW² is more easily computed by Formula 22.5 on p. 399.

*If the difficult material in this chapter has not totally destroyed your sense of humor, you should have recognized this question as analogous to Groucho Marx's favorite on You Bet Your Life: "Who is buried in Grant's tomb?" But the proper answer to Groucho's question is not the obvious one, for the tomb actually holds both Grant and Grant's wife. Similarly, the quantity that is analyzed (that is, decomposed) in the analysis of variance is not exactly a variance; it is the sum of the squared deviation scores that contributes to an estimate of the population variance derived from the total set of scores on hand, as the text explains in Section 22.6.
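The pooling described by Formula 22.1 can be sketched in Python. This is an illustration of mine, not the text's; the subgroup scores are invented, and unequal ns are fine.

```python
from statistics import mean

# Invented scores for three subgroup samples D, E, and F.
groups = [[3, 5, 4], [6, 8, 7, 7], [5, 6, 7]]

SS_W = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)  # pooled sum of squares
df_W = sum(len(g) - 1 for g in groups)                          # pooled degrees of freedom
s2_W = SS_W / df_W  # within-groups variance estimate (Formula 22.1)
```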

The Among-Groups Estimate of σ²

If the null hypothesis is true (that is, if there is no treatment effect), it is possible to derive another estimate of σ², independent of the within-groups estimate, solely from the means of the k subgroup samples.

First, these means are treated as though they were raw scores, and the variance of the population from which they came (the population of subgroup sample means) is estimated in the usual fashion: find the mean of all the "scores" (which here is the mean of the sample means, and this will be the same as the grand mean of all raw scores); for each "score," find its deviation from this mean (here, find the deviation between each sample mean and the grand mean); square each deviation; sum the squares of the deviations; and divide the sum by one less than the number of "scores" (which is k - 1 here). The symbols describing these operations are:

    Σ(X̄ - X̿)² / (k - 1)

The k over the summation sign indicates that there are k quantities to be summed; each has the form (X̄ - X̿)².

Now the variance estimated in this way is the variance of the population of subgroup sample means, not the variance that we assume to characterize each of the subgroup populations of raw scores. But from the former we can estimate the latter, using the following reasoning:

An estimate of the standard error of a mean, sX̄, is computed from the formula sX̄ = s/√n. Squaring both sides of this equation we have: ________ [22.5, paragraph 4]. Solving this equation for s², it reads:

    s² = ________ [Section 22.5]

But sX̄² is the quantity that we just computed; that is, it is the estimate of the variance of the population of subgroup sample means that we derived from the means themselves. We computed it directly here, whereas earlier in this course we always computed it by following the formula s²/n. So we can take our value for sX̄² and multiply it by n to get an estimate of the variance that we assume to be common to each population of raw scores.
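The key relation in this reasoning, that the variance of sample means is the variance of the raw scores divided by n, can be checked by simulation. This is an illustrative sketch of mine, not the text's; the population, sample size, and number of replications are arbitrary choices.

```python
import random
from statistics import mean, pvariance

random.seed(1)
population = [random.gauss(50, 10) for _ in range(100_000)]
sigma2 = pvariance(population)  # true variance of this population, near 100

n = 25
sample_means = [mean(random.sample(population, n)) for _ in range(4000)]
var_of_means = pvariance(sample_means)

# The variance of the sample means comes out close to sigma2 / n,
# so multiplying it by n recovers an estimate of sigma2.
recovered = n * var_of_means
```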

This estimate is called the within-groups / among-groups estimate and is symbolized by ____ [22.5, paragraph 4]. When the subgroup samples are of equal sizes, the formula for this estimate, in deviation score form, is:

    sA² = SSA/dfA = nΣ(X̄ - X̿)² / (k - 1)   [Formula 22.2]

where ____ is the mean of a subgroup sample; ____ is the mean of the combined distribution of scores (____ mean), k is the number of ____, and n is the number of scores in each subgroup sample [22.5]. The numerator of this formula is called the within-groups / among-groups sum of ____ (SSA), and the denominator is called the within-groups / among-groups degrees of ____ (dfA) [22.5].

When the subgroup samples are of unequal sizes, a slightly different formula is required:

    sA² = Σni(X̄i - X̿)² / (k - 1)   [Formula 22.3]

where ____ is the number of scores in the ith subgroup sample and ____ is the mean of the ith subgroup sample [22.5].

In practice, SSA is more easily computed by Formula 22.6 on p. 399.
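A numerical sketch of the among-groups estimate for the equal-n case of Formula 22.2, with invented subgroup samples; the Python is mine, not the text's.

```python
from statistics import mean

# Invented subgroup samples of equal size n = 4.
groups = [[3, 5, 4, 6], [6, 8, 7, 7], [5, 6, 7, 6]]
n = len(groups[0])
k = len(groups)

grand_mean = mean(x for g in groups for x in g)
SS_A = n * sum((mean(g) - grand_mean) ** 2 for g in groups)  # among-groups sum of squares
df_A = k - 1                                                 # among-groups degrees of freedom
s2_A = SS_A / df_A  # among-groups variance estimate (Formula 22.2)
```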

If there is no treatment effect, subgroup sample means will tend to cluster about ____ as predicted by the standard error of the mean, and sA² will be an unbiased estimate of inherent variation, σ² [22.5, p. 396]. It will, therefore, estimate the same quantity as that estimated by ____ [22.5, p. 396]. On the other hand, if there is a treatment effect, the sum of squares of the deviations of X̄ about X̿ will tend to be larger / smaller, and sA² will tend to be larger / smaller than sW² [22.5, p. 396].


Comparing sA² and sW²

To compare the two estimates of σ², sA² and sW², we form them into a quantity called an F ratio: F = sA²/sW². If the null hypothesis is true, which means that there is no treatment effect, the top and the bottom of this ratio will have about the same value, so the value of F will be about one. But if the null is false and there is a treatment effect, sA² will tend to be larger than sW², as noted in the paragraph just above, and now the value of F will tend to be greater than one.

But even if the null hypothesis were true, it would be possible for F to be greater than one, even considerably greater, just because of sampling variation. The best we can do is to determine whether a given value of F is likely or unlikely to occur should the null hypothesis be true.

We thus need to know the sampling distribution of F when the null is true. This depends on the number of degrees of freedom associated with sA² and on the number of degrees of freedom associated with sW². Table H in the back of the text shows selected values from the various members of the family of F distributions. Hypothetically, the values of F that make up a sampling distribution could be generated by repeatedly replicating a given experiment: from the same populations, draw samples, each of whatever size was originally used, and compute sA², sW², and their ratio, F, for each replication. The null hypothesis must remain true throughout the replications.

To test the hypothesis that μD = μE = μF, and so on for all k subgroup populations, we must compare the calculated value of F with the values of F that would occur through random sampling if the hypothesis were true / false [22.9]. Now if sA² is always placed in the numerator / denominator of F, as is customary, the hypothesis of equality of subgroup population means will be rejected only if the calculated value of F is larger / smaller than expected [22.9]. Consequently, the region of rejection is placed entirely in the upper / lower tail of the F distribution [22.9].
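Putting the two estimates together, here is a complete one-way analysis on invented data; the Python is mine, not the text's. The critical value 4.26 is the approximate tabled 5% point of F for 2 and 9 degrees of freedom (consult Table H for exact values), and the sketch also checks the partition SStotal = SSA + SSW described in Section 22.6.

```python
from statistics import mean

groups = [[3, 5, 4, 6], [6, 8, 7, 7], [5, 6, 7, 6]]  # invented data: k = 3, n = 4 each
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = mean(x for g in groups for x in g)

SS_W = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
df_W = n_total - k
SS_A = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
df_A = k - 1

F = (SS_A / df_A) / (SS_W / df_W)

# Partition of the total sum of squares (Section 22.6): SS_total = SS_A + SS_W.
SS_total = sum((x - grand_mean) ** 2 for g in groups for x in g)

# Upper tail only: reject H0 when F exceeds the tabled critical value,
# about 4.26 for df = 2 and 9 at the .05 level.
reject = F > 4.26
```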

Assumptions of the Analysis of Variance

For the procedure presented here to be entirely correct, several assumptions must be satisfied:

1. The subgroup populations are ____ distributed [22.10].

2. Samples are drawn ____ [22.10].

3. Selection of elements comprising any subgroup sample is ____ of selection of elements of any other subgroup sample [22.10].

4. The ____ of the several subgroup populations are the same for all subgroups (homogeneity of ____) [22.10].

As with the t test for independent means, moderate departure from conditions specified in the first and fourth requirements will not unduly disturb the outcome of the test. Resistance to such disturbance is enhanced when sample size rises / falls [22.10].

If there is a choice, it is desirable to select samples of unequal / equal sizes for the subgroups [22.10].

[The SUMMARY continues after the following section.]

SPECIAL HELP with ONE-WAY ANALYSIS of VARIANCE

If you're now lost in the thicket of details, it may help to walk out onto a hill to look at the big picture. You can then go down into the details again when you see where they fit in.

Overview of the One-Way Analysis of Variance

There are k conditions (k = 2 or more), called D, E, F, and so on, each representing one level of a certain independent variable, which is called a treatment. (The levels might be different doses of a drug or different methods of instruction.) In condition D we have scores on some dependent variable for a sample of nD subjects, and their mean is X̄D; in condition E we have scores on the dependent variable for a sample of nE subjects, and their mean is X̄E; and so on.

Each sample is drawn at random from its parent population, which is the hypothetical set of scores for all individuals who could have been subjected to the given condition. The samples are independent of one another, in that each subject is observed in only one condition, and there is no logical way to pair the scores in any condition with the scores in any other condition. The n's may thus be unequal.

If the treatment has influenced the dependent variable, at least one population would have a mean different from that of the other populations, so our substantive question turns into a statistical one: Is it plausible that all populations have the same mean?

1. H0: μD = μE = μF, and so on. As usual, H0 is a statement about the population parameters.

HA: H0 is false. The null can be false in more than one way, depending on which population means differ, so the distinction between one-tailed and two-tailed tests no longer makes sense.

2. Choose a level of significance, α, to separate the values of the test statistic we will regard as unlikely from those we will regard as likely.



3. Make two special assumptions:

(a) σD² = σE² = σF², and so on. The number that is assumed here to be the variance common to the populations is called σ², with no subscript; σ² measures the variation inherent in each population of scores.

(b) All populations are normally distributed.

If these assumptions are only moderately violated, the outcome of the test of the null hypothesis will not be seriously wrong.

4. Look at the samples on hand, and compute a number called F. This is a statistic characterizing the samples. When H0 is true, F tends to be about 1.0, but when H0 is false, F tends to be greater than 1.0.

5. Assume that H0 is true and ask whether it would then be likely or unlikely for an F value of the obtained size (or larger) to occur. "Unlikely" means the probability is less than the α level; "likely" means the probability is greater than α. Determining whether the obtained F is likely or unlikely when the null is true requires consulting the sampling distribution of F when the null is true. This distribution could be generated, hypothetically, by replicating our study an indefinitely large number of times when the null is true. nD, nE, and so on must remain the same. A value of F is computed for each replication; it will vary from replication to replication but will be about 1.0 most of the time.

6a. If the obtained F value is a likely one (that is, if its probability when the null is true exceeds α), accept the null hypothesis, in the sense of failing to reject it. It must be retained as a possibility, though it is not necessarily likely to be true.

6b. If the obtained F value is an unlikely one (that is, if its probability when the null is true is less than α), reject the null hypothesis in favor of the alternative. The F value is said to be significant in this case. It is then appropriate to snoop around in the data to locate the differences among the population means. (The techniques for snooping are fancy tests such as the comparisons treated in Sections 22.19 through 22.24.)

Note that a significant F value provides only some evidence that the treatment influenced the dependent variable; the statistics alone cannot tell us whether this is so.
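The hypothetical replication scheme of Step 5 can be carried out literally on a computer, a luxury the text does not assume. The simulation below is my own sketch: all three samples are drawn from one normal population, so H0 is true by construction; the population, sample sizes, and seed are arbitrary, and 3.35 is the approximate tabled 5% point of F for 2 and 27 degrees of freedom.

```python
import random
from statistics import mean

def one_way_F(groups):
    """F ratio: among-groups estimate over within-groups estimate."""
    k = len(groups)
    grand = mean(x for g in groups for x in g)
    ss_a = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_w = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    return (ss_a / (k - 1)) / (ss_w / sum(len(g) - 1 for g in groups))

random.seed(3)
# Replicate the study 2000 times with H0 true: all k = 3 populations identical.
fs = [one_way_F([[random.gauss(0, 1) for _ in range(10)] for _ in range(3)])
      for _ in range(2000)]

avg_f = mean(fs)                                   # hovers near 1 when H0 is true
frac_beyond = sum(f > 3.35 for f in fs) / len(fs)  # about .05 land beyond the 5% point
```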

Details of the F Statistic

If you now see the big picture clearly, you're ready for a review of the details of the F statistic mentioned in Step 4 above:

4a. An F value is a ratio, the result of dividing one number by another. The top and the bottom of the ratio are both estimates of σ², the inherent variance assumed in Step 3a above. The bottom estimate is called the within-groups estimate, sW²; the top estimate is called the among-groups estimate, sA².

    F = (among-groups estimate of σ²) / (within-groups estimate of σ²) = sA²/sW²

4b. The estimate on the bottom of the ratio takes each of the k samples in turn and looks within it for information about the variability of the scores in the population from which it was drawn. Since each sample's parent population has the same variance, according to the assumption in Step 3a above, an especially good estimate of this variance can be made by pooling the information from within sample D, the information from within sample E, and so on. This pooling is done as a fancy kind of averaging: information about the variability of population D, which is derived from the data within sample D, is averaged with information about the variability of population E derived from within sample E, and so on. The number thus produced is not influenced by the dispersion among the samples; in particular, it is not influenced by the variation among the X̄s, because it never compares the X̄s.

    F = (among-groups estimate of σ²) / (fancy average of information about σ² taken from within each of the samples)

4c. The estimate on the top of the F ratio completely ignores the information about σ² that is available within each of the samples; it does not consider the values of the individual scores (the Xs). Rather it looks only at the variability among the samples, taking the sample means as measures of the locations of the samples.

From the variability among the sample means, it figures an estimate of the variance of the population of such means. This variance is not σ²; rather it is something analogous to σX̄, the standard error of a sampling distribution composed of sample means. In fact, the estimate of the variance of the population of sample means can be properly symbolized sX̄².

Now, when the null hypothesis is true, sX̄² reflects only the variation inherent in each of the populations, σ². Think this through. If the null is true, all populations have the same mean. We are assuming, moreover, that each has the same inherent variation among its scores, variation measured by the number σ², and that each is normally distributed. So if the null is true, we are, in effect, drawing each sample from the same population. Why, then, should the various samples turn out to have different means? Only because of the variation among the scores in the population. This fact, the fact that when the null is true sX̄² reflects only the variation inherent in a population of scores, means that we can fuss with sX̄² a bit and turn it into an estimate of the inherent variation, an estimate of σ².

But we must remember that the estimate will be predicated on the assumption that the null hypothesis is true. If the null is false, the several populations don't all have the same mean; at least one is different from the others. A sample mean will tend to fall where its parent population mean falls, of course, so when the null is false the spread among the sample means will reflect not only the inherent variation in a population of scores, but also the spread among the population means. The estimate of σ² derived from the sample means will then probably be too large. It will tend to exceed the within-groups estimate, and the F ratio will yield an F greater than 1.0.
Details of the Within-Groups Estimate of σ²

Back down in the thicket, here are some details of Section 4b above. (If you need orientation, go back to the big picture, to the Overview, and reread Section 4b first.) The within-groups estimate of σ² is a fancy average of quantities of the familiar kind Σx²/(n - 1).
The variance of a complete set of scores is the mean squared deviation, remember. (Check the last paragraph on p. 54 of this workbook.) A mean is the sum of some values divided by the number of values, and Σx²/N (or n) is thus the mean of a collection of squared deviation scores, x²s.

To estimate the variance of a population from a sample drawn from that population, though, we divide the sum of the squared deviation scores not by n but by n - 1. (See Section 19.4 on this matter.) The estimate is symbolized s² (as distinct from S², the variance of the sample itself) and it estimates σ² (the true variance of the population).


Now Chapter 22 gives the formula for s² in a novel form and introduces some special terms to describe its parts. First, x is written as (X - X̄). For sample D this is specifically (XD - X̄D), for sample E (XE - X̄E), and so on. (To distinguish the means of the several samples, each carries a subscript, just in case they're not all equal.) Second, the term sum of squares, abbreviated SS, is used for a quantity of the kind Σ(X - X̄)². The squares whose sum this expression is talking about are really squares of deviations, of course: the squares of the deviations between each raw score X and its mean, X̄. Third, the n - 1 on the bottom of the formula Σ(X - X̄)²/(n - 1) is described as the degrees of freedom for this estimate of the variance of the population. Degrees of freedom is abbreviated df, and the matter is explained in Sections 19.4 and 19.8 (starting with the last sentence on p. 336 in the latter section).
Finally, the averaging: From sample D we estimate σD², its parent population's variance, by computing Σ(XD - X̄D)²/(nD - 1); from sample E we estimate σE² by computing Σ(XE - X̄E)²/(nE - 1); and so on for each sample. Each of the estimates uses only information from within a single sample, note again; only the raw scores within the sample, their mean, and their number go into the estimate. The estimates are then averaged in a fancy way: (a) The tops of the estimates, the quantities of the kind Σ(X - X̄)², are added together. (b) The bottoms of the estimates, the quantities of the kind (n - 1), are added together. (c) The sum of the tops is divided by the sum of the bottoms. These operations are summarized in Formula 22.1 on p. 395 of the text, which also reveals that the sum of the tops is called the within-groups sum of squares, SSW, while the sum of the bottoms is called the within-groups degrees of freedom, dfW.

Though the text doesn't say so, SSW = SSD + SSE + SSF, and so on, while dfW = dfD + dfE + dfF, and so on.

For an example of these calculations, consult the example on p. 397 of the text. In the arithmetic for SSW, pay attention to the square brackets, the "[" and the "]", which enclose the operations symbolized Σ(X - X̄)² for each sample.

Details of the Among-Groups Estimate of σ²

If you're disoriented again, go back to the big picture on p. 194 and reread through Section 4c on p. 196, but skip Section 4b this time. Following are some details of 4c.

As noted in the next-to-the-last paragraph on the previous page, to estimate the variance of a population, we compute from a sample a quantity called s² whose formula is Σ(X - X̄)²/(n - 1). What if the elements of the population are not raw scores (not Xs), but sample means, X̄s? We can still use the formula. Replace X with X̄ and call the mean of the X̄s X̿ ("eks double-bar"). Replace n, the number of X̄s, with k, the number of sample means. The formula becomes Σ(X̄ - X̿)²/(k - 1), which appears in the middle of p. 395. This is the formula for what was called sX̄² in Section 4c above.
X

If the null hypothesis in the analysis of variance is true, sX̄² reflects only the variation inherent in each population of scores, only σ², as we saw in Section 4c. How, then, could we estimate σ² from sX̄²? Remember the formula for the standard error of the mean: σX̄ = σ/√n. When we have not σ but an estimate of it, s, the formula becomes: sX̄ = s/√n. Squaring both sides turns the equation into one concerned with variances rather than standard deviations or errors. Squaring tells us that an estimate of the variance of a population of X̄s can be had by dividing an estimate of the variance of the population of raw scores by the sample size. That is, the estimate of the variance of the raw scores must be scaled down through division by n. We already have an estimate of the variance of a population of X̄s; this is sX̄². So to derive an estimate of the variance of the population of raw scores, we have to scale sX̄² up by multiplying it by n.


And when the ns of the several samples are equal, that's all there is to the formula for the among-groups estimate of σ²: n, the common sample size, is used as a multiplier for Σ(X̄ - X̿)²/(k - 1), that is, as a multiplier for our friend sX̄². This gives Formula 22.2 on p. 395.

When the ns are not all equal, a slight modification is required, as incorporated in Formula 22.3. The example on p. 397 is a case of unequal ns.

In either case, the product of sample size and Σ(X̄ - X̿)² is called the among-groups sum of squares, SSA, and the rest of the formula, k - 1, is the among-groups degrees of freedom, dfA.

Disoriented again? Back to the big picture on p. 194. It's always there if you get lost in details, but at least you should now see where the details fit.

SUMMARY Continued

TWO-WAY ANALYSIS of VARIANCE

Two-way analysis of variance is applicable to a study in which two different treatments are varied simultaneously, with each combination of a level of one treatment and a level of the other applied to a different sample of scores. Three substantive questions arise: Did the first treatment influence the variable under observation? Did the second treatment influence that variable? And did the two treatments interact in influencing it?

The question about interaction is new. In general, the question of interaction between two treatments may be phrased this way: Whatever the difference among the several levels of one treatment, is it the same for each of the ____ of the other ____? [22.11, p. 406]

Variance Estimates

Each of the substantive questions gives rise to a statistical question. For the first, the statistical question is the plausibility of a null hypothesis asserting that μC1 = μC2 = μC3, and so on, where C1 names the treatment level represented by the first column in the table, C2 names the treatment level represented by the second column, and so on. For the second substantive question, the statistical question is the plausibility of another null hypothesis, this one asserting that μR1 = μR2 = μR3, and so on, where R1 names the treatment level represented by the first row, R2 names the treatment level represented by the second row, and so on. For the third substantive question, the statistical question is the plausibility of a third null hypothesis, and this one asserts that there is no interaction between the two treatments in the pattern of the population means for the individual cells of the table.

Each null hypothesis is tested by computing an F statistic, as in the one-way


analysis of variance, and each F is again formed by dividing one variance estimate
by another. For the three Fs, four variance estimates are needed.

1. s_WC² (within-cells estimate), derived from the variation among the scores
in the first cell, the variation among the scores in the second cell, and so forth.
This measure is of interest because it is free from the influence of possible
differences between columns (column ________), possible differences between
rows (row ________), and also any interaction effect, if present [22.13]. It
therefore measures only inherent variation, and is analogous to ________ in one-way
ANOVA [22.13].
2. s_C² (________ estimate), derived from the differences between column
____________ [22.13]. If the null hypothesis about the population values of the
columns is correct, variation among column means (X̄_C1, X̄_C2, and so on) will be
affected only by inherent variation. Under these circumstances, s_C² will estimate
the same quantity estimated by ________ [22.13]. If the null is false, s_C² will
tend to be larger than otherwise. It is therefore analogous to ________ in one-way
ANOVA [22.13].

3. s_R² (________ estimate), derived from the differences between row
____________ [22.13]. If the null hypothesis about the population values of the
rows is correct, variation among row means (X̄_R1, X̄_R2, and so on) will be
affected only by inherent variation. Under these circumstances, s_R² will estimate
the same quantity estimated by s_WC². If the null is false, s_R² will tend to be
larger than otherwise. It is therefore just like ________, except that it is
sensitive to row effect rather than to column effect [22.13].

4. s_RxC² (____________ estimate), derived from the discrepancy between
the means of the several columns / rows / cells and the values predicted for
each on the assumption of no interaction at the population / sample level
[22.13]. If there is no interaction, s_RxC² will be responsive only to ________
variation and will estimate the same quantity estimated by ________ [22.13].
If interaction is present, s_RxC² will respond to it and will therefore tend to
be larger / smaller [22.13].


Each of the three variance estimates s_C², s_R², and s_RxC² is responsive (1) to
the presence of the effect for which it is named (________ effect, ________
effect, and ________ effect), and (2) to ________ variation [22.13]. s_WC²,
on the other hand, is responsive only to inherent variation [22.13]. When the
first three estimates are at hand, Fs may be formed by placing each in turn in
the numerator / denominator and s_WC² in the numerator / denominator [22.13]. A
significantly small / large F will then serve as an indicator of the presence
of the effect specially associated with the kind of estimate placed in the
numerator / denominator [22.13].

Formulas for computing the variance estimates appear in Section 22.14.

Degrees of Freedom for the Variance Estimates


Let C equal the number of columns, R equal the number of rows, and n_WC equal
the number of scores within each cell (the ns for the cells are assumed to be
equal). Since there are C deviations involved in the computation of s_C², df_C =
________ [22.15]; similarly, df_R = ________. In computing s_WC², we consider the
deviation of each score in a cell from the cell mean. Consequently, each cell
contributes ________ degrees of freedom, and df_WC = ________ [22.15]. Finally,
df_RxC = ________ [22.15].
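The degrees-of-freedom bookkeeping can be checked with a short sketch (the design sizes are hypothetical): the component dfs for columns, rows, interaction, and within cells must account for every degree of freedom in the data.

```python
# Degrees of freedom in a two-way ANOVA with equal cell ns.
# Hypothetical design: C = 3 columns, R = 2 rows, n_wc = 5 scores per cell.
C, R, n_wc = 3, 2, 5

df_c = C - 1                      # columns
df_r = R - 1                      # rows
df_rxc = (R - 1) * (C - 1)        # interaction
df_wc = R * C * (n_wc - 1)        # within cells: each cell contributes n_wc - 1
df_t = R * C * n_wc - 1           # total

# The component dfs sum to the total df.
assert df_c + df_r + df_rxc + df_wc == df_t
```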

Main Effects and Interaction


The column effect and the row effect are called ________ effects [22.18]. The
interpretation of a significant main effect is clear when there is no significant
interaction. However, when significant ____________ is present, the
meaning of tests for main effects may be clouded [22.18]. It is particularly
important to study the several values of the column / row / cell means when the
test for interaction is significant [22.18].



COMPARISONS

The F test for a treatment (the test for the one treatment in a one-way

analysis of variance or the test for either treatment in a two-way) examines

the hypothesis of no difference among population means for all subgroups of

the treatment. Often it is more interesting to inquire about a certain pattern

among the subgroup means than to ask the one overall question, though. Some¬

times the logic of the study will suggest the particular comparisons to be made,

and if so we will know in advance what comparisons would interest us. Compar¬

isons chosen this way are called planned / post hoc comparisons [22.19]. On

other occasions, comparisons come to our attention only on inspection of the

data. Such comparisons are known as planned / post hoc comparisons [22.19].

The same / A different strategy is desirable for examining post hoc compari-
sons as / than for evaluating planned comparisons [22.19]. The way the compar-
ison is constructed is the same for both planned and post hoc comparisons; the
difference is in the way the comparison is ________ [22.19].

Constructing a Comparison

A comparison is constructed from the means of the subgroup samples in such


a way that:

1. In a comparison, two / two or more quantities are contrasted with each


other [22.20];

2. each term in the comparison is multiplied / divided by a coefficient


[22.20]; and

3. the total of the positive coefficients equals / exceeds the total of the

negative coefficients [22.20].

In general, a comparison, K, may be expressed as:

    K = a_1X̄_1 + a_2X̄_2 + ... + a_kX̄_k        [Formula 22.14]

where a_1, a_2, etc., are the ________ for the several ________
of the treatment, and ____ is the number of levels of the particular treatment
[22.20]. If some levels are not included in the comparison, the coefficients of
the means of these subgroups are assigned the value ________ [22.20].
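As a concrete illustration of the three rules above (the subgroup means and coefficients below are hypothetical, invented for the sketch):

```python
# A comparison K among subgroup means.
means = [10.0, 12.0, 15.0, 17.0]        # hypothetical subgroup means

# Contrast the first two levels against the last two: the positive
# coefficients total +1 and the negative coefficients total -1.
coeffs = [0.5, 0.5, -0.5, -0.5]
assert sum(c for c in coeffs if c > 0) == -sum(c for c in coeffs if c < 0)

# Each mean is multiplied by its coefficient; the terms are then summed.
K = sum(a * m for a, m in zip(coeffs, means))
```

Here K estimates how far the average of the first two population means falls from the average of the last two.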



Constructing Independent Comparisons


If possible, it is highly desirable to construct comparisons that are inde¬

pendent of each other. From among the k means of the levels of a treatment one

may construct a set of ________ comparisons that are nearly mutually independent

[22.23]. When subgroup sample size is equal, it can be done as follows.

1. Construct the first comparison using two or more / all levels of the

treatment [22.23].
2. The second comparison must be constructed wholly from subgroups that fall

on one side of the first comparison. Again, use two or more / all available

subgroups [22.23].
3. Construct the third comparison by applying the procedure of step 2 to the

comparison just obtained.


iI j comparisons
Comparisons constructed this way are called ------

[22.23]. Adequacy of the procedure rests on reasonable approximation to the


. distributed with equal
assumptions that subgroup populations are ---
, , r?2 23]. Moderate depar-
__ and that subgroup n s are ____ L^z* J

ture from these conditions will not be crucial.
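The splitting procedure can be sketched for k = 4 levels with equal ns. The check used below, that two comparisons are independent when the products of their corresponding coefficients sum to zero, is the standard orthogonality condition for equal ns, stated here as an assumption rather than taken from the text:

```python
# An orthogonal set of k - 1 = 3 comparisons for k = 4 levels, built by
# repeatedly splitting one side of the previous comparison.
contrasts = [
    [1, 1, -1, -1],   # step 1: levels 1-2 versus levels 3-4
    [1, -1, 0, 0],    # step 2: within the left side of comparison 1
    [0, 0, 1, -1],    # step 3: within the right side of comparison 1
]

# With equal subgroup ns, each pair of comparisons should be independent:
# the products of their corresponding coefficients sum to zero.
for i in range(len(contrasts)):
    for j in range(i + 1, len(contrasts)):
        assert sum(a * b for a, b in zip(contrasts[i], contrasts[j])) == 0
```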

Estimating the Standard Error of a Comparison


To evaluate a comparison, we need to estimate its standard error, which is
symbolized s_K. The formula for the calculation is:

    s_K = √[ s²_error (a_A²/n_A + a_B²/n_B + ...) ]        [Formula 22.15]

where s²_error is the variance estimate that would constitute the numerator /
denominator of the overall F test (________ in one-way ANOVA; ________ in
two-way ANOVA); a_A is the coefficient of the ________ of subgroup A, etc.; and
________ is the number of cases in subgroup sample A, etc. [22.21]. The number
of degrees of freedom associated with this estimated standard error is the
number associated with ____________ [22.21].
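A sketch of the Formula 22.15 calculation (the variance estimate, coefficients, and ns below are all hypothetical):

```python
import math

# Estimated standard error of a comparison, following the form of
# Formula 22.15: s_K = sqrt( s2_error * sum(a_j**2 / n_j) ).
def comparison_se(s2_error, coeffs, ns):
    """s2_error: error variance estimate from the overall F test;
    coeffs: comparison coefficients; ns: subgroup sample sizes."""
    return math.sqrt(s2_error * sum(a * a / n for a, n in zip(coeffs, ns)))

# A simple pairwise comparison (coefficients 1 and -1) of two subgroups
# of 10 cases each, with an error variance estimate of 4.0; the third
# subgroup has coefficient 0 and so drops out.
s_k = comparison_se(4.0, [1, -1, 0], [10, 10, 10])
```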

Evaluating a Comparison

A comparison, planned or otherwise, may be evaluated by ____________________
or by ____________________ [22.22].

If interval estimation is chosen, the procedure for a planned comparison is
entirely analogous to estimation of μ_X - μ_Y. The limits of the confidence inter-
val are given by the rule K ± t·s_K, and this is parallel to the rule in Formula
19.7 on p. 343 of the text: K is like (X̄ - Ȳ), and s_K is like s_X̄-Ȳ.

For post hoc comparisons, the text offers a procedure that, strictly speaking,
is applicable only in a situation where a preliminary overall F test for the
treatment has shown significance. If such a test has shown significance, then
there exists at least one comparison for which the null hypothesis will be
rejected / accepted at the same level of significance [22.24, p. 421]. As many
comparisons as are desired may be made, whether independent or not. The price
for such flexibility is that each comparison yields narrower / wider limits than
if it had been planned [22.24, p. 421].

SYMBOLISM DRILL

This drill is confined to the symbols used in the analysis of variance.


Pronunciation is given only where it's not obvious.

ONE-WAY ANALYSIS of VARIANCE

Symbol                  Meaning

1   D, E, F, ...        The several subcategories of a ________
2   X_D, X_E, X_F, ...  Scores in the several subcategories
3   X̄_D, X̄_E, X̄_F, ... Means of samples / populations in the subcategories
4   μ_D, μ_E, μ_F, ...  Means of samples / populations in the subcategories
5   X̿                   ________ mean; mean of all scores (pronounced "eks
                        double-bar")
6   s_A²                ________-groups estimate of σ²
7   SS                  ________ of ________ (sum of squares of deviations
                        from a mean)
8   SS_W                ________-groups ________ __ ________
9   df_W                ________-groups ________ __ ________
10  s_W²                ________-groups estimate of ________
11  SS_A                ________-groups ________ __ ________
12  df_A                ________-groups ________ __ ________
13  SS_T                ________ ________ __ ________
14  df_T                ________ degrees of ________
15  F                   s_A² / ________
TWO-WAY ANALYSIS of VARIANCE

16  X_Ci                Score in the ith ________ of the ________
17  X_Ri                Score in the ith ________ of the ________
18  X̄_Ci                ________ of the sample / population in the ith
                        ________ of the ________
19  X̄_Ri                ________ of the sample / population in the ith
                        ________ of the ________
20  μ_Ci                ________ of the sample / population in the ith
                        ________ of the ________
21  μ_Ri                ________ of the sample / population in the ith
                        ________ of the ________
22  s_WC²               ________-________ estimate of σ²
23  s_C²                ________ estimate of ________
24  s_R²                ________ estimate of ________
25  s_RxC²              ________ estimate of ________
26  SS_C                ________ ________ for ________
27  SS_R                ________ ________ for ________
28  SS_RxC              ________ ________ for ________
29  SS_WC               ________ ________ within ________
30  SS_T                ________ sum of ________
31  df_C                ________ for ________
32  df_R                ________ for ________
33  df_RxC              ________ for ________
34  df_WC               ________ within ________
35  df_T                ________ degrees of ________
36  F                   ________ / ________  or  ________ / ________  or
                        ________ / ________

COMPARISONS

37  K                   Sample / population value of a comparison
38  κ                   Sample / population value of a comparison (pronounced
                        "kappa")
39  s_K                 Estimate of ________ error of a ________
40  s²_error            s_W² or s_WC²
41  F'                  Critical value of ________ for a Scheffé comparison
ANNALS of EGREGIOUS EXAMPLES , Continued

Look back at the description of the marketing survey on p. 151 of this
workbook, and note that in his first study, the researcher really tested five
versions of that frozen foodstuff he had been hired to evaluate.

1. What inferential technique should he have used on the full set of data
from the first study?

In the second study, he again tested five versions of a product, but this time
each subject tasted and rated all five versions.

2. Is the inferential technique appropriate for the first study also
appropriate for the second? Why or why not?
INFERENCE ABOUT FREQUENCIES
23.1 Introduction
23.2 A Problem in Discrepancy between Expected and Obtained Frequencies
23.3 Chi-Square (χ²) as a Measure of Discrepancy between Expected and Obtained
     Frequencies
23.4 The Logic of the Chi-Square Test
23.5 Chi-Square and Degrees of Freedom
23.6 The Random Sampling Distribution of Chi-Square
23.7 Assumptions in the Use of the Theoretical Distribution of Chi-Square
23.8 The Alternative Hypothesis
23.9 Chi-Square and the 1 x C Table
23.10 The 1 x 2 Table and the Correction for Discontinuity
23.11 Small Expected Frequencies and the Chi-Square Test
23.12 Contingency Tables and the Hypothesis of Independence
23.13 The Hypothesis of Independence as a Hypothesis about Proportions
23.14 Finding Expected Frequencies in a Contingency Table
23.15 Calculation of χ² and Determination of Significance in a Contingency Table
23.16 Interpretation of the Outcome of a Chi-Square Test
23.17 The 2 x 2 Contingency Table
23.18 Interval Estimates about Proportions
23.19 Other Applications of Chi-Square
PROBLEMS and EXERCISES

1  ________________________________
2  ________________________________
3  ________________________________
4  ________________________________
5  ________________________________
6  ________________________________
7  ________________________________
8  ________________________________
9  ________________________________
10 ________________________________
11 ________________________________
12 ________________________________
13 ________________________________

SUMMARY

Way back in Chapter 2, a distinction was made between qualitative variables
and quantitative ones. A qualitative variable consists of a set of categories
that differ in quality, in kind, and not in quantity (not in degree).

All of the descriptive and inferential techniques presented so far in the
course, with the exception of the bar diagram, are for observations on quanti-
tative variables. Such observations are numerical scores, and it is scores
that are summarized by the frequency distributions of Chapter 3, by the histo-
gram and frequency polygon and cumulative frequency curves, and by the measures
of central tendency and the standard deviation of Chapters 5 and 6. And it is
scores that are treated inferentially by the techniques involving z, t,
Fisher's z', and F.

In Chapter 23, we return to qualitative variables. In the simplest case,
there is one such variable, and each subject is classified into one of the
categories that make up the variable. The data are summarized by counting up
the number of subjects classified into each category. The counts are frequen-
cies, and the list of categories with their associated frequencies forms a
frequency distribution, like the one on p. 428 of the text. The frequency
distribution could be graphed as a bar diagram.

In a more complicated case, there are two qualitative variables, and each
subject is simultaneously classified into one category from one variable and
one category from another variable. Again the data are summarized as frequen-
cies, here the frequencies with which subjects fall into the various combina-
tions of categories. An example is the table on p. 438.

In either case, whether there is one variable or two, the frequencies may
be converted to proportions, and it is the corresponding proportions in the
parent populations that are of interest.

To draw inferences from frequency data, the text introduces a new statistic,
chi-square. Chi-square is like t and F in that (a) it may be used to compare
two or more samples, (b) it has a known theoretical distribution, (c) it has a
different distribution for each different number of degrees of freedom, and
(d) it may be used to test a null hypothesis or to estimate a population param-
eter or a difference between two parameters. But the numbers that enter into
the computation of chi-square are frequencies, whereas t (like its large-sample
counterpart z) and F are computed from means and standard deviations or
variances.

Although chi-square was developed for qualitative variables, it can be used


for quantitative ones if the several class intervals that divide up the scale
of measurement for such a variable are treated as discrete categories.

The CASE of ONE VARIABLE


The Null Hypothesis

The simplest case to which the chi-square statistic can be applied is that

in which frequency counts are available for the categories of a single variable.

Problems of this class are sometimes said to be characterized by a 1 x C ("one
by C") table, where C is the number of ________ or class ________
[23.9]. In the 1 x C table, χ² may be used to test whether the relative fre-
quencies characterizing the several categories or class intervals of a sample /
population frequency distribution are in accord with the set of such values
hypothesized to be characteristic of the sample / population distribution [23.9].
In any such problem, the hypothesized relative frequency of occurrence in each
category or class interval is dictated by the statistical / substantive hypoth-
esis of interest [23.9]. The hypothesized proportions must / need not be equal
[23.9].

The Alternative Hypothesis

H_A, the alternative hypothesis, is simply that the null hypothesis is untrue
in some (any) way. Note that the distinction between a directional test and a
nondirectional one, encountered earlier, is still / not pertinent here [23.8],
with the exception described below.

Expected Frequencies

To conduct the test, we must generate expected frequencies, and these are
obtained for each category by multiplying the proportion hypothesized to charac-
terize that category in the population by the sample ________ [23.4, p. 430]. An
expected frequency is the ________ of the obtained frequencies that would occur
on infinite repetitions of an experiment such as the one actually done when the
null hypothesis is true / false and sampling is ________ [23.4, p. 430].

Computing Chi-Square

The chi-square statistic, χ², provides a measure of the discrepancy between
expected and obtained frequencies. Its basic formula, suited to this task, is:

    χ² = Σ (f_o - f_e)² / f_e        [Formula 23.1]

where ________ is the expected frequency and ________ is the obtained frequency,
and summation is over the number of ________ characterizing a given
problem [23.3]. Examination of the formula reveals several points of interest
about χ²:

1. χ² cannot be positive / zero / negative, since all discrepancies are
squared; both positive and negative discrepancies make a positive / negative
contribution to the value of χ² [23.3].

2. χ² will be ________ only in the unusual event that each obtained frequen-
cy exactly equals the corresponding expected frequency [23.3].

3. Other things being equal, the larger / smaller the discrepancy between
the f_e's and their corresponding f_o's, the larger χ² will be [23.3].

4. But it is not the size of the discrepancy alone that accounts for a con-
tribution to the value of χ²; it is the size of the discrepancy relative to the
magnitude of the expected / obtained frequency [23.3].

5. The value of χ² depends on the number of ________ involved in its
calculation [23.3]. The method of evaluating χ² must therefore take this factor
into account. This is done by considering the number of ________ __ ________
(df) associated with the particular χ² [23.3].
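A small sketch of Formula 23.1 for a one-variable problem (the frequencies are hypothetical: 60 rolls of a die, with equal proportions hypothesized for the six faces):

```python
# Chi-square for a 1 x C table: X2 = sum of (f_o - f_e)**2 / f_e.
f_o = [8, 12, 9, 11, 6, 14]        # obtained frequencies (hypothetical)
n = sum(f_o)
f_e = [n * (1 / 6)] * 6            # hypothesized proportion times sample size

# Each cell contributes its squared discrepancy relative to f_e.
chi2 = sum((o - e) ** 2 / e for o, e in zip(f_o, f_e))
df = len(f_o) - 1                  # C - 1 degrees of freedom
```

The resulting χ² is then referred to the tabled distribution for C - 1 degrees of freedom.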

Degrees of Freedom for Chi-Square

The concept of degrees of freedom was encountered first in Chapter 19 in
connection with the t statistic and then again in connection with F. In those
settings, df proved to be a function of sample ________ [23.5]. However, in
problems with frequency data, the number of degrees of freedom is determined by
the number of (f_o - f_e) discrepancies that are ____________ of each
other, and are therefore "free to vary" [23.5]. In general, the number of de-
grees of freedom for problems of the one-variable type will be C - 1, where C
is the number of ________ involved [23.5].
212 Chapter 23

The Logic of the Chi-Square Test

As noted above, an expected frequency is the ________ of the obtained fre-
quencies that would occur on infinite repetitions of an experiment when the null
hypothesis is true / false and sampling is ________ [23.4, p. 430]. When
the hypothesis is true, the several obtained frequencies will vary from their
corresponding ________ frequencies only according to the influence of
random sampling fluctuation [23.4]. The calculated value of χ² will be smaller
when agreement between f_o's and f_e's is good / poor and larger when it is not
[23.4].

When the hypothesized f_e's are not the true ones, the set of discrepancies
between f_o and f_e will tend to be larger / smaller than otherwise, and, conse-
quently, so will the calculated value of χ² [23.4]. To test the hypothesis, we
must learn what calculated values of χ² would occur under random sampling when
the hypothesis is true / false [23.4]. Then we will compare the calculated χ²
from our particular sample with this distribution of values. If it is so large
that such a value would rarely occur when the hypothesis is true, the hypothesis
will be accepted / rejected [23.4].

The Random Sampling Distribution of Chi-Square

When the hypothesis to be tested is true, and when the conditions noted below
obtain, the sampling distribution formed by the values of χ² calculated from re-
peated random samples closely follows a known theoretical distribution. Actually,
there is a family of sampling distributions of χ², each member corresponding to
a given number of ________ __ ________ [23.6]. It is useful to know
that the ________ of any member of the family of chi-square distributions is al-
ways the same as the number of degrees of freedom associated with that particular
distribution [23.6].

The conditions that must obtain for a sampling distribution of χ² to follow
a known theoretical distribution are those stated in these assumptions:

1. It is assumed that the sample drawn is a ________ sample from the
population about which inference is to be made [23.7].

2. It is assumed that observations are ________ [23.7]. The set
of observations will not be completely ________ when their number
is less than / equals / exceeds the number of subjects [23.7].

3. It is assumed that, in repeated experiments, observed frequencies will
be ________ distributed about expected frequencies. With random sam-
pling this tends to be true. There are two important ways in which this assump-
tion may be violated:

(a) When f_e is small / large, the distribution of f_o's about f_e tends to
be positively skewed [23.7].

(b) The theoretical distribution of chi-square is smooth and continuous.
On the other hand, the obtained values of χ² actually form a ________
distribution [23.7]. Comparing the obtained values of χ², which form a
________ series, with the continuous distribution of theoretical values may re-
sult in a degree of error [23.7]. The importance of this discrepancy is, for-
tunately, minimal unless both n and df are small / large [23.7]. A correction
exists when df = ________ [23.7].

Special Considerations in the Case of a 1 x 2 Table

When the variable under study consists of only two categories, the data fall
into a 1 x 2 table, and the number of degrees of freedom is equal to one. In
the special circumstance that df = 1, a correction may be applied to compensate
for the error involved in comparing calculated values of χ², which form a dis-
continuous / continuous distribution, with the theoretical tabled values of χ²,
which form a discontinuous / continuous distribution [23.10]. This correction
is known as Yates' correction, or the correction for ____________,
and consists of reducing the discrepancies between f_o and f_e by ________ before
squaring [23.10].

A second special consideration in this case is the availability of a one-
tailed test. When df = 1 (only), χ² = z². We may therefore calculate z = ________
and compare that value with the critical one-tailed value of normally distributed
z [23.8, next-to-last paragraph]. The null hypothesis should be rejected only
for differences in the direction specified in the null / alternative hypothesis
[23.8].

A third special consideration is the size of the sample. When df = 1, both
f_e's ought to equal or exceed ________ [23.10].


214 Chapter 23

Fourth, note that this test may be conceived as a test about a single
____________ [23.10, next-to-last paragraph]. The null hypothesis in
this conception states the value of the proportion in the population of interest;
the population value is symbolized P and the sample value p. The test of a
single proportion is conceptually analogous to the test of a single ________
[23.10]. The difference is simply in the test statistic involved; i.e., X̄ or p,
whose statistical significance is assessed with t or χ², respectively.

Finally, just as with single means, it is possible to construct an interval
estimate of ________ [23.10]. The procedure is explained on p. 446.
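The text's own interval procedure appears on its p. 446; as a sketch only, and stated as an assumption rather than as the text's method, the familiar large-sample normal-approximation interval for a single proportion can be computed like this:

```python
import math

# Large-sample interval estimate for a population proportion P, using
# the normal approximation: p +/- z * sqrt(p * (1 - p) / n).
def proportion_interval(p, n, z=1.96):      # z = 1.96 for 95% confidence
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical sample: p = .60 based on n = 100 cases.
lo, hi = proportion_interval(0.6, 100)
```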

The CASE of TWO VARIABLES

The Null Hypothesis

So far, we have considered the application of chi-square to the one-variable
case. It also has important application to the analysis of ____________
frequency distributions [23.12]. Here there are two variables of interest, and
each subject is simultaneously classified into one category of one variable and
one category of the other variable. The resulting frequency counts are cast
into a matrix like Table 23.3 on p. 438. Bivariate frequency distributions of
the type illustrated in Table 23.3 are known as ____________ tables
[23.12]. In many ways, such a table is similar to the bivariate frequency dis-
tributions encountered in the study of correlation (see Chapter 9). Indeed, the
major difference is that here the two variables are both qualitative / quantitative
variables rather than qualitative / quantitative variables [23.12].

From such a table we may inquire what cell frequencies would be expected if
the two variables are independent of each other in the sample / population
[23.12]. Then, chi-square may be used to compare the obtained cell frequencies
with those expected under the hypothesis of independence. If the f_o - f_e dis-
crepancies are small / large, χ² will be small, suggesting that the two variables
of classification could be independent [23.12] in the population. Conversely,
a small / large χ² will point toward a contingent relationship [23.12] in the
population.

An alternative conception of the null hypothesis is possible. In general,

the hypothesis of independence in a contingency table (at the population level)

is equivalent to hypothesizing that the proportionate distribution of frequen¬

cies in the population for any row is the same for all ________, or that in
the population the proportionate distribution of frequencies for any column is
the same for all ________ [23.13]. So again the null hypothesis to be
tested by χ² may be thought of as one concerning proportions.

The Alternative Hypothesis

No matter how the null hypothesis is conceived, the alternative hypothesis


states simply that the null is false, and the distinction between one-tailed and
two-tailed tests does not apply—unless each variable consists of only two cate¬

gories, as noted below.

Calculating Expected Frequencies

Frequencies expected under the hypothesis of independence (at the population

level) in a contingency table may be calculated as follows:

1. Find the column proportions by dividing each column total by the grand

total (n). The sum of these proportions should always be ________ [23.14].

2. Multiply each row total by these column proportions; the result in each
instance is the expected cell frequency (f_e) for cells in that row. Keep the
result to one decimal place.

3. Check to see that the total of the expected frequencies in any row or in
any column equals that of the ________ frequencies [23.14].

The same result could be obtained by finding the row proportions and multiply-
ing by ________ totals [23.14].
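The steps above can be sketched as follows (the frequencies are hypothetical); multiplying each row total by the column proportions is the same as computing row total times column total divided by n for each cell:

```python
# Expected frequencies under independence in an R x C contingency table.
table = [[10, 20, 30],
         [20, 30, 10]]          # hypothetical obtained frequencies

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# f_e for each cell = row total * column proportion = row * col / n.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Check: expected row totals match the obtained row totals.
for row, rt in zip(expected, row_totals):
    assert abs(sum(row) - rt) < 1e-9

df = (len(table) - 1) * (len(table[0]) - 1)    # (R - 1)(C - 1)
```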

Degrees of Freedom

To figure degrees of freedom in a contingency table, we consider that the

column totals and the row totals are fixed / free and ask how many cell frequen¬

cies are free to vary [23.15]. In general, for an R x C contingency table, their

number (and therefore the number of degrees of freedom) is ____________, where

C is the number of columns and R is the number of rows [23.15].

The Logic of the Chi-Square Test


If the null hypothesis of independence of classification is true at the pop-
ulation level, we should expect that random sampling will produce obtained
values of χ² that are in accord with the tabled distribution of that statistic
for the appropriate number of degrees of freedom. If the hypothesis is false
in any way, the calculated value of χ² will tend to be smaller / larger than
otherwise [23.15]. As before, then (as for the case of a single variable), the
region of rejection is placed in the lower / upper tail of the tabled distri-
bution [23.15].

Interpreting a Chi-Square Test

Since the contingency table is analogous to the correlation table, it might
be thought that χ², like r, provides a measure of strength of association. Al-
though χ² may be converted into such a measure, it does not, by itself, serve
this function. The purpose of the chi-square test as applied to a contingency
table is to examine the hypothesis of ____________ between the two
variables at the population level [23.15]. Consequently, it is more nearly an-
alogous to the test of the hypothesis, in a correlation table, that the true
correlation is ________ [23.15].

Remember that a significant outcome of the chi-square test is directly ap-
plicable to any row or column / only to the data taken as a whole [23.16]. The
χ² which we obtain is inseparably a function of the R x C contributions (one
from each cell) composing it. We cannot say for sure whether one group is re-
sponsible for the finding of significance or whether all are involved.

We should also remember that when small / large samples are involved, pro-
portionately small differences may be responsible for statistically significant
differences [23.16]. Paying attention to proportions rather than frequencies will
help curb undue excitement upon obtaining a significant outcome in a small /
large sample [23.16].

Special Considerations in the Case of a 2 x 2 Table

The 2 x 2 table affords ________ degree(s) of freedom [23.17]. Consequently,
Yates' correction is applicable. We may, if we wish, proceed to treat this table
in exactly the same way as afforded an R x C table, except that each (f_o - f_e)
discrepancy would be reduced by ________ before ________ing [23.17]. A special
formula is available, however, that incorporates the correction and reduces
computational labor. (See p. 444.)

A second special consideration in this case is the availability of a one-
tailed test. The alternative hypothesis may state not just that the null is
false, but that it is false in a certain one of two ways. The procedure for a
one-tailed test in the case of a 1 x 2 table is also applicable here.

A third special consideration is again the size of the sample. Because df
= 1, all f_e's should equal or exceed 5.

Fourth, just as the 1 x 2 chi-square table is related to testing a hypothe-
sis about a single ____________, so is the 2 x 2 table related to the
test of the difference between two ____________ from dependent / inde-
pendent samples [23.17].

Finally, it is possible to construct an interval estimate of the difference
between the two proportions involved in a 2 x 2 table, as explained on p. 447.
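A sketch of the 2 x 2 calculation done the long way, with Yates' correction applied cell by cell rather than through the special computational formula on p. 444 (the frequencies are hypothetical):

```python
# Chi-square for a 2 x 2 table with Yates' correction: each |f_o - f_e|
# discrepancy is reduced by 0.5 before squaring.
def chi2_yates(table):
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n   # expected frequency
            total += (abs(f_o - f_e) - 0.5) ** 2 / f_e
    return total

chi2 = chi2_yates([[20, 10],
                   [10, 20]])
# With df = 1, chi2 is compared against the tabled critical value
# (3.84 at the .05 level).
```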

Statistics in Action

The EFFECT of ENVIRONMENTAL NOISE on HELPFULNESS

In 1975 two psychologists reported a clever field experiment in which they
had unobtrusively observed reactions to a stranger who dropped a stack of books.
The incident was staged out-of-doors in a complex of student apartments on a
university campus, and the subjects were men who happened to walk into the
scenario alone. The stranger wore a cast on his arm, and he was carrying the
books from a car to an apartment when he spilled them. The subject was six feet
away at the time.

The dependent variable was a simple, qualitative one: did the subject help
the stranger retrieve his books, or didn't he? The independent variable was the
environmental noise: for half the subjects, it was at its normal level, about
50 decibels (db), and for the other half it was raised to 87 db by a confederate
of the experimenters who ran a gasoline-powered lawn mower 25 feet from the
point where the stranger dropped the books.

The researchers reported the following results:

              50 db   87 db

Help?  Yes      16       3
       No        4      17

1. Describe the data in terms of proportions.

2. Is this a case of one variable or two variables?

3. The chi-square test on these data can be conceived as a test for an
association between two variables or as a test for a difference between two
proportions.

(a) State the null hypothesis for a test for an association.

(b) State a two-tailed alternative hypothesis for this null.

(c) State the null hypothesis for a test for a difference between two
proportions.

(d) State a two-tailed alternative to this null.

4. Compute chi-square and draw a conclusion about the null hypothesis, using
the .05 level of significance.

The researchers conducted a parallel experiment in which the stranger did not
wear a cast on his arm, and here the noise level did not significantly affect the
proportion of subjects who helped him, which was 20% at 50 db and 10% at 87 db.
The effect of noise on helpfulness is thus not a simple one; sometimes it matters
and sometimes it doesn't.

Reference: K. E. Mathews, Jr., & L. K. Canon, "Environmental Noise Level as
a Determinant of Helping Behavior," Journal of Personality and Social Psychology,
1975, 32, 571-577.
SOME ORDER STATISTICS (MOSTLY)

24.1 Introduction

24.2 Ties in Rank

24.3 Spearman's Rank Order Correlation Coefficient

24.4 Test of Location for Two Independent Groups: The Mann-Whitney Test

24.5 Test of Location Among Several Independent Groups: The Kruskal-Wallis Test

24.6 Test of Location for Two Dependent Groups: The Sign Test

24.7 Test of Location for Two Dependent Groups: Wilcoxon's Signed Ranks Test

PROBLEMS and EXERCISES

1 ____   2 ____   3 ____   4 ____   5 ____   6 ____
7 ____   8 ____   9 ____   10 ____  11 ____  12 ____
13 ____  14 ____  15 ____  16 ____  17 ____

SUMMARY

In Chapter 24 the text returns to quantitative variables and describes some


alternatives to the techniques previously presented for dealing with observations
on such variables. In the realm of descriptive statistics there is an alterna¬
tive to Pearson's r, and in the realm of inferential statistics there are alter¬
natives to the t-test for two dependent groups, the t-test for two independent
groups, and the one-way analysis of variance.

The t-tests and the one-way analysis of variance are very efficient, in the

sense of providing high power for a given sample size, and they are the tech¬

niques of choice for quantitative variables. But to be absolutely correct the

techniques require that certain assumptions hold true for the distributions of

scores in the populations from which the available data came. For example,

_ of subgroup populations and homogeneity of _

are assumed for the t-test of the difference between _ of dependent /

independent samples and also for the one-way analysis of variance [24.1]. The

tests are quite "robust" against violation of such assumptions, in that they

yield results close to correct when the assumptions are wrong. However, a prob¬

lem can arise when the distributional assumptions are materially violated and

sample size is _ [24.1].

The alternative techniques presented in this chapter require less restrictive


conditions. The alternatives, however, especially the Sign Test, are somewhat
less efficient than the standard techniques when the assumptions necessary for
the latter are fully met.

All but the Sign Test among the techniques described in this chapter require

that the data be in the form of ranks, so if scores are on hand they must be

rank-ordered. Once ranks are available, though, the techniques that require them

are easy to use, and the Sign Test is exceptionally simple. Thus the techniques

of this chapter might be given special consideration when:

1. the data as gathered are already in the form of _____ [24.1];

2. there is substantial reason to believe that the _

assumptions required for the more efficient techniques may be violated and when

sample size is _ [24.1];

3. rapidity of analysis and ease of computation are special considerations.

Dealing with Ties in Rank

A problem that arises quite frequently in translating scores to ranks is that

of identical scores that therefore cause ties in rank. Most rank-order proce-

dures are based on the assumption that the underlying measure is discrete / con-

tinuous and that therefore theoretically there are no / only a few ties [24.2].

There are various ways to deal with ties in rank. A simple and reasonably satis-

factory scheme is to assign each of the consecutive scores in a tie the _____

of the ranks that would be available to them [24.2]. This procedure usually has

little or no effect on the _____ of the entire sample but tends to reduce the

_____ [24.2]. Fortunately, the disturbance created in the statistic

being calculated is usually slight unless perhaps as many as a _____ of all

scores are involved in ties [24.2].
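The midrank scheme just described is easy to automate. The sketch below is an illustration only (the function name and data are made up), but it carries out exactly the rule in the text: tied scores each get the average of the ranks that would otherwise be available to them.

```python
def midranks(scores):
    """Assign ranks (1 = lowest), giving each run of tied scores
    the average of the ranks those scores would occupy."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of scores tied with position i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + 1 + j + 1) / 2.0   # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

print(midranks([5, 7, 7, 8]))   # the two 7's share ranks 2 and 3 -> [1.0, 2.5, 2.5, 4.0]
```

Note that the two tied 7's receive 2.5 each, so the sum of all ranks is still 1 + 2 + 3 + 4 = 10, just as it would be without ties.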

Spearman's Rank-Order Correlation Coefficient

Spearman's correlation coefficient, symbolized r_s, is closely related to the

_____ correlation coefficient [24.3]. In fact, if the paired scores

are both in the form of ranks (and there are no ties in rank), calculation of

r_s and _____ will yield similar / identical outcomes [24.3]. r_s is also

used on occasion when both sets of measures are in score form. In this case,

each set of measures is translated into rank form, assigning _____ to the lowest

score, _____ to the next lowest, etc. [24.3]. When would one do this? Sometimes

the scale properties of the measures appear doubtful, as explained in Sections

2.7 and 2.8. If it can be concluded that what matters is that one score is higher

than another and that how much higher is not really important, translating scores

to _____ will be suitable [24.3].

The formula for the Spearman rank correlation coefficient is:

    r_s = ____________________   [Formula 24.1]

where D is the _____ between a pair of scores / ranks and n is

the number of _____ of scores / ranks [24.3].

Exact procedures have been developed for testing the hypothesis of no correla-

tion in the population sampled for very small samples, but good results may be

had for n >= _____ by finding the critical values required for significance for df

= _____ in Table E of Appendix F [24.3]. This is the same table used to

determine significance of Pearson _____ [24.3].
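The claim that r_s and Pearson's r give identical outcomes on untied ranks can be checked directly. The sketch below assumes the standard Spearman formula, r_s = 1 - 6*sum(D^2)/(n(n^2 - 1)); the ranks themselves are made up for illustration.

```python
import math

def spearman_rs(rank_x, rank_y):
    """Spearman's r_s via the standard formula, assuming no ties."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

def pearson_r(x, y):
    """Ordinary Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rx = [1, 2, 3, 4, 5]          # hypothetical ranks, no ties
ry = [2, 1, 4, 3, 5]
print(spearman_rs(rx, ry))    # 0.8
print(pearson_r(rx, ry))      # identical: 0.8
```

Here sum(D^2) = 4, so r_s = 1 - 24/120 = .80, and Pearson's r computed on the same ranks agrees exactly.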



The Mann-Whitney Test

This test is an alternative to the t-test of the difference between means

of two dependent / independent samples [24.4]. The null hypothesis states

the identity of the two population distributions (the entire distributions)

rather than the identity of just the two means, medians, or whatever measure of

central tendency is used. Nevertheless, if the two population distributions are

of even moderately similar shape and variability, the Mann-Whitney is an excellent

test of _____ [24.4]. Since the test is on ranks, the

most closely corresponding measure of central tendency is the _____ [24.4].

The procedure for conducting the test is as follows:

1. Label the two groups X and Y; if one group contains fewer cases than the

other, it must be labeled X / Y [24.4].

2. Combine all scores into one distribution of n_X + n_Y cases. Then assign

the rank of 1 to the lowest / highest score, 2 to the next lowest / highest

score, etc., until all scores are ranked [24.4].

3. Find ΣR_X, the sum of the _____ of all scores in the X / Y distribu-
tion [24.4].

The remainder of the procedure requires Table J in Appendix F and depends on

the alternative hypothesis.

The fundamental assumptions for this test are _____ sampling with /

without replacement and only a few / no ties in rank [24.4]. A moderate number

of tied ranks does not substantially disturb the sense of the outcome.
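The three numbered steps can be sketched in code. The data below are hypothetical, midranks handle any ties, and the Table J look-up that completes the test is not shown.

```python
def sum_ranks_x(x_scores, y_scores):
    """Steps 1-3 of the Mann-Whitney procedure: pool the two groups,
    rank from lowest (rank 1) to highest with midranks for ties, and
    return the sum of the ranks belonging to the X (smaller) group."""
    pooled = [(s, 'X') for s in x_scores] + [(s, 'Y') for s in y_scores]
    pooled.sort(key=lambda p: p[0])
    total_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j + 1) / 2.0   # average of ranks i+1 .. j+1
        total_x += midrank * sum(1 for k in range(i, j + 1) if pooled[k][1] == 'X')
        i = j + 1
    return total_x

# X must be the smaller group
print(sum_ranks_x([3, 5, 6], [7, 8, 9, 10]))   # X holds ranks 1, 2, 3 -> 6.0
```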

The Kruskal-Wallis Test

This test is an alternative to the one-way / two-way analysis of variance

[24.5]. It may be thought of as an extension of the _____ test

to more than two groups [24.5]. Like the _____ test (and like

the one- and two-way analysis of variance procedures described in Chapter 22), it

is for dependent / independent groups [24.5]. The null hypothesis states the

identity of the several population distributions (the entire distributions again)

rather than the identity of just some particular measure of _____

for the several populations [24.5]. Under ordinary circumstances,

however, it is a good test for location. When examining a significant outcome,

the mean / median / mode is probably the best descriptive statistic to use [24.5].

The test statistic for the Kruskal-Wallis procedure is called H and is

computed via Formula 24.2 on p. 461. With three groups and 4 or more cases per

group, the _____ distribution may be used to evaluate H and will

give good approximate results [24.5]. With more than three groups, some groups

can have as few as 2 or 3 cases. Compare H_calc with tabled values of _____

(using Table G in Appendix F) for df = _____, where k is the number of _____

[24.5]. The region of rejection lies in the upper / lower tail of the _____

distribution [24.5].

As with the _____ test for two independent groups, the

effect of ties in rank is not great unless there are many of them [24.5]. As-

sumptions for the Kruskal-Wallis test are the same as for the _____

test: random sampling with / without replacement and only a few / no ties in

rank [24.5].
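Formula 24.2 itself is not reproduced in this workbook, but in its standard form it is H = [12/(N(N + 1))] * sum(R_j^2 / n_j) - 3(N + 1), where R_j and n_j are the rank sum and size of group j; that standard form is assumed in the sketch below, and the data are made up.

```python
def kruskal_wallis_h(groups):
    """H in its standard form: (12 / (N(N+1))) * sum(R_j^2 / n_j) - 3(N+1),
    where R_j is the rank sum of group j.  Assumes no tied scores."""
    pooled = sorted((score, g) for g, group in enumerate(groups) for score in group)
    rank_sums = [0.0] * len(groups)
    for rank, (_, g) in enumerate(pooled, start=1):
        rank_sums[g] += rank
    n_total = len(pooled)
    return 12.0 / (n_total * (n_total + 1)) * sum(
        r ** 2 / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n_total + 1)

# three hypothetical groups that do not overlap at all
h = kruskal_wallis_h([[1.1, 2.2, 3.3], [4.4, 5.5, 6.6], [7.7, 8.8, 9.9]])
print(round(h, 2))   # 7.2; compare to chi-square with df = k - 1 = 2
```

With k = 3 groups, df = 2 and the chi-square critical value at the .05 level is 5.99, so this H would be significant.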

The Sign Test


The Sign Test and Wilcoxon's Signed Ranks Test are commonly used to test for

a difference in location for two dependent / independent groups [24.6]. For

the Sign Test, difference scores are calculated as though one were going to do

a t-test for dependent means using the procedure of Section 17.11, but all posi-

tive difference scores are assigned the symbol "+", and all negative difference

scores are assigned the symbol "-". Under the null hypothesis, we would expect

that there would be as many "pluses" as "minuses" in the sample within the limits

of sampling fluctuation, and we can compute a chi-square to test the null:

    χ² = ____________________   [p. 464, 1st formula]

This statistic has 1 df.

In conducting a test according to these principles, it will occasionally

occur that some of the differences will be _____ and cannot, therefore, be

categorized as + or - [24.6]. This dilemma may be solved in one of several ways.

Probably the simplest is to ignore such cases, reduce _____ accordingly, and pro-

ceed with the test on the remaining values [24.6].
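The chi-square itself is left for you to copy from p. 464 of the text, but the bookkeeping can be sketched. The formula used below is a common continuity-corrected form, assumed here for illustration rather than taken from the text, and the difference scores are made up.

```python
def sign_test_chi_square(diffs):
    """Sign Test evaluated by chi-square (1 df).  The formula here is a
    common continuity-corrected form, assumed for illustration:
    (|n_plus - n_minus| - 1)^2 / (n_plus + n_minus).
    Zero differences are simply ignored, as suggested above."""
    n_plus = sum(1 for d in diffs if d > 0)
    n_minus = sum(1 for d in diffs if d < 0)
    return (abs(n_plus - n_minus) - 1) ** 2 / (n_plus + n_minus)

# hypothetical X - Y difference scores; note the one zero difference
diffs = [2, 1, 3, 1, 2, 4, 1, 2, -1, 3, 1, 2, -2, 1, 0, 2]
print(sign_test_chi_square(diffs))   # (|13 - 2| - 1)^2 / 15 = 6.67 or so
```

Dropping the zero reduces the count from 16 pairs to 15, exactly as the paragraph above prescribes.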



Evaluating the Sign Test by χ² will give reasonable accuracy for _____ or

more pairs of scores [24.6].

The assumptions required for this test are that the X - Y differences have

been randomly drawn from the _____ of difference scores and that

sampling is with / without replacement [24.6]. A third assumption is that no

difference is exactly _____ [24.6]. With regard to the last assumption,

using the method described above will be reasonably satisfactory provided the

number of _____ is small [24.6].

Wilcoxon's Signed Ranks Test

The Wilcoxon test is an alternative to the _____ Test (and also to the

t test of the difference between two dependent / independent means) [24.7].

It is more sensitive than the _____ Test, but it demands an assumption that

we may not be willing to make [24.7]: we must assume that differences between

pairs of scores can be placed in rank order. For the test itself, the assump-

tions are random assignment of treatment condition to members of a _____

(and independent assignment among different _____), no differences of _____,

and only a few / no ties in rank [24.7, p. 467].

To conduct the test, compute difference scores in the usual way. Then dis-

regard the sign / size of the differences obtained, and supply ranks to the

absolute magnitude of the differences, assigning a rank of 1 to the smallest /

largest of the differences, 2 to the next smallest / largest, etc. [24.7, p.

466]. Next, resupply the appropriate _____ of the differences to the ranks

of the differences. The test statistic will be the _____ of the ranks with a

negative sign or the _____ of the ranks with a positive sign, whichever is

smaller / larger [24.7]. The rest of the procedure requires Table K in Appendix

F and depends on the alternative hypothesis.

We can see immediately why this test is more sensitive than the Sign Test.

That test responded only to the size / direction of the difference between a

pair of scores, whereas the Wilcoxon test uses additional information about the

size / direction of the difference [24.7].
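The whole procedure up to the table look-up can be sketched as follows; the difference scores are hypothetical, and the statistic is taken (as is conventional) to be the smaller of the two sums of like-signed ranks.

```python
def wilcoxon_t(diffs):
    """Wilcoxon's Signed Ranks statistic: rank the absolute differences
    (1 = smallest), reattach the signs, and take the smaller of the two
    sums of like-signed ranks.  Assumes no zero differences and no ties."""
    ordered = sorted(diffs, key=abs)
    pos = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    neg = sum(rank for rank, d in enumerate(ordered, start=1) if d < 0)
    return min(pos, neg)

print(wilcoxon_t([4, -1, 6, 3, -2, 7]))   # negative ranks 1 and 2 -> 3
```

Notice how much information the signs of the small differences carry here: the two negative differences are also the two smallest in absolute size, which is what drives the statistic down.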
ANSWERS to QUESTIONS in THIS WORKBOOK

CHAPTER 2

1. All adults in the state.

2. Their answers to the question asked in the poll.

3. The 500 people interviewed.

4. The 500 answers to the question asked in the poll.

5. 500, of course.

6. An element.
7. A constant so far as your survey is concerned. In a nationwide survey,
state of residence would be a variable.

8. Qualitative. 9. Discrete. 10. Nominal.

11. Quantitative. 12. In theory, continuous; in practice, discrete. 13. Ratio.

14. Qualitative. 15. Discrete. 16. Nominal.

17. Most researchers would say yes, but not all. "Yes" indicates more of
something than "No" does, namely more approval, and so the answers attempt
to rank-order the respondents for the magnitude of their approval. It is
certainly not an interval or a ratio scale; those who disagree would say it
is measured on a nominal scale.

18. Not nominal, because the variable is quantitative: higher scores indicate
greater favorability. Not ratio, because zero does not indicate
the absence of the quantity of interest. That leaves interval, but we can't tell
whether the interval between scores means the same thing all along the scale. For example,
if Person A gets a score of 1 and B a 2, they differ by 1 unit on
the scale. If C gets a score of 4 and D a 5, they also differ by 1 unit.
But we cannot tell whether the difference in favorability between A and B is the
same as the difference in favorability between C and D. So we don't know exactly
what kind of scale we have. As Section 2.8 says, it probably
does little harm to treat scores like these as though they formed interval
or even ratio scales without going far wrong.


CHAPTER 3

Page 23:

Score Limits   Exact Limits    f   Prop.f   %f   cum f   Prop.cum f   cum %f

                               3    .25     25    12       1.00        100
                               3    .25     25     9        .75         75
                               2    .17     17     6        .50         50
                               1    .08      8     4        .33         33
                               3    .25     25     3        .25         25
                              __   ____    ___
                              12   1.00    100

Page 24:

Score Limits   Exact Limits    f   Prop.f   %f   cum f   Prop.cum f   cum %f

 23 - 27       22.5 - 27.5     4    .33     33    12       1.00        100
 18 - 22       17.5 - 22.5     2    .17     17     8        .67         67
 13 - 17       12.5 - 17.5     2    .17     17     6        .50         50
  8 - 12        7.5 - 12.5     0    .00      0     4        .33         33
  3 -  7        2.5 -  7.5     4    .33     33     4        .33         33
                              __   ____    ___
                              12   1.00    100

Score Limits   Exact Limits     f   Prop.f   %f   cum f   Prop.cum f   cum %f

496 - 505     495.5 - 505.5     5    .33     33    15       1.00        100
486 - 495     485.5 - 495.5     3    .20     20    10        .67         67
476 - 485     475.5 - 485.5     2    .13     13     7        .46         46
466 - 475     465.5 - 475.5     5    .33     33     5        .33         33
                               __   ____    ___
                               15    .99     99

In this latter table, the sums of the proportionate and percentage frequencies
are not quite what they should be because of rounding error.
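The bookkeeping behind these tables is mechanical; the sketch below rebuilds the derived columns of the first Page 24 table from its frequencies alone (cumulating from the bottom interval upward, as the tables do).

```python
def freq_table(freqs):
    """Given class frequencies listed from the top interval down, return
    (f, prop f, %f, cum f, prop cum f, cum %f) rows in the same order.
    Cumulative figures accumulate from the bottom interval upward."""
    n = sum(freqs)
    cum = []
    running = 0
    for f in reversed(freqs):
        running += f
        cum.append(running)
    cum.reverse()
    return [(f, round(f / n, 2), round(100 * f / n),
             c, round(c / n, 2), round(100 * c / n)) for f, c in zip(freqs, cum)]

for row in freq_table([4, 2, 2, 0, 4]):   # frequencies from the 22.5 - 27.5 table
    print(row)
```

The rounding in the last table of proportions (which summed to .99) would show up here too; rounding each class proportion separately need not yield a total of exactly 1.00.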

CHAPTER 3 , continued

Page 26:

1. 98 2. 14

3. 87.5 4. 69.5

5. 75.5 6. 81.5 7. 78.5 and 81.5

8. 84.5 and 87.5 9. 87.5 and 90.5

10. 28 and 40

11. 52 and 64

12. 96 and 98

Page 28:

1. 66.5 inches.

2. A centile point.

3. Six-one = six feet + one inch = 72 inches + 1 inch = 73 inches. The 95th
centile point is 73.1 inches. Thus 5% of the men are over 73.1 inches, and so
a bit more than 5% are over 73 inches even.

4. Neither. The answer is indeed a percentage, but it's not the percentage of
cases falling below a given point along the scale of scores.

5. 69.3 inches.

6. A centile point (a score).

7. 67.9 inches and 69.3 inches.

8. 64.3 inches and 73.1 inches (which are C5 and C95, respectively).

9. 10%, or about 41 of the 411 or so. The table indicates that 10% were below
65.4 inches in height, and 20% were below 66.5. So going up the scale of heights
from 65.4 to 66.5 raised the cumulative percentage from 10 to 20, getting us an
additional 10% of the cases in the interval in question.

10. 25%, or about 103 of the 411 or so men. The logic behind this answer is
the same as that for Question 9.

CHAPTER 4

Page 37:

1. Skewed, with the tail on the right.

2. Skewed, with the tail on the left. Maybe J-shaped, even. It will be J-
shaped if the maximum score, 50, is the score that occurs most often (and is thus
the mode, as you will learn in the next chapter). Note that the size of the
group, 523, is irrelevant to the shape.

3. J-shaped, with the tail on the right.

4. Bimodal, but not necessarily symmetrical.

5. Normal, or at least unimodal and bell-shaped, very close to symmetrical.

6. Normal, or again at least unimodal and bell-shaped, very close to symmet-
rical.

7. Skewed, with the tail on the right. If the test is extremely difficult
for the sixth-graders, the shape might even be a backwards J.

8. Bimodal, but not necessarily symmetrical.

CHAPTER 5
Pages 43-44:

1. The mode.

2. The mean, because the sum of all values is unknown.

3. The mean, because a change in the value of any score will change the sum.

4. The mean.

5. The median.

6. The mean.

7. The mode.

8. The mode.

9. The mean.

10. The mean.

11. The mode.

12. The mode.

13. The mean.

14. The mean.

15. The median.

16. The mode.

17. The mean.

18. The mean.

On the pages now following is the table of answers for all symbolism drills
in this workbook.
Answers for Symbolism Drills 231

ANSWERS for SYMBOLISM DRILLS

     Symbol            Pronunciation                       Meaning

 1   n                 "little en"                         Number of scores in a sample

 2   N                 "big en"                            Number of scores in a population

 3   X                 "eks" or "big eks"                  A raw score, or the set of raw scores

 4   Σ                 "the sum of"                        Result of summing quantities of some kind

 5   X̄                 "eks bar"                           ΣX/n; the mean of a sample

 6   μ                 "mew"                               ΣX/N; the mean of a population

 7   Mdn               "median"                            C50 (may also be defined informally)

 8   Mo                "mode"                              Score or midpoint of interval with largest f

 9   x                 "little eks"                        X - μ or X - X̄; deviation score

10   Q                 "cue"                               (C75 - C25)/2; semiinterquartile range

11   σ²                "sigma squared"                     Σx²/N; variance of a population

12   σ                 "sigma"                             √(Σx²/N); standard deviation of a population

13   S²                "es squared"                        Σx²/n; variance of a sample

14   S                 "es"                                √(Σx²/n); standard deviation of a sample

15   z                 "zee"                               x/σ or x/S; z score.
                                                           In general, (value - mean)/(standard deviation)

16   r                 "ar"                                Pearson correlation coefficient for a sample

17   ρ                 "rho"                               Pearson correlation coefficient for a population

18   Y'                "wi prime"                          Predicted raw score on Y

19   z'_Y              "zee prime sub wi"                  Predicted z score on Y

20   s_YX              "es sub wi eks"                     Standard error of estimate of Y on X

21   μ_X̄               "mew sub eks bar"                   Mean of sampling distribution of means

22   σ_X̄               "sigma sub eks bar"                 Standard error of the mean; σ/√n

23   s                 "little es"                         Estimate of σ; √(Σx²/(n - 1))

24   s_X̄               "little es sub eks bar"             Estimate of σ_X̄; s/√n

25   H₀                "aitch null"                        Null hypothesis

26   H_A               "aitch sub ay"                      Alternative hypothesis

27   μ_hyp             "mew hype"                          Value of μ stated in null hypothesis

28   "z"               "zee quotes"                        Approximate z score with denominator estimated

29   z_crit            "zee crit"                          Critical value of z

30   α                 "alpha"                             Risk of Type I error; level of significance

31   β                 "bayta"                             Risk of Type II error

32   s²                "little es squared"                 Estimate of σ²; Σx²/(n - 1)

33   μ_true            "mew true"                          True value of μ

34   μ_X̄-Ȳ             "mew sub eks bar minus wi bar"      Mean of sampling distribution of differences
                                                           between means

35   (μ_X - μ_Y)_hyp   "mew sub eks minus mew sub wi,      Value of μ_X - μ_Y stated in null hypothesis
                        the quantity, hype"

36   σ_X̄-Ȳ             "sigma sub eks bar minus wi bar"    Standard error of the difference between
                                                           two means

37   s_X̄-Ȳ             "little es sub eks bar minus        Estimate of σ_X̄-Ȳ
                        wi bar"

38   D                 "dee"                               X - Y; difference score

39   C                 "see"                               Confidence coefficient

40   t                 "tee"                               Conventional name for "z"

41   df                "dee ef"                            Degrees of freedom

42   σ_r               "sigma sub ar"                      Standard error of r

43   z'                "zee prime"                         Fisher's transformation of r

44   σ_z'              "sigma sub zee prime"               Standard error of z'

45   σ_z'₁-z'₂         "sigma sub zee prime sub one        Standard error of the difference between two
                        minus zee prime sub two"           independent z''s

CHAPTER 6

Page 55:

ΣX = 48, n = 6, X̄ = 8.0. Σx = 0. Σx² = 96, S² = 16.0, S = 4.0.

Page 56, First Paragraph:

The standard deviations of the distributions tabled on pp. 53 and 55 work
out to be the same because the deviation scores for the distributions are the
same—which is to say that where the raw scores lie in relation to their mean
is the same for the two distributions.

Page 56, Top Table:

ΣX = 60, n = 6, X̄ = 10.0. Σx = 0. Σx² = 54, S² = 9.0, S = 3.0.

Page 56, Bottom Table:

ΣX = 50, n = 5, X̄ = 10.0. Σx = 0. Σx² = 20, S² = 4.0, S = 2.0.

Page 57:

ΣX = 48, n = 12, X̄ = 4.0. Σx = 0. Σx² = 48, S² = 4.0, S = 2.0.

Page 58, Middle Table:

ΣX² = 480.

Page 58, Bottom Table:

ΣX² = 654.

Page 59, Top Table:

ΣX² = 520.

Page 59, Middle Table:

ΣX² = 240.

CHAPTER 6, continued

Page 60:

1. The standard deviation.

2. The range.

3. The range.

4. The standard deviation.

5. The standard deviation.

6 . The standard deviation.

7. The semiinterquartile range.

8. The range.

Symbolism Drill:

See p. 231.

Page 62:

1. Skewed to the right.

2. Skewed to the right.

3. 3.2.

4. 2.0.

5. The very large scores pull the mean up. See the next-to-the-last paragraph
on p. 41 of this workbook.

6. For the same reason as in the first distribution.

7. (7.6 - 0)/2 = 3.8.

8. (4.2 - 0)/2 = 2.1.

9. Because the distributions are so highly skewed, S is misleadingly large.
See p. 91 of the text.

10. Since X̄ = ΣX/n, ΣX = nX̄. Here nX̄ = (176)(6.81) = 1198.56. The total must
have been a whole number and could have been either 1198 or 1199; both figures
round to 6.81 when divided by 176.

11. Again we must compute nX̄. Here the figures are (128)(2.97) = 380.16,
which rounds to 380.

12. All those genetic counselors in the U. S. who do diagnostic cytogenetics,


or, speaking more formally, their answers to each item on the questionnaire. (In
the more formal conception, there is one population for each item.)

13. The sample(s) included almost all elements of the population(s).



CHAPTER 7

Page 68:
Raw       Deviation   Squared Deviation             Deviation      Squared Deviation
Score     Score       Score                z Score  for z Score    for z Score

 13        +5          25                   +1.25    +1.25          1.5625
 13        +5          25                   +1.25    +1.25          1.5625
  9        +1           1                   +0.25    +0.25          0.0625
  6        -2           4                   -0.50    -0.50          0.2500
  4        -4          16                   -1.00    -1.00          1.0000
  3        -5          25                   -1.25    -1.25          1.5625

ΣX = 48    Σx = 0      Σx² = 96    Σz = 0.00    Σ(z - z̄) = 0.00    Σ(z - z̄)² = 6.0000

n = 6                  S² = Σx²/n           n = 6               S_z² = Σ(z - z̄)²/n
X̄ = ΣX/n                  = 96/6            z̄ = Σz/n                 = 6.0000/6
  = 48/6                  = 16.0              = 0.00/6               = 1.0000
  = 8.0                S = √16.0              = 0.00              S_z = √1.0000
                         = 4.0                                       = 1.0000
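A by-product worth checking: any set of raw scores, converted to z scores this way, ends up with mean 0 and standard deviation 1. The sketch below uses raw scores consistent with the sums in the table (assumed for illustration).

```python
import math

def z_scores(scores):
    """Convert raw scores to z scores: z = (X - mean) / S,
    with S the standard deviation sqrt(sum of x^2 / n)."""
    n = len(scores)
    mean = sum(scores) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in scores) / n)
    return [(x - mean) / s for x in scores]

z = z_scores([13, 13, 9, 6, 4, 3])    # raw scores consistent with the table's sums
print([round(v, 2) for v in z])       # [1.25, 1.25, 0.25, -0.5, -1.0, -1.25]
print(round(sum(z) / len(z), 10))     # mean of the z scores: 0.0
```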

Page 69:
For the answers to the symbolism drill, see p. 231

CHAPTER 8
Page 75

TABLE OF EQUIVALENT SCORES

z Score   Score where    Score where     Score where   Centile Rank if
          μ=100, σ=15    μ=500, σ=100    μ=50, σ=10    Shape is Normal

 +3          145            800             80              99.9
 +2          130            700             70              97.7
 +1.5        122.5          650             65              93.3
  0          100            500             50              50.0
 -1           85            400             40              15.9
 -2.5         62.5          250             25               0.6
 -3           55            200             20               0.1
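Every entry in the body of the table comes from the single relation score = mean + z × SD; a two-line sketch:

```python
def equivalent_score(z, mean, sd):
    """Convert a z score to the equivalent score on a scale
    with the given mean and standard deviation."""
    return mean + z * sd

print(equivalent_score(1.5, 100, 15))    # 122.5, as in the mean-100 column
print(equivalent_score(-2.5, 500, 100))  # 250.0, as in the mean-500 column
```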

Page 77:

For the answers to the symbolism drill, see p. 231

Pages 77-78:

1. 100 - 15 = 85. See p. 132 of the text for a beautiful illustration.


2. About 16%.

3. 70 - 100 = -30, which is 15 x -2. Therefore 70 is 2 standard deviations


below the mean.

4. About 2%.

5. 50 - 100 = -50, which is 15 x -3.33. Therefore 50 is 3.33 standard devi-
ations below the mean.

6. According to Table B of Appendix F, 0.05% of the cases lie beyond a z score
of 3.30 (which is the closest we can get to 3.33 in the table). Coming in toward
the mean to a z of -2.00 (corresponding to an IQ of 70), we find that 2.28% of
the cases lie beyond it. That leaves 2.28 - 0.05 = 2.23% of the cases in the
interval between z = -3.30 and z = -2.00. So about 2% of the population is
mildly retarded by Zigler's definition.

7. About 0.05% (actually somewhat less). This is about 1 person in 2000.

8. Five-two = 5 x 12 + 2 = 60 + 2 = 62 inches. For men, the first centile
point is 62.6 inches. Thus less than 1% of the men are below 62 inches in height.
9. Between 20 and 30%.

CHAPTER 9

Pages 83-84:

1. Positive and high, surely very close to perfect.

2. Positive and probably at least moderate. The older children will have
both longer noses and larger vocabularies. The examples in the first two ques-
tions here show that two variables can be correlated even though neither has
any influence on the other.

3. Almost certainly zero.

4. Positive and probably high. Note that instead of two scores for a single
subject, here we have two scores for a pair of subjects (a couple). The couple
is thus the equivalent of a single subject, in that it is the unit on which the

two variables are measured.

5. Still positive and high. The change in social custom would not influence
the relationship between the two variables; it would only raise the scores on
one variable (husband's age) relative to the scores on the other variable (wife's
age). A couple with a high score on husband's age would still tend to have a
high score on wife's age, and a couple with a low score on husband's age would
still tend to have a low score on wife's age.

6. The question is nonsensical, because there are no pairs of scores here.
There is no logical way to pair a baseball player's height with a football
player's height; the two teams probably have different numbers of players, for
one thing. So the concept of correlation simply does not apply to a situation
like this.

Life is full of questions like this one, by the way, questions to which
the proper answer is, "That's a stupid question." Stay alert for them.

Symbolism Drill:

See p. 231.

CHAPTER 10

Symbolism Drill:

See p. 231.

CHAPTER 11

Page 95:

             If r is positive   If r is zero   If r is negative

If X > X̄        Y' > Ȳ             Y' = Ȳ          Y' < Ȳ

If X = X̄        Y' = Ȳ             Y' = Ȳ          Y' = Ȳ

If X < X̄        Y' < Ȳ             Y' = Ȳ          Y' > Ȳ

CHAPTER 11, continued

Page 96:

For the answers to the symbolism drill, see p. 231.

CHAPTER 12

Pages 101-102:

For the answers to the symbolism drill, see p. 231.

CHAPTER 13

Page 108:

μ = ΣX/N = (1+2+3+4+5+6+7+8+9+0)/10 = 45/10 = 4.5.

3. Let's calculate Σx². As noted on p. 87 of the text, Σx² = ΣX² - (ΣX)²/N.
Here we have Σx² = 285 - 45²/10 = 285 - 2025/10 = 285 - 202.5 = 82.5. Then σ =
√(Σx²/N) = √(82.5/10) = √8.25 = 2.87.

Page 110:

3. The standard error of the mean, which is the standard deviation for the
real sampling distribution and not just your approximation to it, is 2.87/√2 =
2.87/1.41 = 2.03.

For n = 10, the standard error of the mean is 2.87/√10 = 2.87/3.16 = 0.91.
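Both standard errors can be recomputed from scratch, since the population here is just the digits 0 through 9:

```python
import math

digits = list(range(10))                  # the population: digits 0 through 9
n_pop = len(digits)
mu = sum(digits) / n_pop                  # 4.5
sigma = math.sqrt(sum((x - mu) ** 2 for x in digits) / n_pop)   # sqrt(8.25) = 2.87

for n in (2, 10):
    # standard error of the mean: sigma / sqrt(n)
    print(n, round(sigma / math.sqrt(n), 2))
```

The loop prints 2.03 for n = 2 and 0.91 for n = 10, matching the figures above.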
Pages 111-112:

For the answers to the symbolism drill, see p. 231.

CHAPTER 14
Page 117:

For the answers to the symbolism drill, see p. 231.

CHAPTER 15
Page 127:

For the answers to the symbolism drill, see pp. 231-232.

CHAPTER 16
Page 135:

For the answers to the symbolism drill, see pp. 231-232.

CHAPTER 17
Page 147:

ΣX = 30, X̄ = 5.0; ΣY = 30, Ȳ = 5.0; ΣD = 0, D̄ = 0.0.



CHAPTER 17, continued

Pages 148-149:

One way (out of several) to complete the proof is this:

μ_D = (X₁ + X₂ + X₃ + ... + X_N - Y₁ - Y₂ - Y₃ - ... - Y_N) / N

    = [(X₁ + X₂ + X₃ + ... + X_N) - (Y₁ + Y₂ + Y₃ + ... + Y_N)] / N

    = (ΣX - ΣY) / N

    = ΣX/N - ΣY/N

    = μ_X - μ_Y
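A quick numeric check of the identity μ_D = μ_X - μ_Y, with made-up paired scores:

```python
def mean(v):
    return sum(v) / len(v)

x = [7, 4, 9, 6, 4]                # hypothetical paired scores
y = [5, 2, 8, 1, 4]
d = [a - b for a, b in zip(x, y)]  # difference scores D = X - Y

print(mean(d))             # 2.0
print(mean(x) - mean(y))   # also 2.0: the mean of the differences equals
                           # the difference of the means
```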

Pages 150-151:

For the answers to the symbolism drill, see pp. 231-232.

Pages 152-153:

1. Calling the two samples X and Y, you should compute X̄, Ȳ, S_X, S_Y, and
X̄ - Ȳ. The latter should be compared to the average of S_X and S_Y, as Section
6.14 of the text, starting on p. 95, tells you, so you can get some idea of how
large the difference between the sample means is.

2. You should test a hypothesis about the difference between two population
means. The sample means are independent. The null should state that the differ-
ence between the two population means of interest is zero, and the alternative
should say that it is not zero, which is the two-tailed case. You will have
to choose an α level, estimate σ_X̄-Ȳ, and calculate a "z".

3. You should proceed as in the first study (see Question 1), and in addition
you should calculate the correlation coefficient for the two sets of scores.

4. This is a case of dependent means, so you have your choice of the pro-
cedures described in Sections 17.10 and 17.11 of the text. The null hypothesis
should again declare no difference between the two population means, and the
alternative should again be two-tailed. You would have to calculate s_X̄-Ȳ or s_D̄
and a "z" again.

5. In the second study, each score for Variety A is paired with a score for
Variety B, but the two tables of data do not indicate the pairings, making it
impossible to compute s_X̄-Ȳ or the difference scores and s_D̄.

CHAPTERS 18 - 21
Pages 159-160, 166-167, 173-175, and 184-185:

For the answers to the symbolism drills, see pp. 231-233.

CHAPTER 22
Pages 204-206:

1. treatment

3. samples

4. populations

5. Grand

6. Within

7. Sum of squares

8. Within-groups sum of squares

9. Within-groups degrees of freedom

10. Among-groups estimate of a2

11. Among-groups sum of squares

12. Among-groups degrees of freedom

13. Total sum of squares

14. Total degrees of freedom

15.

16. column

17. row

18. Mean of the sample in the ith column

19. Mean of the sample in the ith row

20. Mean of the population in the ith column


21. Mean of the population in the ith row

22. Within-cells estimate of a2


23. Column estimate of a2
24. Row estimate of a2
25. Interaction estimate of a2
26. Sum of squares for columns

CHAPTER 22, continued

27. Sum of squares for rows

28. Sum of squares for interaction

29. Sum of squares within cells

30. Total sum of squares

31. Degrees of freedom for columns

32. Degrees of freedom for rows

33. Degrees of freedom for interaction

34. Degrees of freedom within cells

35. Total degrees of freedom

36. s_C²/s_WC² or s_R²/s_WC² or s_RxC²/s_WC²

37. Sample

38. Population

39. standard error of a comparison

40. sw2 or swc2

41. F

Page 206:

1. A one-way analysis of variance of the kind described in Chapter 22.

2. No, because the samples are dependent. There is a kind of one-way analysis
of variance suitable for the data in such a case, but like the t-test for depen¬
dent means, it requires knowledge of how the scores in one sample line up with
the scores in the other sample or samples. This was the information that the
businessman failed to record.

CHAPTER 23

Page 218:

1. Among the subjects tested at 50 db, 80% helped the stranger. Among those
tested at 87 db, though, only 15% helped. [80% = 16/(16+4), and 15% = 3/(3+17).]

2. Two variables.

3. (a) In the population sampled, there is no association between noise level
and helping vs. just walking by. (b) In the population sampled, there is such
an association. (c) The proportion of subjects helping in the population that
could be tested at 50 db equals the proportion helping in the population that
could be tested at 87 db. (d) Those two proportions are not equal.

4. χ² = 14.44 with Yates' correction. df = 1, and the critical value of χ²
at α = .05 is 3.84. Ergo the null hypothesis can be rejected and the alternative
accepted.
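The value in Answer 4 can be verified step by step: expected frequencies come from the row and column totals, and each cell contributes (|O - E| - 0.5)²/E under Yates' correction.

```python
def yates_chi_square(table):
    """Chi-square for a 2 x 2 table with Yates' correction:
    sum over cells of (|O - E| - 0.5)^2 / E, where E is the
    expected frequency (row total x column total / n)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (abs(observed - expected) - 0.5) ** 2 / expected
    return chi2

helping = [[16, 3],   # helped:      50 db, 87 db
           [4, 17]]   # didn't help: 50 db, 87 db
print(round(yates_chi_square(helping), 2))   # 14.44, matching Answer 4
```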
245

HOMEWORK

On the following pages is homework, one double-sided page for each chapter
but the first. The answers to the homework problems appear only in the instruc-
tor's manual for the text.

Most of the problems in the homework are modeled after ones appearing in
the text or in this workbook, to encourage you to do those in the text and the
workbook for practice.

The space for your name is at the bottom of the second side of the homework
pages, you will note, and you should write your name there upside down. The
person who checks your work is thus unlikely to know who you are until she or
he has finished the checking, and there will then be no question of bias in the
checking.
Homework for Chapter 2 247

See the comments on p. 245 of the workbook before beginning.

Suppose you're studying the effects of violent television programs on first-


grade boys. You run an experiment with two conditions. In your experimental
condition, 20 first-grade boys view some typical Saturday-morning fare with
plenty of violence, and in your control condition, another 20 first-grade boys
view something equally exciting but free of violence, like a series of races.
Each subject watches one program or the other, individually, and then goes out
to a playground. You determine the number of aggressive acts and the number of
altruistic acts each child commits over the first 30 minutes outside.

Say whether each of the following items is a population, a sample, an element,

a parameter , or a statistic m

1. The 20 boys in the experimental condition.    Pop Samp El Par Stat

2. The 20 altruism scores in the control condition.    Pop Samp El Par Stat

3. The number of aggression scores in the experimental condition, which is 20.    Pop Samp El Par Stat

4. ... group described in Question 1.    Pop Samp El Par Stat

5. The 17th child you tested in the control condition.    Pop Samp El Par Stat

6. The altruism scores you would have obtained had you tested all possible first-grade boys in the control condition.    Pop Samp El Par Stat

7. The number of aggressive acts committed by the child described in Question 5.    Pop Samp El Par Stat

8. The average altruism score for the 20 children in the control condition.    Pop Samp El Par Stat

9. The 20 aggression scores in the experimental condition.    Pop Samp El Par Stat

10. The average of the scores described in Question 6.    Pop Samp El Par Stat

Now say whether each of the following is a constant or a variable in your study, and if a variable, whether it is discrete or continuous.

11. ...    Cons Dis Var Cont Var

12. ...    Cons Dis Var Cont Var

13. Inclination to be altruistic.    Cons Dis Var Cont Var

14. ...    Cons Dis Var Cont Var

15. ...    Cons Dis Var Cont Var

16. ...    Cons Dis Var Cont Var


248 Homework for Chapter 2

In a naturalistic study of aggression among children, you simply observe


children interacting on a playground. Your subjects include boys and girls
from several grades in several schools. For each of the measurement procedures
described below, say what level of measurement you're working at.

17. To specify a child's sex, you write "0" for a boy and "1" for a girl.    Nominal Ordinal Interval Ratio

18. To specify a child's sex, you write "1" for a boy and "2" for a girl.    Nominal Ordinal Interval Ratio

19. To specify a child's sex, you write "M" for a boy and "F" for a girl.*    Nominal Ordinal Interval Ratio

20. To measure a child's physical maturity (in an admittedly crude way), you call the tallest child "1," the second tallest "2," the third tallest "3," and so on.    Nominal Ordinal Interval Ratio

21. To measure a child's sociability in the setting under observation, you determine the total time the child spends interacting with other children.    Nominal Ordinal Interval Ratio

22. To measure a child's inclination to commit aggression at this time in


this place, you again count the number of aggressive acts the child performs
over a 30-minute interval. What level of measurement obtains here? The answer
may or may not be one of the four levels discussed in the text and the workbook.
Justify your answer.

*Did you hear about the girl who received a report card with "F" written in after the word "sex"? "'F' in sex!" she cried. "I didn't even know I was taking it."
Think about the level of measurement involved in grading on the scale A-B-C-D-F.

Name ______________________
Homework for Chapter 3 249

Write the exact limits for the scores listed below.

Lower Limit    Upper Limit

_______    _______    1. The weight of a 97-pound weakling measured to the nearest pound.

_______    _______    2. The weight of a 44-kilogram weakling measured to the nearest kilogram.

_______    _______    3. The weight of a 100-pound weakling measured to the nearest 10 pounds.

_______    _______    4. The weight of a 45-kilogram weakling measured to the nearest 5 kilograms.

_______    _______    5. A time of 9.9 seconds for a hundred-yard dash timed to the nearest tenth of a second.

_______    _______    6. A distance of 100 yards for a 9.9-second dash measured (very crudely) to the nearest 100 yards.

The question logically next is the one on the back of this page, but it
wouldn't fit on this side. You may wish to do it now.

In the table of selected centile points from the distribution of heights for

women on p. 27 of the workbook:

_ 7. What percentage of the women were shorter than five feet even?

_ 8. What percentage were between five feet even and five-three?

_ 9. The middle 40% of the distribution lies between what two values?

_ 10. What percentage of the women were over 65 inches in height?

_ 11. C80 = ?

_ 12. How short can you be and still be taller than half the women in
this sample?

_ 13. Is the answer to Question 7 a centile point or a centile rank?

_ 14. Is the answer to Question 8 a centile point or a centile rank?

_ 15. Is the answer to Question 9 a centile point or a centile rank?

__ 16. Is the answer to Question 10 a centile point or a centile rank?

17. Is the answer to Question 11 a centile point or a centile rank?

In the table that answers Question 13 on p. 502 of your text:

18. What is the centile rank of a score of 79.5?

19. What is the 82nd centile point?

_ 20. What is C6?

21. The topmost 8% of the scores lie above what value?


250 Homework for Chapter 3

In a recent semester, 40 students who had enrolled in a statistics course


took the algebra test in Appendix A of your text. The test was scored as the
number of items answered correctly, and with 50 items the maximum score was
thus 50. The following jumble of scores resulted. Bring some order to this
chaos by grouping the data into class intervals with a width of 3. The topmost
interval should be 48-50. Cast the grouped data into the table below, giving
the proportions to 3 decimal places and the percentages to 1. You may or may
not need all the lines in the table.

45 44 47 29 28 37 41 34 34 50 47 34 36 17 43 42
40 23 36 28 22 28 33 25 38 49 21 43 25 44 15 37
35 18 32 41 32 42 41 35
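Though this workbook predates personal computing, a reader today can check the hand tally with a few lines of Python. The sketch below assumes only the interval scheme specified above (width 3, lowest interval 15-17, topmost interval 48-50):

```python
from collections import Counter

scores = [45, 44, 47, 29, 28, 37, 41, 34, 34, 50, 47, 34, 36, 17, 43, 42,
          40, 23, 36, 28, 22, 28, 33, 25, 38, 49, 21, 43, 25, 44, 15, 37,
          35, 18, 32, 41, 32, 42, 41, 35]

# Intervals of width 3 with 48-50 on top: 15-17, 18-20, ..., 48-50.
tally = Counter((score - 15) // 3 for score in scores)
n = len(scores)

for k in sorted(tally, reverse=True):          # topmost interval first
    low, high = 15 + 3 * k, 17 + 3 * k
    f = tally[k]
    print(f"{low}-{high}: f = {f:2d}, proportion = {f/n:.3f}, percent = {100*f/n:.1f}")
```

Running it reproduces the grouped frequency distribution the table asks for, with proportions to 3 decimal places and percentages to 1.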

Those are real data, so if you took the algebra test, you can meaningfully compare your score with them. You might want to compute your centile rank in the distribution. Also, if you won't be getting this page back before you have to do the homework for the next chapter, you should make a copy of the table.

In the table that answers Question 13 on p. 502 of the text, what are the following centile points and centile ranks? Use the procedures of Sections 3.10 and 3.11 in your computations.

_ The 16th centile point.    _ The 90th centile point.

_ The centile rank for 64.5.    _ The centile rank for 68.0.

Name ______________________
Homework for Chapter 4 251

Here are those 40 scores on the algebra test again. On this side of the page, make a frequency polygon showing the distribution grouped into class intervals 3 units wide with the topmost interval 48-50. (This will be the frequency polygon corresponding to the grouped frequency distribution that you made in the homework for Chapter 3.) Your vertical axis should show raw frequencies.

Be neat, and plan ahead so your graph is as large as possible.

45 44 47 29 28 37 41 34 34 50 47 34 36 17 43 42
40 23 36 28 22 28 33 25 38 49 21 43 25 44 15 37
35 18 32 41 32 42 41 35
252 Homework for Chapter 4

Now make a cumulative percentage-frequency curve for the data on the other side of this page, again grouping the scores into class intervals 3 units wide with 48-50 on top. Your vertical axis should show cumulative percentage frequencies.

Again, be neat, and make your graph as large as possible. Turn the page sideways if you like.

Name ______________________
Homework for Chapter 5 253

Here are the scores on the algebra test again.

45 44 47 29 28 37 41 34 34 50 47 34 36 17 43 42
40 23 36 28 22 28 33 25 38 49 21 43 25 44 15 37
35 18 32 41 32 42 41 35

First, leave the data ungrouped. In the space to the left below, arrange the
scores in order of magnitude, as in Table 3.2 of the text, and show your work in

answering the following questions.


1. Identify the mode of these scores.

2. Find the median, defining it informally.

3. Calculate the mean to one decimal place.

Now suppose the test had been scored as the number of items not answered correctly. The student with the score of 45 would then have earned a 5, for example. This change in scoring is equivalent to subtracting 50 from each score and calling the resulting negative numbers positive.

4. What would the modal score be for this other way of scoring the test?

5. What would the median (defined


informally) be?

6. How did you figure out Question 5?

7. What would the mean be? 8. How did you figure out Question 7?
254 Homework for Chapter 5

Now let the data (as originally collected) be grouped into class intervals
of width 3 with 48 - 50 on top. (This is the way you've been grouping the data
in previous homework.) In answering the questions below, don't bother to show
the grouped frequency distribution again, but do show your computations.

__ 9. What is the mode of the grouped data?

10. What is the median (defined formally as C50)?

11. What is the mean?

In the 1970 census, American women over age 45, who had presumably completed any childbearing they were going to do, reported the number of children they had borne. Some said none (about 6% had never married and about 10% of those who did marry had remained childless), some said one, some said two, and so on. The mean over all the women in this population was about 2.6 (W. Petersen, Population, 3rd ed., New York: Macmillan, 1975, p. 533).

What would the mean have been if the following events had happened?

- 12. Each woman had one additional child. (Those who had borne none
in reality would hypothetically have had one.)

_ 13. Each woman had two times as many children as she actually did. (Those who had really borne none would hypothetically have had 2 × 0 = 0 still.)

- 14. Each woman had first two times as many children as she actually
did, and then one additional one.

- 15. Each woman had first one child more than she actually did, and
then enough more to double the resulting number.
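The rules behind Questions 12-15 (adding a constant to every score shifts the mean by that constant; multiplying every score by a constant multiplies the mean by it) can be verified on any small set of numbers. The counts below are hypothetical, not the census data:

```python
def mean(xs):
    return sum(xs) / len(xs)

children = [0, 1, 2, 3, 4]       # hypothetical counts, not the census figures
m = mean(children)

assert mean([x + 1 for x in children]) == m + 1              # Question 12's rule
assert mean([2 * x for x in children]) == 2 * m              # Question 13's rule
assert mean([2 * x + 1 for x in children]) == 2 * m + 1      # Question 14's rule
assert mean([2 * (x + 1) for x in children]) == 2 * (m + 1)  # Question 15's rule
```

Once the rules check out on toy data, you can apply them to the reported mean of 2.6 without touching the individual scores.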

Name ______________________
Homework for Chapter 6 255

Once again, the scores on the algebra test.

45 44 47 29 28 37 41 34 34 50 47 34 36 17 43 42
40 23 36 28 22 28 33 25 38 49 21 43 25 44 15 37
35 18 32 41 32 42 41 35

Leave the data ungrouped. In the space to the left below, show your work in answering the following questions.

2. What is ZX2? (If you're using a


calculator, there is no need to
show your work for this item or
the next one.)

3. What is Zx?

4. What is (Zx)2?
5. What is Zx2?

6. What is S2?

7. What is S?

Now suppose, as you did in the homework for the last chapter, that the test had been scored as the number of items not answered correctly, a change equivalent to subtracting 50 from each score and calling the resulting negative numbers positive.

Yes No 8. Would ΣX² change?

Yes No 9. Would ΣX change?

Yes No 10. Would (ΣX)² change?

Yes No 11. Would n change?

Yes No 12. Would Σx² change?

Yes No 13. Would S² change?

Yes No 14. Would S change?

15. Explain your answers to Questions 12, 13, and 14.


256 Homework for Chapter 6

Let the data now be grouped in the familiar way, into class intervals of width 3 with 48-50 on top. Show your computations for the questions below, but don't bother to copy in the grouped frequency distribution.

_ 16. What is Σx²?

_ 17. What is S²?

_ 18. What is S?

In the fall of 1977, the Educational Testing Service reported that 54,903 people had taken the Graduate Record Exam, and that their scores on those items measuring verbal aptitude had a mean of 503 and a standard deviation of 126. What would the new mean and the new standard deviation be if the following silly operations were performed on each of those 54,903 scores?

New Mean    New Standard Deviation

19. 10 points are added to each score.

20. Each score is multiplied by 2.

21. 10 points are added to each score, and the resulting


value is then multiplied by 2.

22. Each score is multiplied by 2, and 10 points are then


added to the result.

23. Each score is divided by 3.

24. 50 points are subtracted from each score.

25. Each score is divided by 3, and 50 points are then subtracted from the result.

26. 50 points are subtracted from each score, and the resulting value is then divided by 3.
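The effect of these operations on both the mean and the standard deviation can be confirmed on a small hypothetical score set (the rules, not these particular numbers, are the point):

```python
import math
import statistics

# Hypothetical scores, not the 54,903 GRE scores.
scores = [480, 500, 520, 560]
mu = statistics.mean(scores)
sigma = statistics.pstdev(scores)       # population SD, like the text's S

# Question 21's operation: add 10 to each score, then multiply by 2.
transformed = [2 * (x + 10) for x in scores]

# Adding a constant shifts the mean but leaves the SD alone;
# multiplying by a constant multiplies both.
assert statistics.mean(transformed) == 2 * (mu + 10)
assert math.isclose(statistics.pstdev(transformed), 2 * sigma)
```

The same two rules, applied in the right order, answer all of Questions 19-26 for the reported mean of 503 and standard deviation of 126.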

Name ______________________
Homework for Chapter 7 257

These exercises will probably strike you as repetitious and tedious. Do them carefully, though, because the principles you'll be learning and practicing are essential for the understanding of important and interesting matters coming up in this course.

Answer these questions to 4 decimal places (e.g., .1234). If a collection


of scores is normally distributed, what proportion fall...

1. above z — +0.50?

2. above z = +1.50?

3. below z = -2.50?

4. below z = -3.50?

5. above z = -0.50?

6. above z = -1.50?

7. below z = +2.50?

8. below z = +3.50?

9. between z = +0.60 and z = +1.20?

10. between z = -1.80 and z = -2.40?

11. between z = -1.80 and z = +0.60?

12. between z = -2.40 and z = +1.20?

13. outside the limits z = +0.40 and z = +0.80?

14. outside the limits z = -1.20 and z = -1.60?

15. outside the limits z = -0.40 and z = +0.40?

16. outside the limits z = -0.80 and z = +0.80?

IQ scores for the general public on the Wechsler Adult Intelligence Scale (the WAIS) are normally distributed with a mean of 100 and a standard deviation of 15. Answer the following questions to 2 decimal places (e.g., 12.34). What percentage of the general (adult) public has a Wechsler IQ...

17. below 70?

18. below 85?

19. above 115?

20. above 145?

21. below 130?

22. above 85?

23. between 85 and 115?

24. between 70 and 130?

25. between 55 and 145?
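Where the workbook sends you to the normal-curve table in the text, a modern reader can check answers of this kind with Python's statistics.NormalDist. The example below uses IQ limits the questions above do not ask about, so your answers stay your own:

```python
from statistics import NormalDist

wais = NormalDist(mu=100, sigma=15)

# Proportion of the adult public with a Wechsler IQ between 90 and 110
# (limits not among the questions above):
p_90_to_110 = wais.cdf(110) - wais.cdf(90)

# Expressed as a percentage to 2 decimal places, as the questions request:
print(round(100 * p_90_to_110, 2))
```

The cdf method returns the proportion of the distribution lying below a given score, so "above" questions use 1 - cdf and "between" questions use a difference of two cdf values.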


258 Homework for Chapter 7

Answer these questions to 2 decimal places (e.g., 1.23). If a collection


of scores is normally distributed, what z score...

26. divides the upper 10% of the scores from the remainder?

27. divides the upper 20% of the scores from the remainder?

28. divides the lower 25% of the scores from the remainder?

29. divides the lower 40% of the scores from the remainder?

30. divides the upper 60% of the scores from the remainder?

31. divides the upper 70% of the scores from the remainder?

32. divides the lower 80% of the scores from the remainder?

33. divides the lower 90% of the scores from the remainder?

Answer these questions to 2 decimal places again. If a collection of scores


is normally distributed, what z score limits identify...

___________ 34. the central 90% of the scores?

_ 35. the central 80% of the scores?

__ 36. the outermost 70% of the scores?

37. the outermost 60% of the scores?

Answer these questions to the nearest whole number (e.g., 123). If a distri¬
bution of scores on a standardized aptitude test is normal in shape with a mean
of 500 and a standard deviation of 100, what is the raw score (not the z score)...

_ 38. below which 50% of the scores fall?

__ 39. below which 75% of the scores fall?

40. above which 85% of the scores fall?

41. above which 95% of the scores fall?

Again answer to the nearest whole number. In the distribution described just
above, what are the raw scores (not the z scores)...

42. that enclose the central 50% of the scores?
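Questions 26-42 run the table in the other direction, from a proportion back to a z score or raw score; NormalDist's inv_cdf does the same. Again the example values are ones the questions above do not use:

```python
from statistics import NormalDist

test_dist = NormalDist(mu=500, sigma=100)

# The raw score below which 55% of the scores fall (the 55th centile point,
# a value not asked for above):
c55 = test_dist.inv_cdf(0.55)

# The z-score limits enclosing the central 95% of any normal distribution:
lo, hi = NormalDist().inv_cdf(0.025), NormalDist().inv_cdf(0.975)

print(round(c55), round(lo, 2), round(hi, 2))
```

For a "central P%" question, put half of the remaining (100 - P)% in each tail before looking up the limits, as the code does with 0.025 and 0.975.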

Name ______________________
Homework for Chapter 8 259

Fill in the missing values in the table below, noting that the four scores
on a given line would be truly equivalent only if the distributions from which
they came had similar shapes. This exercise is modeled after the one in the
workbook on p. 75.

Where the answer is not a whole number, give it to one decimal place (e.g.,
123.4)—except that you should give the centile ranks in the right-hand column
to two decimal places (e.g., 12.34).

TABLE of EQUIVALENT SCORES

Score where    Score where    Score where    Score where    Centile Rank if
μ=100, σ=15    μ=100, σ=10    μ=500, σ=100   μ=80, σ=20     Shape is Normal

850

125

100

54.50

50.00

94

84

300

10
260 Homework for Chapter 8

In the fall of 1977, a college senior took the Graduate Record Examination
and received the following information from the Educational Testing Service,
which constructs and scores this instrument:
Quantitative Verbal Analytic
Aptitude Aptitude Aptitude
Student's own score: 440 560 585
Mean score for all who took the test: 525 503 513
Standard deviation for all who took the test: 133 126 129
Her centile rank in this group: 25 66 63

_ 1. What is her z score (to 2 decimal places) on the quantitative part?

_ 2. If the quantitative-aptitude scores of all who took the test (N = 54,903) had been normally distributed, what would her centile rank have been (to the nearest whole number)?

3. What is her z score (to 2 decimal places) on the verbal part?

4. If the verbal-aptitude scores of all who took the test had been
normally distributed, what would her centile rank have been (to the
nearest whole number)?

5. What is her z score (to 2 decimal places) on the analytic part?

6. If the analytic-aptitude scores of all who took the test had been
normally distributed, what would her centile rank have been (to the
nearest whole number)?
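The two-step conversion these questions call for (score to z, then z to a centile rank under normality) can be sketched in a few lines. The figures below are hypothetical, not the student's:

```python
from statistics import NormalDist

def z_score(score, mean, sd):
    return (score - mean) / sd

# Hypothetical figures: a score of 600 on a part with mean 500 and SD 100.
z = z_score(600, 500, 100)

# Under normality, the centile rank is the cumulative proportion of the
# normal curve below that z, expressed as a percentage:
rank_if_normal = 100 * NormalDist().cdf(z)
print(round(z, 2), round(rank_if_normal))
```

Substituting the student's score, the group mean, and the group standard deviation for each subtest gives the answers to Questions 1-6.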

Now for each subtest compare the student's actual centile rank with the one she would have earned in a normal distribution. The comparison doesn't provide conclusive evidence, but it does permit an informed guess about the actual shape of the distribution of scores. If the discrepancy between her actual centile rank and the one she would have earned in a normal distribution is small, we have little evidence against the most plausible hypothesis, which is that the true shape is normal. If the discrepancy is large, we do have some good evidence against the hypothesis of normality, and we can tell whether the shape is skewed left or skewed right. So for each subtest, indicate your conclusion about the shape of the distribution of scores. If you infer a skew, spell out your reasoning about the direction of the skew.

7. Quantitative aptitude:

8. Verbal aptitude:

9. Analytic aptitude:

Name ______________________
Homework for Chapter 9 261

In the fall of 1977, Ramapo College offered a statistics course (taught by someone other than the author of your workbook and using a text other than yours) in which the students were tested with a total of 500 multiple-choice items over the semester. Eleven students completed all the work for the course, and here for ten of them are their scores on the first examination, expressed as a percentage correct out of the 39 items on that exam, along with their percentage correct out of the semester's total of 500 items. The student at the median on the first test was dropped from the table to reduce the n to 10 and thus simplify the calculations you will be asked to do.

Student    % Correct on Exam 1    % Correct over Semester

A          38                     75
B          54                     65
C          62                     94
D          67                     81
E          67                     84
F          72                     93
G          77                     90
H          77                     93
I          82                     90
J          85                     95

How closely is performance on the first test related to performance over the
entire semester? To begin to answer this question, make a scatter plot of the
data in the space below. Do it neatly and as large as possible.
262 Homework for Chapter 9

To provide a more precise answer to the question about the relationship between performance on the first exam and performance over the entire semester, compute the Pearsonian correlation coefficient for the data on the other side. Use the raw-score method illustrated on p. 150 of the text, and find the means and standard deviations of the variables while you're at it.

Don't bother to list the individual values of X² and Y², but do show your other work. Give all values that are not whole numbers to three decimal places (as 1.234, e.g.). This is more than the number that would usually be reported for a mean or a standard deviation or a correlation coefficient, but you'll need the extra accuracy for future work with these data.

ΣX = _____    ΣY = _____

ΣX² = _____    ΣY² = _____

Σx² = _____    Σy² = _____

X̄ = _____    Ȳ = _____

Sx = _____    Sy = _____

ΣXY = _____

Σxy = _____

r = _____

Now that you've found the means, go back to your scatter plot and add lines that show the locations of the means, as in the figure on p. 148 of your text.

_____ Which students' data points lie in the first quadrant? List the letters that identify the students.

_____ Which students' data points lie in the second quadrant?

_____ Which students' data points lie in the third quadrant?

_____ Which lie in the fourth quadrant?
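If you have access to a computer, the raw-score method can be checked with a short program. The function below is a sketch of the formula from p. 150; it is applied here to hypothetical check data with a known answer rather than to the homework scores, so those answers remain yours to compute:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Raw-score formula: r = Σxy / sqrt(Σx² · Σy²), with Σxy = ΣXY - ΣXΣY/n, etc."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    sxx = sum(x * x for x in xs) - sum_x ** 2 / n
    syy = sum(y * y for y in ys) - sum_y ** 2 / n
    return sxy / sqrt(sxx * syy)

# Check data: a perfect positive linear relation should give r = 1,
# and reversing one variable should give r = -1.
xs = [1, 2, 3, 4, 5]
assert abs(pearson_r(xs, [2 * x + 3 for x in xs]) - 1.0) < 1e-9
assert abs(pearson_r(xs, [-x for x in xs]) + 1.0) < 1e-9
```

Once the function passes those checks, feeding it the ten pairs from the table will verify your hand computation.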

Name ______________________
Homework for Chapter 10 263

1. In doing the homework for Chapter 9, you computed a correlation coefficient to describe the relationship between a student's performance on the first exam in a certain statistics course and the student's performance over the entire semester. Is it sensible to describe this number as the correlation coefficient for initial performance and total performance in a statistics course? Why or why not?

2. Suppose you classify all the registered voters in the U. S. by their age.
You group all the 18-year-olds together, all the 19-year-olds together, and so
on. Would the Pearsonian correlation coefficient do a good job of describing
the relationship between a group's age and the proportion of the people in that
group who actually voted in a given election? Why or why not?

The following data (taken from an almanac) indicate the sort of numbers you'd
be working with. These are estimates of the national turn-out for the 1972
presidential election, which was the first such election in which citizens under
21 could vote.

Age Bracket    % of Those Registered Who Voted

18 - 20        48.3
21 - 24        50.7
25 - 29        57.8
30 - 34        61.9
35 - 44        66.3
45 - 54        70.9
55 - 64        70.7
65 - 74        68.1
75 & +         55.6
264 Homework for Chapter 10

3. Thanks to your competence at statistics, you've been hired by a company that constructs and sells tests to public-school systems. Your job is to develop a test to compete with a widely used instrument for measuring grade-school children's reading ability. You work up some items that seem promising, put them together into a test that has the advantage of taking less class time to administer, and then check to see how closely scores on your test correlate with scores on the competing test. The results are disappointing. In a sample of first-graders, the Pearsonian correlation coefficient is only .32, and in a sample of sixth-graders the coefficient is only .39. You'd like to be able to say in advertising the test that it yields scores closely correlated with the instrument now in common use. How can you squeeze a higher correlation coefficient out of your data? There's a way to do it without changing any of the students' scores. (This tactic would still be unethical, though.)

4. Have you ever wondered just how the amount of studying a person does on a given subject relates to the person's mastery of that subject? Suppose you questioned a variety of your classmates, asking each: a) How much time did you spend in studying for whatever objective examination you took most recently, and b) what percentage of the items on the exam did you get correct? Imagine that the Pearsonian correlation coefficient for these two variables turns out to be negative in your sample. Say there are 15 people in the sample, and the value of r is -.32. Would you be tempted to reduce your studying time in the expectation that your grades would increase? Name at least two reasons why your finding (r = -.32) provides only very weak evidence that more studying time causes exam performance to deteriorate.

Name ______________________
Homework for Chapter 11 265

Here again are the data from the homework for Chapter 9.

Student    % Correct on Exam 1    % Correct over Semester

A          38                     75
B          54                     65
C          62                     94
D          67                     81
E          67                     84
F          72                     93
G          77                     90
H          77                     93
I          82                     90
J          85                     95

Use the data to determine the regression equation for predicting percentage correct over the entire semester from percentage correct on the first exam. You'll need the means and standard deviations of the two variables, which you found in the homework for Chapter 9. Show your work.

Y' = ______________________

Copy this equation for use in the next chapter's homework.


Now make another scatter plot of the data, as you did for Chapter 9, but this
time add the regression line to it.
266 Homework for Chapter 11

The student who was at the median on the first exam (and who was omitted from the table) earned a score of 69% correct on that exam. Use your regression equation to predict this person's score over the entire semester (to 2 decimal places). The actual figure was 86% correct. Show your work below.

Now use the regression equation to predict performance over the entire semester for the 10 students who contributed data to the table. Fill in the table below, which parallels the one on p. 186 of the text. Give the values of Y' to 2 decimal places. Remember that Σ(Y - Y') should be 0, but it may be a little off because of rounding error.

Student    X     Y     Y'    (Y - Y')    (Y - Y')²

A          38    75
B          54    65
C          62    94
D          67    81
E          67    84
F          72    93
G          77    90
H          77    93
I          82    90
J          85    95

Compute Syx to 2 decimal places, doing it directly as √(Σ(Y - Y')²/n). Show your work:

Now compute Syx to 2 decimal places from the formula on p. 187, again showing your work. Did you get the same value?
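A computer check of the whole procedure (fit the line, predict, then take the standard error of estimate directly from the residuals) can be sketched as below. The data here are hypothetical points lying exactly on a line, chosen so the expected results are known, rather than the homework scores:

```python
from math import sqrt

def regression(xs, ys):
    """Least-squares line Y' = a + bX, via b = Σxy/Σx²."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def s_yx(xs, ys):
    """Standard error of estimate, computed directly as sqrt(Σ(Y - Y')²/n)."""
    a, b = regression(xs, ys)
    n = len(xs)
    return sqrt(sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / n)

# Hypothetical check data: points exactly on Y = 1 + 2X, so Syx should be 0.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = regression(xs, ys)
```

Run on the homework data, regression and s_yx will reproduce the equation and the Syx value you computed by hand, up to rounding.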

Name ______________________
Homework for Chapter 12 267

1. Look back at the regression equation for the data presented in the homework for Chapters 9 and 11. What is the regression coefficient for those data?

2. Interpret the regression coefficient in the manner described on the


bottom of p. 202 in the text. Remember that you are talking about percentage
points, because the scores indicate percentage correct on examinations.

3. Compute k to two decimal places for the data on initial performance and total performance in that statistics course.

4. Does k indicate a strong relationship between those two variables or a weak one? Interpret k in the manner described on p. 208, by specifying the reduction in the errors of prediction relative to the case in which the correlation is zero.

5. Compute the coefficient of determination to two decimal places


for the data on initial performance and total performance in the statistics

course.

6. Interpret the coefficient of determination in the manner described at the bottom of p. 210.
268 Homework for Chapter 12

7. Recall the test of reading ability you were (hypothetically) constructing for Question 3 of the homework for Chapter 10. The test yielded scores that correlated only weakly with another test presumably measuring the same thing when you looked at a sample of first-graders and again when you looked at a sample of sixth-graders. Suppose you now collect data on a good number of children in each grade from the first through the sixth. Do you expect the correlation for the entire group to be larger than the values for just the first-graders or just the sixth-graders, or do you expect the correlation for the entire group to be smaller? Justify your answer, and note that it again has some bearing on the ethics of evaluating and advertising tests.

8. That company you're working for now develops a set of materials for teaching reading. (There's a huge market for this kind of thing.) To persuade potential customers that the materials work, it is necessary to try them out, collecting data before and after students use the materials. Now the company has to decide what kind of sample to study: children who are already well above average for their age in reading ability, children who are average or close to it for their age, or children who are well below average for their age. There is an unethical choice you could make here that would virtually guarantee that the mean reading-ability score of the children in the sample would increase from before the use of your company's materials to afterwards, even if the materials were ineffective. Which choice is this, and why will this sample's mean almost certainly rise from the pretest to the posttest no matter how poor the instructional materials are?

Name ______________________
Homework for Chapter 13 269

Suppose you're conducting research on something like errors in social


judgment, something that might be influenced by your subjects' intelligence.
You're accordingly worried about getting a sample of subjects who, as a group,
are unusually bright or unusually dull. You know that IQ scores for the general
adult public on the Wechsler Adult Intelligence Scale (the WAIS) are normally
distributed with a mean of 100 and a standard deviation of 15. If your subjects
will be a truly random sample of this population, you can correctly predict the
likelihood of getting a group whose mean IQ lies more than, say, 5 or 10 points
away from the population mean of 100. So, onward to the problems below, the
answers to which will tell you whether it's realistic to worry about such things
as getting a sample whose mean IQ is below 90 or over 110.

Answer these questions to 4 decimal places (as .1234, e.g.), and show your work.

First, suppose your sample size is going to be 9. What is the probability


that those 9 people will have IQ scores whose mean is...

1. over 105?

2. under 90?

3. more than 5 points away from 100 in either direction (i.e., more than 100 + 5 or less than 100 - 5)?

4. more than half a standard deviation away from the population mean? (The standard deviation this question refers to is that of the population.)
5. within one standard deviation of the population mean?


270 Homework for Chapter 13

Now suppose you almost triple your sample size to 25. Again answer to 4 decimal places, and show your work. What is the probability that the 25 people in your sample will have IQs with a mean...

6. over 105?

7. under 90?

8. more than 5 points away from the population mean?

9. more than half a standard deviation away from the population mean?

10. within one standard deviation of the population mean?

11. Suppose your sample will be quite large, with 100 persons. To 4 decimal places, what is the probability that the 100 people will have Wechsler IQs whose mean lies within 1 measly point of the population mean? You may find the answer surprisingly large.
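The computations in Questions 1-11 all rest on one fact: the sampling distribution of the mean has standard deviation σ/√n. That can be sketched, and checked, with NormalDist; the probability computed below uses a cutoff (103) that none of the questions above ask about:

```python
from statistics import NormalDist

mu, sigma, n = 100, 15, 25
sigma_m = sigma / n ** 0.5            # standard error of the mean, σ/√n

sampling_dist = NormalDist(mu, sigma_m)

# Probability that a random sample of 25 people has a mean IQ over 103
# (a cutoff not used in the questions above):
p_over_103 = 1 - sampling_dist.cdf(103)
print(round(p_over_103, 4))
```

Changing n from 9 to 25 to 100 shrinks sigma_m, which is exactly why the probabilities change so much across the three parts of this page.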

12. With a sample as large as 100, it doesn't much matter whether the distri¬
bution of IQs is normal in the population. Even if it departs considerably from
normality, we can still be quite confident that the answer to Question 11 is
correct. Why is this? There are three "magic words" that name the reason, and
the briefest possible answer to this question (an answer that is still entirely
correct, though) requires no more than those three little words.

Name ______________________
Homework for Chapter 14 271

An honest die is one whose six faces turn up with equal probability. The "other" way of looking at probability described in Section 14.2 of the text thus applies to it.

In answering these questions, assume that the dice are honest, and give the requested probabilities both as common fractions reduced as far as possible (as 1/6, e.g.) and as decimal fractions to four decimal places (as .1234, e.g.). You'll do best if you first translate each question into an OR question, an AND question, or a combination of the two, whichever is appropriate. (See p. 115 of the workbook.) Show your work below each of the questions.

With a pair of honest dice, what is the probability of rolling...

_ 1. a 5 on a certain one of the dice (call it Die #1) and a 6 on the other (on Die #2)?

2. a 6 on Die #1 and a 5 on Die #2?

3. a 5 on one die (it doesn't matter which) and a 6 on


the other?

4. a 6 on one die (it doesn't matter which) and a 6 on the

other?

5. "doubles" (the same number on both dice)?

To draw a sample of some given size at random is to draw it in such a way that all possible samples of that size are equally likely. In drawing a card (a sample of size 1) at random from a deck of playing cards, then, the other way of looking at probability described in Section 14.2 applies. Answer these questions as you did the series above.

If you draw a card at random from a standard deck of 52, what is the probability that you will get...

6. the Queen of Hearts?

7. a queen (of any kind)?


272 Homework for Chapter 14

8. a heart (of any kind)?

9. a queen OR a heart?

Now suppose you draw a first card at random, look at it, replace it, and draw again at random. This is sampling (drawing a sample of size 2) with replacement, and it is equivalent to drawing the first card from one deck and the second card from a second deck.

What is the probability that you will get...

10. the Queen of Hearts on both draws?

11. a queen (of any kind) on both draws?

12. a heart (of any kind) on both draws?

13. a queen on the first draw AND a heart on the second?

14. The LaMaze method is a technique of prepared childbirth permitting a laboring woman to participate actively in the delivery of her child, perhaps obviating the need for analgesia or anesthesia. In the LaMaze classes attended by the wife of the author of your workbook, there were seven women enrolled, and ... of them gave birth to a girl rather than a boy. Is this a rare occurrence? Assume that the probability of any one's bearing a girl is 1/2. (Actually the probability of a boy is slightly greater than the probability of a girl, about .51 or .52. More boys than girls are conceived, and even though males are more likely to die during gestation, they still predominate slightly at delivery time.)

15. In answering Question 14


what assumption (other than the one about the
probability of a woman's bearing a girl) did you have to make?
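Assuming the event in Question 14 is that all seven women bore girls, and that the seven births are independent with probability 1/2 of a girl each, the answer is simply (1/2)⁷. A two-line check:

```python
# Seven independent births, each with P(girl) = 1/2, all girls:
p_girl = 0.5
p_all_seven = p_girl ** 7
print(p_all_seven)  # 0.0078125, i.e. 1 chance in 128
```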

Homework for Chapter 15 273

Under the "personalized system of instruction" (PSI), material to be learned
is divided into small units, students work on it at their own pace, and they take
the exam on a given unit only when they think they're ready for it. They must
pass the exam at a high level before going ahead to the next unit, but they are
allowed several tries for each unit (taking a different exam each time, of course).

Three psychologists at Southwest Minnesota State College recently reported a
comparison of PSI with the traditional mode of instruction for introductory
psychology (R. C. Riedel, B. Harney, & W. LaFief, "Unit Test Scores in PSI
versus Traditional Classes in Beginning Psychology," Teaching of Psychology,
1976, 3, 76-78). In the fall quarter of an academic year, they used the method
with a criterion of 16 out of 20 correct (80%) as the passing score for the test
on each of the ten units into which they divided their course. They did not
lecture but merely made themselves available at certain times to administer the
exams to whichever students were seeking to take one. In the winter quarter the
psychologists did lecture and administered the same tests, one for each unit,
every fourth class period, with no opportunity for students to retake an exam.

The psychologists expected that the students in the PSI course would generally
make their initial try at the exam on a given unit without being fully prepared,
so that most would not meet the criterion of 16 correct on the first try. In
fact, in the students' first tries at the exam on the very first unit of the
course, they earned a mean of 17.93 correct (which is almost 90%). The psychol-
ogists reported the n and the standard deviation for this group: 64 and 2.91,
respectively. Thus you can determine whether it is plausible that those 64 scores
are a random sample from a population whose mean is only 16. Test the appropriate
hypothesis, using the .05 level of significance and doing a two-tailed test.
Carry all calculations to 3 decimal places, and round your final answers to 2,
but if you need an answer for a later calculation, use the 3-place version.

1. H0 in symbols:    2. Ha in symbols:

3. α:    4. X̄:

5. The value for the standard deviation given above is S. Compute s using the
formula s = √((S²)(n)/(n − 1)):

6. s_X̄:    7. z_crit:

8. "z"_calc:    9. Decision on H0: Accept  Reject

in Question 9 mean in substantive


10. What does the statistical decision
for the substantive question of whether
terms? That is, what are its implications
to meet the criterion of 16 correct on
the PSI students were generally unprepared
their first tries at the exam?
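If you'd like to verify your arithmetic for questions 1 through 9, the whole z test can be sketched in a few lines of Python. This is only a checking aid, not a substitute for showing your work.

```python
import math

# PSI first-unit data from the problem: mean 17.93, S = 2.91, n = 64.
x_bar, S, n, mu_0 = 17.93, 2.91, 64, 16

# s from S, as in the workbook formula s = sqrt(S**2 * n / (n - 1)).
s = math.sqrt(S**2 * n / (n - 1))
s_mean = s / math.sqrt(n)          # standard error of the mean
z = (x_bar - mu_0) / s_mean

print(round(s, 3), round(s_mean, 3), round(z, 2))
# |z| far exceeds the two-tailed .05 critical value of 1.96,
# so H0: mu = 16 is rejected.
```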

11. If you had conducted that test at the .01 level of significance, would
your decision on the null hypothesis have been different?
Yes  No

12. If you had conducted the test at the .05 level of significance but had
done a one-tailed version in which the alternative hypothesis stated that μ < 16,
would your decision on the null hypothesis have been different?
Yes  No

13. If you had conducted the test at the .01 level of significance and had
done a one-tailed version in which the alternative hypothesis stated that μ < 16,
would your decision on the null hypothesis have been different?
Yes  No

In the lecture course, the mean score on the first exam was only 13.41 (67%
correct). Is it plausible that the scores in this group are a random sample from
a population whose mean is as large as 16? The n for the group was 61, and the
standard deviation (S) was 4.07. Do a two-tailed test at the .05 level of sig-
nificance again.

14. H0 in symbols:    15. Ha in symbols:

16. α:    17. X̄:    18. s:

19. s_X̄:

20. z_crit:    21. "z"_calc:    22. Decision on H0: Accept  Reject

Over the remaining nine units of the course, the PSI students earned a mean
above 16 on their first tries at every unit except one. The mean of their first
tries at the eighth unit was only 15.39 (n = 62, S = 2.78). Is it plausible that
those scores were a random sample from a population whose mean was only 16?
Do the appropriate two-tailed test at the .05 level of significance.

23. s:    24. s_X̄:

25. "z"_calc:    26. Decision on H0: Accept  Reject

27. If you had conducted the test (still two-tailed) at the .10 level of
significance, what would z_crit have been?

28. Would your decision on H0 have been different?  Yes  No

You may be interested to know that over all 10 units of the course, even on
their first tries the PSI students earned a mean score higher than the mean for
the students in the lecture version. In the homework for Chapter 17 you will
have a chance to determine whether the differences are statistically significant.

Homework for Chapter 16 275

1. Recall from the homework for Chapter 15 the comparison of PSI with a
traditional lecture course in elementary psychology at Southwest Minnesota
State College. In their first tries at the exam on the first unit of the course,
the 64 PSI students earned a mean of 17.93 correct out of the 20 items on the
exam, with a standard deviation of 2.91. The figure 16 out of 20 was of special
interest here, because 16 or better was required for going on to the next unit.
Evaluate the difference between 17.93 and 16.00 following the procedure suggested
in Section 16.6 of the text: determine how many standard deviations' worth the
difference is, and give the answer to two decimal places.

2. Is this difference "negligible" or "of some importance" according to the
standards proposed on p. 96 of the text?

3. The 61 students in the lecture course earned a mean of only 13.41
on the first exam, with a standard deviation of 4.07. How many standard devia-
tions' worth is the difference between 13.41 and 16.00?

4. How important is this difference according to the standards on p. 96?

5. The poorest performance for the PSI students came in their first tries
at the exam on the eighth unit, where they earned a mean of 15.39 with a
standard deviation of 2.78. How many standard deviations' worth is the
difference between 15.39 and 16.00?

6. How important is this difference according to the standards on p. 96?

7. The lecture students' best performance came on the third unit, where they
earned a mean of 16.60 with a standard deviation of 2.89. How many standard
deviations' worth is the difference between 16.60 and 16 even?

8. How important is this difference according to the standards on p. 96?
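All four of the standard-deviation comparisons in questions 1 through 8 follow the same recipe, (mean − 16)/S, so they can be checked together with this short sketch (a checking aid only):

```python
# Difference between each observed mean and the 16-point criterion,
# expressed in standard-deviation units (the procedure of Section 16.6).
cases = {
    "PSI unit 1":     (17.93, 2.91),
    "Lecture exam 1": (13.41, 4.07),
    "PSI unit 8":     (15.39, 2.78),
    "Lecture unit 3": (16.60, 2.89),
}
for label, (mean, sd) in cases.items():
    print(label, round((mean - 16.0) / sd, 2))
```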

It is common practice in the behavioral sciences for researchers to conduct a
one-tailed test if their hypothesis specifies the direction in which μ_true lies
from μ_hyp. Suppose the psychologists at Southwest Minnesota State College had
followed this practice in examining the data on the PSI students' first tries at
the exam on the first unit of their course. They expected that the students
would not generally have prepared well enough to earn a score of 16 or better.
Their null hypothesis would have said that μ = 16.

9. What would their alternative hypothesis have said?

10. Is there any level of significance that would have permitted the psychol-
ogists to discover that μ_true lies above 16? If so, what? If not, why not?

11. Given that the psychologists were interested in discovering the
population mean to be either above 16 or below it, what should their
alternative hypothesis have said?

12. Suppose you test the null hypothesis that the population mean is 16 for
a sample of PSI scores, and you end up rejecting the null. The sample mean was
18, let's say, and the n was 60. You did a one-tailed test at the .01 level of
significance, and a friend who is naive about statistics asks, "Does your result
mean we can be 99% confident that the population mean is above 16?" Explain to
your friend the logic behind your hypothesis test, and say how the figure 99%
enters into things. Work the figures 16, 18, and 60 into your explanation too.

Answer this question carefully! Do a rough draft on another page first.


Homework for Chapter 17 277

Recall again the comparison between PSI and a conventional lecture course in
elementary psychology. The instructors of the course reported the following data
for scores on the first exam:

          PSI     Lecture
X̄       17.93     13.41
S         2.91      4.07
n           64        61

Is it plausible that the two groups are random samples from populations with
identical means? Test the appropriate hypothesis at the .01 level of signifi-
cance, doing a two-tailed test. Carry your calculations to 3 decimal places, and
round your final answers to 2, but if you need an earlier answer for a calcula-
tion, use the figure with 3 decimal places. Show your work. If you need to com-
pute s, the formula is s = √((S²)(n)/(n − 1)).
1. These samples are (circle one): independent dependent

2. H0 in symbols:    3. Ha in symbols:

4. α:    5. X̄ − Ȳ:    6. s_X̄:

7. s_Ȳ:    8. s_(X̄−Ȳ):

9. z_crit:    10. "z"_calc:

11. Decision on H0: Accept  Reject

12. What are the implications of your statistical conclusion in Question 11
for the substantive question of whether the PSI students' first tries at the
tests would not be as good as the lecture students' performance? (That was the
psychologists' expectation, remember.)
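For checking questions 2 through 11, here is a sketch of the two-sample z test; the standard error of the difference is the square root of the sum of the two squared standard errors of the means.

```python
import math

def s_from_S(S, n):
    # The workbook's conversion from S (n denominator) to s (n - 1 denominator).
    return math.sqrt(S**2 * n / (n - 1))

x1, S1, n1 = 17.93, 2.91, 64   # PSI
x2, S2, n2 = 13.41, 4.07, 61   # Lecture

se_diff = math.sqrt(s_from_S(S1, n1)**2 / n1 + s_from_S(S2, n2)**2 / n2)
z = (x1 - x2) / se_diff
print(round(se_diff, 3), round(z, 2))
# z far exceeds the two-tailed .01 critical value of 2.58:
# reject H0 of equal population means.
```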

The mean for the PSI students' first attempts at each of the remaining nine
exams was higher than the corresponding mean for the lecture students. The data
on the third unit are of interest because that was the one on which the lecture
students did the best and the only unit on which they earned a mean over 16.
The data are as follows:

          PSI     Lecture
X̄       17.06     16.60
S         2.01      2.89
n           64        72

Again, determine whether it is plausible that the parent populations have the
same mean, doing a two-tailed test at the .01 level of significance.

13. These samples are: independent  dependent    14. X̄ − Ȳ:

15. s_X̄:    16. s_Ȳ:

17. s_(X̄−Ȳ):    18. "z"_calc:

19. Decision on H0: Accept  Reject

Also of interest are the data on the eighth unit, because this was the one
on which the PSI students did least well, and the only one on which the mean of
their first tries at the exam was under 16. The PSI class still outperformed
the lecture class, though:
          PSI     Lecture
X̄       15.39     14.27
S         2.78      3.34
n           62        61

Do a two-tailed test at the .01 level of significance again.


20. These samples are: independent  dependent    21. X̄ − Ȳ:

22. s_X̄:    23. s_Ȳ:

24. s_(X̄−Ȳ):    25. "z"_calc:

26. Decision on H0: Accept  Reject

27. Which, if any, of the above three tests would have yielded a different
conclusion about the null hypothesis if it had been conducted at the .05 level
of significance?  None  1st  2nd  3rd

28. Suppose you're wondering whether the PSI students' performance on their
first tries on the first unit differed significantly from their performance on
their first tries at the tenth unit (X̄ = 16.49). If you were to do a two-tailed
test of the hypothesis that the parent populations for the two samples of scores
had identical means, using the .01 level of significance again, could you follow
the procedure and use the formulas that you employed for the three problems
above? Why or why not?

29. If you could not follow the same procedure and use the same formulas, say
what you would have to do differently.

Homework for Chapter 18 279

Here are the data again for the first-unit comparison of PSI with a lecture
course covering the same material:

          PSI     Lecture
X̄       17.93     13.41
S         2.91      4.07
n           64        61

In answering the following questions, carry your calculations to 3 decimal
places and round the final answers to 2, but if you need an earlier answer for a
calculation, use the figure with 3 decimal places. Show your work, as usual.
The formula, once more, for computing s from S is s = √((S²)(n)/(n − 1)).

Determine the 95% confidence limits for the PSI population's mean.

1. X̄

2. z_p

3. s_X̄

4. Lower limit

5. Upper limit

6. d₁

7. Say in words what d₁ means.

Suppose we wanted to increase the precision of this estimate, so that the
full width of the interval is only 1 point on the scale of raw scores (which are
numbers of items correct out of 20). Estimate the required sample size. Remember
that it must be a whole number.

8. w

9. s

10. z_p

11. Required n
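The confidence limits and the required-n estimate can be checked with the sketch below; the sample-size step just solves z_p·s/√n = w/2 for n and rounds up to a whole number.

```python
import math

x_bar, S, n, z_p = 17.93, 2.91, 64, 1.96   # z_p for 95% confidence

s = math.sqrt(S**2 * n / (n - 1))          # s from S
s_mean = s / math.sqrt(n)
lower, upper = x_bar - z_p * s_mean, x_bar + z_p * s_mean
print(round(lower, 2), round(upper, 2))    # 95% confidence limits

# Sample size for a full interval width w = 1 point:
# solve z_p * s / sqrt(n) = w / 2 for n, then round up.
w = 1.0
n_needed = math.ceil((2 * z_p * s / w) ** 2)
print(n_needed)
```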

Determine the 99% confidence limits for the lecture population's mean.

12. X̄

13. z_p

14. s_X̄

15. Lower limit

16. Upper limit

17. d₁

18. Say in words what d₁ means in this case.

Now find the 95% confidence limits for the difference between the PSI popu-
lation's mean and the lecture population's mean.

19. X̄ − Ȳ

20. z_p

21. s_(X̄−Ȳ)

22. Lower limit

23. Upper limit

24. s_av

25. d₂

26. Say in words what d₂ means in this case.

Suppose you wanted the 95% confidence limits for the problem above to be only
1 point wide on the scale of raw scores. Estimate the required size for each
sample (which must be a whole number, of course).

27. w

28. s_av

29. z_p

30. Required n per sample

Homework for Chapter 19 281

Here are the data from the homework for Chapter 9 again.

Student   % Correct on Exam 1   % Correct over Semester

A                38                        75
B                54                        65
C                62                        94
D                67                        81
E                67                        84
F                72                        93
G                77                        90
H                77                        93
I                82                        90
J                85                        95

In answering the following questions, carry your computations to 3 decimal
places and report answers to 2, but if you need an answer for a later computa-
tion, use the 3-place version. Show your work.

If the students had known the answer to 75% of the questions over the entire
semester, on the average, their mean percentage correct would have been 81.25.
The extra 6.25 percentage points would have come from their guessing correctly
on a quarter of the 25% of the items they didn't know. (With 4-choice items,
the probability of a correct guess is .25.) Is it plausible that the mean of
the population of percentage-correct scores for the entire semester is 81.25?
Do a two-tailed test of the appropriate hypothesis at the .05 level of signifi-
cance. (Don't be confused because you previously called these scores Y.)

1. H0 in symbols:    2. Ha in symbols:

3. α:    4. X̄:    5. s:

6. s_X̄:    7. df:    8. t_crit:

9. t_calc:    10. Decision on H0: Accept  Reject

Now estimate the population mean for the total-performance scores by finding

the 99% confidence limits.

11. t_p

12. Lower limit

13. Upper limit
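The t test and the 99% confidence interval above can be checked with the sketch below; the tabled constant 3.250 (the t for df = 9 enclosing 99% of the distribution) comes from a standard t table.

```python
import math

# Percentage-correct-over-semester scores from the table above.
scores = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]
n = len(scores)
mean = sum(scores) / n
s = math.sqrt(sum((x - mean) ** 2 for x in scores) / (n - 1))  # n - 1 denominator
s_mean = s / math.sqrt(n)

# One-sample t test of H0: mu = 81.25, with df = n - 1 = 9.
t = (mean - 81.25) / s_mean
print(round(mean, 2), round(s, 3), round(t, 2))

# 99% confidence limits using the tabled t_p = 3.250 for df = 9.
lower, upper = mean - 3.250 * s_mean, mean + 3.250 * s_mean
print(round(lower, 2), round(upper, 2))
```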



Did the students' total-performance percentages significantly exceed their
initial-performance percentages? Test the difference between the mean percentage
correct on Exam 1 and the mean percentage correct over the semester, using the
method of difference scores. Show the difference scores in the table on the
other side of this page. Use the .01 level of significance, and even though it's
not appropriate, let the alternative hypothesis indicate a higher mean for the
population of total-performance scores. Be sure to state the null and the alter-
native hypotheses in terms of difference scores, though.

14. H0 in symbols:    15. Ha in symbols:

16. α:    17. D̄:    18. s_D:

19. s_D̄:    20. df:    21. t_crit:

22. t_calc:    23. Decision on H0: Accept  Reject

Here again are the data for the comparison of PSI with a conventional lecture
course in elementary psychology:
          PSI     Lecture
X̄       17.93     13.41
S         2.91      4.07
n           64        61

Test the difference between the two sample means, doing a two-tailed test at the
.01 level of significance and using the t statistic. It will be interesting to
compare this test with the one you did on the same data for Chapter 17.

24. H0 in symbols:    25. Ha in symbols:

26. α:    27. X̄ − Ȳ:    28. Σx²:

29. Σy²:    30. s_(X̄−Ȳ):

31. df:

32. t_crit:    33. t_calc:    34. Decision on H0: Accept  Reject

Finally, compute the 90% confidence limits for the difference between the two
population means.

35. t_p

36. Lower limit

37. Upper limit
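For checking the t test on these data: because the workbook's S uses an n denominator, the within-group sum of squares is simply n·S², which makes the pooled-variance t easy to reproduce in a sketch.

```python
import math

x1, S1, n1 = 17.93, 2.91, 64   # PSI
x2, S2, n2 = 13.41, 4.07, 61   # Lecture

# With S defined using an n denominator, each within-group sum of
# squares is n * S**2.
ss1, ss2 = n1 * S1**2, n2 * S2**2
df = n1 + n2 - 2
s2_pooled = (ss1 + ss2) / df
se_diff = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
t = (x1 - x2) / se_diff
print(df, round(se_diff, 3), round(t, 2))
# With df = 123, the two-tailed .01 critical t is about 2.62; H0 is rejected,
# in agreement with the z test from Chapter 17.
```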


Homework for Chapter 20 283

Look back at the bivariate distribution presented in the homework for


Chapters 9, 11, and 19. Test the hypothesis that the correlation coefficient
is zero in the full population from which those scores may be considered a
random sample. Make the test two-tailed, and use the .01 level of significance.
Except for the computation of r, show your work.

1. H0 in symbols

2. Ha in symbols

3. n

4. r

5. t

6. df

7. t_crit

8. Decision on H0

9. According to Table E in Appendix F, what values of r are required for
significance at the .01 level in a two-tailed test?

Now compute the 95% confidence limits for the population value, again show-
ing your work.

10. z′

11. z

12. σ_z′

13. Lower limit expressed as z′

14. Upper limit expressed as z′

15. Lower limit expressed as a correlation coefficient

16. Upper limit expressed as a correlation coefficient

17. Are the limits equidistant from the sample value? If not, which limit

is closer?
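For checking this page, the sketch below computes r from the Chapter 9 data, applies the t test for H0: ρ = 0, and builds the 95% interval via Fisher's z′ transformation. (Python's math.atanh and math.tanh are exactly the z′ transformation and its inverse.)

```python
import math

exam1 =    [38, 54, 62, 67, 67, 72, 77, 77, 82, 85]
semester = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]
n = len(exam1)

mx, my = sum(exam1) / n, sum(semester) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(exam1, semester))
sxx = sum((x - mx) ** 2 for x in exam1)
syy = sum((y - my) ** 2 for y in semester)
r = sxy / math.sqrt(sxx * syy)

# t test of H0: rho = 0, with df = n - 2.
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)
print(round(r, 2), round(t, 2))

# 95% limits via Fisher's z'; sigma_z' = 1 / sqrt(n - 3).
z_prime = math.atanh(r)
sigma = 1 / math.sqrt(n - 3)
lo, hi = math.tanh(z_prime - 1.96 * sigma), math.tanh(z_prime + 1.96 * sigma)
print(round(lo, 2), round(hi, 2))  # note the limits are not equidistant from r
```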

Many institutions of higher education make a systematic effort to survey


their students' opinions of the instruction the institutions provide. Some
schools publish the average rating of each instructor in each course that she
or he teaches, and students use these averages in choosing their curricula,
on the presumption that an instructor whose average rating for a given course
is high will receive another high average when teaching that course again. But
how valid is this presumption?

Relevant data have been gathered at Queens College of the City University
of New York (L. H. Seiler, L. D. Weybright, & D. J. Stang, "How Useful Are
Published Evaluation Ratings to Students Selecting Courses and Instructors?"
Teaching of Psychology, 1977, 4, 174-177). Five different Pearsonian correla-
tion coefficients are available, each describing the relationship between the
mean rating an instructor received for a given course and the mean that instruc-
tor received for the same course one year later. The n's for the correlations
range from 99 to 183, and the r's range from .58 to .65. Even the largest of
these, which happens to be the one based on the biggest n and is thus the best
single estimate of the true correlation, is disappointingly small.

18. What is the coefficient of alienation for these data? (See
Section 12.7.)

19. What is the coefficient of determination for these data? (See
Section 12.8.)

One might expect the correlation to be higher if an instructor's two offer-
ings of a given course came not a year apart but in successive semesters. The
only available data for this case were collected at another institution: r = .67
for 45 combinations of an instructor and a course. Does this figure differ sig-
nificantly from r = .65 for 183 combinations of instructor and course? Test the
appropriate hypothesis at the .05 level of significance, using a two-tailed
alternative. As usual, carry computations to 3 decimal places and round to 2,
and show your work.

20. H0 in symbols

21. Ha in symbols

22. z′₁

23. z′₂

24.

25.

26. z_crit

27. Decision on H0

Homework for Chapter 21 285

Look back at the first page of homework for Chapter 15. Note that on their
first tries at the exam on the first unit of the PSI course, the 64 students
correctly answered a mean of almost 18 out of the 20 items, which turned out to
be significantly greater (in the statistical sense) than 16, the minimum needed
for proceeding to the next unit.

To get some idea of the power of the test you did there, assume that the
standard deviation of the scores in the population is 2.40. (The article from
which this example is drawn reports the standard deviation for 10 samples of
first tries at an exam in that PSI course, one for each of the 10 units into
which the course was divided, and the mean of the 10 standard deviations is 2.42.)
Following the procedure illustrated in Section 21.10, determine β for the test
you did (a two-tailed test at the .05 level of significance using a sample of
size 64) on the assumption that the true mean was 17, just one point higher than
the hypothesized mean. To show your work, construct a neat, carefully labeled
diagram like Figure 21.5 on p. 371. Do a rough draft first on another sheet of
paper.

1. What is β (to 4 decimal places)?

2. Say in words what this figure means for this particular case.

3. What is the power of the test (to 4 decimal places)?

4. Say in words what this figure means for this particular case.
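The β computation can be reproduced numerically: find the acceptance region under H0: μ = 16, then ask how much probability the true (μ = 17) sampling distribution of the mean puts inside that region. A sketch for checking:

```python
import math

def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

mu_hyp, mu_true, sigma, n, z_crit = 16.0, 17.0, 2.40, 64, 1.96
se = sigma / math.sqrt(n)                  # 0.30

# Acceptance region for the two-tailed .05 test of H0: mu = 16.
low, high = mu_hyp - z_crit * se, mu_hyp + z_crit * se

# beta = probability the sample mean falls in the acceptance region
# when the true mean is 17.
beta = Phi((high - mu_true) / se) - Phi((low - mu_true) / se)
print(round(beta, 4), round(1 - beta, 4))  # beta and power
```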

In a study conducted on a street corner with a traffic light, an experimenter


stood on the curb waiting for the light to turn from red to green. When another
pedestrian walked up, the experimenter turned and gave the person either a quick
glance or a prolonged stare. Regardless of the sex of the experimenter or the
sex of the subject, those who had been stared at tended to cross the street faster,
once the light changed, than those who had received only a glance. The mean
crossing time for 66 subjects in the stare condition was 11.1 seconds, while the
mean for 62 subjects in the glance condition was 12.2 seconds, and this difference
proved to be highly significant. The standard deviation of the population of
crossing times was estimated to be 1.00 seconds (really) in both conditions.
(This study was the work of P. C. Ellsworth, J. M. Carlsmith, and A. Henson, who
reported it in their article "The Stare as a Stimulus to Flight in Human Subjects"
in the Journal of Personality and Social Psychology, 1972, 21, 302-311.)

It seems safe to assume that the researchers conducted a two-tailed test of
the null hypothesis that the population means were equal. One wonders how small
a difference between the population means they could have discovered in this
procedure with samples of the sizes they employed. There is no one answer, of
course; rather, the smaller the difference, the lower the probability of their
discovering it. But if the probability of discovery is specified, one can estimate
the minimum difference whose discovery carried this probability. Do so for the
probabilities named below, following the procedure described in the next-to-the-
last paragraph of Section 21.11. (Only approximate answers are possible. Give
them in seconds, not in standard deviations.)

5. What was the minimum difference between the population means
that was discoverable with a probability of .80? (This is
the difference for which the risk of missing it was .20.)

6. What was the minimum difference that was discoverable with
a probability of 90%? (This is the difference for which the
risk of wrongly accepting the null was 10%.)

7. What was the minimum difference whose probability of dis-
covery was fully 95%? (This is the difference for which the
risk of a Type II error was only .05.)

Homework for Chapter 22 287

To become comfortable with one-way analysis of variance, and to gain insight


into its workings, it is helpful to do an analysis with very simple numbers. Such
numbers are supplied below, and you will find that most of the quantities derived
from them (means, sums of squares, and the like) also turn out to be simple. In
doing the analysis of these figures, you will see that you are asked first to use
the definitional formulas of Section 22.5 (illustrated in Section 22.6) and then
the raw-score formulas of Section 22.7, which should produce the same results.

X_D   X_D − X̄_D   (X_D − X̄_D)²    X_E   X_E − X̄_E   (X_E − X̄_E)²    X_F   X_F − X̄_F   (X_F − X̄_F)²

 21                                 16                                 12

 21                                 15                                 10

 20                                 15                                 10

 19                                 15                                 10

 19                                 14                                  8

Σ =

1. X̄_D:    2. X̄_E:    3. X̄_F:    Note that the
means are very widely dispersed, whereas within each subgroup the scores cluster
tightly about their mean.

4. Σ(X_D − X̄_D)²

5. Σ(X_E − X̄_E)²

6. Σ(X_F − X̄_F)²

These are the quantities that could logically be called SS_D, SS_E, and SS_F,
as noted on p. 198 of the workbook. To show your work in computing them via
the definitional formulas, fill in the table above.

7. SS_W:    8. df_W:    9. s²_W:

Now compute SS_W via the raw-score formula:

10. Σx²:

11. The other term in the formula:

12. SS_W:

Onward to s²_A. First use the formula on the bottom of p. 395, showing your
work in computing the numerator of the fraction by filling in the table below.

X̄       X̄ − X̿       (X̄ − X̿)²

X̄_D
X̄_E
X̄_F
Σ =

13. Σ(X̄ − X̿)²:    14. n:    15. SS_A:    16. df_A = k − 1:    17. s²_A:
288 Homework for Chapter 22

Next compute SSA via the raw-score formula on p. 399.

18. The term in square brackets in Formula 22.6

19. The other term in the formula, the one subtracted from the first

20. SS_A as computed from Formula 22.6. Compare with the previous result.

Now complete the analysis:

21. F_calc:

22. F_crit for α = .05:

23. Decision on null hypothesis stating equality of population means:
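Since the scores themselves are given, the whole one-way analysis can be checked by machine. Note that SS_W + SS_A reproduces SS_T, which is the point of the extra-insight questions further on. (F_crit for 2 and 12 df at α = .05 is about 3.89.)

```python
groups = {
    "D": [21, 21, 20, 19, 19],
    "E": [16, 15, 15, 15, 14],
    "F": [12, 10, 10, 10, 8],
}
k = len(groups)          # 3 groups
n = 5                    # scores per group
N = k * n
all_scores = [x for g in groups.values() for x in g]
grand_mean = sum(all_scores) / N

# Within-groups, among-groups, and total sums of squares (definitional formulas).
ss_w = sum(sum((x - sum(g) / n) ** 2 for x in g) for g in groups.values())
ss_a = n * sum((sum(g) / n - grand_mean) ** 2 for g in groups.values())
ss_t = sum((x - grand_mean) ** 2 for x in all_scores)

s2_w = ss_w / (N - k)    # df_W = 12
s2_a = ss_a / (k - 1)    # df_A = 2
F = s2_a / s2_w
print(ss_w, ss_a, ss_t, round(F, 2))
```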

For a bit of extra insight, finally, compute SS_T first via the definitional
formula on p. 397 and then via the raw-score formula on p. 399. To use the defi-
nitional formula, fill in the table below, which lists all the scores.

X       X − X̿       (X − X̿)²

24. SS_T from the table, which should = SS_W + SS_A:

25. ΣX², where summation is over all scores, from Formula 22.7:

26. The other term in Formula 22.7:

27. SS_T from Formula 22.7:

Suppose an experimenter assigns 45 men at random to receive a placebo, a small
dose of caffeine, or a large dose, and then determines their reaction time in an
apparatus simulating the braking of an automobile. The n's are equal for the
three treatment levels. The experimenter then repeats this procedure with 45
women. The resulting data can be studied in a two-way analysis of variance. One
variable is dosage of drug, and the other is sex of subject.

28. df for dosage of drug:

29. df for sex of subject:

30. df for interaction:

31. df_wc:

32. F_crit for dosage at α = .05:

33. F_crit for sex at α = .05:

34. F_crit for interaction at α = .05:
Homework for Chapter 23 289

Want to convince people that you can read their minds? Try this demonstra-
tion. Ask a good-sized group of people each to think of a number between six
and ten, inclusive. Each person should make his or her choice individually and
keep it private. Then request that the group think their numbers "at" you, and
announce that you will receive, via telepathy, the number that "comes through"
most strongly, which will be the modal choice. Pretend to receive their thoughts,
and state with confidence that the "loudest" number is seven. You will have an
excellent chance of being correct. Why? Consider the following data, which are
the results of asking 207 introductory-psychology students to choose a number
from six to ten. (The data were reported by Philip Zimbardo in the instructor's
manual for the ninth edition of his text Psychology & Life.)

Choice f
six 24

seven 112

eight 33

nine 25

ten 13

Does it appear plausible, in light of these data, that people in our contem-
porary society make those five possible choices in equal proportions? Test the
appropriate hypothesis at the .05 level of significance.

1. State the null hypothesis in words.

2. State the alternative hypothesis in words.

3. Compute χ², showing your work in the table above. χ² =

4. df =    5. χ²_crit =    6. Decision on H0: Accept  Reject
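Questions 3 through 6 can be checked with this sketch; under the null hypothesis each of the five choices has the same expected frequency, 207/5 = 41.4.

```python
observed = [24, 112, 33, 25, 13]     # six, seven, eight, nine, ten
N = sum(observed)                    # 207
expected = N / len(observed)         # 41.4 under H0 of equal proportions

chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1
print(round(chi2, 2), df)
# chi-square critical value for df = 4 at the .05 level is 9.49;
# equal choice proportions are untenable.
```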

7. Is it plausible that the proportion of people who choose seven in what-
ever population Zimbardo randomly sampled is one-half? Compute the appropriate
χ², showing your work in the table below.

Choice f

seven 112

other 95

8. df =    9. χ²_crit for α = .05 =    10. Decision on H0: Accept  Reject

In 1975, three psychologists at Purdue University reported a study in which


undergraduate students, 28 men and 34 women, were asked to play an electronic
dart game. Each subject was offered a choice between two versions, one in which
the score depended on the player's skill and one in which the score depended on
luck. The psychologists reported their data in the following table:

                  Males   Females
Choice   Luck       6        22
         Skill     22        12
                   28        34

11. Conceptualizing this study as a test of the difference between two propor-
tions, state the appropriate null hypothesis in words.

12. State a two-tailed alternative in words.

13. χ² =    14. df =    15. χ²_crit for α = .05 =

16. State your decision on the null hypothesis and interpret your finding,
specifying the direction of the difference, if any, between the two sexes. Use
the remaining space to copy in (neatly) your computation of χ².

The study is the work of Kay Deaux, Leonard White, and Elizabeth Farris:
"Skill versus Luck: Field and Laboratory Studies of Male and Female Preferences,"
Journal of Personality and Social Psychology, 1975, 32, 629-636.
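For checking questions 13 through 16, the shortcut formula for a 2 × 2 table, χ² = N(AD − BC)²/[(A+B)(C+D)(A+C)(B+D)], avoids computing expected frequencies. The sketch below uses it without Yates's correction for continuity, so your answer may differ slightly if your text applies the correction.

```python
# Observed frequencies: rows = choice (luck, skill), columns = (males, females).
a, b = 6, 22     # luck
c, d = 22, 12    # skill
N = a + b + c + d

# Shortcut chi-square for a 2x2 table (no correction for continuity).
chi2 = N * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
print(round(chi2, 2))
# With df = 1, the .05 critical value is 3.84: the sexes differed,
# women choosing the luck version more often than men did.
```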

Homework for Chapter 24 291

1. Here are the data from the homework for Chapter 9 again. Compute Spearman's
rank order correlation coefficient for the two variables, showing all your work.

Student % Correct on Exam 1 % Correct over Semester

A 38 75

B 54 65

C 62 94

D 67 81

E 67 84

F 72 93

G 77 90

H 77 93

I 82 90

J 85 95

You may be interested to compare r_s with r. Are they close?
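Question 1 can be checked with the sketch below, which assigns average ranks to ties (the two 67s, the two 77s, and so on) and then applies r_s = 1 − 6Σd²/[n(n² − 1)].

```python
def ranks(values):
    """Average ranks, handling ties (e.g. two 67s share rank 4.5)."""
    out = []
    for v in values:
        below = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)
        out.append(below + (equal + 1) / 2)
    return out

exam1 =    [38, 54, 62, 67, 67, 72, 77, 77, 82, 85]
semester = [75, 65, 94, 81, 84, 93, 90, 93, 90, 95]
rx, ry = ranks(exam1), ranks(semester)

n = len(exam1)
d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
rs = 1 - 6 * d2 / (n * (n**2 - 1))
print(d2, round(rs, 2))
```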

2. Compute χ² for a Sign Test of the difference between the two samples of
scores in the table above. Show your work in the space below.

3. Recall that with 1 df, √χ² = z, and z is comparable to t. Compute z for the
Sign Test. You may be interested to compare it with the value of t that you
found in the homework for Chapter 19, at the top of p. 282.

Now do Wilcoxon's Signed Ranks Test for the data on the other side. If you
think a bit, you'll see that it's not necessary to find the difference scores.
Show your work in the space below.

4. W+

5. W_

6. Is the test statistic W+ or W_?

7. Critical value of the test statistic for a two-tailed test at the
.01 level of significance

8. Decision on H0

In a replication of the study described in the homework for Chapter 21 (p.
286), a female experimenter directed a stare or just a glance at a pedestrian
waiting for a traffic light to change from red to green. A second experimenter
standing across the street timed the subject as she or he crossed the intersec-
tion after the light changed. Crossing times were recorded to the nearest half
second, and the following (real) data resulted:

Stare:  6    7    7.5  7.5  8    8    8    8    9    12.5

Glance: 8.5  8.5  9    9.5  10   10   10   10.5 11   11

Test the difference between the two samples with the Mann-Whitney procedure,
using a two-tailed alternative and the .05 level of significance. Call the
stare condition X, and show your work in the table above.
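The rank sum ΣR_X can be checked by ranking all 20 times together, giving tied times the average of the ranks they occupy; a sketch:

```python
stare  = [6, 7, 7.5, 7.5, 8, 8, 8, 8, 9, 12.5]
glance = [8.5, 8.5, 9, 9.5, 10, 10, 10, 10.5, 11, 11]
combined = stare + glance

def avg_rank(v, values):
    """Rank of v in values, with ties given the average of their ranks."""
    below = sum(1 for u in values if u < v)
    equal = sum(1 for u in values if u == v)
    return below + (equal + 1) / 2

sum_rx = sum(avg_rank(v, combined) for v in stare)
print(sum_rx)
```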

9. ΣR_X

10. Range of critical values

11. Decision on H0

