The first step for choosing a statistical test is to identify the scale of measurement of the dependent
variable. All of the parametric tests of means that we have learned to date (e.g., one-sample Z test,
one-sample t-test, t-test for dependent means, t-test for independent means) require an interval or ratio
scale of measurement. Many psychologists also apply parametric tests of means to variables with
an approximately interval scale of measurement. It is your decision whether to consider approximately
interval or "scale" scores as suitable for parametric tests of means.
Scale of Measurement        Parametric Test?
Nominal                     No
Ordinal                     No
Approximately Interval      Yes or No
Interval                    Yes
Ratio                       Yes
Research Design
Number of Samples
Once you have identified the scale of measurement of the dependent variable, you want to
determine how many samples or "groups" are in the study design. Designs for which one-sample tests of
means (e.g., Z test; t-test) are appropriate collect only one set or "sample" of data. There must be two
sets of scores or two "samples" for the t-test for dependent means and the t-test for independent means.
One-Sample Tests
It is sometimes difficult to determine from a study description how many samples of data were
collected. Typically, when the description refers to one type of person from a larger population, the study
design uses only one sample. An example would be if we wanted to know whether Emory students are
like college students in general. Emory students are one specific type of college student. With single
samples, the one-sample Z test and the one-sample t-test are the only statistics that can be used.
Students sometimes ask, "But don't you have population data too, so you have two sets of data?" Yes and
no. Population data have to exist, or else the population parameters could not be defined. But the
researcher does not collect these data; they already exist. So, if you are collecting data on one sample
and comparing those data to information that has already been gathered and published, then you are
conducting a one-sample test using the one sample/set of data collected in this study.
Two-Sample Tests
Studies that refer to repeated measurements or pairs of subjects typically collect at least two sets of
scores. Studies that refer to two specific subgroups in the population also collect two samples of data.
Once you have determined that the design uses two samples or "groups", you must determine
the nature of the relationship between groups: dependent or independent.
Dependent Means
Dependent groups refer to some type of association or link in the research design between sets of scores.
This usually occurs in one of three conditions: repeated measures, linked selection, or matching.
Repeated measures designs collect data on subjects using the same measure on at least two occasions.
This often occurs before and after a treatment or when the same research subjects are exposed to two
different experimental conditions. When subjects are selected into the study because of natural "links or
associations", we want to analyze their data together. This would occur in studies of parent-infant
interaction, romantic partners, siblings, or best friends. In a study of parents and their children, I would
want my data to be associated with my son's, not some other child's. Subject matching also produces
dependent data. Suppose that an investigator wanted to control for socioeconomic differences in
research subjects. She might measure socioeconomic status and then match subjects on that variable. The
matched subjects' scores on the dependent variable would then be treated as a pair in the statistical test.
Independent Means
The t-test for independent means is required when there is no subject overlap across groups. In an
independent groups design, subjects are completely independent. Tests of gender differences are a good
example of independent groups. We cannot be both male and female at the same time; the groups are
completely independent. If you want to determine whether samples are independent or not, ask yourself,
"Can a person be in one group at the same time they are in another?" If the answer is no (can't be in a
remedial education program and a regular classroom at the same time; can't be a freshman in high
school and a sophomore in high school at the same time), then the groups are independent.
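The decisions described so far (scale of measurement, number of samples, and whether two samples are
linked) can be sketched as a small Python function. This is only an illustration: the function name
choose_test_family, its parameters, and the labels it returns are hypothetical and are not part of the
course materials.

```python
def choose_test_family(scale, n_samples, linked=False, accept_approx_interval=True):
    """Suggest a test of means from the design features discussed above.

    scale: 'nominal', 'ordinal', 'approximately interval', 'interval', or 'ratio'
    n_samples: number of samples (sets of scores) collected in the study
    linked: True if two sets of scores are tied together by repeated measures,
            natural association, or matching
    accept_approx_interval: the researcher's own decision about whether
            approximately interval scores are suitable for parametric tests
    """
    parametric_scales = {"interval", "ratio"}
    if accept_approx_interval:
        parametric_scales.add("approximately interval")

    if scale not in parametric_scales:
        return "no parametric test of means"
    if n_samples == 1:
        return "one-sample Z test or one-sample t-test"
    if n_samples == 2:
        if linked:
            return "t-test for dependent means"
        return "t-test for independent means"
    return "not covered by the tests in this tutorial"


# Pretest/posttest scores from the same students are linked by repeated measures.
print(choose_test_family("interval", n_samples=2, linked=True))
```

Choosing between the two one-sample tests depends on the population information that is available,
which is the next consideration.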
Available Information
Some tests of means require population information and others do not. Each test and the information
necessary for it are presented below.
One-Sample Z Test
One-sample tests ask whether a sample comes from a defined population. When we say "defined", what
we mean is that the population parameters are known and are available to researchers. A one-sample
Z test requires both the population mean (μ) and the population variance or standard deviation (σ²; σ).
We typically find this information in test manuals, meta-analyses, or the published findings of
large-scale studies. The formula for the one-sample Z test is presented below. As you can see, both μ
and σ are needed to calculate the Z test.

    Z = (M - μ) / σ_M,  where σ_M = σ / √N
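The one-sample Z computation can be sketched in Python using only the standard library. The function
name one_sample_z and the scores are made up for illustration; the population values (mean 100,
standard deviation 15) are assumptions, not values from the class dataset.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(scores, pop_mean, pop_sd):
    """Return (Z, two-tailed p) for a one-sample Z test of the sample mean."""
    n = len(scores)
    standard_error = pop_sd / sqrt(n)  # SD of the sampling distribution of the mean
    z = (mean(scores) - pop_mean) / standard_error
    # Two-tailed probability from the standard normal distribution
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Hypothetical sample of 9 scores; assumed population mean = 100, SD = 15
scores = [105, 110, 98, 120, 101, 115, 108, 95, 111]
z, p = one_sample_z(scores, pop_mean=100, pop_sd=15)
print(round(z, 2), round(p, 3))  # sample mean is 107, so Z = 7 / 5 = 1.4
```

Note that both population values are arguments of the function: without the population standard
deviation, the standard error in the denominator cannot be computed.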
One-Sample t-Test
The one-sample t-test also asks whether a sample comes from a defined population. For the
one-sample t-test, only the population mean (μ) is required to conduct the test. In the one-sample t-test,
we estimate the population standard deviation (σ) from sample data (S). The formula for the one-sample
t-test is presented below. As you can see, only μ is needed to calculate the sample t-ratio.

    t = (M - μ) / S_M,  where S_M = S / √N
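A minimal Python sketch of the one-sample t-ratio follows. The function name one_sample_t and the data
are hypothetical. The standard library has no t-distribution, so the sketch returns the t-ratio and its
degrees of freedom, to be compared with a critical value from a t table.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, pop_mean):
    """Return (t, df) for a one-sample t-test of the sample mean."""
    n = len(scores)
    s = stdev(scores)              # sample SD (N - 1 in the denominator) estimates sigma
    standard_error = s / sqrt(n)   # estimated SD of the sampling distribution of the mean
    t = (mean(scores) - pop_mean) / standard_error
    return t, n - 1


# Hypothetical sample of 5 scores compared with an assumed population mean of 10
t_ratio, df = one_sample_t([12, 15, 11, 14, 13], pop_mean=10)
print(round(t_ratio, 3), df)
```

The only population value passed in is the mean; the variability comes entirely from the sample.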
t-Test for Dependent Means
The t-test for dependent means can be conducted without any prior knowledge of population
parameters. We use the t-test for dependent means to determine whether there is a significant difference
between linked scores. We do this by calculating the difference between pairs of raw scores (X₁ - X₂) and
then testing whether the average difference score is zero. To conduct this test, we do not need to know the
population mean or population standard deviation. All we need to do is determine
whether a population difference of zero is a reasonable null hypothesis. The formula for the t-test for
dependent means is presented below. As you can see, neither μ nor σ appears anywhere in the formula.

    t = M_Diff / (S_Diff / √N),  where M_Diff and S_Diff are the mean and standard deviation of the
    difference scores and N is the number of pairs
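The difference-score procedure can be sketched in Python as follows. The function name dependent_t and
the pretest/posttest scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def dependent_t(first_scores, second_scores):
    """Return (t, df) for a t-test for dependent means on paired scores."""
    if len(first_scores) != len(second_scores):
        raise ValueError("linked scores must come in pairs")
    # One difference score (X1 - X2) per pair
    diffs = [x1 - x2 for x1, x2 in zip(first_scores, second_scores)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    # The null hypothesis is a population mean difference of zero
    t = (mean(diffs) - 0) / standard_error
    return t, n - 1


# Hypothetical pretest and posttest scores for the same 5 students
pretest = [10, 12, 9, 14, 11]
posttest = [13, 14, 10, 17, 13]
t_ratio, df = dependent_t(pretest, posttest)
print(round(t_ratio, 2), df)  # negative t: posttest scores are higher
```

No population value is passed to the function; the zero in the numerator comes from the null hypothesis.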
t-Test for Independent Means
The t-test for independent means can be conducted without any prior knowledge of
population parameters. We use the t-test for independent means to determine whether two
samples come from the same population or from different populations. For example, when
studying college students, do we need to consider college women and college men as two
different populations, or can we think of them as coming from identical populations (a
difference of zero)? We do this by calculating the difference between sample means and then
testing whether the difference is equal to zero. To conduct this test, we do not need to know the
population mean or population standard deviation. All we need to do is
determine whether a difference of zero between the population means is a reasonable
null hypothesis. The formula for the t-test for independent means is presented below. As you
can see, neither μ nor σ appears anywhere in the formula.

    t = (M₁ - M₂) / S_Difference,  where S_Difference = √(S²_Pooled / N₁ + S²_Pooled / N₂)
    and S²_Pooled is the pooled estimate of the population variance
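A minimal Python sketch of the t-test for independent means with a pooled variance estimate is given
below. The function name independent_t and the two groups of scores are hypothetical, and the sketch
assumes equal population variances.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group1, group2):
    """Return (t, df) for a t-test for independent means (pooled variance)."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled estimate of the population variance, weighted by each sample's df
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard deviation of the distribution of differences between means
    standard_error = sqrt(pooled_var / n1 + pooled_var / n2)
    t = (mean(group1) - mean(group2)) / standard_error
    return t, df


# Two hypothetical, unrelated groups of 5 subjects each
t_ratio, df = independent_t([4, 6, 5, 7, 8], [2, 4, 3, 5, 6])
print(t_ratio, df)  # means 6 and 4, pooled variance 2.5, so t = 2.0 with df = 8
```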
Test Assumptions
The final factor that we need to consider is the set of assumptions of the test. All of the statistical tests of
means are parametric tests. All parametric tests assume that the populations have specific
characteristics and that samples are drawn under certain conditions. These characteristics and conditions
are expressed in the assumptions of the tests.
One-Sample Z Test
The assumptions of the one-sample Z test focus on sampling, measurement, and distribution. The
assumptions are listed below. One-sample Z tests are considered "robust" to violations of normal
distribution. This means that the assumption can be violated without serious error being introduced into
the test. The central limit theorem tells us that, if our sample is large, the sampling distribution of the
mean will be approximately normally distributed irrespective of the shape of the population distribution.
Knowing that the sampling distribution is approximately normal is what makes the one-sample Z
test robust to violations of the assumption of normal distribution.
• Interval or ratio scale of measurement (approximately interval)
• Random sampling from a defined population
• Characteristic is normally distributed in the population
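The central limit theorem argument can be illustrated with a small simulation. This is a sketch under
stated assumptions: the population is an exponential distribution (strongly skewed), the sample size of
50 and the number of samples are arbitrary choices, and the skewness helper is written here only for
the illustration.

```python
import random
from statistics import mean


def skewness(values):
    """Moment-based skewness: 0 for a symmetric distribution."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Individual scores drawn from a highly skewed population (mean = 1)
population_draws = [rng.expovariate(1.0) for _ in range(5000)]

# Means of 2000 samples of N = 50 drawn from the same population
sample_means = [mean(rng.expovariate(1.0) for _ in range(50)) for _ in range(2000)]

print("skewness of raw scores:  ", round(skewness(population_draws), 2))
print("skewness of sample means:", round(skewness(sample_means), 2))
```

The raw scores are strongly skewed, while the distribution of sample means is much closer to symmetric,
which is why the test tolerates a non-normal population when the sample is large.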
One-Sample t-Test
The assumptions of the one-sample t-test are identical to those of the one-sample Z test. The
assumptions are listed below. One-sample t-tests are considered "robust" to violations of normal
distribution. This means that the assumption can be violated without serious error being introduced into
the test.
• Interval or ratio scale of measurement (approximately interval)
• Random sampling from a defined population
• Characteristic is normally distributed in the population
t-Test for Dependent Means
The assumptions of the t-test for dependent means focus on sampling, research design, measurement,
and distribution. The assumptions are listed below. The t-test for dependent means is
typically considered "robust" to violations of normal distribution. This means that the assumption can
be violated without serious error being introduced into the test in most circumstances. However, if we are
conducting a one-tailed test and the data are highly skewed, this will cause a lot of error to be introduced
into our calculation of difference scores, which will bias the results of the test. In this circumstance,
a nonparametric test should be used.
• Interval or ratio scale of measurement (approximately interval)
• Random sampling from a defined population
• Samples or sets of data used to produce the difference scores are linked in the population through
  repeated measurement, natural association, or matching
• Scores are normally distributed in the population; difference scores are normally distributed
t-Test for Independent Means
The assumptions of the t-test for independent means focus on sampling, research design, measurement,
population distributions, and population variance. The assumptions are listed below. The t-test for
independent means is typically considered "robust" to violations of normal distribution. This means
that the assumption can be violated without serious error being introduced into the test in most
circumstances. However, if we are conducting a one-tailed test and the data are highly skewed, this will
cause a lot of error to be introduced into our test, and a nonparametric test should be used. The t-test for
independent means is not robust to violations of equal variance. Remember that the shape of the
sampling distribution is determined by the population variance (σ²) and the sample size. If
the population variances are not equal, then when we calculate the difference between sample
means, we do not have a sampling distribution with a predictable shape and cannot calculate
an accurate critical value of the t-distribution. This is a serious problem for our test.
Our alternatives when the assumption of equal variances has been violated are to use a
correction (available in the SPSS program) or to use a nonparametric test. How do we determine
whether this assumption has been violated? Conduct Levene's test (using SPSS).
• Interval or ratio scale of measurement (approximately interval)
• Random sampling from a defined population
• Samples are independent; no overlap between group members
• Scores are normally distributed in the population
• Population variances are equal
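The text does not name the correction for unequal variances. One widely used correction is Welch's
t-test, which leaves the variances unpooled and adjusts the degrees of freedom; the sketch below
assumes that is the kind of correction meant. The function name welch_t and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group1, group2):
    """Return (t, df) for Welch's t-test, which does not assume equal variances."""
    n1, n2 = len(group1), len(group2)
    v1 = variance(group1) / n1   # squared standard error of the first mean
    v2 = variance(group2) / n2   # squared standard error of the second mean
    t = (mean(group1) - mean(group2)) / sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical groups with clearly different spreads
t_ratio, df = welch_t([4, 6, 5, 7, 8], [1, 9, 3, 11, 6, 0])
print(round(t_ratio, 3), round(df, 2))
```

When the two sample variances and sample sizes are equal, the adjusted degrees of freedom reduce to
N₁ + N₂ - 2, the same value used by the pooled test; the more the variances differ, the smaller the
adjusted degrees of freedom become.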
One-Sample Z Test
Hypothesis
The one-sample Z test is used when we want to know whether our sample comes from a particular
population. For instance, we are doing research on data collected from successive cohorts of students
taking the Elementary Statistics class. We may want to know if this particular sample of college
students is similar to or different from college students in general. The one-sample Z test is used only
for tests of the sample mean. Thus, our hypothesis tests whether the average of our sample (M) suggests
that our students come from a population with a known mean (μ) or whether they come from a different
population.
The statistical hypotheses for one-sample Z tests take one of the following forms, depending on whether
your research hypothesis is directional or nondirectional. In the equations below, μ₁ refers to the
population from which the study sample was drawn; μ is replaced by the actual value of the population
mean.
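The hypothesis equations themselves did not survive in this copy of the text. The conventional forms
are sketched below in LaTeX as an assumption about what was shown; some textbooks write the directional
null hypothesis as an equality instead of an inequality.

```latex
\begin{align*}
&\text{Nondirectional:}               & H_0 &: \mu_1 = \mu,   & H_1 &: \mu_1 \neq \mu \\
&\text{Directional (sample higher):}  & H_0 &: \mu_1 \le \mu, & H_1 &: \mu_1 > \mu \\
&\text{Directional (sample lower):}   & H_0 &: \mu_1 \ge \mu, & H_1 &: \mu_1 < \mu
\end{align*}
```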
Study Design
The name of the one-sample Z test tells us the general research design of studies in which this statistic is
selected to test hypotheses. We use the one-sample Z test when we collect data on a single sample
drawn from a defined population. In this design, we have one group of subjects, collect data on these
subjects, and compare our sample statistic (M) to the population parameter (μ). The population
parameter tells us what to expect if our sample came from that population. If our sample statistic is very
different (beyond what we would expect from sampling error), then our statistical test allows us to
conclude that our sample came from a different population. Again, in the one-sample Z test, we are
comparing the mean (M) calculated on a single set of scores (one sample) to a known population mean
(μ).
Available Information
The one-sample Z test compares a sample to a defined population. When we say "defined" population,
we are saying that the parameters of the population are known. We typically define a population
distribution in terms of central tendency and variability/dispersion. Thus, for the one-sample Z test,
the population μ and σ must be known. The one-sample Z test cannot be done if we do not
have μ and σ. Population information is available in the technical manuals of measurement
instruments or in research publications. Population information for the attachment scales used in the
class dataset is available in the articles on reserve.
Test Assumptions
All parametric statistics have a set of assumptions that must be met in order to properly use the statistics
to test hypotheses. The assumptions of the one-sample Z test are listed below.
• Random sampling from a defined population
• Interval or ratio scale of measurement
• Population is normally distributed
When reading the psychological literature, we can find many studies in which all of these assumptions
are violated. Random sampling is required for all statistical inference because inference is based on
probability. Random samples are difficult to obtain, however, and psychologists and researchers in other
fields will use inferential statistics but discuss the sampling limitations in the article. We learned in our
scale of measurement tutorial that psychologists will apply parametric statistics like the Z test to
approximately interval scales even though the tests require interval or ratio data. This is an accepted
practice in psychology and one that we use when we analyze our class data. Finally, the Z test is
considered "robust" to violations of the assumption of normal distribution in the population. This means
that the statistic has been shown to yield useful results even when the assumption is violated. The central
limit theorem tells us that even if the shape of the population distribution is unknown, the sampling
distribution of the mean will be approximately normally distributed if the sample size is large. This
contributes to the Z test being robust to violations of normal distribution.
One-Sample t-Test
Hypothesis
The one-sample t-test is used when we want to know whether our sample comes from a
particular population but we do not have full population information available to us. For
instance, we may want to know if a particular sample of college students is similar to or
different from college students in general. The one-sample t-test is used only for tests of the
sample mean. Thus, our hypothesis tests whether the average of our sample (M) suggests
that our students come from a population with a known mean (μ) or whether they come from a
different population.
The statistical hypotheses for one-sample t-tests take one of the following forms, depending
on whether your research hypothesis is directional or nondirectional. In the equations
below, μ₁ refers to the population from which the study sample was drawn; μ is replaced by
the actual value of the population mean. The statistical hypotheses are identical to those used
for one-sample Z tests.
Study Design
The name of the one-sample t-test tells us the general research design of studies in which this statistic is
selected to test hypotheses. We use the one-sample t-test when we collect data on a single sample drawn
from a defined population. In this design, we have one group of subjects, collect data on these subjects,
and compare our sample statistic (M) to the population parameter (μ). The population parameter tells us
what to expect if our sample came from that population. If our sample statistic is very different (beyond
what we would expect from sampling error), then our statistical test allows us to conclude that our
sample came from a different population. Again, in the one-sample t-test, we are comparing the mean
(M) calculated on a single set of scores (one sample) to a known population mean (μ).
Available Information
The one-sample t-test compares a sample to a defined population. When we say "defined" population,
we are saying that the parameters of the population are known. We typically define a population
distribution in terms of central tendency and variability/dispersion. But for a one-sample t-test, only
the population μ needs to be known. The one-sample t-test cannot be done if we do not have μ. The
population σ is not required for the one-sample t-test. All t-tests estimate the population standard
deviation using sample data (S). Population means are available in the technical manuals of measurement
instruments or in research publications. Population information for the attachment scales used in the
class dataset is available in the articles on reserve.
Test Assumptions
All parametric statistics have a set of assumptions that must be met in order to properly use the statistics
to test hypotheses. The assumptions of the one-sample t-test are listed below. These assumptions are
identical to those of the one-sample Z test.
• Random sampling from a defined population
• Interval or ratio scale of measurement
• Population is normally distributed
When reading the psychological literature, we can find many studies in which all of these assumptions
are violated. Random sampling is required for all statistical inference because inference is based on
probability. Random samples are difficult to obtain, however, and psychologists and researchers in other
fields will use inferential statistics but discuss the sampling limitations in the article. We learned in our
scale of measurement tutorial that psychologists will apply parametric statistics like the one-sample
t-test to approximately interval scales even though the tests require interval or ratio data.
This is an accepted practice in psychology and one that we use when we analyze our class data. Finally,
the t-test is considered "robust" to violations of the assumption of normal distribution in the population.
This means that the statistic has been shown to yield useful results even when the assumption is violated.
The central limit theorem tells us that even if the shape of the population distribution is unknown, the
sampling distribution of the mean will be approximately normally distributed if the sample size is large.
This contributes to the t-test being robust to violations of normal distribution. There are
conditions we may encounter when we should not use the one-sample t-test. If we are
conducting a directional test and our sample data are highly skewed, we should consider a
nonparametric alternative.
t-Test for Dependent Means
Hypothesis
The t-test for dependent means is used when we want to know whether there is a difference between
populations when the data are "linked" or "dependent". For instance, we may want to know if using
tutorials in a statistics class improves knowledge. To assess this, we would have to know a student's
knowledge before using the tutorial and again after completing the tutorial. Thus, any data collected from
this student are "linked". The t-test for dependent means is used only for tests of the sample means.
Thus, our hypothesis tests whether the average difference between scores (M₁ - M₂) suggests that our
students come from a population where tutorials do not affect performance (μ₁ - μ₂ = 0) or whether
they come from a different population in which knowledge improves after using the tutorial.
The statistical hypotheses for t-tests for dependent means take one of the following forms, depending on
whether your research hypothesis is directional or nondirectional. In the equations below, μ₁ refers to the
pretest or Time 1 population from which the study sample was drawn; μ₂ refers to the posttest or Time 2
population.
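The hypothesis equations are missing from this copy of the text. The conventional forms are sketched
below in LaTeX as an assumption about what was shown; the directional null hypotheses are sometimes
written as equalities.

```latex
\begin{align*}
&\text{Nondirectional:}                   & H_0 &: \mu_1 - \mu_2 = 0,   & H_1 &: \mu_1 - \mu_2 \neq 0 \\
&\text{Directional (posttest higher):}    & H_0 &: \mu_1 - \mu_2 \ge 0, & H_1 &: \mu_1 - \mu_2 < 0 \\
&\text{Directional (posttest lower):}     & H_0 &: \mu_1 - \mu_2 \le 0, & H_1 &: \mu_1 - \mu_2 > 0
\end{align*}
```

For the tutorial example, improvement after the tutorial corresponds to the second form, because higher
posttest scores make the difference (pretest minus posttest) negative.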
Study Design
The t-test for dependent means requires a specific type of research design. We use the t-test for dependent
means when we collect data at two different times on a single sample drawn from a population or
when two different people are sampled as a pair because they are linked in some fashion in the
population. In this design, we have one group of subjects/paired subjects, collect data on these subjects
twice, compute the difference between pairs or pretest and posttest scores, and compare the average
sample difference (M_Diff) to the population parameter (μ_Diff). The population parameter tells us what to
expect if there was no effect or difference in the population. If our sample statistic is very different
(beyond what we would expect from sampling error), then our statistical test allows us to conclude that
our sample came from a population in which members of a pair were different or Time 1 and Time 2
scores were different. In the t-test for dependent means, we are comparing the mean difference (M₁ - M₂)
calculated on linked/dependent data to an expectation that there is no difference in the population
(μ₁ - μ₂ = 0).
Available Information
The t-test for dependent means compares the mean difference between sample scores that are linked by
the study design to an expectation about the difference in the population. For this test, we do not need to
know the population parameters. As long as the null hypothesis reflects no difference in the population,
then the value of μ₁ - μ₂ needed for our statistical hypothesis is known (0). In t-tests, we estimate the
population variances/standard deviations from sample data (S).
Test Assumptions
All parametric statistics have a set of assumptions that must be met in order to properly use the statistics
to test hypotheses. The assumptions of the t-test for dependent means are listed below.
• Random sampling from a defined population
• Interval or ratio scale of measurement
• Population difference scores (μ₁ - μ₂) are normally distributed
When reading the psychological literature, we can find many studies in which all of these assumptions
are violated. Random sampling is required for all statistical inference because inference is based on
probability. Random samples are difficult to obtain, however, and psychologists and researchers in other
fields will use inferential statistics but discuss the sampling limitations in the article. We learned in our
scale of measurement tutorial that psychologists will apply parametric statistics like the t-test to
approximately interval scales even though the tests require interval or ratio data. This is an accepted
practice in psychology and one that we use when we analyze our class data. Finally, the t-test is
considered "robust" to violations of the assumption that the difference scores are normally distributed in
the population. This means that the statistic has been shown to yield useful results even when the
assumption is violated. The central limit theorem tells us that even if the shape of the population
distribution is unknown, the sampling distribution of the mean will be approximately normally distributed
if the sample size is large. This also applies to the means of difference scores and contributes to the
t-test being robust to violations of normal distribution.