1 Review
1.10 Distributions
1.10.1 Skewness
1.11.1 Independent Random Samples from one Population: Continuous data
1.11.2 Independent Large Random Samples from one Population: Count data
1.11.3 Independent Small Random Samples from one Population: Count data
1.11.6 Large Random Samples from two Independent Populations: Count Data
1.11.7 Small Random Samples from two Independent Populations: Count Variables
1.12 Correlation
3 Non-parametric Methods
Chapter 1
Review
This chapter highlights the statistical prerequisite material required for successful completion of
the course. It provides an overview of some of the descriptive and inferential statistics topics that
will be used throughout the course and would have been taught in an undergraduate statistics
course.
• identify and calculate three measures of the central tendency (the mean, median, and mode)
of data;
• identify and calculate three measures of data dispersion (range, variance, and standard
deviation);
• identify and calculate two measures of data position (z-scores and percentile ranks);
• draw and interpret a box-and-whisker plot, a confidence interval plot, and a scatter plot;
• implement several hypothesis tests including z-tests for a single proportion and the
difference in two independent proportions; t-tests for a single mean, the difference in means
for two independent samples, and the difference in means for two dependent samples; an
F-test for two independent standard deviations; Levene’s test for equality of variances; the
Shapiro-Wilk W test for Normality; a test for significant correlation; and tests for regression
coefficients;
• determine graphically and through inference whether a data set is normally distributed;
• compute confidence intervals for various parameters including means and regression coef-
ficients;
• identify visually and through inference whether two variables are linearly related;
• use a sample regression line to make interpolations and extrapolations and provide the
associated confidence and prediction intervals; and
• draw public health conclusions based on the reviewed statistical inference topics.
• Observational studies
• Experimental design
• Lurking variables
• Measures of central tendency: mean, median, and mode
• Measures of dispersion: range, variance, and standard deviation
• Measures of position: z-scores, percentiles, and quartiles
• Hypothesis testing: significance level, Type I error, Type II error, power of a test, null and
alternative hypotheses, left-tail, right-tail, two-tail, p-value, reject region, test statistic, and
critical value
• Confidence intervals: confidence level, margin of error, and confidence interval width
• Distributions: Normal, Student-t, and F
• z-tests: for a single proportion and for the difference in two independent proportions
• t-tests: for a single mean, for the difference of two means from independent samples; and
for the difference of two means from dependent samples
• F-test and Levene’s Test for the equality of two variances
• QQ and PP plots for visualizing normality
• Shapiro-Wilk Test for normality
• Influential observations
• Outliers
• Scatter plot
• Pearson’s correlation coefficient
• Spearman’s rank correlation coefficient
• The Principle of Least-Squares
• Line of best-fit
• Coefficient of Determination
• Regression line
• Interpolation/extrapolation
• Confidence and prediction intervals for expected responses
1.3 Introduction to Statistics
Statistics is the science of collecting, organizing, summarizing and analyzing information in order
to draw conclusions. Biostatistics is the application of statistics to medicine and the biological
sciences. The two branches of Biostatistics on which we will be focusing are Descriptive Statistics
and Inferential Statistics. We will briefly discuss two other branches: Sampling Theory and
Experimental Design.
The area of Descriptive Statistics consists of organizing and summarizing the information col-
lected. Inferential Statistics uses methods that generalize results obtained from a sample to a
population and measure their reliability. The information required is either based on an entire
population or a sample from within the population. A population is the complete collection of
objects, subjects, units or individuals of interest in a study. A population can also be the complete
set of measurements or observations on these objects, subjects, units, or individuals. A sample
is a subset or part of a population.
Individuals or elements from a population are the objects described by a data set. They can
be people, animals, or things. Variables are the characteristics or attributes of the individuals
(elements) within the population and are capable of assuming any value within the data set. Data
consist of the values (measurements or observations) that the variables can assume. Variables
whose values are determined by chance are called random variables.
Discrete variables have a finite number of possible values or a countable number of possible
values.
– e.g. group size, number of steps you take in your life
Continuous variables can assume an infinite number of possible values between any two given
values and therefore can be measured to any desired level of accuracy.
– e.g. height, weight, time taken from sample collected in lab until patient receives the test
results
Nominal data are names of categories or characteristics or properties. Data are classified into
non-overlapping, exhaustive categories in which no order or rank can be imposed on the data.
– e.g. Classifying the residents of Saskatoon by their postal code; classifying the employees of
the university by their office telephone number; classifying the students in this PUBH 805 class
by their eye colour, their gender, by whether they are wearing socks or not.
Ordinal data are observations that can be ordered or ranked on some basis of magnitude. Note
that differences between ordinal values are NOT meaningful.
– e.g. Opinions about a concert may be summarized as terrible, okay, excellent; the ratings of
movies are ranked according to G, PG, PG13, and R; the alert level of an American airport may
be yellow, orange, or red.
Interval data are numerical observations such that equal differences in the numbers define equal
differences in magnitudes; there is no meaningful value of zero.
Ratio data are numerical observations such that ratios of numbers define ratios of magnitudes.
The value of zero does have meaning.
Data can be obtained from four different sources:
1. A census is a list of all individuals in a population along with certain characteristics of each
individual.
4. Survey sampling does not attempt to manipulate or influence the variable(s) of interest. It
leads to observational studies, which measure the characteristics of a population by studying
individuals in a sample. The value of the variable of interest has already been established.
NOTA BENE: Observational studies are very useful tools for determining whether there is a relation
between two variables, BUT a designed experiment is required to isolate the cause of the relation.
When implementing a statistical study, it is important to check whether the variable being
studied is influenced by underlying hidden variables. These underlying hidden variables
are sometimes called lurking variables.
2. How could we change the study so that causation rather than association can be concluded?
3. Identify a lurking variable and explain how it, rather than smoking, could plausibly be the
cause of lung cancer.
Observational studies are performed for two reasons:
Observational (or ex post facto) research is usually implemented in situations where the control of
certain variables is unethical or simply impractical (impossible). Experiments are used whenever
control of certain variables is desired (and morally/ethically allowed). Once one has identified the
source of the data, how does one actually form a sample?
A sample of size n from a population of size N is obtained through simple random sampling if
every possible sample of size n has an equally likely chance of occurring. The sample is then
called a simple random sample. To generate a simple random sample:
1. A convenience sample is a sample in which the individuals in the sample are easily obtained.
The most popular type of convenience sample is a self-selected sample (that is, the indi-
viduals volunteer to participate). Convenience sampling will generally yield results that are
suspect. Any results should be looked upon with extreme skepticism.
2. A systematic sample is a sample obtained by selecting every kth individual from the
population. The first individual selected is chosen using a random number between 1 and k.
3. A cluster sample is obtained by selecting all individuals within a randomly selected collection
or group of individuals.
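As a rough illustration of simple random sampling and systematic sampling, here is a minimal Python sketch. The population of 250 record IDs, the function names, the sample sizes, and the fixed seed are all assumptions made for the example; they do not come from the course material.

```python
import random

def simple_random_sample(population, n, rng):
    """Draw n individuals so that every subset of size n is equally likely."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Pick a random start among the first k individuals, then take every kth one."""
    start = rng.randint(1, k)          # random number between 1 and k (inclusive)
    return population[start - 1::k]    # convert the 1-based start to a 0-based index

rng = random.Random(805)               # fixed seed so the run is repeatable (arbitrary choice)
records = list(range(1, 251))          # hypothetical population: record IDs 1..250

srs = simple_random_sample(records, 25, rng)
sys_sample = systematic_sample(records, 10, rng)
print(sorted(srs))
print(sys_sample)
```

With k = 10 and 250 records, the systematic sample always contains 25 records, whichever starting point is drawn.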
For Discussion. A study of 300 households in a rural community revealed that 20 percent had
at least one school-age child living in the household. Describe how you would use a stratified
random sample to collect data from this group. BFAHS, p. 17. q. 8
For Discussion. A study of 250 patients admitted to a hospital during the past year revealed
that, on average, the patients lived within 15 miles of the hospital. Describe how you would use
systematic sampling of patient records to collect data from this group. BFAHS, p. 17. q. 8
A designed experiment is a controlled study in which one or more treatments are applied to
experimental units. An experimental unit is a person, object or some other well-defined item
upon which a treatment is applied. A treatment is a condition applied to the experimental
unit. A response variable is a qualitative or quantitative variable in which we are interested. A
predictor variable is a qualitative or quantitative variable that affects the response variable. It
may be controlled or uncontrolled.
A double-blind experiment is an experiment in which neither the experimental unit nor the ex-
perimenter knows what treatment is being administered to the experimental unit.
An experiment in which the different treatments are randomly assigned to the experimental units
is called a completely randomized design.
In a randomized block design, each experimental unit is subdivided into smaller blocks and treat-
ments are randomly assigned to each of these smaller blocks (within each original experimental
unit).
A matched-pairs design is a randomized block design in which the experimental units are somehow
related (i.e. the same person before and after a treatment, twins, husband/wife, etc.). There are
only TWO treatments in a matched-pairs design.
In a repeated measures design, multiple treatments in the same order are applied to each experi-
mental unit and the same variables are measured in the same order for each treatment on each
experimental unit.
A parameter is a descriptive measure of a population. It is obtained by using all the data from a
population. A statistic is a descriptive measure of a sample. It is obtained by using the data
from a sample. Usually we use Greek letters to represent parameters. One exception to this rule
is that we usually use N to represent the size of a population and n to represent the size of a sample.
The arithmetic mean (or average) of a variable is computed by determining the sum of all values
of the variable in the data set and then dividing this sum by the number of elements in the
data set. The population arithmetic mean, denoted μ, is computed using all the individuals in a
population. This would be an example of a parameter. The sample arithmetic mean, denoted x̄,
is computed using sample data. The sample mean is an example of a statistic.
\[ \mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum_{i=1}^{N} x_i}{N} \]
\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum_{i=1}^{n} x_i}{n} \]
The sample mean x̄ is an example of a statistic and it is quite often used to estimate the parameter
μ, the population mean. Why? Quite often it is unreasonable (if not impossible) to determine
the population mean μ. Using x̄ to estimate μ is an example from “inferential statistics” which
we will be discussing later in the course.
Both the population mean μ and the sample mean x̄ are measures of central tendency. μ is
the average or “central” value for the population and x̄ is the average or “central” value for the
sample. The arithmetic mean can be thought of as the balance point of the data: the “weight” to
its left balances the “weight” to its right.
For Discussion. Does the mean provide sufficient information about a population (sample)?
When is/isn’t it enough?
The mean (or average) is not the only measure of central tendency. Two other types of central
tendency are the median and mode.
The median of a variable is the value that lies in the middle of the data when arranged in ascending
order. That is, half of the actual data points lie below the median and half of the data are above
the median. We use M to represent the median value. If there is an odd number of elements in
this ordered set, then the median is the middle value. If there is an even number of elements in
this ordered set, then the median is the average of the two middle values.
The mode of a variable is the most frequent observation of the variable that occurs in the data
set. If no value in the data occurs more often than some other value in the data, there is no
mode. You can have two modes: bimodal; three modes: trimodal; etc.
For Discussion. Which measure of central tendency is the ”best”? Note the mode is the only
measure of central tendency that applies to qualitative data.
For Practice. Suppose we are interested in testing the effectiveness of a new type of antibiotic.
Three different types of bacteria are exposed to the drug and the survival time for a particular
bacteria culture is measured as the amount of time required to kill 50% of the cells in the petri
dish. The survival times for eight colonies of one particular bacteria culture are 1.1 hours, 1.2
hours, 1.5 hours, 1.7 hours, 1.9 hours, 1.1 hours, 1.3 hours, and 1.8 hours. Calculate the mean,
median, and mode.
Solution: The mean is 1.45 hours. The median is 1.4 hours. The mode is 1.1 hours.
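The solution above can be checked with a short Python sketch using the standard library's `statistics` module. The variable names are illustrative. Note that `statistics.multimode` returns every value when all values occur equally often, which differs from the "no mode" convention described above.

```python
import statistics

# Survival times (hours) for the eight colonies in the practice question
survival_hours = [1.1, 1.2, 1.5, 1.7, 1.9, 1.1, 1.3, 1.8]

mean = statistics.mean(survival_hours)        # sum of the values divided by n
median = statistics.median(survival_hours)    # average of the two middle values (n is even)
modes = statistics.multimode(survival_hours)  # list of the most frequent value(s)

print(round(mean, 2), round(median, 2), modes)
```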
In this section we are going to quantify how “spread out” (or dispersed) a set of values is. The simplest
measure of this variability among a set of data is the range. The range of a sample is simply the
difference between the largest and smallest values in the set.
For Discussion. Is the range a “good” measure of dispersion? Does the range describe well
how dispersed the data in a set are?
Because x̄ is a measure of the center of a data set, one method of determining the variability of
the individual data points xi about the center is their deviation from the mean, that is xi − x̄.
For Discussion. Is the total deviation $\sum_{i=1}^{n} (x_i - \bar{x})$ a good measure of dispersion? Why or why
not?
The sample variance of a variable is denoted s² and is the average of the squared deviations of
the observations about the sample mean, that is
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{1}{n - 1} \left[ \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n} \right]. \]
The variance (whether population or sample) has different units than the values used to compute
it. Hence the variance cannot be directly compared to the mean or the data used to compute
it. To solve this problem, we simply take the square root of the variance to get the standard
deviation. The population standard deviation is denoted σ and is defined to be σ = √(σ²). The
sample standard deviation is denoted s and is defined to be s = √(s²).
In order to give meaning to the magnitude of the standard deviation, it needs to be compared to
the mean. The coefficient of variation is the standard deviation divided by the mean times 100%.
For samples, it is (s / x̄) × 100% and for populations, it is (σ / μ) × 100% (the larger the
coefficient of variation, the more variation in the corresponding data).
For Practice. Calculate the sample range, variance, standard deviation, and coefficient of
variation for the survival times in the previous practice question.
Solution: The sample range is 0.8 hours. The sample variance is 0.10285714 hours². The sample
standard deviation is 0.32071349 hours. The sample coefficient of variation is 22.118171%.
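As a check on these values, here is a minimal Python sketch using the standard library. `statistics.variance` and `statistics.stdev` use the n − 1 divisor, matching the sample formulas above. The variable names are illustrative.

```python
import statistics

survival_hours = [1.1, 1.2, 1.5, 1.7, 1.9, 1.1, 1.3, 1.8]

data_range = max(survival_hours) - min(survival_hours)   # largest minus smallest value
variance = statistics.variance(survival_hours)           # sample variance, divisor n - 1
std_dev = statistics.stdev(survival_hours)                # square root of the sample variance
cv_percent = std_dev / statistics.mean(survival_hours) * 100  # coefficient of variation in %

print(round(data_range, 1), round(variance, 8), round(std_dev, 8), round(cv_percent, 6))
```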
Suppose one needs to know the position of a data point relative to the other points in the data set.
One measure of a data point’s position is the z-score which represents the number of standard
deviations that the data point is from the mean. It is obtained by subtracting the mean from
the data value and dividing this difference by the standard deviation. There is both a population
z-score and a sample z-score. Note that the z-score is unitless.
$z = \frac{x_i - \mu}{\sigma}$ and $z = \frac{x_i - \bar{x}}{s}$ are respectively the population and sample z-scores.
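The two z-score definitions can be sketched in Python as follows. The function names are hypothetical, and the example reuses the survival times from the earlier practice question to find how many sample standard deviations the longest survival time lies from the sample mean.

```python
import statistics

def sample_z_score(value, sample):
    """Number of sample standard deviations that value lies from the sample mean."""
    return (value - statistics.mean(sample)) / statistics.stdev(sample)

def population_z_score(value, mu, sigma):
    """Number of population standard deviations that value lies from mu."""
    return (value - mu) / sigma

survival_hours = [1.1, 1.2, 1.5, 1.7, 1.9, 1.1, 1.3, 1.8]
z_longest = sample_z_score(1.9, survival_hours)
print(round(z_longest, 3))
```

Because the units cancel in the division, the result is a pure number, as noted above.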
Another measure of a value’s position within a data set is the sample k’th percentile, denoted
Pk. Pk is a value such that after the data are ordered from smallest to largest, at least k% of
the observations are at or below the value Pk and at least (100-k)% are at or above the value
Pk. Special percentiles are the first, second, and third quartiles (respectively denoted Q1, Q2,
and Q3), which are simply the 25th, 50th, and 75th percentiles respectively.
How does one compute Pk from a data set with n observations? There are several different
methods. A simple method is as follows.
2. Calculate nk/100.
3. If nk/100 is not an integer, round it up to the next integer and find the corresponding
ordered value. If nk/100 is an integer, say I, calculate the average of the I’th and the
(I+1)’st ordered values.
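The simple method above might be sketched in Python as follows. The function name is hypothetical, k is restricted to whole numbers strictly between 0 and 100 to keep the integer check exact, and sorting the data first is assumed to be the step preceding the ones listed. As the SPSS note in this section warns, SPSS uses a different (weighted) rule, so its output need not match.

```python
def simple_percentile(data, k):
    """kth percentile (integer k, 0 < k < 100) using the simple rule described above."""
    if not (isinstance(k, int) and 0 < k < 100):
        raise ValueError("k must be an integer strictly between 0 and 100")
    if not data:
        raise ValueError("data must not be empty")
    ordered = sorted(data)                 # assumed first step: order the data
    n = len(ordered)
    whole, remainder = divmod(n * k, 100)  # nk/100 is an integer exactly when remainder == 0
    if remainder == 0:
        # average the I'th and (I+1)'st ordered values (1-based positions)
        return (ordered[whole - 1] + ordered[whole]) / 2
    position = whole + 1                   # round nk/100 up to the next integer
    return ordered[position - 1]

survival_hours = [1.1, 1.2, 1.5, 1.7, 1.9, 1.1, 1.3, 1.8]
for k in (25, 50, 90):
    print(k, simple_percentile(survival_hours, k))
```

For k = 50 this rule reproduces the median of the survival times found earlier.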
Note: SPSS does not use the above procedure. Instead of a simple average, SPSS uses a weighted
average. Percentiles you would compute using the above procedure will generally be close to, if not
exactly equal to, the values computed by SPSS.
Suppose we need to know the approximate percentile rank (the value of k) of a particular
observation X within a data set. We can find the approximate ranking by using the following
expression:
\[ k = \frac{(\text{number of observations less than } X) + 0.5}{n} \times 100\% \]
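A minimal Python sketch of the percentile rank calculation follows. It assumes the expression is the one used in the worked solution later in this section, namely k = (number of observations below X + 0.5)/n × 100%. The function name is hypothetical, and the data are the 20 systolic blood pressures from the practice question in this section.

```python
def percentile_rank(data, x):
    """Approximate percentile rank of x: (count below x + 0.5) / n * 100."""
    below = sum(1 for value in data if value < x)   # observations strictly less than x
    return (below + 0.5) * 100 / len(data)

systolic = [150, 141, 90, 108, 158, 119, 156, 114, 95, 97,
            145, 167, 144, 171, 132, 97, 163, 111, 186, 98]
print(percentile_rank(systolic, 163))
```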
We can use percentiles and z-scores to identify extreme observations. Any extreme observation
is referred to as an outlier. Any statistic that is heavily influenced by outliers is referred to as a
nonresistant statistic. Any statistic that is minimally influenced by outliers is a resistant statistic.
One method for checking a data set for outliers uses percentiles, as follows:
3. Determine the fences. Fences serve as the cutoff points for determining outliers. The
lower fence is computed using Q1 − 1.5(IQR) and the upper fence is computed using
Q3 + 1.5(IQR).
4. If a data value is less than the lower fence or greater than the upper fence, then it is
considered an outlier.
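The fence rule in steps 3 and 4 can be sketched in Python as below, assuming the usual definition IQR = Q3 − Q1. The function names and the nine-value data set are hypothetical; its quartiles (52 and 61) follow the simple percentile method described earlier and are passed in by the caller rather than computed here.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1                       # interquartile range
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    """Values strictly below the lower fence or strictly above the upper fence."""
    lower, upper = fences(q1, q3)
    return [value for value in data if value < lower or value > upper]

# Hypothetical data, made up for illustration only
sample = [2, 50, 52, 55, 58, 60, 61, 65, 140]
print(fences(52, 61))
print(find_outliers(sample, 52, 61))
```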
For Practice. The following are the systolic blood pressures of 20 men: 150 141 90 108 158
119 156 114 95 97 145 167 144 171 132 97 163 111 186 98.
2. Compute Q1 , Q2 , and Q3 .
Solution: The SPSS output which gives the solutions to (1) and (2) is
2. The first quartile is Q1 =100.500 mmHg. The second quartile is Q2 =136.500 mmHg.
3. The value for the lower fence is: Q1 − 1.5IQR = 15.000 mmHg. The value for the upper
fence is: Q3 + 1.5IQR = 243.000 mmHg. Because no value in the data file is below the lower
fence or above the upper fence, this data set has no outliers.
4. The percentile rank associated with 163 is k = (16 + 0.5)/20 × 100% = 82.5% which suggests
that the percentile rank of 163 is 83, i.e. P83 =163.
The three quartiles are used as part of the Five-Number Summary needed to draw a box-and-
whisker plot (also called a boxplot). A five-number summary for a data set consists of the
minimum value, Q1 , Q2 , Q3 , and the maximum value. If there are no outliers in the data, the
boxplots drawn via SPSS use this five-number summary. However, if there are outliers, SPSS uses
a slightly different method for drawing the boxplot.
For Practice. Draw a boxplot for the systolic blood pressures of 20 men example.
Solution: The boxplot for the systolic blood pressures of the 20 men is
1.10 Distributions
The distribution underlying a data set provides probabilistic information about the data. From
the distribution, one can compute the probability of different outcomes occurring.
Boxplots can be viewed as very coarse illustrations of the distributions associated with data
sets. These illustrations provide some useful information about the data:
1. If the median is near the center of the box and each horizontal line is approximately the
same length, then the distribution is roughly symmetric.
2. If the distance between the minimum value and the median is smaller than the distance from
the median to the maximum value, the distribution is skewed right.
3. If the distance between the minimum value and the median is greater than the distance from
the median to the maximum value, the distribution is skewed left.
4. If one wanted to compare the underlying distributions of two different data sets, one would
create a boxplot for both data sets and plot them one on top of the other on the same
horizontal scale.
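The first three rules can be roughly sketched in Python as below. This is a simplification: it compares only the minimum-to-median and median-to-maximum distances and ignores the position of the median within the box. The function name and the `tolerance` cutoff for "approximately the same length" are assumptions introduced for illustration.

```python
def boxplot_shape(minimum, median, maximum, tolerance=0.10):
    """Classify shape by comparing min-to-median and median-to-max distances.

    tolerance is a hypothetical cutoff: distances that differ by at most 10%
    of the larger one are treated as approximately the same.
    """
    left = median - minimum
    right = maximum - median
    if abs(right - left) <= tolerance * max(left, right):
        return "roughly symmetric"
    return "skewed right" if right > left else "skewed left"

print(boxplot_shape(90, 136.5, 186))   # systolic blood pressure example: min, Q2, max
print(boxplot_shape(10, 15, 60))       # hypothetical values with a long right tail
```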
Now Your Turn: BFAHS p. 61 q. 29 except do not draw the histogram, frequency polygon,
and stem-and-leaf plot.
Thilothammal et al. (A-19) designed a study to determine the efficacy of BCG (bacillus Calmette-
Guerin) vaccine in preventing tuberculous meningitis. Among the data collected on each study
was a measure of nutritional status. The data set in the textbook is for 107 cases. Using SPSS
to assist you, complete all parts of Question 29 except do not draw the histogram, frequency
polygon, and stem-and-leaf plot.
1.10.1 Skewness
In the previous discussion, we introduced the idea of the skewness of a distribution. Distributions
may be symmetric (the left half of the graph is the mirror image of the right half) or asymmetric
(the left half of the graph is not the mirror image of the right half). In the case where a
distribution is not symmetric, we say the distribution is skewed.
We say that a distribution is skewed to the left (or negatively skewed) if the tail of the distribution’s
left side is much longer than the tail of the right side (cf. the following figure).
We say that a distribution is skewed to the right (or positively skewed) if the tail of the distribu-
tion’s right side is much longer than the tail on the left side (cf. the following figure).
For Practice. Using the boxplot generated for the nutritional status of children example, describe
the shape of the distribution of the nutritional statuses of children.
Solution: The above box plot illustrates that there are three outliers. If the three outliers were
ignored, then the lower whisker (representing the distance between Q1 and the minimum) is about
the same size as the upper whisker (representing the distance between Q3 and the upper fence)
and the distance between Q1 and Q2 is approximately equal to the distance between Q2 and Q3 .
This implies the data is distributed symmetrically.
If the outliers were not ignored, then the distance from the minimum value to Q2 is less than
the distance between Q2 and the maximum value. This implies that the data is not distributed
symmetrically, but is rather skewed to the right (positively skewed).
A point estimate of a parameter is the value of a statistic that estimates the value of the
parameter. The sample mean x̄ is a point estimate of the population mean μ. The sample
standard deviation s is a point estimate of the population standard deviation σ.
1. A claim is made
To understand the rationale behind Hypothesis Testing, we will turn to our legal system. Consider
a court case where the defendant is charged with murder. The only person who truly knows
whether the defendant is innocent is the defendant. From the jury’s perspective, the defendant’s
innocence will never be known with absolute certainty. Two hypotheses are put forth to the jury:
In our legal system one is innocent until proven guilty. Therefore Ha (the alternative hypothesis)
is always what we are trying to “prove”.
In Statistics, we do not use a jury to choose between H0 and Ha ; we perform a hypothesis test
to determine whether there is enough support to conclude the alternative hypothesis. We either
do NOT reject H0 or we reject H0 .
If we do NOT reject H0 , IT DOES NOT MEAN THAT YOU BELIEVE H0 IS TRUE!!!! Not
rejecting H0 really means that you do not have enough evidence to conclude that H0 is false. In
terms of a court case, we do not have enough evidence to prove guilt beyond a reasonable doubt.
If a jury returns a verdict of NOT GUILTY, it does NOT imply the defendant was innocent.
If we reject H0 , then we are saying that there is enough evidence to conclude that the defendant
was guilty. We are NOT saying the defendant is guilty. Sometimes we can reject H0 when H0
was actually true. This is equivalent to sending an innocent man to jail. This is quite serious.
We call rejecting H0 when H0 was actually true a Type I Error. We denote the probability of a
Type I Error occurring by the symbol α.
α is also referred to as the significance level of a hypothesis test. Since Type I errors are very
serious, it is required that when implementing a Hypothesis Test, we minimize α.
The second type of error that can occur is a Type II Error. Basically, a Type II Error occurs when
you do NOT reject H0 when it is false (you let a guilty man go free). We use the symbol β to
represent the probability of a Type II Error occurring, P(Type II Error)=P(do NOT reject H0 |H0
is false)=β.
NOTE: For a fixed sample size, as we make α smaller, we make β larger.
In order to implement a hypothesis test, we need to calculate the value of a test statistic (a
numerical summary of a set of data that reduces the data set to a single value) and compare
the value of the test statistic to a critical value from a corresponding table of values OR we can
use the test statistic to compute the associated p-value for the problem and then compare the
p-value to the significance level.
The p-value is the probability, computed assuming the null hypothesis is true, of observing a
result at least as extreme as the one actually observed.
If the p-value < α, then we reject H0 . If the p-value > α, then we do not reject H0 .
There are three ways to set up the null and alternative hypotheses and calculate the associated
p-value:
1. H0 : parameter = some value; Ha : parameter ≠ some value (this is a two-tailed test).
Then p-value=2Prob(X > ts) or =2Prob(X < ts), whichever is smaller, where X is the random variable
representing the test statistic and ts is the calculated value of the test statistic.
2. H0 : parameter = some value; Ha : parameter < some value (this is a left-tailed test).
Then p-value=Prob(X < ts), where X is the random variable representing the test statistic and
ts is the calculated value of the test statistic.
3. H0 : parameter = some value; Ha : parameter > some value (this is a right-tailed test).
Then p-value=Prob(X > ts), where X is the random variable representing the test statistic and
ts is the calculated value of the test statistic.
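The three p-value rules can be sketched in Python as below. The function name and the `tail` labels are hypothetical. The example assumes a z test statistic, so it uses the standard normal CDF from the standard library's `statistics.NormalDist`; a t-test would need the CDF of the appropriate Student's t-distribution instead.

```python
from statistics import NormalDist

def p_value(ts, tail, cdf):
    """p-value for a test statistic ts, given the CDF of its distribution under H0."""
    lower = cdf(ts)          # Prob(X < ts)
    upper = 1.0 - lower      # Prob(X > ts)
    if tail == "left":
        return lower
    if tail == "right":
        return upper
    if tail == "two":
        return 2.0 * min(lower, upper)   # double the smaller tail probability
    raise ValueError("tail must be 'left', 'right', or 'two'")

standard_normal_cdf = NormalDist().cdf   # assumes a z test statistic
for tail in ("two", "left", "right"):
    print(tail, round(p_value(1.96, tail, standard_normal_cdf), 4))
```

The resulting p-value is then compared to the significance level α to decide whether to reject H0.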
1. Research Question
2. Population declarations
3. Hypothesis to be tested
4. Hypothesis Test to be used
9. The Conclusion
We will use this framework for each hypothesis test that we conduct throughout the rest of the
course.
Suppose a sample has been drawn from a single population and you want to extrapolate (i.e.
infer) properties about the population from this sample. If one wants to infer something regarding
a continuous variable, then one might attempt to use a t-test for a single mean or a confidence
interval for a true mean. If one wants to infer something regarding a count variable, then one might
attempt to use a z-test for a single proportion or a confidence interval for a true proportion.
Suppose we wish to know the population mean but it is not feasible to determine its exact value.
We use X̄ to estimate μ because
3. X̄ is an efficient estimator of μ, that is, in repeated samples, the majority of the sample
means will be “close” to the value of the population mean.
To make inferences about μ using a sample’s mean (when we do not know the population standard
deviation σ), we need to know if the following two conditions are both satisfied:
2. the population from which the sample is drawn is normally distributed OR the sample size
is greater than 29.
If the above two conditions are both satisfied, then we can use the t-test statistic
\[ t(df) = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
with df=n-1 degrees of freedom and the critical values tα(df) (or tα/2(df)) to implement a “one
sample t-test” and make inferences about the true mean of the population of interest.
Note that the t-test statistic t(df ) defined above will follow a Student’s t-distribution with df=n-1
and we determine the critical t-values using the Student’s t-distribution table.
We refer to a hypothesis test that uses this test statistic t(df) with df=n-1 and the Student’s
t-distribution to determine the critical values as a t-test for a single mean.
Note that in the following examples, the parts of the SPSS output table that are highlighted in yellow
contain the information we need. We ignore the rest of the output table.
For Practice. Each of 15 hypertension patients was administered several drugs on different
occasions. The results of concern are for a placebo drug compared with Inderal. Each patient
first took the placebo for one month. After the month, their systolic blood pressures were
recorded. They then stopped taking the placebo and started taking 120 mg of Inderal for one
month. After the month, their blood pressures were recorded. The data presented in the following
table are the systolic blood pressures measured.
Patient Placebo Inderal
1 175 176
2 199 181
3 180 146
4 180 140
5 164 127
6 174 139
7 195 129
8 204 133
9 205 194
10 180 169
11 195 186
12 161 158
13 164 141
14 190 150
15 178 164
For every question based on this scenario, you may assume that it is known that the systolic
blood pressures (of both the placebo and treatment group) are normally distributed and that the
patients selected were randomly chosen.
At the 5% level of significance, test whether the true average systolic blood pressure of those
people who took inderal is 160.
Solution: Note: at this point in the course, we are not going to worry about testing the
assumptions necessary for the results of this test to be valid. This will not always be the case.
The Research Question: Does the systolic blood pressure of those people who take inderal differ from
160?
Population Declarations:
Let Population 1 be the group of people who have hypertension. Then define μ to be the true
average systolic blood pressure of people in Population 1 after they take inderal.
Hypothesis to be tested:
H0 : μ = 160 (i.e. the true mean systolic blood pressure of those people who take inderal is
equal to 160 mmHg)
Ha : μ ≠ 160 (i.e. the true mean systolic blood pressure of those people who take inderal is not
equal to 160 mmHg)
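Hypothesis Test to be used: A t-test for a single mean
Assumptions required to implement the hypothesis test:
1. We assume that a simple random sample was obtained.
2. We assume that the systolic blood pressures of Population 1 after taking inderal are normally
distributed.
The Significance Level: α = 0.05
The Test Statistic and corresponding p-value: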
Based on the One-Sample Test table above, the test statistic is t = -0.796 with df =14 and the
corresponding p-value = 0.439. (Note SPSS indicates that this value is for a two-tailed test).
The Decision Rule: Since p-value = 0.439 > 0.05=α, we do not reject H0 .
The Conclusion: At the 5% level of significance, there is not enough evidence to conclude that
the true mean systolic blood pressure of people who take inderal differs from 160 mmHg (p-value
= 0.439). At the same level of significance, there is no evidence to reject the assumption that
the true mean systolic blood pressure of people who take inderal is 160 mmHg.
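As a cross-check on the SPSS figures, here is a minimal Python sketch that computes the one-sample t statistic from the Inderal column of the data table and approximates the two-tailed p-value. It uses only the standard library, so the t-distribution tail area is obtained by numerically integrating the density with Simpson's rule; the function names and the number of integration steps are assumptions, and this is not how SPSS computes the value. The results should come out close to the reported t = -0.796 and p-value = 0.439.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """One-sample t statistic (x_bar - mu0) / (s / sqrt(n)) and its degrees of freedom."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)                 # sample standard deviation (divisor n - 1)
    return (x_bar - mu0) / (s / math.sqrt(n)), n - 1

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """2 * Prob(T > |t|), integrating the density from 0 to |t| with Simpson's rule."""
    upper = abs(t)
    h = upper / steps                            # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3                 # Prob(0 < T < |t|)
    return 1.0 - 2.0 * central_area

inderal = [176, 181, 146, 140, 127, 139, 129, 133, 194, 169, 186, 158, 141, 150, 164]
t, df = t_statistic(inderal, 160)
p = two_tailed_p(t, df)
print(round(t, 3), df, round(p, 3))
```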
For Discussion. Suppose we wanted to know whether the true mean systolic pressure was less
than 160 mmHg. How would we modify the above solution?
Solution: Note: at this point in the course, we are not going to worry about testing the
assumptions necessary for the results of this test to be valid. This will not always be the case.
The Research Question: Is the systolic blood pressure of those people who take inderal less than 160?
Population Declarations:
Let Population 1 be the group of people who have hypertension. Then define μ to be the true
average systolic blood pressure of people in Population 1 after they take inderal.
Hypothesis to be tested:
H0 : μ = 160 (i.e. the true mean systolic blood pressure of those people who take inderal is
equal to 160 mmHg)
Ha : μ < 160 (i.e. the true mean systolic blood pressure of those people who take inderal is less
than 160 mmHg)
Hypothesis Test to be used: A t-test for a single mean
Assumptions required to implement the hypothesis test:
1. We assume that a simple random sample was obtained.
2. We assume that the systolic blood pressures of Population 1 after taking inderal are normally
distributed.
The Significance Level: α = 0.05
The Test Statistic and corresponding p-value:
Based on the One-Sample Test table above, the test statistic is t = -0.796 with df =14 and the
corresponding p-value = 0.439/2 (why?). (Note SPSS indicates that this value is for a two-tailed
test. There is no way to tell SPSS that you are doing a left-tailed test.)
The Decision Rule: Since p-value = 0.439/2=0.2195 > 0.05=α, we do not reject H0 .
The Conclusion: At the 5% level of significance, there is not enough evidence to conclude that
the true mean systolic blood pressure of people who take inderal is less than 160 mmHg (p-value
= 0.2195). At the same level of significance, there is no evidence to reject the assumption that
the true mean systolic blood pressure of people who take inderal is 160 mmHg.
For Discussion. Suppose we wanted to know whether the true mean systolic pressure was
greater than 160 mmHg. How would we modify the above solution?
Solution: Note: at this point in the course, we are not going to worry about testing the assump-
tions necessary for the results of this test to be valid. This will not always be the case. The
Research Question: Is the systolic blood pressure of those people who take inderal greater than
160?
Population Declarations:
Let Population 1 be the group of people who have hypertension. Then define μ to be the true
average systolic blood pressure of people in Population 1 after they take inderal.
Hypothesis to be tested:
H0 : μ = 160 (i.e. the true mean systolic blood pressure of those people who take inderal is
equal to 160 mmHg)
Ha : μ > 160 (i.e. the true mean systolic blood pressure of those people who take inderal is
greater than 160 mmHg)
Hypothesis Test to be used: A t-test for a single mean
Assumptions required to implement the hypothesis test:
1. We assume that a simple random sample was obtained.
2. We assume that the systolic blood pressures of Population 1 after taking inderal are normally
distributed.
The Significance Level: α = 0.05
The Test Statistic and corresponding p-value:
Based on the One-Sample Test table above, the test statistic is t = -0.796 with df =14 and the
corresponding p-value = 1-0.439/2=0.7805 (why?). (Note SPSS indicates that this value is for
a two-tailed test. There is no way to tell SPSS that you are doing a right-tailed test.)
The Decision Rule: Since p-value = 1-0.439/2=0.7805 > 0.05=α, we do not reject H0 .
The Conclusion: At the 5% level of significance, there is not enough evidence to conclude that the
true mean systolic blood pressure of people who take inderal is greater than 160 mmHg (p-value
= 0.7805). At the same level of significance, there is no evidence to reject the assumption that
the true mean systolic blood pressure of people who take inderal is 160 mmHg.
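The conversion from the two-tailed p-value that SPSS reports to a one-tailed p-value can be sketched in a few lines of Python. This is only an illustration: the function name `one_tailed_p` and its arguments are hypothetical and are not part of SPSS. It relies on the symmetry of the t-distribution, so half of the two-tailed p-value sits in the tail on the same side as the observed test statistic.

```python
def one_tailed_p(two_tailed_p, t_stat, alternative):
    """Convert a two-tailed t-test p-value into a one-tailed p-value.

    alternative is "less" (Ha: mu < mu0) or "greater" (Ha: mu > mu0).
    """
    if alternative not in ("less", "greater"):
        raise ValueError("alternative must be 'less' or 'greater'")
    half = two_tailed_p / 2
    # Does the observed statistic fall on the side named by Ha?
    if alternative == "less":
        in_ha_direction = t_stat < 0
    else:
        in_ha_direction = t_stat > 0
    # Same side as Ha: the one-tailed p-value is half the two-tailed value.
    # Opposite side: it is the complement of that half.
    return half if in_ha_direction else 1 - half


# Values from the inderal example: t = -0.796, two-tailed p-value = 0.439.
p_left = one_tailed_p(0.439, -0.796, "less")      # about 0.2195
p_right = one_tailed_p(0.439, -0.796, "greater")  # about 0.7805
```

Because t is negative here, the left-tailed test uses 0.439/2 while the right-tailed test uses 1 − 0.439/2.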
Confidence interval for the true mean
A statistic is an estimate of a parameter but, due to the random nature of the sample,
the statistic is very unlikely to be exactly the true value of the parameter. Suppose you wanted
to capture a plausible interval of values for the parameter. One can do this via a 100 ∙ (1 − α)%
confidence interval.
For Discussion. Recall 1 − α is the confidence one has in a statistical test. A 100 ∙ (1 − α)%
confidence interval is NOT an interval that contains the parameter with probability 1 − α. Why?
For Discussion. What is a 100 ∙ (1 − α)% confidence interval? How do you explain what it
means to a lay person?
If it is known that the data is normally distributed but the population standard deviation σ is
unknown, then a (1 − α) ∙ 100% confidence interval for μ is computed using
\[ \left( \bar{x} - t_{\alpha/2}(df)\,\frac{s}{\sqrt{n}},\;\; \bar{x} + t_{\alpha/2}(df)\,\frac{s}{\sqrt{n}} \right) \]
where tα/2 (df ) is referred to as a critical t-value determined from a Student's t-distribution with
df = n − 1 degrees of freedom. The quantity
\[ E = t_{\alpha/2}(df)\,\frac{s}{\sqrt{n}} \]
is referred to as the (1 − α) ∙ 100% error margin. The quantity s/√n is referred to as the standard
error of the sample mean x̄.
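As a minimal sketch of the interval calculation, the Python function below combines the sample mean, the standard error, and a critical t-value. The function name and the sample figures are hypothetical (they are not taken from the hypertension study), and the critical value is assumed to be read from a t-table because the Python standard library has no t-distribution.

```python
import math


def mean_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar -/+ t_crit * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)   # standard error of the sample mean
    margin = t_crit * standard_error    # the error margin E
    return x_bar - margin, x_bar + margin


# Hypothetical sample: n = 25, mean 100, standard deviation 15.
# Assumed table value: t_{0.025}(24) = 2.064 for a 95% interval.
lower, upper = mean_confidence_interval(100.0, 15.0, 25, 2.064)
```

A higher confidence level uses a larger critical value, so the interval widens while its center stays at the sample mean.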
For Practice. For the hypertension study, determine a 99% confidence interval for the true
mean systolic blood pressure of the placebo group.
Then the corresponding 99% confidence interval for the true mean systolic blood pressure of the
placebo group is 171.8312 to 194.0355.
(1) A group that opposes mandatory vaccination has suggested that one of the side effects of
the BCG vaccination is the underdevelopment (physically and cognitively) of the children who
have received the BCG vaccine. Suppose it is known that the nutritional status measure is a
good indicator of a child’s physical and cognitive development. If the average nutritional status
measure of those children who did not receive the BCG vaccine is 85.2, test the group’s claim at
the α=0.05 level of significance.
(2) Create a 90% confidence interval for the true mean nutritional status measure of children
who receive the BCG vaccination.
Solution: (1) Research Question: Is the average nutritional status of BCG vaccinated children
lower than the average nutritional status of non-BCG vaccinated children?
Population Declarations:
The population of interest is the group of children eligible for the BCG vaccination. Define μ to be
the true mean nutritional status measure of this population after they have the BCG vaccination.
Hypothesis to be tested:
H0 : μ = 85.2 (i.e. the true mean nutritional status measure of BCG vaccinated children is equal to 85.2)
HA : μ < 85.2 (i.e. the true mean nutritional status measure of BCG vaccinated children is less than 85.2)
Assumptions:
• The nutritional statuses of the population from which the sample is drawn are normally
distributed OR the sample size is greater than 29.
Based on the SPSS output above, the p-value < 0.001/2 = 0.0005 (because it is a 1-tailed test).
The Decision: Since the p-value < 0.0005 < 0.05, we reject H0 .
The Conclusion: At α= 0.05 level of significance (with p-value < 0.0005), we have evidence to
conclude that the average nutritional status of BCG vaccinated children is lower than 85.2, the
average nutritional status of non-BCG vaccinated children.
Solution: (2) According to the below SPSS output, a 90% confidence interval for the true average
nutritional status measure is (72.18, 77.28).
Once one has computed an appropriate (1 − α) ∙ 100% confidence interval, we can use it to test
a hypothesis about the true mean (in the situation in which the data is normally distributed and
the sample standard deviation is known). How?
If we wish to implement a two-tailed hypothesis test at the significance level α, we first compute
a (1 − α) ∙ 100% confidence interval. If the hypothesized mean value lies in the interval, we do
not reject H0 ; otherwise we reject H0 .
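To implement a one-tailed hypothesis test at the significance level α, we instead compute
a (1 − 2α) ∙ 100% confidence interval. To perform a right-tailed test, if the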
hypothesized mean value is less than the lower boundary of the (1 − 2α) ∙ 100% confidence
interval, we reject H0 ; otherwise we do not reject H0 . To perform a left-tailed test, if the
hypothesized mean value is greater than the upper boundary of the (1 − 2α) ∙ 100% confidence
interval, we reject H0 ; otherwise we do not reject H0 .
For example, suppose we read in a journal that a 90% confidence interval for the true average is
(150,170).
(1) Suppose we need to test if the true average is 160 against the alternative it is not (using a 10%
level of significance). This is a two-tailed test. We can use a 90% CI (Since 100%-90%=10%,
our level of significance) to directly test our hypotheses. The reject region associated with this
test will then be best described by any value greater than 170 or any value less than 150. The
do not reject region would be characterized by values between 150 and 170. In other words if
our hypothesized average of 160 is greater than 170 or less than 150, we would reject our null
hypothesis and have evidence to conclude the alternative hypothesis. For the value I provided
(160), 160 lies between 150 and 170. Hence at the 10% level of significance, we would not reject
our null hypothesis that the true mean is 160.
(2) Suppose we need to test if the true average is 160 against the alternative it is actually greater
than 160 (using a 5% level of significance). This is a right-tailed test. We can use a 90% CI (Since
100%-90%=10%, and for a two-tailed test, half of this value (i.e. 5%, our level of significance)
is the area of the reject region in the right tail and the other half is the area of the reject region
in the left tail) to directly test our hypotheses. The reject region associated with this test will
then be best described by any value less than 150. Notice the reject region is in the opposite tail
than what we might have expected. The do not reject region would be characterized by values
greater than 150. In other words if our hypothesized average of 160 is less than 150, we would
reject our null hypothesis and have evidence to conclude the alternative hypothesis. For the value
I provided (160), 160 is greater than 150. Hence at the 5% level of significance, we would not
reject our null hypothesis that the true mean is 160.
(3) Suppose we need to test if the true average is 160 against the alternative it is actually less
than 160 (using a 5% level of significance). This is a left-tailed test. We can use a 90% CI (Since
100%-90%=10%, and for a two-tailed test, half of this value (i.e. 5%, our level of significance) is
the area of the reject region in the right tail and the other half is the area of the reject region in
the left tail) to directly test our hypotheses. The reject region associated with this test will then
be best described by any value greater than 170. Notice the reject region is in the opposite tail
than what we might have expected. The do not reject region would be characterized by values
less than 170. In other words if our hypothesized average of 160 is greater than 170, we would
reject our null hypothesis and have evidence to conclude the alternative hypothesis. For the value
I provided (160), 160 is less than 170. Hence at the 5% level of significance, we would not reject
our null hypothesis that the true mean is 160.
This process can be used to implement a hypothesis test using a confidence interval, regardless
of the confidence interval and its associated parameter.
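The three decision rules can be collected into one small Python sketch. The function name `ci_decision` and its labels are hypothetical. The caller is assumed to supply a (1 − α) ∙ 100% interval for a two-tailed test and a (1 − 2α) ∙ 100% interval for a one-tailed test, as described in the examples.

```python
def ci_decision(lower, upper, mu0, alternative):
    """Decide a hypothesis test about mu0 from a confidence interval.

    alternative: "two-sided" (Ha: mu != mu0), "greater" (Ha: mu > mu0),
    or "less" (Ha: mu < mu0).
    """
    if alternative == "two-sided":
        reject = mu0 < lower or mu0 > upper
    elif alternative == "greater":
        # Right-tailed test: reject when mu0 falls below the interval.
        reject = mu0 < lower
    elif alternative == "less":
        # Left-tailed test: reject when mu0 falls above the interval.
        reject = mu0 > upper
    else:
        raise ValueError("unknown alternative")
    return "reject H0" if reject else "do not reject H0"


# The journal example: 90% CI (150, 170), hypothesized mean 160.
decisions = {
    alt: ci_decision(150, 170, 160, alt)
    for alt in ("two-sided", "greater", "less")
}
```

For the interval (150, 170) and a hypothesized mean of 160, all three calls return "do not reject H0", matching examples (1) to (3).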
We can depict the information captured in a confidence interval via a confidence interval plot.
This plot is formed using the lower and upper boundaries of the confidence interval and the point
estimate for the parameter estimated via the confidence interval.
For Practice. Draw a 95% confidence interval plot for the true mean systolic blood pressure of
patients after they have taken inderal.
Solution:
The circle represents the point estimate for the parameter of interest (in this case the true mean
systolic blood pressure of patients who take inderal). The lower and upper whiskers respectively
represent the lower and upper boundaries of the calculated confidence interval.
For Discussion. How could we use this plot to test the hypotheses:
(a) the true mean systolic blood pressure is 160 mmHg? 140 mmHg? 170 mmHg?
(b) the true mean systolic blood pressure is less than 160 mmHg? 140 mmHg? 170 mmHg?
(c) the true mean systolic blood pressure is higher than 160 mmHg? 140 mmHg? 170 mmHg?
Now Your Turn: The box plot and confidence interval plot below are both summarizing the
systolic blood pressures of patients who took inderal.
(a) Which plot would you use to see if the data is normally distributed? Why?
(b) From these plots, is there evidence the true mean blood pressure is equal to the true median
blood pressure?
1.11.2 Independent Large Random Samples from one Population: Count data
For Practice. In a poll conducted May 7-10, 2000, by ABC News, a simple random sample of
1068 American adults was asked “Have you ever been shot at?”. Of the 1068 American adults
surveyed, 96 responded yes. Obtain a point estimate for the population proportion of American
adults who have been shot at.
Solution: An estimate for the true proportion of American adults who have been shot at is
π̂ = 96/1068 ≈ 0.0899.
Theorem 1.11.2 (Sampling distribution of π̂) For a simple random sample of size n such
that n ≤ 0.05N (where N is the population size), the sampling distribution of π̂ is approximately
normal with mean μ_π̂ = π and standard deviation
\[ \sigma_{\hat{\pi}} = \sqrt{\frac{\pi(1-\pi)}{n}}, \]
provided nπ(1 − π) ≥ 10.
1. a simple random sample is obtained and
2. nπ0 (1 − π0 ) ≥ 10 and n ≤ 0.05N (where n is the sample size and N is the population
size).
If the above two conditions are both satisfied then we can modify the three z hypothesis tests
by replacing μ with π and μ0 with π0 in the hypotheses, and by replacing the test statistic
\[ z = \frac{\bar{x} - \mu_0}{\sigma_0/\sqrt{n}} \]
with the new test statistic
\[ z = \frac{\hat{\pi} - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}}, \]
where π̂ = x/n and x is the number of individuals in the sample with the specified
characteristic. The other steps remain the same.
We refer to a hypothesis test that uses the test statistic
\[ z = \frac{\hat{\pi} - \pi_0}{\sqrt{\dfrac{\pi_0(1-\pi_0)}{n}}} \]
and the Standard Normal Distribution to determine the critical values as “a large sample
z−test for a single proportion”.
For Practice. The drug Prevnar is a vaccine meant to prevent meningitis. It is typically
administered to infants. In clinical trials, the vaccine was given to 710 randomly sampled infants
between 12 and 15 months of age. Of the 710 infants, 121 experienced a decrease in appetite.
Is there significant evidence (at the 1% level of significance) to conclude that the proportion
of infants who receive Prevnar and experience a decrease in appetite is larger than 0.135, the
proportion of infants who experience a decrease in appetite because of competing medications?
Solution:
1. Research Question: Do more infants experience a decrease in appetite after being vaccinated
with Prevnar when compared to infants who received an existing vaccine?
2. Population declarations: The population being studied is the group of infants between 12 and
15 months of age. Let π be the true proportion of infants from this population who experience
a decrease in appetite after being vaccinated with Prevnar.
3. Hypothesis to be tested:
H0 : π = 0.135
HA : π > 0.135
(b) While we don’t know the population size, we will assume 710 is less than 5% of the population
size. Note
(710)(0.135)(1 − 0.135) ≈ 82.9 > 10.
Hence we can use the large sample test.
9. The Conclusion: At the 1% level of significance, we have evidence to conclude that the true
proportion of infants who experience a decrease in appetite after being vaccinated with Prevnar
is greater than 0.135.
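The arithmetic behind a large sample z-test for a single proportion can be sketched in Python with the Prevnar counts (121 of 710 infants, π0 = 0.135). The function name is hypothetical, and the values it produces are computed here for illustration rather than quoted from any SPSS output.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(x, n, pi0):
    """Right-tailed large sample z-test for H0: pi = pi0 vs HA: pi > pi0."""
    if n * pi0 * (1 - pi0) < 10:
        raise ValueError("large sample condition n*pi0*(1 - pi0) >= 10 fails")
    pi_hat = x / n                                   # sample proportion
    standard_error = math.sqrt(pi0 * (1 - pi0) / n)  # uses pi0, not pi_hat
    z = (pi_hat - pi0) / standard_error
    p_value = 1 - NormalDist().cdf(z)                # area to the right of z
    return z, p_value


z, p_value = one_proportion_z_test(121, 710, 0.135)
reject_h0 = p_value < 0.01   # compare with the 1% significance level
```

With these counts the test statistic is roughly 2.76 and the right-tailed p-value is below 0.01, which agrees with the decision to reject H0.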
Now Your Turn: Suppose the current president heard the statistic regarding the proportion of
Americans who have been shot at and dismissed the evidence. He first claimed that the survey was
too small to be used to draw any country wide statement. He then claimed that the proportion
was an anomaly and in fact this statistic was so small that it is essentially zero. Discuss the
validity of the “logic” for both his claims.
Solution:
A confidence interval for a true proportion
Theorem 1.11.3 Suppose a simple random sample of size n is taken from a population. A
(1 − α) ∙ 100% confidence interval for π is given by
\[ \left( \hat{\pi} - z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}},\;\; \hat{\pi} + z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}} \right) \]
where nπ̂(1 − π̂) ≥ 10 must be true.
For Practice. For the above poll, compute a 95% confidence interval for the population pro-
portion π that have been shot at.
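One way to carry out this calculation is sketched below in Python, using only the standard library. The function name is hypothetical. The critical value z_{α/2} is obtained from `statistics.NormalDist`, and the large sample condition from Theorem 1.11.3 is checked before the interval is formed.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence=0.95):
    """Large sample confidence interval for a population proportion."""
    pi_hat = x / n
    if n * pi_hat * (1 - pi_hat) < 10:
        raise ValueError("condition n*pi_hat*(1 - pi_hat) >= 10 is not met")
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    margin = z_crit * math.sqrt(pi_hat * (1 - pi_hat) / n)
    return pi_hat - margin, pi_hat + margin


# Poll data: 96 of 1068 adults answered yes.
lower, upper = proportion_confidence_interval(96, 1068)
```

For the poll data the interval runs from roughly 0.073 to 0.107.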
The above are inference techniques for large samples but what happens if one needs to make
inferences for π when nπ̂(1 − π̂) < 10? As long as n ≥ 10, one can use the “Plus Four”
confidence interval to determine a (1 − α) ∙ 100% confidence interval for π. If the point
estimate is
\[ \tilde{\pi} = \frac{x+2}{n+4} \]
and the standard error is
\[ S.E. = \sqrt{\frac{\tilde{\pi}(1-\tilde{\pi})}{n+4}}, \]
then the (1 − α) ∙ 100% confidence interval for π is
\[ \left( \tilde{\pi} - z_{\alpha/2}\sqrt{\frac{\tilde{\pi}(1-\tilde{\pi})}{n+4}},\;\; \tilde{\pi} + z_{\alpha/2}\sqrt{\frac{\tilde{\pi}(1-\tilde{\pi})}{n+4}} \right) \tag{1.1} \]
If one wishes to implement a hypothesis test via a test statistic and critical value, then one will have
to use an appropriate non-parametric test.
For Practice. In a hospital emergency room, all staff (nurses and doctors) use the same central
computer. Of 97 staff members who were observed to have used the computer, only 7 washed
their hands after they finished using the computer. Determine a 95% confidence interval for the
true proportion of staff who wash their hands after using the computer.
Solution:
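A minimal Python sketch of the “Plus Four” calculation for the hand-washing data (x = 7, n = 97) follows. The function name is hypothetical, and the figures it produces are computed here for illustration. Note that nπ̂(1 − π̂) = 97(7/97)(90/97) ≈ 6.5 < 10, so the large sample interval does not apply.

```python
import math
from statistics import NormalDist


def plus_four_interval(x, n, confidence=0.95):
    """'Plus Four' confidence interval for a proportion (requires n >= 10)."""
    if n < 10:
        raise ValueError("the Plus Four interval requires n >= 10")
    pi_tilde = (x + 2) / (n + 4)                     # adjusted point estimate
    standard_error = math.sqrt(pi_tilde * (1 - pi_tilde) / (n + 4))
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # z_{alpha/2}
    margin = z_crit * standard_error
    return pi_tilde - margin, pi_tilde + margin


# Hand-washing data: 7 of 97 staff washed their hands.
lower, upper = plus_four_interval(7, 97)
```

Under these assumptions the 95% interval is roughly 0.034 to 0.145.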
t-test for the difference in two independent means with unequal variances
Suppose X1 , ..., Xn1 is a random sample of size n1 taken from Population 1 whose mean is μ1 ;
Y1 , ..., Yn2 is a random sample of size n2 taken from Population 2 whose mean is μ2 ; and the
samples from the two populations are independent. In order to form a (1 − α) ∙ 100% confidence
interval for, and test a hypothesis regarding, μ1 − μ2 , we need to know the following are all
satisfied:
1. a simple random sample is obtained.
2. The populations from which both samples are drawn are normally distributed OR both sample
sizes are large (n1 ≥ 30 and n2 ≥ 30).
There is actually no exact solution to the situation where σ1 and σ2 are unknown and σ1 ≠ σ2 .
We can use an approximate solution, that is we can use Welch’s approximate t-distribution, to
form a (1 − α) ∙ 100% confidence interval for, and test a hypothesis regarding, μ1 − μ2 , if we
know the following are satisfied:
2. The populations from which both samples are drawn are normally distributed.
If the above four conditions are simultaneously true, then a (1 − α) ∙ 100% confidence interval
for μ1 − μ2 is given by
\[ \left( \bar{x}_1 - \bar{x}_2 - t_{\alpha/2}(df)\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}},\;\; \bar{x}_1 - \bar{x}_2 + t_{\alpha/2}(df)\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}} \right) \]
where tα/2 (df ) is referred to as a critical t-value from a Student’s t-distribution with df=min(n1 , n2 )
degrees of freedom. The test statistic is
\[ t = \frac{\bar{x}_1 - \bar{x}_2 - k}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}} \]
with df=min(n1 , n2 ) degrees of freedom.
We will use the Student’s t-distribution to determine the critical values. The other steps remain
the same. We refer to a hypothesis test that uses the test statistic
\[ t = \frac{\bar{x}_1 - \bar{x}_2 - k}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}} \]
with df=min(n1 , n2 ) degrees of freedom and the Student’s t-Distribution to determine the critical
values as a t-test for the difference in two independent means with unequal variances.
Note the degrees of freedom in the above discussion is very conservative. SPSS calculates the
degrees of freedom to be:
\[ df = \frac{\left(\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1-1}+\dfrac{\left(s_2^2/n_2\right)^2}{n_2-1}}. \]
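The unequal-variances test statistic and the degrees of freedom that SPSS reports can be sketched together in Python. The function name and the summary statistics below are hypothetical, and k is assumed to be the hypothesized value of μ1 − μ2 (zero when testing for equal means).

```python
import math


def welch_t_and_df(x_bar1, s1, n1, x_bar2, s2, n2, k=0.0):
    """Test statistic and approximate df for unequal variances."""
    v1 = s1 ** 2 / n1   # s1^2 / n1
    v2 = s2 ** 2 / n2   # s2^2 / n2
    t = (x_bar1 - x_bar2 - k) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two samples of size 10.
t_stat, df = welch_t_and_df(10.0, 4.0, 10, 12.0, 4.0, 10)
```

With equal sample sizes and equal sample standard deviations the approximate df reduces to n1 + n2 − 2 (18 here). In general it is not an integer, which is why SPSS reports values such as df=26.671.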
For Practice. BFAHS p185, q. 6.4.10: In a study of factors thought to be responsible for the
adverse effects of smoking on human reproduction, cadmium level determinations (nanograms
per gram) were made on placenta tissue of a random sample of 14 mothers who were smokers
and an independent random sample of 18 nonsmoking mothers. The data is summarized below:
1. At the α = 0.10 level of significance, test the claim that the mean cadmium level is higher
among smokers than nonsmokers.
2. Determine a 95% confidence interval for the true difference in the mean cadmium levels
between the two groups.
Solution: (1) Note: at this point in the course, we are not going to worry about testing the
assumptions necessary for the results of this test to be valid. This will not always be the case.
Research Question: Is the cadmium level higher in smoking mothers than in nonsmoking mothers?
Population Declarations:
Let population 1 be the group of all non-smoking mothers. Then define μ1 to be the true mean
cadmium level in population 1.
Let population 2 be the group of all smoking mothers. Then define μ2 to be the true mean
cadmium level in population 2.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean cadmium level of population 1 is equal to the true mean
cadmium level of population 2)
HA : μ1 < μ2 (i.e. the true mean cadmium level of population 1 is less than the true mean
cadmium level of population 2)
Hypothesis Test to be used: t-test for the difference in two independent means with unequal
variances
The Test Statistic and corresponding p-value: Based on the following output from SPSS,
the value of the test statistic is t=-2.438 with df=26.671 degrees of freedom. The corresponding
p-value = 0.022/2 = 0.011 (since we are implementing a 1-tailed test).
The Decision Rule: Since the p-value = 0.011 < 0.10 = α, we reject H0 .
Conclusion: At the 10% level of significance (with equal variances not assumed) there is evidence
to conclude that the true mean cadmium level is lower in the group of mothers who do not smoke
than in the group of mothers who do smoke (p-value = 0.011).
(2)
Based on the Independent Samples Test table above, the estimated 95% confidence interval for
the true mean difference in cadmium levels between the two groups is -10.48540 to -0.89872.
Hypothesis testing via confidence intervals
One can use (1 − α) ∙ 100% confidence intervals to visually determine whether it is plausible that
two populations have the same mean. How? On the same graph, plot (1 − α) ∙ 100% confidence
intervals for each sample. If the sample mean of one group lies within the confidence interval
for the other group and vice versa, then it is plausible (at the α significance level) that the two
populations have the same mean. If the two confidence intervals do not overlap, then, at the α
level of significance, it is plausible that the two populations have different means.
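The visual rule can be written as a short Python sketch. The function name, the return labels, and the numbers are hypothetical. The text does not say what to conclude when the intervals overlap but at least one sample mean falls outside the other interval, so that case is labelled "inconclusive" here as an assumption.

```python
def compare_groups(mean1, ci1, mean2, ci2):
    """Apply the visual confidence interval rule to two groups.

    ci1 and ci2 are (lower, upper) tuples at the same confidence level.
    """
    lower1, upper1 = ci1
    lower2, upper2 = ci2
    # No overlap: one interval ends before the other begins.
    if upper1 < lower2 or upper2 < lower1:
        return "plausibly different means"
    # Each sample mean lies inside the other group's interval.
    if lower2 <= mean1 <= upper2 and lower1 <= mean2 <= upper1:
        return "plausibly equal means"
    # Assumption: the rule in the text is silent on this case.
    return "inconclusive"


# Hypothetical intervals for two groups.
separate = compare_groups(10.0, (8.0, 12.0), 20.0, (17.0, 23.0))
nested = compare_groups(10.0, (8.0, 12.0), 11.0, (9.0, 13.0))
partial = compare_groups(10.0, (8.0, 12.0), 13.0, (11.0, 15.0))
```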
For Practice. Based on the below 80% confidence intervals for the true mean cadmium levels
for the two groups, is there evidence to conclude the true mean cadmium levels for the two groups
differ?
Solution: The two confidence intervals plotted do not overlap. Consequently we would believe
that, at the α = 0.20 level of significance, the true mean cadmium levels for smokers and non-
smokers differ.
NOTA BENE: Boxplots are used to illustrate the distribution of the data. Confidence interval
plots do not illustrate this distribution. Confidence interval plots can be used to visualize whether
means are equal and whether it is reasonable that the variances of different populations are equal.
Now Your Turn: The total cholesterol levels (mg/dl) for 133 randomly selected hypertensive pa-
tients and 41 randomly selected normotensive patients were collected. The data for this problem
is available online at the textbook’s website (cf. Chapter 7, Section 3, Exercise 4). Assume that
the total cholesterol levels of both populations are normally distributed. From a 95% confidence
interval plot, would you conclude that the true mean cholesterol levels of hypertensive patients
equals the true mean cholesterol levels of normotensive patients?
t-test for the difference in two independent means with equal variances
When using a statistical package, one might also see the t-test “testing two means from
independent samples with equal variances”. The (1 − α) ∙ 100% confidence interval computed
is
\[ \left( \bar{x}_1 - \bar{x}_2 - t_{\alpha/2}(df)\, s_{pooled}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}},\;\; \bar{x}_1 - \bar{x}_2 + t_{\alpha/2}(df)\, s_{pooled}\sqrt{\frac{1}{n_1}+\frac{1}{n_2}} \right) \]
and the test statistic is
\[ t = \frac{\bar{x}_1 - \bar{x}_2 - k}{s_{pooled}\sqrt{\dfrac{1}{n_1}+\dfrac{1}{n_2}}} \]
where tα/2 (df ) is referred to as a critical t-value from a Student’s t-distribution with df=n1 +n2 −2
degrees of freedom.
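A minimal Python sketch of the equal-variances calculation is given below. The text does not define s_pooled, so the sketch assumes the standard pooled standard deviation, the square root of ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2). The function name and the summary statistics are hypothetical.

```python
import math


def pooled_t(x_bar1, s1, n1, x_bar2, s2, n2, k=0.0):
    """Equal-variances t statistic, its df, and the pooled sd."""
    df = n1 + n2 - 2
    # Assumed definition: the standard pooled standard deviation.
    s_pooled = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    t = (x_bar1 - x_bar2 - k) / (s_pooled * math.sqrt(1 / n1 + 1 / n2))
    return t, df, s_pooled


# Hypothetical summary statistics for two samples of size 10.
t_stat, df, s_pooled = pooled_t(10.0, 4.0, 10, 12.0, 4.0, 10)
```

When the two sample standard deviations are equal, s_pooled equals that common value, and with equal sample sizes the statistic coincides with the unequal-variances statistic.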
For Practice. Repeat the above cadmium level example (both the hypothesis test and confidence
interval using the level of significance in the original question) assuming both populations have
the same variance.
Solution: (1) Note: at this point in the course, we are not going to worry about testing the
assumptions necessary for the results of this test to be valid. This will not always be the case.
Research Question: Is the cadmium level higher in smoking mothers than in nonsmoking mothers?
Population Declarations:
Let population 1 be the group of all non-smoking mothers. Then define μ1 to be the true mean
cadmium level in population 1.
Let population 2 be the group of all smoking mothers. Then define μ2 to be the true mean
cadmium level in population 2.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean cadmium level of population 1 is equal to the true mean
cadmium level of population 2)
HA : μ1 < μ2 (i.e. the true mean cadmium level of population 1 is less than the true mean
cadmium level of population 2)
Hypothesis Test to be used: t-test for the difference in two independent means with equal
variances
The Test Statistic and corresponding p-value: Based on the following output from SPSS,
the value of the test statistic is t=-2.468 with df=30 degrees of freedom. The corresponding p-value
= 0.020/2 = 0.010 (since we are implementing a 1-tailed test).
The Decision Rule: Since the p-value = 0.010 < 0.10 = α, we reject H0 .
Conclusion: At the 10% level of significance (with equal variances assumed) there is evidence
to conclude that the true mean cadmium level is lower in the group of mothers who do not smoke
than in the group of mothers who do smoke (p-value = 0.010).
(2)
Based on the Independent Samples Test table above, the estimated 95% confidence interval for
the true mean difference in cadmium levels between the two groups is -10.4025 to -0.9816.
Now Your Turn: Does texting while driving really slow one’s ability to react? A psychologist
measured the reaction time (in seconds) to stop when a hazard was suddenly placed in the driver’s
lane. A sample consisted of 18 randomly selected individuals who were not texting while driving
and 16 randomly selected individuals who were texting while driving. The results are summarized
in the following table:
Assume that the reaction times of both populations are normally distributed.
(a) If the variances of the two populations are not equal, at the α = 0.10 level of significance,
test the claim that a person’s stopping reaction time is increased if texting while driving.
Population Declarations:
Let population 1 be the group of all the drivers who do not text while driving and μ1 be the true
mean stopping reaction time associated with this group.
Let population 2 be the group of all the drivers who do text while driving and μ2 be the true
mean stopping reaction time associated with this group.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean stopping reaction time of population 1 is equal to the true
mean stopping reaction time of population 2)
HA : μ1 < μ2 (i.e. the true mean stopping reaction time of population 1 is less than the true
mean stopping reaction time of population 2)
Hypothesis Test to be used: t-test for the difference in two independent means with unequal variances
The test statistic and corresponding p-value: Based on the following output from SPSS,
the test statistic is T=-5.928 with an associated p-value < 0.001/2 = 0.0005 (since it is a 1-tailed
test).
The Decision Rule: Since p-value < 0.0005 < 0.10 = α, we reject H0 .
Conclusion: At the 10% level of significance (with a p-value < 0.0005) there is evidence to
conclude that the true mean stopping reaction time of drivers who were not texting while driving
is less than the true mean stopping reaction time of the drivers who were texting while driving.
(b) Assuming the variances of the two populations are not equal, construct an appropriate plot
(using the level of significance α=0.05) from which you could visually inspect whether it was
plausible that the two populations had the same mean. Referencing this plot, discuss why or why
not you would conclude the two populations have the same mean.
Solution:
Because the 95% confidence intervals in the above plot do not overlap, it is not likely that the
two populations have the same mean.
(c) If the variances of the two populations are not equal, determine a 99% confidence interval for
the true difference in the mean reaction times between the two groups.
Solution:
From the SPSS output above, a 99% confidence interval for the true mean difference μ1 − μ2
(assuming unequal variances) is (-2.82937,-1.03729).
(d) If the variances of the two populations are equal, at the α = 0.10 level of significance, test
the claim that a person’s stopping reaction time is increased if texting while driving.
Solution:
Research Question:
Population Declarations:
Let population 1 be the group of all the drivers who do not text while driving and μ1 be the true
mean stopping reaction time associated with this group.
Let population 2 be the group of all the drivers who do text while driving and μ2 be the true
mean stopping reaction time associated with this group.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean stopping reaction time of population 1 is equal to the true
mean stopping reaction time of population 2)
HA : μ1 < μ2 (i.e. the true mean stopping reaction time of population 1 is less than the true
mean stopping reaction time of population 2)
Hypothesis Test to be used: T-test for two independent samples for equal variances
The Significance Level: α = 0.10
The test statistic and corresponding p-value: Based on the following output from SPSS,
the test statistic is T=-5.968 with an associated p-value < 0.001/2 = 0.0005 (since it is a 1-tailed
test).
The Decision Rule: Since p-value < 0.0005 < 0.10 = α, we reject H0 .
Conclusion: At the 10% level of significance (with a p-value < 0.0005) there is evidence to
conclude that the true mean stopping reaction time of drivers who were not texting while driving
is less than the true mean stopping reaction time of the drivers who were texting while driving.
(e) If the variances of the two populations are equal, determine a 99% confidence interval for the
true difference in the mean stopping reaction times between the two groups.
Solution:
From the SPSS output above, a 99% confidence interval for the true mean difference μ1 − μ2
(assuming equal variances) is (-2.82048,-1.04619).
For Discussion. What are the pitfalls (if any) associated with using the equal-variances-assumed
Confidence Interval/Test Statistic?
In order for the confidence interval/result of the hypothesis test to be valid, one must be able to
verify that the two populations indeed have the same variance. To verify this, one would have to
do a hypothesis test. Consequently the validity of the confidence interval/result of the hypothesis
test depends on the result of a hypothesis test, thus increasing the chance of making a Type I
error.
The F-distribution (named after R. Fisher) is a family of curves, each of which is completely
specified by two different degrees of freedom ν and d. We use the symbol Fα (ν, d) to denote the
specific value of the F-distribution with ν and d degrees of freedom at the significance level α.
The F-distribution is skewed to the right. Because of this, the F-distribution is not symmetric
about any particular value. We need to take this fact into consideration when we are working with
the F-distribution. The simplest way is to assign the populations in such a way that Population
1 has the largest variance. Then, given two samples from two independent normal Populations 1
and 2, we wish to test H0 : $\sigma_1^2 = \sigma_2^2$ against either HA : $\sigma_1^2 \neq \sigma_2^2$ or HA : $\sigma_1^2 > \sigma_2^2$.
The test statistic is
\[ F(\nu, d) = \frac{s_1^2}{s_2^2}, \]
where the numerator and denominator are the sample variances from Population 1 and Population
2, respectively AND ν = n1 − 1 and d = n2 − 1. If you know both populations are normally
distributed, then this test statistic will follow an F-distribution with ν and d degrees of freedom.
Since most F-tables only provide values for right-tailed tests, and because the F-distribution
is not symmetric, we have to be careful if we wish to implement a two-tailed hypothesis
test. To implement a two-tailed F-test, we implement both one-tailed alternatives at
half the significance level. If we reject H0 for either of these one-tailed alternatives, we
reject H0 for the two-tailed test; otherwise we conclude that we cannot reject H0. Because in
practice only one direction can lead to a rejection of H0, just as in the one-tailed
version of the hypothesis test, we let Population 1 correspond to the population with the largest
sample variance.
For Discussion. What, if any, are the potential issues with using the above test for the equality
of variances?
Suppose that you do not know whether the data is normally distributed, OR that you know the data
is not normally distributed, OR that you need to know whether the variances from populations
are equal. In these three scenarios, the test we just learned cannot be used. If you
know the data is sampled from a symmetric distribution then you can use Bartlett’s Test for
Equality of Variances (cf. http://www.itl.nist.gov/div898/handbook/eda/section3/eda357.htm).
Suppose that the assumptions required to use Bartlett’s Test are not known to be true or are known to
be false. Then, in this situation, you can use Levene’s Test for Equality of Variances
(cf. http://www.itl.nist.gov/div898/handbook/eda/section3/eda35a.htm). This is the test that
SPSS implements.
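To make the idea behind Levene's test concrete, the following Python sketch computes a Levene-type statistic by hand. It is a minimal illustration, not a reproduction of the SPSS routine: it assumes the variant based on absolute deviations from each group mean (one of the variants described on the NIST page cited above), and the function name levene_statistic and the data are hypothetical.

```python
from statistics import mean

def levene_statistic(*groups):
    """Levene-type statistic W using absolute deviations from each group mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Z_ij = |Y_ij - mean of group i|
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_group_means = [mean(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / n_total
    # Spread of the group means of Z around the grand mean of Z
    between = sum(len(zi) * (zm - z_grand_mean) ** 2
                  for zi, zm in zip(z, z_group_means))
    # Spread of the Z values around their own group mean
    within = sum((v - zm) ** 2
                 for zi, zm in zip(z, z_group_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical data, chosen only to make the arithmetic easy to follow.
sample_1 = [1, 2, 3, 4, 5]
sample_2 = [2, 4, 6, 8, 10]
w = levene_statistic(sample_1, sample_2)
print(w)
```

A large value of W indicates that the typical distance of an observation from its group mean differs between the groups, which is evidence against equal variances.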
For Practice. Use Levene’s Test for Equality of Variances to determine whether or not (at
the 10% level of significance) the variances between the smoking and non-smoking group in the
cadmium level study are equal.
Solution:
The Research Question: Is there a difference between the true variances of cadmium levels in
pregnant women who smoke and those who do not smoke?
Population Declarations: Let population 1 be the group of all smoking pregnant women. Let
population 2 be the group of all non-smoking pregnant women.
Let σ1 be the true standard deviation of the cadmium levels of population 1 and σ2 be the true
standard deviation of the cadmium levels of population 2.
Hypothesis to be tested:
H0 : σ1² = σ2² (there is no difference between the true variances of the cadmium levels of the
mothers who smoke and those who do not smoke)
HA : σ1² ≠ σ2² (the true variance in the cadmium levels of the pregnant women who smoke differs
from the true variance in the cadmium levels of the pregnant women who do not smoke)
From the Independent Samples Test table above, the value of Levene’s Test Statistic is F=0.461
with ν = 14−1 = 13 and d = 18−1 = 17 degrees of freedom and the associated p-value=0.502.
Decision Rule: Since the p-value = 0.502 > 0.10 = α, we do not reject H0 .
The Conclusion: At the α= 0.10 level of significance, there is no evidence to conclude that the
true variance in the cadmium levels of pregnant women who smoke differs from the true variance
in the cadmium levels of pregnant women who do not smoke (p-value=0.502). In other words,
at the α= 0.10 level of significance, there is no evidence to reject the assumption that the true
variance in the cadmium levels of the pregnant women who smoke equals the true variance in the
cadmium levels of the pregnant women who do not smoke.
For Discussion. In the above example, how does one determine which is Population 1 and which
is Population 2?
Now Your Turn: For the texting while driving example above, test, at the 10% level of signif-
icance, whether both samples were drawn from populations with the same variance. Be sure to
write your solution in the format discussed in class.
Solution: Research Question: Are the reaction times of drivers who text while driving and the
reaction times of drivers who do not text while driving from two populations with the same
variance?
Population Declarations: Let population 1 be the group of drivers who text while driving and
population 2 be the group of drivers who do not text while driving. Then let σ1 be the standard
deviation in the stopping reaction times of individuals from population 1 and σ2 be the standard
deviation in the stopping reaction times of individuals from population 2.
Hypothesis to be tested:
H0 : The variances of the stopping reaction times of both populations are equal.
HA : the variances of the stopping reaction times of both populations are not equal.
Based on the above SPSS output, the value of the test statistic is F=0.618 with ν = 16 − 1 = 15
and d = 18 − 1 = 17 degrees of freedom and an associated p-value = 0.437.
The Decision Rule: Since p-value = 0.437 > 0.10 = α, we do not reject H0 .
The Conclusion: At α= 0.10 level of significance, there is no evidence to reject the assumption
that the variances of the stopping reaction times of both populations are equal (p-value= 0.437).
The work we have done thus far was based on the assumption that our data was unpaired. In the
situations where we had two different populations, we assumed that the data samples were drawn
independently from the two populations. How do we make conclusions about two population
means, when the two samples are not independent?
Two samples are said to be paired when for each data value collected from one sample, there is
a corresponding data value collected from the second sample, and both of these data values are
collected from the same source. A perfect example would be the midterm and the final exam
marks for a group of students.
When we have a small paired data sample, we can run a t-test to test the null hypothesis that
the means of the two populations are equal (the same) against one of the three usual alternative
hypotheses. How do we implement such a test?
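Matched-pairs t-test
Suppose (X1 , Y1 ), ..., (Xn , Yn ) is a random sample of n pairs with mean (μ1 , μ2 ) and μ1 − μ2 is the
difference of the two population means. In order to form a (1 − α) ∙ 100% confidence interval for
μ1 − μ2 , and to test a hypothesis regarding μ1 − μ2 , we need to know the following are all satisfied:
1. Both samples are simple random samples.
2. The populations from which both samples are drawn are normally distributed.
Let d̄ and sd be the sample mean and the sample standard deviation of the differences di = Xi − Yi. To test H0 : μ1 − μ2 = k, the test statistic is
$$t = \frac{\bar{d} - k}{s_d / \sqrt{n}}$$
with df = n − 1, and a (1 − α) ∙ 100% confidence interval is given by
$$\left( \bar{d} - t_{\alpha/2}(df)\,\frac{s_d}{\sqrt{n}},\; \bar{d} + t_{\alpha/2}(df)\,\frac{s_d}{\sqrt{n}} \right).$$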
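The paired test statistic is easy to check by hand or with a short script. The Python sketch below is an illustration only: the function name paired_t and the five pairs of values are hypothetical (they are not the Inderal data), and the differences are taken as first minus second, mirroring the Population 1 minus Population 2 convention.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second, k=0.0):
    """Return (d_bar, s_d, t, df) for paired samples; differences are first - second."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    t = (d_bar - k) / (s_d / sqrt(n))
    return d_bar, s_d, t, n - 1

# Hypothetical paired measurements (not the Inderal data).
first = [10, 12, 9, 14, 11]
second = [8, 11, 9, 10, 9]
d_bar, s_d, t, df = paired_t(first, second)
print(d_bar, s_d, t, df)
```

The p-value and the critical value tα/2(df) needed for the confidence interval would still be read from a t-table or from software output.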
For Practice. Referring back to the Inderal example, the hope is that one’s systolic blood
pressure after taking Inderal would be lower than while on the placebo.
(a) At the α = 0.1 level of significance, test the claim that one’s systolic blood pressure is lower
after taking Inderal than when on the placebo.
(b) Determine a 99% confidence interval for the true difference in the mean systolic blood
pressures (μ1 − μ2 ) between the two groups.
Solution: (a) The Research Question: Does one’s systolic blood pressure become lower after
taking Inderal (when compared to the blood pressure when taking the placebo)?
Population Declarations:
Let the population of interest be the group of people with hypertension. Then define μ1 to
be the true mean systolic blood pressure of this population after taking the placebo and define
μ2 to be the true mean systolic blood pressure of this population after taking Inderal.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean systolic blood pressure of population 1 is equal to the true
mean systolic blood pressure of population 2)
HA : μ1 > μ2 (i.e. the true mean systolic blood pressure of population 1 is greater than the true
mean systolic blood pressure of population 2)
o The systolic blood pressures after taking the placebo and after taking inderal are normally
distributed.
o The systolic blood pressures after taking the placebo and after taking inderal were simple
random samples.
The Test Statistic and corresponding p-value: Based on the following output from SPSS,
The value of the test statistic is t = 4.937 with df = 14 and the corresponding p-value < 0.0005.
The Decision Rule: Since the p-value < 0.0005 < 0.10 = α, we reject H0 .
The Conclusion: At the α = 0.10 level of significance, there is evidence to conclude that the true
mean systolic blood pressure after taking Inderal is lower than the true mean systolic blood
pressure after taking the placebo (p-value < 0.0005).
In the above solution, you will notice that one cell of the table is highlighted in orange. This
cell tells you how SPSS calculated the differences used to calculate the test statistic. In order
to avoid confusion, it is simplest if the identified first value used in the difference (Placebo) is
associated with population 1 and the second value used in the difference (Inderal) is associated
with population 2. The same rule of thumb is used when computing a confidence interval for the
difference of the two means.
(b)
The estimated 99% confidence interval for the true difference μ1 − μ2 is 10.880 to 43.920.
Remark: To use the paired data t-test statistic, we have made the assumption that D = X − Y
is a normal random variable.
Remark: If you have paired data, you can either implement a paired t-test or an unpaired two-
sample t-test, because the paired t-test tests the same hypotheses as the unpaired two sample
tests that we have studied. BUT, if you do not have paired data, you CAN ONLY implement an
unpaired two-sample test!!!
Now Your Turn: BFAHS, p252, example 7.4.1: John M. Morton et al. (A-14) examined
gallbladder function before and after fundoplication–a surgery used to stop stomach contents
from flowing back into the esophagus (reflux)–in patients with gastroesophageal reflux disease.
The authors measured gall bladder functionality by calculating the gall bladder ejection fraction
(GBEF) before and after fundoplication. These values are stored in the table below.
The goal of fundoplication is to increase GBEF, which is measured as a percent. Does the data
support, at the 5% level of significance, that fundoplication increases GBEF functioning? You
may assume that the patients were randomly selected and that the differences in the Pre-op and
Post-op GBEF are normally distributed.
Population Declarations: The population of interest is the group of patients who have
gastroesophageal reflux disease. Then let μ1 be the true mean gall bladder ejection fraction (GBEF)
before fundoplication and μ2 be the true mean gall bladder ejection fraction (GBEF) after
fundoplication.
Hypothesis to be tested:
H0 : μ1 = μ2 (i.e. the true mean GBEF before fundoplication equals the true mean GBEF after
fundoplication)
HA : μ1 < μ2 (i.e. the true mean GBEF before fundoplication is less than the true mean GBEF
after fundoplication)
2. The populations from which both samples are drawn are normally distributed.
From the SPSS output above, the test statistic is t = -1.916 with df = 11 and the associated
p-value = 0.041.
The Decision Rule: Since α = 0.05 > 0.082/2 = 0.041 = p-value, we can reject H0.
1.11.6 Large Random Samples from two Independent Populations: Count Data
Suppose we wish to know whether there is a difference between the proportion of men who feel satisfied
with a health promotion activity for stopping smoking and the proportion of women who feel
satisfied with the same health promotion activity, but we do not have a priori knowledge of either
proportion. We can use sample data from the two populations to determine a (1 − α)100%
confidence interval for the true difference in the proportions and to implement the appropriate
hypothesis test.
Let ni , xi , π̂i , and πi respectively be the sample size, the number of successes observed, the
observed proportion of successes, and the true proportion of successes in population i. If both
samples are large, then π̂1 − π̂2 will be approximately normally distributed with standard error
$$\sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_2}}.$$
The name of the hypothesis test: z-test for the difference in two independent proportions
The assumptions:
1. Both must be simple random samples.
2. Both must be large samples, i.e. n1 π̂1 ≥ 5, n1 (1 − π̂1 ) ≥ 5 and n2 π̂2 ≥ 5, n2 (1 − π̂2 ) ≥ 5.
The test statistic is
$$z = \frac{\hat{\pi}_1 - \hat{\pi}_2}{\sqrt{\hat{\pi}(1 - \hat{\pi})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \quad \text{where } \hat{\pi} = \frac{x_1 + x_2}{n_1 + n_2}.$$
The other steps do not change.
For Practice. In a group of 1000 men polled, 850 supported an issue. Of 500 women surveyed,
400 supported the issue. Assume the data was collected using simple random sampling.
Test the hypothesis that the true proportion of men supporting the issue equals the proportion
of women supporting the issue against the alternative. Use α=0.01.
Solution:
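As a check on a hand calculation for this problem, the pooled z statistic can be computed with a short Python sketch. The function name two_proportion_z is hypothetical, and the p-value shown is for a two-sided alternative, which is an assumption because the problem statement does not name the alternative. The formal write-up (research question, hypotheses, decision rule, conclusion) is still required.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two_sided_p) for H0: pi1 = pi2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution (assumed alternative)
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided

# Counts from the problem: 850 of 1000 men, 400 of 500 women.
z, p_value = two_proportion_z(850, 1000, 400, 500)
print(round(z, 4), round(p_value, 4))
```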
Large sample confidence interval for the difference in two independent proportions
Both samples must be large, where we will still use the condition that the variance of each of the
samples is greater than or equal to 10 to define large enough.
Let ni , xi , π̂i , and πi respectively be the sample size, the number of successes observed, the
observed proportion of successes, and the true proportion of successes in population i. If both
samples are large, then π̂1 − π̂2 will be approximately normally distributed with standard error
$$\sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_2}}.$$
A (1 − α)100% confidence interval for π1 − π2 will therefore be
$$\left( (\hat{\pi}_1 - \hat{\pi}_2) - z_{\alpha/2}\sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_2}},\; (\hat{\pi}_1 - \hat{\pi}_2) + z_{\alpha/2}\sqrt{\frac{\hat{\pi}_1 (1 - \hat{\pi}_1)}{n_1} + \frac{\hat{\pi}_2 (1 - \hat{\pi}_2)}{n_2}} \right). \qquad (1.2)$$
For Practice. In a group of 1000 men polled, 850 supported an issue. Of 500 women surveyed,
400 supported the issue. Assume the data was collected using simple random sampling.
Find a 90% confidence interval for the true difference in the proportions of men and women that
support the issue.
Solution:
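A hand calculation for this interval can be checked with the following Python sketch of formula (1.2). The function name two_proportion_ci is hypothetical; the critical value zα/2 is taken from the standard normal distribution in Python's statistics module rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Large-sample confidence interval for pi1 - pi2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # z_{alpha/2}: upper alpha/2 point of the standard normal distribution
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Counts from the problem: 850 of 1000 men, 400 of 500 women.
lower, upper = two_proportion_ci(850, 1000, 400, 500, 0.90)
print(round(lower, 4), round(upper, 4))
```

Note that the interval uses the unpooled standard error, whereas the hypothesis test uses the pooled proportion.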
Now Your Turn: Do our emotions influence the economic decisions we make? One way to
examine the issue is to have subjects play an “ultimatum game” against other people and against
a computer. Your partner (person or computer) gets $10, on the condition that it be shared with
you. The partner makes you an offer. If you refuse, neither of you gets anything. Consequently,
it is to your advantage to accept an unreasonable offer (such as you get $2 and your partner
gets $8). Some people get mad and refuse unfair offers. Here are data on the response of
228 subjects randomly selected to receive $2 from either a person they were introduced to or a
computer.
Human offers accepted: 60; Human offers rejected: 54; Computer offers accepted: 96; Computer
offers rejected: 18.
We suspect that emotion will lead to offers from another being rejected more often than offers
from an impersonal computer. Test this claim at the α = 0.05 level of significance.
Suppose that, in at least one of the samples, either the number of observed successes or the number
of observed failures is less than 10. We should not blindly use the confidence interval formula
above. Instead, we are going to use the “Plus Four” method to compute a (1 − α)100% confidence
interval when n1 ≥ 5 and n2 ≥ 5.
The point estimates used in the “Plus Four” confidence interval are
$$\tilde{\pi}_1 = \frac{x_1 + 1}{n_1 + 2}, \qquad \tilde{\pi}_2 = \frac{x_2 + 1}{n_2 + 2},$$
and the standard error is
$$S.E. = \sqrt{\frac{\tilde{\pi}_1 (1 - \tilde{\pi}_1)}{n_1 + 2} + \frac{\tilde{\pi}_2 (1 - \tilde{\pi}_2)}{n_2 + 2}},$$
so the (1 − α) ∙ 100% confidence interval for π1 − π2 is
$$\left( (\tilde{\pi}_1 - \tilde{\pi}_2) - z_{\alpha/2}\sqrt{\frac{\tilde{\pi}_1 (1 - \tilde{\pi}_1)}{n_1 + 2} + \frac{\tilde{\pi}_2 (1 - \tilde{\pi}_2)}{n_2 + 2}},\; (\tilde{\pi}_1 - \tilde{\pi}_2) + z_{\alpha/2}\sqrt{\frac{\tilde{\pi}_1 (1 - \tilde{\pi}_1)}{n_1 + 2} + \frac{\tilde{\pi}_2 (1 - \tilde{\pi}_2)}{n_2 + 2}} \right) \qquad (1.3)$$
as long as n1 ≥ 5 and n2 ≥ 5.
For Practice. In an experiment to determine if consuming Omega-3 fatty acids can improve
one’s memory, the brains of 20 randomly selected healthy rats were treated in a controlled,
humane way, to damage their memories. The rats were trained to run a maze. (At the end
of the training, all 20 could find their way through the maze quickly). After a day, controlled
amounts of Omega-3 fatty acids were introduced to the diets of 10 rats; the diets of the other
10 rats were identical to those of the rats receiving the treatment, except that no Omega-3 fatty acids
were added to their diets. After one day of treatment, 7 of the Omega-3 group successfully
made it through the maze; 2 of the control group made it through the maze. Determine a 95%
confidence interval for the true difference in the proportions of the Omega-3 group and control
group that successfully made it through the maze, post treatment.
Solution:
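The “Plus Four” interval for this problem can be checked with a short Python sketch of formula (1.3). The function name plus_four_ci is hypothetical; the counts (7 of 10 and 2 of 10) are those stated in the problem.

```python
from math import sqrt
from statistics import NormalDist

def plus_four_ci(x1, n1, x2, n2, confidence):
    """'Plus Four' confidence interval for pi1 - pi2."""
    if n1 < 5 or n2 < 5:
        raise ValueError("the Plus Four method requires n1 >= 5 and n2 >= 5")
    # Add one success and one failure to each sample
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    se = sqrt(p1 * (1 - p1) / (n1 + 2) + p2 * (1 - p2) / (n2 + 2))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_crit * se, diff + z_crit * se

# Omega-3 group: 7 of 10 succeeded; control group: 2 of 10 succeeded.
lower, upper = plus_four_ci(7, 10, 2, 10, 0.95)
print(round(lower, 4), round(upper, 4))
```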
In order to use the t-tests that we just reviewed when the number of data points is fewer than
30, the data must be normally distributed in order for the test statistic to have a distribution that
can be approximated by the Student’s t-distribution. The examples we have looked at thus far
have told us that we can assume that the data is normally distributed but, in practice, researchers
cannot just assume that their data is normally distributed. The normality of the data needs to
be tested.
In this course we will look at a graphical method and a numerical method for testing whether the
data is normally distributed.
The probability-probability plot (P-P plot or percent plot) compares an empirical cumulative
distribution function of a variable with a specific theoretical cumulative distribution function
(e.g., the standard normal distribution function).
For Practice. Create a P-P Plot for the Placebo group in the Systolic Blood Pressure example.
For Discussion. Based on the above P-P Plot, are the blood pressures from this population
(from which the sample was drawn) normally distributed?
Solution: Because all the points lie quite close to the plotted line y = x and there seems to be
just as many points lying above the line as below, I would suspect that the data was drawn from
a normally distributed population.
A second plot that can be used to help determine the normality of the data is the quantile-
quantile plot (Q-Q plot) which compares ordered values of a variable with quantiles of a specific
theoretical distribution (i.e., the normal distribution).
For Practice. Create a Q-Q Plot for the Placebo group in the Systolic Blood Pressure example.
For Discussion. Based on the above Q-Q Plot, are the blood pressures from this population
(from which the sample was drawn) normally distributed?
Solution: Because all the points lie quite close to the plotted line y = x and there seems to be
just as many points lying above the line as below, I would suspect that the data was drawn from
a normally distributed population.
In either the P-P or Q-Q plot, if the two distributions match, the points in the plot will form a
linear pattern passing through the origin with a unit slope. P-P and Q-Q plots are used to see
how well a theoretical distribution models the empirical data. The question becomes how much
of a deviation from this line does it take to conclude non-normality. Is the Inderal data plotted
in the Q-Q plot normal or is the deviation present sufficient enough to conclude the underlying
population is not normal?
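To see what is actually being plotted, the coordinates of the points in a normal Q-Q plot and a normal P-P plot can be computed directly. The Python sketch below is an illustration under stated assumptions: the function names and the ten readings are hypothetical, the reference normal distribution is fitted with the sample mean and standard deviation so that the points are compared with the line y = x, and the plotting position (i − 0.5)/n is one common convention (statistical packages such as SPSS may use a different one).

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(data):
    """Pairs (theoretical normal quantile, ordered observation) for a Q-Q plot."""
    ordered = sorted(data)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [(fitted.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(ordered, start=1)]

def normal_pp_points(data):
    """Pairs (empirical cumulative proportion, fitted normal CDF) for a P-P plot."""
    ordered = sorted(data)
    n = len(ordered)
    fitted = NormalDist(mean(ordered), stdev(ordered))
    return [((i - 0.5) / n, fitted.cdf(x))
            for i, x in enumerate(ordered, start=1)]

# Hypothetical readings, used only to exercise the functions.
readings = [118, 125, 131, 140, 122, 135, 128, 133, 145, 127]
qq = normal_qq_points(readings)
pp = normal_pp_points(readings)
for theoretical, observed in qq:
    print(round(theoretical, 1), observed)
```

If the data come from a normal population, both coordinates in each pair should be close to one another, which is what "close to the line y = x" means in the plots.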
There are several different statistical tests which can be used to quantitatively determine (up to
some level of significance) whether or not a given data set is drawn from a normally distributed
population. The two tests that SPSS implements are the Shapiro-Wilk W Test for Normality
and the Kolmogorov-Smirnov D Test. The Shapiro-Wilk W Test for Normality is valid for data
sets whose size is between 3 and 2000 (inclusive). If one has more than 2000 data points, then
the Kolmogorov-Smirnov D Test should be used. The assumptions required for the Kolmogorov-
Smirnov D Test are that the data is randomly sampled and that the underlying distribution of
the population is continuous. Here we will focus on the Shapiro-Wilk W Test for Normality.
H0 : the population from which the data was sampled is normally distributed
HA : the population from which the data was sampled is not normally distributed.
The test statistic is
$$W = \frac{\left(\sum_{i=1}^{n} a_i x_{(i)}\right)^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2},$$
where x1 , x2 , ..., xn represent the sample data, x(1) , x(2) , ..., x(n) represent the sample data
ordered from least to greatest, and ai are constants generated from the means, variances, and
covariances of the order statistics of a sample of size n from a normal distribution (see Pearson and
Hartley (1972, Table 15)).
For Practice. Determine at the 5% level of significance if the systolic blood pressures of
hypertensive people after taking a placebo are normally distributed.
Solution: Research Question: Are the systolic blood pressures of hypertensive people after taking
a placebo normally distributed?
Hypothesis to be tested:
H0 : The systolic blood pressures of hypertensive people after taking a placebo are normally
distributed.
HA : The systolic blood pressures of hypertensive people after taking a placebo are not normally
distributed.
Assumptions Required to Implement the Hypothesis Test: We must have between 3 and 2000
data points in each sample.
From the Tests of Normality table above, the value of the test statistic is with an associated
p-value=0.349.
The Decision Rule: Since the p-value = 0.349 > 0.05 = α, we do not reject H0 .
The Conclusion: At the α=0.05 level of significance, there is no evidence to conclude that the
systolic blood pressures of hypertensive people after taking a placebo are not normally distributed
(p-value=0.349). In other words, at the α=0.05 level of significance, there is no evidence to reject
the assumption that the systolic blood pressures of hypertensive people after taking a placebo
are normally distributed (p-value=0.349).
Now Your Turn: Determine at the 5% level of significance if the systolic blood pressures of
hypertensive people after taking Inderal are normally distributed.
Solution: Research Question: Are the systolic blood pressures of hypertensive people after taking
Inderal normally distributed?
Hypothesis to be tested:
H0 : The systolic blood pressures of hypertensive people after taking Inderal are normally
distributed.
HA : The systolic blood pressures of hypertensive people after taking Inderal are not normally
distributed.
Assumptions Required to Implement the Hypothesis Test: We must have between 3 and 2000
data points in each sample.
From the Tests of Normality table above, the value of the test statistic is W = 0.938 with an
associated p-value=0.363.
The Decision Rule: Since the p-value = 0.363 > 0.05 = α, we do not reject H0 .
The Conclusion: At the α = 0.05 level of significance, there is no evidence to conclude that the
systolic blood pressures of hypertensive people after taking Inderal are not normally distributed
(p-value=0.363). In other words, at the α = 0.05 level of significance, there is no evidence to
reject the assumption that the systolic blood pressures of hypertensive people after taking Inderal
are normally distributed (p-value=0.363).
Now Your Turn: Using the data presented in the Texting While Driving Now-Your-Turn Scenario,
a) generate the P-P plots required to visually inspect whether each sample was drawn from a
normally distributed population. Comment on whether, from the plots, you would believe each
population is normally distributed.
b) generate the Q-Q plots required to visually inspect whether each sample was drawn from a
normally distributed population. Comment on whether, from the plots, you would believe each
population is normally distributed
c) Test, at the 10% level of significance, whether each sample was drawn from a normally
distributed population. Be sure to write your solution in the format discussed in class.
Solution: a)
Based on the above P-P plots, because most of the data points for the drivers not texting while
driving and the drivers texting while driving are far away from the y = x line and the points seem
to be distributed in a sinusoidal pattern about the y = x line, one might suspect that neither of
the two samples was drawn from a normally distributed population.
b)
Based on the above two Q-Q plots, because most of the data points for the drivers not texting
while driving and the drivers texting while driving are far away from the y = x line and the points
seem to be distributed in a sinusoidal pattern about the y = x line, one might suspect that
neither of the two samples was drawn from a normally distributed population.
c) Research Question: Are the stopping reaction times of drivers who text while driving and the
stopping reaction times of drivers who do not text while driving normally distributed?
Hypothesis to be tested:
H0,1 : The stopping reaction times of drivers who do not text while they drive are normally
distributed.
HA,1 : The stopping reaction times of drivers who do not text while they drive are not normally
distributed.
H0,2 : The stopping reaction times of drivers who do text while they drive are normally distributed.
HA,2 : The stopping reaction times of drivers who do text while they drive are not normally
distributed.
Assumptions: In order to use the Shapiro-Wilk Test, we must have between 3 and 2000 data
points in each sample.
Based on the above Tests of Normality table, the test statistic associated with drivers who were
not texting while driving is W(18) = 0.819 with a corresponding p-value=0.003; and the test
statistic associated with the drivers who were texting while driving is W(16) = 0.817 with a
corresponding p-value=0.005.
For those who were not texting while driving: since p-value = 0.003 < 0.10 = α, we reject H0,1 .
For those who were texting while driving: since p-value = 0.005 < 0.10 = α, we reject H0,2 .
The Conclusion: At α = 0.10 level of significance, there is evidence to conclude that the stopping
reaction times of drivers who do not text while driving (p-value=0.003) and the stopping reaction
times of drivers who text while driving (p-value=0.005) are both not normally distributed.
For more information about the Shapiro-Wilk W Test for Normality refer to Shapiro, S. S. and
Wilk, M. B. (1965). "An analysis of variance test for normality (complete samples)", Biometrika,
52 (3 and 4), pages 591-611.
Pearson, E. S., and Hartley, H. O. (1972). Biometrika Tables for Statisticians, Vol. 2, Cambridge,
England, Cambridge University Press.
1.12 Correlation
Suppose we wish to somehow characterize the relationship, if one exists, between two quantitative
variables. Further suppose that, given we know some relationship exists, we would like to use
this relationship to somehow make predictions about unknown values from the population of
interest. To this end, suppose we have n cases and for each case we take a measurement for two
variables X and Y . Then the point (xi , yi ) is formed using the value of X and the value of Y
for case i. The point (x̄, ȳ) can be plotted on the scatter plot and is called the centroid.
To construct a scatter diagram (or scatter plot), we simply plot the points (xi , yi ) for the n cases.
For Practice. A medical researcher wants to determine if there is a linear relationship between
the costs of prescription drugs that can be administered to both humans and pets. The data
collected (in Canadian dollars) is summarized in the following table.
Using this sample data, create a scatterplot.
Solution:
For Discussion. From the above scatter plot, what relationship would you suspect there to
be (if any) between the prescription drug cost of medications that can be administered to both
humans and pets?
Suppose we would like to determine if there is a linear relationship between the Human Drug
Cost X and the Pet Drug Cost Y . This relationship is referred to as the correlation. We can
measure this correlation using the Pearson’s correlation coefficient r where
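$$r = \frac{n\left(\sum x_i y_i\right) - \left(\sum x_i\right)\left(\sum y_i\right)}{\sqrt{\left[n\left(\sum x_i^2\right) - \left(\sum x_i\right)^2\right]\left[n\left(\sum y_i^2\right) - \left(\sum y_i\right)^2\right]}} \qquad (1.4)$$
$$= \frac{\sum \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)}{n - 1} \qquad (1.5)$$
$$= \frac{S_{xy}}{\sqrt{S_{xx}}\,\sqrt{S_{yy}}}, \qquad (1.6)$$
where
$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n},$$
$$S_{yy} = \sum (y_i - \bar{y})^2 = \sum y_i^2 - \frac{\left(\sum y_i\right)^2}{n},$$
and
$$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}.$$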
Note:
1. It can be shown using the above formula for r that we always have −1 ≤ r ≤ 1. In other
words, the sample correlation coefficient can NEVER be smaller than -1 or greater than
+1.
2. r does NOT measure the slope of the straight line (referred to as the regression line or the
line of best fit) that we are trying to fit to our data, apart from the sign.
3. If r = 1.0, then we have perfect positive correlation between the two variables.
4. If 0.7 < r < 1, then we have a strong positive correlation between the two variables.
5. If 0.4 < r ≤ 0.7, then we have a moderate positive correlation between the two variables.
6. If 0.0 < r ≤ 0.4, then we have a weak positive correlation between the two variables.
7. If r = −1.0, then we have perfect negative correlation between the two variables.
8. If −1.0 < r < −0.7, then we have a strong negative correlation between the two variables.
9. If −0.7 ≤ r < −0.4, then we have a moderate negative correlation between the two
variables.
10. If −0.4 ≤ r < 0.0, then we have a weak negative correlation between the two variables.
11. If r is close to ZERO, then there is little to no linear relationship between the two variables.
This does not imply that there is no relationship between the two variables!!!!
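The formula (1.6) for r and the verbal labels in the notes above can be sketched in Python as follows. This is an illustration only: the function names pearson_r and describe and the five data pairs are hypothetical, and the data are not the Human versus Pet Drug Cost data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r computed as S_xy / (sqrt(S_xx) * sqrt(S_yy))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

def describe(r):
    """Verbal label for r using the cut-offs listed in the notes above."""
    if r == 0:
        return "no linear relationship"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size > 0.7:
        strength = "strong"
    elif size > 0.4:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"

# Hypothetical data pairs, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4), describe(r))
```

For these values Sxx = 10, Syy = 6, and Sxy = 6, so r = 6/√60, which can be verified by hand.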
For Practice. For our Human versus Pet Drug Cost data, calculate Pearson’s correlation coef-
ficient.
For Discussion. Does |r| = 1 always imply a strong linear relationship between the two vari-
ables?
A lurking variable is a variable that is not among the explanatory or response variables in a study
and yet may influence the interpretation of the relationships among the explanatory and response
variables.
A common error that people make is that they interpret a strong correlation as a cause and effect.
Sometimes such a relationship does exist (Smokers and Physical Endurance), but in many cases
no such causal relationship exists even if the correlation is strong (our human and pet drug cost).
In such situations, there usually are hidden variables linking the two quantities of interest.
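Another common error is to interpret a correlation value by stating that the independent
variable(s) explain(s) that percentage (i.e. 60% when r = 0.6) of the variability in the dependent
variable. In simple linear regression, the proportion of variability explained is r², not r (i.e. 36% when r = 0.6).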
Using data collected for two variables, if r = 0.0, then there is no linear relationship between the
two variables. Hence there is either no relationship between the two variables or there is some
non-linear relationship between the two variables. From a scatter plot of the two variables, if no
trend is apparent and if r = 0.0, then we would conclude there is no relationship between the
variables (i.e. the variables are independent).
For Discussion. How do we determine if the data in a single set of observations are independent,
i.e. how do we determine if r = 0.0?
To determine whether or not there is significant correlation between your independent and de-
pendent variables, test the hypotheses:
H0 : ρ = 0
HA : ρ ≠ 0, using the test statistic
$$t = r\sqrt{\frac{n - 2}{1 - r^2}}, \quad \text{with } df = n - 2.$$
Note that here ρ represents the population correlation. If we reject H0 , then we conclude that
the true correlation is not zero and that some relationship exists between the two variables, i.e.
the two variables are dependent. If we do not reject H0 , then we cannot reject that the true
correlation is zero and hence we cannot reject that there is no relationship between the two
variables, i.e. the two variables are independent.
If one wanted to test if there is significant positive correlation then HA : ρ > 0 and if one wanted
to test if there is significant negative correlation then HA : ρ < 0 .
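The test statistic for significant correlation depends only on r and n, so it is simple to compute once r is known. The Python sketch below is an illustration: the function name correlation_t is hypothetical, and the sample size n = 8 used in the example call is an assumed value chosen for illustration, not a value taken from any data set in these notes.

```python
from math import sqrt

def correlation_t(r, n):
    """Return (t, df) for testing H0: rho = 0, with t = r * sqrt((n - 2) / (1 - r^2))."""
    if n < 3:
        raise ValueError("at least 3 pairs are needed")
    if abs(r) >= 1:
        raise ValueError("the statistic is undefined when |r| = 1")
    t = r * sqrt((n - 2) / (1 - r ** 2))
    return t, n - 2

# r = 0.942 with a hypothetical sample size of n = 8 pairs.
t, df = correlation_t(0.942, 8)
print(round(t, 3), df)
```

The sign of t matches the sign of r, so the same statistic serves the two-sided alternative and both one-sided alternatives; only the tail used for the p-value changes.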
In order for the above test to be valid, the following assumptions must be true: 1. the variables
x and y are linearly related;
For Practice. For the Human versus Pet Drug Cost example, determine whether there is
significant positive correlation between the independent and dependent variables at the 10% level
of significance. Assume all the required assumptions hold.
Solution: Research Question: Is there significant positive correlation between the costs of the
prescription drugs that can be administered to both humans and pets?
Population Declarations: Let population 1 be the costs for prescription drugs for humans and let
population 2 be the costs for prescription drugs for pets.
Hypotheses to be tested:
H0 : ρ = 0
HA : ρ > 0
Hypothesis Test to be used: Test for Significant Correlation
Assumptions to be tested: We are told to assume that all the required assumptions hold.
From the Correlations table above, Pearson’s correlation coefficient is 0.942 and the p-value
associated with the test statistic to test for significant correlation is p-value=0.001/2=0.0005
(the two-sided p-value is halved because the alternative hypothesis is one-sided).
Conclusion: At the 10% level of significance, we have evidence to conclude that the true
correlation between human prescription drug costs and pet prescription drug costs is greater than
zero (p-value=0.0005). This implies that, at the 10% level of significance, we have evidence to
conclude that there is significant positive correlation between the human and pet prescription
drug costs.
Now Your Turn: The blood pressure and age were measured for female patients. The patients
were then grouped by age and, for each of the age groups, the median blood pressure measurement
was computed. The data are summarized below:
Determine at the 5% level of significance if the midpoints of the age group are independent of
the median blood pressure for the age group.
Solution: Research Question: Are the midpoints of the age group independent of the median
blood pressure for the age group?
Population Declarations: Let population 1 be the midpoint ages of female patients, grouped by
the decade in which their ages fall and let population 2 be the median blood pressure of the age
group.
Hypotheses to be tested:
H0 : ρ = 0
HA : ρ ≠ 0
Assumptions to be tested: We are told to assume that all the required assumptions hold.
From the Correlations table above, Pearson’s correlation coefficient is 0.997 and the p-value
associated with the test statistic to test for significant correlation is p-value<0.001.
Conclusion: At the 5% level of significance, we have evidence to conclude that there is nonzero
correlation between the midpoints of the age group and the group’s associated median blood
pressure (p-value<0.001). Hence, at the 5% level of significance, we have evidence to conclude that
the midpoints of an age group and the group’s associated median blood pressure are dependent.
When the null hypothesis in a test for significant correlation has been rejected, it could mean:
1. there is a cause-and-effect relationship between the two variables (X causes Y or vice versa);
We say that an observation is influential for a statistical calculation if removing it from the
calculation significantly changes the result of the calculation.
For Discussion. Does our human/pet drug cost data have any influential observations?
As outliers affect Pearson’s correlation coefficient, we can instead measure the relationship between
two variables using Spearman’s Rank Correlation coefficient ρs , which is computed from the ranks of
the observations and is therefore less sensitive to outliers.
We can calculate Spearman’s Rank Correlation coefficient ρs using the following procedure:
1. Arrange the observations of the independent variable in increasing order and assign them ranks
1,2,...,n.
2. Arrange the observations of the dependent variable in increasing order and assign them ranks
1,2,...,n.
3. For a particular data point, let (xi , yi ) denote actual observations and (ri , si ) denote the ranks
of the independent and dependent variable. Then Spearman’s Rank Correlation coefficient is
defined to be
\[ \rho_s = \frac{\sum (r_i - \bar{r})(s_i - \bar{s})}{\sqrt{\sum (r_i - \bar{r})^2}\,\sqrt{\sum (s_i - \bar{s})^2}}. \]
What happens if two or more observations of a variable are identical? How does one rank
these identical observations? One should assign to all the tied observations the average of the
consecutive ranks that would have been assigned to the tied values.
Note:
1. the above formula for ρs is nothing but Pearson’s Correlation Coefficient formula where the
bivariate data are the ordered pairs of the ranks of the original data.
2. If the variables X and Y are strongly positively correlated, the ranks on X should generally
agree with the ranks on Y and ρs will be positive.
3. If the variables X and Y are strongly negatively correlated, the ranks on X should be in the
reverse order to the ranks on Y and ρs will be negative.
4. If the variables X and Y are uncorrelated, the ranks on X should be randomly distributed
with the ranks on Y and ρs will be essentially zero.
5. −1 ≤ ρs ≤ 1.
6. ρs = 1.0 indicates the ranks on X completely agree with the ranks on Y. ρs = −1.0
indicates the ranks on X are in the reverse order to the ranks on Y.
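The ranking procedure, the rule for ties, and Note 1 can be sketched together in code. The following is a minimal Python sketch: the function names and the small data set are hypothetical and chosen only for illustration (they are not the Human versus Pet Drug Cost data). Tied observations receive the average of the consecutive ranks they would otherwise occupy, and ρs is then computed as Pearson's correlation coefficient of the ranks.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole block of observations tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the consecutive ranks i+1, ..., j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(u, v):
    """Pearson's correlation coefficient of two equal-length lists
    (assumes neither list is constant)."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    num = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    den = math.sqrt(sum((a - u_bar) ** 2 for a in u)) * \
          math.sqrt(sum((b - v_bar) ** 2 for b in v))
    return num / den

def spearman(x, y):
    """Spearman's rank correlation: Pearson's r applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data, for illustration only.
print(average_ranks([10, 20, 20, 30]))      # the two tied values share rank 2.5
print(spearman([1, 2, 3, 4], [1, 4, 9, 16]))  # ranks agree completely
```

Because only the ranks enter the calculation, any increasing relationship between X and Y (not just a straight line) gives ρs close to 1.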
For Practice. For our Human versus Pet Drug Cost data, calculate Spearman’s Rank Correlation
Coefficient.
Solution:
Once we have determined that a linear relationship between the independent and the dependent
variable explains a significant proportion of the variability in the dependent variable, we can use
the following linear model as a method for determining the line of best fit or linear regression
equation for the data set.
Once we have determined that a linear relationship exists between the two variables of interest,
we would like to determine the linear function which best fits the data, that is we would like to
determine a line of best fit through the data points. This line of best fit is also called a regression
line. One reason for determining the regression line is, under certain conditions, it can be used
to predict the dependent variable given a specific value for the independent variable.
We refer to the dependent (Y) variable as the response variable or the outcome variable. The
response variable is the variable that we want to predict using the values of other variables.
These other variables are referred to as the predictor variables or the independent variables and
are usually denoted as X1 , X2 , ....
When we have one independent variable X1 , we generally drop the subscript and refer to it simply
as X. Then, the model we use relating the response Y to the predictor variable X is
Yi = β0 + β1 Xi + εi , i = 1, ..., n,
where Yi denotes the response corresponding to the i’th experimental run in which the predictor
variable X has the value Xi and εi are the unknown error components that are assumed to be
independent and normally distributed with mean zero and an unknown standard deviation σ; and
the parameters β0 and β1 , which together define the straight line, are also unknown. β1 is referred
to as the regression coefficient. It is the mean change in Y per unit of change in X.
According to the above model, the observation yi corresponding to the input value xi is one
observation from a normal distribution with mean β0 + β1 xi and standard deviation σ.
In regression, there are two goals. The first goal is to develop an equation by which the average
value of a particular random variable (Y) can be estimated or predicted based on the knowledge
of values of the predictor variables. The second goal is to quantify the relationship of one or more
independent variables to a dependent variable.
Of course, we will rarely know the true values of the parameters β0 and β1 . To
estimate β0 and β1 , we can use the Principle of Least-Squares, which determines the values for
the parameters so that the overall discrepancy
\[ D = \sum (\text{Observed response} - \text{Predicted response})^2 \]
is minimized. The values for the parameters β0 and β1 that minimize D are referred to as the
least-squares estimates.
The linear function formed using these estimates is referred to as the least-squares regression line.
The least-squares regression line is the line that minimizes the sum of the squared vertical
distances between the observed values of Y and those predicted by the line.
\[ S_{xx} = (n - 1) s_x^2 ; \]
are the errors associated with the estimates for β̂0 and β̂1 respectively.
Note:
1. β̂0 and β̂1 are the least-squares estimates for β0 and β1 .
4. the y-intercept only has meaning if ZERO is a possible value for the predictor variable AND
there are observed values of the predictor variable near ZERO.
5. interpolation (extrapolation) is the use of a regression line for prediction within (outside) the
range of values of the independent variable X that were observed and used to obtain the regression
line.
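The least-squares estimates can be computed directly from the data. The sketch below uses the standard closed-form solution for a straight line, β̂1 = Sxy / Sxx and β̂0 = ȳ − β̂1 x̄, where Sxy = Σ(xi − x̄)(yi − ȳ); these formulas are stated here as an assumption, since they are not written out in the surrounding text. The function name and the data are hypothetical.

```python
def least_squares_line(x, y):
    """Return (b0, b1), the least-squares intercept and slope for y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx           # slope estimate
    b0 = y_bar - b1 * x_bar    # intercept estimate: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data lying exactly on the line y = 1 + 2x.
b0, b1 = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"fitted line: y = {b0} + {b1} x")
```

With real data the points will not lie exactly on a line, and the same function returns the intercept and slope that minimize the discrepancy D.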
The assumptions that must be true in order to use a linear regression model:
1. Existence: For any fixed value of X, Y is a random variable with a specific probability
distribution that has a finite mean and finite variance that depend on X.
2. Independence: The y-values are statistically independent of one another. This cannot be
assumed in the case of longitudinal studies.
3. Linearity: The mean value of the variable Y is a straight line that is a function of X.
4. Homoscedasticity: The variance of Y is the same for every value for X. Further, the errors
must be normally distributed with mean 0 and variance σ 2 .
For Practice. Blood pressure and age were measured for female patients. The patients were
then grouped by age and, for each of the age groups, the median Blood Pressure measurement
was computed. The data are summarized below:
Midpoint Age Group (X): 35 45 55 65 75
Draw a scatter plot, calculate Pearson’s correlation coefficient, and calculate a least-squares
regression line for the above data.
From the Correlations table above, Pearson’s correlation coefficient is 0.997.
1.13.1 Inferences about β1
To test the null hypothesis H0 : β1 = k against one of the three alternatives, we can use our
hypothesis recipe with the test statistic
\[ t = \frac{\hat{\beta}_1 - k}{s_e / \sqrt{S_{xx}}}, \quad \text{with } df = n - 2. \]
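We expect the above test statistic to have a Student’s t-distribution with df=n-2 degrees of
freedom.
A 100 ∙ (1 − α)% confidence interval for β1 is
\[ \left( \hat{\beta}_1 - t_{\alpha/2}(df)\,\frac{s_e}{\sqrt{S_{xx}}},\ \ \hat{\beta}_1 + t_{\alpha/2}(df)\,\frac{s_e}{\sqrt{S_{xx}}} \right), \]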
where tα/2 (df ) is a critical value from a Student’s t-distribution with df=n-2 degrees of freedom.
For Practice. Referring back to our Age-Median Blood pressure example, should “median age”
be included in our model? Use α = 0.05.
Solution: Research Question: Is an individual’s age linearly related to one’s blood pressure?
Hypotheses to be tested:
H0 : β1 = 0
HA : β1 ≠ 0
The value of the test statistic and p-value: From the following SPSS output,
the value of the test statistic is T=23.634 with df=3 and the associated p-value< 0.001.
Conclusion: At the α=0.05 level of significance, we have evidence to conclude that the true
regression coefficient associated with the Median Age variable is not zero (p-value <0.001). Hence
the Median Age variable needs to be in the model.
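As a rough check on the SPSS value, the test statistic for the slope can be reproduced from summary quantities reported elsewhere in this chapter: the five age-group midpoints 35, 45, 55, 65, 75 and the sums of squares SSexplained = 1638.4 and SSunexplained = 8.8. The Python sketch below rests on several assumptions that are not stated in the text: that se = sqrt(SSunexplained / (n − 2)), that the slope can be backed out from the identity SSexplained = β̂1² Sxx (taking the positive root because the correlation is positive), and that the critical value t0.025(3) = 3.182 is read from a standard t-table. The slope it produces is therefore a derived value, not one quoted from the SPSS output.

```python
import math

x = [35, 45, 55, 65, 75]   # age-group midpoints from the example
ss_explained = 1638.4      # from the sums-of-squares table in this chapter
ss_unexplained = 8.8

n = len(x)
df = n - 2
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

s_e = math.sqrt(ss_unexplained / df)   # assumed definition of s_e
b1 = math.sqrt(ss_explained / s_xx)    # assumes a positive slope
se_b1 = s_e / math.sqrt(s_xx)          # standard error of the slope

t_stat = (b1 - 0) / se_b1              # test of H0: beta1 = 0
t_crit = 3.182                         # t_{0.025}(3), from a t-table
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

print(f"t = {t_stat:.3f} with df = {df}")
print(f"95% CI for beta1: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Under these assumptions the computed statistic agrees with the reported T = 23.634 up to rounding.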
To test the null hypothesis H0 : β0 = k against one of the three alternatives, we can use our
hypothesis recipe with the test statistic
\[ t = \frac{\hat{\beta}_0 - k}{s_e \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}}}}, \quad \text{with } df = n - 2. \]
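We expect the above test statistic to have a Student’s t-distribution with df=n-2 degrees of
freedom.
A 100 ∙ (1 − α)% confidence interval for β0 is
\[ \left( \hat{\beta}_0 - t_{\alpha/2}(df)\, s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}},\ \ \hat{\beta}_0 + t_{\alpha/2}(df)\, s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}} \right), \]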
where tα/2 (df ) is a critical value from a Student’s t-distribution with df=n-2 degrees of freedom.
For Practice. Referring back to our Age-Median Blood pressure example, should the y-intercept
be included in our model? Use α = 0.05.
Solution: Research Question: Is the true y-intercept of the regression line relating age to blood pressure different from zero?
Hypotheses to be tested:
H0 : β0 = 0
HA : β0 ≠ 0
The value of the test statistic and p-value: From the following SPSS output,
the value of the test statistic is T=23.409 with df=3 and the associated p-value< 0.001.
Statistical Decision: Since the p-value< 0.001 < 0.05 = α, we reject H0 .
Conclusion: At the α=0.05 level of significance, we have evidence to conclude that the true
regression intercept is not zero (p-value<0.001). Hence the intercept needs to be in the model.
In the last two For Practice questions, we established that the y-intercept (β0 ) and the
regression coefficient associated with the Age variable (β1 ) are both non-zero. Consequently
the model that we should fit to the data is
Y = β0 + β1 X + ε
To recap the model building process: The first step is to determine whether or not the independent
and dependent variables are statistically correlated. If no, there is no model to build as the
independent variable is not a good predictor for the dependent variable. If yes, then the next step
is to check if the coefficient of determination is greater than 0.50 (or equivalently 50%). If it is
not greater than 0.5 (50%), then there is no model to build, as the independent variable is not
a good predictor for the dependent variable. If it is greater than 0.5 (50%), then you perform
the regression to determine which of the slope and y-intercept are statistically significant. If
you follow this process, the independent variable should end up being statistically significant and
included in the model. There is no guarantee that the y-intercept will be statistically significant.
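The recap above is a short decision procedure, and it can be written out as one. The following Python sketch is only an illustration of the steps as described; the function name, argument names, and return strings are ours.

```python
def model_building_step(p_value_correlation, r_squared, alpha=0.05, r2_threshold=0.50):
    """Walk through the recap: correlation test first, then the r^2 check."""
    # Step 1: is the correlation statistically significant?
    if p_value_correlation >= alpha:
        return "stop: the variables are not significantly correlated"
    # Step 2: is the coefficient of determination greater than the threshold?
    if r_squared <= r2_threshold:
        return "stop: the coefficient of determination is not greater than 0.50"
    # Step 3: fit the regression and test the slope and the y-intercept.
    return "fit the regression and test the slope and y-intercept"

# Values from the fish example later in this chapter: p-value = 0.002, r^2 = 0.734.
print(model_building_step(0.002, 0.734))
```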
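Given a value x0 , we can estimate the expected response using
\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_0 . \]
BUT, this is only a sample estimate for the expected response. We can form a 100 ∙ (1 − α)% confidence interval for the expected
(or true) response by using the formula
\[ \left( \hat{\beta}_0 + \hat{\beta}_1 x_0 - t_{\alpha/2}(df)\, s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},\ \ \hat{\beta}_0 + \hat{\beta}_1 x_0 + t_{\alpha/2}(df)\, s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \right), \]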
where tα/2 (df ) is a critical value from a Student’s t-distribution with df=n-2 degrees of freedom.
For Practice. Estimate the median blood pressure you would expect a randomly selected 40 year
old to have. Also compute a corresponding 95% confidence interval for this expected response.
What if we were not interested in the expected response for the entire population with value x0
but in the response for a single individual with value x0 ? We can compute a 100 ∙ (1 − α)% interval
for this individual prediction using
\[ \left( \hat{\beta}_0 + \hat{\beta}_1 x_0 - t_{\alpha/2}(df)\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},\ \ \hat{\beta}_0 + \hat{\beta}_1 x_0 + t_{\alpha/2}(df)\, s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \right), \]
where tα/2 (df ) is a critical value from a Student’s t-distribution with df=n-2 degrees of freedom.
This is called a prediction interval.
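The only difference between the two intervals is the extra "1 +" under the square root, which makes the prediction interval wider than the confidence interval at every x0. The Python sketch below computes the half-width (margin of error) of each interval. It is illustrated with summary values from the Age versus Median Blood Pressure example (n = 5, x̄ = 55, Sxx = 1000, SSunexplained = 8.8) and rests on two assumptions not stated in the text: se = sqrt(SSunexplained / (n − 2)) and the t-table value t0.025(3) = 3.182. The function name is ours.

```python
import math

def interval_half_widths(x0, n, x_bar, s_xx, s_e, t_crit):
    """Return (ci_half, pi_half): half-widths of the confidence interval for the
    expected response and of the prediction interval for one individual at x0."""
    core = 1 / n + (x0 - x_bar) ** 2 / s_xx
    ci_half = t_crit * s_e * math.sqrt(core)
    pi_half = t_crit * s_e * math.sqrt(1 + core)   # extra 1 for individual variation
    return ci_half, pi_half

s_e = math.sqrt(8.8 / 3)   # assumed definition of s_e, with df = n - 2 = 3
ci_half, pi_half = interval_half_widths(
    x0=40, n=5, x_bar=55.0, s_xx=1000.0, s_e=s_e, t_crit=3.182)

print(f"95% CI half-width at age 40: {ci_half:.2f}")
print(f"95% PI half-width at age 40: {pi_half:.2f}")
```

Each interval is then the point estimate β̂0 + β̂1 x0 plus or minus the corresponding half-width. Both half-widths grow as x0 moves away from x̄, which is one reason extrapolation is risky.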
For Practice. What median blood pressure would you expect your biostatistics professor to have
when he turned 40? Also compute a corresponding 95% prediction interval.
Now Your Turn: In a certain species of fish, the female members have the remarkable ability
to change their sex to male if too few males are in the population.
This species of fish is used to study the impact of the “estrogen-like” compounds introduced to
natural water bodies through human pollution. After a specified number of males, between three
and nine, were randomly removed from a fish tank, the number of females that changed their sex
to male was counted. The data is summarized in the following chart:
(b) Do you suspect that there is a linear relationship between the two variables? Justify your
response.
(c) Determine a 95% confidence interval for the true regression coefficient and for the true
y-intercept.
(e) How many female fish would you expect to change sex whenever 4 males are removed from
the tank? Determine a 95% confidence interval for this estimated expected value.
(f) Suppose your biostatistics professor has a tank containing this fish species. Predict how
many females would change their sex after he removed 6 males from the tank and give a corresponding
95% prediction interval.
Solution: (a)
In the scatterplot above, as the number of male fish removed from the tank increases, the number
of female fish who change sex also increases. Hence one would expect there to be a positive
correlation between the number of male fish removed from the tank and the number of female
fish who change sex.
(b) From the Correlations table above, Pearson’s correlation coefficient is 0.857. Because the
p-value= 0.002 < 0.05 = α, there is evidence to reject the hypothesis that the true value of
Pearson’s Correlation Coefficient is 0 and conclude that the true value of Pearson’s Correlation
Coefficient is not 0. Combining this with the fact that the estimate is 0.857, there is evidence sug-
gesting that there is strong positive correlation between the independent and dependent variables.
Consequently there is evidence that a linear relationship may exist between the two variables.
(c)
Referring to the Coefficients table above, a 95% confidence interval for the true
y-intercept is (-3.266, 3.388).
Referring to the Coefficients table above, a 95% confidence interval for the true regression
coefficient is (0.496, 1.454).
(d) Referring to the Coefficients table in c), because the p-value=0.002 < 0.05 = α level of
significance, there is evidence to reject the hypothesis that the true slope is zero and conclude
that the true slope is not zero (p-value=0.002). Therefore the variable representing the number
of males removed from the tank needs to be included in the model.
Referring to the Coefficients table in c), because the p-value= 0.967 > 0.05 = α level of
significance, there is no evidence to reject the hypothesis that the true y-intercept is zero (p-
value=0.967). Therefore the y-intercept does not need to be included in the model.
Because the coefficient of determination (r² = 0.734) associated with the regression line
y = 0.061 + 0.975x
is less than the coefficient of determination (r² = 0.957) associated with the regression line
y = 0.983x, the model Y = β1 X + ε is the better model. Hence y = 0.983x is the sample
regression line which best predicts the number of female fish who change sex in response to a
particular number of male fish being removed from the tank.
The above analysis suggests that the best model would be of the form
Y = β1 X + ε.
Note that the degrees of freedom in (e) and (f) are n-1, not n-2, because there is only one parameter
in the model.
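For a model with no intercept, the least-squares slope has a simple closed form, β̂1 = Σ xi yi / Σ xi², which follows from minimizing Σ(yi − β1 xi)². The Python sketch below illustrates it together with the change in degrees of freedom noted above. The formula is stated as an assumption since it is not derived in the text, and the function name and data are hypothetical (they are not the fish data).

```python
import math

def fit_through_origin(x, y):
    """Least-squares fit of y = b1 * x (no intercept).
    Returns (b1, s_e, df) with df = n - 1 because only one parameter is estimated."""
    b1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
    rss = sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))
    df = len(x) - 1
    s_e = math.sqrt(rss / df)   # residual standard error on n - 1 degrees of freedom
    return b1, s_e, df

# Hypothetical data, for illustration only.
b1, s_e, df = fit_through_origin([1, 2, 3], [2, 3, 7])
print(f"slope = {b1:.3f}, s_e = {s_e:.3f}, df = {df}")
```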
1.13.4 Diagnostics for the least-squares regression line
The coefficient of determination is r² and measures the proportion of the total variation in the
response variable that is explained by the least-squares regression line.
The adjusted coefficient of determination is r²_adj and is used for model selection.
Note that r²_adj < r² . The adjusted coefficient of determination cannot be interpreted exactly like
the coefficient of determination. At best it can be thought of as saying that the least-squares
regression line explains at least r²_adj × 100% of the total variation in the response variable. Note
the phrase "at least" in this definition.
For Practice. For the Blood Pressure and Age data in the previous example, determine the
coefficient of determination.
A residual is the difference between an observed value of the response variable and the value
predicted by the fitted curve, that is for the i’th observation, the residual is ri = yi − ŷi .
Residuals calculated using the least-squares regression line are called least-squares residuals and
have the property that the mean of the least-squares residuals is always zero. Because the residuals
themselves always sum to zero, we instead calculate the residual sum of squares (RSS):
\[ RSS = \sum r_i^2 = \sum (y_i - \hat{y}_i)^2 , \]
which is a measure of the goodness of fit of the line to the data.
RSS measures the variability in the dependent variable that is NOT explained by the independent
variable. Consequently RSS is also abbreviated SSunexplained .
The total sum of squares, abbreviated SStotal , is calculated using
\[ SS_{total} = \sum (y_i - \bar{y})^2 , \]
which is the total variability in the response variable Y ignoring the independent variable X.
The sum of squares measuring the total variability in Y explained by the independent variable X,
abbreviated SSexplained , is calculated using
\[ SS_{explained} = \sum (\hat{y}_i - \bar{y})^2 . \]
Note that
SStotal = SSexplained + SSunexplained .
For Practice. For the Blood Pressure and Age data, calculate SStotal , SSexplained , SSunexplained .
Solution:
Referring to the above table, SStotal = 1647.2, SSexplained = 1638.4, SSunexplained = 8.8.
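These three sums of squares tie back to the coefficient of determination, since r² = SSexplained / SStotal. The short Python sketch below checks the decomposition and computes r² from the values just reported. It also computes an adjusted value using the commonly used formula r²_adj = 1 − (1 − r²)(n − 1)/(n − 2) for a model with one predictor; that formula is an assumption here, since it is not given in the text, and n = 5 is the number of age groups.

```python
import math

ss_total, ss_explained, ss_unexplained = 1647.2, 1638.4, 8.8
n = 5   # number of age groups in the example

# The explained and unexplained parts should add up to the total.
gap = abs(ss_explained + ss_unexplained - ss_total)

r2 = ss_explained / ss_total
r = math.sqrt(r2)                            # positive root, since the trend is increasing
r2_adj = 1 - (1 - r2) * (n - 1) / (n - 2)    # assumed formula, one predictor

print(f"decomposition gap = {gap:.1e}")
print(f"r^2 = {r2:.4f}, r = {r:.3f}, adjusted r^2 = {r2_adj:.4f}")
```

The value of r recovered this way matches the Pearson correlation coefficient of 0.997 reported for these data.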
A residual plot is a scatter plot of the regression residuals against the independent variable.
Residual plots help us assess how well a regression line fits the data.
1) If a plot of the residuals against the predictor variable shows a discernible pattern, then the
predictor and response variables may not be linearly related.
2) If a plot of the residuals against the predictor variable shows the spread of the residuals
increasing or decreasing as the predictor variable increases, then an assumption of the least-
squares linear model is violated. This is the constant variance of the errors assumption.
3) If a plot of the residuals against the predictor variable shows that one residual is much larger
or smaller than all the other residuals, then the observation used to calculate this residual may
be an outlier.
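The quantities plotted in a residual plot are simple to compute. The following Python sketch fits a least-squares line using the standard closed-form estimates (an assumption, as they are not written out in the text), lists each residual against its x-value, and flags the observation with the largest residual in absolute value as a candidate for closer inspection. The function names and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line fit of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    return y_bar - b1 * x_bar, b1

def least_squares_residuals(x, y):
    """Residuals r_i = y_i - yhat_i from the least-squares line."""
    b0, b1 = fit_line(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

res = least_squares_residuals(x, y)
rss = sum(r ** 2 for r in res)
largest = max(range(len(res)), key=lambda i: abs(res[i]))

for xi, ri in zip(x, res):
    print(f"x = {xi}: residual = {ri:+.3f}")
print(f"sum of residuals = {sum(res):.1e}, RSS = {rss:.4f}")
print(f"largest |residual| at x = {x[largest]}")
```

Plotting the printed residuals against x gives the residual plot; the three checks listed above are then made by eye.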
For Practice. Draw a residual plot for the Age vs Median Blood Pressure example.
Solution:
Chapter 2
Analysis of Variance
Chapter 3
Non-parametric Methods
Chapter 4
4.1 Overview
In Module I: Review, we discussed fitting a simple linear model (i.e. one continuous dependent
and one continuous independent variable) to a set of data. Quite often it is found that this
single independent variable does not predict well the dependent variable. In fact, quite often
the dependent variable is influenced by many independent variables simultaneously and it is this
combination of independent variables that leads to a good predictive model for the dependent
variable. This module provides you with an introduction to modeling the relationship between
one continuous dependent variable and several independent variables. In addition to trying to
quantify the relationship between an independent and a dependent variable, the estimated model
can be used to predict the response of the dependent variable for a given set of values for the
independent variables.