
Beginselen van Biostatistiek

K.U.Leuven
Prof. Jurgen Vercauteren
2019-20
Chapter 01

General Overview

Chapter 01 1
Introduction

Statistics: a science whereby inferences are made about specific random phenomena on the basis of relatively limited sample material.

Mathematical statistics: concerns the development of new methods of statistical inference and requires detailed knowledge of abstract mathematics for its implementation.

Applied statistics: involves application of mathematical statistical methods to specific subject areas such as economics, psychology, and public health.

Chapter 01 2
Biostatistics: a branch of applied statistics that applies
statistical methods to medical and biological problems.

Standard statistical methods may not necessarily be applicable to all studies.
New biostatistical methods are developed by biostatisticians.

Chapter 01 3
Role of Biostatistics in Medical Research
Observation:
Blood pressure readings of patient X obtained using
Automatic measuring device = 115 mm Hg; highest reading = 130 mm Hg
Standard blood pressure cuff = 90 mm Hg

Why is there a difference in blood pressure readings between an automatic machine and a human observer?
Are the two methods of determining blood pressure comparable?

Chapter 01 4
Study Question:
Are the methods of automatic vs. manual
determination of blood pressure comparable?

To address this question, we designed and carried out a small-scale study of blood pressure monitoring machines.

Chapter 01 5
Data Analysis
Data obtained from the study can be
summarized using descriptive statistics

Descriptive material can be numeric or graphic.
If numeric, data can be tabulated or presented as a frequency distribution.
If graphic, data can be summarized pictorially.

Chapter 01 6
Choice of numeric or graphic descriptive
statistics is dependent on type of distribution
of data.

1. Continuous data: there is an infinite number of possible values (e.g., blood pressure measurements); means and standard deviations may be used.
2. Discrete data: there are only a few possible values (e.g., sex); percentages of people for each value may be considered.
Chapter 01 7
Tabulated results of study under
consideration

Note: Meaningful data for all 100 people at each test site could not be obtained because of a few invalid readings from the machines.
Missing data is common in biostatistics and should be anticipated at the planning stage.

Notice the apparent difference in blood pressure readings between machine vs. manual measurements in locations C and D.
Chapter 01 8
Inferential Statistics
Determining whether the difference in blood
pressure readings is “real” or “by chance”

Sample size = 98 people from the general population
Estimated mean difference = 14 mm Hg
Error in estimated mean difference = ?
True mean difference = d = ?
Chapter 01 9
Inferring the characteristics of a population from
a sample is the central concern of statistical
inference.

To accomplish this aim, we need to develop a probability model, which would tell us how likely it is to obtain a 14-mm Hg difference between the two methods in a sample of 98 people if there were no real difference between the two methods over the entire population of users of the machine.

A small enough probability would indicate that the difference between the two methods is real.
Chapter 01 10
For our study, we used a probability model based on the t distribution.

The probability was found to be < 1 in 1000 for each of the machines at locations C and D.

The low probability indicated that there is a real difference between the automatic and manual methods of blood pressure determination.
Chapter 01 11
Further data analyses were carried out using a
statistical package.

A statistical package is a collection of statistical programs that describe data and perform various statistical tests on the data.
Commonly used statistical packages include R, SAS, SPSS, Stata, MINITAB, and Excel.

Chapter 01 12
The End

Chapter 01 13
Chapter 02
Descriptive Statistics

Chapter 02 1
Introduction
The first step in data analysis is to describe the data in some
concise manner.
Descriptive statistics that involve numeric or graphic display
are crucial in capturing and conveying the final results of
studies in publications.

Features of a good numeric or graphic form of data summarization:
Self-contained
Understandable without reading the text
Clearly labeled attributes with well-defined terms
Indicates principal trends in the data

Chapter 02 2
Example: Bar graphs

Vitamin A consumption
prevents cancer

Total cancer cases: 200


Total matched controls: 200

The bar graphs show that the vitamin A consumed by controls is more than that consumed by the patients with cancer. In some cases, the levels exceed the recommended daily allowance (RDA).
Chapter 02 3
Example: Scatter plot

CO concentrations are about the same in the working environments of passive smokers and nonsmokers early in the day.

This supports the observation that passive smokers have lower pulmonary function than comparable nonsmokers.

Chapter 02 4
(1) Measures of Location
It is easy to lose track of the overall picture when there are too many sample points.
Data summarization is important before any inferences can be made about the population from which the sample points have been obtained.
A measure of location is a type of measure useful for data summarization that defines the center or middle of the sample.

Chapter 02 5
The Arithmetic Mean or Average

Arithmetic mean or the “average”: the sum of all the observations divided by the number of observations.
Statistically expressed as x̄ = (x1 + x2 + ⋯ + xn)/n
Limitation: oversensitive to extreme values, in which case it may not be representative of the location of the majority of sample points.

Chapter 02 6
Median
Sample median: if n is odd, the ((n + 1)/2)th largest observation; if n is even, the average of the (n/2)th and (n/2 + 1)th largest observations.
Example: calculating the median
Order the sample as follows: 3, 5, 7, 8, 8, 9, 10, 12, 35
Because n is odd, the sample median is the fifth largest point, that is, 8.
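In R, for the ordered sample above (note how the outlier 35 pulls the mean above the median):
x <- c(3, 5, 7, 8, 8, 9, 10, 12, 35)
mean(x)    # arithmetic mean, approx. 10.8
median(x)  # 8, the fifth largest of n = 9 points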

Chapter 02 7
Comparing Average and Median

For symmetric distributions, the average is approximately the same as the median.
For positively skewed distributions, the average tends to be ………… than the median.
For negatively skewed distributions, the average tends to be ………… than the median.
poll
Chapter 02 8
Mode
Mode: the most frequently occurring value among all the
observations in a sample.
Data distributions may have one or more modes.
One mode = unimodal
Two modes = bimodal
Three modes = trimodal and so on.

Example

Mode is 28

Chapter 02 9
Geometric Mean

Many types of laboratory data (for example, concentrations) can be expressed as multiples of 2, that is, a constant multiplied by a power of 2:
2^k c, k = 0, 1, … for some constant c
Example: 2^k (0.03125) for k = 0, 1, 2, …

Chapter 02 10
Geometric Mean

The distribution of the original concentrations is skewed:

Chapter 02 11
Geometric Mean
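The formulas on this slide are not legible in this extraction; as a hedged sketch, the geometric mean is the antilog of the mean of the log-values, illustrated in R with hypothetical concentrations on the 2^k (0.03125) scale:
conc <- 2^(0:5) * 0.03125     # hypothetical concentrations
exp(mean(log(conc)))          # geometric mean: antilog of the mean log
2^mean(0:5) * 0.03125         # same value, computed on the k scale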

Chapter 02 12
(2) Measures of Spread

The mean obtained by the two methods is the same. However, the
variability or spread of the Autoanalyzer method appears to be
greater.

Chapter 02 13
Range
Range is the difference between the largest and smallest
observations in a sample.
Once the sample is ordered, it is very easy to compute the range.
Range is very sensitive to extreme observations or outliers.
The larger the sample size (n), the larger the range, and the more difficult the comparison between ranges from data sets of varying sizes.

A better approach to quantifying the spread in data sets is percentiles or quantiles.
Percentiles are less sensitive to outliers and are not greatly affected by the sample size.

Chapter 02 14
Quantiles or percentiles
The pth percentile is the value Vp such that p percent of the sample points are
less than or equal to Vp.
Median is the 50th percentile, which is a special case of a quantile.

Frequently used percentiles are:
quartiles (25th, 50th, and 75th percentiles)
quintiles (20th, 40th, 60th, and 80th percentiles)
deciles (10th, 20th, …, 90th percentiles)
There are different algorithms for calculating quantiles, leading to (slightly) different values. It is best to use a computer program like RStudio that works with R:
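For example, R's quantile() exposes several of these algorithms via its type argument (the sample is the one from the median example):
x <- c(3, 5, 7, 8, 8, 9, 10, 12, 35)
quantile(x, probs = c(0.25, 0.5, 0.75))            # default algorithm, type = 7
quantile(x, probs = c(0.25, 0.5, 0.75), type = 2)  # a different common algorithm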

Chapter 02 15
RStudio

Chapter 02 16
Chapter 02 17
Chapter 02 18
Variance and Standard Deviation
If the distribution is bell-shaped, then a measure that can
summarize the difference (or deviations) between the individual
sample points and the arithmetic mean can be expressed as

that is, the sum of deviations Σ(xi − x̄).
The sum of the deviations of the individual observations of a sample about the sample mean is always zero. Alternatively, the variance, which is (approximately) the average of the squares of the deviations from the sample mean, and its square root may be used:
Variance: s² = Σ(xi − x̄)²/(n − 1)    Standard deviation: s = √s²
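In R (using the sample from the median example; var() and sd() use the n − 1 divisor, matching the formulas above):
x <- c(3, 5, 7, 8, 8, 9, 10, 12, 35)
sum(x - mean(x))  # deviations about the mean sum to (numerically) zero
var(x)            # sample variance
sd(x)             # standard deviation = sqrt(var(x))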

Chapter 02 19
example

Chapter 02 20
Coefficient of Variation (CV)

Chapter 02 21
Graphic Methods
Graphic methods of displaying data give a quick overall impression of
data. Bar graphs, stem-and-leaf plots and box plots, are some graphic
methods:

Bar graphs:
used to display grouped data
difficult to construct
the identity of the sample points within the respective groups is lost

Chapter 02 22
Graphic methods
Stem-and-Leaf plots:
Each data point is converted into stem and leaf, e.g., 134 (stem: 13; leaf: 4)

The collection of leaves indicates the shape of the data distribution (displayed as Stem | Leaf).
Chapter 02 23


Graphic methods
Box plot:

If the distribution is symmetric, then upper and lower quartiles should be
approximately equally spaced from the median
If the upper quartile is farther from the median than the lower quartile, then the
distribution is positively skewed
If the lower quartile is farther from the median than the upper quartile, then the
distribution is negatively skewed

Chapter 02 24
Case study 1: Effects of lead exposure on
neurological and psychological function in children

The finger-wrist tapping scores (MAXFWT) and full-scale IQ scores (IQF) seem
slightly lower in the exposed group than in the control group.
We analyze these data in more detail in later chapters, using t tests, analysis of
variance and regression methods.

Chapter 02 25
Case study 2: Effects of
tobacco use on bone-mineral
density (BMD in g/cm²) in
middle-aged female twins
Matched-pair study: in this case, matching is based on having similar genes (twins).
Additional info relevant for BMD was obtained and included in a statistical regression model (see later chapters).
Especially for the lumbar spine, an inverse relationship is seen between the difference in BMD and the difference in tobacco use.

Chapter 02 26
Summary
Numeric or graphic methods for displaying data help in quickly summarizing a data set and/or presenting results to others.
A data set can be described numerically in terms of a measure of location and a measure of spread:
Measures of location: arithmetic mean, median, mode, geometric mean
Measures of spread: standard deviation, quantiles, range
Graphic methods include bar graphs and more exploratory methods such as stem-and-leaf plots and box plots.

Chapter 02 27
The End

Chapter 02 28
Chapter 03
Probability

Chapter 03 1
Introduction
In addition to describing data, we might want to test specific inferences about the
behavior of data. To set a framework for evaluating occurrence, we introduce the
concept of probability.

Sample space = the set of all possible outcomes.
An event = any set of outcomes of interest.
The probability of an event = the relative frequency of this set of outcomes over an infinite number of trials.
In real life, experiments cannot be performed an infinite number of times. Hence, probabilities are estimated as empirical probabilities obtained from large samples.
Theoretical probability models may also be constructed from which probabilities of many different kinds of events can be computed.
Comparing empirical probabilities with theoretical probabilities enables us to assess the goodness-of-fit of probability models.
Chapter 03 2
The probability of an event E, denoted by Pr(E), always satisfies 0 ≤ Pr(E) ≤ 1.

If outcomes A and B are two events that cannot both happen at the same time,
then these events are mutually exclusive.
In that case we have: Pr(A or B occurs) = Pr(A) + Pr(B)

Example: Suppose
A = person has normotensive diastolic blood pressure: DBP < 90
B = person has borderline DBP readings: 90 ≤ DBP < 95.
Z = event that person has DBP < 95.

Pr(A) = 0.7 and Pr(B) = 0.1.


Then: Pr(Z) = Pr(A) + Pr(B) = 0.8.

Chapter 03 3
Multiplication and Addition Laws of Probability

If Pr(A ∩ B) = Pr(A) × Pr(B), then A and B are independent. This is the Multiplication Law of Probability. Then Pr(B|A) = Pr(B).

Example 1: A = mother is hypertensive; B = father is hypertensive
Pr(A) = 0.1, Pr(B) = 0.2, and Pr(A ∩ B) = 0.02
A ∩ B = both mother and father are hypertensive.
Pr(A ∩ B) = Pr(A) × Pr(B) = 0.1(0.2) = 0.02, so the two events are independent.
Chapter 03 4
Chapter 03 5
Chapter 03 6
Some more formulas

Chapter 03 7
Example

Chapter 03 8
Example

A priori probability
A posteriori probability

Chapter 03 9
The End

Chapter 03 10
Chapter 04
Discrete Probability Distributions

Chapter 04 1
Introduction
In this chapter, we will discuss

Discrete random variables
Probability-mass function (probability distribution), cumulative-distribution function, expected value, and variance
Permutations and combinations
Binomial distribution
Poisson distribution

Chapter 04 2
Random Variables
A random variable is a function that assigns numeric values to different
events in a sample space.

Two types of random variables: discrete and continuous

A random variable for which there exists a discrete set of numeric values is a discrete random variable.
A random variable whose possible values cannot be enumerated is a continuous random variable.

Chapter 04 3
Probability-Mass Function for a Discrete Random Variable

The values taken by a discrete random variable and its associated probabilities can be expressed by a rule or relationship called a probability-mass function (pmf).
A probability-mass function, sometimes also called a probability distribution, is a mathematical relationship, or rule, that assigns to any possible value r of a discrete random variable X the probability Pr(X = r). This assignment is made for all values r that have positive probability.

Chapter 04 4
The probability distribution can be displayed as a table or as a
mathematical formula giving the probabilities of all possible values.
Example: X = the number of persons that can be controlled for
hypertension out of 4 treated persons.

The probability of any particular value must be between 0 and 1, and the sum of the probabilities of all values must be exactly equal to 1.
The probabilities can be determined by experiments (observational studies), by imposing a well-known distribution (that fits the sample data), or by applying a mathematical formula (see further).

Chapter 04 5
Expected Value of a Discrete Random Variable
The analog of the arithmetic mean or average x̄ of a sample data set is called the expected value of a random variable, or population mean, and is denoted by E(X) or μ.
It is obtained by multiplying each possible value by its respective probability and summing these products over all possible values:
E(X) = Σ xi Pr(X = xi), summed over all values xi with positive probability
What is the expected value for the number of persons that can be controlled for hypertension? → POLL
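A sketch in R with a hypothetical pmf (illustrative values only, not the poll answer; the probabilities must sum to 1):
r <- 0:4                          # possible values of X
p <- c(0.1, 0.2, 0.3, 0.3, 0.1)   # assumed probabilities for illustration
sum(r * p)                        # expected value E(X)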

Chapter 04 6
Variance of a Discrete Random Variable

The analog of the sample variance (s²) for a random variable is called the variance of a random variable, or population variance, and is denoted by Var(X) or σ².
The standard deviation of a random variable X, denoted by sd(X) or σ, is defined as the square root of its variance.
Approximately 95% of all possible values fall within two standard deviations (2σ) of the mean of a random variable.

Chapter 04 7
The cumulative-distribution function F(x) or cdf of a random
variable X is the Pr(X ≤ x).

Example:
F(X = r) for successive values r: 0.129, 0.393, 0.664, 0.849, 0.944, 0.983, 1
For a discrete random variable, the cdf looks like a series of steps, called a step function.
As the number of possible values increases, the cdf approaches a smooth curve.
For a continuous random variable, the cdf is a smooth curve.
Chapter 04 8
Binomial Distribution
= A sample of n independent “Bernoulli” trials, each of which can have only two
possible outcomes, which are denoted as “success” and “failure.”
The probability of a success at each trial is assumed to be some constant p, and
hence the probability of failure is 1 – p = q.

Example: We have 5 cells that can be neutrophilic (x) or non-neutrophilic (o). The probability that a cell is neutrophilic is 0.6.
What is the probability that the 2nd and 5th cells are neutrophilic?
→ POLL

Chapter 04 9
Binomial Distribution
What is the probability that any 2 cells out of 5 will be neutrophilic?
In general, for a binomial distribution, the probability of k successes within n trials is:
Pr(X = k) = C(n, k) p^k q^(n−k), k = 0, 1, …, n, where C(n, k) = n!/[k!(n − k)!]
Chapter 04 10
Binomial Distribution
Table 1 at the end of the book can be used for probability calculations of
the Binomial distribution for n up to 20 and p = 0.05, 0.1, 0.15, …., 0.5

Chapter 04 11
Using Electronic Tables

What if n > 20 and/or p is not given in the binomial table?
→ For sufficiently large n, the normal distribution can be used to approximate the binomial distribution, and tables of the normal distribution can be used to evaluate binomial probabilities (see later chapter).
→ If the sample size is not large enough to use the normal approximation, then an electronic table can be used to evaluate binomial probabilities.
MS Excel provides a menu of statistical functions, including calculation of probabilities for many probability distributions; for example, the binomial-distribution function, called BINOMDIST, can be used to calculate the pmf and cdf for any binomial distribution.

Chapter 04 12
Chapter 04 13
Expected Value and Variance of the Binomial Distribution

The expected value of a discrete random variable is E(X) = Σ r Pr(X = r).
In the special case of a binomial distribution, the only values that take on positive probability are 0, 1, 2, …, n, and these occur with probabilities C(n, r) p^r q^(n−r).
Thus, the expected value E(X) = np.
Similarly, the variance Var(X) = npq.
Chapter 04 14
Chapter 04 15
Poisson Distribution
The Poisson distribution is the second most frequently used discrete distribution after the binomial distribution. It is usually associated with rare events per unit of time, space, etc. Assume that the number of events per unit of time (space, …) is constant throughout the entire time (space, …) and that the occurrence of an event is independent of previous occurrences.
Example: The number of deaths attributed to typhoid fever over a long period of time (for example, 1 year) is a Poisson random variable.
The probability of k events occurring in a time period t for a Poisson random variable with parameter λ is
Pr(X = k) = e^(−μ) μ^k / k!, k = 0, 1, 2, …
where μ = λt
e is approximately 2.71828
λ = expected no. of events per unit time
μ = expected no. of events over time period t
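In R, using an illustrative μ = 2.3 (the value used in the table example below):
mu <- 2.3
dpois(0:3, lambda = mu)  # Pr(X = 0), ..., Pr(X = 3) from the pmf above
ppois(3, lambda = mu)    # Pr(X <= 3), the cdf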

Chapter 04 16
Chapter 04 17
Poisson Distribution
Table 2 at the end of the book can be used for probability calculations of
the Poisson distribution for μ ≤ 20

μ = 2.3

Chapter 04 18
Using Electronic Tables

What if μ is not given in the Poisson table?
→ For sufficiently large μ (≥ 10), the normal distribution can be used to approximate the Poisson distribution, and tables of the normal distribution can be used to evaluate Poisson probabilities (see later chapter).
→ If μ is not large enough to use the normal approximation, then an electronic table can be used to evaluate Poisson probabilities.
MS Excel can be used to calculate the pmf and cdf for any Poisson distribution.

Chapter 04 19
Computation of Poisson Probabilities
Electronic tables for the Poisson distribution
POISSON function of Excel 2007 can be used to compute individual
and cumulative probabilities for the Poisson distribution

Chapter 04 20
Expected Value and Variance of the Poisson Distribution
E(X) = Var(X) = μ
This guideline helps identify random variables that follow the Poisson distribution.
If we have a data set from a discrete distribution where the mean and variance are about the same, then we can preliminarily identify it as a Poisson distribution. Example:
x̄ = 180/10 = 18 and s² = [(15 − 18)² + (10 − 18)² + ⋯ + (15 − 18)²]/(10 − 1) = 208/9 ≈ 23
The variance is approximately the same as the mean; therefore, the Poisson distribution would fit well here.
Chapter 04 21
Poisson Approximation to the Binomial Distribution
The binomial distribution with large n and small p can be accurately approximated by a Poisson distribution with parameter μ = np.
The mean of this distribution is given by np and the variance by npq. Note that q ≈ 1 for small p, and thus npq ≈ np. That is, the mean and variance are almost equal.
Why approximate by Poisson? → The binomial distribution involves expressions such as C(n, k) and (1 − p)^(n−k), which are cumbersome for large n.
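A sketch in R with illustrative values (large n, small p; the two pmfs nearly coincide):
k <- 0:4
dbinom(k, size = 1000, prob = 0.002)  # exact binomial, cumbersome by hand
dpois(k, lambda = 1000 * 0.002)       # Poisson approximation with mu = np = 2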

Chapter 04 22
Summary
In this chapter, we discussed:
Random variables and the distinction between discrete and continuous variables.
Specific attributes of random variables, including the notions of probability-mass function (probability distribution), cdf, expected value, and variance.
The sample frequency distribution was described as a sample realization of a probability distribution, whereas the sample mean (x̄) and variance (s²) are sample analogs of the expected value and variance, respectively, of a random variable.
The binomial distribution was shown to be applicable to binary outcomes (“success” and “failure”).
The Poisson distribution is a classic model to describe the distribution of rare events.

Chapter 04 23
The End

Chapter 04 24
Chapter 05
Continuous Probability Distributions

Chapter 5 1
Introduction
This chapter focuses on the normal, or Gaussian or “bell-shaped,” distribution. It is the cornerstone of most methods of estimation and hypothesis testing.
Many random variables, such as the distribution of birth weights or blood pressures in the general population, tend to follow approximately a normal distribution.
Those variables that are not themselves normal are oftentimes closely approximated by a normal distribution when summed many times.
Using the normal distribution is desirable since it is easy to use and tables for it are more widely available than are tables for other distributions.

Chapter 5 2
An analog for a continuous random variable to the concept of a
probability-mass function, is the probability-density function (pdf).
These functions are also called (probability) distributions.

Examples:

Chapter 5 3
The expected value of a continuous random variable X, denoted by E(X) or μ, is the average value taken on by the random variable.
The variance of a continuous random variable X, denoted by Var(X) or σ², is the average squared distance of each value of the random variable from its expected value, which is given by E(X − μ)² and can be re-expressed in short form as E(X²) − μ². The standard deviation, or σ, is the square root of the variance, that is, σ = √Var(X).

Chapter 5 4
Normal Distribution

The normal distribution is the most widely used continuous distribution.
It is vital to statistical work.
It is also frequently called the Gaussian distribution, after the well-known mathematician Carl Friedrich Gauss.
The normal distribution is generally more convenient to work with than any other distribution.
Body weights or DBPs approximately follow a normal distribution. Other distributions that are not themselves normal can be made approximately normal by transforming the data onto a different scale, such as a logarithmic scale.
Chapter 5 5
Usually, any random variable that can be expressed as a sum of many other random variables can be well approximated by a normal distribution.
Example: many physiologic measures that are determined in part by a combination of several genetic and environmental risk factors can often be well approximated by a normal distribution.
Most estimation procedures and hypothesis tests assume the random variable being considered has an underlying normal distribution.

Chapter 5 6
A point of inflection is a point at which the slope of the curve changes direction.
The distance from μ to the points of inflection is an indicator of the magnitude of σ.
Chapter 5 7
Chapter 5 8
Chapter 5 9
Chapter 5 10
Example: Pr(X<1.47) = 0.9292

Chapter 5 11
Example: Pr(X < −1.47) = Pr(Z > 1.47) = 0.0708 (= 1 − 0.9292)

Chapter 5 12
The percentiles of a normal distribution are often referred to in statistical inference. For example, the upper and lower fifth percentiles of the distribution may be used to define a “normal” range of values.

Chapter 5 13
Using Electronic Tables for the Normal Distribution
In Excel 2007, the function NORMSDIST(x) provides the cdf of the standard normal distribution for any value of x.
Example: determine the 85th percentile of a standard normal density:

Or via R:
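qnorm(0.85)  # 85th percentile of N(0, 1), approx. 1.04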

Chapter 5 14
Conversion from an N(μ, σ²) Distribution to an N(0, 1) Distribution

This is known as standardization of a normal variable.

Chapter 5 15
X ~ N(80, 12²)
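A hypothetical query for X ~ N(80, 12²), sketched in R (the slide's own question is not shown in this extraction):
pnorm(90, mean = 80, sd = 12)  # Pr(X < 90), computed directly
pnorm((90 - 80) / 12)          # identical after standardizing to Z ~ N(0, 1)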

Chapter 5 16
Linear Combinations of Random Variables

Chapter 5 17
Normal Approximation to the Binomial Distribution
• If n is large, the binomial distribution is difficult to work with and an
approximation is easier to use rather than the exact binomial distribution.
• If n is moderately large and p is either near 0 or 1, then the binomial distribution will be very positively or negatively skewed, respectively.
• If n is moderately large and p is not too extreme, then the binomial
distribution tends to be symmetric and is well approximated by a normal
distribution.
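A check in R with illustrative values n = 100, p = 0.4 (using a continuity correction of 0.5):
pbinom(50, size = 100, prob = 0.4)                          # exact Pr(X <= 50)
pnorm(50.5, mean = 100 * 0.4, sd = sqrt(100 * 0.4 * 0.6))   # normal approximation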

Chapter 5 18
Example:

Chapter 5 19
Normal Approximation to the Poisson Distribution
The normal distribution can also be used to approximate discrete
distributions other than the binomial distribution, particularly the
Poisson distribution, which is cumbersome to use for large values of μ.

Chapter 5 20
The normal approximation is clearly inadequate for μ = 2, marginally inadequate for μ = 5, and adequate for μ = 10 and μ = 20.
Chapter 5 21
Example:

The exact Poisson probability via Excel:

Chapter 5 22
Summary
In this chapter, we discussed

Continuous random variables
The probability-density function, which is the analog of the probability-mass function for discrete random variables
Concepts of expected value, variance, and cumulative distribution for continuous random variables
The normal distribution, which is the most important continuous distribution:
Its two parameters: mean μ and variance σ²
Normal tables, which are used when working with the standard normal distribution
Electronic tables, which can be used to evaluate areas and/or percentiles for any normal distribution
Properties of linear combinations of random variables, for independent and dependent random variables
The normal approximation to the binomial and to the Poisson distribution

Chapter 5 23
The End

Chapter 5 24
Chapter 06
Estimation

Chapter 6 1
Introduction

Chapter 6 2
Introduction
In this chapter, we will discuss

How to infer the properties of the underlying distribution in a data set. This inference is usually based on inductive rather than deductive reasoning, that is, determining the best-fitting model among different probability models.

Two types of statistical inferences:

Estimation (this chapter): concerned with estimating the values of specific population parameters. These specific values are referred to as point estimates. Sometimes, interval estimation is carried out to specify a range within which the parameter values are likely to fall.

Hypothesis testing (Chapters 7-10): concerned with testing whether the value of a population parameter is equal to some specific value.
Chapter 6 3
Example

Chapter 6 4
Relationship Between Population and Sample

Random sample: a selection of some members of the population such that each member is independently chosen and has a known nonzero probability of being selected.
Simple random sample: each group member has the same probability of being selected.
Reference, target, or study population: the group we want to study. The random sample is selected from the study population.

Chapter 6 5
Random-Number Tables
A random number (or random digit) is a random variable X that
takes on the values 0, 1, 2, …, 9 with equal probability. Thus,
Pr(X = 0) = Pr( X = 1) = … = Pr(X = 9) = 1/10

Computer-generated random numbers are collections of digits that satisfy the following two properties:
Each digit 0, 1, 2, …, 9 is equally likely to occur.
The value of any particular digit is independent of the value of any other digit selected.
Computer programs are used to generate large sequences of random digits. See, for example, Table 4 in the appendix of Rosner's book.

Chapter 6 6
Chapter 6 7
Process of random selection
Each of the 1000 participants is assigned a number from 000 to 999, for example based on an alphabetical list. Twenty groups of three digits are selected, starting at any position in the random number table:
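A sketch of the same selection step in R (the seed is an arbitrary assumption, only for reproducibility):
set.seed(1)               # assumed seed
sample(0:999, size = 20)  # 20 IDs drawn without replacement from 000-999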

Chapter 6 8
Randomized Clinical Trials (RCT)

• Now accepted as the optimal study design in clinical research.
• For comparing different treatment arms.
• Patients are assigned to a treatment arm by “randomization.”
• If sample sizes are large: the characteristics of patients in the different arms will be the same.
• If sample sizes are small: check that the characteristics are similar.

Chapter 6 9
Randomized Clinical Trials (RCT)

Chapter 6 10
Chapter 6 11
Chapter 6 12
Design Features of Randomized Clinical Trials
In clinical trials, random assignment is sometimes called block
randomization.

A block size of 2n is predetermined, where for every 2n patients entering the study, n patients are randomly assigned to treatment A and the remaining n patients are assigned to treatment B.
For more than two treatment groups: if there are k treatment groups, then the block size might be kn, where for every kn patients, n patients are randomly assigned to the first treatment, the second treatment, and so on, up to the kth treatment.

Chapter 6 13
Design Features of Randomized Clinical Trials
• Stratification: patients can be subdivided into strata, according to
characteristics important for the outcome (for example age, gender, …).

• Blinding: an RCT is:
• Double blind (gold standard): neither the physician nor the patient knows the type of treatment
• Single blind: the physician knows, but the patient does not know the type of treatment
• Unblinded: both know

• Patients (and physicians) may be blind to treatment initially but the side
effects can indicate the treatment received.

Chapter 6 14
Estimation of the Mean of a Distribution
We will discuss point estimation and interval estimation for the population
mean μ.

A natural point estimator to use for estimating the population mean μ is the sample mean x̄.
• x̄ is a single realization of a random variable X̄ over all possible samples of size n that could have been selected from the population.
• X̄ denotes a random variable, and x̄ denotes a specific realization of the random variable X̄ in a sample.
• The sampling distribution of X̄ is the distribution of values of x̄ over all possible samples of size n that could have been selected from the reference population.
Chapter 6 15
Chapter 6 16
Expected value of 𝑋
• It can be shown that the average (or mean) of these x̄'s (when taken over a large number of samples) approximates μ regardless of the underlying distribution of X.
• In other words: E(X̄) = μ
• → X̄ is an unbiased estimator of μ.
• There are other unbiased estimators of μ, but if the underlying distribution is normal, then it can be shown that X̄ is the unbiased estimator with the smallest variance. So X̄ is the “minimum variance unbiased estimator” of μ.
Chapter 6 17
Chapter 6 18
Standard deviation of X̄
• The standard deviation of these x̄'s (when taken over a large number of samples) approximates σ/√n regardless of the underlying distribution of X, with σ the standard deviation of X.
• So sd(X̄) = σ/√n
Chapter 6 19
Standard deviation of X̄
• sd(X̄) = σ/√n
• Proof: Var(L) = Σ ci² Var(Xi) (see Chapter 5), with L = c1X1 + c2X2 + … + cnXn.
If L = (X1 + X2 + ⋯ + Xn)/n = (1/n)X1 + (1/n)X2 + ⋯ + (1/n)Xn, then for X̄ = L:
Var(X̄) = n × (1/n)² × σ² = σ²/n, so sd(X̄) = σ/√n.

Chapter 6 20
Standard error of the mean (sem)
• So the “standard error of the mean” is sd(X̄) = σ/√n, which is estimated by s/√n.
• So the sample means from repeated samples of size 30 are less variable than those from samples of size 10 (which in turn are less variable than those of size 1).

Chapter 6 21
sem = s/√1
sem = s/√10
sem = s/√30

Chapter 6 22
Central Limit Theorem
• So we just showed that E(X) = μ and Var(X) = σ² yields E(X̄) = μ and Var(X̄) = σ²/n.
• It can be shown that X ~ N(μ, σ²) yields X̄ ~ N(μ, σ²/n).
• What if the underlying distribution is not normal?
→ Central Limit Theorem

Chapter 6 23
Central Limit Theorem

This theorem allows us to perform statistical inference based on the approximate normality of the sample mean despite the nonnormality (skewness) of the distribution of individual observations.
However, it is always good to reduce the skewness of the distribution by transforming the data, using for example a log scale. The central-limit theorem can then be applicable for smaller sample sizes than if the data are retained in the original scale.

Chapter 6 24
Chapter 6 25
Interactive :

http://lstat.kuleuven.be/newjava/vestac/

(choose ‘basics’, ‘distribution of mean (continuous)’)
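A small R simulation makes the same point as the applet; the exponential distribution (with μ = σ = 1) and the constants are illustrative assumptions:
set.seed(1)
xbar <- replicate(10000, mean(rexp(30)))  # 10,000 sample means, each with n = 30
hist(xbar, breaks = 50)                   # roughly bell-shaped despite skewed data
c(mean(xbar), sd(xbar), 1 / sqrt(30))     # sd(xbar) is close to sigma/sqrt(n)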

Chapter 6 26
Interval Estimation for μ
• Interval estimation = specify a range within which the parameter values are likely to fall.
• How to set up an interval estimate for μ:
X̄ ~ N(μ, σ²/n) yields Z = (X̄ − μ)/(σ/√n) ~ N(0, 1)
So:
Pr(−z_(1−α/2) ≤ (X̄ − μ)/(σ/√n) ≤ z_(1−α/2)) = 1 − α
Or
Pr(−X̄ − z_(1−α/2) σ/√n ≤ −μ ≤ −X̄ + z_(1−α/2) σ/√n) = 1 − α
Or
Pr(X̄ − z_(1−α/2) σ/√n ≤ μ ≤ X̄ + z_(1−α/2) σ/√n) = 1 − α

Chapter 6 27
Interval Estimation for μ
• For α = 0.05 this becomes
Pr(−1.96 ≤ (X̄ − μ)/(σ/√n) ≤ +1.96) = 0.95
Or
Pr(−X̄ − 1.96 σ/√n ≤ −μ ≤ −X̄ + 1.96 σ/√n) = 0.95
Or
Pr(X̄ − 1.96 σ/√n ≤ μ ≤ X̄ + 1.96 σ/√n) = 0.95

Chapter 6 28
Chapter 6 29
Chapter 6 30
So (X̄ − 1.96 σ/√n, X̄ + 1.96 σ/√n)
is a 95% confidence interval (CI) for μ.

• Problem: σ is rarely known in practice.
Solution: estimate σ by the sample standard deviation s and try to construct CIs using the quantity (X̄ − μ)/(S/√n).
• Again a problem: (X̄ − μ)/(S/√n) is no longer normally distributed.
This was solved in 1908 by a statistician named William Gosset.
Gosset found the Student's t distribution for (X̄ − μ)/(S/√n):
The shape of the t distribution depends on the sample size n.
Thus the t distribution is not a unique distribution but is instead a family of distributions indexed by a parameter referred to as the degrees of freedom (df) of the distribution.
Chapter 6 31
Student’s t distribution is a family of distributions indexed by the
degrees of freedom d. The t distribution with d degrees of freedom
is referred to as the td distribution.
The t distribution is symmetric about 0 but is more spread out than the standard normal distribution:

Chapter 6 32
Pr(t4 ≤ 2.776) = 0.975

Chapter 6 33
So if σ is unknown, we replace it by its estimate s and use the t_(n−1) distribution, so that the 95% CI for μ becomes:
(X̄ − t_(n−1, 0.975) s/√n, X̄ + t_(n−1, 0.975) s/√n)
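A sketch of this interval in R with hypothetical data (t.test() computes the same interval):
x <- c(3, 5, 7, 8, 8, 9, 10, 12, 35)  # hypothetical sample
n <- length(x)
mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)  # 95% CI by hand
t.test(x)$conf.int                                            # same interval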

Chapter 6 34
Chapter 6 35
Interpretation for α = 0.05: over the collection of all 95% CIs that could be constructed from repeated random samples of size n, 95% will contain μ.
Chapter 6 36
Over the collection of all 95% CIs that could be constructed from repeated random samples of size n, 95% will contain the parameter μ.
The length of the CI indicates the precision of the point estimate x̄.

Chapter 6 37
Chapter 6 38
Case study: effects of tobacco use on bone-mineral density (BMD) in
middle-aged women.

• 41 twin pairs of heavy- and light-smoking sisters
• Average difference in BMD of −0.036 ± 0.014 g/cm² (mean ± se)
• So the 95% CI for the true mean difference in BMD is −0.036 ± t_(40, 0.975) × 0.014 = −0.036 ± 2.021 × 0.014 ≈ (−0.064, −0.008)
• Since 0 is not included in this CI, we can say that there is a significant association (α = 0.05) between BMD and cigarette smoking.

Chapter 6 39
Estimation of the Variance of a Distribution
Point Estimation
Let X1, …, Xn be a random sample from some population with mean
 and variance 2. The sample variance S2 is an unbiased estimator
of 2 over all possible random samples of size n that could have
been drawn from this population, that is E(S2) = 2

We use “n − 1” instead of the more intuitive “n” because s² is then an unbiased estimator of σ²; the distinction matters most for small n.

Chapter 6 1
Estimation of the Variance of a Distribution

Interval estimation
To obtain an interval estimate of σ², we need to find the sampling distribution of S². If we assume that X ~ N(μ, σ²), then it can be shown that
(n − 1)S²/σ² ~ χ²_(n−1)
with
χ²_(n−1) = chi-square distribution with n − 1 degrees of freedom.

Chapter 6 2
Estimation of the Variance of a Distribution

Interval estimation
How to set up an interval estimate for σ²:
If (n − 1)S²/σ² ~ χ²_(n−1), then
Pr(χ²_(n−1, α/2) ≤ (n − 1)S²/σ² ≤ χ²_(n−1, 1−α/2)) = 1 − α
To obtain the 100% × (1 − α) CI for σ²:
((n − 1)s²/χ²_(n−1, 1−α/2), (n − 1)s²/χ²_(n−1, α/2))
This interval works best when X is normally distributed. Even for large n, this interval can perform poorly because there is no CLT to rely on.
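A sketch in R, with hypothetical n and s² (and assuming normally distributed data, as required above):
n <- 10; s2 <- 4                                     # assumed sample size and variance
(n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)   # (lower, upper) 95% CI for sigma^2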
Chapter 6 3
So a new family of continuous distributions, called chi-square ( 2)
distributions, must be introduced to enable us to find the sampling
distribution of S2:

The chi-square distribution takes on only positive values and is always skewed to the right.

Chapter 6 4
For n ≥ 3, the distribution has a mode greater than 0 and is skewed to the right. The skewness diminishes as n increases.
The uth percentile of a χ²_d distribution is denoted by χ²_(d, u), where Pr(χ²_d < χ²_(d, u)) = u.

Chapter 6 5
If u = 0.975, then χ²_(5, 0.975) = 12.83

Chapter 6 6
Summary
In this chapter, we discussed

The sampling distribution, crucial to understanding the principles of statistical inference
The minimum-variance unbiased estimator of μ
The central-limit theorem
Interval estimates or confidence intervals
The t and chi-square distributions, used to obtain interval estimates

Chapter 6 7
The End

Chapter 6 8
Chapter 07
Hypothesis Testing: One-Sample Inference

Chapter 7 1
Introduction
The hypothesis-testing framework specifies two hypotheses: the null hypothesis (H0) and the alternative hypothesis (H1 or Ha). We wish to compare the relative probabilities of obtaining the sample data under each of these hypotheses.
Hypothesis testing provides an objective framework for making decisions using probabilistic methods, rather than relying on subjective impressions.
It provides a uniform decision-making criterion that is consistent.
In a one-sample problem, hypotheses are specified about a single distribution.
In a two-sample problem, two different distributions are compared.

Chapter 7 2
The null hypothesis is the hypothesis that is to be tested. The alternative
hypothesis is the hypothesis that contradicts the null hypothesis.

H0: µ = µ0 vs. H1: µ < µ0

If we decide H0 is true, then we say we accept H0. If we decide H1 is true, then we state that H0 is not true or, equivalently, that we reject H0. Thus, four possible outcomes can occur:

Chapter 7 3
Type I error = Pr(rejecting H0 | H0 true)
Type II error = Pr(accepting H0 | H1 true)

Example:

Chapter 7 4
Type I error = α = the significance level of a test
Type II error = β
The power of a test is defined as
1 − β = 1 − probability of a type II error = Pr(rejecting H0 | H1 true)
The aim in hypothesis testing is to use statistical tests that make α and β as small as possible, which means falsely rejecting and falsely accepting the null hypothesis less often, respectively. These actions are contradictory: as α decreases, β increases, and vice versa. We therefore fix α and use the statistical test that minimizes β (or maximizes the power).

Chapter 7 5
One-sample inference can be applied on several hypothesis-testing
situations from which we will cover the following:

Each of the hypothesis tests can be conducted in one of two ways:

We can choose for a one-sided or for a two-sided alternative.


Chapter 7 6
(1) One-Sample Test for the Mean of a Normal Distribution
with unknown variance

Based on the test statistic t, where
t = (x̄ − μ0)/(s/√n)
which under H0 follows a t distribution with n − 1 degrees of freedom.
Chapter 7 7
One-Sample Test for the Mean of a Normal Distribution:
Two-sided Alternatives or One-Sided Alternatives

H0: µ = µ0 vs. H1: µ ≠ µ0

H0: µ = µ0 vs. H1: µ < µ0 or H1: µ > µ0

Chapter 7 8
So assume a normal distribution, or for n > 30 the CLT can be used.
t = (x̄ − μ0)/(s/√n), which under H0 follows a t distribution with n − 1 df.

Chapter 7 9
Example 1: one-sample one-sided t test

Assume that the cholesterol level is normally distributed. Suppose the mean
cholesterol level of 10 children whose fathers died from heart disease is
200 mg/dL and the sample standard deviation is 50 mg/dL.

Chapter 7 10
Example 1: Cardiovascular disease, pediatrics
Test: H0: μ = μ0 = 175 vs. H1: μ > 175
or: is the cholesterol level significantly higher than the general mean?
Test statistic: t = (x̄ − μ0)/(s/√n)
with t ~ t10−1 distribution under H0
Compute: t = (200 − 175)/(50/√10) = 1.58
Conclude: t = 1.58 < 1.833 = t9, 1−0.05 = t9, 0.95 (critical value from Table 5)
so don't reject H0: the cholesterol level is not significantly higher than the general mean
p-value: p = Pr(t9 > 1.58) = probability between 0.05 and 0.10
so p > 0.05, so indeed don't reject the null hypothesis at the 95% confidence level or 5% significance level.
This is a one-sided or one-tailed test.
In general: if p < 0.05, then H0 is rejected.
Chapter 7 14
Example 2: one-sample one-sided t test

Assume that the birthweight is normally distributed.

Chapter 7 15
Example 2: Obstetrics
Test: H0: μ = μ0 = 120 vs. H1: μ < 120
or: is the birthweight significantly lower than the national average?
Test statistic: t = (x̄ − μ0)/(s/√n)
with t ~ t100−1 distribution under H0
Compute: t = (115 − 120)/(24/√100) = −2.08
Conclude: t = −2.08 < −1.66 = t99, 0.05
so reject H0: the birthweight is significantly lower than the national average
p-value: p = Pr(t99 < −2.08) = probability between 0.01 and 0.025
so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level or 5% significance level.
The test statistic under H0 follows a t distribution with n − 1 df.
So assume a normal distribution, or for n > 30 the CLT can be used.
In general: if p < 0.05, then H0 is rejected.

Chapter 7 17

t = (x̄ − μ0)/(s/√n), which under H0 follows a t distribution with n − 1 df.
Two-sided or two-tailed test
In general: if p < 0.05, then H0 is rejected.
Chapter 7 18
Chapter 7 19
Example 3: one-sample two-sided t test

Chapter 7 20
Example 3: Cardiovascular disease
Test: H0: μ = μ0 = 190 vs. H1: μ ≠ 190
or: is the cholesterol level significantly different from the general mean?
Test statistic: t = (x̄ − μ0)/(s/√n)
with t ~ t100−1 distribution under H0
Compute: t = (181.52 − 190)/(40/√100) = −2.12
Conclude: t = −2.12 < −1.99 = t99, 0.025
so reject H0: the cholesterol level is significantly different from the general mean
p-value: p = 2Pr(t99 < −2.12) = 2 × (probability between 0.01 and 0.025) = between 0.02 and 0.05
so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level or 5% significance level.
So 2 methods for a one-sample t-test for the mean of a (normal*) distribution
(unknown variance and two-sided or one-sided alternatives) :
*For n>30: CLT
Compute the t test statistic:
which under H0 follows a t distribution with n-1 df.

1) the critical-value method: compare t with the critical value(s):


a) for H0: µ = µ0 vs. H1: µ < µ0 :
• if t < tn-1,a then H0 is rejected
• if t ≥ tn-1,a then H0 is accepted
b) for H0: µ = µ0 vs. H1: µ > µ0 :
• if t > tn-1,1-a then H0 is rejected
• if t ≤ tn-1,1-a then H0 is accepted
c) for H0: µ = µ0 vs. H1: µ ≠ µ0 :
• if t < tn-1,a/2 or if t > tn-1,1-a/2 then H0 is rejected
• if tn-1,a/2 ≤ t ≤ tn-1,1-a/2 then H0 is accepted

2) the exact p-value method: if p < α, then reject H0


a) for H0: µ = µ0 vs. H1: µ < µ0: p = Pr(t_(n−1) ≤ t)
b) for H0: µ = µ0 vs. H1: µ > µ0: p = Pr(t_(n−1) ≥ t)
c) for H0: µ = µ0 vs. H1: µ ≠ µ0: p = 2 × Pr(t_(n−1) ≤ t) if t < 0
or p = 2 × Pr(t_(n−1) ≥ t) if t > 0
Chapter 7 22
When is a one-sided test more appropriate than a two-sided test?

Generally, the sample mean falls in the expected direction from µ0 and it is
easier to reject H0 using a one-sided test than using a two-sided test.
A two-sided test can be more conservative because it is not necessary to guess
the appropriate side of the null hypothesis for the alternative hypothesis.

In some cases, only alternatives on one side of the null mean are of interest or
are possible, and a one-sided test is “better” than a two-sided test because it
has more power.

But the decision whether to use a one-sided or two-sided test must be made
before the data analysis (or before data collection) begins so as not to bias
conclusions based on results of hypothesis testing!
Do not change from a two-sided to a one-sided test after looking at the data.

Chapter 7 23
The Relationship Between Hypothesis Testing
and Confidence Intervals (Two-Sided Case)
In Example 3, the H0: μ = μ0 = 190 was rejected at the 95% confidence level or 5% significance level by using a t test (critical-value approach or p-value approach).
We can reach the same conclusion by setting up a 95% confidence interval for the mean μ of a (normal*) distribution with unknown variance (see previous chapter):
we see that μ0 = 190 is not included in this interval, so also confirming the rejection of H0.
The Power of a Test
It tells us how likely it is that a statistically significant difference will be
detected based on a finite sample size n, if the alternative hypothesis is true.
Let us study the length of the CI for the mean:

So the same factors influence the power of a test, because as the length of the CI decreases, there is less chance that μ0 is included, so the power increases. Moreover, the power also increases if μ0 is shifted further away from the sample mean.

Chapter 7 25
(2) One-Sample Test for the Mean of a Normal Distribution
with known variance

Based on the test statistic z, where
z = (x̄ − μ0)/(σ/√n)
which under H0 follows a standard normal distribution.
Same rationale as in the case of unknown variance, but now use the table of the standard normal distribution.

Chapter 7 26
Sample-Size Determination: one-sided alternatives

Suppose we wish to test H0: µ = µ0 vs. H1: µ = µ1 (and µ1 > µ0 or µ1 < µ0), where the data are normally distributed with mean µ and known variance σ². The sample size n needed to conduct a one-sided test with significance level α and probability of detecting a significant difference = 1 − β is:
n = σ²(z_(1−β) + z_(1−α))² / (μ0 − μ1)²
Similar formulas exist for all other cases of hypothesis testing.

Chapter 7 27
Sample-Size Determination: one-sided alternatives

Example 7.2: birthweight
The appropriate sample size needed to conduct the one-sided test
H0: µ = µ0 = 120 vs. H1: µ = µ1 = 115,
where the data are normally distributed with known variance σ² = 24², with significance level α = 0.05 and with power = 1 − β = 80%, is:
n = 24²(z_0.80 + z_0.95)² / (120 − 115)² = 576(0.84 + 1.645)² / 25 ≈ 142.3, rounded up to 143
Thus a sample size of 143 is needed to have an 80% chance of detecting a significant difference at the 5% level if the alternative mean is 115 oz and a one-sided test is used.
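The same z-based calculation in R:
sigma <- 24; mu0 <- 120; mu1 <- 115; alpha <- 0.05; power <- 0.80
n <- sigma^2 * (qnorm(power) + qnorm(1 - alpha))^2 / (mu0 - mu1)^2
ceiling(n)  # 143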

Chapter 7 28
Sample-Size Determination: two-sided alternatives

Suppose we wish to test H0: µ = µ0 vs. H1: µ = µ1, where the data are normally distributed with mean µ and known variance σ². The sample size n needed to conduct a two-sided test with significance level α and probability of detecting a significant difference = 1 − β is:
n = σ²(z_(1−β) + z_(1−α/2))² / (μ0 − μ1)²
Chapter 7 29
Power Determination: one-sided alternatives

Suppose we wish to test H0: µ = µ0 vs. H1: µ = µ1 (and µ1 > µ0 or µ1 < µ0), where the data are normally distributed with mean µ and known variance σ².
The power 1 − β of this one-sided test with significance level α and sample size n is:
Power = Φ(−z_(1−α) + |μ0 − μ1|√n/σ)

Chapter 7 30
Power Determination: one-sided alternatives

Example 7.2: birthweight

Chapter 7 31
(3) One-Sample Test for the Variance of a Normal Distribution

Based on the test statistic X², where
X² = (n − 1)s²/σ0²
which follows under H0 a chi-square distribution with n − 1 degrees of freedom.

Chapter 7 32
H0: σ² = σ0² versus H1: σ² ≠ σ0²
The test statistic under H0 follows a chi-square distribution with n − 1 df.

Chapter 7 33
H0: σ² = σ0² versus H1: σ² ≠ σ0²
In general: if p < 0.05, then reject H0.
Use the table of the chi-square distribution for critical values and the p-value.

Chapter 7 34
(4) One-Sample Test for the Parameter p of a Binomial
Distribution

Based on the test statistic z_corr, where
z_corr = (p̂ − p0 − 1/(2n)) / √(p0 q0 / n)
which follows under H0 a standard normal distribution when the normal approximation to the binomial distribution is valid, i.e., when n p0 q0 ≥ 5.

Chapter 7 35
Example: cancer

The continuity correction 1/(2n) is negligible here.
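The numbers of this example are not legible in this extraction; as a hedged sketch with hypothetical values (x = 40 events in n = 500, testing p0 = 0.05):
n <- 500; p0 <- 0.05; phat <- 40 / n                  # assumed example values
n * p0 * (1 - p0)                                     # 23.75 >= 5, normal approximation OK
z <- (phat - p0 - 1 / (2 * n)) / sqrt(p0 * (1 - p0) / n)
2 * pnorm(abs(z), lower.tail = FALSE)                 # two-sided p-value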

Chapter 7 36
(5) One-Sample Test for the Expected Value μ of a Poisson
Distribution
Based on the test statistic X²_corr, where
X²_corr = (|x − μ0| − 0.5)² / μ0
which follows under H0 a chi-square distribution with 1 degree of freedom when the expected number of events is large: μ0 ≥ 10.

Chapter 7 37
Here one-sided test

Chapter 7 38
Example, Occupational health

X²_corr = (|x − μ0| − 0.5)² / μ0 = (|21 − 18.1| − 0.5)² / 18.1 = 0.32
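The same computation and its p-value in R:
x <- 21; mu0 <- 18.1
X2 <- (abs(x - mu0) - 0.5)^2 / mu0        # 0.32
pchisq(X2, df = 1, lower.tail = FALSE)    # p approx. 0.57, not significant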

Chapter 7 39
An index frequently used to quantify risk in a study population
relative to the general population is the standardized mortality
ratio (SMR).

It is defined by 100% × O/E = 100% × the observed number of deaths in the study population divided by the expected number of deaths in the study population, under the assumption that the mortality rates for the study population are the same as those for the general population.

For nonfatal conditions, the SMR is sometimes known as the standardized morbidity ratio.

Chapter 7 40
Summary
In this chapter, we introduced
1. Specification of the null (H0) and alternative (H1) hypotheses;
2. type I error (α), type II error (β), and the power (1 − β) of a hypothesis test; the p-value of a hypothesis test and the distinction between one-sided and two-sided tests;
3. methods for estimating the appropriate sample size and power as determined by the prespecified null and alternative hypotheses and the type I and type II errors;
4. these concepts were applied to many one-sample hypothesis-testing cases. Each of the hypothesis tests was shown to be conducted in one of two ways (i.e., the critical-value approach and the p-value approach):

Chapter 7 41
Chapter 7 42
The End

Chapter 7 43
Power Determination: one-sided alternatives
Suppose we wish to test H0: µ = µ0 vs. H1: µ = µ1 (and µ1 > µ0 or µ1 < µ0), where the data are normally distributed with mean µ and known variance σ².
The power 1 − β of this one-sided test with significance level α and sample size n is:
Power = Φ(−z_(1−α) + (μ0 − μ1)√n/σ) if µ1 < µ0
Power = Φ(−z_(1−α) + (μ1 − μ0)√n/σ) if µ1 > µ0
(You do not need to memorize these formulas; they will be provided on the exam if needed.)
Chapter 7 New slide 30


Chapter 08
Hypothesis Testing: Two-Sample Inference

Chapter 8 1
Introduction

In a two-sample hypothesis-testing problem, the underlying parameters of two different populations, neither of whose values is assumed known, are compared.
For example, if we want to study the relationship between oral contraceptive (OC) use and blood pressure in women, two different experimental designs may be used to assess this relationship:

Chapter 8 2
Longitudinal Study Design:

1. Identify a group of women who are not currently OC users, and measure their blood pressure, which will
be called the baseline blood pressure.

2. Rescreen these women 1 year later to check who started OC use. This is the study population. Measure
the blood pressure of the study population at the follow-up visit.

3. Compare the baseline and follow-up blood pressure of the women in the study population to determine
the difference between blood pressure levels of women when they were using the pill at follow-up and
when they were not using the pill at baseline.

This represents a paired-sample (dependent) design because each woman is used as her own control.

Chapter 8 3
Cross-Sectional Study:

1. Identify both a group of OC users and a group of non-OC users and measure their blood pressure.
2. Compare the blood pressure levels between the OC users and nonusers.
This study represents an independent-sample (unpaired) design because two completely different groups of women are being compared. A cross-sectional study is also less expensive than a follow-up study.

Chapter 8 4
• Paired or dependent sample: when each data point in the first sample
is matched and is related to a unique data point in the second sample.
Paired samples may represent two sets of measurements on the
same people or on different people who are chosen on an
individual basis using matching criteria, such as age and sex, to
be very similar to each other.

• Unpaired or independent samples: when the data points in one sample are unrelated to the data points in the second sample.
For the example under discussion, the paired-study design is probably more definitive, because most influencing factors present at the first screening will also be there at the second screening and will not influence the comparison of BP levels.

Chapter 8 5
(1) Two-Sample Test for the Mean Difference of Two
Distributions X and Y.
D = X – Y with D following the normal distribution with
unknown variance.

Based on the test statistic t (the paired t test), where
t = (d̄ − 0)/(s_d/√n)
which follows under H0 a t distribution with n − 1 degrees of freedom.
n = number of matched pairs

Chapter 8 6
Example 1: paired t-test

Assume that the differences in SBP levels di are normally distributed and we
have:

Chapter 8 7
Example 1: Hypertension (dependent samples)
Test: H0: Δ = Δ0 = 0 vs. H1: Δ ≠ 0
or: is the SBP significantly different after one year of OC use?
Test statistic: t = (d̄ − 0)/(s_d/√n)
with t ~ t10−1 distribution under H0
Compute: t = (4.80 − 0)/(4.566/√10) = 3.32
Conclude: t = 3.32 > 2.262 = t9, 1−0.025
so reject H0: the SBP is significantly different after one year of OC use
p-value: p = 2Pr(t9 > 3.32) = 2 × (probability between 0.0005 and 0.005) = probability between 0.001 and 0.01
so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level or 5% significance level.
Chapter 8 8
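The same test in R from the summary statistics (with raw paired data one would call t.test(x, y, paired = TRUE)):
t_stat <- (4.80 - 0) / (4.566 / sqrt(10))          # 3.32
2 * pt(abs(t_stat), df = 9, lower.tail = FALSE)    # two-sided p, approx. 0.009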
Chapter 8 9
Interval Estimation for the Comparison of Means
from Two Paired Samples

Chapter 8 10
(2) Two-Sample Test for the Difference of the Means of Two
Distributions X1 and X2.
X1 and X2 are independent and both follow the normal distribution with unknown but equal variances.
Based on the test statistic t (the unpaired t test with equal variances), where
t = (x̄1 − x̄2 − 0) / (s √(1/n1 + 1/n2))
with s the pooled standard deviation, which follows under H0 a t distribution with n1 + n2 − 2 degrees of freedom.

Chapter 8 11
Example 2: unpaired t-test (equal variances)

Assume that SBP levels are normally distributed in both groups and that the underlying variances are the same*. We have:
→ s = 17.527 (the pooled standard deviation)
* We will have to formally test the equality of the variances.

Chapter 8 12
Example 2: Hypertension (independent samples and equal variances)
Test: H0: μ1 = μ2 (or μ1 − μ2 = μ0 = 0) vs. H1: μ1 ≠ μ2
or: is the SBP significantly different between the groups of OC users and non-users?
Test statistic: t = (x̄1 − x̄2 − 0) / (s √(1/n1 + 1/n2))
with t ~ t8+21−2 distribution under H0
Compute: t = (132.86 − 127.44 − 0) / (17.527 √(1/8 + 1/21)) = 0.74
Conclude: t = 0.74 < 2.052 = t27, 1−0.025
so don't reject H0: the SBP is not significantly different between the 2 groups
p-value: p = 2Pr(t27 > 0.74) = 2 × (probability between 0.20 and 0.25) = probability between 0.4 and 0.5
so p > 0.05, so indeed don't reject the null hypothesis at the 95% confidence level or 5% significance level.
Chapter 8 13
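The same pooled t test in R from the summary statistics (with raw data: t.test(x1, x2, var.equal = TRUE)):
t_stat <- (132.86 - 127.44) / (17.527 * sqrt(1/8 + 1/21))   # 0.74
2 * pt(abs(t_stat), df = 8 + 21 - 2, lower.tail = FALSE)    # two-sided p, approx. 0.46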
Interval Estimation for the Comparison of Means
from Two Independent Samples (Equal Variances
Case)

Chapter 8 14
(3) F Test for the Equality of Two Variances of Two Distributions X1 and X2.
X1 and X2 are independent and both follow the normal distribution with unknown variances.
Based on the test statistic f, where
f = s1²/s2²
which follows under H0 an F distribution with n1 − 1 and n2 − 1 degrees of freedom.

Chapter 8 15
Example 3: F test for equality of two variances

We assumed in Example 2 (unpaired t-test) that the underlying variances were the same. Let us formally test the equality of the variances.

Chapter 8 16
Example 3: Hypertension (independent samples):
F test on variances
Test: H0: σ1² = σ2² vs. H1: σ1² ≠ σ2²
or: are the variances significantly different between both groups?
Test statistic: f = s2²/s1²
with F ~ F21−1; 8−1 distribution under H0
Compute: f = 18.23²/15.34² = 1.41
Conclude: F20; 7; 0.025 = 0.33 < 1.41 < 4.42 = F20; 7; 0.975
so don't reject H0: the variances are not significantly different and so can be considered equal.
p-value: p = 2Pr(F20; 7 > 1.41) = 2 × (probability more than 0.10) = probability bigger than 0.20
so p > 0.05, so indeed don't reject the null hypothesis at the 95% confidence level or 5% significance level.
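In R (with raw data, var.test(x2, x1) does this directly; exact quantiles differ slightly from the F24;7 table approximation used above):
f <- 18.23^2 / 15.34^2                            # 1.41
qf(c(0.025, 0.975), df1 = 20, df2 = 7)            # exact critical values
2 * pf(f, df1 = 20, df2 = 7, lower.tail = FALSE)  # two-sided p (well above 0.05)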
Chapter 8 17
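In R the same computation can be sketched from the two sample standard deviations (with raw data, var.test() performs this F test directly):

# F test for equality of two variances, from summary statistics
f <- 18.23^2 / 15.34^2                        # 1.41
c(qf(0.025, 20, 7), qf(0.975, 20, 7))         # critical values (~0.33 and ~4.4)
2 * pf(f, 20, 7, lower.tail = FALSE)          # two-sided p-value (> 0.20)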
Chapter 8 18
Use of F Table for looking up critical values (percentiles)

In our example: F20; 7; 0.025 = 1/F7; 20; 0.975 = 1/3.01 = 0.33

F20; 7; 0.975 ≈ F24; 7; 0.975 = 4.42

Chapter 8 19
If F ≥ 1, then p = 2 × Pr(Fn1-1,n2-1 > F).

If F < 1, then p = 2 × Pr(Fn1-1,n2-1 < F).

In our example: p-value: p = 2Pr(F20; 7 > 1.41)

≈ 2Pr(F24; 7 > 1.41)
= 2 × (probability more than (1 − 0.90))
= probability bigger than 0.20
Chapter 8 22
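The reciprocal relationship used above can be checked in R:

qf(0.025, 20, 7)        # lower critical value
1 / qf(0.975, 7, 20)    # the same value (~0.33) by the symmetry of the F distribution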
(4) Two-Sample Test for the Difference of the Means of Two
Distributions X1 and X2.
X1 and X2 are independent and both following the normal
distribution with unknown but unequal variances.

Based on the test statistic t where


The Unpaired t Test
with unequal variances
𝑥1 − 𝑥2 − 0
𝑡=
𝑠12 𝑠22
𝑛1 + 𝑛2

which follows under H0 a t distribution with d′ degrees of freedom (the Satterthwaite approximation).

Chapter 8 24
Example 4: unpaired t-test (unequal variances)

Chapter 8 25
Example 4: unpaired t-test (unequal variances)
Unequal variances because:
H0: σ1² = σ2² vs. H1: σ1² ≠ σ2²

Since table 8 in the book is limited, we use R for computing the critical values
and the p-value:

So H0 is rejected and so there are unequal variances.

Chapter 8 26
Example 4: Cardiovascular Disease
(independent samples and unequal variances)
Test: H0: μ1 = μ2 (or μ1 − μ2 = μ0 = 0) vs. H1: μ1 ≠ μ2

or is the cholesterol level significantly different between the 2 groups?

𝑥1 −𝑥2 −0
Test statistic: 𝑡=
𝑠2 2
1 + 𝑠2
𝑛1 𝑛2

with t ~ td’ distribution under H0

Compute: $t = \dfrac{207.3 - 193.4 - 0}{\sqrt{\dfrac{35.6^2}{100} + \dfrac{17.3^2}{74}}} = 3.40$

Conclude: t = 3.40 > 1.980 = t120, 1-0.025


so reject H0 : the cholesterol level is significantly different between the 2
groups

p-value: p = 2Pr(t120 > 3.40) = 2 × (probability less than 0.0005)

= probability less than 0.001
so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level or 5% significance level.
Chapter 8 27
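A minimal R sketch of this test from the summary statistics above, with d′ computed via the Satterthwaite approximation (the slide uses the table value t120 instead of the exact d′):

# Unpaired t test with unequal variances (Welch/Satterthwaite)
x1 <- 207.3; s1 <- 35.6; n1 <- 100
x2 <- 193.4; s2 <- 17.3; n2 <- 74
se2 <- s1^2/n1 + s2^2/n2
t_stat  <- (x1 - x2 - 0) / sqrt(se2)                                  # 3.40
d_prime <- se2^2 / ((s1^2/n1)^2/(n1 - 1) + (s2^2/n2)^2/(n2 - 1))      # ~151
2 * pt(abs(t_stat), df = floor(d_prime), lower.tail = FALSE)          # < 0.001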
Interval Estimation for the Comparison of Means
from Two Independent Samples (Unequal Variances
Case)

Chapter 8 28
Case Study: Effects of Lead Exposure on Neurologic
and Psychological Function in Children

Chapter 8 29
Case Study: Effects of Lead Exposure on Number of
Finger-Wrist Taps in Children
Equal variances because:
H0: 𝜎12 = 𝜎22 vs. H1: 𝜎12  𝜎22

So H0 is not rejected and so there are equal variances.

Chapter 8 30
Case Study: Effects of Lead Exposure on Number of
Finger-Wrist Taps in Children
Test: H0: 𝜇1 = 𝜇2 vs. H1: 𝜇1  𝜇2

So H0 is rejected and so the number of finger-wrist taps is significantly different


between the 2 groups with different lead exposure.

Chapter 8 31
Case Study: Effects of Lead Exposure on full-scale IQ
Scores in Children
Equal variances because:
H0: 𝜎12 = 𝜎22 vs. H1: 𝜎12  𝜎22

So H0 is not rejected and so there are equal variances.

Chapter 8 32
Case Study: Effects of Lead Exposure on full-scale IQ
Scores in Children
Test: H0: 𝜇1 = 𝜇2 vs. H1: 𝜇1  𝜇2

So H0 is not rejected (borderline!) and so the full-scale IQ scores is not


significantly different between the 2 groups with different lead exposure.

Chapter 8 33
Treatment of Outliers
Outliers can have an important impact on the conclusions of a
study.
It is important to identify outliers definitively and either exclude them
outright or at least perform alternative analyses with and
without the outliers present.
All the potentially outlying values are far from the mean in
absolute value.
A useful way to quantify an extreme value is by the number of
standard deviations that a value is from the mean.
The statistic applied to the most extreme value is called
Extreme Studentized Deviate (or ESD statistic)
= $\max_{i=1,\dots,n} |x_i - \bar{x}| / s$

Chapter 8 34
Chapter 8 35
Example: is there a single outlier present in the finger-wrist tapping scores for the
control group?

With n = 64, x̄ = 54.4 and s = 12.1 we have for the smallest and largest values:
$\dfrac{|13 - 54.4|}{12.1} = 3.42$ and $\dfrac{|84 - 54.4|}{12.1} = 2.45$
Table 9 gives the following critical values: 𝐸𝑆𝐷60;0.95 = 3.20 and 𝐸𝑆𝐷70;0.95 = 3.26

So 3.42 > 𝐸𝑆𝐷64;0.95 and so we infer that the finger-wrist tapping score of 13 taps per
10 seconds is an outlier.
Chapter 8 36
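A minimal R sketch of the ESD statistic (here x stands for the raw vector of 64 tapping scores, which is not listed on the slide):

# Extreme Studentized Deviate applied to the most extreme value
esd <- function(x) max(abs(x - mean(x))) / sd(x)
# For this example: abs(13 - 54.4)/12.1 = 3.42 > ESD_64;0.95, so 13 is declared an outlier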
Chapter 8 37
Estimation of Sample Size for Comparing Two Means

Chapter 8 38
Estimation of Power for Comparing Two Means

Chapter 8 39
Summary
In this chapter, we discussed
Methods of hypothesis testing for comparing the means and variances of
two samples that are assumed to be normally distributed.
Paired t test and F test: In a two-sample problem, if the two samples are
paired then the paired t test is appropriate. If the samples are independent,
then the F test for the equality of two variances is used to decide whether
the variances are significantly different.
If the variances are not significantly different, then the two-sample t test
with equal variances is used. If the variances are significantly different, then
the two sample t test with unequal variances is used.
Methods for detection of outliers and presenting the appropriate sample
size and power formulas for planning investigations.

Chapter 8 40
Chapter 8 41
The End

Chapter 8 42
Chapter 09
Nonparametric Methods

Chapter 9 1
Introduction

Methods of estimation and hypothesis testing are usually called


parametric statistical methods because the parametric form of the
distribution is assumed to be known.

If assumptions about the shape of the distribution are not made


and/or if the central-limit theorem also seems inapplicable because
of small sample size, then nonparametric statistical methods,
which make fewer assumptions about the distributional shape,
must be used.

In this chapter we will discuss the following nonparametric tests:


the Sign test, the Wilcoxon Signed-Rank test and the Wilcoxon
Rank-Sum test.

Chapter 9 2
Types of data
An assumption characteristic of cardinal data or quantitative data is that it is
on a scale where it is meaningful to measure the distance between possible
data values and to use means and standard deviations.
examples: body weight, blood pressure, temperature, viral load, …

Ordinal data can be ordered but do not have specific numeric values. Thus,
common arithmetic (means, standard deviations, … ) cannot be performed on
ordinal data in a meaningful way.
examples: - 'sick' vs. 'healthy' when measuring health
- 'completely agree', 'mostly agree', 'mostly disagree',
'completely disagree' when measuring opinion

Data are on a nominal scale if different data values can be classified into
categories, but the categories have no specific ordering. It has even less
structure than an ordinal scale concerning relationships between data values.
examples: hair color, gender, blood type, …

Ordinal and nominal data are typically analyzed with nonparametric methods.

Chapter 9 3
(A) The Sign Test

• It is a test for ordinal data that can have only three outcomes: greater
than, less than, or equal to. So it depends only on the sign of the difference.

• The data are not normally distributed, so a nonparametric test is needed.

• tests the difference in the numbers of “greater than” versus “less than”.

• can also be used for testing the median.

• is a special case of the one-sample binomial test (chapter 7) for


H0: p = 0.5 vs. H1: p ≠ 0.5.

Chapter 9 4
The Sign Test: n ≥ 20: normal approximation of the
binomial distribution
H0: number of “greater than” = number of “less than”
Via the critical values:

Or via the p-value:

Chapter 9 5
The Sign Test: n ≥ 20: normal approximation of the
binomial distribution

Chapter 9 6
Example (1) : The Sign Test for n ≥ 20

Chapter 9 7
The Sign Test: n < 20: exact binomial probabilities rather than
normal approximation: “Exact Method”

Chapter 9 8
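Because the sign test is a special case of the one-sample binomial test, it can be sketched in R as follows (C and n are hypothetical counts):

C <- 15; n <- 20                                 # C = number of "greater than" outcomes
binom.test(C, n, p = 0.5)                        # exact binomial version
z <- (abs(C - n/2) - 0.5) / (sqrt(n)/2)          # normal approximation with continuity correction
2 * pnorm(z, lower.tail = FALSE)                 # approximate two-sided p-value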
Example (2): The Sign Test for n < 20:

Chapter 9 9
(B) The Wilcoxon Signed-Rank Test

• It is a nonparametric analog to the t test for two dependent samples.

• It is a test for ordinal data that can have more than 2 outcomes, such as a rating
scale. So it depends on the sign and the magnitude of the differences.

• It is based on the ranks rather than on the actual values of the observations.

• For data that are not normally distributed, or when the CLT cannot be applied
(small sample size), a nonparametric test is needed. In fact the Wilcoxon
Signed-Rank test is the nonparametric version of the paired t test.

Chapter 9 10
(B) The Wilcoxon Signed-Rank Test
If we want to test H0: D = 0 vs. H1: D ≠ 0 where D = difference in ordinal
scores, then do the following steps:
1. leave out the “ties” (zero differences) and let n = the number of nonzero differences
2. compute ranks for each observation:

3. compute the rank sum R1 of the positive differences

Chapter 9 11
(B) The Wilcoxon Signed-Rank Test
In case n > 15:

4. compute the test statistic

$T = \dfrac{\left| R_1 - \dfrac{n(n+1)}{4} \right| - \dfrac{1}{2}}{\sqrt{\dfrac{\sum_{j=1}^{n} r_j^2}{4}}}$

which follows under H0 a standard normal distribution,
where $r_j$ = rank of the absolute value of the j-th observation.

5. Reject or accept H0 via critical value or p-value approach

In case n < 16:

4. Reject or accept H0 via small-sample tables (Rosner: table 10) giving critical values for R1.

Chapter 9 12
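A minimal R sketch (the paired scores are hypothetical; wilcox.test uses the exact distribution for small samples without ties and the normal approximation otherwise):

x <- c(42, 44, 48, 50, 51, 53, 55, 58)   # hypothetical scores, member 1 of each pair
y <- c(40, 45, 45, 46, 56, 47, 48, 66)   # hypothetical scores, member 2 of each pair
wilcox.test(x, y, paired = TRUE, correct = TRUE)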
Rationale for The Wilcoxon Signed-Rank Test when n> 15

Test: H0: D = 0 vs. H1: D ≠ 0


where D = difference in ordinal scores.

If H0 is true, then $E(R_1) = \dfrac{n(n+1)}{4}$ and $Var(R_1) = \dfrac{\sum_{j=1}^{n} r_j^2}{4}$
where $r_j$ = rank of the absolute value of the j-th observation.

When n is “large enough” (Rosner: n > 15), then the normal
approximation can be applied to arrive at the test statistic T where

$T = \dfrac{\left| R_1 - \dfrac{n(n+1)}{4} \right| - \dfrac{1}{2}}{\sqrt{\dfrac{\sum_{j=1}^{n} r_j^2}{4}}}$
which follows under H0 a standard normal distribution.

Chapter 8 13
Example (3): Dermatology (Wilcoxon Signed-Rank test)

Chapter 8 14
Example (3): Dermatology (Wilcoxon Signed-Rank test)

Chapter 8 15
Example (3): Dermatology (Wilcoxon Signed-Rank test)

so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level: there is a significant difference between the ointments.

Chapter 8 16
Example (4): Dermatology (Wilcoxon Signed-Rank test)
Suppose n = 9 and R1 = 43; then use Table 10. The results are statistically
significant at a particular α level only if R1 ≤ the lower critical value or
if R1 ≥ the upper critical value for that α level:

Since 43 > 42, the results are, also in this case, statistically significant.
Chapter 8 17
(C) Wilcoxon Rank-Sum Test
• It is a nonparametric analog to the t test for two independent samples.

• It is a test for ordinal data that can have more than 2 outcomes, such as a
rating scale. So it depends on the sign and the magnitude of the differences.

• It is based on the ranks rather than on the actual values of the


observations.

• For data that are not normally distributed, or when the CLT cannot be
applied (small sample size), a nonparametric test is needed. In fact the
Wilcoxon Rank-Sum test is the nonparametric version of the unpaired t
test.

• The Wilcoxon Rank-Sum test is sometimes referred to in the literature as the


Mann-Whitney U test.

Chapter 9 18
(C) The Wilcoxon Rank-Sum Test
If we want to test H0: μ1 = μ2 vs. H1: μ1 ≠ μ2 with n1 ≤ n2,
then do the following steps (so take the smallest sample as n1):

1. Combine the data from the two groups and compute ranks for each
observation:
• order the values from the lowest to highest
• assign ranks to the individual values
• if a group of observations has the same value, then compute the range of
ranks for the group, and assign the average rank for each observation in
the group

2. Compute the rank sum R1 in the first sample

Chapter 9 19
(C) The Wilcoxon Rank-Sum Test
In case n1 and n2 ≥ 10:

3. compute the test statistic

$T = \dfrac{\left| R_1 - \dfrac{n_1(n_1+n_2+1)}{2} \right| - \dfrac{1}{2}}{\sqrt{\dfrac{n_1 n_2}{12}(n_1+n_2+1)}}$

which follows under H0 a standard normal distribution.

4. Reject or accept H0 via critical value or p-value approach

In case n1 or n2 < 10 :

3. Reject or accept H0 via small-sample tables (Rosner: table 11) giving critical values for R1.

Chapter 9 20
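A minimal R sketch (hypothetical scores; base R's wilcox.test reports the Mann-Whitney U statistic as W):

g1 <- c(12, 14, 15, 17, 18, 21, 22, 24, 25, 27)   # hypothetical group 1, n1 = 10
g2 <- c(11, 13, 16, 19, 20, 23, 26, 28, 29, 30)   # hypothetical group 2, n2 = 10
wilcox.test(g1, g2, correct = TRUE)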
Rationale for The Wilcoxon Rank-Sum Test when n1 and n2 ≥ 10

Test: H0: μ1 = μ2 vs. H1: μ1 ≠ μ2 with n1 ≤ n2

If H0 is true, then $E(R_1) = \dfrac{n_1(n_1+n_2+1)}{2}$ and $Var(R_1) = \dfrac{n_1 n_2}{12}(n_1+n_2+1)$

Remark: The formula for Var(R1) can be adjusted in case of ties but in most
cases this is not necessary.

When n1 and n2 are “large enough” (Rosner: n1 and n2 ≥ 10) and the
studied variable has a continuous distribution then the Normal
approximation can be applied to come to the teststatistic T where

$T = \dfrac{\left| R_1 - \dfrac{n_1(n_1+n_2+1)}{2} \right| - \dfrac{1}{2}}{\sqrt{\dfrac{n_1 n_2}{12}(n_1+n_2+1)}}$

which follows under H0 a standard normal distribution.

Chapter 8 21
Example (5): Ophthalmology (Wilcoxon Rank-Sum test)

Chapter 8 22
Example (5): Ophthalmology (Wilcoxon Rank-Sum test)

Computation is done differently

Chapter 8 23
Example (6): Ophthalmology (Wilcoxon Rank-Sum test)
Suppose n1 = 8 and n2 = 15 and R1 = 73; then use Table 11. The results are
statistically significant at a particular α level only if R1 ≤ the lower critical value or
if R1 ≥ the upper critical value for that α level:

Since 65 < 73 < 127, the results are, in this case, not statistically significant.
Chapter 8 24
Summary
1.The main advantage of nonparametric methods is that the assumption of
normality can be relaxed when such assumptions are unreasonable.

2.One drawback of nonparametric procedures is that some power is lost


relative to using a parametric procedure (such as a t test) if the data truly
follow a normal distribution or if the central-limit theorem is applicable.

3.The sign test and the Wilcoxon signed-rank test are nonparametric analogs
to the paired t test. For the sign test it is only necessary to determine whether
one member of a matched pair has a higher or lower score than the other
member of the pair.

4.For the Wilcoxon signed-rank test the magnitude of the absolute value of
the difference score, as well as its sign, is used in performing the significance
test.
5.The Wilcoxon rank-sum test (or the Mann-Whitney U test) is an analog to
the two-sample t test for independent samples in which the actual values are
replaced by rank scores.

Chapter 9 25
The End

Chapter 9 26
Chapter 10
Hypothesis Testing: Categorical Data

Chapter 10 1
Introduction
In this chapter, we will discuss

Methods of hypothesis testing for comparing two or more binomial proportions

Methods for testing the goodness-of-fit of a previously specified probability


model to actual data

Relationships between categorical and nonparametric approaches to data


analysis

Chapter 10 2
(1) Two-Sample Test for
Binomial (uncorrelated) Proportions
Example:
A study looked at the effects of oral contraceptive (OC) use on heart disease in women
40 to 44 years of age.

p1 = probability of developing a myocardial infarction (MI) over a period of 3


years in women on OC
p2 = probability of developing a MI over a period of 3 years in women not on OC

Hypothesis: H0: p1 = p2 = p vs. H1: p1 ≠ p2 for some constant p.

There are 3 approaches for testing this hypothesis:


A) Normal-Theory method
B) Contingency-Table method: Chi-square test
C) Contingency-Table method: Fisher Exact test

Chapter 10 3
Two-Sample Test for Binomial Proportions:
A) the Normal-Theory method
Example: The researchers found that among 5000 current OC users at baseline, 13
women developed a myocardial infarction (MI) over a 3-year period, whereas
among 10,000 non-OC users, 7 developed an MI over a 3-year period:

$\hat{p}_1 = \dfrac{13}{5000} = 0.0026$ and $\hat{p}_2 = \dfrac{7}{10000} = 0.0007$
How to test H0: p1 = p2 versus H1: p1 ≠ p2 ?

Chapter 10 4
Two-Sample Test for Binomial Proportions:
A) the Normal-Theory method
Based on the test statistic z where

$z = \dfrac{|\hat{p}_1 - \hat{p}_2| - \left(\dfrac{1}{2n_1} + \dfrac{1}{2n_2}\right)}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

which follows under H0 a standard Normal distribution.

$\hat{p} = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$ = weighted average of sample proportions, and $\hat{q} = 1 - \hat{p}$

Use this test only if the normal approximation to the binomial distribution holds,
so if $n_1\hat{p}\hat{q} > 5$ and $n_2\hat{p}\hat{q} > 5$

Chapter 10 5
Example 1: Myocardial Infarction (Normal-Theory method)
Test: H0: 𝑝1 = 𝑝2 vs. H1: 𝑝1 ≠ 𝑝2

or is the proportion MI cases significantly different in both groups?

Test statistic: $z = \dfrac{|\hat{p}_1 - \hat{p}_2| - \left(\dfrac{1}{2n_1} + \dfrac{1}{2n_2}\right)}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$   Condition: $n_1\hat{p}\hat{q} > 5$ and $n_2\hat{p}\hat{q} > 5$ is ok

with z ~ N(0, 1) distribution under H0


Compute: $z = \dfrac{|0.0026 - 0.0007| - \left(\dfrac{1}{2 \cdot 5000} + \dfrac{1}{2 \cdot 10000}\right)}{\sqrt{0.0013 \cdot 0.9987 \cdot \left(\dfrac{1}{5000} + \dfrac{1}{10000}\right)}} = 2.77$

with $\hat{p} = \dfrac{5000 \cdot 0.0026 + 10000 \cdot 0.0007}{15000} = \dfrac{20}{15000} = 0.0013$

Conclude: z = 2.77 > 1.96 = z0.975

so reject H0 : the proportion of MI cases is significantly different in both groups

p-value: p = 2Pr(Z > 2.77) ≈ 0.006

so p < 0.05, so indeed reject the null hypothesis at the 95% confidence level or 5% significance level.

Chapter 10 6
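The same comparison in base R (prop.test applies the Yates continuity-corrected chi-square, whose value is the square of the z statistic above: 2.77² ≈ 7.67):

prop.test(x = c(13, 7), n = c(5000, 10000))   # X-squared ~ 7.67, p ~ 0.006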
Two-Sample Test for Binomial Proportions:
B) Contingency-Table Method: Chi-square test

A 2 × 2 contingency table is a table composed of two rows cross-


classified by two columns.

It is an appropriate way to display data that can be classified by two


different variables, each of which has only two possible outcomes. One
variable is arbitrarily assigned to the rows and the other to the
columns.

Each of the four cells represents the number of units with a specific
value for each of the two variables: these are the observed number of
units in the four cells O11, O12, O21, and O22, resp.

Chapter 10 7
Two-Sample Test for Binomial Proportions:
B) Contingency-Table Method : Chi-square test

• In the days before computers were readily available, people analyzed


contingency tables by hand, or by a calculator, using chi-square tests.
• This again tests H0: p1 = p2 vs. H1: p1 ≠ p2
• This test works by computing the expected values E11, E12, E21, and E22 if H0
would be true.
• It then combines the discrepancies between observed and expected values into
a chi-square statistic from which a P value is computed.

Chapter 10 8
Two-Sample Test for Binomial Proportions:
B) Contingency-Table Method : Chi-square test

exposed not exposed exposed not exposed


case O11 O12 R1 case E11 E12
control O21 O22 R2 control E21 E22
C1 C2 N

• E11 = C1 x R1/N
• E12 = C2 x R1/N
• E21 = C1 x R2/N
• E22 = C2 x R2/N

Chapter 10 9
Fisher’s exact test can be used when at least one of the four expected values is less than 5.
This procedure gives exact levels of significance for any 2 × 2 table: see later.

Chapter 10 10
Chapter 10 11
Example 1: Myocardial Infarction (Contingency-Table Method)

Chapter 10 12
- Critical value: χ²1; 0.95 = 3.84, so 7.67 > 3.84: reject H0 : the proportion of MI cases is
significantly different in both groups

- P-value: since χ²1; 0.99 = 6.63 < 7.67 < 7.88 = χ²1; 0.995, it follows that
1 − 0.995 < p < 1 − 0.99, or 0.005 < p < 0.01, and the results are highly significant.

Chapter 10 13
C) Contingency-Table Method : Fisher’s Exact Test
This test gives exact levels of significance for any 2×2 table but it is
only necessary for tables with small expected values.
Example:

Hypothesis H0: p1 = p2 = p vs. H1: p1 ≠ p2.

E11 = 7 × 25/60 = 2.92
E12 = 7 × 35/60 = 4.08

$n_1\hat{p}\hat{q} = 25 \times \dfrac{2}{25} \times \dfrac{23}{25} < 5$
$n_2\hat{p}\hat{q} = 35 \times \dfrac{5}{35} \times \dfrac{30}{35} < 5$

So the Normal-Theory method and the Contingency-Table method cannot be used.

→ Use Fisher’s exact test: the methodology is complex and beyond the scope of this course.

Chapter 10 14
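Although the methodology is beyond the scope of the course, the test itself is one line in R, using the observed 2 × 2 table of this example:

tab <- matrix(c(2, 5, 23, 30), nrow = 2, byrow = TRUE)   # cases: 2 vs. 5; controls: 23 vs. 30
fisher.test(tab)   # exact p-value, valid despite the small expected counts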
(2) Two-Sample Test for Binomial (correlated) Proportions:
McNemar’s Test
Example:

Chi-square test (or Fisher exact


test) cannot be used since the
samples are dependent!

→ Construct a different kind of 2x2 contingency table, with the
matched pair as the unit of analysis,
and perform McNemar’s test.
Chapter 10 15
Example 2: Cancer (McNemar’s Test)

here matched pair is the unit of


analysis.

A concordant pair is a matched pair in which the outcome is the same for each
member of the pair.
A discordant pair is a matched pair in which the outcomes differ for the members
of the pair.

There are 600 concordant pairs and 21 discordant pairs.


Focus on the discordant pairs: there are 5 type A discordant pairs and 16 type B
discordant pairs.

Chapter 10 16
Example 2: Cancer (McNemar’s Test)

Let p = the probability that a discordant pair is of type A. If both treatments are
equally effective (H0 is true), then about the same number of type A and type B
discordant pairs would be expected, and p should be 0.5.

H0: p = 1/2 versus H1: p ≠ 1/2
nD = number of discordant pairs
nA = number of discordant pairs of type A

There are 2 types of McNemar’s test: A) Normal-Theory Test


B) Exact Test
Chapter 10 17
A) 𝒏𝑫 ≥ 𝟐𝟎: McNemar’s Test: Normal-Theory Test

Chapter 10 18
Chapter 10 19
Example 2: Cancer: McNemar’s Test: Normal-Theory Test
Note that nD = 21 > 20, so normal
approximation to the binomial
distribution holds.

Chapter 10 20
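A minimal R sketch of the normal-theory version, using the discordant-pair counts of this example (5 type A and 16 type B pairs; with the full 2 × 2 pair table, mcnemar.test() gives the same statistic):

nA <- 5; nD <- 21
x2 <- (abs(nA - nD/2) - 0.5)^2 / (nD/4)       # continuity-corrected chi-square, ~4.76
pchisq(x2, df = 1, lower.tail = FALSE)        # p ~ 0.03, significant at the 5% level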
B) 𝒏𝑫 < 𝟐𝟎: McNemar’s Test: Exact Test

Example:

Chapter 10 21
Example 3: Hypertension: McNemar’s Test: Exact Test

here matched pair is the unit of


analysis.

Chapter 10 22
Example 3: Hypertension: McNemar’s Test: Exact Test

So we test whether both methods detect the same number of hypertensives:

H0: p = 1/2 versus H1: p ≠ 1/2
nD = number of discordant pairs = 8
nA = number of discordant pairs of type A = 7

Since nA > nD/2 we have:

$p = 2 \sum_{k=7}^{8} \binom{8}{k} \left(\dfrac{1}{2}\right)^8 = 2 \times (0.0313 + 0.0039) = 2 \times 0.0352 = 0.070$

so p > 0.05: no significant difference in the number of hypertensives.
Chapter 10 23
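The exact binomial tail above can be checked in R:

2 * sum(dbinom(7:8, size = 8, prob = 0.5))   # 0.070
binom.test(7, 8, p = 0.5)                    # same two-sided p-value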
(3) Estimation of Sample Size and Power
for Comparing Two Binomial Proportions

A) For uncorrelated proportions (unpaired)


B) For correlated proportions (paired)
C) In “realistic” clinical trial setting

Chapter 10 1
A) Sample size needed to compare two uncorrelated binomial proportions
(unpaired design) using a two-sided test with significance level α and
power 1-β, where one sample (n2) is k times as large as the other sample
(n1) (independent-sample case):

Chapter 10 2
A) Power achieved in comparing two uncorrelated binomial
proportions (unpaired design) using a two-sided test with significance
level α and samples of size n1 and n2 (independent-sample case)

Chapter 10 3
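Base R's power.prop.test() covers the equal-allocation (k = 1) case of these formulas; a sketch using the MI proportions from earlier as illustrative inputs:

# Sample size per group, alpha = 0.05, power = 0.80
power.prop.test(p1 = 0.0026, p2 = 0.0007, sig.level = 0.05, power = 0.80)
# Power achieved with n = 5000 per group
power.prop.test(n = 5000, p1 = 0.0026, p2 = 0.0007, sig.level = 0.05)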
B) Sample size needed to compare two correlated binomial proportions
(paired design) using a two-sided test with significance level α and power
1-β (for McNemar’s test):

Chapter 10 4
B) Power achieved in comparing two correlated binomial proportions
(paired design) using a two-sided test with significance level α (for
McNemar’s test):

Chapter 10 5
C) Sample size (and power) needed to compare two UNcorrelated binomial
proportions (UNpaired design)
in a “realistic” clinical trial setting

The dropout rate 𝝀𝟏 is defined as the proportion of participants in the


active-treatment group who fail to actually receive the active treatment.

The drop-in rate 𝝀𝟐 is defined as the proportion of participants in the


placebo group who actually receive the active treatment outside the
study protocol.

Chapter 10 6
Hypothesis H0: p1 = p2 versus H1: p1 ≠ p2 for the specific alternative
|p1 – p2| = Δ with a significance level α and a power 1-β in a randomized clinical trial
in which group 1 receives active treatment, group 2 receives placebo, and an equal
number of subjects are allocated to each group. We assume that p1 and p2 are the
rates of disease in treatment groups 1 and 2 under the assumption of perfect
compliance.

Chapter 10 7
The power formula in equation 10.14 also assumes perfect compliance. In
“realistic” clinical trial setting, replace p1 , p2 , Δ , 𝑝 and 𝑞 in equation 10.14
with p1*, p2*, Δ*, 𝑝* and 𝑞* as given in equation 10.17. The resulting power
is a compliance-adjusted power estimate.

Chapter 10 8
Example:

Chapter 10 9
Chapter 10 10
(Equations 10.17 and 10.13 applied to the example.)

Chapter 10 11
(4) R × C Contingency Tables
An R × C contingency table is a table with R rows and C columns. It displays the
relationship between two variables, where the variable in the rows has R
categories and the variable in the columns has C categories.
Example:

Chapter 10 12
Chapter 10 13
Example:


Chapter 10 14
(5) Chi-Square test for trend in binomial proportions

Suppose there are k groups and we want to test whether there is an


increasing (or decreasing) trend in the proportion of “successes” pi (the
proportions of units in the first row of the ith group) as i increases.

There exists a chi-square test for detecting a trend in the binomial
proportions in such a 2 × C contingency table.

In R software:

Chapter 10 15
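A sketch of the call (the counts are hypothetical; prop.trend.test is in base R's stats package):

events <- c(10, 15, 25, 30)      # "successes" in each of k = 4 ordered groups
totals <- c(100, 100, 100, 100)  # group sizes
prop.trend.test(events, totals, score = 1:4)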
Summary
In this chapter, we discussed
1. Two-Sample Test for Binomial (uncorrelated) Proportions
A. The Normal-Theory method
B. Contingency-Table Method: Chi-square test
C. Contingency-Table Method : Fisher’s Exact Test

2. Two-Sample Test for Binomial (correlated) Proportions:


McNemar’s Test
A. Normal-Theory Test
B. Exact Test
3. Estimation of Sample Size and Power for Comparing Two
Binomial Proportions
4. R × C Contingency Tables
5. Chi-Square test for trend in binomial proportions

Chapter 10 16
Chapter 12
Multisample Inference

Chapter 12 1
Example 12.1

Chapter 12 2
Example 12.1

• In a one-way analysis of variance, or a one-way ANOVA model, the


means of an arbitrary number of groups, each of which follows a normal
distribution with the same variance, can be compared.
• So we want to determine whether the variability in the data comes mostly
from variability within groups or can truly be attributed to variability
between groups.

Chapter 12 3
One-Way ANOVA—Fixed-Effects Model

Suppose there are k groups with ni observations in the ith group.


The jth observation in the ith group will be denoted by yij.
Let us assume the following model:

yij = µ + ai + eij
Where µ is a constant, ai is a constant specific to the ith group, and eij is an
error term, which is normally distributed with mean 0 and variance σ².
A typical observation from the ith group is normally distributed with mean
µ+ai and variance σ².

Chapter 12 4
One-Way ANOVA—Fixed-Effects Model

yij = µ + ai + eij
• µ represents the underlying mean of all groups taken together.
• ai represents the difference between the mean of the ith group and the
overall mean.
• eij represents random error about the mean µ+ai from the ith group for
an individual observation from the ith group.

Chapter 12 5
Why “Fixed”-Effects?

•In Example 12.1, we studied the effect of both active and passive smoking on the level of
pulmonary function. We were specifically interested in the difference in pulmonary
function between the PS and the NS groups.

•This is an example of the fixed-effects analysis-of-variance model because the subgroups


being compared have been fixed by the design of the study.

•So, in the fixed-effects case, the levels of the categorical variable have inherent meaning
and the primary goal is to compare mean levels of the outcome variable (FEF) among
different levels of the grouping variable.

•This will be different from a “Random”-Effects model, see further.

Chapter 12 6
Hypothesis Testing in One-Way ANOVA—Fixed-Effects Model

In general we have:

With the mean response variable for the ith group denoted by $\bar{y}_i$, and the
mean response variable over all groups by $\bar{y}$.

$(y_{ij} - \bar{y}_i)$ represents the deviation of an individual observation from the group
mean for that observation and is an indication of within-group variability.
$(\bar{y}_i - \bar{y})$ represents the deviation of a group mean from the overall mean and
is an indication of between-group variability.

Chapter 12 7
Hypothesis Testing in One-Way ANOVA—Fixed-Effects Model

Generally, if the between-group variability is large and the within-group


variability is small, then H0 (all group means are the same) is rejected and the
underlying group means are declared significantly different.
Chapter 12 8
Hypothesis Testing in One-Way ANOVA—Fixed-Effects Model

Conversely, if the between-group variability is small and the within-group


variability is large, then H0, the hypothesis that the underlying group means
are the same, is accepted.
Chapter 12 9
From:

We get:

Total Sum of Squares = Within Sum of Squares + Between Sum of Squares

Total SS = Within SS + Between SS

Chapter 12 10
Chapter 12 11
F Test for Overall Comparison of Group Means

With significance level a, test:


H0: αi = 0 for all i vs. H1: at least one αi ≠ 0

Consider the test statistic: F = Between MS/Within MS

with F ~ Fk-1, n-k distribution under H0

Compute: f = between ms/within ms

Conclude: f > Fk-1, n-k, 1-a then reject H0


f ≤ Fk-1, n-k, 1-a then accept H0

The exact p-value: p = Pr(Fk-1, n-k > f)

Chapter 12 12
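A minimal R sketch of this overall F test (the data are simulated purely for illustration):

set.seed(1)
group <- factor(rep(c("A", "B", "C"), each = 10))
y <- rnorm(30, mean = c(100, 105, 110)[group], sd = 5)
anova(lm(y ~ group))   # F = Between MS / Within MS with k-1 and n-k df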
Chapter 12 13
F Test for Overall Comparison of Group Means in Example 12.4

Test whether the mean FEF scores differ significantly among the six groups
in Table 12.1 with α = 5% :

Compute: f = 58.0 > 2.21 = F5, ∞, 0.95 ≈ F6-1, 1050-6, 1-0.05


so reject H0
The exact p-value: p = Pr(F5, 1044 > 58.0) < 0.001
so reject H0

Conclusion: the FEF score is not the same for all groups
with significance level 5%.
Chapter 12 14
Chapter 12 15
We approximate F5, 1044 by F5, ∞
F5, ∞, 0.95 = 2.21

Pr(F5, ∞ < 4.10) = 0.999


So p = Pr(F5, ∞ > 58.0) < 0.001
Chapter 12 16
Comparisons of Specific Groups in One-Way ANOVA

1. t Test for Comparison of Pairs of Groups


2. Linear Contrasts
3. Multiple Comparisons

Chapter 12 17
1) t Test for Comparison of Pairs of Groups

Chapter 12 18
Chapter 12 19
Example data table 12.1

Chapter 12 20
Chapter 12 21
→ a very interesting result, because it shows that the pulmonary function of
passive smokers is significantly worse than that of nonsmokers and is
essentially the same as that of noninhaling and light smokers (≤ 1/2 pack of
cigarettes per day).

Chapter 12 22
Comparisons of Specific Groups in One-Way ANOVA or
just a two-sample t-test?
A frequent error in performing the t test when comparing two groups in
one-way ANOVA is to use only the sample variances from these two groups
rather than from all k groups to estimate σ². If the sample variances from only
two groups are used, then different estimates of σ² are obtained for each pair
of groups considered, which is not reasonable because all the groups are
assumed to have the same underlying variance σ².

Furthermore, the estimate of σ² obtained by using all k groups will be more


accurate than that obtained from using any two groups because the estimate
of the variance will be based on more information. This is the principal
advantage of performing the t tests in the framework of a one-way ANOVA
rather than doing different two-sample t-tests.

However, if there is reason to believe that not all groups have the same
underlying variance (σ²), then the one-way ANOVA should not be performed,
and two-sample t tests based on pairs of groups should be used instead.
Chapter 12 23
2) Linear Contrasts

Chapter 12 24
Chapter 12 25
Chapter 12 26
Remark that this is the same t-distribution as in the previous comparison of
pairs of groups.

Chapter 12 27
Linear Contrasts, example 12.10

Test if the pulmonary function of nonsmokers is the same as that of inhaling


smokers, assuming that 10% of inhaling smokers are light smokers, 70% are
moderate smokers, and 20% are heavy smokers. Report by critical value
approach and p-value approach.

Chapter 12 28
Chapter 12 29
Critical value approach:
We approximate t1044 by t∞
t = 14.69 > t∞, 0.975 = 1.96

Chapter 12 30
p-value approach:

Pr(t∞ < 3.291) = 0.9995

So p = 2*Pr(t∞ > 14.69) < 2*0.0005
So p < 0.001

Chapter 12 31
3) Multiple Comparisons

• In many studies, comparisons of interest are specified before looking


at the actual data, in which case the t test procedure for comparing
pairs of groups or for testing linear contrasts is appropriate.

• In other instances, comparisons of interest are only specified after


looking at the data. In this case a large number of potential
comparisons are often possible. Specifically, if there are a large
number of groups and every pair of groups is compared using the
t test, which is called “multiple comparisons”, then some significant
differences are likely to be found just by chance.

Chapter 12 1
3) Multiple Comparisons

For example, for the 6 groups of smoking and non-smoking males,


15* pairs of groups can be studied, each with a t-test with a
false-positive error of 0.05:

Pr(at least one t-test with a false-positive result)

= 1 – Pr(no t-test with a false-positive result)
= 1 − 0.95¹⁵ = 1 − 0.46 = 0.54

So the probability that at least one of the t-tests falsely gives a
significant result is more than 50%!

This requires some correction…

* $\binom{6}{2} = \dfrac{6!}{2!\,4!} = 15$
Chapter 12 2
Multiple Comparisons—Bonferroni Approach

Several procedures, referred to as “corrections” for this multiple-
comparisons problem, ensure that the overall probability of having false-
positive results is maintained at some fixed significance level. One of the
simplest such procedures is the Bonferroni adjustment.

Bonferroni correction just means that you will use $\dfrac{\alpha}{c}$ as significance level
for performing c number of tests.

Remark that this is a very strict way of correcting.

Chapter 12 3
c = number of tests = $\binom{k}{2}$

Chapter 12 4
Instead of comparing the p-values with 0.05,
under the Bonferroni correction they are
compared with $\dfrac{0.05}{15} = 0.0033$.

In this example, this does not have an


effect…

Chapter 12 5
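In R, Bonferroni-adjusted pairwise comparisons can be sketched as follows (reusing the hypothetical y and group from the one-way ANOVA sketch earlier):

pairwise.t.test(y, group, p.adjust.method = "bonferroni", pool.sd = TRUE)
p.adjust(c(0.001, 0.02, 0.04), method = "bonferroni")   # or adjust raw p-values directly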
The results of the multiple-comparisons procedure are typically displayed
as shown. A line is drawn between the names or numbers of each pair of
means that is not significantly different. This plot allows us to visually
summarize the results of many comparisons of pairs of means in one
concise display.

Chapter 12 6
Multiple Comparisons—Scheffé Approach

If linear contrasts, which have not been planned in advance, are


suggested by looking at the data, then the Scheffé approach is a good
correction for this multiple testing problem.

Chapter 12 7
Chapter 12 8
Retake example 12.10 of Linear Contrasts

Test if the pulmonary function of nonsmokers is the same as that of inhaling


smokers, assuming that 10% of inhaling smokers are light smokers, 70% are
moderate smokers, and 20% are heavy smokers, using Scheffé’s multiple-
comparisons procedure.

Chapter 12 9
Retake example 12.10 of Linear Contrasts

Since t = 14.69 > 0, we calculate $c_2 = \sqrt{(k-1) \cdot F_{k-1,\,n-k,\,1-\alpha}} = \sqrt{(6-1) \cdot F_{5,\,1044,\,0.95}} \approx \sqrt{5 \cdot F_{5,\,\infty,\,0.95}} = \sqrt{5 \cdot 2.21} = 3.32$.

Because t = 14.69 > c2 = 3.32, H0 is rejected at the 5% Scheffé level and so


pulmonary function of nonsmokers is significantly different from that of inhaling smokers.
Chapter 12 10
Reformulating One-Way ANOVA with dummy variables

It is arbitrary which group is assigned to be the reference group; the choice of a


reference group is usually dictated by subject-matter considerations.
Chapter 12 11
The subjects in each category of C have a unique profile in terms of x1,…, xk-1.
To relate the categorical variable C to an outcome variable y, we can now use
the following multiple regression model:

y = a + b1x1 + b2x2 + …+ bk-1xk-1 + e

Chapter 12 12
y = a + b1x1 + b2x2 + …+ bk-1xk-1 + e

The average value of y for subjects in category 1 (the reference category) = a,


the average value of y for subjects in category 2 = a+b1.

So b1 represents the difference between the average value of y for subjects in


category 2 vs. the average value of y for subjects in the reference category.

In general: bj represents the difference between the average value of y for


subjects in category (j+1) vs. the reference category.

Remark: this is the way we will write so-called Regression models (see Applied
Biostatistics / Toegepaste Biostatistiek course.)

Chapter 12 13
Example Case Study: Effects of Lead Exposure on
Neurological and Psychological Function in Children
The effects of exposure to lead on the psychological and neurological well-
being of children were studied. Blood levels of lead were measured in a group
of children who lived near a lead smelter in El Paso, Texas, in the 1970s, and
this was related to the number of finger–wrist taps measured
in the dominant hand (MAXFWT = a measure of neurological function).

Specifically,
we will consider three lead-exposure groups according to the variable
LEAD_GRP:

• If LEAD_GRP = 1, then the child had normal blood-lead levels


(<40 μg/100 mL) in both 1972 and 1973 (control group).
• If LEAD_GRP = 2, then the child had elevated blood-lead levels
(≥40 μg/100 mL) in 1973 (the currently exposed group).
• If LEAD_GRP = 3, then the child had elevated blood-lead levels in 1972 and
normal blood-lead levels in 1973 (the previously exposed group).
Chapter 12 14
Chapter 12 15
Outputs of One-Way ANOVA Fixed-Effects

Chapter 12 16
Fixed-Effects Two-Way ANOVA

Chapter 12 17
• An interaction effect between two variables is defined as one in which the
effect of one variable depends on the level of the other variable.
• In general, if an interaction effect is present, then it becomes difficult to
interpret the separate (or main) effects of each variable because the effect of
one factor (e.g., dietary group) depends on the level of the other factor (e.g.,
sex).

Chapter 12 18
Two-Way ANOVA, example without interaction (simplicity)
Two “dummy” variables were set up to represent study group (x1, x2), where
x1 = 1 if a person is in the first (SV) group
= 0 otherwise
x2 = 1 if a person is in the second (LV) group
= 0 otherwise
and the normal group is the reference group.

A variable x3 is also included to represent sex, where


x3 = 1 if male
= 0 if female
This two-way ANOVA can then be written as the following multiple-regression model:

Chapter 12 19
Chapter 12 20
Chapter 12 21
Two-Way ANOVA, example without interaction,
interpretation output Table 12.15

1) Overall hypothesis
H0: β1 = β2 = β3 = 0 vs. H1: at least one of the βj ≠ 0
F = 105.85
p-value = Pr(F3,745 > 105.85) < 0.0001

Thus at least one of the effects (study group or sex) is


significant.

Chapter 12 22
The “type III SS” provides an estimate of the effects of specific risk factors
after controlling for the effects of all other variables in the model.

2) Hypothesis to test the effect of study group after controlling for


sex:
H0: β1 = β2 = 0, β3 ≠ 0 vs. H1: β1 and/or β2 ≠ 0, β3 ≠ 0
F = 132.24
p-value = Pr(F2,745 > 132.24) < 0.0001
Thus there are highly significant effects of dietary group
on SBP even after controlling for the effect of sex.
3) Hypothesis to test the effect of sex after controlling for study
group: same conclusion:
Thus there are highly significant effects of sex after
controlling for the effect of dietary group.
Chapter 12 23
4) Hypothesis to test which specific dietary group differs from one
another after controlling for the effect of sex:
- Thus mean SBP of people in group SV differs
significantly (p=0.0425) from that of people in group LV
after controlling for the effect of sex.
- Thus mean SBP of people in group SV differs
significantly (p=0.0001) from that of people in group NOR
after controlling for the effect of sex.
- Thus mean SBP of people in group LV differs
significantly (p=0.0001) from that of people in group NOR
after controlling for the effect of sex.
Chapter 12 24
5) Hypothesis to test which specific sex group differs from the
other after controlling for study group:
Thus there are highly significant (p=0.0001) effects of sex
after controlling for the effect of dietary group. (Of course
this is the same as the type III test in 3) because sex is binary)

Chapter 12 25
6) In particular, the coefficient b1 = –17.9 mm Hg is an estimate of the difference
in mean SBP between the SV and NOR groups after controlling for the effect of
sex.

Similarly, the coefficient b2 = –13.8 mm Hg is an estimate of the difference in


mean SBP between the LV and NOR groups after controlling for the effect of
sex.
Also, the estimated difference in mean SBP between the SV and LV groups is
given by [–17.9 – (–13.8)] = –4.1 mm Hg; thus the SVs on average have SBP 4.1
mm Hg lower than the LVs after controlling for the effect of sex.

Finally, the coefficient b3 = 8.4 mm Hg tells us males have mean SBP 8.4 mm Hg
higher than females, even after controlling for effect of study group.

→ So the estimated model is: ŷ = 119.76 − 17.87x1 − 13.79x2 + 8.43x3


Chapter 12 26
Two-way ANCOVA

The previous 2-way ANOVA model can then be extended to allow for the
effects of other covariates using the two-way ANCOVA. If weight is
denoted by x4 and age by x5, then we have the multiple-regression model
or two-way ANCOVA model:

Chapter 12 27
Chapter 12 28
Two-Way ANCOVA, example without interaction ,
interpretation output Table 12.16
1. the overall model is highly significant (F-value =103.16, p = .0001): some of the
variables are having a significant effect on SBP

2. the type III SS: each of the risk factors has a significant effect on SBP after
controlling for the effects of all other variables in the model (p = .0001).

3. Are there differences in mean blood pressure by dietary group after controlling
for the effects of age, sex, and weight?
• No significant difference in mean SBP between the SVs (group 1) and the LVs
(group 2) after controlling for the other variables (p = 0.7012)!
• Still highly significant differences between each of the vegetarian groups and
normals (p = .0001).
→ Thus there must have been differences in either age and/or weight between
the SV and LV groups that accounted for the significant blood-pressure
difference between these groups in the previous analysis (Table 12.15).

Chapter 12 29
4. Still there are highly significant (p=0.0001) effects of sex after controlling for the
effect of dietary group, weight and age.

5. the estimates of specific parameters: after controlling for age, sex, and weight,
the estimated differences in mean SBP:
• between the SV and NOR groups = β1 = –8.2 mm Hg
• between the LV and NOR groups = β2 = –9.0 mm Hg
• between the SV and LV groups = β1 – β2 = –8.23 – (–8.95) = 0.7 mm Hg
→ These differences are all much smaller than the estimated differences in
Table 12.15

The difference in mean SBP between males and females is also much smaller in
Table 12.16 after controlling for age and weight than in Table 12.15.

Also, the estimated effects of age and weight on mean SBP are 0.47 mm Hg per
year and 0.13 mm Hg per lb, respectively.

Thus it is important to control for the effects of possible explanatory variables in


performing ANOVA/regression.
→ So the estimated model is: ŷ = 82.75 − 8.23x1 − 8.95x2 + 5.50x3 + 0.47x4 + 0.13x5
Chapter 12 30
The Kruskal-Wallis Test
When we want to compare means among more than two samples, but
either the underlying distribution is far from being normal or we have
ordinal data, a nonparametric alternative to the one-way ANOVA must be
used. This is the Kruskal-Wallis test. (For 2 samples, this was called the
Wilcoxon rank-sum test.)

Chapter 12 31
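A minimal R sketch (hypothetical ordinal outcome in three groups):

score <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 1, 1, 2, 2)
grp   <- factor(rep(c("A", "B", "C"), times = c(5, 5, 4)))
kruskal.test(score ~ grp)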
Summary
• One-way ANOVA methods enable us to relate a normally distributed
outcome variable to the levels of a single categorical independent
variable.
• Two different ANOVA models based on a fixed- or random-effects
model were considered.
• In a fixed-effects model, the levels of the categorical variable are
determined in advance. A major objective of this design is to test the
hypothesis that the mean level of the dependent variable is different
for different groups defined by the categorical variable.
• More complex comparisons such as testing for specific relationships
involving more than two levels of the categorical variable can also be
accomplished using linear-contrast methods.
• Random-effects models will be further explored in Applied
Biostatistics. Also mixed-effects models, in which one or more factors
are fixed by design and one or more factors are random, will then be
discussed.

• In two-way ANOVA we jointly compare the mean levels of an outcome


variable according to the levels of two categorical variables.
• In one-way ANCOVA, we are interested in relating a continuous
outcome variable to a categorical variable but want to control for
other covariates.
• In two-way ANCOVA, we are interested primarily in relating a
continuous outcome variable simultaneously to two categorical
variables but want to control for other covariates.

Chapter 12 33
The End

Chapter 12 34
Outputs of Multiple-Regression
