INTRODUCTION TO EPIDEMIOLOGY FOR GLOBAL HEALTH

Confounding
Lecturer: Dr. Brandon Guthrie

In this lecture, I will discuss the concept of confounding.

As with the other forms of bias, we need to rule out confounding as an alternative explanation
for the observed association. It's important to consider whether the increase in disease risk we
see in the presence of a certain factor is instead due to other co-occurring factors or
exposures. We call this situation confounding. Before we can infer
a causal relationship, we first must consider the possibility of confounding and either dismiss it
or take it into account.

We can take confounding into account at either the design or analysis stages of a study.

So, what is confounding? Confounding occurs when there is a mixing of effects. If there is
confounding, the association that we see between an exposure and a disease is a distortion.
This distortion occurs because another factor or exposure that happens along with the one
we're interested in is also associated with the disease we're studying. So, what we're really
seeing is a mixture of the effects of two exposures or factors on the disease.

In order for confounding to occur, two things need to happen. First, there needs to be an
association between the exposure you're interested in and this extraneous or confounding
factor. And second, there needs to be an association between the extraneous factor and the
disease you're studying. So, let's use an example to illustrate this principle.

Down syndrome is a genetic disorder in which an individual has three copies of the 21st
chromosome instead of two copies. An investigator may be interested in whether birth order or
the number of children a woman has had is associated with the risk of a child being born with
Down syndrome.

From this graph, we can see a striking increase in the incidence of births with Down syndrome
with increasing birth order. We see a low incidence of Down syndrome among first births, and
we have a relatively high incidence of Down syndrome among women who have had six or
more previous births. But we may be concerned here that we have a confounded relationship
between birth order and the risk of Down syndrome.

Introduction to Epidemiology in Global Health online course


University of Washington 1
[Erickson, J. (1978). Down syndrome, paternal age, maternal age and birth order. Annals of
Human Genetics, 41(3), 289-298.]

We should think to ourselves, what other factors might be associated both with birth order and
with the likelihood that the infant would have Down syndrome? These are the conditions that
would be necessary in order to have confounding present.

So, let's review our criteria for confounding. First, we want to look at what our exposure of
interest is. In this case, the exposure of interest is birth order. Next, we want to think about
what the disease or outcome is that we're interested in. In this case, Down syndrome. We
observe a strong association between birth order and Down syndrome, our outcome.

But we might also be concerned about maternal age. We know that age is associated with birth
order. An older woman is more likely to have had more previous births and therefore we would
expect an association of higher birth order with higher maternal age. We also know from
previous evidence that a higher maternal age is associated with the incidence of Down
syndrome. So here we see that we have all the conditions necessary to have confounding occur.
Thus, age is a potential confounder of the association between birth order and Down
syndrome.

This is confirmed when we conduct a stratified analysis in which we define categories, or
strata, of maternal age and look at the incidence of Down syndrome births within each
stratum. We see relatively low incidence of Down syndrome in the younger age categories,
those below age 40, and considerably higher risk among women 40 years old or older at the
time of the birth of their child.

When we look within each stratum of maternal age, we see no association between birth order
and the risk of Down syndrome. Within each age category, we see roughly the same risk of
Down syndrome among women for whom this was their first birth as we do for women for
whom it was their sixth birth.

[Erickson, J. (1978). Down syndrome, paternal age, maternal age and birth order. Annals of
Human Genetics, 41(3), 289-298.]

Now let's return to our necessary conditions for confounding to occur. In order for a factor to
be a confounder, it must be associated with the disease of interest or with the recognition of
that disease. It can be either a cause or a correlate of a cause. The factor also must be
associated with the exposure of interest. And finally, the factor cannot be in the causal pathway
between exposure and disease.

What do we mean by causal pathway? If the exposure of interest has a causal effect on the
factor being considered as a potential confounder, and that factor in turn has a causal effect on
the outcome, then this factor is on a causal pathway linking the exposure and the outcome. We
therefore consider this to be an intermediate factor, or mediator, rather than a confounder.
Let's use an example here.

If we take smoking as our exposure of interest and we're interested in looking at infant
mortality, we may consider whether low birthweight is a confounder of the relationship
between smoking and infant mortality. Now in this relationship, we would be comfortable
saying that smoking causes low birthweight which in turn puts an infant at a higher risk of
death. So, in this example, low birthweight would be considered on the causal pathway
between smoking and infant mortality. And therefore, we would not consider low birthweight
to be a confounder of the relationship between smoking and infant mortality. We often need to
utilize non-epidemiologic or clinical information to determine whether a factor is on a causal
pathway. There's no statistical way to distinguish a confounder from a factor that is on a causal
pathway. And therefore, we need to use our reasoning and outside knowledge to assess
whether this pathway occurs.

Unfortunately, information may be inconclusive. It may not be possible to definitively
determine whether a factor is on the causal pathway or is, in fact, a confounder. In this case, it
may be necessary to consider the association between the factor of interest and the outcome
both with and without the potential confounder included in your analysis.

We'll now walk through an example of how we would go about assessing a confounding
relationship between an exposure and a disease. In this example, we'll ask, does drinking coffee
increase the risk of myocardial infarction? To do this, we'll use a case-control study of 150 cases
and 150 randomly selected controls.

From the study, we find that 90 of the cases routinely consumed one or more cups of coffee
daily in the year before diagnosis. Similarly, we find that 60 controls routinely consumed
coffee in the same period. From these data, we can construct a two-by-two table in which the
columns indicate disease status, cases or controls, and the rows indicate exposure status,
coffee drinking or no coffee drinking.

From these data, we can calculate an odds ratio of 2.25, indicating a relatively strong
association between coffee drinking and myocardial infarction.
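
The odds ratio here is the cross-product ratio of the two-by-two table. The following is a minimal sketch of that arithmetic, not the course's own code: the function name `odds_ratio` and the cell labels `a`, `b`, `c`, `d` are illustrative. The counts of non-coffee drinkers follow from the stated totals (150 - 90 = 60 cases and 150 - 60 = 90 controls).

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a two-by-two table.

    a = exposed cases,   b = exposed controls
    c = unexposed cases, d = unexposed controls
    """
    return (a * d) / (b * c)


# Coffee drinking and MI across all 150 cases and 150 controls.
# 90 cases and 60 controls drank coffee (stated in the lecture);
# the remaining 60 cases and 90 controls did not (derived from the totals).
crude_or = odds_ratio(a=90, b=60, c=60, d=90)
print(crude_or)  # 2.25
```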

Now, this initial evidence of an association between coffee drinking and myocardial infarction
should raise some eyebrows. We don't really have much of a biological mechanism to support
this association. And we might think there are other factors that offer non-causal explanations
for this observed association. One that might come to mind would be smoking. So, we should
ask ourselves, is smoking associated with MI risk? We can address this directly from the data.

We first stratify our study population by smoking status, and we will look to see whether
smoking is associated with the outcome. We will do this among the people who don't drink
coffee so that we don't mix up the effect of smoking with the effect of coffee drinking. We can
construct a two-by-two table among the non-coffee drinkers, where we have as columns, cases
and controls, and as rows, smoking status.

So, in the upper left of this table, we have all of the MI cases who are smokers among the
non-coffee drinkers, which is 20. In the upper right, we have all the controls who are smokers
and non-coffee drinkers, and that's 10. In the lower left cell, we have the MI cases who are
nonsmokers, which is 40. And in the lower right, we have the controls who are nonsmokers,
which is 80 (the 90 controls who don't drink coffee minus the 10 who smoke). Recall that the
non-coffee drinkers are the unexposed group for the coffee-MI relationship.

Based on this table, we can construct an odds ratio for the association between smoking and
MI. And this yields an odds ratio of 4 for the relationship between smoking and MI. So we see
that we have a strong association between smoking and the likelihood of somebody being an
MI case. And this is consistent with what we know from other evidence about the relationship
between smoking and myocardial infarction.
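
As a quick check of this step, the sketch below recomputes the smoking-MI odds ratio among non-coffee drinkers. The variable names are illustrative. Three of the four cell counts (20, 10, 40) are stated in the lecture; the count of nonsmoking controls is an assumption derived from the stated totals: 150 controls, 60 of whom drink coffee, leaves 90 non-coffee-drinking controls, of whom 10 smoke.

```python
# Non-coffee drinkers only. Rows: smoking status. Columns: cases, controls.
smoker_cases = 20       # stated in the lecture
smoker_controls = 10    # stated in the lecture
nonsmoker_cases = 40    # stated in the lecture

# Assumption: derived from the totals rather than stated directly.
# 150 controls - 60 coffee-drinking controls - 10 smoking controls = 80
nonsmoker_controls = 150 - 60 - smoker_controls

# Cross-product odds ratio for smoking and MI.
smoking_mi_or = (smoker_cases * nonsmoker_controls) / (smoker_controls * nonsmoker_cases)
print(smoking_mi_or)  # 4.0
```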

Now we want to determine whether there is an association between the factor of interest, in
this case smoking, and our exposure of interest, coffee drinking. In other words, is there a
relationship between smoking, our potential confounder, and coffee drinking, our exposure?

To do this, we look among the controls, those who did not have an MI. Using a two-by-two
table, we count how many smokers and how many nonsmokers were coffee drinkers, and how
many smokers and how many nonsmokers were non-coffee drinkers. This results in an odds
ratio of 16. So, we see that we have a very strong association between smoking status and
whether or not a person was a coffee drinker.
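
The lecture gives the odds ratio of 16 but not all four cell counts behind it. The sketch below uses counts that are an assumption, back-calculated so that they agree with the stated figures: 150 controls, 60 coffee-drinking controls, 10 smoking controls among the non-coffee drinkers, and an odds ratio of 16. Variable names are illustrative.

```python
# Controls only. Rows: smoking status. Columns: coffee drinking.
smokers_no_coffee = 10      # stated in the lecture
nonsmokers_no_coffee = 80   # derived: 90 non-coffee-drinking controls minus 10 smokers

# Assumption: the split of the 60 coffee-drinking controls is not stated.
# 40 smokers and 20 nonsmokers is the split that reproduces OR = 16.
smokers_coffee = 40
nonsmokers_coffee = 20

# Cross-product odds ratio for smoking and coffee drinking.
smoking_coffee_or = (smokers_coffee * nonsmokers_no_coffee) / (
    smokers_no_coffee * nonsmokers_coffee
)
print(smoking_coffee_or)  # 16.0
```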

Finally, is the factor or potential confounder in the causal pathway? When we think about
coffee drinking and myocardial infarction risk, we do not think that coffee drinking causes
smoking. Most people would conclude that is unlikely. It's more likely that smoking is either a
cause of coffee drinking, or that smoking and coffee drinking share some common cause, and
that smoking is independently associated with MI risk. In this case, we don't believe that
smoking is on the causal pathway between coffee drinking and MI risk. Therefore, we would
consider smoking an important confounder to adjust for or control for in our analysis of the
relationship between coffee drinking and MI risk.

In any exposure-disease relationship, there are many potential confounders of that relationship.
Just because a factor is related to the disease and related to the exposure of interest does not
necessarily mean that there will be meaningful confounding of the exposure-disease
relationship. Meaningful confounding generally only occurs when there are relatively strong
associations between the exposure and the potential confounder and between the disease and
the potential confounder. Weak associations rarely lead to a degree of confounding of the
exposure-disease relationship that would be considered meaningful.
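
One way to see why weak associations rarely matter is the standard external-adjustment expression for a single binary confounder on the risk ratio scale. This formula is not from the lecture and is offered only as an illustrative sketch: it assumes one binary confounder, no effect modification, and risk ratios rather than the odds ratios used in the coffee example. The function name, parameter names, and all input values are hypothetical.

```python
def confounding_bias_factor(rr_cd, p_exposed, p_unexposed):
    """Ratio of the crude risk ratio to the confounder-adjusted risk ratio.

    rr_cd       = risk ratio linking the confounder to the disease
    p_exposed   = prevalence of the confounder among the exposed
    p_unexposed = prevalence of the confounder among the unexposed

    A value of 1.0 means the confounder does not distort the estimate.
    """
    return (1 + p_exposed * (rr_cd - 1)) / (1 + p_unexposed * (rr_cd - 1))


# Hypothetical weak associations: the confounder barely raises disease risk
# and is only slightly more common among the exposed.
weak = confounding_bias_factor(rr_cd=1.2, p_exposed=0.35, p_unexposed=0.30)

# Hypothetical strong associations on both sides.
strong = confounding_bias_factor(rr_cd=4.0, p_exposed=0.80, p_unexposed=0.20)

print(round(weak, 3), round(strong, 3))
```

With these made-up inputs, the weak case distorts the estimate by about 1%, while the strong case roughly doubles it.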

We don't judge the presence or absence of confounding based on statistical criteria. Rather, we
assess confounding based on our judgment of its effect on the measure of excess risk that
we're interested in.

One approach is to compare the crude and the stratum-specific measures of effect. If they
differ by an appreciable amount, we consider the association to be confounded. Oftentimes, an
appreciable amount is judged to be 10% or more, but that's just a general rule of thumb. If we
conclude that confounding occurs, then we would need to take measures to adjust or
accommodate for the confounding.
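
The 10% rule of thumb can be written down as a small helper. This is a sketch only: the lecture does not say which estimate to use as the denominator, so dividing by the adjusted (or stratum-specific) estimate is an assumption, and the function names and example values are hypothetical.

```python
def relative_change(crude, adjusted):
    """Relative difference between the crude and the adjusted estimate.

    Assumption: the difference is expressed relative to the adjusted value.
    """
    return abs(crude - adjusted) / adjusted


def suggests_confounding(crude, adjusted, threshold=0.10):
    """Apply the rule of thumb: flag a change of 10% or more."""
    return relative_change(crude, adjusted) >= threshold


# Hypothetical estimates, not taken from any study.
print(suggests_confounding(crude=2.0, adjusted=1.9))  # False (about a 5% change)
print(suggests_confounding(crude=2.0, adjusted=1.5))  # True (about a 33% change)
```

Note that the helper compares effect estimates only; it involves no p-value or significance test.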

Similarly, we might compare the crude and the adjusted effect measures and determine if
there's a difference between those. Again, a difference of 10% or more might be a good rule of
thumb to conclude that confounding is present. But again, remember, it's very important that
we do not judge confounding based on a statistical significance test. This is one of the most
commonly made errors in epidemiologic research and leads to a great deal of confusion.
Accurate assessment of confounding rests on two questions. First, are the basic confounding
criteria met? Second, does the potential confounder actually confound the association, meaning
does it have a meaningful effect on the measure of risk being studied? The question is not
whether the potential confounder has a significant p-value for its association with the outcome
of interest.

So again, let's illustrate this with an example. Returning to our example of coffee drinking and
its relationship with myocardial infarction risk, we see that in our crude analysis, comparing
cases and controls based on their coffee drinking status, we have a crude odds ratio of 2.25.

Now, when we conduct stratum-specific analyses in which we separate smokers from
nonsmokers, we can conduct two separate odds ratio calculations, first among smokers and
then among nonsmokers.

We see in both cases that we have an odds ratio of 1, indicating no association between coffee
drinking and MI status in either smokers or nonsmokers. These are called the stratum-specific
odds ratios. And we see that there's a considerable difference between the stratum-specific
odds ratios and the crude value, indicating that this association is confounded by smoking
status.
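
The comparison between crude and stratum-specific odds ratios can be sketched as follows. The lecture states the counts among non-coffee drinkers and the resulting odds ratios, but not every cell of the two stratum tables. The coffee-drinker counts below are an assumption, back-calculated to agree with the stated totals (90 cases and 60 controls drank coffee) and the stated odds ratios (16 among controls, 1 within each stratum). The function and variable names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio: a, b = exposed cases, controls; c, d = unexposed."""
    return (a * d) / (b * c)


# Each tuple: (coffee cases, coffee controls, no-coffee cases, no-coffee controls).
# No-coffee counts come from the lecture; coffee counts are back-calculated assumptions.
strata = {
    "smokers": (80, 40, 20, 10),
    "nonsmokers": (10, 20, 40, 80),
}

# Odds ratio for coffee and MI within each smoking stratum.
stratum_ors = {name: odds_ratio(*cells) for name, cells in strata.items()}

# Collapsing over smoking status recovers the crude two-by-two table.
crude_cells = [sum(column) for column in zip(*strata.values())]
crude_or = odds_ratio(*crude_cells)

print(stratum_ors)  # {'smokers': 1.0, 'nonsmokers': 1.0}
print(crude_cells)  # [90, 60, 60, 90]
print(crude_or)     # 2.25
```

Both strata give an odds ratio of 1, yet pooling them without regard to smoking gives 2.25, which is the distortion the lecture describes.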

To summarize this lecture, before we can infer a causal relationship, we first must consider the
possibility of confounding and either dismiss it or take it into account. Confounding is a
distortion of the exposure-disease association that occurs because another factor or exposure
that happens along with the one we're interested in is also associated with the disease we're
studying. The necessary conditions for confounding to occur are that the factor must be
associated with the disease of interest or with the recognition of that disease, must be
associated with the exposure of interest, and cannot be in the causal pathway between
exposure and disease.
