
Unit Two

Exposure Measurement Error and Its Effects

Destaye Shiferaw Alemu (MPH, MSc2)


Department of Epidemiology and Biostatistics
Institute of Public Health
College of Medicine and Health Sciences, Comprehensive Specialized Hospital
University of Gondar
October 2023

Objectives
At the end of this session, students will be able to:
• Define measurement error
• Identify sources of measurement error
• Describe a simple model for measurement error
• Discuss parameters to estimate effects of degree of measurement error
• Differentiate differential and non-differential measurement errors
• Examine effects of differential and non-differential measurement errors


Contents
Introduction: Measurement error Measures of measurement error

Exposure measurement error Effects of measurement error

Parameters to quantify measurement error 


Effects of differential and non-differential
measurement error
Sources of measurement error
Measures of misclassification
Differential exposure measurement error
Effects of differential and non-differential
Sources of differential error misclassifications

Measurement error model assumptions Effect of measurement error in the presence of


covariates 3
Introduction: Measurement Error
• Terminology used in measurement error varies between fields of study.

o Some authors use the terms validity and accuracy to refer to lack of bias only
o Here, validity, accuracy, and measurement error are used as general terms describing the accuracy of X (the measured exposure) as a measure of T (the true exposure), including both bias and precision.

• ‘True value’ can have various meanings.


o ‘True value’ is similar to what is termed a ‘construct’ or ‘latent variable’
• Measurement error:
o The systematic and random error in a subject's score that is not attributable to true changes in the construct being measured
o One of the major sources of bias in epidemiological studies

Exposure Measurement Error

• Leads to bias in the measure of association between exposure and outcome (most
important effect)
• Information bias
• Misclassification bias → spurious conclusions

• Difference between measured exposure and true exposure (for an individual)

• True exposure
o The agent of interest, underlying variable of interest
o Assumed that:
• A measure of the true exposure exists
• Measurements of the true exposure are known on the entire population

• In actual practice, these would be ascertained in a validity study conducted on a small sample of subjects, each measured with both the imperfect and the perfect measure of exposure
Parameters to Quantify Degree of Measurement Error in Exposure

• Computed by comparing the measured exposure to a perfect measure of the true exposure in a population

• The parameters that quantify the exposure measurement error can be used to estimate the effects of this degree of error on the:
   • Bias in the measure of association in the parent epidemiological study, and
   • Power and sample size

• Parent epidemiological study: the study that will use (or has used) the mis-measured exposure to estimate the measure of association for the exposure–disease relationship.

• The odds ratio from the parent epidemiological study is called the observable odds ratio: it is the odds ratio that will be obtained on average, and it differs from the true odds ratio because of exposure measurement error.
Appropriate measures for continuous exposure variables
Bias:
• Difference between the means of the measured and true exposure

Validity coefficient:
• Correlation between the measured and true exposure

Appropriate measures for categorical exposure variables
• Misclassification matrix for categorical variables
Sources of Measurement Error
• Error in measurement of exposure can be introduced during almost any phase of a
study.
• Possible causes include:

• Faulty design of the instrument
   o Lack of coverage of all sources of the exposure
   o Inclusion of exposures that do not contain the actual active agent
   o Time period assessed by the instrument is not the true etiological time period
   o Measure (questionnaire or laboratory measure) is not reflective of exposure in the target tissue
   o Phrasing of questions that leads to misunderstanding or bias

• Errors or omissions in the protocol for use of the instrument
   o Failure to specify the protocol in sufficient detail
   o Failure to specify a method to handle unanticipated situations consistently
   o Failure to include periodic standardization of the instrument throughout data collection

• Poor execution of the study protocol
   o Failure of data collectors or laboratory technicians to follow the protocol in the same manner for all subjects
   o Failure of subjects to read instructions in a self-administered questionnaire
   o Improper handling and/or analysis of biological specimens
   o Influence of the personality, sex, race, or age of the interviewer on the subject's responses

• Limitations due to subject characteristics
   o Memory limitations of subjects, including poor recall of exposures and the influence of recent exposures on memory of past exposures
   o Limitations of proxy respondents' knowledge and memory of the subject's exposures
   o Tendency of subjects to over-report socially desirable behaviors and under-report socially undesirable behaviors
   o Variability in biological characteristics over time, e.g. day-to-day and seasonal variation

• Errors during data capture and analysis
   o Data entry errors
   o Errors in conversion tables used to convert subject responses to units of active agent
   o Programming errors in creating variables for analysis

• A properly designed and analyzed reliability and measurement error study can assess the impact of all these sources of variation on scores

Differential Exposure Measurement Error

• Occurs when exposure measurement error differs according to outcome

• Particular concern in retrospective collection of exposure data

• Can also occur in prospectively collected data, through:
   o Biological effects or symptoms of the pre-diagnostic phase of the disease
   o Influence of a strong risk factor for the disease on the reporting of the exposure
Sources of Differential Error
• Recall bias, i.e. when cases report exposures differently from controls because of their knowledge of or feelings about the disease
• Influence of the data collector's knowledge of the subject's disease status on the exposure measure
• Different distributions of laboratory technicians or interviewers between cases and controls
• Differences in specimen handling or storage between cases and controls
• Biological effects of the disease or treatment on the exposure
• Biological effects or symptoms of the pre-diagnostic phase of the disease
• Influence of a strong risk factor for the disease (e.g. family history) on the reporting of the exposure
A Model of Measurement Error
• A simple model of measurement error in a population is:

   Xi = Ti + b + Ei

• Xi: observed measure for a given individual (i)
• Ti: true value for that individual
• b: systematic error (bias)
   • Would affect all members of the study population
• Ei: additional error in Xi for subject i
   • Referred to as subject error
   • Varies from subject to subject
   • Does not refer only to error due to subject characteristics; it may include all of the sources of error

• Xi differs from Ti for an individual as a consequence of these two types of measurement error
• Example. Suppose that a portable scale is used to weigh subjects for a study of
weight and hip fracture among a population of elderly women.

• Although current weight (X) will be measured, the true exposure of interest (T) is the subject's average weight over the previous 5 years. (The true exposure could be measured in theory by averaging multiple weighings over the 5-year period.)

• Measures of weight in the population of interest would yield observations for X that would differ from each subject's true weight T because of measurement error.
• In this example, suppose that the bias in X (in the population to be studied) is 1 kg.

• Sources of systematic bias (b) in X might be:
   • The scale is miscalibrated so that it reads on average 0.5 kg too heavy
   • Subjects currently weigh on average 0.5 kg more than their average weight over the previous 5 years

• Sources of subject error (E) might include:
   • Randomness in the mechanics of the scale beyond the scale's usual 0.5 kg overestimation
   • Difference between each individual's current weight and her average 5-year weight (beyond the average 0.5 kg increase)

• Other sources of error which could contribute to bias and subject error include:
   • Weight of the subject's clothes
   • Misreading of the scale by the interviewer
   • Random hour-to-hour and day-to-day fluctuations in 'current' weight
Calculate the observed weight and the measurement error for the series of subjects below. What conclusions can you draw from the given data?

Subject (i)               1     2     3     4 …
Observed weight (kg)      ?     ?     ?     ?
True weight (kg)          59    52    69    60
Measurement error (kg)    ?     ?     ?     ?
Systematic error (kg)     1     1     1     1
Subject error (kg)        1     −3    0     2
Solution: calculate the observed weight and the measurement error using Xi = Ti + b + Ei

X1 = 59 + 1 + 1 = 61
X2 = 52 + 1 + (−3) = 50
X3 = 69 + 1 + 0 = 70
X4 = 60 + 1 + 2 = 63

Subject (i)                   1     2     3     4 …
Observed weight, Xi (kg)      61    50    70    63
True weight, Ti (kg)          59    52    69    60
Measurement error, ME (kg)    2     −2    1     3
Systematic error, b (kg)      1     1     1     1
Subject error, Ei (kg)        1     −3    0     2

Measurement error = systematic error + subject error
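The worked table can be checked with a few lines of Python. This is a minimal sketch of the additive model Xi = Ti + b + Ei using the four subjects above; the function and variable names are illustrative only.

```python
# True weights, systematic error, and subject errors from the table (kg).
true_weight = [59, 52, 69, 60]
bias = 1
subject_error = [1, -3, 0, 2]


def observe(t, b, e):
    """Apply the additive error model X = T + b + E to one subject."""
    return t + b + e


observed = [observe(t, bias, e) for t, e in zip(true_weight, subject_error)]

# Total measurement error per subject is X - T, i.e. b + E.
measurement_error = [x - t for x, t in zip(observed, true_weight)]

print(observed)            # [61, 50, 70, 63]
print(measurement_error)   # [2, -2, 1, 3]

# In these four subjects the subject errors sum to zero,
# so the mean measurement error equals the bias b.
print(sum(measurement_error) / len(measurement_error))  # 1.0
```

Because the subject errors here average to zero, the mean of X exceeds the mean of T by exactly b = 1 kg.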
• X, T, and E:
o Are variables with distributions for the population of potential study
subjects
• E.g.:
• Distribution of E is distribution of subject measurement errors in the population of
interest

• X, T, and E have:
   • Expectations (population means over an infinite population): μX, μT, μE
   • Variances: σ²X, σ²T, σ²E
Assumptions of the Model of Measurement Error

• The population mean of the subject error (μE) is zero
   • The average measurement error in X in the population is therefore expressed as the constant b

• The correlation coefficient of T with E (ρTE) is zero
   → True values of exposure are not correlated with subject errors in the population
   - Subjects with high true values are assumed not to have systematically higher (or lower) errors than subjects with lower true values

• T and E are normally distributed

• Equations expressing the effect of measurement error on logistic regression additionally assume no:
   o Sampling error
   o Confounding
   o Error in ascertainment of disease
Measures of Measurement Error
• Two measures of measurement error are used to describe the validity of X
   • i.e. the relationship of X to T in the population of interest
   • Based on the measurement error model and its assumptions

1. Bias, or the average measurement error in the population
• Difference between the population mean of X and the population mean of T:

   b = μX − μT

• Positive b → X overestimates the amount of exposure on average
• Negative b → X underestimates the amount of exposure on average
2. Measure of precision of X (ρTX)

• A measure of variation in the measurement error in the population

• One measure of precision is the variance of the measurement error, σ²E
   • b does not contribute to this variance. Why?
   • Because it is a constant

• The measure of precision adopted is the correlation of T with X (ρTX)
   • Termed the validity coefficient of X
• The square of ρTX is 1 minus the ratio of the variance of E to the variance of X:

   ρ²TX = 1 − σ²E / σ²X

• ρ²TX is the proportion of the variance of X explained by T
• It ranges between 0 and 1
• The smaller the error variance, the greater ρTX
• A value of 1: X is a perfectly precise measure of T
• ρTX is assumed to be zero or greater. Why?

• For X to be considered to be a measure of T, X must be positively correlated with T
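The expression for the squared validity coefficient follows from the model X = T + b + E together with the assumption that T and E are uncorrelated. The short derivation below is a sketch under those assumptions; ρTX denotes the correlation of T with X.

```latex
\begin{aligned}
\sigma_X^2 &= \sigma_T^2 + \sigma_E^2
  && \text{(b is constant; } \operatorname{Cov}(T,E)=0\text{)} \\
\rho_{TX} &= \frac{\operatorname{Cov}(T,\,T+b+E)}{\sigma_T\,\sigma_X}
           = \frac{\sigma_T^2}{\sigma_T\,\sigma_X}
           = \frac{\sigma_T}{\sigma_X} \\
\rho_{TX}^2 &= \frac{\sigma_T^2}{\sigma_X^2}
             = \frac{\sigma_X^2-\sigma_E^2}{\sigma_X^2}
             = 1-\frac{\sigma_E^2}{\sigma_X^2}
\end{aligned}
```

Since σT/σX is a ratio of standard deviations, ρTX cannot be negative under this model, which is consistent with the requirement that X be positively correlated with T.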

• To further understand the separate concepts of bias and precision, consider a situation in which X has only a systematic bias, with Ei = 0 for all subjects, so that ρTX = 1 and σ²E = 0

• Suppose that the only source of error in a measurement of weight (X) is that the scale weighs each subject exactly 1 kg too heavy.

• Despite this systematic bias, the variable X could be used to order each person correctly in the population by his/her value of T.
   • X would be perfectly precise

• However, if Ei varied from person to person (around the mean μE = 0), the ordering would be lost.
   • The greater the variance of E relative to the variance of X, the less precise X is as a measure of T.
   • If in addition b = 0, the scale lacks precision (ρTX < 1) even though it is correct on average.
Class exercise
Suppose that a scale weighs subjects 1 kg too heavy on average and that its precision (validity coefficient, ρTX) is 0.80.

1. Calculate the bias.

2. Interpret the precision.

3. What three conclusions can you make?
Class exercise: solution
A scale weighs subjects 1 kg too heavy on average, with a precision (ρTX) of 0.80.

1. Calculate the bias
• Bias = 1 kg

2. Interpret the precision
• Precision is measured by the correlation of T with X (ρTX)
• ρ²TX = proportion of the variance of X explained by T = (0.8)² = 0.64
→ Only 64% of the variance in measured weight is explained by variation in true weight, with the remainder of the variance being due to error

3. What conclusions can you make?
• The constant 1 kg bias does not by itself disturb the ordering of individuals by true weight
• The scale lacks precision (ρ²TX = 0.64 < 1), so subject error blurs that ordering
• Bias and precision are different concepts!
• Measurement error is not an inherent property of an instrument, but rather a property of the instrument applied in a particular manner to a specific population.

• Therefore the error can vary:


• Between two instruments which measure the same exposure
• For a single instrument when applied differently or
• For a single instrument when applied to different population groups which vary
o E.g.: by level of education

• Importantly, measurement error could also differ between the population of cases and
the population of controls to be studied in an epidemiological study.

• In addition, the validity coefficient is dependent on the variance of the true exposure in the population

• Therefore, even if the error variance σ²E were the same for two populations, ρTX would differ if σ²T differs
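The dependence of the validity coefficient on the spread of true exposure can be illustrated numerically. The sketch below assumes the model X = T + b + E with T and E uncorrelated, so that ρTX = sqrt(σ²T / (σ²T + σ²E)); the variances are hypothetical values chosen only for illustration.

```python
import math


def validity_coefficient(var_true, var_error):
    """rho_TX = sqrt(var_T / (var_T + var_E)), assuming T and E are uncorrelated."""
    return math.sqrt(var_true / (var_true + var_error))


# Same instrument error variance applied to two hypothetical populations.
var_error = 9.0
rho_wide = validity_coefficient(16.0, var_error)    # wide spread of true exposure
rho_narrow = validity_coefficient(4.0, var_error)   # narrow spread of true exposure

print(round(rho_wide, 3))    # 0.8
print(round(rho_narrow, 3))  # 0.555
```

With identical error variance, the instrument looks considerably less valid in the population whose true exposure varies less.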
Effects of Measurement Error on Population exposure Mean & Variance
• In a study population, the mean and variance of the measured exposure differ from the true exposure mean and variance because of measurement error.
• The population mean of X differs from the true mean by b:

   μX = μT + b

• The population variance of X is (given that T and E are uncorrelated):

   σ²X = σ²T + σ²E = σ²T / ρ²TX
Class exercise

Given a validity coefficient of 0.8 for a weight measurement on a scale, what interpretation and conclusion could you make about the variance of the measured weight?
Class exercise: solution
Given a validity coefficient of 0.8, what interpretation and conclusion could you make about the variance of a measured weight?

σ²X = σ²T / ρ²TX = σ²T / (0.8)² = σ²T / 0.64

→ σ²X ≈ 1.56 σ²T

• The variance of a measured weight (X) is greater than the variance of the true weight (T) in a population!
• This is because of the addition of the variance of the measurement error
• The figure demonstrates the effect of measurement error on the distribution of X in a population, assuming a:
   • Normally distributed exposure and
   • Normally distributed error

• The bias in the measure causes a shift in the distribution of X compared with T.

• The imprecision of X causes a greater dispersion of the distribution of X compared with that of T.

• Even if a measure is correct on average, there could still be substantial effects of measurement error because of lack of precision, which would lead to a greater dispersion in the measured exposures.
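The shift and the extra dispersion can be reproduced with a small simulation. This is a sketch only: the mean, standard deviations, bias, sample size, and seed below are hypothetical choices (σT = 4 and σE = 3 give ρ²TX = 16/25 = 0.64), and both T and E are drawn from normal distributions as assumed above.

```python
import random

random.seed(2023)

n = 100_000
bias = 1.0
sd_true, sd_error = 4.0, 3.0  # hypothetical; rho_TX^2 = 16 / (16 + 9) = 0.64


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


# Draw true exposures, then add systematic and subject error: X = T + b + E.
true_values = [random.gauss(60.0, sd_true) for _ in range(n)]
observed = [t + bias + random.gauss(0.0, sd_error) for t in true_values]

mean_shift = mean(observed) - mean(true_values)            # close to b = 1
variance_ratio = variance(observed) / variance(true_values)  # close to 1 / 0.64

print(round(mean_shift, 2), round(variance_ratio, 2))
```

The mean shift should be close to b and the variance ratio close to 1/ρ²TX ≈ 1.56, up to sampling noise.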
Effects of Differential Measurement Error on OR
• Effect of exposure measurement error on measure of association is of greater
concern

• The equations given in this lecture are based on the assumption that the only
source of error in the measure of association between the exposure and
disease is measurement error in the exposure.
• Other sources of bias are assumed to be absent, including:
   o Measurement error in the disease
   o Selection bias
   o Confounding
   o Error due to sampling a finite number of subjects
• It is common in epidemiological studies to measure the exposure as a continuous variable, and for the outcome to be a dichotomous disease state.

• A commonly used measure of association is the OR,
   • i.e. the odds of disease at one level of exposure relative to the odds of disease at another (usually lower) level of exposure

• Studies of dichotomous outcomes can be thought of as a comparison of two population groups

• Extending the measurement error model to the two groups:
   • The exposure measure XN in the non-diseased group differs from the true exposure TN by: XN = TN + bN + EN
   • The exposure measure XD in the diseased group differs from the true exposure TD by: XD = TD + bD + ED
• Differential exposure measurement error occurs when:

o Bias in the exposure measure in the non-diseased group (bN) differs from the bias in the diseased group (bD), or

o Precision of XN differs from that of XD
• Effects of differential measurement error (differential bias) on the distributions of exposure among the non-diseased and diseased groups:
   • TN: true exposures among the non-diseased group
   • TD: true exposures among the diseased group
   • XN: exposures measured with error among the non-diseased group
   • XD: exposures measured with error among the diseased group
   • ORT: true odds ratio for exposure versus reference level r
   • ORO: observable odds ratio for exposure versus reference level r
• Graphical presentation of differential measurement error (differential bias between cases and controls).

• The true mean exposure in the diseased group (μTD) is greater than the true mean exposure in the non-diseased group (μTN).
   • This leads to a positive slope in the true OR curve (ORT)

• NB: The OR curve is shown as the OR for disease among those with exposure level X or T versus an arbitrary reference point r
• The distribution of XN is shifted to the right relative to TN
   → Exposure is overestimated in the non-diseased group
   • Positive bias
• The distribution of XD is shifted to the left relative to TD
   → Exposure is underestimated among those with disease
   • Negative bias

• This leads the observable OR curve to cross over the null value of 1:
   • It indicates less disease risk with increasing exposure, rather than the true increasing disease risk.
• ORO with differential measurement error:

   ORO = ORT^(C × ρ²TX), where C = 1 + (bD − bN) / (μTD − μTN)

• ORO differs from ORT by two exponential factors, C and ρ²TX

• The effect of ρ²TX is more predictable
   • It can only range from zero to one

• Factor C can be of any magnitude and either positive or negative

→ ORO could be:
• Closer to null value (1)
• Further from the null value, or
• Cross over the null value in comparison with ORT
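The three possible outcomes can be illustrated with a small function. This is a sketch that assumes the relationship ORO = ORT^(C × ρ²TX) with C = 1 + (bD − bN)/(μTD − μTN), which is the form used in the class exercise of this unit; all numeric inputs below are hypothetical.

```python
def observable_or(or_true, rho_sq, bias_d, bias_n, mu_td, mu_tn):
    """Observable OR under differential bias, assuming ORO = ORT ** (C * rho_sq).

    C = 1 + (bias_d - bias_n) / (mu_td - mu_tn); undefined when mu_td == mu_tn.
    """
    c = 1 + (bias_d - bias_n) / (mu_td - mu_tn)
    return or_true ** (c * rho_sq)


# Hypothetical setting: true OR of 2.0, rho_TX^2 = 0.64, cases truly 2 units higher.
attenuated = observable_or(2.0, 0.64, bias_d=1.0, bias_n=1.0, mu_td=62.0, mu_tn=60.0)
exaggerated = observable_or(2.0, 0.64, bias_d=3.0, bias_n=1.0, mu_td=62.0, mu_tn=60.0)
crossed = observable_or(2.0, 0.64, bias_d=-2.0, bias_n=1.0, mu_td=62.0, mu_tn=60.0)

print(round(attenuated, 2))   # C = 1:    closer to the null than 2.0
print(round(exaggerated, 2))  # C = 2:    further from the null than 2.0
print(round(crossed, 2))      # C = -0.5: below 1, on the other side of the null
```

Equal bias in both groups (C = 1) reduces to pure attenuation by ρ²TX, while unequal bias can push the observable OR in either direction.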
Class exercise
• Suppose that a study of weight and hip fracture had a case–control design and that the true average weight among cases was 2 kg less than the weight among controls. The bias of the weight measure among controls was 1 kg, but cases had gained an additional 2 kg between their hip fracture and their participation in the study because of their immobility.

• What is the observable odds ratio?

• What conclusion could be made?
Class exercise: solution

Given information:
o μTD = μTN − 2 kg
o bN = 1 kg
o bD = 3 kg (1 + 2)

Solution:
C = 1 + (bD − bN) / (μTD − μTN)
  = 1 + (3 − 1) / ((μTN − 2) − μTN)
  = 1 + 2 / (−2) = 0

ORO = ORT^(C × ρ²TX) = ORT^(0 × ρ²TX) = ORT^0 = 1

Conclusion: a differential bias between cases and controls of 2 kg would completely obscure a true difference of −2 kg, leading to no observable association between weight and hip fracture in the study.
Effects of Non-differential Measurement Error on OR
• Non-differential exposure measurement error exists if there is:
   • Equal bias and equal error variance between the diseased and non-diseased groups.

• The two distributions may shift, but they are not shifted with respect to each other, as there is equal bias for the two groups.

• Thus the observable difference in the mean values of X between cases and controls is equal to the true difference:

μXD – μXN = μTD - μTN
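The equality follows directly from the model when the same bias b applies to both groups and the subject errors have mean zero. A one-line sketch of the step:

```latex
\mu_{X_D} - \mu_{X_N}
  = (\mu_{T_D} + b) - (\mu_{T_N} + b)
  = \mu_{T_D} - \mu_{T_N}
```

The common bias cancels, so only the loss of precision remains to distort the comparison between the groups.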


• Lack of precision in X widens each distribution and leads to more overlap and less distinction between the distributions of XN and XD compared with the true distributions.

• The odds ratio curve is flattened towards the horizontal line of odds ratio equal to 1 for all X.
• The OR under non-differential measurement error is a function of the precision of X (measured by ρTX):

   ORO = ORT^(ρ²TX)

• That is, the ORO for any fixed difference in units of X is equal to the ORT for the same fixed difference in units of T raised to the power ρ²TX

• Since 0 ≤ ρ²TX ≤ 1, the ORO will be closer to the null value of 1 (no association) than the ORT.
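As an illustration of the attenuation, take a hypothetical true odds ratio of 2.0 and a validity coefficient of 0.8 (both values are assumptions chosen for this example):

```latex
OR_T = 2.0,\quad \rho_{TX} = 0.8
\;\;\Longrightarrow\;\;
OR_O = OR_T^{\,\rho_{TX}^2} = 2.0^{\,0.64} \approx 1.56
```

A true doubling of the odds would therefore appear as an increase of only about 56%.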

Effects of Non-differential Measurement Error on Power and Sample Size
• The sample size nX needed to detect a difference of d in a study with non-differential measurement error, relative to the sample size nT needed in a study in which the exposure is measured without error, is (Fleiss):

   nX = nT / ρ²TX

• This relationship is used to show the potentially dramatic effects of poor exposure measurement on the sample size required.
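The inflation in sample size can be computed directly. The sketch below assumes the relationship nX = nT / ρ²TX and rounds up to a whole subject; the function name and the baseline of 100 subjects are hypothetical.

```python
import math


def required_sample_size(n_true, rho_tx):
    """Sample size needed with an imperfect measure, assuming nX = nT / rho_TX**2."""
    if not 0 < rho_tx <= 1:
        raise ValueError("rho_tx must be in (0, 1]")
    return math.ceil(n_true / rho_tx ** 2)


# Hypothetical baseline: 100 subjects would suffice with a perfect measure.
for rho in (1.0, 0.9, 0.7, 0.5):
    print(rho, required_sample_size(100, rho))
```

With ρTX = 0.7 the requirement roughly doubles, and with ρTX = 0.5 it quadruples.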

Class exercise

• If the correlation between T and X is 0.7, what sample size is required when the imperfect measure is applied?

   nX = nT / ρ²TX = nT / (0.7)² = nT / 0.49 ≈ 2nT

→ About twice the sample size is required when an imperfect measure is applied rather than a perfect measure
Categorical Exposure Measures

• Misclassification: measurement error in categorical variables

• Categorical variables include:
   • Dichotomous
   • Nominal categorical
   • Ordered categorical

• They are subject to all the sources of measurement error
Measures of Misclassification in Categorical Variables
• Misclassification of exposure:
→ A certain proportion of subjects who truly fall into a specific exposure category are correctly classified; the remainder are misclassified into other categories.

• Misclassification matrix:
o A description of measurement error for a population for all types of categorical
variables
o It is a matrix of the proportions Cij of those with true exposure category j who will be
classified into category i.

o Depends on the:
• Instrument
• Operational procedures, and
• Population to which the instrument is applied
o Can differ by disease status
• Matrix representation
   • k: number of categories
   • For each true exposure category j, the Cij sum down the column to 1 (sum over the classified categories i)
   • The diagonal elements quantify the proportions correctly classified
   • A measure is perfect when the diagonal elements are all 1.

• For dichotomous exposure (k = 2), only two classification probabilities are needed:
1. Sensitivity of the exposure measure:
• The proportion of those who truly have the exposure who will be correctly classified as
exposed
2. Specificity of the exposure measure:
• The proportion of those who are truly unexposed who will be classified as unexposed
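A misclassification matrix can be applied to a true exposure distribution to obtain the distribution that would be observed. The sketch below uses a hypothetical 3-category matrix in which C[i][j] is the proportion of those truly in category j who are classified into category i; all numbers are invented for illustration.

```python
def columns_sum_to_one(matrix, tol=1e-9):
    """Check that every true-exposure column of C sums to 1."""
    k = len(matrix)
    return all(abs(sum(matrix[i][j] for i in range(k)) - 1.0) < tol for j in range(k))


def observed_distribution(matrix, true_dist):
    """Observed share in category i = sum over j of C[i][j] * true share in j."""
    k = len(matrix)
    return [sum(matrix[i][j] * true_dist[j] for j in range(k)) for i in range(k)]


# Hypothetical matrix: rows = classified category, columns = true category.
C = [
    [0.8, 0.1, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.1, 0.8],
]
true_dist = [0.5, 0.3, 0.2]

observed = observed_distribution(C, true_dist)
print(columns_sum_to_one(C))            # True
print([round(p, 2) for p in observed])  # [0.43, 0.38, 0.19]
```

For k = 2 the matrix reduces to the sensitivity and specificity of the exposure measure.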

• If category 1 is 'exposed' and 2 is 'unexposed', the misclassification matrix would be:

                              Truly exposed       Truly unexposed
   Classified as exposed      sensitivity         1 − specificity
   Classified as unexposed    1 − sensitivity     specificity

• 1 − sensitivity: the probability of misclassifying a truly exposed person as unexposed

• 1 − specificity: the probability of misclassifying a truly unexposed person as exposed

• Even though both sensitivity and specificity can range from 0 to 1, it is assumed that sensitivity + specificity ≥ 1

• In other words, for the instrument to be considered a measure of the exposure, it should classify a truly exposed person as exposed with greater (or at least equal) probability than it classifies a truly unexposed person as exposed, i.e. sensitivity ≥ 1 − specificity
Effects of Differential Misclassification of a Dichotomous Exposure on OR
• A more common situation in epidemiology is the comparison of exposure between two
populations: those with the disease of interest and those without.

• The effects of misclassification of categorical exposure measures are straightforward for two types of studies:
   • Studies of the association between a dichotomous exposure and a dichotomous disease outcome and,
   • Under certain assumptions, studies of an ordered categorical exposure and a dichotomous disease outcome.

• In an unmatched case–control study of a dichotomous exposure, under the assumption that the disease is measured without error, the effect of misclassification of exposure is to rearrange individuals in the true 2 × 2 table into an observable 2 × 2 table.

• Individuals remain in the correct disease group but may be misclassified as to exposure status.
• The observable OR is computed from the observable proportions exposed, which are obtained separately for the diseased and non-diseased groups:
   • PD: true proportion exposed in the diseased group
   • PN: true proportion exposed in the non-diseased group
   • pD: observable proportion exposed in the diseased group
   • pN: observable proportion exposed in the non-diseased group
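One standard way to write this calculation is p = sens × P + (1 − spec) × (1 − P) within each group, followed by ORO = [pD / (1 − pD)] / [pN / (1 − pN)]. The sketch below assumes that form; the sensitivities, specificities, and true proportions are hypothetical values chosen only to show differential misclassification creating an association where none exists.

```python
def observed_proportion(true_prop, sens, spec):
    """Observable proportion exposed: true positives plus false positives."""
    return sens * true_prop + (1 - spec) * (1 - true_prop)


def odds_ratio(p_d, p_n):
    """Odds ratio comparing exposure odds in diseased vs non-diseased groups."""
    return (p_d / (1 - p_d)) / (p_n / (1 - p_n))


# Hypothetical inputs: no true association (P_D = P_N = 0.10),
# but the exposure measure behaves differently in cases and controls.
P_D = P_N = 0.10
p_d = observed_proportion(P_D, sens=1.0, spec=0.9)   # 0.10 + 0.09 = 0.19
p_n = observed_proportion(P_N, sens=0.5, spec=1.0)   # 0.05 + 0.00 = 0.05

or_true = odds_ratio(P_D, P_N)
or_obs = odds_ratio(p_d, p_n)
print(round(or_true, 2), round(or_obs, 2))  # 1.0 4.46
```

In this illustration a true OR of 1.0 is observed as roughly 4.5 purely because sensitivity and specificity differ by disease status.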
• Differential misclassification exists when:

o Sensitivity of the exposure measure for the diseased group (sensD) differs from that for the non-diseased group (sensN), or

o Specificity of the exposure measure for the diseased group (specD) differs from that for the non-diseased group (specN), or

o Both
Exercise: calculate ORT and ORO. What is your conclusion?

ORT = (0.10 × 0.90) / (0.90 × 0.10) = 1.0

ORO = (0.19 × 0.95) / (0.81 × 0.05) ≈ 4.5

• A true OR of 1.0 (no association between disease and exposure) could appear as a strong association because of differential misclassification.
Effects of Non-differential Misclassification of a Dichotomous Exposure
on OR
• Non-differential misclassification occurs when
• The sensitivity and specificity of the exposure measurement for the diseased
group are equal to those for the non-diseased group.

• The effect of non-differential misclassification of exposure on the OR can be computed in the same way as for differential misclassification, except that there is a common sensitivity and a common specificity for the diseased and non-diseased groups.
• The observable odds ratio depends on the:
• True odds ratio
• Sensitivity
• Specificity
• Probability of exposure among the non-diseased

• Non-differential misclassification leads to an attenuation of the OR towards the null value of 1.
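The attenuation can be seen by starting from a true OR and the true exposure prevalence among the non-diseased, then applying one common sensitivity and specificity to both groups. This is a sketch assuming p = sens × P + (1 − spec) × (1 − P) in each group; the inputs are hypothetical.

```python
def observable_or_nondifferential(or_true, p_n, sens, spec):
    """Observable OR when both groups share the same sensitivity and specificity."""
    # Derive the true proportion exposed among the diseased from the true OR.
    odds_n = p_n / (1 - p_n)
    odds_d = or_true * odds_n
    p_d = odds_d / (1 + odds_d)

    # Apply the same misclassification probabilities to both groups.
    obs_d = sens * p_d + (1 - spec) * (1 - p_d)
    obs_n = sens * p_n + (1 - spec) * (1 - p_n)
    return (obs_d / (1 - obs_d)) / (obs_n / (1 - obs_n))


# Hypothetical inputs: true OR = 3.0, 20% of the non-diseased truly exposed.
or_obs = observable_or_nondifferential(3.0, 0.2, sens=0.8, spec=0.9)
or_perfect = observable_or_nondifferential(3.0, 0.2, sens=1.0, spec=1.0)

print(round(or_obs, 2))      # 2.11, pulled towards 1
print(round(or_perfect, 2))  # 3.0, no misclassification
```

The observable OR depends on all four inputs, which matches the list of determinants given above.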

Effect of Measurement Error in the Presence of Covariates
• In analyzing the relationship between an exposure and an outcome, it is usually
necessary to adjust for confounding factors.

• These factors are also subject to the sources of measurement error.

• The effects of measurement error in the primary exposure and covariates are not
easy to quantify unless the exposure error is independent of the confounder and
the confounder error, and vice versa.

• Residual confounding can remain after adjustment when:


• Confounder is measured with non-differential error, but exposure is
measured perfectly
• Exposure is measured with non-differential error and covariate is measured
perfectly
• Both the exposure and the confounder are measured with error
References
• White E, Armstrong BK, Saracci R. Principles of Exposure Measurement in Epidemiology: Collecting, Evaluating, and Improving Measures of Disease Risk Factors. Chapter 3: Exposure measurement error and its effects.

• Mokkink et al. COSMIN Risk of Bias tool to assess the quality of studies on reliability or measurement error of outcome measurement instruments: a Delphi study. BMC Medical Research Methodology (2020) 20:293.
