
MODULE 06: RESEARCH & EPIDEMIOLOGY

Principles of Validity and Reliability


ARIANNA MAEVER L. AMIT, MAS
01/13/2021
QUANTITATIVE STUDIES

TABLE OF CONTENTS
CONTINUATION: Formulating a Research Question and Creating Study Populations
I. INTERNAL VALIDITY ... 1
A. DEFINITION ... 1
B. BARRIERS TO INTERNAL VALIDITY ... 2
II. EXTERNAL VALIDITY ... 2
A. DEFINITION ... 2
B. DO INTERNALLY VALID INFERENCES APPLY TO OTHER TARGET POPULATIONS? ... 2
C. BARRIERS TO EXTERNAL VALIDITY ... 2
III. VALIDITY ... 3
A. COMPONENTS OF VALIDITY ... 3
B. VALIDITY FOR INSTRUMENTS ... 4
IV. RELIABILITY ... 4
A. RELIABILITY MEASURES FOR INSTRUMENTS ... 4
V. CONCLUSION ... 5
QUICK REVIEW ... 5
SUMMARY OF CONCEPTS ... 5
SUMMARY OF NEED-TO-KNOWS (NDTK) ... 5
SUMMARY OF EQUATIONS ... 6
REVIEW QUESTIONS ... 6
REFERENCES ... 6
REQUIRED ... 6

LEARNING OBJECTIVES
1) Define external and internal validity in the context of both epidemiology and the field of psychometrics
2) Differentiate between systematic error and random error
3) Differentiate between the concepts of validity and reliability in measurement
4) Explain different methods to assess validity
5) Interpret selected statistical measures to assess inter-rater reliability and internal reliability

RECALL: Formulating a Research Question and Creating Study Populations
1) Define population at risk, exposure of interest, and outcome
2) Does A cause B?
3) Who should be studied?
4) How do they relate to those eligible to be studied?
5) How do they relate to those at risk for outcome?

• Step 2 is concerned with exposure and outcome association
• Step 3 to 5 are concerned with population of the study

Figure 1. Relationship between Study Population, Eligible Population and Target Population

• Target population is the enumeration of all observations
• Source population is the population eligible for the study
• Study population is the population that is eligible and has consented to participate in the study

Figure 2. An example wherein target population, source population and study population are the same

• Studies that have very specific parameters can limit the target population to a small, more manageable number, that everyone in the target population can be considered as a member of the study population
o E.g., Health outcomes of all Filipinos enrolled in a certain medical insurance aged 70 and above
o Because of this, sampling error is negligible, or even zero, since there was no need to sample
o In terms of analysis, no hypothesis testing is necessary. There is no need to infer about a target population using a study population
§ In these small studies, the study population = target population, so these studies do not need to create inferences

I. INTERNAL VALIDITY
A. DEFINITION
• An internally valid study is a study that:
o Is free from bias or systematic error
o Has sound study design, conduct, and analysis
• Internal validity is a pre-requisite for external validity

Figure 3. Biases when selecting populations

• A source population is selected from the target population
• When people from the source population have consented to participating in the study, they become part of the study population
o Possible biases in selecting a study population:
§ More likely to include those that are available, e.g., at home, with access to the study
§ Those that are working, those that are in school, those that cannot access an internet-based study, etc. may systematically be excluded
• Observed data is obtained after collecting data from the study population
o Biases may arise from what are included as data points, and what are excluded based on different inclusion criteria
• In summary, internal validity refers to how well the inferences of the study population reflect the target population, if the whole target population was to be studied
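As a rough illustration of how a study population can fail to reflect the target population, the sketch below compares an outcome proportion in a whole target population with the proportion seen when only people with internet access can enroll. All group names and counts are invented for illustration and are not from the lecture.

```python
# Hypothetical target population; every count here is invented for illustration.
# Each group maps to (number of people, number with the outcome of interest).
target_population = {
    "with_internet": (600, 480),
    "without_internet": (400, 120),
}


def outcome_proportion(groups):
    """Proportion of people with the outcome across the given groups."""
    total_people = sum(n for n, _ in groups.values())
    total_outcome = sum(k for _, k in groups.values())
    return total_outcome / total_people


# Value we would see if the whole target population were studied.
true_proportion = outcome_proportion(target_population)

# An internet-based study can only enroll people who have internet access,
# so one group is systematically excluded from the study population.
study_population = {"with_internet": target_population["with_internet"]}
observed_proportion = outcome_proportion(study_population)

print(f"Target population: {true_proportion:.2f}")
print(f"Study population:  {observed_proportion:.2f}")
print(f"Difference:        {observed_proportion - true_proportion:+.2f}")
```

With these made-up counts the study population overstates the proportion, because the excluded group differs systematically from the included one.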

Transcribed by TG 17: Aw Young, Banzon, Chua, Elefante, Lazo, Mendez, Mesina


YL6: 06.07
Checked by TG 5: Bartelheimer, Cerezo, Cua, Felizarte, Lee, Melo, Nuguid, San Pascual 1 of 6
Figure 4. Internal Validity

• Internal validity depends on our knowledge of study methods
o Researchers need to enumerate how they selected participants, who they were able to obtain, and who they missed
o There is a need to specify data collection methods, measurements, and data analysis
§ When looking at reports in the scientific literature, one can see that there are many descriptions of sample collection, data collection, and data analysis
§ This is something we should do and apply when doing research

B. BARRIERS TO INTERNAL VALIDITY
1) Vague definition of the target population
2) Inability to define the source population
3) Problems with enrolling the study population

II. EXTERNAL VALIDITY
A. DEFINITION
• “Generalizability”: Degree to which the results of the study may apply, be relevant, or be generalized to populations or groups that did not participate in the study
o How the measurement in the sample can be generalized to the population
o Refers to how accurately the measures obtained from the study sample describe the reference population from which the sample was drawn

B. DO INTERNALLY VALID INFERENCES APPLY TO OTHER TARGET POPULATIONS?
• One can assess if internally valid inferences can apply to other target populations, other persons, other places, and other time periods
o E.g., Can the results of certain studies be generalized to other persons?
§ If the study is exclusively done on men, can it be generalized to women?
• External validity depends on our knowledge of study methods as well
o There is a need to state who the study population is, and this can be done by clearly defining the inclusion and exclusion criteria
• Also depends on one’s knowledge of the subject matter (people in the study).
o Are the circumstances of exposure the same?
o Are all other related factors, whether measured or not, similar?
• There is no way to measure all of these, and assumptions have to be made

Figure 5. Internally Valid Inferences Applied to Target Populations

NDTK: Generalizability and Internal Validity
• There is no test or automated method to ensure that inferences are applicable to other populations. However, inferences, before they are generalizable, should have internal validity
o It does not make sense to ask about generalizability if inferences are not internally valid.

ACTIVE RECALL
1. Can a study have the same target population and study population? How?
2. T/F. It makes sense to ask about generalizability if inferences are not internally valid.

ANSWERS: 1) Yes, if the inclusion criteria of the study are specific enough to limit the target population to a manageable size, 2) F

C. BARRIERS TO EXTERNAL VALIDITY
Selection Bias
• Distortion of a measure of association from what would be observed in the target population due to the selection of participants
• Usually a result of:
o Poor eligibility criteria
o Poor sampling frame
o Differential participation
o Non-participation
o Non-response
o Losses to follow up
• The risk of bias can be minimized by giving careful thought to the selection of the study population and the methods of data collection to be used

Measurement Error
• Any difference between the value of an exposure or outcome variable, as measured in a study, and its true value
• Occurs when the response has a tendency to differ from the true value in one direction
• Can be minimized by careful questionnaire design and good training of study staff
• Subdivided into validity and reliability in epidemiological books

Validity
• How close the test comes to measuring the variable we are interested in
• Analogous to accuracy

Figure 6. Validity in terms of true value

Reliability
• How consistent the test is when measured or used by different observers or over different periods of time
• Analogous to precision

Figure 7. Reliability in terms of true value
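The analogy of validity with accuracy and reliability with precision can be sketched numerically. The example below assumes a known true value and two sets of repeated readings; the true value, the instrument labels, and the readings are all hypothetical. Bias (mean reading minus true value) stands in for validity, and the standard deviation of the readings stands in for reliability.

```python
from statistics import mean, stdev

TRUE_VALUE = 120.0  # hypothetical true value of the quantity being measured

# Invented repeated readings from two hypothetical instruments.
readings = {
    "valid but less reliable": [112.0, 128.0, 118.0, 124.0, 116.0, 122.0],
    "reliable but less valid": [129.0, 131.0, 130.0, 129.5, 130.5, 130.0],
}


def describe(values, true_value):
    """Return (bias, spread): mean offset from the true value and sample SD."""
    bias = mean(values) - true_value
    spread = stdev(values)
    return bias, spread


summary = {name: describe(vals, TRUE_VALUE) for name, vals in readings.items()}

for name, (bias, spread) in summary.items():
    print(f"{name}: bias = {bias:+.2f}, spread (SD) = {spread:.2f}")
```

The first set of readings is centered on the true value but scattered widely; the second is tightly clustered but centered away from the true value.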

YL6: 06.07 RESEARCH & EPIDEMIOLOGY: Principles of Validity and Reliability 2 of 6


Total Measurement Error

Figure 8. Total Measurement equations

• Any estimated value is the sum of the true value and the total measurement error (see Figure 8)
• Total measurement error arises from:
o Systematic error: usually associated with validity
o Random error: usually associated with reliability
• It is important to know which type of error you have
o Random error can be reduced by taking two or more readings of the same experiment
o Systematic error would persist to the same extent

Differential Error
• Differential (non-random) misclassification occurs when the proportions of subjects misclassified differ between study groups
• The probability of exposure being misclassified is dependent on disease status, or the probability of disease status being misclassified is dependent on exposure status
• Considered a more serious problem, as the effect of differential misclassification is that the observed estimate of effect can be biased in the direction of producing either an overestimate or an underestimate of the true association

Non-differential Error
• Non-differential (random) misclassification occurs when misclassification of disease status or exposure occurs equally in all study groups being compared
• The probability of exposure being misclassified is independent of disease status and the probability of disease status being misclassified is independent of exposure status
• Increases the similarity between the exposed and non-exposed groups, and may result in an underestimate (dilution) of the true strength of an association between exposure and disease

Figure 9. Diagram of differential and non-differential errors

• Notes from Figure 9:
o If error varies by one or more of any of the characteristics (e.g., risk factors, disease status, age, sex, socioeconomic status), the error is differential and if not, it is non-differential
§ Non-differential error means that the degree of misclassification is the same by group
o If random error is present, variability is wide
o If systematic error is present, the mean estimate deviates from the true mean but variability is narrower
o If differential error is present, the errors differ between the two groups

ACTIVE RECALL
3. Selection biases occur due to the following EXCEPT:
a) Poor eligibility criteria
b) Poor sampling frame
c) Non-participation
d) Losses to follow up
e) None of the above
4. Identify the errors related to Total Measurement Error

ANSWERS: 3) E, 4) Systematic Error and Random Error

III. VALIDITY

Figure 10. Illustration of Validity

• As seen in Figure 10,
o Results with good validity: centered on the true value
o Results with poor validity: not centered on the true value

A. COMPONENTS OF VALIDITY
Ideally, validity is measured by comparing results of the tests with the true values. However, in practice, test results are compared to the best available test or gold standard test.

• Sensitivity
o Ability of a test to correctly identify individuals who have the disease
• Specificity
o Ability of a test to correctly identify individuals who do not have the disease

Figure 11. Summary of Terms.
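The two definitions above translate directly into ratios of counts. The following is a minimal sketch, assuming a test compared against a gold standard in 100 diseased and 900 non-diseased individuals; the function names are illustrative, and the counts (80 true positives, 20 false negatives, 800 true negatives, 100 false positives) are those implied by the lecture's worked example.

```python
def sensitivity(true_positives, false_negatives):
    """Share of diseased individuals that the test correctly calls positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of non-diseased individuals that the test correctly calls negative."""
    return true_negatives / (true_negatives + false_positives)


# Counts implied by the worked example: 100 diseased, 900 non-diseased.
tp, fn = 80, 20
tn, fp = 800, 100

sn = sensitivity(tp, fn)
sp = specificity(tn, fp)

print(f"Sensitivity = {tp}/{tp + fn} = {sn:.0%}")
print(f"Specificity = {tn}/{tn + fp} = {sp:.0%}")
```

Note that the denominators are the column totals of the gold standard (all diseased, all non-diseased), not the totals of positive or negative test results.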



[EXAMPLE]: SENSITIVITY AND SPECIFICITY
Assume a population of 1,000 individuals, of whom 100 have the disease and 900 do not have the disease. In both groups, some will test positive and some will test negative.

Figure 12. Calculating sensitivity vs. specificity.

• Sensitivity
o SN = TP/(TP+FN) = 80/100 = 80%
• Specificity
o SP = TN/(TN+FP) = 800/900 = 89%

ACTIVE RECALL
5. T/F. Results with poor validity are centered on the true value.
6. T/F. Sensitivity is the ability of a test to correctly identify individuals who DO NOT have the disease.

ANSWERS: 5) F, 6) F

B. VALIDITY FOR INSTRUMENTS
• In this context, validity is the degree to which the evidence and theory support the interpretations of test scores entailed by proposed uses of tests
• Many use the term “validated instruments”, but this concept is rejected by experts in the field
o Validity is the property of the inference, not the instrument
o Validity of interpretations is a matter of degree
§ An instrument’s scores will reflect the underlying construct more accurately or less accurately, but never perfectly
• Common categorization of Validity
o Content: extent to which the items in a questionnaire are representative of the entire theoretical construct the questionnaire is designed to assess
§ Done by content experts
§ Problem: it is very subjective
o Criterion: how well does one measure predict an outcome for another measure?
o Construct: most important concept in evaluating a questionnaire designed to measure a construct that is not directly observable (e.g., pain)
§ There have been suggestions that all validity should be conceptualized under this framework
Þ Reasoning: instrument scores are only useful inasmuch as they reflect the desired construct, and evidence must be collected to support this relationship

IV. RELIABILITY
• Ask the question: Are the test results repeatable?

A. FACTORS AFFECTING RELIABILITY
• Intra-person
o Variation related to factors including biologic, environmental, etc.
• Intra-observer
o Same observer may not interpret the test result in the same way every time
• Inter-observer
o Two (or more) different observers may not interpret the same test result in the same way

B. SOME RELIABILITY MEASURES IN EPIDEMIOLOGIC RESEARCH
Categorical variables
• Percent agreement
• Percent positive agreement
• Kappa statistics

Continuous variables
• Correlation coefficient
• Coefficient of variation
• Regression

Figure 13. Comparison between results from a more reliable test vs. a less reliable test.

NOTE: Doc only enumerated the reliability measures in epidemiological research and did not discuss what they were

ACTIVE RECALL
7. Identify. Three common categorizations of validity for instruments
8. Identify. Three types of factors that affect the reliability of a test result

ANSWERS: 7) Content, Criterion, Construct, 8) Intra-person, Intra-observer, Inter-observer

C. RELIABILITY MEASURES FOR INSTRUMENTS
Internal consistency
• Variation related to factors including biologic, environmental, etc.
• Extent to which the questionnaire items are inter-correlated, or whether they are consistent in the measurement of the same construct
• Cronbach’s alpha
o Statistical tool that measures internal consistency
o Ranges from 0 to 1
§ When the value is 0: There is no internal consistency and none of the items are correlated
§ When the value is 1: Perfect internal consistency
§ Value of at least 0.7 has been suggested to indicate adequate internal consistency

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_t^2}\right)$$
Equation 1. Cronbach’s alpha formula for internal consistency

NOTE: We are not expected to compute Cronbach’s alpha manually since statistical software may be used. It is important to remember though that Cronbach’s alpha values range from 0 to 1.

Test-retest reliability
• Same observer may not interpret the test result in the same way every time
• Extent to which an individual’s responses to the questionnaire items remain relatively consistent across repeated administration of the same questionnaire or alternate questionnaire forms

Inter-rater reliability
• Two different observers may not interpret the same result in the same way



• Extent to which raters are consistent in their observations across the same group of examinees
• Kappa statistic: the proportion of agreement between the two raters after factoring out the proportion of agreement by chance
o Ranges from 0 to 1 (negative values are possible when agreement is worse than chance)
§ When the value is 0: Represents all chance agreements
§ When the value is 1: Perfect agreement between the two raters

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$
Equation 2. Kappa statistic

V. CONCLUSION
• It is not easy to develop and have an instrument validated
o This process requires more steps than what was taught in the lecture
§ E.g., Content validity is just having an assessment by content experts familiar with the subject
Þ This can be subjective and other objective measures can be used
• The entire process from developing a construct to having a validated and developed instrument takes a lot of time
o It starts with identifying a construct of interest and asking if there is a validated questionnaire available
§ If there is none, you develop one for your study, which includes the following steps:
Þ Establish an expert committee
Þ Identify the dimensionality of the construct
Þ Determine the questionnaire format
Þ Develop the items
Þ Review
§ Once all steps are done, that is when preliminary pilot testing and final validation is done
Þ Iterative and may take a year or more
Þ Validation studies are usually done by PhD students abroad. In the Philippines, validation studies are usually done for Master’s thesis.

To maximize the benefit to society, you need to not just do research but do it well (Professor Doug Altman)

HOW’S MY TRANSING?
Feedback Form: https://tinyurl.com/2024YL6gHMT
Errata Tracker: https://tinyurl.com/2024YL6ET06

QUICK REVIEW
SUMMARY OF CONCEPTS
INTERNAL VALIDITY
• A study is internally valid if it:
o Is free from bias or systematic errors
o Has sound study design, conduct and analysis
• Refers to how well the inferences of the study population reflect the target population
• Depends on our knowledge of study methods
o There is a need to specify data collection methods, measurements, and data analysis
• Barriers to Internal Validity
o Vague definition of the target population
o Inability to define the source population
o Problems with enrolling the study population

EXTERNAL VALIDITY
• Generalizability: Degree to which the results of the study may apply, be relevant, or be generalized to populations or groups that did not participate in the study
• One can assess if internally valid inferences can apply to other target populations, other persons, other places, and other time periods
• External validity depends on one’s knowledge of the subject matter and study methods
• Barriers to External Validity
o Selection Bias: Distortion of a measure of association from what would be observed in the target population due to the selection of participants
o Measurement Error: Any difference between the value of an exposure or outcome variable, as measured in a study, and its true value
§ Validity: Asks how close the test comes to measuring the variable we are interested in
§ Reliability: Asks how consistent the test is when measured or used by different observers or over different periods of time
o Total Measurement Error: Arises from systematic error, usually associated with validity, and random error, usually associated with reliability
o Differential Error: Occurs when the proportions of subjects misclassified differ between the study groups
o Non-differential Error: Occurs when misclassification of disease status or exposure occurs equally in all study groups being compared

VALIDITY
• Good Validity: centered on the true value
• Poor Validity: not centered on the true value
• Components of Validity:
o Sensitivity: Ability of a test to correctly identify individuals who have the disease
o Specificity: Ability of a test to correctly identify individuals who do not have the disease

VALIDITY FOR INSTRUMENTS
• In this context, validity is the degree to which the evidence and theory support the interpretations of test scores
• Common categorization
o Content: how representative of the entire theoretical construct are the items in the questionnaire?
o Criterion: how well does one measure predict an outcome for another measure?
o Construct: how well does this measure the desired construct?
§ Most important concept in evaluating a questionnaire designed to measure a construct that is not directly observable (e.g., pain)

RELIABILITY
• Are the test results repeatable?
• Factors affecting reliability
o Intra-person: variation related to factors including biologic, environmental, etc.
o Intra-observer: same observer may not interpret the test result in the same way every time
o Inter-observer: two (or more) different observers may not interpret the same test result in the same way

RELIABILITY MEASURES FOR INSTRUMENTS
• Internal consistency: Extent to which the questionnaire items are inter-correlated, or whether they are consistent in the measurement of the construct
o Cronbach’s alpha: Statistical tool that measures internal consistency
§ When the value is 0: There is no internal consistency
§ When the value is 1: Perfect internal consistency
• Test-retest reliability: Extent to which an individual’s responses to the questionnaire items remain relatively consistent across repeated administration of the same questionnaire or alternate questionnaire forms
• Inter-rater reliability: Extent to which raters are consistent in their observations across the same group of examinees
o Kappa: the proportion of agreement between the two raters after factoring out the proportion of agreement by chance
§ When the value is 0: Represents all chance agreements
§ When the value is 1: Perfect agreement between the two raters

CONCLUSION
• It is not easy to develop and have an instrument validated
• The entire process from developing a construct to having a validated and developed instrument takes a lot of time

SUMMARY OF NEED-TO-KNOWS (NDTK)
• There is no test or automated method to ensure that inferences are applicable to other populations. However, inferences, before they are generalizable, should have internal validity
o It does not make sense to ask about generalizability if inferences are not internally valid.
• Internal validity is a prerequisite to external validity
• Cronbach’s alpha: Value ranges from 0 to 1
• Kappa statistic: Value ranges from 0 to 1
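Although the lecture notes that Cronbach’s alpha is normally obtained from statistical software, a short sketch can make both reliability statistics more concrete. The code below is a minimal illustration, not the lecture’s method: the function names and all data are hypothetical, alpha uses the standard form (number of items k, sum of item variances, variance of total scores), and the chance agreement for kappa is assumed to come from each rater’s marginal proportions, as in Cohen’s formulation for two raters.

```python
from statistics import variance


def cronbach_alpha(responses):
    """responses: list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))  # regroup scores item by item
    sum_item_variances = sum(variance(item) for item in items)
    total_score_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_score_variance)


def kappa(rater_a, rater_b):
    """Agreement between two raters beyond chance (assumed Cohen's form)."""
    n = len(rater_a)
    # Observed agreement: share of subjects rated identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions, per category.
    categories = set(rater_a) | set(rater_b)
    p_chance = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_chance) / (1 - p_chance)


# Invented questionnaire data: 4 respondents x 3 items that move together.
responses = [[3, 3, 3], [4, 4, 4], [5, 5, 5], [2, 2, 2]]

# Invented ratings: two raters classify the same 10 subjects.
rater_a = ["pos"] * 5 + ["neg"] * 5
rater_b = ["pos"] * 4 + ["neg"] * 5 + ["pos"]

print(f"Cronbach's alpha = {cronbach_alpha(responses):.2f}")
print(f"Kappa            = {kappa(rater_a, rater_b):.2f}")
```

In this toy data the three items are perfectly consistent with each other, and the raters agree on 8 of 10 subjects while 5 of 10 agreements would be expected by chance.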



SUMMARY OF EQUATIONS

Equation 1. Sensitivity
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
• Where:
o TP = True Positive
o FN = False Negative

Equation 2. Specificity
$$\text{Specificity} = \frac{TN}{TN + FP}$$
• Where:
o TN = True Negative
o FP = False Positive

Equation 3. Cronbach’s alpha formula for internal consistency
$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum \sigma_i^2}{\sigma_t^2}\right)$$

Equation 4. Kappa statistic
$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

REVIEW QUESTIONS
1. T/F: The Department of Education wanted to survey the student population on whether or not they were prepared to shift from face-to-face classes to online. Posting the Google Forms link on Facebook will allow them to disseminate the survey and come up with an internally sound study.

2. Identify. This is the degree to which the results of the study may apply, be relevant, or be generalized to populations or groups that did not participate in the study

3. Reliability asks how close the test comes to measuring the variable we are interested in. Validity asks how consistent the test is when measured or used by different observers or over different periods of time.
a) Statement 1 is true
b) Statement 2 is true
c) Both statements are true
d) Both statements are false

4. Assume a population of 1,000, of whom 200 have the disease and 800 do not have the disease. Among those who have the disease, 125 tested positive while 75 tested negative. Among those who do not have the disease, 200 tested positive while 600 tested negative. What is the sensitivity of the test?
a) 60%
b) 75%
c) 72.5%
d) 62.5%

5. Assume a population of 1,000, of whom 200 have the disease and 800 do not have the disease. Among those who have the disease, 125 tested positive while 75 tested negative. Among those who do not have the disease, 200 tested positive while 600 tested negative. What is the specificity of the test?
a) 60%
b) 75%
c) 72.5%
d) 62.5%

6. Which of the following is true?
a) One of the flaws of content validity is that it is subjective
b) Content validity is the most important concept in evaluating a questionnaire that measures a construct that is not directly observable
c) Validity is the property of the instrument
d) NOTA

7. Which of the following best describes test-retest reliability?
a) Extent to which the questionnaire items are inter-correlated
b) Extent to which raters are consistent in their observations
c) Extent to which individual’s responses to the questionnaire items remain relatively consistent across repeated administration
d) NOTA

8. What does a 0 Cronbach’s alpha value indicate for the internal consistency?
a) No internal consistency
b) Perfect internal consistency
c) Perfect agreement
d) All chance agreements
e) NOTA

ANSWERS:
1) F, 2) Generalizability, 3) D, 4) D, 5) B, 6) A, 7) C, 8) A

EXPLANATIONS:
1. False. An online survey will not reach students who do not have internet connectivity. The results of the study will be skewed to favor online classes as only those with internet will be able to answer the survey

REFERENCES
REQUIRED
(1) Arianna Maever L. Amit, MAS. 01/11/2021. Principles of Validity and Reliability [Lecture slides].
(2) ASMPH 2023. 06.17: Principles of Validity and Reliability by Joseph Anthony Lachica, MD.

