
BSN/NURSING RESEARCH 2 (NCM 115)


Academic Year: 2021-2022, 1st Semester
(3:00-4:30 M-W)

MODULE # 2

RELIABILITY AND VALIDITY OF MEASURING INSTRUMENTS

Learning Objectives:

1. Define reliability and validity.


2. Become familiar with the three aspects of reliability.
3. Differentiate between internal validity and external validity.
4. Describe and distinguish between sensitivity and specificity.

Learning Contents:

Research Instruments are scientific and systematic tools which are designed in order to help the researcher collect,
measure, and analyze data related to the topic of the research.
These tools are most commonly used in health sciences, social sciences, and education to assess patients, clients,
students, teachers, staff, etc.
Research instruments can include questionnaires, interviews, tests, surveys, scales, or checklists.
The choice of which specific research instrument to use is decided by the researcher.
It will also be strongly related to the actual methods that will be used in the specific study.

The two primary criteria for assessing a quantitative research instrument are reliability and validity.
The principles of validity and reliability are fundamental cornerstones of the scientific method.
Together, they are at the core of what is accepted as scientific proof, by scientists and philosophers alike.

Reliability

Reliability is the degree of consistency or accuracy with which an instrument measures an attribute.
This refers to the repeatability of a measure, i.e., the degree of closeness between repeated measurements of
the same value.
It is the extent to which an experiment, test, or any measuring procedure gives the same result on repeated
trials, every time.
Reliability addresses the question: If the same thing is measured several times, how close are the measurements
to each other?

The idea behind reliability is that other researchers must be able to perform exactly the same experiment, under
the same conditions, and generate the same results.
This prerequisite is essential in establishing a hypothesis as true and ensures that the research community will
accept the hypothesis.
The higher the reliability of an instrument, the lower the amount of error in obtained results.

For example, if you are performing a time critical experiment, you will be using some type of stopwatch.
Generally, it is reasonable to assume that the instrument (the stopwatch) is reliable and will keep true and
accurate time.

Engr. Manuel S. Tumalad mstumalad@gmail.com 0915 879 1419



On the other hand, any experiment that uses human judgment is always going to come under question.
Human judgment can vary widely between observers, and the same individual may rate things differently
depending on the time of day and the observer's current mood.
Poor reliability can result from the following sources of variation:
a) Variation in the characteristic of the subject being measured (e.g., blood pressure)
b) The measuring instruments (e.g., questionnaires)
c) The persons collecting the information (observer variation)

Observer variations may be:


• Inter-observer variation: differences between observers in measuring the same observation.
• Intra-observer variation: differences in measuring the same observation by the same observer on different
occasions.

The three key aspects of reliability are:


• Stability,
• Internal consistency, and
• Equivalence.

Stability is an aspect of reliability that indicates if a test is stable over time, i.e., that the results do not change
over time.
It answers the question, “Will the scores be stable over time?”
A test or measure is administered.
To determine stability, the same test or measure is re-administered at a future date to the same or highly similar
group.
Results are compared and correlated with the initial test to give a measure of stability.
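As an illustration, test-retest stability is commonly quantified by correlating the two sets of scores. The sketch below uses hypothetical scale scores for six patients measured on two occasions; a coefficient near 1 indicates a stable measure.

```python
# Test-retest (stability) reliability: correlate scores from two
# administrations of the same instrument to the same group.

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical anxiety-scale scores for 6 patients, two weeks apart.
time1 = [12, 18, 25, 30, 22, 15]
time2 = [13, 17, 26, 29, 24, 14]

r = pearson_r(time1, time2)
print(f"Test-retest reliability: r = {r:.2f}")  # values near 1 indicate stability
```

The patient scores above are invented for illustration; in practice the retest interval must be long enough to avoid memory effects but short enough that the attribute itself has not changed.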

Internal consistency refers to the extent to which all the instrument’s items are measuring the same attribute.
It answers the question, “How well does each item measure the characteristic being studied?”
Internal consistency reliability is a way to gauge how well a test or survey is actually measuring what you want it
to measure.
It is only measured once and need not be repeated.

[Example] You want to find out how satisfied your clients are with the level of patient care they
receive at your hospital. You send out a survey with three statements designed to
measure overall satisfaction.
1. I was satisfied with my experience.
2. I will probably recommend your company to others.
3. If I write an online review, it would be positive.
Choices for each question are: Agree/Neutral/Disagree

If the survey has good internal consistency, respondents should answer the same way for each
question, i.e., three "agrees" or three "disagrees." If different answers are given,
this is a sign that your questions are poorly worded and are not reliably measuring
customer satisfaction.
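Internal consistency is commonly summarized with Cronbach's alpha. The sketch below computes alpha for the three survey items above, using hypothetical responses coded Agree = 3, Neutral = 2, Disagree = 1 for five respondents.

```python
# Cronbach's alpha: a common statistic for internal consistency.
# alpha = (k / (k - 1)) * (1 - sum(item variances) / variance(total scores))

def cronbach_alpha(items):
    """items: one list of scores per item; all lists cover the same respondents."""
    k = len(items)                 # number of items
    n = len(items[0])              # number of respondents

    def sample_var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(sample_var(i) for i in items) / sample_var(totals))

# Hypothetical responses to the three satisfaction statements,
# coded Agree = 3, Neutral = 2, Disagree = 1, for five respondents.
item1 = [3, 2, 3, 1, 3]
item2 = [3, 2, 3, 1, 2]
item3 = [3, 3, 3, 1, 3]

alpha = cronbach_alpha([item1, item2, item3])
print(f"Cronbach's alpha = {alpha:.2f}")  # values above ~0.70 are usually acceptable
```

Because the hypothetical respondents answer all three items consistently, alpha here is high; mixed answers across the items would pull it down.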

Equivalence answers the question, “Are the two forms of the test or measure equivalent?”
Equivalence is the degree to which alternate forms of the same measurement instrument produce the same
result.


If different forms of the same test or measure are administered to the same group, one would expect the
reliability coefficient to be high if the forms are equivalent.
Testing equivalence involves checking that the same measure applied by two different observers, or parallel
forms administered at the same time, give similar results.

With this method, the researcher creates a large set of questions that measures the same characteristic, and
then randomly divides the questions into two sets.
Both sets are given to the same sample of people.
If there is a strong correlation between the instruments, we have high reliability.
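The split-half procedure described above can be sketched as follows: randomly divide the items into two halves, score each half per respondent, correlate the half-scores, and apply the Spearman-Brown correction (since each half is only half as long as the full instrument). The item scores are hypothetical.

```python
# Split-half reliability with Spearman-Brown correction.
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sum((a - mx) ** 2 for a in x) *
                  sum((b - my) ** 2 for b in y)) ** 0.5

# Hypothetical responses: rows = respondents, columns = 6 items (1-5 scale).
scores = [
    [4, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 4],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 2, 2, 1],
]

random.seed(0)                     # fixed seed so the split is reproducible
cols = list(range(6))
random.shuffle(cols)               # randomly divide the items into two halves
half_a = [sum(row[c] for c in cols[:3]) for row in scores]
half_b = [sum(row[c] for c in cols[3:]) for row in scores]

r_half = pearson_r(half_a, half_b)
reliability = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
print(f"split-half r = {r_half:.2f}, corrected reliability = {reliability:.2f}")
```

A strong correlation between the two half-scores indicates that the two sets of questions are measuring the same characteristic.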

[Example] Comparing two methods of blood pressure measurement to determine if they are equivalent,
e.g., whether the systolic BP results determined by an automatic BP cuff are equivalent to
those recorded from manual BP measurement using a stethoscope and sphygmomanometer.

Validity

Validity is the degree to which an instrument measures what it is supposed to measure.


This refers to the degree of closeness between a measurement and the true value of what is being measured.
Validity addresses the question, how close is the measured value to the true value?
Validity may be internal or external.

Internal validity is the extent to which you can be confident that a cause-and-effect relationship established in a
study cannot be explained by other factors.
In other words, can you reasonably draw a causal link between your independent variable and the dependent
variable?
Internal validity makes the conclusions of a causal relationship credible and trustworthy.
Without high internal validity, an experiment cannot demonstrate a causal link between the two variables.

[Example] You want to test the hypothesis that drinking a cup of coffee improves memory.
You schedule an equal number of college-aged participants for morning and evening
sessions at the laboratory.
For convenience, you assign all morning session participants to the treatment group (those
who will drink coffee) and all evening session participants to the control group (those who
will not drink coffee).
Once they arrive at the laboratory, the treatment group participants are given a cup of
coffee to drink, while control group participants are given water.
You also give both groups memory tests.
After analyzing the results, you find that the treatment group performed better than the
control group on the memory test.
Can you conclude that drinking a cup of coffee improves memory performance?
For your conclusion to be valid, you need to be able to rule out other explanations for the
results.

External validity asks the question of generalizability: To what populations, settings, treatment variables and
measurement variables can this effect be generalized?


The main criterion of external validity is the process of generalization, and whether results obtained from a small
sample group, often in laboratory surroundings, can be extended to make predictions about the entire
population.
For example, a vaccine that is 99% effective against Covid-19 in a clinical sample of 1,000 people has
strong external validity if it is also about 99% effective for the 2 million residents of a city.

External validity is the most difficult of the validity types to achieve.


Many scientific disciplines, especially the social sciences, face a long battle to prove that their findings represent
the wider population in real world situations.
The reality is that if a research study has poor external validity, the results will not be taken seriously.

Reliability and validity are not independent qualities of an instrument.


A measure can be reliable without being valid.
A measure cannot be valid without being reliable.
An instrument cannot validly measure an attribute if it is inconsistent and inaccurate.

However, if one cannot have both, validity is more important in situations when we are interested in the absolute
value of what is being measured.
Reliability on the other hand is more important when it is not essential to know the absolute value, but rather we
are interested in finding out if there is a trend, or to rank values.

Sensitivity and specificity

Sensitivity and specificity are criteria that are important in evaluating instruments designed as screening
instruments or diagnostic aids.
Screening refers to the application of a medical procedure or test to people who as yet have no symptoms of a
particular disease, for the purpose of determining their likelihood of having the disease. 
The screening procedure itself does not diagnose the illness.
Those who have a positive result from the screening test will need further evaluation with subsequent
diagnostic tests or procedures.

Whenever we create a test to screen for a disease, to detect an abnormality or to measure a physiological
parameter such as blood pressure (BP), we must determine how valid that test is—does it measure what it sets
out to measure accurately?
There are lots of factors that combine to describe how valid a test is: sensitivity and specificity are two such
factors.
Sensitivity and specificity are measures of a test's ability to correctly classify a person as having a disease or not
having a disease.


Sensitivity is the ability of an instrument to identify a ‘case’ correctly, that is, to screen or diagnose a condition
correctly.
It is the ability of a test to correctly identify patients with a disease, or true positives. 
True positive: the person has the disease and the test is positive.

Specificity is the instrument’s ability to identify ‘non-cases’ correctly, that is, to screen out those without the
condition correctly.
It is the ability of a test to correctly identify people without the disease.
It is the rate of yielding ‘true negatives.’
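The two definitions above reduce to simple proportions computed from a 2x2 screening table. The sketch below uses hypothetical screening results for 1,000 people, 100 of whom actually have the disease.

```python
# Sensitivity and specificity from a 2x2 screening table.
# TP: diseased, test positive   FN: diseased, test negative
# TN: healthy,  test negative   FP: healthy,  test positive

def sensitivity(tp, fn):
    """True-positive rate: proportion of diseased people the test detects."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True-negative rate: proportion of healthy people the test clears."""
    return tn / (tn + fp)

# Hypothetical screening results for 1,000 people:
tp, fn = 90, 10     # 100 people actually have the disease
tn, fp = 855, 45    # 900 people do not

print(f"Sensitivity: {sensitivity(tp, fn):.0%}")  # 90/100  -> 90%
print(f"Specificity: {specificity(tn, fp):.0%}")  # 855/900 -> 95%
```

Note that a highly sensitive test may still produce many false positives (low specificity), which is why a positive screening result is followed by confirmatory diagnostic testing.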

Sources/References:

Polit, D. F., & Beck, C. T. (2004). Nursing research: Principles and methods (7th ed.). Philadelphia: Lippincott
Williams & Wilkins.
