
Research Methodology

VALIDITY AND RELIABILITY IN RESEARCH


INTRODUCTION

• In quantitative research, you have to consider the reliability and validity of your methods and measurements. Reliability and validity are concepts used to evaluate the quality of research. They indicate how well a method, technique or test measures something. Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.
• It’s important to consider reliability and validity when you are
creating your research design, planning your methods, and writing
up your results, especially in quantitative research.
VALIDITY

• Validity tells you how accurately a method measures something. If a method measures what it claims to measure, and the results closely correspond to real-world values, then it can be considered valid.
There are four main types of validity:
• Construct validity: Does the test measure the concept that it’s
intended to measure?
• Content validity: Is the test fully representative of what it aims to
measure?
• Face validity: Does the content of the test appear to be suitable to
its aims?
• Criterion validity: Do the results correspond to a different test of
the same thing?
Construct validity

• Construct validity evaluates whether a measurement tool really represents the thing we are interested in measuring. It’s central to establishing the overall validity of a method.
• It assesses how well a measure adheres to existing theory and knowledge of the concept being measured.
What is a construct?
• A construct refers to a concept or characteristic that can’t be directly
observed, but can be measured by observing other indicators that are
associated with it.
• Constructs can be characteristics of individuals, such as intelligence, obesity,
job satisfaction, or depression; they can also be broader concepts applied to
organizations or social groups, such as gender equality, corporate social
responsibility, or freedom of speech.
Construct validity (cont.)

Example
• There is no objective, observable entity called “depression” that we can
measure directly. But based on existing psychological research and theory,
we can measure depression based on a collection of symptoms and
indicators, such as low self-confidence and low energy levels.

• Source: https://www.scribbr.com/methodology/types-of-validity/
Construct validity (cont.)

• Construct validity is about ensuring that the method of measurement matches the
construct you want to measure. If you develop a questionnaire to diagnose
depression, you need to know: does the questionnaire really measure the
construct of depression? Or is it actually measuring the respondent’s mood, self-
esteem, or some other construct?
• A self-esteem questionnaire could be assessed by measuring other traits known or
assumed to be related to the concept of self-esteem (such as social skills and
optimism). Strong correlation between the scores for self-esteem and associated
traits would indicate high construct validity.
• To achieve construct validity, you have to ensure that your indicators and
measurements are carefully developed based on relevant existing knowledge. The
questionnaire must include only relevant questions that measure known indicators
of depression.
• The other types of validity described in the next slides can all be considered as
forms of evidence for construct validity.
Content validity

• Content validity assesses whether a test is representative of all aspects of the construct, i.e. the extent to which the measurement covers all aspects of the concept being measured.
• To produce valid results, the content of a test, survey or measurement method must cover all relevant parts of the subject it aims to measure. If some aspects are missing from the measurement (or if irrelevant aspects are included), the validity is threatened.
Example
• A mathematics teacher develops an end-of-semester algebra test for her class.
The test should cover every form of algebra that was taught in the class. If
some types of algebra are left out, then the results may not be an accurate
indication of students’ understanding of the subject. Similarly, if she includes
questions that are not related to algebra, the results are no longer a valid
measure of algebra knowledge.
Face validity

• Face validity considers how suitable the content of a test seems to be on the
surface. It’s similar to content validity, but face validity is a more informal
and subjective assessment.
• Example
• You create a survey to measure the regularity of people’s dietary habits. You
review the survey items, which ask questions about every meal of the day
and snacks eaten in between for every day of the week. On its surface, the
survey seems like a good representation of what you want to test, so you
consider it to have high face validity.
• As face validity is a subjective measure, it’s often considered the weakest
form of validity. However, it can be useful in the initial stages of developing a
method.
Criterion validity

• Criterion validity evaluates how closely the results of your test correspond to the results of a different test, i.e. the extent to which the result of a measure corresponds to other valid measures of the same concept.
What is a criterion?
• The criterion is an external measurement of the same thing. It is usually an
established or widely-used test that is already considered valid.
What is criterion validity?
• To evaluate criterion validity, you calculate the correlation between the
results of your measurement and the results of the criterion measurement. If
there is a high correlation, this gives a good indication that your test is
measuring what it intends to measure.
Criterion validity (cont.)

Example
1. A university professor creates a new test to measure applicants’ English
writing ability. To assess how well the test really does measure students’
writing ability, she finds an existing test that is considered a valid measurement
of English writing ability, and compares the results when the same group of
students take both tests. If the outcomes are very similar, the new test has a
high criterion validity.
2. A survey is conducted to measure the political opinions of voters in a region.
If the results accurately predict the later outcome of an election in that region,
this indicates that the survey has high criterion validity.
Ensuring validity in your research

1. Choose appropriate methods of measurement by ensuring that your method and measurement technique are high quality and targeted to measure exactly what you want to know. They should be thoroughly researched and based on existing knowledge.
2. Use appropriate sampling methods to select your subjects
• To produce valid generalizable results, clearly define the population you are
researching (e.g. people from a specific age range, geographical location, or
profession). Ensure that you have enough participants and that they are
representative of the population.
RELIABILITY

• Reliability refers to how consistently a method measures something. If the same result can be consistently achieved by using the same methods under the same circumstances, the measurement is considered reliable.
• You measure the temperature of a liquid sample several times
under identical conditions. The thermometer displays the same
temperature every time, so the results are reliable.
• A doctor uses a symptom questionnaire to diagnose a patient with
a long-term medical condition. Several different doctors use the
same questionnaire with the same patient but give different
diagnoses. This indicates that the questionnaire has low reliability
as a measure of the condition.
RELIABILITY (CONT.)
• However, reliability on its own is not enough to ensure validity. Even if a test is
reliable, it may not accurately reflect the real situation.
• The thermometer that you used to test the sample gives reliable results.
However, the thermometer has not been calibrated properly, so the result is 2
degrees lower than the true value. Therefore, the measurement is not valid.
• A group of participants take a test designed to measure working memory. The
results are reliable, but participants’ scores correlate strongly with their level of
reading comprehension. This indicates that the method might have low
validity: the test may be measuring participants’ reading comprehension
instead of their working memory.
• Validity is harder to assess than reliability, but it is even more important. To
obtain useful results, the methods you use to collect your data must be valid:
the research must be measuring what it claims to measure. This ensures that
your discussion of the data and the conclusions you draw are also valid.
How are reliability and validity assessed?

• Reliability can be estimated by comparing different versions of the same measurement. Validity is harder to assess, but it can be estimated by comparing the results to other relevant data or theory. Methods of estimating reliability and validity are usually split up into different types.
Types of reliability
• Different types of reliability can be estimated through various statistical
methods.
Types of Reliability
1. Test-retest: Measures the consistency of a measure across time: do you get
the same results when you repeat the measurement?
E.g. A group of participants complete a questionnaire designed to measure
personality traits. If they repeat the questionnaire days, weeks or months
apart and give the same answers, this indicates high test-retest reliability.
2. Interrater: Measures the consistency of a measure across raters or observers: i.e. do you get the same results when different people conduct the same measurement?
E.g. Based on an assessment criteria checklist, five examiners submit
substantially different results for the same student project. This indicates that
the assessment checklist has low inter-rater reliability (for example, because
the criteria are too subjective).
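A common statistic for inter-rater reliability on categorical ratings is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch with two hypothetical examiners grading the same ten projects (the grades are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(rater_a)
    # Observed proportion of ratings on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[label] * cb[label] for label in set(ca) | set(cb)) / n**2
    return (observed - expected) / (1 - expected)

# Hypothetical grades two examiners give the same ten student projects.
rater_a = ["A", "B", "B", "C", "A", "B", "C", "A", "B", "C"]
rater_b = ["A", "B", "C", "C", "A", "B", "B", "A", "B", "C"]

print(round(cohens_kappa(rater_a, rater_b), 3))  # → 0.697
```

Kappa near 1 indicates strong agreement; values near 0 mean the examiners agree no more often than chance, i.e. low inter-rater reliability.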
Types of Reliability

3. Internal consistency: Measures the consistency of the measurement itself: do you get the same results from different parts of a test that are designed to measure the same thing?
E.g. You design a questionnaire to measure self-esteem. If you randomly split
the results into two halves, there should be a strong correlation between the
two sets of results. If the two results are very different, this indicates low
internal consistency.
Ensuring reliability

Reliability should be considered throughout the data collection process. When you use a tool or technique to collect data, it’s important that the results are precise, stable and reproducible.
1. Apply your methods consistently
• Plan your method carefully to make sure you carry out the same steps in the
same way for each measurement. This is especially important if multiple
researchers are involved.
2. Standardize the conditions of your research
• When you collect your data, keep the circumstances as consistent as
possible to reduce the influence of external factors that might create
variation in the results.
• For example, in an experimental setup, make sure all participants are given
the same information and tested under the same conditions.
Final Remarks
• There are statistical methods for testing the reliability and validity of research:
• Cronbach’s alpha coefficient for testing internal consistency reliability
• The HTMT criterion for testing discriminant validity
• Concurrent validity is usually assessed by calculating the correlation between a new test and an existing test, to demonstrate whether the new test correlates well with the established one (Murphy & Davidshofer, 1998).
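As a sketch, Cronbach’s alpha can be computed from the item variances and the variance of the total scores: α = (k/(k−1))·(1 − Σs²ᵢ/s²ₜ) for k items. The questionnaire responses below are hypothetical:

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for internal consistency.
    `items` is a list of columns: one list of respondent scores per item."""
    k = len(items)
    # Sum of the sample variances of the individual items.
    sum_item_var = sum(variance(item) for item in items)
    # Variance of each respondent's total score across all items.
    totals = [sum(answers) for answers in zip(*items)]
    total_var = variance(totals)
    return (k / (k - 1)) * (1 - sum_item_var / total_var)

# Hypothetical responses: three questionnaire items, six respondents each.
items = [
    [4, 2, 5, 3, 1, 4],
    [5, 2, 4, 3, 2, 4],
    [4, 3, 5, 2, 1, 5],
]
print(round(cronbach_alpha(items), 3))
```

Alpha values above roughly 0.7 are conventionally taken to indicate acceptable internal consistency, though the threshold depends on the field and purpose of the test.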
