
MSc Computer Science with Emerging Technologies

Lecture 13

Reliability & Validity


Results, Discussion and Conclusion

Research Methodologies
by Dr Vinaye Armoogum
Professor

VA 1
Learning Topics
This lecture will address the following:

1. Importance of Reliability and Validity in research

2. Contents of the results analysis

3. Importance of Discussion before completing a research work

Research Process
I. Define research problem
II. Review the literature
– Review concepts and theories
– Review previous research findings
III. Formulate hypotheses
IV. Design research (including sample design)
V. Build design (e.g. model) and collect data (execution)
VI. Analyse data (test hypotheses)
VII. Interpret and report

Reliability and Validity

Measurement
• MEASUREMENT is any process by which a value is assigned to the
level or state of some quality of an object of study.
• Measurement involves the expression of information in quantities
(numbers) rather than by verbal statement.
• It provides a powerful means of reducing qualitative data to a more
condensed form for summarization, manipulation, and analysis

Measurement

The best measure should be both reliable and valid


Reliability and validity
• Reliability and validity are concepts used to evaluate the
quality of research.
• They indicate how well a method, technique or test measures
something.
• Reliability is about the consistency of a measure, and validity
is about the accuracy of a measure.

What is reliability?
• We often speak about "reliable cars."
• On the news, people talk about a "usually reliable source."
• In both cases, the word reliable usually means "dependable" or "trustworthy."
• In research, the term "reliable" also means dependable in a general sense, but
that's not a precise enough definition.

• Reliability refers to how consistently a method measures something.


• If the same result can be consistently achieved by using the same methods
under the same circumstances and conditions, the measurement is considered
reliable.

Reliability - Examples
• You measure the temperature of a liquid sample several times under
identical conditions. The thermometer displays the same temperature
every time, so the results are reliable.

• A doctor uses a symptom questionnaire to diagnose a patient with a
long-term medical condition. Several different doctors use the same
questionnaire with the same patient but give different diagnoses. This
indicates that the questionnaire has low reliability as a measure of the
condition.

• Data collected when testing an AI algorithm in an experiment are
considered reliable if the same score is obtained when repeating the
same test under the same conditions.

Reliability of measuring devices
• The slightest variations in measuring devices in Olympic track
and field events (whether it is a tape or clock) could mean the
difference between the gold and silver medals
• Olympic measuring devices, then, must be reliable from one
throw or race to another and from one competition to another
• They must also be reliable when used in different parts of the
world, as temperature, air pressure, humidity, interpretation,
or other variables might affect their readings

Types of Reliability

There are three ways that reliability is usually estimated:


1. Test-retest
2. Internal consistency
3. Interrater

Test-Retest Reliability
• We estimate test-retest reliability when we administer the same test to the
same sample on two different occasions
• The idea behind test/retest is that you should get the same score on test 1
as you do on test 2.
• The three main components to this method are as follows:
1) implement your measurement instrument at two separate times for
each subject;
2) compute the correlation between the two separate measurements
3) assume there is no change in the underlying condition between test 1
and test 2
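
The three components above can be sketched in Python as follows. This is only an illustration: the scores are made-up values, `pearson_r` is a hypothetical helper name, and the sketch assumes the underlying condition did not change between the two occasions.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores for the same five subjects on two occasions.
test_1 = [12, 15, 9, 20, 17]
test_2 = [13, 14, 10, 19, 18]

r = pearson_r(test_1, test_2)
print(f"test-retest correlation: {r:.3f}")
```

A correlation close to 1 suggests that subjects keep roughly the same relative scores from test 1 to test 2, which is what test-retest reliability asks for.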

Internal Consistency
• Internal consistency estimates reliability by grouping questions in a
questionnaire that measure the same concept
• After collecting the responses, run a correlation between groups of questions to
determine if your instrument is reliably measuring that concept.

How to test correlation?

One solution is to use SPSS or R:
• Generate Cronbach’s alpha, a coefficient of reliability that measures internal
consistency, that is, how closely related a set of items are as a group.
• The closer it is to 1, the higher the reliability estimate of your instrument/model.
• Generally, a value above 0.7 is considered to indicate high reliability.
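
For readers without SPSS or R, the same coefficient can be sketched in plain Python. The sketch assumes the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical, chosen only for illustration.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of questionnaire items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*responses)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical answers of five respondents to three items on the same concept.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 4, 3],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

The sketch does not guard against a total-score variance of zero (all respondents with identical totals), which would make the ratio undefined.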

Types of Reliability - Summary
Type of reliability: Test-retest
What does it assess? The consistency of a measure across time: do you get the
same results when you repeat the measurement?
Example: A group of participants complete a questionnaire designed to measure
personality traits. If they repeat the questionnaire days, weeks or months apart
and give the same answers, this indicates high test-retest reliability.

Type of reliability: Interrater
What does it assess? The consistency of a measure across raters or observers:
do you get the same results when different people conduct the same measurement?
Example: Based on an assessment criteria checklist, five examiners submit
substantially different results for the same student project. This indicates that
the assessment checklist has low inter-rater reliability (for example, because the
criteria are too subjective).

Type of reliability: Internal consistency
What does it assess? The consistency of the measurement itself: do you get the
same results from different parts of a test that are designed to measure the same
thing?
Example: You design a questionnaire to measure self-esteem. If you randomly split
the results into two halves, there should be a strong correlation between the two
sets of results. If the two results are very different, this indicates low internal
consistency.

Validity
• Validity involves the degree to which you are measuring what you are
supposed to. More simply: the accuracy of your measurement.
• Validity refers to how accurately a method measures what it is intended
to measure.
• If research has high validity, that means it produces results that
correspond to real properties, characteristics, and variations in the
physical or social world.
• High reliability is one indicator that a measurement is valid. If a method
is not reliable, it probably isn’t valid.

Validity – Covid-19 Examples
• If the thermometer shows different temperatures each time, even though you have
carefully controlled conditions to ensure the sample’s temperature stays the same,
the thermometer is probably malfunctioning, and therefore its measurements are
not valid.
• If a symptom questionnaire results in a reliable diagnosis when answered at
different times and with different doctors, this indicates that it has high validity as
a measurement of the medical condition.
• However, reliability on its own is not enough to ensure validity. Even if a test is
reliable, it may not accurately reflect the real situation.
• The thermometer that you used to test the sample gives reliable results. However,
the thermometer has not been calibrated properly, so the result is 2 degrees lower
than the true value. Therefore, the measurement is not valid.

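The miscalibrated thermometer can be illustrated with a few lines of Python. The readings and the reference temperature are hypothetical values; the sketch treats the spread of repeated readings as a rough indicator of reliability and the offset from the true value as a rough indicator of validity.

```python
from statistics import mean, stdev

true_temperature = 25.0                      # assumed known reference value
readings = [23.0, 23.1, 22.9, 23.0, 23.0]    # hypothetical repeated readings

spread = stdev(readings)                   # small spread: consistent readings
bias = mean(readings) - true_temperature   # systematic offset from the truth

print(f"spread = {spread:.2f} degrees, bias = {bias:.2f} degrees")
```

Here the readings barely vary, so the instrument is consistent, yet they sit about 2 degrees below the reference value, so the measurement is reliable but not valid.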
Validity – Another Example
• A group of participants take a test designed to measure working memory. The
results are reliable, but participants’ scores correlate strongly with their level of
reading comprehension. This indicates that the method might have low validity:
the test may be measuring participants’ reading comprehension instead of their
working memory.

Validity is harder to assess than reliability, but it is even more important.


• To obtain useful results, the methods which researchers use to collect data must be
valid: the research must be measuring what it claims to measure.
• This ensures that the discussion of the data and the conclusions which the
researchers draw are also valid.

Five types of validity
• Construct validity
• Content validity
• Criterion validity
• Internal validity
• External validity

Types of Validity - Summary

Type of validity: Construct
What does it assess? The adherence of a measure to existing theory and knowledge
of the concept being measured.
Example: A self-esteem questionnaire could be assessed by measuring other traits
known or assumed to be related to the concept of self-esteem (such as social skills
and optimism). Strong correlation between the scores for self-esteem and associated
traits would indicate high construct validity.

Type of validity: Content
What does it assess? The extent to which the measurement covers all aspects of the
concept being measured.
Example: A test that aims to measure a class of students’ level of Communication
Skills contains reading, writing and speaking components, but no listening
component. Experts agree that listening comprehension is an essential aspect of
language ability, so the test lacks content validity for measuring the overall level
of ability for this module.

Type of validity: Criterion
What does it assess? The extent to which the result of a measure corresponds to
other valid measures of the same concept.
Example: A survey is conducted to measure the political opinions of voters in a
region of Mauritius. If the results accurately predict the later outcome of an
election in that region, this indicates that the survey has high criterion validity.

Understanding internal validity
• Internal validity is the extent to which you can be confident that a
cause-and-effect relationship established in a study cannot be
explained by other factors.

• In other words, can you reasonably draw a causal link between
your treatment and the response in an experiment?

Why internal validity matters
• Internal validity makes the conclusions of a causal relationship credible
and trustworthy.
• Without high internal validity, an experiment cannot demonstrate a causal
link between two variables.

Research example
• You want to test the hypothesis that drinking a cup of coffee improves
memory. You schedule an equal number of college-aged participants from
all secondary schools in Port-Louis Region for morning and evening
sessions at the laboratory. For convenience, you assign all morning session
participants to the treatment group and all evening session participants to
the control group.

Why internal validity matters
Research example (contd)
• Once they arrive at the laboratory, the treatment group participants are
given a cup of coffee to drink, while control group participants are given
water. You also give both groups memory tests. After analyzing the
results, you find that the treatment group performed better than the
control group on the memory test.

Question: Can you conclude that drinking a cup of coffee improves
memory performance?

• For your conclusion to be valid, you need to be able to rule out other
explanations for the results.

How to check whether your study has internal validity?
There are three necessary conditions for internal validity.
All three conditions must occur to experimentally establish causality between an independent
variable A (your treatment variable) and dependent variable B (your response variable).

1. Your treatment and response variables change together.
2. Your treatment precedes changes in your response variables.
3. No confounding or extraneous factors can explain the results of your study.

In the research example above, only two out of the three conditions have been met.
✓ Drinking coffee and memory performance increased together.
✓ Drinking coffee happened before the memory test.
X The time of day of the sessions is an extraneous factor that can equally explain the
results of the study.

How to check whether your study has internal validity?
• Because you assigned participants to groups based on the schedule, the
groups were different at the start of the study.
• Any differences in memory performance may be due to a difference in the
time of day.
• Therefore, you cannot say for certain whether the time of day or drinking
a cup of coffee improved memory performance.

That means your study has low internal validity, and you cannot deduce a
causal relationship between drinking coffee and memory performance.
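
The problem with the schedule-based assignment can be made visible with a simple cross-tabulation. The sketch below uses hypothetical participant records and a hypothetical helper, `fully_confounded`, which only checks whether every session maps to exactly one group; it is an illustration of the reasoning above, not a complete validity check.

```python
from collections import Counter

# Hypothetical assignment records: one (session, group) pair per participant.
participants = (
    [("morning", "treatment")] * 10
    + [("evening", "control")] * 10
)

counts = Counter(participants)

def fully_confounded(counts):
    """True if each session contains participants from only one group."""
    groups_per_session = {}
    for (session, group), n in counts.items():
        if n > 0:
            groups_per_session.setdefault(session, set()).add(group)
    return all(len(groups) == 1 for groups in groups_per_session.values())

for (session, group), n in sorted(counts.items()):
    print(f"{session:8s} {group:10s} {n}")
print("time of day confounded with treatment:", fully_confounded(counts))
```

When the check returns True, any difference between the groups can be attributed equally well to the session time or to the treatment, which is the third condition failing.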

External validity
• External validity is the extent to which you can generalize the findings of a study
to other measures, settings or groups. In other words, can you apply the findings
of your study to a broader context?

• There is an inherent trade-off between internal and external validity; the more
you control extraneous factors in your study, the less you can generalize your
findings to a broader context.
Research example: Consider the previous study regarding coffee and memory
• The external validity depends on the selection of the memory test, the
participant inclusion criteria, and the laboratory setting.
• For example, restricting your participants to college-aged people enhances
internal validity at the expense of external validity – the findings of the study
may only be generalizable to college-aged populations.

Reliability & Validity
• We often think of reliability and validity as separate ideas but, in fact, they are
related to each other.
Example:
• Think of the center of the target as the concept that you are trying to measure
• Imagine that for each person you are measuring, you are taking a shot at the target.
If you measure the concept perfectly for a person, you are hitting the center of the
target
• If you don't, you are missing the center. The more you are off for that person, the
further you are from the center.

Reliability & Validity
The target diagram shows four possible situations.
Figure 1
• You are hitting the target consistently, but you are missing the center of the target. That is,
you are consistently and systematically measuring the wrong value for all respondents.
• This measure is reliable, but not valid (that is, it is consistent but wrong).
Figure 2
• It shows hits that are randomly spread across the target
• You seldom hit the center of the target but, on average, you are getting the right answer for
the group (but not very well for individuals)
• In this case, you get a valid group estimate, but you are inconsistent
• Here, you can clearly see that reliability is directly related to the variability of your measure

Reliability & Validity
Figure 3
• The third scenario shows a case where your hits are spread across the target
and you are consistently missing the center
• Your measure in this case is neither reliable nor valid

Figure 4
• Finally, we see the "Robin Hood" scenario -- you consistently hit the center of
the target
• Your measure is both reliable and valid
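
One way to make the target analogy concrete, offered here as a sketch under stated assumptions rather than as part of the lecture: suppose x_1, ..., x_n are repeated measurements of a single true value tau, and x-bar is their mean. The average squared distance from the center then splits exactly into a systematic part and a spread part:

```latex
\frac{1}{n}\sum_{i=1}^{n}(x_i-\tau)^2
  = \underbrace{(\bar{x}-\tau)^2}_{\text{systematic error (validity)}}
  + \underbrace{\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^2}_{\text{spread (reliability)}}
```

In Figure 1 the first term is large and the second small; in Figure 2 the first is near zero and the second large; in Figure 3 both are large; in Figure 4 both are small.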

Result Analysis & Discussion and Conclusion

The Results: Organisation and Content
• In general, the pure, unbiased results should be presented
first, without interpretation.
• Present the raw data and/or the results obtained after applying
the techniques outlined in the methods section.
• This section reports the outcomes/findings of the research study.

The results are simply results; they do not draw conclusions
and should be communicated objectively.

The Results: Organisation and Content
Purpose
• To provide the data from your study so that other
researchers can draw their own conclusions and understand
fully the basis for your conclusions.

Common Structure:
1. Present a series of figures and tables and describe them
clearly and in detail through concise text.
2. The figures (charts) should support the assertions or
illustrate the new insights.

The Results: Presentation of Data
• Tables and figures (charts, photographs, drawings, graphs and
flow diagrams) are often used to present details, whereas the
narrative section of the results tends to be used to present the
general findings.
• Numerical data can usually be presented more effectively in
tables or graphs than in the text.
• The order of presentation of the results should be either
chronological, to correspond with the methods, or from the
most to the least important.

Statistical measures for Analysis

Measures of Central Tendency
1. Mean
2. Mode
3. Median

Measures of Dispersion      Measures of Relationship
1. Standard Deviation       1. Correlation
2. Variance                 2. Regression analysis
3. T test & Z test

Statistical measures for Analysis
• Analysis of Validity and Reliability
• Compute Correlation Coefficients
• Use appropriate hypothesis testing methods
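
As one example of a hypothesis testing method, the sketch below computes Welch's t statistic for the difference between two group means. The choice of Welch's test is an assumption made for illustration, since the appropriate test depends on the study design; the group scores and the function name `welch_t` are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for the difference between two sample means."""
    # Standard error of the difference, allowing unequal variances.
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error

# Hypothetical memory-test scores for two groups of five participants.
group_a = [14, 15, 13, 16, 17]
group_b = [11, 12, 10, 13, 14]

t = welch_t(group_a, group_b)
print(f"t statistic: {t:.2f}")
```

Turning the statistic into a p-value additionally requires the degrees of freedom and the t distribution, which a statistics package such as SPSS or R provides; that step is left out of this sketch.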

The Discussion: The Author’s Opinion
• You interpret the results (previous section) to reach the major conclusions of the research
work.
• This is the main impact section, where you have the most freedom to assemble the implications of
the research.
• The function of the discussion is to:
➢ Interpret results in light of what was already known about the subject, and
➢ Explain the new understanding of the problem after taking the results into consideration.

Some examples:
1. Comparison between measured and modeled data
2. Comparison among various modeling methods
3. Application of the results obtained to solve a specific engineering or scientific problem

The Discussion: The Author’s Opinion
Tips: two common questions to guide your opinions

1. How do the results compare with earlier work?
2. What is new and significant?

Warning!!:
• NEVER make an assertion of which you are not 100% sure
• DO NOT open the door for a negative review or the eventual rejection of
your research study.

Experts and examiners will draw their own conclusions anyhow.

The Results and Discussion: As One Section
Some researchers combine the results (FACTS) and discussion (OPINION)
sections so that they can avoid repetition and present their
conclusions in parallel with the results.

Acceptable:
– Provided a clear distinction is made between facts and opinion

Recommendation:
– For most IT / Engineering research work: keep the sections separate.

Conclusion
• The final section of the study does not introduce any new
information or insights: it merely summarizes and concludes.
• This section is longer than the abstract and generally includes more
specific conclusions.
• It is often more quantitative than the abstract; however, listing
equations or citations should not be necessary (McNown 1996a).

Conclusion Section/Chapter
• A good format for this section is to write it in two paragraphs.
– The first paragraph summarizes various sections of the study.
– The second paragraph draws the important conclusions.

• The summary paragraph is different from the one at the end of the
introduction section.
• Here, the summary paragraph draws on the fact that the reader knows
all of the new results presented in the study.
• It then summarizes what the important results were.

Conclusion Section/Chapter
The conclusion paragraph identifies the significant conclusions. McNown (1996)
suggests two possible formats for this second paragraph:
1. Organize based on logical flow for points that are interconnected
2. Organize based on merit, where the most important items appear first.

• It is important to remember that this paragraph should not present new information.
• It may combine parts of the study to underscore an important conclusion, but it
cannot present information that could not be gleaned from the other sections.
– For Research Articles, a third, optional, paragraph may identify future research
directions that flow naturally from the study work.
– For Theses, the chapter generally consists of three parts: The Summary, The
Critical Appraisal & Limitation and Future work.

Conclusion Section/Chapter
The guiding principle for the summary and conclusions chapter/section may be
formulated as follows:
• The summary and conclusions section tells the reader what has already been
read and draws the important conclusions—keep it short and make it as
specific as possible
• If the reader wants to know specifically what aspects of a problem your work
will address, s/he will often read the introduction and then the summary and
conclusions section.
• Hence, it is important that all of the significant findings are summarized and
united in the significant conclusions.

Summary

We have considered
• Examples on Validity and Reliability
• Result Analysis – charts, tables, testing
• Discussion
• Conclusion
