Lecture 13 - Reliability & Validity, Results & Discussion and Wrap Up

MSc Computer Science with Emerging Technologies
Lecture 13
Reliability & Validity

Results, Discussion and Conclusion
Research Methodologies
by Dr Vinaye Armoogum
Professor
VA 1
Learning Topics
This lecture will address the following:
1. Importance of Reliability and Validity in research
2. Contents in Result analysis
3. Importance of Discussion before completing a research work
VA 2
Research Process
I. Define the research problem
II. Review the literature
– Review concepts and theories
– Review previous research findings
III. Formulate hypotheses
IV. Design research (including sample design)
V. Build design (e.g. model) and collect data (execution)
VI. Analyse data (test hypotheses)
VII. Interpret and report
VA 3
Reliability and Validity
VA 4
Measurement
• MEASUREMENT is any process by which a value is assigned to the
level or state of some quality of an object of study.
• Measurement involves the expression of information in quantities
(numbers) rather than by verbal statement
• It provides a powerful means of reducing qualitative data to a more
condensed form for summarization, manipulation, and analysis
VA 5
Measurement
The best measure should be both reliable and valid

Reliability and validity
• Reliability and validity are concepts used to evaluate the
quality of research.
• They indicate how well a method, technique or test measures
something.
• Reliability is about the consistency of a measure, and validity
is about the accuracy of a measure.
VA 6
What is reliability?
• We often speak about "reliable cars."
• On the news, people talk about a "usually reliable source."
• In both cases, the word reliable usually means "dependable" or "trustworthy."
• In research, the term "reliable" also means dependable in a general sense, but
that's not a precise enough definition
• Reliability refers to how consistently a method measures something.

• If the same result can be consistently achieved by using the same methods
under the same circumstances and conditions, the measurement is considered
reliable.
VA 7
Reliability - Examples
• You measure the temperature of a liquid sample several times under
identical conditions. The thermometer displays the same temperature
every time, so the results are reliable.
• A doctor uses a symptom questionnaire to diagnose a patient with a
long-term medical condition. Several different doctors use the same
questionnaire with the same patient but give different diagnoses. This
indicates that the questionnaire has low reliability as a measure of the
condition.
• Data collected when testing an AI algorithm in an experiment is
considered reliable if the same score is obtained when repeating the
same test under the same conditions.
VA 8
Reliability of measuring devices
• The slightest variations in measuring devices in Olympic track
and field events (whether it is a tape or clock) could mean the
difference between the gold and silver medals
• Olympic measuring devices, then, must be reliable from one
throw or race to another and from one competition to another
• They must also be reliable when used in different parts of the
world, as temperature, air pressure, humidity, interpretation,
or other variables might affect their readings
VA 9
Types of Reliability
There are three ways that reliability is usually estimated:

1. Test-retest
2. Internal consistency
3. Inter-rater
VA 10
Test-Retest Reliability
• We estimate test-retest reliability when we administer the same test to the
same sample on two different occasions
• The idea behind test/retest is that you should get the same score on test 1
as you do on test 2.
• The three main components to this method are as follows:
1) implement your measurement instrument at two separate times for
each subject;
2) compute the correlation between the two separate measurements
3) assume there is no change in the underlying condition between test 1
and test 2
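The correlation step in the method above can be sketched in Python. This is only an illustration: the scores are made-up values for five hypothetical subjects, and the helper name pearson_r is an assumption, not something from the lecture. It computes the Pearson correlation between the two sets of scores; a value close to 1 suggests high test-retest reliability.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each occasion
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five subjects on two occasions
test_1 = [12, 15, 9, 20, 17]
test_2 = [13, 14, 10, 19, 18]

r = pearson_r(test_1, test_2)
print(f"test-retest correlation: {r:.3f}")
```

The result is only meaningful under the assumption stated in point 3: if the underlying condition changed between the two occasions, a low correlation does not necessarily mean the instrument is unreliable.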
VA 11
Internal Consistency
• Internal consistency estimates reliability by grouping questions in a
questionnaire that measure the same concept
• After collecting the responses, run a correlation between groups of questions to
determine if your instrument is reliably measuring that concept.
How to test Correlation?

One solution is to use SPSS or R programming
• Generate Cronbach’s alpha, a coefficient of reliability that measures internal
consistency, that is, how closely related a set of items are as a group.
• The closer it is to 1, the higher the reliability estimate of your instrument/model.
• Generally, a value above 0.7 is considered to indicate high reliability.
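For readers without SPSS or R, Cronbach's alpha can also be computed by hand. The sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name cronbach_alpha and the response data are hypothetical and chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    k = len(responses[0])  # number of items in the questionnaire
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Transpose rows into columns so that each tuple holds one item's scores
    items = list(zip(*responses))
    item_variances = [variance(item) for item in items]
    # Variance of each respondent's total score across all items
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers from five respondents to three items on a 1-5 scale
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

With these made-up responses the items move together closely, so alpha comes out well above the 0.7 guideline.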
VA 12
Types of Reliability - Summary
Type of reliability: Test-retest
What does it assess? The consistency of a measure across time: do you get the same results when you repeat the measurement?
Example: A group of participants complete a questionnaire designed to measure personality traits. If they repeat the questionnaire days, weeks or months apart and give the same answers, this indicates high test-retest reliability.

Type of reliability: Inter-rater
What does it assess? The consistency of a measure across raters or observers: do you get the same results when different people conduct the same measurement?
Example: Based on an assessment criteria checklist, five examiners submit substantially different results for the same student project. This indicates that the assessment checklist has low inter-rater reliability (for example, because the criteria are too subjective).

Type of reliability: Internal consistency
What does it assess? The consistency of the measurement itself: do you get the same results from different parts of a test that are designed to measure the same thing?
Example: You design a questionnaire to measure self-esteem. If you randomly split the results into two halves, there should be a strong correlation between the two sets of results. If the two results are very different, this indicates low internal consistency.
VA 13
Validity
• Validity involves the degree to which you are measuring what you are
supposed to. More simply, it is the accuracy of your measurement.
• Validity refers to how accurately a method measures what it is intended
to measure.
• If research has high validity, that means it produces results that
correspond to real properties, characteristics, and variations in the
physical or social world.
• High reliability is one indicator that a measurement is valid. If a method
is not reliable, it probably isn’t valid.
VA 14
Validity – Covid-19 Examples
• If the thermometer shows different temperatures each time, even though you have
carefully controlled conditions to ensure the sample’s temperature stays the same,
the thermometer is probably malfunctioning, and therefore its measurements are
not valid.
• If a symptom questionnaire results in a reliable diagnosis when answered at
different times and with different doctors, this indicates that it has high validity as
a measurement of the medical condition.
• However, reliability on its own is not enough to ensure validity. Even if a test is
reliable, it may not accurately reflect the real situation.
• The thermometer that you used to test the sample gives reliable results. However,
the thermometer has not been calibrated properly, so the result is 2 degrees lower
than the true value. Therefore, the measurement is not valid.
VA 15
Validity – Another Example
• A group of participants take a test designed to measure working memory. The
results are reliable, but participants’ scores correlate strongly with their level of
reading comprehension. This indicates that the method might have low validity:
the test may be measuring participants’ reading comprehension instead of their
working memory.
Validity is harder to assess than reliability, but it is even more important.

• To obtain useful results, the methods which researchers use to collect data must be
valid: the research must be measuring what it claims to measure.
• This ensures that the discussion of the data and the conclusions which the
researchers draw are also valid.
VA 16
Five types of validity
• Construct validity
• Content validity
• Criterion validity
• Internal validity
• External validity
VA 17
Types of Validity - Summary
Type of validity: Construct
What does it assess? The adherence of a measure to existing theory and knowledge of the concept being measured.
Example: A self-esteem questionnaire could be assessed by measuring other traits known or assumed to be related to the concept of self-esteem (such as social skills and optimism). Strong correlation between the scores for self-esteem and associated traits would indicate high construct validity.

Type of validity: Content
What does it assess? The extent to which the measurement covers all aspects of the concept being measured.
Example: A test that aims to measure a class of students’ level of Communication Skills contains reading, writing and speaking components, but no listening component. Experts agree that listening comprehension is an essential aspect of language ability, so the test lacks content validity for measuring the overall level of ability for this module.

Type of validity: Criterion
What does it assess? The extent to which the result of a measure corresponds to other valid measures of the same concept.
Example: A survey is conducted to measure the political opinions of voters in a region of Mauritius. If the results accurately predict the later outcome of an election in that region, this indicates that the survey has high criterion validity.
VA 18
Understanding internal validity
• Internal validity is the extent to which you can be confident that a
cause-and-effect relationship established in a study cannot be
explained by other factors.
• In other words, can you reasonably draw a causal link between
your treatment and the response in an experiment?
VA 19
Why does internal validity matter?
• Internal validity makes the conclusions of a causal relationship credible
and trustworthy.
• Without high internal validity, an experiment cannot demonstrate a causal
link between two variables.
Research example
• You want to test the hypothesis that drinking a cup of coffee improves
memory. You schedule an equal number of college-aged participants from
all secondary schools in Port-Louis Region for morning and evening
sessions at the laboratory. For convenience, you assign all morning session
participants to the treatment group and all evening session participants to
the control group.
VA 20
Why does internal validity matter?
Research example (contd)
• Once they arrive at the laboratory, the treatment group participants are
given a cup of coffee to drink, while control group participants are given
water. You also give both groups memory tests. After analyzing the
results, you find that the treatment group performed better than the
control group on the memory test.
Question: Can you conclude that drinking a cup of coffee improves
memory performance?
• For your conclusion to be valid, you need to be able to rule out other
explanations for the results.
VA 21
How to check whether your study has internal validity?
There are three necessary conditions for internal validity.
All three conditions must occur to experimentally establish causality between an independent
variable A (your treatment variable) and dependent variable B (your response variable).
1. Your treatment and response variables change together.

2. Your treatment precedes changes in your response variables
3. No confounding or extraneous factors can explain the results of your study.
In the research example above, only two out of the three conditions have been met.
✓ Drinking coffee and memory performance increased together.
✓ Drinking coffee happened before the memory test.
X The time of day of the sessions is an extraneous factor that can equally explain the
results of the study.
VA 22
How to check whether your study has internal validity?
• Because you assigned participants to groups based on the schedule, the
groups were different at the start of the study.
• Any differences in memory performance may be due to a difference in the
time of day.
• Therefore, you cannot say for certain whether the time of day or drinking
a cup of coffee improved memory performance.
That means your study has low internal validity, and you cannot deduce a
causal relationship between drinking coffee and memory performance.
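A small simulation can make the confounding problem concrete. The sketch below rests on assumptions that are not part of the lecture: coffee has no effect at all, morning sessions add 5 points to the memory score, and noise is Gaussian. The function name run_study and all numbers are hypothetical. It compares assignment by session (as in the example) with random assignment, used here only as a contrast.

```python
import random
from statistics import mean


def run_study(assign_by_session, n=4000, seed=0):
    """Return mean(treatment) - mean(control) on a simulated memory test.

    Assumed model: coffee has NO effect, a morning session adds 5 points,
    and individual noise is Gaussian with standard deviation 5.
    """
    rng = random.Random(seed)
    treatment, control = [], []
    for i in range(n):
        morning = i % 2 == 0  # half of the participants attend in the morning
        if assign_by_session:
            gets_coffee = morning  # treatment is tied to the time of day
        else:
            gets_coffee = rng.random() < 0.5  # treatment is assigned at random
        score = 50 + (5 if morning else 0) + rng.gauss(0, 5)
        (treatment if gets_coffee else control).append(score)
    return mean(treatment) - mean(control)


confounded = run_study(assign_by_session=True)
randomized = run_study(assign_by_session=False)
print(f"difference when assigned by session: {confounded:.2f}")
print(f"difference when assigned at random:  {randomized:.2f}")
```

Under these assumptions, assignment by session shows a clear "coffee effect" that is in fact produced entirely by the time of day, while random assignment shows a difference close to zero.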
VA 23
External validity
• External validity is the extent to which you can generalize the findings of a study
to other measures, settings or groups. In other words, can you apply the findings
of your study to a broader context?
• There is an inherent trade-off between internal and external validity; the more
you control extraneous factors in your study, the less you can generalize your
findings to a broader context.
Research example: Consider the previous study regarding coffee and memory
• The external validity depends on the selection of the memory test, the
participant inclusion criteria, and the laboratory setting.
• For example, restricting your participants to college-aged people enhances
internal validity at the expense of external validity – the findings of the study
may only be generalizable to college-aged populations.
VA 24
• We often think of reliability and validity as separate ideas but, in fact, they are
related to each other.
Example:
• Think of the center of the target as the concept that you are trying to measure
• Imagine that for each person you are measuring, you are taking a shot at the target.
If you measure the concept perfectly for a person, you are hitting the center of the
target
• If you don't, you are missing the center. The more you are off for that person, the
further you are from the center.
VA 25
The diagram above shows four possible situations
Figure 1
• You are hitting the target consistently, but you are missing the center of the target. That is,
you are consistently and systematically measuring the wrong value for all respondents
• This measure is reliable, but not valid (that is, it is consistent but wrong).
Figure 2
• It shows hits that are randomly spread across the target
• You seldom hit the center of the target but, on average, you are getting the right answer for
the group (but not very well for individuals)
• In this case, you get a valid group estimate, but you are inconsistent
• Here, you can clearly see that reliability is directly related to the variability of your measure
VA 26
Figure 3
• The third scenario shows a case where your hits are spread across the target
and you are consistently missing the center
• Your measure in this case is neither reliable nor valid
Figure 4
• Finally, we see the "Robin Hood" scenario -- you consistently hit the center of
the target
• Your measure is both reliable and valid
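The four target scenarios can be expressed numerically. In the sketch below, the distance of the average measurement from the true value stands in for validity, and the spread of the measurements stands in for reliability. The true value, the two tolerance limits and the measurement lists are all hypothetical, chosen only to reproduce the four figures.

```python
from statistics import mean, pstdev

TRUE_VALUE = 100.0   # the "center of the target" (hypothetical)
BIAS_LIMIT = 2.0     # assumed tolerance for calling a measure valid
SPREAD_LIMIT = 2.0   # assumed tolerance for calling a measure reliable


def describe(measurements):
    """Return (reliable, valid) for a list of repeated measurements."""
    bias = mean(measurements) - TRUE_VALUE  # systematic error
    spread = pstdev(measurements)           # variability of the hits
    return spread <= SPREAD_LIMIT, abs(bias) <= BIAS_LIMIT


# Made-up measurements mirroring the four figures
scenarios = {
    "Figure 1": [109, 110, 111, 110, 110],  # tight cluster, off center
    "Figure 2": [90, 110, 95, 105, 100],    # scattered around the center
    "Figure 3": [105, 125, 110, 120, 115],  # scattered and off center
    "Figure 4": [99, 100, 101, 100, 100],   # tight cluster on the center
}

results = {name: describe(hits) for name, hits in scenarios.items()}
for name, (reliable, valid) in results.items():
    print(f"{name}: reliable={reliable}, valid={valid}")
```

Note that Figure 2 counts as valid only at group level: the average is right, but individual measurements are far from the center.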
VA 27
Result Analysis & Discussion
and Conclusion
VA 28
The Results: Organisation and Content
• In general, the pure, unbiased results should be presented
first without interpretation.
• Present the raw data and/or the results obtained after applying
the techniques outlined in the methods section.
• This section tells the reader about the outcomes/findings of the research study.
The results are simply results; they do not draw conclusions
and should be communicated objectively.
VA 29
The Results: Organisation and Content
Purpose
• To provide the data from your study so that other
researchers can draw their own conclusions and understand
fully the basis for your conclusions.
Common Structure:
1. Present a series of figures and tables and clearly describe
them in detail through efficient text.
2. The figures (charts) should support the assertions or
illustrate the new insights
VA 30
The Results: Presentation of Data
• Tables and figures (charts, photographs, drawings, graphs and
flow diagrams) are often used to present details whereas the
narrative section of result tends to be used to present the
general findings.
• Numerical data can usually be presented more effectively in
tables or graphs than in the text.
• The order of presentation of the result should be either
chronological to correspond with the methods or from the
most to the least important
VA 31
Statistical measures for Analysis
Measures of Central Tendency
1. Mean
2. Mode
3. Median
Measures of Dispersion
1. Standard Deviation
2. Variance
Measures of Relationship
1. Correlation
2. Regression analysis
3. t-test & z-test
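As a minimal sketch, the measures of central tendency and dispersion listed above can be computed with Python's standard statistics module. The scores are made-up values used only for illustration, and the sample (n - 1) versions of variance and standard deviation are assumed.

```python
from statistics import mean, median, mode, stdev, variance

# Hypothetical test scores
scores = [2, 4, 4, 5, 7, 8]

summary = {
    "mean": mean(scores),          # arithmetic average
    "median": median(scores),      # middle value of the sorted data
    "mode": mode(scores),          # most frequent value
    "variance": variance(scores),  # sample variance (divides by n - 1)
    "stdev": stdev(scores),        # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

When reporting dispersion in a results section, it helps to state whether the sample or the population formula was used, since the two give different values for small samples.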
VA 32
Statistical measures for
Analysis
• Analysis of Validity and Reliability
• Compute Correlation Coefficients
• Use appropriate hypothesis testing methods
The Discussion: The Author’s Opinion
• You interpret the results (previous section) to reach the major conclusions of the research
work.
• This is the main impact section, where you have the most freedom to assemble the
implications of the research.
• The function of the discussion is to:
➢ Interpret results in light of what was already known about the subject and
➢ Explain new understanding of the problem after taking results into consideration.
Some examples:
1. Comparison between measured and modeled data
2. Comparison among various modeling methods
3. Application of the results obtained to solve a specific engineering or scientific problem
VA 34
The Discussion: The Author’s Opinion
Tips: Two common questions to guide your opinions
1. How do the results compare with earlier work?

2. What is new and significant?
Warning!!:
• NEVER make an assertion of which you are not 100% sure
• DO NOT open the door for a negative review or the eventual rejection of
your research study.
Experts and examiners will draw their own conclusions anyhow.
VA 35
The Results and Discussion: As One Section
Some researchers combine the discussion (OPINION) and results (FACTS)
sections so that they can avoid repetition and give their conclusions
in parallel with the results.
Acceptable:
– Provided they make a clear distinction between facts and opinion
Recommendation:
– For most IT / Engineering research work: keep the two sections separate.
VA 36
Conclusion
• The final section of the study does not introduce any new
information or insights: it merely summarizes and concludes.
• This section is longer than the abstract and generally includes more
specific conclusions.
• It is often more quantitative than the abstract. However, listing
equations or citations should not be necessary (McNown 1996a).
VA 37
Conclusion Section/Chapter
• A good format for this section is to write it in two paragraphs.
– The first paragraph summarizes various sections of the study.
– The second paragraph draws the important conclusions.
• The summary paragraph is different from that at the end of the
introduction section.
• Here, the summary paragraph draws on the fact that the reader knows
all of the new results presented in the study.
• It then summarizes what the important results were.
VA 38
The conclusion paragraph identifies the significant conclusions. McNown (1996)
suggests two possible formats for this second paragraph:
1. Organize based on logical flow for points that are interconnected
2. Organize based on merit, where the most important items appear first.
• It is important to remember that this paragraph should not present new information.
• It may combine parts of the study to underscore an important conclusion, but it
cannot present information that could not be gleaned from the other sections.
– For Research Articles, a third, optional, paragraph may identify future research
directions that flow naturally from the study work.
– For Theses, the chapter generally consists of three parts: The Summary, The
Critical Appraisal & Limitation and Future work.
VA 39
The guiding principle for the summary and conclusions chapter/section may be
formulated as follows:
• The summary and conclusions section tells the reader what has already been
read and draws the important conclusions—keep it short and make it as
specific as possible
• If the reader wants to know specifically what aspects of a problem your work
will address, s/he will often read the introduction and then the summary and
conclusions section.
• Hence, it is important that all of the significant findings are summarized and
united in the significant conclusions.
VA 40
Summary
We have considered
• Examples on Validity and Reliability
• Result Analysis – charts, tables, testing
• Discussion
• Conclusion
VA 41
