
Chapter 4: Research Design and Measurements

Dr. Mohammed Shamim Uddin Khan


Professor and Ex-Chairman
Department of Finance
University of Chittagong

Learning Objectives
❑ To understand what research design is and why it is
significant
❑ To appreciate areas of ethical sensitivity in research design
❑ To learn how exploratory research design helps the
researcher gain a feel for the problem by providing
background information, suggesting hypotheses, and
prioritizing research objectives
❑ To know the fundamental questions addressed by
descriptive research and the different types of descriptive
research
❑ To explain what is meant by causal research and to
describe types of experimental research designs

Research
Design-Introduction
❑ Research design is a set of advance decisions that make up
the master plan specifying the methods and procedures for
collecting and analyzing the needed information.
❑ A research design is a plan for the proposed research work.
❑ Research design refers to the plan, structure, and strategy of
research--the blueprint that will guide the research process.
❑ Decisions regarding what, where, when, how, and by what means
the research study is conducted constitute the research design.
❑ It is a blueprint for the collection, measurement, and analysis
of data.

Research
Design-Introduction Cont.
❑ This is the most difficult and formidable task in the
research process. At the same time it is the most important
task because research design is the conceptual framework
within which research is conducted.
❑ The role of research design is to connect the question to
data.
❑ Design sits between the two, showing how the research
questions will be connected to the data, and the tools and
procedures to use in answering them.
❑ Research design must follow the questions and fit them
with data.
Research
Design-Introduction Cont.
According to C.W. Emory and D. R. Cooper, a research
design includes the following:
1. A plan for selecting the sources and types of information
relevant to the research questions.
2. A framework for specifying the relationships among
the study’s variables.
3. A blueprint for outlining all of the procedures from
the hypotheses to the analysis of data.

Importance of Research
Design
Good research design is the “first rule of good research.” Knowledge of
the needed research design allows advance planning so that the
project may be conducted in less time and typically at a cost savings
due to efficiencies gained in preplanning. The main benefits are
as follows:
1. Research design reduces inaccuracy.
2. It helps to get maximum efficiency and reliability.
3. It reduces bias and marginal errors.
4. It minimizes the wastage of time.
5. It is helpful for testing the hypothesis.
6. It gives an idea regarding the type of resources required in terms of
money, manpower, time and effort.
7. It provides an overview to other experts.
8. It guides the research in the right direction.

Essential Elements of
Research Design
1. Accurate purpose statement
2. Techniques to be implemented for collecting and analyzing
research data
3. The method applied for analyzing collected details
4. Type of research methodology
5. Probable objections for research
6. Settings for the research study
7. Timeline
8. Measurement of analysis

Research Design
Components
1. Sample design: It deals with the method of selecting
items to be observed for the given study.
2. Observational design: It relates to the conditions under
which the observations are to be made.
3. Statistical design: It concerns the question of how
many items are to be observed and how the information
and data gathered are to be analyzed.
4. Operational design: It deals with the techniques by
which the procedures specified in the sampling, statistical
and observational design can be carried out.

Good Research Design
A good research design often possesses qualities such as being flexible,
suitable, efficient, economical, and so on. Generally, a research design which minimizes
bias and maximizes the reliability of the data collected and analyzed is considered a
good design. A good research design tries to answer the following questions:
1. What is the study about?
2. Why is the study being made?
3. Where will the study be carried out?
4. What is the scope of the study?
5. What type of data is required?
6. Where can the required data be found?
7. What period of time will the study include?
8. What will be the approximate expenditure?
9. What will the sample design be?
10. What will be the methodology for research?
11. What techniques of data collection will be used?
12. How will the data be analyzed?
13. In what style will the report be prepared?

Criteria for Selecting
Research Design
A research design depends on the purpose and the nature of
the research problem. Thus no single research design
can be used to solve all types of research problems; a
particular design is suitable for a particular problem. The
type of research design to be chosen
depends primarily on four factors:
1. The nature of the problem
2. The objectives of the problem to be studied
3. The existing state of knowledge about the problem that is
being studied
4. The resources (time and money) available for the study

Characteristics of
Research Design
1. Neutrality: When you set up your study, you may have to
make assumptions about the data you expect to collect.
The results projected in the research design should be free
from bias and neutral. Gather opinions about the final
evaluated scores and conclusions from multiple individuals
and consider how far they agree with the derived results.
2. Reliability: With regularly conducted research, the
researcher involved expects similar results every time.
Your design should indicate how to form
research questions to ensure the standard of results. You’ll
only be able to reach the expected results if your design is
reliable.
Characteristics of Research
Design Cont.
3. Validity: There are multiple measuring tools available.
However, the only correct measuring tools are those
which help a researcher in gauging results according to the
objective of the research. The questionnaire developed
from this design will then be valid.
4. Generalization: The outcome of your design should apply
to a population and not just a restricted sample. A
generalized design implies that your survey can be
conducted on any part of a population with similar
accuracy.

Understanding Various Types
of Research Design to Select
Which Model to Implement
for a Study
Like research itself, the design of our study can be broadly
classified into quantitative and qualitative.
⬥ Qualitative Research Design: Qualitative
research explores relationships between collected data
and observations through non-numerical evidence, such as
interviews and open-ended responses, rather than through
mathematical calculations. Proving or disproving theories
with statistical methods belongs to quantitative design.
Researchers rely on qualitative research design methods
to understand “why” a particular theory exists along with
“what” respondents have to say about it.
⬥ Quantitative research design: Quantitative research is
for cases where statistical conclusions are essential to collecting
actionable insights. Numbers provide a better
perspective to make critical business decisions.
Quantitative research design methods are necessary for
the growth of any organization. Insights drawn from hard
numerical data and analysis prove to be highly effective
when making decisions related to the future of the
business.
We can further break down the types of research design into three
categories: Exploratory Research, Descriptive Research and
Experimental Research
1. Exploratory Research: If the problem statement is not clear, we
can conduct exploratory research. It is usually conducted when the
researcher does not know much about the problems. It is usually
conducted at the outset of research projects.
Uses of Exploratory Research:
❑ Gain background information
❑ Define terms
❑ Clarify problems and hypotheses
❑ Establish research priorities
2. Descriptive Research Design: In a descriptive design, a
researcher is solely interested in describing the situation
or case under their research study. It is a theory-based
design method which is created by gathering, analyzing,
and presenting collected data. This allows a researcher to
provide insights into the why and how of research.
Descriptive design helps others better understand the need
for the research.
3. Experimental Research Design: Experimental research design
establishes a relationship between the cause and effect of a situation. It
is a causal design where one observes the impact caused by the
independent variable on the dependent variable. For example, one
monitors the influence of an independent variable such as price on a
dependent variable such as customer satisfaction or brand loyalty. It is a
highly practical research design method as it contributes to solving a
problem. The independent variables are manipulated to monitor the
change they produce in the dependent variable. It is often used in social
sciences to observe human behavior by analyzing two groups.
Researchers can have participants change their actions and study how
the people around them react to gain a better understanding of social
psychology.
Measurement and Scaling
❑ Measurement is the foundation of any scientific
investigation. It is a recorded observation
❑ Everything we do begins with the measurement of
whatever it is we want to study
❑ Measurement is the assignment of numbers to objects

When you cannot measure, your knowledge is of a meager
and unsatisfactory kind.
Kelvin, 1883
Measurement and Scaling
Cont.
Measurement means assigning numbers or other
symbols to characteristics of objects according to
certain pre-specified rules.
■ One-to-one correspondence between the numbers
and the characteristics being measured.
■ The rules for assigning numbers should be
standardized and applied uniformly.
■ Rules must not change over objects or time.
Measurement and Scaling
Cont.
Scaling involves creating a continuum upon which
measured objects are located.

Consider an attitude scale from 1 to 100.


Each respondent is assigned a number from 1 to 100, with 1 =
Extremely Unfavorable, and 100 = Extremely Favorable.
Measurement is the actual assignment of a number from 1 to
100 to each respondent. Scaling is the process of placing the
respondents on a continuum with respect to their attitude
toward department stores.
Primary Measurement
Scales
The four primary scales run from crude (nominal) to sophisticated (ratio):
⬥ Nominal – arbitrary assignment of a code to an attribute, e.g.,
1 = male, 2 = female
⬥ Ordinal – rank, e.g., 1st, 2nd, 3rd, …
⬥ Interval – equal distance between units, but no absolute zero
point, e.g., 20° C, 30° C, 40° C, …
⬥ Ratio – absolute zero point, therefore ratios are meaningful,
e.g., 20 wpm, 40 wpm, 60 wpm
Use ratio measurements where possible
Nominal Measures

⬥ Only offer a name or a label for a variable


⬥ Not really a ‘scale’ because it does not scale objects along
any dimension
⬥ There is no ranking
⬥ They are not numerically related
⬥ Categorical data are measured on nominal scales which
merely assign labels to distinguish categories
⬥ Example: Gender; Race
Ordinal Measures
⬥ Variables with attributes that can be rank ordered
⬥ Numbers are used to place objects in order
⬥ But, there is no information regarding the differences
(intervals) between points on the scale
⬥ A nonnumeric label or numeric code may be used.
⬥ Distance between points does not have meaning
■ Example: lower, middle, and upper class
Interval Measures
⬥ Distance separating attributes has meaning and is
standardized (equidistant)
⬥ An interval scale is a scale on which equal intervals
between objects represent equal differences
⬥ The interval differences are meaningful
⬥ But, we can’t defend ratio relationships
⬥ “0” value does not mean a variable is not present
⬥ The data have the properties of ordinal data, and the
interval between observations is expressed in terms of a
fixed unit of measure.
⬥ Interval data are always numeric.
Ratio Measures
⬥ Attributes of a variable have a “true zero point” that means
something
⬥ Ratios are meaningful
⬥ Preferred scale of measurement
⬥ With ratio measurements summaries and comparisons are
strengthened
⬥ Report “counts” as ratios where possible because they
facilitate comparisons
⬥ Variables such as distance, height, weight, and time use the
ratio scale.
Primary Scales of Measurement

Scale      Measure                                  Values for the three runners at the finish
Nominal    Numbers assigned to runners              7             8              3
Ordinal    Rank order of winners                    Third place   Second place   First place
Interval   Performance rating on a 0 to 10 scale    8.2           9.1            9.6
Ratio      Time to finish, in seconds               15.2          14.1           13.4
Primary Measurement
Scaling and Descriptive
Statistics
Type of Scale   Numerical Operation                                  Descriptive Statistics
Nominal         Counting                                             Frequency in each category, percentage in each category, mode
Ordinal         Rank ordering                                        Median, range, percentile ranking
Interval        Arithmetic operations on intervals between numbers   Mean, standard deviation, variance
Ratio           Arithmetic operations on actual quantities           Geometric mean, coefficient of variation
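The link between scale type and permissible statistics can be sketched in code. The example below is a minimal illustration, assuming the runner values from the primary scales example; the function name `summarize` and its structure are hypothetical, not part of any standard method. Each scale level is allowed the statistics of the levels below it plus its own.

```python
from statistics import mean, median, mode, stdev, geometric_mean

# Values from the runners example (one list per scale type).
jersey_numbers = [7, 8, 3]           # nominal: labels only
finish_ranks = [3, 2, 1]             # ordinal: order of finish
ratings = [8.2, 9.1, 9.6]            # interval: 0 to 10 performance rating
finish_times = [15.2, 14.1, 13.4]    # ratio: seconds, with a true zero

LEVELS = ["nominal", "ordinal", "interval", "ratio"]


def summarize(values, scale):
    """Return only the statistics that are meaningful for the given scale."""
    level = LEVELS.index(scale)
    summary = {"mode": mode(values)}           # counting is always allowed
    if level >= 1:                             # ordinal and above
        summary["median"] = median(values)
    if level >= 2:                             # interval and above
        summary["mean"] = mean(values)
        summary["stdev"] = stdev(values)
    if level >= 3:                             # ratio only
        summary["geometric_mean"] = geometric_mean(values)
        summary["coefficient_of_variation"] = stdev(values) / mean(values)
    return summary


print(summarize(jersey_numbers, "nominal"))
print(summarize(finish_times, "ratio"))
```

Asking for a mean of the jersey numbers is possible arithmetically, but the sketch refuses to report it because the result would carry no meaning on a nominal scale.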
Operationalization: Concept
and Variables
⬥ It is critical to survey research to understand how to go from
ideas to concepts to variables – operationalization.
⬥ Concept: an idea, a general mental formulation
summarizing specific occurrences
⬥ A label we put on a phenomenon, a matter, a “thing” that
enables us to link separate observations, make
generalizations, communicate and inherit ideas.
⬥ Concepts can be concrete, abstract, tangible or intangible.
⬥ Properties of objects that can take on different values are
referred to as variables
⬥ A constant is a number that does not change its value in a
given situation
Operationalization: Concept
and Variables Cont.
❑ Dependent Variable: The dependent variable is the variable that
the researcher measures; it is called a dependent variable because it
depends upon (is caused by) the independent variable.
■ not under the experimenter’s control
■ usually the outcome to be measured
❑ Independent Variable: The independent variable is the one that the
researcher manipulates.
■ manipulated by the experimenter
■ under the control of the experimenter
❑ Typically, we are interested in measuring the effects of independent
variables on dependent variables
❑ Extraneous variables are those variables that may have some
effect on a dependent variable yet are not independent variables.
Operationalization: Concept
and Variables Cont.
⬥ Qualitative Variable: Composed of categories which are
not comparable in terms of magnitude. Example: gender,
marital status, religious affiliation
⬥ Quantitative Variable: Can be ordered with respect to
magnitude on some dimension
⬥ Continuous Variable: A quantitative variable, which can be
measured with an arbitrary degree of precision. Any two
points on a scale of a continuous variable have an infinite
number of values in between. It is generally measured.
Example: time, distance, weight
⬥ Discrete Variable: A quantitative variable where values can
differ only by well-defined steps with no intermediate
values possible. It is generally counted. Example: number of
children in a household, number of stores visited
Attitude
Measuring Attitude is a frequent undertaking in business
research
Attitude may be defined as an enduring disposition to
consistently respond in a given manner to various aspects
of the world, including persons, events, and objects

Attitude has three dimensions: the Affective Component, the Cognitive
Component, and the Behavioural Component
Components of Attitude
Affective Component: Reflective of a person’s general
feelings or emotions towards an object or subject (like,
dislike, love, hate)

Cognitive Component: Reflective of a person’s
awareness of and knowledge about an object or subject
(know, believe)

Behavioural Component: Reflective of a person’s
intentions and behavioural expectations, and
predisposition to action
Measuring Attitude

⬥ It can be difficult to measure attitude directly. Indicators
such as verbal expression, physiological measurement
techniques and overt behaviour are therefore used for this purpose.
The three different components of attitude may require
different measuring techniques

⬥ Common techniques used in business research to
determine attitude include rating, ranking, sorting and the
choice technique
Rating Techniques to
Measure Attitude
Rating Scales are frequently employed in business
research for measuring attitude, and many scales have
been developed for this purpose, including:
Simple Attitude Scales
Category Scales
Likert Scale
Semantic Differential
Numerical Scales
Constant-Sum Scale
Stapel Scale
Graphic Scales
Simple Attitude Scales

In attitude scaling, individuals are typically asked whether
they agree or disagree with a question (or questions) put to
them, or they are asked to respond to a question or questions

Simple attitude scales have the properties of a nominal scale
and the disadvantages that go with it. They do not
permit fine distinctions in the respondents’ answers because
the choice of answers is limited, but they can be useful in
instances where the respondents’ education level is low and
questionnaires are lengthy
Category Scales

A category scale consists of several response categories to
provide the respondent with alternative ratings

Category scales are more sensitive than rating scales
which allow only two answer categories (because of the
larger number of choices), and thus provide more data
and information
Classification of Scaling
Techniques
Scaling Techniques
⬥ Comparative Scales
■ Paired Comparison
■ Rank Order
■ Constant Sum
■ Q-Sort and Other Procedures
⬥ Non-comparative Scales
■ Continuous Rating Scales
■ Itemized Rating Scales: Likert, Semantic Differential, Stapel
Comparison of Scaling
Techniques
⬥ Comparative scales involve the direct comparison of
stimulus objects. Comparative scale data must be
interpreted in relative terms and have only ordinal or rank
order properties.

⬥ In non-comparative scales, each object is scaled
independently of the others in the stimulus set. The
resulting data are generally assumed to be interval or ratio
scaled.
Relative Advantages
/Disadvantages of Comparative
Scales
Advantages:
1. Small differences between stimulus objects can be detected.
2. Same known reference points for all respondents.
3. Easily understood and easy to apply.
4. Involve fewer theoretical assumptions.
5. Tend to reduce halo or carryover effects from one judgment to another.
Disadvantages:
1. Ordinal nature of the data
2. Inability to generalize beyond the stimulus objects scaled.
Comparative Scaling
Techniques
Paired Comparison Scaling
⬥ A respondent is presented with two objects and asked to
select one according to some criterion.
⬥ The data obtained are ordinal in nature.
⬥ Paired comparison scaling is the most widely-used
comparative scaling technique.
⬥ With n brands, [n(n-1)/2] paired comparisons are required.
⬥ Under the assumption of transitivity, it is possible to
convert paired comparison data to a rank order.
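The pair count and the conversion to a rank order can be sketched as follows. This is a minimal illustration, assuming four hypothetical brands (A to D) and made-up choices; ranking by the number of times a brand is chosen gives a strict order only when the respondent's choices are transitive.

```python
from itertools import combinations


def count_pairs(n):
    """Number of paired comparisons required for n brands: n(n-1)/2."""
    return n * (n - 1) // 2


def rank_from_pairs(brands, preferred):
    """Convert paired-comparison choices into a rank order.

    `preferred` maps each pair (a, b) to the brand chosen from that pair.
    Brands are ordered by how many times they were chosen.
    """
    wins = {brand: 0 for brand in brands}
    for pair in combinations(brands, 2):
        wins[preferred[pair]] += 1
    return sorted(brands, key=lambda b: wins[b], reverse=True)


# Hypothetical brands and choices, for illustration only.
brands = ["A", "B", "C", "D"]
choices = {
    ("A", "B"): "A",
    ("A", "C"): "C",
    ("A", "D"): "A",
    ("B", "C"): "C",
    ("B", "D"): "B",
    ("C", "D"): "C",
}

print(count_pairs(len(brands)))          # 6 comparisons for 4 brands
print(rank_from_pairs(brands, choices))  # ['C', 'A', 'B', 'D']
```

The pair count grows quickly: 10 brands already need 45 comparisons, which is why the technique suits small sets of objects.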
Comparative Scaling
Techniques
Rank Order Scaling
⬥ Respondents are presented with several objects
simultaneously and asked to order or rank them according
to some criterion.
⬥ It is possible that the respondent may dislike the brand
ranked 1 in an absolute sense.
⬥ Furthermore, rank order scaling also results in ordinal
data.
⬥ Only (n - 1) scaling decisions need be made in rank order
scaling.
Comparative Scaling
Techniques
Constant Sum Scaling
⬥ Respondents allocate a constant sum of units, such as 100
points to attributes of a product to reflect their importance.
⬥ If an attribute is unimportant, the respondent assigns it zero
points.
⬥ If an attribute is twice as important as some other attribute, it
receives twice as many points.
⬥ The sum of all the points is 100. Hence, the name of the
scale.
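A constant sum response can be checked and summarized as in the sketch below. The attribute names and point values are hypothetical; the helper names `check_constant_sum` and `relative_importance` are illustrative only.

```python
def check_constant_sum(allocation, total=100):
    """Return True if all points are non-negative and add up to the constant sum."""
    points = allocation.values()
    return all(p >= 0 for p in points) and sum(points) == total


def relative_importance(allocation):
    """Express each attribute's points as a share of the total allocated."""
    total = sum(allocation.values())
    return {attribute: pts / total for attribute, pts in allocation.items()}


# Hypothetical response: price is rated twice as important as quality,
# and colour is rated unimportant (zero points).
allocation = {"price": 50, "quality": 25, "packaging": 25, "colour": 0}

print(check_constant_sum(allocation))            # True
print(relative_importance(allocation)["price"])  # 0.5
```

A response that does not add up to 100 fails the check and would normally be queried or discarded before analysis.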
Q-Sort and Other
Procedures
Q-sort scaling was developed to discriminate among a
relatively large number of objects quickly. This technique uses a
rank order procedure in which objects are sorted into piles based
on similarity with respect to some criterion.
For example, respondents are given 100 attitude statements on
individual cards and asked to place them into 11 groups, ranging
from ‘most highly agreed with’ to ‘least highly agreed with’.
The number of objects to be sorted should not be less than 60
nor more than 140; a reasonable range is 60 to 90 objects. The
number of objects to be placed in each group is pre-specified,
often to result in a roughly normal distribution of objects over the
whole set.
Non-Comparative Scaling
Techniques
⬥ Respondents evaluate only one object at a time, and for
this reason non-comparative scales are often referred to as
monadic scales.

⬥ Non-comparative techniques consist of continuous and
itemized rating scales.
Continuous Rating Scale
Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one
extreme of the criterion variable to the other.
The form of the continuous scale may vary considerably.

How would you rate Sears as a department store?


Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - Probably the best

Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - --Probably the best
0 10 20 30 40 50 60 70 80 90 100

Version 3
Very bad Neither good Very good
nor bad
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - ---Probably the best
0 10 20 30 40 50 60 70 80 90 100
Itemized Rating Scales

⬥ The respondents are provided with a scale that has a
number or brief description associated with each category.

⬥ The categories are ordered in terms of scale position, and
the respondents are required to select the specified
category that best describes the object being rated.

⬥ The commonly used itemized rating scales are the Likert,
semantic differential, and Stapel scales.
Likert Scale
A Likert scale is a measure of attitudes designed to allow
respondents to indicate how strongly they agree or
disagree with carefully constructed statements that range
from very positive to very negative towards an object or
subject
The number of alternatives on the Likert scale can vary;
five alternatives are commonly used
A Likert scale may include a number of question items,
each covering some aspect of the respondent’s attitude,
and these items collectively form an index
Likert Scale Cont.
The Likert scale requires the respondents to indicate a degree of agreement or disagreement with
each of a series of statements about the stimulus objects.

                                          Strongly   Disagree   Neither agree   Agree   Strongly
                                          disagree              nor disagree            agree
1. Sears sells high quality merchandise.  1          2X         3               4       5
2. Sears has poor in-store service.       1          2X         3               4       5
3. I like to shop at Sears.               1          2          3X              4       5

⬥ The analysis can be conducted on an item-by-item basis (profile analysis), or a total
(summated) score can be calculated.
⬥ When arriving at a total score, the categories assigned to the negative statements by the
respondents should be scored by reversing the scale.
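Reverse scoring can be sketched with the three Sears statements, where statement 2 ("poor in-store service") is the negative one. This is a minimal illustration; the function name `summated_score` is hypothetical. On a 1 to 5 scale, a response r to a negative statement is rescored as 6 - r, so a higher total always means a more favourable attitude.

```python
def summated_score(responses, negative_items, points=5):
    """Sum Likert responses, reverse-scoring negatively worded items.

    `responses` maps item number to the category marked (1..points).
    `negative_items` is the set of item numbers worded negatively.
    """
    total = 0
    for item, response in responses.items():
        if not 1 <= response <= points:
            raise ValueError(f"item {item}: response {response} is off the scale")
        # Reverse the scale for negative statements: 1<->5, 2<->4, 3 stays 3.
        total += (points + 1 - response) if item in negative_items else response
    return total


# Categories marked with X in the Sears example.
responses = {1: 2, 2: 2, 3: 3}
negative_items = {2}

print(summated_score(responses, negative_items))  # 2 + (6 - 2) + 3 = 9
```

Without the reversal, disagreeing with the "poor service" statement would lower the total even though it expresses a favourable view.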
Semantic Differential Scale
The semantic differential is a seven-point rating scale with end points associated with
bipolar labels that have semantic meaning.
SEARS IS:
Powerful --:--:--:--:-X-:--:--: Weak
Unreliable --:--:--:--:--:-X-:--: Reliable
Modern --:--:--:--:--:--:-X-: Old-fashioned
⬥ The negative adjective or phrase sometimes appears at the left side of the scale and
sometimes at the right.
⬥ This controls the tendency of some respondents, particularly those with very
positive or very negative attitudes, to mark the right- or left-hand sides without
reading the labels.
⬥ Individual items on a semantic differential scale may be scored on either a -3 to +3
or a 1 to 7 scale.
A Semantic Differential Scale for Measuring Self-
Concepts, Person Concepts, and Product Concepts

1) Rugged :---:---:---:---:---:---:---: Delicate


2) Excitable :---:---:---:---:---:---:---: Calm
3) Uncomfortable :---:---:---:---:---:---:---: Comfortable
4) Dominating :---:---:---:---:---:---:---: Submissive
5) Thrifty :---:---:---:---:---:---:---: Indulgent
6) Pleasant :---:---:---:---:---:---:---: Unpleasant
7) Contemporary :---:---:---:---:---:---:---: Obsolete
8) Organized :---:---:---:---:---:---:---: Unorganized
9) Rational :---:---:---:---:---:---:---: Emotional
10) Youthful :---:---:---:---:---:---:---: Mature
11) Formal :---:---:---:---:---:---:---: Informal
12) Orthodox :---:---:---:---:---:---:---: Liberal
13) Complex :---:---:---:---:---:---:---: Simple
14) Colorless :---:---:---:---:---:---:---: Colorful
15) Modest :---:---:---:---:---:---:---: Vain
Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to
+5, without a neutral point (zero). This scale is usually presented vertically.
BIG-BAZAR
+5 +5
+4 +4
+3 +3
+2 +2X
+1 +1
HIGH QUALITY POOR SERVICE
-1 -1
-2 -2
-3 -3
-4X -4
-5 -5
The data obtained by using a Stapel scale can be analyzed in the
same way as semantic differential data.
Basic Non-Comparative Scales

Continuous Rating Scale
■ Basic characteristics: Place a mark on a continuous line
■ Examples: Reaction to TV commercials
■ Advantages: Easy to construct
■ Disadvantages: Scoring can be cumbersome unless computerized

Itemized Rating Scales

Likert Scale
■ Basic characteristics: Degree of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
■ Examples: Measurement of attitudes
■ Advantages: Easy to construct, administer, and understand
■ Disadvantages: More time consuming

Semantic Differential
■ Basic characteristics: Seven-point scale with bipolar labels
■ Examples: Brand, product, and company images
■ Advantages: Versatile
■ Disadvantages: Difficult to construct bipolar adjectives

Stapel Scale
■ Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
■ Examples: Measurement of attitudes and images
■ Advantages: Easy to construct and administer over telephone
■ Disadvantages: Confusing and difficult to apply

Scale Evaluation
⬥ Reliability
■ Test/Retest
■ Alternative Forms
■ Internal Consistency
⬥ Validity
■ Content
■ Criterion
■ Construct: Convergent, Discriminant, Nomological
⬥ Generalizability
Goodness of Measures
Determining Quality of
Measurement
Accuracy and Consistency in Measurement

⬥ Validity is accuracy

⬥ Reliability is consistency
Reliability
⬥ Definition: The extent to which the same research technique
applied again to the same object (subject) will give you the
same result
⬥ Reliability does not ensure accuracy: a measure can be
reliable but inaccurate (invalid) because of bias in the
measure or in data collector/coder
⬥ Reliability can be defined as the extent to which measures
are free from random error, XR. If XR = 0, the measure is
perfectly reliable.
⬥ In test-retest reliability, respondents are administered
identical sets of scale items at two different times and the
degree of similarity between the two measurements is
determined.
Reliability
⬥ In alternative-forms reliability, two equivalent forms of the scale
are constructed and the same respondents are measured at two
different times, with a different form being used each time.
⬥ Internal consistency reliability determines the extent to which
different parts of a summated scale are consistent in what they
indicate about the characteristic being measured.
⬥ In split-half reliability, the items on the scale are divided into two
halves and the resulting half scores are correlated.
⬥ The coefficient alpha, or Cronbach's alpha, is the average of all
possible split-half coefficients resulting from different ways of
splitting the scale items. This coefficient varies from 0 to 1, and a
value of 0.6 or less generally indicates unsatisfactory internal
consistency reliability.
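In practice, coefficient alpha is usually computed without enumerating every split, using the standard variance-based formula: alpha = k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes that formula and uses hypothetical scores; the function name `cronbach_alpha` is illustrative only.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a summated scale.

    `items` is a list of columns, one per scale item, each holding the
    scores of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total (summated) score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(column) for column in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of five respondents on three items.
item_scores = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]

alpha = cronbach_alpha(item_scores)
print(round(alpha, 3))
print("satisfactory" if alpha > 0.6 else "unsatisfactory")
```

Two identical items give an alpha of exactly 1, since each item then carries the same information as the other.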
Four Aspects of Reliability

1. Stability
2. Reproducibility
3. Homogeneity
4. Accuracy
Validity

⬥ The validity of a scale may be defined as the extent to which
differences in observed scale scores reflect true differences among
objects on the characteristic being measured, rather than systematic
or random error. Perfect validity requires that there be no
measurement error (XO = XT, XR = 0, XS = 0).
⬥ Content validity is a subjective but systematic evaluation of how
well the content of a scale represents the measurement task at hand.
⬥ Criterion validity reflects whether a scale performs as expected in
relation to other variables selected (criterion variables) as
meaningful criteria.
Validity

⬥ Construct validity addresses the question of what construct or
characteristic the scale is, in fact, measuring. Construct
validity includes convergent, discriminant, and nomological
validity.
⬥ Convergent validity is the extent to which the scale correlates
positively with other measures of the same construct.
⬥ Discriminant validity is the extent to which a measure does
not correlate with other constructs from which it is supposed to
differ.
⬥ Nomological validity is the extent to which the scale
correlates in theoretically predicted ways with measures of
different but related constructs.
Measurement Accuracy
The true score model provides a framework for
understanding the accuracy of measurement.
XO = XT + XS + XR
where
XO = the observed score or measurement
XT = the true score of the characteristic
XS = systematic error
XR = random error
Relationship Between
Reliability and Validity
⬥ If a measure is perfectly valid, it is also perfectly reliable.
In this case XO = XT, XR = 0, and XS = 0.
⬥ If a measure is unreliable, it cannot be perfectly valid, since
at a minimum XO = XT + XR. Furthermore, systematic
error may also be present, i.e., XS≠0. Thus, unreliability
implies invalidity.
⬥ If a measure is perfectly reliable, it may or may not be
perfectly valid, because systematic error may still be
present (XO = XT + XS).
⬥ Reliability is a necessary, but not sufficient, condition for
validity.
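The case of a reliable but invalid measure can be illustrated with the true score model. The sketch below uses hypothetical true scores and a scale that always reads 2 points too high (XS = 2, XR = 0); the function name `observe` is illustrative only.

```python
def observe(true_scores, systematic_error=0.0, random_errors=None):
    """Apply the true score model XO = XT + XS + XR to each respondent."""
    if random_errors is None:
        random_errors = [0.0] * len(true_scores)
    return [xt + systematic_error + xr
            for xt, xr in zip(true_scores, random_errors)]


true_scores = [3.0, 5.0, 7.0, 9.0]   # XT, hypothetical values

# Two administrations of the same biased scale: XS = 2, XR = 0.
first = observe(true_scores, systematic_error=2.0)
second = observe(true_scores, systematic_error=2.0)

is_reliable = first == second        # repeated measurements agree
is_valid = first == true_scores      # observed scores equal true scores

print(is_reliable, is_valid)         # True False
```

Because there is no random error, the two administrations agree perfectly, yet every observed score is off by the same systematic amount, so the measure is reliable without being valid.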
