
5. Reliability and Validity

• What are random error and systematic error, and how do they influence measurement?
• What is reliability? Why must a measure be reliable?
• How are test-retest and equivalent-forms reliability measured?
• How are split-half reliability and coefficient alpha used to assess the internal consistency of a measured variable?
• What is interrater reliability?
• What are face validity and content validity?
• How are convergent and discriminant validity used to assess the construct validity of a measured variable?
• What is criterion validity?
• What methods can be used to increase the reliability and validity of a self-report measure?
• How are reliability and construct validity similar? How are they different?
Random and Systematic Error

Random Error
1) Fluctuations in the person's current mood.
2) Misreading or misunderstanding the questions.
3) Measurement of the individuals on different days or in different places.

These errors tend to cancel out as you collect many measurements, as the sketch below illustrates.
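A minimal sketch of this idea (hypothetical numbers, assuming numpy is available): averaging many measurements of the same person lets random errors cancel out.

import numpy as np

rng = np.random.default_rng(0)
true_score = 30                          # the person's true score
errors = rng.normal(0, 5, size=1000)     # random error added to each measurement
observed = true_score + errors           # observed score = true score + random error

print(observed[0])        # a single measurement may be far from 30
print(observed.mean())    # the mean of many measurements is close to 30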


Systematic Error
Sources of error that push scores consistently in one direction, such as the style of measurement, a tendency toward self-promotion, or cooperative responding; in these cases other conceptual variables are being measured along with the one of interest.
Because systematic error does not cancel out, we have to reduce these errors before our measurements can support scientific conclusions.
How well do our measured variables "capture" the conceptual variables?
Reliability
The extent to which the measured variables are free from random error, usually determined by measuring the variables more than once.

Construct Validity
The extent to which a measured variable actually measures the conceptual variable that it is designed to assess, shown by the extent to which it relates to other measured variables known to reflect that conceptual variable.
Test-Retest Reliability
The extent to which scores on the same measured variable correlate with each other on two measurements given at two different times.

Questionnaire, 9/20 vs. 9/27:
- I feel I do not have much to be proud of: 4 / 4
- On the whole, I am satisfied with myself: 3 / 4
- I certainly feel useless at times: 2 / 1
- At times I think I am no good at all: 1 / 1
- I have a number of good qualities: 4 / 4
- I am able to do things as well as others: 3 / 4
Caution: the retesting effect (taking the same measure twice can itself change responses).
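A minimal sketch (assuming numpy is available) of test-retest reliability as the correlation between the two administrations; the item scores above are used purely as illustrative numbers, whereas in practice the correlation is computed across respondents.

import numpy as np

# Scores from the 9/20 and 9/27 administrations shown above
time1 = np.array([4, 3, 2, 1, 4, 3])
time2 = np.array([4, 4, 1, 1, 4, 4])

# Test-retest reliability: the Pearson correlation between the two sets of scores
r = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest r = {r:.2f}")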
Equivalent-Forms Reliability
The extent to which two equivalent forms of a measure, given at different times, correlate with each other.
Example: GRE, SAT, GMAT, TOEFL.

Form 1                          Form 2
22 × 45 =                       32 × 45 =
85 × (23 - 11) =                85 × (41 - 11) =
72 - 14 × 12 × (7 - 1) =        72 - 14 × 25 × (6 - 1) =

Reliability as Internal Consistency
The extent to which the scores on the items correlate with each other and thus are all measuring the true score rather than reflecting random error.

Questionnaire 9/20:
___ I feel I do not have much to be proud of.
___ On the whole, I am satisfied with myself.
___ I certainly feel useless at times.
___ At times I think I am no good at all.
___ I have a number of good qualities.
___ I am able to do things as well as others.

How Do You Measure Internal Consistency?
- Split-half reliability
- Coefficient alpha
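A minimal sketch (hypothetical response data, assuming numpy) of both approaches: split-half reliability as the correlation between two halves of the scale, and coefficient alpha computed from the item variances and the variance of the total score.

import numpy as np

# Hypothetical responses: 5 respondents x 6 items (reverse-worded items already recoded)
items = np.array([
    [4, 3, 3, 4, 4, 3],
    [2, 2, 1, 2, 3, 2],
    [4, 4, 3, 3, 4, 4],
    [1, 2, 2, 1, 1, 2],
    [3, 3, 4, 3, 3, 3],
])

# Split-half reliability: correlation between scores on the two halves of the scale
half1 = items[:, ::2].sum(axis=1)   # odd-numbered items
half2 = items[:, 1::2].sum(axis=1)  # even-numbered items
r_half = np.corrcoef(half1, half2)[0, 1]
print(f"split-half r = {r_half:.2f}")

# Cronbach's coefficient alpha
k = items.shape[1]                              # number of items
sum_item_var = items.var(axis=0, ddof=1).sum()  # sum of the item variances
total_var = items.sum(axis=1).var(ddof=1)       # variance of the total scores
alpha = (k / (k - 1)) * (1 - sum_item_var / total_var)
print(f"coefficient alpha = {alpha:.2f}")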
Interrater Reliability
The extent to which the ratings made by different coders correlate with each other.

How Do You Measure Interrater Reliability?
Cohen's Kappa

Aggression Code    Coder 1    Coder 2
Hit boy A             1          3
Hit boy B             3          3
Hit girl A            3          2
Hit girl B            1          1
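A minimal sketch (assuming scikit-learn is installed) of Cohen's kappa for the two coders' ratings in the table above:

from sklearn.metrics import cohen_kappa_score

# Aggression codes assigned by the two coders to the same four behaviors
coder1 = [1, 3, 3, 1]
coder2 = [3, 3, 2, 1]

# Kappa measures agreement between coders corrected for chance agreement
kappa = cohen_kappa_score(coder1, coder2)
print(f"Cohen's kappa = {kappa:.2f}")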
Summary of the kinds of reliability (diagram):
- Test-retest reliability: the same questionnaire (Questionnaire 1, Items 1-3) given at two different times.
- Reliability as internal consistency: the correlations among the items within Questionnaire 1.
- Equivalent-forms reliability: Questionnaire 1 correlated with an equivalent Questionnaire 2.
- Interrater reliability: agreement between different coders' ratings.
Validity

Construct Validity
The extent to which a measured variable actually measures the conceptual variable (that is, the construct) that it is designed to assess.

Criterion Validity
The extent to which a self-report measure correlates with a behavioral measured variable.
Construct Validity

Face Validity
The extent to which the measured variable appears to be an adequate measure of the conceptual variable.

Example item: "I don't like Japanese people."
Strongly Disagree 1 2 3 4 5 6 7 8 Strongly Agree
(Diagram: this measured variable, on its face, reflects the conceptual variable of discrimination toward Japanese people.)
Construct Validity

Content Validity
The degree to which the measured variable appears to have adequately sampled from the potential domain of questions that might relate to the conceptual variable of interest.

(Diagram: the conceptual variable Intelligence sampled by Verbal Aptitude and Math Aptitude items; Sympathy falls outside this domain.)

Construct Validity

Convergent Validity
The extent to which a measured variable is found to be related to other measured variables designed to measure the same conceptual variable.
Example: an Interdependence Scale should correlate with a Collectivism Scale.

Discriminant Validity
The extent to which a measured variable is found to be unrelated to other measured variables designed to measure different conceptual variables.
Example: an Interdependence Scale should be unrelated to an Independence Scale.
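A minimal sketch (hypothetical scale scores, assuming numpy) of checking convergent and discriminant validity by correlating the scales named above:

import numpy as np

# Hypothetical total scores for six respondents on each scale
interdependence = np.array([28, 22, 35, 18, 30, 25])
collectivism    = np.array([29, 21, 34, 20, 31, 26])  # intended to measure the same construct
independence    = np.array([24, 30, 26, 23, 21, 29])  # intended to measure a different construct

r_convergent   = np.corrcoef(interdependence, collectivism)[0, 1]
r_discriminant = np.corrcoef(interdependence, independence)[0, 1]

print(f"convergent  r = {r_convergent:.2f}")     # high correlation supports convergent validity
print(f"discriminant r = {r_discriminant:.2f}")  # low correlation supports discriminant validity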
Criterion Validity

Predictive Validity
The extent to which the scores can predict the participants' future performance.
Example: GRE, SAT.

Concurrent Validity
The extent to which the self-report measure correlates with a behavioral measure that is assessed at the same time.
How Do You Improve the Reliability and Validity of Your Measured Variables?

1. Conduct a pilot test, trying out a questionnaire or other research instrument on a small group.
2. Use multiple measures.
3. Ensure that there is variability in your measures.
4. Write good items.
5. Get your respondents to take your questions seriously.
6. Make your items nonreactive.
7. Consider face and content validity by choosing reasonable items that cover a broad range of issues reflecting the conceptual variable.
8. Use existing measures.
Summary of the kinds of validity (diagram):
- Face validity: the self-report measured variable appears to reflect the conceptual variable.
- Content validity: the measured variable samples the domain of the conceptual variables.
- Predictive validity: the measure predicts future behaviors.
- Concurrent validity: the self-report measure correlates with a behavioral measured variable assessed at the same time.
- Convergent validity: the measure relates to similar items and scales.
- Discriminant validity: the measure is unrelated to items and scales measuring other conceptual variables.