3 Reliability and Validity

Reliability and Validity
Reliability
A measurement procedure is reliable when it yields consistent scores while the phenomenon being measured is not changing.
Degree to which scores are free of measurement error
Consistency of measurement
Validity
The extent to which measures indicate what they are intended to measure.
The match between the conceptual definition and the operational definition.
Relationship Between Reliability and Validity

Necessary but not sufficient
Reliability is a prerequisite for measurement validity
One needs reliability, but it's not enough
Example
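Measuring height with a reliable bathroom scale: the readings are consistent, but they are not measuring height
Measuring aggression through observer agreement while watching a kid hit a Bobo doll: the observers may agree, but agreement alone does not show that doll-hitting captures aggression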
Types of Reliability Measurement

1. Stability Reliability
2. Equivalence Reliability
Stability Reliability

Test-retest: SAME TEST, DIFFERENT TIMES
Testing a phenomenon at two different times
The degree to which the two measurements of the same thing, using the same measure, are related to one another
Only works if the phenomenon is unchanging
Example of Stability
Administering the same questionnaire at 2 different times
Re-examining a client before deciding on an intervention strategy
Running a trial twice (e.g., errors in tennis serving)
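A test-retest check is usually summarized as a correlation between the two administrations. The sketch below is only an illustration: the scores are made up, and `pearson_r`, `time1`, and `time2` are hypothetical names, not anything from the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical questionnaire scores for the same five people at two times
time1 = [12, 15, 9, 20, 17]
time2 = [13, 14, 10, 19, 18]

test_retest_r = pearson_r(time1, time2)
print(round(test_retest_r, 3))
```

A value near 1 suggests stable scores. A low value is ambiguous: either the measure is unreliable or the phenomenon itself changed between administrations.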
Notes on Stability Reliability

When ratings are made by an observer rather than by the subjects themselves, this is called Intraobserver Reliability or Intrarater Reliability.
Answers about the past are less reliable when they are very specific, because the questions may exceed the subjects' capacity to remember accurately.
Equivalence Reliability
1. Inter-item (split-half)
2. Parallel forms [Different types of measures]
3. Interobserver Agreement
-Is every observer scoring the same?
1. Inter-item Reliability
(Internal consistency): The association of answers to a set of questions designed to measure the same concept.
Note on Inter-item Reliability

The stronger the association among individual items and the more items included, the higher the reliability of an index
Cronbach's alpha is a statistic commonly used to measure inter-item reliability
Cronbach's alpha is based on the average of all the possible correlations of all the split halves of a set of questions on a questionnaire
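For the curious, alpha is usually computed from variances rather than by averaging split halves directly: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below assumes that standard formula; the item scores are invented and `cronbach_alpha` is a hypothetical helper name.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one inner list per question; every inner list contains
    the scores of the same respondents in the same order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent: add up that respondent's answers
    totals = [sum(answers) for answers in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

# Hypothetical answers of five respondents to three questions (1-5 scale)
item1 = [4, 3, 5, 2, 4]
item2 = [5, 3, 4, 2, 4]
item3 = [4, 2, 5, 3, 5]

alpha = cronbach_alpha([item1, item2, item3])
print(round(alpha, 3))
```

When the items move together, the variance of the totals is large relative to the summed item variances, and alpha rises toward 1.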
2. Parallel Forms Reliability

Split (inter-item)
Different types of measures
Interobserver Reliability
Is everyone measuring the same thing? Different measures, same time.
3. Interobserver Reliability
Correspondence between measures made by different observers.
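One rough way to put a number on observer correspondence is simple percent agreement. A common chance-corrected alternative is Cohen's kappa, which the slides do not mention and is shown here only as an illustration. The codes and the names `observer_a` and `observer_b` are made up.

```python
from collections import Counter

def percent_agreement(ratings1, ratings2):
    """Share of cases on which two observers gave the same code."""
    if len(ratings1) != len(ratings2) or not ratings1:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(a == b for a, b in zip(ratings1, ratings2)) / len(ratings1)

def cohens_kappa(ratings1, ratings2):
    """Agreement corrected for the agreement expected by chance."""
    n = len(ratings1)
    observed = percent_agreement(ratings1, ratings2)
    counts1, counts2 = Counter(ratings1), Counter(ratings2)
    # Chance agreement: product of the two observers' marginal proportions
    expected = sum(counts1[c] * counts2[c] for c in set(counts1) | set(counts2)) / (n * n)
    # Undefined (division by zero) if both observers used a single identical code
    return (observed - expected) / (1 - expected)

# Hypothetical codes from two observers watching the same eight episodes
observer_a = ["hit", "hit", "no", "no", "hit", "no", "hit", "no"]
observer_b = ["hit", "hit", "no", "hit", "hit", "no", "hit", "no"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohens_kappa(observer_a, observer_b)
print(agreement, kappa)
```

Percent agreement is easy to read but can look high when one code dominates; kappa discounts that chance agreement.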
Note for Stat Students Only

The text inadvertently describes a 3rd type of reliability that we're not concerned with in this class: goodness of fit about a slope line. It's sometimes referred to as random measurement error. Save this for Grad School =)
Note on Reliability
For Statistics people, the following quote refers to goodness of fit around a slope line due to measurement error.
Secondary Definition of Reliability from a previous slide
"... or that the measured scores change in direct correspondence to actual changes in the phenomenon"
And Now On to Validity...
Types of Validity
1. Content Validity
-Face Validity
-Sampling Validity (content validity)
2. Empirical Validity
-Concurrent Validity
-Predictive Validity
3. Construct Validity
Face Validity
Confidence gained from careful inspection of a concept to see if it's appropriate "on its face"
In our [collective] intersubjective, informed judgment, have we measured what we want to measure?
(N.B. use of good judgment)
Example of Face Validity

Rosenberg's self-esteem scale questions:
Content validity
Also called sampling validity
Establishes that the measure covers the full range of the concept's meaning, i.e., covers all dimensions of a concept
N.B. depends on good judgment
Example of content validity

Earlier SES scale in class
Authoritarian personality questions from Walizer & Wienir
*Note*
Actually, I think face and content validity are probably the same thing
Empirical Validity
Establishes that the results from one measure match those obtained with a more direct or already validated measure of the same phenomenon (the criterion)
Includes:
-Concurrent
-Predictive
Concurrent Validity
Validity exists when a measure yields scores that are closely related to scores on a criterion measured at the same time
Does the new instrument correlate highly with an old measure of the same concept that we assume (judge) to be valid?
(use of good judgment)
Example of concurrent validity

Aronson's doodle measure of achievement motivation
ACT vs. SAT
Predictive Validity
Exists when a measure is validated by predicting scores on a criterion measured in the future
Are future events, which we judge to be a result of the concept we're measuring, anticipated [predicted] by the scores we're attempting to validate?
Use of good judgment
Examples of Predictive Validity

Bronson screening test for at-risk parenting, followed up by interviewing and observing family members and school staff later
SAT / ACT scores and later college performance (grades)
Grades are judged to be measured validly
What's a Construct? [NSB]*

Multidimensional concept
-SES
-Industrialization
Fuzzy concept / hard to define
-Ego strength
-Love
Concept built out of other concepts
-Force = mass * acceleration
* Ya better know these!!!!!
Consider This:
If a construct is hard to conceptualize, doesn't it make sense that it'll be more difficult to operationalize and validate?
Construct validity
Established by showing (1) that a measure is related to a variety of other measures as specified in a theory, (2) that the operationalization has a set of interrelated items, and (3) that the operationalization has not included separate concepts.
Used when no clear criterion exists for validation purposes.
Construct validity
Check the intercorrelation of the items, judged to be valid, that are used to measure the construct
Use theory to predict a relationship, measure the other variable with a judged-to-be-valid measure, then check for the relationship
Demonstrate that your measure isn't related to judged-to-be-valid measures of unrelated concepts
Convergent Validity
Convergent validity: achieved when one measure of a concept is associated with different types of measures of the same concept (this relies on the same type of logic as measurement triangulation)
Measures intercorrelated
Example of questions that Interrelate

Questions for Companionate intimacy
We get along well
We communicate
We like the same stuff
Our chemistry is good
We support each other
Discriminant Validity
Discriminant validity: scores on the measure to be validated are compared to scores on measures of different but related concepts; discriminant validity is achieved if the measure to be validated is NOT strongly associated with the measures of different concepts
Measure not related to unrelated concepts
Questions for Passion

I think my partner is HOT
My partner turns me on
When I'm with my partner I just feel the electricity
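The convergent and discriminant checks can be sketched as two correlations: a high one with another measure of the same concept, and a low one with a measure of a different concept. Everything in the sketch is an assumption for illustration: the scores are invented, and `companionate_scale`, `companionate_rating`, and `passion_scale` are hypothetical names for summed scale scores from six couples.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

# Hypothetical data: the scale being validated
companionate_scale = [10, 14, 8, 18, 12, 16]
# Hypothetical data: a different type of measure of the same concept
companionate_rating = [11, 13, 9, 17, 12, 15]
# Hypothetical data: a measure of a different concept
passion_scale = [5, 4, 3, 3, 4, 5]

convergent_r = pearson_r(companionate_scale, companionate_rating)
discriminant_r = pearson_r(companionate_scale, passion_scale)

print(round(convergent_r, 3), round(discriminant_r, 3))
```

With these made-up numbers the first correlation is high and the second is near zero, which is the pattern construct validation looks for. Real data will rarely be this clean, and how low is "low enough" remains a judgment call.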
Using theory
Measure of constructs predicts what theory says it should
Companionate rel
longevity
satisfaction
