
Ch. 3: The Measurement of Behaviour
PSYCH 3F40: Psychological Research, Mike Maniaci, 9/25/2013

Scientific advances rely on advances in measurement
Examples: intelligence, psychological disorders, personality, relationship processes, attachment, aggression, etc.
The goal is to measure variance in behaviour or mental processes accurately and appropriately.
Key questions:
Are we really measuring the construct we want to measure?
Is the measure consistent over time?
Is the measure practical to use in research?

Types of Measures
Observational Measures: involve the direct observation of behavior
Can be used to measure anything an organism does that can be observed
Researchers can either observe directly or use audio and video recordings
Examples: the EAR (Electronically Activated Recorder), marital conflict, the Ainsworth Strange Situation

Physiological and Neuroscientific Measures: facilitate the study of associations between biological processes and behavior
Involve the use of specialized equipment to measure heart rate, brain activity, hormonal changes, and other bodily responses
Examples: fMRI, EEG, cortisol, oxytocin

Self-Report Measures: responses to questionnaires and interviews
Can be used to measure:
thoughts (cognitive self-reports)
feelings (affective self-reports)
actions (behavioral self-reports)
Examples: personality assessment, clinical inventories, social support

Psychometrics: the study of psychological measurement
Converging operations: measuring a construct in several different ways
E.g., combining observational, physiological, and self-report measures
Combining measures within a study allows us to better separate signal from measurement noise

Scales of Measurement: Four Types
1. Nominal Scale: the numbers assigned are simply labels for characteristics or behaviors
For example: Males = 1, Females = 2; Democrat = 1, Republican = 2, Independent = 3

2. Ordinal Scale: the rank ordering of people's behaviors or characteristics
For example: the order in which runners complete a race; students' ranking from lowest to highest in a class
An ordinal scale does not tell us the distance between participants on the variable being measured.

3. Interval Scale: equal differences between the numbers reflect equal differences between participants, but there is no true zero point
Examples: scores on an IQ test; ratings on a 5-point agree-disagree scale

4. Ratio Scale: contains a true zero point
Examples: weight; the number of questions answered correctly; the time it takes to complete a task
Ratio scales provide the greatest amount of information and should be used whenever possible.

Scales of Measurement (Summary)

Scale     Characteristics                                             Examples
Nominal   Distinct categories; no numerical distinctions              Gender; diagnosis; experimental condition
Ordinal   Rank ordered by size or magnitude                           Rank in class; Olympic medals
Interval  Rank-ordered categories + equal-sized intervals between     Temperature (in F or C); IQ; golf scores
          categories; arbitrary or absent zero point                  (above/below par)
Ratio     Rank-ordered categories + equal-sized intervals +           Number of correct answers; time to complete a
          absolute zero point                                         task; gain in height since last year

Scales of Measurement
The scale of measurement influences the choice of statistical analysis
Typically interval and ratio vs. ordinal and nominal

Measurement Error
Observed Score = True Score + Measurement Error
True Score: the score that a participant would have obtained if the measure were perfect and we were able to measure without error
Measurement Error: variability in scores due to factors that distort the true score

Five Sources of Measurement Error
1. Transient States: a temporary, unstable state of the participant (e.g., mood, health, fatigue, anxiety)
2. Stable Attributes: enduring traits of the participant, such as illiteracy, paranoia, or hostility

3. Situational Factors: characteristics of the researcher or the research setting
4. Characteristics of the Measure: long, difficult, or tedious measures; ambiguous wording
5. Mistakes in recording a participant's score

Reliability
Reliability: the consistency or dependability of a measuring technique
The reliability of a measure is an inverse function of measurement error.
If a measurement has high reliability, participants' observed scores will be close to their true scores.

Reliability
Testing reliability requires analyzing the variability in a set of scores.
Total variance in a set of scores = variance due to true scores + variance due to measurement error
Reliability is the proportion of the total variance that is associated with participants' true scores:
Reliability = true-score variance / total variance
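The variance decomposition above can be checked numerically. A minimal Python sketch of the classical test theory model, where all distributions, means, and sample sizes are illustrative assumptions rather than anything from the lecture:

```python
# Observed Score = True Score + Measurement Error:
# reliability is the share of observed-score variance due to true scores.
import random
import statistics

random.seed(42)

N = 10_000
true_scores = [random.gauss(100, 15) for _ in range(N)]  # IQ-like true scores
errors = [random.gauss(0, 5) for _ in range(N)]          # random measurement error
observed = [t + e for t, e in zip(true_scores, errors)]

true_var = statistics.pvariance(true_scores)
total_var = statistics.pvariance(observed)
reliability = true_var / total_var

# With SDs of 15 (true) and 5 (error), reliability should be near
# 15**2 / (15**2 + 5**2) = 0.90, up to sampling noise.
print(round(reliability, 2))
```

A perfectly reliable measure would have zero error variance, making the ratio 1.0; as error variance grows, the ratio shrinks toward 0.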

Assessing Reliability
Researchers estimate reliability by assessing the extent to which two or more measurements of the same behavior, object, or event yield similar scores.
Researchers usually use a correlation coefficient to make those estimates.

Correlation Coefficients
A correlation coefficient expresses the strength of the relationship between two measures
Can range from -1.00 to +1.00
A correlation of .00 indicates no relationship between the variables
The sign indicates whether the relationship between the variables is positive or negative.
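As a rough illustration, the Pearson correlation coefficient can be computed by hand as the covariance of two measures divided by the product of their standard deviations; the two score lists below are invented for the example:

```python
import statistics

def pearson_r(x, y):
    """Pearson r: covariance of x and y over the product of their SDs."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Fabricated ratings for six participants on two related measures
shyness = [2, 4, 3, 5, 1, 4]
embarrassability = [3, 5, 4, 5, 2, 4]

print(round(pearson_r(shyness, embarrassability), 2))  # → 0.95
```

A value near +1 means high scorers on one measure tend to be high scorers on the other; a value near 0 means the two measures are unrelated.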

Test-Retest Reliability
Test-retest reliability: consistency of participants' responses on a measure over time
Administer the measure on two separate occasions
Examine the correlation between the scores obtained on the two occasions
A correlation > .70 indicates acceptable reliability
Useful only if the attribute being measured should not change over time
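A hedged sketch of the procedure: administer the same measure twice and correlate the two sets of scores, then apply the .70 rule of thumb. The seven pairs of scores below are fabricated:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Same 7 participants, same scale, two administrations a few weeks apart
time1 = [12, 18, 9, 15, 20, 11, 16]
time2 = [13, 17, 10, 14, 19, 12, 15]

r = pearson_r(time1, time2)
print(f"test-retest r = {r:.2f}")
print("acceptable" if r > .70 else "questionable")
```

High agreement across occasions only speaks to reliability if the attribute itself is stable; for a fast-changing state such as mood, a low test-retest r would not indicate a bad measure.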

Interitem Reliability
Interitem reliability: the degree of consistency among the items on a scale
Interitem reliability tells us whether all of the items on a scale are measuring the same thing.
If not, summing scores across the items creates measurement error and lowers reliability.

Indices of Interitem Reliability
Item-total correlation: the correlation between a particular item and the sum of all the other items on the scale (ideally > .30)
Split-half reliability: divide the items on a scale into two sets and examine the correlation between the sets
Cronbach's alpha coefficient (α): equivalent to the average of all possible split-half reliabilities
Most frequently used index
Adequate interitem reliability if α exceeds .70
Influenced by the number of items
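Cronbach's alpha can be computed directly from the item variances and the variance of the scale totals. A minimal sketch using the standard formula α = (k / (k-1)) · (1 - Σ item variances / total-score variance), with a fabricated 4-item, 5-respondent data set:

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, all over the same respondents."""
    k = len(items)
    item_vars = sum(statistics.variance(item) for item in items)
    totals = [sum(scores) for scores in zip(*items)]  # each respondent's scale total
    return (k / (k - 1)) * (1 - item_vars / statistics.variance(totals))

# Rows: 4 items; columns: 5 respondents (invented 1-5 ratings)
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 5],
    [4, 2, 5, 3, 4],
    [5, 3, 5, 2, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")  # adequate interitem reliability if it exceeds .70
```

When the items all track the same construct, respondents' totals vary much more than the item variances alone would predict, which pushes α toward 1; unrelated items pull it toward 0.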

Interrater Reliability
Interrater reliability: the consistency among two or more researchers who observe and record participants' behavior
Assessed by examining the degree of agreement among the observers' records
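One simple way to quantify agreement between two observers is to correlate their tallies of the same behavior; the counts below are invented for illustration:

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation between two raters' scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (statistics.stdev(x) * statistics.stdev(y))

# Number of aggressive acts each of 8 children showed during an
# observation session, as counted independently by two observers
rater_a = [3, 0, 5, 2, 4, 1, 6, 2]
rater_b = [3, 1, 5, 2, 3, 1, 6, 3]

print(f"interrater r = {pearson_r(rater_a, rater_b):.2f}")  # → 0.95
```

A high correlation suggests the coding scheme is being applied consistently; a low one signals that the observers need clearer coding rules or more training.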

Increasing the Reliability of Measures
1. Standardize administration of the measure.
2. Clarify instructions and questions.
3. Train observers.
4. Minimize errors in coding data.
(Basically, try to minimize variance due to measurement error.)

Validity
Validity: the degree to which a measurement procedure actually measures what it is intended to measure, rather than measuring something else (or nothing but measurement error)
To what extent does the variability in scores on the measure reflect variability in the characteristic or behavior we are trying to assess?

Face Validity
Face validity: the extent to which a measure appears to measure what it's supposed to measure
Just because a measure has face validity doesn't mean that it is valid.
Many measures without face validity are valid.
Some measures are designed to lack face validity so as to disguise their purpose.

Construct Validity
Hypothetical constructs: entities that cannot be directly observed but are inferred on the basis of empirical evidence
Examples: intelligence, status, motivation, love, self-esteem, attachment style
Construct validity: the extent to which a measure of a hypothetical construct relates as it should to other measures

Convergent validity: the measure correlates with other measures that it should correlate with
E.g., embarrassability should be positively correlated with shyness but negatively correlated with self-confidence.
Discriminant validity: the measure does not correlate with other measures that it should not correlate with
E.g., embarrassability should not correlate with IQ.

Criterion-Related Validity
Criterion-related validity: the extent to which a measure allows us to distinguish among participants on the basis of a particular behavioral criterion
Researchers examine whether behavioral outcomes are related to scores on the measure as expected.
Example: implicit relationship satisfaction predicts divorce

Concurrent validity: scores on a measure are related as expected to a criterion that is assessed at the time the measure is administered
Example: an Embarrassability Scale (administered today) predicts blushing in the current situation

Predictive validity: scores on a measure are related as expected to a criterion that is assessed in the future
Example: an Embarrassability Scale (administered today) predicts whether students sign up for public speaking classes next semester

Fairness and Bias
Test bias occurs when a particular measure is not equally valid for everyone.
The question is not whether various groups score differently on the test.
Rather, test bias is present when the validity of a measure is lower for some groups than for others.

Any Questions?
