
Measurement Concepts

Types of reliability and validity


Reliability of measures
Types of measurement scales
Anatomy of a research article
Constructs vs variables
• Constructs are broader concepts, traits, or attributes that are not directly observable or
measurable. They are often defined by a set of behaviors, attitudes, or characteristics.
• Variables are attributes or characteristics that can vary and are typically measured or
manipulated in research.
• In summary, variables are specific attributes that can be quantified or manipulated,
while constructs are more abstract concepts that are inferred from multiple variables
and are not directly measurable. Constructs are often used to explain or represent more
complex phenomena in research and theory.
Instruments
• You depend on instruments to measure events more than you probably realize. For
example, you rely on the speedometer in a car and the clock in your bedroom, and you
can appreciate the problems that arise when these instruments are inaccurate. Accuracy
refers to the difference between what an instrument says is true and what is known to be
true. A clock that is consistently 5 minutes slow is not very accurate. Inaccurate clocks
can make us late, and inaccurate speedometers can earn us traffic tickets. The accuracy
of an instrument is determined by calibrating it, or checking it with another instrument
known to be true. Measurements can be made at varying levels of precision. A measure
of time in tenths of a second is not as precise as one that is in hundredths of a second.
One instrument that yields imprecise measures is the gas gauge in most older cars.
Although reasonably accurate, gas gauges do not give precise readings. Most of us have
wished at one time or another that the gas gauge would permit us to determine whether
we had that extra half gallon of gas that would get us to the next service station.
• We also need instruments to measure behavior. You can be assured that the precision,
and even the accuracy, of instruments used in psychology have improved significantly
since 1879, the founding of the first psychology laboratory. Today, many sophisticated
instruments are used in contemporary psychology (Figure 2.3). To perform a
psychophysiology experiment (e.g., when assessing a person’s arousal level) requires
instruments that give accurate measures of such internal states as heart rate and blood
pressure. Tests of anxiety sometimes employ instruments to measure galvanic skin
response (GSR). Other behavioral instruments are of the paper-and-pencil variety.
Questionnaires and tests are popular instruments used by psychologists to measure
behavior. So, too, are the rating scales used by human observers. For instance, rating
aggression in children on a 7-point scale ranging from not at all aggressive (1) to very
aggressive (7) can yield relatively accurate (although perhaps not precise) measures of
aggression. It is the responsibility of the behavioral scientist to use instruments that are
as accurate and as precise as possible.
Measurement
• Scientists use two types of measurements to record the careful and controlled
observations that characterize the scientific method. One type of scientific
measurement, physical measurement, involves dimensions for which there is an agreed-
upon standard and an instrument for doing the measuring. For example, length is a
dimension that can be scaled with physical measurement, and there are agreed-upon
standards for units of length (e.g., inches, meters). Similarly, units of weight and time
represent physical measurement. In most psychological research, however, the
measurements do not involve physical dimensions. Rulers do not exist for measuring
psychological constructs such as beauty, aggression, or intelligence. These dimensions
require a second type of measurement— psychological measurement. In a sense, the
human observer is the instrument for psychological measurement. More specifically,
agreement among a number of observers provides the basis for psychological
measurement. For example, if several independent observers agree that a certain action
warrants a rating of 3 on a 7-point rating scale of aggression, that is a psychological
measurement of the aggressiveness of the action.
Validity
• Measurements must be valid and reliable. In general, validity refers to the
“truthfulness” of a measure. A valid measure of a construct is one that measures what it
claims to measure. Suppose a researcher defines intelligence in terms of how long a
person can balance a ball on his or her nose. According to the principle of
“operationalism,” this is a perfectly permissible operational definition. However, most
of us would question whether such a balancing act is really a valid measure of
intelligence. The validity of a measure is supported when people do as well on it as on
other tasks presumed to measure the same construct. For example, if time spent
balancing a ball is a valid measure of intelligence, then a person who does well on the
balancing task should also do well on other accepted measures of intelligence.
• Validity is critical in psychology to ensure that the measures accurately assess the
constructs or variables of interest. In the field of psychology, different types of validity
include:
• Content Validity: Ensuring that the content of psychological measures covers the full
range of the construct being measured. For example, a depression scale should include
items that represent various symptoms of depression.
• Criterion-Related Validity: Establishing whether a psychological measure correlates
with a specific criterion, such as using a new psychological assessment to predict future
behavior or outcomes in a valid and reliable manner.
• Construct Validity: Ensuring that a psychological measure accurately assesses the
theoretical construct it is intended to measure. This is particularly important when
developing new psychological measures or assessing complex psychological constructs.
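Criterion-related validity is usually quantified as a correlation between the new measure and the criterion. A minimal sketch using only Python's standard library; the six pairs of scores and the variable names are hypothetical, invented for illustration:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical validation data: six people take a new assessment, and a
# criterion (e.g., a later outcome score) is measured for each of them.
new_test = [12, 15, 9, 20, 17, 11]
criterion = [30, 38, 25, 47, 41, 28]

print(round(pearson_r(new_test, criterion), 3))
```

A correlation near +1 would support criterion-related validity; real validation studies use far larger samples and also check that the measure does *not* correlate with constructs it should be unrelated to.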
Reliability
• The reliability of a measurement is indicated by its consistency. Several kinds of
reliability can be distinguished. When we speak of instrument reliability, we are
discussing whether an instrument works consistently. A car that sometimes starts and
sometimes doesn’t is not very reliable. Observations made by two or more independent
observers are said to be reliable if they show agreement—that is, if the observations are
consistent from one observer to another. For example, when psychologists asked college
students to rate the “happiness” of medal winners at the 1992 Summer Olympics in
Barcelona, Spain, they found that rater agreement was very high (Medvec, Madey, &
Gilovich, 1995). They also found, somewhat counterintuitively, that bronze (third place)
medal winners were perceived as happier than silver (second place) medal winners, a
finding that was explained by a theory of counterfactual thinking. Apparently, people are
happier just making it (to the medal stand) than they are just missing it (i.e., missing a
gold medal).
• In psychology, the reliability of measures is essential for ensuring that the data collected
is consistent and dependable. Key types of reliability in psychological research include:
• Test-Retest Reliability: This is important in psychology to ensure that the same results
are obtained when a measurement is repeated over time. For example, in clinical
psychology, it is crucial that psychological assessments yield consistent results when
administered to the same individual on different occasions.
• Inter-Rater Reliability: In psychological assessments and observations, it is important
to establish inter-rater reliability to ensure that different raters or observers achieve
consistent results when evaluating the same subject.
• Internal Consistency Reliability: This is particularly relevant in psychological tests
and questionnaires, where internal consistency measures such as Cronbach's alpha are
used to ensure that the items in the test are measuring the same underlying construct.
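The internal-consistency idea can be made concrete with Cronbach's alpha, which compares the sum of the individual item variances to the variance of respondents' total scores. A minimal sketch with hypothetical item scores (the standard formula, computed here with population variances):

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding that item's scores
    across all respondents. Uses population variances throughout."""
    k = len(item_scores)
    sum_item_vars = sum(pvariance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    return (k / (k - 1)) * (1 - sum_item_vars / pvariance(totals))

# Hypothetical 3-item scale answered by five respondents
items = [
    [4, 3, 5, 2, 4],  # item 1
    [5, 3, 4, 2, 5],  # item 2
    [4, 2, 5, 1, 4],  # item 3
]
print(round(cronbach_alpha(items), 3))
```

Values closer to 1 indicate that the items vary together, as expected if they tap the same underlying construct; a common (if rough) rule of thumb treats alpha above about .70 as acceptable.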
Types of measurement scales
• There are four primary types of measurement scales:
• Nominal Scale: This scale is used for labeling variables without any quantitative value.
It simply names or categorizes.
• Ordinal Scale: This scale ranks and orders data without establishing a precise
measurement between each item. The differences between the ranks are not equal.
• Interval Scale: This scale not only classifies and orders the variables, but it also
specifies the exact value between each unit. However, it does not have a true zero point.
• Ratio Scale: This scale has all the properties of the interval scale, but it also has a true
zero point, allowing for the computation of ratios.
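A small, hypothetical example shows which summary statistics are meaningful at each level of measurement:

```python
from statistics import mean, median, mode

# Hypothetical data at each level of measurement
eye_color = ["brown", "blue", "brown", "green"]  # nominal: categories only
finish_rank = [1, 2, 3, 4, 5]                    # ordinal: order, unequal gaps
temp_c = [10.0, 20.0, 30.0]                      # interval: no true zero
reaction_ms = [250.0, 500.0, 1000.0]             # ratio: true zero exists

print(mode(eye_color))                  # nominal permits counting and the mode
print(median(finish_rank))              # ordinal adds the median and percentiles
print(mean(temp_c))                     # interval adds means and differences
print(reaction_ms[1] / reaction_ms[0])  # ratio adds meaningful ratios
```

Note that 20 °C is not meaningfully "twice as hot" as 10 °C (Celsius has no true zero), whereas a 500 ms reaction time really is twice as long as a 250 ms one; that distinction is exactly the true-zero property of the ratio scale.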
Anatomy of a Research Article
• In psychology, research articles follow a similar structure to those in other fields, with
specific emphasis on psychological theories, methodologies, and findings. The anatomy
of a research article in psychology includes:
• Introduction: Provides an overview of the research problem, relevant psychological
theories, and the specific research question or hypothesis.
• Literature Review: Summarizes and synthesizes previous psychological research and
theories relevant to the current study.
• Methodology: Describes the specific psychological research design, participant
selection, psychological measures used, and data collection procedures.
• Results: Presents the findings of the psychological study, including statistical analyses
and psychological measurements.
• Discussion: Interprets the psychological findings, relates them to existing psychological
theories, and discusses their implications for the field.
• Conclusion: Summarizes the key psychological findings and their implications, as well
as potential directions for future research in the specific psychological area.
• References: Lists all psychological sources cited in the article, following specific
psychological citation styles such as APA (American Psychological Association) format.
• This structure is essential for ensuring the transparency and rigor of psychological
research articles.
Psychometric properties
• In psychology, psychometric properties refer to the characteristics and qualities of
psychological measures or instruments, such as tests, questionnaires, and assessments.
These properties are essential for evaluating the reliability and validity of psychological
measures, ensuring that they accurately and consistently measure the constructs they are
designed to assess. Key psychometric properties include:
• 1. Reliability
• 2. Validity
• 3. Standardization
• 4. Norms
• 5. Item Analysis
• 6. Factor Analysis
EFA
• Exploratory Factor Analysis (EFA) is a statistical method used to identify the
underlying structure of a set of observed variables without preconceived assumptions
about the number and nature of the underlying factors. EFA is often employed when
researchers or practitioners are exploring the potential dimensions or constructs that
may be present in a specific set of psychological measures.
• EFA is useful in uncovering the latent variables or constructs that may be driving the
observed patterns of responses in a psychological measure. By identifying these
underlying factors, researchers can gain insights into the structure of the measure and
the relationships between different items or subscales.
• EFA allows researchers to reduce the dimensionality of a large set of variables by
identifying the key latent factors that explain the patterns of correlations among the
observed variables. This reduction can help in simplifying the interpretation and use of
psychological measures.
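The core mechanics can be sketched by hand: EFA works from the correlation matrix among items. The toy example below extracts just the first factor of a hypothetical 3×3 correlation matrix by power iteration; real EFA software handles multiple factors, rotation, and estimation methods such as principal axis or maximum likelihood, so treat this as an illustration of the idea, not of practice:

```python
import math

# Hypothetical correlation matrix for three items suspected of sharing
# one latent factor (values invented for illustration).
R = [
    [1.00, 0.60, 0.50],
    [0.60, 1.00, 0.55],
    [0.50, 0.55, 1.00],
]

def first_factor_loadings(R, iters=200):
    """Dominant eigenvector/eigenvalue of R by power iteration;
    loadings on the first factor = eigenvector * sqrt(eigenvalue)."""
    n = len(R)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * R[i][j] * v[j] for i in range(n) for j in range(n))
    return [x * math.sqrt(eigval) for x in v]

loadings = first_factor_loadings(R)
print([round(l, 2) for l in loadings])  # each item's loading on factor 1
```

Because all three items correlate moderately with one another, each loads substantially on the single extracted factor, which is the pattern that would lead a researcher to interpret them as indicators of one construct.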
CFA
• Confirmatory Factor Analysis (CFA) is a statistical method used to test and confirm
the hypothesized factor structure of a set of observed variables. Unlike EFA, CFA is
employed when researchers have specific hypotheses or theories about the underlying
structure of a psychological measure.
• CFA is used to evaluate the fit between the observed data and a predefined factor
structure, allowing researchers to test whether the data support the theorized
relationships between the measured variables and the underlying constructs.
• CFA is particularly useful in assessing the construct validity of psychological measures
by confirming whether the observed variables are indeed measuring the intended
constructs as hypothesized.
• Additionally, CFA allows researchers to examine the relationships between latent
factors and observed variables, providing insights into the convergent and discriminant
validity of the measure.
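The logic of CFA can be illustrated with a hand-rolled check: under a one-factor model with standardized loadings, the model-implied correlation between two items is the product of their loadings, so comparing implied with observed correlations yields residuals. This is a crude analogue of what real CFA software does (it estimates the loadings, typically by maximum likelihood, and reports formal fit indices); all numbers below are hypothetical:

```python
# Under a one-factor model with standardized loadings, the model-implied
# correlation between items i and j is loadings[i] * loadings[j].
# Small residuals (observed minus implied) suggest the hypothesized
# structure is consistent with the data.

observed = {(0, 1): 0.56, (0, 2): 0.48, (1, 2): 0.42}  # hypothetical correlations
loadings = [0.8, 0.7, 0.6]                             # hypothesized loadings

residuals = {pair: corr - loadings[pair[0]] * loadings[pair[1]]
             for pair, corr in observed.items()}
print({pair: round(res, 3) for pair, res in residuals.items()})
```

Here the hypothesized loadings reproduce the observed correlations almost exactly, the kind of outcome that would support the theorized factor structure.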
Causal inference
• The process of determining whether an observed association truly reflects a
cause-and-effect relationship.
• Scientists set three important conditions for making a causal inference: covariation of
events, a time-order relationship, and the elimination of plausible alternative causes. A
simple illustration will help you to understand these three conditions. Suppose you hit
your head on a door and experience a headache; presumably you would infer that
hitting your head caused the headache. The first condition for causal inference is the
covariation of events. If one event is the cause of another, the two events must vary
together; that is, when one changes, the other must also change. In our illustration, the
event of changing your head position from upright to hitting against the door must
covary with the experience of no headache to the experience of a headache.
• The second condition for a causal inference is a time-order relationship (also known as
contingency). The presumed cause (hitting your head) must occur before the presumed
effect (headache). If the headache began before you hit your head, you wouldn’t infer
that hitting your head caused the headache. In other words, the headache was contingent
on you hitting your head first. Finally, causal explanations are accepted only when other
possible causes of the effect have been ruled out—when plausible alternative causes
have been eliminated. In our illustration, this means that to make the causal inference
that hitting your head caused the headache, you would have to consider and rule out
other possible causes of your headache (such as reading a difficult textbook).
Further explanation
• Unfortunately, people have a tendency to conclude that all three conditions for a causal
inference have been met when really only the first condition is satisfied. For example,
it has been suggested that parents who use stern discipline and physical punishment are
more likely to have aggressive children than are parents who are less stern and use other
forms of discipline. Parental discipline and children’s aggressiveness obviously covary.
Moreover, the fact that we assume parents influence how their children behave might
lead us to think that the time-order condition has been met— parents use physical
discipline and children’s aggressiveness results. It is also the case, however, that infants
vary in how active and aggressive they are and that the infant’s behavior has a strong
influence on the parents’ responses in trying to exercise control. In other words, some
children may be naturally aggressive and require stern discipline rather than stern
discipline producing aggressive children. Therefore, the direction of the causal
relationship may be opposite to what we thought at first.
• It is important to recognize, however, that the causes of events cannot be identified
unless covariation has been demonstrated. The first objective of the scientific method,
description, can be met by describing events under a single set of circumstances. The
goal of understanding, however, requires more than this. For example, suppose a teacher
wished to demonstrate that so-called “active learning strategies” (e.g., debates, group
presentations) help students learn. She could teach students using this approach and then
describe the performance of the students who received instruction in this particular way.
But, at this point, what would she know? Perhaps another group of students taught using
a different approach might learn the same amount. Before the teacher could claim that
active learning strategies caused the performance she observed, she would have to
compare this method with some other reasonable approach. That is, she would look for a
difference in learning between the group using active learning strategies and a group not
using this method. Such a finding would show that teaching strategy and performance
covary. When a controlled experiment is done, a bonus comes along when the
independent and dependent variables covary. The time-order condition for a causal
inference is met because the researcher manipulates the independent variable (e.g.,
teaching method) and subsequently measures the differences between conditions on the
dependent variable (e.g., a measure of student learning).
• By far the most challenging condition researchers must meet in order to make a
causal inference is eliminating other plausible alternative causes. Consider a study in
which the effect of two different teaching approaches (active and passive) is assessed.
Suppose the researcher assigns students to teaching conditions by having all men in
one group and all women in the other. If this were done, any difference between the
two groups could be due either to the teaching method or to the gender of the
students. Thus, the researcher would not be able to determine whether the difference
in performance between the two groups was due to the independent variable she
tested (active or passive learning) or to the alternative explanation of students’
gender. Said more formally, the independent variable of teaching method would be
“confounded” with the independent variable of gender. Confounding occurs when
two potentially effective independent variables are allowed to covary simultaneously.
When research is confounded, it is impossible to determine what variable is
responsible for any obtained difference in performance
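A small simulation makes the problem concrete. Suppose the true effect of active learning is +5 points and some student characteristic independently adds +3 points (both values invented for illustration). When assignment is confounded with that characteristic, the observed group difference mixes the two effects; random assignment recovers the method effect alone:

```python
import random

random.seed(42)

# Hypothetical model of a student's test score: a baseline of 70, plus 5
# points if taught with the active method, plus 3 points if the student
# has some helpful characteristic, plus random noise.
def score(active, trait):
    return 70 + (5 if active else 0) + (3 if trait else 0) + random.gauss(0, 2)

n = 500

# Confounded design: every trait-positive student gets the active method.
conf_active = [score(True, True) for _ in range(n)]
conf_passive = [score(False, False) for _ in range(n)]
conf_diff = sum(conf_active) / n - sum(conf_passive) / n  # mixes both effects

# Randomized design: the trait is split evenly across both methods.
rand_active = [score(True, random.random() < 0.5) for _ in range(n)]
rand_passive = [score(False, random.random() < 0.5) for _ in range(n)]
rand_diff = sum(rand_active) / n - sum(rand_passive) / n  # method effect alone

print(round(conf_diff, 1), round(rand_diff, 1))
```

The confounded comparison overstates the teaching effect (roughly 5 + 3 = 8 points), while the randomized comparison lands near the true 5-point effect, which is exactly why random assignment is the standard tool for eliminating plausible alternative causes.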
• Researchers seek to explain the causes of phenomena by conducting experiments. However, even
when a carefully controlled experiment allows the researcher to form a causal inference, additional
questions remain. One important question concerns the extent to which the findings of the
experiment apply only to the people who participated in the experiment. Researchers often seek to
generalize their findings to describe people who did not participate in the experiment. Many of the
participants in psychology research are introductory psychology students in colleges and
universities. Are psychologists developing principles that apply only to college freshmen and
sophomores? Similarly, laboratory research is often conducted under more controlled conditions
than are found in natural settings. Thus, an important task of the scientist is to determine whether
laboratory findings generalize to the “real world.” Some people automatically assume that
laboratory research is useless or irrelevant to real-world concerns. However, as we explore research
methods throughout this text, we will see that these views about the relationship between
laboratory science and the real world are not helpful or satisfying. Instead, psychologists recognize
the importance of both: Findings from laboratory experiments help to explain phenomena, and this
knowledge is applied to real-world problems in research and interventions.
Items and Indicators
• Components of Psychological Measures: Items are the building blocks of psychological
measures and are designed to elicit responses from individuals regarding their thoughts,
feelings, behaviors, or experiences related to the construct being assessed.
• Types of Items: Items can take various forms, including multiple-choice questions, Likert
scale statements (e.g., strongly agree to strongly disagree), open-ended prompts, and
visual stimuli, depending on the nature of the measure and the construct being assessed.
• Content of Items: The content of items is tailored to the specific construct or trait being
measured. For example, items in a depression questionnaire may ask about symptoms
such as low mood, sleep disturbances, or loss of interest in activities.
• Scoring of Items: Each item within a psychological measure is typically associated with a
specific scoring or response format, allowing individuals to provide their responses based
on the options provided (e.g., selecting a response option, rating on a scale, providing
written responses).
• Item Analysis: Item analysis refers to the process of evaluating the effectiveness of
individual items within a psychological measure. This involves assessing factors such
as item difficulty, item discrimination, and item-total correlations to ensure that the
items are effectively capturing the intended construct.
• Item Development: The development of items in psychological measures often
involves rigorous processes to ensure that the items are clear, relevant, and valid. This
may include expert review, pilot testing, and refinement of items to enhance their
reliability and validity.
• Item Response Theory (IRT): In psychometrics, item response theory is a statistical
framework used to model the relationship between individuals' latent traits (e.g.,
intelligence, personality) and their responses to individual items within a measure. IRT
helps in understanding how different items contribute to the measurement of the latent
trait.
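The heart of IRT can be sketched with the two-parameter logistic model, in which the probability of answering an item correctly (or endorsing it) depends on the person's trait level (theta), the item's discrimination (a), and its difficulty (b). The parameter values below are hypothetical:

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) IRT model: probability that a person
    at trait level theta answers an item with discrimination a and
    difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item: moderate discrimination (a=1.2), average difficulty (b=0.0)
for theta in (-2, 0, 2):
    print(theta, round(p_correct(theta, a=1.2, b=0.0), 2))
```

Plotting this function over theta gives the item characteristic curve: the probability is .50 when theta equals the item's difficulty, and the discrimination parameter controls how steeply probability rises around that point, which is how IRT shows what each item contributes to measuring the latent trait.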
