CHAPTER 5
OPERATIONAL DEFINITION OF
VARIABLES
The operational definition of a variable is the specific way in which it is measured in
a given study. An operational definition, when applied to data collection, is a clear,
concise, and detailed definition of a measure. Operational definitions are
fundamental when collecting any type of data. They are particularly important when a
decision is being made about whether something is correct or incorrect, or when a
visual check is being made where there is room for confusion.
LEARNING OBJECTIVES
What is a Variable?
A variable is anything that has a quantity or quality that varies. The dependent variable is
the variable a researcher is interested in. An independent variable is a variable believed to
affect the dependent variable. A confounding variable is another variable that interferes
with the relationship between the two.
Types of variables:
Categorical variables take on values that are names or labels. The color of a ball (e.g.,
red, yellow, blue) or the breed of a dog (e.g., collie, shepherd, terrier) would be examples
of categorical variables.
Confounding variables are outside influences that distort the apparent relationship between
the independent and dependent variables. They can ruin an experiment and produce useless results.
Control variables are factors in an experiment which must be held constant. For
example, in an experiment to determine whether light makes plants grow faster, you
would have to control for soil quality and water.
Dependent variables are the outcome of an experiment. As you change the independent
variable, you watch what happens to the dependent variable.
Discrete variables can only take on a countable number of distinct values. For example, “number of
cars in a parking lot” is discrete because the count can only be a whole number (0, 1, 2, and so on).
Independent variables are not affected by the other variables in the study; the researcher
manipulates or selects them. They are usually plotted on the x-axis.
Lurking variables are “hidden” variables that affect the relationship between the
independent and dependent variables.
Measurement variables have a number associated with them. Each is an “amount” of something,
or a “number” of something.
Ordinal variables are similar to a categorical variable, but there is a clear order. For
example, income levels of low, middle, and high could be considered ordinal.
Qualitative variables are a broad category for any variable that can’t be counted (i.e. has
no numerical value). Nominal and ordinal variables fall under this umbrella term.
Quantitative variables can be counted or have a numerical value associated with them.
Examples of variables that fall into this category include discrete variables and ratio
variables.
Random variables are associated with random processes and give numbers to outcomes
of random events.
Ranked variables are ordinal variables; variables where every data point can be put in
order (1st, 2nd, 3rd, etc.).
Ratio variables are similar to interval variables but have a meaningful zero.
Attribute variable is another name for a categorical variable (in statistical software) or a
variable that isn’t manipulated (in design of experiments).
A binary variable can only take on two values, usually 0/1. The values could also be yes/no,
tall/short, or some other two-value combination.
A collider variable is represented by a node on a causal graph that has two or more arrows
pointing into it; that is, it is a common effect of two or more other variables.
Criterion variable is another name for a dependent variable, when the variable is used in
non-experimental situations.
Dummy variables are used in regression analysis when you want to represent categorical
variables with numbers. Each dummy variable codes one category as present or absent. For
example, for the category “has dogs” you might assign a 1 to mean “has dogs” and a 0 to mean
“does not have dogs”; a separate dummy variable would code “owns a car.”
Extraneous variables are any variables that you are not intentionally studying in your
experiment or test.
A grouping variable (also called a coding variable, group variable or by variable) sorts
data within data files into categories or groups.
Responding variable is an informal term for dependent variable, usually used in science
fairs.
Study Variable (Research Variable) can mean any variable used in a study but does have
a more formal definition when used in a clinical trial.
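The dummy-variable entry in the list above can be made concrete with a minimal sketch. It assumes each yes/no category gets its own 0/1 variable; the records and field names (has_dogs, owns_car) are hypothetical and chosen only for illustration.

```python
# Hypothetical survey records; the field names are assumptions for illustration.
respondents = [
    {"id": 1, "has_dogs": "yes", "owns_car": "no"},
    {"id": 2, "has_dogs": "no", "owns_car": "yes"},
    {"id": 3, "has_dogs": "yes", "owns_car": "yes"},
]


def dummy_code(answer):
    """Return 1 when the category is present ("yes") and 0 otherwise."""
    return 1 if answer == "yes" else 0


# Each category becomes its own 0/1 (dummy) variable.
coded = [
    {
        "id": r["id"],
        "has_dogs": dummy_code(r["has_dogs"]),
        "owns_car": dummy_code(r["owns_car"]),
    }
    for r in respondents
]

for row in coded:
    print(row)
```

Coding each category separately keeps the two questions independent: a respondent can have dogs, own a car, both, or neither.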
Each person/thing we collect data on is called an observation (in our work these are
usually people/subjects). Currently, the term participant rather than subject is used when
describing the people from whom we collect data.
Quantitative variables are ones that exist along a continuum that runs from low to high.
Ordinal, interval, and ratio variables are quantitative. Quantitative variables are
sometimes called continuous variables because they have a variety (continuum) of
characteristics. Height in inches and scores on a test would be examples of quantitative
variables.
Qualitative variables do not express differences in amount, only differences in kind. They are
sometimes referred to as categorical variables because they classify by categories.
Nominal variables such as gender, religion, or eye color are categorical variables.
Categorical variables are groups, such as gender or type of degree sought. Quantitative
variables are numbers that have a range, like weight in pounds or baskets made during a
ball game. When we analyze data we do turn the categorical variables into numbers, but
only for identification purposes (e.g., 1 = male and 2 = female). Just because 2 = female
does not mean that females have more of something than males, who are coded 1. With
quantitative data, having a higher number means you have more of something, so higher
values have meaning.
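A short sketch may help show the difference between codes used for identification and numbers that measure an amount. The data below are invented for illustration, and the variable names are assumptions. Counting is a sensible summary for the coded categorical variable, while averaging is a sensible summary for the quantitative one.

```python
from collections import Counter
from statistics import mean

# Hypothetical data: codes 1 = male, 2 = female are labels, not amounts.
gender_codes = [1, 2, 2, 1, 2]
weights_lb = [180, 140, 150, 170, 135]  # quantitative: higher means more weight

gender_labels = {1: "male", 2: "female"}

# Categorical variable: translate codes back to labels and count each group.
gender_counts = Counter(gender_labels[code] for code in gender_codes)

# Quantitative variable: the average is meaningful because values are amounts.
mean_weight = mean(weights_lb)

print(gender_counts)
print(mean_weight)
```

Averaging the gender codes would run without error, but the result would have no interpretation, which is why the codes are mapped back to labels before summarizing.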
While the independent variable is often manipulated by the researcher, it can also be a
classification where subjects are assigned to groups. In a study where one variable causes
the other, the independent variable is the cause. In a study where groups are being
compared, the independent variable is the group classification.
The dependent variable is the outcome. In an experiment, it may be what was caused or
what changed as a result of the study. In a comparison of groups, it is what they differ on.
Let’s assume that we found that whole language instruction worked better than phonics
instruction with the high SES students, but phonics instruction worked better than whole
language instruction with the low SES students. Later you will learn in statistics that this
is an interaction effect. In this study, language instruction was the independent variable
(with two levels: phonics and whole language). SES was the moderator variable (with
two levels: high and low). Reading achievement was the dependent variable (measured
on a continuous scale so there aren’t levels).
With a moderator variable, we find the type of instruction did make a difference, but it
worked differently for the two groups on the moderator variable. We select this
moderator variable because we think it is a variable that will moderate the effect of the
independent on the dependent. We make this decision before we start the study.
If the moderator had not been in the study above, we would have said that there was no
difference in reading achievement between the two types of reading instruction. This
would have happened because, within each instruction group, the high and low scores of
the two SES groups would cancel each other out and produce what appears to
be average reading achievement in each instruction group. For example, suppose the scores
were Phonics: Low SES = 6 and High SES = 2; Whole Language: Low SES = 2 and High SES = 6.
Phonics has an average of 4 and Whole Language has an average of 4. If we just look at
the averages (without regard to the moderator), it appears that the instruction types
produced similar results.
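The arithmetic in this example can be checked with a minimal sketch. It uses the four scores given in the text and assumes the SES groups are the same size within each instruction group, so that a simple average of the two cell scores is appropriate. The function and variable names are illustrative.

```python
from statistics import mean

# Cell scores from the example: (instruction, SES level) -> reading achievement.
scores = {
    ("phonics", "low"): 6,
    ("phonics", "high"): 2,
    ("whole language", "low"): 2,
    ("whole language", "high"): 6,
}


def instruction_mean(instruction):
    """Average over SES levels, ignoring the moderator (assumes equal group sizes)."""
    return mean(v for (instr, _), v in scores.items() if instr == instruction)


# Ignoring SES, the two instruction types look identical.
overall = {instr: instruction_mean(instr) for instr in ("phonics", "whole language")}

# Within each SES level, the difference (phonics minus whole language) reverses sign.
difference_by_ses = {
    ses: scores[("phonics", ses)] - scores[("whole language", ses)]
    for ses in ("low", "high")
}

print(overall)            # both averages are 4
print(difference_by_ses)  # +4 for low SES, -4 for high SES
```

The equal overall averages hide two opposite effects, which is the pattern an interaction with the moderator produces.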
Extraneous variables are independent variables that have not been controlled. They may or
may not influence the results. One way to control an extraneous variable which might
influence the results is to make it a constant (keep everyone in the study alike on that
characteristic). If SES were thought to influence achievement, then restricting the study
to one SES level would eliminate SES as an extraneous variable.
There are two traits that the attributes of every variable should have. Each variable should be
exhaustive: it should include all possible responses. For instance, if the
variable is "religion" and the only options are "Protestant", "Jewish", and "Muslim", there
are quite a few religions that haven't been included. The list does not
exhaust all possibilities.
On the other hand, if you exhaust all the possibilities with some variables—religion being
one of them—you would simply have too many responses. The way to deal with this is to
explicitly list the most common attributes and then use a general category like "Other" to
account for all remaining ones.
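The "Other" approach can be sketched as follows. The listed attributes reuse the three options from the example above; the function name code_religion is hypothetical, and a real survey would choose its own list of common attributes.

```python
# Explicitly listed attributes (taken from the example; a real list would differ).
LISTED_ATTRIBUTES = {"Protestant", "Jewish", "Muslim"}


def code_religion(response):
    """Keep a listed attribute as-is; place every other response in "Other"."""
    cleaned = response.strip()
    return cleaned if cleaned in LISTED_ATTRIBUTES else "Other"


responses = ["Jewish", "Buddhist", " Muslim ", "Hindu"]
coded = [code_religion(r) for r in responses]
print(coded)  # ['Jewish', 'Other', 'Muslim', 'Other']
```

Because every response falls into either a listed attribute or "Other", the coded variable is exhaustive without needing an unmanageably long list.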
In addition to being exhaustive, the attributes of a variable should be mutually exclusive:
no respondent should be able to have two attributes at the same time.
For instance, you might be tempted to represent the variable "Employment Status" with
the two attributes "employed" and "unemployed." But these attributes are not necessarily
mutually exclusive. A person who is looking for a second job while employed would be
able to check both attributes! But don't we often use questions on surveys that ask the
respondent to "check all that apply" and then list a series of categories? Yes, we do, but
technically speaking, each of the categories in a question like that is its own variable and
is treated dichotomously as either "checked" or "unchecked", attributes that are mutually
exclusive.
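As an illustration of the last point, a "check all that apply" answer can be stored as one dichotomous variable per category. This is only a sketch; the option list and the function name to_dichotomous are assumptions made for the example.

```python
# Hypothetical options offered in a "check all that apply" question.
OPTIONS = ["employed", "unemployed", "looking for work"]


def to_dichotomous(checked):
    """Turn the set of checked options into one checked/unchecked variable per option."""
    return {
        option: "checked" if option in checked else "unchecked"
        for option in OPTIONS
    }


# A respondent who has a job and is also looking for another one.
row = to_dichotomous({"employed", "looking for work"})
print(row)
```

Each resulting variable has exactly two attributes, and a respondent can only have one of them, so mutual exclusivity holds at the level of the individual variable.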
In some cases, the conceptual variable may be too vague to be operationalized, and in
other cases the variable cannot be operationalized because the appropriate technology has
not been developed.
First, more specific definitions mean that there is less danger that the collected data will
be misunderstood by others.
Second, specific definitions will enable future researchers to replicate the research.
The operational definition also helps to control the variable by making the measurement
constant. Therefore, when it comes to operational definitions of a variable, the more
detailed the definition is, the better.
For example, if the researcher was planning to weigh research subjects, there would be
several details that should be spelled out, including what the subjects were to wear,
whether they would wear shoes, what type of scale was being used, and the time of day. It
may also be important to define the measurement of the outcome.
For example, if a study was examining the effect of swimming on overall fitness,
the researcher would need to define how the outcome of overall fitness would be
measured.
Similarly, if a researcher was studying the impact of a nutrition education program, the
outcome to be used in measuring the program’s effectiveness would need to be defined.
Classify the probable categories of each variable and determine whether the categories can be
clearly understood, are mutually exclusive (do not overlap), and are exhaustive (the list of
categories is complete enough to categorize all respondents).
Write down the key terms that may be understood differently by different people unless
they are operationally defined. Write an operational definition for each term.
Does the definition clearly specify the way the variable will be measured?
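The check that categories are mutually exclusive and exhaustive can be sketched for a numeric variable. The age ranges below are hypothetical, as are the function names; the idea is simply that every possible value should match exactly one category.

```python
# Hypothetical age categories as inclusive (low, high) ranges in whole years.
categories = {"18-29": (18, 29), "30-49": (30, 49), "50+": (50, 120)}


def matching_categories(age):
    """Return the names of all categories whose range contains the given age."""
    return [name for name, (low, high) in categories.items() if low <= age <= high]


def check_categories(ages):
    """Return (unmatched, overlapping) ages.

    Unmatched ages show the categories are not exhaustive;
    overlapping ages show they are not mutually exclusive.
    """
    unmatched = [a for a in ages if len(matching_categories(a)) == 0]
    overlapping = [a for a in ages if len(matching_categories(a)) > 1]
    return unmatched, overlapping


# Every age from 18 to 120 should fall into exactly one category.
print(check_categories(range(18, 121)))  # ([], [])
# A 17-year-old respondent would not fit any category.
print(check_categories([17]))            # ([17], [])
```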