
Levels of Measurement

Nominal, Ordinal, Interval, and Ratio Variables

Level of measurement refers to the way that a variable is measured. There are four main
levels of measurement that variables can have: nominal, ordinal, interval, and ratio. Being
familiar with the level of measurement of the variables in your data set is crucial because
it helps determine which statistical procedure you use. Not every statistical
operation can be used with every variable. The type of procedure used depends on the
variable's level of measurement.
There is a hierarchy implied in the levels of measurement such that at lower levels of
measurement (nominal, ordinal), assumptions are typically less restrictive and data
analyses are less sensitive.


At each level up the hierarchy, the current level includes all the qualities of the one below
it in addition to something new. In general, it is desirable to have a higher level of
measurement (interval or ratio) rather than a lower one. Let's examine each level of
measurement in order from lowest to highest on the hierarchy:
Nominal Level Of Measurement
At the nominal level of measurement, variables simply name the attribute they are measuring
and no ranking is present. For example, gender is a nominal variable because we classify
the observations into the categories "male" and "female." Because the different categories
(for instance, males and females) vary in quality but not quantity, nominal variables are
often called qualitative variables. An important feature of nominal variables is that there
is no hierarchy or ranking to the categories. For instance, males are not ranked higher
than females or vice versa; there is no order or rank, just different names assigned to
Other examples of nominal variables include political party, religion, marital status, and
Nominal variables are also commonly referred to as categorical variables.
Ordinal Level Of Measurement
Variables that have an ordinal level of measurement can be rank-ordered. For example,
social class is an ordinal variable because we can say that a person in the category "upper
class" has a higher class position than a person in the "middle class" category, which in turn
is higher than "lower class."

In ordinal variables, the distance between categories does not have any meaning. For
example, we don't know how much higher "upper class" is than "middle class" or "lower
class." All we know is the order of the categories, but the interval between values is not
Other examples of ordinal variables include education level (less than high school, high
school degree, some college, etc.) and letter grades (A, B, C, D, F).
Interval Level Of Measurement
In interval measurement, the distance between the attributes, or categories, does have
meaning. For example, temperature is an interval variable because the distance between
30 and 40 degrees Fahrenheit is the same as the distance between 70 and 80 degrees
Fahrenheit. The interval between the values is interpretable. For this reason, it makes
sense to compute averages, or means, of interval variables, whereas it doesn't make sense
to do so for ordinal variables. With interval variables, however, ratios do not make sense.
That is, 80 degrees Fahrenheit is not twice as hot as 40 degrees Fahrenheit, even though
the attribute value is twice as large.
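A minimal sketch of why ratios fail on an interval scale: the same two temperatures are expressed in Fahrenheit and in Celsius, and the ratio of the raw values changes with the unit while equal differences stay equal. The helper name `fahrenheit_to_celsius` is an illustrative choice, not something taken from the sources cited here.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


low_f, high_f = 40.0, 80.0
low_c = fahrenheit_to_celsius(low_f)
high_c = fahrenheit_to_celsius(high_f)

# The ratio of raw values depends on the unit, so it says nothing about heat.
ratio_f = high_f / low_f  # 2.0 when measured in Fahrenheit
ratio_c = high_c / low_c  # about 6.0 for the same two temperatures in Celsius

# Equal differences remain equal after the change of unit.
gap_30_40 = fahrenheit_to_celsius(40.0) - fahrenheit_to_celsius(30.0)
gap_70_80 = fahrenheit_to_celsius(80.0) - fahrenheit_to_celsius(70.0)

print(ratio_f, round(ratio_c, 2))
print(abs(gap_30_40 - gap_70_80) < 1e-9)
```

Because the zero point of each scale is arbitrary, only the spacing between values survives the conversion.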
Ratio Level Of Measurement
Variables that are measured at the ratio level are similar to interval variables, but
they also have an absolute zero that is meaningful (i.e., zero means that none of the quantity is present). Because of this,
you can construct a meaningful ratio, or fraction, with a ratio variable.
Height and weight are both examples of ratio variables. If you are measuring a person's
height in inches, there is quantity, equal units, and the measurement cannot go below zero
inches. A negative height is not possible.
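As a small illustration, assuming two made-up heights, the ratio between them is the same whether the heights are recorded in inches or in centimetres. This is what makes a statement such as "twice as tall" meaningful for a ratio variable.

```python
INCHES_TO_CM = 2.54  # one inch is defined as exactly 2.54 cm

# Hypothetical heights, chosen only for illustration.
adult_in, child_in = 72.0, 36.0
adult_cm = adult_in * INCHES_TO_CM
child_cm = child_in * INCHES_TO_CM

ratio_in = adult_in / child_in
ratio_cm = adult_cm / child_cm

# Both ratios agree because zero inches and zero centimetres mean the same thing.
print(ratio_in, ratio_cm)
```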

Frankfort-Nachmias, C. & Leon-Guerrero, A. (2006). Social Statistics for a Diverse Society. Thousand Oaks, CA: Pine Forge Press.

Trochim, W. M. K. (2006). Levels of Measurement. Research Methods Knowledge Base.

Level of measurement
In statistics and quantitative research methodology, various attempts have been made to
classify variables (or types of data) and thereby develop a taxonomy of levels of
measurement or scales of measure. Perhaps the best known are those developed by the
psychologist Stanley Smith Stevens. He proposed four types: nominal, ordinal, interval,
and ratio.

Stevens proposed his typology in a 1946 Science article titled "On the theory of scales of
measurement".[1] In that article, Stevens claimed that all measurement in science was
conducted using four different types of scales that he called "nominal," "ordinal,"
"interval," and "ratio," unifying both "qualitative" (which are described by his "nominal"
type) and "quantitative" (to a different degree, all the rest of his scales). The concept of
scale types later received the mathematical rigour that it lacked at its inception with the
work of mathematical psychologists Theodore Alper (1985, 1987), Louis Narens (1981a,
b), and R. Duncan Luce (1986, 1987, 2001). As Luce (1997, p. 395) wrote:
S. S. Stevens (1946, 1951, 1975) claimed that what counted was having an interval or
ratio scale. Subsequent research has given meaning to this assertion, but given his
attempts to invoke scale type ideas it is doubtful if he understood it himself . . . no
measurement theorist I know accepts Stevens' broad definition of measurement . . . in our
view, the only sensible meaning for 'rule' is empirically testable laws about the attribute.

Nominal scale
The nominal type differentiates between items or subjects based only on their names or
(meta-)categories and other qualitative classifications they belong to; thus dichotomous
data involves the construction of classifications as well as the classification of items.
Discovery of an exception to a classification can be viewed as progress. Numbers may be
used to represent the variables but the numbers do not have numerical value or
Examples of these classifications include gender, nationality, ethnicity, language, genre,
style, biological species, and form.[2][3] In a university one could also use hall of affiliation
as an example. Other concrete examples are

in grammar, the parts of speech: noun, verb, preposition, article, pronoun, etc.
in politics, power projection: hard power, soft power, etc.
in biology, the three domains: Archaea, Bacteria, and Eukarya

Nominal scales were often called qualitative scales, and measurements made on
qualitative scales were called qualitative data. However, the rise of qualitative research
has made this usage confusing.

Mathematical operations
Set membership, classification, categorical equality, and equivalence are all operations
which apply to objects of the nominal type.

Central tendency
The mode, i.e. the most common item, is allowed as the measure of central tendency for
the nominal type. On the other hand, the median, i.e. the middle-ranked item, makes no
sense for the nominal type of data since ranking is meaningless for the nominal type.

Percentages can be used to determine or develop a comparison of the classifications.
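The following sketch, using a made-up sample of genre labels, shows the two summaries named above for nominal data: the mode and the percentage of observations in each category. The data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical nominal observations (labels only, no order implied).
genres = ["jazz", "rock", "jazz", "folk", "jazz", "rock"]

counts = Counter(genres)

# The mode is the most frequent category.
mode_label, mode_count = counts.most_common(1)[0]

# Percentages compare how the observations split across categories.
percentages = {label: 100.0 * n / len(genres) for label, n in counts.items()}

print(mode_label, mode_count)
for label, share in percentages.items():
    print(f"{label}: {share:.1f}%")
```

Sorting these labels would only give alphabetical order, which is why a median has no meaning here.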

Ordinal scale
The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted,
but still does not allow for relative degree of difference between them. Examples include,
on one hand, dichotomous data with dichotomous (or dichotomized) values such as 'sick'
vs. 'healthy' when measuring health, 'guilty' vs. 'innocent' when making judgments in
courts, 'wrong/false' vs. 'right/true' when measuring truth value, and, on the other hand,
non-dichotomous data consisting of a spectrum of values, such as 'completely agree',
'mostly agree', 'mostly disagree', 'completely disagree' when measuring opinion.
Central tendency
The median, i.e. middle-ranked, item is allowed as the measure of central tendency;
however, the mean (or average) as the measure of central tendency is not allowed. The
mode is allowed.
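A minimal sketch of taking a median on ordinal data, assuming the four opinion categories listed earlier and a made-up set of responses. The categories are ranked explicitly; no numeric distance between them is used.

```python
# Opinion categories in rank order, lowest to highest.
LEVELS = [
    "completely disagree",
    "mostly disagree",
    "mostly agree",
    "completely agree",
]
RANK = {label: position for position, label in enumerate(LEVELS)}

# Hypothetical responses; an odd count gives a single middle item.
responses = [
    "mostly agree",
    "completely agree",
    "mostly disagree",
    "mostly agree",
    "completely disagree",
]

# Sort by rank, then take the middle-ranked response.
ordered = sorted(responses, key=RANK.__getitem__)
median_label = ordered[len(ordered) // 2]

print(median_label)
```

The ranks serve only to order the responses. Averaging them would assume equal spacing between categories, which the ordinal type does not provide.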
In 1946, Stevens observed that psychological measurement, such as measurement of
opinions, usually operates on ordinal scales; thus means and standard deviations have no
validity, but they can be used to get ideas for how to improve operationalization of
variables used in questionnaires. Most psychological data collected by psychometric
instruments and tests, measuring cognitive and other abilities, are ordinal, although some
theoreticians have argued they can be treated as interval or ratio scales. However, there is
little prima facie evidence to suggest that such attributes are anything more than ordinal
(Cliff, 1996; Cliff & Keats, 2003; Michell, 2008).[4] In particular,[5] IQ scores reflect an
ordinal scale, in which all scores are meaningful for comparison only.[6][7][8] There is no
absolute zero, and a 10-point difference may carry different meanings at different points
of the scale.[9][10]

Interval scale
The interval type allows for the degree of difference between items, but not the ratio
between them. Examples include temperature on the Celsius scale, which has two
defined points (the freezing and boiling points of water under specific conditions) that are
separated into 100 intervals; date when measured from an arbitrary epoch (such as AD);
and direction measured in degrees from true or magnetic north. Ratios are not allowed
since 20 °C cannot be said to be "twice as hot" as 10 °C, nor can multiplication/division
be carried out between any two dates directly. However, ratios of differences can be
expressed; for example, one difference can be twice another. Interval type variables are
sometimes also called "scaled variables", but the formal mathematical term is an affine
space (in this case an affine line).
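The distinction between ratios of values and ratios of differences can be checked numerically. This sketch, with illustrative readings, re-expresses three Celsius temperatures in Fahrenheit (an affine change of scale) and compares both kinds of ratio.

```python
def celsius_to_fahrenheit(temp_c: float) -> float:
    """Affine change of scale: multiply by 9/5, then shift by 32."""
    return temp_c * 9.0 / 5.0 + 32.0


# Illustrative readings in Celsius and the same readings in Fahrenheit.
a_c, b_c, c_c = 10.0, 20.0, 40.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Ratios of differences survive the change of scale.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)
diff_ratio_f = (c_f - b_f) / (b_f - a_f)

# Ratios of the values themselves do not.
value_ratio_c = b_c / a_c
value_ratio_f = b_f / a_f

print(diff_ratio_c, diff_ratio_f)
print(value_ratio_c, round(value_ratio_f, 2))
```

The shift by 32 cancels when two readings are subtracted, and the factor 9/5 cancels when two differences are divided. Neither cancels in a ratio of raw values.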

Central tendency and statistical dispersion

The mode, median, and arithmetic mean are allowed to measure central tendency of
interval variables, while measures of statistical dispersion include range and standard
deviation. Since one can only divide by differences, one cannot define measures that
require some ratios, such as the coefficient of variation. More subtly, while one can
define moments about the origin, only central moments are meaningful, since the choice
of origin is arbitrary. One can define standardized moments, since ratios of differences
are meaningful, but one cannot define the coefficient of variation, since the mean is a
moment about the origin, unlike the standard deviation, which is (the square root of) a
central moment.
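A short sketch of this point, using made-up temperature readings: the coefficient of variation changes when the same data are expressed in Fahrenheit instead of Celsius, whereas standardized scores (differences divided by the standard deviation, the ingredients of standardized moments) do not. The helper names are illustrative.

```python
import statistics

# Hypothetical readings in Celsius and the same readings in Fahrenheit.
temps_c = [10.0, 15.0, 20.0, 25.0, 30.0]
temps_f = [t * 9.0 / 5.0 + 32.0 for t in temps_c]


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (a moment about the origin)."""
    return statistics.pstdev(values) / statistics.mean(values)


def standardized_scores(values):
    """Deviations from the mean divided by the standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


cv_c = coefficient_of_variation(temps_c)
cv_f = coefficient_of_variation(temps_f)
z_c = standardized_scores(temps_c)
z_f = standardized_scores(temps_f)

print(round(cv_c, 4), round(cv_f, 4))  # the two values differ
print(all(abs(x - y) < 1e-9 for x, y in zip(z_c, z_f)))  # scores agree
```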

Ratio scale
The ratio type takes its name from the fact that measurement is the estimation of the ratio
between a magnitude of a continuous quantity and a unit magnitude of the same kind
(Michell, 1997, 1999). A ratio scale possesses a meaningful (unique and non-arbitrary)
zero value. Most measurement in the physical sciences and engineering is done on ratio
scales. Examples include mass, length, duration, plane angle, energy and electric charge.
Ratios are allowed because having a non-arbitrary zero point makes it meaningful to say,
for example, that one object has "twice the length" of another (= is "twice as long"). Very
informally, many ratio scales can be described as specifying "how much" of something
(i.e. an amount or magnitude) or "how many" (a count). The Kelvin temperature scale is a
ratio scale because it has a unique, non-arbitrary zero point called absolute zero.
Central tendency and statistical dispersion
The geometric mean and the harmonic mean are allowed to measure the central tendency,
in addition to the mode, median, and arithmetic mean. The studentized range and the
coefficient of variation are allowed to measure statistical dispersion. All statistical
measures are allowed because all necessary mathematical operations are defined for the
ratio scale.
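For illustration, the measures listed above can be computed with Python's standard `statistics` module on a made-up set of lengths. The sketch also re-expresses the lengths in centimetres to show that the coefficient of variation is unaffected by a change of unit on a ratio scale.

```python
import statistics

# Hypothetical lengths in metres and the same lengths in centimetres.
lengths_m = [2.0, 4.0, 8.0]
lengths_cm = [x * 100.0 for x in lengths_m]

geometric = statistics.geometric_mean(lengths_m)  # cube root of 2 * 4 * 8
harmonic = statistics.harmonic_mean(lengths_m)    # 3 / (1/2 + 1/4 + 1/8)
arithmetic = statistics.mean(lengths_m)

# The coefficient of variation is unit-free when zero is non-arbitrary.
cv_m = statistics.pstdev(lengths_m) / statistics.mean(lengths_m)
cv_cm = statistics.pstdev(lengths_cm) / statistics.mean(lengths_cm)

print(round(geometric, 4), round(harmonic, 4), round(arithmetic, 4))
print(abs(cv_m - cv_cm) < 1e-9)
```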

Debate on typology
While Stevens' typology is widely adopted, it is still being challenged by other
theoreticians, particularly in the cases of the nominal and ordinal types (Michell, 1986).

Duncan (1986) objected to the use of the word measurement in relation to the nominal
type, but Stevens (1975) said of his own definition of measurement that "the assignment
can be any consistent rule. The only rule not allowed would be random assignment, for
randomness amounts in effect to a nonrule". However, so-called nominal measurement
involves arbitrary assignment, and the "permissible transformation" is any number for
any other. This is one of the points made in Lord's (1953) satirical paper "On the
Statistical Treatment of Football Numbers".[12]
The use of the mean as a measure of the central tendency for the ordinal type is still
debatable among those who accept Stevens' typology. Many behavioural scientists use the
mean for ordinal data, anyway. This is often justified on the basis that the ordinal type in
behavioural science is in fact somewhere between the true ordinal and interval types;
although the interval difference between two ordinal ranks is not constant, it is often of
the same order of magnitude.
For example, applications of measurement models in educational contexts often indicate
that total scores have a fairly linear relationship with measurements across the range of an
assessment. Thus, some argue that so long as the unknown interval difference between
ordinal scale ranks is not too variable, interval scale statistics such as means can
meaningfully be used on ordinal scale variables. Statistical analysis software such as
SPSS requires the user to select the appropriate measurement class for each variable. This
is meant to keep users from inadvertently performing meaningless analyses
(for example, correlation analysis with a variable on a nominal level).
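A hypothetical sketch of such a safeguard follows. It is not how SPSS or any particular package works. It assumes the cumulative hierarchy described in this article, where each level permits every statistic allowed at the levels below it plus some new ones. All names (`LEVELS`, `ADDED_AT_LEVEL`, `permissible`, `check`) are invented for illustration.

```python
# Levels in hierarchy order, lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Statistics that first become permissible at each level (per Stevens' typology).
ADDED_AT_LEVEL = {
    "nominal": {"mode"},
    "ordinal": {"median"},
    "interval": {"mean", "standard deviation"},
    "ratio": {"geometric mean", "harmonic mean", "coefficient of variation"},
}


def permissible(level: str) -> set:
    """Return every statistic allowed at this level and all levels below it."""
    allowed = set()
    for name in LEVELS[: LEVELS.index(level) + 1]:
        allowed |= ADDED_AT_LEVEL[name]
    return allowed


def check(statistic: str, level: str) -> None:
    """Raise ValueError if the statistic is not permissible at this level."""
    if statistic not in permissible(level):
        raise ValueError(f"{statistic!r} is not permissible for {level} data")


check("median", "interval")  # passes silently
try:
    check("mean", "ordinal")
except ValueError as error:
    print(error)
```

Whether the mean should be blocked for ordinal data is exactly the point debated above, so a real tool might warn where this sketch raises an error.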
L. L. Thurstone made progress toward developing a justification for obtaining the interval
type, based on the law of comparative judgment. A common application of the law is the
analytic hierarchy process. Further progress was made by Georg Rasch (1960), who
developed the probabilistic Rasch model that provides a theoretical basis and justification
for obtaining interval-level measurements from counts of observations such as total
scores on assessments.
Another issue is derived from Nicholas R. Chrisman's article "Rethinking Levels of
Measurement for Cartography",[13] in which he introduces an expanded list of levels of
measurement to account for various measurements that do not necessarily fit with the
traditional notions of levels of measurement. Measurements bound to a range and
repeating (like degrees in a circle, clock time, etc.), graded membership categories, and
other types of measurement do not fit Stevens' original work, leading to the
introduction of six new levels of measurement, for a total of ten: (1) Nominal, (2) Graded
membership, (3) Ordinal, (4) Interval, (5) Log-Interval, (6) Extensive Ratio, (7) Cyclical
Ratio, (8) Derived Ratio, (9) Counts, and finally (10) Absolute. The extended levels of
measurement are rarely used outside of academic geography.

Scale types and Stevens' "operational theory of measurement"

The theory of scale types is the intellectual handmaiden to Stevens' "operational theory of
measurement", which was to become definitive within psychology and the behavioral
sciences,[citation needed] despite Michell's characterization of it as being quite at odds with
measurement in the natural sciences (Michell, 1999). Essentially, the operational theory
of measurement was a reaction to the conclusions of a committee established in 1932 by
the British Association for the Advancement of Science to investigate the possibility of
genuine scientific measurement in the psychological and behavioral sciences. This
committee, which became known as the Ferguson committee, published a Final Report
(Ferguson, et al., 1940, p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was
an object of criticism:
any law purporting to express a quantitative relation between sensation intensity and
stimulus intensity is not merely false but is in fact meaningless unless and until a
meaning can be given to the concept of addition as applied to sensation.
That is, if Stevens' sone scale genuinely measured the intensity of auditory sensations,
then evidence for such sensations as being quantitative attributes needed to be produced.
The evidence needed was the presence of additive structure, a concept comprehensively
treated by the German mathematician Otto Hölder (Hölder, 1901). Given that the
physicist and measurement theorist Norman Robert Campbell dominated the Ferguson
committee's deliberations, the committee concluded that measurement in the social
sciences was impossible due to the lack of concatenation operations. This conclusion was
later rendered false by the discovery of the theory of conjoint measurement by Debreu
(1960) and independently by Luce & Tukey (1964). However, Stevens' reaction was not
to conduct experiments to test for the presence of additive structure in sensations, but
instead to render the conclusions of the Ferguson committee null and void by proposing a
new theory of measurement:
Paraphrasing N.R. Campbell (Final Report, p.340), we may say that measurement, in the
broadest sense, is defined as the assignment of numerals to objects and events according
to rules (Stevens, 1946, p.677).
Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel
laureate physicist Percy Bridgman (1927), whose doctrine of operationism Stevens used
to define measurement. In Stevens' definition, for example, it is the use of a tape measure
that defines length (the object of measurement) as being measurable (and so by
implication quantitative). Critics of operationism object that it confuses the relations
between two objects or events with properties of one of those objects or events
(Hardcastle, 1995; Michell, 1999; Moyer, 1981a,b; Rogers, 1989).

The Canadian measurement theorist William Rozeboom (1966) was an early and
trenchant critic of Stevens' theory of scale types.

See also

Inter-rater reliability
Cohen's kappa
Hume's principle
Logarithmic scale
Ramsey–Lewis method

Stevens, S. S. (1946). "On the Theory of Scales of Measurement". Science 103
(2684): 677–680. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677.
PMID 17750512.
Nominal measures are based on sets and depend on categories, à la Aristotle.
"Invariably one came up against fundamental physical limits to the accuracy of
measurement. ... The art of physical measurement seemed to be a matter of compromise,
of choosing between reciprocally related uncertainties. ... Multiplying together the
conjugate pairs of uncertainty limits mentioned, however, I found that they formed
invariant products of not one but two distinct kinds. ... The first group of limits were
calculable a priori from a specification of the instrument. The second group could be
calculated only a posteriori from a specification of what was done with the instrument. ...
In the first case each unit [of information] would add one additional dimension
(conceptual category), whereas in the second each unit would add one additional atomic
fact.", pp. 1–4: MacKay, Donald M. (1969), Information, Mechanism, and Meaning,
Cambridge, MA: MIT Press, ISBN 0-262-63-032-X