
Level of measurement

In statistics and quantitative research methodology, levels of measurement or scales of
measure are types of data that arise in the theory of scale types developed by the
psychologist Stanley Smith Stevens. The types are nominal, ordinal, interval, and ratio.

Typology
Stevens proposed his typology in a 1946 Science article titled "On the theory of scales of
measurement".[1] In that article, Stevens claimed that all measurement in science was
conducted using four different types of scales that he called "nominal", "ordinal", "interval"
and "ratio", unifying both qualitative data (described by his "nominal" type) and
quantitative data (covered, to differing degrees, by the rest of his scales). The concept of scale types later
received the mathematical rigour that it lacked at its inception with the work of mathematical
psychologists Theodore Alper (1985, 1987), Louis Narens (1981a, b) and R. Duncan Luce
(1986, 1987, 2001). As Luce (1997, p. 395) stated:

“S. S. Stevens (1946, 1951, 1975) claimed that what counted was having an interval
or ratio scale. Subsequent research has given meaning to this assertion, but given
his attempts to invoke scale type ideas it is doubtful if he understood it himself…
no measurement theorist I know accepts Stevens' broad definition of
measurement… in our view, the only sensible meaning for 'rule' is empirically
testable laws about the attribute.”

Stanley Smith Stevens' typology

| # | Scale type | Logical/math operations allowed | Examples: variable name (data values) | Measure of central tendency | Qualitative or quantitative |
|---|------------|---------------------------------|---------------------------------------|-----------------------------|-----------------------------|
| 1 | Nominal | =/≠ | Dichotomous: Gender (male vs. female). Non-dichotomous: Nationality (American/Chinese/etc.) | Mode | Qualitative |
| 2 | Ordinal | =/≠ ; </> | Dichotomous: Health (healthy vs. sick), Truth (true vs. false), Beauty (beautiful vs. ugly). Non-dichotomous: Opinion ('completely agree'/'mostly agree'/'mostly disagree'/'completely disagree') | Median | Qualitative |
| 3 | Interval | =/≠ ; </> ; +/− | Date (from 9999 BC to 2013 AD), Latitude (from +90° to −90°) | Arithmetic mean | Quantitative |
| 4 | Ratio | =/≠ ; </> ; +/− ; ×/÷ | Age (from 0 to 99 years) | Geometric mean | Quantitative |

Nominal scale
The nominal type, sometimes also called the qualitative type, differentiates between items or
subjects based only on their names and/or (meta-)categories and other qualitative
classifications they belong to. Examples include gender, nationality, ethnicity, language,
genre, style, biological species, visual pattern, and form (gestalt).

Central tendency

The mode, i.e. the most common item, is allowed as the measure of central tendency for the
nominal type. On the other hand, the median, i.e. the middle-ranked item, makes no sense for
the nominal type of data since ranking is not allowed for the nominal type.
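As a minimal sketch of the point above, the following Python snippet computes the mode of a nominal variable. The sample data, the `nominal_mode` helper and the `relabel` mapping are hypothetical and exist only for illustration; the snippet also shows that renaming the categories one-to-one simply renames the mode.

```python
from collections import Counter

# Hypothetical sample of a nominal variable (nationality labels).
nationalities = ["American", "Chinese", "American", "French", "Chinese", "American"]

def nominal_mode(values):
    """Return the most frequent category (illustrative helper, not a library function)."""
    counts = Counter(values)
    return counts.most_common(1)[0][0]

mode_label = nominal_mode(nationalities)

# Renaming the categories one-to-one changes the labels but not which
# category is the most common one.
relabel = {"American": "A", "Chinese": "B", "French": "C"}
mode_after_relabel = nominal_mode([relabel[v] for v in nationalities])

print(mode_label, mode_after_relabel)
```

No sorting or arithmetic on the labels is involved, only counting and equality tests, which is all the nominal type permits.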

Ordinal scale
The ordinal type allows for rank order (1st, 2nd, 3rd, etc.) by which data can be sorted, but
still does not allow for relative degree of difference between them. Examples include, on one
hand, dichotomous data with dichotomous (or dichotomized) values such as 'sick' vs.
'healthy' when measuring health, 'guilty' vs. 'innocent' when making judgments in courts,
'wrong/false' vs. 'right/true' when measuring truth value, and, on the other hand, non-
dichotomous data consisting of a spectrum of values, such as 'completely agree', 'mostly
agree', 'mostly disagree', 'completely disagree' when measuring opinion.

Central tendency

The median, i.e. the middle-ranked item, is allowed as the measure of central tendency; however,
the mean (or average) as the measure of central tendency is not allowed. The mode is
allowed.
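The middle-ranked item can be found using nothing but the ordering of the categories. The sketch below assumes a hypothetical set of survey responses; the `LEVELS` list, the `RANK` mapping and the `ordinal_median` helper are illustrative names, and the integer ranks serve only as positions in the order, not as quantities.

```python
import statistics

# Ordered categories from lowest to highest agreement. Only the order is
# assumed to be meaningful; the integer ranks are just positions.
LEVELS = ["completely disagree", "mostly disagree", "mostly agree", "completely agree"]
RANK = {label: i for i, label in enumerate(LEVELS)}

# Hypothetical responses.
responses = ["mostly agree", "completely agree", "mostly disagree",
             "mostly agree", "completely disagree"]

def ordinal_median(values):
    """Middle-ranked response; median_low always returns an observed rank."""
    ranks = [RANK[v] for v in values]
    return LEVELS[statistics.median_low(ranks)]

median_response = ordinal_median(responses)
print(median_response)
```

Using `statistics.median_low` rather than `statistics.median` avoids averaging two ranks when the sample size is even, since averaging is not defined for ordinal data.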

In 1946, Stevens observed that psychological measurement, such as measurement of
opinions, usually operates on ordinal scales; thus means and standard deviations have no
validity, but they can be used to get ideas for how to improve operationalization of variables
used in questionnaires.

Most psychological data collected by psychometric instruments and tests, measuring
cognitive and other abilities, are of the interval type, although some theoreticians have argued
they can be treated as being of the ratio type (e.g. Lord & Novick, 1968; von Eye, 2005).
However, there is little prima facie evidence to suggest that such attributes are anything more
than ordinal (Cliff, 1996; Cliff & Keats, 2003; Michell, 2008). In particular,[2] IQ scores
reflect an ordinal scale, in which all scores are meaningful for comparison only.[3][4][5] There is
no absolute zero, and a 10-point difference may carry different meanings at different points of
the scale.[6][7]

Interval scale
The interval type allows for the degree of difference between items, but not the ratio between
them. Examples include temperature with the Celsius scale, and date when measured from an
arbitrary epoch (such as AD). Ratios are not allowed since 20°C cannot be said to be "twice
as hot" as 10°C, nor can multiplication/division be carried out between any two dates directly.
However, ratios of differences can be expressed; for example, one difference can be twice
another. Interval type variables are sometimes also called "scaled variables", but the formal
mathematical term is an affine space (in this case an affine line).
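The temperature example can be checked numerically. The sketch below converts three illustrative Celsius readings to Fahrenheit (the sample values and variable names are assumptions chosen for the demonstration) and compares ratios of values with ratios of differences under the change of scale.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: a change of both unit and zero point."""
    return c * 9 / 5 + 32

# Three illustrative temperatures in degrees Celsius.
a_c, b_c, c_c = 10.0, 20.0, 40.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Ratio of values: depends on the arbitrary zero point, so it changes.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

# Ratio of differences: the zero point cancels, so it is preserved.
diff_ratio_c = (c_c - b_c) / (b_c - a_c)
diff_ratio_f = (c_f - b_f) / (b_f - a_f)

print(ratio_c, ratio_f)            # the two ratios of values disagree
print(diff_ratio_c, diff_ratio_f)  # the two ratios of differences agree
```

The statement "20°C is twice 10°C" does not survive the conversion, while "the second gap is twice the first" does, which is the sense in which only differences carry ratio information on an interval scale.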

Central tendency and statistical dispersion

The mode, median, and arithmetic mean are allowed to measure central tendency of interval
variables, while measures of statistical dispersion include range and standard deviation. Since
only differences can meaningfully be divided, one cannot define measures that require a ratio of
the values themselves, such as the coefficient of variation. More subtly, while one can define moments about the
origin, only central moments are meaningful, since the choice of origin is arbitrary. One can
define standardized moments, since ratios of differences are meaningful, but one cannot
define the coefficient of variation, since the mean is a moment about the origin, unlike the
standard deviation, which is (the square root of) a central moment.
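A short numerical sketch of this argument: shifting the origin of an interval variable (here, hypothetical Celsius readings re-expressed in kelvins) leaves the standard deviation and a standardized moment unchanged, but changes the coefficient of variation. The helper functions below are illustrative definitions, not library calls.

```python
import statistics

# Hypothetical temperature readings; the kelvin list is the same data with
# the origin moved by 273.15.
celsius = [10.0, 15.0, 20.0, 30.0]
kelvin = [t + 273.15 for t in celsius]

def coefficient_of_variation(xs):
    """Standard deviation divided by the mean (a moment about the origin)."""
    return statistics.stdev(xs) / statistics.mean(xs)

def standardized_skewness(xs):
    """Third standardized moment (population form), built only from deviations."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * sd ** 3)

sd_c, sd_k = statistics.stdev(celsius), statistics.stdev(kelvin)
cv_c, cv_k = coefficient_of_variation(celsius), coefficient_of_variation(kelvin)
skew_c, skew_k = standardized_skewness(celsius), standardized_skewness(kelvin)

print(sd_c, sd_k)      # equal up to rounding
print(skew_c, skew_k)  # equal up to rounding
print(cv_c, cv_k)      # clearly different
```

Because the coefficient of variation gives a different answer for the same temperatures depending on where zero is placed, it carries no meaning for interval data.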

Ratio scale
The ratio type takes its name from the fact that measurement is the estimation of the ratio
between a magnitude of a continuous quantity and a unit magnitude of the same kind
(Michell, 1997, 1999). Informally, the distinguishing feature of a ratio scale is the possession
of a zero value. Most measurement in the physical sciences and engineering is done on ratio
scales. Examples include mass, length, duration, plane angle, energy and electric charge. The
Kelvin temperature scale has a non-arbitrary zero point of absolute zero, which is equal to
−273.15 degrees Celsius.

Central tendency and statistical dispersion

The geometric mean and the harmonic mean are allowed to measure the central tendency, in
addition to the mode, median, and arithmetic mean. The studentized range and the coefficient
of variation are allowed to measure statistical dispersion. All statistical measures are allowed
because all necessary mathematical operations are defined for the ratio scale.
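The following sketch, using made-up mass and temperature values, illustrates why these statistics suit ratio scales. Changing the unit (kilograms to grams) rescales the geometric mean by the same factor and leaves the coefficient of variation untouched, whereas shifting the zero point (Celsius to kelvins) does not carry the geometric mean along with it.

```python
import statistics

def coefficient_of_variation(xs):
    """Standard deviation divided by the mean (illustrative helper)."""
    return statistics.stdev(xs) / statistics.mean(xs)

# Ratio-scale data: hypothetical masses, in two different units.
masses_kg = [2.0, 8.0, 4.0]
masses_g = [m * 1000 for m in masses_kg]

gm_kg = statistics.geometric_mean(masses_kg)
gm_g = statistics.geometric_mean(masses_g)
cv_kg = coefficient_of_variation(masses_kg)
cv_g = coefficient_of_variation(masses_g)

# Contrast: interval-scale data with a shifted zero point.
celsius = [10.0, 20.0, 40.0]
kelvin = [t + 273.15 for t in celsius]
gm_c = statistics.geometric_mean(celsius)
gm_k = statistics.geometric_mean(kelvin)

# If the geometric mean respected the shift, this would be close to zero.
shift_mismatch = (gm_k - 273.15) - gm_c

print(gm_kg, gm_g)     # second is 1000 times the first, up to rounding
print(cv_kg, cv_g)     # equal up to rounding
print(shift_mismatch)  # not close to zero
```

`statistics.geometric_mean` is available from Python 3.8 onward.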

Debate on typology
While Stevens' typology is widely adopted, it is still being challenged by other theoreticians,
particularly in the cases of the nominal and ordinal types (Michell, 1986).[8]

Duncan (1986) objected to the use of the word measurement in relation to the nominal type,
but Stevens (1975) said of his own definition of measurement that "the assignment can be any
consistent rule. The only rule not allowed would be random assignment, for randomness
amounts in effect to a nonrule". However, so-called nominal measurement involves arbitrary
assignment, and the "permissible transformation" is the substitution of any number for any other. This is one of
the points made in Lord's (1953) satirical paper "On the Statistical Treatment of Football
Numbers".

The use of the mean as a measure of the central tendency for the ordinal type is still debatable
among those who accept Stevens' typology. Many behavioural scientists use the mean for
ordinal data anyway. This is often justified on the basis that the ordinal type in behavioural
science is in fact somewhere between the true ordinal and interval types; although the interval
difference between two ordinal ranks is not constant, it is often of the same order of
magnitude. For example, applications of measurement models in educational contexts often
indicate that total scores have a fairly linear relationship with measurements across the range
of an assessment. Thus, some argue that so long as the unknown interval difference between
ordinal scale ranks is not too variable, interval scale statistics such as means can
meaningfully be used on ordinal scale variables. Statistical analysis software such as PSPP
requires the user to select the appropriate measurement class for each variable. This helps
prevent the user from inadvertently performing meaningless analyses (for example,
correlation analysis with a variable on a nominal level).
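The concern behind this debate can be made concrete with a small, deliberately contrived example. The two groups and the two numeric codings below are hypothetical. Both codings preserve the order of the categories, yet the comparison of group means reverses between them, while the comparison of medians does not.

```python
import statistics

LEVELS = ["completely disagree", "mostly disagree", "mostly agree", "completely agree"]

# Two order-preserving numeric codings of the same four levels (assumed for
# illustration): equal steps, and a coding that stretches the top step.
equal_steps = dict(zip(LEVELS, [1, 2, 3, 4]))
stretched_top = dict(zip(LEVELS, [1, 2, 3, 10]))

# Hypothetical responses from two groups.
group_a = ["mostly agree"] * 5
group_b = ["completely disagree", "mostly disagree", "mostly disagree",
           "completely agree", "completely agree"]

def summarize(group, coding):
    """Return (mean, median) of a group's responses under a given coding."""
    codes = [coding[r] for r in group]
    return statistics.mean(codes), statistics.median(codes)

mean_a1, med_a1 = summarize(group_a, equal_steps)
mean_b1, med_b1 = summarize(group_b, equal_steps)
mean_a2, med_a2 = summarize(group_a, stretched_top)
mean_b2, med_b2 = summarize(group_b, stretched_top)

print(mean_a1 > mean_b1, mean_a2 > mean_b2)  # the mean comparison flips
print(med_a1 > med_b1, med_a2 > med_b2)      # the median comparison does not
```

The stretched coding is extreme on purpose. When the unknown spacing between ranks varies only a little, as the argument above assumes, such reversals become much less likely, which is why the practice is defended for many behavioural measures.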

L. L. Thurstone made progress toward developing a justification for obtaining the interval
type, based on the law of comparative judgment. A common application of the law is the
analytic hierarchy process. Further progress was made by Georg Rasch (1960), who
developed the probabilistic Rasch model that provides a theoretical basis and justification for
obtaining interval-level measurements from counts of observations such as total scores on
assessments.

Another issue is derived from Nicholas R. Chrisman's article "Rethinking Levels of
Measurement for Cartography",[9] in which he introduces an expanded list of levels of
measurement to account for various measurements that do not necessarily fit with the
traditional notions of levels of measurement. Measurements bound to a range and repeating
(like degrees in a circle, clock time, etc.), graded membership categories, and other types of
measurement do not fit Stevens' original work, leading to the introduction of six new levels
of measurement, for a total of ten: (1) Nominal, (2) Graded membership, (3) Ordinal, (4)
Interval, (5) Log-Interval, (6) Extensive Ratio, (7) Cyclical Ratio, (8) Derived Ratio, (9)
Counts and finally (10) Absolute. The extended levels of measurement are rarely used outside
of academic geography.

Scale types and Stevens' "operational theory of measurement"

The theory of scale types is the intellectual handmaiden to Stevens' "operational theory of
measurement", which was to become definitive within psychology and the behavioral
sciences, despite Michell's characterization of it as being quite at odds with
measurement in the natural sciences (Michell, 1999). Essentially, the operational theory of
measurement was a reaction to the conclusions of a committee established in 1932 by the
British Association for the Advancement of Science to investigate the possibility of genuine
scientific measurement in the psychological and behavioral sciences. This committee, which
became known as the Ferguson committee, published a Final Report (Ferguson, et al., 1940,
p. 245) in which Stevens' sone scale (Stevens & Davis, 1938) was an object of criticism:

“…any law purporting to express a quantitative relation between sensation intensity
and stimulus intensity is not merely false but is in fact meaningless unless and until
a meaning can be given to the concept of addition as applied to sensation.”

That is, if Stevens' sone scale genuinely measured the intensity of auditory sensations, then
evidence for such sensations as being quantitative attributes needed to be produced. The
evidence needed was the presence of additive structure – a concept comprehensively treated
by the German mathematician Otto Hölder (Hölder, 1901). Given that the physicist and
measurement theorist Norman Robert Campbell dominated the Ferguson committee's
deliberations, the committee concluded that measurement in the social sciences was
impossible due to the lack of concatenation operations. This conclusion was later rendered
false by the discovery of the theory of conjoint measurement by Debreu (1960) and
independently by Luce & Tukey (1964). However, Stevens' reaction was not to conduct experiments
to test for the presence of additive structure in sensations, but instead to render the conclusions of
the Ferguson committee null and void by proposing a new theory of measurement:

“Paraphrasing N.R. Campbell (Final Report, p.340), we may say that measurement,
in the broadest sense, is defined as the assignment of numerals to objects and events
according to rules.” (Stevens, 1946, p.677)

Stevens was greatly influenced by the ideas of another Harvard academic, the Nobel laureate
physicist Percy Bridgman (1927), whose doctrine of operationism Stevens used to define
measurement. In Stevens' definition, for example, it is the use of a tape measure that defines
length (the object of measurement) as being measurable (and so by implication quantitative).
Critics of operationism object that it confuses the relations between two objects or events for
properties of one of those objects or events (Hardcastle, 1995; Michell, 1999; Moyer,
1981a,b; Rogers, 1989).

The Canadian measurement theorist William Rozeboom (1966) was an early and trenchant
critic of Stevens' theory of scale types.

References
• Alper, T. M. (1985). A note on real measurement structures of scale type (m, m + 1). Journal
  of Mathematical Psychology, 29, 73–81.
• Alper, T. M. (1987). A classification of all order-preserving homeomorphism groups of the
  reals that satisfy finite uniqueness. Journal of Mathematical Psychology, 31, 135–154.
• Babbie, E. (2004). The Practice of Social Research, 10th edition. Wadsworth, Thomson
  Learning Inc. ISBN 0-534-62029-9
• Briand, L., El Emam, K. & Morasca, S. (1995). On the Application of Measurement Theory
  in Software Engineering. Empirical Software Engineering, 1, 61–88. [On line]
  http://www2.umassd.edu/swpi/ISERN/isern-95-04.pdf
• Cliff, N. (1996). Ordinal Methods for Behavioral Data Analysis. Mahwah, NJ: Lawrence
  Erlbaum. ISBN 0-8058-1333-0
• Cliff, N. & Keats, J. A. (2003). Ordinal Measurement in the Behavioral Sciences. Mahwah,
  NJ: Erlbaum. ISBN 0-8058-2093-0
• Hardcastle, G. L. (1995). S. S. Stevens and the origins of operationism. Philosophy of
  Science, 62, 404–424.
• Lord, F. M. (December 1953). "On the Statistical Treatment of Football Numbers".
  American Psychologist 8 (12): 750–751. doi:10.1037/h0063675. Retrieved 16 September
  2010.
  See also reprints in:
  Readings in Statistics, Ch. 3 (Haber, A., Runyon, R. P., and Badia, P.). Reading, Mass:
  Addison–Wesley, 1970.
  Maranell, Gary Michael, ed. (2007). "Chapter 31". Scaling: A Sourcebook for Behavioral
  Scientists. New Brunswick, New Jersey & London, UK: Aldine Transaction. pp. 402–405.
  ISBN 978-0-202-36175-8. Retrieved 16 September 2010.
• Lord, F. M. & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA:
  Addison–Wesley.
• Luce, R. D. (1986). Uniqueness and homogeneity of ordered relational structures. Journal of
  Mathematical Psychology, 30, 391–415.
• Luce, R. D. (1987). Measurement structures with Archimedean ordered translation groups.
  Order, 4, 165–189.
• Luce, R. D. (1997). Quantification and symmetry: commentary on Michell 'Quantitative
  science and the definition of measurement in psychology'. British Journal of Psychology, 88,
  395–398.
• Luce, R. D. (2000). Utility of uncertain gains and losses: measurement theoretic and
  experimental approaches. Mahwah, N.J.: Lawrence Erlbaum.
• Luce, R. D. (2001). Conditions equivalent to unit representations of ordered relational
  structures. Journal of Mathematical Psychology, 45, 81–98.
• Luce, R. D. & Tukey, J. W. (1964). Simultaneous conjoint measurement: a new scale type of
  fundamental measurement. Journal of Mathematical Psychology, 1, 1–27.
• Michell, J. (1986). Measurement scales and statistics: a clash of paradigms. Psychological
  Bulletin, 100(3), 398–407.
• Michell, J. (1997). Quantitative science and the definition of measurement in psychology.
  British Journal of Psychology, 88, 355–383.
• Michell, J. (1999). Measurement in Psychology – A critical history of a methodological
  concept. Cambridge: Cambridge University Press.
• Michell, J. (2008). Is psychometrics pathological science? Measurement – Interdisciplinary
  Research & Perspectives, 6, 7–24.
• Narens, L. (1981a). A general theory of ratio scalability with remarks about the
  measurement-theoretic concept of meaningfulness. Theory and Decision, 13, 1–70.
• Narens, L. (1981b). On the scales of measurement. Journal of Mathematical Psychology, 24,
  249–275.
• Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests.
  Copenhagen: Danish Institute for Educational Research.
• Rozeboom, W. W. (1966). Scaling theory and the nature of measurement. Synthese, 16,
  170–233.
• Stevens, S. S. (June 7, 1946). "On the Theory of Scales of Measurement". Science 103
  (2684): 677–680. Bibcode:1946Sci...103..677S. doi:10.1126/science.103.2684.677.
  PMID 17750512. Retrieved 16 September 2010.
• Stevens, S. S. (1951). Mathematics, measurement and psychophysics. In S. S. Stevens (Ed.),
  Handbook of experimental psychology (pp. 1–49). New York: Wiley.
• Stevens, S. S. (1975). Psychophysics. New York: Wiley.
• von Eye, A. (2005). Review of Cliff and Keats, Ordinal measurement in the behavioral
  sciences. Applied Psychological Measurement, 29, 401–403.
