Optimality theory

From Wikipedia, the free encyclopedia
Optimality theory (frequently abbreviated OT) is a linguistic model proposing that the observed forms
of language arise from the interaction between conflicting constraints. OT models grammars as systems that
provide mappings from inputs to outputs; typically, the inputs are conceived of as underlying representations,
and the outputs as their surface realizations.
 1 Theory
 2 Input and GEN: the candidate set
 3 CON: the constraint set
 4 EVAL: definition of optimality
 5 Example
 6 Criticism
 7 Theories within optimality theory
 8 Use outside of phonology
 9 Notes
 10 References
 11 External links
There are three basic components of the theory:
1. GEN takes an input, and generates the list of possible outputs, or
2. CON provides the criteria, in the form of strictly ordered violable
constraints, used to decide between candidates, and
3. EVAL chooses the optimal candidate based on the constraints, and
this candidate is the output.
Optimality theory assumes that these components are universal. Differences in grammars reflect different
rankings of the universal constraint set, CON. Part of language acquisition can then be described as the
process of adjusting the ranking of these constraints.
Optimality theory was originally proposed by the linguists Alan Prince and Paul Smolensky in 1993, and later
expanded by Prince and John J. McCarthy. Although much of the interest in optimality theory has been
associated with its use in phonology, the area to which optimality theory was first applied, the theory is also
applicable to other subfields of linguistics (e.g. syntax and semantics).
Optimality theory is usually considered a development of generative grammar, which shares its focus on the
investigation of universal principles, linguistic typology and language acquisition.
Optimality theory is often called a connectionist theory of language, because it has its roots in neural
network research, though the relationship is now largely of historical interest. It arose in part as a successor to
the theory of Harmonic Grammar, developed in 1990 by Géraldine Legendre, Yoshiro Miyata and Paul
[edit]Input and GEN: the candidate set
Optimality theory supposes that there are no language-specific restrictions on the input. This is called richness
of the base. Every grammar can handle every possible input. For example, a language without complex
clustersmust be able to deal with an input such as /flask/. Languages without complex clusters differ on how
they will resolve this problem; some will epenthesize (e.g. [falasak], or [falasaka] if all codas are banned) and
some willdelete (e.g. [fas], [fak], [las], [lak]). Given any input, GEN generates an infinite number of candidates,
or possible realizations of that input. A language's grammar (its ranking of constraints) determines which of the
infinite candidates will be assessed as optimal by EVAL.
[edit]CON: the constraint set
In optimality theory, every constraint is universal. CON is the same in every language. There are two basic
types of constraints. Faithfulness constraints require that the observed surface form (the output) match the
underlying or lexical form (the input) in some particular way; that is, these constraints require identity between
input and output forms. Markedness constraints impose requirements on the structural well-formedness of
the output. Each plays a crucial role in the theory. Faithfulness constraints prevent every input from being
realized as some unmarked form, and markedness constraints motivate changes from the underlying form.
The universal nature of CON makes some immediate predictions about language typology. If grammars differ
only by having different rankings of CON, then the set of possible human languages is determined by the
constraints that exist. Optimality theory predicts that there cannot be more grammars than there are
permutations of the ranking of CON. The number of possible rankings is equal to the factorial of the total
number of constraints, thus giving rise to the term Factorial Typology. However, it may not be possible to
distinguish all of these potential grammars, since not every constraint is guaranteed to have an observable
effect in every language. Two languages could generate the same range of input-output mappings, but differ in
the relative ranking of two constraints which do not conflict with each other.
[edit]EVAL: definition of optimality
Given two candidates, A and B, A is better than B on a constraint if A incurs fewer violations than B. Candidate
A is better than B on an entire constraint hierarchy if A incurs fewer violations of the highest-ranked constraint
distinguishing A and B. A is optimal in its candidate set if it is better on the constraint hierarchy than all other
candidates. For example, given constraints C1, C2, and C3, where C1 dominates C2, which dominates C3 (C1
>> C2 >> C3), A is optimal if it does better than B on the highest ranking constraint which assigns them a
different number of violations. If A and B tie on C1, but A does better than B on C2, A is optimal, even if A has
100 more violations of C3 than B. This comparison is often illustrated with a tableau. The pointing finger marks
the optimal candidate, and each cell displays an asterisk for each violation for a given candidate and constraint.
Once a candidate does worse than another candidate on the highest ranking constraint distinguishing them, it
incurs a crucial violation (marked in the tableau by an exclamation mark). Once a candidate incurs a crucial
violation, there is no way for it to be optimal, even if it outperforms the other candidates on the rest of CON.
A violation tableau

C1 C2 C3
☞A * * ***
B * **!

Constraints are ranked in a hierarchy of strict domination. The strictness of strict domination means that a
candidate who violates only a high-ranked constraint does worse on the hierarchy than one that doesn't, even if
the second candidate fared worse on every other lower-ranked constraint. This also means that constraints are
violable; the winning candidate need not satisfy all constraints. Within a language, a constraint may be ranked
high enough that it is always obeyed; it may be ranked low enough that it has no observable effects; or, it may
have some intermediate ranking. The term the emergence of the unmarked describes situations in which a
markedness constraint has an intermediate ranking, so that it is violated in some forms, but nonetheless has
observable effects when higher-ranked constraints are irrelevant.
An early example proposed by McCarthy & Prince (1994) is the constraint NoCoda, which prohibits syllables
from ending in consonants. In Balangao, NoCoda is not ranked high enough to be always obeyed, as
witnessed in roots like taynan (faithfulness to the input prevents deletion of the final /n/). But, in
the reduplicated form ma-tayna-taynan 'repeatedly be left behind', the final /n/ is not copied. Under
McCarthy & Prince's analysis, this is because faithfulness to the input does not apply to reduplicated material,
and NoCoda is thus free to prefer ma-tayna-taynan over hypothetical ma-taynan-taynan (which has an
additional violation of NoCoda). Constraints are also violable; the winning candidate need not satisfy all
constraints, as long as for any rival candidate that does better than the winner on some constraint, there is
a higher ranked constraint on which the winner does better than that rival.
Some optimality theorists prefer the use of comparative tableaux, as described in Prince (2002). Comparative
tableaux display the same information as the classic or "flyspeck" tableaux, but the information is presented in
such a way that it highlights the most crucial information. For instance, the tableau above would be rendered in
the following way.
Comparative tableau

C1 C2 C3
A ~ B e W L
Each row in a comparative tableau represents a winner-loser pair, rather than an individual candidate. In the
cells where the constraints assess the winner-loser pairs, there is a W if the constraint in that column prefers
the winner, an L if the constraint prefers the loser, and an e if the constraint does not differentiate between the
pair. Presenting the data in this way makes it easier to make generalizations. For instance, in order to have a
consistent ranking some W must dominate all L's. Brasoveanu and Prince (2005) describe a process known as
fusion and the various ways of presenting data in a comparative tableau in order to achieve the necessary and
sufficient conditions for a given argument.
As a simplified example, consider the manifestation of the English plural:
/ˈkæt/ + /z/ → [ˈkæts] (cats)(also smirks, hits, crepes)
/ˈdɒɡ/ + /z/ → [ˈdɒɡz] (dogs)(also wugs, clubs, moms)
/ˈdɪʃ/ + /z/ → [ˈdɪʃɨz] (dishes)(also classes, glasses, bushes)
Also consider the following constraint set, in descending order of domination (M: markedness, F: faithfulness):
M: *SS - Sibilant-Sibilant clusters are ungrammatical: one violation for every pair of adjacent sibilants in the
M: Agree(Voi) - Agree in specification of [voi]: one violation for every pair of adjacent obstruents in the output
which disagree in voicing.
F: Max - Maximize all input segments in the output: one violation for each segment in the input that doesn't
appear in the output (This constraint prevents deletion).
F: Dep - Output segments are dependent on having an input correspondent: one violation for each segment in
the output that doesn't appear in the input (This constraint prevents insertion).
F: Ident(Voi) - Maintain the identity of the [voi] specification: one violation for each segment that differs in
voicing between the input and output.
dish + z > dishiz
dish + z *SS Agree Max Dep Ident
☞ dishiz



* *!
dishz *! *



dishs *!

dog + z > dogz
dog + z *SS Agree Max Dep Ident



*! *
☞ dogz





cat + z > cats
cat + z *SS Agree Max Dep Ident



*! *




☞ cats

No matter how the constraints are re-ordered, the 'is' allomorph will always lose to 'iz'. This is called harmonic
bounding. The violations incurred by the candidate 'dogiz' are a subset of the violations incurred by 'dogis';
specifically, if you epenthesize a vowel, changing the voicing of the morpheme is gratuitous violation of
constraints. In the 'dog + z' tableau, there is a candidate 'dogz' which incurs no violations whatsoever. Within
the constraint set of the problem, 'dogz' harmonically bounds all other possible candidates. This shows that a
candidate does not need to be a winner in order to harmonically bound another candidate.
The tableaux from above are repeated below using the comparative tableaux format.
dog + z > dogz
dog + z *SS Agree Max Dep Ident
dogz ~ dogiz e e e W e
dogz ~ dogis e e e W W
dogz ~ dog e e W e e
dogz ~ dogs e W e e W
From the above tableau for dog + z, it can be observed that any ranking of these constraints will produce the
observed output dogz. Because there are no loser-preferring comparisons, dogz wins under any ranking of
these constraints; this means that no ranking can be established on the basis of this input.
cat + z > cats
cat + z *SS Agree Max Dep Ident
cats ~ catiz e e e W L
cats ~ catis e e e W e
cats ~ catz e W e e L
cats ~ cat e e W e L
The tableau for cat + z contains rows with a single W and a single L. This shows that Agree, Max, and Dep
must all dominate Ident; however, no ranking can be established between those constraints on the basis of this
input. Based on this tableau, the following ranking has been established:
Agree, Max, Dep >> Ident
dish + z > dishiz
dish + z *SS Agree Max Dep Ident
dishiz ~ dishis e e e e W
dishiz ~ dishz W W e L e
dishiz ~ dish e e W L e
dishiz ~ dishs W e e L W
This tableau shows that several more rankings are necessary in order to
predict the desired outcome. The first row says nothing; there is no loser-
preferring comparison in the first row. The second row reveals
that either *SSor Agree must dominate Dep, based on the comparison
between dishiz and dishz. The third row shows that Max must dominate
Dep. The final row shows that either *SS or Ident must dominate Dep.
From the cat + ztableau, it was established that Dep dominates Ident; this
means that *SS must dominate Dep.
So far, the following rankings have been shown to be necessary:
*SS, Max >> Dep >> Ident
While it is possible that Agree can dominate Dep, it is not necessary;
the ranking given above is sufficient for the observed for fishiz to
When the rankings from the tableaux are combined, the following
ranking summary can be given:
*SS, Max >> Agree, Dep >> Ident
*SS, Max, Agree >> Dep >> Ident
There are two possible places to put Agree when writing out
rankings linearly; neither is truly accurate. The first implies
that *SS and Max must dominate Agree, and the second
implies that Agree must dominate Dep. Neither of these are
truthful, which is a failing of writing out rankings in a linear
fashion like this. These sorts of problems are the reason why
most linguists utilize a lattice graph to represent necessary
and sufficient rankings, as shown below.

A diagram that represents necessary rankings of constraints
in this style is a Hasse diagram.
Optimality theory has attracted substantial amounts of
criticism, most of which is directed at its application to
phonology (rather than syntax or other fields).

It is claimed that optimality theory cannot account
for phonological opacity (see Idsardi 2000, e.g.). In
derivational phonology effects may be seen that are
inexplicable at the surface level but which are explainable
through "opaque" rule ordering; but in optimality theory, which
has no intermediate levels for rules to operate on, these
effects are difficult to explain.
For example, in Québécois French high front vowels
triggered affrication of /t/, (e.g. /tipik/ → [tˢpɪk]) but the loss of
high vowels (visible at the surface level) leaves the affrication
with no apparent source. Derivational phonology can explain
this by saying that vowel syncope (the loss of the vowel)
"counterbled" affrication - that is, instead of vowel syncope
occurring and "bleeding" (i.e. preventing) affrication, we say
that affrication applies before vowel syncope, so that the high
vowel is removed and the environment destroyed which had
triggered affrication. Such counterbleeding rule orderings are
therefore termed opaque (instead of transparent), because
their effects are not visible at the surface level.
The opacity of such phenomena finds no straightforward
explanation in optimality theory, since intermediate forms are
not accessible (constraints refer only to the surface form
and/or the underlying form). There have however been a
number of proposals designed to account for it; but most of
these proposals significantly alter optimality theory's basic
architecture, and therefore tend to be highly controversial.
Frequently, such alterations add new types of constraints
(which are not universal faithfulness or markedness
constraints), or change the properties of GEN (such as
allowing for serial derivations) or EVAL. Some well-known
examples of these include John J. McCarthy's sympathy
theory and candidate chains theory, and there are many
A relevant issue is the existence of circular chain shifts, i.e.
cases where input /X/ maps to output [Y], but input /Y/ maps
to output [X]. Many versions of optimality theory predict this
to be impossible (see Moreton 2004, Prince 2007). It is not
certain whether patterns of this sort occur in natural
Optimality theory is also criticized as being an impossible
model of speech production/perception: computing and
comparing an infinite number of possible candidates would
take an infinitely long time to process. Idsardi (2006) argues
this position, though other linguists dispute this claim on the
grounds that Idsardi makes unreasonable assumptions about
the constraint set and candidates, and that more moderate
instantiations of optimality theory do not present such
big computational problems (see Kornai (2006) and Heinz,
Kobele, and Riggle (2009)). Another common rebuttal to this
criticism of optimality theory is that the framework is purely
representational. In this view, optimality theory is taken to be
a model of linguistic competence and is therefore not
intended to explain the specifics of linguistic performance.

Another objection to optimality theory is the claim that it is not
technically a theory, in that it does not make falsifiable
predictions. The source of this issue is terminology: the term
"theory" is used differently here than in physics, chemistry,
and other sciences. Specific instantiations of optimality theory
may make falsifiable predictions, in the same way that
specific proposals within other linguistic frameworks can.
What predictions are made, and whether they are testable,
depends on the specifics of individual proposals (most
commonly, this is a matter of the definitions of the constraints
used in an analysis). Thus, optimality theory as a framework
is best described as a scientific paradigm.

Some criticisms of optimality theory appear to be based on a
misunderstanding of how it works. A well-known example of
this is Noam Chomsky's widely-repeated assertion that
optimality theory would predict every lexical input to be
reduced to a single optimal syllable (e.g. every word is
realized as [ba])
. In fact, under the premises of Optimality
Theory universal neutralization of this type would only be
predicted if there were no faithfulness constraints (see
McCarthy 1997). In a sense, the diametrically opposite kind
of criticism comes from Morris Halle: "... the existence of
phonology in every language shows that Faithfulness is at
best an ineffective principle that might well be done
By "phonology," Halle clearly means disparity
between inputs and outputs of a phonological system.
Optimality theory would fail to predict such deviations from an
underlying form only if there were no markedness
. In OT, input-output disparity is normally
understood as the result of markedness constraints being
ranked over faithfulness constraints (M >> F).
[edit]Theories within optimality theory
In practice, implementations of optimality theory often
assume other related theories, such as Syllable theory,
Moraic theory, or Feature Geometry. Completely distinct from
these, there are sub-theories which have been proposed
entirely within optimality theory, such as positional
faithfulness theory, correspondence theory (McCarthy &
Prince 1995), sympathy theory, and a number of theories of
learnability, most notably by Bruce Tesar. There are also a
range of theories specifically about optimality theory. These
are concerned with issues like the possible formulations of
constraints, and constraint interactions other than strict
[edit]Use outside of phonology
Optimality Theory is most commonly associated with the field
of Phonology, but has also been applied to other areas of
linguistics. Jane Grimshaw, Geraldine Legendre and Joan
Bresnan are well known for developing instantiations of OT
within syntax.
Optimality theoretic approaches are also
relatively prominent in morphology (and the morphology-
phonology interface in particular)
1. ^ Chomsky (1995)
2. ^ Dresher (1996)
3. ^ Halle (1995)
4. ^ Idsardi (2000)
5. ^ Idsardi (2006)
6. ^ Kager, René (1999). Optimality theory. Section 1.4.4: Fear of infinity, pp. 25–27.
7. ^ Prince, Alan and Paul Smolensky. (2004): Optimality Theory: Constraint Interaction in Generative Grammar.
Section 10.1.1: Fear of Optimization, pp. 215–217.
8. ^ de Lacy (editor). (2007). The Cambridge Handbook of Phonology, p. 1.
9. ^ Chomsky (1995)
10. ^ Halle (1995)
11. ^ Prince (2007), pp. 22-24
12. ^ McCarthy, John (2001). A Thematic Guide to Optimality Theory, Chapter 4: Connections of Optimality
13. ^ Legendre, Grimshaw & Vikner (2001)
14. ^ Trommer (2001)
15. ^ Wolf (2008)
 Brasoveanu, Adrian, and Alan Prince (2005). Ranking & Necessity. ROA-794.
 Chomsky (1995). The Minimalist Program. Cambridge, MA: The MIT Press.
 Dresher, Bezalel Elan (1996): The Rise of Optimality Theory in First Century Palestine. GLOT International 2, 1/2,
January/February 1996, page 8 (a humorous introduction for novices)
 Halle, Morris (1995). Feature Geometry and Feature Spreading. Linguistic Inquiry 26, 1-46.
 Heinz, Jeffrey, Greg Kobele, and Jason Riggle (2009). Evaluating the Complexity of Optimality Theory. Linguistic
Inquiry 40, 277-288.
 Idsardi, William J. (2006). A Simple Proof that Optimality Theory is Computationally Intractable. Linguistic
Inquiry 37:271-275.
 Idsardi, William J. (2000). Clarifying opacity. The Linguistic Review 17:377-50.
 Kager, René (1999). Optimality Theory. Cambridge: Cambridge University Press.
 Kornai, Andras (2006). Is OT NP-hard?. ROA-838.
 Legendre, Géraldine, Jane Grimshaw and Sten Vikner. (2001). Optimality-theoretic syntax. MIT Press.
 McCarthy, John (2007). Hidden Generalizations: Phonological Opacity in Optimality Theory. London: Equinox.
 McCarthy, John (2001). A Thematic Guide to Optimality Theory. Cambridge: Cambridge University Press.
 McCarthy, John and Alan Prince (1993): Prosodic Morphology: Constraint Interaction and Satisfaction. Rutgers
University Center for Cognitive Science Technical Report 3.
 McCarthy, John and Alan Prince (1994): The Emergence of the Unmarked: Optimality in Prosodic Morphology.
Proceedings of NELS.
 McCarthy, John J. & Alan Prince. (1995). Faithfulness and reduplicative identity. In J. Beckman, L. W. Dickey, & S.
Urbanczyk (Eds.), University of Massachusetts occasional papers in linguistics (Vol. 18, pp. 249–384). Amherst, MA:
GLSA Publications.
 Moreton, Elliott (2004): Non-computable Functions in Optimality Theory. Ms. from 1999, published 2004 in John J.
McCarthy (ed.), Optimality Theory in Phonology.
 Prince, Alan (2007). The Pursuit of Theory. In Paul de Lacy, ed., Cambridge Handbook of Phonology.
 Prince, Alan (2002). Entailed Ranking Arguments. ROA-500.
 Prince, Alan (2002). Arguing Optimality. In Coetzee, Andries, Angela Carpenter and Paul de Lacy (eds). Papers in
Optimality Theory II. GLSA, UMass. Amherst. ROA-536.
 Prince, Alan and Paul Smolensky. (1993/2002/2004): Optimality Theory: Constraint Interaction in Generative Grammar.
Blackwell Publishers (2004) [1](2002). Technical Report, Rutgers University Center for Cognitive Science and Computer
Science Department, University of Colorado at Boulder (1993).
 Trommer, Jochen. (2001). Distributed Optimality. PhD dissertation, Universität Potsdam.
 Wolf, Matthew. (2008). Optimal Interleaving: Serial Phonology-Morphology Interaction in a Constraint-Based Model. PhD
dissertation, University of Massachusetts. ROA-996.
[edit]External links
 Rutgers University Optimality Archive
 Optimality Theory and the Three Laws of Robotics
 OT Syntax: an interview with Jane Grimshaw
View page ratings
Rate this page
What's this?
I am highly knowledgeable about this topic (optional)
Submit ratings
 Linguistics
 Phonology
 Log in / create account
 Article
 Talk
 Read
 Edit
 View history

 Main page
 Contents
 Featured content
 Current events
 Random article
 Donate to Wikipedia
 Help
 About Wikipedia
 Community portal
 Recent changes
 Contact Wikipedia
 Deitsch
 Deutsch
 Español
 Esperanto
 یس را ف
 한국어
 Nederlands
 日本語
 Norsk (bokmål)
 Norsk (nynorsk)
 Русский
 This page was last modified on 6 February 2012 at 04:10.
 Text is available under the Creative Commons Attribution-ShareAlike
License; additional terms may apply. See Terms of use for details.
Wikipedia® is a registered trademark of the Wikimedia Foundation,
Inc., a non-profit organization.
 Contact us
 Privacy policy
 About Wikipedia
 Disclaimers
 Mobile view

Sign up to vote on this title
UsefulNot useful