Linguistics and Formal Languages
Marina Ermolaeva
University of Chicago
Marina Ermolaeva (University of Chicago) Linguistics and formal languages LING 21020 | April 6, 2020 1 / 18
Hi!
Communication outside class
• Canvas
– homework, readings, and other class materials
– all announcements (please keep your notifications on!)
Sources
• Special mention:
Communication in class
• Chat messages
– for short questions and comments
• Nonverbal feedback
– to give me an idea of how the class is going
– I will sometimes ask you to use this feature for a quick poll
• Audio + video
– for extended discussion; please raise your hand first
– make sure to keep your mic muted when not talking
Background survey (preliminary results)
(if you haven’t taken the survey yet, please do so after class!)
Before we continue...
Language + Math = ?
• Computational linguistics
– often used as an umbrella term
• Mathematical linguistics
– applying mathematical methods to understand natural language
– formal language theory: languages as mathematical objects generated by rule systems
(source: xkcd)
Formalization in linguistics
Formal languages
• Examples:
• {a, b, c, ..., z} is the alphabet of Latin characters
• {the, be, to, of, and} is the alphabet of the five most common English words, according to Wikipedia
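The two alphabets above can be sketched in code to show that a symbol does not have to be a single character. This is a minimal illustration that assumes the standard definition of a string as a finite sequence of symbols drawn from the alphabet; the names `LATIN`, `COMMON_WORDS`, and `is_string_over` are made up for this example.

```python
from typing import Sequence, Set

# The two example alphabets: symbols may be single letters or whole words.
LATIN: Set[str] = set("abcdefghijklmnopqrstuvwxyz")
COMMON_WORDS: Set[str] = {"the", "be", "to", "of", "and"}


def is_string_over(symbols: Sequence[str], alphabet: Set[str]) -> bool:
    """Return True if every element of `symbols` belongs to `alphabet`.

    A string is represented as a sequence of symbols, so that
    multi-character symbols such as "the" count as one symbol.
    """
    return all(symbol in alphabet for symbol in symbols)


print(is_string_over(list("cat"), LATIN))                  # True
print(is_string_over(["to", "be", "the"], COMMON_WORDS))   # True
print(is_string_over(["to", "be", "cat"], COMMON_WORDS))   # False
```

Under this representation the empty sequence counts as a string over any alphabet, which matches the usual treatment of the empty string.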
Example: square brackets
• Let Σ = { [, ] }
• Let L be a language over Σ such that:
(1) [ ] ∈ L;
(2) For any string s ∈ L, [s] ∈ L;
(3) For any two strings s, t ∈ L, st ∈ L
[ [ [ ] ] ] (yes)
[ [ ] ] [ [ ] ] ] (no)
[ [ ] [ ] [ ] ] [ ] (yes)
(no)
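The membership judgments above can be checked mechanically. The following is a rough sketch, not an efficient recognizer: it applies rules (1) to (3) directly, trying every way a string could have been built. The function name `in_language` is hypothetical, and strings are written without the spaces used on the slide.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def in_language(s: str) -> bool:
    """Decide whether s is in L by applying the three rules directly."""
    # Rule (1): the string "[]" is in L.
    if s == "[]":
        return True
    # Every string in L has even length and at least two symbols.
    if len(s) < 2 or len(s) % 2 == 1:
        return False
    # Rule (2): s = "[" + t + "]" for some t in L.
    if s[0] == "[" and s[-1] == "]" and in_language(s[1:-1]):
        return True
    # Rule (3): s = t + u for some t, u in L (both non-empty, even length).
    return any(
        in_language(s[:i]) and in_language(s[i:])
        for i in range(2, len(s) - 1, 2)
    )


print(in_language("[[[]]]"))       # True
print(in_language("[[]][[]]]"))    # False
print(in_language("[[][][]][]"))   # True
```

Since rule (1) is the only base case, the empty string is not in L under this definition, and the sketch rejects it.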
Relevant questions
The Chomsky Hierarchy of formal languages
(also known as the Chomsky–Schützenberger hierarchy)
Recursively enumerable
ex: all square numbers
(each number n is a string of length n)
Context-sensitive
Finite
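The "square numbers" example above encodes each number n as a string of length n. A small sketch of a membership test for that language follows. It makes two assumptions that the slide does not state: the alphabet has a single symbol, written here as `a`, and 0 counts as a square number, so the empty string is accepted. The function name `is_unary_square` is hypothetical.

```python
import math


def is_unary_square(s: str, symbol: str = "a") -> bool:
    """Return True if s consists only of `symbol` and its length is a perfect square."""
    # Reject anything that is not a string over the one-symbol alphabet.
    if any(ch != symbol for ch in s):
        return False
    # Assumption: 0 counts as a square, so the empty string is accepted.
    root = math.isqrt(len(s))
    return root * root == len(s)


print(is_unary_square("a" * 9))   # True  (9 = 3 * 3)
print(is_unary_square("a" * 8))   # False
```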
What this course is NOT about
• Intro to linguistics
Background survey
• Regular expressions
• Finite-state automata
• Context-free languages
• Subregular languages
• Mildly context-sensitive languages
• International Phonetic Alphabet
• Phonemes and allophones
• Phonological rewriting rules
• Autosegmental phonology
• Syntactic constituents
• Phrase-structure rules
• Minimalist syntax
Background survey (formal language theory topics)
• Regular expressions
• Finite-state automata
• Context-free languages
• Subregular languages
• Mildly context-sensitive languages
What this course IS about
What these...
• Regular expressions
• Finite-state automata
• Subregular languages
• Context-free languages
• Mildly context-sensitive languages
• ...
Next time
• Formal grammars