You are on page 1of 47

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

net/publication/2833761

Automata and Formal Languages

Article · June 2003


Source: CiteSeer

CITATIONS READS

12 2,316

1 author:

Dominique Perrin
Ecole Supérieure d'Ingénieur en Electronique et Electrotechnique - Paris
155 PUBLICATIONS 3,227 CITATIONS

SEE PROFILE

All content following this page was uploaded by Dominique Perrin on 16 October 2013.

The user has requested enhancement of the downloaded file.


Automata and
Formal
Languages

Peter Wood

Motivation and
Background

Automata
Automata and Formal Languages Grammars

Regular
Expressions

Example of
Peter Wood Research

Conclusion
Department of Computer Science and Information Systems
Birkbeck, University of London
ptw@dcs.bbk.ac.uk
Automata and
Outline Formal
Languages

Peter Wood

Motivation and
Motivation and Background Background

Automata

Grammars
Automata
Regular
Expressions

Example of
Grammars Research

Conclusion

Regular Expressions

Example of Research

Conclusion
Automata and
Doing Research Formal
Languages

Peter Wood

Motivation and
Background

Automata
I analysing problems/languages Grammars
I computability/solvability/decidability Regular
Expressions
— is there an algorithm?
Example of
Research
I computational complexity
Conclusion
— is it practical?
I expressive power
— are there things that cannot be expressed?
I formal languages provide well-studied models
Automata and
Formal Languages Formal
Languages

Peter Wood

Motivation and
Background
I finite alphabet of symbols Σ Automata
— e.g., Σ = {0, 1} Grammars

I all finite strings over Σ denoted by Σ∗ Regular


Expressions
— e.g., Σ∗ = {, 0, 1, 00, 01, 10, 11, . . .} Example of
Research
I language L over Σ is just subset of Σ∗
Conclusion
— e.g., L1 : strings with an even number of 1’s
— e.g., L2 : strings representing valid Java programs
I are there finite representations for infinite languages?
Automata and
Formal Languages Formal
Languages

Peter Wood

Motivation and
Background
I finite alphabet of symbols Σ Automata
— e.g., Σ = {0, 1} Grammars

I all finite strings over Σ denoted by Σ∗ Regular


Expressions
— e.g., Σ∗ = {, 0, 1, 00, 01, 10, 11, . . .} Example of
Research
I language L over Σ is just subset of Σ∗
Conclusion
— e.g., L1 : strings with an even number of 1’s
— e.g., L2 : strings representing valid Java programs
I are there finite representations for infinite languages?
I yes, grammars (generative) and automata
(recognition)
Automata and
Automata Formal
Languages

Peter Wood

Motivation and
Background
I device (machine) for recognising (accepting) a Automata
language Grammars

I provide models of computation Regular


Expressions
I automaton comprises states and transitions between Example of
Research
states
Conclusion
I automaton is given a string as input
I automaton M accepts a string w by halting in an
accept state, when given w as input
I language L(M) accepted by automaton M is set of
strings which M accepts
Automata and
Types of Automata Formal
Languages

Peter Wood

Motivation and
Background

Automata

Grammars
I finite state automaton Regular
I deterministic Expressions

I nondeterministic Example of
Research
I pushdown automaton Conclusion

I linear-bounded automaton
I Turing machine
Automata and
Example of a Finite State Automaton Formal
Languages

Peter Wood

I L1 (strings with an even number of 1’s) can be Motivation and


Background
recognised by the following FSA Automata
I 2 states seven and sodd Grammars
I 4 transitions Regular
I seven is both the initial and final state Expressions

Example of
Research
0 0 Conclusion
1

seven sodd

1
Automata and
Grammars Formal
Languages

Peter Wood

Motivation and
I grammars generate languages using: Background

I symbols from alphabet Σ (called terminals) Automata

I set N of nonterminals (one designated as starting) Grammars


I set P of productions, each of the form Regular
Expressions

Example of
U→V Research

Conclusion
where U and V are (loosely) strings over Σ ∪ N
I a string (sequence of terminals) w is generated by G
if there is a derivation of w using G, starting from the
starting nonterminal of G
I language generated by grammar G, denoted L(G), is
the set of strings which can be derived using G
Automata and
Grammar Example Formal
Languages

Peter Wood
I L1 (strings with an even number of 1’s) can be
Motivation and
generated by grammar with productions Background

Automata

S →  Grammars

S → 0S Regular
Expressions

S → 1T Example of
Research
T → 0T Conclusion

T → 1S

where S is the starting nonterminal


Automata and
Grammar Example Formal
Languages

Peter Wood
I L1 (strings with an even number of 1’s) can be
Motivation and
generated by grammar with productions Background

Automata

S →  Grammars

S → 0S Regular
Expressions

S → 1T Example of
Research
T → 0T Conclusion

T → 1S

where S is the starting nonterminal


I a derivation of 01010 is given by

S ⇒ 0S ⇒ 01T ⇒ 010T ⇒ 0101S ⇒ 01010S ⇒ 01010


Automata and
Uses of Grammars Formal
Languages

Peter Wood

Motivation and
Background

Automata

Grammars
I to specify syntax of programming languages Regular
Expressions
I in natural language understanding
Example of
I in pattern recognition Research

Conclusion
I to specify schemas (types) for tree-structured data,
e.g., XML
I ...
Automata and
Hierarchy of Grammars and Languages Formal
Languages

Peter Wood

Motivation and
Background
I restrictions on productions give different types of
Automata
grammars
Grammars
I regular (type 3) Regular
I context-free (type 2) Expressions
I context-sensitive (type 1) Example of
Research
I phrase-structure (type 0)
Conclusion
I for context-free, e.g., left side must be single
nonterminal
I no restrictions for phrase-structure
I language is of type i iff there is a grammar of type i
which generates it
Automata and
Examples of Language Hierarchy Formal
Languages

Peter Wood

Motivation and
Background
I varying expressive power Automata

I regular ⊂ context-free ⊂ context-sensitive ⊂ Grammars

phrase-structure Regular
Expressions

Example of
Research

Conclusion
Automata and
Examples of Language Hierarchy Formal
Languages

Peter Wood

Motivation and
Background
I varying expressive power Automata

I regular ⊂ context-free ⊂ context-sensitive ⊂ Grammars

phrase-structure Regular
Expressions
I L1 (strings over {0, 1} with an even number of 1’s) is Example of
Research
regular
Conclusion
Automata and
Examples of Language Hierarchy Formal
Languages

Peter Wood

Motivation and
Background
I varying expressive power Automata

I regular ⊂ context-free ⊂ context-sensitive ⊂ Grammars

phrase-structure Regular
Expressions
I L1 (strings over {0, 1} with an even number of 1’s) is Example of
Research
regular
Conclusion
I L2 = {0n 1n | n ≥ 0} is context-free, but not regular
Automata and
Examples of Language Hierarchy Formal
Languages

Peter Wood

Motivation and
Background
I varying expressive power Automata

I regular ⊂ context-free ⊂ context-sensitive ⊂ Grammars

phrase-structure Regular
Expressions
I L1 (strings over {0, 1} with an even number of 1’s) is Example of
Research
regular
Conclusion
I L2 = {0n 1n | n ≥ 0} is context-free, but not regular
I L3 = {ww | w ∈ {0, 1}∗ } is context-sensitive, but not
context-free
Automata and
Examples of Language Hierarchy Formal
Languages

Peter Wood

Motivation and
Background
I varying expressive power Automata

I regular ⊂ context-free ⊂ context-sensitive ⊂ Grammars

phrase-structure Regular
Expressions
I L1 (strings over {0, 1} with an even number of 1’s) is Example of
Research
regular
Conclusion
I L2 = {0n 1n | n ≥ 0} is context-free, but not regular
I L3 = {ww | w ∈ {0, 1}∗ } is context-sensitive, but not
context-free
I there exists a phrase-structure (recursive) language
which is not context-sensitive
Automata and
Complexity of Grammar Problems Formal
Languages

Peter Wood

Problem Type Motivation and


Background
3 2 1 0 Automata
Is w ∈ L(G)? P P PSPACE U Grammars
Is L(G) empty? P P U U Regular
Expressions
Is L(G1 ) ≡ L(G2 )? PSPACE U U U
Example of
Research

Conclusion
I P: decidable in polynomial time
I PSPACE: decidable in polynomial space (and
complete for PSPACE: at least as hard as
NP-complete)
I U: undecidable
I so type of grammar has significant effect on
complexity
Automata and
Relationships between Languages and Formal
Languages
Automata Peter Wood

Motivation and
Background

Automata

Grammars

Regular
Expressions

Example of
A language is Research

Conclusion
 
regular finite-state
iff

 

context-free pushdown
 
accepted
context-sensitive linear-bounded
by

 

phrase-structure Turing machine
 
Automata and
Regular Expressions Formal
Languages

Peter Wood

Motivation and
Background

Automata
I algebraic notation for denoting regular languages Grammars

Regular
I use ◦ (concatenation), ∪ (union) and ∗ (closure) Expressions
operators Example of
Research
I L1 denoted by RE 0∗ ∪ (0∗ ◦1◦ 0∗ ◦1◦ 0∗ )∗ Conclusion
I given RE R, the set of strings it denotes is L(R)
I pattern matching in text
I query languages for XML or RDF
Automata and
Using Regular Expressions to Query Graphs Formal
Languages

Graphs (networks) are widely used for representing data Peter Wood

I social networks Motivation and


Background
I transportation and other networks Automata
I geographical information Grammars

Regular
I semistructured data Expressions

I (hyper)document structure Example of


Research
I semantic associations in criminal investigations Conclusion

I bibliographic citation analysis


I pathways in biological processes
I knowledge representation (e.g. semantic web)
I program analysis
I workflow systems
I data provenance
I ...
Automata and
Using Regular Expressions to Query Graphs Formal
Languages

Peter Wood

Motivation and
I (my PhD thesis!) Background

Automata
I usually regular expressions used for string search Grammars
I consider data represented by a directed graph of Regular
Expressions
labelled nodes and labelled edges
Example of
I regular expressions can express paths we are Research

Conclusion
interested in
I sequence of edge labels rather than sequence of
symbols (characters)
I a query using regular expression R can ask for all
nodes connected by a path whose concatenation of
edge labels is in L(R)
Automata and
Formal
Graph G (where nodes represent people and places): Languages

Peter Wood

citizenOf Motivation and


a SA Background

Automata

Grammars

locatedIn Regular
Expressions

Example of
Research
bornIn
b CT Conclusion

livesIn

bornIn
c UK
Automata and
Formal
Languages

Peter Wood

Motivation and
Background
Regular expression Automata

R = citizenOf | ((bornIn | livesIn) · locatedIn∗ ) Grammars

Regular
asks for paths of edges between a person x and a place Expressions

Example of
y such that Research

I x is a citizenOf y , or Conclusion

I x is bornIn or livesIn y , or
I x is bornIn or livesIn a place that is locatedIn y
Automata and
Regular path query evaluation Formal
Languages

Peter Wood

I R EGULAR PATH P ROBLEM Motivation and


Background
Given graph G, pair of nodes x and y and regular Automata
expression R, is there a path from x to y satisfying Grammars
R? Regular
Expressions
I algorithm:
Example of
I construct a nondeterministic finite automaton (NFA) Research

M accepting L(R) Conclusion


I assume M has initial state s0 and final state sf
I consider G as an NFA with initial state x and final
state y
I form the “intersection” I of M and G
I check if there is a path from (s0 , x) to (sf , y )
I Each step can be done in PTIME, so R EGULAR PATH
P ROBLEM has PTIME complexity
Automata and
Formal
Languages

Peter Wood

NFA M for R = citizenOf | ((bornIn | livesIn) · locatedIn∗ ) Motivation and


Background

Automata
bornIn Grammars

Regular
Expressions
start s0 s1 locatedIn Example of
livesIn Research

Conclusion

citizenOf 
sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Formal
Intersection of G and M Languages

Peter Wood
citizenOf
Motivation and
Background
 Automata
a, s0 SA, s1 SA, sf
Grammars

Regular
Expressions
locatedIn Example of
Research

bornIn  Conclusion
b, s0 CT , s1 CT , sf

livesIn
bornIn 
c, s0 UK , s1 UK , sf
Automata and
Regular simple path queries Formal
Languages

Peter Wood

I path p is simple if no node is repeated on p Motivation and


Background
I R EGULAR S IMPLE PATH P ROBLEM Automata

Given graph G, pair of nodes x and y and regular Grammars

expression R, is there a simple path from x to y Regular


Expressions
satisfying R? Example of
Research

Conclusion
Automata and
Regular simple path queries Formal
Languages

Peter Wood

I path p is simple if no node is repeated on p Motivation and


Background
I R EGULAR S IMPLE PATH P ROBLEM Automata

Given graph G, pair of nodes x and y and regular Grammars

expression R, is there a simple path from x to y Regular


Expressions
satisfying R? Example of
Research
I R EGULAR S IMPLE PATH P ROBLEM is NP-complete
Conclusion
[Mendelzon & Wood (1989)]
Automata and
Regular simple path queries Formal
Languages

Peter Wood

I path p is simple if no node is repeated on p Motivation and


Background
I R EGULAR S IMPLE PATH P ROBLEM Automata

Given graph G, pair of nodes x and y and regular Grammars

expression R, is there a simple path from x to y Regular


Expressions
satisfying R? Example of
Research
I R EGULAR S IMPLE PATH P ROBLEM is NP-complete
Conclusion
[Mendelzon & Wood (1989)]
I there can be a path from x to y satisfying R but no
simple path satisfying R, e.g., R = (c · d)∗
d

c
a b
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background

Automata

Grammars

Regular
Expressions

Example of
Research

Conclusion
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background
I the presence of cycles
Automata

Grammars

Regular
Expressions

Example of
Research

Conclusion
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background
I the presence of cycles
Automata
I obvious first step is to consider graphs without Grammars
cycles—DAGs Regular
Expressions

Example of
Research

Conclusion
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background
I the presence of cycles
Automata
I obvious first step is to consider graphs without Grammars
cycles—DAGs Regular
Expressions
I then might look at restricted forms of REs—those Example of
corresponding to languages closed under Research

abbreviations Conclusion
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background
I the presence of cycles
Automata
I obvious first step is to consider graphs without Grammars
cycles—DAGs Regular
Expressions
I then might look at restricted forms of REs—those Example of
corresponding to languages closed under Research

abbreviations Conclusion

I then one might consider a combination of graphs and


REs—those graphs whose cycle structure does not
conflict with RE
Automata and
Approaches to deal with this problem Formal
Languages

Peter Wood

I what causes the problem? Motivation and


Background
I the presence of cycles
Automata
I obvious first step is to consider graphs without Grammars
cycles—DAGs Regular
Expressions
I then might look at restricted forms of REs—those Example of
corresponding to languages closed under Research

abbreviations Conclusion

I then one might consider a combination of graphs and


REs—those graphs whose cycle structure does not
conflict with RE
I finally show that conflict-freedom is a generalisation:
I no RE conflicts with any DAG
I RE closed under abbreviations never conflicts with
any graph
Automata and
Other approaches Formal
Languages

Peter Wood

Motivation and
Background

Automata

Grammars
I in general, may also run experiments to measure Regular
actual running times Expressions

Example of
Research

Conclusion
Automata and
Other approaches Formal
Languages

Peter Wood

Motivation and
Background

Automata

Grammars
I in general, may also run experiments to measure Regular
actual running times Expressions

Example of
I may also develop approximation algorithms Research
I can sometimes find a PTIME algorithm with a Conclusion
performance guarantee (e.g. for TSP, finds a tour at
most twice the optimal distance)
I other times this problem itself is NP-hard
Automata and
Conclusion Formal
Languages

Peter Wood

Motivation and
Background
I is my system/language more powerful than others? Automata

Grammars
I is my system/language more efficient than others?
Regular
I expressive power or computational complexity can Expressions
be studied by relating them to Example of
Research
I formal language theory: languages, grammars,
Conclusion
automata, . . .
I tradeoff between expressive power and
computational complexity
I consider restrictions of difficult problems or giving up
exact solutions
Automata and
References Formal
Languages

Peter Wood

Motivation and
I Aho, Hopcroft and Ullman, “The Design and Analysis Background

of Computer Algorithms,” Addison-Wesley, 1974 Automata

Grammars
I Garey and Johnson, “Computers and Intractability: A
Regular
Guide to the Theory of NP-Completeness,” W. H. Expressions
Freeman and Company, 1979 Example of
Research
I Lewis and Papadimitriou, “Elements of the Theory of Conclusion
Computation,” Prentice-Hall, 1981
I Mendelzon and Wood, “Finding Regular Simple
Paths in Graph Databases,” SIAM J. Computing, Vol.
24. No. 6, 1995, pp. 1235–1258
I Sipser, “Introduction to the Theory of Computation,”
PWS Publishing Company, 1997

View publication stats

You might also like