
Introduction to packrat parsing for PEGs (Parsing Expression Grammars)

PyCon APAC 2011, Singapore
Gavin Bong



roadmap
Motivation           04 mins
PEG theory           05 mins
pyparsing            16 mins
PyMeta               07 mins
PyPy rlib/parsing    01 min
Closing              01 min

Total                34 mins

motivation
How to parse texts with PEGs:
- Natural languages (NLTK)
- Mini languages (DSLs)
- Structured / unstructured file formats

4 thoughts:
i.   Aren't structured formats like JSON, XML, HTML well served by existing parsers?
ii.  Parsing log files & configuration files is easy with Python.
iii. Regular expressions are good enough.
iv.  What is wrong with the classical way of writing parsers?

CFG (Context Free Grammars)


In formal language theory, CFGs are suitable for modeling both natural & computer languages. BNF is the de facto notation for describing the syntax of CFGs.

EBNF example:

if_stmt ::= "if" expression ":" suite
            ( "elif" expression ":" suite )*
            [ "else" ":" suite ]

EBNF gives you sequence, decision (choice), repetition and recursion. Original BNF only supported recursion; repetition had to be written recursively, e.g.

S ::= S a | a

CFG & Ambiguity


CFGs are potentially ambiguous. The dangling else problem:

1 if( x > 5 )
2   if( y > 5 )
3     console.log("heaven");
4   else console.log("limbo");

[Figure: AST #1, a tree diagram of one possible parse of the dangling else]

CFG & Ambiguity (2)


[Figure: AST #2, a tree diagram of the alternative parse of the same input]

Definitions
Parse trees vs AST
Parse tree = concrete: keeps whitespace, braces, semicolons; nodes are nonterminals from the grammar.
AST = abstract: uses tree nodes specific to language constructs.

Top-Down
= begin with the start nonterminal
= work down the parse tree

vs

Bottom-up
= identify terminals
= infer nonterminals
= climb the parse tree

Definitions (2)
Recursive descent parsing
* A top-down parser constructed from recursive functions.
* Each function represents a rule in the grammar.

version ::= <digit> '.' <digit>
digit   ::= '0' | '1' | ... | '9'

def version(source, position=0):
    digit(source, position)
    period(source, position + 1)
    digit(source, position + 2)

Run: (pymeta) nosetests --nocapture -v test_rdp_list.py

Recursive Descent Parsing


def expect(source, position, comparator):
    try:
        expecting, msg = comparator
        if not expecting(source[0]):
            raise ParseError(position, msg)
        source.popleft()   # consume !
    except IndexError:
        raise EOFError(position)

def digit(source, position):
    # this_rule(): helper (defined elsewhere) that builds the 'expected <...>' message
    fn = (lambda t: t in string.digits, this_rule())
    expect(source, position, fn)

def period(source, position):
    fn = (lambda t: t == '.', this_rule())
    expect(source, position, fn)

Recursive Descent Parsing (2)


>>> import collections
>>> version(collections.deque('1.6'))
>>> version(collections.deque('A.6'))
ParseError: (0, 'expected <digit>')
>>> version(collections.deque('1,6'))
ParseError: (1, 'expected <period>')
>>> version(collections.deque('1.'))
EOFError: (2, [('message', 'end of input')])

Classical method of parsing


1. Flesh out a grammar in BNF
2. Lexical analysis phase:
   lexer( patterns, stream-of-characters ) => stream of tokens
3. Parsing phase:
   parser( grammar, stream-of-tokens ) => parse tree / AST
4. Use your parser

Specific to LALR(1) bottom-up parsers.

Photo attribution: http://www.flickr.com/photos/j_aroche/2160902499/

Spectrum of parsing solutions


Regex -> Handwritten recursive descent parsers -> PEG parsers -> ANTLR -> Lex / Yacc parser generators (GNU flex/bison)

Other python parsing toolkits


PLY, Yapps, funcparserlib
http://wiki.python.org/moin/LanguageParsing

PEG
- Formalized by Bryan Ford in 2002-2004.
- Grammar mimics a recursive descent parser (+ backtracking).
- Scanner-less.
- A PEG grammar consists of a set of parsing expressions of the form A <- e.
- One expression is denoted the starting expression.

PEG operators (!= EBNF):
  e1 e2     Sequence
  e1 / e2   Ordered choice
  e* e+ e?  Repetition
  &e !e     Predicates

PEG's ordered choice


S <- Hitch / Hitchens

Q. Given an input string of "Hitchens", what is the result of the parse?

Law #1: Given an input A, a parsing expression matches a prefix A' of A or fails.
Law #2: A rule S <- M / N will first try to parse M; if that fails, it backtracks & looks for N.

Answer: Hitch
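A minimal sketch (not from the slides) of the same rule in pyparsing, where | is ordered choice and parseString stops after the matched prefix:

from pyparsing import Literal

S = Literal("Hitch") | Literal("Hitchens")   # ordered choice: the first alternative that matches wins
print S.parseString("Hitchens")              # -> ['Hitch']  (only the prefix is consumed)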

PEG vs CFG
                                      PEG          CFG
Syntax definition philosophy          Analytical   Generative
Choice (e1 / e2)                      Ordered      Commutative
Handles ambiguous grammars            No           Yes
Requires a lexical analysis phase?    No           Yes (lex/yacc)
Left recursion*                       No           Yes

* Warth et al. (2008): packrat parsers can support left recursion.

PEG & Packrat parsing


Context: recursive descent parsing with backtracking.
Problem: an input substring might be re-parsed during backtracking.

grammar ::= AB | AC

Solution: memoization guarantees linear time performance.

Neotoma cinerea (a packrat)
Photo attribution: http://en.wikipedia.org/wiki/File:Neotoma_cinerea.jpg
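A minimal illustrative sketch (not the talk's code) of the packrat idea: memoize each (rule, position) result so that backtracking from A B to A C never re-parses A.

memo = {}

def rule_A(text, pos):
    key = ('A', pos)
    if key not in memo:                        # each (rule, position) is parsed at most once
        memo[key] = pos + 1 if text[pos:pos+1] == 'a' else None
    return memo[key]

def grammar(text, pos=0):                      # grammar ::= A B | A C
    p = rule_A(text, pos)                      # first alternative: A B
    if p is not None and text[p:p+1] == 'b':
        return p + 1
    p = rule_A(text, pos)                      # backtrack to A C: the memoized A is reused
    if p is not None and text[p:p+1] == 'c':
        return p + 1
    return None

print grammar("ac")    # -> 2 ; rule_A did its work only once at position 0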

case study #1 problem statement


Parse modern Japanese dates in various formats.

If the date parses successfully, convert it to its equivalent datetime.date instance.



case study #1 : The four ERAs


HEISEI   (Akihito)      1989 Jan 8  - present
SHOWA    (Hirohito)     1926 Dec 25 - 1989 Jan 7
TAISHOU  (Yoshihito)    1912 Jul 30 - 1926 Dec 24
MEIJI    (Mutsuhito)    1868 Sep 8  - 1912 Jul 29

case study #1 : liberties taken


1. No support for days-of-the-week tagged onto the end.
2. Numbers use western digits, not kanji.
3. Some eras have overlapping days. Ignore.
4. For the 1st year of an era, no support for gannen.

case study #1 : initial attempt

from pyparsing import Literal, Word, nums

year       = Literal( u'\u5e74' )
month      = Literal( u'\u6708' )
day        = Literal( u'\u65e5' )
heisei_era = Literal( u'\u5e73\u6210' )
integer    = Word(nums)              # later tightened to Word(nums, exact=2)

case study #1 : initial attempt (2)

day_spec   = integer.setResultsName('dd') + day
month_spec = integer('mm') + month

western_year  = integer('yyyy') + year
imperial_year = heisei_era + western_year
year_spec = (imperial_year('imperial') | western_year('western'))
grammar   = year_spec + month_spec + day_spec

case study #1 : initial attempt (3)


result = grammar.parseString(japanese_date)
print result.dump()

pyparsing : introduction
- Easy-to-use, PEG-based text parser.
- Grammar definitions are written in Python.
- Framework distributed as one file: pyparsing.py
- Runs on both Python 2.x & 3.x. Future releases after 1.5.x will focus on Python 3.x only.
- Not classified as recursive descent!

pyparsing : framework overview


pyparsing & PEGs : correlation


PEG        pyparsing
e1 e2      e1 + e2  == And( [e1, e2] )
e1 / e2    e1 | e2  == MatchFirst( [e1, e2] )
e*         ZeroOrMore( e )
e+         OneOrMore( e )
e?         Optional( e )
&e         FollowedBy( e )
!e         ~e  == NotAny( e )
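A small illustrative sketch (not from the slides) of the table above: the PEG expression  a* !b  written with pyparsing combinators.

from pyparsing import Literal, ZeroOrMore, NotAny

A = ZeroOrMore(Literal('a')) + NotAny(Literal('b'))   # PEG:  a* !b
print A.parseString("aaac")                            # -> ['a', 'a', 'a']
# A.parseString("ab") would raise ParseException: the !b predicate fails after a*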

pyparsing : framework overview


pyparsing : ordered choice


MatchFirst will short-circuit as soon as a match is found. It is not commutative.

Shadowing literals, where one is a substring of the other, should be avoided.

Keywords behave differently.
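A hedged sketch (not from the slides) of why keywords are different: Literal matches any prefix, while Keyword requires a word boundary, so the shorter alternative no longer shadows the longer one.

from pyparsing import Literal, Keyword

expr = Literal("Hitch") | Literal("Hitchens")
print expr.parseString("Hitchens")     # -> ['Hitch']    : the shorter literal shadows the longer one

kw = Keyword("Hitch") | Keyword("Hitchens")
print kw.parseString("Hitchens")       # -> ['Hitchens'] : Keyword('Hitch') refuses to match mid-word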

pyparsing : backtracking
Or forces the parser to make an exhaustive search of the alternatives (match longest).

Or might introduce ambiguities.

No better than non-PEG parsers. Tweak the order of alternatives & put the most probable (e.g. by frequency of occurrence) first; this avoids wasteful backtracking.
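A minimal sketch (not from the slides) contrasting Or with MatchFirst on the earlier example:

from pyparsing import Literal, Or

word = Or([Literal("Hitch"), Literal("Hitchens")])   # exhaustive search, match longest
print word.parseString("Hitchens")                   # -> ['Hitchens']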

pyparsing : backtracking
Ballon d'Or 2011 example
p1, p2, p3, p4, p5 = map(Literal, ['ronaldo', 'messi', 'park-ji-sung', 'xavi', 'iniesta'])

first  = p2 + p1 + p4
second = p2 + p1 + p5
third  = p2 + p1 + p3
grammar = first | second | third

print grammar.parseString("messi ronaldo park-ji-sung")

pyparsing : backtracking


pyparsing : left factored


p1, p2, p3, p4, p5 = map(Literal, ['ronaldo', 'messi', 'park-ji-sung', 'xavi', 'iniesta'])

absolute_certainty = p2 + p1
too_close_to_call  = p4 | p5 | p3
grammar = absolute_certainty + too_close_to_call

print grammar.parseString("messi ronaldo park-ji-sung")

pyparsing : packrat
Memoization must be manually turned on:

ParserElement.enablePackrat()

Caches:
a. ParseResults
b. Exceptions thrown

Caveat emptor: grammars with parse actions that have side effects do not always play well with memoization turned on.

Run: python select_parser.py
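A minimal usage sketch, assuming pyparsing 1.5.x: enablePackrat() is a one-time, class-level switch, best called before the grammar is exercised.

from pyparsing import ParserElement

ParserElement.enablePackrat()   # turn on packrat memoization for all parser elements
# ... build the grammar and call parseString() as usual; results and exceptions are now cached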

pyparsing : semantic actions


In pyparsing parlance, a ParserElement can have zero or more parse actions.

4 forms of parse actions:
fn(s, loc, toks)   fn(loc, toks)   fn(toks)   fn()

Usage:
ParserElement.setParseAction( *fn )
ParserElement.addParseAction( *fn )

Uses:
1. Perform validation (see ParseException)
2. Process the matched token(s) & modify them. Returning a value overwrites the matched token(s).
3. Annotate with custom types (corollary of #2)

case study #1 : Semantic action


All users of the integer expression will inherit the parse action:

integer = Word(nums).setParseAction(lambda t: int(t[0]))

Selective assignment of parse actions to copies:

integer.copy().addParseAction( .. )
integer( 'result_name' ).addParseAction( .. )

def range_check(toks):
    month = int(toks[0])
    if month <= 0 or month >= 13:
        raise ParseException('month must be in range 1..12')

month_spec = integer('m').addParseAction(range_check) + month

Show: japan_simple.py

case study #1 : test files


imperial.utf8
western.utf8

case study #1 : complete solution


Demo:

@traceParseAction
def convert_kanji_year(toks):
    if 'imperial' in toks.keys():
        year = toks.imperial.yearZero + toks.imperial.yy
        toks['era'] = toks.imperial.type_
        toks['yyyy'] = year
    elif 'western' in toks.keys():
        year = toks.yyyy
    try:
        toks['modernDate'] = date(year, toks.mm, toks.dd)
    except ValueError, error:
        raise ParseException(error.args[0])

Show: japan_dates.py

case study #2 problem statement


Parse Gmail search criteria.

Supports a tiny subset of the full grammar:

from : ( <sender> )
label : inbox    -label : sent
yyyyy  -yyyyy
zzzzz  -zzzzz

case study #2: example strings


from : ( bruno manser )
from : ( bruno.manser@swiss.org )
from : ( @swiss.org )
label : sarawak    -label : not-urgent
penan injustice -logging

case study #2: email addresses

emailfull = Regex(r"(?P<user>[A-Za-z0-9._%+-]+)@"
                  r"(?P<hostname>[A-Za-z0-9.-]+)\.(?P<tld>[A-Za-z]{2,4})")

emailpartial = Regex(r"@(?P<hostname>[A-Za-z0-9.-]+)\."
                     r"(?P<tld>[A-Za-z]{2,4})")

email = (emailpartial | emailfull)
squeeze = lambda t: ' '.join( t[0].split() )
name = ZeroOrMore(Word(alphanums + ' ')).setParseAction( squeeze )

case study #2: email addresses


opener, closer, colon = map(Suppress, '():')
enclosed = email | name
nested = opener + enclosed + closer
grammar_email = Combine(Suppress('from') + colon + nested)

case study #2: email addresses


result = grammar_email.parseString('from:(bruno.manser.25@borneo.org)')
print result.dump()

result = grammar_email.parseString('from:( Marco de Gasperi )')
print result.dump()

Run: nosetests -v testFromTo.py

case study #2: labels


GOAL: group the excluded and included labels into their own sub-lists.
E.g.  label : fukushima1   -label : aloo-gobi

hyphen = Suppress('-')

# delimitedList(expr, delim, combine=True) == Combine( expr + ZeroOrMore( delim + expr ) )
label_rhs = delimitedList(Word(alphanums), delim='-', combine=True)
label_include = Combine( Suppress('label') + colon + label_rhs )
label_exclude = Combine( hyphen + label_include )
label_all = MatchFirst([
    label_exclude.setResultsName('labels.exclude', listAllMatches=True),
    label_include('labels.include*')])      # the 'name*' shorthand needs pyparsing 1.5.6
grammar_label = ZeroOrMore( label_all )

case study #2: labels

result = grammar_label.parseString('-label:fukushima1 label:onagawa '
                                   '-label:aloo-gobi label:cheese-naan')
print result.dump()

Question. Will this grammar work if the user entered LABEL instead of label?
Answer. Not as written; use CaselessLiteral('label').

case study #2: search strings


GOAL: group the excluded and included search strings into their own sub-lists.
E.g.  rumi  -"jack kerouac"

key_single = Word(alphanums)
key_quoted = quotedString.setParseAction(removeQuotes)
key_included = key_quoted | key_single
key_excluded = Combine(hyphen + key_included)
key_all = MatchFirst([ key_excluded("key.exclude*"),
                       key_included("key.include*") ])
grammar_key = ZeroOrMore( key_all )

case study #2: search strings


result = grammar_key.parseString(' -osama obama -"bin laden" "white house" ')
print result.dump()

Question. If the user entered single instead of double quotes, will it conform to the grammar?
Answer. Yes (pyparsing's quotedString matches both single- and double-quoted strings).

case study #2: Final solution


Let's compose all the individual pieces together.

email_all = grammar_email('from*')
gmail = (ZeroOrMore(email_all | label_all | key_all)
         + Suppress(restOfLine))

# tweak so each from-result stays grouped:
nested = opener + Group(enclosed) + closer

result = gmail.parseString('love label:writing-tips "bird by bird" '
                           'from:(Anne Lamott) -"dalai lama" -label:macchu-pichu '
                           'from:(agnes.obel@sparrow.net) -label:french-guiana '
                           '-"epictetus" label:yoga "bugle podcast" label '
                           'from:(@microsoft.com)')
print result.dump()

case study #2: Final solution


['love', 'writing-tips', 'bird by bird', 'Anne Lamott', 'dalai lama', 'macchu-pichu', '@microsoft.com', 'agnes.obel@sparrow.net', 'french-guiana', 'epictetus', 'yoga', 'bugle podcast', 'label']
- from: ['Anne Lamott', '@microsoft.com', 'agnes.obel@sparrow.net']
- key.exclude: ['dalai lama', 'epictetus']
- key.include: ['love', 'bird by bird', 'bugle podcast', 'label']
- labels.exclude: ['macchu-pichu', 'french-guiana']
- labels.include: ['writing-tips', 'yoga']

pyparsing: Recursion
A grammar is recursive when there exists a nonterminal which has itself on the right-hand side of its production rule.

number ::= digit rest
rest   ::= digit rest | empty

digit = Word(nums, exact=1).setName('1-digit')
rest = Forward()
rest << Optional(digit + rest)
number = Combine(digit + rest, adjacent=False)('digit-list')
grammar = number.setParseAction(lambda t: int(t[0])) + Suppress(restOfLine)

Run
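A hypothetical run of the grammar above (assuming the pyparsing imports from the earlier slides):

print grammar.parseString("8086 and the rest of the line is discarded")
# -> [8086]  : the digit list is combined, converted to int, and restOfLine is suppressed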

case study #3: binary tree


Parse parentheses notation for binary trees:

(nil,4,nil)
((nil,2,(nil,3,nil)),4,((nil,5,(nil,6,nil)),7,nil))

[Figure: the corresponding binary tree containing the nodes 4, 2, 3, 5, 6, 7]

Convert it to list notation in Python.

case study #3: recursive solution


BNF
node ::= '(' node ',' number ',' node ')' | empty

Code

left, right, comma = map(Suppress, '(),')
empty = (CaselessLiteral('nil')
         .setParseAction(replaceWith(None)))
tree = Forward()
value = Word(nums).setParseAction(lambda t: int(t[0]))
tree << ((left + Group(tree) + bookend(value) + Group(tree) + right)
         | empty)   # bookend(): helper from the talk's code that wraps the value with its comma delimiters

Run

case study #3: recursive solution


Input:
((nil,2,(nil,3,nil)),4,((nil,5,(nil,6,nil)),7,nil))

Output:
[[[None], 2, [[None], 3, [None]]], 4, [[[None], 5, [[None], 6, [None]]], 7, [None]]]

How to fix it: re-implement Group in Group(tree)

class TreeGroup(TokenConverter):
    def postParse(self, instring, loc, tokenlist):
        if len(tokenlist) == 1 and tokenlist[0] is None:
            return tokenlist
        else:
            return [tokenlist]

pyparsing : left recursion


pyparsing does not support left recursion.

term ::= \d+
expr ::= expr + term | term

pyparsing will raise a RuntimeError with the message 'maximum recursion depth exceeded'.

@raises(RecursiveGrammarException)
def test_left_recursion(self):
    expr.validate()

Eliminate left recursion if you want it to work in pyparsing.

Run
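A small sketch (not from the slides) of the same grammar after eliminating left recursion, which pyparsing handles happily:

from pyparsing import Word, nums, Literal, ZeroOrMore

term = Word(nums)
expr = term + ZeroOrMore(Literal('+') + term)   # expr ::= term ('+' term)*
print expr.parseString("1+2+3")                 # -> ['1', '+', '2', '+', '3']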

PyMeta : introduction
- OMeta is a language prototyping system (PEG), implemented in several programming languages.
- Packrat memoization.
- Grammar: a BNF dialect (with host-language snippets).
- Object-oriented: inheritance, overriding rules.

lowercase ::= <char_range 'a' 'z'>

def rule_lowercase():
    # ..body..

- <anything> consumes one object from the input stream (c.f. '.' in regex).
- Built-in rules: <letter> <digit> <letterOrDigit> <token '?'>

PEGs & PyMeta


PEG        PyMeta
e1 e2      e1 e2
e1 / e2    e1 | e2
e*         e*
e+         e+
e?         e?
&e         ~~e
!e         ~e

Syntactic predicates (&e, !e) give unlimited lookahead.

case study #1 : in PyMeta


Modest goals:
a) recognize western and Heisei imperial dates
b) read & parse both imperial.utf8 & western.utf8

Separate files:
common.py            : common rules & utilities
western_dates.py     : grammar to recognize western dates
era_heisei.py        : grammar to recognize Heisei dates
japan_date_parser.py : final grammar

case study #1 : in PyMeta pt A


common.py

from pymeta.grammar import OMeta

baseGrammar = r"""
# common literals for all ERAs
year  ::= <token u'\u5E74'>
month ::= <token u'\u6708'>
day   ::= <token u'\u65E5'>

range_num :min :max ::= <digit>+:m
                        ?(int(join(m)) >= min & int(join(m)) <= max) => m
rest_of_line   ::= <anything>* <token '\n'>?  => None
empty_line     ::= <spaces> <rest_of_line>    => None
python_comment ::= <token '#'> <rest_of_line> => None
"""

def join(x):
    return ''.join(x)

JapanCommonParser = OMeta.makeGrammar(baseGrammar, globals(), "JapanCommonParser")

case study #1 : in PyMeta pt B


western_dates.py

westernGrammar = r"""
western ::= <spaces> <digit>+:y <year>
            <range_num 1 12>:m <month>
            <range_num 1 31>:d <day>
            <rest_of_line>
         => westernized(int(join(y)), int(join(m)), int(join(d)))
grammar ::= <python_comment> | <western>"""

def westernized(yyyy, mm, dd):
    retval = JapanDate()
    retval['western'] = date(yyyy, mm, dd)
    return retval

WesternParser = JapanCommonParser.makeGrammar(westernGrammar, globals(), 'WesternParser')

case study #1 : in PyMeta pt C


era_heisei.py

era_heisei = Era('Heisei', 'Akihito',
                 (u'\u5E73\u6210', u'\u337B'),
                 startDate=date(1989, 1, 8))

def heisei_year_ok(yy):
    return (yy >= 1 and yy <= era_heisei.maxYearUnit)

def collect(yy, mm, dd):
    retval = JapanDate()
    retval['imperial'] = date(era_heisei.yearZero + yy, mm, dd)
    retval['era'] = [era_heisei.name, yy]
    return retval

case study #1: in PyMeta pt C (2)


era_heisei.py (continued)

heiseiGrammar = r"""
hlong  ::= <token u'\u5e73\u6210'>
hshort ::= <token u'\u337b'>
heisei ::= (<hlong> | <hshort>) <digit>+:y
           ?(heisei_year_ok(int(join(y)))) <year>
           <range_num 1 12>:m <month>
           <range_num 1 31>:d <day>
           <rest_of_line>
        => collect(int(join(y)), int(join(m)), int(join(d)))
"""
HeiseiParser = JapanCommonParser.makeGrammar(heiseiGrammar, globals(), 'HeiseiParser')

case study #1 : in PyMeta pt D


japan_date_parser.py

finalGrammar = r"""
# override 'grammar' in WesternParser
grammar ::= <super> | <heisei> | <empty_line>"""

class BaseParser(HeiseiParser, WesternParser):
    pass

BaseParser.globals.update(WesternParser.globals)
BaseParser.globals.update(HeiseiParser.globals)

JapanDateParser = BaseParser.makeGrammar(finalGrammar, globals(), "JapanDateParser")

case study #1 : in PyMeta pt D (2)


japan_date_parser.py (continued)

def parse_file(filename):
    # iterate through each line
    # .... snipped ...
    parser = JapanDateParser(line)
    result, error = parser.apply('grammar')
    # .... snipped ...

results = parse_file('imperial.utf8')
results = parse_file('western.utf8')

Run

case study #1 : PyMeta output


PyMeta : Left Recursion


PyMeta can handle left recursion.

recursiveGrammar = r"""
num   ::= <num>:n <digit>:d => n * 10 + d
        | <digit>
digit ::= :d ?((d >= '0') & (d <= '9')) => int(d)"""

Quiz. Is the following grammar equivalent?

num ::= <digit> | <num>:n <digit>:d => n * 10 + d

Run

PyMeta : Matching objects


Matching a python list:

listGrammar = """
digit  ::= :x ?(x.isdigit())         => int(x)
interp ::= [<digit>:x '+' <digit>:y] => x + y
"""

g = OMeta.makeGrammar(listGrammar, {})
parser = g( [['600', '+', '66']] )    # the input is an iterable of objects
result, error = parser.apply('interp')
>>> result
666
>>> error
ParseError(2, [])

PyMeta : Matching objects (2)


Object graph (e.g. a tree): the python_rewriter project visits the AST created by the compiler module (Python 2.x) & regenerates the Python statement.

>>> import compiler
>>> print compiler.parse('import ctypes')
Module(None, Stmt([Import([('ctypes', None)])]))

import :i ::= <anything>:a ?(a.__class__ == Import)
           => 'import ' + ', '.join(import_match(a.names))

pyparsing vs PyMeta
Feature                          pyparsing                              PyMeta
Whitespace sensitive?            No; enable via leaveWhitespace()       Yes; use the <spaces> rule to eat whitespace
Left recursion                   No                                     Yes
Packrat memoization              Yes (off by default)                   Yes (only no-arg rules)
Operates on character streams    Yes                                    Yes
Operates on object streams       No                                     Yes
Syntactic predicates             Yes                                    Yes
Semantic predicates              No (see parse actions)                 Yes
Semantic actions                 Yes                                    Yes
Regex support                    Yes                                    No

PyPy rlib/parsing
Library for generating tokenizers & parsers in RPython.
Consists of: a regex / packrat parser, and a tree structure / EBNF parser.

Sample JSON EBNF:

NUMBER: "\-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][\+\-]?[0-9]+)?";
value: <STRING> | <NUMBER> | <object> | <array> | <"null"> | <"true"> | <"false">;
array: ["["] (value [","])* value ["]"];
entry: STRING [":"] value;

The resulting parse tree can be transformed or traversed with custom visitors, and visualized via dot.
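A rough usage sketch, based on my recollection of the rlib/parsing documentation (treat the exact names as assumptions): an EBNF string is turned into a parse function, and the resulting tree can be simplified with the generated ToAST visitor.

# Sketch only; names follow the rlib/parsing docs as I remember them, not verified here.
from pypy.rlib.parsing.ebnfparse import parse_ebnf, make_parse_function

regexs, rules, ToAST = parse_ebnf(ebnf)              # ebnf: an EBNF grammar string like the JSON sample above
parse = make_parse_function(regexs, rules, eof=True)
tree = parse(source)                                 # source: some input accepted by the grammar
tree = ToAST().transform(tree)                       # simplify the raw parse tree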

Topics not covered

- Usage of syntactic predicates
- Parsing grammars of mathematical expressions in order to preserve operator precedence
- Handling indents/dedents in order to parse indentation-sensitive languages (e.g. CoffeeScript, Python, Haskell)

Resources
pyparsing
  http://pyparsing.wikispaces.com/
  https://github.com/marcua/tweeql

PyMeta
  http://www.tinlizzie.org/ometa/
  http://gitorious.org/python-decompiler/python_rewriter

PyPy RPython parsing library
  http://doc.pypy.org/en/latest/rlib.html

rubycoder@gmail.com
