Regex

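Identifiers:

\ escapes the character that follows it
\d any digit (0-9)
\D anything but a digit
\s any whitespace character (space, tab, newline)
\S anything but whitespace
\w any word character (letter, digit, or underscore)
\W anything but a word character
. any character except a newline
\. a literal period
\b word boundary (the zero-width position between a word character and a non-word character)
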
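The identifiers can be tried out directly with Python's re module. This is only a minimal sketch; the sample sentence and the variable names are made up for illustration.

```python
import re

text = "Order 66 shipped to Mr. Smith_2 on 4 May."

digits = re.findall(r"\d+", text)          # runs of digits
words = re.findall(r"\w+", text)           # runs of letters, digits, underscores
chunks = re.findall(r"\S+", text)          # runs of non-whitespace characters
literal_dots = re.findall(r"\.", text)     # real periods only
whole_word = re.findall(r"\bon\b", text)   # "on" as a standalone word

print(digits)        # ['66', '2', '4']
print(literal_dots)  # ['.', '.']
print(whole_word)    # ['on']
```

Note that \w treats the underscore as a word character, so "Smith_2" comes back as a single word.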
Modifiers:

{1,3} match between 1 and 3 repetitions
{x} match exactly x repetitions
+ match 1 or more
? match 0 or 1
* match 0 or more
$ match the end of a string
^ match the beginning of a string
| match either alternative, e.g. \d{1,3}|\w{5,6}
[] match any one character from a set or range, e.g. [A-Za-z] or [1-5a-qA-Z]

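A short sketch of how the modifiers behave in Python's re module. The sample strings are invented for illustration, and re.fullmatch is used where the whole string has to fit the pattern.

```python
import re

optional = re.findall(r"colou?r", "color colour colr")  # ? : zero or one 'u'
zero_or_more = re.findall(r"ab*", "a ab abbb")          # * : zero or more 'b'
one_or_more = re.findall(r"ab+", "a ab abbb")           # + : one or more 'b'
char_range = re.findall(r"[1-5a-q]+", "a7z3k")          # [] : a set of ranges

# | picks either alternative: 1-3 digits, or 5-6 word characters.
samples = ["7", "123", "1234", "hello", "worlds", "hi"]
either = [s for s in samples if re.fullmatch(r"\d{1,3}|\w{5,6}", s)]

# ^ and $ pin the pattern to the start and end of the string.
starts = bool(re.search(r"^Hello", "Hello world"))
ends = bool(re.search(r"Hello$", "Hello world"))

print(optional)      # ['color', 'colour']
print(zero_or_more)  # ['a', 'ab', 'abbb']
print(one_or_more)   # ['ab', 'abbb']
print(either)        # ['7', '123', 'hello', 'worlds']
```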
White Space Characters:

\n new line
\s any whitespace character
\t tab
\e escape (rare; not recognized by Python's re module)
\f form feed (rare)
\r carriage return

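The whitespace characters are easiest to see when splitting text. The sketch below uses an invented string that mixes tabs, a Windows-style line ending, a blank line, and a form feed.

```python
import re

raw = "name:\tAda\r\nrole:\tengineer\n\nnotes:\fnone"

tokens = re.split(r"\s+", raw)       # \s covers space, \t, \n, \r and \f
lines = re.split(r"\r?\n", raw)      # split on \n, with an optional \r before it
tab_count = len(re.findall(r"\t", raw))

print(tokens)     # ['name:', 'Ada', 'role:', 'engineer', 'notes:', 'none']
print(tab_count)  # 2
```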
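DON'T FORGET! These characters have a special meaning in a pattern and must be escaped with \ to match them literally:

. + * ? [ ] $ ^ ( ) { } \ |
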
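When the text to match is not known in advance, re.escape adds the backslashes for you. A minimal sketch, with an invented price string, showing what goes wrong without escaping:

```python
import re

haystack = "Total: $4.99 (approx.) today"
price = "$4.99 (approx.)"

# Unescaped, . is a wildcard, so "4.99" also matches "4x99".
wildcard_hit = re.search(r"4.99", "4x99") is not None
literal_miss = re.search(r"4\.99", "4x99") is None

# Used as a raw pattern, $ means end-of-string and ( ) form a group,
# so the price is not found in the haystack.
naive = re.search(price, haystack)

# re.escape backslash-escapes every special character in the string.
safe = re.search(re.escape(price), haystack)

print(naive)         # None
print(safe.group())  # $4.99 (approx.)
```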
Regular Expression - Re

Many regex functions and methods take an optional flags argument, which modifies how a given pattern is interpreted. Several flags can be combined with |.

Syntax  Long Syntax    Meaning

re.I    re.IGNORECASE  Ignores case.
re.M    re.MULTILINE   Makes ^ and $ match at the start and end of each line.
re.S    re.DOTALL      Makes . match a newline as well.
re.U    re.UNICODE     Makes \w, \W, \b, \B follow Unicode rules (already the default for str patterns in Python 3).
re.L    re.LOCALE      Makes \w, \W, \b, \B follow the current locale (bytes patterns only in Python 3).
re.X    re.VERBOSE     Allows whitespace and comments inside the pattern.

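The effect of each flag is easiest to see on a small multi-line string. This is a sketch with invented sample text, not an exhaustive tour of the flags.

```python
import re

text = "first line\nSecond LINE\nthird line"

any_case = re.findall(r"line", text, re.I)            # case-insensitive
first_only = re.findall(r"^\w+", text)                # ^ = start of string only
each_line = re.findall(r"^\w+", text, re.M)           # ^ = start of every line
no_dotall = re.search(r"first.*third", text)          # . stops at the newline
with_dotall = re.search(r"first.*third", text, re.S)  # . crosses newlines

# re.X lets the pattern carry whitespace and comments.
date = re.compile(r"""
    (\d{4})   # year
    -(\d{2})  # month
    -(\d{2})  # day
""", re.X)
parts = date.search("released 2024-03-15").groups()

# Flags combine with |.
combined = re.findall(r"^second \w+$", text, re.I | re.M)

print(any_case)   # ['line', 'LINE', 'line']
print(each_line)  # ['first', 'Second', 'third']
print(parts)      # ('2024', '03', '15')
print(combined)   # ['Second LINE']
```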
Regular Expression - Re2

The re2 package is a Python extension built on Google's RE2 regular expression library. For simple substitutions, the built-in re module has been found to work better than re2.

To install the re2 package, use the following command:

pip install re2

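Since re2 is an optional third-party package, code that wants to use it can fall back to the built-in re module when it is not installed. The sketch below assumes the installed re2 module exposes an re-style sub function; that is an assumption about the package, not something stated above, and the alias regex_engine is a made-up name.

```python
try:
    # Third-party package; assumed here to offer an re-compatible sub().
    import re2 as regex_engine
except ImportError:
    # Fall back to the standard library when re2 is not installed.
    import re as regex_engine

# Collapse runs of whitespace into a single space.
cleaned = regex_engine.sub(r"\s+", " ", "too    many   spaces")
print(cleaned)
```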
Q1. Return True if the string starts and ends with a vowel.

import re

def function(a):
    # re.I lets the vowel classes match both upper and lower case.
    match = re.match(r'^[aeiou].*[aeiou]$', a, re.I)
    return match

if __name__ == "__main__":
    a = input()
    b = function(a)
    if b:
        print(True)
    else:
        print(False)

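A quick way to check a Q1-style pattern is to run it over a handful of inputs. The sketch below is a variation, not the exercise's reference answer: the function name is made up, and it assumes a single vowel such as "a" should also count as starting and ending with a vowel.

```python
import re

# Either one vowel alone, or a vowel, anything, and a final vowel.
VOWEL_ENDS = re.compile(r"[aeiou](?:.*[aeiou])?", re.IGNORECASE | re.DOTALL)

def starts_and_ends_with_vowel(text):
    # fullmatch requires the whole string to fit the pattern.
    return VOWEL_ENDS.fullmatch(text) is not None

checks = {s: starts_and_ends_with_vowel(s)
          for s in ["apple", "Echo", "banana", "a", "", "tree"]}
print(checks)
```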
Q2. Given a two-word string, return True if both words start with 'e'; otherwise return False.

import re

def function(a):
    # Anchor both ends so exactly two words are accepted, each starting with 'e'.
    m = re.match(r'^e\w*\s+e\w*$', a)
    return m

if __name__ == "__main__":
    a = input()
    b = function(a)
    if b:
        print(True)
    else:
        print(False)

Q3. Return True if the string starts with a capital letter; return False if it starts with anything else.

import re

def function(a):
    # re.match only matches at the start of the string.
    result = re.match(r'^[A-Z].*', a)
    return result

if __name__ == "__main__":
    a = input()
    b = function(a)
    if b:
        print(True)
    else:
        print(False)

Q4. Extract names and values from the given string into a dictionary.

import re

# Read input from STDIN. Print output to STDOUT.
def main(text):
    # Names are capitalized words; values are runs of 1 to 5 digits.
    names = re.findall(r'[A-Z][a-z]*', text)
    values = re.findall(r'\d{1,5}', text)
    # Pair each name with the value found at the same position.
    return dict(zip(names, values))

if __name__ == "__main__":
    x = input()
    print(main(x))

Input (stdin)

Dhoni scored 100 runs and Kohli scored 150 runs.Rohit scored 50 runs and Dhawan scored 250 runs.

Your Output (stdout)

{'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}

Expected Output

{'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}

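Collecting names and values in two separate passes works only while both lists stay aligned. A possible alternative, sketched below, captures each name together with its score in one pattern. It assumes every pair appears as "<Name> scored <number>", which holds for the sample input but is an assumption about the data in general.

```python
import re

text = ("Dhoni scored 100 runs and Kohli scored 150 runs."
        "Rohit scored 50 runs and Dhawan scored 250 runs.")

# With two groups, findall returns a list of (name, runs) tuples.
pairs = re.findall(r"([A-Z][a-z]+) scored (\d+)", text)
scores = dict(pairs)

print(scores)
# {'Dhoni': '100', 'Kohli': '150', 'Rohit': '50', 'Dhawan': '250'}
```

Because each name is captured next to its own number, a capitalized word without a score (for example the first word of a sentence) cannot shift the pairing.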
Q5. Extract hosts, timestamps, request lines, status codes, and content sizes from the log lines below.

import re

sample_text = [
    '199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] "GET /history/apollo/ HTTP/1.0" 200 6245',
    'unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985',
    '199.120.110.21 - - [01/Jul/1995:00:00:09 -0400] "GET /shuttle/missions/sts-73/mission-sts-73.html HTTP/1.0" 200 4085',
    'burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0',
    '199.120.110.21 - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/missions/sts-73/sts-73-patch-small.gif HTTP/1.0" 200 4179',
    'burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 304 0',
    'burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /shuttle/countdown/video/livevideo.gif HTTP/1.0" 200 0',
    '205.212.115.106 - - [01/Jul/1995:00:00:12 -0400] "GET /shuttle/countdown/countdown.html HTTP/1.0" 200 3985',
    'd104.aa.net - - [01/Jul/1995:00:00:13 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985',
    '129.94.144.152 - - [01/Jul/1995:00:00:13 -0400] "GET / HTTP/1.0" 200 7074',
    'unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310',
    # The next two entries are truncated in the source, so they are left out of the list:
    # 'unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/NASA-
    # 'unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/KSC-
    'd104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET /shuttle/countdown/count.gif HTTP/1.0" 200 40310',
    'd104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786',
]

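def func1():
    # Host: a dotted IPv4 address or a three-part host name at the start of the line.
    hosts = []
    for item in sample_text:
        if re.match(r'^\d', item):
            hosts.append(re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', item)[0])
        else:
            hosts.append(re.findall(r'[a-z0-9]*\w\.[a-z0-9]*\w\.[a-z0-9]*\w', item)[0])
    print(hosts)

def func2():
    # Timestamp: the text between the square brackets.
    timestamps = []
    for item in sample_text:
        timestamps.append(re.findall(r'\[(.*?)\]', item)[0])
    print(timestamps)

def func3():
    # Request line: the quoted text, split into (method, URI, protocol).
    method_uri_protocol = []
    for item in sample_text:
        method_uri_protocol.append(tuple(re.findall(r'"(.*?)"', item)[0].split()))
    print(method_uri_protocol)

def func4():
    # Status: a three-digit number surrounded by whitespace.
    status = []
    for item in sample_text:
        status.append(re.findall(r'\s\d{3}\s', item)[0].strip())
    print(status)

def func5():
    # Content size: the digits at the very end of the line.
    content_size = []
    for item in sample_text:
        content_size.append(re.findall(r'.\s(\d{1,5})$', item)[0])
    print(content_size)

# For testing the code, no input is to be provided.
if __name__ == "__main__":
    func1()
    func2()
    func3()
    func4()
    func5()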
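
The five fields can also be pulled out in a single pass with one pattern and named groups. This is a sketch of that alternative, not part of the exercise: the names LOG_LINE and parse_line are invented, and the pattern assumes every line has the shape of the complete sample entries (host, two placeholder fields, bracketed timestamp, quoted request, status, size).

```python
import re

LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ '                                  # host and two "-" fields
    r'\[(?P<timestamp>[^\]]+)\] '                              # [01/Jul/1995:00:00:01 -0400]
    r'"(?P<method>\S+) (?P<uri>\S+) (?P<protocol>[^"]+)" '     # "GET /path HTTP/1.0"
    r'(?P<status>\d{3}) (?P<size>\d+|-)'                       # 200 6245
)

def parse_line(line):
    # Return a dict of named fields, or None if the line does not fit.
    match = LOG_LINE.fullmatch(line)
    return match.groupdict() if match else None

sample = ('199.72.81.55 - - [01/Jul/1995:00:00:01 -0400] '
          '"GET /history/apollo/ HTTP/1.0" 200 6245')
record = parse_line(sample)
print(record["host"], record["status"], record["size"])  # 199.72.81.55 200 6245
```

Returning None for lines that do not fit makes malformed or truncated entries easy to skip instead of raising an IndexError.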
