0% found this document useful (0 votes)

42 views4 pages

Regular Expression Notes

Uploaded by

deokumar271189

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as TXT, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

42 views4 pages

Regular Expression Notes

Uploaded by

deokumar271189

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as TXT, PDF, TXT or read online on Scribd

==> Regular Expressions

===================
A regular grammar has a special property:- by substituting every nonterminal
(except the root one) with its
righthand side, you can reduce it down
to a single production for
the root, with only terminals and
operators on the right-hand side.

Our URL grammar was regular.

By replacing nonterminals with their productions,
it can be reduced to a single expression:-

url ::= 'http://' ([a-z]+ '.')+ [a-z]+ (':' [0-9]+)? '/'

The Markdown grammar is also regular:-

======================================

markdown ::= ([^_]* | '_' [^_]* '_' )*

But our HTML grammar can�t be reduced completely.

By substituting righthand sides for nonterminals,
We can eventually reduce it to something like this:

html ::= ( [^<>]* | '<i>' html '</i>' ) *

�but the recursive use of html on the righthand side can�t be eliminated,
and can�t be simply replaced by a repetition operator either.
So the HTML grammar is not regular.

The reduced expression of terminals and operators can be written in an even more
compact form,
called a regular expression
A regular expression does away with the quotes around the terminals,
and the spaces between terminals and operators, so that it consists just of
terminal characters,
parentheses for grouping, and operator characters.
For example:- the regular expression for our markdown format is just
([^_]*|_[^_]*_)*

Regular expressions are also called regexes for short.

A regex is far less readable than the original grammar,
because it lacks the nonterminal names that documented the meaning of each
subexpression.
But a regex is fast to implement, and there are libraries in many programming
languages that
support regular expressions.

==>The regex syntax commonly implemented in programming language libraries has a

few more special operators,
in addition to the ones we used above in grammars.
Here�s are some common useful ones:-

. any single character

\d any digit, same as [0-9]

\s any whitespace character, including space, tab, newline
\w any word character, including letters and digits
\., \(, \), \*, \+, ...
escapes an operator or special character so that it matches literally.

Using backslashes is important

whenever there are terminal characters that would be confused with special
characters.
Because our url regular expression has .
in it as a terminal,
we need to use a backslash to escape it:-

http://([a-z]+\.)+[a-z]+(:[0-9]+)/

==>Using regular expressions in Java

=================================

Regular expressions (�regexes�) are widely used in programming,

and you should have them in your toolbox.

In Java,
We can use regexes for manipulating strings
(see String.split , String.matches , java.util.regex.Pattern ).
They�re built-in as a first-class feature of modern scripting languages
like Python, Ruby, and Javascript,
and We can use them in many text editors for find and replace.
Regular expressions are our friend! Most of the time.
Here are some examples.

Replace all runs of spaces with a single space:-

==============================================

String singleSpacedString = string.replaceAll(" +", " ");

Match a URL:-
============
Pattern regex = Pattern.compile("http://([a-z]+\\.)+[a-z]+(:[0-9]+)?/");
Matcher m = regex.matcher(string);
if (m.matches()) {
// then string is a url
}

Extract part of an HTML tag:-

============================
Pattern regex = Pattern.compile("<a href=['\"]([^']*)['\"]>");
Matcher m = regex.matcher(string);
if (m.matches()) {
String url = m.group(1);
// Matcher.group(n) returns the nth parenthesized part of the regex
}

Notice the backslashes in the URL and HTML tag examples.

In the URL example, we want to match a literal period . ,
so we have to first escape it as \. to protect it from being interpreted as
the regex match-any-character operator, and then we have to further escape it
as \\.
to protect the backslash from being interpreted as a Java string escape character.
In the HTML example, we have to escape the quote mark " as \" to keep it from
ending the string.
The frequency of backslash escapes makes regexes still less readable.
Context-Free Grammars
In general,
A language that can be expressed with our system of grammars is called context-
free.
Not all context-free languages are also regular;
that is, some grammars can�t be reduced to single nonrecursive productions.
Our HTML grammar is context-free but not regular.

The grammars for most programming languages are also context-free.

In general, any language with nested structure (like nesting parentheses or braces)
is context-free
but not regular.
That description applies to the Java grammar,
shown here in part:-

statement ::=
'{' statement* '}'
| 'if' '(' expression ')' statement ('else' statement)?
| 'for' '(' forinit? ';' expression? ';' forupdate? ')' statement
| 'while' '(' expression ')' statement
| 'do' statement 'while' '(' expression ')' ';'
| 'try' '{' statement* '}' ( catches | catches? 'finally' '{' statement* '}' )
| 'switch' '(' expression ')' '{' switchgroups '}'
| 'synchronized' '(' expression ')' '{' statement* '}'
| 'return' expression? ';'
| 'throw' expression ';'
| 'break' identifier? ';'
| 'continue' identifier? ';'
| expression ';'
| identifier ':' statement
| ';'

==> SUMMARY:-
========
Machine-processed textual languages are ubiquitous in computer science.
Grammars are the most popular formalism for describing such languages,
and regular expressions are an important subclass of grammars that can be expressed
without recursion.

The topics of today�s reading connect to our three properties of good software as
follows:-
===================================================================================
=======

Safe from bugs:- Grammars and regular expressions are declarative

specifications for strings and streams,
which can be used directly by libraries and tools.
These specifications are often simpler, more direct, and less
likely to be buggy
then parsing code written by hand.

Easy to understand:- A grammar captures the shape of a sequence in a form that

is easier to understand
than hand-written parsing code.
Regular expressions, alas, are often not easy to
understand,
because they are a one-line reduced form of what might
have been a more understandable
regular grammar.

Ready for change:- A grammar can be easily edited, but regular expressions,
unfortunately, are much harder
to change, because a complex regular expression is cryptic
and hard to understand.

Regular Expressions
No ratings yet
Regular Expressions
14 pages
Regular Expressions and Parsing Techniques
No ratings yet
Regular Expressions and Parsing Techniques
32 pages
Regular Expressions
100% (5)
Regular Expressions
94 pages
Java and Regular Expressions
No ratings yet
Java and Regular Expressions
18 pages
Module 2 Chap1
No ratings yet
Module 2 Chap1
92 pages
Intro to Regular Expressions
No ratings yet
Intro to Regular Expressions
4 pages
Regular Expressions
No ratings yet
Regular Expressions
23 pages
Java Regex Tutorial: Lars Vogel
No ratings yet
Java Regex Tutorial: Lars Vogel
20 pages
Understanding Regular Languages and Tokens
No ratings yet
Understanding Regular Languages and Tokens
24 pages
Comprehensive Regular Expressions Guide
No ratings yet
Comprehensive Regular Expressions Guide
94 pages
Unit1 01
No ratings yet
Unit1 01
10 pages
JavaScript Regular Expressions Guide
No ratings yet
JavaScript Regular Expressions Guide
24 pages
14.regular Expression
No ratings yet
14.regular Expression
3 pages
Regex and Automata in NLP
No ratings yet
Regex and Automata in NLP
62 pages
Regular Expressions
No ratings yet
Regular Expressions
20 pages
Regex Cheat Sheet
No ratings yet
Regex Cheat Sheet
10 pages
Regex Techniques for NLP Applications
No ratings yet
Regex Techniques for NLP Applications
16 pages
Regex
No ratings yet
Regex
24 pages
Understanding Languages and Regex
No ratings yet
Understanding Languages and Regex
23 pages
Free Email Parser and Regex Guide
0% (1)
Free Email Parser and Regex Guide
5 pages
David Wang Computing Science and Information Technology: Info 1211 - Operating System'S Principles and Applications
No ratings yet
David Wang Computing Science and Information Technology: Info 1211 - Operating System'S Principles and Applications
73 pages
Preface: The Need For This Book
No ratings yet
Preface: The Need For This Book
5 pages
Java Regular Expressions Guide
100% (8)
Java Regular Expressions Guide
26 pages
Regex Tutorial
No ratings yet
Regex Tutorial
103 pages
Understanding Regular Expressions in NLP
No ratings yet
Understanding Regular Expressions in NLP
15 pages
Understanding Regular Expressions in NLP
No ratings yet
Understanding Regular Expressions in NLP
22 pages
Lecture Slides Regular Expressions
No ratings yet
Lecture Slides Regular Expressions
138 pages
Understanding Languages and Regex
No ratings yet
Understanding Languages and Regex
27 pages
Lecture 2
No ratings yet
Lecture 2
70 pages
Unit 2 Regular Expression
No ratings yet
Unit 2 Regular Expression
3 pages
Module2 NLP BAD613B Notes
100% (1)
Module2 NLP BAD613B Notes
16 pages
Regular Expressions Basics
No ratings yet
Regular Expressions Basics
11 pages
Understanding Regular Expressions in Compilers
No ratings yet
Understanding Regular Expressions in Compilers
44 pages
Understanding Regular Expressions in NLP
No ratings yet
Understanding Regular Expressions in NLP
34 pages
Compiler Lexical Analysis Guide
No ratings yet
Compiler Lexical Analysis Guide
65 pages
ACT Chapter 2
No ratings yet
ACT Chapter 2
22 pages
NLP Chapter 5
No ratings yet
NLP Chapter 5
70 pages
Compiler Construction: Scanning & Tokens
No ratings yet
Compiler Construction: Scanning & Tokens
72 pages
Network Security - 4.2 Reg Ex Primer
No ratings yet
Network Security - 4.2 Reg Ex Primer
3 pages
JavaScript Regex Guide for Beginners
No ratings yet
JavaScript Regex Guide for Beginners
22 pages
Java Regex Tutorial and Examples
No ratings yet
Java Regex Tutorial and Examples
21 pages
L02 - Programming - RE PLC
No ratings yet
L02 - Programming - RE PLC
35 pages
JavaScript Regex & Error Handling
No ratings yet
JavaScript Regex & Error Handling
30 pages
Mastering Regular Expressions: Jeffrey E. F. Friedl
No ratings yet
Mastering Regular Expressions: Jeffrey E. F. Friedl
10 pages
Regular Expression Micro Project Report
No ratings yet
Regular Expression Micro Project Report
14 pages
JavaScript Regular Expressions Guide
No ratings yet
JavaScript Regular Expressions Guide
13 pages
Regular Expressions Overview by Mansouri
No ratings yet
Regular Expressions Overview by Mansouri
39 pages
Regular - Expressions For FL & A
No ratings yet
Regular - Expressions For FL & A
34 pages
Regular Expression Tutorial: What Regular Expressions Are Exactly - Terminology
No ratings yet
Regular Expression Tutorial: What Regular Expressions Are Exactly - Terminology
42 pages
Regex
100% (1)
Regex
42 pages
3-Regular Expressions
No ratings yet
3-Regular Expressions
34 pages
Regular Expressions in Java
No ratings yet
Regular Expressions in Java
30 pages
Chapter 10
No ratings yet
Chapter 10
28 pages
06 - JavaScript Miscellaneous Concepts
No ratings yet
06 - JavaScript Miscellaneous Concepts
9 pages
Regular Expressions in Java
No ratings yet
Regular Expressions in Java
30 pages
Regular Expressions, Tok-Enization, Edit Distance
No ratings yet
Regular Expressions, Tok-Enization, Edit Distance
29 pages
BeneficiaryDetailForSocialAuditReport PMAYG 0503006 2025-2026
No ratings yet
BeneficiaryDetailForSocialAuditReport PMAYG 0503006 2025-2026
20 pages
BeneficiaryDetailForSocialAuditReport PMAYG 0503009 2021-2022
No ratings yet
BeneficiaryDetailForSocialAuditReport PMAYG 0503009 2021-2022
25 pages
4 e Commerce Notes Unit 4
No ratings yet
4 e Commerce Notes Unit 4
5 pages
2 E Commerce Notes UNIT II
No ratings yet
2 E Commerce Notes UNIT II
9 pages
Early Years Teaching Guide
No ratings yet
Early Years Teaching Guide
119 pages
VTAMPS 5.0 Primary 3 Set 4
0% (1)
VTAMPS 5.0 Primary 3 Set 4
9 pages
AUI Academic Calendar 2025-26
No ratings yet
AUI Academic Calendar 2025-26
2 pages
Introduction to Logic and Arguments
No ratings yet
Introduction to Logic and Arguments
4 pages
Advance Statistics For Computing PT
No ratings yet
Advance Statistics For Computing PT
55 pages
Mastering Text Analysis Techniques
No ratings yet
Mastering Text Analysis Techniques
5 pages
Class 9 English Literature QP HY
100% (1)
Class 9 English Literature QP HY
5 pages
Advanced Engineering Drafting Test
No ratings yet
Advanced Engineering Drafting Test
5 pages
Solutions for Homework 2 in Logic
No ratings yet
Solutions for Homework 2 in Logic
2 pages
Vocabulary Mastery and Speaking Ability Study
No ratings yet
Vocabulary Mastery and Speaking Ability Study
114 pages
FTLG gr001 - en e
No ratings yet
FTLG gr001 - en e
111 pages
8251 Usart
No ratings yet
8251 Usart
14 pages
Understanding Paragraph Structure and Topic Sentences
No ratings yet
Understanding Paragraph Structure and Topic Sentences
19 pages
WebSphere and Liberty L2 Quiz - Attempt Review Final
No ratings yet
WebSphere and Liberty L2 Quiz - Attempt Review Final
16 pages
DSU Practical Related 16 To 24
No ratings yet
DSU Practical Related 16 To 24
38 pages
Question Bank
No ratings yet
Question Bank
15 pages
Mismatch of Teachers' Qualifications and Subjects Taught: Effects On Students' National Achievement Test
100% (1)
Mismatch of Teachers' Qualifications and Subjects Taught: Effects On Students' National Achievement Test
19 pages
SEMI DETAILED LESSON PLAN For Demo. Final
No ratings yet
SEMI DETAILED LESSON PLAN For Demo. Final
4 pages
Important One Word Substitutions
No ratings yet
Important One Word Substitutions
3 pages
Active Database Triggers Explained
No ratings yet
Active Database Triggers Explained
36 pages
IP I - Chapter 1 - Internet Technologies and Protocols
No ratings yet
IP I - Chapter 1 - Internet Technologies and Protocols
30 pages
Software Design Principles Guide
No ratings yet
Software Design Principles Guide
17 pages
Operating System Process Management Programs
No ratings yet
Operating System Process Management Programs
25 pages
Noun Clause Exercises with Answers
No ratings yet
Noun Clause Exercises with Answers
4 pages
Europass CV
No ratings yet
Europass CV
2 pages
DS0000087 - Programmer's Manual - of X-LIB Software Library
100% (1)
DS0000087 - Programmer's Manual - of X-LIB Software Library
175 pages
CEID - Uncp Practice B07
No ratings yet
CEID - Uncp Practice B07
2 pages
Q1 - English Week 5 Day 2
No ratings yet
Q1 - English Week 5 Day 2
4 pages
Text Visual Audio Motion Manipulative and Multimedia Information and Media.
No ratings yet
Text Visual Audio Motion Manipulative and Multimedia Information and Media.
6 pages
A1 - Ellipsis and Substitution
No ratings yet
A1 - Ellipsis and Substitution
23 pages

Regular Expression Notes

Uploaded by

Regular Expression Notes

Uploaded by

==> Regular Expressions

Our URL grammar was regular.

url ::= 'http://' ([a-z]+ '.')+ [a-z]+ (':' [0-9]+)? '/'

The Markdown grammar is also regular:-

markdown ::= ([^_]* | '_' [^_]* '_' )*

But our HTML grammar can�t be reduced completely.

html ::= ( [^<>]* | '<i>' html '</i>' ) *

Regular expressions are also called regexes for short.

==>The regex syntax commonly implemented in programming language libraries has a

. any single character

\d any digit, same as [0-9]

Using backslashes is important

==>Using regular expressions in Java

Regular expressions (�regexes�) are widely used in programming,

Replace all runs of spaces with a single space:-

String singleSpacedString = string.replaceAll(" +", " ");

Extract part of an HTML tag:-

Notice the backslashes in the URL and HTML tag examples.

The grammars for most programming languages are also context-free.

Safe from bugs:- Grammars and regular expressions are declarative

Easy to understand:- A grammar captures the shape of a sequence in a form that

You might also like