Course Schedule
• Regular Expressions
• Meta Characters
• Quantifiers
• Character Classes
• RegEx Examples
• String RegEx Methods
• Replacing
• RegEx Flags
• RegExp Object
• Using RegExp
• RegExp Methods
www.spiraltrain.nl 1
Regular Expressions
• Abbreviated as RegEx; a regular expression describes a pattern of text :
/[a-zA-Z_\-]+@(([a-zA-Z_\-])+\.)+[a-zA-Z]{2,4}/
• Used to test whether string matches a pattern
• Used to search and replace characters in a string
• Very powerful, but tough to read
• Regular expressions occur in many places :
• Text editors (TextPad) allow RegExes in search and replace operations
• Programming languages like JavaScript, Java, PHP, Python
• In JavaScript in String methods and in RegExp object :
function checkEmail() {
  // Return true when the field value looks like an email address
  if (form1.users_email.value.match(/^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/)) {
    return true;
  }
  alert("Please enter valid email address");
  form1.users_email.focus();
  return false;
}
• Demo : 03-emailvalidation
Meta Characters
• Characters with special meaning in regular expressions :
Metacharacter   Description
.               Matches any single character except a new line
^               Anchors the match at the start of the string
$               Anchors the match at the end of the string
\               Escapes the next character
\b              Matches on a word boundary
|               Pipe symbol indicates a choice between regular expressions
• Metacharacters are made non special with a \ in front of them :
• Pattern /3\.14159/ does not have a wildcard character but a literal dot
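The difference between a wildcard dot and an escaped dot can be checked directly. The following is a minimal sketch; the sample strings and variable names are made up for illustration.

```javascript
// Unescaped dot: wildcard that matches any character except a new line.
const wildcard = /3.14159/;
// Escaped dot: matches only a literal ".".
const literal = /3\.14159/;

for (const s of ["3.14159", "3x14159"]) {
  console.log(s, wildcard.test(s), literal.test(s));
}
// 3.14159 true true
// 3x14159 true false

// Anchors and alternation: the whole string must be "cat" or "dog".
const pet = /^(cat|dog)$/;
console.log(pet.test("cat"));     // true
console.log(pet.test("catalog")); // false

// \b is a zero-width word boundary: finds "is" only as a whole word.
const wholeWordIndex = "This is it".search(/\bis\b/);
console.log(wholeWordIndex);      // 5
```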
Quantifiers
• Use quantifiers to indicate something must be repeated :
Quantifier   Description                                         Example
?            zero or one occurrences                             a?
*            zero or more occurrences                            b*
+            one or more occurrences                             c+
{min,max}    at least min occurrences, at most max occurrences   {2,5}
{equ}        exactly equ occurrences                             {3}
{min,}       at least min occurrences                            {2,}
{0,max}      from zero to no more than max occurrences           {0,6}
{0,0}        exactly zero occurrences                            {0,0}
• Quantifiers work on the previous item :
• Item after which the quantifier occurs
• Pattern andromeda\t*nebula would match :
• Word andromeda followed by zero or more tabs followed by word nebula
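A short sketch of how quantifiers bind to the previous item. The test strings are illustrative assumptions; only the andromeda pattern comes from the slide.

```javascript
// \t* : zero or more tabs between the two words.
const nebula = /andromeda\t*nebula/;
console.log(nebula.test("andromedanebula"));     // true: zero tabs
console.log(nebula.test("andromeda\t\tnebula")); // true: two tabs
console.log(nebula.test("andromeda nebula"));    // false: a space is not a tab

// {min,max} applies to the preceding item, here the digit class \d.
const digits = /^\d{2,5}$/;
console.log(digits.test("7"));      // false: too few
console.log(digits.test("1234"));   // true
console.log(digits.test("123456")); // false: too many

// Parentheses turn a sequence into one item for the quantifier.
const groupRepeated = /^(ab)+$/; // "ab" repeated
const charRepeated = /^ab+$/;    // only "b" repeated
console.log(groupRepeated.test("ababab")); // true
console.log(charRepeated.test("ababab"));  // false
console.log(charRepeated.test("abbb"));    // true
```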
Character Classes
• Indicate list of characters that one element in a string will match :
• Inside square brackets "[]" a list of characters can be provided
• The expression matches if any of these characters is found
• The order of characters is insignificant
• Distinguish three kinds of Character Groups :
• Indicate which characters may occur
• Indicate which characters may not occur
• Indicate which characters are subtracted from a group that may occur
• Positive character group :
[ace] , [A-Za-z]
• Negative character group, indicated with ^ directly after [ :
[^a-f] , [^KPV]
• Character class subtraction (supported in e.g. .NET and XML Schema, not in JavaScript regex literals) :
[A-Z-[QZ]]
• Demo : 04-matchmethod
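The positive and negative groups can be tried out as follows. JavaScript regex literals do not accept the [A-Z-[QZ]] subtraction syntax, so the last part of this sketch emulates it with a negative lookahead; that workaround is an assumption of this example, not something taken from the slides.

```javascript
const positive = /^[ace]+$/;  // only the characters a, c or e
const negative = /^[^a-f]+$/; // any characters except a to f

console.log(positive.test("ace")); // true
console.log(positive.test("cab")); // false: b is not in the list
console.log(negative.test("xyz")); // true
console.log(negative.test("xaz")); // false: a is excluded

// The order of characters inside the brackets is insignificant.
console.log(/^[eca]+$/.test("ace")); // true

// Emulated subtraction: one uppercase letter A-Z, but not Q or Z.
const upperExceptQZ = /^(?![QZ])[A-Z]$/;
console.log(upperExceptQZ.test("A")); // true
console.log(upperExceptQZ.test("Q")); // false
```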
RegEx Examples
Regular Expression   Match
abc                  Looks for the exact character sequence a, b and then c
[abc]                Square brackets match any one of the characters inside : a, b or c
(abc)*               Parentheses group patterns. Asterisk marks zero or more of the
                     previous item, here the group. Would match empty string or abcabcabc
\.+                  Backslash is an all purpose escape character. + marks one or
                     more of the previous character. This would match ......
[0-4]                Matches any single digit from 0 to 4
[^0-4]               Matches any single character that is not a digit from 0 to 4
\sword\s             Matches word where there is white space before and after
\bword\b             \b marks a word boundary : the position between a word character
                     and a non-word character, or the start or end of the string
[a-z]{8,}            Must be at least 8 lowercase letters
\d{3,12}             \d matches any digit ([0-9]) while the braces mark the min and
                     max count of the previous item. In this case 3 to 12 digits.
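The table rows can be verified with a small check list. This is only a sketch: the anchors ^ and $ and the sample strings are additions made for the test, not part of the original patterns.

```javascript
// Each entry: [pattern, sample string, expected result of test()].
const cases = [
  [/^(abc)*$/, "abcabcabc", true],
  [/^(abc)*$/, "", true],          // zero repetitions also match
  [/^\.+$/, "......", true],
  [/^[0-4]$/, "3", true],
  [/^[^0-4]$/, "7", true],
  [/\bword\b/, "a word.", true],   // "." after word is still a boundary
  [/\sword\s/, "a word.", false],  // no white space after word
  [/^[a-z]{8,}$/, "password", true],
  [/^\d{3,12}$/, "12", false],     // fewer than 3 digits
];

const outcomes = cases.map(([re, sample, expected]) => re.test(sample) === expected);
console.log(outcomes.every(Boolean)); // true when every row behaves as described
```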
String RegEx Methods
String Method              Description
match(regexp)              Returns first match for this string against the given regular
                           expression; if global /g flag is used, returns array of all
                           matches
replace(regexp, "text")    Replaces first occurrence of the regular expression with the
                           given text; if global /g flag is used, replaces all occurrences
search(regexp)             Returns first index where given regular expression occurs
split(delimiter[,limit])   Breaks apart a string into an array of strings using the given
                           string or regular expression as the delimiter; returns the
                           array of tokens
var someText = "This is a string the string to be searched";
document.write("String to be searched is : " + someText);
document.write("<br/>");
var nr = someText.search(/is/);
document.write("First occurrence is at : " + nr);
• Result : First occurrence is at : 2
• Demo : 05-searchmethod
Replacing
• Replace method of String :
string.replace(regex, "text")
• Replaces first occurrence of pattern with given text :
var state = "Mississippi";
var replaced = state.replace(/s/, "x"); // returns "Mixsissippi"
• g after regex is global flag indicating to replace all occurrences :
var replaced = state.replace(/s/g, "x"); // returns "Mixxixxippi"
• Method returns the modified string as its result
• RegEx flags :
• /pattern/g : global; match/replace all occurrences
• /pattern/i : case-insensitive
• /pattern/m : multi-line mode
• /pattern/y : "sticky" search; matches only at the position given by lastIndex
• Flags can be combined :
• /abc/gi matches all occurrences of abc, AbC, aBc, ABC, ...
• Demo : 06-replacemethod
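The effect of each flag can be seen on one sample string. A minimal sketch; the text and variable names are assumptions made for illustration.

```javascript
const text = "abc AbC\nabc";

// g: all occurrences; gi: all occurrences regardless of case.
const caseSensitive = text.match(/abc/g);
const caseInsensitive = text.match(/abc/gi);
console.log(caseSensitive);   // [ 'abc', 'abc' ]
console.log(caseInsensitive); // [ 'abc', 'AbC', 'abc' ]

// m: ^ and $ also match at the start and end of each line.
const singleLine = text.match(/^abc/g).length;
const multiLine = text.match(/^abc/gm).length;
console.log(singleLine, multiLine); // 1 2

// y: the match must start exactly at lastIndex.
const sticky = /abc/y;
sticky.lastIndex = 8;               // position of the second-line "abc"
const stickyHit = sticky.test(text);
sticky.lastIndex = 7;               // position of the new line character
const stickyMiss = sticky.test(text);
console.log(stickyHit, stickyMiss); // true false
```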
RegExp Object
• Constructs a regex dynamically based on a given string :
var r = new RegExp(string);
var r = new RegExp(string, flags);
• Useful when you don't know regex's pattern until runtime :
• Prompt user for his/her name, then search for it
• Comparing regex literal with RegExp object :
var r = new RegExp("ab+c", "gi"); // equivalent to: var r = /ab+c/gi;
• In a regex literal, forward slashes must be \ escaped :
/http[s]?:\/\/\w+\.com/
• In a new RegExp object, the pattern is a string :
• So usual escapes are necessary (quotes, backslashes, etc.)
var r = new RegExp("http[s]?://\\w+\\.com")
• RegExp object has various properties/methods :
• Properties: global, ignoreCase, lastIndex, multiline, source, sticky
• Methods: exec, test
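When the pattern comes from user input, metacharacters in that input should be escaped before the string is handed to new RegExp. The sketch below assumes a hypothetical helper escapeRegExp and a hard-coded userName that stands in for a prompted value; neither is part of the course material.

```javascript
// Hypothetical helper: puts a backslash in front of every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Counts case-insensitive occurrences of needle, treated as literal text.
function countOccurrences(haystack, needle) {
  const r = new RegExp(escapeRegExp(needle), "gi");
  const matches = haystack.match(r);
  return matches === null ? 0 : matches.length;
}

const userName = "J. Doe"; // stands in for a name entered at runtime
const count = countOccurrences("J. Doe met j. doe and JX Doe", userName);
console.log(count); // 2: "JX Doe" does not count because the dot is literal

// The flags passed to the constructor show up as properties.
const r = new RegExp("ab+c", "gi");
console.log(r.global, r.ignoreCase, r.multiline); // true true false
```

Without the escaping step the dot in the name would act as a wildcard and "JX Doe" would be counted as well.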
RegExp Methods
• exec() Method :
• The exec() method takes one argument, a string, and searches that string for a
match of the pattern specified by the regular expression. If a match is found, the
method returns a result array holding the matched text and any captured groups,
with an index property giving the starting point of the match. With the g flag, each
further call continues from lastIndex. If no match is found, the method returns null.
• test() Method :
• The test() method also takes one argument, a string, and checks whether that
string contains a match of the pattern specified by the regular expression. It returns
true if it does contain a match and false if it does not. This method is very useful in
form validation scripts.
• Flags :
• Flags appearing after the end slash modify how a regular expression works.
• The i flag makes a regular expression case insensitive. For example, /[aeiou]/i
matches any lowercase or uppercase vowel.
• The g flag specifies a global match, meaning that all matches of the specified
pattern are found rather than only the first.
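A sketch of exec() in a loop and test() in a validation check. The input string, the pattern and the four-digit rule are assumptions chosen for illustration.

```javascript
// exec() with the g flag: each call returns the next match or null.
const re = /(\d+)-(\d+)/g;
const input = "ranges: 10-20 and 300-400";
const found = [];
let m;
while ((m = re.exec(input)) !== null) {
  // m[0] is the matched text, m[1] and m[2] the captured groups,
  // m.index the starting point of the match.
  found.push({ text: m[0], from: m[1], to: m[2], index: m.index });
}
console.log(found.map((f) => f.text));  // [ '10-20', '300-400' ]
console.log(found.map((f) => f.index)); // [ 8, 18 ]

// test() returns only true or false, which suits form validation.
const isFourDigitYear = (value) => /^\d{4}$/.test(value);
console.log(isFourDigitYear("2024")); // true
console.log(isFourDigitYear("24"));   // false
```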
String Methods
• The search() Method :
• The search() method takes one argument: a regular expression. It returns the index
of the first character of the substring matching the regular expression. If no match is
found, the method returns -1.
• The split() Method :
• The split() method takes a delimiter, which may be a regular expression, and an
optional limit. It uses the delimiter to split the string into an array of strings.
• The replace() Method :
• The replace() method takes two arguments: a regular expression and a string. It
replaces the first regular expression match with the string. If the g flag is used in the
regular expression, it replaces all matches with the string.
• The match() Method :
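• The match() method takes one argument: a regular expression. Without the g flag it
returns the first match together with its captured groups; with the g flag it returns an
array of all matching substrings. If no match is found, the method returns null.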
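The four String methods side by side, as a minimal sketch. The sample strings are assumptions made for illustration.

```javascript
// split(): a regex delimiter absorbs the white space around , and ;
const csv = "red, green;blue ,  yellow";
const colors = csv.split(/\s*[,;]\s*/);
const firstTwo = csv.split(/\s*[,;]\s*/, 2); // optional limit
console.log(colors);   // [ 'red', 'green', 'blue', 'yellow' ]
console.log(firstTwo); // [ 'red', 'green' ]

// match(): first match with index, or all matches with the g flag.
const order = "Order 66 and 99";
const first = order.match(/\d+/);
const all = order.match(/\d+/g);
const none = order.match(/x/);
console.log(first[0], first.index); // 66 6
console.log(all);                   // [ '66', '99' ]
console.log(none);                  // null

// search(): index of the first match, or -1.
const hit = order.search(/\d/);
const miss = order.search(/x/);
console.log(hit, miss); // 6 -1

// replace(): first match only, unless the g flag is used.
const masked = order.replace(/\d/g, "#");
console.log(masked); // Order ## and ##
```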
Summary : Regular Expressions
• Validation is the process of checking whether criteria are satisfied :
• Verify that data is correct or compliant with set standards or rules
• Validation on client is never sufficient :
• Should be complemented by server side validation
• Regular expressions are a syntax to match text :
• Became embedded in UNIX systems through tools like ed and grep
• Now available in all popular programming languages like PHP, Java, .NET
• Typical applications of Regular Expressions are :
• Form field validation
• Searching for patterns in e.g. log files
• Strings have several regexp methods :
• match, search, replace and split
• RegExp object :
• Constructs a regex dynamically based on a given string
• Exercise : Validation
© copyright : spiraltrain@gmail.com Regular Expressions 12
Appendix : Modules and Symbols
JavaScript Module Systems
• Before ES6, JavaScript had no built-in modules :
• Community has converged on a simple style of modules
• Supported by libraries in ES5 and earlier
• Module is piece of code that is executed once it is loaded :
• In module may be variable declarations and function declarations
• By default these declarations stay local to the module
• Can mark some of them as exports so other modules can import them
• Module can import from other modules via module specifiers :
• Module specifiers are either :
— Relative paths like ../model/user
— Absolute paths like /lib/js/helpers
— Names like util : what module names refer to has to be configured
• Modules are singletons :
• Even if module is imported multiple times, only a single instance of it exists
• Approach to modules avoids global variables, only globals are module specifiers
ECMAScript 5 Module Systems
• ES5 module systems work :
• Without explicit support from the language
• Two most important but incompatible standards are :
• CommonJS Modules :
— Dominant implementation of this standard is in Node.js
— Compact syntax
— Designed for synchronous loading and servers
• Asynchronous Module Definition (AMD) :
— Most popular implementation of this standard is RequireJS
— Slightly more complicated syntax, enabling AMD to work without eval()
— Designed for asynchronous loading and browsers
ECMAScript 6 Modules
• There are two kinds of exports :
• Named exports, several per module
• Default exports, one per module
• Module can export multiple things :
• By prefixing its declarations with the keyword export
• Exports are distinguished by their names and are called named exports
// es6lib.js
export const sqrt = Math.sqrt;
export function square(x) { return x * x; }
export function diag(x, y) {
return sqrt(square(x) + square(y));
}
// es6main.js
import { square, diag } from 'es6lib';
console.log(square(11)); // 121
console.log(diag(4, 3)); // 5
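• Older Node.js versions cannot cope with this syntax :
• Give SyntaxError: Unexpected token import
• Current Node.js versions support it in .mjs files or with "type": "module" in package.json
• Demo24 : ES6 Main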
CommonJS Syntax
• For Node.js to find library modules that are required by name :
• They should be placed in a node_modules directory
// commonjslib.js
var sqrt = Math.sqrt;
function square(x) { return x * x; }
function diag(x, y) { return sqrt(square(x) + square(y)); }
module.exports = {
sqrt: sqrt,
square: square,
diag: diag,
};
// commonjsmain.js
var square = require('commonjslib').square;
var diag = require('commonjslib').diag;
console.log(square(11)); // 121
console.log(diag(4, 3)); // 5
// Demo25 : CommonJS Main
Symbols
• Symbol() function returns value of type symbol :
• Has static properties that expose several members of built-in objects
• Has static methods that expose the global symbol registry
• Every symbol value returned from Symbol() is unique :
• Only purpose of symbol value is usage as identifier for object properties
• Few operations apply to symbol values :
• A symbol can be assigned to a variable and compared for identity
const symbol1 = Symbol();
const symbol2 = Symbol(42); // Demo14 : Symbols
const symbol3 = Symbol('foo');
const symbol4 = Symbol('foo');
console.log(typeof symbol1); // expected output: "symbol"
console.log(symbol3.toString()); // expected output: "Symbol(foo)"
console.log(symbol3 === symbol4); // expected output: false
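• Symbols can be used as property keys that for...in, Object.keys() and JSON.stringify() skip; such properties are hidden rather than truly private, because Object.getOwnPropertySymbols() still lists them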
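A sketch of a symbol used as a property key, together with the global symbol registry mentioned earlier. The object and the key names are made up for illustration.

```javascript
const id = Symbol("id");
const user = { name: "Ann", [id]: 7 };

// The symbol-keyed property is reachable only through the symbol itself.
console.log(user[id]);             // 7
console.log(Object.keys(user));    // [ 'name' ]
console.log(JSON.stringify(user)); // {"name":"Ann"}

// Not truly private: the key can still be discovered.
const symbolKeys = Object.getOwnPropertySymbols(user);
console.log(symbolKeys.length, symbolKeys[0] === id); // 1 true

// Global symbol registry: Symbol.for() returns the same symbol per key.
const shared1 = Symbol.for("app.id");
const shared2 = Symbol.for("app.id");
console.log(shared1 === shared2);    // true
console.log(Symbol.keyFor(shared1)); // app.id
console.log(Symbol.keyFor(id));      // undefined : not in the registry
```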
Built-in Symbols
• Keys of type symbol exist in various built-in JavaScript objects :
• Represent internal JavaScript language behaviors
• These Symbols can be accessed using following properties :
• Symbol.iterator :
— Method returning the default iterator for an object
— Used by for...of
• Symbol.match :
— Method that matches against string, used to determine if object usable as a regex
— Used by String.prototype.match()
• Symbol.hasInstance :
— Method determining if constructor object recognizes object as its instance
— Used by instanceof
• Symbol.toStringTag :
— String value used for the default description of an object
— Used by Object.prototype.toString()
• Demo15 : Builtin Symbols
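Three of these built-in symbols can be tried out on small classes. A minimal sketch; the classes Countdown and Even are hypothetical examples, not taken from the demo.

```javascript
class Countdown {
  constructor(start) {
    this.start = start;
  }
  // Symbol.iterator: default iterator, used by for...of and spread.
  *[Symbol.iterator]() {
    for (let n = this.start; n > 0; n--) {
      yield n;
    }
  }
  // Symbol.toStringTag: used by Object.prototype.toString().
  get [Symbol.toStringTag]() {
    return "Countdown";
  }
}

const steps = [...new Countdown(3)];
const tag = Object.prototype.toString.call(new Countdown(1));
console.log(steps); // [ 3, 2, 1 ]
console.log(tag);   // [object Countdown]

// Symbol.hasInstance: customizes what instanceof reports.
class Even {
  static [Symbol.hasInstance](value) {
    return Number.isInteger(value) && value % 2 === 0;
  }
}
console.log(4 instanceof Even); // true
console.log(3 instanceof Even); // false
```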