Regular Expression Basic Syntax Reference

Characters

Character: Any character except [\^$.|?*+()
Description: All characters except the listed special characters match a single instance of themselves. { and } are literal characters, unless they're part of a valid regular expression token (e.g. the {n} quantifier).
Example: a matches a

Character: \ (backslash) followed by any of [\^$.|?*+(){}
Description: A backslash escapes special characters to suppress their special meaning.
Example: \+ matches +

Character: \Q...\E
Description: Matches the characters between \Q and \E literally, suppressing the meaning of special characters.
Example: \Q+-*/\E matches +-*/

Character: \xFF where FF are 2 hexadecimal digits
Description: Matches the character with the specified ASCII/ANSI value, which depends on the code page used. Can be used in character classes.
Example: \xA9 matches © when using the Latin-1 code page.

Character: \n, \r and \t
Description: Match an LF character, CR character and a tab character respectively. Can be used in character classes.
Example: \r\n matches a DOS/Windows CRLF line break.

Character: \a, \e, \f and \v
Description: Match a bell character (\x07), escape character (\x1B), form feed (\x0C) and vertical tab (\x0B) respectively. Can be used in character classes.

Character: \cA through \cZ
Description: Match an ASCII character Control+A through Control+Z, equivalent to \x01 through \x1A. Can be used in character classes.
Example: \cM\cJ matches a DOS/Windows CRLF line break.
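The escaping rules above can be tried out with a short sketch. It assumes Python's re module, which is not named in this reference. Python has no \Q...\E, so re.escape() is swapped in as the nearest equivalent. In a Python str pattern, \xA9 is the Unicode code point U+00A9 rather than a code-page value. The helper name first_match is hypothetical.

```python
import re

def first_match(pattern, text):
    """Return the first matched substring, or None when nothing matches."""
    m = re.search(pattern, text)
    return m.group() if m else None

# Backslash escape: \+ matches a literal plus sign.
plus = first_match(r"\+", "1+1=2")

# Assumption: re.escape() stands in for \Q...\E, which Python's re lacks.
literal = first_match(re.escape("+-*/"), "x +-*/ y")

# \xA9 in a Python str pattern is code point U+00A9 (the copyright sign).
copyright_sign = first_match(r"\xA9", "\u00a9 2024")

# \r\n matches a DOS/Windows CRLF line break.
crlf = first_match(r"\r\n", "one\r\ntwo")

print(repr(plus), repr(literal), repr(copyright_sign), repr(crlf))
```

Other flavors may accept \Q...\E directly, in which case no helper such as re.escape() is needed.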

Character Classes or Character Sets [abc]

Character: [ (opening square bracket)
Description: Starts a character class. A character class matches a single character out of all the possibilities offered by the character class. Inside a character class, different rules apply. The rules in this section are only valid inside character classes. The rules outside this section are not valid in character classes, except for a few character escapes that are indicated with "can be used inside character classes".

Character: Any character except ^-]\
Description: All characters except the listed special characters add that character to the possible matches for the character class.
Example: [abc] matches a, b or c

Character: \ (backslash) followed by any of ^-]\
Description: A backslash escapes special characters to suppress their special meaning.
Example: [\^\]] matches ^ or ]

Character: - (hyphen) except immediately after the opening [
Description: Specifies a range of characters. (Specifies a hyphen if placed immediately after the opening [)
Example: [a-zA-Z0-9] matches any letter or digit
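A minimal sketch of the character class rules listed so far, assuming Python's re module (the reference itself is flavor-neutral). The variable names are illustrative only.

```python
import re

# [abc] matches one character out of a, b or c at each position.
simple = re.findall(r"[abc]", "cabbage")

# Inside a class, a backslash escapes ^ and ] so they are taken literally.
escaped = re.findall(r"[\^\]]", "a^b]c")

# A hyphen between two characters specifies a range.
ranged = re.findall(r"[a-zA-Z0-9]", "R2-D2!")

# A hyphen immediately after the opening [ is a literal hyphen.
hyphen_first = re.findall(r"[-a]", "a-b")

print(simple, escaped, ranged, hyphen_first)
```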


Character: ^ (caret) immediately after the opening [
Description: Negates the character class, causing it to match a single character not listed in the character class. (Specifies a caret if placed anywhere except after the opening [)
Example: [^a-d] matches x (any character except a, b, c or d)

Character: \d, \w and \s
Description: Shorthand character classes matching digits, word characters (letters, digits, and underscores), and whitespace (spaces, tabs, and line breaks). Can be used inside and outside character classes.
Example: [\d\s] matches a character that is a digit or whitespace

Character: \D, \W and \S
Description: Negated versions of the above. Should be used only outside character classes. (Can be used inside, but that is confusing.)
Example: \D matches a character that is not a digit

Character: [\b]
Description: Inside a character class, \b is a backspace character.
Example: [\b\t] matches a backspace or tab character

Dot

Character: . (dot)
Description: Matches any single character except line break characters \r and \n. Most regex flavors have an option to make the dot match line break characters too.
Example: . matches x or (almost) any other character

Anchors

Character: ^ (caret)
Description: Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the caret match after line breaks (i.e. at the start of a line in a file) as well.
Example: ^. matches a in abc\ndef. Also matches d in "multi-line" mode.

Character: $ (dollar)
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Most regex flavors have an option to make the dollar match before line breaks (i.e. at the end of a line in a file) as well. Also matches before the very last line break if the string ends with a line break.
Example: .$ matches f in abc\ndef. Also matches c in "multi-line" mode.

Character: \A
Description: Matches at the start of the string the regex pattern is applied to. Matches a position rather than a character. Never matches after line breaks.
Example: \A. matches a in abc

Character: \Z
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Never matches before line breaks, except for the very last line break if the string ends with a line break.
Example: .\Z matches f in abc\ndef

Character: \z
Description: Matches at the end of the string the regex pattern is applied to. Matches a position rather than a character. Never matches before line breaks.
Example: .\z matches f in abc\ndef

Word Boundaries

Character: \b
Description: Matches at the position between a word character (anything matched by \w) and a non-word character (anything matched by [^\w] or \W) as well as at the start and/or end of the string if the first and/or last characters in the string are word characters.
Example: .\b matches c in abc
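The dot, anchor and \b behavior described above can be checked with a short sketch. It assumes Python's re module, whose details differ slightly from this flavor-neutral table: re.DOTALL and re.MULTILINE are Python's names for the "dot matches line breaks" and "multi-line" options, Python's dot excludes only \n, and Python's \Z matches only at the very end of the string (the behavior listed for \z above).

```python
import re

text = "abc\ndef"

# Dot: skips the line break by default, matches it with re.DOTALL.
dot_default = re.findall(r".", "a\nb")
dot_all = re.findall(r".", "a\nb", re.DOTALL)

# Caret and dollar: string-only by default, per-line with re.MULTILINE.
caret_default = re.findall(r"^.", text)
caret_multiline = re.findall(r"^.", text, re.MULTILINE)
dollar_default = re.findall(r".$", text)
dollar_multiline = re.findall(r".$", text, re.MULTILINE)

# \A and Python's \Z ignore the multi-line option.
start_only = re.findall(r"\A.", text, re.MULTILINE)
end_only = re.findall(r".\Z", text, re.MULTILINE)

# $ also matches before a final line break; Python's \Z does not.
dollar_before_final_break = re.findall(r".$", "abc\n")
strict_end = re.findall(r".\Z", "abc\n")

# \b: the only boundary that follows a character in "abc" is at the end.
word_end = re.findall(r".\b", "abc")

print(caret_multiline, dollar_multiline, word_end)
```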

Character: \B
Description: Matches at the position between two word characters (i.e. the position between \w\w) as well as at the position between two non-word characters (i.e. \W\W).
Example: \B.\B matches b in abc

Alternation

Character: | (pipe)
Description: Causes the regex engine to match either the part on the left side, or the part on the right side. Can be strung together into a series of options.
Example: abc|def|xyz matches abc, def or xyz

Character: | (pipe)
Description: The pipe has the lowest precedence of all operators. Use grouping to alternate only part of the regular expression.
Example: abc(def|xyz) matches abcdef or abcxyz

Quantifiers

Character: ? (question mark)
Description: Makes the preceding item optional. Greedy, so the optional item is included in the match if possible.
Example: abc? matches ab or abc

Character: ??
Description: Makes the preceding item optional. Lazy, so the optional item is excluded in the match if possible. This construct is often excluded from documentation because of its limited use.
Example: abc?? matches ab or abc

Character: * (star)
Description: Repeats the previous item zero or more times. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is not matched at all.
Example: ".*" matches "def" "ghi" in abc "def" "ghi" jkl

Character: *? (lazy star)
Description: Repeats the previous item zero or more times. Lazy, so the engine first attempts to skip the previous item, before trying permutations with ever increasing matches of the preceding item.
Example: ".*?" matches "def" in abc "def" "ghi" jkl

Character: + (plus)
Description: Repeats the previous item once or more. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is matched only once.
Example: ".+" matches "def" "ghi" in abc "def" "ghi" jkl

Character: +? (lazy plus)
Description: Repeats the previous item once or more. Lazy, so the engine first matches the previous item only once, before trying permutations with ever increasing matches of the preceding item.
Example: ".+?" matches "def" in abc "def" "ghi" jkl

Character: {n} where n is an integer >= 1
Description: Repeats the previous item exactly n times.
Example: a{3} matches aaa

Character: {n,m} where n >= 0 and m >= n
Description: Repeats the previous item between n and m times. Greedy, so repeating m times is tried before reducing the repetition to n times.
Example: a{2,4} matches aaaa, aaa or aa
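The \B, alternation and greedy/lazy entries above can be reproduced with the reference's own example strings. This is a minimal sketch assuming Python's re module; re.finditer is used for the grouped pattern so that whole matches are collected rather than the group contents.

```python
import re

text = 'abc "def" "ghi" jkl'

# Greedy quantifiers take as much as possible; lazy ones as little as possible.
greedy_star = re.search(r'".*"', text).group()
lazy_star = re.search(r'".*?"', text).group()
greedy_plus = re.search(r'".+"', text).group()
lazy_plus = re.search(r'".+?"', text).group()

# ? includes the optional item when it can; ?? leaves it out when it can.
optional_greedy = re.search(r"abc?", "abc").group()
optional_lazy = re.search(r"abc??", "abc").group()

# The pipe has the lowest precedence, so grouping limits what is alternated.
alternation = re.findall(r"abc|def|xyz", "xyz def abc")
grouped = [m.group() for m in re.finditer(r"abc(def|xyz)", "abcdef abcxyz abc")]

# \B.\B matches a character with no word boundary on either side of it.
inner = re.findall(r"\B.\B", "abc")

print(greedy_star, lazy_star, optional_lazy, grouped, inner)
```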

Character: {n,m}? where n >= 0 and m >= n
Description: Repeats the previous item between n and m times. Lazy, so repeating n times is tried before increasing the repetition to m times.
Example: a{2,4}? matches aa, aaa or aaaa

Character: {n,} where n >= 0
Description: Repeats the previous item at least n times. Greedy, so as many items as possible will be matched before trying permutations with fewer matches of the preceding item, up to the point where the preceding item is matched only n times.
Example: a{2,} matches aaaaa in aaaaa

Character: {n,}? where n >= 0
Description: Repeats the previous item n or more times. Lazy, so the engine first matches the previous item n times, before trying permutations with ever increasing matches of the preceding item.
Example: a{2,}? matches aa in aaaaa
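A short sketch of the counted quantifiers, again assuming Python's re module. With nothing after the quantifier to force backtracking, the greedy forms stop at the upper bound and the lazy forms stop at the lower bound.

```python
import re

subject = "aaaaa"

# {n}: exactly n repetitions.
exact = re.fullmatch(r"a{3}", "aaa") is not None

# {n,m}: greedy tries m repetitions first; the lazy form tries n first.
greedy_range = re.search(r"a{2,4}", subject).group()
lazy_range = re.search(r"a{2,4}?", subject).group()

# {n,}: at least n repetitions, greedy or lazy.
greedy_open = re.search(r"a{2,}", subject).group()
lazy_open = re.search(r"a{2,}?", subject).group()

print(exact, greedy_range, lazy_range, greedy_open, lazy_open)
```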
