
01/07/2017 Regex Cheat Sheet

Quick-Start: Regex Cheat Sheet

The tables below are a reference to basic regex. While reading the rest of the site, when in doubt, you can
always come back and look here. (If you want a bookmark, here's a direct link to the regex reference
tables.) I encourage you to print the tables so you have a cheat sheet on your desk for quick reference.

The tables are not exhaustive, for two reasons. First, every regex flavor is different, and I didn't want to
crowd the page with overly exotic syntax. For a full reference to the particular regex flavors you'll be using,
it's always best to go straight to the source. In fact, for some regex engines (such as Perl, PCRE, Java and
.NET) you may want to check once a year, as their creators often introduce new features.

The other reason the tables are not exhaustive is that I wanted them to serve as a quick introduction to
regex. If you are a complete beginner, you should get a firm grasp of basic regex syntax just by reading the
examples in the tables. I tried to introduce features in a logical order and to keep out oddities that I've
never seen in actual use, such as the "bell character". With these tables as a jumping board, you will be able
to advance to mastery by exploring the other pages on the site.

How to use the tables
The tables are meant to serve as an accelerated regex course, and they are meant to be read slowly, one
line at a time. On each line, in the leftmost column, you will find a new element of regex syntax. The next
column, "Legend", explains what the element means (or encodes) in the regex syntax. The next two
columns work hand in hand: the "Example" column gives a valid regular expression that uses the element,
and the "Sample Match" column presents a text string that could be matched by the regular expression.
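
As a rough illustration of how the "Example" and "Sample Match" columns work together, here is a minimal sketch that assumes Python 3 and its built-in `re` module. The helper name `matches_sample` is hypothetical and not part of the tables. Other engines may treat some rows differently.

```python
import re


def matches_sample(example: str, sample: str) -> bool:
    """Return True when the Example regex matches the entire Sample Match string."""
    return re.fullmatch(example, sample) is not None


# (Example, Sample Match) pairs copied from rows of the tables.
rows = [
    (r"file_\d\d", "file_25"),
    (r"\D{3}", "ABC"),
    (r"A*B*C*", "AAACC"),
    (r"plurals?", "plural"),
]

for example, sample in rows:
    print(f"{example!r} matches {sample!r}: {matches_sample(example, sample)}")
```

Using `re.fullmatch` keeps the check strict: the sample has to be matched from start to end, which is how most rows of the tables are meant to be read.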

You can read the tables online, of course, but if you suffer from even the mildest case of online-ADD
(attention deficit disorder), like most of us… Well then, I highly recommend you print them out. You'll be
able to study them slowly, and to use them as a cheat sheet later, when you are reading the rest of the site
or experimenting with your own regular expressions.


If you overdose, make sure not to miss the next page, which comes back down to Earth and talks about
some really cool stuff: The 1001 ways to use Regex.

Regex Accelerated Course and Cheat Sheet
For easy navigation, here are some jumping points to various sections of the page:

✽ Characters
✽ Quantifiers
✽ More Characters
✽ Logic
✽ More White-Space
✽ More Quantifiers
✽ Character Classes
✽ Anchors and Boundaries
✽ POSIX Classes
✽ Inline Modifiers
✽ Lookarounds
✽ Character Class Operations
✽ Other Syntax

Characters

Character   Legend                                        Example                         Sample Match
\d          Most engines: one digit from 0 to 9           file_\d\d                       file_25
\d          .NET, Python 3: one Unicode digit in any      file_\d\d                       file_9੩
            script
\w          Most engines: "word character": ASCII         \w-\w\w\w                       A-b_1
            letter, digit or underscore
\w          Python 3: "word character": Unicode           \w-\w\w\w                       字-ま_۳
            letter, ideogram, digit, or underscore
\w          .NET: "word character": Unicode letter,       \w-\w\w\w                       字-ま‿۳
            ideogram, digit, or connector
\s          Most engines: "whitespace character":         a\sb\sc                         a b, then c on the next line
            space, tab, newline, carriage return,
            vertical tab
\s          .NET, Python 3, JavaScript: "whitespace       a\sb\sc                         a b, then c on the next line
            character": any Unicode separator
\D          One character that is not a digit as          \D\D\D                          ABC
            defined by your engine's \d
\W          One character that is not a word              \W\W\W\W\W                      *-+=)
            character as defined by your engine's \w
\S          One character that is not a whitespace        \S\S\S\S                        Yoyo
            character as defined by your engine's \s

Quantifiers

Quantifier  Legend                                        Example                         Sample Match
+           One or more                                   Version \w-\w+                  Version A-b1_1
{3}         Exactly three times                           \D{3}                           ABC
{2,4}       Two to four times                             \d{2,4}                         156
{3,}        Three or more times                           \w{3,}                          regex_tutorial
*           Zero or more times                            A*B*C*                          AAACC
?           Once or none                                  plurals?                        plural

More Characters

Character   Legend                                        Example                         Sample Match
.           Any character except line break               a.c                             abc
.           Any character except line break               .*                              whatever, man.
\.          A period (special character: needs to be      a\.c                            a.c
            escaped by a \)
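
The engine-specific rows above can be checked directly. The following is a minimal sketch for Python 3's built-in `re` module only. The variable names are made up for illustration, and the `re.ASCII` flag is a Python feature rather than something the tables cover.

```python
import re

# In a Python 3 str pattern, \d matches digits from any script...
unicode_digits = re.fullmatch(r"file_\d\d", "file_9\u0a69")  # U+0A69 is the Gurmukhi digit three
# ...unless the re.ASCII flag restricts \d, \w and \s to ASCII.
ascii_digits = re.fullmatch(r"file_\d\d", "file_9\u0a69", flags=re.ASCII)

# {2,4} takes two to four digits at a time; a lone digit cannot match.
runs = re.findall(r"\d{2,4}", "1 22 333 55555")

# An unescaped dot matches any character except a line break; \. matches only a period.
dot_any = re.fullmatch(r"a.c", "abc")
dot_literal = re.fullmatch(r"a\.c", "abc")

print(bool(unicode_digits), bool(ascii_digits), runs, bool(dot_any), bool(dot_literal))
```

The first two lines are the practical meaning of the "Most engines" versus "Python 3" rows: the same pattern accepts or rejects the same text depending on how the engine defines `\d`.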

More Characters (continued)

Character   Legend                                        Example                         Sample Match
\           Escapes a special character                   \.\*\+\? \$\^\/\\               .*+? $^/\
\           Escapes a special character                   \[\{\(\)\}\]                    [{()}]

Logic

Logic       Legend                                        Example                         Sample Match
|           Alternation / OR operand                      22|33                           33
( … )       Capturing group                               A(nt|pple)                      Apple (captures "pple")
\1          Contents of Group 1                           r(\w)g\1x                       regex
\2          Contents of Group 2                           (\d\d)\+(\d\d)=\2\+\1           12+65=65+12
(?: … )     Non-capturing group                           A(?:nt|pple)                    Apple

More White-Space

Character   Legend                                        Example                         Sample Match
\t          Tab                                           T\t\w{2}                        T, a tab, then ab
\r          Carriage return character                     see below
\n          Line feed character                           see below
\r\n        Line separator on Windows                     AB\r\nCD                        AB, then CD on the next line
\N          Perl, PCRE (C, PHP, R…): one character        \N+                             ABC
            that is not a line break
\h          Perl, PCRE (C, PHP, R…), Java: one
            horizontal whitespace character: tab or
            Unicode space separator
\H          One character that is not a horizontal
            whitespace
\v          .NET, JavaScript, Python, Ruby: vertical tab
\v          Perl, PCRE (C, PHP, R…), Java: one vertical
            whitespace character: line feed, carriage
            return, vertical tab, form feed, paragraph
            or line separator
\V          Perl, PCRE (C, PHP, R…), Java: any
            character that is not a vertical whitespace
\R          Perl, PCRE (C, PHP, R…), Java: one line
            break (carriage return + line feed pair,
            and all the characters matched by \v)

More Quantifiers

Quantifier  Legend                                        Example                         Sample Match
+           The + (one or more) is "greedy"               \d+                             12345
?           Makes quantifiers "lazy"                      \d+?                            1 in 12345
*           The * (zero or more) is "greedy"              A*                              AAA
?           Makes quantifiers "lazy"                      A*?                             empty in AAA
{2,4}       Two to four times, "greedy"                   \w{2,4}                         abcd
?           Makes quantifiers "lazy"                      \w{2,4}?                        ab in abcd

Character Classes

Character   Legend                                        Example                         Sample Match
[ … ]       One of the characters in the brackets         [AEIOU]                         One uppercase vowel
[ … ]       One of the characters in the brackets         T[ao]p                          Tap or Top
-           Range indicator                               [a-z]                           One lowercase letter
[x-y]       One of the characters in the range from x     [A-Z]+                          GREAT
            to y
[ … ]       One of the characters in the brackets         [AB1-5w-z]                      One of either: A,B,1,2,3,4,5,w,x,y,z
[x-y]       One of the characters in the range from x     [ -~]+                          Characters in the printable section of the ASCII table
            to y
[^x]        One character that is not x                   [^a-z]{3}                       A1!
[^x-y]      One of the characters not in the range        [^ -~]+                         Characters that are not in the printable section of the ASCII table
            from x to y
[\d\D]      One character that is a digit or a            [\d\D]+                         Any characters, including new lines, which the regular dot doesn't match
            non-digit
[\x41]      Matches the character at hexadecimal          [\x41-\x45]{3}                  ABE
            position 41 in the ASCII table, i.e. A

Anchors and Boundaries

Anchor      Legend                                        Example                         Sample Match
^           Start of string or start of line depending    ^abc .*                         abc (line start)
            on multiline mode. (But when [^inside
            brackets], it means "not")
$           End of string or end of line depending on     .*? the end$                    this is the end
            multiline mode. Many engine-dependent
            subtleties.
\A          Beginning of string (all major engines        \Aabc[\d\D]*                    abc (string start)
            except JS)
\z          Very end of the string. Not available in      the end\z                       this is...\n...the end
            Python and JS
\Z          End of string or (except Python) before       the end\Z                       this is...\n...the end\n
            final line break. Not available in JS
\G          Beginning of String or End of Previous
            Match. .NET, Java, PCRE (C, PHP, R…), Perl,
            Ruby
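
To see groups, backreferences, lazy quantifiers and the end-of-string anchors in action, here is a small sketch that assumes Python 3's built-in `re` module; the variable names are illustrative only. As the tables warn, anchors in particular behave differently from engine to engine, so treat the last two lines as a statement about Python, not about regex in general.

```python
import re

# Capturing group and backreferences.
apple = re.fullmatch(r"A(nt|pple)", "Apple")
swapped = re.fullmatch(r"(\d\d)\+(\d\d)=\2\+\1", "12+65=65+12")

# Greedy quantifiers take as much as they can; adding ? makes them take as little as possible.
greedy = re.search(r"\d+", "12345").group()
lazy = re.search(r"\d+?", "12345").group()
lazy_range = re.search(r"\w{2,4}?", "abcd").group()

# Negated class and a range written with hexadecimal escapes.
negated = re.fullmatch(r"[^a-z]{3}", "A1!")
hex_range = re.fullmatch(r"[\x41-\x45]{3}", "ABE")

# In Python, $ also matches just before a final line break, while \Z only matches at the very end.
dollar = re.search(r"the end$", "this is the end\n")
big_z = re.search(r"the end\Z", "this is the end\n")

print(apple.group(1), greedy, lazy, lazy_range, bool(dollar), bool(big_z))
```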

Anchors and Boundaries (continued)

Anchor      Legend                                        Example                         Sample Match
\b          Word boundary. Most engines: position where   Bob.*\bcat\b                    Bob ate the cat
            one side only is an ASCII letter, digit or
            underscore
\b          Word boundary. .NET, Java, Python 3, Ruby:    Bob.*\b\кошка\b                 Bob ate the кошка
            position where one side only is a Unicode
            letter, digit or underscore
\B          Not a word boundary                           c.*\Bcat\B.*                    copycats

POSIX Classes

Character   Legend                                        Example                         Sample Match
[:alpha:]   PCRE (C, PHP, R…): ASCII letters A-Z and      [8[:alpha:]]+                   WellDone88
            a-z
[:alpha:]   Ruby 2: Unicode letter or ideogram            [[:alpha:]\d]+                  кошка99
[:alnum:]   PCRE (C, PHP, R…): ASCII digits and           [[:alnum:]]{10}                 ABCDE12345
            letters A-Z and a-z
[:alnum:]   Ruby 2: Unicode digit, letter or ideogram     [[:alnum:]]{10}                 кошка90210
[:punct:]   PCRE (C, PHP, R…): ASCII punctuation mark     [[:punct:]]+                    ?!.,:;
[:punct:]   Ruby: Unicode punctuation mark                [[:punct:]]+                    ‽,:〽⁆

Inline Modifiers

None of these are supported in JavaScript. In Ruby, beware of (?s) and (?m).

Modifier    Legend                                        Example                         Sample Match
(?i)        Case-insensitive mode (except JavaScript)     (?i)Monday                      monDAY
(?s)        DOTALL mode (except JS and Ruby). The dot     (?s)From A.*to Z                From A, then to Z on the next line
            (.) matches new line characters (\r\n).
            Also known as "single-line mode" because
            the dot treats the entire input as a single
            line
(?m)        Multiline mode (except Ruby and JS). ^ and $  (?m)1\r\n^2$\r\n^3$             1, 2 and 3 on separate lines
            match at the beginning and end of every line
(?m)        In Ruby: the same as (?s) in other engines,   (?m)From A.*to Z                From A, then to Z on the next line
            i.e. DOTALL mode, i.e. dot matches line
            breaks
(?x)        Free-spacing mode (except JavaScript). Also   (?x) # this is a                abc d
            known as comment mode or whitespace mode      # comment
                                                          abc # write on multiple
                                                          # lines
                                                          [ ]d # spaces must be
                                                          # in brackets
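
Word boundaries and the inline modifiers can be tried out as follows. This is a sketch for Python 3's built-in `re` module, with illustrative variable names. POSIX classes are left out because that module does not support the bracketed `[:alpha:]` syntax.

```python
import re

# \b sits between a word character and a non-word character; \B is any other position.
whole_word = re.search(r"\bcat\b", "Bob ate the cat")
not_whole_word = re.search(r"\bcat\b", "copycats")
inside_word = re.search(r"\Bcat\B", "copycats")
# Python 3 str patterns use the Unicode definition of a word character.
cyrillic = re.search(r"\bкошка\b", "Bob ate the кошка")

case_insensitive = re.fullmatch(r"(?i)Monday", "monDAY")
dotall = re.fullmatch(r"(?s)From A.*to Z", "From A\nto Z")
no_dotall = re.fullmatch(r"From A.*to Z", "From A\nto Z")
line_digits = re.findall(r"(?m)^\d$", "1\n2\n3")

free_spacing = re.fullmatch(
    r"""(?x)   # free-spacing mode: whitespace and comments are ignored
    abc        # the literal text abc
    [ ]d       # a literal space has to sit inside brackets
    """,
    "abc d",
)

print(bool(whole_word), bool(not_whole_word), bool(dotall), bool(no_dotall), line_digits)
```

Writing the modifier at the very start of the pattern, as the tables do, is the portable choice; recent Python versions reject global inline flags that appear later in the pattern.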

Inline Modifiers (continued)

Modifier    Legend                                        Example                         Sample Match
(?n)        .NET: named capture only. Turns all
            (parentheses) into non-capture groups. To
            capture, use named groups.
(?d)        Java: Unix linebreaks only. The dot and the
            ^ and $ anchors are only affected by \n

Lookarounds

Lookaround  Legend                                        Example                         Sample Match
(?=…)       Positive lookahead                            (?=\d{10})\d{5}                 01234 in 0123456789
(?<=…)      Positive lookbehind                           (?<=\d)cat                      cat in 1cat
(?!…)       Negative lookahead                            (?!theatre)the\w+               theme
(?<!…)      Negative lookbehind                           \w{3}(?<!mon)ster               Munster

Character Class Operations

Class Operation  Legend                                        Example                         Sample Match
[…-[…]]          .NET: character class subtraction. One        [a-z-[aeiou]]                   Any lowercase consonant
                 character that is in those on the left, but
                 not in the subtracted class.
[…-[…]]          .NET: character class subtraction.            [\p{IsArabic}-[\D]]             An Arabic character that is not a non-digit, i.e., an Arabic digit
[…&&[…]]         Java, Ruby 2+: character class                [\S&&[\D]]                      A non-whitespace character that is a non-digit
                 intersection. One character that is both in
                 those on the left and in the && class.
[…&&[…]]         Java, Ruby 2+: character class                [\S&&[\D]&&[^a-zA-Z]]           A non-whitespace character that is a non-digit and not a letter
                 intersection.
[…&&[^…]]        Java, Ruby 2+: character class subtraction    [a-z&&[^aeiou]]                 An English lowercase letter that is not a vowel
                 is obtained by intersecting a class with a
                 negated class
[…&&[^…]]        Java, Ruby 2+: character class subtraction    [\p{InArabic}&&[^\p{L}\p{N}]]   An Arabic character that is not a letter or a number

Other Syntax

Syntax      Legend                                        Example                         Sample Match
\K          Keep Out. Perl, PCRE (C, PHP, R…), Python's   prefix\K\d+                     12
            alternate regex engine, Ruby 2+: drop
            everything that was matched so far from the
            overall match to be returned
\Q…\E       Perl, PCRE (C, PHP, R…), Java: treat          \Q(C++ ?)\E                     (C++ ?)
            anything between the delimiters as a literal
            string. Useful to escape metacharacters.
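
The lookaround rows work as written in Python 3's built-in `re` module, which the sketch below assumes. That module does not support class subtraction or intersection, `\K`, or `\Q…\E`, so the second half shows stand-ins that are my own substitutions, not syntax from the tables: a negative lookahead before a class, a lookbehind, and `re.escape`.

```python
import re

# Lookarounds check what is around the current position without consuming it.
lookahead = re.search(r"(?=\d{10})\d{5}", "0123456789").group()
lookbehind = re.search(r"(?<=\d)cat", "1cat").group()
theme = re.fullmatch(r"(?!theatre)the\w+", "theme")
theatre = re.fullmatch(r"(?!theatre)the\w+", "theatre")
munster = re.search(r"\w{3}(?<!mon)ster", "Munster")
monster = re.search(r"\w{3}(?<!mon)ster", "monster")

# Stand-in for [a-z&&[^aeiou]]: reject vowels with a lookahead, then match a lowercase letter.
consonants = re.findall(r"(?![aeiou])[a-z]", "regex")
# Stand-in for prefix\K\d+: a lookbehind keeps "prefix" out of the returned match.
after_prefix = re.search(r"(?<=prefix)\d+", "prefix12").group()
# Stand-in for \Q(C++ ?)\E: re.escape turns a literal string into a safe pattern.
literal = re.fullmatch(re.escape("(C++ ?)"), "(C++ ?)")

print(lookahead, lookbehind, consonants, after_prefix, bool(literal))
```

The lookbehind stand-in only works because "prefix" has a fixed length, which Python's `re` requires for lookbehinds; `\K` in the engines listed above has no such limit.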