Pattern | Type |
---|---|
(*ACCEPT) | Control verb |
(*FAIL) | Control verb |
(*MARK:NAME) | Control verb |
(*COMMIT) | Control verb |
(*PRUNE) | Control verb |
(*SKIP) | Control verb |
(*THEN) | Control verb |
(*UTF) | Pattern modifier |
(*UTF8) | Pattern modifier |
(*UTF16) | Pattern modifier |
(*UTF32) | Pattern modifier |
(*UCP) | Pattern modifier |
(*CR) | Line break modifier |
(*LF) | Line break modifier |
(*CRLF) | Line break modifier |
(*ANYCRLF) | Line break modifier |
(*ANY) | Line break modifier |
\R | Line break modifier |
(*BSR_ANYCRLF) | Line break modifier |
(*BSR_UNICODE) | Line break modifier |
(*LIMIT_MATCH=x) | Regex engine modifier |
(*LIMIT_RECURSION=d) | Regex engine modifier |
(*NO_AUTO_POSSESS) | Regex engine modifier |
(*NO_START_OPT) | Regex engine modifier |
Getting Started
Import the regular expressions module
import re
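As a first step after the import, a minimal sketch of the two most common calls might look like this. The sample text and variable names are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order 66 shipped on 2024-05-17"

# re.search returns the first match anywhere in the string, or None.
match = re.search(r"\d{4}-\d{2}-\d{2}", text)
date = match.group(0) if match else None

# re.findall returns every non-overlapping match as a list of strings.
numbers = re.findall(r"\d+", text)
```

Raw strings (`r"..."`) keep the backslashes intact, which avoids double escaping in patterns.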
Control verb
POSIX Character Classes
Character Class | Same as | Meaning |
---|---|---|
[[:alnum:]] | [0-9A-Za-z] | Letters and digits |
[[:alpha:]] | [A-Za-z] | Letters |
[[:ascii:]] | [\x00-\x7F] | ASCII codes 0-127 |
[[:blank:]] | [\t ] | Space or tab only |
[[:cntrl:]] | [\x00-\x1F\x7F] | Control characters |
[[:digit:]] | [0-9] | Decimal digits |
[[:graph:]] | [[:alnum:][:punct:]] | Visible characters (not space) |
[[:lower:]] | [a-z] | Lowercase letters |
[[:print:]] | [ -~] == [ [:graph:]] | Visible characters and space |
[[:punct:]] | [!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~] | Punctuation characters |
[[:space:]] | [\t\n\v\f\r ] | Whitespace |
[[:upper:]] | [A-Z] | Uppercase letters |
[[:word:]] | [0-9A-Za-z_] | Word characters |
[[:xdigit:]] | [0-9A-Fa-f] | Hexadecimal digits |
[[:<:]] | \b(?=\w) | Start of word |
[[:>:]] | \b(?<=\w) | End of word |
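Python's `re` module does not implement POSIX bracket classes, so one workaround is to spell out the equivalents from the "Same as" column. The sketch below assumes a small lookup table. `POSIX_EQUIVALENTS` and `posix_class` are hypothetical names, not part of any library.

```python
import re

# Explicit equivalents copied from the "Same as" column above.
POSIX_EQUIVALENTS = {
    "alnum": "0-9A-Za-z",
    "alpha": "A-Za-z",
    "digit": "0-9",
    "xdigit": "0-9A-Fa-f",
    "blank": r"\t ",
}

def posix_class(name: str) -> str:
    """Return a bracket expression for a POSIX class name (hypothetical helper)."""
    return "[" + POSIX_EQUIVALENTS[name] + "]"

# One or more hexadecimal digits, i.e. the equivalent of [[:xdigit:]]+.
hex_run = re.compile(posix_class("xdigit") + "+")
found = hex_run.findall("#1aF09c; zz")
```

Here `found` holds the single hexadecimal run in the sample string.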
Recurse
Pattern | Description |
---|---|
(?R) | Recurse entire pattern |
(?1) | Recurse first subpattern |
(?+1) | Recurse first relative subpattern |
(?&name) | Recurse subpattern name |
(?P=name) | Match subpattern name |
(?P>name) | Recurse subpattern name |
Flags/Modifiers
Pattern | Description |
---|---|
g | Global |
m | Multiline |
i | Case insensitive |
x | Ignore whitespace |
s | Single line |
u | Unicode |
X | eXtended |
U | Ungreedy |
A | Anchor |
J | Duplicate group names |
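The letters above follow PCRE-style conventions. In this sketch only `i`, `m`, `s` and `x` are mapped onto Python `re` flags. `g` has no flag in Python, and `re.findall` plays a similar role. The sample text is hypothetical.

```python
import re

text = "First line\nsecond LINE"

# i -> re.I: case-insensitive matching.
lines_ci = re.findall(r"line", text, flags=re.I)

# m -> re.M: ^ and $ also match at the start and end of each line.
starts = re.findall(r"^\w+", text, flags=re.M)

# s -> re.S: the dot also matches a newline.
dot_all = re.search(r"First.*LINE", text, flags=re.S) is not None
dot_default = re.search(r"First.*LINE", text) is not None

# x -> re.X: whitespace and comments inside the pattern are ignored.
pattern = re.compile(r"""
    (\w+)   # first word
    \s+     # separator
    (\w+)   # second word
""", re.X)
words = pattern.match(text).groups()
```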
Lookarounds
Pattern | Description |
---|---|
(?=...) | Positive Lookahead |
(?!...) | Negative Lookahead |
(?<=...) | Positive Lookbehind |
(?<!...) | Negative Lookbehind |
Lookarounds assert that a pattern matches immediately before (lookbehind) or after (lookahead) the current position, without including that text in the result.
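A short sketch of the four lookarounds using Python's `re` module. The sample string is hypothetical. Only the digits or words themselves end up in the results, never the surrounding `$` or `:`.

```python
import re

text = "price: $42, id: 99"

# Positive lookbehind: digits preceded by a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digit runs not preceded by "$" or by another digit.
plain = re.findall(r"(?<![$\d])\d+", text)

# Positive lookahead: words immediately followed by a colon.
labels = re.findall(r"\w+(?=:)", text)

# Negative lookahead: "id" only when it is not followed by a colon.
no_colon = re.findall(r"\bid\b(?!:)", text)
```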
Assertions
Pattern | Description |
---|---|
(?(1)yes|no) | Conditional statement |
(?(R)yes|no) | Conditional statement |
(?(R#)yes|no) | Recursive Conditional statement |
(?(R&name)yes|no) | Conditional statement |
(?(?=...)yes|no) | Lookahead conditional |
(?(?<=...)yes|no) | Lookbehind conditional |
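Of the conditionals listed, Python's `re` module accepts the group-reference form `(?(1)yes|no)`. The sketch below uses it to require a closing parenthesis only when an opening one was captured. The phone-like format is a made-up example.

```python
import re

# Group 1 optionally captures "(".
# (?(1)\)|-) requires ")" if group 1 matched, otherwise "-".
phone = re.compile(r"^(\()?\d{3}(?(1)\)|-)\d{4}$")

samples = ["(555)1234", "555-1234", "(555-1234", "555)1234"]
valid = [s for s in samples if phone.match(s)]
```

The two unbalanced samples are rejected because the conditional picks the branch that does not fit them.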
Group Constructs
Pattern | Description |
---|---|
(...) | Capture everything enclosed |
(a|b) | Match either a or b |
(?:...) | Match everything enclosed without capturing it |
(?>...) | Atomic group (non-capturing) |
(?|...) | Duplicate subpattern group number |
(?#...) | Comment |
(?'name'...) | Named Capturing Group |
(?<name>...) | Named Capturing Group |
(?P<name>...) | Named Capturing Group |
(?imsxXU) | Inline modifiers |
(?(DEFINE)...) | Pre-define patterns before using them |
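A brief sketch of capturing, non-capturing and named groups in Python, which uses the `(?P<name>...)` spelling for named groups. The date string is a hypothetical input.

```python
import re

m = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})(?:-(\d{2}))?", "2024-05-17")

year = m.group("year")      # named group, also reachable as group 1
day = m.group(3)            # unnamed capture inside the non-capturing wrapper
named = m.groupdict()       # only named groups appear here
all_groups = m.groups()     # every capturing group, in order

# Alternation inside a capturing group: (a|e) records which branch matched.
branch = re.fullmatch(r"gr(a|e)y", "grey").group(1)
```

The `(?:...)` wrapper groups the optional day part without adding a group number of its own.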
Substitution
Pattern | Description |
---|---|
\0 | Complete match contents |
\1 | Contents in capture group 1 |
$1 | Contents in capture group 1 |
${foo} | Contents in capture group foo |
\x20 | Hexadecimal replacement values |
\x{06fa} | Hexadecimal replacement values |
\t | Tab |
\r | Carriage return |
\n | Newline |
\f | Form-feed |
\U | Uppercase Transformation |
\L | Lowercase Transformation |
\E | Terminate any Transformation |
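Replacement syntax varies by engine. In Python's `re.sub`, groups are referenced with `\1` or `\g<name>` rather than `$1` or `${foo}`, and the `\U`/`\L` case transformations are not available, so this sketch substitutes a replacement function for them. All sample strings are hypothetical.

```python
import re

# Numbered groups: swap two words.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")

# Named groups: reorder a date with \g<name>.
iso = re.sub(
    r"(?P<d>\d{2})/(?P<m>\d{2})/(?P<y>\d{4})",
    r"\g<y>-\g<m>-\g<d>",
    "17/05/2024",
)

# \g<0> stands for the complete match.
wrapped = re.sub(r"\d+", r"<\g<0>>", "a1b22")

# Uppercase transformation done with a function instead of \U.
capitalized = re.sub(r"\b[a-z]", lambda m: m.group(0).upper(), "hello world")
```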
Anchors
Pattern | Description |
---|---|
\G | Start of match |
^ | Start of string (or of each line in multiline mode) |
$ | End of string (or of each line in multiline mode) |
\A | Start of string |
\Z | End of string, or before a final newline |
\z | Absolute end of string |
\b | A word boundary |
\B | Non-word boundary |
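A small sketch of how the anchors behave, limited to `^`, `$`, `\b` and `\B` as implemented by Python's `re` module. The sample text is hypothetical.

```python
import re

text = "cat concat\ncat"

# ^ matches only at the very start unless multiline mode is on.
start_only = re.findall(r"^cat", text)
each_line = re.findall(r"^cat", text, flags=re.M)

# \b requires a word boundary, so "cat" inside "concat" is skipped.
whole_words = re.findall(r"\bcat\b", text)

# \B requires the absence of a boundary, so only the "cat" in "concat" matches.
inside_word = re.findall(r"\Bcat", text)

# $ matches at the end of the string.
ends_with_cat = re.search(r"cat$", text) is not None
```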
Meta Sequences
Pattern | Description |
---|---|
. | Any single character |
\s | Any whitespace character |
\S | Any non-whitespace character |
\d | Any digit, same as [0-9] |
\D | Any non-digit, same as [^0-9] |
\w | Any word character |
\W | Any non-word character |
\X | Any Unicode sequence, line breaks included |
\C | Match one data unit |
\R | Unicode newlines |
\v | Vertical whitespace character |
\V | Negation of \v - anything except newlines and vertical tabs |
\h | Horizontal whitespace character |
\H | Negation of \h |
\K | Reset match |
\n | Match nth subpattern |
\pX | Unicode property X |
\p{...} | Unicode property or script category |
\PX | Negation of \pX |
\P{...} | Negation of \p |
\Q...\E | Quote; treat as literals |
\k<name> | Match subpattern name |
\k'name' | Match subpattern name |
\k{name} | Match subpattern name |
\gn | Match nth subpattern |
\g{n} | Match nth subpattern |
\g<n> | Recurse nth capture group |
\g'n' | Recurse nth capture group |
\g{-n} | Match nth relative previous subpattern |
\g<+n> | Recurse nth relative upcoming subpattern |
\g'+n' | Recurse nth relative upcoming subpattern |
\g'letter' | Recurse named capture group letter |
\g{letter} | Match previously-named capture group letter |
\g<letter> | Recurse named capture group letter |
\xYY | Hex character YY |
\x{YYYY} | Hex character YYYY |
\ddd | Octal character ddd |
\cY | Control character Y |
[\b] | Backspace character |
\ | Makes any character literal |
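Several sequences in this table (for example `\K`, `\R`, `\h` and `\X`) are PCRE-specific. The sketch below sticks to the ones Python's `re` module shares: `\d`, `\w`, `\s` and a numbered backreference. The sample strings are hypothetical.

```python
import re

text = "Tel: 555-0199\tExt 7"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters
parts = re.split(r"\s+", text)      # split on any whitespace, including the tab

# \1 matches the same text that group 1 captured: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)
```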
Common Metacharacters
- ^
- {
- +
- <
- [
- *
- )
- >
- .
- (
- |
- $
- \
- ?

Escape these special characters with \
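When the text to match comes from data rather than from a hand-written pattern, Python's `re.escape` can add the backslashes automatically. A minimal sketch, with a hypothetical literal string:

```python
import re

literal = "1+1=2 (really?)"
haystack = "proof: 1+1=2 (really?)"

# Unescaped, "+", "(", ")" and "?" act as metacharacters, so the search fails.
raw_hit = re.search(literal, haystack) is not None

# Escaped by hand with backslashes.
manual_hit = re.search(r"1\+1=2 \(really\?\)", haystack) is not None

# Escaped automatically.
escaped = re.escape(literal)
auto_hit = re.search(escaped, haystack) is not None
```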
Quantifiers
Pattern | Description |
---|---|
a? | Zero or one of a |
a* | Zero or more of a |
a+ | One or more of a |
[0-9]+ | One or more of 0-9 |
a{3} | Exactly 3 of a |
a{3,} | 3 or more of a |
a{3,6} | Between 3 and 6 of a |
a* | Greedy quantifier |
a*? | Lazy quantifier |
a*+ | Possessive quantifier |
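The difference between greedy and lazy quantifiers is easiest to see side by side. This sketch uses a hypothetical HTML-like string. Possessive quantifiers are left out because support in Python's `re` depends on the interpreter version.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, up to the last ">".
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)

# Counted repetition: only strings of 3 to 6 "a" characters pass.
candidates = ["aa", "aaa", "aaaaaa", "aaaaaaa"]
in_range = [s for s in candidates if re.fullmatch(r"a{3,6}", s)]
```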
Character Classes
Pattern | Description |
---|---|
[abc] | A single character of: a, b or c |
[^abc] | A character except: a, b or c |
[a-z] | A character in the range: a-z |
[^a-z] | A character not in the range: a-z |
[0-9] | A digit in the range: 0-9 |
[a-zA-Z] | A character in the range: a-z or A-Z |
[a-zA-Z0-9] | A character in the range: a-z, A-Z or 0-9 |
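The classes above can be tried directly with `re.findall`. A short sketch on a hypothetical sample string:

```python
import re

text = "abc XYZ 123"

single = re.findall(r"[abc]", text)            # any one of a, b or c
not_lower = re.findall(r"[^a-z]+", text)       # runs of anything outside a-z
letters = re.findall(r"[a-zA-Z]+", text)       # runs of letters
digits = re.findall(r"[0-9]+", text)           # runs of digits
alnum = re.findall(r"[a-zA-Z0-9]+", text)      # runs of letters or digits
```

Note that the negated class also matches the spaces, since a space is not in the range a-z.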
Introduction
This is a quick cheat sheet for getting started with regular expressions.
- Regex in Python (quickref.me)
- Regex in JavaScript (quickref.me)
- Regex in PHP (quickref.me)
- Regex in Java (quickref.me)
- Regex in MySQL (quickref.me)
- Regex in Vim (quickref.me)
- Regex in Emacs (quickref.me)
- Online regex tester (regex101.com)