Regular expressions (regex) are patterns used to match character combinations in strings. They appear in virtually every programming language and in many command-line tools, though the supported syntax varies between engines. This reference covers the syntax you need most often in day-to-day development.
Metacharacters
These special characters have meaning beyond their literal value.
| Character | Meaning | Example | Matches |
|---|---|---|---|
| . | Any character except newline | a.c | abc, a1c, a-c |
| \d | Any digit (0-9) | \d{3} | 123, 456 |
| \D | Any non-digit | \D+ | abc, --- |
| \w | Word character (a-z, A-Z, 0-9, _) | \w+ | hello_42 |
| \W | Non-word character | \W | @, #, space |
| \s | Whitespace (space, tab, newline) | \s+ | space, \t\n |
| \S | Non-whitespace | \S+ | hello |
| \b | Word boundary | \bcat\b | cat in “the cat sat” |
| \B | Non-word boundary | \Bcat\B | cat in “concatenate” |
| \ | Escape special character | \. | literal . |
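A few of the rows above can be tried out directly. This is a minimal sketch using Python's built-in `re` module; the sample strings are made up for illustration.

```python
import re

text = "Order 42: the cat sat, concatenate!"

# \d+ finds runs of digits.
digits = re.findall(r"\d+", text)

# \bcat\b matches "cat" only as a whole word.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]

# \Bcat\B matches "cat" only when it sits inside a longer word.
inside_word = [m.start() for m in re.finditer(r"\Bcat\B", text)]

# An escaped dot is literal; an unescaped dot matches any character.
literal_dot = re.findall(r"a\.c", "abc a.c")
any_char = re.findall(r"a.c", "abc a.c")

print(digits, whole_word, inside_word, literal_dot, any_char)
```

Raw strings (`r"..."`) keep Python from interpreting the backslashes before the regex engine sees them.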
Quantifiers
Quantifiers control how many times a pattern element repeats.
| Quantifier | Meaning | Example | Matches |
|---|---|---|---|
| * | 0 or more | ab*c | ac, abc, abbc |
| + | 1 or more | ab+c | abc, abbc (not ac) |
| ? | 0 or 1 | colou?r | color, colour |
| {n} | Exactly n | \d{4} | 2026 |
| {n,} | n or more | \d{2,} | 42, 123, 9999 |
| {n,m} | Between n and m | \d{2,4} | 42, 123, 9999 |
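The quantifier rows can be checked with a short script. This sketch assumes Python's `re` module and uses `re.fullmatch` so that each candidate string has to match in its entirety.

```python
import re

candidates = ["ac", "abc", "abbc"]

# b* allows zero b's, so all three strings match.
star = [s for s in candidates if re.fullmatch(r"ab*c", s)]

# b+ needs at least one b, so "ac" is rejected.
plus = [s for s in candidates if re.fullmatch(r"ab+c", s)]

# u? makes the u optional, but only one u is allowed.
optional = [s for s in ["color", "colour", "colouur"] if re.fullmatch(r"colou?r", s)]

# {2,4} accepts runs of two to four digits; \b stops it from
# matching part of a longer or shorter number.
bounded = re.findall(r"\b\d{2,4}\b", "7 42 123 9999 12345")

print(star, plus, optional, bounded)
```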
Lazy (non-greedy) versions: Add ? after any quantifier to match as few characters as possible.
| Greedy | Lazy | Behavior |
|---|---|---|
| .* | .*? | Match 0+, preferring fewer |
| .+ | .+? | Match 1+, preferring fewer |
| .{2,5} | .{2,5}? | Match 2-5, preferring 2 |
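The difference between greedy and lazy matching is easiest to see on text with repeated delimiters. A small sketch with Python's `re` module; the HTML-like sample string is invented for the example.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string.
greedy = re.findall(r"<.+>", html)

# Lazy: .+? stops at the first ">" it can reach.
lazy = re.findall(r"<.+?>", html)

# The same rule applies to bounded quantifiers.
greedy_range = re.match(r".{2,5}", "abcdef").group()
lazy_range = re.match(r".{2,5}?", "abcdef").group()

print(greedy, lazy, greedy_range, lazy_range)
```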
Anchors
Anchors match positions, not characters.
| Anchor | Meaning | Example | Matches |
|---|---|---|---|
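| ^ | Start of string (or line with m flag) | ^Hello | Hello world |
| $ | End of string (or line with m flag) | world$ | Hello world |
| \b | Word boundary | \bword\b | whole word only |
| \A | Start of string (never line); not supported in JavaScript | \AStart | only at absolute start |
| \Z | End of string (never line); PCRE, Java, and .NET also allow one final newline after it and use \z for the absolute end; not supported in JavaScript | end\Z | only at end of string |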
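The effect of the multiline flag on anchors can be seen in a short sketch. It assumes Python's `re` module, where the m flag is spelled `re.MULTILINE` and `\Z` means the absolute end of the string.

```python
import re

log = "Hello world\nHello again"

# Without MULTILINE, ^ and $ anchor to the string as a whole.
start_default = re.findall(r"^Hello", log)
end_default = re.findall(r"world$", log)

# With MULTILINE, ^ and $ also anchor at each line break.
start_multiline = re.findall(r"^Hello", log, flags=re.MULTILINE)
end_multiline = re.findall(r"world$", log, flags=re.MULTILINE)

# \A and \Z ignore MULTILINE and stay tied to the string boundaries.
start_absolute = re.findall(r"\AHello", log, flags=re.MULTILINE)
end_absolute = re.findall(r"again\Z", log, flags=re.MULTILINE)

print(start_default, end_default, start_multiline, end_multiline)
print(start_absolute, end_absolute)
```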
Character Classes
| Syntax | Meaning | Example | Matches |
|---|---|---|---|
| [abc] | Any of a, b, or c | [aeiou] | any vowel |
| [^abc] | Not a, b, or c | [^0-9] | any non-digit |
| [a-z] | Range: a through z | [A-Za-z] | any letter |
| [a-zA-Z0-9] | Alphanumeric | [a-zA-Z0-9_] | same as \w |
| [\s\S] | Any character including newline | [\s\S]* | everything |
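As a quick illustration of the classes above, here is a sketch using Python's `re` module with made-up input strings.

```python
import re

# A positive class lists the characters that are allowed.
vowels = re.findall(r"[aeiou]", "regex")

# A negated class matches everything except the listed characters,
# which makes it handy for stripping unwanted input.
digits_only = re.sub(r"[^0-9]", "", "Tel: 555-0199")

two_lines = "line one\nline two"

# . stops at the newline, while [\s\S] crosses it.
dot_match = re.match(r".*", two_lines).group()
any_match = re.match(r"[\s\S]*", two_lines).group()

print(vowels, digits_only, repr(dot_match), repr(any_match))
```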
Groups and Capturing
| Syntax | Meaning | Example |
|---|---|---|
| (abc) | Capturing group | (foo)bar captures foo |
| (?:abc) | Non-capturing group | (?:foo)bar groups without capture |
| (?<name>abc) | Named capturing group | (?<year>\d{4}) |
| \1 | Backreference to group 1 | (a)\1 matches aa |
| \k<name> | Named backreference | \k<year> |
| (a\|b) | Alternation (OR) | (cat\|dog) matches either |
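The sketch below exercises each kind of group with Python's `re` module. Note one dialect difference: Python writes named groups as `(?P<name>...)` and named backreferences as `(?P=name)`, rather than the `(?<name>...)` and `\k<name>` forms shown in the table. The sample strings are invented for the example.

```python
import re

# findall returns the captured text when the pattern has a capturing group,
# and the whole match when the group is non-capturing.
captured = re.findall(r"(foo)bar", "foobar")
uncaptured = re.findall(r"(?:foo)bar", "foobar")

# Named groups can be read by name or by position.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Released 2026-03-15")
year, month = m.group("year"), m.group(2)

# \1 must repeat exactly what group 1 matched: here, a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# A named backreference requires the closing quote to equal the opening one.
quoted = re.search(r"(?P<q>['\"]).*?(?P=q)", 'say "hi" now').group()

# Alternation matches either branch.
pets = re.findall(r"cat|dog", "cat dog bird")

print(captured, uncaptured, year, month, doubled, quoted, pets)
```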
Lookahead and Lookbehind
These match a position based on what comes before or after, without consuming characters.
| Syntax | Name | Example | Matches |
|---|---|---|---|
| (?=abc) | Positive lookahead | \d(?=px) | 5 in 5px |
| (?!abc) | Negative lookahead | \d(?!px) | 5 in 5em |
| (?<=abc) | Positive lookbehind | (?<=\$)\d+ | 100 in $100 |
| (?<!abc) | Negative lookbehind | (?<!\$)\d+ | 100 in €100 |
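Because lookarounds do not consume characters, only the digits end up in the match. The sketch below uses Python's `re` module on invented input, and also shows a common pitfall: a negative lookbehind combined with `\d+` can still match the tail of a number it was meant to exclude.

```python
import re

css = "5px 7em"

# Lookahead: the unit is checked but not included in the match.
before_px = re.findall(r"\d(?=px)", css)
not_before_px = re.findall(r"\d(?!px)", css)

prices = "$100 €200"

# Positive lookbehind: digits that directly follow a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", prices)

# Pitfall: "00" in "$100" is preceded by "1", not "$", so it still matches.
naive_non_dollars = re.findall(r"(?<!\$)\d+", prices)

# Also forbidding a preceding digit keeps the match at the start of a number.
non_dollars = re.findall(r"(?<![\$\d])\d+", prices)

print(before_px, not_before_px, dollars, naive_non_dollars, non_dollars)
```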
Flags / Modifiers
Flags change how the regex engine processes the pattern.
| Flag | Name | Effect |
|---|---|---|
| g | Global | Find all matches, not just the first |
| i | Case-insensitive | A matches a |
| m | Multiline | ^ and $ match line starts/ends |
| s | Dotall / Single-line | . matches newline characters |
| u | Unicode | Enable full Unicode matching |
| x | Extended | Ignore whitespace, allow comments (not available in JavaScript) |
| y | Sticky | Match only at lastIndex position (JavaScript) |
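Flag spellings differ between languages. As a rough sketch, this is how several of the flags above map onto Python's `re` module: i, s, and x become `re.IGNORECASE`, `re.DOTALL`, and `re.VERBOSE`, while the behavior of g and y is provided by functions (`re.findall`, and `Pattern.match` with a start position) rather than by flags.

```python
import re

# i: case-insensitive matching. findall plays the role of the g flag.
greetings = re.findall(r"hello", "Hello HELLO hello", flags=re.IGNORECASE)

# s: without DOTALL the dot refuses to match a newline.
dot_default = re.search(r"a.b", "a\nb")
dot_all = re.search(r"a.b", "a\nb", flags=re.DOTALL)

# x: whitespace in the pattern is ignored and # starts a comment.
date_pattern = re.compile(
    r"""
    (\d{4})   # year
    -
    (\d{2})   # month
    """,
    flags=re.VERBOSE,
)
year_month = date_pattern.search("2026-03").groups()

# Flags can also be set inline at the start of the pattern.
inline = re.findall(r"(?i)cat", "Cat cAt")

# Closest analogue to y: match only at a given position, without scanning ahead.
number = re.compile(r"\d+")
at_zero = number.match("ab12", 0)
at_two = number.match("ab12", 2)

print(greetings, dot_default, dot_all.group(), year_month, inline, at_zero, at_two.group())
```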
Common Patterns
These widely used patterns cover frequent validation tasks. Most of them check only the general shape of the input, not whether the value is actually valid.
# Email (simplified)
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
# URL
https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)
# IPv4 Address (format only; octets above 255 still match)
\b(?:\d{1,3}\.){3}\d{1,3}\b
# Date (YYYY-MM-DD; does not check the number of days per month)
\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])
# Hex color
#?([a-fA-F0-9]{6}|[a-fA-F0-9]{3})
# Phone (US)
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
# Strong password (8+ chars, upper, lower, digit, special)
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
# HTML tag (rough match; use an HTML parser for real markup)
<\/?[\w\s]*>|<.+[\W]>
# Whitespace trimming
^\s+|\s+$
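The patterns above are often wrapped in small helper functions. The sketch below does this in Python for a few of them; the function names (`is_email`, `is_ipv4`, `is_date`, `trim`) are hypothetical, and the extra range and calendar checks are additions that the regexes alone do not perform.

```python
import re
from datetime import date

EMAIL = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
DATE = re.compile(r"\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])")


def is_email(value):
    """Shape check only; it does not prove the address exists."""
    return EMAIL.fullmatch(value) is not None


def is_ipv4(value):
    """Regex checks the shape, then each octet is range-checked."""
    if IPV4.fullmatch(value) is None:
        return False
    return all(int(octet) <= 255 for octet in value.split("."))


def is_date(value):
    """Regex checks the shape, then the calendar rejects days like Feb 31."""
    if DATE.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def trim(value):
    """Remove leading and trailing whitespace with the trimming pattern."""
    return re.sub(r"^\s+|\s+$", "", value)


print(is_email("dev@example.com"), is_ipv4("999.1.1.1"), is_date("2026-02-31"), repr(trim("  hi  ")))
```

Using `fullmatch` instead of `search` keeps unanchored patterns such as the date regex from accepting strings with extra text around a valid-looking value.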
POSIX Character Classes
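Used inside bracket expressions in tools like grep, sed, and awk, for example [[:digit:]]+ for one or more digits. The equivalents below assume ASCII input.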
| Class | Equivalent | Meaning |
|---|---|---|
| [:alpha:] | [a-zA-Z] | Letters |
| [:digit:] | [0-9] | Digits |
| [:alnum:] | [a-zA-Z0-9] | Alphanumeric |
| [:space:] | [ \t\n\r\f\v] | Whitespace |
| [:upper:] | [A-Z] | Uppercase letters |
| [:lower:] | [a-z] | Lowercase letters |
| [:punct:] | (no simple range) | Punctuation characters |
| [:print:] | [\x20-\x7E] | Printable characters, including space |
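Python's `re` module does not understand POSIX classes, so a pattern written for grep has to be translated first. The helper below is a hypothetical sketch (the names `POSIX_CLASSES` and `expand_posix` are invented here) that swaps a few class names for their ASCII equivalents from the table; it assumes the classes appear inside a bracket expression, as POSIX requires.

```python
import re

# ASCII equivalents for a subset of the POSIX classes (illustrative only).
POSIX_CLASSES = {
    "alpha": "a-zA-Z",
    "digit": "0-9",
    "alnum": "a-zA-Z0-9",
    "space": r" \t\n\r\f\v",
    "upper": "A-Z",
    "lower": "a-z",
}


def expand_posix(pattern):
    """Replace [:name:] with its ASCII range; unknown names raise KeyError."""
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_CLASSES[m.group(1)], pattern)


digits = expand_posix("[[:digit:]]+")
identifier = expand_posix("[[:alpha:]_][[:alnum:]_]*")

print(digits, identifier)
print(re.findall(digits, "a1b22"))
print(re.fullmatch(identifier, "snake_case_1") is not None)
```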