

What is Regex (Regular Expressions)?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern, used for matching, extracting, and manipulating text in strings. Regexes are supported in virtually every programming language, text editor, and command-line tool.

Basic syntax

A regex pattern is built from literal characters and metacharacters:

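  • . matches any single character (in most engines, except a newline by default)
  • * means zero or more of the preceding element
  • + means one or more of the preceding element
  • ? means zero or one of the preceding element (makes it optional)
  • ^ anchors to the start of a string (or of each line in multiline mode)
  • $ anchors to the end of a string (or of each line in multiline mode)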
  • [abc] matches any one of a, b, or c
  • \d matches any digit, \w matches word characters, \s matches whitespace

Example (a simple email-shape check):

Pattern: ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
Matches: user@example.com
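
As a minimal sketch, the email-shaped pattern above can be tried in Python's built-in re module. The function name looks_like_email is a hypothetical helper chosen for illustration, and the pattern is only a rough shape check, not a complete validator for every legal address.

```python
import re

# Pattern copied from the example above. It checks the general shape of an
# address; it does not cover every address the email standards allow.
EMAIL_PATTERN = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(text: str) -> bool:
    """Hypothetical helper: True if the string fits the pattern."""
    return EMAIL_PATTERN.match(text) is not None


print(looks_like_email("user@example.com"))  # True
print(looks_like_email("user@example"))      # False: no dot followed by 2+ letters
print(looks_like_email("not an email"))      # False: no @ at all
```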

Capture groups and backreferences

Parentheses () create capture groups that extract matched substrings. In the pattern (\d{4})-(\d{2})-(\d{2}) matched against 2026-03-21, group 1 captures 2026, group 2 captures 03, and group 3 captures 21.
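
The date example can be sketched in Python as follows. The helper name split_date is an assumption made for illustration; the pattern and input are the ones described above.

```python
import re

# Three numbered capture groups: year, month, day.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def split_date(text: str):
    """Hypothetical helper: return (year, month, day) strings, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    # group(0) is the whole match; groups() returns groups 1..3 as a tuple.
    return match.groups()


print(split_date("2026-03-21"))  # ('2026', '03', '21')
print(split_date("no date here"))  # None
```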

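Named groups like (?<year>\d{4}) make patterns more readable, because each captured value can be retrieved by name instead of by position. The exact syntax varies by engine; Python, for example, writes it as (?P<year>\d{4}). Backreferences like \1 refer back to the text matched by an earlier group within the same pattern, so (\w+) \1 matches a repeated word such as "the the".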
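
A short sketch of both ideas in Python. Note that Python's re module spells named groups as (?P<name>...) rather than (?<name>...); the constant names below are illustrative.

```python
import re

# Named groups, using Python's (?P<name>...) spelling.
NAMED_DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

named = NAMED_DATE.search("released 2026-03-21")
print(named.group("year"))  # 2026
print(named.groupdict())    # {'year': '2026', 'month': '03', 'day': '21'}

# Backreference: \1 must repeat exactly what group 1 matched,
# which makes this a simple doubled-word detector.
DOUBLED_WORD = re.compile(r"\b(\w+) \1\b")

doubled = DOUBLED_WORD.search("this is is a test")
print(doubled.group(1))  # is
```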

Lookaheads and lookbehinds

These are zero-width assertions — they check what’s around a match without including it:

  • (?=...) positive lookahead: match only if followed by…
  • (?!...) negative lookahead: match only if NOT followed by…
  • (?<=...) positive lookbehind: match only if preceded by…
  • (?<!...) negative lookbehind: match only if NOT preceded by…

Example: \d+(?=px) matches 16 in 16px but not 16 in 16em.
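
The lookahead example, plus one lookbehind, might look like this in Python. The sample strings and constant names are made up for illustration.

```python
import re

# Positive lookahead: digits only when immediately followed by "px".
# The "px" itself is checked but not included in the match.
PX_NUMBER = re.compile(r"\d+(?=px)")

print(PX_NUMBER.findall("16px 2em 300px"))  # ['16', '300']
print(PX_NUMBER.search("16em"))             # None

# Positive lookbehind: digits only when immediately preceded by "$".
DOLLAR_AMOUNT = re.compile(r"(?<=\$)\d+")

print(DOLLAR_AMOUNT.findall("cost $40, qty 3"))  # ['40']
```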

Where regex is used

  • Validation: Email addresses, phone numbers, URLs, dates
  • Search and replace: In code editors, sed, and string manipulation functions
  • Log parsing: Extracting timestamps, error codes, IP addresses from log files
  • Web scraping: Pulling structured data from HTML (though a proper parser is usually better)
  • Routing: Web frameworks use regex patterns for URL routing
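
To illustrate the log-parsing use case, here is a sketch that pulls a timestamp, level, and IP address out of one line. The log layout, the sample line, and the name parse_log_line are all assumptions invented for this example; real log formats differ.

```python
import re

# Assumed layout: "<date> <time> <LEVEL> <ipv4> <message>"
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "  # shape only; does not range-check octets
    r"(?P<message>.*)$"
)


def parse_log_line(line: str):
    """Hypothetical helper: return the named fields as a dict, or None."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


sample = "2026-03-21 14:05:09 ERROR 192.168.0.7 upstream timed out"
fields = parse_log_line(sample)
print(fields["timestamp"])  # 2026-03-21 14:05:09
print(fields["level"])      # ERROR
print(fields["ip"])         # 192.168.0.7
```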

Common gotchas

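Quantifiers are greedy by default: .* captures as much as possible while still allowing the overall match to succeed. Use .*? for non-greedy (lazy) matching, which captures as little as possible. In backtracking engines, certain patterns (typically those with nested or overlapping quantifiers) can take exponential time on some inputs, a failure mode known as catastrophic backtracking. Exploiting it with crafted input is called ReDoS (Regular Expression Denial of Service).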
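
The difference between greedy and lazy matching is easiest to see side by side. A minimal Python sketch, using a made-up HTML fragment as input:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the LAST </b>, so everything is swallowed as one match.
greedy = re.findall(r"<b>.*</b>", html)
print(greedy)  # ['<b>one</b> and <b>two</b>']

# Lazy: .*? stops at the FIRST </b> it can, giving one match per tag pair.
lazy = re.findall(r"<b>.*?</b>", html)
print(lazy)  # ['<b>one</b>', '<b>two</b>']
```

For the backtracking problem, a commonly cited risky shape is a nested quantifier such as (a+)+$ run against a long string of a characters that ends in a non-matching character; it is left out of the runnable code here because it can take a very long time to finish.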

Test and debug patterns with the Regex Tester, reference syntax with the Regex Cheatsheet, or visualize pattern logic with the Regex Visualizer.
