Regular Expressions Cheat Sheet for Developers
Table of Contents
- Introduction to Regular Expressions
- Basic Syntax and Elements
- Character Classes and Quantifiers
- Anchors and Word Boundaries
- Groups and Capturing
- Regex Flags and Modifiers
- Common Patterns and Their Applications
- Advanced Techniques and Lookarounds
- Performance Optimization
- Practical Real-World Examples
- Frequently Asked Questions
- Related Articles
Introduction to Regular Expressions
Regular expressions (regex) are powerful pattern-matching tools that every developer should master. They provide a concise, declarative way to search, validate, and manipulate text data without writing verbose string manipulation code.
Whether you're building form validation, parsing log files, extracting data from APIs, or cleaning datasets, regex offers an elegant solution. A single regex pattern can replace dozens of lines of conditional logic, making your code more maintainable and less error-prone.
Modern programming languages like JavaScript, Python, Java, PHP, and Ruby all have built-in regex support. Once you learn the syntax, you can apply these skills across virtually any development environment.
Pro tip: Regular expressions can seem cryptic at first, but they follow consistent patterns. Start with simple expressions and gradually build complexity as you gain confidence. Use online regex testers to experiment and visualize matches in real time.
Basic Syntax and Elements
Understanding the fundamental building blocks of regex is essential before tackling complex patterns. These core elements form the foundation of every regular expression you'll write.
Literal Characters
The simplest regex patterns match literal characters exactly as they appear. The pattern cat matches the string "cat" wherever it occurs in your text.
Most alphanumeric characters are literal, but certain special characters (called metacharacters) have special meanings and must be escaped with a backslash when you want to match them literally.
Essential Metacharacters
| Pattern | Description | Example | Matches |
|---|---|---|---|
| . | Matches any single character except newline | c.t | "cat", "cot", "c9t" |
| ^ | Asserts position at start of string | ^Hello | "Hello world" (at start only) |
| $ | Asserts position at end of string | world$ | "Hello world" (at end only) |
| * | Matches 0 or more repetitions | ab*c | "ac", "abc", "abbc" |
| + | Matches 1 or more repetitions | ab+c | "abc", "abbc" (not "ac") |
| ? | Matches 0 or 1 occurrence | colou?r | "color", "colour" |
| \| | Alternation (OR operator) | cat\|dog | "cat" or "dog" |
| \ | Escapes special characters | \. | Literal period character |
Escaping Special Characters
When you need to match metacharacters literally, prefix them with a backslash. For example, to match a literal period, use \. instead of just .
Common characters that need escaping include: . * + ? ^ $ { } [ ] ( ) | \
// Match a literal question mark
const pattern = /What\?/;
pattern.test("What?"); // true
// Match a dollar amount
const price = /\$\d+\.\d{2}/;
price.test("$19.99"); // true
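The same escaping rule matters when a pattern is assembled from text you do not control, such as user input. The following is a minimal sketch; `escapeRegExp` is a hypothetical helper name used here for illustration, not something the article defines.

```javascript
// Hypothetical helper: escape every regex metacharacter in a string
// so it can be embedded in a pattern and matched literally.
function escapeRegExp(text) {
  // "$&" in the replacement re-inserts the matched character after the backslash
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "3.50 (USD)?";
const literal = new RegExp(escapeRegExp(userInput));

console.log(escapeRegExp(userInput));            // 3\.50 \(USD\)\?
console.log(literal.test("Is it 3.50 (USD)?"));  // true
console.log(literal.test("Is it 3x50 (USD)?"));  // false
```

Without the escaping step, the dot would match any character and the parentheses would form a group, so the second test would also return true.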
Character Classes and Quantifiers
Character classes let you match any character from a specific set, while quantifiers control how many times a pattern should repeat. Together, they form the backbone of flexible pattern matching.
Predefined Character Classes
| Class | Description | Equivalent | Example |
|---|---|---|---|
| \d | Any digit | [0-9] | \d{3} matches "123" |
| \D | Any non-digit | [^0-9] | \D+ matches "abc" |
| \w | Word character (alphanumeric + underscore) | [A-Za-z0-9_] | \w+ matches "user_123" |
| \W | Non-word character | [^A-Za-z0-9_] | \W matches "@" or " " |
| \s | Whitespace (space, tab, newline) | [ \t\n\r\f\v] | \s+ matches " " |
| \S | Non-whitespace | [^ \t\n\r\f\v] | \S+ matches "hello" |
Custom Character Sets
Square brackets define custom character sets. The pattern [aeiou] matches any single vowel, while [0-9a-fA-F] matches any hexadecimal digit.
Use a caret inside brackets to negate the set: [^0-9] matches any character that's NOT a digit.
// Match any vowel
const vowels = /[aeiou]/gi;
"Hello World".match(vowels); // ["e", "o", "o"]
// Match everything that is not a vowel or whitespace
// (consonants here, but digits and punctuation would match too)
const consonants = /[^aeiou\s]/gi;
"Hello".match(consonants); // ["H", "l", "l"]
Quantifiers in Detail
Quantifiers specify how many times the preceding element should match. They're greedy by default, meaning they match as much text as possible.
- {n} - Exactly n times: \d{4} matches exactly 4 digits
- {n,} - At least n times: \d{3,} matches 3 or more digits
- {n,m} - Between n and m times: \d{2,4} matches 2, 3, or 4 digits
- * - Zero or more (equivalent to {0,})
- + - One or more (equivalent to {1,})
- ? - Zero or one (equivalent to {0,1})
Quick tip: Add a question mark after any quantifier to make it non-greedy (lazy). For example, .*? matches as few characters as possible instead of as many as possible. This is crucial when extracting content between delimiters.
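To make the greedy versus lazy difference concrete, here is a short sketch that extracts text between delimiters. The sample string is made up for illustration.

```javascript
const html = "<b>one</b> and <b>two</b>";

// Greedy: .* runs to the last "</b>" it can still match
const greedy = html.match(/<b>(.*)<\/b>/);
console.log(greedy[1]); // one</b> and <b>two

// Lazy: .*? stops at the first "</b>"
const lazy = html.match(/<b>(.*?)<\/b>/);
console.log(lazy[1]); // one

// With the g flag, matchAll returns every lazy match in turn
const all = [...html.matchAll(/<b>(.*?)<\/b>/g)].map(m => m[1]);
console.log(all); // ["one", "two"]
```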
Anchors and Word Boundaries
Anchors don't match characters—they match positions in the string. They're essential for precise pattern matching when you need to ensure text appears in specific locations.
Position Anchors
The caret ^ matches the start of a string (or line in multiline mode), while the dollar sign $ matches the end. These are invaluable for validation where the entire string must match a pattern.
// Validate that entire string is digits
const onlyDigits = /^\d+$/;
onlyDigits.test("12345"); // true
onlyDigits.test("123abc"); // false
// Match lines starting with "Error"
const errorLines = /^Error/gm;
const logs = "Info: Starting\nError: Failed\nError: Timeout";
logs.match(errorLines); // ["Error", "Error"]
Word Boundaries
The \b anchor matches word boundaries—positions between word and non-word characters. It's perfect for matching whole words without accidentally matching parts of larger words.
The \B anchor matches positions that are NOT word boundaries.
// Match "cat" as a whole word only
const wholeCat = /\bcat\b/;
wholeCat.test("cat"); // true
wholeCat.test("cats"); // false
wholeCat.test("concatenate"); // false
// Match "cat" within words
const partialCat = /\Bcat\B/;
partialCat.test("concatenate"); // true
partialCat.test("cat"); // false
Word boundaries are particularly useful for search and replace operations where you want to avoid partial matches. They're also essential when building syntax highlighters or code analyzers.
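A small search-and-replace sketch shows the difference boundaries make. The sentence is an invented example.

```javascript
const sentence = "The cat sat in the catalog room with another cat.";

// Without boundaries, "cat" inside "catalog" is replaced too
const naive = sentence.replace(/cat/g, "dog");
console.log(naive); // The dog sat in the dogalog room with another dog.

// With \b on both sides, only the standalone word is replaced
const wholeWord = sentence.replace(/\bcat\b/g, "dog");
console.log(wholeWord); // The dog sat in the catalog room with another dog.
```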
Groups and Capturing
Parentheses in regex serve multiple purposes: they group parts of patterns together, capture matched text for later use, and enable backreferences within the pattern itself.
Capturing Groups
Parentheses create numbered capturing groups that store the matched text. You can reference these captures in replacement strings or extract them programmatically.
// Extract date components
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = "2026-03-31".match(datePattern);
// match[0]: "2026-03-31" (full match)
// match[1]: "2026" (year)
// match[2]: "03" (month)
// match[3]: "31" (day)
// Reformat dates using captures
const text = "Date: 2026-03-31";
const reformatted = text.replace(/(\d{4})-(\d{2})-(\d{2})/, "$2/$3/$1");
// Result: "Date: 03/31/2026"
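Replacement strings can only rearrange captures. When the replacement needs computation, `String.prototype.replace` also accepts a function that receives the captures as arguments. A minimal sketch, where the `MONTHS` table and the sample text are assumptions made for this example:

```javascript
// Hypothetical lookup table used only for this illustration
const MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"];

const input = "Released 2026-03-31, patched 2026-04-07";

// The callback receives the full match first, then each capture in order
const readable = input.replace(
  /(\d{4})-(\d{2})-(\d{2})/g,
  (fullMatch, year, month, day) =>
    `${Number(day)} ${MONTHS[Number(month) - 1]} ${year}`
);

console.log(readable); // Released 31 Mar 2026, patched 7 Apr 2026
```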
Non-Capturing Groups
Use (?:...) to group patterns without creating a capture. This spares the engine from storing text you never read and keeps your capture numbering clean when you only need grouping for alternation or quantifiers.
// Group without capturing
const protocol = /(?:https?|ftp):\/\//;
protocol.test("https://example.com"); // true
// Compare with capturing (unnecessary overhead)
const protocolCapture = /(https?|ftp):\/\//;
// Creates an extra capture group we don't need
Named Capturing Groups
Modern regex engines support named captures using (?<name>...) syntax. This makes your patterns self-documenting and easier to maintain.
// Named captures for clarity
const emailPattern = /(?<user>[\w.]+)@(?<domain>[\w.]+)/;
const match = "john.doe@example.com".match(emailPattern);
console.log(match.groups.user); // "john.doe"
console.log(match.groups.domain); // "example.com"
// Use in replacements
const masked = "john.doe@example.com".replace(
  /(?<user>[\w.]+)@(?<domain>[\w.]+)/,
  "***@$<domain>"
);
// Result: "***@example.com"
Backreferences
Reference earlier captures within the same pattern using \1, \2, etc. This is powerful for matching repeated or mirrored patterns.
// Match repeated words
const repeated = /\b(\w+)\s+\1\b/;
repeated.test("hello hello"); // true
repeated.test("hello world"); // false
// Match HTML tags
const htmlTag = /<(\w+)>.*?<\/\1>/;
htmlTag.test("<div>content</div>"); // true
htmlTag.test("<div>content</span>"); // false
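Backreferences also work with named groups: in JavaScript, `\k<name>` refers back to a named capture. A brief sketch, with invented sample strings, that matches text wrapped in matching quote characters:

```javascript
// \k<quote> refers back to whatever the group named "quote" matched
const quoted = /(?<quote>["'])(?<body>.*?)\k<quote>/;

const m1 = `say "it's fine" now`.match(quoted);
console.log(m1.groups.body); // it's fine

// Mismatched opening and closing quotes do not match
console.log(quoted.test(`"unterminated'`)); // false
```

Because the closing quote must equal the opening one, the apostrophe inside the double-quoted text does not end the match.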
Regex Flags and Modifiers
Flags modify how the regex engine interprets your pattern. They're added after the closing delimiter in most languages (e.g., /pattern/flags in JavaScript).
Common Flags
- g (global) - Find all matches instead of stopping after the first match
- i (case-insensitive) - Ignore case when matching letters
- m (multiline) - Make ^ and $ match line boundaries instead of string boundaries
- s (dotall) - Make . match newline characters
- u (unicode) - Enable full Unicode support
- y (sticky) - Match only from the lastIndex position
// Case-insensitive search
const pattern = /hello/i;
pattern.test("HELLO"); // true
pattern.test("Hello"); // true
// Global flag for multiple matches
const digits = /\d+/g;
"Phone: 555-1234, Fax: 555-5678".match(digits);
// ["555", "1234", "555", "5678"]
// Multiline mode
const headers = /^#.+$/gm;
const markdown = "# Title\nContent\n## Subtitle";
markdown.match(headers); // ["# Title", "## Subtitle"]
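The s and u flags from the list are easiest to understand by comparing results with and without them. A short sketch with made-up inputs:

```javascript
// s (dotall): without it, "." stops at a newline
const block = "START\nline one\nline two\nEND";
console.log(/START.*END/.test(block));  // false
console.log(/START.*END/s.test(block)); // true

// u (unicode): a code point outside the Basic Multilingual Plane counts as one character
const emoji = "\u{1F600}"; // grinning face emoji
console.log(/^.$/.test(emoji));  // false (seen as two UTF-16 code units)
console.log(/^.$/u.test(emoji)); // true

// u also enables Unicode property escapes such as \p{L} (any letter)
const letters = "x = αβγ + 1".match(/\p{L}+/gu);
console.log(letters); // ["x", "αβγ"]
```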
Pro tip: The global flag changes how methods like exec() and test() behave—they maintain state between calls. If you're getting unexpected results, check whether you need the global flag or if it's causing issues with stateful matching.
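The stateful behavior is easiest to see by calling test() twice on the same input. A minimal sketch:

```javascript
const stateful = /\d+/g;

// Each successful test() advances lastIndex, so the same input can flip results
const first = stateful.test("abc 123");
console.log(first);              // true
console.log(stateful.lastIndex); // 7
const second = stateful.test("abc 123");
console.log(second);             // false (the search resumed at index 7)
console.log(stateful.lastIndex); // 0 (reset after the failed match)

// Without g, test() always starts from the beginning
const stateless = /\d+/;
console.log(stateless.test("abc 123")); // true
console.log(stateless.test("abc 123")); // true
```

If a shared pattern needs the g flag for match() or matchAll(), one option is to keep a separate flag-free copy for test() calls.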
Common Patterns and Their Applications
Let's explore battle-tested regex patterns for common development tasks. These patterns have been refined through real-world use and handle edge cases you might not initially consider.
Email Validation
Email validation is notoriously complex due to RFC specifications, but this practical pattern covers the address shapes most applications encounter:
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
// Valid emails
emailPattern.test("user@example.com"); // true
emailPattern.test("first.last+tag@mail.example.org"); // true
// Invalid emails
emailPattern.test("invalid@"); // false
emailPattern.test("@example.com"); // false
emailPattern.test("user@example"); // false (no top-level domain)
For production applications, consider using a dedicated email validation library that handles internationalized domains and all RFC edge cases. You can also use our Email Validator Tool to test email patterns interactively.
URL Matching
Match and extract URLs from text with this comprehensive pattern:
const urlPattern = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/\/=]*)/gi;
const text = "Visit https://example.com or http://www.site.org/path?query=value";
const urls = text.match(urlPattern);
// ["https://example.com", "http://www.site.org/path?query=value"]
Phone Number Formats
Phone numbers vary by country, but here are patterns for common US formats:
// US phone numbers (various formats)
const phonePattern = /^(\+1[-.\s]?)?(\(?\d{3}\)?[-.\s]?)?\d{3}[-.\s]?\d{4}$/;
phonePattern.test("555-1234"); // true
phonePattern.test("(555) 123-4567"); // true
phonePattern.test("+1-555-123-4567"); // true
phonePattern.test("5551234567"); // true
Password Strength Validation
Enforce password requirements using lookahead assertions:
// At least 8 chars, 1 uppercase, 1 lowercase, 1 digit, 1 special char
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;
strongPassword.test("Weak123"); // false (no special char, and only 7 characters)
strongPassword.test("Strong123!"); // true
strongPassword.test("NoDigits!"); // false
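A single combined pattern only answers yes or no. If the form should tell the user which requirement failed, one alternative is to test each rule separately. This is a sketch under assumptions: `passwordRules` and `missingRequirements` are hypothetical names, and unlike the pattern above it does not restrict which characters are allowed.

```javascript
// Each rule is a separate, simple pattern paired with a message
const passwordRules = [
  { test: /.{8,}/,     message: "at least 8 characters" },
  { test: /[a-z]/,     message: "a lowercase letter" },
  { test: /[A-Z]/,     message: "an uppercase letter" },
  { test: /\d/,        message: "a digit" },
  { test: /[@$!%*?&]/, message: "a special character (@$!%*?&)" },
];

// Hypothetical helper: returns the messages for every rule the password fails
function missingRequirements(password) {
  return passwordRules
    .filter(rule => !rule.test.test(password))
    .map(rule => rule.message);
}

console.log(missingRequirements("Weak123"));
// ["at least 8 characters", "a special character (@$!%*?&)"]
console.log(missingRequirements("Strong123!")); // []
```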
Credit Card Numbers
Validate credit card formats (remember to use proper payment processing libraries for real transactions):
// Visa, MasterCard, Amex, Discover
const creditCard = /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$/;
creditCard.test("4532015112830366"); // true (Visa)
creditCard.test("5425233430109903"); // true (MasterCard)
creditCard.test("1234567890123456"); // false
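The pattern above checks only prefix and length. Card numbers also carry a Luhn check digit, which a regex cannot verify. The sketch below adds that step; `passesLuhn` is a hypothetical helper name, and this is an illustration of the checksum rather than a replacement for a payment library.

```javascript
// Luhn checksum: from the right, double every second digit,
// subtract 9 from any result above 9, and require the total to be divisible by 10.
function passesLuhn(cardNumber) {
  const digits = cardNumber.replace(/\D/g, ""); // ignore spaces and dashes
  if (digits.length === 0) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits[digits.length - 1 - i]); // walk right to left
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

console.log(passesLuhn("4111 1111 1111 1111")); // true (a widely used test number)
console.log(passesLuhn("4111 1111 1111 1112")); // false
```

Running the format regex first and the checksum second catches both malformed input and most single-digit typos.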
Date Formats
Match various date formats with these patterns:
// YYYY-MM-DD format
const isoDate = /^\d{4}-\d{2}-\d{2}$/;
// MM/DD/YYYY format
const usDate = /^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/\d{4}$/;
// DD-MM-YYYY format
const euDate = /^(0[1-9]|[12]\d|3[01])-(0[1-9]|1[0-2])-\d{4}$/;
isoDate.test("2026-03-31"); // true
usDate.test("03/31/2026"); // true
euDate.test("31-03-2026"); // true
Test these patterns and more using our Regex Tester Tool for instant feedback and pattern visualization.
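These patterns check the shape of a date, not whether it exists: isoDate accepts "2026-02-30". One way to close that gap is to parse the captures and compare them against what `Date` produces. A minimal sketch, with `isRealIsoDate` as a hypothetical helper name:

```javascript
// Hypothetical helper: true only for ISO dates that exist on the calendar
function isRealIsoDate(text) {
  const m = /^(\d{4})-(\d{2})-(\d{2})$/.exec(text);
  if (!m) return false;
  const [year, month, day] = m.slice(1).map(Number);
  // Date.UTC rolls invalid days over (Feb 30 becomes Mar 2), so compare the parts back
  const date = new Date(Date.UTC(year, month - 1, day));
  return date.getUTCFullYear() === year &&
         date.getUTCMonth() === month - 1 &&
         date.getUTCDate() === day;
}

console.log(isRealIsoDate("2026-03-31")); // true
console.log(isRealIsoDate("2026-02-30")); // false (the regex alone accepts this)
console.log(isRealIsoDate("2024-02-29")); // true (leap year)
console.log(isRealIsoDate("2026-13-01")); // false
```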
IP Address Validation
Validate IPv4 addresses with proper range checking:
// IPv4 address
const ipv4 = /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/;
ipv4.test("192.168.1.1"); // true
ipv4.test("255.255.255.255"); // true
ipv4.test("256.1.1.1"); // false (256 exceeds max)
ipv4.test("192.168.1"); // false (incomplete)
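Range checks like 0 to 255 are awkward to express in a regex. An arguably more readable alternative is to let the regex check only the shape and do the range check numerically. A sketch, with `isIPv4` as a hypothetical helper name:

```javascript
// Hypothetical helper: split on dots and range-check each octet numerically
function isIPv4(text) {
  const parts = text.split(".");
  if (parts.length !== 4) return false;
  return parts.every(part =>
    /^\d{1,3}$/.test(part) && // 1 to 3 digits, nothing else
    Number(part) <= 255
  );
}

console.log(isIPv4("192.168.1.1")); // true
console.log(isIPv4("256.1.1.1"));   // false
console.log(isIPv4("192.168.1"));   // false
console.log(isIPv4("1.2.3.4.5"));   // false
```

Like the pattern above, this version accepts octets with leading zeros such as "010".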
Hexadecimal Color Codes
Match CSS color codes in both 3 and 6 digit formats:
const hexColor = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;
hexColor.test("#fff"); // true
hexColor.test("#ffffff"); // true
hexColor.test("#38bdf8"); // true
hexColor.test("#gggggg"); // false
Advanced Techniques and Lookarounds
Lookaround assertions let you match patterns based on what comes before or after, without including that context in the match. They're zero-width assertions that don't consume characters.
Positive Lookahead
The (?=...) syntax matches a position where the pattern inside the lookahead follows, but doesn't include it in the match:
// Match "foo" only if followed by "bar"
const pattern = /foo(?=bar)/;
pattern.test("foobar"); // true
"foobar".match(pattern); // ["foo"] (doesn't include "bar")
pattern.test("foobaz"); // false
Negative Lookahead
The (?!...) syntax matches a position where the pattern inside does NOT follow:
// Match "foo" only if NOT followed by "bar"
const pattern = /foo(?!bar)/;
pattern.test("foobaz"); // true
pattern.test("foobar"); // false
// Match numbers not followed by a percent sign
// (\d in the lookahead stops the engine from backtracking to a shorter number such as "5")
const notPercent = /\d+(?![\d%])/;
"50 items".match(notPercent); // ["50"]
"50% off".match(notPercent); // null
Positive Lookbehind
The (?<=...) syntax matches a position preceded by the pattern:
// Match digits preceded by a dollar sign
const price = /(?<=\$)\d+/;
"Price: $50".match(price); // ["50"]
"Price: 50".match(price); // null
// Extract values after a label
const value = /(?<=value: )\w+/;
"The value: test".match(value); // ["test"]
Negative Lookbehind
The (?<!...) syntax matches a position NOT preceded by the pattern:
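// Match numbers not preceded by a dollar sign
// (\d in the lookbehind prevents a match from starting in the middle of "50")
const notPrice = /(?<![\$\d])\d+/;
"Item $50 weighs 10 lbs".match(notPrice); // ["10"]
// Match words not preceded by "not "
// (\b keeps a match from starting mid-word; g returns every match)
const positive = /\b(?<!not )\w+/g;
"This is good not bad".match(positive); // ["This", "is", "good", "not"]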
Pro tip: Lookarounds are powerful but can impact performance on large texts. Use them judiciously and test performance with realistic data volumes. Consider simpler alternatives when possible.
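Because lookarounds match positions rather than characters, they can be used to insert text without consuming anything. A common illustration is adding thousands separators to an integer. This is a sketch for whole numbers only; digits after a decimal point would be grouped too.

```javascript
// \B          : not at the start of the number
// (?=(\d{3})+ : followed by one or more complete groups of three digits
// (?!\d))     : and then no further digit
const thousands = /\B(?=(\d{3})+(?!\d))/g;

console.log("1234567".replace(thousands, ","));  // 1,234,567
console.log("999".replace(thousands, ","));      // 999
console.log("Total: 1250000 units".replace(thousands, ","));
// Total: 1,250,000 units
```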
Atomic Groups and Possessive Quantifiers
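Atomic groups (?>...) prevent backtracking within the group, which can significantly improve performance for certain patterns. Engines such as PCRE, Java, .NET, and Ruby support them. JavaScript does not: /(?>\d+)\w+/ is a syntax error there. In JavaScript, a capture inside a lookahead followed by a backreference gives the same effect, because the engine never backtracks into a completed lookahead:
// Without atomic behavior (the engine may backtrack into \d+)
const slow = /\d+\w+/;
// Atomic emulation in JavaScript: capture inside a lookahead, then consume it via \1
// (unlike the pattern above, this rejects "123", since \d+ keeps every digit)
const fast = /(?=(\d+))\1\w+/;
Possessive quantifiers (*+, ++, ?+) work similarly: they match as much as possible and never give back characters through backtracking. They are available in engines such as PCRE and Java, and in Python from version 3.11, but not in JavaScript.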
Performance Optimization
Poorly written regex patterns can cause catastrophic performance issues. Understanding regex engine behavior helps you write efficient patterns that scale.
Catastrophic Backtracking
Nested quantifiers can cause exponential time complexity. This pattern is dangerous:
// DANGEROUS: Can cause catastrophic backtracking
const bad = /(a+)+b/;
// This will hang on strings like "aaaaaaaaaaaaaaaaaaaaac"
// The engine tries every possible way to group the a's
Fix it by being more specific or by making the inner repetition atomic:
// Better: More specific pattern
const better = /a+b/;
// Or emulate an atomic group (JavaScript has no (?>...) syntax)
const atomic = /(?:(?=(a+))\1)+b/;
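To see the growth for yourself, time both patterns on inputs of increasing length that cannot match. This sketch assumes a runtime with a global `performance` object (browsers, recent Node.js versions); `timeTest` is a hypothetical helper, and the exact timings will vary by machine and engine.

```javascript
// Time one test() call in milliseconds
function timeTest(regex, input) {
  const start = performance.now();
  const matched = regex.test(input);
  return { matched, ms: performance.now() - start };
}

const nested = /(a+)+b/; // nested quantifier
const flat = /a+b/;      // matches the same strings, no nesting

for (const n of [14, 18, 22]) {
  const input = "a".repeat(n) + "c"; // no "b", so every attempt must fail
  const slow = timeTest(nested, input);
  const fast = timeTest(flat, input);
  console.log(`n=${n} nested=${slow.ms.toFixed(2)}ms flat=${fast.ms.toFixed(2)}ms`);
}
```

The input lengths are kept small on purpose: in a backtracking engine, each extra "a" roughly doubles the work for the nested pattern, so much longer inputs can take far longer to finish.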
Optimization Strategies
- Be specific: Use [0-9] instead of . when you know you want digits
- Anchor patterns: Use ^ and $ to fail fast on non-matching strings
- Use non-capturing groups: Replace (...) with (?:...) when you don't need captures
- Avoid nested quantifiers: Patterns like (a*)* are almost always problematic
- Use possessive quantifiers: When you know you won't need backtracking
- Compile once: Reuse regex objects instead of recreating them in loops
// Bad: Constructing the regex on every iteration
for (let i = 0; i < items.length; i++) {
  if (new RegExp("\\d+").test(items[i])) { /* ... */ }
}
// Good: Compile once, reuse
const digitPattern = /\d+/;
for (let i = 0; i < items.length; i++) {
  if (digitPattern.test(items[i])) { /* ... */ }
}
Testing Performance
Always test your regex patterns with realistic data volumes. A pattern that works fine on 10 strings might hang on 10,000.
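// Benchmark regex performance
// No g flag here: with g, test() keeps lastIndex between calls and skews the loop
const pattern = /your-pattern/;
const testData = ""; // placeholder: substitute a large, realistic input string
console.time('regex-test');
for (let i = 0; i < 10000; i++) {
  pattern.test(testData);
}
console.timeEnd('regex-test');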
Use our Regex Benchmark Tool to compare pattern performance across different implementations.
Practical Real-World Examples
Let's apply regex to solve common development challenges you'll encounter in production applications.
Log File Parsing
Extract structured data from application logs:
const logPattern = /^\[(?<timestamp>[\d\-: ]+)\] (?<level>\w+): (?<message>.+)$/gm;
const logs = `
[2026-03-31 10:15:23] INFO: Application started
[2026-03-31 10:15:24] ERROR: Database connection failed
[2026-03-31 10:15:25] WARN: Retrying connection
`;
const entries = [...logs.matchAll(logPattern)].map(match => ({
  timestamp: match.groups.timestamp,
  level: match.groups.level,
  message: match.groups.message
}));
// Result: Array of structured log objects
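Once the entries are structured, the named groups make follow-up processing straightforward. The sketch below counts entries per level and filters the errors. It repeats the pattern so it runs on its own, and the fourth log line is invented sample data added to make the counts more interesting.

```javascript
const logLine = /^\[(?<timestamp>[\d\-: ]+)\] (?<level>\w+): (?<message>.+)$/gm;

// Sample data for illustration (the last line is not from the example above)
const sampleLogs = [
  "[2026-03-31 10:15:23] INFO: Application started",
  "[2026-03-31 10:15:24] ERROR: Database connection failed",
  "[2026-03-31 10:15:25] WARN: Retrying connection",
  "[2026-03-31 10:15:26] ERROR: Database connection failed",
].join("\n");

// Count entries per level
const countsByLevel = {};
for (const { groups } of sampleLogs.matchAll(logLine)) {
  countsByLevel[groups.level] = (countsByLevel[groups.level] || 0) + 1;
}
console.log(countsByLevel); // { INFO: 1, ERROR: 2, WARN: 1 }

// Keep only the error entries, formatted as "timestamp message"
const errors = [...sampleLogs.matchAll(logLine)]
  .filter(m => m.groups.level === "ERROR")
  .map(m => `${m.groups.timestamp} ${m.groups.message}`);
console.log(errors.length); // 2
```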
Markdown Link Extraction
Parse markdown links