Regular Expressions Cheat Sheet for Developers

Introduction to Regular Expressions

Regular expressions (regex) are powerful pattern-matching tools that every developer should master. They provide a concise, declarative way to search, validate, and manipulate text data without writing verbose string manipulation code.

Whether you're building form validation, parsing log files, extracting data from APIs, or cleaning datasets, regex offers an elegant solution. A single regex pattern can replace dozens of lines of conditional logic, making your code more maintainable and less error-prone.

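Modern programming languages like JavaScript, Python, Java, PHP, and Ruby all have built-in regex support. Once you learn the core syntax, you can apply these skills across virtually any development environment, although engines differ in some advanced features such as lookbehind, atomic groups, and named-group syntax.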

Pro tip: Regular expressions can seem cryptic at first, but they follow consistent patterns. Start with simple expressions and gradually build complexity as you gain confidence. Use online regex testers to experiment and visualize matches in real-time.

Basic Syntax and Elements

Understanding the fundamental building blocks of regex is essential before tackling complex patterns. These core elements form the foundation of every regular expression you'll write.

Literal Characters

The simplest regex patterns match literal characters exactly as they appear. The pattern cat matches the string "cat" wherever it occurs in your text.

Most alphanumeric characters are literal, but certain special characters (called metacharacters) have special meanings and must be escaped with a backslash when you want to match them literally.

Essential Metacharacters

Pattern  Description                                  Example  Matches
.        Matches any single character except newline  c.t      "cat", "cot", "c9t"
^        Asserts position at start of string          ^Hello   "Hello world" (at start only)
$        Asserts position at end of string            world$   "Hello world" (at end only)
*        Matches 0 or more repetitions                ab*c     "ac", "abc", "abbc"
+        Matches 1 or more repetitions                ab+c     "abc", "abbc" (not "ac")
?        Matches 0 or 1 occurrence                    colou?r  "color", "colour"
|        Alternation (OR operator)                    cat|dog  "cat" or "dog"
\        Escapes special characters                   \.       Literal period character
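
The table rows can be checked directly in JavaScript. The following is a minimal sketch: the `checks` and `results` names and the "should not match" strings are assumptions added for illustration, and some patterns are anchored with ^ and $ so that the negative cases are unambiguous.

```javascript
// Each entry pairs a pattern from the table with a string it should match
// (`yes`) and one it should not (`no`). The sample strings are illustrative.
const checks = [
  { pattern: /c.t/,           yes: "c9t",         no: "ct" },
  { pattern: /^Hello/,        yes: "Hello world", no: "Say Hello" },
  { pattern: /world$/,        yes: "Hello world", no: "world peace" },
  { pattern: /ab*c/,          yes: "ac",          no: "adc" },
  { pattern: /ab+c/,          yes: "abbc",        no: "ac" },
  { pattern: /^colou?r$/,     yes: "colour",      no: "colouur" },
  { pattern: /^(?:cat|dog)$/, yes: "dog",         no: "cow" },
  { pattern: /^\.$/,          yes: ".",           no: "a" },
];

// Run every pattern against both strings and collect the outcomes.
const results = checks.map(({ pattern, yes, no }) => ({
  pattern: pattern.source,
  matchesYes: pattern.test(yes),
  matchesNo: pattern.test(no),
}));

console.table(results);
```

Every row should report a match for the `yes` string and no match for the `no` string.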

Escaping Special Characters

When you need to match metacharacters literally, prefix them with a backslash. For example, to match a literal period, use \. instead of just .

Common characters that need escaping include: . * + ? ^ $ { } [ ] ( ) | \

// Match a literal question mark
const pattern = /What\?/;
pattern.test("What?"); // true

// Match a dollar amount
const price = /\$\d+\.\d{2}/;
price.test("$19.99"); // true
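
Escaping matters most when a pattern is built from text you do not control, such as user input. The sketch below uses a hypothetical helper named `escapeRegExp` (not a built-in) that prefixes every metacharacter from the list above with a backslash, so the input is treated as a literal.

```javascript
// Hypothetical helper: escapes every regex metacharacter in `text`
// so the result can be embedded in a pattern as a literal.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "1+1=2 (approx.)";
const literal = new RegExp(escapeRegExp(userInput));

console.log(escapeRegExp(userInput));                  // 1\+1=2 \(approx\.\)
console.log(literal.test("Is 1+1=2 (approx.) true?")); // true
console.log(literal.test("1111=2 (approxx)"));         // false
```

In the replacement string, `$&` stands for the matched character, so each metacharacter is kept and simply gains a leading backslash.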

Character Classes and Quantifiers

Character classes let you match any character from a specific set, while quantifiers control how many times a pattern should repeat. Together, they form the backbone of flexible pattern matching.

Predefined Character Classes

Class  Description                                 Equivalent       Example
\d     Any digit                                   [0-9]            \d{3} matches "123"
\D     Any non-digit                               [^0-9]           \D+ matches "abc"
\w     Word character (alphanumeric + underscore)  [A-Za-z0-9_]     \w+ matches "user_123"
\W     Non-word character                          [^A-Za-z0-9_]    \W matches "@" or " "
\s     Whitespace (space, tab, newline)            [ \t\n\r\f\v]    \s+ matches " "
\S     Non-whitespace                              [^ \t\n\r\f\v]   \S+ matches "hello"
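
As a quick illustration of these classes, the sketch below runs three of them against one sample string. The string and variable names are made up for this example.

```javascript
// Sample input containing digits, word characters, spaces, and a tab.
const sample = "Order #42: 3 items\tfor user_123";

const digitRuns = sample.match(/\d+/g); // runs of digits
const words = sample.match(/\w+/g);     // runs of word characters
const tokens = sample.split(/\s+/);     // split on any whitespace run

console.log(digitRuns); // ["42", "3", "123"]
console.log(words);     // ["Order", "42", "3", "items", "for", "user_123"]
console.log(tokens);    // ["Order", "#42:", "3", "items", "for", "user_123"]
```

Note that \w treats the underscore as a word character, so "user_123" stays in one piece, while \s+ handles the tab the same way as a space.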

Custom Character Sets

Square brackets define custom character sets. The pattern [aeiou] matches any single vowel, while [0-9a-fA-F] matches any hexadecimal digit.

Use a caret inside brackets to negate the set: [^0-9] matches any character that's NOT a digit.

// Match any vowel
const vowels = /[aeiou]/gi;
"Hello World".match(vowels); // ["e", "o", "o"]

// Match every character that is not a vowel or whitespace
// (consonants, when the input contains only letters)
const consonants = /[^aeiou\s]/gi;
"Hello".match(consonants); // ["H", "l", "l"]

Quantifiers in Detail

Quantifiers specify how many times the preceding element should match. They're greedy by default, meaning they match as much text as possible.

Quick tip: Add a question mark after any quantifier to make it non-greedy (lazy). For example, .*? matches as few characters as possible instead of as many as possible. This is crucial when extracting content between delimiters.
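
The difference between greedy and lazy matching is easiest to see side by side. The sketch below uses a made-up HTML fragment, and also shows the counted quantifiers {n}, {n,}, and {n,m}, which are standard regex syntax.

```javascript
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .* runs to the last ">" in the string
const greedy = html.match(/<.*>/)[0];
// Lazy: .*? stops at the first ">" that lets the match succeed
const lazy = html.match(/<.*?>/)[0];

console.log(greedy); // "<b>bold</b> and <i>italic</i>"
console.log(lazy);   // "<b>"

// Counted quantifiers: {n} exactly n, {n,} at least n, {n,m} between n and m
console.log(/^\d{3}$/.test("123"));     // true
console.log(/^\d{2,4}$/.test("12345")); // false (five digits)
console.log(/^a{2,}$/.test("aaaa"));    // true
```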

Anchors and Word Boundaries

Anchors don't match characters—they match positions in the string. They're essential for precise pattern matching when you need to ensure text appears in specific locations.

Position Anchors

The caret ^ matches the start of a string (or line in multiline mode), while the dollar sign $ matches the end. These are invaluable for validation where the entire string must match a pattern.

// Validate that entire string is digits
const onlyDigits = /^\d+$/;
onlyDigits.test("12345"); // true
onlyDigits.test("123abc"); // false

// Match lines starting with "Error"
const errorLines = /^Error/gm;
const logs = "Info: Starting\nError: Failed\nError: Timeout";
logs.match(errorLines); // ["Error", "Error"]

Word Boundaries

The \b anchor matches word boundaries—positions between word and non-word characters. It's perfect for matching whole words without accidentally matching parts of larger words.

The \B anchor matches positions that are NOT word boundaries.

// Match "cat" as a whole word only
const wholeCat = /\bcat\b/;
wholeCat.test("cat"); // true
wholeCat.test("cats"); // false
wholeCat.test("concatenate"); // false

// Match "cat" within words
const partialCat = /\Bcat\B/;
partialCat.test("concatenate"); // true
partialCat.test("cat"); // false

Word boundaries are particularly useful for search and replace operations where you want to avoid partial matches. They're also essential when building syntax highlighters or code analyzers.
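
A small search-and-replace sketch shows the effect. The sentence and variable names are invented for the example.

```javascript
const sentence = "The cat sat; concatenate the cats' catalog. Cat!";

// Without boundaries, every "cat" substring is replaced
const naive = sentence.replace(/cat/gi, "dog");
// With \b on both sides, only the standalone word is replaced
const wholeWord = sentence.replace(/\bcat\b/gi, "dog");

console.log(naive);     // "The dog sat; condogenate the dogs' dogalog. dog!"
console.log(wholeWord); // "The dog sat; concatenate the cats' catalog. dog!"
```

The replacement text is inserted as written, so the capitalized "Cat" becomes lowercase "dog" in both versions.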

Groups and Capturing

Parentheses in regex serve multiple purposes: they group parts of patterns together, capture matched text for later use, and enable backreferences within the pattern itself.

Capturing Groups

Parentheses create numbered capturing groups that store the matched text. You can reference these captures in replacement strings or extract them programmatically.

// Extract date components
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const match = "2026-03-31".match(datePattern);
// match[0]: "2026-03-31" (full match)
// match[1]: "2026" (year)
// match[2]: "03" (month)
// match[3]: "31" (day)

// Reformat dates using captures
const text = "Date: 2026-03-31";
const reformatted = text.replace(/(\d{4})-(\d{2})-(\d{2})/, "$2/$3/$1");
// Result: "Date: 03/31/2026"

Non-Capturing Groups

Use (?:...) to group patterns without creating a capture. This avoids the small cost of storing text you never read and keeps your capture numbering clean when you only need grouping for alternation or quantifiers.

// Group without capturing
const protocol = /(?:https?|ftp):\/\//;
protocol.test("https://example.com"); // true

// Compare with capturing (unnecessary overhead)
const protocolCapture = /(https?|ftp):\/\//;
// Creates an extra capture group we don't need

Named Capturing Groups

Modern regex engines support named captures using (?<name>...) syntax. This makes your patterns self-documenting and easier to maintain.

// Named captures for clarity
const emailPattern = /(?<user>[\w.]+)@(?<domain>[\w.]+)/;
const match = "john.doe@example.com".match(emailPattern);
console.log(match.groups.user); // "john.doe"
console.log(match.groups.domain); // "example.com"

// Use in replacements
const masked = "john.doe@example.com".replace(
  /(?<user>[\w.]+)@(?<domain>[\w.]+)/,
  "***@$<domain>"
);
// Result: "***@example.com"

Backreferences

Reference earlier captures within the same pattern using \1, \2, etc. This is powerful for matching repeated or mirrored patterns.

// Match repeated words
const repeated = /\b(\w+)\s+\1\b/;
repeated.test("hello hello"); // true
repeated.test("hello world"); // false

// Match HTML tags
const htmlTag = /<(\w+)>.*?<\/\1>/;
htmlTag.test("<div>content</div>"); // true
htmlTag.test("<div>content</span>"); // false
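
Backreferences also combine well with replacement strings. As an illustrative sketch (the sample sentence is made up), the repeated-word pattern above can be extended to collapse accidental duplicates:

```javascript
const draft = "This is is a test of the the backreference.";

// (\w+) captures a word; (?:\s+\1\b)+ matches one or more repeats of it.
// "$1" in the replacement keeps a single copy of the captured word.
const collapsed = draft.replace(/\b(\w+)(?:\s+\1\b)+/gi, "$1");

console.log(collapsed); // "This is a test of the backreference."
```

The leading \b prevents the "is" inside "This" from being treated as the start of a repeated word.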

Regex Flags and Modifiers

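Flags modify how the regex engine interprets your pattern. In languages with regex literals, such as JavaScript, Ruby, and Perl, they're added after the closing delimiter (e.g., /pattern/flags in JavaScript). Other languages, such as Python and Java, take flags as a separate argument or as inline modifiers.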

Common Flags

// Case-insensitive search
const pattern = /hello/i;
pattern.test("HELLO"); // true
pattern.test("Hello"); // true

// Global flag for multiple matches
const digits = /\d+/g;
"Phone: 555-1234, Fax: 555-5678".match(digits);
// ["555", "1234", "555", "5678"]

// Multiline mode
const headers = /^#.+$/gm;
const markdown = "# Title\nContent\n## Subtitle";
markdown.match(headers); // ["# Title", "## Subtitle"]

Pro tip: The global flag changes how methods like exec() and test() behave—they maintain state between calls. If you're getting unexpected results, check whether you need the global flag or if it's causing issues with stateful matching.
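
The state in question lives in the regex object's lastIndex property. The sketch below, with illustrative variable names, shows how it makes repeated test() calls on a global regex disagree, and how the same mechanism lets exec() walk through all matches.

```javascript
const globalDigits = /\d+/g;

// First call matches "123" and stores the end position in lastIndex
const firstCall = globalDigits.test("abc 123");
const indexAfterFirst = globalDigits.lastIndex;
// Second call resumes from lastIndex, finds nothing, and resets it to 0
const secondCall = globalDigits.test("abc 123");
const indexAfterSecond = globalDigits.lastIndex;

console.log(firstCall, indexAfterFirst);   // true 7
console.log(secondCall, indexAfterSecond); // false 0

// The same state is what lets exec() iterate over every match
const finder = /\d+/g;
const input = "a1 b22 c333";
const found = [];
let m;
while ((m = finder.exec(input)) !== null) {
  found.push({ text: m[0], index: m.index });
}
console.log(found);
// [{ text: "1", index: 1 }, { text: "22", index: 4 }, { text: "333", index: 8 }]
```

For a plain yes/no check, dropping the g flag avoids the problem entirely.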

Common Patterns and Their Applications

Let's explore battle-tested regex patterns for common development tasks. These patterns have been refined through real-world use and handle edge cases you might not initially consider.

Email Validation

Email validation is notoriously complex due to RFC specifications, but this practical pattern accepts the common address formats you are likely to see in real-world input:

const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Valid emails
emailPattern.test("user@example.com"); // true
emailPattern.test("first.last+tag@mail.example.co.uk"); // true

// Invalid emails
emailPattern.test("invalid@"); // false
emailPattern.test("@example.com"); // false
emailPattern.test("user@example"); // false (no top-level domain)

For production applications, consider using a dedicated email validation library that handles internationalized domains and all RFC edge cases. You can also use our Email Validator Tool to test email patterns interactively.

URL Matching

Match and extract URLs from text with this comprehensive pattern:

const urlPattern = /https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&\/\/=]*)/gi;

const text = "Visit https://example.com or http://www.site.org/path?query=value";
const urls = text.match(urlPattern);
// ["https://example.com", "http://www.site.org/path?query=value"]

Phone Number Formats

Phone numbers vary by country, but here are patterns for common US formats:

// US phone numbers (various formats)
const phonePattern = /^(\+1[-.\s]?)?(\(?\d{3}\)?[-.\s]?)?\d{3}[-.\s]?\d{4}$/;

phonePattern.test("555-1234"); // true
phonePattern.test("(555) 123-4567"); // true
phonePattern.test("+1-555-123-4567"); // true
phonePattern.test("5551234567"); // true

Password Strength Validation

Enforce password requirements using lookahead assertions:

// At least 8 chars, 1 uppercase, 1 lowercase, 1 digit, 1 special char
const strongPassword = /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

strongPassword.test("Weak123"); // false (only 7 chars, no special char)
strongPassword.test("Strong123!"); // true
strongPassword.test("NoDigits!"); // false

Credit Card Numbers

Validate credit card formats (remember to use proper payment processing libraries for real transactions):

// Visa, MasterCard, Amex, Discover
const creditCard = /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|6(?:011|5[0-9]{2})[0-9]{12})$/;

creditCard.test("4532015112830366"); // true (Visa)
creditCard.test("5425233430109903"); // true (MasterCard)
creditCard.test("1234567890123456"); // false

Date Formats

Match various date formats with these patterns:

// YYYY-MM-DD format
const isoDate = /^\d{4}-\d{2}-\d{2}$/;

// MM/DD/YYYY format
const usDate = /^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/\d{4}$/;

// DD-MM-YYYY format
const euDate = /^(0[1-9]|[12]\d|3[01])-(0[1-9]|1[0-2])-\d{4}$/;

isoDate.test("2026-03-31"); // true
usDate.test("03/31/2026"); // true
euDate.test("31-03-2026"); // true
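
These patterns check the format only, so a string such as "2026-02-30" still passes the ISO pattern. One way to close that gap, sketched here with a hypothetical `isValidIsoDate` helper, is to capture the components and let the built-in Date object confirm that the day exists on the calendar.

```javascript
// Same shape as the YYYY-MM-DD pattern, with named groups for the parts.
const isoDatePattern = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})$/;

// Hypothetical helper: true only if the text has the right format
// AND names a real calendar date.
function isValidIsoDate(text) {
  const match = isoDatePattern.exec(text);
  if (match === null) return false;

  const year = Number(match.groups.year);
  const month = Number(match.groups.month);
  const day = Number(match.groups.day);

  // Date rolls impossible values over (Feb 30 becomes Mar 2), so a
  // round-trip comparison detects dates that do not exist.
  const date = new Date(0);
  date.setUTCFullYear(year, month - 1, day);
  return (
    date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day
  );
}

console.log(isValidIsoDate("2026-03-31")); // true
console.log(isValidIsoDate("2026-02-30")); // false (no such day)
console.log(isValidIsoDate("2026-13-01")); // false (no such month)
console.log(isValidIsoDate("2024-02-29")); // true (leap year)
```

Splitting the work this way keeps the regex simple and leaves calendar rules such as leap years to the date library.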

Test these patterns and more using our Regex Tester Tool for instant feedback and pattern visualization.

IP Address Validation

Validate IPv4 addresses with proper range checking:

// IPv4 address
const ipv4 = /^(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)$/;

ipv4.test("192.168.1.1"); // true
ipv4.test("255.255.255.255"); // true
ipv4.test("256.1.1.1"); // false (256 exceeds max)
ipv4.test("192.168.1"); // false (incomplete)

Hexadecimal Color Codes

Match CSS color codes in both 3 and 6 digit formats:

const hexColor = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

hexColor.test("#fff"); // true
hexColor.test("#ffffff"); // true
hexColor.test("#38bdf8"); // true
hexColor.test("#gggggg"); // false

Advanced Techniques and Lookarounds

Lookaround assertions let you match patterns based on what comes before or after, without including that context in the match. They're zero-width assertions that don't consume characters.

Positive Lookahead

The (?=...) syntax matches a position where the pattern inside the lookahead follows, but doesn't include it in the match:

// Match "foo" only if followed by "bar"
const pattern = /foo(?=bar)/;
pattern.test("foobar"); // true
"foobar".match(pattern); // ["foo"] (doesn't include "bar")
pattern.test("foobaz"); // false

Negative Lookahead

The (?!...) syntax matches a position where the pattern inside does NOT follow:

// Match "foo" only if NOT followed by "bar"
const pattern = /foo(?!bar)/;
pattern.test("foobaz"); // true
pattern.test("foobar"); // false

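// Match numbers not followed by a percent sign
// The lookahead also excludes digits; otherwise the engine would backtrack
// and match "5" in "50%", because "5" is followed by "0", not "%"
const notPercent = /\d+(?![\d%])/;
"50 items".match(notPercent); // ["50"]
"50% off".match(notPercent); // null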

Positive Lookbehind

The (?<=...) syntax matches a position preceded by the pattern:

// Match digits preceded by a dollar sign
const price = /(?<=\$)\d+/;
"Price: $50".match(price); // ["50"]
"Price: 50".match(price); // null

// Extract values after a label
const value = /(?<=value: )\w+/;
"The value: test".match(value); // ["test"]

Negative Lookbehind

The (?<!...) syntax matches a position NOT preceded by the pattern:

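// Match numbers not preceded by a dollar sign
// The lookbehind also excludes digits; otherwise the match could start
// at the "0" in "$50", which is preceded by "5", not "$"
const notPrice = /(?<![$\d])\d+/;
"Item $50 weighs 10 lbs".match(notPrice); // ["10"]

// Match words not preceded by "not "
// \b forces each match to start at the beginning of a word, and the g flag
// returns every match instead of only the first
const positive = /\b(?<!not )\w+/g;
"This is good not bad".match(positive); // ["This", "is", "good", "not"]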

Pro tip: Lookarounds are powerful but can impact performance on large texts. Use them judiciously and test performance with realistic data volumes. Consider simpler alternatives when possible.

Atomic Groups and Possessive Quantifiers

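Atomic groups (?>...) prevent backtracking into the group once it has matched, which can significantly improve performance for certain patterns. JavaScript does not support this syntax (it is available in PCRE, Java, Ruby, .NET, and Python 3.11+), but a lookahead combined with a backreference has the same effect, because the engine never backtracks into a lookahead once it has succeeded:

// Without atomic group (may backtrack into \d+)
const slow = /\d+\w+/;

// Atomic group syntax in engines that support it: (?>\d+)\w+
// JavaScript emulation: capture inside a lookahead, then consume it with \1
const fast = /(?=(\d+))\1\w+/;

Possessive quantifiers (*+, ++, ?+) work similarly: they match as much as possible and never give back characters through backtracking. They are likewise unavailable in JavaScript, so the same lookahead technique applies.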

Performance Optimization

Poorly written regex patterns can cause catastrophic performance issues. Understanding regex engine behavior helps you write efficient patterns that scale.

Catastrophic Backtracking

Nested quantifiers can cause exponential time complexity. This pattern is dangerous:

// DANGEROUS: Can cause catastrophic backtracking
const bad = /(a+)+b/;

// On near-misses such as "aaaaaaaaaaaaaaaaaaaaac", matching time roughly
// doubles with each additional "a"
// The engine tries every possible way to split the a's between repetitions

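Fix it by being more specific or, in engines that support them, by using atomic groups. In JavaScript, which lacks (?>...) syntax, a lookahead with a backreference gives the same effect:

// Better: More specific pattern
const better = /a+b/;

// Or emulate the atomic group (?>a+) so the engine cannot re-split the a's
const atomic = /(?:(?=(a+))\1)+b/;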
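
To observe the growth yourself, a rough timing sketch like the one below may help. `timeTest` is a hypothetical helper, the input lengths are arbitrary, and the numbers will vary by machine and engine, so no specific timings are shown.

```javascript
// Hypothetical helper: runs one test() call and reports the elapsed time.
function timeTest(regex, input) {
  const start = performance.now();
  const matched = regex.test(input);
  return { matched, ms: performance.now() - start };
}

const risky = /(a+)+b/; // nested quantifier
const safe = /a+b/;     // accepts the same strings, no nesting

const timings = [];
for (const length of [14, 18, 22]) {
  const input = "a".repeat(length) + "c"; // almost matches, then fails
  timings.push({
    length,
    riskyMs: timeTest(risky, input).ms,
    safeMs: timeTest(safe, input).ms,
  });
}

console.table(timings);
```

Keep the lengths small when experimenting: each extra character roughly doubles the work for the risky pattern.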

Optimization Strategies

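const items = ["a1", "b", "c22"]; // sample data

// Avoid: the literal is evaluated on every iteration, creating a new
// RegExp object each time (worse still with new RegExp(string))
for (let i = 0; i < items.length; i++) {
  if (/\d+/.test(items[i])) { /* ... */ }
}

// Better: create the regex once and reuse it
const digitPattern = /\d+/;
for (let i = 0; i < items.length; i++) {
  if (digitPattern.test(items[i])) { /* ... */ }
}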

Testing Performance

Always test your regex patterns with realistic data volumes. A pattern that works fine on 10 strings might hang on 10,000.

// Benchmark regex performance
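const pattern = /your-pattern/; // no g flag: test() on a global regex is stateful between calls
const testData = "your test input ".repeat(1000); // placeholder; substitute a realistic dataset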

console.time('regex-test');
for (let i = 0; i < 10000; i++) {
  pattern.test(testData);
}
console.timeEnd('regex-test');

Use our Regex Benchmark Tool to compare pattern performance across different implementations.

Practical Real-World Examples

Let's apply regex to solve common development challenges you'll encounter in production applications.

Log File Parsing

Extract structured data from application logs:

const logPattern = /^\[(?<timestamp>[\d\-: ]+)\] (?<level>\w+): (?<message>.+)$/gm;

const logs = `
[2026-03-31 10:15:23] INFO: Application started
[2026-03-31 10:15:24] ERROR: Database connection failed
[2026-03-31 10:15:25] WARN: Retrying connection
`;

const entries = [...logs.matchAll(logPattern)].map(match => ({
  timestamp: match.groups.timestamp,
  level: match.groups.level,
  message: match.groups.message
}));

// Result: Array of structured log objects
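
Once the lines are parsed, the named groups make follow-up analysis straightforward. The sketch below is self-contained and illustrative: it reuses the same pattern under the name `linePattern`, adds one hypothetical extra log line, and tallies entries per level.

```javascript
const linePattern = /^\[(?<timestamp>[\d\-: ]+)\] (?<level>\w+): (?<message>.+)$/gm;

const sampleLogs = [
  "[2026-03-31 10:15:23] INFO: Application started",
  "[2026-03-31 10:15:24] ERROR: Database connection failed",
  "[2026-03-31 10:15:25] WARN: Retrying connection",
  "[2026-03-31 10:15:26] ERROR: Timeout after 30s", // hypothetical extra line
].join("\n");

// Count entries per level and collect the error messages in one pass.
const countsByLevel = {};
const errorMessages = [];
for (const { groups } of sampleLogs.matchAll(linePattern)) {
  countsByLevel[groups.level] = (countsByLevel[groups.level] ?? 0) + 1;
  if (groups.level === "ERROR") errorMessages.push(groups.message);
}

console.log(countsByLevel); // { INFO: 1, ERROR: 2, WARN: 1 }
console.log(errorMessages); // ["Database connection failed", "Timeout after 30s"]
```

Because matchAll() returns an iterator, large log files can be processed entry by entry without first building an intermediate array.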

Markdown Link Extraction

Parse markdown links
