Regular Expressions for Beginners: Complete Regex Tutorial

Regular expressions (regex) are powerful patterns for matching and manipulating text. This guide takes you from zero to writing your own patterns for validation, search, and text processing.

What is Regex?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Think of it as a sophisticated "find" feature that can match complex patterns instead of just exact text.

Common Use Cases

  • Validation: Check if input matches a format (email, phone, password)
  • Search: Find patterns in text (all URLs, dates, IP addresses)
  • Replace: Transform text (reformat dates, clean data)
  • Extract: Pull specific data from strings (parse logs, scrape content)

Basic Patterns

Literal Characters

Most characters match themselves literally:

Pattern: cat
Matches: "cat" in "The cat sat on the mat"
         ^^^

The Dot (.)

The dot matches any single character (except newline):

Pattern: c.t
Matches: "cat", "cot", "cut", "c@t", "c1t"
Does NOT match: "ct", "cart"
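
To see literals and the dot in action, here is a minimal sketch. It assumes Python's built-in re module (the tutorial itself is language-neutral), and the variable names are only illustrative:

```python
import re

text = "The cat sat on the mat"

# A literal pattern matches exactly those characters.
literal = re.search(r"cat", text)
print(literal.group(), literal.span())  # cat (4, 7)

# The dot stands for exactly one character (any except a newline by default).
candidates = ["cat", "cot", "c@t", "c1t", "ct", "cart"]
dot_matches = [s for s in candidates if re.fullmatch(r"c.t", s)]
print(dot_matches)  # ['cat', 'cot', 'c@t', 'c1t']
```

"ct" fails because the dot needs one character to consume, and "cart" fails because the dot cannot consume two.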

Escaping Special Characters

Special characters need a backslash to match literally:

Special characters: . * + ? ^ $ { } [ ] \ | ( )

Pattern: 3\.14
Matches: "3.14" (the literal dot)

Pattern: \$100
Matches: "$100" (the literal dollar sign)
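
The difference an escape makes is easy to check. The sketch below again assumes Python's re module; re.escape is a standard helper there that adds the backslashes for you, which is handy when the text to match comes from user input:

```python
import re

# Unescaped, the dot is a wildcard, so the pattern also accepts "3x14".
unescaped_hit = bool(re.fullmatch(r"3.14", "3x14"))
# Escaped, only a literal dot is accepted.
escaped_hit = bool(re.fullmatch(r"3\.14", "3x14"))
literal_hit = bool(re.fullmatch(r"3\.14", "3.14"))
print(unescaped_hit, escaped_hit, literal_hit)  # True False True

# re.escape builds a safe literal pattern from arbitrary text.
price = "$100"
found = re.search(re.escape(price), "Total: $100")
print(found.group())  # $100
```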

Character Classes

Match one character from a set of characters.

Square Brackets [ ]

Pattern: [aeiou]
Matches: Any single vowel

Pattern: [0-9]
Matches: Any single digit (0 through 9)

Pattern: [a-zA-Z]
Matches: Any letter (upper or lower case)

Pattern: [^0-9]
Matches: Any character that is NOT a digit

Shorthand Character Classes

Shorthand  Equivalent       Meaning
\d         [0-9]            Any digit
\D         [^0-9]           Any non-digit
\w         [a-zA-Z0-9_]     Word character
\W         [^a-zA-Z0-9_]    Non-word character
\s         [ \t\n\r\f]      Whitespace
\S         [^ \t\n\r\f]     Non-whitespace
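
Here is a short sketch of bracket and shorthand classes, assuming Python's re module. One caveat worth knowing: in Python 3, \d and \w are Unicode-aware for text strings, so the ASCII equivalents in the table only hold exactly when you pass the re.ASCII flag. The sample text is made up for illustration:

```python
import re

text = "Order 66, aisle B-2"

digits = re.findall(r"\d", text)
vowels = re.findall(r"[aeiou]", text)           # lowercase vowels only
punctuation = re.findall(r"[^\w\s]", text)      # neither word char nor whitespace
print(digits)       # ['6', '6', '2']
print(vowels)       # ['e', 'a', 'i', 'e']
print(punctuation)  # [',', '-']

# By default \w also accepts accented letters; re.ASCII restricts it
# to [a-zA-Z0-9_] as shown in the table.
unicode_word = bool(re.fullmatch(r"\w+", "café"))
ascii_word = bool(re.fullmatch(r"\w+", "café", re.ASCII))
print(unicode_word, ascii_word)  # True False
```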

Quantifiers

Specify how many times a pattern should match.

Quantifier  Meaning          Example
*           0 or more        ab*c matches "ac", "abc", "abbc"
+           1 or more        ab+c matches "abc", "abbc" (not "ac")
?           0 or 1           colou?r matches "color", "colour"
{n}         Exactly n        \d{4} matches "2024"
{n,}        n or more        \d{2,} matches 2+ digits
{n,m}       Between n and m  \d{2,4} matches 2-4 digits
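
The examples in the table can be checked with a few lines of code. This is a sketch assuming Python's re module, and the test strings are illustrative:

```python
import re

words = ["ac", "abc", "abbc"]
star = [w for w in words if re.fullmatch(r"ab*c", w)]
plus = [w for w in words if re.fullmatch(r"ab+c", w)]
print(star)  # ['ac', 'abc', 'abbc']
print(plus)  # ['abc', 'abbc']

spellings = re.findall(r"colou?r", "color and colour")
print(spellings)  # ['color', 'colour']

# {2,4} takes up to four digits at a time, so a longer run is split up.
runs = re.findall(r"\d{2,4}", "7 42 2024 123456")
print(runs)  # ['42', '2024', '1234', '56']
```

The last result shows why a bounded quantifier usually needs anchors or word boundaries around it when you want to reject longer runs instead of splitting them.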

Greedy vs Lazy

Text: <div>Hello</div><div>World</div>

Greedy: <.*>
Matches: "<div>Hello</div><div>World</div>" (everything)

Lazy: <.*?>
Matches: "<div>" then "</div>" then "<div>" then "</div>"

Add ? after a quantifier to make it lazy (match as few as possible).
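
The same comparison in runnable form, as a sketch assuming Python's re module:

```python
import re

html = "<div>Hello</div><div>World</div>"

# Greedy: .* grabs as much as it can, so one match spans the whole string.
greedy = re.findall(r"<.*>", html)
print(greedy)  # ['<div>Hello</div><div>World</div>']

# Lazy: .*? stops at the first ">" it can, so each tag is matched separately.
lazy = re.findall(r"<.*?>", html)
print(lazy)  # ['<div>', '</div>', '<div>', '</div>']
```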

Anchors & Boundaries

Match positions, not characters.

Anchor  Meaning               Example
^       Start of string/line  ^Hello matches "Hello world"
$       End of string/line    world$ matches "Hello world"
\b      Word boundary         \bcat\b matches "cat" not "cats"
\B      Not word boundary     \Bcat matches "bobcat"

Text: "the cat scattered the cats"

Pattern: cat
Matches: "cat" (3 times - in cat, scattered, cats)

Pattern: \bcat\b
Matches: "cat" (1 time - only the word "cat")
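
Counting matches in code makes the effect of \b concrete. The sketch below assumes Python's re module, where ^ and $ refer to the whole string unless you pass re.MULTILINE, which makes them apply to each line. The two-line sample is made up for illustration:

```python
import re

text = "the cat scattered the cats"

anywhere = re.findall(r"cat", text)
whole_word = re.findall(r"\bcat\b", text)
print(len(anywhere), len(whole_word))  # 3 1

lines = "Hello world\nsay Hello"

# Default: ^ is the start of the string only.
first_words = re.findall(r"^\w+", lines)
# re.MULTILINE: ^ also matches right after each newline.
first_words_per_line = re.findall(r"^\w+", lines, re.MULTILINE)
print(first_words)           # ['Hello']
print(first_words_per_line)  # ['Hello', 'say']
```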

Groups & Capturing

Parentheses ( )

Group parts of a pattern together:

Pattern: (ab)+
Matches: "ab", "abab", "ababab"

Pattern: (Mr|Mrs|Ms)\.?\s\w+
Matches: "Mr. Smith", "Mrs Jones", "Ms. Lee"

Capturing Groups

Parentheses also "capture" matched text for later use:

Pattern: (\d{4})-(\d{2})-(\d{2})
Text: "2024-03-15"

Group 0 (full match): "2024-03-15"
Group 1: "2024"
Group 2: "03"
Group 3: "15"
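
In code, the groups are read back from the match object. A minimal sketch, assuming Python's re module:

```python
import re

match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", "2024-03-15")

print(match.group(0))  # 2024-03-15 (the full match)
print(match.group(1))  # 2024

# groups() returns groups 1..n as a tuple, which unpacks neatly.
year, month, day = match.groups()
print(year, month, day)  # 2024 03 15
```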

Non-Capturing Groups

Pattern: (?:Mr|Mrs|Ms)\.?\s(\w+)
Only captures the name, not the title
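
The difference shows up clearly in what a "find all" call returns. A sketch assuming Python's re module, with an invented sample sentence:

```python
import re

text = "Mr. Smith met Mrs Jones and Ms. Lee"

# Two capturing groups: each result is a (title, name) tuple.
with_titles = re.findall(r"(Mr|Mrs|Ms)\.?\s(\w+)", text)
print(with_titles)  # [('Mr', 'Smith'), ('Mrs', 'Jones'), ('Ms', 'Lee')]

# (?:...) still groups the alternatives but captures nothing,
# so only the name comes back.
names_only = re.findall(r"(?:Mr|Mrs|Ms)\.?\s(\w+)", text)
print(names_only)  # ['Smith', 'Jones', 'Lee']
```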

Backreferences

Pattern: (\w+)\s+\1
Matches repeated words: "the the", "is is"

Pattern: (['"]).*?\1
Matches quoted strings with matching quotes
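
Both backreference patterns can be tried as follows. This is a sketch assuming Python's re module; the sample sentences are illustrative:

```python
import re

# \1 must repeat whatever group 1 matched.
repeated = re.findall(r"(\w+)\s+\1", "Paris in the the spring")
print(repeated)  # ['the']

# The closing quote must be the same character as the opening quote,
# so the apostrophe inside the double-quoted part does not end the match.
sentence = "He said \"it's fine\" and 'bye'."
quoted = [m.group(0) for m in re.finditer(r"""(['"]).*?\1""", sentence)]
print(quoted)  # ['"it\'s fine"', "'bye'"]
```

Note that the repeated-word pattern has no word boundaries, so it would also match across "this island" ("is" followed by "is"). Wrapping it as \b(\w+)\s+\1\b is one way to avoid that.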

Practical Examples

Email Validation

Pattern: ^[\w.-]+@[\w.-]+\.\w{2,}$

Breakdown:
^           Start of string
[\w.-]+     Username: letters, digits, underscores, dots, hyphens
@           Literal @
[\w.-]+     Domain name
\.          Literal dot
\w{2,}      TLD (2+ word characters: letters, digits, or underscores)
$           End of string

Matches: user@toolsdock.com, john.doe@company.co.uk
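
Wrapped in a small helper, the pattern could be used like this. The sketch assumes Python's re module, and looks_like_email is a hypothetical name chosen for illustration:

```python
import re

EMAIL = re.compile(r"^[\w.-]+@[\w.-]+\.\w{2,}$")

def looks_like_email(value: str) -> bool:
    """Return True if value has the rough shape name@domain.tld."""
    return EMAIL.match(value) is not None

print(looks_like_email("user@toolsdock.com"))      # True
print(looks_like_email("john.doe@company.co.uk"))  # True
print(looks_like_email("no-at-sign.com"))          # False
print(looks_like_email("user@host"))               # False
```

This is a shape check, not full validation: the pattern is deliberately loose and also accepts strings such as "a@b..com", so treat a match as "plausible" rather than "deliverable".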

Phone Numbers (US)

Pattern: ^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$

Matches:
(555) 123-4567
555-123-4567
555.123.4567
5551234567
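
Because the three digit blocks are captured, the same pattern can also normalize numbers to one format. A sketch assuming Python's re module; normalize_phone is a hypothetical helper name:

```python
import re
from typing import Optional

PHONE = re.compile(r"^\(?(\d{3})\)?[-.\s]?(\d{3})[-.\s]?(\d{4})$")

def normalize_phone(value: str) -> Optional[str]:
    """Return the number as 555-123-4567, or None if it does not match."""
    match = PHONE.match(value)
    if match is None:
        return None
    return "-".join(match.groups())

for sample in ["(555) 123-4567", "555-123-4567", "555.123.4567", "5551234567"]:
    print(normalize_phone(sample))  # 555-123-4567 each time

print(normalize_phone("555-1234"))  # None
```

Since each parenthesis is optional on its own, the pattern also accepts unbalanced input such as "(555-123-4567".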

URL Matching

Pattern: https?://[\w.-]+(?:/[\w./-]*)?

Matches:
https://toolsdock.com/
https://sub.domain.com/page.html
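
This pattern has no anchors, so it is suited to finding URLs inside longer text. A sketch assuming Python's re module, with a made-up sentence:

```python
import re

URL = re.compile(r"https?://[\w.-]+(?:/[\w./-]*)?")

text = ("Docs at https://toolsdock.com/ and "
        "http://sub.domain.com/page.html, see also ftp://example.org")

urls = URL.findall(text)
print(urls)  # ['https://toolsdock.com/', 'http://sub.domain.com/page.html']
```

The ftp link is skipped because the pattern only allows http or https. Query strings and fragments (?id=1, #top) are not covered by the character class, so a match would stop just before them.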

Password Validation

Pattern: ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$

Requirements:
- At least 8 characters
- At least one lowercase letter
- At least one uppercase letter
- At least one digit
- At least one special character from the set @ $ ! % * ? &
- No characters other than letters, digits, and that set
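
Each (?=...) is a lookahead: it checks that something appears later in the string without consuming any characters, so all four conditions are tested from the start position. A sketch assuming Python's re module; is_strong_password is a hypothetical name:

```python
import re

PASSWORD = re.compile(
    r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$"
)

def is_strong_password(value: str) -> bool:
    """Check the value against the pattern from the tutorial."""
    return PASSWORD.match(value) is not None

print(is_strong_password("Passw0rd!"))   # True
print(is_strong_password("password1!"))  # False: no uppercase letter
print(is_strong_password("Sh0rt!a"))     # False: only 7 characters
print(is_strong_password("Passw0rd#"))   # False: "#" is not in the allowed set
```

The last case is worth noticing: only the seven listed symbols count as special characters, and any other symbol makes the whole password fail.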

Extract Hashtags

Pattern: #\w+

Text: "Learning #regex is #awesome! #programming"
Matches: #regex, #awesome, #programming
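
In code this is a one-liner. The sketch assumes Python's re module; the second call adds a capturing group (an addition for illustration, not part of the pattern above) to return the tag names without the "#":

```python
import re

text = "Learning #regex is #awesome! #programming"

hashtags = re.findall(r"#\w+", text)
print(hashtags)  # ['#regex', '#awesome', '#programming']

# With a capturing group, findall returns only the captured part.
tag_names = re.findall(r"#(\w+)", text)
print(tag_names)  # ['regex', 'awesome', 'programming']
```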

Date Reformatting

Pattern: (\d{2})/(\d{2})/(\d{4})
Replace: $3-$1-$2

Input:  03/15/2024
Output: 2024-03-15
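
Replacement syntax varies between tools. The $1 style shown above is used by JavaScript and many editors, while Python's re.sub writes group references as \1 (or \g<1>). A sketch of the same reformatting, assuming Python's re module:

```python
import re

us_date = re.compile(r"(\d{2})/(\d{2})/(\d{4})")

# \3, \1, \2 refer to the year, month, and day groups respectively.
iso = us_date.sub(r"\3-\1-\2", "03/15/2024")
print(iso)  # 2024-03-15

# sub replaces every match in the text, not just the first one.
log = us_date.sub(r"\3-\1-\2", "from 01/02/2023 to 12/31/2023")
print(log)  # from 2023-01-02 to 2023-12-31
```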

Quick Reference Cheatsheet

Characters

.     Any character
\d    Digit [0-9]
\D    Non-digit
\w    Word char [a-zA-Z0-9_]
\W    Non-word char
\s    Whitespace
\S    Non-whitespace

Quantifiers

*     0 or more
+     1 or more
?     0 or 1
{3}   Exactly 3
{3,}  3 or more
{3,5} Between 3 and 5

Anchors

^     Start of string
$     End of string
\b    Word boundary
\B    Non-word boundary

Groups

(...)   Capturing group
(?:...) Non-capturing
\1      Backreference
|       Alternation (or)

Common Mistakes to Avoid

  1. Forgetting to escape special characters: Use \. rather than . to match a literal dot
  2. Greedy matching: Use .*? instead of .* when you want the shortest possible match
  3. Missing anchors: Use ^ and $ for full-string validation
  4. Overcomplicating: Simple patterns are easier to maintain
  5. Not testing edge cases: Always test with various inputs
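
Mistakes 1 and 3 are easy to reproduce. The sketch below assumes Python's re module, where one extra detail applies: $ also matches just before a trailing newline, so re.fullmatch is the stricter choice for validation. The sample strings are illustrative:

```python
import re

# Mistake 3: without anchors, the pattern only has to match somewhere.
unanchored = bool(re.search(r"\d{4}", "abc12345xyz"))
anchored = bool(re.search(r"^\d{4}$", "abc12345xyz"))
print(unanchored, anchored)  # True False

# Python detail: $ tolerates a single trailing newline, fullmatch does not.
dollar_with_newline = bool(re.search(r"^\d{4}$", "2024\n"))
fullmatch_with_newline = bool(re.fullmatch(r"\d{4}", "2024\n"))
print(dollar_with_newline, fullmatch_with_newline)  # True False

# Mistake 1: an unescaped dot accepts any character in that position.
loose = bool(re.fullmatch(r"report.txt", "report_txt"))
strict = bool(re.fullmatch(r"report\.txt", "report_txt"))
print(loose, strict)  # True False
```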
