
Regular Expression: Definition, Purpose, and Examples

A regular expression (often shortened to regex) is a compact pattern used to search, match, and manipulate text. Instead of writing long, manual string checks, a regex lets you describe what you’re looking for using a sequence of symbols. This makes it ideal for tasks like validating emails, finding numbers, parsing logs, or cleaning data.

Regular expressions are extremely powerful, but they can look cryptic at first. The more you use them, the more you recognize common patterns such as \d (digit), \w (word character), or + (one or more occurrences).


What a Regular Expression Does

A regex works like a template for text:

  • It describes patterns, not fixed strings
  • It can match simple or complex structures
  • It can extract only the pieces you want
  • It uses broadly similar syntax across most programming languages, with small differences between engines

For example, this regex describes “one or more digits”:

\d+

This will match 3, 42, 2025, or any other sequence of digits.

Regex becomes more meaningful when dealing with real-world input like dates, emails, passwords, tags, or code.


Regular Expressions in JavaScript / TypeScript

JavaScript includes regex support directly in the language. You can write patterns using regex literals or the RegExp constructor.

Basic match

const pattern = /\d+/;
console.log(pattern.test("Room 42"));

\d+ matches one or more digits, so test() returns true because "42" appears in the string.


Capturing groups

Capturing groups extract specific parts of a match.

const pattern = /(\d{2})-(\d{2})-(\d{4})/;
const result = "12-05-2025".match(pattern);
// ["12-05-2025", "12", "05", "2025"]

Each pair of parentheses captures a piece of the date: day, month, year.


Named capturing groups (TypeScript and modern JS)

const pattern = /(?<day>\d{2})-(?<month>\d{2})-(?<year>\d{4})/;
const match = pattern.exec("12-05-2025");

match?.groups?.day;   // "12"
match?.groups?.month; // "05"

Named groups make code easier to read by labeling each extracted value.


Replacing text with regex

const cleaned = "File   name    with   spaces"
  .replace(/\s+/g, " ");

\s+ matches each run of whitespace, and with the g flag replace() swaps every run for a single space.


Regular Expressions in Python

Python’s re module provides extensive regex tools.

Searching for a match

import re

pattern = r"\d+"
match = re.search(pattern, "Invoice #3021")

r"\d+" is a raw string literal, which avoids escape-character issues.

The regex finds "3021".
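
re.search() returns a match object, or None when nothing matches, so the result is usually checked before it is read. A minimal sketch of that check, where the helper name first_number is invented for this example:

```python
import re

def first_number(text):
    """Return the first run of digits in text, or None if there is none."""
    match = re.search(r"\d+", text)
    # re.search returns None when nothing matches, so check before calling group()
    return match.group() if match else None

print(first_number("Invoice #3021"))   # 3021
print(first_number("No digits here"))  # None
```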


Extracting groups

m = re.match(r"(\w+)@(\w+)\.(\w+)", "ana@email.com")
username, provider, domain = m.groups()

This captures three parts of an email address: name, provider, and top-level domain.
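
Python also supports named groups, written (?P<name>...). As a rough sketch, the same email pattern could label its parts like this (the group names are chosen here for illustration):

```python
import re

# (?P<name>...) is Python's syntax for a named capturing group
EMAIL_PARTS = re.compile(r"(?P<username>\w+)@(?P<provider>\w+)\.(?P<domain>\w+)")

m = EMAIL_PARTS.match("ana@email.com")
parts = m.groupdict() if m else {}
print(parts)  # {'username': 'ana', 'provider': 'email', 'domain': 'com'}
```

groupdict() returns the named groups as a dictionary, which avoids relying on the position of each group.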


Using findall()

re.findall(r"\b\w{4}\b", "come play with this data")

\b marks word boundaries; this finds all 4-letter words in the sentence.
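
One detail worth knowing is that findall() changes its return shape when the pattern contains capturing groups. A small sketch, using a second sample string made up for this example:

```python
import re

# No groups: findall returns a list of the matched strings
words = re.findall(r"\b\w{4}\b", "come play with this data")
print(words)  # ['come', 'play', 'with', 'this', 'data']

# With capturing groups: findall returns a tuple of groups per match
pairs = re.findall(r"(\d{2})-(\d{2})", "12-05 and 31-12")
print(pairs)  # [('12', '05'), ('31', '12')]
```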


Regular Expressions in Swift

Swift supports regex through Foundation's NSRegularExpression and, from Swift 5.7, through a native Regex type with literal syntax.

Matching a simple pattern

let text = "Order #1299"
if text.contains(/\d+/) {
    print("Contains a number")
}

Swift now allows regex literals similarly to JavaScript.

/\d+/ finds one or more digits.


Extracting values

if let match = "12-05-2025".firstMatch(of: /(\d{2})-(\d{2})-(\d{4})/) {
    // output is a tuple: the whole match first, then each captured group
    let (_, day, month, year) = match.output
    print(day, month, year)
}

Swift’s structured regex API makes it easier to extract captured groups.


When to Use a Regular Expression

Regex is your tool of choice when dealing with text that follows a recognizable pattern.


1. Validating user input

Examples: emails, postal codes, phone numbers, usernames.

/^[a-zA-Z0-9_]{3,16}$/;

This ensures a username contains only letters, digits, or underscores and is 3 to 16 characters long.
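
For comparison, here is a sketch of the same check in Python. The function name is_valid_username is hypothetical, and re.fullmatch() stands in for the ^ and $ anchors:

```python
import re

USERNAME = re.compile(r"[a-zA-Z0-9_]{3,16}")

def is_valid_username(name):
    # fullmatch succeeds only if the entire string fits the pattern
    return USERNAME.fullmatch(name) is not None

print(is_valid_username("ana_92"))     # True
print(is_valid_username("ab"))         # False: too short
print(is_valid_username("has space"))  # False: space is not allowed
```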


2. Extracting structured data

Dates, timestamps, error codes, or tags.

re.findall(r"#(\w+)", "Loving these #sunsets and #mountains")

Captures the word characters that follow each #, giving the hashtag names: sunsets and mountains.


3. Cleaning messy text

Removing extra spaces, HTML tags, or formatting artifacts.

text.replace(/<[^>]*>/g, "");

<[^>]*> matches anything shaped like an HTML tag: a <, any run of characters other than >, then a closing >.
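
A Python version of the same cleanup might look like the sketch below. The helper name strip_tags is made up here, and this is a rough text cleanup rather than a real HTML parser:

```python
import re

def strip_tags(html):
    # Removes anything that looks like <...>; does not understand HTML structure
    return re.sub(r"<[^>]*>", "", html)

print(strip_tags("<p>Hello <b>world</b></p>"))  # Hello world
```

For untrusted or deeply nested markup, a dedicated HTML parser is usually the safer choice.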


4. Transforming data

Renaming variables, converting formats, or replacing tokens.

re.sub(r"\s+", "-", "many   spaces   here")

Each run of whitespace becomes a single hyphen, giving "many-spaces-here".
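
Converting formats often relies on backreferences, which reuse captured groups in the replacement text. As an illustration (the date string is just sample input), this sketch reorders a day-month-year date into year-month-day:

```python
import re

# \1, \2, \3 in the replacement refer to the first, second, and third groups
iso = re.sub(r"(\d{2})-(\d{2})-(\d{4})", r"\3-\2-\1", "12-05-2025")
print(iso)  # 2025-05-12
```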


5. Searching in logs or large files

Perfect for operations like:

  • “Find lines containing an IP address”
  • “Extract all failed status codes”
  • “Locate all ERROR messages”

The same pattern can be applied line by line, so it works on a small file and a very large log alike.
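
The three operations above could be sketched in Python as follows. The log lines are invented for this example, and the IP pattern only checks the general shape of an IPv4 address, not that each number is in range:

```python
import re

# Hypothetical log lines, made up for this sketch
log_lines = [
    "2025-05-12 10:01:07 INFO  192.168.0.10 GET /index.html 200",
    "2025-05-12 10:01:09 ERROR 10.0.0.7 GET /missing 404",
    "2025-05-12 10:01:12 ERROR 10.0.0.7 POST /login 500",
]

IP = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")  # loose IPv4 shape
STATUS = re.compile(r"\b[45]\d{2}$")             # 4xx or 5xx at the end of a line
ERROR = re.compile(r"\bERROR\b")

ips, failed, error_lines = [], [], []
for line in log_lines:
    if (m := IP.search(line)):
        ips.append(m.group())
    if (m := STATUS.search(line)):
        failed.append(m.group())
    if ERROR.search(line):
        error_lines.append(line)

print(ips)               # ['192.168.0.10', '10.0.0.7', '10.0.0.7']
print(failed)            # ['404', '500']
print(len(error_lines))  # 2
```

Compiling each pattern once with re.compile() and looping over lines keeps memory use flat when the input is a large file read line by line.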


Examples of Regular Expressions in Action

These examples demonstrate real programming scenarios where regex makes logic much simpler.


Example 1: Email validation (JavaScript)

const emailPattern = /^[\w.-]+@[\w.-]+\.\w+$/;

emailPattern.test("hello@example.com"); // true
emailPattern.test("not-an-email");      // false

The pattern checks for:

  • characters before @
  • characters after @
  • a dot with a domain at the end

It’s not perfect (email rules are complex), but it handles common cases.


Example 2: Finding numbers in text (Python)

re.findall(r"\d+(?:\.\d+)?", "Prices: 12, 8.99, 100")

This regex finds integers or decimals (12, 8.99, 100).


Example 3: Extracting parts of a file path (TypeScript)

const pattern = /\/users\/(\d+)\/files\/(.+)/;
const [, userId, filename] = "/users/42/files/report.pdf".match(pattern)!;

This extracts:

  • the numeric user ID
  • the filename (with extension)

Useful in routing, APIs, and file processing.


Example 4: Validating a strong password (Swift)

let strong = /^(?=.*[A-Z])(?=.*\d)(?=.*[!@#\$%]).{8,}$/;

"Passw0rd!".contains(strong) // true

The regex ensures:

  • at least one uppercase letter
  • at least one number
  • at least one symbol
  • minimum 8 characters

Understanding Common Regex Symbols

A few patterns show up everywhere:

  • . – any character (except a newline, by default)
  • \d – digit
  • \w – letter, digit, or underscore
  • \s – whitespace
  • + – one or more
  • * – zero or more
  • ? – optional (zero or one)
  • [] – character class
  • () – capturing group
  • {m,n} – between m and n repetitions

These symbols combine to express very complex logic in a short form.
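
To see these symbols in action, here is a small Python sketch that pairs each pattern with a string it should match in full. The sample strings are arbitrary choices for illustration:

```python
import re

# (pattern, text that the whole pattern should match)
examples = [
    (r"a.c", "abc"),            # . matches any single character
    (r"\d\w\s", "7_ "),         # digit, word character, whitespace
    (r"ab+", "abbb"),           # + needs at least one b
    (r"ab*", "a"),              # * allows zero b's
    (r"colou?r", "color"),      # ? makes the u optional
    (r"[aeiou]{2,3}", "eau"),   # character class repeated 2 to 3 times
]

for pattern, text in examples:
    print(pattern, "matches", repr(text), "->", bool(re.fullmatch(pattern, text)))

# () captures part of the match so it can be read separately
width = re.search(r"(\d+)px", "width: 40px").group(1)
print(width)  # 40
```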


Regex vs Manual String Operations

You could write manual code to check if a string has:

  • exactly 2 digits
  • followed by a dash
  • followed by 2 digits
  • followed by a dash
  • followed by 4 digits

But that takes many lines of explicitly checking characters, slices, and indexes.

A regex expresses the same concept in one readable line:

\d{2}-\d{2}-\d{4}

Regex replaces tedious, error-prone logic with a clean, declarative pattern.
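
To make the comparison concrete, here is a sketch of both approaches in Python. The function names are invented for this example, and re.ASCII restricts \d to 0-9 so that both versions accept exactly the same strings:

```python
import re

def looks_like_date_manual(s):
    # Check length, dash positions, and digits by hand
    if len(s) != 10:
        return False
    if s[2] != "-" or s[5] != "-":
        return False
    digits = s[0:2] + s[3:5] + s[6:10]
    return all(c in "0123456789" for c in digits)

def looks_like_date_regex(s):
    # Same rule as a single pattern; fullmatch requires the whole string to fit
    return re.fullmatch(r"\d{2}-\d{2}-\d{4}", s, re.ASCII) is not None

for sample in ["12-05-2025", "12/05/2025", "1-05-2025"]:
    print(sample, looks_like_date_manual(sample), looks_like_date_regex(sample))
```

Both functions check only the shape of the string; neither verifies that the day and month are valid calendar values.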


Common Mistakes with Regular Expressions

Mistake 1 — Forgetting anchors

^ = start of string

$ = end of string

Without them, partial matches may pass unexpectedly.
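
A short Python sketch of the difference, using two hypothetical helper functions and a sample string chosen to trigger the problem:

```python
import re

def is_year_loose(text):
    # No anchors: succeeds if four digits appear anywhere in the text
    return re.search(r"\d{4}", text) is not None

def is_year_strict(text):
    # Anchored: the text must be exactly four digits from start to end
    return re.search(r"^\d{4}$", text) is not None

print(is_year_loose("abc20255xyz"))   # True, although the input is not a year
print(is_year_strict("abc20255xyz"))  # False
print(is_year_strict("2025"))         # True
```

In Python, $ also matches just before a trailing newline, so re.fullmatch() is the stricter option when that matters.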

Mistake 2 — Overescaping

Beginners often escape characters unnecessarily, which can change what the pattern means: b matches the letter b, but \b is a word boundary.

Mistake 3 — Trying to handle every edge case

Some formats (like emails) are too complex for simple regex patterns.

Aim for practical, not perfect.

Mistake 4 — Writing unreadable patterns

Long regexes benefit from comments or splitting them into logical parts.
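
In Python, one way to do this is the re.VERBOSE flag, which ignores whitespace in the pattern and allows # comments. A minimal sketch using the date pattern from earlier, with group names picked for illustration:

```python
import re

DATE = re.compile(
    r"""
    (?P<day>\d{2})    # two-digit day
    -
    (?P<month>\d{2})  # two-digit month
    -
    (?P<year>\d{4})   # four-digit year
    """,
    re.VERBOSE,
)

m = DATE.fullmatch("12-05-2025")
print(m.group("day"), m.group("month"), m.group("year"))  # 12 05 2025
```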

Mistake 5 — Not using raw strings in Python

Regular strings turn "\n" into a newline.

Raw strings treat it as \ and n.
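
The same issue affects \b: in a regular Python string it is a backspace character, not a word boundary. A small sketch showing the effect on a sample sentence:

```python
import re

print(len("\b"), len(r"\b"))  # 1 2

# Regular string: "\b" is a backspace character, so nothing matches
plain = re.findall("\bcat\b", "cat concat")
print(plain)  # []

# Raw string: the regex engine receives \b and treats it as a word boundary
raw = re.findall(r"\bcat\b", "cat concat")
print(raw)  # ['cat']
```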


Summary

A regular expression is a powerful tool for matching, searching, validating, and transforming text using patterns. JavaScript/TypeScript, Python, and Swift all support regex features that make data handling far easier. Whether you’re validating user input, extracting structured information, cleaning data, or searching through logs, regex offers compact, expressive solutions. Although patterns can look intimidating at first, learning the common symbols quickly pays off in cleaner, more flexible code.
