Regular Expressions in Bash with =~

Bash's =~ operator matches POSIX Extended Regular Expressions (ERE) and captures groups into BASH_REMATCH. Use it for validation, parsing, and conditional logic without spawning external processes.

January 31, 2026 · 7 min read
Tags: Bash, Regex, Pattern Matching, Scripting

Bash's [[ str =~ pattern ]] operator matches Extended Regular Expressions (ERE) inside the shell itself, so there is no external grep or sed process to launch. Captured groups land in the BASH_REMATCH array, which makes the operator well suited to input validation, parsing structured strings, and conditional logic.

Basic =~ Matching

str="Hello, World 2026!"

# Check if string matches a pattern
if [[ "$str" =~ World ]]; then
    echo "Contains 'World'"
fi

# Match a number
if [[ "$str" =~ [0-9]+ ]]; then
    echo "Contains a number: ${BASH_REMATCH[0]}"   # 2026
fi

# Anchor to start and end
if [[ "hello" =~ ^[a-z]+$ ]]; then
    echo "All lowercase"
fi
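
Since [[ ... =~ ... ]] is an ordinary conditional, its exit status carries the result: 0 for a match, 1 for no match, and 2 when the pattern is not a valid regex. The sketch below makes that visible; match_status is a hypothetical helper introduced only for this illustration.

```bash
#!/usr/bin/env bash

# match_status (hypothetical helper): run one match and print
# the exit status of [[ ... =~ ... ]].
match_status() {
    local str="$1" pattern="$2" status=0
    [[ "$str" =~ $pattern ]] || status=$?
    echo "$status"
}

match_status "Hello, World 2026!" '[0-9]+'   # 0: matched
match_status "Hello, World!"      '[0-9]+'   # 1: no match
match_status "Hello"              '('        # 2: invalid regex (unbalanced parenthesis)
```

Checking for status 2 separately can be useful when the pattern comes from user input rather than from the script itself.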

BASH_REMATCH — Capture Groups

date_str="2026-01-31"

# Capture groups with ()
if [[ "$date_str" =~ ^([0-9]{4})-([0-9]{2})-([0-9]{2})$ ]]; then
    echo "Full match:  ${BASH_REMATCH[0]}"   # 2026-01-31
    echo "Year:        ${BASH_REMATCH[1]}"   # 2026
    echo "Month:       ${BASH_REMATCH[2]}"   # 01
    echo "Day:         ${BASH_REMATCH[3]}"   # 31
else
    echo "Invalid date format"
fi
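
BASH_REMATCH is overwritten by every later =~ test, so it is usually safest to copy the groups into named variables right away. Here is a minimal sketch of that habit; parse_kv and the KV_KEY / KV_VALUE globals are hypothetical names, not something from the examples above.

```bash
#!/usr/bin/env bash

# parse_kv (hypothetical helper): split "key=value" into the
# globals KV_KEY and KV_VALUE. Returns 1 if the line does not match.
parse_kv() {
    local line="$1"
    local pattern='^([A-Za-z_][A-Za-z0-9_]*)=(.*)$'
    if [[ "$line" =~ $pattern ]]; then
        # Copy the groups out before another =~ replaces them
        KV_KEY="${BASH_REMATCH[1]}"
        KV_VALUE="${BASH_REMATCH[2]}"
        return 0
    fi
    return 1
}

if parse_kv "timeout=30"; then
    echo "key=$KV_KEY value=$KV_VALUE"   # key=timeout value=30
fi
```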

Store regex in a variable

Don't quote the regex on the right-hand side of =~. Any quoted part of the pattern is matched as a literal string.

pattern='^[0-9]+$'             # ✅ Single quotes in the assignment are fine
[[ "$input" =~ $pattern ]]     # ✅ Leave $pattern unquoted inside [[ ]]

[[ "$input" =~ '^[0-9]+$' ]]   # ❌ Quoted pattern is matched literally, not as a regex

Storing the regex in a variable also makes complex patterns cleaner to read.
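
As a rough illustration of the readability point, a longer pattern can be assembled from small named pieces. The version format below (MAJOR.MINOR.PATCH) and the variable names num and version_re are assumptions made up for this sketch.

```bash
#!/usr/bin/env bash

# Hypothetical example: a simplified "MAJOR.MINOR.PATCH" version string.
num='(0|[1-9][0-9]*)'                    # one numeric field, no leading zeros
version_re="^${num}\.${num}\.${num}$"    # \. matches a literal dot

version="1.24.3"
if [[ "$version" =~ $version_re ]]; then
    echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]} patch=${BASH_REMATCH[3]}"
    # major=1 minor=24 patch=3
fi
```

Each piece can be read and tested on its own, and the backslash-escaped dots survive because the variable is expanded unquoted inside [[ ]].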

ERE Quick Reference

Common ERE elements

Pattern        Matches
.              Any single character
*              Zero or more of preceding
+              One or more of preceding
?              Zero or one of preceding
^str           Anchored to start
str$           Anchored to end
[abc]          a, b, or c
[^abc]         Not a, b, or c
[a-z]          a through z
(pat)          Capture group
pat1|pat2      Either pat1 or pat2
{n,m}          Between n and m repetitions
[[:alpha:]]    POSIX: any letter
[[:digit:]]    POSIX: any digit
[[:space:]]    POSIX: any whitespace
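
To see several of these elements working together, here is a small sketch that combines alternation, POSIX classes, and a bounded repetition. The status-line format and the name status_re are invented for the example.

```bash
#!/usr/bin/env bash

# Hypothetical status line: LEVEL, spaces, resource name, one space, percentage.
line="WARN  disk 91%"
status_re='^(INFO|WARN|ERROR)[[:space:]]+([[:alpha:]]+)[[:space:]]([[:digit:]]{1,3})%$'

if [[ "$line" =~ $status_re ]]; then
    echo "level=${BASH_REMATCH[1]} resource=${BASH_REMATCH[2]} percent=${BASH_REMATCH[3]}"
    # level=WARN resource=disk percent=91
else
    echo "no match"
fi
```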

Practical Validation Examples

#!/usr/bin/env bash

function validate_email {
    local email="$1"
    # \. matches a literal dot; a bare . would match any character
    local pattern='^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
    [[ "$email" =~ $pattern ]]
}

function validate_ip {
    local ip="$1" octet
    local pattern='^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
    if [[ "$ip" =~ $pattern ]]; then
        for octet in "${BASH_REMATCH[@]:1}"; do  # skip [0] (full match)
            # 10# forces base 10 so octets like "08" are not read as octal
            (( 10#$octet <= 255 )) || return 1
        done
        return 0
    fi
    return 1
}

function is_integer {
    [[ "$1" =~ ^-?[0-9]+$ ]]
}

# Test them
validate_email "alice@example.com" && echo "Valid email"
validate_ip "192.168.1.1" && echo "Valid IP"
is_integer "-42" && echo "Is integer"

Quick Check

After `[[ "foo123bar" =~ ([0-9]+) ]]`, what is `${BASH_REMATCH[0]}` and `${BASH_REMATCH[1]}`?

Exercise

Write a function parse_connection_string that:

  1. Takes a connection string like postgres://alice:secret@db.example.com:5432/mydb
  2. Uses =~ with capture groups to extract: user, password, host, port, database
  3. Prints each component on its own line
  4. Returns 1 if the string doesn't match the expected format
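
If you want something to compare against after trying it yourself, here is one possible sketch. It assumes the layout scheme://user:password@host:port/database, with no ':' or '@' inside the user name and no '@' inside the password; real connection strings can be looser than that.

```bash
#!/usr/bin/env bash

parse_connection_string() {
    local conn="$1"
    # Assumed layout: scheme://user:password@host:port/database
    local pattern='^[a-z]+://([^:@/]+):([^@]+)@([^:/]+):([0-9]+)/(.+)$'

    [[ "$conn" =~ $pattern ]] || return 1

    echo "user: ${BASH_REMATCH[1]}"
    echo "password: ${BASH_REMATCH[2]}"
    echo "host: ${BASH_REMATCH[3]}"
    echo "port: ${BASH_REMATCH[4]}"
    echo "database: ${BASH_REMATCH[5]}"
}

parse_connection_string "postgres://alice:secret@db.example.com:5432/mydb"
```

The early return keeps the happy path flat, and the negated bracket expressions stop each group at its delimiter instead of relying on backtracking.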