How does regex::Regex::replace_all handle capture groups in replacement text for dynamic substitutions?

regex::Regex::replace_all performs global search-and-replace with support for capturing group references in the replacement text through $n and $name syntax, where n is the capture group number and name is a named capture group. The replacement text is interpreted specially: $1 refers to the first capture group, $2 to the second, and $name to a named capture group, allowing you to rearrange, duplicate, or selectively include captured portions of the match in the output. For dynamic replacements that require computation beyond simple string interpolation, replace_all accepts a closure that receives a Captures object, enabling transformations based on capture group contents. Understanding this syntax is essential for tasks like format conversion, data extraction with reformatting, and template processing where the replacement depends on what was matched.

Basic Replacement with replace_all

use regex::Regex;
 
fn main() {
    let re = Regex::new(r"hello").unwrap();
    let text = "hello world, hello universe";
    
    // replace_all replaces all occurrences
    let result = re.replace_all(text, "hi");
    
    assert_eq!(result, "hi world, hi universe");
    
    // Unlike replace, which replaces only the first occurrence
    let result_first = re.replace(text, "hi");
    assert_eq!(result_first, "hi world, hello universe");
}

replace_all replaces every match, while replace handles only the first.

Numeric Capture Group References

use regex::Regex;
 
fn main() {
    // Capture groups are referenced with $1, $2, etc.
    // The entire match is $0
    
    let re = Regex::new(r"(\w+)@(\w+)\.(\w+)").unwrap();
    let text = "Contact: alice@example.com and bob@test.org";
    
    // Rearrange using capture groups
    let result = re.replace_all(text, "$2@$1.$3");
    
    assert_eq!(result, "Contact: example@alice.com and test@bob.org");
    
    // $1 = first capture group (\w+ before @)
    // $2 = second capture group (\w+ after @)
    // $3 = third capture group (\w+ after .)
}

$1, $2, etc. reference capture groups by their order in the pattern.

Named Capture Group References

use regex::Regex;
 
fn main() {
    // Named captures are referenced with $name
    let re = Regex::new(r"(?P<user>\w+)@(?P<domain>\w+)\.(?P<tld>\w+)").unwrap();
    let text = "alice@example.com";
    
    // Use named references in replacement
    let result = re.replace_all(text, "User: $user, Domain: $domain.$tld");
    
    assert_eq!(result, "User: alice, Domain: example.com");
    
    // Use ${name} when other word characters follow the reference:
    // "$worded" would be read as a group named "worded"
    let re2 = Regex::new(r"(?P<word>\w+)ing").unwrap();
    let result2 = re2.replace_all("testing", "${word}ed");
    
    assert_eq!(result2, "tested");
}

Named captures use $name or ${name} syntax for clearer replacement text.

The Full Match with $0

use regex::Regex;
 
fn main() {
    // $0 refers to the entire match
    let re = Regex::new(r"\b\w{4}\b").unwrap();  // 4-letter words
    let text = "test these words here";  // "these" and "words" have 5 letters
    
    // Wrap matched words in brackets
    let result = re.replace_all(text, "[$0]");
    
    assert_eq!(result, "[test] this word [here]");
    
    // $0 is useful when you want to include the match with modifications
    let re2 = Regex::new(r"\d+").unwrap();
    let text2 = "Numbers: 42 100 7";
    
    let result2 = re2.replace_all(text2, "number($0)");
    assert_eq!(result2, "Numbers: number(42) number(100) number(7)");
}

$0 references the entire matched text, not just a capture group.

Escaping Dollar Signs

use regex::Regex;
 
fn main() {
    // To include a literal $ in replacement, use $$
    let re = Regex::new(r"price").unwrap();
    let text = "The price is good";
    
    let result = re.replace_all(text, "price: $$100");
    
    assert_eq!(result, "The price: $100 is good");
    
    // Without $$, "$5" would be read as a reference to capture group 5,
    // which does not exist here, so it would expand to an empty string
    let re2 = Regex::new(r"item").unwrap();
    let text2 = "The item costs";
    
    // $$ produces a literal $
    let result2 = re2.replace_all(text2, "item ($$5)");
    assert_eq!(result2, "The item ($5) costs");
}

Use $$ to produce a literal $ in the replacement text.

Optional Capture Groups

use regex::Regex;
 
fn main() {
    // When a capture group is optional and doesn't match,
    // the reference is replaced with an empty string
    
    let re = Regex::new(r"a(\d+)?b").unwrap();
    let text = "ab a123b a456b";
    
    // $1 will be empty for "ab", and digits for others
    let result = re.replace_all(text, "[$1]");
    
    assert_eq!(result, "[] [123] [456]");
    
    // This works similarly for named captures
    let re2 = Regex::new(r"(?P<num>\d+)?x").unwrap();
    let text2 = "x 5x 10x";
    
    let result2 = re2.replace_all(text2, "num=$num");
    assert_eq!(result2, "num= num=5 num=10");
}

Unmatched optional capture groups produce empty strings in replacements.

Dynamic Replacement with Closures

use regex::Regex;
 
fn main() {
    // For dynamic replacements, use a closure
    // The closure receives a &Captures and returns a String
    // (any return type that implements AsRef<str> is accepted)
    
    let re = Regex::new(r"(\d+)").unwrap();
    let text = "Values: 5 10 15";
    
    // Double each number
    let result = re.replace_all(text, |caps: &regex::Captures| {
        let num: i32 = caps[1].parse().unwrap();
        format!("{}", num * 2)
    });
    
    assert_eq!(result, "Values: 10 20 30");
    
    // Closures can do arbitrary computation
    let re2 = Regex::new(r"(?P<temp>\d+)C").unwrap();
    let text2 = "Temperature: 0C, 100C";
    
    // Convert Celsius to Fahrenheit
    let result2 = re2.replace_all(text2, |caps: &regex::Captures| {
        let celsius: i32 = caps["temp"].parse().unwrap();
        let fahrenheit = celsius * 9 / 5 + 32;
        format!("{}F", fahrenheit)
    });
    
    assert_eq!(result2, "Temperature: 32F, 212F");
}

Closures enable computation-based replacements using capture group contents.

Accessing Capture Groups in Closures

use regex::Regex;
 
fn main() {
    let re = Regex::new(r"(?P<first>\w+)\s+(?P<last>\w+)").unwrap();
    let text = "john doe jane smith";
    
    // Access captures by index
    let result1 = re.replace_all(text, |caps: &regex::Captures| {
        format!("{} {}", &caps[2], &caps[1])  // last, first
    });
    
    assert_eq!(result1, "doe john smith jane");
    
    // Access captures by name
    let result2 = re.replace_all(text, |caps: &regex::Captures| {
        let first = &caps["first"];
        let last = &caps["last"];
        format!("{} {}", last, first)
    });
    
    assert_eq!(result2, "doe john smith jane");
    
    // Check if a capture group matched
    let re3 = Regex::new(r"(?P<required>\w+)(?:\s+(?P<optional>\w+))?").unwrap();
    let text3 = "hello world test";
    
    let result3 = re3.replace_all(text3, |caps: &regex::Captures| {
        let required = &caps["required"];
        if let Some(optional) = caps.name("optional") {
            format!("[{}+{}]", required, optional.as_str())
        } else {
            format!("[{}]", required)
        }
    });
    
    assert_eq!(result3, "[hello+world] [test]");
}

Use caps[n], caps["name"], or caps.name() for capture group access in closures.

Conditional Replacement Based on Content

use regex::Regex;
 
fn main() {
    // Replace based on capture group content
    let re = Regex::new(r"(?P<word>\w+)").unwrap();
    let text = "hello WORLD testing";
    
    let result = re.replace_all(text, |caps: &regex::Captures| {
        let word = &caps["word"];
        if word.chars().all(|c| c.is_uppercase()) {
            word.to_lowercase()
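        } else if word.len() > 5 {
            // Byte-index slice: fine for ASCII input, but it would panic
            // if byte 5 fell inside a multi-byte character
            format!("{}...", &word[..5])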
        } else {
            word.to_string()
        }
    });
    
    assert_eq!(result, "hello world testi...");
}

Closures enable conditional logic based on what was captured.

Format Conversion Example

use regex::Regex;
 
fn main() {
    // Convert date formats
    let re = Regex::new(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})").unwrap();
    let text = "Dates: 2024-01-15 and 2024-12-31";
    
    // ISO format to US format
    let result = re.replace_all(text, "$month/$day/$year");
    
    assert_eq!(result, "Dates: 01/15/2024 and 12/31/2024");
    
    // Convert Markdown links to HTML
    let md_re = Regex::new(r"\[(?P<text>[^\]]+)\]\((?P<url>[^)]+)\)").unwrap();
    let md_text = "Check [Rust](https://rust-lang.org) for info";
    
    let html = md_re.replace_all(md_text, r#"<a href="$url">$text</a>"#);
    
    assert_eq!(html, r#"Check <a href="https://rust-lang.org">Rust</a> for info"#);
}

Capture groups enable complex format transformations.

Handling Special Characters in Replacement

use regex::Regex;
 
fn main() {
    // Two layers are involved: Rust parses the string literal first,
    // then the regex crate scans the result for $ references.
    // Backslashes have no special meaning in replacement text.
    
    let re = Regex::new(r"newline").unwrap();
    let text = "Add newline here";
    
    // Raw string: \n stays a literal backslash followed by n
    let result = re.replace_all(text, r"newline\n");
    assert_eq!(result, r"Add newline\n here");
    
    // Regular string: Rust turns \n into an actual newline character
    let result2 = re.replace_all(text, "newline\n");
    assert_eq!(result2, "Add newline\n here");
    
    // The only character that needs escaping in replacement text is $,
    // which is written as $$
    
    let re3 = Regex::new(r"cost").unwrap();
    let text3 = "The cost is";
    
    let result3 = re3.replace_all(text3, "price is $$50");
    assert_eq!(result3, "The price is $50 is");
}

Understand how Rust string literals and regex replacement interact.

The Cow<str> Return Type

use regex::Regex;
 
fn main() {
    // replace_all returns Cow<str>, so it can skip the allocation
    // when nothing is replaced
    
    let re = Regex::new(r"XXX").unwrap();
    let text = "Hello World";
    
    let result = re.replace_all(text, "YYY");
    
    // No match here, so result borrows the original text.
    // When at least one replacement occurs, it owns a new String instead.
    assert_eq!(result, "Hello World");
    
    // into_owned() yields a String either way (copying only if borrowed)
    let owned: String = re.replace_all(text, "YYY").into_owned();
    assert_eq!(owned, "Hello World");
    
    // The same call works when replacements do occur
    let re2 = Regex::new(r"test").unwrap();
    let text2 = "this is a test";
    
    let result_string = re2.replace_all(text2, "example").into_owned();
    assert_eq!(result_string, "this is a example");
}

replace_all returns Cow<str> to avoid unnecessary allocations.

Comparison: replace_all vs replace

use regex::Regex;
 
fn main() {
    let re = Regex::new(r"\d+").unwrap();
    let text = "1 2 3 4 5";
    
    // replace - only first match
    let first_only = re.replace(text, "X");
    assert_eq!(first_only, "X 2 3 4 5");
    
    // replace_all - all matches
    let all = re.replace_all(text, "X");
    assert_eq!(all, "X X X X X");
    
    // Both support capture groups
    let re2 = Regex::new(r"(\w)(\w+)").unwrap();
    let text2 = "hello world";
    
    let first_caps = re2.replace(text2, "$2$1");
    assert_eq!(first_caps, "elloh world");
    
    let all_caps = re2.replace_all(text2, "$2$1");
    assert_eq!(all_caps, "elloh orldw");
}

replace affects only the first match; replace_all affects every match.

Synthesis

Replacement syntax:

| Syntax | Meaning |
|---|---|
| $1, $2, ... | Capture group by number |
| $name | Named capture group |
| ${name} | Named capture (disambiguated) |
| $0 | Entire match |
| $$ | Literal $ |

Replacement types:

| Type | Use Case |
|---|---|
| &str | Simple text replacement |
| String | Computed static replacement |
| Closure | Dynamic computation from captures |
| Cow<str> (return value) | Avoid allocation when unchanged |

Key behaviors:

| Behavior | Description |
|---|---|
| Global replacement | All matches are replaced |
| Empty capture groups | Unmatched optional groups become empty |
| Ownership | Returns Cow<str> to minimize allocations |
| Closure flexibility | Full access to all capture groups |

Key insight: replace_all interprets the replacement text as a mini-template language where $n and $name are interpolation points for captured content. This is powerful but requires understanding two distinct parsing stages: Rust parses the string literal first (handling escape sequences), then the regex crate parses the replacement text (handling $n references). For complex transformations, closures provide the ultimate flexibility by exposing the full Captures object, enabling arbitrary computation on the matched content. The Cow<str> return type is a nice optimization: when no replacements occur, the original string is borrowed rather than copied, avoiding unnecessary allocation. This makes replace_all efficient even when called speculatively on strings that might not contain matches.