What are the trade-offs between glob::Pattern::new and the glob function for repeated pattern matching?

glob::Pattern::new (the crate's constructor; the Pattern type has no compile method) parses a glob pattern into a reusable Pattern value that can be matched against many strings, while the glob function parses the pattern and walks the filesystem in a single call, repeating both steps every time it is invoked. That is convenient for one-time use but wasteful when the same pattern is applied repeatedly. The key trade-off is upfront parsing cost versus repeated parsing overhead: Pattern::new pays the parsing cost once and reuses the result, while glob pays it on each call. For scenarios matching many paths against the same pattern, Pattern::new avoids that repeated work; for one-off filesystem searches, the glob function provides a more ergonomic API.

The glob Function for One-Time Matching

use glob::glob;
 
fn one_time_matching() {
    // The glob function compiles and matches in one step
    // Convenient for one-time use
    
    for entry in glob("src/**/*.rs").expect("Failed to read glob pattern") {
        match entry {
            Ok(path) => println!("Found: {:?}", path),
            Err(e) => println!("Error: {:?}", e),
        }
    }
    
    // Each call to glob() recompiles the pattern
    for entry in glob("src/**/*.rs").expect("Failed to read pattern") {
        // Pattern "src/**/*.rs" is parsed AGAIN
        match entry {
            Ok(path) => println!("Found: {:?}", path),
            Err(e) => println!("Error: {:?}", e),
        }
    }
}

The glob function is ergonomic but recompiles the pattern on every call.

Pattern::new for Repeated Matching

use glob::Pattern;
 
fn repeated_matching() {
    // Compile the pattern once
    let pattern = Pattern::new("src/**/*.rs")
        .expect("Failed to compile pattern");
    
    // Now we can match multiple times without recompiling
    let paths = vec![
        "src/main.rs",
        "src/lib.rs",
        "src/module/mod.rs",
        "tests/test.rs",
    ];
    
    for path in paths {
        if pattern.matches(path) {
            println!("{} matches the pattern", path);
        }
    }
    
    // The pattern parsing happened once
    // Each match just uses the compiled pattern
}

Pattern::new pays the parsing cost once, then reuses the compiled pattern.

Performance Comparison

use glob::Pattern;
use std::time::Instant;
 
fn performance_comparison() {
    let pattern_str = "src/**/*.rs";
    let paths: Vec<&str> = (0..10000)
        .map(|i| {
            if i % 2 == 0 {
                "src/module/file.rs"
            } else {
                "other/path.txt"
            }
        })
        .collect();
    
    // Approach 1: build the Pattern again for every path.
    // glob() itself walks the filesystem, so it cannot be used for
    // in-memory matching; re-parsing inside the loop models the same
    // "parse on every call" cost.
    let start = Instant::now();
    let mut recompiled_hits = 0;
    for path in &paths {
        if Pattern::new(pattern_str).unwrap().matches(path) {
            recompiled_hits += 1;
        }
    }
    println!("Pattern::new per path: {:?} ({} matches)", start.elapsed(), recompiled_hits);
    
    // Approach 2: parse once, then only match
    let start = Instant::now();
    let pattern = Pattern::new(pattern_str).unwrap();
    let mut reused_hits = 0;
    for path in &paths {
        if pattern.matches(path) {
            reused_hits += 1;
        }
    }
    println!("Pattern::new once:     {:?} ({} matches)", start.elapsed(), reused_hits);
    
    // Approach 2 skips 9,999 redundant parses. Actual timings depend on
    // the pattern, the build profile, and the machine, so measure before
    // relying on the difference.
}

The performance difference grows with the number of matches.

How Pattern Compilation Works

use glob::Pattern;
 
fn compilation_details() {
    // Pattern compilation involves:
    // 1. Parsing the glob string into tokens
    // 2. Building an internal representation for matching
    // 3. Validating the pattern syntax
    
    let pattern_str = "src/**/*.rs";
    
    // Pattern::new does all the parsing work once
    let pattern = Pattern::new(pattern_str).unwrap();
    
    // The compiled pattern keeps (as private implementation details):
    // - The original pattern string, returned by as_str()
    // - The parsed tokens (literals, wildcards, character classes)
    
    // Now matches() only does the matching work
    assert!(pattern.matches("src/main.rs"));
    assert!(pattern.matches("src/module/file.rs"));
    assert!(!pattern.matches("tests/test.rs"));
}

Pattern compilation parses the glob string into an internal representation used for matching.
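
To make the parse-once idea concrete, here is a deliberately tiny sketch that uses only the standard library. MiniPattern, Token, and match_from are hypothetical names invented for this illustration; this is not the glob crate's implementation, and it only supports `*`, `?`, and literal characters (no `**` semantics, no character classes, no separator handling).

```rust
/// One parsed element of a pattern. Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Literal(char), // matches exactly this character
    AnyChar,       // `?`: matches any single character
    AnySequence,   // `*`: matches any (possibly empty) run of characters
}

/// A pattern that has already been parsed into tokens.
#[derive(Debug, Clone)]
struct MiniPattern {
    tokens: Vec<Token>,
}

impl MiniPattern {
    /// Parsing step: runs once per pattern.
    fn new(pattern: &str) -> MiniPattern {
        let mut tokens = Vec::new();
        for c in pattern.chars() {
            let token = match c {
                '?' => Token::AnyChar,
                '*' => Token::AnySequence,
                other => Token::Literal(other),
            };
            // Collapse runs of `*`; in this toy matcher `**` behaves like `*`.
            if token == Token::AnySequence && tokens.last() == Some(&Token::AnySequence) {
                continue;
            }
            tokens.push(token);
        }
        MiniPattern { tokens }
    }

    /// Matching step: runs once per candidate and never re-parses the pattern.
    fn matches(&self, text: &str) -> bool {
        let chars: Vec<char> = text.chars().collect();
        match_from(&self.tokens, &chars)
    }
}

/// Recursive backtracking matcher over the pre-parsed tokens.
fn match_from(tokens: &[Token], text: &[char]) -> bool {
    match tokens.split_first() {
        // No tokens left: succeed only if the text is also exhausted.
        None => text.is_empty(),
        Some((Token::Literal(c), rest)) => {
            text.first() == Some(c) && match_from(rest, &text[1..])
        }
        Some((Token::AnyChar, rest)) => !text.is_empty() && match_from(rest, &text[1..]),
        Some((Token::AnySequence, rest)) => {
            // Try every possible length for the run consumed by `*`.
            (0..=text.len()).any(|skip| match_from(rest, &text[skip..]))
        }
    }
}
```

Calling MiniPattern::new once and then matches on each candidate mirrors the Pattern workflow: the loop over the pattern's characters is paid a single time, and every later match only walks the stored token list.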

The glob Function Internals

use glob::glob;
 
fn glob_function_internals() {
    // The glob function does, roughly:
    // 1. Parse and validate the pattern (returning PatternError on failure)
    // 2. Set up a directory walk
    // 3. Return a Paths iterator that yields matching entries lazily
    
    // Conceptually (a simplification, not the actual source):
    // pub fn glob(pattern: &str) -> Result<Paths, PatternError> {
    //     let compiled = Pattern::new(pattern)?;  // Parses!
    //     let walker = filesystem_walker(...);
    //     Ok(Paths { pattern: compiled, walker })
    // }
    
    // Each glob() call parses the pattern and starts a new directory walk.
    // If you call glob() twice with the same pattern, both happen twice.
    
    // Using glob() twice:
    for _entry in glob("src/**/*.rs").unwrap() {
        // glob() parsed the pattern before this loop started
    }
    
    for _entry in glob("src/**/*.rs").unwrap() {
        // Parsed and walked again: repeated work
    }
}
}

Each glob() call compiles the pattern internally, even for identical patterns.

When to Use Each Approach

use glob::{glob, Pattern};
 
fn when_to_use() {
    // Use glob() function when:
    // 1. One-time filesystem iteration
    // 2. Different patterns each time
    // 3. Convenience matters more than performance
    
    // Example: Process files matching pattern once
    for entry in glob("data/**/*.json").unwrap() {
        let path = entry.unwrap();
        // Process file
    }
    // Done, no need to reuse pattern
    
    // Use Pattern::new when:
    // 1. Matching multiple paths against same pattern
    // 2. In-memory matching (not filesystem)
    // 3. Repeated matching in a loop
    
    // Example: Filter in-memory paths
    let pattern = Pattern::new("*.rs").unwrap();
    let files = vec!["main.rs", "lib.rs", "test.txt", "mod.rs"];
    
    let rust_files: Vec<_> = files
        .into_iter()
        .filter(|f| pattern.matches(f))
        .collect();
    println!("Rust files: {:?}", rust_files);
    
    // Example: Repeated matching in a server.
    // `requests` stands in for however the server receives request paths.
    fn server_loop(requests: impl Iterator<Item = String>) {
        let allowed_pattern = Pattern::new("public/**/*").unwrap();
        
        for request_path in requests {
            // No recompilation each request
            if !allowed_pattern.matches(&request_path) {
                println!("Denied: {}", request_path);
            }
        }
    }
    
    server_loop(vec!["public/css/site.css".to_string(), "private/key.pem".to_string()].into_iter());
}

Choose based on whether you need one-time filesystem iteration or repeated matching.

Pattern Matching vs Filesystem Walking

use glob::{glob, Pattern};
 
fn matching_vs_walking() {
    // glob() function: Filesystem walking + pattern matching
    // - Walks directory tree
    // - Filters by pattern
    // - Returns matching paths
    
    for entry in glob("src/**/*.rs").unwrap() {
        let path = entry.unwrap();
        println!("Found: {:?}", path);
    }
    
    // Pattern::new + matches(): In-memory matching only
    // - No filesystem access
    // - Just matches a string against pattern
    // - Use when paths are already in memory
    
    let pattern = Pattern::new("src/**/*.rs").unwrap();
    
    // In-memory paths
    let paths = vec![
        "src/main.rs",
        "src/lib.rs",
        "tests/test.rs",
    ];
    
    for path in paths {
        if pattern.matches(path) {
            println!("{} matches", path);
        }
    }
    
    // Use Pattern::new for:
    // - Path validation
    // - Filtering collections
    // - URL routing patterns
    // - Any in-memory matching
}

glob() walks the filesystem; Pattern::matches() only matches strings against patterns.

Combining with Paths

use glob::Pattern;
use std::path::Path;
 
fn path_matching() {
    let pattern = Pattern::new("src/**/*.rs").unwrap();
    
    // matches() works with string slices
    assert!(pattern.matches("src/main.rs"));
    
    // For Path types, matches_path() does the conversion for you
    // (it returns false if the path is not valid UTF-8)
    let path = Path::new("src/module/mod.rs");
    
    if pattern.matches_path(path) {
        println!("{:?} matches", path);
    }
    
    // The manual equivalent, if you want to handle non-UTF-8 paths yourself
    if let Some(path_str) = path.to_str() {
        assert!(pattern.matches(path_str));
    }
    
    // Be careful with path separators:
    // - Patterns are normally written with '/'
    // - Windows paths use '\'
    // - Off Windows, '\' is an ordinary character to the matcher, so
    //   normalize Windows-style strings to '/' before matching them there
}

Pattern::matches expects string slices; use matches_path for Path values, or convert them yourself.
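
The separator caveat above can be handled with a small normalization step before matching. The helper below is a hypothetical, standard-library-only sketch (the name normalize_separators is made up for this example). It assumes every backslash in the input is a path separator, which does not hold for Unix file names that legitimately contain a backslash.

```rust
/// Replace every backslash with a forward slash so that a pattern written
/// with '/' can be compared against Windows-style path strings.
/// Assumption: a backslash in the input is always a path separator.
fn normalize_separators(path: &str) -> String {
    path.replace('\\', "/")
}
```

The returned String can then be passed to the pattern's matches method as a string slice.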

Advanced Pattern Features

use glob::Pattern;
 
fn advanced_patterns() {
    // Glob patterns support:
    // * - matches any sequence of characters (including separators,
    //     unless MatchOptions::require_literal_separator is set)
    // ? - matches any single character
    // ** - matches the current directory and arbitrary subdirectories;
    //      it must form a whole path component
    // [abc] - matches any character in set
    // [!abc] - matches any character not in set
    // Brace alternation such as {a,b,c} is NOT supported by the glob
    // crate; braces are matched literally, so use one pattern per alternative
    
    // All of these are compiled once:
    let patterns = vec![
        Pattern::new("*.rs").unwrap(),
        Pattern::new("src/**/*.rs").unwrap(),
        Pattern::new("test_[0-9].txt").unwrap(),
        Pattern::new("src/**/*.toml").unwrap(),
    ];
    
    // Reuse compiled patterns for all matches
    let paths = vec!["main.rs", "src/lib.rs", "test_5.txt", "Cargo.toml"];
    
    for path in paths {
        for pattern in &patterns {
            if pattern.matches(path) {
                println!("{} matches {:?}", path, pattern.as_str());
            }
        }
    }
}

Complex patterns benefit even more from compilation reuse.

Pattern Options

use glob::{MatchOptions, Pattern};
 
fn pattern_options() {
    // Pattern::new takes no options; matches() uses the defaults,
    // which are case-sensitive
    let pattern = Pattern::new("*.rs").expect("Failed to compile");
    
    assert!(pattern.matches("file.rs"));
    assert!(!pattern.matches("file.RS"));  // Doesn't match
    
    // Options are supplied at match time through matches_with()
    // (signature as in glob 0.3, where MatchOptions is passed by value)
    let case_insensitive = MatchOptions {
        case_sensitive: false,
        ..MatchOptions::new()
    };
    
    let paths = vec!["file.RS", "file.rs", "FILE.RS"];
    
    for path in &paths {
        if pattern.matches_with(path, case_insensitive) {
            println!("{} matches", path);  // All three match
        }
    }
    
    // The other MatchOptions fields are require_literal_separator
    // (stop * and ? from matching '/') and require_literal_leading_dot
    // (require a leading '.' to be matched by a literal '.')
}

Pattern::new takes no options; pass a MatchOptions to matches_with (or matches_path_with) to change matching behavior.

Error Handling

use glob::{glob, Pattern, PatternError};
 
fn error_handling() {
    // glob() returns Result<Paths, PatternError>
    match glob("invalid[") {
        Ok(iter) => {
            for entry in iter {
                println!("{:?}", entry);
            }
        }
        Err(e) => {
            println!("Invalid pattern: {}", e);
        }
    }
    
    // Pattern::new returns Result<Pattern, PatternError>
    let pattern: Result<Pattern, PatternError> = Pattern::new("invalid[");
    
    match pattern {
        Ok(p) => {
            println!("Compiled: {}", p.as_str());
        }
        Err(e) => {
            println!("Failed to compile: {}", e);
        }
    }
    
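    // Common pattern errors:
    // - Unclosed character class: "file["
    // - More than two consecutive wildcards: "***"
    // - Recursive wildcard that is not a whole path component: "a**/b"
    // Braces are not special in this crate, so "file{" is a valid
    // pattern that matches a literal '{'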
}

Both approaches return Result for invalid patterns; handle errors appropriately.

Memory and Resource Considerations

use glob::Pattern;
 
fn resource_considerations() {
    // Pattern::new stores the parsed pattern
    // Memory overhead: The compiled pattern structure
    
    let pattern = Pattern::new("src/**/*.rs").unwrap();
    
    // The pattern holds:
    // - Original pattern string
    // - Parsed tokens/structure
    // - Metadata for matching
    
    // For most patterns, overhead is small
    // For very complex patterns, overhead increases
    
    // Trade-off:
    // - Compile once: Pay memory cost once, save CPU time
    // - Compile repeatedly: No memory cost, waste CPU time
    
    // For server applications:
    // - Compile patterns at startup
    // - Store in a static (std::sync::OnceLock / LazyLock, or lazy_static)
    // - Reuse for all requests
    
    // For CLI tools:
    // - Pattern usually comes from args
    // - Compile once at start
    // - Use throughout execution
}

Compiled patterns have memory overhead but save CPU time on repeated matches.
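
The "compile at startup, reuse for all requests" advice can be sketched with std::sync::OnceLock (stable since Rust 1.70). To keep the example free of external dependencies, CompiledPattern below is a hypothetical stand-in for a compiled pattern such as glob's Pattern, and the COMPILE_COUNT counter exists only to show that initialization runs a single time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the (stand-in) compilation step has run.
static COMPILE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a compiled pattern; it only supports a suffix check.
struct CompiledPattern {
    suffix: String,
}

impl CompiledPattern {
    /// Pretend this is the expensive parsing step.
    fn new(suffix: &str) -> CompiledPattern {
        COMPILE_COUNT.fetch_add(1, Ordering::SeqCst);
        CompiledPattern { suffix: suffix.to_string() }
    }

    /// Cheap per-request check against the already-built pattern.
    fn matches(&self, path: &str) -> bool {
        path.ends_with(self.suffix.as_str())
    }
}

/// Returns the process-wide pattern, building it on first use only.
fn rust_files() -> &'static CompiledPattern {
    static PATTERN: OnceLock<CompiledPattern> = OnceLock::new();
    PATTERN.get_or_init(|| CompiledPattern::new(".rs"))
}
```

With the real crate, the same shape would presumably hold a OnceLock containing a glob Pattern built by its constructor; that substitution is an assumption about how you would wire it up, not something the text above demonstrates.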

Practical Example: Route Matching

use glob::Pattern;
 
// Practical use: URL route matching
struct Router {
    // A Vec keeps insertion order, so earlier routes take precedence.
    // (A HashMap would iterate in arbitrary order.)
    routes: Vec<(String, Pattern)>,
}
 
impl Router {
    fn new() -> Self {
        Router { routes: Vec::new() }
    }
    
    fn add_route(&mut self, name: &str, pattern: &str) {
        // Compile pattern once
        let compiled = Pattern::new(pattern)
            .expect("Invalid route pattern");
        self.routes.push((name.to_string(), compiled));
    }
    
    fn match_route(&self, path: &str) -> Option<&str> {
        // No recompilation - just matching; first matching route wins
        self.routes
            .iter()
            .find(|(_, pattern)| pattern.matches(path))
            .map(|(name, _)| name.as_str())
    }
}
 
fn router_example() {
    let mut router = Router::new();
    
    // Compile patterns once at setup
    router.add_route("users", "/users/*");
    router.add_route("posts", "/posts/**");
    router.add_route("api", "/api/v*/*");
    
    // Match many paths efficiently
    let paths = vec![
        "/users/123",
        "/posts/2024/01/article",
        "/api/v1/users",
        "/unknown",
    ];
    
    for path in paths {
        match router.match_route(path) {
            Some(route) => println!("{} -> {}", path, route),
            None => println!("{} -> no match", path),
        }
    }
    
    // Each pattern compiled once, used many times
}

Compiled patterns are ideal for routing and URL matching scenarios.

Practical Example: File Filter

use glob::Pattern;
 
// Practical use: File filtering with multiple patterns
struct FileFilter {
    include_patterns: Vec<Pattern>,
    exclude_patterns: Vec<Pattern>,
}
 
impl FileFilter {
    fn new(includes: &[&str], excludes: &[&str]) -> Result<Self, String> {
        let include_patterns = includes
            .iter()
            .map(|p| Pattern::new(p).map_err(|e| e.to_string()))
            .collect::<Result<Vec<_>, _>>()?;
        
        let exclude_patterns = excludes
            .iter()
            .map(|p| Pattern::new(p).map_err(|e| e.to_string()))
            .collect::<Result<Vec<_>, _>>()?;
        
        Ok(FileFilter {
            include_patterns,
            exclude_patterns,
        })
    }
    
    fn is_match(&self, path: &str) -> bool {
        // Must match at least one include pattern
        let included = self.include_patterns
            .iter()
            .any(|p| p.matches(path));
        
        // Must not match any exclude pattern
        let excluded = self.exclude_patterns
            .iter()
            .any(|p| p.matches(path));
        
        included && !excluded
    }
}
 
fn filter_example() {
    let filter = FileFilter::new(
        &["src/**/*.rs", "tests/**/*.rs"],
        &["**/target/**", "**/*.bak"],
    ).expect("Invalid patterns");
    
    let files = vec![
        "src/main.rs",
        "src/lib.rs",
        "tests/test.rs",
        "src/target/debug/main.rs",  // included by src/**/*.rs, excluded by **/target/**
        "src/backup.rs.bak",         // matches no include pattern
    ];
    
    for file in files {
        if filter.is_match(file) {
            println!("Included: {}", file);
        } else {
            println!("Excluded: {}", file);
        }
    }
}

Compiling patterns once and reusing them is efficient for filtering operations.

Synthesis

Performance trade-off:

// glob() function: Parse pattern every call, then walk the filesystem
// Cost: O(p) pattern parsing per call, plus the directory walk
// Use when: Pattern changes or one-time use
 
for entry in glob("*.rs").unwrap() {
    // Pattern parsed here
}
 
// Pattern::new: Parse pattern once, match many times
// Cost: O(p) parsing once + matching cost per path
// Use when: Same pattern, many matches
 
let pattern = Pattern::new("*.rs").unwrap();
for path in paths {
    if pattern.matches(path) { /* ... */ }  // No parsing
}

Use cases:

// Use glob() for:
// 1. Filesystem iteration (built-in feature)
// 2. One-time use
// 3. Dynamic patterns that change each call
// 4. Simple scripts
 
// Use Pattern::new for:
// 1. Repeated matching
// 2. In-memory path collections
// 3. URL routing
// 4. File filtering
// 5. Long-running servers/daemons

Key insight: The glob function parses its pattern on every call, making it convenient but inefficient for repeated use with the same pattern. Pattern::new separates the parsing phase from matching, allowing the compiled pattern to be reused efficiently. The saving grows linearly with the number of matches: compile once and you avoid the parsing overhead on every subsequent match. For one-time filesystem iteration, the glob function is perfectly suitable and more ergonomic; for any scenario where you're matching multiple paths against the same pattern, especially in memory rather than on the filesystem, Pattern::new is the better choice. The compiled pattern can also be stored in structs, statics, or passed between functions, enabling patterns that persist across the lifetime of your application.