Rust walkthroughs
glob::Pattern::new and compiled glob::Pattern for repeated matching?
glob::Pattern::new compiles a glob pattern string into a Pattern every time it is called; reusing an already compiled Pattern avoids that repeated parsing when matching multiple paths. Pattern::new parses the glob syntax (*, ?, [...], **) and builds an internal representation suitable for matching, which involves syntax validation, bracket expression parsing, and wildcard handling. When many files are matched against the same pattern, compiling once and reusing the Pattern means that work is not redone for every path. For one-off matches, or for patterns that change frequently, the convenience of calling Pattern::new inline may outweigh the optimization. Pattern also implements FromStr, so "*.rs".parse::<Pattern>() is an equivalent way to create one, though new is the typical entry point.
use glob::Pattern;
fn main() {
// Create a pattern from a string
let pattern = Pattern::new("src/**/*.rs").unwrap();
// Use the pattern to match paths
assert!(pattern.matches("src/main.rs"));
assert!(pattern.matches("src/lib/mod.rs"));
assert!(!pattern.matches("test/main.rs"));
println!("Pattern compiled successfully");
}
Pattern::new parses and compiles the glob string into a reusable matcher.
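To make "compiles" concrete, here is a toy sketch of the split between parsing a pattern once and matching it many times. It is not the glob crate's implementation: the Token enum and ToyPattern type are hypothetical, and only * and ? are supported.

```rust
/// Hypothetical token type for a toy glob dialect supporting only `*` and `?`.
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Char(char),
    AnyChar,     // `?`
    AnySequence, // `*`
}

/// A "compiled" toy pattern: the parse happens once, in `new`.
struct ToyPattern {
    tokens: Vec<Token>,
}

impl ToyPattern {
    fn new(pattern: &str) -> Self {
        let tokens = pattern
            .chars()
            .map(|c| match c {
                '?' => Token::AnyChar,
                '*' => Token::AnySequence,
                other => Token::Char(other),
            })
            .collect();
        Self { tokens }
    }

    /// Matching walks the precomputed tokens; the pattern string is not re-parsed.
    fn matches(&self, text: &str) -> bool {
        let chars: Vec<char> = text.chars().collect();
        Self::matches_from(&self.tokens, &chars)
    }

    fn matches_from(tokens: &[Token], text: &[char]) -> bool {
        match tokens.split_first() {
            None => text.is_empty(),
            Some((Token::AnySequence, rest)) => {
                // Try every possible length for the sequence, including zero.
                (0..=text.len()).any(|skip| Self::matches_from(rest, &text[skip..]))
            }
            Some((Token::AnyChar, rest)) => {
                !text.is_empty() && Self::matches_from(rest, &text[1..])
            }
            Some((Token::Char(c), rest)) => {
                text.first() == Some(c) && Self::matches_from(rest, &text[1..])
            }
        }
    }
}

fn main() {
    let pattern = ToyPattern::new("file?.*");
    for candidate in ["file1.rs", "fileA.txt", "file.rs"] {
        println!("{candidate}: {}", pattern.matches(candidate));
    }
}
```

The real Pattern additionally handles bracket expressions, `**`, and path separators, which is part of why its construction step is worth doing only once.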
use glob::Pattern;
fn main() {
let pattern = Pattern::new("*.rs").unwrap();
// matches() - simple boolean check
println!("matches 'main.rs': {}", pattern.matches("main.rs"));
println!("matches 'lib.txt': {}", pattern.matches("lib.txt"));
// matches_path() - matches against full Path
let path = std::path::Path::new("src/main.rs");
println!("matches_path: {}", pattern.matches_path(path));
// matches_with() - explicit MatchOptions (case, separator, leading dot)
use glob::MatchOptions;
let options = MatchOptions {
case_sensitive: false,
require_literal_separator: false,
require_literal_leading_dot: false,
};
// glob 0.3 takes MatchOptions by value (it is Copy), not by reference
println!("case-insensitive: {}", pattern.matches_with("MAIN.RS", options));
}
Pattern provides multiple matching methods with different behaviors.
use glob::Pattern;
use std::time::Instant;
fn main() {
let paths: Vec<String> = (0..10_000)
.map(|i| format!("src/file{}.rs", i))
.collect();
// Approach 1: Compile pattern each time (slow)
let start = Instant::now();
for path in &paths {
let pattern = Pattern::new("src/*.rs").unwrap(); // Compiles EVERY iteration!
if pattern.matches(path) {
// process
}
}
let repeated_compile = start.elapsed();
// Approach 2: Compile once, reuse (fast)
let start = Instant::now();
let pattern = Pattern::new("src/*.rs").unwrap(); // Compile ONCE
for path in &paths {
if pattern.matches(path) {
// process
}
}
let single_compile = start.elapsed();
println!("Repeated compile: {:?}", repeated_compile);
println!("Single compile: {:?}", single_compile);
println!("Speedup: {:.0}x", repeated_compile.as_nanos() as f64 / single_compile.as_nanos() as f64);
}
Compiling on every iteration is wasteful; compile once and reuse.
use glob::Pattern;
use std::collections::HashMap;
struct GlobMatcher {
patterns: HashMap<String, Pattern>,
}
impl GlobMatcher {
fn new() -> Self {
Self {
patterns: HashMap::new(),
}
}
fn matches(&mut self, pattern_str: &str, path: &str) -> bool {
// Get or create the compiled pattern
let pattern = self.patterns
.entry(pattern_str.to_string())
.or_insert_with(|| Pattern::new(pattern_str).unwrap());
pattern.matches(path)
}
fn matches_path(&self, pattern_str: &str, path: &std::path::Path) -> bool {
self.patterns
.get(pattern_str)
.map(|p| p.matches_path(path))
.unwrap_or(false)
}
}
fn main() {
let mut matcher = GlobMatcher::new();
// First use compiles the pattern
println!("Match 1: {}", matcher.matches("*.rs", "main.rs"));
// Subsequent uses reuse the compiled pattern
println!("Match 2: {}", matcher.matches("*.rs", "lib.rs"));
println!("Match 3: {}", matcher.matches("*.rs", "test.txt"));
// Different pattern gets compiled separately
println!("Match 4: {}", matcher.matches("*.txt", "readme.txt"));
}
Caching compiled patterns avoids redundant parsing.
use glob::Pattern;
fn main() {
// Simple wildcards
let simple = Pattern::new("*.rs").unwrap();
// Character classes
let char_class = Pattern::new("file[0-9].txt").unwrap();
// Negation in character classes
let negation = Pattern::new("file[!0-9].txt").unwrap();
// Single character match
let single = Pattern::new("file?.txt").unwrap();
// Recursive glob
let recursive = Pattern::new("src/**/*.rs").unwrap();
// Each pattern has different parsing complexity
// Character classes require more parsing than simple wildcards
println!("All patterns compiled successfully");
}
Complex patterns with character classes require more parsing work.
use glob::{Pattern, MatchOptions};
fn main() {
let pattern = Pattern::new("*.RS").unwrap();
// Default options are case-sensitive on every platform
println!("Default (case-sensitive): {}", pattern.matches("file.rs"));
// Case-insensitive matching
let case_insensitive = MatchOptions {
case_sensitive: false,
require_literal_separator: false,
require_literal_leading_dot: false,
};
// glob 0.3 takes MatchOptions by value (it is Copy)
println!("Case-insensitive: {}", pattern.matches_with("file.rs", case_insensitive));
// Require literal separator (don't match '/' with '*')
let literal_sep = MatchOptions {
case_sensitive: true,
require_literal_separator: true,
require_literal_leading_dot: false,
};
let all_files = Pattern::new("*").unwrap();
println!("Matches 'src/file.rs' with literal sep: {}",
all_files.matches_with("src/file.rs", literal_sep));
// With literal separator, '*' won't match '/', so this prints false
}
MatchOptions provides fine-grained control over matching behavior.
use glob::Pattern;
fn main() {
// Valid pattern
match Pattern::new("*.rs") {
Ok(pattern) => println!("Compiled: {:?}", pattern.as_str()),
Err(e) => println!("Error: {}", e),
}
// Invalid pattern (unmatched bracket)
match Pattern::new("file[0-9.txt") {
Ok(pattern) => println!("Compiled: {:?}", pattern.as_str()),
Err(e) => println!("Error: {}", e),
}
// Invalid pattern (unclosed range)
match Pattern::new("file[0-") {
Ok(pattern) => println!("Compiled: {:?}", pattern.as_str()),
Err(e) => println!("Error: {}", e),
}
}
Pattern::new returns Result to handle invalid glob syntax.
use glob::glob;
fn main() {
// glob() parses the pattern on every call, then walks the filesystem
for entry in glob("src/**/*.rs").unwrap() {
match entry {
Ok(path) => println!("Found: {:?}", path),
Err(e) => println!("Error: {}", e),
}
}
// To test paths you already have against one pattern,
// compile a Pattern once and call matches_path() instead
}
glob::glob compiles the pattern internally on each call; for repeated matching against paths you already have, compile a Pattern once and reuse it.
use glob::Pattern;
struct FileFilter {
// Cache of compiled patterns from config
include_patterns: Vec<Pattern>,
exclude_patterns: Vec<Pattern>,
}
impl FileFilter {
fn from_config(include: &[&str], exclude: &[&str]) -> Result<Self, glob::PatternError> {
let include_patterns = include
.iter()
.map(|p| Pattern::new(p))
.collect::<Result<Vec<_>, _>>()?;
let exclude_patterns = exclude
.iter()
.map(|p| Pattern::new(p))
.collect::<Result<Vec<_>, _>>()?;
Ok(Self {
include_patterns,
exclude_patterns,
})
}
fn should_process(&self, path: &str) -> bool {
// Check if matches any include pattern
let included = self.include_patterns.iter().any(|p| p.matches(path));
// Check if matches any exclude pattern
let excluded = self.exclude_patterns.iter().any(|p| p.matches(path));
included && !excluded
}
}
fn main() {
let filter = FileFilter::from_config(
&["src/**/*.rs", "tests/**/*.rs"],
&["**/target/**", "**/.git/**"],
).unwrap();
println!("Process src/main.rs: {}", filter.should_process("src/main.rs"));
println!("Process target/main.rs: {}", filter.should_process("target/main.rs"));
println!("Process tests/test.rs: {}", filter.should_process("tests/test.rs"));
}
Compile patterns from configuration once at startup, then reuse.
use glob::Pattern;
use std::time::Instant;
fn main() {
let iterations = 100_000;
// Simple pattern
let simple_pattern = "*.rs";
// Complex pattern with character classes
let complex_pattern = "src/**/file[0-9][a-z].rs";
// Test simple pattern compilation
let start = Instant::now();
for _ in 0..iterations {
let _ = Pattern::new(simple_pattern).unwrap();
}
let simple_compile_time = start.elapsed();
// Test complex pattern compilation
let start = Instant::now();
for _ in 0..iterations {
let _ = Pattern::new(complex_pattern).unwrap();
}
let complex_compile_time = start.elapsed();
println!("Simple pattern compile: {:?}", simple_compile_time);
println!("Complex pattern compile: {:?}", complex_compile_time);
println!("Complex is {:.1}x slower to compile",
complex_compile_time.as_nanos() as f64 / simple_compile_time.as_nanos() as f64);
// Compare matching time
let simple = Pattern::new(simple_pattern).unwrap();
let complex = Pattern::new(complex_pattern).unwrap();
let start = Instant::now();
for _ in 0..iterations {
let _ = simple.matches("src/main.rs");
}
let simple_match_time = start.elapsed();
let start = Instant::now();
for _ in 0..iterations {
let _ = complex.matches("src/dir/file1a.rs");
}
let complex_match_time = start.elapsed();
println!("\nSimple pattern match: {:?}", simple_match_time);
println!("Complex pattern match: {:?}", complex_match_time);
}
Complex patterns tend to cost more to compile and match than simple ones; the benchmark above shows by how much on a given machine.
use glob::Pattern;
fn validate_patterns(patterns: &[&str]) -> Result<Vec<Pattern>, String> {
patterns
.iter()
.map(|p| {
Pattern::new(p).map_err(|e| {
format!("Invalid pattern '{}': {}", p, e)
})
})
.collect()
}
fn main() {
// Validate all patterns at startup
let patterns = ["src/**/*.rs", "tests/**/*.rs", "invalid[pattern"];
match validate_patterns(&patterns) {
Ok(compiled) => {
println!("All patterns valid, {} compiled", compiled.len());
}
Err(e) => {
println!("Pattern validation failed: {}", e);
}
}
// Alternative: compile on demand with error handling
let on_demand = |pattern_str: &str| -> Result<bool, glob::PatternError> {
let pattern = Pattern::new(pattern_str)?;
Ok(pattern.matches("src/main.rs"))
};
println!("On-demand match: {:?}", on_demand("src/**/*.rs"));
}
Validate and compile patterns at startup to catch errors early.
use glob::Pattern;
use std::path::Path;
fn main() {
// glob::Pattern - matches against glob syntax
let glob_pattern = Pattern::new("src/**/*.rs").unwrap();
// std::path - matches against path components
let path = Path::new("src/main.rs");
// glob pattern matching
println!("Glob matches: {}", glob_pattern.matches_path(path));
// Manual path matching (more limited)
println!("Extension: {:?}", path.extension());
println!("Parent: {:?}", path.parent());
// For complex patterns, glob is more expressive
// For simple checks, Path methods may be faster
}
Pattern provides expressive matching; Path methods are simpler for basic checks.
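For the narrow case of "does this path end in a given extension", the standard library alone is enough. The sketch below uses only std; has_extension is a hypothetical helper name.

```rust
use std::ffi::OsStr;
use std::path::Path;

/// True when the final path component has exactly the given extension.
fn has_extension(path: &Path, wanted: &str) -> bool {
    path.extension() == Some(OsStr::new(wanted))
}

fn main() {
    for candidate in ["src/main.rs", "src/main.rs.bak", "Makefile"] {
        println!("{candidate}: {}", has_extension(Path::new(candidate), "rs"));
    }
}
```

This says nothing about which directory the file lives in; constraints such as src/**/ still call for a glob pattern.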
use glob::Pattern;
use std::sync::Arc;
use std::thread;
fn main() {
// Pattern is Send + Sync, can be shared across threads
let pattern = Arc::new(Pattern::new("src/**/*.rs").unwrap());
let handles: Vec<_> = (0..4)
.map(|_| {
let pattern = Arc::clone(&pattern);
thread::spawn(move || {
// Each thread uses the same compiled pattern
let test_paths = vec!["src/main.rs", "tests/test.rs", "src/lib/mod.rs"];
for path in test_paths {
if pattern.matches(path) {
println!("Thread matched: {}", path);
}
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
}
Pattern can be shared across threads without recompilation.
use glob::Pattern;
#[derive(Debug)]
struct Rule {
pattern: Pattern,
action: String,
}
impl Rule {
fn new(pattern_str: &str, action: &str) -> Result<Self, glob::PatternError> {
Ok(Self {
pattern: Pattern::new(pattern_str)?,
action: action.to_string(),
})
}
fn matches(&self, path: &str) -> Option<&str> {
if self.pattern.matches(path) {
Some(&self.action)
} else {
None
}
}
}
fn main() {
let rules = [
Rule::new("*.rs", "compile").unwrap(),
Rule::new("*.md", "copy").unwrap(),
Rule::new("*.toml", "process").unwrap(),
];
let files = ["main.rs", "readme.md", "config.toml", "test.txt"];
for file in files {
for rule in &rules {
if let Some(action) = rule.matches(file) {
println!("{}: {} (rule: {})", file, action, rule.pattern.as_str());
}
}
}
}
Storing Pattern in structs enables efficient rule-based matching.
Compilation cost comparison:
| Pattern Type | Compilation Cost | Match Cost |
|--------------|------------------|------------|
| Simple (*) | Low | Low |
| Character class ([a-z]) | Medium | Medium |
| Recursive (**) | Low | Higher |
| Complex (src/**/[A-Z]*.rs) | Higher | Higher |
When to compile once:
| Scenario | Recommendation |
|----------|----------------|
| Matching many paths | Compile once, reuse |
| Pattern from config | Compile at startup |
| Long-running service | Cache patterns |
| Multiple files, same pattern | Pattern struct |
| Concurrent matching | Arc<Pattern> |
When inline compilation is acceptable:
| Scenario | Reasoning |
|----------|-----------|
| One-off match | Overhead negligible |
| Dynamic patterns | Must compile anyway |
| CLI with one match | Simplicity over optimization |
| Prototyping | Optimize later |
Key insight: The glob::Pattern::new function performs real work: parsing the glob syntax, validating bracket expressions, and building an internal representation that can efficiently match paths. This work is done once per new call, regardless of how many subsequent matches occur. For a script that matches a single path, compiling inline is fine. But for a file watcher that matches thousands of files against the same patterns, or a web server that routes requests by path patterns, the cost of repeated compilation adds up.
The Pattern struct is designed to be created once and used many times: it is Send + Sync for sharing across threads, Clone for copying without re-parsing, and its matching methods take &self and work on the compiled representation.
The trade-off isn't just performance; it is also about error handling. If you compile patterns at startup, validation errors surface immediately with clear context. If you compile on demand, a typo in a configuration pattern only reveals itself when that pattern is first needed, possibly in production during an edge case. Compile early, validate early, and let the compiled Pattern do the heavy lifting of matching without re-parsing.