What is the purpose of rayon::iter::ParallelBridge for converting sequential iterators to parallel?
rayon::iter::ParallelBridge provides a bridge between sequential Rust iterators and Rayon's parallel iterators through the par_bridge() method, enabling any Iterator + Send to be consumed in parallel without implementing Rayon's ParallelIterator trait directly. This is useful when you have an existing sequential iterator, perhaps from a library you don't control, or from a source that doesn't fit Rayon's parallel iterator model, and you want to process its items in parallel. The bridge pulls items from the sequential iterator in one thread and distributes them to Rayon's thread pool for parallel processing, though this introduces some overhead compared to native ParallelIterator implementations.
Understanding the Sequential-to-Parallel Gap
use rayon::prelude::*;
// Rayon's ParallelIterator trait enables parallel iteration
// It's implemented for slices, ranges, and other parallel-aware types
fn parallel_example() {
// Native parallel iteration (efficient)
let sum: i32 = (0..1000)
.into_par_iter() // Creates ParallelIterator directly
.map(|x| x * x)
.sum();
// But what if you have a sequential Iterator?
// You can't call into_par_iter() on it
let sequential_iter = 0..1000; // This is an Iterator
// sequential_iter.into_par_iter() // ERROR: Iterator doesn't implement ParallelIterator
// You'd need to collect first:
let sum: i32 = sequential_iter.collect::<Vec<_>>()
.into_par_iter()
.map(|x| x * x)
.sum();
// This allocates a Vec just to get parallel iteration
}
Rayon's ParallelIterator requires types that implement it; sequential Iterator types don't automatically work.
The par_bridge Method
use rayon::prelude::*;
fn bridge_example() {
// par_bridge() converts Iterator to ParallelIterator
let sequential_iter = 0..1000;
let sum: i32 = sequential_iter
.par_bridge() // Bridge to parallel!
.map(|x| x * x)
.sum();
// No intermediate Vec allocation
// Items are pulled from the iterator and processed in parallel
}
par_bridge() bridges the gap between sequential Iterator and parallel ParallelIterator.
When par_bridge is Necessary
use rayon::prelude::*;
use std::io::{BufRead, BufReader};
// Example 1: Reading from stdin or files
// Stdin lines is an Iterator, not ParallelIterator
fn process_stdin() {
let stdin = std::io::stdin();
let reader = BufReader::new(stdin.lock());
// This is a sequential Iterator
let lines = reader.lines();
// Process lines in parallel with par_bridge
lines
.par_bridge()
.for_each(|line| {
if let Ok(text) = line {
// Process each line in parallel
process_line(&text);
}
});
}
// Example 2: External library iterator
fn process_from_library() {
// Suppose a library returns an Iterator
let external_iter = some_library::get_items();
// Can't implement ParallelIterator for external type
// But can bridge it:
external_iter
.par_bridge()
.for_each(|item| {
process_item(item);
});
}
// Example 3: Generator-style iteration
fn process_generated() {
// Iterators with closure state don't implement ParallelIterator
let generated = std::iter::successors(Some(1), |&n| {
if n < 1000 { Some(n * 2) } else { None }
});
generated
.par_bridge()
.for_each(|n| {
println!("Processing: {}", n);
});
}
par_bridge is useful when you can't implement ParallelIterator or control the source.
The ParallelBridge Trait
use rayon::iter::ParallelBridge;
use rayon::prelude::*;
// The trait is simple:
// pub trait ParallelBridge: Iterator + Send {
// fn par_bridge(self) -> IterBridge<Self> { ... }
// }
// It's automatically implemented for every Iterator + Send whose items are Send
fn trait_example() {
// Any Iterator + Send can use par_bridge
let iter = 0..10; // Range<i32>: Iterator + Send
// par_bridge is available
let par_iter = iter.par_bridge();
// par_bridge works with:
// - std::iter::Range
// - std::vec::IntoIter
// - std::collections::hash_map::Iter (if items are Send)
// - Any custom Iterator that is Send
}
ParallelBridge is implemented for all Iterator + Send types automatically.
Send Requirement
use rayon::prelude::*;
// The iterator must be Send
// This means the iterator itself can be sent between threads
fn send_requirement() {
// OK: Range is Send
let _ok = (0..100).par_bridge();
// OK: Vec iterator is Send
let _ok = vec![1, 2, 3].into_iter().par_bridge();
// NOT OK: Rc is not Send
// let not_send = std::rc::Rc::new(42);
// let iter = std::iter::once(not_send);
// let _bad = iter.par_bridge(); // ERROR: Rc is not Send
// NOT OK: RefCell iterator is not Send
// let cell = std::cell::RefCell::new(42);
// let iter = std::iter::once(&cell);
// let _bad = iter.par_bridge(); // ERROR: &RefCell is not Send
}
Both the iterator and its items must be Send: the iterator is moved between threads, and its items are handed off to worker threads.
How par_bridge Works Internally
use rayon::prelude::*;
// Internal behavior (simplified):
// 1. One worker thread pulls items from the sequential iterator
// 2. Items are placed in a work queue
// 3. Other worker threads take items from the queue
// 4. Processing happens in parallel
fn how_it_works() {
let iter = 0..1000;
// par_bridge creates a split-pull pattern:
// - One thread owns the iterator
// - It pulls items and sends them to worker threads
// - Workers process items in parallel
iter.par_bridge()
.for_each(|n| {
// This closure may run on any thread in the pool
// Items come from the sequential iterator
});
// The bridge introduces some coordination overhead:
// - Iterator thread must communicate with worker threads
// - Work queue adds synchronization
// - Less efficient than native ParallelIterator
}
par_bridge pulls items from one thread and distributes them to worker threads.
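The split-pull pattern described above can be imitated with just the standard library. The following is an illustrative toy, not Rayon's actual implementation (Rayon uses a work-stealing deque and a reusable thread pool, not a Mutex-guarded channel and fresh threads):

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Toy bridge: one producer thread pulls from the sequential iterator,
// several worker threads take items from a shared queue.
fn bridge_sketch<I, F>(iter: I, workers: usize, f: F)
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
    F: Fn(I::Item) + Send + Sync + 'static,
{
    let (tx, rx) = mpsc::channel::<I::Item>();
    let rx = Arc::new(Mutex::new(rx));
    let f = Arc::new(f);

    // Producer: owns the sequential iterator and feeds the queue.
    let producer = thread::spawn(move || {
        for item in iter {
            if tx.send(item).is_err() {
                break;
            }
        }
        // tx is dropped here; workers see the channel close and stop.
    });

    // Workers: each repeatedly takes the next available item.
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let f = Arc::clone(&f);
            thread::spawn(move || loop {
                // Lock is released before f runs, so workers overlap.
                let item = match rx.lock().unwrap().recv() {
                    Ok(item) => item,
                    Err(_) => break, // channel closed: iterator exhausted
                };
                f(item);
            })
        })
        .collect();

    producer.join().unwrap();
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    use std::sync::atomic::{AtomicI64, Ordering};
    let sum = Arc::new(AtomicI64::new(0));
    let sum2 = Arc::clone(&sum);
    bridge_sketch(0..100i64, 4, move |n| {
        sum2.fetch_add(n * n, Ordering::Relaxed);
    });
    // The sum of squares 0..100 is 328350 regardless of processing order.
    assert_eq!(sum.load(Ordering::Relaxed), 328350);
}
```

The sketch shows why the producer can become a bottleneck: every item passes through a single thread before any worker sees it.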
Performance Characteristics
use rayon::prelude::*;
use std::time::Instant;
fn performance_comparison() {
// i64: the doubled sum (~10^12) would overflow i32
let data: Vec<i64> = (0..1_000_000).collect();
// Native parallel iteration (most efficient)
let start = Instant::now();
let sum1: i64 = data.par_iter().map(|&x| x * 2).sum();
println!("Native parallel: {:?}", start.elapsed());
// par_bridge from sequential (less efficient)
let start = Instant::now();
let sum2: i64 = data.iter().par_bridge().map(|&x| x * 2).sum();
println!("par_bridge: {:?}", start.elapsed());
// Sequential (baseline)
let start = Instant::now();
let sum3: i64 = data.iter().map(|&x| x * 2).sum();
println!("Sequential: {:?}", start.elapsed());
// par_bridge overhead:
// - Iterator thread coordination
// - Work queue synchronization
// - Less predictable work distribution
// Use native ParallelIterator when possible
// Use par_bridge when necessary for compatibility
}
par_bridge has overhead compared to native ParallelIterator; use native when available.
Ordering Guarantees
use rayon::prelude::*;
fn ordering_example() {
// Sequential iteration: items processed in order
(0..10)
.for_each(|n| print!("{} ", n));
// Output: 0 1 2 3 4 5 6 7 8 9
println!();
// Parallel iteration: order NOT guaranteed
(0..10)
.into_par_iter()
.for_each(|n| print!("{} ", n));
// Output: could be 3 7 1 0 5 ... (any order)
println!();
// par_bridge: also NOT guaranteed order
(0..10)
.par_bridge()
.for_each(|n| print!("{} ", n));
// Output: could be any order
// The iterator yields items in order
// But parallel processing means results can interleave
}
par_bridge provides no ordering guarantees; items are processed as they're pulled.
Ordering and collect
use rayon::prelude::*;
fn collect_ordering() {
// With a native INDEXED parallel iterator, collect() reassembles
// results in the original order:
let squares: Vec<i32> = (0..10)
.into_par_iter()
.map(|n| n * n)
.collect();
// Always [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] - each item's
// position is known, so results land in the right slot
println!("{:?}", squares);
// par_bridge produces an UNINDEXED ParallelIterator, so collect()
// gathers results in whatever order the workers produce them:
let bridged: Vec<i32> = (0..10)
.par_bridge()
.map(|n| n * n)
.collect();
// Same elements, but the order is unspecified, e.g. [4, 0, 9, 1, ...]
println!("{:?}", bridged);
// If order matters, carry the original index through the pipeline
// (enumerate before bridging, then sort by index afterwards)
}
Unlike indexed parallel iterators, par_bridge does not preserve source order, even through collect(); pair items with their index if order matters.
Real-World Example: File Processing
use rayon::prelude::*;
use std::fs;
use std::path::Path;
fn process_directory(dir: &Path) {
// fs::read_dir returns a sequential Iterator (ReadDir), which is Send
let entries = fs::read_dir(dir)
.expect("Failed to read directory");
// Process entries in parallel with par_bridge - no need to collect
// into a Vec first
entries
.par_bridge()
.filter_map(|entry| entry.ok())
.filter(|entry| entry.path().extension().map_or(false, |ext| ext == "txt"))
.for_each(|entry| {
process_text_file(&entry.path());
});
}
fn process_text_file(path: &Path) {
// Read and process file contents
if let Ok(contents) = std::fs::read_to_string(path) {
let lines = contents.lines().count();
println!("{}: {} lines", path.display(), lines);
}
}
// This is more efficient than:
// - Processing files sequentially
// - Loading all files into memory first
par_bridge enables parallel file processing from a sequential directory iterator.
Real-World Example: Stream Processing
use rayon::prelude::*;
use std::io::{BufRead, BufReader};
fn process_stream_example() {
// Processing lines from stdin or a stream
let stdin = std::io::stdin();
let reader = BufReader::new(stdin.lock());
// The lines iterator is sequential
reader
.lines()
.par_bridge()
.filter_map(|line| line.ok())
.map(|line| process_line(&line))
.for_each(|result| {
println!("Result: {}", result);
});
}
fn process_line(line: &str) -> String {
// Expensive processing that benefits from parallelism
line.to_uppercase()
}
// Key points:
// 1. Lines come from sequential source (stdin)
// 2. Processing happens in parallel
// 3. No need to load all lines into memory
par_bridge allows parallel processing of streaming data from sequential sources.
Thread Safety Considerations
use rayon::prelude::*;
use std::sync::Mutex;
fn thread_safety() {
// Items from the iterator must be Send
let data: Vec<i32> = (0..100).collect();
// This is fine - i32 is Send
data.into_iter()
.par_bridge()
.for_each(|n| {
// Can safely use n in any thread
println!("{}", n);
});
// Shared state must use proper synchronization
let counter = Mutex::new(0);
(0..100)
.par_bridge()
.for_each(|_| {
// Safe: Mutex provides synchronization
let mut count = counter.lock().unwrap();
*count += 1;
});
// Final count
let count = counter.lock().unwrap();
println!("Count: {}", *count);
}
Items must be Send; shared state requires synchronization primitives.
Combining with Other Parallel Operations
use rayon::prelude::*;
fn combining_operations() {
// par_bridge creates a ParallelIterator
// You can use all ParallelIterator methods
let sum: i32 = (0..100)
.par_bridge()
.map(|x| x * 2) // Parallel map
.filter(|&x| x > 50) // Parallel filter
.reduce(|| 0, |a, b| a + b); // Parallel reduce
println!("Sum: {}", sum);
// More operations:
let has_even: bool = (0..100)
.par_bridge()
.any(|x| x % 2 == 0); // Parallel any
let all_positive: bool = (0..100)
.par_bridge()
.all(|x| x >= 0); // Parallel all
let first_50: Option<i32> = (0..100)
.par_bridge()
.find_first(|&x| x == 50); // Parallel find
// All ParallelIterator methods are available
}
par_bridge returns a full ParallelIterator, enabling all parallel operations.
Comparison with collect().into_par_iter()
use rayon::prelude::*;
use std::time::Instant;
fn comparison() {
// i64: the doubled sum (~10^12) would overflow i32
let size = 1_000_000i64;
let iter = 0..size;
// Option 1: par_bridge (no allocation)
let start = Instant::now();
let sum1: i64 = iter.clone()
.par_bridge()
.map(|x| x * 2)
.sum();
let bridge_time = start.elapsed();
// Option 2: collect then into_par_iter (allocates Vec)
let start = Instant::now();
let sum2: i64 = iter.clone()
.collect::<Vec<_>>()
.into_par_iter()
.map(|x| x * 2)
.sum();
let collect_time = start.elapsed();
let collect_time = start.elapsed();
println!("par_bridge: {:?}", bridge_time);
println!("collect + into_par_iter: {:?}", collect_time);
// Trade-offs:
// par_bridge:
// - No allocation
// - Better for streaming/large data
// - Slightly more coordination overhead
// collect + into_par_iter:
// - Allocation cost
// - Native ParallelIterator efficiency
// - Better work distribution
// - Good for small/medium data
}
par_bridge avoids allocation; collect().into_par_iter() may be faster for small data.
Limitations and Caveats
use rayon::prelude::*;
fn limitations() {
// 1. Iterator is consumed from one thread
// This is a bottleneck if the iterator is slow
// 2. Work distribution may be uneven
// The iterator yields items one at a time
// 3. Ordering is not preserved during processing
// 4. Some operations require indexed ParallelIterator
// par_bridge returns unindexed ParallelIterator
// So operations like split_at are not available
// This works:
(0..100)
.par_bridge()
.for_each(|_| {});
// This doesn't work - requires indexed:
// (0..100)
// .par_bridge()
// .chunks(10) // ERROR: needs IndexedParallelIterator
// .for_each(|_| {});
// Native ParallelIterator supports more operations:
(0..100)
.into_par_iter()
.chunks(10) // Works - Range is indexed
.for_each(|_| {});
}
par_bridge returns an unindexed ParallelIterator, limiting some operations.
When to Use par_bridge vs Alternatives
use rayon::prelude::*;
fn when_to_use() {
// Use par_bridge when:
// 1. You have a sequential Iterator from external source
// and can't implement ParallelIterator
let lines = std::io::stdin().lock().lines();
lines.par_bridge().for_each(|line| {
// Process lines in parallel
});
// 2. The iterator is lazy/infinite
// and you want parallel processing
let infinite = std::iter::successors(Some(1), |&n| Some(n + 1));
infinite
.take(1000) // truncate BEFORE bridging; rayon's take needs an indexed iterator
.par_bridge()
.for_each(|_n| {
// Process first 1000 items in parallel
});
// 3. Memory is constrained
// and you don't want to collect
// Use native ParallelIterator when:
// 1. You control the data structure
// (Vec, slice, range, etc.)
let vec = vec![1, 2, 3, 4, 5];
vec.par_iter().for_each(|&n| {
// Native is more efficient
});
// 2. You need indexed operations
// (chunks, split_at, etc.)
(0..100)
.into_par_iter()
.chunks(10) // Requires indexed
.for_each(|chunk| {
// Process chunks
});
}
Use par_bridge for external/lazy iterators; use native ParallelIterator when available.
Synthesis
par_bridge purpose:
// par_bridge converts Iterator to ParallelIterator
// Enabling parallel processing of sequential iterators
trait ParallelBridge: Iterator + Send {
fn par_bridge(self) -> IterBridge<Self>
where
Self: Sized;
}
// Usage:
iter.par_bridge()
.map(|x| process(x))
.for_each(|result| use_result(result));
How it works:
// 1. Creates a ParallelIterator adapter
// 2. Pulls items from Iterator in one thread
// 3. Distributes items to Rayon's thread pool
// 4. Processes items in parallel
// The bridge pattern:
// [Sequential Iterator] -> [One thread pulls] -> [Work queue] -> [Worker threads process]
Trade-offs:
// par_bridge advantages:
// - Works with any Iterator + Send
// - No allocation needed
// - Works with lazy/infinite iterators
// - Good for streaming data
// par_bridge disadvantages:
// - Coordination overhead
// - Iterator thread can be bottleneck
// - Unindexed ParallelIterator
// - Less efficient than native ParallelIterator
// When to use:
// - External/unknown Iterator types
// - Streaming from stdin/files
// - Memory-constrained scenarios
// - Lazy iteration patterns
// When NOT to use:
// - You have Vec, slice, or range
// - Need indexed operations
// - Performance is critical
// - Small data sets
Key insight: ParallelBridge::par_bridge() bridges the sequential and parallel worlds in Rayon, allowing any Iterator + Send to be processed in parallel without implementing ParallelIterator directly or collecting into an intermediate container. The bridge pulls items from the sequential iterator in a single thread and distributes them to Rayon's worker threads, enabling parallelism but with coordination overhead. It's most useful for iterators from external sources (like file I/O or library APIs) that you can't control, or for lazy/streaming data where collecting first isn't practical. For data you control (slices, vectors, ranges), the native into_par_iter() or par_iter() methods are more efficient because they provide better work distribution and support indexed operations. The choice between par_bridge and native parallel iterators should be based on whether you control the source and whether the overhead of the bridge is justified by avoiding allocation.
