What is the purpose of rayon::iter::ParallelBridge for converting sequential iterators to parallel?
rayon::iter::ParallelBridge provides a bridge between sequential Rust iterators and Rayon's parallel iterators through the par_bridge() method, enabling any Iterator + Send to be consumed in parallel without implementing Rayon's ParallelIterator trait directly. This is useful when you have an existing sequential iterator, perhaps from a library you don't control, or from a source that doesn't fit Rayon's parallel iterator model, and you want to process its items in parallel. The bridge pulls items from the sequential iterator in one thread and distributes them to Rayon's thread pool for parallel processing, though this introduces some overhead compared to native ParallelIterator implementations.
Understanding the Sequential-to-Parallel Gap
use rayon::prelude::*;
// Rayon's ParallelIterator trait enables parallel iteration
// It's implemented for slices, ranges, and other parallel-aware types
fn parallel_example() {
// Native parallel iteration (efficient)
let sum: i32 = (0..1000)
.into_par_iter() // Creates ParallelIterator directly
.map(|x| x * x)
.sum();
// But what if you have a sequential Iterator?
// You can't call into_par_iter() on it
let sequential_iter = 0..1000; // This is an Iterator
// sequential_iter.into_par_iter() // ERROR: Iterator doesn't implement ParallelIterator
// You'd need to collect first:
let sum: i32 = sequential_iter.collect::<Vec<_>>()
.into_par_iter()
.map(|x| x * x)
.sum();
// This allocates a Vec just to get parallel iteration
}
Rayon's ParallelIterator requires types that implement it; sequential Iterator types don't automatically work.
The par_bridge Method
use rayon::prelude::*;
fn bridge_example() {
// par_bridge() converts Iterator to ParallelIterator
let sequential_iter = 0..1000;
let sum: i32 = sequential_iter
.par_bridge() // Bridge to parallel!
.map(|x| x * x)
.sum();
// No intermediate Vec allocation
// Items are pulled from the iterator and processed in parallel
}
par_bridge() bridges the gap between sequential Iterator and parallel ParallelIterator.
When par_bridge is Necessary
use rayon::prelude::*;
use std::io::{BufRead, BufReader};
// Example 1: Reading from stdin or files
// Stdin lines is an Iterator, not ParallelIterator
fn process_stdin() {
let stdin = std::io::stdin();
let reader = BufReader::new(stdin.lock());
// This is a sequential Iterator
let lines = reader.lines();
// Process lines in parallel with par_bridge
lines
.par_bridge()
.for_each(|line| {
if let Ok(text) = line {
// Process each line in parallel
process_line(&text);
}
});
}
// Example 2: External library iterator
fn process_from_library() {
// Suppose a library returns an Iterator
let external_iter = some_library::get_items();
// Can't implement ParallelIterator for external type
// But can bridge it:
external_iter
.par_bridge()
.for_each(|item| {
process_item(item);
});
}
// Example 3: Generator-style iteration
fn process_generated() {
// Iterators with closure state don't implement ParallelIterator
let generated = std::iter::successors(Some(1), |&n| {
if n < 1000 { Some(n * 2) } else { None }
});
generated
.par_bridge()
.for_each(|n| {
println!("Processing: {}", n);
});
}
par_bridge is useful when you can't implement ParallelIterator or control the source.
The ParallelBridge Trait
use rayon::iter::ParallelBridge;
use rayon::prelude::*;
// The trait is simple:
// pub trait ParallelBridge: Iterator + Send {
// fn par_bridge(self) -> IterBridge<Self> { ... }
// }
// It's automatically implemented for every Iterator + Send whose items are Send
fn trait_example() {
// Any Iterator + Send can use par_bridge
let iter = 0..10; // Range<i32>: Iterator + Send
// par_bridge is available
let par_iter = iter.par_bridge();
// par_bridge works with:
// - std::iter::Range
// - std::vec::IntoIter
// - std::collections::hash_map::Iter (if items are Send)
// - Any custom Iterator that is Send
}
ParallelBridge is implemented for all Iterator + Send types automatically.
Send Requirement
use rayon::prelude::*;
// The iterator must be Send
// This means the iterator itself can be sent between threads
fn send_requirement() {
// OK: Range is Send
let _ok = (0..100).par_bridge();
// OK: Vec iterator is Send
let _ok = vec![1, 2, 3].into_iter().par_bridge();
// NOT OK: Rc is not Send
// let not_send = std::rc::Rc::new(42);
// let iter = std::iter::once(not_send);
// let _bad = iter.par_bridge(); // ERROR: Rc is not Send
// NOT OK: RefCell iterator is not Send
// let cell = std::cell::RefCell::new(42);
// let iter = std::iter::once(&cell);
// let _bad = iter.par_bridge(); // ERROR: &RefCell is not Send
}
Both the iterator and its items must be Send: the iterator is moved between threads, and its items are handed off to worker threads.
How par_bridge Works Internally
use rayon::prelude::*;
// Internal behavior (simplified):
// 1. One worker thread pulls items from the sequential iterator
// 2. Items are placed in a work queue
// 3. Other worker threads take items from the queue
// 4. Processing happens in parallel
fn how_it_works() {
let iter = 0..1000;
// par_bridge creates a split-pull pattern:
// - One thread owns the iterator
// - It pulls items and sends them to worker threads
// - Workers process items in parallel
iter.par_bridge()
.for_each(|n| {
// This closure may run on any thread in the pool
// Items come from the sequential iterator
});
// The bridge introduces some coordination overhead:
// - Iterator thread must communicate with worker threads
// - Work queue adds synchronization
// - Less efficient than native ParallelIterator
}
par_bridge pulls items from one thread and distributes them to worker threads.
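The split-pull pattern described above can be imitated with just the standard library. The following is an illustrative toy, not Rayon's actual implementation (Rayon uses a work-stealing deque and a reusable thread pool, not a Mutex-guarded channel and fresh threads):

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Toy bridge: one producer thread pulls from the sequential iterator,
// several worker threads take items from a shared queue.
fn bridge_sketch<I, F>(iter: I, workers: usize, f: F)
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
    F: Fn(I::Item) + Send + Sync + 'static,
{
    let (tx, rx) = mpsc::channel::<I::Item>();
    let rx = Arc::new(Mutex::new(rx));
    let f = Arc::new(f);

    // Producer: owns the sequential iterator and feeds the queue.
    let producer = thread::spawn(move || {
        for item in iter {
            if tx.send(item).is_err() {
                break;
            }
        }
        // tx is dropped here; workers see the channel close and stop.
    });

    // Workers: each repeatedly takes the next available item.
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            let f = Arc::clone(&f);
            thread::spawn(move || loop {
                // Lock is released before f runs, so workers overlap.
                let item = match rx.lock().unwrap().recv() {
                    Ok(item) => item,
                    Err(_) => break, // channel closed: iterator exhausted
                };
                f(item);
            })
        })
        .collect();

    producer.join().unwrap();
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    use std::sync::atomic::{AtomicI64, Ordering};
    let sum = Arc::new(AtomicI64::new(0));
    let sum2 = Arc::clone(&sum);
    bridge_sketch(0..100i64, 4, move |n| {
        sum2.fetch_add(n * n, Ordering::Relaxed);
    });
    // The sum of squares 0..100 is 328350 regardless of processing order.
    assert_eq!(sum.load(Ordering::Relaxed), 328350);
}
```

The sketch shows why the producer can become a bottleneck: every item passes through a single thread before any worker sees it.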
Performance Characteristics
use rayon::prelude::*;
use std::time::Instant;
fn performance_comparison() {
// i64: the doubled sum (~10^12) would overflow i32
let data: Vec<i64> = (0..1_000_000).collect();
// Native parallel iteration (most efficient)
let start = Instant::now();
let sum1: i64 = data.par_iter().map(|&x| x * 2).sum();
println!("Native parallel: {:?}", start.elapsed());
// par_bridge from sequential (less efficient)
let start = Instant::now();
let sum2: i64 = data.iter().par_bridge().map(|&x| x * 2).sum();
println!("par_bridge: {:?}", start.elapsed());
// Sequential (baseline)
let start = Instant::now();
let sum3: i64 = data.iter().map(|&x| x * 2).sum();
println!("Sequential: {:?}", start.elapsed());
// par_bridge overhead:
// - Iterator thread coordination
// - Work queue synchronization
// - Less predictable work distribution
// Use native ParallelIterator when possible
// Use par_bridge when necessary for compatibility
}
par_bridge has overhead compared to native ParallelIterator; use native when available.
Ordering Guarantees
use rayon::prelude::*;
fn ordering_example() {
// Sequential iteration: items processed in order
(0..10)
.for_each(|n| print!("{} ", n));
// Output: 0 1 2 3 4 5 6 7 8 9
println!();
// Parallel iteration: order NOT guaranteed
(0..10)
.into_par_iter()
.for_each(|n| print!("{} ", n));
// Output: could be 3 7 1 0 5 ... (any order)
println!();
// par_bridge: also NOT guaranteed order
(0..10)
.par_bridge()
.for_each(|n| print!("{} ", n));
// Output: could be any order
// The iterator yields items in order
// But parallel processing means results can interleave
}
par_bridge provides no ordering guarantees; items are processed as they're pulled.
Ordering and collect
use rayon::prelude::*;
fn collect_ordering() {
// With a native INDEXED parallel iterator, collect() reassembles
// results in the original order:
let squares: Vec<i32> = (0..10)
.into_par_iter()
.map(|n| n * n)
.collect();
// Always [0, 1, 4, 9, 16, 25, 36, 49, 64, 81] - each item's
// position is known, so results land in the right slot
println!("{:?}", squares);
// par_bridge produces an UNINDEXED ParallelIterator, so collect()
// gathers results in whatever order the workers produce them:
let bridged: Vec<i32> = (0..10)
.par_bridge()
.map(|n| n * n)
.collect();
// Same elements, but the order is unspecified, e.g. [4, 0, 9, 1, ...]
println!("{:?}", bridged);
// If order matters, carry the original index through the pipeline
// (enumerate before bridging, then sort by index afterwards)
}
Unlike indexed parallel iterators, par_bridge does not preserve source order, even through collect(); pair items with their index if order matters.
Real-World Example: File Processing
use rayon::prelude::*;
use std::fs;
use std::path::Path;
fn process_directory(dir: &Path) {
// fs::read_dir returns a sequential Iterator (ReadDir), which is Send
let entries = fs::read_dir(dir)
.expect("Failed to read directory");
// Process entries in parallel with par_bridge - no need to collect
// into a Vec first
entries
.par_bridge()
.filter_map(|entry| entry.ok())
.filter(|entry| entry.path().extension().map_or(false, |ext| ext == "txt"))
.for_each(|entry| {
process_text_file(&entry.path());
});
}
fn process_text_file(path: &Path) {
// Read and process file contents
if let Ok(contents) = std::fs::read_to_string(path) {
let lines = contents.lines().count();
println!("{}: {} lines", path.display(), lines);
}
}
// This is more efficient than:
// - Processing files sequentially
// - Loading all files into memory first
par_bridge enables parallel file processing from a sequential directory iterator.
Real-World Example: Stream Processing
use rayon::prelude::*;
use std::io::{BufRead, BufReader};
fn process_stream_example() {
// Processing lines from stdin or a stream
let stdin = std::io::stdin();
let reader = BufReader::new(stdin.lock());
// The lines iterator is sequential
reader
.lines()
.par_bridge()
.filter_map(|line| line.ok())
.map(|line| process_line(&line))
.for_each(|result| {
println!("Result: {}", result);
});
}
fn process_line(line: &str) -> String {
// Expensive processing that benefits from parallelism
line.to_uppercase()
}
// Key points:
// 1. Lines come from sequential source (stdin)
// 2. Processing happens in parallel
// 3. No need to load all lines into memory
par_bridge allows parallel processing of streaming data from sequential sources.
Thread Safety Considerations
use rayon::prelude::*;
use std::sync::Mutex;
fn thread_safety() {
// Items from the iterator must be Send
let data: Vec<i32> = (0..100).collect();
// This is fine - i32 is Send
data.into_iter()
.par_bridge()
.for_each(|n| {
// Can safely use n in any thread
println!("{}", n);
});
// Shared state must use proper synchronization
let counter = Mutex::new(0);
(0..100)
.par_bridge()
.for_each(|_| {
// Safe: Mutex provides synchronization
let mut count = counter.lock().unwrap();
*count += 1;
});
// Final count
let count = counter.lock().unwrap();
println!("Count: {}", *count);
}
Items must be Send; shared state requires synchronization primitives.
Combining with Other Parallel Operations
use rayon::prelude::*;
fn combining_operations() {
// par_bridge creates a ParallelIterator
// You can use all ParallelIterator methods
let sum: i32 = (0..100)
.par_bridge()
.map(|x| x * 2) // Parallel map
.filter(|&x| x > 50) // Parallel filter
.reduce(|| 0, |a, b| a + b); // Parallel reduce
println!("Sum: {}", sum);
// More operations:
let has_even: bool = (0..100)
.par_bridge()
.any(|x| x % 2 == 0); // Parallel any
let all_positive: bool = (0..100)
.par_bridge()
.all(|x| x >= 0); // Parallel all
let first_50: Option<i32> = (0..100)
.par_bridge()
.find_first(|&x| x == 50); // Parallel find
// All ParallelIterator methods are available
}
par_bridge returns a full ParallelIterator, enabling all parallel operations.
Comparison with collect().into_par_iter()
use rayon::prelude::*;
use std::time::Instant;
fn comparison() {
// i64: the doubled sum (~10^12) would overflow i32
let size = 1_000_000i64;
let iter = 0..size;
// Option 1: par_bridge (no allocation)
let start = Instant::now();
let sum1: i64 = iter.clone()
.par_bridge()
.map(|x| x * 2)
.sum();
let bridge_time = start.elapsed();
// Option 2: collect then into_par_iter (allocates Vec)
let start = Instant::now();
let sum2: i64 = iter.clone()
.collect::<Vec<_>>()
.into_par_iter()
.map(|x| x * 2)
.sum();
let collect_time = start.elapsed();
let collect_time = start.elapsed();
println!("par_bridge: {:?}", bridge_time);
println!("collect + into_par_iter: {:?}", collect_time);
// Trade-offs:
// par_bridge:
// - No allocation
// - Better for streaming/large data
// - Slightly more coordination overhead
// collect + into_par_iter:
// - Allocation cost
// - Native ParallelIterator efficiency
// - Better work distribution
// - Good for small/medium data
}
par_bridge avoids allocation; collect().into_par_iter() may be faster for small data.
Limitations and Caveats
use rayon::prelude::*;
fn limitations() {
// 1. Iterator is consumed from one thread
// This is a bottleneck if the iterator is slow
// 2. Work distribution may be uneven
// The iterator yields items one at a time
// 3. Ordering is not preserved during processing
// 4. Some operations require indexed ParallelIterator
// par_bridge returns unindexed ParallelIterator
// So operations like split_at are not available
// This works:
(0..100)
.par_bridge()
.for_each(|_| {});
// This doesn't work - requires indexed:
// (0..100)
// .par_bridge()
// .chunks(10) // ERROR: needs IndexedParallelIterator
// .for_each(|_| {});
// Native ParallelIterator supports more operations:
(0..100)
.into_par_iter()
.chunks(10) // Works - Range is indexed
.for_each(|_| {});
}
par_bridge returns an unindexed ParallelIterator, limiting some operations.
When to Use par_bridge vs Alternatives
use rayon::prelude::*;
fn when_to_use() {
// Use par_bridge when:
// 1. You have a sequential Iterator from external source
// and can't implement ParallelIterator
let lines = std::io::stdin().lock().lines();
lines.par_bridge().for_each(|line| {
// Process lines in parallel
});
// 2. The iterator is lazy/infinite
// and you want parallel processing
let infinite = std::iter::successors(Some(1), |&n| Some(n + 1));
infinite
.take(1000) // truncate BEFORE bridging; rayon's take needs an indexed iterator
.par_bridge()
.for_each(|_n| {
// Process first 1000 items in parallel
});
// 3. Memory is constrained
// and you don't want to collect
// Use native ParallelIterator when:
// 1. You control the data structure
// (Vec, slice, range, etc.)
let vec = vec![1, 2, 3, 4, 5];
vec.par_iter().for_each(|&n| {
// Native is more efficient
});
// 2. You need indexed operations
// (chunks, split_at, etc.)
(0..100)
.into_par_iter()
.chunks(10) // Requires indexed
.for_each(|chunk| {
// Process chunks
});
}
Use par_bridge for external/lazy iterators; use native ParallelIterator when available.
Synthesis
par_bridge purpose:
// par_bridge converts Iterator to ParallelIterator
// Enabling parallel processing of sequential iterators
trait ParallelBridge: Iterator + Send {
fn par_bridge(self) -> IterBridge<Self>
where
Self: Sized;
}
// Usage:
iter.par_bridge()
.map(|x| process(x))
.for_each(|result| use_result(result));
How it works:
// 1. Creates a ParallelIterator adapter
// 2. Pulls items from Iterator in one thread
// 3. Distributes items to Rayon's thread pool
// 4. Processes items in parallel
// The bridge pattern:
// [Sequential Iterator] -> [One thread pulls] -> [Work queue] -> [Worker threads process]
Trade-offs:
// par_bridge advantages:
// - Works with any Iterator + Send
// - No allocation needed
// - Works with lazy/infinite iterators
// - Good for streaming data
// par_bridge disadvantages:
// - Coordination overhead
// - Iterator thread can be bottleneck
// - Unindexed ParallelIterator
// - Less efficient than native ParallelIterator
// When to use:
// - External/unknown Iterator types
// - Streaming from stdin/files
// - Memory-constrained scenarios
// - Lazy iteration patterns
// When NOT to use:
// - You have Vec, slice, or range
// - Need indexed operations
// - Performance is critical
// - Small data sets
Key insight: ParallelBridge::par_bridge() bridges the sequential and parallel worlds in Rayon, allowing any Iterator + Send to be processed in parallel without implementing ParallelIterator directly or collecting into an intermediate container. The bridge pulls items from the sequential iterator in a single thread and distributes them to Rayon's worker threads, enabling parallelism but with coordination overhead. It's most useful for iterators from external sources (like file I/O or library APIs) that you can't control, or for lazy/streaming data where collecting first isn't practical. For data you control (slices, vectors, ranges), the native into_par_iter() or par_iter() methods are more efficient because they provide better work distribution and support indexed operations. The choice between par_bridge and native parallel iterators should be based on whether you control the source and whether the overhead of the bridge is justified by avoiding allocation.
