Rust walkthroughs
rayon::iter::ParallelBridge and par_iter for sequential-to-parallel conversion?

rayon offers two routes to parallel iteration: `par_iter` converts collections directly into parallel iterators with optimal performance characteristics, while `ParallelBridge` (via its `par_bridge` method) wraps sequential iterators to make them parallel-compatible, at the cost of reduced efficiency and control. The fundamental trade-off is convenience versus performance. `par_iter` requires owning or borrowing a collection that implements `IntoParallelIterator`, which lets rayon split work optimally across threads; `par_bridge` can parallelize any iterator, but it loses information about work distribution, forcing rayon to pull items sequentially from the source and distribute them less efficiently. `par_iter` should be the default choice when working with collections like `Vec`, `HashMap`, or ranges. Reserve `par_bridge` for cases where an existing sequential iterator cannot easily be converted to a parallel one, such as an `Iterator` trait object received from another library, or a data source that doesn't implement the parallel iterator traits.
```rust
use rayon::prelude::*;

fn main() {
    // par_iter works directly on collections that implement IntoParallelIterator
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // par_iter creates a parallel iterator that borrows the data
    let sum: i32 = data.par_iter().sum();
    println!("Sum: {}", sum);

    // into_par_iter takes ownership and yields owned items
    let squares: Vec<i32> = data.into_par_iter().map(|x| x * x).collect();
    println!("Squares: {:?}", squares);

    // par_iter gives rayon direct access to the collection: it can
    // split the data optimally across threads, with each thread
    // getting a contiguous "slice" of the original data.
}
```

par_iter provides rayon with direct knowledge of the collection structure.
```rust
use rayon::iter::ParallelBridge;
use rayon::prelude::*;

fn main() {
    // par_bridge wraps any Iterator to make it parallel
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];

    // Create a sequential iterator first
    let seq_iter = data.iter().filter(|x| *x % 2 == 0);

    // Bridge it to parallel (the ParallelBridge trait provides par_bridge)
    let sum: i32 = seq_iter.par_bridge().sum();
    println!("Sum of evens: {}", sum);

    // par_bridge works with ANY Iterator: it pulls items sequentially
    // and distributes them to threads, so the original collection is
    // never visible to rayon.
}
```

par_bridge converts any sequential iterator into a parallel one.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    let data: Vec<i32> = (0..1000).collect();

    // par_iter: rayon knows the collection
    // - It can split at any point
    // - It can estimate work distribution
    // - It can divide work evenly across threads
    data.par_iter().for_each(|_x| {
        // Each thread processes a contiguous slice:
        // thread 1 might process 0-249, thread 2 250-499, etc.
    });

    // par_bridge: rayon sees only an Iterator
    // - Items are pulled one by one
    // - Distribution happens via a work-stealing queue
    // - The split cannot be planned in advance
    data.iter().par_bridge().for_each(|_x| {
        // Items are pulled sequentially, then distributed;
        // thread assignment is less predictable.
    });

    // Key difference: par_iter has collection knowledge,
    // par_bridge has only the Iterator interface.
}
```

par_iter enables optimal work splitting; par_bridge uses work-stealing.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;
use std::time::Instant;

fn main() {
    // i64 avoids overflow: the doubled sum is ~1e12, beyond i32::MAX
    let data: Vec<i64> = (0..1_000_000).collect();

    // par_iter performance
    let start = Instant::now();
    let sum1: i64 = data.par_iter().map(|x| x * 2).sum();
    let par_iter_time = start.elapsed();
    println!("par_iter: {:?} (sum {})", par_iter_time, sum1);

    // par_bridge performance (typically slower)
    let start = Instant::now();
    let sum2: i64 = data.iter().map(|x| x * 2).par_bridge().sum();
    let bridge_time = start.elapsed();
    println!("par_bridge: {:?} (sum {})", bridge_time, sum2);

    // par_bridge is slower because:
    // 1. Items are pulled sequentially from the iterator
    // 2. Distribution goes through a work-stealing queue
    // 3. Cache locality is worse
    // 4. There is more synchronization overhead
    // par_iter is faster because:
    // 1. It has direct access to collection slices
    // 2. Known sizes enable optimal splitting
    // 3. Cache locality is better
    // 4. There is less coordination overhead
}
```

par_iter has inherent performance advantages over par_bridge.
```rust
use rayon::iter::ParallelBridge;
use rayon::prelude::*;

// Case 1: Receiving an Iterator from another source
fn process_iterator(iter: impl Iterator<Item = i32> + Send) -> i32 {
    // We only have an Iterator, not a collection, so par_iter won't work here
    iter.par_bridge().map(|x| x * 2).sum()
}

// Case 2: Generator-style iteration
fn fibonacci(n: usize) -> impl Iterator<Item = u64> + Send {
    let mut a: u64 = 0;
    let mut b: u64 = 1;
    std::iter::from_fn(move || {
        let current = a;
        a = b;
        b = current + b;
        Some(current)
    })
    .take(n)
}

fn main() {
    // The generator doesn't implement IntoParallelIterator.
    // (90 terms, not 1000: later Fibonacci numbers overflow u64.)
    let sum: u64 = fibonacci(90)
        .par_bridge()
        .filter(|&x| x % 2 == 0)
        .sum();
    println!("Sum of even Fibonacci: {}", sum);
    println!("Doubled: {}", process_iterator(0..10));
    println!("Chained: {}", process_chained());

    // Case 3: Lines from a file are sequential by nature.
    // std::io::BufRead::lines() yields an Iterator, so par_bridge enables
    // parallel processing; par_iter would require collecting the lines
    // first, which is inefficient.
}

// Case 4: Chained operations with a non-parallel iterator
fn process_chained() -> i32 {
    (0..1000)
        .filter(|x| x % 3 == 0) // Filtered sequentially first
        .par_bridge() // Then processed in parallel
        .map(|x| x * x)
        .sum()
}
```

Use par_bridge when you have an Iterator that cannot be converted into a parallel iterator.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    // par_iter constraints:
    // - The collection must implement IntoParallelIterator
    // - Items must implement Send (for safety)
    // - You own or borrow the collection
    let vec: Vec<i32> = vec![1, 2, 3, 4, 5];
    vec.par_iter().for_each(|x| println!("{}", x)); // Works

    // par_bridge constraints:
    // - The iterator itself must implement Iterator + Send
    // - Items must implement Send
    vec.iter().par_bridge().for_each(|x| println!("{}", x)); // Works

    // Note: par_bridge does NOT require 'static. Like par_iter, it can
    // work with borrowed data, because rayon blocks the caller until
    // the parallel operation finishes. What it adds is the Send bound
    // on the iterator itself, since rayon's workers share the iterator
    // across threads; an iterator over non-Send items (Rc<T>, say)
    // cannot be bridged.
    fn works(data: &[i32]) {
        data.par_iter().for_each(|x| println!("{}", x));
        data.iter().par_bridge().for_each(|x| println!("{}", x));
    }
    works(&[6, 7, 8]);
}
```

par_bridge's extra requirement is that the iterator itself, not just its items, must be Send.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    let data: Vec<i32> = (0..100).collect();

    // par_iter: chunking can be tuned with with_max_len and with_min_len
    data.par_iter()
        .with_max_len(10) // Split into chunks of at most 10
        .for_each(|_x| {
            // Predictable chunk sizes
        });

    // par_bridge: less control; work is pulled and distributed dynamically
    data.iter().par_bridge().for_each(|_x| {
        // Chunk distribution cannot be controlled
    });

    // par_iter respects both bounds together:
    data.par_iter()
        .with_min_len(5) // Chunks of at least 5 items
        .with_max_len(20) // Chunks of at most 20 items
        .for_each(|_| {});
}
```

par_iter provides more control over work-distribution granularity.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    let data: Vec<i32> = (0..1_000_000).collect();

    // par_iter: no additional allocation for splitting.
    // Rayon works with slices of the original vector, so the memory
    // layout is just the original Vec plus per-thread slice views.
    data.par_iter().for_each(|_| {
        // Each thread has a view into the original memory;
        // no data is copied between threads.
    });

    // par_bridge: uses a work-stealing queue. Items are pulled from
    // the sequential iterator and pushed to a concurrent queue, so
    // the memory layout is the original Vec plus the queue.
    data.iter().par_bridge().for_each(|_| {
        // Items fetched sequentially, then distributed via the queue
    });

    // par_iter is more memory-efficient;
    // par_bridge pays queue-allocation overhead.
}
```

par_iter avoids allocation overhead by working with slices.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

// Use par_iter when:
// 1. You have a collection (Vec, array, HashMap keys, etc.)
// 2. You own or can borrow the collection
// 3. Performance matters
// 4. You want control over chunk sizes

// Use par_bridge when:
// 1. You only have an Iterator (not a collection)
// 2. The source doesn't implement IntoParallelIterator
// 3. You're working with generator functions
// 4. You're processing sequential streams (file lines, network)
// 5. Convenience outweighs performance

fn main() {
    let data: Vec<i32> = (0..1000).collect();

    // ✅ GOOD: use par_iter for collections
    data.par_iter().for_each(|_x| {
        // Optimal performance
    });

    // ❌ AVOID: don't use par_bridge when par_iter works
    data.iter().par_bridge().for_each(|_x| {
        // Unnecessary overhead
    });

    // ✅ NECESSARY: use par_bridge for true iterators
    (0..1000)
        .filter(|x| x % 7 == 0) // Sequential filter
        .par_bridge() // Then parallel
        .for_each(|_x| {
            // The only option for an already-filtered iterator
        });
}
```

Choose par_iter by default; use par_bridge only when necessary.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    let data: Vec<i32> = (0..1000).collect();

    // Sometimes you want sequential operations before parallel ones;
    // par_bridge enables this pattern.

    // Sequential filtering, then parallel processing
    let result: i32 = data
        .iter()
        .filter(|x| **x % 2 == 0) // Sequential filter
        .par_bridge() // Bridge to parallel
        .map(|x| x * x) // Parallel map
        .sum(); // Parallel reduce
    println!("Result: {}", result);

    // Alternative: par_iter with filter (the filter is parallel too)
    let result2: i32 = data
        .par_iter()
        .filter(|x| **x % 2 == 0) // Parallel filter
        .map(|x| x * x) // Parallel map
        .sum(); // Parallel reduce
    println!("Result2: {}", result2);

    // The difference:
    // - par_bridge: the filter runs sequentially, the rest in parallel
    // - par_iter: the entire pipeline is parallel
    // Use par_bridge when sequential pre-processing is intentional:
    // - Complex sequential logic
    // - Stateful iteration
    // - An external API that returns an Iterator
}
```

par_bridge enables mixing sequential and parallel operations intentionally.
```rust
use rayon::prelude::*;
use rayon::iter::ParallelBridge;

fn main() {
    // par_iter with try_fold for fallible operations.
    // Note: rayon's try_fold yields one partial Result per chunk, so
    // try_reduce is needed to combine them into a single Result.
    let data: Vec<i32> = (0..100).collect();
    let result: Result<i32, &str> = data
        .par_iter()
        .try_fold(
            || 0,
            |acc, x| {
                if *x > 50 {
                    Err("Value too large")
                } else {
                    Ok(acc + x)
                }
            },
        )
        .try_reduce(|| 0, |a, b| Ok(a + b));

    // par_bridge also supports try_fold
    let result2: Result<i32, &str> = data
        .iter()
        .par_bridge()
        .try_fold(
            || 0,
            |acc, x| {
                if *x > 50 {
                    Err("Value too large")
                } else {
                    Ok(acc + x)
                }
            },
        )
        .try_reduce(|| 0, |a, b| Ok(a + b));

    // Both short-circuit on error
    println!("{:?} {:?}", result, result2);
}
```

Both support fallible operations, but par_iter's known splits make early termination more efficient.
Comparison table:

| Aspect | par_iter | par_bridge |
|--------|----------|------------|
| Input type | Collection (Vec, &[T], range) | Any Iterator |
| Work distribution | Optimal splitting | Work-stealing queue |
| Memory overhead | Minimal (slices) | Queue allocation |
| Performance | Best | Acceptable |
| Send bounds | Items must be Send | Iterator and items must be Send |
| Control | with_min_len/with_max_len | Limited |
| Use case | Default choice | When only an Iterator is available |
Decision flow:

| Situation | Choice |
|-----------|--------|
| Have Vec, &[T], range, etc. | par_iter |
| Have HashMap, HashSet | par_iter |
| Received a generic Iterator | par_bridge |
| Generator/stream source | par_bridge |
| Need sequential preprocessing | par_bridge |
| Performance critical | par_iter |
Key insight: The trade-off between par_iter and par_bridge fundamentally concerns what rayon knows about your data. par_iter gives rayon a collection view: rayon sees the slice bounds, can calculate optimal split points, and distributes contiguous memory regions to threads with minimal coordination. par_bridge gives rayon only an Iterator view: rayon pulls items one at a time through next(), places them in a work-stealing queue, and lets threads grab items dynamically. This is why par_iter is faster; it preserves the "collectionness" that enables optimal parallel decomposition.

Reserve par_bridge for genuine cases where an Iterator cannot be converted to a parallel iterator: an iterator received from an external library, a generator function built with iter::from_fn, or a stream that doesn't fit the collection model. The performance penalty of par_bridge is acceptable when there is no alternative, but using it when par_iter would work leaves performance on the table. One further practical constraint: because workers share the source across threads, par_bridge requires the iterator itself, not just its items, to be Send. Neither adapter demands 'static; both can work with borrowed data, since rayon blocks the calling thread until the parallel operation completes.