How does rayon::slice::ParallelSlice::par_chunks differ from par_chunks_mut for chunk-based parallel processing?
par_chunks provides immutable access to slice chunks for parallel read-only operations, while par_chunks_mut provides mutable access enabling parallel modification of the underlying data. Both methods split a slice into non-overlapping chunks and process them in parallel, but the key difference is mutability: par_chunks yields &[T] references that cannot modify the original data, while par_chunks_mut yields &mut [T] references that can write to the underlying slice in parallel without data races because chunks are guaranteed to be disjoint.
Basic Chunk-Based Processing
use rayon::prelude::*;
fn basic_chunks() {
    // Ten elements split into chunks of three: the final chunk holds the remainder.
    let values = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    values.par_chunks(3).for_each(|piece| {
        println!("Chunk: {:?}", piece);
    });
    // Possible output (parallel execution, so ordering is not deterministic):
    // Chunk: [1, 2, 3]
    // Chunk: [4, 5, 6]
    // Chunk: [7, 8, 9]
    // Chunk: [10]
}
par_chunks divides a slice into chunks and processes each chunk in parallel.
par_chunks for Immutable Access
use rayon::prelude::*;
fn par_chunks_immutable() {
    let data = vec![10, 20, 30, 40, 50, 60, 70, 80];
    // Sum each pair in parallel; chunks arrive as &[T], so reads only.
    let results = data
        .par_chunks(2)
        .map(|pair| pair.iter().sum::<u32>())
        .collect::<Vec<u32>>();
    // Pairs [10,20], [30,40], [50,60], [70,80] produce the sums below.
    assert_eq!(results, vec![30, 70, 110, 150]);
    // The source vector is left untouched by par_chunks.
    assert_eq!(data, vec![10, 20, 30, 40, 50, 60, 70, 80]);
}
par_chunks yields immutable references; the original data cannot be modified.
par_chunks_mut for Mutable Access
use rayon::prelude::*;
fn par_chunks_mutable() {
    let mut data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    // Each pair arrives as &mut [T]; doubling happens in place, in parallel.
    data.par_chunks_mut(2).for_each(|pair| {
        pair.iter_mut().for_each(|n| *n *= 2);
    });
    // [1,2] -> [2,4], [3,4] -> [6,8], and so on.
    assert_eq!(data, vec![2, 4, 6, 8, 10, 12, 14, 16]);
}
par_chunks_mut yields mutable references, allowing parallel modification of the slice.
Function Signatures
use rayon::prelude::*;
fn signature_comparison() {
    // Simplified signatures:
    //   fn par_chunks(&self, chunk_size: usize) -> Chunks<'_, T>
    //     parallel iterator yielding immutable &[T] references
    //   fn par_chunks_mut(&mut self, chunk_size: usize) -> ChunksMut<'_, T>
    //     parallel iterator yielding mutable &mut [T] references
    let data = vec![1, 2, 3, 4];
    // A shared borrow is enough for par_chunks.
    let _sum = data
        .par_chunks(2)
        .map(|c| c.iter().sum::<u32>())
        .sum::<u32>();
    // par_chunks_mut demands an exclusive borrow.
    let mut data_mut = data.clone();
    data_mut.par_chunks_mut(2).for_each(|c| {
        c.iter_mut().for_each(|x| *x += 1);
    });
}
par_chunks takes &self while par_chunks_mut takes &mut self.
Calculating Chunk Statistics
use rayon::prelude::*;
fn chunk_statistics() {
    let samples = vec![1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0];
    // Per-chunk arithmetic mean, computed read-only and gathered in order.
    let means: Vec<f64> = samples
        .par_chunks(2)
        .map(|window| window.iter().sum::<f64>() / window.len() as f64)
        .collect();
    // Mean of each pair of two.
    assert_eq!(means, vec![1.5, 3.5, 5.5, 7.5]);
}
Use par_chunks for read-only operations like computing statistics.
Parallel In-Place Modification
use rayon::prelude::*;
fn in_place_modification() {
    let mut data = vec![10, 20, 30, 40, 50, 60];
    // Double everything, two elements per parallel task.
    data.par_chunks_mut(2)
        .for_each(|chunk| chunk.iter_mut().for_each(|elem| *elem *= 2));
    assert_eq!(data, vec![20, 40, 60, 80, 100, 120]);
    // Same idea with an explicit loop and a different chunk width.
    let mut data2 = vec![1, 2, 3, 4, 5, 6];
    data2.par_chunks_mut(3).for_each(|chunk| {
        for x in chunk.iter_mut() {
            *x += 10;
        }
    });
    assert_eq!(data2, vec![11, 12, 13, 14, 15, 16]);
}
Use par_chunks_mut for in-place transformations.
Chunk Size Considerations
use rayon::prelude::*;
fn chunk_sizes() {
    let data: Vec<i32> = (1..=10).collect();
    // 10 elements / size 5: every chunk is full.
    data.par_chunks(5).for_each(|c| assert_eq!(c.len(), 5));
    // 10 elements / size 3: three full chunks plus a one-element tail.
    let pieces: Vec<&[i32]> = data.par_chunks(3).collect();
    assert_eq!(pieces.len(), 4);
    assert_eq!(pieces[0].len(), 3); // [1, 2, 3]
    assert_eq!(pieces[1].len(), 3); // [4, 5, 6]
    assert_eq!(pieces[2].len(), 3); // [7, 8, 9]
    assert_eq!(pieces[3].len(), 1); // [10]
    // A requested size exceeding the data yields one chunk with everything.
    let pieces: Vec<&[i32]> = data.par_chunks(20).collect();
    assert_eq!(pieces.len(), 1);
    assert_eq!(pieces[0].len(), 10);
}
Chunks are non-overlapping; the last chunk may be smaller.
Parallel Normalization
use rayon::prelude::*;
fn parallel_normalization() {
    let mut data = vec![100.0, 200.0, 300.0, 400.0, 500.0, 600.0];
    // Find the maximum with a parallel reduce. Note: ParallelIterator::max
    // requires Ord, which f64 does not implement, so we reduce with f64::max
    // instead; NEG_INFINITY is the identity element for max. reduce returns
    // the value directly (not an Option), since the identity covers the
    // empty-input case.
    let max_val = data
        .par_iter()
        .cloned()
        .reduce(|| f64::NEG_INFINITY, f64::max);
    // Guard against dividing by zero (all-zero or empty input).
    if max_val.is_finite() && max_val != 0.0 {
        // Normalize in parallel, two elements per chunk.
        data.par_chunks_mut(2).for_each(|chunk| {
            for val in chunk.iter_mut() {
                *val /= max_val;
            }
        });
    }
    // 100/600 == 1/6 exactly in IEEE 754 (same correctly-rounded quotient).
    assert_eq!(data, vec![1.0/6.0, 2.0/6.0, 3.0/6.0, 4.0/6.0, 5.0/6.0, 1.0]);
}
par_chunks_mut enables parallel in-place normalization.
Processing Rows of a Matrix
use rayon::prelude::*;
fn matrix_processing() {
    // Flat row-major storage: 4 rows x 3 columns.
    let mut matrix = vec![
        1, 2, 3, // row 0
        4, 5, 6, // row 1
        7, 8, 9, // row 2
        10, 11, 12, // row 3
    ];
    let cols = 3;
    let _rows = matrix.len() / cols;
    // Read-only pass: one sum per row, rows processed in parallel.
    let row_sums: Vec<i32> = matrix
        .par_chunks(cols)
        .map(|row| row.iter().sum())
        .collect();
    assert_eq!(row_sums, vec![6, 15, 24, 33]);
    // Mutable pass: scale every entry by 10 / row_sum (integer math).
    matrix.par_chunks_mut(cols).for_each(|row| {
        let sum: i32 = row.iter().sum();
        row.iter_mut().for_each(|val| *val = *val * 10 / sum);
    });
    // Each row normalized proportionally
}
Matrix rows can be processed in parallel using chunk-based operations.
Guaranteed Non-Overlapping
use rayon::prelude::*;
use std::sync::atomic::{AtomicUsize, Ordering};
fn non_overlapping_guarantee() {
    let mut data = vec![0; 100];
    // Counter shared across worker threads; previously declared but unused.
    // Now it actually tracks writes, demonstrating lock-free shared counting
    // alongside lock-free chunk mutation.
    let modifications = AtomicUsize::new(0);
    // Chunks are guaranteed non-overlapping, so each thread gets exclusive
    // access to its chunk and no locks are needed for the slice data.
    data.par_chunks_mut(10).for_each(|chunk| {
        for (i, val) in chunk.iter_mut().enumerate() {
            *val = i;
            // Relaxed is sufficient: we only need the final total, not ordering.
            modifications.fetch_add(1, Ordering::Relaxed);
        }
    });
    // Every one of the 100 elements was written exactly once.
    assert_eq!(modifications.load(Ordering::Relaxed), 100);
    // Verify: each chunk of 10 holds 0..10, so no chunk overwrote another.
    for chunk in data.chunks(10) {
        for (i, &val) in chunk.iter().enumerate() {
            assert_eq!(val, i);
        }
    }
}
Rayon guarantees chunks are disjoint, enabling safe parallel mutation.
When to Use par_chunks
use rayon::prelude::*;
fn when_to_use_par_chunks() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    // Reach for par_chunks when chunks are only read: deriving values,
    // collecting results, or checking properties while the original data
    // stays untouched.
    // Per-chunk maximum:
    let _maxes: Vec<i32> = data
        .par_chunks(2)
        .map(|chunk| *chunk.iter().max().unwrap())
        .collect();
    // Property check across chunks:
    let _all_positive = data
        .par_chunks(4)
        .all(|chunk| chunk.iter().all(|&x| x > 0));
    // Keep only the chunks containing an even number:
    let _has_even: Vec<&[i32]> = data
        .par_chunks(2)
        .filter(|chunk| chunk.iter().any(|&x| x % 2 == 0))
        .collect();
}
Use par_chunks for any read-only parallel chunk operation.
When to Use par_chunks_mut
use rayon::prelude::*;
fn when_to_use_par_chunks_mut() {
    let mut data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    // Reach for par_chunks_mut for in-place work: transforming elements,
    // applying mutating operations, or filling buffers — no allocation
    // for results is needed.
    // Scale each chunk in place:
    data.par_chunks_mut(2).for_each(|chunk| {
        chunk.iter_mut().for_each(|val| *val *= 2);
    });
    // Fill each chunk of 4 with its chunk index + 1:
    let mut buffer = vec![0; 16];
    buffer
        .par_chunks_mut(4)
        .enumerate()
        .for_each(|(idx, chunk)| {
            for slot in chunk.iter_mut() {
                *slot = idx + 1;
            }
        });
}
Use par_chunks_mut when you need to modify data in place.
Combining with Reduce Operations
use rayon::prelude::*;
fn chunk_with_reduce() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    // Per-chunk sums reduced into a grand total — equal to summing all
    // elements directly.
    let total = data
        .par_chunks(3)
        .map(|chunk| chunk.iter().sum::<i32>())
        .sum::<i32>();
    assert_eq!(total, 55);
    // Mutate first (double every element), then reduce the result.
    let mut doubled = data.clone();
    doubled.par_chunks_mut(5).for_each(|chunk| {
        chunk.iter_mut().for_each(|val| *val *= 2);
    });
    let total_doubled: i32 = doubled.par_iter().sum();
    assert_eq!(total_doubled, 110);
}
Both methods can be combined with parallel reduction operations.
Parallel Sorting Within Chunks
use rayon::prelude::*;
fn sort_within_chunks() {
    let mut data = vec![5, 3, 8, 1, 9, 2, 7, 4, 6, 0];
    // In-place: each half is sorted independently via &mut access.
    data.par_chunks_mut(5).for_each(|half| half.sort());
    // Chunks are ordered internally; the slice as a whole is not.
    assert_eq!(&data[0..5], &[1, 3, 5, 8, 9]);
    assert_eq!(&data[5..10], &[0, 2, 4, 6, 7]);
    // Immutable alternative: clone each chunk and sort the copy —
    // costs an allocation per chunk, unlike the in-place version.
    let data2 = vec![5, 3, 8, 1, 9, 2, 7, 4, 6, 0];
    let _sorted_chunks: Vec<Vec<i32>> = data2
        .par_chunks(5)
        .map(|chunk| {
            let mut copy = chunk.to_vec();
            copy.sort();
            copy
        })
        .collect();
}
par_chunks_mut enables in-place sorting within each chunk.
Performance Characteristics
use rayon::prelude::*;
fn performance_characteristics() {
    let _data: Vec<i32> = (0..100_000).collect();
    // par_chunks: shared, read-only access — the slice can be borrowed
    //   freely with no synchronization; good for computing derived values.
    // par_chunks_mut: exclusive access per chunk — disjoint chunks mean
    //   no locks are needed; good for in-place transformations.
    // Picking a chunk size:
    //   1. enough work per chunk to amortize scheduling overhead,
    //   2. enough chunks to keep every core busy (load balance),
    //   3. cache-friendly memory access patterns.
    // Too small -> overhead dominates; too large -> cores sit idle.
    // Rule of thumb: per-chunk work should dwarf the cost of handing
    // that chunk to the thread pool.
}
Choose chunk size to balance parallelism overhead vs workload distribution.
Comparison with par_iter
use rayon::prelude::*;
fn chunks_vs_iter() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    // Element-wise: par_iter visits every value independently.
    let _doubled: Vec<i32> = data.par_iter().map(|&x| x * 2).collect();
    // Group-wise: par_chunks hands out runs of adjacent values.
    let _chunk_sums: Vec<i32> = data
        .par_chunks(2)
        .map(|c| c.iter().sum())
        .collect();
    // par_iter suits independent per-element work with no grouping.
    // par_chunks suits work that benefits from locality within groups,
    // or that must see related elements together — matrix rows, tiles,
    // fixed-size records, and the like.
}
Use par_chunks when processing groups of related elements together.
Nested Parallelism
use rayon::prelude::*;
fn nested_parallelism() {
    let mut matrix: Vec<Vec<i32>> = vec![
        vec![1, 2, 3],
        vec![4, 5, 6],
        vec![7, 8, 9],
    ];
    // Outer parallelism over rows, inner parallelism over each row's cells.
    matrix.par_iter_mut().for_each(|row| {
        row.par_iter_mut().for_each(|cell| *cell *= 2);
    });
    assert_eq!(matrix[0], vec![2, 4, 6]);
    assert_eq!(matrix[1], vec![8, 10, 12]);
    assert_eq!(matrix[2], vec![14, 16, 18]);
    // Same effect on a flattened row-major buffer via chunked rows.
    let mut flat = vec![1, 2, 3, 4, 5, 6, 7, 8, 9];
    let cols = 3;
    flat.par_chunks_mut(cols).for_each(|row| {
        for cell in row.iter_mut() {
            *cell *= 2;
        }
    });
}
Both chunk operations can be nested with other parallel operations.
Error Handling in Chunks
use rayon::prelude::*;
fn error_handling() -> Result<(), String> {
    let data = vec!["1", "2", "three", "4", "5", "six"];
    // Fail-fast: collecting into Result short-circuits on the first bad chunk.
    let parsed: Result<Vec<Vec<i32>>, String> = data
        .par_chunks(2)
        .map(|chunk| {
            chunk
                .iter()
                .map(|s| s.parse::<i32>().map_err(|_| "parse error".to_string()))
                .collect()
        })
        .collect();
    // "three" and "six" are unparseable, so the whole collect fails.
    assert!(parsed.is_err());
    // Lenient: drop any chunk in which some element fails to parse.
    let valid: Vec<Vec<i32>> = data
        .par_chunks(2)
        .filter_map(|chunk| {
            chunk
                .iter()
                .map(|s| s.parse::<i32>().ok())
                .collect::<Option<Vec<_>>>()
        })
        .collect();
    // Survivors: ["1", "2"] and ["4", "5"].
    assert_eq!(valid.len(), 2);
    Ok(())
}
Chunk operations can use try_fold and try_for_each for fallible operations.
Realistic Example: Image Processing
use rayon::prelude::*;
// Simulated image: grayscale pixels
// Simulated image: grayscale pixels stored row-major in a flat buffer.
fn process_image() {
    let width = 800;
    let height = 600;
    let mut pixels: Vec<u8> = vec![128; width * height];
    // Process each row in parallel; each task owns one row of `width` pixels.
    pixels.par_chunks_mut(width).for_each(|row| {
        // Hoist the last index BEFORE iter_mut(): calling row.len() inside
        // the loop would be a shared borrow of `row` while the iterator
        // holds a mutable borrow — a compile error.
        let last = row.len() - 1;
        // Simulated processing: brighten the edge pixel on each side.
        for (i, pixel) in row.iter_mut().enumerate() {
            let brightness = if i == 0 || i == last {
                255
            } else {
                *pixel
            };
            *pixel = brightness;
        }
    });
    // First and last pixel of each row are now 255
}
par_chunks_mut is natural for row-based image processing.
Comparison Summary
use rayon::prelude::*;
fn comparison_table() {
    // | Aspect       | par_chunks            | par_chunks_mut           |
    // |--------------|-----------------------|--------------------------|
    // | Access       | &[T] (immutable)      | &mut [T] (mutable)       |
    // | Self         | &self                 | &mut self                |
    // | Modification | No                    | Yes                      |
    // | Use case     | Read-only processing  | In-place transformation  |
    // | Safety       | Always safe           | Safe (disjoint chunks)   |
    // | Allocation   | May allocate results  | No allocation needed     |
    let data = vec![1, 2, 3, 4, 5, 6];
    // Read side: derive a sum per pair.
    let _sums: Vec<i32> = data.par_chunks(2).map(|c| c.iter().sum()).collect();
    // Write side: double each element where it sits.
    let mut data_mut = data;
    data_mut
        .par_chunks_mut(2)
        .for_each(|c| c.iter_mut().for_each(|x| *x *= 2));
}
Synthesis
Quick reference:
use rayon::prelude::*;
fn quick_reference() {
    let data = vec![1, 2, 3, 4, 5, 6, 7, 8];
    // Read-only: par_chunks takes &self and yields &[T].
    let _chunk_sums: Vec<i32> = data
        .par_chunks(2)
        .map(|chunk| chunk.iter().sum())
        .collect();
    // Mutable: par_chunks_mut takes &mut self and yields &mut [T].
    let mut data_mut = data.clone();
    data_mut.par_chunks_mut(2).for_each(|chunk| {
        chunk.iter_mut().for_each(|val| *val *= 2);
    });
    // The distinction is mutability: par_chunks cannot touch the original,
    // par_chunks_mut modifies it in place. Both run in parallel over
    // guaranteed non-overlapping chunks, and both leave the last chunk
    // short when the size doesn't divide evenly.
}
Key insight: The fundamental distinction is mutability, which determines the operation type and signature. par_chunks yields immutable &[T] references, making it suitable for any read-only operation—computing statistics, filtering, transforming into new collections, or checking properties. par_chunks_mut yields mutable &mut [T] references, enabling in-place modification of the underlying slice. Rayon guarantees that chunks are non-overlapping, so parallel mutation is safe without locks; each thread has exclusive access to its assigned chunk. This makes par_chunks_mut ideal for in-place transformations like scaling values, sorting within chunks, or filling buffers. The choice between them mirrors the choice between iter() and iter_mut(): use par_chunks when you only need to read, use par_chunks_mut when you need to write. Both methods handle uneven divisions gracefully—the last chunk may be smaller than the specified size—and both integrate seamlessly with Rayon's parallel iterator combinators like map, for_each, filter, and reduce.
