What is the purpose of criterion::Bencher::iter_batched for setup/teardown in benchmark iterations?

iter_batched lets you run setup code that produces a fresh input for each benchmark iteration without including the setup time in the measurements; values returned from the routine are collected and dropped after the batch, outside the timed region, which serves as implicit teardown. This enables accurate benchmarking of operations that require fresh initial state or produce results that need cleanup. The standard iter method times all code in its closure, so iter_batched is essential when setup would distort the results or when cleanup must be kept out of the measurement.

The Problem with Setup in Regular iter

use criterion::{Criterion, black_box};
 
fn setup_problem(c: &mut Criterion) {
    c.bench_function("with_setup_included", |b| {
        b.iter(|| {
            // PROBLEM: This setup is included in timing!
            let mut data = vec![0u64; 1000];
            
            // Only want to measure this
            data.sort();
            black_box(data)
        });
    });
}

Using iter, all code including setup is timed, distorting benchmark results.
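The distortion can be seen without criterion at all. As a rough sketch (loop counts chosen arbitrarily) using only std::time::Instant, the batched approach creates all inputs before starting the clock, which is the idea iter_batched automates:

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

// Time `iterations` sort calls with setup inside the timed region.
fn time_naive(iterations: usize) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        let mut data = vec![0u64; 1000]; // setup: timed
        data.sort();                     // routine: timed
        black_box(&data);
    }
    start.elapsed()
}

// Time the same sorts, but create every input before starting the clock,
// the way iter_batched keeps setup out of the measurement.
fn time_batched(iterations: usize) -> Duration {
    let inputs: Vec<Vec<u64>> = (0..iterations).map(|_| vec![0u64; 1000]).collect();
    let start = Instant::now();
    let mut outputs = Vec::with_capacity(iterations);
    for mut data in inputs {
        data.sort();                   // routine: timed
        outputs.push(black_box(data)); // keep the output so its drop isn't timed
    }
    let elapsed = start.elapsed();
    drop(outputs);                     // "teardown": after the clock stops
    elapsed
}

fn main() {
    println!("naive:   {:?}", time_naive(1000));
    println!("batched: {:?}", time_batched(1000));
}
```

The naive total includes every allocation; the batched total covers only the sorts.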

Basic iter_batched Usage

use criterion::{Criterion, black_box};
 
fn basic_iter_batched(c: &mut Criterion) {
    c.bench_function("sort_batched", |b| {
        b.iter_batched(
            // Setup: NOT timed
            || vec![0u64; 1000],
            
            // Routine: IS timed
            |mut data| {
                data.sort();
                black_box(data)
            },
            
            // Batch size: how many fresh inputs are created
            // and held per batch
            criterion::BatchSize::SmallInput,
        );
    });
}

iter_batched separates setup from the timed routine, producing accurate measurements.

Understanding Batch Sizes

use criterion::{Criterion, BatchSize, black_box};
 
fn batch_sizes(c: &mut Criterion) {
    // BatchSize::SmallInput: many inputs per batch
    // Good for small inputs that are cheap to hold in memory (the common default)
    c.bench_function("small_batch", |b| {
        b.iter_batched(
            || vec![0u64; 10],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::SmallInput,
        );
    });
    
    // BatchSize::LargeInput: fewer inputs per batch
    // Good for large inputs, to limit how many are held in memory at once
    c.bench_function("large_batch", |b| {
        b.iter_batched(
            || vec![0u64; 1_000_000],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::LargeInput,
        );
    });
    
    // BatchSize::PerIteration: one input per batch, created just
    // before each iteration; highest timing overhead, use only when
    // inputs are too expensive to hold more than one at a time
    c.bench_function("per_iteration", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::PerIteration,
        );
    });
    
    // Custom batch size: Specify iterations per batch
    c.bench_function("custom_batch", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::NumIterations(10),  // 10 inputs per batch
        );
    });
}

BatchSize controls how many fresh inputs are created and held per batch; every iteration still receives its own input.

Teardown with iter_batched_ref

use criterion::{Criterion, BatchSize, black_box};
 
fn teardown_example(c: &mut Criterion) {
    c.bench_function("with_teardown", |b| {
        b.iter_batched_ref(
            // Setup
            || Vec::with_capacity(1000),
            
            // Routine (takes &mut reference)
            |data| {
                data.extend(0..1000);
                data.sort();
                black_box(data.len())
            },
            
            BatchSize::SmallInput,
        );
        // the input Vec is dropped after each batch, outside the timed region
    });
    
    // Explicit teardown using iter_batched and returning
    c.bench_function("explicit_teardown", |b| {
        b.iter_batched(
            // Setup
            || std::fs::File::open("/dev/null").unwrap(),  // Unix-only path, for illustration
            
            // Routine
            |file| {
                // Use file
                black_box(&file);
                file  // Return for teardown
            },
            
            BatchSize::SmallInput,
        );
        // File is dropped (closed) after each batch
    });
}

Values returned from the routine are collected and dropped after the batch, outside the timed region—returning a resource from the routine is how you keep its cleanup out of the measurement.
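The drop timing can be made visible with a drop counter. This sketch mimics one batch: the routine returns each resource, nothing is dropped while the batch runs, and all outputs are dropped together afterwards:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Count drops so we can observe when "teardown" actually happens.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Mimic one iter_batched batch: the routine returns each resource, and
// all outputs are dropped together after the batch.
// Returns (drops during the batch, drops after the batch), relative to
// the counter's value on entry.
fn run_batch(n: usize) -> (usize, usize) {
    let before = DROPS.load(Ordering::SeqCst);
    let outputs: Vec<Resource> = (0..n)
        .map(|_| Resource) // setup
        .map(|r| r)        // routine: uses and returns the resource
        .collect();        // outputs accumulate during the "batch"
    let during = DROPS.load(Ordering::SeqCst) - before;
    drop(outputs);         // implicit teardown, outside the timed region
    let after = DROPS.load(Ordering::SeqCst) - before;
    (during, after)
}

fn main() {
    let (during, after) = run_batch(3);
    println!("dropped during batch: {during}, after batch: {after}");
}
```

No resource is dropped mid-batch; all three drops happen in the teardown step.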

iter_batched vs iter_batched_ref

use criterion::{Criterion, BatchSize, black_box};
 
fn batched_vs_batched_ref(c: &mut Criterion) {
    // iter_batched: Routine takes ownership
    c.bench_function("owned", |b| {
        b.iter_batched(
            || vec![1, 2, 3],
            |mut v| {
                v.push(4);
                black_box(v)  // Own and return
            },
            BatchSize::SmallInput,
        );
    });
    
    // iter_batched_ref: Routine takes &mut reference
    c.bench_function("ref", |b| {
        b.iter_batched_ref(
            || vec![1, 2, 3],
            |v| {
                v.push(4);
                black_box(v)
            },
            BatchSize::SmallInput,
        );
    });
    
    // Use iter_batched_ref when:
    // - You don't need ownership in the routine
    // - Setup produces expensive-to-clone values
    // - You don't want the input's drop included in the timing
    //   (iter_batched_ref drops inputs after the batch)
}

iter_batched_ref passes the routine a mutable reference instead of ownership, and the input is dropped outside the timed region.

Fresh State for Each Iteration

use criterion::{Criterion, BatchSize, black_box};
use rand::seq::SliceRandom;  // provides Vec::shuffle
 
fn fresh_state(c: &mut Criterion) {
    // Problem: In-place operations modify state
    // If we reuse state, it's not representative
    
    c.bench_function("sort_fresh", |b| {
        b.iter_batched(
            // Fresh shuffled data each time
            || {
                let mut v: Vec<u64> = (0..1000).collect();
                v.shuffle(&mut rand::thread_rng());
                v
            },
            // Sort in place
            |mut v| {
                v.sort();
                black_box(v)
            },
            BatchSize::SmallInput,
        );
    });
    
    // Without iter_batched, we'd measure shuffle + sort
    // Or we'd sort already-sorted data (not realistic)
}

iter_batched ensures each iteration starts with fresh, realistic state.
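The problem with reusing state is easy to demonstrate with the standard library alone: after the first in-place sort, every later "iteration" on the same buffer would start from already-sorted data and hit the fast path:

```rust
// Show why reusing the same buffer across iterations is misleading.
fn is_sorted(v: &[u64]) -> bool {
    v.windows(2).all(|w| w[0] <= w[1])
}

fn main() {
    let mut data: Vec<u64> = (0..1000).rev().collect(); // fresh, unsorted input
    assert!(!is_sorted(&data));
    data.sort();           // first iteration does real work
    assert!(is_sorted(&data));
    // A second sort on the same buffer starts from sorted input, which
    // is not representative of the original workload.
    data.sort();
    assert!(is_sorted(&data));
}
```

With iter_batched, the setup closure rebuilds the unsorted input before every iteration, so each sort does the full amount of work.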

Benchmarking Allocation Patterns

use criterion::{Criterion, BatchSize, black_box};
 
fn allocation_benchmarks(c: &mut Criterion) {
    // Measure vector filling, not allocation
    c.bench_function("fill_vector", |b| {
        b.iter_batched(
            // Pre-allocate
            || Vec::with_capacity(1000),
            // Fill
            |mut v| {
                for i in 0..1000 {
                    v.push(i);
                }
                black_box(v)
            },
            BatchSize::SmallInput,
        );
    });
    
    // Measure only the push operations, not allocation
    // Allocation is in setup, not timed
    
    // Compare with measuring allocation:
    c.bench_function("fill_with_allocation", |b| {
        b.iter(|| {
            let mut v = Vec::with_capacity(1000);
            for i in 0..1000 {
                v.push(i);
            }
            black_box(v)
        });
    });
    // This includes allocation in timing
}

Use iter_batched to isolate operations from allocation overhead.
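The allocation/fill split above relies on Vec::with_capacity reserving enough space that the pushes never reallocate; a small standalone check confirms this:

```rust
// Separate allocation ("setup") from filling ("routine"): with enough
// capacity reserved up front, the pushes never reallocate.
fn fill(v: &mut Vec<u64>, n: u64) {
    for i in 0..n {
        v.push(i);
    }
}

fn main() {
    let mut v: Vec<u64> = Vec::with_capacity(1000); // setup: the only allocation
    let cap_before = v.capacity();
    fill(&mut v, 1000);                             // routine: pushes only
    assert_eq!(v.len(), 1000);
    assert_eq!(v.capacity(), cap_before);           // no reallocation occurred
}
```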

Benchmarking I/O Operations

use criterion::{Criterion, BatchSize, black_box};
use std::io::{Cursor, Read};
 
fn io_benchmarks(c: &mut Criterion) {
    // Setup creates fresh cursor, routine reads
    c.bench_function("read_from_cursor", |b| {
        b.iter_batched(
            // Fresh cursor each time
            || Cursor::new(vec![0u8; 1024]),
            // Read operation
            |mut cursor| {
                let mut buf = [0u8; 64];
                cursor.read_exact(&mut buf).unwrap();
                black_box(buf)
            },
            BatchSize::SmallInput,
        );
    });
    
    // File operations
    c.bench_function("file_operations", |b| {
        b.iter_batched(
            // Create temp file
            || {
                let path = std::env::temp_dir().join("bench_test");
                std::fs::write(&path, vec![0u8; 1024]).unwrap();
                std::fs::File::open(&path).unwrap()
            },
            // Seek and read
            |mut file| {
                use std::io::{Seek, SeekFrom};
                file.seek(SeekFrom::Start(0)).unwrap();
                let mut buf = [0u8; 64];
                std::io::Read::read_exact(&mut file, &mut buf).unwrap();
                black_box(buf)
            },
            BatchSize::SmallInput,
        );
        // File closed (dropped) automatically
    });
}

iter_batched handles resource creation and cleanup around I/O benchmarks.

Benchmarking Collection Operations

use criterion::{Criterion, BatchSize, black_box};
use std::collections::HashMap;
 
fn collection_benchmarks(c: &mut Criterion) {
    // HashMap insertion: measure insertion, not HashMap creation
    c.bench_function("hashmap_insert", |b| {
        b.iter_batched(
            // Fresh empty HashMap
            || HashMap::with_capacity(100),
            // Insert 100 items
            |mut map| {
                for i in 0..100 {
                    map.insert(i, i * 2);
                }
                black_box(map)
            },
            BatchSize::SmallInput,
        );
    });
    
    // HashMap lookup: populate in setup, measure lookups
    c.bench_function("hashmap_lookup", |b| {
        b.iter_batched_ref(
            // Pre-populated HashMap
            || {
                let mut map = HashMap::with_capacity(100);
                for i in 0..100 {
                    map.insert(i, i * 2);
                }
                map
            },
            // Lookup items
            |map| {
                for i in 0..100 {
                    black_box(map.get(&i));
                }
            },
            BatchSize::SmallInput,
        );
    });
}

Different benchmarks need different initial states; iter_batched provides them.
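The setup/routine split for the lookup case can be checked outside criterion. This sketch builds the same pre-populated map and runs the read-only lookup routine, verifying the expected sum:

```rust
use std::collections::HashMap;

// Build the pre-populated map the lookup benchmark's setup would create.
fn populated_map(n: i64) -> HashMap<i64, i64> {
    let mut map = HashMap::with_capacity(n as usize);
    for i in 0..n {
        map.insert(i, i * 2);
    }
    map
}

// The lookup "routine": reads only, leaves the map untouched, so it is
// safe to run repeatedly against the same borrowed input.
fn lookup_sum(map: &HashMap<i64, i64>, n: i64) -> i64 {
    (0..n).filter_map(|i| map.get(&i)).sum()
}

fn main() {
    let map = populated_map(100);
    // Sum of 2*i for i in 0..100 is 2 * 4950 = 9900.
    assert_eq!(lookup_sum(&map, 100), 9900);
}
```

Because the routine only borrows the map, iter_batched_ref is the right fit: the expensive population stays in setup and the map's drop stays out of the timing.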

Avoiding Compiler Optimization

use criterion::{Criterion, BatchSize, black_box};
 
fn optimization_avoidance(c: &mut Criterion) {
    // black_box prevents compiler from optimizing away results
    // But setup might still be optimized if not connected
    
    c.bench_function("proper_black_box", |b| {
        b.iter_batched(
            || {
                // Setup with black_box to ensure it's not optimized out
                black_box(vec![1u64, 2, 3, 4, 5])
            },
            |data| {
                let result = data.into_iter().sum::<u64>();
                black_box(result)
            },
            BatchSize::SmallInput,
        );
    });
    
    // The setup value is passed to routine
    // Compiler can't eliminate it
}

Pass setup through to routine with black_box to prevent optimization.
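The same pattern works standalone with std::hint::black_box (stable since Rust 1.66; criterion's black_box behaves the same way). It is an identity function the optimizer treats as opaque, so neither the input nor the result can be folded away:

```rust
use std::hint::black_box;

// black_box returns its argument unchanged but is opaque to the
// optimizer: the input can't be constant-folded and the result
// can't be discarded as unused.
fn sum_opaque() -> u64 {
    let data = black_box(vec![1u64, 2, 3, 4, 5]); // setup the compiler can't predict
    let result: u64 = data.into_iter().sum();
    black_box(result)                             // result the compiler can't discard
}

fn main() {
    assert_eq!(sum_opaque(), 15);
}
```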

Timing Only the Interesting Part

use criterion::{Criterion, BatchSize, black_box};
 
fn timing_specific_operations(c: &mut Criterion) {
    // Example: measure only the algorithm, not parsing
    c.bench_function("algorithm_only", |b| {
        b.iter_batched(
            // Parse data in setup
            || {
                let json = r#"{"values": [1, 2, 3, 4, 5]}"#;
                serde_json::from_str::<serde_json::Value>(json).unwrap()
            },
            // Time only processing
            |value| {
                if let Some(arr) = value.get("values").and_then(|v| v.as_array()) {
                    let sum: i64 = arr.iter().filter_map(|v| v.as_i64()).sum();
                    black_box(sum)
                } else {
                    black_box(0)
                }
            },
            BatchSize::SmallInput,
        );
    });
}

iter_batched isolates the algorithm from parsing or setup overhead.

Large Setup with LargeInput

use criterion::{Criterion, BatchSize, black_box};
 
fn large_setup(c: &mut Criterion) {
    // Large setup: don't run setup every iteration
    c.bench_function("large_data", |b| {
        b.iter_batched_ref(
            // Expensive setup
            || {
                // Large allocation
                let mut v = Vec::with_capacity(1_000_000);
                for i in 0..1_000_000 {
                    v.push(i);
                }
                v
            },
            // Fast operation; taken by reference so the large Vec's
            // drop is never part of the timing
            |v| {
                black_box(v.binary_search(&500_000))
            },
            // LargeInput: fewer inputs held in memory per batch
            BatchSize::LargeInput,
        );
    });
    
    // LargeInput tells criterion to:
    // 1. Create fewer inputs per batch, limiting peak memory use
    // 2. Still hand every iteration its own fresh input
    // 3. Keep setup (and input drops) out of the timed region
}

BatchSize::LargeInput limits how many expensive inputs are held in memory at once.

Comparing Setup Overhead

use criterion::{Criterion, black_box};
 
fn setup_comparison(c: &mut Criterion) {
    let mut group = c.benchmark_group("setup_comparison");
    
    // Without batched: setup included
    group.bench_function("setup_included", |b| {
        b.iter(|| {
            let mut data: Vec<u64> = (0..1000).collect();
            data.sort();
            black_box(data)
        });
    });
    
    // With batched: setup excluded
    group.bench_function("setup_excluded", |b| {
        b.iter_batched(
            || (0..1000).collect::<Vec<u64>>(),
            |mut data| { data.sort(); black_box(data) },
            criterion::BatchSize::SmallInput,
        );
    });
    
    // The results will show:
    // - setup_included: ~X ns/iter (includes Vec creation)
    // - setup_excluded: ~Y ns/iter (only sorting)
    // - Difference shows setup overhead
    
    group.finish();
}

Compare results to quantify setup overhead impact.

Custom Batch Sizes

use criterion::{Criterion, BatchSize, black_box};
 
fn custom_batch_sizes(c: &mut Criterion) {
    // Number of iterations per setup call
    c.bench_function("custom_10", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::NumIterations(10),
        );
    });
    
    // Inputs are created in groups of 10 before each timed batch
    // Good for: tuning memory use against timing overhead
    
    // Or use NumBatches for a fixed number of setup calls
    c.bench_function("custom_5_batches", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |mut v| { v.sort(); black_box(v) },
            BatchSize::NumBatches(5),
        );
    });
}

Custom batch sizes give you explicit control over how inputs are grouped into batches.
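How a fixed batch size partitions the work can be sketched by hand (counts arbitrary): with 10 inputs per batch, 100 iterations take 10 batches, and the routine still runs exactly once per input:

```rust
// Manual sketch of NumIterations(n)-style batching: inputs are created
// in groups ("batches") before timing, then the routine runs once per input.
// Returns (number of batches, total routine calls).
fn run_batched(total_iters: usize, per_batch: usize) -> (usize, usize) {
    let mut batches = 0;
    let mut iterations = 0;
    let mut remaining = total_iters;
    while remaining > 0 {
        let n = remaining.min(per_batch);
        // Setup: create all inputs for this batch up front
        // (criterion does this outside the timed region).
        let inputs: Vec<Vec<u64>> = (0..n).map(|_| vec![0u64; 8]).collect();
        batches += 1;
        // Routine: runs once per input (the part criterion times).
        for mut v in inputs {
            v.sort();
            iterations += 1;
        }
        remaining -= n;
    }
    (batches, iterations)
}

fn main() {
    // 100 iterations at 10 per batch -> 10 batches, 100 routine calls.
    assert_eq!(run_batched(100, 10), (10, 100));
}
```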

Benchmarking Stateful Algorithms

use criterion::{Criterion, BatchSize, black_box};
use rand::Rng;  // provides rng.gen()
 
fn stateful_algorithms(c: &mut Criterion) {
    // Algorithms that modify state need fresh state each time
    c.bench_function("in_place_algorithm", |b| {
        b.iter_batched(
            // Fresh state each time
            || {
                let mut rng = rand::thread_rng();
                (0..1000).map(|_| rng.gen::<u64>()).collect::<Vec<_>>()
            },
            // Algorithm modifies in place
            |mut data| {
                // In-place sort
                data.sort_unstable();
                // In-place dedup
                data.dedup();
                black_box(data)
            },
            BatchSize::SmallInput,
        );
    });
    
    // If we didn't use batched:
    // - First iteration: sort random data (slow)
    // - Next iterations: sort already sorted data (fast)
    // - Results would be misleading
}

iter_batched ensures consistent initial state for stateful algorithms.
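The misleading-reuse scenario in the comments can be verified directly: running the sort-and-dedup routine on its own output is a no-op, which is exactly the unrepresentative fast path a benchmark with shared state would measure:

```rust
// The stateful routine from above: sort in place, then dedup.
fn process(mut data: Vec<u64>) -> Vec<u64> {
    data.sort_unstable();
    data.dedup();
    data
}

fn main() {
    let out = process(vec![3, 1, 2, 3, 1]);
    assert_eq!(out, vec![1, 2, 3]);
    // Running the routine again on its own output does no real work:
    // the data is already sorted and deduplicated.
    assert_eq!(process(out.clone()), out);
}
```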

Synthesis

Comparison table:

Method            Setup        Teardown                            Use Case
iter              None         None                                Simple operations, no per-iteration state
iter_batched      Fresh input  Outputs dropped after batch         Routine takes ownership of its input
iter_batched_ref  Fresh input  Inputs and outputs dropped untimed  Routine borrows the input

BatchSize guidelines:

BatchSize         When to Use
SmallInput        Inputs are cheap to hold; many per batch (the common default)
LargeInput        Inputs are large; fewer per batch to limit memory
PerIteration      Only one input can exist at a time; highest timing overhead
NumIterations(n)  Explicit control: n inputs per batch
NumBatches(n)     Explicit control: total iterations split into n batches

When to use iter_batched:

// Fresh state for stateful operations
b.iter_batched(
    || create_fresh_data(),
    |data| modify_in_place(data),
    BatchSize::SmallInput,
);
 
// Exclude setup from timing
b.iter_batched(
    || expensive_initialization(),
    |resource| measure_this_operation(resource),
    BatchSize::LargeInput,
);
 
// Resource cleanup
b.iter_batched(
    || create_file(),
    |file| read_and_process(file),
    BatchSize::SmallInput,
); // File dropped (closed) after

Key insight: iter_batched solves the fundamental benchmarking problem of separating setup from measurement. Without it, you'd either measure setup time (distorting results) or reuse state across iterations (giving unrealistic measurements). The BatchSize parameter controls how many fresh inputs are created and held per batch, trading memory use against per-iteration timing overhead. For operations that modify state—sorting, in-place algorithms, stateful processing—iter_batched ensures each measurement starts from a consistent, representative state. For expensive inputs—large allocations, file handles, network connections—BatchSize::LargeInput limits how many are alive at once. The pattern is: setup creates what you need, the routine does what you measure, and teardown (the untimed drop of inputs and outputs after the batch) cleans up.