What is the purpose of criterion::BatchSize for controlling iteration counts in benchmarks with expensive setup?

BatchSize works together with Criterion's iter_batched family of methods, which keep expensive setup (and teardown) code outside the timed region so that setup overhead does not contaminate the measured time. Within that scheme, BatchSize controls how many inputs are generated per batch before the timer starts, trading measurement overhead against memory usage. Without this separation, benchmarks with expensive setup would measure mostly setup time rather than the actual operation, producing misleading results.

The Problem: Setup Overhead in Benchmarks

use criterion::{Criterion, black_box};
 
fn problem_demonstration(c: &mut Criterion) {
    // Problem: Expensive setup before each measurement
    c.bench_function("expensive_setup", |b| {
        b.iter(|| {
            // Setup is INSIDE the measurement!
            let data: Vec<u64> = (0..1_000_000).collect();  // Expensive!
            
            // The actual operation we want to measure
            let sum: u64 = data.iter().sum();
            black_box(sum)
        });
    });
    
    // This benchmark measures: setup + operation
    // Not just the operation itself
    // Results are misleading!
}

When setup code is inside iter(), it gets measured as part of the benchmark, contaminating results.

Criterion's Default Batching Behavior

use criterion::{Criterion, BatchSize, black_box};
 
fn default_behavior(c: &mut Criterion) {
    // By default, Criterion:
    // 1. Runs the closure many times to estimate time
    // 2. Decides how many iterations per sample
    // 3. Takes multiple samples for statistical analysis
    
    // For fast operations, many iterations per sample
    // For slow operations, fewer iterations per sample
    
    // BUT: Criterion doesn't know about setup overhead
    // It might run 100 iterations, each with expensive setup
    // The measured time = 100 * (setup_time + operation_time)
}

Criterion auto-scales iteration counts, but doesn't distinguish setup from operation time.

BatchSize::SmallInput: The Recommended Default

use criterion::{Criterion, BatchSize, black_box};
 
fn small_batch(c: &mut Criterion) {
    c.bench_function("with_batching", |b| {
        // iter_batched: setup runs OUTSIDE the timed region
        b.iter_batched(
            // Setup function: called once per iteration, never timed
            || (0..10_000).collect::<Vec<u64>>(),
            // Routine: the only code that is measured
            |data| {
                let sum: u64 = data.iter().sum();
                black_box(sum)
            },
            BatchSize::SmallInput,  // Inputs are small; batch many per timing run
        );
    });
    
    // BatchSize::SmallInput means:
    // - The input is small enough that many copies can sit in memory at once
    // - Criterion pre-generates a large batch of inputs, then times the
    //   routine over all of them with a single timer start/stop
    // - Every iteration consumes its own fresh input
}

iter_batched keeps setup out of the timed region; BatchSize::SmallInput, the recommended default, batches many small inputs per measurement.
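The mechanics can be sketched in plain Rust without Criterion. This toy measure_batched function (an illustrative name, not part of Criterion's API) pre-generates one input per iteration via the setup closure, then times only the routine loop:

```rust
use std::time::{Duration, Instant};

// Illustrative model of iter_batched: setup is called once per iteration
// to pre-generate inputs, but only the routine loop is timed.
fn measure_batched<I, O>(
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
    batch_size: usize,
) -> (Duration, Vec<O>) {
    // Untimed: generate one fresh input per upcoming iteration
    let inputs: Vec<I> = (0..batch_size).map(|_| setup()).collect();

    // Timed: run the routine once per input
    let start = Instant::now();
    let outputs: Vec<O> = inputs.into_iter().map(|input| routine(input)).collect();
    let elapsed = start.elapsed();
    (elapsed, outputs)
}

fn main() {
    let (elapsed, sums) = measure_batched(
        || (0..1_000u64).rev().collect::<Vec<u64>>(), // "expensive" setup, untimed
        |data| data.iter().sum::<u64>(),              // the measured routine
        8,
    );
    // Every iteration saw its own fresh input; sum of 0..1000 is 499_500
    assert!(sums.iter().all(|&s| s == 499_500));
    println!("8 iterations timed in {:?}", elapsed);
}
```

Criterion's real implementation adds warm-up, sampling, and statistics on top, but the shape is the same: the elapsed time covers only the routine calls, never the setup.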

Understanding BatchSize Variants

use criterion::{Criterion, BatchSize, black_box};
 
fn batch_size_variants(c: &mut Criterion) {
    // BatchSize::SmallInput: Large batches (the recommended default)
    // - For inputs small enough that millions can be held in memory at once
    // - One timer start/stop covers many iterations: lowest measurement overhead
    
    // BatchSize::LargeInput: Smaller batches
    // - For inputs big enough that only a limited number fit in memory
    // - Bounds how many pre-generated inputs coexist
    
    // BatchSize::PerIteration: One iteration per batch
    // - Only one input alive at a time
    // - For huge inputs, or inputs holding scarce resources (files, sockets)
    // - Highest measurement overhead
    
    // BatchSize::NumBatches(n): Exactly n batches per sample
    // BatchSize::NumIterations(n): Exactly n iterations per batch
    // - Manual control for when the heuristics don't fit
}

Each variant trades measurement overhead against the memory held by a batch of pre-generated inputs.
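As a rough mental model (the numbers below are invented for illustration; Criterion derives real batch sizes from its own timing estimates), the variants can be pictured as a mapping from hint to iterations-per-batch:

```rust
// Toy model of how a BatchSize-like hint could translate into an
// iterations-per-batch count. Purely illustrative: Criterion's actual
// heuristics are internal and based on its measured iteration counts.
#[derive(Clone, Copy)]
enum ToyBatchSize {
    SmallInput,
    LargeInput,
    PerIteration,
    NumIterations(u64),
}

fn iters_per_batch(size: ToyBatchSize, target_iters: u64) -> u64 {
    match size {
        // Small inputs: batch as many iterations as the sample needs
        ToyBatchSize::SmallInput => target_iters,
        // Large inputs: cap the batch so few inputs coexist in memory
        ToyBatchSize::LargeInput => target_iters.min(10),
        // Only one input alive at a time
        ToyBatchSize::PerIteration => 1,
        // Exact manual control
        ToyBatchSize::NumIterations(n) => n,
    }
}

fn main() {
    assert_eq!(iters_per_batch(ToyBatchSize::SmallInput, 1000), 1000);
    assert_eq!(iters_per_batch(ToyBatchSize::LargeInput, 1000), 10);
    assert_eq!(iters_per_batch(ToyBatchSize::PerIteration, 1000), 1);
    assert_eq!(iters_per_batch(ToyBatchSize::NumIterations(100), 1000), 100);
}
```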

BatchSize::PerIteration

use criterion::{Criterion, BatchSize, black_box};
 
fn per_iteration(c: &mut Criterion) {
    c.bench_function("per_iteration", |b| {
        b.iter_batched(
            || {
                // Setup still runs once per iteration and is still untimed;
                // PerIteration only means one input exists at a time
                let data: Vec<u64> = (0..1_000_000).collect();
                data
            },
            |data| {
                let sum: u64 = data.iter().sum();
                black_box(sum)
            },
            BatchSize::PerIteration,  // 1 iteration per batch
        );
    });
    
    // The timer starts and stops around every single iteration, so this mode
    // has the highest measurement overhead. Reserve it for inputs that are
    // enormous or that hold scarce resources (files, sockets, connections)
}

BatchSize::PerIteration keeps only one input alive at a time; use it for huge or resource-holding inputs, and expect noisier measurements.

BatchSize::NumIterations for Fixed Control

use criterion::{Criterion, BatchSize, black_box};
 
fn fixed_batch_size(c: &mut Criterion) {
    c.bench_function("fixed_batch", |b| {
        b.iter_batched(
            || (0..10_000).collect::<Vec<u64>>(),
            |data| {
                data.iter().sum::<u64>()
            },
            BatchSize::NumIterations(100),  // Exactly 100 iterations per batch
        );
    });
    
    // Each batch:
    // 1. Setup runs 100 times, producing 100 independent inputs (untimed)
    // 2. Timer starts; routine runs once per input; timer stops
    // 3. Batch time / 100 = average iteration time
}

BatchSize::NumIterations(n) pins the batch size exactly; prefer the heuristic variants unless you need this manual control.

When to Use Different BatchSizes

use criterion::{Criterion, BatchSize, black_box};
 
fn choosing_batch_size(c: &mut Criterion) {
    // Large input (~8 MB each): bound how many coexist in memory
    c.bench_function("large_input", |b| {
        b.iter_batched(
            || (0..1_000_000).collect::<Vec<u64>>(),  // ~8 MB per input
            |data| data.iter().sum::<u64>(),
            BatchSize::LargeInput,  // Smaller batches keep memory in check
        );
    });
    
    // Modest input (a few KB): the default heuristic is fine
    c.bench_function("modest_input", |b| {
        b.iter_batched(
            || (0..1_000).collect::<Vec<u64>>(),
            |data| data.iter().sum::<u64>(),
            BatchSize::SmallInput,
        );
    });
    
    // Tiny input: SmallInput gives the lowest measurement overhead
    c.bench_function("tiny_input", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |data| data.iter().sum::<u64>(),
            BatchSize::SmallInput,
        );
    });
}

| Input footprint           | Recommended BatchSize | Reason                                 |
|---------------------------|-----------------------|----------------------------------------|
| Small (bytes to a few KB) | SmallInput            | Lowest measurement overhead            |
| Large (MBs per input)     | LargeInput            | Bounds memory for pre-generated inputs |
| Huge, or scarce resources | PerIteration          | Only one input alive at a time         |

iter_batched vs iter_batched_ref

use criterion::{Criterion, BatchSize, black_box};
 
fn batched_vs_batched_ref(c: &mut Criterion) {
    // iter_batched: Moves the setup value into the routine
    c.bench_function("batched_move", |b| {
        b.iter_batched(
            || vec![0u64; 100],
            |data| {
                // data is moved in; if the routine drops it, the drop is timed
                black_box(data.len())
            },
            BatchSize::SmallInput,
        );
    });
    
    // iter_batched_ref: Passes a mutable reference to the routine
    c.bench_function("batched_ref", |b| {
        b.iter_batched_ref(
            || vec![0u64; 100],
            |data| {
                // data is &mut Vec<u64>
                data.iter().sum::<u64>()
            },
            BatchSize::SmallInput,
        );
    });
    
    // With iter_batched_ref the input stays owned by Criterion, so its
    // destructor runs outside the timed region. Each iteration still gets
    // its own freshly set-up input
}

iter_batched_ref hands the routine a mutable reference instead of ownership, keeping the input's drop out of the measurement.
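The ownership difference is ordinary Rust move-vs-borrow semantics, which a minimal sketch makes concrete (consume and borrow are illustrative stand-ins for the two routine signatures):

```rust
// Stand-in for an iter_batched routine: takes ownership of the input
fn consume(data: Vec<u64>) -> u64 {
    // `data` is dropped at the end of this function, i.e. inside the
    // code a benchmark would be timing
    data.iter().sum()
}

// Stand-in for an iter_batched_ref routine: borrows the input mutably
fn borrow(data: &mut Vec<u64>) -> u64 {
    // The caller still owns `data` and drops it later, outside the timing
    data.iter().sum()
}

fn main() {
    let owned = vec![1u64, 2, 3];
    assert_eq!(consume(owned), 6);
    // `owned` is gone here: moved, like iter_batched's routine argument

    let mut borrowed = vec![1u64, 2, 3];
    assert_eq!(borrow(&mut borrowed), 6);
    // `borrowed` is still alive, like the input in iter_batched_ref
    assert_eq!(borrowed.len(), 3);
}
```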

Realistic Example: Database Query Benchmark

use criterion::{Criterion, BatchSize, black_box};
 
struct Database {
    data: Vec<u64>,
}
 
impl Database {
    fn new() -> Self {
        // Expensive setup: create and populate database
        Self {
            data: (0..100_000).collect(),
        }
    }
    
    fn query(&self, id: usize) -> u64 {
        self.data.get(id).copied().unwrap_or(0)
    }
}
 
fn database_benchmark(c: &mut Criterion) {
    c.bench_function("database_query", |b| {
        b.iter_batched(
            || Database::new(),  // Expensive, but never timed
            |db| {
                // Query many times against this instance
                let mut sum = 0u64;
                for i in 0..1000 {
                    sum += db.query(i);
                }
                black_box(sum)
            },
            BatchSize::LargeInput,  // Each Database is ~800 KB; bound how many coexist
        );
    });
    
    // Every iteration gets its own freshly built Database, but construction
    // happens outside the timed region, so only the queries are measured
}

Database benchmarks often have expensive setup; iter_batched keeps that setup entirely out of the measurement.
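The Database type above can be exercised directly, without Criterion, to confirm what the routine actually computes:

```rust
// The Database type from the benchmark, run standalone to sanity-check
// the routine's result before trusting the benchmark numbers.
struct Database {
    data: Vec<u64>,
}

impl Database {
    fn new() -> Self {
        // Expensive setup: create and populate the "database"
        Self {
            data: (0..100_000).collect(),
        }
    }

    fn query(&self, id: usize) -> u64 {
        self.data.get(id).copied().unwrap_or(0)
    }
}

fn main() {
    let db = Database::new();
    assert_eq!(db.query(42), 42);        // in-range ids map to themselves
    assert_eq!(db.query(1_000_000), 0);  // out-of-range ids return 0

    // The benchmark routine sums query(0)..query(999):
    let sum: u64 = (0..1000).map(|i| db.query(i)).sum();
    assert_eq!(sum, 499_500);
}
```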

Realistic Example: File I/O Benchmark

use criterion::{Criterion, BatchSize, black_box};
use std::io::Write;
use tempfile::NamedTempFile;
 
fn file_io_benchmark(c: &mut Criterion) {
    c.bench_function("file_write", |b| {
        b.iter_batched(
            || {
                // Setup: Create temp file for writing
                NamedTempFile::new().unwrap()
            },
            |mut file| {
                // Routine: Write to file
                file.write_all(b"test data\n").unwrap();
                file.flush().unwrap();
                black_box(file)
            },
            BatchSize::PerIteration,  // Only one open temp file at a time
        );
    });
    
    // Every iteration gets a fresh file in any batch mode; PerIteration
    // additionally keeps only one file handle open at once, avoiding
    // file-descriptor exhaustion from a large pre-generated batch
}

File I/O benchmarks suit PerIteration because each input holds a scarce OS resource (an open file descriptor).
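The same create/write/clean-up pattern can be sketched with only the standard library (using std::env::temp_dir in place of the tempfile crate; write_once is an illustrative helper, not part of any benchmark API):

```rust
use std::fs::{self, File};
use std::io::Write;
use std::path::Path;

// One "iteration" of the file benchmark: create a fresh file (the setup),
// then do the write that a Criterion routine would time.
fn write_once(path: &Path) -> std::io::Result<()> {
    let mut file = File::create(path)?; // fresh state: the setup step
    file.write_all(b"test data\n")?;    // the work being measured
    file.flush()
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("criterion_batchsize_demo.txt");
    write_once(&path)?;
    assert_eq!(fs::read_to_string(&path)?, "test data\n");
    fs::remove_file(&path)?; // teardown, also outside any timed region
    Ok(())
}
```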

Realistic Example: Sorting Benchmark

use criterion::{Criterion, BatchSize, black_box};
 
fn sorting_benchmark(c: &mut Criterion) {
    // Problem: Sorting modifies the data!
    // Second iteration would sort already-sorted data
    
    // WRONG: Data modified by iteration
    c.bench_function("sorting_wrong", |b| {
        let data: Vec<u64> = (0..10_000).rev().collect();
        b.iter(|| {
            let mut d = data.clone();  // Still inside measurement!
            d.sort();
            black_box(d)
        });
    });
    
    // CORRECT: Setup outside measurement
    c.bench_function("sorting_correct", |b| {
        b.iter_batched(
            || (0..10_000_u64).rev().collect::<Vec<_>>(),  // Setup
            |mut data| {
                data.sort();  // Only this is measured
                black_box(data)
            },
            BatchSize::NumIterations(1),  // Fresh data each time
        );
    });
    
    // iter_batched ensures setup is not measured
    // BatchSize::NumIterations(1) gives fresh data per batch
}

Sorting benchmarks need fresh (unsorted) data—iter_batched with NumIterations(1) handles this.
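A quick plain-Rust check of why freshness matters (is_sorted is a small illustrative helper): sorting already-sorted data is a different, usually easier workload than sorting reversed data.

```rust
// Helper: is the slice in non-decreasing order?
fn is_sorted(v: &[u64]) -> bool {
    v.windows(2).all(|w| w[0] <= w[1])
}

fn main() {
    // Fresh, genuinely unsorted input: what per-iteration setup provides
    let fresh: Vec<u64> = (0..10u64).rev().collect();
    assert!(!is_sorted(&fresh));

    let mut first_run = fresh.clone();
    first_run.sort();
    assert!(is_sorted(&first_run));

    // Reusing the buffer hands the next "iteration" already-sorted data,
    // so sorting it again measures the best case, not the intended workload
    let mut reused = first_run;
    assert!(is_sorted(&reused)); // the pitfall a naive iter() loop hits
    reused.sort();
    assert!(is_sorted(&reused));
}
```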

iter_batched for Stateful Operations

use criterion::{Criterion, BatchSize, black_box};
 
fn stateful_operations(c: &mut Criterion) {
    // Operations that modify state need careful handling
    
    c.bench_function("hashmap_insert", |b| {
        b.iter_batched(
            || {
                // Fresh HashMap for every iteration, built outside the timer
                std::collections::HashMap::<u64, u64>::new()
            },
            |mut map| {
                // Insert many items
                for i in 0..1000 {
                    map.insert(i, i * 2);
                }
                black_box(map)
            },
            BatchSize::SmallInput,  // Setup runs per iteration in every mode
        );
    });
    
    // If we reused the map, subsequent iterations would overwrite existing
    // keys instead of growing the map: a different workload entirely
}

Stateful operations need fresh state for consistent measurements; iter_batched's per-iteration setup provides exactly that.
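The "different workload" point is easy to verify directly: refilling an existing map with the same keys only overwrites entries (fill is an illustrative helper mirroring the benchmark routine):

```rust
use std::collections::HashMap;

// Mirrors the benchmark routine: insert n key/value pairs
fn fill(map: &mut HashMap<u64, u64>, n: u64) {
    for i in 0..n {
        map.insert(i, i * 2);
    }
}

fn main() {
    let mut fresh = HashMap::new();
    fill(&mut fresh, 1000);
    assert_eq!(fresh.len(), 1000); // first fill: 1000 real insertions

    // Second fill on the SAME map: every insert is an overwrite,
    // the map never grows, and no rehashing/allocation work happens
    fill(&mut fresh, 1000);
    assert_eq!(fresh.len(), 1000);
}
```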

Comparing iter vs iter_batched

use criterion::{Criterion, BatchSize, black_box};
 
fn iter_comparison(c: &mut Criterion) {
    // iter: Everything inside the closure is measured
    c.bench_function("with_iter", |b| {
        b.iter(|| {
            let data: Vec<u64> = (0..1000).collect();  // Measured
            data.iter().sum::<u64>()
        });
    });
    
    // iter_batched: Setup separated from routine
    c.bench_function("with_iter_batched", |b| {
        b.iter_batched(
            || (0..1000).collect::<Vec<u64>>(),  // Not measured
            |data| data.iter().sum::<u64>(),      // Measured
            BatchSize::SmallInput,
        );
    });
    
    // iter_batched_ref: Input passed by &mut; its drop is not measured
    c.bench_function("with_iter_batched_ref", |b| {
        b.iter_batched_ref(
            || (0..1000).collect::<Vec<u64>>(),  // Not measured
            |data| data.iter().sum::<u64>(),      // Measured, &mut passed
            BatchSize::SmallInput,
        );
    });
}

| Method           | Setup Measured | Input Ownership    | Use Case                  |
|------------------|----------------|--------------------|---------------------------|
| iter             | Yes            | N/A                | Simple, no setup          |
| iter_batched     | No             | Moved into routine | Per-iteration input       |
| iter_batched_ref | No             | Borrowed (&mut)    | Keep input's drop untimed |

How Criterion Uses BatchSize

use criterion::{Criterion, BatchSize};
 
fn criterion_internals(c: &mut Criterion) {
    // Criterion's measurement process with iter_batched:
    // 1. Warm up and estimate timing
    // 2. Determine sample count and iterations per sample
    // 3. For each batch:
    //    a. Call setup once per iteration, collecting N inputs (untimed)
    //    b. Start the timer, run the routine once per input, stop the timer
    //    c. Record the batch time and clean up outside the timed region
    
    // BatchSize controls N (iterations per batch):
    
    // SmallInput: N is large
    // - Few timer starts/stops, so minimal measurement overhead
    // - But all N pre-generated inputs are alive at the same time
    
    // LargeInput: N is smaller
    // - More timer overhead, but bounded memory for inputs
    
    // For SmallInput and LargeInput, Criterion derives the actual N from its
    // own iteration estimates; NumBatches(n) and NumIterations(n) pin it
}

For the heuristic variants, BatchSize guides Criterion's choice of batch size; only NumBatches/NumIterations fix it exactly.

Memory Considerations

use criterion::{Criterion, BatchSize, black_box};
 
fn memory_considerations(c: &mut Criterion) {
    // All pre-generated inputs for a batch are alive at the same time,
    // so peak memory is roughly batch size x input size
    
    // An 8 MB input with a generous batch would need gigabytes;
    // LargeInput keeps the batch, and therefore the memory, smaller
    c.bench_function("large_input_memory", |b| {
        b.iter_batched(
            || vec![0u64; 1_000_000],  // ~8 MB per input
            |data| data.iter().sum::<u64>(),
            BatchSize::LargeInput,
        );
    });
    
    // PerIteration caps memory at a single input, at the cost of timer
    // overhead on every iteration
    c.bench_function("per_iteration_memory", |b| {
        b.iter_batched(
            || vec![0u64; 1_000_000],
            |data| data.iter().sum::<u64>(),
            BatchSize::PerIteration,
        );
    });
}

Peak memory scales with batch size times input size; choose a variant that keeps an entire batch of inputs affordable.
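The memory arithmetic is worth making explicit; a small sketch (peak_input_bytes is an illustrative helper) shows why a large input and a large batch don't mix:

```rust
// Back-of-the-envelope peak memory for pre-generated inputs:
// peak ≈ iterations_per_batch * bytes_per_input
fn peak_input_bytes(iters_per_batch: usize, bytes_per_input: usize) -> usize {
    iters_per_batch * bytes_per_input
}

fn main() {
    // One Vec<u64> of a million elements is ~8 MB of payload
    let input = 1_000_000 * std::mem::size_of::<u64>();
    assert_eq!(input, 8_000_000);

    // A generous SmallInput-style batch of 1000 such inputs: ~8 GB!
    assert_eq!(peak_input_bytes(1000, input), 8_000_000_000);

    // PerIteration keeps exactly one alive: back to ~8 MB
    assert_eq!(peak_input_bytes(1, input), 8_000_000);
}
```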

Summary Table

fn summary_table() {
    // | BatchSize        | Iterations/Batch | Inputs Alive at Once | Use Case                    |
    // |------------------|------------------|----------------------|-----------------------------|
    // | SmallInput       | Large            | Many                 | Small inputs (default)      |
    // | LargeInput       | Smaller          | Fewer                | Large inputs                |
    // | PerIteration     | 1                | One                  | Huge/scarce-resource inputs |
    // | NumIterations(n) | n                | n                    | Exact manual control        |
    
    // | Method           | Setup Position | Setup Measured | Data Ownership   |
    // |------------------|----------------|----------------|------------------|
    // | iter             | Inside closure | Yes            | N/A              |
    // | iter_batched     | Untimed        | No             | Moved to routine |
    // | iter_batched_ref | Untimed        | No             | Borrowed (&mut)  |
}

Synthesis

Quick reference:

use criterion::{Criterion, BatchSize, black_box};
 
fn quick_reference(c: &mut Criterion) {
    // Large input: LargeInput bounds the memory used by a batch
    c.bench_function("large_input", |b| {
        b.iter_batched(
            || (0..1_000_000).collect::<Vec<u64>>(),
            |data| data.iter().sum::<u64>(),
            BatchSize::LargeInput,
        );
    });
    
    // Input holds scarce resources: PerIteration
    c.bench_function("stateful", |b| {
        b.iter_batched(
            || create_fresh_state(),   // placeholder helper
            |state| modify_state(state),
            BatchSize::PerIteration,
        );
    });
    
    // Small input: SmallInput, the recommended default
    c.bench_function("cheap", |b| {
        b.iter_batched_ref(
            || create_reusable_state(),   // placeholder helper
            |state| use_state(state),
            BatchSize::SmallInput,
        );
    });
}

Key insight: BatchSize, together with iter_batched, solves the fundamental benchmarking problem of expensive setup by keeping setup code outside the timed region. When setup is inside iter(), it contaminates measurements because Criterion can't distinguish setup time from operation time. iter_batched separates these concerns: setup runs once per iteration but before the timer starts, only the routine is measured, and BatchSize controls how many pre-generated inputs each timed batch covers. BatchSize::SmallInput (the recommended default) uses large batches to minimize timer overhead; BatchSize::LargeInput keeps memory bounded when each input is big; BatchSize::PerIteration holds only one input alive at a time, for huge or resource-holding inputs; and BatchSize::NumBatches(n)/NumIterations(n) give exact manual control. The choice depends on each input's memory footprint, and since setup runs per iteration in every mode, each iteration always receives fresh input.