Rust walkthroughs
How does criterion::black_box prevent compiler optimizations from skewing benchmark results?

criterion::black_box is an identity function that forces the compiler to treat its input as "used" and its output as unknown, ideally without emitting any extra instructions. This prevents the compiler from optimizing away computations whose results would otherwise be unused, or from reusing a cached value that should be recomputed each iteration. It is implemented with constructs the optimizer cannot analyze, such as an empty inline-assembly block or a volatile read, so the compiler has to assume the value may be observed or changed there. Without black_box, the compiler may eliminate benchmarked code entirely, hoist invariant computations out of loops, or constant-fold expressions that should be measured at run time. The standard library offers the same barrier as std::hint::black_box (stable since Rust 1.66), which the examples outside of Criterion use.
fn main() {
    let x = 42;
    let y = x * 2;
    // y is never used
    // Compiler will likely eliminate both x and y
}

Unused values get optimized away by the compiler.
fn fibonacci(n: u64) -> u64 {
    if n < 2 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

fn main() {
    let start = std::time::Instant::now();
    for _ in 0..1000 {
        fibonacci(20); // Result is discarded
    }
    println!("Time: {:?}", start.elapsed());
}

The compiler may eliminate the fibonacci(20) calls since their results are unused.
use std::hint::black_box;

fn fibonacci(n: u64) -> u64 {
    if n < 2 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

fn main() {
    let start = std::time::Instant::now();
    for _ in 0..1000 {
        let result = fibonacci(20);
        black_box(result); // Forces result to be "used"
    }
    println!("Time: {:?}", start.elapsed());
}

black_box prevents the compiler from eliminating the computation.
use std::hint::black_box;

fn main() {
    let x = 42;
    // black_box returns its input unchanged
    let y = black_box(x);
    // But the compiler cannot assume anything about:
    // - Whether x was read
    // - What the value of y is
    // - Whether the read had side effects
    println!("y = {}", y); // y is still 42
}

black_box is an identity function that the compiler cannot optimize through.
// Illustrative sketch, not the literal std::hint source.
// The real function forwards to a compiler intrinsic.
#[inline]
pub fn black_box<T>(dummy: T) -> T {
    // Techniques used by this and similar helpers:
    // 1. An empty inline-assembly block that takes the value
    //    (or a pointer to it) as an operand
    // 2. A volatile read of the value:
    //    unsafe { ptr::read_volatile(&dummy) }
    // The key: the optimizer cannot reason about either one
    dummy
}

The implementation relies on constructs that are opaque to the optimizer.
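To make the volatile-read technique concrete, here is a minimal sketch of a hand-rolled barrier. The name `volatile_black_box` is hypothetical, and this is not the standard library's implementation; it only illustrates how a volatile read can hide a value from the optimizer while still behaving as an identity function.

```rust
use std::{mem, ptr};

/// Hypothetical stand-in for `black_box`, built on a volatile read.
fn volatile_black_box<T>(dummy: T) -> T {
    unsafe {
        // A volatile read must be performed as written, so the optimizer
        // cannot assume it knows which value comes back.
        let ret = ptr::read_volatile(&dummy);
        // `ret` is now a bitwise copy of `dummy`. Forget the original so
        // its destructor does not run a second time.
        mem::forget(dummy);
        ret
    }
}

fn main() {
    let n = volatile_black_box(42u64);
    let s = volatile_black_box(String::from("opaque"));
    println!("{n} {s}");
}
```

Unlike the intrinsic-backed std::hint::black_box, this version forces the value through memory, so it is likely to cost a little more at run time.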
use std::hint::black_box;

fn compute(x: u64) -> u64 {
    x * x + x
}

fn main() {
    // Without black_box:
    // let result = compute(10);
    // Compiler may fold this to: 110 (constant)

    // With black_box:
    let input = black_box(10);
    let result = compute(input);
    // Compiler cannot assume input == 10
    // Must actually perform the computation at run time
    println!("Result: {}", result);
}

black_box prevents constant folding through its input.
use std::hint::black_box;

fn expensive_computation(x: u64) -> u64 {
    x * x + x
}

fn main() {
    // Without black_box:
    // let invariant = expensive_computation(100);
    // for _ in 0..1000 {
    //     consume(invariant); // Computed once
    // }

    // With black_box:
    for _ in 0..1000 {
        let result = expensive_computation(black_box(100));
        black_box(result); // Computed each iteration
    }
}

black_box prevents hoisting computations out of loops.
use std::hint::black_box;

fn allocate_buffer(size: usize) -> Vec<u8> {
    Vec::with_capacity(size)
}

fn main() {
    // Without black_box:
    // let buffer = allocate_buffer(1024);
    // Buffer never used, allocator call may be eliminated

    // With black_box:
    let buffer = allocate_buffer(1024);
    black_box(&buffer); // Buffer may be observed, so the allocation is kept
}

black_box keeps side effects such as the allocation from being elided.
use criterion::{black_box, criterion_group, criterion_main, Criterion};

fn fibonacci(n: u64) -> u64 {
    if n < 2 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

fn criterion_benchmark(c: &mut Criterion) {
    c.bench_function("fibonacci 20", |b| {
        b.iter(|| {
            // black_box on the input stops constant folding of fibonacci(20);
            // returning the result lets iter() keep it from being discarded
            fibonacci(black_box(20))
        });
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);

Bencher::iter passes the closure's return value through black_box, so only the inputs need explicit wrapping.
use criterion::{black_box, Criterion};

fn process(data: &[u8]) -> u64 {
    data.iter().map(|&b| b as u64).sum()
}

fn benchmark(c: &mut Criterion) {
    let data: Vec<u8> = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
    c.bench_function("process", |b| {
        b.iter(|| {
            // Black box input: compiler cannot assume data contents
            // Output: returned from the closure, so iter() black-boxes it
            process(black_box(&data))
        });
    });
}

Black-box the input explicitly and return the output so that neither side can be optimized away.
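The division of labour between the benchmark closure and the harness can be sketched without Criterion. The helper `time_routine` below is hypothetical and much simpler than Criterion's real measurement loop; it only shows the idea that the caller black-boxes the inputs while the harness black-boxes whatever the closure returns.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical mini-harness: times `iters` calls of `routine` and passes
/// every return value through `black_box` so the work cannot be discarded.
fn time_routine<O>(iters: u32, mut routine: impl FnMut() -> O) -> Duration {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(routine());
    }
    start.elapsed()
}

fn process(data: &[u8]) -> u64 {
    data.iter().map(|&b| b as u64).sum()
}

fn main() {
    let data: Vec<u8> = (1..=10).collect();
    // Input is black-boxed by the caller; output by the harness.
    let elapsed = time_routine(1_000, || process(black_box(&data)));
    println!("1000 calls took {elapsed:?}");
}
```

A closure that ends in a semicolon returns `()`, which leaves the harness nothing to protect; in that case the result has to be black-boxed inside the closure.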
use std::hint::black_box;

fn compute(x: u64) -> u64 {
    x * 2
}

fn main() {
    // Black box input only: prevents constant propagation into the function,
    // but the unused result may still be eliminated
    let input = black_box(42);
    let _result = compute(input);

    // Black box output only: prevents elimination of the result,
    // but compute(42) may still be folded to a constant
    let result = compute(42);
    black_box(result);

    // Both: maximum protection
    let result = compute(black_box(42));
    black_box(result);
}

Apply black_box to both sides for the most complete protection.
use std::hint::black_box;

fn main() {
    // black_box does NOT:
    // 1. Add delays or synchronization
    let _x = black_box(42);
    // 2. Prevent CPU-level optimizations
    //    CPU may still cache, prefetch, etc.
    // 3. Add memory barriers
    //    No synchronization with other threads
    // 4. Protect code whose values never pass through it
    //    Only work feeding its argument or using its result is kept
    // 5. Make timing more predictable
    //    Timing still varies due to CPU behavior
}

black_box only prevents compiler optimizations, not CPU optimizations, and the standard library documents it as a best-effort barrier.
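Because timing still varies from run to run even with black_box in place, a common response is to take several samples and summarize them. The following is a minimal sketch of that idea; `work` and `median` are hypothetical helpers, and the sample counts are arbitrary.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

fn work(x: u64) -> u64 {
    x.wrapping_mul(x).wrapping_add(x)
}

/// Sorts the samples and returns the middle one (upper median for even
/// lengths). Panics on an empty slice.
fn median(samples: &mut [Duration]) -> Duration {
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    let mut samples = Vec::with_capacity(31);
    for _ in 0..31 {
        let start = Instant::now();
        for _ in 0..10_000 {
            black_box(work(black_box(100)));
        }
        samples.push(start.elapsed());
    }
    let mid = median(&mut samples); // leaves `samples` sorted
    let (min, max) = (samples[0], samples[samples.len() - 1]);
    println!("min {min:?}, median {mid:?}, max {max:?}");
}
```

The spread between the minimum and maximum gives a rough sense of how much of the measurement is hardware and scheduler noise rather than the code under test.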
use std::hint::black_box;
use std::time::Instant;

fn main() {
    let iterations = 10_000_000u64;

    // Without black_box: the loop has no observable effect and is
    // likely removed entirely in release builds
    let start = Instant::now();
    for i in 0..iterations {
        let _ = i + 1;
    }
    println!("Without black_box: {:?}", start.elapsed());

    // With black_box
    let start = Instant::now();
    for i in 0..iterations {
        black_box(i + 1);
    }
    println!("With black_box: {:?}", start.elapsed());

    // Difference: black_box prevents some optimizations
    // But adds minimal runtime overhead
}

black_box adds negligible runtime cost; it's an optimization barrier.
use std::hint::black_box;

fn complex_computation(a: u64, b: u64, c: u64) -> u64 {
    a * b + b * c + a * c
}

fn main() {
    // Each parameter needs protection
    let result = complex_computation(
        black_box(10),
        black_box(20),
        black_box(30),
    );
    black_box(result);
}

Black-box every input that the optimizer could otherwise treat as a known constant.
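When a function takes several inputs, one variation is to pass them through black_box together as a tuple. This is a sketch of that variation, reusing complex_computation from the example above; the optimizer should lose knowledge of all three values, though the generated code is not guaranteed to match the one-call-per-argument form.

```rust
use std::hint::black_box;

fn complex_computation(a: u64, b: u64, c: u64) -> u64 {
    a * b + b * c + a * c
}

fn main() {
    // One barrier for the whole tuple instead of three separate calls.
    let (a, b, c) = black_box((10u64, 20u64, 30u64));
    let result = black_box(complex_computation(a, b, c));
    println!("result = {result}"); // prints "result = 1100"
}
```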
use criterion::{black_box, criterion_group, criterion_main, BatchSize, Criterion};

fn sort_slice(data: &mut [u64]) {
    data.sort();
}

fn benchmark_sort(c: &mut Criterion) {
    let data: Vec<u64> = (0..1000).rev().collect();
    c.bench_function("sort 1000", |b| {
        // Need to reset data each iteration
        // Otherwise sorting already-sorted data is much faster
        b.iter_batched(
            || data.clone(), // Setup: fresh data each time (not timed)
            |mut data| {
                sort_slice(black_box(&mut data));
                black_box(data); // Ensure result is used
            },
            BatchSize::SmallInput,
        );
    });
}

criterion_group!(benches, benchmark_sort);
criterion_main!(benches);

Benchmarks that mutate their input need per-iteration setup in addition to black boxing.
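The idea of keeping setup out of the measurement can be sketched with the standard library alone. The helper `time_batched` is hypothetical and far simpler than Criterion's iter_batched; it runs the setup closure untimed, times only the routine, and black-boxes both the input and the output.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `setup` untimed, then times only `routine`.
fn time_batched<I, O>(
    runs: u32,
    mut setup: impl FnMut() -> I,
    mut routine: impl FnMut(I) -> O,
) -> Duration {
    let mut total = Duration::ZERO;
    for _ in 0..runs {
        let input = setup(); // not timed
        let start = Instant::now();
        let output = routine(black_box(input));
        total += start.elapsed();
        // Keep the result alive; it is dropped after the timed region.
        black_box(output);
    }
    total
}

fn main() {
    let template: Vec<u64> = (0..1000).rev().collect();
    let total = time_batched(
        100,
        || template.clone(),
        |mut v| {
            v.sort();
            v // return the sorted data so the helper can black-box it
        },
    );
    println!("100 sorts took {total:?}");
}
```

Returning the sorted vector from the routine, rather than dropping it inside, also keeps deallocation out of the timed region in this sketch.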
use std::hint::black_box;
use std::ptr;

fn main() {
    let x = 42;
    // read_volatile: performs the memory read every time
    // Compiler cannot elide or cache the access
    let v = unsafe { ptr::read_volatile(&x) };
    // black_box: opaque to compiler
    // Compiler cannot assume anything about the result
    let b = black_box(x);
    // Difference:
    // - volatile: guaranteed memory access (for hardware)
    // - black_box: best-effort optimization barrier (for benchmarks)
    // Use volatile for memory-mapped I/O
    // Use black_box for benchmarks
    println!("{v} {b}");
}

Volatile and black box serve different purposes.
use std::hint::black_box;

fn main() {
    let data = vec![1, 2, 3, 4, 5];
    // WEAK: black_box on a reference only marks the data as possibly read
    black_box(&data);
    // Work done on data afterwards can still be optimized

    // WRONG: black_box around a closure that is never called
    black_box(|| {
        for i in 0..1000 {
            let _ = i + 1; // Never runs; the loop has no observable result anyway
        }
    });

    // CORRECT: black_box on values inside the computation
    for i in 0..1000 {
        black_box(i + 1);
    }
}

Black box must be placed where the computation actually happens.
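The two workable placements can be compared side by side: a barrier on every intermediate value, or a hidden input plus a single barrier on the final result. The function `sum_to` is a hypothetical example. Note that with the second placement the optimizer may still replace the loop with a closed-form formula, so use the per-iteration form when the loop itself is what should be measured.

```rust
use std::hint::black_box;

/// Sums 0..n while hiding the loop bound from the optimizer.
fn sum_to(n: u64) -> u64 {
    let mut acc = 0u64;
    for i in 0..black_box(n) {
        acc = acc.wrapping_add(i);
    }
    acc
}

fn main() {
    // Placement 1: black-box each intermediate value (per-iteration barrier).
    for i in 0..1000u64 {
        black_box(i + 1);
    }

    // Placement 2: hide the input, black-box the final result once.
    let total = black_box(sum_to(1000));
    println!("total = {total}"); // prints "total = 499500"
}
```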
| Technique | Prevents | Adds | Use Case |
|-----------|----------|------|----------|
| black_box | Compiler eliminating or constant-folding a value | Little or nothing | Benchmarks |
| read_volatile / write_volatile | Compiler eliding or merging memory accesses | A real memory access | Hardware I/O |
| std::mem::forget | The destructor (Drop) from running | Nothing | Resource management |
| #[must_use] | Return values being silently discarded | A compiler warning | API design |
black_box solves the fundamental problem that compilers optimize unused code away, and benchmarks intentionally run code for its side effect (time) rather than its output:
Why it works: black_box is an identity function that the compiler cannot see through. It is built on inline assembly or volatile operations whose effects the optimizer cannot analyze, which creates an opaque barrier. The compiler must assume the value could be read, could be any value, and could have side effects, even though none of these is true.
What it prevents: Dead code elimination (removing computations with unused results), constant folding (replacing runtime computations with compile-time constants), and loop invariant code motion (hoisting computations out of loops). These are precisely the optimizations that make code faster but benchmarks meaningless.
What it doesn't prevent: CPU-level optimizations like caching, prefetching, branch prediction, or out-of-order execution. Black box affects the compiler, not the hardware. Benchmark timing still varies due to CPU behavior.
Key insight: The purpose of black_box is to ensure the benchmark measures what you intend. The compiler is your adversary: it wants to make code faster by eliminating work. In normal code, this is good. In benchmarks, it means measuring nothing. black_box tells the compiler "this value matters" at negligible runtime cost, so measurements reflect real computation time.