What are the trade-offs of using smallvec::SmallVec inline capacity vs heap allocation?

SmallVec is a vector-like container that stores a small number of elements inline on the stack before spilling to the heap. The inline capacity is determined at compile time through the type parameter, allowing small collections to avoid heap allocation entirely. This optimization improves cache locality and reduces allocation overhead for the common case where collections are small. The trade-off is increased stack size and potential memory waste when the inline capacity exceeds actual usage, plus a small cost when transitioning from inline to heap storage.
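The stack-size side of that trade-off can be made concrete with std::mem::size_of. The struct below, InlineVec4, is a hypothetical stand-in that mirrors the idea of an inline buffer; it is not smallvec's actual layout, only an illustration of why inline storage costs stack space up front:

```rust
use std::mem::size_of;

// Hypothetical stand-in for an inline-buffer vector like SmallVec<[u64; 4]>.
// Real smallvec layouts differ; this only illustrates the size trade-off.
#[allow(dead_code)] // fields exist only for their size
struct InlineVec4 {
    len: usize,
    buf: [u64; 4], // elements live directly in the struct
}

fn main() {
    // Vec<u64> is three words: pointer, capacity, length
    println!("Vec<u64>:   {} bytes", size_of::<Vec<u64>>());
    // The inline variant pays for the embedded array up front
    println!("InlineVec4: {} bytes", size_of::<InlineVec4>());
}
```

On a 64-bit target this prints 24 bytes for Vec<u64> and 40 bytes for the inline variant: the inline struct is larger whether or not it ever holds four elements.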

Basic SmallVec Usage

use smallvec::SmallVec;
 
fn basic_usage() {
    // SmallVec with inline capacity for 4 elements
    let mut vec: SmallVec<[i32; 4]> = SmallVec::new();
    
    // First 4 elements stored inline (no heap allocation)
    vec.push(1);
    vec.push(2);
    vec.push(3);
    vec.push(4);
    
    // Fifth element causes heap allocation
    vec.push(5);
    
    println!("Length: {}", vec.len());
    println!("Is on heap: {}", vec.spilled());
}

The type SmallVec<[T; N]> stores up to N elements inline before spilling to heap.

Inline vs Spilled State

use smallvec::SmallVec;
 
fn inline_vs_heap() {
    let mut vec: SmallVec<[u64; 3]> = SmallVec::new();
    
    // Stack allocated, inline storage
    vec.push(1);
    vec.push(2);
    vec.push(3);
    
    println!("Spilled: {}", vec.spilled());  // false
    println!("Capacity: {}", vec.capacity()); // 3
    
    // Exceeds inline capacity - spills to heap
    vec.push(4);
    
    println!("Spilled: {}", vec.spilled());  // true
    println!("Capacity: {}", vec.capacity()); // larger than 3; exact value depends on the growth strategy
    
    // Now using heap storage
    vec.push(5);
    vec.push(6);
}

spilled() indicates whether the data has moved from inline to heap storage.

Memory Layout Comparison

use smallvec::SmallVec;
 
fn memory_layout() {
    // Vec<T> is 3 pointers (24 bytes on 64-bit)
    // - pointer to heap data
    // - capacity
    // - length
    
    // SmallVec<[T; N]> (smallvec 1.x):
    // - one usize of bookkeeping (length while inline, capacity when spilled)
    // - a union of the inline array [T; N] and a (pointer, length) pair
    // Size: 1 usize + max(size of [T; N], 2 pointers)
    
    // Example: SmallVec<[u64; 4]>
    // Inline: 8 bytes + 32 bytes (4 * u64) = 40 bytes on 64-bit
    // Vec<u64>: 24 bytes + heap allocation
    
    // Example: SmallVec<[u8; 32]>
    // Inline: 8 + 32 = 40 bytes total
    // Can store 32 bytes without heap allocation
    
    let small: SmallVec<[u8; 32]> = SmallVec::from_buf([0u8; 32]);
    println!("Stack size: {} bytes", std::mem::size_of_val(&small));
}

SmallVec has a larger stack footprint than Vec to accommodate inline storage.

Performance Benefits of Inline Storage

use smallvec::SmallVec;
 
fn performance_example() {
    // Avoiding heap allocation for small collections
    
    // Vec: heap allocates on first push (Vec::new itself allocates nothing)
    let mut vec: Vec<i32> = Vec::new();
    vec.push(1);  // Heap allocation happens here
    
    // SmallVec: no allocation for first N elements
    let mut small: SmallVec<[i32; 4]> = SmallVec::new();
    small.push(1);  // No allocation
    small.push(2);  // No allocation
    small.push(3);  // No allocation
    small.push(4);  // No allocation
    // 4 elements, 0 heap allocations
    
    // For functions that create temporary small vectors:
    fn process_small() -> SmallVec<[i32; 8]> {
        let mut result = SmallVec::new();
        for i in 0..5 {
            result.push(i);
        }
        result  // No heap allocation occurred
    }
}

Inline storage eliminates heap allocation overhead for small collections.
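These allocation claims can be verified without smallvec at all by installing a counting global allocator; the names here (CountingAlloc, ALLOCATIONS) are illustrative, not a real library API:

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Wraps the system allocator and counts every heap allocation.
struct CountingAlloc;

static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

fn main() {
    let before = ALLOCATIONS.load(Ordering::Relaxed);

    let _empty: Vec<i32> = Vec::new(); // no allocation yet
    assert_eq!(ALLOCATIONS.load(Ordering::Relaxed), before);

    let mut v: Vec<i32> = Vec::new();
    v.push(1); // first push allocates
    assert!(ALLOCATIONS.load(Ordering::Relaxed) > before);
}
```

The same harness can be pointed at a SmallVec to confirm that pushes within inline capacity leave the counter untouched.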

Cache Locality Benefits

use smallvec::SmallVec;
 
fn cache_locality() {
    struct Point {
        x: f64,
        y: f64,
        z: f64,
    }
    
    // Vec<Point>: Points scattered on heap
    // SmallVec<[Point; 4]>: Points contiguous on stack with the SmallVec
    
    // When SmallVec is on the stack, inline elements are too
    fn process_points() {
        let mut points: SmallVec<[Point; 4]> = SmallVec::new();
        points.push(Point { x: 0.0, y: 0.0, z: 0.0 });
        points.push(Point { x: 1.0, y: 0.0, z: 0.0 });
        points.push(Point { x: 0.0, y: 1.0, z: 0.0 });
        
        // All points are cache-friendly, contiguous with the SmallVec
        for p in &points {
            println!("({}, {}, {})", p.x, p.y, p.z);
        }
    }
    
    // Better cache locality means faster access
    // Especially important in tight loops
}

Inline elements share cache lines with the SmallVec itself.

Choosing Inline Capacity

use smallvec::SmallVec;
 
fn capacity_selection() {
    // Choose capacity based on common case analysis
    
    // If 90% of your vectors have <= 4 elements:
    let vec: SmallVec<[String; 4]> = SmallVec::new();
    
    // If most have <= 8:
    let vec: SmallVec<[i32; 8]> = SmallVec::new();
    
    // Profile your actual usage:
    fn analyze_data(data: &[Vec<i32>]) {
        let mut sizes: Vec<usize> = data.iter().map(|v| v.len()).collect();
        sizes.sort();
        
        // Find P50, P90, P99
        let p50 = sizes[sizes.len() / 2];
        let p90 = sizes[(sizes.len() as f64 * 0.9) as usize];
        let p99 = sizes[(sizes.len() as f64 * 0.99) as usize];
        
        println!("P50: {}, P90: {}, P99: {}", p50, p90, p99);
        // Choose capacity near P90 to avoid most spills
    }
    
    // Trade-off: larger capacity = more stack space but fewer spills
}

Select inline capacity based on profiling your actual data distribution.

Stack Size Considerations

use smallvec::SmallVec;
 
fn stack_size_warning() {
    // SmallVec size on stack = inline capacity * element size + overhead
    
    // Small inline capacity - reasonable
    let v1: SmallVec<[i32; 4]> = SmallVec::new();  // ~24 bytes
    
    // Medium - still reasonable
    let v2: SmallVec<[i32; 16]> = SmallVec::new();  // ~72 bytes
    
    // Large - be careful
    let v3: SmallVec<[i32; 64]> = SmallVec::new();  // ~264 bytes
    
    // Very large - probably too much
    let v4: SmallVec<[i32; 256]> = SmallVec::new();  // ~1032 bytes
    
    // With larger elements
    // Each String is 24 bytes
    let v5: SmallVec<[String; 16]> = SmallVec::new();  // ~392 bytes!
    
    // Rule of thumb: keep total SmallVec size under ~500 bytes
    // Otherwise you risk stack overflow in recursive functions
}

Large inline capacities consume significant stack space.

Spill Cost Analysis

use smallvec::SmallVec;
 
fn spill_cost() {
    let mut vec: SmallVec<[i32; 4]> = SmallVec::new();
    
    // Inline: push is cheap
    vec.push(1);  // Just writing to stack memory
    vec.push(2);
    vec.push(3);
    vec.push(4);
    
    // Spilling has overhead:
    // 1. Allocate heap buffer
    // 2. Copy inline elements to heap
    // 3. Set up heap pointer
    // 4. Then push new element
    
    vec.push(5);  // Triggers spill, has one-time cost
    
    // After spill, behaves like Vec:
    vec.push(6);  // Normal heap push
    vec.push(7);
    
    // If you frequently exceed inline capacity,
    // you pay the spill cost AND heap allocation
    // Better to use Vec in that case
}

Spilling has a one-time cost to copy inline elements to heap.
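The four spill steps above can be sketched with a std-only toy. TinyVec is a deliberate simplification of what smallvec does internally (fixed capacity of 4, i32 elements only), meant just to show the inline-to-heap transition:

```rust
// Toy two-state buffer illustrating the spill transition.
// Not smallvec's real implementation; capacity is hard-coded at 4.
enum TinyVec {
    Inline { buf: [i32; 4], len: usize },
    Heap(Vec<i32>),
}

impl TinyVec {
    fn new() -> Self {
        TinyVec::Inline { buf: [0; 4], len: 0 }
    }

    fn push(&mut self, value: i32) {
        match self {
            TinyVec::Inline { buf, len } if *len < 4 => {
                buf[*len] = value; // cheap: a write into inline storage
                *len += 1;
            }
            TinyVec::Inline { buf, len } => {
                // Spill: allocate a heap buffer, copy the inline
                // elements across, then push the new element.
                let data = *buf; // copy out of inline storage
                let n = *len;
                let mut heap = Vec::with_capacity(2 * n);
                heap.extend_from_slice(&data[..n]);
                heap.push(value);
                *self = TinyVec::Heap(heap);
            }
            TinyVec::Heap(v) => v.push(value), // ordinary heap push
        }
    }

    fn spilled(&self) -> bool {
        matches!(self, TinyVec::Heap(_))
    }

    fn len(&self) -> usize {
        match self {
            TinyVec::Inline { len, .. } => *len,
            TinyVec::Heap(v) => v.len(),
        }
    }
}

fn main() {
    let mut v = TinyVec::new();
    for i in 0..4 {
        v.push(i);
    }
    assert!(!v.spilled()); // still inline
    v.push(4); // fifth push pays the one-time spill cost
    assert!(v.spilled());
    assert_eq!(v.len(), 5);
}
```

The spill arm is the one-time cost: one allocation plus a copy of every inline element, after which every push is an ordinary heap push.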

Comparison with Vec

use smallvec::SmallVec;
 
fn comparison() {
    // Vec pros:
    // - Consistent heap allocation
    // - Small stack footprint (24 bytes)
    // - No "spill" surprise
    // - Well-known, universally understood
    
    // Vec cons:
    // - Every allocation requires heap
    // - Pointer chasing for access
    // - Allocation overhead for small collections
    
    // SmallVec pros:
    // - No allocation for small collections
    // - Better cache locality when inline
    // - Can avoid allocation in hot paths
    
    // SmallVec cons:
    // - Larger stack footprint
    // - Spill cost if exceeded
    // - More complex type
    // - Potential memory waste if underutilized
    
    // When to use Vec:
    // - Unpredictable or large sizes
    // - Memory-constrained environments
    // - Deep recursion (stack space matters)
    // - Consistency matters more than performance
    
    // When to use SmallVec:
    // - Known small size distributions
    // - Performance-critical paths
    // - Temporary collections
    // - Want to reduce allocator pressure
}

Choose based on your size distribution and performance requirements.

Memory Waste Analysis

use smallvec::SmallVec;
 
fn memory_waste() {
    // Inline capacity wastes stack space if underutilized
    
    // Worst case: empty SmallVec
    let empty: SmallVec<[u64; 16]> = SmallVec::new();
    // Uses 136 bytes of stack for 0 elements!
    
    // Vec<u64> would only use 24 bytes
    
    // Best case: exactly filled
    let full: SmallVec<[u64; 16]> = (0..16).collect();
    // Uses all 136 bytes efficiently
    
    // Partially filled
    let partial: SmallVec<[u64; 16]> = (0..4).collect();
    // Uses 136 bytes for 32 bytes of data
    // Wastes 104 bytes
    
    // Heap waste also possible after spill
    let mut spilled: SmallVec<[u64; 4]> = SmallVec::new();
    spilled.extend(0..5);  // Spills
    // May allocate more capacity than needed
}

Unused inline capacity consumes stack space that could be used elsewhere.

SmallVec in Function Signatures

use smallvec::SmallVec;
 
// Return SmallVec to avoid allocation
fn get_small_collection() -> SmallVec<[i32; 8]> {
    let mut result = SmallVec::new();
    result.push(1);
    result.push(2);
    result.push(3);
    result  // No heap allocation
}
 
// Accept SmallVec in function parameter
fn process_small(data: SmallVec<[String; 4]>) {
    for item in &data {
        println!("{}", item);
    }
}
 
// SmallVec implements Deref<Target = [T]>
fn process_slice(data: &[i32]) {
    // Works with both Vec and SmallVec
}
 
fn caller() {
    let small: SmallVec<[i32; 8]> = get_small_collection();
    process_slice(&small);  // Coerces to &[i32]
    
    // Can also convert to Vec if needed
    let vec: Vec<i32> = small.into_vec();
}

SmallVec can be used in signatures and coerces to slices.

Real-World Pattern: Temporary Collections

use smallvec::SmallVec;
 
// Common pattern: temporary buffer in tight loop
fn process_data(data: &[i32]) -> i32 {
    // Temporary collection - rarely exceeds 8 elements
    let mut buffer: SmallVec<[i32; 8]> = SmallVec::new();
    
    for &value in data {
        if value > 0 {
            buffer.push(value * 2);
        }
        
        if buffer.len() == 8 {
            // Process batch
            let sum: i32 = buffer.iter().sum();
            if sum > 1000 {
                return sum;
            }
            buffer.clear();
        }
    }
    
    buffer.iter().sum()
}
 
// Parser pattern: collecting tokens
fn parse_tokens(input: &str) -> SmallVec<[&str; 16]> {
    let mut tokens = SmallVec::new();
    for word in input.split_whitespace() {
        tokens.push(word);
        if tokens.len() > 100 {
            // Handle unexpectedly large input
            break;
        }
    }
    tokens  // Usually no allocation
}

SmallVec excels for short-lived, small collections in hot code paths.

SmallVec with Clone and Copy Types

use smallvec::SmallVec;
 
fn type_behaviors() {
    // Copy types - inline storage is efficient
    let mut copy_vec: SmallVec<[i32; 4]> = SmallVec::new();
    copy_vec.push(1);
    copy_vec.push(2);
    
    // Cloning SmallVec copies inline storage
    let cloned = copy_vec.clone();
    
    // Non-Copy types - moves matter
    let mut string_vec: SmallVec<[String; 2]> = SmallVec::new();
    string_vec.push("hello".to_string());
    string_vec.push("world".to_string());
    
    // Cloning clones each String
    let cloned_strings = string_vec.clone();
    
    // After spill, clone must clone from heap
    string_vec.push("extra".to_string());  // Spills: inline capacity of 2 exceeded
    let cloned_after_spill = string_vec.clone();
}

Copy types benefit most from inline storage; Clone types still avoid allocation.

Growing Beyond Inline Capacity

use smallvec::SmallVec;
 
fn growth_behavior() {
    let mut vec: SmallVec<[i32; 4]> = SmallVec::new();
    
    // Initial capacity is inline capacity
    println!("Initial capacity: {}", vec.capacity());  // 4
    
    // Fill inline storage
    vec.extend(0..4);
    println!("After fill: len={}, cap={}, spilled={}", 
        vec.len(), vec.capacity(), vec.spilled());
    // len=4, cap=4, spilled=false
    
    // Spill happens
    vec.push(4);
    println!("After spill: len={}, cap={}, spilled={}", 
        vec.len(), vec.capacity(), vec.spilled());
    // len=5, cap=? (depends on growth strategy), spilled=true
    
    // Continue growing on heap
    vec.extend(5..100);
    println!("Final: len={}, cap={}", vec.len(), vec.capacity());
    
    // reserve() can trigger early spill
    let mut vec2: SmallVec<[i32; 4]> = SmallVec::new();
    vec2.reserve(10);  // Spills to heap immediately
    println!("After reserve: spilled={}", vec2.spilled());  // true
}

Spilling is automatic when inline capacity is exceeded; reserve can trigger early spill.

Using smallvec! Macro

use smallvec::{smallvec, SmallVec};
 
fn macro_usage() {
    // Create SmallVec with initial values
    let vec: SmallVec<[i32; 4]> = smallvec![1, 2, 3];
    println!("Len: {}, spilled: {}", vec.len(), vec.spilled());  // 3, false
    
    // Exactly at capacity
    let full: SmallVec<[i32; 4]> = smallvec![1, 2, 3, 4];
    println!("Spilled: {}", full.spilled());  // false
    
    // Exceeds capacity - creates on heap directly
    let spilled: SmallVec<[i32; 4]> = smallvec![1, 2, 3, 4, 5];
    println!("Spilled: {}", spilled.spilled());  // true
    
    // Empty SmallVec
    let empty: SmallVec<[String; 8]> = smallvec![];
}

The smallvec! macro provides convenient construction with type inference.

Performance Comparison (Conceptual)

use smallvec::SmallVec;
 
// Conceptual performance comparison:
//
// Operation           | Vec         | SmallVec (inline) | SmallVec (spilled)
// --------------------|-------------|-------------------|-------------------
// New/empty           | 0 allocs    | 0 allocs          | n/a
// First push          | 1 alloc     | 0 allocs          | n/a
// Grow past capacity  | realloc     | 1 alloc + copy    | realloc
// Access by index     | 1 ptr deref | direct            | 1 ptr deref
// Iterate             | ptr chase   | direct            | ptr chase
// Clone               | 1 alloc     | stack copy        | 1 alloc
// Drop (non-empty)    | 1 free      | no-op             | 1 free
 
// Key insight: SmallVec wins when most instances stay inline
// SmallVec loses when:
// - Most instances spill (extra spill cost)
// - Most instances are empty (wasted stack space)
// - Inline capacity is too large (stack pressure)

SmallVec performance depends on the ratio of inline to spilled instances.

Synthesis

The trade-offs of SmallVec inline capacity vs heap allocation:

Benefits of inline capacity:

  • Zero heap allocations for small collections
  • Better cache locality (data contiguous on stack)
  • Reduced allocator pressure in hot paths
  • Faster access (no pointer indirection when inline)

Costs of inline capacity:

  • Larger stack footprint (inline array size + overhead)
  • Memory waste when underutilized
  • Spill cost when exceeded (allocation + copy)
  • Risk of stack overflow with large capacities or deep recursion

Choose SmallVec when:

  • Most collections stay under inline capacity (profile P90)
  • Performance-critical code paths
  • Temporary/short-lived collections
  • Want to reduce allocator contention
  • Cache locality matters

Choose Vec when:

  • Sizes are unpredictable or often large
  • Stack space is constrained
  • Code simplicity is preferred
  • Collections are long-lived
  • Memory efficiency matters more than speed

Key insight: SmallVec is an optimization based on the observation that many vectors in practice are small. It's most effective when you've profiled your data and know that a high percentage of instances fit within the inline capacity. The optimal inline capacity balances stack space usage against the spill rate—too small and you spill often (losing the benefit), too large and you waste stack space. A common starting point is SmallVec<[T; 4]> or SmallVec<[T; 8]>, adjusted based on profiling data.