What are the memory layout implications of using smallvec::SmallVec vs Vec for small collections?

Small collections pose a memory efficiency problem in Rust. Vec<T> always heap-allocates, even for a single element, incurring allocation overhead and pointer indirection. SmallVec addresses this by storing small numbers of elements inline, eliminating heap allocation for small collections at the cost of a larger stack footprint.

The Vec Memory Model

A Vec<T> consists of three components on the stack:

use std::mem::size_of;
 
fn demonstrate_vec_layout() {
    // Vec always has this structure:
    // - pointer to heap allocation (8 bytes on 64-bit)
    // - length (8 bytes)
    // - capacity (8 bytes)
    // Total: 24 bytes on stack
    
    assert_eq!(size_of::<Vec<u8>>(), 24);
    assert_eq!(size_of::<Vec<u64>>(), 24);
    assert_eq!(size_of::<Vec<String>>(), 24);
    
    // Plus a heap allocation once elements are stored:
    // - Vec::new() and with_capacity(0) do not allocate
    // - the first push (or a non-zero with_capacity) triggers an allocation
}

Every Vec, regardless of element type or length, occupies 24 bytes on the stack on a 64-bit target. The actual data lives on the heap:

fn vec_heap_allocation() {
    // Single element vec
    let v: Vec<u8> = vec![42];
    
    // Stack: pointer (8) + length (8) + capacity (8) = 24 bytes
    // Heap: at least 1 byte requested; most allocators round small
    // requests up to a minimum size class (often 8-16 bytes) and may
    // add bookkeeping overhead of their own
    
    // For a single byte, that is roughly:
    // - 24 bytes on stack
    // - 16+ bytes on heap, depending on the allocator
    // Total: 40+ bytes to store 1 byte
}

SmallVec's Inline Storage

SmallVec stores elements inline up to a specified capacity:

use smallvec::SmallVec;
use std::mem::size_of;
 
fn demonstrate_smallvec_layout() {
    // SmallVec<[u8; 8]> stores up to 8 bytes inline.
    // Stack layout (smallvec 1.x):
    // - one usize that holds the length while inline (and the
    //   capacity after spilling)
    // - a union of the inline buffer and a (pointer, length) pair,
    //   so the union is at least 16 bytes on 64-bit
    
    type SmallU8 = SmallVec<[u8; 8]>;
    assert_eq!(size_of::<SmallU8>(), 24);  // 8 + max(8, 16)
    
    // Compare to Vec<u8>:
    assert_eq!(size_of::<Vec<u8>>(), 24);
    
    // Same stack size here, but SmallVec doesn't allocate for ≤8 elements
}

The inline capacity is part of the type, enabling the compiler to reserve stack space.
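This can be seen with a std-only model (a hypothetical `InlineBuf` type, not the real smallvec layout, which additionally unions the buffer with a heap pointer):

```rust
use std::mem::{size_of, MaybeUninit};

// Hypothetical model: the capacity N is a type parameter, so every
// instantiation has a compile-time-known stack footprint.
struct InlineBuf<T, const N: usize> {
    len: usize,
    buf: [MaybeUninit<T>; N],
}

fn main() {
    // The compiler reserves exactly N slots per instance.
    assert_eq!(size_of::<InlineBuf<u64, 2>>(), 8 + 2 * 8);
    assert_eq!(size_of::<InlineBuf<u64, 8>>(), 8 + 8 * 8);
}
```

Because `N` is baked into the type, no runtime bookkeeping is needed to know where the buffer lives.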

How SmallVec Manages Storage

SmallVec transitions between inline and heap storage:

use smallvec::SmallVec;
 
fn smallvec_transitions() {
    let mut v: SmallVec<[u64; 2]> = SmallVec::new();
    
    // Initially inline, no heap allocation
    v.push(1);  // Still inline
    v.push(2);  // Still inline, buffer full
    
    // Stack: 24 bytes (one usize + a 16-byte union holding the
    // two u64s inline)
    // No heap allocation yet
    
    v.push(3);  // Exceeds inline capacity, spills to heap
    
    // Now:
    // - Stack: still 24 bytes
    // - Heap: Vec-like allocation for the 3+ elements
    // - Data moved from the inline buffer to the heap
}

This spilling mechanism allows SmallVec to handle arbitrarily large collections while optimizing the common case.
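The two-state logic can be sketched in plain std Rust (a hypothetical `Storage` enum using `Option` slots for simplicity; the real crate uses a union and unsafe code):

```rust
// Hypothetical model of SmallVec's two storage states; the spill
// logic has the same shape as the real crate's.
enum Storage<T, const N: usize> {
    Inline { len: usize, buf: [Option<T>; N] },
    Heap(Vec<T>),
}

impl<T: Copy, const N: usize> Storage<T, N> {
    fn new() -> Self {
        Storage::Inline { len: 0, buf: [None; N] }
    }

    fn push(&mut self, x: T) {
        if let Storage::Inline { len, buf } = self {
            if *len < N {
                buf[*len] = Some(x); // still inline, no allocation
                *len += 1;
                return;
            }
            // Spill: copy the inline elements into a heap buffer...
            let v: Vec<T> = buf[..*len].iter().map(|o| o.unwrap()).collect();
            *self = Storage::Heap(v);
        }
        if let Storage::Heap(v) = self {
            v.push(x); // ...and push there from now on
        }
    }

    fn spilled(&self) -> bool {
        matches!(self, Storage::Heap(_))
    }
}

fn main() {
    let mut s: Storage<u64, 2> = Storage::new();
    s.push(1);
    s.push(2);
    assert!(!s.spilled()); // buffer full but still inline
    s.push(3);             // exceeds N: data moves to the heap
    assert!(s.spilled());
}
```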

Memory Comparison: Concrete Example

Consider storing a small number of integers:

use smallvec::SmallVec;
use std::mem::size_of;
 
fn memory_comparison() {
    // Scenario: Store up to 4 integers
    
    // Option 1: Vec<i64>
    // Stack: 24 bytes
    // Heap: 32+ bytes (allocation for 4 i64s)
    // Total: 56+ bytes
    
    let vec: Vec<i64> = vec![1, 2, 3, 4];
    assert_eq!(size_of::<Vec<i64>>(), 24);
    
    // Option 2: SmallVec<[i64; 4]>
    // Stack: 40 bytes (length + 4 * 8)
    // Heap: 0 bytes (inline)
    // Total: 40 bytes
    
    type SmallInts = SmallVec<[i64; 4]>;
    let small: SmallInts = SmallVec::from_slice(&[1, 2, 3, 4]);
    assert_eq!(size_of::<SmallInts>(), 40);  // 8 (len) + 32 (data)
    
    // SmallVec uses less total memory for small collections
}

The Trade-off: Stack Size vs Heap Allocation

SmallVec increases stack size to avoid heap allocation:

use smallvec::SmallVec;
use std::mem::size_of;
 
fn stack_size_tradeoff() {
    // Vec<i64>: Always 24 bytes on stack
    assert_eq!(size_of::<Vec<i64>>(), 24);
    
    // SmallVec<[i64; 8]>: 72 bytes on stack (8 + 64)
    type Small8 = SmallVec<[i64; 8]>;
    assert_eq!(size_of::<Small8>(), 72);
    
    // SmallVec<[i64; 16]>: 136 bytes on stack (8 + 128)
    type Small16 = SmallVec<[i64; 16]>;
    assert_eq!(size_of::<Small16>(), 136);
    
    // Large inline capacity = larger stack footprint
    // But more elements can avoid heap allocation
}

Choosing the inline capacity means balancing the per-instance stack cost against how many collections stay within the buffer and avoid allocation.
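As a rough aid, the stack cost on 64-bit under smallvec 1.x's layout can be estimated with a helper like this (hypothetical, and ignoring padding for element alignments above 8):

```rust
// Approximate stack footprint of SmallVec<[T; N]> on 64-bit, smallvec 1.x:
// one usize (length/capacity word) + a union of the inline buffer and a
// 16-byte (pointer, length) pair.
const fn smallvec_stack_bytes(n: usize, elem_size: usize) -> usize {
    let inline = n * elem_size;
    let union_size = if inline > 16 { inline } else { 16 };
    8 + union_size
}

fn main() {
    assert_eq!(smallvec_stack_bytes(4, 8), 40); // SmallVec<[i64; 4]>
    assert_eq!(smallvec_stack_bytes(8, 8), 72); // SmallVec<[i64; 8]>
    // Small inline buffers can't shrink the struct below the union floor:
    assert_eq!(smallvec_stack_bytes(8, 1), 24); // SmallVec<[u8; 8]>
}
```

The last line shows why tiny inline buffers are "free" on the stack: the pointer/length pair sets a 16-byte floor on the union regardless.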

When SmallVec Saves Memory

SmallVec wins when elements are small and collections are typically within the inline capacity:

use smallvec::SmallVec;
 
// Good use case: Small byte buffers
type ByteBuffer = SmallVec<[u8; 32]>;  // 40 bytes stack
 
// Typical network packet: small, often < 32 bytes for headers
// No allocation for most packets
 
// Good use case: Small lists of IDs
type IdList = SmallVec<[u32; 8]>;  // 40 bytes stack
 
// Most operations touch few IDs
// Rare large lists spill to heap gracefully
 
// Good use case: Path components
type PathSegments<'a> = SmallVec<[&'a str; 4]>;  // 72 bytes stack (8 + 4 * 16)
 
// Most paths have few components
// Deep paths still work via heap spillover

When SmallVec Costs Memory

SmallVec loses when elements are large or collections exceed inline capacity:

use smallvec::SmallVec;
use std::mem::size_of;
 
fn when_smallvec_loses() {
    // Large elements: inline buffer wastes stack space
    type BigElement = SmallVec<[String; 16]>;
    
    // Each String is 24 bytes
    // Inline buffer: 16 * 24 = 384 bytes
    // Total stack: 384 + 8 = 392 bytes
    assert_eq!(size_of::<BigElement>(), 392);
    
    // Compare to Vec<String>: 24 bytes stack
    assert_eq!(size_of::<Vec<String>>(), 24);
    
    // If most vectors are empty or have 1-2 elements:
    // SmallVec: 392 bytes stack per vector (wasteful)
    // Vec: 24 bytes stack + heap allocation for used elements
    
    // If vectors are always large (20+ elements):
    // SmallVec: 392 bytes stack + heap (inline never used)
    // Vec: 24 bytes stack + heap
    // SmallVec wastes 368 bytes of stack
}

Performance Implications

Memory layout affects cache behavior and allocation overhead:

use smallvec::SmallVec;
 
fn performance_characteristics() {
    // Cache locality: SmallVec's inline data
    let mut small: SmallVec<[u8; 64]> = SmallVec::new();
    for i in 0..64 {
        small.push(i);
    }
    
    // Inline data sits contiguously with the SmallVec's metadata,
    // so metadata and data load together (for small buffers, often
    // within a single cache line)
    
    // Vec's data is elsewhere on the heap
    let mut vec: Vec<u8> = Vec::with_capacity(64);
    for i in 0..64 {
        vec.push(i);
    }
    
    // Metadata on stack, data on heap
    // Two cache line accesses minimum
    // Potential cache miss on first data access
}

Allocation overhead dominates for small, short-lived collections:

use smallvec::SmallVec;
 
fn allocation_overhead() {
    // Many small allocations
    for _ in 0..1000 {
        let mut v: Vec<u8> = Vec::with_capacity(4);
        v.push(1);
        v.push(2);
        // Heap allocation + deallocation per iteration
    }
    
    // No allocations for small inline capacity
    for _ in 0..1000 {
        let mut v: SmallVec<[u8; 4]> = SmallVec::new();
        v.push(1);
        v.push(2);
        // No heap traffic at all
    }
}

Choosing the Right Inline Capacity

The inline capacity should match your typical usage pattern:

use smallvec::SmallVec;
 
// Analyze your data first (assumes records is non-empty)
fn analyze_collection_sizes(records: &[Vec<u32>]) -> usize {
    let mut sizes: Vec<usize> = records.iter().map(|v| v.len()).collect();
    sizes.sort_unstable();
    // Use the 90th-percentile size as a starting inline capacity
    sizes[(sizes.len() - 1) * 9 / 10]
}
 
// Example: HTTP requests typically carry 10-20 headers
type Headers = SmallVec<[(String, String); 16]>;  // caution: 16 * 48 + 8 = 776 bytes on stack
 
// Example: Function arguments typically have 2-5 arguments
type Arguments = SmallVec<[Expr; 4]>;
 
// Example: Directory contents often have 5-20 entries
type DirEntries = SmallVec<[DirEntry; 16]>;

Spill Behavior and Reallocation

When SmallVec spills, it reallocates like Vec:

use smallvec::SmallVec;
 
fn spill_behavior() {
    let mut v: SmallVec<[u8; 4]> = SmallVec::new();
    
    v.push(1);
    v.push(2);
    v.push(3);
    v.push(4);
    // Inline, no heap allocation
    
    v.push(5);
    // Spills to heap:
    // 1. Allocates heap buffer (capacity typically grows)
    // 2. Copies inline data to heap
    // 3. Inline buffer no longer used
    // 4. Further pushes use heap like Vec
    
    // After spill, SmallVec behaves like Vec
    // The inline buffer becomes dead space on the stack
}

This is why oversized inline buffers waste memory for large collections.

Generic SmallVec Patterns

The inline array syntax can be confusing but follows Rust's array syntax:

use smallvec::SmallVec;
 
// The generic parameter is an array type [T; N]
// T is the element type, N is the inline capacity
 
type SmallBytes = SmallVec<[u8; 16]>;     // Up to 16 u8s inline
type SmallInts = SmallVec<[i32; 8]>;      // Up to 8 i32s inline
type SmallStrings = SmallVec<[String; 4]>; // Up to 4 Strings inline
 
// The inline buffer shares a union with a 16-byte (pointer, length) pair,
// so the struct occupies 8 bytes (length/capacity word)
// + max(N * size_of::<T>(), 16), rounded up for alignment

SmallVec with Zero Inline Capacity

You can create a SmallVec with zero inline capacity:

use smallvec::SmallVec;
 
fn zero_capacity() {
    // No inline storage at all
    type HeapOnly = SmallVec<[u8; 0]>;
    
    let mut v: HeapOnly = SmallVec::new();
    v.push(1);  // Immediately allocates on heap
    
    // This behaves like Vec (same 24-byte struct on 64-bit) with SmallVec's API
    // Useful for generic code written against SmallVec<A>
}

Comparison with Similar Types

Rust has multiple small-collection optimizations:

use smallvec::SmallVec;
use std::mem::size_of;
 
fn compare_types() {
    // Vec: Always heap, fixed stack size
    assert_eq!(size_of::<Vec<u8>>(), 24);
    
    // SmallVec: inline up to N; never smaller than 24 bytes on 64-bit
    type S8 = SmallVec<[u8; 8]>;
    assert_eq!(size_of::<S8>(), 24);  // 8 + 16-byte union
    
    // tinyvec::TinyVec: spills to heap like SmallVec, but requires
    //   T: Default and uses no unsafe code
    // ArrayVec (from arrayvec or tinyvec): no heap at all, capacity is fixed
    
    // Box<[T]>: Heap slice, no capacity field
    assert_eq!(size_of::<Box<[u8]>>(), 16);  // pointer + length
}

Practical Example: AST Node Children

A compiler AST often has nodes with few children:

use smallvec::SmallVec;
 
// A minimal operator type so the example is self-contained
enum BinOp { Add, Sub, Mul, Div }

enum Expr {
    Literal(i64),
    Binary {
        op: BinOp,
        left: Box<Expr>,
        right: Box<Expr>,
    },
    Call {
        func: String,
        // Children must be boxed: an inline buffer of bare Expr values
        // would make Expr recursively (infinitely) sized
        args: SmallVec<[Box<Expr>; 4]>,  // Most calls have 0-4 args
    },
    List(SmallVec<[Box<Expr>; 8]>),  // Most lists are small
}
 
// Without SmallVec:
// - Every Call with 1 arg allocates a Vec
// - Every List with 2 elements allocates a Vec
// - A 1000-node AST might have 500+ tiny allocations
 
// With SmallVec:
// - Most calls and lists are inline
// - Only unusual cases allocate
// - Better cache locality during traversal

Synthesis

SmallVec trades stack space for heap allocation avoidance. The memory implications depend on your data patterns:

Stack size: SmallVec<[T; N]> occupies 8 bytes plus the larger of the inline buffer (size_of::<T>() * N) and a 16-byte pointer/length pair, rounded up for alignment. Vec<T> always occupies 24 bytes on 64-bit targets.

Heap allocation: SmallVec allocates only when exceeding inline capacity. Vec allocates for any non-zero capacity.

Total memory: For collections within inline capacity, SmallVec uses less total memory (no heap overhead). For large collections, SmallVec uses more memory (stack buffer is wasted after spilling).

Performance: SmallVec provides better cache locality for small collections and eliminates allocation overhead. The trade-off is larger stack frames and potential cache pressure from larger structs.

Choose SmallVec when your collections are typically small (< inline capacity) and you have many of them. Choose Vec when collections are typically large, elements are large, or stack space is at a premium. Profile your actual data distribution to determine the optimal inline capacity—if the 90th percentile size is 6, SmallVec<[T; 8]> avoids heap allocation for 90% of cases while keeping stack usage reasonable.