Rust walkthroughs
smallvec::SmallVec vs Vec for small collections?

Small collections pose a memory efficiency problem in Rust. Vec<T> always heap-allocates, even for a single element, incurring allocation overhead and pointer indirection. SmallVec addresses this by storing small numbers of elements inline, eliminating heap allocation for small collections at the cost of a larger stack footprint.
A Vec<T> consists of three components on the stack:
use std::mem::size_of;

fn demonstrate_vec_layout() {
    // Vec always has this structure:
    // - pointer to heap allocation (8 bytes on 64-bit)
    // - length (8 bytes)
    // - capacity (8 bytes)
    // Total: 24 bytes on stack
    assert_eq!(size_of::<Vec<u8>>(), 24);
    assert_eq!(size_of::<Vec<u64>>(), 24);
    assert_eq!(size_of::<Vec<String>>(), 24);
    // Plus heap allocation:
    // - For 1 element: allocation overhead + 1 element
    // - For 0 elements: Vec::new() and Vec::with_capacity(0) do not allocate;
    //   the first push does
}

Every Vec, regardless of element type or length, occupies 24 bytes on the stack on a 64-bit target. The actual data lives on the heap:
fn vec_heap_allocation() {
    // Single element vec
    let v: Vec<u8> = vec![42];
    // Stack: pointer (8) + length (8) + capacity (8) = 24 bytes
    // Heap: 1 byte requested; most allocators round this up to a minimum
    // block size (often 16 bytes or more) and keep bookkeeping of their own
    // For a single byte, we typically have:
    // - 24 bytes on stack
    // - roughly 16+ bytes on heap (allocator-dependent)
    // Total: roughly 40+ bytes to store 1 byte
}

SmallVec stores elements inline up to a specified capacity:
use smallvec::SmallVec;
use std::mem::size_of;

fn demonstrate_smallvec_layout() {
    // SmallVec<[u8; 8]> stores up to 8 bytes inline
    // Stack layout (smallvec 1.x):
    // - capacity word (8 bytes), which doubles as the length while inline
    // - storage that holds EITHER the inline buffer OR a heap pointer + length
    // The storage must fit the heap pointer + length (16 bytes), so the
    // struct is never smaller than a Vec
    type SmallU8 = SmallVec<[u8; 8]>;
    // 24 bytes with the crate's `union` feature, one word more without it
    assert!(size_of::<SmallU8>() >= size_of::<Vec<u8>>());
    // Compare to Vec<u8>:
    assert_eq!(size_of::<Vec<u8>>(), 24);
    // SmallVec is not smaller on the stack, but it doesn't allocate for ≤8 elements
}

The inline capacity is part of the type, enabling the compiler to reserve stack space.
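To see how the inline capacity drives the struct size, the layout can be modeled in plain std Rust. This is only a sketch, assuming the small vector keeps one capacity word next to storage shared by the inline buffer and a heap pointer plus length (roughly what smallvec 1.x does with its `union` feature). The names `Storage`, `Model`, and `modeled_size` are hypothetical and not part of the smallvec crate.

```rust
use std::mem::{size_of, ManuallyDrop, MaybeUninit};

/// Hypothetical stand-in for the storage: one region that holds either the
/// inline buffer or a heap pointer plus length (an assumption, see lead-in).
#[allow(dead_code)]
#[repr(C)]
union Storage<T, const N: usize> {
    inline: ManuallyDrop<MaybeUninit<[T; N]>>,
    heap: (*mut T, usize),
}

/// Hypothetical model of the whole struct: a capacity word plus the storage.
#[allow(dead_code)]
#[repr(C)]
struct Model<T, const N: usize> {
    capacity: usize,
    storage: Storage<T, N>,
}

/// Size in bytes of the modeled small vector with `N` inline slots of `T`.
fn modeled_size<T, const N: usize>() -> usize {
    size_of::<Model<T, N>>()
}

fn main() {
    println!("Vec<u8>:        {}", size_of::<Vec<u8>>());
    println!("model [u8; 8]:  {}", modeled_size::<u8, 8>());
    println!("model [u8; 32]: {}", modeled_size::<u8, 32>());
    println!("model [i64; 4]: {}", modeled_size::<i64, 4>());
}
```

Under this model, small inline buffers do not make the struct smaller than a Vec: the storage always has to be able to hold the heap pointer and length, so the inline buffer only starts to cost extra once it exceeds two machine words.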
SmallVec transitions between inline and heap storage:
use smallvec::SmallVec;

fn smallvec_transitions() {
    let mut v: SmallVec<[u64; 2]> = SmallVec::new();
    // Initially inline, no heap allocation
    v.push(1); // Still inline
    v.push(2); // Still inline, buffer full
    // Stack: 24 bytes with the `union` feature
    // (8-byte capacity/length word + 2 * 8 bytes of inline buffer)
    // No heap allocation yet
    assert!(!v.spilled());
    v.push(3); // Exceeds inline capacity, spills to heap
    assert!(v.spilled());
    // Now:
    // - Stack size is unchanged (it is fixed by the type)
    // - Heap: Vec-like allocation for 3+ elements
    // - Data moved from inline to heap
}

This spilling mechanism allows SmallVec to handle arbitrarily large collections while optimizing the common case.
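The inline-then-spill idea can be sketched in safe, std-only Rust with an enum. This toy is not how the smallvec crate is implemented (the crate uses unsafe code and a more compact layout). `ToySmallVec` and its methods are hypothetical names chosen for illustration, and the element type is fixed to u64 to keep the sketch short.

```rust
/// Hypothetical toy small vector for u64 values; not the smallvec crate.
enum ToySmallVec<const N: usize> {
    /// Up to N elements stored directly inside the value.
    Inline { len: usize, buf: [u64; N] },
    /// More than N elements: everything lives in a heap-backed Vec.
    Heap(Vec<u64>),
}

impl<const N: usize> ToySmallVec<N> {
    fn new() -> Self {
        ToySmallVec::Inline { len: 0, buf: [0; N] }
    }

    fn push(&mut self, value: u64) {
        match self {
            // Room left in the inline buffer: no allocation.
            ToySmallVec::Inline { len, buf } if *len < N => {
                buf[*len] = value;
                *len += 1;
            }
            // Inline buffer is full: copy it to the heap, then append.
            ToySmallVec::Inline { len, buf } => {
                let mut heap = Vec::with_capacity((*len * 2).max(1));
                heap.extend_from_slice(&buf[..*len]);
                heap.push(value);
                *self = ToySmallVec::Heap(heap);
            }
            // Already spilled: behave like a plain Vec.
            ToySmallVec::Heap(vec) => vec.push(value),
        }
    }

    fn as_slice(&self) -> &[u64] {
        match self {
            ToySmallVec::Inline { len, buf } => &buf[..*len],
            ToySmallVec::Heap(vec) => vec.as_slice(),
        }
    }

    fn spilled(&self) -> bool {
        matches!(self, ToySmallVec::Heap(_))
    }
}

fn main() {
    let mut v = ToySmallVec::<2>::new();
    v.push(1);
    v.push(2);
    println!("after 2 pushes, spilled: {}", v.spilled());
    v.push(3);
    println!("after 3 pushes, spilled: {}", v.spilled());
    println!("contents: {:?}", v.as_slice());
}
```

The enum makes the two states explicit at the cost of a tag and a full Vec inside the heap variant, which is the kind of overhead a dedicated crate works to avoid.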
Consider storing a small number of integers:
use smallvec::SmallVec;
use std::mem::size_of;

fn memory_comparison() {
    // Scenario: Store up to 4 integers
    // Option 1: Vec<i64>
    // Stack: 24 bytes
    // Heap: 32+ bytes (allocation for 4 i64s)
    // Total: 56+ bytes
    let vec: Vec<i64> = vec![1, 2, 3, 4];
    assert_eq!(size_of::<Vec<i64>>(), 24);
    // Option 2: SmallVec<[i64; 4]>
    // Stack: 40 bytes (8-byte capacity/length word + 4 * 8) with the crate's
    //        `union` feature; one word more without it
    // Heap: 0 bytes (inline)
    // Total: 40 bytes
    type SmallInts = SmallVec<[i64; 4]>;
    let small: SmallInts = SmallVec::from_slice(&[1, 2, 3, 4]);
    assert!(size_of::<SmallInts>() >= 8 + 4 * size_of::<i64>());
    assert!(!small.spilled()); // still inline
    assert_eq!(vec.as_slice(), small.as_slice());
    // SmallVec uses less total memory for small collections
}

SmallVec increases stack size to avoid heap allocation:
use smallvec::SmallVec;
use std::mem::size_of;

fn stack_size_tradeoff() {
    // Sizes assume a 64-bit target and the crate's `union` feature;
    // without it each SmallVec is one word larger
    // Vec<i64>: Always 24 bytes on stack
    assert_eq!(size_of::<Vec<i64>>(), 24);
    // SmallVec<[i64; 8]>: 72 bytes on stack (8 + 64)
    type Small8 = SmallVec<[i64; 8]>;
    assert!(size_of::<Small8>() >= 72);
    // SmallVec<[i64; 16]>: 136 bytes on stack (8 + 128)
    type Small16 = SmallVec<[i64; 16]>;
    assert!(size_of::<Small16>() >= 136);
    // Large inline capacity = larger stack footprint
    // But more elements can avoid heap allocation
}

Choosing the inline capacity requires balancing stack usage against allocation avoidance.
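One way to make that balance concrete is to compare candidate capacities against a sample of real collection lengths. The following is a minimal std-only sketch. The helper names `inline_hit_rate` and `estimated_stack_bytes` and the sample lengths are hypothetical, and the size estimate assumes one bookkeeping word plus the inline buffer, ignoring alignment padding and crate feature flags.

```rust
use std::mem::size_of;

/// Share of observed lengths that would fit in `inline_cap` slots (0.0 to 1.0).
fn inline_hit_rate(lengths: &[usize], inline_cap: usize) -> f64 {
    if lengths.is_empty() {
        return 0.0;
    }
    let hits = lengths.iter().filter(|&&len| len <= inline_cap).count();
    hits as f64 / lengths.len() as f64
}

/// Rough stack estimate (an assumption, not a guarantee): one word of
/// bookkeeping plus the inline buffer, and never less than the three words
/// a heap-backed vector needs.
fn estimated_stack_bytes<T>(inline_cap: usize) -> usize {
    let word = size_of::<usize>();
    word + (inline_cap * size_of::<T>()).max(2 * word)
}

fn main() {
    // Hypothetical sample of collection lengths gathered from a workload.
    let lengths = [0, 1, 1, 2, 2, 3, 3, 4, 6, 40];
    for cap in [2, 4, 8, 16] {
        println!(
            "N = {:>2}: {:>3.0}% inline, about {} bytes on the stack",
            cap,
            inline_hit_rate(&lengths, cap) * 100.0,
            estimated_stack_bytes::<i64>(cap)
        );
    }
}
```

With this sample, going from 8 to 16 slots adds 64 bytes per vector without covering any additional collections, which is the kind of diminishing return the comparison is meant to expose.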
SmallVec wins when elements are small and collections are typically within the inline capacity:
use smallvec::SmallVec;

// Stack sizes below assume a 64-bit target and the crate's `union` feature

// Good use case: Small byte buffers
type ByteBuffer = SmallVec<[u8; 32]>; // 40 bytes stack
// Typical network packet: small, often < 32 bytes for headers
// No allocation for most packets

// Good use case: Small lists of IDs
type IdList = SmallVec<[u32; 8]>; // 40 bytes stack
// Most operations touch few IDs
// Rare large lists spill to heap gracefully

// Good use case: Path components
// (&str is a 16-byte pointer + length pair, so 4 of them take 64 bytes)
type PathSegments<'a> = SmallVec<[&'a str; 4]>; // 72 bytes stack
// Most paths have few components
// Deep paths still work via heap spillover

SmallVec loses when elements are large or collections exceed inline capacity:
use smallvec::SmallVec;
use std::mem::size_of;

fn when_smallvec_loses() {
    // Large elements: inline buffer wastes stack space
    type BigElement = SmallVec<[String; 16]>;
    // Each String is 24 bytes
    // Inline buffer: 16 * 24 = 384 bytes
    // Total stack: 384 + 8 = 392 bytes
    // (with the `union` feature; one word more without it)
    assert!(size_of::<BigElement>() >= 392);
    // Compare to Vec<String>: 24 bytes stack
    assert_eq!(size_of::<Vec<String>>(), 24);
    // If most vectors are empty or have 1-2 elements:
    // SmallVec: 392 bytes stack per vector (wasteful)
    // Vec: 24 bytes stack + heap allocation for used elements
    // If vectors are always large (20+ elements):
    // SmallVec: 392 bytes stack + heap (inline never used)
    // Vec: 24 bytes stack + heap
    // SmallVec wastes 368 bytes of stack
}

Memory layout affects cache behavior and allocation overhead:
use smallvec::SmallVec;

fn performance_characteristics() {
    // Cache locality: SmallVec's inline data
    let mut small: SmallVec<[u8; 64]> = SmallVec::new();
    for i in 0..64 {
        small.push(i);
    }
    // All data is contiguous with the SmallVec struct
    // Metadata and data sit in adjacent memory (72+ bytes here, so the
    // value spans at least two 64-byte cache lines), with no pointer to chase

    // Vec's data is elsewhere on the heap
    let mut vec: Vec<u8> = Vec::with_capacity(64);
    for i in 0..64 {
        vec.push(i);
    }
    // Metadata on stack, data on heap
    // Reading the data means following the pointer to a separate location
    // Potential cache miss on first data access
}

Allocation overhead dominates for small, short-lived collections:
use smallvec::SmallVec;

fn allocation_overhead() {
    // Many small allocations
    for _ in 0..1000 {
        let mut v: Vec<u8> = Vec::with_capacity(4);
        v.push(1);
        v.push(2);
        // Heap allocation + deallocation per iteration
    }
    // No allocations for small inline capacity
    for _ in 0..1000 {
        let mut v: SmallVec<[u8; 4]> = SmallVec::new();
        v.push(1);
        v.push(2);
        // No heap traffic at all
    }
}

The inline capacity should match your typical usage pattern:
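use smallvec::SmallVec;
use std::collections::BTreeMap;
use std::fs::DirEntry;

// Analyze your data first: returns the 90th percentile size, if any
fn analyze_collection_sizes(records: &[Vec<u32>]) -> Option<usize> {
    // Sizes in ascending order, each with the number of records of that size
    let mut histogram = BTreeMap::new();
    for v in records {
        *histogram.entry(v.len()).or_insert(0usize) += 1;
    }
    // Find the 90th percentile size (nearest rank): walk the sizes upward
    // until at least 90% of the records are covered
    let target = (records.len() * 9 + 9) / 10; // ceil(0.9 * n)
    let mut seen = 0;
    for (size, count) in histogram {
        seen += count;
        if seen >= target {
            return Some(size); // Use that as your inline capacity
        }
    }
    None // no records to analyze
}

// Example: HTTP headers typically have 10-20 headers
// (note: each (String, String) is 48 bytes, so this is 768 bytes inline)
type Headers = SmallVec<[(String, String); 16]>;
// Example: Function arguments typically have 2-5 arguments
struct Expr; // placeholder for your own AST node type
type Arguments = SmallVec<[Expr; 4]>;
// Example: Directory contents often have 5-20 entries
type DirEntries = SmallVec<[DirEntry; 16]>;

When SmallVec spills, it reallocates like Vec: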
use smallvec::SmallVec;

fn spill_behavior() {
    let mut v: SmallVec<[u8; 4]> = SmallVec::new();
    v.push(1);
    v.push(2);
    v.push(3);
    v.push(4);
    // Inline, no heap allocation
    assert!(!v.spilled());
    v.push(5);
    assert!(v.spilled());
    // Spills to heap:
    // 1. Allocates heap buffer (capacity typically grows)
    // 2. Copies inline data to heap
    // 3. Inline buffer no longer used
    // 4. Further pushes use heap like Vec
    // After spill, SmallVec behaves like Vec
    // The inline region now only holds the heap pointer and length;
    // anything beyond those two words is dead space on the stack
}

This is why oversized inline buffers waste memory for large collections.
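The amount of dead space can be estimated with a little arithmetic. The sketch below assumes that, after spilling, the heap pointer and length (two machine words) reuse the start of the inline region, which is how a union-based layout would behave. `dead_inline_bytes` is a hypothetical helper, not a smallvec API.

```rust
use std::mem::size_of;

/// Inline bytes left unused once a small vector has spilled, assuming the
/// heap pointer and length (two words) reuse the start of the inline region.
fn dead_inline_bytes<T>(inline_cap: usize) -> usize {
    let inline_bytes = inline_cap * size_of::<T>();
    inline_bytes.saturating_sub(2 * size_of::<usize>())
}

fn main() {
    println!("[u8; 4]:      {} dead bytes", dead_inline_bytes::<u8>(4));
    println!("[u64; 16]:    {} dead bytes", dead_inline_bytes::<u64>(16));
    println!("[String; 16]: {} dead bytes", dead_inline_bytes::<String>(16));
}
```

Small inline buffers cost nothing extra after a spill under this assumption, while a buffer of 16 Strings leaves several hundred bytes unused in every spilled vector.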
The inline array syntax can be confusing but follows Rust's array syntax:
use smallvec::SmallVec;

// The generic parameter is an array type [T; N]
// T is the element type, N is the inline capacity
type SmallBytes = SmallVec<[u8; 16]>; // Up to 16 u8s inline
type SmallInts = SmallVec<[i32; 8]>; // Up to 8 i32s inline
type SmallStrings = SmallVec<[String; 4]>; // Up to 4 Strings inline
// The inline buffer size is N * size_of::<T>()
// Plus one machine word (8 bytes): in smallvec 1.x it holds the length while
// inline and the capacity after spilling, and comparing it with N tells
// inline from heap

You can create a SmallVec with zero inline capacity:
use smallvec::SmallVec;

fn zero_capacity() {
    // No inline storage at all
    type HeapOnly = SmallVec<[u8; 0]>;
    let mut v: HeapOnly = SmallVec::new();
    v.push(1); // Immediately allocates on heap
    // This behaves like Vec but with SmallVec's API
    // Useful for generic code that accepts any SmallVec<A>
}

Rust has multiple small-collection optimizations:
use smallvec::SmallVec;
use std::mem::size_of;

fn compare_types() {
    // Vec: Always heap, fixed stack size
    assert_eq!(size_of::<Vec<u8>>(), 24);
    // SmallVec: Inline up to N, never smaller than Vec on the stack
    type S8 = SmallVec<[u8; 8]>;
    assert!(size_of::<S8>() >= 24);
    // tinyvec::TinyVec: Similar to SmallVec (inline first, then spills to the
    // heap), written in safe Rust; elements must implement Default
    // ArrayVec from tinyvec or arrayvec: No heap at all, fixed capacity
    // Box<[T]>: Heap slice, no capacity field
    assert_eq!(size_of::<Box<[u8]>>(), 16); // pointer + length
}

A compiler AST often has nodes with few children:
use smallvec::SmallVec;

// Placeholder operator set for the example
enum BinOp {
    Add,
    Sub,
    Mul,
    Div,
}

enum Expr {
    Literal(i64),
    Binary {
        op: BinOp,
        left: Box<Expr>,
        right: Box<Expr>,
    },
    Call {
        func: String,
        // Children sit behind Box: storing Expr inline inside Expr would
        // make the type infinitely sized and fail to compile
        args: SmallVec<[Box<Expr>; 4]>, // Most calls have 0-4 args
    },
    List(SmallVec<[Box<Expr>; 8]>), // Most lists are small
}
// Without SmallVec:
// - Every Call with 1 arg allocates a Vec buffer for its argument list
// - Every List with 2 elements allocates a Vec buffer
// - A 1000-node AST might have 500+ tiny allocations just for child lists
// With SmallVec:
// - Most argument and element lists are stored inline in the node
// - Only unusual cases allocate a separate buffer
// - Better cache locality during traversal

SmallVec trades stack space for heap allocation avoidance. The memory implications depend on your data patterns:
Stack size: In smallvec 1.x on a 64-bit target with the `union` feature, SmallVec<[T; N]> occupies roughly 8 + max(size_of::<T>() * N, 16) bytes on the stack (one word more without the feature, plus alignment padding), so it is never smaller than a Vec. Vec<T> always occupies 24 bytes.
Heap allocation: SmallVec allocates only when exceeding inline capacity. Vec allocates for any non-zero capacity (unless T is zero-sized).
Total memory: For collections within inline capacity, SmallVec uses less total memory (no heap overhead). For large collections, SmallVec uses more memory (most of the inline buffer is wasted after spilling).
Performance: SmallVec provides better cache locality for small collections and eliminates allocation overhead for them. The trade-off is larger stack frames and potential cache pressure from larger structs.
Choose SmallVec when your collections are typically small (≤ inline capacity) and you have many of them. Choose Vec when collections are typically large, elements are large, or stack space is at a premium. Profile your actual data distribution to determine the optimal inline capacity: if the 90th percentile size is 6, SmallVec<[T; 8]> avoids heap allocation for at least 90% of cases while keeping stack usage reasonable.