dashmap::DashMap::shards configuration and memory overhead?

DashMap shards its data across multiple internal hash maps (one per shard) so threads can operate concurrently without a global lock: more shards give finer-grained parallelism at the cost of per-shard allocation overhead, while fewer shards reduce memory overhead but increase lock contention under concurrent load. Each shard is a separate RwLock-protected hash table with its own allocation, so a DashMap with 256 shards maintains 256 separate tables even if the total entry count is small. The default shard count is derived from the number of CPU cores, and tuning it (via with_shard_amount) trades memory overhead (empty or sparse shards) against contention (multiple threads competing for the same shard lock).
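To make the idea concrete, here is a minimal std-only sketch of the sharding scheme: a map split into N independently locked tables, with the key's hash selecting the shard. The names `ShardedMap` and `shard_for` are invented for this illustration; real DashMap shards over hashbrown tables with its own lock type.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

// Illustrative sketch: a map split into N independently locked shards.
pub struct ShardedMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V> ShardedMap<K, V> {
    pub fn new(shard_count: usize) -> Self {
        // Every shard's table and lock exist up front, even while empty.
        let shards = (0..shard_count)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        ShardedMap { shards }
    }

    // Pick a shard from the key's hash; only that shard's lock is taken,
    // so operations on keys in different shards proceed in parallel.
    fn shard_for(&self, key: &K) -> &RwLock<HashMap<K, V>> {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        &self.shards[(h.finish() as usize) % self.shards.len()]
    }

    pub fn insert(&self, key: K, value: V) {
        self.shard_for(&key).write().unwrap().insert(key, value);
    }

    pub fn get_cloned(&self, key: &K) -> Option<V>
    where
        V: Clone,
    {
        self.shard_for(key).read().unwrap().get(key).cloned()
    }

    pub fn shard_count(&self) -> usize {
        self.shards.len()
    }
}

fn main() {
    let map = ShardedMap::new(8); // 8 tables, 8 locks, even while empty
    map.insert(1, "one");
    map.insert(2, "two");
    assert_eq!(map.get_cloned(&1), Some("one"));
    println!("{} shards allocated", map.shard_count());
}
```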
use dashmap::DashMap;

fn default_shards() {
    // Default shard count is derived from the CPU count.
    let _map: DashMap<i32, i32> = DashMap::new();
    // In recent dashmap versions the default is roughly:
    //   shard_amount = (available_parallelism * 4).next_power_of_two()
    // On an 8-core machine that is 32 shards;
    // on a 64-core machine, 256 shards.
    // Every shard is created up front, before any entries are inserted.
}

Default sharding scales with CPU count for optimal parallelism.
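The default can be reproduced with std alone. This sketch assumes the formula used by recent dashmap versions (four times the available parallelism, rounded up to a power of two); check the dashmap source for the version you depend on.

```rust
// Assumed default in recent dashmap versions:
//   shard_amount = (available_parallelism * 4).next_power_of_two()
fn default_shard_amount(parallelism: usize) -> usize {
    (parallelism.max(1) * 4).next_power_of_two()
}

fn main() {
    assert_eq!(default_shard_amount(8), 32); // 8-core machine
    assert_eq!(default_shard_amount(64), 256); // 64-core machine
    assert_eq!(default_shard_amount(6), 32); // 24 rounds up to 32
    println!("ok");
}
```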
use dashmap::DashMap;

fn explicit_shards() {
    // Create with a specific shard count. The dashmap API is
    // with_shard_amount; the count must be a power of two greater than 1.
    let _map: DashMap<i32, String> = DashMap::with_shard_amount(4);
    // Creates 4 internal hash tables, each behind its own RwLock.
    // Fewer shards:
    // - Less memory overhead (fewer per-shard structures)
    // - More contention (threads compete for fewer locks)
    // More shards:
    // - More memory overhead (more per-shard structures)
    // - Less contention (threads spread across more locks)
}

with_shard_amount(n) creates exactly n internal hash tables.
use dashmap::DashMap;
use std::mem::size_of;

fn memory_analysis() {
    // Each shard consists of:
    // 1. A hash table structure (pointer, length, capacity metadata)
    // 2. An RwLock for synchronization
    // Hash table cost:
    // - The table struct itself
    // - A bucket array once the shard holds entries
    //   (hashbrown-based tables allocate lazily, so a never-touched
    //   shard costs only the struct and the lock)
    // RwLock cost:
    // - Lock state and its wait/wake mechanism
    // Example: a DashMap with 256 shards maintains
    // 256 table structs and 256 RwLocks,
    // even if it stores only 10 entries in total.
    let overhead_per_shard = size_of::<std::collections::HashMap<i32, i32>>();
    let shards = 256;
    let _total_overhead = shards * overhead_per_shard; // struct cost only, not buckets
}

Each shard has its own allocation, creating baseline memory overhead.
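A back-of-envelope comparison, using std struct sizes as a stand-in for the per-shard fixed cost (real DashMap shards also pay for any bucket allocation, so this is a lower bound on the overhead):

```rust
use std::collections::HashMap;
use std::mem::size_of;
use std::sync::RwLock;

fn main() {
    // Fixed per-shard structure cost, approximated with std types.
    let per_shard = size_of::<RwLock<HashMap<i32, i32>>>();
    let shards = 256;
    let structure_overhead = shards * per_shard;

    // Data actually stored: 10 (i32, i32) entries.
    let data = 10 * size_of::<(i32, i32)>();

    println!(
        "structure overhead >= {} bytes, entry data = {} bytes",
        structure_overhead, data
    );
    // For this tiny map, the fixed structure cost dwarfs the data.
    assert!(structure_overhead > data);
}
```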
use dashmap::DashMap;
use std::sync::Arc;

fn contention_example() {
    // Low shard count = high contention.
    // (dashmap requires at least 2 shards, and a power of two.)
    let _few_shards = Arc::new(DashMap::<i32, i32>::with_shard_amount(2));
    // Nearly all threads compete for the same couple of locks,
    // close to a global RwLock<HashMap>.

    // High shard count = low contention.
    let _many_shards = Arc::new(DashMap::<i32, i32>::with_shard_amount(256));
    // Threads likely hit different shards, so less waiting on locks.

    // The key's hash determines its shard, conceptually:
    //   shard_index = hash(key) % shard_count
    // With good hash distribution, keys spread evenly across shards
    // and concurrent accesses tend to hit different shards.
}

More shards reduce contention by distributing keys across more locks.
use dashmap::DashMap;

fn empty_shard_overhead() {
    // With 256 shards and 100 entries:
    // - Average: ~0.39 entries per shard
    // - Most shards are empty or nearly empty
    // - Every shard still carries its table struct and RwLock,
    //   and any shard that ever held entries keeps its bucket array.
    // For small maps with many shards:
    //   memory overhead >> actual data size
    // DashMap does not deallocate or merge empty shards;
    // they persist for the lifetime of the map.
}

Empty shards still consume memory for their internal structure.
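The sparsity is easy to demonstrate by hashing 100 keys into 256 simulated shards (std's DefaultHasher stands in for DashMap's hasher here):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn main() {
    let shard_count = 256usize;
    let mut per_shard = vec![0u32; shard_count];
    // Distribute 100 keys across 256 shards by hash.
    for key in 0..100u64 {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        per_shard[(h.finish() as usize) % shard_count] += 1;
    }
    let empty = per_shard.iter().filter(|&&n| n == 0).count();
    println!("{} of {} shards are empty", empty, shard_count);
    // 100 keys can occupy at most 100 shards, so at least 156 sit empty,
    // each still paying its structure cost.
    assert!(empty >= 156);
}
```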
use dashmap::DashMap;

fn small_maps() {
    // For small maps, fewer shards are appropriate.
    // Maps with <100 entries (2 is the minimum dashmap allows):
    let _tiny: DashMap<i32, String> = DashMap::with_shard_amount(2);
    // Minimal overhead; contention is acceptable at low concurrency.

    // Maps with ~1000 entries:
    let _small: DashMap<i32, String> = DashMap::with_shard_amount(4);
    // A moderate shard count balances overhead and contention.

    // Maps with ~10000+ entries:
    let _large: DashMap<i32, String> = DashMap::new(); // default
    // Let the default handle it;
    // memory overhead is negligible relative to the data.
}

Small maps benefit from fewer shards to reduce per-shard overhead.
use dashmap::DashMap;

fn large_scale() {
    // For large maps, sharding is efficient.
    // 1 million entries across 64 shards is ~15625 entries per shard;
    // the 64 per-shard structures are negligible next to 1M entries.
    // With 64 cores, each core likely hits a different shard,
    // giving near-optimal parallelism.
    // In this regime, the default shard count is a good choice.
    let _large: DashMap<i32, String> = DashMap::new();
}

For large maps, shard overhead is negligible relative to data size.
use dashmap::DashMap;

fn shard_sizing() {
    // Heuristic: shard count should be proportional to
    // 1. expected concurrent writers,
    // 2. total expected entries,
    // 3. available memory.

    // Thread-heavy, data-light (many threads, few entries):
    // optimize for contention, shard_count ~= thread_count
    // (rounded up to a power of two, as dashmap requires).
    let _for_threads: DashMap<i32, i32> = DashMap::with_shard_amount(
        std::thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(2)
            .next_power_of_two()
            .max(2),
    );

    // Thread-light, data-heavy (few threads, many entries):
    // optimize for memory, keep the shard count small.
    let _for_memory: DashMap<i32, i32> = DashMap::with_shard_amount(4);

    // Balanced: the default works well.
    let _balanced: DashMap<i32, i32> = DashMap::new();
}

Shard count should reflect both concurrency and data scale.
use dashmap::DashMap;

fn capacity_and_shards() {
    // with_capacity_and_shard_amount spreads capacity across shards.
    let _map: DashMap<i32, String> = DashMap::with_capacity_and_shard_amount(1000, 4);
    // Each shard gets ~250 capacity, for ~1000 total.
    // Still 4 separate table allocations, but if you know the total
    // expected size, this pre-allocates each shard efficiently.
}

with_capacity_and_shard_amount distributes capacity across shards.
use dashmap::DashMap;

fn hash_distribution() {
    // Shard selection is conceptually shard_index = hash(key) % shard_count.
    // Poor hash distribution: some shards receive more keys,
    // load is imbalanced, and the popular shards see higher contention.
    // Good hash distribution: keys spread evenly,
    // shards stay balanced, parallelism is preserved.
    // DashMap's default hasher is fast with good distribution,
    // and a custom one can be supplied via with_hasher.
    // Because the shard count is a power of two, selection can use a mask:
    //   shard_index = hash & (shard_count - 1)
    // which is faster than a modulo. DashMap enforces this by requiring
    // a power-of-two shard amount (rather than silently rounding).
    let _map: DashMap<i32, i32> = DashMap::new();
}

Hash quality affects shard balance; shard count affects how keys map to locks.
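The power-of-two requirement is what enables the mask trick: for any power-of-two `n`, `h & (n - 1)` equals `h % n`, as this std-only check demonstrates:

```rust
fn main() {
    let shard_count: u64 = 256; // power of two
    let mask = shard_count - 1;
    // The low bits selected by the mask are exactly the modulo remainder.
    for h in [0u64, 1, 255, 256, 257, 0xDEAD_BEEF, u64::MAX] {
        assert_eq!(h % shard_count, h & mask);
    }
    println!("mask and modulo agree for power-of-two shard counts");
}
```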
use dashmap::DashMap;

fn memory_layout() {
    // DashMap with 4 shards:
    //
    // DashMap
    // ├── Shard 0: RwLock<HashMap>
    // │   └── Bucket array (capacity)
    // ├── Shard 1: RwLock<HashMap>
    // │   └── Bucket array (capacity)
    // ├── Shard 2: RwLock<HashMap>
    // │   └── Bucket array (capacity)
    // └── Shard 3: RwLock<HashMap>
    //     └── Bucket array (capacity)
    //
    // Total: 4 table structures, 4 RwLocks, and up to 4 bucket arrays
    // (one per shard that holds entries or was given capacity up front).
    // Memory ~= overhead_per_shard * shard_count + entries
}

Each shard adds independent allocation overhead.
use dashmap::DashMap;

fn trade_off() {
    // Low shard count (e.g., 2):
    //   PRO: minimal memory overhead
    //   CON: most operations serialize through a couple of locks
    //   Use when: memory-constrained, single-threaded, or low-contention
    // High shard count (e.g., 256):
    //   PRO: minimal contention, high parallelism
    //   CON: high per-shard memory overhead
    //   Use when: high concurrency, memory available
    // Finding the balance:
    //   1. Start with the default (derived from CPU count).
    //   2. Measure contention (lock wait times).
    //   3. If contention is high: increase shards.
    //   4. If under memory pressure: decrease shards.
    // As a guideline:
    //   shard_count ~= max(expected_threads, sqrt(expected_entries))
    //   but this is situation-dependent.
}

Tune shard count based on observed contention and memory constraints.
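The guideline above can be turned into a small helper. The heuristic is the one stated in the text; the power-of-two rounding and minimum of 2 reflect dashmap's requirements, and the function name is illustrative:

```rust
// Heuristic from the text: shard_count ~= max(threads, sqrt(entries)),
// rounded up to a power of two as dashmap requires (minimum 2).
fn suggested_shards(expected_threads: usize, expected_entries: usize) -> usize {
    let sqrt_entries = (expected_entries as f64).sqrt().ceil() as usize;
    expected_threads
        .max(sqrt_entries)
        .max(2)
        .next_power_of_two()
}

fn main() {
    assert_eq!(suggested_shards(16, 10_000), 128); // sqrt(10_000) = 100 -> 128
    assert_eq!(suggested_shards(64, 100), 64); // thread count dominates
    assert_eq!(suggested_shards(1, 4), 2); // clamped to the minimum of 2
    println!("ok");
}
```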
use dashmap::DashMap;

fn resizing() {
    // Each shard resizes independently: when one shard grows,
    // the others are unchanged.
    let _map: DashMap<i32, i32> = DashMap::with_shard_amount(4);
    // If inserted keys happen to hash to shard 0,
    // shard 0 may resize while the others stay small.
    // Good for memory: only growing shards allocate more,
    // and sparse shards don't waste space.
    // But it can cause imbalance: some shards grow larger than others,
    // operations on a full shard are slower,
    // and empty shards still carry their base overhead.
}

Shards resize independently, growing only as needed.
use dashmap::DashMap;
use std::mem::size_of_val;

fn measure_overhead() {
    // The DashMap struct itself is small (just pointers and indices):
    let empty_map: DashMap<i32, i32> = DashMap::with_shard_amount(64);
    println!("DashMap struct size: {} bytes", size_of_val(&empty_map));
    // The heap allocations are what matter:
    // the per-shard table and RwLock structures.
    // To measure heap use, rely on allocator tracking or heap profilers.
    // Rough estimate:
    //   per-shard overhead ~= 100-200 bytes (structure + initial buckets)
    //   64 shards  -> ~6-12 KB overhead
    //   256 shards -> ~25-50 KB overhead
    // Compared to the stored data:
    // for small maps (100 entries), overhead >> data;
    // for large maps (1M entries), overhead << data.
}

Memory overhead scales with shard count, not just entry count.
use dashmap::DashMap;

fn reduce_shards() {
    // Reduce shards when:
    // 1. Memory-constrained environments:
    let _embedded_map: DashMap<i32, i32> = DashMap::with_shard_amount(4);
    // Fewer shards, less overhead.
    // 2. Small total entry count (2 is dashmap's minimum):
    let _small_map: DashMap<String, String> = DashMap::with_shard_amount(2);
    // 100 entries across 2 shards: minimal overhead.
    // 3. Single-threaded access: dashmap has no single-shard mode,
    //    so a plain RwLock<HashMap> (or a bare HashMap) is the better fit.
    // 4. Read-heavy, write-rare:
    let _read_heavy: DashMap<i32, i32> = DashMap::with_shard_amount(4);
    // RwLock allows concurrent reads, so fewer shards are acceptable.
}

Reduce shards for memory-constrained, small, or low-contention scenarios.
use dashmap::DashMap;

fn increase_shards() {
    // Increase shards when:
    // 1. High concurrent writes:
    let _high_write: DashMap<i32, i32> = DashMap::with_shard_amount(128);
    // More shards, less write contention.
    // 2. Many threads:
    let _many_threads: DashMap<i32, i32> = DashMap::with_shard_amount(256);
    // Threads spread across more shards.
    // 3. Large, hot datasets with parallel access patterns:
    let _large_hot: DashMap<String, Vec<u8>> = DashMap::with_shard_amount(256);
    // 4. Observed contention in profiling:
    //    if lock wait times are high, add shards.
}

Increase shards for high-concurrency, write-heavy workloads.
use dashmap::DashMap;
use std::collections::HashMap;
use std::sync::RwLock;

fn compare_alternatives() {
    // RwLock<HashMap>: the single-shard equivalent.
    let _single: RwLock<HashMap<i32, i32>> = RwLock::new(HashMap::new());
    // Minimal overhead, maximum contention
    // (dashmap itself requires at least 2 shards).

    // DashMap with the default shard count: balanced,
    // a good fit for most cases.
    let _balanced: DashMap<i32, i32> = DashMap::new();

    // High-shard DashMap: parallelism optimized, more overhead.
    let _parallel: DashMap<i32, i32> = DashMap::with_shard_amount(256);

    // Other concurrent-map crates make similar sharding trade-offs.
}

RwLock<HashMap> is effectively a single-shard DashMap with minimal overhead.
The trade-off formula:
// Memory overhead ~= shard_count * per_shard_overhead + entries
// Contention ~= concurrent_threads / shard_count (roughly)
// More shards: higher memory overhead, lower contention, better parallelism
// Fewer shards: lower memory overhead, higher contention, worse parallelism

Per-shard overhead components:
// Each shard carries:
// 1. The table structure (~size_of::<HashMap<K, V>>() inline)
// 2. A bucket array (heap allocation once the shard holds entries)
// 3. RwLock state (small, but present)
// Total ~= shard_count * (base_overhead + initial_capacity)

Configuration recommendations:
// Small maps (< 1000 entries):
DashMap::with_shard_amount(4) // or the minimum of 2
// Medium maps (1000-100k entries):
DashMap::new() // default (derived from CPU count)
// Large maps (> 100k entries):
DashMap::new() // or more shards for write-heavy loads
// Memory-constrained:
DashMap::with_shard_amount(2) // or use RwLock<HashMap>
// Write-heavy, high concurrency:
DashMap::with_shard_amount(64) // or higher

Key insight: DashMap shards its data across multiple hash tables to enable concurrent access without global contention, but each shard adds memory overhead from its independent allocation. The default shard count (derived from CPU cores) works well for most cases; memory-sensitive applications should reduce shards, while high-contention workloads should increase them. The overhead per shard is small (hundreds of bytes for structure plus any initial bucket capacity), but becomes significant when shard count greatly exceeds data scale: 256 shards for a 100-entry map wastes memory relative to the data stored.