Rust walkthroughs
How does dashmap::DashMap::shards enable fine-grained locking control for concurrent access patterns?

dashmap::DashMap::shards provides access to the underlying sharded storage: a slice of RwLock-protected map segments that allow fine-grained concurrency control beyond the automatic per-key locking DashMap performs by default. (In current dashmap versions this method is gated behind the raw-api feature.) Each shard holds a portion of the key space, determined by the key's hash, and accessing a shard directly enables batch operations on related keys, custom locking strategies, and avoiding one lock acquisition per operation when several operations target the same shard. The number of shards is fixed at construction time via DashMap::with_shard_amount, and tuning it affects concurrency: more shards mean finer granularity and less contention, but more memory overhead and a greater chance of uneven distribution. The shards() method returns a slice of the raw shard data, useful for operations that benefit from holding one lock across multiple keys or for implementing concurrency patterns the standard DashMap API doesn't directly support.
use dashmap::DashMap;

fn main() {
    // DashMap divides keys across multiple shards.
    // Each shard is protected by its own RwLock.
    let map: DashMap<u32, u32> = DashMap::new();

    // The default shard count is based on CPU count,
    // typically next_power_of_two(num_cpus * 4).
    println!("Default shard count: {}", map.shards().len());

    // Custom shard count (must be a power of two).
    let map: DashMap<u32, u32> = DashMap::with_shard_amount(32);
    println!("Custom shard count: {}", map.shards().len());

    // Keys are routed to shards by hashing the key.
}
Each shard is a separate RwLock<HashMap>, enabling parallel access to different shards.
use dashmap::DashMap;

fn main() {
    let map = DashMap::<u32, String>::new();

    // Insert some values.
    for i in 0..100 {
        map.insert(i, format!("value_{}", i));
    }

    // shards() returns a slice of RwLock-protected shard maps.
    // Raw shards store values wrapped in dashmap::SharedValue.
    for (shard_idx, shard) in map.shards().iter().enumerate() {
        // dashmap's lock returns the guard directly (no Result, no unwrap).
        let guard = shard.read();
        println!("Shard {}: {} entries", shard_idx, guard.len());
        // Inspect the raw contents; SharedValue::get exposes the inner value.
        for (key, value) in guard.iter() {
            println!("  {}: {:?}", key, value.get());
        }
    }
}
The shards() method exposes the underlying storage for advanced operations.
use dashmap::DashMap;

fn main() {
    // Scenario: high-contention workloads benefit from more shards.

    // Few shards - more contention.
    let map_few = DashMap::<u32, u32>::with_shard_amount(4);
    // Many shards - less contention, more parallelism.
    let map_many = DashMap::<u32, u32>::with_shard_amount(256);

    // Trade-offs:
    // More shards:
    //   - Better parallelism under high contention
    //   - More memory overhead (each shard has its own RwLock)
    //   - Keys distributed across more HashMaps
    // Fewer shards:
    //   - Less memory overhead
    //   - Better cache locality within shards
    //   - More contention when threads access the same shard
    // Rule of thumb: the shard count must be a power of two, and should
    // be at least the number of concurrent threads.
    println!("Few shards: {}", map_few.shards().len());
    println!("Many shards: {}", map_many.shards().len());
}
Shard count tuning depends on contention patterns and thread count.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();

    // Insert keys; which shard each lands in depends on its hash.
    for i in 0..1000 {
        map.insert(i, format!("value_{}", i));
    }

    // Standard approach: one lock acquisition per key.
    for i in 0..10 {
        if let Some(mut value) = map.get_mut(&i) {
            *value = format!("updated_{}", i);
        }
    }

    // Batch approach: a single lock acquisition for same-shard keys.
    // First determine which shard a key belongs to, then hold that
    // shard's lock for the whole batch. This is useful when a group of
    // related keys is known to hash to the same shard.
}
Holding a shard lock enables batch operations without repeated lock acquisition.
use dashmap::DashMap;

fn main() {
    let map = DashMap::<u32, String>::with_shard_amount(16);
    let shard_count = map.shards().len();

    // DashMap derives the shard index from the key's hash.
    // The raw API exposes the exact assignment: hash_usize hashes a key
    // with the map's own hasher, and determine_shard maps that hash to
    // a shard index. (A hand-rolled DefaultHasher would not match,
    // because DashMap's default hasher is randomly seeded.)
    for key in [0u32, 100, 200, 300, 400] {
        let shard = map.determine_shard(map.hash_usize(&key));
        println!("Key {} is in shard {} of {}", key, shard, shard_count);
    }
}
Keys are assigned to shards by reducing the key's hash to a shard index; the raw API's determine_shard reports the exact assignment.
use dashmap::DashMap;

fn main() {
    let map = DashMap::<String, Vec<u32>>::new();

    // Group related keys that should be processed together,
    // for example keys with a common prefix.
    for i in 0..100 {
        let prefix = if i < 50 { "group_a" } else { "group_b" };
        map.insert(format!("{}_{}", prefix, i), vec![i]);
    }

    // If all keys in a group hash to the same shard, they can be
    // processed under one lock. For demonstration, scan the shards:
    for (idx, shard) in map.shards().iter().enumerate() {
        let read_guard = shard.read();
        if !read_guard.is_empty() {
            println!("Shard {} has {} entries", idx, read_guard.len());
        }
    }
}
Direct shard access enables custom batch processing patterns.
use dashmap::DashMap;

fn main() {
    // DashMap structure:
    // - An array of shards (a power of two of them)
    // - Each shard holds:
    //   - An RwLock for synchronization
    //   - A HashMap for key-value storage
    let map = DashMap::<u32, String>::with_shard_amount(16);
    println!("Number of shards: {}", map.shards().len());

    // Memory overhead:
    // - Each shard has an RwLock (word-sized state plus any wait queue)
    // - Each HashMap has its own internal allocation
    // - Total overhead ≈ shard_count * (RwLock + empty HashMap)
    // This is why fewer shards can be more memory-efficient
    // when contention is low.
}
Each shard has its own RwLock and HashMap, contributing to memory overhead.
use dashmap::DashMap;

fn main() {
    let map = DashMap::<u32, String>::new();
    for i in 0..100 {
        map.insert(i, format!("value_{}", i));
    }

    // The standard DashMap API handles read/write locks automatically:
    // reads take read locks (multiple readers allowed), writes take
    // write locks (exclusive). With direct shard access, you choose.

    // Read lock - allows concurrent reads on the same shard.
    for shard in map.shards().iter() {
        let read_guard = shard.read();
        // Multiple threads can hold read guards simultaneously.
        println!("Shard has {} entries", read_guard.len());
    }

    // Write lock - exclusive access to the shard.
    for shard in map.shards().iter() {
        let mut write_guard = shard.write();
        // Only one thread can hold the write guard;
        // all other readers and writers are blocked.
        for (_key, value) in write_guard.iter_mut() {
            // Raw shards wrap values in SharedValue; get_mut() gives &mut V.
            value.get_mut().push('!');
        }
    }
}
Direct shard access lets you choose between read and write locks.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();

    // Standard DashMap API: automatic per-key locking.
    // For each operation:
    //   1. Hash the key
    //   2. Determine the shard index
    //   3. Acquire the lock on that shard
    //   4. Perform the operation
    //   5. Release the lock
    map.insert(1, "a"); // lock shard for key 1, insert, unlock
    map.insert(2, "b"); // lock shard for key 2, insert, unlock
    let _ = map.get(&1); // lock shard for key 1, read, unlock

    // If multiple keys land in the same shard:
    //   map.insert(1, ...) - lock shard X
    //   map.insert(3, ...) - lock shard X again (if same shard)
    // That is two lock acquisitions.
    //
    // With direct shard access:
    //   - acquire shard X's lock once
    //   - perform both inserts
    //   - release the lock
    // That is one lock acquisition.
    //
    // Trade-off: direct shard access is more complex,
    // but can be more efficient for batch operations.
}
Direct shard access reduces lock acquisitions for batch operations on the same shard.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;
use std::time::Instant;

fn main() {
    // Benchmark: different shard counts under contention.
    let shard_counts = [4, 16, 64, 256];
    for &count in &shard_counts {
        let map = Arc::new(DashMap::<u32, u32>::with_shard_amount(count));
        let start = Instant::now();
        let handles: Vec<_> = (0..8)
            .map(|_| {
                let map = Arc::clone(&map);
                thread::spawn(move || {
                    for i in 0..10_000 {
                        let key = i % 100; // contention on 100 keys
                        *map.entry(key).or_insert(0) += 1;
                    }
                })
            })
            .collect();
        for handle in handles {
            handle.join().unwrap();
        }
        println!("{} shards: {:?}", count, start.elapsed());
    }
    // Results typically show:
    // - More shards = faster under contention (more parallelism)
    // - Fewer shards = slower under contention (more lock waiting)
    // - Memory usage increases with shard count
}
Higher shard counts improve performance under contention but increase memory overhead.
use dashmap::{DashMap, SharedValue};

fn main() {
    let map = DashMap::<u32, String>::with_shard_amount(16);

    // When bulk loading, grouping keys by shard first lets you hold each
    // shard lock once instead of once per key. Use the raw API for the
    // grouping: a hand-rolled hash would not match DashMap's assignment,
    // and inserting a key into the wrong shard corrupts the map.
    let keys_for_shard_0: Vec<u32> = (0..1000)
        .filter(|k| map.determine_shard(map.hash_usize(k)) == 0)
        .collect();

    // Batch insert into one shard.
    // Raw shards store values wrapped in SharedValue.
    if let Some(shard) = map.shards().first() {
        let mut guard = shard.write();
        for key in keys_for_shard_0 {
            guard.insert(key, SharedValue::new(format!("value_{}", key)));
        }
        // A single lock acquisition covered every insert into this shard.
    }
}
Grouping keys by shard enables batch operations with a single lock acquisition.
use dashmap::DashMap;

fn main() {
    let map = DashMap::<u32, String>::with_shard_amount(8);

    // shards() returns a slice of RwLock-protected shard maps.
    let shards = map.shards();

    // Iterate with index.
    for (i, shard) in shards.iter().enumerate() {
        let guard = shard.read();
        println!("Shard {}: {} entries", i, guard.len());
    }

    // Access a specific shard by index.
    let first_shard = &shards[0];
    let _guard = first_shard.read();

    // Count total entries across all shards.
    let total: usize = shards.iter()
        .map(|s| s.read().len())
        .sum();
    println!("Total entries: {}", total);
}
The shards() slice provides indexed access to each shard's RwLock.
use dashmap::DashMap;

fn main() {
    // Problem: some keys are accessed much more often than others (hot keys).
    // Solution: distribute the hot keys' load across shards.
    let map = DashMap::<String, u64>::with_shard_amount(64);

    // Hot key pattern: suppose "user_0" is accessed 100x more than others.
    // With few shards, this concentrates contention on one lock.
    // Option 1: increase the shard count (done above).
    // Option 2: design keys to spread the hot key's writes. For example,
    // replace "user_0" with "user_0_a", "user_0_b", "user_0_c" (the
    // sharded counter pattern) and sum them when the total is needed.
    // This spreads writes across multiple shards.

    // Reading shards directly for aggregation:
    let total: u64 = map.shards().iter()
        .map(|shard| {
            let guard = shard.read();
            // Raw shard values are SharedValue-wrapped; unwrap with get().
            guard.values().map(|v| *v.get()).sum::<u64>()
        })
        .sum();
    println!("Total across all shards: {}", total);
}
Understanding shards helps design keys that distribute load evenly.
fn main() {
    let shard_counts = [4, 16, 64, 256, 1024];
    println!("Shard count | Approximate overhead");
    println!("------------|---------------------");
    for &count in &shard_counts {
        // Each shard carries:
        // - An RwLock: word-sized state, plus any wait-queue bookkeeping
        // - An empty HashMap: the allocated struct, roughly 48-64 bytes
        // Rough per-shard overhead: ~70-100 bytes.
        let overhead_per_shard = 80; // approximate bytes
        let total_overhead = count * overhead_per_shard;
        println!("{:12} | {} bytes ({} KB)", count, total_overhead, total_overhead / 1024);
    }
    // Note: this is only the shard-structure overhead. The key-value
    // data itself adds to it, and the HashMaps grow as data is added.
}
Shard overhead grows linearly with shard count.
Shard count trade-offs:
| Shards | Contention | Memory | Latency | Parallelism |
|--------|------------|--------|---------|-------------|
| Few (4-16) | Higher | Lower | Higher | Lower |
| Many (64-256) | Lower | Higher | Lower | Higher |
When to tune shard count:
| Scenario | Recommendation |
|----------|----------------|
| High write contention | More shards |
| Many threads | More shards |
| Memory constrained | Fewer shards |
| Low contention workload | Fewer shards |
| Known hot keys | More shards + key design |
Direct shard access use cases:
| Use Case | Benefit |
|----------|---------|
| Batch operations on same-shard keys | Single lock acquisition |
| Custom aggregation across shards | Direct HashMap iteration |
| Debugging/inspecting distribution | See actual key distribution |
| Custom locking semantics | Read vs write lock choice |
Key insight: DashMap::shards exposes the implementation detail that makes DashMap concurrent: instead of a single global lock protecting the entire map, the key space is partitioned across multiple shards, each protected by its own RwLock. This design allows concurrent access to keys in different shards without contention, achieving better parallelism than a single RwLock<HashMap<...>> would allow. The shards() method provides a window into this structure for advanced use cases: batch operations can hold a single shard lock instead of acquiring per-key locks through the standard API, custom aggregation can iterate directly over the shard HashMaps without intermediate wrapper overhead, and debugging can reveal whether keys are evenly distributed across shards. The number of shards is the primary tuning knob: more shards mean finer granularity and less contention under concurrent access, but each additional shard adds memory overhead for its RwLock and internal HashMap. The default, typically derived from the CPU count, is a reasonable starting point; high-contention workloads with many threads may benefit from more shards, while memory-constrained or low-contention scenarios may prefer fewer. The critical design insight is that the hash function determines shard assignment: understanding this lets you design key schemes that distribute load across shards rather than concentrating it, and the shards() method lets you verify your assumptions about the actual distribution.
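The partitioned-lock design described above can be illustrated with a toy std-only type. This ShardedMap is a sketch of the idea only, not dashmap's implementation (dashmap additionally cache-pads its shards, wraps values in SharedValue, and derives the shard index differently):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

// Toy sharded map: one RwLock per shard, keys routed by hash.
struct ShardedMap<K, V> {
    shards: Vec<RwLock<HashMap<K, V>>>,
}

impl<K: Hash + Eq, V: Clone> ShardedMap<K, V> {
    fn new(shard_count: usize) -> Self {
        assert!(shard_count.is_power_of_two());
        Self {
            shards: (0..shard_count).map(|_| RwLock::new(HashMap::new())).collect(),
        }
    }

    // Route a key to a shard: hash, then mask to a power-of-two index.
    fn shard_index(&self, key: &K) -> usize {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        (h.finish() as usize) & (self.shards.len() - 1)
    }

    // Each operation locks only one shard, so operations on keys in
    // different shards proceed in parallel.
    fn insert(&self, key: K, value: V) {
        let idx = self.shard_index(&key);
        self.shards[idx].write().unwrap().insert(key, value);
    }

    fn get(&self, key: &K) -> Option<V> {
        let idx = self.shard_index(key);
        self.shards[idx].read().unwrap().get(key).cloned()
    }

    // Aggregation visits every shard, like summing over DashMap::shards.
    fn len(&self) -> usize {
        self.shards.iter().map(|s| s.read().unwrap().len()).sum()
    }
}

fn main() {
    let map = ShardedMap::new(8);
    for i in 0..100u32 {
        map.insert(i, i * 2);
    }
    assert_eq!(map.get(&21), Some(42));
    println!("{} entries across {} shards", map.len(), map.shards.len());
}
```

Even this toy version shows the central trade-off: per-key operations touch one lock, while whole-map operations such as len() must visit every shard.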