Rust walkthroughs
dashmap::DashMap sharding strategy vs RwLock&lt;HashMap&gt;?

DashMap uses a sharding strategy that divides the hash map into multiple internal maps (shards), each protected by its own lock, allowing concurrent operations on different keys to proceed in parallel. RwLock&lt;HashMap&gt; uses a single lock for the entire map, serializing all write operations and allowing concurrent reads only when no writes are active. DashMap excels under high concurrent write loads with low contention per key, while RwLock&lt;HashMap&gt; performs better when reads vastly dominate writes or when operations frequently access the same keys. The choice depends on contention patterns: DashMap reduces contention through parallelism, while RwLock&lt;HashMap&gt; offers simpler semantics with lower overhead per operation when contention is low.
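To make the sharding idea concrete, here is a minimal, hypothetical sketch of how a key could be mapped to a shard. The `shard_index` helper is illustrative only; dashmap's real implementation picks bits of the hash differently and uses its own hasher.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Map a key to one of `shard_count` shards by masking its hash.
// `shard_count` is assumed to be a power of two, so masking works
// like a fast modulo.
fn shard_index<K: Hash>(key: &K, shard_count: usize) -> usize {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    (hasher.finish() as usize) & (shard_count - 1)
}

fn main() {
    let shards = 16;
    for key in ["alpha", "beta", "gamma"] {
        println!("{} -> shard {}", key, shard_index(&key, shards));
    }
}
```

Two operations contend only when their keys land on the same shard index; everything else proceeds under independent locks.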
use std::collections::HashMap;
use std::sync::RwLock;

struct SharedMap {
    map: RwLock<HashMap<String, u32>>,
}

impl SharedMap {
    fn new() -> Self {
        Self {
            map: RwLock::new(HashMap::new()),
        }
    }

    fn insert(&self, key: String, value: u32) {
        let mut map = self.map.write().unwrap();
        map.insert(key, value);
    }

    fn get(&self, key: &str) -> Option<u32> {
        let map = self.map.read().unwrap();
        map.get(key).copied()
    }
}

fn main() {
    let shared = SharedMap::new();
    shared.insert("a".to_string(), 1);
    println!("a = {:?}", shared.get("a"));
}
A single RwLock protects the entire map, allowing concurrent reads but serializing writes.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();
    // No need to manually acquire locks
    map.insert("a", 1);
    map.insert("b", 2);
    // Concurrent access is automatic
    println!("a = {:?}", map.get("a"));
    println!("b = {:?}", map.get("b"));
}
DashMap handles locking internally, with multiple shards for parallelism.
use dashmap::DashMap;

fn main() {
    // Default shard count: roughly the number of CPU cores * 4,
    // rounded up to the next power of two
    let _map = DashMap::<String, u32>::new();
    // Operations on different keys can proceed in parallel
    // because they likely map to different shards:
    //   shard 1 handles keys that hash to shard 1,
    //   shard 2 handles keys that hash to shard 2, etc.
    // Each shard has its own RwLock, so contention only occurs
    // when two operations hit the same shard.
}
DashMap divides keys across N shards, each with independent locks.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // RwLock<HashMap>: writes are serialized
    let rwlock_map = Arc::new(RwLock::new(HashMap::new()));
    let handles: Vec<_> = (0..8)
        .map(|i| {
            let map = rwlock_map.clone();
            thread::spawn(move || {
                for j in 0..1000 {
                    let key = format!("key_{}_{}", i, j);
                    let mut m = map.write().unwrap(); // All writers wait
                    m.insert(key, j);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("RwLock writes completed");
}
RwLock serializes all writes, creating a bottleneck under write-heavy load.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    // DashMap: writes to different shards proceed in parallel
    let dashmap = Arc::new(DashMap::new());
    let handles: Vec<_> = (0..8)
        .map(|i| {
            let map = dashmap.clone();
            thread::spawn(move || {
                for j in 0..1000 {
                    let key = format!("key_{}_{}", i, j);
                    map.insert(key, j); // Only blocks on the same shard
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("DashMap writes completed");
}
DashMap allows concurrent writes to different shards, improving throughput.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    let map = Arc::new(RwLock::new(HashMap::new()));
    // Populate
    for i in 0..1000 {
        map.write().unwrap().insert(i, i * 2);
    }
    // RwLock: concurrent reads work well
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    let _ = map.read().unwrap().get(&(i % 1000));
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
RwLock allows concurrent reads, performing well under read-heavy workloads.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    let map = Arc::new(DashMap::new());
    // Populate
    for i in 0..1000 {
        map.insert(i, i * 2);
    }
    // DashMap: reads also benefit from sharding
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    let _ = map.get(&(i % 1000));
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
DashMap reads also benefit from shard-level locking.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    let map = Arc::new(DashMap::new());
    // Initialize with data
    for i in 0..100 {
        map.insert(i, 0);
    }
    let writers: Vec<_> = (0..4)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    // Updates spread across shards
                    map.entry(i % 100).and_modify(|v| *v += 1);
                }
            })
        })
        .collect();
    let readers: Vec<_> = (0..4)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    let _ = map.get(&(i % 100));
                }
            })
        })
        .collect();
    for h in writers.into_iter().chain(readers) {
        h.join().unwrap();
    }
}
DashMap handles mixed workloads well when operations distribute across shards.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // When all threads access the SAME key,
    // RwLock overhead is lower
    let map = Arc::new(RwLock::new(HashMap::new()));
    map.write().unwrap().insert("counter", 0u32);
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for _ in 0..1000 {
                    let mut m = map.write().unwrap();
                    if let Some(v) = m.get_mut("counter") {
                        *v += 1;
                    }
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("Counter: {}", map.read().unwrap().get("counter").unwrap());
}
When all operations hit the same key, sharding provides no benefit.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    // Single key: all threads hit the same shard
    let map = Arc::new(DashMap::new());
    map.insert("counter", 0u32);
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for _ in 0..1000 {
                    // Still serializes on the same shard
                    map.entry("counter").and_modify(|v| *v += 1);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("Counter: {}", *map.get("counter").unwrap());
}
Same-key access still contends on a single shard in DashMap.
use dashmap::DashMap;

fn main() {
    // Default: roughly number of CPUs * 4, rounded to a power of two
    let _default_map = DashMap::<String, u32>::new();
    // Custom shard count (must be a power of two)
    let _custom_map = DashMap::<String, u32>::with_shard_amount(16);
    // More shards: less contention, but more memory overhead.
    // Fewer shards: less overhead, but more contention.
    // Choose based on the expected concurrency level:
    //   many concurrent operations -> more shards,
    //   few concurrent operations -> fewer shards.
}
Shard count affects the contention/memory trade-off.
use std::collections::HashMap;
use std::sync::RwLock;

fn main() {
    // RwLock<HashMap>: a single HashMap behind a single RwLock
    let _rwlock_map: RwLock<HashMap<String, u32>> = RwLock::new(HashMap::new());
    // Memory: ~base HashMap size + RwLock overhead.
    // With 1000 entries: a single allocation for the table.
}
RwLock<HashMap> has minimal overhead beyond the HashMap itself.
use dashmap::DashMap;

fn main() {
    // DashMap: multiple HashMaps, each behind its own RwLock
    let _dashmap = DashMap::<String, u32>::new();
    // Memory: N shards * (HashMap + RwLock).
    // With the default 16 shards on a 4-core system:
    //   16 * (HashMap overhead + RwLock).
    // Each empty shard still allocates internal structure.
}
DashMap has overhead proportional to the number of shards.
use std::collections::HashMap;
use std::sync::RwLock;

fn main() {
    let map = RwLock::new(HashMap::new());
    // Must manually acquire the lock to write
    {
        let mut m = map.write().unwrap();
        m.insert("key", "value");
    } // Lock released here
    // Reads also require lock acquisition
    {
        let m = map.read().unwrap();
        if let Some(v) = m.get("key") {
            println!("Value: {}", v);
        }
    }
    // Complex operations need careful lock management
    {
        let mut m = map.write().unwrap();
        if let Some(v) = m.get_mut("key") {
            *v = "new_value";
        }
    }
}
RwLock<HashMap> requires explicit lock management.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();
    // No explicit lock acquisition needed
    map.insert("key", "value");
    // Reads are straightforward
    if let Some(v) = map.get("key") {
        println!("Value: {}", *v);
    }
    // Updates are ergonomic
    map.entry("key").and_modify(|v| *v = "new_value");
    // Iteration returns guards that hold shard locks
    for entry in map.iter() {
        println!("{}: {}", entry.key(), entry.value());
    }
}
DashMap provides a more ergonomic API with automatic lock handling.
use std::collections::HashMap;
use std::sync::RwLock;

fn main() {
    let map = RwLock::new(HashMap::new());
    // RwLock: the entry API requires holding the write lock
    {
        let mut m = map.write().unwrap();
        *m.entry("key".to_string()).or_insert(0) += 1;
    }
}
RwLock entry operations hold the write lock for the entire operation.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();
    // DashMap: the entry API handles locking internally
    map.entry("key")
        .and_modify(|v| *v += 1)
        .or_insert(0);
    // Only the relevant shard is locked during the operation
}
DashMap's entry API locks only the relevant shard.
use std::collections::HashMap;
use std::sync::RwLock;

fn main() {
    let map = RwLock::new(HashMap::new());
    for i in 0..10 {
        map.write().unwrap().insert(i, i * 2);
    }
    // RwLock: the read lock is held for the entire iteration
    {
        let m = map.read().unwrap();
        for (k, v) in m.iter() {
            println!("{}: {}", k, v);
        }
        // Write operations are blocked until iteration completes
    }
}
RwLock iteration blocks all writes for the entire duration.
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();
    for i in 0..10 {
        map.insert(i, i * 2);
    }
    // DashMap: iter() locks shards one at a time;
    // each shard lock is released before the next is acquired,
    // so writes to other shards can proceed during iteration
    for entry in map.iter() {
        println!("{}: {}", entry.key(), entry.value());
    }
}
DashMap iterates by locking shards sequentially, not all at once.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    let map = Arc::new(RwLock::new(HashMap::new()));
    // Populate
    for i in 0..10000 {
        map.write().unwrap().insert(i, i * 2);
    }
    // 99% reads, 1% writes
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    if i % 100 == 0 {
                        // 1% writes
                        let mut m = map.write().unwrap();
                        m.insert(i % 10000, i);
                    } else {
                        // 99% reads
                        let _ = map.read().unwrap().get(&(i % 10000));
                    }
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
RwLock can perform well under read-heavy workloads due to concurrent read support.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    let map = Arc::new(DashMap::new());
    // Populate
    for i in 0..10000 {
        map.insert(i, i * 2);
    }
    // 99% reads, 1% writes
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    if i % 100 == 0 {
                        // 1% writes
                        map.insert(i % 10000, i);
                    } else {
                        // 99% reads
                        let _ = map.get(&(i % 10000));
                    }
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
}
DashMap also handles read-heavy workloads, with reads distributed across shards.
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Instant;

fn main() {
    let map = Arc::new(RwLock::new(HashMap::new()));
    let start = Instant::now();
    let handles: Vec<_> = (0..8)
        .map(|thread_id| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    // High write contention: all threads serialize on the write lock
                    let key = format!("key_{}", (thread_id * 10000 + i) % 100);
                    let mut m = map.write().unwrap();
                    m.insert(key, i);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("RwLock write-heavy: {:?}", start.elapsed());
}
Write-heavy workloads cause contention on RwLock.
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;
use std::time::Instant;

fn main() {
    let map = Arc::new(DashMap::new());
    let start = Instant::now();
    let handles: Vec<_> = (0..8)
        .map(|thread_id| {
            let map = map.clone();
            thread::spawn(move || {
                for i in 0..10000 {
                    // Keys spread across shards, reducing contention
                    let key = format!("key_{}", (thread_id * 10000 + i) % 100);
                    map.insert(key, i);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("DashMap write-heavy: {:?}", start.elapsed());
}
DashMap distributes writes across shards, reducing contention.
RwLock<HashMap> is the better fit when:
1. The workload is read-dominated with rare writes: RwLock allows many concurrent readers.
2. Overall contention is low: a single lock costs less than the sharding infrastructure.
3. All threads access the same small set of keys: sharding doesn't help when all operations hit the same shard.
4. The application is memory-sensitive: DashMap carries overhead for multiple shards.
5. Simplicity matters more than performance: a single RwLock is easier to reason about.
RwLock<HashMap> is better for read-heavy or low-contention scenarios.
DashMap is the better fit when:
1. The workload is write-heavy: multiple writes can proceed in parallel on different shards.
2. Access is highly concurrent with well-distributed keys: operations on different keys rarely contend.
3. Reads and writes are mixed: operations on different shards proceed in parallel.
4. An ergonomic API is wanted: DashMap handles locking automatically.
5. The map is large with many concurrent operations: sharding amortizes lock contention.
DashMap excels under write contention with distributed key access.
| Aspect | RwLock<HashMap> | DashMap |
|--------|-----------------|---------|
| Lock granularity | Single lock | Per-shard locks |
| Concurrent reads | Yes (shared lock) | Yes (shard-level) |
| Concurrent writes | No (exclusive lock) | Yes (different shards) |
| Mixed operations | Reads block writes | Per-shard blocking |
| Memory overhead | Minimal | Higher (N shards) |
| API complexity | Explicit lock mgmt | Automatic locking |
| Same-key contention | Good (single lock) | Same (one shard) |
| Best for | Read-heavy, low contention | Write-heavy, distributed keys |
The trade-off between DashMap and RwLock<HashMap> centers on contention management:
Sharding distributes contention: DashMap's fundamental strategy is to divide the map into N independent shards, each with its own lock. When thread A operates on key X and thread B operates on key Y, and these keys hash to different shards, both operations proceed in parallel. This transforms a global contention point (single lock) into localized contention (shard locks), dramatically improving throughput when keys are well-distributed.
RwLock simplicity has merit: A single RwLock imposes lower overhead per operation and simpler memory management. When reads vastly dominate writes, RwLock allows many concurrent readers with minimal coordination. When operations frequently hit the same small key set, sharding provides no benefit: all operations still serialize on one shard's lock, plus you pay the sharding overhead.
Key insight: The decision hinges on contention patterns, not just operation counts. A high write rate with keys uniformly distributed across the hash space benefits from DashMap's sharding. A high write rate on the same few keys gains nothing from sharding. Similarly, many concurrent readers benefit from RwLock's shared read mode, but if reads are interspersed with writes, RwLock causes reader-writer contention that DashMap's per-shard approach mitigates.
Choose RwLock<HashMap> when reads dominate overwhelmingly, the key set is small or access frequently overlaps, memory overhead matters, or simplicity is preferred. Choose DashMap when writes are frequent, keys distribute well across the hash space, you need concurrent reads and writes on different keys, or you want automatic lock handling for cleaner code.