Rust walkthroughs
How do dashmap::DashMap::entry and the standard HashMap::entry compare for concurrent insertions?

DashMap::entry provides atomic check-then-insert semantics under concurrent access by holding a shard-level lock, whereas HashMap::entry has no concurrency guarantees and requires external synchronization. The key trade-off is that DashMap::entry is inherently thread-safe but more expensive due to locking, while HashMap::entry is fast but unsound under concurrent access without external locks. DashMap::entry also has different blocking characteristics: the returned Entry holds a lock on its shard, so any blocking operation performed while the entry is alive blocks all other operations on that shard. In high-contention scenarios this can cause significant throughput degradation compared to DashMap methods that release locks quickly, such as get or insert.
```rust
use std::collections::HashMap;

fn main() {
    let mut map = HashMap::new();

    // Standard entry API: check-and-insert in one operation
    map.entry("key1").or_insert(0);

    // or_insert_with for computed values
    map.entry("key2").or_insert_with(|| {
        println!("Computing default...");
        42
    });

    // Modify an existing entry
    *map.entry("key1").or_insert(0) += 1;

    println!("{:?}", map);

    // This is NOT thread-safe: concurrent access would cause data races.
    // HashMap has no internal synchronization.
}
```

HashMap::entry provides atomic check-and-modify within a single thread, with no concurrency support.
```rust
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();

    // DashMap::entry has a similar API but is thread-safe
    map.entry("key1").or_insert(0);

    // or_insert_with computes the default
    map.entry("key2").or_insert_with(|| {
        println!("Computing default...");
        42
    });

    // Modify an existing entry
    *map.entry("key1").or_insert(0) += 1;

    println!("{:?}", map);

    // This IS thread-safe: DashMap handles synchronization internally.
}
```

DashMap::entry provides the same API but with internal synchronization for thread safety.
```rust
use dashmap::DashMap;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // APPROACH 1: HashMap with an external Mutex
    let map = Arc::new(Mutex::new(HashMap::new()));
    let mut handles = vec![];
    for i in 0..10 {
        let map = Arc::clone(&map);
        let handle = thread::spawn(move || {
            let mut map = map.lock().unwrap();
            // The entire map is locked for the duration of the entry operation
            map.entry(i).or_insert(0);
            *map.entry(i).or_insert(0) += 1;
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("HashMap with Mutex: {:?}", map.lock().unwrap().len());

    // APPROACH 2: DashMap with internal sharding
    let map = Arc::new(DashMap::new());
    let mut handles = vec![];
    for i in 0..10 {
        let map = Arc::clone(&map);
        let handle = thread::spawn(move || {
            // Only one shard is locked at a time
            map.entry(i).or_insert(0);
            *map.entry(i).or_insert(0) += 1;
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("DashMap: {:?}", map.len());
}
```

HashMap with Mutex locks the entire map; DashMap uses fine-grained shard locks.
```rust
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;
use std::time::Instant;

fn main() {
    let map = Arc::new(DashMap::new());

    // HIGH CONTENTION: all threads accessing the same key
    let start = Instant::now();
    let mut handles = vec![];
    for _ in 0..100 {
        let map = Arc::clone(&map);
        let handle = thread::spawn(move || {
            for _ in 0..1000 {
                // All threads contend for the same shard (the key hashes to one shard)
                *map.entry("hot_key").or_insert(0) += 1;
            }
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("High contention time: {:?}", start.elapsed());
    println!("Result: {}", map.get("hot_key").map(|v| *v).unwrap_or(0));

    // LOW CONTENTION: threads accessing different keys (different shards)
    let map2 = Arc::new(DashMap::new());
    let start = Instant::now();
    let mut handles = vec![];
    for thread_id in 0..100 {
        let map = Arc::clone(&map2);
        let handle = thread::spawn(move || {
            for i in 0..1000 {
                // Each thread accesses different keys, likely in different shards
                let key = format!("key_{}_{}", thread_id, i);
                map.entry(key).or_insert(0);
            }
        });
        handles.push(handle);
    }
    for handle in handles {
        handle.join().unwrap();
    }
    println!("Low contention time: {:?}", start.elapsed());
}
```

High contention on a single key serializes access; low contention allows parallel access across shards.
```rust
use dashmap::DashMap;
use std::thread;
use std::time::Duration;

fn main() {
    let map = DashMap::new();

    // IMPORTANT: the Entry holds a lock on its shard for the entire operation,
    // including while the or_insert_with closure executes.
    map.entry("key1").or_insert_with(|| {
        // This closure runs while holding the shard lock!
        // Long operations here block all other accesses to this shard.
        println!("Computing expensive value...");
        thread::sleep(Duration::from_millis(100)); // BAD: blocks the shard
        42
    });

    // BETTER: compute outside, then use or_insert
    let computed_value = {
        // Compute outside of the entry lock
        thread::sleep(Duration::from_millis(100));
        42
    };
    map.entry("key2").or_insert(computed_value);
    // This is fast: just check and insert.

    // EVEN BETTER for read-modify-write:
    if let Some(mut value) = map.get_mut("key1") {
        // Lock held only for the modification
        *value += 1;
    }
}
```

The or_insert_with closure executes while holding a shard lock; expensive computation should happen before entry.
```rust
use dashmap::DashMap;

fn main() {
    // DashMap internally uses multiple "shards" (buckets).
    // The default shard count is roughly 4x the number of CPUs,
    // rounded up to a power of two.
    let map: DashMap<String, i32> = DashMap::new();

    // Each shard has its own lock, so operations on different shards can
    // proceed in parallel. Keys are hashed to determine which shard they
    // belong to; contention occurs when multiple operations hit the same shard.

    // Configure the shard amount for the expected concurrency
    // (the amount must be a power of two):
    let _map2: DashMap<String, i32> = DashMap::with_shard_amount(32); // 32 shards

    // With n shards, roughly n threads can write concurrently
    // (assuming good key distribution).

    // entry() locks ONE shard for the entire operation, while get() and
    // insert() lock the shard only briefly. This means:
    // - entry() is more expensive than the simple operations
    // - entry() blocks other entry() calls on the same shard
    // - entry() also blocks get/insert on the same shard for its duration

    // Note: shards() requires enabling dashmap's `raw-api` feature.
    println!("Shards: {}", map.shards().len());
}
```

DashMap uses multiple shards; entry() locks one shard for the duration of the operation.
```rust
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();

    // entry() - atomic check-then-insert; holds the shard lock during computation.
    // ALTERNATIVES for different use cases:

    // 1. Simple insert (no check needed)
    map.insert("key1", 42); // Faster than entry().or_insert()

    // 2. Check, then insert separately (two operations, but brief locks)
    if !map.contains_key("key2") {
        map.insert("key2", 42);
    }
    // NOT atomic: another thread could insert between the check and the
    // insert, but the lock is released between the operations.

    // 3. Get and modify
    if let Some(mut value) = map.get_mut("key3") {
        *value += 1; // Lock held briefly for the modification
    }

    // 4. Atomic update of an existing key with alter()
    // (a no-op if the key is absent)
    map.alter("key4", |_key, value| value + 1);

    // 5. Upsert pattern: update if present, insert if absent
    map.entry("key5").and_modify(|v| *v += 1).or_insert(1);

    // When entry() is necessary:
    // - Atomic check-then-insert is required
    // - The initial value depends on the key being absent
    // - The pattern is "insert only if it doesn't exist"
}
```

Simple operations like insert and get_mut are faster than entry() but may not be atomic.
```rust
use dashmap::DashMap;
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};
use std::thread;
use std::time::Instant;

fn main() {
    const ITERATIONS: usize = 100_000;
    const THREADS: usize = 8;

    // HashMap + Mutex
    let map = Arc::new(Mutex::new(HashMap::new()));
    let start = Instant::now();
    let handles: Vec<_> = (0..THREADS)
        .map(|_| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                for i in 0..ITERATIONS / THREADS {
                    let key = format!("key_{}", i % 100);
                    let mut map = map.lock().unwrap();
                    *map.entry(key).or_insert(0) += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("HashMap + Mutex: {:?}", start.elapsed());

    // HashMap + RwLock
    let map = Arc::new(RwLock::new(HashMap::new()));
    let start = Instant::now();
    let handles: Vec<_> = (0..THREADS)
        .map(|_| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                for i in 0..ITERATIONS / THREADS {
                    let key = format!("key_{}", i % 100);
                    let mut map = map.write().unwrap();
                    *map.entry(key).or_insert(0) += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("HashMap + RwLock: {:?}", start.elapsed());

    // DashMap
    let map = Arc::new(DashMap::new());
    let start = Instant::now();
    let handles: Vec<_> = (0..THREADS)
        .map(|_| {
            let map = Arc::clone(&map);
            thread::spawn(move || {
                for i in 0..ITERATIONS / THREADS {
                    let key = format!("key_{}", i % 100);
                    *map.entry(key).or_insert(0) += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    println!("DashMap: {:?}", start.elapsed());
}
```

DashMap typically outperforms Mutex<HashMap> due to fine-grained locking, but results vary by contention pattern.
```rust
use dashmap::DashMap;

fn main() {
    let map = DashMap::new();

    // CASE 1: Insert only if the key doesn't exist.
    // entry() is the correct choice.
    fn get_or_create<'a>(map: &DashMap<&'a str, Vec<i32>>, key: &'a str) {
        map.entry(key).or_insert_with(|| {
            // Only executed if the key is absent
            println!("Creating new vec for {}", key);
            Vec::new()
        });
    }

    // CASE 2: Initialize with a value computed from the key
    fn initialize_with_key<'a>(map: &DashMap<&'a str, String>, key: &'a str) {
        let initial = format!("initial_{}", key);
        map.entry(key).or_insert(initial);
    }

    // CASE 3: Count occurrences (a common pattern)
    fn increment_count<'a>(map: &DashMap<&'a str, u32>, key: &'a str) {
        *map.entry(key).or_insert(0) += 1;
    }

    // CASE 4: A plain insert doesn't need entry().
    // This is fine but unnecessary:
    map.entry("simple").or_insert(42);
    // This is better for simple inserts:
    map.insert("simple", 42);

    // CASE 5: Modification doesn't need entry(); use get_mut instead.
    // Unnecessarily complex:
    *map.entry("existing").or_insert(0) += 1;
    // Better (if you know the key exists):
    if let Some(mut v) = map.get_mut("existing") {
        *v += 1;
    }
    // Or use the dedicated combinators:
    map.entry("existing").and_modify(|v| *v += 1).or_insert(1);
}
```

Use entry() for atomic check-then-insert; use simpler methods when the check isn't needed.
```rust
use dashmap::DashMap;
use std::sync::Arc;
use std::thread;

fn main() {
    // POTENTIAL DEADLOCK: holding entry locks across multiple DashMaps
    let map1 = Arc::new(DashMap::new());
    let map2 = Arc::new(DashMap::new());

    // Thread 1: acquires an entry on map1, then needs map2
    let m1 = Arc::clone(&map1);
    let m2 = Arc::clone(&map2);
    let h1 = thread::spawn(move || {
        let mut entry = m1.entry("key1").or_insert(0);
        // The Entry holds a lock on a map1 shard.
        // Accessing map2 while holding the map1 lock:
        *m2.entry("key2").or_insert(0) += 1; // May deadlock
        *entry += 1;
    });

    // Thread 2: acquires an entry on map2, then needs map1
    let m1 = Arc::clone(&map1);
    let m2 = Arc::clone(&map2);
    let h2 = thread::spawn(move || {
        let mut entry = m2.entry("key2").or_insert(0);
        // Accessing map1 while holding the map2 lock:
        *m1.entry("key1").or_insert(0) += 1; // May deadlock
        *entry += 1;
    });

    // This CAN deadlock if the two threads acquire the locks in opposite
    // order. (Joining the handles could therefore hang, so this demo
    // deliberately does not join them.)
    let _ = (h1, h2);

    // SAFER: release each lock before acquiring the next one
    let safe_update = |map1: &DashMap<&str, i32>, map2: &DashMap<&str, i32>, k1: &str, k2: &str| {
        // Read the values one at a time, releasing each lock in between
        let v1 = *map1.entry(k1).or_insert(0);
        let v2 = *map2.entry(k2).or_insert(0);
        // Now update
        map1.insert(k1, v1 + v2);
        map2.insert(k2, v1 + v2);
    };
    let map3 = Arc::new(DashMap::new());
    let map4 = Arc::new(DashMap::new());
    safe_update(&map3, &map4, "key1", "key2");
}
```

Nested entry() calls across multiple DashMaps can deadlock; release locks before acquiring others.
Comparison table:
| Aspect | HashMap::entry | DashMap::entry |
|--------|------------------|------------------|
| Thread safety | None (requires external lock) | Built-in (shard locks) |
| Lock granularity | N/A (single lock for whole map) | Per-shard lock |
| Lock duration | N/A (user controls) | Entire entry operation |
| Contention impact | N/A | Blocks same-shard operations |
| Memory overhead | Minimal | Sharding structure |
| Best for | Single-threaded, external sync | Concurrent access |
When to use DashMap::entry:
- Concurrent access needs atomic check-then-insert semantics
- The initial value depends on the key being absent
- Any or_insert_with computation is fast

When to prefer HashMap::entry:
- Single-threaded code
- The map is already protected by external synchronization (e.g. a Mutex)
Performance considerations:
```rust
// FASTEST for a simple insert (no check)
map.insert(key, value);

// FAST for check-then-insert (two operations, but brief locks);
// NOT atomic, but often acceptable
if !map.contains_key(&key) {
    map.insert(key, value);
}

// CORRECT for atomic check-then-insert;
// slower because the lock is held, but atomic
map.entry(key).or_insert(value);

// SLOW if the computation is expensive:
// the shard lock is held during the computation!
map.entry(key).or_insert_with(|| expensive_computation());

// BETTER for expensive computation: compute outside the lock
let value = expensive_computation();
map.entry(key).or_insert(value);
```

Key insight: DashMap::entry provides the atomicity of HashMap::entry with built-in thread safety, but the cost is lock duration. The entry holds a shard lock for the entire operation, including any computation in or_insert_with. Under high contention on a single shard (many threads accessing keys that hash to the same shard), this serialization defeats the parallelism that makes DashMap fast. The trade-off is clear: entry() gives atomicity but holds locks longer; separate contains_key() and insert() calls release locks quickly but aren't atomic. For most concurrent insert scenarios, DashMap::entry is the right choice: just keep the lock duration minimal by computing expensive values before entering the entry API. When contention is extreme and atomicity isn't required, consider simpler methods like insert that release locks immediately.