What are the trade-offs between Arc<Mutex<T>> and Arc<RwLock<T>> for shared mutable state?

Arc<Mutex<T>> provides exclusive access to the inner value, allowing only one thread to hold the lock at a time regardless of operation type. Arc<RwLock<T>> allows multiple concurrent readers or one exclusive writer, providing better throughput for read-heavy workloads but with higher per-operation overhead. The choice depends on the ratio of reads to writes, contention patterns, and whether the lock coordination overhead of RwLock is justified by the concurrency gains. Mutex is simpler and has lower overhead; RwLock enables parallel reads at the cost of more complex internal state management.

Basic Mutex Usage

use std::sync::{Arc, Mutex};
use std::thread;
 
fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    
    let mut handles = vec![];
    
    for i in 0..3 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let mut guard = data.lock().unwrap();
            guard.push(i);
        }));
    }
    
    for handle in handles {
        handle.join().unwrap();
    }
    
    println!("{:?}", *data.lock().unwrap());
}

Mutex provides exclusive access: only one thread can hold the lock.

Basic RwLock Usage

use std::sync::{Arc, RwLock};
use std::thread;
 
fn main() {
    let data = Arc::new(RwLock::new(vec![1, 2, 3]));
    
    let mut handles = vec![];
    
    // Multiple readers can hold lock simultaneously
    for _ in 0..3 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let guard = data.read().unwrap();
            println!("Reader sees: {:?}", *guard);
        }));
    }
    
    // Writer needs exclusive access
    let data_write = Arc::clone(&data);
    handles.push(thread::spawn(move || {
        let mut guard = data_write.write().unwrap();
        guard.push(4);
    }));
    
    for handle in handles {
        handle.join().unwrap();
    }
}

RwLock allows concurrent readers but exclusive writers.

Concurrent Reader Demonstration

use std::sync::{Arc, RwLock};
use std::thread;
use std::time::Instant;
 
fn main() {
    let data = Arc::new(RwLock::new(vec![0u64; 1_000_000]));
    let start = Instant::now();
    
    // Multiple concurrent readers
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let guard = data.read().unwrap();
            guard.iter().sum::<u64>()
        }));
    }
    
    let results: Vec<_> = handles.into_iter()
        .map(|h| h.join().unwrap())
        .collect();
    
    println!("RwLock concurrent readers: {:?}", start.elapsed());
    println!("Results: {:?}", results);
}

All readers execute concurrently with RwLock.

Mutex Blocks All Operations

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Instant;
 
fn main() {
    let data = Arc::new(Mutex::new(vec![0u64; 1_000_000]));
    let start = Instant::now();
    
    // Readers execute one at a time
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            let guard = data.lock().unwrap();
            guard.iter().sum::<u64>()
        }));
    }
    
    let results: Vec<_> = handles.into_iter()
        .map(|h| h.join().unwrap())
        .collect();
    
    println!("Mutex readers: {:?}", start.elapsed());
    println!("Results: {:?}", results);
}

Readers execute sequentially with Mutex.

Read-Heavy Workload Performance

use std::sync::{Arc, Mutex, RwLock};
use std::thread;
use std::time::Instant;
 
fn main() {
    // 95% reads, 5% writes
    let iterations = 100_000;
    let read_ratio = 0.95;
    
    // Mutex benchmark
    let mutex_data = Arc::new(Mutex::new(vec![0u64; 10_000]));
    let start = Instant::now();
    
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&mutex_data);
        handles.push(thread::spawn(move || {
            for i in 0..iterations {
                // Interleave reads and writes rather than running them in two phases
                if ((i % 100) as f64 / 100.0) < read_ratio {
                    let guard = data.lock().unwrap();
                    let _ = guard.len();
                } else {
                    let mut guard = data.lock().unwrap();
                    guard[0] += 1;
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let mutex_time = start.elapsed();
    
    // RwLock benchmark
    let rwlock_data = Arc::new(RwLock::new(vec![0u64; 10_000]));
    let start = Instant::now();
    
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&rwlock_data);
        handles.push(thread::spawn(move || {
            for i in 0..iterations {
                // Interleave reads and writes rather than running them in two phases
                if ((i % 100) as f64 / 100.0) < read_ratio {
                    let guard = data.read().unwrap();
                    let _ = guard.len();
                } else {
                    let mut guard = data.write().unwrap();
                    guard[0] += 1;
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let rwlock_time = start.elapsed();
    
    println!("Read-heavy Mutex: {:?}", mutex_time);
    println!("Read-heavy RwLock: {:?}", rwlock_time);
    // RwLock typically faster for read-heavy workloads
}

Concurrent reads provide throughput gains when reads dominate.

Write-Heavy Workload Performance

use std::sync::{Arc, Mutex, RwLock};
use std::thread;
use std::time::Instant;
 
fn main() {
    // 90% writes, 10% reads
    let iterations = 100_000;
    let write_ratio = 0.90;
    
    // Mutex benchmark
    let mutex_data = Arc::new(Mutex::new(0u64));
    let start = Instant::now();
    
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&mutex_data);
        handles.push(thread::spawn(move || {
            for i in 0..iterations {
                // Interleave writes and reads rather than running them in two phases
                if ((i % 100) as f64 / 100.0) < write_ratio {
                    let mut guard = data.lock().unwrap();
                    *guard += 1;
                } else {
                    let guard = data.lock().unwrap();
                    let _ = *guard;
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let mutex_time = start.elapsed();
    
    // RwLock benchmark
    let rwlock_data = Arc::new(RwLock::new(0u64));
    let start = Instant::now();
    
    let mut handles = vec![];
    for _ in 0..4 {
        let data = Arc::clone(&rwlock_data);
        handles.push(thread::spawn(move || {
            for i in 0..iterations {
                // Interleave writes and reads rather than running them in two phases
                if ((i % 100) as f64 / 100.0) < write_ratio {
                    let mut guard = data.write().unwrap();
                    *guard += 1;
                } else {
                    let guard = data.read().unwrap();
                    let _ = *guard;
                }
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    let rwlock_time = start.elapsed();
    
    println!("Write-heavy Mutex: {:?}", mutex_time);
    println!("Write-heavy RwLock: {:?}", rwlock_time);
    // Mutex often faster for write-heavy: lower overhead
}

When writes dominate, RwLock overhead outweighs concurrency benefits.

RwLock Internal Structure

use std::sync::RwLock;
 
fn main() {
    // RwLock has more internal state than Mutex:
    // - Reader count (number of active readers)
    // - Writer waiting flag/queue
    // - Fairness mechanism (potentially)
    
    // This overhead exists for every lock/unlock operation
    let lock = RwLock::new(42);
    
    // Read lock increments reader count
    {
        let _guard1 = lock.read().unwrap();  // reader_count = 1
        let _guard2 = lock.read().unwrap();  // reader_count = 2
        // Both hold lock simultaneously
    }  // Guards dropped, reader_count = 0
    
    // Write lock checks: no readers, no writers
    {
        let mut guard = lock.write().unwrap();  // writer active
        *guard += 1;
    }
}

RwLock tracks reader count and writer state, adding overhead.

Mutex Internal Structure

use std::sync::Mutex;
 
fn main() {
    // Mutex has minimal internal state:
    // - Locked/unlocked flag
    // - Possibly a waiter queue
    
    let lock = Mutex::new(42);
    
    // Lock: check flag, set to locked, proceed
    {
        let _guard = lock.lock().unwrap();
        // No reader count to track: just a locked/unlocked flag
    }
    // Unlock: clear flag, wake waiters
}

Mutex has simpler state management with lower overhead.

Writer Starvation in RwLock

use std::sync::{Arc, RwLock};
use std::thread;
 
fn main() {
    let data = Arc::new(RwLock::new(0));
    
    // Continuous readers can starve writers
    let reader_handles: Vec<_> = (0..3)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                for _ in 0..1_000_000 {
                    if let Ok(guard) = data.read() {
                        let _ = *guard;
                    }
                }
                println!("Reader {} done", i);
            })
        })
        .collect();
    
    // Writer may struggle to acquire
    let writer = {
        let data = Arc::clone(&data);
        thread::spawn(move || {
            for _ in 0..10 {
                if let Ok(mut guard) = data.write() {
                    *guard += 1;
                }
            }
            println!("Writer done");
        })
    };
    
    for h in reader_handles {
        h.join().unwrap();
    }
    writer.join().unwrap();
}

Continuous readers can prevent writers from acquiring the lock.

Fair RwLock Behavior

use std::sync::{Arc, RwLock};
 
fn main() {
    // std::sync::RwLock behavior depends on platform:
    // - Some implementations are writer-preferring (fair to writers)
    // - Some are reader-preferring (may starve writers)
    
    // RwLock is NOT fair by default in many implementations
    // A steady stream of readers can indefinitely delay writers
    
    // Mitigation strategies:
    // 1. Keep read locks short
    // 2. Use Mutex when writes are important
    // 3. Consider parking_lot::RwLock for more control
    
    let lock = RwLock::new(42);
    
    // Short read lock duration reduces writer starvation
    {
        let guard = lock.read().unwrap();
        let _value = *guard;
    }  // Guard dropped immediately: short hold time reduces starvation
}

Writer starvation depends on implementation and lock hold time.
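One way to apply mitigation 1 above is to clone a snapshot under a brief read lock and do the slow work on the copy, outside the lock. A minimal sketch (the snapshot helper is an illustrative name, not a standard API):

```rust
use std::sync::RwLock;

// Clone the data under a short read lock; process the copy lock-free.
// Readers never hold the lock during slow work, so a waiting writer
// gets a chance to acquire between snapshots.
fn snapshot(lock: &RwLock<Vec<u64>>) -> Vec<u64> {
    lock.read().unwrap().clone() // guard dropped as soon as the clone finishes
}

fn main() {
    let lock = RwLock::new(vec![1, 2, 3]);
    let data = snapshot(&lock);
    // Expensive processing happens here, outside the lock.
    let sum: u64 = data.iter().sum();
    println!("sum = {}", sum);
}
```

The trade-off is the cost of the clone; this pays off when processing time dominates copy time.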

Poisoning Behavior

use std::sync::{Arc, Mutex, RwLock};
use std::thread;
 
fn main() {
    // Both Mutex and RwLock can become poisoned
    
    // Mutex poisoning
    let mutex = Arc::new(Mutex::new(42));
    let mutex_clone = Arc::clone(&mutex);
    
    let handle = thread::spawn(move || {
        let mut guard = mutex_clone.lock().unwrap();
        *guard += 1;
        panic!("Panic while holding mutex");  // Poison the mutex
    });
    
    handle.join().unwrap_err();
    
    // Now the mutex is poisoned
    match mutex.lock() {
        Ok(_) => println!("Got lock"),
        Err(poison_err) => {
            // Can still access the data via into_inner()
            let value = poison_err.into_inner();
            println!("Mutex poisoned, recovered value: {}", *value);
        }
    }
    
    // RwLock poisons similarly, but only a panic while holding the
    // WRITE lock poisons it; a panicking reader leaves it unpoisoned
}

Both Mutex and RwLock can become poisoned when a thread panics while holding exclusive access.

Poisoning Recovery

use std::sync::{Arc, Mutex};
 
fn main() {
    let mutex = Arc::new(Mutex::new(vec![1, 2, 3]));
    
    // Option 1: Recover and continue
    let _guard = mutex.lock().unwrap_or_else(|poison_err| {
        println!("Mutex poisoned, recovering");
        poison_err.into_inner()
    });
    
    // Option 2: clear the poison flag in place (Rust 1.77+)
    // mutex.clear_poison();
    
    // Option 3: use locks that never poison
    // (parking_lot's Mutex and RwLock don't track poisoning)
}

Poisoning can be handled or recovered from.
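Clearing the poison flag can be demonstrated end to end; clear_poison has been stable since Rust 1.77, and poison_then_clear below is an illustrative helper:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Poison a mutex by panicking while holding it, then recover in place
// with clear_poison (stable since Rust 1.77).
fn poison_then_clear(mutex: &Arc<Mutex<i32>>) -> i32 {
    let m = Arc::clone(mutex);
    let _ = thread::spawn(move || {
        let _guard = m.lock().unwrap();
        panic!("poison the mutex");
    })
    .join(); // join returns Err because the thread panicked

    assert!(mutex.is_poisoned());
    mutex.clear_poison(); // reset the poison flag; the data is unchanged
    *mutex.lock().unwrap() // lock() returns Ok again
}

fn main() {
    let mutex = Arc::new(Mutex::new(7));
    println!("recovered value: {}", poison_then_clear(&mutex));
}
```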

Read Guard vs Write Guard

use std::sync::RwLock;
 
fn main() {
    let lock = RwLock::new(vec![1, 2, 3]);
    
    // Read guard: shared reference, read-only
    {
        let guard1 = lock.read().unwrap();
        let guard2 = lock.read().unwrap();  // OK: multiple readers
        
        println!("Reader 1: {:?}", *guard1);
        println!("Reader 2: {:?}", *guard2);
        // Cannot modify through read guard
        // guard1.push(4);  // Compile error
    }
    
    // Write guard: exclusive reference, mutable
    {
        let mut guard = lock.write().unwrap();
        guard.push(4);  // OK: exclusive access
        println!("Writer: {:?}", *guard);
    }
    
    // Cannot hold read and write simultaneously
    // let read_guard = lock.read().unwrap();
    // let write_guard = lock.write().unwrap();  // Blocks forever
}

Read guards provide shared access; write guards provide exclusive access.

Downgrading Write Lock

use std::sync::RwLock;
 
fn main() {
    let lock = RwLock::new(vec![1, 2, 3]);
    
    // Acquire write lock
    let write_guard = lock.write().unwrap();
    
    // Modify while holding write lock
    // ...make changes...
    
    // std::sync::RwLock has no stable way to downgrade a write
    // guard to a read guard
    
    // parking_lot::RwLock provides a safe downgrade:
    // let read_guard = RwLockWriteGuard::downgrade(write_guard);
    
    // Standard library: release and reacquire (note: another writer
    // may acquire the lock in the gap between drop and read)
    drop(write_guard);
    let _read_guard = lock.read().unwrap();
}

std::sync::RwLock doesn't provide safe lock downgrading.

Try-Lock Variants

use std::sync::{Arc, Mutex, RwLock};
use std::thread;
 
fn main() {
    // Non-blocking lock attempts
    
    let mutex = Arc::new(Mutex::new(42));
    let rwlock = Arc::new(RwLock::new(42));
    
    // Mutex try_lock
    match mutex.try_lock() {
        Ok(guard) => println!("Mutex acquired: {}", *guard),
        Err(_) => println!("Mutex already locked"),
    }
    
    // RwLock try_read and try_write
    match rwlock.try_read() {
        Ok(guard) => println!("Read lock acquired: {}", *guard),
        Err(_) => println!("Read lock not available"),
    }
    
    match rwlock.try_write() {
        Ok(mut guard) => {
            *guard += 1;
            println!("Write lock acquired: {}", *guard);
        }
        Err(_) => println!("Write lock not available"),
    }
}

try_lock, try_read, and try_write attempt acquisition without blocking.
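The try variants also enable bounded retry loops, where a writer backs off instead of blocking behind readers. A sketch (try_increment and the retry policy are illustrative):

```rust
use std::sync::RwLock;
use std::thread;

// Retry try_write a bounded number of times, yielding between attempts.
// Returns true if the increment succeeded, false if the lock stayed busy.
fn try_increment(lock: &RwLock<u64>, max_attempts: u32) -> bool {
    for _ in 0..max_attempts {
        if let Ok(mut guard) = lock.try_write() {
            *guard += 1;
            return true;
        }
        thread::yield_now(); // back off briefly before retrying
    }
    false
}

fn main() {
    let lock = RwLock::new(0u64);
    if try_increment(&lock, 10) {
        println!("incremented to {}", *lock.read().unwrap());
    } else {
        println!("could not acquire write lock");
    }
}
```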

Deadlock Scenarios

use std::sync::{Arc, Mutex, RwLock};
use std::thread;
 
fn main() {
    // Mutex deadlock: classic ABBA pattern
    let mutex_a = Arc::new(Mutex::new(0));
    let mutex_b = Arc::new(Mutex::new(0));
    
    // Thread 1: lock A then B
    // Thread 2: lock B then A
    // Result: deadlock
    
    // RwLock has additional deadlock scenarios:
    // - Acquiring a write lock while holding a read lock on the same thread
    // - Recursive reads: a second read() on the same thread can deadlock
    //   if a writer is queued between them (writer-preferring implementations)
    
    let rwlock = Arc::new(RwLock::new(42));
    
    // This deadlocks:
    // let read_guard = rwlock.read().unwrap();
    // let write_guard = rwlock.write().unwrap();  // Blocks forever
    
    // Rule: never try to acquire write while holding read
}

RwLock has additional deadlock scenarios from read/write interaction.
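The classic fix for the ABBA pattern is a global lock-acquisition order: if every thread takes A before B, a wait cycle cannot form. A sketch (transfer is an illustrative helper):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Every caller acquires the locks in the same global order (A before B),
// so the ABBA deadlock cycle cannot form.
fn transfer(a: &Mutex<i32>, b: &Mutex<i32>, amount: i32) {
    let mut from = a.lock().unwrap(); // always lock A first...
    let mut to = b.lock().unwrap();   // ...then B
    *from -= amount;
    *to += amount;
}

fn main() {
    let a = Arc::new(Mutex::new(100));
    let b = Arc::new(Mutex::new(0));
    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let handle = thread::spawn(move || transfer(&a2, &b2, 30));
    transfer(&a, &b, 20);
    handle.join().unwrap();
    println!("a = {}, b = {}", *a.lock().unwrap(), *b.lock().unwrap());
}
```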

Memory Overhead Comparison

use std::sync::{Mutex, RwLock};
use std::mem::size_of;
 
fn main() {
    // Memory size comparison
    println!("Mutex<bool>: {} bytes", size_of::<Mutex<bool>>());
    println!("RwLock<bool>: {} bytes", size_of::<RwLock<bool>>());
    
    // RwLock is typically larger:
    // - Needs to track reader count
    // - Needs writer waiting state
    // - More complex internal state
    
    // For many instances, this can matter:
    // struct Data {
    //     values: Vec<Mutex<i32>>,  // Smaller
    // }
    // vs
    // struct Data {
    //     values: Vec<RwLock<i32>>,  // Larger
    // }
}

RwLock has larger memory footprint than Mutex.

Granular Locking Strategy

use std::sync::{Arc, Mutex, RwLock};
use std::collections::HashMap;
 
fn main() {
    // Coarse-grained: single lock for entire structure
    let coarse = Arc::new(RwLock::new(HashMap::<String, i32>::new()));
    
    // Fine-grained: lock per element or segment
    struct FineGrained {
        data: HashMap<String, i32>,
        locks: HashMap<String, Mutex<()>>,  // Lock per key
    }
    
    // Fine-grained reduces contention
    // but increases complexity
    
    // Alternative: sharded locks
    struct Sharded<K, V> {
        shards: Vec<RwLock<HashMap<K, V>>>,
    }
    
    // Use hash to determine shard
    fn get_shard<K: std::hash::Hash>(key: &K, num_shards: usize) -> usize {
        use std::collections::hash_map::DefaultHasher;
        use std::hash::{Hash, Hasher};
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish() as usize % num_shards
    }
}

Fine-grained locking reduces contention at the cost of complexity.
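Filling in the sharded sketch above, a minimal working sharded map might look like this (ShardedMap and its method names are illustrative, not a standard type):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::sync::RwLock;

// Each shard has its own RwLock, so operations on keys that hash to
// different shards never contend with each other.
struct ShardedMap {
    shards: Vec<RwLock<HashMap<String, i32>>>,
}

impl ShardedMap {
    fn new(num_shards: usize) -> Self {
        let shards = (0..num_shards)
            .map(|_| RwLock::new(HashMap::new()))
            .collect();
        ShardedMap { shards }
    }

    // Hash the key to pick a shard.
    fn shard_for(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish() as usize % self.shards.len()
    }

    fn insert(&self, key: String, value: i32) {
        let idx = self.shard_for(&key);
        self.shards[idx].write().unwrap().insert(key, value);
    }

    fn get(&self, key: &str) -> Option<i32> {
        let idx = self.shard_for(key);
        self.shards[idx].read().unwrap().get(key).copied()
    }
}

fn main() {
    let map = ShardedMap::new(8);
    map.insert("alpha".to_string(), 1);
    map.insert("beta".to_string(), 2);
    println!("alpha = {:?}", map.get("alpha"));
}
```

Note that insert and get take &self: interior mutability comes from the per-shard locks, so the map can be shared across threads behind a plain Arc.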

When to Prefer Mutex

use std::sync::{Arc, Mutex};
 
// Use Mutex when:
// 1. Write-heavy workloads (> 30% writes)
// 2. Lock hold time is very short
// 3. Simplicity is preferred
// 4. Fair access between readers and writers matters
// 5. Memory overhead is a concern
 
struct Counter {
    count: i64,
}
 
impl Counter {
    fn increment(&mut self) {
        self.count += 1;  // Very short hold time
    }
    
    fn get(&self) -> i64 {
        self.count
    }
}
 
fn main() {
    let counter = Arc::new(Mutex::new(Counter { count: 0 }));
    
    // Short hold times, frequent writes: Mutex is appropriate here
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let counter = Arc::clone(&counter);
            std::thread::spawn(move || {
                for _ in 0..1_000 {
                    counter.lock().unwrap().increment();
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("count: {}", counter.lock().unwrap().get());
}

Mutex is better for write-heavy workloads or when simplicity matters.

When to Prefer RwLock

use std::sync::{Arc, RwLock};
use std::collections::HashMap;
 
// Use RwLock when:
// 1. Read-heavy workloads (> 70% reads)
// 2. Read operations take significant time
// 3. Many concurrent readers benefit from parallelism
// 4. Writes are infrequent
 
struct Cache {
    data: HashMap<String, String>,
}
 
impl Cache {
    fn get(&self, key: &str) -> Option<&String> {
        self.data.get(key)  // Frequent, may be slow
    }
    
    fn insert(&mut self, key: String, value: String) {
        self.data.insert(key, value);  // Infrequent
    }
}
 
fn main() {
    let cache = Arc::new(RwLock::new(Cache {
        data: HashMap::new(),
    }));
    
    // Infrequent write takes the exclusive lock briefly
    cache.write().unwrap().insert("key".to_string(), "value".to_string());
    
    // Many concurrent reads can then share the read lock
    let guard = cache.read().unwrap();
    if let Some(v) = guard.get("key") {
        println!("cached: {}", v);
    }
}

RwLock is better for read-heavy workloads with meaningful read times.

Performance Heuristics Summary

// Performance characteristics:
 
// Uncontended (single thread), rough and platform-dependent:
//   Mutex: ~20-30ns per lock/unlock
//   RwLock read: ~25-35ns per lock/unlock
//   RwLock write: ~25-35ns per lock/unlock
//   Difference: minimal
 
// Contended (multiple threads):
//   Mutex: serialization overhead, but simpler wake-up
//   RwLock read: parallel reads possible, lower total wait
//   RwLock write: serialization + wait for readers to finish
 
// Memory:
//   Mutex: smaller footprint
//   RwLock: larger (reader count, state)
 
// Decision heuristic:
//   < 70% reads: prefer Mutex
//   > 90% reads: prefer RwLock
//   70-90% reads: benchmark both

The break-even point depends on specific workload characteristics.

Summary Table

Aspect                      | Arc<Mutex<T>> | Arc<RwLock<T>>
----------------------------|---------------|------------------------
Concurrent readers          | No            | Yes
Concurrent writer + reader  | No            | No
Per-operation overhead      | Lower         | Higher
Memory footprint            | Smaller       | Larger
Write-heavy performance     | Better        | Worse
Read-heavy performance      | Worse         | Better
Writer fairness             | Equal access  | May starve
Lock types                  | One (lock)    | Two (read, write)
Downgrade support           | N/A           | Limited (not in std)
Deadlock scenarios          | Standard      | Read/write interaction
Complexity                  | Simple        | Moderate

Synthesis

The choice between Arc<Mutex<T>> and Arc<RwLock<T>> is about trading simplicity and overhead for read concurrency:

Mutex: Single lock type, lower overhead, fair access for all operations. Every lock acquisition is identical: wait for exclusive access, use the data, release. For write-heavy workloads or when lock hold times are short, Mutex wins because the coordination overhead of RwLock isn't justified. The simpler state machine inside Mutex means faster uncontended operations and smaller memory footprint.

RwLock: Two lock types (read/write), higher overhead, enables concurrent reads. When reads dominate and hold times are meaningful, multiple readers can proceed in parallel, providing throughput that Mutex cannot match. However, RwLock can starve writers—a continuous stream of readers may indefinitely delay a waiting writer. The coordination overhead (tracking reader count, managing writer queues) adds cost to every operation.

Key insight: The overhead difference matters most for high-frequency, short-duration locks. If you're locking millions of times per second for nanosecond operations, Mutex overhead adds up. If reads hold the lock for microseconds or longer, RwLock parallelism wins despite higher per-operation cost. Measure your specific workload: count read vs write frequency, measure lock hold duration, and benchmark both approaches. For contested write-heavy data, Mutex is often the right choice despite being "less capable"—the complexity of RwLock coordination becomes pure overhead when writes dominate. For read-heavy caches or configuration where reads vastly outnumber writes, RwLock concurrency wins.