Rust walkthroughs
Arc<str> vs String for storing shared string data across threads?
Sharing immutable string data across threads efficiently requires understanding the memory layout and ownership semantics of different string types. Arc<str> and String represent fundamentally different approaches to shared string storage, with distinct performance characteristics and ergonomic trade-offs.
String owns its heap-allocated data. Cloning a String allocates new memory and copies the entire string contents:
fn demonstrate_string_clone() {
    let s1 = String::from("hello world");
    // Full heap allocation and copy
    let s2 = s1.clone();
    // Two separate allocations on the heap
    // s1 points to one, s2 points to another
}
Arc<str> uses reference counting to share a single allocation:
use std::sync::Arc;
fn demonstrate_arc_clone() {
    let s1: Arc<str> = Arc::from("hello world");
    // Only increments reference count
    let s2 = Arc::clone(&s1);
    // Both point to the same heap allocation
    // No string data is copied
}
This difference becomes significant when the same string is shared across many threads or locations.
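The difference can be observed directly by comparing data pointers. The following is a minimal sketch; the helper name `same_buffer` is made up for illustration and is not part of the standard library.

```rust
use std::sync::Arc;

/// Returns true when both string slices start at the same address.
/// (Hypothetical helper for this illustration.)
fn same_buffer(a: &str, b: &str) -> bool {
    a.as_ptr() == b.as_ptr()
}

fn main() {
    let s1 = String::from("hello world");
    let s2 = s1.clone();
    // Each String owns its own buffer, so the data pointers differ.
    assert!(!same_buffer(&s1, &s2));

    let a1: Arc<str> = Arc::from("hello world");
    let a2 = Arc::clone(&a1);
    // Both handles refer to one allocation, and the strong count records two owners.
    assert!(same_buffer(&a1, &a2));
    assert!(Arc::ptr_eq(&a1, &a2));
    assert_eq!(Arc::strong_count(&a1), 2);

    // Dropping a handle only decrements the count; the data stays alive.
    drop(a2);
    assert_eq!(Arc::strong_count(&a1), 1);
}
```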
Understanding the memory layout reveals the trade-offs:
use std::sync::Arc;
fn memory_layout() {
    // String: 24 bytes on the stack on 64-bit targets (pointer + length + capacity)
    let s: String = String::from("hello");
    // Heap: 5 bytes for "hello" (plus allocator overhead)
    // Arc<str>: 16 bytes on the stack on 64-bit targets (fat pointer: pointer + length)
    let a: Arc<str> = Arc::from("hello");
    // Heap: 8 bytes for the strong count + 8 bytes for the weak count
    // + 5 bytes for "hello", all in one allocation
}
The String type stores a pointer, length, and capacity, allowing for mutation and growth. Arc<str> is a fat pointer that stores a pointer and a length; there is no capacity field because the allocation always fits the string exactly.
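The stack sizes can be checked with std::mem::size_of. This is a small sketch; `handle_sizes_in_words` is a name invented here, and the exact layout of String is an implementation detail of the standard library rather than a documented guarantee. On current toolchains it reports three words for String, two for Arc<str> and Box<str>, and one for Arc<String>.

```rust
use std::mem::size_of;
use std::sync::Arc;

/// Stack size of each handle type, measured in machine words.
/// (Hypothetical helper for this illustration.)
fn handle_sizes_in_words() -> [(&'static str, usize); 4] {
    let word = size_of::<usize>();
    [
        ("String", size_of::<String>() / word),
        ("Arc<str>", size_of::<Arc<str>>() / word),
        ("Box<str>", size_of::<Box<str>>() / word),
        ("Arc<String>", size_of::<Arc<String>>() / word),
    ]
}

fn main() {
    for (name, words) in handle_sizes_in_words() {
        println!("{name}: {words} word(s)");
    }
}
```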
When sharing strings across threads, Arc<str> avoids repeated allocations:
use std::sync::Arc;
use std::thread;
fn share_string_arc() {
    let shared: Arc<str> = Arc::from("configuration_value");
    let handles: Vec<_> = (0..10)
        .map(|_| {
            let cloned = Arc::clone(&shared); // O(1) - just increment counter
            thread::spawn(move || {
                println!("Thread sees: {}", cloned);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Reference count decrements as threads finish
    // Memory freed when last reference drops
}
fn share_string_clone() {
    let shared: String = String::from("configuration_value");
    let handles: Vec<_> = (0..10)
        .map(|_| {
            let cloned = shared.clone(); // O(n) - allocates and copies
            thread::spawn(move || {
                println!("Thread sees: {}", cloned);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // 10 separate allocations were made, now being freed
}
For small numbers of threads with small strings, the difference is negligible. For many threads sharing large strings, Arc<str> provides substantial savings.
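The reference-count behaviour described in the comments can be verified by having each worker return a value and checking the strong count after joining. This is a sketch with an invented function name, `lengths_seen_by_workers`.

```rust
use std::sync::Arc;
use std::thread;

/// Spawns `workers` threads that each read the shared string and
/// returns the length every thread observed.
/// (Hypothetical helper for this illustration.)
fn lengths_seen_by_workers(shared: &Arc<str>, workers: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // O(1): bumps the atomic strong count, no string data is copied.
            let local = Arc::clone(shared);
            thread::spawn(move || local.len())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let shared: Arc<str> = Arc::from("configuration_value");
    let lengths = lengths_seen_by_workers(&shared, 10);
    assert_eq!(lengths, vec![shared.len(); 10]);
    // Every worker dropped its handle before join returned,
    // so only the original handle remains.
    assert_eq!(Arc::strong_count(&shared), 1);
}
```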
Creating an Arc<str> has higher initial overhead:
use std::sync::Arc;
fn creation_costs() {
    // String creation: one allocation
    let s = String::from("hello world");
    // Arc<str> from a literal: one allocation that holds the reference
    // counts followed by a copy of the string bytes
    // The literal's static storage is not reused
    let a: Arc<str> = Arc::from("hello world");
    // Arc<str> from String: allocates new memory with ref counts and copies the bytes
    // Original String allocation is freed
    let s = String::from("hello world");
    let a: Arc<str> = Arc::from(s); // Re-allocation!
}
Converting from String to Arc<str> requires a new allocation and a copy because the reference counts must be placed adjacent to the string data. This is a key consideration:
use std::sync::Arc;
// BAD: Double allocation
let s = String::from("hello"); // First allocation
let a: Arc<str> = Arc::from(s); // Second allocation (String's memory freed)
// GOOD: Single allocation, copied straight from the source slice
let a: Arc<str> = "hello".into();
// GOOD: For runtime strings, accept the cost knowing it saves future clones
let input = read_user_input(); // placeholder for any function returning a String
let shared: Arc<str> = Arc::from(input);
// Now cloning is cheap
Arc<str> gets no special treatment for string literals:
use std::sync::Arc;
fn literal_behavior() {
    // String literals are embedded in the binary
    let a: Arc<str> = Arc::from("literal");
    // Arc::from still allocates: the reference counts must sit next to
    // the string data, so the bytes are copied out of static storage
    // into a new heap allocation, and reference counting works as usual
    // Compare to String, which also allocates exactly once
    let s: String = String::from("literal"); // Heap allocation
}
Creating either type from a literal therefore costs one allocation and one copy. If a string really is a compile-time literal, a plain &'static str can be shared across threads with no allocation at all.
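Where the bytes end up can be checked by comparing data pointers before and after conversion. In the current standard library, Arc::from copies the bytes into a fresh allocation both for a String and for a string literal. The sketch below uses two invented helper names, `conversion_copies` and `literal_is_copied`.

```rust
use std::sync::Arc;

/// Returns true when converting `s` moved its bytes to a different address.
/// (Hypothetical helper for this illustration.)
fn conversion_copies(s: String) -> bool {
    let before = s.as_ptr();
    let shared: Arc<str> = Arc::from(s);
    shared.as_ptr() != before
}

/// Returns true when the Arc's bytes live somewhere other than the
/// literal's static storage.
/// (Hypothetical helper for this illustration.)
fn literal_is_copied(lit: &'static str) -> bool {
    let shared: Arc<str> = Arc::from(lit);
    shared.as_ptr() != lit.as_ptr()
}

fn main() {
    // String -> Arc<str>: the bytes are copied next to the reference counts.
    assert!(conversion_copies(String::from("hello world")));
    // Literal -> Arc<str>: the bytes are copied out of static storage as well.
    assert!(literal_is_copied("literal"));
}
```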
Arc<str> is inherently immutable. You cannot modify the string contents:
use std::sync::Arc;
fn immutability() {
    let mut a: Arc<str> = Arc::from("hello");
    // Cannot modify contents
    // a.push_str(" world"); // Compile error!
    // Can only replace the entire Arc
    a = Arc::from("hello world"); // Points to new allocation
    // String supports mutation
    let mut s = String::from("hello");
    s.push_str(" world"); // In-place modification
}
This immutability is what enables safe sharing across threads without locks.
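One consequence is that an "update" always produces a new allocation, while existing clones keep seeing the old value. A minimal sketch, using an invented helper named `with_suffix`:

```rust
use std::sync::Arc;

/// Builds a new shared string by appending `suffix`.
/// The original allocation is left untouched.
/// (Hypothetical helper for this illustration.)
fn with_suffix(original: &Arc<str>, suffix: &str) -> Arc<str> {
    let mut owned = String::with_capacity(original.len() + suffix.len());
    owned.push_str(original);
    owned.push_str(suffix);
    Arc::from(owned)
}

fn main() {
    let mut current: Arc<str> = Arc::from("hello");
    let snapshot = Arc::clone(&current);

    // Replace the handle; the snapshot still points at the old data.
    current = with_suffix(&current, " world");

    assert_eq!(&*current, "hello world");
    assert_eq!(&*snapshot, "hello");
    assert!(!Arc::ptr_eq(&current, &snapshot));
}
```

Readers holding `snapshot` never observe a partially updated string, which is why no lock is needed around the data itself.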
Arc<str> excels when the same string appears in many places:
use std::sync::Arc;
use std::collections::HashMap;
struct StringInterner {
    strings: HashMap<Arc<str>, ()>,
}
impl StringInterner {
    fn new() -> Self {
        StringInterner {
            strings: HashMap::new(),
        }
    }
    fn intern(&mut self, s: &str) -> Arc<str> {
        // Hash lookup by &str (Arc<str> implements Borrow<str>),
        // instead of scanning every key
        if let Some((existing, _)) = self.strings.get_key_value(s) {
            return Arc::clone(existing);
        }
        let arc: Arc<str> = Arc::from(s);
        self.strings.insert(Arc::clone(&arc), ());
        arc
    }
}
// Usage: All instances of "user_12345" share one allocation
let mut interner = StringInterner::new();
let id1 = interner.intern("user_12345");
let id2 = interner.intern("user_12345");
// id1 and id2 point to the same memory
This pattern is common in compilers, symbol tables, and configuration systems.
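When the interner itself must be used from several threads, one option is to put the set behind a Mutex. The sketch below is an assumption about how that could look, not part of the original design; `SharedInterner` and its methods are invented names, and a HashSet stands in for the map with unit values.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};
use std::thread;

/// A thread-safe interner sketch. (Hypothetical type for this illustration.)
#[derive(Default)]
struct SharedInterner {
    strings: Mutex<HashSet<Arc<str>>>,
}

impl SharedInterner {
    /// Returns the shared handle for `s`, inserting it on first use.
    fn intern(&self, s: &str) -> Arc<str> {
        let mut set = self.strings.lock().unwrap();
        // Lookup by &str works because Arc<str> implements Borrow<str>.
        if let Some(existing) = set.get(s) {
            return Arc::clone(existing);
        }
        let arc: Arc<str> = Arc::from(s);
        set.insert(Arc::clone(&arc));
        arc
    }

    /// Number of distinct strings stored.
    fn len(&self) -> usize {
        self.strings.lock().unwrap().len()
    }
}

fn main() {
    let interner = Arc::new(SharedInterner::default());
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let interner = Arc::clone(&interner);
            thread::spawn(move || interner.intern("user_12345"))
        })
        .collect();
    let ids: Vec<Arc<str>> = handles.into_iter().map(|h| h.join().unwrap()).collect();

    // Every thread received a handle to the same allocation.
    assert!(ids.windows(2).all(|pair| Arc::ptr_eq(&pair[0], &pair[1])));
    assert_eq!(interner.len(), 1);
}
```

The lock is held only for the lookup or insert; the returned Arc<str> handles can then be read without any synchronization.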
Sharing configuration across threads is a canonical use case:
use std::collections::HashMap;
use std::sync::Arc;
use std::thread;
// Placeholder request and response types so the example compiles on its own
struct Request;
#[derive(Default)]
struct Response;
#[derive(Clone)]
struct Config {
    database_url: Arc<str>,
    api_key: Arc<str>,
    feature_flags: HashMap<Arc<str>, bool>,
}
struct RequestHandler {
    config: Config,
}
impl RequestHandler {
    fn new(config: Config) -> Self {
        RequestHandler { config }
    }
    fn handle(&self, request: Request) -> Response {
        // Access config without copying strings
        if self.config.feature_flags.get("new_feature").copied().unwrap_or(false) {
            // Handle with new behavior
        }
        Response::default()
    }
}
fn spawn_handlers(config: Config, count: usize) {
    for _ in 0..count {
        // Clones Arc<str> pointers, not string data
        // (the HashMap table itself is still copied)
        let config = config.clone();
        thread::spawn(move || {
            let handler = RequestHandler::new(config);
            // Handle requests
        });
    }
}
String remains the right choice when you need mutation or exclusive ownership:
// Building strings incrementally
fn build_output(items: &[Item]) -> String {
    let mut output = String::with_capacity(1024);
    for item in items {
        output.push_str(&item.name);
        output.push_str(": ");
        output.push_str(&item.value);
        output.push('\n');
    }
    output
}
// Processing and transforming
fn process(input: String) -> String {
    let mut result = input;
    result.make_ascii_uppercase();
    // Note: truncate panics if byte 100 is not on a char boundary
    result.truncate(100);
    result
}
// When only one owner exists
struct Logger {
    buffer: String, // Only Logger owns this
}
The Rust ecosystem offers several string storage options:
use std::sync::Arc;
fn compare_types() {
    // String: Owned, mutable, heap-allocated
    let s: String = String::from("hello");
    // &str: Borrowed slice, no ownership
    let s: &str = "hello";
    // Arc<str>: Shared immutable, reference-counted
    let s: Arc<str> = Arc::from("hello");
    // Arc<String>: Also shared and immutable through the Arc
    // (mutation needs Arc::get_mut, Arc::make_mut, or a lock such as Mutex)
    // More overhead: Arc points to String which points to data
    let s: Arc<String> = Arc::new(String::from("hello"));
    // Box<str>: Owned immutable, no capacity field
    // Slightly smaller than String, but can't grow
    let s: Box<str> = "hello".into();
}
Arc<String> exists but is rarely the best choice:
use std::sync::Arc;
fn arc_string_double_indirection() {
    // Arc<String> has two levels of indirection
    let a: Arc<String> = Arc::new(String::from("hello"));
    // Stack -> Arc allocation (counts + String struct) -> heap data
    // Arc<str> has one level
    let a: Arc<str> = Arc::from("hello");
    // Stack -> Arc allocation (counts + str data together)
}
| Operation | String | Arc<str> |
|-----------|--------|----------|
| Creation from literal | One allocation + copy | One allocation + copy |
| Creation from runtime data | One allocation | One allocation (a second one when converting from String) |
| Clone | O(n) allocation + copy | O(1) atomic ref count increment |
| Pass to thread | O(n) clone | O(1) clone |
| Mutation | O(1) amortized append | Not possible in place |
| Stack size (64-bit) | 24 bytes | 16 bytes |
| Deallocation | O(1) | O(1) + atomic decrement |
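On the ergonomic side, all of the owned types compared above dereference to &str, so read-only functions can take &str and work with any of them. A minimal sketch, with `word_count` as an invented example function:

```rust
use std::sync::Arc;

/// Counts whitespace-separated words. Taking &str lets callers pass
/// String, Box<str>, Arc<str>, or Arc<String> by reference.
/// (Hypothetical function for this illustration.)
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn main() {
    let owned: String = String::from("hello shared world");
    let boxed: Box<str> = owned.clone().into_boxed_str();
    let shared: Arc<str> = Arc::from(owned.as_str());
    let nested: Arc<String> = Arc::new(owned.clone());

    // Deref coercion turns each reference into &str at the call site.
    assert_eq!(word_count(&owned), 3);
    assert_eq!(word_count(&boxed), 3);
    assert_eq!(word_count(&shared), 3);
    assert_eq!(word_count(&nested), 3);
}
```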
use std::collections::HashSet;
use std::sync::Arc;
// Use Arc<str> when:
// 1. Multiple threads need the same string
let config_value: Arc<str> = Arc::from(load_config()); // load_config is a placeholder
// 2. Many clones needed across codebase
#[derive(Clone)]
struct RequestContext {
    request_id: Arc<str>, // Cloned cheaply
    user_id: Arc<str>,
}
// 3. String interning / deduplication
struct SymbolTable {
    symbols: HashSet<Arc<str>>,
}
// Use String when:
// 1. Building or modifying strings
let mut log = String::new();
log.push_str("Event: ");
log.push_str(&event_name);
// 2. Single owner, no sharing needed
struct LocalBuffer {
    data: String,
}
// 3. Need to resize or append
fn accumulate(items: &[&str]) -> String {
    items.join(", ")
}
Arc<str> and String serve different ownership patterns. Arc<str> provides cheap cloning through reference counting, making it ideal for sharing immutable strings across threads or throughout a codebase. The trade-off is immutability and a slightly higher initial creation cost when converting from String.
String remains the workhorse for building, modifying, and owning string data when sharing isn't needed. Its mutation capabilities and straightforward ownership model make it appropriate for local string processing.
Choose Arc<str> when multiple owners need read access to the same string data, especially across thread boundaries. Choose String when you need mutation, are building strings incrementally, or have a single owner. The conversion from String to Arc<str> costs one extra allocation and copy, roughly the price of a single String clone, so it typically pays for itself once the shared string has been cloned a few times.
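The two types also combine naturally: build the value with a mutable String, then convert it once and share the result. A minimal sketch, assuming a made-up `build_shared_report` function and made-up entry data:

```rust
use std::sync::Arc;

/// Builds a line-per-entry report with a mutable String, then freezes
/// it into an Arc<str> for cheap sharing.
/// (Hypothetical function for this illustration.)
fn build_shared_report(entries: &[(&str, &str)]) -> Arc<str> {
    let mut output = String::new();
    for (name, value) in entries {
        output.push_str(name);
        output.push_str(": ");
        output.push_str(value);
        output.push('\n');
    }
    // One extra allocation and copy here; every later clone is O(1).
    Arc::from(output)
}

fn main() {
    let report = build_shared_report(&[("host", "localhost"), ("port", "8080")]);
    assert_eq!(&*report, "host: localhost\nport: 8080\n");

    // Handing the report to another owner copies no string data.
    let for_worker = Arc::clone(&report);
    assert!(Arc::ptr_eq(&report, &for_worker));
}
```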