# How does `indexmap::IndexMap` maintain insertion order while still providing O(1) lookups?

IndexMap combines the best of two data-structure worlds: the O(1) lookup performance of hash tables and the insertion-order iteration of sequences. It achieves both through a dual-indexing design that maintains a hash table for fast lookups alongside a dense array for ordered iteration.
Standard `HashMap` provides O(1) lookups but makes no guarantees about iteration order:

```rust
use std::collections::HashMap;

let mut map = HashMap::new();
map.insert("first", 1);
map.insert("second", 2);
map.insert("third", 3);

// Iteration order is unpredictable
for (key, value) in &map {
    println!("{}: {}", key, value);
}
// Output order varies based on hash values and internal layout
```

Internally, HashMap organizes entries by hash bucket, not insertion order, and rehashing during growth can completely reorder the layout.
IndexMap maintains two parallel data structures:

```rust
// Conceptual representation
struct IndexMap<K, V> {
    entries: Vec<Entry<K, V>>, // Dense array for ordered iteration
    indices: RawTable<usize>,  // Hash table mapping hash -> index
}

struct Entry<K, V> {
    hash: u64, // Cached hash for fast lookup
    key: K,
    value: V,
}
```

The `entries` vector stores key-value pairs in insertion order, enabling ordered iteration. The `indices` hash table maps key hashes to positions in `entries`, enabling O(1) lookup.
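The dual-structure idea can be sketched with nothing but the standard library: a `Vec` of pairs for order plus a `HashMap` from key to position for lookups. This is a simplified stand-in, not indexmap's actual code — the real crate caches each entry's hash and uses a raw table of indices rather than a `HashMap` keyed by the key itself:

```rust
use std::collections::HashMap;

// Toy ordered map: a dense Vec for order, a HashMap for O(1) key -> position.
struct ToyOrderedMap {
    entries: Vec<(String, i32)>,     // insertion-ordered storage
    indices: HashMap<String, usize>, // key -> position in `entries`
}

impl ToyOrderedMap {
    fn new() -> Self {
        Self { entries: Vec::new(), indices: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: i32) {
        if let Some(&i) = self.indices.get(key) {
            // Existing key: update in place, keeping its position
            self.entries[i].1 = value;
        } else {
            // New key: append and record its index
            self.indices.insert(key.to_string(), self.entries.len());
            self.entries.push((key.to_string(), value));
        }
    }

    fn get(&self, key: &str) -> Option<i32> {
        // Step 1: hash lookup for the position; step 2: dense array access
        self.indices.get(key).map(|&i| self.entries[i].1)
    }

    fn keys(&self) -> Vec<String> {
        self.entries.iter().map(|(k, _)| k.clone()).collect()
    }
}

fn main() {
    let mut m = ToyOrderedMap::new();
    m.insert("first", 1);
    m.insert("second", 2);
    m.insert("third", 3);
    assert_eq!(m.get("second"), Some(2));
    assert_eq!(m.keys(), ["first", "second", "third"]);
    println!("order: {:?}", m.keys());
}
```

Even this toy version exhibits both properties: lookups go through the hash map in O(1), while iteration walks the dense vector in insertion order.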
When you look up a key, IndexMap uses a two-step process:

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
map.insert("apple", 1);
map.insert("banana", 2);
map.insert("cherry", 3);

// Lookup: "banana"
let value = map.get("banana");
```

The lookup proceeds as follows:

1. Compute `hash("banana")`
2. Query `indices` with that hash to get the position in `entries`
3. Access `entries[position]`

```rust
// Simplified lookup implementation
fn get(&self, key: &K) -> Option<&V> {
    let hash = self.hash(key);
    // Step 1: O(1) hash table lookup
    let index = self.indices.get(hash, |&idx| {
        // Verify the key matches (handle collisions)
        self.entries[idx].key == *key
    })?;
    // Step 2: O(1) array access
    Some(&self.entries[*index].value)
}
```

Both steps are O(1), so the overall lookup is O(1) amortized.
Insertion adds entries at the end of the vector and creates a hash table entry:

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
map.insert("first", 1);  // entries: [("first", 1)]
                         // indices: { hash("first") -> 0 }
map.insert("second", 2); // entries: [("first", 1), ("second", 2)]
                         // indices: { hash("first") -> 0, hash("second") -> 1 }
```

The process:

1. Compute the hash of the key
2. If the key already exists, update its value in place (its position is unchanged)
3. Otherwise, append the entry to `entries` and store the new index in `indices`

```rust
// Simplified insertion
fn insert(&mut self, key: K, value: V) -> Option<V> {
    let hash = self.hash(&key);
    // Check if key exists
    if let Some(index) = self.indices.get_mut(hash, |&idx| {
        self.entries[idx].key == key
    }) {
        // Key exists: update value, return old
        let old = std::mem::replace(&mut self.entries[*index].value, value);
        return Some(old);
    }
    // New key: append and index
    let index = self.entries.len();
    self.entries.push(Entry { hash, key, value });
    self.indices.insert(hash, index, |&idx| {
        self.entries[idx].hash
    });
    None
}
```

Iteration simply walks the entries vector:
```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
map.insert("a", 1);
map.insert("b", 2);
map.insert("c", 3);

// Iteration is in insertion order
for (key, value) in &map {
    println!("{}: {}", key, value);
}
// Output: a: 1, b: 2, c: 3 (always)
```

The iterator implementation:

```rust
// Iter walks the entries vector
impl<'a, K, V> Iterator for Iter<'a, K, V> {
    type Item = (&'a K, &'a V);

    fn next(&mut self) -> Option<Self::Item> {
        self.iter.next().map(|entry| (&entry.key, &entry.value))
    }
}
```

This is cache-friendly: entries are stored contiguously in memory, providing excellent locality during iteration.
Removal is where the design becomes clever. Simply removing an entry from `entries` would leave a gap, and shifting all subsequent entries would be O(n). Instead, IndexMap's default removal uses swap removal with an index update (in current indexmap versions, the plain `remove` name is deprecated in favor of the explicit `swap_remove`):

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
map.insert("a", 1); // index 0
map.insert("b", 2); // index 1
map.insert("c", 3); // index 2
map.insert("d", 4); // index 3

map.swap_remove("b"); // Remove entry at index 1
// After swap removal:
// entries: [("a", 1), ("d", 4), ("c", 3)]
// indices: { hash("a") -> 0, hash("d") -> 1, hash("c") -> 2 }
```

The process:

1. Remove the key's entry from `indices`, obtaining its position in `entries`
2. Swap the entry at that position with the last entry
3. Update the swapped entry's index in `indices` to point to its new position
4. Pop the removed entry, now at the end, off the vector

```rust
// Simplified swap removal
fn swap_remove(&mut self, key: &K) -> Option<V> {
    let hash = self.hash(key);
    // Find and remove the index of the entry to remove
    let index = self.indices.remove(hash, |&idx| {
        self.entries[idx].key == *key
    })?;
    // If not the last entry, swap with the last entry
    let last = self.entries.len() - 1;
    if index != last {
        // Swap entries
        self.entries.swap(index, last);
        // Update the index of the swapped entry
        let swapped_hash = self.entries[index].hash;
        if let Some(idx) = self.indices.get_mut(swapped_hash, |&idx| idx == last) {
            *idx = index;
        }
    }
    // Pop the removed entry
    let entry = self.entries.pop()?;
    Some(entry.value)
}
```
}This approach keeps removal at O(1) amortized, but it changes the iteration order of remaining elements after a removal.
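The order change is easy to see with a small std-only simulation of the same bookkeeping, using a `Vec` for `entries` and a plain `HashMap` for `indices`. The names mirror the conceptual struct shown earlier, not indexmap's real internals:

```rust
use std::collections::HashMap;

// Simulate swap removal on the dual structure: remove a key, move the last
// entry into the vacated slot, and repoint that entry's index.
fn swap_remove(
    entries: &mut Vec<(&'static str, i32)>,
    indices: &mut HashMap<&'static str, usize>,
    key: &str,
) -> Option<i32> {
    let index = indices.remove(key)?;
    let last = entries.len() - 1;
    if index != last {
        entries.swap(index, last);
        // The former last entry now lives at `index`; update its mapping
        let moved_key = entries[index].0;
        indices.insert(moved_key, index);
    }
    entries.pop().map(|(_, v)| v)
}

fn main() {
    let mut entries = vec![("a", 1), ("b", 2), ("c", 3), ("d", 4)];
    let mut indices: HashMap<_, _> =
        entries.iter().enumerate().map(|(i, &(k, _))| (k, i)).collect();

    assert_eq!(swap_remove(&mut entries, &mut indices, "b"), Some(2));
    // "d" was swapped into "b"'s old slot: order is now a, d, c
    let keys: Vec<_> = entries.iter().map(|&(k, _)| k).collect();
    assert_eq!(keys, ["a", "d", "c"]);
    assert_eq!(indices["d"], 1);
    println!("after swap_remove: {:?}", keys);
}
```

Only one index is rewritten per removal, which is why the operation stays O(1); the price is that "d" no longer iterates after "c".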
For cases where removal must preserve order, IndexMap provides `shift_remove`:

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
map.insert("a", 1);
map.insert("b", 2);
map.insert("c", 3);

map.shift_remove("b"); // O(n) but preserves order
// entries: [("a", 1), ("c", 3)]
// Order preserved: a still before c
```

This shifts all subsequent entries and updates all their indices, which is O(n) but maintains strict insertion order.
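A std-only sketch of shift removal shows where the O(n) cost comes from: every entry after the removed one shifts left, and each of those entries' stored indices must be rewritten. As before, this simulates the bookkeeping with a `Vec` and a `HashMap` rather than indexmap's actual internals:

```rust
use std::collections::HashMap;

// Simulate shift removal: Vec::remove shifts the tail left, so every
// following entry's stored index must be decremented.
fn shift_remove(
    entries: &mut Vec<(&'static str, i32)>,
    indices: &mut HashMap<&'static str, usize>,
    key: &str,
) -> Option<i32> {
    let index = indices.remove(key)?;
    let (_, value) = entries.remove(index); // O(n) shift
    for (k, _) in &entries[index..] {
        *indices.get_mut(k).unwrap() -= 1;  // O(n) index fix-up
    }
    Some(value)
}

fn main() {
    let mut entries = vec![("a", 1), ("b", 2), ("c", 3)];
    let mut indices: HashMap<_, _> =
        entries.iter().enumerate().map(|(i, &(k, _))| (k, i)).collect();

    assert_eq!(shift_remove(&mut entries, &mut indices, "b"), Some(2));
    // Relative order preserved: a still before c
    let keys: Vec<_> = entries.iter().map(|&(k, _)| k).collect();
    assert_eq!(keys, ["a", "c"]);
    assert_eq!(indices["c"], 1);
    println!("after shift_remove: {:?}", keys);
}
```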
Like any hash table, IndexMap handles collisions through probing. The `indices` table stores only indices, and when multiple keys hash to the same bucket, the lookup probes until a stored index leads to a matching key:

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
// These might hash to the same bucket (depending on the hash function)
map.insert("key1", 1);
map.insert("key2", 2); // Potentially same bucket as key1

// Lookup probes until it finds the right key
let value = map.get("key2"); // Works correctly despite any collision
```

Modern indexmap builds on hashbrown's SwissTable-style probing (earlier versions used Robin Hood hashing) to keep probe sequences short, maintaining O(1) expected lookup time.
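The key-equality check is what keeps collisions correct. This can be demonstrated with std's `HashMap` and a deliberately terrible hasher that sends every key to the same bucket (a hack for illustration only; never do this in real code):

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// A pathological hasher: every key hashes to 0, so every insert collides
// and every lookup must probe and compare keys to find the right entry.
#[derive(Default)]
struct ConstantHasher;

impl Hasher for ConstantHasher {
    fn finish(&self) -> u64 { 0 }
    fn write(&mut self, _bytes: &[u8]) {}
}

fn main() {
    let mut map: HashMap<&str, i32, BuildHasherDefault<ConstantHasher>> =
        HashMap::default();
    map.insert("key1", 1);
    map.insert("key2", 2);
    map.insert("key3", 3);

    // Despite total collision, key comparison resolves each lookup correctly
    assert_eq!(map.get("key2"), Some(&2));
    assert_eq!(map.len(), 3);
    println!("all lookups correct under full collision");
}
```

Correctness never depends on hashes being distinct; only performance does. With this hasher every lookup degrades to a linear probe, which is exactly what short probe sequences are designed to avoid.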
The dense `entries` vector provides cache-friendly iteration:

```rust
use indexmap::IndexMap;

let map: IndexMap<String, i32> = /* ... */;
// Cache-friendly: sequential memory access
let sum: i32 = map.values().sum();

// Compare to HashMap: scattered memory access
use std::collections::HashMap;

let map: HashMap<String, i32> = /* ... */;
let sum: i32 = map.values().sum(); // Potentially more cache misses
```

Standard HashMap stores entries in buckets that may be scattered across memory, especially after rehashing. IndexMap's dense vector keeps entries contiguous.
Ordered iteration matters in several common scenarios. Configuration that must be applied in the order specified:

```rust
use indexmap::IndexMap;

let mut config = IndexMap::new();
config.insert("database_url", "localhost:5432");
config.insert("pool_size", "10");
config.insert("timeout", "30");
config.insert("retry_count", "3");

// Configuration is applied in the order specified
for (key, value) in &config {
    apply_config(key, value);
}
```

JSON serialization that preserves key order:

```rust
use indexmap::IndexMap;
use serde_json::{Value, to_string_pretty};

// With its `preserve_order` feature enabled, serde_json backs JSON
// objects with an IndexMap, so key order survives round-trips
let mut obj: IndexMap<String, Value> = IndexMap::new();
obj.insert("name".to_string(), "example".into());
obj.insert("version".to_string(), "1.0".into());
obj.insert("dependencies".to_string(), Value::Object(serde_json::Map::new()));

// Output preserves insertion order
let json = to_string_pretty(&obj)?;
```

Command registries that execute in registration order:

```rust
use indexmap::IndexMap;

struct CommandRegistry {
    commands: IndexMap<String, Command>,
}

impl CommandRegistry {
    fn execute_all(&self, context: &mut Context) {
        // Commands execute in registration order
        for (name, cmd) in &self.commands {
            println!("Executing: {}", name);
            cmd.execute(context);
        }
    }
}
```

| Operation | IndexMap | HashMap | BTreeMap | Vec<(K, V)> |
|-----------|----------|---------|----------|-------------|
| Lookup | O(1) | O(1) | O(log n) | O(n) |
| Insert | O(1) | O(1) | O(log n) | O(1) |
| Remove (swap) | O(1) | O(1) | O(log n) | O(n) |
| Remove (shift) | O(n) | N/A | O(log n) | O(n) |
| Ordered iteration | Yes | No | Yes (sorted) | Yes |
| Memory locality | High | Medium | Medium | High |
The dual structure incurs some overhead:

```rust
// IndexMap uses more memory: both `indices` and `entries`
// HashMap uses less: a single table holding the entries

// Lookup comparison (conceptual):
// IndexMap: hash -> indices -> entries -> value (two indirections)
// HashMap:  hash -> bucket -> value (one indirection)
```

The extra indirection makes IndexMap lookups slightly slower than HashMap's, but still O(1). The trade-off is worthwhile when you need ordered iteration.
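A rough sense of the per-entry memory cost can be had with `std::mem::size_of`. This is only a sketch for a typical 64-bit target: it counts the cached hash and the index slot but ignores table load factors, control bytes, and padding details that vary by platform:

```rust
use std::mem::size_of;

fn main() {
    // Roughly what HashMap stores per entry: the key-value pair
    // (plus its own per-bucket control metadata, ignored here)
    let pair = size_of::<(String, i32)>();

    // What IndexMap stores per entry: the pair plus a cached u64 hash
    // in `entries`, plus a usize slot in the `indices` table
    let entry = size_of::<(u64, String, i32)>();
    let index_slot = size_of::<usize>();

    println!("pair: {pair} bytes, entry: {entry} bytes, index slot: {index_slot} bytes");
    // The dual structure costs extra memory per entry
    assert!(entry + index_slot > pair);
}
```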
IndexMap also exposes positional indices for efficient access:

```rust
use indexmap::IndexMap;

let mut map = IndexMap::new();
let idx_a = map.insert_full("a", 1).0; // insert_full returns (index, old_value)
let idx_b = map.insert_full("b", 2).0;

// Direct index access
let value = map[idx_a]; // No hash computation needed

// Get the index for a key
if let Some(idx) = map.get_index_of("b") {
    let (key, value) = map.get_index(idx).unwrap();
    println!("Index {}: {} = {}", idx, key, value);
}

// Remove by index
map.swap_remove_index(idx_a);
```

This is useful when you repeatedly access specific entries and want to avoid recomputing their hashes.
IndexMap achieves its dual guarantees through parallel data structures: a hash table (`indices`) for O(1) key-to-index mapping, and a dense vector (`entries`) for ordered storage. Lookups traverse both structures, but each step is O(1), preserving hash-table performance. Insertion appends to the vector and records the new index in the hash table. Removal uses swap removal to stay O(1), with shift removal available when strict order must be preserved.

The design demonstrates that hash-table and array properties aren't mutually exclusive: you can have O(1) lookups and ordered iteration by maintaining two indices into the same data. The cost is slightly higher memory usage and an extra indirection on lookups, but for many applications (configuration, serialization, command processing) the benefits of ordered iteration outweigh these costs.