How does indexmap::IndexMap maintain insertion order while still providing O(1) lookups?

IndexMap combines the best of two data structure worlds: the O(1) lookup performance of hash tables and the insertion-order iteration of sequences. This hybrid structure achieves its properties through a dual-indexing design that maintains both a hash table for fast lookups and a dense array for ordered iteration.

The Problem: HashMap's Unordered Nature

Standard HashMap provides O(1) lookups but makes no guarantees about iteration order:

use std::collections::HashMap;
 
let mut map = HashMap::new();
map.insert("first", 1);
map.insert("second", 2);
map.insert("third", 3);
 
// Iteration order is unpredictable
for (key, value) in &map {
    println!("{}: {}", key, value);
}
// Output order varies based on hash values and internal layout

The internal structure of HashMap organizes entries by hash buckets, not insertion order. Rehashing during growth can completely reorder the internal layout.

The IndexMap Solution: Dual Indices

IndexMap maintains two parallel data structures:

// Conceptual representation
struct IndexMap<K, V> {
    entries: Vec<Entry<K, V>>,        // Dense array for ordered iteration
    indices: RawTable<usize>,          // Hash table mapping hash -> index
}
 
struct Entry<K, V> {
    hash: u64,     // Cached hash for fast lookup
    key: K,
    value: V,
}

The entries vector stores key-value pairs in insertion order, enabling ordered iteration. The indices hash table maps hash values to indices in the entries vector, enabling O(1) lookup.
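
Before walking through the operations, the dual-index idea can be sketched with std types alone. This hypothetical ToyIndexMap uses a plain HashMap<String, usize> as the index table; the real crate instead keys a raw hash table by the cached hash, but the shape of the design is the same:

```rust
use std::collections::HashMap;

// Toy sketch: a dense Vec for ordered storage, plus a HashMap from
// key to position in that Vec. Not the real implementation.
struct ToyIndexMap {
    entries: Vec<(String, i32)>,      // insertion-ordered storage
    indices: HashMap<String, usize>,  // key -> position in `entries`
}

impl ToyIndexMap {
    fn new() -> Self {
        ToyIndexMap { entries: Vec::new(), indices: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: i32) {
        match self.indices.get(key).copied() {
            // Existing key: update in place, preserving its position
            Some(i) => self.entries[i].1 = value,
            // New key: append to the vector and record its index
            None => {
                self.indices.insert(key.to_string(), self.entries.len());
                self.entries.push((key.to_string(), value));
            }
        }
    }

    fn get(&self, key: &str) -> Option<i32> {
        // Two O(1) steps: hash lookup for the index, then array access
        self.indices.get(key).map(|&i| self.entries[i].1)
    }
}

fn main() {
    let mut map = ToyIndexMap::new();
    map.insert("first", 1);
    map.insert("second", 2);
    map.insert("third", 3);

    assert_eq!(map.get("second"), Some(2));

    // Iteration over the dense vector follows insertion order
    let keys: Vec<&str> = map.entries.iter().map(|(k, _)| k.as_str()).collect();
    assert_eq!(keys, ["first", "second", "third"]);
}
```

The toy pays for a second copy of each key; the real IndexMap avoids this by storing only usize indices in the hash table and caching each entry's hash alongside the key.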

Lookup Operation: Hash to Index to Entry

When you look up a key, IndexMap uses a two-step process:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
map.insert("apple", 1);
map.insert("banana", 2);
map.insert("cherry", 3);
 
// Lookup: "banana"
let value = map.get("banana");

The lookup proceeds as follows:

  1. Hash the key: Compute hash("banana")
  2. Find the index: Look up the hash in indices to get the position in entries
  3. Access the entry: Use the index to directly access entries[position]

// Simplified lookup (the real implementation uses hashbrown's raw table)
fn get(&self, key: &K) -> Option<&V> {
    let hash = self.hash(key);

    // Step 1: expected O(1) hash table lookup
    let index = *self.indices.get(hash, |&idx| {
        // Verify the key matches (handles collisions)
        self.entries[idx].key == *key
    })?;

    // Step 2: O(1) array access
    Some(&self.entries[index].value)
}

Both steps are constant time (the hash probe in expectation), so the overall lookup is expected O(1).

Insertion Operation: Append and Index

Insertion adds entries at the end of the vector and creates a hash table entry:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
 
map.insert("first", 1);   // entries: [("first", 1)]
                           // indices: { hash("first") -> 0 }
 
map.insert("second", 2);  // entries: [("first", 1), ("second", 2)]
                           // indices: { hash("first") -> 0, hash("second") -> 1 }

The process:

  1. Compute the hash of the key
  2. Check if the key already exists (hash table lookup)
  3. If new, append to entries and store the index in indices
  4. If existing, update the value in place (preserving position)

// Simplified insertion
fn insert(&mut self, key: K, value: V) -> Option<V> {
    let hash = self.hash(&key);
    
    // Check if key exists
    if let Some(index) = self.indices.get_mut(hash, |&idx| {
        self.entries[idx].key == key
    }) {
        // Key exists: update value, return old
        let old = std::mem::replace(&mut self.entries[*index].value, value);
        return Some(old);
    }
    
    // New key: append and index
    let index = self.entries.len();
    self.entries.push(Entry { hash, key, value });
    self.indices.insert(hash, index, |idx| {
        self.entries[*idx].hash
    });
    
    None
}

Ordered Iteration: Direct Vector Traversal

Iteration simply walks the entries vector:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
map.insert("a", 1);
map.insert("b", 2);
map.insert("c", 3);
 
// Iteration is in insertion order
for (key, value) in &map {
    println!("{}: {}", key, value);
}
// Output: a: 1, b: 2, c: 3 (always)

The iterator implementation:

// Iter iterates over the entries vector
impl<'a, K, V> Iterator for Iter<'a, K, V> {
    type Item = (&'a K, &'a V);
    
    fn next(&mut self) -> Option<Self::Item> {
        self.iter.next().map(|entry| (&entry.key, &entry.value))
    }
}

This is cache-friendly: entries are stored contiguously in memory, providing excellent locality for iteration.

The Challenge of Removal: Maintaining Order

Removal is where the design becomes clever. Simply removing an entry from entries would leave a gap, and shifting all subsequent entries would cost O(n). Instead, IndexMap's swap_remove uses swap removal with index updates:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
map.insert("a", 1);  // index 0
map.insert("b", 2);  // index 1
map.insert("c", 3);  // index 2
map.insert("d", 4);  // index 3
 
map.swap_remove("b");  // Remove entry at index 1
 
// After swap removal:
// entries: [("a", 1), ("d", 4), ("c", 3)]
// indices: { hash("a") -> 0, hash("d") -> 1, hash("c") -> 2 }

The process:

  1. Find the index of the entry to remove (via hash table)
  2. Swap the entry with the last entry in the vector
  3. Update the index of the swapped entry in the hash table
  4. Pop the last entry (now the one to remove)
  5. Remove the hash table entry

// Simplified swap removal
fn swap_remove(&mut self, key: &K) -> Option<V> {
    let hash = self.hash(key);
    
    // Find index of entry to remove
    let index = self.indices.remove(hash, |&idx| {
        self.entries[idx].key == *key
    })?;
    
    // If not the last entry, swap with last
    let last = self.entries.len() - 1;
    if index != last {
        // Swap entries
        self.entries.swap(index, last);
        
        // Update index of the swapped entry
        let swapped_hash = self.entries[index].hash;
        self.indices.get_mut(swapped_hash, |&idx| idx == last)
            .map(|idx| *idx = index);
    }
    
    // Pop the removed entry
    let entry = self.entries.pop()?;
    Some(entry.value)
}

This approach keeps removal at expected O(1), but it changes the order of the remaining elements: the entry that was last now occupies the removed entry's position.
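
The order change is visible with std's Vec::swap_remove, which performs the same move on a plain vector; a minimal sketch of the effect on the entries:

```rust
fn main() {
    let mut entries = vec![("a", 1), ("b", 2), ("c", 3), ("d", 4)];

    // Swap-remove index 1 ("b"): the last element ("d") moves into its
    // slot in O(1), so "d" now comes before "c" in iteration order.
    let removed = entries.swap_remove(1);

    assert_eq!(removed, ("b", 2));
    assert_eq!(entries, [("a", 1), ("d", 4), ("c", 3)]);
}
```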

Preserving Order with Shift Removal

For cases where removal must preserve order, IndexMap provides shift_remove:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
map.insert("a", 1);
map.insert("b", 2);
map.insert("c", 3);
 
map.shift_remove("b");  // O(n) but preserves order
 
// entries: [("a", 1), ("c", 3)]
// Order preserved: a still before c

This shifts all subsequent entries and updates all their indices, which is O(n) but maintains strict insertion order.
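
In terms of the plain vector, this corresponds to std's Vec::remove, which closes the gap by shifting every later element left:

```rust
fn main() {
    let mut entries = vec![("a", 1), ("b", 2), ("c", 3)];

    // Remove index 1 ("b"): "c" shifts left, preserving relative order.
    // Cost is O(n) for the shifted tail (IndexMap additionally updates
    // the hash-table index of every shifted entry).
    let removed = entries.remove(1);

    assert_eq!(removed, ("b", 2));
    assert_eq!(entries, [("a", 1), ("c", 3)]);
}
```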

Hash Collision Handling

Like any hash table, IndexMap handles collisions through probing. The indices table stores indices, and when multiple keys hash to the same bucket, the lookup probes until it finds the matching key:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
 
// These might hash to the same bucket (depending on hash function)
map.insert("key1", 1);
map.insert("key2", 2);  // Potentially same bucket as key1
 
// Lookup probes until it finds the right key
let value = map.get("key2");  // Works correctly despite collision

In current versions, the indices table is built on hashbrown's raw table, a SwissTable-style open-addressing design that keeps probe sequences short (earlier indexmap versions used Robin Hood hashing), maintaining expected O(1) lookup time.
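
The key-verification step can be demonstrated with std's HashMap and a deliberately colliding hasher; this toy CollideAll hasher (an assumption for illustration, not a real API) sends every key to the same bucket, yet lookups still resolve correctly because the table compares keys, not just hashes:

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// A deliberately terrible hasher: every key hashes to 0, forcing all
// entries into the same bucket and maximizing probe lengths.
#[derive(Default)]
struct CollideAll;

impl Hasher for CollideAll {
    fn finish(&self) -> u64 { 0 }
    fn write(&mut self, _bytes: &[u8]) {}
}

fn main() {
    let mut map: HashMap<&str, i32, BuildHasherDefault<CollideAll>> =
        HashMap::default();
    map.insert("key1", 1);
    map.insert("key2", 2);

    // Every lookup probes the shared bucket and verifies the key,
    // so collisions degrade speed but never correctness.
    assert_eq!(map.get("key1"), Some(&1));
    assert_eq!(map.get("key2"), Some(&2));
}
```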

Memory Layout Benefits

The dense entries vector provides cache-friendly iteration:

use indexmap::IndexMap;
 
let map: IndexMap<String, i32> = /* ... */;
 
// Cache-friendly: sequential memory access
let sum: i32 = map.values().sum();
 
// Compare to HashMap: scattered memory access
use std::collections::HashMap;
let map: HashMap<String, i32> = /* ... */;
let sum: i32 = map.values().sum();  // Potentially more cache misses

Standard HashMap stores entries in buckets that may be scattered across memory, especially after rehashing. IndexMap's dense vector keeps entries contiguous.

Practical Usage Examples

Ordered Configuration

use indexmap::IndexMap;
 
let mut config = IndexMap::new();
config.insert("database_url", "localhost:5432");
config.insert("pool_size", "10");
config.insert("timeout", "30");
config.insert("retry_count", "3");
 
// Configuration is applied in the order specified
for (key, value) in &config {
    apply_config(key, value);
}

Preserving JSON Object Order

use indexmap::IndexMap;
use serde_json::{json, Value};

// serde_json represents objects as IndexMap only when its
// `preserve_order` feature is enabled; IndexMap itself implements
// Serialize via indexmap's `serde` feature.
let mut obj: IndexMap<String, Value> = IndexMap::new();
obj.insert("name".to_string(), "example".into());
obj.insert("version".to_string(), "1.0".into());
obj.insert("dependencies".to_string(), json!({}));

// Output preserves insertion order
let json = serde_json::to_string_pretty(&obj)?;

Ordered Command Processing

use indexmap::IndexMap;
 
struct CommandRegistry {
    commands: IndexMap<String, Command>,
}
 
impl CommandRegistry {
    fn execute_all(&self, context: &mut Context) {
        // Commands execute in registration order
        for (name, cmd) in &self.commands {
            println!("Executing: {}", name);
            cmd.execute(context);
        }
    }
}

Comparison with Other Structures

Operation           IndexMap   HashMap   BTreeMap   Vec<(K, V)>
Lookup              O(1)       O(1)      O(log n)   O(n)
Insert              O(1)       O(1)      O(log n)   O(1)
Remove (swap)       O(1)       O(1)      O(log n)   O(n)
Remove (shift)      O(n)       N/A       O(log n)   O(n)
Ordered iteration   Yes        No        Yes (sorted)   Yes
Memory locality     High       Medium    Medium     High

Performance Trade-offs

The dual structure incurs some overhead:

use indexmap::IndexMap;
use std::collections::HashMap;
 
// IndexMap uses more memory: both indices and entries
// HashMap uses less: single table with entries
 
// Lookup comparison (conceptual):
// IndexMap: hash -> indices -> entries -> value (two indirections)
// HashMap: hash -> bucket -> value (one indirection)

The extra indirection means IndexMap lookups are slightly slower than HashMap, but still O(1). The trade-off is worthwhile when you need ordered iteration.

Index-Based API

IndexMap exposes indices that can be used for efficient access:

use indexmap::IndexMap;
 
let mut map = IndexMap::new();
let idx_a = map.insert_full("a", 1).0;  // insert_full returns (index, Option<old_value>)
let idx_b = map.insert_full("b", 2).0;
 
// Direct index access
let value = map[idx_a];  // No hash computation needed
 
// Get index for a key
if let Some(idx) = map.get_index_of("b") {
    let (key, value) = map.get_index(idx).unwrap();
    println!("Index {}: {} = {}", idx, key, value);
}
 
// Remove by index (swap_remove_index moves the last entry into this
// slot, so previously saved indices may now point at different entries)
map.swap_remove_index(idx_a);

This is useful when you frequently access specific entries without wanting to recompute hashes.

Synthesis

IndexMap achieves its dual guarantees through parallel data structures: a hash table (indices) for O(1) key-to-index mapping, and a dense vector (entries) for ordered storage. Lookups traverse both structures but each step is O(1), preserving hash table performance. Insertion appends to the vector and indexes in the hash table. Removal uses swap-remove to maintain O(1) performance, with an optional shift-remove for strict order preservation.

The design demonstrates that hash table and array properties aren't mutually exclusive—you can have O(1) lookups and ordered iteration by maintaining two indices into the same data. The cost is slightly higher memory usage and an extra indirection on lookups, but for many applications—configuration, serialization, command processing—the benefits of ordered iteration outweigh these costs.