How does indexmap::IndexMap::retain preserve insertion order unlike HashMap::retain?

IndexMap::retain preserves insertion order by maintaining two parallel data structures—a hash table for lookups and a dense array for ordering—then performing in-place compaction that shifts remaining elements to fill gaps while preserving their relative sequence. Standard HashMap::retain has no ordering guarantee because it operates directly on the hash table's bucket array, which has no inherent insertion order; when entries are removed, remaining entries may be reordered during compaction or may stay in their hash positions. IndexMap's internal indices array maps hashes to positions in the entries array, and the entries array stores key-value pairs in insertion order. When retain removes entries, it performs a "swap remove" in the hash table but then compacts the entries array by shifting subsequent elements forward, preserving their original order. This order-preserving removal requires linear time in the number of elements after the first removal, but maintains the iteration order guarantee that IndexMap provides.

Basic retain Behavior Comparison

use std::collections::HashMap;
use indexmap::IndexMap;
 
fn main() {
    // HashMap: no order guarantee
    let mut hashmap = HashMap::new();
    hashmap.insert("a", 1);
    hashmap.insert("b", 2);
    hashmap.insert("c", 3);
    hashmap.insert("d", 4);
    
    hashmap.retain(|k, _| k != &"b" && k != &"c");
    
    println!("HashMap after retain:");
    for (k, v) in &hashmap {
        println!("  {}: {}", k, v);  // Order is unspecified
    }
    
    // IndexMap: preserves insertion order
    let mut indexmap = IndexMap::new();
    indexmap.insert("a", 1);
    indexmap.insert("b", 2);
    indexmap.insert("c", 3);
    indexmap.insert("d", 4);
    
    indexmap.retain(|k, _| k != &"b" && k != &"c");
    
    println!("\nIndexMap after retain:");
    for (k, v) in &indexmap {
        println!("  {}: {}", k, v);  // Order: a, d
    }
}

HashMap makes no guarantees about iteration order; IndexMap guarantees insertion order.

Internal Structure: How IndexMap Maintains Order

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    
    map.insert("first", 1);
    map.insert("second", 2);
    map.insert("third", 3);
    
    // IndexMap maintains:
    // 1. entries: Vec<(K, V)> - stores items in insertion order
    // 2. indices: HashTable<usize> - maps hashes to indices in entries
    
    // Access by index (not available in HashMap)
    println!("Index 0: {:?}", map.get_index(0));  // Some(("first", 1))
    println!("Index 1: {:?}", map.get_index(1));  // Some(("second", 2))
    println!("Index 2: {:?}", map.get_index(2));  // Some(("third", 3))
    
    // Keys() preserves insertion order
    println!("Keys in order: {:?}", map.keys().collect::<Vec<_>>());
}

IndexMap stores entries in a dense array, enabling index-based access.

The retain Algorithm: In-Place Compaction

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    map.insert("c", 3);
    map.insert("d", 4);
    map.insert("e", 5);
    
    println!("Before retain:");
    for (i, (k, v)) in map.iter().enumerate() {
        println!("  [{}] {}: {}", i, k, v);
    }
    
    // Remove b and d
    map.retain(|k, _| k != "b" && k != "d");
    
    println!("\nAfter retain:");
    for (i, (k, v)) in map.iter().enumerate() {
        println!("  [{}] {}: {}", i, k, v);
    }
    
    // The remaining entries (a, c, e) maintain their relative order
    // Internal algorithm compacts by shifting:
    //   entries[1] moves to entries[1] (c shifts left)
    //   entries[2] moves to entries[2] (e shifts left)
}

Entries are compacted in-place, shifting elements to fill removed positions.

HashMap retain: No Order Guarantee

use std::collections::HashMap;
 
fn main() {
    let mut map = HashMap::new();
    
    // Insert in specific order
    for i in 0..10 {
        map.insert(format!("key{}", i), i);
    }
    
    println!("Before retain:");
    for (k, v) in &map {
        println!("  {}: {}", k, v);  // Arbitrary order
    }
    
    map.retain(|k, v| {
        // Keep only even values
        *v % 2 == 0
    });
    
    println!("\nAfter retain:");
    for (k, v) in &map {
        println!("  {}: {}", k, v);  // Order may differ from insertion
    }
    
    // HashMap's internal bucket array doesn't track insertion order
    // Removal compaction may reorder elements
}

HashMap buckets entries by hash; removal may trigger reordering.

Performance Implications

use std::collections::HashMap;
use indexmap::IndexMap;
use std::time::Instant;
 
fn main() {
    const SIZE: usize = 100_000;
    
    // HashMap retain
    let mut hashmap: HashMap<i32, i32> = (0..SIZE as i32).map(|i| (i, i)).collect();
    let start = Instant::now();
    hashmap.retain(|_, v| *v % 2 == 0);
    let hashmap_time = start.elapsed();
    
    // IndexMap retain
    let mut indexmap: IndexMap<i32, i32> = (0..SIZE as i32).map(|i| (i, i)).collect();
    let start = Instant::now();
    indexmap.retain(|_, v| *v % 2 == 0);
    let indexmap_time = start.elapsed();
    
    println!("HashMap retain: {:?}", hashmap_time);
    println!("IndexMap retain: {:?}", indexmap_time);
    println!("HashMap is typically faster due to no ordering constraint");
    
    println!("\nRemaining elements:");
    println!("HashMap: {}", hashmap.len());
    println!("IndexMap: {}", indexmap.len());
    
    // Verify IndexMap order is preserved
    let first_five: Vec<_> = indexmap.keys().take(5).collect();
    println!("IndexMap first 5 keys: {:?}", first_five);  // [0, 2, 4, 6, 8]
}

IndexMap::retain pays a cost for order preservation; HashMap can be more aggressive with compaction.

The Swap Remove vs Shift Remove Difference

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    map.insert("c", 3);
    map.insert("d", 4);
    
    // swap_remove: O(1) but disturbs order
    // Used by HashMap internally (it doesn't care about order)
    
    // shift_remove: O(n) but preserves order
    // Used by IndexMap::retain
    
    // Comparison:
    // swap_remove("b") from [a, b, c, d]:
    //   - Swap b with last element: [a, d, c, b]
    //   - Remove last: [a, d, c]
    //   - Order changed: a was first, now d is second
    
    // shift_remove("b") from [a, b, c, d]:
    //   - Shift c left: [a, c, d]
    //   - Shift d left: [a, c, d] (already done)
    //   - Order preserved: a, c, d maintain relative positions
    
    map.retain(|k, _| k != "b");
    
    println!("After retain (shift_remove behavior):");
    for (k, v) in &map {
        println!("  {}: {}", k, v);
    }
    // Output: a: 1, c: 3, d: 4 (order preserved)
}

IndexMap::retain uses shift semantics internally; HashMap may use swap semantics.

Index Preservation After retain

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    map.insert("c", 3);
    map.insert("d", 4);
    map.insert("e", 5);
    
    println!("Before retain:");
    for i in 0..map.len() {
        let (k, v) = map.get_index(i).unwrap();
        println!("  index {}: {} -> {}", i, k, v);
    }
    
    // Remove middle elements
    map.retain(|k, _| k != &"b" && k != &"d");
    
    println!("\nAfter retain:");
    for i in 0..map.len() {
        let (k, v) = map.get_index(i).unwrap();
        println!("  index {}: {} -> {}", i, k, v);
    }
    
    // Indices are reassigned during compaction
    // a stays at index 0
    // c moves from index 2 to index 1
    // e moves from index 4 to index 2
}

Indices are updated during retain to remain contiguous.

Use Cases Requiring Order Preservation

use indexmap::IndexMap;
 
fn main() {
    // Configuration processing where order matters
    let mut config = IndexMap::new();
    config.insert("base", "/app");
    config.insert("data", "/app/data");
    config.insert("log", "/app/log");
    config.insert("cache", "/app/cache");
    config.insert("temp", "/tmp");
    
    // Remove entries that don't start with /app
    config.retain(|_, v| v.starts_with("/app"));
    
    println!("Filtered config:");
    for (k, v) in &config {
        println!("  {}: {}", k, v);
    }
    
    // Order preserved: base, data, log, cache
    // This matters if dependent configs need to be set in order
}

Order matters for configurations, build steps, and sequential operations.

Comparing with Entry-Based Removal

use indexmap::IndexMap;
 
fn main() {
    let mut map1 = IndexMap::new();
    map1.insert("a", 1);
    map1.insert("b", 2);
    map1.insert("c", 3);
    
    let mut map2 = IndexMap::new();
    map2.insert("a", 1);
    map2.insert("b", 2);
    map2.insert("c", 3);
    
    // Using retain
    map1.retain(|k, _| k != "b");
    
    // Using shift_remove (also preserves order)
    map2.shift_remove("b");
    
    println!("retain result:");
    for (k, v) in &map1 {
        println!("  {}: {}", k, v);
    }
    
    println!("\nshift_remove result:");
    for (k, v) in &map2 {
        println!("  {}: {}", k, v);
    }
    
    // Both preserve order
    // retain is more efficient for multiple removals
}

retain is more efficient than multiple shift_remove calls.

Memory Layout During retain

use indexmap::IndexMap;
 
fn main() {
    // IndexMap internal structure:
    // - entries: Vec<Entry<K, V>> - dense, insertion-ordered
    // - indices: RawTable<usize> - hash -> index mapping
    // - hashes: Vec<HashValue> - cached hash values
    
    let mut map = IndexMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    map.insert("c", 3);
    map.insert("d", 4);
    
    println!("Before retain:");
    println!("  len: {}", map.len());
    
    // Remove b
    map.retain(|k, _| k != "b");
    
    println!("\nAfter retain:");
    println!("  len: {}", map.len());
    
    // Internal operations:
    // 1. Iterate through entries, marking removals
    // 2. Compact entries array (shift elements left)
    // 3. Update hash table indices
    // 4. Update the indices array
    
    // Memory is contiguous and dense after retain
}

retain maintains a dense, contiguous entries array.

retain vs drain_filter

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    map.insert("a", 1);
    map.insert("b", 2);
    map.insert("c", 3);
    map.insert("d", 4);
    
    // retain: just removes, doesn't return
    map.retain(|k, _| k != "b");
    
    // For getting removed items, use drain() with filter
    // (Note: drain_filter is unstable, so we use retain + manual tracking)
    
    let mut map2 = IndexMap::new();
    map2.insert("a", 1);
    map2.insert("b", 2);
    map2.insert("c", 3);
    map2.insert("d", 4);
    
    // Manual tracking for getting removed items
    let mut removed = Vec::new();
    let keys_to_remove: Vec<_> = map2.iter()
        .filter(|(k, _)| *k == "b" || *k == "d")
        .map(|(k, _)| k.clone())
        .collect();
    
    for k in &keys_to_remove {
        if let Some(v) = map2.shift_remove(k) {
            removed.push((k.clone(), v));
        }
    }
    
    println!("Removed: {:?}", removed);
    println!("Remaining:");
    for (k, v) in &map2 {
        println!("  {}: {}", k, v);
    }
}

retain doesn't return removed items; use shift_remove if you need them.

Hash Table Consistency

use indexmap::IndexMap;
 
fn main() {
    let mut map = IndexMap::new();
    map.insert("apple", 1);
    map.insert("banana", 2);
    map.insert("cherry", 3);
    map.insert("date", 4);
    map.insert("elderberry", 5);
    
    // Verify lookups work before retain
    println!("Before retain:");
    println!("  apple: {:?}", map.get("apple"));
    println!("  cherry: {:?}", map.get("cherry"));
    println!("  elderberry: {:?}", map.get("elderberry"));
    
    // Remove some elements
    map.retain(|k, _| k != &"banana" && k != &"date");
    
    // Verify lookups still work after retain
    println!("\nAfter retain:");
    println!("  apple: {:?}", map.get("apple"));
    println!("  cherry: {:?}", map.get("cherry"));
    println!("  elderberry: {:?}", map.get("elderberry"));
    println!("  banana: {:?}", map.get("banana"));  // None
    
    // The hash table indices are correctly updated
    // Lookup remains O(1) average
}

retain updates both the entries array and hash table consistently.

Bulk Removal Performance Comparison

use std::collections::HashMap;
use indexmap::IndexMap;
use std::time::Instant;
 
fn main() {
    const SIZE: usize = 50_000;
    const REMOVE_RATIO: f32 = 0.7;  // Remove 70% of elements
    
    // HashMap
    let mut hashmap: HashMap<i32, String> = (0..SIZE as i32)
        .map(|i| (i, format!("value{}", i)))
        .collect();
    
    let start = Instant::now();
    hashmap.retain(|k, _| *k % 10 != 0);  // Remove ~10%
    let hashmap_time = start.elapsed();
    
    // IndexMap
    let mut indexmap: IndexMap<i32, String> = (0..SIZE as i32)
        .map(|i| (i, format!("value{}", i)))
        .collect();
    
    let start = Instant::now();
    indexmap.retain(|k, _| *k % 10 != 0);
    let indexmap_time = start.elapsed();
    
    println!("Elements removed: ~{}", SIZE / 10);
    println!("HashMap: {:?}", hashmap_time);
    println!("IndexMap: {:?}", indexmap_time);
    
    // Both are O(n), but IndexMap has additional work for order preservation
}

Both are O(n), but IndexMap has additional bookkeeping for order.

Synthesis

Data structure comparison:

Aspect HashMap retain IndexMap retain
Order preservation No guarantee Guaranteed
Algorithm Hash table compaction Array compaction + hash update
Complexity O(n) O(n)
Memory layout Sparse Dense
Index-based access Not available Preserved

Internal operations:

IndexMap::retain:
1. Iterate entries array
2. For each entry to remove:
   a. Remove from hash table (swap remove in hash buckets)
   b. Mark position for compaction
3. Compact entries array (shift remaining left)
4. Update hash table indices to new positions
5. Update indices array

HashMap::retain:
1. Iterate bucket array
2. Remove entries directly in hash table
3. May trigger rehash or compaction
4. No ordering concerns

When to use each:

Use HashMap::retain Use IndexMap::retain
Order doesn't matter Order matters for correctness
Maximum performance needed Predictable iteration order
Pure lookup use case Index-based access needed
Memory efficiency critical Sequential processing required

Key insight: The fundamental difference is that HashMap has no concept of insertion order—its internal bucket array organizes entries by hash, and retain operates directly on this structure. IndexMap maintains two parallel data structures: a dense Vec of entries in insertion order, and a hash table mapping keys to indices. When retain removes entries, it must preserve the relative order of remaining entries in the Vec, which requires shifting elements left (O(n) compaction). But it must also update the hash table to reflect new indices, because the hash table stores indices into the entries array, not direct pointers. This dual update—compacting the entries and updating the index mappings—is what guarantees that both key-based lookup and index-based iteration work correctly after retain. The HashMap version can skip the compaction step because it doesn't expose any index-based interface—the only way to access entries is by key hash, which remains valid as long as the hash table is internally consistent. The trade-off IndexMap makes is clear: it sacrifices some performance (the compaction and index updates) to provide the order guarantee that HashMap cannot offer.