How does tempfile::SpooledTempFile balance in-memory and on-disk storage for temporary files?

SpooledTempFile stores data in memory up to a configurable threshold, then transparently rolls over to a temporary file on disk when the threshold is exceeded. This hybrid approach gives you the performance of in-memory buffers for small files while automatically handling large data without memory exhaustion. The rollover happens automatically and seamlesslyβ€”your code interacts with the same Read/Write interface regardless of whether the data is currently in memory or on disk.

Basic Usage

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read};
 
fn basic_usage() {
    // Create with a 1 KB memory threshold
    let mut file = SpooledTempFile::new(1024);
    
    // Writing small amounts stays in memory
    file.write_all(b"Hello, world!").unwrap();
    println!("In memory: {}", file.is_rolled_over());  // false
    
    // Data is still accessible
    let mut contents = Vec::new();
    file.read_to_end(&mut contents).unwrap();
    assert_eq!(contents, b"Hello, world!");
}
 
fn creating_with_capacity() {
    // SpooledTempFile::new(max_memory_bytes)
    // Data stays in memory until this threshold is exceeded
    
    // Small threshold - rolls over quickly
    let small = SpooledTempFile::new(100);
    
    // Large threshold - likely never rolls over
    let large = SpooledTempFile::new(100 * 1024 * 1024);  // 100 MB
    
    // You can also create with a pre-allocated buffer
    let with_capacity = SpooledTempFile::with_capacity(4096, 8192);
    //                        initial capacity ^    ^ max memory
}

The new() constructor takes the maximum bytes to keep in memory before rolling over.

Automatic Rollover Behavior

use tempfile::spooled::SpooledTempFile;
use std::io::Write;
 
fn demonstrate_rollover() {
    // 100 byte threshold
    let mut file = SpooledTempFile::new(100);
    
    // Writing 50 bytes - stays in memory
    file.write_all(&[0u8; 50]).unwrap();
    assert!(!file.is_rolled_over());
    
    // Writing another 60 bytes - total 110 bytes
    // This exceeds 100 byte threshold
    file.write_all(&[0u8; 60]).unwrap();
    
    // File has rolled over to disk
    assert!(file.is_rolled_over());
    
    // Operations continue to work the same
    // The interface doesn't change after rollover
    file.write_all(&[0u8; 200]).unwrap();  // Works fine on disk
}
 
fn rollover_point() {
    // Rollover happens when the in-memory buffer would exceed the threshold
    // It's triggered by a write operation
    
    let mut file = SpooledTempFile::new(100);
    
    // Write exactly 100 bytes
    file.write_all(&[0u8; 100]).unwrap();
    assert!(!file.is_rolled_over());  // Still in memory!
    
    // Write 1 more byte - total 101 bytes
    file.write_all(&[1u8; 1]).unwrap();
    assert!(file.is_rolled_over());  // Now on disk
    
    // The rollover happens when data EXCEEDS the threshold
    // Not when it reaches the threshold
}

Rollover occurs when a write would cause the buffer to exceed the memory limit.

Reading from SpooledTempFile

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read, Seek, SeekFrom};
 
fn reading_data() {
    let mut file = SpooledTempFile::new(100);
    file.write_all(b"Hello, SpooledTempFile!").unwrap();
    
    // Read from beginning
    let mut buf = [0u8; 5];
    file.read_exact(&mut buf).unwrap();
    assert_eq!(&buf, b"Hello");
    
    // Seek back to start for another read
    file.seek(SeekFrom::Start(0)).unwrap();
    let mut contents = String::new();
    file.read_to_string(&mut contents).unwrap();
    assert_eq!(contents, "Hello, SpooledTempFile!");
}
 
fn reading_after_rollover() {
    // Reading works identically whether in memory or on disk
    let mut file = SpooledTempFile::new(10);
    file.write_all(b"This data is much larger than 10 bytes and will roll over").unwrap();
    
    // Has rolled to disk
    assert!(file.is_rolled_over());
    
    // Read works the same way
    file.seek(SeekFrom::Start(0)).unwrap();
    let mut contents = String::new();
    file.read_to_string(&mut contents).unwrap();
    assert!(contents.starts_with("This data"));
}

Read operations work consistently regardless of whether data is in memory or on disk.

Accessing Underlying Storage

use tempfile::spooled::SpooledTempFile;
use std::io::Write;
 
fn accessing_inner() {
    let mut file = SpooledTempFile::new(100);
    
    // Before rollover - in memory
    if let Some(cursor) = file.inner_mut() {
        // We have access to the underlying Vec<u8> Cursor
        let data = cursor.get_ref();
        println!("In memory: {} bytes", data.len());
    }
    
    // Write enough to trigger rollover
    file.write_all(&[0u8; 200]).unwrap();
    
    // After rollover - on disk
    if let Some(temp_file) = file.inner_mut() {
        // Wait, inner_mut() type changes after rollover
        // It returns the temporary file, not the cursor
    }
    
    // Better approach: check is_rolled_over() first
    if file.is_rolled_over() {
        // On disk, inner_mut() gives access to the temp file
        // Type depends on the spooled file implementation
    } else {
        // In memory, can access the cursor/buffer
    }
}

The into_inner() method consumes the SpooledTempFile and returns a SpooledData enum, whose variant (InMemory vs. OnDisk) reveals where the data currently lives; use is_rolled() when you only need to check the storage location without consuming the file.

When Rollover Happens

use tempfile::spooled::SpooledTempFile;
use std::io::Write;
 
fn rollover_timing() {
    // Important: Rollover only happens on write operations
    
    let mut file = SpooledTempFile::new(50);
    
    // Writing exactly 50 bytes - stays in memory
    file.write_all(&[0u8; 50]).unwrap();
    assert!(!file.is_rolled_over());
    
    // Seeking doesn't trigger rollover
    use std::io::Seek;
    file.seek(std::io::SeekFrom::Start(0)).unwrap();
    assert!(!file.is_rolled_over());  // Still in memory
    
    // Writing when at limit triggers rollover
    file.write_all(&[1u8; 1]).unwrap();  // Total 51 bytes
    assert!(file.is_rolled_over());  // Now on disk
}
 
fn rollover_copy() {
    use std::io::{Read, Copy, sink};
    
    // Rollover also happens when copying large amounts
    let mut file = SpooledTempFile::new(100);
    
    // Create data larger than threshold
    let large_data: Vec<u8> = (0..200).collect();
    file.write_all(&large_data).unwrap();
    
    // Already rolled over during write
    assert!(file.is_rolled_over());
}

Rollover is triggered by write operations that would exceed the memory threshold.

Use Cases

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read};
 
// Use case 1: Processing uploads of unknown size
fn process_upload(data: &[u8]) -> Vec<u8> {
    let mut file = SpooledTempFile::new(10 * 1024 * 1024);  // 10 MB
    
    // Small uploads stay fast (in memory)
    // Large uploads don't exhaust memory (roll to disk)
    file.write_all(data).unwrap();
    
    // Process the data
    let mut result = Vec::new();
    file.read_to_end(&mut result).unwrap();
    result
}
 
// Use case 2: Logging with bounded memory
fn bounded_logging() {
    let mut log = SpooledTempFile::new(1024 * 1024);  // 1 MB
    
    // Log entries
    for i in 0..1000 {
        writeln!(log, "Log entry {}", i).unwrap();
    }
    
    // If logs are small, they're in memory (fast)
    // If logs are large, they go to disk (safe)
    println!("Rolled over: {}", log.is_rolled_over());
}
 
// Use case 3: Serialization buffer
fn serialize_with_bounds<T: serde::Serialize>(value: &T) -> std::io::Result<Vec<u8>> {
    let mut file = SpooledTempFile::new(64 * 1024);  // 64 KB
    
    // Serialize to the spooled file
    // Small serialized data stays in memory
    // Large data goes to disk
    serde_json::to_writer(&mut file, value)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?;
    
    let mut result = Vec::new();
    file.read_to_end(&mut result)?;
    Ok(result)
}

Common use cases include handling uploads, logging, and serialization where data size is unpredictable.

Memory Threshold Selection

use tempfile::spooled::SpooledTempFile;
 
fn threshold_guidelines() {
    // Choosing the right threshold depends on your use case:
    
    // Small threshold (KB range): Good for text processing
    let text_processor = SpooledTempFile::new(4 * 1024);  // 4 KB
    
    // Medium threshold (MB range): Good for small files, API responses
    let api_handler = SpooledTempFile::new(1024 * 1024);  // 1 MB
    
    // Large threshold (10s of MB): Good for binary data, images
    let binary_handler = SpooledTempFile::new(50 * 1024 * 1024);  // 50 MB
    
    // Considerations:
    // 1. Memory pressure: How much RAM can you spare?
    // 2. Concurrent instances: Each SpooledTempFile has its own buffer
    // 3. Performance: Larger in-memory buffer = faster small file handling
    // 4. Disk I/O: Rollover incurs disk write cost
}
 
fn concurrent_consideration() {
    // If you have 100 concurrent SpooledTempFiles with 1 MB threshold:
    // Maximum memory usage = 100 MB
    
    // If you have 1000 concurrent with 10 MB threshold:
    // Maximum memory usage = 10 GB! (worst case)
    
    // Be careful with thresholds in concurrent scenarios
    let reasonable_threshold = 1024 * 1024;  // 1 MB per file
    let mut files: Vec<_> = (0..100)
        .map(|_| SpooledTempFile::new(reasonable_threshold))
        .collect();
    
    // Total potential memory: 100 MB (if all in memory)
}

Choose thresholds based on expected data sizes and available memory.

Comparison with Alternatives

use tempfile::{spooled::SpooledTempFile, NamedTempFile, TempDir};
use std::io::Write;
 
fn compare_approaches() {
    // SpooledTempFile: Memory first, then disk
    let mut spooled = SpooledTempFile::new(1024);
    // Pros: Fast for small data, safe for large data
    // Cons: More complex, has rollover overhead
    
    // NamedTempFile: Always on disk
    let mut named = NamedTempFile::new().unwrap();
    // Pros: Predictable, works for any size
    // Cons: Always disk I/O, slower for small data
    
    // In-memory buffer (Vec<u8>): Never touches disk
    let mut buffer = Vec::new();
    // Pros: Fastest, simplest
    // Cons: Can exhaust memory with large data
    
    // Each has its place:
    // - SpooledTempFile: Unknown/variable size data
    // - NamedTempFile: Large data, need file path
    // - Vec<u8>: Known small data, performance critical
}
 
fn when_to_use_spooled() {
    // SpooledTempFile is ideal when:
    
    // 1. Data size is unpredictable
    fn handle_request(body: &[u8]) {
        let mut file = SpooledTempFile::new(1024 * 1024);
        // Small requests: fast (in memory)
        // Large requests: safe (on disk)
    }
    
    // 2. You want bounded memory usage
    fn bounded_processing() {
        // Memory capped at threshold
        let mut file = SpooledTempFile::new(10 * 1024 * 1024);
    }
    
    // 3. Performance matters for common (small) case
    fn optimize_common_case() {
        // Most requests are small (< 1 MB)
        // Those stay in memory
        // Outliers go to disk
        let mut file = SpooledTempFile::new(1024 * 1024);
    }
}

SpooledTempFile combines benefits of in-memory and on-disk approaches.

Working with Large Files

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read, Seek, SeekFrom};
 
fn large_file_handling() {
    // Create with threshold appropriate for your system
    let mut file = SpooledTempFile::new(1024 * 1024);  // 1 MB
    
    // Write large amounts of data
    for _ in 0..1000 {
        file.write_all(&[0u8; 10000]).unwrap();  // Write 10 KB at a time
    }
    
    // Total: 10 MB, far exceeds threshold
    assert!(file.is_rolled_over());
    
    // Can still read and seek
    file.seek(SeekFrom::Start(0)).unwrap();
    
    let mut chunk = [0u8; 1000];
    file.read_exact(&mut chunk).unwrap();
    
    // Position tracking still works
    let pos = file.seek(SeekFrom::Current(0)).unwrap();
    assert_eq!(pos, 1000);
}
 
fn streaming_with_spooled() {
    // SpooledTempFile is useful for streaming scenarios
    use std::io::BufReader;
    
    let file = SpooledTempFile::new(64 * 1024);
    // Can use with BufReader for line-by-line reading
    // after writing data to it
}

Even after rollover, the file supports full Read, Write, and Seek operations.

Cleaning Up

use tempfile::spooled::SpooledTempFile;
use std::io::Write;
 
fn automatic_cleanup() {
    // SpooledTempFile cleans up automatically when dropped
    
    {
        let mut file = SpooledTempFile::new(100);
        file.write_all(b"temporary data").unwrap();
        
        if file.is_rolled_over() {
            // Temp file on disk is created
            // It will be deleted when `file` is dropped
        }
    }  // `file` dropped here
    
    // If it rolled over, the temp file is deleted
    // If it stayed in memory, nothing to clean up
    
    // This matches the behavior of other tempfile types
}
 
fn early_rollover() {
    // You can force rollover if needed
    let mut file = SpooledTempFile::new(1024);
    
    // Force data to disk immediately
    // (e.g., before a long operation that shouldn't hold memory)
    file.write_all(b"some data").unwrap();
    
    // If not yet rolled over, you could trigger it by:
    // 1. Writing more data than threshold
    // 2. Or accepting that it stays in memory
    
    // Note: There's no explicit "force_rollover" method
    // The design is that rollover happens automatically
}

Temporary files on disk are automatically cleaned up when the SpooledTempFile is dropped.

Complete Example: HTTP Response Handling

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read, Seek, SeekFrom};
 
fn handle_http_response(response_data: &[u8]) -> Vec<u8> {
    // Common pattern: buffer HTTP responses with bounds
    let mut buffer = SpooledTempFile::new(1024 * 1024);  // 1 MB
    
    // Write response
    buffer.write_all(response_data).unwrap();
    
    // Check if we rolled over (for logging/metrics)
    let used_disk = buffer.is_rolled_over();
    if used_disk {
        eprintln!("Warning: Large response used disk buffer");
    }
    
    // Read back for processing
    buffer.seek(SeekFrom::Start(0)).unwrap();
    let mut processed = Vec::new();
    buffer.read_to_end(&mut processed).unwrap();
    
    processed
}
 
fn handle_unknown_size_upload(chunks: &[&[u8]]) -> std::io::Result<Vec<u8>> {
    // Handle upload where total size is unknown
    let mut file = SpooledTempFile::new(10 * 1024 * 1024);  // 10 MB
    
    // Write chunks as they arrive
    for chunk in chunks {
        file.write_all(chunk)?;
    }
    
    // Process the complete upload
    file.seek(SeekFrom::Start(0))?;
    let mut result = Vec::new();
    file.read_to_end(&mut result)?;
    
    println!("Rolled over: {}", file.is_rolled_over());
    Ok(result)
}
 
fn main_example() {
    let small_data = b"Hello, world!";
    let large_data: Vec<u8> = (0..255).cycle().take(1024 * 1024).collect();
    
    // Small data stays in memory
    let mut small_file = SpooledTempFile::new(1024);
    small_file.write_all(small_data).unwrap();
    assert!(!small_file.is_rolled_over());
    
    // Large data rolls to disk
    let mut large_file = SpooledTempFile::new(1024);
    large_file.write_all(&large_data).unwrap();
    assert!(large_file.is_rolled_over());
}

This pattern is common in web servers handling requests/responses of unpredictable size.

Synthesis

Quick reference:

use tempfile::spooled::SpooledTempFile;
use std::io::{Write, Read, Seek, SeekFrom};
 
fn spooled_temp_file_summary() {
    // β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    // β”‚ Aspect              β”‚ Behavior                                      β”‚
    // β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
    // β”‚ Storage             β”‚ In memory up to threshold, then disk         β”‚
    // β”‚ Rollover            β”‚ Automatic on write that exceeds threshold    β”‚
    // β”‚ Threshold           β”‚ Specified in bytes at construction            β”‚
    // β”‚ After rollover      β”‚ Operations work same, just on disk           β”‚
    // β”‚ Cleanup             β”‚ Automatic when dropped                       β”‚
    // β”‚ Interfaces          β”‚ Read, Write, Seek implemented                β”‚
    // β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
    
    let mut file = SpooledTempFile::new(1024);  // 1 KB threshold
    
    // Write operations trigger rollover when exceeding threshold
    file.write_all(b"data").unwrap();
    
    // Check current storage
    let in_memory = !file.is_rolled_over();
    
    // Read/Write/Seek work regardless of storage
    file.seek(SeekFrom::Start(0)).unwrap();
    
    // Choosing threshold:
    // - Too small: Frequent disk I/O for common cases
    // - Too large: Memory exhaustion for large files
    // - Rule of thumb: Match your expected common case + buffer
}
 
// Key benefits:
// βœ… Fast for small data (in-memory operations)
// βœ… Safe for large data (automatic disk fallback)
// βœ… Bounded memory usage (configurable threshold)
// βœ… Consistent API (same interface in both modes)
// βœ… Automatic cleanup (temp files deleted on drop)
 
// When to use:
// βœ… Processing uploads/responses of unknown size
// βœ… Logging with bounded memory
// βœ… Serialization buffers
// βœ… Any "usually small, sometimes large" scenario
 
// When NOT to use:
// ❌ Data always fits in memory (use Vec<u8>)
// ❌ Data always large (use NamedTempFile)
// ❌ Need file path (use NamedTempFile)

Key insight: SpooledTempFile solves the "size unknown until you have it" problem elegantly. Instead of choosing between memory speed (with risk of OOM) and disk safety (with performance cost), you get both: fast in-memory operations for the common small case and automatic disk fallback for outliers. The threshold determines your tradeoff pointβ€”a larger threshold means more data stays in memory (faster) but higher peak memory usage. The rollover is transparent; after creation, you interact with the same Read + Write + Seek interface whether data is in memory or on disk. This makes it ideal for web servers, file processors, and any code handling variable-size data where you want bounded memory without sacrificing performance for typical cases.