How does `serde`'s `#[serde(untagged)]` enum representation affect deserialization performance?

#[serde(untagged)] removes the type tag from the serialized form, so each variant is written as its inner content with no wrapper. During deserialization serde has nothing to dispatch on: it buffers the input into an intermediate representation and then tries each variant in declaration order until one succeeds. The cost therefore grows with the number of variants tried before the match and with how much work each failed attempt does. Tagged enums usually deserialize faster because the tag tells serde which variant to parse, but untagged enums produce cleaner JSON for certain API designs.

Basic Tagged vs Untagged Enums

use serde::{Deserialize, Serialize};
 
// Externally tagged (default)
#[derive(Serialize, Deserialize)]
enum TaggedEnum {
    String(String),
    Number(i32),
}
 
// JSON: {"String": "hello"} or {"Number": 42}
 
// Internally tagged
#[derive(Serialize, Deserialize)]
#[serde(tag = "type")]
enum InternallyTagged {
    String { value: String },
    Number { value: i32 },
}
 
// JSON: {"type": "String", "value": "hello"}
 
// Untagged
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum UntaggedEnum {
    String(String),
    Number(i32),
}
 
// JSON: "hello" or 42 (no wrapper at all)

Untagged enums serialize to the inner content directly, without any type indicator.

Deserialization Matching Strategy

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Value {
    String(String),
    Number(i32),
    Boolean(bool),
}
 
fn main() {
    let json = r#"42"#;
    
    // Deserialization tries variants in order:
    // 1. Try to parse as String
    // 2. If that fails, try to parse as Number (i32)
    // 3. If that fails, try to parse as Boolean (bool)
    // 4. If all fail, return error
    
    let value: Value = serde_json::from_str(json).unwrap();
    println!("{:?}", value);  // Value::Number(42)
}

Untagged deserialization tries each variant sequentially until one succeeds.

Order Matters for Performance

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum SlowEnum {
    // Try complex types first - expensive on failure
    Complex(ComplexType),
    Simple(i32),
}
 
#[derive(Debug, Deserialize)]
struct ComplexType {
    field1: String,
    field2: Vec<i32>,
    field3: NestedType,
}
 
#[derive(Debug, Deserialize)]
struct NestedType {
    nested_field: Option<Box<ComplexType>>,  // Recursive!
}
 
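// For input 42:
// 1. Try ComplexType - fails (an integer is not a struct), but the attempt is still wasted work
// 2. Try i32 - succeeds
// Total work: one failed ComplexType attempt + i32 parsing
// For object-shaped input, a failed ComplexType attempt can be far more expensive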

Variant order affects how much work is attempted before finding the match.

Optimal Variant Ordering

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum OptimizedEnum {
    // Order by: most common first, cheapest to parse first
    Number(i32),           // Very common, fast to parse
    String(String),        // Common, reasonably fast
    Complex(ComplexType),  // Rare, expensive to parse
}
 
// For input 42:
// 1. Try i32 - succeeds immediately
// Total work: just i32 parsing
 
// Contrast with wrong order:
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum PessimisticEnum {
    Complex(ComplexType),  // Always tried first!
    String(String),
    Number(i32),
}
// Every Number input pays the cost of trying ComplexType first

Place common and cheap-to-parse variants earlier for better performance.

Trial Parsing Overhead

use serde::Deserialize;
use std::time::Instant;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum ManyVariants {
    V1(i32),
    V2(i32),
    V3(i32),
    V4(i32),
    V5(i32),
    V6(i32),
    V7(i32),
    V8(i32),
    V9(i32),
    V10(i32),
}
 
fn main() {
    let json = "42";
    
    let start = Instant::now();
    for _ in 0..10_000 {
        let _: ManyVariants = serde_json::from_str(json).unwrap();
    }
    let elapsed = start.elapsed();
    
    println!("Untagged with many variants: {:?}", elapsed);
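    // Note: every variant here wraps i32, so "42" always matches V1 on the first try.
    // Nine failed attempts before V10 only happen when the variants hold distinct
    // types and the input matches the last one.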
}

More variants mean more potential trial parsing attempts.

Comparison with Tagged Performance

use serde::Deserialize;
use std::time::Instant;
 
#[derive(Debug, Deserialize)]
enum TaggedEnum {
    V1(i32),
    V2(i32),
    V3(i32),
    V4(i32),
    V5(i32),
}
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum UntaggedEnum {
    V1(i32),
    V2(i32),
    V3(i32),
    V4(i32),
    V5(i32),
}
 
fn main() {
    let iterations = 100_000;
    
    // Externally tagged: the object key names the variant directly
    let tagged_json = r#"{"V5":42}"#;
    let start = Instant::now();
    for _ in 0..iterations {
        let _: TaggedEnum = serde_json::from_str(tagged_json).unwrap();
    }
    let tagged_time = start.elapsed();
    
    // Untagged: variants are tried in order ("42" matches V1 on the first attempt)
    let untagged_json = "42";
    let start = Instant::now();
    for _ in 0..iterations {
        let _: UntaggedEnum = serde_json::from_str(untagged_json).unwrap();
    }
    let untagged_time = start.elapsed();
    
    println!("Tagged: {:?}", tagged_time);
    println!("Untagged: {:?}", untagged_time);
    // Results depend on input shape and variant order; measure with representative data
}

Tagged enums know exactly which variant to parse; untagged must search.

Ambiguous Variants

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Ambiguous {
    // Both variants can match "hello"
    // Always matches first one
    String(String),
    AlsoString(String),
}
 
fn main() {
    let json = r#""hello""#;
    let value: Ambiguous = serde_json::from_str(json).unwrap();
    
    match value {
        Ambiguous::String(s) => println!("String: {}", s),
        Ambiguous::AlsoString(s) => println!("AlsoString: {}", s),
    }
    // Always prints "String: hello" - first match wins
}

First matching variant wins; overlapping types cause unexpected behavior.

Overlapping Type Detection

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Numbers {
    // Problem: all i32 values are also i64 values
    Small(i32),
    Large(i64),
}
 
fn main() {
    // 42 matches both Small and Large
    let json = "42";
    let value: Numbers = serde_json::from_str(json).unwrap();
    
    match value {
        Numbers::Small(n) => println!("Small: {}", n),
        Numbers::Large(n) => println!("Large: {}", n),
    }
    // Always Small - i32 comes first and 42 fits
}

Overlapping types create semantic ambiguity even if they compile.

String vs Number Ambiguity

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum StringOrNumber {
    String(String),
    Number(i32),
}
 
fn main() {
    // Clear distinction: strings vs numbers
    let s: StringOrNumber = serde_json::from_str(r#""hello""#).unwrap();
    let n: StringOrNumber = serde_json::from_str("42").unwrap();
    
    // No ambiguity - JSON types don't overlap
}

Non-overlapping JSON types avoid matching issues.

Complex Nested Types

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
struct User {
    id: u64,
    name: String,
    email: String,
}
 
#[derive(Debug, Deserialize)]
struct Product {
    id: u64,
    name: String,
    price: f64,
}
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Response {
    User(User),
    Product(Product),
    Error(String),
}
 
fn main() {
    // JSON that looks like User
    let user_json = r#"{"id": 1, "name": "Alice", "email": "alice@example.com"}"#;
    let response: Response = serde_json::from_str(user_json).unwrap();
    
    // Deserialization tries:
    // 1. User - succeeds (all required fields present)
    // Done, no need to try Product or Error
}

Struct matching succeeds when all required fields are present.

Field Overlap Detection

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
struct TypeA {
    name: String,
    value: i32,
}
 
#[derive(Debug, Deserialize)]
struct TypeB {
    name: String,
    value: i32,
    extra: Option<String>,
}
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Overlapping {
    A(TypeA),
    B(TypeB),
}
 
fn main() {
    // Both TypeA and TypeB can parse {"name": "test", "value": 42}
    // TypeA matches first (comes first in enum)
    let json = r#"{"name": "test", "value": 42}"#;
    let value: Overlapping = serde_json::from_str(json).unwrap();
    
    match value {
        Overlapping::A(a) => println!("A: {:?}", a),
        Overlapping::B(b) => println!("B: {:?}", b),
    }
    // Always prints "A" - TypeA matches first
}

Structs with overlapping fields match the first variant that succeeds.

Using deny_unknown_fields

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct User {
    id: u64,
    name: String,
}
 
#[derive(Debug, Deserialize)]
#[serde(deny_unknown_fields)]
struct Product {
    id: u64,
    name: String,
    price: f64,
}
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Item {
    User(User),
    Product(Product),
}
 
fn main() {
    // With deny_unknown_fields:
    // {"id": 1, "name": "Alice"} -> User
    // {"id": 1, "name": "Widget", "price": 9.99} -> Product (User rejects the unknown "price" field)
    // {"id": 1, "name": "Alice", "extra": true} -> Error (both structs reject "extra")
}

deny_unknown_fields helps distinguish structs with overlapping fields.

Performance with Large Enums

use serde::Deserialize;
use std::time::Instant;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum LargeEnum {
    V0(i32), V1(i32), V2(i32), V3(i32), V4(i32),
    V5(i32), V6(i32), V7(i32), V8(i32), V9(i32),
    V10(i32), V11(i32), V12(i32), V13(i32), V14(i32),
    V15(i32), V16(i32), V17(i32), V18(i32),
    V19(String),  // Distinct type, so string input only matches the last variant
}
 
fn main() {
    let iterations = 50_000;
    
    // Match last variant - worst case (19 failed i32 attempts, then String)
    let json = r#""hello""#;
    let start = Instant::now();
    for _ in 0..iterations {
        let _: LargeEnum = serde_json::from_str(json).unwrap();
    }
    let worst_case = start.elapsed();
    
    // Match first variant - best case
    let json = "42";
    let start = Instant::now();
    for _ in 0..iterations {
        let _: LargeEnum = serde_json::from_str(json).unwrap();
    }
    let best_case = start.elapsed();
    
    println!("First variant match: {:?}", best_case);
    println!("Last variant match: {:?}", worst_case);
    // Failed i32 attempts are cheap; the gap compounds for complex types
}
}

Worst case is matching the last variant after trying all others.

Alternatives to Untagged

use serde::Deserialize;
 
// Option 1: Internally tagged
#[derive(Debug, Deserialize)]
#[serde(tag = "type")]
enum InternallyTagged {
    User { id: u64, name: String },
    Product { id: u64, name: String, price: f64 },
}
 
// Option 2: Externally tagged (default)
#[derive(Debug, Deserialize)]
enum ExternallyTagged {
    User { id: u64, name: String },
    Product { id: u64, name: String, price: f64 },
}
 
// Option 3: Adjacently tagged
#[derive(Debug, Deserialize)]
#[serde(tag = "type", content = "data")]
enum AdjacentlyTagged {
    User { id: u64, name: String },
    Product { id: u64, name: String, price: f64 },
}
 
// All tagged variants have O(1) variant selection
// Untagged requires O(n) trial parsing in worst case

Tagged variants provide O(1) lookup; untagged is O(n).

When to Use Untagged

use serde::Deserialize;
 
// Good use case: parsing external APIs with varying formats
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum ApiValue {
    String(String),
    Number(f64),
    Boolean(bool),
    Null,
    Array(Vec<ApiValue>),
    Object(serde_json::Value),
}
 
// Good use case: union of similar types
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum IdOrName {
    Id(u64),
    Name(String),
}
 
// Bad use case: many similar variants
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum BadExample {
    V1(StructA),  // Similar to V2
    V2(StructB),  // Similar to V1
    V3(StructC),  // Similar to V4
    V4(StructD),  // Similar to V3
    // Many attempts, confusion about which matches
}

Use untagged for flexible parsing, avoid for many similar variants.

Benchmark: Tagged vs Untagged

use serde::Deserialize;
use std::time::Instant;
 
#[derive(Debug, Deserialize)]
// Adjacently tagged: internal tagging (tag only) cannot represent newtype variants that wrap primitives
#[serde(tag = "type", content = "value")]
enum Tagged {
    Number(i32),
    Text(String),
    Flag(bool),
}
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum Untagged {
    Number(i32),
    Text(String),
    Flag(bool),
}
 
fn main() {
    let iterations = 100_000;
    
    // Tagged JSON
    let tagged_json = r#"{"type":"Text","value":"hello"}"#;
    let start = Instant::now();
    for _ in 0..iterations {
        let _: Tagged = serde_json::from_str(tagged_json).unwrap();
    }
    println!("Tagged: {:?}", start.elapsed());
    
    // Untagged JSON
    let untagged_json = r#""hello""#;
    let start = Instant::now();
    for _ in 0..iterations {
        let _: Untagged = serde_json::from_str(untagged_json).unwrap();
    }
    println!("Untagged: {:?}", start.elapsed());
    
    // Tagged: O(1) variant lookup
    // Untagged: O(n) trial parsing, but smaller JSON
}

Untagged produces smaller JSON but requires trial parsing.

Memory Overhead During Parsing

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum MemoryExample {
    Small(i32),
    Large(Vec<String>),
}
 
fn main() {
    // For JSON "42":
    // 1. Try Small(i32) - succeeds
    // Minimal allocation
    
    // For JSON ["a", "b", "c", ...many more...]:
    // 1. Try Small(i32) - fails immediately (not a number)
    // 2. Try Large(Vec<String>) - allocates Vec, succeeds
    
    // For JSON {"unexpected": "structure"}:
    // 1. Try Small(i32) - fails
    // 2. Try Large(Vec<String>) - fails
    // Returns error, no successful allocation
}

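Serde buffers the input before trying variants, so a large input is held once as an intermediate value and again in the final result; failed variant attempts may also allocate temporarily before the next variant is tried.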

Error Messages

use serde::Deserialize;
 
#[derive(Debug, Deserialize)]
#[serde(untagged)]
enum StrictEnum {
    First(String),
    Second(i32),
}
 
fn main() {
    // Invalid input
    let json = r#"{"invalid": "data"}"#;
    
    match serde_json::from_str::<StrictEnum>(json) {
        Ok(_) => println!("Parsed"),
        Err(e) => println!("Error: {}", e),
    }
    // Error: data did not match any variant of untagged enum StrictEnum
    // Less specific than tagged enum errors: it does not say why each variant failed
}

Untagged errors are less specific: they report that no variant matched rather than which field or type was wrong.

Summary Table

Aspect	Tagged	Untagged
Variant selection	O(1) by tag	O(n) trial parsing
JSON size	Larger (includes tag)	Smaller (no tag)
Error messages	Specific to variant	Generic "no match"
Ambiguity	None (tag is authoritative)	Possible (first match wins)
API design	Explicit type field	Implicit by content
Deserialization speed	Faster (direct lookup)	Slower (trial parsing)

Synthesis

#[serde(untagged)] trades deserialization performance for JSON representation flexibility:

The matching process: Serde attempts to deserialize into each variant in declaration order. The first successful parse wins. This trial parsing continues until a match is found or all variants are exhausted. For a ten-variant enum where the input matches the last variant, nine failed deserialization attempts occur before success.
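The matching process can be modelled without serde at all. The following is a simplified, std-only sketch and not serde's generated code: `Buffered`, `Value`, the `try_*` functions, and `match_in_order` are all hypothetical names. It assumes the input has already been buffered into an intermediate value, then tries each variant in declaration order and counts the attempts.

```rust
// Simplified model of untagged matching; NOT serde's actual implementation.

// Stand-in for the buffered input (hypothetical).
#[derive(Debug)]
enum Buffered {
    Bool(bool),
    Int(i64),
    Str(String),
}

// Target enum with the same variant order as the `Value` example.
#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Number(i32),
    Boolean(bool),
}

fn try_string(input: &Buffered) -> Option<Value> {
    match input {
        Buffered::Str(s) => Some(Value::String(s.clone())),
        _ => None,
    }
}

fn try_number(input: &Buffered) -> Option<Value> {
    match input {
        // An integer that does not fit in i32 counts as a failed attempt.
        Buffered::Int(n) => i32::try_from(*n).ok().map(Value::Number),
        _ => None,
    }
}

fn try_boolean(input: &Buffered) -> Option<Value> {
    match input {
        Buffered::Bool(b) => Some(Value::Boolean(*b)),
        _ => None,
    }
}

// Try each variant in declaration order; return the first match and the
// number of attempts it took (or all attempts if nothing matched).
fn match_in_order(input: &Buffered) -> (Option<Value>, usize) {
    let attempts: [fn(&Buffered) -> Option<Value>; 3] = [try_string, try_number, try_boolean];
    for (index, attempt) in attempts.iter().enumerate() {
        if let Some(value) = attempt(input) {
            return (Some(value), index + 1);
        }
    }
    (None, attempts.len())
}

fn main() {
    println!("{:?}", match_in_order(&Buffered::Str("hi".to_string()))); // 1 attempt
    println!("{:?}", match_in_order(&Buffered::Int(42)));               // 2 attempts
    println!("{:?}", match_in_order(&Buffered::Bool(true)));            // 3 attempts
    println!("{:?}", match_in_order(&Buffered::Int(i64::MAX)));         // no match
}
```

The attempt count is the quantity that variant ordering controls: moving the most frequent shape to the front reduces it for the common case.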

Performance characteristics: The cost grows with the number of variants tried and with the work each failed attempt does. Simple types like i32 fail quickly: the deserializer sees a non-number and moves on. Structs with many fields may get well into the input before failing on a missing required field. Ordering variants by frequency and parsing cost minimizes the average cost, provided the reordering does not change which variant an ambiguous input matches.

When it's worth it: External APIs you don't control often use untagged representations, where a response might be a string, number, or object depending on context. Untagged handles this naturally without custom parsing code. For internal APIs or performance-critical paths, tagged enums select the variant directly from the tag and give clearer error messages.

Key insight: The ambiguity of untagged enums is both their strength and weakness. They accept more input variations cleanly but make debugging harder when data doesn't match any variant. The error message tells you "no variant matched" without explaining why each variant failed. Use untagged when you need flexible parsing of varied inputs; prefer tagged when you control both sides of the API and care about performance or error clarity.
