What is the purpose of `nom::character::complete::multispace0` for flexible whitespace parsing?

nom::character::complete::multispace0 is a parser that matches zero or more whitespace characters (spaces, tabs, line feeds, and carriage returns) and returns the matched whitespace as a slice of the input. Unlike space0, which matches only spaces and tabs, multispace0 also matches line endings, so it suits formats where whitespace can span multiple lines or where all whitespace between tokens should be consumed regardless of line boundaries. The complete variant treats the end of input as the end of the match: it returns Ok with whatever it matched and never returns Err::Incomplete, which is appropriate when the entire input is already in memory. This makes multispace0 a common building block for parsers of configuration files, programming languages, and data serialization formats.

Basic multispace0 Usage

use nom::character::complete::multispace0;
use nom::IResult;
 
fn main() {
    // multispace0 matches zero or more whitespace characters
    // Including: space, tab, newline, carriage return
    
    // Empty input - matches zero characters
    let result: IResult<&str, &str> = multispace0("");
    assert_eq!(result, Ok(("", "")));
    
    // No whitespace - matches zero characters
    let result: IResult<&str, &str> = multispace0("hello");
    assert_eq!(result, Ok(("hello", "")));
    
    // Spaces
    let result: IResult<&str, &str> = multispace0("   hello");
    assert_eq!(result, Ok(("hello", "   ")));
    
    // Tabs
    let result: IResult<&str, &str> = multispace0("\t\t\tdata");
    assert_eq!(result, Ok(("data", "\t\t\t")));
    
    // Newlines
    let result: IResult<&str, &str> = multispace0("\n\n\ntext");
    assert_eq!(result, Ok(("text", "\n\n\n")));
    
    // Mixed whitespace
    let result: IResult<&str, &str> = multispace0(" \t\n\r\n  more");
    assert_eq!(result, Ok(("more", " \t\n\r\n  ")));
    
    println!("All tests passed!");
}

multispace0 consumes the leading run of whitespace and returns it alongside the remaining input; it stops at the first non-whitespace character.

Whitespace Characters Recognized

use nom::character::complete::multispace0;
use nom::IResult;
 
fn main() {
    // multispace0 recognizes these characters:
    // - ' '  (space, 0x20)
    // - '\t' (tab, 0x09)
    // - '\n' (line feed, 0x0A)
    // - '\r' (carriage return, 0x0D)
    
    // Space
    let input = "   ";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "   ")));
    
    // Tab
    let input = "\t\t";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "\t\t")));
    
    // Newline (line feed)
    let input = "\n\n";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "\n\n")));
    
    // Carriage return
    let input = "\r\r";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "\r\r")));
    
    // Windows-style line endings (CRLF)
    let input = "\r\n\r\n";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "\r\n\r\n")));
    
    // Mixed whitespace
    let input = "  \t\n  \r\n  ";
    let result: IResult<&str, &str> = multispace0(input);
    assert_eq!(result, Ok(("", "  \t\n  \r\n  ")));
    
    // Note: Other Unicode whitespace is NOT matched
    // U+00A0 (non-breaking space) - NOT matched
    // U+2003 (em space) - NOT matched
    
    println!("All whitespace character tests passed!");
}

multispace0 matches exactly the four ASCII whitespace characters, not Unicode whitespace categories.
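To make the character set concrete without depending on nom, here is a minimal std-only sketch of the behavior described above. The helper `ascii_multispace0` is hypothetical and is not nom's implementation; it only assumes the four-character set listed earlier and returns `(remaining, matched)` in the same order nom uses. It is compared against `str::trim_start`, which follows Unicode's whitespace definition.

```rust
/// Hypothetical std-only model of `multispace0` (not nom's actual code).
/// Splits off the longest prefix made of ' ', '\t', '\n', '\r' and
/// returns `(remaining, matched)`, mirroring nom's result order.
fn ascii_multispace0(input: &str) -> (&str, &str) {
    let is_ws = |c: char| matches!(c, ' ' | '\t' | '\n' | '\r');
    // Byte index of the first character outside the set, or the whole input.
    let end = input.find(|c: char| !is_ws(c)).unwrap_or(input.len());
    let (matched, remaining) = input.split_at(end);
    (remaining, matched)
}

/// Small demonstration; call it from `main` to run the checks.
fn demo_ascii_vs_unicode() {
    // The four ASCII characters are consumed.
    assert_eq!(ascii_multispace0(" \t\r\nkey"), ("key", " \t\r\n"));

    // A non-breaking space (U+00A0) stops the match immediately...
    assert_eq!(ascii_multispace0("\u{00A0}key"), ("\u{00A0}key", ""));

    // ...even though the standard library treats it as whitespace.
    assert!('\u{00A0}'.is_whitespace());
    assert_eq!("\u{00A0}key".trim_start(), "key");
}
```

If input may contain Unicode spaces, one option under this model is to normalize them before parsing, since a parser built on the four-character set would otherwise stop at them.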

multispace0 vs space0

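use nom::character::complete::{multispace0, space0};
use nom::error::Error;
 
fn main() {
    // space0: matches only spaces and tabs (no newlines)
    // multispace0: matches spaces, tabs, AND newlines
    
    // Pin the error type; otherwise assert_eq! cannot infer it
    let space0 = space0::<&str, Error<&str>>;
    let multispace0 = multispace0::<&str, Error<&str>>;
    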
    let input = "hello";
    assert_eq!(space0(input), Ok(("hello", "")));
    assert_eq!(multispace0(input), Ok(("hello", "")));
    
    // Spaces work the same
    let input = "   hello";
    assert_eq!(space0(input), Ok(("hello", "   ")));
    assert_eq!(multispace0(input), Ok(("hello", "   ")));
    
    // Tabs work the same
    let input = "\thello";
    assert_eq!(space0(input), Ok(("hello", "\t")));
    assert_eq!(multispace0(input), Ok(("hello", "\t")));
    
    // NEWLINES are the difference!
    let input = "\nhello";
    
    // space0 does NOT match newlines
    let result = space0(input);
    assert_eq!(result, Ok(("\nhello", ""))); // No match
    
    // multispace0 DOES match newlines
    let result = multispace0(input);
    assert_eq!(result, Ok(("hello", "\n"))); // Matched!
    
    // Multi-line input
    let input = "  \n  hello";
    
    // space0 stops at newline
    assert_eq!(space0(input), Ok(("\n  hello", "  ")));
    
    // multispace0 consumes all
    assert_eq!(multispace0(input), Ok(("hello", "  \n  ")));
    
    println!("Comparison tests passed!");
}

The key difference: space0 stops at newlines while multispace0 consumes them.

multispace0 vs multispace1

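use nom::character::complete::{multispace0, multispace1};
use nom::error::Error;
 
fn main() {
    // multispace0: zero or more (always succeeds)
    // multispace1: one or more (fails if no whitespace)
    
    // Pin the error type; otherwise assert_eq! cannot infer it
    let multispace0 = multispace0::<&str, Error<&str>>;
    let multispace1 = multispace1::<&str, Error<&str>>;
    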
    // Both match when whitespace exists
    let input = "  hello";
    assert_eq!(multispace0(input), Ok(("hello", "  ")));
    assert_eq!(multispace1(input), Ok(("hello", "  ")));
    
    // Difference: zero whitespace case
    let input = "hello";
    
    // multispace0 succeeds with empty match
    assert_eq!(multispace0(input), Ok(("hello", "")));
    
    // multispace1 fails (needs at least one whitespace)
    let result = multispace1(input);
    assert!(result.is_err());
    
    // Use multispace0 when whitespace is optional
    // Use multispace1 when whitespace is required
    
    println!("multispace0 vs multispace1 tests passed!");
}

multispace0 always succeeds; multispace1 requires at least one whitespace character.

Parsing Tokens with Optional Whitespace

use nom::character::complete::{multispace0, char, digit1};
use nom::IResult;
 
// Parse a pair of numbers like "1 , 2" with flexible whitespace
fn parse_tuple(input: &str) -> IResult<&str, (&str, &str)> {
    let (remaining, _) = multispace0(input)?;  // Leading whitespace
    let (remaining, first) = digit1(remaining)?;
    let (remaining, _) = multispace0(remaining)?;  // Before comma
    let (remaining, _) = char(',')(remaining)?;
    let (remaining, _) = multispace0(remaining)?;  // After comma
    let (remaining, second) = digit1(remaining)?;
    let (remaining, _) = multispace0(remaining)?;  // Trailing whitespace
    
    Ok((remaining, (first, second)))
}
 
fn main() {
    // Flexible whitespace handling
    let result = parse_tuple("1,2");
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    let result = parse_tuple("1 , 2");
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    let result = parse_tuple("1  ,  2");
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    let result = parse_tuple("  1,2  ");
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    let result = parse_tuple("1\n,\n2");  // Newlines
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    let result = parse_tuple("1\t,\t2");  // Tabs
    assert_eq!(result, Ok(("", ("1", "2"))));
    
    println!("Token parsing tests passed!");
}

Use multispace0 between tokens to handle flexible whitespace formatting.

Building a Whitespace-Tolerant Parser

use nom::character::complete::{multispace0, alpha1, digit1, char};
use nom::combinator::map;
use nom::multi::many0;
use nom::branch::alt;
use nom::IResult;
 
// Helper: skip optional leading whitespace, then run the wrapped parser.
// FnMut (rather than Fn) is required because combinators such as alt
// return `impl FnMut` in nom 6 and 7.
fn ws<'a, F, O>(mut parser: F) -> impl FnMut(&'a str) -> IResult<&'a str, O>
where
    F: FnMut(&'a str) -> IResult<&'a str, O>,
{
    move |input: &'a str| {
        let (input, _) = multispace0(input)?;
        parser(input)
    }
}
 
#[derive(Debug, PartialEq)]
enum Token {
    Number(i64),
    Identifier(String),
    Plus,
    Minus,
}
 
fn parse_number(input: &str) -> IResult<&str, Token> {
    map(digit1, |s: &str| {
        Token::Number(s.parse().unwrap())
    })(input)
}
 
fn parse_identifier(input: &str) -> IResult<&str, Token> {
    map(alpha1, |s: &str| Token::Identifier(s.to_string()))(input)
}
 
fn parse_plus(input: &str) -> IResult<&str, Token> {
    map(char('+'), |_| Token::Plus)(input)
}
 
fn parse_minus(input: &str) -> IResult<&str, Token> {
    map(char('-'), |_| Token::Minus)(input)
}
 
fn parse_token(input: &str) -> IResult<&str, Token> {
    ws(alt((parse_number, parse_identifier, parse_plus, parse_minus)))(input)
}
 
fn parse_tokens(input: &str) -> IResult<&str, Vec<Token>> {
    // parse_token already skips leading whitespace via ws,
    // so no extra wrapping is needed here
    many0(parse_token)(input)
}
 
fn main() {
    let result = parse_tokens("1 + abc - 456");
    assert_eq!(result, Ok(("", vec![
        Token::Number(1),
        Token::Plus,
        Token::Identifier("abc".to_string()),
        Token::Minus,
        Token::Number(456),
    ])));
    
    // Handles various whitespace
    let result = parse_tokens("1\n+\tabc\r\n-\n456");
    assert_eq!(result, Ok(("", vec![
        Token::Number(1),
        Token::Plus,
        Token::Identifier("abc".to_string()),
        Token::Minus,
        Token::Number(456),
    ])));
    
    println!("Whitespace-tolerant parser tests passed!");
}

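The ws helper skips leading whitespace with multispace0 before running the wrapped parser, which keeps whitespace handling consistent across tokens. Trailing whitespace after the last token is left in the remaining input.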

Parsing Configuration Files

use nom::character::complete::{multispace0, alphanumeric1, char};
use nom::combinator::map;
use nom::multi::many0;
use nom::branch::alt;
use nom::IResult;
use std::collections::HashMap;
 
#[derive(Debug, PartialEq)]
struct Config {
    entries: HashMap<String, String>,
}
 
fn parse_key(input: &str) -> IResult<&str, &str> {
    alphanumeric1(input)
}
 
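fn parse_value(input: &str) -> IResult<&str, &str> {
    // Value is everything up to the line ending (or end of input),
    // with trailing whitespace trimmed.
    // Note: take_till lives in nom::bytes, not nom::character.
    nom::bytes::complete::take_till(|c: char| c == '\n' || c == '\r')(input)
        .map(|(i, v)| (i, v.trim_end()))
}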
 
fn parse_entry(input: &str) -> IResult<&str, (String, String)> {
    let (input, _) = multispace0(input)?;
    let (input, key) = parse_key(input)?;
    let (input, _) = multispace0(input)?;
    let (input, _) = char('=')(input)?;
    let (input, _) = multispace0(input)?;
    let (input, value) = parse_value(input)?;
    let (input, _) = multispace0(input)?;
    
    Ok((input, (key.to_string(), value.to_string())))
}
 
fn parse_comment(input: &str) -> IResult<&str, ()> {
    let (input, _) = char('#')(input)?;
    let (input, _) = nom::bytes::complete::take_till(|c: char| c == '\n')(input)?;
    let (input, _) = multispace0(input)?;
    Ok((input, ()))
}
 
fn parse_config(input: &str) -> IResult<&str, Config> {
    let (input, _) = multispace0(input)?;
    let (input, entries) = many0(alt((
        map(parse_comment, |_| None),
        map(parse_entry, Some),
    )))(input)?;
    
    let entries: HashMap<String, String> = entries
        .into_iter()
        .flatten()
        .collect();
    
    Ok((input, Config { entries }))
}
 
fn main() {
    let input = r#"
        # Configuration file
        host = localhost
        port = 8080
        
        # Database settings
        database = myapp_db
    "#;
    
    let result = parse_config(input).unwrap();
    assert_eq!(result.1.entries.get("host"), Some(&"localhost".to_string()));
    assert_eq!(result.1.entries.get("port"), Some(&"8080".to_string()));
    assert_eq!(result.1.entries.get("database"), Some(&"myapp_db".to_string()));
    
    println!("Config parsed: {:?}", result.1);
}

multispace0 absorbs the indentation, blank lines, and line endings around entries and comments; the comment text itself is skipped by parse_comment.

Parsing Nested Structures

use nom::character::complete::{multispace0, char, digit1};
use nom::multi::many0;
use nom::combinator::map;
use nom::IResult;
 
// Parse nested lists like [1, [2, 3], 4]
#[derive(Debug, PartialEq)]
enum Value {
    Number(i64),
    List(Vec<Value>),
}
 
fn parse_number(input: &str) -> IResult<&str, Value> {
    let (input, _) = multispace0(input)?;
    map(digit1, |s: &str| Value::Number(s.parse().unwrap()))(input)
}
 
fn parse_list(input: &str) -> IResult<&str, Value> {
    let (input, _) = multispace0(input)?;
    let (input, _) = char('[')(input)?;
    let (input, _) = multispace0(input)?;
    
    let (input, values) = many0(|i| {
        let (i, v) = parse_value(i)?;
        let (i, _) = multispace0(i)?;
        let (i, _) = opt_comma(i)?;
        Ok((i, v))
    })(input)?;
    
    let (input, _) = multispace0(input)?;
    let (input, _) = char(']')(input)?;
    
    Ok((input, Value::List(values)))
}
 
fn opt_comma(input: &str) -> IResult<&str, Option<char>> {
    let (input, _) = multispace0(input)?;
    let result = char(',')(input);
    match result {
        Ok((i, c)) => {
            let (i, _) = multispace0(i)?;
            Ok((i, Some(c)))
        }
        Err(_) => Ok((input, None)),
    }
}
 
fn parse_value(input: &str) -> IResult<&str, Value> {
    let (input, _) = multispace0(input)?;
    nom::branch::alt((parse_number, parse_list))(input)
}
 
fn main() {
    let result = parse_value("[1, 2, 3]").unwrap();
    assert_eq!(result.1, Value::List(vec![
        Value::Number(1),
        Value::Number(2),
        Value::Number(3),
    ]));
    
    let result = parse_value("[ 1 , [ 2 , 3 ] , 4 ]").unwrap();
    assert_eq!(result.1, Value::List(vec![
        Value::Number(1),
        Value::List(vec![Value::Number(2), Value::Number(3)]),
        Value::Number(4),
    ]));
    
    let result = parse_value("[\n  1,\n  [\n    2,\n    3\n  ]\n]").unwrap();
    assert_eq!(result.1, Value::List(vec![
        Value::Number(1),
        Value::List(vec![Value::Number(2), Value::Number(3)]),
    ]));
    
    println!("Nested structure tests passed!");
}

multispace0 enables parsing of multiline, nested structures with flexible formatting.

Complete vs Streaming Variants

// nom::character::complete::multispace0
// nom::character::streaming::multispace0
 
// The difference is in handling incomplete input:
 
use nom::character::complete::multispace0 as complete_multispace0;
use nom::character::streaming::multispace0 as streaming_multispace0;
use nom::IResult;
 
fn main() {
    // Complete variant:
    // - Assumes entire input is available
    // - Returns Ok even if input ends
    // - Never returns Err(Err::Incomplete)
    
    let result: IResult<&str, &str> = complete_multispace0("  ");
    assert_eq!(result, Ok(("", "  ")));
    
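    // Streaming variant:
    // - Designed for partial input
    // - Returns Incomplete when the input ends inside a whitespace run,
    //   because more whitespace could still arrive
    // - Use for network/async parsing
    
    // "  " ends in whitespace, so the streaming variant asks for more data
    let result: IResult<&str, &str> = streaming_multispace0("  ");
    assert!(matches!(result, Err(nom::Err::Incomplete(_))));
    
    // Once a non-whitespace character is seen, both variants agree
    let result: IResult<&str, &str> = streaming_multispace0("  x");
    assert_eq!(result, Ok(("x", "  ")));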
    
    // For most text parsing (files, strings in memory),
    // use the complete variant
    
    // For streaming parsing (network, async I/O),
    // use the streaming variant
}

Use complete for in-memory parsing and streaming for incremental/async parsing.
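The reasoning behind the two variants can be modeled without nom. The sketch below is a simplified, std-only illustration: `Step`, `complete_model`, and `streaming_model` are hypothetical names, not nom types or functions. It assumes the rule described above, namely that a streaming parser can only be sure a whitespace run has ended once it sees a non-whitespace character, and it shows a caller retrying after a second chunk arrives.

```rust
/// Outcome of the hypothetical models below (these are not nom types).
#[derive(Debug, PartialEq)]
enum Step<'a> {
    /// Whitespace run is finished: `(remaining, matched)`.
    Done(&'a str, &'a str),
    /// The buffer ended while still inside a whitespace run.
    NeedMore,
}

fn is_ws(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\r')
}

/// "Complete" model: end of buffer means end of input, so always `Done`.
fn complete_model(buf: &str) -> Step<'_> {
    let end = buf.find(|c: char| !is_ws(c)).unwrap_or(buf.len());
    Step::Done(&buf[end..], &buf[..end])
}

/// "Streaming" model: only a non-whitespace character proves the run ended.
fn streaming_model(buf: &str) -> Step<'_> {
    match buf.find(|c: char| !is_ws(c)) {
        Some(end) => Step::Done(&buf[end..], &buf[..end]),
        None => Step::NeedMore, // more whitespace could still arrive
    }
}

/// Small demonstration; call it from `main` to run the checks.
fn demo_streaming() {
    // First chunk: whitespace only.
    let mut buf = String::from("  ");
    assert_eq!(complete_model(&buf), Step::Done("", "  "));
    assert_eq!(streaming_model(&buf), Step::NeedMore);

    // Second chunk arrives; the caller retries on the grown buffer.
    buf.push_str("\n key");
    assert_eq!(streaming_model(&buf), Step::Done("key", "  \n "));
}
```

The design point is that the streaming model never guesses: on a whitespace-only buffer it reports that it needs more data instead of committing to a match that a later chunk could extend.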

Performance Considerations

use nom::character::complete::{alpha1, multispace0};
use nom::IResult;
 
fn main() {
    // multispace0 is efficient: O(n) where n is whitespace length
    // It just scans forward until non-whitespace is found
    
    // For parsing performance, position multispace0 calls:
    // 1. At the start of parsers (consume leading whitespace)
    // 2. Between tokens (consume separating whitespace)
    // 3. At the end of parsers (consume trailing whitespace)
    
    // Avoid calling multispace0 unnecessarily
    
    // Example: inefficient
    fn inefficient(input: &str) -> IResult<&str, &str> {
        let (i, _) = multispace0(input)?;
        let (i, _) = multispace0(i)?;  // Redundant
        let (i, result) = alpha1(i)?;
        let (i, _) = multispace0(i)?;
        let (i, _) = multispace0(i)?;  // Redundant
        Ok((i, result))
    }
    
    // Example: efficient
    fn efficient(input: &str) -> IResult<&str, &str> {
        let (i, _) = multispace0(input)?;
        let (i, result) = alpha1(i)?;
        let (i, _) = multispace0(i)?;
        Ok((i, result))
    }
    
    // Tip: Create a helper for whitespace-wrapped parsing
    fn ws_wrap<'a, F, O>(parser: F) -> impl Fn(&'a str) -> IResult<&'a str, O>
    where
        F: Fn(&'a str) -> IResult<&'a str, O>,
    {
        move |input| {
            let (input, _) = multispace0(input)?;
            parser(input)
        }
    }
    
    println!("Performance pattern examples shown");
}

multispace0 is linear-time; avoid redundant calls for better performance.

Synthesis

Whitespace parsers comparison:

| Parser | Matches | Can Match Zero | Matches Newlines |
|---|---|---|---|
| `space0` | space, tab | Yes | No |
| `space1` | space, tab | No (requires 1+) | No |
| `multispace0` | space, tab, newline, CR | Yes | Yes |
| `multispace1` | space, tab, newline, CR | No (requires 1+) | Yes |

Common patterns:

| Pattern | Use Case |
|---|---|
| `preceded(multispace0, parser)` | Consume leading whitespace |
| `terminated(parser, multispace0)` | Consume trailing whitespace |
| `delimited(multispace0, parser, multispace0)` | Whitespace around content |
| `many0(preceded(multispace0, parser))` | Whitespace-separated items |
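
The "whitespace around content" row consumes whitespace on both sides of a token, unlike a helper that only skips leading whitespace. As a rough illustration of that shape, here is a std-only sketch; `skip_ws`, `ws_delimited`, and `digits` are hypothetical helpers, and `Option<(remaining, output)>` is an assumed simplification standing in for nom's `IResult`.

```rust
/// Drop leading ' ', '\t', '\n', '\r' (the set multispace0 matches).
fn skip_ws(input: &str) -> &str {
    input.trim_start_matches(|c: char| matches!(c, ' ' | '\t' | '\n' | '\r'))
}

/// Model of the delimited pattern: skip whitespace, run `parser`,
/// then skip whitespace again. Fails only if `parser` fails.
fn ws_delimited<'a, T>(
    parser: impl Fn(&'a str) -> Option<(&'a str, T)>,
) -> impl Fn(&'a str) -> Option<(&'a str, T)> {
    move |input| {
        let (rest, out) = parser(skip_ws(input))?;
        Some((skip_ws(rest), out))
    }
}

/// Toy token parser: one or more ASCII digits.
fn digits(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

/// Small demonstration; call it from `main` to run the checks.
fn demo_ws_delimited() {
    let number = ws_delimited(digits);
    // Whitespace on both sides of the token is consumed.
    assert_eq!(number("  42 \n rest"), Some(("rest", "42")));
    // Whitespace is optional on either side.
    assert_eq!(number("42"), Some(("", "42")));
    // Skipping whitespace never rescues a failing inner parser.
    assert_eq!(number("  x"), None);
}
```

Consuming trailing whitespace this way means the next parser starts directly at a token, at the cost of each wrapped parser doing two whitespace scans instead of one.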

When to use multispace0:

| Use Case | Reason |
|---|---|
| Configuration files | Whitespace can span lines |
| Programming languages | Flexible formatting |
| Data formats | Multi-line structures |
| Text between tokens | Handles all whitespace |

Key insight: multispace0 is the flexible whitespace workhorse for text parsers in nom. Its key distinction from space0 is that it matches line endings: formats that allow tokens to span multiple lines need multispace0, while line-oriented formats, where a newline is itself significant, should use space0 so the newline stays in the input. The complete variant never returns Incomplete, which is what you want for strings and files that are fully in memory. Calling multispace0 before and after each token yields parsers that tolerate any whitespace formatting, matching the flexibility users expect in most text-based formats. In recursive parsers (like nested lists), applying multispace0 at each level lets formatting vary at any depth without breaking the parse.
