Rust walkthroughs
nom::character::complete::multispace0 for flexible whitespace parsing?
nom::character::complete::multispace0 is a parser that matches zero or more whitespace characters (spaces, tabs, line feeds, and carriage returns) and returns the matched whitespace as a string slice. Unlike space0, which matches only spaces and tabs, multispace0 also matches line feeds and carriage returns, so it suits formats where whitespace can span multiple lines or where all whitespace between tokens should be consumed regardless of line boundaries. The complete variant treats the end of the input as the end of the match: it never returns Incomplete, and because a zero-length match is allowed, multispace0 itself always succeeds. This is appropriate for most text parsing, where the entire input is already in memory. These properties make multispace0 a common building block for parsers of configuration files, programming languages, and data serialization formats.
use nom::character::complete::multispace0;
use nom::IResult;
fn main() {
// multispace0 matches zero or more whitespace characters
// Including: space, tab, newline, carriage return
// Empty input - matches zero characters
let result: IResult<&str, &str> = multispace0("");
assert_eq!(result, Ok(("", "")));
// No whitespace - matches zero characters
let result: IResult<&str, &str> = multispace0("hello");
assert_eq!(result, Ok(("hello", "")));
// Spaces
let result: IResult<&str, &str> = multispace0(" hello");
assert_eq!(result, Ok(("hello", " ")));
// Tabs
let result: IResult<&str, &str> = multispace0("\t\t\tdata");
assert_eq!(result, Ok(("data", "\t\t\t")));
// Newlines
let result: IResult<&str, &str> = multispace0("\n\n\ntext");
assert_eq!(result, Ok(("text", "\n\n\n")));
// Mixed whitespace
let result: IResult<&str, &str> = multispace0(" \t\n\r\n more");
assert_eq!(result, Ok(("more", " \t\n\r\n ")));
println!("All tests passed!");
}
multispace0 consumes all leading whitespace and returns it together with the remaining input.
use nom::character::complete::multispace0;
use nom::IResult;
fn main() {
// multispace0 recognizes these characters:
// - ' ' (space, 0x20)
// - '\t' (tab, 0x09)
// - '\n' (line feed, 0x0A)
// - '\r' (carriage return, 0x0D)
// Space
let input = " ";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", " ")));
// Tab
let input = "\t\t";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", "\t\t")));
// Newline (line feed)
let input = "\n\n";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", "\n\n")));
// Carriage return
let input = "\r\r";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", "\r\r")));
// Windows-style line endings (CRLF)
let input = "\r\n\r\n";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", "\r\n\r\n")));
// Mixed whitespace
let input = " \t\n \r\n ";
let result: IResult<&str, &str> = multispace0(input);
assert_eq!(result, Ok(("", " \t\n \r\n ")));
// Note: Other Unicode whitespace is NOT matched
// U+00A0 (non-breaking space) - NOT matched
// U+2003 (em space) - NOT matched
println!("All whitespace character tests passed!");
}
multispace0 matches exactly four ASCII characters (space, tab, line feed, and carriage return). Other whitespace, including Unicode whitespace such as U+00A0, is not matched.
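use nom::character::complete;
use nom::IResult;
// Local wrappers fix the error type to nom's default so that the results
// below can be compared with assert_eq! without extra type annotations.
fn space0(input: &str) -> IResult<&str, &str> {
complete::space0(input)
}
fn multispace0(input: &str) -> IResult<&str, &str> {
complete::multispace0(input)
}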
fn main() {
// space0: matches only spaces and tabs (no newlines)
// multispace0: matches spaces, tabs, AND newlines
let input = "hello";
assert_eq!(space0(input), Ok(("hello", "")));
assert_eq!(multispace0(input), Ok(("hello", "")));
// Spaces work the same
let input = " hello";
assert_eq!(space0(input), Ok(("hello", " ")));
assert_eq!(multispace0(input), Ok(("hello", " ")));
// Tabs work the same
let input = "\thello";
assert_eq!(space0(input), Ok(("hello", "\t")));
assert_eq!(multispace0(input), Ok(("hello", "\t")));
// NEWLINES are the difference!
let input = "\nhello";
// space0 does NOT match newlines
let result = space0(input);
assert_eq!(result, Ok(("\nhello", ""))); // No match
// multispace0 DOES match newlines
let result = multispace0(input);
assert_eq!(result, Ok(("hello", "\n"))); // Matched!
// Multi-line input
let input = " \n hello";
// space0 stops at newline
assert_eq!(space0(input), Ok(("\n hello", " ")));
// multispace0 consumes all
assert_eq!(multispace0(input), Ok(("hello", " \n ")));
println!("Comparison tests passed!");
}
The key difference: space0 stops at newlines while multispace0 consumes them.
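use nom::character::complete;
use nom::IResult;
// Local wrappers fix the error type to nom's default so that the results
// below can be compared and inspected without extra type annotations.
fn multispace0(input: &str) -> IResult<&str, &str> {
complete::multispace0(input)
}
fn multispace1(input: &str) -> IResult<&str, &str> {
complete::multispace1(input)
}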
fn main() {
// multispace0: zero or more (always succeeds)
// multispace1: one or more (fails if no whitespace)
// Both match when whitespace exists
let input = " hello";
assert_eq!(multispace0(input), Ok(("hello", " ")));
assert_eq!(multispace1(input), Ok(("hello", " ")));
// Difference: zero whitespace case
let input = "hello";
// multispace0 succeeds with empty match
assert_eq!(multispace0(input), Ok(("hello", "")));
// multispace1 fails (needs at least one whitespace)
let result = multispace1(input);
assert!(result.is_err());
// Use multispace0 when whitespace is optional
// Use multispace1 when whitespace is required
println!("multispace0 vs multispace1 tests passed!");
}
multispace0 always succeeds; multispace1 requires at least one whitespace character.
use nom::character::complete::{char, digit1, multispace0};
use nom::IResult;
// Parse a pair of numbers like "1 , 2" with flexible whitespace around the comma
fn parse_tuple(input: &str) -> IResult<&str, (&str, &str)> {
let (remaining, _) = multispace0(input)?; // Leading whitespace
let (remaining, first) = digit1(remaining)?;
let (remaining, _) = multispace0(remaining)?; // Before comma
let (remaining, _) = char(',')(remaining)?;
let (remaining, _) = multispace0(remaining)?; // After comma
let (remaining, second) = digit1(remaining)?;
let (remaining, _) = multispace0(remaining)?; // Trailing whitespace
Ok((remaining, (first, second)))
}
fn main() {
// Flexible whitespace handling
let result = parse_tuple("1,2");
assert_eq!(result, Ok(("", ("1", "2"))));
let result = parse_tuple("1 , 2");
assert_eq!(result, Ok(("", ("1", "2"))));
let result = parse_tuple("1 , 2");
assert_eq!(result, Ok(("", ("1", "2"))));
let result = parse_tuple(" 1,2 ");
assert_eq!(result, Ok(("", ("1", "2"))));
let result = parse_tuple("1\n,\n2"); // Newlines
assert_eq!(result, Ok(("", ("1", "2"))));
let result = parse_tuple("1\t,\t2"); // Tabs
assert_eq!(result, Ok(("", ("1", "2"))));
println!("Token parsing tests passed!");
}
Use multispace0 between tokens to handle flexible whitespace formatting.
use nom::character::complete::{alpha1, char, digit1, multispace0};
use nom::combinator::map;
use nom::multi::many0;
use nom::branch::alt;
use nom::IResult;
// Helper: skip optional leading whitespace, then run the wrapped parser.
// FnMut bounds are used so that combinators such as alt, which return
// impl FnMut in nom 7, can be passed in.
fn ws<'a, F, O>(mut parser: F) -> impl FnMut(&'a str) -> IResult<&'a str, O>
where
F: FnMut(&'a str) -> IResult<&'a str, O>,
{
move |input: &'a str| {
let (input, _) = multispace0(input)?;
parser(input)
}
}
#[derive(Debug, PartialEq)]
enum Token {
Number(i64),
Identifier(String),
Plus,
Minus,
}
fn parse_number(input: &str) -> IResult<&str, Token> {
map(digit1, |s: &str| {
Token::Number(s.parse().unwrap())
})(input)
}
fn parse_identifier(input: &str) -> IResult<&str, Token> {
map(alpha1, |s: &str| Token::Identifier(s.to_string()))(input)
}
fn parse_plus(input: &str) -> IResult<&str, Token> {
map(char('+'), |_| Token::Plus)(input)
}
fn parse_minus(input: &str) -> IResult<&str, Token> {
map(char('-'), |_| Token::Minus)(input)
}
fn parse_token(input: &str) -> IResult<&str, Token> {
ws(alt((parse_number, parse_identifier, parse_plus, parse_minus)))(input)
}
fn parse_tokens(input: &str) -> IResult<&str, Vec<Token>> {
// parse_token already skips leading whitespace, so no extra wrapper is needed
many0(parse_token)(input)
}
fn main() {
let result = parse_tokens("1 + abc - 456");
assert_eq!(result, Ok(("", vec![
Token::Number(1),
Token::Plus,
Token::Identifier("abc".to_string()),
Token::Minus,
Token::Number(456),
])));
// Handles various whitespace
let result = parse_tokens("1\n+\tabc\r\n-\n456");
assert_eq!(result, Ok(("", vec![
Token::Number(1),
Token::Plus,
Token::Identifier("abc".to_string()),
Token::Minus,
Token::Number(456),
])));
println!("Whitespace-tolerant parser tests passed!");
}
The ws helper skips leading whitespace with multispace0 before running the wrapped parser, which keeps whitespace handling consistent across the token parsers.
use nom::character::complete::{alphanumeric1, char, multispace0};
use nom::combinator::map;
use nom::multi::many0;
use nom::branch::alt;
use nom::IResult;
use std::collections::HashMap;
#[derive(Debug, PartialEq)]
struct Config {
entries: HashMap<String, String>,
}
fn parse_key(input: &str) -> IResult<&str, &str> {
alphanumeric1(input)
}
fn parse_value(input: &str) -> IResult<&str, &str> {
// Value is everything until newline or end
nom::bytes::complete::take_till(|c| c == '\n' || c == '\r')(input)
.map(|(i, v)| (i, v.trim_end()))
}
fn parse_entry(input: &str) -> IResult<&str, (String, String)> {
let (input, _) = multispace0(input)?;
let (input, key) = parse_key(input)?;
let (input, _) = multispace0(input)?;
let (input, _) = char('=')(input)?;
let (input, _) = multispace0(input)?;
let (input, value) = parse_value(input)?;
let (input, _) = multispace0(input)?;
Ok((input, (key.to_string(), value.to_string())))
}
fn parse_comment(input: &str) -> IResult<&str, ()> {
let (input, _) = char('#')(input)?;
let (input, _) = nom::bytes::complete::take_till(|c| c == '\n')(input)?;
let (input, _) = multispace0(input)?;
Ok((input, ()))
}
fn parse_config(input: &str) -> IResult<&str, Config> {
let (input, _) = multispace0(input)?;
let (input, entries) = many0(alt((
map(parse_comment, |_| None),
map(parse_entry, Some),
)))(input)?;
let entries: HashMap<String, String> = entries
.into_iter()
.flatten()
.collect();
Ok((input, Config { entries }))
}
fn main() {
let input = r#"
# Configuration file
host = localhost
port = 8080
# Database settings
database = myapp_db
"#;
let result = parse_config(input).unwrap();
assert_eq!(result.1.entries.get("host"), Some(&"localhost".to_string()));
assert_eq!(result.1.entries.get("port"), Some(&"8080".to_string()));
assert_eq!(result.1.entries.get("database"), Some(&"myapp_db".to_string()));
println!("Config parsed: {:?}", result.1);
}
multispace0 handles the flexible whitespace in configuration files, including indentation and blank lines around entries and comments.
use nom::character::complete::{multispace0, char, digit1};
use nom::sequence::delimited;
use nom::multi::many0;
use nom::combinator::map;
use nom::IResult;
// Parse nested lists like [1, [2, 3], 4]
#[derive(Debug, PartialEq)]
enum Value {
Number(i64),
List(Vec<Value>),
}
fn parse_number(input: &str) -> IResult<&str, Value> {
let (input, _) = multispace0(input)?;
map(digit1, |s: &str| Value::Number(s.parse().unwrap()))(input)
}
fn parse_list(input: &str) -> IResult<&str, Value> {
let (input, _) = multispace0(input)?;
let (input, _) = char('[')(input)?;
let (input, _) = multispace0(input)?;
let (input, values) = many0(|i| {
let (i, v) = parse_value(i)?;
let (i, _) = multispace0(i)?;
let (i, _) = opt_comma(i)?;
Ok((i, v))
})(input)?;
let (input, _) = multispace0(input)?;
let (input, _) = char(']')(input)?;
Ok((input, Value::List(values)))
}
fn opt_comma(input: &str) -> IResult<&str, Option<char>> {
let (input, _) = multispace0(input)?;
let result = char(',')(input);
match result {
Ok((i, c)) => {
let (i, _) = multispace0(i)?;
Ok((i, Some(c)))
}
Err(_) => Ok((input, None)),
}
}
fn parse_value(input: &str) -> IResult<&str, Value> {
let (input, _) = multispace0(input)?;
nom::branch::alt((parse_number, parse_list))(input)
}
fn main() {
let result = parse_value("[1, 2, 3]").unwrap();
assert_eq!(result.1, Value::List(vec![
Value::Number(1),
Value::Number(2),
Value::Number(3),
]));
let result = parse_value("[ 1 , [ 2 , 3 ] , 4 ]").unwrap();
assert_eq!(result.1, Value::List(vec![
Value::Number(1),
Value::List(vec![Value::Number(2), Value::Number(3)]),
Value::Number(4),
]));
let result = parse_value("[\n 1,\n [\n 2,\n 3\n ]\n]").unwrap();
assert_eq!(result.1, Value::List(vec![
Value::Number(1),
Value::List(vec![Value::Number(2), Value::Number(3)]),
]));
println!("Nested structure tests passed!");
}
multispace0 enables parsing of multiline, nested structures with flexible formatting.
// nom::character::complete::multispace0
// nom::character::streaming::multispace0
// The difference is in handling incomplete input:
use nom::character::complete::multispace0 as complete_multispace0;
use nom::character::streaming::multispace0 as streaming_multispace0;
use nom::IResult;
fn main() {
// Complete variant:
// - Assumes entire input is available
// - Returns Ok even if input ends
// - Never returns Err(Err::Incomplete)
let result: IResult<&str, &str> = complete_multispace0(" ");
assert_eq!(result, Ok(("", " ")));
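// Streaming variant:
// - Designed for partial input
// - Returns Incomplete when the input ends while whitespace could still continue
// - Use for network/async parsing
// For the all-whitespace input above, the streaming variant returns
// Err(Err::Incomplete(_)) instead of Ok, because more whitespace might follow.
// The two variants agree when a non-whitespace character ends the match.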
// For most text parsing (files, strings in memory),
// use the complete variant
// For streaming parsing (network, async I/O),
// use the streaming variant
}
Use complete for in-memory parsing and streaming for incremental/async parsing.
use nom::character::complete::{alpha1, multispace0};
use nom::IResult;
fn main() {
// multispace0 is efficient: O(n) where n is whitespace length
// It just scans forward until non-whitespace is found
// For parsing performance, position multispace0 calls:
// 1. At the start of parsers (consume leading whitespace)
// 2. Between tokens (consume separating whitespace)
// 3. At the end of parsers (consume trailing whitespace)
// Avoid calling multispace0 unnecessarily
// Example: inefficient
fn inefficient(input: &str) -> IResult<&str, &str> {
let (i, _) = multispace0(input)?;
let (i, _) = multispace0(i)?; // Redundant
let (i, result) = alpha1(i)?;
let (i, _) = multispace0(i)?;
let (i, _) = multispace0(i)?; // Redundant
Ok((i, result))
}
// Example: efficient
fn efficient(input: &str) -> IResult<&str, &str> {
let (i, _) = multispace0(input)?;
let (i, result) = alpha1(i)?;
let (i, _) = multispace0(i)?;
Ok((i, result))
}
// Tip: Create a helper for whitespace-wrapped parsing
fn ws_wrap<'a, F, O>(parser: F) -> impl Fn(&'a str) -> IResult<&'a str, O>
where
F: Fn(&'a str) -> IResult<&'a str, O>,
{
move |input| {
let (input, _) = multispace0(input)?;
parser(input)
}
}
println!("Performance pattern examples shown");
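}
multispace0 runs in time linear in the length of the whitespace it consumes. A redundant second call finds nothing left to match and returns immediately, so removing it is a small saving that mainly keeps the parser easier to read.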
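To illustrate the forward scan described in the comments, here is a standard-library-only sketch. It is not nom's actual source; `split_leading_ws` is a hypothetical function that mirrors the `(remaining, matched)` ordering of nom's results.

```rust
// Std-only sketch of the scan: find the first character that is not one of
// the four whitespace characters and split the input there.
// Returns (remaining, matched), in the same order as nom's Ok value.
fn split_leading_ws(input: &str) -> (&str, &str) {
    let end = input
        .find(|c: char| !matches!(c, ' ' | '\t' | '\n' | '\r'))
        .unwrap_or(input.len());
    let (matched, rest) = input.split_at(end);
    (rest, matched)
}
```

Calling the function a second time on its own remainder stops at the first character, which is why a repeated whitespace skip costs very little.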
Whitespace parsers comparison:
| Parser | Matches | Can Match Zero | Matches Newlines |
|--------|---------|----------------|-------------------|
| space0 | space, tab | Yes | No |
| space1 | space, tab | No (requires 1+) | No |
| multispace0 | space, tab, newline, CR | Yes | Yes |
| multispace1 | space, tab, newline, CR | No (requires 1+) | Yes |
Common patterns:
| Pattern | Use Case |
|---------|----------|
| preceded(multispace0, parser) | Consume leading whitespace |
| terminated(parser, multispace0) | Consume trailing whitespace |
| delimited(multispace0, parser, multispace0) | Whitespace around content |
| many0(preceded(multispace0, parser)) | Whitespace-separated items |
When to use multispace0:
| Use Case | Reason |
|----------|--------|
| Configuration files | Whitespace can span lines |
| Programming languages | Flexible formatting |
| Data formats | Multi-line structures |
| Text between tokens | Handles all whitespace |
Key insight: multispace0 is the flexible whitespace workhorse for text parsers in nom. Its key distinction from space0 is that it matches line endings: formats that allow tokens to span multiple lines need multispace0, while line-oriented formats can use space0 so that newlines stay in the input as record terminators. The complete variant never returns Incomplete, which is what you want when parsing strings and files that are fully in memory. For tokens separated by optional whitespace, calling multispace0 before and after each token produces parsers that tolerate any whitespace formatting, matching the flexibility users expect from most text-based formats. In recursive parsers such as nested lists, calling multispace0 at each level lets formatting vary at any depth without breaking the parse.