Rust walkthroughs
nom, how do you combine parsers with >> vs | operators?
In nom, >> (sequence) and | (alternative) are the two basic ways to combine smaller parsers into larger ones. Sequencing runs two parsers in order and succeeds only if both succeed; in the >> form, only the result of the second parser is kept. Alternation tries parsers in order and returns the result of the first one that succeeds, failing only if every alternative fails. The symbols come from nom's older macro syntax (do_parse! and alt!); the function-based API does not overload the >> or | operators, so in code a sequence is written with nom::sequence::preceded or nom::sequence::tuple and an alternative with nom::branch::alt. In the examples below, >> and | are shorthand for those combinators. Understanding how they compose is essential for building complex parsers from simple building blocks.
use nom::{
IResult,
bytes::complete::{tag, take},
character::complete::{alpha1, digit1, space1},
sequence::preceded,
branch::alt,
};
// Basic parsers return IResult<Input, Output>
fn parse_hello(input: &str) -> IResult<&str, &str> {
tag("hello")(input)
}
fn parse_world(input: &str) -> IResult<&str, &str> {
tag("world")(input)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_basic_parsers() {
// tag returns the matched string on success
assert_eq!(parse_hello("hello world"), Ok((" world", "hello")));
assert_eq!(parse_world("world!"), Ok(("!", "world")));
// Failure returns Err
assert!(parse_hello("goodbye").is_err());
}
}
Basic parsers in nom take an input and return IResult<Input, Output>.
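To make the IResult shape concrete, here is a dependency-free sketch of what a tag-style parser does. The name `literal` and the `String` error type are assumptions made for illustration, not nom's implementation; the point is only the (remaining input, matched output) ordering on success.

```rust
/// Hypothetical stand-in for a `tag`-style parser (not nom's code).
/// On success it returns (remaining input, matched prefix), the same
/// ordering that IResult uses.
fn literal<'a>(expected: &str, input: &'a str) -> Result<(&'a str, &'a str), String> {
    match input.strip_prefix(expected) {
        // The matched text is the first `expected.len()` bytes of the input.
        Some(rest) => Ok((rest, &input[..expected.len()])),
        None => Err(format!("expected {:?}, found {:?}", expected, input)),
    }
}
```

Calling `literal("hello", "hello world")` yields the leftover `" world"` first and the matched `"hello"` second, which is why every later combinator can simply feed the leftover into the next parser.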
>> (sequence)
use nom::{IResult, bytes::complete::tag, sequence::tuple};
// Sequencing three parsers; unlike >>, tuple keeps every result, not just the last
fn sequence_with_tuple(input: &str) -> IResult<&str, (&str, &str, &str)> {
// tuple runs multiple parsers in sequence, returning all results
tuple((tag("hello"), tag(" "), tag("world")))(input)
}
fn sequence_example() {
let result = sequence_with_tuple("hello world");
assert_eq!(result, Ok(("", ("hello", " ", "world"))));
// If any parser fails, the whole sequence fails
let result = sequence_with_tuple("hello there");
assert!(result.is_err()); // "there" doesn't match "world"
}
tuple runs parsers in sequence and returns all results; >> (preceded) is the two-parser case that keeps only the second.
use nom::{IResult, bytes::complete::tag, sequence::preceded};
// nom does not overload the >> operator; preceded is the function form
// It sequences two parsers and returns the result of the second
fn sequence_with_shiftr(input: &str) -> IResult<&str, &str> {
// preceded is equivalent to >> for two parsers
// preceded(first, second) runs first, then second, returns second
preceded(tag("hello "), tag("world"))(input)
}
// The >> notation spelled out (nom has no operator syntax for it)
fn sequence_operator(input: &str) -> IResult<&str, &str> {
// tag("hello ") >> tag("world") would be:
// Run tag("hello "), then tag("world"), return "world"
preceded(tag("hello "), tag("world"))(input)
}
fn shift_right_examples() {
let input = "hello world";
// preceded discards first result, keeps second
let result = sequence_with_shiftr(input);
assert_eq!(result, Ok(("", "world")));
// First parser must consume its input
let result = sequence_with_shiftr("hellox world");
assert!(result.is_err()); // "hellox " doesn't match "hello "
}
In this notation, >> sequences two parsers and returns the second result; preceded is how it is written in nom.
| (alternative)
use nom::{IResult, bytes::complete::tag, branch::alt};
// alt is the function form of |: it tries alternatives in order, returning the first success
fn alternative_with_alt(input: &str) -> IResult<&str, &str> {
alt((tag("hello"), tag("goodbye"), tag("hi")))(input)
}
fn alternative_example() {
// First alternative matches
let result = alternative_with_alt("hello world");
assert_eq!(result, Ok((" world", "hello")));
// Second alternative matches
let result = alternative_with_alt("goodbye world");
assert_eq!(result, Ok((" world", "goodbye")));
// Third alternative matches
let result = alternative_with_alt("hi there");
assert_eq!(result, Ok((" there", "hi")));
// None match
let result = alternative_with_alt("farewell");
assert!(result.is_err());
}
alt tries each parser in order; this is what the | notation stands for.
use nom::{
IResult,
bytes::complete::tag,
character::complete::alpha1,
sequence::{preceded, tuple},
branch::alt,
};
// Parse a greeting followed by a name
fn parse_greeting(input: &str) -> IResult<&str, (&str, &str)> {
// Sequence: greeting >> name
// Alternative for greeting: "hello" | "hi" | "hey"
tuple((
alt((tag("hello"), tag("hi"), tag("hey"))),
tag(" "),
alpha1
))(input)
}
fn combined_example() {
let result = parse_greeting("hello Alice");
assert_eq!(result, Ok(("", ("hello", " ", "Alice"))));
let result = parse_greeting("hi Bob");
assert_eq!(result, Ok(("", ("hi", " ", "Bob"))));
let result = parse_greeting("hey Charlie");
assert_eq!(result, Ok(("", ("hey", " ", "Charlie"))));
// Invalid greeting
let result = parse_greeting("greetings Dave");
assert!(result.is_err());
}
Real parsers combine sequence and alternative operations.
use nom::{
IResult,
bytes::complete::tag,
sequence::{tuple, preceded, terminated, delimited, pair},
};
fn sequence_combinators() {
let input = "<content>";
// tuple: run all, return all results
let result: IResult<&str, (&str, &str, &str)> =
tuple((tag("<"), tag("content"), tag(">")))(input);
assert_eq!(result, Ok(("", ("<", "content", ">"))));
// preceded: run both, return second (equivalent to >>)
let result: IResult<&str, &str> =
preceded(tag("<"), tag("content"))(input);
assert_eq!(result, Ok((">", "content")));
// terminated: run both, return first
let result: IResult<&str, &str> =
terminated(tag("<content"), tag(">"))(input);
assert_eq!(result, Ok(("", "<content")));
// delimited: run three, return middle
let result: IResult<&str, &str> =
delimited(tag("<"), tag("content"), tag(">"))(input);
assert_eq!(result, Ok(("", "content")));
// pair: run two, return both
let result: IResult<&str, (&str, &str)> =
pair(tag("<"), tag("content"))(input);
assert_eq!(result, Ok((">", ("<", "content"))));
}
nom provides several sequence combinators with different result behaviors.
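One way to read the list above is that pair, preceded and terminated all run the same two-step sequence and differ only in which outputs they keep. The sketch below shows that idea without nom; `lit`, `both`, `keep_second` and `keep_first` are hypothetical helper names, and `Option` stands in for a real error type.

```rust
/// Hypothetical literal matcher: returns (remaining input, matched prefix).
fn lit<'a>(expected: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    input
        .strip_prefix(expected)
        .map(|rest| (rest, &input[..expected.len()]))
}

/// Run two matchers in order and keep both outputs (the role of `pair`).
fn both<'a>(a: &str, b: &str, input: &'a str) -> Option<(&'a str, (&'a str, &'a str))> {
    let (rest, first) = lit(a, input)?;
    // The second matcher starts where the first one stopped.
    let (rest, second) = lit(b, rest)?;
    Some((rest, (first, second)))
}

/// Keep only the second output (the role of `preceded`).
fn keep_second<'a>(a: &str, b: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    both(a, b, input).map(|(rest, (_, second))| (rest, second))
}

/// Keep only the first output (the role of `terminated`).
fn keep_first<'a>(a: &str, b: &str, input: &'a str) -> Option<(&'a str, &'a str)> {
    both(a, b, input).map(|(rest, (first, _))| (rest, first))
}
```

With the input `"<content>"`, `keep_second("<", "content", ...)` leaves `">"` unconsumed and returns `"content"`, matching the preceded assertion in the example above.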
use nom::{
IResult,
bytes::complete::tag,
branch::{alt, permutation},
character::complete::{alpha1, digit1},
};
fn alternative_combinators() {
// alt: try in order, return first success
let result: IResult<&str, &str> =
alt((tag("abc"), tag("abd"), tag("abe")))("abd123");
assert_eq!(result, Ok(("123", "abd")));
// alt stops at first match
let result: IResult<&str, &str> =
alt((tag("a"), tag("ab"), tag("abc")))("abc");
assert_eq!(result, Ok(("bc", "a"))); // Matches "a" first
// Order matters for overlapping patterns!
let result: IResult<&str, &str> =
alt((tag("abc"), tag("a")))("abc");
assert_eq!(result, Ok(("", "abc"))); // "abc" tried first
}
fn permutation_example() {
// permutation: match all in any order
let input = "123abc";
let result: IResult<&str, (&str, &str)> =
permutation((digit1, alpha1))(input);
assert_eq!(result, Ok(("", ("123", "abc"))));
// Different order
let input = "abc123";
let result: IResult<&str, (&str, &str)> =
permutation((digit1, alpha1))(input);
assert_eq!(result, Ok(("", ("123", "abc"))));
}
alt tries alternatives in order; permutation matches all in any order.
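Because ordered choice stops at the first match, a short literal listed early can shadow a longer one. A common remedy is to list alternatives longest-first. The following dependency-free sketch illustrates both the pitfall and the remedy; `first_match` and `longest_first` are hypothetical helpers, not nom functions.

```rust
/// Try each literal in the order given and return the first match,
/// mirroring the ordered-choice behaviour described for `alt`.
/// Returns (remaining input, matched literal).
fn first_match<'a>(alternatives: &[&str], input: &'a str) -> Option<(&'a str, &'a str)> {
    alternatives.iter().find_map(|lit| {
        input
            .strip_prefix(*lit)
            .map(|rest| (rest, &input[..lit.len()]))
    })
}

/// Reorder the literals longest-first so that a short prefix cannot
/// shadow a longer alternative.
fn longest_first<'a>(alternatives: &[&'a str]) -> Vec<&'a str> {
    let mut sorted = alternatives.to_vec();
    sorted.sort_by(|a, b| b.len().cmp(&a.len()));
    sorted
}
```

With the alternatives `["a", "ab", "abc"]` and the input `"abc"`, the unsorted list matches only `"a"`, while the sorted list matches the whole `"abc"`.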
use nom::{
IResult,
bytes::complete::tag,
character::complete::digit1,
sequence::{preceded, pair},
branch::alt,
combinator::map,
};
// >> returns only the second result
fn shift_right_return() {
let input = "hello123";
// preceded (>>) returns only the second parser's result
let result: IResult<&str, &str> =
preceded(tag("hello"), digit1)(input);
assert_eq!(result, Ok(("", "123")));
// "hello" is discarded, only "123" returned
}
// | returns the type of the chosen alternative
fn alternative_return() {
// All alternatives must return the same type
let result: IResult<&str, &str> =
alt((tag("hello"), tag("world")))("world");
assert_eq!(result, Ok(("", "world")));
// If alternatives have different return types, use map to unify
let result: IResult<&str, i64> = alt((
map(tag("one"), |_| 1),
map(tag("two"), |_| 2),
map(digit1, |s: &str| s.parse().unwrap()),
))("two");
assert_eq!(result, Ok(("", 2)));
}
>> discards the first result; all | alternatives must return the same type (use map to unify them).
use nom::{
IResult,
bytes::complete::tag,
character::complete::{alpha1, digit1, space0, space1, char},
sequence::{tuple, preceded, delimited, separated_pair},
branch::alt,
combinator::{map, opt, recognize},
multi::{many0, many1},
};
// Parse a simple expression: number or identifier
#[derive(Debug, PartialEq)]
enum Expr {
Number(i64),
Identifier(String),
BinaryOp(Box<Expr>, char, Box<Expr>),
}
fn parse_number(input: &str) -> IResult<&str, Expr> {
map(digit1, |s: &str| Expr::Number(s.parse().unwrap()))(input)
}
fn parse_identifier(input: &str) -> IResult<&str, Expr> {
map(alpha1, |s: &str| Expr::Identifier(s.to_string()))(input)
}
fn parse_primary(input: &str) -> IResult<&str, Expr> {
// Alternative: number | identifier | (expression)
alt((
parse_number,
parse_identifier,
delimited(char('('), parse_expr, char(')')),
))(input)
}
fn parse_expr(input: &str) -> IResult<&str, Expr> {
// For simplicity, just parse primary expressions
parse_primary(input)
}
fn complex_parser_example() {
let result = parse_expr("123");
assert_eq!(result, Ok(("", Expr::Number(123))));
let result = parse_expr("abc");
assert_eq!(result, Ok(("", Expr::Identifier("abc".to_string()))));
}
Complex parsers are built by combining >> (sequence) and | (alternative).
use nom::{
IResult,
bytes::complete::tag,
character::complete::{alpha1, alphanumeric1, digit1, space0, char, multispace0},
sequence::{tuple, preceded, delimited, separated_pair},
branch::alt,
combinator::map,
multi::many0,
};
// PartialEq is required by the assert_eq! checks below
#[derive(Debug, PartialEq)]
struct KeyValue {
key: String,
value: Value,
}
#[derive(Debug, PartialEq)]
enum Value {
String(String),
Number(i64),
Boolean(bool),
}
fn parse_key(input: &str) -> IResult<&str, String> {
map(alpha1, |s: &str| s.to_string())(input)
}
fn parse_string_value(input: &str) -> IResult<&str, Value> {
map(
delimited(char('"'), alphanumeric1, char('"')),
|s: &str| Value::String(s.to_string())
)(input)
}
fn parse_number_value(input: &str) -> IResult<&str, Value> {
map(digit1, |s: &str| Value::Number(s.parse().unwrap()))(input)
}
fn parse_bool_value(input: &str) -> IResult<&str, Value> {
map(
alt((tag("true"), tag("false"))),
|s: &str| Value::Boolean(s == "true")
)(input)
}
fn parse_value(input: &str) -> IResult<&str, Value> {
// Try each alternative in order
alt((parse_string_value, parse_number_value, parse_bool_value))(input)
}
fn parse_key_value(input: &str) -> IResult<&str, KeyValue> {
// Sequence: key >> ":" >> value
map(
tuple((
parse_key,
delimited(space0, char(':'), space0),
parse_value
)),
|(key, _, value)| KeyValue { key, value }
)(input)
}
fn structured_data_example() {
let result = parse_key_value(r#"name: "Alice""#);
assert_eq!(result, Ok(("", KeyValue {
key: "name".to_string(),
value: Value::String("Alice".to_string())
})));
let result = parse_key_value("count: 42");
assert_eq!(result, Ok(("", KeyValue {
key: "count".to_string(),
value: Value::Number(42)
})));
let result = parse_key_value("active: true");
assert_eq!(result, Ok(("", KeyValue {
key: "active".to_string(),
value: Value::Boolean(true)
})));
}
Combining sequence and alternative combinators enables parsing complex structures.
use nom::{
IResult,
bytes::complete::tag,
branch::alt,
error::{Error, ErrorKind, ParseError, VerboseError, context},
combinator::map,
};
// alt tries each parser and returns the error from the last one
fn alternative_errors() {
let input = "xyz";
// If all alternatives fail, alt returns the last error
let result: IResult<&str, &str, Error<&str>> =
alt((tag("abc"), tag("def"), tag("ghi")))(input);
assert!(result.is_err());
// Error from trying "ghi" on "xyz"
}
// Using context for better error messages
fn parse_with_context(input: &str) -> IResult<&str, &str, VerboseError<&str>> {
alt((
context("expected 'hello'", tag("hello")),
context("expected 'world'", tag("world")),
))(input)
}
fn error_context_example() {
let result = parse_with_context("foo");
assert!(result.is_err());
// Error will include context about what was expected
}
alt returns the error from the last failed alternative; use context for better messages.
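The difference between "report the last failure" and "report every failure with its label" can be seen in a small dependency-free sketch. The function names and the message format below are assumptions for illustration only; they do not reproduce nom's error types.

```rust
/// Try each (label, literal) pair in order. If everything fails, report
/// only the error of the last alternative tried.
fn choice_last_error<'a>(
    alternatives: &[(&str, &str)],
    input: &'a str,
) -> Result<(&'a str, &'a str), String> {
    let mut last_error = String::from("no alternatives given");
    for (label, literal) in alternatives {
        match input.strip_prefix(*literal) {
            Some(rest) => return Ok((rest, &input[..literal.len()])),
            // Each new failure overwrites the previous one.
            None => last_error = format!("{}: no match at {:?}", label, input),
        }
    }
    Err(last_error)
}

/// Same ordered choice, but keep one labelled message per failed alternative.
fn choice_all_errors<'a>(
    alternatives: &[(&str, &str)],
    input: &'a str,
) -> Result<(&'a str, &'a str), Vec<String>> {
    let mut errors = Vec::new();
    for (label, literal) in alternatives {
        match input.strip_prefix(*literal) {
            Some(rest) => return Ok((rest, &input[..literal.len()])),
            None => errors.push(format!("{}: no match at {:?}", label, input)),
        }
    }
    Err(errors)
}
```

The first variant is cheap but tells the user only about the final branch; the second costs an allocation per failure and gives a fuller picture, which is the trade-off that labelled, accumulating error types make.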
use nom::{
IResult,
bytes::complete::tag,
Parser,
sequence::preceded,
};
// nom's Parser trait does not overload >>; it offers method syntax instead
// preceded remains the direct equivalent of >> for two parsers
fn parser_trait_example() {
// Using preceded directly
fn parse_after_hello(input: &str) -> IResult<&str, &str> {
preceded(tag("hello "), tag("world"))(input)
}
// The Parser trait allows chaining with and(), which yields both outputs
// This is conceptually: tag("hello ") >> tag("world")
fn parse_with_chain(input: &str) -> IResult<&str, &str> {
let (remaining, (_, world)) = tag("hello ").and(tag("world")).parse(input)?;
Ok((remaining, world))
}
// Both forms agree on the result
assert_eq!(parse_after_hello("hello world"), Ok(("", "world")));
assert_eq!(parse_with_chain("hello world"), Ok(("", "world")));
}
nom has no operator form of >>; the sequence is usually expressed with preceded (or Parser::and plus a projection) for clarity.
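If real infix syntax is wanted, Rust's operator traits make it possible to build it on top of a small grammar type. The sketch below is dependency-free and entirely hypothetical (the `P` enum and `greeting` function are not part of nom); it implements `std::ops::Shr` for "sequence, keep the second output" and `std::ops::BitOr` for "ordered choice".

```rust
use std::ops::{BitOr, Shr};

/// Hypothetical mini-grammar type; none of these names come from nom.
#[derive(Debug, Clone)]
enum P {
    /// Match a literal prefix (the role a `tag` parser plays).
    Lit(&'static str),
    /// Run the left parser, then the right; keep the right output (`>>`).
    Seq(Box<P>, Box<P>),
    /// Try the left parser; on failure, try the right on the same input (`|`).
    Alt(Box<P>, Box<P>),
}

impl P {
    /// Returns (remaining input, output) on success.
    fn parse<'a>(&self, input: &'a str) -> Result<(&'a str, &'a str), String> {
        match self {
            P::Lit(expected) => match input.strip_prefix(*expected) {
                Some(rest) => Ok((rest, &input[..expected.len()])),
                None => Err(format!("expected {:?} at {:?}", expected, input)),
            },
            P::Seq(first, second) => {
                // The first output is discarded; the second parser sees the leftover.
                let (rest, _discarded) = first.parse(input)?;
                second.parse(rest)
            }
            P::Alt(left, right) => left.parse(input).or_else(|_| right.parse(input)),
        }
    }
}

impl Shr for P {
    type Output = P;
    fn shr(self, rhs: P) -> P {
        P::Seq(Box::new(self), Box::new(rhs))
    }
}

impl BitOr for P {
    type Output = P;
    fn bitor(self, rhs: P) -> P {
        P::Alt(Box::new(self), Box::new(rhs))
    }
}

/// ("hello" | "hi") >> " " >> "world"
fn greeting() -> P {
    // Rust's `>>` binds tighter than `|`, so the choice needs parentheses.
    (P::Lit("hello") | P::Lit("hi")) >> P::Lit(" ") >> P::Lit("world")
}
```

The precedence comment is the main design caveat: without the parentheses, `a | b >> c` groups as `a | (b >> c)`, which is one reason named functions such as preceded and alt tend to be easier to read than overloaded operators.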
use nom::{
IResult,
bytes::complete::tag,
character::complete::{alpha1, space1, char},
sequence::{preceded, delimited},
multi::{many0, many1, separated_list0},
branch::alt,
};
fn parse_items(input: &str) -> IResult<&str, Vec<&str>> {
// Parse zero or more comma-separated words
separated_list0(char(','), alpha1)(input)
}
fn many_with_sequence(input: &str) -> IResult<&str, Vec<&str>> {
// Parse many sequences
many0(preceded(tag("item:"), alpha1))(input)
}
fn repeating_parsers_example() {
let result = parse_items("apple,banana,cherry");
assert_eq!(result, Ok(("", vec!["apple", "banana", "cherry"])));
let result = parse_items("single");
assert_eq!(result, Ok(("", vec!["single"])));
let result = parse_items("");
assert_eq!(result, Ok(("", vec![])));
// Sequence repeated
let result = many_with_sequence("item:one item:two item:three");
assert_eq!(result, Ok((" item:two item:three", vec!["one"])));
// Note: many0 doesn't consume between items - need space handling
}
Combining repetition with sequence/alternative creates powerful parsers.
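The many0 example stops after the first item because nothing consumes the space before `item:two`. The dependency-free sketch below shows the missing step: skip the separator before each repetition. `parse_spaced_items` is a hypothetical function written for illustration; in nom the same effect is usually obtained by putting a whitespace parser such as space0 in front of the item parser.

```rust
/// Sketch of "zero or more `item:<letters>`, with optional spaces before
/// each entry". Returns (remaining input, collected names).
fn parse_spaced_items(input: &str) -> (&str, Vec<&str>) {
    let mut rest = input;
    let mut names = Vec::new();
    loop {
        // Skip the separator first; this is the step the many0 example lacks.
        let candidate = rest.trim_start_matches(' ');
        let after_tag = match candidate.strip_prefix("item:") {
            Some(after_tag) => after_tag,
            None => break, // no further item: leave `rest` untouched
        };
        // Take the leading run of ASCII letters as the item name.
        let len = after_tag
            .find(|c: char| !c.is_ascii_alphabetic())
            .unwrap_or(after_tag.len());
        if len == 0 {
            break; // "item:" without a name: stop without consuming it
        }
        names.push(&after_tag[..len]);
        rest = &after_tag[len..];
    }
    (rest, names)
}
```

Like a repetition combinator, the loop never fails: when the next entry does not match, it returns what it has collected so far together with the input it could not consume.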
use nom::{
IResult,
bytes::complete::{tag, take_while, take_while1},
character::complete::char,
sequence::{preceded, tuple, separated_pair},
branch::alt,
combinator::{map, opt, recognize},
};
#[derive(Debug, PartialEq)]
struct Url<'a> {
scheme: &'a str,
host: &'a str,
port: Option<u16>,
path: &'a str,
}
fn parse_scheme(input: &str) -> IResult<&str, &str> {
recognize(tuple((
take_while1(|c: char| c.is_alphabetic()),
tag("://")
)))(input)
}
fn parse_host(input: &str) -> IResult<&str, &str> {
take_while1(|c: char| c.is_alphanumeric() || c == '.' || c == '-')(input)
}
fn parse_port(input: &str) -> IResult<&str, u16> {
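preceded(char(':'),
// map_res reports an out-of-range port as a parse error instead of panicking
nom::combinator::map_res(take_while1(|c: char| c.is_ascii_digit()),
|s: &str| s.parse::<u16>())
)(input)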
}
fn parse_path(input: &str) -> IResult<&str, &str> {
alt((
take_while(|c: char| c != ' ' && c != '\n'),
map(tag(""), |_| "")
))(input)
}
fn parse_url(input: &str) -> IResult<&str, Url> {
map(
tuple((
parse_scheme, // scheme://
parse_host, // host
opt(parse_port), // :port (optional)
parse_path // /path
)),
|(scheme_with_sep, host, port, path)| {
// Extract scheme without ://
let scheme = &scheme_with_sep[..scheme_with_sep.len() - 3];
Url { scheme, host, port, path }
}
)(input)
}
fn url_parser_example() {
let result = parse_url("https://example.com/path");
assert_eq!(result, Ok(("", Url {
scheme: "https",
host: "example.com",
port: None,
path: "/path"
})));
let result = parse_url("http://localhost:8080/api/data");
assert_eq!(result, Ok(("", Url {
scheme: "http",
host: "localhost",
port: Some(8080),
path: "/api/data"
})));
}
URL parsing combines sequence (scheme >> host >> port >> path) with optional components (opt).
| Aspect | >> (Sequence) | \| (Alternative) |
|--------|---------------|------------------|
| Behavior | Run parsers in order | Try parsers in order |
| Success | All must succeed | First success wins |
| Failure | Any failure fails the whole sequence | Fails only if every alternative fails |
| Return | Second result (with preceded) or all results (with tuple) | Result of the first successful alternative |
| Use case | Required sequence | Multiple accepted formats |
| Error | First failure reported | Last failure reported |
The >> and | notations describe the two fundamental ways of combining parsers in nom:
Sequence (>> / preceded / tuple):
- Discards the first result (with preceded) or keeps all (with tuple)
- Example: tag("(") >> content >> tag(")") for parenthesized expressions
Alternative (| / alt):
- Example: number | string | boolean for value parsing
Key patterns:
- tuple collects all results; preceded keeps only the second
- alt tries alternatives; order matters for overlapping patterns
- Use map to transform results and unify types
- Use context for better error messages
Key insight: Parser combinators work by building up complex parsers from simple, well-tested primitives. Sequencing (>>) creates chains where each step runs on the input left over by the previous one; alternation (|) creates branches where the parser adapts to the input. Together, they form a powerful DSL for describing grammars directly in Rust code, with the type system ensuring that composed parsers have compatible input and output types.