How does `bytes::Buf::chunk` differ from `bytes` for accessing underlying buffer contents without copying?

Buf::chunk() returns a slice of the current contiguous portion of the buffer without consuming or copying it. It is the same method that was called Buf::bytes() before bytes 1.0. The 1.0 release renamed it, so bytes() no longer exists on Buf, because the old name suggested that it returned all remaining bytes. The name chunk() makes clear that you get one contiguous segment, which matters for types such as Chain and for custom Buf implementations whose storage may not be contiguous. Cursor<T> is contiguous, but its chunk() starts at the current position rather than at the beginning of the wrapped data. Understanding this distinction is essential for zero-copy buffer handling in high-performance network code.

The Buf Trait and Contiguous Access

use bytes::Buf;
 
// The Buf trait provides read access to byte storage
// Key insight: buffers may not be contiguous in memory
 
fn buf_trait_basics() {
    // chunk() returns the current contiguous portion
    // This is the primary way to access data without copying
    
    // For a simple Bytes buffer, chunk() returns all data
    let mut buf = bytes::Bytes::from("hello world");
    let chunk = buf.chunk(); // Returns &[u8] - "hello world"
    
    // For a Chain<Bytes, Bytes>, chunk() returns only the first part
    let buf1 = bytes::Bytes::from("hello ");
    let buf2 = bytes::Bytes::from("world");
    let mut chain = buf1.chain(buf2);
    
    let first_chunk = chain.chunk(); // Returns "hello " only!
    // The second part "world" is not visible in chunk()
}

The key insight is that chunk() only sees contiguous memory.

chunk() vs bytes() - Historical Context

use bytes::Buf;
 
fn chunk_vs_bytes() {
    // BEFORE (bytes releases prior to 1.0):
    // let data = buf.bytes(); // Misleading name!
    
    // AFTER (bytes 1.0 and later):
    // let data = buf.chunk(); // Clearer name
    
    // The bytes() method was renamed to chunk() to clarify:
    // 1. You're not getting ALL bytes - just a contiguous chunk
    // 2. You're not consuming the buffer - just peeking
    
    // This is especially important for:
    // - Chain: multiple buffers concatenated
    // - Custom Buf implementations with fragmented storage
    // (Cursor<T> is contiguous, but chunk() starts at its current position)
    
    // Buf::bytes() was renamed in 1.0 rather than kept as a deprecated alias,
    // so code written against 0.x must switch to chunk() when upgrading
}

The rename clarifies that chunk() returns a contiguous segment, not all data.

Zero-Copy Access Pattern

use bytes::{Buf, Bytes};
 
fn zero_copy_pattern() {
    // The core pattern: peek with chunk(), consume with advance()
    let mut buf = Bytes::from("hello world");
    
    // Step 1: Peek at data without copying
    let chunk = buf.chunk();
    
    // Step 2: Process data in place
    if chunk.starts_with(b"hello") {
        // Step 3: Consume processed bytes
        buf.advance(5); // Move past "hello"
    }
    
    // This is zero-copy: no bytes were copied anywhere
    // chunk() returned &[u8] pointing directly into the buffer
}

chunk() provides direct access to underlying storage without copying.

Working with Bytes Directly

use bytes::Bytes;
 
fn bytes_direct_access() {
    // Bytes is a reference-counted byte buffer
    let bytes = Bytes::from("hello world");
    
    // Bytes implements AsRef<[u8]> for direct access
    let slice: &[u8] = bytes.as_ref();
    // All data accessible at once
    
    // Bytes also has slice() for sub-slicing
    let sub = bytes.slice(0..5); // "hello"
    
    // This is different from Buf::chunk():
    // - Bytes.as_ref() returns the entire buffer
    // - Buf::chunk() returns current contiguous portion
    
    // For Bytes specifically, chunk() returns all data
    // But for other Buf types, chunk() may return less
}

Bytes provides direct access to all data; Buf::chunk() may return only a portion.

Chain Buffers - Where chunk() Matters

use bytes::{Buf, Bytes};
 
fn chain_buffers() {
    // Chain concatenates multiple buffers without copying
    let buf1 = Bytes::from("hello ");
    let buf2 = Bytes::from("world");
    let mut chain = buf1.chain(buf2);
    
    // chunk() only returns the first buffer's data
    let first = chain.chunk(); // "hello " - only 6 bytes!
    
    // To see the second buffer, advance past the first
    chain.advance(6);
    let second = chain.chunk(); // "world" - now visible
    
    // This is why chunk() is named chunk():
    // It returns a contiguous chunk, not all bytes
    
    // Alternative: copy_to_bytes(chain.remaining()) gathers everything that
    // is left into one contiguous Bytes, copying when the data spans both halves
}
 
fn process_chain_correctly() {
    let buf1 = Bytes::from("header: ");
    let buf2 = Bytes::from("value");
    let mut chain = buf1.chain(buf2);
    
    // WRONG: assuming chunk() has all data
    // let all = chain.chunk(); // Only gets "header: "!
    
    // CORRECT: iterate through chunks
    while chain.has_remaining() {
        let chunk = chain.chunk();
        println!("Chunk: {}", String::from_utf8_lossy(chunk));
        chain.advance(chunk.len());
    }
    
    // CORRECT: copy to contiguous buffer if needed
    let mut chain = Bytes::from("header: ").chain(Bytes::from("value"));
    let total_len = chain.remaining();
    let all_bytes = chain.copy_to_bytes(total_len);
    // Now all_bytes is contiguous (but copied!)
}

Chain is the clearest example of why chunk() only returns a portion.

Cursor Buffers

use bytes::{Buf, Bytes};
 
fn cursor_buffers() {
    use std::io::Cursor;
    
    // Cursor wraps a buffer with a position
    let data = Bytes::from("hello world");
    let mut cursor = Cursor::new(data);
    
    // Position starts at 0
    assert_eq!(cursor.position(), 0);
    
    // chunk() returns data from current position
    let chunk = cursor.chunk(); // "hello world"
    
    // Advance position
    cursor.advance(6); // Skip "hello "
    
    // chunk() now returns from position 6
    let chunk = cursor.chunk(); // "world"
    
    // Cursor tracks position separately from underlying buffer
    // chunk() respects this position
}
 
fn cursor_vs_bytes() {
    // Cursor<Bytes> vs Bytes directly:
    
    // Bytes: chunk() always returns all remaining data
    let bytes = Bytes::from("hello world");
    // bytes.chunk() == "hello world"
    
    // Cursor<Bytes>: chunk() returns data from current position
    let mut cursor = std::io::Cursor::new(Bytes::from("hello world"));
    cursor.advance(6);
    // cursor.chunk() == "world"
}

Cursor shows how chunk() respects buffer position.

Custom Buf Implementations

use bytes::Buf;
 
// A Buf implementation for fragmented storage
struct FragmentedBuf {
    fragments: Vec<Vec<u8>>,
    current_fragment: usize,
    offset_in_fragment: usize,
}
 
impl Buf for FragmentedBuf {
    fn remaining(&self) -> usize {
        // Calculate remaining bytes across all fragments
        let mut remaining = 0;
        for i in self.current_fragment..self.fragments.len() {
            if i == self.current_fragment {
                remaining += self.fragments[i].len() - self.offset_in_fragment;
            } else {
                remaining += self.fragments[i].len();
            }
        }
        remaining
    }
    
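    fn chunk(&self) -> &[u8] {
        // Return the current contiguous portion. Empty fragments are skipped
        // because Buf expects an empty slice only when remaining() is 0.
        let mut offset = self.offset_in_fragment;
        for fragment in self.fragments.iter().skip(self.current_fragment) {
            if offset < fragment.len() {
                return &fragment[offset..];
            }
            offset = 0;
        }
        &[]
    }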
    
    fn advance(&mut self, mut cnt: usize) {
        while cnt > 0 && self.current_fragment < self.fragments.len() {
            let fragment = &self.fragments[self.current_fragment];
            let available = fragment.len() - self.offset_in_fragment;
            
            if cnt >= available {
                // Move to next fragment
                cnt -= available;
                self.current_fragment += 1;
                self.offset_in_fragment = 0;
            } else {
                // Advance within current fragment
                self.offset_in_fragment += cnt;
                cnt = 0;
            }
        }
    }
}
 
fn custom_buf_chunk() {
    let mut buf = FragmentedBuf {
        fragments: vec![
            b"hello ".to_vec(),
            b"world".to_vec(),
            b"!".to_vec(),
        ],
        current_fragment: 0,
        offset_in_fragment: 0,
    };
    
    // chunk() returns only first fragment
    assert_eq!(buf.chunk(), b"hello ");
    
    buf.advance(6);
    assert_eq!(buf.chunk(), b"world");
    
    // This is why chunk() is the correct name:
    // It returns a contiguous chunk, not all remaining bytes
}

Custom Buf implementations may have genuinely fragmented storage.

Comparing Access Methods

use bytes::{Buf, Bytes};
 
fn access_method_comparison() {
    let bytes = Bytes::from("hello world");
    
    // Method 1: Buf::chunk() - zero-copy, respects position
    let mut buf = &bytes[..];
    let chunk = buf.chunk(); // &[u8], zero-copy
    
    // Method 2: Bytes::as_ref() - zero-copy, all data
    let slice: &[u8] = bytes.as_ref(); // &[u8], zero-copy
    
    // Method 3: Buf::copy_to_bytes() - may copy, depending on the Buf type
    let mut buf = Bytes::from("hello world");
    let copy = buf.copy_to_bytes(5); // Bytes; for Bytes itself this is a cheap split_to(), not a copy
    
    // Method 4: Buf::copy_to_slice() - copies to slice
    let mut buf = Bytes::from("hello world");
    let mut dest = [0u8; 5];
    buf.copy_to_slice(&mut dest); // Copies to dest
    
    // Method 5: to_vec() on the chunk - copies
    let buf = Bytes::from("hello world");
    let vec: Vec<u8> = buf.chunk().to_vec(); // Copies the current chunk only!
}

Different methods have different copying behaviors.

When to Use Each Method

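use bytes::{Buf, Bytes};

// Placeholder processing hooks so the example compiles; substitute real logic
fn process_chunk(_chunk: &[u8]) {}
fn process_all(_data: &[u8]) {}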
 
fn when_to_use_what() {
    // Use chunk() when:
    // - You need zero-copy access
    // - You're implementing Buf yourself
    // - You're working with arbitrary Buf types
    // - You're processing data incrementally
    
    let mut buf = Bytes::from("data");
    while buf.has_remaining() {
        let chunk = buf.chunk();
        process_chunk(chunk);
        buf.advance(chunk.len());
    }
    
    // Use Bytes::as_ref() when:
    // - You have a Bytes specifically
    // - You need all data at once
    // - You want a simple &[u8]
    
    let bytes = Bytes::from("data");
    let slice: &[u8] = bytes.as_ref();
    process_all(slice);
    
    // Use copy_to_bytes() when:
    // - You need contiguous Bytes from Chain
    // - You need ownership of data
    // - You're okay with copying
    
    let mut chain = Bytes::from("hello").chain(Bytes::from("world"));
    let contiguous = chain.copy_to_bytes(10); // Copies for contiguity
    
    // Use copy_to_slice() when:
    // - You have a fixed-size buffer
    // - You need to write into existing allocation
    
    let mut buf = Bytes::from("data");
    let mut dest = [0u8; 4];
    buf.copy_to_slice(&mut dest);
}

Choose based on your specific needs for copying and contiguity.

Network Protocol Example

use bytes::{Buf, BytesMut};
 
// Parses one field: a big-endian u16 length followed by that many payload bytes.
// Returns None when the buffer does not yet hold the complete field.
fn parse_field(buffer: &mut BytesMut) -> Option<BytesMut> {
    if buffer.remaining() < 2 {
        return None; // Need more data
    }
    
    // Peek at the length prefix without consuming it
    let header = buffer.chunk();
    let len = u16::from_be_bytes([header[0], header[1]]) as usize;
    if buffer.remaining() < 2 + len {
        return None; // Need more data
    }
    
    // Advance past the prefix, then split off the payload (no copy)
    buffer.advance(2);
    Some(buffer.split_to(len))
}
 
fn parse_protocol_message() {
    // Simulating a network protocol parser: two length-prefixed strings
    let mut buffer = BytesMut::from(&b"\x00\x05hello\x00\x05world"[..]);
    
    let Some(first) = parse_field(&mut buffer) else { return };
    let Some(second) = parse_field(&mut buffer) else { return };
    
    // Both payloads share the original allocation; nothing was copied while parsing
    println!("First: {:?}", std::str::from_utf8(&first));
    println!("Second: {:?}", std::str::from_utf8(&second));
}

Network parsing often uses chunk() for zero-copy header parsing.

Performance Implications

use bytes::{Buf, Bytes};
 
fn performance_implications() {
    // Zero-copy with chunk()
    fn process_zero_copy(buf: &mut Bytes) {
        while buf.has_remaining() {
            let chunk = buf.chunk();
            // Process in place
            for byte in chunk {
                // Do something with *byte
            }
            buf.advance(chunk.len());
        }
    }
    
    // Copying approach
    fn process_with_copy(buf: &mut Bytes) {
        let data = buf.chunk().to_vec(); // Allocation!
        for byte in data {
            // Process copied data
        }
    }
    
    // For large buffers, zero-copy saves:
    // - Memory allocation
    // - Memory copy
    // - Cache pressure
    
    // But requires:
    // - Processing in chunks
    // - Understanding buffer boundaries
    // - Careful lifetime management
}

Zero-copy processing avoids allocations but requires understanding chunk boundaries.

Iterating Over Chunks

use bytes::{Buf, Bytes};
 
fn iterate_chunks() {
    // For Chain or other non-contiguous buffers:
    let buf1 = Bytes::from("hello ");
    let buf2 = Bytes::from("world");
    let mut chain = buf1.chain(buf2);
    
    // Manual iteration
    while chain.has_remaining() {
        let chunk = chain.chunk();
        println!("Chunk: {:?}", std::str::from_utf8(chunk));
        chain.advance(chunk.len());
    }
    
    // There is no chunk iterator on Buf, and Buf itself is not an Iterator,
    // so the has_remaining() loop above is the way to visit every chunk in turn.
    // To look at several chunks at once without consuming them, Buf also offers
    // chunks_vectored(), which fills a caller-provided array of std::io::IoSlice.
}

Buf has no chunk iterator: loop with has_remaining(), chunk(), and advance(), or use chunks_vectored() to view several chunks at once.

Real-World Pattern: Framing Parser

use bytes::{Buf, BytesMut};
 
struct FramedParser {
    buffer: BytesMut,
}
 
impl FramedParser {
    fn new() -> Self {
        Self {
            buffer: BytesMut::with_capacity(4096),
        }
    }
    
    // Zero-copy frame parsing
    fn try_parse_frame(&mut self) -> Option<BytesMut> {
        // Need at least 4 bytes for length header
        if self.buffer.remaining() < 4 {
            return None;
        }
        
        // Peek at length without consuming
        let chunk = self.buffer.chunk();
        let len = u32::from_be_bytes([
            chunk[0], chunk[1], chunk[2], chunk[3]
        ]) as usize;
        
        // Check if we have complete frame
        if self.buffer.remaining() < 4 + len {
            return None;
        }
        
        // Consume length header
        self.buffer.advance(4);
        
        // Split off frame (zero-copy in BytesMut)
        Some(self.buffer.split_to(len))
    }
    
    fn feed(&mut self, data: &[u8]) {
        self.buffer.extend_from_slice(data);
    }
}
 
fn frame_parser_usage() {
    let mut parser = FramedParser::new();
    
    // Feed partial data
    parser.feed(&[0, 0, 0, 5]); // Length header for 5 bytes
    parser.feed(b"hello"); // Frame payload
    
    // Try to parse
    if let Some(frame) = parser.try_parse_frame() {
        println!("Frame: {:?}", String::from_utf8_lossy(&frame));
    }
}

Framing protocols use chunk() to peek at headers before consuming.

Summary Table

fn summary() {
    // | Method            | Returns      | Copies? | Scope        |
    // |-------------------|--------------|---------|--------------|
    // | chunk()           | &[u8]        | No      | Contiguous   |
    // | bytes() (pre-1.0) | &[u8]        | No      | Contiguous   |
    // | Bytes::as_ref()   | &[u8]        | No      | All data     |
    // | Bytes::slice()    | Bytes        | No      | Sub-slice    |
    // | copy_to_bytes()   | Bytes        | Maybe*  | Requested    |
    // | copy_to_slice()   | ()           | Yes     | Dest size    |
    // | to_vec()          | Vec<u8>      | Yes     | Chunk data   |
    //
    // * On Bytes, copy_to_bytes() is a cheap split_to(); on Chain it copies
    //   when the requested range spans both halves
    
    // Use chunk() for:
    // - Zero-copy incremental processing
    // - Parsing protocols with headers
    // - Working with Chain, Cursor, custom Buf types
    // - Avoiding allocations in hot paths
    
    // Use as_ref()/slice() for:
    // - Direct access to Bytes
    // - When you need all data
    // - When you know buffer is contiguous
    
    // Use copy_* when:
    // - You need ownership
    // - You need contiguity from Chain
    // - You're crossing FFI boundaries
}

Synthesis

Quick reference:

use bytes::{Buf, Bytes};
 
fn quick_reference() {
    // chunk() - current contiguous portion, zero-copy
    let mut buf = Bytes::from("hello world");
    let chunk = buf.chunk(); // &[u8] - all data for Bytes
    
    // For Chain, chunk() returns only first part
    let mut chain = Bytes::from("hello").chain(Bytes::from("world"));
    let first = chain.chunk(); // "hello" only!
    chain.advance(5);
    let second = chain.chunk(); // "world" now
    
    // Pattern: process all chunks
    while buf.has_remaining() {
        let chunk = buf.chunk();
        // Process chunk
        buf.advance(chunk.len());
    }
}

Key insight: Buf::chunk() is the fundamental zero-copy access primitive for byte buffers, but it only returns the current contiguous portion. For Bytes, BytesMut, &[u8], and Cursor<T> that is all remaining data; for Chain<Bytes, Bytes>, VecDeque<u8>, or a custom implementation with fragmented storage it is just one segment. Before bytes 1.0 the same method was called bytes(). It was renamed because that name suggests you are getting "all bytes" when in fact you are getting the "current contiguous chunk."

The zero-copy pattern is: peek with chunk(), process the slice, then consume with advance(). Nothing is allocated, because the slice points directly into the buffer's memory. If you need all data in one contiguous piece, copy_to_bytes() provides it, but for non-contiguous buffers it has to copy, which gives up the zero-copy benefit. For network protocols on a contiguous buffer such as BytesMut, chunk() is ideal: peek at the length header, check that the whole frame has arrived, and only then advance() past the header and split_to() the payload. With a generic Buf, remember that a header can straddle a chunk boundary; the get_* methods handle that case.

The Buf trait abstracts over different kinds of storage (contiguous Bytes, chains, slices, custom fragmented buffers), and chunk() gives a consistent zero-copy view of it. You only need to remember that it may return just part of your data.
