How does the tower::Service trait enable middleware composition in async Rust applications?

The tower::Service trait provides a uniform interface for asynchronous request-response operations, enabling a powerful middleware composition pattern. By abstracting over the details of how requests are processed, it allows middleware layers to be stacked and combined in ways that would be difficult with ad-hoc approaches.

The Service Trait: A Uniform Abstraction

At its core, the Service trait represents an asynchronous function from a request to a response:

use tower::Service;
use std::future::Future;
 
pub trait Service<Request> {
    type Response;
    type Error;
    type Future: Future<Output = Result<Self::Response, Self::Error>>;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>>;
    
    fn call(&mut self, req: Request) -> Self::Future;
}

The trait separates readiness checking (poll_ready) from request processing (call), enabling backpressure and resource management.

The Basic Service Pattern

A simple service wraps an async operation:

use tower::Service;
use std::future::{ready, Ready};
 
struct EchoService;
 
impl Service<String> for EchoService {
    type Response = String;
    type Error = std::convert::Infallible;
    type Future = Ready<Result<String, Self::Error>>;
 
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        // Always ready
        std::task::Poll::Ready(Ok(()))
    }
 
    fn call(&mut self, req: String) -> Self::Future {
        ready(Ok(req))
    }
}

This abstraction allows any async operation to be treated uniformly.

Middleware as Service Wrappers

Middleware wraps an inner service, transforming requests before they reach the inner service or responses before they return:

use tower::Service;
use std::future::Future;
use std::pin::Pin;
 
struct LoggingService<S> {
    inner: S,
}
 
impl<S, Request> Service<Request> for LoggingService<S>
where
    S: Service<Request>,
    S::Error: std::fmt::Debug,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = LoggingFuture<S::Future>;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        self.inner.poll_ready(cx)
    }
 
    fn call(&mut self, req: Request) -> Self::Future {
        println!("Received request");
        LoggingFuture {
            inner: self.inner.call(req),
        }
    }
}
 
struct LoggingFuture<F> {
    inner: F,
}
 
impl<F, T, E> Future for LoggingFuture<F>
where
    F: Future<Output = Result<T, E>>,
    E: std::fmt::Debug,
{
    type Output = Result<T, E>;
 
    fn poll(mut self: Pin<&mut Self>, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Self::Output> 
    {
        // SAFETY: `inner` is never moved out of the pinned struct; in
        // production code the pin-project crate avoids this unsafe block.
        let result = unsafe { 
            self.as_mut().map_unchecked_mut(|s| &mut s.inner).poll(cx) 
        };
        if let std::task::Poll::Ready(ref res) = result {
            match res {
                Ok(_) => println!("Request succeeded"),
                Err(e) => println!("Request failed: {:?}", e),
            }
        }
        result
    }
}

The middleware intercepts the request-response cycle without knowing the details of the inner service.

Composing Multiple Middleware Layers

Middleware layers compose naturally by wrapping each other:

use tower::Service;
 
// Assume we have these middleware types:
// LoggingService<S>
// TimeoutService<S>
// RateLimitService<S>
 
type MyService = LoggingService<
    TimeoutService<
        RateLimitService<
            EchoService
        >
    >
>;
 
// Request flow: Logging -> Timeout -> RateLimit -> Echo
// Response flow: Echo -> RateLimit -> Timeout -> Logging

Each layer adds behavior while delegating to the next. The type signature reflects the composition order.

The Layer Trait for Middleware Construction

Tower provides the Layer trait to separate middleware configuration from application:

pub trait Layer<S> {
    type Service;
 
    fn layer(&self, inner: S) -> Self::Service;
}
 
// Using layers
use tower::ServiceBuilder;
use std::time::Duration;
 
let service = ServiceBuilder::new()
    .layer(LoggingLayer)
    .layer(TimeoutLayer::new(Duration::from_secs(30)))
    .layer(RateLimitLayer::new(100, Duration::from_secs(1)))
    .service(EchoService);

ServiceBuilder provides a fluent interface for stacking layers, producing a single composed service.

Built-in Middleware Examples

Tower includes many ready-to-use middleware:

use tower::ServiceBuilder;
use tower::limit::{RateLimitLayer, ConcurrencyLimitLayer};
use tower::timeout::TimeoutLayer;
use tower::retry::RetryLayer;
use tower::load_shed::LoadShedLayer;
use std::time::Duration;
 
let service = ServiceBuilder::new()
    // Shed load when backlogged
    .layer(LoadShedLayer::new())
    // Limit to 100 concurrent requests
    .layer(ConcurrencyLimitLayer::new(100))
    // Limit to 10 requests per second
    .layer(RateLimitLayer::new(10, Duration::from_secs(1)))
    // Add timeout
    .layer(TimeoutLayer::new(Duration::from_secs(30)))
    // Retry failed requests
    .layer(RetryLayer::new(retry_policy))
    .service(inner_service);

These layers compose correctly because they all implement Service.

Request Transformation Middleware

Middleware can transform the request type:

use tower::Service;
 
struct AddHeaderService<S> {
    inner: S,
}
 
#[derive(Debug)]
struct RequestWithHeaders {
    body: String,
    headers: Vec<(String, String)>,
}
 
impl<S> Service<String> for AddHeaderService<S>
where
    S: Service<RequestWithHeaders>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        self.inner.poll_ready(cx)
    }
 
    fn call(&mut self, req: String) -> Self::Future {
        let enriched = RequestWithHeaders {
            body: req,
            headers: vec![("X-Custom".to_string(), "value".to_string())],
        };
        self.inner.call(enriched)
    }
}

This pattern enables middleware to add context to requests as they flow through the stack.

Response Transformation Middleware

Middleware can also transform responses:

use tower::Service;
use std::future::Future;
use std::pin::Pin;
 
struct MapResponseService<S, F> {
    inner: S,
    f: F,
}
 
impl<S, F, Request, Response> Service<Request> for MapResponseService<S, F>
where
    S: Service<Request>,
    F: Fn(S::Response) -> Response + Clone,
{
    type Response = Response;
    type Error = S::Error;
    type Future = MapResponseFuture<S::Future, F>;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        self.inner.poll_ready(cx)
    }
 
    fn call(&mut self, req: Request) -> Self::Future {
        MapResponseFuture {
            inner: self.inner.call(req),
            f: Some(self.f.clone()),
        }
    }
}
 
struct MapResponseFuture<Fut, F> {
    inner: Fut,
    f: Option<F>,
}
 
impl<Fut, F, T, E, Response> Future for MapResponseFuture<Fut, F>
where
    Fut: Future<Output = Result<T, E>>,
    F: Fn(T) -> Response,
{
    type Output = Result<Response, E>;
 
    fn poll(mut self: Pin<&mut Self>, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Self::Output> 
    {
        // SAFETY: `inner` stays pinned in place; `f` is not structurally
        // pinned, so taking it out of the Option is fine.
        let this = unsafe { self.as_mut().get_unchecked_mut() };
        let result = unsafe { Pin::new_unchecked(&mut this.inner) }.poll(cx);
        match result {
            std::task::Poll::Ready(output) => {
                let f = this.f.take().expect("future polled after completion");
                // Map only the success value; errors pass through unchanged,
                // matching the Error = S::Error declared on the service.
                std::task::Poll::Ready(output.map(f))
            }
            std::task::Poll::Pending => std::task::Poll::Pending,
        }
    }
}

This enables middleware to adapt response types or post-process results.

Backpressure with poll_ready

The poll_ready method enables backpressure propagation:

use tower::Service;
use std::sync::atomic::{AtomicUsize, Ordering};
 
struct ConcurrencyLimitService<S> {
    inner: S,
    active: AtomicUsize,
    limit: usize,
}
 
impl<S, Request> Service<Request> for ConcurrencyLimitService<S>
where
    S: Service<Request>,
{
    type Response = S::Response;
    type Error = S::Error;
    type Future = S::Future;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        // Check whether we're at capacity
        if self.active.load(Ordering::Relaxed) >= self.limit {
            // Not ready. Waking ourselves immediately makes the executor
            // busy-poll; a real implementation parks the waker behind a
            // semaphore permit and wakes it only when capacity frees up.
            cx.waker().wake_by_ref();
            return std::task::Poll::Pending;
        }
        
        // Check if inner service is ready
        self.inner.poll_ready(cx)
    }
 
    fn call(&mut self, req: Request) -> Self::Future {
        self.active.fetch_add(1, Ordering::Relaxed);
        // In a real implementation, we'd decrement on completion
        self.inner.call(req)
    }
}

When a service isn't ready, callers should wait before invoking call. This propagates backpressure through the middleware stack.

Real-World Usage: Tower-HTTP

The tower-http crate provides HTTP-specific middleware using Tower's abstractions:

use tower_http::{
    trace::TraceLayer,
    compression::CompressionLayer,
    cors::CorsLayer,
    limit::RequestBodyLimitLayer,
};
use tower::ServiceBuilder;
 
let service = ServiceBuilder::new()
    .layer(TraceLayer::new_for_http())
    .layer(CompressionLayer::new())
    .layer(CorsLayer::permissive())
    .layer(RequestBodyLimitLayer::new(1024 * 1024))  // 1MB limit
    .service(http_handler);

Each layer adds HTTP-specific functionality while maintaining the Service interface.

Integration with Web Frameworks

Axum, Tonic, and other frameworks are built on Tower:

use axum::{
    Router,
    routing::get,
};
use tower::ServiceBuilder;
use tower_http::trace::TraceLayer;
 
let app = Router::new()
    .route("/", get(handler))
    .layer(ServiceBuilder::new()
        .layer(TraceLayer::new_for_http())
        // More middleware...
    );
 
// Axum routes are Tower services
// The .layer() method adds middleware to all routes

This integration means middleware written for Tower works across multiple frameworks.

Custom Middleware with tower::ServiceExt

The ServiceExt trait (available with tower's util feature) provides utilities for working with services:

use tower::{Service, ServiceExt};
 
// `MyService` and `Request` stand in for your own service and request types
async fn example() -> Result<(), Box<dyn std::error::Error>> {
    let mut service = MyService::new();
    
    // Wait until ready
    service.ready().await?;
    
    // Call the service
    let response = service.call(Request::new()).await?;
    
    // Or use the one-shot pattern
    let response = MyService::new()
        .oneshot(Request::new())
        .await?;
    
    Ok(())
}

Middleware State and Cloning

Services often need to be cloned for each request:

use tower::Service;
use std::sync::Arc;
 
// Application-wide state shared across requests
struct AppState { /* config, connection pools, etc. */ }
 
struct SharedStateService<S> {
    inner: S,
    state: Arc<AppState>,
}
 
impl<S> Clone for SharedStateService<S>
where
    S: Clone,
{
    fn clone(&self) -> Self {
        SharedStateService {
            inner: self.inner.clone(),
            state: Arc::clone(&self.state),
        }
    }
}
 
// Tower's ServiceBuilder creates services that can be cloned
// Each request gets its own clone of the service stack

This pattern ensures each request has isolated service state while sharing application state through Arc.

The Power of Uniform Composition

The Service trait's power comes from uniformity. Every middleware implements the same interface, enabling arbitrary composition:

use tower::ServiceBuilder;
 
// This composition is possible because all layers produce Services
let service = ServiceBuilder::new()
    .layer(MetricsLayer::new())
    .layer(TimeoutLayer::new(Duration::from_secs(30)))
    .layer(RetryLayer::new(policy))
    .layer(ConcurrencyLimitLayer::new(100))
    .layer(CacheLayer::new())
    .service(handler);

Each layer is oblivious to what comes before or after—it just sees a Service to delegate to.

Error Handling Across Layers

Errors propagate through the middleware stack:

use tower::Service;
use std::future::Future;
 
struct ErrorHandlerService<S, F> {
    inner: S,
    handler: F,
}
 
impl<S, F, Request, Response, Error> Service<Request> for ErrorHandlerService<S, F>
where
    S: Service<Request, Error = Error>,
    F: Fn(Error) -> Response + Clone,
{
    type Response = Response;
    type Error = std::convert::Infallible;  // All errors handled
    type Future = ErrorHandlerFuture<S::Future, F>;
 
    fn poll_ready(&mut self, cx: &mut std::task::Context<'_>) 
        -> std::task::Poll<Result<(), Self::Error>> 
    {
        // Self::Error is Infallible, so an inner readiness error cannot be
        // returned here. This sketch reports "ready" and lets errors surface
        // through call()'s future; a full implementation would stash the
        // readiness error and feed it to the handler on the next call().
        self.inner.poll_ready(cx).map(|_| Ok(()))
    }
 
    fn call(&mut self, req: Request) -> Self::Future {
        ErrorHandlerFuture {
            inner: self.inner.call(req),
            handler: Some(self.handler.clone()),
        }
    }
}
 
// Errors from inner services can be caught and transformed
// by outer middleware layers

This enables centralized error handling and recovery strategies.

Synthesis

The tower::Service trait enables middleware composition by providing a uniform interface for asynchronous request-response operations. Middleware wraps inner services, transforming requests and responses while delegating through the stack. The Layer trait separates middleware configuration from application, and ServiceBuilder provides a fluent interface for composition.

The separation of poll_ready from call enables backpressure propagation, allowing services to signal when they're at capacity. Combined with the uniform interface, this means middleware can implement rate limiting, timeouts, retries, and load shedding in a way that composes correctly with other middleware.

This abstraction underpins Rust's web ecosystem—Axum, Tonic, and other frameworks are built on Tower, meaning middleware written for one framework often works with others. The pattern demonstrates how a simple, well-designed trait can enable powerful composition without sacrificing type safety or performance.