How does hyper::server::Server::serve handle graceful shutdown signals?

hyper::server::Server::serve does not handle shutdown signals on its own: it returns a future that accepts connections until it returns an error or is dropped. (The API shown here is hyper 0.14's; hyper 1.0 removed Server.) Graceful shutdown comes from chaining with_graceful_shutdown onto serve and passing it a future that resolves when shutdown should begin. Once that future resolves, the server stops accepting new connections, enters a draining phase in which in-flight requests are allowed to finish, and only then resolves itself. Dropping the serving future instead, for example by racing it against a CancellationToken in tokio::select!, stops the accept loop without waiting for connections that are already being served, so those requests are cut off when the program exits. with_graceful_shutdown has no built-in deadline; a timeout on the draining phase has to be added by the caller.

Basic Server with serve

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    let server = Server::bind(&addr).serve(make_svc);
    
    println!("Server running on http://{}", addr);
    
    // This runs forever until interrupted
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    Ok(())
}

The basic serve future runs until it is dropped or returns an error; it has no shutdown hook of its own.

Graceful Shutdown with CancellationToken

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use tokio_util::sync::CancellationToken;
use std::convert::Infallible;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    let cancel_token = CancellationToken::new();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    // Clone the token for the signal-handling task
    let cancel_token_clone = cancel_token.clone();
    
    // Spawn a task that cancels the token on Ctrl+C
    tokio::spawn(async move {
        tokio::signal::ctrl_c()
            .await
            .expect("Failed to install Ctrl+C handler");
        println!("Shutdown signal received");
        cancel_token_clone.cancel();
    });
    
    // Drain when the token is cancelled. Racing the server against the token
    // in tokio::select! would drop the server future without waiting for
    // in-flight requests, which is not a graceful shutdown.
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async move { cancel_token.cancelled().await });
    
    println!("Server running on http://{}", addr);
    
    // Resolves once the token is cancelled and in-flight requests have finished
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    println!("Server stopped");
    
    Ok(())
}

Using CancellationToken allows external signalling for shutdown.

with_graceful_shutdown Pattern

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    // Simulate some work
    tokio::time::sleep(Duration::from_millis(100)).await;
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    // Create shutdown signal
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    
    // Spawn signal handler
    tokio::spawn(async move {
        tokio::signal::ctrl_c()
            .await
            .expect("Failed to install Ctrl+C handler");
        println!("Shutdown signal received");
        let _ = tx.send(());
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async {
            rx.await.ok();
        });
    
    println!("Server running on http://{}", addr);
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    println!("Server stopped");
    Ok(())
}

with_graceful_shutdown takes a future that resolves when shutdown should begin.
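
Because the signal is just a future with Output = (), it does not have to come from a channel. As a minimal sketch using only the standard library, the following shows what "resolves when shutdown should begin" means at the poll level. FlagSignal, NoopWake, and poll_once are hypothetical names for illustration, not hyper or tokio APIs, and a real signal future would store the waker rather than re-waking itself.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical signal future: ready once the shared flag has been set.
struct FlagSignal {
    flag: Arc<AtomicBool>,
}

impl Future for FlagSignal {
    type Output = ();

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.flag.load(Ordering::SeqCst) {
            Poll::Ready(())
        } else {
            // Toy behaviour: ask to be polled again right away (busy polling).
            // A real future would store the waker and wake it when the flag flips.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that does nothing; enough for polling by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once, outside of any async runtime.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}

fn main() {
    let flag = Arc::new(AtomicBool::new(false));
    let mut signal = FlagSignal { flag: Arc::clone(&flag) };

    // Before the flag is set the signal is pending: the server keeps serving.
    println!("ready before shutdown: {}", poll_once(&mut signal).is_ready());

    // Setting the flag makes the signal resolve: draining would start here.
    flag.store(true, Ordering::SeqCst);
    println!("ready after shutdown: {}", poll_once(&mut signal).is_ready());
}
```

In an async program a oneshot receiver, a CancellationToken, or a signal listener plays the same role with proper wakeups.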

How Graceful Shutdown Works Internally

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
 
async fn handle_request(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    println!("Request started: {}", req.uri());
    
    // Simulate long-running request
    tokio::time::sleep(Duration::from_secs(2)).await;
    
    println!("Request completed: {}", req.uri());
    Ok(Response::new(Body::from("Completed")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    
    tokio::spawn(async move {
        // Simulate shutdown after 1 second
        tokio::time::sleep(Duration::from_secs(1)).await;
        println!("Initiating graceful shutdown");
        let _ = tx.send(());
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async { rx.await.ok(); });
    
    println!("Server running on http://{}", addr);
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    println!("Server has stopped");
    // At this point:
    // 1. No new connections are accepted
    // 2. In-flight requests have completed
    // 3. All connections are closed
    
    Ok(())
}

Graceful shutdown: stops accepting new connections, waits for active requests, then exits.
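
The three steps can be modelled without hyper. The sketch below is a toy, thread-based stand-in written for illustration (ToyServer and its methods are hypothetical names, and this is not how hyper implements draining): requests accepted before shutdown run to completion, requests offered afterwards are refused, and drain returns only when every accepted request has finished.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Toy stand-in for a server: an accepting flag plus the in-flight work.
struct ToyServer {
    accepting: bool,
    completed: Arc<AtomicUsize>,
    in_flight: Vec<JoinHandle<()>>,
}

impl ToyServer {
    fn new() -> Self {
        ToyServer {
            accepting: true,
            completed: Arc::new(AtomicUsize::new(0)),
            in_flight: Vec::new(),
        }
    }

    /// Starts a request that takes `work` to finish.
    /// Returns false if the server has stopped accepting.
    fn try_accept(&mut self, work: Duration) -> bool {
        if !self.accepting {
            return false;
        }
        let completed = Arc::clone(&self.completed);
        self.in_flight.push(thread::spawn(move || {
            thread::sleep(work);
            completed.fetch_add(1, Ordering::SeqCst);
        }));
        true
    }

    /// Step 1: stop accepting new work.
    fn begin_shutdown(&mut self) {
        self.accepting = false;
    }

    /// Steps 2 and 3: wait for every accepted request, then report the total.
    fn drain(self) -> usize {
        for handle in self.in_flight {
            handle.join().expect("request thread panicked");
        }
        self.completed.load(Ordering::SeqCst)
    }
}

fn main() {
    let mut server = ToyServer::new();
    for _ in 0..3 {
        server.try_accept(Duration::from_millis(50));
    }

    server.begin_shutdown();
    let accepted_late = server.try_accept(Duration::from_millis(50));

    let completed = server.drain();
    println!("late request accepted: {}", accepted_late);
    println!("requests completed during drain: {}", completed);
}
```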

Shutdown with Timeout

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
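    // Maximum time allowed for draining once shutdown has begun
    let shutdown_timeout = Duration::from_secs(30);
    
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    let (deadline_tx, deadline_rx) = tokio::sync::oneshot::channel::<()>();
    
    tokio::spawn(async move {
        tokio::signal::ctrl_c()
            .await
            .expect("Failed to install Ctrl+C handler");
        // Start draining, then start the deadline clock
        let _ = tx.send(());
        tokio::time::sleep(shutdown_timeout).await;
        let _ = deadline_tx.send(());
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async { rx.await.ok(); });
    
    tokio::select! {
        result = server => match result {
            Ok(()) => println!("Server shutdown gracefully"),
            Err(e) => eprintln!("Server error: {}", e),
        },
        // Fires only if the deadline elapses before draining completes
        Ok(()) = deadline_rx => {
            eprintln!("Graceful shutdown timed out, forcing exit");
        }
    }
    
    Ok(())
}

Start the deadline when the shutdown signal arrives. Wrapping the whole server future in tokio::time::timeout would instead stop the server that long after startup, whether or not a signal was ever received.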
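
The idea of a drain bounded by a deadline can also be shown without an async runtime. The following is a minimal sketch using a Mutex and Condvar from the standard library; InFlight, wait_for_drain, and finish_request are hypothetical names for illustration and are not part of hyper. The grace period is measured from the moment the wait begins, which corresponds to the moment the shutdown signal arrives.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Shared in-flight request count plus a condition variable to signal changes.
type InFlight = Arc<(Mutex<usize>, Condvar)>;

/// Waits until the in-flight count reaches zero or `grace` elapses.
/// Returns true if the drain finished in time, false if the deadline won.
fn wait_for_drain(state: &InFlight, grace: Duration) -> bool {
    let (count, changed) = &**state;
    let guard = count.lock().unwrap();
    let (_guard, result) = changed
        .wait_timeout_while(guard, grace, |in_flight| *in_flight > 0)
        .unwrap();
    !result.timed_out()
}

/// Marks one request as finished and wakes the waiter.
fn finish_request(state: &InFlight) {
    let (count, changed) = &**state;
    *count.lock().unwrap() -= 1;
    changed.notify_all();
}

fn main() {
    // One request that finishes well inside the grace period.
    let state: InFlight = Arc::new((Mutex::new(1), Condvar::new()));
    let worker_state = Arc::clone(&state);
    let worker = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20));
        finish_request(&worker_state);
    });
    let drained = wait_for_drain(&state, Duration::from_secs(2));
    worker.join().unwrap();
    println!("fast request drained in time: {}", drained);

    // One request that never finishes: the deadline wins.
    let hung: InFlight = Arc::new((Mutex::new(1), Condvar::new()));
    let drained = wait_for_drain(&hung, Duration::from_millis(50));
    println!("hung request drained in time: {}", drained);
}
```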

Signal Handling on Unix Systems

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
 
#[cfg(unix)]
async fn shutdown_signal() {
    use tokio::signal::unix::{signal, SignalKind};
    
    let mut sigterm = signal(SignalKind::terminate())
        .expect("Failed to install SIGTERM handler");
    let mut sigint = signal(SignalKind::interrupt())
        .expect("Failed to install SIGINT handler");
    
    tokio::select! {
        _ = sigterm.recv() => println!("Received SIGTERM"),
        _ = sigint.recv() => println!("Received SIGINT"),
    }
}
 
#[cfg(not(unix))]
async fn shutdown_signal() {
    tokio::signal::ctrl_c()
        .await
        .expect("Failed to install Ctrl+C handler");
}
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(shutdown_signal());
    
    println!("Server running on http://{}", addr);
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    println!("Server stopped");
    Ok(())
}

On Unix, handle both SIGTERM and SIGINT: container runtimes and orchestrators typically stop a process with SIGTERM, while Ctrl+C in a terminal sends SIGINT.

Multiple Shutdown Signals

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
async fn shutdown_signal() {
    let ctrl_c = async {
        tokio::signal::ctrl_c()
            .await
            .expect("Failed to install Ctrl+C handler");
        println!("Ctrl+C pressed");
    };
    
    #[cfg(unix)]
    let terminate = async {
        use tokio::signal::unix::{signal, SignalKind};
        let mut sigterm = signal(SignalKind::terminate())
            .expect("Failed to install SIGTERM handler");
        sigterm.recv().await;
        println!("SIGTERM received");
    };
    
    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();
    
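    // Time-based trigger: begins shutdown 300 seconds after startup.
    // This is not a drain deadline; it only starts the drain.
    let timeout = async {
        tokio::time::sleep(Duration::from_secs(300)).await;
        println!("Scheduled shutdown time reached");
    };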
    
    tokio::select! {
        _ = ctrl_c => {}
        _ = terminate => {}
        _ = timeout => {}
    }
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(shutdown_signal());
    
    println!("Server running on http://{}", addr);
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    Ok(())
}

Combine multiple shutdown triggers using tokio::select!.
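
The "first trigger wins" behaviour of tokio::select! has a rough standard-library analogue: several producers share one channel and the consumer acts on whichever message arrives first. The sketch below is only an analogy under that assumption (first_trigger is a hypothetical helper, and the delays stand in for real signals); it is not how tokio::select! is implemented.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Spawns one thread per trigger and returns the name of the first to fire.
/// Returns None when no triggers were supplied.
fn first_trigger(triggers: Vec<(&'static str, Duration)>) -> Option<&'static str> {
    let (tx, rx) = mpsc::channel();
    for (name, delay) in triggers {
        let tx = tx.clone();
        thread::spawn(move || {
            // The delay stands in for waiting on a real signal.
            thread::sleep(delay);
            // Later triggers may find the receiver gone; that is fine.
            let _ = tx.send(name);
        });
    }
    // Drop the original sender so recv fails instead of blocking forever
    // when there are no triggers at all.
    drop(tx);
    rx.recv().ok()
}

fn main() {
    let winner = first_trigger(vec![
        ("SIGTERM", Duration::from_millis(500)),
        ("Ctrl+C", Duration::from_millis(10)),
    ]);
    println!("shutdown started by: {:?}", winner);
}
```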

Tracking Active Connections

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::sync::Arc;
use std::sync::atomic::{AtomicUsize, Ordering};
 
struct AppState {
    active_connections: AtomicUsize,
}
 
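async fn handle_request(
    _req: Request<Body>,
    state: Arc<AppState>,
) -> Result<Response<Body>, Infallible> {
    // Despite the field name, this counts in-flight requests, not TCP
    // connections: it changes once per handler call.
    state.active_connections.fetch_add(1, Ordering::SeqCst);
    
    // Simulate work so the count is observably non-zero while a request runs
    tokio::time::sleep(std::time::Duration::from_millis(100)).await;
    let response = Response::new(Body::from("Hello, World!"));
    
    state.active_connections.fetch_sub(1, Ordering::SeqCst);
    Ok(response)
}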
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    let state = Arc::new(AppState {
        active_connections: AtomicUsize::new(0),
    });
    
    let state_clone = Arc::clone(&state);
    let make_svc = make_service_fn(move |_conn| {
        let state = Arc::clone(&state_clone);
        async move {
            Ok::<_, Infallible>(service_fn(move |req| {
                handle_request(req, Arc::clone(&state))
            }))
        }
    });
    
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    
    tokio::spawn(async move {
        tokio::signal::ctrl_c().await.ok();
        println!("Shutdown signal received");
        let _ = tx.send(());
    });
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async { rx.await.ok(); });
    
    println!("Server running on http://{}", addr);
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    println!(
        "Server stopped. In-flight requests remaining: {}",
        state.active_connections.load(Ordering::SeqCst)
    );
    
    Ok(())
}

Track in-flight requests to understand shutdown behavior: the counter changes once per handler call, and after a graceful shutdown it should read zero.
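
Pairing fetch_add with a manual fetch_sub is fragile: an early return, or a handler future that is dropped before it finishes, skips the decrement. One common alternative is a guard that decrements in Drop. The sketch below uses only the standard library; InFlightGuard and handle are hypothetical names for illustration, and handle is a synchronous stand-in for a request handler.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Increments the counter on creation and decrements it when dropped.
struct InFlightGuard {
    counter: Arc<AtomicUsize>,
}

impl InFlightGuard {
    fn new(counter: &Arc<AtomicUsize>) -> Self {
        counter.fetch_add(1, Ordering::SeqCst);
        InFlightGuard { counter: Arc::clone(counter) }
    }
}

impl Drop for InFlightGuard {
    fn drop(&mut self) {
        self.counter.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Stand-in for a handler: returns the in-flight count it observed,
/// or an error when `fail` is set. The guard covers both paths.
fn handle(counter: &Arc<AtomicUsize>, fail: bool) -> Result<usize, &'static str> {
    let _guard = InFlightGuard::new(counter);
    if fail {
        return Err("handler failed");
    }
    Ok(counter.load(Ordering::SeqCst))
}

fn main() {
    let counter = Arc::new(AtomicUsize::new(0));

    let seen = handle(&counter, false);
    println!("count seen inside handler: {:?}", seen);
    println!("count after success: {}", counter.load(Ordering::SeqCst));

    let failed = handle(&counter, true);
    println!("handler result on failure: {:?}", failed);
    println!("count after failure: {}", counter.load(Ordering::SeqCst));
}
```

The same guard works in an async handler, because dropping the handler future drops the guard along with its other local variables.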

Graceful Shutdown in Production

use hyper::{Body, Request, Response, Server as HyperServer};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
use tokio::sync::watch;
 
async fn handle_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    Ok(Response::new(Body::from("Hello, World!")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Setup logging
    tracing_subscriber::fmt::init();
    
    let addr = ([0, 0, 0, 0], 8080).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(handle_request))
    });
    
    // Use watch channel for coordinated shutdown
    let (shutdown_tx, shutdown_rx) = watch::channel(false);
    
    // Spawn signal handler
    tokio::spawn(async move {
        #[cfg(unix)]
        {
            use tokio::signal::unix::{signal, SignalKind};
            let mut sigterm = signal(SignalKind::terminate()).unwrap();
            let mut sigint = signal(SignalKind::interrupt()).unwrap();
            
            tokio::select! {
                _ = sigterm.recv() => tracing::info!("Received SIGTERM"),
                _ = sigint.recv() => tracing::info!("Received SIGINT"),
            }
        }
        
        #[cfg(not(unix))]
        {
            tokio::signal::ctrl_c().await.unwrap();
            tracing::info!("Received Ctrl+C");
        }
        
        // Signal all components to shut down
        let _ = shutdown_tx.send(true);
    });
    
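    // One receiver starts the drain, the other starts the deadline clock.
    // changed() needs mutable access, so each receiver is bound as mut.
    let mut drain_rx = shutdown_rx.clone();
    let mut deadline_rx = shutdown_rx;
    
    let server = HyperServer::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async move {
            drain_rx.changed().await.ok();
        });
    
    tracing::info!("Server running on http://{}", addr);
    
    // The 30-second deadline starts when the signal arrives, not at startup
    let deadline = async move {
        deadline_rx.changed().await.ok();
        tokio::time::sleep(Duration::from_secs(30)).await;
    };
    
    tokio::select! {
        result = server => match result {
            Ok(()) => tracing::info!("Server shutdown gracefully"),
            Err(e) => tracing::error!("Server error: {}", e),
        },
        _ = deadline => tracing::warn!("Graceful shutdown timed out"),
    }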
    
    // Cleanup
    tracing::info!("Performing cleanup...");
    tokio::time::sleep(Duration::from_secs(1)).await;
    tracing::info!("Server stopped");
    
    Ok(())
}

Production pattern with coordinated shutdown and cleanup.

Connection Draining Behavior

use hyper::server::Server;
use hyper::{Body, Request, Response};
use hyper::service::{make_service_fn, service_fn};
use std::convert::Infallible;
use std::time::Duration;
 
async fn slow_request(_req: Request<Body>) -> Result<Response<Body>, Infallible> {
    println!("Request started");
    tokio::time::sleep(Duration::from_secs(5)).await;
    println!("Request finished");
    Ok(Response::new(Body::from("Slow response")))
}
 
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = ([127, 0, 0, 1], 3000).into();
    
    let make_svc = make_service_fn(|_conn| async {
        Ok::<_, Infallible>(service_fn(slow_request))
    });
    
    let (tx, rx) = tokio::sync::oneshot::channel::<()>();
    
    // Trigger shutdown after 2 seconds
    tokio::spawn(async move {
        tokio::time::sleep(Duration::from_secs(2)).await;
        println!("Triggering graceful shutdown");
        let _ = tx.send(());
    });
    
    println!("Server running on http://{}", addr);
    println!("Try: curl http://127.0.0.1:3000/");
    
    let server = Server::bind(&addr)
        .serve(make_svc)
        .with_graceful_shutdown(async { rx.await.ok(); });
    
    if let Err(e) = server.await {
        eprintln!("Server error: {}", e);
    }
    
    // Example timeline, assuming one request arrives at t=1:
    // t=0: Server starts
    // t=1: Request arrives, starts processing (5 second sleep)
    // t=2: Shutdown signal received, no new connections accepted
    // t=6: Request finishes (5 second sleep completes)
    // t=6: Server exits gracefully
    
    println!("Server stopped");
    Ok(())
}

Graceful shutdown waits for in-flight requests to complete.
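
The timeline in the comments above reduces to a small calculation: the server exits at the later of the shutdown time and the finish time of the last request accepted before shutdown. The sketch below works that out under a simplified model in which a request arriving at or after the shutdown time is never served; graceful_exit_time is a hypothetical helper and all times are whole seconds.

```rust
/// Computes when a gracefully shut down server exits.
/// Each request is (arrival_time, duration), in seconds since startup.
fn graceful_exit_time(shutdown_at: u64, requests: &[(u64, u64)]) -> u64 {
    requests
        .iter()
        // Only requests accepted before the signal are part of the drain.
        .filter(|(arrival, _)| *arrival < shutdown_at)
        // Each accepted request finishes at arrival + duration.
        .map(|(arrival, duration)| arrival + duration)
        // The server cannot exit before the signal or before the last finish.
        .fold(shutdown_at, |latest, finish| latest.max(finish))
}

fn main() {
    // The example above: a 5-second request arrives at t=1, shutdown at t=2.
    println!("exit at t={}", graceful_exit_time(2, &[(1, 5)]));

    // A request arriving after the signal does not extend the drain.
    println!("exit at t={}", graceful_exit_time(2, &[(1, 5), (3, 5)]));

    // With nothing in flight the server exits as soon as the signal fires.
    println!("exit at t={}", graceful_exit_time(2, &[]));
}
```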

Comparison: serve vs with_graceful_shutdown

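| Aspect | serve alone | with_graceful_shutdown |
|---|---|---|
| Termination | Ends when the future is dropped or errors | Resolves after in-flight requests finish |
| Shutdown trigger | External: drop the future or race it in select! | Any future supplied by the caller |
| Connection draining | None | Automatic, with no built-in deadline |
| New connections during shutdown | Listener is dropped | No longer accepted once the signal fires |
| Use case | Development, testing | Production |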

Synthesis

Shutdown phases:

| Phase | Behavior |
|---|---|
| Signal received | Stop accepting new connections |
| Draining | Wait for in-flight requests |
| Timeout (optional) | Force shutdown after deadline |
| Cleanup | Release resources, exit |
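
The phases above can be read as a small state machine. The following is an illustrative model only (Phase, Event, step, and run are hypothetical names, and this is not hyper's internal representation): a shutdown signal moves the server from accepting to draining, and either the last request finishing or the optional deadline moves it to stopped.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Accepting { in_flight: usize },
    Draining { in_flight: usize },
    Stopped,
}

#[derive(Debug, Clone, Copy)]
enum Event {
    RequestStarted,
    RequestFinished,
    ShutdownSignal,
    DeadlineElapsed,
}

/// Applies one event to the current phase and returns the next phase.
fn step(phase: Phase, event: Event) -> Phase {
    use Event::*;
    use Phase::*;
    match (phase, event) {
        (Accepting { in_flight }, RequestStarted) => Accepting { in_flight: in_flight + 1 },
        (Accepting { in_flight }, RequestFinished) => Accepting {
            in_flight: in_flight.saturating_sub(1),
        },
        // Nothing in flight: the signal stops the server immediately.
        (Accepting { in_flight: 0 }, ShutdownSignal) => Stopped,
        (Accepting { in_flight }, ShutdownSignal) => Draining { in_flight },
        // The last in-flight request finishing ends the drain.
        (Draining { in_flight }, RequestFinished) if in_flight <= 1 => Stopped,
        (Draining { in_flight }, RequestFinished) => Draining { in_flight: in_flight - 1 },
        // The optional deadline forces the stop even with work in flight.
        (Draining { .. }, DeadlineElapsed) => Stopped,
        // Everything else is ignored, including new requests while draining.
        (other, _) => other,
    }
}

/// Runs a sequence of events starting from an idle, accepting server.
fn run(events: &[Event]) -> Phase {
    events
        .iter()
        .fold(Phase::Accepting { in_flight: 0 }, |phase, event| step(phase, *event))
}

fn main() {
    let draining = run(&[
        Event::RequestStarted,
        Event::RequestStarted,
        Event::ShutdownSignal,
        Event::RequestFinished,
    ]);
    println!("after signal and one finish: {:?}", draining);
    println!("after last finish: {:?}", step(draining, Event::RequestFinished));
    println!("after deadline instead: {:?}", step(draining, Event::DeadlineElapsed));
}
```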

Signal sources:

| Source | Implementation |
|---|---|
| Ctrl+C | tokio::signal::ctrl_c() |
| SIGTERM (Unix) | signal(SignalKind::terminate()) |
| SIGINT (Unix) | signal(SignalKind::interrupt()) |
| Programmatic | oneshot::channel, watch::channel |
| Timeout | tokio::time::sleep |

Key insight: hyper::server::Server::serve combined with with_graceful_shutdown implements graceful shutdown by accepting a future that resolves when shutdown should begin. When that future resolves, the server stops accepting new connections and keeps processing in-flight requests until they complete; the future returned by with_graceful_shutdown resolves only after every active connection has finished. with_graceful_shutdown itself has no deadline, so a hung request can hold the server open indefinitely. Simply dropping the serve future is different: nothing waits for active connections, and they are cut off when the process exits. Graceful shutdown matters for rolling updates: a Kubernetes pod that receives SIGTERM should finish its existing requests before terminating, so that clients do not see connections dropped mid-request. Pair it with a deadline that starts when the signal arrives to guarantee termination even if some requests are slow or hung.