What is tower::load_shed::LoadShedLayer and how does it protect services from overload?

tower::load_shed::LoadShedLayer is middleware that rejects requests when the wrapped service is overloaded, returning errors immediately rather than queueing requests or allowing resource exhaustion. It implements the "load shedding" pattern: intentionally dropping requests under high load to preserve system stability and responsiveness for the requests that are accepted. The layer polls the inner service's readiness; when the inner service is not ready, new requests receive an immediate Overloaded error rather than being processed. This prevents the cascading failures, timeout amplification, and resource exhaustion that occur when systems accept more work than they can handle.
use tower::load_shed::LoadShedLayer;
use tower::{Service, ServiceBuilder};
// Example service that processes requests
struct MyService;
impl Service<String> for MyService {
    type Response = String;
    type Error = String;
    type Future = std::future::Ready<Result<String, String>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), Self::Error>> {
        // Service is always ready (simplified)
        std::task::Poll::Ready(Ok(()))
    }
    fn call(&mut self, req: String) -> Self::Future {
        std::future::ready(Ok(format!("Processed: {}", req)))
    }
}
fn main() {
    // Apply LoadShed layer to a service
    let service = ServiceBuilder::new()
        .layer(LoadShedLayer::new()) // Default: shed based on inner readiness
        .service(MyService);
    // When the underlying service is not ready, LoadShed returns an error immediately
    println!("LoadShed middleware configured");
}

LoadShedLayer wraps a service and rejects requests when it is overloaded.
use tower::load_shed::LoadShedLayer;
use tower::{Service, ServiceBuilder};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
// Service with limited capacity
struct LimitedService {
    active_requests: Arc<AtomicUsize>,
    max_concurrent: usize,
}
impl Service<String> for LimitedService {
    type Response = String;
    type Error = String;
    type Future = std::future::Ready<Result<String, String>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), Self::Error>> {
        let current = self.active_requests.load(Ordering::Relaxed);
        if current < self.max_concurrent {
            // Accept more requests
            std::task::Poll::Ready(Ok(()))
        } else {
            // Service is overloaded - not ready.
            // Note: real code must store _cx.waker() and wake it when capacity
            // frees up; returning Pending without arranging a wakeup can stall the caller.
            std::task::Poll::Pending
        }
    }
    fn call(&mut self, req: String) -> Self::Future {
        self.active_requests.fetch_add(1, Ordering::Relaxed);
        // Process request... (a real service would decrement the counter when done)
        std::future::ready(Ok(format!("Processed: {}", req)))
    }
}
fn main() {
    let active_requests = Arc::new(AtomicUsize::new(0));
    // Without load shedding:
    // - poll_ready returns Pending when overloaded
    // - Requests queue up waiting for service
    // - Memory grows, timeouts cascade
    // With load shedding:
    // - LoadShed checks poll_ready
    // - If Pending, returns error immediately
    // - Request fails fast, client can retry
    // - System stays responsive
    let service = ServiceBuilder::new()
        .layer(LoadShedLayer::new())
        .service(LimitedService {
            active_requests,
            max_concurrent: 10,
        });
}

Load shedding converts Pending readiness into immediate errors.
// Without load shedding:
// 1. High traffic arrives
// 2. Service queues requests
// 3. Memory grows with queued requests
// 4. Response times increase
// 5. Clients timeout and retry
// 6. More requests queue
// 7. System becomes unresponsive
// 8. Cascading failure
// With load shedding:
// 1. High traffic arrives
// 2. LoadShed checks capacity
// 3. If overloaded, returns error immediately
// 4. Client receives fast failure
// 5. Service stays responsive for accepted requests
// 6. System remains stable
fn main() {
    // Problem: Accepting unlimited requests
    // - Queue grows unbounded
    // - Latency increases
    // - Eventually OOM or timeout storm
    // Solution: Shed load when overloaded
    // - Reject excess requests quickly
    // - Preserve capacity for accepted requests
    // - Stable latency for successful requests
}

Load shedding prevents the "death spiral" of overloaded systems.
use tower::load_shed::LoadShedLayer;
use tower::ServiceBuilder;
use tower::limit::{ConcurrencyLimitLayer, RateLimitLayer};
use tower::timeout::TimeoutLayer;
use std::time::Duration;
fn main() {
    // Typical middleware stack with load shedding
    let service = ServiceBuilder::new()
        // Load shedding: reject immediately when the layers below are not ready
        .layer(LoadShedLayer::new())
        // Rate limiting: max requests per second
        .layer(RateLimitLayer::new(100, Duration::from_secs(1)))
        // Concurrency limit: max concurrent requests
        .layer(ConcurrencyLimitLayer::new(50))
        // Timeout: fail slow requests
        .layer(TimeoutLayer::new(Duration::from_secs(30)))
        .service(MyService);
    // Order matters (ServiceBuilder wraps top-down, so the first layer is outermost):
    // 1. Load shed (converts backpressure from the layers below into immediate errors)
    // 2. Rate limit (global throttle)
    // 3. Concurrency limit (active request cap)
    // 4. Timeout (bound request duration)
}
struct MyService;
impl tower::Service<String> for MyService {
    type Response = String;
    type Error = String;
    type Future = std::future::Ready<Result<String, String>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), Self::Error>> {
        std::task::Poll::Ready(Ok(()))
    }
    fn call(&mut self, req: String) -> Self::Future {
        std::future::ready(Ok(format!("Processed: {}", req)))
    }
}

LoadShedLayer typically sits outside the limiting layers in the middleware stack, so their backpressure is shed rather than queued.
use tower::load_shed::LoadShedLayer;
use tower::limit::ConcurrencyLimitLayer;
use tower::ServiceBuilder;
fn main() {
    // ConcurrencyLimit: Queues requests up to a limit
    // - Enforces max concurrent requests
    // - Queues excess requests (waits)
    // - Good for predictable load
    // LoadShed: Rejects requests immediately
    // - Doesn't queue, returns error
    // - Responds to service readiness
    // - Good for unpredictable overload
    // Often used together (LoadShed outermost, so a full limit is shed, not queued):
    let service = ServiceBuilder::new()
        // LoadShed: if the limit below can't accept, reject
        .layer(LoadShedLayer::new())
        // ConcurrencyLimit: allow up to 100 concurrent requests
        .layer(ConcurrencyLimitLayer::new(100))
        .service(MyService); // MyService as defined earlier
    // Flow:
    // 1. Request arrives
    // 2. LoadShed polls the concurrency limit's readiness
    // 3. If < 100 active: process
    // 4. If at the limit: return an Overloaded error immediately
}

Concurrency limits queue; load shedding rejects.
use tower::load_shed::LoadShedLayer;
use tower::{Service, ServiceBuilder, ServiceExt};
use std::future::Future;
use std::pin::Pin;
#[derive(Debug)]
enum MyError {
    Overloaded,
    ProcessingError(String),
}
impl std::fmt::Display for MyError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{:?}", self)
    }
}
// LoadShed requires the inner error to convert into tower::BoxError
impl std::error::Error for MyError {}
fn main() {
    // LoadShed boxes errors as tower::BoxError; a shed request fails with
    // tower::load_shed::error::Overloaded, and the inner service's error
    // type must implement std::error::Error
    async fn handle_request() {
        let mut service = ServiceBuilder::new()
            .layer(LoadShedLayer::new())
            .service(MyService);
        // LoadShed itself always reports ready; an overloaded inner
        // service surfaces as an Overloaded error from call()
        match service.ready().await {
            Ok(svc) => {
                match svc.call("request".to_string()).await {
                    Ok(response) => println!("Success: {}", response),
                    Err(e) if e.is::<tower::load_shed::error::Overloaded>() => {
                        // Load was shed; the client can retry
                        println!("Service overloaded, request shed")
                    }
                    Err(e) => println!("Error: {:?}", e),
                }
            }
            Err(e) => println!("Error: {:?}", e),
        }
    }
}
struct MyService;
impl Service<String> for MyService {
    type Response = String;
    type Error = MyError;
    type Future = Pin<Box<dyn Future<Output = Result<String, MyError>>>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), MyError>> {
        std::task::Poll::Ready(Ok(()))
    }
    fn call(&mut self, req: String) -> Self::Future {
        Box::pin(std::future::ready(Ok(format!("Processed: {}", req))))
    }
}

Handle load shed errors gracefully in application code.
use tower::load_shed::LoadShedLayer;
fn main() {
    // Basic load shed
    let _basic = LoadShedLayer::new();
    // LoadShedLayer::new() creates a layer that:
    // - Wraps the inner service
    // - Polls the inner service's readiness
    // - If the inner service is Pending:
    //   - the next call() fails immediately with Overloaded (sheds load)
    // - If the inner service is Ready:
    //   - forwards the request to the inner service
    // The layer itself doesn't have configuration
    // Configuration comes from the inner service's readiness
    // Common pattern: combine with other limiters
    // to control when poll_ready returns Pending
}

LoadShed configuration is implicit through service readiness.
use tower::load_shed::LoadShedLayer;
use tower::limit::ConcurrencyLimitLayer;
use tower::timeout::TimeoutLayer;
use tower::ServiceBuilder;
use std::time::Duration;
// Simulating a web service handler
struct WebService;
impl tower::Service<HttpRequest> for WebService {
    type Response = HttpResponse;
    type Error = Error;
    type Future = std::future::Ready<Result<HttpResponse, Error>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), Error>> {
        // In reality, might check database connections, thread pool, etc.
        std::task::Poll::Ready(Ok(()))
    }
    fn call(&mut self, _req: HttpRequest) -> Self::Future {
        std::future::ready(Ok(HttpResponse { status: 200, body: "OK".to_string() }))
    }
}
struct HttpRequest {
    path: String,
}
struct HttpResponse {
    status: u16,
    body: String,
}
#[derive(Debug)]
enum Error {
    Overloaded,
    Timeout,
}
impl std::fmt::Display for Error {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{:?}", self)
    }
}
// LoadShed and Timeout box errors, so Error must implement std::error::Error
impl std::error::Error for Error {}
fn main() {
    // Production-like middleware stack (LoadShed outermost)
    let service = ServiceBuilder::new()
        // Shed load when the layers below are not ready
        .layer(LoadShedLayer::new())
        // Maximum concurrent requests
        .layer(ConcurrencyLimitLayer::new(100))
        // Timeout for request processing
        .layer(TimeoutLayer::new(Duration::from_secs(30)))
        .service(WebService);
    // This protects against:
    // - Resource exhaustion: concurrency limit bounds active work
    // - Cascading failures: load shed fails fast
    // - Slow requests: timeout bounds duration
}

Load shedding is part of a comprehensive protection strategy.
// USE LOAD SHEDDING WHEN:
// 1. Service has variable processing times
// 2. Downstream dependencies can become unavailable
// 3. You need to maintain responsiveness under load
// 4. Clients can handle rejection (retry logic)
// 5. Queue growth could cause OOM
// DON'T USE LOAD SHEDDING WHEN:
// 1. All requests must be processed (use queue instead)
// 2. No retry/client handling for errors
// 3. Load is predictable and bounded
// 4. Better to queue and process eventually
fn main() {
    // Example: API Gateway
    // - Variable traffic patterns
    // - Backend services may slow down
    // - Clients can retry with backoff
    // -> Use load shedding
    // Example: Background Job Processor
    // - Jobs must eventually complete
    // - Can tolerate queue growth
    // - No client waiting for response
    // -> Maybe use queue instead of shedding
}

Load shedding is appropriate when fail-fast is better than queue-and-wait.
// Load Shedding: Reject requests when overloaded
// Pros: Immediate feedback, preserves stability
// Cons: Lost requests, requires client retry
// Rate Limiting: Reject requests above threshold
// Pros: Predictable throughput, prevents spikes
// Cons: Doesn't adapt to service capacity
// Circuit Breaker: Reject when failure rate high
// Pros: Handles downstream failures
// Cons: Reactive, not proactive
// Bulkhead: Isolate resources per tenant
// Pros: Prevents noisy neighbor
// Cons: More complex resource management
// Queue: Buffer requests for later
// Pros: Smooths traffic, no lost requests
// Cons: Latency grows, memory risk
use tower::load_shed::LoadShedLayer;
use tower::limit::RateLimitLayer;
use tower::ServiceBuilder;
use std::time::Duration;
fn main() {
    // Often used together (LoadShed outermost, so rate-limited requests are shed):
    let service = ServiceBuilder::new()
        // Load shed: reject when the rate limiter below is saturated
        .layer(LoadShedLayer::new())
        // Rate limit: prevent spikes
        .layer(RateLimitLayer::new(1000, Duration::from_secs(1)))
        .service(MyService); // MyService as defined earlier
    // Rate limit catches known overload
    // Load shed catches unexpected capacity issues
}

Load shedding complements other resilience patterns.
use tower::load_shed::LoadShedLayer;
use tower::{Service, ServiceBuilder, ServiceExt};
use std::future::Future;
use std::pin::Pin;
#[derive(Debug)]
enum ServiceError {
    Overloaded,
    Internal(String),
}
impl std::fmt::Display for ServiceError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{:?}", self)
    }
}
// LoadShed requires the inner error to convert into tower::BoxError
impl std::error::Error for ServiceError {}
struct GracefulService;
impl Service<String> for GracefulService {
    type Response = String;
    type Error = ServiceError;
    type Future = Pin<Box<dyn Future<Output = Result<String, ServiceError>>>>;
    fn poll_ready(&mut self, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Result<(), ServiceError>> {
        std::task::Poll::Ready(Ok(()))
    }
    fn call(&mut self, req: String) -> Self::Future {
        Box::pin(std::future::ready(Ok(format!("Processed: {}", req))))
    }
}
async fn handle_with_fallback(req: String) -> String {
    let mut service = ServiceBuilder::new()
        .layer(LoadShedLayer::new())
        .service(GracefulService);
    match service.ready().await {
        Ok(svc) => {
            match svc.call(req).await {
                Ok(response) => response,
                // LoadShed boxes errors; downcast to detect a shed request
                Err(e) if e.is::<tower::load_shed::error::Overloaded>() => {
                    // Return degraded response instead of an error
                    "Service temporarily degraded, please retry".to_string()
                }
                Err(e) => format!("Internal error: {}", e),
            }
        }
        Err(_) => {
            // Inner service failed while becoming ready; return fallback
            "Service busy, please retry later".to_string()
        }
    }
}
fn main() {
    // Clients can handle overload gracefully:
    // 1. Return cached/stale data
    // 2. Return simplified response
    // 3. Return "try again later" message
    // 4. Client implements exponential backoff
}

Graceful degradation provides meaningful responses during overload.
Core purpose: LoadShedLayer protects services from overload by rejecting requests when the service cannot handle them, returning immediate errors instead of queuing.
How it works:
- LoadShed polls the inner service's readiness before processing
- Inner service Pending: the request fails immediately with an Overloaded error
- Inner service Ready: the request is forwarded to the inner service

Protection mechanisms: prevents unbounded queues, timeout amplification, and the memory growth that comes from accepting more work than the system can finish.

When to use: variable or spiky load, clients that can retry, and systems where queue growth risks OOM or cascading timeouts.

Common pattern:
ServiceBuilder::new()
    .layer(LoadShedLayer::new())
    .layer(ConcurrencyLimitLayer::new(max))
    .service(inner)

Key insight: Load shedding is about making explicit trade-offs: rejecting some requests to preserve system stability and responsiveness for others. It's better to fail fast and let clients retry than to accept requests that cannot be processed in a reasonable time. LoadShedLayer integrates this pattern into Tower's middleware ecosystem, working alongside rate limits, concurrency limits, and timeouts to create robust, self-protecting services.