How does uuid::Uuid::new_v5 generate deterministic UUIDs using namespaces compared to new_v4 random generation?
new_v5 generates deterministic UUIDs by hashing a namespace UUID combined with a name using SHA-1, producing the same UUID every time for identical inputs, while new_v4 generates random UUIDs using cryptographically secure random bytes, producing a different UUID on every call even with the same logical inputs. The key distinction is reproducibility: new_v5 is designed for scenarios where you need to consistently generate the same UUID for a given identifier (like generating a UUID for a URL or email address), while new_v4 is for scenarios where uniqueness and unpredictability are paramount (like session identifiers or primary keys).
Basic new_v4 Random Generation
use uuid::Uuid;
fn new_v4_example() {
// new_v4 generates random UUIDs using a random number generator
let uuid1 = Uuid::new_v4();
let uuid2 = Uuid::new_v4();
println!("UUID 1: {}", uuid1);
println!("UUID 2: {}", uuid2);
// Each call produces a different UUID
assert_ne!(uuid1, uuid2);
// new_v4 uses randomness from the OS (getrandom, /dev/urandom, etc.)
// Version 4 UUID: xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx
// where 4 indicates version, y is 8, 9, a, or b
// No inputs required - just randomness
let uuids: Vec<Uuid> = (0..5).map(|_| Uuid::new_v4()).collect();
for uuid in &uuids {
println!("Random UUID: {}", uuid);
}
// All UUIDs are different
}
new_v4 uses randomness to create unique identifiers with no reproducibility guarantee.
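The layout comment in the example above (a fixed `4`, then a digit that is 8, 9, a, or b) can be illustrated with a small std-only sketch. The helper name `stamp_v4` is hypothetical and the 16 input bytes are supplied by the caller; this shows the bit layout only and is not the uuid crate's implementation.

```rust
/// Hypothetical helper: overwrite the 6 fixed bits of a version 4 UUID.
/// The caller supplies the 16 bytes (in real use they come from a secure RNG).
fn stamp_v4(mut bytes: [u8; 16]) -> [u8; 16] {
    bytes[6] = (bytes[6] & 0x0f) | 0x40; // high nibble of byte 6 = version 4
    bytes[8] = (bytes[8] & 0x3f) | 0x80; // top two bits of byte 8 = variant 10
    bytes
}

fn main() {
    // Whatever byte 8 held before, its high hex digit ends up as 8, 9, a, or b.
    for b in 0..=255u8 {
        let mut input = [0u8; 16];
        input[8] = b;
        let y = stamp_v4(input)[8] >> 4;
        assert!((0x8..=0xb).contains(&y));
    }
    // 128 bits minus 4 version bits and 2 variant bits leaves 122 free bits.
    println!("free bits: {}", 128 - 4 - 2);
}
```

Only bytes 6 and 8 are touched, which is why a version 4 UUID carries 122 bits of randomness.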
Basic new_v5 Deterministic Generation
use uuid::Uuid;
fn new_v5_example() {
// new_v5 requires a namespace and a name
// The same namespace + name always produces the same UUID
// Standard namespaces are associated constants on Uuid (no separate import);
// new_v5 borrows the namespace and requires the crate's "v5" feature
// Generate UUID for a DNS name
let dns_uuid = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
println!("DNS UUID: {}", dns_uuid);
// Same inputs -> same UUID
let dns_uuid_again = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
assert_eq!(dns_uuid, dns_uuid_again);
// Different name -> different UUID
let other_dns_uuid = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"other.com");
assert_ne!(dns_uuid, other_dns_uuid);
// Different namespace -> different UUID (even with same name)
let url_uuid = Uuid::new_v5(&Uuid::NAMESPACE_URL, b"example.com");
assert_ne!(dns_uuid, url_uuid);
}
new_v5 produces identical UUIDs for identical namespace and name inputs.
UUID Versions Explained
use uuid::Uuid;
fn version_comparison() {
// Version 4: Random
let v4 = Uuid::new_v4();
println!("V4: {}", v4);
println!("Version: {:?}", v4.get_version());
// Version: Some(Random)
// Version 5: SHA-1 Hash
// Uuid::NAMESPACE_DNS is an associated constant; new_v5 borrows the namespace
let v5 = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
println!("V5: {}", v5);
println!("Version: {:?}", v5.get_version());
// Version: Some(Sha1)
// Version differences:
// V4: 122 bits of randomness, 6 bits fixed for version/variant
// V5: SHA-1 hash (160 bits) truncated to 128 bits, 6 bits fixed
// The version number is embedded in the UUID:
// xxxxxxxx-xxxx-Mxxx-Nxxx-xxxxxxxxxxxx
// M = version digit (4 for v4, 5 for v5)
// N = variant digit (8, 9, a, or b)
}
Version numbers are embedded in UUIDs, distinguishing V4 (random) from V5 (SHA-1 hash).
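As a rough illustration of where those digits sit, the sketch below reads the version and variant digits straight out of a hyphenated UUID string using only the standard library. The function name `version_and_variant` is hypothetical, and the check is deliberately shallow (it does not validate hyphens or the remaining hex digits).

```rust
/// Hypothetical helper: return (version digit, variant digit) from a
/// hyphenated UUID string, or None if the string has the wrong shape.
fn version_and_variant(uuid: &str) -> Option<(u32, u32)> {
    // The hyphenated form is exactly 36 ASCII characters.
    if uuid.len() != 36 || !uuid.is_ascii() {
        return None;
    }
    let bytes = uuid.as_bytes();
    // Index 14 is the first digit of the third group (version).
    let version = (bytes[14] as char).to_digit(16)?;
    // Index 19 is the first digit of the fourth group (variant).
    let variant = (bytes[19] as char).to_digit(16)?;
    Some((version, variant))
}

fn main() {
    let (version, variant) =
        version_and_variant("f47ac10b-58cc-4372-a567-0e02b2c3d479").unwrap();
    println!("version = {}, variant digit = {:x}", version, variant);
}
```

With the uuid crate itself, `get_version()` is the idiomatic way to obtain the same information; the sketch only shows what that call inspects.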
Standard Namespaces
use uuid::Uuid;
fn standard_namespaces() {
// RFC 4122 defines standard namespaces for V5 UUIDs
// These are well-known UUIDs, exposed as associated constants on Uuid
// DNS namespace - for domain names
let dns_uuid = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
println!("DNS namespace UUID: {}", dns_uuid);
// URL namespace - for URLs
let url_uuid = Uuid::new_v5(&Uuid::NAMESPACE_URL, b"https://example.com/path");
println!("URL namespace UUID: {}", url_uuid);
// OID namespace - for object identifiers
let oid_uuid = Uuid::new_v5(&Uuid::NAMESPACE_OID, b"1.2.3.4.5");
println!("OID namespace UUID: {}", oid_uuid);
// X500 namespace - for X.500 distinguished names
let x500_uuid = Uuid::new_v5(&Uuid::NAMESPACE_X500, b"cn=John Doe,ou=People");
println!("X500 namespace UUID: {}", x500_uuid);
// The namespace ensures:
// - Same name in different namespaces -> different UUIDs
// - "example.com" as DNS vs URL -> different UUIDs
// - Prevents accidental collisions across different contexts
// Example: Same string, different namespaces
let dns = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
let url = Uuid::new_v5(&Uuid::NAMESPACE_URL, b"example.com");
assert_ne!(dns, url);
}
Standard namespaces provide context for name resolution, preventing collisions across different identifier types.
Custom Namespaces
use uuid::Uuid;
fn custom_namespace() {
// You can use any UUID as a namespace
// This is useful for application-specific identifiers
// Create a namespace UUID for your application
let app_namespace = Uuid::parse_str("f47ac10b-58cc-4372-a567-0e02b2c3d479").unwrap();
// Generate UUIDs within your custom namespace
let user1_uuid = Uuid::new_v5(&app_namespace, b"user@example.com");
let user2_uuid = Uuid::new_v5(&app_namespace, b"user2@example.com");
println!("User 1 UUID: {}", user1_uuid);
println!("User 2 UUID: {}", user2_uuid);
// Same email in same namespace -> same UUID
let user1_again = Uuid::new_v5(&app_namespace, b"user@example.com");
assert_eq!(user1_uuid, user1_again);
// Useful for:
// - Generating reproducible IDs from external identifiers
// - Creating stable IDs without database lookup
// - Generating IDs across distributed systems
// Example: Project-specific namespace
// Generate once (e.g. with new_v4), then hard-code or store the value;
// a fresh namespace on every run would make derived UUIDs non-reproducible
let project_namespace = Uuid::new_v4();
println!("Project namespace: {}", project_namespace);
// All resources in this project get reproducible UUIDs
let resource_id = Uuid::new_v5(&project_namespace, b"resource-name");
println!("Resource UUID: {}", resource_id);
}
Custom namespaces allow application-specific deterministic UUID generation.
Generation Algorithm
use uuid::Uuid;
fn v5_algorithm() {
// V5 UUID generation process (simplified):
// 1. Start with namespace UUID (16 bytes)
// 2. Concatenate with name (as bytes)
// 3. Compute SHA-1 hash (160 bits = 20 bytes)
// 4. Take first 16 bytes of hash
// 5. Set version bits (high nibble of byte 6, set to 5)
// 6. Set variant bits (top two bits of byte 8, set to 10)
// Result: 128-bit V5 UUID
// Why SHA-1?
// - Cryptographic hash provides good distribution
// - Collision resistance (though SHA-1 has known weaknesses)
// - Deterministic: same input -> same output
// - RFC 4122 specifies SHA-1 for V5
// V4 UUID generation process:
// 1. Generate 16 random bytes
// 2. Set version bits (high nibble of byte 6, set to 4)
// 3. Set variant bits (top two bits of byte 8, set to 10)
// Result: 128-bit V4 UUID
// V5 provides ~122 bits of hash output (after version/variant)
// V4 provides 122 bits of randomness
// Both have similar collision resistance in practice
}
V5 uses SHA-1 hashing; V4 uses random bytes. Both set version and variant bits in specific positions.
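The V5 steps listed above can be sketched without any external crate. The following is a minimal illustration, assuming a hand-written SHA-1 (`sha1`) and hypothetical helper names (`uuid_v5`, `to_hyphenated`); it is not the uuid crate's implementation, and the hand-rolled hash is for exposition only.

```rust
/// Bytes of the RFC 4122 DNS namespace, 6ba7b810-9dad-11d1-80b4-00c04fd430c8.
const NAMESPACE_DNS: [u8; 16] = [
    0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
    0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
];

/// Minimal SHA-1 for illustration only (not optimized, not for production).
fn sha1(data: &[u8]) -> [u8; 20] {
    let mut h: [u32; 5] = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0];
    // Padding: 0x80, zeros up to 56 mod 64, then the bit length as big-endian u64.
    let mut msg = data.to_vec();
    msg.push(0x80);
    while msg.len() % 64 != 56 {
        msg.push(0);
    }
    msg.extend_from_slice(&((data.len() as u64) * 8).to_be_bytes());

    for block in msg.chunks(64) {
        let mut w = [0u32; 80];
        for i in 0..16 {
            w[i] = u32::from_be_bytes([block[4 * i], block[4 * i + 1], block[4 * i + 2], block[4 * i + 3]]);
        }
        for i in 16..80 {
            w[i] = (w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16]).rotate_left(1);
        }
        let (mut a, mut b, mut c, mut d, mut e) = (h[0], h[1], h[2], h[3], h[4]);
        for i in 0..80 {
            let (f, k) = match i {
                0..=19 => ((b & c) | (!b & d), 0x5A827999u32),
                20..=39 => (b ^ c ^ d, 0x6ED9EBA1),
                40..=59 => ((b & c) | (b & d) | (c & d), 0x8F1BBCDC),
                _ => (b ^ c ^ d, 0xCA62C1D6),
            };
            let temp = a
                .rotate_left(5)
                .wrapping_add(f)
                .wrapping_add(e)
                .wrapping_add(k)
                .wrapping_add(w[i]);
            e = d;
            d = c;
            c = b.rotate_left(30);
            b = a;
            a = temp;
        }
        h[0] = h[0].wrapping_add(a);
        h[1] = h[1].wrapping_add(b);
        h[2] = h[2].wrapping_add(c);
        h[3] = h[3].wrapping_add(d);
        h[4] = h[4].wrapping_add(e);
    }
    let mut out = [0u8; 20];
    for (i, word) in h.iter().enumerate() {
        out[4 * i..4 * i + 4].copy_from_slice(&word.to_be_bytes());
    }
    out
}

/// Steps 1-6: hash namespace || name, keep 16 bytes, stamp version and variant.
fn uuid_v5(namespace: &[u8; 16], name: &[u8]) -> [u8; 16] {
    let mut input = namespace.to_vec();
    input.extend_from_slice(name);
    let digest = sha1(&input);
    let mut bytes = [0u8; 16];
    bytes.copy_from_slice(&digest[..16]); // truncate 160 bits to 128
    bytes[6] = (bytes[6] & 0x0f) | 0x50; // version 5
    bytes[8] = (bytes[8] & 0x3f) | 0x80; // variant 10
    bytes
}

/// Format 16 bytes in the usual 8-4-4-4-12 hexadecimal layout.
fn to_hyphenated(b: &[u8; 16]) -> String {
    let hex: String = b.iter().map(|x| format!("{:02x}", x)).collect();
    format!("{}-{}-{}-{}-{}", &hex[0..8], &hex[8..12], &hex[12..16], &hex[16..20], &hex[20..32])
}

fn main() {
    let id = uuid_v5(&NAMESPACE_DNS, b"example.com");
    assert_eq!(id, uuid_v5(&NAMESPACE_DNS, b"example.com")); // deterministic
    assert_eq!(id[6] >> 4, 5); // version nibble
    println!("{}", to_hyphenated(&id));
}
```

In application code, `Uuid::new_v5` should be preferred; the sketch exists only to show that nothing beyond hashing, truncation, and two bit masks is involved.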
Collision Behavior
use uuid::Uuid;
use std::collections::HashSet;
fn collision_analysis() {
// V4 collision probability:
// - 122 bits of randomness
// - Probability of collision depends on number of UUIDs generated
// - For n UUIDs: P(collision) ≈ n² / 2^123
// - Extremely unlikely for reasonable numbers
let v4_uuids: Vec<Uuid> = (0..1000000).map(|_| Uuid::new_v4()).collect();
let v4_set: HashSet<Uuid> = v4_uuids.iter().copied().collect();
println!("V4: Generated {}, unique {}", v4_uuids.len(), v4_set.len());
// All unique (with overwhelming probability)
// V5 collision behavior:
// - Deterministic: same input -> same UUID
// - Different inputs could theoretically collide (SHA-1 collision)
// - Accidental collisions are extremely unlikely, though deliberately
//   crafted SHA-1 collisions have been demonstrated
// - Namespace + name combination acts as unique key
// Generate V5 UUIDs from different inputs
let v5_uuids: Vec<Uuid> = (0..1000000)
.map(|i| Uuid::new_v5(&Uuid::NAMESPACE_DNS, format!("name{}.com", i).as_bytes()))
.collect();
let v5_set: HashSet<Uuid> = v5_uuids.iter().copied().collect();
println!("V5: Generated {}, unique {}", v5_uuids.len(), v5_set.len());
// All unique (different names -> different UUIDs, barring a hash collision)
// But: same name -> same UUID (deterministic)
let uuid1 = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
let uuid2 = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"example.com");
assert_eq!(uuid1, uuid2); // Guaranteed equal
}
V4 UUIDs are unique with high probability; V5 UUIDs are deterministic, and distinct inputs yield distinct UUIDs except in the event of a hash collision.
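To put a number on "extremely unlikely", the birthday-bound approximation quoted in the comments above can be evaluated directly. This is a small std-only sketch; the function name `v4_collision_probability` is hypothetical, and the approximation is only meaningful while the result stays far below 1.

```rust
/// Birthday-bound approximation for n random version 4 UUIDs:
/// P(collision) ≈ n^2 / 2^123 (122 random bits per UUID).
fn v4_collision_probability(n: f64) -> f64 {
    n * n / 2f64.powi(123)
}

fn main() {
    for n in [1e6, 1e9, 1e12] {
        println!("n = {:e}: P(collision) ≈ {:e}", n, v4_collision_probability(n));
    }
}
```

For the one million UUIDs generated in the example above, the estimate is on the order of 10^-25, which is why the uniqueness check is expected to pass.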
Use Cases for V4
use uuid::Uuid;
use std::collections::HashMap;
fn v4_use_cases() {
// V4 is ideal when you need:
// - Unique identifiers
// - Unpredictability
// - No natural key to derive from
// Primary keys in databases
#[derive(Debug)]
struct User {
id: Uuid,
name: String,
email: String,
}
let user = User {
id: Uuid::new_v4(), // Unique, unpredictable
name: "Alice".to_string(),
email: "alice@example.com".to_string(),
};
println!("User ID: {}", user.id);
// Session identifiers
struct Session {
id: Uuid,
user_id: Uuid,
created_at: std::time::Instant,
}
let session = Session {
id: Uuid::new_v4(), // Unpredictable session ID
user_id: user.id,
created_at: std::time::Instant::now(),
};
println!("Session ID: {}", session.id);
// Request IDs for tracing
fn handle_request() {
let request_id = Uuid::new_v4();
println!("[{}] Processing request", request_id);
// Use request_id throughout request lifecycle
}
// Transaction IDs
struct Transaction {
id: Uuid,
amount: f64,
timestamp: std::time::Instant,
}
let tx = Transaction {
id: Uuid::new_v4(), // Unique transaction ID
amount: 100.0,
timestamp: std::time::Instant::now(),
};
// Key characteristics:
// - Uniqueness: Collision probability is negligible
// - Unpredictability: Can't guess other UUIDs
// - No relationship to content: UUID says nothing about data
}
V4 is ideal for primary keys, session IDs, and any scenario requiring unique, unpredictable identifiers.
Use Cases for V5
use uuid::Uuid;
fn v5_use_cases() {
// V5 is ideal when you need:
// - Deterministic IDs from existing identifiers
// - Reproducible UUIDs without storage
// - Cross-system identifier consistency
// URL identifiers
fn url_to_uuid(url: &str) -> Uuid {
Uuid::new_v5(&Uuid::NAMESPACE_URL, url.as_bytes())
}
let article_url = "https://example.com/articles/123";
let article_uuid = url_to_uuid(article_url);
println!("Article UUID: {}", article_uuid);
// Same URL always produces same UUID
assert_eq!(url_to_uuid(article_url), article_uuid);
// Email-based user IDs
fn email_to_uuid(email: &str) -> Uuid {
// Use a custom namespace for your application (example value; avoid the
// nil UUID, which gives no separation from anyone else who uses it)
let app_namespace = Uuid::parse_str("f47ac10b-58cc-4372-a567-0e02b2c3d479").unwrap();
Uuid::new_v5(&app_namespace, email.as_bytes())
}
// Multiple systems can generate same UUID for same email
// without coordination or lookup
let email = "user@example.com";
let user_id_system_a = email_to_uuid(email);
let user_id_system_b = email_to_uuid(email);
assert_eq!(user_id_system_a, user_id_system_b);
// Content-addressed storage
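fn content_uuid(namespace: Uuid, content: &str) -> Uuid {
Uuid::new_v5(&namespace, content.as_bytes())
}
// Application namespace: generated here for brevity; in practice use a
// fixed, stored value so the same content maps to the same UUID across runs
let namespace = Uuid::new_v4();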
let doc_content = "Document content here";
let doc_uuid = content_uuid(namespace, doc_content);
// Same content -> same UUID
// Can verify content identity by regenerating UUID
// Cache keys
use std::collections::HashMap;
let mut cache: HashMap<Uuid, String> = HashMap::new();
fn cache_key(resource_type: &str, resource_id: &str) -> Uuid {
let namespace = Uuid::parse_str("11111111-1111-1111-1111-111111111111").unwrap();
let key = format!("{}:{}", resource_type, resource_id);
Uuid::new_v5(&namespace, key.as_bytes())
}
cache.insert(cache_key("user", "123"), "cached data".to_string());
// Later, can retrieve without storing the key
let retrieved = cache.get(&cache_key("user", "123"));
}
V5 excels when you need reproducible IDs derived from existing data without storage or coordination.
Determinism Across Systems
use uuid::Uuid;
fn cross_system_determinism() {
// V5 enables distributed systems to generate same UUID
// without coordination
// Scenario: Microservices need consistent user IDs from emails
// (NAMESPACE_DNS keeps the example short; a dedicated application
// namespace is a better fit for names that are not DNS names)
// Service A
fn generate_user_id_a(email: &str) -> Uuid {
Uuid::new_v5(&Uuid::NAMESPACE_DNS, email.as_bytes())
}
// Service B (completely independent)
fn generate_user_id_b(email: &str) -> Uuid {
Uuid::new_v5(&Uuid::NAMESPACE_DNS, email.as_bytes())
}
let email = "user@example.com";
let id_a = generate_user_id_a(email);
let id_b = generate_user_id_b(email);
// Both services generate the same UUID
assert_eq!(id_a, id_b);
println!("Service A ID: {}", id_a);
println!("Service B ID: {}", id_b);
// This enables:
// - Deduplication without lookup
// - ID generation before database insert
// - Consistent IDs across databases
// - No central ID authority needed
// Example: Event sourcing
struct Event {
id: Uuid,
event_type: String,
aggregate_id: Uuid,
}
// Events get deterministic IDs based on their aggregate and sequence number
fn event_uuid(aggregate_id: Uuid, sequence: u64) -> Uuid {
let key = format!("{}:{}", aggregate_id, sequence);
Uuid::new_v5(&Uuid::NAMESPACE_DNS, key.as_bytes())
}
// Same aggregate + sequence -> same event ID
// Enables deduplication in distributed systems
}
V5 enables distributed ID generation without coordination when using the same namespace and inputs.
Comparing V4 and V5
use uuid::Uuid;
fn comparison_table() {
// Property          V4 (Random)               V5 (SHA-1)
// Deterministic?    No                        Yes (same input)
// Unpredictable?    Yes                       No (can derive from input)
// Input required?   None                      Namespace + name
// Collision risk?   Random collision          Hash collision
// Use case          Primary keys, sessions    Derived IDs, URLs
// Storage needed?   Must store UUID           Can regenerate from input
// Coordination?     None beyond a good RNG    Shared namespace and name encoding
// Speed comparison (rough; build in release mode for meaningful numbers)
let start = std::time::Instant::now();
for _ in 0..10000 {
let _ = Uuid::new_v4();
}
println!("V4: 10000 UUIDs in {:?}", start.elapsed());
let start = std::time::Instant::now();
for i in 0..10000 {
let _ = Uuid::new_v5(&Uuid::NAMESPACE_DNS, format!("name{}", i).as_bytes());
}
println!("V5: 10000 UUIDs in {:?}", start.elapsed());
// V4 cost is dominated by fetching random bytes
// V5 cost is dominated by SHA-1 (plus the format! allocation in this loop)
// Which is faster depends on the RNG backend and name length
// Both are fast enough for most use cases
}
V4 is unpredictable and needs no input; V5 is deterministic and enables ID derivation without storage.
Security Considerations
use uuid::Uuid;
fn security_considerations() {
// V4 security:
// - Unpredictable: UUID reveals nothing about data
// - Reasonable for session tokens and API keys when the random bytes
//   come from a cryptographically secure source
// - Can't derive other UUIDs from one
// V5 security:
// - Derivable: anyone who knows the namespace can test guesses of the name
// - If namespace is known, name can be brute-forced
// - Don't use for secrets, passwords, tokens
// BAD: Using V5 for secrets
// let password_uuid = Uuid::new_v5(&namespace, password.as_bytes());
// If attacker knows namespace and UUID, they can verify password guesses
// GOOD: Using V5 for public identifiers
let user_id = Uuid::new_v5(&Uuid::NAMESPACE_DNS, b"user@example.com");
// This is fine - email is not secret, just identifier
// GOOD: Using V4 for session tokens
let session_token = Uuid::new_v4();
// Unpredictable, can't be guessed
// Privacy consideration:
// V5 UUIDs encode information about the input
// Consider privacy implications before using
// Example: V5 UUID for email reveals email was used
// If namespace is known, email can potentially be recovered
// (Through brute force, not reversal)
// Summary:
// - V4: Safe for secrets, unpredictable
// - V5: Public identifiers only, not for secrets
}
V4 is safe for secrets; V5 should only be used for public identifiers.
Practical Patterns
use uuid::Uuid;
fn practical_patterns() {
// Pattern 1: Entity ID from unique attribute
struct User {
id: Uuid, // V5 from email
email: String,
name: String,
}
impl User {
fn new(email: String, name: String) -> Self {
User {
id: Uuid::new_v5(&Uuid::NAMESPACE_DNS, email.as_bytes()),
email,
name,
}
}
}
// Pattern 2: Mixed approach (V5 for ID, V4 for tokens)
struct Session {
user_id: Uuid, // V5 - from email
session_id: Uuid, // V4 - random
}
// Pattern 3: Hierarchical namespaces
let org_namespace = Uuid::new_v4(); // Generate once, then store and reuse
let project_namespace = Uuid::new_v5(&org_namespace, b"project-1");
let resource_id = Uuid::new_v5(&project_namespace, b"resource-1");
// Pattern 4: Content-addressed entities
struct Document {
id: Uuid, // V5 from content hash
content: String,
}
impl Document {
fn new(content: String) -> Self {
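// parse_str needs a valid UUID string; this fixed value is an example namespace
let namespace = Uuid::parse_str("f47ac10b-58cc-4372-a567-0e02b2c3d479").unwrap();
Document {
id: Uuid::new_v5(&namespace, content.as_bytes()),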
content,
}
}
}
// Same content -> same ID -> deduplication
let doc1 = Document::new("Hello".to_string());
let doc2 = Document::new("Hello".to_string());
assert_eq!(doc1.id, doc2.id);
}
Choose V4 or V5 based on whether you need determinism or uniqueness guarantees.
Synthesis
Core difference:
// V4: Random, unpredictable, unique
let uuid = Uuid::new_v4();
// Each call produces a different UUID
// No inputs required
// V5: Deterministic, reproducible, derived
let uuid = Uuid::new_v5(&namespace, name);
// Same namespace + name -> same UUID
// Requires namespace and name inputs
Generation mechanisms:
// V4: Random bytes + version/variant bits
// - 122 bits of randomness
// - Set version nibble to 4
// - Set variant bits
// V5: SHA-1 hash + version/variant bits
// - Hash: SHA-1(namespace || name)
// - Truncate to 128 bits
// - Set version nibble to 5
// - Set variant bits
When to use V4:
- Primary keys in databases
- Session identifiers
- Request/correlation IDs
- Transaction IDs
- Any scenario requiring uniqueness and unpredictability
When to use V5:
- URLs and URIs as identifiers
- Email-based user IDs
- Content-addressed storage
- Distributed ID generation
- Cache keys
- Any scenario needing reproducible IDs from existing identifiers
Key insight: new_v5 provides deterministic UUIDs through SHA-1 hashing of a namespace combined with a name, enabling reproducible identifier generation across systems and time: pass b"example.com" with &Uuid::NAMESPACE_DNS and you'll always get the same UUID. This is fundamentally different from new_v4, which generates a new random UUID on every call with no relationship to any input. V5 is ideal when you have a natural identifier (URL, email, DNS name) and want a stable UUID derived from it without storage; V4 is ideal when you need a unique identifier with no predictable relationship to anything. V5 should never be used for secrets because anyone who knows the namespace can confirm guesses of the input; V4 is appropriate for session tokens and other sensitive identifiers. The namespace in V5 serves to separate different identifier domains: the same string in different namespaces produces different UUIDs, preventing accidental collisions between URLs, DNS names, and custom application identifiers.
