How does zip::ZipArchive::extract handle file permissions and symbolic links during extraction?

zip::ZipArchive::extract restores Unix file permissions on Unix-like systems when the archive records them, and it can recreate symbolic links, but the exact behavior depends on how the archive was created, what the platform supports, and the crate's security-conscious defaults, which may differ between crate versions. Where the defaults are not enough, the zip crate exposes each entry's name, Unix mode, and content, so permission and symlink handling can be controlled through manual extraction.

Basic Extraction Behavior

use zip::ZipArchive;
use std::fs::File;
 
fn basic_extraction() -> zip::result::ZipResult<()> {
    let file = File::open("archive.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    // extract extracts all files to the specified directory
    archive.extract("output_directory")?;
    
    Ok(())
}

extract unpacks every archive entry into the target directory: regular files, directories, and, where the crate version and platform support it, symlinks.

File Permission Handling

use zip::ZipArchive;
use std::fs::File;
 
fn permission_handling() -> zip::result::ZipResult<()> {
    // ZIP files can store Unix permissions in the "external attributes" field
    // When extracting on Unix, these permissions can be restored
    
    let file = File::open("archive_with_permissions.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    // The zip format stores permissions differently depending on:
    // 1. The "version made by" field (Unix vs DOS/Windows)
    // 2. The "external file attributes" field
    
    // For Unix-created archives:
    // - Permissions are stored in the high 16 bits of external attributes
    // - Extracted files get these permissions restored
    
    // For Windows-created archives:
    // - No Unix permissions available
    // - Default permissions apply (usually 0o644 for files, 0o755 for directories)
    
    archive.extract("output")?;
    
    Ok(())
}

Unix permissions are stored in the ZIP format's external attributes field when created on Unix systems.
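To make the storage layout concrete, here is a minimal sketch of how a Unix mode could be recovered from the two raw header fields. The function name `unix_mode_from_attrs` and the constant `HOST_UNIX` are hypothetical helpers for illustration, not part of the zip crate, and the crate's own logic may differ (for example by deriving a mode from MS-DOS attributes).

```rust
/// Host system code for Unix in the upper byte of "version made by".
/// (Hypothetical constant name, used only in this sketch.)
const HOST_UNIX: u8 = 3;

/// Returns the Unix mode stored in the high 16 bits of `external_attrs`,
/// or `None` when the entry was not written by a Unix host or carries no mode.
/// This is a simplified illustration, not the zip crate's implementation.
fn unix_mode_from_attrs(version_made_by: u16, external_attrs: u32) -> Option<u32> {
    // The upper byte of "version made by" identifies the host system.
    let host = (version_made_by >> 8) as u8;
    if host != HOST_UNIX {
        return None;
    }
    // The Unix mode occupies the high 16 bits of the external attributes.
    let mode = external_attrs >> 16;
    if mode == 0 {
        None
    } else {
        Some(mode)
    }
}
```

In practice the crate's `unix_mode()` accessor on an entry does this work, so a helper like this is only needed when parsing headers by hand.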

Permission Bits in ZIP Format

use zip::ZipArchive;
use std::fs::File;
 
fn permission_bits() -> zip::result::ZipResult<()> {
    // Unix mode, i.e. the high 16 bits of the external attributes
    // (external_attributes >> 16) for Unix-created entries:
    // Bits 0-8:   Standard Unix permissions (rwxrwxrwx)
    // Bits 9-11:  Setuid, setgid, sticky bits
    // Bits 12-15: File type (regular file, directory, symlink, etc.)
    // The low byte of the external attributes holds MS-DOS attribute flags
    
    let file = File::open("archive.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    // When reading entries:
    for i in 0..archive.len() {
        let file = archive.by_index(i)?;
        
        // unix_mode() returns Some(mode) when the entry records Unix mode bits
        match file.unix_mode() {
            Some(mode) => println!("File: {} (mode {:o})", file.name(), mode),
            None => println!("File: {} (no Unix mode recorded)", file.name()),
        }
        
        // The actual permission restoration depends on:
        // - Platform (Unix vs Windows)
        // - Archive origin (Unix-created vs Windows-created)
        // - Library implementation details
    }
    
    Ok(())
}

The ZIP format encodes Unix permissions in specific bits of the external attributes field.
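As an illustration of that bit layout, the sketch below renders a mode value in the familiar `ls -l` style. `mode_string` is a hypothetical helper, and it deliberately ignores the setuid, setgid, and sticky bits to stay short.

```rust
/// Renders a Unix mode as an `ls -l` style string such as "-rw-r--r--".
/// Hypothetical helper for illustration; special bits (setuid, setgid,
/// sticky) are ignored.
fn mode_string(mode: u32) -> String {
    // Bits 12-15 select the file type.
    let kind = match mode & 0o170000 {
        0o040000 => 'd', // directory
        0o120000 => 'l', // symbolic link
        0o100000 => '-', // regular file
        _ => '?',        // anything else, or no type recorded
    };
    let mut out = String::with_capacity(10);
    out.push(kind);
    // Bits 0-8 hold three rwx triplets: owner, group, other.
    for shift in [6u32, 3, 0] {
        let bits = (mode >> shift) & 0o7;
        out.push(if bits & 0o4 != 0 { 'r' } else { '-' });
        out.push(if bits & 0o2 != 0 { 'w' } else { '-' });
        out.push(if bits & 0o1 != 0 { 'x' } else { '-' });
    }
    out
}
```

A value obtained from an entry's `unix_mode()` can be passed straight to this function when listing archive contents.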

Directory Permission Defaults

use zip::ZipArchive;
use std::fs::File;
 
fn directory_permissions() -> zip::result::ZipResult<()> {
    let file = File::open("archive.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    // Directories in ZIP archives:
    // - May or may not be explicitly stored
    // - May have permission information
    
    // When directories are created during extraction:
    // - If permissions are stored: use stored permissions
    // - If no permissions: use platform defaults
    
    // On Unix, default directory permissions are typically 0o755:
    // - Owner: rwx (7)
    // - Group: rx (5)
    // - Other: rx (5)
    
    // On Windows:
    // - Permissions are largely ignored
    // - Windows ACLs are a separate system
    
    archive.extract("output")?;
    
    Ok(())
}

Directories get sensible defaults or stored permissions depending on archive contents.
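The stored-or-default decision can be sketched as a small pure function. `effective_mode` is a hypothetical name, and the fallback values 0o755 and 0o644 are the conventional defaults mentioned above, not something the zip crate is guaranteed to use.

```rust
/// Picks the permission bits to apply to an extracted entry.
/// Hypothetical helper: uses the stored mode when present, otherwise a
/// conventional default (0o755 for directories, 0o644 for files).
fn effective_mode(stored: Option<u32>, is_dir: bool) -> u32 {
    match stored {
        // Keep only the rwx bits; file type and special bits are dropped.
        Some(mode) => mode & 0o777,
        None if is_dir => 0o755,
        None => 0o644,
    }
}
```

On Unix the result could be passed to `std::fs::Permissions::from_mode` and applied with `std::fs::set_permissions`.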

Symbolic Link Handling

use zip::ZipArchive;
use std::fs::File;
 
fn symlink_handling() -> zip::result::ZipResult<()> {
    // ZIP archives can contain symbolic links
    // This depends on the tool that created the archive
    
    // In the ZIP format:
    // - Symlinks are stored as regular entries
    // - File type bits in external attributes indicate symlink
    // - File content is the target path
    
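    // The zip crate's extraction behavior (recent releases; older
    // releases may write the entry out as a regular file instead):
    // 1. Detects symlink type from Unix mode bits
    // 2. Creates symlink pointing to target (from file content)
    // 3. Straightforward on Unix; on Windows, creating symlinks needs
    //    administrator rights or Developer Mode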
    
    let file = File::open("archive_with_symlinks.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    // Extraction handles symlinks if supported
    // On Unix: symlinks are created
    // On Windows: may fail or create regular files
    
    archive.extract("output")?;
    
    Ok(())
}

Symlinks are stored as entries whose content is the target path, with type indicated by Unix mode bits.
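A minimal sketch of that type check follows. The `EntryKind` enum and `classify` function are hypothetical names introduced here; the octal constants are the standard POSIX file type values.

```rust
/// Coarse entry classification derived from the Unix mode.
/// Hypothetical type for illustration.
#[derive(Debug, PartialEq, Eq)]
enum EntryKind {
    Regular,
    Directory,
    Symlink,
    Other,
}

/// Classifies an entry by the file type bits of its Unix mode.
fn classify(mode: u32) -> EntryKind {
    const S_IFMT: u32 = 0o170000; // mask selecting the file type bits
    match mode & S_IFMT {
        0o100000 => EntryKind::Regular,   // S_IFREG
        0o040000 => EntryKind::Directory, // S_IFDIR
        0o120000 => EntryKind::Symlink,   // S_IFLNK
        _ => EntryKind::Other,            // FIFO, device, socket, or unset
    }
}
```

An extractor might treat `EntryKind::Other` as a reason to skip the entry rather than guess at its meaning.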

Security Considerations for Symlinks

use zip::ZipArchive;
use std::fs::File;
use std::path::PathBuf;
 
fn symlink_security() {
    // Symlink extraction has security implications:
    
    // 1. Path traversal via symlinks:
    //    A symlink pointing to "../../../etc/passwd"
    //    Could expose sensitive files
    
    // 2. Absolute symlinks:
    //    A symlink pointing to "/etc/shadow"
    //    Could expose system files
    
    // 3. Symlink following:
    //    Extracting a file through a symlink
    //    Could overwrite arbitrary files
    
    // The zip crate implements protections:
    // - Validates paths don't escape extraction directory
    // - May reject dangerous symlinks
    // - May create regular files instead of symlinks
    
    // Users should:
    // - Only extract trusted archives
    // - Use sandboxed directories
    // - Verify archive contents before extraction
}

Symlink extraction requires security checks to prevent path traversal attacks.
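One way to reason about a symlink target is to resolve it lexically against the link's own location and check that it never climbs above the extraction root. The sketch below does that with a depth counter. `symlink_stays_inside` is a hypothetical helper, it assumes `link` is already a sanitized relative path, and it does not account for other symlinks that may already exist on disk.

```rust
use std::path::{Component, Path};

/// Returns true if a symlink placed at `link` (relative to the extraction
/// root) with the given `target` resolves, lexically, inside the root.
/// Hypothetical helper; a sketch, not a complete defense.
fn symlink_stays_inside(link: &Path, target: &str) -> bool {
    let target = Path::new(target);
    if target.is_absolute() {
        return false;
    }
    // Depth of the directory that contains the link, counted from the root.
    let mut depth = link.parent().map_or(0, |parent| {
        parent
            .components()
            .filter(|c| matches!(c, Component::Normal(_)))
            .count()
    });
    for component in target.components() {
        match component {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            Component::ParentDir => {
                // Climbing above the root means the target escapes.
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            Component::RootDir | Component::Prefix(_) => return false,
        }
    }
    true
}
```

A relative target such as `../c` is accepted for a link at `a/b/link`, while the same kind of target is rejected once it would step above the root.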

Checking Archive Contents Before Extraction

use zip::ZipArchive;
use std::fs::File;
use std::path::Path;
 
fn safe_extraction() -> zip::result::ZipResult<()> {
    let file = File::open("untrusted.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
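    // Inspect archive before extracting
    let mut suspicious = false;
    for i in 0..archive.len() {
        let file = archive.by_index(i)?;
        let name = file.name();
        
        // Check for ".." path components; a plain substring test would
        // also flag harmless names such as "notes..txt"
        if name.split(|c: char| c == '/' || c == '\\').any(|part| part == "..") {
            println!("WARNING: Path traversal detected: {}", name);
            suspicious = true;
        }
        
        // Check for absolute paths (Unix root or Windows drive letter)
        if name.starts_with('/') || (name.len() > 1 && name.chars().nth(1) == Some(':')) {
            println!("WARNING: Absolute path in archive: {}", name);
            suspicious = true;
        }
        
        // Check for symlinks: file type bits 0o120000 in the Unix mode
        if file.unix_mode().map_or(false, |m| m & 0o170000 == 0o120000) {
            println!("NOTE: Symlink entry: {}", name);
        }
    }
    
    // Only extract if everything looks safe
    if suspicious {
        println!("Refusing to extract: archive contains unsafe paths");
        return Ok(());
    }
    archive.extract("safe_output")?;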
    
    Ok(())
}

Inspecting archive contents before extraction is a security best practice.
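The name checks can also be expressed on path components rather than substrings. The sketch below turns an entry name into a relative path or rejects it. `sanitized_relative_path` is a hypothetical helper and is stricter than necessary (it rejects every `..`, even ones that would stay inside the root); the crate's own `enclosed_name()` serves a similar purpose and should usually be preferred.

```rust
use std::path::{Component, Path, PathBuf};

/// Converts an archive entry name into a relative path that cannot leave
/// the extraction root, or returns `None` if the name is unsafe.
/// Hypothetical helper for illustration.
fn sanitized_relative_path(name: &str) -> Option<PathBuf> {
    // Embedded NUL bytes are never valid in file names.
    if name.contains('\0') {
        return None;
    }
    let mut out = PathBuf::new();
    for component in Path::new(name).components() {
        match component {
            Component::Normal(part) => out.push(part),
            Component::CurDir => {}
            // Reject anything that could point outside the root.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    if out.as_os_str().is_empty() {
        None
    } else {
        Some(out)
    }
}
```

Because the check works on whole components, a name like `file..txt` is accepted while `../../etc/passwd` is not. Note that on Unix a backslash is an ordinary file name character, so names using `\` as a separator need separate handling.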

Manual Extraction with Permission Control

use zip::ZipArchive;
use std::fs::{File, OpenOptions, create_dir_all};
use std::path::Path;
 
#[cfg(unix)]
use std::os::unix::fs::PermissionsExt;
 
fn manual_extraction() -> zip::result::ZipResult<()> {
    let file = File::open("archive.zip")?;
    let mut archive = ZipArchive::new(file)?;
    
    for i in 0..archive.len() {
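        let mut zip_file = archive.by_index(i)?;
        // enclosed_name() rejects absolute paths and ".." escapes;
        // joining the raw name() would allow writes outside "output"
        let outpath = match zip_file.enclosed_name() {
            Some(rel) => Path::new("output").join(rel),
            None => continue,
        };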
        
        if zip_file.is_dir() {
            // Create directory
            create_dir_all(&outpath)?;
            
            // Set directory permissions on Unix
            #[cfg(unix)]
            {
                // Default directory permissions
                let permissions = std::fs::Permissions::from_mode(0o755);
                std::fs::set_permissions(&outpath, permissions)?;
            }
        } else {
            // Create parent directories
            if let Some(p) = outpath.parent() {
                create_dir_all(p)?;
            }
            
            // Extract file content
            let mut outfile = OpenOptions::new()
                .write(true)
                .create(true)
                .truncate(true)
                .open(&outpath)?;
            
            std::io::copy(&mut zip_file, &mut outfile)?;
            
            // Set file permissions on Unix
            #[cfg(unix)]
            {
                // Default file permissions
                let permissions = std::fs::Permissions::from_mode(0o644);
                std::fs::set_permissions(&outpath, permissions)?;
            }
        }
    }
    
    Ok(())
}

Manual extraction allows complete control over permission handling.
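One policy that manual extraction makes possible is to restrict stored modes before applying them. The sketch below drops the setuid, setgid, and sticky bits and removes write access for group and other. `restrict_mode` is a hypothetical helper and the policy itself is only an example.

```rust
/// Reduces a stored Unix mode to a conservative set of permission bits.
/// Hypothetical policy for illustration: special bits are dropped and
/// group/other write access is removed.
fn restrict_mode(mode: u32) -> u32 {
    let perms = mode & 0o777; // discard file type, setuid, setgid, sticky
    perms & !0o022 // clear the write bit for group and other
}
```

With this policy an entry stored as 0o4777 would be extracted as 0o755, so an untrusted archive cannot plant a setuid or world-writable file.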

Symlink Creation During Extraction

use zip::ZipArchive;
use std::fs::File;
use std::io::Read;
use std::path::Path;
 
fn extract_with_symlinks() -> zip::result::ZipResult<()> {
    #[cfg(unix)]
    {
        let file = File::open("archive.zip")?;
        let mut archive = ZipArchive::new(file)?;
        
        for i in 0..archive.len() {
            let mut zip_file = archive.by_index(i)?;
            // Skip entries whose path would escape the output directory
            let outpath = match zip_file.enclosed_name() {
                Some(rel) => Path::new("output").join(rel),
                None => continue,
            };
            
            // For symlinks, the entry content is the target path
            if is_symlink_mode(zip_file.unix_mode()) {
                let mut target = String::new();
                zip_file.read_to_string(&mut target)?;
                
                // Make sure the parent directory exists
                if let Some(parent) = outpath.parent() {
                    std::fs::create_dir_all(parent)?;
                }
                
                // Remove existing file if present
                let _ = std::fs::remove_file(&outpath);
                
                // Create symlink (the target is not validated here)
                std::os::unix::fs::symlink(&target, &outpath)?;
            }
        }
    }
    
    Ok(())
}
 
// Symlink detection from the Unix mode: the file type bits
// (mask 0o170000) equal 0o120000 (S_IFLNK) for symbolic links
#[cfg(unix)]
fn is_symlink_mode(mode: Option<u32>) -> bool {
    mode.map_or(false, |m| m & 0o170000 == 0o120000)
}

On Unix, symlinks can be created by reading the target path from the file content.
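Since the target comes from untrusted entry content, it can help to bound how much is read and to reject obviously invalid values. The sketch below is one way to do that with only the standard library. `read_link_target` and its `max_len` parameter are hypothetical, and any limit chosen by the caller (for example 4096 bytes) is an assumption rather than a crate default.

```rust
use std::io::{self, Read};

/// Reads a symlink target from `reader`, refusing content longer than
/// `max_len` bytes, empty content, embedded NUL bytes, and invalid UTF-8.
/// Hypothetical helper for illustration.
fn read_link_target<R: Read>(reader: R, max_len: u64) -> io::Result<String> {
    let mut buf = Vec::new();
    // Read one byte past the limit so oversized content can be detected
    // without buffering the whole entry.
    reader.take(max_len.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "symlink target too long"));
    }
    if buf.is_empty() || buf.contains(&0) {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "invalid symlink target"));
    }
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Because a mutable reference to a reader is itself a reader, a zip entry could be passed as `&mut zip_file` in place of the `read_to_string` call.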

Platform Differences

use zip::ZipArchive;
use std::fs::File;
 
fn platform_differences() -> zip::result::ZipResult<()> {
    // Unix:
    // - Full permission support (rwx for user/group/other)
    // - Symlink support
    // - Special file types (not typically stored in ZIP)
    
    // Windows:
    // - Limited permission model (read-only flag; ACLs are a separate system)
    // - Symlinks exist, but creating them needs administrator rights
    //   or Developer Mode
    // - Junctions and reparse points are different concepts
    
    // macOS:
    // - Unix permission support
    // - Symlink support
    // - Extended attributes (xattr) may or may not be preserved
    
    // The zip crate attempts to do the right thing on each platform:
    // - On Unix: restore permissions and create symlinks
    // - On Windows: Unix mode bits are not applied; symlinks are handled if possible
    
    let file = File::open("archive.zip")?;
    let mut archive = ZipArchive::new(file)?;
    archive.extract("output")?;
    
    Ok(())
}

Permission and symlink handling varies significantly across platforms.

Permission Preservation Accuracy

use zip::ZipArchive;
use std::fs::File;
use std::path::Path;
 
fn permission_accuracy() {
    // ZIP format limitations:
    
    // 1. Only Unix permissions are stored
    //    Windows ACLs, macOS extended attributes not preserved
    
    // 2. Permissions may be lost when:
    //    - Archive created on Windows, extracted on Unix
    //    - Archive created with non-preserving tool
    //    - Archive transferred through systems that strip metadata
    
    // 3. Umask affects files created without an explicit chmod
    //    A file created with mode 0o666 under umask 0o022 ends up 0o644;
    //    modes applied afterwards via chmod (fs::set_permissions) are not masked
    
    // 4. Permission bits supported:
    //    - Read (r): 0o444
    //    - Write (w): 0o222
    //    - Execute (x): 0o111
    //    - Setuid, setgid, sticky: May or may not be preserved
    
    // For accurate permission preservation:
    // - Create archive on Unix
    // - Use tools that preserve permissions
    // - Extract on same or compatible platform
}

Permission preservation depends on archive creation and extraction platforms.
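The umask arithmetic is easy to get wrong, so here is a small worked sketch. `mode_at_creation` is a hypothetical helper that models what the kernel does when a file or directory is created. It does not model `chmod`, which applies the requested bits without consulting the umask.

```rust
/// Models the permission bits a newly created file or directory receives:
/// the requested mode with every bit set in `umask` cleared.
/// Hypothetical helper for illustration.
fn mode_at_creation(requested: u32, umask: u32) -> u32 {
    requested & !umask & 0o777
}
```

Under the common umask of 0o022, a file created with 0o666 ends up as 0o644 and a directory created with 0o777 ends up as 0o755, which is where the usual defaults come from.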

Handling Special Files

use zip::ZipArchive;
use std::fs::File;
 
fn special_files() {
    // ZIP archives typically don't store:
    // - Device files
    // - Named pipes
    // - Sockets
    
    // If such entries appear in an archive:
    // - They might be stored as regular files
    // - Extraction creates regular files, not special files
    // - This is a limitation of the ZIP format
    
    // Symlinks are the main special file type handled
    // Hard links are not typically stored in ZIP format
    
    // For full Unix file system archival:
    // - Consider tar format instead of ZIP
    // - tar preserves more Unix metadata
}

ZIP format has limited support for Unix special files beyond symlinks.

Real-World Example: Safe Archive Extraction

use zip::ZipArchive;
use std::fs::File;
use std::io::Read;
use std::path::{Path, Component};
 
fn safe_extract(archive_path: &Path, output_dir: &Path) -> zip::result::ZipResult<()> {
    let file = File::open(archive_path)?;
    let mut archive = ZipArchive::new(file)?;
    
    for i in 0..archive.len() {
        let mut file = archive.by_index(i)?;
        let name = file.name().to_string();
        
        // Validate path doesn't escape output directory
        let outpath = output_dir.join(&name);
        if !is_safe_path(output_dir, &outpath) {
            eprintln!("Skipping unsafe path: {}", name);
            continue;
        }
        
        // Symlink entries have file type bits 0o120000 in their Unix mode
        // (recent zip releases also offer file.is_symlink())
        let is_symlink = file.unix_mode().map_or(false, |m| m & 0o170000 == 0o120000);
        if is_symlink {
            let mut target = String::new();
            file.read_to_string(&mut target)?;
            
            // Validate symlink target
            if target.contains("..") || Path::new(&target).is_absolute() {
                eprintln!("Skipping dangerous symlink: {} -> {}", name, target);
                continue;
            }
            
            // Create the link on Unix; elsewhere skip it. Either way, do not
            // fall through: the entry content has already been consumed
            #[cfg(unix)]
            {
                if let Some(parent) = outpath.parent() {
                    std::fs::create_dir_all(parent)?;
                }
                std::os::unix::fs::symlink(&target, &outpath)?;
            }
            continue;
        }
        
        if file.is_dir() {
            std::fs::create_dir_all(&outpath)?;
        } else {
            if let Some(parent) = outpath.parent() {
                std::fs::create_dir_all(parent)?;
            }
            let mut outfile = std::fs::File::create(&outpath)?;
            std::io::copy(&mut file, &mut outfile)?;
        }
    }
    
    Ok(())
}
 
fn is_safe_path(base: &Path, path: &Path) -> bool {
    // Lexical check: look only at the part of `path` below `base`.
    // If the entry name was absolute, join() discarded `base` and
    // strip_prefix fails, so the path is rejected.
    let relative = match path.strip_prefix(base) {
        Ok(relative) => relative,
        Err(_) => return false,
    };
    
    // Track how deep we are below `base`; going above it is an escape
    let mut depth: usize = 0;
    for component in relative.components() {
        match component {
            Component::Normal(_) => depth += 1,
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return false;
                }
                depth -= 1;
            }
            _ => return false,
        }
    }
    
    // Note: this does not resolve symlinks that already exist on disk
    true
}

Safe extraction validates paths and symlink targets to prevent security issues.

Comparing with tar Archives

fn compare_with_tar() {
    // ZIP vs tar for Unix permissions and special files:
    
    // ZIP:
    // - Permissions: Stored in external attributes (Unix-specific)
    // - Symlinks: Supported, stored as regular entries with target in content
    // - Device files: Not typically supported
    // - Extended attributes: Not standard
    // - Cross-platform: Better Windows support
    
    // tar:
    // - Permissions: Full Unix permission preservation
    // - Symlinks: Native support
    // - Device files: Supported
    // - Extended attributes: Some tar variants support
    // - Cross-platform: Primarily Unix-focused
    
    // For Unix-to-Unix archival preserving all metadata:
    // tar is generally better
    
    // For cross-platform compatibility:
    // ZIP is generally better
}

The tar format provides more complete Unix metadata preservation than ZIP.

Key Points Summary

fn key_points() {
    // 1. ZIP stores Unix permissions in external attributes field
    // 2. Permission restoration only works on Unix systems
    // 3. Windows-created archives lack Unix permissions
    // 4. Symlinks are stored as entries with target in content
    // 5. Symlink extraction requires Unix platform
    // 6. Security: validate paths and symlink targets
    // 7. Path traversal attacks are possible with malicious archives
    // 8. Extract to controlled directories for untrusted archives
    // 9. Manual extraction gives full control over permissions
    // 10. Platform differences affect what gets restored
    // 11. For complete Unix preservation, consider tar format
    // 12. Default permissions apply when not stored in archive
}

Key insight: ZipArchive::extract preserves Unix file permissions and creates symbolic links only when the archive contains that metadata and the platform supports it, and both are limited by the ZIP format's design. The format stores Unix permissions in the external attributes field and stores a symlink as a regular entry whose content is the target path. Security is a critical consideration: malicious archives can contain path traversal attempts via .. components or symlinks pointing outside the extraction directory. For trusted archives, extract provides reasonable preservation; for untrusted input, validate archive contents first or extract manually with explicit security checks. The tar format offers more complete Unix metadata preservation when cross-platform compatibility isn't required.