file

Overview

The file command determines file types by examining file contents rather than relying on file extensions. It matches magic numbers (signature bytes at known offsets) and other patterns against a database to identify the format.

Syntax

file [options] file...

Common Options

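Option      Description
-b          Brief mode (omit the filename from the output)
-i          Output the MIME type and encoding
-L          Follow symbolic links
-z          Look inside compressed files
-0          Print a NUL character after each filename in the output
-f list     Read the filenames to examine from the file list, one per line
-m magic    Use the specified magic file(s) instead of the defaults
-k          Keep going; don’t stop at the first match
-r          Raw mode; don’t translate unprintable characters to \ooo
-s          Read block/character special files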
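
As a rough illustration of how the output options change what file prints, the sketch below runs one file through several modes. The helper name show_modes and the sample file are made up for this example, and the exact descriptions depend on the installed version of file and its magic database.

```shell
# Hypothetical helper: print the same file's type in four output modes.
show_modes() {
    file -- "$1"                  # default: "name: description"
    file -b -- "$1"               # brief: description only
    file -i -- "$1"               # MIME type plus charset
    file -b --mime-type -- "$1"   # MIME type only, convenient in scripts
}

# Demonstrate on a throwaway text file.
sample=$(mktemp)
printf 'hello, world\n' > "$sample"
show_modes "$sample"
rm -f "$sample"
```

The long option --mime-type is used in the last line because it omits the charset suffix that -i appends.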

File Type Categories

Category    Examples
Text        ASCII text, UTF-8 text
Binary      ELF executable, PE32 executable
Archive     ZIP, TAR, GZIP
Image       JPEG, PNG, GIF
Audio       MP3, WAV, FLAC
Video       MP4, AVI, MKV
Document    PDF, MS Word, LibreOffice

Key Use Cases

  1. Identify unknown files
  2. Verify file formats
  3. Spot mislabeled or damaged files (a wrong or generic type is a warning sign)
  4. Security analysis
  5. Data recovery

Examples with Explanations

Example 1: Basic File Type

file document.pdf

Prints the filename followed by a description of its type, such as the PDF version.

Example 2: Multiple Files

file *

Shows the type of every non-hidden entry in the current directory (the shell's * does not match dotfiles).

Example 3: MIME Type

file -i image.jpg

Shows the MIME type and charset (for example image/jpeg; charset=binary) instead of the human-readable description.

Understanding Output

Typical output format:

filename: file type description

Examples:

  • script.py: Python script, ASCII text executable
  • image.jpg: JPEG image data, JFIF standard
  • archive.tar.gz: gzip compressed data
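
Because a filename can itself contain a colon, splitting the "filename: description" line on ":" is fragile in scripts. One way around this, sketched below, is to ask for the description separately with -b. The function name describe is hypothetical.

```shell
# Hypothetical helper: print "description<TAB>filename" for each argument,
# so the filename never has to be parsed out of file's own output.
describe() {
    for path in "$@"; do
        desc=$(file -b -- "$path")
        printf '%s\t%s\n' "$desc" "$path"
    done
}
```

Calling describe on several files yields one tab-separated line per file, which tools such as cut or awk -F'\t' can split reliably as long as the names contain no tabs or newlines.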

Common Usage Patterns

  1. Check executable type:

    file /bin/ls
  2. Identify text encoding:

    file -i textfile.txt
  3. Batch file analysis:

    find . -type f -print0 | xargs -0 file

Magic Database

The file command reads its patterns from magic databases. Typical locations (they vary by system):

  • /usr/share/misc/magic (source text)
  • /usr/share/misc/magic.mgc (compiled)
  • /etc/magic (local additions)
  • ~/.magic (user-specific)
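
Since these locations differ between systems, it can help to ask the installed binary which magic files it uses. The sketch below assumes that file --version prints a line beginning with "magic file from", as common builds do. The helper name magic_paths is made up.

```shell
# Hypothetical helper: print the magic path list reported by the installed file binary.
# Assumes --version emits a line of the form "magic file from <paths>".
magic_paths() {
    file --version | sed -n 's/^magic file from //p'
}

magic_paths
```

The result is a colon-separated list when more than one magic file is configured.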

Advanced Usage

  1. Compressed file analysis:

    file -z archive.tar.gz
  2. Follow symlinks:

    file -L symlink
  3. Brief output:

    file -b mysterious_file
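
To see what -z changes, the following sketch compares the plain and -z output for a small gzip file. The helper name compare_compressed and the sample data are invented for the example, and it assumes gzip is installed and that file was built with gzip support.

```shell
# Hypothetical helper: print the container type, then the type of the data inside.
compare_compressed() {
    file -b -- "$1"     # describes the gzip container
    file -bz -- "$1"    # describes the decompressed contents
}

# Demonstrate on a throwaway gzip file.
demo=$(mktemp -d)
printf 'plain text payload\n' | gzip -c > "$demo/sample.gz"
compare_compressed "$demo/sample.gz"
rm -rf "$demo"
```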

Performance Analysis

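  • Fast: classification does not require reading whole files
  • Never modifies the files it examines
  • Reads a bounded portion of each file rather than the full contents
  • Efficient for large directories when many files are passed to one invocation
  • Minimal memory usage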
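
For large trees, most of the cost is usually process start-up rather than the checks themselves, so passing many names to a single invocation helps. A minimal sketch using the -f option follows. The function name classify_tree is hypothetical, and it assumes no filename contains a newline, since -f reads one name per line.

```shell
# Hypothetical helper: classify every regular file under a directory
# with a single file invocation by feeding it a list of names via -f.
classify_tree() {
    list=$(mktemp) || return 1
    find "${1:-.}" -type f > "$list"   # one path per line
    file -f "$list"
    rm -f "$list"
}
```

Running classify_tree /some/dir prints one "path: description" line per file found.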

Best Practices

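  1. Run file on anything of unknown origin before opening it
  2. Confirm that a file's contents match its extension
  3. Check the type before passing a file to a parser or converter
  4. Use MIME types for web applications
  5. Combine with other analysis tools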

Security Applications

  1. Triage a suspicious file (file reports the type only; it does not detect malware):

    file suspicious_file
  2. Data validation:

    file -b --mime-type uploaded_image | grep -qx "image/jpeg"
  3. File type verification:

    [[ $(file -b file.pdf) == *"PDF"* ]]
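
The checks above can be generalized into an allow-list test on the detected MIME type. The sketch below is one way to do it. The function name mime_allowed and the file name upload.bin are hypothetical, and a match only means the content looks like an allowed type, not that the file is safe.

```shell
# Hypothetical helper: succeed only if the detected MIME type is in an allow-list.
# Usage: mime_allowed FILE TYPE...
mime_allowed() {
    path=$1
    shift
    detected=$(file -b --mime-type -- "$path")
    for allowed in "$@"; do
        [ "$detected" = "$allowed" ] && return 0
    done
    return 1
}

# Example: accept only JPEG or PNG content, whatever the file is named.
if mime_allowed upload.bin image/jpeg image/png; then
    echo "accepted"
else
    echo "rejected"
fi
```

Comparing the whole MIME string, rather than grepping the description, avoids false matches on filenames or on descriptions that merely mention a format.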

Scripting Examples

  1. Process only images:

    for f in *; do
        if file -b --mime-type -- "$f" | grep -q '^image/'; then
            echo "Processing image: $f"
        fi
    done
  2. Find executables:

    find . -type f -exec file {} + | grep ': .*executable'
  3. Validate file types:

    validate_pdf() {
        file -b "$1" | grep -q "PDF" || return 1
    }
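
Another pattern in the same spirit is summarizing a directory by type. The sketch below counts files per detected MIME type. The function name mime_histogram is made up for this example.

```shell
# Hypothetical helper: count files under a directory by detected MIME type,
# most common type first.
mime_histogram() {
    find "${1:-.}" -type f -exec file -b --mime-type -- {} + \
        | sort | uniq -c | sort -rn
}
```

Each output line holds a count followed by a MIME type, which gives a quick overview of what an unfamiliar directory contains.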

MIME Type Examples

Common MIME types:

  • text/plain - Plain text
  • image/jpeg - JPEG image
  • application/pdf - PDF document
  • video/mp4 - MP4 video
  • application/zip - ZIP archive

Troubleshooting

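  1. Unknown file types: output of just "data" means no magic pattern matched
  2. Corrupted files: a truncated or damaged header may be reported as "data" or as the wrong type
  3. Magic database issues: check which database is in use with file --version, or point to one with -m
  4. Encoding problems: run file --mime-encoding to see the detected character set
  5. Symlink handling: links are reported as links by default; use -L to follow them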

Integration Examples

  1. With find:

    find /home -type f -exec file {} \; | grep "ASCII text"
  2. With grep:

    file * | grep -i image
  3. List JPEG files with details:

    file * | awk -F: '/JPEG image data/ {print $1}' | tr '\n' '\0' | xargs -0 ls -l

Custom Magic Files

Create custom magic patterns:

# ~/.magic
0    string    MYFORMAT    My custom file format

Then use it (with -m, the named magic file replaces the defaults for that run):

file -m ~/.magic custom_file
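
To try this without touching ~/.magic, the sketch below builds the same rule and a matching sample in a temporary directory. The function name demo_custom_magic and the file names custom.magic and sample.bin are invented for the example.

```shell
# Hypothetical demo: write a one-rule magic file and a matching sample,
# then classify the sample using only that rule.
demo_custom_magic() {
    workdir=$(mktemp -d) || return 1
    # Fields: offset, type, test value, message.
    printf '0\tstring\tMYFORMAT\tMy custom file format\n' > "$workdir/custom.magic"
    printf 'MYFORMAT payload\n' > "$workdir/sample.bin"
    file -b -m "$workdir/custom.magic" "$workdir/sample.bin"
    rm -rf "$workdir"
}

demo_custom_magic
```

The rule matches because the sample starts with the string MYFORMAT at offset 0, so file prints the message from the fourth field.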