tr

Overview

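The tr (translate) command translates, deletes, or squeezes characters read from standard input and writes the result to standard output. It takes no file name arguments, so input must be piped or redirected. Typical uses are character substitution, deletion, and squeezing runs of repeated characters.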

Syntax

tr [options] set1 [set2]
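
Because tr reads only standard input, data has to arrive through a pipe or a redirection. The following is a minimal sketch of both forms; the variable names are illustrative only.

```shell
# Pipe form: two sets, each character in set1 maps to its counterpart in set2.
upper=$(printf 'alpha beta\n' | tr 'a-z' 'A-Z')

# Redirection form (here-document): one set with -d deletes those characters.
no_vowels=$(tr -d 'aeiou' <<'EOF'
alpha beta
EOF
)

echo "$upper"      # ALPHA BETA
echo "$no_vowels"  # lph bt
```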

Common Options

Option  Description
-c      Complement set1 (use every character not in set1)
-d      Delete characters in set1
-s      Squeeze each run of a repeated character into a single occurrence
-t      Truncate set1 to the length of set2 (GNU tr)
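
As a rough illustration of how the options differ, the sketch below runs -d, -s, and -c with -d on the same sample string. The sample value and variable names are made up for the example.

```shell
sample='aabbcc 112233'

# -d: delete every character in set1.
deleted=$(printf '%s\n' "$sample" | tr -d 'a-c')

# -s: squeeze runs of the listed characters down to one occurrence.
squeezed=$(printf '%s\n' "$sample" | tr -s 'a-c1-3')

# -c with -d: delete everything NOT in set1 (newline is listed so it survives).
digits=$(printf '%s\n' "$sample" | tr -cd '0-9\n')

echo "[$deleted]"   # [ 112233]
echo "[$squeezed]"  # [abc 123]
echo "[$digits]"    # [112233]
```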

Character Sets

Set        Description
[:alnum:]  Alphanumeric characters
[:alpha:]  Alphabetic characters
[:digit:]  Digits 0-9
[:lower:]  Lowercase letters
[:upper:]  Uppercase letters
[:space:]  Whitespace characters
[:punct:]  Punctuation characters
[:print:]  Printable characters
[:cntrl:]  Control characters
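
Classes can be combined with literal characters inside one set. A small sketch on a single made-up input line, assuming an ASCII-only string:

```shell
line='Room 101, Floor 3!'

# Keep digits only; \n is added to the set so the line ending is preserved.
digits=$(printf '%s\n' "$line" | tr -cd '[:digit:]\n')

# Drop punctuation, leave everything else untouched.
no_punct=$(printf '%s\n' "$line" | tr -d '[:punct:]')

# Map one class onto another.
shout=$(printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]')

echo "$digits"    # 1013
echo "$no_punct"  # Room 101 Floor 3
echo "$shout"     # ROOM 101, FLOOR 3!
```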

Key Use Cases

  1. Case conversion
  2. Character replacement
  3. Delete unwanted characters
  4. Format text data
  5. Clean input data

Examples with Explanations

Example 1: Uppercase Conversion

echo "hello world" | tr '[:lower:]' '[:upper:]'

Output: HELLO WORLD

Example 2: Delete Characters

echo "hello123world" | tr -d '[:digit:]'

Output: helloworld

Example 3: Replace Characters

echo "hello world" | tr ' ' '_'

Output: hello_world

Example 4: Squeeze Repeated Characters

echo "hello    world" | tr -s ' '

Output: hello world

Common Usage Patterns

  1. Convert to lowercase:

    echo "HELLO" | tr '[:upper:]' '[:lower:]'
  2. Remove newlines:

    cat file.txt | tr -d '\n'
  3. Replace multiple characters:

    echo "a,b;c:d" | tr ',;:' '   '

Character Ranges

  1. Letter ranges:

    echo "hello" | tr 'a-z' 'A-Z'
  2. Number ranges:

    echo "123" | tr '1-3' 'abc'
  3. Explicit character lists (mapped by position):

    echo "hello" | tr 'helo' '1234'
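
A range only touches the characters it covers, and ranges can be mixed with literal characters in the same set. The sketch below uses a made-up MAC-style string to show both points.

```shell
mac='de:ad:be:ef:00:1a'

# Only a-f is translated; digits and ':' pass through unchanged.
upper_hex=$(printf '%s\n' "$mac" | tr 'a-f' 'A-F')

# Range plus a literal: a-f maps to A-F, and ':' maps to '-'.
dashed=$(printf '%s\n' "$mac" | tr 'a-f:' 'A-F-')

echo "$upper_hex"  # DE:AD:BE:EF:00:1A
echo "$dashed"     # DE-AD-BE-EF-00-1A
```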

Advanced Usage

  1. Complement sets:

    echo "hello123" | tr -cd '[:alpha:]'  # Keep only letters (the trailing newline is deleted too)
  2. Multiple operations:

    echo "Hello World" | tr '[:upper:]' '[:lower:]' | tr ' ' '_'
  3. ROT13 encoding:

    echo "hello" | tr 'a-zA-Z' 'n-za-mN-ZA-M'
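
Since ROT13 is its own inverse, applying the same translation twice should return the original text. A minimal sketch that wraps the mapping in a function (the name rot13 is chosen here for illustration):

```shell
# Rotate ASCII letters by 13 places; everything else passes through.
rot13() {
    tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

encoded=$(printf 'Hello, World!\n' | rot13)
decoded=$(printf '%s\n' "$encoded" | rot13)

echo "$encoded"  # Uryyb, Jbeyq!
echo "$decoded"  # Hello, World!
```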

Text Processing

  1. Clean CSV data:

    cat data.csv | tr -d '"' | tr ',' '\t'
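  2. Strip formatting from phone numbers (tr maps single characters, so it cannot insert a pattern such as (###) ###-####):

    echo "(123) 456-7890" | tr -cd '[:digit:]\n'
  3. Remove control characters (this also removes tabs and newlines):

    cat file.txt | tr -d '[:cntrl:]'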
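
Deleting the whole [:cntrl:] class also strips tabs and newlines, which is often not what is wanted. One possible alternative, sketched below for ASCII input, lists the control bytes by octal range and leaves out tab (\011) and newline (\012). Note that carriage returns (\015) are still removed.

```shell
# Sample input containing SOH (\001), a tab, ESC (\033), and a newline.
cleaned=$(printf 'a\001b\tc\033d\n' | tr -d '\000-\010\013-\037\177')

# Expected result: the tab is kept, the other control bytes are gone.
expected=$(printf 'ab\tcd')

[ "$cleaned" = "$expected" ] && echo "control bytes removed, tab kept"
```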

Performance Analysis

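  • Processes input in a single pass, one character at a time
  • Stream-based: reads standard input and writes standard output without needing the whole input
  • Memory use does not grow with input size
  • Suitable for large files, since nothing is loaded in full
  • Adds little overhead when placed in a pipeline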

Best Practices

  1. Use character classes for portability
  2. Test transformations on sample data
  3. Combine with other text tools
  4. Handle special characters carefully
  5. Consider locale settings

Data Cleaning

  1. Remove punctuation:

    echo "Hello, World!" | tr -d '[:punct:]'
  2. Normalize whitespace:

    echo "hello    world" | tr -s '[:space:]' ' '
  3. Extract numbers:

    echo "abc123def456" | tr -cd '[:digit:]'
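
The cleaning steps above can be chained into one helper. The sketch below builds a URL-style slug; the function name slugify is hypothetical, and the approach assumes ASCII input with no leading or trailing whitespace (those would become leading or trailing hyphens).

```shell
# Lowercase, drop punctuation, then turn each run of whitespace into one hyphen.
slugify() {
    printf '%s' "$1" |
        tr '[:upper:]' '[:lower:]' |
        tr -d '[:punct:]' |
        tr -s '[:space:]' '-'
}

slug=$(slugify 'Hello,   World!  Again')
echo "$slug"  # hello-world-again
```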

File Processing

  1. Convert line endings:

    tr -d '\r' < dos_file.txt > unix_file.txt
  2. Create word list:

    cat text.txt | tr '[:space:][:punct:]' '\n' | tr -s '\n'
  3. Count alphabetic characters:

    cat file.txt | tr -cd '[:alpha:]' | wc -c
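
The word-list idea extends naturally to a word-frequency count. A minimal sketch on an inline sample string (the text and variable names are made up): -c complements the alphabetic class, so every non-letter becomes a newline, and -s squeezes the resulting runs.

```shell
text='The cat and the hat and the bat.'

# One word per line, lowercased, then counted and ranked.
top_word=$(printf '%s\n' "$text" |
    tr '[:upper:]' '[:lower:]' |
    tr -cs '[:alpha:]' '\n' |
    sort | uniq -c | sort -rn | head -n 1 |
    awk '{ print $2, $1 }')

echo "$top_word"  # the 3
```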

Integration Examples

  1. With find for filename processing:

    find . -name "*.txt" | tr '[:upper:]' '[:lower:]'
  2. Log processing:

    tail -f access.log | tr ',' '\t' | cut -f1
  3. Data format conversion:

    cat data.txt | tr ';' ',' > data.csv

Scripting Applications

  1. Input validation:

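    validate_input() {
        # Succeed only if the input is non-empty and purely alphanumeric
        [ -n "$1" ] && [ "$1" = "$(printf '%s' "$1" | tr -cd '[:alnum:]')" ]
    }
  2. Password generation:

    generate_password() {
        # LC_ALL=C makes tr treat the random input as plain bytes
        LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 12
    }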
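
Another common scripting use is normalizing free-form answers before comparing them. The sketch below is one way to do it; the function names normalize_answer and is_yes are hypothetical.

```shell
# Lowercase the answer and strip all whitespace.
normalize_answer() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '[:space:]'
}

# Return success for any spelling of y / yes.
is_yes() {
    case "$(normalize_answer "$1")" in
        y|yes) return 0 ;;
        *)     return 1 ;;
    esac
}

if is_yes ' YES '; then
    echo "confirmed"
fi
```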

Special Characters

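  1. Handle tabs:

    printf 'hello\tworld\n' | tr '\t' ' '
  2. Replace literal backslashes:

    printf '%s\n' 'hello\nworld' | tr '\\' '/'
  3. Unicode handling (GNU tr works on bytes, so a multibyte UTF-8 character such as é cannot be translated reliably; use sed for that):

    echo "café" | sed 's/é/e/g'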

Troubleshooting

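  1. Character encoding issues: GNU tr processes bytes, so multibyte UTF-8 characters can be split or corrupted
  2. Locale-specific behavior: classes such as [:alpha:] and ranges such as a-z depend on the current locale; set LC_ALL=C for plain ASCII behavior
  3. Special character handling: quote both sets so the shell does not expand brackets, backslashes, or asterisks
  4. Set length mismatches: when set2 is shorter than set1, GNU tr repeats the last character of set2; POSIX leaves this case unspecified
  5. Unexpected transformations: tr maps single characters by position, never whole strings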
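
A frequent source of unexpected output is treating the sets as strings. The sketch below contrasts tr with sed on a made-up input: tr replaces every c, a, and t individually, while sed replaces only the whole word.

```shell
input='cat act tact'

# tr: c->d, a->o, t->g, applied to every character independently.
by_tr=$(printf '%s\n' "$input" | tr 'cat' 'dog')

# sed: replaces the literal string "cat" only.
by_sed=$(printf '%s\n' "$input" | sed 's/cat/dog/g')

echo "$by_tr"   # dog odg godg
echo "$by_sed"  # dog act tact
```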

Security Applications

  1. Sanitize input:

    echo "$user_input" | tr -cd '[:alnum:]._-'
  2. Remove dangerous characters:

    echo "$filename" | tr -d '/<>:|*?"\\'
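
An allow-list is usually safer than a deny-list, because unknown characters are rejected by default. The following is a sketch rather than a complete defense (for example, it does not treat a leading ".." specially); the function name sanitize_filename is hypothetical.

```shell
# Replace every run of characters outside the allow-list with one underscore.
# -c complements the allowed set, -s squeezes the resulting underscores.
sanitize_filename() {
    printf '%s' "$1" | tr -cs 'A-Za-z0-9._-' '_'
}

safe=$(sanitize_filename 'my file<>:name?.txt')
echo "$safe"  # my_file_name_.txt
```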

Performance Optimization

  1. Use character classes:

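    # Preferred: shorter and locale-aware (runtime is about the same)
    tr '[:lower:]' '[:upper:]'
    # Equivalent for ASCII, but longer and easier to mistype
    tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'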
  2. Combine operations:

    # Single tr call is faster
    echo "Hello World" | tr '[:upper:] ' '[:lower:]_'

Real-world Examples

  1. Log analysis:

    grep ERROR /var/log/app.log | tr '[:upper:]' '[:lower:]' | sort | uniq -c
  2. Data migration:

    cat old_format.txt | tr '|' ',' | tr -s ' ' > new_format.csv
  3. Text normalization:

    cat document.txt | tr -s '[:space:]' ' ' | tr '[:upper:]' '[:lower:]'