tr
Overview
The tr (translate) command translates or deletes characters from standard input. It’s used for character substitution, deletion, and squeezing repeated characters.
Syntax
tr [options] set1 [set2]
Common Options
Option | Description |
---|---|
-c | Complement set1 |
-d | Delete characters in set1 |
-s | Squeeze repeated characters |
-t | Truncate set1 to length of set2 |
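To make the difference between the deleting, squeezing, and complementing options concrete, the sketch below runs each on the same input. The sample string and the variable name `input` are made up for illustration, and a POSIX shell is assumed.

```shell
# Sample input, chosen only for illustration (note the two spaces).
input='aabbcc  112233'

# -d: delete every character that appears in set1 (here, the digits).
printf '%s\n' "$input" | tr -d '0-9'

# -s: squeeze each run of a repeated set1 character down to a single one.
printf '%s\n' "$input" | tr -s 'a-c '      # abc 112233

# -c with -d: delete everything NOT in set1, keeping digits and the newline.
printf '%s\n' "$input" | tr -cd '0-9\n'    # 112233
```

The first command leaves the two spaces in place, so its output ends in trailing blanks.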
Character Sets
Set | Description |
---|---|
[:alnum:] | Alphanumeric characters |
[:alpha:] | Alphabetic characters |
[:digit:] | Digits 0-9 |
[:lower:] | Lowercase letters |
[:upper:] | Uppercase letters |
[:space:] | Whitespace characters |
[:punct:] | Punctuation characters |
[:print:] | Printable characters |
[:cntrl:] | Control characters |
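One way to get a feel for what each class matches is to count its members in a sample string. The helper name `count_class` and the sample text are hypothetical, and the C locale is pinned so the classes cover only ASCII.

```shell
# count_class STRING CLASS
# Prints how many characters of STRING belong to CLASS, e.g. '[:digit:]'.
# (count_class is a hypothetical helper, not part of tr.)
count_class() {
    # -c complements the class and -d deletes, so only class members survive;
    # the final tr strips the padding some wc implementations add.
    printf '%s' "$1" | LC_ALL=C tr -cd "$2" | wc -c | tr -d ' '
}

sample='Hello, World 42!'
for class in '[:upper:]' '[:lower:]' '[:digit:]' '[:punct:]' '[:space:]'; do
    printf '%-10s %s\n' "$class" "$(count_class "$sample" "$class")"
done
```

For this sample the counts are 2, 8, 2, 2, and 2, which add up to the 16 characters of the string.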
Key Use Cases
- Case conversion
- Character replacement
- Delete unwanted characters
- Format text data
- Clean input data
Examples with Explanations
Example 1: Uppercase Conversion
echo "hello world" | tr '[:lower:]' '[:upper:]'
Output: HELLO WORLD
Example 2: Delete Characters
echo "hello123world" | tr -d '[:digit:]'
Output: helloworld
Example 3: Replace Characters
echo "hello world" | tr ' ' '_'
Output: hello_world
Example 4: Squeeze Repeated Characters
echo "hello    world" | tr -s ' '
Output: hello world
Common Usage Patterns
Convert to lowercase:
echo "HELLO" | tr '[:upper:]' '[:lower:]'
Remove newlines:
tr -d '\n' < file.txt
Replace multiple characters:
echo "a,b;c:d" | tr ',;:' ' '
Character Ranges
Letter ranges:
echo "hello" | tr 'a-z' 'A-Z'
Number ranges:
echo "123" | tr '1-3' 'abc'
Explicit sets (each character maps to the one at the same position, so this prints 12334):
echo "hello" | tr 'helo' '1234'
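The range examples above can be extended with a few edge cases. This is a small sketch assuming GNU tr in an ASCII-compatible locale; the sample strings are invented.

```shell
# A range expands to every character between its endpoints,
# so 'a-f' behaves like 'abcdef'.
printf 'deadbeef\n' | tr 'a-f' 'A-F'        # DEADBEEF
printf 'deadbeef\n' | tr 'abcdef' 'ABCDEF'  # DEADBEEF

# Ranges of equal length map position by position: 0->a, 1->b, ... 9->j.
printf '2024\n' | tr '0-9' 'a-j'            # cace

# A hyphen placed last in a set is a literal hyphen, not a range operator.
printf 'a-b-c\n' | tr -d 'a-'               # bc
```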
Advanced Usage
Complement sets:
echo "hello123" | tr -cd '[:alpha:]' # Keep only letters
Multiple operations:
echo "Hello World" | tr '[:upper:]' '[:lower:]' | tr ' ' '_'
ROT13 encoding:
echo "hello" | tr 'a-zA-Z' 'n-za-mN-ZA-M'
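Because ROT13 shifts each letter by half the alphabet, applying the same mapping twice restores the original text. The sketch below wraps the mapping shown above in a hypothetical `rot13` helper to demonstrate the round trip.

```shell
# rot13 STRING: hypothetical helper wrapping the ROT13 mapping.
rot13() {
    printf '%s\n' "$1" | tr 'a-zA-Z' 'n-za-mN-ZA-M'
}

encoded=$(rot13 'Hello, World!')
printf '%s\n' "$encoded"    # Uryyb, Jbeyq!
rot13 "$encoded"            # Hello, World!
```

Characters outside a-z and A-Z, such as the comma and the space, pass through unchanged.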
Text Processing
Clean CSV data (naive: fields that contain quoted commas will be split):
tr -d '"' < data.csv | tr ',' '\t'
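Strip phone number formatting (tr maps characters one-to-one and cannot insert new ones, so adding punctuation needs a tool such as sed):
echo "(123) 456-7890" | tr -cd '[:digit:]\n'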
Remove control characters (note that [:cntrl:] includes newline and tab, so lines are joined):
tr -d '[:cntrl:]' < file.txt
Performance Analysis
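- Single pass over the input with very little work per character
- Stream-based: reads standard input and writes standard output without loading the whole file
- Memory usage stays small regardless of input size
- Suitable for large files
- Adds little overhead inside pipelines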
Best Practices
- Use character classes for portability
- Test transformations on sample data
- Combine with other text tools
- Handle special characters carefully
- Consider locale settings
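Two of the practices above, testing on sample data and pinning the locale, can be combined in a tiny check routine. This is only a sketch: `check` is a hypothetical helper, and forcing `LC_ALL=C` is an assumption that suits ASCII data but not every workload.

```shell
# check INPUT EXPECTED TR_ARGS...
# Hypothetical helper: runs tr on INPUT in the C locale and compares the
# result with EXPECTED. Returns non-zero and reports on stderr on a mismatch.
check() {
    input=$1
    expected=$2
    shift 2
    actual=$(printf '%s' "$input" | LC_ALL=C tr "$@")
    if [ "$actual" != "$expected" ]; then
        printf 'FAIL: tr %s gave "%s", expected "%s"\n' "$*" "$actual" "$expected" >&2
        return 1
    fi
}

check 'hello world'   'HELLO WORLD' '[:lower:]' '[:upper:]' && echo 'uppercase ok'
check 'a,b;c'         'a b c'       ',;' '  '               && echo 'separators ok'
check 'tel: 555-0100' '5550100'     -cd '[:digit:]'         && echo 'digits ok'
```

Running a handful of such checks before pointing a pipeline at real files catches quoting mistakes and set mismatches early.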
Data Cleaning
Remove punctuation:
echo "Hello, World!" | tr -d '[:punct:]'
Normalize whitespace:
echo "hello    world" | tr -s '[:space:]' ' '
Extract numbers:
echo "abc123def456" | tr -cd '[:digit:]'
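The extraction above glues all digits together (123456). When the individual numbers matter, a variation with -c and -s can keep them apart. The helper name `extract_numbers` is hypothetical.

```shell
# extract_numbers STRING (hypothetical helper)
# Prints each run of digits in STRING on its own line.
extract_numbers() {
    # -c complements the digit set, so every non-digit becomes a newline;
    # -s squeezes each run of newlines into one; sed drops the empty first
    # line that appears when the input starts with a non-digit.
    printf '%s\n' "$1" | tr -cs '0-9' '\n' | sed '/^$/d'
}

extract_numbers 'abc123def456'
# 123
# 456
```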
File Processing
Convert line endings:
tr -d '\r' < dos_file.txt > unix_file.txt
Create word list:
tr -s '[:space:][:punct:]' '\n' < text.txt
Count alphabetic characters:
tr -cd '[:alpha:]' < file.txt | wc -c
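The word-list idea extends naturally to a word-frequency count. The sketch below is one possible arrangement: `word_freq` is a hypothetical helper name, the sample sentence is made up, and sort and uniq are assumed to be available.

```shell
# word_freq: reads text on stdin and prints "count word" lines,
# most frequent first. (Hypothetical helper.)
word_freq() {
    # 1. Turn every run of non-letters into a single newline (one word per line).
    # 2. Fold everything to lowercase so "The" and "the" count together.
    # 3. Drop any empty line, then count and rank.
    tr -cs '[:alpha:]' '\n' |
        tr '[:upper:]' '[:lower:]' |
        sed '/^$/d' |
        sort | uniq -c | sort -rn
}

printf 'The cat and the hat. The end.\n' | word_freq | head -n 3
```

In this sample "the" comes first with a count of 3; the order of the words that tie at 1 depends on the sort implementation.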
Integration Examples
With find for filename processing:
find . -name "*.txt" | tr '[:upper:]' '[:lower:]'
Log processing:
tail -f access.log | tr ',' '\t' | cut -f1
Data format conversion:
tr ';' ',' < data.txt > data.csv
Scripting Applications
Input validation (succeeds only when the input is non-empty and entirely alphanumeric):
validate_input() { [ -n "$1" ] && [ -z "$(printf '%s' "$1" | tr -d '[:alnum:]')" ]; }
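Password generation (LC_ALL=C keeps tr from rejecting random bytes that are invalid in the current locale):
generate_password() { LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 12; echo; }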
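In the same spirit as the helper functions above, tr can turn a free-form title into a URL- or filename-friendly slug. This is a minimal sketch: `slugify` is a hypothetical name, and the rule "lowercase ASCII letters and digits joined by single hyphens" is an assumption, not a standard.

```shell
# slugify STRING (hypothetical helper)
# Lowercases STRING and replaces each run of other characters with one hyphen.
slugify() {
    slug=$(printf '%s' "$1" |
        LC_ALL=C tr '[:upper:]' '[:lower:]' |
        LC_ALL=C tr -cs 'a-z0-9' '-')
    # Trim one leading and one trailing hyphen left over from edge characters.
    slug=${slug#-}
    slug=${slug%-}
    printf '%s\n' "$slug"
}

slugify 'Hello, World! 2024'    # hello-world-2024
```

Because -s squeezes the replacement character, a sequence such as ", " collapses to a single hyphen rather than two.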
Special Characters
Handle tabs (printf is used because echo -e is not portable across shells):
printf 'hello\tworld\n' | tr '\t' ' '
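Replace literal backslashes (tr interprets escape sequences only in its sets, not in its input, so this prints hello/nworld):
printf '%s\n' 'hello\nworld' | tr '\\' '/'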
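Unicode handling (GNU tr has traditionally operated on single bytes, so multibyte UTF-8 characters such as é are not translated reliably; sed is a safer choice here):
echo "café" | sed 's/é/e/g'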
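The escape sequences tr accepts inside its sets include \t, \n, \\, and octal codes such as \000. The sketch below shows one use of each; the sample strings, including the Windows-style path, are invented for illustration.

```shell
# Sets are single-quoted so the backslashes reach tr unchanged.

# \t and \n: turn tab-separated fields into one field per line.
printf 'a\tb\tc\n' | tr '\t' '\n'

# \000 (octal): turn NUL-delimited records, such as those produced by
# find -print0, into newline-delimited ones.
printf 'x\000y\000z\000' | tr '\000' '\n'

# \\ : a literal backslash; here path separators become forward slashes.
printf '%s\n' 'C:\Users\demo' | tr '\\' '/'    # C:/Users/demo
```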
Troubleshooting
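- Character encoding issues: multibyte (UTF-8) characters may be processed byte by byte
- Locale-specific behavior: character classes (and, on some implementations, ranges) depend on LC_ALL, LC_CTYPE, and LC_COLLATE
- Special character handling: quote the sets so the shell does not expand brackets or backslashes
- Set length mismatches: when set2 is shorter than set1, the result differs between implementations (GNU tr repeats the last character of set2)
- Unexpected transformations: test on a small sample and compare input and output before processing real data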
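Set length mismatches are a frequent source of surprising output, so it helps to see the three cases side by side. The sketch below assumes GNU tr; the -t flag is a GNU extension, and other implementations may pad differently or reject it.

```shell
# set2 shorter than set1: GNU tr repeats the last character of set2,
# so c and d both become y.
printf 'abcd\n' | tr 'abcd' 'xy'       # xyyy

# -t truncates set1 to the length of set2 instead,
# leaving c and d untouched.
printf 'abcd\n' | tr -t 'abcd' 'xy'    # xycd

# set2 longer than set1: the extra characters are simply ignored.
printf 'abcd\n' | tr 'ab' 'xyz'        # xycd
```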
Security Applications
Sanitize input:
echo "$user_input" | tr -cd '[:alnum:]._-'
Remove dangerous characters:
echo "$filename" | tr -d '/<>:|*?"\\'
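An allow-list is usually easier to reason about than a list of dangerous characters. The sketch below wraps the allow-list approach in a hypothetical `safe_name` helper that also rejects results that are empty or refer to a directory; treat it as an illustration, not a complete defense.

```shell
# safe_name STRING (hypothetical helper)
# Keeps only ASCII letters, digits, dot, underscore, and hyphen.
# Fails when nothing usable is left or the result is "." or "..".
safe_name() {
    cleaned=$(printf '%s' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9._-')
    case $cleaned in
        '' | . | ..) return 1 ;;
    esac
    printf '%s\n' "$cleaned"
}

safe_name 'report (final).txt'        # reportfinal.txt
safe_name '../' || echo 'rejected'    # rejected
```

The hyphen is placed last in the set so that tr reads it literally rather than as part of a range.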
Performance Optimization
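Use character classes:
# Preferred: concise and locale-aware
tr '[:lower:]' '[:upper:]'
# Equivalent in the C locale but easier to mistype; speed should be essentially the same
tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
Combine operations:
# A single tr call avoids an extra process in the pipeline
echo "Hello World" | tr '[:upper:] ' '[:lower:]_'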
Real-world Examples
Log analysis:
grep ERROR /var/log/app.log | tr '[:upper:]' '[:lower:]' | sort | uniq -c
Data migration:
tr '|' ',' < old_format.txt | tr -s ' ' > new_format.csv
Text normalization:
tr -s '[:space:]' ' ' < document.txt | tr '[:upper:]' '[:lower:]'