wc
Overview
The wc (word count) command counts lines, words, characters, and bytes in files or input streams. It’s essential for text analysis and file statistics.
Syntax
wc [options] [file...]
Common Options
Option | Description |
---|---|
-l | Count lines only |
-w | Count words only |
-c | Count bytes only |
-m | Count characters only |
-L | Length of longest line |
--files0-from=file | Read null-separated filenames |
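The --files0-from option pairs naturally with find -print0, so that filenames containing spaces or newlines reach wc intact. The sketch below assumes GNU coreutils wc (the option is a GNU extension) and uses a scratch directory with made-up filenames:

```shell
# Build a scratch directory with a filename that contains a space.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/first file.txt"
printf 'c\n' > "$dir/second.txt"

# find emits NUL-separated names; "-" tells wc to read that list from stdin.
# With more than one file, the last output line is the total.
total=$(find "$dir" -name '*.txt' -print0 | wc -l --files0-from=- | tail -n 1 | awk '{print $1}')

echo "total lines: $total"
rm -rf "$dir"
```

On systems without GNU wc, find ... -print0 | xargs -0 wc -l is a common alternative.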
Default Output Format
Without options, wc shows:
lines words bytes filename
Example output:
42 156 892 file.txt
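In a script, the three default fields can be captured by feeding the file to wc on standard input, which leaves the filename column out. A minimal sketch, assuming a throwaway file created with mktemp (the variable names are illustrative):

```shell
# Create a small sample file (made-up content, for illustration only).
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# Reading from stdin omits the filename, leaving three numeric fields.
# read splits on whitespace, so any leading padding is discarded.
read -r lines words bytes <<EOF
$(wc < "$sample")
EOF

echo "lines=$lines words=$words bytes=$bytes"   # prints: lines=2 words=3 bytes=14
rm -f "$sample"
```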
Key Use Cases
- Count lines in files
- Analyze text statistics
- Monitor file growth
- Validate data processing
- Script automation
Examples with Explanations
Example 1: Basic Count
wc file.txt
Shows line, word, and byte counts
Example 2: Lines Only
wc -l file.txt
Shows only line count
Example 3: Multiple Files
wc *.txt
Shows counts for each .txt file, plus a final total line
Understanding Counts
- Lines: Number of newline characters
- Words: Sequences of non-whitespace characters
- Characters: Including multibyte characters
- Bytes: Raw byte count (may differ from characters)
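Because the line count is really a count of newline characters, a final line with no trailing newline is not counted. A short sketch that uses printf to control the input exactly:

```shell
# Two newline characters: two lines.
with_newline=$(printf 'a\nb\n' | wc -l | tr -d ' ')

# Only one newline character: the unterminated last line is not counted.
without_newline=$(printf 'a\nb' | wc -l | tr -d ' ')

# Words are split on whitespace, so runs of spaces or tabs do not add words.
word_total=$(printf 'alpha   beta\tgamma\n' | wc -w | tr -d ' ')

echo "$with_newline $without_newline $word_total"   # prints: 2 1 3
```

The tr -d ' ' step strips the leading padding that some wc implementations add.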
Common Usage Patterns
Count log entries:
wc -l /var/log/syslog
Monitor file growth:
watch "wc -l growing_file.log"
Pipeline counting:
ps aux | wc -l
Advanced Usage
Longest line length:
wc -L file.txt
Character vs byte count:
wc -m file.txt # characters
wc -c file.txt # bytes
Multiple file totals:
wc -l *.log
Pipeline Integration
Count command output:
ls | wc -l
Count unique lines:
sort file.txt | uniq | wc -l
Count pattern matches:
grep "error" log.txt | wc -l
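Note that grep piped into wc -l counts matching lines, not individual matches. When a line may contain the pattern more than once, grep -o prints each match on its own line, so wc -l then counts occurrences. A sketch with inline sample data (the log text is invented for illustration):

```shell
# Sample log text: the second line contains "error" twice.
log=$(printf 'error: disk full\nerror then another error\nall good\n')

# Lines that contain the pattern at least once.
matching_lines=$(printf '%s\n' "$log" | grep "error" | wc -l | tr -d ' ')

# Individual occurrences: -o emits one output line per match.
occurrences=$(printf '%s\n' "$log" | grep -o "error" | wc -l | tr -d ' ')

echo "lines=$matching_lines occurrences=$occurrences"   # prints: lines=2 occurrences=3
```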
Performance Analysis
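- Fast: makes a single sequential pass over the input
- Efficient for large files: run time grows roughly linearly with input size
- Minimal memory usage: input is processed as a stream rather than loaded whole
- Good pipeline performance: reads standard input when no file is given
- Streaming capability: counts are reported once the input reaches end of file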
Best Practices
- Use specific options for clarity
- Combine with other text tools
- Consider character encoding
- Use in scripts for validation
- Monitor with watch for real-time updates
Scripting Examples
File size validation:
if [ "$(wc -l < file.txt)" -gt 1000 ]; then
    echo "File too large"
fi
Progress monitoring:
TOTAL=$(wc -l < input.txt)
echo "Processing $TOTAL lines"
Log rotation trigger:
[ $(wc -l < logfile) -gt 10000 ] && logrotate config
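Threshold checks like these can be wrapped in a small reusable function. This is only a sketch: the name exceeds_line_limit and its arguments are hypothetical, not part of wc.

```shell
# exceeds_line_limit FILE LIMIT
# Succeeds (exit status 0) when FILE has more than LIMIT lines.
# Hypothetical helper for illustration.
exceeds_line_limit() {
    file=$1
    limit=$2
    # Redirecting the file into wc keeps the filename out of the output.
    count=$(wc -l < "$file") || return 2
    [ "$count" -gt "$limit" ]
}

# Usage with a temporary five-line file.
tmp=$(mktemp)
printf '1\n2\n3\n4\n5\n' > "$tmp"

if exceeds_line_limit "$tmp" 3; then
    result="too large"
else
    result="ok"
fi
echo "$result"   # prints: too large
rm -f "$tmp"
```

Returning a distinct status (2) when the file cannot be read lets callers tell "under the limit" apart from "could not count".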
Character Encoding
Difference between -c and -m:
- -c counts bytes
- -m counts characters (important for UTF-8)
Example with Unicode:
printf 'café' | wc -c # 5 bytes
printf 'café' | wc -m # 4 characters (in a UTF-8 locale)
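To see the distinction without depending on how the terminal encodes typed text, the bytes can be spelled out explicitly. The sketch below writes "café" as UTF-8 using octal escapes. The -m result depends on the active locale, so it is printed rather than assumed:

```shell
# "café" in UTF-8: c, a, f, then é as the two bytes 0xC3 0xA9 (octal 303 251).
# printf is used instead of echo so that no trailing newline is counted.
byte_count=$(printf 'caf\303\251' | wc -c | tr -d ' ')

# Character count depends on LC_CTYPE: 4 in a UTF-8 locale,
# 5 in the C/POSIX locale, where each byte counts as one character.
char_count=$(printf 'caf\303\251' | wc -m | tr -d ' ')

echo "bytes=$byte_count chars=$char_count (locale: ${LC_ALL:-${LC_CTYPE:-${LANG:-unset}}})"
```

If the two numbers match on text known to contain multibyte characters, the locale is the first thing to check.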
Common Patterns
Count non-empty lines:
grep -c "." file.txt
Count files in directory:
ls -1 | wc -l
Count unique users:
cut -d: -f1 /etc/passwd | sort -u | wc -l
Integration Examples
With find:
find . -name "*.py" -exec wc -l {} + | tail -1
With xargs:
find . -name "*.txt" -print0 | xargs -0 wc -l
Log analysis:
tail -f access.log | while read -r line; do
    echo "Total requests: $(wc -l < access.log)"
done
Troubleshooting
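- Binary files giving unexpected results: line and word counts are not meaningful for binary data, since newline and whitespace bytes occur arbitrarily; use -c for size
- Character encoding issues: -m depends on the current locale, so in the C locale a multibyte character is counted once per byte
- Very large files: counting lines or words requires reading the whole file, so run time scales with file size
- Empty files: reported as 0; a file whose last line lacks a trailing newline reports one line fewer than expected
- Permission problems: wc prints an error for each unreadable file and exits with a non-zero status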
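In scripts, several of these cases can be guarded against by checking that a file is readable before counting. A minimal sketch, with a hypothetical helper named safe_line_count:

```shell
# safe_line_count FILE
# Prints the line count, or returns non-zero with a message on stderr
# when FILE is missing or unreadable. Hypothetical helper for illustration.
safe_line_count() {
    if [ ! -r "$1" ]; then
        echo "cannot read: $1" >&2
        return 1
    fi
    wc -l < "$1" | tr -d ' '
}

empty=$(mktemp)                          # an empty file reports 0 lines
empty_count=$(safe_line_count "$empty")

# A path that does not exist fails the readability check.
if safe_line_count "$empty.does-not-exist" 2>/dev/null; then
    missing_status="readable"
else
    missing_status="unreadable"
fi

echo "empty=$empty_count missing=$missing_status"   # prints: empty=0 missing=unreadable
rm -f "$empty"
```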
Real-world Applications
Code metrics:
find . -name "*.py" -print0 | xargs -0 wc -l | tail -1
Data validation:
[ $(wc -l < data.csv) -eq $(wc -l < expected.csv) ]
Monitoring:
wc -l /var/log/messages | awk '{print $1}' > line_count.txt