sort

Overview

The sort command sorts the lines of one or more text files, or of standard input when no file is given, and writes the result to standard output. It supports lexical (the default), numeric, and field-based ordering.

Syntax

sort [options] [file...]

Common Options

Option      Description
-n          Numeric sort
-r          Reverse order
-u          Unique lines only
-f          Ignore case
-k field    Sort by field
-t char     Field separator
-o file     Output to file
-c          Check if sorted
-m          Merge sorted files
-s          Stable sort
-R          Random sort
-h          Human numeric sort
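
The -c and -m options are the least self-explanatory entries in the table, so here is a minimal sketch of both. The file names a.txt and b.txt and their contents are made up for illustration.

```shell
# Hypothetical sample data written to a temporary directory.
tmp=$(mktemp -d)
printf 'apple\ncherry\n' > "$tmp/a.txt"
printf 'banana\ndate\n' > "$tmp/b.txt"

# -c exits with status 0 when the input is already sorted, non-zero otherwise.
if LC_ALL=C sort -c "$tmp/a.txt"; then
    check_status=sorted
else
    check_status=unsorted
fi

# -m merges inputs that are each already sorted, without re-sorting them.
merged=$(LC_ALL=C sort -m "$tmp/a.txt" "$tmp/b.txt")

echo "$check_status"
echo "$merged"
rm -rf "$tmp"
```

Because -c only reports a status (and a diagnostic on the first out-of-order line), it is cheap to run before relying on -m.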

Sort Types

Type        Option     Description
Alphabetic  (default)  Default text sorting
Numeric     -n         Numerical value sorting
Human       -h         Human-readable numbers (1K, 2M)
Month       -M         Month name sorting
Version     -V         Version number sorting
Random      -R         Random order
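
As a rough illustration of three of these sort types, the sketch below feeds small made-up inputs through -h, -V, and -M. It assumes a sort that supports -h and -V (GNU sort does; other implementations may not).

```shell
# Sample values are illustrative. -h and -V are assumed to be available (GNU sort).
human=$(printf '2M\n1K\n512\n1G\n' | LC_ALL=C sort -h)
version=$(printf 'v1.10\nv1.2\nv1.9\n' | LC_ALL=C sort -V)
month=$(printf 'Mar\nJan\nFeb\n' | LC_ALL=C sort -M)

echo "$human"    # 512, 1K, 2M, 1G (one per line)
echo "$version"  # v1.2, v1.9, v1.10 (one per line)
echo "$month"    # Jan, Feb, Mar (one per line)
```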

Key Use Cases

  1. Sort text files
  2. Organize data
  3. Remove duplicates
  4. Prepare data for processing
  5. System administration tasks

Examples with Explanations

Example 1: Basic Sort

sort file.txt

Sorts lines alphabetically

Example 2: Numeric Sort

sort -n numbers.txt

Sorts numbers in numerical order

Example 3: Sort by Field

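sort -t ',' -k 2,2 data.csv

Sorts CSV by second field. The key is written as 2,2 because a bare -k 2 extends from field 2 to the end of the line. Note that sort splits on every comma, so quoted fields containing commas are not handled.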

Field-Based Sorting

Specify fields using -k:

  • -k 2 - Sort by field 2 through the end of the line
  • -k 2,2 - Sort by field 2 only
  • -k 2,4 - Sort by fields 2 through 4
  • -k 2,2n - Numeric sort on field 2
  • -k 2,2r - Reverse sort on field 2
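
To see how the end position of a key changes the result, the following sketch sorts the same made-up three-column data with an open-ended key, a key limited to one field, and a numeric key.

```shell
# Hypothetical three-column CSV data.
data=$(printf 'x,b,1\ny,a,9\nz,a,2\n')

# Key runs from field 2 to the end of the line, so "a,2" sorts before "a,9".
open_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t ',' -k 2)

# Key is field 2 only; the tie between the two "a" rows is broken by
# comparing the whole lines, so "y,a,9" sorts before "z,a,2".
closed_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t ',' -k 2,2)

# Numeric key restricted to field 3.
numeric_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t ',' -k 3,3n)

echo "$open_key"
echo "$closed_key"
echo "$numeric_key"
```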

Common Usage Patterns

  1. Remove duplicates:

    sort -u file.txt
  2. Sort and save:

    sort -o sorted.txt file.txt
  3. Multiple field sort:

    sort -k 1,1 -k 2,2n file.txt
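
A small worked example of a two-key sort, using invented name/score pairs: the first key orders by name, and the second key (written as 2,2n so the numeric comparison is limited to field 2) orders the scores within each name.

```shell
# Hypothetical whitespace-separated data: name, then score.
data=$(printf 'bob 10\nalice 9\nbob 2\nalice 30\n')

# Primary key: field 1 (lexical). Secondary key: field 2 (numeric).
by_name_then_score=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1,1 -k 2,2n)

echo "$by_name_then_score"
```

Without the n modifier on the second key, 30 would sort before 9 because the comparison would be character by character.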

Advanced Sorting

  1. Case-insensitive:

    sort -f file.txt
  2. Reverse numeric:

    sort -nr file.txt
  3. Month sorting:

    sort -M months.txt
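
The effect of -f and -nr is easiest to see side by side with the default. The sketch below uses made-up words and numbers and pins the locale to C so the case-sensitive result is predictable.

```shell
# Illustrative input; LC_ALL=C makes uppercase letters sort before lowercase.
words=$(printf 'banana\nCherry\napple\n')

case_sensitive=$(printf '%s\n' "$words" | LC_ALL=C sort)
case_insensitive=$(printf '%s\n' "$words" | LC_ALL=C sort -f)
reverse_numeric=$(printf '3\n10\n2\n' | LC_ALL=C sort -nr)

echo "$case_sensitive"    # Cherry, apple, banana (one per line)
echo "$case_insensitive"  # apple, banana, Cherry (one per line)
echo "$reverse_numeric"   # 10, 3, 2 (one per line)
```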

Performance Analysis

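  • Memory usage grows with input size until the sort buffer limit is reached
  • Inputs larger than the buffer are sorted externally, in chunks that are merged at the end
  • Use -S to specify buffer size, for example -S 1G (GNU sort)
  • Consider using --parallel=N on multi-core systems (GNU sort)
  • Temporary files for large sorts go to $TMPDIR or /tmp by default; use -T dir to choose another directory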
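
The tuning options above can be combined in one invocation. This is only a sketch with a tiny generated input and arbitrary values (16M buffer, 2 threads); it assumes GNU sort, since -S and --parallel are not available everywhere, and real values depend on the machine and data.

```shell
# Generate a small reverse-ordered input in a scratch directory (illustrative only).
tmp=$(mktemp -d)
seq 1000 -1 1 > "$tmp/numbers.txt"

# -S caps the in-memory buffer, --parallel sets the thread count,
# -T chooses where temporary files go, -o names the output file.
LC_ALL=C sort -n -S 16M --parallel=2 -T "$tmp" -o "$tmp/sorted.txt" "$tmp/numbers.txt"

first=$(head -n 1 "$tmp/sorted.txt")
last=$(tail -n 1 "$tmp/sorted.txt")
echo "$first $last"
rm -rf "$tmp"
```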

Additional Resources

Best Practices

  1. Use appropriate sort type
  2. Specify field separators clearly
  3. Test with small datasets first
  4. Consider memory limitations
  5. Use stable sort when needed
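
To illustrate when a stable sort matters, the sketch below sorts invented two-column data on the first field only. Without -s, lines with equal keys are ordered by a final whole-line comparison; with -s, they keep their input order.

```shell
# Hypothetical data: the rows for each letter arrive in a deliberate order.
data=$(printf 'b 2\na 1\nb 1\na 2\n')

# Default: ties on field 1 are broken by comparing the entire line.
default_order=$(printf '%s\n' "$data" | LC_ALL=C sort -k 1,1)

# Stable: ties on field 1 keep the original input order ("b 2" before "b 1").
stable_order=$(printf '%s\n' "$data" | LC_ALL=C sort -s -k 1,1)

echo "$default_order"
echo "$stable_order"
```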

Locale Considerations

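  • Sort order depends on the locale settings (LC_ALL, LC_COLLATE, LANG)
  • Use LC_ALL=C for consistent, byte-order results across systems
  • Consider character encoding: the input should match the encoding the locale expects
  • Collation rules vary by locale (for example, many locales interleave upper and lower case, while the C locale puts all uppercase letters first)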
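
A minimal sketch of pinning the locale for a single command, using made-up input. Only the C locale result is shown, because the order produced by other locales depends on what is installed on the system.

```shell
# Setting LC_ALL=C for just this command forces byte-order comparison:
# all uppercase ASCII letters sort before all lowercase ones.
c_order=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)

echo "$c_order"   # A, B, a, b (one per line)
```

Prefixing the variable to the command, rather than exporting it, keeps the rest of the shell session in its normal locale.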

Troubleshooting

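  1. Unexpected sort order: check the locale and whether -n, -h, or -V is needed
  2. Memory limitations: set a smaller buffer with -S, or point -T at a disk with free space
  3. Field separator issues: set -t explicitly and limit each key to one field (for example -k 2,2)
  4. Locale-related problems: rerun with LC_ALL=C and compare the output
  5. Large file handling: sort pieces separately and combine them with sort -m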
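
The most common cause of an unexpected order is sorting numbers as text. The sketch below, with made-up values, shows the same input sorted both ways.

```shell
# Lexical comparison goes character by character, so "100" sorts before "9".
lexical=$(printf '10\n9\n100\n' | LC_ALL=C sort)

# -n compares by numeric value.
numeric=$(printf '10\n9\n100\n' | LC_ALL=C sort -n)

echo "$lexical"   # 10, 100, 9 (one per line)
echo "$numeric"   # 9, 10, 100 (one per line)
```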

Integration Examples

  1. With pipes:

    cat file.txt | sort | uniq
  2. With find:

    find . -name "*.txt" | sort
  3. Log analysis:

    sort -t ' ' -k 4,4 access.log
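
A frequently used pipeline for log analysis counts how often each value occurs and ranks the counts. The sketch below uses invented request methods in place of a real log column; awk is used only to normalize the padding that uniq -c adds.

```shell
# Hypothetical request methods, one per line, standing in for a log column.
methods=$(printf 'GET\nPOST\nGET\nGET\nPOST\nPUT\n')

# sort groups identical lines, uniq -c counts each group,
# and sort -rn ranks the groups by count, highest first.
top=$(printf '%s\n' "$methods" | LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn | awk '{print $1, $2}')

echo "$top"
```

The first sort is required because uniq only collapses adjacent duplicate lines.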