cut

Overview

The cut command extracts specific columns or fields from lines of text. It’s useful for processing structured data like CSV files, logs, and delimited text.

Syntax

cut {-b list | -c list | -f list} [options] [file...]

One of -b, -c, or -f is required; with no file (or with -), cut reads standard input.

Common Options

Option                      Description
-f list                     Select fields
-d char                     Field delimiter (single character; default is TAB)
-c list                     Select characters
-b list                     Select bytes
-s                          Suppress lines without delimiters (only meaningful with -f)
--complement                Invert the selection (GNU extension)
--output-delimiter=string   Use string as the output delimiter (GNU extension)

Field/Character Lists

Format        Description
1             Field/character 1
1,3,5         Fields 1, 3, and 5
1-5           Fields 1 through 5
1-            Field 1 to end of line
-5            First 5 fields
1,3-5,7       Mixed single values and ranges
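The list formats above can be tried directly on a sample line (the seven-field input here is made up for illustration):

```shell
# A single line with seven comma-separated fields
line='f1,f2,f3,f4,f5,f6,f7'

echo "$line" | cut -d ',' -f 1       # f1
echo "$line" | cut -d ',' -f 1,3,5   # f1,f3,f5
echo "$line" | cut -d ',' -f 2-4     # f2,f3,f4
echo "$line" | cut -d ',' -f 5-      # f5,f6,f7
echo "$line" | cut -d ',' -f -3      # f1,f2,f3
```

Selected fields are printed joined by the input delimiter unless --output-delimiter overrides it.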

Key Use Cases

  1. Extract CSV columns
  2. Process log files
  3. Parse structured text
  4. Data extraction
  5. Text manipulation

Examples with Explanations

Example 1: Extract Fields

cut -f 1,3 -d ',' data.csv

Extracts fields 1 and 3 from a comma-delimited CSV file

Example 2: Extract Characters

cut -c 1-10 file.txt

Extracts first 10 characters from each line

Example 3: Custom Delimiter

cut -f 7 -d ':' /etc/passwd

Extracts each user's login shell (field 7) from the colon-delimited passwd file

Working with Different Delimiters

Common delimiters:

  • ,     Comma (CSV)
  • :     Colon (passwd, PATH)
  • \t    Tab (TSV; cut's default)
  • ' '   Space (logs)
  • |     Pipe
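Tab is the one delimiter that needs care on the command line. A short sketch (the $'\t' form assumes bash/zsh/ksh; the variable trick is portable):

```shell
# TAB is cut's default field delimiter, so TSV input needs no -d at all:
printf 'name\tage\tcity\n' | cut -f 2            # age

# To pass a tab explicitly, use ANSI-C quoting (bash/zsh/ksh):
printf 'name\tage\tcity\n' | cut -d $'\t' -f 1,3

# Portable alternative: store a literal tab in a variable first
TAB=$(printf '\t')
printf 'name\tage\tcity\n' | cut -d "$TAB" -f 1,3
```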

Common Usage Patterns

  1. Extract usernames:

    cut -f 1 -d ':' /etc/passwd
  2. Get file extensions (names without a dot pass through unchanged):

    ls | cut -d '.' -f 2-
  3. Process CSV data:

    cut -f 2,4,6 -d ',' data.csv

Advanced Operations

  1. Suppress delimiter-less lines:

    cut -f 1 -d ',' -s file.csv
  2. Change output delimiter:

    cut -f 1,2 -d ',' --output-delimiter='|' data.csv
  3. Complement selection:

    cut -f 1,3 --complement -d ',' data.csv
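The effect of -s and --output-delimiter is easiest to see side by side (the sample input is made up; --output-delimiter assumes GNU cut):

```shell
# The second input line contains no comma at all
printf 'a,b,c\nplain\n' | cut -d ',' -f 1       # prints: a, then plain
printf 'a,b,c\nplain\n' | cut -d ',' -f 1 -s    # prints only: a

# Rewrite the separator on output (GNU extension):
printf 'a,b,c\n' | cut -d ',' -f 1,3 --output-delimiter=' | '   # a | c
```

Without -s, a line that lacks the delimiter is printed whole, which often surprises people processing mixed input.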

Character vs Field Extraction

Character extraction (-c):

  • Extracts by fixed position
  • Useful for fixed-width data
  • In GNU cut, -c currently behaves like -b (byte positions), so multibyte characters can be split

Field extraction (-f):

  • Delimiter-based extraction
  • Handles variable-width fields
  • More flexible for structured data
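The contrast shows up clearly when both modes are applied to the same record (a made-up log line):

```shell
rec='2024-01-15 ERROR disk full'

# Character extraction: positions counted from column 1, delimiter ignored
echo "$rec" | cut -c 1-10       # 2024-01-15

# Field extraction: position found by splitting on the delimiter
echo "$rec" | cut -d ' ' -f 2   # ERROR
```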

Performance Analysis

  • Very fast operation
  • Minimal memory usage
  • Streaming operation
  • Efficient for large files
  • Good pipeline performance

Best Practices

  1. Specify delimiters explicitly
  2. Test field numbers with sample data
  3. Use character extraction for fixed-width data
  4. Consider using awk for complex operations
  5. Handle missing delimiters appropriately

Common Patterns

  1. Extract IP addresses (first field of a space-delimited access log):

    cut -f 1 -d ' ' access.log
  2. Get file sizes (character columns in ls -l output vary; adjust the range to your system):

    ls -l | cut -c 30-40
  3. Get the first entry of the PATH variable:

    echo "$PATH" | cut -f 1 -d ':'
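To see every PATH entry rather than just the first, one option is to select all fields and switch the output delimiter to a newline (GNU cut only; shown with a portable newline variable):

```shell
# One PATH component per line
NL='
'
echo "$PATH" | cut -d ':' -f 1- --output-delimiter="$NL"
```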

Integration Examples

  1. With sort and uniq:

    cut -f 1 -d ',' data.csv | sort | uniq -c
  2. With grep:

    grep "error" log.txt | cut -f 1 -d ' '
  3. Pipeline processing:

    cut -f 2,4 -d '|' data.txt | sort

Troubleshooting

  1. Wrong field numbers: fields are 1-indexed; verify against a sample line first
  2. Delimiter not found: lines without the delimiter print whole unless -s is given
  3. Character encoding issues: -c may split multibyte characters (byte-based in GNU cut)
  4. Empty fields: consecutive delimiters produce empty fields, which still count toward field numbers
  5. Multi-character delimiters: cut accepts a single character only; use awk instead
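For the last item, a minimal sketch of the awk fallback (the '::' separator is illustrative):

```shell
# cut takes a single-character delimiter only; awk's -F accepts a regex,
# so multi-character separators work there:
echo 'alpha::beta::gamma' | awk -F '::' '{ print $2 }'   # beta
```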