...
utility | default delimiter | how to change | example |
---|---|---|---|
cut | tab | -d or --delimiter option | cut -d ':' -f 1 /etc/passwd |
sort | whitespace (one or more spaces or tabs) | -t or --field-separator option | sort -t ':' -k1,1 /etc/passwd |
awk | whitespace (one or more spaces or tabs) for input; a single space for output | -F option for input; OFS variable for output | awk -F "\t" '{ print $1,$3 }' sampleinfo.txt |
join | one or more spaces | -t option | |
perl | whitespace (one or more spaces or tabs) when auto-splitting input with -a | -F'/<pattern>/' option | perl -F'/\t/' -ane 'print "$F[0]\t$F[2]\n";' sampleinfo.txt |
read | whitespace (one or more spaces or tabs) | IFS= variable assignment | see example above |
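Since the table has no example for join, here is a minimal sketch of changing the delimiter for join and for read. The file names (left.txt, right.txt) and their contents are made up for illustration; both files are created already sorted on the first field, which join requires.

```shell
# Work in a scratch directory so the sketch is self-contained.
tmpdir=$(mktemp -d)

# Two hypothetical colon-delimited files, both sorted on field 1.
printf 'a:1\nb:2\n' > "$tmpdir/left.txt"
printf 'a:x\nb:y\n' > "$tmpdir/right.txt"

# join: -t sets the field separator used for both input and output.
joined=$(join -t ':' "$tmpdir/left.txt" "$tmpdir/right.txt")
printf '%s\n' "$joined"

# read: assigning IFS on the same command line changes the delimiter
# for that one read only. This reads the first line of left.txt.
IFS=':' read -r key num < "$tmpdir/left.txt"
printf 'key=%s num=%s\n' "$key" "$num"

# Clean up the scratch directory.
rm -r "$tmpdir"
```

Putting the IFS assignment in front of read, rather than on a line of its own, leaves the shell's normal word splitting untouched for the rest of the script.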
cut versus awk
The basic functions of cut and awk are similar; both are field-oriented. Here are the main differences:
- Default field separators
  - Tab is the default field separator for cut
  - Whitespace (one or more spaces or tabs) is the default field separator for awk
- Re-ordering
  - cut cannot re-order fields: it always writes them in input order, whatever order you list them in
  - awk can re-order fields, writing them in the order you specify
- awk is a full-featured programming language, while cut is just a single-purpose utility.
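The first two differences are easy to see on a small input. This is only a sketch; the sample records and variable names are made up for illustration.

```shell
# A hypothetical tab-delimited record with three fields.
line=$(printf 'one\ttwo\tthree')

# cut ignores the order given to -f and emits fields in input order.
cut_out=$(printf '%s\n' "$line" | cut -f 3,1)
printf '%s\n' "$cut_out"     # one<TAB>three

# awk prints fields in the order requested; OFS keeps the output tab-delimited.
awk_out=$(printf '%s\n' "$line" | awk -F '\t' -v OFS='\t' '{ print $3, $1 }')
printf '%s\n' "$awk_out"     # three<TAB>one

# Default separators: a space inside a field splits it for awk but not for cut.
spaced=$(printf 'a b\tc')
cut_first=$(printf '%s\n' "$spaced" | cut -f 1)
awk_first=$(printf '%s\n' "$spaced" | awk '{ print $1 }')
printf 'cut: %s\nawk: %s\n' "$cut_first" "$awk_first"
```

With the last record, cut reports "a b" as the first field while awk reports just "a", which is worth remembering when fields can contain spaces.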
When to use these programs is partly a matter of taste. I often use either cut or awk to deal with field-oriented data. Even though awk is a full-featured programming language, I find its pattern matching and text processing facilities awkward (pun intended), and so prefer perl for complicated text manipulation.