So far we've examined a lot of "framework" issues (argument handling, stream handling, error handling) in a systematic way. This section presents various tips and tricks for actually manipulating data, which can be useful both in writing scripts and on the command line.

change shell text colors

If your terminal has a dark background, the default shell colors can be hard to read. Execute this line to display directory names in yellow (and put it in your ~/.profile login script to make the change permanent).

Code Block
languagebash
export LS_COLORS=$LS_COLORS:'di=1;33:'
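If you add the line to ~/.profile by script rather than by hand, it helps to avoid appending it twice. The following is a minimal sketch, not a required step; the function name add_profile_line is hypothetical and introduced here only for illustration.

```bash
# Hypothetical helper: append a line to a startup file only if it is not already there.
add_profile_line() {
    local file="$1" line="$2"
    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example call (the backslash keeps $LS_COLORS unexpanded until the file is sourced):
# add_profile_line ~/.profile "export LS_COLORS=\$LS_COLORS:'di=1;33:'"
```

Calling the helper a second time with the same arguments leaves the file unchanged.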

...

The ~/test/joblist.txt file you just symlink'd describes sequencing job/run pairs, tab-separated. We can use sort and uniq to collapse and count entries in the run name field (column 2):

Code Block
languagebash
cd ~/test
cut -f 2 ~/test/joblist.txt | sort | uniq | wc -l
# there are 1244 runs
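To see what each stage of that pipeline contributes, here is a small sketch on made-up data. The job and run names below are invented for illustration and are not taken from joblist.txt. It also shows sort -u, which sorts and removes duplicates in one step.

```bash
# Tiny tab-separated sample: job name, then run name (invented values).
sample=$(printf 'JA1\tSA1\nJA2\tSA1\nJA3\tSA2\nJA4\tSA3\nJA5\tSA1\n')

# uniq only collapses adjacent duplicate lines, so the run names are sorted first.
n_runs=$(printf '%s\n' "$sample" | cut -f 2 | sort | uniq | wc -l)

# sort -u sorts and de-duplicates in a single command.
n_runs_u=$(printf '%s\n' "$sample" | cut -f 2 | sort -u | wc -l)

# Both counts are 3 for this sample (SA1, SA2, SA3).
echo $((n_runs)) $((n_runs_u))
```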

The -c option to uniq adds a count field, prefixing each distinct line with the number of times it occurs.
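As an illustration of the count field, the sketch below runs uniq -c on a handful of invented run names (not from joblist.txt). The amount of padding before each count can differ between systems, so treat the exact spacing as an implementation detail.

```bash
# Invented run names, deliberately unsorted.
runs=$(printf 'SA2\nSA1\nSA1\nSA3\nSA1\n')

# Sort so that identical names are adjacent, then count each distinct name.
counts=$(printf '%s\n' "$runs" | sort | uniq -c)

# Each output line is "<count> <name>": 3 for SA1, 1 for SA2, 1 for SA3.
printf '%s\n' "$counts"
```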

exercise 1

How many unique job names are in the joblist.txt file?

Solution
Code Block
languagebash
cut -f 1 ~/test/joblist.txt | sort | uniq | wc -l
# there are 3842

Are all the job/run pairs unique?

Solution

Yes. Compare the unique lines of the file to the total lines.

Code Block
languagebash
cat joblist.txt | sort | uniq | wc -l
wc -l joblist.txt
# both are 3842
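Had the two counts differed, uniq -d would be one way to list the lines that occur more than once. A minimal sketch on invented job/run pairs (not from joblist.txt):

```bash
# Invented tab-separated job/run pairs; the JA1/SA1 pair appears twice.
pairs=$(printf 'JA1\tSA1\nJA2\tSA1\nJA1\tSA1\nJA3\tSA2\n')

# -d prints one copy of each line that is repeated in the sorted input.
dups=$(printf '%s\n' "$pairs" | sort | uniq -d)

printf '%s\n' "$dups"
```

For a file whose lines are all unique, the same pipeline prints nothing.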

Which run has the most jobs?

Solution

Add a count to the unique run lines, then sort on it numerically in reverse order. The first line will then be the run with the most lines (jobs).

Code Block
languagebash
cat joblist.txt | cut -f 2 | sort | uniq -c | sort -k1,1nr | head -1
# 23 SA13038
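The sort key is what makes this work: -k1,1 restricts the key to the first field (the count), n compares it as a number, and r reverses the order so the largest count comes first. The sketch below applies the same pipeline to invented run names and then splits the winning line into its two fields with awk.

```bash
# Invented run names: SA1 occurs 3 times, SA2 twice, SA3 once.
runs=$(printf 'SA2\nSA1\nSA1\nSA3\nSA1\nSA2\n')

# Count each run, then order by the count field, largest first, and keep the top line.
top=$(printf '%s\n' "$runs" | sort | uniq -c | sort -k1,1nr | head -1)

# awk splits on whitespace, which also discards the padding uniq -c adds.
top_count=$(printf '%s\n' "$top" | awk '{print $1}')
top_run=$(printf '%s\n' "$top" | awk '{print $2}')

echo "$top_run has $top_count jobs"
```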