...
So far we've examined a lot of "framework" issues – argument handling, stream handling, error handling – in a systematic way. This section presents various tips and tricks for actually manipulating data, which can be useful both in writing scripts and when working at the command line.
change shell text colors
If your terminal has a dark background, the default shell colors can be hard to read. Execute this line to display directory names in yellow (and put it in your ~/.profile login script so it applies to every new login):
Code Block |
---|
|
export LS_COLORS=$LS_COLORS:'di=1;33:' |
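In the value di=1;33, di selects directories, 1 means bold and 33 is the yellow foreground color. If you add the line to ~/.profile by hand more than once, the file ends up with duplicate entries. The sketch below shows one way to append it only when it is missing. The helper name add_profile_line is made up for this example, and the demonstration writes to a scratch file rather than your real ~/.profile.

```shell
# add_profile_line is a hypothetical helper name, not a standard command.
# It appends LINE to FILE only when FILE does not already contain that exact line.
add_profile_line() {
    file=$1
    line=$2
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Try it on a scratch file first; point it at ~/.profile once you are happy.
scratch=$(mktemp)
add_profile_line "$scratch" "export LS_COLORS=\$LS_COLORS:'di=1;33:'"
add_profile_line "$scratch" "export LS_COLORS=\$LS_COLORS:'di=1;33:'"   # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```

The backslash before $LS_COLORS keeps the variable unexpanded, so the text written to the file is the same line shown above.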
create multiple symbolic links
...
The ~/test/joblist.txt file you just symlink'd describes sequencing job/run pairs, tab-separated. We can use sort and uniq to collapse and count entries in the run name field (column 2):
Code Block |
---|
|
cd ~/test
cut -f 2 ~/test/joblist.txt | sort | uniq | wc -l
# there are 1244 runs |
The -c option to uniq adds a count field, showing how many times each distinct line occurred.
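To see what the count field looks like before trying it on the real file, here is a small sketch on made-up data. The three job/run pairs below are invented for illustration and are not taken from joblist.txt.

```shell
# Hypothetical sample data: job name <TAB> run name (not the real joblist.txt)
sample=$(mktemp)
printf 'jobA\trun1\njobB\trun1\njobC\trun2\n' > "$sample"

# uniq only collapses adjacent duplicate lines, so sort first;
# -c prefixes each distinct value with its number of occurrences
run_counts=$(cut -f 2 "$sample" | sort | uniq -c)
echo "$run_counts"

rm -f "$sample"
```

Each output line is a count followed by the run name: run1 appears twice and run2 once. The amount of leading whitespace before the count differs between systems.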
exercise 1
How many unique job names are in the joblist.txt file?
Expand |
---|
|
Code Block |
---|
| cut -f 1 ~/test/joblist.txt | sort | uniq | wc -l
# there are 3842 |
|
Are all the job/run pairs unique?
Expand |
---|
|
Yes. Compare the unique lines of the file to the total lines. Code Block |
---|
| cat joblist.txt | sort | uniq | wc -l
wc -l joblist.txt
# both are 3842 |
|
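If the two counts had differed, the -d option to uniq would show which lines were repeated, since it prints one copy of each line that occurs more than once. A minimal sketch on invented data (the pairs below are not from joblist.txt):

```shell
# Hypothetical sample with one repeated job/run pair
pairs=$(mktemp)
printf 'jobA\trun1\njobB\trun1\njobA\trun1\n' > "$pairs"

# sort makes identical lines adjacent; uniq -d then prints only the repeated ones
dups=$(sort "$pairs" | uniq -d)
echo "$dups"

rm -f "$pairs"
```

On a file where every line is unique, the same pipeline prints nothing.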
Which run has the most jobs?
Expand |
---|
|
Add a count to the unique run lines, then sort on it numerically in reverse order. The first line will then be the run with the most lines (jobs). Code Block |
---|
| cat joblist.txt | cut -f 2 | sort | uniq -c | sort -k1,1nr | head -1
# 23 SA13038 |
|
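As a cross-check, the same count can be produced in a single pass with awk instead of sort and uniq -c. This is only an alternative sketch, shown on invented data. Note that head -1 reports just one run even if several runs tie for the highest count.

```shell
# Hypothetical sample: run1 has three jobs, run2 has one
jobs_file=$(mktemp)
printf 'jobA\trun1\njobB\trun1\njobC\trun2\njobD\trun1\n' > "$jobs_file"

# Tally column 2 in an awk array, print "count run" for each run,
# then sort numerically in reverse on the count and keep the top line
top_run=$(awk -F'\t' '{ n[$2]++ } END { for (r in n) print n[r], r }' "$jobs_file" \
          | sort -k1,1nr | head -1)
echo "$top_run"

rm -f "$jobs_file"
```

Replacing head -1 with a larger number, such as head -5, makes ties at the top visible.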