...
Code Block |
---|
|
cd ~/test
cut -f 2 joblist.txt | sort | uniq | wc -l
# there are 1234 unique runs |
Job names are in column 1 of the ~/test/sampleinfo.txt file. Here's how to create a histogram of job names showing the count of samples (lines) for each. The -c option to uniq adds a count in front of each unique item, which we can then sort on (numerically, in reverse order) to show the jobs with the most samples first.
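As a minimal sketch of that histogram pipeline, the example below runs it on a small made-up file rather than the real sampleinfo.txt. The job names (JA1, JA2, JA3), the sample names, and the temporary file are all hypothetical stand-ins; only the cut, sort, and uniq -c steps come from the description above.

```shell
# Build a tiny tab-delimited stand-in for sampleinfo.txt (hypothetical data):
# job name in column 1, sample name in column 2.
tmpfile=$(mktemp)
printf 'JA1\tS1\nJA2\tS2\nJA1\tS3\nJA3\tS4\nJA1\tS5\nJA2\tS6\n' > "$tmpfile"

# Take column 1, sort so identical names are adjacent, count each name,
# then sort on the count (field 1) numerically, largest first.
histogram=$(cut -f 1 "$tmpfile" | sort | uniq -c | sort -k1,1nr)
echo "$histogram"

rm -f "$tmpfile"
```

With this sample data the first line of the histogram is JA1 with a count of 3. The sort before uniq matters because uniq only collapses adjacent identical lines.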
...
Expand |
---|
|
Code Block |
---|
| cut -f 1 joblist.txt | sort | uniq | wc -l
# there are 3841 unique jobs |
|
Are all the job/run pairs unique?
Expand |
---|
|
Yes. Compare the unique lines of the file to the total lines. Code Block |
---|
| cat joblist.txt | sort | uniq | wc -l
wc -l joblist.txt
# both are 3841 |
|
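The same comparison can be scripted so the two counts are checked automatically instead of by eye. This is only a sketch on a made-up three-line file: the job and run names and the temporary file are hypothetical, not taken from joblist.txt.

```shell
# Hypothetical stand-in for joblist.txt: job name in column 1, run name in column 2.
tmpfile=$(mktemp)
printf 'JA1\tSA1\nJA2\tSA1\nJA3\tSA2\n' > "$tmpfile"

# Arithmetic expansion strips the leading spaces some wc versions print.
total=$(( $(wc -l < "$tmpfile") ))                 # all lines
unique=$(( $(sort "$tmpfile" | uniq | wc -l) ))    # distinct lines only

# If the counts match, no job/run pair appears more than once.
if [ "$total" -eq "$unique" ]; then
    result="all pairs unique"
else
    result="$((total - unique)) duplicate line(s)"
fi
echo "$result"

rm -f "$tmpfile"
```

Reading the file through a redirect (wc -l < file) makes wc print only the number, without the file name, which keeps the comparison simple.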
Which run has the most jobs?
Expand |
---|
|
Add a count to the unique run lines, then sort on it numerically, in reverse order. The first line will then be the run with the most lines (jobs). |
Expand |
---|
|
Code Block |
---|
| cat joblist.txt | cut -f 2 | sort | uniq -c | sort -k1,1nr | head -1
# 23 SA13038 |
|
...