...

Code Block
languagebash
cd ~/test
cut -f 2 joblist.txt | sort | uniq | wc -l
# there are 1234 unique runs

Job names are in column 1 of the ~/test/sampleinfo.txt file. Here's how to create a histogram of job names showing the count of samples (lines) for each. The -c option to uniq adds a count of each unique item, which we can then sort on (numerically) to show the jobs with the most samples first.
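
As a minimal sketch of that histogram pattern, the example below builds a tiny stand-in file rather than using the course data. The file name demo_sampleinfo.txt and its job and sample names are made up for illustration; only the cut, sort and uniq -c steps come from the description above.

```shell
# Hypothetical stand-in data: tab-separated job name and sample name
tmpdir=$(mktemp -d)
printf 'jobA\ts1\njobB\ts2\njobA\ts3\njobC\ts4\njobA\ts5\njobB\ts6\n' > "$tmpdir/demo_sampleinfo.txt"

# Take column 1, sort so identical names are adjacent, count each name,
# then sort on the count field numerically, largest first
histogram=$(cut -f 1 "$tmpdir/demo_sampleinfo.txt" | sort | uniq -c | sort -k1,1nr)
echo "$histogram"

rm -r "$tmpdir"
```

The first sort matters because uniq only collapses adjacent duplicate lines. The count that uniq -c prints is padded with leading spaces, which the numeric sort ignores.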

...

Expand
titleSolution
Code Block
languagebash
cut -f 1 joblist.txt | sort | uniq | wc -l
# there are 3841 unique jobs

Are all the job/run pairs unique?

Expand
titleSolution

Yes. Compare the unique lines of the file to the total lines.

Code Block
languagebash
cat joblist.txt | sort | uniq | wc -l
wc -l joblist.txt
# both are 3841
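
If the two counts ever differ, uniq -d can show which lines are repeated. The sketch below assumes a small made-up file (demo_joblist.txt, with hypothetical job and run names) that deliberately contains one duplicated job/run pair.

```shell
# Hypothetical stand-in data: tab-separated job and run, with one repeated pair
tmpdir=$(mktemp -d)
printf 'job1\trunA\njob2\trunA\njob3\trunB\njob2\trunA\n' > "$tmpdir/demo_joblist.txt"

# Total lines versus unique lines
total=$(wc -l < "$tmpdir/demo_joblist.txt")
unique=$(sort "$tmpdir/demo_joblist.txt" | uniq | wc -l)
echo "total=$total unique=$unique"

# uniq -d prints one copy of each line that occurs more than once
dups=$(sort "$tmpdir/demo_joblist.txt" | uniq -d)
echo "duplicated: $dups"

rm -r "$tmpdir"
```

Here the total is larger than the unique count, and uniq -d names the repeated pair. On a file where all pairs are unique, uniq -d prints nothing.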

Which run has the most jobs?

Expand
titleHint

Add a count to the unique run lines, then sort on it numerically in reverse order. The first line will then be the run with the most lines (jobs).

Expand
titleSolution
Code Block
languagebash
cat joblist.txt | cut -f 2 | sort | uniq -c | sort -k1,1nr | head -1
# 23 SA13038
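
One caveat with head -1 is that it reports a single run even when several runs are tied for the highest count. The sketch below, which uses a made-up demo_joblist.txt with hypothetical job and run names, shows one possible way to list every tied run with awk.

```shell
# Hypothetical stand-in data: runA and runB each have 2 jobs, runC has 1
tmpdir=$(mktemp -d)
printf 'job1\trunA\njob2\trunB\njob3\trunA\njob4\trunB\njob5\trunC\n' > "$tmpdir/demo_joblist.txt"

# Same counting pipeline as in the solution, kept in a variable for reuse
counts=$(cut -f 2 "$tmpdir/demo_joblist.txt" | sort | uniq -c | sort -k1,1nr)

# head -1 shows only one of the tied runs
echo "$counts" | head -1

# awk remembers the count on the first line and prints every run with that count
top_runs=$(echo "$counts" | awk 'NR==1 {max=$1} $1==max {print $2}')
echo "$top_runs"

rm -r "$tmpdir"
```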

...