
...

The above example isn't particularly efficient, but it gets the job done.  In these examples, we lean heavily on the Linux "pipe," which is the symbol '|' (it probably shares a key with '\' on your keyboard, above the Enter key).  A pipe takes the "standard output" of one command, which would normally go to the screen, and sends it to the next command as its input.
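As a minimal sketch of what the pipe is doing, the two versions below give the same result.  The first parks the intermediate output in a temporary file; the second hands it straight to the next command.  The three-line input and the variable names are made up for illustration.

```shell
# Made-up sample input: three short lines, one of them repeated.
sample='GATTACA
CCCGGG
GATTACA'

# Without a pipe: write the output to a temporary file, then run sort on that file.
tmp_file=$(mktemp)
printf '%s\n' "$sample" > "$tmp_file"
without_pipe=$(sort -u "$tmp_file")
rm -f "$tmp_file"

# With a pipe: the output of printf becomes the input of sort directly.
with_pipe=$(printf '%s\n' "$sample" | sort -u)

echo "$with_pipe"
```

Both variables end up holding the same two unique lines, so the pipe saves us the temporary file and the cleanup.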

If we really like this output, we can save it to a file and look through it rather than just printing the first 10 lines:

Code Block
head -100000 $TEST_DATA/250k_reads.fastq | grep -A 1 '^@M00' | grep -v '^@M00' | grep -v '^--$' | sort | uniq -c | sort -n -r > my_file.txt
less my_file.txt

Less lets you page through files.  The up and down arrows move one line at a time.  'f' pages down, and 'w' pages up.  "Shift+G" takes you to the end of the file, and plain "g" takes you back to the beginning.  'q' exits the program.
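If you want to see what each stage of the counting pipeline above contributes, it can help to run it on a tiny file first.  The sketch below uses three made-up reads (the headers and sequences are invented, and only mimic the '@M00' header pattern that the grep looks for), so treat it as an illustration rather than real data.

```shell
# Build a tiny, made-up FASTQ file: 3 reads, 4 lines each.
fastq=$(mktemp)
cat > "$fastq" <<'EOF'
@M00001:1:000:1:1 1:N:0:1
ACGT
+
IIII
@M00001:1:000:1:2 1:N:0:1
TTTT
+
IIII
@M00001:1:000:1:3 1:N:0:1
ACGT
+
IIII
EOF

# Stage 1: each header line plus the line after it (the sequence).
# grep prints "--" between groups that are not adjacent in the file.
grep -A 1 '^@M00' "$fastq"

# Full pipeline: drop the headers and the "--" separators, then count
# how often each sequence occurs, most frequent first.
counts=$(grep -A 1 '^@M00' "$fastq" | grep -v '^@M00' | grep -v '^--$' | sort | uniq -c | sort -n -r)
echo "$counts"

rm -f "$fastq"
```

With this input, ACGT is counted twice and TTTT once.  Building the command up on a file this small makes it easy to check each step by eye before pointing it at 100,000 lines.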

 

Further reading and examples: Scott's list of linux one-liners

Software Modules

TACC has lots of software packages, but most of them are not in your environment.  TACC doesn't want to "bloat" your environment with a bunch of software that you don't use, so we put everything in modules.  Try this:

Code Block
module
 
module list
 
module avail
 
module key genomics
 
module show samtools
 
module load samtools
module list
 
module unload samtools
module list

If you have software that you use frequently, you can save all the modules that you currently have loaded as your default set:

Code Block
module save

Now, let's use some of our "one-liner" skills to see how many software modules there are at TACC related to life sciences.  For learning purposes (and it's also a common way to operate), let's build up our command one step at a time:

Code Block
module key biology chemistry genomics 2>&1 
module key biology chemistry genomics 2>&1 | grep -v "    " 
module key biology chemistry genomics 2>&1 | grep -v "    " | grep ":" 
module key biology chemistry genomics 2>&1 | grep -v "    " | grep ":" | grep "^ " 
module key biology chemistry genomics 2>&1 | grep -v "    " | grep ":" | grep "^ " | cut -d ':' -f 2 -s 
module key biology chemistry genomics 2>&1 | grep -v "    " | grep ":" | grep "^ " | cut -d ':' -f 2 -s | tr ',' '\n'
module key biology chemistry genomics 2>&1 | grep -v "    " | grep ":" | grep "^ " | cut -d ':' -f 2 -s | tr ',' '\n' | wc -l
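Every step above carries '2>&1'.  That is needed if, as we assume here, the module command writes its listing to "standard error" rather than "standard output": a pipe only passes along standard output, so '2>&1' folds standard error into it first.  The sketch below imitates this with a made-up function called noisy (a hypothetical stand-in, not a real command), whose two lines of text are invented placeholders.

```shell
# Hypothetical stand-in for a command that reports on standard error.
noisy() {
    echo "alpha: 1.0, 2.0" >&2
    echo "beta: 3.1" >&2
}

# Without the redirect, the text goes to the terminal and the pipe gets nothing.
missed=$(noisy | wc -l | tr -d ' ')

# With 2>&1, standard error is merged into standard output and reaches wc.
caught=$(noisy 2>&1 | wc -l | tr -d ' ')

# Same idea as the last step above: keep what follows the ':',
# put each comma-separated item on its own line, and count the lines.
versions=$(noisy 2>&1 | cut -d ':' -f 2 -s | tr ',' '\n' | wc -l | tr -d ' ')

echo "missed=$missed caught=$caught versions=$versions"
```

The trailing tr -d ' ' only strips the padding that some versions of wc put in front of the number.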

Can you come up with your own way of doing this that uses fewer steps?

 

Finally, AWK gives us tons of ability to format text.  AWK is really a little programming language.  Its big utility is that when it reads a line, it assigns the whole line to the variable $0, and then each "field" is assigned to the variables $1, $2, $3, etc.  Look at this one-liner:

Code Block
# Names shorter than 8 characters get an extra tab so the version columns line up; strftime() and systime() need gawk.
module keyword genomics 2>&1 | grep -v '^[A-Za-z0-9]' | grep -v '^---' | grep -v spider | grep -v '^$' | grep -v "    " | sed 's/,/ /g' | awk 'BEGIN {print "Module\t\tVersions\nList Updated\t"strftime("%B %d %Y",systime())", "} { prog=$1; prog_vers=$2 "\t" $3 "\t" $4; if (length(prog) < 8) prog=prog "\t"; print prog "\t" prog_vers ; }'
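For a smaller look at the field variables described above, here is a sketch on a single made-up line.  The name "toolA" and its version numbers are placeholders, not real module output.  By default AWK splits each line into fields wherever it finds spaces or tabs.

```shell
# Made-up input line: a placeholder name followed by two version numbers.
line='toolA 1.0 2.0'

# $0 is the whole line, $1 is the first field, NF is the number of fields.
whole=$(echo "$line" | awk '{ print $0 }')
first=$(echo "$line" | awk '{ print $1 }')
nfields=$(echo "$line" | awk '{ print NF }')

# Pad short names with an extra tab, the same trick the one-liner uses
# to keep the version columns lined up.
padded=$(echo "$line" | awk '{ name = $1; if (length(name) < 8) name = name "\t"; print name "\t" $2 }')

echo "$whole"
echo "$first"
echo "$nfields"
echo "$padded"
```

Note that in AWK, putting two values next to each other (name "\t") joins them together, while "name\t" inside the quotes would just be the literal word "name" followed by a tab.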