...

Expand
titleHint
Code Block
languagebash
cat --help
Expand
titleAnswer
Code Block
languagebash
cat -n small.fq | tail

piping

So what is that vertical bar ( | ) all about? It is the pipe symbol!
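
As a minimal sketch of what the pipe does, the example below uses printf to generate a few lines of text (a stand-in for a real file, so the data here is purely illustrative) and connects that output to the input of tail, then on to wc -l. The variable names last_two and n_lines are just for this illustration.

```shell
# Generate 5 lines of text; printf stands in for a command such as cat on a real file.
# The pipe sends that output to tail, which keeps only the last 2 lines.
last_two=$(printf 'line1\nline2\nline3\nline4\nline5\n' | tail -n 2)
echo "$last_two"

# Pipes can be chained: count how many lines tail passed along.
n_lines=$(printf 'line1\nline2\nline3\nline4\nline5\n' | tail -n 2 | wc -l)
echo "tail passed along $n_lines lines"
```

Each program only sees the text handed to it by the program on its left, so no intermediate file is needed.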

...

Code Block
languagebash
titleUncompressing output on the fly with gunzip -c
# make sure you're in your $SCRATCH/core_ngs/fastq_prep directory
cd $SCRATCH/core_ngs/fastq_prep

gunzip -c Sample_Yeast_L005_R1.cat.fastq.gz | more
gunzip -c Sample_Yeast_L005_R1.cat.fastq.gz | head
gunzip -c Sample_Yeast_L005_R1.cat.fastq.gz | tail
gunzip -c Sample_Yeast_L005_R1.cat.fastq.gz | tail -n +901 | head -8

# Note that less will display .gz file contents automatically
less -N Sample_Yeast_L005_R1.cat.fastq.gz

...

Code Block
languagebash
titleCounting lines with wc -l
zcat Sample_Yeast_L005_R1.cat.fastq.gz | more
zcat Sample_Yeast_L005_R1.cat.fastq.gz | less -N
zcat Sample_Yeast_L005_R1.cat.fastq.gz | head
zcat Sample_Yeast_L005_R1.cat.fastq.gz | tail
zcat Sample_Yeast_L005_R1.cat.fastq.gz | tail -n +901 | head -8

# include original line numbers
zcat Sample_Yeast_L005_R1.cat.fastq.gz | cat -n | tail -n +901 | head -8
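
To see what the tail -n +901 | head -8 combination selects without touching a large file, here is a small sketch that assumes seq 1 2000 as a stand-in for the uncompressed FASTQ lines (each "line" is just its own line number). The variable names window and numbered are hypothetical.

```shell
# seq 1 2000 stands in for the uncompressed FASTQ lines (illustrative data only).
# tail -n +901 starts output at line 901; head -8 keeps 8 lines (two 4-line records).
window=$(seq 1 2000 | tail -n +901 | head -8)
echo "$window"

# cat -n numbers the lines before they are cut, so the original numbering is kept.
numbered=$(seq 1 2000 | cat -n | tail -n +901 | head -8)
echo "$numbered"
```

Because cat -n runs before tail and head, the numbers it adds refer to positions in the full input rather than positions within the 8-line window.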
Tip

There will be times when you forget to pipe your large zcat or gunzip -c output somewhere – even the experienced among us still make this mistake! This leads to pages and pages of data spewing across your terminal.

If you're lucky you can kill the output with Ctrl-c. But if that doesn't work (and often it doesn't), just close your Terminal window. This terminates the process on the server (like hanging up the phone), and you can then log back in.

...

Code Block
languagebash
titleFor loop to count sequences in multiple FASTQs
for fname in *.gz; do
  echo "Processing $fname"
  echo "..$fname has $((`zcat $fname | wc -l` / 4)) sequences"
done

Here fname is the name I gave the loop variable. Each time through the loop it is assigned the next file name produced by the filename wildcard matching. The actual file is then referenced as $fname inside the loop.
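
The loop can be tried safely on toy data. The sketch below assumes two tiny gzipped FASTQ files with made-up names (demo_A.fastq.gz, demo_B.fastq.gz) and made-up reads, created in a temporary directory. It uses gunzip -c in place of zcat only for portability, and the report and n_seqs variables are hypothetical helpers for the illustration.

```shell
# Work in a throwaway directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build two tiny gzipped FASTQ files (hypothetical names and reads).
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip > demo_A.fastq.gz
printf '@r1\nGGCC\n+\nIIII\n' | gzip > demo_B.fastq.gz

# Each pass through the loop assigns the next matching file name to fname.
report=""
for fname in *.gz; do
  # A FASTQ record is 4 lines, so the sequence count is the line count divided by 4.
  n_seqs=$(( $(gunzip -c "$fname" | wc -l) / 4 ))
  report="$report$fname:$n_seqs "
  echo "..$fname has $n_seqs sequences"
done
```

Quoting "$fname" is a defensive habit: it keeps the loop working if a file name ever contains spaces.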

...

Code Block
languagebash
for <variable name> in <expression>; do 
  <something>
  <something else>
done
Tip

The bash shell lets you put multiple commands on one line if they are separated by semicolons ( ; ). So in the above for loop, you can see that bash considers the do keyword to start a separate command. Two alternate ways of writing the loop are:

Code Block
languagebash
# One line for each clause, no semicolons
for <variable name> in <expression>
do 
  <something>
  <something else>
done
Code Block
languagebash
# All on one line, with semicolons separating clauses
for <variable name> in <expression>; do <something>; <something else>; done
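
As a quick check that the three layouts behave the same, the sketch below runs one small loop in each form. The word list (alpha beta gamma) and the out1, out2 and out3 variables are made up for the illustration.

```shell
# Form 1: semicolon before do
out1=""
for word in alpha beta gamma; do
  out1="$out1$word "
done

# Form 2: one line for each clause, no semicolons
out2=""
for word in alpha beta gamma
do
  out2="$out2$word "
done

# Form 3: all on one line, with semicolons separating clauses
out3=""; for word in alpha beta gamma; do out3="$out3$word "; done

# All three variables now hold the same text.
echo "$out1"; echo "$out2"; echo "$out3"
```

Which form to use is a matter of readability: the one-line form is handy at the prompt, while the multi-line forms are easier to read in scripts.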