...

We'll set up a new directory to perform the V. cholerae data alignment. But first make sure you have the FASTQ file to align and the vibCho bowtie2 index:

Code Block
languagebash
# Get the FASTQ to align
mkdir -p $SCRATCH/core_ngs/alignment/fastq
cp $CORENGS/alignment/*fastq.gz $SCRATCH/core_ngs/alignment/fastq/

# Set up the bowtie2 index
mkdir -p $SCRATCH/core_ngs/references/bt2/vibCho
cp $CORENGS/idx/bt2/vibCho/*.*  $SCRATCH/core_ngs/references/bt2/vibCho/

Make sure you're in an idev session with the bowtie2 BioContainers module loaded:

Code Block
languagebash
idev -m 120 -A OTH21164  -N 1 -r CoreNGSday4
module load biocontainers
module load bowtie2

Now set up a directory to do this alignment, with symbolic links to the bowtie2 index directory and the directory containing the FASTQ to align:

...

  • -x  vibCho/vibCho.O395.fa – prefix path of index files
  • -U fq/cholera_rnaseq.fastq.gz – FASTQ file for single-end (Unpaired) alignment
  • -S cholera_rnaseq.sam – tells bowtie2 to report alignments in SAM format to the specified file
  • 2>&1 redirects standard error to standard output
    • while the alignment data is being written to the cholera_rnaseq.sam file, bowtie2 will report its progress to standard error.
  • | tee aln.log takes the bowtie2 progress output and pipes it to the tee program
    • tee takes its standard input and writes it to the specified file and also to standard output
    • that way, you can see the progress output now, but also save it to review later (or supply to MultiQC)
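The redirection pattern described in the list above can be tried without running bowtie2 at all. The sketch below uses a hypothetical stand-in function, `fake_aligner` (not part of bowtie2), which writes its "alignments" to a file and its progress messages to standard error. The file names `demo.sam` and `demo_aln.log` are also made up for illustration.

```bash
# Hypothetical stand-in for bowtie2: records go to the named output file,
# progress/summary messages go to standard error.
fake_aligner() {
    printf 'read1\t0\tchr1\t100\n' > "$1"          # "alignment" data written to the file
    echo "1 reads; of these:" >&2                  # progress written to stderr
    echo "100.00% overall alignment rate" >&2
}

# 2>&1 merges stderr into stdout so the pipe can carry it to tee;
# tee prints it to the terminal and also saves a copy in demo_aln.log.
fake_aligner demo.sam 2>&1 | tee demo_aln.log
```

After this runs, `demo.sam` holds only the alignment record, while `demo_aln.log` holds the two progress lines that were also shown on the terminal.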

...

Code Block
89006 reads; of these:
  89006 (100.00%) were unpaired; of these:
    5902 (6.63%) aligned 0 times
    51483 (57.84%) aligned exactly 1 time
    31621 (35.53%) aligned >1 times
93.37% overall alignment rate
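As a quick sanity check on how the overall rate relates to the counts, the arithmetic can be sketched with awk. This assumes the summary counts are 89006 total reads, 51483 aligned exactly once and 31621 aligned more than once. The variable names are illustrative only.

```bash
# Counts taken from the bowtie2 summary (assumed values, see lead-in)
total=89006
aligned_once=51483
aligned_multi=31621

# overall rate = reads aligned at least once / total reads
rate=$(awk -v t="$total" -v a="$aligned_once" -v m="$aligned_multi" \
    'BEGIN { printf "%.2f", 100 * (a + m) / t }')
echo "overall alignment rate: ${rate}%"
```

The result agrees with the last line of the summary: the overall rate is simply 100% minus the percentage of reads that aligned 0 times.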

When the job is complete you should have a cholera_rnaseq.sam file that you can examine using whatever commands you like. Remember that to process it further downstream, you should create a sorted, indexed BAM file from this SAM output.
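One way to produce that sorted, indexed BAM is sketched below. It assumes samtools is on your PATH (for example after `module load samtools`). The helper name `sam_to_sorted_bam` and the `.sorted.bam` output naming are hypothetical choices for this example, not something the course prescribes.

```bash
# Sketch: convert a SAM file to a coordinate-sorted, indexed BAM.
sam_to_sorted_bam() {
    sam="$1"
    bam="${sam%.sam}.sorted.bam"                 # hypothetical output name
    samtools sort -o "$bam" "$sam" || return 1   # sort by coordinate, write BAM
    samtools index "$bam"                        # writes the index next to the BAM
}

# Only run when both the tool and the input file are present.
if command -v samtools >/dev/null 2>&1 && [ -f cholera_rnaseq.sam ]; then
    sam_to_sorted_bam cholera_rnaseq.sam
else
    echo "samtools or cholera_rnaseq.sam not found; skipping" >&2
fi
```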

...

Answer


Code Block
languagebash
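# Load samtools (bowtie2 should already be loaded)
module load samtools
cd $SCRATCH/core_ngs/alignment/vibCho
# Local alignment: progress goes to aln_local.log, while the SAM output
# is piped to samtools, which converts it to BAM
bowtie2 --local -x vibCho/vibCho.O395 -U fq/cholera_rnaseq.fastq.gz 2>aln_local.log | \
  samtools view -b > cholera_rnaseq.local.bam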

This reports the following alignment statistics (saved in aln_local.log):

Code Block
89006 reads; of these:
  89006 (100.00%) were unpaired; of these:
    13359 (15.01%) aligned 0 times
    46173 (51.88%) aligned exactly 1 time
    29474 (33.11%) aligned >1 times
84.99% overall alignment rate

Interestingly, the local alignment rate here is lower than we saw with the global alignment. Usually local alignments have higher alignment rates than corresponding global ones.
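To compare the two runs side by side, the overall rate line can be pulled out of each log. The sketch below uses stand-in log files (`demo_global.log` and `demo_local.log`, hypothetical names) that contain only the final summary line, with 93.37% assumed for the global run and 84.99% for the local run. With real logs you would point the loop at your own files instead.

```bash
# Stand-in log files holding only the final summary line from each run
printf '93.37%% overall alignment rate\n' > demo_global.log
printf '84.99%% overall alignment rate\n' > demo_local.log

for log in demo_global.log demo_local.log; do
    # the first field of the summary line is the percentage
    rate=$(awk '/overall alignment rate/ { print $1 }' "$log")
    echo "$log: $rate"
done
```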

Exercise #5: BWA-MEM - Human mRNA-seq

...