Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Added end of line carriate to bwa mem call

...

Code Block
languagebash
titleStart an idev session
idev -m 120 -A OTH21164 -N 1 -r CoreNGS-Thu      # or -A TRA23004 
# or
idev -m 90 -A OTH21164 -N 1 -p development   # or -A TRA23004 

Go ahead and load the bowtie2 module so we can examine some help pages and options.

...

Code Block
languagebash
idev -m 120 -A OTH21164  -N 1 -r CoreNGS-Thu    # or -A TRA23004
# or
idev -m 90 -A OTH21164 -N 1 -p development  # or -A TRA23004

module load biocontainers
module load bowtie2

...

Expand
titleAnswer


Code Block
languagebash
module load samtools
cd $SCRATCH/core_ngs/alignment/vibCho
bowtie2 --local -x vibCho/vibCho.O395 -U fq/cholera_rnaseq.fastq.gz 2>aln_local.log | \
 | samtools view -b > cholera_rnaseq.local.bam

Reports these alignment statistics:

Code Block
89006 reads; of these:
  89006 (100.00%) were unpaired; of these:
    13359 (15.01%) aligned 0 times
    46173 (51.88%) aligned exactly 1 time
    29474 (33.11%) aligned >1 times
84.99% overall alignment rate

Interestingly, the local alignment rate here is lower than we saw with the global alignment. Usually local alignments have higher alignment rates than corresponding global ones.

...

Using bwa mem for RNA-seq alignment is sort of a "poor man's" RNA-seq alignment method. Real splice-aware aligners like tophat2STAR, hisat2 or STAR tophat have more complex algorithms (as shown below) – and take a lot more time!

...

BWA MEM does not know about the exon structure of the genome. But it can align different sub-sections of a read to two different locations, producing two alignment records from one input read (one . One of the two will be marked as secondary (0x100 flag).

...

Code Block
languagebash
titleRun multiple alignments using the TACC batch system
# Make sure you're in an idev session
idev -m 120 -N 1 -A OTH21164 -r CoreNGS-Thu      # or -A TRA23004
# or
idev -m 90 -N 1 -A OTH21164 -p development   # or -A TRA23004

# Load the modules we'll need
module load biocontainers
module load bwa
module load samtools

# Copy over the FASTQ data if needed
mkdir -p $SCRATCH/core_ngs/alignment/fastq
cp $CORENGS/alignment/*.gz $SCRATCH/core_ngs/alignment/fastq/

# Make a new alignment directory for running these scripts
cds
mkdir -p core_ngs/alignment/bwamem
cd core_ngs/alignment/bwamem
ln -sf ../fastq
ln -sf /work/projects/BioITeam/ref_genome/bwa/bwtsw/hg38

Now take a look at bwa mem usage (type bwa mem with no arguments, or bwa mem 2>&1 | more). The most important parameters are the following:

...

Code Block
languagebash
cd $SCRATCH/core_ngs/alignment/bwamem
bwa mem -M hg38/hg38.fa fastq/human_rnaseq.fastq.gz 2>hs_rna.bwamem.log | \
  samtools view -b | \
  samtools sort -O BAM -T human_rnaseq.tmp > human_rnaseq.sort.bam

...