...
alignment type | aligner options | pros | cons |
---|
global with bwa | single end reads:
- bwa aln <R1>
- bwa samse
paired end reads:
- bwa aln <R1>
- bwa aln <R2>
- bwa sampe
| - simple to use (take default options)
- good for basic global alignment
| |
global with bowtie2 | bowtie2 | - extremely configurable
- can be used for RNAseq alignment (after adapter trimming) because of its many options
| |
local with bwa | bwa mem | - simple to use (take default options)
- very fast
- no adapter trimming needed
- good for simple RNAseq analysis
- the secondary alignments it reports can provide splice junction information
| - always reports secondary alignments
- these must be filtered out if not desired
|
local with bowtie2 | bowtie2 --local | - extremely configurable
- no adapter trimming needed
- good for small RNA alignment because of its many options
| |
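The aligner options in the table can be sketched as concrete command lines. The script below is only a dry run: it prints the commands for paired-end reads rather than running them. The file names (`ref.fa`, `reads_R1.fastq.gz`, `reads_R2.fastq.gz`, `out.sam`), the bowtie2 index prefix `ref_bt2`, and the helper function `alignment_commands` are all hypothetical placeholders, and the reference is assumed to have been indexed already.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the commands for each alignment style.
# Every file name and index prefix here is a hypothetical placeholder.
REF=ref.fa              # reference FASTA, assumed already indexed with "bwa index"
BT2_IDX=ref_bt2         # bowtie2 index prefix, assumed built with "bowtie2-build"
R1=reads_R1.fastq.gz    # paired-end read 1
R2=reads_R2.fastq.gz    # paired-end read 2

alignment_commands() {
  case "$1" in
    bwa-global)
      # bwa aln is run once per read file, then bwa sampe pairs the results
      echo "bwa aln $REF $R1 > R1.sai"
      echo "bwa aln $REF $R2 > R2.sai"
      echo "bwa sampe $REF R1.sai R2.sai $R1 $R2 > out.sam"
      ;;
    bwa-local)
      echo "bwa mem $REF $R1 $R2 > out.sam"
      ;;
    bowtie2-global)
      echo "bowtie2 -x $BT2_IDX -1 $R1 -2 $R2 -S out.sam"
      ;;
    bowtie2-local)
      echo "bowtie2 --local -x $BT2_IDX -1 $R1 -2 $R2 -S out.sam"
      ;;
    *)
      echo "unknown alignment style: $1" >&2
      return 1
      ;;
  esac
}

alignment_commands bwa-global
```

Printing the commands first makes it easy to compare the four styles side by side before committing compute time to any of them.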
...
We're going to skip the trimming step for now and see how it goes. We'll perform steps 2 - 5 now, leaving samtools for a later exercise since steps 6 - 10 are common to nearly all post-alignment workflows.
...
Code Block |
---|
language | bash |
---|
title | Start an idev session |
---|
|
idev -m 180 -N 1 -A OTH21164 -r CoreNGS # or -A TRA23004
idev -m 120 -N 1 -A OTH21164 -p development # or -A TRA23004 |
Code Block |
---|
|
module load biocontainers # takes a while
module load bwa
bwa |
...
Expand |
---|
|
The last few lines of bwa's execution output should look something like this: Code Block |
---|
| [bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 50.76 sec
[bwa_aln_core] write to the disk... 0.07 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 50.35 sec
[bwa_aln_core] write to the disk... 0.07 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 13.64 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 592180 sequences have been processed.
[main] Version: 0.7.17-r1188
[main] CMD: /usr/local/bin/bwa aln sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R1.cat.fastq.gz
[main] Real time: 85.185584 sec; CPU: 83.598825 sec
|
So this alignment took ~85 seconds (~1.4 minutes). |
Since you have your own private compute node, you can use all its resources. It has 128 cores, so re-run the R2 alignment asking for 60 execution threads.
...
Expand |
---|
|
The last few lines of bwa's execution output should look something like this: Code Block |
---|
| [bwa_aln] 17bp reads: max_diff = 2
[bwa_aln] 38bp reads: max_diff = 3
[bwa_aln] 64bp reads: max_diff = 4
[bwa_aln] 93bp reads: max_diff = 5
[bwa_aln] 124bp reads: max_diff = 6
[bwa_aln] 157bp reads: max_diff = 7
[bwa_aln] 190bp reads: max_diff = 8
[bwa_aln] 225bp reads: max_diff = 9
[bwa_aln_core] calculate SA coordinate... 266.70 sec
[bwa_aln_core] write to the disk... 0.04 sec
[bwa_aln_core] 262144 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 268.94 sec
[bwa_aln_core] write to the disk... 0.03 sec
[bwa_aln_core] 524288 sequences have been processed.
[bwa_aln_core] calculate SA coordinate... 72.26 sec
[bwa_aln_core] write to the disk... 0.01 sec
[bwa_aln_core] 592180 sequences have been processed.
[main] Version: 0.7.17-r1188
[main] CMD: /usr/local/bin/bwa aln -t 60 sacCer3/sacCer3.fa fastq/Sample_Yeast_L005_R2.cat.fastq.gz
[main] Real time: 7.013931 sec; CPU: 179.813153 sec |
So the R2 alignment took only ~8 seconds (real time), more than 10 times as fast as with only one processing thread. Note, though, that the CPU time with 60 threads was greater (~180 sec) than with only 1 thread (~85 sec). That's because of the thread management overhead when using multiple threads. |
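The comparison above can be made explicit with a little arithmetic. The sketch below computes the speedup (single-thread real time divided by multi-thread real time) and the parallel efficiency (speedup divided by thread count). The helper name `speedup` is hypothetical, and the inputs are the rounded real times quoted above, so the results are approximate.

```shell
#!/usr/bin/env bash
# speedup <real_time_1_thread> <real_time_N_threads> <N>
# Prints the speedup and the parallel efficiency (speedup / N, as a percentage).
speedup() {
  awk -v t1="$1" -v tn="$2" -v n="$3" 'BEGIN {
    s = t1 / tn
    printf "speedup: %.1fx, parallel efficiency: %.0f%%\n", s, 100 * s / n
  }'
}

# Rounded real times from the two runs: ~85 sec with 1 thread, ~8 sec with 60 threads
speedup 85 8 60
```

With these inputs the speedup comes out a little above 10x while the efficiency is under 20%, which is consistent with the higher total CPU time seen in the 60-thread run: each added thread contributes less than a full thread's worth of useful work.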
...