...
Expand | ||
---|---|---|
| ||
The first argument is the reference FASTA. The second argument is the "base" file name to use for the created index files. It will create a bunch of files beginning bowtie/NC_012967.1*. |
...
Warning | |||||||
---|---|---|---|---|---|---|---|
| |||||||
Create a
|
...
Still, you should recognize some of the information on a line in a SAM file from the input FASTQ, and some of the other information is relatively straightforward to understand, like the position where the read mapped. Give this a try:
Code Block |
---|
head bowtie2/SRR030257.sam |
What do you think the 4th and 8th columns mean?
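To look at just those columns, a small helper like the one below may be handy. It is only a sketch: `sam_peek` is a hypothetical name, not part of any tool, and it assumes the file is a plain-text, tab-delimited SAM file whose header lines begin with `@`.

```shell
# sam_peek FILE [N]
# Print the read name (column 1) plus columns 4 and 8 for the first N
# alignment lines (default 10), skipping header lines that start with "@".
sam_peek() {
    grep -v '^@' "$1" | head -n "${2:-10}" | cut -f 1,4,8
}
```

Call it on the same SAM file you gave to `head` above, optionally followed by the number of lines you want to see.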
...
...
...
Multithreaded execution
We have actually massively under-utilized Lonestar in this example. We submitted a job that reserved a single node on the cluster, but that node has 12 processors. Bowtie was only using one of those processors (a single "thread")! For programs that support multithreaded execution (and most mappers do because they are obsessed with speed) we could have sped things up by using all 12 processors for the bowtie process.
Expand | ||||
---|---|---|---|---|
| ||||
It's
Try it out and compare the speed of execution by looking at the log files. |
...
One consequence of using multithreading that might be confusing is that the aligned reads might appear in your output SAM file in a different order than they were in the input FASTQ. This happens because small sets of reads get continuously packaged, "sent" to the different processors, and whichever set "returns" fastest is written first. You can force them to appear in the same order (at a slight cost in speed) by adding the --reorder flag to your command.
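As a rough sketch of what a multithreaded, order-preserving run could look like, the snippet below assembles a command and prints it without running it. It assumes you are using bowtie2, where `-p` sets the number of threads. The index base name, FASTQ names, and output path are hypothetical placeholders, so substitute your own.

```shell
# Sketch only: bowtie2 and every file name below are assumptions.
THREADS=12                     # one thread per processor on the node described above
INDEX=bowtie2/NC_012967.1      # hypothetical index base name
READS_1=SRR030257_1.fastq      # hypothetical paired-end input files
READS_2=SRR030257_2.fastq
OUT=bowtie2/SRR030257.sam      # hypothetical output SAM file

# -p sets the number of alignment threads;
# --reorder writes SAM records in the same order as the input reads.
CMD="bowtie2 -p $THREADS --reorder -x $INDEX -1 $READS_1 -2 $READS_2 -S $OUT"

# Print the command for review instead of running it.
echo "$CMD"
```

Once the printed command looks right, put it in your job submission in place of the single-threaded one.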
Mapping with BWA
BWA (the Burrows-Wheeler Aligner) is another fast mapping program. It's the successor to another aligner you might have used or heard of called MAQ (Mapping and Assembly with Quality).
...
Code Block |
---|
module load bwa |
There could be multiple versions of BWA on TACC, and this command loads the default one. You might want to check which one you have loaded for when you write up your awesome publication that was made possible by your analysis of next-gen sequencing data.
How could you check to see what version you are using to write in the materials and methods of your paper?
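One possible approach, sketched here, relies on the fact that running `bwa` with no arguments prints a usage message that includes a `Version:` line (it goes to standard error). The helper name `bwa_version` is made up for this example. On systems that use environment modules, `module list` should also show which bwa module is currently loaded.

```shell
# bwa_version
# Print only the version string from bwa's usage message.
# The usage text is written to stderr, so redirect it into the pipe.
bwa_version() {
    bwa 2>&1 | awk '/^Version:/ {print $2}'
}
```

Running `bwa_version` after `module load bwa` gives you a string you can paste straight into your methods section.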
Create a fresh output directory, so that we don't write over the output from bowtie. Be sure you are back in your main intro_to_mapping directory. Then:
Code Block |
---|
mkdir bwa |
...
BWA doesn't give you a choice of where to create your index files. It creates them in the same directory as the FASTA that you input. So copy the FASTA in your intro_to_mapping directory to your new bwa directory:
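If you find yourself doing this for several references, the copy step can be wrapped in a small helper. This is a hedged sketch: `stage_reference` is a hypothetical function, and the FASTA name in the usage comment is an assumed example rather than a confirmed file name.

```shell
# stage_reference FASTA OUTDIR
# Copy a reference FASTA into OUTDIR (creating it if needed) and print the
# path of the copy, which is where BWA will write its index files.
stage_reference() {
    fasta=$1
    outdir=$2
    mkdir -p "$outdir"
    cp "$fasta" "$outdir/"
    echo "$outdir/$(basename "$fasta")"
}

# Example usage (file name is an assumption):
#   ref=$(stage_reference NC_012967.1.fasta bwa)
#   bwa index "$ref"
```

Capturing the returned path in a variable means the later `bwa index` call always points at the copy inside the output directory, so the index files land there and not next to the original.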
...