Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Info
titleyou may be wondering why creating an index is a common first step for many different mapping programs.

Like an index for a book (in the olden days before Kindles and Nooks), creating an index for a computer database allows quick access to any "record" given a short "key". In the case of mapping programs, creating an index for a reference sequence allows it to more rapidly place a read on that sequence at a location where it knows at least a piece of the read matches perfectly or with only a few mismatches. By jumping right to these spots in the genome, rather than trying to fully align the read to every place in the genome, it saves a ton of time.

Indexing is a separate step in running most mapping programs because it can take a LONG time if you are indexing a very large genome (like our own overly complicated human genome). Furthermore, you only need to index a genome sequence once, no matter how many samples you want to map. Keeping it as a separate step means that you can skip it later when you want to align a new sample to the same reference sequence.


Finally, it's time to map the reads! The command you need is: 

bowtie2
Code Block
languagebash
titleAgain, try reading the help for the bowtie2 command to figure out how to run the command yourself.

...

Remember these are paired-end reads.

...

collapsetrue
bowtie2
Warning
titleIMPORTANT

This command can take a while (~5 minutes) and is extremely taxing. This is longer than we want to run a job on the head node (especially when all of us are doing it at once). In fact, in previous years, TACC has noticed the spike in usage when multiple students forgot to make sure they were on idev nodes and complained pretty forcefully to us about it. Let's not have this be one of those years. Use the showq -u command to make sure you are on an idev node. If you are not sure if you are on an idev node, speak up on zoom and I'll show(q) -u what you are looking for. Yes, your instructor likes bad puns. My apologies.

If you are not on an idev node, and need help to relaunch it, click over to the idev tutorial.

...

  • In the bowtie2 example, we mapped in --local mode. Try mapping in --end-to-end mode (aka global mode).

  • Do the BWA tutorial so you can compare their outputs.
    • Did bowtie2 or BWA map more reads?
    • In our examples, we mapped in paired-end mode. Try to figure out how to map the reads in single-end mode and create this output.
    • Which aligner took less time to run? Are there any options you can change that:
      • Lead to a larger percentage of the reads being mapped? (increase sensitivity)
      • Speed up run time without causing many fewer reads to be mapped? (increase performance)

...