...
- c1_r1, c1_r2, c1_r3 from the first biological condition
- c2_r1, c2_r2, and c2_r3 from the second biological condition
Introduction
HISAT2 is a fast, transcriptome-aware mapper that is the fastest spliced mapper currently available. It is part of the new tuxedo suite of tools, and it maps RNA-Seq data to the genome as well as identifying splice junctions. HISAT2, like BWA and Bowtie, uses the Burrows-Wheeler transform (BWT) to compress genomes so that they require very little memory to store. Like BWA and Bowtie, it builds indexes out of the transformed genomes using a special scheme called FM indexing, which makes it possible to search through these genomes rapidly. Unlike BWA and Bowtie, HISAT2 builds one whole-genome global index and tens of thousands of small local indexes (both using the BWT/FM methods) to make spliced alignment possible. Despite the many indexes, because it uses BWT and FM indexing, the indexes have a very small memory footprint (~5 GB of RAM for the whole human genome), making it possible to run HISAT2 on a standard laptop.
With the human genome, for example, HISAT2 builds one global index and 48,000 local indexes (each 64,000 bp long). The local indexes are large enough that 90% of introns fall within a single local index (on average, human introns are >6 kb long).
First, the longer part of a read that maps to the genome contiguously (called the anchor) is mapped using the global index. The position of this anchor identifies the relevant local index. HISAT2 can then usually align the remaining part of the read (the small anchor) within that single local index rather than searching across the whole genome.
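The index figures above can be checked with a little shell arithmetic. The sketch below multiplies the number of local indexes by their length, then shows how an anchor position could be turned into a local index number. It assumes equal, non-overlapping 64,000 bp windows laid end to end along the genome, which is a simplification for illustration and not necessarily how HISAT2 lays out its indexes internally. The function name `local_index_for` and the anchor position are hypothetical.

```shell
#!/usr/bin/env bash
# Figures quoted in the text above.
n_local=48000      # number of local indexes
local_len=64000    # bases covered by each local index

# Total bases covered by all local indexes together.
covered=$((n_local * local_len))
echo "bases covered by local indexes: ${covered}"   # 3072000000, roughly the human genome

# ASSUMPTION: windows are equal-sized and non-overlapping (illustration only).
# Given a 0-based genome position, return the number of the window holding it.
local_index_for() {
  echo $(( $1 / local_len ))
}

anchor_pos=1234567   # hypothetical position where the anchor mapped
echo "local index for position ${anchor_pos}: $(local_index_for "$anchor_pos")"   # 19
```

The point of the second step is that once the anchor is placed, the search for the rest of the read is confined to one small window instead of about three billion bases.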
Run HISAT2
First, make sure you are in the right directory for this exercise.
```shell
cds
cd my_rnaseq_course
cd day_1_partB2/hisat_exercise
ls
```
Next, check whether HISAT2 is available as a module on Stampede.
```shell
module spider hisat
```
...
```shell
# If not loaded already, load the biocontainers module
module spider hisat2
module load hisat2/ctr-2.1.0--py36pl5.22.0_0
```
```shell
hisat2
```
Part 1. Create an index of your reference
...
Warning
Create a ...
HISAT2 output
1. SAM file: HISAT2 alignment output in standard SAM format.
...
3. Log file: contains the alignment summary.
```shell
ls results
head results/GSM794483_C1.sam
head results/GSM794483_C1.junctions
```
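Because the output is standard SAM, `head` mostly shows header lines, which begin with `@`. As a minimal sketch, the snippet below builds a tiny made-up SAM file (the reads, reference name, and temporary path are hypothetical, not course data) and counts alignment records while skipping the header. It also counts mapped reads by testing the SAM flag bit 4, which marks a read as unmapped.

```shell
#!/usr/bin/env bash
# Hypothetical toy SAM file, written with printf so the fields are tab-separated.
sam=$(mktemp)
{
  printf '@HD\tVN:1.0\tSO:unsorted\n'
  printf '@SQ\tSN:chr1\tLN:1000\n'
  printf 'read1\t0\tchr1\t10\t60\t5M\t*\t0\t0\tACGTA\tIIIII\n'
  printf 'read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTA\tIIIII\n'
} > "$sam"

# Count alignment records: every line that is not a header line.
records=$(grep -vc '^@' "$sam")

# Count mapped records: flag (column 2) does not have bit 4 set.
mapped=$(awk '!/^@/ && int($2 / 4) % 2 == 0 {n++} END {print n + 0}' "$sam")

echo "records: ${records}, mapped: ${mapped}"
rm -f "$sam"
```

The same two commands should work on a real file such as `results/GSM794483_C1.sam` by pointing `sam` at that path instead of the temporary file.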
```
11607353 reads; of these:
  11607353 (100.00%) were paired; of these:
    21592 (0.19%) aligned concordantly 0 times
    11417720 (98.37%) aligned concordantly exactly 1 time
    168041 (1.45%) aligned concordantly >1 times
    ----
    21592 pairs aligned concordantly 0 times; of these:
      82 (0.38%) aligned discordantly 1 time
    ----
    21510 pairs aligned 0 times concordantly or discordantly; of these:
      43020 mates make up the pairs; of these:
        25009 (58.13%) aligned 0 times
        9694 (22.53%) aligned exactly 1 time
        8317 (19.33%) aligned >1 times
99.89% overall alignment rate
```
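The percentages in the summary can be recomputed from the raw counts, which is a handy sanity check when comparing samples. The sketch below copies a few lines of the summary above into a temporary file (the path is hypothetical; in practice you would point `log` at your own log file) and pulls the counts out with `awk`. The overall rate is taken as the fraction of mates that aligned at all, that is, two mates per pair minus the mates that aligned 0 times.

```shell
#!/usr/bin/env bash
# Hypothetical log file holding lines copied from the summary above.
log=$(mktemp)
cat > "$log" <<'EOF'
11607353 reads; of these:
  11607353 (100.00%) were paired; of these:
    11417720 (98.37%) aligned concordantly exactly 1 time
    21510 pairs aligned 0 times concordantly or discordantly; of these:
        25009 (58.13%) aligned 0 times
99.89% overall alignment rate
EOF

# Pull the raw counts out of the matching lines.
total_pairs=$(awk '/were paired/ {print $1}' "$log")
unique_pairs=$(awk '/aligned concordantly exactly 1 time/ {print $1}' "$log")
unaligned_mates=$(awk '/ aligned 0 times$/ {print $1}' "$log")
reported=$(awk '/overall alignment rate/ {print $1}' "$log")

# Recompute the two percentages from the counts.
unique_pct=$(awk -v u="$unique_pairs" -v t="$total_pairs" \
  'BEGIN {printf "%.2f", 100 * u / t}')
overall_pct=$(awk -v m="$unaligned_mates" -v t="$total_pairs" \
  'BEGIN {printf "%.2f", 100 * (2 * t - m) / (2 * t)}')

echo "uniquely concordant pairs: ${unique_pct}%"
echo "overall alignment rate: ${overall_pct}% (log reports ${reported})"
rm -f "$log"
```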