Objectives
In this lab, you will explore a popular new transcriptome-aware mapper called HISAT2. Simulated RNA-seq data will be provided to you; the data contains paired-end reads that have been generated in silico to replicate real gene count data from Drosophila. The data simulates two biological groups with three biological replicates per group (6 samples total). The objectives of this lab are to:
- Learn how HISAT2 works and how to use it.
- Learn how it differs from a mapper like BWA.
Twelve raw data files, two paired-end read files for each of the six samples, have been provided for all our further RNA-seq analysis. The samples are:
- c1_r1, c1_r2, c1_r3 from the first biological condition
- c2_r1, c2_r2, c2_r3 from the second biological condition
Introduction
HISAT2 is a fast transcriptome-aware mapper that is part of the new Tuxedo suite of tools. These tools start with raw FASTQ files to produce genes and gene counts and to identify differentially expressed genes. HISAT2 uses a global, whole-genome index together with tens of thousands of small local indexes to map reads extremely quickly.
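To make the mapping step concrete, here is a minimal Python sketch that assembles one HISAT2 command line per sample. It only builds and prints the commands; it does not run them. The sample names come from the list above, but the read file names (`<sample>_1.fq`, `<sample>_2.fq`), the index basename `dmel_index`, and the thread count are assumptions made for illustration and should be replaced with the actual names used in the lab. The options shown (`-x` for the index basename, `-1`/`-2` for the paired read files, `-S` for SAM output, `-p` for threads) are standard `hisat2` options.

```python
import shlex
from itertools import product

CONDITIONS = ("c1", "c2")        # two biological conditions
REPLICATES = ("r1", "r2", "r3")  # three replicates per condition


def sample_names():
    """Return the six sample names, e.g. 'c1_r1'."""
    return [f"{c}_{r}" for c, r in product(CONDITIONS, REPLICATES)]


def hisat2_command(sample, index_base="dmel_index", threads=4):
    """Build a paired-end hisat2 command as an argument list.

    ASSUMPTIONS (hypothetical, not from the lab materials):
    - read files are named <sample>_1.fq and <sample>_2.fq
    - the HISAT2 index basename is 'dmel_index'
    """
    return [
        "hisat2",
        "-p", str(threads),       # number of alignment threads
        "-x", index_base,         # basename of the HISAT2 index
        "-1", f"{sample}_1.fq",   # first mates of each read pair
        "-2", f"{sample}_2.fq",   # second mates of each read pair
        "-S", f"{sample}.sam",    # output alignments in SAM format
    ]


if __name__ == "__main__":
    # Print one shell-ready command per sample.
    for name in sample_names():
        print(shlex.join(hisat2_command(name)))
```

Building the commands as argument lists rather than as single strings makes it straightforward to hand them to `subprocess.run` later without worrying about shell quoting.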