
This RNA-Seq analysis pipeline uses an annotated genome to identify differentially expressed genes (DEGs). It consists of the following steps:

1. Quality Assessment

...

Tools Used:

...

FastQC: (Andrews 2010) used to generate quality summaries of data:
- Per base sequence quality report: useful for deciding whether trimming is necessary.
- Sequence duplication levels: evaluation of library complexity. Higher levels of sequence duplication may be expected for high-coverage RNA-Seq data.
- Overrepresented sequences: evaluation of adapter contamination.
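
To illustrate what a per base sequence quality report summarizes, the following minimal sketch computes the mean Phred score at each read position. It is not how FastQC is implemented; the function name `mean_quality_per_base` and the example quality strings are hypothetical, and a Phred+33 encoding is assumed.

```python
def mean_quality_per_base(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    quality_strings: iterable of FASTQ quality lines (assumed Phred+33).
    Reads may differ in length; each position averages over the reads
    that are long enough to cover it.
    """
    totals, counts = [], []
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                # First read to reach this position: extend the accumulators.
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Hypothetical quality strings: 'I' is Q40, '5' is Q20, '#' is Q2.
quals = ["IIII#", "II5#", "III"]
means = mean_quality_per_base(quals)
```

A drop in the mean score toward the 3' end of the reads, as in this toy input, is the typical pattern that suggests quality trimming.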

2. Fastq Preprocessing

If required, preprocessing of fastq files is performed.

...

Tools Used:

...

Fastx-toolkit: Used to preprocess fastq files.
- Fastq quality trimmer: Trimming reads based on quality.
- Fastq quality filter: Filtering reads based on quality.
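
The two operations above can be sketched as follows. This is a simplified approximation of 3' quality trimming and percentage-based quality filtering, not the FASTX-Toolkit code itself; the function names, the thresholds (Q20, 80%), and the Phred+33 offset are assumptions chosen for illustration.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Remove bases from the 3' end while their quality is below min_q."""
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def passes_filter(qual, min_q=20, min_percent=80, offset=33):
    """Keep a read if at least min_percent of its bases have quality >= min_q."""
    if not qual:
        return False
    good = sum(1 for ch in qual if ord(ch) - offset >= min_q)
    return 100 * good / len(qual) >= min_percent


# Hypothetical read: four Q40 bases followed by two Q2 bases.
trimmed_seq, trimmed_qual = trim_3prime("ACGTAC", "IIII##")
```

Trimming shortens individual reads, whereas filtering discards whole reads, so the two are usually applied in that order.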

...

3. Mapping

Reads are mapped to the reference genome using BWA-MEM or TopHat.

*

Tools Used:

*

*

TopHat: (Kim 2011) splice-aware aligner used to generate read alignments and identify novel junctions.

*

4. Gene/Transcript Counting

Counting the number of reads mapping to annotated intervals to obtain abundance of genes/transcripts.
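
As a rough illustration of interval-based counting, the sketch below assigns each aligned read to a gene when it overlaps exactly one annotated interval and skips reads that overlap none or several. The data layout, the function name `count_reads`, and the rule for ambiguous reads are assumptions for this example; the counting tool used by the pipeline may apply different rules (for example strand or exon-level handling).

```python
from collections import Counter


def count_reads(genes, alignments):
    """Count reads per gene.

    genes: dict mapping gene name -> (chrom, start, end), half-open coordinates.
    alignments: iterable of (chrom, start, end) tuples, half-open coordinates.
    Reads overlapping zero genes or more than one gene are not counted.
    """
    counts = Counter({name: 0 for name in genes})
    for chrom, start, end in alignments:
        hits = [
            name
            for name, (g_chrom, g_start, g_end) in genes.items()
            if g_chrom == chrom and start < g_end and end > g_start
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
    return dict(counts)


# Hypothetical annotation with two overlapping genes on chr1.
genes = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 180, 300)}
alignments = [
    ("chr1", 110, 160),  # overlaps geneA only
    ("chr1", 190, 240),  # overlaps both genes: ambiguous, skipped
    ("chr1", 250, 300),  # overlaps geneB only
    ("chr2", 110, 160),  # no annotated gene on chr2
]
counts = count_reads(genes, alignments)
```

The linear scan over all genes keeps the example short; a real annotation would need an interval index to be practical.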

*

Tools Used:

*

5. DEG Identification

Normalization and statistical testing to identify differentially expressed genes.

*

Deliverables: a DEG summary, a master file containing fold changes and p-values for every gene, and MA plots.

Tools Used:

*

DESeq2: (Love 2014) used to perform normalization and test for differential expression using the negative binomial distribution.
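
The normalization step in DESeq2 is based on median-of-ratios size factors. The sketch below reproduces that idea in plain Python to show the arithmetic; it is not the DESeq2 code, the function name `size_factors` and the count matrix are hypothetical, and the statistical testing itself is not covered.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    Only genes with a non-zero count in every sample contribute.
    """
    n_samples = len(counts[0])
    usable = [row for row in counts if all(c > 0 for c in row)]
    # Log of the per-gene geometric mean across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [
            math.log(row[j]) - lgm for row, lgm in zip(usable, log_geo_means)
        ]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
raw = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(raw)
normalized = [[c / f for c, f in zip(row, factors)] for row in raw]
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so the first three genes in this toy matrix end up with equal normalized counts in both samples.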
