Assessing Mapping Results I

Objectives

In this lab, you will explore the mapping results generated previously. You will learn how to assess mapping results using samtools and Unix commands. You will also look at the differences between mapping with BWA and mapping with hisat2.


Get your data

To try out the exercises, you'll need access to your mapping results.

Get set up for the exercises
# If using your own BWA mapped results (SAM files), they should be at:
$SCRATCH/my_rnaseq_course/day_2/bwa_exercise 

 
# If not, you can look at the results from this directory: 
$SCRATCH/my_rnaseq_course/day_2/bwa_exercise/bwa_mem_results_transcriptome

Samtools is a suite of utilities for parsing and manipulating alignment files in SAM and BAM formats. It is useful for:

  • Sorting alignment files
  • Merging multiple alignment files
  • Converting from SAM to BAM and vice versa
  • Retrieving reads based on different criteria: reads mapping to a particular region, reads mapping with a certain quality, unmapped reads, etc.
  • Collecting statistics about your mapping results

With mapping results in SAM format, we need to convert them to BAM format, sort the BAM file, and create an index for the sorted BAM file.

Load the samtools module first
module load samtools

Syntax 1: Convert SAM file to BAM format.

samtools view syntax
samtools view -b -S samfile > bamfile


Syntax 2: Sort and index newly created BAM file.

samtools sort syntax
samtools sort -o sortedbamfile bamfile 
samtools index sortedbamfile
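
One way to confirm that a BAM file really is sorted is to read the SO: tag on the @HD line of its header, which samtools sort sets to "coordinate" by default. The following is a minimal sketch, not part of the lab itself: sort_order is a hypothetical helper name, and it expects SAM header text on standard input.

```shell
# sort_order: print the sort order recorded in a SAM/BAM header.
# Hypothetical helper for illustration; reads header lines on stdin.
sort_order() {
    awk -F'\t' '
        /^@HD/ {
            # Scan the tab-separated tags on the @HD line for SO:<value>
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1 }
        }
        END { if (!found) print "unknown" }   # no @HD line or no SO: tag
    '
}
```

With samtools loaded, it could be used as: samtools view -H sortedbamfile | sort_order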


WE ARE NOT GOING TO DO THIS RIGHT NOW, BUT TAKE A LOOK!

Submit to the TACC queue or run in an idev shell

Create a commands file and use launcher_creator.py followed by sbatch.


Put this in your commands file:

samtools view -b -S C1_R1.mem.sam > C1_R1.mem.bam && samtools sort -o C1_R1.mem.bam C1_R1.mem.bam && samtools index C1_R1.mem.bam

samtools view -b -S C1_R2.mem.sam > C1_R2.mem.bam && samtools sort -o C1_R2.mem.bam C1_R2.mem.bam && samtools index C1_R2.mem.bam

samtools view -b -S C1_R3.mem.sam > C1_R3.mem.bam && samtools sort -o C1_R3.mem.bam C1_R3.mem.bam && samtools index C1_R3.mem.bam

samtools view -b -S C2_R1.mem.sam > C2_R1.mem.bam && samtools sort -o C2_R1.mem.bam C2_R1.mem.bam && samtools index C2_R1.mem.bam

samtools view -b -S C2_R2.mem.sam > C2_R2.mem.bam && samtools sort -o C2_R2.mem.bam C2_R2.mem.bam && samtools index C2_R2.mem.bam
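
Rather than typing one line per sample, the commands file could also be generated with a short loop. This is a sketch under assumptions, not the course's method: make_commands is a hypothetical helper, and the intermediate ".unsorted.bam" name is an illustrative choice so that samtools sort does not read from and write to the same file. The final sorted file keeps the same name as in the commands above.

```shell
# make_commands: print one convert/sort/index pipeline per SAM file.
# Hypothetical helper; $1 is the directory that holds the *.sam files.
make_commands() {
    local dir="$1" sam prefix
    for sam in "$dir"/*.sam; do
        [ -e "$sam" ] || continue      # glob matched nothing: skip
        prefix="${sam%.sam}"           # e.g. ./C1_R1.mem
        # The .unsorted.bam intermediate name is an assumption of this sketch
        echo "samtools view -b -S $sam > $prefix.unsorted.bam && samtools sort -o $prefix.bam $prefix.unsorted.bam && samtools index $prefix.bam"
    done
}

# Usage (run in the directory with your SAM files):
#   make_commands . > commands
```

The function only prints the commands, so the output can be inspected before it is passed to launcher_creator.py.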


Exercise 1: Let's get some statistics: Samtools flagstat

PREFERABLY, DO THIS IN YOUR IDEV SESSION (IF IT'S STILL AVAILABLE)

Samtools flagstat generates useful statistics about a mapped BAM file. Let's try it on one of our samples, C1_R1. Two BAM files exist for this sample: C1_R1.bam, generated with bwa aln/sampe, and C1_R1.mem.bam, generated with bwa mem.

samtools flagstat command
samtools flagstat bwa_mem_results_transcriptome/C1_R1.mem.bam
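
The mapped count can be cross-checked with plain Unix tools, because bit 0x4 of the FLAG column (the second field of each SAM record) marks an unmapped read. The sketch below is an illustration under assumptions: count_mapped is a hypothetical helper, and it counts alignment records rather than distinct reads, so secondary and supplementary alignments are included in the tally.

```shell
# count_mapped: tally mapped and unmapped alignment records in SAM text.
# Hypothetical helper; $1 is a SAM file, or omit it to read from stdin.
count_mapped() {
    awk -F'\t' '
        /^@/ { next }                              # skip header lines
        int($2 / 4) % 2 == 1 { unmapped++; next }  # FLAG bit 0x4 is set
        { mapped++ }
        END { printf "mapped\t%d\nunmapped\t%d\n", mapped, unmapped }
    ' "${1:--}"
}
```

With samtools loaded, a BAM file could be piped in as: samtools view bwa_mem_results_transcriptome/C1_R1.mem.bam | count_mapped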


Exercise 2: Let's get some statistics: Samtools idxstats

samtools idxstats command
samtools idxstats bwa_mem_results_transcriptome/C1_R1.mem.bam
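
The idxstats output is tab-delimited with four columns: reference sequence name, sequence length, number of mapped reads, and number of unmapped reads. A final line named "*" holds reads with no reference assigned. A minimal sketch for listing the references with the most mapped reads follows; top_refs is a hypothetical helper name, and the default of 10 rows is an arbitrary choice.

```shell
# top_refs: show the references with the most mapped reads.
# Hypothetical helper; reads idxstats output on stdin.
# $1 is the number of rows to show (defaults to 10).
top_refs() {
    awk -F'\t' '$1 != "*"' |                   # drop the "*" summary line
        sort -t "$(printf '\t')" -k3,3nr |     # sort by column 3, descending
        head -n "${1:-10}"
}
```

With samtools loaded, it could be used as: samtools idxstats bwa_mem_results_transcriptome/C1_R1.mem.bam | top_refs 5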
