Get your data

Six raw data files were provided as the starting point:

...

Because the data are large and run times are long, many of the programs have already been run for this exercise. We will submit some jobs, but since we cannot wait for them to complete, we will look through previously generated results (in the directory tophat_results). You will then parse the output, find answers, and visualize the results.

Run tophat

To run tophat on Lonestar, the following modules need to be loaded.

...

Code Block
title: Create a launcher file called tophat_launcher.sge
qsub tophat_launcher.sge

Examine tophat parameters

Why did we choose the tophat parameters we did and what do they mean? Here's our tophat command again:

...

As you can see, there are many other options for running tophat!

Examine tophat results

 

The commands we are about to run can each take a couple of minutes. With 20 people running the same commands, we don't want to run them on the head node, but they are also too small to be worth submitting as batch jobs. So let's switch over to the tab with our idev session.

...

Code Block
samtools flagstat accepted_hits.bam
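
The flagstat report is plain text, so it can be condensed into a single line for a quick comparison between runs. The following is a minimal sketch, not part of the course scripts: flagstat_mapped_fraction is a hypothetical helper name, and it assumes the usual flagstat layout in which the total line reads "<n> + <m> in total (...)" and the mapped line reads "<n> + <m> mapped (...)".

```shell
# Sketch: summarize `samtools flagstat` output read from standard input.
# flagstat_mapped_fraction is an illustrative helper name, not a samtools command.
flagstat_mapped_fraction() {
    awk '
        # Total line: "<n> + <m> in total (QC-passed reads + QC-failed reads)"
        $4 == "in" && $5 == "total" { total = $1 }
        # First line whose 4th field is "mapped": "<n> + <m> mapped (...)"
        $4 == "mapped" && !seen { mapped = $1; seen = 1 }
        END {
            if (total > 0)
                printf "%d of %d alignment records mapped (%.2f%%)\n",
                       mapped, total, 100 * mapped / total
        }
    '
}

# Usage (run inside the directory that holds accepted_hits.bam):
#   samtools flagstat accepted_hits.bam | flagstat_mapped_fraction
```

Keep in mind that flagstat counts alignment records rather than distinct reads, so a read that maps to several locations contributes more than once.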

Let's see how this compares to the BWA results.

 

Code Block
title: Back to the directory with BWA results
cd $SCRATCH/my_rnaseq_course/bwa_mem_results

...

How to
Code Block
samtools view C1_R1.mem.bam | awk '{if ($1 == 9) print}'

Help! I have a large number of reads. Make tophat go faster!

  • Use the threading option efficiently (tophat -p <number of threads>).
  • Split one data file into smaller chunks and run multiple instances of tophat. Finally, concatenate the outputs.
    • WAIT! We have a pipeline for that!
    • Look for fastTophat in $BI/scripts
  • Split the mapping by chromosome: mapping to each chromosome is one job.
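
To make the chunking idea concrete, here is a minimal sketch of splitting a FASTQ file into pieces that each hold whole records. It is not the fastTophat pipeline. The helper name split_fastq, the chunk_ prefix, and the file names in the usage comments are all hypothetical, and the sketch assumes an uncompressed FASTQ file with exactly four lines per read.

```shell
# Sketch: split a FASTQ file into chunks of whole records.
# split_fastq and its argument names are illustrative, not part of tophat.
split_fastq() {
    local fastq="$1"            # uncompressed input FASTQ file
    local reads_per_chunk="$2"  # number of reads to place in each chunk
    local prefix="$3"           # output prefix; chunks become <prefix>aa, <prefix>ab, ...
    # Each FASTQ record is assumed to be exactly 4 lines,
    # so the line count per chunk must be a multiple of 4.
    split -l $(( reads_per_chunk * 4 )) "$fastq" "$prefix"
}

# Usage (file names are placeholders):
#   split_fastq my_reads.fastq 1000000 chunk_
# Each chunk can then go to its own tophat job, for example:
#   tophat -p <number of threads> -o out_chunk_aa <bowtie index> chunk_aa
# and the per-chunk BAM files can be combined afterwards, for example
# with `samtools merge`.
```

For paired-end data, both read files would need to be split with the same reads_per_chunk value so that mates stay in matching chunks.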

...