Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migrated to Confluence 5.3

...

Expand
No, give me the commands...
No, give me the commands...

This should work:

Code Block
samtools
which samtools

Prepare your directories

Create From inside your main mapping directory, create a new output directory called samtools_bowtiebowtie2 or whatever makes sense to you.

...

SAM is a text file, so it is slow to access information about how any given read was mapped. SAMtools and many of the commands that we will run later work on BAM files (essentially GZIP compressed binary forms of the text SAM files). These can be loaded much more quickly. Typically, they also need to be sorted, so that when the program wants to look at all reads overlapping position 4,129,888, it can easily find them all at once without having to search through the entire BAM file.

Warning
titleDo not run on head node

Many commands past this point are computationally intensive. You should run them through an idev shell or by qsub. We recommend idev for the tutorial.

Code Block
titleExample command to start an idev shell
idev -m 60 -q development -A CCBB

 

Convert from SAM to BAM format.

...

What are all the options doing?

The samtools mpileup command will take a few minutes to run. You might consider putting it in the background by pressing control-z and then typing the command bg so that you can do some other things in this terminal window at the same time – like starting to run variant calling on the BWA or bowtie mapping results. Remember, there are still many other processors available on this node!

Convert BCF to human-readable VCF:

...

Often you want to compare the results of variant calling on different samples or using different pipelines. Bedtools is a suite of utility programs that work on a variety of file formats, one of which is conveniently VCF format. It provides many ways of slicing, dicing, and comparing the information in VCF files. For example, we can use it to find out what predictions are the same and which are different from the variant calling on reads mapped with different programs if you generated VCF files for each one.

Set up a new output directory and copy the respective VCF files to it, renaming them so that we know where they came from:

...