...
We'll return to this example data shortly to show how a much more involved tool, GATK, performs the same steps.
Note if you're trying this on Stampede: the $BI directory is not accessible from compute nodes, so you will need to copy your data to $SCRATCH and update file locations accordingly to get this demo to run.
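The copy step described in the Stampede note might look something like the sketch below. The function name `stage_to_scratch` and the directory names in the usage comment are placeholders invented for illustration, not the course's real paths; substitute whatever data directory you were actually given.

```bash
# stage_to_scratch SRC DEST
# Copy a data directory into a working directory that compute nodes can read
# (for example, somewhere under $SCRATCH). Prints the destination path.
stage_to_scratch() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    # The trailing "/." copies the contents of src rather than src itself.
    cp -r "$src"/. "$dest"/
    echo "$dest"
}

# Hypothetical usage (both paths are assumptions, adjust to your own layout):
# WORK=$(stage_to_scratch "$BI/path/to/example_data" "$SCRATCH/variant_demo")
# cd "$WORK"
```

After copying, any command that referred to a file under $BI needs to point at the copy under $SCRATCH instead.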
...
Let's use Anna Battenhouse's shell script. Don't forget that for this to work, you need to have appended
Move into your scratch directory and then try to figure out how to create and
Note that the input is paired-end data.
...
You can also get some quick stats with a few Linux one-liners on this page; more thorough analysis programs exist that are built to work with VCF files.
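As a rough illustration of the kind of quick stats meant here, the sketch below wraps three common one-liners in small functions. These are not necessarily the one-liners used elsewhere on this page; the function names and the file name in the usage comment are assumptions, and the only thing relied on is the standard VCF layout (header lines start with `#`, column 1 is the chromosome, column 6 is QUAL).

```bash
# Count variant records, i.e. every line that is not a header line.
count_variants() {
    grep -vc '^#' "$1" || true   # grep exits non-zero when the count is 0
}

# Number of variant records per chromosome (column 1).
variants_per_chrom() {
    grep -v '^#' "$1" | cut -f1 | sort | uniq -c
}

# Count records whose QUAL (column 6) is at or above a threshold.
count_min_qual() {
    awk -F'\t' -v q="$2" '!/^#/ && $6 >= q {n++} END {print n+0}' "$1"
}

# Hypothetical usage (the file name is a placeholder):
# count_variants my_calls.vcf
# variants_per_chrom my_calls.vcf
# count_min_qual my_calls.vcf 20
```

These only skim the surface; they ignore genotype fields and do not distinguish SNPs from indels.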
...
This Linux one-liner should give you a snapshot of data sufficient to figure it out:
GATK
GATK deserves its own page, which is here, but we've already run it and will now look at some differences in the SNP calls.
...
You can't get gene-context information from bcftools; to do that, you have to shift to variant annotation tools.
Side-note on samtools mpileup
...