...
$SCRATCH/GVA_Human_trios/raw_files/NA12878.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam
With samtoolsbcftools, this is a two-step process:
samtools bcftools mpileup
command transposes the mapped data in a sorted BAM file fully to genome-centric coordinates. It starts at the first base on the first chromosome for which there is coverage and prints out one line per base. Each line has information on every base observed in the raw data at that base position along with a lot of auxiliary information depending on which flags are set. It calculates the Bayseian prior probability given by the data, but does not estimate a real genotype.bcftools call
with a few options added uses the prior probability distribution and the data to calculate a genotype for the variants detected.
...
Warning |
---|
title | Remember to make sure you are on an idev done |
---|
|
It is unlikely that you are currently on an idev node as copying the files while on an idev node causes problems as discussed. Remember the hostname command and showq -u can be used to check if you are on one of the login nodes or one of the compute nodes. If you need more information or help re-launching a new idev node, please see this tutorial. DO NOT run these commands on the head nodeYou should request at least 60 minutes on the idev session to make sure the commands have time to finish running. |
Recall that we installed samtools on our GVA2021 conda installationand bcftools in our GVA-SNV conda environment. Make sure you have activated your GVA2021 GVA-SNV environment and you have access to samtools and bcftools version 1.915.1
Expand |
---|
title | Click here for commands and expected output if you are not sure |
---|
|
Code Block |
---|
language | bash |
---|
title | activate conda envrionment |
---|
| conda activate GVA2021GVA-SNV |
Code Block |
---|
language | bash |
---|
title | Make sure you have the expected versions of samtools and bcftools installed |
---|
| samtools --version
bcftools --version |
samtools --version output: No Format |
---|
samtools 1.15.91
Using htslib 1.15.91
Copyright (C) 20182022 Genome Research Ltd.
# Followed by a bunch of compilation details. |
bcftools --version output: No Format |
---|
bcftools 1.915.1
Using htslib 1.15.91
Copyright (C) 20182022 Genome Research Ltd.
License Expat: The MIT/Expat licenseGPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. |
|
...
Expand |
---|
title | What to do if you do not get version 1.15.9 1 for both samtools and bcftools in the above version checks? |
---|
| The #1 most likely issue here will be that you see samtools version 1.7 rather than 1.9 and will suggest that somewhere in other modifications of the GVA2021 environment, we introduced a change to either samtools or opensssl. If you experience this error, let me know, I'm trying to track down if this is something that a tutorial specifically caused, or if its an error of installing some other tool into this environment mistakenly.
The simplest solution will be to create a new conda environment, and install just samtools, bcftools, and openssl verison 1.0 in it, as we did when we originally installed them into the GVA2021 environment. Code Block |
---|
|
conda create --name GVA-samtools -c bioconda samtools bcftools openssl=1.0
conda activate GVA-samtools |
If you need to do this, make sure you are in the new GVA-samtools conda environment before using any of the samtools commands listed here, or switch to it when you get the error message about libcripto |
If you are not seeing the correct versions, there is either a problem activating or creating your environment. Either try to activate the environment again, go back to the SNV tutorial, or ask for help before continuing. |
Code Block |
---|
language | bash |
---|
title | Calling variants using samtools and bcftools. Note the last command is quite long and may wrap around 2 lines your monitor or extend to the right of what you can see without scrolling over |
---|
linenumbers | true |
---|
|
mkdir samtools_example
cd samtools_example
samtools mpileup -u -f $SCRATCH/GVA_Human_trios/raw_files/ref/hs37d5.fa $SCRATCH/GVA_Human_trios/raw_files/NA12878.chrom20.ILLUMINA.bwa.CEU.exome.20111114.bam | bcftools call -v -c - > trios_tutorial.raw.vcf
|
The above command may take >30 minutes to run based on some student's experience and will not produce many lines of output leading students to worry their terminal has locked up. More than likely this is not the case, but you can try hitting the return key 1 or 2 times to see if your terminal adds a blank line to verify. As this command is taking so long it is recomended recommended that you read ahead or switch to another tutorial in another terminal window while it runs, just be aware that you should not start 2 idev sessions in 2 different windows.
...