...
Tip | ||
---|---|---|
| ||
This is one of only 2 Java-based programs that this course covers. Because the readseq wrapper that conda provides makes it so much easier to invoke, we will use it. If you encounter a Java-based program in your own work that doesn't have such a helpful wrapper, we recommend looking back at a previous year's tutorial to see how this was handled without the wrapper, so you know where to start. |
...
- Close IGV (if you have it open from the first tutorial with your mapping, SNV, and SV data) and reopen it.
- Select "Human hg19" as the reference genome from the top left drop down (you may need to select "more" to have hg19 as an option)
- Load the BAM file you downloaded: File > Load from File… and select HCC1143.normal.21.19M-20M.bam
- Turn on dbSNP annotations: File > Load from Server… > Annotation > Variation and Repeats > dbSNP 1.4.7
- Navigate to chr21:19,500,000-19,500,001 to view reads. (Normally you could see reads anywhere, but we specifically downloaded only reads that map in a 1 Mb window centered on chr21:19.5M.)
- Right click on the track name on the left and select sort alignments by start location
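The manual steps above can also be expressed as an IGV batch script. The sketch below is a minimal illustration, assuming the BAM file sits in the current working directory and that your IGV version accepts the batch commands `new`, `genome`, `load`, `goto`, and `sort`. The function names `window_around` and `igv_batch_lines` are hypothetical helpers, not part of the tutorial. The dbSNP track step is left out because its server path is not given here.

```python
def window_around(center, width):
    """Return (start, end) of a window of `width` bases centered on `center`."""
    half = width // 2
    return center - half, center + half


def igv_batch_lines(bam_path, locus, genome="hg19"):
    """Build IGV batch commands that mirror the manual steps above."""
    return [
        "new",                # reset the session, unloading previously loaded tracks
        f"genome {genome}",   # select the reference genome
        f"load {bam_path}",   # load the downloaded alignments
        f"goto {locus}",      # jump to a position inside the downloaded window
        "sort position",      # sort alignments by start location
    ]


# The downloaded reads cover a 1 Mb window centered on chr21:19.5M
start, end = window_around(19_500_000, 1_000_000)
print(f"Reads were downloaded for chr21:{start:,}-{end:,}")

lines = igv_batch_lines("HCC1143.normal.21.19M-20M.bam", "chr21:19500000-19500001")
print("\n".join(lines))
```

Saving the printed commands to a text file and running it from Tools > Run Batch Script should reproduce the view, assuming the file path is valid on your machine.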
There are 2 mutations visible in the chr21:19,479,237-19,479,814 region. Answer the following questions:
Are both SNPs supported by reads mapping to both the forward and reverse DNA strands (hint: make sure reads are colored by strand)?
Answer: Yes. Both forward and reverse reads (red and blue when colored by strand) contain the SNPs relative to the reference.
Which is more likely to be related to disease? Why?
Answer: The one on the left does not correspond to a dbSNP entry and is therefore more likely to be related to disease state.
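The strand check in the first question can be done by eye in IGV, but the same idea is easy to express in code. The sketch below is a toy illustration only: the `strand_support` function and the pileup values are made up for this example and are not taken from the HCC1143 data.

```python
from collections import Counter


def strand_support(observations, alt_base):
    """Count reads carrying `alt_base` on each strand.

    observations: iterable of (base, is_reverse) pairs, one per read
    covering the site of interest.
    """
    counts = Counter()
    for base, is_reverse in observations:
        if base == alt_base:
            counts["reverse" if is_reverse else "forward"] += 1
    return counts["forward"], counts["reverse"]


# Hypothetical pileup at one site: (observed base, read is on reverse strand)
toy = [("T", False), ("C", False), ("T", True), ("T", True), ("C", True)]
fwd, rev = strand_support(toy, alt_base="T")
print(f"alt on forward: {fwd}, alt on reverse: {rev}, both strands: {fwd > 0 and rev > 0}")
```

A variant seen on only one strand would be more suspicious, which is why coloring reads by strand is a useful habit.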
There are 2 SNPs visible in the chr21:19,666,833-19,667,007 region. Answer the following questions:
Two mutations very close together often indicate a poor alignment. Is that the case here (remember this is human data)?
Answer: No. Each read carries only 1 mutation; these are 2 different alleles, each with its own SNP relative to 'wt'. Both are reported in dbSNP.
Is either likely to be related to disease?
Answer: Neither is likely to be related to disease (or at least not to rare disease), as both mutations have previously been identified as naturally occurring by dbSNP.
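The reasoning in the first answer above, that no single read carries both SNPs, can be sketched as a small check. Everything here is hypothetical: the `reads_with_both` helper, the read names, and the positions 100 and 110 stand in for the real coordinates.

```python
def reads_with_both(read_alleles, pos_a, alt_a, pos_b, alt_b):
    """Return names of reads that carry both alternate alleles.

    read_alleles: dict mapping read name -> dict of position -> observed base.
    """
    return [
        name
        for name, bases in read_alleles.items()
        if bases.get(pos_a) == alt_a and bases.get(pos_b) == alt_b
    ]


# Toy reads: each one carries at most one of the two alternate alleles
toy_reads = {
    "read1": {100: "G", 110: "A"},  # alt at 100, reference at 110
    "read2": {100: "C", 110: "T"},  # reference at 100, alt at 110
    "read3": {100: "G", 110: "A"},
}
both = reads_with_both(toy_reads, 100, "G", 110, "T")
print(both if both else "No read carries both SNPs, so they sit on different alleles")
```

An empty result is what you would expect for two heterozygous SNPs on opposite copies of a diploid genome.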
What is going on in the chr21:19,324,469-19,331,468 region?
Answer: Homozygous deletion. Right click on the track name on the left and select 'view as pairs' to see the linkage between R1 and R2, with individual read pairs mapping to both sides of the deletion.
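Because both copies are missing in a homozygous deletion, the deleted stretch shows up as a run of positions with no read coverage. The sketch below finds such runs in a list of per-base depths. The `zero_depth_runs` function, the `min_len` cutoff, and the depth values are illustrative assumptions, not output from the tutorial data.

```python
def zero_depth_runs(depths, offset=0, min_len=1):
    """Return (start, end) coordinates of runs of zero depth.

    depths: per-base read depth, where depths[0] is position `offset`.
    min_len: shortest run to report (hypothetical cutoff).
    """
    runs = []
    run_start = None
    for i, depth in enumerate(depths):
        if depth == 0 and run_start is None:
            run_start = i                      # a zero-coverage run begins
        elif depth != 0 and run_start is not None:
            if i - run_start >= min_len:
                runs.append((offset + run_start, offset + i - 1))
            run_start = None                   # the run has ended
    # Handle a run that extends to the end of the list
    if run_start is not None and len(depths) - run_start >= min_len:
        runs.append((offset + run_start, offset + len(depths) - 1))
    return runs


toy_depth = [30, 28, 0, 0, 0, 0, 31, 29, 0, 27]
print(zero_depth_runs(toy_depth, offset=1000, min_len=3))
```

Coverage alone cannot distinguish a deletion from an unmappable region, which is why the paired-read view in IGV is the more convincing evidence.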
What is going on in the chr21:19,102,154-19,103,108 region?
Answer: This is an example of poor alignment to a repetitive AluY element. Notice that, among the read pairs that map with numerous SNPs, one read maps with many SNPs while its mate maps with none. This is caused by mapping reads to a limited area of the genome; if these reads had been allowed to map to the entire genome, it is very likely that both reads of each pair would map without SNPs somewhere else in the genome.
How can we identify this region as an AluY element?
There are several methods that could accomplish this. One, as pointed out in class, would be to pull the sequence from this region and BLAST it, but that sounds like a lot of work. A second and much better way is to turn on some additional tracks in IGV. In this case, Alu elements are identified as 'SINE' elements in the RepeatMasker database.
To turn this on: File > Load from Server… > Annotation > Variation and Repeats > Repeat Masker > SINE. Information about what RepeatMasker is doing and where it comes from can be found here: http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=rmsk
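The lopsided pattern described for the AluY region, where one mate carries many mismatches and the other none, can be flagged with a simple rule. This is a sketch under stated assumptions: the per-read mismatch counts are assumed to be available already (for example from the `NM` tag in the BAM records), and the `lopsided_pairs` function, the `high` threshold, and the pair names are all hypothetical.

```python
def lopsided_pairs(pair_mismatches, high=5):
    """Return names of pairs where one mate has >= `high` mismatches and the other has 0.

    pair_mismatches: dict mapping pair name -> (mismatches in R1, mismatches in R2).
    high: hypothetical cutoff for "lots of" mismatches.
    """
    flagged = []
    for name, (mm_r1, mm_r2) in pair_mismatches.items():
        if max(mm_r1, mm_r2) >= high and min(mm_r1, mm_r2) == 0:
            flagged.append(name)
    return flagged


# Toy mismatch counts, not taken from the tutorial data
toy_pairs = {
    "pairA": (0, 9),  # R2 is suspicious, R1 is clean
    "pairB": (1, 0),  # ordinary pair
    "pairC": (7, 0),  # R1 is suspicious, R2 is clean
    "pairD": (6, 4),  # both mates noisy, a different pattern
}
print(lopsided_pairs(toy_pairs))
```

Pairs flagged this way are candidates for mismapping into a repeat, and checking them against the SINE track is a reasonable next step.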
What other interesting things can you find?
An additional tutorial from another group working with the same human data can be found here if interested.
...
- Tablet - a lightweight NGS data browser
- Visualize mapped data at UCSC genome browser
- If you come across another browser in your work (particularly one you find very useful or easier to use than IGV), I'd love to hear from you.
...