...

  • Why are some reads different colors?
    Right-click one of the reads to bring up a set of options. Look at the "Color alignments by" section to see the current setting and which one you might prefer.
  • Interested in determining the probability that a read is not where it should be? What is a typical mapping quality (MQ) for a read?

    The estimated probability that a read is mapped incorrectly is 10^(-MQ/10), where MQ is the mapping quality. For example, MQ 20 corresponds to a 1 in 100 chance that the read is mapped incorrectly.

  • Can you find a variant where the sequenced sample differs from the reference? This would be like looking for a needle in a haystack without variant callers and the control-f and control-b shortcuts, which let you jump straight to areas where the reads disagree with the reference genome and so might indicate mutations in the sequenced E. coli.

    Some interesting example coordinates:
    • Coordinate 161,041. What gene is this in and what is the effect on the protein sequence?

      The gene is pcnB; the mutation is a SNP.

    • Coordinate 3,248,957. What gene is this in and what is the effect on the protein sequence?

      The gene is infB; the mutation is a SNP.

    • Coordinate 3,894,997. What type of mutation is this?

      Deletion of the rbsD gene.

    • Check out the rbsA gene region. What's going on here?

      There was a large deletion. Can you figure out the exact coordinates of the endpoints?

    • Navigate to coordinate 3,289,962. Compare the results for different alignment programs and settings. Can you explain what's going on here?

      There is a 16-base deletion in the gltB gene reading frame.

    • What is going on in the pykF gene region? You might see red read pairs. What does that mean? Can you guess what type of mutation occurred here?

      The read pairs are discordantly mapped. There was an insertion of a new copy of a mobile genetic element (an IS150 element) that exists at other locations in the reference sequence.

    • See if you can find more interesting locations. There are ~190 mutations total in this sample, most of which are false positives.
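The mapping-quality formula given above, 10^(-MQ/10), can be checked with a few lines of Python. This is a minimal sketch: the function names are illustrative and are not part of IGV or any aligner, and the MQ values in the loop are arbitrary examples, not values taken from this data set.

```python
import math


def mapping_error_probability(mq: float) -> float:
    """Estimated probability that a read with mapping quality `mq` is mapped incorrectly."""
    return 10 ** (-mq / 10)


def mapping_quality(p_error: float) -> float:
    """Inverse of the formula: the mapping quality for a given error probability (p_error > 0)."""
    return -10 * math.log10(p_error)


# Print the estimated error probability for a few example MQ values.
for mq in (0, 10, 20, 30, 60):
    print(f"MQ {mq:>2}: P(mismapped) = {mapping_error_probability(mq):.6f}")
```

Because the scale is logarithmic, every 10 points of MQ lowers the estimated chance of a wrong placement by a factor of ten.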

...

  1. Close IGV (if you have it open from the first tutorial with your mapping, SNV, and SV data) and reopen it.
  2. Select "Human hg19" as the reference genome from the top-left drop-down (you may need to select "more" to have hg19 as an option).
  3. Load the BAM file you downloaded: File > Load from File… and select HCC1143.normal.21.19M-20M.bam
  4. Turn on dbSNP annotations: File > Load from Server… > Annotation > Variation and Repeats > dbSNP 1.4.7
  5. Right-click the track name on the left and select "sort alignments by start location".
  6. There are 2 mutations visible in the chr21:19,479,237-19,479,814 region. Answer the following questions:

    Are both SNPs supported by reads mapping to both the forward and reverse DNA strands (hint: make sure reads are colored by strand)?

    Yes, both forward and reverse reads (red and blue if colored by strand) contain the SNPs relative to the reference.

    Which is more likely to be related to disease? Why?

    The one on the left does not correspond to a dbSNP entry and is therefore more likely to be related to a disease state.


  7. There are 2 SNPs visible in the chr21:19,666,833-19,667,007 region. Answer the following questions:

    Two mutations very close together are often a sign of poor alignment. Is that the case here (remember this is human data)?

    No. Each read carries only 1 mutation; these are 2 different alleles, each with its own SNP relative to 'wt'. Both are reported in dbSNP.

    Is either likely to be related to disease?

    Neither is likely to be related to disease (or at least not to rare disease), as both mutations have previously been reported as naturally occurring in dbSNP.


  8. What is going on in the chr21:19,324,469-19,331,468 region?

    Homozygous deletion. Right-click in the track on the left and select 'view as pairs' to show the linkage between R1 and R2, so you can see read pairs mapping to both sides of the deletion.

  9. What is going on in the chr21:19,102,154-19,103,108 region?

    This is an example of poor alignment to a repetitive AluY element. Notice how, among the read pairs that map with numerous SNPs, one read maps with lots of SNPs while its mate maps with none. This is caused by mapping reads to a limited area of the genome; if these reads had been allowed to map to the entire genome, it is very likely that both reads of each pair would map without SNPs somewhere else.

  10. What other interesting things can you find?
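
The regions visited in the steps above can also be collected into an IGV batch script instead of being typed one at a time. The sketch below only builds the script text. The function name and snapshot file names are hypothetical, and the batch commands used (`new`, `genome`, `load`, `goto`, `sort position`, `snapshot`) are assumed to behave as described in the IGV batch documentation, so check them against your IGV version before relying on this. It does not load the dbSNP track from step 4.

```python
def igv_batch_script(bam_path, regions, genome="hg19"):
    """Build IGV batch-script text that visits each region in turn.

    Assumption: the commands emitted here match the IGV batch command set
    of the IGV version in use.
    """
    lines = ["new", f"genome {genome}", f"load {bam_path}"]
    for i, region in enumerate(regions, start=1):
        lines.append(f"goto {region}")
        lines.append("sort position")            # sort alignments by start position
        lines.append(f"snapshot region_{i}.png")  # hypothetical output file name
    return "\n".join(lines) + "\n"


# Regions taken from the tutorial steps above.
REGIONS = [
    "chr21:19,479,237-19,479,814",
    "chr21:19,666,833-19,667,007",
    "chr21:19,324,469-19,331,468",
    "chr21:19,102,154-19,103,108",
]

script = igv_batch_script("HCC1143.normal.21.19M-20M.bam", REGIONS)
print(script)
```

Saving the printed text to a file and running it as a batch script from within IGV should produce one snapshot per region, which makes it easier to compare the regions side by side.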


Optional Tutorial Exercises ...

...