Genome Variant Analysis Course 2016

Course Overview

We will meet daily in MEZ 4.144 (http://www.utexas.edu/maps/main/buildings/mez.html). If you have any trouble finding the room or building, please check your email for contact information and directions.

The course is built around two ~90-minute sessions per day for 4 days, with the goal of teaching you how to perform standard next-generation sequencing analysis to identify genomic variants. This will be accomplished through presentations covering information essential to all types of analysis, guided tutorials to reinforce the essential concepts, and self-guided tutorials to help you learn the skills most specific to your own analysis. By the end of this course, we hope to achieve the following goals:

  1. Teach you the different ways next-generation sequencing libraries are constructed, and the advantages and disadvantages of each type.
  2. Familiarize you with how the Texas Advanced Computing Center (TACC) can be used to simplify and speed up your data analysis.
  3. Teach you the basics of read mapping in both individuals and populations, and identifying variants within individuals and rare variants within populations.
  4. Provide reference materials broad enough to give you a starting point for your own data analysis, and enough experience that you can begin that analysis on your own.

Your Instructors

Name | Initials | Affiliation | Expertise
Daniel Deatherage | DD | Barrick Lab | Unix, Python, NGS Library Prep, Capture, Rare Variant Identification
Sean Leonard | SL | Barrick Lab | Unix, R, Transposon sequencing, RNA sequencing

A nod to the past

We wish to acknowledge a great deal of help in creating these web pages and materials from previous instructors of the Intro to NGS Bioinformatics course taught in May 2013 and the Genome Variant Analysis Course 2014 taught in May 2014. Two individuals warrant special mention: Scott Hunicke-Smith, the former director of the GSAF, and Jeffrey Barrick. They have been the driving force behind this class for a number of years, and the majority of the tutorials presented here were developed by them or adapted from their work.

Course Schedule

Monday, May 23rd. Day 1 – "The Basics"

Presentation: Next Generation Sequencing Library Preparation and Experimental Design (and general introduction)

Tutorial: Introduction to Linux and Lonestar5

Presentation: Single Nucleotide Variant Calling 

Presentation: Structural Variant Calling

Tutorial: Bacterial genome variants the easy way – breseq

Tuesday, May 24th. Day 2 – "Principles of Variant Calling"

Pre-Presentation Task Day 2

Presentation: fastq files, evaluating and improving quality

Bonus Presentation: Read Mapping Details

Tutorial: Evaluating raw sequencing data

Tutorial: Mapping reads with bowtie2

Tutorial: Using samtools to identify SNVs

Tutorial: Using SVDetect to identify SVs

Wednesday, May 25th. Day 3 – User-specific tutorials

Presentation: Error rates: where do they come from, and when do we care?

Bonus Presentation: Alternative Library Prep Methods

Tutorial: Integrated Genome Viewer Tutorial

Now that you have completed the above tutorials, you have acquired the skills necessary for basic genome sequence analysis and can identify real genetic variants amid the noise of genomic sequencing. The next step is to learn additional skills that better suit your own analysis. Complete the remaining tutorials based on what you think will be most helpful to your specific analysis. We've divided them into broad categories and tried to explain the purpose of each tutorial, but if you are unsure, just ask.

Bacterial Centric Tutorials

Tutorial: breseq basics

Tutorial: Advanced breseq

Tutorial: Evaluating Error Correction Using breseq

Human and Higher Eukaryote Centric Tutorials

Tutorial: Human Trios Analysis

Tutorial: Annovar Analysis

Tutorial: Comparing Multiple Samples

Method-Based Tutorials That May Help Regardless of Sample Type

Tutorial: Genome Assembly Using Velvet

Tutorial: Exome Capture Metrics

Tutorial: Error Correction (Molecular Indexing)

Thursday, May 26th. Day 4 – User-specific tutorials (continued) and TACC the normal way

The first half of today's class continues the tutorials that you are most interested in. We have added some new tutorials to the above sections, and we have also introduced some tutorials here that are more methodology based and less sample based. Choose your own tutorial, and please don't hesitate to ask us which tutorials would be good for you to work on given your data!

The second half of today's class will go over how to do things the normal way on TACC, which means using the job submission system and commands files.

Presentation: GVA2016_review.pdf

Tutorial: Job Submissions