Genome Variant Analysis Course 2017
Course Overview
We will be meeting daily in MEZ 4.136 (http://www.utexas.edu/maps/main/buildings/mez.html). If you have any trouble finding the room or building, please check your email for contact information and directions.
The course is built around two ~90-minute sections per day for 4 days, with the goal of teaching you how to perform standard next-generation sequencing analysis to identify genomic variants. This will be accomplished through presentations covering information essential to all types of analysis, guided tutorials to reinforce the essential concepts, and self-guided tutorials to help you learn the skills most specific to your own analysis. By the end of this course, we hope to achieve the following goals:
- Teach you the different ways next-generation sequencing libraries are constructed, and the advantages and disadvantages of each type.
- Familiarize you with how the Texas Advanced Computing Center (TACC) can be used to simplify and speed up your data analysis.
- Teach you the basics of read mapping in both individuals and populations, and of identifying variants within individuals and rare variants within populations.
- Provide reference materials broad enough to give you a starting point for your own data analysis, and enough experience that you can begin that analysis on your own.
Your Instructors
Name | Initials | Affiliation | Expertise |
---|---|---|---|
Daniel Deatherage | DD | Barrick Lab | Unix, Python, NGS Library Prep, Capture, Rare Variant Identification |
Dacia Leon | DL | Barrick Lab | R, NGS Library Prep |
A nod to the past
We wish to acknowledge a great deal of help in creating these web pages and materials from previous instructors of the Intro to NGS Bioinformatics course taught in 2013 and the Genome Variant Analysis Course taught in 2014-2016 (feel free to look through old materials at any point). Two individuals warrant special mention: Scott Hunicke-Smith, the former director of the GSAF, and Jeffrey Barrick were the driving force behind this class for a number of years, and many of the tutorials presented here were developed by them or adapted from their work.
Course Schedule
Monday, May 22nd. Day 1 – "The Basics"
Presentation: General Course Introduction
Tutorial: Introduction to Linux and Lonestar5
Presentation: Experimental Design
Tutorial: Evaluating raw sequencing data
Tuesday, May 23rd. Day 2 – "Principles of Variant Calling"
Presentation: Read Mapping
Tutorial: Using Bowtie2 to map reads
Presentation: Single Nucleotide Variant Calling
Presentation: Structural Variant Calling
Tutorial: Using samtools to identify SNVs
Tutorial: Using SVDetect to identify SVs
Bonus Presentation: Read Mapping Details and File formats
Wednesday, May 24th. Day 3 – Visualization and User-specific tutorials
Presentation: Errors - where do they come from and how do we identify them as noise rather than signal?
Bonus Presentation: Alternative Library Prep Methods - for when errors really do matter.
Tutorial: Visualization: Bacterial genome variants the easiest way – breseq
Tutorial: Visualization: Integrative Genomics Viewer (IGV) Tutorial
Now that you have completed the above tutorials, you have acquired the skills necessary for basic genome sequence analysis and can identify real genetic variants amid the noise of genomic sequencing. The next step is learning additional skills to better suit your own analysis. Complete the remaining tutorials based on what you think would be most helpful to your specific analysis. We've divided them into broad categories and tried to explain the purpose of each tutorial, but if you are unsure, just ask.
Bacteria-Centric Tutorials
Tutorial: Advanced Breseq
Tutorial: Evaluating Error Correction Using Breseq
Human- and Higher Eukaryote-Centric Tutorials
Tutorial: Human Trios Analysis
Tutorial: Annovar Analysis
Tutorial: Comparing Multiple Samples
Method-based Tutorials that may be of help regardless of sample type
Tutorial: Genome Assembly
Tutorial: Exome Capture Metrics
Tutorial: Error Correction (Molecular Indexing)
Thursday, May 25th. Day 4 – User-specific tutorials (continued) and TACC the normal way
The first half of today's class will continue with the tutorials that you are most interested in. As was the case yesterday, choose your own tutorial, and please don't hesitate to ask us which tutorials would be good for you to work on given your data! After the break, we will go over a brief review to put things back in perspective and give you a tutorial on how to do things the 'normal way' on TACC, which means using the job submission system and commands files. You will then have the rest of the time to go through tutorials and ask any remaining questions.