...
Anna Battenhouse, Associate Research Scientist, abattenhouse@utexas.edu
- BA English literature, 1978
- Commercial software development 1982 – 2007
- Joined Iyer Lab 2007 (“retirement career”)
- BS Biochemistry, UT Austin, 2013
- Joined the Biomedical Research Support Computing Facility (BRCF) and Marcotte Lab summer 2017
- Also affiliated with
Daryl Barth, daryl.barth@utexas.edu
- BS Materials Science & Engineering, UC Berkeley, 2017
- Student Researcher in France and Portugal 2017 - 2018
- Research Assistant in single cell genomics, UT Southwestern 2019-2021
- 4th year graduate student in the Marcotte Lab
Research Interests: biomaterials, developmental biology, and bioinformatics
...
Dr. Vishy Iyer, PI
Main focus is functional genomics
Research methods include
...
Online attendees can also post questions to the Zoom chat. We'll sometimes use breakout rooms to troubleshoot problems you run into. If so, TA Daryl Barth will assign you to one, where you can get assistance.
Getting help
Since most folks are new to the Linux command line, we expect you to run into problems! Please let us know if you're having difficulties!
...
We intend this course to offer as much self-learning as possible. Consequently, you'll find many sections like this - click on the triangle to expand them:
Hint sections will provide you some guidance on what to do next, but will not spell it out.
and some sections like this:
Solution sections will contain the
...
- Hands-on, tutorial style – learn by doing
- Common bioinformatics tools & file formats
- Introduce NGS vocabulary
- both high-level view and practice with specific tools
- Cover the NGS basics
- The first few things you'll do after receiving raw sequences
- raw sequence QC and preparation
- alignment to reference
- basic alignment analysis
- Understand and practice required skills
- Get you comfortable with Linux and TACC – your best "frenemies"
- Make you self-sufficient enough in 5 days to become experts over time
- Show some "best practices" for working with NGS data
...
Large and growing datasets
...
- yeast: 5 – 20 million reads
- human: 20 – 250 million reads (~5 - 8 million for TagSeq)
- single end (SE) or paired end (PE), length 50 – 300 bases (100 or 150 typical)
The initial FASTQ files are big (100s of MB to GB) – and they're just the start.
- Organization and naming conventions are critical.
- Your data can get out of hand very quickly!
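To get a feel for these numbers, here is a minimal Python sketch that counts the reads in a FASTQ file and roughly estimates its uncompressed size. It assumes the common four-lines-per-read FASTQ layout. The function names and the `header_len=50` default are hypothetical choices made for illustration, not part of the course materials.

```python
import gzip
from pathlib import Path


def open_fastq(path):
    """Open a FASTQ file as text, handling .gz compression transparently."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_reads(path):
    """Count reads, assuming each read occupies exactly 4 lines."""
    with open_fastq(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def estimate_fastq_bytes(n_reads, read_len, header_len=50):
    """Rough uncompressed size of a FASTQ file in bytes.

    header_len=50 is an illustrative guess at the read-name line length.
    """
    # Per read: name line, bases, '+' line, qualities; each ends in a newline.
    per_read = (header_len + 1) + (read_len + 1) + 2 + (read_len + 1)
    return n_reads * per_read


if __name__ == "__main__":
    # 20 million 100-base reads, the low end of the human range above
    print(estimate_fastq_bytes(20_000_000, 100))  # 5100000000 bytes, about 5.1 GB
```

Under these assumptions, even a modest human dataset is several GB before compression. FASTQ files are usually stored gzipped, which shrinks them considerably, but the estimate shows why storage and organization need attention from the start.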
Progression of Iyer Lab datasets over time:
...