Special stampede login
Before we start, log into ls5 like you did yesterday, but use this special hostname:
login5.ls5.tacc.utexas.edu
Normally you should not perform significant computation on login nodes, since they are shared by all users in the TACC community. There are a few exceptions, however, and login5.ls5.tacc.utexas.edu is one of them. It is a dedicated login node owned by CSSB and CBRS, and we have given you access to it for the duration of this course. This will let us do a few things at the command line that would normally set off alarm bells with the TACC folks if we all did them on a standard login node.
Data staging
First log in to stampede2 like you did yesterday.
Now we'll set ourselves up to process some yeast data in $SCRATCH, using some best practices for organizing our workflow.
...
Exercise: What character in the quality score string in the FASTQ entry above represents the best base quality? Roughly what is the error probability estimated by the sequencer?
J is the best base quality score character (Q=41). It represents an error probability of < 1/10^4, i.e. less than 1 in 10,000.
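The arithmetic behind that answer can be sketched in a few lines of shell. This assumes the common Phred+33 encoding (quality = ASCII code minus 33), which is consistent with J mapping to Q=41; the variable names are only illustrative.

```shell
# Decode one quality character, assuming Phred+33 encoding (an assumption;
# older Illumina pipelines used a different offset).
qual_char='J'

# A leading single quote makes printf emit the character's ASCII code (J -> 74)
ascii=$(printf '%d' "'$qual_char")

# Subtract the Phred+33 offset to get the quality score (74 - 33 = 41)
q=$((ascii - 33))

# Error probability is 10^(-Q/10)
p_err=$(awk -v q="$q" 'BEGIN { printf "%.2e", 10^(-q/10) }')

echo "char=$qual_char  Q=$q  P(error)=$p_err"
```

Changing `qual_char` to another character from a quality string shows how quickly the estimated error probability grows as the score drops.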
About compressed files
Sequencing data files can be very large - from a few megabytes to gigabytes. And with NGS giving us longer reads and deeper sequencing at decreasing price points, it's not hard to run out of storage space. As a result, most sequencing facilities will give you compressed sequencing data files.
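To get a feel for how much space compression can save, here is a minimal sketch that assumes the files are gzip-compressed (the usual case for `.gz` sequencing files, but an assumption here). The file `demo.fq` and its contents are made up for illustration; real reads are less repetitive and will compress less dramatically.

```shell
# Build a small, hypothetical FASTQ-like file (2000 four-line records)
for i in $(seq 1 2000); do
  printf '@read%d\nACGTACGTACGT\n+\nJJJJJJJJJJJJ\n' "$i"
done > demo.fq

orig_size=$(wc -c < demo.fq)

# -c writes the compressed data to stdout, so the original file is kept
gzip -c demo.fq > demo.fq.gz
gz_size=$(wc -c < demo.fq.gz)

echo "uncompressed: $orig_size bytes, compressed: $gz_size bytes"

# Peek at the first record without creating an uncompressed copy on disk
first_record=$(gunzip -c demo.fq.gz | head -4)
echo "$first_record"
```

Streaming with `gunzip -c` into another command is often preferable to uncompressing in place, since the large uncompressed file never has to be stored.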
...
FASTQs are ~150 MB
...
If you start less with the -N option, it will display line numbers.
Exercise: What line of small.fq contains the read name with grid coordinates 2316:10009:100563?
...
```shell
# shows 1st 10 lines
head small.fq

# shows 1st 100 lines -- might want to pipe this to more to see a bit at a time
head -100 small.fq | more
```
So what if you want to see line numbers on your head or tail output? Neither command seems to have an option to do this.
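One possible approach, offered as a sketch rather than the course's own answer, is to number the lines with `cat -n` first and then send the numbered text through `head` or `tail`. The file `demo.txt` below is a hypothetical stand-in so the example runs anywhere; on the course system you would substitute small.fq.

```shell
# Create a 20-line stand-in file (hypothetical; use small.fq in practice)
seq 1 20 | sed 's/^/record /' > demo.txt

# cat -n prefixes every line with its line number;
# head then keeps only the first 3 numbered lines
cat -n demo.txt | head -3

# The same idea with tail: the numbers still refer to positions in the full file
numbered_tail=$(cat -n demo.txt | tail -3)
echo "$numbered_tail"
```

Because the numbering happens before the trimming, the numbers shown by `tail` are true line numbers in the original file, not positions within the last few lines.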
Piping
So what is that vertical bar ( | ) all about? It is the pipe symbol!
...