...
Warning | ||
---|---|---|
When following along here, please start an idev session for running any example commands:
|
Illumina sequence data format (FASTQ)
GSAF gives you paired-end sequencing data in two matching FASTQ-format files, containing the reads for each end sequenced, for example Sample_ABC_L005_R1.cat.fastq and Sample_ABC_L005_R2.cat.fastq. Each read end sequenced is represented by a 4-line entry in the FASTQ file.
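To get a feel for the 4-line entry, here is a minimal sketch that builds a tiny toy FASTQ file and prints its first record. The temporary file, read names, bases and quality strings are all made up for illustration and are not part of the course data.

```shell
# Hypothetical toy file: read names, bases and qualities are invented for illustration
tmpfq=$(mktemp)
printf '%s\n' \
  '@read1' 'ACGTACGT' '+' 'IIIIHHHH' \
  '@read2' 'TTGCAAGC' '+' 'IIIIFFFF' > "$tmpfq"

# One read = 4 lines: @name, bases, a '+' separator line, per-base quality characters
first_record=$(head -n 4 "$tmpfq")
echo "$first_record"
```

The same `head -n 4` call should work on a real FASTQ file if you want to inspect its first read.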
...
Code Block | ||
---|---|---|
| ||
cds
mkdir my_rnaseq_course #this is where you'll be doing all the course exercises
cd my_rnaseq_course
cp -r /corral-repl/utexas/BioITeam/rnaseq_course_2015/fastqc_exercise .
cd fastqc_exercise
ls data |
...
Expand | ||
---|---|---|
| ||
The wc -l command says there are 16,000,000 lines. FASTQ files have 4 lines per sequence, so the file has 16,000,000 / 4 = 4,000,000 sequences.
|
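The division can also be done directly in the shell. The sketch below is only an illustration under stated assumptions: it uses a made-up two-read toy file rather than the course data, and the variable names are hypothetical.

```shell
# Hypothetical toy file with 2 reads (8 lines); contents are invented for illustration
tmpfq=$(mktemp)
printf '%s\n' \
  '@read1' 'ACGTACGT' '+' 'IIIIHHHH' \
  '@read2' 'TTGCAAGC' '+' 'IIIIFFFF' > "$tmpfq"

# Redirecting the file into wc -l prints only the number, without the filename
lines=$(wc -l < "$tmpfq")

# 4 lines per sequence, so integer-divide the line count by 4
reads=$(( lines / 4 ))
echo "lines: $(( lines ))  reads: $reads"

# The same arithmetic for the 16,000,000-line file discussed above
big_reads=$(( 16000000 / 4 ))
echo "$big_reads"
```

To apply this to a real file, replace `"$tmpfq"` with the path to your FASTQ file.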
Let's move on to assessing the quality of this data...