...
Code Block | ||||
---|---|---|---|---|
| ||||
#we have two fastq files
ls -lh *.fastq
#take a look at the top of the files
less Lib01_R1.fastq
#how long are the reads?
#type 'q' to exit
#how many lines are in each fastq file?
wc -l Lib01*.fastq
#look at how many reads there are (note that fastq files generally have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
|
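If scrolling through `less` does not make the read lengths obvious, they can also be tallied directly. The following is a minimal sketch, assuming each record occupies exactly 4 lines (so the sequence is always the 2nd line of a record); `read_length_counts` is a hypothetical helper name, not part of the exercise.

```shell
# Hypothetical helper: tally how many reads there are of each length.
# Assumes 4 lines per record, so sequences sit on lines 2, 6, 10, ...
read_length_counts() {
    awk 'NR % 4 == 2 { count[length($0)]++ }
         END { for (len in count) print len, count[len] }' "$1" | sort -n
}
```

For example, `read_length_counts Lib01_R1.fastq` would print one line per distinct read length, followed by the number of reads of that length.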
Paired-end read files should have matching read names, read for read.
...
Code Block | ||||
---|---|---|---|---|
| ||||
#this can be checked many ways, here is one option:
#search for lines that start with @ symbol (the read definition lines in our fastqs)
#(note: a quality line can also start with @, so treat this as a quick check, not an exact one)
#and save the top 10 of them as a text file.
#Repeat for the R2 reads.
#Then line them up
grep "^@" Lib01_R1.fastq | head > first_10_R1_read_names.txt
grep "^@" Lib01_R2.fastq | head > first_10_R2_read_names.txt
paste first_10_R1_read_names.txt first_10_R2_read_names.txt
|
What if you wanted to repeat the above solution to check the bottom of the files as well?
Code Block | ||||
---|---|---|---|---|
| ||||
grep "^@" Lib01_R1.fastq | tail > last_10_R1_read_names.txt
grep "^@" Lib01_R2.fastq | tail > last_10_R2_read_names.txt
paste last_10_R1_read_names.txt last_10_R2_read_names.txt
|
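Checking the top and bottom of the files only samples 20 reads. A whole-file comparison is one possible extension, sketched below under a few assumptions: each record is exactly 4 lines, the read name is the first whitespace-delimited field of the header line, and any trailing `/1` or `/2` should be ignored. The names `read_names` and `count_name_mismatches` are hypothetical helpers. Picking header lines by position rather than with `grep "^@"` avoids counting quality lines that happen to start with `@`.

```shell
# Hypothetical helper: print one read name per record.
# Header lines are lines 1, 5, 9, ... (assumes 4 lines per record).
read_names() {
    awk 'NR % 4 == 1 { name = $1; sub(/\/[12]$/, "", name); print name }' "$1"
}

# Hypothetical helper: count records whose names differ between two files.
# If one file is shorter, the unpaired records count as mismatches.
count_name_mismatches() {
    tmp=$(mktemp)
    read_names "$2" > "$tmp"
    read_names "$1" | paste - "$tmp" |
        awk -F'\t' '$1 != $2 { n++ } END { print n + 0 }'
    rm -f "$tmp"
}
```

For the files in this exercise it could be called as `count_name_mismatches Lib01_R1.fastq Lib01_R2.fastq`; a result of 0 would mean every pair of names agrees under the assumptions above.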
Based on the library preparation, we expect that our forward reads were cut with the restriction enzyme NlaIII (NLA3).
...