Demultiplexing ddRAD data with Stacks exercise:
This exercise should be done on TACC.
First copy over the exercise directoryCheck out the reads to make sure they make sense:
Code Block | ||||
---|---|---|---|---|
| #start an idev session if you have not already
idev
#go to coure directory on scratch
cds
cd rad_intro/
#copy over the demultiplexing directory
cp -r /work/02260/grovesd/lonestar/
| |||
#navigate to the exercise directory cd intro_to_rad_20172018/demultiplexing/process_ddRAD_stacks . #go to the demultiplexing with stacks exercise directory cd process_ddRAD_stacks |
Check out the reads to make sure they make sense:
Code Block | ||
---|---|---|
| ||
#we have two fastq files: Lib01_R1.fastq and Lib01_R2.fastq ls *.fastq #These files arehave paired end reads, (two separate reads from either end of the same DNA fragment) #Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read) expr $(cat Lib01_R1.fastq | wc -l) / 4 expr $(cat Lib01_R2.fastq | wc -l) / 4 #7453585#53042 #(Note these are for demo purposes and shorter than they should be) #this ddRAD library was prepared so that forward reads were cut with the nlaIII restriction enzyme. #Looking up nlaIII, we see it's cut site: # CATG' # 'GTAC #check if we see this cut site in the forward reads grep CATG Lib01_R1.fastq | wc -l #compare this with the total number of reads we got above #do these numbers make sense? #look at where in the reads the cut site is (may help to paste into a text editor and character search) grep CATG Lib01_R1.fastq | head -n 30 |
Based on the library preparation, we expect that our forward reads were cut with the restriction enzyme nlaIII (NLA3).
Check that the forward reads fit this expectation.
Hint: look up restriction site and use grep
Code Block | ||||
---|---|---|---|---|
| ||||
#Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read)
expr $(cat Lib01_R1.fastq | wc -l) / 4
expr $(cat Lib01_R2.fastq | wc -l) / 4
#53042
#(Note these are for demo purposes and shorter than they should be)
#Looking up nlaIII, we see it's cut site:
# CATG'
# 'GTAC
#check if we see this cut site in the forward reads
grep CATG Lib01_R1.fastq | wc -l
#compare this with the total number of reads we got above |
process the reads using process_radtags (part of the Stacks package):
...