Demultiplexing ddRAD
Demultiplexing ddRAD data with Stacks exercise:
Check out the reads to make sure they make sense:
check out reads
#navigate to the exercise directory cd intro_to_rad_2018/demultiplexing/process_ddRAD_stacks #we have two fastq files: Lib01_R1.fastq and Lib01_R2.fastq ls *.fastq #These files have paired end reads (two separate reads from either end of the same DNA fragment) #Double-check that the files have the same number of reads. (note that fastq genreally files have 4 lines per read) expr $(cat Lib01_R1.fastq | wc -l) / 4 expr $(cat Lib01_R2.fastq | wc -l) / 4 #53042 #(Note these are for demo purposes and shorter than they should be) #Looking up nlaIII, we see it's cut site: # CATG' # 'GTAC #check if we see this cut site in the forward reads grep CATG Lib01_R1.fastq | wc -l #compare this with the total number of reads we got above #do these numbers make sense? #look at where in the reads the cut site is (may help to paste into a text editor and character search) grep CATG Lib01_R1.fastq | head -n 30
Based on the library preparation, we expect that our forward reads were cut with the restriction enzyme nlaIII (NLA3).
Check that the forward reads fit this expectation.
Hint: look up restriction site and use grep
process the reads using process_radtags (part of the Stacks package):
process tags
#make a directory to put the resulting sample fastq files into mkdir sample_fastqs #look at the documentation for process_radtags ./process_radtags -h #execute the command to process the rad data ./process_radtags -i 'fastq' -1 Lib01_R1.fastq -2 Lib01_R2.fastq -o ./sample_fastqs/ -b barcodes_Lib1.tsv --inline_index -e 'nlaIII' -r --disable_rad_check
Check the results:
check results
#how many barcode combinations did we have? #First look at the barcodes file: cat barcodes_Lib1.tsv #then count the lines cat barcodes_Lib1.tsv | wc -l #how r1 and r2 fastq files did we output (ignore the 'rem' files) ls sample_fastqs/*AGCGAC.1.fq ls sample_fastqs/*AGCGAC.1.fq ls sample_fastqs/*AGCGAC.1.fq | wc -l ls sample_fastqs/*AGCGAC.1.fq | wc -l #are the paired end files still the same length? cat sample_fastqs/sample_CATAT-AGCGAC.1.fq | wc -l cat sample_fastqs/sample_CATAT-AGCGAC.2.fq | wc -l
Welcome to the University Wiki Service! Please use your IID (yourEID@eid.utexas.edu) when prompted for your email address during login or click here to enter your EID. If you are experiencing any issues loading content on pages, please try these steps to clear your browser cache.