...
Code Block |
---|
title | solution |
---|
collapse | true |
---|
|
#navigate to the directory
cd process_2bRAD
#look at our reads
ls *fastq.gz
#decompress them
gunzip *.gz
#We have two groups, A and B, that are labeled with
#barcode sequences CTGCAG and GAAGTT respectively.
#These groups represent pools of samples that all had
#the same barcode (attached during PCR step in library prep)
#Each group has two files, labeled L004 and L005, for lanes 4 and 5 of the sequencing run
#concatenate the lane 4 and lane 5 data for each pool
cat A_CTGCAG_L004_R1_001.fastq A_CTGCAG_L005_R1_001.fastq > A_CTGCAG_R1.fastq
cat B_GAAGTT_L004_R1_001.fastq B_GAAGTT_L005_R1_001.fastq > B_GAAGTT_R1.fastq
#Look at the barcode data to get an idea of what these fastq files are
#This is the same information that is uploaded to GSAF when
#samples are submitted for sequencing.
less barcode_data.tsv
|
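The lane concatenation above can also be written as a loop, with a read-count check to confirm nothing was lost. This is a minimal, self-contained sketch: it builds tiny placeholder lane files in a temporary directory (the reads are invented for the demo), and `count_reads` is a hypothetical helper name, not part of the 2bRAD scripts.

```shell
#!/usr/bin/env bash

# count_reads (hypothetical helper): a FASTQ record is 4 lines, so reads = lines / 4
count_reads() {
  echo $(( $(cat "$@" | wc -l) / 4 ))
}

# Build invented lane files in a scratch directory so the sketch runs anywhere
workdir=$(mktemp -d)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$workdir/A_CTGCAG_L004_R1_001.fastq"
printf '@r3\nGGCA\n+\nIIII\n' > "$workdir/A_CTGCAG_L005_R1_001.fastq"
printf '@r4\nCCAT\n+\nIIII\n' > "$workdir/B_GAAGTT_L004_R1_001.fastq"
printf '@r5\nTGAC\n+\nIIII\n' > "$workdir/B_GAAGTT_L005_R1_001.fastq"

# Concatenate all lanes of each pool, then compare read counts before and after
for pool in A_CTGCAG B_GAAGTT; do
  before=$(count_reads "$workdir/${pool}"_L00?_R1_001.fastq)
  cat "$workdir/${pool}"_L00?_R1_001.fastq > "$workdir/${pool}_R1.fastq"
  after=$(count_reads "$workdir/${pool}_R1.fastq")
  echo "${pool}: ${before} reads in lanes, ${after} after concatenation"
done
```

On real data, drop the scratch-directory setup and run the loop in `process_2bRAD`; the two counts should agree for each pool.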
Compare this with expected final product from library prep:
...
Code Block |
---|
title | solution |
---|
collapse | true |
---|
|
#In the 2bRAD library prep the adapters can be ligated to the fragment in either
#orientation (note the mirrored sticky ends left by BcgI).
#So the genomic fragment may have been read from either direction
#check for reverse complement of cut site
grep "GCA......TCG" A_CTGCAG_R1.fastq | wc -l
grep "GCA......TCG" B_GAAGTT_R1.fastq | wc -l
#331339
#319385
#(Note: these files are doctored for the demo, so the 100% is not quite realistic;
#95% would be fine too.) |
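The grep counts are easier to interpret as a fraction of all reads. The following is a minimal sketch, assuming standard 4-line FASTQ records: it tests only the sequence lines (so header and quality lines can never produce a spurious hit) and counts a read if the site appears in either orientation. The `count_site` function name and the four demo reads are invented for illustration.

```shell
#!/usr/bin/env bash

# count_site (hypothetical helper): print "hits total percent" for one FASTQ file.
# Only line 2 of each 4-line record (the sequence) is tested.
count_site() {
  awk 'NR % 4 == 2 {
         total++
         if ($0 ~ /CGA......TGC/ || $0 ~ /GCA......TCG/) hits++
       }
       END { printf "%d %d %.1f\n", hits, total, (total ? 100 * hits / total : 0) }' "$1"
}

# Invented demo reads: two with the forward site, one with the reverse
# complement, and one with no site at all
demo=$(mktemp)
printf '%s\n' \
  '@read1' 'AAAACGAAAAAAATGCAAAA' '+' 'IIIIIIIIIIIIIIIIIIII' \
  '@read2' 'TTTTGCATTTTTTTCGTTTT' '+' 'IIIIIIIIIIIIIIIIIIII' \
  '@read3' 'ACGTACGTACGTACGTACGT' '+' 'IIIIIIIIIIIIIIIIIIII' \
  '@read4' 'GGGGCGACCCCCCTGCGGGG' '+' 'IIIIIIIIIIIIIIIIIIII' > "$demo"

count_site "$demo"   # prints: 3 4 75.0
```

For the tutorial files, `count_site A_CTGCAG_R1.fastq` would report the same kind of summary for pool A.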
Demultiplexing
Code Block |
---|
|
#The command to run trim2bRAD_2barcodes_dedup.pl is complicated, so
#another script -- 2bRAD_trim_launch_dedup.pl -- is used to generate the command for us:
2bRAD_trim_launch_dedup.pl R1.fastq > dedupCommands
#look at the commands file
cat dedupCommands
#returns our next commands to execute:
trim2bRAD_2barcodes_dedup.pl input=A_CTGCAG_R1.fastq site=".{12}CGA.{6}TGC.{12}|.{12}GCA.{6}TCG.{12}" adaptor="AGATC" sampleID=100
trim2bRAD_2barcodes_dedup.pl input=B_GAAGTT_R1.fastq site=".{12}CGA.{6}TGC.{12}|.{12}GCA.{6}TCG.{12}" adaptor="AGATC" sampleID=100 |
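The `site` pattern in the generated commands is an ordinary regular expression: 12 flanking bases, the recognition site in one of its two orientations, and 12 more flanking bases, for 36 bases in total. It can be tried on a single read with `grep -oE`. This sketch assumes the pattern behaves the same under `grep -E` as it does inside the Perl script, and it shows only what the pattern matches, not what the script does with the match. The read is invented: a 36-base fragment followed by sequence beginning with the `AGATC` adaptor.

```shell
#!/usr/bin/env bash

# Pattern copied from the generated trim command
site='.{12}CGA.{6}TGC.{12}|.{12}GCA.{6}TCG.{12}'

# Invented read: 12 A + CGA + 6 C + TGC + 12 T, then adaptor-like sequence
read_seq='AAAAAAAAAAAACGACCCCCCTGCTTTTTTTTTTTTAGATCGGAAG'

# -o prints only the matched part of the line; -E enables the {n} and | syntax
fragment=$(printf '%s\n' "$read_seq" | grep -oE "$site")

echo "$fragment"      # the matched fragment without the trailing adaptor bases
echo "${#fragment}"   # length of the match
```

The `-o` option is not required by POSIX but is available in both GNU and BSD grep.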