...
Assembling even small bacterial genomes can be very time intensive (as well as memory intensive, as highlighted above). Fortunately for this class, we can use the plasmidSPAdes option to assemble an even smaller plasmid genome, ~2000 bp long, in only a few minutes. I suggest analyzing these data on an idev node and then submitting the bacterial genome analyses as a job to run overnight.
Data
Note: As mentioned yesterday, you cannot copy from the BioITeam (because it is on corral-repl) while on an idev node. Log out of your idev session, then copy the files.
Download the paired-end fastq files, which have already had their adapters trimmed, from the $BI/gva_course/Assembly/ directory.
...
Once you have figured out which options you need, see if you can come up with a command that runs on the paired-end reads and writes its output to a new directory called plasmid, using all 48 cores available on your idev node (-t 48). The following command is expected to take less than 2 minutes.
...
```bash
plasmidspades.py -t 48 -o plasmid -1 SH1_1P.fastq.gz -2 SH1_2P.fastq.gz
```
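If you are not sure how many cores your idev node exposes, one option is to ask the node instead of hard-coding the number. The sketch below assumes the coreutils `nproc` command is available on the node. `THREADS` and `PLASMID_CMD` are illustrative variable names, not part of SPAdes. It only prints the resulting command so you can review it before running it yourself.

```bash
# Ask the node how many processing units are available (assumes coreutils nproc).
THREADS=$(nproc)

# Build the plasmidSPAdes command with that thread count and print it for review.
PLASMID_CMD="plasmidspades.py -t ${THREADS} -o plasmid -1 SH1_1P.fastq.gz -2 SH1_2P.fastq.gz"
echo "${PLASMID_CMD}"
```

The same `THREADS` value can be reused for the `spades.py` commands later on this page.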
...
Here we will look at 4 sets of data prepared under different library preparation conditions to evaluate how wet lab decisions influence outcomes on the computer. Some of the text here is very similar or identical to that in set 1, in case people choose to skip directly to it.
Data
Note: As mentioned yesterday, you cannot copy from the BioITeam (because it is on corral-repl) while on an idev node. Log out of your idev session, then copy the files.
```bash
# You likely already created this directory when you ran the selftest; -p makes rerunning harmless
mkdir -p $SCRATCH/GVA_SPAdes_tutorial
cp $BI/ngs_course/velvet/data/*/* $SCRATCH/GVA_SPAdes_tutorial
cd $SCRATCH/GVA_SPAdes_tutorial
```
...
Once you have figured out which options you need, see if you can come up with a command that runs on the single-end reads and writes its output to a new directory called single_end, using all 48 threads that are available (-t 48).
```bash
spades.py -t 48 -o single_end -s single_end_100_c_50.fastq
```
...
```bash
spades.py -t 48 -o 400_1500_3000 --12 paired_end_2x100_ins_400_c_50.fastq --12 paired_end_2x100_ins_1500_c_20.fastq --12 paired_end_2x100_ins_3000_c_25.fastq
spades.py -t 48 -o 400_and_1500 --12 paired_end_2x100_ins_400_c_50.fastq --12 paired_end_2x100_ins_1500_c_20.fastq
spades.py -t 48 -o 400_only --12 paired_end_2x100_ins_400_c_50.fastq
```
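Once the assemblies finish, a quick first comparison is to count how many contigs each run produced. SPAdes writes its contigs to `contigs.fasta` inside the output directory. The helper name `count_contigs` below is hypothetical. The loop assumes you run it from the directory that contains the output directories named in the commands above. Treat it as a rough sketch, not a full assembly evaluation.

```bash
# Hypothetical helper: count FASTA records (header lines starting with '>').
count_contigs() {
    grep -c '^>' "$1"
}

# Report contig counts for each assembly directory that has finished.
for dir in single_end 400_only 400_and_1500 400_1500_3000; do
    if [ -f "${dir}/contigs.fasta" ]; then
        echo "${dir}: $(count_contigs "${dir}/contigs.fasta") contigs"
    fi
done
```

Directories whose assembly has not finished yet (no `contigs.fasta`) are skipped silently, so the loop can be rerun as jobs complete.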
...