Genome assembly is an important part of the analysis, but because it builds a reference file that will be used many times, it makes more sense to install the assembler in its own environment. Other tools that could reasonably share this environment are read preprocessing tools, in particular adapter removal tools such as trimmomatic.

Code Block

language	bash

conda create --name GVA-SPAdes -c bioconda spades

Testing SPAdes installation

...

Line number	As is	To be
16	#SBATCH -J jobName	#SBATCH -J spades
17	#SBATCH -n 1	#SBATCH -n 4
18	#SBATCH -N 1	#SBATCH -N 4
21	#SBATCH -t 12:00:00	#SBATCH -t 104:0030:00
22	##SBATCH --mail-user=ADD	#SBATCH --mail-user=<YourEmailAddress>
23	##SBATCH --mail-type=all	#SBATCH --mail-type=all
27	conda activate GVA2021	conda activate GVA-SPAdes
31	export LAUNCHER_JOB_FILE=commands	export LAUNCHER_JOB_FILE=spades_commands
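
The same edits can be applied without opening an editor. The following is only a sketch: it assumes the template contains the "As is" lines exactly as listed in the table, and the file names `template.slurm` and `spades.slurm` are hypothetical stand-ins created so the example runs on its own. Only a subset of the table is covered; the run time (`-t`), node count (`-N`) and mail lines are left for manual editing.

```shell
# Work in a scratch directory so no real file is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for the real template (assumption: your copy has these lines verbatim).
cat > template.slurm <<'EOF'
#SBATCH -J jobName
#SBATCH -n 1
conda activate GVA2021
export LAUNCHER_JOB_FILE=commands
EOF

# Apply part of the "As is" -> "To be" table.
# Anchoring with ^ and $ only changes lines that match in full.
sed -e 's/^#SBATCH -J jobName$/#SBATCH -J spades/' \
    -e 's/^#SBATCH -n 1$/#SBATCH -n 4/' \
    -e 's/^conda activate GVA2021$/conda activate GVA-SPAdes/' \
    -e 's/^export LAUNCHER_JOB_FILE=commands$/export LAUNCHER_JOB_FILE=spades_commands/' \
    template.slurm > spades.slurm

# Review what changed before submitting anything (diff exits non-zero when files differ).
diff template.slurm spades.slurm || true
```

Writing to a new file rather than editing in place keeps the original template available for comparison.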

...

Code Block

language	bash
title	Example grep commands

# Count the total number of contigs:
grep -c "^>" single_end/contigs.fasta

# Determine the length of the 5 largest contigs:
grep "^>" single_end/contigs.fasta | head -n 5

# Determine the length of the 20 smallest contigs:
grep "^>" single_end/contigs.fasta | tail -n 20

# Determine the length of the 100th through 110th contigs (11 lines):
grep "^>" single_end/contigs.fasta | head -n 110 | tail -n 11
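
The grep commands above print whole header lines; the lengths still have to be read off by eye. The sketch below pulls the numbers out directly. It assumes SPAdes-style header names of the form `>NODE_1_length_12_cov_20.5` (check this against your own `contigs.fasta`), and the three-contig FASTA file is invented purely so the example runs on its own.

```shell
# Made-up miniature assembly in a scratch directory (not real data).
workdir=$(mktemp -d)
cat > "$workdir/contigs.fasta" <<'EOF'
>NODE_1_length_12_cov_20.5
ACGTACGTACGT
>NODE_2_length_8_cov_18.2
ACGTACGT
>NODE_3_length_4_cov_3.1
ACGT
EOF

# Count the contigs (one header line per contig).
n_contigs=$(grep -c "^>" "$workdir/contigs.fasta")

# Keep only the length: with "_" as the delimiter it is the 4th field of the header.
lengths=$(grep "^>" "$workdir/contigs.fasta" | cut -d "_" -f 4)

# Largest contig, sorted numerically instead of relying on the order in the file.
largest=$(echo "$lengths" | sort -n -r | head -n 1)

# Total assembled bases, summed from the header lengths.
total=$(echo "$lengths" | awk '{ sum += $1 } END { print sum }')

echo "contigs: $n_contigs  largest: $largest  total bases: $total"
```

On a real assembly, replace `"$workdir/contigs.fasta"` with a path such as `single_end/contigs.fasta`.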

Since you ran multiple different combinations of reads for the simulated data, how did the insert size affect the number of contigs? The length of the largest contigs? Why might larger insert sizes not help very much?
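
One way to approach these questions is to tabulate every assembly side by side. The loop below is a rough sketch under stated assumptions: each assembly sits in its own directory containing a `contigs.fasta`, headers follow the `>NODE_1_length_8_cov_10.0` pattern, and `paired_end_example` is a hypothetical directory name. The tiny FASTA files are invented so the example runs on its own.

```shell
# Stand-in assemblies in a scratch directory (not real data).
workdir=$(mktemp -d)
mkdir -p "$workdir/single_end" "$workdir/paired_end_example"
printf '>NODE_1_length_8_cov_10.0\nACGTACGT\n>NODE_2_length_4_cov_9.0\nACGT\n' \
    > "$workdir/single_end/contigs.fasta"
printf '>NODE_1_length_12_cov_10.0\nACGTACGTACGT\n' \
    > "$workdir/paired_end_example/contigs.fasta"

# One tab-separated row per assembly: directory name, contig count, longest contig.
summary=$(
  for fasta in "$workdir"/*/contigs.fasta; do
    name=$(basename "$(dirname "$fasta")")
    count=$(grep -c "^>" "$fasta")
    longest=$(grep "^>" "$fasta" | cut -d "_" -f 4 | sort -n -r | head -n 1)
    printf '%s\t%s\t%s\n' "$name" "$count" "$longest"
  done
)

printf 'assembly\tcontigs\tlongest\n%s\n' "$summary"
```

Fewer contigs together with a longer largest contig generally indicates a more contiguous assembly, so comparing the rows shows how much each read combination helped.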

...

Versions Compared

Old Version 2

New Version Current
