Prokka Annotations -- GVA2022
Overview
Learning Objectives
Installing Prokka
Conda installation
conda create -n GVA-prokka -c conda-forge -c bioconda -c defaults prokka
conda activate GVA-prokka
Check for correct installation and review the command-line options
prokka --version
prokka --listdb
prokka --help
Note the somewhat novel '--listdb' call. Since prokka works largely by comparing sequences to reference databases, knowing which references it has access to is as important as having the program itself working. In such situations, the program and its associated databases may be updated independently.
prokka 1.14.6
Looking for databases in: /work2/01821/ded/stampede2/miniconda3/envs/GVA-prokka/db
* Kingdoms: Archaea Bacteria Mitochondria Viruses
* Genera: Enterococcus Escherichia Staphylococcus
* HMMs: HAMAP
* CMs: Archaea Bacteria Viruses
The --help command should give a list of options, many of which will look familiar by now.
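Because the available databases can differ between installations, it may help to check for a specific genus database before relying on it. The following is only a minimal sketch: `has_genus` is a hypothetical helper (not part of prokka) that scans a listing for the "Genera:" line, and the `2>&1` redirect is an assumption that prokka may write this listing to stderr rather than stdout.

```shell
# Hypothetical helper: reads a database listing on stdin and succeeds
# if the given genus appears as a whole word on the "Genera:" line.
has_genus() {
    grep 'Genera:' | grep -qw -- "$1"
}

# Only query prokka when it is actually on the PATH.
# Assumption: the listing may go to stderr, hence 2>&1.
if command -v prokka >/dev/null 2>&1; then
    if prokka --listdb 2>&1 | has_genus Escherichia; then
        echo "Escherichia genus database is available"
    else
        echo "Escherichia genus database not found"
    fi
fi
```

With the listing shown above, a check for Escherichia would be expected to succeed, while a genus that is not listed would fall through to the second message.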
Get Some Data
If you have already run the SPAdes tutorial for assembling full bacterial genomes from simulated reads, it is recommended that you use one or more of the resulting sets of assembled contigs.
mkdir $SCRATCH/GVA_Prokka
cd $SCRATCH/GVA_Prokka
cp ../GVA_SPAdes_tutorial
Running Prokka
Using the prokka --help command, what options seem particularly useful or important to you?
For our example, we will leave proteins, evalue, and coverage all at their defaults, making our command rather simple.
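To see what those defaults are in your own installation, one option is to filter the help text down to just these three options. This is a rough sketch: `show_search_options` is a hypothetical helper name, and `2>&1` is included on the assumption that some of the help text might be written to stderr.

```shell
# Hypothetical helper: keep only the help lines that mention
# the --proteins, --evalue, or --coverage options.
show_search_options() {
    grep -E -- '--(proteins|evalue|coverage)'
}

# Only call prokka when it is actually on the PATH.
if command -v prokka >/dev/null 2>&1; then
    prokka --help 2>&1 | show_search_options
fi
```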
Try to determine the command yourself before comparing against one reasonable solution:
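Prokka creates the output directory itself and stops with an error if it already exists (unless --force is given), so no mkdir is needed beforehand.
prokka --outdir gene_annotations --prefix mygenome contigs.fa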
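Once the run finishes, a quick tally of annotated feature types is a reasonable sanity check. The sketch below assumes prokka wrote a tab-separated summary at gene_annotations/mygenome.tsv with one header row and the feature type in the second column; `count_feature_types` is a hypothetical helper, and the column layout should be confirmed against your own output before trusting the counts.

```shell
# Hypothetical helper: count features by type in a Prokka-style TSV.
# Assumption: tab-separated, one header row, feature type in column 2.
count_feature_types() {
    awk -F'\t' 'NR > 1 { n[$2]++ } END { for (t in n) print t, n[t] }' "$1" | sort
}

# Path follows the --outdir and --prefix values used in the command above.
tsv=gene_annotations/mygenome.tsv
if [ -f "$tsv" ]; then
    count_feature_types "$tsv"
fi
```

Each output line gives a feature type followed by the number of rows of that type, which can be compared across different assemblies of the same genome.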