SOAPtrans

Summary

SOAPtrans is short for SOAPdenovo-Trans, the transcriptome version of the SOAPdenovo assembler developed by BGI (Beijing Genomics Institute). It takes reads in FASTQ or FASTA format as input.

Available on

The latest version is here.

User documentation

The documentation for SOAPtrans can be found here. SOAPtrans is derived from SOAPdenovo, so you might also want to check the SOAPdenovo documentation here.

How to run SOAPtrans

1. To run SOAPtrans, first set up a config file that specifies the maximum read length, the insert size, the locations of the read files, and how the reads are to be used.

An example config file for paired-end reads looks like this:

#maximal read length
max_rd_len=100
[LIB]
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#which stages use these reads: 1 = contig assembly only, 2 = scaffolding only, 3 = both (normally we use 3)
asm_flags=3
#fastq file for read 1
q1= <location_of_read1>
#fastq file for read 2
q2= <location_of_read2>
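
As a minimal sketch, the config above could be generated from a small shell function rather than edited by hand. The function name write_config and the file names in the usage comment are hypothetical; the parameter values are simply the ones from the example and should be adjusted to your library.

```shell
# Write a SOAPtrans config file for one paired-end library.
# Usage: write_config <read1_fastq> <read2_fastq> <config_path>
# (write_config is a hypothetical helper, not part of SOAPtrans.)
write_config() {
    cat > "$3" <<EOF
#maximal read length
max_rd_len=100
[LIB]
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#which stages use these reads (normally 3)
asm_flags=3
#fastq file for read 1
q1=$1
#fastq file for read 2
q2=$2
EOF
}

# Example call (hypothetical file names):
#   write_config reads_1.fq reads_2.fq config
```

The resulting file can then be passed to the assembler with -s.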

2. There are two commands for SOAPtrans, SOAPdenovo-Trans-31mer and SOAPdenovo-Trans-127mer. The first is suitable for assemblies with a kmer of up to 31bp; it requires less memory and runs faster. The latter works for kmers of up to 127bp.

The command to run the assembly is:

SOAPdenovo-Trans-127mer all -s config -o <output_prefix> -K <kmer>

3. The assembled contigs are written to a file named <output_prefix>.contig
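
The choice between the two binaries in step 2 can be sketched as a small shell helper. The function name soaptrans_cmd is hypothetical, and the helper only prints the command line instead of running it, so it can be inspected first. It assumes both binaries are on your PATH under the names used above.

```shell
# Print the SOAPtrans command line for a given kmer size.
# Usage: soaptrans_cmd <kmer> <config> <output_prefix>
# (soaptrans_cmd is a hypothetical helper, not part of SOAPtrans.)
soaptrans_cmd() {
    if [ "$1" -le 31 ]; then
        # Small kmers: the 31mer build needs less memory and runs faster.
        bin=SOAPdenovo-Trans-31mer
    elif [ "$1" -le 127 ]; then
        bin=SOAPdenovo-Trans-127mer
    else
        echo "kmer $1 is larger than the 127bp limit" >&2
        return 1
    fi
    echo "$bin all -s $2 -o $3 -K $1"
}

# Print the commands for a small and a large kmer.
soaptrans_cmd 25 config out_k25
soaptrans_cmd 63 config out_k63
```

To actually launch an assembly, the printed line can be run directly in the shell.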

Memory and computational time

1. SOAPtrans is a memory-intensive assembler; it requires approximately 30 to 60 GB of memory to assemble 50 million reads.

2. SOAPtrans is a fast assembler, taking around 30 minutes to assemble 50 million Illumina reads.

Troubleshooting

1. If you want to run SOAPtrans with a kmer larger than 31bp, make sure to use the 127mer version.

2. The assembly statistics (N50, mean contig length) peak at a certain kmer, which in our experience is around 2/3 of the read length.

3. If you have trouble working with SOAPtrans, check the documentation above or ask on the BGI SOAPdenovo mailing list, bgi-soap@googlegroups.com.
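
As a rough illustration of the 2/3 rule of thumb in item 2, the sketch below derives a starting kmer from the read length. The function name suggest_kmer is hypothetical. Rounding down to an odd value is an assumption made here (de Bruijn graph assemblers commonly expect an odd kmer; check the SOAPtrans documentation), and the cap at 127 reflects the limit of the 127mer version. Treat the result as a starting point for trying several kmers, not as the optimum.

```shell
# Suggest a starting kmer of about 2/3 of the read length.
# Usage: suggest_kmer <read_length>
# (suggest_kmer is a hypothetical helper, not part of SOAPtrans.)
suggest_kmer() {
    k=$(( $1 * 2 / 3 ))
    # Assumption: use an odd kmer, so round even values down by one.
    if [ $(( k % 2 )) -eq 0 ]; then
        k=$(( k - 1 ))
    fi
    # The 127mer version cannot go beyond 127bp.
    if [ "$k" -gt 127 ]; then
        k=127
    fi
    echo "$k"
}

suggest_kmer 100   # prints 65
```

For 100bp reads this gives 65, which requires the 127mer version.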