BFAST

How to use

bfast-book.pdf

The manual is at /home/scott/Downloads/bfast-0.6.4d/manual/bfast-book.pdf. Open it with Evince, the PDF reader on Fourierseq, or scp it to your local computer.

First, convert SOLiD data to fastq format

solid2fastq <csfastafile> <qualfile>

A shortcut script that executes all these functions on <input.fasta> and <reads.csfasta> is: 

Create a reference genome in both base space and colorspace (note the -A 1 option means colorspace, use -A 0 for base space)

bfast fasta2brg -f <fastafile>

bfast fasta2brg -f <fastafile> -A 1

Create indexes of the reference genome

For a small genome such as a bacterium, the following masks are reasonable. The first five commands create the base space indexes and the last five (with -A 1) create the colorspace indexes:

bfast index -f <fastafile> -m 111111111111111111 -w 12 -i 1

bfast index -f <fastafile> -m 1111111110111111111 -w 12 -i 2

bfast index -f <fastafile> -m 111111011111101011111 -w 12 -i 3

bfast index -f <fastafile> -m 111111011001100111011111 -w 12 -i 4

bfast index -f <fastafile> -m 1111011101011111101111 -w 12 -i 5

bfast index -f <fastafile> -m 111111111111111111 -w 12 -i 1 -A 1

bfast index -f <fastafile> -m 1111111110111111111 -w 12 -i 2 -A 1

bfast index -f <fastafile> -m 111111011111101011111 -w 12 -i 3 -A 1

bfast index -f <fastafile> -m 111111011001100111011111 -w 12 -i 4 -A 1

bfast index -f <fastafile> -m 1111011101011111101111 -w 12 -i 5 -A 1
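The ten index commands above differ only in the mask, the index number and the -A value, so they can be generated in a loop. The following is a minimal sketch, not part of BFAST: REF is a placeholder for your fasta file, and DRY_RUN is a hypothetical switch of this sketch that defaults to printing the commands instead of running them.

```shell
#!/bin/sh
# Sketch: print (or run) the ten `bfast index` commands listed above.
# REF and DRY_RUN are assumptions of this sketch, not BFAST options.
REF="${REF:-reference.fasta}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Masks copied from the commands above, in index-number order.
MASKS="111111111111111111
1111111110111111111
111111011111101011111
111111011001100111011111
1111011101011111101111"

build_indexes() {
  for space in 0 1; do          # 0 = base space, 1 = colorspace
    i=1
    for mask in $MASKS; do
      if [ "$DRY_RUN" = "1" ]; then
        echo "bfast index -f $REF -m $mask -w 12 -i $i -A $space"
      else
        bfast index -f "$REF" -m "$mask" -w 12 -i "$i" -A "$space" || return 1
      fi
      i=$((i + 1))
    done
  done
}

build_indexes
```

Check the printed commands first, then rerun with DRY_RUN=0 and REF set to the real fasta file to build the indexes.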

Use the indexes and reference genome to find CALs (Candidate Alignment Locations) (again, note -A 1 is colorspace)

bfast match -f <fastafile> -A 1 -r <fastq> > bfast.matches.fasta.reads.bmf

Align each CAL using a local alignment algorithm (again, note -A 1 is colorspace)

bfast localalign -f <fastafile> -m bfast.matches.fasta.reads.bmf -A 1 > bfast.aligned.fasta.reads.baf

Filter/Prioritize alignments (again, note -A 1 is colorspace)

bfast postprocess -f <fastafile> -i bfast.aligned.fasta.reads.baf -A 1 > bfast.reported.fasta.reads.sam
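The match, localalign and postprocess steps each read the file written by the previous step, so they chain naturally. The sketch below shows one way to wire them together in colorspace, stopping at the first failure. It is an illustration only: REF, READS, DRY_RUN and the helper names run_step and align_colorspace are hypothetical, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Sketch: chain bfast match -> localalign -> postprocess (colorspace, -A 1).
# REF, READS and DRY_RUN are placeholders of this sketch, not BFAST options.
REF="${REF:-reference.fasta}"
READS="${READS:-reads.fastq}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# run_step <output file> <command...>: run the command, redirecting stdout.
run_step() {
  out="$1"
  shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "$* > $out"
  else
    "$@" > "$out"
  fi
}

align_colorspace() {
  run_step bfast.matches.fasta.reads.bmf \
    bfast match -f "$REF" -A 1 -r "$READS" &&
  run_step bfast.aligned.fasta.reads.baf \
    bfast localalign -f "$REF" -m bfast.matches.fasta.reads.bmf -A 1 &&
  run_step bfast.reported.fasta.reads.sam \
    bfast postprocess -f "$REF" -i bfast.aligned.fasta.reads.baf -A 1
}

align_colorspace
```

With DRY_RUN=0 the && chain means a failed step prevents later steps from running on an incomplete file.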

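Then convert the SAM file to a sorted, indexed BAM file. The sort command below uses samtools 1.x syntax; older 0.1.x releases take an output prefix instead (samtools sort <in.bam> <out.prefix>). Index the sorted file, not the unsorted one:

samtools view -S -b bfast.reported.fasta.reads.sam > bfast.reported.fasta.reads.bam

samtools sort -o bfast.reported.fasta.reads.sorted.bam bfast.reported.fasta.reads.bam

samtools index bfast.reported.fasta.reads.sorted.bam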
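The SAM to BAM conversion can be wrapped so that the output names are derived from the input name. This is a minimal sketch assuming samtools 1.x, where view and sort both accept -o; the function name sam_to_sorted_bam, the SAM variable and the DRY_RUN switch are hypothetical and exist only in this sketch. By default the commands are printed, not run.

```shell
#!/bin/sh
# Sketch: convert a SAM file to a sorted, indexed BAM (assumes samtools 1.x).
# SAM and DRY_RUN are placeholders of this sketch.
SAM="${SAM:-bfast.reported.fasta.reads.sam}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

sam_to_sorted_bam() {
  base="${1%.sam}"        # strip the .sam suffix to derive output names
  if [ "$DRY_RUN" = "1" ]; then
    echo "samtools view -b -o $base.bam $1"
    echo "samtools sort -o $base.sorted.bam $base.bam"
    echo "samtools index $base.sorted.bam"
  else
    samtools view -b -o "$base.bam" "$1" &&
    samtools sort -o "$base.sorted.bam" "$base.bam" &&
    samtools index "$base.sorted.bam"
  fi
}

sam_to_sorted_bam "$SAM"
```

The index step runs on the sorted file, because samtools index requires coordinate-sorted input.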

 

The package is installed on:

Phylocluster: /share/apps

References

  • Homer N, Merriman B, Nelson SF. BFAST: an alignment tool for large scale genome resequencing. PLoS One. 2009 Nov 11;4(11):e7767.
  • Homer N, Merriman B, Nelson SF. Local alignment of two-base encoded DNA sequence. BMC Bioinformatics. 2009 Jun 9;10:175.