Bowtie
Summary
Bowtie is a fast, memory-efficient short-read aligner based on the Burrows-Wheeler transform. It has handled color-space reads since version 0.12.0.
How to use Bowtie
Basic configuration
- Set up a directory for the indexed target sequences. I used '$HOME/BOWTIE.idx'. After building a reference DB, move its index files to this directory.
- Set 'BOWTIE_INDEXES' as an environment variable pointing to that directory. If you use bash (if you are not sure, type 'echo $SHELL' at the command line), add a line such as 'export BOWTIE_INDEXES="$HOME/BOWTIE.idx/"' to your '~/.bashrc' file. Alternatively, you can set it in a shell script, as in the bash script below.
- Naming convention (optional). If you want to analyze both normal base-space reads and color-space reads, it is a good idea to distinguish the two kinds of target DB with a flag. I put '_c' at the end of the DB name if it was prepared with the '-C' option.
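The configuration steps above could be scripted roughly as follows. This is only a sketch: the helper names 'setup_bowtie_indexes' and 'db_name' are hypothetical, and the default directory is simply the one mentioned in the text.

```bash
# setup_bowtie_indexes [DIR] : hypothetical helper; creates DIR and exports
# BOWTIE_INDEXES so that bowtie can find indexes by their basename.
setup_bowtie_indexes() {
  local dir="${1:-$HOME/BOWTIE.idx}"
  mkdir -p "$dir" || return 1
  export BOWTIE_INDEXES="$dir/"
}

# db_name NAME SPACE : hypothetical helper for the optional naming convention;
# appends '_c' when the index is built with '-C' (color space).
db_name() {
  if [ "$2" = "color" ]; then
    printf '%s_c\n' "$1"
  else
    printf '%s\n' "$1"
  fi
}
```

For example, 'db_name DROME_E57_cdna color' prints 'DROME_E57_cdna_c', which can then be passed to bowtie-build as the DB name.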
Color-space reads
Prepare the target (DB) sequence.
$ bowtie-build -C <FASTA file> <DB name>
Run bowtie. If you use a fastq file (read sequences with quality scores),
$ bowtie -a -C -q -t --suppress 6 <DB name> <Query fastq filename> <output filename>
If you want to ignore the quality file and use 'fasta' format (csfasta) reads,
$ bowtie -a -C -f -t --suppress 6 <DB name> <Query csfasta filename> <output filename>
Normal base reads
Prepare the target (DB) sequence.
$ bowtie-build <FASTA file> <DB name>
Run bowtie. If you use a fastq file (read sequences with quality scores),
$ bowtie -a -q -t --suppress 6 <DB name> <Query fastq filename> <output filename>
If you want to ignore the quality file and use 'fasta' format reads,
$ bowtie -a -f -t --suppress 6 <DB name> <Query fasta filename> <output filename>
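The four command lines above differ only in '-C' (color space) and '-q' / '-f' (fastq / fasta input). In all of them, '-a' reports all valid alignments, '-t' prints timing information, and '--suppress 6' drops the sixth column of the default output (the quality string). A small sketch of how the right combination could be chosen automatically; 'bowtie_cmd' is a hypothetical helper that only prints the command line, and the file-extension rules are an assumption.

```bash
# bowtie_cmd SPACE DB QUERY OUT : hypothetical helper; prints the bowtie
# command line for SPACE = "color" or "base", choosing -q/-f by extension.
bowtie_cmd() {
  local space="$1" db="$2" query="$3" out="$4" fmt
  case "$query" in
    *.fastq|*.fq)            fmt="-q" ;;  # FASTQ: sequences with qualities
    *.fasta|*.fa|*.csfasta)  fmt="-f" ;;  # FASTA / csfasta: no qualities
    *) echo "unknown read format: $query" >&2; return 1 ;;
  esac
  if [ "$space" = "color" ]; then
    echo "bowtie -a -C $fmt -t --suppress 6 $db $query $out"
  else
    echo "bowtie -a $fmt -t --suppress 6 $db $query $out"
  fi
}

# Print (not run) the command for a color-space fastq file.
bowtie_cmd color DROME_E57_cdna_c SRR034220.called.fastq SRR034220.bowtie_c
```

Printing the command first makes it easy to check the flags before starting a long run.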
bash script
I normally use the following bash script, after modifying each variable to suit the data.
#!/bin/bash
export BOWTIE_INDEXES="/home/taejoon/BOWTIE.idx/"
BOWTIE="/home/taejoon/src64/bowtie/bowtie-0.12.5/bowtie"
DB="DROME_E57_cdna_c"
QUERY="SRR034220.called.fastq"

OUT="SRR034220_called.$DB.bowtie_c"
time $BOWTIE -a -C -q -t --suppress 6 $DB $QUERY $OUT

OUT="SRR034220_called.$DB.trim5_bowtie_c"
time $BOWTIE -a -C -q -t --trim5 5 --trim3 5 --suppress 6 $DB $QUERY $OUT
User documentation
- To get started using bowtie, check out the bowtie manual.
How to run bowtie
Versions of bowtie without native color-space support cannot read color-space data directly. With those versions, the only way to align color-space reads is to convert both the reads and the reference to mock base space.
Example pipeline for running bowtie on color-space reads (for base-space reads, the color-space conversions in steps 1 and 3 do not apply):
1. Convert the reference to mock base space.
bs2cs ref.fasta > ref.csfasta
cs2mbs ref.csfasta > ref.m.fasta
where
ref.fasta : reference in base space
ref.csfasta : reference in color space (for temporary purposes)
ref.m.fasta : reference in mock base space
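To make the two conversions concrete, here is a small sketch of the idea behind them. The color code is the standard di-base encoding (A=0, C=1, G=2, T=3, color = XOR of adjacent bases). The mapping of colors 0123 to the letters ACGT for mock base space is an assumption for illustration; this page does not confirm the exact output of bs2cs or cs2mbs, and both function names below are hypothetical.

```bash
# code_of BASE : 2-bit code of a base, or -1 for anything else (e.g. N).
code_of() {
  case "$1" in
    A|a) printf '0' ;; C|c) printf '1' ;;
    G|g) printf '2' ;; T|t) printf '3' ;;
    *)   printf '%s' '-1' ;;
  esac
}

# bases_to_colors SEQ : one color per adjacent base pair; '.' if undefined.
bases_to_colors() {
  local seq="$1" out="" i a b
  for ((i = 0; i + 1 < ${#seq}; i++)); do
    a=$(code_of "${seq:i:1}")
    b=$(code_of "${seq:i+1:1}")
    if (( a < 0 || b < 0 )); then out+="."; else out+=$(( a ^ b )); fi
  done
  printf '%s\n' "$out"
}

# colors_to_mock COLORS : relabel colors as letters (assumed 0123 -> ACGT).
colors_to_mock() {
  printf '%s\n' "$1" | tr '0123' 'ACGT'
}

colors=$(bases_to_colors ACGT)   # A-C = 1, C-G = 3, G-T = 1
echo "$colors"                   # prints 131
colors_to_mock "$colors"         # prints CTC
```

Mock base space carries the same information as color space; it only relabels the colors so that a base-space aligner accepts the input.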
2. Create bowtie indexes for the reference genome
bowtie-build ref.m.fasta refindex
where
ref.m.fasta : reference in mock base space
refindex : basename for bowtie indexes
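bowtie-build writes six files for a (small) index: refindex.1.ebwt, refindex.2.ebwt, refindex.3.ebwt, refindex.4.ebwt, refindex.rev.1.ebwt and refindex.rev.2.ebwt. A quick check that all of them are present before aligning might look like this; 'index_complete' is a hypothetical helper name.

```bash
# index_complete DIR BASENAME : hypothetical check that all six small-index
# files written by bowtie-build are present in DIR.
index_complete() {
  local dir="$1" base="$2" suffix
  for suffix in 1 2 3 4 rev.1 rev.2; do
    if [ ! -f "$dir/$base.$suffix.ebwt" ]; then
      echo "missing: $dir/$base.$suffix.ebwt" >&2
      return 1
    fi
  done
}
```

For example, 'index_complete . refindex && echo "index OK"' run in the directory where the index was built.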
3. Convert the reads to mock base space
cs2mbs -d -r in.csfasta > in.m.fasta
where
in.csfasta : reads file in color space
in.m.fasta : reads file in mock base space
-d : drop the first colorspace base during conversion. This will ignore the first color space base which is part of the primer.
-r : for each read, include the reverse of the mock base space sequence.
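The effect of '-d' and '-r' on a single read can be sketched as below. A csfasta read such as T01230 is a primer base followed by colors. The reverse is a plain reversal, not a reverse complement: complementing a base sequence leaves its colors unchanged, so the opposite strand shows up as the reversed color string. 'mock_read' is a hypothetical function, and the 0123 to ACGT relabeling is an assumption about cs2mbs.

```bash
# mock_read CSREAD : drop the primer base and the first color (like -d), then
# print the mock-base sequence and its reverse (like -r), one per line.
mock_read() {
  local colors="${1:2}" mock rev="" i
  mock=$(printf '%s' "$colors" | tr '0123' 'ACGT')   # assumed relabeling
  for ((i = ${#mock} - 1; i >= 0; i--)); do
    rev+="${mock:i:1}"
  done
  printf '%s\n%s\n' "$mock" "$rev"
}

mock_read T01230   # prints CGTA, then ATGC
```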
4. Convert the reads to fastq
fasta2fastq in.m.fasta in_QV.qual > in.m.fastq
where
in.m.fasta : reads file in mock base space
in_QV.qual : corresponding quality file
in.m.fastq : output fastq file
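If fasta2fastq is not at hand, the merge it performs can be approximated with awk. This is a rough sketch under several assumptions: both files list the same reads in the same order, each read has as many quality values as sequence characters, qualities are written as Phred+33 characters, and negative values are clamped to 0. The function name 'fasta_qual_to_fastq' is hypothetical and the real tool may behave differently.

```bash
# fasta_qual_to_fastq FASTA QUAL : print FASTQ records on standard output.
fasta_qual_to_fastq() {
  awk '
    /^#/ { next }                       # skip comment lines
    FNR == NR {                         # first file: store names and sequences
      if ($0 ~ /^>/) { n++; name[n] = substr($0, 2) }
      else           { seq[n] = seq[n] $0 }
      next
    }
    /^>/ { if (m) emit(); m++; qual = ""; next }   # second file: new record
    {
      k = split($0, q, " ")
      for (i = 1; i <= k; i++) {
        v = q[i] + 0
        if (v < 0) v = 0                # clamp negative values (assumption)
        qual = qual sprintf("%c", v + 33)
      }
    }
    END { if (m) emit() }
    function emit() { printf "@%s\n%s\n+\n%s\n", name[m], seq[m], qual }
  ' "$1" "$2"
}
```

Usage would mirror the command above: 'fasta_qual_to_fastq in.m.fasta in_QV.qual > in.m.fastq'.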
5. Align using bowtie
bowtie -q -n 3 --best --norc refindex in.m.fastq out
where
refindex : base name for the bowtie index of the reference
in.m.fastq : input fastq file
out : mapping output file
-q : indicates use of a fastq file
-n 3 : mismatches allowed in the seed (-v 3 can be used instead to indicate mismatches allowed in the entire alignment)
--norc : do not report reverse complement matches
--best : make bowtie search until it finds the best alignment (based on the number of mismatches and the quality values at mismatched positions)
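Steps 1 to 5 can be chained in one script. The sketch below uses the commands exactly as given above; the wrapper itself ('run_to', 'colorspace_pipeline', and the DRY_RUN switch) is hypothetical. With DRY_RUN=1 it only prints the commands, which is a cheap way to check file names before a long run.

```bash
# run_to OUTFILE CMD ARGS... : run CMD with stdout redirected to OUTFILE
# ("-" means no redirection). With DRY_RUN=1 the command is only printed.
run_to() {
  local out="$1"; shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    if [ "$out" = "-" ]; then echo "$*"; else echo "$* > $out"; fi
  elif [ "$out" = "-" ]; then
    "$@"
  else
    "$@" > "$out"
  fi
}

# colorspace_pipeline REF.fasta READS.csfasta QUAL INDEX OUT
colorspace_pipeline() {
  local ref="$1" reads="$2" qual="$3" index="$4" out="$5"
  local ref_base="${ref%.fasta}" reads_base="${reads%.csfasta}"
  run_to "$ref_base.csfasta"   bs2cs "$ref" &&                        # step 1
  run_to "$ref_base.m.fasta"   cs2mbs "$ref_base.csfasta" &&          # step 1
  run_to -                     bowtie-build "$ref_base.m.fasta" "$index" &&  # step 2
  run_to "$reads_base.m.fasta" cs2mbs -d -r "$reads" &&               # step 3
  run_to "$reads_base.m.fastq" fasta2fastq "$reads_base.m.fasta" "$qual" &&  # step 4
  run_to -                     bowtie -q -n 3 --best --norc "$index" "$reads_base.m.fastq" "$out"  # step 5
}

# Dry run: print the six commands without executing anything.
DRY_RUN=1 colorspace_pipeline ref.fasta in.csfasta in_QV.qual refindex out
```

The steps are joined with '&&' so that a failure in any conversion stops the pipeline before bowtie is started.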