Software
The following is a categorized list of software available through the GSAF. Each page gives a summary of the software, the hardware it is currently installed on, links to user documentation, and helpful tips.
Contents
- Getting an account on GSAF's Galaxy instance
- Getting an account on the GSAF server Fourierseq
- Getting started with Unix and Perl
- Tips for working with TACC resources
- GSAF:References (by organism)
- GSAF:General purpose tools
- GSAF:Galaxy Workflows
- GSAF:Microarray data analysis tools
- GSAF:NGS Data Quality Control Tools
- GSAF:Mappers / Aligners
- GSAF:Gene prediction tools
- GSAF:SNP discovery
- GSAF:Splice Junction discovery
- GSAF:Genome Alignment and Visualization
- GSAF:De novo assembly
- GSAF:454 Analysis tools
- GSAF:Useful scripts
- GSAF:Software users group meetings
- Variant calling
- ZOHO Information
- Extracting barcode split data from SOLiD 5500 XSQ files
- R and R packages
Galaxy Workflows
General purpose tools
- BLAST
- Bioconductor
- BioPerl
- BioMart Perl APIs
- NCBI Eutils
- R
- Python Library
- Graphics programs
- BOOST libraries
- Phred, Phrap, Consed, cross_match, daev
- HMMER
- Data compression programs
- Clustering programs - MCL, usearch, and uclust
Microarray data analysis tools
NGS Data Quality Control Tools
Mappers/Aligners
- mapreads - SOLiD data only; ungapped alignment
- MAQ - best for short-read SNP calling; ungapped alignment
- muscle - "old school" aligner - good for 454 amplicons
- SOAP - very fast and versatile: any read length, gapped, paired-end, SNP calling
- SSAHA & SSAHA2 - like MAQ, fast for ungapped mapping - SNP calling, contig placement to reference, etc.
- Bowtie - very fast, ungapped alignment. Does not support color space data.
- SHRiMP - A sensitive and accurate mapper. Supports color space data and gapped alignment.
- BFAST - BLAT-like short read mapper. Natively supports SOLiD colorspace short reads.
- BWA - The successor to MAQ; a Burrows-Wheeler (BW) mapper that allows gaps and handles colorspace natively.
- GMAP and GSNAP - Mappers for cDNA and very sensitive detection of short indels.
- Mosaik - A suite of alignment and reference-guided assembly tools.
- See Category:Mapper for more details.
Gene prediction tools
SNP discovery and Annotation
- Corona-Lite - SOLiD data only
- Genome Analysis Tool Kit (GATK)
- MAQ - best for short-read SNP calling; ungapped alignment
- Picard
- SOAP - very versatile: any read length, gapped, paired-end, SNP calling
- SAMtools
- Annovar
Splice Junction discovery
Genome Alignment and Visualization
- Circos
- IGV
- MaqView
- Mauve
- Affymetrix Integrated Genome Browser - easy-to-install genome browser
De novo assembly
- MIRA
- Velvet
- ABySS
- Using consed, including editing Mira assemblies
- ABI's SOLiD de novo pipeline
- Phred, Phrap, Consed, cross_match, daev
- ALLPATHS-LG
- Mosaik - A suite of alignment and reference-guided assembly tools.
- (Newbler, the Roche/454 assembler, is under 454 Analysis tools)
ABI pipelines
454 Analysis tools
Current Roche/454 software versions on Fourierseq are all 2.5.3. Tarballs of various 454 software versions are available at /home/daras/454sw*
- Sff file manipulation tools - Utilities to convert and manipulate 454 sff files.
- GS De novo assembler - Performs assembly of reads and generates contigs.
- GS Reference mapper - Maps reads to a reference genome and reports consensus and variants.
- GS Amplicon variant analyzer - Detects variants in amplicon libraries (a small region of interest at very high coverage).
- GS Run processor and run browser - Generally run already by the GSAF, but you might want to re-process image data sometimes.
- Georgiou Lab Amplicon scripts - Matlab scripts...
- BLAST tools - Scripts for quick and dirty blasts of 454 reads and contigs to see what's going on at a global level
Useful scripts
- Convert ABI SOLiD data to fasta or fastq
- General parser scripts - scripts for parsing and filtering fasta and fastq files and output files from different mappers; base space / color space conversion scripts.
- Small RNA analysis
- Generation of wig files from mapreads output
- Conversion of mapreads output to GFF, SAM, or BAM format - These utilities can be used to convert mapreads mapping output to base space format
- Generation of gene counts from results of mapping to genome - These scripts can be used to identify the reads that correspond to genes, after mapping to the genome.
- Get Tm (melting temperature), length, and %GC from a bunch of sequences
- Conversion of gene IDs from one form to another (e.g. NCBI to Ensembl and vice versa)
- Quick tips on GO analysis
- Median polish to consolidate quantitations
- Make a quick Venn diagram based on lists in 3 files
- Plot a read length histogram based on sequences in a fasta file
- Reverse complement for fasta files
- Tricks to preprocess SOLiD and 454 data
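Several of the scripts listed above (length, %GC and Tm per sequence, reverse complement, read length histogram) come down to a few lines of sequence handling. The following is a minimal Python sketch of those operations, not the GSAF scripts themselves. All function names here are hypothetical, and the Tm estimate uses the simple Wallace rule (2 degrees per A/T, 4 degrees per G/C), which is only a rough approximation for short oligos and may differ from what the GSAF script computes.

```python
from collections import Counter
from io import StringIO

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA-formatted text stream."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_percent(seq):
    """Return the percentage of G and C bases (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


def wallace_tm(seq):
    """Rough Tm estimate (Wallace rule); an assumption, see the note above."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


def length_histogram(records):
    """Count how many sequences have each length."""
    return Counter(len(seq) for _, seq in records)


if __name__ == "__main__":
    # Made-up example data; replace with open("reads.fasta") for a real file.
    example = StringIO(">read1 demo\nAAGC\nTT\n>read2\nGGCA\n")
    records = list(read_fasta(example))
    for name, seq in records:
        print(name, len(seq), round(gc_percent(seq), 1),
              wallace_tm(seq), reverse_complement(seq))
    print(dict(length_histogram(records)))
```

The histogram is returned as counts per length, which can be passed to any plotting tool (for example R, listed under General purpose tools).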
Software users group meetings
- Small RNA data analysis - Lessons learned during Sullivan data analysis
- Mapping of short reads - Comparison of a few publicly available mapping tools