Small RNA analysis
Scripts found on fourierseq that are useful when performing small RNA analysis, particularly with the ABI SREK (rna2map) pipeline.
- findpositionmiRNA: Given the miRBase FASTA files containing mature microRNA sequences (mature.fa) and hairpin sequences (hairpin.fa), finds the start location of each mature miRNA relative to the start of its hairpin (zero offset).
- Inputs: mature.fa; hairpin.fa (preferably filtered to contain sequences from the organism of interest), organism (the three letter abbreviation like hsa or mmu)
- Outputs: File with miRNAid\tmiRNAsequence\thairpinid\thairpinsequence\tstartposition
- mapreads_interpreter_SREK: converts a SREK mapping output file into a tab-delimited info file.
- Inputs: SREK mapping output file (after extension); reference fasta file
- Outputs: Info file with readid\tgi#\tmismatches\tdirection\tstartlocation\tstart%\tend%\tcoverage%\tgenedescription\tgenelength\tmappinglength
- mapreads_select_mismatches: filters the info file by number of mismatches.
- Inputs: info file generated by mapreads_interpreter_SREK; mismatch cutoff
- Outputs: info file filtered to include only mappings with mismatches less than or equal to the user-specified cutoff
- mapreads_select_by_length: filters the info file by mapping length.
- Inputs: info file generated by mapreads_interpreter_SREK; minimum length; maximum length
- Outputs: info file filtered to include only results whose mapping length is within the user-specified minimum and maximum
- findmaturemicro_SREK_hsa: From SREK miRBase mapping results, extracts reads mapping within ±3 bp of mature miRNA start sites and provides read counts for each mature miRNA.
- Inputs: info file; file with location of mature miRNA relative to the hairpin (this is the output of findpositionmiRNA)
- Outputs: counts file, with read counts for each mature miRNA; file with information about the reads and the mature miRNAs they mapped to
- combine_mutlicounts: combines two files based on their first column (used to combine counts files generated by findmaturemicro_SREK above).
- Inputs: file1; file2 (used for two counts files); maximum number of columns in first file (first file can have any number of columns, but second file must have only two columns)
- Outputs: file resulting from combining file1 and file2 based on first column
- Note: Use this script multiple times to combine multiple counts files.
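The scripts themselves are not reproduced on this page. As a rough illustration of the steps described above (locating the mature miRNA in its hairpin, filtering the info file, and counting reads near the mature start), here is a minimal Python sketch. It is not the code of the fourierseq scripts. The function names are hypothetical, and it assumes that the mature sequence is located by exact substring search, that column 2 of the info file holds the hairpin id when mapping against miRBase, that startlocation is zero-based, and that the length cutoffs are inclusive.

```python
from collections import Counter


def read_fasta(lines):
    """Parse FASTA lines into {id: sequence}; the id is the first word of the header."""
    records, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def find_positions(mature, hairpin, organism):
    """Yield (mirna_id, mirna_seq, hairpin_id, hairpin_seq, start), start zero-based.

    Assumption: a mature miRNA belongs to a hairpin if it is an exact substring.
    """
    prefix = organism + "-"
    for mid, mseq in mature.items():
        if not mid.startswith(prefix):
            continue
        for hid, hseq in hairpin.items():
            start = hseq.find(mseq) if hid.startswith(prefix) else -1
            if start != -1:
                yield (mid, mseq, hid, hseq, start)


def filter_info(rows, max_mismatches, min_len, max_len):
    """Keep info rows by mismatches (column 3) and mappinglength (column 11)."""
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if int(fields[2]) <= max_mismatches and min_len <= int(fields[10]) <= max_len:
            yield fields


def count_reads(hits, positions, window=3):
    """Count reads starting within +/- window of a mature start.

    hits: iterable of (hairpin_id, read_start); positions: a list, not a generator.
    """
    counts = Counter()
    for hid, rstart in hits:
        for mid, _, phid, _, mstart in positions:
            if phid == hid and abs(rstart - mstart) <= window:
                counts[mid] += 1
    return counts


def toy_row(read_id, mismatches, start, length):
    """Build one info-file line for the made-up hairpin below."""
    cols = [read_id, "hsa-toy-1", mismatches, "+", start, 0, 0, 0, "desc", 18, length]
    return "\t".join(str(c) for c in cols)


# Toy data, made up for illustration (not real miRBase entries).
mature = read_fasta([">hsa-toy-5p MIMAT0", "CCCUUU", ">mmu-toy-5p MIMAT1", "CCCUUU"])
hairpin = read_fasta([">hsa-toy-1 MI0", "GGGAAACCC", "UUUGGGAAA"])
positions = list(find_positions(mature, hairpin, "hsa"))

rows = [toy_row("r1", 0, 6, 6), toy_row("r2", 1, 9, 6), toy_row("r3", 3, 6, 6),
        toy_row("r4", 0, 10, 6), toy_row("r5", 0, 6, 2)]
kept = list(filter_info(rows, max_mismatches=1, min_len=4, max_len=30))
counts = count_reads(((f[1], int(f[4])) for f in kept), positions)
```

In this toy run, r3 is dropped for having too many mismatches, r5 for being too short, and r4 starts 4 bp from the mature start, so only r1 and r2 are counted. Counts files from several samples could then be joined on the first column, which is the role of the combine script.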
For generating miRNA read coverage graphs
These scripts can be used to generate simple read coverage graphs, one for each miRNA. However, they will need to be modified according to the samples, miRNAs and files of interest.
- gethsastart.sh: Generates a histogram of read coverage for every mature miRNA specified. Needs to be modified to point to the information file generated by findmaturemicro_SREK.
- plot_hist_hsa.R: Uses the histogram files generated above to produce an R graph (output as a PDF file). Again, needs to be modified to point to the output from gethsastart.sh.
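For readers who want to see what a read coverage histogram amounts to, the following Python sketch computes per-position read depth along a hairpin. It is only an illustration of the idea, not a translation of gethsastart.sh or plot_hist_hsa.R. The function name and the input form (zero-based start and read length pairs) are assumptions.

```python
def coverage(read_spans, length):
    """Return per-position read depth along a sequence of the given length.

    read_spans: iterable of (start, read_length) pairs with zero-based starts.
    Reads that run past either end of the sequence are clipped.
    """
    depth = [0] * length
    for start, read_length in read_spans:
        for pos in range(max(start, 0), min(start + read_length, length)):
            depth[pos] += 1
    return depth


# Two made-up reads on a 6-base sequence, overlapping at position 2.
example_depth = coverage([(0, 3), (2, 3)], 6)
```

The resulting list can be written out one value per line and plotted as a bar chart, one graph per miRNA.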