...
Code Block |
---|
title | Contents of a "commands" file that runs four Velvet assemblies in parallel. If you copy and paste, make sure the file contains ONLY four lines. |
---|
|
velveth single_out 61 -fastq single_end_100_c_50.fastq && velvetg single_out -exp_cov auto -amos_file yes
velveth pairedc20_out 61 -fastq -shortPaired paired_end_2x100_ins_3000_c_20.fastq paired_end_2x100_ins_1500_c_20.fastq paired_end_2x100_ins_400_c_20.fastq && velvetg pairedc20_out -exp_cov auto -amos_file yes
velveth pairedc25_out 61 -fastq -shortPaired paired_end_2x100_ins_3000_c_25.fastq paired_end_2x100_ins_400_c_25.fastq && velvetg pairedc25_out -exp_cov auto -amos_file yes
velveth pairedc50_out 61 -fastq -shortPaired paired_end_2x100_ins_400_c_50.fastq && velvetg pairedc50_out -exp_cov auto -amos_file yes
|
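As a rough illustration of what "in parallel" means for a commands file like this, the sketch below starts every line as its own background job and then waits for all of them. The function name `run_commands_file` is hypothetical, and this is not necessarily how the course's own launcher works; it only assumes a POSIX shell and one complete command per line.

```shell
# Hypothetical helper (not part of Velvet): run each non-empty line of a
# commands file as a separate background job, then wait for all of them.
run_commands_file() {
    cmdfile=$1
    pids=""
    status=0
    # Read the file line by line; the "|| [ -n ... ]" part keeps a final
    # line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        # Each line runs in its own shell so "&&" chains work as written.
        sh -c "$line" </dev/null &
        pids="$pids $!"
    done < "$cmdfile"
    # Collect exit codes; report failure if any job failed.
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return $status
}
```

Before running it, `wc -l < commands` should print 4 for the file above; a stray blank or wrapped line would otherwise become an extra (broken) job. Then `run_commands_file commands` starts all four assemblies at once.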
...
Expand |
---|
| The results... |
---|
|
Code Block |
---|
title | Set with single-end reads at 50 coverage |
---|
| : Final graph has 9748 nodes and n50 of 191, max 1427, total 1865207, using 281499/2314900 reads |
Median coverage depth = 2.657895
Final graph has 9748 nodes and n50 of 191, max 1427, total 1865207, using 281499/2314900 reads
|
Code Block |
---|
title | Set with one group of reads at 50 coverage |
---|
| : Final graph has 271 nodes and n50 of 127086, max 397281, total 4555586, using 1464199/2314900 reads |
Median coverage depth = 11.131337
Final graph has 265 nodes and n50 of 127102, max 397974, total 4558511, using 1464201/2314900 reads
|
Code Block |
---|
title | Set with 2 groups of reads both at 25 coverage each |
---|
| : Final graph has 203 nodes and n50 of 698134, max 1032531, total 4585717, using 1465818/2314900 reads |
Median coverage depth = 11.109244
Final graph has 203 nodes and n50 of 698134, max 1032531, total 4585717, using 1465818/2314900 reads
|
Code Block |
---|
title | Set with 3 groups of reads all at 20 coverage each |
---|
| : Final graph has 202 nodes and n50 of 698626, max 1139610, total 4602729, using 1758595/2777880 reads |
Median coverage depth = 13.353287
Final graph has 202 nodes and n50 of 698626, max 1139610, total 4602729, using 1758595/2777880 reads
|
|
With better read pairs that link more distant locations in the genome, there are fewer contigs, and the contigs are longer, giving us a more complete picture of linkage across the genome.
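To compare the four runs side by side, the numbers can be pulled out of the "Final graph has ..." lines shown above. The following is a minimal sketch, assuming the log text has exactly the wording seen in these results; `summarize_velvet_log` is a hypothetical name, and the function reads the log on standard input and reports only the last such line.

```shell
# Hypothetical helper: extract node count, n50, max, total, and read usage
# from the last "Final graph has ..." line of velvetg output on stdin.
summarize_velvet_log() {
    awk '
    /Final graph has/ {
        # Locate the word "has"; the numbers sit at fixed offsets after it.
        for (i = 1; i <= NF; i++) if ($i == "has") break
        nodes = $(i + 1); n50 = $(i + 6); max = $(i + 8); total = $(i + 10)
        split($(i + 12), reads, "/")
        # Strip the trailing commas that follow each number in the log line.
        gsub(/,/, "", n50); gsub(/,/, "", max); gsub(/,/, "", total)
        found = 1
    }
    END {
        if (found)
            printf "nodes=%s n50=%s max=%s total=%s reads_used=%s reads_total=%s\n", nodes, n50, max, total, reads[1], reads[2]
    }'
}
```

For example, if the screen output of one run was saved to a file (say a hypothetical `pairedc50.log`), `summarize_velvet_log < pairedc50.log` prints one summary line for that assembly.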
The complete E. coli genome is about 4.6 Mb. Why weren't we able to assemble it, even with this "perfect" data?
Expand |
---|
| Possibilities... |
---|
|
- Sometimes errors in reads lead to dead-ends in the graphs that are trimmed when they should not be.
- There are 7 nearly identical ribosomal RNA operons in E. coli spaced throughout the chromosome. Since each is >3000 bases, longer than the largest insert size in this data, contigs cannot be connected across them.
|
More assembly statistics: contig_stats.pl
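For a quick check without a separate script, N50 can be computed directly from a FASTA file of contigs such as the `contigs.fa` that velvetg writes into each output directory. This is a minimal sketch of the usual N50 definition (the length of the contig at which the running sum of lengths, longest first, reaches half the total); it is not the implementation used by contig_stats.pl, and `fasta_n50` is a hypothetical name. The result may not match the n50 printed by velvetg exactly, since that figure describes graph nodes.

```shell
# Hypothetical helper: print the N50 of the sequences in a FASTA file.
fasta_n50() {
    # Stage 1: emit one length per sequence (sequences may span several lines).
    awk '/^>/ { if (len) print len; len = 0; next }
         { len += length($0) }
         END { if (len) print len }' "$1" |
        # Stage 2: longest contigs first.
        sort -rn |
        # Stage 3: walk down the list until half of the total length is covered.
        awk '{ lens[NR] = $1; total += $1 }
             END {
                 half = total / 2
                 for (i = 1; i <= NR; i++) {
                     run += lens[i]
                     if (run >= half) { print lens[i]; exit }
                 }
             }'
}
```

Usage would look like `fasta_n50 pairedc20_out/contigs.fa`, repeated for each of the four output directories.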
...