
Genome Assembly

First, some background

De novo assembly is creating a genome without a reference genome. Creating a genome with a reference genome is called mapping assembly. This paper (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494/) is an excellent review of the theory and practice of NGS assemblers as of 2010. Read lengths will continue to get longer, error rates lower, and coverage higher, but the basic concepts embodied in that paper will probably remain useful for several more years. The figures embedded in this wiki page for educational purposes are from that paper.

Upfront we need to discuss the two basic assembler types: overlap graph and de Bruijn:

[Embedded figure: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494/figure/F2/]
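To make the difference between the two assembler types concrete, here is a minimal sketch that builds both kinds of graph from the same reads. The reads, the function names, and the `min_len` threshold are made up for illustration; real assemblers also handle sequencing errors, reverse complements, and far larger inputs.

```python
from collections import defaultdict


def suffix_prefix_overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def overlap_graph(reads, min_len):
    """Overlap graph: nodes are whole reads, an edge a -> b stores the overlap length."""
    edges = {}
    for a in reads:
        for b in reads:
            if a != b:
                n = suffix_prefix_overlap(a, b, min_len)
                if n:
                    edges[(a, b)] = n
    return edges


def de_bruijn_graph(reads, k):
    """de Bruijn graph: nodes are (k-1)-mers, each k-mer adds an edge prefix -> suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


reads = ["ACGTAC", "GTACGG", "ACGGTT"]  # hypothetical toy reads

og = overlap_graph(reads, min_len=3)
dbg = de_bruijn_graph(reads, k=4)

print(og)   # two edges, each with an overlap of 4 bases
print(dbg)  # node 'ACG' has two successors, 'CGT' and 'CGG'
```

In the overlap graph the reads chain together directly (ACGTAC, then GTACGG, then ACGGTT). In the de Bruijn graph the reads are first chopped into k-mers, so the repeated ACG becomes a single node with two ways out.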

In either case, more and longer reads are better, as you can imagine. With an overlap graph (also called the overlap-layout-consensus algorithm or overlap-layout algorithm), your assembly grows much more effectively with longer reads, and there are few parameters you can tweak. With a de Bruijn approach, your choice of k can have a strong impact on your assembly.
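As a rough illustration of why k matters, the sketch below builds a de Bruijn graph from error-free reads of a made-up sequence for several values of k and lists the nodes that have more than one successor. The sequence, read length, and function names are assumptions chosen for the example, not part of any particular assembler.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Nodes with more than one distinct successor, i.e. ambiguous extensions."""
    return sorted(node for node, nxt in graph.items() if len(nxt) > 1)


# Hypothetical toy "genome" in which ACG occurs twice, read without errors.
genome = "ACGTACGGTT"
read_len = 6
reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]

summary = {k: branching_nodes(de_bruijn_edges(reads, k)) for k in (3, 4, 5)}
print(summary)  # {3: ['CG', 'GT'], 4: ['ACG'], 5: []}
```

Here the ambiguity disappears once k is larger than the repeat. The other side of the trade-off is not shown: a larger k needs longer exact matches between reads, so with real data (errors, uneven coverage) a k that is too large tends to fragment the graph.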

Effect of trade-off in read length and coverage

[Embedded figure: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494/figure/F3/]

k-mer distributions inherent in select genomes

[Embedded figure: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494/figure/F1/]
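One simple way to look at a k-mer distribution is to count every k-mer in a sequence and then tally how many distinct k-mers occur once, twice, and so on. The sketch below does this for a made-up sequence. The function names and the "unique fraction" summary are illustrative assumptions and are not necessarily the measure plotted in the paper's figure.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every k-mer occurring in the given sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_spectrum(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


sequences = ["ACGTACGGTT"]  # hypothetical toy sequence
counts = kmer_counts(sequences, k=3)
spectrum = kmer_spectrum(counts)

# Fraction of distinct k-mers that occur exactly once in the input.
unique_fraction = spectrum[1] / len(counts)

print(counts["ACG"])  # 2, the only repeated 3-mer in this toy sequence
print(spectrum)       # six 3-mers occur once, one occurs twice
```

Running the same counts with larger k on a real genome generally shifts more k-mers into the "occurs once" bin, which is the intuition behind choosing k large enough to step over short repeats.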

Some example assembly statistics

[Embedded table: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2928494/table/T1/]
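The columns of the embedded table are not reproduced on this page. As one example of a widely reported assembly statistic, the sketch below computes N50: the contig length such that contigs of that length or longer contain at least half of the assembled bases. The contig lengths are made up, and `n50` is a hypothetical helper, not a function from Velvet or any other assembler.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


contigs = [100, 80, 50, 30, 20, 10, 10]  # hypothetical contig lengths, total 300
print(n50(contigs))  # 80, because 100 + 80 = 180 covers at least half of 300
```

N50 rewards contiguity only. It says nothing about whether the contigs are correct, so it is usually read alongside other numbers such as total assembled length and contig count.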

Many (many) assemblers are available. A list of assemblers can be found here. We'll take a look at Velvet: it's a fast and easy-to-use de Bruijn assembler.

OK, let's try an exercise on the next wiki page: Genome Assembly (with velvet).