de novo assembly Flashcards

1
Q

Explain the greedy approach and its drawbacks

A

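Compute pairwise alignments between all reads.
Repeatedly find the reads with the best overlap and merge them.
Drawbacks: repeats are handled correctly only if they are short, and the all-against-all comparison has a high computational cost.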
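A minimal sketch of the greedy idea, assuming error-free reads from a single strand and exact suffix/prefix matches in place of real alignments. The names `overlap`, `greedy_assemble`, and `min_len` are illustrative only and not taken from any assembler.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no overlap of at least min_len exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_len, best_i, best_j = 0, None, None
        # All-against-all comparison: this is the expensive part.
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # nothing left to merge
        merged = reads[best_i] + reads[best_j][best_len:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads


contigs = greedy_assemble(["ATGGC", "GGCTA", "CTAAT"])
```

Every round compares every remaining read with every other, which is where the cost comes from. Because the merge with the longest overlap is always taken, a repeat longer than the reads can cause the wrong reads to be joined.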

2
Q

Explain the approach of Overlap Layout Consensus (OLC) and de Bruijn graph assemblers

A

Correct sequencing errors
Assemble contigs
Combine contigs to scaffolds

3
Q

OLC, method, drawbacks

A

Build a graph in which the nodes are reads.
Connect two reads with an edge when they overlap.
Drawbacks: poorly suited to short reads and to repeats.
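As a rough illustration of the overlap and layout steps, the sketch below builds an overlap graph from exact suffix/prefix matches and spells a contig along an unambiguous path. It assumes error-free reads and skips the consensus step entirely; the function names and the `min_len` threshold are hypothetical.

```python
from itertools import permutations


def suffix_prefix_overlap(a, b, min_len):
    """Longest suffix of a that is also a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def overlap_graph(reads, min_len=3):
    """Nodes are reads; an edge a -> b stores the overlap length."""
    graph = {r: {} for r in reads}
    for a, b in permutations(reads, 2):
        n = suffix_prefix_overlap(a, b, min_len)
        if n:
            graph[a][b] = n
    return graph


def linear_layout(graph):
    """Follow edges from a read with no incoming edge while the path is unambiguous."""
    incoming = {b for targets in graph.values() for b in targets}
    starts = [r for r in graph if r not in incoming]
    if not starts:
        return ""  # e.g. a cycle: no obvious starting read
    node = starts[0]
    contig, visited = node, {node}
    while len(graph[node]) == 1:
        (nxt, n), = graph[node].items()
        if nxt in visited:
            break
        contig += nxt[n:]  # append only the non-overlapping part
        visited.add(nxt)
        node = nxt
    return contig


graph = overlap_graph(["ATGGC", "GGCTA", "CTAAT"])
contig = linear_layout(graph)
```

With many short reads the number of read pairs to compare grows quadratically, and a repeat gives a read several outgoing edges, at which point this simple layout stops.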

4
Q

de Bruijn, method and drawbacks

A

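Build the graph from k-mers instead of entire reads. Each distinct k-mer appears only once in the graph, so a path may have to pass through the same k-mer several times.
Simplify the graph after building it, then refine by removing parts of the assembly that are not supported by paired-end (PE) reads.
Drawbacks: RAM usage is a problem, and the optimal k-mer size is not known in advance.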
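A small sketch of the graph construction, assuming the node-centric formulation used on this card (nodes are distinct k-mers, edges join k-mers that are consecutive within a read). Many assemblers instead use (k-1)-mers as nodes and k-mers as edges; the function name `de_bruijn_graph` is illustrative, and simplification and paired-end refinement are not shown.

```python
def de_bruijn_graph(reads, k):
    """Map each distinct k-mer to the set of k-mers that follow it in some read."""
    graph = {}
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        for kmer in kmers:
            graph.setdefault(kmer, set())  # each k-mer is stored only once
        for left, right in zip(kmers, kmers[1:]):
            graph[left].add(right)
    return graph


# Two overlapping reads share the k-mer GGC, which becomes a single node.
g = de_bruijn_graph(["ATGGC", "GGCTA"], k=3)

# A tandem repeat collapses into a cycle of two nodes that a path
# has to walk through more than once to spell the original read.
repeat = de_bruijn_graph(["ATATAT"], k=3)
```

The whole k-mer set has to be held in memory at once, which is why RAM becomes the bottleneck for large genomes.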

5
Q

How to improve de novo assemblies?

A

Paired-end (PE) and mate-pair (MP) reads, better coverage, hybrid methods.

6
Q

N50

A

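Sort the contigs from longest to shortest and add up their lengths. N50 is the length of the contig at which the running total first reaches half of the total assembly length, that is, the smallest contig in the largest half of the assembly.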
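The calculation can be sketched in a few lines. The function name `n50` and the example contig lengths are made up for illustration.

```python
def n50(lengths):
    """Length of the contig at which the cumulative sum reaches half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty assembly


# Total is 290, half is 145: 80 alone is not enough, 80 + 70 = 150 is.
value = n50([80, 70, 50, 40, 30, 20])
```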

7
Q

Influence of k-mer size on the de Bruijn graph

A

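A large k limits the overlaps that can be used, because two reads connect only if they share at least k identical bases. A small k gives a complex graph, because short k-mers occur in many places and create branches.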
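Both effects can be seen on toy data. The sketch below assumes error-free reads and the node-centric graph (distinct k-mers as nodes); `kmers`, `branching_nodes`, and `shared_kmers` are hypothetical helper names.

```python
def kmers(read, k):
    """All k-mers of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def branching_nodes(reads, k):
    """Number of k-mers that are followed by more than one distinct k-mer."""
    successors = {}
    for read in reads:
        ks = kmers(read, k)
        for left, right in zip(ks, ks[1:]):
            successors.setdefault(left, set()).add(right)
    return sum(1 for nxt in successors.values() if len(nxt) > 1)


def shared_kmers(a, b, k):
    """k-mers present in both reads; an empty set means the reads stay unconnected."""
    return set(kmers(a, k)) & set(kmers(b, k))


# Small k: the repeated ACG creates a branch. Larger k: the graph is linear.
small_k = branching_nodes(["ACGTACGA"], k=3)
large_k = branching_nodes(["ACGTACGA"], k=5)

# These reads overlap by only 3 bases (TAC), so k = 5 cannot join them.
joined_at_3 = shared_kmers("ACGTAC", "TACGA", k=3)
joined_at_5 = shared_kmers("ACGTAC", "TACGA", k=5)
```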
