Lecture 6 Flashcards

1
Q

What forms of repetitive sequences are dominant contributors to genome size variation?

A
  • Highly repetitive
  • Moderately repetitive
2
Q

What are 2 approaches to genome sequencing?

A
  • Top down
  • Bottom up
3
Q

What is the “top down” approach to genome sequencing?

A
  • Break large sequences into smaller, overlapping pieces
  • Together, these pieces cover the whole genome
  • Fragment each piece, then sequence and assemble it

Called Hierarchical (or clone contig) sequencing (HS)

4
Q

What is the “bottom up” approach to genome sequencing?

A
  • Fragment the whole genome into small pieces
  • Sequence and assemble

Called Whole Genome Shotgun (WGS)
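
As a rough illustration of "sequence and assemble", the sketch below merges reads by repeatedly joining the pair with the longest suffix/prefix overlap. This is a toy greedy strategy chosen for brevity, not the method used by any particular assembler; the function names, the `min_len` threshold and the example reads are all assumptions made for the example.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that is also a prefix of b.

    Overlaps shorter than min_len are treated as no overlap (returns 0).
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy assembler: keep merging the best-overlapping pair of reads."""
    reads = list(reads)
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads)
                 if idx not in (best_i, best_j)] + [merged]
    return reads


# Three hypothetical overlapping reads from a 15-base "genome"
contigs = greedy_assemble(["ATGGCGTA", "GCGTACGTT", "ACGTTAGC"])
```

With repeat-rich input, a greedy merge like this can join the wrong reads, which is the assembly problem the repeat cards in this deck describe.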

5
Q

What is a main obstacle of shotgun sequencing?

A

Repetitive genomes (with tandem and interspersed repeats) are difficult to assemble correctly

6
Q

How do you overcome the problem of small, high-copy-number repeats (like Alu elements) and large segmental duplications in WGS?

A

Sequence the ends of larger clones (10-15 kb) to confirm that genome-wide repeat arrays are maintained in the assembled sequence

7
Q

List 3 pros and cons for HS

A

Pros:

  • Accurate
  • Coverage is known
  • Sequential

Cons:

  • Slow
  • Expensive
  • Map dependent
8
Q

List 3 pros and cons for WGS

A

Pros:

  • Fast
  • Cheap
  • Map independent

Cons:

  • Less accurate
  • Coverage is estimated
  • Periodic
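
One common way to estimate shotgun coverage (an assumption here, since the card does not say which estimate the lecture used) is the Lander-Waterman style calculation: average coverage is total sequenced bases divided by genome size, and under a Poisson model the expected fraction of the genome left unsequenced is e^(-coverage). The function names and numbers below are hypothetical.

```python
import math


def expected_coverage(n_reads, read_len, genome_size):
    """Average depth of coverage: total bases sequenced / genome size."""
    return n_reads * read_len / genome_size


def fraction_uncovered(coverage):
    """Expected fraction of bases hit by no read, assuming reads start
    uniformly at random (Poisson approximation)."""
    return math.exp(-coverage)


# Hypothetical run: 1,000 reads of 500 bp over a 100 kb genome
c = expected_coverage(1000, 500, 100_000)
gap = fraction_uncovered(c)
```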
9
Q

List 2 types of sensors for gene identification

A
  • Signal sensors: detect short (discrete) sequence motifs (consensus); e.g. start/stop codons, splice donor/acceptor sites
  • Content sensors: detect long (extended) sequence features (no consensus); e.g. CpG islands (vertebrates), coding vs intergenic
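
A minimal sketch of the two sensor types, under stated assumptions: the signal sensor is reduced to exact matching of start/stop codons, and the content sensor to the observed/expected CpG ratio often used when scanning for CpG islands. The function names are hypothetical, and real gene finders score these features far more carefully.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")


def signal_hits(seq, motifs):
    """Signal sensor: (position, motif) for every exact motif match."""
    seq = seq.upper()
    return sorted(
        (i, m)
        for m in motifs
        for i in range(len(seq) - len(m) + 1)
        if seq.startswith(m, i)
    )


def cpg_obs_exp(seq):
    """Content sensor: observed/expected CpG ratio over the whole sequence.

    observed = number of CG dinucleotides
    expected = (number of C * number of G) / sequence length
    """
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


hits = signal_hits("ATGAAATAG", ("ATG",) + STOP_CODONS)
ratio = cpg_obs_exp("CGCGCGCG")
```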
10
Q

What are long open reading frames (ORFs) indicative of?

A

Genes
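
To make "long ORF" concrete, here is a small sketch that scans the three forward reading frames for ATG...stop stretches and keeps only those above a length cutoff. The `min_codons` threshold and the function name are assumptions for illustration, and the reverse strand is deliberately ignored to keep the example short.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for forward-strand ORFs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon; `end` is the index just past the stop. ORFs with fewer than
    min_codons codons before the stop are discarded.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


orfs = find_orfs("CCATGAAACCCGGGTAGTT")
```

Raising `min_codons` removes chance ORFs but also drops genuinely short coding stretches.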

11
Q

What does high gene density imply about ORFs?

A
  • Long ORFs
  • Little intergenic DNA
12
Q

What does low gene density imply about ORFs?

A
  • Short ORFs (Exons)
  • Much intergenic DNA
13
Q

List 2 problems with ORF detection

A
  • Short exons are ignored (short ORFs also occur by chance)
  • Boundaries missed (start codons, untranslated exons)
14
Q

What is the ab initio method of gene prediction?

A
  • Inferring the existence of candidate genes
  • Done by identifying appropriate arrangements of signal features and content features
15
Q

What is the combined ab initio and homology method of gene prediction?

A
  • Test candidate genes for similarity to known genes (e.g. with BLASTP)
16
Q

How do Hidden Markov model programs function?

A

Build explicit mathematical models using training data
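
As a sketch of what "building a model from training data" can mean for an HMM, the code below estimates emission and transition probabilities by simple counting over one labeled sequence. The two state labels (`N` for non-coding, `C` for coding), the function name and the tiny training example are assumptions; real gene-finding HMMs use many more states and smoothed estimates.

```python
from collections import Counter, defaultdict


def train_hmm(sequence, labels):
    """Maximum-likelihood HMM parameters from one labeled sequence.

    sequence: observed bases, e.g. "ATGCGC"
    labels:   hidden state for each base, e.g. "NNCCCC"
    Returns (emission, transition) as nested dicts of probabilities.
    """
    emit = defaultdict(Counter)
    trans = defaultdict(Counter)
    for base, state in zip(sequence, labels):
        emit[state][base] += 1
    for prev, nxt in zip(labels, labels[1:]):
        trans[prev][nxt] += 1

    def normalize(table):
        # Turn raw counts into probabilities that sum to 1 per state
        return {
            state: {k: v / sum(counts.values()) for k, v in counts.items()}
            for state, counts in table.items()
        }

    return normalize(emit), normalize(trans)


emission, transition = train_hmm("ATGCGC", "NNCCCC")
```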

17
Q

What is sensitivity with regard to gene predictions?

A

Ratio of true positives (predicted) to actual (confirmed) positives

18
Q

What is specificity with regard to gene predictions?

A

Ratio of true positives to predicted positives
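
The two ratios on the sensitivity and specificity cards can be written out as a quick check. TP, FN and FP are the usual true-positive, false-negative and false-positive counts; the function names and example numbers are hypothetical. Note that specificity as defined on this card, TP / (TP + FP), is what other fields usually call precision.

```python
def sensitivity(tp, fn):
    """True positives / actual positives = TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """True positives / predicted positives = TP / (TP + FP),
    following the gene-prediction usage on the card."""
    return tp / (tp + fp)


# Hypothetical evaluation: 100 real genes, 120 predictions, 80 correct
sn = sensitivity(tp=80, fn=20)
sp = specificity(tp=80, fp=40)
```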