Lecture 6 Flashcards
What forms of repetitive sequences are dominant contributors to genome size variation?
- Highly repetitive
- Moderately repetitive
What are 2 approaches to genome sequencing?
- Top down
- Bottom up
What is the “top down” approach to genome sequencing?
- Break large sequences into smaller pieces which overlap
- These pieces cover the whole genome
- Fragment these pieces, sequence
Called Hierarchical (or clone contig) sequencing (HS)
What is the “bottom up” approach to genome sequencing?
- Fragment the whole genome into small pieces
- Sequence and assemble
Called Whole Genome Shotgun (WGS)
What is a main obstacle to shotgun sequencing?
Repetitive genomes (with tandem and interspersed repeats) are difficult to assemble correctly
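A minimal sketch of why repeats confuse assembly, under simplifying assumptions: reads are modelled as error-free k-mers, and the repeat and both genomes are made-up toy sequences (all names are hypothetical). Two different genomes give exactly the same set of short reads when no read spans a whole repeat copy plus both flanks, so the reads alone cannot fix the order of the segments between repeats.

```python
from collections import Counter

def kmer_spectrum(genome, k):
    """Multiset of all length-k substrings (idealised, error-free reads)."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

REPEAT = "GATTACA"  # hypothetical 7 bp interspersed repeat

# Two toy genomes that differ only in the order of the unique
# segments (CCCC and GGGG) lying between repeat copies.
genome_1 = "AAAA" + REPEAT + "CCCC" + REPEAT + "GGGG" + REPEAT + "TTTT"
genome_2 = "AAAA" + REPEAT + "GGGG" + REPEAT + "CCCC" + REPEAT + "TTTT"

# Reads shorter than the repeat cannot tell the two genomes apart.
short_reads_identical = kmer_spectrum(genome_1, 6) == kmer_spectrum(genome_2, 6)

# Reads long enough to cross a repeat copy and touch both flanks can.
long_reads_identical = kmer_spectrum(genome_1, 9) == kmer_spectrum(genome_2, 9)

print(short_reads_identical, long_reads_identical)
```

Here the longer reads stand in for any information that bridges a repeat; real projects rarely have reads longer than every repeat, so they need another source of linking information.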
How do you overcome the problem of small high copy number repeats (like Alu elements) and large segmental duplications in WGS?
Sequence the ends of larger clones (10-15 kb) to confirm that genome-wide repeat arrays are maintained in the assembled sequence
List 3 pros and cons for HS
Pros:
- Accurate
- Coverage Known
- Sequential
Cons:
- Slow
- Expensive
- Map dependent
List 3 pros and cons for WGS
Pros:
- Fast
- Cheap
- Map independent
Cons:
- Lower accuracy
- Coverage is estimated
- Periodic
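One common way to estimate WGS coverage (not taken from the lecture) is the Lander-Waterman style calculation: coverage c = N x L / G, with about e^(-c) of the genome expected to stay unsequenced if reads land at random. The sketch below uses made-up numbers, and the function names are hypothetical.

```python
import math

def expected_coverage(num_reads, read_length, genome_size):
    """Average number of times each base is sequenced: c = N * L / G."""
    return num_reads * read_length / genome_size

def expected_fraction_unsequenced(coverage):
    """Poisson estimate of the fraction of bases covered by no read."""
    return math.exp(-coverage)

# Hypothetical project: 60,000 reads of 500 bp over a 3 Mb genome.
c = expected_coverage(num_reads=60_000, read_length=500, genome_size=3_000_000)
gap_fraction = expected_fraction_unsequenced(c)

print(c, gap_fraction)
```

This treats read placement as uniformly random, which is why the result is an estimate rather than a known coverage as in HS.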
List 2 types of sensors for gene identification
- Signal sensors; detect short (discrete) sequence motifs (consensus); eg start/stop codons, splice donor/acceptor sites
- Content sensors; detect long (extended) sequence motifs (no consensus); eg CpG islands (vertebrates), coding vs intergenic
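The two sensor types can be sketched as follows, as an illustration only. The signal sensor reports positions of an exact short motif. The content sensor computes the observed/expected CpG ratio, a statistic commonly used when looking for CpG islands. The function names and the test sequence are hypothetical, and real gene finders use probabilistic models rather than exact matching.

```python
import re

def signal_hits(seq, motif):
    """Signal sensor: 0-based start positions of every exact motif match."""
    # The lookahead lets overlapping matches be reported too.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]

def cpg_obs_exp(seq):
    """Content sensor: observed/expected CpG ratio over the whole sequence."""
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")
    if c_count == 0 or g_count == 0:
        return 0.0
    return cpg_count * len(seq) / (c_count * g_count)

seq = "ATGCGCGTAA"  # made-up example sequence
start_positions = signal_hits(seq, "ATG")
stop_positions = signal_hits(seq, "TAA")
cpg_ratio = cpg_obs_exp(seq)

print(start_positions, stop_positions, round(cpg_ratio, 2))
```

In practice a content sensor would slide a window along the sequence and score each window, instead of scoring the sequence as a whole.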
What are long open reading frames (ORFs) indicative of?
Genes
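A minimal ORF scan might look like the sketch below. It assumes the forward strand only, the standard stop codons, and an arbitrary minimum length; the function name, the `min_codons` parameter and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for ATG...stop ORFs on the forward strand.

    start is 0-based, end is exclusive and includes the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

seq = "CCATGAAATTTTAGGG"  # made-up example sequence
orfs = find_orfs(seq)
print(orfs)
```

Raising `min_codons` filters out short ORFs that arise by chance, but it also drops genuine short exons.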
What ORF features are associated with high gene density?
- Long ORFs
- Little intergenic DNA
What ORF features are associated with low gene density?
- Short ORFs (Exons)
- Much intergenic DNA
List 2 problems with ORF detection
- Short exons are ignored (by chance)
- Boundaries missed (start codons, untranslated exons)
What is the ab initio method of gene prediction?
- Inferring candidate gene existence
- Done by identifying appropriate arrangements of signal features and content features
What is the ab initio and homology method of gene prediction?
- Test candidate gene similarity to known genes (BLASTP)