Structure of Genomes I & II (lectures 2-3) Flashcards

(54 cards)

1
Q

Number of genomes sequenced

(February 2024;
https://www.ncbi.nlm.nih.gov/genome)

A
  • Eukaryotes: 30,530
  • Prokaryotes: 567,228
2
Q

Why is understanding genome structure important? = 3

A

1 * Medicine – predisposition to certain diseases, response to drugs

2 * Explanations for evolutionary change

3 * Production of “better” food (plants and animals), fodder, and fuel

3
Q

Gene Ontology (GO) Annotations (from the website: https://geneontology.org/):

“Associations of gene products to GO terms are statements that describe”: 3

A
  1. “Molecular Function:
    - the molecular activities of individual gene products”
  2. “Cellular Component:
    - where the gene products are active”
  3. “Biological Process:
    - the pathways and larger processes to which that gene product’s activity contributes”.
4
Q

Prokaryotic genomes: What are they?

differences?

A
  1. Archaea
  2. Bacteria

Differences:
different MOLECULAR and GENETIC characteristics

5
Q

Prokaryotic genomes:

Bacteria and Archaea

Similarities?

A

1 * no extensive internal compartments

2 * chromosome / nucleoid

3 * plasmid(s)
    4 * integrative or independent

6
Q

Prokaryotic genomes: How is DNA packaged? = 6

A
  1. DNA packaged with DNA-binding proteins
    2 * typically circular genomes
    3 * negative supercoiling
  4. Protein core limits loss of supercoiling if a break occurs
    5 * domains
    6 * loops ~10 – 100 kb
7
Q

Escherichia coli nucleoid:

A

Most is known about the ‘Escherichia coli’ nucleoid, but DNA-binding proteins are found in other species, including archaea.

8
Q

Prokaryotic genomes cont’d = 5

shape and coiling

A
  1. Circular, double-stranded DNA
  2. Remove a few turns of the double helix
  3. Molecule forms a negative supercoil

Diagram labels: protein core; supercoiled DNA loops; broken loop (no supercoiling)

Look at the diagrams.

9
Q

Prokaryotic genome size:

A

Prokaryotic genome size is variable

  • most < 5 Mb
  • range: 112 kb – 14.8 Mb
  • average gene density: ~950 genes / 1Mb
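
The average density on this card can be turned into a quick back-of-envelope estimate. The sketch below is only an illustration: the function name `estimate_gene_count` is made up here, and it assumes the ~950 genes / Mb average holds for every genome, which real genomes only approximate.

```python
# Rough estimate only: assumes the card's average density of ~950 genes per Mb
GENES_PER_MB = 950


def estimate_gene_count(genome_size_mb, genes_per_mb=GENES_PER_MB):
    """Return an approximate gene count for a prokaryotic genome of the given size (Mb)."""
    return round(genome_size_mb * genes_per_mb)


# Sizes taken from the card: smallest genome, the "most are below" mark, largest genome
for size_mb in (0.112, 5.0, 14.8):
    print(f"{size_mb} Mb -> ~{estimate_gene_count(size_mb)} genes")
```

Because the density is roughly constant, the estimate grows in step with genome size, which is the same relationship the later card on genome size and gene number describes.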
10
Q

Prokaryotic genome size: Largest genome and smallest genome:

A

1 * largest genomes found in free-living soil bacteria – ability to respond to changing environment

2 * smallest genomes in endosymbiotic bacteria – more consistent environment

11
Q

Prokaryotic genes - commonly arranged …?

A
  • commonly arranged in OPERONS (not universal)
  • group of genes
  • involved in the same biochemical pathway and
  • expressed as a single unit
12
Q

Prokaryotic genome size…is proportional to?

A

Prokaryotic GENOME size is proportional to GENE NUMBER

13
Q

Prokaryotic genome organisation and content: 4

A
  1. Majority of BACTERIAL and ARCHAEAL genomes are CIRCULAR
    2 * SOME are LINEAR,
    e.g. Borrelia burgdorferi (causative agent of Lyme disease)
  3. Many PROKARYOTIC genomes are MULTIPARTITE
    4 * two or more molecules
14
Q

Prokaryotic genome organisation and content :

PLASMIDS = 7

A
  1. typically CIRCULAR
  2. REPLICATION INDEPENDENT of nucleoid
  3. up to 1000s of COPIES / cell
  4. PARTITIONED TO NEW CELLS INDEPENDENT OF NUCLEOID
  5. contain GENES NOT ESSENTIAL FOR SURVIVAL IN PERMISSIVE HABITATS/CONDITIONS
  6. TRANSFERRED to / TAKEN UP BY VARIOUS SPECIES
  7. argument: plasmids should not be included in the definition of prokaryotic genomes, BUT….
15
Q

Essential genes are found on ‘D. radiodurans’ R1 plasmids….

EXPLAIN = 4

A

Another example: ‘B. burgdorferi’ (causative agent of Lyme disease)

  • 1 linear chromosome
  • 19 linear and circular plasmids
  • indispensable genes, e.g. encoding some membrane proteins
16
Q

Chromosome vs chromid vs plasmid

A
  1. CHROMOSOME (s) – located in nucleoid, carries essential genes
  2. CHROMID – uses plasmid partitioning system, carries essential genes
  3. PLASMID – uses plasmid partitioning system, carries nonessential genes
17
Q

‘V. cholerae’ vs ‘D. radiodurans’

A

‘V. cholerae’ : one chromosome, one chromid

‘D. radiodurans’ : two chromosomes and two chromids

18
Q

‘E. coli’ genome… space? separation?

other parts

A
  1. 11% of genome = non-coding DNA
    • little space between genes
    • some genes separated by only a single
      nucleotide (thrA and thrB) or none (thrB and
      thrC)
    • thrA-C = operon; encodes proteins for
      threonine biosynthesis
  2. Some archaeal genes have introns
  3. Some prokaryotes contain nested genes = genes encoded within other genes (aka overlapping genes)
  4. Bacterial genes = slightly longer than archaeal genes
18
Q

What is in ‘E. coli’ genome SEQUENCES? = 8

A
  1. repeat sequences
    • few high-copy-number interspersed repeat families (compared with eukaryotes)
    • insertion sequences (IS)
      • mobile elements (transposons) repeated in the genome
    • nontransposable repeat elements
      • repetitive extragenic palindromic (REP) sequences
        • gene regulation?
      • clustered regularly interspaced short palindromic repeats (CRISPRs)
19
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; where, who? = 5

A
    • gene flow between prokaryotic species
        • frequent
    • most prokaryotic genomes contain hundreds of kb of DNA from different prokaryotic species
        • transfers occur between bacteria and archaea
    • DNA originates from the environment, exchange of plasmids and viral vectors
20
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; how? = 7

A
  1. multiple genes in a single transfer
    • mechanisms of transfer
      3. * transformation
      4. * conjugation
      5. * transduction
    • confuses species relationships
      7. * laterally transferred gene will have relatively similar sequences in two species – due to little time for sequence divergence
21
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; EXAMPLES? = 4

A
    • antibiotic resistance
    • ability to tolerate hot environments
    • anaerobic to aerobic GROWTH HABITS
    • METABOLIC PATHWAYS
22
Q

Prokaryotic genome organisation and content cont’d DIAGRAMS

23
Q

Prokaryotic gene function catalogue (GO Terms):

‘E. coli’ genome = 5
FUNCTION
GENE FAMILIES

A
  1. function of many genes still not known
      • no significant similarity to any known genes in other bacteria
    • gene families
      • genes having arisen from gene duplication events
        • e.g. rRNA genes in bacteria and archaea
24
Q

Eukaryotic genomes: Heterochromatin vs Euchromatin = 16

A
  1. LINEAR CHROMOSOMES with NEGATIVE SUPERCOILING in MEMBRANE-BOUND NUCLEUS

  'Heterochromatin'
    2 * densely staining regions in interphase nucleus
    3 * chromatin densely packed
    4 * constitutive
      5 * permanently condensed chromatin
      6 * DNA is gene poor - but does contain some genes
      7 * centromeric, telomeric
      8 * many repeat regions
      9 * other chromosome regions
      10 * most of the human Y chromosome
    11 * facultative
      12 * not permanently condensed
      13 * exists in some cells at some times
      14 * DNA encodes genes that are inactive at particular times or in particular cells

  'Euchromatin'
    15 * cannot see in interphase nucleus
    16 * DNA is less condensed and gene rich
25
Q

Eukaryotic genome organisation: 'Nucleosomes' = 7

A
  1 * protein core = histone octamer
    2 * 2 copies each of histones H2A, H2B, H3, and H4
  3 * DNA ~147 bp
  4 * “slide”
    5 * expose chromatin regions for transcription
    6 * also involves chromatin-remodelling proteins
  7 * removed and replaced during DNA replication
26
Q

Eukaryotic genome organization cont’d: 'Higher orders of chromatin structure' = 4

UNDERSTANDING EUCHROMATIN LOOPS

A
  1. Euchromatin loops ARE
    2 * dynamic
    3 * extension / merging – allow access to transcriptional machinery
    4 * condensation – repress transcription
27
Q

Eukaryotic genome organization cont’d: 'Higher orders of chromatin structure' = DIAGRAM

A

SLIDE 22
28
Q

Eukaryotic genome organization: 'Organisation of chromosomes in interphase nucleus' = 13

A
  1. Organisation of chromosomes in interphase nucleus
  2. CT – chromosome territory
    3 * specific for each chromosome
  4. LAD – lamina-associated domain
    5 * heterochromatin interacting with nuclear lamins at nuclear periphery
  6. TAD – topologically associating domain
    7 * Compartment A
      8 * transcriptionally active
      9 * enhancer-promoter interactions
    10 * Compartment B
      11 * transcriptionally repressed (facultative heterochromatin)
    13 * chromatin switches between compartments, depending on gene expression demands
29
Q

Eukaryotic genome organization: 'INSULATORS' = 7

A
  1. define TAD boundaries
    2 * 1-2 kb long
    3 * in many (all?) eukaryotes
    4 * DNase I insensitive
    5 * interact with specific DNA-binding proteins
    6 * establish functional domains, e.g. loop
    7 * prevent cross-talk of regulatory domains between functional domains
30
Q

Eukaryotic genome organization: 'INSULATORS' DIAGRAM

A

SLIDE 24
31
Q

Eukaryotic genome size: 3

A
  1. Great VARIABILITY in eukaryotic genome size
    2 * 10 Mb – 100,000 Mb
  3 * EUKARYOTIC GENOME SIZE is NOT PROPORTIONAL TO GENE NUMBER
32
Q

Eukaryotic genome size: correlation? = 4

A
  1. Overall correlation of genome size with morphological complexity of organisms
    2 * HOWEVER, no precise correlation between genome size and complexity
    3 * especially evident when looking within eukaryotic groups
    4 * C-value paradox/enigma (C-value = haploid genome size)
33
Q

Eukaryotic genome size: 'Factors contributing to the C-value paradox/enigma' = 3

A

Factors contributing to the C-value paradox/enigma
  1 * non-protein coding DNA
  2 * gene density
  3 * “split genes” – # introns / gene
34
Q

Non-protein coding DNA scales with morphological complexity DIAGRAM

A

SLIDE 27: PROKARYOTES AT THE BOTTOM, EUKARYOTES = MOST
35
Q

Repeat sequences in eukaryotic genomes: 'INTERSPERSED REPEATS' = 10

A
  1. GENOMES of most MULTICELLULAR EUKARYOTES have substantial amounts of moderately and HIGHLY REPETITIVE SEQUENCES
  2. Interspersed repeats
    3 * repeat units distributed (seemingly) randomly around the genome
    4 * in intergenic regions and introns
    5 * DNA transposons
    6 * retrotransposons
      7 * LTR – long terminal repeat retrotransposons
      8 * Non-LTR retrotransposons
        9 * SINE – short interspersed nuclear element
        10 * LINE – long interspersed nuclear element
36
Q

Repeat sequences in eukaryotic genomes: TANDEM REPEATS = 16

A

Tandem repeats
  1 * repeat units located next to each other
  2 * satellite DNA (satellite bands after fractionation and density gradient centrifugation of genomic DNA)
    3 * repeat unit < 5 bp to > 200 bp
    4 * clusters 100s of kb in length
    5 * e.g. centromeric DNA
  6 * minisatellites
    7 * not part of satellite bands on gradients
    8 * repeat unit up to 25 bp
    9 * clusters up to 20 kb
    10 * e.g. telomeric DNA
  11 * microsatellites
    12 * not part of satellite bands on gradients
    13 * repeat unit < 13 bp
    14 * clusters < 150 bp
    15 * used to establish kinship
    16 * an individual’s genetic profile

DIAGRAM SLIDE 29
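
To make "repeat units located next to each other" concrete, here is a minimal sketch that scans a sequence for microsatellite-like runs. The function name `find_microsatellites` and the toy sequence are made up for illustration. The unit length limit (1-12 bp) follows the card's "< 13 bp"; the minimum of four copies is an assumption of this sketch, not something from the lecture.

```python
import re

# Repeat unit of 1-12 bp (card: "< 13 bp") followed by at least 3 more copies.
# The 4-copy minimum is an assumed threshold for this sketch.
MICROSATELLITE = re.compile(r"([ACGT]{1,12}?)\1{3,}")


def find_microsatellites(sequence):
    """Return a list of (start, unit, copies) for tandem runs on the given strand."""
    hits = []
    for match in MICROSATELLITE.finditer(sequence.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


toy_sequence = "GGATC" + "CA" * 6 + "TTGAC"  # invented sequence, not real data
print(find_microsatellites(toy_sequence))  # -> [(5, 'CA', 6)]
```

The non-greedy unit means the shortest repeating unit is reported, so a run of CACACACA is recorded as CA x 4 rather than CACA x 2. Differences between individuals in the number of copies at such loci are what make microsatellites useful for the kinship and genetic profiling uses listed on the card.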
37
Q

Eukaryotic genomes contain pseudogenes (2 TYPES) AND gene relics = 9

A
  1. Conventional pseudogene
    2 * inactivated due to mutation
  3. Processed pseudogene
    4 * derived from an mRNA that is converted to cDNA and reinserts into genome
    5 * no introns or regulatory regions that ancestral gene had
    6 * inactivated
  7. Gene relics
    8 * truncated gene – from 5’ or 3’ end
    9 * gene fragments
38
Q

Eukaryotic genomes contain pseudogenes and gene relics = DIAGRAM

A

SLIDE 30
39
Q

Eukaryotic gene density

A

Genes are more closely packed along the chromosomes of less complex organisms
40
Q

Less complex organisms contain fewer split genes

A

Genome of yeast compared to genomes of more complex eukaryotes
  * few genes with introns – yeast genome has 239 introns; human genome > 300,000 introns
41
Q

G-value paradox

A
  - Gene number does not scale with morphological complexity
  - Alternative splicing leads to multiple mRNAs and proteins from a single gene
    * explains part of the C- and G-value paradoxes
42
Q

G-value paradox = DIAGRAM

A

SLIDE 33
43
Q

Eukaryotic genomes contain gene deserts: 9

WHAT, SIGNIFICANCE? IN HUMAN GENOME?

A
  1. Large regions of chromosomes (10^5-10^6 bp) devoid of known genes or other functional genetic elements
    2 * human genome
      3 * 25% consists of gene deserts
      4 * chromosomes 4, 5 and 13 (30-40% of the chromosomes)
    5 * significance of gene deserts
      6 * not known
      7 * some contain regulatory sequences that act over large distances to control gene expression
      8 * others show no clear function
      9 * superfluous regions of genomes??
44
Q

Eukaryotic genomes contain gene families: SIMPLE VS COMPLEX: SIMPLE... = 9

A
  1. Simple (aka classical) gene families
    2 * all members have identical or nearly identical sequences
    3 * arose from gene duplication events
    4 * rRNA genes
      5 * humans:
        6 * 2000 genes for 5S rRNA
        7 * single cluster on chromosome 1
        8 * 280 copies of 28S, 5.8S, 18S repeat unit
        9 * 50-70 repeats clustered on multiple chromosomes
45
Q

Eukaryotic genomes contain gene families: SIMPLE VS COMPLEX: COMPLEX... = 7

A
  1. Complex gene families
    2 * members have similar sequences
    3 * different enough to code for gene products with different properties
    4 * arose from gene duplication events
    5 * mammalian globin genes
      6 * expressed at different developmental stages
      7 * biochemical properties correlate to physiological needs during development
46
Q

Eukaryotic genomes contain nested genes? How many categories? = 7

A
  1. Overlapping genes found in the genomes of yeast, protists and metazoans
  2. Two major categories
    3 * genes nested within intron of another gene (= external host gene)
      4 * relatively common in eukaryotes
    5 * non-intronic genes nested opposite coding sequence of external host gene
      6 * no clear evidence of these in metazoan genomes
      7 * present in yeast and protistan genomes (and prokaryotic genomes)
47
Q

Eukaryotic genomes contain nested genes DIAGRAM

A

SLIDE 36
48
Q

Eukaryotic gene function catalogues (GO Terms): = 8

A
  1. Human genome
    2 * greatest number of genes in all categories except metabolism
    3 * many more genes involved in defence and immunity
  4. 'Caenorhabditis elegans' (nematode worm) genome
    6 * high number of genes in cell-cell communication category
    7 * 1000 genes vs 1250 in humans
    8 * BUT only 959 cells vs 10^13 cells in humans
49
Q

The Hidden Genome: Non-coding (nc)RNAs

A

Non-coding (nc)RNAs
  * tRNAs, rRNAs
  * circRNA (circular RNA)
  * eRNA (enhancer RNA)
  * lincRNA (long intergenic non-coding RNA)
  * microRNA (miRNA)
  * NAT (natural antisense transcript)
  * piRNA (PIWI-interacting RNA)
  * scaRNA (small Cajal body-specific RNA)
  * siRNA (small interfering RNA)
  * snRNA (small nuclear RNA)
  * snoRNA (small nucleolar RNA)
50
Q

The Hidden Genome = RNAs encoding microproteins and peptides = 8

A
  1. smORF = small open reading frame
    2 * shorter than 100 amino acids
    3 * dORF = downstream open reading frame
      4 * located in the 3’-UTR of known protein-coding genes
    5 * uORF = upstream-encoded smORF
      6 * located in the 5’-UTR of known protein-coding genes
    7 * nuORF = novel unannotated open reading frame
  8 * SEP = small peptide
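
The "shorter than 100 amino acids" cutoff can be illustrated with a small ORF scan. This is a simplified sketch, not an annotation pipeline: the function name `find_smorfs` and the toy sequence are invented here, and the code assumes an ATG start codon, the three standard stop codons, and the forward strand only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(sequence, max_amino_acids=100):
    """Return (start, end, n_amino_acids) for forward-strand ORFs below the size cutoff.

    Assumptions of this sketch: ORFs start at the first ATG in a frame and end
    at the next in-frame stop codon; the reverse strand is not scanned.
    """
    seq = sequence.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_amino_acids = (i - start) // 3  # stop codon is not counted
                if n_amino_acids < max_amino_acids:
                    hits.append((start, i + 3, n_amino_acids))
                start = None
    return hits


toy_sequence = "CCATGGCTTGATAG"  # invented sequence: ATG GCT TGA in frame 2
print(find_smorfs(toy_sequence))  # -> [(2, 11, 2)]
```

Whether a hit would count as a uORF or dORF depends on where it lies relative to an annotated coding sequence (5’-UTR or 3’-UTR), which this sketch does not model.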
51
Q

The Hidden Genome DIAGRAM

A

SLIDE 38
52
Q

The Forbidden Genome: 4

A
  1. Short DNA sequences not compatible with life
    2 * minimal absent words (MAWs)
    3 * not found in a particular genome (nullomers)
    4 * not found in any genome (primes)
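
The idea of a word that is "not found in a particular genome" can be sketched by listing every possible word of length k and removing the ones that occur. The function name `nullomers` and the toy sequence are made up for illustration, and the sketch checks only the strand it is given (a real analysis would normally also consider the reverse complement and use far longer sequences and larger k).

```python
from itertools import product


def nullomers(sequence, k):
    """Return all DNA words of length k that never occur in `sequence` (one strand only)."""
    seq = sequence.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_words = ("".join(letters) for letters in product("ACGT", repeat=k))
    return [word for word in all_words if word not in present]


toy_genome = "ACGTACGGTTCA"  # invented sequence, for illustration only
print(nullomers(toy_genome, 2))
# -> ['AA', 'AG', 'AT', 'CC', 'CT', 'GA', 'GC', 'TG']
```

Running the same check across many genomes and keeping only the words absent from all of them would give the "primes" on the card.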
53
Q

The Forbidden Genome: USES = 4

A
  1 * tags to distinguish samples (e.g. control or reference samples vs forensic samples)
  2 * suicide genes that could be encoded by genetically modified organisms and activated to destroy them if they prove dangerous
  3 * anticancer peptides (NulloPs)
  4 * biomarkers for cancers