Structure of Genomes I & II (lectures 2-3) Flashcards

1
Q

Number of genomes sequenced

(February 2024; https://www.ncbi.nlm.nih.gov/genome)

A
  • Eukaryotes: 30,530
  • Prokaryotes: 567,228
2
Q

Why is understanding genome structure important? = 3

A

1 * Medicine – predisposition to certain diseases, response to drugs

2 * Explanations for evolutionary change

3 * Production of “better” food (plants and animals), fodder, and fuel

3
Q

Gene Ontology (GO) Annotations (from the website: https://geneontology.org/):

“Associations of gene products to GO terms are statements that describe”: 3

A
  1. “Molecular Function:
    - the molecular activities of individual gene products”
  2. “Cellular Component:
    - where the gene products are active”
  3. “Biological Process:
    - the pathways and larger processes to which that gene product’s activities contributes”.
4
Q

Prokaryotic genomes: What are they?

differences?

A
  1. Archaea
  2. Bacteria

Differences:
different MOLECULAR and GENETIC characteristics

5
Q

Prokaryotic genomes:

Bacteria and Archaea

Similarities?

A

1 * no extensive internal compartments

2 * chromosome / nucleoid

3 * plasmid(s)

4 * integrative or independent

6
Q

Prokaryotic genomes: How is DNA packaged? = 6

A
  1. DNA packaged with DNA-binding proteins
    • typically circular genomes
    • negative supercoiling
  2. Protein core limits loss of supercoiling if a break occurs
    • domains
    • loops ~10 – 100 kb
7
Q

Escherichia coli nucleoid:

A

Most is known about the ‘Escherichia coli’ nucleoid, but DNA-binding proteins are also found in other species, including archaea.

8
Q

Prokaryotic genomes cont’d = 5

shape and coiling

A
  1. Circular, double-stranded DNA
  2. Remove a few turns of the double helix
  3. Molecule forms a negative supercoil

Diagram labels: protein core; supercoiled DNA loops; broken loop (no supercoiling)

Look at the diagrams
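
The step from "remove a few turns" to "negative supercoil" can be written with the linking number. This is only a sketch in standard DNA-topology notation, none of which appears in the cards themselves: Lk (linking number), Tw (twist), Wr (writhe), σ (superhelical density), and an assumed relaxed B-DNA pitch of roughly 10.5 bp per turn.

```latex
\begin{align}
Lk &= Tw + Wr \\
Lk_0 &\approx \frac{N}{10.5} \quad \text{(relaxed circle of } N \text{ bp)} \\
\Delta Lk &= Lk - Lk_0 < 0 \quad \text{(turns removed before closing the circle)} \\
\sigma &= \frac{\Delta Lk}{Lk_0} < 0
\end{align}
```

Because Lk cannot change while both strands of the circle stay intact, the deficit is taken up largely as negative writhe, which is the negative supercoil. A break in one loop lets the strands rotate, so that loop relaxes while the protein core keeps the other loops supercoiled.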

9
Q

Prokaryotic genome size:

A

Prokaryotic genome size is variable

  • most < 5 Mb
  • range: 112 kb – 14.8 Mb
  • average gene density: ~950 genes / 1 Mb
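
As a rough worked example of the average gene density, the sketch below estimates a gene count from a genome size. The function name `estimate_gene_count` is hypothetical, and the result is only as good as the ~950 genes/Mb average; real genomes deviate from it.

```python
def estimate_gene_count(genome_size_bp, genes_per_mb=950):
    """Rough gene-count estimate from genome size.

    Assumes the average density quoted on the card (~950 genes per Mb).
    """
    return round(genome_size_bp * genes_per_mb / 1_000_000)


# Typical genome (< 5 Mb) and the two ends of the quoted size range.
for size_bp in (5_000_000, 112_000, 14_800_000):
    print(size_bp, "bp ->", estimate_gene_count(size_bp), "genes (estimate)")
```

The linear relationship is the same one stated on the card about genome size being proportional to gene number.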
10
Q

Prokaryotic genome size: Largest genome and smallest genome:

A

1 * largest genomes found in free-living soil bacteria – ability to respond to changing environment

2 * smallest genomes in endosymbiotic bacteria – more consistent environment

11
Q

Prokaryotic genome size - commonly arranged …?

A
  • genes commonly arranged in OPERONS (not universal)
  • operon = group of genes
    • involved in the same biochemical pathway and
    • expressed as a single unit
12
Q

Prokaryotic genome size…is proportional to?

A

Prokaryotic GENOME size is proportional to GENE NUMBER

13
Q

Prokaryotic genome organisation and content: 4

A
  1. Majority of BACTERIAL and ARCHAEAL genomes are CIRCULAR
    • SOME are LINEAR, e.g. Borrelia burgdorferi (causative agent of Lyme disease)
  2. Many PROKARYOTIC genomes are MULTIPARTITE
    • two or more molecules
14
Q

Prokaryotic genome organisation and content :

PLASMIDS = 7

A
  1. typically CIRCULAR
  2. REPLICATION INDEPENDENT of nucleoid
  3. up to 1000s of COPIES / cell
  4. PARTITIONED TO NEW CELLS INDEPENDENT OF NUCLEOID
  5. contain GENES NOT ESSENTIAL FOR SURVIVAL IN PERMISSIVE HABITATS/CONDITIONS
  6. TRANSFERRED to / TAKEN UP BY VARIOUS SPECIES
  7. argument: plasmids should not be included in the definition of prokaryotic genomes BUT….
15
Q

Essential genes are found on ‘D. radiodurans’ R1 plasmids….

EXPLAIN = 4

A

‘B. burgdorferi’ (causative agent of Lyme disease)

  • 1 linear chromosome
  • 19 linear and circular plasmids
  • indispensable genes, e.g. encoding some membrane proteins
16
Q

Chromosome vs chromid vs plasmid

A
  1. CHROMOSOME(S) – located in nucleoid, carries essential genes
  2. CHROMID – uses plasmid partitioning system, carries essential genes
  3. PLASMID – uses plasmid partitioning system, carries nonessential genes
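
The three definitions above reduce to two yes/no questions, which can be sketched as a small decision rule. The function name `classify_replicon` and the `"unclassified"` fallback are hypothetical; the cards do not say how to label a molecule that matches none of the three definitions.

```python
def classify_replicon(uses_plasmid_partitioning, carries_essential_genes):
    """Label a replicon using the two criteria listed on the card."""
    if carries_essential_genes:
        # Essential genes: chromid if it partitions like a plasmid,
        # otherwise a chromosome.
        return "chromid" if uses_plasmid_partitioning else "chromosome"
    if uses_plasmid_partitioning:
        return "plasmid"
    # Combination not covered by the card (assumption: leave it unlabeled).
    return "unclassified"


print(classify_replicon(False, True))   # chromosome
print(classify_replicon(True, True))    # chromid
print(classify_replicon(True, False))   # plasmid
```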
17
Q

‘V. cholerae’ vs ‘D. radiodurans’

A

‘V. cholerae’ : one chromosome, one chromid

‘D. radiodurans’ : two chromosomes and two chromids

18
Q

‘E. coli’ genome… space? separation?

other parts

A
  1. 11% of genome = non-coding DNA
    • little space between genes
    • some genes separated by only a single nucleotide (thrA and thrB) or none (thrB and thrC)
    • thrA-C = operon; encodes proteins for threonine biosynthesis
  2. Some archaeal genes have introns
  3. Some prokaryotes contain nested genes = genes encoded within other genes (aka overlapping genes)
  4. Bacterial genes = slightly longer than archaeal genes
18
Q

What is in ‘E. coli’ genome SEQUENCES? = 8

A
  1. repeat sequences
    • few high-copy-number interspersed repeat families (compared with eukaryotes)
      • insertion sequences (IS)
      • mobile elements (transposons) repeated in the genome
    • nontransposable repeat elements
      • repetitive extragenic palindromic (REP) sequences
        • gene regulation?
      • clustered regularly interspaced short palindromic repeats (CRISPRs)
19
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; where, who? = 5

A
    • gene flow between prokaryotic species
        • frequent
    • most prokaryotic genomes contain hundreds of kb of DNA from different prokaryotic species
        • transfers occur between bacteria and archaea
    • DNA originates from the environment, exchange of plasmids and viral vectors
20
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; how? = 7

A
  1. multiple genes in a single transfer
    • mechanisms of transfer
      • transformation
      • conjugation
      • transduction
    • confuses species relationships
      • a laterally transferred gene will have relatively similar sequences in two species, due to little time for sequence divergence
21
Q

‘Prokaryotic genome organisation and content’

Lateral (aka horizontal) gene transfer; EXAMPLES? = 4

A
    • antibiotic resistance
    • ability to tolerate hot environments
    • anaerobic to aerobic GROWTH HABITS
    • METABOLIC PATHWAYS
22
Q

Prokaryotic genome organisation and content cont’d DIAGRAMS

A

SLIDE 18

23
Q

Prokaryotic gene function catalogue (GO Terms):

‘E. coli’ genome = 5
FUNCTION
GENE FAMILIES

A
  1. function of many genes still not known
      • no significant similarity to any known genes in other bacteria
  2. gene families
      • genes having arisen from gene duplication events
        • e.g. rRNA genes in bacteria and archaea
24
Q

Eukaryotic genomes:
Heterochromatin vs Euchromatin = 16

A
  1. LINEAR CHROMOSOMES with NEGATIVE SUPERCOILING in MEMBRANE-BOUND NUCLEUS

‘Heterochromatin’
    • densely staining regions in interphase nucleus
    • chromatin densely packed
    • constitutive
        • permanently condensed chromatin
        • DNA is gene poor - but does contain some genes
        • centromeric, telomeric
            • many repeat regions
        • other chromosome regions
            • most of the human Y chromosome
    • facultative
        • not permanently condensed
            • exists in some cells at some times
        • DNA encodes genes that are inactive at particular times or in particular cells

‘Euchromatin’
    • cannot be seen in interphase nucleus
    • DNA is less condensed and gene rich

25
Q

Eukaryotic genome organisation;

‘Nucleosomes’ = 7

A
    • protein core = histone octamer
        • 2 copies each of histones H2A, H2B, H3, and H4
    • ~147 bp of DNA wrapped around the core
    • nucleosomes can “slide” along the DNA
        • expose chromatin regions for transcription
        • also involves chromatin-remodelling proteins
    • removed and replaced during DNA replication
26
Q

Eukaryotic genome organization cont’d:

‘Higher orders of chromatin structure’ = 4

UNDERSTANDING EUCHROMATIN LOOPS

A
  1. Euchromatin loops ARE
    • dynamic
    • extension / merging – allow access to transcriptional machinery
    • condensation – repress transcription
27
Q

Eukaryotic genome organization cont’d:

‘Higher orders of chromatin structure’ = DIAGRAM

A

SLIDE 22

28
Q

Eukaryotic genome organization: ‘Organisation of chromosomes in interphase nucleus’ = 13

A
  1. Organisation of chromosomes in interphase nucleus
  2. CT – chromosome territory
      • specific for each chromosome
  3. LAD – lamin-associated domains
      • heterochromatin interacting with nuclear lamins at nuclear periphery
  4. TAD – topologically associating domain
      • Compartment A
          • transcriptionally active
          • enhancer-promoter interactions
      • Compartment B
          • transcriptionally repressed (facultative heterochromatin)
      • chromatin switches between compartments, depending on gene expression demands
29
Q

Eukaryotic genome organization: ‘INSULATORS’ = 7

A
  1. define TAD boundaries
    • 1-2 kb long
    • in many (all?) eukaryotes
    • DNAse I insensitive
    • interact with specific DNA-binding proteins
    • establish functional domains, e.g. loop
    • prevent cross-talk of regulatory domains between functional domains
30
Q

Eukaryotic genome organization: ‘INSULATORS’ DIAGRAM

A

SLIDE 24

31
Q

Eukaryotic genome size: 3

A
  1. Great VARIABILITY in eukaryotic genome size
      • 10 Mb – 100,000 Mb
      • EUKARYOTIC GENOME SIZE is NOT PROPORTIONAL TO GENE NUMBER
32
Q

Eukaryotic genome size: correlation? = 4

A
  1. Overall correlation of genome size with morphological complexity of organisms
      • HOWEVER, no precise correlation between genome size and complexity
      • especially evident when looking within eukaryotic groups
      • C-value paradox/enigma (C-value = haploid genome size)
33
Q

Eukaryotic genome size:

‘Factors contributing to the C-value paradox/enigma’ = 3

A

Factors contributing to the C-value paradox/enigma

    • non-protein coding DNA
    • gene density
    • “split genes” – # introns / gene
34
Q

Non-protein coding DNA scales with morphological complexity DIAGRAM

A

SLIDE 27

PROKARYOTES AT THE BOTTOM
EUKARYOTES = MOST

35
Q

Repeat sequences in eukaryotic genomes: ‘INTERSPERSED REPEATS’ = 10

A
  1. GENOMES of most MULTICELLULAR EUKARYOTES have substantial amounts of moderately and HIGHLY REPETITIVE SEQUENCES
  2. Interspersed repeats
    • repeat units distributed (seemingly) randomly around the genome
    • in intergenic regions and introns
        • DNA transposons
        • retrotransposons
            • LTR – long terminal repeat retrotransposons
            • Non-LTR retrotransposons
                • SINE – short interspersed nuclear element
                • LINE – long interspersed nuclear element
36
Q

Repeat sequences in eukaryotic genomes: TANDEM REPEATS = 16

A

Tandem repeats
    • repeat units located next to each other
    • satellite DNA (satellite bands after fractionation and density gradient centrifugation of genomic DNA)
        • repeat unit < 5 bp to > 200 bp
        • clusters 100s of kb in length
        • e.g. centromeric DNA
    • minisatellites
        • not part of satellite bands on gradients
        • repeat unit up to 25 bp
        • clusters up to 20 kb
        • e.g. telomeric DNA
    • microsatellites
        • not part of satellite bands on gradients
        • repeat unit < 13 bp
        • clusters < 150 bp
        • used to establish kinship
        • an individual’s genetic profile

DIAGRAM SLIDE 29
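
To make "repeat units located next to each other" concrete, here is a minimal sketch that scans a DNA string for short units repeated in tandem, using a regular-expression backreference. The function name `find_tandem_repeats` and the `min_copies` threshold are assumptions for illustration; only the unit-length limit (< 13 bp for microsatellites) comes from the card. It scans one strand and also reports plain homopolymer runs.

```python
import re


def find_tandem_repeats(seq, max_unit=12, min_copies=3):
    """Return (start, unit, copies) for each run of a short tandemly repeated unit.

    max_unit=12 mirrors the "< 13 bp" microsatellite unit size on the card;
    min_copies is an arbitrary illustrative threshold.
    """
    # Lazy group finds the shortest unit; \1{n,} requires further adjacent copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


print(find_tandem_repeats("GGACACACACTT"))
```

Kinship and profiling applications compare the copy number at the same locus between individuals, which is why the function reports `copies` rather than just the position.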

37
Q

Eukaryotic genomes contain pseudogenes (2 TYPES)

AND gene relics = 9

A
  1. Conventional pseudogene
    • inactivated due to mutation
  2. Processed pseudogene
    • derived from an mRNA that is converted to cDNA and reinserts into genome
    • no introns or regulatory regions that ancestral gene had
    • inactivated
  3. Gene relics
    • truncated gene – from 5’ or 3’ end
    • gene fragments
38
Q

Eukaryotic genomes contain pseudogenes and gene relics = DIAGRAM

A

SLIDE 30

39
Q

Eukaryotic gene density

A

Genes are more closely packed along the chromosomes of less complex organisms

40
Q

Less complex organisms contain fewer split genes

A

Genome of yeast compared to genomes of more complex eukaryotes

   * few genes with introns – yeast genome has 239 introns; human genome > 300,000
41
Q

G-value paradox

A
  • Gene number does not scale with morphological complexity
  • Alternative splicing leads to multiple mRNAs and proteins from a single gene
    * explains part of the C- and G-value paradoxes
42
Q

G-value paradox = DIAGRAM

A

SLIDE 33

43
Q

Eukaryotic genomes contain gene deserts: 9

WHAT, SIGNIFICANCE? IN HUMAN GENOME?

A
  1. Large regions of chromosomes (10^5-10^6 bp) devoid of known genes or other functional genetic elements
      • human genome
          • 25% consists of gene deserts
          • chromosomes 4, 5 and 13 (30-40% of the chromosomes)
      • significance of gene deserts
          • not known
          • some contain regulatory sequences that act over large distances to control gene expression
          • others show no clear function
          • superfluous regions of genomes??
44
Q

Eukaryotic genomes contain gene families:

SIMPLE VS COMPLEX: SIMPLE…= 9

A
  1. Simple (aka classical) gene families
    • all members have identical or nearly identical sequences
    • arose from gene duplication events
    • rRNA genes
        • humans:
            • 2000 genes for 5S rRNA
                • single cluster on chromosome 1
            • 280 copies of 28S, 5.8S, 18S repeat unit
                • 50-70 repeats clustered on multiple chromosomes
45
Q

Eukaryotic genomes contain gene families:

SIMPLE VS COMPLEX: COMPLEX…= 7

A
  1. Complex gene families
    • members have similar sequences
    • different enough to code for gene products with different properties
    • arose from gene duplication events
    • e.g. mammalian globin genes
        • expressed at different developmental stages
        • biochemical properties correlate to physiological needs during development
46
Q

Eukaryotic genomes contain nested genes? How many categories? = 7

A
  1. Overlapping genes found in the genomes of yeast, protists and metazoans
  2. Two major categories
    • genes nested within intron of another gene (= external host gene)
        • relatively common in eukaryotes
    • non-intronic genes nested opposite coding sequence of external host gene
        • no clear evidence of these in metazoan genomes
        • present in yeast and protistan genomes (and prokaryotic genomes)
47
Q

Eukaryotic genomes contain nested genes diagram

A

slide 36

48
Q

Eukaryotic gene function catalogues (GO Terms) = 8

A
  1. Human genome
    • greatest number of genes in all categories except metabolism
    • many more genes involved in defence and immunity
  2. ‘Caenorhabditis elegans’ (nematode worm) genome
    • high number of genes in cell-cell communication category
        • 1000 genes vs 1250 in humans
        • BUT only 959 cells vs 10^13 cells in humans
49
Q

The Hidden Genome:

Non-coding (nc)RNAs

A

Non-coding (nc)RNAs
* tRNAs, rRNAs, circRNA (circular RNA), eRNA (enhancer RNA), lincRNA (long intergenic non-coding RNA), microRNA (miRNA), NAT (natural antisense transcript), piRNA (PIWI-interacting RNA), scaRNA (small Cajal body-specific RNA), siRNA (small interfering RNA), snRNA (small nuclear RNA), snoRNA (small nucleolar RNA)

50
Q

The Hidden Genome =

RNAs encoding microproteins and peptides = 8

A
  1. smORF = small open reading frame
      • shorter than 100 amino acids
    • dORF = downstream open reading frame
        • located in the 3’-UTR of known protein-coding genes
    • uORF = upstream-encoded smORF
        • located in the 5’-UTR of known protein-coding genes
    • nuORF = novel unannotated open reading frame
    • SEP = small peptide
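
The smORF size cutoff can be illustrated with a simple scan for short open reading frames. This is a sketch under stated assumptions, not an annotation method from the lecture: the name `find_smorfs` is hypothetical, only the forward strand and ATG start codons are considered, the standard stop codons TAA/TAG/TGA are used, and the amino-acid count includes the initiating methionine but not the stop.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, max_aa=100):
    """Return (start, end, n_amino_acids) for forward-strand ORFs shorter than max_aa."""
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until an in-frame stop codon or the sequence end.
            j = i + 3
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop: open-ended, not reported
            n_aa = (j - i) // 3
            if n_aa < max_aa:
                hits.append((i, j + 3, n_aa))
            i = j + 3
    return hits


print(find_smorfs("CCATGAAATTTTAGGG"))
```

Whether a hit counts as a uORF or dORF would depend on its position relative to an annotated coding sequence (5’-UTR or 3’-UTR), which this sketch does not model.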
51
Q

The Hidden Genome DIAGRAM

A

SLIDE 38

52
Q

The Forbidden Genome: 4

A
  1. Short DNA sequences not compatible with life
      • minimal absent words (MAWs)
      • not found in a particular genome (nullomers)
      • not found in any genome (primes)
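
The idea of sequences "not found in a particular genome" can be sketched by listing every k-mer over A/C/G/T and subtracting those that occur in a sequence. The function name `absent_kmers` is hypothetical, and this toy version checks one strand of one sequence only; real nullomer searches cover whole genomes and normally both strands.

```python
from itertools import product


def absent_kmers(seq, k):
    """Return the sorted k-mers over ACGT that never occur in seq (one strand)."""
    seq = seq.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    return sorted(all_kmers - present)


print(absent_kmers("AACCGGTT", 2))
```

Running the same check across many genomes and keeping only k-mers absent from all of them would correspond to the "primes" category on the card.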
53
Q

The Forbidden Genome: USES = 4

A
    • tags to distinguish samples (e.g. control or reference samples vs forensic samples)
    • suicide genes that could be encoded by genetically modified organisms and activated to destroy them if they prove dangerous
    • anticancer peptides (NulloPs)
    • biomarkers for cancers