Genomics and Evolution Flashcards

(446 cards)

1
Q

What are the main 2 types of genome?

A

The nuclear genome and the mitochondrial genome.

2
Q

What is high level and low level in genome organisation?

A

High level is the chromosomes and low level is all the junk DNA.

3
Q

What is chromosome fusion?

A

It is the fusing of two chromosomes into one.

4
Q

What are segmental duplications?

A

They occur when a section of a chromosome is duplicated.

5
Q

What are inversions?

A

They occur when a segment of a chromosome is reversed, flipping the order of the genes within it.

6
Q

What are translocations?

A

They occur when a segment of one chromosome moves to another, non-homologous chromosome.

7
Q

What do pseudoautosomal regions do?

A

They make sure the sex chromosomes pair and are separated correctly.

8
Q

Sex chromosomes are ____ and what happens after they pair?

A

Sex chromosomes are homologous and pair, then recombine in male meiosis.

9
Q

What species has an interesting case of chromosomal fusion?

A

Muntjac deer

10
Q

What happens in male meiosis?

A

Sex chromosomes form chromosomal chains and then split after meiosis. This chain is used through translocation of autosomal regions leading to pairing.

11
Q

What did sex chromosomes originate as?

A

They originated as a pair of autosomes.

12
Q

How did sex chromosomes arise?

A

Recombination stopped in part of a pair of autosomes, in the region where the sex-determining gene arose. From there, the non-recombining regions expanded, creating evolutionary strata on the sex chromosomes.

13
Q

What happens once a region becomes non-recombining?

A

There is an accumulation of deleterious mutations which leads to it becoming degenerate.

14
Q

Sex chromosomes evolved _________?

A

They evolved multiple times independently in different groups.

15
Q

The process of degradation isn’t what, and what happens after degradation?

A

The process isn’t linear and after degradation, it stays at the “base level”.

16
Q

There is a general tendency to lose what genes?

A

Genes that become unnecessary.

17
Q

What are 3 examples of gene loss?

A

The gene loss in the Y chromosome.
The loss of the Vitamin C producing gene multiple times over multiple lineages.
The loss of teeth in birds and turtles.

18
Q

How often are genes gained and lost?

A

All of the time.

19
Q

There is a what between genes being lost and gained?

A

There is a dynamic equilibrium between genes being lost and gained.

20
Q

What are some mechanisms by which new genes arise?

A

Exon shuffling, gene duplication, retroposition, gene fusion, gene fission, and de novo origination.

21
Q

Many proteins contain what, and how are new proteins with new functions made?

A

Many proteins contain “borrowed” domains, and old and new domains combine to create new proteins with new functions.

22
Q

What is an example of where exon shuffling was used?

A

It was used in the origin of the jingwei gene in Drosophila.

23
Q

What happens if a minor splice form does something useful?

A

It will be selected to increase its abundance in the cell, resulting in the evolution of major alternative gene isoforms.

24
Q

What is the introns early theory?

A

The theory that introns are ancient and are gradually lost.

25
What is the introns late theory?
The theory that introns evolved in early eukaryotes and keep spreading.
26
What is a common process behind the evolution of new genes?
Evolution by duplication, ranging from single genes to the whole genome.
27
What is the 2R hypothesis?
It proposes that there were two rounds of whole-genome duplication early in vertebrate evolution.
28
What are whole genome duplications more common in?
Plants.
29
What happens when a gene is duplicated?
Some functional redundancy is created, which reduces purifying selection and allows the copies to accumulate mutations and diverge in function.
30
What is an example of gene duplication and sub-functionalisation?
The evolution of colour vision in primates. The ancestral state is dichromatic, with an S-gene and an L-gene; the L-gene then duplicated, and sub-functionalisation followed as the copies diverged to have different light sensitivities.
31
The size of the genome has little to do with what?
The size of a genome has little to do with the organism's complexity.
32
What is the C-value paradox?
The observation that, in eukaryotes, genome size does not correlate with organismal complexity.
33
In what does the number of genes and genome size show a pretty good correlation?
Viruses and prokaryotes.
34
How can genome sizes for extinct animals be measured?
The genome sizes are estimated from the size of the cells inside the bones. It is well known that a larger genome correlates with a larger nucleus, and so with a larger cell. To estimate cell size, researchers measure the size of the pores in the bones (the cavities that once held bone cells).
35
What do transposable elements play a major role in?
Increasing genome size.
36
What is the only way to downsize a genome and what determines its efficiency?
Deletions are the only way to decrease the genome size, and the efficiency of downsizing depends on the frequency and size of the deletions.
37
What is an example of extreme genome reduction?
Buchnera is a mutualistic intracellular symbiont of aphids. Since the symbiosis began, there has been a massive reduction in genome size, so that only essential genes remain. The genome is now essentially in stasis.
38
What was major in answering many questions regarding human evolution and why was it used?
mtDNA was used, as it mutates more frequently than nuclear DNA and does not recombine.
39
Where is human genetic diversity highest?
In Africa.
40
Why might looking at a single locus to explore the migration of humans be misleading?
It may tell a story of spread of an advantageous mutation and not the story of the migrations of humans, meaning it's important to look at other parts of the genome.
41
Why is the human Y chromosomal tree used for phylogenetic reconstructions?
Because it is paternally inherited and the phylogeny is aligned with mtDNA.
42
Why is using gene trees problematic for autosomal markers?
It is problematic due to the recombination.
43
On what basis are principal component analysis plots created?
They are created on the basis of individual genotypes for autosomal markers.
44
Why could human DNA show a lower global mobility in men?
This could reflect the fact that males inherit the land from their father and stay whereas women are married off to other families.
45
What genes are good for going really far back in human history?
Nuclear genes.
46
What is one way to learn about the ancestors of humans?
To use remnants of DNA in ancient skeletal tissues.
47
Where was there hybridisation between Neanderthals and humans?
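Outside Africa, in Eurasia; Neanderthal ancestry is found in non-African populations generally, not only in Europeans.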
48
What has ancient DNA been used to study?
Ancient DNA has been used to study whether the ancestors of modern humans interbred with Neanderthals and other archaic hominids.
49
How much of our genome is believed to be Neanderthal?
Roughly 1 to 4% in people of non-African ancestry.
50
What is the oldest genome that's been sequenced?
A Denisovan genome, from a finger bone and a tooth, which showed that Denisovans are distinct from both Neanderthals and modern humans.
51
Closely related species were doing what when meeting?
Hybridising.
52
Different species on different islands in early Asia have suggested what?
That Homo erectus had the ability and skill to travel across open ocean.
53
What does the adaptation of human skin pigmentation do?
It has a strong positive correlation with UV intensity, so dark skin has more protection against UV, which reduces the photolysis of folic acid. Light skin leads to more production of Vitamin D.
54
Selection leaves what in DNA polymorphisms?
Distinct footprints.
55
What is a result of the spread and fixation of an adaptive allele?
The loss of genetic variation around the target of selection.
56
What is an example of a footprint of recent adaptive evolution?
The reduced genetic diversity around the gene involved in adaptation to milk digestion in adult humans (lactase persistence). The distribution of lactase persistence correlates with historical centres of dairy farming.
57
What happens when there is adaptation to contrasting conditions?
There is spread and fixation of different locally adaptive alleles in the population, which creates the signal of population differentiation at the genes under selection.
58
What is a good example of adaptation to local conditions?
An example is adaptation to life at high altitude, where interspecies hybridisation was advantageous: Tibetans can breathe more easily at high altitudes due to introgression of Denisovan-like DNA.
59
What is population genetics?
The study of genetic diversity in biological populations and of the processes that cause genetic diversity to change.
60
Genetic diversity is synonymous with what?
Intra-specific diversity.
61
What is the major process that differentiates intra- and inter-specific diversity?
Gene flow.
62
When did population genetics arise and from where?
Population genetics arose in the 1930s/1940s from the Modern synthesis of Mendelian Genetics and Darwinian Natural Selection.
63
Population genetics ultimately underpins what?
It underpins all phenomena in evolutionary biology.
64
What is a phenotype?
It is any observable or quantifiable characteristic of organisms that varies within or among populations.
65
What are genetic markers?
Genome regions that are useful for measuring and investigating genetic variation in populations.
66
What makes a population polymorphic at a specific genetic locus?
If more than one allele is commonly found.
67
The quality and resolution of genetic markers has improved with what?
The development of genetics, from proteins to DNA.
68
What is a genotype?
The allelic make-up of an individual.
69
What is the most common type of genetic marker used today?
DNA sequence variation.
70
What do we look at when looking at DNA sequence variation?
You can count the number of distinct sequences and the proportion of variable sites. You can also measure the average pairwise difference.
71
What are pairwise differences?
The number of differences between each pair of sequences.
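As a rough illustration of the summary statistics on the last two cards, the sketch below counts distinct sequences, the proportion of variable sites, and the average pairwise difference. The four aligned sequences and the function names are made up for this example.

```python
from itertools import combinations

def pairwise_difference(a, b):
    """Number of aligned sites at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def summarise(seqs):
    """Return (distinct sequences, proportion of variable sites, mean pairwise difference)."""
    n_distinct = len(set(seqs))
    # A site is variable if more than one nucleotide is seen in its column.
    n_variable = sum(len(set(column)) > 1 for column in zip(*seqs))
    prop_variable = n_variable / len(seqs[0])
    pairs = list(combinations(seqs, 2))
    mean_diff = sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)
    return n_distinct, prop_variable, mean_diff

# Hypothetical aligned sequences of equal length
seqs = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACGTACGT"]
print(summarise(seqs))
```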
72
What is heterozygosity?
The fraction of individuals in a population that are expected to be heterozygous.
73
What is heterozygosity equivalent to?
It is equivalent to the probability that any two alleles randomly sampled from the population are different.
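The "two randomly sampled alleles are different" definition can be written as one minus the chance of drawing the same allele twice. The sketch below assumes alleles are drawn with replacement from a large population; the function name and example frequencies are illustrative only.

```python
def expected_heterozygosity(allele_freqs):
    """Probability that two randomly drawn alleles differ: 1 - sum(p_i^2)."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)

# Two equally common alleles give the maximum value for a biallelic locus.
print(expected_heterozygosity([0.5, 0.5]))
# A rare second allele gives low heterozygosity.
print(expected_heterozygosity([0.9, 0.1]))
```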
74
What is average heterozygosity?
The proportion of loci observed to be heterozygous in an average individual, and it is obtained by averaging h across many loci.
75
What is the Hardy-Weinberg equilibrium?
It predicts genotype frequencies from allele frequencies, and states that both stay stable across generations in an idealised population.
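For one locus with two alleles at frequencies p and q = 1 - p, the predicted genotype frequencies are p², 2pq and q². The sketch below is a minimal illustration; the function names and the genotype labels AA, Aa and aa are chosen for this example.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for allele A at frequency p."""
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}

def allele_frequency(genotypes):
    """Frequency of allele A recovered from genotype frequencies."""
    return genotypes["AA"] + 0.5 * genotypes["Aa"]

genotypes = hardy_weinberg(0.7)
print(genotypes)
# The allele frequency is unchanged, so the next generation has the same genotype frequencies.
print(allele_frequency(genotypes))
```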
76
What assumptions are made regarding the Hardy-Weinberg principle?
- it's a diploid organism with sexual reproduction
- there are non-overlapping generations
- there's an infinite population size
- there's random mating
- males and females have equal allele frequencies
- it's a closed population
- there's no mutation
- there's no selection
77
The Hardy-Weinberg principle is an example of what model and why?
It is an example of a null model, as it describes the state of a population when nothing interesting is happening.
78
The Hardy-Weinberg theorem extends to what?
It extends to more than 2 alleles and to multiple loci that segregate independently.
79
What is the ultimate driving force of diversity and natural selection?
Mutations.
80
What is linkage disequilibrium?
The non-random association of alleles at different loci; for example, genes on the same chromosome are not inherited independently.
81
How can linkage disequilibrium be decreased?
It is decreased by recombination and independent assortment.
82
What does random mating mean?
It means that individuals mate at random with respect to a particular genotype; it doesn't mean that there's absolutely no choice.
83
What are exceptions to random mating?
Exceptions are inbreeding (mating with relatives more often than expected by chance), and positive assortative mating (mating occurs between individuals with similar phenotypes).
84
What is identity by descent?
It is where two alleles are copies of the same ancestral allele; with inbreeding, offspring are more likely to inherit the same allele from both parents.
85
What is the inbreeding coefficient used for?
To measure the level of recent inbreeding
86
What does inbreeding depression result in?
It results in reduced fitness, and it often arises from homozygosity in recessive deleterious alleles.
87
What is genetic drift?
The idea that chance alone can result in changes in genetic variation over time.
88
What is fixation?
When an allele's frequency reaches 100% in the population.
89
When does genetic drift typically occur and what does it cause?
It usually occurs when populations go through bottlenecks and it causes substantive changes in allele frequencies.
90
Why is a founder effect observed?
It is observed due to genetic drift and inbreeding in a subpopulation.
91
What are examples of the founder effect?
Human diseases, some wild felid populations, and in captive breeding.
92
What can migration result in?
A reduction in overall heterozygosity.
93
What is the Fixation Index?
The fraction of total genetic diversity that is due to differences among populations.
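One common way to express this fraction is FST = (HT - HS) / HT, where HS is the mean expected heterozygosity within subpopulations and HT is the expected heterozygosity of the pooled population. The sketch below assumes a single biallelic locus and equally sized subpopulations; the function name is illustrative.

```python
def fst(sub_freqs):
    """FST for one biallelic locus.

    sub_freqs: frequency of one allele in each equally sized subpopulation.
    """
    # Mean within-subpopulation expected heterozygosity (2pq in each).
    h_s = sum(2 * p * (1 - p) for p in sub_freqs) / len(sub_freqs)
    # Expected heterozygosity of the pooled population.
    p_bar = sum(sub_freqs) / len(sub_freqs)
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # no variation at all, so nothing to partition
    return (h_t - h_s) / h_t

print(fst([0.5, 0.5]))  # identical subpopulations
print(fst([1.0, 0.0]))  # subpopulations fixed for different alleles
print(fst([0.2, 0.8]))  # intermediate differentiation
```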
94
What is normally used instead of census population size in population genetics and why?
Effective population size is used instead of census population size as it takes into account that not all individuals in all generations have an equal propensity to reproduce.
95
What is effective population size?
The size of an idealised population that would experience the same rate of genetic drift as the real population, due partly to the limited proportion of breeding individuals.
96
What can cluster algorithms identify, and using what?
They can identify subgroups within a species using genetic marker data from multiple loci.
97
What does repetition of natural selection lead to?
It leads to the positive selection of beneficial alleles, which eventually results in their fixation.
98
Selection acts on what?
The whole organism.
99
What are examples of selection acting at a single locus?
Positive, negative, and balancing selection.
100
What are examples of selection acting at multiple loci?
Directional, disruptive and stabilising selection.
101
What is relative fitness?
The average number of offspring produced by the individuals with a particular genotype compared to the number produced by individuals with another genotype.
102
What is fitness of a new allele expressed as in population genetics?
It is expressed as a selection coefficient.
103
What does the selection coefficient represent in population genetics?
The increase or decrease in fitness conferred by that allele compared to another.
104
Changes in allele frequency occur more rapidly in what, and why?
They occur more rapidly in haploids than in diploids, because the relationship between genotype and phenotype is simpler in haploids.
105
What shape are the plots of allele frequency against time?
Sigmoidal.
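The sigmoidal shape can be seen by iterating the standard haploid selection recursion p' = p(1 + s) / (1 + sp), where s is the selection coefficient of the favoured allele. This is a sketch under that simple model; the starting frequency and s below are arbitrary example values.

```python
def haploid_trajectory(p0, s, generations):
    """Allele frequency over time for a favoured allele in a haploid population."""
    freqs = [p0]
    p = p0
    for _ in range(generations):
        # Fitness 1 + s for the favoured allele, 1 for the alternative.
        p = p * (1 + s) / (1 + s * p)
        freqs.append(p)
    return freqs

freqs = haploid_trajectory(p0=0.01, s=0.1, generations=200)
# Print every 25th generation: slow start, rapid middle, slow approach to fixation.
for generation in range(0, 201, 25):
    print(generation, round(freqs[generation], 4))
```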
106
What is fitness influenced by in diploids?
Allele interactions.
107
What is the range for the degree of dominance?
0-1.
108
What does the degree of dominance have a large effect on?
The rate at which an allele's frequency changes.
109
A new rare allele that is created is normally what, and when does selection act on it?
A new, rare allele is found mostly in heterozygotes, so selection can only favour it if the allele is dominant.
110
What does dominance allow when an allele is near fixation?
Dominance allows the less-fit allele to hide in heterozygotes, which makes it difficult to remove.
111
What case is balancing selection often typified by, and what is the case?
Balancing selection is often typified by the case of heterozygote advantage, where both alleles stably coexist at frequencies determined by the relative fitnesses of the two homozygotes.
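A standard textbook parameterisation gives the genotypes AA, Aa and aa fitnesses 1 - s, 1 and 1 - t, in which case the stable frequency of allele A is t / (s + t). The sketch below iterates the one-locus selection recursion to show convergence to that value; s, t and the starting frequency are arbitrary example values.

```python
def next_freq(p, s, t):
    """Frequency of allele A after one generation of selection.

    Fitnesses: AA = 1 - s, Aa = 1, aa = 1 - t (heterozygote advantage).
    """
    q = 1 - p
    mean_fitness = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * p * (1 - s) + p * q) / mean_fitness

def equilibrium(s, t):
    """Predicted stable frequency of allele A."""
    return t / (s + t)

p = 0.01
for _ in range(500):
    p = next_freq(p, s=0.1, t=0.3)
print(p, equilibrium(0.1, 0.3))
```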
112
What are two other types of selection that can maintain genetic variation in a population?
Frequency-dependent selection, and fluctuating selection.
113
What is frequency-dependent selection?
It is where an allele's fitness depends on its frequency: fitness is high when the allele is rare and low when the allele is common.
114
What is fluctuating selection?
It is where allele fitness depends on an aspect of the environment that is rapidly and constantly changing.
115
What happens when you go from a single locus to multiple loci?
You get a phenotypic curve.
116
What is the taxonomic domain to do with?
It is to do with describing, naming, identifying, and classifying species.
117
What is phylogenetics to do with?
It is to do with reconstructing patterns of shared ancestry among organisms.
118
Where was phylogeny first depicted?
In On the Origin of Species.
119
Where can phylogeny be seen?
It can be seen in hierarchical tables, the ladder of nature, and representing a process.
120
When are characteristics of organisms homologous?
If they are similar and have descended from a common ancestor.
121
When are characteristics of organisms analogous?
When they are similar but have descended from different ancestors.
122
What information do molecular sequences contain, and what is the problem?
Molecular sequences contain information about the evolutionary processes that produce them, but they are often scrambled, fragmentary, hidden, or lost.
123
How do modern methods recover and interpret challenging molecular sequences?
They use mathematical, statistical and computational methods.
124
What are orthologous molecular sequences?
They are homologous sequences of the same gene in different species, separated by a speciation event.
125
What are homologous molecular sequences?
They are sequences that share a common ancestor.
126
What are paralogous molecular sequences?
They are homologous sequences from different genes in the same genome, separated by a gene duplication event.
127
When did molecular characters appear in science?
They arrived during the molecular biology revolution of the mid-20th century.
128
What are the advantages of using molecular characters over morphological ones?
- they are very common.
- they are objective.
- they are easy to quantify.
- they are available when morphology is uninformative.
- they are cheap and fast to obtain.
- they can be obtained without specialist training.
129
What is the only significant disadvantage of molecular characters?
They are generally unavailable for extinct species.
130
What is a transition mutation?
It is a mutation of a purine-to-purine, or pyrimidine-to-pyrimidine.
131
What is a transversion mutation?
It is a mutation from a purine to a pyrimidine, or vice versa.
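The two mutation classes on these cards can be checked mechanically: A and G are purines, C and T are pyrimidines. The function name below is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(a, b):
    """Classify a change between two DNA bases as a transition or a transversion."""
    if a == b:
        return "none"
    # Both bases in the same chemical class means a transition.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_substitution("A", "G"))
print(classify_substitution("A", "C"))
```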
132
What is another way to refer to a silent mutation?
As a synonymous mutation.
133
What is another way to refer to a replacement mutation?
As a non-synonymous mutation.
134
What principle is phylogenetics based on?
The principle of parsimony.
135
What concept is molecular sequence alignment based on?
The concept of positional homology.
136
When do nucleotides exhibit positional homology?
If they exist at equivalent positions in their respective sequences.
137
Good alignment is essential for what?
It is essential for good phylogenies.
138
How do alignment methods often work?
Most alignment methods start by assigning a different "cost" to each type of sequence difference. Each possible alignment, therefore, has a total cost. Algorithms then identify the alignment with the lowest cost.
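The "lowest total cost" idea can be sketched with a small dynamic-programming routine in the style of global pairwise alignment. The costs used here (0 for a match, 1 for a mismatch, 2 for a gap) are arbitrary example values, not those of any particular alignment program.

```python
def alignment_cost(a, b, mismatch=1, gap=2):
    """Minimum total cost of globally aligning sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    cost = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per character.
    for i in range(1, rows):
        cost[i][0] = i * gap
    for j in range(1, cols):
        cost[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            substitute = cost[i - 1][j - 1] + (0 if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = cost[i - 1][j] + gap
            gap_in_a = cost[i][j - 1] + gap
            cost[i][j] = min(substitute, gap_in_b, gap_in_a)
    return cost[-1][-1]

print(alignment_cost("ACGT", "AGGT"))  # one mismatch
print(alignment_cost("ACGT", "ACT"))   # one gap
```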
139
When are alignment programs more prone to mistakes?
When the sequences are diverse or contain long insertions or deletions.
140
What is the multiple hits problem?
It is a problem seen when looking at how different two sequences are. When divergence is low, the observed number of changes is similar to the true number, but when divergence is high, the observed number underestimates the true genetic distance.
141
What are nucleotide substitution models used for?
They are used to estimate the true genetic distance from the observed changes.
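As one concrete case, the Jukes-Cantor model (the simplest such model, chosen here only for illustration) corrects the observed proportion of differing sites p to an estimated distance d = -(3/4) ln(1 - 4p/3). The correction is small at low divergence and grows as multiple hits accumulate.

```python
import math

def jukes_cantor_distance(p):
    """Estimated substitutions per site from the observed proportion of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)

for observed in (0.01, 0.1, 0.3, 0.5):
    print(observed, round(jukes_cantor_distance(observed), 4))
```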
142
What do nucleotide substitution models mathematically represent?
They represent the stochastic process of sequence evolution through time.
143
What matrix do protein models need?
A 20x20 matrix.
144
What important biological assumptions do nucleotide substitution models make?
- evolution at each site occurs at the same rate.
- nucleotide base frequencies are the same for all sequences.
- evolution at each site is independent.
145
What are statistical models used to do?
They are used to capture the variation in evolutionary rates among sites.
146
What distribution is most commonly used in statistical models?
The gamma-distribution.
147
What do all the lines represent in an unrooted tree?
All lines represent genetic distance.
148
In a rooted tree, what direction does it have and what do the lines represent?
A rooted tree has evolutionary direction, and only horizontal lines represent genetic distance.
149
What does a clustering algorithm do?
It transforms genetic distances into a tree.
150
What do optimality methods define?
They define some kind of score for each possible tree.
151
What are statistical methods?
They are methods that calculate a probability for each possible tree and frame phylogeny estimation as a formal statistical problem.
152
What is maximum parsimony?
The tree which requires the fewest evolutionary changes to explain the observed sequences is the best tree.
153
When is maximum parsimony most useful?
It is most useful when it applies to morphological character data.
154
When is maximum parsimony inapplicable?
When there are fast-evolving sequences.
155
What is maximum likelihood?
The tree which is probabilistically most likely to have given rise to the observed sequences is the best tree. It is slower, and the probabilities are given by nucleotide substitution models.
156
What is Bayesian inference?
Where each tree has a probability given the data, and the whole probability distribution is considered, not just the one most likely.
157
What is a parsimony score?
It is the minimum number of evolutionary changes required to explain the observed characteristics.
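For a single character on a fixed rooted binary tree, this minimum can be computed with Fitch's algorithm. The sketch below represents a tree as nested tuples with character states at the leaves; that representation and the function names are choices made for this example.

```python
def fitch(tree):
    """Return (possible ancestral states, number of changes) for one site."""
    if isinstance(tree, str):
        return {tree}, 0  # a leaf: its observed state, no changes
    left, right = tree
    left_states, left_changes = fitch(left)
    right_states, right_changes = fitch(right)
    shared = left_states & right_states
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_changes + right_changes
    # Children disagree: one change is required on a branch below this node.
    return left_states | right_states, left_changes + right_changes + 1

def parsimony_score(tree):
    return fitch(tree)[1]

print(parsimony_score((("A", "A"), ("G", "G"))))  # grouping that needs one change
print(parsimony_score((("A", "G"), ("A", "G"))))  # grouping that needs two changes
```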
158
What is a tree search used for?
To find the topology with the highest likelihood.
159
What are the best ways to do tree searching?
1. An exhaustive search, which tries every possible tree and is only feasible with small numbers of taxa.
2. Hill climbing, which searches through trees by iterative trial and error; it doesn't check all possible trees and isn't guaranteed to find the optimal one.
160
What is the most common technique to test phylogenetic uncertainty and what does it involve?
The most common technique is bootstrapping, which involves resampling the sites (columns) of the original alignment with replacement to create a large number of pseudoreplicates.
161
What do most phylogenetic methods provide?
They provide a single estimate of a 'true' tree.
162
How is the reliability of bootstrapping measured?
A tree is built from each pseudoreplicate, and the frequency with which a given cluster appears across these trees is a measure of its reliability.
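The pseudoreplicate step can be sketched as drawing alignment columns at random with replacement, which is the usual way bootstrap datasets are generated. The toy alignment and function name are made up, and the tree-building step that would follow for each replicate is deliberately left out.

```python
import random

def bootstrap_alignment(seqs, rng):
    """Build one pseudoreplicate by sampling alignment columns with replacement."""
    n_sites = len(seqs[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in seqs]

rng = random.Random(42)  # fixed seed so the example is repeatable
alignment = ["ACGTAC", "ACGTTC", "ACCTAC"]  # hypothetical aligned sequences
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

Each replicate has the same number of sequences and sites as the original, but some columns appear several times and others not at all.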
163
How was the idea of the molecular clock formed?
Zuckerkandl and Pauling in 1962 compared the number of amino acid differences in haemoglobin between species with the divergence times of those species estimated from fossils. The two were correlated, which led to the idea of the molecular clock.
164
What can molecules estimate that's related to the molecular clock?
They can estimate the date of a common ancestor for which no fossils are known, and the divergence dates when there is no obvious morphological change.
165
What happened as amino acid and gene sequence data accumulated?
It became obvious that there was much sequence variation at the molecular level, and that the amount of molecular diversity varied within genes, among genes, and among species.
166
What are the two competing perspectives on the process of molecular evolution?
The neutralist approach and the selectionist approach.
167
What 2 different ways are molecular clocks used?
To understand why some genes/species/genomic regions evolve at different rates, and to estimate a timescale for phylogenies and evolutionary history.
168
What is the substitution/fixation rate?
It is the rate at which sequences in different populations diverge through time.
169
What is the mutation rate?
It is the rate at which individuals incorporate errors during replication.
170
What does the probability of fixation determine?
The difference between the substitution/fixation rate, and the mutation rate.
171
When are mutations controlled by drift?
When Ns is between -1 and 1.
172
What happens with mutations when N is small?
When N is small, slightly deleterious mutations are controlled by drift and can occasionally become fixed.
173
What happens with mutations when N is large?
When N is large, the slightly deleterious mutations are controlled by negative selection and never get fixed.
174
What happens to substitution rates in small populations, and what may cancel out the effect?
Substitution rates can increase in smaller populations, but organisms in small populations tend to have longer generation times, which may cancel out this effect.
175
What is generation time?
It is the time between germ line replications.
176
What is generation time a particularly important factor for?
It is a particularly important factor for selectively neutral polymorphisms.
177
What might explain why mtDNA genomes tend to evolve faster than nuclear genomes?
Higher concentration of oxygen radicals.
178
What is a good example for non-equal generation times?
The X and Y chromosomes are a good example, as in some species there may be more cell division events in the male germ line than in the female germ line, which leads to faster Y chromosome evolution.
179
Why do smaller-bodied vertebrates tend to have higher substitution rates than larger-bodied ones?
It is thought to be due to a higher basal metabolic rate, which produces more oxygen free radicals through aerobic respiration, and these can generate mutations. However, no clear association has been found because there are too many confounding variables.
180
How do you calculate genetic distance?
Genetic distance = evolutionary rate x (2 x divergence time)
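The factor of 2 is there because both lineages accumulate changes after they split. A small worked sketch, using a made-up rate of 1e-9 substitutions per site per year and a split 5 million years ago:

```python
def genetic_distance(rate, divergence_time):
    """Expected distance between two lineages that split divergence_time ago."""
    return rate * 2 * divergence_time

def divergence_time(distance, rate):
    """Rearranged form: estimate the split time from a distance and a known rate."""
    return distance / (2 * rate)

d = genetic_distance(rate=1e-9, divergence_time=5e6)
print(d)                         # about 0.01 substitutions per site
print(divergence_time(d, 1e-9))  # recovers about 5 million years
```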
181
What is an example of different mutation rates due to different replication?
RNA viruses and retroviruses have mutation rates many times higher than those of eukaryotes as they replicate using different polymerases.
182
How were phylogenetic timescales previously calculated, and what is it known as?
They were calibrated by assuming that all lineages/species evolve at the same rate, and this is known as a strict clock.
183
When can phylogenies be calibrated using the tips of the trees?
When the sequences were sampled at different points in time.
184
What is an example of where co-evolution was used to calibrate a phylogeny?
An example is where the phylogeny of cats was used to date the evolution of feline papillomaviruses.
185
What is phylodynamics?
It is a study of how population processes shape phylogenies, and it includes changes in population size, migration, speciation and extinction.
186
Who coined the term phylodynamics, when and what was it used to describe?
It was first coined by Grenfell et al. in Science in 2004 where it was used to describe how epidemiological, immunological and evolutionary processes can shape viral phylogenies.
187
How does coalescent theory work?
It works backwards in time and traces ancestry given a set of sampled sequences. It typically considers intra-specific processes.
188
How does the birth-death model work?
It works forwards in time: given a population process, it determines what the resultant phylogeny would look like. It considers both inter- and intra-specific processes.
189
Where has coalescent theory gained importance and where is it used?
Coalescent theory has gained importance in population-level sequencing and has become widespread in anthropology, association mapping, conservation biology, epidemiology, global warming, and cancer biology.
190
What is the 'Wright-Fisher' model and what does it assume?
It is an ideal population, and it assumes that individuals have equal propensity to reproduce, that generations are non-overlapping, and that there is a constant population size.
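Those assumptions translate directly into a short simulation: each generation is replaced entirely, the size stays fixed, and every gene copy in the next generation picks its parent copy at random. The sketch below assumes a diploid population (2N gene copies) and one locus with two alleles; the parameter values are arbitrary.

```python
import random

def wright_fisher(p0, n_individuals, generations, rng):
    """Simulate the frequency of one allele under pure genetic drift."""
    n_copies = 2 * n_individuals  # diploid: two gene copies per individual
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each new copy inherits the allele with probability equal to its current frequency.
        count = sum(rng.random() < p for _ in range(n_copies))
        p = count / n_copies
        trajectory.append(p)
    return trajectory

rng = random.Random(1)  # fixed seed so the run is repeatable
trajectory = wright_fisher(p0=0.5, n_individuals=20, generations=50, rng=rng)
print([round(p, 2) for p in trajectory[::10]])
```

Once the frequency reaches 0 or 1 it stays there, which corresponds to loss or fixation of the allele.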
191
What is coalescent theory in reverse?
Coalescence theory is genetic drift in reverse, and vice-versa.
192
What is r in terms of coalescence theory?
It is the probability that two lineages coalesce in the previous generation; moving back in time, it is the rate of coalescence.
193
How is r calculated in terms of coalescence theory?
r = (probability that a pair of sampled lineages share the same parent) x (the number of possible pairs of sampled lineages)
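For k sampled lineages there are k(k - 1)/2 possible pairs. Assuming a diploid population of effective size N (2N gene copies), the chance that a given pair shares a parent copy is 1/(2N), so the formula can be sketched as below. This approximation assumes k is much smaller than N, and the function names are illustrative.

```python
from math import comb

def coalescent_rate(k, n_e):
    """Per-generation probability that some pair among k lineages coalesces."""
    return comb(k, 2) / (2 * n_e)

def expected_waiting_time(k, n_e):
    """Expected number of generations until the next coalescence."""
    return (2 * n_e) / comb(k, 2)

for k in (2, 5, 10):
    print(k, coalescent_rate(k, 1000), expected_waiting_time(k, 1000))
```

More lineages mean more pairs and so faster coalescence, which is why the branches near the tips of a coalescent tree tend to be short and those near the root long.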
194
What does a "serially sampled" coalescent include?
It includes sequences from the past.
195
What happens in coalescent theory when population changes are taken into account?
If the population has been growing, then moving back in time, the population size decreases and the rate of lineage joining increases.
196
What does theta denote in terms of coalescent theory and what is it related to?
Theta denotes sequence diversity. It is related to the number of mutations in the history of the sample.
197
What does theta equal when mutations occur randomly on the branches?
Theta equals the average pairwise genetic distance between sampled sequences.
198
What happens to gene trees where there is a large population size?
There are often long internal/near root branches, which means there are many mid-frequency polymorphisms.
199
What happens to gene trees when there is slow population growth?
There are long terminal branches, which means there are many low frequency polymorphisms.
200
What information do sequences contain?
Sequences contain information about demographic history.
201
What is Tajima's D?
A statistic that measures whether mutations are mostly high/medium/low frequency.
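The statistic compares two estimates of diversity: the average pairwise difference and the number of variable (segregating) sites scaled by sample size. The sketch below follows the standard formulation of Tajima's D and needs at least four sequences; the two toy datasets are made up to show the sign of the result.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for a list of aligned sequences (at least four)."""
    n = len(seqs)
    s = sum(len(set(column)) > 1 for column in zip(*seqs))  # segregating sites
    if s == 0:
        return 0.0
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))

# Only rare (singleton) mutations: D is negative.
rare = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
# Only mid-frequency mutations: D is positive.
common = ["AAAA", "AAAA", "AAAA", "TTTT", "TTTT", "TTTT"]
print(tajimas_d(rare), tajimas_d(common))
```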
202
What methods can be used to study the demographic history through sequences?
Methods used include: Tajima's D, skyline plots, and the sequentially Markovian coalescent model.
203
What are skyline plots?
They are plots that use the mathematical relationship between r(t) and 2N(t) to estimate past population size.
204
What is the sequentially Markovian coalescent model?
It is a complex approach used for human genomes.
205
What lineages can coalesce?
Only lineages in the same deme/subpopulation can coalesce.
206
When does incomplete lineage sorting occur and when is it more likely to occur?
Incomplete lineage sorting occurs when coalescences predate multiple speciation times, and this is more likely to occur when ancestral effective population sizes are large.
207
What are examples of where skyline plots have been used?
When looking at the population sizes of Beringian bison over time, and the origins of HIV.
208
What do Genome Wide Association Studies try to find?
They try to find genotypes associated with human diseases like diabetes.
209
What is coalescent theory used to interpret?
Coalescent theory is used to interpret large-scale human genomics data.
210
What does a phylogenetic tree using the birth-death model show?
A complete population tree displays the full population dynamics and displays the dynamics giving rise to individuals at time T.
211
What has the birth-death model been used to study?
It has been used to study the diversification of mammals after the extinction of dinosaurs, and to study the spread of Ebola in Sierra Leone in 2014.
212
What is one of the most important tasks of evolutionary genetics?
One of the most important tasks is to understand the selective forces acting on individual genes, gene regions, and codons.
213
What process can generate similar trees?
Demographic and selective processes can generate similar trees.
214
How can you detect selection from gene sequences?
You can look for differences in genetic diversity, tree shape, or mutation frequencies among genes or along chromosomes, compare silent and replacement changes within a gene, and look for parallel/convergent evolutionary changes.
215
What is dN/dS?
It is the ratio of the number of replacement fixations to the number of silent fixations, and it is not a differential.
216
What does it mean if dN/dS = 1?
It means that all replacement mutations are neutral.
217
What does it mean if dN/dS = 0?
It means that all replacement mutations are deleterious.
218
What does it mean if dN/dS > 1?
It means that at least some of the replacement fixations are beneficial.
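The readings of dN/dS on the last few cards can be put into a small helper. This is a minimal sketch: the function names and the tolerance are illustrative assumptions, and it takes already-estimated dN and dS values rather than estimating them from sequences.

```python
def dn_ds_ratio(dn, ds):
    """Ratio of replacement (dN) to silent (dS) fixations."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    return dn / ds

def interpret(omega, tol=1e-9):
    """Map a dN/dS value onto the interpretations given on the cards."""
    if omega < 0:
        raise ValueError("dN/dS cannot be negative")
    if omega <= tol:
        return "all replacement mutations are deleterious"
    if abs(omega - 1) <= tol:
        return "all replacement mutations are neutral"
    if omega > 1:
        return "at least some replacement fixations are beneficial"
    # Values between 0 and 1: many replacement mutations are removed.
    return "purifying selection constrains most replacement mutations"

print(interpret(dn_ds_ratio(1.0, 4.0)))
```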
219
Why does dN/dS usually equal much less than 1 when applied to whole genes?
Because only a few codons are positively selected, while most codons are selectively constrained and therefore have dN/dS close to 0.
220
When does the power of the dN/dS ratio greatly increase?
When the ratio is applied to parts of genes or individual codons.
221
When are silent changes not neutral?
When they are in: overlapping genes, alternate reading frames, regulatory sequence elements (they affect the stability of RNA/mRNA/DNA structure), and where codons for the same amino acids differ in fitness.
222
Where is there a high dN/dS found in codons?
In the codons that form the active site of the encoded protein, for example the antigen recognition site.
223
What can the McDonald-Kreitman Test be adapted to study and how?
It can be adapted to study adaptation in measurably evolving populations and rather than using an outgroup to
224
What is the McDonald-Kreitman Test?
It is a simple method to contrast the patterns of within-species polymorphism and between-species divergence at synonymous and nonsynonymous sites in the coding region of a gene.
225
What should you expect to see in the McDonald-Kreitman test if polymorphism and divergence at both types of sites are due to neutral mutations?
You would see that the ratio of replacement to synonymous differences between species should be the same as the ratio of replacement to synonymous polymorphisms within species.
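The neutral expectation above can be checked with a 2x2 table of counts. The sketch below compares the two ratios and also reports the neutrality index and alpha (1 minus the neutrality index), which are common summaries that these cards do not mention. The function name and the counts are hypothetical.

```python
def mk_summary(dn, ds, pn, ps):
    """Summarise a McDonald-Kreitman table.

    dn, ds: replacement and synonymous fixed differences between species.
    pn, ps: replacement and synonymous polymorphisms within species.
    """
    divergence_ratio = dn / ds
    polymorphism_ratio = pn / ps
    # Under neutrality the two ratios are expected to be equal (index of 1).
    neutrality_index = polymorphism_ratio / divergence_ratio
    return {
        "divergence_ratio": divergence_ratio,
        "polymorphism_ratio": polymorphism_ratio,
        "neutrality_index": neutrality_index,
        "alpha": 1 - neutrality_index,
    }

# Hypothetical counts, chosen only to illustrate an excess of
# replacement differences between species.
result = mk_summary(dn=8, ds=16, pn=2, ps=16)
print(result)
```

A formal test would also assess significance, for example with Fisher's exact test on the four counts.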
226
What are viruses?
They are very small infectious agents that replicate inside living cells.
227
What is the key property of viruses?
The high mutation rates.
228
What are the 2 scales at which viral evolution occurs?
The within-host scale and the between-host scale.
229
What are the most studied viruses in terms of molecular evolution?
Human pathogenic viruses.
230
Can you describe the HIV-1 genome?
It is a single genome where new diversity is generated by mutation and recombination, and there is gradual evolution.
231
Can you describe the Influenza genome?
It is comprised of 8 genome segments, each encoding 1 or more genes. New diversity is generated by mutation and reassortment between segments can also occur.
232
When and who created the classification of viruses?
David Baltimore created the classification of viruses in 1971.
233
What is the Baltimore classification of viruses based on?
It is based on the route of information transmission from the genome to mRNA, from which virus proteins are translated.
234
What do organisms with smaller genomes have?
Higher mutation rates.
235
What do organisms with higher mutation rates have?
Higher substitution rates.
236
What means that viruses evolve on an ecological timescale?
The high substitution rates.
237
What are acute infections usually caused by?
Acute infections are usually caused by RNA viruses which have a high mutation rate.
238
How do evolution and selection act on acute infections?
There is limited opportunity for within-host evolution and it is expected that selection for transmission plays a relatively large role.
239
What are latent persistent infections usually caused by?
Latent persistent infections are usually caused by DNA viruses, where there is a short burst of replication followed by long periods of latency.
240
How do evolution and selection act on latent persistent infections?
One expects to see little within-host evolution and to see selection for transmission play a relatively large role.
241
What are chronic persistent infections usually caused by?
Chronic persistent infections are usually caused by RNA or DNA-RT viruses.
242
How do evolution and selection act on chronic persistent infections?
There is ongoing rapid evolution and one expects to see within-host selection playing a relatively large role in determining adaptive evolution at the host-population scale.
243
What is the selection pressure at the within-host scale?
There is selection pressure to maximise within-host fitness.
244
What is the selection pressure at the population scale?
There is selection pressure to maximise between-host fitness, normally seen as transmission.
245
How do you research viruses at the within-host scale?
You take multiple sequences from the same individual at different times.
246
How do you research viruses at the between-host scale?
You take consensus sequences from different individuals at different times.
247
What has been observed with regards to selection in chronic HIV-1 infections?
Data showed that selected mutations typically involve evasion from host immunity and mutations that are selected for in some individuals are selected against in others.
248
Where is adapt and revert commonly seen and what could it explain?
Adapt and revert is commonly seen in viruses and it could explain why we see high mutation rates within the individual.
249
Where can acute infections become chronic?
In immunocompromised individuals.
250
Why isn't there an arms race between the virus and the host immune system in acute infections?
Because there is little opportunity for adaptation, an arms race is unlikely to be seen, and selection will be driven by intrinsic transmissibility and immune escape.
251
How can viral origins be understood?
They can be understood by using phylogenetic data.
252
Why does the phylogeny of SARS-CoV-2 have long branches and what does this lead to?
The long branches lead to 'variants of concern'. The leading hypothesis is that these long branches are a consequence of evolution during chronic infection, and these are characterised by many nonsynonymous mutations in Spike.
253
How are antigenic maps constructed and what are they used for?
They are constructed from immunological assay data and are used to choose vaccine strains.
254
What happens in changes in Influenza antigens?
In Influenza, genetic divergence is continuous, but antigenic change is punctuated, with switches among discrete antigenic types being observed.
255
Where is there common cross-species transmission of Influenza?
It is commonly seen between humans, birds and pigs.
256
Where was HIV-1 establishment found to be and what molecular methods were used?
Reconstructing the phylogeny found that HIV-1 is most diverse in Central Africa, and the phylogeographic and molecular clock methods place the common ancestor in the capital of the DRC in the 1920s. The virus is thought to have spread to humans from chimps in Cameroon but the origins before that were unknown.
257
What are the zoonotic origins of HIV?
Using phylogenies, it shows there was direct transmission from chimps to humans, however, there wasn't just one transmission event but rather the virus jumped between lots of different species and then jumped from chimps to humans.
258
What are the zoonotic origins of Swine flu and what techniques were used to work it out?
Scientists took 8 segments from the genome and each genome segment was telling a different evolutionary story due to the reassortment. The best evidence shows it emerged in Mexico from pigs.
259
What is cluster busting?
It is where networks are generated of similar consensus sequences from different individuals.
260
When did Watson and Crick discover the structure of DNA?
1953
261
Bacteria are one of what?
Bacteria are one of the earliest forms of life.
262
When was Sanger sequencing first done?
1977
263
What was the first free-living organism to be whole-genome sequenced?
The bacterium Haemophilus influenzae in 1995.
264
When was commercialised pyrosequencing first done?
1998
265
When was Illumina sequencing first done?
In 2009.
266
What are the main steps in Illumina sequencing?
Sample prep, cluster generation, sequencing, and data analysis.
267
What did Illumina sequencing lead to?
The cost of genome sequencing plummeting.
268
What was the outcome of first generation sequencing?
Complete, assembled genomes with annotation.
269
What was the outcome of second generation sequencing?
Archival short-sequence data.
270
What are the two approaches for short read analysis?
The first method is mapping, where reads are aligned to a reference genome. The second method is assembly, where genomes are reconstructed from raw read data using de novo assembly.
271
What is the k-mer approach?
It is a reference-free assembly and comparison that is independent of biological information.
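As a sketch of what a reference-free k-mer comparison can look like, two sequences can be broken into their sets of k-mers and compared by how many k-mers they share. The function names, the use of Jaccard similarity, and the toy sequences are illustrative assumptions, not a description of any particular tool.

```python
def kmers(seq, k):
    """Set of all overlapping substrings of length k in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard(seq_a, seq_b, k):
    """Shared k-mers divided by total distinct k-mers across both sequences."""
    a, b = kmers(seq_a, k), kmers(seq_b, k)
    if not a and not b:
        raise ValueError("both sequences are shorter than k")
    return len(a & b) / len(a | b)

# Toy, hypothetical sequences that differ only at the last base.
print(sorted(kmers("ACGT", 2)))
print(jaccard("ACGT", "ACGA", 2))
```

No alignment or gene annotation is needed, which is why the approach is independent of biological information.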
272
What are the two main types of genome assemblers?
The overlap-layout-consensus method, and the De Bruijn method.
273
What are the steps in assembly done with De Bruijn graphs?
1. Start with sequences. 2. Divide them into all possible k-mers (e.g. 4-mers) and look for all possible overlaps between them. 3. SPAdes is the most common assembler.
274
What are paired end sequences?
Two sequences that have a defined, known gap between them.
275
How long are the DNA sequences that short read sequencing technologies produce?
100-300 bp.
276
What is SNP calling?
It compares short reads to a high-quality reference, particularly used in comparing very closely related isolates.
277
What are the advantages of mapping?
- rapid - accurate, even with 'low coverage' samples - comparable - reproducible - easy to visualise, which helps with identifying problems and errors.
278
What are the disadvantages of mapping?
- requires a high-quality reference genome - can only identify variants relative to the reference genome - repeat-rich regions are problematic and can lead to induced error or under-reporting of variants - can't be reliably used to report large genomic events.
279
What is the overlap-layout-consensus method?
All of the overlaps between reads are determined, the reads and overlaps are then laid out on a graph, and consensus sequences are identified. The 'String Graph Assembler' (SGA) does this.
280
What is the De Bruijn graph method?
A graph that is constructed from a set of k-mers, where the vertices represent the k-mers and the edges represent the relationships between them.
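Following the definition on this card (vertices are k-mers, edges are the relationships between them), a small graph can be built from toy reads. This is a minimal sketch under stated assumptions: the function names and reads are hypothetical, an edge is drawn when the last k-1 bases of one k-mer equal the first k-1 bases of another, and sequencing errors and reverse complements are ignored.

```python
def kmers_from_reads(reads, k):
    """Collect every distinct k-mer found in any read."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}

def build_graph(kmer_set):
    """Link each k-mer to the k-mers that overlap it by k-1 bases."""
    graph = {kmer: set() for kmer in kmer_set}
    for kmer in kmer_set:
        for base in "ACGT":
            successor = kmer[1:] + base
            if successor in kmer_set:
                graph[kmer].add(successor)
    return graph

# Two hypothetical overlapping reads, split into 4-mers.
reads = ["ACGTA", "CGTAG"]
graph = build_graph(kmers_from_reads(reads, 4))
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Walking the path through this graph spells out ACGTAG, the sequence that both reads came from.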
281
What are the advantages of assembly?
- reference free so novel sequences can be constructed and identified. - can be used to identify large genomic sequence variants.
282
What are the disadvantages of assembly?
- struggles to solve repetitive or very similar regions. - computationally expensive - time consuming - no clear 'ground truth' as the output can be variable based upon input parameters.
283
What are the limitations of Illumina sequencing?
- short reads do not contain enough information to resolve low complexity regions that are larger than the length of the short read, leading to gaps in the assembly. - the assembled genome is fragmented into multiple contiguous sequences. - some regions will not be assembled.
284
How do long reads solve assembly problems?
By spanning the entire length of low complexity regions, or resolving intermittent identical repeats.
285
What does long read sequencing include?
- Pacific Biosciences (not used lots) - Oxford Nanopore (portable and used in Ebola outbreak and COVID).
286
How is the problem of long read sequencing methods being error prone overcome?
They are combined with second generation sequencing reads for an accurate hybrid assembly.
287
What are hybrid assemblies?
They are assemblies that combine the base calling accuracy of short read sequencing with the scaffolding power of long reads to solve genomic features that are unresolvable by short reads alone.
288
What sort of things are included in bacterial genome annotation?
- Location, e.g. which sequence, where on the sequence, and which strand it's on - Feature type, e.g. protein coding, or tRNA, or repeat region - Attributes, e.g. products, enzyme code, cellular location.
289
What is Prokka?
A tool for gene-by-gene annotation of prokaryotic genomes.
290
What is EggNOG?
A database of orthology relationships, functional annotation, and gene evolutionary histories.
291
What do the size and features of bacterial genomes depend on?
Their biology, so where they are and what they do, e.g. if they are free-living or obligate or facultative.
292
What is the core genome of bacteria?
It is the genes that are the same in all bacterial individuals of a species.
293
How much of the bacterial genome is different between individuals?
They tend to have about 1/4 of their genomes different to each other.
294
What is the accessory genome?
All the different genes, so the variable genome content.
295
What is the pangenome?
The core and accessory genome added together.
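The core, accessory and pangenome definitions on the last few cards map directly onto set operations. The sketch below assumes each isolate has already been reduced to a set of gene names. The isolate labels and gene lists are hypothetical and chosen only for illustration.

```python
def pangenome(genomes):
    """Split gene content into core, accessory and pangenome sets.

    genomes: dict mapping isolate name -> set of gene names.
    """
    gene_sets = list(genomes.values())
    core = set.intersection(*gene_sets)   # genes present in every isolate
    pan = set.union(*gene_sets)           # genes present in any isolate
    accessory = pan - core                # variable gene content
    return core, accessory, pan

# Hypothetical isolates with illustrative gene content.
isolates = {
    "isolate_1": {"dnaA", "gyrB", "recA", "mecA"},
    "isolate_2": {"dnaA", "gyrB", "recA", "blaZ"},
    "isolate_3": {"dnaA", "gyrB", "recA"},
}
core, accessory, pan = pangenome(isolates)
print(sorted(core), sorted(accessory), len(pan))
```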
296
What are large genomes of soil-inhabiting bacteria rich in and why?
They tend to be G and C rich. It is unknown why, but it is potentially related to temperature, as high GC content increases DNA stability at high temperatures.
297
What is large-scale genomic rearrangement and where is it seen?
It is where the genomes and order of genes are all shuffled, and this is seen in prokaryotes.
298
Why is there large variation in the genotypes and phenotypes of bacteria?
There is large variation due to bacteria having been around for a very long time.
299
What is the typical way to analyse population genetic structure?
It is to construct a phylogenetic tree from DNA sequences of bacterial strains with different phenotypes.
300
What is neutral diversification?
A model that emphasises that most of the genetic variation can be explained by genetic drift.
301
What are ecotypes?
They highlight selection for adapted lineages in a given environment.
302
Where are adaptive explanations for variation seen?
Where there is a genetic mutation which affects the fitness/survival of an individual.
303
Bacterial evolution is dominated by the relative rates of which processes?
- DNA replication errors - horizontal gene transfer
304
What are DNA replication errors?
Where there is generation of point mutations, rearrangements, or deletions of various sizes.
305
What is horizontal gene transfer?
Genetic material that is acquired from an external source and incorporated into the chromosome by recombination.
306
What is the only thing that can properly lead to innovation?
Mutation.
307
When may greatly elevated mutation rates occur?
Under strong selective pressure.
308
Why are different levels of clonal signals observed in different bacterial populations?
It is thought that it is a consequence of differing relative rates of recombination to mutation, although other forces may play a role.
309
What is a genetic bottleneck not the same as?
It is not the same as a selective sweep.
310
What is one method for quantifying selective pressure from sequence data?
One method is to compare the frequency of substitutions at non-synonymous and synonymous sites (dN/dS).
311
What does it mean if dN/dS is less than 1?
It is associated with negative or purifying selection, which suppresses protein changes.
312
What does it mean if dN/dS is more than 1?
It is associated with positive selection, promoting protein sequence changes.
313
What is positive diversifying selection associated with?
Host immune evasion or antimicrobial resistance.
314
Where may purifying selection be weaker?
Within host populations, where isolation from the ancestral population results in greater genetic drift and less time to purge deleterious mutations.
315
What are the limits to the utility of dN/dS estimates?
- Selection operates on features other than protein-coding sequences, which don't necessarily affect dN/dS. - dN/dS ratios do not detect complex traits such as interactions between genes. - Frameshifts and incorrect interpretation of start codons can lead to non-synonymous single nucleotide polymorphisms being interpreted as synonymous. - The estimates aren't accurate if polymorphisms are not fixed between independent lineages, and segregating variation in the population is likely weakly deleterious and destined to be purged in the future.
316
What type of organisms are pathogens principally?
Most pathogens are principally commensal organisms.
317
How can scientists identify genomic changes resulting in pathogen emergence?
One can compare the genomes of pathogens with other genomes of the ancestors and related non-pathogens.
318
What is the strongest evidence of adaptation?
Convergent evolution, also known as homoplasy.
319
What does it mean to say that bacterial genomes are interactive?
It is where the effect of one allele depends on another, which is also known as epistasis.
320
What happens as genes get closer together?
There is a higher linkage disequilibrium.
321
What does recombination promote and harm?
Recombination promotes adaptation by introducing novel functionality; on the other hand, it risks creating disharmonious gene combinations that are likely to be selected against.
322
How can induced genes be accommodated?
- potential variation, which can set the stage for subsequent genetic changes that can result in beneficial adaptations. - compensatory change, which adjusts the recipient genome to minimise potential disruptions, facilitating transition between fitness peaks.
323
While the number of genes varies greatly among species, it is not sufficient to account for what?
The differences in genome size.
324
What does the vast majority of genomic DNA code for in prokaryotes?
The vast majority of genomic DNA codes for protein in prokaryotes.
325
What are the ideas around why we carry so much non-coding DNA?
- non-coding DNA performs essential functions. - Non-coding DNA is useless "junk", carried passively by the chromosome simply because it is linked to functional genes. - Non-coding DNA has a structural or nucleoskeletal function. - Non-coding DNA is a functionless "parasite" that is in a selective battle with the host.
326
What is the best evidence that genome sizes are correlated with a variety of phenotypic traits?
- size of cell nucleus - duration of mitosis and meiosis - metabolic rate in birds and mammals - minimum generation time - seed size - response of annual plants to CO2 - embryonic development time in Salamanders - morphological complexity in the brains of frogs and salamanders.
327
What is the "skeletal DNA" hypothesis?
The hypothesis claims that cell size is adaptively important so that more genomic DNA is required to make bigger cells. So, DNA mass directly determines nuclear volume and there must be a constant ratio of nucleus to cell volume to maintain a balance between RNA synthesis and protein in the cytoplasm.
328
What is the evidence for the "skeletal DNA" hypothesis?
The evidence for the theory is in cryptomonad algae, where DNA in the nucleus performs a skeletal function.
329
What is one limitation in the "skeletal DNA" hypothesis?
While it is seen in unicellular eukaryotes, scaling it up to multicellular eukaryotes is challenging.
330
What effect did a study in 2003 determine that effective population size has on natural selection of non-coding DNA?
It suggested that effective population sizes are too small to allow for natural selection to effectively remove non-coding DNA from eukaryotic genomes.
331
Why do bacteria have very little non-coding DNA?
Probably because they have a single origin of replication and need to replicate quickly.
332
What is tandemly repeated DNA?
It is non-coding repetitive DNA consisting of short sequence motifs repeated 100s to 1000s of times in tandem.
333
What are the 3 major classes of tandemly repeated DNA?
- Satellite DNA (2-40Kb) - Minisatellites (11-60bp) - Microsatellites (2-5bp)
334
Why are minisatellites and microsatellites a powerful set of molecular markers for population genetics and disease studies?
They have very high mutation rates, meaning that their loci are extremely variable.
335
What are transposable elements?
"Selfish" DNA sequences which are able to increase their copy number by jumping around the genome and making additional copies of themselves as they do so.
336
What are transposable elements known as in bacteria?
They are called insertion sequences.
337
What are the 3 groups of transposable elements?
- Class I elements (retroelements) - Class II elements (DNA elements) - Miniature Inverted-Repeat Transposable Elements (MITEs).
338
How do retroelements transpose?
They transpose via an RNA intermediate using the enzyme reverse transcriptase.
339
What are the 2 major groups of retroelements?
- LTR retrotransposons - non-LTR retrotransposons
340
What is one major group of the non-LTR retrotransposons?
Long Interspersed Nuclear Elements (LINEs), which are very common in eukaryotes.
341
What does the insertion of retroelements into genes cause?
It can cause deleterious mutations.
342
What are SINEs?
Short Interspersed Nuclear Elements, which, unlike LINEs, do not encode their own reverse transcriptase. They are very common in eukaryotic genomes.
343
What is the possible high rate that transposable elements can accumulate?
Copy number could increase by 20-100 copies in a single generation.
344
What is an example of Class II transposable elements?
Some Drosophila species have P elements which are Class II elements. Wild flies carry them while lab flies don't. The insertion of P elements can lead to hybrid dysgenesis.
345
What is hybrid dysgenesis?
It is increased infertility due to chromosome breakage.
346
What are the stages in the endogenous lifestyle?
- Retroviral infection of the germline - fixation - amplification - inactivation through mutations - loss through recombinational deletion - decay into junk - co-option.
347
What are the 3 groups of endogenous retroviruses?
Class I, II, and III.
348
What can be the consequence of endogenous retroviral activity?
Endogenous retroviruses cause diseases in a range of mammals, but there is no definitive link with disease that has been seen in humans.
349
What is an example of co-option of endogenous retroviruses?
Evidence has shown that a captive protein from an ancient endogenous retroviral insertion is involved in placental morphogenesis.
350
What does the ectopic exchange hypothesis predict?
It predicts that transposable elements will be preferentially found in regions with low recombination.
351
How can endogenous retroviruses and other transposable elements lead to chromosomal rearrangement?
It can happen through homologous recombination between distant loci.
352
What is the major force limiting transposable element copy numbers in genomes?
Selection against transposable elements that cause ectopic exchange.
353
What is the persistence of transposable elements likely to depend upon?
A complex interplay of factors specific to transposable element biology and the biology of the host.
354
How much of the human genome codes for genes?
1.5% codes for genes.
355
What is a major goal of comparative genomics?
To identify gene coding regions and determine their biological function.
356
What could be indicative of rapid adaptive evolution?
Regions of "dark matter" that show accelerated evolution in one species but not others.
357
What is the human genome near identical to in terms of gene coding sequences?
It is near identical to the chimpanzee.
358
What is the ENCODE project?
It is a project which aims to delineate all functional elements encoded in the human genome.
359
What are functional elements?
They are discrete genome segments that encode a defined product or display a reproducible biochemical signature.
360
How much of the human genome is under purifying selection?
3-8%.
361
How much of the human genome is functional in at least one cell type?
80.4%.
362
How much of the human genome is transcribed?
74.4%.
363
How much of the human genome is associated with modified histones?
56.1%.
364
How much of the human genome is found in open chromatin?
15.2%.
365
How much of the human genome binds transcription factors?
8.5%.
366
How much of the human genome consists of methylated CpGs?
4.6%.
367
In which plant groups are large genomes found?
- pterophytes - gymnosperms - angiosperms (mainly the monocots).
368
How does whole genome duplication arise?
It arises from polyploidization events followed by chromosome reshaping.
369
Where is whole genome duplication best known, and how long for?
It is best known in angiosperms (flowering plants). Duplication events have been dated back up to 400 million years in seed plants, then in ancestral angiosperms, and then in specific clades.
370
What does whole genome duplication underpin?
Innovations and adaptation in angiosperms.
371
What is whole genome duplication not sufficient to account for?
Very large genome sizes.
372
What is the outcome of whole genome duplication?
It doubles the genome size and gene number.
373
What is most stable to return to when whole genome duplication has occurred?
A return to a diploid state is most stable and has profound effects on the evolution of genome architecture.
374
What are a major class of transposable elements?
Retrotransposons.
375
What are the 2 super-families of plant LTR-retrotransposons?
Ty1/copia and Ty3/gypsy.
376
When are LTR-retrotransposons activated in plants?
While most LTR-retrotransposons are degenerate and inactive, stress tends to activate the movement of intact copies.
377
How much of monocot genomes are LTR-retrotransposons?
30-70%.
378
Why do LTR-retrotransposons make genome sequence analysis very challenging?
Because they tend to be highly nested in the genome.
379
What is the relationship between repeated sequences and genome size in plants?
The repeated sequences tend to drive up genome size, but the relationship is dynamic and changes in larger plant genomes.
380
How much of conifer genomes are LTR-retrotransposons?
60-85%.
381
How can retrotransposons be used as a molecular clock for plants and why?
Sequence divergence between the terminal repeats of a single retrotransposon can be used as a molecular clock. LTRs are initially identical and then their sequences decay due to random mutation.
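The clock described above can be sketched as a short calculation. Because both LTRs accumulate mutations independently after insertion, the divergence K between them is divided by twice the substitution rate r, giving T = K / (2r). The function name, the toy sequences, and the rate used here are assumptions for illustration only. A real analysis would use a lineage-specific rate and usually a corrected distance rather than the raw proportion of differences.

```python
def ltr_insertion_age(ltr_5prime, ltr_3prime, rate_per_site_per_year):
    """Estimate time since insertion from the divergence of the two LTRs."""
    if len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("LTRs must be aligned to the same length")
    differences = sum(a != b for a, b in zip(ltr_5prime, ltr_3prime))
    # K: uncorrected proportion of sites that differ between the two LTRs.
    divergence = differences / len(ltr_5prime)
    # Both copies mutate independently, hence the factor of 2.
    return divergence / (2 * rate_per_site_per_year)

# Hypothetical 100 bp LTR pair differing at two sites.
ltr_a = "ACGT" * 25
ltr_b = "TCGT" + "ACGT" * 23 + "ACGA"
# Placeholder rate, not a measured value for any species.
assumed_rate = 1e-8
print(ltr_insertion_age(ltr_a, ltr_b, assumed_rate))
```

With these made-up inputs the estimate comes out at roughly one million years.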
382
What are conifers efficient at doing to LTR-retrotransposon copies?
They are efficient in key repair mechanisms to remove LTR-retrotransposon copies.
383
What is autopolyploidy?
It is where multiple chromosome sets are derived from a single taxon. It arises from chromosome non-disjunction during meiosis or from spontaneous somatic genome doubling.
384
What is allopolyploidy?
It is where multiple chromosome sets are derived from two or more diverged taxa.
385
What are polyploids abundant in?
They are abundant in crop plants.
386
What are examples of crop plant polyploids?
- Triploids include bananas, citrus, and some apples. - Tetraploids include wheat, cotton, potato, canola and rapeseed. - Hexaploids include chrysanthemum, bread wheat, oat, and kiwi. - Octaploids include strawberry and sugar cane.
387
What do changes in gene function result in after whole genome duplication?
They result in sub- and neo-functionalisation, which facilitates evolutionary change including adaptation.
388
What happens to duplicated genes after whole genome duplication?
Duplicated genes are initially redundant and most often, one copy is lost.
389
What can gene duplication be amplified by?
Natural selection.
390
What is the certainty of large reference genomes highly influenced by?
The sequencing method and assembly method used.
391
How did mitochondria and chloroplasts evolve?
They both evolved through endosymbiosis from prokaryotic organisms, which were alpha-proteobacteria, and cyanobacteria.
392
What is non-Mendelian inheritance?
It is where a character/gene is inherited from one parent only. This often takes the form of maternal inheritance as the egg contributes the bulk of the cytoplasm to the zygote.
393
What did Carl Correns and Erwin Baur independently study and find?
They independently studied inheritance of leaf colour in variegated plants and found that inheritance of the trait could not be explained according to Mendel's laws of heredity.
394
What was important in the identification of organellar DNA?
- Genetic analysis - Biochemical analysis - Imaging.
395
What was the study on cytoplasmic inheritance in yeast?
The group of Boris Ephrussi studied yeast petite mutants, which were unable to grow on sugar-poor medium due to defective oxidative phosphorylation and so formed small colonies. Sometimes this character was not inherited in Mendelian fashion, and it was later correlated with defective mitochondria.
396
How do you visualise mitochondrial DNA?
You stain DNA with ethidium bromide, and mitochondria with DiOC6. Where yellow is seen, it is mitochondrial DNA.
397
What do endosymbiotic organelles contain?
They contain double-stranded DNA molecules called mtDNA and cpDNA/ptDNA, meaning they are semi-autonomous.
398
What are mtDNA/cpDNAs?
They are small DNAs that encode only a few proteins and can be represented by circular DNA maps.
399
How do mtDNAs show reductive evolution?
They are much smaller than the ancestral genome, which is due to the genes needed for free-living being lost and many others being transferred to the nuclear genome.
400
What is the correlation between organelle and nuclear genome sizes?
There is no correlation.
401
What are the differences between organelle and nuclear genomes?
- Organelle genomes lack features typical of nuclear chromosomes and exist as nucleoids. - DNA replication is not tightly coupled to cell division. - Organelle genome transcription and translation machineries are prokaryotic in character. - Some genes are transcribed together to form polycistronic RNAs. - Introns exist but are of a different type. - The genetic code can deviate from the standard. - Organelle transcripts can be subject to RNA editing.
402
What is mtDNA transcribed using?
mtDNA is transcribed using machinery that is related to T7 bacteriophage RNA polymerase.
403
What is mtDNA-directed RNA polymerase?
A single-subunit RNA polymerase and it requires the assistance of 2 transcription factors, mitochondrial transcription factor A, and mitochondrial transcription factor B.
404
How is mtDNA expressed in mammals?
- Transcription is initiated in the non-coding region. - Transcription proceeds in both directions, from 2 promoters: light-strand promoter and heavy-strand promoter. - Two transcripts spanning almost the entire genome are formed. - These polycistronic primary transcripts are processed to yield mRNAs, tRNAs, and rRNAs.
405
How are mitoribosomes different to bacterial ribosomes?
- They are 55S instead of 70S. - They have evolved unique features reflecting the special requirements of highly hydrophobic OxPhos proteins in the organelle.
406
What happens when mammalian mtDNA is packaged into nucleoids?
- TFAM molecules bind to mtDNA in short patches. - TFAM bends the mtDNA, and bridges neighbouring mtDNA stretches by cross-strand binding. This compacts the mtDNA to form the nucleoid. - The mtDNA in the nucleoid is inaccessible to the transcription and replication machineries.
407
Why does the genetic code deviate from the standard in organellar genomes?
The reasons are unclear but likely reflect the unique evolutionary and operational circumstances of the organelles.
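As an illustration of what such a deviation looks like (an example not taken from the card itself): the vertebrate mitochondrial code reads UGA as tryptophan instead of stop, AUA as methionine instead of isoleucine, and AGA/AGG as stop instead of arginine. A minimal Python sketch, assuming a made-up RNA sequence and partial codon tables that list only the codons used here:

```python
# Partial codon tables: only the codons needed for this example are listed.
STANDARD = {"AUG": "M", "AUA": "I", "UGA": "*", "AGA": "R", "UGG": "W"}
# Vertebrate mitochondrial reassignments layered over the standard table.
VERT_MITO = {**STANDARD, "AUA": "M", "UGA": "W", "AGA": "*"}

def translate(rna, table):
    """Translate codon by codon, stopping at the first stop codon ('*')."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

rna = "AUGAUAUGAUGGAGA"  # hypothetical sequence: AUG AUA UGA UGG AGA
standard_protein = translate(rna, STANDARD)   # stops at UGA
mito_protein = translate(rna, VERT_MITO)      # reads through UGA, stops at AGA
```

The same transcript gives a two-residue peptide under the standard code and a four-residue peptide under the mitochondrial code, which is why organelle genes usually cannot be expressed unchanged from the nucleus.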
408
What are mtDNAs and cpDNAs rich in?
They tend to be AT-rich.
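AT-richness is simply the fraction of A and T (or U) bases in a sequence. A minimal Python sketch, assuming a hypothetical helper name and a made-up sequence:

```python
def at_content(seq):
    """Return the fraction of bases that are A, T or U (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    at = sum(seq.count(base) for base in "ATU")
    return at / len(seq)

example = "ATTATAGCAT"  # made-up 10-base sequence with 8 A/T bases
fraction = at_content(example)
```

A value well above 0.5 indicates an AT-rich sequence.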
409
Where is the role of TFAM in mtDNA conserved and lost?
It is conserved in animals and fungi, but the protein is absent in plants.
410
Organelle DNA-encoded RNAs often undergo what, and what does it alter?
They often undergo C-to-U editing, which frequently alters the coding sequence of a transcript to produce a translatable mRNA.
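A well-known example of how editing makes a transcript translatable is an ACG codon edited to AUG, which creates a start codon. A minimal Python sketch, assuming a made-up transcript and hypothetical editing sites (0-based positions):

```python
def edit_c_to_u(rna, sites):
    """Apply C-to-U editing at the given 0-based positions; refuse to edit a non-C base."""
    bases = list(rna)
    for i in sites:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]}, not C")
        bases[i] = "U"
    return "".join(bases)

unedited = "ACGGCUCCA"                       # codons: ACG GCU CCA (Thr Ala Pro)
edited = edit_c_to_u(unedited, sites=[1, 7])  # codons: AUG GCU CUA (Met Ala Leu)
```

The first edit creates the AUG start codon and the second changes a proline codon (CCA) to a leucine codon (CUA), so both the translatability and the encoded protein change.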
411
What has the largest mtDNAs?
Plants have the largest mtDNAs, and the number of genes varies little between species.
412
What are some sources of "extra" DNA in plant mtDNAs?
- some derived from chloroplast, nuclear or viral DNA. - some has been acquired by horizontal transfer from other plants.
413
What is the origin of most non-coding mtDNA?
Most is of unknown origin.
414
What is the ratio of synonymous substitution rates in mtDNA, cpDNA, and nuclear genes in angiosperms?
1:3:6.
415
What are the mutation rates in mtDNA in animals?
They are 1-2 orders of magnitude higher than in plant mtDNAs, and higher than animal nuclear genes.
416
What is MSH1 responsible for?
It is responsible for the unusually low mutation rates in plant organelle genomes. It is dual-targeted to chloroplasts and mitochondria, and it mediates efficient recognition and correction of DNA sequence errors.
417
What is the structure of plant mtDNAs?
Electrophoresis and microscopy studies suggest that genome-size mtDNA circles are rare or absent. Many repeated sequences are present, which enables homologous recombination and leads to highly variable structural organisation.
418
What is the organisation of most cpDNAs?
A long single-copy region (LSC), a short single-copy region (SSC), and two inverted repeats (IR).
419
In which organisms has loss of the mitochondrial genome been documented?
Anaerobic microbes, resulting in hydrogenosomes and mitosomes.
420
Where has loss of the plastid genome but retention of the plastid compartment been documented?
In a very small number of plants and algae.
421
How is human mitochondrial disease inherited?
It is inherited maternally.
422
What are the 2 different approaches to mitochondrial replacement therapy and what is the difference?
- Maternal spindle transfer - Pronuclear transfer. The difference is whether the transfer is carried out before fertilisation (maternal spindle transfer) or after it (pronuclear transfer).
423
What is the size of cpDNAs?
They are larger than mammalian mtDNAs, but smaller than plant mtDNAs.
424
What is Sanger sequencing?
It was developed by Fred Sanger in 1977 and used radioactive labelling with four independent reactions, one for each ddNTP base analogue.
425
What does current Sanger sequencing use as a marker?
It uses fluorescent tags instead of radioactive labelling.
426
What is the read length for Sanger sequencing?
Up to 1000 bp per read.
427
How was Illumina sequencing developed?
In the mid 90s, Shankar Balasubramanian and David Klenerman realised that their work imaging the action of single polymerase molecules could be the basis for a new sequencing reaction, by imaging the energy of the fluorescence emitted by the chemistry of the extension reaction.
428
What are the advantages of next generation sequencing?
- In vitro library preparation - In vitro clonal amplification - Highly parallel, limited only by the size of sequencing features and imaging limitations - Low reagent volume ratios per sequencing feature.
429
What are the impacts of genome sequencing?
- Medical/personal/human population genomics - Metagenomics - Environmental genomics - Evolutionary/population genomics - Understanding gene regulation mechanisms and the genome at new depths.
430
What is the purpose of genome re-sequencing?
To detect variation and inform on mechanisms underpinning phenotype.
431
What are "mate pairs"?
Circularised fragments of >1 kb pieces, or "conformation capture", bring more distant parts of the genome together, with the ends appearing in the same sequenced fragment.
432
What indicates structural change?
When distance rules are broken among all reads.
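One way to read "distance rules" is the expected separation between the two mapped ends of a pair: pairs that map much closer together or much further apart than the library design predicts hint at a deletion, insertion or rearrangement. A minimal Python sketch, assuming a hypothetical tuple layout (name, position of end 1, position of end 2) and made-up thresholds:

```python
def discordant_pairs(pairs, min_dist, max_dist):
    """Return (name, distance) for pairs whose mapped separation is outside [min_dist, max_dist]."""
    flagged = []
    for name, pos1, pos2 in pairs:
        dist = abs(pos2 - pos1)
        if not (min_dist <= dist <= max_dist):
            flagged.append((name, dist))
    return flagged

# Made-up mapped positions for four read pairs from a library expected to span 2-4 kb.
pairs = [
    ("p1", 1000, 3900),    # 2900 bp apart: as expected
    ("p2", 5000, 8100),    # 3100 bp apart: as expected
    ("p3", 10000, 45000),  # 35000 bp apart: too far
    ("p4", 20000, 20200),  # 200 bp apart: too close
]
flagged = discordant_pairs(pairs, min_dist=2000, max_dist=4000)
```

Real analyses also check read orientation and whether both ends map to the same chromosome; this sketch covers distance only.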
433
What are the disadvantages of de-novo genome assembly?
- highly complicated. - technical problems: biases in library preparation, biases in sequencing profile after amplification, sequencing error rates. - assembly/information problems: polymorphisms and repetitive regions.
434
What is the result of whole genome sequencing of cells in early S-phase?
It results in over-representation of sequence at origins of replication.
435
What is seen when mapping the genome is done after high coverage sequencing?
One sees peaks of reads at replication origins.
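Such peaks can be found by counting mapped reads in fixed-size bins and flagging bins well above the typical depth. A minimal Python sketch, assuming made-up read positions, a hypothetical function name, and an arbitrary threshold of twice the median bin count:

```python
from collections import Counter
from statistics import median

def coverage_peaks(read_starts, genome_length, bin_size, fold=2.0):
    """Return indices of bins whose read count is at least `fold` times the median bin count."""
    n_bins = (genome_length + bin_size - 1) // bin_size
    counts = Counter(pos // bin_size for pos in read_starts)
    depth = [counts.get(i, 0) for i in range(n_bins)]
    baseline = median(depth)
    if baseline == 0:
        return []  # too little data to define a baseline
    return [i for i, d in enumerate(depth) if d >= fold * baseline]

# Made-up read start positions on a 500 bp toy genome; reads pile up around 250-280.
reads = [50, 150, 250, 260, 270, 280, 350, 450]
peaks = coverage_peaks(reads, genome_length=500, bin_size=100)
```

Here bin 2 (positions 200-299) holds four reads against a median of one, so it is reported as the candidate origin.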
436
What is Tn5 transposase used for?
It is used to insert sequencing-compatible sequences into the genome, after which it dissociates and leaves the inserted sequences behind. It often inserts into open chromatin.
437
How can cytosine be methylated and why is it essential?
It is methylated at its 5th carbon, and this is essential to development, as loss of any of the mammalian cytosine methyltransferases is lethal.
438
What roles of cytosine methylation have been established through pre-NGS studies of individual loci?
- Imprinting - Retrotransposon silencing - X chromosome inactivation.
439
What are the steps in Chromatin Immuno-Precipitation-sequencing?
- Crosslink proteins to DNA - Fragment the DNA - Use antibodies to recover DNA bound to nucleosomes carrying the histone mark of interest - De-crosslink the DNA from the proteins - Make a sequencing library and sequence it.
440
How are genomes now being sequenced?
They are being sequenced using technologies that produce long reads.
441
What are the benefits of long read sequencing?
- overcomes problems from repetitive regions - allows for structural variation to be detected more directly - much better assemblies - easier to get high quality assemblies of new complex genomes - epigenetic base modification can be read directly - simpler library preps - ability to sequence impure environmental samples more directly.
442
What does Pacific Biosciences allow for and what are the pros and cons?
It allows for Single Molecule Real Time sequencing. Reads are longer, 5 kb on average and up to 15 kb. The error rate is higher, but the reads allow detection of structural variation and of base modifications.
443
What are the advantages of Oxford Nanopore?
- It can read lots of different lengths - It is portable - It can convey structure information - The pores are addressable and programmable on the array - The longest read is over 2 million bp.
444
What are the disadvantages of Oxford Nanopore?
- it is limited by the size of the molecules going into the machine - can't sequence genomes at very high accuracy yet - some technical issues.
445
What is the Tree of Life program?
It is a project trying to sequence 60,000 eukaryotes in Britain and Ireland.
446
What can single-celled genomes tell us about the origin of animals?
The unicellular ancestor of animals had a complex repertoire of genes linked to multicellular processes, suggesting that changes in the regulatory genome were key to the origin of animals.