Plant diversity and from bone to genome Flashcards
(35 cards)
Why is it important to study plants?
We cannot live without plants! We use them for food, building material and fabric/clothing, they produce useful chemicals (secondary metabolites), we burn them (fossil fuels) for energy and they produce much of the oxygen we breathe. Also, they are pretty and make us happy! People at work who can see plants report significantly greater job satisfaction than those who can’t, and many substances we like come from plants: cocoa, coffee, alcohol, weed, tobacco etc.
Furthermore, plants are eukaryotes like us and studying them can help us learn a lot! Some of the most important discoveries in biology came from plant studies, e.g. heredity; the cell was first discovered through looking at plants, and the first described viruses were purified from plants.
When was the green revolution? What is it?
The green revolution started in the 1950s (1950-1984). During this time, there were major advances in agriculture, like modern plant breeding (resulting in high-yield varieties of food crops) and the use of synthetic fertilizer, which increased world grain production by 160%!
Plants can also be used to inform about the ancient world, give two examples of plant derived proxies in palaeogenetics.
Plant derived proxies are:
- Pollen
- Macrofossils
- sedaDNA
Plants have evolved to thrive in diverse habitats, give five examples.
Plants have evolved to thrive in:
- Desert climates: drought, heat and irradiation tolerance
- Grassland
- Alpine climates: cold, dry, low light
- Rain forest: Hot, humid, wet
- Aquatic climates
- Swamp forest
etc.
They have also evolved with very diverse forms and functions, flowering plants, trees, grass etc.
How do we think plants evolved?
Plants are thought to have evolved from an ancestral eukaryotic cell (containing mitochondria) that also acquired a cyanobacterium through endosymbiosis, which later became the chloroplast.
Plants originated in the ocean (as all other life) but when did they colonize land and what were the major difficulties they encountered up there?
Plants colonized land roughly 500 million years ago, and were met by a very harsh environment: first of all high UV irradiation, gravity, pathogens and wind. Later came challenges with herbivores, competition, seed dispersal, pollination etc.
Plants are a highly diverse group, explain the six molecular evolution mechanisms that facilitated this.
Molecular evolution mechanisms:
- Promoter duplication: If a promoter is duplicated, the gene it controls will be able to be even more active –> increased expression levels. This can also alter the tissue specific expression pattern.
- Missense mutation: A mutation that causes a change in amino acid sequence in the resulting protein, which can lead to new functions.
- Nonsense mutation (Premature stop codon): If a mutation leads to a stop codon instead of an AA, it results in truncated gene products –> often leads to the gene being non-functional.
- Intergenic space deletion: Leads to gene fusion, which can result in a new function or a non-functional product.
- Transposable elements: Insertion or deletion of DNA, can lead to new or lost function, overall variation.
- Gene duplication: When an ancestral gene is duplicated, you have two copies of the same gene (paralogous genes) and the function is redundant. This opens up the opportunity for changes being tolerated in one of them, which can lead to new functions (i.e. neofunctionalization).
All of these mechanisms drive evolution in animals too, but unlike animals, plants are highly adaptive and can tolerate these mutations to a much higher extent!
Name three use cases of plants in aDNA studies.
Plants can be useful for aDNA studies in many different contexts, for example:
- Reconstructing past ecosystems: Which plant species were there? What can they say about the environment?
- Ancient ecosystem dynamics: How has the ecosystem changed in the past with introduction of new plant species? Competition?
- Biogeography: Plant colonization can help us in understanding past geographical changes like tectonic plate movements.
- Human history and agriculture: aDNA can reveal the wild ancestors of domesticated crops, it can help us understand what was selected for, and it can be useful in bringing back genes that help with tolerance, e.g. to temperature, drought etc. It can also be used to study human interactions and movements, by seeing how domesticated crops spread.
Historical climate change also shaped plant diversity and evolution, name three biotic and three abiotic stressors for plants.
Biotic stress: Pathogens, Insects, Herbivores
Abiotic stress: Drought, Flood, Extreme temperatures, UV irradiation
Plants are not static. As they are sessile and cannot move, they need to respond to environmental stimuli fast, and they might be even more sensitive than we are.
Name five types of samples that are suitable for genome sequencing.
Types of samples suitable for genome sequencing:
- Bone
- Tooth
- Nail/claw
- Horn/antler
- Hair
- Skin/tissue
- Pollen & seed
- Eggshell
- Wood
- ”Chewing gums”
- Sediments
- Coprolites
Name five different sources for samples that are suitable for genome sequencing.
Sources of samples suitable for genome sequencing:
- Museum collections
- Field work/excavations
- caves
- permafrost
- erosion points in rivers.
For all field work: preparation and planning is key, also learning where to look (erosion points for example).
Is aDNA rare?
Yes! Often, <1% of sequences obtained from ancient samples are endogenous, much of the rest is microbial (”-but might still be interesting!”)
A sample has generally been affected by the environment for a long time after its Best-before date:
- DNA damage (fragmentation & alteration)
- Contamination from non-target organisms
- Inhibitory substances
Because of this, we require lab methods that can optimize endogenous DNA recovery, reduce contamination & remove inhibitors.
There is a big variation in DNA preservation between substrates, what is the most important factor for good preservation? Which substrates are the “holy grail” because of this?
The most important factor for DNA preservation is the density! Because of their extremely high density, petrous bone (the dense, hard part of the temporal bone that houses the inner ear, mainly the cochlea) and tooth cementum (outer layer of the root of teeth) are the holy grail, with much higher endogenous content than other bones.
Unfortunately, these are rare to find, but when you do it’s great! Bones in general are fairly good, so no worries if you don’t find petrous bone.
Why is density so important in DNA preservation?
The denser the substrate, the less exogenous DNA can leach into it.
Explain the workflow from bone to genome briefly.
To get from bone to genome, there are 6 steps of wet lab (before dry lab):
- Remove surface contamination: Either by wiping with bleach (sodium hypochlorite), UV-irradiation or by removing the outer layer with a drill. This ensures that you have a clean surface.
- Collect bone material: Drill for powder (faster but risks heat damage so you need to have low speed and stop in-between) or cut off a piece and pulverize (less risk of heat damage). Generally, the higher the drill speed, the less DNA you get.
- Pre-treatment: Bleach and/or pre-digestion. This step is not always needed and should be skipped when it isn’t, because it’s pretty harsh and damages the DNA. You get less DNA but also much less contamination, so it’s a trade-off. You also lose complexity (more clones of the same fragments and none of others), so if you are sequencing deeply it is better to have low endogenous content and high complexity; otherwise pre-treatment is fine.
- DNA extraction: Digest the bone powder with proteinase K and heat, typically in an EDTA-containing buffer, to decalcify the bone and digest proteins and fats, which releases the DNA.
- DNA purification:
- Binding buffer (acidic, high salt) + Silica = Immobilize DNA on membrane/beads
- Wash buffer (ethanol) = Remove cellular & inorganic remains from the solution
- Elution buffer (basic, low salt) = Release DNA from silica to a final ”pure” solution
So first the DNA is in suspension, then bound to silica (membrane or coated beads), and finally it ends up purified in the elution buffer for subsequent analysis.
- Amplification & Library build: Add adapters, amplify, and then send off for sequencing.
Then you clean and purify, check the size of the library to see that you have not just amplified adaptors, and then send it for sequencing.
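The idea of library complexity from the pre-treatment step can be made concrete with a small Python sketch. It treats complexity as the fraction of reads that are unique, which is a simplification: real pipelines usually identify clones from mapping coordinates, not from raw sequence identity. The function name and the toy reads are invented for illustration.

```python
def library_complexity(reads):
    """Return the fraction of reads that are unique (1.0 means no clones).

    Simplifying assumption: two reads are clones if their sequences are identical.
    """
    reads = list(reads)
    if not reads:
        raise ValueError("no reads given")
    return len(set(reads)) / len(reads)


# Toy libraries (invented): four distinct fragments vs. many clones of one fragment.
high_complexity = ["AACGT", "GGTCA", "TTAGC", "CCGAT"]
low_complexity = ["AACGT", "AACGT", "AACGT", "GGTCA"]

print(library_complexity(high_complexity))
print(library_complexity(low_complexity))
```

With deep sequencing, a low-complexity library mostly returns the same fragments again, which is why complexity matters more the deeper you sequence.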
Describe the state of the aDNA after purification. Are there any problems with this state?
aDNA is heavily fragmented and has overhangs at the ends of the fragments. The bases that are exposed in the overhangs (ss) are much more susceptible to reacting with water.
For example, when Cytosine reacts with water, it gets converted into Uracil (which should not exist in DNA). Uracil is interpreted as thymine by DNA polymerase, leading to an adenine being added on the complementary strand. When this is then amplified, a Thymine is inserted at the position of the original cytosine, resulting in a C –> T base substitution. This is a problem, but it is also used to authenticate aDNA (as modern DNA doesn’t have an overrepresentation of T and underrepresentation of C at the ends).
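The authentication signal described above (excess C –> T at fragment ends) can be sketched in Python. This is a minimal illustration, not a replacement for a dedicated damage-profiling tool: it assumes reads are already aligned to the reference without gaps, and the function name and toy data are invented.

```python
from collections import Counter


def c_to_t_frequency(pairs, n_positions=5):
    """Per-position C->T mismatch frequency at the 5' end of aligned reads.

    `pairs` holds (reference, read) strings of equal length, aligned
    without gaps (a simplifying assumption).
    """
    c_sites = Counter()  # positions where the reference has a C
    c_to_t = Counter()   # positions where that C is read as a T
    for ref, read in pairs:
        for pos, (r, q) in enumerate(zip(ref[:n_positions], read[:n_positions])):
            if r == "C":
                c_sites[pos] += 1
                if q == "T":
                    c_to_t[pos] += 1
    return [c_to_t[p] / c_sites[p] if c_sites[p] else 0.0
            for p in range(n_positions)]


# Toy data (invented): damage concentrated at the first position.
toy_pairs = [
    ("CAGCT", "TAGCT"),
    ("CCGAT", "TCGAT"),
    ("CTGCA", "CTGCA"),
    ("ACGTC", "ACGTC"),
]
freqs = c_to_t_frequency(toy_pairs)
print(freqs)
```

In genuinely ancient reads the frequency is expected to be highest at the first position and to drop off towards the middle of the read. A flat profile points towards modern DNA.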
Is there a way to handle the problem with base substitutions in aDNA? When is this useful?
Yes! The solution is using the USER enzyme (Uracil DNA glycosylase + Endonuclease VIII), which cuts out Uracil from the strands –> no damage anymore. But this also means that the C –> T base substitution can’t be used to authenticate anymore. This is for example useful when studying an extinct animal such as the mammoth, as there are no modern mammoths that could contaminate the sample.
There are 2 categories of aDNA library build, which?
2 categories of aDNA library build:
- Double-stranded: Faster & cheaper
- Single-stranded: More efficient (at least without USER treatment)
Normal kits for library preparation are not suitable for aDNA, as they are not efficient in converting degraded DNA fragments into ready libraries –> Problematic for samples with little starting material! So specific protocols have been developed (more time consuming, but higher conversion rate).
How do ssDNA and dsDNA libraries differ in terms of library preparation?
For dsDNA libraries, the aDNA damage creates overhanging ends, so they need to be repaired in order to get a complete dsDNA molecule to sequence. To do this, you ligate universal adapters to the 5’-ends and use those to “fill in” or extend the 3’ ends until the whole DNA molecule is double stranded (classic method developed by Meyer & Kircher).
For ssDNA libraries, you can simply denature the strands, and then there is no more overhang as the strands are now separated. A major benefit is that damaged molecules can be used as templates = higher complexity. After strand separation, adapters are ligated to all ssDNA molecules, which are then immobilized on magnetic beads. Then a primer can bind and extend, and an adapter is ligated on the other side.
When the adapters are attached, you use amplification primers that bind to the adapters (universal, so you only need one pair). The primers carry unique indexes, so that you know which read is forward and which is reverse, and which sample each read belongs to if you pool samples. Now you can amplify the library before sequencing.
If you are only interested in sequencing specific regions of the DNA, what method can you use? How does it work in general?
If you only want to sequence specific regions of the DNA, you can use hybridization capture. Hybridization capture uses baits: short, biotinylated sequences that match the target DNA, e.g. from a specific species. The baits hybridize to the target DNA (e.g. mtDNA, nuclear exomes, SNPs), and then you add magnetic beads coated in streptavidin to fish out the bait-bound fragments. This increases the endogenous content quite a lot, but since the baits are constructed from high quality modern genomes, you run the risk of not capturing variation that is now lost from the gene pool, and there is a risk of geographic misrepresentation.
What is “multiplexing”?
Multiplexing is when you have pooled individually indexed DNA libraries before sequencing. After sequencing you know from which sample a specific sequence is, so then you de-multiplex them bioinformatically before analysis to sort each sample.
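De-multiplexing can be illustrated with a short Python sketch. It assumes each read arrives as an (index, sequence) pair and that indexes must match exactly. Real demultiplexers usually tolerate a mismatch or two, and the index sequences and sample names here are invented.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group (index, sequence) reads by sample.

    Reads whose index is not in `index_to_sample` go to 'undetermined'.
    """
    by_sample = defaultdict(list)
    for index, seq in reads:
        sample = index_to_sample.get(index, "undetermined")
        by_sample[sample].append(seq)
    return dict(by_sample)


# Hypothetical indexes and sample names, for illustration only.
index_to_sample = {"ACGTAC": "bone_A", "TGCATG": "tooth_B"}
pooled_reads = [
    ("ACGTAC", "GATTACA"),
    ("TGCATG", "CCGGA"),
    ("ACGTAC", "TTAGC"),
    ("NNNNNN", "ACGT"),
]
result = demultiplex(pooled_reads, index_to_sample)
print(result)
```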
When you receive the genome sequencing data back, in what format is it?
The sequence data output is in FASTQ format. Each read takes four lines: the first line is the identifier (index and some practical info), the second line is the sequence, the third is a "+" separator and the fourth holds the per-base quality scores.
When getting the sequencing data back, the first thing to do is to check the quality. What three things do we usually look at?
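A FASTQ record spans four lines (identifier, sequence, "+" separator, quality string). The Python sketch below reads such records from a text stream. It assumes the simple, unwrapped four-line layout and uses invented read names. For real data an established parser is the safer choice.

```python
import io


def read_fastq(handle):
    """Yield (identifier, sequence, quality) tuples from a FASTQ text stream.

    Assumes exactly four lines per record, with no line wrapping.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored here
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual  # drop the leading '@'


# Toy records (invented) in place of a real file.
example = "@read1 sampleA\nGATTACA\n+\nIIIIIII\n@read2 sampleA\nCCGT\n+\nII#I\n"
records = list(read_fastq(io.StringIO(example)))
for name, seq, qual in records:
    print(name, seq, qual)
```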
Quality control:
- Mean quality score: over 30 is good. QS=30 means that there is 1 error per 1 000 nucleotides. The score is usually over 30 across the board.
- Per base N content: When the machine gets conflicting base calls, it assigns an N. This is usually very low for most samples today.
- Adapter content: Allows you to see approximately how long your target DNA is.
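The quality score is a Phred score, Q = -10 * log10(p), where p is the probability that a base call is wrong. The Python sketch below converts between the two and decodes a FASTQ quality string, assuming the common Phred+33 ASCII encoding. The function names are chosen for illustration.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to a Phred quality score."""
    return -10 * math.log10(p)


def decode_quality_string(qual, offset=33):
    """Decode a FASTQ quality string into per-base Phred scores.

    The offset of 33 (Phred+33) is an assumption about the encoding.
    """
    return [ord(ch) - offset for ch in qual]


for q in (10, 20, 30, 40):
    print(q, phred_to_error_prob(q))
print(decode_quality_string("I#"))
```

So Q20 corresponds to 1 error per 100 bases, Q30 to 1 per 1 000 and Q40 to 1 per 10 000.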
What is done to the sequencing data after quality control?
After quality control, we perform adapter trimming and merging of overlapping reads. After this, the QS is usually even higher and the reads can be of varying lengths. Here you often plot the read length, which is usually around 30-45 bp; if the reads are very long, it is likely due to modern contamination.
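The read length check can be sketched in Python as a simple length summary. The 100 bp cutoff for "very long" is an arbitrary assumption for illustration, not a value from the course, and the toy reads are invented.

```python
from collections import Counter


def length_summary(reads, long_cutoff=100):
    """Summarize read lengths after trimming and merging.

    `long_cutoff` is a hypothetical threshold for flagging suspiciously long reads.
    """
    lengths = [len(r) for r in reads]
    if not lengths:
        raise ValueError("no reads given")
    n_long = sum(1 for n in lengths if n > long_cutoff)
    return {
        "histogram": dict(sorted(Counter(lengths).items())),
        "mean": sum(lengths) / len(lengths),
        "fraction_long": n_long / len(lengths),
    }


# Toy reads (invented): three short fragments and one suspiciously long one.
toy_reads = ["A" * 35, "C" * 40, "G" * 35, "T" * 150]
summary = length_summary(toy_reads)
print(summary)
```

A high fraction of long reads would be a reason to look more closely at possible modern contamination before analysis.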