Phylogenies and Genome Evolution Flashcards Preview

Flashcards in Phylogenies and Genome Evolution Deck (68):

What can you see using a phylogenetic tree?

Which organisms are related to one another.

How proteins are related to one another, or which proteins might be in the same family and have similar functions.

Can see where mutations occurred, and possibly which phenotypic changes are associated with them.

Ex: sequencing flu genomes to see which vaccinations work and which don't, and to develop vaccines that work for this flu and related viruses. Can see which genomes are affected by different things.


What is the outgroup?

Everything is compared to the outgroup. Because everything is compared against the same thing, you can see how the other groups are related.



Come from a common organism/sequence


Sister taxa

More closely related to each other than to anything else on the tree


Monophyletic group

All descendants of a common ancestor and the common ancestor
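
A minimal sketch of how this definition could be checked on a toy tree. The nested-tuple tree format, the taxon names, and the function names are assumptions made for illustration; a group counts as monophyletic here if some node's descendants are exactly the taxa in the group.

```python
# Toy tree as nested tuples; leaves are taxon names (hypothetical example data).
def leaves(node):
    """Return the set of leaf names found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, taxa):
    """True if some node in the tree has exactly `taxa` as its descendants."""
    taxa = set(taxa)
    if leaves(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)


tree = ((("human", "chimp"), "gorilla"), "mouse")
print(is_monophyletic(tree, {"human", "chimp"}))    # sister taxa form a clade
print(is_monophyletic(tree, {"human", "gorilla"}))  # leaves out chimp
```

In this toy tree, human and chimp are also sister taxa, since no other leaf shares their most recent common ancestor.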


Sequencing by synthesis

Every time a base is incorporated it is detected

Generates a lot of genomic data quickly

Uses detection of byproducts of nucleotide incorporation into the chain to determine sequence


Pyrosequencing

Nucleotides added in each round are all of one base. The polymerase extends the strand only if that base is next, and when the template has a run of the same base, several are added in the same round.

A pyrophosphate is given off every time a base is added, and the pyrophosphates are used to generate ATP. The ATP is converted into light energy by an enzyme called luciferase, and the light is detected by the sequencing machine. When three bases are incorporated at once, there is three times the amount of light energy, so you get a triple peak.

Extra nucleotides need to be degraded so they don't get detected in the next round. When the wrong base is added, you don't get a peak because no pyrophosphate is released to create ATP and light energy.
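
The round-by-round logic above can be sketched as a small simulation. This is an illustration under simplifying assumptions, not the instrument's actual signal processing: the input is taken to be the sequence of the strand being synthesized, the dispensation order `ACGT` is arbitrary, and peak height is modeled as exactly the number of bases incorporated.

```python
def flowgram(read, flow_order="ACGT", max_cycles=100):
    """Return a list of (base, peak_height) pairs, one per nucleotide flow.

    A height of 0 means the dispensed base did not match, so no
    pyrophosphate (and no light) was produced in that round.
    """
    peaks = []
    pos = 0
    for _ in range(max_cycles):
        for base in flow_order:
            if pos == len(read):
                return peaks
            count = 0
            # A run of the same base is incorporated in a single round.
            while pos < len(read) and read[pos] == base:
                count += 1
                pos += 1
            peaks.append((base, count))
    return peaks


for base, height in flowgram("GAAAT"):
    print(base, height)
```

For the read `GAAAT`, the second `A` flow gives a peak of height 3, matching the "triple peak" described in the card.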


Illumina sequencing

Library preparation- the Genome Analyzer sequences a huge number of fragments in parallel. Sound waves are used to break up the DNA; incubating the blunt-cut fragments at temperature with dATP lets Taq polymerase add As to the ends.

Size selection- run the DNA out on a gel, where your sample makes a smear. Cut out a chunk of the gel at a certain size to get a gel plug, so you know the fragment length.

Cluster generation- a dense lawn of oligos binds to the adapters ligated to the library fragments (the size-selected, single-stranded DNA). The DNA concentration is dilute so you don't end up with every single oligo bound to a fragment. Bridge amplification gives you clusters of the same DNA sequence. Reverse strands are cleaved and washed away.

Sequencing- on the Genome Analyzer, clusters are sequenced base by base using reversibly terminated fluorescent nucleotides.



No functional genes but accumulates mutations at a rapid rate, useful for ancestry studies


Mitochondrial lineages

Seven mitochondrial lineages probably gave rise to all that we have today. Mutations accumulated that started new lineages, so we can look at dispersal patterns. The Y chromosome is looked at as well to see paternal dispersal.



Shows amount of divergence by branch length, proportional to DNA changes


Comparative Genomics

Can predict time since divergence

•  Molecular clock

•  Most are random neutral mutations

•  Sequence differences can be used to estimate the time since divergence

•  Higher rates of mutation may identify important regions that may be affected by selection


Molecular clock

Look for regions that are conserved within and between groups of organisms. Look at times of divergence based on sequences.

Want genes that accumulate mutations evenly, at regular intervals.


How do we estimate myr divergence?

Adaptive radiations: many fossils with similar morphology appearing all of a sudden in the fossil record.

Fossils can be dated by carbon dating (using the half-life of carbon-14) or with other radioactive isotopes.

Look at lineages tied to dated fossils, sequence the genomes, and correlate the number of differences between two genomes with the number of mutations expected over time to predict when they diverged.
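
A minimal sketch of the arithmetic behind a molecular clock, assuming a strict clock (one constant rate) and uncorrected pairwise differences. The function names and the example numbers are hypothetical; the factor of 2 reflects that both lineages accumulate changes after the split.

```python
def p_distance(seq_a, seq_b):
    """Proportion of aligned sites that differ between two sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def calibrate_rate(distance, divergence_time_myr):
    """Rate per site per lineage per million years, from a fossil-dated pair."""
    return distance / (2 * divergence_time_myr)


def divergence_time(distance, rate):
    """Estimated time since divergence (myr) for a pair with unknown date."""
    return distance / (2 * rate)


# Calibration pair: 10% sequence difference, fossils date the split at 10 myr.
rate = calibrate_rate(0.10, 10)
# A second pair differs at 4% of sites; estimate when it diverged.
print(divergence_time(0.04, rate))
print(p_distance("ACGTACGT", "ACGTACGA"))
```

With these made-up numbers, the second pair is estimated to have diverged about 4 myr ago.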


Molecular based origins of diversity and evolution

Alternative splicing

Changes in gene number

Changes in regulatory sequences

Non-protein coding RNA regulation

Gene mutation

Our genome is relatively static apart from a few mutations, but over long expanses of time genomes can be dynamic.


Gene Comparisons

Differences between organisms. Some are fixed differences: a mutation that is in everyone in the population (allele frequency equals one). Because they are found in one group of organisms but not the other, they are thought to be important, but sometimes people attribute important things to these mutations that aren't valid.


Selective sweep

Individuals that had these mutations were selected for very strongly, so the mutation spread through the population very quickly.
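
How quickly a strongly favored mutation can spread can be illustrated with a standard one-locus haploid selection model. This model, the fitness advantage `s`, and the starting frequency are assumptions chosen for the sketch and do not come from the deck.

```python
def next_frequency(p, s):
    """Allele frequency after one generation; carriers have fitness 1 + s."""
    return p * (1 + s) / (1 + p * s)


def generations_to_sweep(p0, s, target=0.99):
    """Generations until the favored allele reaches `target` (requires s > 0)."""
    p, generations = p0, 0
    while p < target:
        p = next_frequency(p, s)
        generations += 1
    return generations


# Starting at 1% frequency, compare a weak and a strong advantage.
print(generations_to_sweep(0.01, 0.1))
print(generations_to_sweep(0.01, 0.5))
```

The stronger the selection, the fewer generations the allele needs to approach fixation.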



•  FOXP2 transcription factor known to function in human speech

•  Human speech is associated with two non-conservative amino acid changes (Asparagine 303 and Serine 325) that occurred after humans and chimps diverged (6 mya)

•  Chimps and mice differ by only one non-conservative amino acid change (Asparagine 325); these two groups are at least 50 my divergent


Transposable elements and diversity

Contribute more than anything to the dynamic nature of the genome. Over time, the number of times a specific LINE is repeated in your genome increases.

Genome size has increased. Junk DNA can be attributed to transposable elements in a way because they replicate themselves over and over again (selfish, over-represented number of copies).


How LINE transposable elements spread

LINEs have an original location, which might be in a non-coding region, but they have to have a gene to create another DNA copy, and they have to be able to integrate.

The LINE codes for an mRNA, which gets transcribed, transported out of the nucleus, and translated.

The mRNA and proteins go back into the nucleus and cDNA is made (reverse transcriptase); integrase integrates the DNA into the genome. LINEs recognize a small 4 bp sequence as the integration site. The endonuclease cuts the chromosome open and the element inserts its sequence, replicating itself within the genome. Now there are two copies.


How do transposable elements work?

•  Transposon can move from one area of the genome to another ("jump")

•  Are not independent genetic elements (aren’t like plasmids)

•  Have one gene that catalyzes transposition

•  DNA is cut by transposase

•  insertion sequence is inserted (by transposase)

•  DNA polymerase and ligase fill in the gaps


Are transposable elements good or bad?

Some copy and paste, some cut and paste, both jumping to different regions.

Only bad when they land in the middle of a gene or a promoter sequence. The gene probably won't be able to be transcribed or translated.

We got most of our non-coding DNA from transposable elements. That is good, because the more non-coding DNA there is, the more likely an element will hit a non-coding region of the genome.
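
The difference between cut-and-paste and copy-and-paste movement can be sketched on a toy genome. The list representation and the labels `geneA`, `TE`, and so on are hypothetical; real transposition involves target-site choice and repair that this sketch ignores.

```python
def cut_and_paste(genome, source, target):
    """Remove the element at `source` and reinsert it at index `target`."""
    genome = list(genome)
    element = genome.pop(source)
    genome.insert(target, element)
    return genome


def copy_and_paste(genome, source, target):
    """Leave the original in place and insert a new copy at index `target`."""
    genome = list(genome)
    genome.insert(target, genome[source])
    return genome


genome = ["geneA", "TE", "geneB", "geneC"]
print(cut_and_paste(genome, 1, 3))   # element moves, copy number unchanged
print(copy_and_paste(genome, 1, 3))  # copy number goes from 1 to 2
```

Only the copy-and-paste version increases genome size, which is why repeated rounds of it can account for a large share of non-coding DNA.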


Gene Duplication

Another way for an increase in number of DNA bases and maybe even a protein coding sequence that could be “modified” for an alternative function

This is how gene families arise. They were likely a single gene in a common ancestor but became many genes with related functions.


What happens when you have two of the same gene?

You don't need two copies of the same gene, so a duplication might be deleterious because you have 2x the amount (dosage effect). But the second copy is free to vary: it might stay the same, or it can accumulate mutations and take on a different function, usually a similar one. A pseudogene is a copy of the gene that has taken on so many mutations it has become non-functional.


Unequal crossing over between duplicated genes

can increase the number of gene copies in the genome

Polymerase slippage can cause gene duplication; the daughter strands are different combinations of the original gene


Meiotic mishaps

Incorrect pairing of two homologues during meiosis

If the mutation (duplicated gene) is inherited, the individual will have two copies on one chromosome and one on the normal chromosome


Beta globin genes

Found on chromosome 11


Alpha globin genes

Chromosome 16


Evolution of the globin genes

There is thought to have been an ancestral globin gene that was duplicated, and then we ended up with beta and alpha because the copies accumulated different mutations. Closely related- gene family. Slightly different function.

Meiotic mishap- the copies ended up on separate chromosomes, followed by more duplication events. Some became pseudogenes.


How do gene sequences reveal relatedness?

By looking at the entire sequence of a gene we can see remnants of relatedness (we wouldn't be able to identify pseudogenes with proteins because they are non-functional). We can tell that there were more divergence events, since every time there is a mutation it changes the genome.


Expression of the globin genes

 All forms have three exons and two introns

 Expression patterns are different over the age of the individual

How might gene expression be affected?

Women can make a maternal type of hemoglobin, not expressed in men. You make very little beta hemoglobin as an embryo, then it goes way up. You make fetal hemoglobin, which hangs on to oxygen the tightest because fetuses need constant oxygen as they're developing; it can pull oxygen off alpha and beta.


Transcriptional level control in the globin genes

Chromatin remodeling complexes associate with different regions of the chromosome depending on whether it's a fetus or an adult. They can promote beta hemoglobin expression in adults. This gives differences in patterns of gene expression.


Hemoglobin expression in fetal life

Transcription factor and chromatin remodeling complex interactions between beta LCR and gamma globin genes leads to preferential gamma globin gene transcription


Hemoglobin expression in adult life

Complexes repress interaction of Beta-LCR with gamma globin genes leading to preferential beta-globin gene transcription


How does sickle cell anemia occur?

•  Single base pair substitution (A to T) that causes a different amino acid to be incorporated

•  Valine instead of glutamic acid - can't hold the oxygen as well because the shape of the protein changed
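
The substitution above can be traced at the codon level. This sketch assumes the affected codon is GAG (glutamic acid) and that the middle base changes from A to T; the lookup table is deliberately limited to the two codons involved, and the function name is hypothetical.

```python
# Only the two codons needed for this example, from the standard genetic code.
CODON_TABLE = {"GAG": "Glu", "GTG": "Val"}


def substitute(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


normal_codon = "GAG"
sickle_codon = substitute(normal_codon, 1, "T")  # A -> T at the middle base

print(normal_codon, CODON_TABLE[normal_codon])
print(sickle_codon, CODON_TABLE[sickle_codon])
```

A single base change is enough to swap a charged amino acid for a hydrophobic one.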


How can sickle cell anemia be advantageous?

Homozygous recessive- bad, can't carry oxygen. Can be somewhat remedied.

Heterozygotes tend not to get malaria, because the parasite is unable to enter the sickle cells. There are different types of sickle cell alleles around the globe. This is like convergent evolution: different mutations lead to similar things.


Ice fish

See-through; they looked like they didn't have blood. Things that live in cooler water live with high O2 content, so they could be getting oxygen through diffusion.


How do these fish deal with the cold waters?

They make antifreeze by increasing the viscosity of their blood with small polypeptides, 3 or 4 circulating amino acids that lower the freezing temperature of their blood. These are derived from the gene for trypsinogen, which is secreted by the pancreas. Part of an intron and part of an exon got duplicated and expanded with no stop codon; this little segment is repeated, so you get multiple tripeptide copies in the circulatory system.

Arctic cod also have a single exon repeated and also make antifreeze glycoproteins. Duplication of a gene, followed by mutation of the duplicated regions.


Whole Genome Duplications

•  Usually caused by non-disjunction in a germ line cell


African clawed frogs

The species Xenopus laevis has exactly double the genomic content of Xenopus tropicalis, so these frogs can't mate successfully: the products of a cross would be triploid and would not develop



very diverse group of angiosperms, three quarters of all the species are believed to have derived from whole genome duplication


Duplication of a single domain

•  Some genes are made up of domains that are repeated over and over. Collagen is actually a protein formed by the same domain repeated multiple times, linked together in series. Some genes are made up of different domains that have been rearranged in different regions of the chromosome, and now they make up their own gene even though they were derived from parts of other genes.

•  One exon - one domain

•  Introns facilitate the process- they can be broken and rejoined so there's no damage to the coding region


How do cells produce so many different proteins?

We make about 25 thousand proteins. We have combinatorial control- transcription factors (TFs), etc.

duplicated exons can occur

exon shuffling generates many different proteins with related domains. lots of proteins with related domains or identical domains

Duplicated exon or exon shuffling where domain from another protein can be inserted into another gene and by shuffling those exons we have generated a new protein with new domain. Can give it different functionality


Novel proteins through exon shuffling

•  Gives many proteins that are like a "patchwork quilt" made of multiple protein domains

•  One hypothesis states that the 24,000 proteins found in the human genome are the result of duplication and shuffling of a few thousand exons that code for protein domains consisting of 30-50 amino acids

•  From a small set of building blocks there are thousands of combinations
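
The "building blocks" point is simple counting: with n domains and k positions per protein, there are n^k ordered arrangements if domains can be reused. The sketch below uses four illustrative domain names that are not listed in the deck, and it counts arrangements only; it says nothing about which ones would be functional.

```python
from itertools import product

# Illustrative domain names only (assumption for this example).
domains = ["finger", "EGF", "kringle", "protease"]


def count_arrangements(n_domains, domains_per_protein):
    """Ordered arrangements with reuse allowed: n ** k."""
    return n_domains ** domains_per_protein


arrangements = list(product(domains, repeat=3))
print(len(arrangements))            # enumerated count for 4 domains, 3 slots
print(count_arrangements(4, 3))     # same count from the formula
print(count_arrangements(1000, 3))  # a thousand building blocks, 3 slots
```

Even a thousand exons arranged three at a time give far more combinations than the roughly 24,000 proteins mentioned above, so only a small fraction needs to be realized.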


How can you see where exons have been inserted?

Take the cDNA corresponding to an exon and search for any DNA sequence in the genome that correlates to it. If there are many spots, there is generally one ancestral sequence that has been duplicated.

The exon has been inserted into many different genes- domain functionality is distributed into different genes. Many genes we see are a mishmash of exons from different genes combined in novel ways


Exon shuffling

Domain duplication

Domain shuffling



Damaged capillaries expose collagen and cause platelets to come and form a clot. Plasminogen undergoes post-translational modification: once cleaved, it becomes active.

The TPA protein cleaves plasminogen into plasmin, which helps break down the blood clot. The TPA gene has exons from many other genes, all the results of exon shuffling/recombination events. Within this gene there was also a duplication event. If exon shuffling produces an advantageous function, it is selected for


Transposable elements and exon shuffling

Transposable elements might take other portions of genome with them when they are removed from the genome, help exons jump to different genes. Often TE are placed at random in the gene.

Shuffling/duplication is rare, and it is rarer still for the combination to produce a new protein with an advantageous function that is selected for


Histones and nucleosomes

•  DNA and histones are organized into repeating subunits called nucleosomes. (180 bp wound twice)

•  Each nucleosome includes a core particle of supercoiled DNA and histone H1 serving as a linker.

•  DNA is wrapped around the core complex.

•  The histone core complex consists of two molecules each of H2A, H2B, H3, and H4 forming an octamer.


Higher Levels of Chromatin Structure

–  A 30-nm filament is another level of chromatin packaging, maintained by histone H1.

–  Chromatin filaments are organized into large supercoiled loops.

–  The presence of loops in chromatin can be seen:
•  In mitotic chromosomes from which histones have been extracted.
•  In meiotic lampbrush chromosomes from amphibian oocytes.


Heterochromatin and Euchromatin

–  Euchromatin returns to a dispersed state after mitosis.

–  Heterochromatin is condensed during interphase.


Constitutive heterochromatin

•  Constitutive heterochromatin remains condensed all the time.

–  Found mostly around centromeres and telomeres.

–  Consists of highly repeated sequences and few genes.


Facultative heterochromatin

•  Facultative heterochromatin is inactivated during certain phases of the organism’s life.

–  Is found in one of the X chromosomes as a Barr body (looks like black dot) through X inactivation.

–  X inactivation is a random process, making adult females genetic mosaics.


Coat color

Coat color is determined by genes expressed on the X chromosome. It is not by chance that the same X chromosome is inactivated in every cell of one patch of fur.

At an early stage, after a couple of rounds of division, there is inactivation through heavy and irreversible methylation. In one cell the maternal X is inactivated and in another the paternal X. Each cell divides multiple times and the inactivation is inherited by the daughter cells. Each daughter cell from the original cell has the same X chromosome turned off
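
The patch pattern follows from two steps: a random choice in each early cell, then faithful inheritance by its descendants. A minimal simulation under those assumptions (the function name, cell counts, and seed are arbitrary choices for the sketch):

```python
import random


def x_inactivation_patches(n_founder_cells, divisions, seed=0):
    """Return one list per founder cell giving the inactive X in each descendant."""
    rng = random.Random(seed)
    patches = []
    for _ in range(n_founder_cells):
        # Random choice happens once, in the early founder cell.
        inactive = rng.choice(["maternal", "paternal"])
        # Every descendant inherits the same inactive X.
        patches.append([inactive] * (2 ** divisions))
    return patches


for patch in x_inactivation_patches(4, 3, seed=1):
    print(patch[0], len(patch))
```

Each patch is uniform, but different patches can differ, which is what makes the adult a mosaic.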


The Histone Code and Formation of Heterochromatin

–  The histone code hypothesis states that the activity of a chromatin region depends on the degree of chemical modification of histone tails.

–  Histone tail modifications influence chromatin in two ways:

•  Serve as docking sites to recruit nonhistone proteins- residues on histone tails interact with different proteins, which recognize specific marks and add or remove acetyl groups or methyl groups. Only certain residues, modified by certain enzymes, interact with the chromatin

•  Alter the way in which histones of neighboring nucleosomes interact with one another.


Why sequence whole genomes?

•  Extremophiles and industry- understanding bacteria that survive in boiling water helped us: we know to use Taq polymerase in PCR. Methanogens break down methane as an energy source.

•  Methanogens- We can use them at sites that are highly polluted with methane gases and pollutants from industrial processes: put methanogens there to get rid of waste and express genes to break down compounds. Do the same with oil spills.

•  “food”

•  Model organisms- can compare conserved genes/sequences without using actual humans, can make comparative statements



•  Ancient

•  Comparing 

•  Insight to evolutionary changes

•  FOXP2



Look for an open reading frame with no stop codons, look at each codon to see the amino acid and identify other signatures of genes, then use cDNAs to look for exons. You might also want to look for a promoter sequence so you know the open reading frame is an exon/coding region

Expressed sequence tags
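
The open-reading-frame search described in this card can be sketched as a scan of the three forward reading frames. This is a simplified illustration: it assumes an ORF starts at ATG and ends at the first in-frame stop codon, ignores the reverse strand and introns, and uses a hypothetical `min_codons` cutoff.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end) index pairs for forward-strand ORFs.

    `end` is exclusive and includes the stop codon. `min_codons` counts
    the codons before the stop, including the ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


dna = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])
```

In practice, candidate ORFs would then be checked against cDNAs or expressed sequence tags, as the card notes.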


Eukaryotes and junk DNA?

can turn on and off gene expression.

Individual with 8 repeats on each allele (one band) has unequal crossing over, then you get a heterozygote, one allele with 6 repeats and one with 10. (2 bands)
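
The 8 + 8 to 6 + 10 outcome can be reproduced by misaligning two repeat arrays before the exchange. The repeat unit `CA` and the breakpoints are assumptions chosen to match the numbers in the card.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange arms at different repeat positions on the two chromosomes."""
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


allele_a = ["CA"] * 8
allele_b = ["CA"] * 8

# Misalignment by two repeat units: break after unit 3 on one, unit 5 on the other.
short, long_ = unequal_crossover(allele_a, allele_b, 3, 5)
print(len(short), len(long_))
```

The total number of repeat units is conserved; they are just redistributed between the two products.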


How are microsatellites applied to specific questions?

Can be used to identify paternity, and also perpetrators at a crime scene.

There is some span of repeat numbers in the population at a microsatellite locus. Because of this variability, your combination of microsatellite repeats is unique. A child has matching alleles at a specific locus, one from mom and one from dad
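
A minimal sketch of the matching rule: at every locus, one of the child's alleles must be present in the mother and the other in the candidate father. Locus names and repeat counts are made up for illustration, and real testing also weighs allele frequencies and possible mutation, which this sketch ignores.

```python
def consistent_with_parents(child, mother, father):
    """Each argument is a pair of repeat counts at one locus."""
    c1, c2 = child
    return (c1 in mother and c2 in father) or (c2 in mother and c1 in father)


def paternity_not_excluded(child_profile, mother_profile, father_profile):
    """True if the candidate father is compatible at every locus tested."""
    return all(
        consistent_with_parents(
            child_profile[locus], mother_profile[locus], father_profile[locus]
        )
        for locus in child_profile
    )


child = {"locus1": (6, 10), "locus2": (8, 12)}
mother = {"locus1": (6, 8), "locus2": (12, 12)}
man_a = {"locus1": (10, 11), "locus2": (8, 9)}
man_b = {"locus1": (7, 9), "locus2": (8, 9)}

print(paternity_not_excluded(child, mother, man_a))
print(paternity_not_excluded(child, mother, man_b))
```

A single incompatible locus is enough to exclude a candidate in this simplified model.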



Novel DNA change, a shared derived character



Which hypothesis is correct? The one with the fewest evolutionary changes, that is, the tree that can explain the relatedness of the organisms with the smallest number of changes
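
Counting changes on competing trees can be done with Fitch's small-parsimony procedure, shown here for a single character on binary trees. The taxon labels, the character states, and the nested-tuple tree format are hypothetical; a full analysis would sum over many characters.

```python
def fitch(node, states):
    """Return (possible ancestral states, minimum changes) for a binary tree."""
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # No shared state: at least one change is needed on this branch pair.
    return left_set | right_set, left_changes + right_changes + 1


# 1 = derived character present, 0 = absent (made-up data).
states = {"A": 1, "B": 1, "C": 0, "D": 0}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))

print(fitch(tree_1, states)[1])
print(fitch(tree_2, states)[1])
```

The tree that groups the two taxa sharing the derived character needs fewer changes, so it is the preferred hypothesis under parsimony.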


 signatures in a phylogeny?

Can look for a specific LINE, SINE, or gene, or the absence or presence of a protein. These can be used to determine phylogenies.


DNA Microarrays

Revolutionizing how researchers analyze changes in gene activity

Examine expression of thousands of genes at once and identify which are expressed simultaneously

Ex: Exact same species but their living conditions are different. Now we can look at gene expression patterns to see if the differences are due to environmental conditions. Can analyze them all simultaneously


How do DNA Microarrays work?

Use reverse transcriptase to prepare single-stranded DNA (cDNA) from the mRNA of control cells. Add a fluorescent label. Do the same for treatment cells (attacked by a virus, exposed to heat, cancerous) and add a different color label

Probe a microarray plate with the labeled cDNA; the coding sequences will hybridize to it and you get spots of various colors

Wash and then shine laser to show fluorescence
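
Once the two fluorescence intensities are read for each spot, a common way to summarize them is a log2 ratio of treatment to control. The gene names, intensity values, and the two-fold threshold below are assumptions for illustration, and normalization steps are left out.

```python
import math


def classify_spots(spots, threshold=1.0):
    """Label each gene by its log2(treatment / control) ratio.

    `spots` maps gene name -> (treatment_intensity, control_intensity).
    A threshold of 1.0 corresponds to a two-fold change.
    """
    calls = {}
    for gene, (treatment, control) in spots.items():
        ratio = math.log2(treatment / control)
        if ratio >= threshold:
            calls[gene] = "up"
        elif ratio <= -threshold:
            calls[gene] = "down"
        else:
            calls[gene] = "unchanged"
    return calls


spots = {"geneA": (800, 200), "geneB": (100, 400), "geneC": (300, 310)}
print(classify_spots(spots))
```

Spots dominated by one label color correspond to the "up" or "down" calls, while mixed-color spots correspond to "unchanged".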



Insulin is secreted when glucose is high in the bloodstream. It increases glucose uptake by the cells, which removes glucose from the bloodstream. The liver takes it up and turns it into glycogen for energy.


How does insulin work?

Insulin is a peptide hormone, made in an inactive form and then the signal peptides are cut off. Insulin binds the insulin receptor and then there is a signaling cascade. Our body already has glucose transporters bound in a vesicle sitting in the cytoplasm until the signaling cascade begins. There is a change in conformation when insulin binds, and the vesicle is moved to the cell membrane.


What modifications does insulin undergo?

Post translational modification

Change in conformation with protein binding

Signaling - also change in conformation

Gene expression has already occurred