Lecture 9 Flashcards

Question 1

Q

Substitution

Answer

A

F -> S

Changes shape of surface of protein

Problem with looking at two alleles is you don’t know which was the original

Question 2

Q

Ancestry of descent

Answer

A

Not linear

Different descendants originate from the same common ancestor, and their protein sequences diverge from one another over many generations

Attempting to identify tree structure

For example:
True ancestor:
FCYGQLVFTVKEAA

Inferred ancestor:
FWYGRLVFTVKEAA
Variants seen in descendants: S (position 8), A (position 12)

Descendants:
FWYGRLVFTVKAAA
FWYGRLVSTVKEAA

Difficult to identify true ancestor via tracing

Question 3

Q

Class 1 GPCRs

Answer

A

Total: 811

523 were olfactory GPCRs which were identified by evolutionary relationships

Question 4

Q

Alignment

Answer

A

Begin with unaligned protein sequences, which are then aligned

Alignments can be scored using either substitution matrices or percentage identity (the proportion of aligned positions that are identical)

Question 5

Q

Distance matrix

Answer

A

After alignment, a distance matrix is formed between the different species on the tree with their genetic distances e.g. 0.074 substitutions per site between pig and cow vs 0.618 between pig and virus

p-distance is the proportion of aligned sites that differ between two sequences e.g. 0.074 means 7.4% of aligned sites differ between cow and pig

p-distance = 1 - (% identity / 100)
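
As a rough illustration of the definition above, the sketch below counts differing columns between two aligned sequences. The function name `p_distance` and the choice to skip gap columns are assumptions made for this example, not something stated in the lecture. The two sequences are the descendants from the ancestry card.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ('-') in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


# The two descendants differ at 2 of 14 sites (positions 8 and 12).
d = p_distance("FWYGRLVFTVKAAA", "FWYGRLVSTVKEAA")
identity = 1 - d  # fraction of identical sites
```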

Question 6

Q

Blosum-62 substitution matrix

Answer

A

Identical residues and common substitutions score high (positive values)
Rare substitutions score negatively

For example, A -> A (alanine unchanged) gives a score of 4
R -> R (arginine unchanged) gives a score of 5
A with R gives a score of -1
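
A minimal sketch of how such scores add up along an alignment. It uses only the three BLOSUM62 entries quoted on this card (the full matrix is 20 x 20). The names `BLOSUM62_SUBSET`, `pair_score` and `alignment_score`, and the toy sequences, are hypothetical.

```python
# Only the three entries quoted above; NOT the full BLOSUM62 matrix.
BLOSUM62_SUBSET = {("A", "A"): 4, ("R", "R"): 5, ("A", "R"): -1}


def pair_score(a: str, b: str, matrix=BLOSUM62_SUBSET) -> int:
    """Look up a score; the matrix is symmetric, so try both orders."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def alignment_score(seq_a: str, seq_b: str) -> int:
    """Sum the per-column scores of two aligned, gap-free sequences."""
    return sum(pair_score(a, b) for a, b in zip(seq_a, seq_b))


# Columns: A/A = 4, A/R = -1, R/R = 5, A/A = 4
score = alignment_score("AARA", "ARRA")  # 12
```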

Question 7

Q

What forms the closest pair

Answer

A

The two species with the smallest p-distance

The species (or cluster) with the next smallest average distance is then joined, and so on until every species is in the tree

This particular method is called UPGMA (Unweighted Pair Group Method with Arithmetic Mean)

Related methods use a weighted average (WPGMA) or the median instead
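
The joining procedure can be sketched as below. This is a simplified illustration that returns only the grouping order (no branch lengths). The pig-cow and pig-virus distances are the ones quoted on the distance matrix card; the cow-virus value is a made-up placeholder.

```python
from itertools import combinations


def upgma(names, dist):
    """Return a nested-tuple tree.

    names: list of leaf labels
    dist:  dict mapping frozenset({a, b}) -> distance between a and b
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # New distance = arithmetic mean over all leaf pairs, which is the
        # size-weighted average of the two old cluster distances.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


distances = {
    frozenset(("pig", "cow")): 0.074,
    frozenset(("pig", "virus")): 0.618,
    frozenset(("cow", "virus")): 0.620,  # hypothetical value
}
tree = upgma(["pig", "cow", "virus"], distances)
```

Pig and cow are joined first because they have the smallest distance; the virus is attached last.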

Question 8

Q

Bootstrapping

Answer

A

A percentage measure of how confident we are in a particular part (branch/clade) of a phylogenetic tree

High values (e.g., > 70%) → Strong support for the branch/clade.

Moderate values (50–70%) → Moderate support (needs caution).

Low values (< 50%) → Weak or unreliable support.

Question 9

Q

Process of bootstrapping

Answer

A

Bootstrap support

Start with multiple sequence alignment (MSA).

Resample the columns randomly with replacement to create a new “pseudo-alignment” the same size as the original.

“With replacement” means some columns can appear multiple times, while others might not appear at all.

Rebuild a phylogenetic tree using this pseudo-alignment.

Repeat this process many times (e.g., 100, 500, 1000 times).

For each branch (clade) in the original tree, count how many bootstrap trees contain the same group.

Assign a bootstrap value (support value) to each branch:

$$\text{Bootstrap support} = \left( \frac{\text{Number of times clade appears}}{\text{Total number of replicates}} \right) \times 100$$

Example: if a clade appears in 950 out of 1000 replicates, its bootstrap support is 95%.
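
The steps above can be sketched roughly as follows. To keep the example short, the "tree" rebuilt for each replicate is reduced to the single closest pair of sequences (by p-distance), which is an assumption of this sketch; a real analysis would rebuild a full tree each time. All names and the toy alignment are hypothetical.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw columns with replacement; the result keeps the original length."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Stand-in for a tree builder: the pair with the smallest p-distance."""
    def p_dist(a, b):
        return sum(x != y for x, y in zip(a, b)) / len(a)

    pair = min(
        combinations(sorted(alignment), 2),
        key=lambda p: p_dist(alignment[p[0]], alignment[p[1]]),
    )
    return frozenset(pair)


def bootstrap_support(alignment, clade, replicates=1000, seed=0):
    """Percentage of replicates in which `clade` is recovered."""
    rng = random.Random(seed)
    hits = sum(
        closest_pair(resample_columns(alignment, rng)) == frozenset(clade)
        for _ in range(replicates)
    )
    return hits / replicates * 100


# Toy alignment: A and B are identical, C differs at every column,
# so the clade {A, B} is recovered in every replicate.
toy = {"A": "AAAA", "B": "AAAA", "C": "CCCC"}
support = bootstrap_support(toy, {"A", "B"}, replicates=100)  # 100.0
```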

Question 10

Q

Summary of tree building

Answer

A

Start with an alignment
The alignment is scored with some kind of measure, e.g. p-distance (the proportion of differing sites between species, i.e. 1 - identity) OR a substitution matrix like BLOSUM62, which gives a score based upon how common/rare a substitution is

Tree-building:
- Simple: UPGMA, neighbour joining
- Parsimony-based: Maximum parsimony, minimum evolution - produces lots of trees and selects the best one
- Statistical: Maximum likelihood, Bayesian - creates one tree and attempts to improve it

Trees can be rooted or unrooted

Conduct bootstrapping or another statistical confidence measure (Bayesian)

Question 11

Q

Dating trees

Answer

A

Time Tree

Calibrate tree

Example:
63.1 million yrs ago for common ancestor of pigs and cows

Calibrate the tree to identify divergence

Results:
Mammals-Fish - 400.1M years ago
Pig-Cow - 61.3M yrs ago

Question 12

Q

Molecular Clock to identify divergence using Pig-Cow example

Answer

A

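Count the number of changes between the two species

Divide by 2 (the changes are shared between the two lineages), then by the length of the alignment - gives substitutions per site

Divide by 63 (million years since the pig-cow common ancestor) - gives substitutions per site per million years

For another node, divide its substitutions per site by this rate - gives the date for that node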
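
A small worked sketch of this arithmetic. The change counts and alignment length are made-up numbers chosen to give round results; only the 63 million year calibration comes from the card. The function names are hypothetical.

```python
def per_lineage_distance(n_changes: int, alignment_length: int) -> float:
    """Substitutions per site along ONE lineage since the common ancestor."""
    return n_changes / 2 / alignment_length


def clock_rate(n_changes: int, alignment_length: int, calibration_myr: float) -> float:
    """Substitutions per site per million years, from a calibrated pair."""
    return per_lineage_distance(n_changes, alignment_length) / calibration_myr


def node_age(n_changes: int, alignment_length: int, rate: float) -> float:
    """Estimated age (million years) of the node joining another pair."""
    return per_lineage_distance(n_changes, alignment_length) / rate


# Hypothetical data: 126 changes between pig and cow over 1000 aligned sites.
rate = clock_rate(126, 1000, calibration_myr=63)  # about 0.001 per site per Myr

# Hypothetical second pair with 800 changes over the same alignment.
age = node_age(800, 1000, rate)  # about 400 million years
```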

Question 13

Q

Why may molecular clock not be exactly correct?

Answer

A

Alignment may be (partially) incorrect
Insufficient information to calibrate molecular clock
The molecular clock may not tick at a regular rate in the chosen gene
Wrong substitution model
Choose carefully, align well, parametrise carefully, cross fingers

Question 14

Q

Maximum parsimony method

Answer

A

1. Obtain a multiple sequence alignment (MSA)
- Align your DNA, RNA, or protein sequences.

2. List all possible tree topologies
- Draw all possible unrooted trees for your species.
- (Example: 4 species → 3 unrooted trees.)

3. For each tree, evaluate the number of changes
- For each site (column), find the minimum number of evolutionary changes needed.
- Use methods like Fitch’s algorithm to do this efficiently.

4. Sum the changes across all sites
- Add up the total number of changes for each tree.

5. Select the tree with the fewest total changes
- The tree requiring the fewest steps is the Maximum Parsimony tree.
- If more than one tree has the same number, they are equally parsimonious.

6. (Optional) Use software for larger datasets
- Tools like MEGA, PAUP*, PHYLIP, or TNT can automate tree building.
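
For a feel of how the counting works, here is a minimal sketch of Fitch's algorithm applied to the three unrooted topologies for four species. Each topology is written as a nested tuple rooted on its internal branch, which does not affect the parsimony count. The alignment and all names are hypothetical.

```python
def fitch_site(tree, states):
    """Return (possible_states, n_changes) for one column on a binary tree."""
    if isinstance(tree, str):  # a leaf: its observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:  # children agree: no extra change needed
        return common, left_cost + right_cost
    # children disagree: take the union and count one change
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all columns."""
    n_cols = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_cols)
    )


# The 3 unrooted topologies for 4 species.
topologies = [
    (("A", "B"), ("C", "D")),
    (("A", "C"), ("B", "D")),
    (("A", "D"), ("B", "C")),
]
alignment = {"A": "AAGC", "B": "AAGT", "C": "CCGC", "D": "CCGT"}  # toy data

scores = [parsimony_score(t, alignment) for t in topologies]  # [4, 5, 6]
best = topologies[scores.index(min(scores))]
```

Here the first topology, which groups A with B and C with D, needs the fewest changes and is the maximum parsimony tree for this toy alignment.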

(14 cards)