Lecture 9 Flashcards

1
Q

Substitution

A

F -> S

Changes shape of surface of protein

The problem with comparing only two alleles is that you cannot tell which one was the original

2
Q

Ancestry of descent

A

Not linear

Different descendants originate from the same common ancestor, and their protein sequences diverge over multiple generations

Attempting to identify tree structure

For example:
True ancestor:
FCYGQLVFTVKEAA

Inferred ancestor:
S A
FWYGRLVFTVKEAA

Descendants:
FWYGRLVFTVKAAA
FWYGRLVSTVKEAA

Difficult to identify true ancestor via tracing
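
To make the example concrete, the short sketch below lists the positions at which two aligned sequences differ. The helper name `substitutions` is an assumption introduced here for illustration; the sequences are the ones on this card.

```python
def substitutions(reference: str, other: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference residue, other residue) for each mismatch."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(reference, other), start=1)
        if a != b
    ]


true_ancestor = "FCYGQLVFTVKEAA"
inferred_ancestor = "FWYGRLVFTVKEAA"
descendants = ["FWYGRLVFTVKAAA", "FWYGRLVSTVKEAA"]

# Positions where the inferred ancestor disagrees with the true ancestor.
print(substitutions(true_ancestor, inferred_ancestor))

# Positions where each descendant differs from the inferred ancestor.
for seq in descendants:
    print(substitutions(inferred_ancestor, seq))
```

Both descendants carry W at position 2 and R at position 5, so the inferred ancestor carries them too, and the C and Q of the true ancestor cannot be recovered from these two sequences alone.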

3
Q

Class 1 GPCRs

A

Total: 811

523 were olfactory GPCRs which were identified by evolutionary relationships

4
Q

Alignment

A

Begin with unaligned protein sequences, which are then aligned

Alignments can be scored using either substitution matrices or percentage identity (the proportion of positions that are identical)

5
Q

Distance matrix

A

After alignment, a distance matrix is built that records the genetic distance between every pair of species on the tree, e.g. 0.074 substitutions per site between pig and cow vs 0.618 between pig and virus

p-distance is the proportion of aligned positions that differ between two sequences, e.g. 0.074 means 7.4% of the aligned sites differ between cow and pig

p-distance = 1 - (% identity / 100)
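
A minimal sketch of the p-distance calculation for two aligned sequences. The function name `p_distance` and the choice to skip columns containing a gap character (`-`) are assumptions made here, not something stated in the lecture. The two sequences are the descendants from the ancestry card.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared alignment columns at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap in either sequence are ignored.
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no comparable columns")
    differing = sum(a != b for a, b in compared)
    return differing / len(compared)


distance = p_distance("FWYGRLVFTVKAAA", "FWYGRLVSTVKEAA")
identity_percent = (1 - distance) * 100
print(f"p-distance = {distance:.3f}, identity = {identity_percent:.1f}%")
```

The two sequences differ at 2 of 14 positions, so the p-distance is 2/14.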

6
Q

BLOSUM62 substitution matrix

A
  • Common substitutions score high (positive values)
  • Rare substitutions score negatively

For example, alanine -> alanine (A -> A) gives a score of 4
R -> R gives a score of 5
A -> R gives a score of -1
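
A small sketch of how such a matrix scores an alignment column by column. The dictionary below holds only the three values quoted on this card, not the full matrix, and the names `BLOSUM62_SUBSET`, `pair_score` and `alignment_score` are hypothetical helpers.

```python
# Only the three entries quoted above; the real matrix covers all residue pairs.
BLOSUM62_SUBSET = {("A", "A"): 4, ("R", "R"): 5, ("A", "R"): -1}


def pair_score(a: str, b: str) -> int:
    """Look up a residue pair; the matrix is symmetric, so order does not matter."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    return BLOSUM62_SUBSET[(b, a)]


def alignment_score(seq_a: str, seq_b: str) -> int:
    """Sum the substitution scores over every aligned column (no gaps handled)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(pair_score(a, b) for a, b in zip(seq_a, seq_b))


# Columns: A/A (4), A/R (-1), R/R (5)
print(alignment_score("AAR", "ARR"))
```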

7
Q

What forms the closest pair

A

The two species with the smallest p-distance

The species (or cluster) with the next smallest p-distance is then added, and so on

This particular method is called UPGMA (Unweighted Pair Group Method with Arithmetic Mean)

Other clustering methods use medians or weighted averages instead
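
The clustering described above can be sketched as follows. This is a minimal illustration that returns only the tree topology as nested tuples and ignores branch lengths. The pig-cow and pig-virus distances are the ones quoted on the distance matrix card; the cow-virus value is a made-up number so that the example is complete.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by repeatedly merging the closest pair.

    dist maps frozenset({x, y}) -> distance. Returns the topology as nested tuples.
    """
    sizes = {name: 1 for name in names}  # cluster -> number of leaves it contains
    d = dict(dist)
    while len(sizes) > 1:
        # Closest pair of current clusters.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        new_size = sizes[a] + sizes[b]
        # Distance from the new cluster to every other cluster:
        # the size-weighted mean of the two old distances (arithmetic mean over leaves).
        for k in sizes:
            if k not in (a, b):
                d[frozenset((merged, k))] = (
                    d[frozenset((a, k))] * sizes[a] + d[frozenset((b, k))] * sizes[b]
                ) / new_size
        del sizes[a], sizes[b]
        sizes[merged] = new_size
    return next(iter(sizes))


distances = {
    frozenset(("pig", "cow")): 0.074,
    frozenset(("pig", "virus")): 0.618,
    frozenset(("cow", "virus")): 0.600,  # hypothetical value, not from the lecture
}
tree = upgma(["pig", "cow", "virus"], distances)
print(tree)
```

Pig and cow are joined first because 0.074 is the smallest entry; the virus is then attached to that pair.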

8
Q

Bootstrapping

A

A percentage that expresses how much confidence we can place in a particular branch (clade) of a phylogenetic tree

High values (e.g., > 70%) → Strong support for the branch/clade.

Moderate values (50–70%) → Moderate support (needs caution).

Low values (< 50%) → Weak or unreliable support.

9
Q

Process of bootstrapping

A

Bootstrap support

Start with multiple sequence alignment (MSA).

Resample the columns randomly with replacement to create a new “pseudo-alignment” the same size as the original.

“With replacement” means some columns can appear multiple times, while others might not appear at all.

Rebuild a phylogenetic tree using this pseudo-alignment.

Repeat this process many times (e.g., 100, 500, 1000 times).

For each branch (clade) in the original tree, count how many bootstrap trees contain the same group.

Assign a bootstrap value (support value) to each branch:

$$\text{Bootstrap support} = \left(\frac{\text{Number of times clade appears}}{\text{Total number of replicates}}\right) \times 100$$

Example: if a clade appears in 950 out of 1000 replicates, its bootstrap support is 95%.
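
The resampling steps above can be sketched as below. To keep the example short, a full tree builder is replaced by a stand-in, `closest_pair`, which just reports the pair of sequences with the fewest mismatches; that substitution, the helper names, and the toy alignment are all assumptions made for illustration.

```python
import random
from itertools import combinations


def resample_columns(alignment, rng):
    """Draw columns with replacement to build a pseudo-alignment of the same size."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Stand-in for tree building: the pair of taxa with the fewest mismatches."""
    def mismatches(pair):
        return sum(x != y for x, y in zip(alignment[pair[0]], alignment[pair[1]]))
    return min(combinations(sorted(alignment), 2), key=mismatches)


def bootstrap_support(times_clade_appears, total_replicates):
    """Percentage of replicates in which the clade was recovered."""
    return 100.0 * times_clade_appears / total_replicates


# Hypothetical toy alignment.
alignment = {
    "pig": "FWYGRLVFTVKEAA",
    "cow": "FWYGRLVSTVKEAA",
    "virus": "MCHGQIVFSLREPA",
}
rng = random.Random(0)  # fixed seed so the run is repeatable
original = closest_pair(alignment)
replicates = 1000
hits = sum(
    closest_pair(resample_columns(alignment, rng)) == original
    for _ in range(replicates)
)
print(original, bootstrap_support(hits, replicates))
```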

10
Q

Summary of tree building

A
  • Start with an alignment
  • The alignment is scored with some kind of distance measure, e.g. p-distance (the proportion of differing positions between species, i.e. 1 - fractional identity) OR a substitution matrix like BLOSUM62, which scores each substitution according to how common or rare it is

Tree-building:
  • Simple: UPGMA, neighbour joining
  • Parsimony-based: maximum parsimony, minimum evolution - produces lots of trees and selects the best one
  • Statistical: maximum likelihood, Bayesian - creates one tree and attempts to improve it
  • Trees can be rooted or unrooted

Finally, conduct bootstrapping or another statistical confidence measure (Bayesian)

11
Q

Dating trees

A

Time Tree

Calibrate tree

Example:
63.1 million yrs ago for common ancestor of pigs and cows

Calibrate the tree to identify divergence

Results:
Mammals-Fish - 400.1M years ago
Pig-Cow - 61.3M yrs ago

12
Q

Molecular Clock to identify divergence using Pig-Cow example

A

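Count how many changes there are between the two species

Divide by 2, then by the length of the alignment - gives substitutions per site

Divide by 63 (million years since the pig-cow common ancestor) - gives substitutions per site per million years

For the next node, divide its substitutions per site by this rate - gives the date for that node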
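
The arithmetic above can be written out as a short sketch. The 63 million year calibration comes from this card; the change counts and the alignment length are made-up numbers chosen only to keep the arithmetic easy to follow, and the function names are hypothetical.

```python
def substitutions_per_site(changes: int, alignment_length: int) -> float:
    """Changes between two species, halved (two lineages) and divided by alignment length."""
    return changes / 2 / alignment_length


def clock_rate(changes: int, alignment_length: int, calibration_myr: float) -> float:
    """Substitutions per site per million years, from a node of known age."""
    return substitutions_per_site(changes, alignment_length) / calibration_myr


def node_age(changes: int, alignment_length: int, rate: float) -> float:
    """Estimated age (million years) of another node, using the calibrated rate."""
    return substitutions_per_site(changes, alignment_length) / rate


ALIGNMENT_LENGTH = 1000   # hypothetical
PIG_COW_CHANGES = 126     # hypothetical count of differences between pig and cow
OTHER_PAIR_CHANGES = 800  # hypothetical count for a deeper node

rate = clock_rate(PIG_COW_CHANGES, ALIGNMENT_LENGTH, calibration_myr=63)
age = node_age(OTHER_PAIR_CHANGES, ALIGNMENT_LENGTH, rate)
print(f"rate = {rate:.4f} substitutions/site/Myr, deeper node = {age:.0f} Myr")
```

With these invented counts the rate works out to about 0.001 substitutions per site per million years, which places the deeper node at about 400 million years.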

13
Q

Why may molecular clock not be exactly correct?

A
  • Alignment may be (partially) incorrect
  • Insufficient information to calibrate molecular clock
  • The molecular clock may not tick at a regular rate in the gene
  • Wrong substitution model
  • Choose carefully, align well, parametrise carefully, cross fingers
14
Q

Maximum parsimony method

A

  1. Obtain a multiple sequence alignment (MSA)
     • Align your DNA, RNA, or protein sequences.
  2. List all possible tree topologies
     • Draw all possible unrooted trees for your species (example: 4 species → 3 unrooted trees).
  3. For each tree, evaluate the number of changes
     • For each site (column), find the minimum number of evolutionary changes needed.
     • Use methods like Fitch’s algorithm to do this efficiently.
  4. Sum the changes across all sites
     • Add up the total number of changes for each tree.
  5. Select the tree with the fewest total changes
     • The tree requiring the fewest steps is the maximum parsimony tree.
     • If more than one tree has the same number, they are equally parsimonious.
  6. (Optional) Use software for larger datasets
     • Tools like MEGA, PAUP*, PHYLIP, or TNT can automate tree building.
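
For four species the whole procedure is small enough to sketch directly. The code below enumerates the three unrooted topologies, counts changes per site with Fitch's algorithm, and picks the tree with the fewest changes. The species names and the four-column alignment are invented for illustration, and each unrooted tree is rooted on its internal branch, which does not affect the Fitch count.

```python
def fitch_site(tree, states):
    """Return (possible ancestral states, minimum changes) for one alignment column."""
    if isinstance(tree, str):  # leaf: its observed state, no changes
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:  # children agree on at least one state: no extra change
        return common, left_cost + right_cost
    return left_set | right_set, left_cost + right_cost + 1  # one change needed


def parsimony_score(tree, alignment):
    """Sum the minimum number of changes over all columns."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch_site(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(length)
    )


def four_taxon_topologies(a, b, c, d):
    """The three unrooted trees for four taxa, rooted on the internal branch."""
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


# Hypothetical aligned sequences.
alignment = {"sp1": "AAGT", "sp2": "AAGC", "sp3": "CTGT", "sp4": "CTGC"}
trees = four_taxon_topologies("sp1", "sp2", "sp3", "sp4")
scores = [parsimony_score(tree, alignment) for tree in trees]
best_tree = trees[scores.index(min(scores))]
for tree, score in zip(trees, scores):
    print(tree, score)
print("most parsimonious:", best_tree)
```

Here the grouping of sp1 with sp2 needs 4 changes, against 5 and 6 for the other two topologies, so it is the maximum parsimony tree for this toy alignment.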
