Lecture 10 Flashcards

1
Q

Gamma distribution and alpha parameters

A

alpha determines the shape of the gamma distribution and how peaked/spread out it is

The gamma curve shows how rate variability is distributed across sites

Alpha is almost universally, and incorrectly, referred to as gamma, because some popular software lets you choose your value of alpha from a dropdown menu

In alignment, not all sites are equally variable
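As a rough illustration of how alpha controls the shape, the sketch below draws per-site relative rates from a gamma distribution with mean 1 (shape alpha, scale 1/alpha). That parameterisation is a common convention assumed here, not something stated on the card, and the function name `site_rates` and the alpha values are hypothetical.

```python
import random
import statistics

def site_rates(alpha, n_sites, seed=0):
    """Draw relative substitution rates for n_sites alignment sites.

    Assumption: rates follow a gamma distribution with shape alpha and
    scale 1/alpha, so the mean rate is 1 and the variance is 1/alpha.
    """
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale)
    return [rng.gammavariate(alpha, 1.0 / alpha) for _ in range(n_sites)]

if __name__ == "__main__":
    for alpha in (0.5, 5.0):
        rates = site_rates(alpha, 10_000)
        print(alpha,
              round(statistics.mean(rates), 2),
              round(statistics.variance(rates), 2))
```

With a small alpha most sites are nearly invariant and a few evolve very fast; with a large alpha the rates cluster around the mean, so sites are close to equally variable.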

2
Q

Relaxed vs strict clocks

A

Strict clocks - substitution rate (subs/site/year) is the same on all branches of the tree at all times

Relaxed clock - required when the assumption of a constant rate (subs/site/year) has been violated - allows the rate to vary over time and across branches according to a distribution:
- lognormal
- exponential
- random
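The difference between the clock types can be sketched by generating one rate per branch. This is an illustrative toy, not how any particular program implements its clocks: the function name `branch_rates`, the `sigma` parameter and the example rate are assumptions, and only the strict, lognormal and exponential cases are covered.

```python
import math
import random

def branch_rates(n_branches, mean_rate, clock="strict", sigma=0.5, seed=0):
    """Return one substitution rate (subs/site/year) per branch."""
    rng = random.Random(seed)
    if clock == "strict":
        # Strict clock: the same rate on every branch
        return [mean_rate] * n_branches
    if clock == "lognormal":
        # Choose mu so that the expected rate equals mean_rate
        mu = math.log(mean_rate) - sigma ** 2 / 2
        return [rng.lognormvariate(mu, sigma) for _ in range(n_branches)]
    if clock == "exponential":
        # expovariate takes the rate parameter 1/mean
        return [rng.expovariate(1.0 / mean_rate) for _ in range(n_branches)]
    raise ValueError(f"unknown clock type: {clock}")

if __name__ == "__main__":
    for clock in ("strict", "lognormal", "exponential"):
        print(clock, branch_rates(4, 1e-3, clock))
```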

3
Q

Rival phylogenetic methods

A

UPGMA

Minimum evolution - additive method, looks for minimum total tree length

Neighbour joining - Optimises tree length at local rather than global level

Maximum parsimony - looks for minimum number of substitutions

Maximum likelihood - looks for the tree under which the observed data are most probable, given a model of evolution

Bayesian trees - apply Bayesian methods to the probability problem

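To make the parsimony criterion concrete, here is a minimal sketch that counts the minimum number of substitutions at one alignment site using Fitch's algorithm. Fitch's algorithm is one standard way to do this count; the card does not name a specific algorithm. Trees are written as nested tuples, and the taxon names and bases are made up for illustration.

```python
def fitch(tree, states):
    """Return (possible_states, min_substitutions) for one alignment site.

    tree   -- a taxon name (str) or a pair (left_subtree, right_subtree)
    states -- dict mapping each taxon name to its base at this site
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra substitution
        return shared, left_cost + right_cost
    # Children disagree: one substitution is needed on this part of the tree
    return left_set | right_set, left_cost + right_cost + 1

if __name__ == "__main__":
    site = {"A": "G", "B": "G", "C": "T", "D": "T"}
    for tree in ((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))):
        print(tree, fitch(tree, site)[1])
```

For this toy site the first tree needs one substitution and the second needs two, so maximum parsimony would prefer the first.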
4
Q

Bayesian method

A
  • What is the most probable model, given the observed data?
  • Aims to find, among a series of competing models, the one with the highest probability given the data

Posterior probability p(model|data) = likelihood p(data|model) × prior probability p(model) / marginal likelihood p(data)

Priors: things we do not calculate but assume:
- Inherent probability of data, p(data)
- Inherent probability of model, p(model)

Based on the priors we calculate:
- Probability of model given data, p(model|data) (the posterior)
- Probability of data given model, p(data|model) (the likelihood)

5
Q

Bayesian method example

A

Queen Victoria had a haemophiliac son and 3 daughters who were confirmed carriers

No incidence of haemophilia in the royal family before Victoria

Two theories:
- Victoria carried a new mutation in the gene
- Victoria was the illegitimate daughter of Sir John Conroy

6
Q

Bayes theorem in context of Queen Victoria

A

𝑝(π‘šπ‘’π‘‘π‘Žπ‘›π‘‘|π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ)=(𝑝(π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿβ”‚π‘šπ‘’π‘‘π‘Žπ‘›π‘‘)𝑝(π‘šπ‘’π‘‘π‘Žπ‘›π‘‘))/(𝑝(π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ))
𝑝(π‘šπ‘’π‘‘π‘Žπ‘›π‘‘|π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ)=(1 βˆ— 0.00012)/0.0004 = 0.3

𝑝(π‘π‘Žπ‘ π‘‘π‘Žπ‘Ÿπ‘‘|π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ)=(𝑝(π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿβ”‚π‘π‘Žπ‘ π‘‘π‘Žπ‘Ÿπ‘‘)𝑝(π‘π‘Žπ‘ π‘‘π‘Žπ‘Ÿπ‘‘))/(𝑝(π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ))
𝑝(π‘π‘Žπ‘ π‘‘π‘Žπ‘Ÿπ‘‘|π‘π‘Žπ‘Ÿπ‘Ÿπ‘–π‘’π‘Ÿ)=(~0.0004 βˆ—0.06)/0.0004 = 0.06

So it is more probable that she was a new mutant (0.3) than illegitimate (0.06)
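The arithmetic on this card can be checked with a few lines of code. This is only a sketch: the helper name `posterior` and the variable names are hypothetical, and the probabilities are the ones quoted on the card.

```python
import math

def posterior(likelihood, prior, marginal):
    """Bayes' theorem: p(model|data) = p(data|model) * p(model) / p(data)."""
    return likelihood * prior / marginal

# Values as given on the card
P_CARRIER = 0.0004  # p(carrier), the marginal probability of the data

# Theory 1: new mutation. p(carrier|mutant) = 1, p(mutant) = 0.00012
p_mutant = posterior(1.0, 0.00012, P_CARRIER)

# Theory 2: illegitimacy. p(carrier|illegitimate) ~ 0.0004, prior = 0.06
p_illegitimate = posterior(0.0004, 0.06, P_CARRIER)

if __name__ == "__main__":
    print(round(p_mutant, 2), round(p_illegitimate, 2))
    print("ratio:", round(p_mutant / p_illegitimate, 1))
```

On these numbers the new-mutation theory comes out about five times more probable than the illegitimacy theory.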

7
Q

Is Queen Victoria's undoubted carrier status the only data we have?

A

No - if Victoria had inherited the haemophilia gene from Sir John Conroy, we would expect him to be a haemophiliac

What proportion of male carriers are asymptomatic? - None, but 30% are mild, and Conroy lived to 67 and was an army officer

8
Q

Application of Bayesian theory to phylogenetics

A

The data is the alignment you upload, plus any other fixed parameters you add e.g. tip dates, collection locations

Model is what you select from various menus in BEAST e.g. clocks

9
Q

What can a phylogenetic model consist of?

A

Substitution matrix e.g. JTT, BLOSUM, GTR, TN93

Clock model e.g. fixed, random, strict

Population model e.g. static, fast growing, fluctuating

Any other prior e.g. substitution rate, kappa, alpha parameter, initial phylogenetic tree

10
Q

Application of Bayesian theory

A

BEAST takes the model parameters and evaluates them with Bayes' equation

BEAST then makes small random adjustments to the parameters and compares the result

It iterates repeatedly, working at each stage from the model that gives the best p(data|model)

Hill-climbing algorithm - strives for a better p(data|model)
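The propose-and-compare loop described on this card can be sketched on a toy problem. Nothing here is BEAST code: `toy_log_score` is a hypothetical stand-in for log p(data|model) with a single peak at 2.0, and the step size and iteration count are arbitrary assumptions.

```python
import random

def toy_log_score(rate):
    """Hypothetical stand-in for log p(data|model); peaks at rate = 2.0."""
    return -(rate - 2.0) ** 2

def hill_climb(start, n_iter=2000, step=0.1, seed=0):
    """Adjust one parameter at random and keep only improvements."""
    rng = random.Random(seed)
    current, current_score = start, toy_log_score(start)
    for _ in range(n_iter):
        # Small random adjustment to the parameter
        proposal = current + rng.uniform(-step, step)
        score = toy_log_score(proposal)
        if score > current_score:
            # Work from the better model at the next iteration
            current, current_score = proposal, score
    return current, current_score

if __name__ == "__main__":
    best, score = hill_climb(start=0.5)
    print(round(best, 2), round(score, 4))
```

The sketch keeps only improvements, to match the hill-climbing description. BEAST itself runs Markov chain Monte Carlo, which also accepts some worse proposals so that it samples the posterior rather than climbing to a single peak.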

11
Q

Application of Bayesian theory to phylogenetics

A

BEAST may converge, meaning p(data|model) no longer improves; it must converge if you are to believe the results

BEAST then presents the posteriors - the optimised value of each parameter, including the optimised tree, usually derived as a consensus of the trees produced in the iterations after convergence

12
Q

You have the best p(data|model), but what about p(model|data)?

A

Obtained by running the analysis from various starting prior models

Compare the posterior probabilities in Tracer, e.g.
- strict vs relaxed clock
- constant size vs exponential growth
- one substitution model vs another
