Week 9 (Bayesian LR & kernel) Flashcards
(16 cards)
Define a conjugate prior over w
Common choice of prior & resulting posterior mean and variance
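One common conjugate choice, written out as a sketch (this assumes the Bishop-style zero-mean isotropic Gaussian prior with precision α, noise precision β, and design matrix Φ):

```latex
% Conjugate Gaussian prior and resulting Gaussian posterior
\begin{align*}
p(\mathbf{w} \mid \alpha) &= \mathcal{N}(\mathbf{w} \mid \mathbf{0},\ \alpha^{-1}\mathbf{I}) \\
p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta) &= \mathcal{N}(\mathbf{w} \mid \mathbf{m}_N,\ \mathbf{S}_N) \\
\mathbf{m}_N &= \beta\,\mathbf{S}_N \boldsymbol{\Phi}^{\top}\mathbf{t} \\
\mathbf{S}_N^{-1} &= \alpha\mathbf{I} + \beta\,\boldsymbol{\Phi}^{\top}\boldsymbol{\Phi}
\end{align*}
```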
Diagrams of Bayesian linear regression likelihood, posterior and data space
Form predictive distribution
We are averaging over all possible w, weighted by the posterior:
p(t|w, β) is the prediction for a given w (the term being averaged)
p(w|t, α, β) is the posterior (the weighting)
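As a worked equation (a sketch in the same notation, assuming the Gaussian prior and posterior above):

```latex
% Posterior-weighted average over w gives the predictive distribution
\begin{align*}
p(t \mid \mathbf{t}, \alpha, \beta)
  &= \int p(t \mid \mathbf{w}, \beta)\, p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)\, d\mathbf{w}
   = \mathcal{N}\!\left(t \mid \mathbf{m}_N^{\top}\boldsymbol{\phi}(\mathbf{x}),\ \sigma_N^{2}(\mathbf{x})\right) \\
\sigma_N^{2}(\mathbf{x}) &= \frac{1}{\beta} + \boldsymbol{\phi}(\mathbf{x})^{\top}\mathbf{S}_N\,\boldsymbol{\phi}(\mathbf{x})
\end{align*}
```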
Graphs of predictive distributions for sinusoidal data with Gaussian basis functions
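A minimal runnable sketch of the quantities behind these graphs (assumptions: a sin(2πx) ground truth, 9 Gaussian basis functions plus a bias, and fixed α, β; not necessarily the lecture's exact settings):

```python
import numpy as np

# --- Gaussian basis functions on [0, 1] (hypothetical settings) ---
centres = np.linspace(0, 1, 9)
width = 0.1

def phi(x):
    """Design matrix row(s): a bias term plus Gaussian basis functions."""
    x = np.atleast_1d(x)
    return np.hstack([np.ones((x.size, 1)),
                      np.exp(-0.5 * ((x[:, None] - centres) / width) ** 2)])

# --- Synthetic sinusoidal data ---
rng = np.random.default_rng(0)
alpha, beta = 2.0, 25.0                       # prior precision, noise precision (assumed fixed)
x_train = rng.uniform(0, 1, 10)
t_train = np.sin(2 * np.pi * x_train) + rng.normal(0, beta ** -0.5, x_train.size)

# --- Posterior over w: S_N^{-1} = alpha I + beta Phi^T Phi, m_N = beta S_N Phi^T t ---
Phi = phi(x_train)
S_N = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
m_N = beta * S_N @ Phi.T @ t_train

# --- Predictive mean and variance on a grid (what the graphs plot) ---
x_grid = np.linspace(0, 1, 100)
Phi_grid = phi(x_grid)
pred_mean = Phi_grid @ m_N
pred_var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_grid, S_N, Phi_grid)
print(pred_mean[:3], np.sqrt(pred_var[:3]))
```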
What is the equivalent kernel
We can write the predictive mean as a weighted sum of the target values
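In equations (a sketch, same notation as above):

```latex
% Predictive mean as a weighted sum of targets; the weights define the equivalent kernel
\begin{align*}
y(\mathbf{x}, \mathbf{m}_N)
  &= \mathbf{m}_N^{\top}\boldsymbol{\phi}(\mathbf{x})
   = \sum_{n=1}^{N} k(\mathbf{x}, \mathbf{x}_n)\, t_n \\
k(\mathbf{x}, \mathbf{x}') &= \beta\,\boldsymbol{\phi}(\mathbf{x})^{\top}\mathbf{S}_N\,\boldsymbol{\phi}(\mathbf{x}')
\end{align*}
```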
Properties of the equivalent kernel (equations written out after these cards)
Even non-local basis functions have local equivalent kernels (see image)
The kernel relates to the covariance of predictions
Normalisation
Inner product
Show the kernel as a covariance function
Kernel as a covariance function
Normalisation and inner product for the kernel
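The three properties written out (a sketch; the normalisation is over the training inputs, and the weights need not all be positive):

```latex
\begin{align*}
% The kernel gives the covariance between predictions at two inputs
\operatorname{cov}\!\left[y(\mathbf{x}),\, y(\mathbf{x}')\right]
  &= \boldsymbol{\phi}(\mathbf{x})^{\top}\mathbf{S}_N\,\boldsymbol{\phi}(\mathbf{x}')
   = \beta^{-1} k(\mathbf{x}, \mathbf{x}') \\
% Normalisation over the training set
\sum_{n=1}^{N} k(\mathbf{x}, \mathbf{x}_n) &= 1 \\
% Inner-product form
k(\mathbf{x}, \mathbf{z}) &= \boldsymbol{\psi}(\mathbf{x})^{\top}\boldsymbol{\psi}(\mathbf{z}),
\qquad
\boldsymbol{\psi}(\mathbf{x}) = \beta^{1/2}\,\mathbf{S}_N^{1/2}\boldsymbol{\phi}(\mathbf{x})
\end{align*}
```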
Model evidence
Predictive mixture distribution
*having computed p(M_i | D) (the posterior probability of model M_i)
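As equations (a sketch, assuming L candidate models M_1, ..., M_L and dataset D):

```latex
% Posterior over models and the resulting predictive mixture
\begin{align*}
p(\mathcal{M}_i \mid \mathcal{D}) &\propto p(\mathcal{M}_i)\, p(\mathcal{D} \mid \mathcal{M}_i) \\
p(t \mid \mathbf{x}, \mathcal{D})
  &= \sum_{i=1}^{L} p(t \mid \mathbf{x}, \mathcal{M}_i, \mathcal{D})\, p(\mathcal{M}_i \mid \mathcal{D})
\end{align*}
```

Here p(D | M_i) is the model evidence (marginal likelihood) for model M_i.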
Calculating model evidence/marginal likelihood
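For the linear-regression case, the evidence integrates out w. A sketch of the integral and its closed form (assuming the Gaussian prior above, M basis functions, N data points, and A = αI + βΦᵀΦ):

```latex
\begin{align*}
p(\mathbf{t} \mid \alpha, \beta) &= \int p(\mathbf{t} \mid \mathbf{w}, \beta)\, p(\mathbf{w} \mid \alpha)\, d\mathbf{w} \\
\ln p(\mathbf{t} \mid \alpha, \beta)
  &= \frac{M}{2}\ln\alpha + \frac{N}{2}\ln\beta
   - E(\mathbf{m}_N) - \frac{1}{2}\ln|\mathbf{A}| - \frac{N}{2}\ln(2\pi) \\
E(\mathbf{m}_N) &= \frac{\beta}{2}\left\|\mathbf{t} - \boldsymbol{\Phi}\mathbf{m}_N\right\|^{2}
   + \frac{\alpha}{2}\,\mathbf{m}_N^{\top}\mathbf{m}_N
\end{align*}
```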
Approximating model evidence
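One common approximation is the Occam-factor argument (a sketch; it assumes a posterior sharply peaked at w_MAP with width Δw_posterior, a flat prior of width Δw_prior, and M parameters):

```latex
\ln p(\mathcal{D})
  \simeq \ln p(\mathcal{D} \mid \mathbf{w}_{\mathrm{MAP}})
  + M \ln\!\left(\frac{\Delta w_{\text{posterior}}}{\Delta w_{\text{prior}}}\right)
```

The first term rewards data fit; the second penalises complex models whose posteriors occupy a small fraction of the prior volume.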
Full Bayesian predictive distribution
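Written out (a sketch; the hyperparameters α, β are integrated out along with w):

```latex
p(t \mid \mathbf{t})
  = \iiint p(t \mid \mathbf{w}, \beta)\,
           p(\mathbf{w} \mid \mathbf{t}, \alpha, \beta)\,
           p(\alpha, \beta \mid \mathbf{t})\;
           d\mathbf{w}\, d\alpha\, d\beta
```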
Empirical Bayes / type-2 maximum likelihood / generalised maximum likelihood / evidence approximation
Computing empirical Bayes
Here p(α, β) is assumed to be flat, so
p(α, β | t) is proportional to p(t | α, β)
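A minimal sketch of the evidence-approximation fixed-point updates (assumptions: the standard γ-based re-estimation of α and β, with Phi and t as in the earlier sketch; not necessarily the lecture's exact recipe):

```python
import numpy as np

def evidence_approximation(Phi, t, n_iter=50):
    """Iteratively maximise the marginal likelihood p(t | alpha, beta)."""
    N, M = Phi.shape
    alpha, beta = 1.0, 1.0                       # arbitrary starting values (assumption)
    eig = np.linalg.eigvalsh(Phi.T @ Phi)        # eigenvalues of Phi^T Phi
    for _ in range(n_iter):
        S_N = np.linalg.inv(alpha * np.eye(M) + beta * Phi.T @ Phi)
        m_N = beta * S_N @ Phi.T @ t
        lam = beta * eig                         # eigenvalues of beta * Phi^T Phi
        gamma = np.sum(lam / (lam + alpha))      # effective number of well-determined parameters
        alpha = gamma / (m_N @ m_N)              # re-estimate prior precision
        beta = (N - gamma) / np.sum((t - Phi @ m_N) ** 2)   # re-estimate noise precision
    return alpha, beta, m_N, S_N
```

The resulting point estimates are then plugged back in, so the full predictive distribution is approximated by p(t | t) ≈ p(t | t, α̂, β̂).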