Week 2 (MLE, MAP, Bayesian Curve, Entropy) Flashcards
(27 cards)
Stirling’s Approximation
As N → ∞
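The standard statement (as given in Bishop's PRML) is presumably the intended answer:

```latex
\ln N! \simeq N \ln N - N \quad \text{as } N \to \infty
```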
Marginal probability, Joint probability and conditional probability in the context of the grid
Where N is the total number of trials (points placed in the grid)
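Using the grid notation from Bishop (n_{ij} trials fall in cell (i, j), with column total c_i = Σ_j n_{ij}), the intended formulae are presumably:

```latex
p(X = x_i, Y = y_j) = \frac{n_{ij}}{N}, \qquad
p(X = x_i) = \frac{c_i}{N}, \qquad
p(Y = y_j \mid X = x_i) = \frac{n_{ij}}{c_i}
```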
Sum rule and product rule in context of grid
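In the same grid notation, the two rules are:

```latex
\text{sum rule:}\quad p(X) = \sum_Y p(X, Y), \qquad
\text{product rule:}\quad p(X, Y) = p(Y \mid X)\, p(X)
```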
Transformed densities formula
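For a change of variables x = g(y), the density transforms with the Jacobian factor:

```latex
p_y(y) = p_x(x) \left| \frac{dx}{dy} \right| = p_x\big(g(y)\big)\, |g'(y)|
```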
Variance and covariance formulae
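The card presumably refers to:

```latex
\mathrm{var}[x] = \mathbb{E}\big[(x - \mathbb{E}[x])^2\big] = \mathbb{E}[x^2] - \mathbb{E}[x]^2
```
```latex
\mathrm{cov}[x, y] = \mathbb{E}\big[(x - \mathbb{E}[x])(y - \mathbb{E}[y])\big] = \mathbb{E}[xy] - \mathbb{E}[x]\,\mathbb{E}[y]
```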
Multivariate Gaussian PDF
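For a D-dimensional x with mean μ and covariance Σ:

```latex
\mathcal{N}(\mathbf{x} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}) =
\frac{1}{(2\pi)^{D/2} |\boldsymbol{\Sigma}|^{1/2}}
\exp\!\left\{ -\tfrac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\mathsf{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \right\}
```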
Log Likelihood and estimators of μ and σ^2 for univariate Gaussian
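For N i.i.d. observations x_1, …, x_N, the log likelihood and its maximisers are:

```latex
\ln p(\mathbf{x} \mid \mu, \sigma^2) = -\frac{1}{2\sigma^2} \sum_{n=1}^{N} (x_n - \mu)^2 - \frac{N}{2} \ln \sigma^2 - \frac{N}{2} \ln(2\pi)
```
```latex
\mu_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} x_n, \qquad
\sigma^2_{\mathrm{ML}} = \frac{1}{N} \sum_{n=1}^{N} (x_n - \mu_{\mathrm{ML}})^2
```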
Expectation of MLEs for univariate Gaussian
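The mean estimator is unbiased; the variance estimator underestimates the true variance (hence the N − 1 correction in the unbiased sample variance):

```latex
\mathbb{E}[\mu_{\mathrm{ML}}] = \mu, \qquad
\mathbb{E}[\sigma^2_{\mathrm{ML}}] = \frac{N-1}{N}\, \sigma^2
```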
Connect the MLE to polynomial curve fitting and construct resulting likelihood function
Instead of taking the naive error-minimisation approach, we let tn = y(xn, w) + εn, where εn is Gaussian noise distributed N(0, β^{-1})
This implies the likelihood p(t | x, w, β) = Πn N(tn | y(xn, w), β^{-1})
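Taking the log of that likelihood shows why this recovers the sum-of-squares error:

```latex
\ln p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)
= -\frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2
+ \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi)
```

Maximising over w is therefore equivalent to minimising the sum-of-squares error, and maximising over β gives 1/β_ML = (1/N) Σ_n {y(x_n, w_ML) − t_n}².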
Apply MLE to polynomial curve fitting to produce predictive distribution
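The predictive distribution is p(t | x, w_ML, β_ML) = N(t | y(x, w_ML), β_ML^{-1}). A minimal numerical sketch (the toy sine data, degree, and point x* = 0.5 are illustrative assumptions, not from the cards):

```python
import numpy as np

# Hypothetical toy data: noisy samples of sin(2*pi*x)
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.shape)

M = 3  # polynomial degree (assumption)
Phi = np.vander(x, M + 1, increasing=True)  # design matrix [1, x, ..., x^M]

# w_ML: least-squares solution, which maximises the Gaussian likelihood
w_ml, *_ = np.linalg.lstsq(Phi, t, rcond=None)

# beta_ML: inverse of the mean squared residual
beta_ml = 1.0 / np.mean((Phi @ w_ml - t) ** 2)

# Predictive distribution at a new point x*: N(t | y(x*, w_ML), beta_ML^{-1})
x_new = 0.5
y_mean = np.vander([x_new], M + 1, increasing=True) @ w_ml
pred_var = 1.0 / beta_ml  # constant predictive variance under MLE
```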
How does MAP differ from MLE predictive distribution
We introduce a prior on the weights w addressing overfitting by regularisation
Process for MAP
Where p(w| α) is the Gaussian prior we assume on w
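With the zero-mean Gaussian prior p(w | α) = N(w | 0, α^{-1} I), Bayes' theorem gives the posterior, and maximising it is equivalent to regularised least squares:

```latex
p(\mathbf{w} \mid \mathbf{x}, \mathbf{t}, \alpha, \beta) \propto p(\mathbf{t} \mid \mathbf{x}, \mathbf{w}, \beta)\, p(\mathbf{w} \mid \alpha)
```
```latex
\mathbf{w}_{\mathrm{MAP}} = \arg\min_{\mathbf{w}} \;
\frac{\beta}{2} \sum_{n=1}^{N} \{ y(x_n, \mathbf{w}) - t_n \}^2 + \frac{\alpha}{2} \mathbf{w}^{\mathsf{T}} \mathbf{w}
```

i.e. ridge regression with regularisation coefficient λ = α/β.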
How does Bayesian curve fitting build on MAP
We treat w as a random variable, computing the full posterior and integrating over all possible w to make predictions
This addresses the limitations of the MLE and MAP approaches by quantifying the uncertainty in both the data (noise) and the model (weights)
Process for Bayesian curve fitting
Where p(w| x, t, α, β) is the posterior (as it is for MAP)
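The predictive distribution marginalises over w; for the polynomial/Gaussian case it comes out in closed form (Bishop's eqs. 1.69–1.72), with φ(x) = (x^0, …, x^M)^T:

```latex
p(t \mid x, \mathbf{x}, \mathbf{t}) = \int p(t \mid x, \mathbf{w})\, p(\mathbf{w} \mid \mathbf{x}, \mathbf{t})\, d\mathbf{w}
= \mathcal{N}\big(t \mid m(x), s^2(x)\big)
```
```latex
m(x) = \beta\, \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S} \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, t_n, \qquad
s^2(x) = \beta^{-1} + \boldsymbol{\phi}(x)^{\mathsf{T}} \mathbf{S}\, \boldsymbol{\phi}(x)
```
```latex
\mathbf{S}^{-1} = \alpha \mathbf{I} + \beta \sum_{n=1}^{N} \boldsymbol{\phi}(x_n)\, \boldsymbol{\phi}(x_n)^{\mathsf{T}}
```

Note s²(x) now depends on x: the β^{-1} term is the data noise, while the second term is the uncertainty in w.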
Entropy equation discrete
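For a discrete random variable x:

```latex
H[x] = -\sum_{x} p(x) \ln p(x)
```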
Cross validation, method & purpose
To select the best polynomial degree for curve fitting, split the data into training and validation sets (repeating over multiple splits); for each candidate model i = 1, …, M:
Train the model on the training set
Evaluate its error on the validation set
Choose the model with the lowest validation error
Purpose: to avoid overfitting by over-parametrising
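The steps above can be sketched as follows; the toy data, single 20/10 split, and candidate degrees 1–6 are illustrative assumptions (in practice, average over several splits):

```python
import numpy as np

# Hypothetical toy data: noisy samples of sin(2*pi*x)
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 30)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.shape)

# One train/validation split (assumption: 20 train, 10 validation)
idx = rng.permutation(len(x))
train, val = idx[:20], idx[20:]

degrees = range(1, 7)  # candidate models i = 1, ..., M
errors = []
for m in degrees:
    w = np.polyfit(x[train], t[train], deg=m)    # train on training set
    resid = np.polyval(w, x[val]) - t[val]       # evaluate on validation set
    errors.append(np.sqrt(np.mean(resid ** 2)))  # RMS validation error

# Choose the model with the lowest validation error
best_degree = list(degrees)[int(np.argmin(errors))]
```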
Differential entropy for continuous variables
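The continuous analogue replaces the sum with an integral:

```latex
H[x] = -\int p(x) \ln p(x)\, dx
```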
Differential entropy maximised for Gaussian distribution
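Among all densities with fixed mean and variance, the Gaussian maximises differential entropy, giving:

```latex
H[x] = \frac{1}{2} \left\{ 1 + \ln(2\pi \sigma^2) \right\}
```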
Conditional entropy
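The conditional entropy and its chain rule:

```latex
H[y \mid x] = -\iint p(x, y) \ln p(y \mid x)\, dy\, dx, \qquad
H[x, y] = H[y \mid x] + H[x]
```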
Def KL divergence
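The Kullback–Leibler divergence between distributions p and q:

```latex
\mathrm{KL}(p \,\|\, q) = -\int p(x) \ln \left\{ \frac{q(x)}{p(x)} \right\} dx \;\geq\; 0
```

with equality if and only if p = q; note it is not symmetric in p and q.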
Def mutual information
Mutual information I[x,y] measures information shared between x and y
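Formally, it is the KL divergence between the joint and the product of marginals (zero iff x and y are independent):

```latex
I[x, y] = \mathrm{KL}\big( p(x, y) \,\|\, p(x)\, p(y) \big)
= H[x] - H[x \mid y] = H[y] - H[y \mid x]
```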
Decision theory process
Minimum expected loss for classification
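With loss matrix L_{kj} (cost of assigning class C_j when the truth is C_k), assign x to the class j that minimises the expected loss:

```latex
j^{*} = \arg\min_{j} \sum_{k} L_{kj}\, p(C_k \mid x)
```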
Decision theory steps for regression
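For squared loss, minimising the expected loss over functions y(x) gives the conditional mean of the target:

```latex
\mathbb{E}[L] = \iint \{ y(x) - t \}^2\, p(x, t)\, dx\, dt
\;\;\Rightarrow\;\;
y(x) = \mathbb{E}_t[t \mid x] = \int t\, p(t \mid x)\, dt
```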