Questions #3 Flashcards
What is the main shortcoming of grid approximation?
It scales very poorly in high dimensions.
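A quick way to see the scaling problem is to count posterior evaluations. This is a minimal sketch, assuming a full grid with the same number of points along every parameter; the function name `grid_evaluations` and the choice of 100 points per dimension are illustrative only.

```python
def grid_evaluations(points_per_dim: int, n_params: int) -> int:
    """Number of posterior evaluations a full grid needs.

    Assumes the same number of grid points along every parameter.
    """
    return points_per_dim ** n_params


# With 100 points per parameter, the cost grows exponentially with dimension.
for n_params in (1, 2, 5, 10):
    print(n_params, grid_evaluations(100, n_params))
```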
What is the main shortcoming of quadratic approximation?
It scales better than grid approximation, but it can struggle with complex, hierarchical models.
It also fares poorly when the posterior distribution cannot be well approximated by a Gaussian distribution.
What are the 3 things we must be able to do to perform MCMC sampling with the Metropolis algorithm?
- We must be able to generate a random value from the proposal distribution, Normal(theta current, sigma^2)
- We must be able to calculate the unnormalized posterior densities
- We must be able to generate a uniform random value from 0 to 1 to accept or reject the proposed parameter value
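The three requirements above can be sketched as a small random-walk Metropolis sampler. This is only an illustration under stated assumptions: the target `log_unnormalized_posterior` is a hypothetical standard normal (up to a constant), and the names `metropolis`, `proposal_sd`, and `seed` are made up for this example.

```python
import math
import random


def log_unnormalized_posterior(theta: float) -> float:
    """Hypothetical target: a standard normal, known only up to a constant."""
    return -0.5 * theta ** 2


def metropolis(n_samples, theta_init, proposal_sd, seed=0):
    """Random-walk Metropolis; returns (samples, acceptance_rate)."""
    rng = random.Random(seed)
    theta = theta_init
    samples = []
    n_accepted = 0
    for _ in range(n_samples):
        # 1. Draw a proposal from Normal(theta_current, proposal_sd^2).
        proposal = rng.gauss(theta, proposal_sd)
        # 2. Compare unnormalized posterior densities (on the log scale).
        log_ratio = (log_unnormalized_posterior(proposal)
                     - log_unnormalized_posterior(theta))
        # 3. Draw a Uniform(0, 1) value to accept or reject the proposal.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta = proposal
            n_accepted += 1
        # A rejected proposal repeats the current value in the chain.
        samples.append(theta)
    return samples, n_accepted / n_samples
```

Calling `metropolis` with a very small and then a very large `proposal_sd` and comparing the returned acceptance rates is one way to see how the proposal width affects the chain.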
In the trace plot, if the standard deviation of the proposal distribution is too low, what will happen?
The chain will move in very small steps, so it will take longer to reach and explore the region of high posterior density.
In the trace plot, if the standard deviation of the proposal distribution is too high, what will happen?
It will generate far-away proposals that will usually get rejected, so the chain won’t explore the posterior distribution well. (blocky, robot-like trace plot)
What is the difference between Metropolis and Metropolis-Hastings?
Metropolis-Hastings generalizes the Metropolis algorithm by allowing asymmetric proposals.
True or false: We need a symmetric proposal distribution in the Metropolis algorithm
True
True or false: In the Gibbs algorithm, we always accept the proposal
True
What are the advantages of Gibbs ?
Efficiency in sampling from the posterior and no tuning of the proposal distribution
What are the disadvantages of Gibbs?
- It requires the ability to compute and sample from the conditional posterior distributions
- Sampling efficiency is poor in models with correlated parameters
True or false: If the parameters are graphed in n-dimensional space, the Metropolis algorithm's movements can be in any direction, while the Gibbs movements are always parallel to the axes
True
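The axis-parallel movement can be seen in a small Gibbs sampler. This is a sketch only, assuming a hypothetical target that is a bivariate standard normal with correlation `rho`, for which each conditional is Normal(rho * other, 1 - rho^2); the function name `gibbs_bivariate_normal` is illustrative.

```python
import math
import random


def gibbs_bivariate_normal(n_samples, rho, seed=0):
    """Gibbs sampler for a bivariate standard normal with correlation rho.

    Returns the full path, recording the state after every single-parameter
    update so that each move changes only one coordinate.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_samples):
        # Draw x from its conditional given y; the draw is always accepted.
        x = rng.gauss(rho * y, cond_sd)
        path.append((x, y))
        # Draw y from its conditional given the updated x.
        y = rng.gauss(rho * x, cond_sd)
        path.append((x, y))
    return path


path = gibbs_bivariate_normal(1000, rho=0.9)
```

With `rho` close to 1 the conditional standard deviation becomes small, so each axis-parallel step is short. This is one way to picture why Gibbs is less efficient with correlated parameters.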
Write the formulas useful for HMC
- Formula for momentum
- Formula for Theta
- Formula for the probability of acceptance
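The cards do not state the formulas themselves. As a reminder, here is the commonly used leapfrog form of HMC. The notation is an assumption and may differ from the course's: theta is the parameter vector, phi the momentum, epsilon the step size, y the data, and the momentum is drawn from Normal(0, I) (unit mass matrix).

```latex
% Assumed notation: \theta = parameters, \phi = momentum, \epsilon = step size,
% y = data, and \phi \sim \mathrm{Normal}(0, I) at the start of each iteration.
\begin{align*}
\phi   &\leftarrow \phi + \tfrac{\epsilon}{2}\,\nabla_\theta \log p(\theta \mid y)
       && \text{(half-step momentum update)} \\
\theta &\leftarrow \theta + \epsilon\,\phi
       && \text{(full-step update of theta)} \\
\phi   &\leftarrow \phi + \tfrac{\epsilon}{2}\,\nabla_\theta \log p(\theta \mid y)
       && \text{(second half-step, gradient at the new } \theta\text{)} \\
p_{\text{accept}} &= \min\!\left(1,\;
       \frac{p(\theta^{*} \mid y)\,p(\phi^{*})}{p(\theta \mid y)\,p(\phi)}\right)
       && \text{(starred values are the proposal)}
\end{align*}
```

Because the acceptance probability is a ratio, the unnormalized posterior density can be used in place of p(theta | y).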
What is the consequence in HMC if s is too low?
The proposal distribution is too concentrated, so the trajectory does not have sufficient time to move into the region of high posterior mass
What is the consequence in HMC if s is too high?
Lower acceptance rates, since the proposals are too far away from the mode of the distribution
It can also result in the U-turn problem
True or false: In MCMC, a relatively small number of samples is enough if we only want the posterior mean
True
True or false: In MCMC, a small number of samples is enough if we want the posterior variance and the extreme percentiles
False. We need a lot of samples.
In a trace plot, if the chains are all representative of the posterior, they should … (2)
- They should overlap each other and be unrelated to their randomly set starting positions
- They should also be stationary around the same modal value
Can you give a reason why the chains in the trace plot are not mixing well?
The prior is too flat
What do we need to do if a chain in the trace plot is isolated from the others?
- Try to run more samples
- Check our model definition, implementation method, or input data for potential issues
How can we know how correlated the proposed parameters are through the sampling iterations in MCMC?
With an autocorrelation plot
True or false: A correlogram doesn’t necessarily indicate whether the chain is representative of the posterior distribution, but it will give us a sense of the efficiency of our MCMC algorithm
True
True or false: HMC has low autocorrelation if the parameters are well-tuned
True
True or false: A greater number of iterations will be needed to explore the posterior distribution if there is a lot of autocorrelation
True
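The values shown in an autocorrelation plot can be computed directly from a chain. This is a minimal sketch: the function `autocorrelation` and the two simulated chains are illustrative assumptions, not output from a real sampler, and the chain is assumed not to be constant.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at a given non-negative lag.

    Assumes the chain is not constant (its variance must be non-zero).
    """
    n = len(chain)
    mean = sum(chain) / n
    total_var = sum((x - mean) ** 2 for x in chain)
    if lag == 0:
        return 1.0
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag))
    return cov / total_var


rng = random.Random(1)

# Hypothetical "sticky" chain: each value stays close to the previous one.
sticky = [0.0]
for _ in range(4999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0.0, 1.0))

# Hypothetical well-mixing chain: independent draws.
independent = [rng.gauss(0.0, 1.0) for _ in range(5000)]

for lag in (1, 5, 20):
    print(lag, autocorrelation(sticky, lag), autocorrelation(independent, lag))
```

The sticky chain keeps a high autocorrelation over many lags, which is the situation where more iterations are needed to explore the posterior.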
What are the 2 forms of binomial regression?
- Logistic regression. Each record in the data set indicates whether an event occurred or didn’t. We are estimating the probability of an event
- Aggregated binomial regression. Each record in the data set states the size of the population and the number of events that occurred
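The two forms differ mainly in how the data are laid out. The sketch below converts record-level 0/1 data into the aggregated layout; the `(group, outcome)` record format and the function name `aggregate` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def aggregate(records):
    """Convert record-level data into aggregated binomial data.

    records: iterable of (group, outcome) pairs, with outcome 1 if the
             event occurred and 0 if it did not.
    returns: dict mapping group -> (population_size, number_of_events)
    """
    totals = defaultdict(lambda: [0, 0])
    for group, outcome in records:
        totals[group][0] += 1        # one more record in this group
        totals[group][1] += outcome  # count the event if it occurred
    return {group: (n, k) for group, (n, k) in totals.items()}


# Record-level layout: one row per individual.
records = [("a", 1), ("a", 0), ("a", 1), ("b", 0), ("b", 0)]

# Aggregated layout: one row per group.
print(aggregate(records))
```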