Critical Reflection and Future Work Flashcards

(18 cards)

1
Q

If you had more time, what extensions would you explore?

A
  • spatio-temporal models - jointly modelling changes over time and space (combining the work of Chapters 4 and 5)
  • incorporating covariate information into the models to improve fit
2
Q

How do you plan to improve scalability of your models?

A
  • exploring alternative MCMC approaches that could be more computationally efficient
  • parallelising the models further
3
Q

How would you change the priors to reflect stronger prior knowledge?

A
  • replace the weakly informative priors with informative priors that encode the stronger prior knowledge.

For example, a Beta distribution could be used as a more flexible prior (instead of a Uniform), with shape parameters chosen to reflect known tendencies in the compositions, such as higher probabilities for certain elements being present, as in the sketch below.
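A minimal sketch of this swap, assuming a NIMBLE-style model (NIMBLE is mentioned later in these cards); the parameter name theta, the Beta(2, 8) values, and the binomial likelihood are illustrative placeholders, not the thesis parameterisation:

```r
library(nimble)

# Illustrative only: replace a weakly informative Uniform(0, 1) prior with an
# informative Beta prior. Beta(2, 8) has prior mean 0.2, a hypothetical way of
# encoding the belief that an element tends to be present in small proportions.
beta_prior_code <- nimbleCode({
  # theta ~ dunif(0, 1)        # weakly informative choice
  theta ~ dbeta(2, 8)          # informative choice reflecting prior knowledge
  for (i in 1:n) {
    y[i] ~ dbin(theta, N[i])   # illustrative binomial likelihood
  }
})
```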

4
Q

What are the limitations of using the Dirichlet prior?

A
  • although it is defined on the simplex and commonly used for compositional data:
  • the standard Dirichlet imposes a fixed correlation structure and cannot represent positive correlations between components (see the expression below)
  • it cannot accommodate exact zero values
  • the symmetric Dirichlet treats all components as exchangeable, which may be unrealistic in real-world settings
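For reference (a standard property of the distribution, not specific to the thesis models): if $X \sim \text{Dirichlet}(\alpha_1, \dots, \alpha_K)$ with $\alpha_0 = \sum_k \alpha_k$, then

$$
\mathbb{E}[X_i] = \frac{\alpha_i}{\alpha_0}, \qquad
\operatorname{Cov}(X_i, X_j) = \frac{-\alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)} \quad (i \neq j),
$$

so every pairwise covariance is negative and fully determined by the $\alpha_k$, leaving no freedom to encode positive association between components.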
5
Q

Could machine learning approaches be incorporated into your frameworks?

A

Something I have not looked into, but:
* Neural networks or Gaussian processes could be used for complex non-linear covariate effects
* Deep generative models could simulate synthetic compositional datasets for training or augmentation

6
Q

What did you learn about modelling trade-offs from this research?

A
  • model flexibility (e.g., GDM, hierarchical structures) often comes at the cost of computational complexity - simpler models may be quicker but do not perform as well
  • interpretability and model checking are critical and worth the extra time to ensure the model is performing well
7
Q

How could your models be extended to include covariates?

A
  • Forensic: covariates could be added to the Bayesian hierarchical model to aid the classification of glass items
  • COVID-19: covariates could be included to inform the transition probabilities
  • Trees: environmental or demographic covariates could improve prediction accuracy (a sketch of one way a covariate could enter is given below)
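A minimal illustrative sketch, assuming a NIMBLE-style formulation; the names (x, beta0, beta1, alpha, K, n) are hypothetical placeholders rather than the parameterisation of any of the three applications:

```r
library(nimble)

# Illustrative only: a single covariate x[i] entering a Dirichlet-type model
# through a log link on the concentration parameters.
covariate_code <- nimbleCode({
  for (k in 1:K) {
    beta0[k] ~ dnorm(0, sd = 1)
    beta1[k] ~ dnorm(0, sd = 1)
  }
  for (i in 1:n) {
    for (k in 1:K) {
      log(alpha[i, k]) <- beta0[k] + beta1[k] * x[i]   # covariate effect
    }
    y[i, 1:K] ~ ddirch(alpha[i, 1:K])                  # compositional response
  }
})
```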
8
Q

What other data types could your models be applied to?

A
  • Ecological / environmental data (e.g., species compositions)
  • Electoral or marketing shares across regions
  • Finance or investment allocations, where proportions change dynamically
  • Disease incidence
9
Q

How could your frameworks be adapted for longitudinal compositional data?

A
  • extend to include temporal correlation structures (e.g., autoregressive latent effects, as sketched below)
  • use dynamic latent variables to model long-term trends
  • adapt the GDM to incorporate time-dependent information
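As an illustration (the notation here is assumed, not taken from the thesis), an AR(1) latent effect could drive the compositional parameters:

$$
\eta_{t,k} = \rho\, \eta_{t-1,k} + \epsilon_{t,k}, \qquad \epsilon_{t,k} \sim \mathcal{N}(0, \sigma^2), \qquad
\log \alpha_{t,k} = \mu_k + \eta_{t,k},
$$

so temporal correlation in the latent $\eta_{t,k}$ induces smooth evolution of the composition over time.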
10
Q

What directions would you suggest for future research?

A
  • adding covariate information
  • combining the work of Chapters 4 and 5 into spatio-temporal models
  • reducing the computational cost using alternative samplers or methods
11
Q

How would you approach reducing computational cost?

A
  • explore alternative samplers or MCMC methods
12
Q

How might variational inference or INLA help in future work?

A
  • potentially offer faster approximate inference
  • could be used to pre-screen models or parameters before refining with full MCMC
13
Q

How can your models support real-time decision-making?

A
  • using approximate inference for quick updates and integrating model outputs into dashboards
  • decision-support systems to aid real-time decision-making
14
Q

What would you do differently if you started this project again?

A
  • I spent a lot of time trying to replicate and improve the computational time of the Napier model manually in R. If I could start again, I would focus less on this and look initially for packages that implement Bayesian models in a flexible and efficient way (e.g., NIMBLE)
  • I tried to focus on compositional data in its ‘original’ form. If I could start again, I would change my mindset about how to analyse compositional data (e.g., on a relative or absolute scale) earlier, for example by treating log-ratio transformations as the key to compositional data
15
Q

Where are the biggest risks of bias in your models?

A
  • Priors: although weakly informative, they may still subtly shape inference with small or sparse data
  • Data imbalance: class imbalance across categories can pull results towards the larger groups
16
Q

What are the limitations of using counts in compositional models?

A
  • counts are not scale-invariant, so comparisons across samples with different total counts require care
  • overdispersion commonly occurs in count data, where the variance is higher than the model expects; standard models may underestimate this extra variability (see the expression below)
  • missing counts may bias estimates if not carefully accounted for
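For instance, under a multinomial model the count variance is fixed at $n\, p_k (1 - p_k)$, whereas a Dirichlet-multinomial with total concentration $\alpha_0$ inflates it by a factor the multinomial cannot capture:

$$
\operatorname{Var}(Y_k) = n\, p_k (1 - p_k)\, \frac{n + \alpha_0}{1 + \alpha_0}.
$$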
17
Q

How robust are your conclusions to data irregularities?

A

My Bayesian hierarchical frameworks were designed to be robust to:
* zeros
* small sample sizes
* overdispersion

Posterior predictive checks and calibration metrics showed that the models remained well calibrated even under sparse or noisy data, but extreme class imbalance could still affect performance.

18
Q

How did you verify that your models were identifying structure and not overfitting?

A
  • cross-validation across multiple folds
  • posterior predictive checks to compare observed and replicated data
  • performance metrics (e.g., MAE, Brier score, ECE) to compare predictions with the observed data (a small sketch of the Brier score calculation follows this list)
  • visualisation of latent structure (e.g., clusters, state transitions) to ensure learned patterns matched known domain structure (e.g., variant surges, spatial gradients)
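A generic R sketch of the Brier score and a crude calibration check, using simulated placeholder data rather than the thesis datasets:

```r
# Illustrative only: Brier score for binary predictions, mean((p - y)^2);
# lower is better.
set.seed(1)
p <- runif(200)                       # predicted probabilities (placeholder)
y <- rbinom(200, size = 1, prob = p)  # simulated outcomes
brier <- mean((p - y)^2)

# Crude calibration check: compare mean prediction with observed event rate
# within probability bins (a rough version of what ECE summarises).
bins <- cut(p, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
calib <- aggregate(cbind(pred = p, obs = y), by = list(bin = bins), FUN = mean)
print(brier)
print(calib)
```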