Critical Reflection and Future Work Flashcards

(18 cards)

1
Q

If you had more time, what extensions would you explore?

A
  • spatio-temporal models - jointly modelling changes over time and space (combining the work of Chapters 4 and 5)
  • incorporating covariate information into the models to improve fit
2
Q

How do you plan to improve scalability of your models?

A
  • exploring alternative MCMC approaches that could be more computationally efficient
  • parallelising the models further
3
Q

How would you change the priors to reflect stronger prior knowledge?

A
  • replace the weakly informative priors with informative priors that encode the stronger prior knowledge.

For example, a Beta distribution could be used as a more flexible prior (instead of a Uniform), with shape parameters chosen to reflect known tendencies in the compositions, such as higher probabilities for certain elements being present, as in the sketch below.
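A minimal sketch of this swap, assuming a NIMBLE-style model (NIMBLE is mentioned later in these cards); the parameter name theta, the Beta(2, 8) values, and the binomial likelihood are illustrative placeholders, not the thesis parameterisation:

```r
library(nimble)

# Illustrative only: replace a weakly informative Uniform(0, 1) prior with an
# informative Beta prior. Beta(2, 8) has prior mean 0.2, a hypothetical way of
# encoding the belief that an element tends to be present in small proportions.
beta_prior_code <- nimbleCode({
  # theta ~ dunif(0, 1)        # weakly informative choice
  theta ~ dbeta(2, 8)          # informative choice reflecting prior knowledge
  for (i in 1:n) {
    y[i] ~ dbin(theta, N[i])   # illustrative binomial likelihood
  }
})
```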

4
Q

What are the limitations of using the Dirichlet prior?

A
  • although it is defined on the simplex and commonly used for compositional data:
  • the standard Dirichlet imposes a fixed correlation structure and cannot represent positive correlations between components (see the expression below)
  • it cannot accommodate exact zero values
  • the symmetric Dirichlet treats all components as exchangeable, which may be unrealistic in real-world settings
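For reference (a standard property of the distribution, not specific to the thesis models): if $X \sim \text{Dirichlet}(\alpha_1, \dots, \alpha_K)$ with $\alpha_0 = \sum_k \alpha_k$, then

$$
\mathbb{E}[X_i] = \frac{\alpha_i}{\alpha_0}, \qquad
\operatorname{Cov}(X_i, X_j) = \frac{-\alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)} \quad (i \neq j),
$$

so every pairwise covariance is negative and fully determined by the $\alpha_k$, leaving no freedom to encode positive association between components.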
5
Q

Could machine learning approaches be incorporated into your frameworks?

A

Something I have not looked into, but:
* Neural networks or Gaussian processes could be used for complex non-linear covariate effects
* Deep generative models could simulate synthetic compositional datasets for training or augmentation

6
Q

What did you learn about modelling trade-offs from this research?

A
  • model flexibility (e.g., GDM, hierarchical structures) often comes at the cost of computational complexity - simpler models may be quicker but do not perform as well
  • interpretability and model checking are critical and worth the extra time to ensure the model is performing well
7
Q

How could your models be extended to include covariates?

A
  • Forensic: covariates could be added to the Bayesian hierarchical model to aid the classification of glass items
  • COVID-19: covariates could be included to inform the transition probabilities
  • Trees: environmental or demographic covariates could improve prediction accuracy (a sketch of one way a covariate could enter is given below)
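A minimal illustrative sketch, assuming a NIMBLE-style formulation; the names (x, beta0, beta1, alpha, K, n) are hypothetical placeholders rather than the parameterisation of any of the three applications:

```r
library(nimble)

# Illustrative only: a single covariate x[i] entering a Dirichlet-type model
# through a log link on the concentration parameters.
covariate_code <- nimbleCode({
  for (k in 1:K) {
    beta0[k] ~ dnorm(0, sd = 1)
    beta1[k] ~ dnorm(0, sd = 1)
  }
  for (i in 1:n) {
    for (k in 1:K) {
      log(alpha[i, k]) <- beta0[k] + beta1[k] * x[i]   # covariate effect
    }
    y[i, 1:K] ~ ddirch(alpha[i, 1:K])                  # compositional response
  }
})
```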
8
Q

What other data types could your models be applied to?

A
  • Ecological / environmental data (e.g., species compositions)
  • Electoral or marketing shares across regions
  • Finance or investment allocations, where proportions change dynamically
  • Disease incidence
9
Q

How could your frameworks be adapted for longitudinal compositional data?

A
  • extend to include temporal correlation structures (e.g., autoregressive latent effects, as sketched below)
  • use dynamic latent variables to model long-term trends
  • adapt the GDM to incorporate time-dependent information
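As an illustration (the notation here is assumed, not taken from the thesis), an AR(1) latent effect could drive the compositional parameters:

$$
\eta_{t,k} = \rho\, \eta_{t-1,k} + \epsilon_{t,k}, \qquad \epsilon_{t,k} \sim \mathcal{N}(0, \sigma^2), \qquad
\log \alpha_{t,k} = \mu_k + \eta_{t,k},
$$

so temporal correlation in the latent $\eta_{t,k}$ induces smooth evolution of the composition over time.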
10
Q

What directions would you suggest for future research?

A
  • adding covariate information
  • combining the work of Chapters 4 and 5 into spatio-temporal models
  • reducing the computational cost using alternative samplers or methods
11
Q

How would you approach reducing computational cost?

A
  • explore alternative samplers or MCMC methods
12
Q

How might variational inference or INLA help in future work?

A
  • potentially offer faster approximate inference
  • could be used to pre-screen models or parameters before refining with full MCMC
13
Q

How can your models support real-time decision-making?

A
  • using approximate inference for quick updates and integrating model outputs into dashboards
  • decision-support systems to aid real-time decision-making
14
Q

What would you do differently if you started this project again?

A
  • I spent a lot of time trying to replicate and improve the computational time of the Napier model manually in R. If I could start again, I would focus less on this and look initially for packages that implement Bayesian models in a flexible and efficient way (e.g., NIMBLE)
  • I tried to focus on compositional data in its ‘original’ form. If I could start again, I would change my mindset about how to analyse compositional data (e.g., on a relative or absolute scale) earlier, for example by treating log-ratio transformations as the key to compositional data
15
Q

Where are the biggest risks of bias in your models?

A
  • Priors: although weakly informative, they may still subtly shape inference with small or sparse data
  • Data imbalance: class imbalance across categories can pull results towards the larger groups
16
Q

What are the limitations of using counts in compositional models?

A
  • counts are not scale-invariant, so comparisons across samples with different total counts require care
  • overdispersion commonly occurs in count data, where the variance is higher than the model expects; standard models may underestimate this extra variability (see the expression below)
  • missing counts may bias estimates if not carefully accounted for
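For instance, under a multinomial model the count variance is fixed at $n\, p_k (1 - p_k)$, whereas a Dirichlet-multinomial with total concentration $\alpha_0$ inflates it by a factor the multinomial cannot capture:

$$
\operatorname{Var}(Y_k) = n\, p_k (1 - p_k)\, \frac{n + \alpha_0}{1 + \alpha_0}.
$$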
17
Q

How robust are your conclusions to data irregularities?

A

My Bayesian hierarchical frameworks were designed to be robust to:
* zeros
* small sample sizes
* overdispersion

Posterior predictive checks and calibration metrics showed that the models remained well calibrated even under sparse or noisy data, but extreme class imbalance could still affect performance.

18
Q

How did you verify that your models were identifying structure and not overfitting?

A
  • cross-validation across multiple folds
  • posterior predictive checks to compare observed and replicated data
  • performance metrics (e.g., MAE, Brier score, ECE) to compare predictions with the observed data (a small sketch of the Brier score calculation follows this list)
  • visualisation of latent structure (e.g., clusters, state transitions) to ensure learned patterns matched known domain structure (e.g., variant surges, spatial gradients)
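A generic R sketch of the Brier score and a crude calibration check, using simulated placeholder data rather than the thesis datasets:

```r
# Illustrative only: Brier score for binary predictions, mean((p - y)^2);
# lower is better.
set.seed(1)
p <- runif(200)                       # predicted probabilities (placeholder)
y <- rbinom(200, size = 1, prob = p)  # simulated outcomes
brier <- mean((p - y)^2)

# Crude calibration check: compare mean prediction with observed event rate
# within probability bins (a rough version of what ECE summarises).
bins <- cut(p, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
calib <- aggregate(cbind(pred = p, obs = y), by = list(bin = bins), FUN = mean)
print(brier)
print(calib)
```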