Chapter 4 - COVID-19 Flashcards

(22 cards)

1
Q

What are the key challenges in compositional time series?

A
  • traditional time series methods cannot account for the compositional nature of the data
  • need methods that account for the temporal dependence of the time series
  • non-smooth time series, with fluctuations and abrupt changes
2
Q

Why is the COVID-19 variant dataset considered compositional?

A
  • data tracks counts of the COVID-19 variants (VOC) over time
  • each country has weekly counts of each variant and the total count per week
  • interest lies in relative information of each variant in addition to the absolute count
  • inherently makes the data compositional, even though it originates from counts.
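A minimal sketch of the "relative information" idea, assuming made-up weekly counts for three variants in one country (the numbers and the name `to_proportions` are illustrative, not from the dataset): each week's counts are divided by that week's total, so every row becomes a composition that sums to one.

```python
# Hypothetical weekly counts for one country: rows are weeks, columns are variants.
weekly_counts = [
    [8, 2, 0],
    [5, 5, 0],
    [1, 6, 3],
]


def to_proportions(counts):
    """Convert one week's variant counts into proportions of that week's total."""
    total = sum(counts)
    if total == 0:
        # No sequences reported that week, so the composition is undefined.
        return [float("nan")] * len(counts)
    return [c / total for c in counts]


# Each row now holds the relative share of every variant in that week.
proportions = [to_proportions(week) for week in weekly_counts]
```

The zero in the third column of the first two weeks is kept as a zero share, which is the situation the later cards on zero counts refer to.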
3
Q

Why did you use a Hidden Markov Model (HMM)?

A
  • time series model for data arranged over regular time intervals
  • models the temporal dependence of an observable outcome through the temporal evolution of a latent state sequence
  • can handle abrupt changes and fluctuations in the time series
4
Q

How did you account for the non-smooth nature of the COVID-19 time series?

A
  • HMM structure allows for discrete state transitions, capturing changes rather than assuming smooth temporal evolution
  • model does not enforce smoothing, making it well-suited to non-smooth time series
5
Q

What do the hidden states in your HMM model for COVID-19 represent?

A
  • State 1 - Dormant before outbreak
  • State 2 - Active and increasing
  • State 3 - Dominant
  • State 4 - Active and decreasing
  • State 5 - Dormant after outbreak
6
Q

What is the GDM-HMM and how does it work?

A
  • GDM-HMM models the time series using the Generalised-Dirichlet-Multinomial (GDM) distribution with a Hidden Markov Model (HMM) structure
  • at each time point, the observed variant counts are modelled using a GDM distribution, conditional on the hidden state
  • within each variant, each state has its own GDM parameters, capturing distinct variant profiles
  • transitions between states are governed by a Markov process, allowing the model to flexibly capture shifts in variant dynamics over time
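The Markov part of this structure can be sketched as follows. This is only an illustration: the state labels follow the five states on the earlier card, but the transition matrix `TRANSITIONS`, its left-to-right shape, and the function `simulate_states` are assumptions made for the example, not the fitted model.

```python
import random

STATES = ["dormant_before", "increasing", "dominant", "decreasing", "dormant_after"]

# Hypothetical transition matrix: row i gives P(next state | current state i).
# The left-to-right shape (stay, or move one state forward) is an assumption.
TRANSITIONS = [
    [0.90, 0.10, 0.00, 0.00, 0.00],
    [0.00, 0.80, 0.20, 0.00, 0.00],
    [0.00, 0.00, 0.85, 0.15, 0.00],
    [0.00, 0.00, 0.00, 0.80, 0.20],
    [0.00, 0.00, 0.00, 0.00, 1.00],
]


def simulate_states(n_weeks, rng):
    """Simulate a hidden state path, starting in the first (dormant) state."""
    state = 0
    path = [state]
    for _ in range(n_weeks - 1):
        # The next state depends only on the current state (Markov property).
        state = rng.choices(range(len(STATES)), weights=TRANSITIONS[state])[0]
        path.append(state)
    return path


path = simulate_states(30, random.Random(1))
labels = [STATES[s] for s in path]
```

In the full model each simulated state would then select the GDM parameters used to generate that week's variant counts.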
7
Q

How did you validate your model with COVID-19 data?

A
  • posterior predictive model checking
  • simulated datasets were compared to the observed data
  • a moving-window approach was used to check whether the trends in the simulated datasets matched those of the original time series
8
Q

How did you use clustering with the COVID-19 data?

A
  • group countries with similar temporal variant dynamics
  • each cluster shared a set of HMM parameters, allowing borrowing of strength across countries
9
Q

How did you model the zero counts and count structure?

A
  • as the GDM can handle zero values, the zero counts were directly modelled
  • critical in early time points or rare variants, where zeros dominate
10
Q

Why are standard time series models insufficient for your data?

A
  • traditional time series methods do not have the facility to account for the compositional nature, struggling with the zero values and count structure
11
Q

What are the implications of using this model for public health surveillance?

A
  • GDM-HMM enables early detection of shifts in variant prevalence, helping public health officials monitor transitions such as the emergence of new variants
  • by providing probabilistic state estimates and forecasts, it supports data-driven decision-making during dynamic outbreaks, especially when absolute case counts vary and sequencing coverage is inconsistent
12
Q

What is the Generalised Dirichlet Multinomial (GDM) distribution, and why did you use it?

A
  • GDM distribution combines the Generalised-Dirichlet (GD) and Multinomial distributions, allowing more flexible covariance structures and overdispersion
  • provides greater flexibility than simpler alternatives: the Multinomial explains some of the variability, while the GD component flexibly explains the remaining random variability, so different variance patterns in real-world data are captured well
  • values that represent 0% and 100% of the total can naturally arise from the GDM
13
Q

How does your Hidden Markov Model differ from traditional time series approaches?

A
  • discrete latent states to understand the evolution of the time series
  • allowing for abrupt changes in the time series
  • paired with a GDM observation model, capturing counts, overdispersion and compositional structure
14
Q

What were the key evaluation metrics used to assess model performance?

A
  • mean absolute error (MAE) of the median and standard deviation
  • each is compared to the equivalent value from the original data
  • low MAE value indicates a closer match, which in this case may mean that the non-smooth temporal structure is being captured better
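One way such a comparison could be computed is sketched below. The exact windowing and averaging used in the analysis are not stated on this card, so the window length, the toy series, and the helper names `rolling` and `mae` are all assumptions for illustration.

```python
from statistics import median, stdev


def rolling(values, window, stat):
    """Apply a summary statistic to each consecutive window of the series."""
    return [stat(values[i:i + window]) for i in range(len(values) - window + 1)]


def mae(a, b):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical variant proportions: observed data and one simulated replicate.
observed = [0.0, 0.1, 0.4, 0.8, 0.9, 0.6, 0.2]
replicate = [0.0, 0.2, 0.5, 0.7, 0.9, 0.5, 0.1]

window = 3  # assumed window length
mae_median = mae(rolling(observed, window, median), rolling(replicate, window, median))
mae_sd = mae(rolling(observed, window, stdev), rolling(replicate, window, stdev))
```

With many posterior predictive replicates, the same two numbers would be computed per replicate and then summarised across replicates.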
15
Q

How did you ensure fair comparisons between your models?

A
  • all models applied on the same data
  • same GDM model specification was used for all models compared, with only the specification of the time series model changing
16
Q

In what scenarios did your methods outperform comparison methods the most?

A
  • GDM-HMM effectively characterised the temporal evolution of each variant through the latent state sequence
  • accurately captured the corresponding statistic from the original data and produced replicate values that could be plausible in real-world scenarios
17
Q

What practical insights can be drawn from the results of your COVID-19 variant analysis?

A
  • identified key transition points in variant evolution, providing insights for public health officials
  • showed that cluster-specific information is vital in understanding variant dynamics
18
Q

How generalisable are your methods to other types of compositional data?

A

  • compositional time series: market share evolution in business analytics; ecological time series, tracking species proportions over time
  • framework is broadly useful for compositional count data with temporal dynamics, especially when transitions are sudden or non-smooth

19
Q

What are the main limitations of your approaches?

A
  • Computational cost - running the GDM-HMM model
  • No covariate information - does not directly model covariates influencing transitions (e.g., policy changes or seasonality), could be extended
20
Q

Explain the rationale for using Beta-Binomial distributions.

A
  • Beta-Binomial distributions come from combining the GD and Multinomial distributions
  • GD - a series of independent scaled Beta distributions
  • Multinomial - a series of conditional Binomial distributions
  • Multinomial explains some of the variability, while the GD component flexibly explains the remaining random variability, so different variance patterns in real-world data are captured well
  • values that represent 0% and 100% of the total can naturally arise from the GDM
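The decomposition above can be sketched as a sampler: for each category in turn, a Beta draw gives the share of the remaining total, and a Binomial draw with that probability gives the count, which is a Beta-Binomial once the Beta draw is integrated out. The function name `sample_gdm` and the parameter values are hypothetical, chosen only to illustrate the construction.

```python
import random


def sample_gdm(total, alphas, betas, rng):
    """Draw one vector of counts over len(alphas) + 1 categories."""
    remaining = total
    counts = []
    for a, b in zip(alphas, betas):
        # Beta draw: probability of this category among what is still unallocated.
        v = rng.betavariate(a, b)
        # Binomial(remaining, v) draw, written as a sum of Bernoulli trials.
        y = sum(rng.random() < v for _ in range(remaining))
        counts.append(y)
        remaining -= y
    # The final category takes whatever is left of the total.
    counts.append(remaining)
    return counts


counts = sample_gdm(100, alphas=[2.0, 1.0, 0.5], betas=[5.0, 3.0, 0.5], rng=random.Random(0))
```

Because a Binomial draw can return zero or the whole remaining total, counts of 0% and 100% arise without any special handling.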
21
Q

How does the forward algorithm work in your HMM model?

A
  • forward algorithm is an analytical solution to integrating out the latent state sequence
  • computes the marginal likelihood of the observed sequence by recursively calculating the probability of being in each state at time t given the observations up to that time
  • uses the transition probabilities to propagate the state probabilities from one time point to the next
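A minimal sketch of the scaled forward recursion, assuming the per-state likelihood of each observation has already been evaluated (in the GDM-HMM this would be the GDM probability of that week's counts under each state). The function name, argument layout, and toy numbers are assumptions for illustration.

```python
import math


def forward_log_likelihood(initial, transition, emission_probs):
    """Return log p(observations), summed over all hidden state paths.

    initial[k]           : P(state at time 1 = k)
    transition[j][k]     : P(state at t = k | state at t-1 = j)
    emission_probs[t][k] : p(observation at t | state at t = k)
    """
    n_states = len(initial)
    # Time 1: joint probability of each state and the first observation.
    alpha = [initial[k] * emission_probs[0][k] for k in range(n_states)]
    log_lik = 0.0
    for t in range(len(emission_probs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[j] * transition[j][k] for j in range(n_states))
                * emission_probs[t][k]
                for k in range(n_states)
            ]
        # Rescale to avoid underflow; the scale factors multiply to the likelihood.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


# Toy example: two states, two time points.
ll = forward_log_likelihood(
    initial=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.2, 0.8]],
    emission_probs=[[0.5, 0.1], [0.4, 0.3]],
)
```

After rescaling, `alpha` holds the probability of each state given the observations so far, which is why no explicit sum over all state paths is needed.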
22
Q

Describe your model’s latent structure and its inference.

A
  • sequence of hidden states
  • state-specific GDM parameters for each variant