PhD Flashcards
1
Q
Zeros
why not log-ratio
A
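- The log of zero is undefined, so no log-ratio involving a zero part can be computed
- When the zero is structural (the part is genuinely absent), imputing a small positive value is not sensible, as it discards the information the zero carries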
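A minimal sketch of the first point, using the centred log-ratio (clr) as one example of a log-ratio transform. The helper names clr and try_clr are illustrative, not taken from any library.

```python
import math

def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(x) for x in parts]  # math.log raises ValueError for 0
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def try_clr(parts):
    """Return the clr coordinates, or None when the transform is undefined."""
    try:
        return clr(parts)
    except ValueError:
        return None

ok = try_clr([0.2, 0.3, 0.5])         # all parts positive: defined
with_zero = try_clr([0.0, 0.4, 0.6])  # one zero part: undefined
print(ok)
print(with_zero)
```

Replacing the zero with a small value would make the transform run, but the result then depends on the arbitrary value chosen, which is the concern for structural zeros.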
2
Q
Missing
why not log-ratio
A
- Log-ratios are only properly defined when every part of the composition is observed
- When parts are missing, log-ratios may not give sensible results, as the relative proportions cannot be computed properly
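A small sketch of why a single missing part is a problem, assuming the missing value is coded as NaN and again using the centred log-ratio as the example transform. The function name clr is illustrative.

```python
import math

def clr(parts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(x) for x in parts]
    mean_log = sum(logs) / len(logs)  # one NaN makes this mean NaN
    return [v - mean_log for v in logs]

complete = [0.2, 0.3, 0.5]
incomplete = [0.2, float("nan"), 0.5]  # middle part not observed

clr_complete = clr(complete)
clr_incomplete = clr(incomplete)

# Every coordinate is NaN, not only the one for the missing part.
all_nan = all(math.isnan(v) for v in clr_incomplete)
print(clr_complete)
print(clr_incomplete, all_nan)
```

A pairwise log-ratio between two observed parts is still defined; it is the transforms that use every part at once that break.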
3
Q
Count
why not log-ratio
A
- Log-ratios of counts take only discrete values in real space, which does not suit modelling techniques that assume continuous distributions
- Potentially discards information on how the total affects the variance and the values that the counts can take
- More problematic when the total count is small, as this reduces the number of unique values the counts can take
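A rough illustration of the small-total point: for a fixed total, only finitely many count vectors exist, so only finitely many log-ratio values are possible. The function names and the totals 5 and 100 are chosen purely for illustration.

```python
from itertools import product
from math import comb

def n_compositions(total, n_parts):
    """Number of ways to split `total` counts over `n_parts` parts, zeros allowed."""
    return comb(total + n_parts - 1, n_parts - 1)

def list_compositions(total, n_parts):
    """Enumerate every count vector of length `n_parts` summing to `total`."""
    return [c for c in product(range(total + 1), repeat=n_parts)
            if sum(c) == total]

small = list_compositions(5, 3)
small_with_zero = [c for c in small if 0 in c]

print(len(small), n_compositions(5, 3))  # 21 21
print(len(small_with_zero))              # 15
print(n_compositions(100, 3))            # 5151
```

With a total of 5 over three parts there are only 21 possible count vectors, and 15 of them contain a zero, so the zero problem and the discreteness problem arrive together.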
4
Q
Multilevel
data challenge
A
- When the data have a multilevel / hierarchical structure, the components are correlated in a structured way
- Example: % of students achieving each grade within a class, within a school, within a county
5
Q
Time series
data challenge
A
- Need to account for both the underlying temporal structure and the compositional structure
- Typical time series techniques do not account for the compositional nature
- Further challenge when the time series is non-smooth (abrupt changes / irregular fluctuations)
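One way to see the second point is a toy forecast in which each share is extrapolated separately with a straight line. The shares and the helper linear_extrapolate are made up for illustration.

```python
def linear_extrapolate(previous, latest):
    """Naive one-step forecast per part: continue the last observed change."""
    return [2 * b - a for a, b in zip(previous, latest)]

shares_t1 = [0.10, 0.30, 0.60]  # composition at time 1 (sums to 1)
shares_t2 = [0.04, 0.32, 0.64]  # composition at time 2 (sums to 1)

forecast = linear_extrapolate(shares_t1, shares_t2)
print(forecast)
print(min(forecast) < 0)  # True: the first share is forecast below zero
```

The forecast still sums to one here, but the first share is negative, which is not a valid composition. A method that ignores the constraints has nothing to prevent this.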
6
Q
Spatial
data challenge
A
- Need to account for both the spatial structure / dependence and the compositional nature