Section 6: GLM mini case study Flashcards
(8 cards)
You are modeling claim severity for a highly skewed distribution.
Which distribution should you use?
A: Use the Inverse Gaussian distribution.
Why: Its variance grows with the cube of the mean (Gamma's grows with the square), so it has a heavier right tail and fits sharply skewed severities and severe losses better than Gamma.
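A small sketch of the variance functions behind this card. The dispersion is fixed at 1 purely for illustration, and the function names are made up for this example.

```python
# Variance functions: Gamma V(mu) = mu^2, Inverse Gaussian V(mu) = mu^3.
# phi = 1.0 is an illustrative assumption, not an estimated dispersion.

def gamma_variance(mu, phi=1.0):
    return phi * mu ** 2

def inverse_gaussian_variance(mu, phi=1.0):
    return phi * mu ** 3

means = [1.0, 10.0, 100.0]
# The ratio equals mu, so the Inverse Gaussian variance pulls away
# from the Gamma variance as the expected severity grows.
ratios = [inverse_gaussian_variance(m) / gamma_variance(m) for m in means]
print(ratios)
```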
You’re modeling pure premiums with many zero claims and right-skewed positive values.
What should you do?
A: Use the Tweedie distribution.
Why: With a power parameter between 1 and 2, it is a compound Poisson-Gamma distribution: a point mass at zero plus a right-skewed continuous part, so Poisson frequency and Gamma severity are handled in one model.
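A minimal simulation sketch of the compound Poisson-Gamma structure, assuming hypothetical parameters (claim rate 0.2, Gamma shape 2, scale 500). It only illustrates where the zeros come from; it does not fit a Tweedie GLM.

```python
import math
import random

def sample_poisson(rng, rate):
    # Knuth's multiplication method; adequate for small rates.
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulate_policy_loss(rng, claim_rate, shape, scale):
    # Total loss = sum of Gamma severities over a Poisson claim count.
    n_claims = sample_poisson(rng, claim_rate)
    return sum(rng.gammavariate(shape, scale) for _ in range(n_claims))

rng = random.Random(42)  # fixed seed for reproducibility
claim_rate = 0.2         # hypothetical value
losses = [simulate_policy_loss(rng, claim_rate, 2.0, 500.0) for _ in range(20000)]
zero_share = sum(1 for x in losses if x == 0) / len(losses)
# A policy has zero loss exactly when it has zero claims: P(Y = 0) = exp(-rate).
print(round(zero_share, 3), round(math.exp(-claim_rate), 3))
```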
You detect high correlation between two predictors in your GLM.
What should you do?
A: Remove one of the variables or use PCA (Principal Component Analysis).
Why: Multicollinearity destabilizes coefficient estimates and inflates standard errors, leading to unreliable interpretations.
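One way to quantify the problem, sketched with made-up data: compute the Pearson correlation and, for the two-predictor case, the variance inflation factor VIF = 1 / (1 - r^2). The predictor values below are invented for illustration.

```python
import math

def pearson_correlation(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical predictors that move almost in lockstep.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

r = pearson_correlation(x1, x2)
# With only two predictors, the R^2 of one regressed on the other is r^2.
vif = 1.0 / (1.0 - r ** 2)
print(round(r, 4), round(vif, 1))
```

A common rule of thumb treats a VIF above roughly 5 to 10 as a warning sign, though the cutoff is a convention rather than a test.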
You need to incorporate deductible relativities already estimated from a separate analysis.
What should you do?
A: Use an offset term in the GLM.
Why: Offsets let you incorporate pre-specified effects (like deductibles) without re-estimating them, maintaining consistency.
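A sketch of how an offset enters the prediction, assuming a log link. The relativities and the base linear predictor are hypothetical numbers. Under a log link the offset is the log of the relativity, so it multiplies the fitted mean with a coefficient fixed at 1.

```python
import math

# Hypothetical deductible relativities from a separate analysis.
deductible_relativity = {500: 1.00, 1000: 0.90, 2500: 0.75}

def predict_with_offset(linear_predictor, deductible):
    # Log link: mean = exp(X * beta + offset), offset = log(relativity).
    offset = math.log(deductible_relativity[deductible])
    return math.exp(linear_predictor + offset)

eta = math.log(200.0)  # illustrative X * beta for one risk
base = predict_with_offset(eta, 500)
high = predict_with_offset(eta, 2500)
print(round(base, 2), round(high, 2))
```

The ratio of the two predictions reproduces the supplied relativity, which is the sense in which the effect is not re-estimated.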
You want to validate your model with limited data and avoid overfitting.
What should you do?
A: Use k-fold cross-validation.
Why: It improves robustness by training and testing on multiple partitions, reducing sensitivity to any one split.
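The partitioning step can be sketched as follows. This is a bare-bones illustration with a hypothetical helper name; a real workflow would refit the GLM on each training set and score it on the held-out fold.

```python
import random

def k_fold_indices(n_rows, k, seed=0):
    # Shuffle once, then deal row indices into k folds.
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(n_rows=10, k=5))
for train_idx, test_idx in splits:
    # Each row is held out exactly once across the k rounds.
    print(sorted(test_idx))
```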
You want to compare multiple GLMs using the same dataset and distribution.
What should you do?
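A: Compare log-likelihood or scaled deviance.
Why: These metrics are comparable only when the models share the same data and distributional assumptions; under those conditions, a higher log-likelihood or a lower scaled deviance indicates a better fit.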
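A sketch of a deviance comparison, assuming a Poisson model (dispersion 1, so deviance and scaled deviance coincide). The observed counts and the second set of fitted values are invented for illustration.

```python
import math

def poisson_deviance(observed, fitted):
    # D = 2 * sum( y * log(y / mu) - (y - mu) ), with y * log(y / mu) = 0 at y = 0.
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total

observed = [0, 1, 3, 2, 6]
fitted_a = [2.4] * 5                  # intercept-only model: the sample mean
fitted_b = [0.5, 1.2, 2.5, 2.8, 5.0]  # hypothetical richer model

deviance_a = poisson_deviance(observed, fitted_a)
deviance_b = poisson_deviance(observed, fitted_b)
print(round(deviance_a, 3), round(deviance_b, 3))
```

Both models are scored on the same observations under the same distribution, so the lower deviance points to the better-fitting model on this data.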
Your GLM includes a continuous predictor with a nonlinear relationship to the target.
What should you do?
A: Add polynomial terms, bins, or splines.
Why: These transformations improve fit by capturing curvature or thresholds not modeled by a linear term.
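Two of these transformations sketched as feature builders. The variable (driver age) and the cut points are hypothetical, and splines are omitted to keep the example dependency-free.

```python
import bisect

def polynomial_terms(x, degree=2):
    # Returns [x, x^2, ..., x^degree] as extra design-matrix columns.
    return [x ** d for d in range(1, degree + 1)]

def bin_indicators(x, cut_points):
    # One indicator per interval; cut_points must be sorted ascending.
    # In a fitted GLM one bin is usually dropped as the reference level.
    index = bisect.bisect_right(cut_points, x)
    return [1 if i == index else 0 for i in range(len(cut_points) + 1)]

driver_age = 23
poly = polynomial_terms(driver_age, degree=2)
bins = bin_indicators(driver_age, cut_points=[25, 40, 65])
print(poly, bins)
```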