Shapland Flashcards

1
Q

Advantages of a Bootstrap Model

A
  • Generates a distribution of possible outcomes as opposed to a single
    point estimate
    → Provides more information about potential outcomes; can be used
    for capital modeling
  • Can be modified to match the statistical features of the data under
    analysis
  • Can reflect the fact that insurance loss distributions are generally
    skewed right. This is because the sampling process doesn’t require a
    distribution assumption.
    → Model reflects the level of skewness in the underlying data
2
Q

Reasons for more focus by actuaries on unpaid claims distributions

A
  • SEC is looking for more reserving risk information from publicly traded
    companies
  • Major rating agencies have dynamic risk models for rating and welcome
    input from company actuaries about reserve distributions
  • Companies use dynamic risk models for internal risk management and
    need unpaid claim distributions
3
Q

ODP Model Overview

A
  • Incremental claims q(w,d) are modeled directly using a GLM
  • GLM structure:
    • log link
    • Over-dispersed Poisson error distribution

Steps
1) Use the model to estimate parameters
2) Use bootstrapping (sampling residuals with replacement) to estimate
the total distribution
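In symbols (a minimal statement in the usual ODP bootstrap notation,
where m_{w,d} is the expected incremental and z = 1 for the ODP):

    \ln(m_{w,d}) = \alpha_w + \sum_{k=2}^{d} \beta_k
    E[q(w,d)] = m_{w,d}, \qquad Var[q(w,d)] = \phi \, m_{w,d}^{\,z}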

4
Q

ODP GLM Model

A
5
Q

GLM Model Setup (3x3 triangle)

A
6
Q

GLM Model:
Solving for Weight Matrix

A

Solve for the α and β parameters of the matrix equation Y = X × A that
minimize the squared difference between the vector of the logs of the
actual incremental losses (Y) and the logs of the expected incremental
losses (the solution matrix X × A).
Use maximum likelihood or the Newton-Raphson method.
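
A minimal sketch of the system for a 3x3 triangle (this follows the
usual parameterization \ln m_{w,d} = \alpha_w + \sum_{k=2}^{d} \beta_k;
conventions for the design matrix vary):

    \begin{pmatrix}
    \ln q(1,1) \\ \ln q(1,2) \\ \ln q(1,3) \\
    \ln q(2,1) \\ \ln q(2,2) \\ \ln q(3,1)
    \end{pmatrix}
    =
    \begin{pmatrix}
    1 & 0 & 0 & 0 & 0 \\
    1 & 0 & 0 & 1 & 0 \\
    1 & 0 & 0 & 1 & 1 \\
    0 & 1 & 0 & 0 & 0 \\
    0 & 1 & 0 & 1 & 0 \\
    0 & 0 & 1 & 0 & 0
    \end{pmatrix}
    \begin{pmatrix}
    \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_2 \\ \beta_3
    \end{pmatrix}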

7
Q

GLM Model:
Fitted Incrementals

A
8
Q

Simplified GLM Method

A

Fitted (expected) incrementals from a GLM with a Poisson error
distribution are the same as the incrementals implied by volume-weighted
average LDFs.

Simplified GLM Method (see the sketch after these steps)
1) Use the cumulative claim triangle to calculate LDFs
2) Develop losses to ultimate
3) Calculate the expected cumulative triangle
4) Calculate the expected incremental triangle from the cumulative
triangle
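
A minimal numeric sketch of these steps (the 3x3 triangle and all values
are made up for this example):

    import numpy as np

    # Cumulative claims triangle (3 AYs x 3 ages); np.nan marks
    # unobserved cells. Values are illustrative only.
    cum = np.array([
        [100.0, 150.0, 165.0],
        [110.0, 168.0, np.nan],
        [120.0, np.nan, np.nan],
    ])
    n = cum.shape[0]

    # 1) Volume-weighted average LDFs
    ldf = np.ones(n - 1)
    for d in range(n - 1):
        rows = ~np.isnan(cum[:, d + 1])
        ldf[d] = cum[rows, d + 1].sum() / cum[rows, d].sum()

    # 2)-3) Fitted cumulative triangle: keep the latest observed
    # diagonal and divide backwards by the LDFs
    fitted_cum = np.full_like(cum, np.nan)
    for w in range(n):
        last = n - 1 - w                  # latest observed age for AY w
        fitted_cum[w, last] = cum[w, last]
        for d in range(last - 1, -1, -1):
            fitted_cum[w, d] = fitted_cum[w, d + 1] / ldf[d]

    # 4) Fitted incremental triangle (first column is itself incremental)
    fitted_inc = np.diff(
        np.concatenate([np.zeros((n, 1)), fitted_cum], axis=1), axis=1)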

9
Q

Advantages of the Simplified GLM Framework

A

1) GLM can be replaced with the simpler link ratio approach while still
being grounded in the underlying GLM framework
2) Using age-to-age ratios serves as a “bridge” to the deterministic
framework and allows the model to be more easily explained to others
3) We can still use link ratios to get a solution if there are negative
incrementals, whereas the GLM with a log link might not have a
solution

10
Q

Unscaled Pearson residual

A
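For reference, the unscaled Pearson residual is commonly written as
(with q(w,d) the actual incremental, m_{w,d} the fitted incremental, and
z = 1 for the ODP):

    r_{w,d} = \frac{q(w,d) - m_{w,d}}{\sqrt{m_{w,d}^{\,z}}}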
11
Q

Scale Parameter

A
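For reference, the scale parameter is commonly written as (with N the
number of data points and p the number of parameters):

    \phi = \frac{\sum r_{w,d}^{2}}{N - p}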
12
Q

The assumption about residuals necessary for bootstrapped samples

A

Residuals are independent and identically distributed
Note:
No particular distribution is necessary. Whatever distribution the residuals
have will flow into the simulated data.

13
Q

Sampled incremental loss for a bootstrap model

A
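For reference, a sampled incremental is commonly written as (with r^{*}
a resampled residual and z = 1 for the ODP):

    q^{*}(w,d) = r^{*} \sqrt{m_{w,d}^{\,z}} + m_{w,d}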
14
Q

Standardized Pearson Residuals

A
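For reference, one common form (hedged, since implementations differ):
the unscaled Pearson residuals are multiplied by a hat-matrix adjustment
factor,

    r^{H}_{w,d} = r_{w,d} \cdot f^{H}_{w,d}, \qquad
    f^{H}_{w,d} = \sqrt{\frac{1}{1 - H_{i,i}}}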
15
Q

Process to create a distribution of point estimates

A
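A hedged summary of the usual procedure (standard ODP bootstrap steps;
details vary by implementation):
1) Sample residuals with replacement
2) Compute sampled incrementals q^{*}(w,d) = r^{*} \sqrt{m_{w,d}} + m_{w,d}
3) Build the sample cumulative triangle
4) Re-fit the model (or LDFs) to the sample triangle and project to
   ultimate
→ Each iteration yields one point estimate; repeating the process gives
a distribution of point estimates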
16
Q

Adding process variance to
future incremental values in a bootstrap model

A
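For reference (a hedged summary; a gamma is a common practical choice to
approximate the ODP): each future incremental is simulated from a
process distribution with mean equal to the projected value and variance
proportional to it,

    q_{sim}(w,d) \sim Gamma\big(mean = m_{w,d},\; variance = \phi\,m_{w,d}\big)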
17
Q

Sampling Residuals

A
18
Q

Standardized Pearson scale parameter

A
19
Q

Bootstrapping BF and Cape Cod Models

A

With the ODP bootstrap model, iterations for the latest few accident
years can result in more variance than expected.

BF Method
* Incorporate the BF model by using a priori loss ratios for each AY,
with standard deviations for each loss ratio and an assumed distribution
* During simulation, simulate a new a priori loss ratio for each
iteration (see the sketch after this card)

Cape Cod Method
* Apply the Cape Cod algorithm to each iteration of the bootstrap model
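
A minimal sketch of the BF piece (illustrative numbers; the normal
distribution here is an assumption, and a lognormal is another common
choice):

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed a priori loss ratios and standard deviations by AY
    elr_mean = np.array([0.65, 0.67, 0.70])
    elr_sd = np.array([0.05, 0.06, 0.08])

    # One simulated a priori loss ratio per AY for each iteration
    n_sims = 10_000
    elr_sims = rng.normal(elr_mean, elr_sd, size=(n_sims, elr_mean.size))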

20
Q

Generalizing the ODP Model:
Pros/cons of using fewer parameters

A

Pros
1) Helps avoid potentially over-parameterizing the model
2) Allows parameters to be added for calendar-year trends
3) Can be used to model data shapes other than data in triangle form
→ e.g. missing incrementals in the first few diagonals

Cons
1) GLM must be solved for each iteration of the bootstrap model, slowing
simulations
2) The model is no longer directly explainable to others using age-to-age
factors

21
Q

Negative Incremental Values:
Modified log-link

A
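For reference, the modified log-link is commonly written as (hedged;
sign conventions vary by source):

    \ln^{+}(q) =
    \begin{cases}
    \ln(q), & q > 0 \\
    0, & q = 0 \\
    -\ln(|q|), & q < 0
    \end{cases}
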
22
Q

Negative Incremental Values:
Negative Development Periods

A

23
Q

Negative Incremental Values:
Simplified GLM Adjustments

A

24
Q

Negative values during simulation:
Process Variance

A

25
Q

The problem with negative incremental values during simulation

A

Negative incrementals may cause extreme outcomes for some iterations.
Example: they may cause cumulative values in an early development column
to sum to near zero while the next column is much larger.
→ This results in extremely large LDFs and central estimates for that
iteration.
26
Q

Options to address negative incremental values during simulation

A

1) Remove extreme iterations from results
→ BUT only remove truly unreasonable iterations
2) Recalibrate the model after identifying the sources of negative
incrementals
→ e.g. remove a row with sparse data when the product was first written
3) Limit incremental losses to zero
→ Replace negative incrementals in the original data with a zero
incremental loss
27
Q

Non-zero sum of residuals

A
28
Q

Using an N-Year Weighted Average

A

With GLM Framework
* Exclude the first few diagonals and only use N+1 diagonals to
parameterize the model (the data is now a trapezoid)
* Run bootstrap simulations and only sample residuals for the trapezoid
that is used to parameterize the model

With Simplified GLM (see the sketch after this card)
* Calculate N-year average LDFs
* Run the bootstrap simulation, sampling residuals for the entire
triangle in order to calculate cumulative values
* Use the N-year average factors to project future expected values for
each iteration
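
A minimal sketch of the N-year average LDF calculation (the helper name
and values are illustrative assumptions):

    import numpy as np

    def n_year_ldfs(cum, n_years):
        """Volume-weighted LDFs using only the latest n_years diagonals.

        cum: cumulative triangle with np.nan for unobserved cells.
        """
        k = cum.shape[1]
        ldfs = np.ones(k - 1)
        for d in range(k - 1):
            # accident years with both ages observed in this column
            rows = np.where(~np.isnan(cum[:, d + 1]))[0]
            rows = rows[-n_years:]      # keep only the latest n_years
            ldfs[d] = cum[rows, d + 1].sum() / cum[rows, d].sum()
        return ldfs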
29
Q

Handling Missing Values

A

Examples: missing the oldest diagonals (if data was lost) or missing
values in the middle of the triangle

Calculations affected: LDFs, fitted triangle (if missing the latest
diagonal), residuals, degrees of freedom

Solution 1: Estimate the missing value from surrounding values
Solution 2: Modify LDFs to exclude the missing value, with no residual
for the missing value
→ Don't resample from missing values
Solution 3: If the missing value is on the latest diagonal, estimate the
value or use the value in the second-to-last diagonal to get a filled
triangle, using judgment
30
Q

Handling Outliers

A

There may be outliers that are not representative of the future
variability of the dataset, so we may want to remove them.
* Outliers could be removed and treated as missing values
* Identify outliers and exclude them from the LDF and residual
calculations, but resample the corresponding incremental when simulating
triangles

Remove outliers cautiously and only after understanding the data.
31
Q

Heteroscedasticity

A

Heteroscedasticity: when Pearson residuals have different levels of
variability at different ages.

Why heteroscedasticity is a problem: the ODP bootstrap model assumes
standardized Pearson residuals are IID.
* With heteroscedasticity, we can't take residuals from one development
period and use them in other development periods

Considerations when assessing heteroscedasticity:
* Account for the credibility of the observed data
* Account for the fact that there are fewer residuals in older
development periods
32
Q

Adjusting for Heteroscedasticity: Stratified Sampling

A

Option 1: Stratified Sampling (see the sketch after this card)
1) Organize development periods into groups with homogeneous variances
2) For each group: sample with replacement only from the residuals in
that group
BUT: Some groups have only a few residuals in them, which limits the
amount of variability in possible outcomes
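
A minimal sketch of stratified sampling (the group boundaries and
residual values are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)

    # Development periods pre-grouped so each group has roughly
    # homogeneous variance
    residuals_by_group = {
        "ages 1-3": np.array([0.8, -1.1, 0.3, 0.6, -0.4, 0.2]),
        "ages 4+": np.array([2.1, -1.9, 1.5, -2.3]),
    }

    # Sample with replacement only within each group
    sampled = {g: rng.choice(r, size=r.size, replace=True)
               for g, r in residuals_by_group.items()}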
33
Q

Adjusting for Heteroscedasticity: Standard Deviation

A
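For reference, one common form of this option (hedged, since conventions
vary across sources): compute the standard deviation of the residuals in
each hetero group and set a hetero-adjustment factor

    h_{i} = \frac{\text{std of all residuals}}{\text{std of group } i}

Each residual is multiplied by its group's factor so the whole triangle
can be pooled; a residual sampled into a cell of group j is divided by
h_{j}.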
34
Q

Adjusting for Heteroscedasticity: Scale Parameter

A
35
Q

Pros and cons of adjusting for heteroscedasticity using
hetero-adjustment factors

A

Pro
* Can resample with replacement from the entire triangle

Con
* Adds parameters, affecting the degrees of freedom and scale parameter
36
Q

Heteroecthesious Data: Partial first development period

A

This occurs when the first development period has a different exposure
period length than the other columns.
→ e.g. 6 months in the first column, then 12 months in the rest

Adjustments
Reduce the latest accident year's future incremental losses to be
proportional to the level of earned exposure in the first period.
→ Then simulate process variance (or reduce after the process variance
step)
37
Q

Heteroecthesious Data: Partial last calendar period data

A

Adjustments
a) Annualize exposures in the last partial diagonal
b) Calculate the fitted triangle and residuals
c) During the ODP bootstrap simulation, calculate and interpolate LDFs
from the fully annualized sample triangles
d) Adjust the last diagonal of the sample triangles to de-annualize the
incrementals on the last diagonal
e) Project future values by multiplying the interpolated LDFs with the
new cumulative values
f) Reduce the future incremental values for the latest accident year to
remove future exposure
38
Q

Exposure Adjustment

A

Issue: Exposures changed significantly over the years (e.g. a rapidly
growing line or a line in runoff)

Adjustment (see the sketch after this card)
* If earned exposures exist, divide all claims data by the exposures for
each accident year to run the model with pure premiums
* After the process variance step, multiply the result by accident year
exposures to get total claims
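
A minimal sketch of the adjustment (the triangle and exposures are
illustrative):

    import numpy as np

    claims = np.array([[100.0, 150.0, 165.0],
                       [220.0, 336.0, np.nan],
                       [360.0, np.nan, np.nan]])
    exposure = np.array([1000.0, 2000.0, 3000.0])

    # Run the bootstrap on pure premiums ...
    pure_premium = claims / exposure[:, None]

    # ... then, after the process variance step, scale simulated pure
    # premiums back to total claims (placeholder for simulated output)
    simulated_total = pure_premium * exposure[:, None]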
39
Q

Parametric Bootstrapping

A

Purpose
Parametric bootstrapping is a way to overcome a lack of extreme
residuals in an ODP bootstrap model.

Steps (see the sketch after this card)
1) Fit a parameterized distribution to the residuals
2) Resample residuals from the distribution instead of the observed
residuals
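
A minimal sketch (the normal distribution and residual values are
assumptions; a heavier-tailed distribution may be more appropriate in
practice):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    residuals = np.array([0.9, -1.2, 0.4, 2.8, -0.6, -2.4, 0.1, 0.7])

    # 1) Fit a parameterized distribution to the residuals
    mu, sigma = stats.norm.fit(residuals)

    # 2) Resample residuals from the fitted distribution instead of
    #    the observed residuals
    sampled = stats.norm.rvs(mu, sigma, size=residuals.size,
                             random_state=rng)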
40
Q

Purposes of Bootstrap Diagnostics

A

* Test the assumptions in the model
* Gauge the quality of the model fit to the data
* Help guide adjustments of the model parameters to improve the fit of
the model

Purpose: find a set of models and parameters that results in the most
realistic and most consistent simulations based on the statistical
features of the data.
41
Q

Residual Graphs

A

Residual graphs help test the assumption that residuals are IID.

Plots to Look at
* Residuals vs Development Period
→ Look for heteroscedasticity
* Residuals vs Accident Period
* Residuals vs Payment Period
* Residuals vs Predicted
→ Look for issues with trends
→ Plot the relative standard deviation of residuals and the range of
residuals to further test for heteroscedasticity
42
Q

Normality Test

A

The normality test compares residuals to the normal distribution.

If residuals are close to normal, you should see:
* A normality plot with residuals in line with the diagonal line
(normally distributed)
* A high R^2 value and a p-value greater than 5%

Note: In the ODP bootstrap, residuals don't need to be normally
distributed. (See the sketch after this card.)
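
A minimal sketch of the two checks (illustrative residuals; the
Shapiro-Wilk test is one common choice of normality test):

    import numpy as np
    from scipy import stats

    residuals = np.array([0.9, -1.2, 0.4, 2.8, -0.6, -2.4, 0.1, 0.7])

    # R^2 of the normality (Q-Q) plot: residuals vs normal quantiles
    (osm, osr), (slope, intercept, r) = stats.probplot(residuals,
                                                       dist="norm")
    r_squared = r ** 2

    # p-value of a normality test
    _, p_value = stats.shapiro(residuals)
    print(f"R^2 = {r_squared:.3f}, p-value = {p_value:.3f}")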
43
Q

AIC and BIC formulas

A
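For reference, the standard definitions (hedged; Shapland's exact
variants are based on the GLM loglikelihood and may differ in form):

    AIC = 2p - 2\ln(L)
    BIC = p\ln(n) - 2\ln(L)

where p is the number of parameters, n the number of data points, and L
the maximized likelihood.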
44
Q

Identifying Outliers

A

Identify outliers with a box-whisker plot (see the sketch after this
card):
* The box shows the 25th to 75th percentiles
* The whiskers extend to the largest values within 3 times the
inter-quartile range
* Values outside the whiskers are outliers
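
A minimal sketch (illustrative residuals; whis=3.0 sets the whiskers at
3x the inter-quartile range):

    import numpy as np
    import matplotlib.pyplot as plt

    residuals = np.array([0.9, -1.2, 0.4, 2.8, -0.6, -2.4, 0.1, 9.5])

    # Box spans the 25th-75th percentiles; points beyond the whiskers
    # are flagged as outliers
    plt.boxplot(residuals, whis=3.0)
    plt.show()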
45
Q

Handling Outliers

A

* If outliers represent scenarios that can't be expected to happen
again, then it may make sense to remove them
* Use extreme caution when removing outliers because they may represent
realistic extremes that should be kept in the analysis
46
Q

Reviewing Estimated-Unpaid Model Results

A

* Standard error should increase from the oldest to the most recent
years
* Standard error for all years combined should be larger than for any
individual year
* Coefficients of variation should decrease from the oldest to the most
recent years due to independence in the incremental payment stream
* A reversal in coefficients of variation in recent years could be due
to:
  * Increasing parameter uncertainty in more recent years
  * The model may overestimate uncertainty in recent years; we may want
    to switch to a BF or Cape Cod model
* Minimum/maximum simulations should be reasonable
47
Q

Methods for combining results of multiple models

A

Run models with the same random variables
1) Simulate random variables for each iteration
2) Use the same set of random variables for each model
3) Use model weights to weight the incremental values from each model
for each iteration by accident year

Run models with independent random variables
1) Run each model separately with different random variables
2) Use weights to randomly select a model for each iteration by accident
year so that the result is a weighted mixture of models

(See the sketch after this card.)
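
A minimal sketch of both approaches (the lognormal model output is a
stand-in for real simulated results; weights are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n_models, n_sims, n_ay = 3, 10_000, 3
    weights = np.array([0.5, 0.3, 0.2])

    # results[m]: simulated unpaid (iteration x AY) for model m
    results = [rng.lognormal(5.0 + 0.1 * m, 0.3, size=(n_sims, n_ay))
               for m in range(n_models)]

    # Same random variables: weight each model's values directly
    blended = sum(w * r for w, r in zip(weights, results))

    # Independent random variables: randomly pick a model per
    # iteration/AY so the result is a weighted mixture of models
    pick = rng.choice(n_models, p=weights, size=(n_sims, n_ay))
    mixture = np.choose(pick, results)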
48
Q

Estimated Cash Flow Results

A

Simulations of unpaid losses by calendar year have the following
characteristics:
* Standard error of calendar-year unpaid decreases as the calendar year
increases into the future
* Coefficient of variation increases as the calendar year increases
→ This is because the final payments projected farthest out will be the
smallest and most uncertain
49
Q

Estimated Ultimate Loss Ratio Results

A

* Estimated ultimate loss ratios by accident year are calculated using
all simulated values, not just the future unpaid values
* Represents the complete variability in the loss ratio for each
accident year
* Loss ratio distributions can be used for projecting pricing risk
50
Q

Issues with correlation methods

A

* Both the location mapping and re-sorting methods use residuals of
incremental future losses to correlate segments
→ Both tend to create overall correlations close to zero
* For reserve risk, the desired correlation is between the total unpaid
amounts for two segments, so there may be a disconnect
51
Q

Correlation between Segments: Re-sorting

A

Uses algorithms such as a copula or Iman-Conover to add correlation.

Advantages
* Data triangles can be different shapes/sizes by segment
* Can use different correlation assumptions
* Different correlation algorithms may have other beneficial impacts on
the aggregate distribution
  * e.g. can use a copula with a heavy-tailed distribution to strengthen
    the correlation between segments in the tails, which is important
    for risk-based capital modeling

(See the sketch after this card.)
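
A minimal sketch of rank-based re-sorting with a normal (Gaussian)
copula (the segment simulations and the correlation value are
illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)
    n_sims = 10_000

    # Independently simulated unpaid amounts for two segments
    seg_a = rng.lognormal(5.0, 0.4, n_sims)
    seg_b = rng.lognormal(4.5, 0.6, n_sims)

    # Draw correlated normals and take their ranks (the copula)
    rho = 0.5
    corr = np.array([[1.0, rho], [rho, 1.0]])
    z = rng.multivariate_normal(np.zeros(2), corr, size=n_sims)
    ranks = np.argsort(np.argsort(z, axis=0), axis=0)

    # Re-sort each segment to follow the copula's ranks: the marginal
    # distributions are unchanged, but the joint ranks are correlated
    seg_a_corr = np.sort(seg_a)[ranks[:, 0]]
    seg_b_corr = np.sort(seg_b)[ranks[:, 1]]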
52
Q

Correlation between Segments: Location Mapping

A

For each iteration, sample the residuals from the residual triangles
using the same locations for all segments.

Advantages
* The method is easily implemented
→ It doesn't require an estimated correlation matrix and preserves the
correlation of the original residuals

Disadvantages
* All segments need to have the same size data triangles with no missing
data
* The correlation of the original residuals is used, so we can't test
other correlation assumptions