CH 18 (WM) Flashcards

Question

Define the term "Aliasing". [1]

Answer 1

Aliasing occurs when there is a linear dependency among the observed covariates X1, X2,...,Xp. ✓✓ There are two types of aliasing: intrinsic aliasing and extrinsic aliasing.✓✓

Answer 2

Intrinsic aliasing occurs because of dependencies inherent in the definition of the covariates.✓✓ This is dealt with by modelling software.✓✓ These intrinsic dependencies arise most commonly whenever categorical variables are included in the model.✓✓ For example, consider “patient age”, which has the four levels: 0-20 years, 21-40 years, 41-60 years and 60+ years, with indicator variables X1, X2, X3 and X4 for the four levels respectively.✓✓ Clearly, if any of X1, X2, or X3 is equal to 1, then X4 is equal to zero; and if X4 is equal to 1, then the rest are all equal to zero.✓✓ In particular: X4 = 1 - X1 - X2 - X3.✓✓
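The dependency X4 = 1 - X1 - X2 - X3 can be checked directly. The sketch below is illustrative only: the list of patients is made up, and `indicators` is a hypothetical helper, not part of any modelling package.

```python
# Age bands from the example; the sample of patients is invented for illustration.
LEVELS = ["0-20", "21-40", "41-60", "60+"]
patients = ["0-20", "41-60", "60+", "21-40", "60+", "0-20"]

def indicators(band):
    """Return (X1, X2, X3, X4): 1 for the patient's own band, 0 elsewhere."""
    return tuple(1 if band == level else 0 for level in LEVELS)

rows = [indicators(band) for band in patients]

# X4 carries no extra information: it equals 1 - X1 - X2 - X3 for every patient,
# whatever data is observed, because of how the indicators are defined.
x4_is_redundant = all(x4 == 1 - x1 - x2 - x3 for x1, x2, x3, x4 in rows)
print(x4_is_redundant)
```

Because the relationship holds for any data set, one level can be treated as the base level and its indicator dropped without losing information.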

Answer 3

As with intrinsic aliasing, extrinsic aliasing also arises from a dependency among the covariates.✓ However, it arises when the dependency results from the nature of the data itself, rather than as a result of inherent properties of the covariates.✓✓ Extrinsic aliasing occurs when two or more factors contain levels that are perfectly correlated.✓✓
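As a rough illustration of how this differs from intrinsic aliasing, the sketch below uses an invented data set with two hypothetical factors (number of doors and colour) in which the "unknown" level always appears in both factors together. Neither the factors nor the records come from the card; they are assumptions for the example.

```python
# Invented records: (doors, colour). "unknown" happens to occur in both or in neither.
records = [
    ("2", "red"), ("4", "blue"), ("unknown", "unknown"),
    ("4", "red"), ("unknown", "unknown"), ("2", "blue"),
]

doors_unknown = [1 if doors == "unknown" else 0 for doors, _ in records]
colour_unknown = [1 if colour == "unknown" else 0 for _, colour in records]

# The two indicator columns coincide in this particular data set, so a model
# cannot separate the effect of one level from the effect of the other.
extrinsically_aliased = doors_unknown == colour_unknown
print(extrinsically_aliased)
```

Nothing in the definition of the two factors forces this; a different data set could break the dependency, which is what makes the aliasing extrinsic.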

Answer 4

Near aliasing occurs when two or more factors contain levels that are almost, but not quite, perfectly correlated.✓✓

Answer 5

Q&A 3.20 (i)

Answer 6

Complete and marginal interactions are alternative representations of the same thing. [1⁄2]

A complete interaction is expressed as a single factor that represents every combination of the factors involved. [1⁄2] For example, for the two factors given, there would be a new single factor, representing the interaction, which would have nine levels, ... [1⁄2] ... ie AX, AY, AZ, BX, BY, BZ, CX, CY, CZ. [1⁄2] Each of these levels would have a multiplier attached (since this is a multiplicative model). These could be written in the form of either a one-way or a two-way table. [1⁄2] For example:

              Factor 1:
Factor 2:     A       B       C
X             0.90    1.00    1.10
Y             0.97    1.20    1.45
Z             1.26    1.40    1.85
[1⁄2]

In this case, the base level has been selected to be the level corresponding to Level B of Factor 1 and Level X of Factor 2, ... [1⁄2] ... and the interaction term has 8 parameters. [1⁄2]

A marginal interaction considers the additional effect of the interaction term over and above the single factor effects. [1⁄2] In this case, the single factor effects will be observed separately from the marginal interaction term effects. Using the same example as above, the multipliers would look as follows:

                      Factor 1:
                      A       B       C
                      0.90    -       1.10
Factor 2:
X       -             -       -       -
Y       1.20          0.90    -       1.10
Z       1.40          1.00    -       1.20
[2]

So the overall relativity for Factor 1 Level A and Factor 2 Level Y would be 0.90 × 1.20 × 0.90 = 0.97, the same as under the complete interaction. This marginal interaction would therefore be calculated as 0.97 / (1.20 × 0.90) = 0.90. [1⁄2]

[Maximum 6]
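The link between the two representations can be sketched in a few lines. The cell layout in the `complete` dictionary is an assumption: it is the assignment of the nine multipliers that puts the base level (B, X) at 1.00 and reproduces the worked figure 0.97 / (1.20 × 0.90) = 0.90. The variable names are hypothetical.

```python
# Complete-interaction multipliers, keyed by (Factor 1 level, Factor 2 level).
# Assumed layout; base level is (B, X) with multiplier 1.00.
complete = {
    ("A", "X"): 0.90, ("A", "Y"): 0.97, ("A", "Z"): 1.26,
    ("B", "X"): 1.00, ("B", "Y"): 1.20, ("B", "Z"): 1.40,
    ("C", "X"): 1.10, ("C", "Y"): 1.45, ("C", "Z"): 1.85,
}
BASE_1, BASE_2 = "B", "X"

# Single-factor effects are the cells that share a base level with the other factor.
factor1 = {a: complete[(a, BASE_2)] for a in "ABC"}
factor2 = {b: complete[(BASE_1, b)] for b in "XYZ"}

# Marginal interaction = complete multiplier / (product of single-factor effects).
marginal = {
    (a, b): round(complete[(a, b)] / (factor1[a] * factor2[b]), 2)
    for a in "ABC" for b in "XYZ"
}

for key in [("A", "Y"), ("A", "Z"), ("C", "Y"), ("C", "Z")]:
    print(key, marginal[key])
```

Every cell involving a base level comes out as 1.00, which is why only four marginal interaction parameters are needed on top of the four single-factor parameters, giving the same total of 8.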

Answer 7

iii) Consistency over time

When pricing, it is important to check that the patterns of relativities observed in a GLM are not changing too much over time. [1⁄2] If a trend emerges over time then it is important to identify it, so that the patterns can be projected to the period over which the rates will apply. [1⁄2] The time consistency check is also used to determine whether the effect of each factor is consistent from year to year. [1⁄2] If a factor is consistent then it is likely to be a good predictor of future experience. [1⁄2] To test the consistency of parameter estimates over time, a GLM can be fitted that includes the interaction of a single factor with a measure of time, ... [1⁄2] ... eg calendar year. [1⁄2] Ideally this would be done for every factor in the model, and each interaction would be tested for statistical significance. [1⁄2] Significant factors will have a small error range of relativities and the error ranges for the various factors will not overlap too much. [1⁄2] [Maximum 3]

Answer 8

Statistical test

Define Model 1 to be the initial model and Model 2 to be the reduced model. These two models are nested, so a chi-square test can be used to assess the change in scaled deviance between them. [1⁄2]

The scaled deviance for Model 1 is D1* = 392.45 (given). The scaled deviance for Model 2 is D2* = 401.97 (given).

Degrees of freedom for Model 1 = 50,000 – 80 = 49,920. [1⁄2] Degrees of freedom for Model 2 = 50,000 – 76 = 49,924, ... [1⁄2] ... since the reduced model has only 1 level for this factor instead of 5, so there are 4 fewer parameters fitted. [1⁄2]

Under the null hypothesis, there is no difference between Model 1 and Model 2, and D2* - D1* is distributed approximately as χ² with 49,924 – 49,920 = 4 degrees of freedom. [1⁄2] The difference between the scaled deviances is 401.97 – 392.45 = 9.52. [1⁄2] This should be compared with the upper 5% point of the χ² distribution with 4 degrees of freedom, [1⁄2] which is 9.488 (from page 169 of the Tables). The test statistic of 9.52 exceeds this value, ... [1⁄2] ... so, at the 5% significance level, the reduced Model 2 would be rejected in favour of the initial Model 1. [1⁄2] Therefore, based on the statistical test, and assuming a 5% significance level, this factor would be kept in the model. [1⁄2] [Maximum 4]
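The arithmetic of this test can be reproduced as a quick check. The sketch below uses only the figures given in the answer; `chi2_sf_even_df` is a hypothetical helper based on the closed-form upper tail of a chi-square distribution with an even number of degrees of freedom, used here to avoid depending on a statistics library.

```python
import math

def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable with a positive, even number of df."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**j / math.factorial(j) for j in range(df // 2))

n_obs = 50_000
dev_full, params_full = 392.45, 80        # Model 1 (initial)
dev_reduced, params_reduced = 401.97, 76  # Model 2 (reduced)

df_full = n_obs - params_full             # 49,920
df_reduced = n_obs - params_reduced       # 49,924
statistic = dev_reduced - dev_full        # change in scaled deviance
df_test = df_reduced - df_full            # 4

p_value = chi2_sf_even_df(statistic, df_test)
reject_reduced = p_value < 0.05
print(round(statistic, 2), df_test, reject_reduced)
```

The p-value comes out just under 5%, consistent with the statistic of 9.52 only narrowly exceeding the tabulated critical value of 9.488.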

Answer 9

ii) Further considerations

It is not known what this factor is, although it is known to have five levels. An investigation would be needed as to whether these five levels are groupings of more detailed levels. [1⁄2] If so, then the original ungrouped factor could be included in the model instead, to test for significance ... ... although this might be difficult if the factor has been grouped due to there being insufficient data. [1⁄2] The parameter values associated with each of the five levels should be analysed to see if they are as expected, relative to each other. [1⁄2] A graph of the values could be drawn, to enable this to be seen more clearly. [1⁄2] For example, it could be that the relativity for the “Unknown” level is so different to the relativities for the other levels that it is this alone that is making the factor appear statistically significant. [1⁄2] If this is the case, then the factor is not really adding much in terms of predictive power for the future. [1⁄2] Alternatively, “Unknown” could be grouped with one of the other levels and the model refitted to see whether it is then statistically significant. [1⁄2] An interaction between this factor and some measure of time should also be fitted, to show whether the pattern observed for the relativities is consistent over time. [1⁄2] If the pattern is not consistent then this factor may be rejected on the basis that it does not show a stable pattern and is therefore not useful for predicting the future. Consideration should be given to whether the factor is likely to be acceptable to policyholders ... [1⁄2] ... and brokers, eg their systems may be unable to handle an extra rating factor. [1⁄2] If this factor has been used in previous rating exercises for this book of business, it would be more likely to be kept in the model this time. [1⁄2] If this factor has not been used before in general, then the practicalities of how easily it could be incorporated into the rating algorithms should be considered. [1⁄2] If it is likely to take a lot of IT time to build the relevant tables then this would be an argument for not using the factor. [1⁄2] Consideration should also be given to whether this factor is used by other insurers in the PMI market. [1⁄2] If this factor is dropped from our model while other insurers continue to use it then this could lead to anti-selection. [1⁄2] [Maximum 6]

Answer 10

There is likely to be a large number of policyholders with zero or very small claims✓✓ and a small number of people with very large claims✓, ie the true distribution will be positively skewed✓. The normal distribution does not have this property✓. A normal distribution can also take negative values, which would be inappropriate.✓✓

Answer 11

The link function acts to remove the assumption that the effects of different variables must simply be added together.✓✓ It must be both differentiable✓ and monotonic✓ (either strictly increasing or strictly decreasing)✓. Typical link functions include the log, logit and identity functions.✓✓ The log link function is of particular interest in pricing✓ because its use results in a model where the effects of different rating factors are multiplied together✓✓.
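A small numerical sketch shows why the log link gives a multiplicative structure. The base rate and relativities below are invented for illustration and do not come from the card.

```python
import math

# Hypothetical coefficients on the log scale: a base level and two rating factors.
base_rate = 200.0
relativities = {"young_driver": 1.5, "urban_area": 1.2}

intercept = math.log(base_rate)
betas = {name: math.log(value) for name, value in relativities.items()}

# The linear predictor adds the effects together ...
linear_predictor = intercept + sum(betas.values())

# ... and inverting the log link turns that sum into a product of multipliers.
mean_via_link = math.exp(linear_predictor)
mean_via_product = base_rate * relativities["young_driver"] * relativities["urban_area"]

print(round(mean_via_link, 6), round(mean_via_product, 6))
```

With an identity link the same coefficients would be added to the base rate instead of multiplying it, which is the additive assumption the link function removes.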

(35 cards)