CH 18 (WM) Flashcards
(35 cards)
List the assumptions of classical linear models. [1.75]
- error terms are independent and come from a normal distribution ✓✓
- the error terms have constant variance✓✓ (or homoscedasticity) ✓
- the mean is a linear combination of the explanatory variables ✓✓
What are the drawbacks of the normal model for multiple linear regression? [2]
- it assumes the response variable has a normal distribution ✓✓
- the normal distribution has a constant variance which may not be appropriate ✓✓
- it adds together the effects of different explanatory variables, but this is often not what is observed ✓✓
- it becomes long-winded with more than two explanatory variables ✓✓
Define the term “explanatory variables”. [1.5]
Explanatory variables are inputs into a model that are expected to influence the response variable.✓✓
In a pricing context, the explanatory variables would be rating factors.✓✓
It is important that explanatory variables make intuitive sense.✓✓
Define the term “response variables”. [1]
Response variables are outputs from the model that are likely to be affected by the explanatory variables.✓✓
In an overall pricing context, the response variable would be the price.✓✓
Define the terms “categorical and non-categorical variables”, together with examples of each. [3]
Categorical variables are explanatory variables that are used for modelling where the values of each level are distinct✓✓, and often cannot be given any natural ordering or score✓✓. An example of this would be gender, which can take the value of male or female✓✓.
By contrast, non-categorical variables can take numerical values, eg age.✓✓
Categorical variables are sometimes referred to as factors.✓✓ The majority of explanatory variables used in practice within GLMs for insurance are factors.✓✓
What are meant by the levels of a categorical variable? [1]
The levels of a categorical variable are simply the distinct values that the variable can take.✓✓
So, if gender is a variable in a GLM and it can only take the values “male” or “female”, then gender would be said to have two levels.✓✓
Explain how continuous numerical variables like age can be treated. [1.5]
Often, continuous numerical variables like age can be treated as categorical variables.✓✓
For example, if the “age of policyholder” variable was grouped into age bands (of 5 years for example), the new variable “age band” would be a categorical variable✓✓. This is because each such band is effectively a discrete category, ie a level of a categorical variable✓✓.
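The banding described above can be sketched in a few lines of Python. The function name `age_band`, the label format and the sample ages are illustrative assumptions, not taken from the source; the 5-year width follows the example in the card.

```python
def age_band(age: int, width: int = 5) -> str:
    """Map an exact age to a band label such as '30-34'."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = (age // width) * width      # start of the band containing this age
    upper = lower + width - 1           # last whole age in the band
    return f"{lower}-{upper}"


# Hypothetical policyholder ages, for illustration only.
ages = [23, 31, 34, 35, 67]
bands = [age_band(a) for a in ages]
levels = sorted(set(bands))             # the distinct levels of the new factor
print(bands)
print(levels)
```

Each distinct label is one level of the new "age band" factor, so ages 31 and 34 fall in the same level while 35 starts the next one.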
“Categorical variable” appears in every sentence
List the various techniques used to analyse the significance of the explanatory variables used in a model. [1]
- The chi-squared test
- The F statistic
- The Akaike Information Criterion (“AIC”)
- Other
Explain what is meant by a nested model. [2.25]
Two models are nested if one model contains explanatory variables that are a subset of the explanatory variables in the other model.✓✓
For example, if Model 1 has linear predictor: a + bx ✓✓
and Model 2 has linear predictor: a + bx + cx^2 ✓✓
then Model 1 is a subset of Model 2 ✓✓, ie Models 1 and 2 are nested.✓
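As a rough illustration, nesting can be checked by treating each linear predictor as a set of terms. The helper `is_nested`, the term labels and the third model are hypothetical, introduced only for this sketch.

```python
def is_nested(smaller: set[str], larger: set[str]) -> bool:
    """True if every term of `smaller` also appears in `larger`."""
    return smaller <= larger


model_1 = {"1", "x"}            # a + bx
model_2 = {"1", "x", "x^2"}     # a + bx + cx^2
model_3 = {"1", "z"}            # hypothetical model using a different variable z

print(is_nested(model_1, model_2))
print(is_nested(model_3, model_2))
```

Models 1 and 2 are nested, whereas the hypothetical Model 3 is not nested within Model 2 because it uses a term that Model 2 lacks.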
Describe how you will apply the chi-squared statistic to analyse the significance of the explanatory variables used in a model. [2]
If Models 1 and 2 are nested, then the change in scaled deviance follows a chi-squared distribution, ✓✓ ie:
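$D_1^* - D_2^* \sim \chi^2_{df_1 - df_2}$ ✓✓✓✓
where $D_i^*$ and $df_i$ are the scaled deviance and the degrees of freedom of Model $i$, and Model 1 is the smaller model.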
This measures whether the inclusion of one or more additional explanatory variables in a model improves the model fit significantly.✓✓
Suppose that Model A and Model B are nested models with 6 and 10 parameters respectively.
The scaled deviance of Model A is 17.80 and for Model B is 11.08. Explain whether Model B is a significant improvement on Model A. [2]
(Question 18.10)
The difference in the scaled deviance is 6.72. ✓✓
The difference in the number of degrees of freedom is the same as the difference in the numbers of parameters in the models, ie 4. ✓✓
Since 6.72 < 9.488, the upper 5% point of the chi-squared distribution with 4 degrees of freedom, ✓✓
there is insufficient evidence at the 5% significance level to reject Model A in favour of Model B. ✓✓
(page 168 of the Tables.)
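The arithmetic in this answer can be reproduced with a short Python sketch. The helper `chi2_sf_even_df` is an assumption of this sketch: it uses the closed-form upper tail of the chi-squared distribution, which is only valid for an even number of degrees of freedom (4 here). The critical value 9.488 is the one quoted from the Tables.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with even df.

    For df = 2m the tail has the closed form
    exp(-x/2) * sum_{k=0}^{m-1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this shortcut only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half**k / math.factorial(k) for k in range(df // 2))


dev_a, params_a = 17.80, 6     # Model A (fewer parameters)
dev_b, params_b = 11.08, 10    # Model B (more parameters)

statistic = dev_a - dev_b              # change in scaled deviance
df = params_b - params_a               # extra parameters in Model B
p_value = chi2_sf_even_df(statistic, df)
critical_5pct = 9.488                  # upper 5% point of chi-squared(4), from the Tables

print(round(statistic, 2), df, round(p_value, 3))
print("significant" if statistic > critical_5pct else "not significant")
```

The p-value comes out above 5%, which agrees with the comparison against the tabulated critical value.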
Describe how you will apply “F statistics” to analyse the significance of the explanatory variables used in a model. [3.25]
In cases where the scale parameter for the model is unknown, for example when using the gamma distribution, it has to be estimated.✓✓
The estimate of the scale parameter follows a chi-squared distribution.✓✓
The ratio of the change in the deviance to the scale parameter estimate follows an F distribution ✓✓, since an F distribution arises as the ratio of two chi-squared variables, each divided by its degrees of freedom ✓:
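$\dfrac{(D_1 - D_2) / (df_1 - df_2)}{D_2 / df_2} \sim F_{df_1 - df_2,\, df_2}$ ✓✓✓✓
where $D_i$ and $df_i$ are the deviance and the degrees of freedom of Model $i$, Model 1 is the smaller model, and $D_2 / df_2$ is the estimate of the scale parameter.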
Note that the models need to be nested for this result to be valid.✓✓
Suppose Model C and Model D are nested models with 8 and 16 parameters respectively, and have been fitted to a set of 50 observations. The deviance for Model C is 40.89 and the deviance for Model D is 26.40. The scale parameter is unknown.
Explain whether Model D is a significant improvement on Model C. [3.25]
Question 18.11
The difference in the deviance is 14.49. ✓✓
The difference in the number of degrees of freedom is (50 – 8) – (50 – 16) = 8. ✓✓
The number of degrees of freedom in Model D is 34. ✓✓
So the value of the test statistic is (14.49 / 8) / (26.40 / 34) = 2.33. ✓✓
From page 172 of the Tables, the upper 5% point of F(8,34) is 2.225. ✓✓
Since our test statistic exceeds this value✓✓, we reject Model C in favour of Model D. ✓
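The calculation can be checked with a minimal Python sketch. It assumes the scale parameter is estimated as the deviance of Model D divided by its degrees of freedom, which reproduces the 2.33 quoted in the card. The critical value 2.225 is taken from the card rather than computed.

```python
n_obs = 50
dev_c, params_c = 40.89, 8     # Model C
dev_d, params_d = 26.40, 16    # Model D

df_c = n_obs - params_c        # 42
df_d = n_obs - params_d        # 34

numerator = (dev_c - dev_d) / (df_c - df_d)   # change in deviance per extra parameter
denominator = dev_d / df_d                    # assumed scale parameter estimate (Model D)
f_stat = numerator / denominator

critical_5pct = 2.225          # upper 5% point of F(8, 34), as quoted from the Tables
print(round(f_stat, 2))
print("reject Model C" if f_stat > critical_5pct else "do not reject Model C")
```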
Describe how you will apply the Akaike Information Criterion (AIC) to analyse the significance of the explanatory variables used in a model. [3]
In cases where models are not nested, the AIC can be used to compare them.✓✓
The AIC for a model is calculated as: −2 × log-likelihood + 2 × number of parameters.✓✓
The AIC looks at the trade-off of the likelihood of a model against the number of parameters✓✓: the lower the AIC, the better the model.✓
For example, if two models fit the data equally well in terms of the log-likelihood ✓✓, then the model with fewer parameters is the more parsimonious✓✓, ie simpler, (and therefore “better”).✓
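A small Python sketch of the comparison follows. The model names, log-likelihoods and parameter counts are hypothetical values chosen only to show the trade-off.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 x log-likelihood + 2 x number of parameters."""
    return -2.0 * log_likelihood + 2.0 * n_params


# Hypothetical fitted models: (log-likelihood, number of parameters).
candidates = {
    "model_p": (-120.0, 6),
    "model_q": (-120.0, 9),   # same fit as model_p, but more parameters
    "model_r": (-115.0, 9),   # better fit, more parameters
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print(best)
```

With equal log-likelihoods, model_p beats model_q because it has fewer parameters. In this made-up example, model_r has the lowest AIC because its improvement in log-likelihood outweighs the penalty for its three extra parameters.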
Define the term “generalised linear model” (GLM). [2.75]
A generalised linear model (GLM) is a flexible generalisation of linear regression.✓✓
Generalised linear models are used to assess and quantify the relationship between a response variable and a set of possible explanatory variables.✓✓
For example, a GLM can be used to model the behaviour of a random variable✓ that is believed to depend on the values of several characteristics, eg age, gender and chronic condition✓✓.
These kinds of models can be used in a number of applications for private medical insurance✓ including risk modelling, pricing, financial projections and overall modelling of the business.✓✓✓✓
Q&A 3.10 [5]
3.841 & 53.38
Q&A 3.11 [3]
2.077
Define the term “interaction”. [1]
An interaction exists when the effect of one factor varies, depending on the levels of another factor.
[1⁄2]
Interactions would be used where the pattern in the response variable (eg frequency or severity) is better modelled by including extra parameters for each combination of two or more factors.
[1⁄2]
Provide an example of the effect of interaction terms. [2]
Older individuals may have an x% higher risk than younger individuals ✓✓ and individuals with chronic conditions may have a y% higher risk than individuals without chronic conditions✓✓.
However, the combination of being older with a chronic condition may result in a much higher risk than [(1 + x/100) × (1 + y/100) − 1] × 100%. ✓✓
In this case, the effect of age depends on chronic conditions and the effect of a chronic condition depends on age.✓✓
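To make the formula concrete, here is a short Python sketch. The values of x and y and the "observed" combined loading are hypothetical figures chosen for illustration, and `combined_loading_pct` is a name introduced here.

```python
def combined_loading_pct(x_pct: float, y_pct: float) -> float:
    """Combined % loading if the two effects simply multiply (no interaction)."""
    return ((1 + x_pct / 100) * (1 + y_pct / 100) - 1) * 100


# Hypothetical figures, for illustration only.
x_pct = 20.0          # older vs younger
y_pct = 50.0          # chronic condition vs none
observed_pct = 130.0  # assumed observed loading for older individuals with a chronic condition

expected_pct = combined_loading_pct(x_pct, y_pct)
excess_pct = observed_pct - expected_pct   # part an interaction term would have to capture
print(round(expected_pct, 1), round(excess_pct, 1))
```

With these made-up numbers, the two main effects alone imply a loading of about 80%, so the remaining excess is what an interaction term would need to explain.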
Define the term “one-way analysis”. [2.5]
Prior to the use of GLMs in pricing✓, it was common to look at the effect on frequency and severity of each rating factor separately.✓✓ This is known as one-way analysis✓.
A one-way analysis ignores correlations and interaction effects between variables✓✓, for example age and disease, age and family size, or maternity and gender✓✓. As a result, the model may underestimate or double count the effects of variables.✓✓
You work for a health insurance company and specialise in generalised linear models (GLMs). A colleague in the pricing department has overheard you talking about residuals and is interested to learn a bit more about them.
Describe the following measures that can be used to check that a GLM is appropriate for the data given. You are not required to produce mathematical formulae.
(a) deviance residuals [2.5]
(Q&A 3.21)
Deviance residuals
A deviance residual, for a given observation, is a measure of the difference between the observed value and the value fitted by the model. [1⁄2]
The deviance residual considers the square root of each observation’s contribution to the deviance, …[1⁄2]
… adjusted for the direction in which the raw residual (the difference between the observed value and the fitted value) acts. [1⁄2]
The deviance measure corrects for the skewness of the distributions used, … [1⁄2]
… which means that the deviance residuals would be expected to be more closely normally distributed than the raw residuals. [1⁄2]
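Although the question does not ask for formulae, a small sketch may help fix the idea. It assumes a Poisson response, which is a choice made here and not stated in the source, and uses hypothetical observed and fitted values.

```python
import math


def poisson_deviance_residual(y: float, mu: float) -> float:
    """Signed square root of one observation's contribution to the Poisson deviance."""
    if mu <= 0 or y < 0:
        raise ValueError("need mu > 0 and y >= 0")
    # y * log(y / mu) is taken as 0 when y == 0 (its limiting value).
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    unit_deviance = 2.0 * (log_term - (y - mu))
    # Guard against tiny negative values caused by rounding.
    return math.copysign(math.sqrt(max(unit_deviance, 0.0)), y - mu)


# Hypothetical observed claim counts and fitted means.
observed = [0, 1, 3, 7]
fitted = [0.8, 1.0, 2.5, 4.0]
residuals = [poisson_deviance_residual(y, mu) for y, mu in zip(observed, fitted)]
print([round(r, 3) for r in residuals])
```

The sign of each residual follows the raw residual (observed minus fitted), and an observation that matches its fitted value exactly contributes a residual of zero.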
You work for a health insurance company and specialise in generalised linear models (GLMs). A colleague in the pricing department has overheard you talking about residuals and is interested to learn a bit more about them.
Describe the following measures that can be used to check that a GLM is appropriate for the data given. You are not required to produce mathematical formulae.
(a) Pearson residuals [1.5]
(Q&A 3.21)
Pearson residuals
A Pearson residual, for an individual observation, is the difference between the observed value and the fitted value (ie the raw residual), …[1⁄2]
… adjusted for the standard deviation of the predicted value and the leverage of the observed response. [1⁄2]
This measure does not adjust for the shape of the distribution. [1⁄2]
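For comparison, a Pearson residual can be sketched in the same way. This again assumes a Poisson response (variance equal to the mean), which is an assumption of the sketch. The leverage value is hypothetical and would in practice come from the fitted model.

```python
import math


def poisson_pearson_residual(y: float, mu: float, leverage: float = 0.0) -> float:
    """Raw residual scaled by the fitted standard deviation and, optionally, leverage."""
    if mu <= 0 or not 0.0 <= leverage < 1.0:
        raise ValueError("need mu > 0 and 0 <= leverage < 1")
    std_dev = math.sqrt(mu)                     # Poisson: variance equals the mean
    return (y - mu) / (std_dev * math.sqrt(1.0 - leverage))


# Hypothetical observation: 7 claims observed against a fitted mean of 4.
plain = poisson_pearson_residual(7, 4.0)
adjusted = poisson_pearson_residual(7, 4.0, leverage=0.19)
print(round(plain, 3), round(adjusted, 3))
```

Unlike the deviance residual, nothing here depends on the shape of the distribution beyond its variance, which is why the measure does not correct for skewness.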
You have plotted the deviance residuals from your model, to check that the distribution chosen for the response variable is appropriate.
(ii) Explain how you will determine from the residual plot whether or not your model is likely to be a good fit. [3]
Residual plot
The residual plot could be a scatter plot of deviance residuals against the fitted values. [1⁄2]
If the distribution is appropriate for the data that are being modelled, the residual plot will have the following characteristics:
- the pattern of residuals will be symmetrical about the x-axis [1⁄2]
- the average residual will be zero, … [1⁄2]
… so there should be an equal number of points above zero and below zero on the graph [1⁄2]
- the range of residual values will be fairly constant across the width (the x-axis) of the fitted values. [1⁄2]
A residual plot where the range of residuals narrows or widens as the fitted value increases, or where the range of residuals is not symmetrical about the x-axis, indicates that the model specification is poor. [1]
[Maximum 3]
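The visual checks above can be roughly approximated numerically. The sketch below is a crude heuristic, not a formal test: the function `residual_summary`, the split into two halves by fitted value and the sample data are all assumptions made for illustration.

```python
from statistics import mean


def residual_summary(fitted: list[float], residuals: list[float]) -> dict[str, float]:
    """Crude numerical stand-ins for what the eye looks for in a residual plot."""
    pairs = sorted(zip(fitted, residuals))            # order by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]                # residuals at smaller fitted values
    high = [r for _, r in pairs[half:]]               # residuals at larger fitted values
    return {
        "mean_residual": mean(residuals),
        "n_above": sum(r > 0 for r in residuals),
        "n_below": sum(r < 0 for r in residuals),
        "range_low": max(low) - min(low),
        "range_high": max(high) - min(high),
    }


# Hypothetical residuals whose spread widens as the fitted value grows.
fitted = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
residuals = [0.1, -0.2, 0.2, -0.1, 1.0, -1.5, 1.4, -0.9]
summary = residual_summary(fitted, residuals)
print(summary)
```

In this made-up data the mean residual is about zero and the points are balanced above and below zero, yet the range is much wider at larger fitted values, which is the widening pattern the card describes as a sign of poor model specification.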
What should we do if the residual checks suggest that our model is not a good fit to the data? [1]
Question 18.12
Solution 18.12
The model should be re-specified✓ by choosing a different statistical distribution✓ or a different linear predictor✓, link function✓ etc.