GLM 3- Assumptions Flashcards

1
Q

What is the first assumption of GLM?

A

Response-predictor Linearity

2
Q

How can the first assumption of GLM be diagnosed?

A

Residual plots allow us to identify non-linearity.

They plot the residuals, e^i's, against the fitted values, y^i's.

The plot should show no discernible pattern: a roughly flat trend line indicates that the linearity assumption holds, whereas a curved trend indicates a non-linear relationship.
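As a sketch, this diagnostic can be reproduced in R on simulated data (all names and numbers here are illustrative, not from the course):

```r
# Simulate a genuinely linear relationship and fit it
set.seed(1)
x <- runif(100, 0, 10)
y <- 2 + 3 * x + rnorm(100)
fit <- lm(y ~ x)

# Residuals-vs-fitted plot: points should scatter with no pattern
plot(fitted(fit), residuals(fit),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)  # reference line at zero
```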

3
Q

How can the first assumption of GLM be remedied?

A

If the residual plot indicates that there is a non-linear relationship, one can either:

Transform the predictors, e.g. log(X) or the square root of X.

Use polynomial regression, by including X^2, X^3, for instance.
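A minimal R sketch of both remedies, on simulated data with a quadratic truth (names and coefficients are illustrative):

```r
# Simulate a non-linear (quadratic) relationship
set.seed(1)
x <- runif(100, 1, 10)
y <- 1 + 2 * x^2 + rnorm(100, sd = 5)

fit_lin  <- lm(y ~ x)            # misspecified linear fit
fit_poly <- lm(y ~ poly(x, 2))   # polynomial regression: includes X and X^2
fit_log  <- lm(y ~ log(x))       # predictor transformation

# The polynomial fit explains more variance than the misspecified linear one
summary(fit_poly)$r.squared > summary(fit_lin)$r.squared
```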

4
Q

What is assumption 2 of GLM?

A

Constant Variance of Errors

5
Q

How can the second assumption of GLM be diagnosed?

A

Residual plots can, again, enable us to assess whether the variances of the error terms are constant.

The error terms are assumed to be homoscedastic, that is, to have identical variance across the levels of the fitted values, y^i's.

Consequently, the spread of the residuals should be roughly constant across the levels of fitted values; a funnel shape indicates heteroscedasticity.

6
Q

How can the second assumption of GLM be remedied?

A

We can transform the response, Y, by taking log(Y) or the square root of Y. If such transformations do not work or are not possible, report the violation in your analysis.

We can exploit the source of variability in the responses, if known. The yi's may be aggregates with associated variances, σi^2. In such cases, we can use weighted least squares (WLS).
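A minimal R sketch of WLS, assuming the per-observation variances are known (here they are simulated purely for illustration):

```r
# Simulate responses whose error variance differs per observation
set.seed(1)
x      <- 1:50
sigma2 <- runif(50, 0.5, 4)                 # known per-observation variances
y      <- 5 + 2 * x + rnorm(50, sd = sqrt(sigma2))

# Weighted least squares: weight each observation by its inverse variance
fit_wls <- lm(y ~ x, weights = 1 / sigma2)
coef(fit_wls)
```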

7
Q

What is the third assumption of GLM?

A

Non-correlation of Errors

8
Q

How can the third assumption of GLM be diagnosed?

A

Serial residual plots allow us to identify correlation between the errors.

They plot the residuals, e^i's, against the observation IDs.

The plot should not indicate long-term dependency between sequences of residuals; such dependency would violate the assumption of independent observations.
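A minimal R sketch of a serial residual plot, plus the residual autocorrelation function as a complementary check (simulated data, independent errors by construction):

```r
# Simulate a model with independent errors
set.seed(1)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100)
fit <- lm(y ~ x)

# Residuals in observation order: look for runs or drifts
plot(residuals(fit), type = "b",
     xlab = "Observation ID", ylab = "Residual")

# Autocorrelation of residuals: spikes beyond the bands suggest correlation
acf(residuals(fit))
```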

9
Q

How can the third assumption of GLM be remedied?

A

Typically, non-independence may be present due to some structure in your data, due to groups or time.

Model the group structure in your data, using a mixed-effects model, or hierarchical model.

Model the time-lag structure in your data, again using a mixed-effects model, or hierarchical model.

10
Q

What is assumption 4 of GLM?

A

Detecting Outliers

11
Q

How can assumption 4 of GLM be diagnosed?

A

An outlier is a point that lies far from the value predicted by the model.

Outliers inflate the Residual Sum of Squares (RSS), which is used to compute R^2 and the confidence intervals for each parameter.

The studentized residual plot shows the values of the residuals, e^i's, divided by their standard errors.

12
Q

How can assumption 4 be remedied if violated?

A

Typically, the studentized residuals should not exceed 3 standard errors.

If a data point has a studentized residual of 3 or more in absolute value, you may consider removing it, especially if you suspect that the observation is faulty in some way.

However, care should be taken as the presence of an outlier may also indicate a deficiency in your model.
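A minimal R sketch of the rule above, with one deliberately planted outlier (all values are illustrative):

```r
# Simulate clean data, then corrupt one observation
set.seed(1)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)
y[10] <- y[10] + 15              # plant a gross outlier at observation 10
fit <- lm(y ~ x)

# Flag observations whose studentized residual exceeds 3 in absolute value
outliers <- which(abs(rstudent(fit)) > 3)
outliers                         # flags the planted observation
```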

13
Q

What is assumption 5 of GLM?

A

High-leverage Points

14
Q

How can assumption 5 of GLM be diagnosed?

A

A high-leverage point is a data point whose removal produces a substantially different set of parameter estimates.

The leverage, or hat-value, is a quantity that measures how unusual a data point is with respect to all the others.

We can plot the individual leverages against the values of the studentized residuals to identify points that are both outliers and high-leverage.
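A minimal R sketch of this diagnostic, with one simulated extreme-x point supplying the high leverage (names and values are illustrative):

```r
# Simulate data where the last point is far out in x, hence high leverage
set.seed(1)
x <- c(rnorm(49), 15)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)

# Hat-values measure leverage; plot them against studentized residuals
lev <- hatvalues(fit)
plot(lev, rstudent(fit),
     xlab = "Leverage (hat-value)", ylab = "Studentized residual")

which.max(lev)   # the extreme-x point has the largest leverage
```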

15
Q

How can assumption 5 of GLM be remedied?

A

There is no specific rule of thumb for detecting high leverage.

As for outliers, however, you may consider removing such a point, especially if you suspect that the observation is faulty in some way.

But, again, care should be taken, as a high-leverage point may also indicate a deficiency in your model.

16
Q

What is assumption 6 of GLM?

A

Multicollinearity

17
Q

How can assumption 6 of GLM be diagnosed?

A

Two predictors are said to be collinear if they are strongly correlated with each other.

Pairwise collinearity can be assessed by considering the correlation matrix of your predictors.

Multicollinearity, however, may occur in the absence of severe pairwise collinearity. In that case, we use the variance inflation factor (VIF),

VIF(β^j) := 1 / (1 − R^2_{Xj|X−j}),

where R^2_{Xj|X−j} is the R^2 from regressing Xj on all the other predictors.

If the variance in Xj is strongly explained by all the other predictors in the model, then R^2_{Xj|X−j} will be close to 1, and therefore the VIF will be very large.
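The VIF can be computed directly from this auxiliary regression; a minimal R sketch on simulated collinear data (coefficients are illustrative):

```r
# Simulate a predictor x1 that is nearly determined by x2 and x3
set.seed(1)
x2 <- rnorm(100)
x3 <- rnorm(100)
x1 <- 0.9 * x2 + 0.9 * x3 + rnorm(100, sd = 0.3)

# Auxiliary regression of x1 on the other predictors
r2  <- summary(lm(x1 ~ x2 + x3))$r.squared
vif <- 1 / (1 - r2)   # VIF(beta_1) = 1 / (1 - R^2_{X1 | X-1})
vif                   # large, since x1 is strongly explained by x2 and x3
```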

18
Q

If violated how can assumption 6 of GLM be remedied?

A

If one variable, say Xj , exhibits a VIF larger than 5, we may resort to one of the following:

1) Drop that variable.

2) Combine the variables that are collinear, by taking the average of the
standardized variables.

3) Create a latent variable that combines the multicollinear variables.

Note that removing a collinear variable will not compromise the overall fit of your model, because most of the information contained in that variable is also contained in other variables.

19
Q

What is the default Plotting for lm Objects in R?

A

Every lm object can be plotted, using one of the following:

plot(fit)
plot(lm(y ~ x1 + x2))
plot(fit <- lm(y ~ x1 + x2))

Alternatively, we may select which plot we need as follows:

plot(fit, which=c(1,4))

20
Q

How do we know the options of the plot function in R?

A

?plot.lm

21
Q
  1. How can a residual plot be created in R for assumption 1?
  2. What does this do?
A
  1. plot(fit, which=1)
  2. Checks linearity of the relationship (residuals vs. fitted values).
22
Q
  1. How can a QQ-plot of Standardized Residuals be fitted in R to test assumption 2?
  2. What does this do?
A
  1. plot(fit, which=2)
  2. Check normality of residuals.
23
Q
  1. How can a Scale-location Plot be fitted in R to test assumption 3?
  2. What does this do?
A
  1. plot(fit, which=3)
  2. Check homoscedasticity (equal variance)
24
Q
  1. How can a Cook’s Distances plot be fitted in R to test assumption 4?
  2. What does this do?
A
  1. plot(fit, which=4)
  2. Influential (or leverage) cases.
25
Q
  1. How can a Residuals vs Leverages plot be fitted in R to test assumption 5?
  2. What does this do?
A
  1. plot(fit, which=5)
  2. Checks for cases that are both influential and outliers.
26
Q
  1. How can a Cook’s Distances vs Leverages plot be fitted in R to test assumption 6?
  2. What does this do?
A
  1. plot(fit, which=6)
  2. Check influence and leverage.
27
Q

What is an estimator?

A

A function of the data, (y, X), and therefore can be denoted as follows:

β^ := β^(y, X).

This holds for every pair (y, X). Since the dependent variables are random, it follows that, if β^ is a function of y, then β^ is also a random quantity. Thus, ignoring X, we may write
β^ := β^(Y1,…,Yn).

28
Q

What are the statistical properties of OLS Estimators?

A

Unbiasedness. We say that β^ is an unbiased estimator of β, if the following condition holds, for every β ∈ R,

E[β^(Y1,…,Yn)|X] = β.

Minimal Variance. Among linear unbiased estimators, the OLS estimator, β^, has the smallest variance: for every other linear unbiased estimator, β~,

Var[β^(Y1,…,Yn)|X] ≤ Var[β~(Y1,…,Yn)|X].

Mean Squared Error (MSE). Finally, we will see that these two criteria can be combined by considering the MSE of β^ with respect to β,

MSE(β, β^) := E[(β − β^)^2|X].

29
Q

What is the MSE?

A

We assume that β represents a single parameter. Then,

E[(β^ − β)^2|X] = E[(β^ − E[β^|X] + E[β^|X] − β)^2|X] = Var[β^|X] + (E[β^|X] − β)^2,

where the squared bias of β^ is defined as b^2(β^) := (E[β^|X] − β)^2.

Thus, we obtain the following decomposition,
MSE(β, β^) = Var[β^|X] (variance) + b^2(β^) (squared bias).

The MSE is a theoretical quantity, since it depends on the unknown true parameter β.
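Although theoretical, the decomposition can be checked by Monte Carlo; a minimal R sketch using a deliberately biased (shrunken) estimator of a mean, with illustrative values:

```r
# Monte Carlo check that MSE = variance + squared bias
set.seed(1)
beta <- 2   # true parameter

# A shrunken sample mean: biased (towards 0) but lower-variance estimator
est <- replicate(1e5, 0.9 * mean(rnorm(20, mean = beta)))

mse      <- mean((est - beta)^2)   # empirical MSE
variance <- var(est)               # empirical variance
bias2    <- (mean(est) - beta)^2   # empirical squared bias

c(mse = mse, var_plus_bias2 = variance + bias2)  # nearly identical
```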

30
Q

What is the Best Linear Unbiased Estimator (BLUE)?

A

An estimator β^ of a parameter β

i. If it is a linear function of the observed values y;

ii. If it is an unbiased estimator of β;

iii. And if it has minimal variance among all other linear unbiased estimators β~, such that,

Var[β^|X] ≤ Var[β~|X].

31
Q

What do the Gauss-Markov assumptions state?

What does this mean?

A

Zero-centered errors: E[ε|X] = 0.

Uncorrelated errors and homoscedasticity: Var[ε|X] = σ^2 I_n, where I_n is the n×n identity matrix.

Linearity of the mean function: y = f(β, X) is linear in β.

This means the OLS estimator, β^, is BLUE for β.

32
Q

If we want to use MSE as a criterion what must we recall?

A

The standard decomposition

MSE(β,β^) = Var[β^|X] + b2(β^).

33
Q

What is MSE in statistics?

A

The mean squared error (MSE) is a theoretical object, dependent on (i) the true parameter, and (ii) the true model of Y .

It is defined as the integrated squared distance between the true parameter, β, and the estimator, β^, conditional on X; such that:

MSE(β, β^) := E[(β − β^)^2|X].

34
Q

What does MSE depend on in machine learning?

A

The mean squared error (MSE) depends on the joint model of Y and X.

35
Q

How is MSE defined in machine learning?

A

As the integrated squared distance between the observed outcome, y, and the predicted outcome, f^(X); such that:

MSE(f^) := E[(y − f^(X))^2].

36
Q

What are models expressed in terms of, given that dependent variables are treated as random and independent variables are treated as fixed?

A

Conditional expectations, E[Y |X].

37
Q

How are dependent variables treated?

A

Random

38
Q

How are independent variables treated?

A

Fixed

39
Q

In which variables is measurement error assumed to occur?

A

Yi ’s.