Review of Linear Regression Flashcards

1
Q

In the equation for a linear regression model, what does each term represent?

A
In the model y = Xβ + ε:
  • y is the vector of responses
  • the x’s are the predictors making up X, the data matrix
  • β is the vector of parameters
  • ε is the vector of independent, identically distributed random errors, with ε_i ~ N(0, σ^2)
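
A minimal numpy sketch of these pieces (the sample size, β and σ here are made-up illustrative values):

    import numpy as np

    rng = np.random.default_rng(0)
    n, p = 100, 3                              # observations, predictors
    X = np.column_stack([np.ones(n),           # intercept column
                         rng.normal(size=(n, p))])
    beta = np.array([1.0, 2.0, -0.5, 0.3])     # parameter vector (length p + 1)
    sigma = 1.5
    eps = rng.normal(0.0, sigma, size=n)       # iid N(0, sigma^2) errors
    y = X @ beta + eps                         # vector of responses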
2
Q

Outline the concept of a simple linear model

A

y_i = β_0 + β_1 x_i + ε_i

If the model is linear then µ_i = E[y_i] = x_i’β is a linear combination of the predictors, i.e. linear in the parameters β.
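
Note that “linear” means linear in the parameters, not in x: for example y_i = β_0 + β_1 x_i + β_2 x_i^2 + ε_i is still a linear model, since E[y_i] = x_i’β with x_i = (1, x_i, x_i^2)’, whereas y_i = β_0 e^{β_1 x_i} + ε_i is not.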

3
Q

What’s our MLE for estimating our parameters?

A

Under normal errors, maximising the likelihood is equivalent to minimising the sum of squares ∑ ( y_i - x_i’β )^2 with respect to β.

Working through, we end up with ^β = (X’X)^-1 X’ y, and the fitted values are ^y = X^β. Moreover,

^β ~ N ( β, σ^2 (X’X)^-1 )
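
A short numpy check of this closed form on simulated data (the simulated β and σ are arbitrary; in practice a library routine such as numpy’s lstsq does the same job):

    import numpy as np

    rng = np.random.default_rng(1)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    beta_true = np.array([0.5, 2.0, -1.0])
    y = X @ beta_true + rng.normal(0.0, 1.0, size=n)

    # ^beta = (X'X)^-1 X'y -- solve the normal equations rather than invert
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    y_hat = X @ beta_hat                               # fitted values
    assert np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])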

4
Q

Define residuals

A

e_i = y_i - ^y_i

This is the difference between the observed and fitted values. A linear model is valid only if all our assumptions hold, which we check by examining the residuals: they should behave like iid draws from N(0, σ^2), i.e. show no pattern against the fitted values and look approximately normal.
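
A sketch of the usual informal residual checks, on the same kind of simulated fit as above (the comments list the standard diagnostics, not an exhaustive test):

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ np.array([0.5, 2.0, -1.0]) + rng.normal(0.0, 1.0, size=n)

    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ beta_hat                      # residuals: observed minus fitted
    sigma2_hat = e @ e / (n - X.shape[1])     # estimate of sigma^2

    print(e.mean(), sigma2_hat)               # mean should be ~0
    # further checks: plot e against the fitted values (should show no pattern)
    # and a QQ-plot of e against normal quantiles (should be roughly straight)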

5
Q

What does it mean for a pdf/pmf to be in the exponential family?

A

f (y ; θ, ϕ) = exp { (yθ - b(θ)) / a(ϕ) + c (y, ϕ) }

Where θ and ϕ are the natural and dispersion parameters respectively, and a, b and c are known functions.
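
For example, the Poisson pmf is in this family: f(y; λ) = λ^y e^{-λ} / y! = exp{ y log λ - λ - log y! }, so the natural parameter is θ = log λ, with b(θ) = e^θ, a(ϕ) = 1 and c(y, ϕ) = -log y!.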

6
Q

State what is meant by Fisher Info

A

For a pdf/pmf in the exponential family it is i(θ) = b’’(θ) / a(ϕ), and it measures the amount of information the data carry about the parameter θ. Formally, it is the variance of the score, or equivalently the expected value of the observed information.
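
Continuing the Poisson example above, b(θ) = e^θ and a(ϕ) = 1, so i(θ) = b’’(θ)/a(ϕ) = e^θ = λ: the larger the mean, the more information each observation carries about θ.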

7
Q

What’s the point of Fisher Scoring?

A

We aim to fit a parametric distribution to data and find an estimate ^θ close to the true θ. We do this by maximising the likelihood, i.e. solving the score equation U(θ) = ∂/∂θ l(θ; y) = 0, which in the exponential family becomes n ( ȳ - b’(θ) ) / a(ϕ) = 0.

There is not always an analytical solution to this equation, in which case we use the Newton-Raphson method.
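
A minimal numeric sketch of this, using the Poisson example above, where the root of the score equation is known in closed form (^θ = log ȳ), so the iteration can be checked:

    import numpy as np

    rng = np.random.default_rng(3)
    y = rng.poisson(4.0, size=500)            # iid Poisson data, true theta = log 4
    ybar = y.mean()

    # For theta = log(lambda): U(theta) = n(ybar - e^theta), I(theta) = n e^theta
    theta = 0.0
    for _ in range(25):
        step = (ybar - np.exp(theta)) / np.exp(theta)   # I(theta)^-1 U(theta)
        theta += step
        if abs(step) < 1e-10:
            break

    print(theta, np.log(ybar))                # the two should agree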

8
Q

Outline the Fisher Expected Info Matrix

A

I(θ)_ij = - E [ ∂^2/∂θ_i∂θ_j l(θ; y) ]

Each entry is minus the expected value of the log-likelihood differentiated twice, with respect to θ_i and θ_j.
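
For example, in the normal linear model l(β; y) = -(y - Xβ)’(y - Xβ)/(2σ^2) + const, and differentiating twice with respect to β and negating gives I(β) = X’X/σ^2 (no expectation is needed since this is non-random); its inverse σ^2(X’X)^-1 is exactly the variance of ^β quoted earlier.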

9
Q

State the Fisher Scoring Algorithm. When does it equal a different Algorithm?

A

The FSA is θ^(k+1) = θ^(k) + I(θ^(k))^-1 U(θ^(k))

FSA equals NR when the observed information equals the expected information, as it does for the canonical exponential family; both then give θ^(k+1) = θ^(k) - ( b’(θ^(k)) - ȳ ) / b’’(θ^(k))
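
To see the equivalence in the single-parameter exponential family: U(θ) = n ( ȳ - b’(θ) ) / a(ϕ) and ∂U/∂θ = -n b’’(θ)/a(ϕ), which does not involve y, so the observed and expected information coincide, I(θ) = n b’’(θ)/a(ϕ). The NR step θ^(k+1) = θ^(k) - U(θ^(k)) / U’(θ^(k)) is then identical to the Fisher scoring step, and both reduce to the update above.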
