Linear regression Flashcards

MAS213 > Linear regression > Flashcards

Flashcards in Linear regression Deck (23):
1

formula for βˆ1

βˆ1 = Sxy/Sxx, where Sxy = Σ(xi − x̄)(yi − ȳ) and Sxx = Σ(xi − x̄)²

2

formula for βˆ0

βˆ0 = ȳ − βˆ1 x̄ (the mean of y minus βˆ1 times the mean of x)
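A minimal numerical sketch of these two scalar formulas, using made-up toy data (not from the deck):

```python
# Least squares estimates for simple linear regression via the
# scalar formulas: beta1_hat = Sxy/Sxx and beta0_hat = ybar - beta1_hat * xbar.
# Toy data, chosen so the points lie exactly on y = 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

xbar = sum(x) / len(x)
ybar = sum(y) / len(y)

# Sxy = sum of (xi - xbar)(yi - ybar); Sxx = sum of (xi - xbar)^2
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Sxx = sum((xi - xbar) ** 2 for xi in x)

beta1 = Sxy / Sxx            # slope estimate
beta0 = ybar - beta1 * xbar  # intercept estimate
print(beta0, beta1)
```

Since the toy points sit exactly on y = 2x, the estimates recover slope 2 and intercept 0.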

3

express yi = β0 + β1xi + εi
in matrix notation

y = Xβ + ε

4

When is a model linear?

If the model is a linear function of the parameters (the β terms); the x terms themselves may enter non-linearly, e.g. y = β0 + β1x² is still a linear model.

5

How do we obtain the least squares estimate of a parameter?

We minimise the function R, the sum of the squared error terms:

R(β0, β1) = Σ(yi − β0 − β1xi)² = Σεi²

1. We differentiate R and set the first partial derivatives equal to 0.

2. We solve the resulting pair of equations (the normal equations) simultaneously for βˆ0 and βˆ1.

6

∂(aᵀz)/∂z =

a

7

∂(zᵀMz)/∂z =

(M + Mᵀ)z

8

εᵀε =

(y − Xβ)ᵀ(y − Xβ) = R(β0, β1)

9

βˆ (the vector of least squares estimates) =

(XᵀX)^(−1) Xᵀ y
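A sketch of the matrix formula in numpy, on the same kind of toy data as above (in practice one solves the normal equations rather than forming the inverse explicitly):

```python
import numpy as np

# Toy data (made up for illustration): points lying exactly on y = 2x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Design matrix X: a column of ones (for beta0) and the x values (for beta1).
X = np.column_stack([np.ones_like(x), x])

# beta_hat = (X'X)^(-1) X'y, computed by solving the normal equations
# (X'X) beta_hat = X'y instead of inverting X'X.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # [beta0_hat, beta1_hat]
```

This reproduces the scalar estimates βˆ0 = ȳ − βˆ1x̄ and βˆ1 = Sxy/Sxx, which is a useful sanity check that the two formulations agree.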

10

Why is the matrix formula for calculating the least square more useful than calculating the individual values of the parameters?

It applies to any linear model, not just one with two parameters or a single independent variable.

11

What is the independent variable and what is the dependent variable?

x = independent, y = dependent

12

Create a linear model in R with the following data:

height (cm) 175.8, 180.3, 182.9, 185.4
weight (kg) 63.5, 69.1, 82.6, 76.2

1. Create vectors from the two fields:

height <- c(175.8, 180.3, 182.9, 185.4)
weight <- c(63.5, 69.1, 82.6, 76.2)

2. Regress weight on height:

lm(weight ~ height)

13

What does this call tell us?

lm(formula = weight ~ height, data = measurements)
Coefficients:
(Intercept) height
-230.455 1.675

β-hat 0 = -230.455 and β-hat 1 = 1.675

14

y = βˆ0 + βˆ1x is said to be

the estimated or fitted regression line of y on x

15

yˆi = βˆ0 + βˆ1xi is said to be

the ith fitted/predicted value of y

16

ei = yi − yˆi is said to be

the ith residual

17

RSS = Σ(i=1 to n) ei² is said to be

the residual sum of squares

18

How can we estimate the error variance using the RSS?

σˆ2 = RSS/(n − p), where n is the number of observations and p is the number of parameters
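A short sketch putting the RSS and σˆ2 cards together, assuming made-up toy data whose points do not lie exactly on a line (so RSS > 0):

```python
import numpy as np

# Toy data (made up): close to, but not exactly on, a straight line.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Fit by least squares via the normal equations.
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_fitted = X @ beta_hat        # fitted values: beta0_hat + beta1_hat * xi
residuals = y - y_fitted       # ei = yi - yi_fitted
rss = np.sum(residuals ** 2)   # residual sum of squares

n, p = len(y), 2               # 4 observations, 2 parameters (beta0, beta1)
sigma2_hat = rss / (n - p)     # estimate of the error variance
print(rss, sigma2_hat)
```

Here n − p = 2 is the residual degrees of freedom from the next card; dividing RSS by it rather than by n corrects for the parameters already estimated from the data.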

19

What does σˆ2 actually represent?

The amount of variation in the data about the true regression line

20

The ____ the RSS the _____ the observations are to the fitted regression line

1. smaller
2. closer

21

What is (n − p)?

The residual degrees of freedom

22

How does the RSS vary as more parameters are added?

It either remains constant or decreases. (Adding too many parameters can over-fit the data.)

23

distribution of the error terms

εi ~ N(0, σ²), independently for each i