Flashcards in Linear regression Deck (23):

1

## formula for β-hat 1

### βˆ1 = Sxy/Sxx, where Sxy = Σ(xi − x̄)(yi − ȳ) and Sxx = Σ(xi − x̄)²

2

## formula for β-hat 0

### βˆ0 = ȳ − βˆ1x̄ (the mean of y minus βˆ1 times the mean of x)
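As a quick numerical check of these two formulas, here is a sketch in Python/NumPy (rather than the deck's R); the height/weight numbers are the ones used in the deck's lm example:

```python
import numpy as np

# Height/weight data from the deck's R example
x = np.array([175.8, 180.3, 182.9, 185.4])  # height (cm), independent variable
y = np.array([63.5, 69.1, 82.6, 76.2])      # weight (kg), dependent variable

# Sxy = sum of (xi - xbar)(yi - ybar), Sxx = sum of (xi - xbar)^2
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
Sxx = np.sum((x - x.mean()) ** 2)

beta1_hat = Sxy / Sxx                        # slope estimate
beta0_hat = y.mean() - beta1_hat * x.mean()  # intercept estimate

print(round(beta0_hat, 3), round(beta1_hat, 3))  # -230.455 1.675
```

These match the coefficients shown in the lm output card further down.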

3

##
Express yi = β0 + β1xi + εi

in matrix notation

### y = Xβ + ε

4

## When is a model linear?

### If it is a linear function of the parameters (the β terms)

5

## How do we obtain the least squares estimate of a parameter?

###
We minimise the function R, the sum of the squared error terms:

R(β0, β1) = Σ(yi − β0 − β1xi)² = Σεi²

1. We differentiate R and set the first partial derivatives equal to 0.

2. We solve the resulting pair of equations simultaneously for βˆ0 and βˆ1.
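Written out, setting the partial derivatives to zero gives the normal equations (a standard derivation, consistent with the βˆ1 and βˆ0 formulas in the first two cards):

```latex
\frac{\partial R}{\partial \beta_0} = -2\sum_{i=1}^{n}(y_i - \beta_0 - \beta_1 x_i) = 0,
\qquad
\frac{\partial R}{\partial \beta_1} = -2\sum_{i=1}^{n}x_i\,(y_i - \beta_0 - \beta_1 x_i) = 0
```

Solving these simultaneously yields βˆ1 = Sxy/Sxx and βˆ0 = ȳ − βˆ1x̄.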

6

## ∂aTz/∂z =

### a

7

##
∂(zTMz)/∂z =

### (M + MT)z

8

## εTε =

### (y − Xβ)T(y − Xβ) = R(β0,β1)

9

## βˆ (in matrix form) =

### (XT X)^(-1)XT y
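The same estimates can be reproduced directly from this formula; a NumPy sketch, with the deck's height/weight data assumed. The column of 1s in X corresponds to the intercept β0:

```python
import numpy as np

# Assumed data: the deck's height/weight example
x = np.array([175.8, 180.3, 182.9, 185.4])
y = np.array([63.5, 69.1, 82.6, 76.2])

# Design matrix X: a column of 1s (for beta0) beside the x values
X = np.column_stack([np.ones_like(x), x])

# beta-hat = (X^T X)^(-1) X^T y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(np.round(beta_hat, 3))  # [beta0_hat, beta1_hat]
```

In practice np.linalg.solve(X.T @ X, X.T @ y) or np.linalg.lstsq is preferred numerically; the explicit inverse is used here only to mirror the formula.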

10

## Why is the matrix formula for calculating the least square more useful than calculating the individual values of the parameters?

### It applies to any linear model, not just those with two parameters or a single independent variable.

11

## What is the independent variable and what is the dependent variable?

### x = independent, y = dependent

12

##
Create a linear model in R with the following data:

height (cm) 175.8, 180.3, 182.9, 185.4

weight (kg) 63.5, 69.1, 82.6, 76.2

###
1. Create vectors height and weight from the two fields.

2. Call lm(weight ~ height); weight is the dependent variable, so it goes on the left of the ~.

13

##
What does this call tell us?

lm(formula = weight ~ height, data = measurements)

Coefficients:

(Intercept)       height
   -230.455        1.675

### β-hat 0 = -230.455 and β-hat 1 = 1.675

14

## y = βˆ0 + βˆ1x is said to be

### the estimated or fitted regression line of y on x

15

## yˆi = βˆ0 + βˆ1xi is said to be

### the ith fitted/predicted value of y

16

## ei = yi − yˆi is said to be

### the ith residual

17

## RSS = Σ(i=1 to n) ei^2 is said to be

### the residual sum of squares

18

## How can we estimate the error variance σ² using the RSS?

### σˆ2 = RSS/(n − p), where n is the number of observations and p is the number of parameters (take the square root of σˆ2 for the standard deviation)
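A sketch of this estimate in NumPy, again assuming the deck's height/weight data (so n = 4 and p = 2):

```python
import numpy as np

# Assumed data: the deck's height/weight example
x = np.array([175.8, 180.3, 182.9, 185.4])
y = np.array([63.5, 69.1, 82.6, 76.2])
X = np.column_stack([np.ones_like(x), x])

# Fit by least squares, then form the residuals
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
residuals = y - X @ beta_hat
RSS = np.sum(residuals ** 2)

n, p = X.shape               # 4 observations, 2 parameters
sigma2_hat = RSS / (n - p)   # estimate of the error variance
sigma_hat = np.sqrt(sigma2_hat)
```

Note that with an intercept in the model, the residuals sum to (numerically) zero.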

19

## What does σˆ2 actually represent?

### The amount of variation in the data about the true regression line

20

## The ____ the RSS the _____ the observations are to the fitted regression line

###
1. smaller

2. closer

21

## What is (n − p)?

### The residual degrees of freedom

22

## How does the RSS vary as more parameters are added?

### It either remains constant or it decreases. (Adding too many parameters can result in over-fitting the data.)
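This can be illustrated by comparing the RSS of two nested fits (a NumPy sketch with the deck's assumed height/weight data): adding a quadratic term can only keep the RSS the same or lower it.

```python
import numpy as np

x = np.array([175.8, 180.3, 182.9, 185.4])
y = np.array([63.5, 69.1, 82.6, 76.2])

def rss(X, y):
    """Least-squares fit of y on the columns of X, then the residual sum of squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ beta
    return float(np.sum(r ** 2))

X2 = np.column_stack([np.ones_like(x), x])          # p = 2: intercept + slope
X3 = np.column_stack([np.ones_like(x), x, x ** 2])  # p = 3: adds a quadratic term

print(rss(X2, y), rss(X3, y))  # the second value is never larger
```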

23