GLM 1 - Simple linear regression Flashcards
In a linear relationship, how is the value of an outcome variable Y approximated?
Y ≈ β0 + β1X.
Y = dependent variable
β0 = intercept
β1 = slope coefficient of X
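A minimal sketch in Python (NumPy) of generating data from such a relationship; the values β0 = 2, β1 = 0.5, the noise level, and the sample size are all assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

beta0, beta1 = 2.0, 0.5            # assumed illustrative intercept and slope
x = rng.uniform(0, 10, size=50)    # independent variable, treated as fixed
noise = rng.normal(0, 1, size=50)  # deviations from the exact line

# Y is approximately a linear function of X: y = beta0 + beta1*x + noise
y = beta0 + beta1 * x + noise
```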
What is the intercept, β0 (often labelled the constant)?
The expected mean value of Y when X = 0.
What is β1?
The slope: how Y changes per unit increase in X. When X is increased by one unit, Y is expected to change by β1.
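For example, with β1 = 0.5, increasing X by one unit increases the expected value of Y by 0.5; with β1 = −2, it decreases it by 2.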
What is the terminology of a linear regression?
- We say that Y is regressed on X.
- We are expressing Y in terms of X.
- The dependent variable, Y, depends on X.
- The independent variable, X, doesn't depend on anything.
How are the coefficients or parameters β0 and β1 estimated?
Using the available data:
(x1, y1), (x2, y2), . . . , (xn, yn); that is, a sample of n data points.
How are the estimates of parameters written?
The estimates of the parameters are written with a circumflex or hat: ^
We then write our linear equation with these estimated coefficients: y^i = β^0 + β^1 xi.
Only the dependent variable gets a hat.
The independent variable (xi) does not have a hat, as it is treated as fixed.
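A minimal sketch of computing the fitted values, assuming the same illustrative data as above and using np.polyfit (degree 1) to obtain the estimated coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

# np.polyfit returns coefficients highest degree first: (slope, intercept)
beta1_hat, beta0_hat = np.polyfit(x, y, deg=1)

# Fitted values: y^i = beta0_hat + beta1_hat * xi (hats on estimates only)
y_hat = beta0_hat + beta1_hat * x
```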
β0 and β1 are independent of each other
True or false
True
What does the circumflex allow us to differentiate between?
True value and estimated value
What happens if we add a value to β0?
This would only affect y, not β1xi; β0 can change independently of β1.
What is y^i?
The predictions, or predicted values, of the outcomes y, given the independent variables, the xi's.
What are the differences between the predicted values, y^i's, and the observed values, yi's?
The residuals:
e^i := yi − y^i.
That is, these are the values that remain after we have removed the predictions from the observations.
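Continuing the same illustrative sketch, the residuals are simply the element-wise differences between observations and predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

beta1_hat, beta0_hat = np.polyfit(x, y, deg=1)
y_hat = beta0_hat + beta1_hat * x

# Residuals: what remains after removing predictions from observations
residuals = y - y_hat
```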
Why are the residuals, e^i ’s, also equipped with a hat?
Because these are also estimated values.
Why are the black error bars vertical, and not perpendicular to the line in blue?
Because residuals are measured in the y direction: each residual is the amount that must be added to the prediction y^i to recover the observed value yi, so the error bars run vertically rather than perpendicular to the line.
How can the optimal value of the parameters, β0 and β1 be found?
By considering the sum of the squares of the residuals:
RSS := (e^1)² + (e^2)² + . . . + (e^n)²
Why do we square residuals?
Residuals are defined as a subtraction of the predicted values from the observed values, so we can rewrite the RSS as RSS = (y1 − y^1)² + . . . + (yn − y^n)². Some residuals are negative and some are positive, so we square them to ensure that each one makes a positive contribution to the RSS and that positive and negative residuals cannot cancel out.
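A sketch of the RSS on the same illustrative data; squaring guarantees every term is non-negative:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

beta1_hat, beta0_hat = np.polyfit(x, y, deg=1)
residuals = y - (beta0_hat + beta1_hat * x)

# Squared residuals are all non-negative, so positive and negative
# residuals cannot cancel each other out in the sum
rss = np.sum(residuals ** 2)
```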
What do the optimal values of β0 and β1 achieve?
They minimize the distance of the fitted line from all the data points.
What is RSS a function of and why?
β0 and β1, because all of the residuals depend upon the values of β0 and β1. Thus, we may write the RSS as depending on these quantities:
RSS(β^0, β^1) = (e^1)² + (e^2)² + . . . + (e^n)²
The value taken by the RSS can therefore be minimized for some values of β^0 and β^1.
How do we write this?
(β^0, β^1) := argmin RSS(β0, β1),
where argmin RSS means the argument that minimizes the RSS, and where the hats on the right-hand side of the RSS have been suppressed.
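For simple linear regression this argmin has a well-known closed form, β^1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)² and β^0 = ȳ − β^1 x̄; a sketch checking it against np.polyfit on the same illustrative data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

# Closed-form minimizers of RSS(beta0, beta1)
x_bar, y_bar = x.mean(), y.mean()
beta1_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
beta0_hat = y_bar - beta1_hat * x_bar

# Agrees with the least-squares fit computed by np.polyfit
assert np.allclose([beta1_hat, beta0_hat], np.polyfit(x, y, deg=1))
```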
The RSS is a function of the parameters β0 and β1 therefore…
it can take a range of values across a two-dimensional landscape
How can we assess the accuracy/goodness of fit of our model?
Using the previously minimized value of the RSS
What is one way of quantifying accuracy of model?
Compare the RSS with the total sum of squares (TSS), which can be viewed as the RSS of the null model, since the null model is the model with only the y-intercept.
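A sketch comparing the two quantities on the same illustrative data; the TSS is the RSS of the intercept-only (null) model, whose least-squares intercept is the mean of y:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

beta1_hat, beta0_hat = np.polyfit(x, y, deg=1)
rss = np.sum((y - (beta0_hat + beta1_hat * x)) ** 2)

# TSS: RSS of the null model, which predicts y-bar for every observation
tss = np.sum((y - y.mean()) ** 2)

# The fitted line can only improve on the null model, so rss <= tss
assert rss <= tss
```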
What is R2 also known as?
Coefficient of determination
What does R2 measure?
Proportion of variance in the dependent variable explained by the independent variable.
For simple regression, the R2 can be shown to be equivalent to what?
The square of the correlation between the IV and the DV. That is, R² = Cor(Y, X)².
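A sketch verifying this equivalence numerically on the same illustrative data, using the standard definition R² = 1 − RSS/TSS:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=50)

beta1_hat, beta0_hat = np.polyfit(x, y, deg=1)
rss = np.sum((y - (beta0_hat + beta1_hat * x)) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r_squared = 1 - rss / tss

# For simple regression, R^2 equals the squared correlation of X and Y
assert np.isclose(r_squared, np.corrcoef(x, y)[0, 1] ** 2)
```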