MLR Flashcards
(41 cards)
How is the best regression model fit found?
The best fit model is the one that minimises the sum of the squared differences between the observed Y data points and the values predicted by the model (least squares).
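A minimal Python sketch of the least-squares idea, using made-up data that is not from the cards. The helper names `fit_line` and `sum_squared_residuals` are hypothetical. It fits the line with the closed-form formulas, then shows that nudging the slope away from the fitted value makes the sum of squared differences larger.

```python
# Hypothetical illustration data (not from the flashcards).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through the data."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed Y and the line's predictions."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


intercept, slope = fit_line(x, y)
best_sse = sum_squared_residuals(x, y, intercept, slope)

# Any other line, e.g. one with a slightly steeper slope, has a larger total.
worse_sse = sum_squared_residuals(x, y, intercept, slope + 0.1)

print(f"intercept={intercept:.2f}, slope={slope:.2f}")
print(f"SSE at best fit={best_sse:.2f}, SSE with nudged slope={worse_sse:.2f}")
```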
What is simple linear regression?
A statistical technique that develops an equation that relates a dependent variable to one independent variable.
What does SLR do?
1) Enables prediction of dependent variable
2) Estimates the line of best fit
3) Evaluates a linear relationship between the explanatory variable and the response variable.
An important tip for interpreting SLR model?
The coefficient of X shows the increase/decrease in Y ON AVERAGE, not with certainty. The Y-intercept likewise shows the AVERAGE value of Y when X is 0, not an exact value.
What is the lower and upper 95%?
These are the lower and upper bounds of the 95% confidence interval around each estimated coefficient. They show the amount of uncertainty in the estimates.
How to find the T-value in Excel?
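T.INV.2T(1 - confidence level, number of values - number of variables - 1), e.g. T.INV.2T(0.05, number of values - number of variables - 1) for a 95% interval. T.INV.2T takes the total two-tailed probability, so it is not halved.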
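A rough Python sketch of how the lower and upper 95% values come from the coefficient, its standard error and the critical t-value. The data is made up, and the critical value 3.182 is hard-coded as an assumption: it is the two-tailed 5% value for 3 degrees of freedom, i.e. what T.INV.2T(0.05, 3) gives in Excel.

```python
import math

# Hypothetical illustration data (not from the flashcards).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Assumed critical value: two-tailed, alpha = 0.05, 3 degrees of freedom.
T_CRIT = 3.182

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Degrees of freedom: number of values - number of variables - 1.
df = n - 1 - 1
ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Standard error of the slope estimate.
se_slope = math.sqrt(ss_residual / df) / math.sqrt(s_xx)

lower = slope - T_CRIT * se_slope
upper = slope + T_CRIT * se_slope
print(f"slope={slope:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

With this small data set the interval runs from below 0 to above 0, so the slope would not be judged significantly different from 0 at the 5% level.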
What does the estimated slope do?
Provides information about the relationship between the response variable and the explanatory variable. It shows the estimated average change in Y for a one-unit increase in X.
What does the intercept show?
The value of the predicted response when the explanatory variable value is zero.
What does the t-test in the SLR model show?
It shows whether each coefficient value is significantly different from 0, e.g. whether floor space significantly explains the variability in house prices.
How to carry out a t-test?
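Step 1: State the hypotheses. H0: beta = 0, H1: beta ≠ 0
Step 2: Assume H0 is true.
Step 3: Calculate the test statistic
t = (coefficient estimate - 0) / SE(coefficient estimate)
Step 4: Interpret results: reject H0 if |t| exceeds the critical t-value (equivalently, if the p-value is below alpha).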
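The steps above can be sketched in Python as follows. The coefficient, standard error and critical value are made-up inputs of the kind read off a regression output, and `t_statistic` is a hypothetical helper name. The critical value 3.182 assumes a two-tailed test at alpha = 0.05 with 3 degrees of freedom.

```python
def t_statistic(estimate, standard_error, hypothesised=0.0):
    """Test statistic for H0: beta = hypothesised (0 by default)."""
    return (estimate - hypothesised) / standard_error


# Hypothetical values, as if read from a regression output table.
coefficient = 0.6
se_coefficient = 0.2828
t_crit = 3.182  # assumed: two-tailed, alpha = 0.05, 3 degrees of freedom

t_stat = t_statistic(coefficient, se_coefficient)

# Reject H0 only if the statistic is more extreme than the critical value.
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```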
What does the p-value show?
In a two-tailed test, the p-value is the sum of the two tail areas of the t-distribution beyond +t and -t. It is the probability of getting the observed t-stat, or one more extreme, given that the null hypothesis is true.
What is the connection between the confidence interval and 2 tailed t-tests?
If the t-test shows that the coefficient is significantly different from 0 (p < alpha), then the confidence interval doesn’t contain 0. If the coefficient is not significantly different from 0 (p ≥ alpha), the CI does contain 0.
What does the t-test test for?
Whether the coefficients are significantly different from 0.
What does the SS of regression show?
This is the sum of the squares of the differences between the predicted values and the mean of the response variable. It shows the variation in the response variable that is explained by the model.
How to find SS Total?
SS regression + SS residual
How to find the R-squared value?
SS regression / SS total.
What does R squared show?
It shows the proportion of the variation in the response variable that is explained by the model. It evaluates the dispersion of the data points around the fitted regression line. Also called the coefficient of determination.
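A short Python sketch of the sums of squares and R-squared, using made-up data that is not from the cards. It fits the least-squares line, then checks that SS regression plus SS residual adds up to SS total.

```python
# Hypothetical illustration data (not from the flashcards).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept.
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Variation explained by the model: predicted values vs the mean of Y.
ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)
# Variation left unexplained: observed values vs predicted values.
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# Total variation: observed values vs the mean of Y.
ss_total = sum((yi - mean_y) ** 2 for yi in y)

r_squared = ss_regression / ss_total

print(f"SS regression={ss_regression:.2f}, SS residual={ss_residual:.2f}, "
      f"SS total={ss_total:.2f}, R squared={r_squared:.2f}")
```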
Why is the R-squared value not always good?
1) Doesn’t show outliers.
2) Doesn’t show increasing variance.
3) Doesn’t show whether the relationship is curved rather than linear.
What is heteroscedasticity?
The variance of the residuals is not constant: it differs across the data set, e.g. it increases as the explanatory variable increases.
When can you carry out an SLR model?
1) When the residuals have constant variance.
2) When the relationship between the explanatory variable and the response variable is linear.
3) When the residuals are uncorrelated.
4) When the errors are normally distributed with a mean of 0.
What does the standard error show?
This is the average distance that the observed data points fall from the regression line. I.e. it tells you how wrong the model is on average using the units of the response variable.
How to calculate the standard error?
Take the square root of the mean squared residual, i.e. sqrt(SS residual / (number of values - number of variables - 1)).
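A small Python sketch of this calculation. The observed and predicted values are made up, and `regression_standard_error` is a hypothetical helper name. The mean squared residual is taken as SS residual divided by the residual degrees of freedom (number of values - number of variables - 1).

```python
import math


def regression_standard_error(observed, predicted, n_variables=1):
    """Square root of the mean squared residual."""
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    df = len(observed) - n_variables - 1
    return math.sqrt(ss_residual / df)


# Hypothetical observed Y values and the values predicted by a fitted line.
observed = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

standard_error = regression_standard_error(observed, predicted)
print(f"standard error = {standard_error:.4f} (in the units of Y)")
```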
How to calculate multiple R?
The square root of the R squared value.
What is the multiple R?
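This is the correlation coefficient between the observed and predicted values of the response variable; in SLR it equals the absolute value of the correlation between x and y. It shows how strong the linear relationship is between the response y data and explanatory x data.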
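A brief Python check, on made-up data, that in simple linear regression the square root of R squared matches the absolute value of the correlation coefficient between x and y. The variable names are illustrative only.

```python
import math

# Hypothetical illustration data (not from the flashcards).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Pearson correlation coefficient between x and y.
correlation = s_xy / math.sqrt(s_xx * s_yy)

# R squared from the fitted line: SS regression / SS total.
slope = s_xy / s_xx
ss_regression = slope ** 2 * s_xx
r_squared = ss_regression / s_yy

multiple_r = math.sqrt(r_squared)

print(f"multiple R = {multiple_r:.4f}, |correlation| = {abs(correlation):.4f}")
```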