W1: Multiple Linear Regression Flashcards
In multiple linear regression, there are:
Multiple independent variables (X) and one dependent variable (Y)
The multiple R-squared value for a regression represents the proportion of the variation in the Y variable that can be explained by its regression on the X variables.
True or False?
True
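A minimal sketch of how that proportion can be computed, assuming the usual definition R-squared = 1 - (residual sum of squares / total sum of squares). The observed and fitted values below are made up for illustration and do not come from any real model.

```python
# Hypothetical observed Y-values and fitted values (illustrative only).
observed = [10.0, 12.0, 15.0, 19.0, 24.0]
fitted = [9.0, 13.0, 16.0, 19.0, 23.0]

mean_y = sum(observed) / len(observed)

# Total variation in Y around its mean.
ss_total = sum((y - mean_y) ** 2 for y in observed)
# Variation left unexplained by the regression (sum of squared residuals).
ss_residual = sum((y - f) ** 2 for y, f in zip(observed, fitted))

# Proportion of the variation in Y explained by the regression.
r_squared = 1 - ss_residual / ss_total
print(round(r_squared, 3))  # 0.968
```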
The assumptions which we need to check when we perform a multiple linear regression are (3):
Normality of the errors
Common variance of the errors
Independence of the errors
For the Kolmogorov-Smirnov and Shapiro-Wilk tests of Normality, if p < 0.05 then we conclude that the Normality assumption has been satisfied.
True or False?
False
If the p-value for a correlation coefficient was p = 0.036 then the correlation would be significant at
the 5% level
We can use multiple linear regression to allow the use of several X-variables (predictors/IV) to predict the
response Y
What is the multiple linear regression model equation?
Y = a + (b1 * X1) + (b2 * X2) + … + e
What is the multiple linear regression model equation? - Y
Y is the response (DV)
What is the multiple linear regression model equation? - X
X1, X2, … are the predictors (IVs)
What is the multiple linear regression model equation? - b1/b2
b1/b2 are the slopes/gradients
What is the multiple linear regression model equation? - a
a is the constant
What is the multiple linear regression model equation? - e
e is the error term
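As a rough illustration of how the pieces of the equation fit together, the sketch below computes a + (b1 * X1) + (b2 * X2) for one observation. The function name predict and all numbers are hypothetical; the error term e is left out because it is the part of Y the model cannot predict.

```python
def predict(a, coefficients, x_values):
    """Return a + b1*X1 + b2*X2 + ... (the fitted value, without the error term e)."""
    return a + sum(b * x for b, x in zip(coefficients, x_values))

# Hypothetical constant, slopes and predictor values (illustrative only).
a = 2.0
b = [0.5, -1.5]   # b1, b2
x = [4.0, 2.0]    # X1, X2

fitted_value = predict(a, b, x)
print(fitted_value)  # 2.0 + 0.5*4.0 - 1.5*2.0 = 1.0
```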
In multiple linear regression, each predictor variable (X) has its own
coefficient (b1/b2)
Why is there an error term (e) in multiple linear regression?
Knowing the values of X1, X2, … does not allow us to predict the value of Y exactly
What is a residual?
Difference between the observed Y-value and its prediction (fitted value) based on corresponding X-values
How do you calculate a residual?
Residual = Observed value - Fitted value
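A short sketch of that subtraction, applied to each observation in turn. The observed and fitted values are invented for illustration.

```python
# Hypothetical observed Y-values and the model's fitted values (illustrative only).
observed = [5.0, 7.5, 9.0]
fitted = [4.5, 8.0, 9.0]

# Residual = observed value - fitted value, one per observation.
residuals = [y - f for y, f in zip(observed, fitted)]
print(residuals)  # [0.5, -0.5, 0.0]
```

A positive residual means the model under-predicted that observation, a negative one means it over-predicted.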
What does a funnel effect in the scatterplot of residuals tell us?
The residuals are not independent and do not have common variance
To test the significance of each predictor, we test the null and alternative hypotheses:
H0: b = 0 vs H1: b ≠ 0 (for each particular X variable)
Generally, an R-squared above 0.6 (2)
Makes a model worth your attention
Means that most of the variability in the Y variable can be explained by the X variables in the multiple linear regression model
Step 1 (In SPSS): Writing Regression Equation (2)
The regression equation is:
MRI Count = 237.598 + 55.236(Gender) + 1.280 (PIQ) + 6.515 (Height)
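To show how such an equation is used, the sketch below plugs one set of predictor values into the coefficients quoted above. The values Gender = 1, PIQ = 100 and Height = 70 are assumptions chosen for illustration; the coding of Gender and the units of Height are not stated in these cards.

```python
# Coefficients copied from the regression equation on the card.
constant = 237.598
b_gender = 55.236
b_piq = 1.280
b_height = 6.515

# Hypothetical predictor values (the Gender coding and Height units are assumed).
gender = 1
piq = 100
height = 70

# Fitted MRI Count for this hypothetical person.
predicted_mri_count = constant + b_gender * gender + b_piq * piq + b_height * height
print(round(predicted_mri_count, 3))
```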
Step 1 (In R): Writing Regression Equation (2)
The regression equation is:
Costs = -3085.657 - 86.774(Region) + 511.084(Sex) + 115.61(Age) - 2.62(Marital) + 51.16(Alcohol) + 138.00(Cigs) - 269.264(Exercise)
How can you tell which Y and X variables are used in a multiple linear regression model in R? (4)
- Costs = Y
- X = Region, Sex, Age, Marit, Alco
- The data come from ex.data
- This is all stored in a variable called model