Unit 8/9 Flashcards
What is a linear model?
A linear model is a mathematical representation that describes the relationship between a dependent variable and one or more independent variables using a linear equation.
True or False: In a linear regression model, the relationship between the variables is always perfectly linear.
False
Fill in the blank: The _____ is the variable that is being predicted in a regression model.
dependent variable
What does the slope in a linear regression model represent?
The slope represents the expected change in the dependent variable for each one-unit increase in the independent variable, holding any other predictors constant.
What is the purpose of residuals in regression analysis?
Residuals measure the difference between observed values and the values predicted by the model; they are used to judge how well the model fits and to check its assumptions.
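Worked example for the slope and residual cards: a minimal sketch of a one-predictor least-squares fit in plain Python. The data (`hours`, `scores`) and the helper names `fit_line` and `residuals` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed value minus predicted value for each point."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
res = residuals(hours, scores, slope, intercept)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("residuals:", [round(r, 2) for r in res])
```

In this made-up data the slope is positive, so each extra hour is associated with a higher predicted score; the residuals of a least-squares line with an intercept sum to zero.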
Multiple Choice: Which of the following is NOT an assumption of linear regression? A) Linearity B) Independence C) Homoscedasticity D) Normality of predictors
D) Normality of predictors
Define multicollinearity.
Multicollinearity occurs when two or more independent variables in a regression model are highly correlated, making it difficult to determine their individual effects.
True or False: Increasing the number of predictors in a model always improves the model’s performance.
False
What is R-squared?
R-squared is the proportion of the variance in the dependent variable that is explained by the independent variables in a regression model.
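Worked example for the R-squared card: a short sketch that computes R-squared as 1 minus the ratio of the residual sum of squares to the total sum of squares. The `observed` and `predicted` values are made up, and `r_squared` is an illustrative helper name.

```python
def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation around the mean of the observed values.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical observed values and model predictions.
observed = [52, 55, 61, 64, 68]
predicted = [51.8, 55.9, 60.0, 64.1, 68.2]

r2 = r_squared(observed, predicted)
print(f"R-squared = {r2:.4f}")
```

Predictions that match the observations exactly give an R-squared of 1, and predicting the mean for every point gives 0.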
Fill in the blank: A _____ plot is used to assess the linearity assumption in regression analysis.
scatter
What is the purpose of the F-test in regression?
The F-test checks whether the regression model as a whole explains significantly more variation than a model with no predictors, that is, whether at least one coefficient is nonzero.
Short Answer: How can you detect multicollinearity in a regression model?
By examining the variance inflation factor (VIF) for each predictor.
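Worked example for the VIF card: a sketch limited to the two-predictor case, where the VIF of either predictor reduces to 1 / (1 - r^2), with r the Pearson correlation between the two predictors. With more predictors, each one would instead be regressed on all the others. The data and helper names are made up for illustration.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (assumption)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Hypothetical predictors.
x1 = [1, 2, 3, 4, 5]
x2_similar = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly 2 * x1, so highly correlated
x2_different = [3, 1, 4, 1, 5]           # only weakly related to x1

vif_high = vif_two_predictors(x1, x2_similar)
vif_low = vif_two_predictors(x1, x2_different)

print(f"VIF with near-duplicate predictor: {vif_high:.1f}")
print(f"VIF with weakly related predictor: {vif_low:.2f}")
```

A VIF close to 1 means the predictor shares little information with the other one; a very large VIF signals multicollinearity.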
What is the difference between simple and multiple linear regression?
Simple linear regression involves one independent variable, while multiple linear regression involves two or more independent variables.
True or False: The intercept of a regression line represents the predicted value of the dependent variable when all independent variables are zero.
True
What is the main goal of regression analysis?
The main goal is to model the relationship between variables and predict the value of the dependent variable.
What does a negative slope indicate in a regression model?
A negative slope indicates that as the independent variable increases, the dependent variable decreases.
Fill in the blank: The _____ of a regression model is used to quantify the relationship between the dependent and independent variables.
coefficient
What is the significance of the p-value in regression analysis?
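The p-value is the probability of seeing an estimated relationship at least as strong as the observed one if the true coefficient were zero; a small p-value suggests that the relationship between the independent variable and the dependent variable is statistically significant.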
Multiple Choice: Which of the following can be used to improve a regression model? A) Adding more data B) Removing outliers C) Adding interaction terms D) All of the above
D) All of the above
What is the purpose of cross-validation in regression?
Cross-validation is used to assess how the results of a statistical analysis will generalize to an independent dataset.
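Worked example for the cross-validation card: a minimal k-fold sketch in plain Python. It assigns points to folds by index (no shuffling, which is a simplification), fits a one-predictor line on the training folds, and scores it on the held-out fold. The data and the helper names `fit_line` and `k_fold_mse` are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k):
    """Return the held-out mean squared error for each of k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Point i belongs to fold (i mod k); the rest form the training set.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        squared = [(ys[i] - (intercept + slope * xs[i])) ** 2 for i in test_idx]
        fold_errors.append(sum(squared) / len(squared))
    return fold_errors


# Hypothetical data, roughly y = 2x + 1 with small noise.
xs = [1, 2, 3, 4, 5, 6]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

fold_errors = k_fold_mse(xs, ys, k=3)
cv_mse = sum(fold_errors) / len(fold_errors)
print("per-fold MSE:", [round(e, 4) for e in fold_errors])
print(f"cross-validated MSE: {cv_mse:.4f}")
```

Because every point is scored by a model that never saw it during fitting, the averaged error estimates how the model would perform on new data.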
True or False: A higher R-squared value always indicates a better model.
False
Fill in the blank: The _____ is the average of the squared differences between observed and predicted values.
mean squared error
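Worked example for the mean squared error card: a short sketch with made-up numbers; `mean_squared_error` is an illustrative helper name.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared) / len(squared)


# Hypothetical values: the errors are 1, 0, and -2, so the squares are 1, 0, and 4.
observed = [3, 5, 7]
predicted = [2, 5, 9]

mse = mean_squared_error(observed, predicted)
print(f"MSE = {mse:.4f}")
```

Squaring makes every error positive and penalizes large misses more heavily than small ones.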
What is a confounding variable?
A confounding variable is an outside variable that influences both the independent variable and the dependent variable, distorting the apparent relationship between them.
What does the term ‘overfitting’ mean in the context of regression?
Overfitting refers to a model that is too complex and captures noise rather than the underlying pattern, so it fits the training data well but predicts new data poorly.