Unit 8/9 Flashcards

1
Q

What is a linear model?

A

A linear model is a mathematical representation that describes the relationship between a dependent variable and one or more independent variables using a linear equation.

2
Q

True or False: In a linear regression model, the relationship between the variables is always perfectly linear.

A

False

3
Q

Fill in the blank: The _____ is the variable that is being predicted in a regression model.

A

dependent variable

4
Q

What does the slope in a linear regression model represent?

A

The slope represents the change in the dependent variable for each one-unit change in the independent variable.
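
A minimal sketch of how the slope and intercept can be estimated by ordinary least squares. The data (hours studied vs. exam score) is made up, and the helper name fit_line is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration only
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, scores)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Here the slope reads as "each extra hour of study goes with about 4.1 more points", which is the one-unit-change interpretation from the card.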

5
Q

What is the purpose of residuals in regression analysis?

A

Residuals measure the difference between observed values and the values predicted by the model.
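
A short sketch of computing residuals as observed minus predicted. The data is made up, and the fitted line y = 47.7 + 4.1x is assumed here to be the least-squares fit for it.

```python
# Made-up observations and an assumed fitted line y = 47.7 + 4.1 * x
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
slope, intercept = 4.1, 47.7

predicted = [intercept + slope * x for x in hours]
residuals = [y - y_hat for y, y_hat in zip(scores, predicted)]

# Positive residual: the model under-predicts; negative: it over-predicts.
for x, r in zip(hours, residuals):
    print(f"hours={x}  residual={r:+.2f}")
```

For a least-squares line with an intercept, the residuals sum to (approximately) zero, which is a quick sanity check on a fit.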

6
Q

Multiple Choice: Which of the following is NOT an assumption of linear regression? A) Linearity B) Independence C) Homoscedasticity D) Normality of predictors

A

D) Normality of predictors

7
Q

Define multicollinearity.

A

Multicollinearity occurs when two or more independent variables in a regression model are highly correlated, making it difficult to determine their individual effects.

8
Q

True or False: Increasing the number of predictors in a model always improves the model’s performance.

A

False

9
Q

What is R-squared?

A

R-squared is a statistical measure that represents the proportion of variance for the dependent variable that’s explained by the independent variables in a regression model.
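
One common way to compute R-squared is 1 minus the ratio of residual to total sum of squares. A minimal sketch, assuming made-up observed and predicted values and a hypothetical helper named r_squared:

```python
def r_squared(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the model
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total variation of the observed values around their mean
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1 - ss_res / ss_tot


# Made-up values for illustration only
scores = [52, 55, 61, 64, 68]
predicted = [51.8, 55.9, 60.0, 64.1, 68.2]

r2 = r_squared(scores, predicted)
print(f"R-squared = {r2:.4f}")
```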

10
Q

Fill in the blank: A _____ plot is used to assess the linearity assumption in regression analysis.

A

scatter

11
Q

What is the purpose of the F-test in regression?

A

The F-test is used to determine whether the overall regression model is statistically significant, that is, whether it fits the data better than a model with no predictors (intercept only).
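
The overall F statistic can be written in terms of R-squared, the number of observations n, and the number of predictors p. A small sketch with made-up inputs; the function name f_statistic is hypothetical, and turning the statistic into a p-value (via the F distribution with p and n - p - 1 degrees of freedom) is not shown.

```python
def f_statistic(r2, n, p):
    """Overall F statistic for a model with p predictors fit to n observations."""
    explained_per_predictor = r2 / p
    unexplained_per_df = (1 - r2) / (n - p - 1)
    return explained_per_predictor / unexplained_per_df


# Made-up inputs for illustration only
f_value = f_statistic(r2=0.9888, n=5, p=1)
print(f"F = {f_value:.1f}")
```

A large F means the predictors explain much more variation per degree of freedom than is left unexplained.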

12
Q

Short Answer: How can you detect multicollinearity in a regression model?

A

By examining the variance inflation factor (VIF) for each predictor.
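
The VIF for a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on the other predictors. The sketch below assumes exactly two predictors, in which case that R^2 is just their squared correlation; with more predictors a full regression per predictor would be needed. The data and helper names are made up.

```python
def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with exactly two predictors, R^2 is r squared."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# Made-up predictors for illustration only
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]

vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.2f}")
```

A VIF of 1 means no correlation with the other predictor; the value grows without bound as the correlation approaches 1 in absolute value.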

13
Q

What is the difference between simple and multiple linear regression?

A

Simple linear regression involves one independent variable, while multiple linear regression involves two or more independent variables.

14
Q

True or False: The intercept of a regression line represents the predicted value of the dependent variable when all independent variables are zero.

A

True

15
Q

What is the main goal of regression analysis?

A

The main goal is to model the relationship between variables and predict the value of the dependent variable.

16
Q

What does a negative slope indicate in a regression model?

A

A negative slope indicates that as the independent variable increases, the dependent variable decreases.

17
Q

Fill in the blank: The _____ of a regression model is used to quantify the relationship between the dependent and independent variables.

A

coefficient

18
Q

What is the significance of the p-value in regression analysis?

A

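The p-value for a coefficient indicates whether the relationship between that independent variable and the dependent variable is statistically significant: it is the probability of seeing an estimate at least this extreme if the true coefficient were zero.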

19
Q

Multiple Choice: Which of the following can be used to improve a regression model? A) Adding more data B) Removing outliers C) Adding interaction terms D) All of the above

A

D) All of the above

20
Q

What is the purpose of cross-validation in regression?

A

Cross-validation is used to assess how the results of a statistical analysis will generalize to an independent dataset.
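
A minimal sketch of k-fold cross-validation for a simple linear regression. Folds are assigned deterministically by index (i % k) to keep the example short; in practice the data is usually shuffled first. The data and helper names (fit_line, k_fold_mse) are made up.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_mse(xs, ys, k):
    """Average held-out mean squared error over k folds."""
    pairs = list(zip(xs, ys))
    fold_errors = []
    for fold in range(k):
        # Hold out every k-th point (offset by fold); train on the rest
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        held_out = [p for i, p in enumerate(pairs) if i % k == fold]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        squared = [(y - (intercept + slope * x)) ** 2 for x, y in held_out]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# Made-up data for illustration only
xs = [1, 2, 3, 4, 5, 6]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

cv_error = k_fold_mse(xs, ys, k=3)
print(f"3-fold CV mean squared error = {cv_error:.4f}")
```

Because each fold's error is measured on points the model did not see while fitting, the average is an estimate of how the model would perform on new data.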

21
Q

True or False: A higher R-squared value always indicates a better model.

A

False

22
Q

Fill in the blank: The _____ is the average of the squared differences between observed and predicted values.

A

mean squared error
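
A short sketch of the calculation, assuming made-up observed and predicted values and a hypothetical helper named mean_squared_error:

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Made-up values for illustration only
observed = [52, 55, 61, 64, 68]
predicted = [51.8, 55.9, 60.0, 64.1, 68.2]

mse = mean_squared_error(observed, predicted)
print(f"MSE = {mse:.2f}")
```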

23
Q

What is a confounding variable?

A

A confounding variable is an outside variable that influences both the independent and the dependent variable, distorting the apparent relationship between them.

24
Q

What does the term ‘overfitting’ mean in the context of regression?

A

Overfitting refers to a model that is too complex and captures noise rather than the underlying pattern.

25
Q

Short Answer: What is the role of interaction terms in regression analysis?

A

Interaction terms are used to model the combined effects of two or more independent variables on the dependent variable.
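
A small sketch of what an interaction term does: with a product term x1 * x2 in the model, the effect of x1 depends on the value of x2. The coefficients below are hypothetical, chosen only for illustration.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Model with an interaction: y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


# Effect of raising x1 by one unit, at two different values of x2
effect_when_x2_is_0 = predict(1, 0) - predict(0, 0)  # equals b1
effect_when_x2_is_2 = predict(1, 2) - predict(0, 2)  # equals b1 + b3 * 2

print(effect_when_x2_is_0, effect_when_x2_is_2)
```

Without the b3 term, both effects would be identical; the interaction is what lets the slope for x1 change with x2.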

26
Q

What is the purpose of the Akaike Information Criterion (AIC) in model selection?

A

The AIC is used to compare the goodness of fit of different models while penalizing for the number of parameters.
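
A minimal sketch of an AIC comparison, assuming a least-squares fit with Gaussian errors, where AIC can be written (up to an additive constant) as n * ln(RSS / n) + 2k. The RSS values and parameter counts are made up, and conventions for counting k vary between texts and software.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC, up to an additive constant, for a Gaussian least-squares fit.

    rss: residual sum of squares, n: observations, k: estimated parameters.
    """
    return n * math.log(rss / n) + 2 * k


# Made-up fits: the larger model lowers RSS only slightly but adds 3 parameters
aic_small = aic_least_squares(rss=10.0, n=50, k=2)
aic_large = aic_least_squares(rss=9.8, n=50, k=5)

preferred = "small" if aic_small < aic_large else "large"
print(f"small={aic_small:.2f}  large={aic_large:.2f}  preferred={preferred}")
```

Lower AIC is better, so here the small drop in RSS does not justify the extra parameters.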

27
Q

Fill in the blank: A _____ plot can help visualize the residuals of a regression model.

A

residual

28
Q

True or False: Homoscedasticity means that the variance of the residuals is constant across all levels of the independent variable.

A

True

29
Q

What is the difference between correlation and causation?

A

Correlation indicates a relationship between two variables, while causation implies that one variable directly affects the other.

30
Q

Multiple Choice: Which method can be used to assess the normality of residuals? A) Histogram B) Q-Q plot C) Shapiro-Wilk test D) All of the above

A

D) All of the above
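
31
Q

What does the term 'standardized coefficient' refer to in regression analysis?

A

A standardized coefficient is a regression coefficient expressed in standard-deviation units of the variables rather than their original units. Standardized coefficients allow for comparison of the relative importance of each predictor in the model.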
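
A short sketch of one way to standardize a slope: multiply the raw slope by the ratio of the predictor's standard deviation to the outcome's. The data is made up, and the raw slope 4.1 is assumed to be the least-squares slope for it.

```python
import statistics

# Made-up data for illustration only
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]
raw_slope = 4.1  # assumed least-squares slope for this data

# Rescale from "points per hour" to "standard deviations per standard deviation"
standardized_slope = raw_slope * statistics.stdev(hours) / statistics.stdev(scores)
print(f"standardized slope = {standardized_slope:.3f}")
```

In simple linear regression the standardized slope equals the correlation between the two variables, so it always falls between -1 and 1.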

32
Q

Fill in the blank: The _____ is the variable that is manipulated to observe its effect on the dependent variable.

A

independent variable

33
Q

What is the main difference between parametric and non-parametric models?

A

Parametric models assume a specific form for the function relating variables, while non-parametric models do not.

34
Q

True or False: Linear regression can be used for both prediction and inference.

A

True

35
Q

What is the purpose of residual analysis?

A

Residual analysis is used to validate the assumptions of regression and to identify potential issues in the model.

36
Q

Fill in the blank: In regression, _____ refers to the degree of linear relationship between two variables.

A

correlation

37
Q

Short Answer: Why is it important to check for outliers in regression analysis?

A

Outliers can disproportionately influence the results and lead to misleading conclusions.

38
Q

What is the role of the intercept in a regression model?

A

The intercept represents the expected value of the dependent variable when all independent variables are zero.

39
Q

Multiple Choice: Which of the following is a method for dealing with multicollinearity? A) Centering variables B) Removing variables C) Using ridge regression D) All of the above

A

D) All of the above

40
Q

What does the term 'model diagnostics' refer to in regression analysis?

A

Model diagnostics refers to the process of checking the validity of a regression model's assumptions and performance.