Multiple Linear Regression and ANOVA Flashcards

(96 cards)

1
What is the primary purpose of multiple linear regression?
To model the relationship between one dependent variable and two or more independent variables.
2
In the equation Y = β₀ + β₁X₁ + β₂X₂ + ε, what does β₀ represent?
The intercept; the expected value of Y when all independent variables are zero.
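
A minimal Python sketch of fitting this equation by ordinary least squares; the data and the coefficient values 2.0, 0.5, and -1.0 are synthetic, chosen purely for illustration:

    # Fit Y = b0 + b1*X1 + b2*X2 + error on synthetic data.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100
    X1 = rng.normal(size=n)
    X2 = rng.normal(size=n)
    Y = 2.0 + 0.5 * X1 - 1.0 * X2 + rng.normal(scale=0.3, size=n)

    # Design matrix with a leading column of ones for the intercept b0.
    X = np.column_stack([np.ones(n), X1, X2])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    print(beta)  # roughly [2.0, 0.5, -1.0], i.e. b0, b1, b2
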
3
How does multiple linear regression differ from simple linear regression?
Multiple linear regression involves two or more independent variables, while simple linear regression involves only one.
4
What assumption is made about the relationship between the dependent and independent variables in multiple linear regression?
The relationship is assumed to be linear.
5
What is multicollinearity, and why is it problematic in multiple linear regression?
Multicollinearity occurs when independent variables are highly correlated, making it difficult to assess the individual effect of each predictor.
6
What does the coefficient β₁ represent in a multiple linear regression model?
The expected change in the dependent variable for a one-unit increase in X₁, holding other variables constant.
7
What is the purpose of the error term ε in a regression model?
It accounts for the variability in Y that cannot be explained by the linear relationship with the predictors.
8
How is the goodness-of-fit of a multiple linear regression model typically assessed?
By examining the R-squared value, which indicates the proportion of variance in the dependent variable explained by the model.
9
What is the difference between R-squared and adjusted R-squared?
Adjusted R-squared adjusts the R-squared value for the number of predictors in the model, providing a more accurate measure when multiple variables are involved.
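
A short Python sketch of both formulas; the helper name and arguments are mine, not from the deck:

    # R-squared and adjusted R-squared for n observations and p predictors.
    import numpy as np

    def r_squared(y, y_hat, p):
        n = len(y)
        sse = np.sum((y - y_hat) ** 2)       # unexplained variation
        sst = np.sum((y - np.mean(y)) ** 2)  # total variation
        r2 = 1 - sse / sst
        # Adjusted R2 penalizes extra predictors via degrees of freedom.
        r2_adj = 1 - (sse / (n - p - 1)) / (sst / (n - 1))
        return r2, r2_adj
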
10
Why might adding more predictors to a regression model not always lead to a better model?
Adding unnecessary predictors can lead to overfitting, where the model captures noise rather than the underlying relationship.
11
What is the purpose of ANOVA in the context of regression analysis?
To assess the overall significance of the regression model by comparing the model variance to the residual variance.
12
In an ANOVA table, what does the ‘Sum of Squares’ represent?
It quantifies the total variation in the dependent variable, partitioned into components attributable to the regression model and residual error.
13
What does a significant F-statistic in an ANOVA table indicate about a regression model?
It suggests that at least one predictor variable significantly explains variation in the dependent variable.
14
How is the Mean Square Error (MSE) calculated in an ANOVA table?
By dividing the Sum of Squares for residuals by its corresponding degrees of freedom.
15
What is the null hypothesis tested by the overall F-test in regression ANOVA?
That all slope coefficients are equal to zero (β₁ = β₂ = … = 0), implying no linear relationship between the predictors and the dependent variable.
16
In the context of ANOVA, what does the term ‘degrees of freedom’ refer to?
The number of independent values or quantities that can vary in the analysis, associated with the sources of variation.
17
Why is the principle of parsimony important when interpreting ANOVA results in regression?
It emphasizes choosing the simplest model that adequately explains the data, avoiding overfitting with unnecessary predictors.
18
What does the Total Sum of Squares (SST) represent in an ANOVA table?
The total variation in the dependent variable around its mean.
19
How is the Regression Sum of Squares (SSR) interpreted in ANOVA?
It measures the portion of total variation explained by the regression model.
20
What is the Residual Sum of Squares (SSE) in an ANOVA context?
It quantifies the variation in the dependent variable that remains unexplained by the regression model.
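
A Python sketch tying cards 12 through 20 together: partitioning SST into SSR plus SSE and forming the mean squares and the overall F-statistic; the function and argument names are mine:

    # ANOVA decomposition for a fitted regression with p predictors.
    import numpy as np

    def anova_decomposition(y, y_hat, p):
        n = len(y)
        sst = np.sum((y - np.mean(y)) ** 2)  # total variation about the mean
        sse = np.sum((y - y_hat) ** 2)       # residual (unexplained) variation
        ssr = sst - sse                      # variation explained by the model
        msr = ssr / p                        # regression mean square, df = p
        mse = sse / (n - p - 1)              # error mean square, df = n - p - 1
        return {"SSR": ssr, "SSE": sse, "SST": sst,
                "MSE": mse, "F": msr / mse}
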
21
What is the Akaike Information Criterion (AIC) used for in model selection?
To compare models by balancing goodness-of-fit and complexity, penalizing models with more parameters.
22
How does the Bayesian Information Criterion (BIC) differ from AIC in model selection?
BIC imposes a stricter penalty for model complexity, favoring simpler models compared to AIC.
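
A sketch comparing a smaller and a larger model by AIC and BIC using the statsmodels library; the data here is synthetic and X2 is deliberately irrelevant:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 200
    X1 = rng.normal(size=n)
    X2 = rng.normal(size=n)  # noise predictor
    y = 1.0 + 2.0 * X1 + rng.normal(size=n)

    small = sm.OLS(y, sm.add_constant(X1)).fit()
    large = sm.OLS(y, sm.add_constant(np.column_stack([X1, X2]))).fit()

    # Lower is better for both criteria. BIC's log(n) penalty per extra
    # parameter exceeds AIC's penalty of 2, so BIC leans simpler.
    print(small.aic, large.aic)
    print(small.bic, large.bic)
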
23
What are the four primary assumptions of multiple linear regression?
Linearity, independence, homoscedasticity (constant variance), and normality of residuals.
24
How can you visually assess the linearity assumption in regression analysis?
By plotting residuals against fitted values; a random scatter suggests linearity.
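
A matplotlib sketch of that diagnostic, assuming a fitted statsmodels OLS result named model:

    # Residuals vs. fitted values: a random, even scatter around zero
    # supports the linearity (and constant-variance) assumptions.
    import matplotlib.pyplot as plt

    plt.scatter(model.fittedvalues, model.resid, alpha=0.6)
    plt.axhline(0, color="red", linestyle="--")
    plt.xlabel("Fitted values")
    plt.ylabel("Residuals")
    plt.title("Residuals vs. fitted")
    plt.show()
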
25
What does heteroscedasticity indicate in a regression model?
It indicates that the residuals have non-constant variance, violating the homoscedasticity assumption.
26
Which plot is commonly used to detect heteroscedasticity?
A residuals vs. fitted values plot.
27
What is the purpose of a Q-Q (quantile-quantile) plot in regression diagnostics?
To assess whether residuals follow a normal distribution.
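
A quick Q-Q plot with statsmodels, again assuming a fitted OLS result named model:

    # Points lying close to the reference line support normal residuals.
    import statsmodels.api as sm
    import matplotlib.pyplot as plt

    sm.qqplot(model.resid, line="s")
    plt.show()
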
28
How can you detect multicollinearity among predictors?
By calculating Variance Inflation Factors (VIF); high VIF values suggest multicollinearity.
29
What is considered a high VIF value indicating problematic multicollinearity?
A VIF value greater than 10 is often considered indicative of significant multicollinearity.
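
A sketch of the VIF check using statsmodels; the two predictors are synthetic and deliberately near-collinear:

    # VIF_j = 1 / (1 - R2_j), where R2_j regresses predictor j on the rest.
    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor

    rng = np.random.default_rng(2)
    x1 = rng.normal(size=100)
    x2 = x1 + rng.normal(scale=0.1, size=100)  # nearly a copy of x1
    X = sm.add_constant(np.column_stack([x1, x2]))

    # Column 0 is the intercept; report VIFs for the predictors only.
    for j in (1, 2):
        print(j, variance_inflation_factor(X, j))  # both far above 10 here
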
30
What is Cook's Distance used for in regression analysis?
To identify influential data points that disproportionately affect the regression model.
31
How is leverage related to influential observations in regression?
High leverage points have extreme predictor values and can unduly influence the regression line.
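
A sketch pulling both diagnostics from a fitted statsmodels result; the model object is an assumed, already-fitted sm.OLS(...).fit():

    import numpy as np

    influence = model.get_influence()
    cooks_d, _ = influence.cooks_distance  # one distance per observation
    leverage = influence.hat_matrix_diag   # diagonal of the hat matrix

    # One common screen: flag observations with unusually large Cook's D.
    print(np.where(cooks_d > 4 / len(cooks_d))[0])
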
32
What is the Durbin-Watson statistic used to detect?
Autocorrelation in the residuals, particularly in time series data.
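
A one-line check with statsmodels, again assuming a fitted model:

    # Values near 2 suggest no first-order autocorrelation; values toward
    # 0 or 4 suggest positive or negative autocorrelation, respectively.
    from statsmodels.stats.stattools import durbin_watson

    print(durbin_watson(model.resid))
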
33
Why is it problematic if residuals are autocorrelated?
It violates the independence assumption; coefficient estimates remain unbiased but become inefficient, and standard errors are biased, invalidating hypothesis tests.
34
What transformation can be applied if the normality assumption is violated?
Data transformations like logarithmic or square root transformations can help normalize residuals.
35
What does a residuals vs. leverage plot help identify?
It helps detect influential cases by showing how much each observation influences the regression coefficients.
36
When should you consider removing an outlier from your data set in regression analysis?
When the outlier is due to data entry errors or is not representative of the population being studied.
37
What is the consequence of violating the homoscedasticity assumption?
It can lead to inefficient estimates and affect the validity of hypothesis tests.
38
How can you address multicollinearity in your regression model?
By removing highly correlated predictors, combining them, or using techniques like ridge regression.
39
What is the purpose of standardizing variables in regression analysis?
To place coefficients on a common, comparable scale; centering or standardizing also reduces collinearity involving interaction and polynomial terms.
40
Why is it important to check for influential points in regression analysis?
Because they can disproportionately affect the model's coefficients and predictions.
41
What does a partial regression plot show?
The relationship between the dependent variable and a specific independent variable, controlling for other variables.
42
What is the null hypothesis when testing the significance of a regression coefficient?
That the coefficient is equal to zero, indicating the predictor has no effect given the other variables in the model.
43
How is the t-statistic for a regression coefficient calculated?
By dividing the estimated coefficient by its standard error.
44
What does a p-value less than 0.05 indicate in the context of regression coefficients?
It suggests that the coefficient is significantly different from zero at the 5% significance level.
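
The arithmetic behind cards 42 through 44, assuming a fitted statsmodels result named model; the lines below reproduce model.tvalues and model.pvalues:

    # t = estimated coefficient / its standard error; the two-sided p-value
    # uses a t distribution with the residual degrees of freedom.
    import scipy.stats as st

    t_stats = model.params / model.bse
    p_values = 2 * st.t.sf(abs(t_stats), df=model.df_resid)
    print(t_stats, p_values)
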
45
What is the purpose of constructing a confidence interval for a regression coefficient?
To estimate the range within which the true population parameter is likely to fall.
46
How does increasing the sample size affect the width of confidence intervals in regression?
It generally makes the confidence intervals narrower, indicating more precise estimates.
47
What is the difference between a confidence interval and a prediction interval in regression analysis?
A confidence interval estimates the mean value of the dependent variable for given predictor values, while a prediction interval estimates the range for individual future observations.
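
A sketch contrasting the two intervals with statsmodels, assuming a fitted model and a new design row x_new (both placeholder names):

    # The prediction interval is always wider than the confidence interval
    # at the same predictor values.
    frame = model.get_prediction(x_new).summary_frame(alpha=0.05)
    print(frame[["mean_ci_lower", "mean_ci_upper"]])  # CI for the mean
    print(frame[["obs_ci_lower", "obs_ci_upper"]])    # PI for a new value
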
48
What does an interaction term (e.g., X₁×X₂) in a regression model indicate?
It indicates that the effect of one predictor on the dependent variable changes depending on the value of another predictor.
49
How do you interpret a significant positive coefficient for an interaction term?
It means the effect of one independent variable on the response increases as the other independent variable increases.
50
What indicates the need for adding polynomial (e.g., quadratic or cubic) terms in regression?
A curved pattern in residual plots suggesting nonlinearity.
51
What is the purpose of including polynomial terms (like X²) in regression?
To capture curvature or non-linear relationships between independent and dependent variables.
52
Why is centering variables useful when including interaction terms?
It reduces multicollinearity and simplifies interpretation of main effects.
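
A quick numeric illustration of the centering effect, on synthetic data:

    # Centering predictors before forming the product term lowers the
    # correlation between each main effect and the interaction term.
    import numpy as np

    rng = np.random.default_rng(3)
    x1 = rng.normal(loc=10, size=500)  # positive means make the raw
    x2 = rng.normal(loc=10, size=500)  # product track the main effects
    raw = np.corrcoef(x1, x1 * x2)[0, 1]
    c1, c2 = x1 - x1.mean(), x2 - x2.mean()
    centered = np.corrcoef(c1, c1 * c2)[0, 1]
    print(raw, centered)  # raw is strong (~0.7 here); centered is near 0
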
53
What is a nested model in regression analysis?
A simpler regression model that includes a subset of predictors from a more complex model.
54
How do you statistically compare two nested models?
Use an F-test comparing the residual sum of squares (SSE) of both models.
55
What hypothesis does the nested model F-test evaluate?
Whether the additional predictors significantly improve the fit over the simpler (nested) model.
56
If two nested models have similar adjusted R² but differ greatly in complexity, which one should you choose?
Typically, choose the simpler model due to the principle of parsimony.
57
What does a significant F-test result for nested models imply?
The larger model (with additional predictors) significantly improves the model fit.
58
Why is extrapolation risky in regression analysis?
Because predictions outside the range of observed data may be unreliable or misleading.
59
How does the prediction interval differ when predicting a single future observation versus predicting the mean response?
Prediction intervals for a single observation are wider than confidence intervals for the mean response.
60
How is a prediction interval calculated differently from a confidence interval?
Prediction intervals include the residual variability of individual observations, making them wider.
61
What effect does increasing variance of residuals have on the prediction interval width?
Increasing residual variance widens the prediction interval.
62
In regression, when would you prefer a prediction interval over a confidence interval?
When predicting individual future values rather than estimating an average response.
63
What is a partial regression (added-variable) plot useful for?
It evaluates the effect of adding a single predictor after accounting for other variables already in the model.
64
What does non-constant variance (heteroscedasticity) typically imply about model specification?
It often suggests a missing predictor or the need for a data transformation.
65
How would you address heteroscedasticity detected in your regression?
Use transformations like log or square-root, or weighted least squares.
66
What is a variance-stabilizing transformation, and give an example.
A transformation applied to the data so that the variance is roughly constant across values. Example: the log transformation.
67
When diagnosing regression models, what does a residual plot shaped like a funnel indicate?
It indicates heteroscedasticity (variance that increases or decreases with the fitted values).
68
What is the experiment-wise error rate?
The probability of making at least one Type I error when conducting multiple hypothesis tests.
69
Why would you use multiple-comparison procedures (like Tukey’s HSD or Bonferroni)?
To control the experiment-wise error rate when conducting multiple simultaneous hypothesis tests.
70
What’s the difference between Tukey’s HSD and Bonferroni adjustments?
Tukey’s HSD is specifically for pairwise comparisons post-ANOVA, while Bonferroni adjusts for multiple tests generally, and can be overly conservative.
71
How does the Bonferroni method control the Type I error rate?
By dividing the original significance level (α) by the number of tests conducted.
72
If conducting 10 tests with an individual α-level of 0.05, what's the approximate experiment-wise error rate if you don’t adjust?
Approximately 40% (1 − 0.95¹⁰ ≈ 0.40) chance of making at least one Type I error.
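
The arithmetic behind cards 71 and 72 in a few lines of Python:

    # Family-wise (experiment-wise) error rate for n independent tests at
    # level alpha, and the Bonferroni-adjusted per-test level.
    alpha, n_tests = 0.05, 10
    familywise = 1 - (1 - alpha) ** n_tests
    print(round(familywise, 3))  # 0.401, the ~40% quoted above
    print(alpha / n_tests)       # 0.005 per test keeps the rate <= 0.05
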
73
What might indicate a nonlinear relationship between an independent variable (e.g., weight) and residuals in regression?
A clear curved pattern in a residual plot versus that variable.
74
If two independent variables have a correlation coefficient of 0.95, what implication does this have on the regression model?
High multicollinearity; including both variables might be redundant.
75
Why might you prefer using only one of two highly correlated predictors instead of both?
Including both provides minimal additional explanatory power and inflates standard errors due to multicollinearity.
76
What is the practical implication if adding squared terms significantly improves a regression model?
There is curvature (a nonlinear relationship) between the predictors and the response variable.
77
How can you determine if at least one predictor variable is useful from a regression model utility test?
A significant F-test result from the ANOVA table suggests at least one predictor is useful.
78
What does a non-significant coefficient of an independent variable imply in a multiple regression model?
That the variable does not provide significant explanatory power given the other variables in the model.
79
What formula gives the F-statistic for testing nested models (comparing simpler vs. complex)?
F = [(SSE_reduced − SSE_full) / (df_reduced − df_full)] / (SSE_full / df_full)
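
A sketch of this formula checked against statsmodels' built-in nested-model test, on synthetic data where the extra predictor is pure noise; note that statsmodels' .ssr attribute is the residual sum of squares (this deck's SSE):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(4)
    n = 150
    x1 = rng.normal(size=n)
    x2 = rng.normal(size=n)  # noise predictor
    y = 3.0 + 1.5 * x1 + rng.normal(size=n)

    reduced = sm.OLS(y, sm.add_constant(x1)).fit()
    full = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

    f_manual = ((reduced.ssr - full.ssr) / (reduced.df_resid - full.df_resid)
                ) / (full.ssr / full.df_resid)
    f_builtin, p_value, _ = full.compare_f_test(reduced)
    print(f_manual, f_builtin, p_value)  # the two F values agree
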
80
If the calculated F-statistic is larger than the critical value in an ANOVA test, what's the decision about the null hypothesis?
Reject the null hypothesis, concluding at least one group mean significantly differs.
81
How do you estimate variance from an ANOVA table?
The error variance (σ²) is estimated by the Mean Square Error (MSE), i.e., SSE divided by its degrees of freedom.
82
What's the meaning of an F-statistic close to or less than 1 in ANOVA?
Little or no evidence that group means significantly differ; most variation is within groups.
83
Why is Tukey's procedure preferred for multiple comparisons after ANOVA?
It effectively controls the experiment-wise error rate specifically for pairwise comparisons.
84
How do you calculate the experiment-wise error rate for multiple independent hypothesis tests at α-level?
1 − (1 − α)ⁿ, where n is the number of tests.
85
What's the difference between 'prediction' and 'standard error of prediction' in regression?
Prediction is the estimated response; standard error quantifies uncertainty of this prediction.
86
How do you interpret a prediction interval of 25±3?
The actual observation is expected to fall between 22 and 28 with specified confidence.
87
Why does the prediction interval width increase if the residual variance increases?
Higher residual variance indicates greater uncertainty in future individual observations.
88
If two models have very similar R², how do you choose the best one?
Select the simpler one with the smaller BIC/AIC, applying the principle of parsimony.
89
When comparing models, why might the BIC criterion favor a simpler model compared to AIC?
Because BIC penalizes additional predictors more heavily than AIC.
90
If model M4 and M6 have similar BIC values and R², which model should you choose?
Either model is acceptable; choose based on interpretability or simplicity of variables.
91
What is the effect of completing 10 more tournaments if it reduces a golfer's predicted scoring average by 0.03 per tournament?
A predicted reduction of about 0.3 in scoring average.
92
How can residual plots guide your choice between a complex versus a simple regression model?
They help identify violations of assumptions or the presence of unexplained patterns favoring a more complex model.
93
How do you calculate residual standard error from residual sum of squares (RSS)?
The square root of RSS divided by its residual degrees of freedom: RSE = √(RSS / (n − p − 1)), where p is the number of predictors.
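
A sketch of that formula; the helper and argument names are mine:

    # Residual standard error: sqrt(RSS / (n - p - 1)) for p predictors.
    import numpy as np

    def residual_standard_error(y, y_hat, p):
        rss = np.sum((y - y_hat) ** 2)
        return np.sqrt(rss / (len(y) - p - 1))
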
94
What might a residual plot showing a clear 'fan' or 'funnel' shape indicate?
Increasing variance (heteroscedasticity), violating constant variance assumption.
95
What practical action might you take if a data point has a very high Cook's distance?
Investigate its accuracy or consider analysis with and without this influential point.
96
Can a data point with a large residual have a small Cook's distance?
Yes, if the point has low leverage (typical predictor values).