Untitled Deck Flashcards
(100 cards)
What is the primary purpose of multiple linear regression?
To model the relationship between one dependent variable and two or more independent variables.
In the equation Y = β0 + β1X1 + β2X2 + ε, what does β0 represent?
The intercept; the expected value of Y when all independent variables are zero.
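For concreteness, a minimal sketch of fitting this model in Python (statsmodels is an assumed library choice; the data are simulated purely for illustration):

```python
# Minimal sketch: fit Y = β0 + β1X1 + β2X2 + ε on simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2.0 + 1.5 * X1 - 0.7 * X2 + rng.normal(scale=0.5, size=n)  # true β0 = 2.0

X = sm.add_constant(np.column_stack([X1, X2]))  # column of 1s carries β0
fit = sm.OLS(Y, X).fit()
print(fit.params)  # [β0, β1, β2]; β0 estimates E[Y] when X1 = X2 = 0
```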
How does multiple linear regression differ from simple linear regression?
Multiple linear regression involves two or more independent variables, while simple linear regression involves only one.
What assumption is made about the relationship between the dependent and independent variables in multiple linear regression?
The relationship is assumed to be linear.
What is multicollinearity, and why is it problematic in multiple linear regression?
Multicollinearity occurs when independent variables are highly correlated, making it difficult to assess the individual effect of each predictor.
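A sketch of checking this with variance inflation factors (VIFs), again assuming statsmodels; here X2 is simulated to nearly duplicate X1, so its VIF should be large (common rules of thumb flag values above roughly 5 to 10):

```python
# Sketch: large VIFs flag predictors that are nearly linear combinations of others.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
X1 = rng.normal(size=200)
X2 = X1 + rng.normal(scale=0.05, size=200)  # almost identical to X1
X = sm.add_constant(np.column_stack([X1, X2]))

for i in (1, 2):  # column 0 is the constant, so skip it
    print(f"VIF for X{i}: {variance_inflation_factor(X, i):.1f}")
```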
What does the coefficient β1 represent in a multiple linear regression model?
The expected change in the dependent variable for a one-unit increase in X1, holding other variables constant.
What is the purpose of the error term ε in a regression model?
It accounts for the variability in Y that cannot be explained by the linear relationship with the predictors.
How is the goodness-of-fit of a multiple linear regression model typically assessed?
By examining the R-squared value, which indicates the proportion of variance in the dependent variable explained by the model.
What is the difference between R-squared and adjusted R-squared?
Adjusted R-squared penalizes R-squared for the number of predictors in the model; unlike R-squared, which never decreases when a predictor is added, it falls when a new predictor does not improve the fit enough, making it the better measure for comparing models with different numbers of predictors.
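A sketch recomputing adjusted R-squared from its definition, 1 − (1 − R²)(n − 1)/(n − p − 1), where p is the number of predictors; here two of the three simulated predictors are deliberately useless:

```python
# Sketch: adjusted R² applies a penalty that grows with the predictor count p.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 50, 3
X = rng.normal(size=(n, p))
Y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=n)  # only the first predictor matters

fit = sm.OLS(Y, sm.add_constant(X)).fit()
adj_manual = 1 - (1 - fit.rsquared) * (n - 1) / (n - p - 1)
print(fit.rsquared, fit.rsquared_adj, adj_manual)  # last two agree; both < R²
```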
Why might adding more predictors to a regression model not always lead to a better model?
Adding unnecessary predictors can lead to overfitting, where the model captures noise rather than the underlying relationship.
What is the purpose of ANOVA in the context of regression analysis?
To assess the overall significance of the regression model by comparing the model variance to the residual variance.
In an ANOVA table, what does the ‘Sum of Squares’ represent?
Each Sum of Squares entry quantifies the variation attributable to one source: the total variation in the dependent variable is partitioned into a component explained by the regression model and a residual (error) component.
What does a significant F-statistic in an ANOVA table indicate about a regression model?
It suggests that at least one predictor variable significantly explains variation in the dependent variable.
How is the Mean Square Error (MSE) calculated in an ANOVA table?
By dividing the Sum of Squares for residuals by its corresponding degrees of freedom.
What is the null hypothesis tested by the overall F-test in regression ANOVA?
That all slope coefficients (every β except the intercept β0) are equal to zero, implying no linear relationship between the predictors and the dependent variable.
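A sketch of the overall F-test as a ratio of mean squares, F = (SSR/df_model)/(SSE/df_residual), checked against the value statsmodels reports. One naming caveat: the statsmodels attribute called `ssr` is the residual sum of squares, the opposite of these cards' SSR:

```python
# Sketch: the F statistic is mean square for regression over mean square error.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 2))
Y = 0.5 + X @ np.array([1.2, -0.4]) + rng.normal(size=80)

fit = sm.OLS(Y, sm.add_constant(X)).fit()
msr = fit.ess / fit.df_model   # regression SS / its degrees of freedom
mse = fit.ssr / fit.df_resid   # residual SS / its degrees of freedom (the MSE)
print(msr / mse, fit.fvalue)   # the two F values agree
print(fit.f_pvalue)            # small p-value → reject H0: all slopes are zero
```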
In the context of ANOVA, what does the term ‘degrees of freedom’ refer to?
The number of independent values or quantities that can vary in the analysis, associated with the sources of variation.
Why is the principle of parsimony important when interpreting ANOVA results in regression?
It emphasizes choosing the simplest model that adequately explains the data, avoiding overfitting with unnecessary predictors.
What does the Total Sum of Squares (SST) represent in an ANOVA table?
The total variation in the dependent variable around its mean.
How is the Regression Sum of Squares (SSR) interpreted in ANOVA?
It measures the portion of total variation explained by the regression model.
What is the Residual Sum of Squares (SSE) in an ANOVA context?
It quantifies the variation in the dependent variable that remains unexplained by the regression model.
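A sketch verifying the decomposition SST = SSR + SSE and the identity R² = SSR/SST on simulated data (naming follows these cards: SSR is the regression sum of squares, SSE the residual sum of squares):

```python
# Sketch: the total variation splits exactly into explained and residual parts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 2))
Y = 1.0 + X @ np.array([0.9, 0.3]) + rng.normal(size=60)

fit = sm.OLS(Y, sm.add_constant(X)).fit()
sst = fit.centered_tss  # total SS around the mean of Y
ssr = fit.ess           # explained (regression) SS: the cards' SSR
sse = fit.ssr           # residual SS: the cards' SSE

print(np.isclose(sst, ssr + sse))           # True
print(np.isclose(fit.rsquared, ssr / sst))  # True
```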
What is the Akaike Information Criterion (AIC) used for in model selection?
To compare models by balancing goodness-of-fit and complexity, penalizing models with more parameters.
How does the Bayesian Information Criterion (BIC) differ from AIC in model selection?
BIC's penalty per parameter is ln(n) rather than AIC's 2, so it penalizes model complexity more strictly and tends to favor simpler models than AIC does.
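A sketch comparing the two criteria on nested models, assuming statsmodels (simulated data; only the first of three predictors carries signal). Lower values are better for both:

```python
# Sketch: BIC punishes the extra, useless predictors more heavily than AIC,
# since its per-parameter penalty grows with the sample size.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 100
X = rng.normal(size=(n, 3))
Y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=n)

small = sm.OLS(Y, sm.add_constant(X[:, :1])).fit()  # 1 predictor
large = sm.OLS(Y, sm.add_constant(X)).fit()         # all 3 predictors

print(small.aic, large.aic)  # the small model should usually win
print(small.bic, large.bic)  # and by a wider margin under BIC
```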
What are the four primary assumptions of multiple linear regression?
Linearity, independence, homoscedasticity (constant variance), and normality of residuals.
How can you visually assess the linearity assumption in regression analysis?
By plotting residuals against fitted values; a random scatter suggests linearity.
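A sketch of that diagnostic plot, assuming matplotlib; a patternless horizontal band around zero supports linearity, and roughly constant spread supports homoscedasticity:

```python
# Sketch: residuals-vs-fitted plot; curvature suggests nonlinearity,
# a funnel shape suggests non-constant variance.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(6)
X = rng.normal(size=(100, 2))
Y = 1.0 + X @ np.array([1.0, -0.5]) + rng.normal(size=100)

fit = sm.OLS(Y, sm.add_constant(X)).fit()
plt.scatter(fit.fittedvalues, fit.resid, s=12)
plt.axhline(0, color="gray", linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()
```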