Week 6 theoretical questions Flashcards

1
Q

In assessing the linear relationship between two interval variables, what is common between Pearson’s coefficient of correlation and simple linear regression analysis?

A

Common: Both methods can indicate whether a linear relationship exists between the two interval variables and, if so, the direction (positive or negative) of that relationship.

2
Q

In assessing the linear relationship between two interval variables, what is different between Pearson’s coefficient of correlation and simple linear regression analysis?

A

Different: Pearson’s coefficient of correlation measures the strength of the linear relationship on the range [-1, 1], while the slope estimate in simple linear regression measures the expected change in y for a one-unit change in x.
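The contrast can be seen numerically. Below is a minimal sketch with hypothetical data: Pearson's r and the least-squares slope always share the same sign, but r is unitless and bounded, while the slope is in units of y per unit of x.

```python
import numpy as np

# Illustrative data (hypothetical): x and y with a positive linear trend.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Pearson's r: strength and direction of the linear relationship, in [-1, 1].
r = np.corrcoef(x, y)[0, 1]

# Simple linear regression slope: expected change in y per one-unit change in x.
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Both share the same sign; here both indicate a strong positive relationship.
print(round(r, 3), round(slope, 3))
```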

3
Q

What is the ANOVA test for linear regression?

A

The ANOVA test is essentially an F-test that compares the mean square for regression (the sum of squares for regression divided by its degrees of freedom) against the mean square for error (the sum of squares for error divided by its degrees of freedom).
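A minimal sketch of this decomposition, using hypothetical data: total variation splits into SSR (explained) plus SSE (unexplained), and the F-statistic is the ratio of their mean squares.

```python
import numpy as np

# Hypothetical data for a simple linear regression.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 2.3, 2.9, 4.1, 4.8, 6.2])
n, k = len(x), 1  # k = number of explanatory variables

# Fit by least squares.
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# ANOVA decomposition: SST = SSR + SSE.
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the model
sse = np.sum((y - y_hat) ** 2)         # unexplained variation

f_stat = (ssr / k) / (sse / (n - k - 1))  # F = MSR / MSE
print(round(f_stat, 2))
```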

4
Q

What conclusion can we obtain from this test?

A

The test result can inform us whether the variation in the outcome variable explained by the regression model is sufficiently large relative to the variation unexplained (corresponding to a rejection of the null hypothesis in the F-test).

5
Q

What is the difference between prediction interval and confidence interval in making prediction based on an estimated linear regression model?

A

A prediction interval is used when one is interested in predicting a single value of the dependent variable given certain value(s) of the explanatory variable(s), while a confidence interval is used when one is interested in estimating the mean of all values of the dependent variable given those value(s). All else being equal, the confidence interval is always narrower than the prediction interval.
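The width difference comes from the standard-error formulas. A sketch with hypothetical data (the intervals themselves are the fitted value plus or minus a t critical value times these standard errors; the extra "1 +" term under the square root is what makes the prediction interval wider):

```python
import numpy as np

# Hypothetical data; we want to predict y at x0 = 4.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.0, 2.9, 4.2, 4.8, 6.1, 6.9, 8.2, 8.8])
x0, n = 4.0, len(x)

# Fit the simple linear regression by least squares.
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
s = np.sqrt(np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2))  # residual standard error
sxx = np.sum((x - x.mean()) ** 2)

# Each interval is y_hat(x0) +/- t_{n-2} * se, with different standard errors:
se_ci = s * np.sqrt(1 / n + (x0 - x.mean()) ** 2 / sxx)      # mean of y at x0
se_pi = s * np.sqrt(1 + 1 / n + (x0 - x.mean()) ** 2 / sxx)  # a single new y at x0

# The extra "1 +" term makes the prediction interval strictly wider.
print(se_ci < se_pi)
```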

6
Q

What is the meaning of ceteris paribus?

A

Ceteris paribus: Other relevant factors being equal (all else being equal; holding all other relevant factors constant)

7
Q

Why is multiple linear regression analysis (compared to simple linear regression analysis) better able to support ceteris paribus inference?

A

Reason: By modeling the dependent variable as a function of multiple independent variables, multiple linear regression analysis can explicitly control for many other factors that simultaneously affect the dependent variable when we assess the effect of the focal independent variable on the dependent variable.

8
Q

Why do we have adjusted R^2 in multiple linear regression analysis?

A

Adjusted R^2 addresses a drawback of the ordinary R^2: as more variables (even irrelevant ones) are included in the model, R^2 never decreases. Adjusted R^2 corrects for this by imposing a penalty for each additional independent variable, so that it increases only if the additional variables explain a sufficiently large extra share of the variation in y.
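The penalty is visible in the standard formula, adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1). A sketch with hypothetical numbers, where adding two weak variables nudges R^2 up but pushes adjusted R^2 down:

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with n observations and k explanatory variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical: R^2 rises from 0.50 to 0.51 after adding two near-irrelevant
# variables (n = 30 observations), yet the penalty makes adjusted R^2 fall.
print(round(adjusted_r2(0.50, 30, 3), 4))  # baseline model, 3 variables
print(round(adjusted_r2(0.51, 30, 5), 4))  # larger model, 5 variables
```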

9
Q

What is the issue of multicollinearity?

A

Multicollinearity is an issue in multiple linear regression analysis where two or more of the independent variables are strongly correlated. In other words, at least one of the independent variables can be largely approximated by a linear function of the other independent variables.
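One common way to detect this is the variance inflation factor (VIF): regress each independent variable on the others and compute 1/(1 - R^2). A sketch with simulated data, where x3 is constructed to be nearly a linear function of x1 and x2:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2 * x1 - x2 + rng.normal(scale=0.1, size=n)  # nearly linear in x1, x2
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    """Variance inflation factor: regress column j on the others; VIF = 1/(1-R^2)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(y)), others])  # add an intercept
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    r2 = 1 - resid.var() / y.var()
    return 1 / (1 - r2)

# x3's VIF will be very large, flagging the multicollinearity.
print([round(vif(X, j), 1) for j in range(3)])
```

A common rule of thumb treats a VIF above roughly 10 as a sign of problematic multicollinearity.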

10
Q

How is a categorical variable with k levels included as an explanatory variable in a linear regression model? How do we interpret the estimated coefficients?

A

k-1 dummy variables should be created corresponding to k-1 levels of the categorical variable. That is, one (arbitrary) level has to be omitted. Then the k-1 dummy variables, rather than the original variable, are included in the regression model.

The estimated coefficient of any of the k-1 dummy variables indicates, ceteris paribus (all other factors being the same), the expected mean difference in the outcome variable between the focal category and the omitted category.
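As a sketch of the mechanics with hypothetical data, pandas can build the k-1 dummies directly; here the three education levels yield two dummy columns, with "BA" (alphabetically first) serving as the omitted reference category:

```python
import pandas as pd

# Hypothetical data: salary by education level (3 categories -> 2 dummies).
df = pd.DataFrame({
    "education": ["HS", "BA", "MA", "BA", "HS", "MA"],
    "salary":    [40,   55,   65,   52,   42,   70],
})

# drop_first=True omits one level; with sorted categories BA, HS, MA,
# "BA" becomes the reference category.
dummies = pd.get_dummies(df["education"], prefix="edu", drop_first=True)
print(dummies.columns.tolist())
```

In a regression of salary on these dummies, the coefficient on edu_MA would then estimate the expected mean salary difference between the MA and BA groups, ceteris paribus.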
