Multiple Regression Flashcards

(39 cards)

1
Q

What is the date and time of the in-person exam?

A

29 May, 9:30 am – 11:30 am

2
Q

What percentage of the overall module mark is the exam worth?

A

60%

3
Q

What are the two parts of the exam and their respective weights?

A

Part A (20%): 5 multiple choice questions on theory; Part B (80%): 5 exercises with equations

4
Q

When is the revision session for exam preparation?

A

Thursday 1 May, 15:15–17:05, in CB 1.11

5
Q

What method is used to find the best-fitting line in regression?

A

Least squares method
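
As a minimal sketch of what "least squares" means in practice, the Python below fits a line to a small made-up data set and checks that nudging the line makes the sum of squared deviations worse. The function names `fit_line` and `sse` and the data are illustrative assumptions, not part of the module material.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    # The least-squares line passes through (mean_x, mean_y).
    a = mean_y - b * mean_x
    return a, b


def sse(a, b, xs, ys):
    """Sum of squared vertical deviations of the points from the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
best = sse(a, b, xs, ys)
```

Any other intercept or slope, for example `sse(a + 0.1, b, xs, ys)`, gives a larger total than `best`.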

6
Q

What does the correlation coefficient represent?

A

Strength and direction of linear association between 2 variables

7
Q

What is a positive relationship in correlation?

A

An increase in X is associated with an increase in Y

8
Q

What is a negative relationship in correlation?

A

An increase in X is associated with a decrease in Y

9
Q

Define simple linear regression.

A

Describes the nature and rate of change in the linear relationship between two variables

10
Q

What does regression tell us about the dependent variable (Y)?

A

How much Y is expected to change on average when X increases by one unit

11
Q

What is the equation for a straight line in regression?

A

Y = a + b * X

12
Q

What does the term ‘residual’ refer to in regression analysis?

A

The vertical distance between a data point and the regression line, i.e. the observed Y minus the Y predicted by the line
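
A short sketch of how residuals could be computed, assuming a made-up data set and its least-squares line (a = 2.2, b = 0.6). The helper name `residuals` is hypothetical.

```python
def residuals(a, b, xs, ys):
    """Observed Y minus the Y predicted by the line Y = a + b * X."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares line for this particular data set
res = residuals(a, b, xs, ys)
total = sum(res)  # for a least-squares line the residuals cancel out
```

Points above the line have positive residuals and points below it have negative ones, which is why the fit is judged on their squares rather than on their sum.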

13
Q

What is the significance of the intercept ‘a’ in the regression equation?

A

The predicted value of Y when X = 0

14
Q

What does a high R-squared value indicate?

A

A good fit of the model to the data: the model accounts for a large share of the variation in Y
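
One common way to compute R-squared is one minus the ratio of unexplained to total variation in Y. The sketch below assumes made-up observed values and the predictions of a hypothetical fitted line; the function name `r_squared` is illustrative.

```python
def r_squared(ys, predicted):
    """Share of the variation in Y accounted for by the predictions."""
    mean_y = sum(ys) / len(ys)
    # Unexplained variation: squared residuals.
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    # Total variation: squared deviations of Y from its mean.
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Made-up observed values and the predictions of a fitted line.
ys = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(ys, predicted)
```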

15
Q

In multiple regression, what does the equation Y = a + b1 * X1 + b2 * X2 + … + bk * Xk represent?

A

The relationship between multiple predictor variables and the outcome variable

16
Q

What is the purpose of the adjusted R-squared?

A

Adjusts R-squared for the number of variables included in the model, so that models with different numbers of predictors can be compared
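
The usual textbook formula for the adjustment can be sketched as below. The sample size and R-squared values are hypothetical and are not taken from the module examples.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k predictors.

    k counts the predictors only, not the intercept.
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical numbers: the same raw R-squared with 1 and with 3 predictors.
one_predictor = adjusted_r_squared(0.20, n=50, k=1)
three_predictors = adjusted_r_squared(0.20, n=50, k=3)
```

With the raw R-squared held fixed, the model with more predictors gets the lower adjusted value, so an extra variable only raises adjusted R-squared if it improves the fit enough to offset the penalty.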

17
Q

What does the coefficient ‘b’ represent in a regression model?

A

The expected change in Y when X increases by one unit (in multiple regression, holding the other predictors constant)

18
Q

What is multicollinearity?

A

When two or more predictors are strongly correlated with each other, which makes the individual regression estimates unreliable

19
Q

What happens when a predictor variable is added to a regression model?

A

R-squared and the size and significance of the ‘b’ coefficients can all change, so each should be checked again

20
Q

Fill-in-the-blank: The best fitting line minimizes the sum of the _______.

A

squared deviations

21
Q

True or False: Correlation is symmetric.

A

True: the correlation between X and Y is the same as the correlation between Y and X
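
The symmetry of Pearson's correlation can be checked numerically, and contrasted with regression, where swapping X and Y changes the slope. This is a sketch on made-up data; `pearson_r` and `slope` are illustrative helper names.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope b when regressing ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r_xy = pearson_r(xs, ys)
r_yx = pearson_r(ys, xs)   # same value: correlation ignores which is X
b_y_on_x = slope(xs, ys)
b_x_on_y = slope(ys, xs)   # different value: regression does not
```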

22
Q

What is the dependent variable in the example about wealth and corruption?

A

CPI score (0 to 10; highly corrupt to highly clean)

23
Q

What is the significance of a p-value < 0.05 in regression analysis?

A

Indicates that the predictor variable is statistically significant at the 5% level

24
Q

What does the term ‘predictor variable’ refer to?

A

An independent variable that is used to predict the outcome variable

25
What is the relationship between GDP and CPI in the example provided?
Higher GDP is associated with lower levels of perceived corruption
26
What is the effect of a 1 ppt increase in the immunisation rate on infant mortality when GNP is included in the model?
1.122 deaths per thousand live births. ## Footnote This indicates that when controlling for GNP, the relationship between immunisation rates and infant mortality becomes less pronounced.
27
Why is it important to compare countries with the same GNP when analyzing immunisation rates and infant mortality?
To ensure that the relationship observed is not confounded by differences in wealth. ## Footnote Countries with higher immunisation rates are often richer, which can independently affect infant mortality.
28
What happens to the estimated effect of immunisation rate on infant mortality when female illiteracy is included in the model?
The estimated effect falls to 0.514 deaths per thousand live births per 1 ppt increase in immunisation rate. ## Footnote Including female illiteracy accounts for another confounding variable affecting infant mortality.
29
What is multicollinearity in the context of regression models?
It occurs when two predictors are strongly correlated with each other (r > 0.8), in which case one of them should be dropped from the model. ## Footnote This ensures that the unique effect of each predictor can be accurately measured.
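
A simple screening step along these lines could be sketched as follows, flagging any pair of predictors whose correlation exceeds the 0.8 threshold from the card. The variable names and values are made up, and a pairwise check like this is only one possible diagnostic.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical predictors; "literacy" rises almost in step with "gnp".
predictors = {
    "gnp": [1, 2, 3, 4, 5, 6],
    "literacy": [2, 4, 5, 8, 9, 12],
    "rainfall": [5, 1, 4, 2, 6, 3],
}
flagged = collinear_pairs(predictors)
```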
30
What does the Standardised Beta coefficient represent in regression analysis?
It indicates how many standard deviations Y changes when X changes by one of its own standard deviations. ## Footnote This allows for comparison of the effects of different predictors measured in different units.
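A standardised coefficient can be obtained from the unstandardised one by rescaling with the two standard deviations. The sketch below assumes made-up data and a slope of 0.6 fitted to it; `standardised_beta` is an illustrative name.

```python
from statistics import stdev


def standardised_beta(b, xs, ys):
    """Convert an unstandardised slope b into standard-deviation units."""
    return b * stdev(xs) / stdev(ys)


# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b = 0.6  # least-squares slope of ys on xs for this data set
beta = standardised_beta(b, xs, ys)
```

With a single predictor, the standardised beta equals the correlation coefficient between X and Y.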
31
Fill in the blank: A dummy variable in regression is a categorical variable with just ______ values.
two. ## Footnote Typically, these values are 0 and 1, representing two categories.
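To see how a 0/1 dummy behaves in a regression, the sketch below fits a line with a dummy as the only predictor. The wage figures are invented, and `fit_line` is an illustrative helper.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + b * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Invented data: dummy coded 1 = male, 0 = female.
male = [0, 0, 0, 1, 1, 1]
wage = [10, 12, 11, 14, 15, 13]
a, b = fit_line(male, wage)

# Group means, for comparison with the fitted coefficients.
mean_female = sum(w for m, w in zip(male, wage) if m == 0) / male.count(0)
mean_male = sum(w for m, w in zip(male, wage) if m == 1) / male.count(1)
```

Here the intercept `a` matches the mean of the group coded 0, and `b` matches the difference between the two group means.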
32
What is the null hypothesis (H0) regarding the effect of male gender on wage in the given example?
Holding education level constant, male gender has NO effect on wage. ## Footnote This serves as a baseline to test against the alternative hypothesis (HA).
33
What was found about student evaluations of teaching with respect to gender?
Male lecturers receive evaluations that are 11.305 percentage points higher than female lecturers of the same age and ethnic background. ## Footnote This suggests a gender bias in student evaluations.
34
What percentage of the variation in student evaluations is accounted for by the model discussed?
19.6%. ## Footnote This indicates that a large share of the variation (80.4%) remains unexplained, suggesting other factors influence evaluations.
35
What are some factors that might affect student evaluations beyond the model's variables?
* Actual quality of teaching
* Subjective experience of teaching
* Time of year
* Availability of cookies
## Footnote These factors highlight the complexity of evaluating teaching quality.
36
What is the significance of the coefficient for gender in the multivariate regression results?
It increased from 10.994 to 11.305 once the other variables were included, and the effect of male gender is significant. ## Footnote This reflects that gender has a measurable impact on evaluations.
37
What is the purpose of adding categorical variables in a regression model?
To assess the effect of categories such as gender or race on the dependent variable. ## Footnote This allows for a more nuanced understanding of how different groups are evaluated.
38
What is the adjusted R-Square value after adding the age and ethnicity variables to the regression model?
Increased from 0.140 to 0.196. ## Footnote This indicates an improvement in the model's explanatory power.
39
True or False: Standardised Beta coefficients can be used to compare the effects of different predictors measured in different units.
True. ## Footnote They allow for meaningful comparisons by converting effects into standard deviation units.