Simple Regression Flashcards

(40 cards)

1
Q

What is the most common tool of the applied economist?

A

Regression

2
Q

What is regression?

A

It is used to help understand the relationships between many variables

3
Q

What does regression do on an XY-plot?

A

It fits a line through the points in the XY-plot that best captures the relationship between X & Y

4
Q

What is the equation of a straight line (linear function)?

A

Y = 𝛼 + 𝛽X

5
Q

What is 𝛼 in the straight line equation?

A

The intercept

6
Q

What is 𝛽 in the straight line equation?

A

The slope

7
Q

Why would we never get all points on an XY-plot lying precisely on a straight line?

A

Due to measurement error

8
Q

Is the straight line the true relationship in an XY-plot?

A

The true relationship is probably more complicated; a straight line may just be an approximation

9
Q

What happens to important variables which affect Y?

A

They may be omitted

10
Q

What is the simple regression model?

A

Y = 𝛼 + 𝛽X + 𝑒

11
Q

What is 𝑒 in the simple regression model?

A

The error term
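
To make the role of the error term concrete, here is a minimal sketch that simulates data from Y = 𝛼 + 𝛽X + 𝑒. The values of 𝛼 and 𝛽, the sample size, and the choice of normally distributed errors are all illustrative assumptions, not part of the cards.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

alpha = 1.0  # hypothetical intercept
beta = 2.0   # hypothetical slope

# Ten X values and one random error per observation (assumed normal, sd = 1).
x = [float(i) for i in range(1, 11)]
e = [random.gauss(0.0, 1.0) for _ in x]

# The simple regression model: Y = alpha + beta * X + e
y = [alpha + beta * xi + ei for xi, ei in zip(x, e)]

# Each point sits off the straight line by exactly its error term.
deviations = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
for xi, yi, di in zip(x, y, deviations):
    print(f"X={xi:4.1f}  Y={yi:6.2f}  distance from line={di:+.2f}")
```

Because of 𝑒, none of the simulated points lies exactly on the line Y = 𝛼 + 𝛽X.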

12
Q

What does regression analysis use?

A

It uses data (X and Y) to make a guess or estimate of what 𝛼 and 𝛽 are

13
Q

What happens if there are more than two points on the XY-plot?

A

It won’t be possible to find a line that fits perfectly through all points

14
Q

Why do we need to find the “best fitting” line?

A

Because it makes the residuals as small as possible

15
Q

What do we mean by “as small as possible”?

A

The one that minimises the sum of squared residuals

16
Q

What is the most common method used to fit a line to the data?

A

We obtain the “Ordinary Least Squares” or OLS estimator

17
Q

How do we choose 𝛼 and 𝛽?

A

So that the vertical distances from the data points to the fitted line are minimised

18
Q

What does OLS do?

A

It minimises the sum of the squared residuals
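
As a rough illustration of what OLS does, the sketch below computes the standard closed-form estimates for simple regression and checks that moving away from them increases the sum of squared residuals. The data values and the function names `ols_fit` and `sum_squared_residuals` are made up for this example.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for the simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: co-variation of X and Y divided by the variation of X.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    # Intercept: the fitted line passes through the point of means.
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


def sum_squared_residuals(x, y, alpha, beta):
    """Sum of squared vertical distances from the points to the line."""
    return sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))


# Illustrative data (not from the cards).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

alpha_hat, beta_hat = ols_fit(x, y)
ssr_ols = sum_squared_residuals(x, y, alpha_hat, beta_hat)
# Any other line, such as this arbitrarily shifted one, fits worse.
ssr_other = sum_squared_residuals(x, y, alpha_hat + 0.5, beta_hat - 0.1)

print("alpha_hat:", alpha_hat, "beta_hat:", beta_hat)
print("SSR at OLS estimates:", ssr_ols)
print("SSR at another line: ", ssr_other)
```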

19
Q

What is Y?

A

Dependent variable

20
Q

What is X?

A

Explanatory (or independent) variable

21
Q

What are 𝛼 and 𝛽?

A

The coefficients of the regression model (the intercept and the slope)

22
Q

What are 𝛼’ and 𝛽’ ?

A

OLS estimates of coefficients

23
Q

How do we decide which is the dependent variable?

A

Ideally, the explanatory variable should be the one which causes/influences the dependent variable (X causes Y)

24
Q

What is an example of a model with this dependent variable?

A

Increases in X (population density) cause Y (deforestation) to increase, not vice versa

25
Why must great care be taken in interpreting regression results as reflecting causality?
In some cases the assumption that X causes Y may be wrong: we may not know whether X causes Y, X may cause Y but Y may also cause X, or the whole concept of causality may be inappropriate
26
What question does regression address?
How much of the variability in Y can be explained by X?
27
What do good fitting models have?
Small residuals
28
What does it mean if the residual is big for one observation?
Then it is an outlier
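
A minimal sketch of inspecting fitted values and residuals: the data below lie on a straight line except for one deliberately distorted observation. Flagging the largest absolute residual is only an illustrative rule of thumb here, not a formal outlier test, and the data are invented.

```python
# Illustrative data: Y = 1 + 2X exactly, except the 4th point (index 3).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 5.0, 7.0, 20.0, 11.0, 13.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for simple regression.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
alpha_hat = y_bar - beta_hat * x_bar

# Fitted value = point on the line; residual = actual Y minus fitted value.
fitted = [alpha_hat + beta_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# Index of the observation with the largest absolute residual.
largest_index = max(range(n), key=lambda i: abs(residuals[i]))

for i in range(n):
    flag = "  <- largest residual" if i == largest_index else ""
    print(f"obs {i}: fitted={fitted[i]:6.2f}  residual={residuals[i]:+6.2f}{flag}")
```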
29
Why is it good to look at fitted values and residuals?
It can be very informative
30
What is the coefficient of determination?
R^2, which rests on this decomposition: the total variability in the dependent variable (Y) equals the variability explained by the explanatory variable (X) in the regression plus the variability that cannot be explained and is left as an error
31
What is R^2 known as?
The most common goodness-of-fit statistic
32
What is one way to define R^2?
It is the square of the correlation coefficient between Y and the fitted values Ŷ
33
We can split the TSS into two parts, what are these parts?
Explained Sum of Squares and the Residual Sum of Squares
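
The split of the TSS can be checked numerically. The sketch below fits a simple regression by OLS on made-up data and computes the three sums of squares; R^2 is then taken as ESS / TSS, the usual definition.

```python
# Illustrative data (not from the cards).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS estimates for simple regression.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xi for xi in x]

# Total, explained, and residual sums of squares.
tss = sum((yi - y_bar) ** 2 for yi in y)
ess = sum((fi - y_bar) ** 2 for fi in fitted)
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = ess / tss

print("TSS:", tss)
print("ESS + RSS:", ess + rss)
print("R^2:", r_squared)
```

With an intercept in the model, TSS = ESS + RSS holds up to floating-point rounding, so ESS / TSS and 1 - RSS / TSS give the same R^2.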
34
Between which values must R^2 lie?
It must always lie between 0 and 1
35
What does R^2 = 1 mean?
Perfect fit - all data points are exactly on regression line
36
What does R^2 = 0 mean?
X does not have any explanatory power for Y whatsoever
37
What do bigger values of R^2 imply?
That X has more explanatory power for Y
38
R^2 is equal to what?
The square of the correlation between X and Y
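
A short numerical check of this, on made-up data: in simple regression with an intercept, R^2 matches the squared correlation between X and Y, and also the squared correlation between Y and the fitted values. The helper `correlation` is written out by hand for this sketch.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    s_aa = sum((ai - a_bar) ** 2 for ai in a)
    s_bb = sum((bi - b_bar) ** 2 for bi in b)
    return s_ab / math.sqrt(s_aa * s_bb)


# Illustrative data (not from the cards).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# OLS fit and R^2 = 1 - RSS / TSS.
beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
alpha_hat = y_bar - beta_hat * x_bar
fitted = [alpha_hat + beta_hat * xi for xi in x]
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
tss = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - rss / tss

print("R^2:                ", r_squared)
print("corr(X, Y)^2:       ", correlation(x, y) ** 2)
print("corr(Y, fitted)^2:  ", correlation(y, fitted) ** 2)
```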
39
What does R^2 measure?
The proportion of the variability in Y that can be explained by X
40
How do we carry out non-linear regression?
Replace Y or X (or both) in the regression model by a suitable non-linear transformation (ln(Y) or X^2)
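
One possible sketch of the transformation idea: the made-up data below follow Y = exp(0.5 + 0.3X) exactly, so Y is non-linear in X, but ln(Y) is linear in X and ordinary OLS on the transformed variable recovers the two coefficients. The values 0.5 and 0.3 are assumptions chosen for illustration.

```python
import math

# Illustrative data generated from Y = exp(0.5 + 0.3 * X), with no error term.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

# Replace Y by the non-linear transformation ln(Y).
ln_y = [math.log(yi) for yi in y]

n = len(x)
x_bar = sum(x) / n
ln_y_bar = sum(ln_y) / n

# Closed-form OLS of ln(Y) on X.
beta_hat = sum((xi - x_bar) * (li - ln_y_bar) for xi, li in zip(x, ln_y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
alpha_hat = ln_y_bar - beta_hat * x_bar

print("alpha_hat:", alpha_hat)
print("beta_hat: ", beta_hat)
```

The same approach applies to a transformation of X: create the new variable (for example X^2) and regress Y on it.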