Week 2: finding the quantitative relationship between 2 variables Flashcards

1
Q

What principle do we use when we estimate b0 and b1 (using their formulas)?

A

the least squares principle

2
Q

What does the least squares principle guarantee?

A

that the regression line is the best fit to the data

3
Q

What are the b0 and b1 equations derived from?

A

minimising the sum of the squares of the vertical distances between the observed values Yi and the predicted values Ŷi of the dependent variable:

min∑(Yi−Ŷi)^2 = min∑(Yi−(b0+b1Xi))^2
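
As a rough illustration of this minimisation, the sketch below computes b0 and b1 with the standard closed-form least squares formulas (b1 = Σ(Xi − x̄)(Yi − ȳ) / Σ(Xi − x̄)^2 and b0 = ȳ − b1·x̄). These formulas are the textbook ones, not quoted from the cards, and the data and the function name `fit_line` are invented for the example.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimise the sum of squared vertical distances."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared X deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # slope
    b0 = y_bar - b1 * x_bar   # intercept
    return b0, b1


# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

For this made-up data the intercept comes out at about 2.2 and the slope at about 0.6.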

4
Q

What does the least squares principle guarantee?

A
  • that the regression line obtained has the smallest sum of squared residuals
  • that the regression line is the best approximation to the quantitative relationship existing between the variables X and Y
5
Q

What assumptions underlie linear regression? (4)

A

Linearity

Independence of Errors

Normality of Error

Equal Variance (AKA homoscedasticity)

6
Q

What is the linearity assumption?

A

the relationship between X and Y is linear

7
Q

What is the ‘independence of errors’ assumption?

A

error values are statistically independent

this is particularly important when data is collected over a period of time

8
Q

What is the ‘normality of error’ assumption?

A

error values are normally distributed

9
Q

What is the ‘Equal Variance’ assumption?

A

the probability distribution of the errors has constant variance

10
Q

What is the residual for observation i, ei?

A

the difference between its observed and predicted value

ei = Yi - Ŷi
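
A minimal sketch of this definition, assuming a line has already been fitted. The data and the coefficients b0 = 2.2 and b1 = 0.6 are hypothetical values chosen for the example, not taken from the course material.

```python
# Hypothetical data and fitted coefficients, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

# Predicted value for each observation: Y-hat_i = b0 + b1 * X_i
y_hat = [b0 + b1 * x for x in xs]

# Residual for each observation: e_i = Y_i - Y-hat_i
residuals = [y - yh for y, yh in zip(ys, y_hat)]

for x, e in zip(xs, residuals):
    print(f"X = {x}: residual = {e:+.2f}")
```

Because these coefficients are the least squares estimates for this data, the residuals sum to zero up to floating-point rounding.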

11
Q

How do you check the assumptions of regression?

A

by examining the residuals:

-examine for linearity assumption
-evaluate independence assumption
-evaluate normality assumption
-examine for constant variance for all levels of X (homoscedasticity)

12
Q

How would you do a graphical analysis of residuals to investigate the assumptions?

A

plot residuals vs X

13
Q

What happens to the histogram of the residuals when the assumption of Normality is satisfied?

A

the histogram of the residuals approximates the bell shape of a normal distribution
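
One way to eyeball this without a plotting tool is a quick text histogram. The sketch below is only an illustration: the residuals are made up, and the bin width of 1 is an arbitrary choice.

```python
import math
from collections import Counter

# Hypothetical residuals, invented for illustration only.
residuals = [-2.1, -1.4, -0.9, -0.6, -0.3, -0.1, 0.0,
             0.2, 0.4, 0.7, 1.0, 1.5, 2.2]

# Bin width 1: each residual e falls in the bin [floor(e), floor(e) + 1).
counts = Counter(math.floor(e) for e in residuals)

# Print one row of stars per bin, from the lowest bin to the highest.
for left in range(min(counts), max(counts) + 1):
    print(f"[{left:>2}, {left + 1:>2}) {'*' * counts[left]}")
```

Here the bars are tallest around zero and taper off on both sides, which is the rough bell shape to look for. With real data a proper histogram or a normal probability plot would be more informative.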

14
Q

Why do we need to compare two or more different regression models?

A

different estimation methods (different formulas to calculate the slope and intercept)

different populations, different samples, different variables

15
Q

What statistical instruments can be used to make a comparison?

A

total sum of squares

R^2

standard error

16
Q

What equation do you use to work out total variation?

A

SST = SSR + SSE

Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

17
Q

What does SST stand for?

A

Total Sum of Squares

18
Q

What does SSR stand for?

A

Regression Sum of Squares

19
Q

What does SSE stand for?

A

Error Sum of Squares

20
Q

How do you work out SST (Total Sum of Squares)?

A

SST = ∑(Yi - ȳ)^2

21
Q

How do you work out SSR (Regression Sum of Squares)?

A

SSR = ∑(Ŷi - ȳ)^2

22
Q

How do you work out SSE (Error Sum of Squares)?

A

SSE = ∑(Yi - Ŷi)^2
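
The three sums of squares can be checked numerically. The sketch below uses made-up data and fits the line with the standard closed-form least squares formulas (an assumption; the cards do not list them), then confirms that SST = SSR + SSE.

```python
# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept (standard closed-form formulas).
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)                # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # unexplained variation

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print("SST equals SSR + SSE (up to rounding):", abs(sst - (ssr + sse)) < 1e-9)
```

The decomposition holds exactly only when the predicted values come from a least squares fit that includes an intercept, which is the case here.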

23
Q

What type of variation is SST (Total Sum of Squares)?

A

Total Variation

Measures the variation of the Yi values around their mean ȳ

24
Q

What type of variation is SSR (Regression Sum of Squares)?

A

Explained Variation

Variation attributable to the relationship between X and Y

25
Q

What type of variation is SSE (Error Sum of Squares)?

A

Unexplained Variation

Variation in Y attributable to factors other than X

26
Q

What is the coefficient of determination?

A

the portion of the total variation in the dependent variable that is explained by variation in the independent variable

27
Q

What is the coefficient of determination also known as?

A

R-square, denoted as R^2

28
Q

What is the equation for R^2 (the Coefficient of Determination)?

A

R^2 = SSR / SST = regression sum of squares / total sum of squares

R^2 = ∑(Ŷi - ȳ)^2 / ∑(Yi - ȳ)^2

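
A small sketch of this ratio in code. The observed values and the fitted values are hypothetical, and the helper name `r_squared` is made up for the example. The fitted values are assumed to come from a least squares line with an intercept, since SSR / SST is only a valid R^2 in that case.

```python
def r_squared(ys, y_hat):
    """Coefficient of determination as SSR / SST."""
    y_bar = sum(ys) / len(ys)
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)  # regression sum of squares
    sst = sum((y - y_bar) ** 2 for y in ys)       # total sum of squares
    return ssr / sst


# Hypothetical observed values and least squares fitted values.
ys = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(ys, y_hat)
print(f"R^2 = {r2:.3f}")
```

For this made-up data R^2 is about 0.6, so roughly 60 percent of the variation in Y is explained by variation in X.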
29
Q

What does R^2 have to be between?

A

0 ≤ R^2 ≤ 1

30
Q

If R^2 = 1 describe the relationship between X and Y and the variation.

A

there is a perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X

31
Q

If R^2 = 0 describe the relationship between X and Y and the variation.

A

no linear relationship between X and Y: none of the variation in Y is explained by variation in X

32
Q

If R^2 = 0.6 describe the relationship between X and Y and the variation.

A

Strong linear relationship between X and Y: most of the variation in Y is explained by variation in X

33
Q

If R^2 = 0.4 describe the relationship between X and Y and the variation.

A

Weaker linear relationship between X and Y: some but not all of the variation in Y is explained by variation in X

34
Q

What is another way to work out R^2?

A

by working out the correlation coefficient (R) and then squaring it

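
As a sketch of this alternative route, the code below computes the Pearson correlation coefficient from its usual definition and squares it. The data and the helper name `pearson_r` are invented for the example, and the shortcut applies to simple linear regression with a single X variable.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r2 = r ** 2
print(f"R = {r:.4f}, R^2 = {r2:.4f}")
```

For this made-up data the squared correlation is about 0.6, the same value that SSR / SST gives for the least squares line fitted to the same points.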
35
Q

If R^2 = 0.576, how could this be expressed as a proportion or percent?

A

57.6 percent of the variation in the Y variable is explained by the variation in the X variable

36
Q

What is the equation for the standard deviation of the variation of observations around the regression line? (What does Syx = ?)

A

Syx = √(SSE / (n-2)) = √(∑(Yi - Ŷi)^2 / (n-2))

where
SSE = error sum of squares
n = sample size

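
A minimal sketch of the Syx calculation, assuming the fitted values are already available. The data, the fitted values, and the function name `standard_error_of_estimate` are hypothetical.

```python
import math


def standard_error_of_estimate(ys, y_hat):
    """Syx = sqrt(SSE / (n - 2)) for a simple linear regression."""
    n = len(ys)
    sse = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # error sum of squares
    return math.sqrt(sse / (n - 2))


# Hypothetical observed values and least squares fitted values.
ys = [2, 4, 5, 4, 5]
y_hat = [2.8, 3.4, 4.0, 4.6, 5.2]

syx = standard_error_of_estimate(ys, y_hat)
print(f"Syx = {syx:.4f}")
```

Note that the division by n - 2 happens inside the square root: with SSE of about 2.4 and n = 5, Syx is √0.8, roughly 0.894.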
37
Q

What are the steps for working out regression in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range, the X range and any desired options
5) get the coefficient values; the Intercept coefficient goes before the X Variable 1 coefficient:

Ŷi (e.g. sales) = Intercept coefficient + X Variable 1 coefficient × Xi (e.g. calls)

38
Q

How do you get R^2 in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range, the X range and any desired options
5) look at the R Square value in the Regression Statistics table
6) OR look at the ANOVA table: the SS Regression value = SSR and the SS Total value = SST
7) put the values into the equation R^2 = SSR / SST

39
Q

How do you get the value for Syx (standard error) in Excel?

A

1) select DATA from the title bar menu
2) click on the DATA ANALYSIS button
3) select REGRESSION from the contextual menu
4) enter the Y range, the X range and any desired options
5) look at the Regression Statistics table: the Standard Error value = Syx

40
Q

How do you add the prediction line to the plot of fitted and observed data?

A

1) click on one of the observed values (blue dots)
2) right-click and select "Add Trendline" from the contextual menu