Week 5 Flashcards

(41 cards)

1
Q

What is a model that is even simpler than the univariate regression model?

A

It is called the intercept-only model, which is a regression model without any predictor.

yi = β0 + ϵi

2
Q

Can you guess what β0 is equal to? What is the best prediction for Yi if you don't know anything else?

A

Answer: β0 = µy
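This can be checked numerically: in the intercept-only model, the least-squares estimate of β0 is exactly the sample mean of y. A small Python sketch (the data are made up for illustration):

```python
# Intercept-only model: yi = b0 + ei.
# The least-squares estimate of b0 minimizes sum((yi - b0)^2),
# which is achieved at the sample mean of y.
y = [4.0, 7.0, 5.0, 8.0, 6.0]

mean_y = sum(y) / len(y)

def sse(b0):
    # residual sum of squares for a candidate intercept
    return sum((yi - b0) ** 2 for yi in y)

# Brute-force check over a grid of candidate intercepts.
candidates = [b / 100 for b in range(0, 1001)]  # 0.00 .. 10.00
best = min(candidates, key=sse)

print(mean_y)  # 6.0
print(best)    # 6.0 -- the least-squares intercept equals the mean
```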

3
Q

For the intercept-only model, what is the only hypothesis we can test?

A

H0: β0 = 0
H1: β0 ≠ 0

4
Q

In a one-sample t-test, what is the following equivalent to?
H0: β0 = 0
H1: β0 ≠ 0

A

H0: µy = 0
H1: µy ≠ 0

5
Q

Univariate Model:

Equation?

What does B1 mean?

What tests?

A

yi = β0 + β1x1 + ϵi

§ β1: for a one-unit increase in x1, there is a β1-unit increase in Y.

§ The t-test for the regression coefficient, the test for the correlation coefficient, and the F-test for the overall model fit (or R-squared) are all equivalent.
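The claimed equivalence of the slope t-test and the overall F-test in the univariate model can be verified by hand: t² equals F exactly. A Python sketch with made-up data (the course uses R's lm, but the arithmetic is the same):

```python
import math

# Univariate regression y = b0 + b1*x: the t-test for b1 and the
# F-test for the overall model are equivalent (t^2 == F).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

b1 = sxy / sxx
b0 = ybar - b1 * xbar

ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_tot = sum((yi - ybar) ** 2 for yi in y)
ss_reg = ss_tot - ss_res

mse = ss_res / (n - 2)          # residual df = n - 2
t = b1 / math.sqrt(mse / sxx)   # t statistic for H0: b1 = 0
f = ss_reg / mse                # F statistic, df = (1, n - 2)

print(abs(t * t - f) < 1e-9)    # True
```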

6
Q

Bivariate Model:

Equation?

What does B1 mean?

What tests?

A

yi = β0 + β1x1 + β2x2 + ϵi

β1: holding x2 constant, for a one-unit increase in x1, there is a β1-unit increase in Y.

The t-test for a partial regression coefficient is different from the F-test for the overall model (or R-squared).

7
Q

the F-test for the univariate and bivariate regression tests whether…

A

the variance explained in the criterion variable can be significantly accounted for by all the predictors

H0: ρ²yŷ = 0
H1: ρ²yŷ > 0

8
Q

What does ρ²yŷ equal at the population level?

A

SSregression / SStotal

Another way of looking at the F-test is that it is a ratio comparing the current model and the intercept-only model. Since SStotal is the residual sum of squares of the intercept-only model,

ρ²yŷ = SSregression(current model) / SSresidual(intercept-only model)
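A numeric illustration of this model-comparison view, with made-up data: the residual SS of the intercept-only model is exactly SStotal, so R² measures how much the current model improves on the intercept-only fit.

```python
# R^2 as a comparison of the current model against the intercept-only model.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(x)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (c - ybar) for a, c in zip(x, y)) / sum(
    (a - xbar) ** 2 for a in x
)
b0 = ybar - b1 * xbar

# Residual SS of the current (one-predictor) model, and of the
# intercept-only model, whose fitted value is ybar everywhere.
ss_res_current = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_res_null = sum((yi - ybar) ** 2 for yi in y)
ss_total = ss_res_null  # SS_total is the intercept-only residual SS

r2 = 1 - ss_res_current / ss_res_null
print(round(r2, 6))                                      # 0.81
print(round((ss_total - ss_res_current) / ss_total, 6))  # 0.81
```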

9
Q

In the intercept-only model, can you do an F test?

A

No. With no predictors there is no SSregression, so there is no overall model fit to test.

10
Q

Why is unadjusted R^2 not good?

A

Because the sample R-squared r²yŷ is a biased estimator of the population R-squared ρ²yŷ.

§ Over repeated studies, the sample R-squared r²yŷ tends to be higher than ρ²yŷ.
§ The sample R-squared r²yŷ tends to increase as the number of predictors (denoted by p) increases.
§ As p increases, the model tends to overfit.
§ An overfitted model is very unstable; the estimates vary widely across repeated samples (the fitted line is too close to the points).

11
Q

Bias-Variance Trade Off

Define bias and variance

Explain how they trade off.

A

For any statistical modelling, there is a bias-variance tradeoff.

Bias: how well the model fits the current data.
§ Less bias means less residual.
§ Observed and predicted values are similar.
§ More variance in the criterion variable can be explained by the predictors.
§ In regression, usually, as you add more predictors, you get less bias.

Variance: how variable your estimates are across repeated samples.
§ Large variance implies large standard errors and more prediction error.
§ In regression, usually, as you add more predictors, you get larger standard errors and more prediction error.
§ Recall multicollinearity.

12
Q

Underfitted Model - bias and variance?

A

High bias; low variance

13
Q

Overfitted Model: - bias and variance?

A

Low bias; high variance.

14
Q

Both underfitted and overfitted models have _________ prediction error

A

large

15
Q

The unadjusted R2 tends to favor ______________ models even
though they are not good.

A

overfitted

16
Q

The goal of the adjusted R2 is…

A

to provide a more balanced evaluation of the fit relative to the number of predictors.

17
Q

Unadjusted R^2 formula

A

r²yŷ = SSregression / SStotal = 1 - (SSresidual / SStotal)

18
Q

adjusted R^2 formula

A

adjusted r²yŷ = 1 - (SSresidual / dfresidual) / (SStotal / dftotal)
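A quick Python check of both formulas on made-up data (one predictor, so dfresidual = n - 2 and dftotal = n - 1); the adjusted value is always pulled downward:

```python
# Unadjusted vs. adjusted R^2:
#   r2     = 1 - SS_res / SS_tot
#   adj_r2 = 1 - (SS_res / df_res) / (SS_tot / df_tot)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 5.0, 4.0, 6.0]
n, p = len(x), 1  # one predictor

xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((a - xbar) * (c - ybar) for a, c in zip(x, y)) / sum(
    (a - xbar) ** 2 for a in x
)
b0 = ybar - b1 * xbar

ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
ss_tot = sum((yi - ybar) ** 2 for yi in y)

r2 = 1 - ss_res / ss_tot
adj_r2 = 1 - (ss_res / (n - p - 1)) / (ss_tot / (n - 1))

print(round(r2, 6))      # 0.81
print(round(adj_r2, 6))  # 0.746667
print(adj_r2 < r2)       # True -- adjusted downward
```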

19
Q

As the number of predictors, relative to sample size, increases, the R-squared is adjusted how?

A

downward more.

In short, adjusted R squared adjusts the unadjusted R-squared downward to provide a better evaluation of fit.

20
Q

We know that in the population model, the error term is…

what is the notation

A

a random variable with a normal distribution

ϵi ~ N(0, σ²)

21
Q

deterministic view

A

The deterministic view assumes that the variability of the criterion variable can be fully accounted for by a list of predictors at the population level; therefore, there is no error term in the population model.

22
Q

stochastic view

A

The stochastic view assumes that the variability of the criterion variable CANNOT be fully accounted for by a list of predictors at the population level; therefore, there should be an error term in the population model.

23
Q

Modern statistics takes which view?

A

stochastic view.

24
Q

There are two fundamentally different interpretations of the regression coefficients. What are they?

A
  1. Descriptively as an empirical association
  2. Causally as a structural relation
25
Q

There are two fundamentally different interpretations of the regression coefficients. Describe the first: descriptively, as an empirical association.

A

§ Treat predictors as correlates of the criterion variable.
§ Population regression coefficients simply describe the relationship between the predictors and the criterion variable at the population level.
§ The error term represents all the correlates that we didn't measure but that correlate with the criterion variable.
§ Relevant when you only focus on prediction.
26
Q

There are two fundamentally different interpretations of the regression coefficients. Describe the second: causally, as a structural relation.

A

§ Treat predictors as causes of the criterion variable.
§ Population regression coefficients try to model the underlying causal relationship between the predictors and the criterion variable at the population level.
§ The error term represents non-systematic causes of the criterion variable.
§ Relevant when you want to study the underlying causal relationship.
27
Q

In the empirical association view, omitting a relevant predictor for the criterion variable _________ the inference.

A

does not affect

§ But the prediction may not be good if you omit an important predictor.
28
Q

What happens when you omit a predictor in the structural relation view?

A

If we hold the structural relation view, then we can talk about a possible bias produced by omitting a predictor in the population.
29
Q

Structural Relation View – Example: Suppose in the population, the true model is yi = β0 + β1x1 + β2x2 + ϵi. Under the structural relation view, what does the true model mean?

A

§ x1 and x2 are the only causes of Y. Usually (not always), cor(x1, x2) ≠ 0.
§ ϵ is the variance in Y that can't be accounted for by any other variables.
§ Thus, cor(x1, ϵ) = 0 and cor(x2, ϵ) = 0 at the population level.
30
Q

Structural Relation View – Example: Suppose in the population, the true model is yi = β0 + β1x1 + β2x2 + ϵi. However, suppose we fit a univariate regression model. What happens?

A

§ We omit x2, fitting a misspecified model yi = β0′ + β1′x1 + ϵ′i, where, implicitly, the effect of x2 on Y is absorbed by the error: ϵ′ = β2x2 + ϵ.
§ If x1 and x2 are correlated, then there is a correlation between x1 and ϵ′, cor(x1, ϵ′) ≠ 0, at the population level.
§ However, if you fit the one-predictor model with the least squares method at the sample level, the correlation between the predictor and the residual will be forced to be 0.
§ (Need to look at the slide; it is a bit more complex than this.)
31
Q

In conclusion, if we hold the structural relation view, then we can talk about ...

A

... the bias produced by omitting a predictor that is a cause of Y (i.e., fitting a misspecified model) and is correlated with another predictor in the model. In other words, in the structural relation view, we need to include all relevant causes of Y as predictors to obtain consistent estimation.

§ Possible for experimental studies.
§ Not possible for observational studies, but we try our best.
§ "All models are wrong but some are useful."
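The omitted-predictor bias above can be made concrete with a deterministic Python sketch (made-up numbers, error set to 0 for clarity): regressing y on x1 alone yields a slope of β1 + β2·cov(x1, x2)/var(x1), not β1.

```python
# Omitted-variable bias (structural view): the true model is
# y = b0 + b1*x1 + b2*x2 (error term set to 0 for a clean demo).
# Fitting y on x1 alone absorbs b2*x2 into the error; because
# x1 and x2 are correlated, the estimated slope is biased away from b1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 1.0, 2.0, 3.0, 3.0]   # correlated with x1
b0_true, b1_true, b2_true = 1.0, 2.0, 3.0
y = [b0_true + b1_true * a + b2_true * b for a, b in zip(x1, x2)]

n = len(x1)
m1, my = sum(x1) / n, sum(y) / n
sxx = sum((a - m1) ** 2 for a in x1)
sxy = sum((a - m1) * (c - my) for a, c in zip(x1, y))
b1_hat = sxy / sxx  # slope from the misspecified one-predictor model

# The bias equals b2 * cov(x1, x2) / var(x1):
m2 = sum(x2) / n
s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))

print(b1_hat)                                       # 3.8, not the true 2.0
print(round(b1_true + b2_true * s12 / sxx, 10))     # 3.8 -- bias formula
```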
32
Q

Why do we create dummy variables?

A

To incorporate categorical variables into the regression model, we need to create dummy variables.
33
Q

What are dummy variables?

A

Dummy (code) variables are numeric variables that use 0 or 1 to represent categorical variables.
34
Q

How many dummy variables are needed?

A

For a categorical variable that has g groups, you need g - 1 dummy variables to code it.
35
Q

If a categorical variable has 2 groups (e.g., male vs. female), then how many dummy variables are needed?

A

You need 1 dummy variable to code it.
36
Q

Reference group

A

The group that has 0 on all dummy variables is called the reference group.
37
Q

Given yˆ = 85.6 + 4.2D, what is the interpretation of the intercept βˆ0 = 85.6?

A

The mean grade of males (the reference group).

§ Recall: βˆ0 is the value of the criterion variable when the predictor variable is 0. Therefore, βˆ0 is the value of the criterion variable for the reference group.
38
Q

Given yˆ = 85.6 + 4.2D, what is the interpretation of the slope βˆ1 = 4.2?

A

The difference between the mean grades of males and females.

§ Recall: βˆ1 is the amount of change in the criterion variable for a 1-unit change in the predictor variable.
§ Now, a 1-unit change in the dummy variable is the change from the male group to the female group.
§ Therefore, βˆ1 is the amount of change in the criterion when you change from the reference group to the non-reference group.
39
Q

What does lm do with respect to dummy variables?

A

The lm function will automatically coerce character vectors to factors and do the dummy coding in the background.

§ It will assign the reference group based on alphabetical order.
§ e.g., "female" is before "male" alphabetically, so "female" will automatically be assigned as the reference group.
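The same dummy-coding rule can be spelled out in Python (hypothetical group labels; R's lm does this automatically): sort the levels, take the alphabetically first level as the reference group, and build g - 1 indicator columns.

```python
# Dummy coding sketch: a categorical variable with g groups needs
# g - 1 dummy variables; the group coded 0 on all of them is the
# reference group (here, the alphabetically first level, mirroring
# lm's default behavior for factors).
groups = ["male", "female", "female", "male", "other"]

levels = sorted(set(groups))       # ['female', 'male', 'other']
reference = levels[0]              # 'female'
dummies = [[1 if g == lvl else 0 for lvl in levels[1:]] for g in groups]

print(len(levels) - 1)  # 2 dummy variables for g = 3 groups
print(dummies[0])       # [1, 0] -- 'male'
print(dummies[4])       # [0, 1] -- 'other'
print(dummies[1])       # [0, 0] -- 'female', the reference group
```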
40
Q

A simple regression with a dummy variable is equivalent to ...

A

... an independent-sample t-test, which is equivalent to running an ANOVA with two groups.
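This equivalence is easy to verify numerically: with a single 0/1 dummy, the least-squares intercept is the reference-group mean and the slope is the difference of group means, the same quantities the two-sample t-test compares. A Python sketch with made-up grades:

```python
# Regression on a single 0/1 dummy D: yhat = b0 + b1*D gives
# b0 = mean of the reference group (D = 0) and
# b1 = difference of the two group means.
grades_ref = [80.0, 84.0, 88.0]    # reference group (D = 0)
grades_other = [86.0, 90.0, 94.0]  # non-reference group (D = 1)

d = [0] * len(grades_ref) + [1] * len(grades_other)
y = grades_ref + grades_other
n = len(y)

dbar, ybar = sum(d) / n, sum(y) / n
sdd = sum((di - dbar) ** 2 for di in d)
sdy = sum((di - dbar) * (yi - ybar) for di, yi in zip(d, y))
b1 = sdy / sdd
b0 = ybar - b1 * dbar

mean_ref = sum(grades_ref) / len(grades_ref)
mean_other = sum(grades_other) / len(grades_other)
print(b0 == mean_ref)               # True -- intercept = reference mean
print(b1 == mean_other - mean_ref)  # True -- slope = mean difference
```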
41
Q

For a regression with dummy variables coding three groups, what is the F-test H0?

A

H0: µ1 = µ2 = µ3, equivalently H0: β1 = β2 = 0.
H1: at least one pair of means is not equal.

§ Same as the F-test in ANOVA.