Midterm 3 Flashcards

(83 cards)

1
Q

In OLS regression, partitioning the total variation (deviation) follows the logic of what test?

A

-test of significance called analysis of variance (ANOVA)

2
Q

Total variation in bivariate regression represents what?

A

-the total sum of squares SST

3
Q

What indicates the explained variation in a bivariate regression?

A
  • SSR sum of squares regression
  • the amount of variation in Y accounted for by X
  • amount of total variation that is explained by regression equation
4
Q

What is the SSR also called?

A

-model sum of squares SSM

5
Q

What represents the amount of variance left over in Y that the bivariate regression didn’t account for?

A
  • sum of squared errors (SSE)

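- Sometimes called the residual sum of squares (often abbreviated RSS; some texts write SSR, which clashes with the regression sum of squares above)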

6
Q

What is the most important use of SST, SSE and SSR?

A
  • calculation of the coefficient of determination

- AKA square of Pearson’s r (r^2)
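
A minimal sketch of how the three sums of squares combine into r^2. The toy data and the function name `sums_of_squares` are made up for illustration and are not from the course material:

```python
# Hypothetical toy data, chosen only so the arithmetic is easy to follow.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def sums_of_squares(xs, ys):
    """Fit a bivariate OLS line and return (SST, SSR, SSE)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope b and intercept a of the least-squares line.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    preds = [a + b * x for x in xs]
    sst = sum((y - y_bar) ** 2 for y in ys)               # total variation
    ssr = sum((p - y_bar) ** 2 for p in preds)            # explained variation
    sse = sum((y - p) ** 2 for y, p in zip(ys, preds))    # unexplained variation
    return sst, ssr, sse

sst, ssr, sse = sums_of_squares(xs, ys)
r_squared = ssr / sst  # equivalently 1 - sse / sst
print(sst, ssr, sse, r_squared)
```

Because SST = SSR + SSE, the ratio SSR / SST and 1 - SSE / SST give the same r^2.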

7
Q

What does r-squared tell us?

A

-the proportion of the total variation in Y that is attributable to X

8
Q

What type of relationship do SSR and SSE hold with each other?

A
  • a reciprocal relationship

- as one sum increases, the other decreases (SST = SSR + SSE stays fixed)

9
Q

If there is a stronger linear relationship between X and Y, what will happen to the explained and unexplained variation?

A
  • greater explained variation

- lesser unexplained variation

10
Q

What would an r-squared value of 1 mean?

A
  • X explains 100% of the variation in Y

- we could predict Y from X without error

11
Q

When X and Y are not linearly related, what happens to the explained variation and r-squared?

A
  • both are zero

- X explains none of the variation in Y

12
Q

What do you need to calculate to really say whether a linear relationship is strong?

A

-r-squared

13
Q

What does it mean if the correlation coefficient is +1?

A

-there is a perfect positive relationship between the two variables

14
Q

What does it mean if the correlation coefficient is -1?

A

-there is a perfect negative relationship between X and Y

15
Q

What does it mean if the correlation coefficient is 0?

A

-no linear relationship between these two variables

16
Q

How would you express a correlation coefficient of 0.65?

A

-A one standard deviation increase in X is associated with a 0.65 standard deviation increase in Y, on average

17
Q

Does the magnitude of a linear slope have anything to do with scatter?

A
  • NO

- it's possible to have a very steep line with a lot of scatter or a very shallow line with no scatter

18
Q

What is the slope coefficient?

A

-b

19
Q

What do r and b have in common?

A
  • the same numerator

- thus, testing the hypothesis that r=0 is the same as testing if b=0
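
For reference, the usual textbook forms of the two formulas, written out so the shared numerator is visible (the notation is an assumption; the course's symbols may differ):

```latex
b = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2},
\qquad
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}},
\qquad
b = r \, \frac{s_y}{s_x}
```

Since s_y / s_x is always positive, b is zero exactly when r is zero.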

20
Q

Why must we test to see if the relationship between the variables exists in the population from which the sample was drawn?

A
  • since the data for a bivariate regression is based on a random sample
  • called testing for significance
21
Q

How do we test for significance?

A

-test Pearson's r, since testing whether r = 0 is equivalent to testing whether the slope b = 0

22
Q

What assumptions are made to test for significance in a bivariate relationship?

A
  1. Assume that both variables are normal in distribution (bivariate normal distributions)
  2. Assume the relationship between variables is somewhat linear
  3. Homoscedastic relationship
23
Q

What is a homoscedastic relationship?

A

-The Y scores are evenly spread above and below the regression line for the entire length of the line

24
Q

How do you determine if it is appropriate to proceed with the assumptions around the test of significance?

A

-look for homoscedasticity

25
What are bivariate normal distributions?
-both variables are normally distributed
26
In hypothesis testing, what does it mean if you fail to reject the null?
- the observed Pearson's r could have occurred by chance alone
- we cannot conclude that the two variables are related in the population
27
What is hypothesis testing based on?
-sampling distribution of means
28
What is the sampling distribution of means?
- describes the variation in the values of the mean over a series of samples
- based on the central limit theorem
29
How large do samples have to be for the sampling distribution of the mean to be approximately normal?
-as a rule of thumb, greater than or equal to 30
30
What happens with a larger sample size in hypothesis testing?
-better approximation to the normal distribution and a more effective estimation of the population mean
31
What can be understood about b in hypothesis testing?
- it can be interpreted as a mean
- thus b in the sample regression equation estimates the population regression slope (beta)
32
What does b produce?
- an estimate of beta
- the estimate will not always equal beta, though
33
What is critical about b for hypothesis testing?
-that b is normally distributed is critical for hypothesis testing of OLS regression
34
Why can we use z to determine b and beta?
-since b is normally distributed across repeated samples (its sampling distribution)
35
Why can we drop the beta in the formula for t?
-since beta is presumed to equal 0 under the null hypothesis
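
Written out, in the usual textbook notation (an assumption; the course's formula sheet may use different symbols), the simplification looks like this:

```latex
t = \frac{b - \beta}{s_b}
\quad\text{and, with } H_0:\ \beta = 0, \quad
t = \frac{b}{s_b}
```

In bivariate regression this t statistic is evaluated with n - 2 degrees of freedom.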
36
What do the residuals indicate?
- how far the predicted value based on b is from each actual case
- suggest other factors besides X are influencing Y
37
What does a larger standard deviation of X do to the standard deviation (standard error) of b?
- smaller SD of b
- we estimate the slope better when the predictor takes a wide spread of values
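
One common textbook form of the standard error of the slope makes this visible. The notation is an assumption: s_e is the standard error of the estimate and s_x is the sample standard deviation of X.

```latex
s_b = \frac{s_e}{\sqrt{\sum (x_i - \bar{x})^2}}
    = \frac{s_e}{s_x \sqrt{n - 1}},
\qquad
s_e = \sqrt{\frac{SSE}{n - 2}}
```

With s_e and n held fixed, a larger s_x enlarges the denominator and so shrinks s_b.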
38
What are the three most commonly used levels of significance in quantitative research?
- p<0.05 (*)
- p<0.01 (**)
- p<0.001 (***)
39
When SPSS produces the coefficients table for a bivariate relationship, which values correspond to bX, a and Sb?
- bX: Unstandardized Coefficients, column B, in the row for the X variable
- a: Unstandardized Coefficients, column B, in the (Constant) row
- Sb: Std. Error column, in the row for the X variable
40
What are antecedent variables?
- Z affects X and Y independently
- there is no direct effect between X and Y
41
What are redundant variables?
- Z and X affect each other but only Z affects Y
- Z and X are simultaneous
42
What is the least squares multiple equation for two independent variables?
Y=a + b1x1 + b2x2
43
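What are b1 and b2?
- b1 is the partial slope of the linear relationship between the first independent variable and Y
- b2 is the partial slope of the linear relationship between the second independent variable and Y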
44
What is the purpose of multiple regression?
- to examine the independent relationship between each predictor (IV) and an outcome (DV, Y) in a set of predictors
- holds all other variables constant
- statistical control
45
What is statistical control?
-we cannot eliminate the effect of other variables on our Y, so we use statistics to control for them
46
Is multiple regression as good as an experiment?
- No
- it assumes that the relationship between the variables can be described by a linear equation
- it makes the errors as small as possible
47
What is wrong with multiple regression?
-we cannot measure every variable that affects our dependent variable
48
What is the purpose of a in a regression equation?
-anchor for the regression
49
How realistic is a multiple regression model?
-all models are poor depictions of reality
50
What is e in the full multivariate regression equation?
- it indicates all the other influences on Y besides the X's in the model
- changes for every case
51
What can b be thought of as in the multivariate regression model/equation?
- each b is a weight
- expresses how much Y changes with a 1-unit increase in that X
- each b indicates the independent effect of each X
52
What is covariance?
-measure of how two variables vary together
53
What value shows r in an SPSS correlation matrix?
-find the two variables you are interested in and look at where they intersect
54
What does it mean to look at the independent effect?
-remove the other variables' effects on it
55
How do we look at the independent effect of two independent variables with correlation?
-run both of them in the regression model
56
How do we find the full regression equation in SPSS?
- a is the Unstandardized B value in the (Constant) row
- b1 is the Unstandardized B value in the X1 row
- b2 is the Unstandardized B value in the X2 row
57
Describe the regression equation Y=1.897 + 0.339Xage + 0.521Xmemory + e
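- a one unit increase in age is related to a 0.339 unit increase in Y (reading ability), controlling for memory
- a one unit increase in short term memory is related to a 0.521 unit increase in Y, controlling for age
- if age and short term memory were both zero, we would predict a reading ability of 1.897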
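
A small sketch of using this equation for prediction. The function name `predict_reading` and the input values are hypothetical; the error term e is omitted because it is unknown for a new case:

```python
def predict_reading(age, memory):
    """Predicted Y from the fitted equation on this card (e is left out)."""
    a, b_age, b_memory = 1.897, 0.339, 0.521
    return a + b_age * age + b_memory * memory

# Hypothetical case: age = 10, short term memory score = 4.
base = predict_reading(10, 4)
# One more unit of age, memory held constant: prediction rises by b_age.
older = predict_reading(11, 4)
print(base, older - base)
```

Holding memory fixed while changing age by one unit is what "controlling for memory" means in the interpretation above.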
58
What is the multiple coefficient of determination?
- R^2
- the bivariate r^2 values cannot simply be added up in multiple regression, because the predictors overlap in the variance they explain
59
What is R^2?
- the squared correlation between the observed values and the values predicted by the multiple regression
- the proportion of variance in the dependent variable accounted for by the predictors in the regression
60
What would it mean if we had an R square value of 0.702?
-X1 and X2 together account for 70.2% of the variance in Y
61
Why can we not just compare partial slopes?
-different units
62
What do we do to convert partial slopes into a comparable form?
-look at standardized coefficients
63
What are standardized partial slopes called?
-beta weights
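
A minimal sketch of the conversion, assuming the standard rescaling beta = b * (s_x / s_y). The toy data and function names are illustrative only. The check uses a bivariate fit, where the beta weight should equal Pearson's r:

```python
import math
import statistics

# Hypothetical toy data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

b = sxy / sxx                   # unstandardized slope, in units of Y per unit of X
r = sxy / math.sqrt(sxx * syy)  # Pearson's r

def beta_weight(b, xs, ys):
    """Rescale a slope into standard-deviation units: b * s_x / s_y."""
    return b * statistics.stdev(xs) / statistics.stdev(ys)

beta = beta_weight(b, xs, ys)
print(b, beta, r)
```

The same rescaling applies to each partial slope in a multiple regression, using that predictor's own standard deviation, but there the result generally no longer equals r.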
64
How to interpret beta-weight values?
-the larger the absolute value of the beta weight, the stronger the relationship, regardless of + or -
65
In bivariate regression what type of strength do we observe with standardized coefficients?
-absolute
66
In multiple regression can we use standardized slopes to determine absolute strength?
- No
- Relative strength only
67
Is beta-weight equal to r?
-no, not in multiple regression (in bivariate regression the standardized slope does equal r)
68
What does multiple regression do for spurious relationships?
-it is used to rule out spurious relationships among variables
69
What are the three types of spurious relationships?
- antecedent
- redundant
- suppression
70
What is suppression?
- the opposite of redundancy
- when the relationship between two variables gets stronger when you control for a third variable
71
How can we use stepwise regression to show spurious relationships?
- the unstandardized coefficients (b) will change value from model to model (go down)
- or the R square will change value from model to model
72
How do you test for significance in multiple regression?
-use the t equation, t = b/Sb, for each coefficient
73
What forms does multicollinearity come in?
-extreme and near extreme
74
What is extreme multicollinearity?
- at least two of the X variables in a regression equation are perfectly related by a linear function
- correlation between X1 and X2 is 1 or -1
75
What is near-extreme multicollinearity?
- there are strong, although not perfect, linear relationships among the X's
- correlation between X1 and X2 will be close to 1 or -1
76
How do you find near-extreme multicollinearity?
- regress each independent variable on all the other independent variables and look for a high R-square
- if any of these are above 0.6 this is concerning
77
Why is multicollinearity a problem?
- it results in larger standard errors for the coefficients of the collinear variables
- this makes it harder to find statistically significant coefficients (t)
78
What differs between the standard error for bivariate and multivariate regressions?
-correction factor for the covariance between the two predictors
79
What does greater covariance between two predictors result in?
-less reliable estimates because it inflates Sb
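
For the two-predictor case, one common textbook form of the standard error shows the correction factor explicitly. The notation is an assumption: r_12 is the correlation between X1 and X2, and s_e is the standard error of the estimate from the two-predictor model.

```latex
s_{b_1} = \frac{s_e}{\sqrt{\sum (x_{1i} - \bar{x}_1)^2 \, \bigl(1 - r_{12}^2\bigr)}}
```

As r_12 approaches 1 or -1, the factor (1 - r_12^2) approaches 0, so the denominator shrinks and s_b1 inflates.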
80
What is VIF?
-variance inflation factor: captures the degree to which an independent variable is collinear with the other predictors in the model
81
How would you interpret a VIF of 9?
-you're multiplying the standard error of a coefficient by a factor of 3 (the square root of 9)
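
A short sketch of the arithmetic behind this card, assuming the standard definition VIF = 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the other predictors. The function names are illustrative:

```python
import math

def vif_from_r_squared(r_squared_j):
    """VIF for predictor j, given the R^2 from regressing it on the other predictors."""
    return 1.0 / (1.0 - r_squared_j)

def se_inflation(vif):
    """Factor by which the coefficient's standard error is multiplied."""
    return math.sqrt(vif)

# An auxiliary R^2 of 8/9 (about 0.889) corresponds to a VIF of 9.
vif = vif_from_r_squared(8 / 9)
print(vif, se_inflation(vif))
```

The variance of the coefficient is inflated by the VIF itself, which is why the standard error is inflated by its square root.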
82
What variable will have a large VIF?
-independent variable that is highly correlated with other predictors in the model
83
What is the cut off for VIF?
6