Multiple Regression Flashcards

1
Q

What does our model y hat equation look like when we add an additional predictor?

A

ŷ = b0 + b1(X1) + b2(X2)

We have our intercept (b0),
then our first regression coefficient (b1) going along with the first predictor,
then our second regression coefficient (b2) going along with the second predictor.

A residual term (+e) is added when modeling the observed y's.
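
In R, a model of this form could be fit with lm(); a minimal sketch, assuming a hypothetical data frame df with columns visits (Y), phyhealth (X1), and menhealth (X2):

  # fit a two-predictor regression and pull out b0, b1, b2
  fit <- lm(visits ~ phyhealth + menhealth, data = df)
  coef(fit)  # intercept (b0) and the two regression coefficients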

2
Q

Interpret the univariate information for each of the variables from a descriptives table: n, mean, sd, median, trimmed, mad, min, max, range, skew, kurtosis, and se.

A

n = sample sizes (per variable)

mean = should be relatively similar across variables; if not, a variable is measured on some other range (e.g., phyhealth mean is 3.6 while stress mean is 172).

median = used to check that data are symmetrically distributed - match it up with the mean.

trimmed mean = drops the extreme scores in the outer quartiles to take out skewness and outliers.

MEAN, MEDIAN, TRIMMED MEAN = should all be similar.

mad (median absolute deviation) = the typical absolute deviation from the median. It's another measure of dispersion; SD and MAD should be similar to one another.

min and max = just to check for coding errors.

skew = should be right around ZERO (between -3 and 3).

kurtosis = should be around ZERO (between -3 and 3).

se = standard error of the mean's sampling distribution (used as the standard error for a t-test).
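
These are the columns produced by the describe() function in the psych package; a minimal sketch, assuming the same hypothetical data frame df:

  # univariate descriptives: n, mean, sd, median, trimmed, mad,
  # min, max, range, skew, kurtosis, se
  library(psych)
  describe(df)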

3
Q

In the assumption of Multicollinearity, predictors cannot be correlated… so what does it mean if an rcorr function produces a correlation table with an .80 correlation between 2 predictors (mental health and physical health)?

A

Values above .80 are a concern for multicollinearity.

It makes it difficult to pinpoint the true predictor because of the shared variability between the two.
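
For reference, rcorr() comes from the Hmisc package and expects a numeric matrix; a minimal sketch with the hypothetical df:

  # correlation matrix plus sample sizes and p-values
  library(Hmisc)
  rcorr(as.matrix(df))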

4
Q

What is the first thing we do when looking at a regression model?

A

We look at the correlations between our predictors and the outcome variable - if a predictor is NOT correlated with the outcome, it's not a good predictor to add to the regression model.

5
Q

What are the steps for a regression write-up?

A

We first report our correlations in a table, then indicate whether the correlations are significant, their direction, and whether they match the hypothesized direction…

6
Q

How does our interpretation for a multiple regression change from SLR?

A

b0 = Our intercept is the point at which the regression plane intersects the Y-axis; i.e., the expected value of Y when both X1 and X2 = 0.
ex) The expected # of doctor visits when an individual reports NO mental AND NO physical health problems.

b1 = The change in the EXPECTED value of Y associated with a 1-unit increase in X1, OVER AND ABOVE THE EFFECT OF X2 (holding X2 constant).
ex) Holding mental health problems constant…

b2 = The change in the EXPECTED value of Y associated with a 1-unit increase in X2, OVER AND ABOVE THE EFFECT OF X1 (holding X1 constant).
ex) Holding physical health problems constant…

7
Q

What does the introduction to statistical control mean for our interpretation?

What is the technical term of this statistical control?

A

“Over and above the effect of…”

We are holding one predictor constant while watching how Y changes with the other.

It is officially called a ‘partial regression coefficient.’

8
Q

Conceptually, with the b1 coefficient equation, what are we trying to end up with?

And what are the steps towards getting rid of the variable?

What are we left with?

How is this process similar to Factorial ANOVA?

A

We just want the variability shared between X1 and Y, over and above the effect of X2 - so we incorporate statistical control: we take the total variability of X2 and multiply it by the portion shared between X1 and Y, then subtract the product of what’s shared between X1 and X2 and what’s shared between X2 and Y.

What’s left is what’s shared between X1 and Y without the contribution of X2.

This is similar to a main effect in Factorial ANOVA: the effect of X1 on Y, ignoring the levels of X2.
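
The verbal recipe above matches the standard two-predictor formula for b1 written with sums of squares and cross-products; a minimal sketch, with hypothetical vectors x1, x2, and y:

  # sp(a, b) = sum of cross-products of centered a and b; sp(a, a) = SS of a
  sp <- function(a, b) sum((a - mean(a)) * (b - mean(b)))
  b1 <- (sp(x2, x2) * sp(x1, y) - sp(x1, x2) * sp(x2, y)) /
        (sp(x1, x1) * sp(x2, x2) - sp(x1, x2)^2)
  b1  # matches coef(lm(y ~ x1 + x2))["x1"]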

9
Q

Conceptually, with the b2 coefficient equation, what are we trying to end up with?

A

It’s exactly the same as the b1 coefficient equation, but with X1 and X2 swapped.

We just want the effect of X2 on Y, over and above the effect of X1.

10
Q

What would a negative b2 coefficient indicate if X1 is physical health problems and X2 is mental health problems?

A

It reflects the shared variability between X1 and X2 - with so much overlap, the sign of a coefficient can flip once the other predictor is held constant.

11
Q

What is the b0 equation for MR?

A

b0 = ȳ - b1(X̄1) - b2(X̄2)

*THE SIGN MATTERS. IF THE b1 OR b2 COEFFICIENT IS NEGATIVE, THE MINUS SIGNS IN THIS EQUATION FLIP TO PLUS.

12
Q

What is the equation for Full Model for MR?

A

ŷ = b0 + b1(X1) + b2(X2)

*THE SIGN MATTERS. IF THE b1 OR b2 COEFFICIENT IS NEGATIVE, THE CORRESPONDING PLUS IN THIS EQUATION CHANGES TO MINUS.

13
Q

We ran a regression in R… last week in SLR, physical health problems was a GREAT predictor. Now that we have added mental health problems to the model, both predictors are non-significant. What does this mean?

What could you do if this happens?

A

It isn’t good to keep both predictors in the model - the addition of the mental health predictor did not improve our model fit.

We could either choose one of the variables OR, because they’re redundant, combine them into one predictor such as “general health problems” (see the sketch below).
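
One way to build the composite; a minimal sketch, with hypothetical column names (standardizing first so neither variable dominates):

  # average the z-scored predictors into a single "general health problems" score
  df$genhealth <- rowMeans(cbind(scale(df$phyhealth), scale(df$menhealth)))
  fit2 <- lm(visits ~ genhealth, data = df)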

14
Q

What is the point of adding multiple predictors in a MR?

A

We are trying to see if adding predictors improves our model fit with regard to predicting our Y (here, doctor visits).

15
Q

Conceptually, what are we comparing in a t-test?

A

The t statistic is the estimate (b) divided by the standard error of the b - we are comparing the size of the estimate to the variability we’d expect by chance.

16
Q

What is the residual standard error in MR?

What does it tell us?

A

Residual standard error = standard error of the estimate.

The standard error of the estimate is the standard deviation of the points around the regression plane.

Conceptually, it is a measure of MISFIT - the variability left unpredicted by our model.

17
Q

What is the df for multiple regression for 2 predictors and 10 individuals?

A

df = n - k - 1

= 10 - 2 - 1 = 7

18
Q

What does the Multiple R² tell us?

How is it computed?

Explain conceptually, each component of what’s used to compute the Multiple R².

A

It’s an effect size that tells us that, taken together, X1 and X2 account for some % of the variability in Y.

ex) Taken together, physical health problems and mental health problems account for 43% of the variability in doctor visits.

Multiple R-squared is computed as SSreg/SStot.

*SSregression = ∑(ŷ - ȳ)²
>This tells me the IMPROVEMENT we make in predicting Y by adding the predictors.

*SStotal = ∑(y - ȳ)²
>This is the total variability in Y around its mean - what’s left when we predict every case with just ȳ.

So essentially, we are seeing how much of the total variability is accounted for by the improvement the predictors give us over JUST the mean…
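
A minimal sketch of the computation, continuing the hypothetical df and fit from earlier cards:

  # multiple R-squared from the sums of squares
  fit   <- lm(visits ~ phyhealth + menhealth, data = df)
  ssreg <- sum((fitted(fit) - mean(df$visits))^2)
  sstot <- sum((df$visits - mean(df$visits))^2)
  ssreg / sstot  # equals summary(fit)$r.squared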

19
Q

What’s the degrees of freedom for the regression model?

*think about the model fit anova table

A

k, or the number of predictors

20
Q

What type of problem do we face with the Multiple r-squared when we add predictors?

A

k goes up - by adding more predictors, we essentially inflate the multiple r-squared, so the adjusted r-squared uses the df to penalize us for weak predictors, thereby lowering the effect size.

21
Q

When would the adjusted r-square go up?

A

The adjusted r-squared will only go up if the predictors we include are worth their weight in df.

22
Q

How should the multiple and adjusted r-square compare to each other?

A

The multiple and adjusted r-squared should be similar - that signifies our predictors are worth their df. If the predictors genuinely contribute to the model, we expect the adjusted r-squared to stay close to the multiple r-squared.

23
Q

What happens if there is a large gap between the multiple r-squared and adjusted r-squared?

A

We would ethically report the adjusted r-squared.

24
Q

What is the statistical sentence and interpretation for MR?

A

F (k, n-k-1) = F-value, p ≤ .05 (or p > .05 if non-significant).

ex) Taken together, physical and mental health problems are not good predictors of doctor visits.

25
Q

What is the ideal regression model?

A

Orthogonality between the predictors - that there is NO correlation or overlap between predictors.

26
Q

What are the quantities in the MR ANOVA table?
SSy
SSreg
SSres

A

It’s the same as SLR:

SSy(total) = ∑(y - ȳ)²
SSreg = ∑(ŷ - ȳ)²
SSres = ∑(y - ŷ)²
27
Q

Conceptual: If EVERYTHING in the SSreg stays the same from the SLR and MR ANOVA table, but the F-value is a lot smaller, why is the p-value all of a sudden non-significant?

A

The MSresidual got bigger due to the addition of the predictor: we lose a residual df, and the same SSreg is now spread over 2 regression df, so F = MSreg/MSres shrinks and the p-value rises.

28
Q

What is the model interpretation for a non-sig model?

A

The regression model does not significantly fit the data, such that X1 and X2, taken together, do not significantly predict Y, F(2, 7) = 2.64, p > .05.

29
Q

What is the formula and interpretation of R² in the multiple regression?

A

It is the coefficient of MULTIPLE determination.

Interpretation: It is the Proportion of variance in outcome accounted for by the SET of predictor variables.

Formula: R² = SSreg/SStotal

30
Q

What is the R² formula for UNCORRELATED X1 and X2 predictors?

A

R²y.12 = r²yx1 + r²yx2

31
Q

What is the first thing we do before we look at the significance of coefficients?

A

We look at the model information first, evaluate the model fit, THEN look at which predictors (coefficients) are significant or not.

We do this to see if the model is worthwhile.

32
Q

What is the MSresidual also called?

What does it tell us?

A

MSresidual = variance of the estimate.

It tells us the variance of our points around the regression line; if this number is big, there’s a lot of variability around the line, meaning too much is left unexplained by the model.

33
Q

What is the conceptual calculation of the standard error of the b-coefficients?

A

Numerator = MSresidual, or the variance of the estimate.

Denominator = the predictor’s sum of squares multiplied by (1 - the squared correlation between the two predictors); this factor removes the overlap (a large squared correlation means the predictors are highly correlated with one another).

All of it is square rooted.
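
A minimal sketch of that calculation for b1, continuing the hypothetical fit and variable names:

  # SE(b1) = sqrt( MSres / (SS_x1 * (1 - r12^2)) )
  msres <- sum(residuals(fit)^2) / df.residual(fit)  # variance of the estimate
  r12   <- cor(df$phyhealth, df$menhealth)           # overlap between predictors
  ssx1  <- sum((df$phyhealth - mean(df$phyhealth))^2)
  se_b1 <- sqrt(msres / (ssx1 * (1 - r12^2)))
  se_b1  # matches summary(fit)$coefficients["phyhealth", "Std. Error"]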

34
Q

Why is it a problem that the standard error is large due to large squared correlations?

A

Large squared correlations indicate that the 2 predictors are highly correlated with one another.

Since the SE is used to test the significance of the coefficients

(t = (b - 0) / SE)

a large SE makes the t-value small, so we’re less likely to get significance.

35
Q

What is the ß-coefficient?

A

2 things: Symbolically, it is the population coefficient, which is set to zero (under the null hypothesis) in the t-statistic.

It is ALSO the standardized regression coefficient.

36
Q

What’s the difference between unstandardized and standardized regression coefficients?

Why would we standardize regression coefficients?

A

The unstandardized b coefficients are expressed in the units of measure of the variables.

ß-coefficients, or standardized coefficients, allow us to estimate unit-free relationships among the standardized variables - each variable is divided by the estimate of its SD.

The meaning of b is contingent on the unit of measure (SD) of the x-variable.
ex) Is 4 a big number? It depends on what it’s being measured against. The ß-coefficients let us interpret BEYOND the constraints of the X-variable’s units.

> We also standardize because, on the raw scale, we can’t define a cut-off point for what constitutes a large residual.

37
Q

How do we standardize a b-coefficient into ß-coefficient?

What’s the point of standardizing?

A
  1. We convert all of the predictor variables into z-scores.
  2. We convert our outcome variable into z-scores.

When we standardize the variables to z-scores, we can compare the results to what we already know about the properties of z-scores.
ex) 99.9% of z-scores should lie between -3.29 and +3.29.

38
Q

What is the function in r to standardize a coefficient?

A

scale()
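
A minimal sketch of its use, with the hypothetical variable names from earlier cards:

  # z-score everything inside the model formula; the slopes become ß weights
  fit_std <- lm(scale(visits) ~ scale(phyhealth) + scale(menhealth), data = df)
  coef(fit_std)  # intercept ~ 0, slopes are standardized coefficients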

39
Q

What is the interpretation of ß coefficients?

Explain it conceptually.

A
ß1 = the standardized change in the expected value of Y associated with a 1 STANDARD DEVIATION increase in X1, OVER AND ABOVE the effect of X2.

ex) Although not significant, doctor visits are EXPECTED to increase by .72 STANDARD DEVIATIONS for every 1 STANDARD DEVIATION increase in physical health problems, OVER AND ABOVE the effects of mental health problems, t(7) = 1.51, p > .05.
40
Q

What is the formula for converting b-coefficients into ß-coefficients if you have SD information?

How about converting it backwards?

A

ß = b (Sx/Sy) ; b times standard deviation of x divided by the standard deviation of y.

b = ß (Sy/Sx) ; ß times standard deviation of y divided by the standard deviation of x.
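
A quick check of the conversion in R; a minimal sketch with the hypothetical fit and variable names:

  # convert the unstandardized slope for phyhealth into a ß, then back again
  b      <- coef(fit)["phyhealth"]
  beta   <- b * sd(df$phyhealth) / sd(df$visits)    # ß = b(Sx/Sy)
  b_back <- beta * sd(df$visits) / sd(df$phyhealth) # recovers b = ß(Sy/Sx)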

41
Q

What happens to the intercept term?

A

The intercept becomes zero after we standardize our coefficients because the z-score conversion sets the means of all the variables to zero.

42
Q

Interpret the nested model (the estimated full model), including the statistical sentence (statistically significant)…

F = 43.03
df = 3, 461
22% variance

A

TAKEN TOGETHER, reported physical health problems, mental health problems, and stress significantly predict # of doctor visits, F(3, 461) = 43.03, p < .05, accounting for 22% of the variance.

43
Q

Interpret the individual coefficients of the nested models.

A

Intercept: -3.7 doctor visits are expected when all other variables are zero, which is significantly different from zero, t(461) = 3.29, p < .05.

44
Q

What’s the point of reducing a full nested model into a nested submodel?

A

By setting certain coefficients to zero, it allows us to test a combination of predictors as a SET (ex: mental health and stress only, as a SET of predictors).

We then compute an R²∆ test to see if the change in model R² is significant. It tells us whether the SET of predictors contributes to the Full model in a significant way.

45
Q

Why do we compute an F test for R²∆?

What is the formula?

How do we interpret the change?

A

We compute the R²∆ test to see if the set of predictors dropped in our nested submodel contributes to our FULL model in a significant way.

Formula:
F = [(R²Full - R²Reduced) / (dfRegFull - dfRegReduced)] ÷ [(1 - R²Full) / dfResFull]

Interpretation: Taken together, X2 and X3 significantly contribute to our model.
ex) Taken together, mental health and stress significantly contribute to our model.

46
Q

How do we compute R²∆ in r?

A

anova(fitsub, fitfull)
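
A fuller sketch of the comparison, with hypothetical model and variable names:

  # full model vs. submodel with menhealth and stress set to zero
  fitfull <- lm(visits ~ phyhealth + menhealth + stress, data = df)
  fitsub  <- lm(visits ~ phyhealth, data = df)
  anova(fitsub, fitfull)  # F test of the R-squared change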

47
Q

What are we left with when testing an empty or null nested model?

What is the empty nested model?

A

The intercept - the empty (null) nested model is the intercept-only model, ŷ = ȳ. This is what we test our model against: we test the change in model fit over this empty model.

48
Q

What are the 3 different types of correlations?

A
  1. Zero-order correlations (Pearson r - the generic correlation showing shared variability, ignoring everything else)
  2. Partial correlations
  3. Semi-partial correlations
49
Q

Define Partial Correlations:

What does residualizing have to do with this?

A

A partial correlation is the amount of unique overlap between X and Y where Z has been removed from BOTH X and Y.

We remove Z from both by residualizing.

50
Q

Define Semi-Partial Correlations:

A

A semi-partial correlation looks at the unique amount of overlap between X and Y where Z has been removed from X but NOT from Y.
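
Both correlations can be computed by residualizing; a minimal sketch, with hypothetical variable names (Z = stress):

  # remove Z from Y and from X by regressing each on Z and keeping the residuals
  y_res <- residuals(lm(visits ~ stress, data = df))
  x_res <- residuals(lm(phyhealth ~ stress, data = df))
  cor(x_res, y_res)      # partial correlation: Z removed from both X and Y
  cor(x_res, df$visits)  # semi-partial: Z removed from X only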

51
Q

Which correlation between X and Y is bigger? Why?

A

The partial correlation - because Z’s variability has been removed from Y, the total variability in Y is smaller, so the shared portion takes up a larger proportion of what remains of Y.

52
Q

What kind of correlation are we running when answering the following question:

What would the relationship between physical health problems (X) and doctor visits (Y) be if I could literally hold stress (Z) constant?

A

Partial correlation

53
Q

What does Regress Y on Z mean?

A

We use Z as a predictor of Y - Z predicts Y in the model, and what’s left over is the residual of Y after controlling for Z.

ex) R²y.z = .59

54
Q

If a model is perfect to the sample data, where would the data points fall on?

A

All data points would fall on the regression line, meaning the residual would be zero.

55
Q

What is the biggest difference in the predictions of the b-coefficients between a SLR and MR?

A

For the b1 and b2 coefficients, we add the crucial phrase:

b1 = the change in the EXPECTED value of Y associated with a 1-unit increase in X1, OVER AND ABOVE the effect of X2.

Same for b2, except it would be OVER AND ABOVE the effect of X1.

56
Q

How is R-squared interpreted in MR?

A

TAKEN TOGETHER, X1 and X2 account for some % of the variability in Y.

57
Q

Know (in general terms) how the formulae for b’s, R2, and standard errors of b’s in MR with two predictors differ from those in SLR.

A

b0 = ȳ - b1(X̄1) - b2(X̄2)

b1 = SSCP(X1,Y) ÷ SS(X1) - but in MR this is adjusted for the overlap between X1 and X2 (the statistical control described earlier)

b2 = SSCP(X2,Y) ÷ SS(X2) - likewise adjusted for the overlap

58
Q

What are standardized coefficients, how are they computed (general statement, not formulae), and how are they interpreted?

A

Standardized coefficients (ß) allow us to estimate relationships without the constraint of the x-variables’ units of measure (the SDs of the x-variables otherwise limit us).

ß is computed by converting all predictor variables and the outcome variable into z-scores.

Interpretation of ß-coefficients: ß1 = the standardized change in the expected value of Y associated with a 1 STANDARD DEVIATION increase in X1, OVER AND ABOVE the effect of X2.

ex) Although not significant, doctor visits are EXPECTED to increase by .72 STANDARD DEVIATIONS for every 1 STANDARD DEVIATION increase in physical health problems, OVER AND ABOVE the effects of mental health, t(7) = 1.51, p > .05.

59
Q

What is the general form of the regression model using standardized coefficients?

A

Zy = ß1(Zx1) + ß2(Zx2) + e

60
Q

What happens to the intercept term after we standardize the regression coefficients?

A

The intercept practically becomes zero after we standardize our coefficients, because the z-score conversion sets the means of all the variables to zero.

61
Q

How can you convert an unstandardized coefficient to a standardized coefficient (and vice versa)?

A

ß = b (Sx/Sy) ; b times the standard deviation of x divided by the standard deviation of y.
(mnemonic: “ßeta b SeXy”)

b = ß (Sy/Sx) ; ß times the standard deviation of y divided by the standard deviation of x.

62
Q

Under what conditions would you want to examine standardized coefficients?

A

To create a consistent, standardized metric of measurement - we use standardized coefficients to compare predictors measured in different units.

63
Q

Know how to conduct the F test of R2 change. How do you interpret the R2 change value and the result of the test?

A

F = [(R²Full - R²Reduced) / (dfRegFull - dfRegReduced)] ÷ [(1 - R²Full) / dfResFull]

Interpretation: Taken together, X2 and X3 significantly contribute to our model.
ex) Taken together, mental health and stress significantly contribute to our model.