[L11] Multiple Regression Analysis Flashcards

1
Q

___ (rather than correlation) is the term used when we have
the specific aim of predicting values on a ___ variable (or
target) from a “___ variable”.

A

Regression; criterion; predictor

2
Q

The square of the ___ gives us an estimate of
the variance in y explained by variance in x.

A

correlation coefficient
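
A minimal Python sketch of this card (the data is made up purely for illustration): np.corrcoef gives Pearson's r, and squaring it estimates the proportion of variance in y explained by x.

```python
import numpy as np

# Illustrative data only
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.0, 8.5])

r = np.corrcoef(x, y)[0, 1]  # Pearson correlation coefficient
print(r ** 2)                # estimated proportion of variance in y explained by x
```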

3
Q

Because there is a correlation between x and y, we can, to a
certain extent (depending on the size of r²), predict ___
from x scores.

A

y scores

4
Q

The ___ is the line of best fit placed among the
points in a scatterplot.

A

REGRESSION LINE

5
Q

REGRESSION LINE

On this line will lie all our
___,
symbolized as ŷ, made from our knowledge of x values.

A

predicted values for y

6
Q

The vertical distance between an actual y value & its
associated ŷ value is known as the ___

A

PREDICTION ERROR.

7
Q

But it is better known as a ____ because it
represents how wrong we are in making the prediction for
that particular case.

A

RESIDUAL

8
Q

The ___, then, is the line that minimizes these
residuals.

A

regression line
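
Written out as a worked equation (standard least-squares notation, reconstructed rather than quoted from the deck): the residual for case i is the gap between the actual and predicted y, and the regression line is the one that minimizes the sum of the squared residuals.

```latex
e_i = y_i - \hat{y}_i , \qquad
\text{choose } b, c \text{ to minimize } \sum_{i=1}^{N} e_i^{2}
= \sum_{i=1}^{N} \bigl( y_i - (b\,x_i + c) \bigr)^{2}
```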

9
Q

___ – it is the number of units ŷ increases
for every unit increase in x.

A

Regression coefficient

10
Q

___ is a constant value. It is the value of ŷ when x is 0.

A

c
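
A minimal Python sketch pulling the last few cards together (illustrative data; np.polyfit with degree 1 fits the least-squares line): the slope is the regression coefficient b and the intercept is the constant c, so ŷ = b·x + c and the residuals are y − ŷ.

```python
import numpy as np

# Illustrative data only
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.0, 8.5])

b, c = np.polyfit(x, y, 1)  # slope (regression coefficient) and constant
y_hat = b * x + c           # predicted values -- these lie on the regression line
residuals = y - y_hat       # prediction errors
print(b, c, residuals)
```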

11
Q

In regression we also deal with ___ rather than raw
scores.

A

standard scores

12
Q

When scores (x and y) are expressed in standard score form,
the regression coefficient is known as the ___

A

STANDARDIZED REGRESSION COEFFICIENT or
BETA

13
Q

Where there is only one predictor, __ is in fact the correlation
coefficient of x with y.

A

beta
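
A quick check of this card in Python (illustrative data): fit the regression on standard scores, and the slope (beta) comes out equal to Pearson's r.

```python
import numpy as np

# Illustrative data only
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0])
y = np.array([1.5, 3.0, 4.5, 6.0, 8.5])

zx = (x - x.mean()) / x.std()  # x in standard-score form
zy = (y - y.mean()) / y.std()  # y in standard-score form

beta = np.polyfit(zx, zy, 1)[0]  # standardized regression coefficient
r = np.corrcoef(x, y)[0, 1]      # ordinary Pearson r
print(np.isclose(beta, r))       # True: with one predictor, beta = r
```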

14
Q

___
* Can be used when we have a set of variables (x1, x2, x3, etc.),
each of which correlates to some extent with a criterion
variable (y) for which we would like to predict values.

A

MULTIPLE PREDICTIONS

15
Q

___ of two variables (the shared variance) = r²

A

Co-variation

16
Q

Because multiple regression has so much to do with
correlation, it is important that the variables used are
__.

A

continuous

17
Q

That is, they need to incorporate measures on some kind of
___ scale.

A

linear

18
Q

In ___, variables like marital status, where codes 1–4 are
given for single, married, divorced, widowed, etc., cannot be
used.

A

correlation

19
Q

The exception, as with correlation, is the ___ variable,
whose categories are exhaustive, such as gender.

A

dichotomous

20
Q

Even with these variables, however, it does not make sense to
carry out a ___ regression analysis if almost all variables
are dichotomously categorical.

A

multiple

21
Q

If almost all variables are dichotomously categorical, a
better procedure would be ___

A

LOGISTIC
REGRESSION

22
Q

___
* Refers to predictor variables that will also correlate
with one another.

A

COLLINEARITY

23
Q

If one IV is to be a useful predictor of the DV, independently
of its relationship with another IV (collinearity), we need to
know its unique relationship with the dependent variable.
* This is found using the statistic known as the ___

A

SEMI-PARTIAL
CORRELATION COEFFICIENT

24
Q

___ is a way of partialling out the
effect of a third variable (z) on the correlation between two
variables, x and y.

A

PARTIAL CORRELATION

25
Q

In ___ correlation we take the residuals of only one of the two variables involved.

A

semi-partial

26
Q

Semi-partial correlation gives us the ___ shared only between an IV & the DV, with the variance of another IV partialled out.

A

common variance
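
A sketch of the distinction in Python (simulated data; the helper name `residuals` is mine, not from the deck): for the semi-partial correlation, z is partialled out of x only, while for the partial correlation, z is partialled out of both x and y.

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=100)              # third variable to partial out
x = z + rng.normal(size=100)          # predictor, correlated with z
y = 2 * x + z + rng.normal(size=100)  # criterion

def residuals(a, b):
    """Part of a not explained by b: residuals from regressing a on b."""
    slope, intercept = np.polyfit(b, a, 1)
    return a - (slope * b + intercept)

semi_partial = np.corrcoef(residuals(x, z), y)[0, 1]           # z out of x only
partial = np.corrcoef(residuals(x, z), residuals(y, z))[0, 1]  # z out of both
print(semi_partial, partial)
```
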
27
Q

Remember that the explained variance is found by ___

A

squaring the regression coefficient

28
Q

Now, imagine that for each predictor variable a regression coefficient is found that, when squared, gives us the unique variance in the DV explained by that predictor on its own, with the effect of all other predictors partialled out. In this way we can improve our prediction of the variance in a dependent variable by adding in ___, in addition to any variance already explained by other predictors.

A

predictors that explain variance in y

29
Q

In multiple regression, then, a ___ of one variable is made using the correlations of other known variables with it.

A

statistical prediction

30
Q

For the set of predictor variables used, a particular combination of ___ is found that maximizes the amount of variance in y that can be accounted for.

A

regression coefficients

31
Q

In multiple regression, then, there is an ___ that predicts y, not just from x as in single-predictor regression, but from the regression coefficients of x1, x2, x3 and so on, where the x's are predictor variables whose correlations with y are known.

A

equation
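
Written out in full (standard notation for the equation this card describes; the next two cards define its terms):

```latex
\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_p x_p
```
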
32
Q

bi are the ___

A

regression coefficients for each of the predictors (xi)

33
Q

b0 is the ___ (c in the simple example earlier)

A

constant

34
Q

These b values are again the ___ increases for each unit increase in the predictor (xi), if all the other predictors are held constant.

A

number of units ŷ

35
Q

However, in this multiple-predictor model, when standardized values are used, the ___ are not the same value as that predictor's correlation with y on its own.

A

standardized regression coefficients

36
Q

What is especially important to understand is that, although a single predictor variable might have a strong individual correlation with the criterion variable, acting among a set of predictors it might have a ___

A

very low regression coefficient

37
Q

In this case the potential contribution of one predictor variable to explaining variance in the DV has, as it were, already been mostly used up by another predictor or IV with which it shares a lot of ___

A

common variance

38
Q

The multiple regression procedure produces a ___, symbolized by R, which is the overall correlation of the predictors with the criterion variable.
* In fact, it is the simple correlation between actual y values & their estimated ŷ values.

A

MULTIPLE CORRELATION COEFFICIENT
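
A minimal sketch (simulated data): fit two predictors by ordinary least squares with numpy; R is then literally the simple correlation between the actual y values and the ŷ values, just as the card says.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
y = 1.0 + 0.8 * x1 + 0.5 * x2 + rng.normal(size=100)

X = np.column_stack([np.ones_like(x1), x1, x2])  # constant + predictors
b, *_ = np.linalg.lstsq(X, y, rcond=None)        # b0, b1, b2
y_hat = X @ b                                    # estimated y values

R = np.corrcoef(y, y_hat)[0, 1]  # multiple correlation coefficient
print(R, R ** 2)                 # R, and proportion of variance accounted for
```
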
39
Q

The higher R is, the ___ between actual y values & estimated ŷ values.

A

better is the fit

40
Q

The closer R approaches +1, the ___ the differences between actual & estimated values.

A

smaller are the residuals

41
Q

Although R behaves just like any other correlation coefficient, we are mostly interested in R², since this gives us the ___ in the criterion variable that has been accounted for by the predictors taken together.

A

proportion of variance

42
Q

This is overall what we set out to do – to find the ___ to account for variance in the criterion variable.

A

best combination of predictors

43
Q

To find an R that is significant is ___

A

no big deal

44
Q

This is the same point about ___ correlations, that their strength is usually of greater importance than their significance, which can be misleading.

A

single

45
Q

However, to check for significance the R² value can be converted into an ___

A

F value
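
The usual conversion (a standard formula, not printed on the card; p = number of predictors, N = number of cases):

```latex
F = \frac{R^{2} / p}{\left(1 - R^{2}\right) / (N - p - 1)} , \qquad df = (p,\; N - p - 1)
```
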
46
Q

R² has to be adjusted because with small N its value is artificially ___

A

high

47
Q

This is because, at the extreme, with N = number of predictor variables (p) + 1, prediction of the criterion variable values is ___ and R² = 1, even though, in the population, prediction cannot be that perfect.

A

perfect
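
The standard adjustment (Wherry's formula; note how the denominator collapses when N = p + 1, matching the card's extreme case):

```latex
R^{2}_{adj} = 1 - \left(1 - R^{2}\right) \frac{N - 1}{N - p - 1}
```
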
48
Q

The issue boils down to one of ___

A

sampling adequacy

49
Q

Various rules of thumb are given for the ___ to produce a meaningful estimate of the relationship between predictors & criterion in the population – remember we are still estimating population parameters from samples.

A

minimum number of cases (N)

50
Q

Although some authors recommend a very high N indeed, some recommend that the minimum should be ___, and most accept this as reasonable, though the more general rule is ___

A

p + 50; “as many as possible”

51
Q

Effect size conventions

A

Small = .02
Medium = .15
Large = .35

52
Q

The output tables in Multiple Regression Analysis

A

* 1st table – simple descriptives for each variable.
* 2nd table – correlations between all variables.
* 3rd table – which variables have been entered into the equation.

53
Q

Model Summary – gives ___

A

R, R², and adjusted R²

54
Q

___ – tells whether or not the model accounts for a significant proportion of the variance in the criterion variable. It is a comparison of the variance “explained” vs. the variance “unexplained” (the residuals).

A

ANOVA

55
Q

___ – contains information about all the individual predictors.

A

Coefficients

56
Q

___ – are the b weights. These tell us how many units/points the criterion variable increases for every 1 unit/point increase in a particular predictor variable.

A

Unstandardized Coefficients

57
Q

___ – are the beta values. These tell us how many SD units the criterion variable increases for every 1 SD unit increase in a particular predictor variable.

A

Standardized Coefficients

58
Q

___ – found by dividing the unstandardized b value by its standard error. If t is significant, it means the predictor is making a significant contribution to the prediction of the criterion.

A

t-values
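
As a formula (standard OLS notation, not printed on the card):

```latex
t = \frac{b}{SE_b} , \qquad df = N - p - 1
```
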
59
Q

___ – when predictor variables correlate together too closely. If tolerance values in the coefficients table are very low (e.g., under .2), then multicollinearity is present.

A

Collinearity
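
A sketch of how tolerance could be computed by hand (simulated data; the helper name `tolerance` is mine, not a library call): regress each predictor on the remaining ones; tolerance is 1 − R² of that regression, and VIF is its reciprocal.

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=100)  # nearly collinear with x1
x3 = rng.normal(size=100)
X = np.column_stack([x1, x2, x3])

def tolerance(X, j):
    """1 - R^2 from regressing predictor j on the remaining predictors."""
    others = np.delete(X, j, axis=1)
    A = np.column_stack([np.ones(len(X)), others])
    b, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
    resid = X[:, j] - A @ b
    r2 = 1.0 - resid.var() / X[:, j].var()
    return 1.0 - r2

for j in range(X.shape[1]):
    tol = tolerance(X, j)  # very low tolerance (e.g., under .2) flags multicollinearity
    print(f"x{j + 1}: tolerance = {tol:.3f}, VIF = {1 / tol:.1f}")
```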