Sabinas lecture 1 to 7 Flashcards by Deleted Deleted

Variance & SD

σ² or V = the degree to which a variable ‘varies’ around its mean
V = Σ(X − X̄)² / (N − 1) = SS/df
SD = √σ² = √V (in the same units as the variable, so easier to interpret)
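
A minimal sketch of these two formulas in Python. The scores and the function names are made up for illustration; they are not from the lecture.

```python
import math

def sample_variance(xs):
    """Sum of squared deviations from the mean (SS) divided by df = N - 1."""
    mean = sum(xs) / len(xs)
    ss = sum((x - mean) ** 2 for x in xs)
    return ss / (len(xs) - 1)

def sample_sd(xs):
    """SD is the square root of the variance, back in the original units."""
    return math.sqrt(sample_variance(xs))

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
print(round(sample_variance(scores), 3))
print(round(sample_sd(scores), 3))
```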

Covariance

CoV = the degree to which two variables ‘vary’ simultaneously or co-vary
Note: the variance of a variable is… its covariance with itself.
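
The note about a variable's covariance with itself can be checked with a short sketch. The data and the function name are illustrative assumptions.

```python
def covariance(xs, ys):
    """Sum of cross-products of deviations, divided by N - 1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]

print(covariance(x, y))  # how x and y vary together
print(covariance(x, x))  # covariance of x with itself, i.e. the variance of x
```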

Correlation

The degree of linear relationship between two variables; essentially, it is a standardised covariance.
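
A sketch of what "standardised" means here, assuming the usual definition r = covariance / (SDx × SDy). The data are hypothetical. Rescaling X (say, metres to centimetres) changes the covariance but leaves r untouched.

```python
import math

def cov(xs, ys):
    """Sample covariance (N - 1 in the denominator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Covariance divided by the product of the two SDs."""
    return cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

x = [1, 2, 3, 4, 5]            # hypothetical data
y = [2, 4, 5, 4, 5]
x_cm = [100 * v for v in x]    # same variable on a different scale

r = correlation(x, y)
r_rescaled = correlation(x_cm, y)
print(round(cov(x, y), 3), round(cov(x_cm, y), 3))  # covariance depends on units
print(round(r, 3), round(r_rescaled, 3))            # correlation does not
```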

Continuous versus discrete variables

continuous and discrete (categorical)

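Regression sum of squares

The regression sum of squares (SSregression) is the part of the variation in Y that we can predict from the IVs; as a proportion of the total it is R². (1 − R²) is the proportion we cannot predict.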
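
A small sketch of this split for a one-predictor regression. The data and the helper `fit_line` are assumptions made for illustration. The total sum of squares divides into a predictable part (regression) and an unpredictable part (residual).

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
predicted = [a + b * xi for xi in x]
mean_y = sum(y) / len(y)

ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_regression = sum((p - mean_y) ** 2 for p in predicted)        # predictable
ss_residual = sum((yi - p) ** 2 for yi, p in zip(y, predicted))  # not predictable

r2 = ss_regression / ss_total
print(round(r2, 3), round(1 - r2, 3))
```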

how to work out t

t = b / SEb
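
A worked sketch for the one-predictor case. The card only gives t = b / SEb; the formula used here for SEb (the square root of MSresidual / SSx) is the standard one for simple regression and is an addition, as are the data.

```python
import math

def slope_t(xs, ys):
    """Return b, SEb and t = b / SEb for a one-predictor regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ss_x
    a = my - b * mx
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ms_residual = ss_residual / (n - 2)   # df residual = N - k - 1 with k = 1
    se_b = math.sqrt(ms_residual / ss_x)
    return b, se_b, b / se_b

b, se_b, t = slope_t([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])   # hypothetical data
print(round(b, 3), round(se_b, 3), round(t, 3))
```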

df Residual

df residual is the degrees of freedom for the part of the variation we cannot predict: N − k − 1, where k is the number of predictors.
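
As a quick arithmetic sketch (the function name and sample sizes are illustrative), the degrees of freedom divide up like this:

```python
def regression_df(n, k):
    """Degrees of freedom for N cases and k predictors."""
    df_regression = k
    df_residual = n - k - 1
    df_total = n - 1          # equals df_regression + df_residual
    return df_regression, df_residual, df_total

print(regression_df(100, 3))
```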

Confidence Intervals (CI)

b is an estimate of the population parameter. Ultimately, we want to know the true value of the regression coefficient. Having the CI helps to illustrate this idea (i.e., if we conducted this research 100 times and computed a CI each time, we would expect about XX% of those intervals to contain the true (yet unknown) slope).

how to use CIs

› If the range includes 0, then we can conclude that the finding is NOT statistically significant, and vice versa.
› We can also use the CI to test whether the slope is different from a particular value (e.g., whether this slope is different from the one found in previous studies).
› SPSS does not calculate the CI automatically

The CI is, in effect, our range of plausible values for the parameter. If I performed the experiment 100 times and built a CI each time, about XX% of those intervals would contain the true b.
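
A sketch of both uses, assuming the usual interval b ± t_crit × SEb. The numbers are hypothetical, and the critical t value is taken from a t table rather than computed.

```python
def slope_ci(b, se_b, t_crit):
    """Confidence interval for a slope: b plus/minus t_crit * SEb."""
    margin = t_crit * se_b
    return b - margin, b + margin

def ci_includes(ci, value):
    """True if the hypothesised value lies inside the interval."""
    lower, upper = ci
    return lower <= value <= upper

# Hypothetical values: b = 0.60, SEb = 0.283, df residual = 3.
# 3.182 is the two-tailed 5% critical t for df = 3 (from a t table).
ci = slope_ci(0.60, 0.283, 3.182)

print(ci)
print(ci_includes(ci, 0))    # True: cannot reject b = 0
print(ci_includes(ci, 2.0))  # False: the slope differs from 2.0
```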

Converting from b to β [in italics!]

β = b × (SDx/SDy)
b = β × (SDy/SDx) = β × √(Vy/Vx)
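
A round-trip sketch of the conversion with hypothetical summary statistics. Going from b to β multiplies by SDx/SDy; going back uses the inverse ratio, SDy/SDx.

```python
import math

def b_to_beta(b, sd_x, sd_y):
    """Standardise a slope: beta = b * SDx / SDy."""
    return b * sd_x / sd_y

def beta_to_b(beta, sd_x, sd_y):
    """Unstandardise a slope: b = beta * SDy / SDx."""
    return beta * sd_y / sd_x

# Hypothetical values: b = 0.6, variance of X = 2.5, variance of Y = 1.5
b, var_x, var_y = 0.6, 2.5, 1.5
sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)

beta = b_to_beta(b, sd_x, sd_y)
print(round(beta, 3))
print(round(beta_to_b(beta, sd_x, sd_y), 3))   # back to the original b
```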

If b is equal to zero

There is no relationship, so the predictor is not important. It will still be featured in the regression equation (DO NOT TAKE IT OUT).
The most common null hypothesis is that b = 0: the slope is not different from zero (a flat, ‘stagnant’ slope), so nothing systematic is happening.

The null value doesn’t have to be zero. For example: is the new slope different from 1.5 or not? If the CI includes 1.5, we cannot reject that null hypothesis.

MR advantages

› Can use both categorical and continuous independent variables
› Can easily incorporate multiple independent variables
› Is appropriate for the analysis of experimental or nonexperimental research

Factors Affecting the Results of the Regression Equation

› Sample size (N)
› The amount of scatter of points around the regression line [indexed by Σ(Y − Y′)², or SSresidual]. Other things being equal, the smaller SSresidual, the larger SSregression, and hence the larger the F-ratio
› The range of values in the X variable, indicated by Σ(X − X̄)²

Assumptions Underlying MR (only a glimpse now)

› The dependent variable is a linear function of the IVs
- can be overlooked if one selects extreme cases of X… selecting only extreme cases can ‘force’ the regression to appear linear, even if it might be curvilinear across the full range of X values. Bad practice…
› Each observation is drawn independently
› Errors are normally distributed
› The mean of the errors is 0
› Errors are not correlated with each other, nor with the IVs
› Homoscedasticity of variance
- Variance of errors is not a function of the IVs
- The variance of errors is constant, meaning that it is the same at all levels of the IV

regression df

The number of IVs (k)

do you report the non significant parts in regression conclusion?

YES

decimal places for b

three decimal places. .003 etc

what happens when you shorten the effect sample line graph?

b stays the same, β changes: the distribution is different, so the SDs change.
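
A sketch of this effect, assuming the card is about restricting the range of X. The data are constructed (two cases at each X value, Y = 2X ± 1) so that the slope is exactly 2 in both samples; all names are illustrative.

```python
import math

def slope_and_beta(xs, ys):
    """Unstandardised slope b and standardised slope beta for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sp / ss_x
    beta = b * math.sqrt(ss_x / ss_y)   # SDx / SDy; the N - 1 terms cancel
    return b, beta

def make_data(x_values):
    """Two cases per X value: Y = 2X + 1 and Y = 2X - 1 (hypothetical)."""
    xs, ys = [], []
    for x in x_values:
        for e in (1, -1):
            xs.append(x)
            ys.append(2 * x + e)
    return xs, ys

b_full, beta_full = slope_and_beta(*make_data(range(1, 10)))   # X from 1 to 9
b_short, beta_short = slope_and_beta(*make_data(range(4, 7)))  # X from 4 to 6

print(b_full, round(beta_full, 3))
print(b_short, round(beta_short, 3))
```

The slope is identical in both samples, while β drops in the restricted sample because SDx shrinks relative to SDy.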

why is ß the same as ry2 when the two IVs don’t correlate

Because the two IVs do not overlap in the Venn diagram. β for X2 = ry2 when r12 = 0
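
This follows from the standard two-predictor formula for a standardised weight, which is not on the card and is added here: β2 = (ry2 − ry1 × r12) / (1 − r12²). A sketch with hypothetical correlations:

```python
def beta_2(r_y1, r_y2, r_12):
    """Standardised weight for X2 in a two-predictor regression."""
    return (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

print(round(beta_2(r_y1=0.4, r_y2=0.3, r_12=0.0), 3))  # IVs uncorrelated: equals r_y2
print(round(beta_2(r_y1=0.4, r_y2=0.3, r_12=0.5), 3))  # IVs overlap: no longer r_y2
```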

assumptions of error

We assume that they are normally distributed, independent, and have constant variance.

regression line

The IVs are differentially weighted so that the prediction is optimised and the sum of the squared errors of prediction is minimised. That is, the sum of the squared residuals is smaller than for any other possible straight line, hence the term ‘least squares’.
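
A sketch of the least-squares property for one predictor, using hypothetical data: the fitted line is compared with a handful of nearby lines, and none of them achieves a smaller sum of squared errors. The helper names and the particular nudges are assumptions.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

def sum_sq_errors(a, b, xs, ys):
    """Sum of squared errors of prediction for the line Y' = a + bX."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

x = [1, 2, 3, 4, 5]   # hypothetical data
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
best = sum_sq_errors(a, b, x, y)

# Nudge the intercept and slope to get eight other candidate lines.
others = [sum_sq_errors(a + da, b + db, x, y)
          for da in (-0.5, 0.0, 0.5)
          for db in (-0.2, 0.0, 0.2)
          if (da, db) != (0.0, 0.0)]

print(round(best, 3), round(min(others), 3))
```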

β: way of writing the conclusion

In standard scores or standard deviations, not ‘standard units’.

what is a different metric

Includes different scales of the same dimension: cm is DIFFERENT from hours, but cm is also DIFFERENT from metres. The units must be exactly the same to compare bs; otherwise use β.

when is something not a common cause

The a, b and c paths are equivalent to βs, where the DV is Y and it is regressed on variables X1 and X2
› If X1 has no effect on Y (b = 0), but it has an effect on X2, then:
- it is not a common cause
- βYX2 = rYX2 = c
- c does not change with the inclusion or exclusion of X1
OR
› If X1 has no effect on X2 (a = 0), but it has an effect on Y, then:
- it is not a common cause
- βYX2 = rYX2 = c
- c does not change with the inclusion or exclusion of X1
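
A sketch of why this holds, assuming a standardised path model in which X1 affects X2 (path a), X1 affects Y (path b) and X2 affects Y (path c). The implied correlations come from standard path tracing, and the path values are hypothetical.

```python
def implied_correlations(a, b, c):
    """Correlations implied by X1 -> X2 (a), X1 -> Y (b), X2 -> Y (c)."""
    r_12 = a
    r_y1 = b + a * c
    r_y2 = c + a * b
    return r_12, r_y1, r_y2

def beta_y_x2(a, b, c, include_x1):
    """Standardised weight of X2, with or without X1 in the regression."""
    r_12, r_y1, r_y2 = implied_correlations(a, b, c)
    if not include_x1:
        return r_y2                  # one predictor: beta equals r
    return (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)

cases = [
    (0.5, 0.0, 0.4),   # b = 0: X1 is not a common cause
    (0.0, 0.3, 0.4),   # a = 0: X1 is not a common cause
    (0.5, 0.3, 0.4),   # both paths non-zero: X1 is a common cause
]
for a, b, c in cases:
    without_x1 = beta_y_x2(a, b, c, include_x1=False)
    with_x1 = beta_y_x2(a, b, c, include_x1=True)
    print(a, b, round(without_x1, 3), round(with_x1, 3))
```

In the first two cases the weight for X2 equals c whether or not X1 is in the model. In the third, leaving X1 out inflates the weight from c to c + a × b.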

importance of R²

› For explanation, a high R² is less important than proper variable selection
› R² should be within the expected range
- Explaining 25% of the variance may be surprisingly high for some questions, low for others
› A high (?) R² is important for prediction
› “Human freedom may then rest in the error term”

Indirect Effects

› The regression weight for Parent Education changed because a mediating variable (Previous Achievement) was included in the model.
› A portion of the direct effect from the first regression is now indirect (e.g., paths d and a)
› Mediating variables do not have to be included to interpret regression coefficients as effects
› However, this type of regression only focuses on direct effects.

mean when you standardise something

Every time you standardise a variable, its mean will be zero (apart from rounding error), as in a z-score distribution.
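
A quick check with made-up raw scores (the function name is illustrative). Any tiny departure from zero in the printed mean is floating-point rounding.

```python
import math

def standardise(xs):
    """Convert raw scores to z scores: (X - mean) / SD."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / sd for x in xs]

z = standardise([12, 15, 9, 20, 14])   # hypothetical raw scores
z_mean = sum(z) / len(z)
z_sd = math.sqrt(sum((v - z_mean) ** 2 for v in z) / (len(z) - 1))

print(z_mean, z_sd)
```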

what do you look at when you have intercorrelations output?

- Correlations that our IVs have with our DV
- Correlations that our IVs have with each other

including a common cause

Prevents inflating the other variables’ bs and βs.

df for change statistics in sequential

1 when only one variable is added at a time (the numerator df for the change equals the number of variables added in that step).
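
A sketch of where these df enter, assuming the usual F test for a change in R² between nested models. The formula is standard but not on the card; the R² values, sample size and parameter names are hypothetical.

```python
def f_change(r2_full, r2_reduced, n, k_full, k_added):
    """F test for the change in R-squared between two nested models."""
    df1 = k_added             # variables added in this step
    df2 = n - k_full - 1      # residual df of the larger model
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

# One variable added to a two-predictor model, N = 104
f, df1, df2 = f_change(r2_full=0.30, r2_reduced=0.25, n=104, k_full=3, k_added=1)
print(round(f, 3), df1, df2)
```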

order of entry

The variable entered first has the most opportunity to capture the highest proportion of variance; the one entered last tends to have a tiny ∆R².

total effects

The direct effect plus e × d (the paths along the indirect route multiplied together).
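
As a small arithmetic sketch with hypothetical path values (the function name is illustrative):

```python
import math

def total_effect(direct, *indirect_paths):
    """Direct effect plus the product of the paths along one indirect route."""
    indirect = math.prod(indirect_paths) if indirect_paths else 0.0
    return direct + indirect

# Hypothetical standardised paths: direct = 0.20, e = 0.50, d = 0.40
print(round(total_effect(0.20, 0.50, 0.40), 3))
```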

importance measure better than ∆R²

√∆R²

Unique Variance

› Some researchers add each variable last in a sequential regression to determine its “unique” effect/variance
› Can get the same information in simultaneous regression, requesting semipartial (part) correlations
› Square the part correlations to determine unique variance
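
A sketch of why the two routes agree in the two-predictor case, using the standard formulas for R² and the semipartial correlation (added here, not on the card) with hypothetical correlations. The ∆R² from entering X1 last equals its squared part correlation.

```python
import math

def r_squared(r_y1, r_y2, r_12):
    """R-squared for Y regressed on X1 and X2, from the three correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

def semipartial_1(r_y1, r_y2, r_12):
    """Part correlation of X1 with Y, with X2 removed from X1 only."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)

r_y1, r_y2, r_12 = 0.5, 0.4, 0.3   # hypothetical correlations

delta_r2 = r_squared(r_y1, r_y2, r_12) - r_y2 ** 2   # X1 entered last
unique = semipartial_1(r_y1, r_y2, r_12) ** 2        # squared part correlation

print(round(delta_r2, 4), round(unique, 4))
```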

what to do with stepwise

large N and cross validation necessary

interactions and curves

› Test for interactions by sequentially adding a cross-product term to the regression
› Test for curves in the regression plane by sequentially adding powers of variables (e.g., variable²)

purpose sequential

- Is a variable (or block of variables) important for an outcome?
- Does a variable explain/predict variance beyond that explained by other influences?
- Does a variable explain/predict unique variance in an outcome?
- Test for statistical significance of interactions and curves
- Does a variable aid in predicting some criterion?

What to Interpret in sequential

› Magnitude (importance): √∆R²
› Statistical significance: the test of ∆R²

when to use sequential

› Useful for explanation when guided by theory
› Useful for testing interactions & curves
› Estimates total effects in the implied model
› More ‘similar’ to the ANOVA method (?)
› Be careful with order of entry

alternatives to Stepwise

› Simultaneous regression
› Sequential regression (final equation)
› Study correlations between IVs. If some are highly intercorrelated, consider combining them in a composite.
› SEM (…?...)

(40 cards)