250B Midterm Flashcards

1
Q

What is regression towards the mean?

A

A z-score of 1 on X predicts a z-score of 1 on Y only when the correlation is 1.0. If the correlation is less than 1, then no matter where you are on X, your predicted Y is closer to the mean of Y, and the smaller the correlation, the stronger the pull toward the mean.
A placebo/control group removes the effect of regression to the mean: both groups will tend to regress toward the mean, so if the treatment group shows a statistically significant difference beyond that shared tendency, it can be attributed to the treatment rather than to regression to the mean.
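
A minimal sketch (toy values, nothing from the course materials) of the standardized prediction equation, where the predicted z on Y is the correlation times the z on X:

```python
# With standardized variables the regression line is z_Y_hat = r * z_X,
# so whenever |r| < 1 the prediction is pulled toward the mean (z = 0).
for r in (1.0, 0.5, 0.2):
    z_x = 1.0
    z_y_hat = r * z_x
    print(f"r = {r:.1f}: z_X = 1.0 predicts z_Y = {z_y_hat:.2f}")
```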

2
Q

How does an individual case contribute to SStotal, SSresidual, and SSregression?

A

SStotal: sum(Y - Ybar)^2 –> to the extent that an observation differs from the grand mean, SStotal increases
SSresidual: sum(Y - Yhat)^2 –> to the extent that an observation differs from its predicted value, SSresidual increases
SSregression: sum(Yhat - Ybar)^2 –> to the extent that an observation's predicted value differs from the grand mean, SSregression increases
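
A minimal sketch with made-up numbers (illustrative only) showing the three pieces and the identity SStotal = SSregression + SSresidual:

```python
import numpy as np

# Toy data; fit a least-squares line, then compute each sum of squares.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

b, a = np.polyfit(x, y, 1)             # slope and intercept
y_hat = a + b * x
y_bar = y.mean()

ss_total = np.sum((y - y_bar) ** 2)    # deviations from the grand mean
ss_resid = np.sum((y - y_hat) ** 2)    # deviations from predicted values
ss_reg = np.sum((y_hat - y_bar) ** 2)  # predicted values vs. grand mean

print(ss_total, ss_reg + ss_resid)     # the two numbers should match
```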

3
Q

Why are cases further from the mean on X more important in determining a regression line?

A

Since the z-scores for these observations are larger in magnitude, they contribute the largest cross-products (zX * zY) and so drive the correlation between X and Y. Correlation plays a large part in determining the regression line through the slope (b = r * sY/sX), so these cases have more leverage on where the line falls.

4
Q

What is the standard error of estimate (SEE), and does it change as a function of r?

A

SEE is a measure of how much Y scores vary around their predicted values at each value of X (conditional on X).
As r^2 increases, SEE decreases!
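
A minimal sketch of the relation, using the textbook-style formula SEE = sY * sqrt((1 - r^2)(N - 1)/(N - 2)) with made-up values of sY and N:

```python
import math

# Hold the SD of Y and the sample size fixed; watch SEE shrink as r grows.
n, s_y = 50, 10.0
for r in (0.0, 0.5, 0.9):
    see = s_y * math.sqrt((1 - r ** 2) * (n - 1) / (n - 2))
    print(f"r = {r}: SEE = {see:.2f}")
```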

5
Q

How do SSyhat (SSreg) and SSresid change as r^2 goes from 0 to 1?

A
SSres = SSy(1 - r^2), so as r^2 increases the multiplier of SSy decreases and SSresid decreases --> this makes sense because as r^2 increases you are explaining more variance by knowing X, so your unexplained variance should decrease
SSreg = r^2 * SSy, so as r^2 increases the multiplier of SSy increases and SSreg increases --> makes sense because your explained variance should increase!
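
A tiny numeric illustration (SSy fixed arbitrarily at 100) of the two formulas moving in opposite directions:

```python
ss_y = 100.0  # arbitrary total variability, for illustration only
for r2 in (0.0, 0.25, 0.5, 0.75, 1.0):
    ss_reg = r2 * ss_y        # explained: grows with r^2
    ss_res = (1 - r2) * ss_y  # unexplained: shrinks with r^2
    print(f"r^2 = {r2:.2f}: SSreg = {ss_reg:6.1f}, SSres = {ss_res:6.1f}")
```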
6
Q

What is the proportional improvement in prediction? How is it different from PRE?

A

PRE = r^2 is the proportional reduction in error (in sum-of-squares terms) when predicting Y from X rather than from the mean of Y.
PIP = 1 - sqrt(1 - r^2) is the proportional reduction in the size of the SEE, and therefore in the width of the confidence interval on our prediction; it is smaller than PRE for the same r.
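
A quick sketch (arbitrary r values) showing how much more modest PIP is than PRE:

```python
import math

for r in (0.3, 0.5, 0.9):
    pre = r ** 2                     # proportional reduction in error (SS)
    pip = 1 - math.sqrt(1 - r ** 2)  # proportional reduction in SEE
    print(f"r = {r}: PRE = {pre:.3f}, PIP = {pip:.3f}")
```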

7
Q

How is r^2 in regression similar to eta^2 in ANOVA?

A

Both represent magnitude of effect: how much of the overall variability in the DV can be attributed to the treatment effect, so they have the same interpretation.
Remember, eta^2 = SStreat/SStotal and r^2 = SSreg/SSy, so they are essentially the same quantity.

8
Q

Describe assumptions of homogeneity of variance in arrays and normality in regression. When are these assumptions most critical?

A

Homogeneity of variance in arrays: the variance of Y at each value of X is constant; necessary to ensure sY|X is representative of the variance of every array.
Normality: in the population, the values of Y corresponding to any specified value of X are normally distributed (i.e., the errors are normally distributed); necessary because our tests and intervals rely on normal-theory sampling distributions.
These assumptions are most critical when we want to test hypotheses about b or set confidence limits on b or Y.

9
Q

What happens when residuals violate distributional assumptions?

A

When homoscedasticity is violated, a single estimate of error variance is meaningless because the variance changes as a function of Yhat, so it cannot be used as a common error term for tests.

10
Q

Describe the assumption of bivariate normality in correlation when we are using r as an estimator of ρ

A

Bivariate normality applies when X and Y are both random variables, i.e., when we are interested in correlation.
We assume we are sampling (X, Y) pairs from a bivariate normal distribution: if you slice the distribution along either direction, you get normal conditional distributions (conditional on a specific value of X or Y).
The marginal distributions (the distribution of X over all values of Y, and vice versa) are also normal.

11
Q

How would you develop a prediction interval around Yhat in a new sample, and what properties would that prediction interval have? (no formulas, just the concept)

A

A prediction interval around a future prediction in a new sample needs to take into account both the uncertainty in the mean of Y conditional on a fixed value of X and the variability of observations around that conditional mean.
Properties: the farther from the mean on X, the more uncertainty in the prediction (the interval widens).

12
Q

In the regression context, how do you test ρ = 0 using an ANOVA table and an F-test comparing two variance estimates? How is this analogous to a one-way between-groups ANOVA?

A

Divide MSregression (df = 1) by MSresidual (df = N - 2); the resulting F statistic equals the square of the t statistic for testing the slope (or, equivalently, r).
This is like a one-way ANOVA because we can think of ANOVA as looking at the correlation between group membership and the DV: SSregression plays the role of SSbetween and SSresidual plays the role of SSwithin.
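
A minimal simulated check (arbitrary seed and effect size) that F from the regression ANOVA table equals t^2 from the direct test of r:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)
n = len(y)

# ANOVA-table route: F = MSregression / MSresidual.
b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
ss_reg = np.sum((y_hat - y.mean()) ** 2)
ss_res = np.sum((y - y_hat) ** 2)
F = (ss_reg / 1) / (ss_res / (n - 2))

# Direct test of r: t = r * sqrt((n - 2) / (1 - r^2)).
r = np.corrcoef(x, y)[0, 1]
t = r * np.sqrt((n - 2) / (1 - r ** 2))

print(F, t ** 2)  # identical up to rounding
```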

13
Q

Given a covariance, correlation, slope and intercept, how would these values change if deviation scores were analyzed, or Z-scores analyzed?

A
Deviation scores: correlation unchanged; covariance unchanged; slope unchanged; intercept becomes 0 (the means of deviation scores are 0, so the line passes through the origin).
Z-scores: correlation unchanged; covariance becomes the correlation (because correlation is a standardized covariance); slope becomes r, and its interpretation changes (SD change in Y per one-SD change in X); intercept becomes 0 because the means of z-scores are 0.
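
A minimal simulated check (arbitrary seed and toy parameters) of the z-score case:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(10, 2, size=100)
y = 3 + 0.8 * x + rng.normal(size=100)

# Standardize both variables.
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

r = np.corrcoef(x, y)[0, 1]
b_z, a_z = np.polyfit(zx, zy, 1)
cov_z = np.cov(zx, zy, ddof=1)[0, 1]

print(r, b_z)   # standardized slope equals the correlation
print(cov_z)    # covariance of z-scores equals the correlation
print(a_z)      # intercept is 0 (up to floating-point noise)
```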
14
Q

What are some possible roles of a LOESS function in regression analysis?

A

We use LOESS in exploratory stages, or when the data look curvilinear, to see the shape of the X-Y relationship without assuming linearity.
It averages Y values close to a target value of the predictor, for successive slices of the predictor variable, tracing out a smooth curve.

15
Q

In a single predictor case, is the t-test (or F) of b the same as the test of r, that is, will the result always be the same?

A

Yes, but only in the single-predictor case: with one X, the t for b, the t for r, and the overall F (which equals t^2) all test the same hypothesis and give the same result.

16
Q

What are some graphical methods of exploring properties of residuals in regression? What should residuals look like when the data meet the regression assumptions?

A

Make a histogram: residuals should be normally distributed around 0.
Plot residuals as a function of the IV: residuals should not vary systematically with the IV.
Plot residuals as a function of Yhat: residuals should not change as a function of Yhat; they should be scattered randomly with constant spread.
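
A minimal sketch (simulated data, arbitrary seed) producing the three diagnostic plots described above:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = rng.normal(size=200)
y = 2 + 0.5 * x + rng.normal(size=200)

b, a = np.polyfit(x, y, 1)
y_hat = a + b * x
resid = y - y_hat

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].hist(resid, bins=20)        # should look roughly normal, centered at 0
axes[0].set_title("Histogram of residuals")
axes[1].scatter(x, resid, s=8)      # no systematic pattern against the IV
axes[1].set_title("Residuals vs. X")
axes[2].scatter(y_hat, resid, s=8)  # random scatter, constant spread
axes[2].set_title("Residuals vs. Yhat")
plt.tight_layout()
plt.show()
```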

17
Q

Describe the process of testing the difference between two regression slopes estimated in two different samples. How is this test similar to an independent-groups t-test? Is this test the same as the test of the difference between two correlations? Why or why not?

A

First find SE(b1 - b2) using the variance sum law, then compute a test statistic as the difference in slopes divided by that standard error.
It is similar to an independent-groups t-test because the logic is identical: a difference between two independent estimates divided by the standard error of that difference, t = (b1 - b2)/SE(b1 - b2), just as t = (M1 - M2)/SE(M1 - M2).
It is not the same as a test of two correlations because the correlation is a standardized statistic: slopes can be equal across two groups, but if the groups differ in variance on X or Y, the correlations will differ.
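
A minimal sketch (simulated samples, arbitrary seed) of the unpooled variance-sum-law version; whether the course pools error variance across samples is an assumption not settled by the card:

```python
import numpy as np
from scipy import stats

def slope_and_se(x, y):
    """OLS slope and its standard error for a single predictor."""
    n = len(x)
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)
    s_yx = np.sqrt(np.sum(resid ** 2) / (n - 2))        # SEE
    se_b = s_yx / np.sqrt(np.sum((x - x.mean()) ** 2))  # SE of the slope
    return b, se_b, n

rng = np.random.default_rng(3)
x1 = rng.normal(size=40); y1 = 1 + 0.5 * x1 + rng.normal(size=40)
x2 = rng.normal(size=35); y2 = 2 + 0.8 * x2 + rng.normal(size=35)

b1, se1, n1 = slope_and_se(x1, y1)
b2, se2, n2 = slope_and_se(x2, y2)

# Variance sum law: Var(b1 - b2) = Var(b1) + Var(b2) for independent samples.
t = (b1 - b2) / np.sqrt(se1 ** 2 + se2 ** 2)
df = n1 + n2 - 4
p = 2 * stats.t.sf(abs(t), df)
print(f"t({df}) = {t:.3f}, p = {p:.3f}")
```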

18
Q

What are the major advantages of a factorial ANOVA relative to a one-way ANOVA? What is the major reason that the factorial ANOVA is so much more powerful than a one-way ANOVA?

A

Now we can study interactions.
Factorial designs also allow for more generalizability: with more factors we are not limited to generalizing to a single population (e.g., only college-age students), and few real-world outcomes are affected by only one variable.
They are also more efficient: because we average the effects of one variable across the levels of the other, a two-factor design requires fewer participants than two one-way designs for the same degree of power (smaller MSwithin).

19
Q

Describe how the total sum-of-squares is partitioned in factorial ANOVA. This would include sum-of-squares cells.

A

SStotal is still the sum of all squared deviations from the grand mean.
SSa is the sum of squared deviations of the factor A marginal means from the grand mean, multiplied by nb (multiply by nb because each level of A contains n participants at each of the b levels of B).
SSb is the analog for factor B.
SScells is how much variation is due to all research factors together.
SSab = SScells - SSa - SSb is the variance attributable to the interaction between A and B.
SSerror is the variation within each cell around the cell mean: an SS computed for each cell separately, then summed.
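
A minimal sketch of the full partition on a toy 2 x 2 design (made-up scores, n = 3 per cell):

```python
import numpy as np

# cells[i, j] holds the n scores at level i of A and level j of B.
cells = np.array([[[3, 5, 4], [8, 9, 7]],
                  [[4, 6, 5], [4, 5, 3]]], dtype=float)
a_lv, b_lv, n = cells.shape
grand = cells.mean()
cell_means = cells.mean(axis=2)
a_means = cells.mean(axis=(1, 2))  # marginal means of A
b_means = cells.mean(axis=(0, 2))  # marginal means of B

ss_total = np.sum((cells - grand) ** 2)
ss_a = n * b_lv * np.sum((a_means - grand) ** 2)
ss_b = n * a_lv * np.sum((b_means - grand) ** 2)
ss_cells = n * np.sum((cell_means - grand) ** 2)
ss_ab = ss_cells - ss_a - ss_b
ss_error = np.sum((cells - cell_means[:, :, None]) ** 2)

# The full partition recovers SStotal.
print(ss_total, ss_a + ss_b + ss_ab + ss_error)
```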

20
Q

What is SScells?

A

The variability of the individual cell means around the grand mean: how much the cell means differ.
Useful because we can partition it into sources of variability: SSa, SSb, SSab.
Once we have SSa, SSb, and SScells, we can calculate SSab = SScells - SSa - SSb to figure out how much variability in the cell means is attributable to the interaction between A and B.

21
Q

What is grand mean?

A

Mean of all observations from all groups.

22
Q

What is a main effect?

A

The effect of one factor averaging over (ignoring) the levels of the other; it is assessed with the marginal means.

23
Q

What is a marginal mean?

A

The mean at one level of factor A, averaged across all levels of factor B (or vice versa).

24
Q

What is an interaction?

A

When the simple effects change as a function of the levels of the other IV: the effect of A is not consistent across the levels of B.
The effect of A depends on the level of B.

25
Q

What is a cell mean?

A

The mean for a particular combination of one level of A and one level of B.

26
Q

What is a simple effect?

A

Holding one factor constant at a single level and comparing across the levels of the other.
It is conditional on the level of the other variable.

27
Q

Without using a significance test, describe the process of identifying an interaction by comparing main effects with simple effects.
What does it mean to say an interaction is what is left over after partialing out the main effects? If there were no interaction, would the main effects predict the cell means?

A

Look to see whether the main effect is consistent across all levels of the other variable by examining the simple effects. If the simple effects change as a function of the level of the other variable, you have an interaction.
If, after partialing out the main effects, you still have unexplained variability of the cell means around the grand mean, that remainder must be due to the interaction between A and B.
If there were no interaction, the main effects (marginal means) would exactly reproduce the cell means.

28
Q

What are the issues involved in deciding what to report (main effects or not) when there is an interaction?

A

Some argue that if you have an interaction you should not interpret the main effects, but sometimes the main effects still account for a fair amount of variance and are worth reporting.

29
Q

What is the difference between a fixed and a random research factor? How do our analyses change if one or more factors is random?

A

Fixed: data have been gathered from all levels of the factor that are of interest.
Random: the factor has many possible levels, interest extends to all of them, but only a random sample of levels is included in the data.
For fixed effects we are safe to use MSerror as the denominator, but if one or more factors is random, the usual expected mean squares no longer apply and you must work out the appropriate error term for each effect from the expected MS (e.g., in a mixed model the fixed effect is tested against the interaction MS).

30
Q

Differentiate eta squared and partial eta squared in factorial designs. What complications arise in computing omega squared and partial omega squared when we have both fixed and random research factors?

A

eta-sq: the magnitude of effect associated with a particular variable, relative to the total variability in the experiment (SSeffect/SStotal); it is a biased (inflated) estimate, though.
partial eta-sq: estimates the effect of A relative only to the variability due to A and error, SSa/(SSa + SSerror).
omega-sq is less biased and allows us to differentiate between fixed, random, and mixed models and act accordingly; the complication is that the fixed or random status of each factor changes the expected mean squares, so different denominators are required for fixed versus random effects.

31
Q

If a significant interaction is identified, then a researcher may wish to follow up with tests of simple (conditional) effects. Describe how the sum-of-squares is developed and tested for a simple effect. What does it mean that sum-of-square is additive, additive to what, and what are the implications?

A

Say we want to compare the levels of A at level 1 of B.
Calculate SSa as usual, but using only the data from level 1 of B. Divide by df = a - 1 to get the MS for the simple effect, and divide that by MSerror to get an F statistic.
Simple SS are additive: the simple effects of A at the different levels of B sum to SSa + SSab, so they are a partitioning of those two sources. The implication is that a significant interaction guarantees the simple effects of A differ across levels of B.
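
A minimal sketch on the same kind of toy 2 x 2 layout used above, checking the additivity claim numerically:

```python
import numpy as np

# Toy 2 x 2 design; cells[i, j] holds the n scores at level i of A, level j of B.
cells = np.array([[[3, 5, 4], [8, 9, 7]],
                  [[4, 6, 5], [4, 5, 3]]], dtype=float)
a_lv, b_lv, n = cells.shape
grand = cells.mean()
cell_means = cells.mean(axis=2)
a_means = cells.mean(axis=(1, 2))
b_means = cells.mean(axis=(0, 2))

ss_a = n * b_lv * np.sum((a_means - grand) ** 2)
ss_b = n * a_lv * np.sum((b_means - grand) ** 2)
ss_cells = n * np.sum((cell_means - grand) ** 2)
ss_ab = ss_cells - ss_a - ss_b

# Simple effect of A at each level of B: SSa computed within that slice only.
ss_simple = []
for j in range(b_lv):
    slice_means = cells[:, j, :].mean(axis=1)  # means of A at level j of B
    slice_grand = cells[:, j, :].mean()
    ss_simple.append(n * np.sum((slice_means - slice_grand) ** 2))

print(sum(ss_simple), ss_a + ss_ab)  # simple effects of A sum to SSa + SSab
```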

32
Q

Howell claims that d-family effect size measures are “more complicated” in factorial ANOVA because you have to figure out what the right standard deviation is (p. 440). What is the problem here, and how is it solved? Why can’t one just use the square root of MSerror?

A

You have to decide what counts as error: variance due to a factor the experimenter artificially introduced arguably should not be part of the standardizer, whereas naturally varying factors should be.
You can't just use sqrt(MSerror) because MSerror changes depending on which independent variables you control for, so the same effect could get different d values in different designs.
The solution is to be careful about which effects you add back into the error term, so the standard deviation reflects variability in the population you want to generalize to.

33
Q

What is the difference between an orthogonal and non-orthogonal design?

A

Orthogonal: equal n in all cells –> the independent variables are uncorrelated –> SStotal can be uniquely partitioned into SS due to each research factor and the interaction, so interpretation of the effect of each research factor is unambiguous.
Non-orthogonal: the IVs are correlated (if you are in level i of factor A, you tend to be in level j of factor B); this happens with unequal n –> the effects of the IVs are confounded and the partitioning is no longer unique.

34
Q

Describe what Type I, II, and III sum-of-squares are, and why type three is typically used for an unbalanced ANOVA design.

A

I: Sequential/hierarchical: each effect is adjusted only for the effects entered before it, so it evaluates differences in weighted means; results depend on the cell proportions in the sample, and the order in which you enter the factors matters.
II: SS for each effect after controlling for the other main effects but not the interaction (these do not sum to SStotal).
III: Evaluates each effect controlling for all other effects, including the interaction: the reduction in SSerror from including one term while controlling for all other terms. It is typically used for unbalanced designs because it compares unweighted means and does not depend on cell sizes or order of entry.

35
Q

What are “incremental contributions to sum-of-squares” and how can they be used to define the three types of sum-of-squares?

A

The incremental contribution of a term is the reduction in SSerror (equivalently, the increase in the model SS) when that term is added to a model already containing some set of other terms.
Type I: the increments as terms are added sequentially, in the order entered.
Type II: the increment for each effect over a model containing all other effects except interactions involving it.
Type III: the increment for each effect over the model containing all other terms, including the interaction.

36
Q

What problem would a lack of independence among treatment levels cause?

A

It can hide experimental effects: treatment effects may be masked by individual differences that are pooled into the error term.

37
Q

How does partialing out individual differences result in independent errors, and a smaller error term?

A

With repeated measures we have correlated errors because the same subjects appear at every treatment level. That correlation reflects systematic, stable individual differences, which we can estimate and remove from the error term; removing systematic variance is legitimate.
We control for individual differences between subjects by removing them as a source of error, which leaves errors that are (closer to) independent and an error term that is smaller.

38
Q

What are the two sources of within-group variance in an ANOVA, and how is one of those sources controlled for in a within-subjects design?

A

1) individual differences between subjects
2) random error
Control for (1) by removing individual differences from the error term.

39
Q

How do we compute sum-of-squares between the subjects? What are the sources of variance within a subject?

A

Sources of variance within a subject: treatment differences and error (including the subject x treatment interaction).
Compute SSbetween-subjects from each subject's mean: SSbs = k * sum((subject mean - grand mean)^2), where k is the number of treatment levels.
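
A minimal sketch on a made-up repeated-measures layout (4 subjects x 3 treatments) of the between/within-subject partition:

```python
import numpy as np

# Rows = subjects, columns = treatment levels (toy numbers).
scores = np.array([[2.0, 4.0, 5.0],
                   [3.0, 5.0, 7.0],
                   [1.0, 2.0, 4.0],
                   [4.0, 6.0, 6.0]])
n_subj, k = scores.shape
grand = scores.mean()
subj_means = scores.mean(axis=1)
treat_means = scores.mean(axis=0)

ss_total = np.sum((scores - grand) ** 2)
ss_between_subj = k * np.sum((subj_means - grand) ** 2)
ss_within_subj = np.sum((scores - subj_means[:, None]) ** 2)
ss_treat = n_subj * np.sum((treat_means - grand) ** 2)
ss_error = ss_within_subj - ss_treat  # subject x treatment interaction

print(ss_total, ss_between_subj + ss_within_subj)  # total partition
print(ss_within_subj, ss_treat + ss_error)         # within-subject partition
```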

40
Q

What does the subject by treatment interaction mean in a within subjects design?

A

This means that different subjects change differently over treatments

41
Q

Why, in the words of Howell, do “we seldom test the effect due to subjects”?

A

Because it is trivial: it is not interesting to demonstrate that people differ from one another.

42
Q

What is compound symmetry or sphericity? What happens when it is violated?

A

Compound symmetry: constant variances on the diagonal and constant covariances off the diagonal of the covariance matrix of the treatment levels.
Sphericity: all possible pairs of treatment difference scores have equal variance (a weaker condition, implied by compound symmetry).
If violated, the F statistic will not actually be distributed as F and the test becomes too liberal; corrections that adjust the degrees of freedom (e.g., Greenhouse-Geisser, Huynh-Feldt) deal with this.

43
Q

Higher correlations among treatment levels lead to greater power, why?

A

Because you get to remove a greater portion of the variability from the error term as systematic (between-subjects) variance, leaving a smaller error term and thus a larger F.

44
Q

Contrasts and Effect Sizes. What is the error term for a contrast and what are the complexities involved in determining the denominator in an effect size?

A

If sphericity holds, we can use a common error term (such as the treatment x subject interaction MS) when computing contrasts.
The denominator for an effect size, however, should be an error term created specifically for the particular contrast; deciding what that term should be is where the complexity lies.