Exam 3 Flashcards

Question

Why is there no correlation between the covariate and the variate in the true experiment? With what two things can the covariate be correlated in a quasi-experiment with nonrandom assignment?

Answer 1

In a true experiment, random assignment makes the covariate uncorrelated (in expectation) with the treatment variable, although the covariate can still be correlated with the DV. In a quasi-experiment with nonrandom assignment, the covariate can be correlated with two things: assignment to treatment (because characteristics of the subject may be related to the condition the subject ends up in) and the criterion Y (to a greater or lesser degree). The correlation with treatment comes about because subjects are not randomly assigned.

Answer 2

In a true experiment, power is increased because the covariate partials out from the criterion a source of variation that is irrelevant to the predictor, which increases the power of the test for an effect. ANCOVA with a quasi-experiment might increase or decrease power for the test of the effect, depending on the relationship of the covariate to treatment and criterion. The covariate may partial out error variation in the criterion, or it may partial out pre-existing between-group differences on the criterion that should not be attributed to the treatment.

Answer 3

Within-class regression lines are the regressions of the criterion on the covariate within each of the conditions. ANCOVA assumes homogeneity of within-class regression, meaning that the b1s are equal in the two groups (the slopes are the same). If that assumption is met, the treatment effect is constant over all levels of the covariate (no interaction).

Answer 4

Yhat = b1 X + b2 C + b3 XC + b0
Yhat = b2 C + b3 XC + b1 X + b0
Yhat = (b2 + b3 X) C + (b1 X + b0)

The simple regression coefficient here, (b2 + b3 X), gives the difference in predicted Y (the elevation of the regression line) for the group coded 1 minus the group coded 0, at each specific value of X.
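As a rough illustration of the rearranged equation, the sketch below checks that (b2 + b3 X) equals the gap between the two groups' predicted scores at a given X. The coefficient values are hypothetical, picked only for the example, and C is assumed to be coded 0/1.

```python
# Hypothetical coefficients for Yhat = b1*X + b2*C + b3*X*C + b0 (illustration only).
b0, b1, b2, b3 = 10.0, 0.5, 4.0, -0.2


def predict(x, c):
    """Predicted score for covariate value x and group code c (0 or 1)."""
    return b1 * x + b2 * c + b3 * x * c + b0


def conditional_treatment_effect(x):
    """Simple regression coefficient of Y on C at a specific value of X."""
    return b2 + b3 * x


for x in (0, 10, 20, 30):
    gap = predict(x, 1) - predict(x, 0)  # group coded 1 minus group coded 0
    print(x, round(gap, 6), round(conditional_treatment_effect(x), 6))
```

With these made-up values the group difference shrinks as X increases and changes sign past X = 20, which is what a nonzero b3 implies.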

Answer 5

This procedure provides cutoff values of X (the covariate) beyond which the treatment effect (the difference in elevation of the two regression lines) is significant. It’s the procedure of testing conditional effects of treatment C at particular values of X. It tests whether two lines have significant differences at various points on X.

Answer 6

B1: the unweighted mean of the slopes of the groups
B2: the difference between the intercepts of the high coded group minus the low coded group
B3: the difference between the slopes of the high coded group minus the low coded group
B0: the unweighted mean of the two intercepts

Answer 7

Fixed effects regression means that the predictors have specified values that are systematically included in the sample. There is no probability distribution for the predictors. We sample X systematically and observe values of Y. All predictors need to be 100% reliable (zero measurement error), and we decide the range of values for the predictors. If we instead have multivariate normality, meaning each variable is normally distributed AND is also normally distributed conditional on each value of the other variables, then all inferences will still be correct.

Answer 8

The residual contains random variation in Y and specification errors of excluding relevant variables or specifying the wrong form of the relationship of predictors to criterion.

Answer 9

The mean structure refers to the coefficients and predicted scores. The variance structure refers to the error terms, the MS residual, and the standard error.

Answer 10

We are concerned about the estimates of coefficients, the estimates of standard errors, and the MS residual.

Answer 11

The conditional variance of Y for each set of fixed values of predictors is an estimate of the error variance. We assume homoscedasticity, which states that the conditional variances are equal across every combination of values of the predictors. Heteroscedasticity leaves the estimates of the regression coefficients unbiased, but the OLS standard errors become biased, and the direction of bias depends on the relationship of the error variance to a predictor. If the error variance increases as the predictor increases, the bias in the standard errors is negative and significance is overestimated. If the error variance decreases as the predictor increases, the bias is positive and significance is underestimated.

Answer 12

The assumption of normally distributed errors underlies the tests of significance of the multiple correlation and individual regression coefficients, as well as the confidence intervals. The assumption comes into play in inference. Nonnormality of errors does not create bias in the regression coefficients, but it may increase the standard errors relative to what they would be for OLS estimates if the data were normally distributed. t and F tests might be biased by nonnormality.

Answer 13

Errors can become correlated in quasi-experiments and in repeated measures designs. With nonrepeated measures, the measures may be taken on people who belong to the same group or who are related. With repeated measures, the repeated observations on the same individual over time will be correlated with one another. Correlated errors will show up in the mean structure but not in the variance structure, because they are correlated with the treatment. We will underestimate the error variance and have a positive bias in all our tests.

Answer 14

The ICC is the intraclass correlation. It is an index of how much clustering (non-independence) there is in the data. An ICC of 0 means independence. Even a small ICC will give us alpha inflation.
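To get a feel for why even a small ICC matters, the sketch below uses the standard design-effect approximation, 1 + (m - 1) * ICC for clusters of equal size m. That formula is not stated on the card and is brought in here as an assumption; the sample sizes are hypothetical.

```python
def design_effect(icc, cluster_size):
    """Approximate variance inflation from clustering (equal cluster sizes assumed)."""
    return 1 + (cluster_size - 1) * icc


def effective_sample_size(n_total, icc, cluster_size):
    """Number of independent cases the clustered sample is roughly worth."""
    return n_total / design_effect(icc, cluster_size)


# Hypothetical study: 400 people in clusters of 20.
for icc in (0.0, 0.01, 0.05, 0.10):
    deff = design_effect(icc, 20)
    n_eff = effective_sample_size(400, icc, 20)
    print(f"ICC={icc:.2f}  design effect={deff:.2f}  effective n={n_eff:.0f}")
```

Treating the 400 cases as independent when the effective n is much smaller is what makes the standard errors too small and the tests too liberal.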

Answer 15

The effects of correlated errors (autocorrelation) are: the regression coefficients remain unbiased; the regression coefficients may be highly unstable across replications; and the MS residual may substantially underestimate the true amount of residual variance in the population. The sample estimates of the standard errors of the regression coefficients may underestimate the corresponding parameters (t-tests for coefficients are positively biased). The fix is to do multilevel modeling.

Answer 16

These plots show the distribution of a variable as a histogram overlaid with a nonparametric smooth. They are helpful in identifying skew in distributions and outliers.

Answer 17

Normal probability plots detect nonnormality and outliers in the distribution of a single variable, such as the residuals. The scores are ranked from lowest to highest and plotted as a function of the scores that would have been obtained if the variable were normally distributed. If the actual scores are normally distributed, the data points of the graph fall on a straight line. Nonnormality may look like a light tail or a heavy tail. Skewed distributions have one heavy and one light tail. Outliers appear as points toward the upper right or lower left.

Answer 18

The partial regression leverage plots allow you to identify the specific predictor that is leading to the difficulties such as model misspecification. In these plots it is also easier to see how a particular case is distorting the regression coefficients. Case 69 example 6

Answer 19

The breakdown point is the proportion of outlying points in a sample that it takes to change the values of estimates of regression coefficients away from those that would be obtained if no errant points were present. For OLS estimators, the breakdown point is said to be: 1/n

Answer 20

Leverage, distance, and influence.

Answer 21

Outliers are extreme points in a distribution. In multiple regression, outliers are defined conditionally: they are data points whose Y scores are unexpected, given their X scores or position in the predictor space. These are scores that do not follow the regression model.

Answer 22

Leverage is based on the predictors. It is the potential for a point to move the regression line. The question is whether the case is extreme on the predictors (no model is needed). High leverage does not necessarily mean that the point is affecting the regression outcome.

Answer 23

Distance is based on the residuals. The question is whether the point is extreme on Y, given X. A model is needed. High distance does not necessarily mean that a point is affecting the regression outcome, but it can distort the standard errors.

Answer 24

Influence is a function of both leverage and distance. Influential points distort the regression equation/surface. A model is needed. High influence means that the point is affecting the regression outcome.

Answer 25

We would delete the point and rerun the analysis. The resulting measure is DFBETAS: a measure of standardized change in regression coefficients when a case is deleted.

Answer 26

The hat diagonals measure the distance of a case from the centroid of the predictors (leverage).
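For the one-predictor case the hat diagonal has a simple closed form, h_i = 1/n + (x_i - mean)^2 / SS_x, which makes the "distance from the centroid" idea concrete. The sketch below assumes simple regression with an intercept; the data are made up.

```python
def hat_diagonals(x):
    """Leverage of each case in a simple regression with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / ss_x for xi in x]


# Made-up predictor scores; the last case sits far from the centroid.
x = [1, 2, 3, 4, 20]
h = hat_diagonals(x)
for xi, hi in zip(x, h):
    print(xi, round(hi, 3))
```

Note that only the predictor scores enter the computation, so no model for Y is needed, and the leverages sum to the number of estimated coefficients (2 here).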

Answer 27

The basis of all measures of distance is the residuals.

Answer 28

Clustered errant points can mask each other in diagnostic analysis. Clusters can make it difficult to determine which cases to delete. If you have a cluster of influential cases, all working in the same way to change the regression outcome, and you remove one of the cases, the remaining cases in the cluster will continue to exert influence. Therefore removing a single case does not show the impact of that case on the analysis.

Answer 29

The effect size is the degree to which the treatment changes treated subjects relative to control subjects. It’s the degree to which the phenomenon is present in the population, or the degree to which the null hypothesis is false. It’s a measure of how far an effect is from the null value (typically zero) in the population.

Answer 30

Effect size is a function of the ratio of systematic variance (of predictors or tx manipulation) to error or residual variance. It is the proportion of Y variance accounted for by the source in question relative to the proportion of error variance. Effect sizes are unitless indices that do not depend on the scale of measurement.

Effect Size = Systematic variance accounted for / Error variance

Cohen’s d is a measure of the difference between the means of two groups (usually experimental and control) relative to the within-group standard deviation: the systematic difference in the numerator divided by the random variability in the denominator.

Effect size for multiple regression is defined in squared terms, f2. For a multiple correlation, f2 = R2 / (1 - R2). For a set of predictors B over and above another set A, f2 is the increment in the squared multiple correlation due to B divided by the proportion of variance left unexplained by the full equation: f2 = (R2 with A and B - R2 with A only) / (1 - R2 with A and B).
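The two effect size indices named above can be sketched as follows. The code assumes the conventional definitions (Cohen's d with a pooled within-group standard deviation in the denominator, and f2 = R2 / (1 - R2)); the scores are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treated, control):
    """Mean difference divided by the pooled within-group standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_var = ((n1 - 1) * variance(treated) + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    return (mean(treated) - mean(control)) / sqrt(pooled_var)


def f_squared(r2):
    """Systematic variance over error variance for a full set of predictors."""
    return r2 / (1 - r2)


def f_squared_set(r2_full, r2_reduced):
    """f2 for a set of predictors added to an equation that already has other predictors."""
    return (r2_full - r2_reduced) / (1 - r2_full)


# Invented scores for two small groups.
treated = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]
print(round(cohens_d(treated, control), 3))
print(round(f_squared(0.26), 3), round(f_squared_set(0.26, 0.13), 3))
```

Both indices put systematic variation in the numerator and random variation in the denominator, which is why neither depends on the measurement scale.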

Answer 31

Pearson product moment correlations of .1 (small), .3 (moderate), and .5 (large) account for 1%, 9%, and 25% of the variance, respectively. For multiple correlations, the small, moderate, and large effect sizes correspond to R (not squared) of .14, .36, and .51, respectively. This translates to a squared multiple correlation (variance accounted for) of 2%, 13%, and 26%, respectively.
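The arithmetic behind these benchmarks can be checked directly. The sketch below squares each correlation to get variance accounted for, and converts the multiple correlations to f2 using f2 = R2 / (1 - R2), a conversion assumed here from Cohen's conventions.

```python
def variance_accounted_for(r):
    """Proportion of variance accounted for by a correlation r."""
    return r ** 2


def r_to_f_squared(r):
    """Convert a (multiple) correlation to f2 = R2 / (1 - R2)."""
    r2 = r ** 2
    return r2 / (1 - r2)


for label, r in (("small", 0.10), ("moderate", 0.30), ("large", 0.50)):
    print(f"Pearson r {label}: r={r:.2f}  variance={variance_accounted_for(r):.0%}")

for label, big_r in (("small", 0.14), ("moderate", 0.36), ("large", 0.51)):
    print(f"Multiple R {label}: R={big_r:.2f}  R2={variance_accounted_for(big_r):.0%}  "
          f"f2={r_to_f_squared(big_r):.2f}")
```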

Answer 32

The power of a statistical test is the probability that you actually detect a non-zero effect. It’s the probability of rejecting the null hypothesis, given that the null hypothesis is false.

Answer 33

The four factors involved in statistical inference are PANE:
P = power of the test (i.e., the probability of rejecting a false null)
A = the level of significance chosen (alpha)
N = the sample size (n)
E = the effect size

The experimenter can specify in advance the power desired (.8).
The experimenter can specify the alpha level in advance.
Estimates of effect size can come from past research or from the small, moderate, and large values set by Cohen.
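Because the four PANE quantities are tied together, fixing any three determines the fourth. The sketch below shows this for the simplest case, a two-group comparison of means, using a large-sample normal approximation in place of the exact noncentral t. That approximation and the function name are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_groups(d, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed test of a mean difference of size d.

    Normal approximation; the negligible rejection region in the wrong tail is ignored.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = d * sqrt(n_per_group / 2)
    return z.cdf(noncentrality - z_crit)


# Power rises with effect size (E) and sample size (N) for a fixed alpha (A).
for d in (0.2, 0.5, 0.8):
    for n in (20, 64, 200):
        print(f"d={d}  n per group={n}  power={approx_power_two_groups(d, n):.2f}")
```

Solving for N instead of P would mean increasing n_per_group until the returned power reaches the desired value, such as .8.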

Answer 34

The effect size would decrease. This highlights the discrepancy that may exist between a true effect size in the population, and the estimate of the effect size in a sample with predictors measured with error.

Answer 35

If we have an interaction between a continuous and a two-group categorical variable, this means that the regression of Y on X is different within the two groups (the within-class regression slopes differ in the two groups). This violates homogeneity of within-class regression, and the treatment effect estimate depends on the value of the covariate. It becomes a much more nuanced estimate.

Answer 36

Variable is 100% reliable

Answer 37

Attenuation means that the parameter estimate is closer to zero in the sample than in the population (shrinks toward zero). In contrast, negative bias means that the parameter estimate is closer to minus infinity.

Answer 38

Measurement error attenuates regression coefficients in the one-predictor case.
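A small simulation can make the attenuation visible. It assumes classical measurement error (random error added to the true predictor score, unrelated to everything else), under which the observed slope is expected to be about the true slope times the reliability of X. All values below are made up.

```python
import random


def ols_slope(x, y):
    """Slope from the regression of y on a single predictor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


random.seed(1)
n = 20000
true_slope = 2.0
error_sd = 0.5                           # error variance .25 on a true-score variance of 1
reliability = 1 / (1 + error_sd ** 2)    # = .80

x_true = [random.gauss(0, 1) for _ in range(n)]
y = [true_slope * xt + random.gauss(0, 1) for xt in x_true]
x_obs = [xt + random.gauss(0, error_sd) for xt in x_true]

slope_true_x = ols_slope(x_true, y)      # predictor measured without error
slope_obs_x = ols_slope(x_obs, y)        # predictor measured with error
print(round(slope_true_x, 2), round(slope_obs_x, 2), round(true_slope * reliability, 2))
```

The slope estimated from the error-laden predictor lands near reliability times the true slope, that is, it shrinks toward zero.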

Answer 39

Say we have a regression equation: Yhat = b1 X + b2 Z + b0. Measurement error in predictor Z will affect the b2 regression coefficient. The measurement error in Z will also affect the b1 regression coefficient. The bias in the coefficients can be in either direction, that is, it can increase or decrease a regression coefficient. In general, the direction of the bias is not known.

Answer 40

The reliability of the XZ interaction term is exactly equal to the product of the reliabilities of the individual variables for uncorrelated X and Z, that is, ρ(XZ,XZ) = ρ(XX) ρ(ZZ). Thus the reliability of the product term is lower than the reliability of the variables of which it is composed, and the sample size requirement for detecting interactions is large.
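A quick numeric check of the product rule shows how fast the reliability of the interaction term drops. The sketch applies the stated formula for uncorrelated X and Z; the reliability values are hypothetical.

```python
def product_term_reliability(rel_x, rel_z):
    """Reliability of the XZ product term when X and Z are uncorrelated."""
    return rel_x * rel_z


# Hypothetical reliabilities, using the same value for both predictors.
for rel in (0.9, 0.8, 0.7):
    print(f"rel_X = rel_Z = {rel:.1f}  ->  rel_XZ = {product_term_reliability(rel, rel):.2f}")
```

Two predictors that each look acceptable at .80 yield a product term with reliability of only .64.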

Answer 41

The case can move the regression plane toward itself, thereby reducing its own residual and increasing the residuals for all other cases, yielding a larger MS residual.

Answer 42

Carry out the regression analysis with the case removed; compute the predicted score and the standard error from the analysis with the case removed.

Answer 43

There is one measure for each case: the change in the predicted score of the case (in standardized z form) due to the inclusion of the case, which shows that the case is moving the regression plane. DFFITS is a global measure of standardized change based on a predicted score.

Answer 44

For each case, there is more than one measure: specifically, one measure of how the case is changing each regression coefficient, including the intercept. The change in each regression coefficient due to the inclusion of the case shows that the case is changing the regression model. DFBETAS measures the standardized change in regression coefficients when a case is deleted. Case IN MINUS case OUT.
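The "case IN minus case OUT" logic can be sketched by brute force for a one-predictor regression: refit without each case and standardize the change in the slope. The standardization used here (the residual standard deviation from the case-deleted fit divided by the square root of SS_x) is one common convention and is an assumption; the data are made up.

```python
from math import sqrt


def fit_line(x, y):
    """OLS fit of y on x; returns intercept, slope, residual SD, and SS_x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, sqrt(sse / (n - 2)), ss_x


def dfbetas_slope(x, y):
    """Standardized change in the slope for each case: case IN minus case OUT."""
    _, slope_in, _, ss_x = fit_line(x, y)
    values = []
    for i in range(len(x)):
        x_out, y_out = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        _, slope_out, sd_out, _ = fit_line(x_out, y_out)
        values.append((slope_in - slope_out) / (sd_out / sqrt(ss_x)))
    return values


# Made-up data: the last case is extreme on X and falls off the trend of the others.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 5.0]
for i, value in enumerate(dfbetas_slope(x, y)):
    print(i, round(value, 2))
```

The same loop run on the intercept would give the second DFBETAS value for each case, which is why DFBETAS yields one measure per coefficient rather than one per case.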

Answer 45

Underadjustment bias in the analysis of covariance refers to the failure of the covariate(s) to adjust completely for differences between groups. The underadjustment bias is due to unreliability of the covariates.

Answer 46

Having measurement error in the predictors X and Z entering the XZ interaction greatly increases the sample size needed to detect a true interaction.


(70 cards)