# The Beast of Bias (L5) Flashcards

What are outliers?

Extreme scores that tend to bias parameter estimates. For example, they can pull the value of the mean towards them.

What is the assumption of normality?

The sampling distribution of the parameter needs to be normal (NOT the raw data). Violations of normality can influence parameter estimation and confidence interval estimation.

What is central limit theorem?

Tells us under what conditions we can expect normality. As long as the sample size is big enough (usually N=30 in theory, N=100 in practice), the sampling distribution will be approximately normal.
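A small simulation can make this concrete. The sketch below assumes a skewed (exponential) population, which is an illustrative choice and not part of the original notes; `sample_means` and `skewness` are hypothetical helper names. It draws many samples of size N and checks how skewed the resulting sample means are.

```python
import random
import statistics


def sample_means(n, reps=5000, seed=0):
    """Means of `reps` samples of size `n` from a skewed (exponential) population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


def skewness(values):
    """Population skewness: the average cubed z-score (0 for a symmetric shape)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


# Skew of the sampling distribution of the mean for increasing sample sizes.
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 30, 100)}
for n, skew in skew_by_n.items():
    print(f"N={n:>3}: skew of sample means = {skew:.2f}")
```

With N=1 the "means" are just the raw skewed scores; as N grows, the skew of the sample means should shrink towards 0, even though the population itself stays skewed.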

What is homogeneity of variance?

Different groups of data points have similar variances (SDs); the spread of scores is consistent across conditions.

What is heterogeneity of variance?

Different groups of data points do not have similar variances (SDs); the spread of scores differs across conditions.

How can we detect bias?

Through graphs, numbers and standardized residuals.

How many standard deviations outside the mean does a data point have to be for it to be considered an outlier?

More than 3 standard deviations (a common rule of thumb).
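As a rough illustration of this rule, the sketch below converts each score to a z-score and flags any score more than 3 SDs from the mean. The function name `flag_outliers` and the example scores are made up for illustration.

```python
import statistics


def flag_outliers(scores, threshold=3.0):
    """Return the scores whose z-score is more than `threshold` SDs from the mean."""
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation
    return [x for x in scores if abs((x - mean) / sd) > threshold]


# Hypothetical data: 19 ordinary scores plus one extreme score.
scores = [9, 10, 11, 10, 9, 10, 11, 10, 9, 11,
          10, 10, 9, 11, 10, 10, 9, 11, 10, 40]
print(flag_outliers(scores))  # → [40]
```

Note that the outlier itself inflates the mean and SD used to compute the z-scores, so in very small samples an extreme score may not reach the cut-off.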

How can we detect normality?

Through graphs, box plots, P-P/Q-Q plots, skew, kurtosis and K-S tests.

How do we correct observed problems?

1) Trim the data.

2) Winsorising.

3) Bootstrapping.

4) Transforming the data.
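The four corrections above can be sketched in a few lines each. This is a minimal illustration under stated assumptions, not a definitive procedure: the 10% proportion, the percentile-style bootstrap interval, the log transformation, and all function names are choices made for the example.

```python
import math
import random
import statistics


def trim(scores, proportion=0.1):
    """1) Trimming: drop the lowest and highest `proportion` of scores."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)
    return ordered[k:len(ordered) - k]


def winsorise(scores, proportion=0.1):
    """2) Winsorising: replace the extreme scores with the nearest score
    that is kept. Returns the scores in sorted order."""
    ordered = sorted(scores)
    k = int(len(ordered) * proportion)
    low, high = ordered[k], ordered[len(ordered) - k - 1]
    return [min(max(x, low), high) for x in ordered]


def bootstrap_ci(scores, reps=2000, alpha=0.05, seed=0):
    """3) Bootstrapping: resample with replacement and take the middle
    95% of the resampled means as a confidence interval."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(scores, k=len(scores))) for _ in range(reps)
    )
    lower = means[int((alpha / 2) * reps)]
    upper = means[int((1 - alpha / 2) * reps) - 1]
    return lower, upper


def log_transform(scores):
    """4) Transforming: a log transformation (positive scores only)."""
    return [math.log(x) for x in scores]


scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical data with one outlier
print(trim(scores))       # → [2, 3, 4, 5, 6, 7, 8, 9]
print(winsorise(scores))  # → [2, 2, 3, 4, 5, 6, 7, 8, 9, 9]
print(bootstrap_ci(scores))
print(log_transform(scores))
```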

What is the assumption of additivity and linearity?

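The assumption that the outcome is linearly related to each predictor and that, with several predictors, their combined effect is best described by adding their effects together. It is the most important assumption: if it is not met, the model is invalid even if all the other assumptions are.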

What is independence?

The errors in your model are not related to each other.

What are P-P plots and Q-Q plots?

Graphs used to check normality.

P-P plots: show the cumulative probability of a variable against the cumulative probability of a particular distribution.

Q-Q plots: the same, but expressed as quantiles.

If the points follow the diagonal line with minimal deviation, the data are normally distributed.
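To see what the two plots actually contain, the sketch below computes their coordinates against a normal distribution with the sample's own mean and SD. The function name `pp_qq_points`, the example scores, and the plotting position (i - 0.5)/n are assumptions for illustration; statistics packages may use a slightly different convention.

```python
import statistics


def pp_qq_points(scores):
    """Return (pp, qq): lists of (expected, observed) pairs for each plot."""
    ordered = sorted(scores)
    n = len(ordered)
    normal = statistics.NormalDist(statistics.fmean(ordered), statistics.stdev(ordered))
    # Observed cumulative probability attached to each ordered score.
    probs = [(i - 0.5) / n for i in range(1, n + 1)]
    # P-P: cumulative probability under the normal curve vs observed.
    pp = [(normal.cdf(x), p) for x, p in zip(ordered, probs)]
    # Q-Q: quantile expected under the normal curve vs observed score.
    qq = [(normal.inv_cdf(p), x) for x, p in zip(ordered, probs)]
    return pp, qq


scores = [12, 15, 11, 14, 13, 16, 10, 14, 13, 12]  # hypothetical data
pp, qq = pp_qq_points(scores)
for expected, observed in qq:
    print(f"expected quantile {expected:6.2f}   observed score {observed}")
```

If the scores are normal, each pair has roughly equal expected and observed values, which is what "following the diagonal line" means on the plot.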

What are Kolmogorov-Smirnov tests and Shapiro-Wilk tests?

Compare the scores in the sample to a normally distributed set of scores with the same mean and standard deviation.

If non-significant: the scores are normally distributed.

If significant: the scores are non-normally distributed.
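The comparison behind the K-S test can be sketched as follows. This computes only the test statistic D (the largest gap between the sample and the matching normal distribution); the significance value is normally read from statistical software and is not reproduced here. The function name `ks_statistic` and the simulated data are assumptions for illustration.

```python
import random
import statistics


def ks_statistic(scores):
    """Largest gap D between the sample's cumulative proportions and the
    cumulative probabilities of a normal curve with the same mean and SD."""
    ordered = sorted(scores)
    n = len(ordered)
    reference = statistics.NormalDist(
        statistics.fmean(ordered), statistics.stdev(ordered)
    )
    gaps = []
    for i, x in enumerate(ordered, start=1):
        expected = reference.cdf(x)
        # Check the gap just after and just before each step of the sample curve.
        gaps.append(max(i / n - expected, expected - (i - 1) / n))
    return max(gaps)


rng = random.Random(1)
roughly_normal = [rng.gauss(50, 10) for _ in range(200)]  # simulated normal scores
skewed = [rng.expovariate(0.1) for _ in range(200)]       # simulated skewed scores
d_normal = ks_statistic(roughly_normal)
d_skewed = ks_statistic(skewed)
print(f"D (roughly normal) = {d_normal:.3f}")
print(f"D (skewed)         = {d_skewed:.3f}")
```

A larger D means the sample departs further from the normal curve, which is what makes the test more likely to come out significant.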

What is Zskewness?

(S - 0) / SEskewness

S = skewness statistic; SEskewness = standard error of skewness.

Both values found in SPSS output.

Used to assess significance of skew.

What is Zkurtosis?

(K - 0) / SEkurtosis

K = kurtosis statistic; SEkurtosis = standard error of kurtosis.

Both values found in SPSS output.

Used to assess significance of kurtosis.
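Both z-score formulas (Zskewness and Zkurtosis) have the same shape, so one small function covers them. In this sketch the statistic and standard error values are hypothetical stand-ins for numbers read off SPSS output, and the 1.96 cut-off (significance at p < .05) is a conventional threshold assumed for the example, not something stated in these notes.

```python
def z_from_output(statistic, standard_error):
    """(statistic - 0) / SE: how far the skew or kurtosis value is from 0,
    measured in standard errors."""
    return (statistic - 0) / standard_error


def is_significant(z, cutoff=1.96):
    """True if |z| exceeds the cut-off (1.96 is assumed here for p < .05)."""
    return abs(z) > cutoff


# Hypothetical values standing in for SPSS output.
z_skew = z_from_output(statistic=0.5, standard_error=0.25)
z_kurt = z_from_output(statistic=-0.25, standard_error=0.5)

print(z_skew, is_significant(z_skew))  # → 2.0 True
print(z_kurt, is_significant(z_kurt))  # → -0.5 False
```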