module 5 Flashcards by aimee latendresse

data fishiness assumptions

assumption of normality
assumption of homogeneity of variance
independence of observation

assumption of normality general definition

scores on the dependent variable within each group are assumed to be sampled from a normal distribution

NHST for evaluating normality general definition

tests if the sample distribution is significantly different from a normal distribution with the same mean and SD

which tests are used in the NHST approach to the assumption of normality

Shapiro-Wilk test
Kolmogorov-Smirnov test

skew and kurtosis definition and cut offs

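skew: asymmetry of the distribution (0 = symmetric, as in a normal distribution); descriptive approach cutoff: absolute value > 2
kurtosis: how heavy or light the distribution's tails are (heavy = high kurtosis/many outliers, light = low kurtosis/few outliers); descriptive approach cutoff: > 7
for both, a value of 1.96 or above when the statistic is divided by its standard error (a z score) is treated as non-normal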
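
The descriptive cutoffs on this card can be tried out with a minimal sketch like the one below. It assumes moment-based (population) formulas and excess kurtosis (normal = 0); the function names are hypothetical, and the defaults of 2 and 7 are simply the card's cutoffs.

```python
def skew_and_excess_kurtosis(scores):
    """Return (skewness, excess kurtosis) from population moments."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # variance (divisor n)
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # assumption: normal distribution scores 0
    return skew, excess_kurtosis


def flag_non_normal(scores, skew_cutoff=2.0, kurtosis_cutoff=7.0):
    """True if either descriptive cutoff from the card is exceeded."""
    skew, kurt = skew_and_excess_kurtosis(scores)
    return abs(skew) > skew_cutoff or kurt > kurtosis_cutoff


symmetric = [1, 2, 3, 4, 5]
one_extreme_score = [0] * 99 + [100]
print(flag_non_normal(symmetric))          # symmetric data, no flag expected
print(flag_non_normal(one_extreme_score))  # one huge score drives skew up
```

Statistical packages often apply small-sample corrections, so their skew and kurtosis values may differ slightly from this sketch.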

limitations of stat tests of normality

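sensitive to sample size: a big departure from normality is needed for significance in small samples, but a small departure is enough in large samples
yet non-normality is more of a concern in small samples and less of a concern in large ones
doesn't take the type of non-normality into account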

descriptive approach for evaluating normality definition

looks at descriptives and/or graphic displays to quantify the magnitude and nature of non-normality

____ kurtosis is more problematic than ____ kurtosis in t tests, ANOVAs, correlations, and regressions

positive, negative

which approach makes more sense for normality testing; NHST or descriptives

descriptives, because it combines threshold values with Q-Q plots

thin vs fat tails for normality distributions

thin: fewer extreme observations than normal distributions
fat: more extreme observations than normal distributions

if data is normal, scatterplot should resemble a _____

straight line (as opposed to cloud shape)
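
One way to see the straight-line idea is to build the points of a normal Q-Q plot by hand. The sketch below is illustrative only: it assumes the plotting positions (i + 0.5) / n, and it uses the correlation between theoretical and observed quantiles as a rough, hypothetical index of straightness rather than an established cutoff.

```python
from math import sqrt
from statistics import NormalDist


def qq_points(scores):
    """Pair each sorted score with the matching standard normal quantile."""
    n = len(scores)
    ordered = sorted(scores)
    std_normal = NormalDist()
    # Assumption: plotting positions (i + 0.5) / n for i = 0..n-1
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


def straightness(points):
    """Pearson correlation of the Q-Q points (near 1 = close to a straight line)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Scores placed exactly on normal quantiles versus strongly skewed scores
normal_like = [10 + 2 * NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
skewed = [2 ** i for i in range(20)]
print(straightness(qq_points(normal_like)))
print(straightness(qq_points(skewed)))
```

Plotting the pairs returned by `qq_points` gives the scatterplot the card describes; the correlation is only a numeric shortcut for "how straight".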

if the middle of the scatterplot line is straight and the ends flatten, it _____

indicates thin tails and is not problematic

if the middle of the scatterplot line is straight and the ends have a steep slope, it _____

indicates fat tails and is problematic

assumption of homogeneity of variance definition

the variance of scores on the dependent variable within each group (condition) is the same across all groups (conditions)

evaluating homo of variance; NHST approach definition

tests if the variances in the groups are significantly different from one another

evaluating homo of variance; descriptive approach

looks only at imperfection
looks at descriptive stats and/or graphic displays to quantify the magnitude of differential variances (largest vs smallest SD)
looks at a threshold for the ratio of the largest to the smallest variance
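
The largest-to-smallest variance ratio is simple to compute by hand. A minimal sketch follows, assuming sample variances (divisor n - 1) and a hypothetical function name; the card gives no specific threshold, so none is built in.

```python
from statistics import variance


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance across groups."""
    variances = [variance(scores) for scores in groups]
    return max(variances) / min(variances)


control = [1, 2, 3]
treatment = [2, 4, 6]
print(variance_ratio([control, treatment]))  # variances are 1 and 4
```

A ratio of 1 means identical variances; the further above 1, the larger the difference in spread between groups.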

tests for homo of variance

Levene's test
Hartley's variance ratio test (F-max test)

limitations of NHST approach for homo of variance

role of sample size (the test treats a difference in variance as less of a concern in small samples and more of a concern in larger ones)
insensitive to differences in variance in small samples and sensitive in big ones
a difference in variance is a problem of magnitude

if variances are equal, scatterplot should resemble a straight line with a slope of ___ and the intercept is ____ whereas when the variances are not equal, scatterplot will not cluster around the line and will be different from __

1, the difference between means, 1

independence of observation definition

each observation (between subjects) or set of observations (repeated measures) in the dataset is independent of all other observations/sets
ex of a lack of independence = roommates/partners

positive associations inflate ___ and negative associations inflate ___

alpha, beta

evaluating independence of observation

examine structural properties of the data to see if a basis exists for questioning the validity of the assumption
if there is no evident basis, it's okay to carry on
thresholds are up for debate
if a basis exists, independence can be assessed by computing the intraclass correlation for the part of the data that is suspected of lacking independence
if the correlation is very small (<0.10), it's fine to use a t test/ANOVA
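
As an illustration of the intraclass correlation check, here is a minimal sketch that assumes a one-way ICC with equal-sized clusters (for example roommate pairs). The function names are hypothetical, and the 0.10 default is the card's threshold, applied here to the absolute value as an assumption.

```python
def intraclass_correlation(clusters):
    """One-way ICC for equal-sized clusters (e.g. pairs of roommates)."""
    g = len(clusters)      # number of clusters
    k = len(clusters[0])   # scores per cluster (assumed equal for all clusters)
    grand_mean = sum(sum(c) for c in clusters) / (g * k)
    cluster_means = [sum(c) / k for c in clusters]
    ms_between = k * sum((m - grand_mean) ** 2 for m in cluster_means) / (g - 1)
    ms_within = sum(
        (x - m) ** 2 for c, m in zip(clusters, cluster_means) for x in c
    ) / (g * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


def independence_plausible(icc, threshold=0.10):
    """Card's rule of thumb: a very small correlation is acceptable."""
    return abs(icc) < threshold


roommate_pairs = [[1, 1], [5, 5], [9, 9]]  # partners score identically
icc = intraclass_correlation(roommate_pairs)
print(icc, independence_plausible(icc))
```

The sketch does not handle unequal cluster sizes or data with zero total variance; both would need extra care.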

address violation for normality

use alt stat procedures that don't need normality
evaluate level of measurement assumptions
identify and remove outliers
transform data to normalize distribution

address violations of homo of variance

use alt procedures that don't need homogeneity of variance
evaluate level of measurement assumptions
identify and remove outliers

addressing violations of independence of observations

- alt stat procedures, ex multilevel modeling (MLM) or hierarchical linear modeling (HLM)

outliers definition

- extreme values that differ largely from the other observations in the dataset and suggest they're drawn from another population

examples of common outliers

- data entry/encoding error (less common now that data entry is no longer manual)
- response latency data (longer response times due to distortion or error, ex distraction)
- open ended estimate data

problems with outliers

- often responsible for violations of homogeneity of variance/normality
- conceptual validity
- disproportionate influence on stat results

identifying outliers

- impossible values in frequency tables/histograms
- steep tails in normal Q-Q plots
- standardized residuals for observations
- studentized deleted residuals

standardized residuals for observations

- index of deviation from the mean
- follows the z distribution
- normally distributed, N=100: about 1 value should be > 2.6
- normally distributed, N=1000: about 4 values should be > 3.0
- a general threshold of 4 or 5 is suggested
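
The expected counts on this card can be sanity-checked against the standard normal distribution. The sketch below assumes a two-tailed count (values beyond the cutoff in either direction); the function names are hypothetical. Under that assumption the counts come out a little under 1 for N=100 and a little under 3 for N=1000, so the card's figures are best read as rough guides.

```python
from statistics import NormalDist, mean, stdev


def standardized_residuals(scores):
    """z score of each observation: deviation from the mean in SD units."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]


def expected_count_beyond(n, cutoff):
    """Expected number of |z| values above cutoff in n normal observations."""
    upper_tail = 1.0 - NormalDist().cdf(cutoff)
    return n * 2.0 * upper_tail  # assumption: both tails are counted


print(expected_count_beyond(100, 2.6))
print(expected_count_beyond(1000, 3.0))
print(expected_count_beyond(1000, 4.0))  # far below 1, hence the 4-or-5 threshold
```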

studentized deleted residuals

- index of deviation from the mean (the target observation is not included in the mean and SD calculation)
- follows a t distribution with df = n-2
- sample of 100: value > 3.6 = outlier
- sample of 1000: value > 4.07 = outlier
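
The leave-one-out idea can be sketched for the simplest case, where each score is compared with the mean of all the other scores. This is an illustration under that assumption (a mean-only model, not a full regression), and the function name is hypothetical. The extra sqrt(n / (n - 1)) factor accounts for the uncertainty in the leave-one-out mean, which is what gives the statistic its n-2 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def studentized_deleted_residuals(scores):
    """Deviation of each score from the mean and SD of the remaining scores."""
    scores = list(scores)
    n = len(scores)
    residuals = []
    for i, x in enumerate(scores):
        rest = scores[:i] + scores[i + 1:]  # target observation left out
        m = mean(rest)
        s = stdev(rest)                     # SD of the other n-1 scores
        residuals.append((x - m) / (s * sqrt(n / (n - 1))))
    return residuals


scores = [1, 2, 3, 4, 100]
print(studentized_deleted_residuals(scores)[-1])  # the score of 100 stands out
```

Because the extreme score no longer inflates the mean and SD it is compared against, it stands out far more clearly than it would with an ordinary z score.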

response to outlier

- correct impossible values or treat them as missing data
- possible but highly discrepant values can be trimmed or capped to the most extreme value/specified values
- highly discrepant values can be treated as missing
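
Two of these responses, capping and treating values as missing, can be sketched as follows. The bounds are assumed to be chosen by the analyst in advance; the function names and the example limits are hypothetical.

```python
def cap_scores(scores, lower, upper):
    """Cap discrepant values at the specified lower and upper bounds."""
    return [min(max(x, lower), upper) for x in scores]


def mark_missing(scores, lower, upper):
    """Replace values outside the bounds with None (treated as missing)."""
    return [x if lower <= x <= upper else None for x in scores]


reaction_times = [1, 50, 300]
print(cap_scores(reaction_times, 0, 100))    # [1, 50, 100]
print(mark_missing(reaction_times, 0, 100))  # [1, 50, None]
```

Whichever response is used, the rule should be fixed before looking at the results.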

philosophical issues w outliers

- minimalist perspective: never touch the data; a strong rationale is needed for deleting/altering data (due to potential abuse)
- maximalist perspective: routine altering/deleting of values; outliers violate assumptions and are hard to interpret; must set clear rules/procedures to avoid abuse
- intermediate perspective: altering/deleting is justifiable with clear rules/procedures and high thresholds for outliers

levels of measurement

- nominal: # assignment is about group membership/categories (ex nationality)
- ordinal: # assignment is about rank order on a scale but does not reflect magnitude of difference (ex favourites: the difference between ranks 1-2 and 4-5 may not be the same)
- interval: # assignment is about rank order and magnitude of difference but not ratios (ex Celsius scale, 0 for freezing, 100 for boiling)
- ratio: # assignment is about rank order, magnitude of difference and ratios (ex mass, length)

what level of measurement has an absolute, meaningful zero (0) point

ratio

before conducting analyses (t test/ANOVA) and descriptive stats: they're only meaningful if the dependent variable has at least _______ properties

interval

module 5 Flashcards

(36 cards)