Chapter 2 PESIN Flashcards
(18 cards)
variance
how scores differ from one another
idea of how representative the mean is of data
how much error exists
*SD (square root of variance) returns error to the original units of the scores, the same units z-scores are built on
2 kinds of variance
s^2: statistic, measure of the sample, estimates population parameters
sigma^2: parameter, measure of population. direct measure
* both tell us mean squared error in population
why different denoms for variance?
s^2 computed with n in the denom underestimates mean squared error in population.
because x-bar is used to estimate pop mean, will be off somewhat bc of sampling error
because x-bar is closest to all numbers in the sample, deviations from x-bar will be smaller than deviations from pop mean
so, making the denom smaller (n-1 instead of n) gives a bigger answer
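A minimal simulation sketch of the card above, assuming a made-up normal population (mean 100, SD 15) and samples of n = 5. The population, seed, and variable names are illustrative only. It compares the average of SS/n and SS/(n-1) across many samples with the population variance.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (an assumption, not real data).
population = [random.gauss(100, 15) for _ in range(10_000)]
sigma2 = statistics.pvariance(population)  # parameter: SS / N

n = 5
biased, unbiased = [], []
for _ in range(5_000):
    sample = random.sample(population, n)
    xbar = statistics.fmean(sample)
    ss = sum((x - xbar) ** 2 for x in sample)  # deviations taken from x-bar
    biased.append(ss / n)          # denominator n
    unbiased.append(ss / (n - 1))  # denominator n - 1

mean_biased = statistics.fmean(biased)
mean_unbiased = statistics.fmean(unbiased)
print(f"population variance:   {sigma2:.1f}")
print(f"average SS / n:        {mean_biased:.1f}")
print(f"average SS / (n - 1):  {mean_unbiased:.1f}")
```

On average the n version should come out too small, while the n-1 version should land close to the population value.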
degrees of freedom
number of freely varying scores to estimate a parameter
when parameter estimate (x-bar) is in formula for another (s^2), you lose a degree of freedom in the latter. so n-1
statistical models equation
outcome of i = model + error of i
mean as a model of scores
outcome = x-bar + error
usually use SS to describe total error in model. s^2 is better tho bc it's an estimate of mean squared error (MS), so it doesn't grow just bc there are more scores
low SS or MS: model fits well, little error
high SS or MS: model fits poorly, lot of error
if you replace x-bar with any other number, SS & MS increase
mean model = method of least squares
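A small sketch of the least squares idea, using six made-up scores. The data and the helper name `sum_of_squares` are hypothetical. It computes SS and MS around x-bar, then SS around a few other candidate values.

```python
import statistics

scores = [2, 4, 4, 5, 7, 8]  # hypothetical scores
xbar = statistics.fmean(scores)

def sum_of_squares(data, model_value):
    """Total squared error when every score is predicted by model_value."""
    return sum((x - model_value) ** 2 for x in data)

ss_mean = sum_of_squares(scores, xbar)
ms_mean = ss_mean / (len(scores) - 1)  # mean squared error, same as s^2

# Any other number used as the model gives a larger SS.
ss_other = {guess: sum_of_squares(scores, guess) for guess in (3, 4, 6, 7)}

print(xbar, ss_mean, ms_mean)
print(ss_other)
```

Every alternative value produces a larger SS than x-bar does, which is what "least squares" refers to.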
variable & parameter w/in mean as a model of scores
variable: measured constructs that vary across persons in sample
parameter: constants that describe relations b/w variables in pop.
*mean has no variables, just parameter x-bar
mean as model of population
outcome = x-bar of sample + sampling error
standard error
measure of typical amount of error in a model
- SD of sampling distribution
- sampling dist: probability dist. of a statistic
- mean model: prob dist. of x-bar
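A rough simulation sketch of the standard error, assuming a made-up normal population (mean 50, SD 10) and samples of n = 25. All names and numbers are illustrative. It builds a sampling distribution of x-bar, takes its SD, and compares that with the usual one-sample estimate s / sqrt(n).

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population (an assumption, not real data).
population = [random.gauss(50, 10) for _ in range(20_000)]
n = 25

# Sampling distribution of x-bar: the mean of many samples of size n.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(4_000)
]
sd_of_means = statistics.stdev(sample_means)  # SD of the sampling dist.

# What theory predicts, and what a single sample would estimate.
theoretical_se = statistics.pstdev(population) / math.sqrt(n)
one_sample = random.sample(population, n)
se_estimate = statistics.stdev(one_sample) / math.sqrt(n)

print(f"SD of sample means:    {sd_of_means:.2f}")
print(f"sigma / sqrt(n):       {theoretical_se:.2f}")
print(f"s / sqrt(n), 1 sample: {se_estimate:.2f}")
```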
central limit theorem
as sample size increases (n>30) sampling dist. approaches a normal dist., with x-bar, s^2, and standard error as best estimates of pop.
n>30, CI use z-score cutoffs
n<30 sampling dist. flatter than normal, use t cutoffs for CI
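A sketch of the n>30 case only, using z cutoffs from Python's standard library. The function name `z_confidence_interval` and the simulated sample are hypothetical. The t cutoffs needed for small samples are not in the standard library, so they are left out here.

```python
import math
import random
import statistics

def z_confidence_interval(data, confidence=0.95):
    """CI for the mean using z cutoffs; intended for samples with n > 30."""
    n = len(data)
    xbar = statistics.fmean(data)
    se = statistics.stdev(data) / math.sqrt(n)  # standard error of x-bar
    # Two-tailed cutoff, e.g. about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return xbar - z * se, xbar + z * se

random.seed(2)
sample = [random.gauss(100, 15) for _ in range(40)]  # hypothetical sample, n = 40

low, high = z_confidence_interval(sample)
low99, high99 = z_confidence_interval(sample, confidence=0.99)
print(f"95% CI: {low:.1f} to {high:.1f}")
print(f"99% CI: {low99:.1f} to {high99:.1f}")
```

The 99% interval is wider than the 95% interval because its cutoff is further out in the tails.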
CI rules (Cumming and Finch)
1. if 95% CIs barely touch end to end, p ≈ .01
2. if ends don't touch (there is a gap), p < .01
issues with one-tailed tests
- results could go in the opposite direction; if so, you must ignore them
- requires a smaller test statistic to be sig, so you can't switch to one-tailed after seeing the data
overall, it encourages cheating
NHST
null: effect is absent, assume this is true and set model to it
alternative: prediction from theory, effect will be present
p-value: probability of getting data at least this extreme if null is true. Fisher’s criterion: if that chance is under 5%, can be confident in alt.
decision errors
type 1: we think effect, there is not. reject true null (alpha=.05)
type 2: we think no effect, there is. fail to reject false null. (beta=.2)
familywise/experimenterwise error rate: collective error, multiple statistical tests conducted on same data will increase prob of type 1. (more than .05)
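A short sketch of how the familywise rate grows, using the common formula 1 - (1 - alpha)^k. This assumes the k tests are independent, which is an assumption the card itself does not state; the function name is hypothetical.

```python
def familywise_error(alpha, k):
    """Chance of at least one type 1 error across k independent tests."""
    return 1 - (1 - alpha) ** k

for k in (1, 3, 10):
    print(k, round(familywise_error(0.05, k), 3))
```

With alpha = .05, three tests already push the collective rate to roughly .14, and ten tests to roughly .40.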
issues with statistical significance
- if n is large, sig effect may be small
- rejecting null does not mean proven wrong, means data would be highly unlikely if null were true
- failing to reject null does not mean proven right, means not enough evidence to warrant rejecting
- encourages all or nothing thinking
effect size
standardized measure of the dif between conditions or the strength of a relation between variables
- standardized = comparable across studies
- not as reliant on sample size
small effect: r=.1, d=.2
medium effect: r=.3, d=.5
large effect: r=.5, d=.8
beware of canned effect sizes: size of effect must be placed in context of research
cohen’s d
shows the dif between two means in standard deviation units
2 groups, pooled SD, is comparing differences between means
although not affected by sample size, the larger n the closer estimated d will be to pop
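A minimal sketch of Cohen's d for two groups with a pooled SD. The scores and the function name `cohens_d` are made up for illustration.

```python
import math
import statistics

def cohens_d(group1, group2):
    """Difference between two means divided by the pooled SD."""
    n1, n2 = len(group1), len(group2)
    v1 = statistics.variance(group1)  # s^2 with n - 1 denominator
    v2 = statistics.variance(group2)
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    return (statistics.fmean(group1) - statistics.fmean(group2)) / pooled_sd

treatment = [6, 7, 8, 9, 10]  # hypothetical scores
control = [4, 5, 6, 7, 8]     # hypothetical scores

d = cohens_d(treatment, control)
print(round(d, 2))
```

Here the means differ by 2 points and the pooled SD is about 1.58, so d comes out above the .8 benchmark for a large effect.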
meta-analyses and effect size
takes effect sizes from many studies to get a more definitive estimate of the effect in pop.
*takes avg of all effect sizes
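A toy sketch of averaging effect sizes across studies. The three studies are invented, and weighting by sample size is an assumption added here (a common choice, but the card only says "avg"). Both the plain and the weighted average are shown.

```python
import statistics

# Hypothetical studies: (effect size d, sample size n). Not real results.
studies = [(0.30, 40), (0.50, 120), (0.20, 60)]

simple_avg = statistics.fmean(d for d, _ in studies)

# Sample-size-weighted average: bigger studies count for more (an assumption).
total_n = sum(n for _, n in studies)
weighted_avg = sum(d * n for d, n in studies) / total_n

print(round(simple_avg, 3), round(weighted_avg, 3))
```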