Research Basics pt. 2 Flashcards

(56 cards)

1
Q

Parameter

A

-descriptive value for a population

2
Q

Statistic

A

-descriptive value for a sample

3
Q

Mean

A

-average
-most commonly used measure of central tendency
-only used with interval/ratio data
-influenced by outliers
-pulled toward the tail, opposite the mode

-μ (population), x̄ (sample)

4
Q

When shouldn’t you report the mean?

A

if you have outliers/extreme scores, the mean will be pulled towards the extremes (towards the tail) and will not provide a central value.

5
Q

Variance

A

-SD^2 or Σ(score − mean)^2 / (n − 1)

-σ^2

6
Q

Standard Deviation

A

-average distance between a score and the mean
-square root of Σ(score − mean)^2 / (n − 1)

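A minimal sketch of the variance and SD formulas from the two cards above, assuming NumPy is available and using a small hypothetical sample:

```python
import numpy as np

scores = np.array([4, 7, 8, 5, 6])  # hypothetical sample scores

mean = scores.mean()
# sample variance: sum of squared distances from the mean, divided by n - 1
variance = ((scores - mean) ** 2).sum() / (len(scores) - 1)
# standard deviation: square root of the variance
sd = np.sqrt(variance)

# NumPy's built-ins with ddof=1 use the same n - 1 formula
assert np.isclose(variance, scores.var(ddof=1))
assert np.isclose(sd, scores.std(ddof=1))
```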
7
Q

Frequency Distribution

A

-organized picture of an entire set of scores

-histogram, smooth curve, stem and leaf

8
Q

Smooth Curve

A
-emphasizes that the distribution is NOT showing the exact frequency for each category
-want it to be symmetrical (normal curve, where the mean and median are equal)

9
Q

1 SD in a normal distribution

A

68.26% of scores fall within ±1 SD of the mean (34.13% on each side)

10
Q

2 SD in a normal distribution

A

95.44% of scores fall within ±2 SD of the mean (an additional 13.59% between 1 and 2 SD on each side)

68-95-99.7 Rule

11
Q

3 SD in a normal distribution

A

99.72% of scores fall within ±3 SD of the mean (only 0.13% beyond 3 SD in each tail)

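A small check of the three percentages above, assuming SciPy is available; norm.cdf gives the area under the standard normal curve up to a z value:

```python
from scipy.stats import norm

for k in (1, 2, 3):
    # proportion of scores within ±k SD of the mean in a normal distribution
    area = norm.cdf(k) - norm.cdf(-k)
    print(f"within ±{k} SD: {area:.2%}")
# prints approximately 68.27%, 95.45%, 99.73%
```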
12
Q

Histogram

A

-shows all the frequencies of the distribution

13
Q

positive vs. negative skew

A

-non-symmetrical distribution
-named for the tail

Positive: scores pile up at low values, tail points to high values

Negative: scores pile up at high values, tail points to low values

14
Q

Kurtosis

A

-peakedness of the distribution

15
Q

Leptokurtic

A

-skyscraper
-higher and thinner peak
-low variability
-easier to get significance

16
Q

Platykurtic

A

-hill
-lower peak
-higher variability
-harder to get significance

17
Q

Stem-And-Leaf Display

A
-preserves the original data values
-especially useful for small to moderately sized data sets
-each score is divided into a stem (leading digit) and a leaf (last digit)

18
Q

central tendency measures

A

describes the center of the distribution and represents the entire distribution of scores as a single number (mode, median, mean)

19
Q

Mode

A

-most frequent score
-can be used with all scales of measurement
-located at the peak; in a skewed distribution it is the measure farthest from the mean
-distributions can be bimodal or multimodal

20
Q

Median

A

-middle: 50% of the scores in the distribution have values equal to or less than the median
-used for ordinal, interval, or ratio data
-unaffected by outliers
-cannot be used to test for significant differences
-falls between the mean and the mode

21
Q

Variability

A

-how spread out the data is
-used in descriptive statistics (how spread out the scores are) and inferential statistics (how accurately the sample represents the population)
-measured by the range or SD

less variability -> better representation

22
Q

Range

A

-total distance from the lowest to the highest score

23
Q

SD in Normal Distribution

A

-≈68% of scores fall within 1 SD of the mean (≈34% on each side)
-≈95% of scores fall within 2 SD of the mean
-≈99.7% of scores fall within 3 SD of the mean

when standardized (z-scores), the mean is 0

24
Q

Z Score

A

-where a score is located relative to other scores
-# of SD above or below mean
-descriptive (where in curve) and inferential stats (reference to population)

z = (score − mean) / SD
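A minimal sketch of the z-score formula above, using hypothetical numbers:

```python
def z_score(score, mean, sd):
    # number of SDs the score sits above (+) or below (-) the mean
    return (score - mean) / sd

# hypothetical example: a score of 85 in a distribution with mean 70 and SD 10
print(z_score(85, 70, 10))  # 1.5 -> 1.5 SD above the mean
```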

25
Q

Inferential Statistics

A

-infer things about the population based on a sample

26
Q

Probability

A

-proportion of area under the curve
-a z score splits the curve into a body and a tail, each expressed as a proportion (%)

27
Q

Central Limit Theorem

A

-with a sample of about 30 or more, the sample will be closely related to the real population

28
Q

T-Test

A

-compares 2 groups
-used for smaller samples
-flatter curve than the normal distribution
-1-tailed: considers the full 0.05 in one tail = higher chance of significance
-2-tailed: splits the 0.05 between both tails (0.025 in each) = lower chance of significance

29
Q

F-Distribution

A

-used for ANOVA
-more than 2 groups or a factorial research design

30
Q

Chi-Square Distribution

A

-compares proportions of people in different groups
-compares observed frequencies to expected frequencies

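A minimal sketch of a chi-square test comparing observed to expected frequencies, assuming SciPy is available and using made-up counts:

```python
from scipy.stats import chisquare

observed = [30, 14, 16]  # hypothetical observed counts per group
expected = [20, 20, 20]  # expected counts if the groups were equal

stat, p = chisquare(f_obs=observed, f_exp=expected)
print(stat, p)  # a small p (< .05) means the proportions differ from what was expected
```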
31
Q

Standard Error of the Mean (SEM)

A

-value that describes the difference between the sample mean and the true population mean
-always smaller than the SD
-smaller = less sampling error
-sample SD / square root of n

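A minimal sketch of the SEM formula on the card above (sample SD / √n), assuming NumPy/SciPy and a hypothetical sample:

```python
import numpy as np
from scipy import stats

scores = np.array([10, 12, 9, 11, 13, 10, 12, 11])  # hypothetical sample

sem = scores.std(ddof=1) / np.sqrt(len(scores))  # sample SD / square root of n
assert np.isclose(sem, stats.sem(scores))        # SciPy computes it the same way
print(sem)
```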
32
Q

Point Estimate

A

-mean of the sample, used to estimate the population mean
-border of the box plot

33
Q

Interval Estimate (CI)

A

-confidence interval
-range around the sample mean that can include the real population mean
-span of the box plot

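A minimal sketch of a 95% confidence interval built around the point estimate, assuming SciPy and the same kind of hypothetical sample:

```python
import numpy as np
from scipy import stats

scores = np.array([10, 12, 9, 11, 13, 10, 12, 11])  # hypothetical sample

mean = scores.mean()                              # point estimate
sem = stats.sem(scores)                           # standard error of the mean
t_crit = stats.t.ppf(0.975, df=len(scores) - 1)   # critical t for a two-tailed 95% CI
ci = (mean - t_crit * sem, mean + t_crit * sem)
print(ci)  # range likely to include the real population mean
```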
34
Q

Box-Whisker Diagram (boxplot)

A

-Whiskers: range of scores
-Box: median (line), upper and lower quartiles (25th and 75th percentiles)

35
Q

Bar Graph

A

-nominal or ordinal data
-similar to a histogram but with spaces between bars

36
Q

Error Bar Charts

A

-bar shows the mean score
-error bars can show the CI, SD, or standard error of the mean

37
Q

Scatterplots

A

-show correlation
-can be grouped (R is important)

38
Q

Parametric Statistics

A

-analyze quantitative data
-t-test, ANOVA, regression
-must meet assumptions
-based on distributions, so data must be normally distributed

39
Q

Non-Parametric Statistics

A

-analyze qualitative data
-Spearman, Mann-Whitney U (independent t-test equivalent, compares mean rankings), Friedman's ANOVA, Wilcoxon (dependent t-test equivalent, compares mean rankings)
-used when data violate the assumptions or are nominal/ordinal

40
Q

Linear Regression

A

-shows relationships
-makes predictions

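A minimal sketch of a simple linear regression used to describe a relationship and make a prediction, assuming SciPy and made-up x/y data:

```python
from scipy import stats

x = [1, 2, 3, 4, 5, 6]                # hypothetical predictor
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.0]   # hypothetical outcome

result = stats.linregress(x, y)
print(result.slope, result.intercept, result.rvalue ** 2)  # relationship and r^2

# prediction for a new predictor value
x_new = 7
print(result.intercept + result.slope * x_new)
```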
41
Q

Parametric Assumptions of T-Test

A

-interval/ratio data
-normality
-homogeneity of variance
-free of extreme outliers
-independence of observations

42
Q

Normality

A

-a concern with smaller studies (n < 30)
-check skewness (greater than 2 is a problem)
-check histograms
-non-parametric check: Shapiro-Wilk test, where p > 0.05 is not significant (normality holds)

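A minimal sketch of the Shapiro-Wilk check described above, assuming SciPy and a hypothetical small sample:

```python
from scipy import stats

scores = [10, 12, 9, 11, 13, 10, 12, 11, 14, 9]  # hypothetical small sample (n < 30)

stat, p = stats.shapiro(scores)
# p > 0.05 -> not significant -> the normality assumption is not violated
print(stat, p)
```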
43
Q

Homogeneity of Variance

A

-variances across groups should be equal
-non-parametric check: Levene's test, want it to be not significant (p > 0.05)

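A minimal sketch of Levene's test for homogeneity of variance, assuming SciPy and two hypothetical groups:

```python
from scipy import stats

group_a = [10, 12, 9, 11, 13, 10]   # hypothetical scores, group A
group_b = [14, 15, 13, 16, 14, 15]  # hypothetical scores, group B

stat, p = stats.levene(group_a, group_b)
# p > 0.05 -> not significant -> the variances can be treated as equal
print(stat, p)
```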
44
Q

Free of Influential Outliers

A

-regression: Cook's distance (> 1 is bad)

45
Q

Independence of Observations

A

-scores must not follow a pattern over time
-scores from one participant can't influence another person's scores
-non-parametric check: Durbin-Watson

46
Q

Regression Assumptions

A

-linearity
-homoscedasticity
-outlier testing in regression

47
Q

Homoscedasticity

A

-a relationship statistic
-seen in a scatterplot's residual scores
-variance must be the same at all levels
-how close all of the points are to the line (r^2)
-heteroscedasticity is the opposite

48
Q

Linearity

A

-data points arranged in a linear pattern
-seen in a scatterplot

49
Q

Residual Score

A

-distance of a score from the regression line on the y-axis
-outliers have large residuals

50
Q

Standardized Residual

A

-distance from the line in terms of SD
-negative = under the line
-positive = over the line
-0 = on the line

51
Q

Solutions to Violated Assumptions

A

-trim the data
-winsorizing: substitute the outlier with the highest non-outlier score
-transform the data: take the log of the data
-bootstrapping in SPSS
-use non-parametric tests

52
Q

Critical Region

A

-in the tails
-outcomes unlikely to be caused by chance

53
Q

How To Increase Power

A

-increase effect size
-decrease variability
-increase sample size
-increase alpha
-use a 1-tailed test

54
Q

Independent T-Test

A

-compares 2 means of independent data
-different groups
-1 IV and 1 DV
-non-parametric equivalent: Mann-Whitney U

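A minimal sketch of an independent t-test alongside its Mann-Whitney U non-parametric equivalent, assuming SciPy and two hypothetical groups:

```python
from scipy import stats

group_a = [10, 12, 9, 11, 13, 10]   # hypothetical scores, group A
group_b = [14, 15, 13, 16, 14, 15]  # hypothetical scores, group B

t, p = stats.ttest_ind(group_a, group_b)        # parametric: compares the 2 independent means
u, p_np = stats.mannwhitneyu(group_a, group_b)  # non-parametric equivalent
print(p, p_np)
```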
55
Q

Repeated Measures T-Test (Dependent)

A

-compares matched pairs
-same participant tested twice
-more likely to be significant
-non-parametric equivalent: Wilcoxon signed ranks
-does not need homogeneity of variance (HOV)

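A minimal sketch of a repeated-measures (paired) t-test alongside its Wilcoxon signed-ranks equivalent, assuming SciPy and hypothetical scores from the same participants measured twice:

```python
from scipy import stats

time_1 = [10, 12, 9, 11, 13, 10, 12]   # hypothetical scores at time 1
time_2 = [12, 13, 11, 12, 15, 11, 14]  # same participants at time 2

t, p = stats.ttest_rel(time_1, time_2)    # parametric: compares matched pairs
w, p_np = stats.wilcoxon(time_1, time_2)  # non-parametric equivalent
print(p, p_np)
```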
56
Q

Bonferroni Correction

A

-limits alpha inflation (risk of a type 1 error) when testing the same data set multiple times
-divides alpha by the number of tests run

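A minimal sketch of the Bonferroni correction described above, using hypothetical p-values:

```python
alpha = 0.05
n_tests = 4                        # hypothetical number of tests run on the same data
corrected_alpha = alpha / n_tests  # 0.0125

p_values = [0.001, 0.020, 0.030, 0.300]  # hypothetical results from the 4 tests
for p in p_values:
    print(p, "significant" if p < corrected_alpha else "not significant")
```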