14. Descriptive Statistics Flashcards
(31 cards)
descriptive statistics
The branch of statistics dealing with how to describe and summarize data.
How can I communicate the important characteristics of my data?
frequency distribution
a table or chart showing the unique values of the data set, along with each value's frequency within the data set
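A frequency distribution can be tallied in a few lines. This is a minimal sketch using Python's standard library; the `scores` list is made-up sample data, not part of the deck.

```python
from collections import Counter

# Hypothetical sample data: quiz scores for ten students.
scores = [3, 5, 4, 5, 3, 5, 2, 4, 5, 3]

# Counter maps each unique value to the number of times it occurs.
frequency = Counter(scores)

# Print the distribution as "value count" rows, in ascending order of value.
for value in sorted(frequency):
    print(value, frequency[value])
```

The counts always sum to the number of observations, which is a quick sanity check on any frequency table.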
bar graph
used to depict a frequency distribution of CATEGORICAL variables (space between bars)
histogram
used to depict a frequency distribution of a QUANTITATIVE variable (no space between bars)
mean
sum of all values divided by number of values
$M = \frac{\sum x}{n}$
median
centermost value when the set is ordered (the mean of the two middle values when the number of values is even)
mode
most frequent value in a set
mean, median & mode
measures of central tendency
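The three measures can be compared side by side on one data set. The sketch below uses Python's built-in `statistics` module; the `data` list is hypothetical and was chosen so that the three measures differ.

```python
import statistics

# Hypothetical data set with an even number of values and one repeated value.
data = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(data)      # sum of all values divided by the number of values
median = statistics.median(data)  # average of the two middle values (3 and 5) here
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)
```

The large value 10 pulls the mean above the median, which illustrates why the median is often preferred for skewed data.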
nominal
a variable that can be CATEGORIZED, but not quantified.
ordinal
a variable that can be RANKED, but not quantified.
interval
a variable that can be QUANTIFIED on a scale with equal intervals, but with no true 0 (e.g., temperature in °C)
ratio
a variable that can be QUANTIFIED, where 0 indicates absence of quantity
variance
measure of the average squared distance to the mean, measured in squared units
standard deviation
measure of the typical distance to the mean, in the original units of the data (the square root of the variance)
variance (formula)
$\frac{\sum (x - M)^2}{n}$
standard deviation (formula)
$\sqrt{\frac{\sum (x - M)^2}{n}}$
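As a worked example of the two formulas, the sketch below computes the variance and standard deviation by hand, dividing by n as on the cards, and compares the results with `statistics.pvariance` and `statistics.pstdev`. The `data` list is hypothetical.

```python
import math
import statistics

# Hypothetical data set.
data = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(data)
m = sum(data) / n  # the mean, M

# Variance: the average squared distance to the mean (divides by n).
variance = sum((x - m) ** 2 for x in data) / n

# Standard deviation: the square root of the variance, in the original units.
sd = math.sqrt(variance)

# The standard library's population versions use the same divide-by-n definition.
print(variance, statistics.pvariance(data))
print(sd, statistics.pstdev(data))
```

Note that `statistics.variance` and `statistics.stdev` divide by n - 1 instead (the sample versions), so they give slightly larger values than the formulas on these cards.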
normal distribution
a distribution where about 68% of values fall within 1 standard deviation, 95% within 2 standard deviations, and 99.7% within 3 standard deviations of the mean
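The 68 / 95 / 99.7 figures can be checked numerically. This is a small sketch assuming Python 3.8 or later, where `statistics.NormalDist` is available; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 1))
```

The rounded percentages are approximations; the exact proportion within 2 standard deviations is slightly above 95%.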
unstandardized difference between means
compare two data sets
by finding the difference between the data set means, in natural units.
(i.e., $M_1 - M_2$)
cohen’s d
compare two data sets
by finding the difference between the data set means, in standardized units.
i.e., $d = \frac{M_1 - M_2}{SD}$
note: the SD can be that of set 1 or set 2
- 0.2 = small
- 0.5 = medium
- 0.8 = large
thresholds of effect size for interpreting Cohen’s d
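The two comparison cards and the thresholds can be combined in one sketch. Everything named here (`cohens_d`, `effect_label`, the `sd_from` parameter, and the sample sets) is hypothetical. The sketch makes two assumptions: the SD divides by n, as in the formula card, and each threshold is treated as a lower bound, so values under 0.2 get the made-up label "below small".

```python
import math

def mean(values):
    return sum(values) / len(values)

def population_sd(values):
    """Standard deviation using the divide-by-n formula."""
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))

def cohens_d(set1, set2, sd_from=1):
    """(M1 - M2) / SD, using the SD of set 1 (default) or set 2."""
    sd = population_sd(set1 if sd_from == 1 else set2)
    return (mean(set1) - mean(set2)) / sd

def effect_label(d):
    """Map |d| to the small / medium / large thresholds (assumed lower bounds)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"

# Hypothetical data sets.
set1 = [6, 8, 10, 12, 14]
set2 = [4, 6, 8, 10, 12]

raw_difference = mean(set1) - mean(set2)  # unstandardized, in natural units
d = cohens_d(set1, set2)                  # standardized, in SD units
print(raw_difference, round(d, 3), effect_label(d))
```

Parenthesizing the numerator matters: the difference of means is taken first, and only then divided by the SD.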
effect size
magnitude of relationship between two variables
Pearson correlation coefficient
a value in [-1, 1] that describes the magnitude (absolute value) and direction (sign) of the linear relationship between two variables, when no other variable is controlled for.
$\rho = \frac{\sum z_x z_y}{n}$
- only valid for linear relationships
- plot the data on a scatterplot first
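The z-score formula translates directly into code. The following is a minimal sketch in which `z_scores`, `pearson`, and the sample lists are hypothetical names and data; the z-scores use the divide-by-n standard deviation so that the sum of products divided by n stays within [-1, 1].

```python
import math

def z_scores(values):
    """Convert values to z-scores using the divide-by-n standard deviation."""
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / n)
    return [(v - m) / sd for v in values]

def pearson(xs, ys):
    """Average product of paired z-scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

print(round(pearson(xs, ys), 3))
print(round(pearson(xs, [10, 8, 6, 4, 2]), 3))  # perfectly linear, decreasing
```

The second call shows the sign carrying the direction: a perfectly decreasing linear relationship gives the minimum value of the coefficient.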
partial correlation coefficient
a value describing the magnitude and direction of the relationship between two variables, when one or more other variables are controlled for.
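For the simplest case, a single controlled variable z, the partial correlation can be computed from the three pairwise Pearson coefficients with the standard first-order formula (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). The sketch below is illustrative only: the function names and data are hypothetical, and controlling for several variables would need a more general method that is not shown here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_correlation(xs, ys, zs):
    """Correlation between x and y with the single variable z controlled for."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical observations of three variables.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]
zs = [1, 3, 2, 5, 4]

print(round(pearson(xs, ys), 3), round(partial_correlation(xs, ys, zs), 3))
```

Comparing the two printed values shows how much of the plain x-y correlation remains once z's linear influence is removed.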
curvilinear regression
technique used to determine the nature of the relationship between variables that have a curvilinear relationship (e.g., elliptical)