Statistics Concepts Flashcards
(23 cards)
level of measurement***
refers to the way numbers are assigned to the categories of a variable (nominal, ordinal, interval, ratio)
discrete variable***
assumes distinct values; the number of values can be finite or infinite; there is always some minimum unit beyond which categories cannot be divided
continuous variable***
can assume any value on continuum
unit of analysis
refers to the specific group or entity on which statistical analysis is performed
data structure
types: cross-sectional, time series, panel
array
univariate visualization involving list of observations in increasing or decreasing order
stem-and-leaf plot
univariate visualization; Stem represents leading digit of number, Leaf represents trailing digits
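A minimal sketch of the stem/leaf split, assuming non-negative two-digit integers so that the tens digit is the stem and the units digit is the leaf. The `ages` data and the function name `stem_and_leaf` are hypothetical and used only for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted list of leaves (units digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 34 -> stem 3, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical data: ages of nine respondents
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]
plot = stem_and_leaf(ages)
for stem, leaves in plot.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```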
frequency distribution***
univariate visualization which counts number of times category repeated in data set
percent distribution***
univariate visualization which shows what percent of cases fall into each category
cumulative distribution***
univariate visualization which shows relative position of category in distribution (e.g. percentiles of GRE test takers)
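The three distributions above (frequency, percent, cumulative) can be built from the same counts. The following is a rough sketch; the `courses` data and the helper name `distributions` are hypothetical.

```python
from collections import Counter

def distributions(values):
    """Return (value, frequency, percent, cumulative percent) rows in sorted order."""
    counts = Counter(values)
    n = len(values)
    rows = []
    running = 0  # running count of cases at or below the current value
    for value in sorted(counts):
        running += counts[value]
        rows.append((value, counts[value], 100 * counts[value] / n, 100 * running / n))
    return rows

# Hypothetical data: number of courses taken by 10 students
courses = [1, 2, 2, 3, 3, 3, 3, 4, 4, 5]
table = distributions(courses)
for row in table:
    print(row)
```

The last column plays the role of a percentile: it gives the share of cases at or below each value.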
histogram***
univariate visualization which uses vertical bars to designate frequency/percent of category, no spaces between bars to convey that variable measured on continuum
pie chart/bar graph
Pie charts/bar graphs are univariate visualizations used to graph nominal or ordinal variables, most useful when variable being graphed has few categories
mean***
most commonly used measure of central tendency; restricted to interval/ratio variables because it involves arithmetic; also used with dichotomous variables (the mean of a 0/1 variable is the proportion of cases coded 1)
trimmed mean
reduces the influence of extreme values on the calculation of the mean by excluding a portion of the data in the tails of the distribution (e.g., a 10 percent trimmed mean discards the highest 10 percent and the lowest 10 percent of the data)
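A small sketch of a trimmed mean, assuming the stated proportion is dropped from each tail (conventions vary, and some texts split the proportion across both tails). The `incomes` data and the function name are hypothetical.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from EACH tail (assumed convention)."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of observations dropped per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value
incomes = [20, 22, 23, 25, 26, 27, 28, 30, 31, 500]
plain = sum(incomes) / len(incomes)    # pulled upward by the value 500
trimmed = trimmed_mean(incomes, 0.10)  # drops 20 and 500
print(plain, trimmed)
```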
median***
measure of central tendency (the middle value of the ordered observations) used with ordinal or higher levels of measurement
mode***
measure of central tendency used for all levels of measurement
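The three measures of central tendency can be compared with Python's standard `statistics` module. This is only an illustrative sketch; the `visits` and `employed` data are hypothetical.

```python
import statistics

# Hypothetical data: doctor visits per year for five people (one extreme value)
visits = [1, 2, 2, 3, 12]
mean_visits = statistics.mean(visits)      # arithmetic average, sensitive to the 12
median_visits = statistics.median(visits)  # middle value of the ordered data
mode_visits = statistics.mode(visits)      # most frequent value

# Hypothetical dichotomous variable coded 0/1: its mean is the proportion of 1s
employed = [0, 1, 1, 0, 1]
share_employed = statistics.mean(employed)

print(mean_visits, median_visits, mode_visits, share_employed)
```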
variance***
denoted s^2, most commonly used measure of variability; takes into account all observations of variable
standard deviation***
measure of variability, denoted s. A disadvantage of the variance is that it uses the squared deviations from the mean, so it is not expressed in the units of the original variable; taking the square root of the variance returns to the original metric
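A brief sketch of the relationship between s^2 and s. It assumes the sample formula with an n - 1 denominator, which the cards do not specify; the `scores` data and the function name are hypothetical.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed sample formula)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

# Hypothetical data: eight test scores
scores = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(scores)  # in squared units of the original variable
s = math.sqrt(s2)             # back in the original units
print(s2, s)
```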
range
measure of variability measuring the span of the data: the largest difference between any two observed values (maximum minus minimum)
interquartile range
modified version of the range, not as susceptible to outliers; measured as the range of the middle 50 percent of observations and equal to the difference between the third quartile (75th percentile) and the first quartile (25th percentile)
box plot
graphical device used to display univariate summary measures; displays 5 key measures on an axis: min, Q1, median, Q3, max
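The five measures a box plot displays, along with the range and interquartile range, can be sketched with the standard `statistics` module. Quartile conventions differ between textbooks and software; this example uses the default (exclusive) method of `statistics.quantiles`, and the `values` data and function name are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using statistics.quantiles' default method."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

# Hypothetical data with one outlier (40)
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40]
lo, q1, med, q3, hi = five_number_summary(values)
data_range = hi - lo  # stretched by the outlier
iqr = q3 - q1         # span of the middle 50 percent, unaffected by the outlier
print(lo, q1, med, q3, hi, data_range, iqr)
```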
skewness
skewness measures describe whether a distribution is symmetrical or skewed. Rightward skew = positive, leftward skew = negative
kurtosis
measure of distribution shape reflecting the peakedness or flatness of a distribution relative to the normal distribution; Leptokurtic = tall peak, kurtosis value > 3; Platykurtic = flat peak, kurtosis value < 3; Mesokurtic = normal-like peak, kurtosis = 3
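One common way to compute skewness and kurtosis is from the central moments of the data. The sketch below assumes the simple moment-based (population) formulas, under which a normal distribution has kurtosis 3; statistical packages often apply sample corrections or report excess kurtosis (kurtosis minus 3) instead. The data and function name are hypothetical.

```python
def moment_shape(values):
    """Return (skewness, kurtosis) from population central moments (assumed formulas)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical symmetric data: skewness 0, kurtosis below 3 (platykurtic)
sym_skew, sym_kurt = moment_shape([1, 2, 3, 4, 5])

# Hypothetical data with a long right tail: positive skewness
right_skew, right_kurt = moment_shape([1, 1, 1, 2, 10])

print(sym_skew, sym_kurt, right_skew, right_kurt)
```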