Basic Descriptive Data Analysis Flashcards
(39 cards)
frequency distribution table
rank-ordered scores showing the number of times each value occurred
categorical data contains what in a frequency distribution table
raw and relative frequency
continuous data contains what in a frequency distribution table
raw, relative, and cumulative frequency
raw frequency
how many observations fall into that category or interval, usually a whole number
example: 5 of 27 people were age 0-10
relative frequency
share of the entire sample that falls into that category or interval, in %
example: 5/27 = 18.5 %
cumulative frequency
running total of the relative frequencies (%) up to and including the indicated interval
class intervals (BIN)
defined range limits in which data is grouped
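A minimal sketch of how the raw, relative, and cumulative frequency columns could be built once data is grouped into class intervals. The ages, the bin width of 10, and the `frequency_table` helper are all illustrative assumptions, not part of the cards; the bins run 0-9, 10-19, and so on, so that intervals do not overlap.

```python
from collections import Counter

# Hypothetical ages for 27 people (illustrative data only)
ages = [3, 5, 7, 8, 9,
        12, 14, 15, 16, 18, 19,
        21, 22, 24, 25, 27, 28, 29,
        31, 33, 35, 36, 38,
        41, 44, 47, 52]

BIN_WIDTH = 10  # assumed class-interval width


def frequency_table(values, bin_width):
    """Return rows of (interval label, raw count, relative %, cumulative %)."""
    # Key each value by the lower limit of its class interval
    counts = Counter((v // bin_width) * bin_width for v in values)
    total = len(values)
    rows = []
    running = 0
    for lower in sorted(counts):  # intervals with no data are skipped
        raw = counts[lower]
        running += raw
        label = f"{lower}-{lower + bin_width - 1}"
        rows.append((label, raw, 100 * raw / total, 100 * running / total))
    return rows


table = frequency_table(ages, BIN_WIDTH)
for label, raw, rel, cum in table:
    print(f"{label:>7}  raw={raw:2d}  relative={rel:5.1f}%  cumulative={cum:5.1f}%")
```

With this made-up sample, the first interval holds 5 of 27 people, so its relative frequency rounds to 18.5 %, and the cumulative column reaches 100 % at the last interval.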
categorical variables
separate categories with no numeric relationship to one another, though some (ordinal) can still be ranked in order
what are two major statistical outputs of categorical data
frequency and percentage
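As a rough illustration, frequency and percentage for a categorical variable can be tallied in a few lines. The blood-type sample below is hypothetical and chosen only to keep the arithmetic easy to check.

```python
from collections import Counter

# Hypothetical categorical sample (illustrative data only)
blood_types = ["A", "O", "B", "O", "A", "AB", "O", "A", "O", "B"]

counts = Counter(blood_types)
total = len(blood_types)

# Map each category to (frequency, percentage of the whole sample)
summary = {cat: (n, 100 * n / total) for cat, n in counts.items()}

for cat, (n, pct) in sorted(summary.items()):
    print(f"{cat:>2}: frequency={n}  percentage={pct:.1f}%")
```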
what constitutes a normal distribution curve
bell shaped curve
symmetrical around the mean
what is the statistical significance of normal distribution?
many datasets approximately follow this bell-shaped curve, symmetrical around the mean
what does bimodal mean
“two humps” (two peaks) in the distribution; generally undesirable
suggestive of 2 different populations
left skewed
negatively skewed
tail is to the left
right skewed
positively skewed
tail is to the right
stem and leaf plot
used with continuous data
good for showing individual data
bad for large amounts of data
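A small sketch of a stem and leaf plot, assuming non-negative whole-number scores where the stem is everything but the last digit and the leaf is the last digit. The `stem_and_leaf` helper and the scores are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Map each stem (tens and above) to its sorted leaves (ones digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


scores = [12, 15, 21, 23, 23, 28, 30, 34, 47]  # hypothetical scores
plot = stem_and_leaf(scores)

# Print every stem in range so gaps in the data stay visible
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Every individual value can be read back from the plot, which is why it suits small datasets and becomes unwieldy for large ones.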
histogram
continuous data
we can get distribution curves with a histogram
good for showing midpoint of data and large amounts of data
bad for showing individual data
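For comparison, a text-only histogram sketch: values are counted per class interval and each count is drawn as a bar. The heights, the bin width of 10, and the `histogram_counts` helper are assumptions made for illustration.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count values per class interval, keyed by each interval's lower limit."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    lowest, highest = min(counts), max(counts)
    # Include empty intervals so the shape of the distribution is preserved
    return {lower: counts.get(lower, 0)
            for lower in range(lowest, highest + bin_width, bin_width)}


# Hypothetical heights in cm (illustrative data only)
heights_cm = [152, 158, 161, 163, 165, 166, 168, 170,
              171, 171, 173, 175, 178, 184, 196]

hist = histogram_counts(heights_cm, 10)
for lower, n in hist.items():
    print(f"{lower}-{lower + 9} | {'#' * n}")
```

The bars show the overall shape and where the data centers, but the individual values inside each interval are no longer recoverable.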
mean
equals sum of all values / total number of values
when is mean commonly used
used for measuring central tendency
when is mean less helpful
when outliers are present or the distribution is skewed
median
equals the middle value of the ranked data (the average of the two middle values when the count is even)
when is median most helpful
more helpful than the mean when outliers are present or the distribution is skewed
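A quick illustration of why the median holds up better than the mean when an outlier is present, using Python's standard `statistics` module. The income figures are hypothetical.

```python
from statistics import mean, median

# Hypothetical incomes in thousands (illustrative data only)
incomes = [30, 32, 35, 38, 40]
with_outlier = incomes + [400]  # one extreme value added

print("without outlier:", mean(incomes), median(incomes))
print("with outlier:   ", round(mean(with_outlier), 1), median(with_outlier))
```

In this sample the single outlier pulls the mean from 35 to roughly 95.8, while the median only moves from 35 to 36.5.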
mode
equals value that occurs most often
less commonly used compared to mean and median
what is the relationship between mean, median and mode with symmetrical data
mean = median = mode
what is the relationship between mean, median and mode with right skewed data
mode < median < mean
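The ordering of mode, median, and mean under skew can be checked on a small sample. This is a sketch on made-up data; the ordering is a rule of thumb for typical single-peaked skewed data rather than a guarantee for every dataset.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are low, with a long right tail
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

# Mirroring the values produces a left-skewed sample with a long left tail
left_skewed = [15 - x for x in right_skewed]

for name, data in [("right skewed", right_skewed), ("left skewed", left_skewed)]:
    print(f"{name}: mode={mode(data)}  median={median(data)}  mean={mean(data)}")
```

Here the right-skewed sample gives mode 2, median 3.0, and mean 4.5, so the mean is pulled toward the tail; the mirrored sample reverses the order.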