Lecture 1 Flashcards
(24 cards)
What is statistics?
A collection of procedures and principles for gathering and analyzing information to educate people and help them make better decisions when faced with uncertainty
What is data?
a collection of numbers or other pieces of information to which meaning has been attached
Types of Data
Qualitative
Quantitative
Qualitative data
- non-numeric (may contain numbers, but only as labels, not as quantities)
- nominal: categories without order
- ordinal: categories with order (ex: a 1-5 scale where 1 = agree and 5 = disagree)
Types of Quantitative
- ratio (discrete, continuous)
- interval (discrete, continuous)
Ratio
differences AND ratios are meaningful, natural starting point (meaningful 0)
- discrete: countable number of possible values
- continuous: any value within a range is possible (uncountably many values)
ex: height, weight, or a count of students (0 means none, and 10 is twice as many as 5)
Interval
differences are meaningful, ratios are NOT meaningful
- continuous: any value within a range is possible (uncountably many values)
- discrete: countable number of possible values
ex: temperature in °C or °F (can’t say 40 degrees is 2x hotter than 20 degrees, because 0 is not a true starting point)
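The temperature example can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `celsius_to_fahrenheit` and the two temperatures are assumptions chosen for the demo. It shows that the "2x" ratio disappears when the same temperatures are written in another unit, while the difference converts consistently.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius temperature to Fahrenheit."""
    return c * 9 / 5 + 32

# In Celsius the ratio looks like 2x ...
c_ratio = 40 / 20

# ... but the same two temperatures in Fahrenheit (104 and 68) give about 1.53,
# so the ratio depends on the unit and has no physical meaning.
f_ratio = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)

# Differences stay meaningful: a 20 degree gap in Celsius is a 36 degree gap in Fahrenheit.
f_diff = celsius_to_fahrenheit(40) - celsius_to_fahrenheit(20)

print(c_ratio, round(f_ratio, 2), f_diff)
```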
Stem-plot
a quick and easy way to put a LIST of numbers into order while getting a picture of their shape
Truncate
used in stem-plots: drop the extra digits instead of rounding them (ex: 23.8 becomes 23, not 24)
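A stem-plot with truncation can be sketched in a few lines. This is a minimal illustration, not a standard routine: the function name `stem_plot` and the sample data are made up, it assumes non-negative values, uses the tens digit as the stem and the ones digit as the leaf, and skips stems that have no leaves.

```python
from collections import defaultdict

def stem_plot(values):
    """Return stem-plot rows, with tens as stems and ones as leaves."""
    rows = defaultdict(list)
    for v in sorted(values):
        v = int(v)  # truncate: drop the decimal part instead of rounding
        rows[v // 10].append(v % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(rows.items())
    ]

data = [23.8, 12.9, 30.5, 15.2, 23.1, 21.7]  # hypothetical measurements
for row in stem_plot(data):
    print(row)
# 1 | 2 5
# 2 | 1 3 3
# 3 | 0
```

Note that 23.8 lands on leaf 3 rather than 4, which is the effect of truncating.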
Histograms
Pictures related to stem-plots
- uses ranges
- does not list every value
- good for large data sets
Mean
sum of all the data values, divided by the number of data values (AVERAGE)
Mode
Value that occurs most often
Median
middle value of the data set when it has been arranged in ascending order (smallest to largest); often the preferred measure of central tendency when there are outliers
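The three measures above can be compared with Python's standard `statistics` module. The numbers are invented for illustration. Adding one outlier moves the mean a long way but barely changes the median.

```python
import statistics

values = [30, 32, 32, 35, 38]   # hypothetical data
with_outlier = values + [400]   # same data plus one extreme value

mean_before = statistics.mean(values)            # 33.4
median_before = statistics.median(values)        # 32
mode_before = statistics.mode(values)            # 32

mean_after = statistics.mean(with_outlier)       # 94.5
median_after = statistics.median(with_outlier)   # 33.5 (average of 32 and 35)

print(mean_before, median_before, mode_before)
print(mean_after, median_after)
```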
Range of data set
difference between the highest and lowest values in the set
Measure of spread
Measures of spread describe how similar or varied the observed values are for a particular variable (data item)
Measure of shape
a distribution of data item values may be symmetrical or asymmetrical
- normal distribution
- skewed distribution
Right skew
- when the tail on the right side of the histogram is longer than the tail on the left and the mean is higher than the median
Left skew
- when the tail on the left side of the histogram is longer than the right and the mean is lower than the median
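The link between skew and the mean/median ordering can be seen on two small made-up data sets. This is only a sketch: the lists are hypothetical, and comparing mean to median is a rule of thumb rather than a formal skewness measure.

```python
import statistics

right_skewed = [1, 2, 2, 3, 3, 3, 4, 15]       # long tail toward large values
left_skewed = [1, 12, 13, 13, 13, 14, 14, 15]  # long tail toward small values

right_mean = statistics.mean(right_skewed)      # 4.125
right_median = statistics.median(right_skewed)  # 3.0
left_mean = statistics.mean(left_skewed)        # 11.875
left_median = statistics.median(left_skewed)    # 13.0

print(right_mean > right_median)  # True: the mean is pulled toward the right tail
print(left_mean < left_median)    # True: the mean is pulled toward the left tail
```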
Bin-Width
how big of an interval to use for each bar in a histogram
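The effect of bin width can be sketched by counting values per bin. The function name `histogram_counts` and the scores are assumptions for illustration. Each bin covers `[start, start + bin_width)`, and bins start at multiples of the width.

```python
def histogram_counts(values, bin_width):
    """Count how many values fall into each bin, keyed by the bin's start."""
    counts = {}
    for v in values:
        start = (v // bin_width) * bin_width  # left edge of the bin containing v
        counts[start] = counts.get(start, 0) + 1
    return dict(sorted(counts.items()))

scores = [52, 55, 61, 64, 68, 70, 71, 79, 85, 93]  # hypothetical test scores

narrow = histogram_counts(scores, 10)
wide = histogram_counts(scores, 20)

print(narrow)  # {50: 2, 60: 3, 70: 3, 80: 1, 90: 1}
print(wide)    # {40: 2, 60: 6, 80: 2}
```

A narrower width shows more detail, while a wider one smooths the picture into fewer bars.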
Range
max-min
Standard Deviation
roughly the average distance of values from their mean
Variance
standard deviation squared
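Standard deviation and variance can be computed by hand on a small made-up data set. This sketch assumes the population formulas (divide by n); the sample versions divide by n - 1 instead, and the cards do not say which one the lecture uses.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values
mean = sum(data) / len(data)     # 5.0

# Variance: the average squared distance from the mean (population form).
variance = sum((x - mean) ** 2 for x in data) / len(data)  # 4.0

# Standard deviation: the square root of the variance.
std_dev = math.sqrt(variance)    # 2.0

# The literal "average distance from the mean" is 1.5 here, so the
# standard deviation is only roughly that average, as the card says.
avg_distance = sum(abs(x - mean) for x in data) / len(data)

print(variance, std_dev, avg_distance)
print(statistics.pstdev(data))   # the standard library gives the same 2.0
```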
Percentile
a value below which a particular percentage of data points lie (ex: the median is the 50th percentile)
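The percentile definition can be checked directly by counting. The helper `percent_below` and the scores are hypothetical, and textbooks and software use several slightly different percentile conventions, so treat this as one simple reading of the definition.

```python
import statistics

def percent_below(values, x):
    """Return the percentage of data points strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # hypothetical scores

median = statistics.median(scores)                    # 77.5
share_below_median = percent_below(scores, median)    # 50.0
share_below_90 = percent_below(scores, 90)            # 70.0

print(median, share_below_median, share_below_90)
```

Half of the values fall below the median, which is why it is the 50th percentile.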
Quartiles
divide the ordered data into 4 equal-sized groups
Q1 -