Three measures of center for quantitative variables
Mean, Median, Mode
Mean
Average
Finding average / sample mean
(y1 + y2 + ... + yn) divided by the total number of observations –> n
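The sample mean formula above can be sketched in Python as follows. The function name `sample_mean` and the data are made up for illustration.

```python
def sample_mean(observations):
    """Add up every observation and divide by n, the number of observations."""
    n = len(observations)
    if n == 0:
        raise ValueError("need at least one observation")
    return sum(observations) / n


scores = [4, 8, 15]          # hypothetical data: y1, y2, y3
print(sample_mean(scores))   # (4 + 8 + 15) / 3 -> 9.0
```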
Median
Middle
Finding median with odd number of observations
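Sort the observations; the median is the observation at position (n+1) divided by 2
Finding median with even number of observations
Sort the observations and take the two middle observations:
(lower middle observation + upper middle observation) divided by 2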
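A minimal sketch of both median cases, assuming the data is a plain Python list. The function name `median_of` is hypothetical. Note that position (n+1)/2 counts from 1, while Python indexes from 0.

```python
def median_of(observations):
    """Return the median, handling odd and even numbers of observations."""
    ordered = sorted(observations)   # the median requires sorted data
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counting from 1 is index n // 2 from 0.
        return ordered[mid]
    # Even n: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([7, 1, 5]))     # sorted: 1, 5, 7 -> 5
print(median_of([7, 1, 5, 3]))  # sorted: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```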
Outlier
Observation much smaller or greater than main body of observations
Is the median affected by the outlier?
Very little, if at all; the median is resistant to outliers, while the mean is not
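A quick check of how an outlier affects each measure of center, using Python's standard `statistics` module. The data values are made up for illustration.

```python
from statistics import mean, median

data = [10, 12, 13, 15, 16]      # hypothetical observations
with_outlier = data + [200]      # same data plus one extreme value

# The mean is pulled strongly toward the outlier.
print(mean(data), mean(with_outlier))

# The median only shifts to the next middle position.
print(median(data), median(with_outlier))
```

Here the mean jumps from about 13 to about 44, while the median only moves from 13 to 14.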
Mode
Value with the highest frequency (the value that occurs most often)
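One way to find the mode is to count how often each value occurs. This sketch returns a list because several values can tie for the highest frequency; the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(observations):
    """Return every value that occurs with the highest frequency."""
    if not observations:
        raise ValueError("need at least one observation")
    counts = Counter(observations)      # value -> frequency
    top = max(counts.values())          # the highest frequency
    return [value for value, count in counts.items() if count == top]


print(modes([2, 3, 3, 5, 5, 5, 7]))  # 5 occurs three times -> [5]
print(modes([1, 1, 2, 2]))           # a tie -> [1, 2]
```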
Three quartiles and corresponding percentiles
Q1 –> 25th percentile
Q2 –> 50th percentile (the median)
Q3 –> 75th percentile
Five number summary
min, Q1, Q2, Q3, max
Determining five number summary with odd number of observations
Find median –> Q2
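Don’t include median when calculating Q1 and Q3
Q1 = median of the observations below Q2 (if that half has an even count, add the two middle numbers and divide by two)
Q3 = median of the observations above Q2 (if that half has an even count, add the two middle numbers and divide by two)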
Find min and max
Determining five number summary with even number of observations
Find median –> Q2
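Split the sorted data into a lower half and an upper half; the two middle observations used for the median are included, one in each half
Q1 = median of the lower half (if that half has an even count, add the two middle numbers and divide by two)
Q3 = median of the upper half (if that half has an even count, add the two middle numbers and divide by two)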
Find min and max
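The two procedures above can be sketched in one function. This follows the rule in these notes: with an odd number of observations the median is left out of both halves, and with an even number the data is simply split in two. Other software may use different quartile rules and give slightly different Q1 and Q3. The names `_median` and `five_number_summary` are hypothetical.

```python
def _median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(observations):
    """Return (min, Q1, Q2, Q3, max) using the split-halves rule."""
    ordered = sorted(observations)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # Odd n: skip the median itself; even n: the upper half starts at index half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], _median(lower), _median(ordered),
            _median(upper), ordered[-1])


print(five_number_summary([1, 3, 5, 7, 9, 11, 13]))      # (1, 3, 7, 11, 13)
print(five_number_summary([1, 3, 5, 7, 9, 11, 13, 15]))  # (1, 4.0, 8.0, 12.0, 15)
```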
Frequency
Number of observations in a category
Calculating relative frequency / proportion
Frequency divided by the total of all frequencies (the total number of observations)
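A short sketch of turning frequencies into relative frequencies. The function name `relative_frequencies` and the color data are made up for illustration.

```python
from collections import Counter


def relative_frequencies(observations):
    """Map each category to its frequency divided by the frequency total."""
    counts = Counter(observations)   # category -> frequency
    total = sum(counts.values())     # total number of observations
    if total == 0:
        raise ValueError("need at least one observation")
    return {category: count / total for category, count in counts.items()}


colors = ["red", "blue", "red", "red"]   # hypothetical data
print(relative_frequencies(colors))      # {'red': 0.75, 'blue': 0.25}
```

The relative frequencies always add up to 1; multiply by 100 to report them as percents.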
Marginal distribution
Row totals or column totals of a contingency table (the distribution of one variable on its own)
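As a sketch, the marginal totals of a contingency table can be computed like this. The table is stored as a dictionary of rows, and both the layout and the counts are assumptions made up for illustration.

```python
# Hypothetical contingency table: class year (rows) by answer (columns).
table = {
    "freshman": {"yes": 60, "no": 40},
    "senior": {"yes": 20, "no": 80},
}


def marginal_totals(table):
    """Return (row totals, column totals) for a dict-of-dicts table."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    column_totals = {}
    for cells in table.values():
        for column, count in cells.items():
            column_totals[column] = column_totals.get(column, 0) + count
    return row_totals, column_totals


rows, columns = marginal_totals(table)
print(rows)     # {'freshman': 100, 'senior': 100}
print(columns)  # {'yes': 80, 'no': 120}
```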
Response variable (y)
Main variable
Explained by or depends on explanatory variable (x)
To get joint distribution from contingency table
Divide each cell count by the overall total, then multiply by 100 to get a percent
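A minimal sketch of the joint distribution calculation, assuming the contingency table is stored as a dictionary of rows. The table values and the function name `joint_percents` are hypothetical.

```python
# Hypothetical contingency table: class year (rows) by answer (columns).
table = {
    "freshman": {"yes": 60, "no": 40},
    "senior": {"yes": 20, "no": 80},
}


def joint_percents(table):
    """Divide every cell by the overall total and express it as a percent."""
    overall = sum(sum(cells.values()) for cells in table.values())
    return {
        row: {column: 100 * count / overall for column, count in cells.items()}
        for row, cells in table.items()
    }


joint = joint_percents(table)
print(joint["freshman"]["yes"])  # freshman AND yes: 60 out of 200 -> 30.0
```

All the joint percents in the table add up to 100.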
When to use joint distribution
When the sentence says “and”
When to use conditional distribution
When the sentence mentions a particular group (divide by that group's total, not the overall total)
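For a conditional distribution, each count is divided by the total of the group in question rather than the overall total. A sketch, assuming the groups are the rows of a dictionary-based table; the data and the function name `conditional_percents` are hypothetical.

```python
# Hypothetical contingency table: class year (rows) by answer (columns).
table = {
    "freshman": {"yes": 60, "no": 40},
    "senior": {"yes": 20, "no": 80},
}


def conditional_percents(table, group):
    """Percent of each answer among the observations in one group (row)."""
    cells = table[group]
    group_total = sum(cells.values())   # total for this group only
    return {column: 100 * count / group_total for column, count in cells.items()}


# "Among seniors, what percent said yes?" mentions a particular group.
print(conditional_percents(table, "senior"))  # {'yes': 20.0, 'no': 80.0}
```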
Three measures of spread for quantitative variables
Range, IQR, standard deviation (SD)
IQR
Difference between Q3 and Q1
Q3 - Q1
How to determine outliers
Upper fence = Q3 + 1.5 x IQR
Lower fence = Q1 - 1.5 x IQR
Outlier if higher than the upper fence or lower than the lower fence
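The fence rule can be sketched as follows, assuming Q1 and Q3 have already been found. The function names `fences` and `find_outliers` and the data are made up for illustration.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(observations, q1, q3):
    """Keep the observations that fall outside either fence."""
    lower_fence, upper_fence = fences(q1, q3)
    return [y for y in observations if y < lower_fence or y > upper_fence]


data = [1, 3, 5, 7, 9, 11, 13, 40]   # hypothetical data with Q1 = 4, Q3 = 12
print(fences(4, 12))                 # IQR = 8 -> (-8.0, 24.0)
print(find_outliers(data, 4, 12))    # only 40 is above the upper fence -> [40]
```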
Range
Largest observation - smallest observation
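A short sketch of the range, alongside the standard deviation named among the measures of spread. These notes do not define SD, so this example assumes the sample standard deviation as computed by `statistics.stdev` in Python's standard library. The data is made up.

```python
from statistics import stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical observations

data_range = max(data) - min(data)     # largest - smallest
sample_sd = stdev(data)                # assumption: sample SD (divides by n - 1)

print(data_range)                      # 9 - 2 -> 7
print(round(sample_sd, 2))
```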