Week 8 Flashcards
(13 cards)
Front
Back
Definition: Median
The median is the middle value of a sorted dataset. If the number of values n is even, it’s the average of the two middle values.
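A minimal sketch of the median rule in Python, assuming a non-empty list of numbers. The function name `median` and the sample lists are illustrative, not part of the deck.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the rule only works on sorted data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    # even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median([7, 1, 5])        # middle of [1, 5, 7]
even_result = median([4, 1, 3, 2])    # average of 2 and 3
```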
Definition: Mean
The mean (average) is \(\bar{x} = \frac{1}{n}\sum x_i\). Add all values, then divide by the number of observations.
Formula: Variance
Sample variance is \(s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}\). It’s approximately the average squared deviation from the mean, with n − 1 rather than n in the denominator.
Formula: Standard Deviation (SD)
\(s = \sqrt{s^2}\), the square root of the sample variance. Taking the root returns the spread measure to the original data units.
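The mean, sample variance, and SD formulas can be traced step by step in Python. This is a sketch with made-up data; the variable names are illustrative.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values, not from the deck

n = len(data)
mean = sum(data) / n                              # x-bar = (1/n) * sum(x_i)
squared_devs = [(x - mean) ** 2 for x in data]    # (x_i - x-bar)^2 for each value
sample_variance = sum(squared_devs) / (n - 1)     # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)            # back to the original units
```

Here the squared deviations sum to 32, so the sample variance is 32/7 and the SD is its square root.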
Formula: Interquartile Range (IQR)
IQR = Q3 – Q1, the distance between the 75th and 25th percentiles.
Definition: Outlier (Boxplot)
A data point is often labeled an outlier if it lies more than 1.5 × IQR above Q3 or more than 1.5 × IQR below Q1.
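The IQR and the 1.5 × IQR rule can be sketched as a small check, assuming Q1 and Q3 are already known. The helper name `is_boxplot_outlier`, the multiplier parameter `k`, and the quartile values are hypothetical.

```python
def is_boxplot_outlier(x, q1, q3, k=1.5):
    """Return True if x lies more than k * IQR outside the quartiles."""
    iqr = q3 - q1                     # IQR = Q3 - Q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return x < lower_fence or x > upper_fence

q1, q3 = 3, 9                         # assumed quartiles: IQR = 6, fences at -6 and 18
flags = {x: is_boxplot_outlier(x, q1, q3) for x in (-7, -6, 10, 18, 50)}
```

In this sketch, values exactly on a fence (here -6 and 18) are not flagged, while -7 and 50 are.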
Formula: z-score
\(z = \frac{x - \bar{x}}{s}\). Indicates how many SDs above (+) or below (–) the mean a point is.
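A short sketch of z-scores using Python's standard `statistics` module, where `statistics.stdev` is the sample SD (n − 1 denominator). The data are made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]       # illustrative values

mean = statistics.mean(data)          # x-bar
sd = statistics.stdev(data)           # sample SD, s

# z = (x - x-bar) / s for every point
z_scores = [(x - mean) / sd for x in data]
```

Points below the mean get negative z-scores, points above get positive ones, and the z-scores of the whole sample sum to zero up to rounding.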
Definition: Histogram
A histogram is a bar chart for quantitative data: values are grouped into bins (usually of equal width), and each bar’s height shows how many values fall in that bin.
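Equal-width binning can be sketched as follows. This assumes the data are not all identical (so the bin width is nonzero) and that the maximum value is counted in the last bin; the function name `histogram_counts` is illustrative.

```python
def histogram_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins      # assumes hi > lo
    counts = [0] * num_bins
    for x in values:
        # clamp so the maximum value lands in the last bin
        idx = min(int((x - lo) / width), num_bins - 1)
        counts[idx] += 1
    return width, counts

width, counts = histogram_counts([1, 2, 2, 3, 5, 6, 8, 9, 9, 10], num_bins=3)
```

With these values the bins are [1, 4), [4, 7), and [7, 10], each 3 units wide.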
Steps: Boxplot Construction
1) Determine 5-number summary (min, Q1, median, Q3, max). 2) Draw box from Q1 to Q3. 3) Mark median. 4) Calculate fences at Q1 − 1.5 × IQR and Q3 + 1.5 × IQR. 5) Draw whiskers to the most extreme data points within the fences; mark points beyond them as outliers.
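The numbers behind these steps can be sketched without any plotting. This version assumes quartiles from `statistics.quantiles` with its default "exclusive" method; other textbooks and tools use slightly different quartile conventions. The function name `boxplot_stats` and the data are illustrative.

```python
import statistics

def boxplot_stats(values):
    """Compute the quantities needed to draw a boxplot by hand."""
    ordered = sorted(values)
    q1, med, q3 = statistics.quantiles(ordered, n=4)   # step 1: quartiles
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr                       # step 4: fences
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in ordered if lower_fence <= x <= upper_fence]
    return {
        "min": ordered[0], "q1": q1, "median": med, "q3": q3, "max": ordered[-1],
        "whisker_low": inside[0],                      # step 5: whisker ends
        "whisker_high": inside[-1],
        "outliers": [x for x in ordered if x < lower_fence or x > upper_fence],
    }

stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 50])
```

Note that the upper whisker stops at 10, the largest value inside the fence, rather than at the fence itself (18); the value 50 is marked separately as an outlier.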
Formula: Coefficient of Variation (CV)
\(CV = \frac{s}{\bar{x}} \times 100\%\). Measures spread relative to the mean.
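A sketch showing why CV is a relative measure: the same measurements expressed in different units give the same CV. The heights and the function name `coefficient_of_variation` are made up for illustration, and the data are assumed to have a positive mean.

```python
import statistics

def coefficient_of_variation(values):
    """CV = s / x-bar * 100, as a percentage (assumes a positive mean)."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]    # same data, different units

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)   # matches cv_cm up to rounding
```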
Definition: Correlation Coefficient (r)
r ranges from –1 to +1 and measures the strength and direction of the linear relationship between two numeric variables.
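The card gives no formula for r, so the sketch below assumes the usual Pearson definition (sum of cross-deviations divided by the square root of the product of the squared-deviation sums). The function name `pearson_r` and the data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)     # assumes neither list is constant

r_up = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])     # perfect increasing line
r_down = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])   # perfect decreasing line
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]) # y = x^2, not linear
```

The last case is a reminder that r only measures linear association: y depends perfectly on x there, yet r comes out at zero.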
Formula: Percentile Location
For the p-th percentile of n sorted data points, the location is \(i = \frac{p}{100}(n+1)\). If i isn’t an integer, interpolate between the two neighboring values.
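A sketch of the (n + 1) location rule with linear interpolation. Treating i as a 1-based position and clamping locations that fall outside 1..n to the smallest or largest value are assumptions of this sketch; the function name `percentile` is illustrative.

```python
def percentile(values, p):
    """p-th percentile using location i = p/100 * (n + 1), 1-based."""
    ordered = sorted(values)
    n = len(ordered)
    i = p / 100 * (n + 1)
    if i <= 1:                        # assumption: clamp below the first value
        return ordered[0]
    if i >= n:                        # assumption: clamp above the last value
        return ordered[-1]
    lower = int(i)                    # whole part of the location
    frac = i - lower                  # fractional part drives the interpolation
    below, above = ordered[lower - 1], ordered[lower]
    return below + frac * (above - below)

data = [10, 20, 30, 40, 50, 60, 70]   # n = 7, so i = p/100 * 8
p25 = percentile(data, 25)            # i = 2: exactly the 2nd value
p40 = percentile(data, 40)            # i = 3.2: 20% of the way from 30 to 40
p50 = percentile(data, 50)            # i = 4: the median
```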