Topic 4: intro to measurement of frequency distributions Flashcards
(11 cards)
1
Q
Describe the frequency distribution for a histogram
A
- To describe a distribution = look at its overall pattern
- Overall pattern = shape + spread + center
- Connecting the bar tops with straight lines = too detailed (follows every bump)
- Smoothed curve = highlights the overall shape of the distribution
2
Q
Types of distribution for numeric variables
A
- Unimodal = 1 peak
- Bimodal = 2 peaks
- Multimodal = many peaks
- Normal = symmetrical, bell-shaped; both sides are mirror images
- Positively skewed = tail extends to the right
- Negatively skewed = tail extends to the left
3
Q
Describe the measures of central tendency in a negatively skewed distribution
A
- Mode = peak
- Median = shifted to the left of the mode
- Mean = pulled furthest to the left, toward the tail (typically mean < median < mode)
4
Q
Describe the measures of central tendency in a positively skewed distribution
A
- Mode = peak
- Median = shifted to the right of the mode
- Mean = pulled furthest to the right, toward the tail (typically mode < median < mean)
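
A minimal sketch of the ordering described in cards 3 and 4, using Python's standard statistics module. The sample values are made up for illustration and are not from the course material.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are low, a few are large.
positive_skew = [1, 2, 2, 2, 3, 3, 4, 5, 8, 12]

m_mode = mode(positive_skew)      # 2 (the peak)
m_median = median(positive_skew)  # 3.0
m_mean = mean(positive_skew)      # 4.2 (pulled toward the right tail)
print(m_mode, m_median, m_mean)

# Mirroring the sample flips the tail to the left (negative skew).
negative_skew = [13 - x for x in positive_skew]

n_mode = mode(negative_skew)      # 11
n_median = median(negative_skew)  # 10.0
n_mean = mean(negative_skew)      # 8.8 (pulled toward the left tail)
print(n_mode, n_median, n_mean)
```

In both samples the mode stays at the peak while the mean moves furthest toward the tail, with the median in between.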
5
Q
Describe the measures of central tendency in a normal distribution
A
- Mean = median = mode
- Data are distributed symmetrically around the mean, which is also the most common value (mode) and the middle point of the curve (median)
6
Q
Define outliers
A
- Observations that lie outside the overall pattern of the distribution
7
Q
What is the effect of an outlier on the mean/median?
A
- The mean is pulled toward the outlier (to the right for a high outlier, to the left for a low one)
- The median is largely unaffected
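
A small illustration of this card, assuming a made-up sample with one extreme high value added. It uses only Python's standard statistics module.

```python
from statistics import mean, median

# Hypothetical sample without and with a single high outlier.
values = [10, 11, 12, 13, 14]
with_outlier = values + [100]

mean_before = mean(values)            # 12
mean_after = mean(with_outlier)       # about 26.67, pulled to the right
median_before = median(values)        # 12
median_after = median(with_outlier)   # 12.5, barely moves

print(mean_before, mean_after)
print(median_before, median_after)
```

One extreme value more than doubles the mean here, while the median shifts by only 0.5.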
8
Q
How do you know when a boxplot is skewed?
A
- The box will not sit in the middle of the lines (whiskers)
- If the longer whisker extends to the right/above (higher values) = positive skew
- If the longer whisker extends to the left/below (lower values) = negative skew
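
The rule on this card can be sketched numerically by comparing the two lines (whiskers) of the boxplot. This is a simplified illustration: the function name box_skew is hypothetical, the whiskers are assumed to run all the way to the minimum and maximum (no separate outlier points), and the quartile method is one of several conventions.

```python
from statistics import quantiles

def box_skew(data):
    """Label skew by comparing whisker lengths (simplified sketch)."""
    # Quartiles using the 'inclusive' convention; other conventions exist.
    q1, _q2, q3 = quantiles(data, n=4, method="inclusive")
    upper_whisker = max(data) - q3   # length toward higher values
    lower_whisker = q1 - min(data)   # length toward lower values
    if upper_whisker > lower_whisker:
        return "positive skew"
    if upper_whisker < lower_whisker:
        return "negative skew"
    return "roughly symmetric"

# Made-up samples for illustration.
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 8, 12]
left_tailed = [13 - x for x in right_tailed]
balanced = [1, 2, 3, 4, 5]

print(box_skew(right_tailed))
print(box_skew(left_tailed))
print(box_skew(balanced))
```

Real plotting libraries usually cap whisker length and draw outliers as separate points, so their whiskers may differ from this min/max version.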
9
Q
How does distribution affect choice of summary statistics?
A
- Mean = best suited to roughly normal (symmetric) distributions, because it is affected by skew and outliers
- In a large sample, an outlier has little effect on the mean, so the mean can still be used, BUT skewness affects the mean even in large samples
- SD = use only with roughly normal distributions; affected by skew/outliers in the same way as the mean
- Median/IQR = use when the data are skewed or contain outliers
- Mode = infrequently used in scientific research
10
Q
How to use the SD to estimate the range of a distribution
A
- Upper limit = mean + (1 x SD)
- Lower limit = mean - (1 x SD)
- Gives the range that covers about 68% of values in a normal distribution
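
A minimal sketch of this calculation, assuming the sample SD (n - 1 denominator) is the one intended. The function name sd_range and the data are hypothetical.

```python
from statistics import mean, stdev

def sd_range(data, k=1.0):
    """Return (mean - k*SD, mean + k*SD) using the sample SD."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 denominator)
    return m - k * s, m + k * s

# Made-up sample with mean 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
low, high = sd_range(sample)
print(low, high)
```

The multiplier k is left as a parameter so the same helper can be reused with other multiples of the SD.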
11
Q
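How to calculate the range that covers 95% of values in a sample
A
- Upper limit = mean + (1.96 x SD)
- Lower limit = mean - (1.96 x SD)
- Gives the range covering about 95% of values (assuming an approximately normal distribution)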
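
The 1.96 multiplier can be checked empirically. The sketch below simulates approximately normal data (the mean of 50, SD of 10, sample size, and seed are arbitrary assumptions) and counts the share of values inside mean ± 1.96 x SD. The function name coverage is hypothetical.

```python
import random
from statistics import mean, stdev

def coverage(data, k):
    """Fraction of values lying within mean ± k * SD."""
    m = mean(data)
    s = stdev(data)
    low, high = m - k * s, m + k * s
    inside = sum(1 for x in data if low <= x <= high)
    return inside / len(data)

# Simulated, approximately normal sample (arbitrary parameters and seed).
rng = random.Random(42)
simulated = [rng.gauss(50, 10) for _ in range(10_000)]

share_95 = coverage(simulated, 1.96)
print(round(share_95, 3))  # expected to be close to 0.95
```

With skewed data the same range may cover noticeably more or less than 95% of values, which is why the normality assumption matters.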