Section 2.1: Examining Numerical Data Flashcards
(42 cards)
What is a scatterplot used for?
Visualizing the relationship between two numerical variables
What does a dot plot visualize?
The distribution of one numerical variable, with each observation drawn as a point along a single axis. Darker colors represent areas with more (overlapping) observations
What does a stacked dot plot represent?
Taller stacks of dots indicate areas with more observations, aiding in judging the center and shape of the distribution
What is the purpose of a histogram?
Provides a view of data density: observations are grouped into bins, and higher bars show where the data are relatively more common
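As a rough illustration of what a histogram does before anything is drawn, the sketch below counts observations per bin. The function name histogram_counts, the bin width, and the data values are made up for this example, and the choice that each bin includes its left edge but not its right edge is an assumption; textbooks and plotting tools differ on that convention.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count observations per bin; bin k covers [k*bin_width, (k+1)*bin_width)."""
    # Floor division assigns each value to the index of the bin it falls in.
    counts = Counter(int(v // bin_width) for v in values)
    return {(k * bin_width, (k + 1) * bin_width): counts[k] for k in sorted(counts)}

# Made-up observations, used only for illustration.
data = [1, 3, 4, 7, 8, 8, 9, 12]
bins = histogram_counts(data, bin_width=5)
for (low, high), count in bins.items():
    print(f"[{low}, {high}): {'#' * count}")
```

The bin with the longest row of # marks is where the data are relatively more common.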
What does the term ‘center’ refer to in statistics?
Where the distribution is centered; the mean (average) is a common way to measure it
What is the formula for the sample mean?
x̄ = (x_1 + x_2 + x_3 + … + x_n) / n
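A minimal sketch of the sample mean formula in Python. The function name sample_mean and the data values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def sample_mean(values):
    """Return the sum of the observations divided by how many there are."""
    if not values:
        raise ValueError("need at least one observation")
    return sum(values) / len(values)

# Made-up observations, used only for illustration.
data = [2, 4, 4, 5, 10]
x_bar = sample_mean(data)  # (2 + 4 + 4 + 5 + 10) / 5
print(x_bar)
```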
How is the population mean computed?
Computed the same way as the sample mean, but over every member of the population. Usually impossible to calculate exactly because data for the entire population are rarely available
What does x̄ represent?
Sample mean
What does μ represent?
Population mean
Define unimodal
A distribution with a single peak
What is the difference between bimodal and multimodal?
Bimodal has two prominent peaks, while multimodal has more than two prominent peaks
What characterizes a uniform distribution?
No apparent peaks; observations are spread roughly evenly across the range
What does ‘right skewed’ refer to?
A distribution with a longer tail extending to the right
What does ‘left skewed’ mean?
A distribution with a longer tail extending to the left
What is the formula for variance?
s^2 = Σ(x_i - x̄)^2 / (n - 1), where the sum runs over all n observations
How is standard deviation calculated?
s = √(s^2)
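The two formulas above can be sketched in Python as follows. The function name sample_variance and the data are made up for illustration; the standard library's statistics.variance uses the same n - 1 divisor and is printed here only as a cross-check.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

# Made-up observations, used only for illustration.
data = [2, 4, 4, 5, 10]
s2 = sample_variance(data)  # squared deviations 9 + 1 + 1 + 0 + 25 = 36, divided by 4
s = math.sqrt(s2)           # standard deviation is the square root of the variance
print(s2, s, statistics.variance(data))
```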
What is the median in a dataset?
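The middle value when the data are ordered in ascending order, splitting the data in half. With an even number of observations, it is the average of the two middle values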
What does Q1 represent?
25th percentile, also called the first quartile
What is the 50th percentile also known as?
The median
What does Q3 represent?
75th percentile, also called the third quartile
Define interquartile range (IQR)
The range where the middle 50% of the data lies, calculated as IQR = Q3 - Q1
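A short sketch of the median, quartiles, and IQR using Python's standard statistics module. Quartiles can be computed under several slightly different conventions, so the method="inclusive" setting below is an assumption and other software or textbooks may give slightly different values for the same data. The data are made up.

```python
import statistics

# Made-up observations, used only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

median = statistics.median(data)

# With n=4, quantiles() returns the three cut points Q1, Q2 (the median), and Q3.
# method="inclusive" is one convention among several (an assumption here).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # width of the range holding the middle 50% of the data
print(median, q1, q3, iqr)
```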
What is the maximum upper whisker reach?
Q3 + 1.5 × IQR
What is the maximum lower whisker reach?
Q1 - 1.5 × IQR
Define an outlier
In a box plot, an observation beyond the maximum reach of the whiskers
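The whisker limits and the outlier rule can be sketched together as below. The function name whisker_limits and the data are hypothetical, and the quartiles again use the "inclusive" convention from Python's statistics module, which is an assumption; a different quartile convention could shift the limits slightly.

```python
import statistics

def whisker_limits(values):
    """Return (lower, upper) maximum whisker reach: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Made-up observations with one unusually large value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 30]
lower, upper = whisker_limits(data)

# Observations outside the whisker reach are flagged as outliers.
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```

Here Q1 = 3 and Q3 = 7, so IQR = 4 and the whiskers can reach at most from -3 to 13, which leaves 30 as the only outlier.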