3 - Representation Of Data Flashcards
(18 cards)
How do you calculate outliers using IQR
Any value greater than UQ + k(IQR)
or less than LQ - k(IQR) is an outlier
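As a rough illustration of the rule above, here is a minimal Python sketch. The function name `iqr_outliers`, the sample data, the quartile values and the default k = 1.5 are all assumptions made for the example; in an exam the value of k and the quartiles would normally be given or calculated first.

```python
def iqr_outliers(data, lq, uq, k=1.5):
    """Return the values lying below LQ - k*IQR or above UQ + k*IQR."""
    iqr = uq - lq                 # interquartile range
    lower_fence = lq - k * iqr    # anything below this is an outlier
    upper_fence = uq + k * iqr    # anything above this is an outlier
    return [x for x in data if x < lower_fence or x > upper_fence]


# Hypothetical data set; the quartiles are assumed to be given.
data = [2, 10, 11, 12, 13, 14, 15, 30]
outliers = iqr_outliers(data, lq=10.5, uq=14.5)
print(outliers)
```

Here IQR = 4, so the fences are 4.5 and 20.5, and only 2 and 30 fall outside them.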
What is an outlier
An extreme value that lies outside the overall pattern of data
What is an anomaly
A data point that should be removed from the data
What is cleaning data
The process of removing anomalies from the data
How do you calculate outliers using standard deviation
x̄ ± kσ (the mean plus or minus k standard deviations)
Typically this would be ±2σ
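The same check can be sketched in Python. This is only an illustration: the function name `sd_outliers` and the data are made up, k = 2 follows the typical value quoted on the card, and the population standard deviation (`statistics.pstdev`) is assumed to be the σ intended.

```python
from statistics import mean, pstdev


def sd_outliers(data, k=2):
    """Return the values more than k standard deviations from the mean."""
    m = mean(data)
    sigma = pstdev(data)  # population standard deviation (assumed)
    return [x for x in data if abs(x - m) > k * sigma]


# Hypothetical data set with one suspiciously large value.
data = [10, 12, 11, 13, 12, 11, 40]
print(sd_outliers(data))
```

For this data the mean is about 15.6 and σ is about 10.0, so only 40 lies more than 2σ from the mean.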
How are outliers displayed on box plots
Any data points that are outliers are marked with a cross
How are box plots drawn
Plot the highest and lowest values that are not outliers
Plot the quartiles + median
Add any outliers
What is a cumulative frequency diagram
A diagram that can be used to estimate the values of the quartiles
It requires grouped data
Where are the points on a cumulative frequency diagram plotted
At the end of each group (the upper class boundary)
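A small Python sketch of how the plotted points are built and how a quartile might be read off. The class boundaries and frequencies are invented for the example, and `estimate_value` is a hypothetical helper that joins the points with straight lines (linear interpolation), which is one common way to estimate from the diagram.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 16), (30, 40, 10)]

# Cumulative frequency is plotted against the upper boundary of each class.
upper_bounds = [hi for _, hi, _ in classes]
cum_freq = list(accumulate(f for _, _, f in classes))

# Start the curve at the first lower boundary with cumulative frequency 0.
points = [(classes[0][0], 0)] + list(zip(upper_bounds, cum_freq))


def estimate_value(points, target):
    """Estimate the data value at a given cumulative frequency."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 < c1 and c0 <= target <= c1:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target is outside the cumulative frequency range")


n = cum_freq[-1]
lq = estimate_value(points, n / 4)
median = estimate_value(points, n / 2)
uq = estimate_value(points, 3 * n / 4)
print(points, lq, median, uq)
```

The quartiles found this way are estimates, because the original values inside each group are not known.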
What data do histograms represent
Grouped continuous data only
What does the area of a histogram represent
The area of a bar in a histogram is equal to k×frequency
What are the axes in a histogram
Y = frequency density
X = continuous data
How do you calculate frequency density
Frequency density = (k×frequency)/class width
(Frequency density × class width = area = k×frequency)
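The calculation can be checked with a short Python sketch. The classes and frequencies are made up, `frequency_densities` is a hypothetical helper, and k = 1 is assumed so that the area of each bar equals its frequency.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 5), (10, 15, 10), (15, 30, 30)]


def frequency_densities(classes, k=1):
    """Frequency density = (k * frequency) / class width for each class."""
    return [k * f / (hi - lo) for lo, hi, f in classes]


densities = frequency_densities(classes)

# Multiplying back by the class width recovers the area, k * frequency.
areas = [fd * (hi - lo) for fd, (lo, hi, _) in zip(densities, classes)]
print(densities, areas)
```

Note that the second and third classes have the same bar height even though their frequencies differ, because the third class is three times as wide.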
How do you plot a frequency polygon
Plot the frequency at the midpoint of each group
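A minimal Python sketch of finding the points for a frequency polygon, assuming made-up classes and frequencies. Each point is taken as (class midpoint, frequency); the points would then be joined with straight lines.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency)
classes = [(0, 10, 4), (10, 20, 10), (20, 30, 16), (30, 40, 10)]

# One point per class: x is the class midpoint, y is the frequency.
polygon_points = [((lo + hi) / 2, f) for lo, hi, f in classes]
print(polygon_points)
```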
How do you compare data
You must always comment on a measure of location and a measure of spread
If you are comparing with the median, what measure of spread should you use
The interquartile range
If you are comparing with the standard deviation, what measure of location should you use?
The mean