Representations Of Data Flashcards
(6 cards)
Define an outlier:
An extreme value that lies outside the overall pattern of data
Give 2 numerical definitions of an outlier:
Either greater than Q3 + k(Q3 - Q1),
or less than Q1 - k(Q3 - Q1), where Q3 - Q1 is the interquartile range and k is a given constant (commonly 1.5).
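A minimal sketch of the two boundaries in Python. The function names are hypothetical, and the default k = 1.5 is an assumption (a common choice); use whatever k the question gives.

```python
def outlier_bounds(q1, q3, k=1.5):
    """Return (lower, upper) outlier boundaries from the quartiles.

    k = 1.5 is an assumed default; the value of k is normally given.
    """
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1, q3, k=1.5):
    """True if x is below the lower boundary or above the upper boundary."""
    lower, upper = outlier_bounds(q1, q3, k)
    return x < lower or x > upper


# With Q1 = 10 and Q3 = 20 the IQR is 10, so the boundaries are -5 and 35.
print(outlier_bounds(10, 20))
print(is_outlier(36, 10, 20))
```

A value exactly on a boundary is not counted as an outlier here, matching the "greater than" and "less than" wording above.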
Define an anomaly:
An outlier that is clearly an error, so it should be removed from the data (removing anomalies is part of cleaning the data).
What do the 5 vertical lines in a box plot represent (starting from left to right)?
1) Lowest value that isn’t an outlier (or the lower boundary for outliers).
2) Lower quartile.
3) Median.
4) Upper quartile.
5) Highest value that isn’t an outlier (or the upper boundary for outliers).
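As an illustration, the five positions can be worked out from a data set once the quartiles are known. This is only a sketch: the function name and the sample data are made up, the quartiles are passed in by the caller (conventions for calculating them vary), and k = 1.5 is an assumed default.

```python
def box_plot_lines(data, q1, median, q3, k=1.5):
    """Return the positions of the 5 vertical lines, left to right.

    The whiskers end at the lowest and highest values that are not
    outliers. k = 1.5 is an assumption, not a fixed rule.
    """
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    # Keep only the values that are not outliers.
    inside = [x for x in data if lower_bound <= x <= upper_bound]
    return min(inside), q1, median, q3, max(inside)


# Hypothetical data with Q1 = 13, median = 16, Q3 = 19.
# IQR = 6, so the outlier boundaries are 4 and 28.
data = [1, 12, 14, 15, 17, 18, 20, 40]
print(box_plot_lines(data, q1=13, median=16, q3=19))  # (12, 13, 16, 19, 20)
```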
Where do the 4 horizontal lines go on a box plot (starting from left to right)?
The first line connects the midpoints of the first 2 vertical lines.
The next 2 lines connect the tops and the bottoms of the 2nd, 3rd and 4th vertical lines, creating a rectangle with a vertical line inside it.
The last line connects the midpoints of the 4th and 5th vertical lines.
What do the crosses represent on a box plot and where do they go?
They represent outliers.
They go either to the left of the first vertical line or to the right of the last vertical line.
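A short sketch of where the crosses would land for a given data set. The function name and data are hypothetical, the quartiles are supplied by the caller, and k = 1.5 is an assumed default.

```python
def cross_positions(data, q1, q3, k=1.5):
    """Return (left, right): the outliers to mark with crosses on each side.

    k = 1.5 is an assumption; use the k given in the question.
    """
    iqr = q3 - q1
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr
    left = sorted(x for x in data if x < lower_bound)   # left of the first line
    right = sorted(x for x in data if x > upper_bound)  # right of the last line
    return left, right


# Hypothetical data with Q1 = 13 and Q3 = 19 (boundaries 4 and 28).
data = [1, 12, 14, 15, 17, 18, 20, 40]
print(cross_positions(data, q1=13, q3=19))  # ([1], [40])
```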