Visualisation and presentation of data - lecture 2 - sem 1 Flashcards
(27 cards)
what is discrete data
when the data values are quantitative and the numbers are finite or countable
e.g. the number shown when a die is rolled
what is continuous data
results from infinitely many possible quantitative values, where the collection of values is not countable
e.g. the decimal numbers between 1 and 2 (there are infinitely many)
ways of classifying data by 4 levels of measurement
nominal
ordinal
interval
ratio
what is nominal level of measurement
data that consists of names, labels or categories only
the data cannot be arranged in an order
the pure data cannot be used for maths but their frequencies can
ordinal level of measurement
data which can be arranged in some order
differences between data values cannot be measured or are not meaningful
gives relative comparison but not the magnitude of the differences
interval level of measurement
data can be arranged in order and the differences between data values can be found and are meaningful
no natural 0 (zero is an arbitrary point, not the absence of the quantity)
ratio level of measurement meaning
data that can be arranged in order, differences can be found and are meaningful and there is a natural 0
what is frequency distribution table for summarising qualitative data
start from a table of raw qualitative data
e.g. good, bad
the frequency distribution table then lists how often each category came up, with a total at the bottom giving the number of observations
you can then find the percentage frequency (frequency / total x 100)
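A minimal sketch of building a frequency distribution table with percentage frequencies. The responses below are made-up illustration data, not from the lecture.

```python
from collections import Counter

# Hypothetical survey responses (made-up data for illustration)
responses = ["good", "bad", "good", "good", "bad", "good", "good", "bad"]

counts = Counter(responses)       # frequency of each category
total = sum(counts.values())      # total number of observations

# percentage frequency = frequency / total * 100
percentages = {category: 100 * n / total for category, n in counts.items()}

for category, n in counts.items():
    print(f"{category:<6}{n:>3}{percentages[category]:>8.1f}%")
print(f"{'total':<6}{total:>3}")
```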
Bar charts for summarising the qualitative data
it shows data summarised in a frequency, relative frequency or percentage frequency distribution
x axis- specific categories
y axis- scale
pie chart for summarising the qualitative data
it shows data summarised in a frequency, relative frequency or percentage frequency distribution
the circle represents all of the data
use relative frequencies to divide the circle into sectors, one per class
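A small sketch of how relative frequencies turn into sector angles, assuming each sector gets relative frequency x 360 degrees. The counts are made-up illustration data.

```python
# Hypothetical category frequencies (made-up data for illustration)
frequencies = {"good": 5, "bad": 3}
total = sum(frequencies.values())

# relative frequency = frequency / total
relative = {category: n / total for category, n in frequencies.items()}

# Each sector's angle is its share of the full 360-degree circle
angles = {category: 360 * share for category, share in relative.items()}

for category in frequencies:
    print(f"{category}: {relative[category]:.3f} of the data, {angles[category]:.1f} degrees")
```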
what is a histogram
graph used to represent the distribution of numerical data
shows frequency of data points within specific ranges (bins)
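A rough sketch of the binning step behind a histogram, drawn as text. The scores, the bin width of 10 and the starting edge of 50 are all assumptions chosen for illustration.

```python
# Hypothetical exam scores (made-up data for illustration)
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 82, 85, 91]

bin_width = 10   # assumed width of each bin
bin_start = 50   # assumed left edge of the first bin

# Count how many values fall in each bin [left_edge, left_edge + bin_width)
bin_counts = {}
for s in scores:
    left_edge = bin_start + ((s - bin_start) // bin_width) * bin_width
    bin_counts[left_edge] = bin_counts.get(left_edge, 0) + 1

# One row per bin; the bar length is the frequency in that bin
for left_edge in sorted(bin_counts):
    label = f"{left_edge}-{left_edge + bin_width - 1}"
    print(f"{label}: {'#' * bin_counts[left_edge]}")
```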
what is the median
measure of the central tendency that represents the middle value in a dataset when the values are arranged in order
gives a sense of the typical value especially if there are outliers
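A short sketch of why the median is useful when there are outliers: adding one extreme value moves the mean a lot but the median only slightly. The numbers are made up for illustration.

```python
import statistics

# Hypothetical salaries in thousands (made-up data for illustration)
salaries = [21, 23, 24, 25, 27]
with_outlier = salaries + [200]   # add one extreme value

print(statistics.median(salaries), statistics.mean(salaries))
print(statistics.median(with_outlier), statistics.mean(with_outlier))
```

With an even number of values, as in the second list, the median is the average of the two middle values.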
what is variance
measure of the variability that uses all the data
based on the difference between each data value and the mean
the sample variance (dividing by n - 1 rather than n) provides an unbiased estimate of the population variance
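A minimal sketch of the variance calculation done by hand, assuming the usual definitions: the sum of squared deviations from the mean, divided by n - 1 for a sample or by n for a population. The data are made up for illustration.

```python
import statistics

# Hypothetical data set (made-up values for illustration)
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n

# Squared difference between each value and the mean
squared_deviations = [(x - mean) ** 2 for x in data]

sample_variance = sum(squared_deviations) / (n - 1)   # divides by n - 1
population_variance = sum(squared_deviations) / n     # divides by n

# The standard library gives the same results
print(sample_variance, statistics.variance(data))
print(population_variance, statistics.pvariance(data))
```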
what is standard deviation
positive square root of the variance
measure of the amount of variation or dispersion of a set of values
cannot be negative
what does a large variance indicate
the numbers in the set are far from the mean and from each other
what does a small variance indicate
the numbers in the set are close to the mean and to each other
what does a low standard deviation indicate
that the values tend to be close to the mean of the data set
what does a high standard deviation indicate
that the values are spread out over a wider range
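A quick sketch comparing two made-up data sets with the same mean but different spread, to illustrate the low and high standard deviation (and small and large variance) cards above.

```python
import statistics

# Two hypothetical data sets with the same mean (made-up values)
tight = [48, 49, 50, 51, 52]   # values close to the mean
wide = [10, 30, 50, 70, 90]    # values spread over a wider range

for name, values in (("tight", tight), ("wide", wide)):
    sd = statistics.stdev(values)        # sample standard deviation
    var = statistics.variance(values)    # sample variance (sd squared)
    print(f"{name}: mean={statistics.mean(values)}, variance={var}, sd={sd:.2f}")
```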
what is skewness
measure of the degree of asymmetry of a distribution
one tail of the distribution has more extremes than the other
what is kurtosis
measure of whether the data are peaked or flat relative to a normal distribution
normal distribution explained
highest point on the curve is at the mean, which is also the median and the mode
symmetrical
about 68% (roughly 2/3) of values lie within 1 SD of the mean
about 95% lie within 2 SD
about 99.7% lie within 3 SD
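A small sketch checking these proportions for a standard normal distribution, using `statistics.NormalDist` from the Python standard library (Python 3.8 or later).

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, SD 1

# Proportion of the distribution within k SDs of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} SD: {share:.4f}")
```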
what is negative skewness
tail of the distribution extends more to the left
a few extremely low values stretch the tail
mean is less than the median
outliers are low values
what is positive skewness
tail of the distribution extends to the right
a few extremely high values stretch the tail
mean is greater than the median
outliers are high values
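A rough sketch of negative and positive skewness on made-up data. The `skewness` helper is a hypothetical function using one common moment-based definition (third central moment divided by the second central moment to the power 1.5); other definitions with small-sample corrections exist.

```python
from statistics import fmean, mean, median

def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (one common definition)."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])   # second central moment
    m3 = fmean([(x - m) ** 3 for x in data])   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data: one extreme high value gives a long right tail
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Mirror image: one extreme low value gives a long left tail
left_skewed = [20 - x for x in right_skewed]

for name, values in (("right", right_skewed), ("left", left_skewed)):
    print(f"{name}: skewness={skewness(values):.2f}, "
          f"mean={mean(values)}, median={median(values)}")
```

In the right-skewed list the mean sits above the median, and in the left-skewed list it sits below, matching the cards above.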
mesokurtic meaning
excess kurtosis is 0 (a kurtosis of 3, the same as a normal distribution)
distribution isn't too flat or too peaked
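A rough sketch of excess kurtosis on simulated data. The `excess_kurtosis` helper is a hypothetical function using one common moment-based definition (fourth central moment divided by the squared second central moment, minus 3); the sample size and seed are arbitrary choices.

```python
from statistics import NormalDist, fmean

def excess_kurtosis(data):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (one common definition)."""
    m = fmean(data)
    m2 = fmean([(x - m) ** 2 for x in data])   # second central moment
    m4 = fmean([(x - m) ** 4 for x in data])   # fourth central moment
    return m4 / m2 ** 2 - 3

# Simulated normal data: excess kurtosis should be close to 0 (mesokurtic)
normal_like = NormalDist(mu=0, sigma=1).samples(100_000, seed=1)

# Evenly spread values: flatter than a normal curve, so the value is negative
flat = list(range(1000))

print(f"normal-like: {excess_kurtosis(normal_like):.3f}")
print(f"flat:        {excess_kurtosis(flat):.3f}")
```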