Week 9 - Measurement and Analysis Flashcards
(15 cards)
what are the methods of summarising data
descriptive statistics
- central tendency (mean, median, mode)
- spread/dispersion (range, interquartile range, standard deviation, variance)
- shape (skewness, kurtosis)
what are the methods for organising data
- data tables
- coding
what are the methods of representing data
- graphs
(bar charts, histograms, pie charts, box plots, frequency polygons)
what is mean, median and mode
mean = sum of all values divided by the number of values
median = middle value when all numbers are arranged in order
mode = most frequent value
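A quick worked check of the three definitions, as a minimal sketch using Python's built-in statistics module. The list of values is made up for illustration.

```python
from statistics import mean, median, mode

values = [2, 3, 3, 5, 7]  # made-up example data

data_mean = mean(values)      # (2 + 3 + 3 + 5 + 7) / 5
data_median = median(values)  # middle value of the sorted list
data_mode = mode(values)      # value that appears most often

print(data_mean)    # 4
print(data_median)  # 3
print(data_mode)    # 3
```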
What is range
range = maximum value - minimum value
difference between the highest and lowest values in a dataset
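The same calculation as a short sketch in Python, on a made-up list of values:

```python
values = [2, 3, 3, 5, 7]  # made-up example data

# range = highest value minus lowest value
data_range = max(values) - min(values)

print(data_range)  # 5
```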
what is the interquartile range (IQR)
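IQR = Q3 - Q1, the distance between the first quartile (Q1) and the third quartile (Q3)
represents the middle 50% of the data
the quartiles are cut points that split the ordered data into four groups, each holding 25% of the values
quartile 1 (Q1) is the value that the lowest 25% of the numbers fall below
quartile 2 (Q2) is the median, the value that 50% of the numbers fall below
quartile 3 (Q3) is the value that 75% of the numbers fall below
the highest 25% of the numbers lie above Q3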
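A minimal sketch of finding the quartiles and the IQR with Python's statistics.quantiles. The data are made up, and the "inclusive" method is just one common convention. Other methods (and other software) can give slightly different quartile values for the same data.

```python
from statistics import quantiles

values = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up example data

# n=4 asks for the three quartile cut points (Q1, Q2, Q3).
# method="inclusive" is an assumption here; it treats the data as the
# whole population rather than a sample.
q1, q2, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 3.0 5.0 7.0
print(iqr)         # 4.0
```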
What is variance
variance is the average of the squared differences between each value in a dataset and the mean
What is standard deviation (SD)
SD is the square root of the variance; it tells you how far values in a dataset typically lie from the mean, in the same units as the data
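A worked sketch of both definitions on made-up data, calculated step by step and then checked against Python's statistics module. This uses the population versions (divide by n); the sample versions (statistics.variance and statistics.stdev) divide by n - 1 instead and give slightly larger results.

```python
from statistics import pvariance, pstdev

values = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up example data

mean_value = sum(values) / len(values)                   # 5.0
squared_diffs = [(x - mean_value) ** 2 for x in values]  # squared distance from the mean
variance = sum(squared_diffs) / len(values)              # average squared distance
sd = variance ** 0.5                                     # square root of the variance

print(variance)  # 4.0
print(sd)        # 2.0

# The library functions give the same population values
print(pvariance(values) == variance)  # True
print(pstdev(values) == sd)           # True
```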
What are the types of measurement scales
- Nominal scale
  - Categorises data without any order or ranking, e.g. ethnic background
- Ordinal scale
  - Categorises data with a clear order, but the difference between ranks is not defined, e.g. ranked age groups
- Interval scale
  - Numeric data with equal intervals, but no true zero, e.g. body temperature in °C
- Ratio scale
  - Numeric data with equal intervals and an absolute zero, e.g. duration since diabetes diagnosis (0 days, 20 days)
look at page 3 of week 9
What is a frequency polygon
A line graph created by connecting points plotted at the midpoint of each data class
used with continuous data
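A minimal sketch of how the points of a frequency polygon are found. The class intervals and frequencies are made up; each point sits at the midpoint of its class, and joining the points in order with straight lines gives the polygon.

```python
# Hypothetical class intervals (lower, upper) and how many values fell in each
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [3, 7, 5, 2]

# Midpoint of a class = (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Points to plot and connect: (class midpoint, frequency)
points = list(zip(midpoints, frequencies))

print(points)  # [(5.0, 3), (15.0, 7), (25.0, 5), (35.0, 2)]
```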
What is a normal distribution and how is it demonstrated
a symmetrical probability distribution where data values are clustered around the mean, with fewer values further away from the mean
a normal distribution shows up as a symmetrical, bell-shaped curve when the data are plotted
What are a frequency polygon's two main purposes
interpolation = estimate the frequency of missing values from the line between neighbouring points
shape characterisation = understand the distribution of the data
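A small sketch of interpolation on a frequency polygon, assuming simple straight-line (linear) interpolation between two neighbouring points. The function name and the two example points are made up for illustration.

```python
def interpolate(x, point_a, point_b):
    """Estimate the frequency at x on the straight line from point_a to point_b."""
    (x0, y0), (x1, y1) = point_a, point_b
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# Two neighbouring polygon points: (class midpoint, frequency)
estimate = interpolate(10, (5, 3), (15, 7))

print(estimate)  # 5.0
```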
what are the 3 shapes that can occur on a frequency data plot
- positively skewed
  - Tail is longer on the right.
  - Most data are clustered at lower values.
  - Example: Income distribution.
- symmetrical
  - Data are evenly spread around the centre.
  - Most values cluster around the mean.
  - Ideal for many statistical analyses.
- negatively skewed
  - Tail is longer on the left.
  - Most data are clustered at higher values.
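A rough sketch of checking the direction of skew numerically. It assumes the moment-based skewness coefficient (third central moment divided by the variance to the power 1.5), which is one common definition; statistical packages may apply extra sample-size corrections. The three datasets are made up.

```python
def skewness(values):
    """Moment-based skewness: above 0 = right tail, below 0 = left tail, 0 = symmetrical."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((x - mean_value) ** 2 for x in values) / n  # variance (second central moment)
    m3 = sum((x - mean_value) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

positive = [1, 2, 2, 3, 10]    # long tail on the right
symmetric = [1, 2, 3, 4, 5]    # evenly spread around the centre
negative = [1, 8, 9, 9, 10]    # long tail on the left

print(skewness(positive) > 0)   # True
print(skewness(symmetric) == 0) # True
print(skewness(negative) < 0)   # True
```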
what is kurtosis
measures tail heaviness and peak sharpness
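How is kurtosis measured
by comparing the kurtosis value of the data with that of a normal distribution; the data are classed as
- mesokurtic
- platykurtic
- leptokurtic
the three types have different peak heights compared with the normal distribution curve
kurtosis value of each
meso = 3 (same as a normal distribution)
platy = less than 3
lepto = greater than 3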
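A rough sketch of calculating a kurtosis value and classifying it against 3. It assumes the moment-based definition (fourth central moment divided by the variance squared), for which a normal distribution scores 3. The function names, the tolerance of 0.1 and the datasets are all made up for illustration, and some software reports "excess kurtosis" (this value minus 3) instead.

```python
def kurtosis(values):
    """Moment-based kurtosis; a normal distribution gives 3."""
    n = len(values)
    mean_value = sum(values) / n
    m2 = sum((x - mean_value) ** 2 for x in values) / n  # variance (second central moment)
    m4 = sum((x - mean_value) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

def classify(kurtosis_value, tolerance=0.1):
    """Label a kurtosis value; the tolerance around 3 is an arbitrary choice."""
    if abs(kurtosis_value - 3) <= tolerance:
        return "mesokurtic"
    return "platykurtic" if kurtosis_value < 3 else "leptokurtic"

flat = [1, 2, 3, 4, 5]                     # evenly spread, light tails
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -5, 5]   # tight centre with two extreme values

print(classify(kurtosis(flat)))    # platykurtic
print(classify(kurtosis(peaked)))  # leptokurtic
print(classify(3.0))               # mesokurtic
```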