Flashcards in Summarising data Deck (17)
What are the 2 types of summary descriptive statistics?
1. Measures of central tendency = the average (typical) score
2. Measures of dispersion = the spread of scores
What are the different ways to measure central tendency/ typical performance?
1. Mode
2. Mean
3. Median
What are the advantages + disadvantages of using the mode?
Mode: most frequent score in a set of scores
- simple + easy
- the only average which can be used with nominal (categorical) data
- can be unrepresentative and therefore misleading
--> e.g. the mode may be 39 even though the rest of the scores are low
- there may be more than one mode in a set of scores
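A minimal sketch of finding the mode in Python, using the standard-library `statistics.multimode`, which returns every value tied for most frequent. The scores and colours are invented for illustration.

```python
from statistics import multimode

# Hypothetical scores: 39 is the mode although the other scores are low.
scores = [12, 14, 15, 39, 39, 16, 13]
score_modes = multimode(scores)
print(score_modes)

# The mode also works for nominal (categorical) data,
# and a set of scores can have more than one mode.
colours = ["red", "blue", "red", "green", "blue"]
colour_modes = multimode(colours)
print(colour_modes)
```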
What are the advantages + disadvantages of using the mean?
Mean: (sum of all scores)/ total number of scores
- uses info from every single score
- resistant to sampling fluctuation
- susceptible to distortion from extreme scores (outliers) + skew
What are the advantages + disadvantages of using the median?
Median: arrange scores in order, median = middle value or average of the middle 2 scores
- resistant to the distorting effects of extremely high or low scores
- ignores the scores' numerical values = wasteful of data
- more susceptible to sampling fluctuation than the mean
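A small sketch of how one extreme score affects the mean but barely moves the median. The scores are made up for illustration; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical scores, invented for illustration.
scores = [10, 12, 13, 14, 16]
with_outlier = scores + [95]  # add one extreme high score

mean_before, median_before = mean(scores), median(scores)
mean_after, median_after = mean(with_outlier), median(with_outlier)

print(mean_before, median_before)  # both sit in the middle of the scores
print(mean_after, median_after)    # the mean is dragged up, the median hardly moves
```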
What are the different measures of dispersion/ variability in performance?
1. Range
2. Standard deviation
What are the advantages + disadvantages of the range?
Range = difference between the highest + lowest score
- quick + easy to calculate
- influenced by extreme scores
- conveys no info about the spread of scores between the highest + lowest scores
--> could have same range but spread of data completely different
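As a sketch of the last point, the two invented data sets below share the same range but are spread very differently between their extremes. The population SD (`statistics.pstdev`) is used here only to show that the spread differs.

```python
from statistics import pstdev

# Hypothetical data sets with identical highest and lowest scores.
clustered = [1, 5, 5, 5, 5, 5, 9]   # most scores sit at the centre
spread_out = [1, 2, 4, 5, 6, 8, 9]  # scores spread across the interval

def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

print(score_range(clustered), score_range(spread_out))  # same range
print(pstdev(clustered), pstdev(spread_out))            # different spread
```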
What is SD?
- The spread of scores around a sample mean
- tells us how well the mean summarises the sample
--> the bigger the SD, the more the scores differ from the mean + from each other, and the less satisfactory the mean becomes as a summary of the data
What are the advantages + disadvantages of the SD?
- like the mean, uses info from every score
- not intuitively easy to understand
How do you calculate SD?
1. Work out mean of data
2. Subtract mean from each score
3. Square the differences obtained
4. Add up the squared differences = sum of squares (SS)
5. SS/ total number of scores = variance
6. SD = square root of variance
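The six steps above can be sketched in Python as follows. The scores are invented for illustration, and the variable names are just labels for each step.

```python
from math import sqrt

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = sum(scores) / len(scores)         # 1. mean of the data
deviations = [x - mean_score for x in scores]  # 2. subtract mean from each score
squared = [d ** 2 for d in deviations]         # 3. square the differences
ss = sum(squared)                              # 4. sum of squares (SS)
variance = ss / len(scores)                    # 5. SS / number of scores
sd = sqrt(variance)                            # 6. SD = square root of variance

print(mean_score, ss, variance, sd)
```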
What are some issues with using the mean and SD?
- we usually obtain the mean/ SD from a sample rather than the whole population = they are only estimates of the population values
- the sample SD tends to underestimate the population SD
How can we deal with the sample SD typically underestimating the SD of the population?
- when describing the sample itself, divide SS by n
- when using the sample SD as an ESTIMATE of the population SD, divide SS by n-1
(makes the SD larger)
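A short sketch of the two divisors using Python's standard `statistics` module: `pstdev` divides SS by n, while `stdev` divides by n-1. The sample is invented for illustration.

```python
from math import sqrt
from statistics import pstdev, stdev

# Hypothetical sample, invented for illustration (SS = 32, n = 8).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

sd_n = pstdev(sample)          # divides SS by n: describes the sample itself
sd_n_minus_1 = stdev(sample)   # divides SS by n-1: estimate of the population SD

print(sd_n, sd_n_minus_1)      # the n-1 version is the larger of the two
```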
What is the relationship between the normal curve and the SD?
- In a normal distribution, each SD cuts off a constant proportion of the distribution of scores
> about 68% of people have IQs between 85 and 115 (mean = 100, +/- 1 SD = +/- 15)
> about 95% have IQs between 70 and 130 (100 +/- (2*15))
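These proportions can be checked with a small sketch using the standard-library `statistics.NormalDist`, assuming IQ follows an exactly normal distribution with mean 100 and SD 15. The helper `proportion_within` is a hypothetical name introduced here.

```python
from statistics import NormalDist

# Assumption: IQ is exactly normal with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

def proportion_within(k):
    """Proportion of scores within k SDs either side of the mean."""
    return iq.cdf(100 + k * 15) - iq.cdf(100 - k * 15)

for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
```

The exact values are slightly different from the rounded 68%, 95% and 99.7% figures, which are rules of thumb.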
If 99.7% of a population have an IQ between 55 and 145 (mean = 100, SD = 15, so 3 SD = 45), what are the chances of an IQ above 145?
1. (100 - 99.7)/ 2 = 0.15
--> divide by 2 since the remaining 0.3% is split equally between the 2 tails of the curve
= only occurs in about 0.15% of the population
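A quick check of the proportion beyond 3 SD in one tail, again assuming an exactly normal distribution with mean 100 and SD 15. The rule-of-thumb calculation (100 - 99.7)/2 gives 0.15%, and the exact tail area is slightly smaller.

```python
from statistics import NormalDist

# Assumption: IQ is exactly normal with mean 100 and SD 15.
iq = NormalDist(mu=100, sigma=15)

rule_of_thumb_pct = (100 - 99.7) / 2     # remaining 0.3% split over 2 tails
exact_pct = (1 - iq.cdf(145)) * 100      # exact % with an IQ above 145

print(round(rule_of_thumb_pct, 3), round(exact_pct, 3))
```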
What is the standard error of the mean (SE)?
A type of SD
- is the SD of a set of sample means
- shows how much variation there is within a set of sample means
--> indicates the reliability of each sample mean as an estimate of the true population mean
What is the formula for the SE?
= SD/ square root of n
n = Sample size
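A minimal sketch of the SE formula in Python. The function name `standard_error` and the numbers are illustrative assumptions; the sample SD here uses the n-1 divisor (`statistics.stdev`).

```python
from math import sqrt
from statistics import stdev

def standard_error(sd, n):
    """SE = SD / square root of n, where n is the sample size."""
    return sd / sqrt(n)

# Hypothetical sample, invented for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = standard_error(stdev(sample), len(sample))
print(se)

# With the same SD, a larger sample gives a smaller SE.
print(standard_error(15, 25), standard_error(15, 100))
```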