DESCRIPTIVE STATISTICS Flashcards
(41 cards)
Descriptive statistics
Offer researchers ways of describing and summarising quantitative data.
Raw scores may be meaningless and confusing, so we need to present the material in an understandable and informative way.
- The reader can then see the main trends of the research.
Measures of central tendency
These describe a data set by identifying one score that represents the general trend of data.
They describe how the data cluster together around a central point.
3 measures of central tendency
Mode
Median
Mean
Mode
The value that occurs most frequently.
advantage of mode
It is simple and not affected by one or two extreme scores (outliers).
Useful when data is in categories.
disadvantage of mode
Can be unreliable as there can be several modes or no mode at all.
May not lie near the centre of the data, so it does not always represent central tendency.
If one score changes, the mode can change.
Relies on a score occurring more than others.
Median
Middle value in a set of data.
With an odd number of scores, the median is the middle score once the scores are placed in order.
With an even number of scores, we take the 2 central values and find their average.
advantage of median
Not distorted by extreme values. Can give a representative value.
disadvantage of median
It ignores most of the scores, so it is less sensitive than the mean.
May not be an actual value in the data set if there is an even number of values (it has to be calculated).
Mean
‘Arithmetic average’
Calculates measure of central tendency by adding all the values and dividing by the number of values.
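Worked example: a minimal Python sketch of the three measures of central tendency, using the standard library `statistics` module. The scores are made up for illustration and are not from the cards.

```python
from statistics import mean, median, multimode

scores = [2, 3, 3, 5, 7, 10]     # hypothetical scores, already in order

modes = multimode(scores)        # list of the most frequent value(s)
middle = median(scores)          # even number of scores: average of 3 and 5
average = mean(scores)           # sum of the values (30) / number of values (6)

print("mode(s):", modes)
print("median:", middle)
print("mean:", average)
```

Here the mode is 3, the median is 4 and the mean is 5, so the three measures need not agree. `multimode` returns a list because a data set can have several modes.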
advantage of mean
Takes all scores into account, making it a sensitive measure of central tendency.
Misses nothing out, giving a valid measure.
disadvantage of mean
It can be misleading if there are one or two extreme scores in one direction (unrepresentative).
The mean is often a decimal, which can seem meaningless (e.g. a mean of 2.4 children).
Extreme scores
Can make measures unreliable as they misrepresent the true tendency of a data set by skewing it so it is too high or too low.
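Worked example: a short Python sketch of how one extreme score pulls the mean but leaves the median unchanged. Both data sets are invented for illustration.

```python
from statistics import mean, median

typical = [4, 5, 5, 6, 7]        # hypothetical scores with no outlier
with_outlier = [4, 5, 5, 6, 70]  # same scores, but the last one is extreme

mean_typical = mean(typical)             # 27 / 5
mean_outlier = mean(with_outlier)        # 90 / 5
median_typical = median(typical)         # middle score of the ordered set
median_outlier = median(with_outlier)    # still the middle score

print("mean:", mean_typical, "->", mean_outlier)
print("median:", median_typical, "->", median_outlier)
```

The mean rises from 5.4 to 18, while the median stays at 5.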
Measures of dispersion
These measures tell us whether scores in a set of data are similar to each other or if they are SPREAD OUT.
e.g. Range, Standard Deviation
Range
Simplest measure of dispersion.
Difference between lowest and highest numbers.
Advantage of range
Simple to calculate, takes into account extreme values.
Disadvantage of range
Outliers can greatly influence the range value.
Ignores ALL BUT 2 scores, so it is unlikely to provide an adequate measure of dispersion.
Some statistics books define range as…
Highest score minus lowest score PLUS ONE.
This is an inclusive measure of range rather than a difference between 2 scores.
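Worked example: a minimal Python sketch of both definitions of the range. The scores and variable names are hypothetical.

```python
scores = [12, 15, 9, 20, 14]     # hypothetical scores

simple_range = max(scores) - min(scores)    # highest minus lowest: 20 - 9
inclusive_range = simple_range + 1          # the "plus one" definition

print("range:", simple_range)
print("inclusive range:", inclusive_range)
```

Only the scores 9 and 20 affect either result, which is why the range ignores all but 2 scores.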
repeated measures design and range
The researcher may expect the range of scores to be very similar across conditions, as the same participants are used.
Variance (S²)
Variance tells us more than the range.
Rather than looking at only 2 extremes of the data set, variance considers the DIFFERENCE between each data point and the mean (deviation).
These deviations are then squared, added together and total is divided by the number of scores in the data set minus 1.
Formula for variance
s² = Σ(x - x̄)² / (n - 1)
(always start with brackets!)
symbol meanings
s² = variance
x = term in data set
x̄ = sample mean
Σ = sum of
n= sample size
Step by step to calculate variance:
- Calculate the mean (x̄)
- Write down the number of scores (n)
- Draw table with 3 columns and write scores (value of x) down first column.
- Work out the difference between each score and the mean: x - x̄ (the sign does not matter, because the differences are squared in the next step)
- Square each of these differences. (x - x̄)²
- Add together the column of squared differences: Σ(x - x̄)²
- Take the sum and divide it by n-1 (number of scores - 1).
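Worked example: the steps above can be sketched in Python as follows. The scores are made up, and the variable names are only illustrative. The last line cross-checks against `statistics.variance`, which also divides by n - 1.

```python
from statistics import variance

scores = [2, 4, 4, 6, 9]                    # hypothetical scores (values of x)

n = len(scores)                             # number of scores
x_bar = sum(scores) / n                     # mean (x̄): 25 / 5
deviations = [x - x_bar for x in scores]    # x - x̄ for each score
squared = [d ** 2 for d in deviations]      # (x - x̄)² for each score
sum_of_squares = sum(squared)               # Σ(x - x̄)²
s_squared = sum_of_squares / (n - 1)        # divide by n - 1

print("variance by hand:", s_squared)
print("statistics.variance:", variance(scores))
```

With these scores the squared differences are 9, 1, 1, 1 and 16, which sum to 28, so the variance is 28 / 4 = 7.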
Standard Deviation (SD)
(σ for a population, s for a sample)
A measure of how spread out numbers are.
Variance is a squared number, so it is not in the same units as the mean.
Standard deviation is the square root of the variance, which returns the figure to the same units as the mean.
(Easier to make direct judgements about data set)
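Worked example: a minimal Python sketch of taking the square root of the sample variance to get the standard deviation. The scores are hypothetical, and `statistics.stdev` is used only as a cross-check.

```python
import math
from statistics import stdev, variance

scores = [2, 4, 4, 6, 9]            # hypothetical scores

s_squared = variance(scores)        # sample variance (n - 1 in the denominator)
sd = math.sqrt(s_squared)           # back in the same units as the scores

print("variance:", s_squared)
print("standard deviation:", round(sd, 2))
print("statistics.stdev:", round(stdev(scores), 2))
```

The variance here is 7 (in squared units), and the standard deviation is about 2.65 in the original units, which can be compared directly with the mean of 5.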