descriptive statistics (A) Flashcards

1
Q

what is meant by descriptive statistics?

A

Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent either an entire population or a sample of it.

2
Q

what descriptive statistics are involved with this set?

A
  • measures of central tendency
  • measures of spread (dispersion)

3
Q

what are the measures of central tendency?

A
  • mean / average
  • mode / popular
  • median / middle
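
As a minimal sketch, the three measures can be computed with Python's standard `statistics` module. The scores are made up for illustration.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 5, 7]

mean_value = statistics.mean(scores)      # sum of the values / number of values
median_value = statistics.median(scores)  # middle value once the data are sorted
mode_value = statistics.mode(scores)      # most frequent (most popular) value

print(mean_value, median_value, mode_value)  # 4 3 3
```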
4
Q

what are the measures of dispersion?

A
  • range
  • IQR
  • standard deviation
5
Q

what is meant by measures of central tendency?

A

the term average is used in everyday life to express an amount that is typical for a group of people / things

typical tendency of groups

a measure of central tendency (CT) is a single value that attempts to describe a set of data by identifying the central position within that data set

6
Q

why is average useful?

A

useful indicator of the general trend

  • summarise large amounts of data
  • indicates that there is some variability around the single value within the original data
7
Q

what are the pros of the mean?

A
  • most popular measure
  • yields one distinct answer
  • useful for comparing data sets
8
Q

what are the cons of the mean?

A
  • answer can be affected by extreme values (outliers) or when skewed data is present
  • mean has tendency to be pulled towards extreme values
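
A small sketch of that pull, using invented values in which 100 plays the role of the outlier. It compares how far the mean and the median move once the extreme value is added.

```python
import statistics

# Hypothetical values; 100 is an extreme value added for illustration.
without_outlier = [4, 5, 5, 6, 7]
with_outlier = without_outlier + [100]

# How far each measure moves when the outlier is included.
mean_shift = statistics.mean(with_outlier) - statistics.mean(without_outlier)
median_shift = statistics.median(with_outlier) - statistics.median(without_outlier)

print(round(mean_shift, 2), median_shift)  # the mean moves far more than the median
```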
9
Q

what are the pros of the median?

A
  • unaffected by extreme values
  • if the data are skewed it may be a more informative descriptive measure than the mean

10
Q

what are the cons of the median?

A
  • less amenable than the mean to further statistical analysis
11
Q

what are the pros of the mode?

A
  • easy to obtain
  • only measure of central tendency that can be used for data on a nominal scale

12
Q

what are the cons of the mode?

A
  • it is not very stable from sample to sample
  • there may be more than one mode for a particular set of scores
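
A short sketch of a data set with more than one mode, using `statistics.multimode` from the Python standard library (Python 3.8 or later). The colour labels are invented nominal data.

```python
import statistics

# Hypothetical nominal data: labels with no numeric meaning.
colours = ["red", "blue", "red", "green", "blue"]

# multimode returns every value tied for the highest frequency,
# in the order each value first appears in the data.
modes = statistics.multimode(colours)

print(modes)  # ['red', 'blue']
```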

13
Q

what is non skewed data?

A

symmetrical distribution (in a perfect normal distribution the mean, median and mode are equal)

mean is preferred

14
Q

what is skewed data?

A

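asymmetrical distribution, often with a large number of outliers or extreme values on one side

the median is less affected by skewed data and is generally considered to be the best measure of central tendency in this case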

15
Q

when do you use the mode?

A

for nominal data

16
Q

when do you use the median?

A

ordinal

interval/ratio (skewed)

17
Q

when do you use the mean?

A

interval/ratio (non skewed)

18
Q

what are measures of spread?

A

describe how similar or varied the set of observed values are for a particular variable

large spread = large differences between the individual values
it is seen as positive if there is little deviation

19
Q

what are the types of measures of spread?

A
  • range
  • quartiles
  • deviation (absolute deviation, variance, standard deviation)
20
Q

what are the types of deviation?

A

absolute deviation, variance and standard deviation

21
Q

what is the range?

A

difference between the highest and the lowest value

22
Q

what are the pros of range?

A
  • useful when measuring a variable with a critical low/high threshold that should not be crossed e.g. drinking age
  • easy to calculate
23
Q

what are the cons of the range?

A
  • value is sensitive to outliers
24
Q

what are quartiles?

A

tell us about the spread of a data set by breaking it into quarters, just as the median breaks it in half

25
Q

how many quartiles are there?

A

three: Q1, Q2 (the median) and Q3

26
Q

how do you work out the quartiles?

A
  • find the median of the whole data set; this is Q2
  • then work out the median of the values below and above it to find Q1 and Q3
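
The median-of-halves approach on this card can be sketched as below. The function name `quartiles` and the sample values are made up for illustration. This sketch leaves the middle value out of both halves when the number of values is odd; software packages may use other conventions and return slightly different quartiles.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumes at least two values. With an odd count, the middle value
    is excluded from both halves (one common convention among several).
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # values above it (skips the middle value if n is odd)
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

q1, q2, q3 = quartiles([1, 3, 4, 6, 7, 8, 10])
print(q1, q2, q3)  # 3 6 8
```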
27
Q

what is the interquartile range?

A

the difference between the 1st and 3rd quartiles

IQR = Q3 - Q1

28
Q

what is a box plot?

A

visual description of the distribution based on the minimum, Q1, median, Q3 and maximum

29
Q

what is an outlier?

A

an observation which does not appear to belong with the other data (e.g. because of a measurement or recording error, or equipment failure)

outliers have to be found as they can skew the data

30
Q

how do you find outliers?

A

lower fence = Q1 - 1.5 x IQR
upper fence = Q3 + 1.5 x IQR

anything outside of these fences is considered an outlier

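
A minimal sketch of the fence rule. The function name `find_outliers` and the data are hypothetical, and Q1 and Q3 are passed in as given rather than computed here.

```python
def find_outliers(values, q1, q3):
    """Return the values lying outside the 1.5 x IQR fences."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]

# Hypothetical data; Q1 = 3 and Q3 = 8 are assumed to have been worked out already.
# IQR = 5, so the fences are -4.5 and 15.5.
outliers = find_outliers([1, 3, 4, 6, 7, 8, 30], q1=3, q3=8)
print(outliers)  # [30]
```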
31
Q

what is the disadvantage of quartiles?

A

they do not take every score into account, just the values at the 25% cut points

32
Q

what is the difference between mean absolute deviation and standard deviation?

A

both look at the distance of the data from its mean

standard deviation uses the square of each difference
mean absolute deviation uses only the absolute difference

33
Q

how do you work out mean absolute deviation?

A
  • find the mean of all values
  • find the distance of each value from that mean (subtract the mean from each value and ignore the minus sign)
  • find the mean of those distances

```
MAD = SUM OF |x - u| / n

x = each value
u = mean
n = number of values
```
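
The steps on this card can be sketched in plain Python. The function name `mean_absolute_deviation` and the values are invented for illustration.

```python
def mean_absolute_deviation(values):
    """Mean of the absolute distances between each value and the mean."""
    n = len(values)
    mean = sum(values) / n                         # step 1: mean of all values
    distances = [abs(x - mean) for x in values]    # step 2: distances, sign ignored
    return sum(distances) / n                      # step 3: mean of those distances

# Hypothetical values: the mean is 4, the distances are 2, 0, 0, 2.
mad = mean_absolute_deviation([2, 4, 4, 6])
print(mad)  # 1.0
```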
34
Q

what is the variance?

A

the average of the squared differences from the mean

35
Q

how do you calculate the variance?

A
  • work out the mean
  • for each number subtract the mean and square the result
  • then work out the average of those squared differences (but dividing by n - 1 rather than n)

variance = SUM OF (x - u)^2 / (n - 1), where u = mean and n = number of values
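
A sketch of the sample variance calculation described on this card. The function name `sample_variance` and the values are made up. The result is compared with `statistics.variance` from the Python standard library, which also divides by n - 1.

```python
import statistics

def sample_variance(values):
    """Sum of squared differences from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_differences = [(x - mean) ** 2 for x in values]
    return sum(squared_differences) / (n - 1)

# Hypothetical values: the mean is 4 and the squared differences sum to 8.
data = [2, 4, 4, 6]
by_hand = sample_variance(data)          # 8 / 3
library = statistics.variance(data)      # standard-library sample variance

print(by_hand, library)
```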
36
Q

why do you use n-1 for standard deviation and variance?

A

to overcome bias: dividing by n would tend to underestimate the population variance when working from a sample

37
Q

what are the problems with sample variance?

A
  • as the differences are squared, this gives more weight to extreme scores
  • if the data contain outliers, the variance may not represent the data as a whole
  • it is not in the same units as the data (units squared), so it cannot be directly related to the original values
38
Q

what is standard deviation?

A

a measure of the spread of scores within a data set

used in conjunction with the mean to summarise continuous data

normally only appropriate if the data are not skewed and do not have outliers

39
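Q

how do you work out standard deviation?

A

```
(square root of variance)

- work out the mean
- subtract the mean from each value and square the result
- add up all the squared differences
- find the mean of those squared differences (dividing by n - 1 for a sample)
- find the square root
```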
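
The same steps as a short Python sketch. The function name `sample_standard_deviation` and the values are hypothetical, and the n - 1 divisor is assumed, as for a sample. The result is checked against `statistics.stdev` from the standard library.

```python
import math
import statistics

def sample_standard_deviation(values):
    """Square root of the sample variance (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)

# Hypothetical values: the variance is 8 / 3, so the result is its square root.
data = [2, 4, 4, 6]
by_hand = sample_standard_deviation(data)
library = statistics.stdev(data)

print(round(by_hand, 3), round(library, 3))
```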