descriptive stats Flashcards

1
Q

What 3 factors should you take into account when designing a study

A

the type of data
whether you are looking for a difference or a relationship
the number of groups or variables

2
Q

what are the 2 types of data

A

measurement and categorical

3
Q

what is measurement data

A

quantitative data: the result of measuring something (e.g. a score or a time)

4
Q

what is categorical data

A

qualitative (frequency or count) data: observations sorted into categories

5
Q

what are the 4 types of scales

A

nominal
ordinal
interval
ratio

6
Q

what is a nominal scale and when is it used

A

used for categorical data; the values are only labels for categories and carry no order or magnitude

7
Q

why shouldn’t you calculate numerical summaries such as the mean for categorical data

A

the category codes are arbitrary labels, so the result is a nonsensical number

8
Q

define ordinal scales and what they’re used for

A

used to order objects along a continuum by rank

give no information about the size of the differences between scale points

9
Q

give an example of a study using ordinal scales

A

Holmes and Rahe (1967): life events ranked by how stressful they are

10
Q

define interval scales and what they’re used for

A

used when equal intervals between scale points represent equal differences
do not allow statements about ratios, because the 0 point on the scale is arbitrary
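As an illustration that is not taken from the cards, temperature in Celsius is a common example of an interval scale, while Kelvin has a true zero. The sketch below uses a hypothetical helper, `celsius_to_kelvin`, and made-up temperatures to show that differences are preserved when the zero point moves but ratios are not.

```python
import math

def celsius_to_kelvin(celsius):
    """Convert Celsius (arbitrary zero) to Kelvin (true zero)."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0  # illustrative temperatures, not from the cards

# Differences survive the change of zero point...
diff_celsius = warm_c - cool_c
diff_kelvin = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# ...but ratios do not, so "twice as hot" is not a valid claim in Celsius.
ratio_celsius = warm_c / cool_c
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(math.isclose(diff_celsius, diff_kelvin))
print(ratio_celsius, round(ratio_kelvin, 3))
```

The Celsius ratio is 2, yet the same two temperatures on the Kelvin scale differ by only a few percent, which is why ratio statements need a true zero.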

11
Q

define arbitrary

A

not based on system or reason

12
Q

define ratio scales and what they’re used for

A

have a true zero point, so ratios between values are meaningful

a true zero corresponds to the absence of the thing being measured

13
Q

what are the aims of descriptive statistics

A

to characterise a numerical dataset representatively
to condense a lot of information meaningfully
to minimise the error involved in the condensing process

14
Q

what are inferential statistics

A

the goal is to infer characteristics of a whole population from a sample, making probable rather than certain assertions
use sample statistics to estimate population parameters
rely on theoretical sampling distributions, built from innumerable random samples
use p-values and confidence intervals

15
Q

what are the main categories of descriptive statistics

A

measures of central tendency and measures of dispersion

16
Q

what are the 3 measures of central tendency

A

mean
median
mode

17
Q

what are the measures of dispersion

A

range
IQR
variance
standard deviation

18
Q

what is the mean; give the equation

A

average score; calculated as the sum of the scores divided by the number of scores

mean = Σx / N
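A minimal sketch of the formula above, using made-up scores; the function name `mean` and the data are only for illustration.

```python
# Hypothetical example scores, chosen only for illustration.
scores = [4, 8, 6, 5, 7]

def mean(values):
    """Return the sum of the scores divided by the number of scores."""
    return sum(values) / len(values)

print(mean(scores))  # 30 / 5
```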

19
Q

When is the mean most useful and why

A

For normal/symmetric distributions, the mean is the most efficient measure of central tendency and the least subject to sampling fluctuations

20
Q

what are the disadvantages of using the mean

A

greatly influenced by extreme scores

can misrepresent the typical score when the distribution is skewed

21
Q

how can you tell if the mean is an appropriate measure to use on a dataset

A

plot a histogram; if the data are roughly symmetrical, the mean is appropriate

22
Q

what type of distribution is unsuitable for the mean

A

skewed distributions

23
Q

what is the median

A

the central value when all scores are arranged in order

24
Q

why and when is the median useful

A

less sensitive to extreme scores, so it gives a more accurate representation of the data
a better measure than the mean for highly skewed distributions
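The sensitivity of the mean to extreme scores can be seen in a small sketch. The scores below are invented for illustration; `statistics.mean` and `statistics.median` come from the Python standard library.

```python
from statistics import mean, median

typical = [20, 22, 23, 25, 30]   # hypothetical scores
with_outlier = typical + [200]   # the same scores plus one extreme value

# The mean is dragged towards the extreme score; the median barely moves.
print(mean(typical), median(typical))
print(round(mean(with_outlier), 2), median(with_outlier))
```

One extreme score more than doubles the mean here, while the median shifts by a single point.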

25
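what is the median formula
(N + 1) / 2 gives the rank position of the median in the ordered scores; if N is even, average the two scores either side of that position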
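A short sketch of the rank-position approach, assuming a non-empty list of numeric scores; `median_by_rank` is a hypothetical name used only here. When the position falls between two scores (even N), the sketch averages the two neighbours.

```python
def median_by_rank(values):
    """Find the median via the (N + 1) / 2 rank position (1-based)."""
    if not values:
        raise ValueError("need at least one score")
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2
    if position.is_integer():
        # Odd N: the position lands exactly on one score.
        return ordered[int(position) - 1]
    # Even N: the position falls halfway between two scores.
    lower = ordered[int(position) - 1]
    upper = ordered[int(position)]
    return (lower + upper) / 2

print(median_by_rank([7, 1, 5, 3, 9]))      # position 3 of 1, 3, 5, 7, 9
print(median_by_rank([7, 1, 5, 3, 9, 11]))  # position 3.5, between 5 and 7
```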
26
define mode
most common score
27
what happens if you have 2 adjacent modes
take their average: (mode 1 + mode 2) / 2
28
what happens if you have 2 nonadjacent modes
the distribution is bimodal; report both modes
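The mode rules on these cards can be sketched as follows. The helper `summarise_mode` is hypothetical, and treating "adjacent" as "next to each other among the distinct sorted values" is an assumption of this sketch. `statistics.multimode` requires Python 3.8 or later.

```python
from statistics import multimode

def summarise_mode(values):
    """Return one mode, the average of two adjacent modes, or a list of modes.

    Assumes a non-empty list of numeric scores.
    """
    modes = sorted(multimode(values))
    if len(modes) == 1:
        return modes[0]
    distinct = sorted(set(values))
    first = distinct.index(modes[0])
    # Assumption: two modes are "adjacent" if no other value lies between them.
    adjacent = len(modes) == 2 and distinct[first + 1] == modes[1]
    if adjacent:
        return (modes[0] + modes[1]) / 2
    return modes  # non-adjacent modes: report them all (bimodal or multimodal)

print(summarise_mode([1, 2, 2, 3, 4]))              # single mode
print(summarise_mode([1, 2, 2, 3, 3, 4]))           # adjacent modes 2 and 3
print(summarise_mode([1, 1, 1, 2, 3, 4, 5, 5, 5]))  # non-adjacent modes
```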
29
what are the 2 defining features of measures of central tendency
they indicate typical values and are summarised by a single number
30
what are the suitable summary descriptions for categorical data
frequencies, percentages and the mode
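A minimal sketch of these three summaries for categorical data, using `collections.Counter` from the standard library. The responses are invented for illustration.

```python
from collections import Counter

# Hypothetical categorical responses (nominal labels).
responses = ["cat", "dog", "dog", "fish", "dog", "cat"]

counts = Counter(responses)            # frequencies per category
total = sum(counts.values())
percentages = {label: 100 * n / total for label, n in counts.items()}
mode_category = counts.most_common(1)[0][0]  # most frequent category

print(dict(counts))
print({label: round(p, 1) for label, p in percentages.items()})
print(mode_category)
```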
31
what are measures of variability
they describe the degree to which values vary (how spread out they are)
32
what is the range and how is it computed
measure of the distance from the lowest to the highest score; max value - min value
33
what are the disadvantages of using range
extreme values/outliers distort it | unstable across different samples
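The range calculation and its sensitivity to a single outlier can be sketched like this; `score_range` and the scores are hypothetical.

```python
def score_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

scores = [12, 15, 11, 18, 14]        # hypothetical scores
scores_with_outlier = scores + [95]  # same scores plus one extreme value

print(score_range(scores))               # 18 - 11
print(score_range(scores_with_outlier))  # 95 - 11
```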
34
what is the real advantage of range
straightforward to calculate and easy to interpret
35
what is the IQR, what does it use and how is it calculated
the IQR is the distance needed to cover the middle 1/2 of the scores. It uses percentiles: it is computed as the difference between the 75th percentile (often called Q3) and the 25th percentile (Q1), i.e. Q3 - Q1. The semi-interquartile range is one half of this: (Q3 - Q1)/2.
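A hedged sketch of the quartile-based spread measures, using `statistics.quantiles` (Python 3.8+). Its default "exclusive" method places quartiles using N + 1 rank positions; other software may use different conventions and give slightly different quartiles. The data are invented.

```python
from statistics import quantiles

# Hypothetical scores (N = 11).
scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 13]

q1, q2, q3 = quantiles(scores, n=4)  # 25th, 50th and 75th percentiles

iqr = q3 - q1        # interquartile range
semi_iqr = iqr / 2   # semi-interquartile range

print(q1, q2, q3)
print(iqr, semi_iqr)
```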
36
how does the IQR behave in a symmetric vs a skewed distribution
In a symmetric distribution, an interval stretching from one semi-interquartile range below the median to one semi-interquartile above the median will contain 1/2 of the scores. This will not be true for a skewed distribution, however.
37
what are the advantages of IQR and what kind of distribution is it useful in
little affected by extreme scores; good measure of spread for skewed distributions.
38
what is the calculation for the separate IQR/percentiles
first divide the percentile by 100 (e.g. 50th percentile = 0.50); then multiply by (N + 1) to get the rank, e.g. 0.50 * (N + 1) = rank X; then go along the ordered dataset and read off the score at that rank position
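The rank method on this card can be sketched as below. `percentile_value` is a hypothetical helper; linear interpolation for fractional ranks and clamping the rank to the range 1..N are assumptions of this sketch, not something the card specifies.

```python
def percentile_value(values, percentile):
    """Score at a given percentile using rank = (percentile / 100) * (N + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    rank = (percentile / 100) * (n + 1)  # 1-based rank position
    rank = min(max(rank, 1), n)          # assumption: clamp to a valid rank
    lower = int(rank)                    # whole part of the rank
    fraction = rank - lower
    if fraction == 0:
        return ordered[lower - 1]
    # Assumption: interpolate linearly between the two neighbouring scores.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 12, 13]  # hypothetical, N = 11
print(percentile_value(scores, 25))  # rank 3
print(percentile_value(scores, 50))  # rank 6
print(percentile_value(scores, 75))  # rank 9
```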
39
what is the disadvantage for IQR in normal distributions
more subject to sampling fluctuation in normal distributions than the standard deviation and therefore not often used for data that are approximately normally distributed.
40
define variance
measure of how much scores vary in terms of distance from the mean: the average of each score's squared deviation from the mean
41
what is variance formula
σ² = Σ(x - mean)² / N
42
how does variance formula change when computing for sample vs population
divide by N - 1 for a sample | divide by N for a population
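A minimal sketch of the two variance formulas, checked against `statistics.pvariance` and `statistics.variance` from the standard library. The function `variance_manual` and the scores are hypothetical.

```python
import math
from statistics import pvariance, variance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

def variance_manual(values, sample=False):
    """Average squared deviation from the mean.

    Divides by N for a population, or by N - 1 when the scores are a
    sample used to estimate the population variance.
    """
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return sum_of_squares / (n - 1 if sample else n)

print(variance_manual(scores))               # population formula: 32 / 8
print(variance_manual(scores, sample=True))  # sample formula: 32 / 7
```

The sample version is always slightly larger, because dividing by N - 1 compensates for the sample mean sitting closer to the sample scores than the population mean would.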
43
when do you use sample variance formula
when you have taken a sample and want to generalise to the wider population, i.e. to estimate the population variance
44
what is standard deviation
square root of variance
45
what does a bigger SD value mean
the values are more spread out around the mean
46
what is the equation for sample and population SD
Population: σ = √σ² | Sample: s = √s²
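A short sketch of both standard deviations as square roots of the corresponding variances, compared with `statistics.pstdev` and `statistics.stdev`. The helper name and the scores are made up for illustration.

```python
import math
from statistics import pstdev, stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

def standard_deviation(values, sample=False):
    """Square root of the variance (N - 1 divisor when sample=True)."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    variance = sum_of_squares / (n - 1 if sample else n)
    return math.sqrt(variance)

print(standard_deviation(scores))                         # population SD
print(round(standard_deviation(scores, sample=True), 3))  # sample SD
```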
47
what can you do if you know the SD and mean in normal distribution
possible to compute the percentile rank associated with any given score
48
in a normal distribution, how many of the scores are within 1 SD of the mean
approximately 68%
49
in a normal distribution, how many of the scores are within 2 SDs of the mean
approximately 95%
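The last three cards can be checked with `statistics.NormalDist` (Python 3.8+). The mean of 100 and SD of 15 are hypothetical values chosen for illustration, as is the score of 120.

```python
from statistics import NormalDist

# Hypothetical normal distribution: mean 100, SD 15.
dist = NormalDist(mu=100, sigma=15)

# Proportion of scores within 1 and 2 SDs of the mean.
within_1_sd = dist.cdf(115) - dist.cdf(85)
within_2_sd = dist.cdf(130) - dist.cdf(70)

# Percentile rank of a given score: percentage of scores at or below it.
percentile_rank = 100 * dist.cdf(120)

print(round(within_1_sd, 3), round(within_2_sd, 3))
print(round(percentile_rank, 1))
```

The two proportions come out close to 0.68 and 0.95, and a score of 120 sits at roughly the 91st percentile under these assumed parameters.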
50
why is SD useful
used in many inferential stats tests
51
what is a disadvantage of the SD and how can this be overcome
not a good measure of spread in highly skewed distributions | supplement it with the IQR