Interpreting Data Flashcards Preview

Flashcards in Interpreting Data Deck (50):
1

What are the two main types of data?

Qualitative and quantitative

2

What are the two types of quantitative data?

Discrete and continuous

3

What are the two types of qualitative data?

Nominal (unordered) and ordinal (ordered)

4

What is nominal data split into?

Binary and categorical

5

What is the median?

Middle value when values ordered from smallest to largest

6

What is the median of the following values?
2, 3, 6, 7, 10, 11, 14

7

7

What is the mode?

Most common value

8

What is the mean?

The average. It is the sum of all the values divided by the number of values.

9

Calculate the mean.
2, 3, 4, 7, 8, 8, 11

6.1
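
The answers to cards 6 and 9 can be checked with a short Python sketch. It uses only the standard library `statistics` module; the variable names are illustrative, and the two lists are the ones given on those cards.

```python
import statistics

# Values from card 6 (median) and card 9 (mean).
median_values = [2, 3, 6, 7, 10, 11, 14]
mean_values = [2, 3, 4, 7, 8, 8, 11]

median_result = statistics.median(median_values)  # middle of the sorted values
mean_result = statistics.mean(mean_values)        # sum of values / number of values
mode_result = statistics.mode(mean_values)        # most common value

print(median_result)          # 7
print(round(mean_result, 1))  # 6.1
print(mode_result)            # 8
```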

10

What does the standard deviation measure?

The spread of the data: roughly, the typical distance of the values from the mean

11

How is standard deviation calculated?

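Subtract the mean from each value and square the result. Add up these squared differences and divide by the number of values (or by the number of values minus 1 when the data are a sample). Then take the square root of this answer.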
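
As a minimal sketch of that calculation in Python, the two hypothetical helper functions below follow the steps one at a time. The data are borrowed from card 9 for illustration only. Dividing by n gives the population form; dividing by n - 1 is the usual choice when the values are a sample.

```python
import math
import statistics

def population_sd(values):
    """Square root of the mean squared distance from the mean (divide by n)."""
    mean = sum(values) / len(values)
    squared_distances = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_distances) / len(values))

def sample_sd(values):
    """Same steps, but divide by n - 1 (usual when the values are a sample)."""
    mean = sum(values) / len(values)
    squared_distances = [(v - mean) ** 2 for v in values]
    return math.sqrt(sum(squared_distances) / (len(values) - 1))

values = [2, 3, 4, 7, 8, 8, 11]  # illustrative data
print(round(population_sd(values), 2))
print(round(sample_sd(values), 2))
```

Both helpers should agree with `statistics.pstdev` and `statistics.stdev` respectively.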

12

What centile is the median?

50th

13

What is the interquartile range?

The range from the 25th centile to the 75th centile (the middle 50% of the values)

14

When is it better to use a median rather than a mean?

When the data contain outliers, i.e. values very different from the rest of the data. The median is not pulled towards extreme values in the way the mean is.

15

When is it better to use IQR rather than the standard deviation?

To avoid the influence of outliers
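
The effect of an outlier on each summary can be illustrated with a small Python sketch. The second list is hypothetical: it is the first list with its largest value replaced by an extreme one. The `iqr` helper is an assumption of this sketch, and quartile conventions differ slightly between software packages, so other tools may report slightly different quartiles.

```python
import statistics

def iqr(values):
    """Distance between the 25th and 75th centiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

clean = [2, 3, 4, 7, 8, 8, 11]
with_outlier = [2, 3, 4, 7, 8, 8, 110]  # hypothetical: last value is an outlier

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(
        label,
        "| mean:", round(statistics.mean(data), 1),
        "| SD:", round(statistics.stdev(data), 1),
        "| median:", statistics.median(data),
        "| IQR:", iqr(data),
    )
```

In this example the mean and standard deviation change substantially, while the median and IQR are unchanged.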

16

What is the Gaussian distribution determined by?

Mean and standard deviation

17

If the mean is reduced from 120 to 110, what happens to the Gaussian distribution?

It shifts to the left.

18

If the mean is increased from 120 to 130, what happens to the Gaussian distribution?

It shifts to the right.

19

What happens to the Gaussian distribution if the standard deviation is decreased from 15 to 10?

The curve becomes narrower and taller

20

What happens to the Gaussian distribution if the standard deviation is increased from 15 to 20?

The curve becomes wider and flatter

21

What is a useful property of Gaussian distributions?

A fixed proportion of values lies within any specified number of standard deviations above or below the mean. This is the basis for reference ranges.

22

What percentage of values in a Gaussian distribution lie within one standard deviation of the mean?

About 68%

23

What percentage of values in a Gaussian distribution lie within 1.64 standard deviations of the mean?

About 90%

24

What percentage of values in a Gaussian distribution lie within 1.96 standard deviations of the mean?

About 95%

25

What is the 99% range? How is it calculated?

0.5th centile to 99.5th centile
Mean +/- 2.58 SDs

26

What is the 95% range? How is it calculated?

2.5th centile to 97.5th centile
Mean +/- 1.96 SDs

27

What is the 90% range? How is it calculated?

5th centile to 95th centile
Mean +/- 1.64 SDs
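
The multipliers 1.64, 1.96 and 2.58 can be recovered from the Gaussian distribution itself. The sketch below uses `statistics.NormalDist` from the Python standard library. The mean of 120 and SD of 15 are illustrative values, and the helper names are assumptions of this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def multiplier(coverage):
    """Number of SDs either side of the mean that contains `coverage` of values."""
    upper_centile = 0.5 + coverage / 2  # e.g. 0.975 for a 95% range
    return standard_normal.inv_cdf(upper_centile)

def reference_range(mean, sd, coverage):
    """Lower and upper limits of the range: mean +/- multiplier x SD."""
    z = multiplier(coverage)
    return mean - z * sd, mean + z * sd

for coverage in (0.90, 0.95, 0.99):
    low, high = reference_range(120, 15, coverage)
    print(f"{coverage:.0%} range: mean +/- {multiplier(coverage):.2f} SDs "
          f"= {low:.1f} to {high:.1f}")

# Proportion of values within one SD of the mean.
within_one_sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"within 1 SD: {within_one_sd:.0%}")
```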

28

If the sample size isn't too small then the distribution of the sample mean will be...?

Approximately Gaussian

29

What is the standard error?

The standard deviation of the (Gaussian) distribution of a sample estimate, such as the sample mean. It is a measure of the statistical accuracy of the estimate.

30

What is the standard error of the mean?

The standard deviation of the distribution of all possible sample means. It cannot be measured directly in practice, because only one sample is usually available, so it is estimated.

31

How is standard error of the mean estimated?

Sample standard deviation divided by the square root of the sample size (SD / √n).

32

How is the 95% confidence interval of a sample mean calculated?

95% CI = sample mean +/- (1.96 x standard error)
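
A minimal Python sketch of this calculation follows. The sample values are hypothetical and the function names are assumptions of this sketch. It uses the 1.96 multiplier from the card, which relies on the Gaussian approximation; for small samples a slightly larger multiplier from the t-distribution is commonly used instead.

```python
import math
import statistics

def standard_error(values):
    """Estimated standard error of the mean: SD / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

def confidence_interval_95(values):
    """Sample mean +/- 1.96 x standard error."""
    mean = statistics.mean(values)
    margin = 1.96 * standard_error(values)
    return mean - margin, mean + margin

# Hypothetical measurements, for illustration only.
sample = [118, 125, 109, 131, 122, 115, 127, 120, 113, 124]

low, high = confidence_interval_95(sample)
print(f"mean = {statistics.mean(sample):.1f}")
print(f"standard error = {standard_error(sample):.2f}")
print(f"95% CI = {low:.1f} to {high:.1f}")
```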

33

What does the 95% confidence interval mean?

If we took many samples of the same size and calculated a 95% confidence interval from each, we would expect about 95% of those intervals to contain the true population mean.
Informally: we are 95% confident that the population mean could be as low as ___ or as high as ___.

34

When calculating confidence intervals and ranges, what should be used for each?

Standard deviation for ranges (the spread of individual values)
Standard error for confidence intervals (the precision of an estimate)

35

When the sample size increases, the 95% range…

Stays the same

36

When the sample size increases, the 95% confidence interval…

Gets narrower
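
The contrast between cards 35 and 36 can be sketched in Python. The mean of 120 and SD of 15 are illustrative, and the sketch assumes the sample SD stays the same as the sample grows, which is only approximately true in practice. The function names are assumptions of this sketch.

```python
import math

MEAN, SD = 120, 15  # illustrative values

def range_95(mean, sd):
    """95% range: uses the standard deviation, so sample size plays no part."""
    return mean - 1.96 * sd, mean + 1.96 * sd

def ci_95(mean, sd, n):
    """95% confidence interval: uses the standard error, SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    return mean - 1.96 * se, mean + 1.96 * se

for n in (25, 100, 400):
    r_low, r_high = range_95(MEAN, SD)
    c_low, c_high = ci_95(MEAN, SD, n)
    print(f"n={n:>3}: 95% range {r_low:.1f} to {r_high:.1f}, "
          f"95% CI {c_low:.1f} to {c_high:.1f}")
```

Quadrupling the sample size halves the width of the confidence interval, because the standard error depends on the square root of n.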

37

What is ‘r’? What two values is it always between?

Correlation coefficient
-1 and 1

38

What does r=1 tell you?

Perfect positive correlation

39

What does r=-1 tell you?

Perfect negative correlation

40

What does r=0 tell you?

No linear correlation
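
The three cases on cards 38 to 40 can be reproduced with a hand-written Pearson correlation in Python. The data sets are hypothetical and chosen to give the extreme values; `pearson_r` is a name introduced for this sketch. The third data set shows that r = 0 rules out a straight-line relationship only, since its points follow a clear curve.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])  # y rises with x on a straight line
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])  # y falls with x on a straight line
r_zero = pearson_r(xs, [4, 1, 0, 1, 4])       # U-shaped: related, but not linearly

print(r_positive, r_negative, r_zero)
```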

41

What is the equation for a linear regression?

y = a + bx,
where y is the outcome, x is the predictor, a is the intercept and b is the slope

42

What does the line of best fit do?

Minimises the sum of the squared vertical distances between the data points and the line
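
A minimal least-squares sketch in Python shows how a and b in y = a + bx can be found, and that moving the line away from the fitted values increases the sum of squared vertical distances. The data and function names are hypothetical, and the closed-form formulas used here are the standard ones for a single predictor.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept: the line passes through the two means
    return a, b

def sum_squared_distances(xs, ys, a, b):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [1, 2, 3, 4, 5]                 # predictor (horizontal axis)
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # outcome (vertical axis), hypothetical

a, b = fit_line(xs, ys)
best = sum_squared_distances(xs, ys, a, b)
print(f"y = {a:.2f} + {b:.2f}x, sum of squared distances = {best:.3f}")

# Any other line gives a larger sum of squared distances.
print(sum_squared_distances(xs, ys, a + 0.1, b) > best)
print(sum_squared_distances(xs, ys, a, b + 0.1) > best)
```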

43

Regression - whatever we are predicting, should it be on the vertical or horizontal axis?

Vertical

44

Statistical significance - what does this mean and how is it determined?

An observed difference between groups in a sample might be due to chance. Statistically significant means the result is unlikely to be due to chance alone.
It is assessed using confidence intervals and p-values.

45

What does a p-value mean?

A p-value for a result is the probability of observing a result as extreme as, or more extreme than, the sample result if the null hypothesis (the underlying assumption about the population, e.g. that there is no difference) is true.

46

What does the p-value have to be less than to be statistically significant?

0.05 (by convention)

47

When can p-values be calculated?

When there is a comparison:
2 means: are they different, i.e. is their difference different from 0?
Association: are the observed results different from those expected?
Regression: is the slope different from 0?

48

How are p-values calculated?

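Using a statistical test chosen to suit the comparison, e.g. the chi-squared test for an association between categorical variables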

49

If the 95% CI for a difference excludes 0 then what can be said about the p-value?

p<0.05

50

If the 95% CI for a difference contains 0 then what can be said about the p-value?

p≥0.05
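
The link between the 95% confidence interval and the p-value on cards 49 and 50 can be sketched in Python. This sketch assumes a two-sided test based on the Gaussian approximation (a z-test), which the cards do not specify. The differences, standard errors and function names are hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(difference, standard_error):
    """Probability of a difference at least this far from 0 if the true difference is 0."""
    z = abs(difference / standard_error)
    return 2 * (1 - NormalDist().cdf(z))

def ci_95(difference, standard_error):
    """95% confidence interval for the difference."""
    margin = 1.96 * standard_error
    return difference - margin, difference + margin

# Hypothetical (difference, standard error) pairs.
for difference, se in ((5.0, 2.0), (3.0, 2.0)):
    low, high = ci_95(difference, se)
    p = two_sided_p_value(difference, se)
    excludes_zero = low > 0 or high < 0
    print(f"difference {difference}: 95% CI {low:.2f} to {high:.2f}, "
          f"excludes 0: {excludes_zero}, p = {p:.3f}")
```

In the first case the interval excludes 0 and p is below 0.05; in the second the interval contains 0 and p is above 0.05.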