Theme 3 - stats, how we use it to interpret our data Flashcards

1
Q

give an example of how biology has an inherent variability in its data

A

e.g. women aged 50-70 are screened for breast cancer every three years; about 4 in 100 will test positive, but on further examination only 1 in 4 of those will be a confirmed case

2
Q

what does the standard error of the mean (SEM) do?

A

it indicates how well the sample mean represents the mean of the whole population

3
Q

what is the definition of standard deviation?

A

a measure of how spread out the data values are around the mean; the further the data points lie from the average, the larger the SD

4
Q

how does sample size affect the sample mean?

A

the larger the sample, the closer the sample mean tends to be to the true (population) mean

5
Q

what is the central limit theorem?

A

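the idea that the means of repeated samples cluster around the true mean and, as the sample size increases, their distribution approaches a normal distribution even if the underlying data are not normally distributed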

6
Q

how can you obtain a normal distribution from data that are not normally distributed?

A

by repeatedly sampling the data set, calculating the mean of each sample and plotting those sample means; the distribution of the sample means is approximately normal
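
The repeated-sampling idea can be sketched with the Python standard library. This is only an illustration: the skewed population, the sample size of 50 and the 2,000 repeats are made-up assumptions, not values from the course.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw n_samples random samples and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.mean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical skewed (non-normal) population with a true mean of about 10
_rng = random.Random(1)
population = [_rng.expovariate(1 / 10) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

print("population mean:", round(statistics.mean(population), 2))
print("mean of sample means:", round(statistics.mean(means), 2))
print("SD of population:", round(statistics.stdev(population), 2))
print("SD of sample means:", round(statistics.stdev(means), 2))
```

A histogram of `means` should look roughly bell-shaped and much narrower than the population itself, even though the population is strongly skewed.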

7
Q

how do you calculate the standard error of the mean (SEM)?

A

by taking the SD and dividing it by the square root of the sample size: SEM = SD / √n
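
A minimal sketch of the calculation, using the standard formula SEM = SD / √n. The measurements are hypothetical values from five independent subjects, chosen only to make the arithmetic easy to follow.

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD divided by the square root of n."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two independent values")
    return statistics.stdev(values) / math.sqrt(n)

# Hypothetical measurements, one per independent subject (n = 5)
measurements = [4.0, 5.0, 6.0, 5.0, 5.0]

print("mean:", statistics.mean(measurements))
print("SD:", round(statistics.stdev(measurements), 4))
print("SEM:", round(sem(measurements), 4))
```

Because n sits under a square root, quadrupling the number of subjects only halves the SEM.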

8
Q

what is the SEM?

A

a reflection of how accurately the sample mean estimates the population mean (the smaller the SEM, the more precise the estimate)

9
Q

what does the length of inferential error bars suggest?

A

how much uncertainty there is in the data:
-wide inferential bars indicate large error
-short inferential bars indicate high precision

10
Q

what is n in stats?

A

the number of independent subjects in an experiment and not the number of replicates

11
Q

what are the rules for error bars?

A

-they’re meaningless unless defined in the figure legend
-the n number should always be stated
-they should only be shown for independently repeated experiments and never for replicates

12
Q

what are the two stats tests that will test for normal distribution and how do you know which to use when?

A

-Shapiro-Wilk test (for smaller sample sizes)
-Kolmogorov-Smirnov test (for sample sizes above 50)

13
Q

what is the null hypothesis?

A

there is no difference between the populations/ there’s no effect to be observed

14
Q

what does the experimental hypothesis state?

A

there is a difference between our populations/ an effect is observed

15
Q

what determines the validity of a hypothesis?

A

the inability to prove that it is false: a hypothesis must be testable, and it stands only for as long as attempts to disprove it fail

16
Q

what is a control group? when might it need to be modified?

A

-a control group is a group in which as many variables as possible are kept the same, so that the only thing that differs is the experimental variable (the thing being tested)
-needs to be modified in observational studies by including adjustments in the statistical analysis to account for the confounding variables

17
Q

how does randomisation in a blind study work?

A
  1. details of everyone taking part are put into a computer
  2. the computer puts each person into a treatment group at random
  3. the computer programme takes into account details such as age and e.g. stage of cancer to make sure all groups are as similar as possible
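
The three steps above can be sketched roughly as follows. This is an illustrative stratified randomisation, not the algorithm used by any particular trial (real trials may use minimisation or other schemes); the field names `id`, `age_band` and `stage` and the example participants are hypothetical.

```python
import random
from collections import defaultdict

def stratified_randomise(participants, groups=("treatment", "control"), seed=None):
    """Assign participants to groups at random, balanced within each stratum.

    participants: list of dicts with hypothetical keys 'id', 'age_band', 'stage'.
    Returns a dict mapping participant id -> group name.
    """
    rng = random.Random(seed)

    # Step 3: group people who share the same details (age band, stage)
    strata = defaultdict(list)
    for person in participants:
        strata[(person["age_band"], person["stage"])].append(person)

    # Step 2: within each stratum, shuffle and deal people out to the groups
    allocation = {}
    for members in strata.values():
        rng.shuffle(members)
        start = rng.randrange(len(groups))  # which group receives the first person
        for i, person in enumerate(members):
            allocation[person["id"]] = groups[(start + i) % len(groups)]
    return allocation

# Step 1: details of everyone taking part (made-up example data)
details = [("50-59", "I"), ("50-59", "I"), ("50-59", "II"), ("50-59", "II"),
           ("60-70", "I"), ("60-70", "I"), ("60-70", "II"), ("60-70", "II")]
participants = [
    {"id": i, "age_band": band, "stage": stage}
    for i, (band, stage) in enumerate(details)
]

allocation = stratified_randomise(participants, seed=42)
for person_id, group in sorted(allocation.items()):
    print(person_id, group)
```

Dealing people out within each stratum keeps the groups similar in age and stage, while the shuffle keeps each individual's allocation unpredictable.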
18
Q

what is a statistical test?

A

a test to determine whether the observed findings are applicable to the wider population and not simply due to chance

19
Q

what is a type 1 error in statistical inference?

A

a false positive error: there is no real difference but we see one, so we rejected the null hypothesis when we should have accepted it

20
Q

what is a type 2 error?

A

a false negative error: there was a real difference but we did not detect it, so we accepted the null hypothesis when we should have rejected it

21
Q

what is the alpha value?

A

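the chance of making a type 1 error that we are prepared to accept, i.e. the significance threshold (commonly 0.05) against which the p value is compared; it is often confused with the p value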

22
Q

how is the alpha value different to the p value in statistical inference?

A

the alpha value is set prior to the study, whereas the p value is calculated from the observed results after the study is completed

23
Q

what is the beta value? how can you use this to find the ‘power’ of the study?

A

the chance of making a type 2 error; the power of the study is 1 minus the beta value (power = 1 - β)

24
Q

why will the p value never be 0?

A

there is always some chance, however small, that the observed result arose by chance alone, so there is always a chance of making a type 1 error

25
Q

what is the definition of the p value?

A

the probability of obtaining the observed data (or something more extreme) by chance alone when the null hypothesis is true
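
As a rough illustration of this definition, the sketch below computes an exact one-sided p value for a made-up experiment: 9 successes out of 10 trials, where the null hypothesis says each trial has a 50% chance of success. The scenario, the function name and the 0.05 threshold are assumptions chosen for the example.

```python
from math import comb

def one_sided_binomial_p(successes, trials, p_null=0.5):
    """Probability of seeing at least `successes` out of `trials`
    if the null hypothesis (success probability p_null) is true."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

alpha = 0.05  # significance threshold, chosen before the study
p_value = one_sided_binomial_p(successes=9, trials=10)

print("p value:", round(p_value, 4))
print("statistically significant:", p_value < alpha)
```

The p value here is small but not zero: even when the null hypothesis is true, a result this extreme still turns up occasionally by chance.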

26
Q

what does it mean when the p value is less than 0.05?

A

the p value is sufficiently low that we can be confident the result being seen is real and not due to chance, hence it is statistically significant

27
Q

what is the definition of sensitivity?

A

the proportion of diseased individuals that are correctly identified to have the condition

28
Q

what is sensitivity as a proportion?

A

a/(a+c), where a = true positives and c = false negatives

29
Q

what is the definition of specificity?

A

the proportion of non-diseased individuals that are correctly identified to not have the condition

30
Q

what is specificity as a proportion?

A

d/(b+d), where d = true negatives and b = false positives

31
Q

what is the positive predictive value (PPV)?

A

the proportion of individuals with a positive test result that actually have the disease

32
Q

what is the equation for the positive predictive value (PPV) as a proportion?

A

a/(a+b), where a = true positives and b = false positives

33
Q

what is the negative predictive value (NPV)?

A

the proportion of individuals with a negative test result that actually do not have the disease

34
Q

what is the equation for negative predictive value (NPV) as a proportion?

A

d/(c+d), where d = true negatives and c = false negatives
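
The four proportions can be computed together from one 2x2 table. The sketch assumes the usual layout implied by the card formulas (a = true positives, b = false positives, c = false negatives, d = true negatives); the counts are invented for illustration.

```python
def diagnostic_metrics(a, b, c, d):
    """Test accuracy measures from a 2x2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c),  # diseased people correctly identified
        "specificity": d / (b + d),  # non-diseased people correctly identified
        "ppv": a / (a + b),          # positive results that are true
        "npv": d / (c + d),          # negative results that are true
    }

# Hypothetical screening results for 1,000 people
metrics = diagnostic_metrics(a=90, b=50, c=10, d=850)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```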

35
Q

what should the sensitivity and specificity of a good test be?

A

both the sensitivity and the specificity should be high

36
Q

how do sensitivity and specificity affect PPV and NPV?

A

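-high sensitivity = high NPV (few false negatives, so a negative result can be trusted)
-high specificity = high PPV (few false positives, so a positive result can be trusted)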

37
Q

how are PPV and NPV influenced by the prevalence of the condition?

A

a high prevalence will increase the PPV and reduce the NPV, while a low prevalence will lower the PPV and increase the NPV
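
A short sketch of the prevalence effect, holding the test itself fixed. The 90% sensitivity and specificity and the three prevalence values are arbitrary assumptions for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Same hypothetical test (90% sensitive, 90% specific) at three prevalences
results = {
    prevalence: predictive_values(0.90, 0.90, prevalence)
    for prevalence in (0.01, 0.10, 0.50)
}

for prevalence, (ppv, npv) in results.items():
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With a rare condition most positive results are false positives, which is why the PPV is low at low prevalence even for a fairly accurate test.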