Biostatistics Flashcards

(96 cards)

1
Q

What are some issues with clinical studies

A

My patient is a unique individual so the study may not encompass the patient

Study attempts to estimate for the whole population

Study looks at a sample which may not be representative of the population or my patient

2
Q

What percentage of values lie within ±1 SD of the mean in a normal distribution

A

68%

3
Q

What percentage of values lie within 2 SDs of the mean if the population is normally distributed

A

95%
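
The 68% and 95% figures can be checked numerically. The sketch below assumes a perfect normal distribution and uses the error function from the standard library; the function name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

# Print the fraction for 1 SD, 2 SDs and the 1.96 SDs used for 95% intervals.
for k in (1, 2, 1.96):
    print(f"within {k} SD: {fraction_within(k):.4f}")
```

Within 2 SDs the exact figure is slightly above 95%; the 95% value corresponds to roughly 1.96 SDs.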

4
Q

Quick checks for normal distribution

A

Mean=median=mode

IQR is symmetric around the mean

2/3 of data within 1 SD of mean

5
Q

What does parametric mean

A

Assumes the data follow a known distribution, usually the normal distribution

6
Q

What is ordinal data

A

Ranked data

Eg a scale of 1 to 10

Rarely normally distributed

A score of 8 does not necessarily mean 2x a score of 4

7
Q

What is nominal data

A

Categorical data

Eg colour

8
Q

Which line is the median in a box and whisker plot

IQR?

A

The centre line

The box contains the IQR

9
Q

What are error bars

A

An indication of the uncertainty

10
Q

What is SEM

A

Standard Error of the Mean

SEM = σ / √n

11
Q

What is the usual % confidence

What does this mean

A

95% confidence

If you repeatedly sampled the population and built an interval (sample mean +/- margin) each time, 95% of those intervals would contain the true mean
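
The repeated-sampling idea can be simulated. This is a minimal sketch, assuming a normal population with a known SD so that the interval is simply sample mean ± 1.96 × SD/√n; the population values (mean 100, SD 15) and the function name are hypothetical.

```python
import math
import random

def ci_coverage(true_mean=100.0, sd=15.0, n=30, repeats=2000, seed=0):
    """Fraction of repeated 95% intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    half_width = 1.96 * sd / math.sqrt(n)  # interval is mean +/- half_width
    hits = 0
    for _ in range(repeats):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / repeats

print(f"coverage: {ci_coverage():.3f}")  # expected to be close to 0.95
```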

12
Q

What does null hypothesis mean?

A

The means of the 2 groups are not different

13
Q

What are the 2 questions we ask when considering if the difference is meaningful

A

Is the difference due to chance?

Is the difference big enough to be useful?

14
Q

Why do we obtain the statistical significance

A

To estimate how likely it is that the difference arose by chance, and to check whether that probability is below 0.05

15
Q

What is p value also called

What does it mean

A

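It is compared against the α value (the significance threshold)

The probability of seeing a difference at least this large by chance if the null hypothesis is true; α is the accepted chance of a false positive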

16
Q

What does it mean if H1 is true

A

The means are different

17
Q

What is a type 1 error

A

False positive

It is the incorrect rejection of a true null hypothesis

18
Q

What is a type 2 error

A

False negative

The failure to reject a false null hypothesis

19
Q

What are α and β in terms of hypotheses

A
α = chance of a type 1 error happening 
β = the chance of a type 2 error happening
20
Q

What is the power of a study

A

1-β

21
Q

Why is a power calculation important

A

If power was ≥80% then a negative result is probably true and there is no meaningful difference

If power was <80% then we can’t draw a conclusion from the negative result

22
Q

When do you perform statistical testing

What about a power calculation

A

Post hoc on actual data

Power should be estimated in advance but actual power can be calculated after

23
Q

What affects the power

A

The magnitude of difference

Distribution

Number of measurements made

24
Q

If there is a smaller difference in means, what does this mean for the sample size

What about for variance

A

It increases sample size requirement

The greater the variance, the larger the sample size needed
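
The effect of difference and variance on sample size can be sketched with a common normal-approximation formula for comparing 2 means, n per group = 2 × ((z for α/2 + z for power) × SD / difference)². This formula and the function name `n_per_group` are not from the cards; they are an illustrative assumption (two-sided α = 0.05, power = 80%, equal group sizes).

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect a mean difference delta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

print(n_per_group(delta=5, sd=10))    # baseline
print(n_per_group(delta=2.5, sd=10))  # smaller difference, more subjects
print(n_per_group(delta=5, sd=20))    # greater variance, more subjects
```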

25
How do we correct for multiple testing
Bonferroni correction: the significance threshold is reduced by dividing it by the number of tests performed. E.g. for 20 tests, threshold = 0.05/20 = 0.0025
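
The arithmetic of the correction is small enough to sketch directly. The function names and the example p values are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Significance threshold after dividing by the number of tests."""
    return alpha / n_tests

def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag which p values fall below the corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p < threshold for p in p_values]

print(bonferroni_threshold(0.05, 20))
print(significant_after_bonferroni([0.001, 0.02, 0.04, 0.3]))
```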
26
Does independence come into hypothesis testing
Yes! It is assumed tests are independent!
27
What is an unpaired T test used for
Compare the means of 2 independent groups of measurements which are homoscedastic
28
What is assumed in an unpaired t test
Independence; normally distributed data; equal variance in both groups; usually 2-tailed
29
What word describes 2 groups with the same variance
Homoscedastic
30
How do you test whether 2 groups are homoscedastic? What test is used if they are not?
Levene’s test or the F test. If not homoscedastic, use a modified t test
31
What is variance
The average of the squared differences from the mean; a measure of spread around the mean; the standard deviation squared
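
A short worked example of this definition, using made-up data and the population form of the variance (dividing by n; a sample estimate would divide by n - 1):

```python
import math

def population_variance(values):
    """Average of the squared differences from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements, mean = 5
variance = population_variance(data)
sd = math.sqrt(variance)           # the SD is the square root of the variance
print(variance, sd)
```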
32
What do we usually want to know from 2 groups if using a 2-tailed test
If the 2 means are different
33
Give an example of a one-tailed test
Effect of a low calorie diet vs a high calorie diet (the difference will only be in one direction)
34
How to calculate degrees of freedom for t tests
n - 2, where n is the total number of measurements across both groups (unpaired t test)
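
A minimal sketch of an unpaired t statistic with its degrees of freedom, assuming equal variances (the pooled form). The data and the function name are hypothetical; turning t into a p value would need a t distribution table or a statistics library.

```python
import math
import statistics

def unpaired_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for 2 independent groups."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2  # total number of measurements minus 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / se
    return t, df

t, df = unpaired_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
print(t, df)
```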
35
What is a paired t test used for
To compare two means from same sample
36
What are the assumptions for a paired t test
Unbiased selection; normal distribution; usually 2-tailed
37
If the data for a paired t test are slightly non-normal, is it an issue
No. For all t tests, values can be transformed to obtain normality
38
Give 3 tests to compare more than 2 means
One-way ANOVA; one-way repeated measures ANOVA; 2-way ANOVA
39
What does a one way ANOVA do
Analyse variance
40
What do ANOVA tests use to compare variance
F test
41
Why do we compare overall variance to variance calculated by group means
If the overall variance differs from the variance calculated from the group means, at least one group mean must be different from the overall mean
42
Give an example of a one way ANOVA What do we do here
3 groups: 2 drugs and a placebo. First establish that there is a difference between the groups, then perform comparisons between the groups (post hoc multiple t tests)
43
When would you use a one way repeated measures ANOVA
When measuring how a within-subjects experimental group performs in 3 or more experimental conditions, e.g. a single group of subjects given 3 different drugs in a crossover study
44
When would we use a 2 way ANOVA Give an example
Where we look at 1 outcome measure but each subject has 2 nominal input factors. E.g. measuring cortisol levels as a measure of stress in rats: stressful vs unstressful environment, also considering sex, so each group contains males and females
45
What do we use if the data is not normally distributed or if it is ranked data
Non parametric equivalents
46
What tests do you use if there are 2 means and the data is not normally distributed
Wilcoxon rank sum test or Mann-Whitney U test
47
What tests do you use if there are multiple means and the data is not normally distributed
Kruskal-Wallis one-way ANOVA; Friedman 2-way ANOVA
48
What test is used to compare multiple proportions When would we use it
Chi squared test. When we are counting the outcome and comparing 2 or more proportions, e.g. the number of people healthy in response to 2 treatments
49
When can chi squared not be used
If more than 20% of the expected counts are less than 5
50
What is considered when using chi squared
Fisher’s exact test can be used with smaller numbers; independence is assumed; each individual is only represented once; confidence intervals can be calculated; Yates’ correction can be used
51
What is Yates’ correction
A continuity correction that subtracts 0.5 from each absolute difference |observed - expected| before it is squared
52
When interpreting linear correlation, what is the value of r^2 taken as? Give an example
The proportion of variation shared between the 2 variables. E.g. if r = 0.6, the variation accounted for by the relationship is only 0.36 (36%)
53
When is use of linear correlation invalid
If the relationship is not linear; if the observations are not independent; if the data contain obvious outliers; if the data consist of subgroups of subjects
54
What does linear regression do? What is the equation?
Attempts to describe the relationship found by the correlation coefficient: Y = a + bx
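
A sketch of how a and b in Y = a + bx might be estimated, assuming ordinary least squares (the cards do not name the fitting method). The data points and the function name are made up.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept
    return a, b

a, b = fit_line([0, 1, 2, 3], [1, 3, 5, 7])  # points lying exactly on y = 1 + 2x
print(a, b)
```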
55
What is Spearman’s rho
Spearman’s rank correlation coefficient: a non-parametric measure of statistical dependence between 2 variables
56
What does multiple linear regression do
Estimates the relative contribution of each risk factor that confounds a correlation
57
What type of regression is used for a binary outcome
Logistic regression
58
What type of regression is used for an ordinal outcome
Ordinal regression
59
What type of regression is used for multiple outcome variables
Multivariate linear regression
60
What happens when doing regression if the outcomes are not continuous
The result is reported as an odds ratio for the input variable
61
What test do we use to find a p value for 2 categorical variables
Chi squared
62
What test do we use to find a p value for one numeric variable
T test
63
What test do we use to find a p value for one numeric and one categorical variable
T test
64
What test do we use to find a p value for one numeric and 2 categorical variables
ANOVA
65
If healthy and diseased populations have non-overlapping distributions, what does this mean
A perfect test is possible. If there is an overlap we will get some false negatives or false positives
66
Sensitivity=
True positives / (True positives + False negatives), where the denominator is the number that have the disease
67
Formula for positive predictive value
Number that test positive and have the disease / Number that test positive
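
Both formulas can be worked through from the counts in a 2x2 table. The counts below are invented for illustration, and the function names are hypothetical.

```python
def sensitivity(tp, fn):
    """True positives as a fraction of everyone who has the disease."""
    return tp / (tp + fn)

def positive_predictive_value(tp, fp):
    """True positives as a fraction of everyone who tests positive."""
    return tp / (tp + fp)

# Hypothetical study: 100 diseased people (90 detected), 30 false alarms.
tp, fn, fp = 90, 10, 30
print(sensitivity(tp, fn))                # 0.9
print(positive_predictive_value(tp, fp))  # 0.75
```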
68
What is primary prevention? What about secondary?
Primary: population-level and high-risk-individual-level intervention. Secondary: improving outcome only for individuals with a disease (e.g. rehab/treatment)
69
What is SEM and how is it calculated
Standard error of the mean: SD / √n
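
A small sketch of the calculation, assuming the SD is estimated from the sample itself (the n - 1 form used by `statistics.stdev`). The data are made up.

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

sample = [4.0, 5.0, 6.0, 7.0, 8.0]  # hypothetical measurements
print(statistics.stdev(sample), sem(sample))
```

Because the SD is divided by √n, the SEM shrinks as the sample grows while the SD does not.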
70
SD is always smaller than SEM. True or false?
False. SEM is always smaller than SD
71
How might you “normalise” data
Taking logs
72
What is usually the confidence interval
95%, i.e. about 19 in 20 intervals from repeated samples should contain the true value
73
What does parametric mean
Assumes the data follow a known distribution, usually the normal distribution
74
What does ordinal mean? Discuss
Ranked data, e.g. a score from 1-10. Absolute magnitude cannot be relied upon: 4/10 is not necessarily half of 8/10. Rarely parametric
75
What is nominal data? How is it usually displayed?
Categorical, non-parametric data. Usually tallied up and put into a bar chart or contingency table
76
What is H0
The null hypothesis: there is no significant difference
77
How is the P value shown? What is this?
It is compared against α (the significance threshold). It is the chance of the observed difference being due to chance
78
Type 1 error=
False positive
79
How does the Bonferroni correction work
Used to correct for multiple testing, by dividing the significance threshold by the number of parallel tests being performed
80
What do you use to test if something is homoscedastic
Levene’s test or the F test
81
When do we use a paired t test? What does this test assume?
When comparing means from the same sample (e.g. before and after). It assumes unbiased allocation
82
What is the power of a paired t test
Greater power but lower degrees of freedom
83
How to find degrees of freedom
Total number of samples - 2
84
What is the Mann Whitney U test used for
Ordinal or not normally distributed data that is independent
85
What do you use to compare the 2 means of ordinal data that are related? When else can you use this?
Wilcoxon signed-rank test. It can also be used if related data are not normally distributed
86
How to calculate degrees of freedom in a Chi squared test
(Number of columns -1) x (Number of rows - 1)
87
Formula for chi squared
χ² = Σ (observed - expected)² / expected
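
The formula and the degrees-of-freedom rule can be sketched for a contingency table. Here the expected counts are assumed to come from the row and column totals (row total × column total / grand total); the table values and the function name are made up.

```python
def chi_squared(table):
    """Chi squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)  # (rows - 1) x (columns - 1)
    return stat, df

# Hypothetical counts: rows = 2 treatments, columns = healthy / not healthy.
stat, df = chi_squared([[30, 10], [20, 20]])
print(round(stat, 3), df)
```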
88
When can chi squared not be used
When more than 20% of expected counts are less than 5
89
What is assumed in a chi squared test
Independence and each individual represented only once
90
What is Fisher’s exact test
Chi squared for smaller numbers
91
What happens to the proportions of false positives and false negatives if the prevalence of a disease increases
The proportion of false positives decreases; the proportion of false negatives increases
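
A minimal sketch of why this happens, assuming a test with fixed sensitivity and specificity (both set to a hypothetical 0.9 here). False positives can only come from healthy people and false negatives only from diseased people, so their share of the population shifts with prevalence.

```python
def error_fractions(prevalence, sensitivity=0.9, specificity=0.9):
    """Fractions of the whole population that are false positives / false negatives."""
    false_pos = (1 - specificity) * (1 - prevalence)  # healthy but test positive
    false_neg = (1 - sensitivity) * prevalence        # diseased but test negative
    return false_pos, false_neg

for prevalence in (0.1, 0.5):
    fp, fn = error_fractions(prevalence)
    print(f"prevalence {prevalence}: false positives {fp:.2f}, false negatives {fn:.2f}")
```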
92
The bias in exposure measurement of a cohort study is a serious problem. True or false?
False
93
Discuss the meaning of magnitude in ordinal data
The magnitude is meaningful but only in relative terms
94
How to increase power of a study
Increase number of subjects to be included in the study
95
If you are comparing the means of 2 samples with a given outcome of interest, what test do you use
Chi squared
96
3 things which influence power
Variance (we cannot control this); sample size (we control this); smallest difference to detect (likely dictated by clinical utility of findings)