EPIDEMIOLOGY - Biostatistics Flashcards

1
Q

What are the two types of statistics?

A

Descriptive statistics
Inferential statistics

2
Q

What are the two types of data?

A

Quantitative
Qualitative

3
Q

Describe the two types of quantitative data

A

Continuous: data that can take any value within a range (e.g. height, weight)
Discrete: data that can only take distinct, countable values (e.g. number of hospital visits)

4
Q

Describe the two types of qualitative data

A

Nominal: distinct, unordered categories of data
Ordinal: categories of data with some order or hierarchy

5
Q

What are probabilistic outcomes?

A

Probabilistic outcomes are the possible results of an experiment or trial that are subject to randomness, so they cannot be predicted with certainty

6
Q

What are the measures of central tendency?

A

Mean
Median
Mode
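
As a rough illustration of the three measures, the sketch below uses Python's standard `statistics` module. The blood pressure readings are hypothetical values made up for this example.

```python
import statistics

# Hypothetical sample: systolic blood pressure readings (mmHg)
readings = [118, 120, 120, 125, 130, 142, 160]

mean_value = statistics.mean(readings)      # arithmetic average
median_value = statistics.median(readings)  # middle value once sorted
mode_value = statistics.mode(readings)      # most frequent value

print("mean:", round(mean_value, 1))
print("median:", median_value)
print("mode:", mode_value)
```

In this made-up sample the mean sits above the median because the largest reading pulls it upwards.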

7
Q

What are the measures of dispersion?

A

Range (maximum and minimum values)
Standard deviation
Interquartile range
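
A minimal sketch of these measures of dispersion, again using the standard `statistics` module and hypothetical readings. Note that quartiles can be computed by several conventions; this example simply assumes the default method of `statistics.quantiles`.

```python
import statistics

# Hypothetical sample: systolic blood pressure readings (mmHg)
readings = [118, 120, 120, 125, 130, 142, 160]

# Range: distance between the maximum and minimum values
value_range = max(readings) - min(readings)

# Sample standard deviation (divides by n - 1)
sd = statistics.stdev(readings)

# Quartiles; the interquartile range is Q3 minus Q1
q1, q2, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

print("range:", value_range)
print("standard deviation:", round(sd, 2))
print("interquartile range:", iqr)
```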

8
Q

In practice, which combinations of central tendency and dispersion would you typically report?

A

Mean and standard deviation (for roughly symmetric, normally distributed data)
Median and interquartile range (for skewed data or data with outliers)

9
Q

What is the purpose of frequency tables?

A

Frequency tables summarise how often each possible value occurs in a data set

10
Q

What is the difference between cumulative and relative frequency?

A

Cumulative frequency: running total of the frequencies in a frequency distribution
Relative frequency: the proportion of the total that each frequency represents, often expressed as a percentage
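
The sketch below builds a small frequency table with both columns. The visit counts are hypothetical, and the variable names are chosen only for this example.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of GP visits per patient in one year
visits = [0, 1, 1, 2, 2, 2, 3, 3, 4, 2]

counts = Counter(visits)
values = sorted(counts)
frequencies = [counts[v] for v in values]

# Cumulative frequency: running total of the frequencies
cumulative = list(accumulate(frequencies))

# Relative frequency: each frequency divided by the total count
total = len(visits)
relative = [f / total for f in frequencies]

print("value  frequency  cumulative  relative")
for v, f, c, r in zip(values, frequencies, cumulative, relative):
    print(f"{v:>5}  {f:>9}  {c:>10}  {r:>8.0%}")
```

The last cumulative frequency always equals the total number of observations, and the relative frequencies sum to 1 (100%).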

11
Q

Give six examples of graphs that can be used to visualise data

A

Box plots
Bar plots
Density plots
Pie charts
Scatter plots
Line plots

12
Q

What is the null hypothesis?

A

The null hypothesis is a statement that there is no relationship between the two variables

13
Q

What is the alternative hypothesis?

A

The alternative hypothesis is a statement that there is a relationship between the two variables

14
Q

What is statistical hypothesis testing?

A

Statistical hypothesis testing is the use of data to determine the plausibility of a hypothesis

15
Q

What is a test statistic?

A

A test statistic is a number calculated by a statistical test that describes how far the observed data are from what would be expected under the null hypothesis. The t-statistic produced by a t-test is one example

16
Q

What is the probability value (P-value)?

A

The probability value (P-value) is the probability of obtaining a test statistic at least as extreme as the one observed, assuming the null hypothesis is true. A small P-value means the observed data would be unlikely under the null hypothesis
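
One way to make this definition concrete is an exact permutation test, sketched below. This is a swapped-in technique chosen because it needs only the standard library; it is not one of the named tests in this deck. The two groups are hypothetical, and the test statistic is assumed to be the absolute difference in group means.

```python
from itertools import combinations
from statistics import mean

# Hypothetical outcome scores for two small groups
treatment = [12, 14, 15, 17]
control = [8, 9, 10, 11]

# Test statistic: absolute difference in group means
observed = abs(mean(treatment) - mean(control))

pooled = treatment + control
n = len(treatment)

# Under the null hypothesis the group labels are interchangeable,
# so every way of splitting the pooled data is equally likely.
extreme = 0
total = 0
for idx in combinations(range(len(pooled)), n):
    group_a = [pooled[i] for i in idx]
    group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
    diff = abs(mean(group_a) - mean(group_b))
    if diff >= observed - 1e-12:  # at least as extreme as observed
        extreme += 1
    total += 1

p_value = extreme / total
print("P-value:", round(p_value, 4))
```

Here the P-value is simply the share of relabellings that produce a difference at least as large as the one observed.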

17
Q

Describe how a 0.05 probability value (P-value) threshold is used with regard to the null hypothesis

A

A P-value less than 0.05 is typically considered statistically significant, in which case the null hypothesis is rejected. A P-value greater than 0.05 means that the deviation from the null hypothesis is not statistically significant, and the null hypothesis is not rejected (which is not the same as proving it true)

18
Q

What is the main difference between parametric and non-parametric tests?

A

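Parametric tests assume the data follow a particular distribution (usually the normal distribution) and typically compare mean values. Non-parametric tests do not make this assumption and typically compare medians or ranks, so they are used for data that are not normally distributed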

19
Q

What are four examples of parametric tests?

A

One sample t-test
Two sample t-test
Paired sample t-test
Analysis of variance (ANOVA) test

20
Q

When would you use a one sample t-test?

A

To compare the mean value of a sample with an expected mean value
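
A minimal sketch of the one sample t-statistic, assuming the usual formula t = (sample mean - expected mean) / (sample SD / sqrt(n)). The function name `one_sample_t` and the data are hypothetical; turning t into a P-value additionally requires the t-distribution with n - 1 degrees of freedom, which is not shown here.

```python
import math
import statistics

def one_sample_t(sample, expected_mean):
    """Return the t-statistic comparing a sample mean with an expected mean."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - expected_mean) / standard_error

# Hypothetical haemoglobin measurements, compared with an expected mean of 13.0
sample = [13.4, 12.9, 14.1, 13.8, 13.2, 13.9]
t_stat = one_sample_t(sample, 13.0)
degrees_of_freedom = len(sample) - 1

print("t:", round(t_stat, 3), "df:", degrees_of_freedom)
```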

21
Q

When would you use a two sample t-test?

A

To compare the mean values of two different samples

22
Q

When would you use a paired sample t-test?

A

To compare the mean values of two paired samples
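
A paired sample t-test can be viewed as a one sample t-test on the within-pair differences, tested against a mean difference of zero. The sketch below follows that view; the before and after values are hypothetical.

```python
import math
import statistics

# Hypothetical blood pressure for the same five patients before and after treatment
before = [140, 152, 138, 147, 160]
after = [135, 150, 130, 146, 152]

# Work on the difference within each pair
differences = [b - a for b, a in zip(before, after)]
n = len(differences)

mean_difference = statistics.mean(differences)
standard_error = statistics.stdev(differences) / math.sqrt(n)

# t-statistic for the null hypothesis that the mean difference is zero
t_stat = mean_difference / standard_error

print("mean difference:", mean_difference)
print("t:", round(t_stat, 3), "df:", n - 1)
```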

23
Q

When would you use an analysis of variance (ANOVA) test?

A

To compare the mean values of more than two groups with each other

24
Q

What is the corresponding non-parametric test to a one sample t-test?

A

Wilcoxon Signed Rank test

25
Q

What is the corresponding non-parametric test to a two sample t-test?

A

Mann-Whitney test

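To show why this test depends on ranks rather than means, the sketch below computes the Mann-Whitney U statistic by direct pair counting. The function name `mann_whitney_u` is hypothetical, and converting U into a P-value (via tables or a normal approximation) is not shown.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), counting pairs where one sample's value beats the other's."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0   # a ranks above b
            elif a == b:
                u_a += 0.5   # ties count as half
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical pain scores in two independent groups
group_a = [3, 5, 6, 8]
group_b = [1, 2, 4, 7]
print(mann_whitney_u(group_a, group_b))
```
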
26
Q

What is the corresponding non-parametric test to a paired sample t-test?

A

Wilcoxon Signed Rank test

27
Q

What is the corresponding non-parametric test to an analysis of variance (ANOVA) test?

A

Kruskal-Wallis test

28
Q

When would you use a Chi-squared test?

A

To compare the proportions of categorised data

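As a rough sketch, the Chi-squared statistic for a contingency table sums (observed - expected)^2 / expected over every cell, where each expected count is row total x column total / grand total. The function name `chi_squared` and the table are hypothetical.

```python
def chi_squared(observed):
    """Return (statistic, degrees of freedom) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if row and column categories were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical 2x2 table: rows = exposed / unexposed, columns = disease / no disease
table = [[10, 20], [20, 10]]
statistic, df = chi_squared(table)
print(round(statistic, 3), df)
```

With 1 degree of freedom, a statistic above about 3.84 corresponds to a P-value below 0.05.
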
29
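Q

What is a 95% confidence interval?

A

A 95% confidence interval is a range of values above and below the point estimate that is expected to contain the true value with 95% confidence. If the study were repeated many times, about 95% of the intervals calculated this way would contain the true value
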
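A minimal sketch of a 95% confidence interval for a mean, assuming a large-sample normal approximation (mean plus or minus about 1.96 standard errors). For small samples the t-distribution would normally be used instead. The function name `mean_ci_95` and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def mean_ci_95(sample):
    """Return (lower, upper) bounds of an approximate 95% CI for the mean."""
    z = NormalDist().inv_cdf(0.975)  # about 1.96 for a two-sided 95% interval
    standard_error = stdev(sample) / math.sqrt(len(sample))
    point_estimate = mean(sample)
    return point_estimate - z * standard_error, point_estimate + z * standard_error

# Hypothetical sample: the integers 1 to 100 stand in for real measurements
sample = list(range(1, 101))
lower, upper = mean_ci_95(sample)
print(round(lower, 2), round(upper, 2))
```
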
30
Q

What is the correlation coefficient?

A

The correlation coefficient is a measure of the strength and direction of the linear relationship between two numerical variables

31
Q

What is represented by the correlation coefficient value of 1?

A

1 = Perfect positive correlation (as one variable increases, the other variable also increases)

32
Q

What is represented by the correlation coefficient value of 0?

A

0 = No correlation (no linear association between the variables)

33
Q

What is represented by the correlation coefficient value of -1?

A

-1 = Perfect negative correlation (as one variable increases, the other variable decreases)

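The three landmark values can be reproduced with a small sketch of the Pearson correlation coefficient. The function name `pearson_r` and the data are hypothetical, and Pearson's coefficient is assumed to be the one meant by these cards.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

x = [1, 2, 3, 4]
print(pearson_r(x, [2, 4, 6, 8]))    # y rises with x
print(pearson_r(x, [8, 6, 4, 2]))    # y falls as x rises
print(pearson_r(x, [1, -1, -1, 1]))  # U-shaped pattern with no linear trend
```

The last case shows that a coefficient of 0 rules out a linear association only; the variables can still be related in a non-linear way.
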
34
Q

What is linear regression analysis?

A

Linear regression analysis is the prediction of the value of one variable from the value of another variable by fitting a straight line to the data

35
Q

What are the two parameters estimated by linear regression analysis?

A

Intercept and gradient

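A minimal sketch of how the two parameters can be estimated by ordinary least squares for a single predictor, so that predictions follow y = intercept + gradient * x. The function name `fit_line` and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Return (intercept, gradient) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Gradient: how much y changes, on average, per unit change in x
    gradient = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    # Intercept: predicted y when x is zero
    intercept = mean_y - gradient * mean_x
    return intercept, gradient

# Hypothetical data lying exactly on the line y = 1 + 2x
intercept, gradient = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, gradient)
```
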
36
Q

What is extrapolation?

A

Extrapolation is the prediction of a new Y-value from an X-value outside the range covered by given data

37
Q

What is interpolation?

A

Interpolation is the prediction of a new Y-value from an X-value within the range covered by given data
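
The distinction comes down to whether the new X-value lies inside the observed range, as in the small sketch below. The function name `prediction_type` and the ages are hypothetical.

```python
def prediction_type(x_new, observed_xs):
    """Classify a prediction as interpolation or extrapolation."""
    if min(observed_xs) <= x_new <= max(observed_xs):
        return "interpolation"
    return "extrapolation"

# Hypothetical ages (years) of the participants used to fit a model
ages = [20, 35, 50, 65]
print(prediction_type(40, ages))  # inside the observed range
print(prediction_type(80, ages))  # outside the observed range
```

Extrapolated predictions are generally less reliable, because nothing in the data confirms that the fitted relationship holds beyond the observed range.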