Statistical Approaches Flashcards

1
Q

Why do we take a sample mean?

A

Because we are interested in the mean of the population, and we want to see how close our sample mean is to the true mean

2
Q

Standard error of the mean (SEM)

A
  • Informs us about how close the sample mean is likely to be to the actual (population) mean
  • SEM is an estimate of the standard deviation of the sample mean across repeated samples
3
Q

What is the effect of a larger sample on the SEM?

A

The SEM will decrease

4
Q

How do we calculate SEM?

A

SEM = standard deviation / √n, where n is the number of observations sampled
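
The formula above can be sketched in Python as follows. This is a minimal illustration, assuming the sample standard deviation (n - 1 denominator) is intended; the function name `sem` and the example data are hypothetical.

```python
from math import sqrt
from statistics import stdev


def sem(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))


# Hypothetical measurements, used only to show the calculation.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sem(data))      # SEM for n = 8
print(sem(data * 2))  # same values observed twice (n = 16): the SEM is smaller
```

Doubling the number of observations while keeping the same spread shrinks the SEM, which matches the card on the effect of a larger sample.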

5
Q

What does the SEM allow us to do?

A

Calculate confidence intervals

6
Q

What question does calculating the SEM answer?

A

How close is this sample mean to the actual mean (mean in the target population)?

7
Q

What is the interpretation of a 95% confidence interval?

A
  • We are 95% confident that the interval from… to… contains the true population mean
  • That is, given repeated sampling and calculation of a 95% confidence interval for each sample estimate, about 95% of the intervals will include the true population mean
  • Consequently, in about 5 out of 100 cases the CI does not include the true population mean
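
The repeated-sampling interpretation can be checked with a small simulation. This is a sketch under stated assumptions: the interval is computed as mean ± 1.96 × SEM (a normal approximation), and the population mean, standard deviation, sample size and seed are hypothetical values chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def ci95(sample):
    """Approximate 95% CI for the mean: mean +/- 1.96 * SEM."""
    m = mean(sample)
    sem = stdev(sample) / sqrt(len(sample))
    return m - 1.96 * sem, m + 1.96 * sem


def coverage(true_mean=50.0, true_sd=10.0, n=50, repeats=2000, seed=1):
    """Fraction of simulated CIs that contain the true population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = ci95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / repeats


print(coverage())  # expected to be close to 0.95
```

Each simulated interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds over many samples.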
8
Q

What does a 95% CI NOT mean?

A

A 95% CI does not mean that the interval contains 95% of the data values

9
Q

What is the effect of sample size on confidence intervals?

A

With a larger sample size, the confidence interval will be narrower

10
Q

What is deductive reasoning?

A

Logical thinking process where specific conclusions are drawn from general premises or facts

11
Q

What is the Null Hypothesis?

A

The hypothesis that there is no difference between groups

12
Q

What is the Alternative Hypothesis?

A

The hypothesis that there is a difference between groups

13
Q

How do we do hypothesis testing?

A

By calculating the probability (p-value) of data at least as extreme as those observed occurring if the null hypothesis were true

14
Q

What are the 3 steps in hypothesis testing?

A
  1. From the observed data, a test statistic is calculated
  2. The probability (p-value) of observing a test statistic as large or larger than that observed, if the H0 is true, is calculated
  3. The p-value is compared to a cut-off termed the ‘level of significance’ (called ‘alpha’)
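
The three steps above can be sketched with a one-sample z-test. This is only an illustration: it assumes the population standard deviation `sigma` is known and that a two-sided test is wanted, and the function name and example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Return (test statistic, two-sided p-value, reject H0?)."""
    # Step 1: calculate the test statistic from the observed data.
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    # Step 2: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 3: compare the p-value with the level of significance (alpha).
    return z, p_value, p_value < alpha


# Hypothetical data: H0 says the population mean is 10.
z, p_value, reject = one_sample_z_test([12, 11, 13, 12], mu0=10, sigma=2)
print(z, p_value, reject)
```

Other tests follow the same pattern but use a different test statistic and probability distribution (for example the t or F distribution).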
15
Q

What do test statistics have?

A

A probability distribution (e.g. t distribution, z distribution, F distribution)

16
Q

Why should the level of significance be small?

A

Because we don’t want to reject the null hypothesis when it is true (typical values are 0.05, 0.01 and 0.001)

17
Q

When is a p-value calculated?

A

AFTER the statistical test has been performed

18
Q

What does the p-value relate to?

A

A p-value always relates to the hypothesis being tested (the NULL HYPOTHESIS)

19
Q

What is the p-value?

A

The probability of observing a test statistic as large as or larger than that observed, if the null hypothesis is true (in effect, the probability of the ‘data occurring’ if the null hypothesis were true)

20
Q

What is true of the null hypothesis if the p-value is very small?

A
  • The observed data would be unlikely if the null hypothesis were true, so there is evidence against it
  • If the p-value is less than alpha, the null hypothesis is rejected
  • We say the difference is ‘statistically significant’
21
Q

What is true of the null hypothesis if the p-value is large?

A
  • The data are consistent with the null hypothesis
  • Hence we conclude that there is NO strong evidence that the effect tested really exists
22
Q

What might be wrong with a level of significance of 0.05?

A
  • A level of significance of 0.05 is a completely arbitrary cut-off
  • A dichotomous interpretation of p-values (i.e. 0.05 or less = ‘significant’ and above 0.05 = ‘non-significant’) might be inappropriate
  • A p-value of 0.04 differs little in meaning from a p-value of 0.07
23
Q

Statistical significance versus clinical significance

A
  • Statistical significance does not equate to biological, clinical or economic importance
  • A statistically significant result may be of little importance
24
Q

What is a major limitation of a p-value?

A
  • A p-value provides no information about the likely size of effect
  • We are typically interested in the magnitude of an effect, not just whether an effect is likely to exist or not
25
What do confidence intervals show that p-values do not?
Information on the magnitude of an effect rather than whether or not the effect exists.
26
What do confidence intervals tell us?
Information about the variability of the estimate (e.g. mean, odds ratio)
27
What does the confidence interval for one group (a sample) tell us?
The CI for one group represents the ‘closeness’ of the sample mean to the population mean
28
What does the confidence interval for two or more groups (more than one sample) tell us?
  • Used to compare groups
  • Represents the ‘closeness’ of the observed difference between groups to the difference between groups in the population
  • Used to make a decision to reject or fail to reject the null hypothesis
29
How do we use a CI to determine if there is NO difference between groups?
A confidence interval that includes the value of no difference indicates that groups do not significantly differ from each other
30
What is a value of no difference?
The value of the effect measure that corresponds to no difference between groups; a CI that contains it indicates no statistically significant difference
31
What is the value of no difference for means?
For a difference in means or a risk difference: no significant difference if the 95% CI contains “0”
32
What is the value of no difference for Odds Ratios and Risk ratios?
No significant difference if the 95% CI contains “1”
33
What does a chi square test do?
The purpose of this test is to determine if a difference between observed data and expected data is due to chance, or if it is due to a relationship between the variables you are studying
34
What are the conditions for using the Chi square statistic?
  • Both the outcome (e.g. Otitis externa) and exposure (e.g. treatment) are binary data
  • Observations are independent of each other (i.e. no dogs were used twice, no two dogs came from the same household)
  • The numbers of observations (e.g. dogs cured and not cured in each treatment group) are reasonably high
35
What does a chi square statistic use?
The chi-square test uses the differences between observed and expected numbers to calculate the chi-square statistic
36
What effect do greater differences in the observed treatment successes between groups have on the chi-square statistic?
The chi-square statistic increases.
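
How the statistic is built from observed and expected numbers can be sketched as follows. This is a minimal illustration without a continuity correction; the function name and the cured / not cured counts are hypothetical.

```python
def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if outcome and exposure were unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Rows: treatment A, treatment B; columns: cured, not cured (made-up counts).
print(chi_square_statistic([[15, 15], [15, 15]]))  # no difference between groups
print(chi_square_statistic([[10, 20], [20, 10]]))  # moderate difference
print(chi_square_statistic([[5, 25], [25, 5]]))    # larger difference, larger statistic
```

The statistic would then be compared with the chi-square distribution to obtain a p-value.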
37
What software can calculate CI?
http://www.brixtonhealth.com - WinPepi
38
What does selection of appropriate statistical tests to perform analyses depend on?
  1. The data type (e.g. the chi-square test is appropriate for binary outcome data)
  2. The assumptions required for the various tests
39
When are parametric tests (e.g. t-test, ANOVA, regression) appropriate?
If outcome data are continuous and normally distributed
40
What statistical test is used if data are not normally distributed (i.e. skewed)?
Either the data are transformed and the transformed data are analysed using parametric tests, OR non-parametric tests are used
41
When are non-parametric tests (e.g. Wilcoxon rank test, Kruskal-Wallis) used?
When data are not normally distributed; non-parametric tests do not require data to be normally distributed
42
What is the chi square test used for, and what are the requirements for its use?
To compare proportions
  1. Outcome and exposure/intervention are binary (dichotomous)
  2. Observations must be independent of each other
43
When is a one sample t-test used, and what are the requirements for its use?
To test whether the mean of a sample or population is different from a particular value
  1. Outcome must be continuous and data must be normally distributed
  2. Must be only one group
44
When is a two sample t-test used, and what are the requirements for its use?
To test equality of the means of two populations
  1. Outcome must be continuous and data must be normally distributed
  2. Must be two comparison groups
45
When is a paired t-test used, and what are the requirements for its use?
To test equality of the means of two samples/populations, when the observations arise as paired samples
  1. Outcome must be continuous and differences between the pairs must be normally distributed
  2. Must be two comparison groups - paired
46
When is Analysis of Variance (ANOVA) used, and what are the requirements for its use?
To test equality of the means of two or more populations
  1. Outcome must be continuous and data must be normally distributed
  2. Must be two or more comparison groups
  3. Groups must have equal variance (homoscedasticity)
47
When is Wilcoxon’s Signed Rank Test used, and what are the requirements for its use?
For one sample, to test whether the median of a sample or population is different from a particular value
  1. Outcome must be continuous, one group
  2. Data need not be normally distributed
  3. Is the non-parametric equivalent of the one sample t-test
48
When is Wilcoxon’s Rank Sum Test/Mann-Whitney U test used, and what are the requirements for its use?
To test equality of the mean ranks of two samples/populations
  1. Outcome must be continuous; two comparison groups
  2. Data need not be normally distributed
  3. Is the non-parametric equivalent of the two sample t-test
49
When is Wilcoxon’s Signed Rank Test with two matched pairs used, and what are the requirements for its use?
To test the difference between two samples/populations using matched pairs
  1. Outcome must be continuous; two comparison groups - paired
  2. Data need not be normally distributed
  3. Is the non-parametric equivalent of the paired t-test
50
When is Kruskal-Wallis test used, and what are the requirements for its use?
To test equality of the mean ranks of two or more samples/populations
  1. Outcome must be continuous; two or more comparison groups
  2. Data need not be normally distributed
51
What is a univariable analysis?
Where a single predictor (independent) variable (risk factor) is considered for a single outcome (dependent) variable
52
What is a multivariable analysis?
Where **multiple predictor** (independent) variables (risk factors) are considered for a **single outcome** (dependent) variable
53
What is a multivariate analysis?
Where **multiple predictor** (independent) variables (risk factors) are considered for **multiple outcome** (dependent) variables
54
What is a Linear Regression, and what are the requirements for its use?
A best-fit line used to describe the relationship between variables and to predict the value of a dependent variable based on an independent variable
  1. Outcome must be continuous
  2. One (simple linear regression) or more (multiple linear regression) explanatory variables
  3. Explanatory variables can be continuous or categorical
  4. Outcome variable must be normally distributed (strictly, the residuals)
  5. Must be a linear relationship between outcome and explanatory variables
55
What is a Logistic Regression, and what are the requirements for its use?
Used to analyse the relationship between one or more predictor variables and a binary (yes/no, 0/1) outcome, like disease presence or absence
  1. Outcome must be dichotomous/binary
  2. One or more explanatory variables
  3. Explanatory variables can be continuous or categorical
  4. Outcome variable need not be normally distributed
  5. Most common analysis method in the biomedical sciences
  6. Outcome of analysis is expressed as an Odds Ratio, which can be converted to a probability
56
What is a Kaplan-Meier curve with Log-Rank test, and what are the requirements for its use?
A visual representation of **survival probabilities** over time, often used to analyse how long animals survive after a specific event, like treatment or diagnosis
  1. Outcome: time
  2. One or more groups
  3. Used to describe and compare survival times
  4. Censored data (‘losses’ of animals) are included in calculations
  5. Log-Rank test compares whether survival is different between groups
57
What is a Cox Proportional Hazards Regression, and what are the requirements for its use?
Used to compare **survival times** for one or more explanatory variables
  1. Helps to distinguish individual contributions of variables on survival
  2. Outcome of analysis is expressed as a Hazard Ratio, which is interpreted like an Odds Ratio
58
Which statistical tests are used when comparing groups?
Continuous – normally distributed
  • t-tests
  • ANOVA
Continuous – not normally distributed
  • Wilcoxon tests
  • Kruskal-Wallis test
59
Which statistical tests are used when comparing proportions?
Dichotomous outcome and exposure
  • Chi-square test
60
Which statistical tests are used with Multivariable experiments?
Continuous outcome
  • Linear regression
Dichotomous (yes/no) outcome
  • Logistic regression
Survival time
  • Kaplan-Meier curves
  • Cox Proportional Hazards regression
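
The test-selection cards can be condensed into a small lookup. This is a rough sketch for revision only, not a substitute for checking each test's assumptions; the function name and its parameters are hypothetical.

```python
def suggest_test(outcome, groups=2, paired=False, normal=True):
    """Suggest a test for comparing groups, following the cards above."""
    if outcome == "binary":
        return "Chi-square test"
    if outcome != "continuous":
        raise ValueError("outcome must be 'continuous' or 'binary'")
    if groups == 1:
        return "One sample t-test" if normal else "Wilcoxon's Signed Rank Test"
    if groups == 2:
        if paired:
            if normal:
                return "Paired t-test"
            return "Wilcoxon's Signed Rank Test (matched pairs)"
        if normal:
            return "Two sample t-test"
        return "Wilcoxon's Rank Sum Test / Mann-Whitney U test"
    # Three or more comparison groups.
    return "ANOVA" if normal else "Kruskal-Wallis test"


print(suggest_test("continuous", groups=3, normal=False))
print(suggest_test("binary"))
```

The lookup covers only group comparisons; the regression and survival methods for multivariable analyses are chosen by outcome type as listed in the last card.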