Stats + mock mistakes Flashcards

1
Q

Cohort Study

A

What is it?
Follows a group (cohort) over time to compare the incidence of disease between exposed and unexposed individuals.

Direction: Forward (prospective) or backward (retrospective)

Best for:
- Studying causality and incidence
- When exposure is known and common

Key strength:
Establishes temporal relationship (exposure before outcome)

Example:
Follow smokers vs non-smokers for 10 years to see who develops Parkinson’s.

2
Q

Case-Control Study

A

What is it?
Starts with people who already have the disease (cases) and compares them to those without it (controls), looking backward for exposures.

Direction: Retrospective

Best for:
- Studying rare diseases
- Quick and cost-effective research

Key weakness:
Cannot easily establish time-order or calculate incidence

Example:
Take 100 people with Parkinson’s and 100 without, and ask if they smoked in the past.

3
Q

Cross-Sectional Study

A

What is it?
A snapshot of a population at one point in time to assess both exposure and outcome simultaneously.

Direction: None (one-time observation)

Best for:
- Measuring prevalence
- Hypothesis generation

Key limitation:
Cannot establish causality

Example:
Survey 5,000 people today on their smoking habits and Parkinson’s diagnosis status.

4
Q

Cross-Over Randomised Trial

A

What is it?
A type of randomized trial where each participant receives both treatments in sequence, with a “washout” period in between.

Best for:
- Chronic, reversible conditions
- Comparing two treatments in the same person

Key limitation:
Not suitable for diseases with permanent effects

Example:
Compare two blood pressure drugs in the same patients — each tries both drugs.

5
Q

Parallel Group Randomised Controlled Trial (RCT)

A

What is it?
Participants are randomly assigned to only one of the treatment groups — both groups are followed concurrently.

Best for:
- Testing interventions (e.g., new drugs or therapies)

Key limitation:
Not appropriate for harmful exposures (like smoking)

Example:
Randomly assign 200 people to a new drug vs placebo and monitor outcomes.

6
Q

What is a t-test?

A

A t-test is a statistical test used to determine whether there is a significant difference between the means of two groups. It assumes that the data is:

  • Continuous (e.g. height, blood pressure, weight)
  • Normally distributed
  • From independent or paired groups, depending on the type of t-test
7
Q

What does a t-test compare?

A

The means of a continuous variable between two groups.

Independent t-test: 2 separate groups (different participants)

Paired t-test: same group, 2 time points (same participants)

8
Q

When would you use a t-test, and when would you use a chi-squared test?

A

If you’re comparing means → think t-test.

If you’re comparing proportions or categories → think chi-squared test.

9
Q

Phases of clinical trials

A

Mnemonic: 0 Microdose, 1 Safe?, 2 Works?, 3 Prove it, 4 Watch it

Phase 0: First in humans, microdosing

Phase 1: Is it safe?

Phase 2: Does it work?

Phase 3: Prove it works — for licensing

Phase 4: Long-term monitoring after approval

10
Q

What happens in Phase 0 of clinical trials?

A

Microdosing in a very small group to study how the drug behaves in the body (pharmacokinetics). No therapeutic intent.

11
Q

What is the main purpose of Phase 1 trials?

A

To assess safety, tolerability, and dosage in a small group of healthy volunteers (or sometimes patients).

12
Q

What is tested during Phase 2 trials?

A

Preliminary efficacy and side effects in a larger group of patients with the condition

13
Q

What is the goal of Phase 3 trials?

A

To confirm efficacy, monitor for adverse effects, and compare the drug with standard treatments. Forms the basis for licensing approval (the drug is not yet licensed during Phase 3; a licence is granted only after successful results).

14
Q

What is the purpose of Phase 4 trials?

A

Post-marketing surveillance to detect rare or long-term side effects and assess ongoing safety and effectiveness in the general population

15
Q

What are the Bradford Hill criteria for assessing a causal relationship in epidemiology?

A

The Bradford Hill criteria are nine principles used to evaluate whether an observed association is likely to be causal:

Strength of Association – A stronger association (e.g., high relative risk or odds ratio) is more likely to be causal.

Consistency – The association is observed repeatedly in different studies, settings, and populations.

Specificity – A specific exposure is linked to a specific outcome (less emphasized today).

Temporality – The cause must precede the effect. This is the only essential criterion.

Biological Gradient (Dose-Response) – Risk of disease increases with greater exposure.

Plausibility – The relationship is biologically or medically sensible based on current knowledge.

Coherence – The association does not conflict with existing theory or knowledge of the disease’s natural history.

Experiment (Reversibility) – Removal or reduction of exposure leads to a decrease in risk (e.g., quitting smoking lowers cancer risk).

Analogy – Similar factors are known to cause similar effects (e.g., asbestos and mesothelioma → silica and lung disease).

16
Q

What is relative poverty, and how does it differ from absolute poverty in the UK?

A

Relative poverty (or relative low income):
When a household has less than 60% of the current median UK income, after housing costs (AHC) or before housing costs (BHC).
➤ It reflects inequality — being poor compared to others in society.

Absolute poverty (or absolute low income):
When a household earns less than 60% of the median income in 2010/11, adjusted for inflation (held constant in real terms).
➤ It reflects fixed hardship, not changing with societal wealth.
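
As a rough sketch of the relative poverty definition above, the snippet below computes a 60%-of-median line for a small set of made-up household incomes. The function name and the income figures are illustrative assumptions, not official data. For absolute poverty, the same calculation would instead use the fixed 2010/11 median uprated for inflation.

```python
from statistics import median

def relative_poverty_line(household_incomes, fraction=0.6):
    """Return the low-income threshold: a fraction (default 60%) of the median income."""
    return fraction * median(household_incomes)

# Hypothetical annual household incomes (GBP), for illustration only
incomes = [12_000, 18_000, 25_000, 30_000, 55_000]

line = relative_poverty_line(incomes)
# Households strictly below the line count as being in relative low income
in_relative_poverty = [income for income in incomes if income < line]
print(line, in_relative_poverty)
```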

17
Q

What statistical test should be performed to compare the means of a normally distributed variable between two groups?

A

2-sample t-test (also called Independent t-test)

When to use it:
Data follows a Gaussian (normal) distribution.

You are comparing the means of a continuous variable between two independent groups.
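
A minimal sketch of the arithmetic behind a 2-sample t-test, assuming equal variances in both groups (the classic pooled-variance Student's version). The blood pressure values are made up for illustration. In practice a statistics package would also give the p-value; here the t statistic would be compared against a t-table.

```python
from math import sqrt
from statistics import mean, variance

def two_sample_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical systolic blood pressures (mmHg) in two independent groups
treated = [118, 122, 125, 130, 121]
control = [132, 128, 135, 140, 131]

t_stat, df = two_sample_t(treated, control)
print(round(t_stat, 2), df)
```

With 8 degrees of freedom the two-sided 5% critical value is 2.306, so a t statistic further from zero than that would be statistically significant at P < 0.05.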

18
Q

When should the Chi-squared test be used in statistical analysis?

A

Chi-squared test

When to use it:
For categorical data (nominal or ordinal variables).

To test if there is a significant association between two categorical variables.

Used to compare the observed vs. expected frequencies in categories (e.g., yes/no responses, disease/no disease).
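
The observed vs. expected comparison can be sketched as below. This is a bare-bones version without continuity correction, and the 2×2 counts are hypothetical. Expected counts come from (row total × column total) / grand total.

```python
def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical counts: rows = exposed / unexposed, columns = disease / no disease
observed = [[30, 70],
            [15, 85]]

stat, df = chi_squared(observed)
print(round(stat, 2), df)
```

For 1 degree of freedom the 5% critical value is 3.84; a statistic above that suggests an association between the two variables.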

19
Q

When should Fisher’s Exact test be used instead of the Chi-squared test?

A

Fisher’s Exact test

When to use it:
Used for categorical data, particularly when dealing with small sample sizes or small expected frequencies (usually less than 5 in any cell).

Provides an exact p-value, unlike the Chi-squared test, which is an approximation.
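
To illustrate where the exact p-value comes from, here is a sketch for a 2×2 table using the hypergeometric distribution. It assumes the common two-sided definition (sum the probabilities of all tables that are no more likely than the observed one); the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability of x in the top-left cell with all margins held fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table at least as extreme (no more probable) than the observed one
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small study: 3 of 4 exposed and 1 of 4 unexposed have the outcome
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```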

20
Q

When should the Paired t-test be used in statistical analysis?

A

Paired t-test

When to use it:
Used when comparing the means of a continuous variable in related or paired groups (e.g., same subjects measured before and after an intervention).

Assumes that the differences between pairs are normally distributed.
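
A minimal sketch of the paired t-test calculation: it works on the within-person differences rather than on the two sets of values separately. The before/after readings are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Paired t statistic (mean difference / its standard error) and degrees of freedom."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    return mean(diffs) / standard_error, n - 1

# Hypothetical systolic BP (mmHg) for the same 5 patients before and after treatment
before = [150, 142, 160, 155, 148]
after = [144, 140, 151, 150, 145]

t_stat, df = paired_t(before, after)
print(round(t_stat, 2), df)
```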

21
Q

What are the following tests used for:

  • 2-sample t-test
  • paired t-test
  • chi-squared test
  • Fisher’s exact test
  • Wilcoxon signed-rank test
A

2-sample t-test: For comparing means of normally distributed continuous data between two independent groups.

Paired t-test: For comparing means of related continuous data (e.g., before vs. after).

Chi-squared test: For testing relationships between categorical variables.

Fisher’s Exact test: For small sample sizes with categorical data.

Wilcoxon signed-rank test: For comparing paired data when the data is not normally distributed.

22
Q

A high AST:ALT ratio is considered to be >2. What does this indicate?

A

An aspartate aminotransferase (AST) to alanine aminotransferase (ALT) ratio of >2 suggests:
- Alcohol-related liver disease
- Cirrhosis
- Advanced fibrosis

23
Q

An AST/ALT ratio of <1 indicates what?

A

AST/ALT ratio of <1, i.e. ALT is higher than AST, suggests:

  • Acute viral hepatitis
  • NAFLD

n.b. Drug-induced liver injury and autoimmune hepatitis also generally have an AST/ALT ratio below 1, except in severe cases

24
Q

What does a P value represent in statistical testing?

A

The P value is the probability of obtaining the observed results, or more extreme ones, assuming the null hypothesis is true. A low P value (< 0.05) suggests that the observed result is unlikely to be due to chance alone, and we may reject the null hypothesis.

P < 0.05: Statistically significant

P ≥ 0.05: Not statistically significant

25
What does the standard deviation tell you about a dataset?
Standard deviation (SD) measures the spread or variability of data around the mean.
- Smaller SD = data is tightly clustered; larger SD = more spread out
- If normally distributed: 68% of data lies within ±1 SD, 95% within ±2 SD, 99.7% within ±3 SD
- Often paired with the mean in medical data to describe variability
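
The 68 / 95 / 99.7 figures can be checked with a short sketch using the standard normal distribution from Python's standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of a normal distribution lying within ±k SD of the mean
coverage = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in coverage.items():
    print(f"within ±{k} SD: {proportion:.1%}")
```

The ±2 SD figure is a rounding; exactly 95% of values lie within about ±1.96 SD.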
26
What defines a Gaussian (normal) distribution in statistics?
A bell-shaped, symmetric distribution where the mean = median = mode. Data is symmetrically distributed around the center.
📝 Quick Notes:
- Common in biological measurements (e.g., height, BP)
- Allows use of parametric tests (e.g., t-test, ANOVA)
- Important for assuming normality in hypothesis testing
Visual: Bell curve
27
What are the most important unit conversions to remember in clinical practice?
1 kg = 1000 g = 1,000,000 mg
1 g = 1000 mg = 1,000,000 mcg
1 L = 1000 mL = 1,000,000 µL
1 tsp = 5 mL, 1 tbsp = 15 mL
1 mL = 1 cc, 1 drop ≈ 0.05 mL
28
If data does not follow a Gaussian/normal distribution and there is a non-linear monotonic relationship, which correlation measure can be used?
For non-linear monotonic relationships, use Spearman's rank correlation.
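
A sketch of why Spearman suits monotonic but non-linear data: it is simply Pearson's correlation applied to the ranks. The helper names and the dose/response values below are illustrative assumptions. The response rises with dose but not in a straight line, so Spearman gives a perfect correlation while Pearson does not.

```python
def ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical monotonic, non-linear relationship
dose = [1, 2, 3, 4, 5, 6]
response = [d ** 3 for d in dose]

rho = spearman(dose, response)
r = pearson(dose, response)
print(round(rho, 3), round(r, 3))
```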
29
What is a Confidence Interval (CI)?
A CI gives a range of values in which we are confident (e.g., 95%) the true value, such as a mean difference, lies. If the CI for a difference includes 0 (or the CI for a ratio such as RR includes 1), the result might be due to chance, so the difference is not statistically significant.
30
How do I use confidence intervals (CIs) to estimate the p-value and determine statistical significance?
Check whether the null value is in the Confidence Interval (CI). The null value is 0 for a difference (1 for a ratio such as RR):
✅ If 0 is not in the CI → Result is statistically significant
❌ If 0 is in the CI → Result is not statistically significant
Use the CI level to estimate the p-value:
- If the 95% CI excludes 0 → p-value < 0.05
- If the 99% CI includes 0 → p-value > 0.01
⇒ So the p-value is between 0.01 and 0.05. Choose the option that matches the p-value range.
🎯 Quick Rule of Thumb:
- Lower confidence level (e.g., 95%) → narrower interval → easier to get significance
- Higher confidence level (e.g., 99%) → wider interval → harder to get significance
- Compare both to narrow down the p-value
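
The link between the CI level and the p-value can be sketched as follows, assuming a normal approximation (large sample) and a made-up mean difference and standard error. The 95% CI excludes 0 while the 99% CI includes it, so the p-value falls between 0.01 and 0.05.

```python
from statistics import NormalDist

def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation confidence interval for an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%, 2.58 for 99%
    return estimate - z * standard_error, estimate + z * standard_error

def two_sided_p(estimate, standard_error):
    """Two-sided p-value for the null hypothesis that the true value is 0."""
    z = abs(estimate / standard_error)
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical mean difference and its standard error
mean_difference, standard_error = 5.0, 2.2

ci95 = confidence_interval(mean_difference, standard_error, 0.95)
ci99 = confidence_interval(mean_difference, standard_error, 0.99)
p_value = two_sided_p(mean_difference, standard_error)
print(ci95, ci99, round(p_value, 3))
```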
31
What is an RCT? Why randomise?
An experiment where participants are randomly allocated to treatment/control groups to evaluate efficacy/safety.
Why randomise? Avoids selection/allocation bias; ensures groups are comparable (only systematic difference = treatment).
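
A toy sketch of simple random allocation, to show that chance alone decides who gets which arm. The function name and participant IDs are hypothetical, and real trials typically use more elaborate schemes (e.g., block or stratified randomisation) with concealed allocation.

```python
import random

def randomise(participants, seed=None):
    """Shuffle participants and split them into two equal-sized arms."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible for auditing
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical participant IDs 1..200
groups = randomise(range(1, 201), seed=42)
print(len(groups["treatment"]), len(groups["control"]))
```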
32
Parallel vs. crossover design? Advantage of crossover?
Parallel: Groups receive different treatments (suits irreversible effects, e.g., chemotherapy).
Crossover: Participants switch treatments after a washout period (suits reversible effects, e.g., analgesics).
Advantage of crossover? Smaller sample size; each patient acts as their own control.
33
Single vs. double blinding?
Single-blind: Patients unaware of treatment. Double-blind: Patients and observers unaware (reduces bias).
34
Problem with historical controls?
Selection bias (past groups may differ from current)
35
Intention-to-treat (ITT) vs. on-treatment analysis?
ITT: Analyzes all participants as randomized (preserves randomization). On-treatment: Only includes adherent participants.
36
Number Needed to Treat (NNT)?
Number of patients who need to be treated to prevent 1 bad outcome (NNT = 1 / absolute risk reduction).
Example: Folic acid for neural tube defects (NTDs): NNT = 40 (prevents 1 NTD per 40 treated).
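
As a worked equation, the NNT is the reciprocal of the absolute risk reduction (ARR). Taking the NNT of 40 from the example as given, the implied ARR is 2.5 percentage points:

```latex
\[
\mathrm{NNT} = \frac{1}{\mathrm{ARR}},
\qquad
\mathrm{ARR} = \text{risk}_{\text{control}} - \text{risk}_{\text{treated}}
\]
\[
\mathrm{NNT} = 40
\;\Rightarrow\;
\mathrm{ARR} = \frac{1}{40} = 0.025 = 2.5\%
\]
```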
37
Relative Risk (RR) of 0.5?
Treatment group has 50% lower risk vs. control
38
When to use cluster randomisation?
For community-level interventions (e.g., water disinfectant trials)
39
Are these results statistically significant: RR = 2 (p = 0.07, 95% CI 0.99–4.1)?
NOT SIGNIFICANT!
- RR = 1 means no effect (treatment = control)
- p > 0.05, so statistically insignificant
- The 95% CI (0.99–4.1) crosses 1, indicating the result could be due to chance (true effect might be null)
40
What is the RR value?
RR = Relative Risk (also called Risk Ratio). It compares the probability of an outcome (e.g., disease, death, recovery) between two groups:
- Treatment group (exposed to intervention)
- Control group (placebo/standard care)
Interpretation:
- RR = 0.5 → Treatment group has 50% lower risk of death vs. control (makes it better)
- RR = 1 → No difference between groups
- RR = 2 → Treatment group has double the risk vs. control (makes it worse)
Measures treatment effect:
- RR < 1 → Intervention reduces risk (e.g., vaccines)
- RR > 1 → Intervention increases risk (e.g., side effects)
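
A sketch of how RR and its 95% CI might be calculated from trial counts, using the usual log-scale approximation for the standard error. The counts are hypothetical. In this example the RR is 0.5, yet the CI still crosses 1, so the result would not be statistically significant.

```python
from math import exp, log, sqrt

def relative_risk(events_treated, n_treated, events_control, n_control, z=1.96):
    """Relative risk with an approximate 95% CI (log method)."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control
    # Standard error of log(RR)
    se_log_rr = sqrt(1 / events_treated - 1 / n_treated
                     + 1 / events_control - 1 / n_control)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper

# Hypothetical trial: 10/100 deaths on treatment vs 20/100 on control
rr, lower, upper = relative_risk(10, 100, 20, 100)
print(round(rr, 2), round(lower, 2), round(upper, 2))
```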
41
Which study type is best for rare diseases?
Case-control (start with cases; efficient for rare outcomes)
42
Key feature of cohort studies?
Follow groups over time to measure disease incidence (e.g., British Doctors Study: smoking → lung cancer).
43
Key feature of case-control studies?
Compare past exposures in cases (disease) vs. controls (no disease).
Major bias in case-control studies? Recall bias (cases remember exposures better).
44
What is a confounder? How to adjust for confounding?
A variable linked to both exposure and outcome (e.g., smoking confounds alcohol → lung cancer).
How to adjust? Stratification or regression models (e.g., logistic regression).
45
Absolute excess risk formula?
Risk in exposed – Risk in unexposed
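
The formula can be sketched with made-up cohort counts (the function name and numbers are illustrative assumptions):

```python
def absolute_excess_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk in exposed minus risk in unexposed."""
    return cases_exposed / n_exposed - cases_unexposed / n_unexposed

# Hypothetical cohort: 30/1000 exposed and 10/1000 unexposed develop the disease
excess = absolute_excess_risk(30, 1000, 10, 1000)
print(f"{excess:.1%}")
```

Here the excess is about 2 extra cases per 100 exposed people.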
46
Hazard ratio (HR) interpretation?
HR = 0.73 → 27% lower hazard (instantaneous risk) of death at any given time in the treated group
47
Best study for rare exposure (e.g., blue asbestos → lung cancer)?
Cohort (ensure enough exposed subjects)
48
Best study for quick results?
Case-control (no waiting for disease onset)