Stats Flashcards

(81 cards)

1
Q

Bias

A

Any factor that moves the findings of a study away from the truth

2
Q

Binary data

A

Data where there are only two possible values such as survived/died; also known as dichotomous data

3
Q

Blinding in a randomized controlled trial

A

When the treatment allocation is concealed from either the subject or the assessor or both

4
Q

Case-control studies

A

Observational study that starts with cases with a disease and compares them with controls without the disease to investigate possible risk factors

5
Q

Chi-squared goodness of fit test

A

A statistical test used to investigate whether a frequency distribution follows a specific theoretical distribution

6
Q

Chi-squared test

A

A statistical test used to investigate the association between two categorical variables

7
Q

Cluster Analysis

A

A statistical method used to identify groups or clusters of individuals who have common features in terms of known variables

8
Q

Cluster randomization

A

When groups of individuals are allocated to treatments so that all subjects in a group receive the same treatment

9
Q

Cohort study

A

Observational study that starts with a sample of individuals who are disease-free and measures possible causal factors at baseline and over time. The cohort of subjects is followed and their disease status is observed to investigate which factors are linked to the disease

10
Q

Confidence interval (CI)

A

A range of values that indicates the precision of an estimate; for a 95% CI we can be 95% confident that the interval contains the true value
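As an illustration of the definition above, here is a minimal sketch of a 95% CI for a mean. It assumes the large-sample Normal approximation (estimate plus or minus about 1.96 standard errors); the function name `mean_confidence_interval` and the blood pressure readings are hypothetical, not taken from the deck.

```python
import math
import statistics

def mean_confidence_interval(values, level=0.95):
    """Approximate CI for a mean using the Normal approximation (an assumption)."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean: sample SD divided by the square root of n.
    se = statistics.stdev(values) / math.sqrt(n)
    # Multiplier for the chosen confidence level (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * se, mean + z * se

# Hypothetical systolic blood pressure readings (mmHg).
readings = [120, 126, 118, 130, 124, 122, 128, 125]
lower, upper = mean_confidence_interval(readings)
print(f"95% CI: {lower:.1f} to {upper:.1f}")
```

For a sample this small a t-distribution multiplier would normally be preferred; the Normal multiplier is used here only to keep the sketch within the standard library.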

11
Q

Continuous data

A

Data that lie on a continuum and so can take any value between two limits

12
Q

Cox proportional hazards regression

A

A multifactorial regression model used with a time-to-event outcome

13
Q

Crossover trial

A

A single group study where each patient receives each of two or more treatments in turn so that they act as their own control

14
Q

Degrees of freedom (DF or df)

A

A quantity used in statistical testing and modelling that is related to the size of the sample and the number of parameters that have been estimated

15
Q

Dummy variables

A

Used in regression modelling to enable a categorical predictor variable to be included, by converting a variable with n categories into n–1 binary variables, where one category is the reference category
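The conversion described above can be sketched in a few lines. This is an illustrative example only: the function name `dummy_code`, the `is_` column prefix, and the smoking-status data are hypothetical.

```python
def dummy_code(values, reference):
    """Convert a categorical variable into binary (0/1) dummy variables.

    The reference category gets no column of its own, so a variable with
    n categories yields n-1 dummy variables.
    """
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical smoking-status variable with three categories.
smoking = ["never", "current", "ex", "never"]
rows = dummy_code(smoking, reference="never")
for status, row in zip(smoking, rows):
    print(status, row)
```

Subjects in the reference category ("never" here) have 0 in every dummy variable, so each regression coefficient compares one category with the reference.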

16
Q

Equivalence trial

A

A trial that aims to see if a new treatment is no better or worse than an existing one

17
Q

Fisher’s exact test

A

A statistical test that can be used to investigate the association between two categorical variables when the sample is small

18
Q

Forest plot

A

A graph used to display individual study estimates and confidence intervals, and the pooled estimate and confidence interval in a meta-analysis

19
Q

Gold standard test

A

A diagnostic test that is regarded as definitive, i.e. it gives the correct answer

20
Q

Funnel plot

A

A simple graphical method for exploring the results from studies to see if publication bias might be present

21
Q

Hazard ratio

A

In survival analysis, the ratio of the hazards (risks of the outcome) in two groups

22
Q

Heterogeneity

A

Where there is statistical variability between estimates such as may be found in a meta-analysis

23
Q

Incidence

A

The number of new cases of a given condition occurring within a specific time period

24
Q

Indirect standardization

A

Gives the standardized mortality ratio (SMR), which is the ratio of the observed number of deaths in the comparison population to the number expected if that population had the same age-specific death rates as the standard population
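A small worked sketch of the SMR calculation follows. The age bands, population sizes, death rates, and the function name `standardized_mortality_ratio` are all made up for illustration.

```python
def standardized_mortality_ratio(observed_deaths, study_population, standard_rates):
    """SMR = observed deaths / deaths expected under the standard rates."""
    # Expected deaths: apply the standard population's age-specific death
    # rates to the age structure of the population being studied.
    expected = sum(
        study_population[age] * standard_rates[age] for age in study_population
    )
    return observed_deaths / expected

# Hypothetical numbers of people in each age band of the study population.
study_population = {"0-44": 10_000, "45-64": 5_000, "65+": 2_000}
# Hypothetical age-specific death rates in the standard population.
standard_rates = {"0-44": 0.001, "45-64": 0.005, "65+": 0.04}

smr = standardized_mortality_ratio(138, study_population, standard_rates)
print(round(smr, 2))
```

Here the expected number of deaths is 10 + 25 + 80 = 115, so 138 observed deaths give an SMR of about 1.2, i.e. roughly 20% more deaths than expected.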

25
Intention to treat analysis
Statistical analysis where patients are analysed in the treatment group to which they were originally randomly allocated even if they did not actually receive that treatment
26
Logistic regression
A multifactorial regression model used with a binary outcome
27
Logrank test
A statistical test used to compare time-to-event data in two or more groups
28
Meta-analysis
A statistical analysis which combines the results of several independent studies examining the same question
29
Multifactorial methods
Statistical models fitted to datasets with one outcome variable and several predictor variables; used to disentangle effects
30
Multiple regression
A multifactorial regression model used with a continuous outcome
31
Negative predictive value
The proportion of those found negative on a diagnostic test who are truly negative
32
Normal distribution
A continuous probability distribution with a symmetrical bell shape, which is followed by many naturally occurring variables
33
Number needed to harm
The number of patients who need to be treated in order that one additional patient has a negative outcome
34
Number needed to treat
The number of patients who need to be treated in order that one additional patient has a positive outcome
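As a rough sketch, the number needed to treat is commonly calculated as the reciprocal of the absolute difference in outcome proportions between the two groups; the number needed to harm uses the same arithmetic with an adverse outcome. The function name and the proportions below are hypothetical.

```python
def number_needed_to_treat(p_good_treated, p_good_control):
    """Reciprocal of the absolute difference in the proportion with a good outcome."""
    difference = p_good_treated - p_good_control
    if difference == 0:
        raise ValueError("No difference between groups: NNT is undefined")
    return 1 / difference

# Hypothetical trial: 50% recover on treatment versus 25% on control.
nnt = number_needed_to_treat(0.50, 0.25)
print(nnt)  # 4.0: treat four patients for one additional recovery
```

In practice the result is usually rounded up to a whole number of patients.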
35
Observational study
A study in which subjects are observed, with exposures and outcomes measured, without any intervention by the researcher
36
Odds
The probability of an event occurring divided by the probability of it not occurring
37
Odds ratio
A measure of the difference in odds between two groups, calculated by dividing the odds in one group by the odds in another group
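The two definitions above (odds and odds ratio) translate directly into code. This is a minimal sketch; the function names and probabilities are chosen only for illustration.

```python
def odds(probability):
    """Probability of the event divided by the probability of no event."""
    return probability / (1 - probability)

def odds_ratio(p_group1, p_group2):
    """Odds in one group divided by the odds in another group."""
    return odds(p_group1) / odds(p_group2)

# Hypothetical event probabilities: 50% in group 1, 20% in group 2.
print(odds(0.5))             # 1.0, i.e. "evens"
print(odds(0.2))             # 0.25, i.e. 1 to 4
print(odds_ratio(0.5, 0.2))  # 4.0
```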
38
One-way analysis of variance
A statistical test used to compare the means from three or more independent samples
39
Parallel group trial
A trial in which subjects are allocated to receive one of two or more possible treatments and the comparison of different treatments is made between treatment groups
40
Pearson’s correlation
A measure of the strength of linear relationship between two continuous variables
41
Placebo
An inert treatment which is indistinguishable from the active treatment
42
Poisson regression
A multifactorial regression model used to model rates
43
Positive predictive value
The proportion of those found positive on a diagnostic test who are truly positive
44
Posterior distribution
A probability distribution obtained by combining prior evidence with new information
45
Power
The probability that a statistical test will find a significant difference if a real difference of a given size exists, i.e. the null hypothesis is not true
46
Predictor variable
In regression analysis, a variable which is used to predict the value of an outcome variable
47
Prevalence
The proportion of individuals with a condition within a specific population at a given time (point prevalence) or over a given time period (period prevalence)
48
Principal components analysis
A statistical method used to reduce a dataset with many inter-correlated variables to a smaller set of uncorrelated variables that explain the overall variability almost as well
49
Publication bias
A bias that occurs when the papers which are published on a topic are an incomplete subset of all the studies which have been conducted on that topic
50
Rank correlation
A non-parametric measure of the relationship between two variables, using the ranks of the data rather than the data values themselves
51
Receiver operating characteristic (ROC) curve
A graph plotting the sensitivity against 1–specificity for a diagnostic test at different cut-off points
52
Relative risk (RR)
A measure of the difference in risk between two groups, calculated by dividing the risk in the exposed group by the risk in the unexposed group (also known as risk ratio)
53
Risk ratio
A measure of the difference in risk between two groups, calculated by dividing the risk in the exposed group by the risk in the unexposed group (also known as relative risk)
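The relative risk (risk ratio) can be computed from event counts as sketched below. The function name and the cohort numbers are hypothetical.

```python
def risk_ratio(events_exposed, n_exposed, events_unexposed, n_unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = events_exposed / n_exposed
    risk_unexposed = events_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
rr = risk_ratio(30, 100, 10, 100)
print(round(rr, 2))
```

With these numbers the risks are 0.3 and 0.1, so the exposed group has about three times the risk of the unexposed group.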
54
Selection bias
A statistical bias introduced by the way in which subjects are selected for a research study
55
Sensitivity
The proportion of those who have the disease who are correctly identified by the diagnostic test as positive
56
Sensitivity analysis
A way of testing assumptions made in statistical analyses by doing several analyses based on different assumptions, and comparing the results
57
Significance level
The probability that a statistical test rejects the null hypothesis when no real difference exists, i.e. the null hypothesis is true (type 1 error)
58
Simple linear regression
A statistical method to estimate the nature of the linear relationship between two continuous variables
59
Skewed data
Data that do not follow a symmetrical distribution
60
Specificity
The proportion of those who do not have the disease who are correctly identified by the diagnostic test as negative
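The diagnostic test measures in this deck (sensitivity, specificity, and the positive and negative predictive values) all come from the same 2x2 table of test result against true disease status. The sketch below uses made-up counts and a hypothetical function name.

```python
def diagnostic_measures(true_pos, false_neg, false_pos, true_neg):
    """Summarise a 2x2 table of test result against true disease status."""
    return {
        # Among those with the disease, proportion testing positive.
        "sensitivity": true_pos / (true_pos + false_neg),
        # Among those without the disease, proportion testing negative.
        "specificity": true_neg / (true_neg + false_pos),
        # Among those testing positive, proportion truly positive.
        "ppv": true_pos / (true_pos + false_pos),
        # Among those testing negative, proportion truly negative.
        "npv": true_neg / (true_neg + false_neg),
    }

# Hypothetical screening study: 100 with the disease, 900 without.
measures = diagnostic_measures(true_pos=90, false_neg=10, false_pos=30, true_neg=870)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the diseased and disease-free groups respectively, whereas the predictive values are calculated within the test-positive and test-negative groups, so the predictive values change with the prevalence of the disease.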
61
Standard deviation (SD)
A measure of dispersion used for continuous data; is equal to the square root of the variance
62
Standard error (SE)
A measure of precision of an estimated quantity that is equal to the standard deviation of the sampling distribution of the quantity
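To make the distinction between SD and SE concrete, the sketch below computes both for a small made-up sample. It assumes the estimated quantity is a sample mean, for which the standard error is the SD divided by the square root of the sample size.

```python
import math
import statistics

# Hypothetical sample of five measurements.
sample = [4, 8, 6, 5, 7]

variance = statistics.variance(sample)   # sample variance (n - 1 denominator)
sd = statistics.stdev(sample)            # square root of the variance
se_mean = sd / math.sqrt(len(sample))    # standard error of the sample mean

print(f"variance = {variance:.2f}, SD = {sd:.3f}, SE of mean = {se_mean:.3f}")
```

The SD describes the spread of the individual observations, while the SE describes how precisely the mean has been estimated and shrinks as the sample size grows.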
63
Stem and leaf plot
A graph which uses the data values themselves to depict the shape of a frequency distribution
64
Superiority trial
A trial which aims to see if one treatment is better than another
65
t test
A statistical test used to compare the means from two independent samples
66
Transformation
A function applied to a dataset to better fit a specific probability distribution, for example applying a logarithmic transformation to skewed data to make it fit a Normal distribution
67
Two-way analysis of variance
A statistical method used to investigate the effects of two factors on a continuous outcome
68
Type 1 error
Getting a significant result in a sample when the null hypothesis is in fact true in the underlying population
69
Type 2 error
Getting a non-significant result in a sample when the null hypothesis is in fact false in the underlying population (‘false non-significant’ result)
70
Variable
A quantity that is measured or observed in an individual and which varies from person to person
71
Washout period
The time interval between the administration of different treatments in subjects in a crossover trial that prevents there being any carry-over effects of the current treatment when the next treatment starts
72
Wilcoxon matched pairs test
A statistical test comparing ordinal data from paired samples
73
Wilcoxon rank sum test
A statistical test comparing ordinal data from two independent groups; equivalent to the Mann-Whitney U test
74
Z-test for proportions
A statistical test used to compare proportions from two independent samples
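One common form of this test uses a pooled proportion to estimate the standard error under the null hypothesis. The sketch below assumes that form and a two-sided alternative; the function name and the counts are hypothetical.

```python
import math

def two_proportion_z_test(events1, n1, events2, n2):
    """Z statistic and two-sided P value for two independent proportions."""
    p1, p2 = events1 / n1, events2 / n2
    # Pooled proportion, assuming no real difference between the groups.
    pooled = (events1 + events2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail area of the standard Normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 40 of 100 respond in one group, 25 of 100 in the other.
z, p_value = two_proportion_z_test(40, 100, 25, 100)
print(f"z = {z:.2f}, P = {p_value:.3f}")
```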
75
Stratification for prognostic factors
When there are important prognostic factors that need to be accounted for in a particular trial, the random allocation can be stratified so that the treatment groups are balanced for the prognostic factors.
76
Minimization
Allocation takes place in a way that best maintains balance in important prognostic factors. At all stages of recruitment, the next patient is allocated to that treatment which minimizes the overall imbalance in prognostic factors
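The allocation rule described above can be sketched as follows. This is one simple variant, not a definitive implementation: it assumes imbalance is measured as the range of counts across arms summed over the prognostic factors, and that ties are broken at random. The function name, factor names, and patients are hypothetical.

```python
import random

def minimization_allocate(new_patient, allocated, treatments=("A", "B"), rng=random):
    """Choose the treatment that minimizes overall imbalance in prognostic factors.

    new_patient: dict mapping each prognostic factor to this patient's level.
    allocated: list of (factors, treatment) pairs for patients already recruited.
    """
    scores = {}
    for candidate in treatments:
        imbalance = 0
        for factor, level in new_patient.items():
            # Count earlier patients with the same level of this factor in
            # each arm, then add the new patient to the candidate arm.
            counts = {t: 0 for t in treatments}
            for factors, treatment in allocated:
                if factors.get(factor) == level:
                    counts[treatment] += 1
            counts[candidate] += 1
            imbalance += max(counts.values()) - min(counts.values())
        scores[candidate] = imbalance
    best = min(scores.values())
    # Assumption of this sketch: ties are broken by random choice.
    return rng.choice([t for t in treatments if scores[t] == best])

# Hypothetical patients already in the trial.
allocated = [
    ({"sex": "F", "age": "<50"}, "A"),
    ({"sex": "F", "age": ">=50"}, "A"),
    ({"sex": "M", "age": "<50"}, "B"),
]
print(minimization_allocate({"sex": "F", "age": "<50"}, allocated))  # B
```

In this example arm A already holds both women, so placing the new woman in arm B gives a total imbalance of 2 rather than 4.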
77
Advantages of parallel group study design
The comparison of the treatments takes place concurrently. Can be used for any condition, especially an acute condition which is cured or self-limiting, such as an infection. No problem of carry-over effects.
78
Disadvantages of parallel group study design
The comparison is between patients and so usually needs a bigger sample size than the equivalent cross-over trial
79
Advantages of crossover study designs
Treatments are compared within patients, and so differences between patients are accounted for explicitly. Usually need fewer subjects than the equivalent parallel group trials. Can be used to test treatments for chronic conditions.
80
Disadvantages of crossover study designs
Cannot be used for many acute illnesses. Carry-over effects need to be controlled. Likely to take longer than the equivalent parallel designs. Statistical analysis is more complicated if subjects do not complete all periods.
81
Zelen Randomised Consent Design
Subjects are randomly allocated to treatment or usual care. Only those subjects who are allocated to treatment are invited to participate and to give their consent; subjects allocated to usual care (control) are not asked to give their consent. Among the treatment group, some subjects will refuse, and so this design results in three treatment groups: (1) usual care (allocated); (2) intervention; (3) usual care (but allocated to intervention). The analysis is performed with patients analysed in their original randomized groups, i.e. 1 versus 2 + 3 (an intention to treat analysis).