Biostatistics Flashcards
(23 cards)
what % of the data falls within 1 SD, 2 SD, and 3 SD of the mean in a normal distribution
- 1SD: 68%
- 2SD: 95%
- 3SD: 99.7%
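A quick way to sanity-check those three numbers, assuming a perfectly normal distribution. The helper name `fraction_within` is made up for this sketch; `math.erf` is from the Python standard library.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k) * 100:.1f}%")
# prints 68.3%, 95.4%, 99.7% (the card rounds the first two to 68% and 95%)
```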
what does skew right vs. left mean
data is skewed towards outliers; right skew means a high outlier dragged the average up
left skew means a low outlier dragged the average down
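A small illustration of the "outlier drags the average" idea, using made-up numbers (not from any study). The median barely moves, so mean > median hints at right skew and mean < median hints at left skew.

```python
import statistics

# Hypothetical data sets: one high outlier vs. one low outlier.
right_skewed = [2, 3, 3, 4, 4, 5, 40]
left_skewed = [-30, 5, 6, 6, 7, 7, 8]

for name, data in (("right", right_skewed), ("left", left_skewed)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    print(f"{name} skew: mean = {mean:.2f}, median = {median}")
```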
Type I error vs. type II
- type I: rejected the null when it was true (alpha, false positive) - we usually set alpha to 0.05, and if the p value is < alpha, statistical significance is achieved
- type II: failed to reject the null when it should have been rejected (beta, false negative) - risk of this increases with smaller sample sizes
power
- the probability that the test avoids making a type II error (false negative); power = 1 - beta
- higher power, less likely to get a false negative (risk of a false negative increases with smaller sample sizes)
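To see why small samples hurt power, here is a rough sketch for a two-sided one-sample z-test. That test choice, the function name `z_test_power`, and the effect size of 0.5 SD are all assumptions picked for illustration; real studies usually use dedicated power software.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true difference in SD units (an assumed value).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)   # cutoff for significance
    shift = effect_size * n ** 0.5      # how far the true effect moves the test statistic
    # Chance the statistic lands beyond either cutoff when the effect is real.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

for n in (10, 30, 100):
    power = z_test_power(0.5, n)
    print(f"n = {n}: power = {power:.2f}, beta = {1 - power:.2f}")
```

Power climbs toward 1 (and beta shrinks) as n grows, which matches the card.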
equation time: risk
number of subjects in group with bad event divided by total number of subjects in group
interpretation: how likely the bad thing will happen in a single group
equation time: relative risk (RR), aka risk ratio
risk in treatment group divided by risk in control
interpretation: whether the treatment group has higher (RR > 1), lower (RR < 1), or the same (RR = 1) risk as the control group
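A minimal sketch of the risk and RR equations with made-up counts (10 of 100 treated patients and 20 of 100 control patients have the bad event). The function names are hypothetical.

```python
def risk(events, total):
    """Risk = subjects with the bad event / all subjects in the group."""
    return events / total

def relative_risk(risk_treatment, risk_control):
    """RR = risk in treatment group / risk in control group."""
    return risk_treatment / risk_control

# Hypothetical trial counts, for illustration only.
risk_tx = risk(10, 100)
risk_ctrl = risk(20, 100)
rr = relative_risk(risk_tx, risk_ctrl)
print(f"risk (treatment) = {risk_tx:.2f}, risk (control) = {risk_ctrl:.2f}, RR = {rr:.2f}")
```

Here RR comes out below 1, meaning the treatment group had a lower risk than control.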
equation time: relative risk reduction (RRR)
(% risk in control minus % risk in treatment) divided by % risk in control = 1 - RR
interpretation: how much risk was reduced in treatment vs. control group
equation time: absolute risk reduction (ARR)
% risk in control group minus % risk in treatment
interpretation: similar to RRR, but instead of a ratio of the 2 risk percentages you get the actual difference between them, so it reflects how common the event really is (the actual incidence rate)
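A sketch contrasting RRR and ARR, using two made-up scenarios with the same RRR but very different baseline risks. All numbers and function names are hypothetical.

```python
def rrr(risk_control, risk_treatment):
    """Relative risk reduction = (control risk - treatment risk) / control risk."""
    return (risk_control - risk_treatment) / risk_control

def arr(risk_control, risk_treatment):
    """Absolute risk reduction = control risk - treatment risk."""
    return risk_control - risk_treatment

# Same 50% RRR, but the event is common in one scenario and rare in the other.
for label, ctrl, tx in (("common event", 0.20, 0.10), ("rare event", 0.002, 0.001)):
    print(f"{label}: RRR = {rrr(ctrl, tx):.0%}, ARR = {arr(ctrl, tx):.1%}")
```

The RRR looks equally impressive in both scenarios; the ARR shows how little absolute risk is removed when the event is rare.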
equation time: NNT and NNH
1 divided by ARR (as a decimal) = 1 divided by (risk in control minus risk in treatment)
- NNT interpretation: how many patients need to be treated for one of them to have a positive outcome; round up
- NNH interpretation: how many patients need to be treated for one of them to be harmed; round down (the difference is flipped to risk in treatment minus risk in control, since the treatment raises the risk of the harm)
- so what’s the difference? if we give patients ibuprofen, the “risk” in NNT looks specifically at how many people we need to treat for someone to have pain relief. NNH can be calculated from that same sample, and gives us information on how many patients we need to give ibuprofen to in order for one of them to develop a stomach ulcer - technically same equation, but one was the intended outcome and one was an ADR
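A sketch of the NNT/NNH arithmetic and the rounding rules. The risks below are invented for illustration (they are not real ibuprofen data), and the function names are hypothetical.

```python
import math

def nnt(risk_control, risk_treatment):
    """Number needed to treat = 1 / ARR, rounded UP."""
    return math.ceil(1 / (risk_control - risk_treatment))

def nnh(risk_treatment, risk_control):
    """Number needed to harm = 1 / (absolute risk increase), rounded DOWN."""
    return math.floor(1 / (risk_treatment - risk_control))

# Hypothetical: bad outcome (no pain relief) in 20% of control vs. 12% of treated.
print("NNT:", nnt(0.20, 0.12))   # 1 / 0.08 = 12.5 -> round up
# Hypothetical: ulcer (ADR) in 5% of treated vs. 2% of control.
print("NNH:", nnh(0.05, 0.02))   # 1 / 0.03 = 33.3 -> round down
```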
equation time: odds ratio
(people with intervention and outcome multiplied by people without intervention and without outcome) divided by (people with intervention and no outcome multiplied by people without intervention who still had the outcome)
- interpretation: compares the odds of the event in the study group to the odds of the event in the placebo group
- so why does the math look like that? (ratio in yes to (:) ratio in no) -> (fraction in yes divided by fraction in no), where each "ratio" is the odds within that group (had outcome : no outcome)
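A sketch of the odds ratio from a 2x2 table with made-up counts. The cell letters a, b, c, d are a common labeling convention assumed here, not something from the card.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table.

    a = intervention, outcome        b = intervention, no outcome
    c = no intervention, outcome     d = no intervention, no outcome
    """
    return (a * d) / (b * c)

# Hypothetical counts, for illustration only.
a, b, c, d = 10, 90, 20, 80
print(f"OR (cross-product) = {odds_ratio(a, b, c, d):.3f}")
# Same thing written as odds in the intervention group divided by odds in the other group.
print(f"OR (odds / odds)   = {(a / b) / (c / d):.3f}")
```

Both lines print the same value, which is why the cross-multiplied form works.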
equation time: hazard ratio
hazard rate in treatment divided by hazard rate in control
no interpretation because i’m tired of stats and it’s only been 2 hrs
requirements for t-test and what does it even do
- t-test: checks for statistical differences between treatment and control group
- normally distributed continuous data
- one sample t-test: collected data vs. known data from general population
- paired t-test: same group is used for a pre/post measurement
- student's t-test (unpaired): separate groups for treatment and control
requirements for ANOVA and what does it even do
- like t-test, checks for statistical differences between groups where the collected data is continuous and normally distributed
- unlike t-test, must have 3 or more groups
chi square test requirements and what does it even do?
- like t-test, used to determine statistical differences between groups
- unlike t-test, canNOT be continuous data; must be categorical data (nominal or ordinal)
so if continuous data isn’t normally distributed, what do we do to compare differences between groups?
use a Mann-Whitney test (2 independent groups) or a Wilcoxon signed-rank test (paired data)
will not go into more detail, was not even underlined in book
- remember that t-test and anova are for normally distributed continuous data
- if data is discrete/categorical: chi-square test (or Fisher exact for small samples; Wilcoxon or Mann-Whitney if the data is ordinal)
regression and types
- used to explain how much the dependent variable changes per change in the independent variable
useful when you need to consider many confounding factors
types:
- linear - for continuous data
- logistic - for categorical data
- cox regression - for categorical data in a survival analysis
sensitivity vs. specificity
- sensitivity (true positive): if sensitivity is <100%, then there will be patients who have the disease who do not test positive (false neg)
- specificity (true negative): if specificity is <100%, then there will be patients who test positive but do not have disease (false pos)
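A sketch of how both numbers come out of test results, using the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)). The counts and function names are made up for illustration.

```python
def sensitivity(true_pos, false_neg):
    """Of the patients who HAVE the disease, the fraction who test positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Of the patients who do NOT have the disease, the fraction who test negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical: 100 diseased patients (90 test positive), 100 healthy patients (80 test negative).
print(f"sensitivity = {sensitivity(90, 10):.0%}")  # the 10 missed patients are false negatives
print(f"specificity = {specificity(80, 20):.0%}")  # the 20 wrongly flagged are false positives
```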
intention to treat vs. per protocol analysis
- intention to treat: include patients who ghosted you or couldn’t follow directions
- per protocol: only include patients who followed the instructions down to the fine print and understood the assignment
case control vs. cohort study
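- case control: starts from the outcome, comparing patients with the disease (cases) to patients without (controls) and looking back at their exposures
- cohort study: starts from the exposure, following exposed and unexposed groups to see who develops the outcome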
RCT jargon: double-blind vs. single-blind vs. open-label
- double blind: patient and researcher don’t know who is who
- single blind: only patient doesn’t know what they’re actually getting
- open-label: tea was spilt to everyone (patients may drop out knowing that they didn’t get the med, or patients with the med may report improvement due to bias)
what’s an ECHO model
this is from the biostats section and has nothing to do with scans
economic, clinical and humanistic outcomes: provides a broad evaluative framework to assess the outcomes associated with diseases and treatments
- economic: cost of things both direct, indirect, and intangible
- clinical: speaks for itself
- humanistic: consequences of the disease as reported by patient or caregiver (QoL)
equation time: incremental cost-effectiveness ratio (ICER)
cost difference between A and B divided by effect (outcome) difference between A and B
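A sketch of the ratio with invented numbers: treating 100 patients costs $500,000 with drug A (30 cures) and $300,000 with drug B (20 cures). The drug labels, figures, and function name are hypothetical.

```python
def incremental_cost_ratio(cost_a, cost_b, effect_a, effect_b):
    """(cost A - cost B) / (effect A - effect B): extra cost per extra unit of effect."""
    return (cost_a - cost_b) / (effect_a - effect_b)

# Hypothetical totals per 100 patients treated.
ratio = incremental_cost_ratio(500_000, 300_000, 30, 20)
print(f"extra cost per additional cure with A instead of B: ${ratio:,.0f}")
```

With these numbers, choosing A over B costs an extra $20,000 for each additional cure.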
cost-minimization analysis vs cost benefit analysis vs cost effectiveness vs cost-utility
- cost minimization analysis: looks at 2 equally efficacious treatments (in terms of outcomes) and compares their cost
- cost benefit analysis: compares benefits to the cost of the intervention in monetary units
- cost effectiveness analysis: compares measurable clinical effects of intervention (lab values, duration of stay, % cure) to their respective costs (usually in $) - canNOT be used to compare things with different clinical effect units
- cost utility analysis: type of cost effectiveness analysis that also includes QoL and morbidity assessments (example of outcome unit: quality adjusted life year)