Week 1 Day 1 - Mathematics Flashcards

1
Q

Descriptive statistics

A

Used to organize, summarize, and present the values
Draws NO conclusions

“The data is the data”

2
Q

Inferential statistics

A

Used to draw conclusions about data

3
Q

Categorical variable

A

variable with a discrete or qualitative value

male/female
liking tofu 1-5 scale
shirt (4 types)
quarantine activity is qualitative, but is infinite, not discrete

4
Q

Continuous variable

A

variable that can be measured along a continuum

age
temp
height
years as a nurse

5
Q

nominal

A

categorical variable

no intrinsic order - shirt, quarantine activity

6
Q

ordinal

A

categorical variable

have order - tofu (1,2,3,4,5)

7
Q

dichotomous

A

categorical variable

only 2 values - m/f (order doesn’t matter)

8
Q

interval

A

continuous variable

numeric value and is measured

e.g. age, temp, height, years as a nurse

9
Q

ratio

A

continuous variable

like interval, but a value of ‘0’ indicates there is nothing

e.g. age, height, years as a nurse

temp is not a ratio variable: 0°F does not mean there is no temperature, so a ratio such as 75F/70F has no meaningful interpretation

10
Q

mean

as it relates to variables

A

advantage: easy to calc
disadvantage: affected by outliers

ratio (height, age): yes
interval (temp): yes
ordinal (tofu): maybe, possible mathematically, but you shouldn’t
nominal (shirt): no

11
Q

median

as it relates to variables

A

advantage: outlier insensitive

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): yes
nominal (shirt): no

12
Q

mode

as it relates to variables

A

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): yes
nominal (shirt): yes

13
Q

measures of central tendency

A

mean, median, mode
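
A minimal Python sketch of the three measures, using the standard-library `statistics` module. The ages and shirt types are made-up illustration values, not course data; the single outlier is there to show why the mean is outlier sensitive and the median is not.

```python
import statistics

# Hypothetical ages (made up for illustration); 90 is a deliberate outlier.
ages = [22, 24, 24, 25, 27, 90]

mean_age = statistics.mean(ages)      # pulled upward by the outlier
median_age = statistics.median(ages)  # middle of the sorted values
mode_age = statistics.mode(ages)      # most frequent value

# Mode is the only one of the three that works for a nominal variable.
shirts = ["polo", "tee", "tee", "tank"]
mode_shirt = statistics.mode(shirts)

print(mean_age, median_age, mode_age, mode_shirt)
```

The mean lands above 35 even though five of the six people are under 30, while the median stays at 24.5.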

14
Q

measures of variability/spread

A

describes the manner in which data are scattered around a specific value (such as the mean)

range 
interquartile range
standard deviation
standard error of the mean
percentile
15
Q

range

definition + as it relates to variables

A

highest value minus lowest value

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): yes
nominal (shirt): no

16
Q

interquartile range

definition + as it relates to variables

A

refers to the upper and lower boundaries defining a middle percentage of observations

75th percentile - 25th percentile (the middle 50%)
also commonly used: 90th percentile - 10th percentile (the middle 80%)

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): yes
nominal (shirt): no
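
A small sketch of range and interquartile range in Python, assuming made-up heights. `statistics.quantiles` is one of several accepted ways to compute quartiles, so other tools may give slightly different cut points for the same data.

```python
import statistics

# Hypothetical heights in cm (made up for illustration).
heights = [150, 155, 160, 165, 170, 175, 180, 185, 210]

# Range: highest value minus lowest value.
value_range = max(heights) - min(heights)

# n=4 splits the data into quarters and returns the three cut points
# (25th, 50th and 75th percentiles).
q1, q2, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1

print(value_range, q1, q3, iqr)
```

The range is stretched by the single 210 cm value, while the IQR only looks at the middle half of the data.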

17
Q

standard deviation

definition + as it relates to variables

A

measure of variability
how much people/subjects differ from the average (mean)

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): maybe (we can, but we shouldn’t)
nominal (shirt): no

18
Q

standard error of the mean

definition + as it relates to variables

A

how well does the sample mean represent the population mean

error of the mean gets smaller as the sample gets bigger

describes the amount of variability in the measurement of the population mean from several different samples

ratio (age, height): yes
interval (temp): yes
ordinal (tofu): maybe (we can, but we shouldn’t)
nominal (shirt): no
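
A sketch of how standard deviation and standard error of the mean relate, assuming the usual formula SEM = SD / sqrt(n) (the card does not spell it out). The ages are made-up illustration values.

```python
import math
import statistics

# Hypothetical ages (made up for illustration).
sample = [30, 32, 34, 36, 38]

sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
sem = sd / math.sqrt(len(sample))   # standard error of the mean

# If the SD stayed the same but we had 4x as many subjects,
# the standard error would be cut in half.
sem_if_n_20 = sd / math.sqrt(20)

print(round(sd, 3), round(sem, 3), round(sem_if_n_20, 3))
```

This is the "error of the mean gets smaller as the sample gets bigger" point: the SD describes the subjects, the SEM describes how well the mean is pinned down.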

19
Q

inferential statistics

A

trying to reach conclusions that extend beyond the immediate data alone

20
Q

Null hypothesis

A

There is no difference

21
Q

T test

A

simplest test for difference between 2 groups

the greater the magnitude of “t”, the more likely the groups are different (statistically different)
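
A rough sketch of where "t" comes from. The card does not say which version of the t test is meant, so this assumes the unequal-variance (Welch) form: difference in means divided by the standard error of that difference. The function name `t_statistic` and the blood pressure values are made up for illustration.

```python
import math
import statistics

def t_statistic(group_a, group_b):
    """Two-sample t statistic (Welch form, an assumption of this sketch)."""
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Squared standard error of each group's mean: variance / n.
    se2_a = statistics.variance(group_a) / len(group_a)
    se2_b = statistics.variance(group_b) / len(group_b)
    return mean_diff / math.sqrt(se2_a + se2_b)

# Hypothetical systolic blood pressures.
control = [120, 122, 124, 126, 128]
similar = [121, 123, 125, 127, 129]
different = [140, 142, 144, 146, 148]

t_small = t_statistic(similar, control)    # means 1 apart
t_large = t_statistic(different, control)  # means 20 apart

print(t_small, t_large)
```

With the same spread in every group, the larger gap between means gives the larger t. Turning t into a p-value needs the t distribution and degrees of freedom, which this sketch leaves out.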

22
Q

Reasons research may not be valid

A

bias
chance
confounders

23
Q

chance

A

caused by random variations in subjects and measurements

larger sample size will reduce chance errors

24
Q

bias

A

systematic variation

larger sample size WILL NOT help

25
Types of bias
selection bias
measurement bias
analysis bias
26
selection bias
biased sampling of population
27
measurement bias
systematic bias: poor measurement technique
e.g. Spanish vs Portuguese men's height
28
analysis bias
using analysis that favors one conclusion over another
"torture the data until you get the conclusion that you want"
29
confounding
similar to bias
misinterpretation of accurate variables
occurs when an investigator falsely concludes that a particular exposure is causally related to a disease without adjusting for other factors that are known risk factors for the disease and are associated with the exposure
30
POEM
Patient Oriented Evidence that Matters
What patients really care about: mortality and morbidity
31
DOE
Disease Oriented Evidence
The stuff that patients don't care about, but is related to disease
blood pressure, cholesterol, blood glucose
32
percentile
percentage of a distribution that is below a specific value
e.g. a child is in the 80th percentile for height if only 20% of children of the same age are taller than he is
33
experimental study
researcher assigns exposure
can't usually assign BAD exposures
34
randomized controlled trial
experimental study
assignment to exposure is determined purely by chance (allocation is random)
usually double blind, has controls
randomizing helps (but does not guarantee) to get rid of confounding and bias
35
observational study
researcher did not assign exposure
36
cohort study
observational study
subjects with an exposure of interest (e.g. HTN) and subjects without the exposure are identified and then followed forward in time to determine outcomes (e.g. stroke)
exposure -----> outcome
disadvantage: longitudinal study, takes a long time
e.g. Framingham
37
case-control study
observational study
first identify a group of subjects with a certain disease and a control group without the disease, and then look back in time to find exposure to risk factors for the disease
advantages: well suited for rare diseases, doesn't take a long time
outcome -----> exposure
disadvantage: much more likely to have biases because it's hard to recruit a bunch of controls who are just like your cases, except they don't have the disease
38
cross-sectional study
observational study
examines presence or absence of a disease or presence or absence of an exposure at a particular time
disadvantage: since exposure and outcome are ascertained at the same time, it is often unclear if the exposure preceded the outcome
39
case report or case series
descriptive study
reports on a single patient or a series of patients with a certain disease
disadvantage: usually generates a hypothesis but cannot test it because it does not include an appropriate comparison group
40
measures of frequency of events
incidence
incidence rate
prevalence
41
incidence
number of NEW events that occur during a specified period of time in a population at risk for developing the events
new events per unit of time
42
incidence rate
incidence that reports the number of new events that occur over the sum of time individuals in the population were at risk for having the event (i.e. events/person-years)
new cases per year (or other time frame) per population
43
prevalence
number of persons in the population affected by a disease at a specific time / number of persons in the population at that time
cumulative incidence (when someone dies, they fall out of the prevalence pool)
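
A quick arithmetic sketch of the three frequency measures. All counts are made up for illustration, and "incidence" is computed here as new cases divided by the population at risk over one year, which is one common reading of the card.

```python
# Hypothetical one-year follow-up of a town (all numbers made up).
population_at_risk = 1000    # disease-free people at the start of the year
new_cases = 20               # NEW events during the year
person_years_at_risk = 990   # summed time each person was actually at risk

existing_cases_today = 150   # everyone living with the disease right now
population_today = 1000

# Incidence: new events in the population at risk over the period.
incidence = new_cases / population_at_risk

# Incidence rate: new events per unit of person-time at risk.
incidence_rate = new_cases / person_years_at_risk

# Prevalence: affected persons / total persons at one point in time.
prevalence = existing_cases_today / population_today

print(incidence, round(incidence_rate, 4), prevalence)
```

Prevalence is much higher than incidence here because it counts every existing case, not just the new ones.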
44
How close the average of measured values is to the true value
accuracy
45
how close measured values are to each other
precision
standard deviation is a measure of precision, not accuracy!
46
%error
100% * (measured value - "true" value) / "true" value
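
The formula above as a small Python function. The name `percent_error` and the example measurements are made up for illustration.

```python
def percent_error(measured, true_value):
    """Signed percent error: 100 * (measured - true) / true."""
    return 100.0 * (measured - true_value) / true_value

# Hypothetical measurements.
too_high = percent_error(110.0, 100.0)  # measured above the true value
too_low = percent_error(2.50, 2.54)     # measured below the true value

print(too_high, round(too_low, 2))
```

The sign tells you the direction of the error: positive means the measurement overshot, negative means it undershot.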
47
population
group from which data is to be collected
48
sample
subset of a population
49
1 in ---> ? cm
2.54 cm
50
peta (P)
10^15 1E+15 quadrillion
51
tera (T)
10^12 1E+12 trillion
52
giga (G)
10^9 1E+9 billion
53
mega (M)
10^6 1E+6 million
54
kilo (k)
10^3 1000 thousand
55
hecto (h)
10^2 100 hundred
56
deca (da)
10^1 10 ten
57
deci (d)
10^-1 0.1 tenth
58
centi (c)
10^-2 .01 hundredth
59
milli (m)
10^-3 .001 thousandth
60
micro (μ)
10^-6 1E-6 millionth
61
nano (n)
10^-9 1E-9 billionth
62
pico (p)
10^-12 1E-12 trillionth
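
A sketch of converting between the prefixes on the cards above by subtracting exponents. The table `PREFIX_EXPONENT` and the function `convert` are made-up names for illustration; the empty string stands for the unprefixed base unit.

```python
# Power of ten for each prefix symbol from the cards ("" = base unit).
PREFIX_EXPONENT = {
    "P": 15, "T": 12, "G": 9, "M": 6, "k": 3, "h": 2, "da": 1,
    "": 0,
    "d": -1, "c": -2, "m": -3, "μ": -6, "n": -9, "p": -12,
}

def convert(value, from_prefix, to_prefix):
    """Re-express a value given in one prefixed unit in another prefixed unit."""
    shift = PREFIX_EXPONENT[from_prefix] - PREFIX_EXPONENT[to_prefix]
    return value * 10.0 ** shift

inch_in_mm = convert(2.54, "c", "m")  # 1 in = 2.54 cm, expressed in mm
km_in_m = convert(1, "k", "")         # 1 km expressed in m

print(inch_in_mm, km_in_m)
```

Going to a smaller unit makes the number bigger, so 2.54 cm comes out as about 25.4 mm.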
63
exact number
number NOT obtained using a measuring device
easily countable, absolutely no question of value
small number that can be reproducibly determined by counting
64
How can we improve accuracy?
making replicate measurements and taking the average
65
How can we improve precision?
careful lab technique and/or using instruments capable of yielding greater precision
66
measures of association
relative risk
odds ratio
absolute risk
attributable risk
population attributable risk
NNT (number needed to treat)
67
Relative risk
incidence of disease in the exposed group divided by the corresponding incidence of disease in the unexposed group
used in cohort studies
RR --> across rows on chart
68
Odds ratio
odds of exposure in the group with disease divided by the odds of exposure in the control group
used in case-control studies
OR --> down columns on chart
69
Number needed to treat (NNT)
number of patients who would need to be treated to prevent one adverse outcome
considers cost effectiveness
considers what is being cured
70
absolute risk
relative risk and odds ratio provide a measure of risk compared with a standard
However, a 40% increase in risk of heart disease because of a particular exposure does not provide insight into the likelihood that exposure in an individual patient will lead to heart disease.
71
attributable risk or risk difference
measure of absolute risk
difference between the incidence rates in the exposed and non-exposed groups
72
population attributable risk
describes the excess rate of disease in the total study population of exposed and non-exposed individuals that is attributable to the exposure
calculated by multiplying the attributable risk by the proportion of exposed individuals in the population
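
A worked sketch of the measures of association on one 2x2 chart. All counts are made up for illustration. The NNT line assumes the standard formula NNT = 1 / absolute risk reduction, which the cards do not state, and uses separate made-up treatment numbers.

```python
# Hypothetical 2x2 chart (rows = exposure, columns = disease):
#               disease   no disease
# exposed          30         70
# unexposed        10         90
a, b = 30, 70   # exposed row
c, d = 10, 90   # unexposed row

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

# Relative risk: compare incidence across the rows.
relative_risk = risk_exposed / risk_unexposed

# Odds ratio: odds of exposure among diseased / odds of exposure among controls
# (work down the columns).
odds_ratio = (a / c) / (b / d)

# Attributable risk (risk difference).
attributable_risk = risk_exposed - risk_unexposed

# Population attributable risk: AR * proportion of the population exposed.
proportion_exposed = (a + b) / (a + b + c + d)
population_attributable_risk = attributable_risk * proportion_exposed

# NNT for a hypothetical treatment: 10% events untreated, 6% treated.
absolute_risk_reduction = 0.10 - 0.06
nnt = 1 / absolute_risk_reduction

print(relative_risk, round(odds_ratio, 2), round(attributable_risk, 2),
      round(population_attributable_risk, 2), round(nnt))
```

Note how the RR of 3 sounds dramatic while the absolute difference is 20 percentage points; the two answer different questions.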
73
measures of diagnostic test accuracy
sensitivity
specificity
positive predictive value
negative predictive value
74
positive predictive value
probability of disease in a patient with a positive test
75
negative predictive value
probability that the patient does not have disease if he has a negative result
76
sensitivity
ability of the test to identify correctly those who have the disease
test with high sensitivity has few false negative results
sensitivity rules out, specificity rules in
77
specificity
ability of the test to identify correctly those who do not have the disease
test with high specificity has few false positive results
*how specific is this test for this disease*
sensitivity rules out, specificity rules in
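
A sketch of the four diagnostic accuracy measures from one table of test results. The counts are made up for illustration (100 people with the disease, 900 without).

```python
# Hypothetical test results (all counts made up).
true_pos = 90    # diseased, test positive
false_neg = 10   # diseased, test negative
false_pos = 90   # healthy, test positive
true_neg = 810   # healthy, test negative

# Sensitivity: of those WITH the disease, how many test positive.
sensitivity = true_pos / (true_pos + false_neg)

# Specificity: of those WITHOUT the disease, how many test negative.
specificity = true_neg / (true_neg + false_pos)

# PPV: of those who test positive, how many have the disease.
ppv = true_pos / (true_pos + false_pos)

# NPV: of those who test negative, how many are disease free.
npv = true_neg / (true_neg + false_neg)

print(sensitivity, specificity, ppv, round(npv, 3))
```

Even with 90% sensitivity and 90% specificity, only half of the positives here are true positives, because the disease is uncommon in this made-up population.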
78
Probability of incorrectly concluding there is a statistically significant difference in the population when none exists.
Type I error (alpha)
79
Probability of incorrectly concluding that there is no statistically significant difference in a population when one exists.
Type II error (beta)
80
Measure of the ability of a study to detect a true difference
Power
81
Confidence intervals
gives a range of values within which there is a high probability (95% by convention) that the true population value can be found
CI narrows as the # of observations increases or SD decreases
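
A sketch of why the CI narrows with more observations, assuming the common normal approximation for a mean: mean +/- 1.96 * SD / sqrt(n). That formula and the function name `ci95` are not on the card, and the numbers are made up.

```python
import math

def ci95(mean, sd, n):
    """Approximate 95% CI for a mean (normal approximation, an assumption)."""
    half_width = 1.96 * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Same hypothetical mean and SD, two different sample sizes.
low_25, high_25 = ci95(120.0, 10.0, 25)
low_100, high_100 = ci95(120.0, 10.0, 100)

print(round(low_25, 2), round(high_25, 2))
print(round(low_100, 2), round(high_100, 2))
```

Quadrupling the number of observations halves the width of the interval; shrinking the SD has the same kind of effect.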
82
Kaplan-Meier Analysis
Survival analysis
ratio of surviving subjects (those without an event) / total number of subjects at risk for the event
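
A rough sketch of how the survival ratio is chained over time. It assumes the usual product-limit form, where the ratio of survivors to subjects at risk at each event time is multiplied into a running survival probability; the card only gives the single-step ratio. The function name `kaplan_meier` and the counts are made up.

```python
def kaplan_meier(steps):
    """Running survival probability from (at_risk, events) pairs, in time order."""
    survival = 1.0
    curve = []
    for at_risk, events in steps:
        # Ratio of surviving subjects to subjects at risk at this event time.
        survival *= (at_risk - events) / at_risk
        curve.append(survival)
    return curve

# Hypothetical follow-up: the number at risk drops faster than the events alone
# explain because some subjects are lost to follow-up (censored).
steps = [(10, 1), (8, 2), (5, 1)]
curve = kaplan_meier(steps)

print([round(s, 3) for s in curve])
```

Censored subjects leave the "at risk" count without counting as events, which is why the denominator changes from step to step.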