3. Medical Statistics intro Flashcards

(56 cards)

1
Q

what is a STATISTIC

A

a numerical summary of a SAMPLE

2
Q

what is a PARAMETER

A

a numerical summary of the POPULATION

3
Q

what is CATEGORICAL data

A

QUALITATIVE

4
Q

what is NUMERICAL data

A

QUANTITATIVE

5
Q

CATEGORICAL data can be split into:

A

NOMINAL and ORDINAL

6
Q

what is NOMINAL vs ORDINAL data with examples
(categorical)

A
  • Nominal: categories are mutually exclusive and UNORDERED

eg. sex, blood group, ethnicity, survival after 10 years

  • Ordinal: categories are mutually exclusive and ORDERED

eg. disease stage, education level, heart murmur grade

7
Q

NUMERICAL data can be split into…

A

DISCRETE and CONTINUOUS

8
Q

what is DISCRETE vs CONTINUOUS data (numerical)

A
  • Discrete: take only INTEGER VALUES (COUNT 0,1,2..)

eg. NUMBER OF pregnancies, number of asthma exacerbations

  • Continuous: take ANY VALUE in a given interval

eg. weight, blood pressure, cholesterol levels, age

9
Q

PROS and CONS of CONVERTING NUMERICAL to CATEGORICAL

(eg systolic bp (mmHg) -> hypertensive (≥140), normotensive (<140))

A

PROS:
- EASIER to DESCRIBE POPULATION by the % of people AFFECTED
- EASIER to make TREATMENT DECISIONS if population is GROUPED

CONS:
- LOSE INFORMATION
- how to DECIDE CUT OFF? what is abnormal?
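
As a rough illustration of the trade-off above, the sketch below dichotomises some systolic blood pressure readings. The 140 mmHg cut-off comes from the card; the readings, the function name `classify_bp`, and the choice to count a reading of exactly 140 as hypertensive are assumptions made for illustration only.

```python
# Hypothetical systolic blood pressure readings in mmHg (made-up data).
readings = [118, 139, 140, 141, 165]

def classify_bp(systolic, cutoff=140):
    """Turn a numerical reading into a category.

    Treating a reading equal to the cut-off as hypertensive is an assumption.
    """
    return "hypertensive" if systolic >= cutoff else "normotensive"

categories = [classify_bp(x) for x in readings]

# PRO: the group is easy to describe as a percentage affected.
percent_hypertensive = 100 * categories.count("hypertensive") / len(categories)

print(categories)
print(percent_hypertensive)  # 3 of 5 readings -> 60.0
```

Note that 141 and 165 land in the same category, while 139 and 140 are separated: that is the lost information and the cut-off problem listed under CONS.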

10
Q

how to DESCRIBE the DISTRIBUTION of CATEGORICAL variables (what to look at)

A
  • the category with the LARGEST FREQUENCY (MODAL CATEGORY)
  • how FREQUENTLY each category was OBSERVED (%)
11
Q

how to DESCRIBE the DISTRIBUTION of NUMERICAL variables (what to look at)

A
  • SHAPE (do observations cluster in certain intervals?)
  • CENTRE (where does a typical observation fall?)
  • VARIABILITY (how tightly are the observations clustering around a centre)
12
Q

DESCRIBING CATEGORICAL DATA:

PROPORTION vs PERCENTAGE (how to calculate)

A

PROPORTION : the NUMBER OF OBSERVATIONS in that category DIVIDED by the TOTAL NUMBER of OBSERVATIONS

PERCENTAGE = PROPORTION X 100
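
A minimal sketch of the two calculations above, using made-up blood group data. The variable names and the data are assumptions for illustration, not part of the deck.

```python
from collections import Counter

# Hypothetical blood groups for 8 patients (made-up data).
blood_groups = ["O", "A", "O", "B", "A", "O", "AB", "O"]

counts = Counter(blood_groups)
total = len(blood_groups)

# proportion = observations in the category / total observations
proportions = {group: n / total for group, n in counts.items()}
# percentage = proportion x 100
percentages = {group: 100 * p for group, p in proportions.items()}

print(proportions["O"])  # 4 of 8 -> 0.5
print(percentages["O"])  # 50.0
```

The proportions across all categories add up to 1, and the percentages to 100.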

13
Q

DESCRIBING CATEGORICAL DATA:

PROPORTIONS and PERCENTAGES are also called … and serve as a way to..

A

RELATIVE FREQUENCIES

serve as a way to SUMMARIZE the DISTRIBUTION of a CATEGORICAL variable NUMERICALLY

14
Q

DESCRIBING CATEGORICAL DATA:

what is the ABSOLUTE CHANGE (and how to calculate)

A

describes the ACTUAL INCREASE or DECREASE from a REFERENCE VALUE to a NEW VALUE

ABSOLUTE CHANGE = NEW VALUE - REFERENCE VALUE

15
Q

DESCRIBING CATEGORICAL DATA:

what is the RELATIVE CHANGE (and how to calculate)

A

describes the size of the ABSOLUTE CHANGE in COMPARISON to the REFERENCE VALUE
expressed as %

RELATIVE CHANGE = (NEW VALUE - REFERENCE VALUE) / REFERENCE VALUE X 100
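
The two change formulas can be sketched as small functions. The function names and the example numbers (a weight going from 200 kg to 180 kg) are illustrative assumptions.

```python
def absolute_change(reference, new):
    """new value - reference value"""
    return new - reference

def relative_change(reference, new):
    """(new value - reference value) / reference value x 100, in percent"""
    return 100 * (new - reference) / reference

# Hypothetical example: weight goes from 200 kg (reference) to 180 kg (new).
print(absolute_change(200, 180))  # -20, i.e. a decrease of 20 kg
print(relative_change(200, 180))  # -10.0, i.e. a decrease of 10%
```

The same arithmetic gives an absolute or relative difference when the second argument is a compared value rather than a later measurement. A negative sign marks a decrease.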

16
Q

DESCRIBING CATEGORICAL DATA:

Percentages are also commonly used to compare 2 numbers: a REFERENCE VALUE and a COMPARED VALUE (the value compared to the reference)

how do you calculate ABSOLUTE DIFFERENCE

A

ABSOLUTE DIFFERENCE
= COMPARED VALUE - REFERENCE VALUE

17
Q

DESCRIBING CATEGORICAL DATA:

Percentages are also commonly used to compare 2 numbers: a REFERENCE VALUE and a COMPARED VALUE (the value compared to the reference)

how do you calculate RELATIVE DIFFERENCE (%)

A

RELATIVE DIFFERENCE
= (COMPARED VALUE - REFERENCE VALUE) / REFERENCE VALUE X 100

18
Q

DESCRIBING CATEGORICAL DATA:

ABSOLUTE vs RELATIVE

A

ABSOLUTE = difference/change

RELATIVE = Percentage change

eg. weight loss from 200 kg to 180 kg
absolute weight loss = 20 kg
relative weight loss = 10% (20/200 x 100)

19
Q

DESCRIBING CATEGORICAL DATA:

if a value is 20% MORE than the reference value, it is ….% OF the reference value

A

120% OF the reference (100 + P)

20
Q

DESCRIBING CATEGORICAL DATA:

if a value is 20% LESS than the reference value, it is …% OF the reference value

A

80% OF the reference (100 - P)
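
The "P% more / less than" to "% of" conversion from the last two cards, as a small sketch. The function name `percent_of_reference` and the reference value of 50 are hypothetical.

```python
def percent_of_reference(p, direction):
    """Convert 'P% more/less than the reference' into '% of the reference'."""
    if direction == "more":
        return 100 + p
    if direction == "less":
        return 100 - p
    raise ValueError("direction must be 'more' or 'less'")

print(percent_of_reference(20, "more"))  # 120
print(percent_of_reference(20, "less"))  # 80

# Check against a hypothetical reference value of 50:
reference = 50
value = reference * percent_of_reference(20, "more") / 100
print(value)  # 60.0, which is 20% more than 50
```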

21
Q

DESCRIBING NUMERICAL DATA:

what type of graph visualises the DISTRIBUTION of a QUANTITATIVE variable

A

HISTOGRAM

22
Q

DESCRIBING NUMERICAL DATA:

three questions to ask:

A
  1. does the distribution have a SINGLE MOUND / PEAK (MODE)
  2. what is the SHAPE of the distribution
  3. do the data CLUSTER together, or is there a GAP such that one or more observations noticeably differ from the rest
23
Q

DESCRIBING NUMERICAL DATA:

what is UNIMODAL vs BIMODAL distribution

A

UNIMODAL: SINGLE MOUND/PEAK (mode)

BIMODAL: 2 DISTINCT MOUNDS (modes)

24
Q

DESCRIBING NUMERICAL DATA:

SHAPE of the distribution can be:

A

SYMMETRIC: left half is mirror image of right half

SKEWED TO THE LEFT (NEGATIVELY SKEWED) : LONGER LEFT TAIL

SKEWED TO THE RIGHT (POSITIVELY SKEWED): LONGER RIGHT TAIL

25
DESCRIBING NUMERICAL DATA: is LEFT SKEWED data positively or negatively skewed? give an example of a left skew
NEGATIVELY SKEWED: longer left tail
eg. LIFE SPAN: relatively few deaths at young ages, most deaths at older ages
26
DESCRIBING NUMERICAL DATA: is RIGHT SKEWED data positively or negatively skewed? give an example of a right skew
POSITIVELY SKEWED: longer right tail (peaks at low values, then slopes down to the right)
eg. INCOME: most observations at low income, relatively few are rich
27
DESCRIBING NUMERICAL DATA: in a NORMAL DISTRIBUTION what is the 68-95-99.7 % RULE
- within 1 STANDARD DEVIATION of the MEAN (above/below): about 68% of observations
- within 2 STANDARD DEVIATIONS of the MEAN: about 95% of observations
- within 3 STANDARD DEVIATIONS of the MEAN: about 99.7% of observations (ALL or NEARLY ALL)
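
The three percentages in the rule can be checked numerically. The sketch below uses the standard result that, for a normal distribution, the probability of lying within k standard deviations of the mean is erf(k / √2). The function name `prob_within` is an assumption.

```python
import math

def prob_within(k):
    """Probability that a normal observation lies within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * prob_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rule's 68 and 95 are rounded versions of these figures.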
28
DESCRIBING NUMERICAL DATA: NORMAL DISTRIBUTION what % of observations are within 1 STANDARD DEVIATION
68%
29
DESCRIBING NUMERICAL DATA: NORMAL DISTRIBUTION what % of observations are within 2 STANDARD DEVIATIONS
95%
30
How to calculate MEAN
sum of all values / total number of values
31
MODE is most often used with which data type
CATEGORICAL DATA
32
The SHAPE of a distribution INFLUENCES whether the MEAN is LARGER or SMALLER than the MEDIAN. How is the MEAN related to the MEDIAN in a SYMMETRIC DISTRIBUTION?
MEAN = MEDIAN, at the middle peak (for a unimodal distribution)
33
The SHAPE of a distribution INFLUENCES whether the MEAN is LARGER or SMALLER than the MEDIAN. How is the MEAN related to the MEDIAN in a LEFT SKEWED DISTRIBUTION?
MEAN is (usually) SMALLER than the MEDIAN
in a unimodal distribution the median is closer to the peak and the mean is pulled towards the long left tail
34
The SHAPE of a distribution INFLUENCES whether the MEAN is LARGER or SMALLER than the MEDIAN. How is the MEAN related to the MEDIAN in a RIGHT SKEWED DISTRIBUTION?
MEAN is (usually) LARGER than the MEDIAN
in a unimodal distribution the median is closer to the peak and the mean is pulled towards the long right tail
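
A quick numerical check of the last two cards, using Python's standard `statistics` module. Both data sets are made up to have one long tail each.

```python
import statistics

# Hypothetical incomes (in thousands): most values low, a long RIGHT tail.
right_skewed = [20, 22, 25, 25, 28, 30, 35, 60, 150]
# Hypothetical ages at death: most values high, a long LEFT tail.
left_skewed = [30, 60, 70, 75, 78, 80, 82, 85, 88]

# Right skew: mean (about 43.9) is larger than the median (28).
print(statistics.mean(right_skewed), statistics.median(right_skewed))
# Left skew: mean (72) is smaller than the median (78).
print(statistics.mean(left_skewed), statistics.median(left_skewed))
```

In both cases the mean is dragged towards the tail while the median stays near where most observations sit.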
35
for SKEWED DISTRIBUTIONS is mean or median PREFERRED
MEDIAN because it better represents what is TYPICAL
36
is MEDIAN affected by OUTLIERS
NO: the median is RESISTANT to outliers
37
is MEAN affected by OUTLIERS
YES: the mean is NOT RESISTANT to outliers
38
is MODE affected by OUTLIERS
NO: outliers do NOT affect the mode
39
what is affected severely by OUTLIERS
the RANGE, so it is not very informative
the MEAN and STANDARD DEVIATION are also sensitive to outliers
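
The outlier cards above can be illustrated by changing a single value and recomputing each summary. The data (lengths of hospital stay in days) and the helper name `summarise` are hypothetical.

```python
import statistics

def summarise(values):
    """Collect the summaries discussed in the cards above."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),
    }

# Hypothetical lengths of stay in days.
stays = [2, 3, 3, 4, 5]
# Same data, but the longest stay is 40 days instead of 5 (an outlier).
stays_with_outlier = [2, 3, 3, 4, 40]

before = summarise(stays)
after = summarise(stays_with_outlier)

for name in before:
    print(name, before[name], after[name])
```

The median and mode stay at 3, while the mean, range and standard deviation all increase sharply.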
39
STANDARD DEVIATION measures the..
SPREAD of data
40
STANDARD DEVIATION gives a measure of ... by ...
VARIATION by summarising the deviations of each observation from the mean and calculating an adjusted average of these deviations
s = square root of [ sum of (observation - mean)^2 / (n - 1) ]
41
what is the VARIANCE of a set of values
the SQUARE of the STANDARD DEVIATION
variance = s^2
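
A step-by-step sketch of the standard deviation and variance. It assumes the usual sample formula, in which the "adjusted average" divides the summed squared deviations by n - 1; the data are made up.

```python
import math
import statistics

# Hypothetical observations (e.g. systolic blood pressure in mmHg).
values = [120, 130, 125, 135, 140]

n = len(values)
mean = sum(values) / n
squared_deviations = [(x - mean) ** 2 for x in values]

# "Adjusted average" of the squared deviations: divide by n - 1 (assumption).
variance = sum(squared_deviations) / (n - 1)
sd = math.sqrt(variance)

print(mean, variance)  # 130.0 62.5
print(sd)              # about 7.9, in the same units as the data

# The standard library's sample formulas give the same results.
print(statistics.variance(values), statistics.stdev(values))
```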
42
the LARGER the STANDARD DEVIATION the ...
GREATER the VARIABILITY
43
when does S = 0 (standard deviation)
when all observations have the same value; otherwise s > 0
44
UNITS of STANDARD DEVIATION and VARIANCE
standard deviation: the same units as the original observations
variance: squared units
45
can OUTLIERS and SKEW AFFECT STANDARD DEVIATION
YES: s is NOT RESISTANT
strong skewness and outliers can greatly INCREASE s
46
the INTERQUARTILE RANGE (IQR) is ..
the DISTANCE between the THIRD QUARTILE and the FIRST QUARTILE
IQR = Q3 - Q1
gives the spread of the MIDDLE 50% of the data
47
how do you calculate when an observation is a POTENTIAL OUTLIER
using 1.5 x IQR: an observation is a potential outlier if it falls more than 1.5 x IQR below Q1 or more than 1.5 x IQR above Q3
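
A sketch of the 1.5 x IQR rule on made-up data. Textbooks and software use slightly different conventions for computing quartiles; the `method="inclusive"` option of `statistics.quantiles` used here is one such choice and is an assumption, not necessarily the one the course uses.

```python
import statistics

# Hypothetical data with one suspiciously large value.
data = [4, 5, 6, 7, 8, 9, 10, 11, 30]

# Quartiles (the quartile convention is an assumption, see above).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
potential_outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)               # 6.0 10.0 4.0
print(lower_fence, upper_fence)  # 0.0 16.0
print(potential_outliers)        # [30]
```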
48
PERCENTILES: a pth percentile is a value such that..
p% of the observations fall at or below that value
eg. 90th percentile: 90% of the data fall at or below it, 10% above
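
One simple way to turn this definition into a calculation is the "nearest rank" method sketched below: take the smallest observation that has at least p% of the data at or below it. This is only one of several percentile conventions, and the function name and scores are hypothetical.

```python
import math

def percentile_nearest_rank(values, p):
    """Smallest observation with at least p% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical exam scores for 10 students.
scores = [35, 42, 50, 55, 61, 64, 70, 78, 85, 93]

p90 = percentile_nearest_rank(scores, 90)
print(p90)  # 85: 9 of the 10 scores (90%) are at or below 85
```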
49
QUARTILES: Q1, Q2, Q3 divide a set of data into ... groups with ...% of the values in each group
4 groups
25% of the values in each
50
the 5 NUMBER SUMMARY is the basis of a BOX PLOT and consists of:
- MINIMUM VALUE
- Q1
- Q2 (MEDIAN)
- Q3
- MAXIMUM VALUE
potential outliers are marked separately and may lie beyond the ends of the whiskers (above the plotted maximum / below the plotted minimum)
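
The five numbers can be computed directly, as in this sketch. The helper name `five_number_summary` and the data are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption, since different sources compute quartiles slightly differently.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical observations.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

print(five_number_summary(data))  # (1, 4.0, 7.0, 10.0, 15)
```

A box plot draws the box from Q1 to Q3 with a line at the median.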
51
what is a Z SCORE and how do you CALCULATE it
the NUMBER OF STANDARD DEVIATIONS that a given value is ABOVE/BELOW the MEAN
Z = (OBSERVATION - MEAN) / STANDARD DEVIATION
52
a POSITIVE and NEGATIVE Z SCORE indicates...
Positive: observation is ABOVE the mean
Negative: observation is BELOW the mean
53
what does a Z SCORE of 2 say
that the data value is 2 STANDARD DEVIATIONS ABOVE the MEAN (-2 means 2 standard deviations BELOW the mean)
54
Z SCORES allow us to tell..
how UNUSUAL an observation is
the LARGER the Z SCORE in absolute value (positive or negative), the MORE UNUSUAL (eg. -1.3 is more unusual than 1.2)
55
an observation from a BELL-SHAPED distribution is a POTENTIAL OUTLIER if its Z SCORE is
BELOW -3 or ABOVE 3 (more than 3 standard deviations from the mean)
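
The z-score cards above can be pulled together in a short sketch. The mean of 100 and standard deviation of 15 describe a hypothetical bell-shaped variable, and the function name `z_score` is an assumption.

```python
def z_score(observation, mean, sd):
    """(observation - mean) / standard deviation"""
    return (observation - mean) / sd

# Hypothetical bell-shaped variable: mean 100, standard deviation 15.
for x in (130, 85, 150):
    z = z_score(x, 100, 15)
    # Potential outlier if the z score is below -3 or above 3.
    flag = "potential outlier" if abs(z) > 3 else "not flagged"
    print(x, round(z, 2), flag)
```

Here 130 is 2 standard deviations above the mean, 85 is 1 below, and 150 is more than 3 above, so only 150 is flagged.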