Stats Flashcards Preview

Flashcards in Stats Deck (53)
1

Qualitative Variable

A non-numerical variable, like hair colour or eye colour

2

Quantitative Variable

A numerical variable, such as length or time

3

Continuous variable

Can take any value within a given range, e.g. height, time, age.

4

Discrete variable

Can only take certain values, e.g. shoe size, cost in £ and p, number of coins.

5

What type of data do histograms deal with?

Continuous data, so there are no gaps between the bars

6

Define the mode

The value which occurs most often.

7

Define the mean average

The sum of all the values divided by the number of values

8

Define the median

The median is the middle number in an ordered list. If the list has an even number of values, it is the mean of the two middle numbers.
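
As a minimal sketch, the three averages defined in the cards above can be computed with Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical sample: number of fillings recorded for nine patients.
fillings = [0, 1, 1, 2, 2, 2, 3, 4, 12]

mode_value = statistics.mode(fillings)      # value that occurs most often
mean_value = statistics.mean(fillings)      # sum of values / number of values
median_value = statistics.median(fillings)  # middle value of the sorted list

# With an even number of values, the median is the mean of the two middle ones.
even_median = statistics.median([1, 2, 3, 4])

print(mode_value, mean_value, median_value, even_median)
```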

9

When is the mode used?

You should use the mode if the data is qualitative (colour etc.) or if quantitative (numbers) with a clearly defined mode (or bi-modal). It is not much use if the distribution is fairly even.

10

When is the mean used?

This is for quantitative data (numbers), and uses every piece of data. Because every value contributes, it is pulled towards extreme values, so it should only be used if the data is fairly symmetrical (not skewed).

11

When is the median used?

You should use this for quantitative data (numbers), when the data is skewed, i.e. when the median, mean and mode are probably not equal, and when there might be extreme values (outliers).
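
The sketch below illustrates why the median is preferred when there are outliers: adding one extreme value moves the mean a long way but barely changes the median. The lengths of stay are made-up numbers, not real data.

```python
import statistics

# Hypothetical lengths of hospital stay in days.
stays = [2, 3, 3, 4, 5]
# The same data with one extreme value (outlier) added.
stays_with_outlier = stays + [60]

mean_before = statistics.mean(stays)
mean_after = statistics.mean(stays_with_outlier)
median_before = statistics.median(stays)
median_after = statistics.median(stays_with_outlier)

print(f"mean:   {mean_before:.1f} -> {mean_after:.1f}")
print(f"median: {median_before:.1f} -> {median_after:.1f}")
```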

12

What is the range?

The range is the largest number minus the smallest (including outliers).
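
A short sketch of the calculation, using hypothetical values, shows how strongly a single outlier affects the range.

```python
# Hypothetical lengths of hospital stay in days; 60 is an outlier.
stays = [2, 3, 3, 4, 5, 60]

# Range: largest value minus smallest value (outliers included).
data_range = max(stays) - min(stays)

# For comparison only: the range if the outlier were left out.
range_without_outlier = max(stays[:-1]) - min(stays[:-1])

print(data_range, range_without_outlier)
```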

13

What's an outlier?

An extreme value that lies far from the rest of the data

14

Ordinal variables

An ordinal variable is a categorical variable for which the possible values are ordered.

15

Nominal variables

Nominal variables have two or more categories without having any kind of natural order. They are variables with no numeric value, such as occupation or political party affiliation.

16

Binary variables

Binary variables are variables which only take two values.

17

What are the various sources of data?

Routinely collected data:
- Mortality and census data
- Hospital activity data
- Primary care data
- Infectious disease notifications
- Regular national surveys (e.g. Health Survey for England)
Research study data

18

Continuous variables

Have numerical values
Measurements are on a continuous scale i.e. can take an infinite number of distinct values

19

Discrete variables

Also known as count variables
Have numerical values, but they must be integers, e.g. number of fillings

20

What are the appropriate graphical presentations to use for 1 categorical variable?

Bar chart
Pie chart
Frequency table

21

What are the appropriate graphical presentations to use for 1 continuous variable?

Histogram
Box and whisker plot

22

What is the appropriate graphical presentation to use for a categorical outcome and categorical exposure?

Contingency table
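
As an illustrative sketch, a 2x2 contingency table can be built by counting each exposure/outcome pair. The records, category names and counts below are hypothetical.

```python
from collections import Counter

# Hypothetical records: (exposure, outcome) for eight patients.
records = [
    ("smoker", "disease"), ("smoker", "disease"), ("smoker", "no disease"),
    ("non-smoker", "disease"), ("non-smoker", "no disease"),
    ("non-smoker", "no disease"), ("non-smoker", "no disease"),
    ("smoker", "disease"),
]

counts = Counter(records)
exposures = ["smoker", "non-smoker"]   # rows
outcomes = ["disease", "no disease"]   # columns

# One row per exposure category, one column per outcome category.
table = [[counts[(e, o)] for o in outcomes] for e in exposures]

print(f"{'':12}{outcomes[0]:>12}{outcomes[1]:>12}")
for exposure, row in zip(exposures, table):
    print(f"{exposure:12}{row[0]:>12}{row[1]:>12}")
```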

23

What is the appropriate graphical presentation to use for a numerical outcome and categorical exposure?

Box and whisker plot

24

What is the appropriate graphical presentation to use for a numerical outcome and numerical exposure?

Scatter plot

25

What are other terms for 'exposure'?

Explanatory variable
Independent variable
X variable
Risk factor
Treatment group

26

What are other terms for 'outcome'?

Response variable
Dependent variable
Y variable
Case/control group
Disease group

27

Bar charts

heights of the bars are proportional to the frequencies
useful for comparing the frequencies in each category relative to the others

28

Pie charts

areas of the sectors are proportional to the frequencies
useful for comparing the frequencies in each category with the whole group

29

Normal distribution

A symmetric, bell-shaped probability distribution in which the mean, median, and mode are equal (in sample data they are roughly equal)
It describes data that is symmetric around a mean

30

Describe positive skewness

Mode < median < mean
Also known as a right-skewed distribution because it has a long right tail.

31

Describe negative skewness

Mode > median > mean
Also known as a left-skewed distribution because it has a long left tail
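
The orderings for positive and negative skew can be checked on a small example. This is a sketch with a hypothetical sample; the left-skewed data is simply the mirror image of the right-skewed data.

```python
import statistics

def averages(values):
    """Return (mode, median, mean) for a list of numbers."""
    return (
        statistics.mode(values),
        statistics.median(values),
        statistics.mean(values),
    )

# Hypothetical right-skewed sample: mostly small values, long right tail.
right_skewed = [1, 1, 1, 2, 2, 3, 4, 6, 16]
# Mirror image of the same sample, giving a long left tail.
left_skewed = [17 - x for x in right_skewed]

print(averages(right_skewed))  # mode < median < mean
print(averages(left_skewed))   # mode > median > mean
```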

32

Define the term 'standard deviation'

Measure of spread of observations around the mean
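
A minimal sketch of the calculation with hypothetical values. Python's `statistics` module provides both the population form (`pstdev`, dividing by n) and the sample form (`stdev`, dividing by n - 1).

```python
import statistics

# Hypothetical sample of eight measurements.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean_value = statistics.mean(values)
# Population SD: square root of the mean squared deviation from the mean.
population_sd = statistics.pstdev(values)
# Sample SD: divides by n - 1 instead of n, the usual choice for a sample.
sample_sd = statistics.stdev(values)

print(mean_value, population_sd, round(sample_sd, 3))
```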

33

Define the term 'interquartile range'

Range from first (25%) to third (75%) quartiles of a distribution
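
As a sketch, the quartiles can be obtained with `statistics.quantiles`. The values are hypothetical, and note that different quartile conventions (and different software) can give slightly different results on small samples.

```python
import statistics

# Hypothetical weights in kg for eleven patients.
weights = [48, 52, 55, 58, 60, 62, 65, 70, 74, 80, 95]

# n=4 splits the data into quarters and returns the three cut points.
q1, q2, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1  # interquartile range: third quartile minus first quartile

print(q1, q2, q3, iqr)
```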

34

What is a distribution?

• describes the frequency (or probability) of occurrence for a given value
• describes the shape of the data

35

What can we do with a distribution?

- make inferences about a wider population
- generate confidence intervals (assessing variability of estimates)
- test hypotheses
- calculate sample size

36

Define skewness

A measure of the asymmetry of the distribution

37

What is a null hypothesis?

A hypothesis saying that the outcome is not associated with the exposure

38

What is an alternative hypothesis?

A hypothesis saying that the outcome is associated with the exposure

39

Why use statistical tests?

We use statistical tests to help us judge if our observed effect size is due to chance or if it is real.

40

What is the significance level?

• The probability of finding an effect that does NOT actually exist, i.e. of rejecting the NULL hypothesis when it is true
• Strength of evidence needed to reject the NULL hypothesis
• Normally set to 5%

41

Define the term 'inferential statistics'

Inferential statistics allows you to make predictions (“inferences”) from data. With inferential statistics, you take data from samples and make generalisations about a population.

42

What is meant by standard error?

• Standard Error is an inferential statistic.
• It is an estimate of how variable a statistic would be if we repeated our study numerous times.
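
One common case is the standard error of the mean, which is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below uses hypothetical values.

```python
import statistics
from math import sqrt

# Hypothetical sample of eight measurements.
values = [2, 4, 4, 4, 5, 5, 7, 9]

sample_sd = statistics.stdev(values)  # sample SD (divides by n - 1)
n = len(values)

# Standard error of the mean: how much the sample mean would be
# expected to vary if the study were repeated many times.
standard_error = sample_sd / sqrt(n)

print(round(standard_error, 3))
```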

43

What are p-values?

A p-value is the probability of observing an effect size at least as large as the one we did if the null hypothesis is true, i.e. if the true effect size is zero

44

What do p-values tell you?

P-values tell us the strength of the evidence against the null hypothesis that there is no association.
As the p-value decreases the evidence against the null hypothesis increases.
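
To make this concrete, here is a sketch of an exact binomial test on a hypothetical study. The scenario and numbers are invented for illustration, and other tests would be used for other kinds of data.

```python
from math import comb

# Hypothetical study: 20 patients each state a preference for treatment A or B.
# Null hypothesis: no preference, so each patient picks A with probability 0.5.
n_patients = 20
observed_a = 15

def prob_exactly(k, n, p=0.5):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# One-sided p-value: probability of 15 or more choosing A if the null is true.
p_one_sided = sum(
    prob_exactly(k, n_patients) for k in range(observed_a, n_patients + 1)
)
# Two-sided p-value: results at least as extreme in either direction
# (doubling is valid here because the null distribution is symmetric).
p_two_sided = 2 * p_one_sided

print(round(p_one_sided, 4), round(p_two_sided, 4))
```

A result further from the 10 expected under the null (say 18 of 20) would give a smaller p-value, meaning stronger evidence against the null hypothesis.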

45

What do confidence intervals tell you?

The confidence interval shows the range of values in which the true effect size is likely to lie.
A 95% confidence interval means that if the experiment were replicated many times, about 95% of the intervals calculated would contain the true value.
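
The replicate-experiment interpretation can be illustrated by simulation. This is a sketch under simplifying assumptions: the population is normal, its mean and standard deviation are invented, and the standard deviation is treated as known so that the interval is mean ± 1.96 standard errors.

```python
import random
import statistics
from math import sqrt

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_MEAN = 120.0   # assumed true population mean (e.g. systolic BP in mmHg)
KNOWN_SD = 15.0     # assumed known population SD, to keep the interval simple
SAMPLE_SIZE = 30
REPLICATES = 2000

covered = 0
for _ in range(REPLICATES):
    sample = [random.gauss(TRUE_MEAN, KNOWN_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = statistics.fmean(sample)
    half_width = 1.96 * KNOWN_SD / sqrt(SAMPLE_SIZE)  # 95% interval half-width
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

# Proportion of replicate intervals that contain the true mean.
coverage = covered / REPLICATES
print(f"Intervals containing the true mean: {coverage:.1%}")
```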

46

How can the concept of disease be influenced?

• Evidence of symptoms
• Technological and medical development
• Sociocultural environment

47

Define the term 'abnormality'

Different from what is usual or average, especially in a way that is bad

48

List the three types of abnormality

• Abnormal if unusual
• Abnormal if associated with clinical abnormality
• Abnormal if increased risk of future disease

49

Abnormal if unusual

It is common in laboratory testing to define normal as the range which includes 95% of values found in healthy subjects. This means abnormal is the top and bottom 2.5% of values found in healthy subjects.

50

What's wrong with defining abnormal as unusual?

By definition 5% of healthy people will have “abnormal” i.e. “unusual” values
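
A small simulation sketches this point. The test results are randomly generated for hypothetical healthy subjects; the 95% range is taken from the 2.5th to the 97.5th percentile of those values.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical test results for 1,000 healthy subjects (arbitrary units).
healthy = [random.gauss(5.0, 0.5) for _ in range(1000)]

# Cutting the data into 40 equal groups puts the first and last cut
# points at the 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(healthy, n=40)
lower, upper = cuts[0], cuts[-1]

# Healthy subjects whose value falls outside the "normal" range.
outside = sum(1 for x in healthy if x < lower or x > upper)
fraction_outside = outside / len(healthy)

print(f"Reference range: {lower:.2f} to {upper:.2f}")
print(f"Healthy subjects labelled abnormal: {fraction_outside:.1%}")
```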

51

Abnormal if associated with clinical abnormality

It is more logical to label values of a test as abnormal if these values are clearly associated with the presence of a disease state.

52

What's wrong with defining abnormal as being associated with clinical abnormality?

There's almost always overlap between values in diseased subjects and those in healthy subjects

53

Abnormal if increased risk of future disease

A biochemical measure in an asymptomatic individual may be associated with future disease in a causal way