Choosing statistics Flashcards

E-modules 2018/19 (39 cards)

1
Q

What is needed to test the hypothesis?

A

Choice of statistical test

Patient population/study sample selected allows for comparison (i.e. inclusion/exclusion criteria)

Patient outcome measures (i.e. variables)

2
Q

When the hypothesis proposes a correlation, what are the possible stats tests based on the variables?

A

Discrete
- Chi-Square

Continuous

  • Pearson (normally distributed)
  • Spearman rank (not normally distributed)
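As a rough illustration of the Pearson/Spearman split, the sketch below computes both coefficients using only the Python standard library. The helper names (pearson_r, average_ranks, spearman_rho) and the sample values are made up for this example; Spearman is treated here as Pearson's r applied to ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Hypothetical data: y = x squared, so monotonic but not linear.
dose = [1, 2, 3, 4, 5, 6]
response = [1, 4, 9, 16, 25, 36]
r = pearson_r(dose, response)
rho = spearman_rho(dose, response)
```

Because the relationship is perfectly monotonic, rho comes out at 1, while r stays a little below 1 because the points do not lie on a straight line.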
3
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for discrete data?

A

Chi-Square

4
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for continuous, normally distributed data based on number of groups?

A

> 2 groups
- ANOVA (one variable)

2 groups

  • paired t-test
  • independent t-test
5
Q

When the hypothesis proposes a comparison between groups, what stats test do you use for continuous, NOT normally distributed data based on number of groups?

A

> 2 groups
- Kruskal Wallis

2 groups

  • Wilcoxon (paired)
  • Mann Whitney (independent)
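The choices on cards 2 to 5 can be read as one small decision tree. The function below is only a sketch of that tree; the name choose_test and its parameters are invented for illustration, and it encodes nothing beyond what the cards state.

```python
def choose_test(hypothesis, data_type, normal=None, groups=None, paired=None):
    """Return the test these cards suggest for a given study design.

    hypothesis: "correlation" or "comparison"
    data_type:  "discrete" or "continuous"
    normal:     True/False, only needed for continuous data
    groups:     number of groups, only needed for comparisons
    paired:     True/False, only needed when comparing exactly 2 groups
    """
    if hypothesis not in ("correlation", "comparison"):
        raise ValueError("hypothesis must be 'correlation' or 'comparison'")
    if data_type == "discrete":
        # Discrete data use chi-square for both kinds of hypothesis.
        return "Chi-square"
    if hypothesis == "correlation":
        return "Pearson" if normal else "Spearman rank"
    if groups is None or groups < 2:
        raise ValueError("a comparison needs at least 2 groups")
    if groups > 2:
        return "ANOVA" if normal else "Kruskal-Wallis"
    if normal:
        return "Paired t-test" if paired else "Independent t-test"
    return "Wilcoxon" if paired else "Mann-Whitney"

example = choose_test("comparison", "continuous", normal=False, groups=2, paired=True)
```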
6
Q

Which statistical analysis tests for differences?

A
Chi-square
ANOVA
T-tests
Kruskal-Wallis
Wilcoxon
Mann-Whitney U-Test

*hypothesis proposes a comparison between groups

7
Q

Which statistical analysis tests for similarities?

A

Chi-Square
Pearson
Spearman rank

*hypothesis proposes a correlation

8
Q

What is quantitative data?

A

Numerical information about quantities

  • MEASURED: information that can be measured and has continuous dimensions (height, temperature, BP)
  • COUNTED: information that can be counted but is not continuous (no. of children in family, no. of patients in clinic)
9
Q

What is qualitative data?

A

Information about qualities that can't actually be measured

Deals with descriptive information, such as free-text comments on open-ended questions or responses in an interview

10
Q

What is categorical data?

A

In-between quantitative and qualitative

  • ORDINAL aspects can be easily converted into numerical data (e.g. a happiness scale can be given in numbers instead of words)
  • NOMINAL aspects consist of individual terms rather than sentences like in qualitative data
11
Q

Broadly compare quantitative, qualitative, and categorical data

A

Quantitative = when you measure something and give it a number value

Categorical = when you classify something

Qualitative = when you judge something

12
Q

Compare discrete and continuous data

A

Discrete data: counted

  • cannot be made more precise
  • e.g. number of children

Continuous data: measured

  • can be divided and reduced to finer and finer levels
  • e.g. height of a person
13
Q

Compare nominal and ordinal data

A

Nominal = items that are assigned to individual named categories that do not have an implicit or natural value or rank
e.g. gender, fracture incidence

Ordinal = items that are assigned to categories that do have some kind of implicit or natural order
e.g. patient characteristics such as stage of hypertension, pain level, and satisfaction

14
Q

Broadly describe the mean and standard deviation

A

Mean is an average of the data

Standard deviation describes the width (spread) of the data around the mean

15
Q

What is normality?

A

Describes whether data follow a normal (symmetrical, bell-shaped) distribution

It is used to decide how to describe the central tendency and dispersion of large data-sets

16
Q

Describe the relative mean, median, and mode for the following skews:

a) positive skew
b) symmetrical distribution
c) negative skew

A

a) mean > median > mode
b) mean = median = mode
c) mean < median < mode

  • positive: >
  • negative <
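As a quick check of the skew ordering, the sketch below uses Python's statistics module on a made-up right-skewed (positively skewed) sample and on its mirror image. The long tail pulls the mean furthest, then the median, while the mode stays at the peak. The data and variable names are illustrative only.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: mostly short stays, a few long ones.
length_of_stay_days = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

pos_mean = mean(length_of_stay_days)
pos_median = median(length_of_stay_days)
pos_mode = mode(length_of_stay_days)

# Mirroring the values flips the tail to the left (negative skew).
mirrored = [20 - v for v in length_of_stay_days]
neg_mean = mean(mirrored)
neg_median = median(mirrored)
neg_mode = mode(mirrored)

print(pos_mean, pos_median, pos_mode)
print(neg_mean, neg_median, neg_mode)
```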
17
Q

What is kurtosis?

A

Describes data that are heavy-tailed or light-tailed relative to a normal distribution

18
Q

Compare high and low kurtosis

A

High kurtosis
- tend to have heavy tails, or outliers that create a very wide distribution

Low kurtosis
- tend to have light tails, or lack of outliers that create a very narrow distribution
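To make the heavy-tail versus light-tail contrast concrete, here is a minimal sketch of excess kurtosis (kurtosis minus 3, so a normal distribution scores about 0). It uses the simple population-moment formula; statistics packages often apply a sample-size correction, so their numbers can differ. The function name and both data-sets are invented for illustration.

```python
def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n
    m4 = sum((v - centre) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3

# Evenly spread values with no outliers: light tails.
light_tailed = list(range(1, 10))

# Mostly identical values plus two extreme outliers: heavy tails.
heavy_tailed = [5, 5, 5, 5, 5, 5, 5, -5, 15]

low = excess_kurtosis(light_tailed)
high = excess_kurtosis(heavy_tailed)
```

Here low is negative (lighter tails than a normal distribution) and high is positive (heavier tails).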

19
Q

What statistical tests are used to test for normality?

A

Shapiro-Wilk test
- small sample size (n<50)

Kolmogorov-Smirnov
- large sample size (n>50)

20
Q

What are descriptive statistics?

A

Summary measures such as the mean, mode or median

- used to summarise large data-sets in a tangible format

21
Q

Compare range, variance and standard deviation

A

Range
- difference between the largest and smallest values
Variance
- average of the squared distances of the values from the mean
Standard deviation
- square root of the variance; measures the spread of the data around the mean in the original units
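A short sketch of the three measures of spread, using Python's statistics module on made-up blood pressure readings. Note that variance and stdev here are the sample versions (dividing by n - 1); pvariance and pstdev would give the population versions.

```python
from math import sqrt
from statistics import stdev, variance

# Hypothetical systolic blood pressure readings in mmHg.
systolic_bp = [110, 118, 120, 122, 130]

data_range = max(systolic_bp) - min(systolic_bp)  # largest minus smallest
sample_variance = variance(systolic_bp)           # mean squared deviation, n - 1
sample_sd = stdev(systolic_bp)                    # square root of the variance

print(data_range, sample_variance, round(sample_sd, 2))
```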

22
Q

Compare IQR, standard error of mean, and confidence intervals

A

Interquartile range
- upper quartile (UQ) - lower quartile (LQ)
Standard error of mean
- measures how well the sample mean approximates the population mean
Confidence intervals
- range of values within which the true mean is likely to lie, at a stated confidence level (e.g. 95%)
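The three quantities can be sketched with the standard library as follows. Two assumptions to flag: quantiles uses Python's default "exclusive" method (other software may place the quartiles slightly differently), and the 95% interval uses a normal approximation (z of about 1.96), whereas a t critical value is more usual for a sample this small. The heights are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, quantiles, stdev

# Hypothetical heights in cm.
heights_cm = [158, 162, 165, 167, 170, 171, 174, 176, 180, 187]

# Interquartile range: upper quartile minus lower quartile.
lower_q, _median, upper_q = quantiles(heights_cm, n=4)
iqr = upper_q - lower_q

# Standard error of the mean: sample SD divided by sqrt(n).
sem = stdev(heights_cm) / sqrt(len(heights_cm))

# Approximate 95% confidence interval for the mean (normal approximation).
z = NormalDist().inv_cdf(0.975)
ci_low = mean(heights_cm) - z * sem
ci_high = mean(heights_cm) + z * sem
```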

23
Q

How do you determine whether groups are paired or independent?

A

Look at whether each group is composed of the same subjects of interest or if they are different

Paired = two data-sets come from the same individual
- measure same variable in same subject at different time points (longitudinal study)

Independent = two data-sets from different individuals
- comparing two groups with no common factors (cross-sectional study)

24
Q

Compare when to use parametric and non-parametric statistics

A

Parametric = data are normally distributed

Non-parametric = data are not normally distributed

25
Q

Name parametric tests

A

Paired/independent t-tests
ANOVA

26
Q

Name non-parametric tests

A

Wilcoxon Signed Rank
Mann-Whitney U
Friedman (non-parametric equivalent of repeated measures one-way ANOVA)
Kruskal-Wallis

27
Q

When would you use the different t-tests?

A

Paired: the same variable is compared between two measurements taken from the same sample

Independent: the same variable is compared between two different samples

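To show how the two t-tests differ in practice, the sketch below computes only the t statistic and degrees of freedom for each (turning t into a p-value needs a t distribution, which the standard library does not provide). The function names and readings are hypothetical, and the independent version assumes equal variances (Student's pooled form).

```python
from math import sqrt
from statistics import mean, stdev, variance

def paired_t(before, after):
    """t statistic on the per-subject differences (after minus before)."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

def independent_t(group_a, group_b):
    """Student's t for two separate groups, pooling their variances."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

# Hypothetical blood pressure in the same five patients, before and after treatment.
before = [140, 150, 138, 160, 155]
after = [135, 142, 136, 150, 149]

t_paired, df_paired = paired_t(before, after)

# For contrast only: the same numbers treated as two unrelated groups.
t_indep, df_indep = independent_t(before, after)
```

With these numbers the paired statistic is much larger in magnitude, because pairing removes the variation between patients; the independent form is shown only for contrast and would not be the right test for repeated measurements.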
28
Q

What does a one-way ANOVA tell you?

A

Used to compare the means from more than two samples with a normal distribution; it will only tell you whether a difference exists between your samples

Further stat tests (e.g. a post hoc test) are needed to work out exactly where the difference is

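A minimal sketch of the F statistic behind a one-way ANOVA: variation between the group means divided by variation within the groups. The function name and the three small groups are made up, and the sketch stops at F and its degrees of freedom rather than a p-value.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: how far each value is from its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three treatment groups.
f_value, df_b, df_w = one_way_anova_f([4, 5, 6], [6, 7, 8], [9, 10, 11])
```

A large F only says that at least one group mean differs; it does not say which one.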
29
Q

What can the Pearson correlation coefficient tell you about correlation?

A

How strong the (linear) relationship is

Varies from -1 to +1 (from perfect negative to perfect positive correlation)

30
Q

Approximately what are the r-values for the following correlations: a) very low, b) low, c) reasonable, d) high, e) very high?

A

a) 0.0-0.2
b) 0.2-0.4
c) 0.4-0.6
d) 0.6-0.8
e) 0.8-1.0

*can be +/-

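The bands can be applied mechanically once r is known. The sketch below computes Pearson's r by hand and maps its absolute value onto the labels above. The function names and data are invented, and because the bands share their boundary values, this sketch arbitrarily assigns a boundary value (e.g. exactly 0.2) to the higher band.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def describe_strength(r):
    """Label the size of r, ignoring its sign."""
    size = abs(r)
    if size < 0.2:
        return "very low"
    if size < 0.4:
        return "low"
    if size < 0.6:
        return "reasonable"
    if size < 0.8:
        return "high"
    return "very high"

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
label = describe_strength(r)
```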
31
Q

What is the r^2-value from a Pearson correlation?

A

Represents how closely your data fit the correlation line: the proportion of the variation in one variable that is explained by the other

The higher the value (the closer to 1), the more reliable your conclusion can be

32
Q

Compare correlation and regression

A

Correlation = indicates the strength of the relationship between two variables

Regression = quantifies the association between the two variables, i.e. tells us the impact that changing one variable will have on the other variable

33
Q

How is regression defined?

A

y = a + bx

a = the y-axis intercept value
b = the gradient of the line, i.e. the regression coefficient

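As a worked example of y = a + bx, the sketch below fits a and b by ordinary least squares (one common way to choose the line; the card itself does not name a fitting method) and then computes r^2 as the share of variation in y that the line explains. Function names and data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept a and gradient b for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx        # gradient (regression coefficient)
    a = my - b * mx      # y-axis intercept
    return a, b

def r_squared(x, y, a, b):
    """1 minus (residual variation / total variation in y)."""
    my = sum(y) / len(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)
fit = r_squared(x, y, a, b)
predicted_at_6 = a + b * 6
```

For these numbers the fitted line is roughly y = 2.2 + 0.6x, so each one-unit increase in x is associated with an increase of about 0.6 in y.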
34
Q

What does the chi-square test measure?

A

It is a measure of the differences between observed and expected frequencies

Represented as χ² (written X^2 in these cards)

35
Q

What does it mean when X^2 = 0?

A

The observed and expected frequencies are the same

36
Q

What does a higher X^2 value mean?

A

The bigger the X^2 value, the bigger the difference between the observed and expected frequencies

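The statistic itself is the sum over categories of (observed - expected)^2 / expected. A minimal sketch with made-up counts shows both behaviours: it is 0 when the frequencies match and grows as they diverge. The function name is illustrative.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts in two categories, expecting an even 25/25 split.
expected = [25, 25]
identical = chi_square([25, 25], expected)
small_gap = chi_square([30, 20], expected)
large_gap = chi_square([40, 10], expected)

print(identical, small_gap, large_gap)
```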
37
Q

How can the size of a study affect the p-value?

A

Very small studies with few samples might not return a reliable p-value

Very large studies with many samples might be overpowered and flag a very small difference as statistically significant even when it has no practical importance

38
Q

What is a type I error?

A

Incorrectly rejecting the null hypothesis when it is true (its probability is the significance level, α)

False positive

39
Q

What is a type II error?

A

Incorrectly failing to reject the null hypothesis when it is false

False negative

*the greater the power of the test, the lower the type II error rate (power = 1 - β, where β is the type II error rate; the closer the power is to 1, the better the test is at detecting a false null hypothesis)
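
The link between sample size, α and power can be sketched for the simplest case, a one-sided z-test with a known standard deviation. This is an assumption chosen to keep the arithmetic within the standard library, not a design the cards prescribe; the function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a one-sided z-test to detect a true mean shift of `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # rejection threshold under the null
    shift = effect * sqrt(n) / sd            # true effect in standard-error units
    return 1 - std_normal.cdf(z_crit - shift)

# With no true effect, the rejection rate equals alpha (type I error rate).
false_positive_rate = z_test_power(effect=0, sd=10, n=25)

# With a true effect, power rises with sample size, so beta = 1 - power falls.
power_small = z_test_power(effect=5, sd=10, n=10)
power_large = z_test_power(effect=5, sd=10, n=50)
beta_small = 1 - power_small
beta_large = 1 - power_large
```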