Flashcards in E-module 2 - Choosing statistics Deck (28)

1

## What are the 2 types of analysis of data?

###
Correlation

Comparison

2

## Definitions of correlation and comparison as types of data analysis.

###
Correlation

- hypothesis tests to evaluate relationship between variables

Comparison

- hypothesis tests to evaluate differences between groups/populations

3

## What are the 2 types and 4 subtypes of data?

###
Types

- Quantitative - numeric information

- Qualitative/Categorical - non-numeric information that describes a quality or category rather than a measured amount

Subtypes

- Quantitative gives rise to CONTINUOUS and DISCRETE (counted) data

- Qualitative/categorical gives rise to NOMINAL (unordered) and ORDINAL (ordered) data

4

## Which subtypes of data are PARAMETRIC and NON-PARAMETRIC?

###
PARAMETRIC

- continuous quantitative

NON-PARAMETRIC

- discrete quantitative

- nominal categorical

- ordinal categorical

5

## What is the principle segregating continuous and discrete data?

###
Continuous data can (potentially) be subdivided infinitely, whereas discrete data cannot

- e.g. age is continuous if measured exactly in months, days, hours etc, but discrete if measured in years (overlap here within a category between continuous and discrete data)

6

## What is important to remember about continuous vs discrete data?

###
Means and rates are always continuous data. They were most likely calculated from discrete data, BUT they themselves are continuous

e.g. heart rate is continuous but number of heartbeats in a minute is discrete

- this is because continuous data can take ANY value (e.g. 2, 2.5, 3) but discrete data cannot take certain values (e.g. 2, 3 but NOT 2.5)

7

## When do you check for normality in data?

### When you have CONTINUOUS data - discrete and qualitative data is ALWAYS NON-PARAMETRIC

8

## What does normality measure?

### Whether the central tendency and dispersion of the data follow a normal (Gaussian) distribution

9

## What are the 2 tests used for testing normality and their conditions?

###
Shapiro-Wilk test - n<50

Kolmogorov-Smirnov test - n>50

10

## When can the data be considered normal or not normal?

###
Use the p-value of the normality test

if p<0.05 the data is NOT normal; otherwise there is no evidence against normality and the data is treated as normal
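
The two cards above (which test to use, and how to read its p-value) can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical, and the cards do not say what to do at exactly n = 50, so sending that case to Kolmogorov-Smirnov is an assumption.

```python
def choose_normality_test(n: int) -> str:
    """Pick a normality test from the sample size, following the n<50 / n>50 rule."""
    # Assumption: n == 50 is not covered by the cards; it goes to Kolmogorov-Smirnov here.
    return "Shapiro-Wilk" if n < 50 else "Kolmogorov-Smirnov"


def treated_as_normal(p_value: float, alpha: float = 0.05) -> bool:
    """Apply the p < 0.05 rule: a small p-value is evidence against normality."""
    return p_value >= alpha


print(choose_normality_test(30))   # Shapiro-Wilk
print(choose_normality_test(200))  # Kolmogorov-Smirnov
print(treated_as_normal(0.20))     # True
print(treated_as_normal(0.01))     # False
```

The p-value itself would come from a statistics package; it is passed in here so the sketch stays self-contained.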

11

## What are the 3 outcomes and subsequent distributions of data after normality testing?

###
YES - Gaussian/Normal distribution

NO - Skewed distribution

NO - Kurtosis

12

## What are the 2 main features of normal distributions?

###
- 68-95-99.7 rule - about 68% (roughly 2/3rds) of data lies within 1 SD of the mean, 95% within 2 SDs, 99.7% within 3 SDs

AND

- Distribution is symmetrical
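
The percentages in the 68-95-99.7 rule can be checked against the cumulative distribution function of a standard normal distribution. A minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.1%}")
# within 1 SD: 68.3%
# within 2 SD: 95.4%
# within 3 SD: 99.7%
```

Because the distribution is symmetrical, the share lying beyond k SDs is split equally between the two tails.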

13

## What are the features of skewed distributions?

###
ASYMMETRICAL

- mean, median and mode are all separated (they usually sit together in a normal dist.): mode at the top of the curve, median just downslope from the top, mean further downslope from the median

- skew is named according to which direction has the long tail e.g. right/positive skew = long positive/right tail and vice versa

- uneven tails with many data points at high/low end of range

14

## What are the features of Kurtosis?

###
Kurtosis is where data is heavy or light-tailed with respect to a normal distribution

- heavy-tailed = outliers create a wide distribution (graph is flattened)

- light-tailed = lack of outliers creates a narrow distribution (graph is steepened)
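
Skew and kurtosis can both be put into numbers using central moments. The sketch below uses the simple moment-based (population) formulas, which is one common convention and not something the cards specify; the function names and the example data are made up for illustration.

```python
def central_moment(values, order):
    """Average of (value - mean) ** order over all values."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Positive = long right tail, negative = long left tail, 0 = symmetrical."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Kurtosis relative to a normal distribution (which scores 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print(skewness(symmetric))         # 0.0
print(skewness(right_skewed) > 0)  # True
print(excess_kurtosis(symmetric))  # about -1.3, i.e. lighter tails than a normal distribution
```

Positive excess kurtosis corresponds to the heavy-tailed case and negative excess kurtosis to the light-tailed case.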

15

## Definition of paired/dependent data/groups and an example of a study employing this?

###
Paired/dependent = when two (or more) sets of data have come from the same individual e.g. same subject at different points of the day

- longitudinal study

16

## Definition of unpaired/independent data/groups and an example of a study employing this?

###
Unpaired/independent = comparing data from two groups with no common factors (two independent groups)

- cross-sectional study

17

## Which statistical tests will you use for parametric data and what are their individual constraints?

###
Paired t-test = parametric, 2 groups, paired

Unpaired t-test = parametric, 2 groups, independent

Repeated measures one-way ANOVA = parametric, 3 groups or more, paired

One-way ANOVA = parametric, 3 groups or more, unpaired

18

## Which statistical tests will you use for non-parametric data?

###
Wilcoxon (signed rank) test = non-parametric, 2 groups, paired

Mann-Whitney U test = non-parametric, 2 groups, unpaired

Kruskal-Wallis test = non-parametric, 3 or more groups, unpaired

(Friedman test = non-parametric, 3 or more groups, paired)

- medlearn tree only has K-W test for 3 or more groups
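
The parametric and non-parametric cards together form a small decision tree, which can be sketched as a lookup function. The function name and argument names are hypothetical. The sketch follows the standard assignment for 3 or more non-parametric groups: Kruskal-Wallis for independent groups and Friedman for paired (repeated) measurements.

```python
def choose_comparison_test(parametric: bool, n_groups: int, paired: bool) -> str:
    """Return the name of the comparison test for the given study design."""
    if n_groups < 2:
        raise ValueError("a comparison needs at least 2 groups")
    if parametric:
        if n_groups == 2:
            return "paired t-test" if paired else "unpaired t-test"
        return "repeated measures one-way ANOVA" if paired else "one-way ANOVA"
    # Non-parametric branch
    if n_groups == 2:
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
    return "Friedman test" if paired else "Kruskal-Wallis test"


print(choose_comparison_test(parametric=True, n_groups=2, paired=True))    # paired t-test
print(choose_comparison_test(parametric=False, n_groups=2, paired=False))  # Mann-Whitney U test
print(choose_comparison_test(parametric=False, n_groups=4, paired=False))  # Kruskal-Wallis test
```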

19

## When testing for correlation, which statistical tests would you use and when?

###
Pearson's test - data is continuous and follows a normal distribution

Spearman's rank test - data is continuous but does not follow a normal distribution

Chi-squared test - data is discrete (categorical counts)

20

## What is the range of resulting values from Pearson's/Spearman's rank tests and what symbol indicates them?

###
Pearson's - r - from -1 to 1, perfect negative to perfect positive correlation (strong and weak in between)

Spearman's rank - ρ (rho, looks like p) - from -1 to 1, perfect negative to perfect positive correlation (strong and weak in between)
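
To see how the two coefficients behave, here is a hand-rolled sketch of both. Spearman's coefficient is computed as Pearson's r applied to the ranks of the data, with tied values sharing their average rank. The function names and the example data are made up for illustration; in practice a statistics package would normally be used.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r: covariance scaled by the spread of each variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # y rises with x, but not in a straight line

print(round(pearson_r(x, y), 3))  # 0.981
print(spearman_rho(x, y))         # 1.0
```

The relationship is perfectly monotonic, so the rank-based coefficient reaches 1, while Pearson's r falls slightly short because the relationship is not linear.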

21

## Which type of data, parametric or non-parametric, is better to use and why?

###
PARAMETRIC

- easier to understand

- more powerful: when its assumptions are met, a parametric test is LESS LIKELY TO miss a real effect (i.e. fail to reject a false null hypothesis)

22

## What are descriptive statistics?

###
Descriptive statistics are used to summarise large data-sets in a manageable format

Raw data is usually presented in the form of descriptive statistics e.g. provide mean +/- SD/SEM of a collection of data points

23

## What are measures of central tendency?

### Mean, mode, median

24

## What are measures of data dispersion?

### Variance, standard deviation (SD), standard error/standard error of the mean (SE/SEM)

25

## How do you calculate variance and standard deviation of a sample and a population?

###
Variance - sum the squared differences between each data value and the mean, then divide by the number of values

- for a population, divide by n; for a sample, divide by n-1

Standard deviation - for both, this is the square root of the variance so calculate it as such
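
The calculation above can be followed step by step in code. A minimal sketch; the function names and the example data are made up for illustration.

```python
from math import sqrt


def variance(values, sample: bool) -> float:
    """Sum of squared differences from the mean, divided by n (population) or n-1 (sample)."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = sum((v - mean) ** 2 for v in values)
    return squared_diffs / (n - 1 if sample else n)


def standard_deviation(values, sample: bool) -> float:
    """Square root of the matching variance."""
    return sqrt(variance(values, sample))


data = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5, squared differences sum to 32

print(variance(data, sample=False))            # 4.0
print(standard_deviation(data, sample=False))  # 2.0
print(round(variance(data, sample=True), 3))   # 4.571
```

Python's standard library offers the same calculations as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev` and `statistics.stdev`.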

26

## How do you calculate the standard error of the mean?

### This is standard deviation divided by square root (n) (square root of the number of values)
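
As a short sketch of this formula: the function name and data are made up for illustration, and it assumes the sample standard deviation (n-1 on the bottom) is the one intended, which the card does not state.

```python
from math import sqrt
from statistics import stdev  # sample standard deviation (divides by n-1)


def standard_error_of_mean(values) -> float:
    """SEM = SD / sqrt(n). Assumption: SD here is the sample SD."""
    return stdev(values) / sqrt(len(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error_of_mean(data), 3))  # 0.756
```

Because n sits on the bottom of the fraction, the SEM shrinks as more values are collected, even when the SD stays the same.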

27

## Which measure of central tendency should you use in cases where the data is normally distributed and not normally distributed?

###
Normally distributed = mean

Not normally distributed = median
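
A small illustration of why the median is preferred for data that is not normally distributed: a single extreme value drags the mean towards the tail but leaves the median where it was. The data sets are made up for illustration.

```python
from statistics import mean, median

symmetric = [2, 3, 4, 5, 6]
skewed = [2, 3, 4, 5, 60]  # same data with one extreme high value

print(mean(symmetric), median(symmetric))  # 4 4
print(mean(skewed), median(skewed))        # 14.8 4
```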

28