Statistics Flashcards
(30 cards)
Describe the 6 factors required for a normal distribution curve
- Bell shaped
- symmetrical
- has equal values for mean, median and mode
- has a single central peak
- is continuous
- has values between -infinity and +infinity
Describe three ways to assess the normality of a distribution
- Informal review - visualisation
- Probability plot produced by a statistical package
- Statistical test - Kolmogorov-Smirnov (KS) and Shapiro-Wilk (SW)
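A probability plot can also be built by hand, which shows what the package is doing. The sketch below is only an illustration: the function names, the plotting positions (i - 0.5) / n and the two samples are assumptions, not part of the cards. Points that fall close to a straight line (correlation near 1) suggest the data are plausibly normal.

```python
from statistics import NormalDist, mean


def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal Q-Q plot."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def qq_correlation(sample):
    """Pearson correlation between theoretical and observed quantiles."""
    points = normal_qq_points(sample)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up samples for illustration only.
roughly_symmetric = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.1, 6.4, 6.9, 7.6]
right_skewed = [1, 1, 1, 2, 2, 3, 4, 7, 15, 40]

print(qq_correlation(roughly_symmetric))  # close to 1: points lie near a line
print(qq_correlation(right_skewed))       # clearly lower: points bend away
```

For the formal tests, statistical packages provide ready-made routines (for example `scipy.stats.shapiro` and `scipy.stats.kstest` in SciPy).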
Describe the different options available for transformations
- logarithmic
- square root
- reciprocal
- cube root
- logit
Describe what data you would use a logarithmic transformation on
- fairly skewed data
- data where the variance is proportional to the mean
Describe what data you would use a square root transformation on
- data that is slightly skewed
- counts
Describe what data you would use a reciprocal transformation on
Highly skewed data
Describe what data you would use a cube root transformation on
Data relating to volumes
Describe what data you would use a logit transformation on
Proportions
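The five transformations on the cards above can be written as one-line functions. This is a minimal sketch using only the standard library; the function names are made up for illustration, and the comments note the input range each transformation needs.

```python
import math


def log_transform(x):
    return math.log(x)  # natural log; requires x > 0


def sqrt_transform(x):
    return math.sqrt(x)  # requires x >= 0, so it suits counts


def reciprocal_transform(x):
    return 1.0 / x  # requires x != 0; reverses the order of positive values


def cube_root_transform(x):
    # copysign keeps the sign, so negative inputs are handled too.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def logit_transform(p):
    return math.log(p / (1.0 - p))  # requires 0 < p < 1 (a proportion)


# Made-up right-skewed values for illustration only.
skewed = [1, 2, 4, 8, 16, 32, 64]
print([round(log_transform(v), 3) for v in skewed])  # evenly spaced after logging
```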
Describe the assumptions for using the paired t-test
- the distribution of paired differences is normal
- the differences are independent from each other
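Because the paired t-test works on the differences, its test statistic can be sketched directly from them. The function name and the before/after values below are made up for illustration; a statistical package (for example `scipy.stats.ttest_rel`) would also report the p-value.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for paired samples of equal length."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    # t = mean difference / standard error of the mean difference
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Made-up measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 14, 15]
t, df = paired_t_statistic(before, after)
print(round(t, 3), df)
```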
What are the assumptions for the unpaired t-test?
- the data is plausibly normally distributed
- the population variances (or SD) of the two groups are equal
What are the four options you can do if you want to do an unpaired t-test but the variances are not equal?
- Welch’s t-test
- non-parametric test
- data transformation
- Do not proceed
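The first option can be sketched as follows: Welch's version of the t-test uses each group's own variance instead of a pooled one, and adjusts the degrees of freedom with the Welch-Satterthwaite formula. The function name and the data are assumptions for illustration; in SciPy the equivalent call is `scipy.stats.ttest_ind(a, b, equal_var=False)`.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, approximate degrees of freedom) without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Made-up groups with visibly different spread.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10, 12]
t, df = welch_t(group_a, group_b)
print(round(t, 3), round(df, 2))
```

The degrees of freedom always fall between the smaller group size minus 1 and the pooled value na + nb - 2.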
What does regression analysis do?
Gives you information about the nature of relationships between two variables
What does correlation do?
Gives you information about the strength and direction of the linear relationship between two continuous variables
What does Chi Squared analysis do?
Gives you information about the relationship between two categorical variables
When is logistic regression used?
When the outcome (dependent) variable is categorical, most often binary
What is the correlation coefficient?
Describes the strength of the association between two variables
What is the Pearson product moment correlation coefficient (r)?
Measures the strength of the linear relationship between two continuous variables
What are some assumptions for simple linear regression?
- observations are independent
- relationship must be linear
- residuals must be normally distributed
- homoscedasticity: equal variance
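The assumptions about residuals are easier to picture once the residuals have been computed. The sketch below fits a least-squares line by hand and returns the residuals that would then be checked for normality and equal variance; the function names and the data are made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def residuals(xs, ys, slope, intercept):
    """Observed y minus fitted y for each observation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Made-up data that is close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
res = residuals(xs, ys, slope, intercept)
print(round(slope, 3), round(intercept, 3))
print([round(r, 3) for r in res])  # inspect these for normality and equal spread
```

Plotting the residuals against the fitted values is the usual informal check: a funnel shape suggests unequal variance, and a curve suggests the relationship is not linear.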
Describe the non-parametric tests used for two samples of data?
- Wilcoxon signed rank test - for dependent samples
- Mann-Whitney U test - for independent samples
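For the independent-samples case, the Mann-Whitney U statistic simply counts, over every pair of one value from each group, how often the first group's value is larger (ties count as half). This is a minimal sketch with a made-up function name and data; it gives the statistic only, and a package such as SciPy (`scipy.stats.mannwhitneyu`) would supply the p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U for sample_a, U for sample_b) by direct pair counting."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0   # a "wins" the pair
            elif a == b:
                u_a += 0.5   # ties are shared
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Made-up groups with no overlap: every value in group_b exceeds group_a.
group_a = [1, 2, 3]
group_b = [4, 5, 6]
print(mann_whitney_u(group_a, group_b))  # (0.0, 9.0)
```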
Describe the two non-parametric tests that can be used for testing a hypothesis about the location (median) of a single sample
- sign test
- Wilcoxon signed rank
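The sign test is simple enough to sketch in full: it counts how many observations fall above the hypothesised value and compares that count with a Binomial(n, 0.5) distribution. The function name and data are assumptions for illustration, and values equal to the hypothesised value are dropped, which is one common convention.

```python
import math


def sign_test_p_value(sample, hypothesised_median):
    """Exact two-sided sign test p-value for a single sample."""
    # Observations equal to the hypothesised value carry no sign and are dropped.
    diffs = [x - hypothesised_median for x in sample if x != hypothesised_median]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Probability of k or fewer of the rarer sign under Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up sample: all eight values lie above the hypothesised value of 10.
sample = [12, 15, 11, 18, 14, 16, 13, 17]
print(sign_test_p_value(sample, 10))  # 0.0078125
```

The Wilcoxon signed rank test uses the sizes of the differences as well as their signs, so it is usually more powerful when the differences are roughly symmetric.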
When is Spearman’s correlation coefficient used?
- sample size is small
- neither X nor Y is normally distributed
- at least one of the two variables is measured on an ordinal scale
- the relationship is non-linear (but still consistently increasing or decreasing)
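Spearman's coefficient is Pearson's coefficient applied to the ranks of the data, which is why it copes with ordinal scales and curved relationships. The sketch below uses made-up function names and data to compare the two on a relationship that is curved but always increasing.

```python
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation; assumes neither variable is constant."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Made-up data: y = x cubed, so curved but strictly increasing.
xs = [1, 2, 3, 4, 5, 6]
ys = [x ** 3 for x in xs]
print(round(pearson_r(xs, ys), 3))     # below 1 because the relationship is curved
print(round(spearman_rho(xs, ys), 3))  # 1.0 because the ordering agrees perfectly
```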
What are the assumptions for the Chi-squared test?
- 80% or more of the cells have an expected value >5
- all expected frequencies >1
- total sample size >20
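The expected values in these rules come from the row and column totals: expected = row total x column total / grand total. The sketch below computes them, checks the three rules using the thresholds exactly as written on the card (some texts use "at least 5" and "at least 1" instead), and calculates the test statistic. Function names and tables are made up for illustration.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def assumptions_met(table):
    """Check the three rules of thumb, with thresholds taken from the card."""
    cells = [e for row in expected_counts(table) for e in row]
    total = sum(sum(row) for row in table)
    share_above_5 = sum(e > 5 for e in cells) / len(cells)
    return share_above_5 >= 0.8 and all(e > 1 for e in cells) and total > 20


# Made-up 2x2 tables of observed counts.
large_table = [[20, 30], [30, 20]]
small_table = [[3, 1], [1, 3]]
print(chi_squared_statistic(large_table))  # 4.0
print(assumptions_met(large_table))        # True
print(assumptions_met(small_table))        # False
```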
What does cross-sectional mean in terms of study design?
Observes the subjects once
What does longitudinal mean in terms of study design?
Observes the subjects over time