Resampling statistics Flashcards
(15 cards)
What do traditional statistics rely on?
Mathematical models with assumptions (e.g. normality)
Traditional statistics were primarily developed between 1800 and 1930.
Resampling methods
-Fewer assumptions (more robust)
-Require computers
-Include permutation tests and bootstrap resampling
Main advantages of resampling methods
-Minimal assumptions
-Easy to generalise
-No lookup tables or complex equations
-Forces engagement with data structure
Resampling methods are considered more robust than traditional methods.
Limitations of resampling methods
-Newer and less familiar
-Requires computing/programming
-Not widely available in traditional tools like SPSS
This may limit their accessibility for some researchers.
2 main resampling techniques
-Permutation tests (randomisation tests)
-Bootstrap resampling
These methods are alternatives to traditional statistical tests.
Permutation tests
-Replace t-tests or ANOVAs
-Shuffle group labels to test the null hypothesis
This method assesses the significance of observed differences.
Bootstrap resampling
-Estimate confidence intervals and standard error
-Resample with replacement from original data
-Model parameter uncertainty
Bootstrap methods involve resampling with replacement from original data.
Goal of permutation tests
To estimate the probability of seeing a group difference at least as large as the one observed if the null hypothesis is true
This involves combining all data and randomly assigning it to new groups.
Steps in permutation tests
-Combine all data, ignoring group labels
-Randomly assign data to new groups (simulate null)
-Calculate new difference in means (or SD)
-Repeat (e.g. 10,000 times) to build the null distribution
-Compare observed differences to this distribution
This process is repeated multiple times to build a null distribution.
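The steps above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name permutation_test, its parameters, and the two-sided comparison with a +1 correction are assumptions made here, not part of the cards.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in means (illustrative)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)

    # Step 1: combine all data, ignoring group labels.
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_resamples):
        # Step 2: randomly assign data to new groups (simulates the null).
        rng.shuffle(pooled)
        # Step 3: calculate the new difference in means.
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        # Step 5: compare against the observed difference.
        if abs(diff) >= abs(observed):
            extreme += 1

    # The +1 counts the observed labelling as one possible arrangement,
    # so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value
```

Calling permutation_test(a, b) on two lists of measurements returns the observed difference in means and an estimated p-value. Swapping mean for another statistic (such as the standard deviation) only requires changing the two lines that compute the difference.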
Impact of sample size on permutation tests
-Larger sample sizes lead to a narrower null distribution and more power
-Same measured difference is more significant with a larger sample size
The same measured difference becomes more significant as sample size increases, because the null distribution it is compared against is narrower.
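A small simulation can make the narrowing of the null distribution concrete. The sketch below assumes both groups come from the same standard normal distribution (so the null is true by construction); the function name null_spread and its defaults are hypothetical.

```python
import random
from statistics import fmean, stdev

def null_spread(n_per_group, n_resamples=2000, seed=1):
    """Standard deviation of the permutation null distribution
    for the difference in means, with n_per_group points per group."""
    rng = random.Random(seed)
    # Assumed data: both groups drawn from the same N(0, 1) population.
    pooled = [rng.gauss(0, 1) for _ in range(2 * n_per_group)]
    diffs = []
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diffs.append(fmean(pooled[:n_per_group]) - fmean(pooled[n_per_group:]))
    return stdev(diffs)

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(null_spread(n), 3))
```

The printed spread shrinks as n grows, so a fixed observed difference sits further into the tails of the null distribution when the groups are larger.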
Process of bootstrap resampling
-Resample from original data (with replacement)
-Calculate the statistic of interest (mean, gradient, etc.)
-Repeat 10,000+ times to form distribution
This is repeated many times to form a distribution.
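The process above can be sketched as follows. The function name bootstrap, the percentile method for the confidence interval, and the default of 10,000 resamples are choices made for this illustration; other interval methods exist and may behave better for skewed statistics.

```python
import random
from statistics import mean, stdev

def bootstrap(data, statistic=mean, n_resamples=10_000, alpha=0.05, seed=0):
    """Bootstrap standard error and percentile confidence interval (illustrative)."""
    rng = random.Random(seed)
    n = len(data)

    # Resample with replacement, compute the statistic, repeat many times.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )

    # Standard error: spread of the bootstrap distribution.
    std_error = stdev(estimates)

    # Approximate percentile interval: cut alpha/2 from each tail.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return std_error, (lower, upper)
```

For example, bootstrap(measurements) returns the standard error of the mean and an approximate 95% interval. Passing a different function as statistic (for instance statistics.median) reuses the same resampling loop.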
Jackknife method
Systematically omit one data point per resample
This method is used to assess the influence of individual data points on the overall statistic.
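A leave-one-out sketch of the jackknife is shown below. The function name jackknife and the returned pair (leave-one-out estimates plus the standard jackknife standard error) are assumptions made for illustration.

```python
from statistics import mean

def jackknife(data, statistic=mean):
    """Leave-one-out estimates and the jackknife standard error (illustrative)."""
    data = list(data)
    n = len(data)

    # Omit each data point in turn and recompute the statistic.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]

    # Jackknife standard error: sqrt((n - 1) / n * sum of squared deviations).
    centre = sum(leave_one_out) / n
    std_error = ((n - 1) / n * sum((x - centre) ** 2 for x in leave_one_out)) ** 0.5
    return leave_one_out, std_error
```

A leave-one-out estimate that differs sharply from the others points to an influential data point. With data = [1, 2, 3, 4, 100], omitting the last value moves the mean from 22 to 2.5, while omitting any other value changes it far less.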
Monte Carlo method
Simulating data from theoretical models to test hypotheses
For example, it can be applied to neuron spiking models.
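As a rough illustration of the Monte Carlo idea, the sketch below simulates spike counts from a very simple theoretical model and asks how often the model produces a count at least as large as an observed one. The model (an independent spike probability of rate × dt in each 1 ms bin), the function names, and all parameter values are assumptions for this example, not taken from the cards.

```python
import random

def simulate_spike_count(rate_hz, duration_s, rng, dt=0.001):
    """Count spikes from an assumed model: each time bin of width dt
    contains a spike independently with probability rate_hz * dt."""
    n_bins = int(round(duration_s / dt))
    p_spike = rate_hz * dt
    return sum(rng.random() < p_spike for _ in range(n_bins))

def monte_carlo_p(observed_count, rate_hz, duration_s, n_sim=2000, seed=0):
    """Estimated probability that the model yields a spike count
    at least as large as observed_count (one-sided)."""
    rng = random.Random(seed)
    simulated = [
        simulate_spike_count(rate_hz, duration_s, rng) for _ in range(n_sim)
    ]
    extreme = sum(count >= observed_count for count in simulated)
    # +1 correction keeps the estimate above zero.
    return (extreme + 1) / (n_sim + 1)
```

Unlike permutation and bootstrap methods, nothing here is resampled from the observed data: every simulated dataset comes from the model, and the observation is only used for the final comparison.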
How many resamples are typically recommended?
1,000-10,000+
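More resamples reduce the random (Monte Carlo) error in the estimated p-value or confidence interval. With N permutations, the smallest p-value that can be reported is about 1/N.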
‘garbage in, garbage out’
The quality of data affects the quality of results
This highlights the importance of data quality in statistical analysis.