statistics Flashcards Preview

Flashcards in statistics Deck (15)

Type 1 error

Rejecting the null hypothesis when there is actually no effect, i.e., a false positive.



Type 2 error

Failing to reject the null hypothesis when there is actually an effect, i.e., a false negative.


P value

The probability of seeing an effect at least as large as the one observed if the null hypothesis is true.


Alpha value

This is the type 1 error level. Alpha is the chance of a false positive error that we are willing to tolerate, chosen before testing.


What is the power of a procedure?

The chance that a testing procedure correctly rejects the null hypothesis when there is a true effect
(1 minus the type II error rate).

Power is a function of three things:

  1. The true effect size
  2. The efficiency of the statistics 
  3. The sample size 
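The dependence of power on effect size and sample size can be sketched for one simple case. The example below assumes a one-sided, one-sample z-test with known unit variance, which is an illustrative choice and not something the card specifies; the function name `z_test_power` is hypothetical. It does not model the efficiency of the statistic.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test with known unit variance.

    effect_size is the true mean in standard deviation units (assumed).
    """
    std_normal = NormalDist()
    # Critical z-score above which the null hypothesis is rejected.
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is shifted by effect_size * sqrt(n).
    return 1 - std_normal.cdf(z_crit - effect_size * math.sqrt(n))


# Power rises with both the true effect size and the sample size.
for effect_size in (0.2, 0.5):
    for n in (10, 25, 50):
        print(effect_size, n, round(z_test_power(effect_size, n), 3))
```

With no true effect (an effect size of 0) the function returns alpha, which is the false positive rate rather than power in any useful sense.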


Voxel level analysis

Where each voxel is tested individually to identify a signal.


Cluster level analysis

Where activity in groups of voxels is tested together for significance. None of the individual voxels need be significant on their own, as significance is assessed for the cluster as a whole.

Voxel level analysis does not account for the fact that clusters of voxels may be activated together. This does occur, because activated brain structures may be larger than 1 voxel, and smoothing also spreads activation across voxels.

Cluster analysis is usually more sensitive than voxel analysis. It can identify clustered activations that are larger than the kernels used for smoothing noise. Its power comes at the expense of spatial specificity: we can say that the cluster is significant, but not which voxels within it are.
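As a rough illustration of the first step of a cluster level analysis, the sketch below finds contiguous groups of voxels above a cluster-forming threshold. It is simplified to a 1D row of voxels, and the statistic values, the threshold of 1.9, and the function name `find_clusters` are all hypothetical. Deciding whether a cluster of a given size is significant needs a null distribution of cluster sizes, which is not shown here.

```python
def find_clusters(stat_values, forming_threshold):
    """Return (start, end) index pairs, end exclusive, of contiguous
    runs of values above the cluster-forming threshold."""
    clusters = []
    start = None
    for i, value in enumerate(stat_values):
        if value > forming_threshold:
            if start is None:
                start = i  # a new cluster begins here
        elif start is not None:
            clusters.append((start, i))  # the current cluster has ended
            start = None
    if start is not None:
        clusters.append((start, len(stat_values)))
    return clusters


# Hypothetical statistic values for a row of ten voxels.
stat_map = [0.2, 2.1, 2.4, 2.2, 0.5, 0.1, 2.6, 0.3, 2.0, 2.3]
clusters = find_clusters(stat_map, forming_threshold=1.9)
sizes = [end - start for start, end in clusters]
print(clusters, sizes)
```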


What is the difference between the P value and the alpha value? 

The alpha value is the pre-set threshold for the chance of a false positive that will be accepted, e.g., 0.05.

The P value is the actual probability of results at least as extreme as those observed occurring if the null hypothesis is true. If P is < alpha then we reject the null hypothesis.
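The decision rule can be sketched as follows, assuming a one-sided test whose statistic is a z-score. The observed value of 2.0 and the function name `one_sided_p_value` are hypothetical.

```python
from statistics import NormalDist


def one_sided_p_value(z_score):
    """Probability of a z-score at least this large if the null hypothesis is true."""
    return 1 - NormalDist().cdf(z_score)


alpha = 0.05      # threshold fixed before looking at the data
z_observed = 2.0  # hypothetical test statistic
p_value = one_sided_p_value(z_observed)
reject_null = p_value < alpha
print(f"P = {p_value:.4f}, reject null: {reject_null}")
```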


What is the multiple testing/comparisons problem? 

The alpha value only controls the probability of a type I error on a test by test basis. For example, if alpha=0.05, then we will declare all voxels with a P value of <0.05 to be significant (i.e., on average 5% of voxels with no true effect will be false positives). When testing 100,000 voxels with no true effect anywhere, this means that around 5,000 voxels are expected to pass the significance threshold even though they do not reflect an effect.

False-positive risk therefore has to be controlled across an image and not just at the level of each individual test.  
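The 5,000 figure can be checked with a small simulation. The sketch below assumes that every voxel is null and that the tests are independent, so each P value is uniform between 0 and 1. Real voxels are spatially correlated, so this is only an illustration of the expected count. The function name `count_false_positives` is hypothetical.

```python
import random


def count_false_positives(n_tests, alpha=0.05, seed=0):
    """Count null tests whose simulated P value falls below alpha."""
    rng = random.Random(seed)
    # Under the null hypothesis a P value is uniformly distributed on [0, 1).
    return sum(rng.random() < alpha for _ in range(n_tests))


n_voxels = 100_000
false_positives = count_false_positives(n_voxels)
print(false_positives, "of", n_voxels, "null voxels pass P < 0.05")
```

The count varies with the seed but stays close to 100,000 x 0.05 = 5,000.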


Family-wise error rate

This is an alpha value set at the level of the image. Therefore, for an αFWE set at 0.05 there is a 5% chance of one or more false positives occurring anywhere within the entire image. This means the P values must be corrected at the voxel level (or the cluster level).
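To see why an uncorrected threshold does not control the family-wise error rate, the sketch below computes the chance of at least one false positive when every test is run at the same per-test alpha. It assumes independent tests with no true effect, which is a simplification for voxel data; the function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(alpha_per_test, n_tests):
    """Chance of one or more false positives across independent null tests."""
    # Probability that every test avoids a false positive, subtracted from 1.
    return 1 - (1 - alpha_per_test) ** n_tests


for n_tests in (1, 10, 100, 100_000):
    print(n_tests, round(familywise_error_rate(0.05, n_tests), 4))
```

With a single test the rate equals the per-test alpha, and it approaches 1 as the number of tests grows.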


How can FWE be controlled for? 

Bonferroni method. This method divides the alpha level by the number of tests being performed. However, it is highly conservative for tests that are highly correlated, which is the case in fMRI due to the spatial dependence of voxels on one another.
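A minimal sketch of the Bonferroni method, in which the per-test threshold is the family-wise alpha divided by the number of tests. The P values and the function names are hypothetical.

```python
def bonferroni_threshold(alpha_fwe, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha_fwe."""
    return alpha_fwe / n_tests


def bonferroni_significant(p_values, alpha_fwe=0.05):
    """Flag each P value that survives Bonferroni correction."""
    threshold = bonferroni_threshold(alpha_fwe, len(p_values))
    return [p <= threshold for p in p_values]


# With 100,000 voxels the per-voxel threshold becomes very small.
print(bonferroni_threshold(0.05, 100_000))

# Five hypothetical P values: the per-test threshold is 0.05 / 5 = 0.01.
p_values = [0.001, 0.008, 0.039, 0.041, 0.2]
print(bonferroni_significant(p_values))
```

Two of the five values would be significant here, whereas four pass an uncorrected threshold of 0.05.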

Random field theory. This is an approach that takes into account the intrinsic (naturally correlated activity between voxels) and extrinsic (smoothing kernels) smoothness of the data. It requires that the data be very smooth, and is conservative for sample sizes <20.


Permutation testing 



What are two overarching corrections that can be made for multiple testing?

Family-wise error rate

False discovery rate


False discovery rate (FDR)

Controlling the FWER is highly conservative (insensitive), often leaving no significant results.

The false discovery proportion is the proportion of the detected voxels or clusters that are false positives. The false discovery rate is the expected value of this proportion, and it is this that FDR methods control.

A 0.05 FDR level means that, on average, 5% of the voxels or clusters declared significant are false positives, i.e., about 95% of detections are correct.

It is more lenient than FWER control but comes with a higher risk of false positives.
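The card does not name a specific FDR method. One widely used option is the Benjamini-Hochberg procedure, sketched below as an assumption about how FDR control might be applied. The P values and the function name `benjamini_hochberg` are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which tests are declared
    significant at FDR level q by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * q / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            max_k = rank
    # Declare the max_k smallest P values significant.
    significant = [False] * m
    for idx in order[:max_k]:
        significant[idx] = True
    return significant


# Ten hypothetical P values.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
fdr_hits = benjamini_hochberg(p_values, q=0.05)
bonferroni_hits = [p <= 0.05 / len(p_values) for p in p_values]
print(sum(fdr_hits), "significant under FDR;", sum(bonferroni_hits), "under Bonferroni")
```

On these values the FDR procedure keeps more detections than the Bonferroni threshold does, which illustrates the extra leniency.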



Restricting the analysis to a region of interest defined prior to statistical analysis is called a small volume correction, and it can reduce the multiple comparisons problem because fewer voxels are tested. It is crucial that the region is defined independently of the statistical analysis, and it therefore requires some a priori knowledge of where the activation might be.



Power calculations

These are made to work out how many subjects are needed to find the desired effect at around 80% power (80% is the conventional threshold).
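A power calculation can be sketched for one simple case. The example assumes a one-sided, one-sample z-test with known unit variance and a hypothetical true effect size of 0.5 standard deviations; neuroimaging power analyses are considerably more involved. The function name `required_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def required_sample_size(effect_size, power=0.80, alpha=0.05):
    """Smallest n giving the target power for a one-sided, one-sample
    z-test with known unit variance (an assumed, simplified design)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # rejection cut-off
    z_power = std_normal.inv_cdf(power)      # quantile for the target power
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)


n_subjects = required_sample_size(effect_size=0.5)
print(n_subjects, "subjects needed for 80% power")
```

Smaller expected effects need many more subjects, because the required sample size scales with one over the effect size squared.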