Neurocognitive Modelling Flashcards

(196 cards)

1
Q

What are neural models?

A
  1. Consider neural properties such as receptive fields and tuning curves
  2. Work for relatively small networks and simple tasks
2
Q

What are cognitive models?

A
  1. Consider latent parameters that allow us to make inferences about cognitive processes
  2. Very general, but need some constraints on structure
3
Q

What are normative models?

A
  1. Derive optimal solution for a task within constraints
  2. Independent of actual behaviour, but can be compared
  3. Theory driven
4
Q

What are behavioural models?

A
  1. Fit directly to actual behavioural data (process model)
  2. Very data intensive
  3. Needs data & theory
5
Q

What is the general equation for a cell? (e.g., a simple cell in the primary visual cortex)

A

response = function (stimulus)
r = f(s)

6
Q

What questions can we ask about a neuron's response function?

A
  1. What is f? (descriptive approach)
  2. How does f arise? (development / learning)
  3. What should f be? (Normative approach - efficient coding)
7
Q

What is the job of a neuron?

A

To transmit information

8
Q

Who is the father of Information theory?

A

Claude Shannon, for his 1948 paper “A Mathematical Theory of Communication”

9
Q

What is Information theory?

A

A field that studies how to measure, quantify and transmit information

10
Q

What is the key principle in Information theory?

A

The less we can predict something, the more information it gives us

Information as surprise

11
Q

How does surprise link to information?

A

Low surprise, low information
High surprise, high information
No surprise, no information

12
Q

How do we calculate information / surprise?

A

Negative logarithm of the probability of an event

13
Q

What log base do we normally use for calculating information / surprise?

A

Log base 2 - this means it is in bits

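Worked example: a minimal sketch of the surprise calculation from the two cards above (assuming numpy; the function name is just for illustration):

```python
import numpy as np

def surprise(p):
    """Surprise (self-information) of an event with probability p, in bits."""
    return -np.log2(p)

print(surprise(0.5))   # fair coin flip: 1 bit
print(surprise(1.0))   # certain event: 0 bits (no surprise, no information)
print(surprise(0.01))  # rare event: ~6.6 bits (high surprise)
```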
14
Q

If P(heads) is the probability of heads, what is the way to write the information / surprise of heads?

A

I(heads) = -log2(P(heads))

15
Q

What is surprise?

A

The measure of information for a specific outcome

16
Q

What is entropy?

A

How much information any one outcome gives us, on average

Weighted average of information

17
Q

How do we calculate the entropy of a system?

A

Entropy = p(x1)I(x1) + p(x2)I(x2) + …

18
Q

What is the general equation for Entropy?

A

Entropy = -Σ p(xi) log2(p(xi))

(the sum runs over all outcomes, i = 1 to n)

19
Q

What does more randomness lead to?

A

More surprise and therefore more information

20
Q

How can we represent randomness as a probability distribution?

A

A perfectly uniform distribution

21
Q

What does the equation for entropy simplify to for a uniform distribution?

A

log2(n), where n is the number of equally likely outcomes

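Worked example: a small sketch (assuming numpy) of the entropy formula, including the check that it simplifies to log2(n) for a uniform distribution:

```python
import numpy as np

def entropy(p):
    """Shannon entropy H = -sum_i p(xi) log2(p(xi)), in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                    # outcomes with p = 0 contribute nothing
    return -np.sum(p * np.log2(p))

print(entropy([0.5, 0.5]))      # fair coin: 1 bit
print(entropy([0.9, 0.1]))      # biased coin: ~0.47 bits (less surprise on average)
print(entropy(np.ones(8) / 8))  # uniform over n = 8 outcomes: log2(8) = 3 bits
```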
22
Q

Where has information theory been used outside of neuroscience?

A
  1. Communications (original form)
  2. Language
23
Q

How was information theory used in language?

A

The English alphabet has 26 letters, so log2(26) ≈ 4.7 bits per letter, but in reality each letter transmits only ~2.3 bits of information, because letters are partly predictable from context

24
Q

What is Zipf’s law?

A

Frequency of a word is inversely proportional to its rank in the frequency table

All natural languages are inefficient

Looks the same for all natural languages

25
What does information theory allow us to do?
Quantify the amount of information in a channel (e.g., neurons)
26
What is another word for information?
Surprise
27
What is another word for average information?
Entropy
28
If we are transmitting a signal X to Y, and noise = 0, what is the representation of mutual information for X and Y?
X = Y, so I(X;Y) = H(X) = H(Y)
29
If we are transmitting a signal X to Y, and noise > 0, what is the representation of mutual information for X and Y?
I(X;Y) = H(Y) - H(Y|X)
30
What does H(Y) mean?
Entropy of Y - amount of information received, some due to X, some due to noise
31
What does H(Y | X) mean?
Entropy of Y conditional on X, also known as the noise entropy
32
What is noise entropy?
Information received when X is held constant, i.e. no information comes from X, so all received information is due to noise
33
How can mutual information be calculated?
It is symmetric, so I(X;Y) = H(Y) - H(Y|X) and I(X;Y) = H(X) - H(X|Y)
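Worked example: a sketch (assuming numpy) that computes I(X;Y) from a joint probability table, using the equivalent identity I(X;Y) = H(X) + H(Y) - H(X,Y); the toy channels are made up for illustration:

```python
import numpy as np

def H(p):
    """Entropy in bits of a probability array."""
    p = np.asarray(p).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), from a joint probability table."""
    joint = np.asarray(joint)
    return H(joint.sum(axis=1)) + H(joint.sum(axis=0)) - H(joint)

# Noiseless channel: Y copies X, so I(X;Y) = H(X) = H(Y) = 1 bit
print(mutual_information([[0.5, 0.0],
                          [0.0, 0.5]]))

# Noisy channel: probability mass leaks off the diagonal, so I(X;Y) < 1 bit
print(mutual_information([[0.4, 0.1],
                          [0.1, 0.4]]))
```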
34
What are 4 different ways of looking at the neural code?
1. Number of spikes in a given window
2. Precise spike timing
3. Inter-spike intervals
4. Synchronous spikes with another neuron
35
How can we turn neural signals into a probability distribution?
Bin neural signals across time
36
How much information was a string of ten binary neural values (neural code) found to exhibit?
5 bits; the maximum achievable would be 10
37
Why is a neuron not giving maximal information?
Neurons are rate-limited due to the high energy cost of spiking. A neuron's information transmission rate will therefore be far from the maximal rate that could theoretically be achieved.
38
How is energy used in the brain?
45% goes to the neocortex; 13% is used for spiking, allowing 0.16 spikes/s for each neocortical neuron
39
Around what proportion of a neuron's entropy is noise? (fly visual neuron study)
50%
40
How can information theory help us to understand the neural code?
If a certain code results in much more information than another, we can use this as suggestive evidence for brain function
41
What did later studies (Butts et al 2007) show with regards to neural code?
The timescale of the input matters
42
What can we conclude from Butts et al?
1. Stimuli change on a certain timescale
2. Spikes need to be more precise on a shorter timescale than the stimulus to transmit information about the stimulus
3. We can use this to calculate how precise spiking needs to be (at a certain point, increasing precision will not increase information)
43
What is the issue with calculating mutual information from neuroscientific data?
1. We need to know the probabilities of certain events
2. We don't actually know the true probabilities; we estimate them from relative frequencies
3. Neuroscience typically has a small number of samples
4. The fewer samples we have, the more skewed the distribution and the lower the estimate of entropy
44
What is the overall bias effect from calculating mutual information off a small sample of data?
I(R;S) = H(R) - H(R|S)
H(R) is biased downwards
H(R|S) is biased downwards EVEN MORE, because it is calculated on a smaller sample
So mutual information is biased upwards!
45
What is the easiest solution to check if a measure of mutual information is biased?
1. Remove half your data and recalculate
2. See if the estimate stays similar
3. If it does, it is probably fine
4. If it doesn't, a more advanced solution is needed
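Worked example: a sketch of why the check works. Entropy estimated from relative frequencies (the plug-in estimate) is biased downwards, and more so with fewer samples; the distribution and sample sizes here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def plug_in_entropy(samples):
    """Naive entropy estimate (bits) from relative frequencies."""
    _, counts = np.unique(samples, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# True distribution: uniform over 16 symbols, so the true entropy is 4 bits
full = rng.integers(0, 16, size=200)
half = full[:100]

print(plug_in_entropy(full))  # somewhat below 4 bits
print(plug_in_entropy(half))  # typically lower still: bias grows as samples shrink
```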
46
What are some advanced methods for rectifying MI bias?
1. Extrapolation
2. Lower bounds
3. Analytical methods
4. Bayesian methods
47
What is the efficient coding hypothesis?
A group of neurons should encode as much information as possible or, equivalently, remove as much redundancy as possible
48
Who came up with the efficient coding hypothesis?
Barlow, 1961
49
How can we express the efficient coding hypothesis using information theory?
Maximise I(S; R), which becomes: maximise I(S; f(S))
50
What is I?
Mutual information
51
What is S?
Signal
52
What is R?
Neural response - to be optimised
53
What is f?
Tuning curve - to be optimised
54
If a signal transmits 1 bit of information to neuron a and b, what is the range of information of the system?
Between 1 and 2 bits.
1 bit is the worst case: both neurons encode the same thing.
2 bits is the best case: efficient, no overlap.
Calculate the MI between a and b; the goal is for it to be 0.
55
What do we need to know to pick f to maximise I(s; f(s)) in the example of the visual neuron?
1. What is the distribution of natural stimuli?
2. What functions can we implement within the limitations of neural architecture?
56
What are the subcategorisations of the image space?
1. Image space
2. Matching first order statistics (single pixels)
3. Matching second order statistics (pairs of pixels)
4. Higher level analysis (ICA)
5. Natural images
57
What are first order statistics?
Look at the probability distribution of different brightness values in a particular image, over our whole image space
58
What is the model of first order statistics?
p(x1, x2, x3, ...) = p(x1) p(x2) p(x3) ...
Pixels are treated as independent
59
How can we use the first order model?
Work from our distribution to randomly draw images
60
What is second order statistics?
Look at relationships between pairs of pixels
61
What are pixel correlations in second order statistics?
Looking at dependencies (does knowing 1 pixel value tell us about another)
62
How can we use pixel correlations to generate a model of natural images?
We can measure and quantify the relationship between pixels and use it in a probability distribution
63
What is power spectral density (Fourier domain) in second order statistics?
Treat images as functions of varying brightness values:
1. Take a single line of the image and define a function that traces its brightness values
2. Apply a Fourier decomposition to see how the brightness values change
3. Rapid change --> high frequency; gradual change --> low frequency
64
What is the power of different frequencies in natural images?
Higher power at low frequencies (gradual change), lower power at high frequencies
65
What can we use our second order statistics (power spectral density or pixel correlations) in?
A model such as the Gaussian image model
66
What does the Gaussian image model use?
A multivariate normal distribution
67
What is the definition of the Gaussian image model?
N(x | mu, sigma)
x is the vector of pixel brightness values
mu is the vector of mean pixel intensities
sigma is the covariance matrix describing correlations between pixel pairs
68
What is the purpose of using the Gaussian image model?
1. Pixels in natural images are correlated
2. The correlations can be captured in a simple Gaussian model
3. We can now maximise information, assuming that natural stimuli come from this simple distribution
69
What is our encoding model now we have modelled our natural stimuli?
r = Ws
r is the neural response
W is the neural filter (receptive field)
s is the pixel values from our modelled natural image
70
Why is decorrelation as efficient coding an effective method?
We want to minimise redundancy, which means minimising correlation: maximise variance, minimise correlation
71
Why does more variance increase information?
The distribution is more spread out - more uncertainty, and therefore more information
72
What is the goal with decorrelation as efficient coding?
Take correlated pixel inputs & transform these into decorrelated neural activity
73
What is whitening?
A transformation that removes correlations between signals and normalises their variances
74
What are the correlations and variances in a whitened image?
1. No correlations
2. Equal variance across all components
75
What is the first step of whitening?
Decorrelating the pixels:
1. Eigendecomposition of the covariance matrix (PCA)
2. Rotate the data
76
What is the second step of whitening?
Scale the axes to equalise the range, so that all components have equal variance
77
What is the end result of whitening?
The covariance matrix of the neural response r becomes the identity matrix
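Worked example: a sketch of the two whitening steps on toy two-pixel data (assuming numpy); real applications would use image patches instead:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "pixel" data: two correlated dimensions
raw = rng.multivariate_normal([0, 0], [[2.0, 1.5], [1.5, 2.0]], size=10000)

# Step 1: decorrelate via eigendecomposition of the covariance matrix (PCA rotation)
eigvals, eigvecs = np.linalg.eigh(np.cov(raw, rowvar=False))
rotated = raw @ eigvecs

# Step 2: scale the axes so all variances are equal (to 1)
white = rotated / np.sqrt(eigvals)

# Check: the covariance of the whitened responses is (close to) the identity matrix
print(np.cov(white, rowvar=False).round(2))
```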
78
What receptive field W results in the decorrelated neural response?
Checkerboard receptive field - not seen in nature
79
What is a localised whitening basis function?
Has an additional constraint: it must be localised in space
80
What do the localised whitening basis functions do?
Simulate the receptive fields of retinal neurons and cells in the lateral geniculate nucleus
81
What is the next step after using the Gaussian model?
Using independent component analysis
82
Why is ICA needed?
After whitening, the components are decorrelated, but they can still have higher order dependencies such as kurtosis. ICA removes these by finding a rotation of the whitened data that makes the components as statistically independent as possible
83
How does ICA work?
1. Independent components cannot be Gaussian
2. Linear mixtures of non-Gaussian signals become more Gaussian (central limit theorem)
3. Therefore, find the directions in the data that are least Gaussian
84
What does ICA add to the modelling equation?
Instead of modelling s in r = Ws with a Gaussian distribution, model it as s = Ma
M = mixing matrix
a = independent non-Gaussian sources
85
What is the end result when ICA is applied to whitened patches of natural images?
The resulting filters:
- are localised in space
- are oriented and bandpass (like edges)
- look like Gabor filters, which resemble the receptive fields of V1 simple cells in the visual cortex
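One way to reproduce this computationally, sketched with scikit-learn's FastICA; the patch array, file name, and patch size are hypothetical placeholders:

```python
import numpy as np
from sklearn.decomposition import FastICA  # assumes scikit-learn is installed

# Hypothetical input: (n_patches, 256) array of flattened 16x16 image patches
patches = np.load("image_patches.npy")

ica = FastICA(n_components=64, max_iter=1000)
ica.fit(patches)

# Each row of the unmixing matrix is a learned filter; reshaped to 16x16,
# these typically look like localised, oriented, Gabor-like receptive fields
filters = ica.components_.reshape(64, 16, 16)
```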
86
What does ICA imply?
Supports the idea that the visual system may be optimising for statistical independence, not just decorrelation. It extracts independent features that are efficient for encoding natural scenes.
87
How can we explain the response properties of retinal ganglion / thalamic visual neurons?
Decorrelation
88
Pair-wise correlations between pixels capture some of the statistics, but what do they not capture?
Statistics that are non-Gaussian, such as edges
89
What does recovering non-Gaussian components in the data (ICA) lead to?
Receptive fields that closely resemble Gabor patches, and therefore those of simple visual neurons in V1
90
Instead of modelling the stimulus distribution, what can we model instead?
How neurons should behave
91
What is sparse coding?
The idea that neural systems represent information using as few active neurons as possible at any given time
92
What are the advantages of sparse coding?
1. Maximises memory storage - fewer active neurons per pattern -> more patterns stored
2. Efficient energy usage - neurons that don't fire use less energy
93
What is sparsity?
Zero most of the time, but high values occasionally - such distributions are super-Gaussian
94
What are some examples of super Gaussian distributions?
1. Laplace distribution
2. Student's t distribution
3. Cauchy distribution
95
What is the issue with dense coding?
1. More energy
2. Less memory storage
3. Redundant
4. Inefficient
96
What are the issues with local codes?
1. Less robust
2. No representational flexibility
3. Limits generalisation
4. Harder to learn
97
What is the sparse coding model?
Minimise E = -[preserve information] + λ[sparseness]
The preserve-information term is the mean squared (reconstruction) error
The sparseness term is a function chosen to penalise non-zero values
98
What is E in the sparse coding function?
The energy we are trying to minimise - the cost function
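A minimal sketch of this cost function (assuming numpy), with the mean squared reconstruction error as the preserve-information term and an L1 penalty as one common choice of function that penalises non-zero activities:

```python
import numpy as np

def sparse_coding_energy(s, basis, a, lam):
    """E = [reconstruction error] + lambda * [sparseness penalty]."""
    reconstruction_error = np.mean((s - basis @ a) ** 2)  # preserve information
    sparseness_penalty = np.sum(np.abs(a))                # penalises non-zero values
    return reconstruction_error + lam * sparseness_penalty

# Toy usage with made-up sizes: a 64-pixel patch and 128 basis functions
rng = np.random.default_rng(0)
s = rng.normal(size=64)             # image patch
basis = rng.normal(size=(64, 128))  # dictionary of basis functions (columns)
a = np.zeros(128)                   # sparse activities: mostly zero
print(sparse_coding_energy(s, basis, a, lam=0.1))
```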
99
Summarise ICA
1. Images are mixtures of independent components with non-Gaussian statistics
2. The task of the brain is to demix the signal, i.e. recover the original components
3. Therefore, neural response properties will be non-Gaussian
100
Summarise Sparse coding
1. Super-Gaussian (sparse) response statistics are desirable given the constraints on the nervous system
2. Find filters that maximise sparseness
101
What do ICA and sparse coding lead to?
Both give localised, Gabor-like receptive fields, like those of V1 simple cells
102
Criticisms of sparse coding 1: What is the issue with the definitions being unclear?
Are neurons' responses supposed to be sparse across populations or over time?
103
Criticisms of sparse coding 2: We already know that neurons tend not to fire a lot - what is the issue with this?
We will find sparseness wherever we look
104
Criticisms of sparse coding 3: What brain complexity poses an issue for the model?
The brain is more complicated than a binary network, eg. excitatory and inhibitory neurons
105
Criticisms of sparse coding 4: Sparse coding optimises memory storage - what is a possible issue with this approach?
Is memory storage the limiting factor? Maybe generalisation / energy use is more important
106
What is reverse efficient coding?
Calculate optimal neural responses from stimulus statistics. Reverse this process and calculate the presumed stimulus statistics from known neural responses, assuming that the stimuli are coded efficiently
107
What is a steepness curve?
Some stimuli are more behaviourally relevant than others - the steep part of the function increases discriminability for those stimuli
108
What is Computational Neuroscience?
Understand the biologically plausible representations and algorithms governing neuronal activity patterns
109
What is Cognitive Science?
Decompose complex behavioural patterns and processes into computational components
110
What is Artificial Intelligence?
Combine component functions into computational models that can perform complex cognitive tasks
111
Explain the bottom up approach of computational neuroscience
Neuronal activity patterns --> biologically plausible representations and algorithms governing those patterns --> cognition
112
Explain the top down approach of cognitive science
Behavioural / cognitive patterns --> decompose complex cognitive processes into computational components --> brain correlates
113
What is the aim with cognitive science?
Understand human cognitive capacities and behaviour
114
What is the method with cognitive science?
Develop theories and models that can explain and predict those capacities and behaviours
115
What are data models?
Statistical techniques and models to describe data and/or establish relations between measured variables. They have no intrinsic psychological content; they just try to match the numbers.
116
Give an example of a data model?
Heathcote et al 2000: is the practice effect described better by a power function or an exponential function? (Exponential - learning continues by a constant fraction)
117
What do the symbols represent in a box and arrow model?
Box - processes
Arrow - flow of information / causal relationships
118
Give an example of a box and arrow model?
Baddeley & Hitch 1974 - modelling working memory
119
What is the structure of Baddeley and Hitch 1974?
Central Executive
Visuospatial Sketchpad | Episodic Buffer | Phonological Loop
Long Term Memory
120
What is computational modelling?
The process by which a verbal description is formalised to remove ambiguity, while also constraining the dimensions a theory can span
121
What does formalising a theory help us do?
1. Communicate theoretical ideas
2. Test theoretical hypotheses and predictions
3. Compare different plausible models statistically
122
What is the advantage to simple models?
Simplified abstracted models can capture broad trends by ignoring processes we are not interested in currently
123
What should good models be?
Precise and falsifiable
124
What does it mean to be precise?
The model makes clear, specific predictions
- Must be mathematically or logically defined
- Should say exactly what should happen under specific conditions
125
Why is it important for a model to be precise?
So you can actually test it, simulate it and compare it to data
126
What does it mean to be falsifiable?
The model must be testable and potentially disprovable
- There must be a way to prove the model wrong through experiment or observation
- If a model can never be wrong, it is not scientific
127
What is model fitting?
Trying to minimise the discrepancy between the values predicted by the model (predictions) and empirical data (observations) until they match as closely as possible
128
What does the discrepancy function do?
Expresses the deviation between predictions and observations in a single value (the cost function)
129
How can formal models help us?
Critically aid theory building
130
What are parameters?
Tuning knobs to adjust the values produced by the model, until they match (fit) observations
131
What are free parameters?
Flexibly adjusted until the difference between model estimated values and data is minimised
132
What are fixed parameters?
Set to specific, meaningful values that are invariant when fitting the model
133
What are the three popular model fitting frameworks?
1. Least squares estimation
2. Maximum likelihood estimation
3. Bayesian estimation
134
Summarise least squares estimation
Minimise the squared discrepancy between observations and predictions
135
Summarise maximum likelihood estimation
Find the parameter values that give the highest likelihood of the observed data
136
How does LSE work?
1. Fit a linear regression
2. Find the parameters that minimise the discrepancy function - via an optimisation algorithm
137
What is the Nelder Mead Simplex optimisation algorithm?
1. Compute the discrepancy for the starting values
2. Tumble down the error surface until it reaches its minimum
138
How does tumbling down the error surface work in Nelder Mead Simplex?
1. Reflection - remove the point with the largest discrepancy and flip it to the opposite side
2. Expansion - if reflection works, extend the flipped point further to take a larger step down
3. Contraction - if it didn't work, move the worst-fitting point towards the centre
4. Shrinking - if contraction fails, shrink the simplex by half towards the minimum
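Worked example: a sketch of least squares fitting with a Nelder-Mead optimiser via scipy (the data and linear model are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical data: a noisy linear relationship with slope 2 and intercept 1
x = np.linspace(0, 10, 50)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=50)

def sse(params):
    """Discrepancy function: sum of squared errors of a linear model."""
    slope, intercept = params
    return np.sum((y - (slope * x + intercept)) ** 2)

# Nelder-Mead tumbles down the error surface from the starting values
fit = minimize(sse, x0=[0.0, 0.0], method="Nelder-Mead")
print(fit.x)  # estimated slope and intercept, near [2, 1]
```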
139
How can we trust that our obtained parameters reflect the reality of the data?
Bootstrapping
140
What is bootstrapping?
Provides an indication of the variability around the model parameter estimates, by repeatedly sampling
141
What is parametric resampling?
1. Fit the model to the experimental data set
2. Simulate multiple data samples by running the model with the originally estimated parameters
3. Fit the model to the simulated samples
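A sketch of the parametric resampling loop; fit_model and simulate_model are hypothetical placeholders for a model's own fitting and simulation routines:

```python
import numpy as np

def parametric_bootstrap(data, fit_model, simulate_model, n_boot=1000):
    """Parametric resampling: refit the model to data simulated from itself."""
    params = fit_model(data)                      # 1. fit to the real data
    estimates = []
    for _ in range(n_boot):
        fake = simulate_model(params, len(data))  # 2. simulate with fitted params
        estimates.append(fit_model(fake))         # 3. refit to the simulated sample
    return np.array(estimates)                    # spread = variability of estimates
```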
142
Drawbacks of LSE
1. No known statistical properties
2. Cannot statistically compare models
3. Parameter estimates have no inherent statistical properties
143
What is probability?
Chance of the data given the model
144
What is likelihood?
Chance of the model given the data
145
What do probability functions measure?
The probability of all possible events predicted by the model
146
What probability function do we use for discrete events?
Probability mass function
147
What probability function do we use for continuous data?
1. Cumulative distribution function (CDF)
2. Probability density function (PDF) - the derivative of the CDF
148
How does MLE work?
Maximise the likelihood function so that the observations are most likely - find the highest peak
149
What are the steps in MLE?
1. Log-transform the likelihood function (makes it a nicer curve to handle)
2. Express it as deviance (flip it upside down)
3. Use a minimising optimiser such as Nelder-Mead Simplex on the sign-reversed log-likelihood
150
151
Why do we log transform in MLE? L -> log L
Easier interpretation and handling of the probability functions, and combining multiple observations becomes a sum instead of a product
152
Why do we express it as deviance in MLE? ln L -> -2 ln L
Easier assessment of model fit and model comparison (higher deviance - worse fit)
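Worked example: a sketch of MLE via deviance minimisation for a simple Gaussian model (scipy assumed; the data are made up):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=100)  # hypothetical observations

def deviance(params):
    """-2 ln L for a Gaussian model, so a minimiser can maximise likelihood."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf  # keep the optimiser in the valid parameter region
    return -2.0 * np.sum(norm.logpdf(data, loc=mu, scale=sigma))

fit = minimize(deviance, x0=[0.0, 1.0], method="Nelder-Mead")
print(fit.x)    # ML estimates of mu and sigma, near [5, 2]
print(fit.fun)  # deviance at the maximum likelihood solution
```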
153
What properties should a good model have (with regards to fitting)?
Be flexible - fit different patterns of data
Not overfit - not fit just any data
154
What determines model flexibility?
1. Number of free parameters
2. Functional form of the model
3. Extension of the parameter space
155
How does number of free parameters affect model flexibility?
More free parameters result in better fit
156
How does functional form of the model affect model flexibility?
Some models produce a wider variety of patterns based on their parameter values
157
How does extension of the parameter space affect model flexibility?
Bounds placed on the parameters can decrease model flexibility
158
How can we find the best and simplest model?
Nested models - likelihood ratio test
Non-nested models - Akaike information criterion
159
What is a nested model?
Simpler models (fewer parameters) vs same model with more complexity (more parameters)
160
What is a non nested model?
Different models with same complexity (same number of parameters)
161
How does likelihood ratio test work?
L_specific / L_general; as we use the deviance, the division becomes a subtraction
162
How does Akaike Information Criteria work?
AIC = -2 ln L + 2K
The likelihood of the model penalised by its complexity (K: number of parameters)
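Worked example: a sketch of both comparisons from fitted deviances (the deviance values and parameter counts are made up; scipy assumed for the chi-square test):

```python
from scipy.stats import chi2

# Deviances (-2 ln L) of two fitted models (hypothetical values)
dev_specific, k_specific = 412.0, 3  # simpler (nested) model
dev_general, k_general = 404.5, 5    # same model with two extra free parameters

# Likelihood ratio test: on the deviance scale the ratio becomes a difference
lr_stat = dev_specific - dev_general
p_value = chi2.sf(lr_stat, df=k_general - k_specific)
print(lr_stat, p_value)  # a small p favours keeping the extra parameters

# AIC for non-nested comparison: deviance plus the 2K complexity penalty
print(dev_specific + 2 * k_specific)  # lower AIC is preferred
print(dev_general + 2 * k_general)
```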
163
What is Visual Working Memory?
Active maintenance of visual information to serve the needs of ongoing tasks
164
What are examples of paradigms for testing VWM?
1. Change Detection Paradigm
2. Continuous Reproduction Paradigm
165
What is the Change Detection Paradigm?
1. Show an image briefly
2. Gap
3. Show another image and judge whether the two are the same or different
166
How does the Continuous Reproduction Paradigm work?
Coloured boxes are randomly arranged; report which colour one of them was, using a colour wheel
167
What is the capacity of VWM?
3-5 chunks of information at a time
168
What do we assume when modelling VWM?
The VWM capacity limit arises from a limited resource that is in some way distributed across items
169
Explain the Discrete Slots Model (Zhang and Luck 2008)
1. The resource is allocated to a limited number of discrete representations (slots)
2. No information is stored about additional items once the capacity limit is reached
170
Explain the Continuous Resource Model (Bays et al 2009)
1. Equal allocation of a continuous resource across all items
2. Less resource per item for larger set sizes
3. Representations lose precision as set size increases
171
When are mixture models useful?
When data are assumed to result from a mixture of two or more distributions, each representing different processes or populations
172
What are the distributions in the Two Components Mixture Model (Standard Model)?
1. Noisy target representation - von Mises 2. Random guessing - Uniform
173
What did the two components mixture model show?
Items exceeding maximum VWM capacity are not maintained
174
What are the distributions in the Three Components Mixture Model (Swap Model)?
1. Noisy target representation - von Mises 2. Random guessing - Uniform 3. Noisy non-target representation - von Mises
175
What did the three components mixture model show?
The precision of the representation of items decreases with increasing set size
176
What is a von Mises distribution?
A circular normal distribution
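A sketch of the three-component (swap) mixture likelihood from the cards above, using scipy's von Mises distribution; the function and argument names are illustrative, not the original authors' code:

```python
import numpy as np
from scipy.stats import vonmises

def swap_model_loglik(target_errors, nontarget_errors, p_t, p_n, kappa):
    """Log-likelihood of the three-component mixture model.

    target_errors    : response errors relative to the target (radians)
    nontarget_errors : (n_nontargets, n_trials) errors relative to non-targets
    p_t, p_n         : probabilities of reporting the target / a non-target
    kappa            : precision of the von Mises (circular normal) components
    """
    p_guess = 1.0 - p_t - p_n
    like = (p_t * vonmises.pdf(target_errors, kappa)
            + p_n * vonmises.pdf(nontarget_errors, kappa).mean(axis=0)
            + p_guess / (2 * np.pi))  # uniform guessing on the circle
    return np.sum(np.log(like))
```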
177
What are Explanatory Models?
- Fit models simultaneously to multiple conditions to explain differences between these conditions with a common set of constrained parameters
- Not particularly interested in the parameter values themselves
178
What are Measurement Models?
- Fit the model separately to different experimental conditions to assess how the conditions affect latent constructs, with most or all parameters varying freely
- Very interested in the parameter values
179
How can we interpret the mixture models as measurement models?
Capacity = pt
Precision = k
Swapping = pn
180
What is capacity / pt interpreted as?
The quantity of information that can be held
181
What is the precision / k interpreted as?
How well is information remembered
182
What is swapping / pn interpreted as?
Chance of swapping
183
Why is the swapping term quite relevant?
There is an idea that what makes working memory so special is that we are able to remember things at specific locations (colour and location)
184
What are some conclusions from VWM
1. Training improves precision, not capacity
2. Improvements in precision are highly stimulus and paradigm specific
3. Changes in response patterns suggest possible interactions between paradigms
185
How have cognitive models helped our understanding of how VWM develops?
Helped us understand the lack of changes
186
What are 5 researcher degrees of freedom?
1. Deciding when to stop collecting data
2. Excluding or including participants post hoc
3. Transforming variables or selecting among multiple outcomes
4. Trying different statistical methods until results are significant
5. Selectively reporting results
187
What are the four terms for increasing research quality and trust (analysis on the y axis, data on the x axis)?
1. Reproducible
2. Replicable
3. Robust
4. Generalisable
188
What is reproducible?
Same data, same analysis yields same result
189
What is replicable?
Different data, same analysis yields same result
190
What is robust?
Same data, different analysis yields same result
191
What is generalisable?
Different data, different analysis gives similar result
192
What are 6 open research principles?
1. Preregistration 2. Open materials and methods 3. Open data 4. Open source software 5. Open source code 6. Open access publications
193
What are 5 good modelling practices?
1. Keep a model logbook
2. Parameter recovery studies
3. Sensitivity (robustness and generalisation) studies
4. Quantify uncertainty in parameter estimates
5. Share model code, data and scripts
194
What should we be aware of in good modelling?
1. Preregistration may not always be useful or possible
2. Modelling is an iterative process and requires exploration
195
What should we do when sharing code?
Make it understandable and reproducible: comments, notebooks, change log
196
What is preregistration?
Publicly registering your research plan, including hypotheses, methods, and analysis plans, before data collection begins