DEFINITIONS Flashcards

(50 cards)

1
Q

what is learning

A

ability to adapt to new situations
implicit and explicit (does not require motivation - only experience of an error in judgement)
coding PE - discrepancy between what you think you know and what is true in the moment

2
Q

components of learning

A

1 learning about reward and punishment
2 selecting action goals
3 actions to obtain reward
4 monitoring the potential value of switching to an alt course of action

3
Q

components of learning

  1. learning about reward and punishment
A

associative learning
perform for reward
do not perform to avoid punishment

4
Q

components of learning

  2. selecting action goals
A

assign value to diff goals and make decisions in accordance with goal of highest valued outcome
beh as related to goals

reliant on temporal discounting

5
Q

components of learning

  3. actions to obtain reward
A

map actions to goals - choose the actions that reach the most valuable goal outcome most efficiently

6
Q

components of learning

  4. monitoring value of switching action to alt
A

exploitation vs exploration trade offs
are there prev unconsidered shortcuts which might obtain reward more easily?
- uncertain routes may have better outcome

counterfactual thinking and social learning - determining if switching is a good idea
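
One common way to formalise the exploitation vs exploration trade off is an epsilon-greedy rule. This is a minimal sketch, not something the card specifies: the function name `epsilon_greedy`, the parameter `epsilon` and the example values are all illustrative assumptions.

```python
import random

def epsilon_greedy(values, epsilon, rng):
    """Pick an action index (hypothetical helper, not from the card).

    With probability epsilon: explore, i.e. try any route at random.
    Otherwise: exploit, i.e. take the route with the highest known value.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])

rng = random.Random(0)          # fixed seed so the run is repeatable
values = [0.2, 0.8, 0.5]        # assumed current value estimates of three routes
choices = [epsilon_greedy(values, 0.1, rng) for _ in range(1000)]
exploit_rate = choices.count(1) / len(choices)  # share of trials on the best-known route
```

A larger `epsilon` means more time spent on uncertain routes, which may reveal a better outcome at the cost of short-term reward.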

7
Q

define decision making

A

awareness of available alternatives and assigning value to each - which routes lead to which outcomes

8
Q

define choice behaviours

A

actions assoc with the choice of a specific alternative

not decision making itself - a consequence of it

9
Q

define normative theories

A

agent makes decision about the utility of an outcome based on likelihood and value
opt for best choice in idealised context that maximises the utility
**flawed - agents make mistakes and don't know everything about the outcome
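
The normative idea of utility as likelihood times value can be sketched in a few lines. The option names and numbers below are made-up assumptions for illustration only.

```python
def expected_utility(probability, value):
    """Utility of an outcome = how likely it is x how much it is worth."""
    return probability * value

# Hypothetical options: name -> (probability of reward, value of reward)
options = {"safe": (1.0, 5.0), "risky": (0.5, 12.0)}

# The idealised agent simply picks the option with the highest expected utility.
best = max(options, key=lambda name: expected_utility(*options[name]))
```

Here the idealised agent picks "risky" (0.5 x 12 = 6 beats 1.0 x 5 = 5), which presumes it knows the true probabilities and values, exactly the assumption the card flags as flawed.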

10
Q

define descriptive theories

A

actions chosen probabilistically based on value function and updated on the basis of its outcome (RPE)
more trial and error
observe beh and infer decision making process
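
Probabilistic choice based on a value function is often modelled with a softmax rule. This is a sketch under that assumption: the card does not name a choice rule, and `beta` (an inverse temperature) and the example values are illustrative.

```python
import math

def softmax_choice_probs(values, beta=1.0):
    """Turn action values into choice probabilities (assumed softmax rule).

    beta controls how deterministic choice is: beta near 0 gives almost
    random choice, large beta gives almost always the highest-valued action.
    """
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax_choice_probs([1.0, 2.0, 0.5], beta=2.0)
```

When fitting such a model to observed behaviour, `beta` is one of the latent parameters that gets inferred from the choices.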

11
Q

factors that influence decision making

A

lifetime - alters perception of what is valuable
pathology - decisional disorder - not mapped to reality
momentary fluctuations

12
Q

define reinforcement learning models

A

expand theory into tractable parameters

allow us to measure and quantify latent (theoretical) parameters of behaviour

13
Q

reinforcement learning theory

A

expands utility - understanding of actions and outcomes is probabilistic
use experience and feedback over time - confirm or disconfirm expectancies and update

expectancy drives action, experience drives value updating

based on model free learning

14
Q

define value function

A

estimate of the sum of the future rewards

all reward accumulated long term (LT) - maximised over time
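
A sum of future rewards is usually computed with a discount factor so that later rewards count for less. The sketch below assumes that convention: `gamma` and the reward sequence are illustrative, not taken from the card.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of future rewards, each step further away weighted by gamma.

    gamma = 1.0 gives the plain sum; smaller gamma values weight
    immediate reward more heavily (assumed parameter, not from the card).
    """
    total = 0.0
    for r in reversed(rewards):   # work backwards from the last reward
        total = r + gamma * total
    return total

plain_sum = discounted_return([1.0, 1.0, 1.0], gamma=1.0)
discounted = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

The value function is then the agent's estimate of this quantity, whereas the reward function only concerns the first term.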

15
Q

define reward function

A

estimate of the immediate intrinsic value of rewards

16
Q

define state value function

A

sum of future expected rewards
based on:
animal's state ie satiety

17
Q

define action value function

A

sum of future expected rewards within an environmental state following an action

18
Q

define reward prediction error RPE

A

difference between the actual reward and the reward expected on the basis of current value functions
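
As a small worked example, the RPE and the value update it drives might look like this. The learning rate `alpha` and the reward sequence are assumptions chosen for illustration.

```python
def reward_prediction_error(actual, expected):
    """RPE: positive if better than expected, negative if worse."""
    return actual - expected

def update_value(expected, actual, alpha=0.5):
    """Move the value estimate a fraction alpha towards the outcome (alpha is assumed)."""
    return expected + alpha * reward_prediction_error(actual, expected)

v = 0.0            # start with no expectation of reward
history = []       # RPE on each trial
for reward in [1.0, 1.0, 0.0]:   # two rewarded trials, then an omitted reward
    history.append(reward_prediction_error(reward, v))
    v = update_value(v, reward)
```

The RPE shrinks as the reward becomes expected (1.0, then 0.5) and turns negative (-0.75) when the expected reward is omitted.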

19
Q

define rescorla wagner

A

the amount of learning (the change ∆ in the predictive value of a stimulus, V) depends on the amount of surprise (the difference between what actually happens, λ, and what you expect, ΣV)
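
Written as an equation, the rule is usually given in the form below. The card only names ∆V, λ and ΣV; α and β (stimulus salience and a learning rate tied to the outcome) come from the standard statement of the model and are included here as an assumption about what the course uses.

```latex
\Delta V = \alpha \beta \, (\lambda - \Sigma V)
```

When λ = ΣV the outcome is fully predicted, the surprise term is zero and no further learning occurs.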

20
Q

define model free reinforcement learning

A

direct experience with reward/penalty
decisions based on VF updating following PE
determined by certainty of the variable and action familiarity

21
Q

define model based reinforcement learning

A

model of the world - preference about the inexperienced but hypothesised world

make informed decisions without trial and error
- use motivational states and higher order input (ie social) without direct reward/penalty

adjust VF when new info about internal/external environment - avoid relearning based on experienced irregularities in RPE

22
Q

model free accommodation of model based learning

A

DOLL ET AL
accommodate simplified model based:
generalise learning from one state to another without additional experience if states overlap
update without direct experience of the action assoc with the devalued reward - alter rep of reward and thus reduce beh
ie reversal learning:
train lever = food
lever press > no reward? - update VF and decrement value of action as outcome
- lever overlaps with food presentation therefore devalued food = devalued lever press

23
Q

doll et al - model based learning in lever task - motivations

A

rat integrates model of the world into its motivations
ie temp discounting or maze task

choice will be based on how hungry and/or how impulsive the rat is

24
Q

define counterfactual thinking

A

ability to imagine hypothetical outcomes
what could have been -
when you don't know the VF or which decision to make
generate fictitious PE for unencountered outcomes - experience of regret - use the value of actions not chosen to alter future choice

25
define metacognition
awareness and understanding of one's own thought processes | ability to reflect on success and failure and act accordingly
26
define the striatum
forebrain (subcortical) grey matter | part of basal ganglia | dorsal/ventral subdivisions; distinct caudate/putamen in higher order mammals (primates), fused caudoputamen in lower order (rodents) | LDM, higher order cog
27
DLS pathway
DLS to VLGPi to VL thall to sensorimotor cortices to DLS
28
DMS pathway
DMS to DMGPi to VA/DM thall to PF/PARIETAL cortices
29
VLS pathway
VLS to DM/DL VP to Thall to Motor cortex/mOFC/ACC
30
role of ventral striatum
reward processing and motivation | NAcc - shell and core | input from amyg and hipp (emotion, memory)
31
define substantia nigra (SN)/VTA
DA cells project DA terminals to striatal subdivisions of caudate/putamen and PFC | key for learning and attention
32
connection between cortex and striatum
cortex funnels connections into the striatum | topographically segregated connections between ventral and dorsal areas, closely related to function | reward: VM (OFC/ACC) - VS (NAcc) | exec control: (dlPFC/PPC) - DS/VS (dlCAUD) | motor: (SMA/PMC) - DLS (PUT) | graded function from hot cog (limbic/reward processing) to cold cog (exec, motor and action orientation)
33
DOPAMINE IN LDM
monoamine neurotransmitter widely distributed in brain | innervates circuits: nigrostriatal (ascending VS-DS via SN), mesocorticolimbic (VTA - cortex) | modulates glutamatergic signals on balance of actions in striatum: D1/D2 receptors alter glutamate function of MSNs | DA codes PE - when reward unexpected (pos) or omitted (neg) (Schultz)
34
glutamate
excitatory | primary neurotrans in cortex
35
gaba
inhibitory | primary in MSNs
36
MSNs
medium spiny neurons | primary neurons in the striatum | inhibitory
37
direct path
D1 | glutamate from cortex to striatum, via GPi | excitatory - disinhibition of the thalamus
38
indirect path
D2 | divert cortex to striatum via GPe+STN | inhibitory - heightened inhibition of the thalamus
39
define optogenetics (inLDM)
Steinberg et al | genetically expressed photosensitive proteins in VTA/midbrain DA cells, activated by light via an optical fibre | investigate causal relation between neuronal firing and behaviour
40
PE+
pos prediction error | unexpected reward or greater than expected
41
PE-
neg prediction error | omitted reward or reward less than expected
42
describe Blocking
blocking effect: the conditioning of an association between two stimuli, a conditioned stimulus (CS) and an unconditioned stimulus (US), is impaired if, during the conditioning process, the CS is presented together with a second CS that has already been associated with the unconditioned stimulus.
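Blocking falls out of a shared prediction error, which a short Rescorla-Wagner style simulation can illustrate. This is a sketch under assumed settings: the stimulus names A and B, the learning rate `alpha`, and the trial counts are all made up for the example.

```python
def rescorla_wagner_trial(V, present, lam, alpha=0.3):
    """One conditioning trial: all stimuli present share one prediction error."""
    total = sum(V[cs] for cs in present)   # combined expectation, sigma V
    delta = lam - total                    # surprise: what happened minus what was expected
    for cs in present:
        V[cs] += alpha * delta
    return delta

# Blocking group: A is trained alone first, then A and B are presented together.
V = {"A": 0.0, "B": 0.0}
for _ in range(50):
    rescorla_wagner_trial(V, ["A"], lam=1.0)
for _ in range(50):
    rescorla_wagner_trial(V, ["A", "B"], lam=1.0)

# Control group: A and B are only ever presented together.
control = {"A": 0.0, "B": 0.0}
for _ in range(50):
    rescorla_wagner_trial(control, ["A", "B"], lam=1.0)
```

In the blocking group, A already predicts the US, so there is almost no surprise left and B gains almost no value; in the control group A and B end up sharing the value between them.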
43
describe extinction learning
gradual decrease in response to a conditioned stimulus that occurs when the stimulus is presented without reinforcement
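The gradual decline can be sketched as repeated negative prediction errors pulling the value of the stimulus towards zero. This is a simplified illustration only: `alpha`, the starting value and the number of trials are assumptions.

```python
def extinction_curve(v_start=1.0, alpha=0.2, n_trials=10):
    """Value of a conditioned stimulus across unreinforced presentations."""
    v = v_start
    curve = [v]
    for _ in range(n_trials):
        v += alpha * (0.0 - v)   # no reinforcement: outcome is 0, so the PE is negative
        curve.append(v)
    return curve

curve = extinction_curve()
```

Each unreinforced trial removes a fixed fraction of the remaining value, so responding falls quickly at first and then levels off.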
44
point of fixed/random interval reinforcement schedules
allows rat to experience uncertainty and delay discount rewards as a function of their delay - subj value reduces the more delayed in time the receipt is | makes beh LESS a-o oriented
45
possible reward devaluation procedures
pair with aversive consequence ie lithium chloride (Yin et al) | satiety (Faure)
46
why reward devaluation procedures
see if a behaviour has become habitual - won't respond to a-o contingency | still respond to antecedent stimulus despite the reward's devaluation
47
define adolescence
transitional phase characterised by physiological (ie hormones, brain structure), cog (ie abstract reasoning/temporal foresight, emotional reactivity, risk taking + novelty seeking) and social change (parent conflict, peer assoc)
48
define temporal bridging
being able to recognise and wait for delayed consequences of actions | requires temporal foresight/mental time travel | in temporal discounting: assess the larger future gain against the smaller immediate gain
49
define temporal discounting
the degree to which a reward is discounted in relation to its temporal delay, i.e. the subjective value of the temporal delay in terms of reward | requires temporal bridging via temporal foresight
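Temporal discounting is often modelled with a hyperbolic function. The sketch below assumes that form, which the card does not specify: the discount rate `k`, the amounts and the delays are illustrative.

```python
def hyperbolic_value(amount, delay, k=0.1):
    """Subjective value of a delayed reward (assumed hyperbolic form).

    k is the discount rate: a larger k means steeper discounting.
    """
    return amount / (1.0 + k * delay)

def prefers_later(small_now, large_later, delay, k):
    """True if the larger, delayed reward is still worth more than the immediate one."""
    return hyperbolic_value(large_later, delay, k) > small_now

patient = prefers_later(50.0, 100.0, delay=30, k=0.01)    # shallow discounting
impulsive = prefers_later(50.0, 100.0, delay=30, k=0.1)   # steep discounting
```

With the shallow rate the delayed 100 keeps a subjective value of about 77 and is chosen; with the steep rate it drops to 25 and the immediate 50 wins.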
50
define impulsivity
broad construct | 'present boundedness'/'poor futuring' - poor temp discount | risk taking and sensation seeking - poor consideration of the negative future consequences | reduced persistence/resistance to delayed rewards: steeper temporal discounting - shorter tolerance of temporal delays and enlarged subjective perception of time | assoc with deficits in motor, attention, reward decision making, timing