Chapter 6: Reinforcement and Choice Flashcards

1
Q

Intrinsic reinforcer

A

reinforcing value is obtained while engaging in the behaviour itself (the behaviour is intrinsically motivating)

ex. social contact, exercise

2
Q

Extrinsic reinforcer

A

things that are provided as a result of the behaviour, to encourage more of that behaviour in the future

ex. reading in children
the only way to teach kids to read is to get them to read, and this usually involves enticing them with social reinforcement (saying “good job”) or other kinds of external reinforcement

3
Q

More reward does not always mean more ____________. Why?

A

Reinforcement

ex. workers offered a bonus for making more parts (i.e., over 50)
performance differed only between the group that got no bonus and the groups that did; the groups that received different bonus amounts did not differ from each other, so all bonus sizes were equally reinforcing

4
Q

aversives can __________ behaviour

A

reinforce

the aversiveness drives the behaviour!

5
Q

Continuous Reinforcement

A

Behaviour is reinforced every time it occurs

6
Q

Ratio Schedules

A

Reinforcer is given after the animal makes the required number of responses

7
Q

Fixed ratio (FR):

A

fixed ratio between the number of responses made and reinforcers delivered (e.g., FR 10)
• Key elements: post-reinforcement pause, ratio run, and ratio strain (if the requirement is raised too quickly, e.g., from 1 peck to 100 pecks, subjects tend to stop responding)

see graph slide 9

8
Q

Cumulative Record

A

Based on the old cumulative recorder device (constant paper output, pen jumps with each response)
shows the rate of responding across time!

9
Q

Variable Ratio (VR):

A

A different number of responses is required for the delivery of each reinforcer

The schedule value is the average number of responses made to receive a reinforcer (e.g., VR 5)

Responding is based on the average and the minimum of the VR

ex. gambling

see graph slide 19 (steep slope: responding at a high rate, with no post-reinforcement pause!)

10
Q

Interval Schedules

A

Responses are only reinforced if the response occurs after a certain time interval.

11
Q

Fixed interval (FI):

A

a response is reinforced only if it occurs after a set amount of time has elapsed (responses during the interval don’t matter)

Key elements: fixed interval scallop, limited hold

i.e., the subject has to wait 10 secs; once the 10 secs have elapsed, the first peck at the key gains the reward!
ex. cramming before tests

see graph slide 16 (low rates of responding, scalloped responding, post-reinforcement pause!)

12
Q

Variable interval (VI):

A

responses are reinforced if they occur after a variable interval of time

see graph slide 19
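
The schedules above are just different rules for when a response earns a reinforcer. A minimal Python sketch of the two fixed rules (function names and example numbers are illustrative, not from the slides; the variable versions, VR and VI, would simply draw each requirement or interval at random around a mean):

```python
def run_fr(schedule_n, responses):
    """FR: a reinforcer follows every schedule_n-th response."""
    reinforcers = 0
    count = 0
    for _ in range(responses):
        count += 1
        if count == schedule_n:  # required number of responses reached
            reinforcers += 1
            count = 0
    return reinforcers

def run_fi(interval_s, response_times):
    """FI: only the first response after interval_s has elapsed is
    reinforced; responses made during the interval have no effect."""
    reinforcers = 0
    available_at = interval_s
    for t in response_times:
        if t >= available_at:  # the interval has elapsed
            reinforcers += 1
            available_at = t + interval_s
    return reinforcers

print(run_fr(10, 100))                 # FR 10: 100 responses -> 10 reinforcers
print(run_fi(10, [1, 3, 12, 14, 25]))  # FI 10 s: reinforced at t=12 and t=25 -> 2
```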

13
Q

Reynolds 1975

Ratio and Interval Schedules Compared

A

• Compared rates of key pecking of pigeons on VR and VI schedules
• Opportunities for reinforcement were made identical for each bird
• The VI bird could receive reward when the VR bird was within one response of its reward

With equivalent rates of reinforcement, variable ratio schedules produce a higher rate of responding than variable interval schedules

14
Q

Variable schedules produce _________ responding compared to Fixed

A

Variable schedules produce steadier responding compared to Fixed

fixed = post-reinforcement pause

15
Q

Ratio schedules produce ________ of responding than Interval

A

Ratio schedules produce higher rates of responding than Interval

16
Q

Source of Differences Between Ratio and Interval Schedules:

Differential reinforcement of Inter-response times

A

Ratio schedules reinforce shorter IRTs (responding quickly reaches the ratio requirement sooner)

Interval schedules reinforce longer IRTs (the longer the pause, the more likely the interval has elapsed when the response occurs)

17
Q

Source of Differences Between Ratio and Interval Schedules: Feedback function

A
More feedback (reinforcement) comes with more responding on Ratio schedules; not so for Interval schedules (different jobs differ on this aspect)
18
Q

Intermittent Schedules

A

Fewer reinforcers needed
More resistant to extinction

Variable (VR/VI) schedules are especially resistant to extinction

19
Q

Differential reinforcement of high rates (DRH)

A

Minimum number of responses per interval (responding must be fast enough to be reinforced)

20
Q

Differential reinforcement of low rates (DRL)

A

Maximum number of responses per interval (responding must be slow enough to be reinforced)

used to reduce a behaviour you want to occur less often without eliminating it entirely

21
Q

Differential reinforcement of paced rates (DRP)

A

Combines a minimum number of responses per interval (as in DRH) with a maximum (as in DRL): responding is reinforced only when its rate falls within the paced range
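
A quick illustrative check of the three rate-based criteria above (hypothetical function names; assumes responses are simply counted per interval):

```python
def drh_met(count, minimum):
    """DRH: reinforce only if at least `minimum` responses occurred."""
    return count >= minimum

def drl_met(count, maximum):
    """DRL: reinforce only if at most `maximum` responses occurred."""
    return count <= maximum

def drp_met(count, minimum, maximum):
    """DRP: reinforce only if the count falls within the paced range."""
    return minimum <= count <= maximum

# 7 responses in the interval:
print(drh_met(7, 5))      # True  (fast enough)
print(drl_met(7, 3))      # False (too fast)
print(drp_met(7, 5, 10))  # True  (within the paced range)
```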

22
Q

Duration Schedules:

A

response must be made continuously for a period of time

23
Q

Complex schedules:

A

Conjunctive schedules, Adjusting schedules, Chained schedules…

24
Q

Noncontingent Schedules: Fixed time (FT)

A

Reinforcer occurs following a predictable amount of time, regardless of behaviour

25
Q

Choice

A

Usually considered as a cognitive deliberation
- Here, measured based on the effect of different, concurrent payoff schedules

With true and fickle “freedom of choice”, choices would be unpredictable
- Understanding choices in terms of consequences allows for prediction with concurrent schedules!
26
Q

Herrnstein, 1961 | matching law

A

Would the animal be able to figure out how to respond based on how much food it gets?
Measurement of choice using concurrent schedules
27
Q

Measures of choice

A

1. Relative rate of responding (behaviour): BL/(BL + BR)
BL = rate of responding to the left choice
BR = rate of responding to the right choice
(BL + BR) = total responding

2. Relative rate of reinforcement: RL/(RL + RR)
28
Q

Matching law

A

Herrnstein, 1961: the proportion of responding (choice) is equal to the proportion of reinforcement for doing so. There is a correlation between behaviour and the environment.

Relative rates of responding match relative rates of reinforcement:

BL/(BL + BR) = RL/(RL + RR)
OR
BL/BR = RL/RR

The two forms are equivalent; this equality is called the matching law.
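
A worked numeric check of the matching law (the response and reinforcement rates below are made-up illustration values, not data from the slides):

```python
def relative_rate(left, right):
    """Proportion of the total allocated to the left alternative."""
    return left / (left + right)

# Hypothetical concurrent-schedule data:
B_L, B_R = 60, 20   # responses/min on the left and right keys
R_L, R_R = 30, 10   # reinforcers/hr earned on the left and right

print(relative_rate(B_L, B_R))  # 0.75 -- relative rate of responding
print(relative_rate(R_L, R_R))  # 0.75 -- relative rate of reinforcement
print(B_L / B_R, R_L / R_R)     # 3.0 3.0 -- equivalent ratio form
```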
29
Q

Basketball matching

A

26 players on a large university basketball team
Relative choice of different shot types = relative rate of reinforcement (baskets made)
Resembles a VR schedule, because you may need to make some shots before taking a 3-point one
30
Q

BL/BR = b(rL/rR)^s | real matching law

A

The “real” (generalized) matching law
b = bias
s = sensitivity

Perfect matching: s = 1
• Undermatching: s < 1
• Overmatching: s > 1
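
In practice, b and s are estimated by taking logs of the ratio form, log(BL/BR) = log b + s·log(RL/RR), and fitting a straight line. A minimal sketch with made-up session data chosen to match perfectly, so it should recover s = 1 and b = 1:

```python
import math

# Hypothetical per-condition data: (B_L, B_R, R_L, R_R)
sessions = [(80, 20, 40, 10), (60, 40, 30, 20), (30, 70, 15, 35)]

xs = [math.log(rl / rr) for _, _, rl, rr in sessions]  # log reinforcement ratios
ys = [math.log(bl / br) for bl, br, _, _ in sessions]  # log response ratios

# Least-squares slope (sensitivity s) and intercept (log of bias b)
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
s = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = math.exp(my - s * mx)
print(f"s = {s:.2f}, b = {b:.2f}")  # s = 1.00, b = 1.00
```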
31
Q

Undermatching

A

s < 1
– Decreased sensitivity to rates of reinforcement
– Responding is distributed more evenly across the options than the reinforcement rates predict
32
Q

Overmatching

A

s > 1
– Increased sensitivity to rates of reinforcement
– “Stick to the best option”
– Common with a high cost of switching (e.g., a long changeover delay)

The animal responds mostly to the best option and will rarely sample from options that pay off at lower rates
33
Q

Response Bias

A

Important when there is a difference between the operant behaviours
- Commonly: side bias

Important when there is a choice between reinforcers or responses
- Biological predispositions
- Quality
34
Q

Matching Law and Simple Schedules

A

Rate of the operant (Bx) and rate of other (Bo) activities:

Bx/(Bx + Bo) = rx/(rx + ro)
35
Q

Matching law describes ________, but does not ________

A

The matching law describes the behaviour, but does not explain it
36
Q

Maximizing theories

A

Organisms distribute their behaviour so as to obtain the maximum amount of reinforcement over time
- Explains choice on ratio schedules
- Doesn’t always hold
37
Q

Melioration theories

A

Making the situation “better” than the recent past
- Change from one alternative to the next to improve the local rate of reinforcement
- Animals respond so that the local rate is the same on each alternative

Predicted issue: behaviour is strongly controlled by immediate consequences
38
Q

Serial Reversal Learning

A

Over many reversals, re-acquisition speeds up
- Compare re-acquisition with initial acquisition across the number of reversals
- Behavioural flexibility
39
Q

Mid-session Reversal

A

With a reversal part-way through the session:
- Perseverative errors
- Anticipatory errors
- Errors shift with changes in time
- Pigeons predict the reversal based on time
- Self-control?
40
Q

Concurrent Chain Schedules

A

• Method to determine choice – e.g., whether variety is preferred
• Different from concurrent schedules, since animals are not free to switch
• Able to investigate choice with commitment
41
Q

Self-Control

A

Commonly used to mean “willpower”
- Circular logic
- Describes the outcome, not the process

Better described as:
- Choice of impulsive vs. delayed options
42
Q

Temporal Self-Control

A

Choose: smaller-sooner reward (SS) vs. larger-later reward (LL)
• Self-control vs. impulsivity
43
Q

Waiting in Animals

A

Different species tolerate delays differently
e.g., the time they will wait for a threefold increase in reward
- Temporal self-control
- Chimps are very good at this!
44
Q

Waiting in Humans (Rosati et al., 2007)

A

- Temporal self-control
- Chimps are better than humans at waiting (except when the reward is money)
45
Q

Rachlin & Green (1972)

A

• In Phase 1, with no delay, pigeons chose the small reward
• In Phase 2, pigeons chose the large, delayed reward when T (the delay between the initial and terminal links) was increased
46
Q

Delay-discounting

A

Value discounting function: value (V) is directly related to magnitude (M) and inversely related to delay (D):

V = M/(1 + KD)

V = value of a reinforcer
M = magnitude
K = discounting parameter
D = delay
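
The hyperbolic form makes the preference reversal seen in Rachlin & Green (1972) easy to demonstrate. A sketch with made-up magnitudes, delays, and K:

```python
def value(magnitude, delay, k=0.1):
    """Hyperbolic discounting: V = M / (1 + K*D)."""
    return magnitude / (1 + k * delay)

# Smaller-sooner (SS) vs. larger-later (LL):
print(value(2, 0), value(6, 30))   # 2.0 vs. 1.5 -> impulsive SS choice

# Add a common delay T = 10 in front of both options:
print(value(2, 10), value(6, 40))  # 1.0 vs. 1.2 -> preference reverses to LL
```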
47
Q

Madden et al. (1997)

A

Opioid-addicted participants steeply discount delayed money and (especially) heroin

Steep discounting is the tendency to assign much greater value to rewards as they move away from distant temporal horizons and towards the “now”
48
Q

Small-But-Cumulative Effects

A

Malott (1989): each choice of an SS reward over an LL reward has only a small effect
- Builds over time
- Creates difficulty in impulse control
- Requires establishing rules for acceptable vs. unacceptable behaviour
- Relapse handling: dealing with a steady stream of temptations
49
Q

Long-term Effects | delayed gratification

A

Mischel: delayed gratification
Eating the first marshmallow / less-preferred food correlated longitudinally with:
- Lower SAT scores
- Less educational and professional achievement
- Higher rates of drug use
50
Q

Simple Self-Control Methods (Skinner)

A

- Physical restraint
- Deprivation/satiation
- Distraction
- DRO
- Self-reinforcement
- Self-punishment
- Shaping
51
Q

Clinical Implications of Self-Control Issues

A

- ADHD, predominantly hyperactive-impulsive type
- Substance abuse disorders
- Impulsive overeating
- Other impulse-control disorders
- Pathological gambling