What is the difference between classical and operant conditioning?
Pavlovian: reflexive associations between stimuli result in involuntary responses
vs
Operant: consequences of past actions influence future voluntary behaviour
Do behaviours increase or decrease as a result of operant conditioning?
Both, depending on whether the past consequences of that behaviour reinforced or punished it
What is the basic principle of operant conditioning?
Consequences
When would a behaviour tend to be repeated or become more frequent in operant conditioning?
When it results in rewards
What happens to behaviours that result in punishment? 2
They become less frequent or are avoided
What process led the cats to escape from Thorndike’s puzzle box?
Trial and error learning. Random behaviours eventually had an effect.
What did a cat have to do to escape from Thorndike’s puzzle box? 3
Pull a string, step on a platform, and turn a latch on the door
What is the law of effect?
The tendency to perform an action is increased if it is rewarded, weakened if it is not
How does ‘shaping’ teach a new behaviour to animals?
By reinforcing successive approximations of the target behaviour until the full behaviour is performed.
How is operant conditioning at play in the real world, without an experimenter/trainer?
Animals adapt behaviourally to environmental feedback (eg foraging)
What happens if you randomly reward pigeons every 15 seconds? And why do they do this?
They show superstitious behaviour, which is self-perpetuating through reinforcement. The behaviour is just whatever the bird was doing before the reward.
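The self-perpetuating loop can be sketched as a small simulation. This is only an illustration under stated assumptions: the behaviour names, the starting strengths, and the "add 1 to whatever was happening" update rule are hypothetical and not taken from the original experiment.

```python
import random

def simulate_pigeon(seconds=3000, interval=15, seed=0):
    """Reward on a timer, regardless of what the bird is doing."""
    rng = random.Random(seed)
    # Hypothetical behaviours, all equally likely at the start.
    behaviours = ["turn", "peck", "head bob", "wing flap"]
    strength = {b: 1.0 for b in behaviours}
    for t in range(1, seconds + 1):
        # The bird picks a behaviour in proportion to its current strength.
        weights = [strength[b] for b in behaviours]
        current = rng.choices(behaviours, weights=weights)[0]
        # Every `interval` seconds a reward arrives and strengthens
        # whatever the bird happened to be doing (assumed update rule).
        if t % interval == 0:
            strength[current] += 1.0
    return strength

if __name__ == "__main__":
    for behaviour, value in simulate_pigeon().items():
        print(f"{behaviour}: {value}")
```

Because a behaviour that was rewarded by chance becomes more likely to be occurring at the next reward, early accidents tend to snowball into one dominant "ritual".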
What is it called when random reinforcement shapes behaviour?
Superstitious behaviour
What are some examples of superstitious behaviour in humans?
Athlete warm up rituals
Lucky clothes
Lucky charms
Pedestrian crossing buttons
Why do people engage in superstitious behaviour?
We try to find links between behaviour and an outcome, even if there is no true association
What are two things that happen in shaping?
Scan - observing and waiting for behaviour
Capture - reinforce behaviour resembling target behaviour
How does baiting work? And how does it involve Pavlovian conditioning?
Removing primary reward and associating it with another kind of stimulus/indicator
What are the ways we can teach a new behaviour?
Shaping (scan and capture), baiting, mimicking, sculpting, instruction (language)
What is backward chaining? And why does it work?
Acquiring a new behaviour in small pieces, from the last step to the first. It’s easier when done in bits.
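The training order in backward chaining can be sketched as below. The function name `backward_chaining_plan` and the step labels are hypothetical, chosen only for illustration.

```python
def backward_chaining_plan(steps):
    """Return the training stages for a chain of steps.

    Stage 1 teaches only the last step; each later stage adds the step
    before it, so the learner always finishes on an already-learned ending.
    """
    return [steps[i:] for i in range(len(steps) - 1, -1, -1)]

if __name__ == "__main__":
    chain = ["A", "B", "C"]  # hypothetical three-step behaviour chain
    for stage, segment in enumerate(backward_chaining_plan(chain), start=1):
        print(f"Stage {stage}: {' -> '.join(segment)}")
```

For the chain A, B, C the stages are C, then B -> C, then A -> B -> C.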
What are types of reinforcers and punishers, both negative and positive for each?
R+ ice cream
R- fewer chores
P+ shock
P- loss of TV privileges
How are reinforcers and punishers different?
Reinforcer: increases behaviour
Punisher: decreases behaviour
What is the difference between positive and negative reinforcers and punishers for the animal?
Positive: the animal receives something
Negative: something is taken away from the animal (or environment)
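The two distinctions (reinforcer vs punisher, positive vs negative) combine into four quadrants. A minimal sketch, where the function name `classify` and the string labels are assumptions made for this example:

```python
def classify(stimulus_change, behaviour_change):
    """Name the operant conditioning quadrant.

    stimulus_change:  "added" (positive) or "removed" (negative)
    behaviour_change: "increases" (reinforcer) or "decreases" (punisher)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behaviour_change]
    return f"{sign} {kind}"

if __name__ == "__main__":
    # Examples taken from the cards above.
    print(classify("added", "increases"))    # ice cream
    print(classify("removed", "increases"))  # fewer chores
    print(classify("added", "decreases"))    # shock
    print(classify("removed", "decreases"))  # loss of TV privileges
```

The sign describes only what happens to the stimulus; whether it counts as reinforcement or punishment depends only on what happens to the behaviour.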
Is punishment necessarily irritating?
No
Is reinforcement necessarily rewarding?
No
What does positive reinforcement do?
Adds something to increase behaviour
What is an example of positive punishment?
Anti-barking collars or getting told off
If you lose your licence, what is this in terms of operant conditioning?
Negative punishment - remove something to decrease behaviour (eg time out)
What is bridging? And how does it help with learning using positive reinforcement?
A conditioned reinforcer: a useful association between an immediate stimulus and a subsequent reward. This stimulus signals the reward is coming.
It bridges the time between the behaviour and the primary reinforcement when there needs to be no delay.
How are horses trained? 2
Through shaping and negative reinforcement (removing rein pressure)
Why is continuous reinforcement not always possible?
We can’t always be around to deliver it
What types of partial reinforcement can we use? 4
Fixed ratio - every nth response
Variable ratio - on average every nth response
Fixed interval - first behaviour after n seconds
Variable interval - on average, first behaviour after n seconds
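The four schedules can be sketched as small functions that decide whether a response at time `t` (in seconds) is reinforced. This is a hedged illustration, not a standard implementation. The function names are hypothetical, intervals are assumed to be timed from the last reward (or from the session start at t = 0), the variable ratio is modelled as a 1-in-n chance per response, and the variable interval draws its waits from an exponential distribution with mean n.

```python
import random

def fixed_ratio(n):
    """Reinforce every nth response."""
    count = 0
    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False
    return respond

def variable_ratio(n, rng):
    """Reinforce each response with probability 1/n (on average every nth)."""
    def respond(t):
        return rng.random() < 1.0 / n
    return respond

def fixed_interval(n):
    """Reinforce the first response at least n seconds after the last reward."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= n:
            last = t
            return True
        return False
    return respond

def variable_interval(n, rng):
    """Like fixed_interval, but the wait is redrawn (mean n) after each reward."""
    last = 0.0
    wait = rng.expovariate(1.0 / n)
    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / n)
            return True
        return False
    return respond

if __name__ == "__main__":
    # One response per second for 12 seconds.
    fr = fixed_ratio(3)
    print([t for t in range(1, 13) if fr(t)])  # [3, 6, 9, 12]
    fi = fixed_interval(5)
    print([t for t in range(1, 13) if fi(t)])  # [5, 10]
```

Note that on the interval schedules, responding faster does not bring the reward sooner, whereas on the ratio schedules it does.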
What schedule of reinforcement does gambling fall under?
Variable ratio - the reward will come but you don’t know exactly when
Why is variable ratio PRF very efficient?
It teaches and ingrains persistence
Which schedule of reinforcement is the most resistant to extinction?
Variable ratio
Why are responses to fixed reinforcement schedules not linear, but variable ones are?
Fixed schedules have a post-reinforcement pause, where the animal learns it’s pointless to do anything straight away
What are differences in response rates to various reinforcement schedules (in rats)? 4
VR is the fastest learning; followed by FR. VI and FI take longer to learn responses. FI takes the longest
What is the post-reinforcement pause? And what types of schedules does it appear in?
The animal has a break after reinforcement. It appears in fixed schedules (FI & FR)
What is more effective? Continuous or partial scheduling?
Continuous, but it’s not always possible
Why is reinforcement more effective than punishment?
It strengthens the correct behaviour in the animal’s repertoire of behaviour, whereas punishment doesn’t actually tell the animal what the right thing to do is.
What are the problems with punishment? 2
- Less permanent (extinguishes faster)
- Reduces trust and increases aggression
How do you punish effectively? 8
- No escape
- As intense as possible (within limits)
- Continuous schedule
- No delay
- Over a short period of time
- No subsequent reinforcement
- Reinforce an incompatible, appropriate behaviour concurrently
- Watch for side effects (aggression, fear, modelling violence, learned helplessness, change to other behaviour)
Why can’t you use bridging to reduce the delay in punishment?
It leads to escape behaviour, because you have signalled that punishment is coming
What are reward variables in operant conditioning? 3
Drive, size, and delay
What is Drive?
How much the organism wants the reinforcer (eg dogs that want to sniff things out/persist are good drug sniffers)
How could drive affect studies in animal behaviour?
Hungry vs sated organisms will respond differently to food rewards
What is the trade off for size of the reward?
Diminishing returns
How does the size of the reward affect acquisition and extinction of behaviours?
A larger reward makes both happen faster
How does delay produce problems for reinforcement and punishment?
Short-term reinforcers are more motivating than long-term punishers (eg eat the snack now vs be fat later)
What is the three term contingency?
- Discriminative stimulus (occasion)
- Operant response (behaviour)
- Outcome (consequence)
What does a discriminative stimulus do?
Signals the occasion when a particular behaviour will be punished/reinforced
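The three-term contingency can be sketched as a lookup from (discriminative stimulus, response) to outcome. The lever-press example, the `contingencies` table, and the `outcome` function are hypothetical, used only to show how the same behaviour can have different consequences on different occasions.

```python
# Hypothetical lever-press example: the same response pays off
# only when the light (the discriminative stimulus) is on.
contingencies = {
    ("light on", "press lever"): "food (reinforced)",
    ("light off", "press lever"): "nothing",
}

def outcome(stimulus, response):
    """Look up the consequence of a response on a given occasion."""
    return contingencies.get((stimulus, response), "no known consequence")

if __name__ == "__main__":
    for stimulus in ("light on", "light off"):
        print(f"{stimulus}: press lever -> {outcome(stimulus, 'press lever')}")
```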
What is the key condition to operant conditioning? And when do stimuli become “signals”?
Learning to discriminate between stimuli
When stimuli are predictive of a consequence
What is stimulus generalisation?
When a response is reinforced in the presence of one stimulus, there is a tendency to reproduce it for similar or associated stimuli
What is stimulus discrimination?
Degree to which different stimuli set the occasion for particular responses. A precise degree of stimulus control
How is stimulus discrimination taught?
Behaviour happens when stimulus is present, abates when absent
Is stimulus control pervasive?
Much of our everyday behaviour is under stimulus control (eg traffic light signals)