instrumental conditioning Flashcards
(43 cards)
E. Thorndike (1874-1949) experiment
–> conducted research examining whether animals could solve problems or “think”
- designed a variety of “puzzle boxes” from which the cats had to learn to escape
- was looking to see if the cats would show some evidence of solving the puzzle
- didn’t find any evidence of this; instead he found a steady decrease in the amount of time it took the cats to escape = they slowly improved over time = law of effect
- the behaviours followed by release were steadily strengthened, while behaviours unrelated to release faded with time
law of effect
“of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur”
= positive consequences increased the likelihood or probability of a response (rather than a “reflexive” relation)
punishment effect
–> punishment is seen as the opposite of reinforcement
“those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with the situation weakened, so that, when it recurs, they will be less likely to recur.”
- behaviours are “stamped out” if followed by negative consequences
classical versus instrumental conditioning
- classical conditioning is a relation between two stimuli (CS and US). the CS elicits the CR
- instrumental conditioning concerns the probability or likelihood of a response changing as a function of its consequences. the subject emits the response in order to produce a reward
what did B.F. Skinner hypothesise
- hypothesised that we can’t actually see what’s going on in someone’s head; instead we have to measure reactions, actions, and behaviours
- central figure in the area of psychology known as behaviourism
- this movement was a reaction against introspectionism towards more objective measurement in psychology
how does instrumental conditioning relate to operant conditioning
–> instrumental conditioning is also called operant conditioning because the response operates on the environment (produces an effect)
- the operant is the response defined in terms of its environmental effect
Skinner’s version of the law of effect
when a response is followed by a reinforcer, the strength of the response increases. when a response is followed by a punisher, the strength of the response decreases
acquisition (with example)
behaviour shaped by successive approximations
eg:
- training a rat to press a lever, start with a broad response criterion and progressively narrow it
- the world is a “trainer”
- positive and negative consequences of actions constantly shaping behaviour repertoires
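The shaping loop described above (reinforce close approximations, then narrow the criterion) can be sketched in code. This is a hypothetical illustration only: the numeric values, the threshold rule, and the update step are assumptions, not from the cards.

```python
import random

def shape(target=1.0, criterion=1.0, floor=0.25, trials=200, seed=1):
    """Shaping by successive approximations: reinforce any response that falls
    within the current criterion of the target, then narrow the criterion.
    All numeric values here are illustrative, not empirical."""
    rng = random.Random(seed)
    behaviour = 0.0                                   # the animal's typical response
    for _ in range(trials):
        response = behaviour + rng.gauss(0, 0.2)      # responses vary around the typical one
        if abs(response - target) <= criterion:       # close enough -> reinforce
            behaviour += 0.5 * (response - behaviour)     # reinforced responses shift the repertoire
            criterion = max(floor, 0.9 * criterion)       # progressively narrow the criterion
    return behaviour

trained = shape()   # the shaped behaviour, starting from 0.0
```

The key design point matches the card: the criterion starts broad (almost anything is reinforced) and narrows only after a reinforced response, so the "trainer" never demands more than the current repertoire can produce.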
reinforcement increases ..
the likelihood of behaviour
positive reinforcement
adding a stimulus or event contingent upon a response increases that behaviour
lab: provide food to a food deprived rat for lever pressing
life: child receives pocket-money for doing “chores”. affection of partner for “kind” act. Henry the cat “nuzzles” Jessica
negative reinforcement
removing a stimulus or event contingent upon a response increases that behaviour
lab = a rat presses a lever to terminate (escape) or prevent (avoidance) an electric shock through the floor
life = child does homework to avoid detention (or corporal punishment) at school. doesn’t stay out all night to avoid partner’s wrath
punishment decreases …
likelihood of a behaviour
positive punishment
adding a stimulus or event contingent upon a response decreases that behaviour
lab = rat receives electric shock for pressing a lever
life = corporal punishment for, say, antics on a skateboard. Henry the cat bites Jessica
negative punishment
removing a stimulus or event contingent upon a response decreases that behaviour
lab: a lever press retracts the water spigot from a rat for a fixed period of time
life: pocket-money withheld, not allowed to go out. partner refuses to talk to you. Henry the cat runs away from Jessica
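The four cards above form a 2×2 classification: add or remove a stimulus, crossed with whether the behaviour increases or decreases. A minimal lookup table (the function and key names are hypothetical, for illustration):

```python
# 2x2 classification of operant contingencies, as summarised in the cards above
CONTINGENCIES = {
    ("add",    "increase"): "positive reinforcement",
    ("remove", "increase"): "negative reinforcement",
    ("add",    "decrease"): "positive punishment",
    ("remove", "decrease"): "negative punishment",
}

def classify(stimulus_change, effect_on_behaviour):
    """stimulus_change: 'add' | 'remove'; effect_on_behaviour: 'increase' | 'decrease'."""
    return CONTINGENCIES[(stimulus_change, effect_on_behaviour)]

classify("remove", "increase")   # -> "negative reinforcement" (eg: escape/avoidance)
```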
conditioned reinforcers and punishers
primary reinforcers or punishers = seem inherently reinforcing (eg: food) or punishing (pain)
other stimuli acquire reinforcing or punishing properties by association with primary reinforcers or punishers
praise = conditioned reinforcer
“No!” = conditioned punisher
token reinforcer (money) = can be exchanged for primary reinforcers
what does experimental analysis of behaviour involve
systematic study of relation between behaviour and its consequences
schedules of reinforcement
a schedule of reinforcement is a specific pattern of presenting reinforcers over time
partial (or intermittent) reinforcement
a designated response is reinforced only some of the time. useful for maintaining behaviours
four of the simplest schedules (ratio and interval, each either fixed or variable):
ratio = depends on the number of responses (fixed and variable) –> behaviour is rewarded after a set (or average) number of responses
interval = depends largely on the passage of time (fixed and variable)
continuous reinforcement (CRF)
- every instance of a response is reinforced
- useful for learning new behaviours and influencing ongoing patterns of behaviour quickly
fixed ratio schedule
–> the reinforcer is given after a fixed number of non-reinforced responses
Lab: a rat receives food for every tenth response
cumulative record: post-reinforcement pause, then a “burst” of responses until the next reinforcer
variable ratio schedule
–> the reinforcer is given after a variable number of non-reinforced responses. the number of non-reinforced responses varies around a predetermined average
lab: a rat is reinforced for every 10th response on average, but the exact number required varies across trials
cumulative record: high steady rate of response. occasional pauses
fixed interval schedule
–> the reinforcer is given for the first time after a fixed period of time has elapsed
lab: rat is reinforced for the first lever press after 2 mins have elapsed since the last reinforcer
cumulative record: pause after reinforcement, steadily increasing response rate as the interval elapses
variable interval schedule
–> the reinforcer is given for the first response after a variable time interval has elapsed. the interval lengths vary around a predetermined average
lab: rat is reinforced for the first lever press after 1 min on average, but the interval length varies from trial to trial
cumulative record: high steady rate of response, although not quite as high as comparable VR schedule. occasional pauses
- the VR schedule tends to show a higher response rate because there is a more direct feedback loop between the rate at which the animal responds and the rate at which it gets food
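The four schedules above can be sketched as reinforcement rules: each object is asked, on every response, whether to deliver a reinforcer. This is a hypothetical sketch; the class names, parameters, and demo numbers are illustrative, not from the cards.

```python
import random

class FixedRatio:
    """FR-n: deliver a reinforcer on every n-th response (time t is ignored)."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """VR-n: the required response count varies around a mean of n."""
    def __init__(self, n, rng):
        self.n, self.rng, self.count = n, rng, 0
        self.required = rng.randint(1, 2 * n - 1)         # mean requirement = n
    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.randint(1, 2 * self.n - 1)
            return True
        return False

class FixedInterval:
    """FI-t: reinforce the first response after `interval` seconds since the last reinforcer."""
    def __init__(self, interval):
        self.interval, self.last = interval, 0.0
    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False

class VariableInterval:
    """VI-t: like FI, but the required interval varies around a mean of `interval`."""
    def __init__(self, interval, rng):
        self.rng, self.mean, self.last = rng, interval, 0.0
        self.wait = rng.uniform(0, 2 * interval)          # mean wait = interval
    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(0, 2 * self.mean)
            return True
        return False

rng = random.Random(0)                                    # seed for reproducibility
fr = FixedRatio(10)
fr_rfts = sum(fr.respond(t) for t in range(100))          # 100 responses, every 10th reinforced
fi = FixedInterval(120)                                   # FI 2 min (in seconds)
fi_rfts = sum(fi.respond(t) for t in range(1200))         # one response per second for 20 min
vr = VariableRatio(10, rng)
vr_rfts = sum(vr.respond(t) for t in range(1000))
vi = VariableInterval(120, rng)
vi_rfts = sum(vi.respond(t) for t in range(1200))
```

Note the feedback-loop point from the cards: under the ratio rules, responding faster directly produces reinforcers faster, while under the interval rules extra responses during the wait change nothing, which is one account of why ratio schedules sustain higher rates.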
extinction
–> reinforcers are no longer delivered contingent upon a response, and the strength of the response decreases