instrumental conditioning Flashcards
(43 cards)
E. Thorndike (1874-1949) experiment
–> conducted research examining whether animals could solve problems or “think”
- designed a variety of “puzzle boxes” from which the cats had to learn to escape
- was looking to see if the cats would show some evidence of solving the puzzle
- didn’t find any evidence of this; instead he found a steady decrease in the amount of time it took the cats to escape = they slowly improved over time = law of effect
- the behaviours followed by release were steadily strengthened, while behaviours unrelated to release faded with time
law of effect
“of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur”
= positive consequences increased the likelihood or probability of a response (rather than a “reflexive” relation)
punishment effect
–> punishment is seen as the opposite of reinforcement
“those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with the situation weakened, so that, when it recurs, they will be less likely to recur.”
- behaviours are “stamped out” if followed by negative consequences
classical versus instrumental conditioning
- classical conditioning is a relation between two stimuli (CS and US). the CS elicits the CR
- instrumental conditioning concerns the probability or likelihood of a response changing as a function of its consequences. the subject emits the response in order to produce a reward
what did B.F. Skinner hypothesise
- hypothesised that we can’t actually see what’s going on in someone’s head; instead we have to measure reactions, actions, and behaviours
- central figure in the area of psychology known as behaviourism
- this movement was a reaction against introspectionism towards more objective measurement in psychology
how does instrumental conditioning relate to operant conditioning
–> instrumental conditioning is also called operant conditioning because the response operates on the environment (produces an effect)
- the operant is the response defined in terms of its environmental effect
Skinner’s version of the law of effect
when a response is followed by a reinforcer, the strength of the response increases. when a response is followed by a punisher, the strength of the response decreases
acquisition (with example)
behaviour shaped by successive approximations
eg:
- training a rat to press a lever, start with a broad response criterion and progressively narrow it
- the world is a “trainer”
- positive and negative consequences of actions constantly shaping behaviour repertoires
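The shaping loop described above (reinforce close approximations, then narrow the criterion) can be sketched in code. This is a hypothetical illustration only: the numeric values, the threshold rule, and the update step are assumptions, not from the cards.

```python
import random

def shape(target=1.0, criterion=1.0, floor=0.25, trials=200, seed=1):
    """Shaping by successive approximations: reinforce any response that falls
    within the current criterion of the target, then narrow the criterion.
    All numeric values here are illustrative, not empirical."""
    rng = random.Random(seed)
    behaviour = 0.0                                   # the animal's typical response
    for _ in range(trials):
        response = behaviour + rng.gauss(0, 0.2)      # responses vary around the typical one
        if abs(response - target) <= criterion:       # close enough -> reinforce
            behaviour += 0.5 * (response - behaviour)     # reinforced responses shift the repertoire
            criterion = max(floor, 0.9 * criterion)       # progressively narrow the criterion
    return behaviour

trained = shape()   # the shaped behaviour, starting from 0.0
```

The key design point matches the card: the criterion starts broad (almost anything is reinforced) and narrows only after a reinforced response, so the "trainer" never demands more than the current repertoire can produce.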
reinforcement increases ..
the likelihood of behaviour
positive reinforcement
adding a stimulus or event contingent upon a response increases that behaviour
lab: provide food to a food deprived rat for lever pressing
life: child receives pocket-money for doing “chores”. affection of partner for “kind” act. Henry the cat “nuzzles” Jessica
negative reinforcement
removing a stimulus or event contingent upon a response increases that behaviour
lab = a rat presses a lever to terminate (escape) or prevent (avoidance) an electric shock through the floor
life = child does homework to avoid detention (or corporal punishment) at school. doesn’t stay out all night to avoid partner’s wrath
punishment decreases …
likelihood of a behaviour
positive punishment
adding a stimulus or event contingent upon a response decreases that behaviour
lab = rat receives electric shock for pressing a lever
life = corporal punishment for, say, antics on a skateboard. Henry the cat bites Jessica
negative punishment
removing a stimulus or event contingent upon a response decreases that behaviour
lab: a lever press retracts the water spigot from a rat for a fixed period of time
life: pocket-money withheld, not allowed to go out. partner refuses to talk to you. Henry the cat runs away from Jessica
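The four cards above form a 2×2 classification: add or remove a stimulus, crossed with whether the behaviour increases or decreases. A minimal lookup table (the function and key names are hypothetical, for illustration):

```python
# 2x2 classification of operant contingencies, as summarised in the cards above
CONTINGENCIES = {
    ("add",    "increase"): "positive reinforcement",
    ("remove", "increase"): "negative reinforcement",
    ("add",    "decrease"): "positive punishment",
    ("remove", "decrease"): "negative punishment",
}

def classify(stimulus_change, effect_on_behaviour):
    """stimulus_change: 'add' | 'remove'; effect_on_behaviour: 'increase' | 'decrease'."""
    return CONTINGENCIES[(stimulus_change, effect_on_behaviour)]

classify("remove", "increase")   # -> "negative reinforcement" (eg: escape/avoidance)
```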
conditioned reinforcers and punishers
primary reinforcers or punishers = seem inherently reinforcing (eg: food) or punishing (pain)
other stimuli acquire reinforcing or punishing properties by association with primary reinforcers or punishers
praise = conditioned reinforcer
“No!” = conditioned punisher
token reinforcer (money) = can be exchanged for primary reinforcers
what does experimental analysis of behaviour involve
systematic study of relation between behaviour and its consequences
schedules of reinforcement
a schedule of reinforcement is a specific pattern of presenting reinforcers over time
partial (or intermittent) reinforcement
a designated response is reinforced only some of the time. useful for maintaining behaviours
four of the simplest schedules (ratio and interval, each either fixed or variable):
ratio = depends on the number of responses (fixed and variable) –> behaviour is rewarded after a set (or average) number of responses
interval = depends largely on the passage of time (fixed and variable)
continuous reinforcement (CRF)
- every instance of a response is reinforced
- useful for learning new behaviours and influencing ongoing patterns of behaviour quickly
fixed ratio schedule
–> the reinforcer is given after a fixed number of non-reinforced responses
Lab: a rat receives food for every tenth response
cumulative record: post-reinforcement pause, then a “burst” of responses until the next reinforcer
variable ratio schedule
–> the reinforcer is given after a variable number of non-reinforced responses. the number of non-reinforced responses varies around a predetermined average
lab: a rat is reinforced for every 10th response on average, but the exact number required varies across trials
cumulative record: high steady rate of response. occasional pauses
fixed interval schedule
–> the reinforcer is given for the first time after a fixed period of time has elapsed
lab: rat is reinforced for the first lever press after 2 mins have elapsed since the last reinforcer
cumulative record: pause after reinforcement, steadily increasing response rate as the interval elapses
variable interval schedule
–> the reinforcer is given for the first response after a variable time interval has elapsed. the interval lengths vary around a predetermined average
lab: rat is reinforced for the first lever press after 1 min on average, but the interval length varies from trial to trial
cumulative record: high steady rate of response, although not quite as high as comparable VR schedule. occasional pauses
- the VR schedule tends to show a higher response rate because there is a more direct feedback loop between the rate at which the animal responds and the rate at which it gets food
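The four schedules above can be sketched as reinforcement rules: each object is asked, on every response, whether to deliver a reinforcer. This is a hypothetical sketch; the class names, parameters, and demo numbers are illustrative, not from the cards.

```python
import random

class FixedRatio:
    """FR-n: deliver a reinforcer on every n-th response (time t is ignored)."""
    def __init__(self, n):
        self.n, self.count = n, 0
    def respond(self, t):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """VR-n: the required response count varies around a mean of n."""
    def __init__(self, n, rng):
        self.n, self.rng, self.count = n, rng, 0
        self.required = rng.randint(1, 2 * n - 1)         # mean requirement = n
    def respond(self, t):
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self.rng.randint(1, 2 * self.n - 1)
            return True
        return False

class FixedInterval:
    """FI-t: reinforce the first response after `interval` seconds since the last reinforcer."""
    def __init__(self, interval):
        self.interval, self.last = interval, 0.0
    def respond(self, t):
        if t - self.last >= self.interval:
            self.last = t
            return True
        return False

class VariableInterval:
    """VI-t: like FI, but the required interval varies around a mean of `interval`."""
    def __init__(self, interval, rng):
        self.rng, self.mean, self.last = rng, interval, 0.0
        self.wait = rng.uniform(0, 2 * interval)          # mean wait = interval
    def respond(self, t):
        if t - self.last >= self.wait:
            self.last = t
            self.wait = self.rng.uniform(0, 2 * self.mean)
            return True
        return False

rng = random.Random(0)                                    # seed for reproducibility
fr = FixedRatio(10)
fr_rfts = sum(fr.respond(t) for t in range(100))          # 100 responses, every 10th reinforced
fi = FixedInterval(120)                                   # FI 2 min (in seconds)
fi_rfts = sum(fi.respond(t) for t in range(1200))         # one response per second for 20 min
vr = VariableRatio(10, rng)
vr_rfts = sum(vr.respond(t) for t in range(1000))
vi = VariableInterval(120, rng)
vi_rfts = sum(vi.respond(t) for t in range(1200))
```

Note the feedback-loop point from the cards: under the ratio rules, responding faster directly produces reinforcers faster, while under the interval rules extra responses during the wait change nothing, which is one account of why ratio schedules sustain higher rates.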
extinction
–> reinforcers are no longer delivered contingent upon a response, and the strength of the response decreases