Chapter 5 Flashcards Preview

Flashcards in Chapter 5 Deck (97):

Operant conditioning is a form of ____ learning.



In classical conditioning, do we control the response?



Classical VS operant conditioning?

In classical the outcome occurs regardless, whilst in operant the outcome is dependent on your response.


Operant conditioning is based on avoiding or obtaining a specific ____.



Operant conditioning requires an ____ to operate in its environment to determine an outcome.



Thorndike was the first to study behavioral outputs due to operant conditioning. What was his study? What was the conclusion? What is this idea known as?

Puzzle boxes. Organisms are more likely to repeat actions that produce satisfying consequences and less likely to repeat actions that do not. This idea is known as the law of effect.


The more time an animal spent in Thorndike's box, the ____ it learned to escape.



Law of effect?

The probability of a particular behavioral response increases or decreases depending on the consequences that have followed that response in the past.
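
As a rough illustration of the law of effect, the sketch below nudges a response probability up after a satisfying consequence and down otherwise. The linear update rule, the function name `update_probability`, and the rate of 0.2 are assumptions made for this example; they are not part of the deck.

```python
def update_probability(p, satisfying, rate=0.2):
    """Return the new response probability after one consequence.

    Assumed rule (illustrative only): move p a fixed fraction of the way
    toward 1 after a satisfying outcome, and toward 0 otherwise.
    """
    if satisfying:
        return p + rate * (1.0 - p)
    return p - rate * p


# Hypothetical cat that starts out rarely making the escape response.
p = 0.1
for _ in range(10):
    p = update_probability(p, satisfying=True)
print(round(p, 3))  # the probability has risen toward 1
```

With repeated satisfying outcomes the probability climbs quickly at first and then levels off, which is the same general shape as a negatively accelerated learning curve.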


____ -> ____ -> ____

Stimulus, response, outcome.


Thorndike's learning procedures involved ____ trials. What is this?

Discrete. An operant conditioning paradigm where the experimenter defines the beginning and end of each trial.


BF Skinner wanted to refine Thorndike's techniques. How did he do this?

He created the Skinner box, which worked in the opposite way to Thorndike's discrete trials. The Skinner box is a conditioning chamber in which reinforcement or punishment is delivered automatically whenever the animal makes a response (e.g., pressing a lever). In this case, the animal rather than the experimenter controls when each response starts and ends.


Free Operant Paradigm?

An operant conditioning paradigm where the animal can operate the apparatus "freely", responding to obtain reinforcement or avoid punishment whenever it chooses. Commonly referred to simply as operant conditioning.


Can we see extinction in operant conditioning?




Providing consequences to increase probability of behavior occurring again in the future.



Providing consequences to decrease the probability of a behavior occurring again in the future.


Adding a stimulus to free operant experiments can make them more elaborate. Example?

S(Light ON)->R(Lever press)->O(Food release)
S(Light OFF)->R(Lever press)->O(NO food release)


According to Thorndike and Skinner, operant conditioning consists of what 3 components?

Stimulus, response, outcome.


Discriminative stimuli?

In operant conditioning, stimuli that signal whether a particular response will lead to a particular outcome.


Example of Shaping?

Shaping is an operant conditioning technique in which successive approximations to a desired response are reinforced. For example, a child learning to write the letter "A" is first rewarded when an attempt looks somewhat like an A, and then only for attempts that come closer and closer to a proper A.


Chaining? Example?

An operant conditioning technique in which organisms are gradually trained to execute complicated sequences of discrete responses. E.g., when learning a complicated dance in order to receive Smarties, you get a Smartie after every correct dance move.


Operant conditioning?

Process whereby organisms learn to make responses in order to obtain or avoid important consequences


Operant conditioning

Process whereby organisms learn to make or refrain from making certain responses in order to obtain or avoid a certain outcome.


Thorndike's cat in box

A cat was placed in a box and when it escaped it was given food so it was more likely to do it again


Law of effect

Given a particular stimulus, a response that leads to a desirable outcome will tend to increase in frequency


Classical vs operant conditioning

Classical: organisms experience an outcome (US) whether or not they have learned the conditioned response (CR)
Operant: the outcome O is wholly dependent on whether the organism performs the response R


How are operant and classical conditioning similar?

Both have a negatively accelerated learning curve (time to do something decreases rapidly and then levels off) and they both show extinction


Discrete trial paradigms

The experimenter defines the beginning and end of each trial.


Free operant paradigm

The animal can operate freely (e.g., running into a maze to get food and then running out, at which point the trial ends).


Reinforcement does what?

Increase probability of behavior


Skinner box

A cage with a lever that dispenses food into a small tray. Animals exploring the cage would press the lever by accident, receive food, and then dramatically increase their rate of responding.


Punishment does what?

Decrease probability of a certain behavior


Cumulative recorder

The height of the line at any given time represents the number of responses that have been made in the entire experiment (cumulatively) up to that time.
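
A minimal sketch of how the line on a cumulative record could be computed from a list of response times. The function name `cumulative_record` and the lever-press times are hypothetical, chosen only to illustrate the idea.

```python
from bisect import bisect_right


def cumulative_record(response_times, sample_times):
    """Return the total number of responses made up to each sample time."""
    ordered = sorted(response_times)
    # bisect_right counts how many responses occurred at or before time t.
    return [bisect_right(ordered, t) for t in sample_times]


# Hypothetical lever-press times, in seconds from the start of the session.
presses = [2.0, 5.5, 6.0, 6.4, 9.1]
heights = cumulative_record(presses, [0, 3, 6, 10])
print(heights)  # [0, 1, 3, 5]
```

The line never goes down: a steep stretch means rapid responding, and a flat stretch means the animal has stopped responding.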


Discriminative stimuli

Stimuli that signal whether a particular response will lead to a particular outcome


Habit slip

The stimulus-to-response association is so strong that the stimulus seems to evoke the learned response automatically, e.g., dialing a familiar phone number instead of the number you intended to dial.


Protestant ethic effect

Rats that have been trained to press a lever to obtain food will often continue to work to obtain food even though they have free food in their cage


What is a response defined by?

The outcome it produces


Shaping: successive approximations to the desired response are reinforced

When a rat in a Skinner box happens to wander near the food tray, the experimenter drops in a piece of food. The rat eats the food, starts to learn an association between the tray and food, and comes to spend all its time near the food tray. The experimenter then changes the rules: now the rat must also be near the lever before food is dropped. Once the rat has learned this, the rules change again: food is dropped only if the animal is actually touching the lever, and then again: only if the animal is pressing down the lever.


A reinforcer is a ____.



Primary reinforcer

Organisms have an innate drive to obtain these things, e.g., food, water, sex, sleep, and a comfortable temperature.


Clark Hull's drive reduction theory

Drive reduction theory says that humans are motivated to reduce the state of tension caused when certain biological needs are not satisfied.


What are some complications with primary reinforcers?

An animal will work hard for water, but once it has drunk enough, more water isn't reinforcing. Also, primary reinforcers are not all created equal (an animal will work harder for food it likes).


Secondary reinforcers

Reinforcers that initially have no intrinsic value but that have been paired with primary reinforcers, e.g., money.


Token economies

Desired behavior is reinforced with a token that can be exchanged for privileges.


By being paired with a primary reinforcer, secondary reinforcers become reinforcers themselves.

Organisms will work to obtain them.


Why do some researchers say that animals aren't fooled by secondary reinforcers?

They think the animals use secondary reinforcers as informational feedback that their behavior is on the right track for obtaining a primary reinforcer.


A switch in outcome may produce a change in ____.



Negative contrast?

Organisms given a less preferred reinforcer in place of an expected and preferred reinforcer will respond less strongly for the less preferred reinforcer than if they had been given that less preferred reinforcer all along.


Negative contrast example

While rats can be trained to press a lever to obtain either food pellets or water sweetened with sucrose, they tend to prefer the latter. If the sweetened water is used as the reinforcer during the first half of each training session and food pellets as the reinforcer during the second half, rats typically make many more responses during the first half of the session.


Thorndike and Skinner concluded that punishment was _______ as reinforcement at controlling behavior

Not as effective


How can discriminative stimuli for punishment encourage cheating?

A speeding driver will see a cop (the discriminative stimulus) and slow down to avoid getting a ticket. But speeding in the absence of a police car will probably not be punished. In this case, punishment doesn't train the driver not to speed. It only teaches him to suppress speeding in the presence of police cars. In effect, the driver has learned to cheat.


How can concurrent reinforcement undermine punishment?

The effects of punishment can be counteracted if reinforcement occurs along with the punishment. For example, suppose a rat first learns to press a lever for food but later learns that lever presses are punished by shock. Unless the rat has another way to obtain food, it's likely to keep pressing the lever to obtain food reinforcement in spite of the punishing effects of shock.


How can punishment not end how you want?

Punishment decreases the probability that R will happen in the future... but what will happen instead? The organism will explore possible responses


Does initial intensity of punishment matter?

Yes, it should be strong from the start; e.g., rats become numb to mild shock.


Reinforcement schedules

Rules determining when outcomes are delivered in an experiment


____ outcomes produce fastest learning.



Response -> outcome interval being long results in what?

Decrease in association


The time lag between R and O is an important factor in ____.

Self-control. E.g., it's easy to convince a student to study if a test is coming up tomorrow; it is harder if the test is in 5 weeks. The delay between R (studying) and O (good grades) makes reinforcement less effective in eliciting the response.


Pre commitment

Making a choice that is difficult to change later. For example, a dieter will be less likely to cheat if they get rid of all the sweets in their house.


Positive reinforcement versus positive punishment

Positive reinforcement
S (potty present) --> R (empty bladder) --> O (praise)
Performance of the response causes the reinforcer to be "added" to the environment.
Positive punishment
S (potty absent) --> R (empty bladder) --> O (disapproval)
The response must be withheld; if it is not (the toddler wets himself), a punisher is "added" to the environment.


Negative reinforcement versus negative punishment

Negative reinforcement
S (headache) --> R (take aspirin) --> O (no more headache)
Behaviour is encouraged because it causes something negative to be subtracted from the environment.
Negative punishment
S (recess) --> R (aggressive behaviour) --> O (loss of playtime)
Behaviour is discouraged by subtracting something reinforcing (playtime) from the environment.
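
The four combinations on these two cards come from two independent questions: is something added or subtracted, and does the behaviour become more or less likely? A small sketch of that naming scheme follows; the function name `classify_consequence` and its argument values are made up for this example.

```python
def classify_consequence(stimulus_change, behavior_effect):
    """Name the operant procedure from two yes/no style facts.

    stimulus_change: "added" or "removed" (what happens to the environment)
    behavior_effect: "increases" or "decreases" (what happens to the response)
    """
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_effect]
    return f"{sign} {kind}"


print(classify_consequence("added", "increases"))    # positive reinforcement
print(classify_consequence("removed", "increases"))  # negative reinforcement
print(classify_consequence("added", "decreases"))    # positive punishment
print(classify_consequence("removed", "decreases"))  # negative punishment
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the outcome is pleasant.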


Continuous reinforcement schedule

Each response is always followed by the outcome


Partial reinforcement schedules

Patterns where response is followed by outcome less than 100% of the time


Fixed ratio schedule

A fixed number of responses must be made before outcome is delivered


Fixed interval schedule

Reinforces the first response after a fixed amount of time has passed. Once the time interval has elapsed, the reinforcement remains available until the response occurs and the reinforcement is obtained. Responding constantly before the time interval elapses is a waste of time and effort.


Variable ratio schedule

Provides reinforcement after a random number of responses. This reduces the post-reinforcement pause and produces a steady, higher rate of responding, since the very next response may result in reinforcement (e.g., a lottery).


Variable interval schedule

Reinforces the first response after an interval that averages a particular length of time. The response rate under a variable interval schedule is steadier than under a fixed interval schedule because animals check periodically to see whether reinforcement is available.
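
The four partial reinforcement schedules above can be sketched as small rule objects that decide, for each response, whether it is reinforced. This is only an illustrative sketch: the function names, the uniform random distributions, and the choice to measure intervals from the previous reinforcement are all assumptions, not details given on the cards.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response (the time t is ignored)."""
    count = 0

    def respond(t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def fixed_interval(interval):
    """Reinforce the first response once `interval` time units have elapsed."""
    available_at = interval

    def respond(t):
        nonlocal available_at
        if t >= available_at:
            # Reinforcement stayed available until this response collected it.
            available_at = t + interval
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after a random number of responses averaging mean_n."""
    # Assumption: the requirement is drawn uniformly from 1 .. 2*mean_n - 1.
    needed = rng.randint(1, 2 * mean_n - 1)
    count = 0

    def respond(t):
        nonlocal needed, count
        count += 1
        if count >= needed:
            count = 0
            needed = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Reinforce the first response after a random interval averaging mean_interval."""
    # Assumption: each interval is drawn uniformly from 0 .. 2*mean_interval.
    available_at = rng.uniform(0, 2 * mean_interval)

    def respond(t):
        nonlocal available_at
        if t >= available_at:
            available_at = t + rng.uniform(0, 2 * mean_interval)
            return True
        return False

    return respond


fr3 = fixed_ratio(3)
print([fr3(t) for t in range(1, 7)])          # [False, False, True, False, False, True]
fi10 = fixed_interval(10)
print([fi10(t) for t in (4, 9, 12, 15, 23)])  # [False, False, True, False, True]
```

On the ratio schedules only the count of responses matters, so responding faster earns reinforcement sooner. On the interval schedules extra responses before the interval elapses earn nothing, which is why the card calls them a waste of effort.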


Concurrent reinforcement schedule

The organism can make any of several possible responses, each leading to a different outcome. This allows researchers to examine how organisms choose to divide their time and effort among different options.


Matching law of choice behavior

Behavior is correlated with its environment. Given two responses that are reinforced on variable interval schedules, an organism's relative rate of making each response will match the relative rate of reinforcement for that response.
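
A short sketch of the prediction the matching law makes: the share of responses given to each option equals that option's share of the total reinforcement. The function name `matching_law_share` and the reinforcement rates below are hypothetical.

```python
def matching_law_share(reinforcement_rates):
    """Return the predicted share of responses for each option.

    reinforcement_rates maps an option name to its rate of reinforcement
    (e.g., reinforcers per hour on that option's VI schedule).
    """
    total = sum(reinforcement_rates.values())
    if total <= 0:
        raise ValueError("at least one option must be reinforced")
    return {option: rate / total for option, rate in reinforcement_rates.items()}


# Hypothetical setup: key A pays off 40 times per hour, key B 20 times per hour.
shares = matching_law_share({"A": 40, "B": 20})
print(shares)  # A gets about two thirds of the responses, B about one third
```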


Behavioral economics

Study of how organisms allocate their time and resources among possible options


Economic theory predicts that each consumer will allocate resources in a way that maximizes their ____, or relative satisfaction.

Subjective value


The particular allocation of resources that provides maximum subjective value to an individual is called a ____.

Bliss point. E.g., Jamie gets $100 each week, which can buy either 10 albums or five meals out; his bliss point is eating out twice a week and buying six albums.
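
The arithmetic behind Jamie's example can be checked with a small sketch. The prices are inferred from the card (assuming one fixed price per album and per meal), and the function name `affordable_bundles` is made up for this example.

```python
BUDGET = 100
ALBUM_PRICE = BUDGET // 10  # 10 albums use up the budget, so $10 each (inferred)
MEAL_PRICE = BUDGET // 5    # 5 meals use up the budget, so $20 each (inferred)


def affordable_bundles(budget, album_price, meal_price):
    """List every (meals, albums) pair that spends exactly the whole budget."""
    bundles = []
    for meals in range(budget // meal_price + 1):
        remaining = budget - meals * meal_price
        if remaining % album_price == 0:
            bundles.append((meals, remaining // album_price))
    return bundles


bundles = affordable_bundles(BUDGET, ALBUM_PRICE, MEAL_PRICE)
print(bundles)  # [(0, 10), (1, 8), (2, 6), (3, 4), (4, 2), (5, 0)]
```

Every bundle in the list costs the same $100; the bliss point (2 meals, 6 albums) is simply the one Jamie values most, which the budget alone cannot tell us.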


The premack principle

The opportunity to perform a highly frequent behavior can reinforce a less frequent behavior. For example, Premack gave a group of rats free access to drinking water and a running wheel. On average, the rats spent more time running than drinking. He then restricted their access to the wheel: they were allowed to run only after they had drunk a certain amount of water. They learned the response-to-outcome association and started drinking more water (R) in order to access the wheel (O). The total amount of running decreased and the total amount of drinking increased. The activity of running was acting as a reinforcer, increasing the probability of an infrequent behavior.


Response deprivation hypothesis

Suggests that the critical variable is not which response is more frequent, but merely which response has been restricted. By restricting the ability to make any response, you can make the opportunity to perform that response more reinforcing. For example, if you have been studying for six hours straight, the idea of taking a break to clean your room (which is something you hate) may begin to look attractive.


Cortical areas process ____ information and then send it as primary inputs to the ____, which sends messages to motor neurons in the muscles.

Sensory, motor cortex


Dorsal striatum

Part of the basal ganglia; further divided into the caudate nucleus and the putamen. It receives stimulus information from sensory cortical areas and projects to the motor cortex.


Dorsal striatum is critical for what?

It is critical in operant conditioning for learning stimulus to response associations based on feedback. For example, rats with lesions of the DS can learn simple response to outcome associations but only in the absence of discriminative stimuli


Orbitofrontal cortex

One of the brain areas that appear to be important in predicting the outcomes of behavior. This area, on the underside of the front of the brain, contributes to goal-directed behavior by representing predicted outcomes. It receives input from all sensory modalities. Monkeys with OFC lesions can learn to respond to a stimulus that has a reward outcome over a stimulus that does not. But if conditions are reversed and the second stimulus is now the one being rewarded, lesioned monkeys are slower to adapt.


Neurons in the orbitofrontal cortex fire differently depending on what?

Whether a reward or punishment is coming. Medial areas of the OFC process information about reinforcers, and lateral areas about punishers.


Ventral tegmental area (VTA) is known as the...?

Pleasure centre


In humans, pimozide suppresses cravings, whilst ____ increases cravings.



Dopamine also strengthens learning of stimulus-to-response associations during operant conditioning by generally promoting ____.

Synaptic plasticity


Opiates signal ____ in the brain.



Endogenous opioids?

Naturally occurring peptides with similar effects to opiates. Researchers believe drugs like heroin are so pleasurable because they activate the same brain receptors as endogenous opioids. Endogenous opioids are released in response to primary reinforcers.


What is wanting signaled by?



What is liking signaled by?

Endogenous opioids


How can you differentiate addiction from habit?

By its degree.


What do pathological addicts experience?

Extreme withdrawal symptoms, to the point where they neglect other parts of their life because nothing else is as important as the addictive substance.


Addiction involves not only seeking the high (positive reinforcement) but also avoiding the adverse effects of withdrawal (____), both of which reinforce drug taking.

Negative reinforcement


Many addictive drugs are ____, which target opiate receptors and thereby increase dopamine levels.



What does cocaine block?

Dopamine reuptake, so it stays in the synapse longer


Many addicts report that they no longer experience a high from cocaine and amphetamines. Why is that?

Their wanting system has disconnected from their liking system.


Behavioral addiction

When behaviors, instead of drugs, produce reinforcement or highs, as well as cravings and withdrawal symptoms when the behavior is prevented. The most common example is gambling. Skinner suggested that gambling is so addictive because it is reinforced on a variable ratio schedule.


Where does behavioral addiction reflect dysfunction in the brain?

The same place that is affected by drug addictions. For example gambling activates the brain similarly to how cocaine does.


In the US, most treatment plans include what type of therapy?

Cognitive therapy


Addiction is a strong stimulus-to-response-to-outcome association, with environmental stimuli triggering the addictive behavior, resulting in the reinforcing outcome of the high. Thus, treatment should reduce the ____ strength.




Avoiding the stimuli that trigger the unwanted response.


Delayed reinforcement

Imposing a fixed delay before giving in to the addiction. A delay between response and outcome weakens learning of the response-to-outcome association.