Psych Of Learning Exam 3 Flashcards Preview


Flashcards in Psych Of Learning Exam 3 Deck (80):
0

Steps of classical conditioning

1. Use of unlearned (unconditioned) behaviors to illustrate learning
2. Behavior is involuntary or reflexive
3. A learned association forms between CS and US (S-S association)

1

Behavior is involuntary or reflexive

Classical conditioning

2

Operant learning

1. Learned (S-R) association and the outcome of a response (reinforcer or punisher)
2. Behavioral response is voluntary and intentional

3

Behavioral response is voluntary and intentional

Operant Learning

4

Thorndike

Founder of operant learning; studied cats in a puzzle box.
Law of effect: if a behavior performed in response to a stimulus is followed by a satisfying outcome, then the likelihood of that behavior being repeated increases.

5

Law of effect:

if a behavior performed in response to a stimulus is followed by a satisfying outcome, then the likelihood of that behavior being repeated increases

Thorndike

6

Guthrie-Horton experiment

Simplified the cat puzzle box, making learning quicker and more consistent.
Each individual cat learned its own unique way to solve the task.
Whatever worked, the cat stuck with that behavior, showing that the Law of Effect is flexible!
A behavior could be completely random and meaningless, but if it helped reach the good outcome it might be repeated!

7

B.F. Skinner

Developed the operant conditioning chamber, which sped up experimentation and learning.
Trained rats quickly!
Makes the connection between behavior and outcome easy to form: speeds up learning!
Used schedules of reinforcement.
Skinner on free will:
Now that we know what causes behavior, we know the mechanism, so we can dispose of the idea of free will. Behaviors occur through reinforcers and punishers.

8

Used schedules of reinforcement

Skinner

9

Operant conditioning chamber

Skinner

10

Example of basic idea of operant learning

Stimulus (being told to eat dinner) + response (eating dinner) = reinforcer (dessert), which gives you a reward (pleasurable sensation) that motivates you to eat your dinner next time. The reinforcer doesn't always have to be present; the expectation of dessert is what drives you to eat your dinner.

11

Motor response sequences

Coordination of multiple complex motor responses, which requires the involvement of an additional structure.
Cerebellum: the circuitry of the cerebellum allows for proper timing of motor responses.
It is very accurate at estimating time and making sure movements occur at the right time and in the right order.

12

Cerebellum:

The circuitry of the cerebellum allows for proper timing of motor responses.
It is very accurate at estimating time and making sure movements occur at the right time and in the right order.

13

Reward:

A reinforcer may produce reward which provides motivation to perform the response again. Reward is processed by the nucleus accumbens—Reward Pathway.
• Dopamine is the neurotransmitter involved in processing reward information.
• Endorphins, NOT dopamine, are involved in producing the reward sensation.
• Dopamine is involved in the prediction of a reinforcer/reward not the pleasurable sensation itself.

14

nucleus accumbens

Reward Pathway

15

Dopamine

Is the neurotransmitter involved in processing reward information, the prediction of a reinforcer/reward.

16

S-Reinforcement and R-Reinforcement Associations:

Requires feedback information that allows for a prediction about the outcome

Frontal Striatal Circuits:
1. The outcome of the response is processed by the frontal cortex.
2. Feedback information is sent to the striatum.
This signal regulates dopamine activity in the striatum.
Good outcome = more dopamine is sent.
Bad outcome = less dopamine is sent.
3. Regulation of dopamine activity is responsible for increasing the strength of the S-R association.
4. Frontal Striatal Circuits are also responsible for inhibiting S-R behaviors.
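The feedback loop above can be sketched as a toy update rule (purely illustrative; the function name and learning-rate value are my own, not from the course):

```python
def update_sr_strength(strength, good_outcome, lr=0.1):
    """Toy model of frontal-striatal feedback: a good outcome sends
    more dopamine to the striatum and strengthens the S-R association;
    a bad outcome sends less and weakens it."""
    delta = lr if good_outcome else -lr
    # Strength stays non-negative: an association can weaken to zero
    # but not become "negative".
    return max(0.0, strength + delta)

# A run of good outcomes strengthens the association; a bad one weakens it.
s = 0.0
for outcome in [True, True, True, False]:
    s = update_sr_strength(s, outcome)
```

This is only a sketch of the direction of the effect, not a real neural model.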

17

PreFrontal cortex

develops last: fully mature around age 25. If ADHD is present, the process of full brain development is delayed about 3 years.
Teens/adolescents perform bad behaviors/stupid things because they can't inhibit bad choices/behaviors.

18

Forming the S-R Association:

1. Information about the stimulus enters the striatum via the sensory cortex or thalamus.
2. The neural signal is sent to the globus pallidus where a response signal is generated.
3. The response signal is sent out to the pre-motor cortex via the thalamus
4. Dopamine from the substantia nigra strengthens the S-R association.
• The neural connection between the striatum and the globus pallidus is where S-R association is formed.
• Whether or not this neural connection gets stronger or weaker depends on the outcome of the behavior.

19

sensory cortex or thalamus.

Information about the stimulus enters the striatum via the sensory cortex or thalamus.

20

globus pallidus

The neural signal is sent to the globus pallidus where a response signal is generated.

21

pre-motor cortex

The response signal is sent out to the pre-motor cortex via the thalamus

22

substantia nigra

Dopamine from the substantia nigra strengthens the S-R association.
• The neural connection between the striatum and the globus pallidus is where S-R association is formed.
• Whether or not this neural connection gets stronger or weaker depends on the outcome of the behavior.

23

Structures involved in S-R Learning

The Basal Ganglia:
Caudate nucleus and Putamen (Striatum):
• Main input structure of the basal ganglia. Receives sensory input from thalamus—Processes Stimulus
Globus Pallidus:
• Main output structure of the basal ganglia. Sends motor output information to thalamus—Response output is generated.
Substantia Nigra:
• Produces dopamine necessary for the formation of S-R associations. Sends dopamine to striatum allowing neural transmission to happen.

24

Globus Pallidus:

• Main output structure of the basal ganglia. Sends motor output information to thalamus—Response output is generated.

25

Substantia Nigra:

Produces dopamine necessary for the formation of S-R associations. Sends dopamine to striatum allowing neural transmission to happen.

26

Functional Anatomy of S-R Learning System
Components of S-R Learning:

S-R Learning Involves the formation of three pair-wise associations:
1. Stimulus – Response (Basal Ganglia)
2. Response – Reinforcer (Prefrontal Cortex)
3. Stimulus – Reinforcer (Prefrontal Cortex)
Associations #2 and #3 require a prediction about an outcome: "If I perform this response, I will get the reinforcer."

27

Motor Response Sequence Learning

• Motor learning also involves the ability to form response sequences, or complex patterns of motor movements.
​Ex: Playing a musical Instrument.
• Not the result of chains of S-R associations or response chains – S-R-S-R-S-R-S-R-S-R
• Karl Lashley – showed that motor sequences are the result of motor programs which are entire representations of motor sequences.
• Motor programs are formed and stored in the cerebellum.

28

Karl Lashley –

showed that motor sequences are the result of motor programs which are entire representations of motor sequences.
• Motor programs are formed and stored in the cerebellum.

29

Motor Learning as S-R Learning

• Brenda Milner-
-Worked with patients who had amnesia to see if new motor learning could occur. Motor skills in those with amnesia could improve just as in those with normal cognitive learning.
- Showed motor learning was separate from cognitive learning and was a part of reinforcement learning or operant learning.
• The timing of training trials important:
​ Distributed Practice: spacing out training trials over time with periods of rest in between.
​ Massed Practice: Giving training trials back to back with no rest period in between.
• Distributed practice allows time for consolidation, the process of strengthening the neural connections underlying the S-R association. The S-R association requires Long-Term Potentiation (LTP), about 4 hours, to make a physically strong connection in your brain.

30

Brenda Milner-

Worked with patients who had amnesia to see if new motor learning could occur. Motor skills in those with amnesia could improve just as in those with normal cognitive learning.
- Showed motor learning was separate from cognitive learning and was a part of reinforcement learning or operant learning.

31

Distributed Practice:

spacing out training trials over time with periods of rest in between.
Distributed practice allows time for consolidation, the process of strengthening the neural connections underlying the S-R association. The S-R association requires Long-Term Potentiation (LTP), about 4 hours, to make a physically strong connection in your brain.

32

Massed Practice:

Giving training trials back to back with no rest period in between.

33

Water maze

• Rat is placed in a pool filled with cold water (uncomfortable).
• The rat swims around, eventually finding the platform that has a visual cue for escape.
• After several trials the rat swims right to the platform.
Measuring the time and distance of the swim path: it becomes more direct over time, taking less time to reach the platform.
Radial arm maze and Water maze have allowed us to identify the brain structures and processes involved in operant learning.
Cue on Platform (S) + Swim to/Climb on Platform (R) = Escape Cold Water (Reinforcer: Neg. reinforcement.)

34

Radial-Arm Maze + “Win – Stay” task

Rat is placed in the center of a maze that has eight arms. The rat eventually learns the arms with the light on = food and they will enter them more often.
• “Win- Stay” – The rat wins by staying with the same response. “Going to arms with a light on.”
Light (S) + Going to the end of a lit arm (R) = Food (Rf)

35

Brain Self-Stimulation

• Wires are attached to the Nucleus Accumbens – Part of the Reward Pathway of the Brain.
• When these neurons are active they produce an intense pleasurable sensation.
• The Reward Pathway (Nucleus Accumbens) was an accidental discovery.
James Olds and Peter Milner – were interested in sleep (reticular formation)
• Rats are placed in an operant chamber and activate a stimulator by pressing a lever.
• Used early on to understand drug addiction through chemical activation of neurons.
- Rats and humans will both undergo pain to receive the activation of the reward pathway or pleasure sensation.
• Neurons in the Nucleus Accumbens will adapt to constant stimulation and will need more stimulation to activate the same level of pleasure as before.

36

James Olds and Peter Milner –

The Reward Pathway (Nucleus Accumbens) was an accidental discovery.
were interested in sleep (reticular formation)

37

Other Operant Learning Tasks:

1. Brain Self-Stimulation
2. Radial-Arm Maze
3. Water Maze ​
Different from previous operant learning tasks.
Brain self stimulation: variation of operant conditioning chamber

Radial arm and water maze are very unusual

38

Learned Helplessness

• Martin Seligman
-Light turns on and the floor gets shocked.
- The dog is harnessed into place and cannot escape the shock.
- After several trials the dog will lay down knowing it cannot escape the shock and takes the pain.
- Test Trial: The harness is removed from the dog.
- The dog has the ability to move and escape to the other side.
- The light turns on, the floor is shocked, and the dog lies down and takes the pain again.
- The dog has learned to be helpless and gave up. Humans can learn helplessness. Seligman later changed his focus from learned helplessness to learned optimism.

39

Avoidance Paradox-

how can the non-occurrence of an event (no shock) act as a reinforcer for a behavior (jumping to the other side of the box)?

40

Two-Factor Theory of Avoidance Learning-

Classical Conditioning and Operant Learning are necessary for avoidance learning.
Two-Factor Theory of Avoidance Learning:
First- Classical Conditioning Occurs:
​ Light (CS) + Shock (US) = Fear (UR)
​​ Light (CS) = Fear (CR)
Second- Operant Learning Occurs:
Light (CS) produces fear (CR)
Light (S) → Jump to other side (R) → Removal of Fear (Reinforcer: Negative Reinforcement)
•”Avoidance” is really just ESCAPE from the light (CS) and the fear it produces (CR)
​• Dog escapes sense of Fear

41

Avoidance: Solomon and Wynne (1953) and the shuttle box:

Dog is placed in a shuttle box that has an electrified floor and a short barrier between two sides (A and B)
1. The light turns on in Side A for 10 sec.
2. Shock is produced after 10 sec.
3. The dog jumps to side B to escape the pain.
4. The light on side B turns on for 10 sec.
5. Side B is then electrified.
6. The dog jumps back to side A to escape the pain.
7. After several trials when the light turns on the dog jumps to the other side before the shock comes.
8. Escape learning then avoidance learning.

42

Avoidance Learning
Escape and Avoidance:

Escape: when you are in the presence of something bad and behave in a way that allows you to get rid of that thing.
​Ex: taking an aspirin to get rid of the headache you already have
​ Compromise in an argument.
Avoidance: when the behavioral response prevents the aversive thing from occurring in the first place.
​Ex: Putting on sun-block.
• Both involve Negative Reinforcement.

43

Factors Determining Effectiveness of Punishment:

1. Intensity of Punisher- has to be really high, intense consequences.
2. Time between the behavior and the delivery of punisher – less effective the longer you wait.
3. Schedule of Punishment – only works if you use a continuous punishment schedule
4. Motivation to do the behavior anyway – if motivation is too strong to do the behavior no amount of punishment is going to work.

44

Reinforcement VS Punishment

Reinforcement- results in an increase in behavior.
Punishment- results in a decrease in behavior.
Positive Reinforcement: behavior increases when a reinforcer is added after behavior
Negative Reinforcement: behavior increases when an aversive thing is subtracted after a behavior
Positive Punishment: behavior decreases when aversive thing is added after behavior
Negative punishment: behavior decreases when reinforcer is subtracted after behavior
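The 2×2 above can be written as a tiny lookup table (a sketch; the function name and argument labels are mine, not from the course):

```python
def classify_contingency(behavior, consequence):
    """Map the two defining features of an operant contingency to its name.

    behavior:    'increases' or 'decreases' (effect on the behavior)
    consequence: 'added' or 'subtracted'   (what follows the behavior)
    """
    table = {
        ('increases', 'added'):      'positive reinforcement',
        ('increases', 'subtracted'): 'negative reinforcement',
        ('decreases', 'added'):      'positive punishment',
        ('decreases', 'subtracted'): 'negative punishment',
    }
    return table[(behavior, consequence)]
```

Ex: dessert after eating dinner makes eating dinner more likely, and dessert is added, so `classify_contingency('increases', 'added')` gives 'positive reinforcement'.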

45

Positive Reinforcement:

behavior increases when a reinforcer is added after behavior

46

Negative Reinforcement:

behavior increases when an aversive thing is subtracted after a behavior

47

Positive Punishment:

behavior decreases when aversive thing is added after behavior

48

Negative punishment:

behavior decreases when reinforcer is subtracted after behavior

49

Partial Reinforcement Effect

Behavior Modification: If you give a reinforcer following a behavior even if infrequently, the behavior will persist.
​Ex: Parents reinforcing tantrum behavior in children by giving them what they want in the past.

Partial Reinforcement Effect – Extinction takes a long time to happen.
​​- Already used to waiting/ not being reinforced every time for behavior.

50

Extinction

- If you stop delivering the reinforcer after the behavioral response, you get extinction.
​Partial Reinforcement Effect – Extinction takes a long time to happen.
​​- Already used to waiting/ not being reinforced every time for behavior.
​​• Extinction occurs faster in a continuous schedule than a partial reinforcement schedule.
​Discrimination Hypothesis:
​Continuous Schedule- easy to tell the difference between learning trials and extinction trials.
​Partial Schedule- harder to tell the difference between learning trials and extinction trials.

51

Discrimination Hypothesis:

Continuous Schedule- easy to tell the difference between learning trials and extinction trials.
​Partial Schedule- harder to tell the difference between learning trials and extinction trials.

52

​3. Fixed Interval Schedule (FI) –

A reinforcer is delivered after the first response that is performed following a set or fixed amount of time. The reinforcer does not automatically show up after a certain amount of time has elapsed; you have to respond.
Fixed Interval Scallop (Behavior Pattern): very few responses at the beginning of the delay period,
more and more responses toward the end of the delay period.

53

Fixed Interval Scallop (Behavior Pattern) –

very few responses at the beginning of the delay period,
more and more responses toward the end of the delay period.

54

​2. Variable Ratio Schedule (VR) –

A reinforcer is delivered on average after every nth response, but the actual number of responses required to get the reinforcer varies widely. You never know when the next reinforcer is coming.
Ex: gambling, tips.
- This schedule produces rapid and fairly constant responding.
- Produces the fastest, longest-lasting, most consistent behavior.
Variable Ratio Schedule and Gambling:
1. A person's chance of winning (getting the reinforcer) increases the more times they play the game (make a response).
2. The number of responses needed to get the next reinforcer is uncertain.

55

​1. Fixed Ratio Schedule (FR) –

A reinforcer is delivered after every nth response (every 10th, 20th, 50th response, etc.)
- The number stays consistent.
Post Reinforcement Pause (PRP): stop doing a response for a certain period of time after reinforcement is given.
- The more responses you had to make to receive the reinforcer, the longer a break or pause you will take.
- The PRP lengthens as the number of required responses increases.

56

Post Reinforcement Pause (PRP) –

Fixed ratio.
Stop doing a response for a certain period of time after reinforcement is given.
- The more responses you had to make to receive the reinforcer, the longer a break or pause you will take.
- The PRP lengthens as the number of required responses increases.

57

Schedules of Reinforcement

Schedule = how often the reinforcer is delivered following the behavior.

Continuous Reinforcement Schedule – every time you do the behavior you get the reinforcer.
- The reinforcer strengthens the S-R association.
• A continuous reinforcement schedule is the worst way to reinforce if you want a long-lasting behavior.
- Worst one to use because of extinction.
- Extinction happens when the reinforcer no longer follows the response.
- Extinction occurs very quickly.

Partial Reinforcement Schedule- Reinforcer is not given every time the behavior happens.
​• Produces the strongest and longest lasting Behaviors.
​​1. Fixed Ratio​​
2. Variable Ratio
​​3. Fixed Interval​​
4. Variable Interval
​Fixed = schedule remains the same
​Variable = schedule changes
​Ratio = based on number of responses
​Interval = based on time.

Continuous reinforcement schedule: every time the behavior is performed, the reinforcer is given.
This should be a good thing, because the connection between stimulus and reinforcer is strengthened, BUT it is in fact the worst way if you want a long-lasting behavior.
Worst because it has a higher chance of extinction.
In operant conditioning, extinction happens when the reinforcer no longer follows the response (as in classical conditioning when the US no longer follows the CS), and on a continuous schedule extinction happens very quickly!
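The four partial schedules can be sketched as a small simulation (illustrative only; the function name, state layout, and variability rules are my own assumptions, not from the course):

```python
import random

def make_schedule(kind, n):
    """Return respond(t): call it once per response at time t; it returns
    True when that response earns the reinforcer.
    kind: 'FR'/'VR' (ratio, n = responses) or 'FI'/'VI' (interval, n = seconds)."""
    state = {'count': 0, 'required': n, 'available_at': n}

    def respond(t=0.0):
        if kind in ('FR', 'VR'):
            state['count'] += 1
            if state['count'] >= state['required']:
                state['count'] = 0
                if kind == 'VR':  # next requirement varies, averaging about n
                    state['required'] = random.randint(1, 2 * n - 1)
                return True
            return False
        # Interval schedules: the reinforcer becomes *available* once the
        # interval elapses, but a response is still required to collect it.
        if t >= state['available_at']:
            wait = n if kind == 'FI' else random.uniform(0, 2 * n)
            state['available_at'] = t + wait
            return True
        return False

    return respond

# FR-3: every third response pays off.
fr = make_schedule('FR', 3)
results = [fr() for _ in range(6)]  # [False, False, True, False, False, True]
```

Note how the interval branch mirrors the card above: on FI, responding before the interval elapses earns nothing, which is why the scallop pattern emerges.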

58

Reward VS Reinforcer

Reward is the positive affect or hedonic pleasure that is produced by encountering or obtaining some reinforcers, which results in the animal wanting to maintain contact with that reinforcer.
- Provides motivation to perform the behavior again.
Example: a heroin user injects heroin, which produces a pleasurable sensation that motivates them to do it again so they get that sensation again.

59

Positive Versus Negative Reinforcement:

1. Positive Reinforcement- behavior increases due to the addition of a reinforcing stimulus
​Ex: allowance, treats, other types of food.
2. Negative Reinforcement- behavior increases due to the subtraction of an aversive (bad) stimulus.
• Increase in behavior results in both positive and negative reinforcement used at the same time.
​Ex: Rat Training- adding food removes state of hunger for the rat.
Reinforcement = increasing behavior

60

Primary VS Secondary Reinforcers

Primary Reinforcer: a stimulus that is considered inherently rewarding, usually because it fulfills a biological or psychological drive.
Biological (death results if not fulfilled): respiration, shelter, eating, water, sleep, excretion.
Psychological (not needed to sustain life): love, acceptance, sunlight exposure, pride, achievement, life memories, family, learning/curiosity, sex.
Secondary Reinforcer: a stimulus that gains reinforcing power because of previous association with a primary reinforcer.
​Ex: money, cell phone, credit card, car, computer, video games

61

Primary Reinforcer:

a stimulus that is considered inherently rewarding, usually because it fulfills a biological or psychological drive.
Biological (death results if not fulfilled): respiration, shelter, eating, water, sleep, excretion.
Psychological (not needed to sustain life): love, acceptance, sunlight exposure, pride, achievement, life memories, family, learning/curiosity, sex

62

Secondary Reinforcer:

a stimulus that gains reinforcing power because of previous association with a primary reinforcer.
​Ex: money, cell phone, credit card, car, computer, video games

63

Other Conceptualizations of a Reinforcer:

Clark Hull
Need-Reduction Theory: any stimulus that reduces a biological need will act as a reinforcer.
Drive-Reduction Theory: any stimulus that reduces a biological or psychological drive will act as a reinforcer.
Response Deprivation Theory: making an animal work harder for a reinforcer increases the effectiveness of that reinforcer.
Ex: Lever 1: press 5x for 1 drink.
Lever 2: press 20x for 1 drink.
Later, give a choice of which lever to press.
- Common sense = Lever 1.
- Actually presses Lever 2; the sugar solution (Lever 2) tasted sweeter to the rat.
Ex: Sports drinks taste better after working out.

64

Need-Reduction Theory:

Clark Hull
any stimulus that reduces a biological need will act as a reinforcer.

65

Drive-Reduction Theory:

Clark Hull
any stimulus that reduces a biological or psychological drive will act as a reinforcer.

66

Response Deprivation Theory:

Clark Hull
making an animal work harder for a reinforcer increases the effectiveness of that reinforcer.
Ex: Lever 1: press 5x for 1 drink.
Lever 2: press 20x for 1 drink.
Later, give a choice of which lever to press.
- Common sense = Lever 1.
- Actually presses Lever 2; the sugar solution (Lever 2) tasted sweeter to the rat.
Ex: Sports drinks taste better after working out.

Chooses the harder one to get solution

Dating hard to get :)

67

Is the Reinforcer part of the S-R association?

S-R learning involves 3 Elements:
1. The stimulus the animal reacts to
2. The behavioral response the animal performs.
3. The reinforcer/outcome.
• Historically thought only the relationship of Stimulus – Response
Stimulus​ + ​Response​ = ​Reinforcer/Outcome

Tinklepaugh (1928)
Training Trials:
Monkey would pick up two cups.
Cup 1: had nothing. Cup 2: had a banana.
- The same cup led to a banana every time.
Test Trials:
Cup 1: nothing. Cup 2: lettuce.
Reaction: 1. If it only learned the S-R association (cup → pick it up), the changed reinforcer produces no reaction.
2. If it reacts, it reacts as if a certain reinforcer was expected.
Stimulus (Cup) + Response (Pick it Up) = Reinforcer (Banana)
Three Pair-Wise Associations are Learned:
Stimulus – Response
Response – Reinforcer
Stimulus – Reinforcer
• Animals learn that "when I experience stimulus (A), I should make response (B), because if I do then I will get reinforcer (C)."
• Based on Expectation.

68

S-R learning involves 3 Elements:

1. The stimulus the animal reacts to
2. The behavioral response the animal performs.
3. The reinforcer/outcome.

69

Tinklepaugh (1928)

Training Trials:
Monkey would pick up two cups.
Cup 1: had nothing. Cup 2: had a banana.
- The same cup led to a banana every time.
Test Trials:
Cup 1: nothing. Cup 2: lettuce.
Reaction: 1. If it only learned the S-R association (cup → pick it up), the changed reinforcer produces no reaction.
2. If it reacts, it reacts as if a certain reinforcer was expected.
Stimulus (Cup) + Response (Pick it Up) = Reinforcer (Banana)
Three Pair-Wise Associations are Learned:
Stimulus – Response
Response – Reinforcer
Stimulus – Reinforcer
• Animals learn that "when I experience stimulus (A), I should make response (B), because if I do then I will get reinforcer (C)."
• Based on Expectation.
Test trial:
Same two cups.
The monkey doesn't reach for the empty cup.
This time it gets lettuce.
Either there is no reaction, because it didn't learn about the outcome, or it does react, because it expects that outcome (it got angry).

70

Superstitious behavior most likely arose because of

operant learning

71

Variable ratio:

steady and fast rate of responding

72

Variable interval:

steady and moderate rate of responding

Time changes

73

Number one reason why punishment doesn't work....

Partial reinforcement effect: if you don't punish next time, it becomes a negative reinforcement schedule, which means the behavior will increase in frequency and continue on.
You only have to get away with it once for it to become a partial reinforcement schedule.

74

Both involve negative reinforcement

Escape and avoidance

75

Classical conditioning and operant learning are necessary for

avoidance learning

76

James Old and Peter Milner

Accidentally discovered the reward pathway; they were interested in sleep. They implanted electrodes into the brain stem, but the cement holding one electrode shifted it forward.
They expected the rat to fall asleep, but the electrode was not in the reticular formation; it had shifted forward, so the rat was feeling pleasurable sensations.
The rat was surgically implanted with an electrode in the nucleus accumbens: the reward pathway.

A pleasurable sensation happens when these neurons are activated.
Sex and food activate the reward pathway, but there is satiation ("I'm full!").

In the experiments there was no satiation ("I'm full").
If motivation is strong enough, no punishment matters.

77

Win stay:

win by staying with rewarded response (operant learning)

78

Win shift:

win by changing (cognitive learning)

81

Disadvantages of using Punishment:

1. Usually includes some negative emotional component (e.g. anger, fear) that can disrupt performance.
2. Can lead to a general suppression of all behaviors.
3. Behavior has to be monitored continuously.
• The schedule has to be continuous to avoid partially reinforcing the behavior.
• Offer a replacement behavior to get rid of the bad behavior and reinforce that “good” behavior.