learning part 4 Flashcards
(136 cards)
⚙️ What Is Instrumental Conditioning?
Instrumental Conditioning (also called Operant Conditioning) is learning through consequences — that is, learning that a certain behavior leads to a specific outcome. It’s called instrumental because your behavior is the instrument (the tool) that produces a result.
Elicited Behavior vs Instrumental Behavior
🌩 Elicited behavior
These are automatic responses that happen when a stimulus appears — you don’t choose them. Examples include:
Habituation: You stop reacting to something (e.g. stop noticing a ticking clock).
Sensitisation: You become more reactive to a repeated stimulus (e.g. a loud noise becomes more annoying).
Classical Conditioning: Like Pavlov’s dog — a bell (stimulus) makes the dog salivate because it predicts food.
🧠 These behaviors are:
“Triggered or prompted by a specific stimulus in the environment… automatic or involuntary.”
“Do the procedures… require the participant to make a particular response to obtain food or other USs (Unconditioned Stimuli) or CSs (Conditioned Stimuli)?”
“They do not require the participant to make a particular response.”
In other words, the stimulus triggers the behavior — the participant is not choosing a behavior to get a reward.
🛠 Now Enter: Instrumental Conditioning
In instrumental conditioning, your behavior controls what happens.
Instead of something in the environment making you react (stimulus → response), you do something to make something happen (behavior → stimulus).
💡 Analogy:
Think of a vending machine:
You press a button → You get a drink.
Your action (behavior) causes the outcome (stimulus: drink).
If you do nothing, nothing happens.
🔄 Classical Conditioning (for comparison):
A bell rings (stimulus) → You salivate (response).
The environment acts on you, not the other way around.
☕ Everyday Analogy of instrumental conditioning
“Putting a coin in a coffee machine = coffee.”
Your action (putting the coin) is instrumental because it produces a result (coffee).
You do this behavior because it worked in the past — this is learning through consequence.
🧪 Thorndike’s Puzzle Boxes
Edward Thorndike studied how animals learn to get what they want through trial and error. He:
“Placed hungry animals (mainly cats) in puzzle boxes… with some food left outside in view of the animal.”
💡 Analogy: Imagine being locked in a room with snacks on the other side of a glass wall — you can see them but need to figure out how to unlock the door.
🎯 Goal for the Animal:
“Learn how to escape from the box to obtain the food.”
📦 Different Boxes, Different Tricks (Thorndike’s Puzzle Boxes)
“Different puzzle boxes required different responses to get out.”
Example:
Box A: Cat must pull a ring.
First time: 160 seconds to solve it.
Later: 6 seconds — it learns the trick!
“Box I: cats push a lever down.”
Each box was a new problem, and the cats learned through trial and error.
🧠 How Did They Learn? (Thorndike’s Puzzle Boxes)
“Initially, the cats showed random behaviours… but with continued practice… latencies became shorter.”
This means they got faster at escaping.
“Through trial and error… they retained successful behaviours and eliminated useless behaviours.”
💡 Analogy: Like trying different keys on a locked door — eventually, you find the one that works and remember it.
“Although Thorndike titled his treatise ‘animal intelligence’, many aspects of behaviour seemed unintelligent.”
That is, the animals weren’t reasoning; they were just trying things until something worked.
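The trial-and-error process described above can be sketched as a toy simulation (an illustration under assumed numbers, not Thorndike’s actual data or procedure): the cat samples responses at random, successful responses are retained (strengthened), and useless ones gradually fade.

```python
import random

def trial_and_error(responses, effective, trials=30, seed=1):
    """Toy sketch of Thorndike-style trial and error (hypothetical weights):
    each trial the animal picks a response in proportion to its weight;
    the effective response is strengthened, the rest are weakened."""
    random.seed(seed)
    weights = {r: 1.0 for r in responses}   # start with no preference
    for _ in range(trials):
        choice = random.choices(list(weights), weights=weights.values())[0]
        if choice == effective:
            weights[choice] *= 1.5   # retained: escape follows, so strengthen
        else:
            weights[choice] *= 0.7   # eliminated gradually: no escape
    return weights

w = trial_and_error(["scratch", "meow", "pull ring"], "pull ring")
```

After training, the effective response (“pull ring”) dominates the weights, mirroring how the cats’ escape latencies dropped from 160 seconds to 6.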
🔄 Reinforcement & Learning (Thorndike’s Puzzle Boxes)
“Behaviours that result in a positive outcome (escape) lead to an association between stimuli in a puzzle box and the effective response (pushing the lever).”
So:
See lever (stimulus)
Push lever (response)
Door opens (consequence)
Then:
“The consequence (escaping) reinforces this association.”
But it’s important to clarify:
“Not [that] the cat sees the lever and understands how it works.”
The cat doesn’t reason it out like a human. It just learns: “When I do X → I get Y.”
Law of Effect (💡 Thorndike’s Discovery)
💡 Thorndike’s Discovery:
“Thorndike established the LAW OF EFFECT.”
The Law of Effect is like a rule of thumb that says:
If something good happens after a behavior, you’re more likely to do that behavior again.
If something bad happens (or nothing happens), you’re less likely to repeat it.
Example of law of effect
“If an R (pressing a lever) in the presence of an S (lever) is followed by a positive event (escape), the association between the S-R becomes strengthened.”
🟡 Translated:
R = Response (e.g. pressing a lever)
S = Stimulus (e.g. seeing the lever)
Positive Event = Escape
So: If the cat presses the lever (R) when it sees the lever (S) and that lets it escape (reward), then it will remember this combo.
✅ “The R will be more likely to happen again the next time the S is encountered.”
🚫 Negative Event Weakens the S-R Association (response weakening)
“If an R (reaching for the door-handle) in the presence of an S (the door-handle) is followed by a negative event (no escape), the association between the S-R becomes weakened.”
🟡 Translation:
If you do a behavior and nothing good happens, your brain learns: “this isn’t worth it.”
❌ “The R will be less likely to happen the next time the S is encountered.”
🧠 This is called response weakening.
Key Concept of law of effect
The animal learns a connection between seeing something (S) and doing something (R) — like seeing a lever (S) and pushing it (R).
What happens after (the consequence) doesn’t get remembered as a “goal,” but simply makes the S-R link stronger or weaker.
💡 It’s like the brain is saying:
“Whenever I see this thing and do that move, something good/bad follows. I’ll do it more/less next time.”
🧠 The animal isn’t thinking about the reason — it just gets better at repeating the action that worked.
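The Law of Effect can be sketched as a minimal S-R weight update (a toy model with made-up numbers, not anything from the source material): a “strength” value stands for the tendency to respond when the stimulus appears, and the positive consequence nudges it upward.

```python
import random

def simulate_law_of_effect(trials=50, lr=0.2, seed=0):
    """Toy sketch of the Law of Effect (hypothetical parameters):
    'strength' is the probability of pressing the lever when the cat
    sees it; each successful press (escape) strengthens the S-R link."""
    random.seed(seed)
    strength = 0.1          # initially weak S-R association
    history = []
    for _ in range(trials):
        pressed = random.random() < strength   # respond with prob = strength
        if pressed:
            # positive event (escape) -> strengthen the S-R association
            strength += lr * (1.0 - strength)
        history.append(strength)
    return history

h = simulate_law_of_effect()
```

Note the model stores only an S-R strength, never a “goal”: the consequence changes the link, which matches the key concept above.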
🧭 What Are Discrete-Trial Procedures?
“DISCRETE-TRIAL PROCEDURES can also be conducted in mazes similar to Thorndike’s puzzle box.”
These are experiments where the animal only has one chance per trial to perform a response.
Example:
“Rat begins in a start box and travels down a runway to the other end (the goal box) that has a reinforcer (e.g. food/water).”
💡 Analogy: Like a timed race where the rat runs from Start to Goal, and the prize is a snack at the end.
✅ “The trial is over once the rat has performed the instrumental response (i.e. reached the goal box).”
🏃 How to Measure Learning in Discrete-Trial Procedures?
We measure how well the animal is learning by looking at:
- Running Speed
“How fast the animal gets from the start box to the goal box.” It increases with repeated training trials.
🟢 If learning happens → rat runs faster 🔴 If confused → rat runs slower
- Response Latency
“Time taken to leave the start box and begin moving down the alley.” It becomes shorter as training progresses.
🟢 If the rat has learned what to do, it starts quickly 🔴 If it’s unsure or unmotivated, it hesitates
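Both measures reduce to simple formulas. The sketch below uses hypothetical trial data (the 2 m runway length and the run times are assumptions for illustration):

```python
def running_speed(runway_length_m, run_time_s):
    """Running speed = distance from start box to goal box / time taken."""
    return runway_length_m / run_time_s

def response_latency(leave_time_s, door_open_time_s=0.0):
    """Latency = time from trial start until the rat leaves the start box."""
    return leave_time_s - door_open_time_s

# Hypothetical training data: run times shrink across successive trials
run_times = [40.0, 25.0, 12.0, 8.0, 5.0]          # seconds per trial
speeds = [running_speed(2.0, t) for t in run_times]  # assumed 2 m runway
```

As run times fall across trials, computed speed rises, which is exactly the “learning happens → rat runs faster” pattern above.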
🧪 T-Maze & Complex Learning
A T-Maze is shaped like a “T”. The rat starts at the bottom and must choose to turn left or right.
“A goal box is located in each arm of the T, allowing the study of more complex processes.”
🐭 Experiment with Baby Rats in T-Maze
“They placed its mother in the right goal box, and another female rat in the left.”
Then:
“One trial consisted of putting the baby rat in the start box, and when the rat reached the goal box where the mother was, the trial was over.”
The reinforcer = being reunited with the mother.
“The rats learned to turn right with successive trials, and this behaviour continued even after the mother was not there.”
💡 Important: The rats learned the response itself (turning right), not just to approach the mother.
🤯 What Did the Rats Learn in the T-Maze experiment?
“Stimulus: JUNCTION” (where the maze splits)
“Instrumental Response: TURN RIGHT”
“Reinforcing Outcome: TO MEET WITH ITS MOTHER.”
So when the baby rat sees the junction (stimulus), it learns to turn right (response) to meet mom (reward).
“When the baby rats saw the junction in the maze (the stimulus) they turned right (instrumental response), which led to the reinforcing outcome (the mother).”
And:
“The reinforcing stimulus made it more likely that the rat would turn right in the future.”
Thorndike’s procedures (puzzle boxes and mazes)
“In Thorndike’s procedures (puzzle boxes and mazes), the animal only has the opportunity to show instrumental responses during specific periods of time: trials.”
Each learning opportunity is limited and controlled.
“The animal has limited opportunities to respond, and those opportunities are scheduled by the experimenter.”
This ensures precise measurement of how behavior changes over time.
Free-operant Procedures analogy
(Think: letting someone play with a video game without stopping them after each level)
B.F. Skinner revolutionized psychology by creating free-operant procedures, which differ from older, more rigid discrete-trial procedures.
🔑 Key Concept: Free-operant behavior
In discrete-trial setups (like a maze), the subject is removed after each trial—like doing one puzzle, stopping, and starting over.
In free-operant procedures, the subject is not removed after each trial, so behaviour can be studied more continuously. It’s like leaving a rat in a video game world where it can keep playing.
🐀 The animal is free to produce instrumental behaviour many times.
It can press a lever as much as it wants, whenever it wants.
This procedure is more natural since behaviour is continuous: one activity leads to another, and behaviour is not artificially divided into separate trials by the experimenter.
Imagine watching someone cook: first they wash vegetables, then chop them, then cook them—all in a flow. This is more natural than stopping after each step.
It allows the study of more continuous behaviour that can be broken into measurable units called operants.
Operant = a measurable action, like pressing a lever to receive food.
The Skinner Box (Operant Chamber)
(Lab tool for studying free-operant behavior)
Skinner box: allows the study of free-operant behaviour.
Inside the box:
There’s a lever (for rats) or a key (for pigeons).
When pressed or pecked, it gives food (reinforcer).
This lets scientists observe:
How often the animal performs the behavior
How long it takes to learn
What affects the rate of responding
Diagram:
Hungry rats → Push the lever → A pellet of food falls into the food cup
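Because the animal can respond at any time, the key measure in a Skinner box is the rate of responding. A minimal sketch of that computation, using made-up press timestamps (not real lab data):

```python
def response_rate(press_times_s, session_length_s):
    """Rate of responding: lever presses per minute over the session."""
    return len(press_times_s) / (session_length_s / 60.0)

# Hypothetical session: timestamps (seconds) of lever presses in a 120 s session
presses = [5.1, 9.8, 14.2, 20.0, 24.7, 30.3, 33.9, 40.5]
rate = response_rate(presses, 120.0)  # presses per minute
```

A rising rate across sessions indicates that reinforcement is strengthening the operant; the timestamps also let researchers see how responding is distributed within a session.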
What is operant?
Operant = a measurable action, like pressing a lever to receive food.
Operant Response and Operant in The Skinner Box?
🔄 Operant Response and Operant
- Operant response
= lever pressing or pecking a key
This is the action the animal performs to get the outcome.
- Operant
= the modification of behaviour by the reinforcing or inhibiting effect of its own consequences.
This is the whole learning process: the behaviour changes because it produces a result (like food or no food).
- Same operant response = different behaviours with the same result
Different behaviours that result in the same effect on the environment are considered the same operant response.
💡 If a rat presses a lever with its left paw, right paw, or nose, all count as the same operant response, because the effect (food) is the same.
“What matters is not the muscles involved in performing the behaviour, but how the behaviour operates on the environment (its results).”