Learning Flashcards
(39 cards)
Learning
the acquisition, from experience, of new knowledge, skills, or responses
Animal model of learning: Aplysia (sea slug)
- has only 20,000 neurons, but still shows simple forms of learning
Habituation: decrease in responding when a painless stimulus is repeated
- first touch to gill = withdrawal of gill and siphon
- repeated touches = withdrawal fades
Sensitization: increase in responding when a (noxious/painful) stimulus is repeated
- first shock to tail = withdrawal of gill
- repeated shocks = stronger withdrawal
Animal model of learning: C. elegans (small nematode worms)
- have only 302 neurons
- show simple forms of memory such as chemotaxis towards/away from salt based on past experiences. E.g., if previously fed in the presence of salt, it navigates towards salt (positive taxis to NaCl)
Classical (also: Pavlovian) Conditioning
A neutral stimulus gains significance through association:
1. Before conditioning:
Food is the unconditioned stimulus (US) which leads to the unconditioned response (UR) of salivation in dogs; the tuning fork is a neutral stimulus that elicits no response
2. During conditioning:
Tuning fork (neutral stimulus) is paired with food (unconditioned stimulus). The result is still the unconditioned response of salivation.
3. After conditioning:
Tuning fork has become a conditioned stimulus (CS) that elicits the conditioned response (CR) of salivation (without the presence of food)
Ivan Pavlov
Russian physiologist initially interested in digestion and salivation
Watson & Rayner (1920): The Little Albert experiment
- Demonstration of Pavlov’s effects in humans:
- Demonstration of an emotional response: after pairing the previously neutral stimulus of a rat with the unconditioned stimulus of loud noise, the rat became a conditioned stimulus and elicited fear in Little Albert
- Was interested in generalization to other white, furry objects similar to the rat (e.g., a Santa Claus mask) but didn't get that far
- Ethically dubious
Role of Classical Conditioning in Phobias
- In anxiety clinics, spider, dog and snake phobias predominate over electrical socket and gun phobias
- This suggests an evolutionary bias in conditioned fear –> preparedness
Garcia & Koelling (1966):
- Quinine taste (tonic water, CS) could be associated with nausea (US) from just one pairing, but not with shock (US)
- Light + tone (CS) could be associated with shock (US), but not nausea (US)
Extinction: inhibition or erasing?
Extinction = presentation of the conditioned stimulus alone (not in association with the US) leading to a decrease in the CR.
The learning of the connection between CS and US is not erased, because the conditioned response (CR) can return, either after a break (spontaneous recovery) or after a single new CS-US pairing (reinstatement). Relearning is fast, which suggests that the original learning was inhibited rather than erased.
Generalization & Discrimination
A generalization gradient: when initial training was to a 1050 Hz tone (the CS), a CR is still observed at frequencies close to 1050 Hz.
But the observation that the CR is reduced to similar stimuli is also evidence of (psychological) discrimination.
Second-order conditioning
If a new stimulus is paired with an established conditioned stimulus (without the US), the new stimulus can come to elicit the conditioned response - this is second-order conditioning. E.g.,
1. Tone (CS) + Food (US) = Salivation (CR)
2. Picture (second-order stimulus) –> Tone (CS) = Salivation (CR)
3. Picture (second-order stimulus) = Salivation (CR)
Classical conditioning in drug overdoses
- Drug paraphernalia (needles, ashtrays, environments) become conditioned stimuli that precede drug effects
- Siegel model of drug overdose:
- for heroin users, the CR is opposite to the UR:
- heroin: slows heart rate/breathing (UR)
- drug paraphernalia: increase heart rate/breathing (compensatory CR)
- Risk of overdose if the drug is administered in a novel environment (few conditioned cues), where the preparatory, compensatory CR does not occur
Law of Effect (who & what)
Thorndike - law of effect is the principle that behaviours that are followed by a “satisfying state of affairs” tend to be repeated, and those that produce an “unpleasant state of affairs” are less likely to be repeated.
- invented the “puzzle box” for cats
- if the behaviour was “correct”, the animal was rewarded with food (rewarded actions are “stamped in”)
- incorrect behaviour = no result, animal was stuck in the box (profitless actions are “stamped out”)
- over time, the number of ineffective behaviours decreased and animals escaped from the box more and more quickly
Operant Conditioning (who & what)
Skinner - operant behaviour refers to behaviour that an organism performs that has some impact on the environment
- operant conditioning is based on the idea of reinforcement: any stimulus or event that increases the likelihood of the behaviour that led to it
Operant Conditioning Terminology
increases the likelihood of behaviour:
- Stimulus is presented = Positive Reinforcement
- Stimulus is removed = Negative Reinforcement
decreases the likelihood of behaviour:
- Stimulus is presented = Positive Punishment
- Stimulus is removed = Negative Punishment
Shaping
Learning that results from the reinforcement of successive steps to a final desired behaviour
Superstitious learning
Attributing reward to the wrong response (occurs when the reward is presented after a fixed time interval regardless of what action the animal performs, leading to a random behaviour being reinforced)
Immediate vs Delayed Reinforcement and Punishment in Operant Conditioning
Long delays between the occurrence of a behaviour and the reinforcer create a 'credit assignment' problem: the more time elapses, the less effective the reinforcer, because it becomes harder for the animal to figure out which exact behaviour it needs to perform in order to obtain it.
Immediate vs Delayed Reinforcement and Punishment in Classical Conditioning
The longer the delay between the CS and the US, the harder it is to associate the two and elicit a CR whenever the CS is presented. Exception:
Conditioned taste aversion: powerful (1 trial) learning of connection of CS with sickness (UR) occurring hours later
Schedules of Reinforcements
- Fixed-Interval
- Variable-Interval
^ interval schedules produce slow, methodical responding because reinforcements follow a time scale that is independent of how many responses occur
- Fixed-Ratio
- Variable-Ratio
Fixed-Interval Schedule
reinforcers are presented at fixed time periods, provided that the appropriate response is made –> shows a scalloping effect: after each reinforcement, responding pauses, but as the next time interval draws to a close, there is a burst of responding
Variable-Interval Schedule
a behaviour is reinforced on the basis of an average time that has elapsed since the last reinforcement –> produces steady, consistent responding because the time until the next reinforcement is less predictable
Fixed-Ratio Schedule
reinforcement is delivered after a specific number of responses have been made.
Special case - presentation of reinforcement after every response = continuous reinforcement
Variable-Ratio Schedule
delivery of reinforcement is based on a particular average number of responses, although the ratio of responses to reinforcements is variable.
- these produce slightly higher rates of responding than fixed-ratio schedules because the organism never knows when the next reinforcement is going to appear; the higher the response-to-reinforcement ratio, the higher the response rate
Which schedules (ratio/interval) yield higher overall responding?
Ratio