HC.1.3 Flashcards
(25 cards)
Explain the different types of reinforcement and punishment
- Positive reinforcement:
- Behaviour leads to a positive event
- You add something pleasant
- This encourages the behaviour because it brings a good outcome
- Negative reinforcement:
- You remove something unpleasant
- This encourages the behaviour
- Positive punishment:
- You add something unpleasant
- This discourages the behaviour
- Negative punishment:
- You remove something pleasant
- This discourages the behaviour
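The four types form a 2x2 grid: add or remove a stimulus, and the stimulus is pleasant or unpleasant. A minimal sketch for self-testing, assuming hypothetical string labels ("add"/"remove", "pleasant"/"unpleasant") that are not part of the lecture material:

```python
def classify_consequence(stimulus_change: str, stimulus_quality: str) -> str:
    """Name the operant-conditioning procedure for a consequence.

    stimulus_change:  "add" or "remove" (hypothetical labels)
    stimulus_quality: "pleasant" or "unpleasant" (hypothetical labels)
    """
    if stimulus_change not in ("add", "remove"):
        raise ValueError("stimulus_change must be 'add' or 'remove'")
    if stimulus_quality not in ("pleasant", "unpleasant"):
        raise ValueError("stimulus_quality must be 'pleasant' or 'unpleasant'")

    # "Positive" means something is added, "negative" means something is removed.
    sign = "positive" if stimulus_change == "add" else "negative"

    # Behaviour is encouraged when something pleasant is added
    # or something unpleasant is removed; otherwise it is discouraged.
    encourages = (stimulus_change == "add") == (stimulus_quality == "pleasant")
    kind = "reinforcement" if encourages else "punishment"
    return f"{sign} {kind}"


if __name__ == "__main__":
    for change in ("add", "remove"):
        for quality in ("pleasant", "unpleasant"):
            print(change, quality, "->", classify_consequence(change, quality))
```

Note that "positive" and "negative" describe adding or removing, not whether the outcome is good or bad.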
What is goal-directed action?
Behaviors that are performed with a specific outcome in mind. These actions are:
- deliberate and purposeful
- Based on the expected value of the outcome (i.e., how much the person wants the result).
- Sensitive to changes in the value of the reward or changes in action-outcome contingencies (e.g., if the reward is no longer good, or if the action no longer leads to the reward, the person will stop doing it).
Early in drug use or recovery efforts, many behaviors are goal-directed.
- Taking drugs to feel good or escape negative emotions (the goal is pleasure or relief).
- Staying abstinent to earn vouchers or repair relationships (the goal is reward or social approval).
Here, individuals:
- Weigh the consequences (e.g., “Will this get me high?” or “Will I pass the urine test and get a voucher?”).
- Adapt their behavior based on what they value most at that time.
You need both the cognitive belief criterion and the motivational desire criterion:
1. Belief that the action will lead to the outcome
2. Desire for the outcome
What is meant by the (cognitive) belief criterion?
Performance is mediated by a representation of the causal relationship between the action and its outcome
- This means that the person understands that a specific action leads to a particular result.
- It involves learning and memory: e.g., “If I press this lever, I’ll get food” or “If I go to therapy, I’ll feel better.”
- Cognitive basis: You must believe the action causes the outcome
In the context of addiction:
A person may believe that taking the drug will relieve stress or cause euphoria (Belief Criterion).
What is meant by the (motivational) desire criterion?
Performance is mediated by a representation of the current goal or incentive value of the outcome.
- This reflects the value or desire for the outcome right now.
- Even if you know pressing the lever gives food, you won’t do it if you’re not hungry.
- Motivational basis: You must want the outcome for the behavior to occur.
In the context of addiction:
If the person is currently craving or in withdrawal, the desire to use is high (Desire Criterion).
What happens when you consistently use a substance? Explain
When you repeatedly use a substance, use becomes automatic: a habit forms.
Core Idea: When a behavior is repeated in response to the same cue and consistently followed by a reward, it can become automatic.
Example: After work, someone repeatedly goes to the bar and gets rewarded (e.g., alcohol, relaxation, socializing).
- Over time, “end of workday” becomes a cue that triggers the behavior automatically, regardless of whether the person still desires the reward
- Initially, substance use might be goal-directed (e.g., to feel good, cope with stress).
- But with repetition, it becomes cue-driven (e.g., finishing work, seeing a friend, stress = trigger).
The person may continue to use even when:
- The substance is no longer rewarding.
- The outcome is negative.
- They consciously want to stop.
This is how addiction can shift from voluntary use to compulsive, habitual behavior, which is harder to control and less influenced by goals or consequences.
What is the definition of a habit?
Habits are instrumental responses that are triggered by stimuli, and that do not depend on the current motivation for the outcome of the behavior.
So, habits are stimulus-response (S-R) driven — they’re less about wanting the outcome, more about repeating what’s been reinforced.
Explain a field experiment on snacking habits
- 98 people at the cinema are given either fresh or stale popcorn without knowing which.
- The researchers then assessed how often participants typically eat popcorn at the cinema (habit strength).
Results:
- Low/Moderate habit participants (people who did not eat popcorn at the cinema very often before this experiment) ate less stale popcorn than fresh (they were influenced by reward value).
- High habit participants ate just as much stale as fresh — showing behavior wasn’t driven by taste or reward, but by habit.
- In a non-habitual context (meeting room): Everyone ate less stale than fresh — habits didn’t get triggered.
Explain Thorndike’s law of effect
Thorndike says:
- When a behavior (response) is followed by satisfaction (i.e., a reward), the connection between the situation (stimulus) and the response is strengthened — making it more likely to happen again in the same situation.
- When a behavior is followed by discomfort (i.e., a punishment or negative outcome), the connection is weakened — making it less likely to be repeated.
- The stronger the satisfaction or discomfort, the stronger the impact on learning.
LEARN THE LITTLE GRAPHIC MODEL:
- S = stimulus
- R = Response
- A reward strengthens the S-R association
Over time, this becomes a habitual link — the response happens automatically when the stimulus is present, even without conscious thought or motivation for the reward.
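The strengthening and weakening of the S-R link can be pictured as a simple update rule. The sketch below is only an illustration, not Thorndike's own formal model: the linear rule, the `learning_rate` parameter, and the 0-to-1 strength scale are all assumptions made for the example.

```python
def update_sr_strength(strength: float, outcome: float,
                       learning_rate: float = 0.1) -> float:
    """Return the new S-R association strength after one experience.

    strength:      current association strength, kept between 0 and 1 (assumed scale)
    outcome:       > 0 for satisfaction (reward), < 0 for discomfort (punishment)
    learning_rate: hypothetical parameter scaling how fast the link changes
    """
    # A larger satisfaction or discomfort produces a larger change.
    new_strength = strength + learning_rate * outcome
    # Clamp so the strength stays on the assumed 0..1 scale.
    return min(1.0, max(0.0, new_strength))


if __name__ == "__main__":
    strength = 0.2
    for trial in range(5):
        # The response is rewarded on every trial, so the link keeps growing.
        strength = update_sr_strength(strength, outcome=1.0)
        print(f"trial {trial + 1}: strength = {strength:.2f}")
```

With repeated reward the strength climbs toward its ceiling, which matches the idea that the response eventually fires on the stimulus alone.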
Explain the dual-process theory:
This model suggests that behaviour is controlled by 2 systems: a goal-directed system and a habitual system:
With repetition, control over behavior shifts from:
- Flexible, goal-directed control ➡️
driven by:
—- Belief criterion: ‘I believe this action will cause a desired outcome’
—- Desire criterion: ‘I currently value the outcome I’m acting toward’
Behaviours are deliberate and sensitive to changes in value or consequences
- To inflexible but efficient habitual control:
— Behavior is triggered by external cues, not current goals or desires.
— Stimulus-response (S-R) based.
— Automatic, inflexible, and not influenced by whether the outcome is still wanted
A balance between these systems exists, but can tip depending on experience or neurological changes (like drug use).
Chronic drug use disrupts the balance:
1. Goal-directed system weakens:
- Drugs damage the prefrontal cortex ➡️ impairing planning, decision-making, value assessment.
2. Habit system strengthens:
- Drugs strongly reinforce S-R associations via dopamine.
- Even when the user no longer wants the drug, cues can trigger use automatically.
🧠 Result:
- Drug-seeking becomes automatic, inflexible, and compulsive.
- Driven by habit, not desire — a hallmark of advanced addiction.
Explain which 2 views are contrasting:
- The incentive-sensitization theory (Berridge & Robinson):
- Habits don’t explain addiction
- Addiction is driven by pathological wanting or craving
- Even without liking the drug, intense motivation for it (due to dopamine sensitization) drives compulsive use
- Cognitive control is impaired, but motivation is key
- Habit account (Everitt & Robbins):
- Habits do explain addiction
- Over time, drug use becomes a strong habit, cued by environment (e.g., seeing a syringe, place, time).
- Compulsive use = loss of goal-directed control and takeover by habit system.
- Combined with impaired executive function, this explains compulsive use.
Describe the neurobiological basis of addiction
3 stages of addiction:
1. Initial drug use:
- Driven by craving and goal-directed drug-seeking
- Involves the prefrontal cortex (PFC) and the mesolimbic dopamine pathway
— VTA (ventral tegmental area) > NAcc (nucleus accumbens)
— Associated with reward, motivation, and decision-making
2. Drug habits:
- After repeated use, drug-seeking becomes habitual.
- Controlled by the Nigrostriatal Dopamine Pathway (red):
— Substantia Nigra (SN) → posterior putamen (part of the striatum)
— This supports stimulus-response (S-R) learning: automatic, cue-driven behavior.
3. Drug abuse/addiction:
- Behavior becomes compulsive, difficult to control.
- Caused by PFC dysfunction (blue):
— Damage to this area impairs goal-directed control, decision-making, and impulse regulation.
Describe the 3 dopamine pathways that are involved in behaviour and addiction
- Mesolimbic pathway:
- VTA > NAcc
- Involved in reward, pleasure, reinforcement
- Key in initial drug use and goal-directed behaviour
- Nigrostriatal pathway:
- Substantia Nigra (SN) > posterior putamen
- Important for habit formation and movement
- Becomes dominant in habitual drug-taking
- Mesocortical pathway:
- VTA > PFC
- Involved in executive function, planning, inhibition
- Dysfunction here contributes to compulsive use and poor self-control
EXAMPLE EXAM QUESTION: Which theory can account for the lower relapse rates in Vietnam soldiers compared to relapse in individuals returning from a drug rehab center where they were treated for heroin use?
> Both the incentive-sensitization theory and the habit theory. When individuals return home from rehabilitation, they re-enter an environment associated with past drug use. This contextual cue (home) can trigger drug-seeking behavior:
1. Habit account:
- S-R learning
- Based on Thorndike’s Law of Effect: If heroin use in the home previously led to reward (or withdrawal relief), the context (home) becomes a stimulus.
- This stimulus triggers the automatic response: seeking heroin.
- Even without craving, the behavior is habitual and cue-driven
2. Incentive-sensitization theory:
- “Home” becomes a conditioned stimulus (CS).
- It was repeatedly paired with heroin (the unconditioned stimulus, US).
- Now, due to neural sensitization, just being in the home can produce intense “wanting”, even if the drug is no longer pleasurable.
Unlike rehab patients, Vietnam soldiers’ home environment was not associated with heroin use — they used drugs in Vietnam, not at home.
1. Habit account:
- Stimulus (Vietnam) → Response (use heroin)
- Since home was never part of the S-R link, returning home did not trigger the habitual behavior.
- The cue was removed, so the habit couldn’t operate.
2. Incentive-sensitization theory:
- “Vietnam” was the conditioned stimulus for heroin.
- Returning home meant no cue-triggered wanting, reducing relapse.
Here’s why both Habit Theory and Incentive-Sensitization Theory can explain the Vietnam relapse findings:
✅ Habit Theory explains:
- Vietnam soldiers’ drug use was context-specific (stimulus = Vietnam).
- When they returned home, the stimulus was removed, breaking the S-R chain.
- So, without environmental triggers, the habit couldn’t activate → lower relapse.
✅ Incentive-Sensitization Theory explains:
- The cues (like sights, smells, stressors) in Vietnam had become conditioned stimuli that triggered “wanting”.
- When those cues were left behind, the sensitized response wasn’t activated.
- At home, no intense craving was triggered → lower relapse.
✅ So together:
- Habit theory explains the behavioral automaticity tied to environment.
- Incentive-sensitization theory explains the motivational drive (or lack thereof) when cues are gone.
What is the outcome devaluation paradigm?
The Outcome Devaluation Paradigm is a classic experimental setup used to distinguish between goal-directed and habitual behavior. It tests whether an animal’s actions are still sensitive to the value of the outcome.
- Phase 1: Instrumental learning.
- The animal (usually a rat or mouse) learns a stimulus-response-outcome association.
- Example: Lever press (response) → Food pellet (outcome).
- This phase establishes a link between behavior and reward.
- Phase 2: Outcome devaluation.
- The value of the outcome (food) is reduced by:
— Satiation (feeding the animal until it’s full),
— Or pairing the food with something unpleasant (e.g., LiCl-induced nausea, which creates taste aversion).
- This step reduces the animal’s motivation for the reward.
- Phase 3: Extinction test.
- The animal is placed back in the test environment without the food being delivered, so behavior is assessed in absence of reward.
- Researchers look at whether the animal continues to press the lever
INTERPRETATION:
- If the animal presses the lever less → its behavior is goal-directed (it’s sensitive to the outcome’s new value).
- If the animal keeps pressing the lever → the behavior is habitual (automatic, not sensitive to current reward value).
WHY IT MATTERS:
- The outcome devaluation paradigm helps researchers understand how habits form and how drug-seeking can become compulsive.
- It’s often used in studies of addiction, OCD, and decision-making, where behavior might continue even when it’s no longer beneficial.
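The interpretation rule above can be written as a small decision function. This is a sketch under stated assumptions: responding for the devalued outcome is compared with responding for a still-valued outcome, and the cut-off `threshold` is a hypothetical value chosen for illustration. Real studies use statistical tests on group data rather than a fixed ratio.

```python
def interpret_devaluation_test(presses_valued: float, presses_devalued: float,
                               threshold: float = 0.8) -> str:
    """Label extinction-test responding as goal-directed or habitual.

    presses_valued:   lever presses when the outcome still has its value
    presses_devalued: lever presses after the outcome was devalued
    threshold:        hypothetical cut-off on the devalued/valued ratio
    """
    if presses_valued <= 0:
        raise ValueError("presses_valued must be positive")

    # Ratio near 1 means responding was unaffected by devaluation.
    ratio = presses_devalued / presses_valued

    # Clearly fewer presses after devaluation: behaviour tracks outcome value.
    if ratio < threshold:
        return "goal-directed"
    # Responding persists despite devaluation: behaviour is cue-driven.
    return "habitual"


if __name__ == "__main__":
    print(interpret_devaluation_test(presses_valued=40, presses_devalued=10))
    print(interpret_devaluation_test(presses_valued=40, presses_devalued=38))
```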
What is the primary reason for conducting the outcome-devaluation test in extinction?
To prevent new learning based on the outcome’s new value during the test, and so isolate the motivation the animal already has. Because:
If the reward (outcome) were still delivered during the test, the animal could:
- Re-learn that the outcome is no longer desirable or available,
- Update its behavior during the test itself,
- Or receive new reinforcement that could mask the true influence of the devaluation.
In other words: if you let the animal receive the devalued reward, it might learn during the test that the outcome is now bad (e.g., because it causes nausea or isn’t satisfying). That would confound the test of whether the animal had already internalized the devaluation before the test.
By testing in extinction:
- You observe the animal’s initial response tendencies, based solely on what it previously learned.
- It tells you whether the value of the outcome (goal) is still guiding behavior — without interference from new experiences.
- So, any reduction in responding reflects a true memory-based, goal-directed adjustment, not trial-by-trial learning.
Explanation by Thorndike’s law of effect:
This principle says:
Responses followed by satisfaction are reinforced; responses followed by discomfort are weakened.
- If you run the test with the outcome still being delivered, the animal could learn new S-R associations based on current satisfaction or discomfort.
- That would activate new learning, rather than reflect whether the original response is still goal-directed based on the already-devalued outcome.
Which 2 factors influence whether behaviour remains goal-directed or habitual?
- Repetition:
- Key point: When an action is repeated many times (i.e., overtrained), it can become insensitive to outcome devaluation.
- This is called behavioral autonomy, meaning the behavior is no longer guided by the current value of the outcome; it is now habitual.
- Classic research (e.g., Dickinson, 1985) showed that the more a response is practiced, the more it shifts from goal-directed to habit-based control.
- Context matters:
- Key point: The insensitivity to outcome devaluation that arises from overtraining is context-specific.
- According to Thrailkill & Bouton (2015), if you change the context after overtraining, the habit might not transfer.
- This suggests that habits are tied to environmental cues, and contextual change can sometimes “reveal” goal-directed control that was suppressed.
Give an example for the outcome devaluation paradigm in humans
Human participants are split into 2 groups:
1. Moderate training group
- 2 training sessions on 1 day
2. Extensive training group
- 12 training sessions across 3 days
3 experimental phases:
1. Instrumental training:
- Participants learn to perform two different responses for two different outcomes:
— R1 → Smarties, R2 → Fritos
— Each response is linked with a specific snack.
2. Outcome devaluation:
- Participants are satiated on one of the outcomes (e.g., allowed to eat as many Smarties or Fritos as they want), reducing its current value.
3. Extinction test:
- Participants are tested without receiving the outcomes to see if they still perform the actions (goal-directed behavior vs. habit).
RESULTS:
- Goal-directed (moderate training group):
— Clear reduction in response for the devalued outcome
- Habitual (extensive training group):
— No difference in responding between devalued and valuable outcomes → behavior has become habitual.
— Even though one outcome has lost its value, the action is still repeated out of habit (S-R link).
Describe the neural evidence for the 2 pathways
Functional Magnetic Resonance Imaging (fMRI) has been used to investigate the neural correlates of goal-directed behavior and habits in humans.
Goal-Directed behavior:
- vmPFC (ventromedial prefrontal cortex), overlapping with the orbitofrontal cortex
- Caudate
Habits:
- premotor cortex (PMC)
- Posterior putamen
Craving:
- Nucleus Accumbens
Describe INDIRECT evidence for drug habits
- Cue reactivity
Research on how the brain responds to drug-related cues
- Cue reactivity = the brain’s automatic response to seeing cues associated with drug use (e.g., images of syringes, powder, alcohol bottles)
- Studies (e.g., Vollstädt-Klein et al., 2015) show that when individuals with substance use disorders see drug-related images, the dorsal striatum — a brain region associated with habitual behavior — becomes activated.
- This is in contrast to ventral striatum, which is more involved in goal-directed, reward-related behavior.
🔁 Interpretation:
- Activation of the dorsal striatum may suggest the cue is triggering a habit — an automatic, learned response (e.g., craving or seeking).
- PET-study:
— In this study, cocaine-addicted individuals watched videos with cocaine-related cues.
— PET scans showed dopamine release in the striatal habit region, reinforcing the idea that drug cues can directly activate brain systems underlying habits.
- Self-report:
- Using the SRHI (Self-Report Habit Index): a questionnaire used to measure how habitual a behavior is, including drug use.
- Higher SRHI scores were associated with greater frequency of substance use.
- Suggests that the more habitual the behavior is (as self-reported), the more likely the person is to use the substance regularly.
Limitations of self-report:
- Self-report measures rely on introspective awareness.
- That’s a problem when studying habits because true habits are automatic — people may not be fully aware of how or why they perform them.
- So, self-report might underestimate or misrepresent the automaticity of drug-seeking behavior.
Explain and describe how the outcome-devaluation paradigm can answer: ‘Does drug seeking become habitual with repetition?’
Experimental design:
Procedure:
1. Instrumental Training
- Rats trained to lever press for alcohol.
- Two groups:
— Short training (2 weeks)
— Long training (8 weeks) = overtraining → expected to induce habit
2. Outcome Devaluation
- Alcohol made less desirable (through satiation).
3. Extinction Test
- No alcohol delivered to test whether rats adjust their behavior based on the devaluation.
Neural Manipulation:
- Half of each training group had brain cannulae implanted into:
— Dorsomedial striatum (DMS) — involved in goal-directed control
— Dorsolateral striatum (DLS) — involved in habitual control
- GABA agonist muscimol was used to temporarily inactivate these regions during testing.
SO, does drug seeking become habitual with repetition?
> YES.
- The Corbit et al. (2012) study shows that long-term training leads to habitual drug-seeking behavior that is insensitive to devaluation.
- This habit becomes dependent on the dorsolateral striatum.
- When the DLS is inactivated, behavior shifts back to being goal-directed, which indicates that the habit depends on the DLS.
Explain and describe how the outcome-devaluation paradigm can answer: ‘Is habit formation accelerated for drug rewards relative to natural rewards?’
Procedure for alcohol vs. food:
1. Instrumental Training:
Rats are trained to press two levers:
- One gives food pellets
- The other gives alcohol
2. Devaluation:
- One reward is devalued (made aversive) by pairing with LiCl (causes nausea).
3. Extinction Test:
- Both levers are available, but no rewards are given.
- Researchers observe if rats reduce pressing the lever associated with the devalued outcome.
🔍 Findings:
- Devaluation of food significantly reduced lever pressing → goal-directed behavior.
- Devaluation of alcohol had less effect on lever pressing → more habitual.
💡 Interpretation:
- Suggests alcohol-seeking becomes habitual faster than food-seeking behavior.
- Drug rewards might accelerate the shift from goal-directed to habitual action.
Cocaine vs. sucrose:
Procedure:
Same design, but the comparison is now between:
- Sucrose (natural reward)
- Cocaine (drug reward)
🔍 Findings:
- Rats were more sensitive to devaluation for sucrose (goal-directed).
- Less sensitive for cocaine → continued pressing the lever even when the drug was devalued → habitual behavior.
ANSWER TO QUESTION:
- Both studies show that drug rewards (alcohol, cocaine) lead to faster or stronger development of habits than natural rewards.
- Drug-seeking behavior becomes less sensitive to outcome devaluation — a hallmark of habitual control.
Explain and describe how the outcome-devaluation paradigm can answer: ‘Does substance abuse lead to a general tendency to rely on rigid habits?’
HUMAN RESEARCH:
- Training description: Participants learn the correct response (right or left) for 6 different discriminative cues (pictures of fruit) to earn 6 different rewarding outcomes (pictures of fruit that were worth points).
- Test description: Each block of the slips-of-action test is preceded by an instruction screen showing all 6 fruit rewards. Two are devalued (they lead to a deduction of points), which is indicated by a cross through the fruits. The other 4 are still valuable (worth points). Then the cues appear on the screen in quick succession. Subjects should press when the cue was previously associated with a “still-valuable” fruit, but suppress the learned response if the cue was associated with a “devalued” fruit. The test takes place in nominal extinction (meaning that participants are not given feedback/points during the test, but they know that they are still earning/losing points, and that they will be shown their total score at the end). Strong S-R habits/weak goal-directed control should lead to “slips of action”: commission errors on devalued trials.
- Slips-of-action task:
– Tests whether participants can suppress a previously learned response when the associated outcome becomes devalued.
– A “slip of action” indicates habitual behavior.
- Participants learned fruit–response–reward associations.
- Later, some fruits were devalued.
- Key test: Could participants suppress responding to cues linked to devalued fruits?
- Results (Ersche et al., 2016; Sjoerds et al., 2013):
- Cocaine users showed no difference in response between devalued and valued outcomes.
- Controls reduced responses for devalued items.
⇒ Substance users show more habitual behavior.
- Interpretation:
- Substance abuse is associated with a shift toward habit control, even for neutral stimuli like fruits or animals.
- This suggests a general habit bias, not limited to drugs.
- This ‘habit tendency’ could be due to strong habit formation, weak goal-directed control or a combination of both.
ANIMAL RESEARCH:
1. Corbit, Nie & Janak (2012)
- Rats trained to press for sugar. One group also received noncontingent alcohol.
- Sugar + alcohol group pressed equally for devalued and valued rewards.
- Indicates habitual responding (insensitive to outcome devaluation).
- Correct graph: ✅ right graph (no difference between valued and devalued).
2. Nelson et al. (2006)
- Amphetamine exposure also led to increased habit formation, even for food rewards.
- Devaluation didn’t reduce responding in amphetamine-exposed animals.
FINAL ANSWER:
Yes, substance abuse leads to a general tendency to rely on rigid habits.
This is evidenced by a greater reliance on stimulus-response behavior, reduced goal-directed control, and insensitivity to outcome devaluation, both in drug-related and non-drug-related contexts.
Explain and describe how the outcome-devaluation paradigm can answer: ‘Are drug habits compulsive?’
Subjects: Rats trained to self-administer cocaine via a lever.
Setup:
- Baseline: Lever press = cocaine reward.
- Punishment phase: Occasionally, pressing the lever also delivered a mild electric shock, though cocaine was still available.
🔍 Key Findings:
🟠 Moderate Training:
- Rats reduced cocaine-seeking when the shock was introduced.
⇒ Drug-seeking is sensitive to negative consequences = not yet compulsive.
🔴 Extensive Training:
- A subset of rats (approx. 20%) continued to press the lever for cocaine despite the shocks.
⇒ Their drug-seeking behavior was insensitive to punishment, a hallmark of compulsive behavior.
🧠 Important Notes:
- This compulsive drug-seeking only emerged after long exposure.
- It mirrors human addiction, where only some individuals become compulsive users.
- These compulsive rats were also more likely to relapse, making this model predictive of addiction severity.
FINAL ANSWER:
Yes, drug habits can become compulsive — but only after extensive drug exposure, and only in a subset (~20%) of individuals. These compulsive habits are insensitive to negative consequences, like punishment, and are linked to relapse risk, reflecting a critical component of human addiction.
Name some criticisms of research into drug habits
CRITICAL NOTES IN ANIMAL RESEARCH:
- Context matters: Most addiction studies in animals are conducted in impoverished environments where drug rewards are the only available option.
- Findings by Ahmed et al.: Rats preferred a sweet solution over cocaine when given a choice—even after extensive cocaine use. This questions the notion that drug use becomes irresistible.
- Social factors matter: A lack of social play increases later drug motivation; conversely, social interaction can protect against addiction-like behaviors.
🧠 Implication: Addiction-like behavior in rats may partly reflect the restricted, non-enriched conditions rather than inevitable habit formation.
- Animal models lack human complexities such as:
— language
— abstract reasoning
— long-term goal pursuit
🧠 Implication: Human efforts to quit drugs often rely on cognitive strategies and future planning, which are hard to model in rats.
CHALLENGES IN HUMAN RESEARCH:
- Self-report issues: People may mislabel their behavior—some call it “habit,” others describe craving.
- Cognitive dissonance: When actions don’t match beliefs (e.g., using drugs despite intentions), people may rationalize behavior as “craving” to reduce discomfort.
🧠 Implication: Human self-reports are tricky. People may post-rationalize compulsive use, making it hard to distinguish habit from craving.