Lecture 27 - Reinforcement Learning and Motor Sequences Flashcards
What is classical conditioning? What is the famous example?
a learned (reinforced) reflex or response that is evoked by a stimulus
Pavlov’s Dog
What is reinforcement?
a consequence that increases the likelihood of a behaviour
What is punishment?
a consequence that decreases the likelihood of a behaviour
Give an example of positive reinforcement.
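giving a sweet treat (adding something pleasant to increase a behaviour)
Give an example of negative reinforcement.
taking away homework (removing something unpleasant to increase a behaviour)
Give an example of positive punishment.
writing lines (adding something unpleasant to decrease a behaviour)
Give an example of negative punishment.
taking away recess (removing something pleasant to decrease a behaviour)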
What does the process of reinforcement learning involve?
Learning to link reward with specific actions (and their outcomes) so they become repeated
What is binary reward feedback?
Action is rewarded or not
What is scalar reward feedback?
the reward is a graded quantity, scaled relative to the utility of the action/reward outcome
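A minimal sketch of the difference between binary and scalar reward feedback. The aiming task, the function names, and the tolerance and maximum-error values are all hypothetical, chosen only for illustration:

```python
def binary_reward(error_cm, tolerance_cm=2.0):
    """Binary feedback: the action is either rewarded (1) or not (0)."""
    return 1 if abs(error_cm) <= tolerance_cm else 0


def scalar_reward(error_cm, max_error_cm=10.0):
    """Scalar feedback: reward is graded by how good the outcome was.

    A perfect action earns 1.0, and reward falls off linearly to 0.0
    as the error approaches max_error_cm (an assumed scaling).
    """
    clipped = min(abs(error_cm), max_error_cm)
    return 1.0 - clipped / max_error_cm


# The same 5 cm miss gives no reward under binary feedback,
# but partial reward under scalar feedback.
print(binary_reward(5.0), scalar_reward(5.0))
```

With binary feedback the learner only knows whether the action worked; with scalar feedback it also learns how close it came.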
What is the goal of reinforcement learning?
to maximize reward and minimize loss
What is cumulative reward?
the total reward accumulated over time; it might be better to sacrifice immediate reward for a greater long-term reward
What are some examples of cumulative rewards?
- chess
- investments
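A minimal sketch of why sacrificing immediate reward can pay off, comparing total reward over a sequence of steps. The reward sequences are made up, and the optional `discount` factor (weighting later rewards less) is a standard reinforcement learning convention assumed here rather than something stated in the lecture:

```python
def cumulative_reward(rewards, discount=1.0):
    """Sum rewards over time; discount < 1 weights later rewards less."""
    total = 0.0
    for step, reward in enumerate(rewards):
        total += (discount ** step) * reward
    return total


# Hypothetical reward sequences over four time steps.
grab_now = [5, 0, 0, 0]    # take a small reward immediately
wait_for_it = [0, 0, 0, 20]  # give up early reward for a larger later one

print(cumulative_reward(grab_now))     # total for the immediate strategy
print(cumulative_reward(wait_for_it))  # total for the patient strategy
```

Without discounting, the patient strategy earns more in total (20 versus 5). With heavy discounting, for example `discount=0.5`, the delayed reward is worth only 2.5 and the immediate strategy comes out ahead, so how much the future is valued changes which choice is best.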
What happens to actions that are associated with reward?
they become strengthened/repeated
processes of reinforcement learning:
What is exploration?
the trial-and-error process of acquiring more information about the environment by searching possibilities
searching many action possibilities to determine which action tends to maximize reward
processes of reinforcement learning:
What is exploitation?
capitalize on known information to maximize reward
actions associated with past history of reward tend to be repeated to maximize future reward
What is the tradeoff between action exploration and exploitation?
as learning progresses, emphasis shifts from exploration to exploitation to maximize reward
What is a hockey example of the exploration-exploitation trade-off?
exploration - you try different shots and find out the goaltender is weak low
exploitation - shoot low to score goals (maximize reward)
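The hockey example can be sketched as a simple simulation. This uses an epsilon-greedy rule with a decaying exploration rate, which is one common way to model the shift from exploration to exploitation and not necessarily the model used in the lecture. The scoring probabilities, the decay schedule, and all names are hypothetical:

```python
import random


def simulate_shooter(score_prob, trials=2000, seed=0):
    """Learn which shot scores most often from binary reward (goal or no goal)."""
    rng = random.Random(seed)
    actions = list(score_prob)
    value = {a: 0.0 for a in actions}  # estimated chance of scoring per shot
    count = {a: 0 for a in actions}    # how often each shot was taken

    for t in range(trials):
        # Exploration rate starts at 1.0 and decays toward a floor of 0.05.
        epsilon = max(0.05, 1.0 / (1.0 + t / 50.0))
        if rng.random() < epsilon:
            action = rng.choice(actions)                      # explore
        else:
            action = max(actions, key=lambda a: value[a])     # exploit

        reward = 1 if rng.random() < score_prob[action] else 0
        count[action] += 1
        # Running average: rewarded actions gain estimated value.
        value[action] += (reward - value[action]) / count[action]

    return value, count


# Assumed probabilities: the goaltender is weak low.
value, count = simulate_shooter({"high": 0.1, "low": 0.4})
best = max(value, key=value.get)
print(best, count)
```

Early on the shooter tries both shots about equally; as the estimates settle, most shots go low because that action has the stronger history of reward.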
What are the basal ganglia?
a collection of subcortical structures (nuclei) in the brain
Where is dopamine produced?
in the substantia nigra (pars compacta), which projects to the striatum
What is dopamine?
a neurotransmitter that is part of the brain's intrinsic reward system
basal ganglia:
What are the 2 parts of the striatum?
- caudate nucleus
- putamen
What is dopamine input into the striatum critical for?
learning from reward and strengthening the representation of specific actions
Learning the piano example:
What happens to serial actions with learning?
sequences can be produced faster and with fewer errors; key presses become smoother and linked together