RL: Chapter 1: Introduction Flashcards

1
Q

Reinforcement learning

A

Learning what to do, that is, how to map situations to actions, so as to maximize a numerical reward signal.

The learner is not told which actions to take, but instead must discover which actions yield the most reward by trying them.

2
Q

Main challenge that arises in reinforcement learning but not in other kinds of learning

A

Exploration vs exploitation.

The agent has to exploit what it has already experienced in order to obtain reward.

But it also has to explore in order to make better action selections in the future.
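One common way to balance the two is epsilon-greedy action selection. The sketch below is illustrative only: the function name `epsilon_greedy`, the `estimates` list of per-action reward estimates, and the `epsilon` parameter are assumptions for this example, not part of the card.

```python
import random

def epsilon_greedy(estimates, epsilon, rng=random):
    """Return an action index, given current reward estimates per action.

    With probability epsilon the agent explores; otherwise it exploits.
    (Hypothetical helper, written only to illustrate the trade-off.)
    """
    if rng.random() < epsilon:
        # Explore: try any action uniformly at random.
        return rng.randrange(len(estimates))
    # Exploit: pick the action with the highest current estimate.
    return max(range(len(estimates)), key=lambda a: estimates[a])

# epsilon = 0 always exploits; epsilon = 1 always explores.
best = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

A small `epsilon` means the agent mostly exploits what it knows while still occasionally trying other actions.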

3
Q

6 Main subelements of a reinforcement learning system

A
  • Agent
  • Environment
  • Policy
  • Reward signal
  • Value function
  • A model of the environment
4
Q

6 Main subelements of a reinforcement learning system

Policy

A

Defines the learning agent’s way of behaving at a given time.

Roughly, a policy is a mapping from perceived states of the environment to actions to be taken when in those states.
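In the simplest case, that mapping can be a lookup table. The sketch below assumes a tiny made-up problem: the state names, action names, and the `act` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical states and actions, chosen only for illustration.
policy = {
    "low_battery": "recharge",
    "high_battery": "search",
}

def act(policy, state):
    """Look up the action the policy prescribes for the perceived state."""
    return policy[state]

action = act(policy, "low_battery")
```

In larger problems the table may be replaced by a function or a search process, but the role is the same: state in, action out.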

5
Q

6 Main subelements of a reinforcement learning system

Reward signal

A

Defines the goal of a reinforcement learning problem.

On each time step, the environment sends to the reinforcement learning agent a single number called the reward. The agent’s sole objective is to maximize the total reward it receives over the long run.
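As a rough illustration of "total reward over the long run", the sketch below compares two made-up reward sequences. The numbers and the `total_reward` helper are assumptions for this example; the total is a plain undiscounted sum.

```python
def total_reward(rewards):
    """Sum the per-step rewards received over an episode (no discounting)."""
    return sum(rewards)

# Hypothetical reward sequences from two ways of behaving.
steady = [1, 1, 1]     # small reward on every step
patient = [0, 0, 10]   # nothing at first, a large reward later

patient_is_better = total_reward(patient) > total_reward(steady)
```

The second sequence looks worse on the first two steps yet gives the larger total, which is what the agent is trying to maximize.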

6
Q

6 Main subelements of a reinforcement learning system

Value function

A

Whereas the reward signal indicates what is good in an immediate sense, a value function specifies what is good in the long run.

The value of a state is the total amount of reward an agent can expect to accumulate over the future, starting from that state.
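One simple way to approximate such an expectation is to average the total reward observed over several runs that start from the same state. The sketch below assumes made-up reward sequences and a hypothetical `estimate_value` helper; it is not the method of any particular algorithm.

```python
def estimate_value(sampled_reward_sequences):
    """Estimate a state's value as the average total reward over sampled runs.

    Each inner list holds the rewards received on one run that started
    from the state of interest (hypothetical data, no discounting).
    """
    returns = [sum(seq) for seq in sampled_reward_sequences]
    return sum(returns) / len(returns)

# Three hypothetical runs from the same starting state.
runs = [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
value = estimate_value(runs)
```

Every run here starts with an immediate reward of 0, yet the state still has positive estimated value because of the reward that tends to follow.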
