Markov Decision Process Flashcards

(3 cards)

1
Q

When can we classify a process as Markovian?

A

When the future is independent of the past given the present
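
Written out in symbols, the property can be stated as below. The notation is an assumption for illustration and does not come from the card: S_t denotes the state at time step t.

```latex
\[
  \Pr\left(S_{t+1} \mid S_t\right)
  = \Pr\left(S_{t+1} \mid S_1, S_2, \ldots, S_t\right)
\]
```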

2
Q

What are the consequences and possible interpretations of choosing a discount factor (gamma) smaller than 1?

A

Choosing gamma smaller than 1 keeps the return finite in infinite-horizon tasks (provided rewards are bounded), models a preference for immediate over delayed reward, and reflects uncertainty about the future
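
A minimal sketch of the convergence point, assuming a constant reward of 1 at every step (the reward stream and the helper name `discounted_return` are illustrative, not part of the card). With gamma below 1 the return approaches 1 / (1 - gamma) as the horizon grows; with gamma equal to 1 it grows without bound.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    # Work backwards so each step is: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    for horizon in (10, 100, 1000):
        rewards = [1.0] * horizon  # hypothetical constant reward stream
        print(
            horizon,
            discounted_return(rewards, gamma=0.9),  # approaches 1 / (1 - 0.9)
            discounted_return(rewards, gamma=1.0),  # equals the horizon itself
        )
```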

3
Q

How is the value equation written when using a policy pi?

A

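It is the Bellman expectation equation: the value of a state is the sum, over all possible actions, of the probability that pi selects that action times the expected immediate reward for taking it in that state plus the discounted expected value of the successor states, i.e. v_pi(s) = sum_a pi(a|s) [ R(s,a) + gamma * sum_s' P(s'|s,a) * v_pi(s') ]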
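
The equation can be turned into a small iterative policy evaluation routine. This is a sketch under stated assumptions: the function name `evaluate_policy`, the dictionary layout, and the two-state example MDP are all hypothetical and chosen only for illustration.

```python
def evaluate_policy(policy, transitions, rewards, gamma, tol=1e-10):
    """Repeatedly apply the Bellman expectation backup until values settle.

    policy[s][a]          : probability that the policy picks action a in state s
    transitions[s][a][s2] : probability of landing in s2 after taking a in s
    rewards[s][a]         : expected immediate reward for taking a in s
    """
    values = {s: 0.0 for s in policy}
    while True:
        delta = 0.0
        for s in policy:
            new_value = 0.0
            for a, prob_a in policy[s].items():
                # Discounted expected value of the successor states.
                future = sum(p * values[s2] for s2, p in transitions[s][a].items())
                new_value += prob_a * (rewards[s][a] + gamma * future)
            delta = max(delta, abs(new_value - values[s]))
            values[s] = new_value
        if delta < tol:
            return values


# Hypothetical two-state MDP: from A the agent may stay or go to B; B is absorbing.
policy = {"A": {"stay": 0.5, "go": 0.5}, "B": {"stay": 1.0}}
transitions = {
    "A": {"stay": {"A": 1.0}, "go": {"B": 1.0}},
    "B": {"stay": {"B": 1.0}},
}
rewards = {"A": {"stay": 0.0, "go": 1.0}, "B": {"stay": 2.0}}

if __name__ == "__main__":
    print(evaluate_policy(policy, transitions, rewards, gamma=0.5))
```

For this example the fixed point can be checked by hand: v(B) = 2 / (1 - 0.5) = 4, and v(A) = 0.25 * v(A) + 1.5, so v(A) = 2.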
