Markov Decision Process Flashcards by Gabriel Miki

When can we classify a process as Markovian?

When the future is conditionally independent of the past given the present state: P(S_{t+1} | S_t) = P(S_{t+1} | S_1, ..., S_t)

What are the consequences and possible interpretations of choosing a discount factor (gamma) smaller than 1?

Choosing gamma smaller than one keeps the discounted return finite in infinite-horizon tasks (given bounded rewards), models a preference for sooner rewards over later ones, and accounts for uncertainty about the future
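The convergence claim can be checked numerically. The sketch below is only an illustration: the function name `discounted_return` and the constant reward of 1.0 are assumptions, not part of the original card. With a constant reward r_max, the discounted return approaches the geometric-series limit r_max / (1 - gamma) as the horizon grows.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite reward sequence."""
    total = 0.0
    # Accumulate backwards using G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        total = r + gamma * total
    return total


gamma = 0.9   # illustrative discount factor (assumption)
r_max = 1.0   # illustrative constant reward (assumption)
bound = r_max / (1.0 - gamma)  # limit of the geometric series

for horizon in (10, 100, 1000):
    g = discounted_return([r_max] * horizon, gamma)
    print(f"horizon={horizon:5d}  return={g:.6f}  bound={bound:.6f}")
```

With gamma equal to 1 the same sum simply grows with the horizon, which is why an undiscounted infinite-horizon return may fail to converge.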

How is the value equation written when using a policy pi?

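It is the sum over all actions, each weighted by the probability that the policy picks it in that state, of the immediate reward for that action plus the discounted expected value of the possible next states: v_pi(s) = sum_a pi(a|s) [ R(s,a) + gamma * sum_s' P(s'|s,a) v_pi(s') ]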
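One way to see this equation at work is iterative policy evaluation, which applies the right-hand side repeatedly until the values stop changing. The sketch below is a minimal illustration, not a reference implementation: the function name `evaluate_policy`, the nested-list layout of `P`, `R`, and `pi`, and the two-state toy MDP are all assumptions made for the example.

```python
def evaluate_policy(P, R, pi, gamma, tol=1e-10):
    """Iterate the Bellman expectation equation to a fixed point.

    P[s][a][s2] : probability of moving from s to s2 under action a
    R[s][a]     : expected immediate reward for action a in state s
    pi[s][a]    : probability that the policy picks action a in state s
    """
    n_states = len(P)
    v = [0.0] * n_states
    while True:
        v_new = []
        for s in range(n_states):
            total = 0.0
            for a, prob_a in enumerate(pi[s]):
                # Expected value of the next state under (s, a).
                expected_next = sum(p * v[s2] for s2, p in enumerate(P[s][a]))
                total += prob_a * (R[s][a] + gamma * expected_next)
            v_new.append(total)
        if max(abs(x - y) for x, y in zip(v_new, v)) < tol:
            return v_new
        v = v_new


# Hypothetical toy MDP: state 1 is absorbing with zero reward.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [0.0, 1.0]],  # state 1: both actions stay in state 1
]
R = [[0.0, 1.0], [0.0, 0.0]]   # only action 1 in state 0 pays a reward
pi = [[0.5, 0.5], [0.5, 0.5]]  # uniform random policy

print(evaluate_policy(P, R, pi, gamma=0.9))
```

For this toy problem the equation can also be solved by hand: v(1) = 0, and v(0) = 0.5 * 0.9 * v(0) + 0.5 * 1, so v(0) = 0.5 / 0.55.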

(3 cards)