CS7642_Readings Flashcards

1
Q

Any finite-length, discrete-time MDP can be converted to a TTD-MDP? (True/False)

A

True. This is from the TTD-MDP paper assigned for week 10.

2
Q

It is always possible to find a stochastic policy that exactly solves a TTD-MDP? (True/False)

A

False. An exact solution is not always achievable; see Bhat et al. (2007).

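A minimal sketch of why an exact solution can fail. At a node of the trajectory tree, a stochastic policy can only achieve convex combinations of the per-action transition distributions, so a target distribution outside that set is infeasible (the transition numbers and the target here are illustrative, not from the paper):

```python
# Why a TTD-MDP target distribution may be unachievable: at a trajectory-tree
# node, the achievable distribution over child trajectories is a convex
# combination of the per-action transition rows.
import numpy as np

# P(child | node, action): two actions, two child trajectories (made-up numbers)
P = np.array([[0.9, 0.1],   # action a1
              [0.6, 0.4]])  # action a2

target = np.array([0.2, 0.8])  # desired P(child | node)

# A policy pi = (p, 1-p) yields p*P[0] + (1-p)*P[1], so the reachable
# probability of the first child spans [0.6, 0.9] -- 0.2 is infeasible.
best_p, best_err = None, float("inf")
for p in np.linspace(0.0, 1.0, 1001):
    achieved = p * P[0] + (1 - p) * P[1]
    err = np.abs(achieved - target).sum()  # L1 distance to the target
    if err < best_err:
        best_err, best_p = err, p

print(best_p, best_err)  # best policy puts all weight on a2; a gap remains
```

Bhat et al. (2007) address exactly this situation by computing the policy that minimizes the divergence from the target distribution when exact matching is impossible.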
3
Q

In policy shaping, the most important parameter is gamma? (True/False)

A

False. The most important parameter for learning in policy shaping is C, the probability that a given teacher's evaluation of an action choice is correct.

4
Q

An oracle is always a better teacher than a human in policy shaping? (True/False)

A

False. See Cederborg et al. (2011); an oracle is not always the better teacher.

5
Q

Like MDPs, every Markov game has a non-empty set of optimal policies, at least one of which is stationary? (True/False)

A

True. From Littman (1994).

6
Q

There is always at least one optimal stationary deterministic policy for Markov games? (True/False)

A

False. Only regular MDPs make this guarantee; in a Markov game the optimal stationary policy may need to be stochastic (e.g., matching pennies).

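The classic illustration is matching pennies, the example Littman (1994) uses: every deterministic policy can be exploited, while the uniform mixed policy guarantees the game's value. A minimal sketch (payoffs and the worst-case helper are the standard textbook setup, not code from the paper):

```python
# Matching pennies as a one-shot zero-sum Markov game: no deterministic
# stationary policy is optimal, but the mixed policy (0.5, 0.5) is.

# Payoff to the row player: +1 if the two choices match, -1 otherwise.
R = [[1, -1],
     [-1, 1]]

def worst_case(p: float) -> float:
    """Row player's guaranteed value when playing Heads with probability p."""
    # The opponent picks the column minimizing the row player's expected payoff.
    return min(p * R[0][c] + (1 - p) * R[1][c] for c in (0, 1))

# Deterministic policies (p = 1 or p = 0) guarantee only -1;
# the uniform mixture guarantees the minimax value 0.
print(worst_case(1.0), worst_case(0.0), worst_case(0.5))  # -1.0 -1.0 0.0
```

This is why minimax-Q operates over mixed strategies: the security level of a stochastic policy can strictly dominate that of every deterministic one.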