Review Session #5 Flashcards

Question 1

Q

True or False: Options over an MDP form another MDP.

Answer

A

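False. Options bundle sequences of primitive actions that can run for a variable number of time steps, so an MDP together with a set of options forms a semi-Markov decision process (SMDP), not another MDP.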

Question 2

Q

True or False: Nash equilibria can only be pure, not mixed.

Answer

A

False. Pure strategies are just a special case of mixed strategies in which one action is played with probability 1, so a Nash equilibrium can be either pure or mixed.

Question 3

Q

True or False: An optimal pure strategy does not necessarily exist for a two-player, zero-sum finite deterministic game with perfect information.

Answer

A

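False. With perfect information and no chance moves, backward induction (minimax) over the finite game tree always yields an optimal pure strategy. Rock Paper Scissors is not a counterexample: its moves are simultaneous, so it is not a perfect-information game, and that is why it has no optimal pure strategy.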
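
To make the Rock Paper Scissors point concrete, here is a minimal sketch that compares what each pure strategy guarantees against a best-responding opponent with what the uniform mixed strategy guarantees. The payoff values (win = 1, tie = 0, loss = -1) and the helper name worst_case_value are assumptions chosen for illustration, not part of the flashcards.

```python
# Row player's payoffs in Rock Paper Scissors (win = 1, tie = 0, loss = -1).
# Rows are our move, columns are the opponent's move; order: rock, paper, scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def worst_case_value(mixed_strategy):
    """Expected payoff of a strategy against the opponent's best pure reply."""
    column_values = [
        sum(p * PAYOFF[row][col] for row, p in enumerate(mixed_strategy))
        for col in range(3)
    ]
    return min(column_values)


# A pure strategy is a mixed strategy that puts probability 1 on one move.
pure_strategies = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
pure_guarantees = [worst_case_value(s) for s in pure_strategies]
uniform_guarantee = worst_case_value([1 / 3, 1 / 3, 1 / 3])

print(pure_guarantees)    # every pure strategy can be exploited
print(uniform_guarantee)  # the uniform mix cannot be exploited
```

Every pure strategy can be forced to a loss, while the uniform mix guarantees an expected payoff of zero.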

Question 4

Q

True or False: The “folk theorem” states that the notion of threats can stabilize payoff profiles in one-shot games.

Answer

A

False. The “folk theorem” applies to repeated games, typically infinitely repeated or with an unknown number of rounds. A one-shot game has no future round in which a threat could be carried out, so threats cannot stabilize payoff profiles there.

Question 5

Q

True or False: If following the repeated game strategy “Pavlov”, we will cooperate until our opponent defects. Once an opponent defects we defect forever.

Answer

A

False. “Pavlov” cooperates when both players chose the same action in the previous round and defects when they disagreed, which makes it closer in spirit to tit-for-tat than to an unforgiving strategy. The strategy described is actually “Grim Trigger”.
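
The strategies named in this card can be compared side by side with a minimal sketch. It assumes the usual textbook definitions: tit-for-tat copies the opponent's last move, Pavlov cooperates only when both players matched in the previous round, and Grim Trigger defects forever after the first defection. The function names and the sample opponent sequence are illustrative assumptions.

```python
C, D = "C", "D"  # cooperate, defect


def tit_for_tat(my_history, their_history):
    # Start by cooperating, then copy the opponent's previous move.
    return C if not their_history else their_history[-1]


def pavlov(my_history, their_history):
    # Start by cooperating, then cooperate only if both moves matched last round.
    if not my_history:
        return C
    return C if my_history[-1] == their_history[-1] else D


def grim_trigger(my_history, their_history):
    # Cooperate until the opponent defects once, then defect forever.
    return D if D in their_history else C


def play(strategy, opponent_moves):
    """Return the moves a strategy makes against a fixed opponent sequence."""
    my_history, their_history = [], []
    for their_move in opponent_moves:
        my_history.append(strategy(my_history, their_history))
        their_history.append(their_move)
    return "".join(my_history)


opponent = [C, D, D, C, C]
for strategy in (tit_for_tat, pavlov, grim_trigger):
    print(strategy.__name__, play(strategy, opponent))
```

Against this opponent only Grim Trigger keeps defecting to the end; the other two return to cooperation.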

Question 6

Q

True or False: Correlated equilibria rely on coordination, like side payments.

Answer

A

False: Cooperative-competitive (CoCo) values are what rely on coordination through side payments. Correlated equilibria rely on rationality constraints and shared information (such as a common signal, or shared Q-tables in correlated Q-learning), not on side payments.

Question 7

Q

True or False: “Subgame perfect” means that every stage of a multistage game has a Nash equilibrium.

Answer

A

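False: Subgame perfect means that the strategy profile induces a Nash equilibrium in every subgame, so each player is always best responding and any threats are credible.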

Question 8

Q

True or False: Inverse RL means that we invert the reward function before putting an agent in an environment.

Answer

A

False: Inverse RL infers the reward function from the observed behavior of an agent in the environment.

Question 9

Q

True or False: DEC-POMDPs include communication, this communication allows agents to plan.

Answer

A

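True: DEC-POMDPs (decentralized POMDPs) let agents that share a reward coordinate toward a common goal instead of one agent dictating all the actions. The basic model has no dedicated communication channel, so communication is represented through the agents’ actions and observations.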

Question 10

Q

True or False: Policy shaping requires a completely correct oracle to give the RL agent advice.

Answer

A

False: Policy shaping works with imperfect advice. Each piece of feedback is weighted by a confidence value that reflects how likely it is to be correct.
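
As a rough illustration of how a confidence value can enter the calculation, the sketch below assumes the feedback model commonly associated with policy shaping: with feedback consistency C and Δ equal to the number of positive labels minus the number of negative labels for an action, the probability that the action is optimal is C^Δ / (C^Δ + (1 - C)^Δ). The function name and the sample numbers are hypothetical, and the formula is stated as an assumption since the flashcards do not give it.

```python
def prob_action_optimal(confidence, positive, negative):
    """Probability an action is optimal given noisy human feedback.

    confidence: assumed probability that a single feedback label is correct.
    positive / negative: counts of "good" and "bad" labels for the action.
    """
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    delta = positive - negative
    good = confidence ** delta
    bad = (1 - confidence) ** delta
    return good / (good + bad)


# A fairly reliable advisor (80% correct) gave 3 positive and 1 negative label.
print(prob_action_optimal(0.8, 3, 1))
# An advisor no better than chance leaves the estimate at 0.5.
print(prob_action_optimal(0.5, 5, 0))
```

The oracle never has to be perfect: lower confidence simply pulls the estimate toward 0.5, so its advice counts for less.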

(10 cards)