Review Session #5 Flashcards Preview

CS 7642 - OMSCS (Final Exam) > Review Session #5 > Flashcards

Flashcards in Review Session #5 Deck (10)

True or False: Options over an MDP form another MDP.

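True. Options are temporally extended actions, each defined by an initiation set, a policy, and a termination condition. An MDP together with a set of options forms a semi-MDP (SMDP), which can be treated as an MDP over options once the variable durations are folded into the discounted rewards and the transition model.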


True or False: Nash equilibria can only be pure, not mixed.

False. Nash equilibria can be mixed. Pure strategies are just the special case of mixed strategies in which a single action is played with probability 1.
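A minimal sketch of why mixed equilibria matter, using matching pennies as an assumed example (it is not one of the cards). The helper names are hypothetical. The check enumerates pure profiles, finds none that is stable, and then shows that a 50/50 opponent leaves the row player indifferent, which is the condition a mixed equilibrium needs.

```python
from itertools import product

# Matching pennies (assumed example): row player's payoffs.
# The game is zero-sum, so the column player receives the negation.
ROW_PAYOFF = [[1, -1],
              [-1, 1]]

def pure_nash_equilibria(row_payoff):
    """Return all pure profiles (i, j) where neither player gains by deviating."""
    n_rows, n_cols = len(row_payoff), len(row_payoff[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        # Row player wants the largest entry in column j.
        row_best = all(row_payoff[i][j] >= row_payoff[k][j] for k in range(n_rows))
        # Column player wants the smallest entry in row i.
        col_best = all(row_payoff[i][j] <= row_payoff[i][k] for k in range(n_cols))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria

def row_action_values(row_payoff, col_mix):
    """Expected payoff of each pure row action against a mixed column strategy."""
    return [sum(p * v for p, v in zip(col_mix, row)) for row in row_payoff]

print(pure_nash_equilibria(ROW_PAYOFF))           # no pure equilibrium
print(row_action_values(ROW_PAYOFF, [0.5, 0.5]))  # both actions worth the same
```

Because the game is symmetric, the same indifference holds for the column player, so both players mixing 50/50 is an equilibrium even though no pure profile is.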


True or False: An optimal pure strategy does not necessarily exist for a two-player, zero-sum finite deterministic game with perfect information.

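False. In a two-player, zero-sum, finite, deterministic game of perfect information, minimax equals maximin and an optimal pure strategy always exists for each player. Games like Rock Paper Scissors lack a pure optimal strategy only because the moves are simultaneous, so they are not perfect-information games.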
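Rock Paper Scissors makes the pure-strategy gap concrete. The sketch below assumes the usual win = 1, loss = -1, tie = 0 payoffs and uses hypothetical helper names. It compares the best payoff the row player can guarantee with a pure strategy against the lowest payoff the column player can hold the row player to with a pure strategy. When the two differ, no pure strategy pair is optimal.

```python
# Row player's payoffs, rows and columns ordered Rock, Paper, Scissors.
# Assumed scoring: win = 1, loss = -1, tie = 0.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

def pure_maximin(payoff):
    """Best worst-case payoff the row player can secure with a pure strategy."""
    return max(min(row) for row in payoff)

def pure_minimax(payoff):
    """Lowest best-case payoff the column player can force with a pure strategy."""
    columns = zip(*payoff)
    return min(max(col) for col in columns)

print(pure_maximin(RPS), pure_minimax(RPS))  # the two values differ
```

The gap between the two values is what mixing closes: with uniform random play, both players secure an expected payoff of 0.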


True or False: The "folk theorem" states that the notion of threats can stabilize payoff profiles in one-shot games.

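False. The "folk theorem" applies to repeated games with an infinite (or indefinite) horizon, where the threat of future punishment can stabilize a payoff profile. A one-shot game has no future rounds in which to carry out a threat, so this stabilization cannot occur.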
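A rough numeric sketch of why the threat needs a future. It assumes a repeated prisoner's dilemma with illustrative payoffs (temptation 5, mutual cooperation 3, mutual defection 1), an opponent who punishes any defection forever, and a discount factor gamma standing in for the probability that the game continues. None of these numbers come from the cards.

```python
def cooperate_forever(reward, gamma):
    """Discounted value of mutual cooperation in every round."""
    return reward / (1.0 - gamma)

def defect_once_then_punished(temptation, punishment, gamma):
    """Value of defecting now, then facing mutual defection in all later rounds."""
    return temptation + gamma * punishment / (1.0 - gamma)

def threat_sustains_cooperation(gamma, temptation=5.0, reward=3.0, punishment=1.0):
    """True when the threat of punishment makes cooperating at least as good as defecting."""
    return cooperate_forever(reward, gamma) >= defect_once_then_punished(
        temptation, punishment, gamma
    )

for gamma in (0.0, 0.4, 0.9):
    print(gamma, threat_sustains_cooperation(gamma))
```

With gamma = 0 the game is effectively one-shot and defecting wins. With these assumed payoffs, cooperation only becomes stable once gamma reaches 0.5.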


True or False: If following the repeated game strategy "Pavlov", we will cooperate until our opponent defects. Once an opponent defects we defect forever.

False. The stated strategy is "Grim Trigger". "Pavlov" starts by cooperating, then cooperates whenever both players made the same move in the previous round and defects whenever their moves differed. Neither is tit-for-tat, which simply copies the opponent's previous move.
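The three repeated-game strategies named in this card are easy to confuse, so here is a small sketch of each as a function of the two move histories. The function names, the "C"/"D" encoding, and the opponent's move sequence are illustrative assumptions.

```python
C, D = "C", "D"

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return C if not their_history else their_history[-1]

def grim_trigger(my_history, their_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return D if D in their_history else C

def pavlov(my_history, their_history):
    """Cooperate first, then cooperate only if both players agreed last round."""
    if not my_history:
        return C
    return C if my_history[-1] == their_history[-1] else D

def play(strategy, opponent_moves):
    """Run a strategy against a fixed sequence of opponent moves."""
    mine = []
    for t in range(len(opponent_moves)):
        mine.append(strategy(mine, opponent_moves[:t]))
    return "".join(mine)

opponent = "CDDCC"
for strategy in (tit_for_tat, grim_trigger, pavlov):
    print(strategy.__name__, play(strategy, opponent))
```

Against this opponent the three strategies produce three different move sequences. Grim Trigger never forgives the first defection, while Pavlov returns to cooperation after a round of mutual defection.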


True or False: Correlated equilibria rely on coordination, like side payments.

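False: Cooperative-competitive (CoCo) values are what rely on side payments. Correlated equilibria rely only on a shared signal, such as a common source of randomness or a trusted recommender, together with rationality constraints ensuring that no player gains by deviating from the recommended action. In Correlated-Q learning this is paired with shared Q-tables. No side payments are involved.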


True or False: "Subgame perfect" means that every stage of a multistage game has a Nash equilibrium.

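False: Subgame perfect means that the strategy profile is a Nash equilibrium in every subgame, including subgames that are never reached along the equilibrium path. It is a property of the players' strategies, not a statement that each stage game has an equilibrium.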


True or False: Inverse RL means that we invert the reward function before putting an agent in an environment.

False: Inverse RL means inferring a reward function from the observed behavior of an agent in the environment, rather than deriving behavior from a given reward function.


True or False: DEC-POMDPs include communication, this communication allows agents to plan.

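False: The standard DEC-POMDP (decentralized POMDP) model has no explicit communication channel. Each agent acts on its own local observations toward a shared reward, with no single agent dictating all the actions. Any communication has to be modeled through actions that influence other agents' observations, or through an extension such as Dec-POMDP-Com.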


True or False: Policy shaping requires a completely correct oracle to give the RL agent advice.

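False: Policy shaping does not need a perfect oracle. The advice can be noisy, as long as it comes with a confidence value estimating how often it is correct, which determines how heavily the agent weights it.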
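A hedged sketch of how a confidence value might enter the computation, loosely following the policy shaping formulation as I understand it. The advice for an action is summarized by `delta` (positive labels minus negative labels) and `consistency` (the assumed probability that a label is correct, strictly between 0 and 1). Both names and both functions are illustrative, not a reference implementation.

```python
def advice_optimality_prob(delta, consistency):
    """Probability an action is optimal given net feedback delta.

    consistency is the assumed chance that any single label is correct
    and must lie strictly between 0 and 1.
    """
    right = consistency ** delta
    wrong = (1.0 - consistency) ** delta
    return right / (right + wrong)

def shape_policy(agent_probs, feedback_probs):
    """Combine the agent's own action probabilities with the advice by multiplying and renormalizing."""
    combined = [a * f for a, f in zip(agent_probs, feedback_probs)]
    total = sum(combined)
    return [c / total for c in combined]

# An unsure agent with two actions; action 0 received one positive label,
# action 1 received one negative label, and labels are assumed 80% reliable.
feedback = [advice_optimality_prob(1, 0.8), advice_optimality_prob(-1, 0.8)]
print(shape_policy([0.5, 0.5], feedback))
```

With `consistency` near 0.5 the advice barely moves the policy, and with no net feedback (`delta = 0`) it has no effect at all. That is how an imperfect advisor can still be useful.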