Deep Q Network MLM Flashcards

1
Q

Deep Q Networks (DQN)

A

Deep Q Networks (DQN) are a class of reinforcement learning methods that combine deep learning with Q-Learning, a model-free reinforcement learning algorithm.

2
Q
Introduction
A

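Deep Q Network (DQN) is a variant of Q-Learning that uses a deep neural network to approximate the Q-value function. The Q-value of a state-action pair estimates the expected cumulative reward of taking that action in that state, so an agent can learn to play a game by choosing the action with the highest estimated Q-value in each state.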

3
Q
Neural Networks
A

In DQN, a neural network is used as a function approximator for the Q-value function. The input to the network is the current state of the game, and the output is the corresponding Q-value for each possible action in that state.
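
The state-in, Q-values-out mapping described above can be sketched with a tiny fully connected network in plain Python. This is only an illustration: the layer sizes, the ReLU hidden layer, and the helper names (`init_layer`, `dense`, `q_values`) are assumptions made for the example, not details given on this card.

```python
import random

def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(x, layer, relu):
    """Apply one dense layer, optionally followed by a ReLU."""
    weights, biases = layer
    out = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, v) for v in out] if relu else out

def q_values(state, hidden_layer, output_layer):
    """Map a state vector to one Q-value estimate per action."""
    return dense(dense(state, hidden_layer, relu=True), output_layer, relu=False)

rng = random.Random(0)
STATE_DIM, HIDDEN, NUM_ACTIONS = 4, 8, 2  # hypothetical sizes for illustration
hidden_layer = init_layer(STATE_DIM, HIDDEN, rng)
output_layer = init_layer(HIDDEN, NUM_ACTIONS, rng)

state = [0.1, -0.2, 0.05, 0.3]            # a made-up state vector
q = q_values(state, hidden_layer, output_layer)

# The greedy action is the index of the largest Q-value.
greedy_action = max(range(NUM_ACTIONS), key=lambda a: q[a])
```

A single forward pass yields the Q-values for every action at once, which is why the network outputs one value per action rather than taking the action as an input.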

4
Q
Experience Replay
A

DQN uses a technique called experience replay, in which past transitions (state, action, reward, next state) are stored in a replay memory. During training, minibatches of transitions are sampled at random from this memory and used to update the network's Q-value estimates. Random sampling breaks the correlation between consecutive samples, which stabilizes training.
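
A minimal sketch of a replay memory, assuming a fixed capacity with oldest-first eviction and uniform random sampling. The class name `ReplayMemory`, the `Transition` fields, and the capacity and batch size are hypothetical choices for illustration.

```python
import random
from collections import deque, namedtuple

# One stored step of experience; the field names are assumptions for this sketch.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])

class ReplayMemory:
    """Fixed-size buffer that drops the oldest transition when full."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement decorrelates consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

memory = ReplayMemory(capacity=100, seed=0)
for t in range(150):  # made-up transitions; the first 50 are evicted
    memory.push(state=t, action=t % 2, reward=1.0, next_state=t + 1, done=False)

batch = memory.sample(32)
```

Because the minibatch mixes transitions from many different time steps, each update sees data that is closer to independent than a run of consecutive frames would be.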

5
Q
Target Network
A

DQN also uses a target network: a copy of the main network whose weights are held fixed and only periodically refreshed from the main network. The target network computes the target Q-value during updates, so the learning targets do not shift with every gradient step and training is more stable.
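
The role of the frozen copy can be sketched as follows. The target for a transition is the reward plus the discounted maximum Q-value that the target network assigns to the next state. The discount factor, the dictionary standing in for network weights, and the function names are assumptions for illustration only.

```python
import copy

GAMMA = 0.99  # hypothetical discount factor

def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Target for one transition.

    next_q_values are the target network's Q-values for the next state.
    A terminal transition has no future value, so the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def sync_target(online_params):
    """Return an independent snapshot of the online network's parameters."""
    return copy.deepcopy(online_params)

online_params = {"w": [0.5, -0.2]}   # stand-in for real network weights
target_params = sync_target(online_params)

# A training step changes only the online parameters; the snapshot stays put
# until sync_target is called again.
online_params["w"][0] = 0.9
```

How often the snapshot is refreshed is a tuning choice; refreshing it on every step would remove the stabilizing effect.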

6
Q
Epsilon-Greedy Policy
A

DQN typically employs an epsilon-greedy policy for exploration: with probability epsilon the agent takes a random action, and otherwise it takes the action with the highest estimated Q-value. This balance between exploration and exploitation helps the agent discover better actions instead of committing too early to its current estimates.
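
A minimal sketch of epsilon-greedy action selection. The function name, the example Q-values, and the epsilon values are made up for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])   # exploit

rng = random.Random(0)
q = [0.1, 0.7, 0.3]  # made-up Q-value estimates for three actions

# With epsilon = 0 the agent always exploits; with epsilon = 0.5 it explores
# about half the time while still favoring the greedy action overall.
greedy_only = [epsilon_greedy(q, 0.0, rng) for _ in range(100)]
mixed = [epsilon_greedy(q, 0.5, rng) for _ in range(1000)]
```

In practice epsilon is often started high and reduced over training, so the agent explores widely at first and exploits more as its estimates improve.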

7
Q
Reward and Punishment
A

DQN agents learn from both positive rewards and punishments (negative rewards). For example, if an action leads to a higher score in a game, the agent receives a positive reward. If an action causes the game to end, the agent receives a negative reward.

8
Q
Challenges
A

While DQNs have shown remarkable success, particularly in learning to play video games from raw pixel inputs, they have limitations. They can be sample-inefficient, meaning they require a large amount of experience (gameplay) to learn effectively. The choice of reward function also strongly affects what the agent learns, and designing a good reward function can be non-trivial.

9
Q
Applications
A

DQN was notably used by Google’s DeepMind to train an agent to play Atari games directly from raw pixel inputs, reaching or exceeding human-level performance on many of them.
