week 5 - chatgpt Flashcards
What is the main advantage of using deep neural networks over shallow ones?
Deep networks can represent complex functions with far fewer parameters than shallow ones by composing hierarchical features (e.g., edges, then parts, then objects), making them more parameter-efficient for complex tasks.
What is the vanishing gradient problem in deep networks?
As gradients are backpropagated through many layers, they can become very small, preventing early layers from learning effectively.
What is the exploding gradient problem in deep networks?
Gradients can grow exponentially as they are backpropagated, leading to unstable updates and divergence during training.
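A minimal numpy sketch of both problems (layer count and weight scales are illustrative, not from the cards): backpropagation multiplies the gradient by one factor per layer, so repeated factors below 1 shrink it toward zero and factors above 1 blow it up.

```python
import numpy as np

def backprop_gradient_norm(num_layers, weight_scale, seed=0):
    """Push a gradient back through random linear layers and return its norm."""
    rng = np.random.default_rng(seed)
    grad = np.ones(10)
    for _ in range(num_layers):
        W = weight_scale * rng.standard_normal((10, 10))
        grad = W.T @ grad                      # chain rule: one matrix factor per layer
    return np.linalg.norm(grad)

print(backprop_gradient_norm(50, weight_scale=0.05))  # ~0: vanishing
print(backprop_gradient_norm(50, weight_scale=1.0))   # astronomically large: exploding
```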
How does the ReLU activation function help with vanishing gradients?
ReLU's derivative is exactly 1 for positive inputs, so gradients flowing through active units are not repeatedly scaled down during backpropagation.
What is the difference between ReLU, LReLU, and PReLU?
ReLU outputs 0 for negative inputs; Leaky ReLU (LReLU) instead uses a small fixed slope for negative inputs; PReLU learns that slope from data.
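A small numpy sketch of the three variants (the 0.01 leak and the PReLU `alpha` value are illustrative, not from the cards):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)                 # negatives clipped to 0

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)      # small fixed negative slope

def prelu(x, alpha):
    return np.where(x > 0, x, alpha * x)      # alpha is learned during training

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x), leaky_relu(x), prelu(x, alpha=0.25))
```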
Why is weight initialization important in deep networks?
Proper initialization (e.g., Xavier or He) maintains variance across layers, avoiding vanishing or exploding activations and gradients.
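A minimal numpy sketch of the variance-preserving idea behind He initialization (layer sizes are made up; Xavier differs only in the scaling factor):

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 512, 256

# He: Var(W) = 2 / fan_in compensates for ReLU zeroing half the pre-activations;
# Xavier uses 2 / (fan_in + fan_out) instead.
W_he = rng.standard_normal((fan_out, fan_in)) * np.sqrt(2.0 / fan_in)

x = rng.standard_normal(fan_in)
h = np.maximum(0.0, W_he @ x)          # one ReLU layer
print(np.mean(x**2), np.mean(h**2))    # second moments stay on the same scale
```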
What is the purpose of batch normalization?
It standardizes layer inputs to have zero mean and unit variance, which stabilizes and speeds up training.
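A minimal numpy sketch of the normalization step (the learnable scale `gamma`, shift `beta`, and epsilon are standard but assumed here):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardize each feature over the batch, then rescale and shift."""
    mean = x.mean(axis=0)                     # per-feature batch mean
    var = x.var(axis=0)                       # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * x_hat + beta               # learnable scale and shift

batch = np.random.default_rng(0).normal(5.0, 3.0, size=(32, 4))
normed = batch_norm(batch)
print(normed.mean(axis=0).round(3), normed.std(axis=0).round(3))
```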
How do skip connections help train deep networks?
They allow gradients to bypass some layers, mitigating the vanishing gradient problem and enabling very deep architectures.
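A minimal sketch of a residual (skip) connection, assuming a toy two-layer block in numpy with made-up weights: because the output is x + F(x), the derivative always contains an identity term, so the gradient has a direct path around F.

```python
import numpy as np

def residual_block(x, W1, W2):
    """y = x + F(x): the identity path lets gradients bypass the inner layers."""
    h = np.maximum(0.0, W1 @ x)    # inner transformation F with a ReLU
    return x + W2 @ h              # skip connection adds the input back

rng = np.random.default_rng(0)
d = 8
x = rng.standard_normal(d)
W1 = rng.standard_normal((d, d)) * 0.1
W2 = rng.standard_normal((d, d)) * 0.1
print(residual_block(x, W1, W2))
```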
What is the main idea behind convolutional neural networks (CNNs)?
CNNs use local filters and weight sharing to detect spatial hierarchies in data, such as edges, textures, and shapes in images.
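A minimal numpy sketch of one 3x3 filter slid over an image, showing the local receptive field and weight sharing (the image size and edge-detecting filter are illustrative):

```python
import numpy as np

def conv2d_single(image, kernel):
    """Slide one kernel over the image; the same weights are reused at every position."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + k, j:j + k]       # local receptive field
            out[i, j] = np.sum(patch * kernel)    # shared weights at every location
    return out

image = np.random.default_rng(0).random((8, 8))
edge_filter = np.array([[1, 0, -1], [1, 0, -1], [1, 0, -1]], dtype=float)  # vertical edges
print(conv2d_single(image, edge_filter).shape)    # (6, 6)
```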
What role do pooling layers play in CNNs?
Pooling layers reduce spatial dimensions and provide a degree of local translation invariance, helping the network generalize across small input variations.
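A minimal numpy sketch of 2x2 max pooling over non-overlapping windows, which halves each spatial dimension:

```python
import numpy as np

def max_pool_2x2(x):
    """Keep the maximum of each non-overlapping 2x2 window."""
    H, W = x.shape
    return x[:H - H % 2, :W - W % 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
print(max_pool_2x2(x))   # 2x2 output; small shifts of the input often leave it unchanged
```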
What is the purpose of 1x1 convolutions in CNNs?
They are used to reduce or expand the number of channels without affecting spatial dimensions, aiding computational efficiency.
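A minimal numpy sketch of a 1x1 convolution as a per-pixel linear map across channels (the channel counts are illustrative): spatial size is unchanged, only the channel dimension shrinks.

```python
import numpy as np

def conv1x1(x, W):
    """x: (C_in, H, W) feature map; W: (C_out, C_in). Mixes channels only."""
    return np.einsum('oc,chw->ohw', W, x)    # same spatial size, new channel count

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 16, 16))        # 64 channels, 16x16 spatial
W = rng.standard_normal((16, 64)) * 0.1      # reduce to 16 channels
print(conv1x1(x, W).shape)                   # (16, 16, 16)
```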
What is dropout and why is it used?
Dropout randomly deactivates neurons during training, forcing redundancy and improving generalization by reducing overfitting.
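A minimal numpy sketch of (inverted) dropout at training time, assuming a drop probability of 0.5:

```python
import numpy as np

def dropout(x, p, rng):
    """Zero each unit with probability p; rescale survivors so the expected activation is unchanged."""
    mask = rng.random(x.shape) >= p          # keep each unit with probability 1 - p
    return x * mask / (1.0 - p)              # inverted dropout: nothing to undo at test time

rng = np.random.default_rng(0)
print(dropout(np.ones(10), p=0.5, rng=rng))  # roughly half zeros, survivors scaled to 2.0
```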