Chapter 5 Flashcards

(19 cards)

1
Q

Why use a DL framework?

A

So that we do not need to implement all of these techniques from scratch in our neural network, and can instead focus on the big picture of solving real-world problems.

2
Q

Examples of DL frameworks

A
  • TensorFlow
  • PyTorch
3
Q

Steps for using a DL framework

A

1. Write our program as a Python program.
2. Import the framework of choice as a library (see the snippet below).
3. Use DL functions from the framework that fit our program.
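
A minimal illustration of step 2, assuming TensorFlow with its Keras API as the framework of choice (consistent with the cards below):

# Step 2 in practice: import the framework as a library.
import tensorflow as tf
from tensorflow import keras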

4
Q

What are the important functions we use to build our neural networks?

A
  • keras.Sequential: to build the network.
  • keras.layers.Flatten: to handle our inputs.
  • keras.layers.Dense: to define the hidden layers, the number of neurons, and the activation function.
  • keras.optimizers.SGD: to apply gradient descent with the chosen learning rate.
  • compile: to prepare the model for training.
  • fit: to start the training (see the sketch after this list).
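
A minimal sketch tying these functions together, assuming the MNIST digit dataset; the layer sizes, activations, loss, and learning rate are illustrative choices, not prescribed by the cards:

import tensorflow as tf
from tensorflow import keras

# Load data and scale pixel values to [0, 1].
(train_images, train_labels), (test_images, test_labels) = \
    keras.datasets.mnist.load_data()
train_images = train_images / 255.0
test_images = test_images / 255.0

# keras.Sequential builds the network; Flatten handles the 28x28 inputs;
# Dense defines the layer sizes and activation functions.
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(25, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])

# keras.optimizers.SGD applies gradient descent with a learning rate;
# compile prepares the model for training; fit starts the training.
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5, batch_size=32)
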
5
Q

Epochs

A

One epoch refers to one complete cycle through the full training dataset.

6
Q

Batch Size

A

The number of training samples used in one iteration.
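
For example, with 60,000 training samples and a batch size of 32, one epoch consists of 60,000 / 32 = 1,875 weight-update iterations.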

7
Q

Saturated neurons

A

A neuron is saturated when it is insensitive to input changes because the derivative of its activation function is approximately 0 in the saturated region; this can cause learning to stop completely.
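
For example, the logistic sigmoid S(x) = 1/(1 + e^(-x)) has derivative S(x)(1 - S(x)); at x = 10, S(x) ≈ 0.99995, so the derivative is about 4.5e-5 and the weight adjustments all but vanish.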

8
Q

vanishing gradient problem

A

The problem where the backpropagated error is close to 0, so the weights are not adjusted. Saturated neurons are one of the causes.

9
Q

How to avoid saturated neurons?

A

Three common techniques:
- weight initialization.
- input standardization.
- batch normalization.

10
Q

Weight initialization

A

A way to ensure that our neurons are not saturated to begin with. If a neuron has a large number of inputs, we want to initialize the weights to smaller values, so that the input to the activation function has a reasonable probability of staying close to 0 and saturation is avoided.

11
Q

Weight initialization strategies

A
  • Glorot initialization: recommended for tanh- and sigmoid-based neurons.
  • He initialization: recommended for ReLU-based neurons.
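
A minimal sketch of selecting these initializers in Keras; the layer widths are illustrative assumptions:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    # Glorot (Xavier) initialization for a tanh layer.
    keras.layers.Dense(25, activation='tanh',
                       kernel_initializer='glorot_uniform'),
    # He initialization for a ReLU layer.
    keras.layers.Dense(25, activation='relu',
                       kernel_initializer='he_normal'),
    keras.layers.Dense(10, activation='softmax'),
])
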
12
Q

Input standardization.

A

Standardizing the input data so that it is centered around 0 with most values close to 0, by subtracting the mean and dividing by the standard deviation.
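
A minimal sketch, assuming the MNIST arrays from the earlier cards; as is standard practice, the statistics are computed on the training set only and reused for the test set:

from tensorflow import keras

(train_images, train_labels), (test_images, test_labels) = \
    keras.datasets.mnist.load_data()

# Subtract the mean and divide by the standard deviation.
mean = train_images.mean()
stddev = train_images.std()
train_images = (train_images - mean) / stddev
test_images = (test_images - mean) / stddev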

13
Q

Batch normalization

A

The idea is to normalize values inside the network as well, thereby preventing hidden neurons from becoming saturated.

14
Q

The two strategies to apply batch normalization.

A
  • apply the normalization before the activation function.
  • apply the normalization after the activation function.
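
A minimal sketch of both strategies in Keras; the layer width and activation are illustrative assumptions:

from tensorflow import keras

# Strategy 1: normalize before the activation function.
before = keras.Sequential([
    keras.layers.Dense(64),                # linear part only
    keras.layers.BatchNormalization(),     # normalize pre-activations
    keras.layers.Activation('relu'),       # then apply the activation
])

# Strategy 2: normalize after the activation function.
after = keras.Sequential([
    keras.layers.Dense(64, activation='relu'),
    keras.layers.BatchNormalization(),     # normalize activations
])
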
15
Q

Examples of activation functions to avoid saturated neurons.

A
  • ReLU.
  • Leaky ReLU.
  • Maxout.
  • ELU.
  • Softplus.
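
A brief sketch of how these appear in Keras; note that maxout is not part of core Keras, so it is omitted here:

from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(64, activation='elu'),
    keras.layers.Dense(64, activation='softplus'),
    keras.layers.Dense(64),
    keras.layers.LeakyReLU(),   # leaky ReLU is applied as its own layer
    keras.layers.Dense(10, activation='softmax'),
])
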
16
Q

What are the variations on gradient descent that help speed up learning?

A

1. Momentum.
2. Adaptive learning rate.
3. Adaptive moments (Adam).
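
A minimal sketch of these variations as Keras optimizers; the hyperparameter values are common defaults, not taken from the cards:

from tensorflow import keras

sgd_momentum = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)  # momentum
adagrad = keras.optimizers.Adagrad(learning_rate=0.01)  # adaptive learning rate
adam = keras.optimizers.Adam(learning_rate=0.001)       # adaptive moments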

17
Q

Exploding gradients

A

Where the gradient becomes too big at some point, causing a huge step size.

18
Q

Gradient clipping

A

A technique to avoid exploding gradients by simply not allowing overly large gradient values in the weight-update step.
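
A minimal sketch, using the clipping arguments that Keras optimizers accept:

from tensorflow import keras

# clipnorm rescales the gradient if its norm exceeds the threshold;
# clipvalue clips each gradient element individually.
opt = keras.optimizers.SGD(learning_rate=0.01, clipnorm=1.0)
# or: opt = keras.optimizers.SGD(learning_rate=0.01, clipvalue=0.5)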

19
Q

How to avoid overfitting?

A

Split the dataset into:
1. Training dataset: used to train the model.
2. Validation dataset: used to tune hyperparameters.
3. Test dataset: used to test the final model.
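
A minimal sketch of the three-way split, assuming MNIST; carving 10% of the training data out as a validation set is an illustrative choice:

from tensorflow import keras

(images, labels), (test_images, test_labels) = \
    keras.datasets.mnist.load_data()        # test set stays untouched
images, test_images = images / 255.0, test_images / 255.0

# Hold out part of the training data for validation.
val_count = len(images) // 10
val_images, val_labels = images[:val_count], labels[:val_count]
train_images, train_labels = images[val_count:], labels[val_count:]

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(25, activation='relu'),
    keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5,
          validation_data=(val_images, val_labels))  # tune hyperparameters here
model.evaluate(test_images, test_labels)             # final check on the test set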