Lec 5 | Neural Networks Flashcards
(50 cards)
It is inspired by neuroscience
AI Neural Networks
- These are connected to other ____________ and can receive and send electrical signals from/to them.
- They process input signals and can be activated, sending their own electrical signal forward.
- A unit connected to other units
Neuron
- It is a mathematical model for learning inspired by biological neural networks
- It models mathematical functions that map inputs to outputs based on the structure and parameters of the network.
- Allows for learning the network’s parameters based on data
Artificial Neural Network
w₀ is a constant, also called [???], modifying the value of the whole expression
Bias
Activation Functions
gives 0 before a certain threshold is reached and 1 after the threshold is reached.
Step Function
Activation Functions
Give the formula for the step function
g(x) = 1 if x ≥ 0, else 0
Activation Functions
gives as output any real number from 0 to 1, thus expressing graded confidence in its judgment
Logistic Function/Sigmoid
Activation Functions
What is the formula for the logistic function/sigmoid?
g(x) = e^x / (e^x + 1)
Activation Functions
Allows the output to be any positive value; if the input is negative, it sets the output to 0.
Rectified Linear Unit (ReLU)
Activation Functions
Formula for ReLU?
g(x) = max(0, x)
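The three activation functions above follow directly from their formulas. A minimal Python sketch (the function names are my own, not from the lecture):

```python
import math

def step(x):
    """Step function: 1 at or above the threshold 0, else 0."""
    return 1 if x >= 0 else 0

def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return math.exp(x) / (math.exp(x) + 1)

def relu(x):
    """Rectified linear unit: negative inputs become 0."""
    return max(0, x)
```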
It is an algorithm for minimizing loss when training neural networks
Gradient Descent
PSEUDOCODE
Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on all data points: direction that will lead to decreasing loss.
    Update weights according to the gradient.
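The pseudocode can be sketched in Python for a simple linear model y = w·x + b with mean squared error (the model, loss, and hyperparameters here are illustrative assumptions, not from the lecture):

```python
import random

def gradient_descent(points, learning_rate=0.01, epochs=1000):
    """Fit y = w*x + b by gradient descent on mean squared error.

    `points` is a list of (x, y) pairs; the gradient is computed
    over ALL data points on every iteration, as in the pseudocode.
    """
    w, b = random.random(), random.random()  # random initial weights
    n = len(points)
    for _ in range(epochs):
        # Gradient of the MSE loss with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Step in the direction that decreases the loss
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b
```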
What is a drawback or a problem of the gradient descent and how do you solve or minimize the problem?
- It requires calculating the gradient based on all data points, which is computationally costly.
- Use Stochastic Gradient Descent or Mini-Batch Gradient Descent to minimize the problem.
The gradient is calculated based on one point chosen at random.
Stochastic Gradient Descent
PSEUDOCODE
Stochastic Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on one data point: direction that will lead to decreasing loss.
    Update weights according to the gradient.
What is a drawback or problem with Stochastic Gradient Descent and how can it be solved?
It can be inaccurate, since the gradient is estimated from a single point. A way to solve this is by using Mini-Batch Gradient Descent.
- This computes the gradient based on a few points selected at random
- Finds a compromise between computation cost and accuracy
Mini-batch Gradient Descent
Pseudocode
Mini-batch Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on one small batch of data points: direction that will lead to decreasing loss.
    Update weights according to the gradient.
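The three gradient descent variants differ only in how many points each gradient uses. A sketch, assuming an illustrative linear model y = w·x + b (not from the lecture):

```python
import random

def minibatch_gradient_descent(points, batch_size=2, learning_rate=0.01, epochs=1000):
    """Fit y = w*x + b, computing each gradient on a small random batch.

    batch_size=1 gives Stochastic Gradient Descent;
    batch_size=len(points) gives ordinary (batch) Gradient Descent.
    """
    w, b = random.random(), random.random()  # random initial weights
    for _ in range(epochs):
        batch = random.sample(points, batch_size)  # a few points at random
        # Gradient of the MSE loss on this batch only
        grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / batch_size
        grad_b = sum(2 * (w * x + b - y) for x, y in batch) / batch_size
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b
```

The batch size is the compromise knob: larger batches cost more per update but give a more accurate gradient.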
Main takeaway for Gradient Descents?
None of these solutions is perfect, and different solutions might be employed in different situations.
- Only capable of learning a linearly separable decision boundary.
- Uses a straight line to separate data
- It can classify an input as one type or another
Perceptron
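A minimal perceptron sketch: classification is a step function applied to the weighted sum, so only a straight-line boundary can be learned. The training procedure shown (the classic perceptron learning rule) is an assumption beyond the card:

```python
def perceptron_train(data, epochs=100, learning_rate=0.1):
    """Learn perceptron weights with the perceptron learning rule.

    `data` is a list of (inputs, label) pairs with label 0 or 1.
    Works only when the data is linearly separable.
    """
    n = len(data[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in data:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            prediction = 1 if activation >= 0 else 0  # step function
            error = label - prediction
            # Nudge the boundary toward misclassified points
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

def perceptron_predict(weights, bias, inputs):
    """Classify an input as one type (1) or another (0)."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0
```

Trained on the AND function (which is linearly separable), this classifies all four inputs correctly; on XOR it cannot, which motivates the multilayer networks below.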
Some data are not linearly separable. What do we use for data that are non-linearly separable?
Multilayer Neural Networks
an artificial neural network with an input layer, an output layer, and at least one hidden layer.
Multilayer Neural Networks
It processes weighted inputs: it receives inputs with their weights, applies an activation function, and passes its outputs to the next layer, until the output layer (the final layer) is reached. It enables modeling of non-linear data.
Hidden Layer
It is the main algorithm used for training neural networks with hidden layers.
Backpropagation
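A minimal backpropagation sketch for a tiny 2-input, one-hidden-layer sigmoid network trained on XOR (the architecture, learning rate, and epoch count are illustrative assumptions): the output error is computed first, then propagated backward to the hidden layer, and every weight is updated by gradient descent.

```python
import math
import random

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def train_xor_network(epochs=10000, lr=0.5, hidden=4, seed=0):
    """Train a 2-hidden-1 sigmoid network on XOR with backpropagation."""
    rng = random.Random(seed)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    # Randomly initialized weights: hidden layer, then output layer
    w_h = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b_h = [rng.uniform(-1, 1) for _ in range(hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(hidden)]
    b_o = rng.uniform(-1, 1)
    for _ in range(epochs):
        for x, y in data:
            # Forward pass: input -> hidden -> output
            h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
                 for j in range(hidden)]
            out = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            # Backward pass: output error, then hidden-layer errors
            d_out = (out - y) * out * (1 - out)
            d_h = [d_out * w_o[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            # Gradient descent updates for every weight and bias
            for j in range(hidden):
                w_o[j] -= lr * d_out * h[j]
                for i in range(2):
                    w_h[j][i] -= lr * d_h[j] * x[i]
                b_h[j] -= lr * d_h[j]
            b_o -= lr * d_out
    def predict(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
             for j in range(hidden)]
        return sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
    return predict
```

With enough epochs this typically learns XOR, which the single-layer perceptron cannot represent; results depend on the random initialization.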