Lec 5 | Neural Networks Flashcards
(50 cards)
It is inspired by neuroscience
AI Neural Networks
- These are connected to other ____________ and can receive and send electrical signals from/to them.
- They process input signals and can be activated, sending their own electrical signal forward.
- A unit connected to other units
Neuron
- It is a mathematical model for learning inspired by biological neural networks
- It models mathematical functions that map inputs to outputs based on the structure and parameters of the network.
- Allows for learning the network’s parameters based on data
Artificial Neural Network
w₀ is a constant, also called [???], modifying the value of the whole expression
Bias
Activation Functions
gives 0 before a certain threshold is reached and 1 after the threshold is reached.
Step Function
Activation Functions
Give the formula for the step function
g(x) = 1 if x ≥ 0, else 0
Activation Functions
gives as output any real number from 0 to 1, thus expressing graded confidence in its judgment
Logistic Function/Sigmoid
Activation Functions
What is the formula for the logistic function/sigmoid?
g(x) = e^x / (e^x + 1)
Activation Functions
Allows the output to be any positive value; if the input is negative, it sets the output to 0.
Rectified Linear Unit (ReLU)
Activation Functions
Formula for ReLU?
g(x) = max(0, x)
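The three activation functions above follow directly from their formulas. A minimal Python sketch (the function names are my own, not from the lecture):

```python
import math

def step(x):
    """Step function: 1 at or above the threshold 0, else 0."""
    return 1 if x >= 0 else 0

def sigmoid(x):
    """Logistic function: maps any real number into (0, 1)."""
    return math.exp(x) / (math.exp(x) + 1)

def relu(x):
    """Rectified linear unit: negative inputs become 0."""
    return max(0, x)
```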
It is an algorithm for minimizing loss when training neural networks
Gradient Descent
PSEUDOCODE
Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on all data points: direction that will lead to decreasing loss.
    Update weights according to the gradient.
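The pseudocode can be sketched in Python for a simple linear model y = w·x + b with mean squared error (the model, loss, and hyperparameters here are illustrative assumptions, not from the lecture):

```python
import random

def gradient_descent(points, learning_rate=0.01, epochs=1000):
    """Fit y = w*x + b by gradient descent on mean squared error.

    `points` is a list of (x, y) pairs; the gradient is computed
    over ALL data points on every iteration, as in the pseudocode.
    """
    w, b = random.random(), random.random()  # random initial weights
    n = len(points)
    for _ in range(epochs):
        # Gradient of the MSE loss with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Step in the direction that decreases the loss
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b
```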
What is a drawback or a problem of the gradient descent and how do you solve or minimize the problem?
- It requires calculating the gradient based on all data points, which is computationally costly.
- Use Stochastic Gradient Descent or Mini-Batch Gradient Descent to minimize the problem.
The gradient is calculated based on one point chosen at random.
Stochastic Gradient Descent
PSEUDOCODE
Stochastic Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on one data point: direction that will lead to decreasing loss.
    Update weights according to the gradient.
What is a drawback or problem with Stochastic Gradient Descent and how can it be solved?
It can be inaccurate, since the gradient is estimated from a single point. A way to solve this is by using Mini-Batch Gradient Descent.
- This computes the gradient based on a few points selected at random
- Finds a compromise between computation cost and accuracy
Mini-batch Gradient Descent
Pseudocode
Mini-batch Gradient Descent
Start with a random choice of weights.
Repeat:
    Calculate the gradient based on one small batch of data points: direction that will lead to decreasing loss.
    Update weights according to the gradient.
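The three gradient descent variants differ only in how many points each gradient uses. A sketch, assuming an illustrative linear model y = w·x + b (not from the lecture):

```python
import random

def minibatch_gradient_descent(points, batch_size=2, learning_rate=0.01, epochs=1000):
    """Fit y = w*x + b, computing each gradient on a small random batch.

    batch_size=1 gives Stochastic Gradient Descent;
    batch_size=len(points) gives ordinary (batch) Gradient Descent.
    """
    w, b = random.random(), random.random()  # random initial weights
    for _ in range(epochs):
        batch = random.sample(points, batch_size)  # a few points at random
        # Gradient of the MSE loss on this batch only
        grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / batch_size
        grad_b = sum(2 * (w * x + b - y) for x, y in batch) / batch_size
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b
```

The batch size is the compromise knob: larger batches cost more per update but give a more accurate gradient.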
Main takeaway for Gradient Descents?
None of these solutions is perfect, and different solutions might be employed in different situations.
- Only capable of learning a linearly separable decision boundary.
- Uses a straight line to separate data
- It can classify an input as one type or another
Perceptron
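A minimal perceptron sketch: classification is a step function applied to the weighted sum, so only a straight-line boundary can be learned. The training procedure shown (the classic perceptron learning rule) is an assumption beyond the card:

```python
def perceptron_train(data, epochs=100, learning_rate=0.1):
    """Learn perceptron weights with the perceptron learning rule.

    `data` is a list of (inputs, label) pairs with label 0 or 1.
    Works only when the data is linearly separable.
    """
    n = len(data[0][0])
    weights = [0.0] * n
    bias = 0.0
    for _ in range(epochs):
        for inputs, label in data:
            activation = bias + sum(w * x for w, x in zip(weights, inputs))
            prediction = 1 if activation >= 0 else 0  # step function
            error = label - prediction
            # Nudge the boundary toward misclassified points
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

def perceptron_predict(weights, bias, inputs):
    """Classify an input as one type (1) or another (0)."""
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) >= 0 else 0
```

Trained on the AND function (which is linearly separable), this classifies all four inputs correctly; on XOR it cannot, which motivates the multilayer networks below.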
Some data are not linearly separable. What do we use for data that are non-linearly separable?
Multilayer Neural Networks
an artificial neural network with an input layer, an output layer, and at least one hidden layer.
Multilayer Neural Networks
It processes weighted inputs: it receives inputs with their weights, applies an activation function, and passes its outputs to the next layer, until the output layer (the final layer) is reached. It enables modeling of non-linear data.
Hidden Layer
It is the main algorithm used for training neural networks with hidden layers.
Backpropagation
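A minimal backpropagation sketch for a tiny 2-input, one-hidden-layer sigmoid network trained on XOR (the architecture, learning rate, and epoch count are illustrative assumptions): the output error is computed first, then propagated backward to the hidden layer, and every weight is updated by gradient descent.

```python
import math
import random

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def train_xor_network(epochs=10000, lr=0.5, hidden=4, seed=0):
    """Train a 2-hidden-1 sigmoid network on XOR with backpropagation."""
    rng = random.Random(seed)
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    # Randomly initialized weights: hidden layer, then output layer
    w_h = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
    b_h = [rng.uniform(-1, 1) for _ in range(hidden)]
    w_o = [rng.uniform(-1, 1) for _ in range(hidden)]
    b_o = rng.uniform(-1, 1)
    for _ in range(epochs):
        for x, y in data:
            # Forward pass: input -> hidden -> output
            h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
                 for j in range(hidden)]
            out = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
            # Backward pass: output error, then hidden-layer errors
            d_out = (out - y) * out * (1 - out)
            d_h = [d_out * w_o[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            # Gradient descent updates for every weight and bias
            for j in range(hidden):
                w_o[j] -= lr * d_out * h[j]
                for i in range(2):
                    w_h[j][i] -= lr * d_h[j] * x[i]
                b_h[j] -= lr * d_h[j]
            b_o -= lr * d_out
    def predict(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(w_h[j], x)) + b_h[j])
             for j in range(hidden)]
        return sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + b_o)
    return predict
```

With enough epochs this typically learns XOR, which the single-layer perceptron cannot represent; results depend on the random initialization.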