Lecture 12 - Neural Networks Flashcards

1
Q

Where are ANNs used?

A

Artificial Neural Networks (ANN) or Neural Networks (NN) (here we mean artificially simulated ones)
Usually considered black boxes
Ex: used in credit card fraud detection, insurance claims, medical insurance processing
But ~90% incorrect in the USA… (relisten)
Ex: ANNs are used in ChatGPT, Google Translate, self-driving cars, most modern robots (ex: Roombas)

2
Q

Artificial neurons

A

Real neuron: input at the dendrites, through the cell body, along the axon to the axon terminal (output)… then across a synapse into the next neuron’s dendrites

Artificial neurons:
Each weight corresponds to how important that input is to the neuron’s output
Input x Weight
Sum them up

Output = Sum of (Input x Weight)
Matrix multiplications!

BUT after doing this and before the output, there is also an Activation (the activation is like a threshold… either you have it or you don’t)
Like an instruction
Ex: if negative, turn it into 0. If positive, y = x

BUT before even the activation, there is a Bias
- because real neurons can have a base excitation level
Ex: add a fixed single value. Ex: add 1.5

So in order:
- Matrix multiplication (dot product): sum of all inputs × weights. Ex: (0.1 × -1.0) + (-0.8 × 1.0) + (0.5 × 0.0) = -0.9
- Then you add the bias. Ex: add 1.5: -0.9 + 1.5 = 0.6
- Then you apply the activation. Ex: a negative value becomes 0; a positive value is kept. Here 0.6 is positive, so the output is 0.6
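
The same computation as a minimal Python sketch (treating 0.1, -0.8, 0.5 as the inputs and -1.0, 1.0, 0.0 as the weights is an assumption; the products come out the same either way):

```python
import numpy as np

def relu(x):
    # Activation: negative values become 0, positive values pass through (y = x)
    return np.maximum(0.0, x)

# Numbers from the worked example above
inputs = np.array([0.1, -0.8, 0.5])   # assumed to be the inputs
weights = np.array([-1.0, 1.0, 0.0])  # assumed to be the weights
bias = 1.5

weighted_sum = np.dot(inputs, weights)  # (0.1*-1.0) + (-0.8*1.0) + (0.5*0.0) = -0.9
pre_activation = weighted_sum + bias    # -0.9 + 1.5 = 0.6
output = relu(pre_activation)           # 0.6 is positive, so it stays
print(output)                           # ~0.6 (up to float rounding)
```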

Weights and biases are parameters: usually random values at first, then they get closer to good values as you keep training
The core of ChatGPT is matrix multiplications (but with a trillion+ parameters lol)… reportedly ~1.76 trillion parameters

The weighted sum of a single neuron is also called a Dot Product; a whole layer computes many dot products at once as a matrix multiplication.

3
Q

Activation Functions

A

Activation functions model neurons firing at different rates

- Sigmoid function: output goes from 0 to 1. Input on the x axis, output on the y axis: the more negative the input, the closer y gets to 0. As the input approaches zero, the output starts picking up; as the input gets more and more positive, the output maxes out toward y = 1.
(represents the neuron firing rates from the previous slide)

- Tanh function: instead of 0 to 1, it goes from -1 to 1.
The more negative the input on the x axis, the closer y gets to -1
The more positive the input on the x axis, the closer y gets to 1

- ReLU (Rectified Linear Unit)
If the input is below 0, it just cuts it off (y = 0)
If the input is positive, y = x
This is exactly the activation used in the ANN example earlier!
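
The three functions as a short Python sketch (the sample inputs are illustrative):

```python
import numpy as np

def sigmoid(x):
    # Squashes any input into (0, 1): very negative -> ~0, very positive -> ~1
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Like sigmoid, but squashes into (-1, 1)
    return np.tanh(x)

def relu(x):
    # Cuts off negatives (y = 0); passes positives through (y = x)
    return np.maximum(0.0, x)

xs = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(sigmoid(xs))  # ~[0.007 0.269 0.5   0.731 0.993]
print(tanh(xs))     # ~[-1.0  -0.762 0.   0.762 1.0  ]
print(relu(xs))     # [0. 0. 0. 1. 5.]
```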

4
Q

Single-Layer “Networks”

A

Demo:
OR gate
AND gate
OR & AND+NOT gate
????

The Perceptron is part of supervised learning
Perceptron Update Rule (see the sketch below):
δ (small delta) = desired output - actual output
ΔW (big delta) = ε * δ * input, where ε is the learning rate
?? see slides
But basically, we want our actual output to get closer to our desired output
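
A minimal Python sketch of the update rule, training a perceptron on the OR gate from the demo above (the learning rate ε = 0.1, the random initialization, and the epoch count are illustrative assumptions):

```python
import numpy as np

def step(x):
    # Threshold activation: fire (1) or don't (0)
    return 1 if x > 0 else 0

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
desired = np.array([0, 1, 1, 1])  # OR gate

rng = np.random.default_rng(0)
weights = rng.normal(size=2) * 0.1  # small random initial weights
bias = 0.0
epsilon = 0.1                       # learning rate

for epoch in range(20):
    for x, d in zip(inputs, desired):
        actual = step(np.dot(x, weights) + bias)
        delta = d - actual                 # small delta: desired - actual
        weights += epsilon * delta * x     # big delta W = epsilon * delta * input
        bias += epsilon * delta            # bias updated the same way

for x in inputs:
    print(x, step(np.dot(x, weights) + bias))  # should print 0, 1, 1, 1
```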

5
Q

More than one neuron

A

Inputs, hidden layers, outputs

Multi-layer Perceptron (MLP)
Dense Neural Network
“Deep” Neural Network
Linear Layers

Called hidden layers because you don’t supervise them directly (there are no desired values for them; the error only reaches them through backpropagation)

Feedforward:
Matrix multiplication of the inputs through the hidden layers … to the outputs
Ex: Potato Child (his cat)
But if instead we want the desired output to be not Potato Child but Chicken (the dog)
We have to do Backpropagation (sketched below):
Desired output minus actual output (the error/loss… small delta)
Propagate the error layer by layer backwards through the network
Use it to update our weights
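
A minimal Python sketch of feedforward plus one backpropagation step for a tiny MLP (the layer sizes, squared-error loss, and learning rate are illustrative assumptions, not necessarily what the slides used):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 3)), np.zeros(3)   # input -> hidden
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)   # hidden -> output

x = np.array([[0.1, -0.8]])   # one input row
desired = np.array([[1.0]])   # desired output
lr = 0.1                      # learning rate

# Feedforward: matrix multiplications layer by layer
h_pre = x @ W1 + b1           # hidden pre-activation
h = np.maximum(0.0, h_pre)    # ReLU activation
actual = h @ W2 + b2          # output layer (linear)

# Error/loss: desired - actual, wrapped into a squared-error loss
delta = desired - actual
loss = np.mean(delta ** 2)

# Backpropagation: apply the error layer by layer, back to front
grad_out = -2 * delta / delta.size   # d(loss)/d(actual)
grad_W2 = h.T @ grad_out
grad_b2 = grad_out.sum(axis=0)
grad_h = grad_out @ W2.T
grad_h_pre = grad_h * (h_pre > 0)    # ReLU gradient: 1 where input was > 0
grad_W1 = x.T @ grad_h_pre
grad_b1 = grad_h_pre.sum(axis=0)

# Gradient-descent update of all weights + biases
W2 -= lr * grad_W2; b2 -= lr * grad_b2
W1 -= lr * grad_W1; b1 -= lr * grad_b1
print(loss)
```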

6
Q

NN Training in General

A

● Randomly initialize all weights
● For each (input, desired_output) in dataset:
○ Put input through model (feed-forward), receive predicted_output
○ Calculate loss (desired_output - predicted_output)
○ Backpropagate loss through network, update all weights + biases

These details are probably not exam relevant, but have a general idea:
Gradient = the direction to change each weight; every gradient step (every weight update) brings you further along the trajectory towards your goal
Computed by calculating the first derivative of every layer
Gradient descent = how your network actually updates its weights
Loss function = measures how far the desired output and predicted output are from cancelling out… e.g. mean absolute error, or more complex formulas
Mini-batches = multiple lines of data processed at the same time
A sketch of the whole recipe follows below.
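
The whole recipe as a Python sketch: a single linear layer trained with mean absolute error on mini-batches (the dataset, learning rate, and batch size are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))        # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.3                 # desired outputs

w = rng.normal(size=3)               # randomly initialize all weights
b = 0.0
lr = 0.05                            # learning rate
batch_size = 10                      # mini-batch: several lines of data at once

for epoch in range(200):
    for i in range(0, len(X), batch_size):
        xb, yb = X[i:i+batch_size], y[i:i+batch_size]
        pred = xb @ w + b            # feed-forward
        # Gradient of mean absolute error w.r.t. each prediction:
        # -sign(desired - predicted), averaged over the batch
        g = -np.sign(yb - pred) / len(yb)
        w -= lr * (xb.T @ g)         # gradient-descent update
        b -= lr * g.sum()
    if epoch % 50 == 0:
        print(epoch, np.mean(np.abs(y - (X @ w + b))))  # loss shrinks

print(w, b)  # approaches [2.0, -1.0, 0.5] and 0.3 (wiggles near the optimum)
```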

7
Q

Summary

A

● Artificial Neurons very roughly model a single neuron
● Single “Cell”/layer components
○ Weights → output = Σ (inputs * weights)
○ Biases
○ Activation function
● Implementation: usually GPU → matrix multiplication
● Training algo: forward pass, loss, backward pass
● CNNs (a sketch of the kernel operation follows below)
○ multiple kernels (3x3, 5x5, etc.)
○ a feature hierarchy is learned
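
CNNs aren’t detailed in these notes, but this Python sketch shows the core kernel operation: sliding a small kernel over an image (the vertical-edge kernel values are illustrative; a real CNN learns its kernel values during training):

```python
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Each output value is a dot product of the kernel with one patch
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

kernel = np.array([   # 3x3 vertical-edge detector
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
], dtype=float)

print(conv2d(image, kernel))  # large values where the vertical edge is
```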
