Chapter 8: Neural Networks Flashcards

1
Q

what type of neural networks demonstrate above-human-level performance in chess and Go?

A

convolutional neural networks

2
Q

what is AlexNet?

A

a convolutional neural network that outperformed other models in the ImageNet challenge

3
Q

what is a spiking neural network?

A

it aims to mimic a biological neuron more closely

4
Q

give a neuron mathematically in sum notation: y(x, w) = ?

A

f(sum_i w_i x_i + b)

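the sum-notation neuron can be sketched in Python (the step activation and all numbers are illustrative, not from the chapter):

```python
# A single neuron: y(x, w) = f(sum_i w_i x_i + b).
def step(z):
    """Threshold activation: fires (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def neuron(x, w, b):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return step(z)

print(neuron([1.0, 0.0], [0.6, 0.6], -0.5))  # 1: 0.6 - 0.5 >= 0
print(neuron([0.0, 0.0], [0.6, 0.6], -0.5))  # 0: -0.5 < 0
```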
5
Q

give a neuron mathematically in matrix notation

A

f(W·X), with the bias absorbed into W via a constant input x_0 = 1

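a minimal sketch of the matrix form, assuming the common convention of folding the bias into W with a constant input x_0 = 1 (all names and numbers are illustrative):

```python
# Matrix form of a layer: y = f(W · X); plain lists stand in for a matrix library.
def matvec(W, X):
    return [sum(w * x for w, x in zip(row, X)) for row in W]

def layer(W, X, f):
    return [f(z) for z in matvec(W, X)]

step = lambda z: 1 if z >= 0 else 0

W = [[-0.5, 0.6, 0.6],   # each row = [bias, w1, w2] for one neuron
     [ 0.2, 1.0, -1.0]]
X = [1.0, 1.0, 0.0]      # x0 = 1 carries the bias, then the real inputs
print(layer(W, X, step))  # [1, 1]
```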
6
Q

give three different activation functions?

A

threshold, sigmoid, softmax

7
Q

what is the activation function used for logistic regression?

A

sigmoid

8
Q

give the sigmoid activation function y(X,W) = ?

A

1 / (1 + e^(-z)), where z = W·X

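a one-function sketch of the sigmoid, with z standing for the weighted input:

```python
import math

def sigmoid(z):
    """Logistic sigmoid: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))  # 0.5
```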
9
Q

when is the softmax activation function used?

A

when we have multiple, mutually exclusive classes

10
Q

softmax is an extension of…?

A

the logistic function

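a sketch of softmax, including its reduction to the logistic function in the two-class case (the max-subtraction is a standard numerical-stability trick, not from the cards):

```python
import math

def softmax(zs):
    """Exponentiate each score and normalise so the outputs sum to 1."""
    m = max(zs)                           # subtracting the max is a standard
    exps = [math.exp(z - m) for z in zs]  # numerical-stability trick
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)  # one probability per mutually exclusive class

# With two classes (z, 0), softmax reduces to the logistic function:
z = 1.3
assert abs(softmax([z, 0.0])[0] - 1.0 / (1.0 + math.exp(-z))) < 1e-12
```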
11
Q

give the equation for gradient descent, w_new = ?

A

w_old - lambda * (dL/dw), where lambda is the learning rate

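the update rule can be sketched on a toy loss L = 0.5(w - 3)^2, whose gradient is dL/dw = w - 3 (loss and learning rate chosen purely for illustration):

```python
# One gradient-descent step: w_new = w_old - lam * grad.
def gd_step(w_old, lam, grad):
    return w_old - lam * grad

w = 0.0
for _ in range(50):
    w = gd_step(w, 0.3, w - 3.0)  # gradient of 0.5 * (w - 3)^2 is w - 3
print(w)  # converges towards the minimum at w = 3
```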
12
Q

give the equation for squared error loss, L = ?

A

0.5(y-t)^2

13
Q

what is backpropagation?

A

the application of the chain rule for neural networks

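backpropagation as the chain rule, sketched for a single sigmoid neuron with the squared-error loss from the previous card; the analytic gradient is checked against a numerical one (all values illustrative):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, x, t):
    """Squared-error loss L = 0.5 * (y - t)^2 for a one-weight sigmoid neuron."""
    y = sigmoid(w * x)
    return 0.5 * (y - t) ** 2

def grad(w, x, t):
    """Chain rule: dL/dw = dL/dy * dy/dz * dz/dw = (y - t) * y * (1 - y) * x."""
    y = sigmoid(w * x)
    return (y - t) * y * (1.0 - y) * x

w, x, t = 0.8, 1.5, 1.0
eps = 1e-6
numeric = (loss(w + eps, x, t) - loss(w - eps, x, t)) / (2 * eps)
print(grad(w, x, t), numeric)  # the two gradients agree
```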
14
Q

what are the two stopping criteria we use for neural networks?

A

maximum number of epochs

early stopping criteria (e.g. stop when validation error begins to rise)

15
Q

what does the learning rate determine?

A

how large an adjustment we make to each weight at each iteration

16
Q

what neural network structure should be sufficient to approximate any function?

A

a multilayer perceptron with one hidden layer

17
Q

what is the advantage of adding more layers to a model, rather than more neurons?

A

increases flexibility with fewer free parameters

18
Q

what are the three approaches to establishing neural network architecture?

A

experimentation, heuristics, pre-trained models (transfer learning)

19
Q

when we add too many layers/neurons to a model, we risk…?

A

overfitting

20
Q

what is bias error?

A

error due to an erroneous assumption in the model

21
Q

what is variance error?

A

error due to the algorithm fitting to noise in the training data

22
Q

what kind of error decreases as we make a model more complex?

A

bias error

23
Q

describe the idea behind a dropout scheme

A

begin with an overly complex model

during training, the output of any individual neuron is ignored with probability p
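the dropout idea can be sketched as follows; the rescaling of survivors by 1/(1-p) ("inverted dropout") is a common convention assumed here, not stated on the card:

```python
import random

def dropout(activations, p, training=True):
    """Zero each activation with probability p during training; rescale the
    survivors by 1 / (1 - p) so the expected activation is unchanged."""
    if not training:
        return list(activations)
    scale = 1.0 / (1.0 - p)
    return [0.0 if random.random() < p else a * scale for a in activations]

random.seed(0)
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5))  # roughly half are silenced
print(dropout([1.0, 1.0, 1.0, 1.0], p=0.5, training=False))  # unchanged at test time
```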

24
Q

what is the traditional test-error curve of simpler models?

A

test error decreases up until the model is sufficiently complex and then increases

25
Q

what is double descent?

A

if we continue to increase the model's complexity (e.g. the number of hidden layers), the test error decreases again

26
Q

what are the two problems that deep(er) neural networks face?

A

long training time

the vanishing gradient problem

27
Q

what is the vanishing gradient problem?

A

the weight updates in the early layers can be extremely close to zero

28
Q

what are the two ways to fix the vanishing gradient problem?

A

the ReLU (rectified linear unit) activation function

skip connections: feed the output of a neuron directly into a later stage of the network

29
Q

describe the ReLU activation function

A

all negative values of y are set to 0; it has a gradient of 1 for positive values of y

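a one-line sketch of ReLU:

```python
def relu(y):
    """ReLU: negative values clamp to 0; positive values pass through unchanged."""
    return max(0.0, y)

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # [0.0, 0.0, 0.0, 1.5]
```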
30
Q

what is the input to a CNN?

A

the raw 2D image

31
Q

how do we represent each layer of a CNN?

A

as a rectangle

32
Q

what is a fully connected layer?

A

each neuron in the current layer is connected to all neurons in the next layer

33
Q

what do the different layers of a CNN learn, and how is this different from a standard MLP? (hint: features)

A

the early layers learn the features that are used for classification, rather than those features needing to be pre-selected by hand

34
Q

give a kernel for horizontal lines

A

 1  1  1
 0  0  0
-1 -1 -1

35
Q

give a kernel for vertical lines

A

1  0  -1
1  0  -1
1  0  -1

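a sketch applying both kernels to a tiny image containing a horizontal edge (a "valid" convolution implemented as correlation, as most CNN libraries do; the image values are illustrative):

```python
H_KERNEL = [[1, 1, 1],
            [0, 0, 0],
            [-1, -1, -1]]
V_KERNEL = [[1, 0, -1],
            [1, 0, -1],
            [1, 0, -1]]

def conv2d(image, kernel):
    """'Valid' convolution: slide the kernel over every fully-contained patch."""
    kh, kw = len(kernel), len(kernel[0])
    return [[sum(kernel[a][b] * image[i + a][j + b]
                 for a in range(kh) for b in range(kw))
             for j in range(len(image[0]) - kw + 1)]
            for i in range(len(image) - kh + 1)]

# Bright top half over a dark bottom half: a horizontal edge.
image = [[1, 1, 1, 1],
         [1, 1, 1, 1],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
print(conv2d(image, H_KERNEL))  # [[3, 3, ...]] - strong response everywhere
print(conv2d(image, V_KERNEL))  # [[0, 0, ...]] - no vertical lines present
```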
36
Q

what is padding?

A

extending the original image in a non-informative way so that the output remains the same size as the input

37
Q

what is a pooling operation?

A

select the (max/min/average) value in a local area

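a sketch of non-overlapping max pooling (patch size and values are illustrative):

```python
def max_pool(fmap, size=2):
    """Non-overlapping max pooling: keep the largest value in each size x size patch."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]), size)]
            for i in range(0, len(fmap), size)]

fmap = [[1, 3, 0, 2],
        [4, 2, 1, 0],
        [0, 1, 5, 6],
        [2, 2, 7, 1]]
print(max_pool(fmap))  # [[4, 2], [2, 7]]
```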
38
Q

what is the effect of pooling, and why do we do it?

A

it makes the model more robust to variations in position

39
Q

what is the flatten layer?

A

it converts the 2D feature map into a 1D array of features

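a one-line sketch of the flatten step:

```python
def flatten(fmap):
    """Concatenate the rows of a 2D feature map into a single 1D list."""
    return [v for row in fmap for v in row]

print(flatten([[1, 2], [3, 4]]))  # [1, 2, 3, 4]
```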
40
Q

what is a recurrent neural network?

A

a network that includes loops in the hidden-layer neurons, which allow it to use historical information

41
Q

on what type of data are recurrent neural networks useful?

A

time-series / sequential data

42
Q

what is a generative adversarial network?

A

a network in which a generator learns to produce data that the discriminator thinks is from the training set

43
Q

what are the two parts of a generative adversarial network?

A

a generator and a discriminator

44
Q

ReLU vs sigmoid?

A

ReLU is faster to compute and avoids the vanishing gradient problem

45
Q

cons of neural networks (2)

A

they do not take spatial or temporal information into account (in their standard form)

they are a black box

46
Q

what is a capsule neural network?

A

a network in which we not only have a representation of the image, but also of its pose