Chapter 8 - Neural Networks Flashcards
what type of neural network has demonstrated above-human-level performance in chess and Go?
deep convolutional neural networks (combined with reinforcement learning and tree search, as in AlphaGo and AlphaZero)
what is AlexNet?
a convolutional neural network that outperformed other models in the 2012 ImageNet challenge
what is a spiking neural network?
a neural network that aims to mimic biological neurons more closely, with neurons communicating through discrete spikes over time
give a neuron mathematically in sum notation, y(x, w) =
f(sum_i (w_i * x_i) + b)
give a neuron mathematically in matrix notation
f(W.X), with the bias b folded into W as the weight on a constant input x_0 = 1
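The two notations describe the same computation. A minimal sketch, assuming a threshold activation and made-up inputs and weights (the function names are illustrative, not from the chapter):

```python
def threshold(z):
    """Threshold activation: fire (1) when the weighted input is non-negative."""
    return 1.0 if z >= 0 else 0.0

def neuron_sum(x, w, b, f):
    """Sum notation: f(sum_i w_i * x_i + b)."""
    return f(sum(wi * xi for wi, xi in zip(w, x)) + b)

def neuron_dot(x, w, b, f):
    """Matrix notation: f(W.X), with the bias absorbed into W.

    A constant input of 1 is prepended to x and the bias b is prepended
    to w, so a single dot product covers both the weights and the bias.
    """
    X = [1.0] + list(x)
    W = [b] + list(w)
    return f(sum(Wi * Xi for Wi, Xi in zip(W, X)))

# Made-up example values (assumptions for illustration only).
x = [0.5, -1.0]
w = [2.0, 1.0]
b = 0.25

out_sum = neuron_sum(x, w, b, threshold)
out_dot = neuron_dot(x, w, b, threshold)
```

Both calls give the same output because the weighted input is identical either way; only the bookkeeping for the bias differs.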
give three different activation functions
threshold, sigmoid, softmax
what is the activation function used for logistic regression?
sigmoid
give the sigmoid activation function, y(X, W) = ?
1 / (1 + e^(-z)), where z = W.X
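As a quick check of the formula, here is a small sketch of the sigmoid in Python. The two-branch form is an implementation choice (an assumption, not something the chapter prescribes) that avoids overflow for large negative z; mathematically both branches equal 1 / (1 + e^(-z)).

```python
import math

def sigmoid(z):
    """Logistic sigmoid, 1 / (1 + e^(-z)), squashing z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form e^z / (1 + e^z)
    # so that math.exp never receives a large positive argument.
    e = math.exp(z)
    return e / (1.0 + e)

midpoint = sigmoid(0.0)      # z = 0 sits exactly halfway between 0 and 1
large_pos = sigmoid(10.0)    # close to 1
large_neg = sigmoid(-10.0)   # close to 0
```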
when is the softmax activation function used?
when we have multiple, mutually exclusive classes
softmax is an extension of…?
the logistic (sigmoid) function, generalized from two classes to many
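One way to see the connection: with two classes and scores [z, 0], the softmax probability of the first class equals sigmoid(z). A minimal sketch with made-up scores (subtracting the maximum score is a common numerical-stability trick and does not change the result):

```python
import math

def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(scores)                           # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Three mutually exclusive classes (scores are made up for illustration).
probs = softmax([2.0, 1.0, 0.1])

# Two classes with scores [z, 0]: the first probability matches sigmoid(z).
z = 1.5
two_class = softmax([z, 0.0])
```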
give the equation for gradient descent, w_new = ?
w_old - lambda * (dL/dw), where lambda is the learning rate
give the equation for squared error loss, L = ?
0.5 * (y - t)^2, where y is the network output and t is the target
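The gradient descent update and the squared error loss fit together as follows. This is a sketch under simple assumptions: a single linear neuron y = w * x with no bias, one made-up training example, and a learning rate of 0.1.

```python
def squared_error(y, t):
    """Squared error loss: 0.5 * (y - t)^2."""
    return 0.5 * (y - t) ** 2

def grad_w(w, x, t):
    """dL/dw for y = w * x: (y - t) * x, by the chain rule."""
    return (w * x - t) * x

# Made-up example: the target t = 4 for input x = 2 is matched by w = 2.
x, t = 2.0, 4.0
w = 0.0
learning_rate = 0.1

losses = []
for _ in range(50):
    losses.append(squared_error(w * x, t))
    # Gradient descent update: w_new = w_old - learning_rate * dL/dw
    w = w - learning_rate * grad_w(w, x, t)
```

With these values each step shrinks the distance between w and 2 by a constant factor, so the loss falls steadily toward zero.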
what is backpropagation?
the application of the chain rule to compute the gradient of the loss with respect to every weight in a neural network
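A minimal sketch of the chain rule at work, assuming a toy network with one input, one sigmoid hidden neuron, one sigmoid output neuron, no biases, and squared error loss. The weights and data are made up, and the numerical gradients are included only as a sanity check.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    h = sigmoid(w1 * x)    # hidden activation
    y = sigmoid(w2 * h)    # output activation
    return h, y

def loss(x, t, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - t) ** 2

def backprop(x, t, w1, w2):
    """Return (dL/dw1, dL/dw2) by applying the chain rule layer by layer."""
    h, y = forward(x, w1, w2)
    delta2 = (y - t) * y * (1 - y)        # dL/dz2 = dL/dy * dy/dz2
    dL_dw2 = delta2 * h                   # dz2/dw2 = h
    delta1 = delta2 * w2 * h * (1 - h)    # dL/dz1 = dL/dz2 * dz2/dh * dh/dz1
    dL_dw1 = delta1 * x                   # dz1/dw1 = x
    return dL_dw1, dL_dw2

x, t, w1, w2 = 1.0, 0.0, 0.5, -0.3
g1, g2 = backprop(x, t, w1, w2)

# Central-difference estimates of the same gradients, for comparison.
eps = 1e-6
num_g1 = (loss(x, t, w1 + eps, w2) - loss(x, t, w1 - eps, w2)) / (2 * eps)
num_g2 = (loss(x, t, w1, w2 + eps) - loss(x, t, w1, w2 - eps)) / (2 * eps)
```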
what are the two stopping criteria we use for neural networks?
maximum number of epochs
early stopping (stop once the validation error stops improving)
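Both criteria can live in the same training loop. The sketch below assumes a common form of early stopping: stop once the validation loss has failed to improve for `patience` epochs in a row. The function name, the `patience` parameter, and the loss values are all hypothetical, and the weight updates themselves are omitted.

```python
def run_training(val_losses, max_epochs, patience):
    """Return (epochs_run, reason) for a recorded sequence of validation losses."""
    best = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for val_loss in val_losses[:max_epochs]:
        epochs_run += 1
        # (a real loop would update the weights here, then measure val_loss)
        if val_loss < best:
            best = val_loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epochs_run, "early stopping"
    # Reached the epoch budget (or ran out of recorded losses).
    return epochs_run, "max epochs"

# Made-up validation losses: improving at first, then getting worse.
made_up_losses = [0.9, 0.7, 0.6, 0.55, 0.56, 0.58, 0.61, 0.65]

stopped_early = run_training(made_up_losses, max_epochs=100, patience=3)
hit_budget = run_training(made_up_losses, max_epochs=3, patience=3)
```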
what does the learning rate determine?
how large an adjustment we make to each weight at each iteration
what neural network structure should be sufficient to approximate any continuous function?
a multilayer perceptron with one hidden layer (given enough hidden neurons)
what is the advantage of adding more layers to a model, rather than more neurons?
increases flexibility with fewer free parameters
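The parameter side of this claim is easy to check by counting weights and biases in a fully connected network. The layer sizes below are made up for illustration, and the count says nothing by itself about how flexible each network is.

```python
def count_parameters(layer_sizes):
    """Weights plus biases of a fully connected network.

    layer_sizes lists the number of units per layer, input first.
    Each pair of adjacent layers contributes n_in * n_out weights
    and n_out biases.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# One wide hidden layer versus two narrower hidden layers.
wide = count_parameters([10, 100, 1])
deep = count_parameters([10, 20, 20, 1])
```

Here the deeper network has roughly half as many free parameters as the wide one.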
what are the three approaches to establishing neural network architecture?
experimentation, heuristics, pre-trained models (transfer learning)
when we add too many layers/neurons to a model, we risk…?
overfitting
what is bias error?
error due to an erroneous assumption in the model
what is variance error?
error due to the algorithm fitting to noise in the training data
what kind of error decreases as we make a model more complex?
bias
describe the idea behind a dropout scheme
begin with an overly complex model
during training, the output of any individual neuron is ignored with probability p
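A minimal sketch of the masking step, applied to one layer's outputs during training. The `rescale` option is an addition not mentioned in the card: it shows the common "inverted dropout" variant, which scales the surviving outputs by 1 / (1 - p) so their expected sum is unchanged.

```python
import random

def dropout(outputs, p, rng, rescale=False):
    """Ignore each neuron's output with probability p (training time only)."""
    result = []
    for o in outputs:
        if rng.random() < p:
            result.append(0.0)              # this neuron's output is ignored
        elif rescale:
            result.append(o / (1.0 - p))    # inverted-dropout variant
        else:
            result.append(o)
    return result

rng = random.Random(0)            # seeded so the run is repeatable
layer_outputs = [1.0] * 1000      # made-up activations
dropped = dropout(layer_outputs, p=0.5, rng=rng)
n_ignored = dropped.count(0.0)    # roughly half of the 1000 outputs
```

At test time no outputs are dropped; the full network is used.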
what is the traditional error curve of simpler models?
test error decreases until the model is sufficiently complex, then increases as the model begins to overfit