IFN580 Week 6 Neural Network (11%) Flashcards
(19 cards)
When training a NN, what is the purpose of the backward pass?
To update the model parameters by stepping in the direction opposite to the
gradient of the loss (i.e. gradient descent).
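A minimal NumPy sketch of one such update step (all values are illustrative):

    import numpy as np

    lr = 0.01                     # learning rate (illustrative value)
    w = np.array([0.5, -0.3])     # current model parameters
    grad = np.array([0.2, -0.1])  # dLoss/dw computed in the backward pass
    w = w - lr * grad             # step AGAINST the gradient to reduce the loss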
What’s the general flow of a NN?
Feed input data into the INPUT LAYER, which passes the raw data to the HIDDEN
LAYER.
HIDDEN LAYER transforms the data through weighted connections and activation functions.
The OUTPUT LAYER receives input from the final hidden layer and generates a final prediction.
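A minimal NumPy sketch of this input → hidden → output flow (layer sizes,
weights, and the sigmoid activation are all illustrative choices):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    x = rng.normal(size=3)           # INPUT LAYER: 3 raw feature values
    W1 = rng.normal(size=(3, 4))     # weighted connections into the HIDDEN LAYER
    h = sigmoid(x @ W1)              # weighted sums passed through an activation
    W2 = rng.normal(size=(4, 1))     # weighted connections into the OUTPUT LAYER
    y_hat = sigmoid(h @ W2)          # final prediction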
What makes a NN a better ML algorithm?
Non-linearity, which is introduced by activation functions.
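A quick NumPy sketch of why this matters: without a non-linear activation,
stacking layers gains nothing, because composed linear layers collapse into a
single linear layer (the matrices here are illustrative):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.normal(size=3)
    W1 = rng.normal(size=(3, 4))
    W2 = rng.normal(size=(4, 2))

    # Two purely linear layers are equivalent to one linear layer...
    assert np.allclose((x @ W1) @ W2, x @ (W1 @ W2))
    # ...so the activation function between layers is what adds expressive power.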
A feed-forward neural network model is said to be fully connected when:
all nodes at one layer are connected to all nodes in the next higher layer.
The input values to a feed-forward neural network must be:
numeric
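This means categorical attributes must be encoded first, e.g. one-hot encoded.
A small sketch using pandas (the data frame is made up for illustration):

    import pandas as pd

    df = pd.DataFrame({"colour": ["red", "green", "red"]})
    # One-hot encode the categorical attribute into numeric 0/1 columns
    encoded = pd.get_dummies(df, columns=["colour"])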
When training a neural network, what does an epoch represent?
one complete pass of the entire training dataset through the network (the
number of epochs is the number of such passes)
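A sketch of where epochs appear in a typical training loop, using a toy linear
model trained by gradient descent just to show the loop structure:

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.normal(size=(100, 3))           # toy training set
    y = X @ np.array([1.0, -2.0, 0.5])      # toy targets
    w = np.zeros(3)                          # model weights
    lr = 0.1

    for epoch in range(10):                  # 10 epochs = 10 full passes over X
        grad = 2 * X.T @ (X @ w - y) / len(X)   # mean-squared-error gradient
        w -= lr * grad                       # weights are updated every pass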
Which of the following supervised learning techniques can produce both numeric
and categorical outputs?
Neural networks
Too many hidden nodes result in:
overfitting
Too few hidden nodes result in:
underfitting
Recall that last week we mentioned that the training process for a neural network
involves passing the training data through the network multiple times (each such
pass is an “epoch”). During each of these training passes, which of the following
statements is true?
Individual network weights are modified.
What happens when a neural network is over-trained?
An over-trained neural network will fail to generalise to the trend of the data,
instead memorising specific values from the training set (aka. “overfitting”).
Why are neural networks referred to as a “universal approximator”?
A neural network with enough hidden nodes can approximate virtually any
continuous function, so a well-trained network can represent a wide variety of
problems, including both classification and regression. These models can also
produce good accuracy even in the presence of noise, errors, missing values, etc.
With decision trees and regression models, we may need to pre-process
features/attributes to enable the model to effectively learn patterns in the training set
(“feature engineering”). Do neural networks still require feature engineering, and
why/why not?
Different layers in a neural network can “create” their own features, learnt
directly from the data. This eliminates the need for feature engineering to some
extent.
Name three common activation functions typically used in neural networks. Which
function trains the fastest?
Sigmoid or logistic function.
Hyperbolic tangent or tanh function.
ReLU or Rectified Linear Unit.
ReLU typically trains the fastest: it is cheap to compute and, unlike sigmoid
and tanh, its gradient does not saturate for positive inputs.
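The three functions written out in NumPy (a sketch; the input z is any array):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

    def tanh(z):
        return np.tanh(z)                 # squashes values into (-1, 1)

    def relu(z):
        return np.maximum(0.0, z)         # 0 for negatives, identity otherwise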
The ______ function is responsible for computing the difference between the
predictions and the training data
loss
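For example, the mean squared error loss (a minimal sketch):

    import numpy as np

    def mse_loss(y_pred, y_true):
        # Mean squared difference between predictions and training targets
        return np.mean((y_pred - y_true) ** 2)

    mse_loss(np.array([0.9, 0.2]), np.array([1.0, 0.0]))  # -> 0.025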
What is the goal of backpropagation?
To compute the gradients of the loss with respect to the weights, propagated
backwards from the output layer through the hidden layers, so the weights can
be updated to minimise the loss function.
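A minimal sketch of backpropagation through a single sigmoid neuron with a
squared-error loss (all values are illustrative):

    import numpy as np

    x, w, b = 0.5, 0.8, 0.1            # input, weight, bias
    y_true = 1.0

    # Forward pass
    z = w * x + b
    y_hat = 1.0 / (1.0 + np.exp(-z))
    loss = (y_hat - y_true) ** 2

    # Backward pass: chain rule from the loss back to the weight
    dloss_dyhat = 2 * (y_hat - y_true)
    dyhat_dz = y_hat * (1 - y_hat)          # derivative of the sigmoid
    dloss_dw = dloss_dyhat * dyhat_dz * x   # dz/dw = x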
Name three common optimisers used for training neural networks
- Adam (adaptive moment estimation)
- Root Mean Square Propagation (RMSProp)
- Stochastic Gradient Descent (SGD)
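All three are available by name in common frameworks; a PyTorch sketch (the
Linear model is a stand-in just to have parameters to optimise):

    import torch

    model = torch.nn.Linear(3, 1)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Alternatives:
    # opt = torch.optim.RMSprop(model.parameters(), lr=1e-3)
    # opt = torch.optim.SGD(model.parameters(), lr=1e-2)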
In K-nearest neighbour classification, _______ values for 𝑘 may result in _______.
small, overfitting
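In scikit-learn, k is the n_neighbors parameter; a small sketch on the iris
dataset (k = 1 is deliberately extreme to illustrate the overfitting risk):

    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    # Small k fits the noise in the training set (overfitting);
    # large k over-smooths the decision boundary (underfitting).
    clf = KNeighborsClassifier(n_neighbors=1).fit(X, y)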
Which of the following are limitations of K-nearest neighbour classification?
It requires storing all of the training data in the model.