21. Artificial Neural Networks 2 Flashcards
(20 cards)
What does an ANN consist of?
A collection of neurons (nodes) organized in layers and connected by weighted links.
What is the structure of a feedforward ANN?
Composed of an input layer, one or more hidden layers, and an output layer.
What is a perceptron?
A single-layer neural network (a single neuron) that produces a binary output from a thresholded weighted sum of its inputs.
How is a perceptron output computed?
Using the activation function y = f(w·x + b), where w is the weight vector, x the input vector, and b the bias.
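For concreteness, here is a minimal sketch of that computation in Python (the AND weights and bias below are illustrative values, not from the cards):

```python
import numpy as np

def perceptron_output(x, w, b):
    """Compute y = step(w.x + b) for a single perceptron."""
    z = np.dot(w, x) + b          # weighted input summation plus bias
    return 1 if z > 0 else 0      # step activation

# Example: a perceptron computing logical AND (illustrative weights/bias)
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron_output(np.array(x), w, b))   # 0, 0, 0, 1
```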
What activation function does a perceptron use?
A step (threshold) function: it outputs 1 if the weighted sum w·x + b exceeds 0, else 0.
What are limitations of a perceptron?
It cannot solve problems that are not linearly separable, such as XOR.
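A small brute-force illustration of this limitation (the weight grid is an arbitrary assumption): no combination of two weights and a bias in the grid lets a single perceptron classify all four XOR cases, while AND is easy:

```python
import itertools
import numpy as np

def step(z):
    return 1 if z > 0 else 0

def fits(table, w1, w2, b):
    return all(step(w1 * x1 + w2 * x2 + b) == y for (x1, x2), y in table)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

grid = np.linspace(-2, 2, 41)   # coarse search over weight/bias values
for name, table in [("AND", AND), ("XOR", XOR)]:
    ok = any(fits(table, w1, w2, b)
             for w1, w2, b in itertools.product(grid, repeat=3))
    print(name, "solvable by a single perceptron:", ok)   # AND: True, XOR: False
```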
What is the difference between perceptron and multi-layer perceptron (MLP)?
An MLP has one or more hidden layers with non-linear activations, allowing it to solve problems that are not linearly separable (such as XOR).
What are the common activation functions in MLP?
Sigmoid, tanh, and ReLU.
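As a quick reference, the three activations in NumPy (a minimal sketch):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # keeps positives, zeroes out negatives

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z), tanh(z), relu(z))
```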
How does learning occur in an MLP?
Via backpropagation, adjusting weights using the gradient of the error.
What is backpropagation?
A supervised learning technique that propagates the output error backwards through the network (via the chain rule) to compute gradients, which gradient descent then uses to minimize the error.
What are key components in backpropagation?
Forward pass, error computation, backward pass, weight update.
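A minimal sketch of these four steps for a one-hidden-layer MLP trained on XOR (the hidden size, learning rate, epoch count, and random seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)        # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)          # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)          # hidden -> output
lr = 0.5                                               # learning rate

for epoch in range(5000):
    # 1. Forward pass
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    # 2. Error computation (squared-error loss)
    error = Y - T
    # 3. Backward pass: chain rule gives gradients layer by layer
    dY = error * Y * (1 - Y)                           # sigmoid derivative at output
    dH = (dY @ W2.T) * H * (1 - H)                     # sigmoid derivative at hidden
    # 4. Weight update (gradient descent)
    W2 -= lr * H.T @ dY;  b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH;  b1 -= lr * dH.sum(axis=0)

print(np.round(Y, 2))   # typically close to [[0], [1], [1], [0]]
```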
What is the role of the learning rate?
Controls how much the weights are adjusted during training.
What happens if the learning rate is too high or too low?
Too high: training becomes unstable (the loss oscillates or diverges); too low: convergence is very slow.
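A toy illustration with gradient descent on f(w) = w², whose gradient is 2w (the step sizes are arbitrary examples): a step above 1.0 makes the iterate grow every update, while a tiny step barely moves it toward the minimum at 0:

```python
def descend(lr, steps=20, w=1.0):
    for _ in range(steps):
        w -= lr * 2 * w          # gradient of w**2 is 2w
    return w

print(descend(1.1))    # too high: |w| grows every step (training diverges)
print(descend(0.001))  # too low: w barely moves (very slow convergence)
print(descend(0.1))    # reasonable: w shrinks steadily toward the minimum
```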
What is overfitting in ANN?
When the model learns training data too well, failing to generalize to new data.
How to prevent overfitting?
Use regularization, dropout, or early stopping.
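A self-contained sketch of early stopping with patience (the validation-loss values are made up to show the improve-then-worsen pattern):

```python
# Simulated validation loss per epoch: improves, then starts to overfit
val_losses = [0.90, 0.62, 0.45, 0.38, 0.35, 0.34, 0.36, 0.39, 0.41, 0.45]

patience = 2
best_loss, best_epoch, bad_epochs = float("inf"), 0, 0
for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
        best_loss, best_epoch, bad_epochs = loss, epoch, 0   # new best checkpoint
    else:
        bad_epochs += 1
        if bad_epochs >= patience:      # validation loss stopped improving
            break

print(f"stop training at epoch {epoch}, keep the weights from epoch {best_epoch}")
```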
What is the vanishing gradient problem?
When gradients become too small during backpropagation, slowing or stopping learning.
Which activation function is prone to vanishing gradients?
Sigmoid and tanh.
Which activation function helps reduce vanishing gradients?
ReLU (Rectified Linear Unit).
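A quick numeric illustration (a made-up 10-layer chain that ignores the weight factors): the sigmoid derivative is at most 0.25, so multiplying ten such factors during backpropagation shrinks the gradient drastically, whereas the ReLU derivative is 1 for positive inputs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = 0.0
sig_grad = sigmoid(z) * (1 - sigmoid(z))   # 0.25, the sigmoid derivative's maximum
relu_grad = 1.0                            # ReLU derivative for any positive input

layers = 10
print(sig_grad ** layers)    # ~9.5e-07: the gradient has effectively vanished
print(relu_grad ** layers)   # 1.0: the gradient magnitude is preserved
```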
What is the purpose of bias in a neuron?
It shifts the activation function left or right, letting the neuron adjust its firing threshold independently of the input values.
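A tiny illustration with a step-activated neuron (assumed numbers): changing only the bias shifts the input value at which the neuron starts firing:

```python
def fires(x, w=1.0, b=0.0):
    return 1 if w * x + b > 0 else 0

print([fires(x, b=0.0) for x in [-1, 0, 1, 2, 3]])   # [0, 0, 1, 1, 1]: fires once x > 0
print([fires(x, b=-2.0) for x in [-1, 0, 1, 2, 3]])  # [0, 0, 0, 0, 1]: threshold shifted to x > 2
```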
What is the universal approximation theorem?
A feedforward network with a single hidden layer and a non-linear activation can approximate any continuous function on a compact domain to arbitrary accuracy, given sufficiently many hidden neurons.