Exam Flashcards
Questions for the exam (62 cards)
What is the basic structure of a Neuron (Perceptron) in ANN?
A Neuron (Perceptron) consists of inputs, weights, a bias, an activation function, and an output.
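As a minimal sketch of that structure, the snippet below wires inputs, weights, a bias, and a step activation into one output. The function name `perceptron` and the hand-picked AND-gate weights are assumptions chosen for illustration; they are not learned values.

```python
def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if z > 0 else 0

# Hand-picked (not learned) parameters that make the neuron behave like a logical AND.
and_weights = [1.0, 1.0]
and_bias = -1.5

and_outputs = [perceptron([a, b], and_weights, and_bias) for a in (0, 1) for b in (0, 1)]
print(and_outputs)  # [0, 0, 0, 1]
```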
What is the purpose of an activation function in Neural Networks?
The activation function determines whether a neuron should be activated or not, introducing non-linearity into the model.
An activation function decides how strongly each neuron fires for a given input. Its main job is to introduce non-linearity: without it, stacked layers collapse into a single linear transformation, and the network could not recognize complex patterns.
What are the steps involved in training a Neural Network?
The steps include initializing weights, feeding input data, calculating the output, computing the loss, computing gradients with backpropagation, and updating the weights with an optimizer such as Gradient Descent.
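Those steps can be sketched on a one-weight, one-bias linear model in plain Python. The toy data (drawn from y = 2x + 1), the learning rate, and the epoch count are assumptions made for this example only.

```python
# Toy data generated from y = 2x + 1 (an assumption made for this sketch).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0   # step 1: initialize the parameters
lr = 0.05         # learning rate (hypothetical value)
losses = []

for epoch in range(2000):
    preds = [w * x + b for x in xs]                  # steps 2-3: feed inputs, compute output
    errors = [p - y for p, y in zip(preds, ys)]
    loss = sum(e * e for e in errors) / len(xs)      # step 4: compute the loss (MSE)
    losses.append(loss)
    # step 5: gradients of the loss with respect to w and b, then the update
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2 * sum(errors) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))
```

With a single linear unit the gradients can be written by hand; in a multi-layer network, backpropagation computes them layer by layer.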
How can we assess the performance of our model?
Performance can be assessed using metrics such as accuracy, precision, recall, F1 score, and loss.
Model performance is assessed using evaluation metrics, loss functions, and validation techniques. The choice of metric depends on whether the task is classification, regression, or clustering.
✅ Classification → Accuracy, Precision, Recall, F1-score.
✅ Regression → MSE, MAE, R² Score.
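For the classification metrics, a small sketch that counts true/false positives and negatives for a binary task may help. The helper name `classification_metrics` and the example labels are made up for illustration.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```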
Can you highlight the differences between Batch Gradient Descent and Stochastic Gradient Descent in the context of Machine Learning?
Both try to minimize the loss, i.e., to find a point where the gradient of the loss function is 0.
Batch Gradient Descent uses the entire dataset to compute gradients, while Stochastic Gradient Descent updates weights using one sample at a time.
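Batch Gradient Descent is computationally costly when working with a large dataset.
Stochastic Gradient Descent gives cheaper but noisier updates; its mini-batch variant computes the gradient on a small subset of the data at each iteration.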
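The difference in update frequency can be sketched with one training function whose `batch_size` argument switches between the two methods. The toy data (y = 3x), the learning rate, and the helper names `grad` and `train` are assumptions for this example. Because the data is noise-free, both variants reach the same weight here; on real data the per-sample updates would be noisier.

```python
import random

# Toy data from y = 3x (assumed for this sketch); the model is y_hat = w * x.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [3 * x for x in xs]

def grad(w, batch):
    """Gradient of the mean squared error over the samples in `batch`."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def train(batch_size, epochs=50, lr=0.05, seed=0):
    rng = random.Random(seed)
    data = list(zip(xs, ys))
    w, updates = 0.0, 0
    for _ in range(epochs):
        rng.shuffle(data)
        for i in range(0, len(data), batch_size):
            w -= lr * grad(w, data[i:i + batch_size])
            updates += 1
    return w, updates

w_batch, n_batch = train(batch_size=len(xs))  # batch GD: one update per epoch
w_sgd, n_sgd = train(batch_size=1)            # SGD: one update per sample
print(n_batch, n_sgd)  # 50 200
```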
Which method is commonly used to determine optimal values for parameters like weights and biases in a Neural Network?
Gradient Descent is commonly used to determine optimal values for weights and biases.
The Gradient Descent algorithm is commonly used to optimize weights and biases in a neural network by minimizing the loss function.
✅ Key Idea – Adjust weights and biases iteratively to minimize the error between predictions and actual values.
✅ Uses Backpropagation – Computes gradients of the loss function with respect to parameters.
What is a loss function, and why is it important?
A loss function quantifies how well the model’s predictions match the actual data, guiding the optimization process.
What role do hyperparameters play in a Neural Network?
Hyperparameters control the learning process, including learning rate, batch size, and number of layers.
We don't know the best hyperparameters beforehand; there is always a trade-off between speed, accuracy, and generalization.
We need to balance these hyperparameters rather than just maxing everything out.
What are the parameters of a Neural Network?
Parameters include weights and biases that are learned during training.
How should you select the suitable format of a neural network (MLP, RNN, CNN, GNN) for a project?
The right neural network architecture depends on the type of data you're working with and the problem you're solving.
Multi-Layer Perceptron (MLP) - Tabular/Structured Data (e.g., Spreadsheets, Financial Data)
Convolutional Neural Network (CNN) - Image Data
Recurrent Neural Network (RNN) - Sequential Data (Text, Time-Series, Audio)
Graph Neural Network (GNN) - Graph-Structured Data (Networks, Social Relationships, Molecules, Knowledge Graphs)
How do you select the most suitable setting for the loss function in ANN?
Selecting the right loss function for an Artificial Neural Network (ANN) depends on the type of task, output activation function, and data distribution.
✅ Classification → Use Cross-Entropy Loss (Binary or Categorical).
✅ Regression → Use MSE, MAE, or Huber Loss.
✅ Imbalanced Data → Use Weighted Loss Functions.
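Two of these losses are sketched below in plain Python: binary cross-entropy with an optional weight on the positive class (one simple way to handle imbalance) and the Huber loss. The parameter names `pos_weight` and `delta` are choices made for this example, not a reference to any particular library's signature.

```python
import math

def binary_cross_entropy(y_true, p_pred, pos_weight=1.0, eps=1e-12):
    """Mean binary cross-entropy; pos_weight > 1 penalizes missed positives more."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def huber(y_true, y_pred, delta=1.0):
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        r = abs(y - p)
        total += 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
    return total / len(y_true)

print(binary_cross_entropy([1], [0.5]))                  # ln 2
print(binary_cross_entropy([1], [0.5], pos_weight=2.0))  # 2 * ln 2
print(huber([0.0], [0.5]), huber([0.0], [3.0]))          # 0.125 2.5
```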
What exactly is Gradient Descent?
Gradient Descent is an optimization algorithm used to minimize the loss function by iteratively adjusting parameters.
What does Mean Squared Error (MSE) tell us in machine learning?
MSE measures the average squared difference between predicted and actual values. It indicates how far the model's predictions are from the targets: the lower the MSE, the better the fit.
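A short sketch with made-up numbers shows one consequence of the squaring: a single large error costs more than several small errors with the same total absolute error.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

targets = [0.0, 0.0, 0.0, 0.0]
one_big_error = [4.0, 0.0, 0.0, 0.0]      # total absolute error = 4
four_small_errors = [1.0, 1.0, 1.0, 1.0]  # total absolute error = 4

print(mse(targets, one_big_error))      # 4.0
print(mse(targets, four_small_errors))  # 1.0
```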
How does backpropagation work in Neural Networks?
Backpropagation is the core algorithm used to train Neural Networks (NNs). It enables the model to learn by adjusting its weights and biases according to how much each one contributed to the prediction error.
Backpropagation calculates gradients of the loss function with respect to each weight by applying the chain rule, allowing weights to be updated.
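The chain rule can be traced by hand on a tiny network. The sketch below assumes a hypothetical two-weight model, y_hat = w2 * tanh(w1 * x), with a squared-error loss, and checks the hand-derived gradients against finite differences.

```python
import math

def forward(w1, w2, x, y):
    h = math.tanh(w1 * x)    # hidden activation
    y_hat = w2 * h           # network output
    loss = (y_hat - y) ** 2  # squared error
    return h, y_hat, loss

def backward(w1, w2, x, y):
    """Gradients of the loss w.r.t. w1 and w2, obtained with the chain rule."""
    h, y_hat, _ = forward(w1, w2, x, y)
    dloss_dyhat = 2 * (y_hat - y)
    grad_w2 = dloss_dyhat * h              # dL/dw2 = dL/dy_hat * dy_hat/dw2
    dloss_dh = dloss_dyhat * w2            # propagate the error back to the hidden unit
    grad_w1 = dloss_dh * (1 - h ** 2) * x  # tanh'(z) = 1 - tanh(z)^2, and dz/dw1 = x
    return grad_w1, grad_w2

def numeric_grad(f, v, eps=1e-6):
    """Central finite-difference approximation of df/dv."""
    return (f(v + eps) - f(v - eps)) / (2 * eps)

w1, w2, x, y = 0.5, -1.2, 0.8, 1.0  # arbitrary example values
g1, g2 = backward(w1, w2, x, y)
n1 = numeric_grad(lambda v: forward(v, w2, x, y)[2], w1)
n2 = numeric_grad(lambda v: forward(w1, v, x, y)[2], w2)
print(abs(g1 - n1) < 1e-6, abs(g2 - n2) < 1e-6)  # True True
```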
Explain forward pass and backward pass in the ANN training process.
The forward pass computes the output from the inputs, while the backward pass propagates the error from the output back through the network to compute the gradients used to update the weights.
Why is it important to split data into training and testing sets in machine learning?
Splitting data lets us evaluate model performance on unseen data, which makes it possible to detect overfitting.
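A minimal sketch of such a split, assuming a shuffled 80/20 partition; the helper name `train_test_split` and its parameters are written for this example and only imitate the common convention.

```python
import random

def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and hold out a fraction for testing."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

The model is fitted on `train` only; `test` is touched once, at the end, to estimate performance on data the model has never seen.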
What is the difference between binary, multi-class, and multi-label classification? Also, explain which activation function is best suited for each.
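Binary classification predicts one of two classes, multi-class classification predicts exactly one of several classes, and multi-label classification predicts any number of labels per sample. For the output layer, Sigmoid suits binary classification, Softmax suits multi-class classification (usually paired with a cross-entropy loss), and Sigmoid applied independently to each label suits multi-label classification.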
Classification tasks can be divided into three types based on the number and nature of output categories:
✅ Binary Classification → Two possible classes (e.g., Spam vs. Not Spam).
✅ Multi-Class Classification → More than two classes, but only one label per sample (e.g., Dog, Cat, or Bird).
✅ Multi-Label Classification → Each sample can belong to multiple categories (e.g., an image containing both a Dog and a Car).
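The difference between the two output activations can be sketched on one set of raw scores (logits). The logit values below are made up for illustration.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def softmax(zs):
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, -1.0]  # hypothetical scores for three categories

# Multi-class: probabilities compete and sum to 1, so exactly one class is chosen.
multi_class = softmax(logits)
predicted_class = multi_class.index(max(multi_class))

# Multi-label: each score is squashed independently and thresholded on its own.
multi_label = [sigmoid(z) for z in logits]
labels = [p > 0.5 for p in multi_label]

print(predicted_class)  # 0
print(labels)           # [True, True, False]
```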
Which loss function is best suited for regression and classification?
Mean Squared Error is best for regression, while Cross-Entropy is best for classification.
Which activation function best suits the input layer in an MLP?
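The input layer itself only passes the features on and applies no activation; ReLU is the usual choice for the hidden layers that follow it in an MLP.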
Which activation function best suits the input layer in a CNN?
It depends on the task: ReLU is the typical choice after convolutional layers, while Sigmoid or Softmax is used at the output layer.
Which activation function best suits the input layer in an RNN?
The Tanh activation function is often used in the recurrent layers of an RNN, which process the input sequence directly.
What is the role of the learning rate in Gradient Descent?
The learning rate (α) controls the step size of weight updates in Gradient Descent. It determines how quickly the model learns by adjusting its parameters to minimize the loss function.
The learning rate determines the size of the steps taken towards the minimum of the loss function.
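The effect of the step size can be sketched on the simplest possible loss, f(w) = w², whose gradient is 2w. The three learning rates below are arbitrary values picked to show slow convergence, good convergence, and divergence.

```python
def descend(lr, steps=20, w=5.0):
    """Run gradient descent on f(w) = w**2, whose gradient is 2 * w."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

too_small = descend(0.01)  # still far from the minimum at 0 after 20 steps
reasonable = descend(0.1)  # close to the minimum
too_large = descend(1.1)   # every step overshoots and the iterate grows

print(abs(too_small) > abs(reasonable), abs(too_large) > 5.0)  # True True
```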
Using PyTorch, what is the procedure for constructing a Neural Network encompassing various layers, including input, hidden, and output?
The procedure involves defining a class, initializing layers in the constructor, and implementing the forward method to define the forward pass.
In PyTorch, a Neural Network is built using the torch.nn.Module class, which includes input, hidden, and output layers. The key steps are:
✅ Step 1: Import necessary libraries.
✅ Step 2: Define the neural network architecture using torch.nn.Module.
✅ Step 3: Initialize weights and activation functions.
✅ Step 4: Define the forward pass.
✅ Step 5: Instantiate the model and set up loss and optimizer.
✅ Step 6: Train the model using forward and backward propagation.
💡 Example: A simple feedforward neural network for classification can be implemented in PyTorch following these steps.
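One possible sketch of those six steps is shown below. The class name `SimpleNet`, the layer sizes, the random placeholder data, and the hyperparameters are all assumptions made for illustration; a real project would substitute its own dataset and settings.

```python
# Step 1: import the libraries
import torch
from torch import nn

# Steps 2-4: define the architecture, its layers, and the forward pass
class SimpleNet(nn.Module):
    def __init__(self, in_features=4, hidden=8, num_classes=3):
        super().__init__()
        self.hidden = nn.Linear(in_features, hidden)  # input -> hidden
        self.output = nn.Linear(hidden, num_classes)  # hidden -> output

    def forward(self, x):
        x = torch.relu(self.hidden(x))
        return self.output(x)  # raw logits; CrossEntropyLoss applies log-softmax itself

# Step 5: instantiate the model, the loss, and the optimizer
torch.manual_seed(0)
model = SimpleNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Random placeholder data: 32 samples, 4 features, 3 classes
X = torch.randn(32, 4)
y = torch.randint(0, 3, (32,))

# Step 6: train with forward and backward passes
losses = []
for epoch in range(20):
    optimizer.zero_grad()      # clear gradients from the previous step
    logits = model(X)          # forward pass
    loss = criterion(logits, y)
    loss.backward()            # backward pass: compute gradients
    optimizer.step()           # update weights and biases
    losses.append(loss.item())

print(losses[0], losses[-1])
```

`nn.Linear` initializes its own weights, so Step 3 needs no explicit code unless a custom initialization is wanted.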
What strategies can be employed to mitigate the issue of overfitting in a complex neural network?
Strategies include using dropout, regularization, early stopping, and data augmentation.
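Of these strategies, early stopping is the easiest to sketch without a framework: training halts once the validation loss has failed to improve for a set number of epochs. The helper name `early_stopping_epoch`, the `patience` value, and the validation losses below are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, bad_epochs = loss, 0  # improvement: reset the counter
        else:
            bad_epochs += 1             # no improvement this epoch
            if bad_epochs >= patience:
                return epoch
    return len(val_losses) - 1          # never triggered: ran to the end

# Validation loss improves for three epochs, then starts rising (overfitting).
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.70]
stop = early_stopping_epoch(val_losses)
print(stop)  # 4
```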