Pikachu - Menti questions from previous exams Flashcards
(25 cards)
Which is a loss function used in classification tasks?
Mean Squared Error
Cross Entropy Loss
Cross Validation
Cross Entropy Loss
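As a rough illustration of why cross-entropy suits classification, the sketch below computes the loss for one example from a predicted probability distribution. The function name `cross_entropy` and the example probabilities are made up for illustration and do not come from the cards.

```python
import math

def cross_entropy(probs, target_index):
    """Negative log of the probability the model assigned to the true class."""
    return -math.log(probs[target_index])

# The true class is index 1 in both cases.
confident = cross_entropy([0.1, 0.8, 0.1], target_index=1)  # model is fairly sure
unsure = cross_entropy([0.4, 0.3, 0.3], target_index=1)     # model is mostly wrong

print(confident, unsure)
```

The loss is small when the true class gets high probability and grows as that probability shrinks.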
Which techniques are used to prevent overfitting? Select all that apply.
L1 regularisation
Dropout
Increase number of layers
Weight decay
L1 regularisation
Dropout
Weight decay
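A minimal sketch of two of these techniques, assuming plain lists of weights: an L1 penalty added to the loss, and weight decay applied inside a gradient step. The helper names and the constants are hypothetical.

```python
def l1_penalty(weights, lam):
    """L1 regularisation term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def sgd_step_with_weight_decay(weights, grads, lr, weight_decay):
    """One SGD step where each weight is also pulled towards zero."""
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]

penalty = l1_penalty([1.0, -2.0, 3.0], lam=0.1)
# With zero gradients, only the decay term acts, so the weights shrink.
decayed = sgd_step_with_weight_decay([1.0, -2.0], [0.0, 0.0], lr=0.1, weight_decay=0.5)
print(penalty, decayed)
```

Both terms discourage large weights, which limits how closely the model can fit noise in the training data.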
What is the purpose of a convolutional layer in a CNN?
Learn spatial properties by applying filters to input data.
Reduce the dimensions of input data.
Convert input data to a probability distribution.
Compute the difference between predicted and actual values.
Learn spatial properties by applying filters to input data.
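To make "applying filters" concrete, here is a small pure-Python sketch that slides a 2x2 filter over a 4x4 image without padding. In a real CNN the filter values are learned; the hand-written edge filter and the name `conv2d` are only assumptions for this example.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over the image and sum the elementwise products."""
    k = len(kernel)
    out = []
    for i in range(0, len(image) - k + 1, stride):
        row = []
        for j in range(0, len(image[0]) - k + 1, stride):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out

# Dark left half, bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Responds where brightness rises from left to right.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d(image, kernel)
print(feature_map)
```

The output is non-zero only in the column where the vertical edge sits, which is the kind of spatial pattern a convolutional layer picks up.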
Why do we use data augmentation?
To improve quality of training data
Prevent overfitting by adding variations to training data
To reduce the size of training data and speed up training
Convert data to a format compatible with deep learning
Prevent overfitting by adding variations to training data
What are hyperparameters in deep learning?
Predefined weights and biases that remain constant during training
Adjustable parameters that affect model architecture and learning
Input data used to train the model
The model predictions
Adjustable parameters that affect model architecture and learning
What is the purpose of activation functions?
Reduce training speed
Improve performance of the optimisation algorithm
Adding non-linearities to the model
Reducing the model’s complexity
Adding non-linearities to the model
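A short sketch of why the non-linearity matters, using scalar "layers" with made-up weights: two linear layers stacked without an activation collapse into one linear layer, while putting ReLU between them does not.

```python
def relu(x):
    return max(0.0, x)

def two_linear_layers(x, w1=2.0, b1=-1.0, w2=3.0, b2=0.5):
    """Two linear layers with no activation in between."""
    return w2 * (w1 * x + b1) + b2

def collapsed_layer(x):
    """The same function as one linear layer: (w2*w1)*x + (w2*b1 + b2)."""
    return 6.0 * x - 2.5

def with_relu(x, w1=2.0, b1=-1.0, w2=3.0, b2=0.5):
    """Same weights, but with ReLU between the layers."""
    return w2 * relu(w1 * x + b1) + b2

for x in (-1.0, 0.0, 2.0):
    print(x, two_linear_layers(x), collapsed_layer(x), with_relu(x))
```

Without the activation the extra layer adds no expressive power; with it, the output is flat for small inputs and rising for large ones, which no single linear layer can reproduce.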
What is image segmentation?
A method of dividing an image into regions based on similarity
A method of improving picture contrast
A method of reducing image noise
A method of dividing an image into regions based on similarity
What is a language model, in the context of deep learning?
A model assigning a vector of numbers to each word in a sentence
A model predicting missing words in a sentence (e.g. the next)
A model used to compare the grammatical structure of two sentences.
A model predicting missing words in a sentence (e.g. the next)
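The prediction task can be illustrated with a toy model that only counts which word follows which. This is not a deep learning model (a neural language model learns the prediction from data with a network), and the corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "a dog sat on the rug",
]
model = train_bigram_model(corpus)
print(predict_next(model, "the"), predict_next(model, "sat"))
```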
What is the purpose of using a pretrained model in early stages of a deep learning project?
Reduce the amount of training data needed
Increase the model’s complexity
Validating the model’s performance on new data
Teaching students basic programming
Reduce the amount of training data needed
What is a characteristic of deep learning models compared to traditional methods?
Using more data for training
The ability to do linear computations more effectively
The use of multi-layer artificial neural networks to learn patterns
Stricter requirements on type and quality of input data
The use of multi-layer artificial neural networks to learn patterns
What are two main challenges when running a deep learning model in production? Select both that apply.
Overfitting to training data
Poor ability to process large amounts of data
Data drift, where new data differs from training data
Too high precision
Overfitting to training data
Data drift, where new data differs from training data
How is a grayscale image represented on a computer? How about a color image?
Grayscale images are 2D arrays; color images are 3D arrays
Both are represented as a single vector of pixel values
Grayscale images are single vectors; color images are 3D arrays
Both are represented as 3D arrays of pixels
Grayscale images are 2D arrays; color images are 3D arrays
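A small sketch with nested lists shows the shapes involved. A grayscale image needs one intensity per pixel (height x width), a color image adds a channel axis (height x width x 3 for RGB), and some libraries also give grayscale images a channel axis of length 1. The pixel values and the helper `shape` are illustrative only.

```python
def shape(array):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(array, list):
        dims.append(len(array))
        array = array[0]
    return tuple(dims)

# 2 x 2 grayscale image: one intensity (0-255) per pixel.
gray = [[0, 128],
        [255, 64]]

# 2 x 2 color image: three values (red, green, blue) per pixel.
color = [[[255, 0, 0], [0, 255, 0]],
         [[0, 0, 255], [255, 255, 255]]]

# The same grayscale image with an explicit single channel.
gray_with_channel = [[[value] for value in row] for row in gray]

print(shape(gray), shape(color), shape(gray_with_channel))
```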
What best describes gradient descent in deep learning?
A method of speeding up model predictions
An optimisation algorithm for minimising a loss function
A method of balancing the weights between layers of a neural network
A method for selecting the most important features in input data
An optimisation algorithm for minimising a loss function
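As a minimal sketch, gradient descent on a one-parameter loss looks like this. The loss (w - 3)^2, the starting point, and the function name are assumptions chosen to keep the example small.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    w = start
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w

# loss(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(w_min)
```

In a neural network the same update is applied to every weight, with the gradients supplied by backpropagation.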
What is stride in a convolutional layer in a CNN?
Number of pixels the filter is moved for each time it is applied
Number of times a filter is applied to input data
Width of the filters used
Number of filters in a convolutional layer
Number of pixels the filter is moved for each time it is applied
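The effect of stride on the output size can be sketched with the usual formula for one spatial dimension. The function name and the example sizes are illustrative.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of positions the filter visits along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# A 3-wide filter over a 7-wide input.
print(conv_output_size(7, 3, stride=1))  # filter moves one pixel at a time
print(conv_output_size(7, 3, stride=2))  # filter skips every other position
```

A larger stride means fewer filter positions and therefore a smaller feature map.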
What is tokenisation, and why do we need it?
Conversion of text into phonemes
Dividing text into sentences for doing structural analysis
Breaking text into smaller pieces (subwords) for further processing
Encryption of text for data protection
Breaking text into smaller pieces (subwords) for further processing
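One simple way to picture subword tokenisation is greedy longest-match against a fixed vocabulary. Real tokenisers learn their vocabulary from data; the tiny vocabulary and the `tokenize` function below are hypothetical.

```python
def tokenize(word, vocab):
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                tokens.append(piece)
                start = end
                break
        else:
            # No known piece starts here: fall back to a single character.
            tokens.append(word[start])
            start += 1
    return tokens

vocab = {"token", "is", "ation", "un", "break", "able"}
print(tokenize("tokenisation", vocab))
print(tokenize("unbreakable", vocab))
```

Each piece can then be mapped to an integer ID, which is the form a model can actually process.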
What does epoch mean, in the context of neural network training?
Number of layers in the network
One complete pass through of the entire training dataset
Number of neurons in a layer
The time taken to train the model
One complete pass through of the entire training dataset
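The relation between epochs, batches, and weight updates can be sketched with a plain loop. The dataset of ten numbers and the helper `iterate_batches` are made up; the weight update itself is left out.

```python
def iterate_batches(data, batch_size):
    """Yield consecutive slices of the dataset."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]

dataset = list(range(10))
epochs = 3
updates = 0
seen_in_first_epoch = []

for epoch in range(epochs):
    for batch in iterate_batches(dataset, batch_size=4):
        updates += 1  # one weight update per batch would happen here
        if epoch == 0:
            seen_in_first_epoch.extend(batch)

print(updates, seen_in_first_epoch)
```

Each epoch visits every training example once, here in three batches of sizes 4, 4, and 2.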
What is learning rate?
The percentage of data reserved for validation
The step size used for updating weights during backpropagation
The rate at which training examples are fed into the network
The number of training epochs required to achieve convergence
The step size used for updating weights during backpropagation
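The sketch below runs the same update rule with three step sizes on the toy loss (w - 3)^2. The loss and the specific rates are assumptions picked to show the three typical behaviours.

```python
def run(learning_rate, steps=20, w=0.0):
    """Apply `steps` gradient updates to loss(w) = (w - 3)^2."""
    for _ in range(steps):
        w -= learning_rate * 2 * (w - 3)
    return w

too_small = run(0.001)   # barely moves away from the start
reasonable = run(0.1)    # gets close to the minimum at w = 3
too_large = run(1.1)     # overshoots further on every step

print(too_small, reasonable, too_large)
```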
Which activation function is most used to avoid the problem of vanishing gradients?
Sigmoid
Hyperbolic tangent (tanh)
Rectified linear unit (ReLU)
Softmax
Rectified linear unit (ReLU)
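A rough way to see the difference: backpropagation multiplies local derivatives across layers. The sigmoid derivative is at most 0.25, so the product shrinks quickly, while the ReLU derivative is 1 for positive inputs. The ten-layer chain below is a simplification that ignores the weights.

```python
import math

def sigmoid_grad(x):
    s = 1 / (1 + math.exp(-x))
    return s * (1 - s)

def relu_grad(x):
    return 1.0 if x > 0 else 0.0

layers = 10
# Best case for sigmoid: every unit sits at x = 0, where the derivative peaks at 0.25.
sigmoid_chain = sigmoid_grad(0.0) ** layers
# Active ReLU units pass the gradient through unchanged.
relu_chain = relu_grad(1.0) ** layers

print(sigmoid_chain, relu_chain)
```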
What is the primary purpose of applying dropout?
To accelerate training by reducing the number of trainable parameters
To improve performance by focusing on the most important features
To prevent overfitting by randomly deactivating neurons in training
To increase the network’s capacity by adding more layers dynamically
To prevent overfitting by randomly deactivating neurons in training
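A minimal sketch of dropout on a list of activations, assuming the common "inverted" variant where the kept values are scaled up during training so that nothing needs to change at inference time. The function signature is hypothetical.

```python
import random

def dropout(activations, p=0.5, training=True, rng=random):
    """Zero each activation with probability p during training."""
    if not training or p == 0:
        return list(activations)
    keep = 1 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded only to make the example repeatable
train_out = dropout([1.0] * 1000, p=0.5, rng=rng)
eval_out = dropout([1.0, 2.0, 3.0], p=0.5, training=False)

print(train_out.count(0.0), eval_out)
```

Roughly half of the units are switched off on each training pass, so the network cannot rely on any single neuron.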
Which of the following statements about batch size in neural network training are correct? Choose all that apply.
Smaller batch sizes -> noisier gradient updates -> escape local minima
Larger size requires more memory, but gives stable gradient updates
Smaller batch size always reduces training time
Larger batch sizes make better use of hardware acceleration
Smaller batch sizes -> noisier gradient updates -> escape local minima
Larger size requires more memory, but gives stable gradient updates
Larger batch sizes make better use of hardware acceleration
In a CNN, what is the purpose of the pooling layer?
Reduce dimensionality of feature maps while preserving important info
To introduce non-linearity into the model
To increase the number of trainable parameters in the network
To normalize the feature maps to a standard scale
Reduce dimensionality of feature maps while preserving important info
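Max pooling is one common form of pooling. The sketch below keeps the largest value in each non-overlapping 2x2 block; the feature map values and the name `max_pool2d` are illustrative.

```python
def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    out = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            row.append(max(feature_map[i + a][j + b]
                           for a in range(size) for b in range(size)))
        out.append(row)
    return out

feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]
pooled = max_pool2d(feature_map)
print(pooled)
```

The 4x4 map becomes 2x2 while the strongest response in each region is kept.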
Which of the following are common techniques used in data augmentation for images? Select all that apply.
Random cropping
Rotation
Feature scaling
Horizontal flipping
Random cropping
Rotation
Horizontal flipping
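These three augmentations can be sketched on a tiny image stored as a nested list. Real pipelines use an image library and rotate by arbitrary angles; the 90 degree rotation and the helper names here are simplifying assumptions.

```python
import random

def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def random_crop(image, size, rng=random):
    """Cut out a random size x size window."""
    top = rng.randint(0, len(image) - size)
    left = rng.randint(0, len(image[0]) - size)
    return [row[left:left + size] for row in image[top:top + size]]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

flipped = horizontal_flip(image)
rotated = rotate90(image)
cropped = random_crop(image, size=2, rng=random.Random(0))
print(flipped, rotated, cropped)
```

Each call produces a new variant of the same picture, so the model sees more variation without new labelled data.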
Which of the following statements about loss functions in neural networks are correct? Select all that apply.
Mean squared error (MSE) loss is commonly used for regression tasks
Cross-entropy loss is used exclusively for binary classification
Loss functions are only used during evaluation of model performance
Loss functions measure how well model predictions align with truth
Mean squared error (MSE) loss is commonly used for regression tasks
Loss functions measure how well model predictions align with truth
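For the regression case, mean squared error can be written in a few lines. The predictions and targets below are invented numbers.

```python
def mse(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

perfect = mse([1.0, 2.0], [1.0, 2.0])
off = mse([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
print(perfect, off)
```

The loss is zero when predictions match the truth exactly and grows with the size of the errors.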
Which of the following techniques can be used to interpret and visualize the predictions of a CNN? Select all that apply.
Grad-CAM (Gradient-weighted Class Activation Mapping)
Feature maps from intermediate layers
Batch Normalization
Data Augmentation
Grad-CAM (Gradient-weighted Class Activation Mapping)
Feature maps from intermediate layers