DL test 2023 Flashcards
(33 cards)
You have to choose an activation function for a fully connected nn. Which of the following is most likely to lead to dead neurons?
Rectified Linear Unit
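A minimal sketch of why ReLU units can "die", assuming a hypothetical single neuron with hand-picked weight and bias (all names and values are illustrative): once the pre-activation is negative for every input, the ReLU gradient is zero everywhere and gradient descent can no longer update the unit.

```python
def relu(z):
    return max(0.0, z)

def relu_grad(z):
    # Derivative of ReLU: 1 for positive pre-activations, 0 otherwise.
    return 1.0 if z > 0 else 0.0

# Hypothetical neuron with a large negative bias (illustrative values).
w, b = 0.5, -10.0
inputs = [0.0, 1.0, 2.0, 3.0]

# Forward outputs and the gradient of the output w.r.t. w for each input.
outputs = [relu(w * x + b) for x in inputs]
grads = [relu_grad(w * x + b) * x for x in inputs]

# The neuron is "dead" on this data: no input produces a non-zero gradient.
is_dead = all(g == 0.0 for g in grads)
```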
At what stage of the life of a nn model is backpropagation used?
Training
Which one is true? 1. The final activation in CBOW is the logistic sigmoid function since we only predict one word, 2. CBOW predicts words based only on the previous words, 3. CBOW and Skipgram ignore the order of words, 4. Skipgram predicts the context based on a word
CBOW and Skipgram ignore the order of words, Skipgram predicts the context based on a word
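The order-independence of CBOW can be illustrated with a small sketch. It assumes the common formulation in which the context word vectors are averaged before prediction; the embedding table below is hypothetical and the values are chosen only for illustration.

```python
def cbow_context_vector(context_words, embeddings):
    # Average the embeddings of the context words (bag-of-words style).
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in context_words:
        for d, value in enumerate(embeddings[word]):
            total[d] += value
    return [t / len(context_words) for t in total]

# Hypothetical 2-dimensional embeddings (illustrative values only).
embeddings = {
    "the": [1.0, 0.0],
    "cat": [0.5, 2.0],
    "on": [0.0, 1.0],
    "mat": [2.0, 0.5],
}

forward = cbow_context_vector(["the", "cat", "on", "mat"], embeddings)
backward = cbow_context_vector(["mat", "on", "cat", "the"], embeddings)
# Both orderings give the same context vector, so word order is ignored.
```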
A model for element classification of set data (i.e. assign a class for each element in the set) needs to be:
equivariant to permutation
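Permutation equivariance means that permuting the input elements permutes the per-element outputs in the same way. A minimal sketch, assuming a hypothetical DeepSets-style layer (each element is combined with the set mean; the coefficients `a` and `b` are illustrative):

```python
def equivariant_layer(xs, a=2.0, b=0.5):
    # Every element is transformed identically and sees the same set summary.
    mean = sum(xs) / len(xs)
    return [a * x + b * mean for x in xs]

def permute(xs, perm):
    return [xs[i] for i in perm]

xs = [1.0, 2.0, 4.0, 8.0]
perm = [2, 0, 3, 1]

# Equivariance: layer(permute(x)) equals permute(layer(x)).
lhs = equivariant_layer(permute(xs, perm))
rhs = permute(equivariant_layer(xs), perm)
```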
A fully connected (dense) nn is used to model data that resides on a grid domain (such as a 2D image). This model is:
Neither translationally invariant nor translationally equivariant
Choose the best structure for a data efficient model that operates on graph data and assigns a class to a graph
Permutation-equivariant layer(s), followed by global pooling layer(s) and a softmax layer
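A rough sketch of this structure, under simplifying assumptions: one hypothetical message passing step (each node adds its neighbours' features), sum pooling over nodes, and a softmax over two made-up class weights. Relabelling the nodes leaves the graph-level prediction unchanged.

```python
import math

def message_passing(features, edges):
    # Equivariant step: each node adds the features of its neighbours.
    out = list(features)
    for i, j in edges:
        out[i] += features[j]
        out[j] += features[i]
    return out

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify_graph(features, edges, class_weights=(0.1, -0.1)):
    hidden = message_passing(features, edges)  # permutation equivariant
    pooled = sum(hidden)                       # permutation invariant pooling
    return softmax([w * pooled for w in class_weights])

features = [1.0, 2.0, 3.0]
edges = [(0, 1), (1, 2)]

# Relabel the nodes: old node i becomes node relabel[i].
relabel = [2, 0, 1]
features_p = [0.0] * len(features)
for old, new in enumerate(relabel):
    features_p[new] = features[old]
edges_p = [(relabel[i], relabel[j]) for i, j in edges]

probs = classify_graph(features, edges)
probs_p = classify_graph(features_p, edges_p)
```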
Select the layer(s) of a CNN that inputs a tensor with dimensionality 64x64x28 and outputs an activation map with the dimensionality of 60x60x32
2D Convolutional layer, with 32 filters, each with a kernel of size (5x5), stride 1 and VALID padding
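The output shape can be checked with the standard size formula, out = floor((in - kernel) / stride) + 1 for VALID padding. The helper below is a hypothetical utility written for this card, not a library function. The number of input channels does not affect the output shape; the number of filters sets the output depth.

```python
def conv2d_output_shape(height, width, kernel, stride, num_filters, padding="VALID"):
    if padding == "VALID":
        # No padding: the kernel must fit entirely inside the input.
        out_h = (height - kernel) // stride + 1
        out_w = (width - kernel) // stride + 1
    elif padding == "SAME":
        # Padding chosen so that the output size is ceil(input / stride).
        out_h = -(-height // stride)
        out_w = -(-width // stride)
    else:
        raise ValueError("padding must be 'VALID' or 'SAME'")
    return (out_h, out_w, num_filters)

shape = conv2d_output_shape(64, 64, kernel=5, stride=1, num_filters=32)
# shape is (60, 60, 32)
```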
The number of times a parameter is re-used in an RNN cell is proportional to:
The length of the sequence
Which of the following activation functions have the right properties to be suitable for serving as a gating mechanism?
The cumulative distribution function of the standard normal distribution, hard sigmoid, logistic sigmoid
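What these three functions share is an output range of [0, 1], so the output can scale a signal between "fully blocked" and "fully passed". A small sketch; the hard sigmoid below uses one common definition, clip(0.2x + 0.5, 0, 1), which is an assumption since definitions vary between libraries.

```python
import math

def logistic_sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hard_sigmoid(x):
    # Piecewise-linear approximation; slope and offset are one common choice.
    return max(0.0, min(1.0, 0.2 * x + 0.5))

def normal_cdf(x):
    # CDF of the standard normal distribution via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

gates = [logistic_sigmoid, hard_sigmoid, normal_cdf]
xs = range(-10, 11)

# Every gate maps every tested input into the interval [0, 1].
all_in_unit_interval = all(0.0 <= g(x) <= 1.0 for g in gates for x in xs)
```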
What are the required characteristic(s) of the aggregation function of message passing graph nn?
Produce the same result for any permutation of the input values, deal with a different number of input elements (averaging is one valid choice, but sum or max also qualify)
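As an illustration, sum, mean and max are common aggregators that are permutation invariant and accept any number of incoming messages. The sketch below checks both properties on made-up message values.

```python
from itertools import permutations

def mean(values):
    return sum(values) / len(values)

aggregators = {"sum": sum, "mean": mean, "max": max}
messages = [1.0, 4.0, 2.0]

# Permutation invariance: every ordering of the messages gives the same result.
invariant = all(
    agg(p) == agg(messages)
    for agg in aggregators.values()
    for p in permutations(messages)
)

# Variable input size: the same functions work for 1 or 5 neighbours.
one_neighbour = {name: agg([3.0]) for name, agg in aggregators.items()}
five_neighbours = {name: agg([1.0, 2.0, 3.0, 4.0, 5.0]) for name, agg in aggregators.items()}
```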
Select which statements are true for the message passing graph nn model
the model can learn a representation of a graph with a variable number of unordered edges and nodes, the model can learn a representation of a graph with a fixed number of nodes and bidirectional edges
What is the number of iterations that a message passing graph nn needs to implement to guarantee that information from each node will reach each other node in a fully connected graph?
1
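A small simulation makes this concrete. It assumes a connected, undirected graph and tracks, per node, the set of nodes whose information has reached it. In a fully connected graph every node is a direct neighbour of every other node, so one round suffices; in a path graph the number of rounds equals the diameter.

```python
def rounds_until_all_informed(num_nodes, edges):
    neighbours = {i: set() for i in range(num_nodes)}
    for i, j in edges:
        neighbours[i].add(j)
        neighbours[j].add(i)
    # known[i] is the set of nodes whose information has reached node i.
    known = {i: {i} for i in range(num_nodes)}
    rounds = 0
    while any(len(k) < num_nodes for k in known.values()):
        if rounds >= num_nodes:
            raise ValueError("graph is not connected")
        # One message passing iteration: merge what the neighbours know.
        known = {
            i: known[i].union(*(known[j] for j in neighbours[i]))
            for i in range(num_nodes)
        }
        rounds += 1
    return rounds

n = 5
complete_edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
path_edges = [(i, i + 1) for i in range(n - 1)]

complete_rounds = rounds_until_all_informed(n, complete_edges)
path_rounds = rounds_until_all_informed(n, path_edges)
```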
The depth of the message passing graph nn model is proportional to the:
number of iterations of message passing
When choosing an activation function for a fully connected neural network, which of the following is more likely to cause vanishing gradients during training?
Logistic Sigmoid
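The reason can be sketched numerically: the derivative of the logistic sigmoid is at most 0.25, so the product of activation derivatives along a path through many layers shrinks quickly. The depth of 10 is an illustrative assumption, and the bound below ignores the weight matrices, which also enter the real gradient.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# The derivative peaks at z = 0 and is smaller everywhere else.
max_grad = sigmoid_grad(0.0)
saturated_grad = sigmoid_grad(5.0)

# Upper bound on the product of activation derivatives over `depth` layers.
depth = 10
upper_bound = max_grad ** depth
```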
When choosing an activation function for a fully connected neural network, which of the following is most likely to lead to “dead” neurons?
Rectified Linear Unit
At what stage of the life of a neural network model is backpropagation used?
Training
Word Embedding
What probability distribution does the model in diagram (a) above estimate? Give a formula for the correct probability distribution, e.g
P
What probability distribution does the model in diagram (b) above estimate? Give a formula for the correct probability distribution, e.g.
p
Word embeddings (e.g. CBOW or Skipgram) can be used to train embeddings of other types of data than just words. Which of the following input data types are suitable for the use of these techniques?
DNA sequences of genes in terms of their four bases (Adenine A, Cytosine C, Guanine G, Thymine T), Tap dance choreography in “Kahnotation” (see figure below), Time series representing the values of stocks over time
A model consists of 5 convolutional layers operating on a grid. This model is:
Equivariant to translation
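A sketch of why stacking convolutional layers preserves translation equivariance, using a 1D signal and circular (wrap-around) padding so that the property holds exactly; both simplifications are assumptions made for the example, and with VALID or zero padding it holds only away from the borders. The kernels are arbitrary illustrative values.

```python
def circular_conv1d(signal, kernel):
    # Cross-correlation with wrap-around indexing.
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]

def shift(signal, s):
    # Circularly translate the signal by s positions.
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]

def conv_stack(signal, kernels):
    # Each layer: convolution followed by an element-wise ReLU.
    for kernel in kernels:
        signal = [max(0, v) for v in circular_conv1d(signal, kernel)]
    return signal

kernels = [[1, -1, 2], [0, 1, 1], [2, -1, 0], [1, 0, -1], [1, 1, 1]]  # 5 layers
x = [3, 0, 1, 4, 0, 2, 5, 1]

# Equivariance: translating the input translates the output by the same amount.
shifted_then_conv = conv_stack(shift(x, 2), kernels)
conv_then_shifted = shift(conv_stack(x, kernels), 2)
```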
Select the layer(s) of a CNN that inputs a tensor with dimensionality 64x64x128 and outputs an activation map with the dimensionality of 60x60x32.
2D Convolutional layer, with 32 filters, each with a kernel of size (5x5), stride 1 and VALID padding.
The task at hand is signature verification. You have access to a dataset of 50000 images of signatures. You only have 2-3 signatures per person.
During the operation of the system, a person provides identification and they sign a document. The task of the model is to verify that the provided signature corresponds to the one in the database.
Describe:
1. The data domain and its symmetries
2. The type of model that is motivated well for the given data domain and problem formulation
3. The loss function
4. How your model computes a verification score during the operation of the system
The task is metric learning, since we have many images of signatures but only 2-3 examples per person (many classes, few samples per class). The data domain is images on a 2D grid. It has translation symmetry, where the position of the signature within the image should not affect verification, and rotational symmetry, where the orientation of the signature should not affect it either. We use a siamese network, in which two copies of the same network with shared weights embed a pair of images; a pair is either genuine (both signatures from the same person) or an impostor pair (signatures from different people). The loss function is the contrastive loss, which aims to maximize the similarity between genuine signatures while minimizing the similarity between genuine and impostor signatures. During operation, the model embeds the provided signature and the stored signature(s) of the claimed identity, and the distance between the embeddings is used as the verification score, where a lower distance means a higher score.
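The loss and the verification score can be sketched as follows. This assumes one common form of the contrastive loss, L = y * d^2 + (1 - y) * max(0, m - d)^2, with y = 1 for a genuine pair, d the Euclidean distance between embeddings and m a margin. The embeddings, margin and threshold are hypothetical values, and the embedding network itself is omitted.

```python
import math

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(emb_a, emb_b, same_person, margin=1.0):
    d = euclidean_distance(emb_a, emb_b)
    if same_person:
        # Genuine pair: pull the embeddings together.
        return d ** 2
    # Impostor pair: push apart until the distance exceeds the margin.
    return max(0.0, margin - d) ** 2

def verify(emb_query, emb_reference, threshold=0.5):
    # Lower distance means higher confidence; accept below the threshold.
    return euclidean_distance(emb_query, emb_reference) < threshold

# Hypothetical 2-dimensional embeddings produced by the shared network.
reference = [0.0, 0.0]
far_away = [3.0, 4.0]

loss_genuine_far = contrastive_loss(reference, far_away, same_person=True)
loss_impostor_far = contrastive_loss(reference, far_away, same_person=False)
loss_impostor_identical = contrastive_loss(reference, reference, same_person=False)
```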
Suppose the recurrent weight matrix W in a vanilla RNN (i.e. without gates) has matrix norm
How fast does the sensitivity of an output o to an input x increase or decrease in terms of l?
It decreases exponentially in l
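The norm value is missing from the question above; the answer implies it is smaller than 1. Under that assumption, a scalar linear recurrence h_t = w * h_{t-1} + x_t (a deliberate simplification with an illustrative w = 0.5) shows the effect: the sensitivity of the state after l steps to the first input is w^l. A norm larger than 1 would instead give exponential growth.

```python
def sensitivity(w, steps):
    # d h_steps / d x_0 for h_t = w * h_{t-1} + x_t: one factor of w per step.
    grad = 1.0
    for _ in range(steps):
        grad *= w
    return grad

lengths = (1, 5, 10)
decaying = [sensitivity(0.5, l) for l in lengths]  # |w| < 1
growing = [sensitivity(2.0, l) for l in lengths]   # |w| > 1
```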
What is the number of iterations that a Message Passing Graph Neural network needs to implement to guarantee that information from each node will reach each other node in a fully connected graph?
1