Mentimeters Flashcards

(73 cards)

1
Q

A CNN filter is applied to

A

all channels of the input layer

2
Q

Stride is

A

step with which the filter is applied

3
Q

Padding

A

increases the spatial size of the input
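The combined effect of filter size, padding, and stride on output size can be sketched with the standard convolution size formula (a minimal pure-Python helper, not part of the cards):

```python
def conv_output_size(n, f, p, s):
    """Spatial output size of a convolution:
    input size n, filter size f, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

# A 3x3 filter with stride 1 shrinks a 32-pixel input to 30,
# but padding of 1 preserves the spatial size ("same" padding).
assert conv_output_size(32, 3, 0, 1) == 30
assert conv_output_size(32, 3, 1, 1) == 32
```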

4
Q

Pooling

A
  • combines feature values within a region
  • downsamples feature maps
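Both points can be seen in a tiny 2x2 max-pooling sketch (pure Python, illustrative only):

```python
def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: combines feature values within
    each region and halves the spatial size of the feature map."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
# 4x4 map downsampled to 2x2; each output is the max of its region.
assert max_pool2x2(fmap) == [[4, 2], [2, 8]]
```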
5
Q

CNN activation is applied to

A

channel

6
Q

Hyperparameters can be learned with

A

a validation set

7
Q

FC layer is typically used

A

close to the output side of the network

8
Q

Typical loss for multiclass classification

A
  • cross entropy
  • softmax
  • negative log likelihood
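These three fit together: softmax turns logits into probabilities, and cross entropy is the negative log likelihood of the target class. A minimal sketch (pure Python):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Negative log likelihood of the target class under softmax."""
    return -math.log(softmax(logits)[target])

probs = softmax([2.0, 1.0, 0.1])
assert abs(sum(probs) - 1.0) < 1e-9  # probabilities sum to 1
# A more confident correct prediction gives a smaller loss.
assert cross_entropy([5.0, 0.0, 0.0], 0) < cross_entropy([1.0, 0.0, 0.0], 0)
```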
9
Q

ReLU can be applied

A

before or after max-pooling (the result is the same either way, since ReLU is monotonically non-decreasing)

10
Q

Learning rate is

A

the step size of each weight update
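A minimal gradient-descent sketch showing the learning rate as the step size (toy quadratic, pure Python):

```python
# Minimise f(w) = (w - 3)^2; the learning rate lr scales each
# weight update w -= lr * gradient.
def grad(w):
    return 2 * (w - 3)

w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(w)
assert abs(w - 3) < 1e-6  # converged to the minimum at w = 3
```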

11
Q

Weights are not updated once per

A

epoch (they are updated once per batch/iteration)

12
Q

All training data is used to update weights in one

A

epoch

13
Q

Averaging updates over iterations is called

A

momentum
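A sketch of SGD with momentum, where the update direction is an exponential moving average of past gradients (illustrative pure Python; `beta` is the momentum coefficient):

```python
# Momentum averages updates over iterations: the velocity v is an
# exponential moving average of gradients, used as the step direction.
def sgd_momentum(w, v, g, lr=0.1, beta=0.9):
    v = beta * v + g
    w = w - lr * v
    return w, v

# Minimise f(w) = (w - 3)^2 again, now with momentum.
w, v = 0.0, 0.0
for _ in range(400):
    w, v = sgd_momentum(w, v, 2 * (w - 3))
assert abs(w - 3) < 1e-4
```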

14
Q

first- and second-order moments of gradients are used in

A
  • AdaDelta
  • RMSProp
  • Adam
  • AdaGrad
15
Q

Batch normalisation is applied to

A

channels
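Per-channel normalisation can be sketched on a single channel's batch of values (pure Python, omitting the learnable scale and shift):

```python
# Batch normalisation standardises each channel separately using
# that channel's batch mean and variance.
def batch_norm(channel_values, eps=1e-5):
    n = len(channel_values)
    mean = sum(channel_values) / n
    var = sum((v - mean) ** 2 for v in channel_values) / n
    return [(v - mean) / (var + eps) ** 0.5 for v in channel_values]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
assert abs(sum(out)) < 1e-6                           # zero mean
assert abs(sum(v * v for v in out) / 4 - 1.0) < 1e-3  # unit variance
```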

16
Q

Dropout is an effective regularisation of

A

fully connected layers
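A sketch of inverted dropout on a layer's activations (pure Python; the 1/(1-p) rescaling keeps the expected activation unchanged, so evaluation mode is a no-op):

```python
import random

def dropout(x, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p
    during training, rescaling survivors by 1/(1-p)."""
    if not training:
        return x
    return [0.0 if random.random() < p else v / (1 - p) for v in x]

random.seed(0)
x = [1.0] * 10000
y = dropout(x, p=0.5)
assert y.count(0.0) > 0                      # some units were dropped
assert abs(sum(y) / len(y) - 1.0) < 0.05     # expected value preserved
assert dropout(x, training=False) == x       # no-op at evaluation time
```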

17
Q

L2 regularisation of weights is called

A

weight decay
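The name comes from the update rule: the gradient of the L2 penalty shrinks ("decays") every weight on each step. A minimal sketch (pure Python; `wd` is the decay coefficient):

```python
# SGD step with L2 regularisation: the penalty (wd/2)*||w||^2
# contributes gradient wd*w, pulling each weight toward zero.
def sgd_step(w, g, lr=0.1, wd=0.01):
    return w - lr * (g + wd * w)

# With zero data gradient the weight decays geometrically toward 0.
w = 1.0
for _ in range(3):
    w = sgd_step(w, g=0.0)
assert abs(w - (1 - 0.1 * 0.01) ** 3) < 1e-12
```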

18
Q

fine-tuning is the process of

A

updating parameters pretrained on another dataset

19
Q

data augmentation consists of

A

generating new samples from existing ones
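A minimal example of one such transformation, a horizontal flip of an image stored as a 2D list (illustrative only; real pipelines also use crops, rotations, colour jitter, etc.):

```python
# Data augmentation: derive a new training sample from an existing
# one, here by mirroring the image left-to-right.
def hflip(img):
    return [list(reversed(row)) for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]
assert hflip(img) == [[3, 2, 1], [6, 5, 4]]
assert hflip(hflip(img)) == img  # flipping twice restores the original
```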

20
Q

hard negative is a

A

a negative example that is similar to a positive one

21
Q

hard positive is a

A

a positive sample that is dissimilar to other positive ones

22
Q

to debug a model

A

overfit on a small dataset

23
Q

bias in a dataset is

A

a confounding signal introduced during data collection

24
Q

VGG uses

A

3x3 filters and max pooling

25
Q

VGG is widely used because of its

A

effective feature representation

26
Q

the efficiency of 1x1 filters was exploited in

A

Inception

27
Q

an Inception block uses

A

parallel filters with concatenated outputs

28
Q

skip connections are used in

A

ResNet

29
Q

skip connections in ResNet

A

pass the data through unchanged (identity mapping)
30
Q

the best performing word embedding is

A

BERT

31
Q

which unit is least effective at remembering sequences

A

RNN

32
Q

the gating mechanism uses

A

the sigmoid function

33
Q

in a GRU, the hidden state and input are

A

concatenated

34
Q

language modelling uses the architecture type

A

many-to-many

35
Q

Transformer self-attention uses

A

linear projections
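A sketch of scaled dot-product attention over those projections (pure Python on tiny matrices; `Q`, `K`, `V` stand for the projected queries, keys, and values):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V,
    where Q, K, V are linear projections of the input tokens."""
    d = len(Q[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, V)

# Two tokens in dimension 2; V rows are one-hot, so each output row
# is a convex combination of them and sums to 1.
Q = [[1.0, 0.0], [0.0, 1.0]]
out = attention(Q, Q, [[1.0, 0.0], [0.0, 1.0]])
assert all(abs(sum(row) - 1.0) < 1e-9 for row in out)
assert out[0][0] > out[0][1]  # each token attends most to itself
```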
36
Q

what is the goal of reinforcement learning?

A

maximise the expected return

37
Q

the discount factor in the value function is used to

A

weigh immediate against future rewards

38
Q

which behaviour counts as exploration in game playing

A

playing an experimental move

39
Q

what is the main drawback of the Monte Carlo sampling approach to RL

A

it needs to run an entire episode before updating

40
Q

what is the main difference between Q-learning and SARSA

A

SARSA uses the epsilon-greedy policy in its update (it is on-policy)

41
Q

the main problem of Q-learning

A

it is not scalable

42
Q

in deep Q-learning, 'deep' is mainly used to

A

approximate the Q-function

43
Q

in policy-based methods, do we select actions according to a value function?

A

no

44
Q

policy optimisation can only be performed using gradient-based methods

A

false

45
Q

REINFORCE is based on

A

Monte Carlo sampling

46
Q

in REINFORCE with baseline, the baseline is used to

A

reduce variance

47
Q

which method is not designed to reduce variance

A

REINFORCE

48
Q

in actor-critic methods, the critic is similar to which part of a GAN

A

the discriminator

49
Q

compared to value-based methods, policy-based methods can handle continuous actions easily

A

true
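Several of the value-based cards above meet in the tabular Q-learning update, sketched here (pure Python; state and action names are hypothetical):

```python
# Tabular Q-learning: move Q(s, a) toward the observed reward plus
# the discounted value of the best action in the next state.
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

Q = {"s0": {"left": 0.0, "right": 0.0},
     "s1": {"left": 0.0, "right": 10.0}}
q_update(Q, "s0", "right", 1.0, "s1")
# target = 1 + 0.9 * 10 = 10; Q moves halfway there (alpha = 0.5)
assert abs(Q["s0"]["right"] - 5.0) < 1e-9
```

Unlike Monte Carlo sampling, this update happens after every single transition, not at the end of an episode.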
50
Q

which parameters are not hyperparameters

A

the weights of a convolutional kernel

51
Q

which hyperparameter optimisation method is more efficient

A

random search

52
Q

in successive halving, the number of configurations n indicates

A

exploration

53
Q

in meta-learning, only training tasks contain a training set and a test set

A

false

54
Q

in meta-learning, the total loss is computed using

A

test examples

55
Q

meta-learning and multi-task learning are the same

A

false
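The successive-halving card can be made concrete with a small sketch (pure Python; `evaluate` is a hypothetical scoring function taking a configuration and a budget):

```python
# Successive halving: start with n configurations (larger n means
# more exploration), evaluate each on a small budget, keep the best
# half, double the budget, and repeat until one remains.
def successive_halving(configs, evaluate, budget=1):
    while len(configs) > 1:
        scores = {c: evaluate(c, budget) for c in configs}
        configs = sorted(configs, key=scores.get, reverse=True)
        configs = configs[: max(1, len(configs) // 2)]
        budget *= 2
    return configs[0]

# Toy example: the "score" of a configuration is just its own value.
best = successive_halving([1, 5, 3, 8], lambda c, b: c)
assert best == 8
```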
56
Q

which colour representation can be used to compute colour similarities

A

the RGB colour space

57
Q

unsupervised representation learning can't be used for

A

learning a mapping function from a dataset to labels

58
Q

an autoencoder is an

A

unsupervised method

59
Q

in an autoencoder, the decoder must be symmetric to the encoder

A

false

60
Q

as long as an autoencoder can reconstruct the input, it can learn useful representations of the input

A

false

61
Q

what objective function is used to train an autoencoder

A

reconstruction loss

62
Q

which is not a type of autoencoder?

A

disruptive

63
Q

which autoencoder can be used to perform dimensionality reduction

A

undercomplete autoencoders

64
Q

in autoencoders, which technique is used for anomaly detection

A

reconstruction error

65
Q

which autoencoder should be used to recover noisy data

A

the denoising autoencoder
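The anomaly-detection card can be sketched as follows: inputs the model reconstructs poorly (high reconstruction error) are flagged. The `reconstruct` function here is a hypothetical stand-in for a trained autoencoder, and the threshold is arbitrary:

```python
# Flag a sample as anomalous when its reconstruction error (MSE
# between input and the autoencoder's output) exceeds a threshold.
def mse(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def is_anomaly(x, reconstruct, threshold=1.0):
    return mse(x, reconstruct(x)) > threshold

# Toy "model": reconstructs everything as zeros (training data was
# assumed to be centred near 0).
reconstruct = lambda x: [0.0 for _ in x]
assert not is_anomaly([0.1, -0.2, 0.1], reconstruct)  # looks normal
assert is_anomaly([5.0, 4.0, 6.0], reconstruct)       # far off: anomaly
```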
66
Q

an image classification model is a

A

discriminative model

67
Q

VAEs are

A

explicit methods

68
Q

how are VAEs trained

A

by maximising the likelihood

69
Q

the reparameterisation trick in VAEs is used for

A

training (it lets gradients flow through the sampling step)

70
Q

GANs are

A

implicit methods

71
Q

which loss is better for training the generator of GANs

A

the non-saturating heuristic

72
Q

what do GANs and VAEs have in common

A

both are generative models

73
Q

are VAEs easier to train but generate less sharp images?

A

yes