Mentimeters Flashcards

(73 cards)

1
Q

A CNN filter is applied to

A

all channels of the input layer

2
Q

Stride is

A

step with which the filter is applied

3
Q

Padding

A

increases the spatial size of the input
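The combined effect of filter size, padding, and stride on output size can be sketched with the standard convolution size formula (a minimal pure-Python helper, not part of the cards):

```python
def conv_output_size(n, f, p, s):
    """Spatial output size of a convolution:
    input size n, filter size f, padding p, stride s."""
    return (n + 2 * p - f) // s + 1

# A 3x3 filter with stride 1 shrinks a 32-pixel input to 30,
# but padding of 1 preserves the spatial size ("same" padding).
assert conv_output_size(32, 3, 0, 1) == 30
assert conv_output_size(32, 3, 1, 1) == 32
```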

4
Q

Pooling

A
  • combines feature values within a region
  • downsamples feature maps
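Both points can be seen in a tiny 2x2 max-pooling sketch (pure Python, illustrative only):

```python
def max_pool2x2(fmap):
    """2x2 max pooling with stride 2: combines feature values within
    each region and halves the spatial size of the feature map."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[i][j], fmap[i][j + 1],
                 fmap[i + 1][j], fmap[i + 1][j + 1])
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

fmap = [[1, 3, 2, 0],
        [4, 2, 1, 1],
        [0, 1, 5, 6],
        [2, 2, 7, 8]]
# 4x4 map downsampled to 2x2; each output is the max of its region.
assert max_pool2x2(fmap) == [[4, 2], [2, 8]]
```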
5
Q

CNN activation is applied to

A

channel

6
Q

Hyperparameters can be learned with

A

a validation set

7
Q

FC layer is typically used

A

close to the output side of the network

8
Q

Typical loss for multiclass classification

A
  • cross entropy
  • softmax
  • negative log likelihood
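These three fit together: softmax turns logits into probabilities, and cross entropy is the negative log likelihood of the target class. A minimal sketch (pure Python):

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Negative log likelihood of the target class under softmax."""
    return -math.log(softmax(logits)[target])

probs = softmax([2.0, 1.0, 0.1])
assert abs(sum(probs) - 1.0) < 1e-9  # probabilities sum to 1
# A more confident correct prediction gives a smaller loss.
assert cross_entropy([5.0, 0.0, 0.0], 0) < cross_entropy([1.0, 0.0, 0.0], 0)
```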
9
Q

ReLU can be applied

A

before or after max-pooling (the result is the same either way, since ReLU is monotonically non-decreasing)

10
Q

Learning rate is

A

the step size of each weight update
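A minimal gradient-descent sketch showing the learning rate as the step size (toy quadratic, pure Python):

```python
# Minimise f(w) = (w - 3)^2; the learning rate lr scales each
# weight update w -= lr * gradient.
def grad(w):
    return 2 * (w - 3)

w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(w)
assert abs(w - 3) < 1e-6  # converged to the minimum at w = 3
```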

11
Q

Weights are not updated once per

A

epoch (they are updated once per batch/iteration)

12
Q

All training data is used to update weights in one

A

epoch

13
Q

Averaging updates over iterations is called

A

momentum
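A sketch of SGD with momentum, where the update direction is an exponential moving average of past gradients (illustrative pure Python; `beta` is the momentum coefficient):

```python
# Momentum averages updates over iterations: the velocity v is an
# exponential moving average of gradients, used as the step direction.
def sgd_momentum(w, v, g, lr=0.1, beta=0.9):
    v = beta * v + g
    w = w - lr * v
    return w, v

# Minimise f(w) = (w - 3)^2 again, now with momentum.
w, v = 0.0, 0.0
for _ in range(400):
    w, v = sgd_momentum(w, v, 2 * (w - 3))
assert abs(w - 3) < 1e-4
```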

14
Q

first- and second-order moments of gradients are used in

A
  • AdaDelta
  • RMSProp
  • Adam
  • AdaGrad
15
Q

Batch normalisation is applied to

A

channels
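Per-channel normalisation can be sketched on a single channel's batch of values (pure Python, omitting the learnable scale and shift):

```python
# Batch normalisation standardises each channel separately using
# that channel's batch mean and variance.
def batch_norm(channel_values, eps=1e-5):
    n = len(channel_values)
    mean = sum(channel_values) / n
    var = sum((v - mean) ** 2 for v in channel_values) / n
    return [(v - mean) / (var + eps) ** 0.5 for v in channel_values]

out = batch_norm([1.0, 2.0, 3.0, 4.0])
assert abs(sum(out)) < 1e-6                           # zero mean
assert abs(sum(v * v for v in out) / 4 - 1.0) < 1e-3  # unit variance
```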

16
Q

Dropout is an effective regularisation of

A

fully connected layers
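A sketch of inverted dropout on a layer's activations (pure Python; the 1/(1-p) rescaling keeps the expected activation unchanged, so evaluation mode is a no-op):

```python
import random

def dropout(x, p=0.5, training=True):
    """Inverted dropout: zero each activation with probability p
    during training, rescaling survivors by 1/(1-p)."""
    if not training:
        return x
    return [0.0 if random.random() < p else v / (1 - p) for v in x]

random.seed(0)
x = [1.0] * 10000
y = dropout(x, p=0.5)
assert y.count(0.0) > 0                      # some units were dropped
assert abs(sum(y) / len(y) - 1.0) < 0.05     # expected value preserved
assert dropout(x, training=False) == x       # no-op at evaluation time
```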

17
Q

L2 regularisation of weights is called

A

weight decay
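The name comes from the update rule: the gradient of the L2 penalty shrinks ("decays") every weight on each step. A minimal sketch (pure Python; `wd` is the decay coefficient):

```python
# SGD step with L2 regularisation: the penalty (wd/2)*||w||^2
# contributes gradient wd*w, pulling each weight toward zero.
def sgd_step(w, g, lr=0.1, wd=0.01):
    return w - lr * (g + wd * w)

# With zero data gradient the weight decays geometrically toward 0.
w = 1.0
for _ in range(3):
    w = sgd_step(w, g=0.0)
assert abs(w - (1 - 0.1 * 0.01) ** 3) < 1e-12
```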

18
Q

fine-tuning is the process of

A

updating parameters pretrained on another dataset

19
Q

data augmentation consists of

A

generating new samples from existing ones
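A minimal example of one such transformation, a horizontal flip of an image stored as a 2D list (illustrative only; real pipelines also use crops, rotations, colour jitter, etc.):

```python
# Data augmentation: derive a new training sample from an existing
# one, here by mirroring the image left-to-right.
def hflip(img):
    return [list(reversed(row)) for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]
assert hflip(img) == [[3, 2, 1], [6, 5, 4]]
assert hflip(hflip(img)) == img  # flipping twice restores the original
```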

20
Q

hard negative is a

A

a negative example that is similar to a positive one

21
Q

hard positive is a

A

a positive sample that is dissimilar to other positive ones

22
Q

to debug a model

A

overfit on a small dataset

23
Q

bias in a dataset is

A

a confounding signal introduced during data collection

24
Q

VGG uses

A

3x3 filters and max pooling

25
Q

VGG is widely used because of its

A

effective feature representation

26
Q

the efficiency of 1x1 filters was exploited in

A

Inception

27
Q

an Inception block uses

A

parallel filters with concatenated outputs

28
Q

skip connections are used in

A

ResNet

29
Q

skip connections in ResNet

A

pass the data through unchanged (identity mapping)
30
Q

the best performing word embedding is

A

BERT

31
Q

which unit is least effective at remembering sequences

A

RNN

32
Q

the gating mechanism uses

A

the sigmoid function

33
Q

in a GRU, the hidden state and input are

A

concatenated

34
Q

language modelling uses the architecture type

A

many-to-many

35
Q

Transformer self-attention uses

A

linear projections
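A sketch of scaled dot-product attention over those projections (pure Python on tiny matrices; `Q`, `K`, `V` stand for the projected queries, keys, and values):

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V,
    where Q, K, V are linear projections of the input tokens."""
    d = len(Q[0])
    K_T = [list(col) for col in zip(*K)]
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(Q, K_T)]
    weights = [softmax(row) for row in scores]
    return matmul(weights, V)

# Two tokens in dimension 2; V rows are one-hot, so each output row
# is a convex combination of them and sums to 1.
Q = [[1.0, 0.0], [0.0, 1.0]]
out = attention(Q, Q, [[1.0, 0.0], [0.0, 1.0]])
assert all(abs(sum(row) - 1.0) < 1e-9 for row in out)
assert out[0][0] > out[0][1]  # each token attends most to itself
```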
36
Q

what is the goal of reinforcement learning?

A

maximise the expected return

37
Q

the discount factor in the value function is used to

A

weigh immediate against future rewards

38
Q

which behaviour counts as exploration in game playing

A

playing an experimental move

39
Q

what is the main drawback of the Monte Carlo sampling approach to RL

A

it needs to run an entire episode before updating

40
Q

what is the main difference between Q-learning and SARSA

A

SARSA uses the epsilon-greedy policy in its update (it is on-policy)

41
Q

the main problem of Q-learning

A

it is not scalable

42
Q

in deep Q-learning, 'deep' is mainly used to

A

approximate the Q-function

43
Q

in policy-based methods, do we select actions according to a value function?

A

no

44
Q

policy optimisation can only be performed using gradient-based methods

A

false

45
Q

REINFORCE is based on

A

Monte Carlo sampling

46
Q

in REINFORCE with baseline, the baseline is used to

A

reduce variance

47
Q

which method is not designed to reduce variance

A

REINFORCE

48
Q

in actor-critic methods, the critic is similar to which part of a GAN

A

the discriminator

49
Q

compared to value-based methods, policy-based methods can handle continuous actions easily

A

true
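Several of the value-based cards above meet in the tabular Q-learning update, sketched here (pure Python; state and action names are hypothetical):

```python
# Tabular Q-learning: move Q(s, a) toward the observed reward plus
# the discounted value of the best action in the next state.
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    target = r + gamma * max(Q[s_next].values())
    Q[s][a] += alpha * (target - Q[s][a])

Q = {"s0": {"left": 0.0, "right": 0.0},
     "s1": {"left": 0.0, "right": 10.0}}
q_update(Q, "s0", "right", 1.0, "s1")
# target = 1 + 0.9 * 10 = 10; Q moves halfway there (alpha = 0.5)
assert abs(Q["s0"]["right"] - 5.0) < 1e-9
```

Unlike Monte Carlo sampling, this update happens after every single transition, not at the end of an episode.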
50
Q

which parameters are not hyperparameters

A

the weights of a convolutional kernel

51
Q

which hyperparameter optimisation method is more efficient

A

random search

52
Q

in successive halving, the number of configurations n indicates

A

exploration

53
Q

in meta-learning, only training tasks contain a training set and a test set

A

false

54
Q

in meta-learning, the total loss is computed using

A

test examples

55
Q

meta-learning and multi-task learning are the same

A

false
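The successive-halving card can be made concrete with a small sketch (pure Python; `evaluate` is a hypothetical scoring function taking a configuration and a budget):

```python
# Successive halving: start with n configurations (larger n means
# more exploration), evaluate each on a small budget, keep the best
# half, double the budget, and repeat until one remains.
def successive_halving(configs, evaluate, budget=1):
    while len(configs) > 1:
        scores = {c: evaluate(c, budget) for c in configs}
        configs = sorted(configs, key=scores.get, reverse=True)
        configs = configs[: max(1, len(configs) // 2)]
        budget *= 2
    return configs[0]

# Toy example: the "score" of a configuration is just its own value.
best = successive_halving([1, 5, 3, 8], lambda c, b: c)
assert best == 8
```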
56
Q

which colour representation can be used to compute colour similarities

A

the RGB colour space

57
Q

unsupervised representation learning can't be used for

A

learning a mapping function from a dataset to labels

58
Q

an autoencoder is an

A

unsupervised method

59
Q

in an autoencoder, the decoder must be symmetric to the encoder

A

false

60
Q

as long as an autoencoder can reconstruct the input, it can learn useful representations of the input

A

false

61
Q

what objective function is used to train an autoencoder

A

reconstruction loss

62
Q

which is not a type of autoencoder?

A

disruptive

63
Q

which autoencoder can be used to perform dimensionality reduction

A

undercomplete autoencoders

64
Q

in autoencoders, which technique is used for anomaly detection

A

reconstruction error

65
Q

which autoencoder should be used to recover noisy data

A

the denoising autoencoder
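The anomaly-detection card can be sketched as follows: inputs the model reconstructs poorly (high reconstruction error) are flagged. The `reconstruct` function here is a hypothetical stand-in for a trained autoencoder, and the threshold is arbitrary:

```python
# Flag a sample as anomalous when its reconstruction error (MSE
# between input and the autoencoder's output) exceeds a threshold.
def mse(x, x_hat):
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def is_anomaly(x, reconstruct, threshold=1.0):
    return mse(x, reconstruct(x)) > threshold

# Toy "model": reconstructs everything as zeros (training data was
# assumed to be centred near 0).
reconstruct = lambda x: [0.0 for _ in x]
assert not is_anomaly([0.1, -0.2, 0.1], reconstruct)  # looks normal
assert is_anomaly([5.0, 4.0, 6.0], reconstruct)       # far off: anomaly
```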
66
Q

an image classification model is a

A

discriminative model

67
Q

VAEs are

A

explicit methods

68
Q

how are VAEs trained

A

by maximising the likelihood

69
Q

the reparameterisation trick in VAEs is used for

A

training (it lets gradients flow through the sampling step)

70
Q

GANs are

A

implicit methods

71
Q

which loss is better for training the generator of GANs

A

the non-saturating heuristic

72
Q

what do GANs and VAEs have in common

A

both are generative models

73
Q

are VAEs easier to train but generate less sharp images?

A

yes