General Flashcards

1
Q

What is a layer in a neural network?

A

A layer is a sequence of operators (BatchNorm, Conv, etc.) followed by exactly one activation function (ReLU, Sigmoid, etc.).

1 Layer = 1 Activation
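
A minimal PyTorch sketch of this definition (the channel sizes here are arbitrary, chosen just for illustration): a Conv + BatchNorm sequence capped by exactly one activation.

import torch.nn as nn

# One "layer" in the sense above: a sequence of operators
# (Conv, BatchNorm) plus exactly one activation (ReLU).
layer = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)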

2
Q

How do you normalize RGB values to defined mean and std values, and what are the benefits? E.g.

Normalize(
mean=[0.485, 0.456, 0.406],
std=[0.229, 0.224, 0.225])

A
1. Find the per-channel mean and standard deviation of the training set (the values above are the standard ImageNet training-set statistics).

2. Use those as the mean and std values to normalize every input image, both during training and during prediction. The benefit: normalized inputs fall in the range where activation functions are most sensitive (see card 8).
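
A minimal sketch of both steps, assuming a hypothetical train_images tensor of shape [N, 3, H, W] with values already scaled to [0, 1]:

import torch
from torchvision import transforms

# 1. Per-channel mean and std over the training set
#    (train_images is a stand-in for your real training data).
train_images = torch.rand(100, 3, 224, 224)
mean = train_images.mean(dim=(0, 2, 3))
std = train_images.std(dim=(0, 2, 3))

# 2. Use the same statistics for every image, at training
#    and at prediction time.
normalize = transforms.Normalize(mean=mean.tolist(), std=std.tolist())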

3
Q

What are the usual image preprocessing steps in machine learning?

A

Scale/Resize
Crop
To Tensor
Normalization
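
For example, a typical torchvision pipeline covering all four steps (the sizes and statistics below are the common ImageNet defaults, not requirements):

from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),        # scale/resize the short side to 256
    transforms.CenterCrop(224),    # crop to 224x224
    transforms.ToTensor(),         # to tensor, pixel values scaled to [0, 1]
    transforms.Normalize(          # normalization
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225]),
])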

4
Q

In IPython, why is wall time sometimes smaller than CPU time?

In [52]: %time out = resnet(batch_t)
CPU times: user 619 ms, sys: 0 ns, total: 619 ms
Wall time: 313 ms
A

On an N-core machine, if a task runs in parallel across all N cores, the wall time can be as little as 1/N of the total CPU time: CPU time adds up the time spent on every core, while wall time is the elapsed real time. In the example above, 619 ms of CPU time versus 313 ms of wall time suggests the work ran on roughly two cores.
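
A rough way to observe this outside IPython, assuming a PyTorch build whose CPU ops are multi-threaded (a large matmul runs across several cores, so process time exceeds wall time):

import time
import torch

a = torch.randn(2000, 2000)
b = torch.randn(2000, 2000)

cpu_start = time.process_time()    # CPU time, summed across all cores
wall_start = time.perf_counter()   # elapsed real (wall) time

c = a @ b

cpu_time = time.process_time() - cpu_start
wall_time = time.perf_counter() - wall_start
print(f"CPU time: {cpu_time:.3f} s, wall time: {wall_time:.3f} s")
print(f"Effective parallelism: ~{cpu_time / wall_time:.1f} cores")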

5
Q

Python lists or tuples of numbers are collections of Python objects that are individually
allocated in memory

A

Thus lists and tuples are not efficient storage for matrices: every element is a separate boxed Python object scattered in memory. Use a NumPy array or a PyTorch tensor instead; these store unboxed values in a single contiguous block.
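
A quick sketch of the size difference (exact byte counts vary by Python version and platform):

import sys
import numpy as np

nums = list(range(1000))
arr = np.arange(1000, dtype=np.int64)

# The list holds 1000 pointers plus 1000 separately allocated int objects.
list_bytes = sys.getsizeof(nums) + sum(sys.getsizeof(n) for n in nums)
# The array holds 1000 unboxed 8-byte integers in one contiguous buffer.
arr_bytes = arr.nbytes

print(f"list:  ~{list_bytes} bytes")   # tens of kilobytes
print(f"array: {arr_bytes} bytes")     # exactly 8000 bytes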

6
Q

Using “range indexing” on each of the tensor’s dimensions CREATES A VIEW only (not a copy).

Assume tensor T of shape 3x4x5
T[0:2, 1:4, 2:5].shape == 2x3x3

A

Slicing creates another tensor that presents a different view of the same underlying data.

I.e., slicing generates VIEWs, not COPIES.
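
A small sketch demonstrating the view behavior, using the 3x4x5 example from the question:

import torch

T = torch.zeros(3, 4, 5)
S = T[0:2, 1:4, 2:5]      # range indexing on every dimension
print(S.shape)            # torch.Size([2, 3, 3])

# S is a view: writing through it mutates T's underlying storage.
S[0, 0, 0] = 42.0
print(T[0, 1, 2])         # tensor(42.)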

7
Q

Transposing without copying

A

A matrix can be transposed “in view” by creating a view of the tensor with different tensor metadata:

Offset, shape, and stride. Transposing simply swaps the two dimensions’ shape and stride entries; the underlying data is never copied.
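
A small sketch: transposing only swaps the shape and stride metadata, and writes through one tensor are visible in the other:

import torch

M = torch.arange(6).reshape(2, 3)
Mt = M.t()                      # transpose: no data is copied

print(M.shape, M.stride())      # torch.Size([2, 3]) (3, 1)
print(Mt.shape, Mt.stride())    # torch.Size([3, 2]) (1, 3)

M[0, 1] = 99
print(Mt[1, 0])                 # tensor(99): the view sees the write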

8
Q

Neural networks exhibit the best training performance when the input data ranges roughly from 0 to 1, or from -1 to 1. Thus transforming input data from any range into [0,1] or [-1, 1] is called Normalization

Why?

A

Because activation functions are only sensitive (approximately linear, with a usable gradient) around [0, 1] or [-1, 1], depending on the activation function used.

In other ranges, the activation function saturates: its output stays pinned at its min/max value as the input changes, the gradient vanishes, and the unit no longer contributes to learning.
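
A small sketch with Sigmoid: near 0 the gradient is about 0.25, while at |x| = 10 the output is pinned near 0 or 1 and the gradient is effectively zero:

import torch

x = torch.tensor([-10.0, -1.0, 0.0, 1.0, 10.0], requires_grad=True)
y = torch.sigmoid(x)
y.sum().backward()

print(y)       # ~[0.0000, 0.2689, 0.5000, 0.7311, 1.0000]
print(x.grad)  # ~[0.0000, 0.1966, 0.2500, 0.1966, 0.0000] (saturated at the ends)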
