Troubleshooting Flashcards

1
Q

Why does your model perform worse than the authors'?

A

Implementation bugs
Hyperparameter choices - the model could be extremely sensitive to them
Data/model fit - your data differs from the data used in the paper
Dataset construction - in industry, most of the time is spent on datasets rather than models

2
Q

Broadly, what should the strategy be?

A

Pessimism.

Because debugging is hard, start with the very simplest things and then increase the complexity.

3
Q

Broadly, what is the process of building a model?

A

Start simple - choose a simple model and simple data

Implement and debug the model

Evaluate the results

Tune the hyperparameters

Improve the data/model

4
Q

When starting simple, which architecture should you choose?

A

For images, start with a LeNet-like architecture, then move to ResNet

For sequences, start with a Transformer/attention model, then move to a WaveNet-like model

For anything else, start with a simple fully connected NN

For multiple inputs (say, a picture with a phrase), start by mapping each input into a lower-dimensional feature space: for example, run the image through a ConvNet and flatten the result, and run the sequence through an LSTM and keep the final vector. Then concatenate everything and pass it through a fully connected layer
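
The multiple-input recipe can be sketched at the level of shapes. This is only an illustration in plain Python: `encode_image` and `encode_sequence` are hypothetical stand-ins for a real ConvNet and LSTM (they are not trained models), and the feature size of 4 and the random weights are arbitrary assumptions.

```python
import math
import random

def encode_image(pixels, dim=4):
    """Stand-in for ConvNet + flatten: average the pixels in `dim` chunks."""
    flat = [p for row in pixels for p in row]
    chunk = max(1, len(flat) // dim)
    return [sum(flat[i * chunk:(i + 1) * chunk]) / chunk for i in range(dim)]

def encode_sequence(tokens, dim=4):
    """Stand-in for an LSTM: a toy recurrent update, keeping only the final state."""
    state = [0.0] * dim
    for t in tokens:
        state = [math.tanh(0.5 * s + 0.1 * t * (i + 1)) for i, s in enumerate(state)]
    return state

def dense(x, weights, bias):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def fuse(image, tokens, weights, bias):
    """Encode each input separately, concatenate, then apply the dense layer."""
    features = encode_image(image) + encode_sequence(tokens)  # list concatenation
    return dense(features, weights, bias)

rng = random.Random(0)
image = [[rng.random() for _ in range(8)] for _ in range(8)]  # fake 8x8 picture
tokens = [3, 1, 4, 1, 5]                                      # fake token ids
weights = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(2)]  # 8 inputs, 2 outputs
bias = [0.0, 0.0]
logits = fuse(image, tokens, weights, bias)
```

The point of the sketch is the data flow: each input is reduced to a fixed-length vector on its own, and only the concatenated vector reaches the fully connected layer.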

5
Q

What are sensible defaults for the optimizer and related settings?

A

Adam with a learning rate of 3e-4
ReLU for CNNs, tanh for LSTMs
Regularization - none
Normalization - none (meaning layers such as batch normalization, not normalization of the input)
Both start as none because they are a common source of bugs
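
One way to keep these defaults in one place is a small config object. The sketch below is framework-agnostic; `BaselineConfig` and `default_config` are hypothetical names, and only the values themselves come from the card.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BaselineConfig:
    optimizer: str = "adam"
    learning_rate: float = 3e-4
    activation: str = "relu"
    regularization: Optional[str] = None  # off at first: a common source of bugs
    normalization: Optional[str] = None   # layer-level only (e.g. batch norm); inputs are still normalized

def default_config(model_kind: str) -> BaselineConfig:
    """Return the starting defaults for a 'cnn' or an 'lstm'."""
    activations = {"cnn": "relu", "lstm": "tanh"}
    if model_kind not in activations:
        raise ValueError(f"no default listed for model kind {model_kind!r}")
    return BaselineConfig(activation=activations[model_kind])

cnn_cfg = default_config("cnn")
lstm_cfg = default_config("lstm")
```

Freezing the dataclass makes any later departure from the baseline an explicit new object, which helps when tracking down which change introduced a bug.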

6
Q

Should I normalize the input data?

A

Yes!
Make sure to do it!
And check that it is not already done automatically
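
A minimal sketch of both halves of this advice, assuming a single numeric feature column and standardization to zero mean and unit variance (one common choice of normalization, not the only one). The function names and the tolerance are illustrative assumptions; `looks_standardized` is a rough heuristic for spotting data that some earlier step has already normalized.

```python
from statistics import fmean, pstdev

def fit_standardizer(train_column):
    """Compute mean and standard deviation on the training data only."""
    mean = fmean(train_column)
    std = pstdev(train_column) or 1.0  # avoid dividing by zero on constant columns
    return mean, std

def standardize(column, mean, std):
    """Apply the training-set statistics to any split (train, validation, test)."""
    return [(x - mean) / std for x in column]

def looks_standardized(column, tol=0.1):
    """Heuristic: mean near 0 and standard deviation near 1."""
    return abs(fmean(column)) < tol and abs(pstdev(column) - 1.0) < tol

train = [10.0, 12.0, 14.0, 16.0, 18.0]
mean, std = fit_standardizer(train)
scaled = standardize(train, mean, std)

already_done = looks_standardized(train)   # raw data: expected to be False
now_done = looks_standardized(scaled)      # after scaling: expected to be True
```

Running the check before scaling guards against normalizing twice when a data loader or preprocessing layer has already done it.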

7
Q

How can you simplify the problem so you can start easy?

A

Use a small training set (fewer than 10,000 examples)
Use fewer classes/objects, smaller pictures, etc.
Create a simple synthetic training set
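
These simplifications can be sketched in a few lines, assuming the dataset is a list of `(features, label)` pairs. `simplify_dataset` and `make_synthetic` are hypothetical helpers, the 10,000 cap is the rough ceiling from the card, and the synthetic rule (label is 1 when the two features sum to more than 0) is an arbitrary toy choice.

```python
import random

def simplify_dataset(examples, keep_labels, max_size=10_000, seed=0):
    """Keep only a few classes, then subsample to at most `max_size` examples."""
    kept = [(x, y) for x, y in examples if y in keep_labels]
    if len(kept) > max_size:
        kept = random.Random(seed).sample(kept, max_size)
    return kept

def make_synthetic(n=1000, seed=0):
    """Toy two-class set with a known rule, so a working model should fit it easily."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        data.append((x, int(x[0] + x[1] > 0)))
    return data

full = [((i,), i % 10) for i in range(50_000)]        # fake 10-class dataset
small = simplify_dataset(full, keep_labels={0, 1, 2})  # 3 classes, capped in size
toy = make_synthetic(100)
```

Fixing the seed keeps the reduced dataset identical between runs, so a change in results can be attributed to the model rather than to the sample.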

8
Q

Three general pieces of advice for implementing the model

A
  1. Lightweight implementation
    Fewer than 200 lines of code…
  2. Off-the-shelf components
  3. Build complicated pipelines later