Assessing Algorithms Flashcards

1
Q

KNN Pro/Con

A

P - Assumes no fixed model form; overfitting is controlled by the choice of k

C - Can’t extrapolate beyond the range of the training data

2
Q

Parametric Pro/Con

A

P - Able to extrapolate

C -

3
Q

RMS Error

A

sqrt( sum((actual - predicted)^2) / n )
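
A minimal sketch of the formula in plain Python. The function name `rmse` and the sample numbers are illustrative assumptions, not part of the card.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Square each individual error first, then average, then take the root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Every prediction is off by 3, so the RMS error is 3.
print(rmse([0.0, 0.0], [3.0, 3.0]))
```

Note that each error is squared before summing; squaring the sum of the errors instead would let positive and negative errors cancel.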

4
Q

out of sample error

A

RMSE on test data

Error measured on data that was not used to train the model
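
A rough sketch of measuring both errors, assuming a simple least-squares line as the model. The helper names (`fit_line`, `rmse`) and the toy data are made up for illustration.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(actual, predicted):
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Toy data, invented for this example.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [0.1, 1.2, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.2, 8.9]

# Train on the first 6 points, hold out the last 4 as test data.
train_x, test_x = xs[:6], xs[6:]
train_y, test_y = ys[:6], ys[6:]

slope, intercept = fit_line(train_x, train_y)
in_sample_error = rmse(train_y, [slope * x + intercept for x in train_x])
out_of_sample_error = rmse(test_y, [slope * x + intercept for x in test_x])

print("in sample RMSE:", in_sample_error)
print("out of sample RMSE:", out_of_sample_error)
```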

5
Q

Cross validation

A

When there isn’t enough data for separate training and test sets, split the existing data into equal chunks.

Then rotate which chunk is held out for testing and train on the rest, e.g. round 1: train(1-4), test(5); round 2: train(2-5), test(1)
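
One way to sketch the chunk rotation in plain Python. The function name `k_fold_splits` is hypothetical, and it holds out chunks in order (first chunk first), which is just one possible rotation.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for each of k rounds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread any remainder over the first few chunks.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    chunks, start = [], 0
    for size in sizes:
        chunks.append(list(range(start, start + size)))
        start += size
    for held_out in range(k):
        test = chunks[held_out]
        train = [i for j, chunk in enumerate(chunks) if j != held_out for i in chunk]
        yield train, test

for round_number, (train, test) in enumerate(k_fold_splits(10, 5), start=1):
    print("round", round_number, "train", train, "test", test)
```

Each sample lands in the test set exactly once, so the per-round errors can be averaged into a single estimate.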

6
Q

Cross validation and financial data

A

Standard cross validation can train on data that comes after the test data, so the model “peeks” at future values; that makes it a poor fit for financial time series

7
Q

Roll forward cross validation

A

Cross validation where the training data ALWAYS comes before the test data in time
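
A sketch of one possible roll forward scheme, assuming the samples are already sorted by time. The function name `roll_forward_splits` is hypothetical, and the growing training window is an assumption; a fixed-size sliding window also satisfies the rule.

```python
def roll_forward_splits(n_samples, n_chunks):
    """Yield (train_indices, test_indices) with train strictly before test."""
    chunk = n_samples // n_chunks
    if n_chunks < 2 or chunk < 1:
        raise ValueError("need at least 2 chunks with 1 sample each")
    for i in range(1, n_chunks):
        train_end = i * chunk
        # The last test chunk absorbs any leftover samples.
        test_end = n_samples if i == n_chunks - 1 else train_end + chunk
        yield list(range(train_end)), list(range(train_end, test_end))

for train, test in roll_forward_splits(10, 5):
    print("train", train, "test", test)
```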

8
Q

Correlation

A

For a regression algorithm:
Look at relationship between predicted and actual values

Scatter plot actual and predicted values, and calc correlation coefficient

Correlation != slope (it measures how tightly the points follow a line, not how steep the line is)
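
A small sketch showing that correlation and slope differ, using the Pearson coefficient. The function name and the toy values are illustrative assumptions.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy values: every prediction is exactly half the actual value.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [0.5, 1.0, 1.5, 2.0]

# The points lie on a perfect line, so the correlation is 1,
# even though the slope of that line is 0.5.
print(pearson_correlation(actual, predicted))
```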

9
Q

Overfitting

A

When in sample error keeps decreasing while out of sample error increases
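
A rough way to observe this with a tiny 1-D KNN regressor, comparing both errors for several values of k. All names and the toy data are invented for illustration. The only behavior this sketch relies on is that with k = 1 each training point is its own nearest neighbor, so in sample error is zero while out of sample error is not.

```python
import math

def knn_predict(train_x, train_y, query_x, k):
    """Average the y values of the k training points nearest to query_x."""
    by_distance = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query_x))
    nearest = by_distance[:k]
    return sum(train_y[i] for i in nearest) / k

def rmse(actual, predicted):
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def knn_rmse(train_x, train_y, eval_x, eval_y, k):
    predictions = [knn_predict(train_x, train_y, x, k) for x in eval_x]
    return rmse(eval_y, predictions)

# Toy data, invented for this example.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 1.5, 1.8, 3.4, 3.9, 5.2]
test_x = [0.5, 2.5, 4.5]
test_y = [0.5, 2.5, 4.5]

for k in (1, 3, 6):
    in_sample = knn_rmse(train_x, train_y, train_x, train_y, k)
    out_of_sample = knn_rmse(train_x, train_y, test_x, test_y, k)
    print("k =", k, "in sample:", in_sample, "out of sample:", out_of_sample)
```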
