Framework for Prediction Flashcards

1
Q

What is the difference between original data and live data?

A

Original data is the data we already have and use to build the model
Live data is the data the model will be applied to, which we do not have yet

2
Q

Fill in the blanks

  1. If y is quantitative, then there is a \_\_\_\_ prediction and a \_\_\_\_ problem.
  2. If y is binary, then there is a \_\_\_\_\_ prediction and a \_\_\_\_\_ problem.
A
  1. quantitative prediction, regression problem
  2. probability prediction, classification problem
3
Q

What do we care less about in a predictive analysis? What do we still care about?

A
  • We care less about individual coefficient values (so multicollinearity is less of a concern)
  • We still care about the stability of the results
4
Q

What is the ideal prediction error?

A

Zero; we want the prediction to be as close to the actual value as possible.

5
Q

When evaluating prediction errors, how do we define direction and size?

A

Direction: whether we over- or underpredicted the value (the sign of the error).
Size: the absolute value of how far off the prediction was.

6
Q

Define prediction error.

A

The difference between the predicted value of the target variable and its actual value for the target observation.

PE = yhat - y
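As a minimal sketch (not part of the deck; the numbers are hypothetical), the formula above plus the direction/size distinction from the previous card:

```python
# Prediction error: PE = yhat - y (predicted minus actual).
def prediction_error(y_hat, y):
    return y_hat - y

# Hypothetical prediction of 12.0 for an actual value of 10.0.
pe = prediction_error(12.0, 10.0)

direction = "overprediction" if pe > 0 else "underprediction"  # sign of PE
size = abs(pe)                                                 # magnitude of PE

print(pe, direction, size)  # 2.0 overprediction 2.0
```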

7
Q

What are the three components of a prediction error?

A
  1. Estimation error: the difference between the estimated value and the true value
  2. Model error: the difference between the true value and the best predictor model (we might not have the best model)
  3. Genuine/idiosyncratic/irreducible error: remains even if estimation error is zero, because the predicted values can never match the actual values perfectly
8
Q

Give an example of estimation, model, and an irreducible error.

A

Estimation: You predict a drive from your house to the store will take 5 minutes. Due to traffic, it takes 11 minutes. The estimation error would be 6 minutes.
Model: In a model, you used age instead of age^2 for predicting.
Irreducible: Even though you have collected all the data you can, there are still differences between y and yhat.

9
Q

What are loss functions?

A

They evaluate the consequences of a prediction error, specifically how bad it is. A loss function can be symmetric, asymmetric, linear, or convex.

10
Q

What does it mean for a loss function to be symmetric? Asymmetric? Convex?

A

Symmetric: positive and negative errors of the same magnitude produce the same loss
Asymmetric: positive and negative errors of the same magnitude produce different losses (e.g., overprediction is penalized more than underprediction)
Convex: larger errors produce disproportionately larger losses
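A stylized sketch of the three shapes (not from the deck; the functional forms and the 2x overprediction weight are illustrative assumptions):

```python
# Three stylized loss functions of the prediction error e = yhat - y.

def symmetric_loss(e):
    """Absolute loss: +e and -e incur the same loss."""
    return abs(e)

def asymmetric_loss(e, over_weight=2.0):
    """Overprediction (e > 0) is penalized twice as heavily as an
    underprediction of the same magnitude (the weight is an assumption)."""
    return over_weight * e if e > 0 else -e

def convex_loss(e):
    """Squared loss: doubling the error quadruples the loss."""
    return e ** 2

print(symmetric_loss(3), symmetric_loss(-3))    # 3 3
print(asymmetric_loss(3), asymmetric_loss(-3))  # 6.0 3
print(convex_loss(2), convex_loss(4))           # 4 16
```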

11
Q

Define bias and variance for MSE.

A

Bias: the average of its prediction errors. A biased prediction produces nonzero errors on average.

Variance: how the prediction varies around its average value when multiple predictions are made.
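A simulation sketch (hypothetical numbers, not from the deck) showing that the mean squared error decomposes exactly into bias squared plus variance:

```python
import random

random.seed(0)

# True value of the target and many repeated predictions of it:
# the predictions are noisy (variance) and shifted upward by 1 (bias).
y_true = 10.0
predictions = [y_true + 1.0 + random.gauss(0, 2) for _ in range(100_000)]

errors = [p - y_true for p in predictions]          # PE = yhat - y
bias = sum(errors) / len(errors)                    # average prediction error
mean_pred = sum(predictions) / len(predictions)
variance = sum((p - mean_pred) ** 2 for p in predictions) / len(predictions)
mse = sum(e ** 2 for e in errors) / len(errors)

# MSE = bias^2 + variance holds as an exact identity.
print(round(bias, 2), round(variance, 2), round(mse, 2))
print(abs(mse - (bias ** 2 + variance)) < 1e-9)  # True
```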

12
Q

What is overfitting?

A

When a model has a better fit for the original data but a worse fit for the live data.

It is a key aspect of external validity and can make actual predictions worse. It is usually caused by including too many variables.

Ex. Adding extra variables that only slightly improve the model.
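An exaggerated illustration (not from the deck): a "model" that memorizes the original data fits it perfectly but does worse on a fresh draw from the same process. The data-generating process and nearest-point lookup are assumptions made for the sketch.

```python
import random

random.seed(1)

def draw_sample(n=30):
    """y = 2x + noise; the same process generates original and live data."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + random.gauss(0, 3)) for x in xs]

original = draw_sample()
live = draw_sample()

# Extreme overfit: memorize the original data, answer with the
# y of the nearest memorized x for any new point.
memory = dict(original)

def overfit_predict(x):
    nearest = min(memory, key=lambda m: abs(m - x))
    return memory[nearest]

def mse(data, predict):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

# Perfect fit on the original data, but worse on the live data.
print(mse(original, overfit_predict))  # 0.0
print(mse(live, overfit_predict))      # noticeably larger
```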

13
Q

What is underfitting?

A

When the model is a worse fit for both the original and the live data.

Ex. Using only one variable in a regression instead of five.

14
Q

What are some ways of finding the best fit?

A
  1. Using the AIC/BIC
  2. Using training and test samples (cross-validation)
15
Q

What is a downside to using BIC?

A

It penalizes more complex models even when they would predict better on the live data.

16
Q

T/F: The training-test split avoids overfitting the training set, but it may overfit the test set.

A

True

17
Q

Define k-fold cross-validation.

A

It repeats the training-test split k times, with each split called a fold, and the predictive performance of a model is evaluated across the folds.
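A minimal pure-Python sketch of the procedure (the dataset and the mean-of-y "model" are illustrative assumptions, not from the deck):

```python
import random

random.seed(0)

# Hypothetical dataset: y depends linearly on x plus noise.
data = [(x, 2 * x + random.gauss(0, 1)) for x in range(50)]
random.shuffle(data)

def k_fold_mse(data, k=5):
    """Average test-set MSE of a simple mean-of-y predictor across k folds."""
    fold_size = len(data) // k
    scores = []
    for i in range(k):
        # Each fold serves as the test set once; the rest is training data.
        test = data[i * fold_size:(i + 1) * fold_size]
        train = data[:i * fold_size] + data[(i + 1) * fold_size:]
        # "Model": predict the mean of y seen in the training folds.
        y_mean = sum(y for _, y in train) / len(train)
        mse = sum((y_mean - y) ** 2 for _, y in test) / len(test)
        scores.append(mse)
    # The goodness of fit is averaged across the k test folds.
    return sum(scores) / k

print(k_fold_mse(data))
```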

18
Q

T/F: Machine learning is an umbrella concept for methods that use algorithms to find patterns in data and use them for prediction purposes.

A

True

19
Q

Fill in the blank

To find the best fit using 5-fold cross-validation you average \_\_\_\_\_

A

The goodness of fit on the test set across the 5 folds

20
Q

How can adding more x variables and interactions lead to over- and underfitting?

A

This can overfit the model by adding interactions that are tailored to the original data but not present in the live data. Alternatively, the added variables may be insignificant in the regression (adding white noise), which leads the model to represent neither the original nor the live data accurately.

21
Q

Consider forecasting whether it will rain tomorrow. Is it likely that people will have the same loss function? Why or why not?

A

No, they will likely not have the same loss function. For example, someone who lives close to the beach will not lose as much as someone who lives in a landlocked state and is traveling for vacation. The person who lives close by may lose one good beach day out of the year, but the traveler loses more, as this might have been their only day to visit the beach.