Final Stuff Flashcards
(27 cards)
MSE and MAE formulas
MSE = (1/n) Σ (true − predicted)²; MAE = (1/n) Σ |true − predicted| — take true minus predicted, square it (MSE) or take the absolute value (MAE), sum over all examples, divide by n
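A quick worked sketch in NumPy (hypothetical arrays, just to show the arithmetic):

```python
import numpy as np

y_true = np.array([3.0, 5.0, 2.0, 7.0])   # hypothetical true values
y_pred = np.array([2.5, 5.0, 4.0, 8.0])   # hypothetical predictions

errors = y_true - y_pred                  # true - predicted
mse = np.mean(errors ** 2)                # square, sum, divide by n
mae = np.mean(np.abs(errors))             # absolute value, sum, divide by n
print(mse, mae)                           # 1.3125, 0.875
```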
advantages of MSE and MAE
MSE - differentiable, good for learning
MAE - result is interpretable (same units as the target), simple, less sensitive to outliers
accuracy
correct predictions / total predictions
sum of the confusion matrix diagonal / total count
recall/true positive rate
also called true positive rate
TP / (TP + FN)
out of all the positives, how many was the model able to get correct (recall)
down the first column of the confusion matrix (actual positives)
false positive rate
FP / (FP + TN)
down the second column (actual negatives)
precision
TP / (TP + FP)
how many predicted positives were truly positive?
across the first row (predicted positives)
how to set up confusion matrix
actual on top, predicted on the side
true positive (actual positive, predicted positive) in the top left
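A minimal sketch tying the metric cards above to this matrix layout (hypothetical counts):

```python
# Confusion matrix with actual on top (columns) and predicted on the side (rows),
# true positives in the top left, as on the card above.
TP, FP = 40, 10   # first row: predicted positive
FN, TN = 5, 45    # second row: predicted negative (hypothetical counts)

accuracy  = (TP + TN) / (TP + FP + FN + TN)   # diagonal / total
recall    = TP / (TP + FN)                    # down the first column (actual positives)
fpr       = FP / (FP + TN)                    # down the second column (actual negatives)
precision = TP / (TP + FP)                    # across the first row (predicted positives)
print(accuracy, recall, fpr, precision)       # 0.85, ~0.889, ~0.182, 0.8
```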
false negative
we predict negative, but that is false (actually positive)
false positive
we predict positive, but that is false (actually negative)
types of transformers
encoder-only, decoder-only, encoder-decoder
high level of how to train an LLM
pretraining - predict next token
supervised fine-tuning - train on prompts paired with good responses
reinforcement learning from human feedback (RLHF) - humans rate responses and the model is optimized toward the higher-rated ones
normalization vs regularization
normalization - rescaling inputs/features (or layer activations) so they are on the same scale
regularization - constrains the model so it doesn't overfit
types of normalization
batch norm, L2 norm
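A minimal scikit-learn sketch contrasting the two cards above (hypothetical X and y):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

X = np.random.rand(100, 3) * [1, 100, 1000]   # hypothetical features on very different scales
y = np.random.rand(100)

# Normalization: rescale the features themselves onto a common scale.
X_scaled = StandardScaler().fit_transform(X)

# Regularization: penalize large weights (an L2 penalty here) to reduce overfitting.
model = Ridge(alpha=1.0).fit(X_scaled, y)
print(model.coef_)
```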
what are support vectors in SVM
the training points closest to the hyperplane; they lie on the margin and are the only points that determine the decision boundary
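A small sketch with scikit-learn's SVC on made-up toy data, showing how to inspect the support vectors:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [1, 1], [2, 2], [8, 8], [9, 9], [10, 10]])  # hypothetical toy points
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear").fit(X, y)
print(clf.support_vectors_)   # the points closest to the hyperplane, which define the margin
```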
what is out of bag evaluation
evaluate each tree in a bagged ensemble on the training examples left out of its bootstrap sample
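A minimal sketch, assuming a random forest in scikit-learn (hypothetical dataset):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)  # hypothetical data

# Each tree trains on a bootstrap sample; oob_score_ scores each example
# using only the trees that never saw it during training.
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0).fit(X, y)
print(forest.oob_score_)
```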
what is calibration?
making sure the model's output probabilities reflect its true confidence (e.g., predictions made with 70% confidence are correct about 70% of the time)
2 examples of non parametric models
knn, decision trees
don’t make assumptions about the underlying distribution
what is LIME?
Local Interpretable Model-agnostic Explanations - fits a simple, interpretable model locally around a single data point to explain the complex model's prediction there
what are proxy models?
simpler models that behave similarly to a complex model and can stand in for it
why does the vanishing/exploding gradient problem occur?
gradients are multiplied layer by layer during backpropagation; multiplying many numbers smaller than 1 drives the gradient toward zero (vanishing), and multiplying many numbers larger than 1 blows it up (exploding)
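A tiny numeric illustration of why the product shrinks or blows up (made-up per-layer gradient magnitudes):

```python
small, large = 0.5, 1.5   # hypothetical per-layer gradient factors
layers = 50

print(small ** layers)    # ~8.9e-16 -> vanishing gradient
print(large ** layers)    # ~6.4e+08 -> exploding gradient
```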
types of autoencoders
deep (multiple layers), sparse, variational, etc.
what is the loss function for logistic regression?
binary cross-entropy (log loss): −(1/n) Σ [y log(p) + (1 − y) log(1 − p)]; it is strictly convex
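A minimal sketch of binary cross-entropy in NumPy (hypothetical labels and predicted probabilities):

```python
import numpy as np

y = np.array([1, 0, 1, 1])            # true labels (hypothetical)
p = np.array([0.9, 0.2, 0.7, 0.6])    # predicted probabilities (hypothetical)

bce = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(bce)   # ~0.299
```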
what are scaling laws?
take in model size, dataset size, and compute and try to predict the loss
how much will throwing resources at the model improve it?
three svm kernels
linear, polynomial, RBF