Final Stuff 2 Flashcards

(27 cards)

1
Q

what is LIME?

A

Local Interpretable Model-agnostic Explanations: a technique that fits a simple, interpretable model in the neighborhood of a single data point to explain a complex model's prediction there
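
As a rough illustration, here is a minimal sketch of the LIME idea in NumPy/scikit-learn (not the lime library's actual API): perturb the point, weight samples by proximity, and fit a weighted linear model. black_box is a hypothetical model standing in for the one being explained.

# Minimal sketch of the LIME idea: explain black_box at one point x0
import numpy as np
from sklearn.linear_model import LinearRegression

def black_box(X):                      # hypothetical complex model to explain
    return np.sin(X[:, 0]) + X[:, 1] ** 2

rng = np.random.default_rng(0)
x0 = np.array([1.0, 2.0])              # the data point to explain

X_pert = x0 + rng.normal(scale=0.1, size=(500, 2))   # perturb around x0
y_pert = black_box(X_pert)

# weight perturbations by closeness to x0 (RBF proximity kernel)
w = np.exp(-np.sum((X_pert - x0) ** 2, axis=1) / 0.05)

local = LinearRegression().fit(X_pert, y_pert, sample_weight=w)
print("local feature weights:", local.coef_)  # interpretable explanation at x0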

2
Q

why does the vanishing/exploding gradient problem occur?

A

backpropagation multiplies gradients layer by layer via the chain rule; multiplying many numbers smaller than 1 drives the gradient toward zero (vanishing), while multiplying many numbers larger than 1 blows it up (exploding)
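
A quick numeric illustration of why depth matters: the chain rule contributes one multiplicative factor per layer.

# Toy illustration: products of 50 per-layer gradient factors
import numpy as np

small = np.full(50, 0.5)   # factors < 1, e.g. saturated sigmoid derivatives
large = np.full(50, 1.5)   # factors > 1

print(np.prod(small))      # ~8.9e-16 -> gradient vanishes at early layers
print(np.prod(large))      # ~6.4e+08 -> gradient explodes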

3
Q

what is grokking?

A

when test error falls long after training error has fallen: after an extended stretch of apparent overfitting, the model suddenly generalizes well

4
Q

what are support vectors in SVM

A

the data points closest to the separating hyperplane; they alone determine the margin
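
For reference, scikit-learn exposes the fitted support vectors directly; a minimal sketch:

# After fitting an SVM, inspect which points ended up as support vectors
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=40, centers=2, random_state=0)
clf = SVC(kernel="linear").fit(X, y)

print(clf.support_vectors_)   # the points closest to the hyperplane
print(clf.n_support_)         # how many support vectors per class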

5
Q

how to set up confusion matrix

A

actual classes across the top (columns), predicted classes down the side (rows)
the positive/positive cell (true positives) in the top left
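
A sketch of this layout in NumPy. Note that sklearn.metrics.confusion_matrix uses the transposed convention (actual on the rows, and with 0/1 labels the true negatives land in the top left), so the matrix is built by hand here to match the card.

# Build a 2x2 confusion matrix in this deck's orientation
import numpy as np

y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))
fp = np.sum((y_pred == 1) & (y_true == 0))
fn = np.sum((y_pred == 0) & (y_true == 1))
tn = np.sum((y_pred == 0) & (y_true == 0))

#                actual +  actual -
cm = np.array([[tp,       fp],    # predicted +
               [fn,       tn]])   # predicted -
print(cm)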

6
Q

what is the loss function for logistic regression?

A

binary cross-entropy (log loss); it is convex in the weights, so there are no spurious local minima
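
A minimal NumPy sketch of the loss itself, on made-up probabilities:

# Binary cross-entropy for predicted probabilities p and labels y in {0, 1}
import numpy as np

def bce(y, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)           # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1, 0, 1, 1])
p = np.array([0.9, 0.2, 0.7, 0.4])
print(bce(y, p))   # lower is better; confident wrong answers cost the most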

7
Q

types of transformers

A

encoder-only, decoder-only, and encoder-decoder

8
Q

false positive rate

A

FP / (FP + TN)
out of all the actual negatives, how many did the model wrongly call positive?
second (actual-negative) column

9
Q

2 examples of non parametric models

A

kNN, decision trees
they don't assume a fixed functional form or underlying distribution for the data

10
Q

recall/true positive rate

A

also called the true positive rate
TP / (TP + FN)
out of all the actual positives, how many did the model get correct? (recall)
first (actual-positive) column

11
Q

false positive

A

we predict positive, but the prediction is false (the example is actually negative)

12
Q

advantages of MSE and MAE

A

MSE - differentiable everywhere, good for gradient-based learning
MAE - result is in the original units so it's interpretable, simple, more robust to outliers

13
Q

what is calibration?

A

making sure the model's output probabilities reflect its true confidence: among examples predicted with probability p, about a fraction p should actually be positive
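
One common check is a reliability curve; a sketch using scikit-learn's calibration_curve, with labels simulated here so the probabilities are perfectly calibrated by construction:

# Reliability check: per probability bin, compare predicted vs observed rates
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(0)
p = rng.uniform(size=2000)                     # predicted probabilities
y = (rng.uniform(size=2000) < p).astype(int)   # labels drawn to match p

frac_pos, mean_pred = calibration_curve(y, p, n_bins=5)
print(mean_pred)   # average predicted probability per bin
print(frac_pos)    # observed positive rate per bin (close -> well calibrated)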

14
Q

three svm kernels

A

linear, polynomial, RBF (radial basis function)
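
A quick sketch comparing the three via SVC's kernel parameter on a curved toy dataset:

# Fit the same data with each of the three kernels
from sklearn.datasets import make_moons
from sklearn.svm import SVC

X, y = make_moons(noise=0.2, random_state=0)
for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, clf.score(X, y))   # RBF typically fits the curved data best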

15
Q

what are sampling layers in cnns?

A

pooling layers, e.g. max pooling or average pooling, which downsample feature maps
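
A minimal NumPy sketch of 2x2 max pooling with stride 2:

# Max-pool a 4x4 feature map down to 2x2
import numpy as np

fmap = np.arange(16).reshape(4, 4)
pooled = fmap.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # each output cell is the max of one 2x2 block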

16
Q

types of autoencoders

A

deep (multiple layers), sparse, variational, etc.
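
A minimal sketch of a plain (undercomplete) autoencoder in PyTorch; the 784/128/32 layer sizes are arbitrary illustrative choices:

# Compress 784-dim inputs to 32 dims, then reconstruct them
import torch
from torch import nn

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(),
                                     nn.Linear(128, 32))
        self.decoder = nn.Sequential(nn.Linear(32, 128), nn.ReLU(),
                                     nn.Linear(128, 784))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
x = torch.randn(8, 784)                      # a dummy batch
loss = nn.functional.mse_loss(model(x), x)   # reconstruction loss
loss.backward()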

17
Q

precision

A

TP / (TP + FP)
out of all the predicted positives, how many were truly positive?
across the first (predicted-positive) row

18
Q

accuracy

A

correct predictions over total predictions
the diagonal of the confusion matrix over the total
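
The four rate cards above (false positive rate, recall, precision, accuracy) can be checked with a few lines of Python given the four counts; the counts here are made up for illustration:

# Compute all four rates from confusion-matrix counts
tp, fp, fn, tn = 40, 10, 5, 45

recall    = tp / (tp + fn)        # true positive rate
fpr       = fp / (fp + tn)        # false positive rate
precision = tp / (tp + fp)
accuracy  = (tp + tn) / (tp + fp + fn + tn)

print(recall, fpr, precision, accuracy)   # 0.888..., 0.1818..., 0.8, 0.85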

19
Q

what are scaling laws?

A

empirical power laws that take model size and data as inputs and predict loss
they answer: how much will throwing more resources at the model improve it?
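
A toy sketch of the power-law functional form commonly used; the coefficients below are purely illustrative, not fitted values from any paper:

# Hypothetical scaling law: loss falls as a power law in params and tokens
def predicted_loss(n_params, n_tokens,
                   a=400.0, b=400.0, alpha=0.3, beta=0.3, c=1.7):
    return a / n_params**alpha + b / n_tokens**beta + c

print(predicted_loss(1e9, 1e10))    # bigger model + more data -> lower loss
print(predicted_loss(1e10, 1e11))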

20
Q

MSE and MAE formulas

A

MSE = (1/n) Σ (yᵢ − ŷᵢ)²; MAE = (1/n) Σ |yᵢ − ŷᵢ|
take true minus predicted, square it (MSE) or take its absolute value (MAE), sum up, and divide by n
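
A worked NumPy example of both formulas on made-up values:

# MSE and MAE from the card's formulas
import numpy as np

y_true = np.array([3.0, 5.0, 2.0])
y_pred = np.array([2.5, 5.5, 4.0])

err = y_true - y_pred
mse = np.mean(err ** 2)        # (0.25 + 0.25 + 4.0) / 3 = 1.5
mae = np.mean(np.abs(err))     # (0.5 + 0.5 + 2.0) / 3 = 1.0
print(mse, mae)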

21
Q

normalization vs regularization

A

normalization - putting features (or activations) on the same scale
regularization - constraining the model, e.g. penalizing large weights, so it doesn't overfit
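
A sketch of the distinction in scikit-learn terms: StandardScaler normalizes the features, while Ridge's alpha applies an L2 regularization penalty to the weights. The data is made up.

# Normalization (feature scaling) vs regularization (L2 penalty)
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler

X = [[1.0, 2000.0], [2.0, 3000.0], [3.0, 1000.0]]
y = [1.0, 2.0, 3.0]

X_scaled = StandardScaler().fit_transform(X)   # normalization: same scale
model = Ridge(alpha=1.0).fit(X_scaled, y)      # regularization: L2 penalty
print(model.coef_)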

22
Q

false negative

A

we predict negative, but the prediction is false (the example is actually positive)

23
Q

types of normalization

A

batch norm, L2 norm

24
Q

what is out of bag evaluation

A

evaluate each tree on the bootstrap samples it was not trained on (its out-of-bag samples), giving a validation-style estimate without a held-out set
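
scikit-learn's random forest can report this directly; a minimal sketch:

# Out-of-bag score from a random forest
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, random_state=0)
rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=0).fit(X, y)
print(rf.oob_score_)   # accuracy on samples each tree never saw in training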

25
Q

high level of how to train an LLM

A

pretraining - predict the next token
supervised fine-tuning - train on prompts and good responses
reinforcement learning from human feedback - humans rate responses

26
Q

why use kernels in SVM?

A

they implicitly transform the data into a higher-dimensional space where it becomes linearly separable
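
A tiny NumPy illustration of the idea using an explicit feature map x → (x, x²) rather than an implicit kernel; the data and threshold are made up:

# 1-D data that isn't linearly separable becomes separable after lifting
import numpy as np

x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([1, 1, 0, 0, 1, 1])      # class 0 sits between the class-1 points

phi = np.column_stack([x, x ** 2])    # lift to 2-D with the map x -> (x, x^2)
# in (x, x^2) space, the line x^2 = 2.5 separates the classes perfectly
print(phi[:, 1] > 2.5)                # matches y exactly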

27
Q

what are proxy models?

A

simpler models that behave similarly to complex models