Test 2 Flashcards

1
Q

What is a state space search?

A

It is a process in which successive states are considered with the intention of finding the goal state.

2
Q

In a state space search, what are the variables and functions?

A

S: all possible states
A: all possible actions
Actions(s): the actions allowed when the state is s
Results(s, a): the state that results when action a is taken in state s
Costs(s, a): the cost of doing a in state s

3
Q

What are three examples of state space search algorithms?

A

Depth-first search
Breadth-first search
A* (heuristic) search

4
Q

In depth-first search, how does the algorithm proceed?

A

The root node of the search tree is selected first, then each branch is explored as deeply as possible, in order, before backtracking, until the goal state is found.

5
Q

breadth-first search

A

It explores all nodes at each depth level before moving on to the next level.

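A minimal sketch (not part of the original deck) contrasting the two searches on an invented toy graph; the node names and edges are assumptions for illustration only.

from collections import deque

graph = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["E"],
    "C": [], "D": ["G"], "E": ["G"], "G": [],
}

def dfs(start, goal):
    # LIFO stack: the most recently discovered node is expanded first.
    stack, visited = [start], set()
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node not in visited:
            visited.add(node)
            stack.extend(graph[node])
    return False

def bfs(start, goal):
    # FIFO queue: all nodes at one depth are expanded before the next depth.
    queue, visited = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return False

print(dfs("S", "G"), bfs("S", "G"))  # True True
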
6
Q

Pros and cons of depth-first search

A

Pro: low memory requirement.
Con: slow, and it may not find a solution (it can get stuck going down very deep or infinite branches).

7
Q

Pros and cons of breadth-first search

A

Pro: guarantees a solution if one exists.
Con: high memory cost.

8
Q

What is A* (heuristic) search?

A

It is an informed search algorithm that aims to minimize the total cost (and, in practice, memory and time) of getting from the start node to the goal node.

9
Q

In A* (heuristic) search the cost formula is f(n) = g(n) + h(n). What do g(n) and h(n) mean?

A

h(n) represents the heuristic function, in this case the estimated cost from n to the goal.

g(n) represents the cost of the path from the start node to n (the cost of the steps taken so far).

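A minimal sketch of A* under the f(n) = g(n) + h(n) rule from the card above; the graph, step costs and heuristic values are invented for illustration.

import heapq

graph = {              # node -> [(neighbour, step cost), ...]
    "S": [("A", 1), ("B", 4)],
    "A": [("B", 2), ("G", 5)],
    "B": [("G", 1)],
    "G": [],
}
h = {"S": 3, "A": 2, "B": 1, "G": 0}   # heuristic: estimated cost to the goal

def a_star(start, goal):
    frontier = [(h[start], 0, start)]   # (f, g, node); lowest f expanded first
    best_g = {start: 0}
    while frontier:
        f, g, node = heapq.heappop(frontier)
        if node == goal:
            return g                    # cost of the cheapest path found
        for nxt, cost in graph[node]:
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h[nxt], new_g, nxt))
    return None

print(a_star("S", "G"))  # 4  (S -> A -> B -> G)
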
10
Q

In machine learning, when the classes are unknown, what type of classification is used?

A

clustering

11
Q

Check distance formulas

A

Check them, e.g. Euclidean distance d(x, y) = sqrt(sum_i (x_i - y_i)^2) and Manhattan distance d(x, y) = sum_i |x_i - y_i|.

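A minimal sketch of the two most common distance formulas referred to above (Euclidean and Manhattan); the example points are arbitrary.

import math

def euclidean(x, y):
    # sqrt of the sum of squared coordinate differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # sum of absolute coordinate differences
    return sum(abs(a - b) for a, b in zip(x, y))

print(euclidean([0, 0], [3, 4]))   # 5.0
print(manhattan([0, 0], [3, 4]))   # 7
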
12
Q

What is cross-validation?

A

A way of training and testing your classification method in machine learning by splitting the data into training and testing subsets.

13
Q

What are the four types of cross-validation?

A

Leave-one-out
Bootstrap
n-fold
Split test

14
Q

Quickly describe the cross-validation methods.

A

Split test: half the set is used for training, the rest for testing.

Bootstrap: random data points are selected (with replacement) to build the training and testing sets.

n-fold: the data is split into successive folds, and each fold in turn is used for testing against the rest.

Leave-one-out: a single data point is used for testing and the rest for training, repeated for each point.

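A minimal sketch of two of the cross-validation schemes above, assuming scikit-learn is available; the k-NN classifier and the iris dataset are placeholders, not part of the deck.

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
clf = KNeighborsClassifier()

# n-fold: each of the 5 folds is held out for testing against the rest.
nfold_scores = cross_val_score(clf, X, y, cv=KFold(n_splits=5, shuffle=True))

# leave-one-out: a single data point is used for testing in each round.
loo_scores = cross_val_score(clf, X, y, cv=LeaveOneOut())

print(nfold_scores.mean(), loo_scores.mean())
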
15
Q

In a confusion matrix, which axis must sum to 100%?

A

vertical

16
Q

Formula for the true positive rate

A

TPR = TP/P = 1-FNR

17
Q

True negative rate formula

A

TNR = TN/N

18
Q

The positive predictive value (PPV) formula

A

PPV = TP/PP = 1- FDR

FDR - False discovery rate

19
Q

F1 score formula

A

F1 = 2 * PPV * TPR / (PPV + TPR)

20
Q

Accuracy

A

Acc = (TP + TN) / (P + N)

21
Q

What is PP ?

A

The total number of instances predicted positive: TP + FP.

22
Q

What is P?

A

The total number of actual positives: TP + FN.

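A minimal sketch tying the rate formulas from the previous cards to the raw confusion-matrix counts; the counts used in the example are invented.

def rates(TP, FP, TN, FN):
    P, N, PP = TP + FN, TN + FP, TP + FP
    TPR = TP / P                      # true positive rate (recall)
    TNR = TN / N                      # true negative rate
    PPV = TP / PP                     # positive predictive value (precision)
    ACC = (TP + TN) / (P + N)         # accuracy
    F1 = 2 * PPV * TPR / (PPV + TPR)  # F1 score
    return TPR, TNR, PPV, ACC, F1

print(rates(TP=40, FP=10, TN=45, FN=5))
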
23
Q

What is the complete machine learning system (pipeline)?

A

sample → extraction(1) → classifier(1) → evaluation(1, 2) → decision

1 - learning
2 - reporting

24
Q

There are two types of models in machine learning. What are they? Describe them.

A

Discriminative and generative models.

Discriminative: they focus on the distinction among classes, learning decision boundaries (e.g., k-NN, SVM, logistic regression).

Generative: they model how the data is distributed throughout the feature space, focusing on the characteristics of each class under an assumed model.

25
Q

What is gradient descent and what is its goal?

A

Gradient descent is an iterative optimization algorithm: it takes steps along the negative gradient (derivative) of a loss function, typically the MSE, with the goal of finding its minimum.

26
Q

Name and explain the different gradient descent variants.

A

Stochastic GD: one sample is chosen at random at each step. It is fast and good for redundant data.

Batch GD: all samples are used in each iteration.

Mini-batch GD: a subset of the samples is used in each iteration.

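A minimal sketch of batch vs. stochastic gradient descent for a 1-D linear regression on the MSE; the toy data, learning rates and epoch counts are assumptions.

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, 100)
y = 3.0 * X + 1.0 + rng.normal(0, 0.1, 100)

def batch_gd(lr=0.1, epochs=200):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        err = (w * X + b) - y               # uses ALL samples per iteration
        w -= lr * 2 * np.mean(err * X)
        b -= lr * 2 * np.mean(err)
    return w, b

def stochastic_gd(lr=0.05, epochs=20):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):   # ONE random sample per step
            err = (w * X[i] + b) - y[i]
            w -= lr * 2 * err * X[i]
            b -= lr * 2 * err
    return w, b

print(batch_gd())       # approximately (3.0, 1.0)
print(stochastic_gd())  # approximately (3.0, 1.0)
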
27
Q

How does the step size (learning rate) affect linear regression fitting?

A

See the images on step size: a step size that is too small makes convergence very slow, while one that is too large makes the updates overshoot the minimum, so the loss may oscillate or diverge.

28
Q

Name the regularized linear models:

A

Ridge regression and Lasso.

29
Q

How does ridge regression work?

A

A linear regression model depends entirely on the training data, but the training set is not perfectly representative of the underlying samples. Ridge regression therefore adds a small amount of bias (a penalty) to the loss function, so the fitted line does not align with the training set perfectly, deliberately making the model slightly worse during training. Over successive fits this generalizes better to the testing data than the plain fit. It is a way to avoid overfitting: we do not want the training MSE to be zero all the time.

30
Q

In ridge regression, what is the purpose of the regularization parameter (often written lambda), what does it do, and how does increasing it affect the slope?

A

It makes sure that the MSE term and the slope term are on the same scale, and it controls the severity of the penalty added to the MSE. Increasing it decreases the slope (shrinks the coefficients toward zero).

31
Q

Describe Lasso.

A

Lasso: least absolute shrinkage and selection operator regression. The penalty is not the square but the absolute value of the coefficients. It is very similar to ridge, but it can shrink a slope all the way to 0.

32
Q

In Lasso, why is a zero slope useful?

A

To erase the contribution of useless parameters from the determination of y (in effect, automatic feature selection).

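A minimal sketch, assuming scikit-learn, showing the point of the last two cards: Lasso can shrink a useless feature's slope exactly to zero, while Ridge only reduces it; the data and alpha values are invented.

import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, 100)   # feature 1 is useless

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

print(ridge.coef_)   # both coefficients shrunk, neither exactly 0
print(lasso.coef_)   # the useless coefficient is typically exactly 0.0
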
33
Q

What is early stopping?

A

It is an ML technique used to avoid overfitting and underfitting: it finds the balance between them by determining the number of training steps to take. A loss vs. iteration plot is usually created. The loss on the training set always decreases with the number of iterations, so at each step the model should also be evaluated on the testing data; when the loss on the testing dataset starts to increase, training should stop.

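A minimal sketch of the early-stopping rule described above; the loss curves are synthetic and the patience value is an assumption, just to show the stopping logic.

train_loss = [1 / (i + 1) for i in range(50)]               # always decreasing
val_loss = [(i - 20) ** 2 / 400 + 0.1 for i in range(50)]   # U-shaped held-out loss

best, best_iter, patience = float("inf"), 0, 5
for i, loss in enumerate(val_loss):
    if loss < best:
        best, best_iter = loss, i                # held-out loss still improving
    elif i - best_iter >= patience:              # no improvement for `patience` steps
        break
print("stop training at iteration", best_iter)   # 20, the validation minimum
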
34
Q

Logistic regression

A

It predicts whether something is true or false (0 or 1); the cost function takes both options into account.

35
Q

Is classification the same as regression?

A

No, it is not: regression predicts a continuous value, while classification predicts a discrete class.

36
Q

In a decision tree (CART), what are the splitting criteria (impurity measures)?

A

Gini and entropy.

37
Q

Gini formula

A

Gini = 1 - sum_i(p_i^2)  (one minus the sum of the squared class probabilities)

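A minimal sketch of the Gini impurity formula applied to the class labels of one tree node; the labels are made up.

from collections import Counter

def gini(labels):
    # 1 - sum of squared class probabilities in this node
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["a", "a", "b", "b"]))  # 0.5  (maximally mixed, two classes)
print(gini(["a", "a", "a", "a"]))  # 0.0  (pure node)
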
38
Q

Loss in a decision tree (split cost) formula

A

Go and check it: the cost of a split is the weighted average of the child-node impurities, e.g. for a binary split L = (n_left/n)*G_left + (n_right/n)*G_right.

39
Q

In classification you have:

A

Decision trees, k-NN, Voronoi, SVM.

40
Q

What is the kernel trick in SVM?

A

It is a trick used when the data cannot be divided by a straight line (hyperplane) in the original space: the data is mapped into another (higher-dimensional) space where it can be divided, and the boundary is then mapped back.

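A minimal sketch, assuming scikit-learn, of the kernel trick in practice: an RBF-kernel SVM separates data with a circular class boundary that no straight line can separate; the data is invented.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (np.linalg.norm(X, axis=1) < 1.0).astype(int)   # circular class boundary

linear_svm = SVC(kernel="linear").fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)                # implicit higher-dimensional mapping

print(linear_svm.score(X, y))  # poor: no separating line exists
print(rbf_svm.score(X, y))     # close to 1.0
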
41
Q

What does parametric mean?

A

The number of parameters is fixed in order to predict the class (it does not grow with the amount of data).

42
Q

Parametric, compared to non-parametric, is:

A

Simpler, faster, and needs less data, but it is constrained and can give poor fits.

43
Q

What does generative mean in the parametric setting?

A

The a priori class probabilities (class-conditional densities) are Gaussian, and the number of parameters is fixed.

44
Q

What are Parzen windows, and what do they use?

A

A generative ML method in which the distribution of the data is modelled with Gaussians placed on the data points. It uses Silverman's rule to determine the width of the Gaussians.

45
Q

Describe the Bayes classifier.

A

A classification process based on maximizing the number of correct classifications. It is based on the theorem p(y|x)*p(x) = p(x|y)*p(y).

46
Q

What is the curse of dimensionality?

A

As the number of dimensions increases, the amount of data needed increases exponentially.

47
Q

Naive Bayes: explain it.

A

It is like the Bayes classifier, but the features are treated as independent; that is, the likelihood of a set of features is given by the product of the individual feature likelihoods: p(x|class) = p(x1|class)*p(x2|class)*...

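A minimal sketch of the naive Bayes idea using scikit-learn's GaussianNB (Gaussian per-feature likelihoods are an assumption of that estimator); the iris dataset is just a placeholder.

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB().fit(X, y)          # learns p(x_i | class) for each feature separately
print(clf.predict(X[:3]))
print(clf.predict_proba(X[:3]).round(2))
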
48
Q

In Bayes and naive Bayes the goal is to:

A

Maximize p(y|x).

49
Q

Combining classifiers: what are the types?

A

Combining the predicted labels (ŷ), combining the predicted probabilities p(y|x), and ensembles.

50
Q

In ensembles, what types exist?

A

Bagging (pasting), random forest, boosting, and stacking.

51
Q

Explain bagging (pasting).

A

The dataset is divided into subsets (sampled with replacement for bagging, without replacement for pasting) and a classifier is trained on each subset.

52
Q

Explain boosting.

A

The samples that were misclassified are selected, and the next round of classifiers is boosted (weighted) with those samples.

53
Q

Stacking

A

Train a classifier on the predictions of several classifiers.

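A minimal sketch, assuming scikit-learn, that runs the four ensemble types named above on a toy dataset; the base estimators and the iris dataset are placeholders.

from sklearn.datasets import load_iris
from sklearn.ensemble import (BaggingClassifier, RandomForestClassifier,
                              AdaBoostClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
models = {
    "bagging": BaggingClassifier(DecisionTreeClassifier()),
    "random forest": RandomForestClassifier(),
    "boosting": AdaBoostClassifier(),
    "stacking": StackingClassifier(
        estimators=[("tree", DecisionTreeClassifier()),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000)),
}
for name, model in models.items():
    print(name, model.fit(X, y).score(X, y))
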
54
Q

Unsupervised learning is:

A

Learning where no known classifications (labels) are available; the data itself is the only information.

55
Q

Unsupervised learning often has to reduce the number of features. This is done via:

A

Filter methods, wrapper methods, and embedded methods.

56
Q

In unsupervised learning, describe filter methods:

A

They filter out the features that are not distinctive (e.g., if two features are highly correlated, one of them is excluded).

57
Q

In unsupervised learning, describe the wrapper method.

A

It checks how each feature relates to the data (not the correlation between two features, as in the filter method). If the data on a feature is very widespread, then that feature is not good. The features are ranked.

58
Q

Embedded methods are:

A

There are two modes. In the forward mode, each feature is tested with the classifier and the best one is added to a list; the classifier is then run again with that feature combined with each of the remaining features, and so on, until the error no longer decreases.

59
Q

Clustering: what is it and how is it done?

A

The formation of groups through the adjustment of the groups' centroids over many iterations; commonly done with k-means.

60
Q

Describe k-means.

A

Minimization of the distance of the points to the centroids, which change with the iterations.

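A minimal sketch of the k-means loop described above (assign points to the nearest centroid, then move each centroid to the mean of its points); the data and the number of iterations are invented.

import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])

k = 2
centroids = X[rng.choice(len(X), k, replace=False)]   # random initialization
for _ in range(10):
    # assignment step: each point goes to its closest centroid
    labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None], axis=2), axis=1)
    # update step: each centroid moves to the mean of its assigned points
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])

print(centroids)   # close to (0, 0) and (3, 3)
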
61
Q

Disadvantages of k-means

A

The centroid positions can depend on the random initial placement, and the number of clusters selected at the start can turn out to be a bad choice.

62
Q

Since the centroid values can vary with the initialization, what solution exists?

A

Hierarchical clustering.

63
Q

Describe hierarchical clustering.

A

The nested "bubble" diagram: clusters are built by successively merging the closest points/clusters (or splitting them), producing a hierarchy of nested groups (a dendrogram).

64
Q

What are Gaussian mixture models?

A

A Gaussian mixture model (GMM) is a probabilistic model that assumes that the instances were generated from a mixture of several Gaussian distributions whose parameters are unknown.

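A minimal sketch, assuming scikit-learn, of fitting a Gaussian mixture model and reading back the learned means and mixing weights; the data is invented.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(5, 1, (100, 2))])

gmm = GaussianMixture(n_components=2).fit(X)
print(gmm.means_.round(1))    # approximately [0, 0] and [5, 5]
print(gmm.weights_.round(2))  # approximately [0.5, 0.5]
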
65
Q

What is stratified sampling?

A

It forces the sampled datasets (e.g., the train/test splits) to be as imbalanced as the data itself, i.e., each split keeps the class proportions of the full dataset.

66
Q

XAI (explainable AI)

A

The computer explains the reasons behind its decisions.

67
Q

Active learning

A

The computer selects the data and the user classifies (labels) it.

68
Q

Reinforcement learning

A

The ML model is trained with the new data it generates as it interacts with the environment, guided by rewards.