Flashcards in ML-Midterm Deck (37):

1

## What is Machine Learning?

### Machine learning is about building a model of the data based on a specific hypothesis function, ideally in the form of a density function that describes the data.

2

## What is supervised learning and unsupervised learning ?

### Supervised learning works on labeled data (each example comes with a target value); unsupervised learning works on unlabeled data and must find structure on its own.

3

## What is a model ?

###
A model is an approximation of a system that can be used to predict the system's behaviour.

4

## What is gradient descent ?

### Gradient descent iteratively updates the parameters in the direction of the negative gradient of the loss; it can find parameters that minimize the loss on the training data.
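
A minimal sketch of gradient descent on a squared-error loss for a one-feature linear model (the data and names are illustrative):

```python
import numpy as np

# Fit y = w*x + b by plain gradient descent on the mean squared error.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 50)
y = 3.0 * x + 0.5          # noiseless line, for simplicity

w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)   # d(loss)/dw
    grad_b = 2 * np.mean(pred - y)         # d(loss)/db
    w -= lr * grad_w                       # step against the gradient
    b -= lr * grad_b

print(round(w, 2), round(b, 2))            # close to 3.0 and 0.5
```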

5

## Explain the k-fold cross-validation procedure. What is it used for ?

###
Procedure: divide the data into k equal folds; choose one fold as the validation set and the remaining k-1 folds as the training set, then rotate which fold serves as the validation set and repeat the process k times.

It is used to validate the accuracy of a trained model while making full use of the data.
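
The procedure above can be sketched as follows (illustrative: the "model" is just the training mean, and the score is mean squared error):

```python
import numpy as np

def k_fold_mse(y, k=5):
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)           # k (nearly) equal folds
    scores = []
    for i in range(k):
        val = folds[i]                       # fold i is the validation set
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = y[train].mean()              # "train" the trivial model
        scores.append(np.mean((y[val] - model) ** 2))
    return np.mean(scores)                   # average validation error

y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
print(k_fold_mse(y, k=5))
```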

6

## What are the training set, testing set and validation set ?

###
Training Set: this data set is used to adjust the weights of the model (e.g. of a neural network).

Validation Set: this data set is used to detect overfitting during training; you can use this information to tune your hyperparameters.

Testing Set: this data set is used only for testing the final solution, in order to confirm the actual predictive power of the model.

7

## What is a hyperparameter ?

### A parameter of the training algorithm itself, not learned from the data, such as the learning rate, momentum, or maximum number of iterations.

8

## What is information leakage or information contamination ?

### Test data leaks into training: you use the test data (directly or indirectly) to train or tune the model and then also use it to evaluate the model, which gives an overly optimistic performance estimate.

9

## What is a support vector machine ?

### A maximum-margin classifier: it finds the separating hyperplane with the largest distance (margin) to the nearest training points.

10

## What is the hyperplane in an SVM ?

### The hyperplane is the dividing or separating boundary between the two classes.

11

## Why is the SVM called a support vector machine ?

### Because the separating hyperplane is determined only by the training points that lie closest to it, on the margin; these points "support" the hyperplane and are called support vectors.

12

## What is a soft margin classifier ?

### A classifier that allows some data points to violate the margin or even be misclassified, trading margin width against classification errors; this handles overlapping classes.

13

## Why use a kernel ? What is the kernel trick ?

###
A kernel implicitly transforms the data into a higher-dimensional feature space, where it can become linearly separable.

The kernel trick avoids explicitly computing coordinates in that feature space: the kernel function returns the dot product in the feature space directly from the original inputs.
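
A minimal numeric check of the kernel trick, assuming the polynomial kernel K(x, z) = (x·z)² in two dimensions, whose explicit feature map is phi(x) = (x1², √2·x1·x2, x2²):

```python
import numpy as np

def phi(x):
    # explicit feature map into 3D (never needed by the kernel itself)
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

def K(x, z):
    # kernel computed entirely in the original 2D space
    return np.dot(x, z) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])
print(K(x, z), np.dot(phi(x), phi(z)))   # the two values agree
```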

14

## What is the purpose of regularization ? Give an example and explain why it helps against overfitting.

###
It prevents overfitting by controlling the complexity of the model.

Example: in ridge regression we add an L2 penalty on the weights.

The penalty shrinks the weights, in particular those of features with little influence on the prediction, so the model cannot fit noise in the training data as easily.

15

## What is batch gradient descent ?

### Batch gradient descent uses all of the training data to compute each gradient update.

16

## What is stochastic gradient descent ?

### Stochastic gradient descent updates the parameters using only a single (randomly chosen) training example at a time.
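
The difference between the two update rules can be sketched side by side (illustrative one-parameter model y = w·x):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = 2.0 * x                                            # true weight is 2.0

def batch_step(w, lr=0.1):
    return w - lr * 2 * np.mean((w * x - y) * x)       # gradient over ALL data

def sgd_step(w, lr=0.1):
    i = rng.integers(len(x))                           # pick ONE example
    return w - lr * 2 * (w * x[i] - y[i]) * x[i]       # gradient of that example

w_batch = w_sgd = 0.0
for _ in range(1000):
    w_batch = batch_step(w_batch)
    w_sgd = sgd_step(w_sgd)
print(round(w_batch, 2), round(w_sgd, 2))              # both approach 2.0
```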

17

## How does momentum help in linear regression ?

### Momentum accumulates past gradient directions, which speeds up convergence and helps the optimizer overcome local minima and flat regions.

18

## What is ridge regression ?

### Linear regression with an L2 regularization term on the weights added as a penalty; when updating the weights this acts as weight decay, shrinking them toward zero.
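
A sketch of the closed-form ridge solution w = (XᵀX + λI)⁻¹ Xᵀy, showing the shrinking effect of the penalty (the name `lam` and the data are illustrative):

```python
import numpy as np

def ridge(X, y, lam):
    d = X.shape[1]
    # closed-form ridge solution: (X^T X + lam*I) w = X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5])        # noiseless targets

w_small = ridge(X, y, lam=1e-6)           # almost ordinary least squares
w_large = ridge(X, y, lam=100.0)          # heavily shrunk weights
print(np.linalg.norm(w_small) > np.linalg.norm(w_large))   # True
```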

19

## What is a random variable ?

###
A variable whose value is different every time it is observed.

Its values follow a specific probability density function.

20

## What are uniform, bimodal and multimodal distributions ?

### A uniform distribution has constant density (no peak); a bimodal distribution has two peaks; a multimodal distribution has multiple peaks.

21

## What is Bayes' theorem ?

###
p(x|y) = p(y|x) p(x) / p(y)

p(x): the prior knowledge

p(y|x): the likelihood of the observed evidence y

p(y): the evidence (normalization)

p(x|y): the posterior distribution
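
A numeric sketch with invented numbers (a diagnostic-test example) showing the roles of prior, likelihood, evidence and posterior:

```python
p_sick = 0.01                  # prior p(x)
p_pos_given_sick = 0.99        # likelihood p(y|x)
p_pos_given_healthy = 0.05     # false-positive rate

# evidence p(y), via the law of total probability
p_pos = p_pos_given_sick * p_sick + p_pos_given_healthy * (1 - p_sick)

# posterior p(x|y) by Bayes' theorem
p_sick_given_pos = p_pos_given_sick * p_sick / p_pos
print(round(p_sick_given_pos, 3))   # only about 0.167, despite the 99% test
```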

22

## What is the maximum likelihood principle ?

### Given a parameterized hypothesis function p(y|x; w), we choose as parameters w the values that make the observed data y most likely under this assumption.

23

## Likelihood function ? Maximum (log) likelihood ?

### For a Gaussian noise model, maximizing the log-likelihood function is equivalent to minimizing a quadratic error term.
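
Assuming the Gaussian noise model y_i = h(x_i; w) + ε with ε ~ N(0, σ²), the equivalence can be made explicit:

```latex
\log p(y \mid x; w)
  = \sum_{i=1}^{n} \log \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left(-\frac{(y_i - h(x_i; w))^2}{2\sigma^2}\right)
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \bigl(y_i - h(x_i; w)\bigr)^2
```

Since the first term does not depend on w, maximizing the log-likelihood over w is the same as minimizing the quadratic error term.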

24

## Why is LMS regression equivalent to MLE for Gaussian data ?

### Because the mean depends linearly on the input and the variance is constant; under these assumptions the Gaussian log-likelihood reduces (up to constants) to the least-mean-squares error.

25

## What kind of estimate do MLE and MAP give us ?

### A point estimate.

26

## What is the difference between MLE and MAP ? What are they used for ? How are they related ?

###
MAP maximizes the posterior; MLE maximizes the likelihood. They are connected through Bayes' rule.

Both are used to obtain a point estimate of an unobserved quantity based on training data.

They are related: if the prior is constant (a uniform prior), MAP = MLE.

27

## What kind of estimate do MLE and MAP give us ?

### A point estimate of a distribution's parameters.

28

## What is a generative model ?

### A generative model models the distribution of each class, so it can 'generate' examples of the class objects.

29

## What is a generative model ? What is it used for ?

### A generative model can 'generate' examples of the class objects; combined with Bayes' rule it can be used to solve classification tasks.

30

## What is a discriminative model ?

### A model that directly discriminates between classes based on the feature values, modeling the decision boundary rather than the class distributions.

31

## What is k-means clustering ? What is k-medoids ?

### k-means represents each cluster by the mean of its points; k-medoids represents each cluster by its most central actual data point (the medoid).
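
A minimal 1D k-means sketch (names are illustrative; empty clusters are not handled):

```python
import numpy as np

def kmeans(data, k=2, iters=20):
    centers = data[:k].astype(float)     # naive init: first k points
    for _ in range(iters):
        # assignment step: each point joins its nearest center
        labels = np.argmin(np.abs(data[:, None] - centers[None, :]), axis=1)
        # update step: each center becomes the mean of its points
        centers = np.array([data[labels == j].mean() for j in range(k)])
    return centers, labels

data = np.array([0.0, 0.1, 0.2, 9.0, 9.1, 9.2])
centers, labels = kmeans(data)
print(sorted(centers))                   # roughly [0.1, 9.1]
```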

32

## What is a Gaussian mixture model ?

### The data are modeled as a mixture of k Gaussian components, where each point's component is chosen randomly from a multinomial distribution.

33

## What is the expectation-maximization (EM) algorithm ?

###
E-step: estimate the (soft) labels of the data based on the currently assumed distribution, i.e. compute the expected training labels under the current model (expectation step).

M-step: update the parameters of the model to maximize the probability of the observations given those labels (maximization step).

The two steps are repeated until convergence.
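
A sketch of EM for a 1D mixture of two Gaussians, simplified by fixing the variances to 1 and the mixing weights to be equal, so that only the means are updated (all names and data are invented):

```python
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(6.0, 1.0, 200)])

mu = np.array([-1.0, 1.0])                  # initial guesses for the means
for _ in range(50):
    # E-step: responsibility of each component for each point
    dens = np.exp(-0.5 * (data[:, None] - mu[None, :]) ** 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate each mean as a responsibility-weighted average
    mu = (resp * data[:, None]).sum(axis=0) / resp.sum(axis=0)

print(np.sort(mu))                          # roughly [0.0, 6.0]
```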

34

## What is a causal model ?

### A model that represents cause-and-effect relationships between variables, e.g. a directed graphical model whose edges point from causes to effects, so that interventions on a cause change the distribution of its effects.

35

## What is naive Bayes good for ? And why ?

### Text classification. It is efficient because, thanks to the naive independence assumption, each word's probability is estimated separately, and words not in the dictionary can simply be skipped.

36

## Graphical representation of naive Bayes ?

###
class
/ | \
x1 x2 x3

The class is the parent node, and the features x1, x2, x3 are conditionally independent given the class.
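
A toy naive Bayes text classifier matching this structure (data and names are invented; Laplace smoothing keeps counts positive, and words outside the vocabulary are skipped):

```python
import math

docs = [("spam", ["buy", "cheap", "pills"]),
        ("spam", ["cheap", "offer"]),
        ("ham",  ["meeting", "tomorrow"]),
        ("ham",  ["project", "meeting"])]
vocab = sorted({w for _, ws in docs for w in ws})

def train(docs):
    # Laplace smoothing: every count starts at 1
    counts = {"spam": dict.fromkeys(vocab, 1),
              "ham":  dict.fromkeys(vocab, 1)}
    for label, words in docs:
        for w in words:
            counts[label][w] += 1
    return counts

def classify(words, counts):
    scores = {}
    for label, wc in counts.items():
        total = sum(wc.values())
        # sum of log p(word|class); uniform class prior, unknown words skipped
        scores[label] = sum(math.log(wc[w] / total) for w in words if w in wc)
    return max(scores, key=scores.get)

counts = train(docs)
print(classify(["cheap", "pills"], counts))   # → spam
```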

37