General Questions Flashcards
(23 cards)
Supervised Learning
Training on data labeled with input/output pairs.
Unsupervised Learning
Training on data without labels (e.g. clustering).
Reinforcement Learning
An agent interacts with an environment, performs actions, and receives rewards for those actions.
Semi-supervised Learning
Learning from a small amount of labeled data while using unlabeled data to improve performance.
Machine Learning
Machine learning is a method of data analysis that automates analytical model building. It is a branch of artificial intelligence.
Difference between discrete and continuous random variables?
Discrete random variables take on a countable number of distinct values.
Example: coin flip
Continuous random variables take on uncountably many values within a given range.
Example: speed of a car
What is Bayes rule?
Bayes rule is a theorem that describes how to update the probabilities of hypotheses given evidence: P(H|E) = P(E|H) P(H) / P(E).
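A minimal sketch of such an update, assuming a hypothetical binary test with made-up numbers (1% prior, 90% true-positive rate, 5% false-positive rate); the function name and values are illustrative only:

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H|E) via Bayes rule for a binary hypothesis H."""
    # Total probability of the evidence:
    # P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: the evidence raises the probability of H
# from the 1% prior to roughly 15%.
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
```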
What is regression?
A machine learning technique to model the relationship between variables.
The model is expressed as a hypothesis function.
What is a loss function?
A measure a learning algorithm uses to determine whether a parameter set theta is good or bad.
Example: Residual sum of squares = 0.5*sum((y-y_pred)**2)
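As a small sketch, the residual sum of squares can be written as a plain Python function; the sample values below are made up to contrast a good and a bad prediction:

```python
def rss_loss(y, y_pred):
    """Half the residual sum of squares over all samples."""
    return 0.5 * sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))

y = [1.0, 2.0, 3.0]
good = rss_loss(y, [1.0, 2.0, 3.0])  # perfect predictions, loss is zero
bad = rss_loss(y, [0.0, 0.0, 0.0])   # poor predictions, loss is larger
```

A lower value means the parameters behind y_pred fit the data better.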
What is overfitting?
When the model fits the training data (almost) perfectly but does not generalize to unseen data.
Difference between normal equation and gradient descent in linear regression
The normal equation computes the exact minimum of the loss in closed form using calculus and matrix algebra.
Gradient descent approaches the minimum of the loss function iteratively. It only approximates the normal-equation solution, but gets arbitrarily close given a suitable learning rate and enough iterations.
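The contrast can be sketched for the simplest case of one feature plus an intercept. The toy data, learning rate, and iteration count are illustrative assumptions; the closed form used here is what the normal equation reduces to for a single feature:

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # toy data lying exactly on y = 2x + 1
n = len(xs)

# Closed-form solution (normal equation for one feature plus intercept)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
w_exact = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
           / sum((x - mean_x) ** 2 for x in xs))
b_exact = mean_y - w_exact * mean_x

# Gradient descent on half the mean squared error
w, b = 0.0, 0.0
lr = 0.05  # assumed learning rate, small enough to converge on this data
for _ in range(5000):
    grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
    w -= lr * grad_w
    b -= lr * grad_b
```

On this data both approaches end up at essentially the same parameters; gradient descent just needs many small steps to get there.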
Log Trick
In a likelihood function we multiply the probabilities of all samples. This leads to many multiplications of small numbers.
Instead of multiplying, we add. To do this we compute the log likelihood, because the logarithm turns products into sums.
For likelihoods with an exponential term (e.g. Gaussian), the logarithm also cancels the exponential function.
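A short sketch of why this matters numerically, assuming 100 hypothetical samples that each have probability 1e-5:

```python
import math

probs = [1e-5] * 100  # hypothetical per-sample probabilities

# Naive product: becomes too small for double precision and underflows
product = 1.0
for p in probs:
    product *= p

# Sum of logs: stays comfortably within the representable range
log_likelihood = sum(math.log(p) for p in probs)
```

Here product ends up as 0.0, while log_likelihood is an ordinary finite negative number that can still be maximized.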
Differences between maximum likelihood and ordinary least squares.
In maximum likelihood we want to find the argmax of the likelihood function; in ordinary least squares we want to find the argmin of the loss function.
Similarities between maximum likelihood and ordinary least squares.
Both approaches lead to the same solution.
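The link between the argmax of the likelihood and the argmin of the squared loss can be written out under a common assumption that the cards do not state explicitly: a linear model with i.i.d. Gaussian noise of fixed variance \sigma^2.

```latex
\begin{align}
\hat\theta_{\mathrm{ML}}
  &= \arg\max_\theta \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(y_i - \theta^T x_i)^2}{2\sigma^2}\right) \\
  &= \arg\max_\theta \sum_{i=1}^{N} \left[ -\tfrac{1}{2}\log(2\pi\sigma^2)
     - \frac{(y_i - \theta^T x_i)^2}{2\sigma^2} \right] \\
  &= \arg\min_\theta \tfrac{1}{2} \sum_{i=1}^{N} (y_i - \theta^T x_i)^2
   = \hat\theta_{\mathrm{OLS}}
\end{align}
```

The second line takes the logarithm (which does not move the maximizer), and the third drops the terms and positive factors that do not depend on theta and flips the sign.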
Decision function
A function that predicts a class for a given sample.
Define discriminant function
Used in classification problems. A measure of how well an input matches a specific class.
It provides a score y_pred or a probability for a certain class.
score∈R and probability∈[0, 1]
How many discriminant functions do we need for two classes?
Just one
f(x)=w^Tx+b=theta^Tx linear discriminant
f(x)=p(y=1|x) probabilistic model
with theta=[b, w^T]^T and x:=[1, x^T]^T
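A brief sketch of the two equivalent forms of the linear discriminant. The weights, the input, and the rule of thresholding the score at 0 to pick between the two classes are illustrative assumptions:

```python
def linear_discriminant(w, b, x):
    """f(x) = w^T x + b"""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def augmented_discriminant(theta, x):
    """f(x) = theta^T [1, x], with theta = [b, w]."""
    x_aug = [1.0] + list(x)  # prepend the constant 1 for the bias
    return sum(ti * xi for ti, xi in zip(theta, x_aug))

w, b = [2.0, -1.0], 0.5  # hypothetical parameters
x = [1.0, 3.0]           # hypothetical input

f1 = linear_discriminant(w, b, x)
f2 = augmented_discriminant([b] + w, x)

# One function is enough for two classes: its sign separates them
# (assumed convention: positive score means class 1, otherwise class 0).
predicted_class = 1 if f1 > 0 else 0
```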
Define decision function
Gives the result needed to make a classification decision.
with k=1, …, K
Probabilistic: c_pred=argmax_k p(y=k|x)
Score: c_pred=argmax_k f_k(x)
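Both variants reduce to the same argmax over classes. A minimal sketch with made-up values (note that Python indexes classes from 0, whereas the card counts from 1):

```python
def decide(values):
    """Return the class index k with the largest score or probability."""
    return max(range(len(values)), key=lambda k: values[k])

class_probs = [0.2, 0.7, 0.1]    # hypothetical p(y=k|x) for three classes
class_scores = [-1.3, 0.4, 2.1]  # hypothetical f_k(x) for three classes

c_prob = decide(class_probs)    # index of the most probable class
c_score = decide(class_scores)  # index of the highest-scoring class
```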
What is the difference between a discriminant function and a decision function?
A discriminant function measures, for each class, how well the input fits that class; a decision function turns these measures into the final classification decision.
What are regions?
A discriminant function usually divides the space into two subspaces, or regions R.
Decision boundary
All points x at which the discriminant function is zero, i.e. f(x)=0.
Describe w
The weight vector w is orthogonal to the decision boundary, i.e. to every vector connecting two points on the boundary.
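The orthogonality can be checked numerically. A small sketch with a hypothetical 2D discriminant f(x) = 3*x1 + 4*x2 - 12: take two points with f(x)=0 and compare the direction between them with w.

```python
w, b = [3.0, 4.0], -12.0  # hypothetical parameters

def f(x):
    """Linear discriminant f(x) = w^T x + b."""
    return w[0] * x[0] + w[1] * x[1] + b

# Two points on the decision boundary (f evaluates to 0 at both)
p = [4.0, 0.0]
q = [0.0, 3.0]

# A vector lying within the boundary, and its dot product with w
direction = [q[0] - p[0], q[1] - p[1]]
dot = w[0] * direction[0] + w[1] * direction[1]
```

The dot product is zero because f(q) - f(p) = w^T (q - p) and both f values are zero.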
Difference between k-means and k-nearest neighbors
K-means is unsupervised (it clusters unlabeled data) and k-nearest neighbors is supervised (it predicts from labeled examples).
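The difference shows up in what each method takes as input. A simplified 1D sketch with made-up data; both helper functions are illustrative, not a library API:

```python
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Supervised: majority label among the k nearest labeled points."""
    neighbors = sorted(zip(train_x, train_y),
                       key=lambda pair: abs(pair[0] - x))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def kmeans_1d(points, centers, iterations=10):
    """Unsupervised: alternate between assigning points and moving centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster (keep it if empty)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels = ["a", "a", "a", "b", "b", "b"]

pred = knn_predict(points, labels, 4.5)          # needs the labels
centers = kmeans_1d(points, centers=[0.0, 6.0])  # never sees the labels
```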