L6 Flashcards
(37 cards)
What are Parameters in the context of machine learning?
Values learned during training
What are Hyperparameters?
Values set before training rather than learned from the data (like learning rate, number of neighbors, or regularization parameter C)
What is a Decision function?
Takes an input instance (its feature vector) and gives a decision as output, such as a predicted class
What is the Loss function?
The quantity you minimize for a single training example; it measures how far the prediction is from the target (e.g. squared loss)
What is a Cost function?
Average of the loss over the entire training set (e.g. mean squared error)
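The difference between a per-example loss and a cost over the whole set can be sketched in a few lines of Python. This is only an illustration: the function names and the three-example data are made up for this card, not taken from the lecture.

```python
def squared_loss(y_true, y_pred):
    """Loss for a single example: squared difference."""
    return (y_true - y_pred) ** 2


def mean_squared_error(ys_true, ys_pred):
    """Cost: average of the per-example losses over the whole set."""
    losses = [squared_loss(t, p) for t, p in zip(ys_true, ys_pred)]
    return sum(losses) / len(losses)


# Hypothetical targets and predictions, for illustration only.
targets = [3.0, 1.0, 2.0]
predictions = [2.0, 1.0, 4.0]

print(squared_loss(targets[0], predictions[0]))  # 1.0
print(mean_squared_error(targets, predictions))  # (1 + 0 + 4) / 3
```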
What is a Training set used for?
Learn model parameters
What is a Validation set used for?
Tune hyperparameters
What is a Test set used for?
Evaluate final model performance
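One possible way to carve a dataset into the three sets is sketched below. The 60/20/20 proportions, the fixed seed, and the helper name `train_val_test_split` are assumptions chosen for illustration.

```python
import random


def train_val_test_split(data, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the data, then slice it into train, validation and test lists."""
    items = list(data)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = int(len(items) * test_frac)
    n_val = int(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test


train, val, test = train_val_test_split(range(10))
print(len(train), len(val), len(test))  # 6 2 2
```

The test set is kept aside until the very end, so the hyperparameters tuned on the validation set never see it.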
Why do we need SVMs?
To find the best line (or hyperplane) possible with the largest margin between classes
What does SVM stand for?
Support Vector Machine
What is the main goal of SVM in classification?
To separate classes with the widest possible gap or margin
What is the Margin in SVM?
The distance between the decision boundary and the closest data points (support vectors)
What are Support Vectors?
The data points closest to the boundary that define the position of the decision boundary
What is a Hard Margin SVM?
No errors allowed – aims to find a hyperplane that perfectly separates the classes without any misclassification
What is a Soft Margin SVM?
Allows some misclassification or overlap; slack variables measure how much each instance violates the margin
What does hyperparameter C control in Soft Margin SVM?
Trade-off between margin size and classification errors
True or False: A large C in SVM means more tolerance to errors.
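False. A large C penalizes margin violations heavily, so the model tolerates fewer errors (narrower margin); a small C tolerates more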
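The role of C can be seen in the usual soft-margin objective, which adds C times the total margin violation (hinge loss) to a term that favors a wide margin. The sketch below assumes a linear SVM with labels in {-1, +1}; the weights and data points are hypothetical.

```python
def soft_margin_objective(w, b, X, y, C):
    """0.5 * ||w||^2 + C * (sum of hinge losses, i.e. margin violations)."""
    margin_term = 0.5 * sum(wi * wi for wi in w)
    violations = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        violations += max(0.0, 1.0 - yi * score)  # zero when the margin is respected
    return margin_term + C * violations


# Hypothetical data: the third point lies on the wrong side of the boundary.
X = [[2.0, 0.0], [-2.0, 0.0], [0.5, 0.0]]
y = [1, -1, -1]
w, b = [1.0, 0.0], 0.0

small_c = soft_margin_objective(w, b, X, y, C=0.1)   # 0.5 + 0.1 * 1.5
large_c = soft_margin_objective(w, b, X, y, C=10.0)  # 0.5 + 10 * 1.5
print(small_c, large_c)
```

With the same violation, the larger C makes the objective far more costly, which is why the optimizer then prefers fewer errors over a wider margin.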
What is the decision function for a new data point in SVM?
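Compute w·x + b; its sign gives the predicted class. If result ≥ 1 → positive class, on or beyond the margin; if ≤ -1 → negative class, on or beyond the margin; if in between → inside the margin (uncertain zone)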
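A minimal sketch of this rule for a linear SVM follows. The weights `w`, bias `b`, and the three test points are hypothetical, and taking the sign of the score inside the margin is an assumption that matches common practice.

```python
def decision_value(w, b, x):
    """Signed score w.x + b for a single point."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return (predicted label, whether the point falls inside the margin)."""
    score = decision_value(w, b, x)
    label = 1 if score >= 0 else -1
    inside_margin = -1 < score < 1
    return label, inside_margin


w, b = [1.0, -1.0], 0.0  # hypothetical trained parameters
print(classify(w, b, [3.0, 1.0]))  # (1, False), score is 2.0
print(classify(w, b, [1.0, 0.5]))  # (1, True), score is 0.5
print(classify(w, b, [0.0, 2.0]))  # (-1, False), score is -2.0
```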
What type of optimization problem does training an SVM solve?
A convex quadratic optimization problem with linear constraints
What is the kernel trick in SVM?
Computes similarities as if the data were projected into a higher-dimensional space, where it may become linearly separable, without computing that projection explicitly
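The idea can be checked on a small case. For 2D inputs, the degree-2 polynomial kernel (x·z)² gives the same number as an ordinary dot product after an explicit mapping into 3D. The mapping `phi` and the two points below are a standard textbook illustration, not something taken from this deck.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def phi(x):
    """Explicit feature map into 3D for the degree-2 polynomial kernel."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]


def poly2_kernel(x, z):
    """Same quantity, computed in the original 2D space."""
    return dot(x, z) ** 2


x, z = [1.0, 2.0], [3.0, 4.0]
explicit = dot(phi(x), phi(z))  # map first, then take the dot product
implicit = poly2_kernel(x, z)   # kernel trick: no mapping needed
print(explicit, implicit)       # both are 121 up to floating-point rounding
```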
What is a Linear kernel used for?
Fast and simple; suited to data that is already (approximately) linearly separable
What does the RBF kernel offer?
Very flexible; can model complex, non-linear decision boundaries
What is the purpose of the Gamma parameter in RBF kernels?
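Controls how far a single point’s influence reaches: a small gamma means far-reaching influence (smoother boundary), a large gamma means only nearby points matter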
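The effect of gamma is visible directly in the RBF kernel formula, exp(-gamma * ||x - z||²). The sketch below uses two hypothetical points and three arbitrary gamma values.

```python
import math


def rbf_kernel(x, z, gamma):
    """Similarity that decays with squared distance, scaled by gamma."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)


x, z = [0.0, 0.0], [1.0, 1.0]  # squared distance is 2
for gamma in (0.1, 1.0, 10.0):
    # Larger gamma: the similarity drops off faster with distance.
    print(gamma, rbf_kernel(x, z, gamma))
```

For the same pair of points, the similarity shrinks as gamma grows, so each training point influences a smaller neighborhood.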
What is One-vs-Rest (OvR) in multiclass classification?
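Trains one binary classifier per class, each separating that class from all the others; the class whose classifier gives the highest score is predicted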
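A minimal sketch of OvR prediction, assuming each class already has a trained linear scorer. The class names, weights, and the helper `ovr_predict` are hypothetical.

```python
def ovr_predict(classifiers, x):
    """Score x with every 'class vs rest' model and return the best label."""
    scores = {
        label: sum(wi * xi for wi, xi in zip(w, x)) + b
        for label, (w, b) in classifiers.items()
    }
    return max(scores, key=scores.get)


# One hypothetical (weights, bias) pair per class.
classifiers = {
    "red": ([1.0, 0.0], 0.0),
    "green": ([0.0, 1.0], 0.0),
    "blue": ([-1.0, -1.0], 0.0),
}

print(ovr_predict(classifiers, [2.0, 0.5]))    # red
print(ovr_predict(classifiers, [-1.0, -3.0]))  # blue
```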