ML Part 2 Flashcards
(20 cards)
What is linear regression?
A model that predicts a continuous outcome using a linear combination of input features.
What does the slope coefficient represent in linear regression?
The change in the predicted value for a one-unit increase in that input feature, holding all other features constant.
What is the intercept in linear regression?
The predicted value when all input features are zero.
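A minimal sketch of how the slope and intercept fall out of a least-squares fit with a single feature. The function name `fit_line` and the data are made up for illustration; the points are chosen to lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Here the slope of 2 means each one-unit increase in x raises the prediction by 2, and the intercept of 1 is the prediction at x = 0.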
What is the loss function used in linear regression?
Mean Squared Error (MSE).
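As a quick illustration, MSE is the average of the squared differences between true and predicted values. The helper name and the numbers below are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared residuals (true minus predicted)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Residuals are 1, 0, and -2, so the squared errors are 1, 0, and 4
mse = mean_squared_error([3.0, 5.0, 7.0], [2.0, 5.0, 9.0])
print(mse)  # (1 + 0 + 4) / 3, roughly 1.667
```

Squaring makes large errors count disproportionately, which is why a single outlier can dominate the loss.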
What are assumptions of linear regression?
Linearity, independence of errors, homoscedasticity (constant error variance), and normality of residuals.
What is logistic regression used for?
Binary classification.
What is the output of logistic regression?
A probability between 0 and 1.
What is the sigmoid function?
The function σ(z) = 1 / (1 + e^(-z)), which maps any real value into the open interval (0, 1) so that its output can be read as a probability.
How do you convert probabilities to classes in logistic regression?
Using a decision threshold (often 0.5).
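A small sketch of the two steps above: squash a raw score with the sigmoid, then compare against a threshold. The names `sigmoid` and `predict_class` are illustrative, and the choice to assign class 1 when the probability equals the threshold is an assumption.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(z, threshold=0.5):
    """Return class 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0

print(sigmoid(0.0))         # 0.5
print(predict_class(2.0))   # 1
print(predict_class(-2.0))  # 0
```

Raising the threshold above 0.5 makes the model more conservative about predicting class 1, which trades recall for precision.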
What is the loss function used in logistic regression?
Log loss or cross-entropy loss.
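For reference, binary log loss averages -[y log(p) + (1 - y) log(1 - p)] over the examples. The sketch below assumes labels in {0, 1}; the clipping constant `eps` is an illustrative choice to avoid taking log(0).

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary cross-entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# A confident correct prediction costs little; a confident wrong one costs a lot
good = log_loss([1], [0.9])
bad = log_loss([1], [0.1])
print(good < bad)  # True
```

Predicting 0.5 for every example gives a loss of ln(2), about 0.693, which is a useful baseline to beat.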
What is a decision tree?
A model that splits data using feature values to make decisions.
What is Gini impurity?
A measure of how often a randomly chosen element of a node would be mislabeled if it were labeled at random according to the node's class distribution.
What is information gain?
The reduction in impurity (classically entropy) from the parent node to the weighted average of its child nodes after a split.
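A hedged sketch of both ideas: Gini impurity as 1 minus the sum of squared class proportions, and the impurity reduction of a split as parent impurity minus the size-weighted impurity of the children. This example measures the reduction with Gini; information gain in the classical sense uses entropy in the same formula. Function names and labels are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_reduction(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

parent = ["a", "a", "b", "b"]
# A perfect split: each child contains a single class
gain = impurity_reduction(parent, ["a", "a"], ["b", "b"])
print(gini(parent), gain)  # 0.5 0.5
```

A split that leaves both children as mixed as the parent would yield a reduction of 0, so a tree learner prefers the split with the largest reduction.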
What is tree pruning?
Reducing the size of a tree to prevent overfitting.
What are advantages of decision trees?
Interpretability, handling non-linearities, and requiring little data preprocessing.
What is the k-nearest neighbors algorithm?
A model that classifies a new point by the majority label of its k closest training examples (or, for regression, averages their values).
What is the key hyperparameter in k-NN?
The number of neighbors (k).
What distance metric is commonly used in k-NN?
Euclidean distance.
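A minimal k-NN classifier sketch using Euclidean distance via `math.dist` (available in Python 3.8+). The function name `knn_predict` and the toy points are made up for illustration, and ties in the vote are not handled specially.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    # Sort training indices by Euclidean distance to the query
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical two-cluster data set
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["red", "red", "red", "blue", "blue", "blue"]

print(knn_predict(points, labels, (0.5, 0.5)))  # red
print(knn_predict(points, labels, (5.5, 5.5)))  # blue
```

Because Euclidean distance sums squared differences across features, features on larger scales dominate unless the inputs are standardized first.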
What happens if k is too small in k-NN?
The model becomes sensitive to noise and overfits.
What happens if k is too large in k-NN?
The model may underfit and smooth over patterns.
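The effect of k can be seen on a tiny, made-up data set that contains one noisy "blue" point inside the red cluster. This is only a sketch; `knn_predict` is an illustrative helper, not a library function.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    return Counter(train_labels[i] for i in order[:k]).most_common(1)[0][0]

# Five red points, one mislabeled (noisy) blue point among them, three true blue points
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0),
          (0.5, 0.6),
          (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels = ["red"] * 5 + ["blue"] + ["blue"] * 3

# k too small: the single nearest neighbor is the noisy point
small_k = knn_predict(points, labels, (0.5, 0.5), k=1)
# moderate k: the surrounding red points outvote the noise
medium_k = knn_predict(points, labels, (0.5, 0.5), k=5)
# k equal to the data set size: every query gets the overall majority class
large_k = knn_predict(points, labels, (5.5, 5.5), k=9)
print(small_k, medium_k, large_k)  # blue red red
```

With k = 9 the query sitting in the middle of the blue cluster is still labeled red, because the vote no longer depends on where the query is.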