Machine Learning Concepts Flashcards

(51 cards)

1
Q

What is machine learning?

A

A field of AI where systems learn from data to make predictions or decisions.

2
Q

What are the main types of machine learning?

A

Supervised, Unsupervised, and Reinforcement Learning.

3
Q

What is supervised learning?

A

Learning from labeled data to predict outcomes.

4
Q

What is unsupervised learning?

A

Learning patterns from unlabeled data.

5
Q

What is reinforcement learning?

A

Learning by interacting with an environment to maximize cumulative reward.

6
Q

What is a labeled dataset?

A

A dataset where each input is paired with the correct output.

7
Q

What is a feature?

A

An individual measurable property or characteristic of data.

8
Q

What is a label?

A

The output variable or the value to be predicted.

9
Q

What is a model?

A

A mathematical representation of a real-world process learned from data.

10
Q

What is training data?

A

Data used to train a machine learning model.

11
Q

What is test data?

A

Held-out data, never seen during training or tuning, used to estimate how well the final model performs on unseen data.

12
Q

What is validation data?

A

Data held out from training and used to tune hyperparameters and compare candidate models.

13
Q

What is overfitting?

A

When a model fits the noise and quirks of the training data as well as the underlying pattern, so it performs well on training data but poorly on unseen data.

14
Q

What is underfitting?

A

When a model is too simple to capture the data’s structure, so it performs poorly on both training and unseen data.

15
Q

What is a decision tree?

A

A tree-like model used to make decisions based on input features.

16
Q

What is a random forest?

A

An ensemble of decision trees used to improve prediction accuracy.

17
Q

What is k-nearest neighbors (KNN)?

A

A method that classifies a point by the majority class among its k nearest training points (or, for regression, averages their values).
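
The majority-vote idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a tuned implementation: the function name `knn_predict`, the Euclidean distance, and the toy points are all assumptions made for the example.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (made up for illustration): two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
```

An odd k is a common choice for two-class problems because it avoids tied votes.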

18
Q

What is a support vector machine (SVM)?

A

A model that finds the boundary separating classes with the largest possible margin.

19
Q

What is linear regression?

A

A model that fits a line (more generally, a linear function of the features) to predict a continuous output.
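
For a single feature, the least-squares line has a closed form. The sketch below assumes ordinary least squares and a helper name, `fit_line`, chosen only for this example; the data are made up.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
predicted = slope * 4.0 + intercept  # prediction for a new input x = 4
```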

20
Q

What is logistic regression?

A

A classification algorithm based on the logistic function.
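
As a rough sketch of the prediction step only (training is omitted), the logistic function turns a weighted sum of features into a probability, which is then thresholded. The weights, bias, and function names below are hypothetical values picked for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class for one example."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into a 0/1 class label."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical, already-trained parameters.
weights, bias = [2.0, -1.0], 0.0
label_a = predict_label(weights, bias, [1.0, 0.5])  # z = 1.5
label_b = predict_label(weights, bias, [0.0, 3.0])  # z = -3.0
```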

21
Q

What is classification?

A

Predicting discrete labels.

22
Q

What is regression?

A

Predicting continuous values.

23
Q

What is clustering?

A

Grouping similar data points together.

24
Q

What is dimensionality reduction?

A

Reducing the number of features while preserving as much of the relevant information as possible.

25
What is principal component analysis (PCA)?
A technique for dimensionality reduction by projecting data onto principal components.
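
One way to see the projection at work is to estimate the first principal component by power iteration on the covariance matrix. This is a simplified sketch, not how libraries typically compute PCA: the function name, the all-ones start vector, and the toy data are assumptions, and it would fail on degenerate data or if the start vector were orthogonal to the component.

```python
import math

def first_principal_component(points, iterations=50):
    """Estimate the top principal component by power iteration on the covariance matrix."""
    n, dim = len(points), len(points[0])
    # Center the data so each feature has mean zero.
    means = [sum(p[j] for p in points) / n for j in range(dim)]
    centered = [[p[j] - means[j] for j in range(dim)] for p in points]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]
    # Repeatedly multiply by the covariance matrix and renormalize.
    v = [1.0] * dim
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Project each centered point onto the component: one number per point.
    projections = [sum(row[j] * v[j] for j in range(dim)) for row in centered]
    return v, projections

# Toy 2-D data lying on the line y = 2x, reduced to 1-D.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component, projected = first_principal_component(points)
```

Here the component points along the direction (1, 2), the line on which all the variance lies.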
26
What is a confusion matrix?
A table showing true positives, false positives, true negatives, and false negatives.
27
What is accuracy?
The ratio of correct predictions to total predictions.
28
What is precision?
The ratio of true positives to predicted positives.
29
What is recall?
The ratio of true positives to actual positives.
30
What is F1 score?
The harmonic mean of precision and recall.
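
The confusion-matrix counts and the four metrics above can be computed together. The following is a minimal sketch for binary labels; the function names and the example labels are made up for illustration, and undefined ratios are returned as 0.0 by assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for binary labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn

def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0  # true positives / predicted positives
    recall = tp / (tp + fn) if tp + fn else 0.0     # true positives / actual positives
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```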
31
What is cross-validation?
A method for assessing model performance by dividing data into training and validation sets multiple times.
32
What is k-fold cross-validation?
Dividing the dataset into k parts and using each as a validation set once.
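
The splitting step can be sketched as an index generator. This version assumes the data are already shuffled and uses a hypothetical helper name, `k_fold_indices`; it only produces the splits and leaves model training to the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

# 10 samples, 3 folds: every index is used for validation exactly once.
folds = list(k_fold_indices(10, 3))
```

Averaging the validation score across the k folds gives the cross-validated estimate.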
33
What is hyperparameter tuning?
Optimizing the settings that control the learning process.
34
What is grid search?
A method for tuning hyperparameters by exhaustively searching a predefined set of values.
35
What is random search?
A method for tuning hyperparameters by sampling random combinations.
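
The two search strategies differ only in which combinations they try. In the sketch below, `toy_score` is a hypothetical stand-in for "train a model with these settings and return its validation score", and the hyperparameter names and values are invented for the example.

```python
import itertools
import random

def grid_search(param_grid, score_fn):
    """Try every combination in the grid and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def random_search(param_grid, score_fn, n_iter, seed=0):
    """Try n_iter randomly sampled combinations and keep the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in param_grid.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

def toy_score(params):
    # Hypothetical score that peaks at learning_rate=0.1, depth=3.
    return -((params["learning_rate"] - 0.1) ** 2) - (params["depth"] - 3) ** 2

grid = {"learning_rate": [0.01, 0.1, 1.0], "depth": [1, 3, 5]}
best_grid, best_grid_score = grid_search(grid, toy_score)
best_random, best_random_score = random_search(grid, toy_score, n_iter=4)
```

Grid search evaluates all 9 combinations here; random search evaluates only 4, so it is cheaper but may miss the best one.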
36
What is model evaluation?
Measuring a model's performance using various metrics.
37
What is data preprocessing?
Cleaning and transforming data before feeding it into a model.
38
What is feature scaling?
Bringing features onto comparable scales so that no feature dominates the model simply because of its units or range.
39
What is normalization?
Rescaling features to a fixed range, typically [0, 1] (min-max scaling).
40
What is standardization?
Transforming features to have mean 0 and standard deviation 1.
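
The two rescalings can be compared side by side. This is a small sketch under stated assumptions: min-max scaling for normalization, the population standard deviation for standardization, and a made-up list of heights as input.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]; assumes max > min."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
normalized = min_max_normalize(heights_cm)
standardized = standardize(heights_cm)
```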
41
What is feature selection?
Choosing the most relevant features for modeling.
42
What is the bias-variance tradeoff?
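The trade-off between error from overly simple assumptions (bias, leading to underfitting) and error from sensitivity to the particular training sample (variance, leading to overfitting).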
43
What is a learning curve?
A plot of training and validation performance against training set size or number of training iterations.
44
What is a ROC curve?
A graph of true positive rate against false positive rate as the classification threshold is varied.
45
What is AUC?
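Area Under the ROC Curve: the probability that the model scores a randomly chosen positive example above a randomly chosen negative one (1.0 is perfect, 0.5 is chance).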
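
A ROC curve and its area can be traced by sweeping the threshold over the model's scores. The sketch below assumes binary 0/1 labels with at least one of each class; the function names and the scores are invented for the example.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) pairs, one per threshold, from strictest to loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Trapezoidal area under the piecewise-linear ROC curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )

y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(y_true, scores)
area = auc(curve)
```

In this example 8 of the 9 positive/negative pairs are ranked correctly, which matches the area of 8/9.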
46
What is a baseline model?
A simple model used for comparison.
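
A common baseline for classification is to always predict the most frequent training label. The sketch below assumes that choice; the helper name and the labels are made up for illustration.

```python
from collections import Counter

def majority_class_baseline(train_labels):
    """Return a predictor that ignores its input and always outputs the most common label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda features: most_common_label

train_labels = ["spam", "ham", "ham", "ham", "spam"]
baseline = majority_class_baseline(train_labels)

# Any real model should beat this accuracy to be worth using.
test_labels = ["ham", "spam", "ham", "ham"]
baseline_accuracy = sum(baseline(None) == y for y in test_labels) / len(test_labels)
```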
47
What is data leakage?
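When information that would not be available at prediction time (such as test data or the target itself) influences training, making measured performance look better than it really is.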
48
What is a pipeline?
A sequence of data preprocessing and modeling steps applied consistently.
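
One reason to apply steps consistently is to avoid leakage: preprocessing statistics should be learned from the training split only and then reused on the test split. The sketch below contrasts the two approaches with hypothetical helper names and made-up numbers.

```python
import statistics

def fit_scaler(train_values):
    """Learn the scaling statistics (mean, population std) from the given values."""
    return statistics.mean(train_values), statistics.pstdev(train_values)

def apply_scaler(values, mean, std):
    """Apply previously learned statistics to any split."""
    return [(v - mean) / std for v in values]

train = [10.0, 12.0, 14.0, 16.0]
test = [30.0]

# Leak-free: statistics come from the training split only.
mean, std = fit_scaler(train)
test_scaled = apply_scaler(test, mean, std)

# Leaky: statistics computed over train and test together,
# so the test point has already influenced the preprocessing.
leaky_mean, leaky_std = fit_scaler(train + test)
```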
49
What is ensemble learning?
Combining predictions from multiple models to improve performance.
50
What is boosting?
An ensemble technique that trains weak learners sequentially, each one focusing on the errors of those before it, and combines them into a strong learner.
51
What is bagging?
An ensemble method (bootstrap aggregating) that trains multiple models independently on bootstrap samples of the data and combines their results by voting or averaging.
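
The bootstrap-then-vote idea can be sketched with a deliberately simple base learner. Everything here is an assumption made for illustration: the nearest-class-mean learner, the function names, the fixed seed, and the one-dimensional toy data.

```python
import random
from collections import Counter

def fit_nearest_mean(sample):
    """Hypothetical weak learner: remember the mean x of each class in the sample."""
    by_class = {}
    for x, label in sample:
        by_class.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in by_class.items()}

def predict_nearest_mean(model, x):
    """Predict the class whose stored mean is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))

def bagging_fit(data, n_models, seed=0):
    """Train each model on its own bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    return [
        fit_nearest_mean([rng.choice(data) for _ in data])
        for _ in range(n_models)
    ]

def bagging_predict(models, x):
    """Combine the independent models by majority vote."""
    votes = Counter(predict_nearest_mean(model, x) for model in models)
    return votes.most_common(1)[0][0]

data = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
        (8.0, "high"), (8.5, "high"), (9.0, "high")]
models = bagging_fit(data, n_models=15)
```

Because each model sees a different resample, their individual errors tend to average out in the vote.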