Technical Flashcards
(89 cards)
What Are the Different Types of Machine Learning?
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
What is Supervised Learning?
In supervised machine learning, a model learns from past, labeled data and uses it to make predictions or decisions.
Labeled data refers to sets of data that are given tags or labels, and thus made more meaningful.
What is Unsupervised Learning?
In unsupervised learning, we don’t have labeled data. A model can identify patterns, anomalies, and relationships in the input data.
How can the model learn using Reinforcement Learning?
In reinforcement learning, the model learns from the rewards it receives for its previous actions.
What is Overfitting?
Overfitting occurs when a model learns the training set too well, treating random fluctuations in the training data as concepts. These fluctuations don’t apply to new data, so they hurt the model’s ability to generalize.
How can you avoid Overfitting?
- Regularization
- Making a simpler model
- Cross-validation methods
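As an illustration of the cross-validation item above, here is a minimal sketch of how k-fold cross-validation could divide sample indices. The helper name k_fold_indices and the sizes (10 samples, 5 folds) are assumptions made for this example, not part of the original cards.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical sizes: 10 samples split into 5 folds.
folds = list(k_fold_indices(n_samples=10, k=5))
for train, validation in folds:
    print("train:", train, "validation:", validation)
```

Each sample lands in the validation part exactly once, so every model score is computed on data that was not used to fit that model.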
What is Regularization?
It adds a cost (penalty) term for the features to the objective function, which discourages overly complex models.
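A minimal sketch of that idea, assuming a linear model without an intercept and a squared-weight (L2) penalty. The card does not name a specific penalty, so the L2 choice, the function names, and the data below are all assumptions for illustration.

```python
def mean_squared_error(weights, rows, targets):
    """Average squared error of a linear model without an intercept."""
    total = 0.0
    for features, target in zip(rows, targets):
        prediction = sum(w * x for w, x in zip(weights, features))
        total += (prediction - target) ** 2
    return total / len(rows)


def regularized_cost(weights, rows, targets, penalty_strength):
    """Objective = data-fit error + penalty_strength * sum of squared weights."""
    penalty = penalty_strength * sum(w ** 2 for w in weights)
    return mean_squared_error(weights, rows, targets) + penalty


# Made-up data that the weights [1.0, 2.0] fit exactly.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
targets = [5.0, 4.0, 9.0]
weights = [1.0, 2.0]

plain = regularized_cost(weights, rows, targets, penalty_strength=0.0)
penalized = regularized_cost(weights, rows, targets, penalty_strength=0.1)
print(plain, penalized)
```

Even with a perfect fit, the penalized cost is above zero, so an optimizer minimizing it is pushed toward smaller weights.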
What is a ‘Training Set’ in a Machine Learning model?
It is a labeled dataset used to train the model by providing examples for it to learn patterns and relationships.
What is a ‘Test Set’ in a Machine Learning model?
It is a dataset held out from training and used to evaluate the model’s performance. Its labels are hidden from the model during prediction and then compared with the model’s output.
What is a typical split ratio for Training and Test sets?
Usually 70% of data is used for training and 30% for testing, though it can vary based on preferences.
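A 70/30 split could be sketched with the standard library alone, as below. The function name train_test_split, the seed, and the 100-row stand-in dataset are assumptions for this example; libraries such as scikit-learn ship their own helper for the same job.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


data = list(range(100))  # stand-in for 100 labeled examples
train_rows, test_rows = train_test_split(data)
print(len(train_rows), len(test_rows))
```

Shuffling before splitting avoids a test set made only of the last rows, which matters when the data is stored in some order.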
Why should you separate the Test Set before training the model?
To avoid biased testing results and ensure the model is evaluated on unseen data for accurate performance measurement.
How do you handle missing or corrupted data in a dataset?
By dropping the affected rows or columns, or by replacing the bad values with a placeholder. In Pandas, isnull() finds missing values, dropna() drops them, and fillna() replaces them.
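A short sketch of those three Pandas methods on a made-up table. The column names, the values, and the choice of the column mean and "unknown" as placeholders are assumptions for this example.

```python
import pandas as pd

# Made-up data with one missing value in each column.
df = pd.DataFrame({
    "age": [25.0, None, 40.0],
    "city": ["Oslo", "Lima", None],
})

missing_per_column = df.isnull().sum()  # count missing values per column
dropped = df.dropna()                   # keep only rows with no missing values
# Replace missing values with a per-column placeholder.
filled = df.fillna({"age": df["age"].mean(), "city": "unknown"})

print(missing_per_column)
print(dropped)
print(filled)
```

Dropping is simplest but throws data away (here only one of three rows survives), while filling keeps every row at the cost of introducing placeholder values.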
When the training set is small, what type of model tends to work better?
Which classifier works best when the training set is large?
Naive Bayes.
What is a Confusion Matrix?
A table used to measure the performance of an algorithm by comparing actual and predicted values in supervised learning.
In a Confusion Matrix, what are the two parameters?
Actual and Predicted.
How is accuracy calculated using a Confusion Matrix?
Accuracy = (Sum of diagonal values) / (Total observations)
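The formula can be checked with a few lines of code. The 2x2 counts below are hypothetical values chosen for this example, with rows as actual classes and columns as predicted classes.

```python
def accuracy(confusion_matrix):
    """Sum of the diagonal divided by the sum of every cell."""
    correct = sum(confusion_matrix[i][i] for i in range(len(confusion_matrix)))
    total = sum(sum(row) for row in confusion_matrix)
    return correct / total


# Hypothetical counts: rows are actual classes, columns are predicted classes.
matrix = [
    [50, 10],
    [5, 35],
]
score = accuracy(matrix)
print(score)  # (50 + 35) / 100 = 0.85
```

The same function works for more than two classes, since the diagonal always holds the correctly classified observations.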
What is a False Positive?
A case where the model predicts a positive outcome, but the actual outcome is negative.
What is a False Negative?
A case where the model predicts a negative outcome, but the actual outcome is positive.
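To make the two error types concrete, this sketch counts them from a pair of label lists. The function name confusion_counts, the made-up labels, and the convention that 1 means positive are assumptions for this example.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            counts["TP"] += 1   # predicted positive, actually positive
        elif p == positive:
            counts["FP"] += 1   # predicted positive, actually negative
        elif a == positive:
            counts["FN"] += 1   # predicted negative, actually positive
        else:
            counts["TN"] += 1   # predicted negative, actually negative
    return counts


# Made-up labels: 1 is the positive class, 0 the negative class.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'FP': 1, 'FN': 1, 'TN': 2}
```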
What is the total observation count in a confusion matrix with values 12
3
What are the three stages of building a machine learning model?
Model Building, Model Testing, and Applying the Model.
What happens during the ‘Model Building’ stage?
Choose a suitable algorithm and train it according to the requirement.
What is done in the ‘Model Testing’ stage?
Check the accuracy of the model using test data.
What is done in the ‘Applying the Model’ stage?
Make changes after testing and deploy the final model for real-time projects, while periodically checking and updating it.