Supervised Learning Flashcards

Question 1

Q

What is Supervised Learning?

Answer

A

A type of ML where the model is trained on labeled data, learning from known answers.

Supervised Learning relies on a dataset that includes input-output pairs.

Question 2

Q

What are key features used in Supervised Learning?

Answer

A

Buying Price
Maintenance Cost
Number of Doors
Seating Capacity
Luggage Boot Size
Safety Rating

These are example input features (here, attributes that describe a car). Together with the labels, they are what the model learns from to make predictions.
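As a rough illustration, one labeled example built from these features might be stored as follows. The feature values, the label "acceptable", the LEVELS mapping, and the encode helper are all made up for this sketch and do not come from the cards.

```python
# One labeled example: six input features plus the known answer (label).
# All values below are invented for illustration.
car = {
    "buying_price": "high",
    "maintenance_cost": "low",
    "doors": 4,
    "seating_capacity": 4,
    "luggage_boot_size": "big",
    "safety_rating": "high",
}
label = "acceptable"

# Hypothetical ordinal encoding so that text categories become numbers.
LEVELS = {"low": 0, "small": 0, "med": 1, "high": 2, "big": 2}


def encode(example):
    """Turn a feature dictionary into a list of numbers, in key order."""
    # Values that are already numeric (such as the door count) pass through.
    return [LEVELS.get(value, value) for value in example.values()]


features = encode(car)
print(features, "->", label)
```

A model would be trained on many such (features, label) pairs rather than a single one.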

Question 3

Q

What is Predictive Modeling?

Answer

A

Using ML to learn patterns from data and then make predictions on new data.

Predictive modeling is a core aspect of Supervised Learning.

Question 4

Q

What is the difference between Regression and Classification?

Answer

A

Regression → Predicts continuous values (e.g., house prices)
Classification → Assigns data into categories (e.g., spam or not spam)

Understanding the difference is crucial for choosing the right model.
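The contrast shows up in the type of value a model returns. The two functions below are hand-written toy rules, not trained models, and their names and numbers are assumptions made only for this sketch.

```python
def predict_price(area_sq_m: float) -> float:
    """Regression: the output is a continuous number (toy rule)."""
    return 1500.0 * area_sq_m + 20000.0


def predict_spam(message: str) -> str:
    """Classification: the output is one of a fixed set of categories (toy rule)."""
    return "spam" if "free money" in message.lower() else "not spam"


print(predict_price(100))             # a number on a continuous scale
print(predict_spam("FREE MONEY now")) # one of two category names
```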

Question 5

Q

What is Linear Regression?

Answer

A

A method to find the best-fit line y = mx + c, where m is the slope and c is the intercept.

In this simple form, linear regression models the relationship between one input variable (x) and one output variable (y).
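A minimal sketch of fitting such a line, assuming the usual least-squares criterion (the card does not say how "best fit" is measured). The function name fit_line and the sample data are illustrative only.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how x and y vary together, divided by how much x varies.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    c = mean_y - m * mean_x
    return m, c


xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # generated from y = 2x + 1
m, c = fit_line(xs, ys)
print(f"slope={m}, intercept={c}")
```

Because the sample points lie exactly on y = 2x + 1, the fit recovers a slope of 2 and an intercept of 1.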

Question 6

Q

What are the limitations of Linear Regression?

Answer

A

Fits non-linear relationships poorly
Sensitive to outliers, which can pull the best-fit line away from most of the data

These limitations can affect the accuracy of predictions.
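The outlier problem can be seen on a tiny dataset. This sketch assumes a least-squares fit; the helper fit_line and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    return m, mean_y - m * mean_x


xs = [1, 2, 3, 4, 5]
clean_ys = [2, 4, 6, 8, 10]    # exactly y = 2x
outlier_ys = [2, 4, 6, 8, 40]  # same data, but the last point is an outlier

clean_slope, clean_intercept = fit_line(xs, clean_ys)
outlier_slope, outlier_intercept = fit_line(xs, outlier_ys)

print("clean fit:       ", clean_slope, clean_intercept)
print("fit with outlier:", outlier_slope, outlier_intercept)
```

A single extreme point moves the slope from 2 to 8, so the line no longer describes the other four points well.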

Question 7

Q

What is a Decision Tree?

Answer

A

A flowchart-like structure where each node asks a question about a feature, each branch is an answer, and each leaf gives an outcome.

Decision Trees are intuitive and easy to interpret.

Question 8

Q

What is the process of creating a Decision Tree?

Answer

A

Pick the best feature
Split the data into groups
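Keep splitting until the groups are pure (or another stopping rule, such as a maximum depth, is reached).

A group is pure when all of its examples share the same label. The finished tree classifies new data by following the splits from the root to a leaf.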
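The "pick the best feature" step can be sketched as a search over candidate splits. The card does not say how "best" is scored; Gini impurity is assumed here as one common choice, and the function names and data are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return ((feature_index, threshold), score) with the lowest weighted impurity."""
    best, best_score = None, float("inf")
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, threshold), score
    return best, best_score


rows = [[2, 1], [3, 7], [10, 2], [12, 8]]  # two numeric features per example
labels = ["no", "no", "yes", "yes"]
split, impurity = best_split(rows, labels)
print("best split:", split, "weighted impurity:", impurity)
```

Here feature 0 with threshold 3 separates the two labels completely, so both groups are pure and splitting can stop. A full tree would apply the same search recursively to each group.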

Question 9

Q

What is Random Forest?

Answer

A

A collection of many decision trees whose predictions are combined to improve accuracy and reduce overfitting.

Random Forest is an ensemble method that enhances model performance.

Question 10

Q

How does Random Forest work?

Answer

A

Train many Decision Trees, each on a random subset of the data
Consider a random subset of features at each split
Combine all tree predictions (majority vote for classification, average for regression).

This method helps in averaging out errors from individual trees.
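The three steps can be sketched in a few dozen lines. To keep the example short, each "tree" here is only a one-split stump scored by Gini impurity, which is a simplification and not a full Random Forest. The names train_stump, train_forest, and predict, the parameter values, and the data are all assumptions made for illustration.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def train_stump(rows, labels, feature_indices):
    """Fit a one-split tree using only the given candidate features."""
    best = None  # (score, feature, threshold, left_label, right_label)
    for f in feature_indices:
        for t in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= t]
            right = [lab for row, lab in zip(rows, labels) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:
        # No valid split (all sampled values identical): constant prediction.
        label = majority(labels)
        return lambda row: label
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def train_forest(rows, labels, n_trees=25, features_per_split=1, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        # Step 1: random data subset (bootstrap sample, drawn with replacement).
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        # Step 2: random subset of features considered for the split.
        feats = rng.sample(range(n_features), features_per_split)
        forest.append(train_stump(sample_rows, sample_labels, feats))
    return forest


def predict(forest, row):
    # Step 3: combine all tree predictions by majority vote.
    return majority([tree(row) for tree in forest])


rows = [[i, 2 * i] for i in range(1, 21)]
labels = ["small"] * 10 + ["large"] * 10
forest = train_forest(rows, labels)
print(predict(forest, [0, 0]), predict(forest, [100, 200]))
```

Each stump sees a different sample and a different feature, so individual stumps place their thresholds differently; the vote smooths out those differences.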

Question 11

Q

What is k-Nearest Neighbors (k-NN)?

Answer

A

A method that classifies new data points based on the ‘k’ closest points in the dataset.

k-NN is a simple yet effective classification algorithm.

Question 12

Q

What is the process for k-NN classification?

Answer

A

Store the data
Choose k
Measure the distance from the new point to every stored point
Pick the k nearest
Count the votes and classify by majority

The choice of ‘k’ can significantly impact the classification result.
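The steps above can be sketched as follows, assuming Euclidean distance (the card does not name a distance measure). The function name knn_classify and the sample points are illustrative.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Classify query by majority vote among its k nearest stored points."""
    # Measure the distance from the query to every stored point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Pick the k nearest.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Count votes and classify by majority.
    return Counter(nearest_labels).most_common(1)[0][0]


# Store the data: four "A" points, three "B" points, and one stray "B" point
# sitting close to the "A" cluster.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (6, 6), (7, 6), (6, 7), (2.5, 2.5)]
labels = ["A", "A", "A", "A", "B", "B", "B", "B"]

query = (2.6, 2.6)
print("k=1:", knn_classify(points, labels, query, k=1))
print("k=3:", knn_classify(points, labels, query, k=3))
```

With k=1 the stray point alone decides the result ("B"); with k=3 the two nearby "A" points outvote it ("A"). This is one way the choice of k changes the classification.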

Question 13

Q

What is a limitation of k-NN?

Answer

A

It is slow for large datasets.

Each prediction compares the new point with every stored point, so the computational cost grows with the size of the dataset.

Question 14

Q

List the main concepts of Supervised Learning.

Answer

A

Uses labeled data
Regression vs. Classification
Linear Regression
Decision Trees
Random Forest
k-NN

These concepts form the foundation of supervised learning techniques.

Question 15

Q

What is the goal of Linear Regression?

Answer

A

To find the best-fit line that represents the relationship between variables.

Once the line is fitted, it gives a predicted output for any new input.
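"Best fit" is most often taken to mean least squares, which is an assumption here since the card does not name a criterion. Under that assumption, for n data points (x_i, y_i) with means x̄ and ȳ, the line y = mx + c is chosen as:

```latex
\min_{m,\,c} \; \sum_{i=1}^{n} \bigl( y_i - (m x_i + c) \bigr)^2
\quad\Longrightarrow\quad
m = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
c = \bar{y} - m\,\bar{x}
```

The slope m compares how x and y vary together with how much x varies on its own, and the intercept c makes the line pass through the point of means.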

Question 16

Q

True or False: Decision Trees can overfit.

Answer

A

True

Overfitting occurs when the model learns noise in the training data, so it performs well on that data but poorly on new data.

Question 17

Q

Fill in the blank: Random Forest is an army of _______.

Answer

A

Decision Trees

This metaphor highlights the ensemble nature of the Random Forest algorithm.