Quiz Flashcards

(23 cards)

3
Q

What is the purpose of linear regression in machine learning?

A

To model the relationship between a dependent variable and one or more independent variables.

Linear regression predicts the value of the dependent variable based on the values of the independent variables.
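
As a minimal sketch, fitting a line y = w·x + b to illustrative data with the closed-form least-squares solution for one feature (the data and names here are examples, not from the card):

```python
# Ordinary least squares for one feature: minimizes sum((y - (w*x + b))^2).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept from the means.
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - w * mean_x
    return w, b

w, b = fit_line([1, 2, 3, 4], [3, 5, 7, 9])  # data lies exactly on y = 2x + 1
```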

4
Q

How do you interpret the coefficients in a linear regression model?

A

Each coefficient represents the change in the dependent variable for a one-unit change in the associated independent variable, holding other variables constant.

A positive coefficient indicates a direct relationship, while a negative coefficient indicates an inverse relationship.

5
Q

What are some common metrics used to evaluate the performance of a linear regression model?

A
  • Mean Absolute Error (MAE)
  • Mean Squared Error (MSE)
  • R-squared

These metrics help assess how well the model predicts the dependent variable.
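
A minimal sketch of the three metrics computed by hand (toy values chosen for illustration):

```python
def mae(y_true, y_pred):
    # Mean Absolute Error: average size of the errors.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    # Mean Squared Error: penalizes large errors more heavily.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    # R-squared: fraction of the variance in y explained by the model.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```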

6
Q

How does Bayes’ Theorem relate to conditional probability?

A

Bayes’ Theorem describes how to update the probability of a hypothesis based on new evidence, relating prior and posterior probabilities.

It is foundational in Bayesian statistics.
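
A worked sketch of the update, using an illustrative diagnostic-test scenario (the probabilities are made up for the example):

```python
def posterior(prior, likelihood, likelihood_given_not):
    # Bayes' Theorem: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]
    evidence = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / evidence

# A disease with 1% prevalence, a test with 99% sensitivity and a 5%
# false-positive rate: a positive result updates the belief from 1% to ~17%.
p = posterior(prior=0.01, likelihood=0.99, likelihood_given_not=0.05)
```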

7
Q

Can you explain the concept of prior and posterior probabilities?

A

Prior probability is the initial belief about a hypothesis before observing evidence, while posterior probability is the updated belief after considering the evidence.

The relationship is defined by Bayes’ Theorem.

8
Q

In what scenarios would you prefer using Bayes’ Theorem over other classification methods?

A

When prior knowledge is available, when data is scarce, or when dealing with imbalanced classes.

Bayes’ Theorem is particularly useful in medical diagnosis and spam detection.

9
Q

What is gradient descent, and how is it used in linear regression?

A

Gradient descent is an optimization algorithm used to minimize the cost function by iteratively adjusting model parameters.

It helps find the best-fitting line in linear regression.
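
A minimal sketch of gradient descent for one-feature linear regression, assuming a mean-squared-error cost (learning rate and step count are illustrative):

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error cost with respect to w and b.
        grad_w = (2 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step in the direction of steepest descent.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Converges toward the best-fitting line y = 2x + 1 for this data.
w, b = gradient_descent([1, 2, 3, 4], [3, 5, 7, 9])
```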

10
Q

How does regularization help prevent overfitting in a linear regression model?

A

Regularization adds a penalty to the loss function for large coefficients, discouraging complexity in the model.

This leads to better generalization on unseen data.

11
Q

What are the differences between L1 and L2 regularization?

A
  • L1 regularization (Lasso) can drive some coefficients to zero, leading to feature selection.
  • L2 regularization (Ridge) shrinks coefficients but does not eliminate them.

The choice between them depends on the specific problem and desired outcomes.
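
A sketch of the contrast for the simplest case of a single centered feature with no intercept, where both penalized solutions have closed forms (the objectives and data are illustrative):

```python
def ridge_coef(xs, ys, lam):
    # Minimizes (1/2)*sum((y - w*x)^2) + (lam/2)*w^2.
    # L2 shrinks the coefficient toward zero but never exactly to zero.
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def lasso_coef(xs, ys, lam):
    # Minimizes (1/2)*sum((y - w*x)^2) + lam*|w|.
    # L1 soft-thresholds: a coefficient below the threshold becomes exactly 0.
    z = sum(x * y for x, y in zip(xs, ys))
    shrunk = max(abs(z) - lam, 0.0)
    return (shrunk if z >= 0 else -shrunk) / sum(x * x for x in xs)
```

With no penalty both recover the true coefficient; as the penalty grows, ridge keeps shrinking it while lasso eventually drives it to exactly zero.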

12
Q

What are hyperparameters in the context of Convolutional Neural Networks (CNNs)?

A

Hyperparameters are configuration settings used to control the learning process, such as learning rate, batch size, and number of layers.

They are not learned from the data but set before training.

13
Q

How do genetic algorithms optimize hyperparameters for CNNs?

A

Genetic algorithms use evolutionary strategies to iteratively select, combine, and mutate hyperparameter sets to find optimal configurations.

This mimics natural selection to enhance model performance.
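
A toy sketch of the select/crossover/mutate loop. The fitness function here is a stand-in for validation loss (in practice each individual would train a CNN and report its validation error); all hyperparameter ranges and mutation rates are illustrative:

```python
import random

# Each individual is a hyperparameter pair: (learning_rate, num_filters).
def toy_loss(lr, filters):
    # Stand-in for validation loss, lowest at lr=0.01 and filters=32.
    return (lr - 0.01) ** 2 * 1e4 + (filters - 32) ** 2 / 100

def evolve(generations=40, pop_size=20, seed=0):
    rng = random.Random(seed)
    pop = [(rng.uniform(0.0001, 0.1), rng.randint(4, 128))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half (lowest loss).
        pop.sort(key=lambda ind: toy_loss(*ind))
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            # Crossover: learning rate from one parent, filters from the other.
            lr, filters = a[0], b[1]
            # Mutation: small random perturbations.
            lr = min(max(lr * rng.uniform(0.8, 1.2), 1e-5), 0.5)
            filters = max(1, filters + rng.randint(-4, 4))
            children.append((lr, filters))
        pop = parents + children
    return min(pop, key=lambda ind: toy_loss(*ind))

best = evolve()  # converges toward (lr ~ 0.01, filters ~ 32)
```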

14
Q

What are some common hyperparameters that can be tuned in CNNs?

A
  • Learning rate
  • Number of filters
  • Filter size
  • Pooling size
  • Dropout rate

Tuning these hyperparameters can significantly affect model accuracy and training time.

15
Q

What is the main goal of Principal Component Analysis?

A

To reduce the dimensionality of a dataset while preserving as much variance as possible.

It simplifies data without losing critical information.

16
Q

How does PCA reduce the dimensionality of a dataset?

A

By transforming the data into a new set of variables (principal components) that are orthogonal and capture the most variance.

This allows for easier visualization and analysis.
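
A minimal sketch for 2-D data, where the leading eigenvector of the covariance matrix has a closed form (data and names are illustrative):

```python
import math

# PCA for 2-D points: center the data, form the covariance matrix, and
# project onto its leading eigenvector (the first principal component).
def first_pc_projection(points):
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # Covariance matrix entries [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    c = sum(y * y for _, y in centered) / n
    # Leading eigenvalue of a symmetric 2x2 matrix.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)
    # Corresponding unit eigenvector (valid when b != 0).
    vx, vy = lam - c, b
    norm = math.hypot(vx, vy)
    vx, vy = vx / norm, vy / norm
    # Each 2-D point reduced to one coordinate along the component.
    return [x * vx + y * vy for x, y in centered]
```

For points that lie exactly on a line, the single component captures all of the variance, so reducing from two dimensions to one loses nothing.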

17
Q

What are the assumptions made when applying PCA to a dataset?

A
  • Linearity
  • Large sample size
  • Features are centered (mean of zero)
  • Features are scaled to have unit variance

Violating these assumptions can lead to misleading results.

18
Q

What is the fundamental concept behind Support Vector Machines (SVM)?

A

SVM aims to find the hyperplane that best separates different classes in the feature space.

It maximizes the margin between the closest points of the classes.

19
Q

How do SVMs handle non-linearly separable data?

A

SVMs use kernel functions to transform the data into a higher-dimensional space where it can be linearly separated.

Common kernels include polynomial and radial basis function (RBF).

20
Q

What role do kernels play in SVMs?

A

Kernels allow SVMs to operate in higher-dimensional spaces without explicitly computing the coordinates of the data in those spaces.

This enables efficient computation and flexibility in classification.
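
A minimal sketch of the two kernels named on the previous card, computed directly from the original coordinates (parameter values are illustrative):

```python
import math

def rbf_kernel(x, z, gamma=1.0):
    # k(x, z) = exp(-gamma * ||x - z||^2): an inner product in an
    # infinite-dimensional feature space that is never computed explicitly.
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    # (x . z + coef0)^degree: equivalent to a dot product over all monomial
    # features up to the given degree, without enumerating them.
    return (sum(xi * zi for xi, zi in zip(x, z)) + coef0) ** degree
```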

21
Q

What are ensemble methods, and why are they used in machine learning?

A

Ensemble methods combine multiple models to improve overall performance and robustness.

They reduce the risk of overfitting and increase accuracy.

22
Q

Can you explain the difference between bagging and boosting?

A

Bagging builds multiple independent models and averages their predictions, while boosting creates a sequence of models that learn from the errors of previous ones.

Bagging reduces variance, while boosting reduces bias.
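
A toy sketch of the bagging side: the "model" here is just the sample mean, standing in for any high-variance learner, and the bootstrap/averaging steps are the point of the example:

```python
import random

def bagged_prediction(data, n_models=100, seed=0):
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: draw with replacement, same size as the data.
        sample = [rng.choice(data) for _ in data]
        # "Train" an independent model on this resample.
        predictions.append(sum(sample) / len(sample))
    # Aggregate by averaging the independent predictions.
    return sum(predictions) / len(predictions)
```

Boosting, by contrast, would fit each new model to the residual errors of the ensemble so far, building the models sequentially rather than independently.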

23
Q

What are some popular ensemble algorithms, and how do they improve model performance?

A
  • Random Forest
  • AdaBoost
  • Gradient Boosting
  • XGBoost

These algorithms enhance predictive accuracy and can handle complex data patterns effectively.