fa3 + logistic reg to gradient boosting Flashcards

(100 cards)

1
Q

We can visualize the tree using the export_graph function from the tree module.

Group of answer choices
True
False

A

False

2
Q

In the decision tree, the region can be found by traversing the tree from the root and going left or right.

Group of answer choices
True
False

A

True

3
Q

A decision tree is a model that learns a hierarchy of if/else questions, leading to a decision.

Group of answer choices
True
False

A

True

4
Q

The .dot file format is a _____ file format for storing graphs.

A

TEXT

5
Q

In the decision tree, the ______ represents the whole dataset.

Group of answer choices
Terminal Nodes
Edges
Root
Conditions

A

Root

6
Q

The .dot file format is an image file format for storing graphs.

Group of answer choices
True
False

A

False

7
Q

Decision trees in scikit-learn are implemented in the ________ and DecisionTreeClassifier classes.

Group of answer choices
DecisionRegressorTree
TreeDecisionRegressor
RegressorDecisionTree
DecisionTreeRegressor

A

DecisionTreeRegressor

8
Q

Which is not true about Random Forest?

Group of answer choices
Not in the options
Less memory usage.
Less burden of parameter tuning.
As many trees are created, detailed analysis is difficult.
Poor performance for large and sparse data.

A

Less memory usage.

9
Q

To build a random forest model, you need to decide on the __________ to build.

Group of answer choices
Depth of the tree
Height of the tree
Number of trees
Root
Node of the tree

A

Number of trees

10
Q

The _______ are methods that combine multiple machine learning models to create more powerful models.

A

ENSEMBLES

11
Q

In the decision tree, the terminal nodes represent the whole dataset.

Group of answer choices
True
False

A

False

12
Q

In the decision tree, the sequence of if/else questions are called qualifiers.

Group of answer choices
True
False

A

False

13
Q

Which is not true about Random Forest?

Group of answer choices
Reduces underfitting by averaging trees that predict well.
Reduces overfitting by averaging trees that predict well.
Selects candidate features at random when splitting nodes.
Randomly selects some of the data when creating a tree.

A

Reduces underfitting by averaging trees that predict well.

14
Q

What are the parameters for Gradient Boosting?

a. n_estimators, learning_rate
b. n_estimators, max_features
c. n_estimators, learning_rate, max_depth
d. n_estimators, max_features, max_depth

A

c

15
Q

Gradient boosting is used when you need more performance than random forests provide.

Group of answer choices
True
False

A

True

16
Q

In the decision tree, the sequence of if/else questions are called ______.

Group of answer choices
Qualifiers
Condition
Tests
Nodes

A

Tests

17
Q

Decision trees in scikit-learn are implemented in the DecisionTreeRegressor and _______ classes.

Group of answer choices
DecisionClassifier
TreeDecisionClassifier
DecisionTreeClassifier
DecisionClassifierTree

A

DecisionTreeClassifier

18
Q

We can visualize the tree using the ______ function from the tree module.

A

export_graphviz

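The cards above name DecisionTreeClassifier, DecisionTreeRegressor, and export_graphviz; here is a minimal sketch of how they fit together, assuming the iris dataset and an arbitrary max_depth:

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Fit a small tree (dataset and max_depth are illustrative assumptions)
iris = load_iris()
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(iris.data, iris.target)

# export_graphviz writes a text file in the .dot graph format
export_graphviz(tree, out_file="tree.dot",
                feature_names=iris.feature_names,
                class_names=iris.target_names,
                filled=True)
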
19
Q

The two most common linear classification algorithms:

A

Logistic Regression
Linear Support Vector Machines

20
Q

Logistic Regression is implemented in which class?

A

linear_model.LogisticRegression

21
Q

Linear Support Vector Machines (Linear SVMs) are implemented in which class?

A

svm.LinearSVC

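A minimal sketch of fitting the two classes just named, assuming a toy make_blobs dataset:

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Toy two-class data (an illustrative assumption)
X, y = make_blobs(centers=2, random_state=0)

logreg = LogisticRegression().fit(X, y)   # linear_model.LogisticRegression
svc = LinearSVC().fit(X, y)               # svm.LinearSVC
print(logreg.score(X, y), svc.score(X, y))
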
22
Q

SVC stands for?

A

support vector classifier

23
Q

______ is a classification algorithm and not a regression algorithm, and it should not be confused with LinearRegression.

A

LogisticRegression

24
Q

The trade-off parameter that determines the strength of the regularization is called _____.

A

C

25
Q

Higher values of C correspond to _____.

A

Less regularization

26
Q

When you use a high value of the parameter C, LogisticRegression and LinearSVC will _______.

A

try to fit the training set as best as possible

27
Q

With low values of the parameter C, the models put more emphasis on _______.

A

finding a coefficient vector (w) that is close to zero

28
Q

Using low values of C will cause the algorithms to try to adjust to the _____ of data points.

A

"majority"

29
Q

Using a higher value of C stresses the importance that each ______ be classified correctly.

A

individual data point

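Cards 24-29 describe the regularization parameter C; a minimal sketch of its effect, assuming the breast cancer dataset and arbitrary C values:

from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

# Higher C -> less regularization -> fit the training set more closely
for C in [0.01, 1, 100]:
    model = LogisticRegression(C=C, max_iter=5000).fit(X_train, y_train)
    print(C, model.score(X_train, y_train), model.score(X_test, y_test))
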
30
Q

_______ are a family of classifiers that are quite similar to the linear models.

A

Naive Bayes classifiers

31
Q

In Naive Bayes, _____ is faster than for linear classifiers.

A

Training speed

32
Q

In Naive Bayes, _____ performance is slightly lower.

A

Generalization

33
Q

The reason that Naive Bayes models are so efficient is that they ______ and collect simple per-class statistics from each feature.

A

learn parameters by looking at each feature individually

34
Q

The reason that Naive Bayes models are so efficient is that they learn parameters by looking at each feature individually and _______.

A

collect simple per-class statistics from each feature

35
Q

3 kinds of Naive Bayes classifiers in scikit-learn:

A

GaussianNB
BernoulliNB
MultinomialNB

36
Q

GaussianNB -> ____ data

A

Continuous

37
Q

BernoulliNB -> ____ data, ___ data

A

Binary data, text data

38
Q

MultinomialNB -> ____ data, ___ data

A

Integer count data, text data

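A minimal sketch matching each Naive Bayes variant to its data type; the toy arrays are illustrative assumptions:

import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB, MultinomialNB

y = np.array([0, 1, 0, 1])

X_continuous = np.random.RandomState(0).randn(4, 3)                # continuous -> GaussianNB
X_binary = np.array([[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]])  # binary -> BernoulliNB
X_counts = np.array([[3, 0, 1], [0, 2, 4], [5, 1, 0], [0, 0, 2]])  # integer counts -> MultinomialNB

GaussianNB().fit(X_continuous, y)
BernoulliNB(alpha=1.0).fit(X_binary, y)     # alpha smooths the per-class statistics
MultinomialNB(alpha=1.0).fit(X_counts, y)
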
39
Q

In Naive Bayes, the alpha parameter controls _____.

A

model complexity

40
Q

In Naive Bayes, statistics are _____ by adding as many virtual data points with positive values as alpha.

A

smoothed

41
Q

In Naive Bayes, ____ decreases the complexity of the model but does not change the performance much.

A

a large alpha

42
Q

_____ is mostly used on very high-dimensional data.

A

GaussianNB

43
Q

_____ and ______ are widely used for sparse count data such as text.

A

BernoulliNB and MultinomialNB

44
Q

In Naive Bayes, _____ are fast, and the models are easy to understand.

A

Training and testing

45
Q

Naive Bayes works well with _____ and is not _____.

A

sparse high-dimensional datasets; parameter sensitive

46
Q

______ are widely used models for classification and regression tasks.

A

Decision trees

47
Q

Decision trees learn a hierarchy of ____, leading to a decision.

A

if/else questions

48
Q

Learning a _____ means learning the sequence of if/else questions that gets us to the true answer most quickly.

A

decision tree

49
Q

In the machine learning setting, if/else questions are called ___.

A

tests

50
Q

To build a tree, the algorithm searches over all possible tests and finds the one that is ____ about the target variable.

A

most informative

51
Q

The top node is called the ___, representing the whole dataset.

A

root

52
Q

Parts of a decision tree:

A

Root node
Node
Edge (connects tests to other nodes)
Terminal node (node with no further edges)
Characteristics (inside nodes)

53
Q

A prediction on a new data point is made by checking which region of the ____ the point lies in, and then predicting the majority target (or the single target in the case of pure leaves) in that region.

A

partition of the feature space

54
Q

The ____ can be found by traversing the tree from the root and going left or right, depending on whether the test is fulfilled or not.

A

region

55
Q

Decision trees in scikit-learn are implemented in the ____ and ____ classes.

A

DecisionTreeRegressor, DecisionTreeClassifier

56
Q

We can visualize the tree using the ___ function from the tree module.

A

export_graphviz

57
Q

export_graphviz writes a file in the ____, which is a text file format for storing graphs.

A

.dot file format

58
Q

export_graphviz writes a file in the .dot file format, which is a ____ for storing graphs.

A

text file format

59
Q

We can visualize the _____ in a way that is similar to the way we visualize the coefficients in the linear model.

A

feature importances

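A minimal sketch of reading feature_importances_, assuming the breast cancer dataset:

from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

data = load_breast_cancer()
tree = DecisionTreeClassifier(random_state=0).fit(data.data, data.target)

# One importance value per feature; the values sum to 1
for name, importance in zip(data.feature_names, tree.feature_importances_):
    print(f"{name}: {importance:.3f}")
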
60
Q

For tree-based models, _____ (predicting outside the range of the training data) is impossible.

A

Extrapolation

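A minimal sketch of this extrapolation limit, using assumed toy data with a linear trend:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

X_train = np.arange(0, 10).reshape(-1, 1)
y_train = X_train.ravel() * 2.0            # simple linear trend

tree = DecisionTreeRegressor().fit(X_train, y_train)
# Inside the training range the fit is fine; outside it, the tree
# keeps predicting the value of its last leaf instead of extrapolating
print(tree.predict([[9], [50], [100]]))
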
61
Q

____ is not affected by the scale of the data.

A

Decision tree regression

62
Q

_____ are methods that combine multiple machine learning models to create more powerful models.

A

Ensembles

63
Q

Two ensemble models that have proven to be effective on a wide range of datasets, for classification and regression, both of which use decision trees as their building blocks:

A

Random forests
Gradient boosted decision trees

64
Q

It is one of the ensemble methods that can avoid overfitting by combining multiple decision trees.

A

Random forests

65
Q

Random forests reduce overfitting by ______.

A

averaging trees that predict well

66
Q

In random forests, regression is:

A

the average of the predicted values

67
Q

In random forests, classification is:

A

the average of the predicted probabilities

68
Q

It injects randomness when creating trees.

A

Random forests

69
Q

Random forests randomly select _____ when creating a tree.

A

some of the data

70
Q

Random forests select ______ when splitting nodes.

A

candidate features at random

71
Q

To build a random forest model, you need to decide on the _____ to build.

A

number of trees

72
Q

To build a random forest model, you need to decide on the number of trees to build (the ____ parameter of RandomForestRegressor or RandomForestClassifier).

A

n_estimators

73
Q

To build a tree, we first take what is called a _____ of our data. That is, from our n_samples data points, we repeatedly draw an example randomly with replacement.

A

bootstrap sample

74
Q

A critical parameter in this process is ____. If we set it to n_features, each split can look at all the features in the dataset, and no randomness will be injected in the feature selection.

A

max_features

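A minimal sketch of the two parameters just named, assuming the breast cancer dataset and arbitrary values:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,     # number of trees to build
    max_features="sqrt",  # candidate features drawn at random at each split
    random_state=0).fit(X_train, y_train)

# Classification averages the trees' predicted probabilities
print(forest.score(X_test, y_test))
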
75
Q

The advantages of random forests are:

A

Most widely used algorithm for regression and classification
Excellent performance, less burden of parameter tuning, no data scaling required
Can be applied to large datasets

76
Q

Random forests are the _______ algorithm for regression and classification.

A

most widely used

77
Q

Random forests have ___ performance, less burden of _____, and no ____ required.

A

excellent performance, parameter tuning, data scaling

78
Q

Random forests can be applied to ____ datasets.

A

large

79
Q

The disadvantages of random forests are:

A

As many trees are created, detailed analysis is difficult and the trees tend to get deeper
Poor performance on large, sparse data
More memory usage and slower training and prediction than linear models

80
Q

In random forests, as many trees are created, _____ is difficult and the trees tend to get deeper.

A

detailed analysis

81
Q

Random forests have poor performance on ___ and ____ data.

A

large and sparse

82
Q

Random forests have more ____ and slower ___ and ____ than linear models.

A

memory usage; training and prediction

83
Q

The parameters used in random forests are:

A

n_estimators, max_features

84
Q

Another ensemble algorithm based on DecisionTreeRegressor:

A

Gradient boosted regression trees (gradient boosting machines)

85
Q

Gradient boosted regression trees can be used for both ____ and ___.

A

classification and regression

86
Q

In gradient boosted regression trees, unlike random forests, _____ is strongly applied instead of randomness.

A

pre-pruning

87
Q

Used a lot in machine learning contests (Kaggle):

A

Gradient boosted regression trees (gradient boosting machines)

88
Q

Gradient boosted regression trees are slightly more _____ and have slightly ____ than random forests.

A

parameter sensitive; higher performance

89
Q

Create the next tree to compensate for the errors of the previous tree, using a ______.

A

shallow tree of depth 5 or less

90
Q

In gradient boosted regression trees, regression uses:

A

the least squares error loss function

91
Q

In gradient boosted regression trees, classification uses:

A

the logistic loss function

92
Q

Gradient boosted regression trees use the _____ method.

A

gradient descent

93
Q

Gradient boosted regression trees use the gradient descent method (the learning_rate parameter is important; default = ___).

A

0.1

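A minimal sketch with the parameters these cards name; the dataset and the values other than the 0.1 default are illustrative assumptions:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_breast_cancer(return_X_y=True), random_state=0)

gbrt = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,   # scikit-learn's default
    max_depth=3,         # shallow trees, depth 5 or less
    random_state=0).fit(X_train, y_train)

print(gbrt.score(X_train, y_train), gbrt.score(X_test, y_test))
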
94
Q

Gradient boosting advantages:

A

Use when you need more performance than random forests (xgboost for larger scales)
No need for feature scaling; can be used for binary and continuous features

95
Q

Gradient boosting can be used when you need more _____ than random forests (____ for larger scales).

A

performance, xgboost

96
Q

Gradient boosting needs no _________ and can be used for binary and continuous features.

A

feature scale adjustment

97
Q

Gradient boosting disadvantages:

A

Doesn't work well for sparse, high-dimensional data
Sensitive to parameters; takes longer training time

98
Q

Gradient boosting doesn't work well for sparse ___.

A

high-dimensional data

99
Q

Gradient boosting is sensitive to _____ and takes longer _____.

A

parameters; training time

100
Q

Gradient boosting parameters are:

A

n_estimators
learning_rate
max_depth (<= 5)