Chapter 5 Flashcards

(70 cards)

1
Q

Before Machine Learning, we program ______ to recognize an apple given images of apples.

A

logic/algorithm

2
Q

With Machine Learning, given new images of apples, ______ make the prediction.

A

Machine Learning Models

3
Q

In Supervised Learning, we build a machine that can extract ______ between variables.

A

relationships

4
Q

Lazy learning algorithms generalize only when ______, rather than during the training phase.

A

queried

5
Q

During the training phase of lazy learning, the algorithm accepts the data as input but refrains from ______ on it.

A

actively training

6
Q

In lazy learning, the actual model training occurs during the ______.

A

prediction phase

7
Q

An example of a lazy learning algorithm is the ______ algorithm.

A

K-nearest neighbors (KNN)

8
Q

Eager learning algorithms process data during the ______ phase.

A

training

9
Q

Examples of eager learning algorithms include Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees, and ______.

A

Artificial Neural Networks

10
Q

Instance-based learning involves using the ______ to make predictions.

A

entire dataset

11
Q

In instance-based learning, the machine compares the new data to the instances it has seen before and uses the ______ to make a prediction.

A

closest match

12
Q

In instance-based learning, ______ is created; instead, the machine stores all of the training data.

A

no model

13
Q

Instance-based learning is often used in pattern recognition, ______, and anomaly detection.

A

clustering

14
Q

Model-based learning involves creating a ______ that can predict outcomes based on input data.

A

mathematical model

15
Q

In model-based learning, the model can be thought of as a set of ______ that the machine uses to make predictions.

A

rules

16
Q

Parametric models make specific ______ about the relationship between input and output data.

A

hypotheses

17
Q

In parametric models, assumptions concern a ______ of parameters and variables that impact the model’s result.

A

fixed number

18
Q

Parametric models are easier to understand, as they form a certain ______ about the input data that has to hold.

A

hypothesis

19
Q

Because of the assumptions they make, parametric models often need ______ to reach a certain level of accuracy.

A

less data

20
Q

If the hypothesis is met, parametric models can be more efficient and perform better than ______ models.

A

non-parametric

21
Q

A limitation of parametric models is that the assumption often ______ the problem.

A

simplifies

22
Q

Parametric models have proven to be sensitive to outliers, show limited performance on problems involving ______, and struggle to adapt to new, unseen data.

A

nonlinearity

23
Q

Non-parametric models don’t need to make ______ about the relations between the input and output to generate an outcome.

A

assumptions

24
Q

Non-parametric models also don’t require a certain number of ______ to be set and learned.

A

parameters

25
Q

Studies have shown that non-parametric models perform better on ______ and are more flexible.

A

large datasets

26
Q

Common non-parametric algorithms are random forests, decision trees, Support Vector Machines with non-linear kernels, the k-Nearest Neighbors (k-NN) algorithm, and neural networks with ______ activation functions.

A

non-parametric

27
Q

The main benefit of non-parametric models is that they can capture ______ and relationships without having to follow a hypothesis.

A

complex patterns

28
Q

Non-parametric models can handle ______ and noisy data effectively.

A

outliers

29
Q

A limitation of non-parametric models is that they require ______ in order to generate better predictions.

A

more data input

30
Q

Non-parametric models are usually harder to understand and analyze, as there are no ______ about the behavior of input and output.

A

functional assumptions

31
Q

A machine learning model maps from ______ to a prediction, represented as f(x) -> y.

A

features

32
Q

Learning has three stages: Training, ______, and Test.

A

Validation

33
Q

In the training stage of learning, one aims to optimize ______.

A

model parameters

34
Q

The validation stage of learning involves intermediate evaluations to ______ the model.

A

design/select

35
Q

The test stage of learning is for final ______.

A

performance evaluation

36
Q

Model design and hyperparameter tuning are performed using a ______.

A

validation set

37
Q

Sometimes, you need to split 'train' into a train set for learning parameters and a ______ for checking model performance.

A

val set

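The three-stage split the cards above describe can be sketched in plain Python with NumPy. This is a minimal illustration, not from the text: the function name and the 60/20/20 ratio are my own choices.

```python
import numpy as np

def train_val_test_split(X, y, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle the data once, then carve it into train/val/test pieces."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))          # one random shuffle of the indices
    n_test = int(len(X) * test_frac)
    n_val = int(len(X) * val_frac)
    test_idx = idx[:n_test]
    val_idx = idx[n_test:n_test + n_val]
    train_idx = idx[n_test + n_val:]       # remainder goes to training
    return (X[train_idx], y[train_idx],
            X[val_idx], y[val_idx],
            X[test_idx], y[test_idx])

X = np.arange(20).reshape(10, 2)
y = np.arange(10)
X_tr, y_tr, X_va, y_va, X_te, y_te = train_val_test_split(X, y)
print(len(X_tr), len(X_va), len(X_te))  # 6 2 2
```

Parameters are learned on the train piece, model selection uses the val piece, and the test piece is touched only once, for the final evaluation.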
38
Q

The key principle of machine learning states: if x_i is similar to x_j, then y_i is probably ______ to y_j.

A

similar

39
Q

In K-Nearest Neighbors, an arbitrary instance x is described by the feature vector <a_1(x), a_2(x), ..., a_n(x)>, where a_r(x) denotes the value of the ______ of instance x.

A

rth attribute

40
Q

The distance between two instances x_i and x_j in KNN is defined to be d(x_i, x_j), calculated using the ______ distance formula.

A

Euclidean

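The Euclidean distance from card 40, d(x_i, x_j) = sqrt(sum over r of (a_r(x_i) - a_r(x_j))^2), is a one-liner; a minimal sketch (the function name is my own):

```python
import math

def euclidean_distance(xi, xj):
    """d(xi, xj) = sqrt(sum over attributes r of (ar(xi) - ar(xj))^2)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```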
41
Q

The nearest neighbor algorithm assigns the label/target value of the ______ training example to the given test features.

A

most similar

42
Q

For KNN classification, the prediction is usually the ______, or most common class, of the returned labels.

A

mode

43
Q

For KNN regression, the prediction is usually the ______ of the returned values.

A

arithmetic mean

44
Q

Concept learning involves acquiring ______ from specific training examples.

A

general concepts

45
Q

Each concept can be thought of as a ______ function defined over a larger set.

A

boolean-valued

46
Q

Implementing KNN involves three steps: Calculate Distance, Identify Nearest Neighbors, and ______.

A

Aggregate Nearest Neighbors

47
Q

In KNN Classification, the algorithm aggregates the class labels of the 'k' nearest neighbors to predict the class of the current data point, choosing the ______.

A

most common class label

48
Q

In KNN Regression, the algorithm calculates the ______ of the k nearest training examples.

A

mean value

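The three steps from cards 46-48 can be sketched in a few lines of Python. This is a minimal illustration under my own naming (knn_predict and the toy data are not from the text); it aggregates by mode for classification and by mean for regression, as the cards state.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3, task="classify"):
    # Step 1: calculate the Euclidean distance to every training instance.
    dists = [(math.dist(x, query), y) for x, y in zip(train_X, train_y)]
    # Step 2: identify the k nearest neighbors.
    neighbors = [y for _, y in sorted(dists)[:k]]
    # Step 3: aggregate — most common label (mode) for classification,
    # arithmetic mean for regression.
    if task == "classify":
        return Counter(neighbors).most_common(1)[0][0]
    return sum(neighbors) / k

X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_cls = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(X, y_cls, (0.5, 0.5), k=3))  # a

y_reg = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]
print(knn_predict(X, y_reg, (5.5, 5.5), k=3, task="regress"))  # 10.0
```

Note there is no training step at all: the function only touches the stored data when a query arrives, which is exactly the lazy-learning behavior from cards 4-7.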
49
Q

In nearest-neighbor learning, the target function may be either ______ or real-valued.

A

discrete-valued

50
Q

The KNN classification algorithm returns f_hat(x_q) <- argmax_{v in V} Σ_{i=1}^{k} ______.

A

δ(v, f(x_i)), where δ(a, b) = 1 if a = b and 0 otherwise

51
Q

In the Distance-Weighted Nearest Neighbor algorithm, greater weight is given to ______.

A

closer neighbors

52
Q

In Distance-Weighted Nearest Neighbor Classification, w_i is defined as ______.

A

1 / d(x_q, x_i)^2

53
Q

In Distance-Weighted Nearest Neighbor Regression, f_hat(x_q) is calculated as (Σ_{i=1}^{k} w_i * f(x_i)) / ______.

A

Σ_{i=1}^{k} w_i

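The weighting scheme from cards 51-53 can be sketched for the regression case. A minimal version under my own naming; the guard for d = 0 (returning f(x_i) directly when the query coincides with a training point) is the usual convention, since w_i = 1/d^2 is otherwise undefined there.

```python
import math

def weighted_knn_regress(train_X, train_y, query, k=3):
    """Distance-weighted k-NN regression:
    f_hat(x_q) = (sum_i w_i * f(x_i)) / (sum_i w_i), with w_i = 1/d(x_q, x_i)^2."""
    nearest = sorted((math.dist(x, query), y)
                     for x, y in zip(train_X, train_y))[:k]
    for d, y in nearest:
        if d == 0.0:            # query equals a training point: use its value
            return y
    weights = [(1.0 / d ** 2, y) for d, y in nearest]
    return sum(w * y for w, y in weights) / sum(w for w, _ in weights)

X = [(0.0,), (2.0,), (4.0,)]
y = [0.0, 2.0, 4.0]
print(weighted_knn_regress(X, y, (1.0,), k=2))  # 1.0
```

Because closer neighbors get quadratically larger weights, a single nearby point can dominate many distant ones, which is what makes the method robust to noisy training data (card 59).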
54
Q

Considering all training examples (global KNN) with distance weighting means the classifier will run ______.

A

more slowly

55
Q

The nearest neighbour classifier is also known as lazy learning, ______, or instance-based learning.

A

memory-based learning

56
Q

Nearest neighbour classifiers are useful for large, ______ datasets, where trained models become obsolete in a short time.

A

fast-changing

57
Q

In a nearest neighbour classifier, the training phase is simply to ______.

A

do nothing

58
Q

A ______ shows the decision boundaries of a nearest neighbour classifier.

A

Voronoi diagram

59
Q

The distance-weighted k-NEAREST NEIGHBOR algorithm is robust to ______ and is effective for large numbers of training examples.

A

noisy training data

60
Q

A limitation of KNN is the assumption that the classification of an instance will be most similar to that of instances nearby in ______.

A

Euclidean distance

61
Q

Another limitation of KNN is that the distance between instances is calculated based on ______ of the instance.

A

all attributes

62
Q

If only a few attributes are relevant, the presence of many irrelevant attributes can ______ the distance metric in KNN.

A

mislead

63
Q

One solution to KNN's attribute limitation is to ______ each attribute differently when calculating distance.

A

weight

64
Q

Weighting attributes differently in KNN corresponds to ______ the axes in the Euclidean space.

A

stretching

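The axis-stretching idea from cards 63-64 amounts to multiplying each attribute's squared difference by its own weight z_r before summing. A minimal sketch (function name and weights are illustrative; setting z_r = 0 is the drastic case of eliminating an attribute entirely):

```python
import math

def weighted_euclidean(xi, xj, z):
    """Distance with each axis r stretched by weight z[r]:
    sqrt(sum over r of z_r * (ar(xi) - ar(xj))^2)."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(z, xi, xj)))

# These points differ only in attribute 1; weighting it 0 makes them identical.
print(weighted_euclidean((1, 5), (1, 9), z=(1.0, 1.0)))  # 4.0
print(weighted_euclidean((1, 5), (1, 9), z=(1.0, 0.0)))  # 0.0
```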
65
Q

A more drastic alternative to attribute weighting in KNN is to completely eliminate the ______ attributes from the instance space.

A

least relevant

66
Q

An additional practical issue in applying k-NEAREST NEIGHBOR is efficient ______.

A

memory indexing

67
Q

One indexing method for KNN is the ______.

A

kd-tree

68
Q

In statistical pattern recognition terminology, regression means approximating a ______ target function.

A

real-valued

69
Q

In statistical pattern recognition, the residual is the error, ______, in approximating the target function.

A

f_hat(x) - f(x)

70
Q

In statistical pattern recognition, the kernel function is the function of distance used to determine the ______ of each training example.

A

weight