Chapter 2 Quiz Flashcards

(32 cards)

1
Q

methods are trained on a set of training data and then their performance is evaluated on a separate set of validation data

A

data partitioning

2
Q

tasks of classification and prediction as well as pattern discovery

A

predictive analytics

3
Q

predicting the value of a categorical variable

A

classification

4
Q

predicting the value of a numerical variable

A

prediction

5
Q

finding general association patterns between items in large databases, expressed as rules that apply to an entire population

A

association rules/affinity analysis

6
Q

method that uses individual users’ preferences and tastes given their historic purchases or measurable behavior indicative of preference

A

collaborative filtering

7
Q

consolidating a large number of records into a smaller set

A

data reduction

8
Q

methods for reducing the number of cases

A

clustering

9
Q

reduction of the number of variables

A

dimension reduction

10
Q

exploration by creating charts and dashboards

A

data visualization/visual analytics

11
Q

used in classification and prediction, must have data available in which the value of the outcome of interest is known

A

supervised learning algorithms

12
Q

data from which classification or prediction algorithm learns

A

training data

13
Q

sample of data where the outcome is known, used to compare the performance of different models

A

validation data

14
Q

sample of data where the outcome is known, used to estimate how well the chosen model will perform on new data

A

test data
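
The three partitions in cards 12 to 14 can be illustrated with a short sketch. The function name `partition`, the 50/30/20 split, and the seed are assumptions made for the example; the cards do not prescribe any particular proportions.

```python
import random

def partition(records, train_frac=0.5, valid_frac=0.3, seed=1):
    """Shuffle records and split them into training, validation, and test sets.

    The fractions and seed are illustrative assumptions, not recommended values.
    """
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                   # the model learns from these
    valid = shuffled[n_train:n_train + n_valid]  # used to compare models
    test = shuffled[n_train + n_valid:]          # remaining records, held out
    return train, valid, test

train, valid, test = partition(range(100))
```

Each record lands in exactly one partition, so the test set stays untouched until the final model has been chosen.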

15
Q

there is no outcome variable to predict or classify

A

unsupervised learning algorithm

16
Q

steps in machine learning

A

understand project
obtain data
preprocess data
reduce dimensions (if necessary)
determine ML task
partition data
choose ML technique
perform task
interpret results
deploy model

17
Q

SEMMA

A

sample, explore, modify, model, assess

18
Q

character, integer, categorical

A

types of variables

19
Q

unordered categorical variables

A

nominal variables

20
Q

ordered categorical variables

A

ordinal variables

21
Q

categorical variables decomposed into a series of binary variables

A

dummy variables

22
Q

creating a separate binary dummy variable for each category of a categorical variable

A

one-hot encoding
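
A minimal sketch of one-hot encoding, assuming the categorical variable arrives as a plain list of labels. The helper name `one_hot` and the color labels are made up for illustration.

```python
def one_hot(values):
    """Return (categories, rows) with one binary dummy column per distinct category."""
    categories = sorted(set(values))  # fixed column order for the dummies
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows

categories, rows = one_hot(["red", "green", "blue", "green"])
# categories -> ['blue', 'green', 'red']
# rows       -> [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

Every row contains exactly one 1, marking the category that record belongs to.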

23
Q

values that lie far away from the bulk of the data

A

outliers

24
Q

knowledge of the particular application being considered

A

domain knowledge

25
Q

how to standardize a value

A

subtract mean from each value and divide by standard deviation

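Standardization can be sketched in a few lines. This example assumes the sample standard deviation (n - 1 denominator), which the card does not specify; the function name `standardize` and the input values are illustrative.

```python
import statistics

def standardize(values):
    """Convert values to z-scores: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [(v - mean) / sd for v in values]

z = standardize([2.0, 4.0, 6.0, 8.0])
```

After standardizing, the values have mean 0 and standard deviation 1.
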
26
Q

how to normalize a value to [0,1]

A

subtract minimum value and divide by range

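A small sketch of rescaling to [0,1], assuming the values are not all identical (otherwise the range is zero). The function name `normalize` and the input values are illustrative.

```python
def normalize(values):
    """Rescale values to [0, 1]: (value - minimum) / (maximum - minimum)."""
    lo, hi = min(values), max(values)
    value_range = hi - lo  # assumed non-zero for this sketch
    return [(v - lo) / value_range for v in values]

scaled = normalize([10, 15, 20, 30])
# scaled -> [0.0, 0.25, 0.5, 1.0]
```

The minimum always maps to 0 and the maximum to 1.
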
27
Q

creating a complex model that fits the training data so closely that it cannot be applied to other data

A

overfitting

28
Q

adds up the squared errors, so positive and negative errors of the same size contribute equally

A

sum of squared errors

29
Q

average of the squared errors

A

mean squared error

30
Q

square root of the MSE, which gives an idea of the typical error on the same scale as the outcome variable

A

root mean squared error

31
Q

average of the absolute values of the errors

A

mean absolute deviation

32
Q

what do SSE, MSE, RMSE, and MAD measure?

A

prediction error
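
The four measures in cards 28 to 31 can be computed side by side. This is a sketch that assumes the error is defined as actual minus predicted; the function name `error_metrics` and the sample values are made up for illustration.

```python
import math

def error_metrics(actual, predicted):
    """Compute SSE, MSE, RMSE, and MAD for paired actual and predicted values."""
    errors = [a - p for a, p in zip(actual, predicted)]  # assumption: error = actual - predicted
    n = len(errors)
    sse = sum(e ** 2 for e in errors)      # sum of squared errors
    mse = sse / n                          # mean squared error
    rmse = math.sqrt(mse)                  # same scale as the outcome variable
    mad = sum(abs(e) for e in errors) / n  # mean absolute deviation
    return {"SSE": sse, "MSE": mse, "RMSE": rmse, "MAD": mad}

metrics = error_metrics(actual=[3, 5, 7, 9], predicted=[2, 5, 9, 9])
# errors are 1, 0, -2, 0, so SSE = 5, MSE = 1.25, MAD = 0.75
```

Because of the squaring, the error of -2 adds 4 to the SSE, the same as an error of +2 would.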