Key Words Flashcards

1
Q

Data Mining

A

identifying implicit, previously unknown, potentially useful patterns in data

2
Q

Data Mining can be done…

Data Mining can be…

A

Data Mining can be done… interactively or automatically

Data Mining can be… descriptive or predictive

3
Q

Machine Learning (for data mining)

A

programs that induce structural descriptions from observations

4
Q

Supervised learning

A

is based on labeled examples and used for predicting labels of new observations

5
Q

Unsupervised learning

A

is based on unlabeled data

6
Q

Classification Rule

A

predicts value of given attribute

7
Q

Association Rule

A

predicts value of arbitrary attribute (or combination)

8
Q

Concepts

A

structures that can be learned; the thing to be learned

9
Q

Concept Description

A

Output of learning scheme

10
Q

Important types of learning problems (four)

A

Classification Learning
Numeric Prediction
Association Learning
Clustering

11
Q

Clustering

A

Grouping similar instances into clusters

12
Q

Is classification learning supervised or unsupervised?

A

Supervised - Scheme is provided with actual outcome

13
Q

Explain classification learning

A
Predicting a nominal class.
Success can be measured on fresh data for which class labels are known (test data). In practice, success is often measured subjectively.
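A minimal sketch of measuring success on test data, assuming a toy one-attribute rule learner. The data, the attribute values and the function names are hypothetical and only illustrate the idea: learn from labeled training instances, then count correct predictions on held-out labeled instances.

```python
from collections import Counter

# Hypothetical labeled instances: (attribute value, class label).
train = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
         ("rainy", "yes"), ("rainy", "yes"), ("rainy", "no")]
# Fresh instances whose class labels are known (test data).
test = [("sunny", "no"), ("overcast", "yes"), ("rainy", "yes"), ("sunny", "yes")]

def learn_one_attribute_rules(instances):
    """For each attribute value, predict the class seen most often with it."""
    counts = {}
    for value, label in instances:
        counts.setdefault(value, Counter())[label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def accuracy(rules, instances):
    """Fraction of instances whose predicted class matches the known class."""
    correct = sum(1 for value, label in instances if rules.get(value) == label)
    return correct / len(instances)

rules = learn_one_attribute_rules(train)
test_accuracy = accuracy(rules, test)
print(rules, test_accuracy)
```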
14
Q

What do we call the outcome?

A

The class of the example

15
Q

Regression/Numeric Prediction

A

predicting a numeric quantity

Variant of classification learning where “class” is numeric

16
Q

Is regression/numeric prediction supervised or unsupervised?

A

Supervised - scheme is provided with the target value

17
Q

Explain numeric prediction/regression

A

Measure success by comparing predictions to actual values in test data
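A small sketch of that comparison, assuming mean absolute error and root mean squared error as the measures. The card does not name a specific measure, and the numbers below are made up for illustration.

```python
import math

# Hypothetical test data: actual target values and a model's predictions.
actual = [3.0, 5.0, 2.0, 8.0]
predicted = [2.0, 5.0, 4.0, 7.0]

def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, rmse)
```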

18
Q

Association Learning

A
Detecting associations between attributes
Can be applied if no class is specified and any kind of structure is considered "interesting"
19
Q

Is association learning supervised or unsupervised?

A

Unsupervised

20
Q

What is the difference between association learning and classification learning?

A

Association learning can predict any attribute's value, not just the class, and more than one attribute's value at a time.
Consequence: there are far more association rules than classification rules.
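A rough sketch of why the rule count grows. It only counts which sets of attributes could appear on the right-hand side of a rule, not complete rules, and the attribute names are hypothetical.

```python
from itertools import combinations

# Hypothetical attribute names; "play" is assumed to be the class.
attributes = ["outlook", "temperature", "humidity", "windy", "play"]
class_attribute = "play"

# A classification rule always predicts the class attribute.
classification_consequents = [(class_attribute,)]

# An association rule may predict any non-empty combination of attributes.
association_consequents = [
    combo
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
]

print(len(classification_consequents), len(association_consequents))
```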

21
Q

Clustering

A

Finding groups of items that are similar

success often measured subjectively
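A minimal sketch of grouping similar items, assuming one-dimensional k-means with fixed starting centers. The card does not name an algorithm; k-means, the values and the starting centers are chosen here purely for illustration.

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each value to its nearest center, then move centers to cluster means."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled values and starting centers.
values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centers, clusters = k_means_1d(values, centers=[0.0, 5.0])
print(centers, clusters)
```

No class labels are used anywhere, which is what makes this unsupervised.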

22
Q

Is clustering supervised or unsupervised?

A

unsupervised - the class of an example is not known

23
Q

Instance

A

example provided as a fixed-length row of data

  • thing to be classified, associated or clustered
  • individual, independent example of target concept
  • Characterized by a predetermined set of attributes
24
Q

What is a dataset?

A

Input to learning scheme, set of instances

25
Q

Denormalization

A

Process of flattening: several relations are joined together to make one. Denormalization is possible with any finite set of finite relations. It may produce spurious regularities that reflect the structure of the database.
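A small sketch of flattening, assuming two hypothetical relations (customers and orders) joined on a customer id. All names and values are made up for illustration.

```python
# Hypothetical relations: customers keyed by id, and orders referencing them.
customers = {
    1: {"name": "Ann", "city": "Oslo"},
    2: {"name": "Bob", "city": "Rome"},
}
orders = [
    {"customer_id": 1, "item": "milk"},
    {"customer_id": 1, "item": "bread"},
    {"customer_id": 2, "item": "milk"},
]

# Join the two relations into one flat table with one row per order.
flat = [{**customers[o["customer_id"]], "item": o["item"]} for o in orders]

for row in flat:
    print(row)
```

In the flat table every row with name "Ann" has city "Oslo". A learner could report that as a pattern, although it only reflects how the tables were joined.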

26
Q

Denormalization problems

A
  • relationships without a pre-specified number of objects

e.g. the concept of a nuclear family

27
Q

Concept of nuclear family

A

may need relational data mining techniques, which are beyond the scope of this course