Ch5 - Introduction To Data Mining Flashcards

(17 cards)

1
Q

What is the main objective of data mining?

A

To explore and analyze large quantities of data to discover meaningful patterns.

Data mining is essential for extracting knowledge from vast datasets.

2
Q

What is Knowledge Discovery in Databases (KDD)?

A

The automatic, non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.

KDD encompasses various steps including data cleaning, integration, selection, transformation, mining, evaluation, and presentation.

3
Q

List the seven steps in the KDD process.

A
  • Data cleaning
  • Data integration
  • Data selection
  • Data transformation
  • Data mining
  • Pattern evaluation
  • Knowledge presentation

These steps help in systematically extracting knowledge from data.
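
As a rough illustration, the seven steps can be chained as small functions. This is a toy sketch only: the record layout, the function names, the spending threshold, and the "pattern" being mined (total spending per customer) are all assumptions made for the example, not part of the card.

```python
from collections import defaultdict

# Hypothetical raw sources; the field names are assumptions for illustration.
store_a = [{"customer": "ann", "amount": "20"}, {"customer": "bob", "amount": None}]
store_b = [{"customer": "ann", "amount": "40"}, {"customer": "cy", "amount": "5"}]

def clean(records):
    # 1. Data cleaning: drop records with missing values.
    return [r for r in records if r["amount"] is not None]

def integrate(*sources):
    # 2. Data integration: combine several sources into one collection.
    return [r for source in sources for r in source]

def select(records):
    # 3. Data selection: keep only the fields relevant to the task.
    return [(r["customer"], r["amount"]) for r in records]

def transform(rows):
    # 4. Data transformation: convert amounts from text to numbers.
    return [(customer, float(amount)) for customer, amount in rows]

def mine(rows):
    # 5. Data mining: here, simply total spending per customer.
    totals = defaultdict(float)
    for customer, amount in rows:
        totals[customer] += amount
    return dict(totals)

def evaluate(patterns, threshold=10.0):
    # 6. Pattern evaluation: keep only totals at or above the threshold.
    return {c: t for c, t in patterns.items() if t >= threshold}

def present(patterns):
    # 7. Knowledge presentation: format results for a human reader.
    return [f"{c}: {t:.2f}" for c, t in sorted(patterns.items())]

cleaned = integrate(clean(store_a), clean(store_b))
report = present(evaluate(mine(transform(select(cleaned)))))
print(report)  # ['ann: 60.00']
```

Real pipelines rarely run straight through once; results from the evaluation step often send the analyst back to an earlier step.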

4
Q

What are the two main types of data mining methods?

A
  • Prediction Methods
  • Description Methods

Prediction methods use some variables to predict unknown or future values of other variables, while description methods find human-interpretable patterns that describe the data.

5
Q

What is predictive modeling in data mining?

A

Finding a model for a class attribute as a function of the values of other attributes.

An example is predicting credit worthiness based on various factors.
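
One way to picture "class attribute as a function of other attributes" is a nearest-neighbor classifier. The sketch below is a minimal example, assuming two made-up attributes (income and debt) and made-up training records; the card does not name any particular model.

```python
import math

# Hypothetical training records: (income in $1000s, debt in $1000s) -> class label.
training = [
    ((60.0, 5.0), "good"),
    ((80.0, 10.0), "good"),
    ((25.0, 20.0), "bad"),
    ((30.0, 30.0), "bad"),
]

def predict(attributes):
    """Return the class label of the closest training record (1-nearest neighbor)."""
    nearest = min(training, key=lambda rec: math.dist(rec[0], attributes))
    return nearest[1]

print(predict((70.0, 8.0)))   # good
print(predict((28.0, 25.0)))  # bad
```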

6
Q

What is clustering in data mining?

A

Finding groups of objects such that objects in a group are similar to each other and different from objects in other groups.

Clustering aims to maximize inter-cluster distances and minimize intra-cluster distances.
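
To make the two distances concrete, the sketch below compares a sensible grouping of four points with a poor one. The points and the helper name `mean_distances` are assumptions for illustration; it measures a given grouping rather than running any clustering algorithm.

```python
import math
from itertools import combinations

def mean_distances(clusters):
    """Return (mean intra-cluster distance, mean inter-cluster distance)."""
    intra, inter = [], []
    # Tag every point with the index of the cluster it belongs to.
    labeled = [(p, i) for i, cluster in enumerate(clusters) for p in cluster]
    for (p, i), (q, j) in combinations(labeled, 2):
        (intra if i == j else inter).append(math.dist(p, q))
    return sum(intra) / len(intra), sum(inter) / len(inter)

# Same four points, grouped two different ways.
good = [[(0, 0), (0, 1)], [(10, 10), (10, 11)]]
poor = [[(0, 0), (10, 10)], [(0, 1), (10, 11)]]

good_intra, good_inter = mean_distances(good)
poor_intra, poor_inter = mean_distances(poor)
print(good_intra < poor_intra, good_inter > poor_inter)  # True True
```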

7
Q

What is the goal of association rule discovery?

A

To produce dependency rules that predict the occurrence of an item based on the occurrences of other items.

Example rules include {Milk} –> {Coke} and {Diaper, Milk} –> {Beer}.
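
Rules like these are usually judged by support and confidence, two standard measures that the card itself does not mention. The sketch below computes both over a small set of made-up market-basket transactions; the data and function names are assumptions for illustration.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"Bread", "Coke", "Milk"},
    {"Beer", "Bread"},
    {"Beer", "Coke", "Diaper", "Milk"},
    {"Beer", "Bread", "Diaper", "Milk"},
    {"Coke", "Diaper", "Milk"},
]

def count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions)

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return count(itemset) / len(transactions)

def confidence(antecedent, consequent):
    """Among transactions containing antecedent, the fraction also containing consequent."""
    return count(antecedent | consequent) / count(antecedent)

milk_coke = confidence({"Milk"}, {"Coke"})
diaper_milk_beer = confidence({"Diaper", "Milk"}, {"Beer"})
print(milk_coke)  # 0.75
```

Here {Milk} –> {Coke} holds in 3 of the 4 transactions that contain Milk, and {Diaper, Milk} –> {Beer} holds in 2 of the 3 that contain both Diaper and Milk.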

8
Q

What is anomaly detection?

A

Detecting significant deviations from normal behavior.

Applications include credit card fraud detection and network intrusion detection.
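
A very simple way to flag deviations is a z-score test: values far from the mean, measured in standard deviations, are treated as anomalies. This is one of many possible techniques, not one the card prescribes; the amounts and the threshold of 2.0 are assumptions for illustration.

```python
import statistics

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score magnitude exceeds threshold."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mean) / spread > threshold]

# Hypothetical card transaction amounts; one is far larger than the rest.
amounts = [20, 25, 22, 19, 24, 21, 23, 500]
anomalies = find_anomalies(amounts)
print(anomalies)  # [500]
```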

9
Q

What are some motivating challenges in data mining?

A
  • Scalability
  • High Dimensionality
  • Heterogeneous and Complex Data
  • Data Ownership and Distribution
  • Non-traditional Analysis

These challenges can complicate the data mining process and the interpretation of results.

10
Q

Fill in the blank: Data mining helps scientists in _______ of massive datasets.

A

[automated analysis]

This process is crucial for hypothesis formation in scientific research.

11
Q

What is the significance of data warehousing in data mining?

A

Data warehousing allows for the collection and storage of large datasets, facilitating analysis and mining.

Companies like Google and Facebook utilize vast amounts of data from their platforms.

12
Q

What role does data cleaning play in the KDD process?

A

To remove noise and inconsistent data from the dataset.

Cleaning data is crucial for enhancing the quality of analysis.
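
A minimal sketch of what cleaning can look like, assuming made-up customer records: noisy rows (impossible ages) are dropped and inconsistent formatting (country codes) is normalized. The field names and the valid age range are assumptions for illustration.

```python
# Hypothetical raw records; field names and valid ranges are assumptions.
raw = [
    {"name": "Ann", "age": 34, "country": "us"},
    {"name": "Bob", "age": -5, "country": "US"},     # noise: impossible age
    {"name": "Cy", "age": 41, "country": " Us "},    # inconsistent formatting
    {"name": "Dee", "age": 230, "country": "CA"},    # noise: implausible age
]

def clean(records, min_age=0, max_age=120):
    """Drop records with out-of-range ages and normalize country codes."""
    cleaned = []
    for record in records:
        if not (min_age <= record["age"] <= max_age):
            continue  # drop noisy records
        # Copy the record with a trimmed, upper-cased country code.
        cleaned.append(dict(record, country=record["country"].strip().upper()))
    return cleaned

cleaned = clean(raw)
print([(r["name"], r["country"]) for r in cleaned])  # [('Ann', 'US'), ('Cy', 'US')]
```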

13
Q

True or False: Data mining is the central part of a process called Knowledge Discovery.

A

True

Data mining is the step of the KDD process in which patterns are actually extracted; the other steps prepare the data and evaluate and present the results.

14
Q

What is the purpose of regression in data mining?

A

To predict the value of a continuous variable based on other variables.

Examples include predicting sales based on advertising expenditure.
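
The sales example can be sketched with ordinary least squares on a single predictor. The numbers below are invented and deliberately lie on a straight line so the result is easy to check; linear regression is only one of several regression techniques.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: advertising spend and resulting sales (arbitrary units).
advertising = [1.0, 2.0, 3.0, 4.0]
sales = [5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(advertising, sales)
predicted = intercept + slope * 5.0  # predicted sales for a spend of 5.0
print(slope, intercept, predicted)  # 2.0 3.0 13.0
```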

15
Q

What is the application of clustering in targeted marketing?

A

Customer profiling: grouping customers with similar characteristics so that each group can be targeted separately.

Clustering helps identify customer segments for more effective marketing strategies.

16
Q

What is the goal of churn prediction in customer analysis?

A

To predict whether a customer is likely to be lost to a competitor.

This involves analyzing customer transaction records.

17
Q

Fill in the blank: Data mining techniques draw ideas from _______.

A

[machine learning, AI, pattern recognition, statistics, database systems]

These disciplines contribute to the methods and strategies used in data mining.