Chapter 11 Flashcards

(11 cards)

1
Q

What is AI?

A

A term for computer systems that demonstrate human-like intelligence and cognitive abilities.

2
Q

What is machine learning?

A

An application of AI that uses self-learning algorithms, allowing computers to learn automatically without human intervention or assistance.

3
Q

What are some characteristics of machine learning?

A
  • uses self-learning algorithms designed to evaluate results and improve performance over time
  • can uncover hidden patterns and relationships in data
4
Q

What is Data Mining?

A

The process of applying a set of analytical techniques necessary for the development of machine learning and AI.

5
Q

What is the goal of data mining?

A

To uncover hidden patterns and relationships in data, giving us insight and relevant information to support decision making.

Data mining is used for data segmentation, pattern recognition, classification and prediction, and it requires a systematic approach.

6
Q

What is the main Data Mining Process?

A

CRISP-DM (Cross-Industry Standard Process for Data Mining)

It emphasises business goals and objectives before the data is prepared and analysis techniques are chosen.

7
Q

What are the 6 major phases of CRISP-DM?

A
  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation
  6. Deployment
8
Q

Into which two categories can data mining algorithms be classified?

A

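Supervised data mining (a known target variable guides the model) and unsupervised data mining (no target variable; the algorithm looks for structure in the data).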
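
To make the contrast concrete, here is a minimal Python sketch. The two techniques chosen (a nearest-neighbour classifier for the supervised case and a two-cluster k-means for the unsupervised case) are illustrative assumptions, not methods named on this card, and the function names and data are hypothetical.

```python
def nearest_neighbour_predict(train, query):
    """Supervised: `train` holds (value, label) pairs, so known labels
    guide the prediction. Returns the label of the closest training value."""
    _, label = min(train, key=lambda pair: abs(pair[0] - query))
    return label


def two_means(values, iterations=10):
    """Unsupervised: no labels are given. The values are split into two
    clusters around two centres (assumes at least two distinct values)."""
    c1, c2 = min(values), max(values)  # initial centres: the two extremes
    for _ in range(iterations):
        g1 = [v for v in values if abs(v - c1) <= abs(v - c2)]
        g2 = [v for v in values if abs(v - c1) > abs(v - c2)]
        c1 = sum(g1) / len(g1)  # move each centre to its cluster mean
        c2 = sum(g2) / len(g2)
    return g1, g2


# Hypothetical labelled data for the supervised case
train = [(1.0, "low"), (1.2, "low"), (5.0, "high"), (5.3, "high")]
print(nearest_neighbour_predict(train, 4.7))  # high

# Hypothetical unlabelled data for the unsupervised case
print(two_means([1.0, 1.2, 0.8, 5.0, 5.3, 4.9]))
# ([1.0, 1.2, 0.8], [5.0, 5.3, 4.9])
```

The supervised function cannot run without labels, whereas the unsupervised one finds the two groups from the values alone.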

9
Q

What are Similarity Measures?

A

Measures that gauge whether observations are similar or dissimilar to one another, based on the distance between pairs of observations across their variables.

  • small distance = high similarity
  • large distance = low similarity
10
Q

What are the similarity measures for numerical variables?

A
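  • Euclidean distance
  • Manhattan distance
  • standardisation and normalisation (rescaling applied before distances are computed, so that variables on larger scales do not dominate)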
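
The items in this list can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names and sample values are hypothetical, standardisation is assumed to mean z-scores using the sample standard deviation, and normalisation is assumed to mean min-max scaling.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def standardise(values):
    """z-scores: subtract the mean, divide by the sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def normalise(values):
    """Min-max scaling onto the range [0, 1] (assumes max > min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical observations with two numerical variables each
a, b = (1, 2), (4, 6)
print(euclidean(a, b))           # 5.0
print(manhattan(a, b))           # 7
print(standardise([1, 2, 3]))    # [-1.0, 0.0, 1.0]
print(normalise([25, 28, 40]))   # [0.0, 0.2, 1.0]
```

Rescaling each variable first matters because a variable measured in thousands would otherwise swamp one measured in single digits in either distance.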
11
Q

What are the similarity measures for categorical variables?

A
  • matching coefficient
  • Jaccard's coefficient
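
Both coefficients can be sketched for binary (0/1) variables as follows. The function names and sample data are hypothetical, and returning 0.0 when neither observation has any 1s is an assumed convention, since the Jaccard ratio is undefined in that case.

```python
def matching_coefficient(a, b):
    """Share of variables on which the two observations agree (1-1 or 0-0)."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def jaccard_coefficient(a, b):
    """Share of 1-1 matches among variables where at least one value is 1.
    Unlike the matching coefficient, 0-0 agreements are ignored."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0  # assumed convention for all-zero input


# Hypothetical observations: five yes/no variables each
a = [1, 0, 0, 1, 0]
b = [1, 0, 0, 0, 0]
print(matching_coefficient(a, b))  # 0.8 (4 of 5 variables agree)
print(jaccard_coefficient(a, b))   # 0.5 (1 shared "yes" out of 2 with any "yes")
```

The gap between the two results shows why Jaccard's coefficient is often preferred when a shared absence (0-0) carries little information.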