Chapter 11 Flashcards
(11 cards)
What is AI?
Term used to describe computer systems that demonstrate human-like intelligence and cognitive abilities
What is machine learning?
Techniques that integrate self-learning algorithms. An application of AI that allows computers to learn automatically without human intervention or assistance
What are some characteristics of machine learning?
- use self-learning algorithms designed to evaluate results and improve performance over time
- can uncover hidden patterns and relationships in data
What is Data Mining?
Process of applying a set of analytical techniques necessary for the development of machine learning and AI
What is the goal of data mining?
To uncover hidden patterns and relationships in data which allows us to gain insight and deliver relevant information to help make decisions.
Used for data segmentation, pattern recognition, classification and prediction; requires a systematic approach
What is the main Data Mining Process?
CRISP-DM (Cross-Industry Standard Process for Data Mining)
Emphasis on business goals and objectives prior to preparing the data and choosing analysis techniques
What are the 6 major phases of CRISP-DM?
- Business understanding
- Data understanding
- Data preparation
- Modeling
- Evaluation
- Deployment
Which 2 techniques can data mining algorithms be classified into?
Supervised and Unsupervised data mining
What are Similarity Measures?
Measures of how similar or dissimilar a group of observations are to one another, based on the distance between pairs of observations across the variables
- small distance = high similarity
- large distance = low similarity
What are the similarity measures for numerical variables?
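- Euclidean distance
- Manhattan distance
- standardisation and normalisation (applied to the variables before computing distances so that differences in scale do not dominate the result)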
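Worked example (a minimal sketch, not taken from the chapter): the two distances and the two rescaling steps written as plain Python. The function names are illustrative, and the z-score version assumes the sample standard deviation (dividing by n - 1); some texts divide by n instead.

```python
import math

def euclidean(a, b):
    # Straight-line distance: square root of the summed squared differences.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    # City-block distance: sum of the absolute differences.
    return sum(abs(x - y) for x, y in zip(a, b))

def z_scores(values):
    # Standardisation: subtract the mean, divide by the sample
    # standard deviation (n - 1 in the denominator, an assumption here).
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def min_max(values):
    # Normalisation: rescale so the smallest value is 0 and the largest is 1.
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(euclidean((0, 0), (3, 4)))   # 5.0
print(manhattan((0, 0), (3, 4)))   # 7
print(z_scores([1, 2, 3]))         # [-1.0, 0.0, 1.0]
print(min_max([10, 20, 30]))       # [0.0, 0.5, 1.0]
```

Both rescaling functions assume the values are not all identical; otherwise the divisor would be zero.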
What are the similarity measures for categorical variables?
- matching coefficient
- Jaccard's coefficient
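Worked example (a minimal sketch, not taken from the chapter): both coefficients for two observations described by binary (0/1) variables. It assumes the common definitions: the matching coefficient counts every agreement, while Jaccard's coefficient ignores variables where both observations are 0. The function names are illustrative.

```python
def matching_coefficient(a, b):
    # Share of variables on which the two observations agree (1-1 or 0-0).
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def jaccard_coefficient(a, b):
    # Share of 1-1 agreements among variables where at least one value is 1.
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    considered = len(a) - neither
    # If every variable is 0-0 there is nothing to compare; return 0.0 by convention here.
    return both / considered if considered else 0.0

a = [1, 0, 1, 0]
b = [1, 0, 0, 0]
print(matching_coefficient(a, b))  # 0.75
print(jaccard_coefficient(a, b))   # 0.5
```

The two values differ because the shared 0s count as agreement for the matching coefficient but are dropped by Jaccard's coefficient.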