ML Overview Flashcards
(18 cards)
What are the 6 stages of the ML pipeline?
- Define the problem
- Data collection
- Data preprocessing
- Data modelling / machine learning
- Model evaluation
- Model application (on new/unseen data)
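The six stages above can be sketched end to end on a toy problem. This is a minimal illustration in plain Python, not a recommended workflow: the data (hours studied vs. exam score), the helper name fit_line, and the choice of a straight-line model are all assumptions made for the example.

```python
# 1. Define the problem: predict an exam score from hours studied.

# 2. Data collection: hypothetical toy records (hours, score); None = missing.
raw = [(1, 52), (2, 54), (3, None), (4, 58), (5, 60), (6, 62)]

# 3. Data preprocessing: drop records with a missing score.
clean = [(x, y) for x, y in raw if y is not None]

# Hold out the last record so evaluation uses data the model has not seen.
train, test = clean[:-1], clean[-1:]


# 4. Modelling: fit y = a*x + b by ordinary least squares.
def fit_line(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


a, b = fit_line(train)

# 5. Model evaluation: mean squared error on the held-out record.
mse = sum((a * x + b - y) ** 2 for x, y in test) / len(test)

# 6. Model application: predict for a new, unseen input.
prediction = a * 7 + b
```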
Define ML
Given input data and the corresponding answers, the computer outputs rules that can be applied to new situations.
What are the 4 types of data sets?
- Record
- Graph and network
- Ordered
- Spatial, image, multimedia
What are data objects?
- Make up datasets
- Represent the entity being measured
- Are a row in a database
- Aka entities, instances, points, samples, tuples, patterns, vectors, examples
What are data attributes?
- Describe the data objects
- Are the columns in the database
- Aka features, variables, dimensions, predictors
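The row/column picture of data objects and attributes can be illustrated with a tiny record dataset. The field names and values below are made up for the example.

```python
# Each dict is one data object (a row); each key is an attribute (a column).
dataset = [
    {"name": "Ann", "age": 34, "smoker": False},
    {"name": "Bob", "age": 51, "smoker": True},
]

# The attributes are the keys shared by every data object.
attributes = list(dataset[0].keys())

# Reading one attribute across all objects gives a single column.
ages = [row["age"] for row in dataset]
```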
What are the five types of data attributes?
- Nominal - categories
- Binary - 0,1
- Ordinal - meaningful order, but the difference between successive values is not necessarily meaningful
- Interval scaled - equal-sized units, ordered, no true zero
- Ratio scaled - as per interval but with a true zero
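The difference between interval-scaled and ratio-scaled attributes can be seen with temperature, used here as an assumed example: Celsius has no true zero (interval), while Kelvin does (ratio). Differences agree on both scales, but only the Kelvin ratio is physically meaningful.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15


c1, c2 = 10.0, 20.0

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = c2 - c1
diff_k = celsius_to_kelvin(c2) - celsius_to_kelvin(c1)

# Ratios are not: "20 C is twice 10 C" disappears once a true zero is used.
ratio_c = c2 / c1
ratio_k = celsius_to_kelvin(c2) / celsius_to_kelvin(c1)
```

Here ratio_c is 2.0 while ratio_k is only about 1.035, so "twice as hot" is an artefact of where the Celsius zero was placed.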
3 major categories of machine learning
Supervised
Unsupervised
Reinforcement
Define supervised ML
Learning with a labelled training dataset
Define unsupervised ML
Learning patterns in unlabelled data
Define reinforcement ML
Learning based on feedback and reward
2 types of supervised ML
Classification
Regression
2 types of unsupervised ML
Clustering
Anomaly detection
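Anomaly detection on unlabelled data can be sketched with a simple z-score rule. This is one of many possible approaches; the readings and the threshold of 2 standard deviations are assumptions chosen for the example.

```python
import statistics

# Hypothetical unlabelled readings; no one has marked which are unusual.
values = [10, 11, 9, 10, 12, 10, 50]

mean = statistics.fmean(values)
std = statistics.pstdev(values)  # population standard deviation

# Assumed rule: flag a value whose distance from the mean exceeds 2 std devs.
THRESHOLD = 2.0
anomalies = [v for v in values if abs(v - mean) / std > THRESHOLD]
```

No labels are used at any point: the rule relies only on how far each value sits from the rest of the data.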
2 types of reinforcement learning
Game play
Control
3 regression methods
Linear regression, polynomial regression
Ridge regression, LASSO, elastic net
Artificial neural networks
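To show how ridge regression differs from plain linear regression, here is a minimal sketch for the simplest case, assuming one feature and no intercept. Minimising sum((y - w*x)^2) + lam * w^2 gives w = sum(x*y) / (sum(x^2) + lam), so the penalty lam shrinks the weight towards zero. The data and the value of lam are made up for the example.

```python
# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x*y
sxx = sum(x * x for x in xs)              # sum of x^2


def ridge_weight(lam):
    """Closed-form weight for one-feature, no-intercept ridge regression."""
    return sxy / (sxx + lam)


w_ols = ridge_weight(0.0)     # lam = 0 reduces to ordinary least squares
w_ridge = ridge_weight(14.0)  # assumed penalty strength; shrinks the weight
```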
4 classification methods
Logistic regression
K nearest neighbours
Support vector machines
Decision trees
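Of the classification methods above, k nearest neighbours is the easiest to sketch from scratch. The version below is a minimal illustration assuming Euclidean distance and majority voting; the function name knn_predict and the training points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled training data: (features, label).
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

label_near_a = knn_predict(train, (1.1, 1.0))
label_near_b = knn_predict(train, (5.1, 5.0))
```

There is no training step beyond storing the labelled points, so the cost of the method falls at prediction time, which is relevant to the production performance requirement on the final card.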
3 things that determine which ML approach to take
Data
Computing requirements
Interpretability
What 3 aspects of the data determine which ML approach to take?
- Size of data (more model parameters need bigger datasets)
- Feature types
- Linear or non-linear relationships
What 2 computing requirements determine which ML approach to take?
- Training period available
- Production performance requirements