lecture 5 Flashcards

Question 1

Q

What is predictive analytics?

Answer

A

Predictive analytics is the process of extracting information from large data sets in order to determine trends and patterns that can be used to generate models and predict behaviors of interest.

Question 2

Q

Prescriptive analytics

Answer

A

Aims at suggesting (prescribing) the best decision options in order to take advantage of the predicted future utilizing large amounts of data (Šikšnys & Pedersen, 2016).

Incorporates the predictive analytics output and utilizes artificial intelligence, optimization algorithms and expert systems in a probabilistic context in order to provide adaptive, automated, constrained, time-dependent and optimal decisions.

Question 3

Q

Relation between Predictive and prescriptive (predictive-prescriptive split)

Answer

A

There is considerable overlap between the two areas.

Difference:
Prescriptive analytics depends on predictive analytics. In this course they are treated as two separate steps.

The Venn diagram on the slide shows that machine learning / data mining is mainly predictive analytics but also extends into the prescriptive part.
Probabilistic models sit halfway between the two.

Predictive analytics:
* statistical analysis

Prescriptive analytics:
* mathematical programming
* simulation
* logic-based models
* evolutionary computation

Question 4

Q

What is AI?

Answer

A

No consensus on a single definition

Thinking Humanly:
Cognitive science/Cognitive modelling

Acting Humanly: Turing test

Thinking Rationally: Logic-based/Deductive Intelligence

Acting Rationally: Rational (trying to achieve the best solution) agents

Is it more about actual intelligence or perceived intelligence?

slide 11

Question 5

Q

Chinese room argument

Answer

A

Is it more about actual intelligence or perceived intelligence?
Does an AI actually understand, or does it simply execute an algorithm/set of rules with (super)human capacities?

Question 6

Q

Levels of AI

Answer

A

narrow AI
general AI
super AI

Question 7

Q

What is narrow AI?

Answer

A

Dedicated to assist with or take over specific tasks

Question 8

Q

General AI

Answer

A

takes knowledge from one domain, transfers to other domains

Question 9

Q

Super AI

Answer

A

machines that are an order of magnitude smarter than humans

Question 10

Q

differences between AI, machine learning, and deep learning

Answer

A

AI: computing systems capable of performing tasks that humans are very good at, for example recognizing objects.

ML: the field of AI that applies statistical methods to enable computer systems to learn from data towards an end goal.

Deep learning: neural networks with several hidden layers.

Question 11

Q

Machine learning definition

Answer

A

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

Question 12

Q

When to use:
* classical ML
* Reinforcement learning
* ensembles
* neural networks and deep learning

Answer

A

classical ML
* simple data and clear features
Reinforcement learning
* no data, but we have an environment to interact with
ensembles
* when quality is a real problem
neural networks and deep learning
* complicated data, unclear features, belief in a miracle

Question 13

Q

Data requirements for Machine learning (taxonomy of machine learning)

Answer

A

Supervised
unsupervised
semi-supervised
reinforcement

Question 14

Q

Supervised learning

Answer

A

With supervised learning, you feed the known outputs (labels) into the system together with the inputs, for instance pictures of cats and dogs, each tagged with the correct answer, to train the model. This means that in supervised learning the machine already knows the correct output for each training example before it starts learning. A basic example of this concept would be a student learning a course from an instructor: the student knows what he/she is learning from the course.

With the output known, all the system needs to do is work out the steps or process needed to get from the input to the output. The algorithm is taught through a training data set that guides the machine.

The type of target variable is either:
* continuous, which results in regression analysis
* categorical, which results in classification.
Examples of categories formed through classification include demographic data such as marital status, sex, or age group.

Even more information if needed

Supervised learning uses a training set to teach models to yield the desired output. This training dataset includes inputs and correct outputs, which allow the model to learn over time. The algorithm measures its accuracy through the loss function, adjusting until the error has been sufficiently minimized.
Uses labeled data.
examples:
* Image- and object-recognition: Supervised learning algorithms can be used to locate, isolate, and categorize objects out of videos or images, making them useful when applied to various computer vision techniques and imagery analysis.
* Predictive analytics
* Spam detection: Spam detection is another example of a supervised learning model. Using supervised classification algorithms, organizations can train databases to recognize patterns or anomalies in new data to organize spam and non-spam-related correspondences effectively

challenges of supervised learning
* Supervised learning models can require certain levels of expertise to structure accurately.
* Training supervised learning models can be very time intensive.
* Datasets can have a higher likelihood of human error, resulting in algorithms learning incorrectly.
* Unlike unsupervised learning models, supervised learning cannot cluster or classify data on its own.

IBM

Question 15

Q

Difference between supervised vs. unsupervised learning vs. semi-supervised learning

Answer

A

Unlike supervised learning, unsupervised learning uses unlabeled data. From that data, it discovers patterns that help solve for clustering or association problems. This is particularly useful when subject matter experts are unsure of common properties within a data set. Common clustering algorithms are hierarchical, k-means, and Gaussian mixture models.

Semi-supervised learning occurs when only part of the given input data has been labeled. Unsupervised and semi-supervised learning can be more appealing alternatives as it can be time-consuming and costly to rely on domain expertise to label data appropriately for supervised learning

Question 16

Q

Unsupervised learning

Answer

A

Does not use labels
The output is unknown
Far less used than supervised learning
Often seen as the future of ML and its possibilities
The idea of machines and computers developing the ability to “teach themselves” alludes to unsupervised learning.
No access to labeled datasets
Outcomes of problems are largely unknown
No reference (ground-truth) data at all

Question 17

Q

Is skippable

example to show difference between supervised and unsupervised learning

Answer

A

Consider a digital image that contains a variety of colored geometric shapes. These shapes need to be sorted into groups according to color and other classification features. For a system that follows supervised learning, this process is fairly simple.

The procedure is extremely straightforward, as you just have to teach the computer all the details pertaining to the figures. You can let the system know that all shapes with four sides are known as squares, and others with eight sides are known as octagons, etc. We can also teach the system to interpret the colors and see how the light being given out is classified.

However, in unsupervised learning, the whole process becomes a little trickier. The algorithm for an unsupervised learning system has the same input data as the one for its supervised counterpart (in our case, digital images showing shapes in different colors).

Once it has the input data, the system learns all it can from the information at hand. In fact, the system works by itself to recognize the problem of classification and also the difference in shapes and colors. With information related to the problem at hand, the unsupervised learning system will then recognize all similar objects, and group them together. The labels that it will give to these objects will be designed by the machine itself. Technically, there are bound to be wrong answers, since there is a certain degree of probability. However, just like how we humans work, the strength of machine learning lies in its ability to recognize mistakes, learn from them, and to eventually make better estimations next time around.

Question 18

Q

Reinforcement learning

Answer

A

Reinforcement Learning spins off from the concept of Unsupervised Learning and gives software agents and machines a high degree of control in determining what the ideal behavior within a context is. The aim is to maximize the performance of the machine in a way that helps it improve. Simple feedback that informs the machine about its progress is required to help it learn its behavior.

An agent decides the best action based on the current state of the results
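
The feedback loop on this card can be sketched with a two-armed bandit, which is a deliberately tiny stand-in for a full environment. Everything here is an illustrative assumption: the function name `run_bandit`, the reward probabilities, and the epsilon-greedy rule are not from the lecture.

```python
import random

def run_bandit(reward_probabilities, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: learns action values purely from reward feedback."""
    rng = random.Random(seed)
    n_actions = len(reward_probabilities)
    estimates = [0.0] * n_actions  # current value estimate per action
    counts = [0] * n_actions       # how often each action was taken
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)  # explore: random action
        else:
            # exploit: action with the highest estimated value so far
            action = max(range(n_actions), key=lambda a: estimates[a])
        # The environment returns a reward of 1 or 0 (hypothetical probabilities).
        reward = 1.0 if rng.random() < reward_probabilities[action] else 0.0
        counts[action] += 1
        # Incremental average: move the estimate towards the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts

estimates, counts = run_bandit([0.2, 0.8])
print(estimates, counts)
```

No labeled examples are provided at any point; the agent only sees the reward for the action it chose and should end up preferring the second action.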

Question 19

Q

Reinforcement learning vs. supervised learning and unsupervised learning

Answer

A

Reinforcement vs supervised learning
In Supervised Learning we have an external supervisor who has sufficient knowledge of the environment and shares that knowledge with the agent so that it can form a better understanding and complete the task. However, in problems where the agent can perform many different kinds of subtasks by itself to achieve the overall objective, the presence of a supervisor is unnecessary and impractical. In Reinforcement Learning there is instead a reward function that lets the system know whether it is progressing down the right path.

Reinforcement vs unsupervised learning
Reinforcement Learning has a mapping structure that guides the machine from input to output, whereas Unsupervised Learning has no such feature. In Unsupervised Learning, the machine focuses on the underlying task of locating patterns rather than on a mapping that progresses towards an end goal.

For example, if the task for the machine is to suggest a good news update to a user, a Reinforcement Learning algorithm will look to get regular feedback from the user in question, and would then through the feedback build a reputable knowledge graph of all news related articles that the person may like. On the contrary, an Unsupervised Learning algorithm will try looking at many other articles that the person has read, similar to this one, and suggest something that matches the user’s preferences.

https://crayondata.ai/machine-learning-explained-understanding-supervised-unsupervised-and-reinforcement-learning/

Question 20

Q

Math representation (Taxonomy of Machine Learning)

Answer

A

Divided into model-based and instance-based.

Instance-based: the machine learning technique simply compares new instances to the ones it was trained on.
So new data is compared to the training data and classified based on it.

Model-based: tries to find a general representation of the relationships in the dataset.
The algorithm chooses a hypothesis, i.e. a mathematical representation. Then it determines the parameters of this hypothesis based on the available data. The result is used to make estimations on new data.

https://hermit-notebook.site/en/notebook/computer-sciences/artificial-intelligence/machine-learning/taxonomy-of-machine-learning/

Question 21

Q

Classification by Training behaviour (Taxonomy of Machine Learning)

Answer

A

ML techniques generally do not keep a memory of the entire dataset they were trained on; instead, iterative adjustments are based on the data they are provided with. Many learning techniques cannot adjust an already trained representation to new data while keeping it consistent with its previous training (because there is no memory of the previous data).

Batch learning: Learning techniques that require the entire data set for their training.
All the examples must be provided during the training phase. The “predictor” resulting from the training is then used in production and no more learning occurs. In this setting, if we obtain new examples, we need to train a new model from scratch on the complete enriched data set.

Online learning: This kind of learning algorithm can adjust an already trained representation to new data. Unlike batch learning, an online learning technique can be provided with new training examples progressively and changes its representation accordingly, even while being used in production. For many underlying representations, true online learning is not possible. However, depending on the formulation, we can often find a pseudo online algorithm based on recursive algorithms. In this case, the new predictor depends on the current best predictor and all the previous examples (already learnt).
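
A minimal sketch of the batch/online contrast, assuming the simplest possible predictor (the mean of the targets seen so far). The class `OnlineMean` and the function `batch_mean` are hypothetical names chosen for illustration.

```python
class OnlineMean:
    """Predicts the mean of all targets seen so far, updated one example at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, y):
        self.count += 1
        # Recursive update: the new predictor depends only on the current
        # predictor and the new example; earlier examples are not stored.
        self.mean += (y - self.mean) / self.count


def batch_mean(ys):
    """Batch version: needs the complete data set at once."""
    return sum(ys) / len(ys)


stream = [4.0, 8.0, 6.0, 2.0]
model = OnlineMean()
for y in stream:
    model.update(y)  # could happen while the model is already in production

print(model.mean, batch_mean(stream))
```

Both approaches arrive at the same predictor here, but the online version never has to revisit old examples, whereas the batch version must be recomputed from scratch when new data arrives.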

https://hermit-notebook.site/en/notebook/computer-sciences/artificial-intelligence/machine-learning/taxonomy-of-machine-learning/

Question 22

Q

Classification by Task Type (Machine learning taxonomy by usage or goal)

Answer

A

Regression
Classification
Clustering
Association Rule learning
Decision making
Blind source separation
Dimensionality reduction

Question 23

Q

Regression

Answer

A

𝑌 = 𝑓 (𝑋)

The values of 𝑌 are determined by a human
𝑌 ∈ ℝ is a continuous variable

𝑓 is learned from the data through ML

Regression tries to find the value of a property of a phenomenon depending on the values of other properties or instances of the same kind.

Regression typically falls under supervised learning.

For example, suppose an ice cream seller wants to predict their income based on temperature forecasts. If we were to build software for this requirement, we would be learning the (model and) parameters of a regression.
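
The ice cream example can be sketched as a one-variable least-squares fit. The data values and the helper name `fit_line` are made up for illustration; the fitting formula is ordinary least squares for a single feature.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical training data: temperature (X) and observed income (Y).
temperatures = [15.0, 20.0, 25.0, 30.0]
incomes = [200.0, 300.0, 400.0, 500.0]

intercept, slope = fit_line(temperatures, incomes)  # f is learned from data
forecast = intercept + slope * 28.0                 # prediction for 28 degrees
print(intercept, slope, forecast)
```

Here 𝑌 (income) is continuous and was recorded by a human, while 𝑓 (the intercept and slope) is what the learning step estimates.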

Question 24

Q

Classification

Answer

A

𝑌 = 𝑓(𝑋)
The values of 𝑌 are determined by a human

𝑌 ∈ {𝐶₁, …, 𝐶ₙ} is a discrete variable
𝐶₁ = Triangle
𝐶₂ = Circle

𝑓 is learned from the data through ML

Classification tries to find boundaries in the dataset so as to separate the elements into a number of classes known (or defined) before the training.

Classification typically falls under supervised learning. However, unsupervised variants exist, such as anomaly detection or outlier detection.

Question 25

Q

clustering

Answer

A

The real values of 𝑌 are unknown

The ML algorithm tries to identify existing patterns in the data (without prior supervision)

Clustering tries to group observations such that elements belonging to the same group (or cluster) are more similar, according to some similarity measure, and those belonging to different groups are more dissimilar. Clustering is typically an unsupervised learning task.

Question 26

Q

Baseline vs. State-of-the-art-model

Answer

A

Baseline/Benchmark
* Simple model
* Easy/quick to fit
* Reference point for performance analysis

State-of-the-art model
* Usually very complex model
* Costly/optimized fit
* Best possible performances

Question 27

Q

Supervised machine learning for regression

Answer

A

Linear Regression

Artificial Neural Networks

Deep Artificial Neural Networks

Support Vector Regression (SVR)

K-Nearest Neighbours (k-NN)

Question 28

Q

Linear Regression

Answer

A

Linear Regression

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation:
Model-based : 𝑌 = 𝛽𝑋 + ε

Task: Regression
For Classification, the equivalent model is Logistic Regression

Question 29

Q

Artificial Neural Networks

Answer

A

Dataset requirement :
Supervised (ANN, RNN, CNN, GAN)
Unsupervised (Autoencoders)

Data provisioning: Batch/Online

Model representation: Model-based
Task: Regression/Classification
Ensemble model

Question 30

Q

Deep Artificial Neural Networks

Answer

A

Dataset requirement :
Supervised (ANN, RNN, CNN, GAN)
Unsupervised (Autoencoders)

Data provisioning: Batch/Online

Model representation: Model-based

Task: Regression/Classification

Question 31

Q

Support Vector Regression (SVR)

Answer

A

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation: Model-based : 𝑌 = 𝐾(𝛽𝑋) + ε

Task: Regression
For Classification, the equivalent model is Support Vector Machines (SVM)

Question 32

Q

K-Nearest Neighbours (k-NN)

Answer

A

Dataset requirement:
Supervised

Data provisioning:
Batch/Online

Model representation:
Instance-based

Task:
Classification/Regression
Regression -> Mean
Classification -> Majority vote
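
A minimal sketch of the two prediction rules on this card (mean for regression, majority vote for classification), using one numeric feature and absolute distance. The data and function names are illustrative assumptions.

```python
from collections import Counter

def k_nearest(train, query, k):
    """Return the k training pairs (x, y) whose x is closest to the query."""
    return sorted(train, key=lambda pair: abs(pair[0] - query))[:k]

def knn_regress(train, query, k=3):
    """Regression: average the targets of the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return sum(y for _, y in neighbours) / len(neighbours)

def knn_classify(train, query, k=3):
    """Classification: majority vote among the k nearest neighbours."""
    neighbours = k_nearest(train, query, k)
    return Counter(y for _, y in neighbours).most_common(1)[0][0]

# Hypothetical training sets; the "model" is simply the stored examples.
regression_data = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (10.0, 100.0)]
classification_data = [(1.0, "cat"), (2.0, "cat"), (3.0, "dog"), (10.0, "dog")]

predicted_value = knn_regress(regression_data, 2.2)
predicted_label = knn_classify(classification_data, 2.2)
print(predicted_value, predicted_label)
```

No parameters are fitted: every prediction compares the new instance directly with the stored training examples.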

Question 33

Q

Supervised Machine Learning for classification

Answer

A

Naïve Bayes
Logistic Regression
Support Vector Machines (SVM)
Decision Tree
Random forest
Artificial Neural Networks

Question 34

Q

Naïve Bayes

Answer

A

Dataset requirement :
Supervised

Data provisioning:
Batch

Model representation:
Model-based

Task: Classification

Question 35

Q

Logistic Regression

Answer

A

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation:
Model-based : 𝑃(𝑌 = 1 | 𝑋) = 1 / (1 + e^(−𝛽𝑋))

Task: Classification
For Regression, the equivalent model is Linear Regression
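
As a sketch of how logistic regression turns a linear score into a class, the snippet below passes the score through the logistic (sigmoid) function and applies a threshold. The coefficients and the 0.5 threshold are hypothetical values, not fitted to any data.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, intercept, slope):
    """Estimated probability of the positive class for a single feature x."""
    return sigmoid(intercept + slope * x)

def predict_class(x, intercept, slope, threshold=0.5):
    """Classify as 1 when the probability reaches the threshold, else 0."""
    return 1 if predict_probability(x, intercept, slope) >= threshold else 0

# Hypothetical coefficients; in practice they are learned from labeled data.
intercept, slope = -4.0, 2.0
for x in (1.0, 2.0, 3.0):
    print(x, predict_probability(x, intercept, slope), predict_class(x, intercept, slope))
```

The linear part is the same as in linear regression; the sigmoid is what makes the output usable as a class probability.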

Question 36

Q

Support Vector Machines (SVM)

Answer

A

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation: Model-based : 𝑌 = 𝐾(𝛽𝑋) + ε

Task: Classification

For Regression, the equivalent model is Support Vector Regression

Question 37

Q

Decision Tree

Answer

A

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation: Instance-based

Task: Regression/Classification
(Slide figure: regression vs. classification decision tree)

Question 38

Q

Random forest

Answer

A

Dataset requirement :
Supervised

Data provisioning: Batch

Model representation: Instance-based

Task: Regression/Classification
Ensemble model

Question 39

Q

Artificial Neural Networks

Answer

A

Dataset requirement :
Supervised (ANN, RNN, CNN, GAN)
Unsupervised (Autoencoders)

Data provisioning: Batch/Online

Model representation: Model-based

Task: Regression/Classification
Ensemble model

Question 40

Q

Unsupervised Machine Learning

Answer

A

K-Means Clustering
Hierarchical clustering
And many more…

many more:

Dimensionality Reduction
* PCA
* t-SNE
* Autoencoders

Clustering
* DBSCAN
* Self-organizing maps

Reinforcement Learning
* Q-Learning
* Deep Q-Learning
…

Question 41

Q

K-Means Clustering

Answer

A

Dataset requirement: Unsupervised

Data provisioning: Batch

Model representation: Instance-based

Task: Clustering/pattern recognition

N.B. : As clustering is unsupervised, multiple solutions can be found!
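
The note about multiple solutions can be illustrated with a small one-dimensional k-means sketch: the same data, started from two different sets of initial centroids, converges to two different clusterings. The data, the starting centroids, and the function name `kmeans_1d` are illustrative assumptions.

```python
def kmeans_1d(points, initial_centroids, iterations=20):
    """Plain k-means on 1-D data: assign to nearest centroid, then recompute means."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# Same data, two different (hypothetical) starting points.
centroids_a, clusters_a = kmeans_1d(points, [0.0, 1.0])
centroids_b, clusters_b = kmeans_1d(points, [20.0, 21.0])
print(clusters_a)
print(clusters_b)
```

Since there are no labels to say which grouping is right, both results are valid outcomes of the algorithm; in practice k-means is often run several times with different initializations.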

Question 42

Q

Hierarchical clustering

Answer

A

Dataset requirement: Unsupervised

Data provisioning: Batch

Model representation: Instance-based

Task: Clustering/pattern recognition

Question 43

Q

Machine learning in practice - pipeline

Answer

A

raw data
* collection
* download
* scraping

Data preprocessing
* Data quality (cf. diagnostic)
* missing data
* categorical variables

Train-test split
* single validation
* cross validation

model fit
* fit on training data
* test on testing data

performance evaluation
* performance metric choice
* evaluation on validation data

Question 44

Q

Splitting data

Answer

A

Data is split for three different uses (example with decision trees):
* training set: trees of different depths are fit to the training data
* validation set: their performance is evaluated on the validation set (the lower the validation error the better)
* test set: a final estimate of model performance is computed on the test set
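
A minimal sketch of such a three-way split, using only the standard library. The 60/20/20 proportions, the fixed seed, and the function name are assumptions chosen for illustration.

```python
import random

def train_validation_test_split(rows, validation_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the rows and cut them into train, validation and test parts."""
    rows = list(rows)                   # copy so the caller's data is untouched
    random.Random(seed).shuffle(rows)   # fixed seed makes the split reproducible
    n_test = int(len(rows) * test_fraction)
    n_validation = int(len(rows) * validation_fraction)
    test = rows[:n_test]
    validation = rows[n_test:n_test + n_validation]
    train = rows[n_test + n_validation:]
    return train, validation, test

samples = list(range(10))  # stand-in for ten rows of a dataset
train, validation, test = train_validation_test_split(samples)
print(len(train), len(validation), len(test))
```

Candidate models are fit on `train`, compared on `validation`, and only the finally chosen model is scored once on `test`.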

Question 45

Q

Splitting data and vocabulary

Answer

A

Feature: With respect to a dataset, a feature represents an attribute and value combination. Color is an attribute. “Color is blue” is a feature (blue is one of the values color can have).
Target: the target variable, also known as the dependent variable, is the outcome we aim to predict or explain using our model. It is the variable that we want to estimate or classify based on the available data.
Sample: a row, i.e. one instance in a dataset, with a value for every feature (and thus every variable).
Training Set: A set of observations used to generate machine learning models.
Test Set: A set of observations used at the end of model training and validation to assess the predictive power of your model. How generalizable is your model to unseen data?

Question 46

Q

Categorical data preprocessing

Answer

A

ordinal
one-hot-encoding

Use One-Hot Encoding: When dealing with nominal categorical variables that lack any inherent order.

Use Ordinal Encoding: When you have categorical variables with a clear ordinal relationship and the order between categories holds valuable information.

Question 47

Q

One-hot-encoding

Answer

A

transforms categorical variables into a binary matrix where each category is represented as a column, and each instance is marked with a ‘1’ in the corresponding column and ‘0’ in all other columns. (so for instance, three values: red, green and yellow, then 3 columns, if it is red then a 1 in the red column a 0 in the others.)

advantages:
1. Preservation of Information: One-hot encoding preserves the uniqueness of each category. It ensures that the algorithm does not assume any ordinal relationship among the categories.
2. Lack of Bias: Since each category is represented independently, one-hot encoding prevents introducing unintended biases based on the order of categories.
3. Suitable for Most Algorithms: One-hot encoded data is widely accepted by various machine learning algorithms, such as decision trees, random forests, and neural networks

limitations:
1. Dimensionality: One-hot encoding can significantly increase the dimensionality of the dataset, especially when dealing with categorical variables with many unique categories. This can lead to the curse of dimensionality and negatively impact model performance.
2. Loss of Order Information: One-hot encoding discards any inherent order that might exist among categories, which can be crucial in some scenarios.
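
A small sketch of one-hot encoding for the red/green/yellow example, written in plain Python. The function name `one_hot_encode` and the alphabetical column order are illustrative choices.

```python
def one_hot_encode(values):
    """Return the column names and one binary row per input value."""
    categories = sorted(set(values))  # one column per distinct category
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return categories, rows

colors = ["red", "green", "yellow", "red"]
categories, encoded = one_hot_encode(colors)
print(categories)
for row in encoded:
    print(row)
```

Each row contains exactly one 1, and the number of columns grows with the number of distinct categories, which is the dimensionality limitation mentioned above.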

Question 48

Q

Ordinal Encoding

Answer

A

Ordinal encoding is a technique that assigns a unique integer value to each category based on their order or rank. It is suitable for categorical variables that exhibit a clear ordinal relationship, where one category is greater or lesser than another. (for instance: flight ticket, first, second or a third class)

advantages:
1. Efficiency in Dimensionality: Ordinal encoding does not inflate the dataset’s dimensionality like one-hot encoding does. It replaces categorical values with integers, saving space and computation time.
2. Retains Order Information: This technique preserves the ordinal information that exists among categories, allowing the algorithm to leverage this information if it is relevant to the problem.

limitations:
1. Assumption of Equal Steps: Ordinal encoding assumes equal intervals between categories, which might not always be the case in real-world scenarios.
2. Potential Misrepresentation: If the assigned integer values do not accurately reflect the ordinal relationships, the encoded data might mislead the algorithm.
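
A small sketch of ordinal encoding for the ticket-class example. The explicit category order and the function name are assumptions; the key point is that the order must be supplied by someone who knows the domain.

```python
def ordinal_encode(values, ordered_categories):
    """Map each category to its rank in the given order (lowest rank = 0)."""
    rank = {category: index for index, category in enumerate(ordered_categories)}
    return [rank[value] for value in values]

ticket_classes = ["second", "first", "third", "first"]
# Hypothetical ordering from lowest to highest class.
encoded = ordinal_encode(ticket_classes, ["third", "second", "first"])
print(encoded)
```

The result stays a single column, but the integers imply equal steps between classes, which is the "equal steps" limitation listed above.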

Question 49

Q

Missing data preprocessing

Answer

A

Case deletion
Missing data imputation
Approaches that take into account the data distribution

Question 50

Q

Missing data imputation

Answer

A

Generally, missing quantitative values are replaced with the mean or median; for categorical or qualitative data, the mode is used to impute the missing data.
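
A minimal sketch of mean, median and mode imputation using Python's `statistics` module, with `None` standing in for a missing value. The column names and data are invented for illustration.

```python
from statistics import mean, median, mode

def impute(values, strategy):
    """Replace None entries with a statistic computed from the observed values."""
    observed = [v for v in values if v is not None]
    fill = strategy(observed)
    return [fill if v is None else v for v in values]

ages = [20, None, 30, 40]                   # quantitative column
cities = ["Paris", "Rome", None, "Paris"]   # categorical column

ages_mean = impute(ages, mean)
ages_median = impute(ages, median)
cities_mode = impute(cities, mode)
print(ages_mean, ages_median, cities_mode)
```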

Question 51

Q

Case deletion

Answer

A

Listwise deletion: if a row contains missing values, delete the entire row. This causes some data loss; pairwise deletion can be used to limit it.

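Pairwise deletion (available-case analysis): instead of dropping every row that has any missing value, each computation uses all rows in which the variables it needs are present. For example, each entry of a correlation matrix is computed from the rows where both of its variables are observed. A related heuristic from the correlation matrix: if a feature with missing values is highly correlated with the target variable, impute it rather than delete it; if it is not highly correlated with the target, the entire column can be deleted.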
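
The sketch below contrasts listwise deletion with the available-case idea that pairwise deletion usually refers to in statistics: each computation uses every row in which the values it needs are present. That reading of "pairwise" is stated here as an assumption, and the data and function names are illustrative.

```python
def listwise_delete(rows):
    """Keep only rows without any missing (None) value."""
    return [row for row in rows if None not in row]

def available_case_mean(rows, column):
    """Mean of one column using every row where that column is present."""
    observed = [row[column] for row in rows if row[column] is not None]
    return sum(observed) / len(observed)

rows = [
    (1.0, 10.0),
    (2.0, None),   # second value missing
    (None, 30.0),  # first value missing
    (4.0, 40.0),
]

complete_rows = listwise_delete(rows)            # loses two of the four rows
mean_first_column = available_case_mean(rows, 0) # still uses three rows
print(complete_rows, mean_first_column)
```

Listwise deletion is simpler but discards more data; the available-case approach keeps more observations at the cost of different statistics being based on different subsets of rows.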

Question 52

Q

Precision

Answer

A

exactness of model

True positive / (true positive + false positive)

Question 53

Q

Accuracy

Answer

A

percentage correct predictions

(TP + TN) / (TP + FN + FP + TN)

Question 54

Q

Recall

Answer

A

Completeness of model

TP / (TP+FN)

Question 55

Q

F1 Score

Answer

A

Combines precision and recall

F1 = 2 × (precision × recall) / (precision + recall)

(so the fraction, then times 2)
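
As a worked example of precision, recall, accuracy and F1, the sketch below computes all four from the counts of a binary confusion matrix. The counts are invented for illustration, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the four metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                    # exactness
    recall = tp / (tp + fn)                       # completeness
    accuracy = (tp + tn) / (tp + fp + fn + tn)    # share of correct predictions
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts: 8 true positives, 2 false positives,
# 4 false negatives, 6 true negatives.
metrics = classification_metrics(tp=8, fp=2, fn=4, tn=6)
for name, value in metrics.items():
    print(name, round(value, 3))
```

With these counts the model is fairly exact (precision 0.8) but misses a third of the positives (recall about 0.667); F1 sits between the two.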
