LU1 - Recap Flashcards

(50 cards)

1
Q

What are the four types of Data Analytics?

A

Descriptive
Diagnostic
Predictive
Prescriptive

2
Q

Name 5 statistical techniques for Data Analysis

A

Linear regression
Classification
Resampling methods
Tree based methods
Unsupervised learning

3
Q

What is Linear Regression?

A

Linear regression is a technique for predicting a target (dependent) variable by finding the best linear relationship between it and one or more independent variables.
The best fit is the line for which the sum of the distances (usually squared) between the line and the actual observations at each data point is as small as possible.
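
As a minimal sketch of the idea, the snippet below fits a line to one feature with ordinary least squares, which minimises the sum of squared vertical distances. The function name `fit_line` and the salary figures are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data only: years of experience vs. salary (in thousands).
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]
slope, intercept = fit_line(years, salary)
predicted = slope * 6 + intercept  # prediction for 6 years of experience
```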

4
Q

What is classification?

A

Classification assigns specific categories to a collection of data so that more precise predictions and analysis can be made.

5
Q

Name two types of classification techniques

A

Logistic Regression
Discriminant Analysis

6
Q

What is Logistic Regression?

A

A regression analysis technique used when the dependent variable is binary. It is a predictive analysis used to describe data and to explain the relationship between one binary dependent variable and one or more independent variables (nominal, ordinal, interval or ratio-level).
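
A small sketch of how a fitted logistic regression turns an input into a binary prediction: a linear score is passed through the sigmoid function and then thresholded. The `weight` and `bias` values here are hypothetical and hand-picked; a real model would learn them from data.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weight, bias, threshold=0.5):
    """Return (probability of class 1, predicted class) for one input value."""
    probability = sigmoid(weight * x + bias)
    return probability, int(probability >= threshold)

# Hypothetical coefficients, chosen only to illustrate the calculation.
probability, label = predict(x=3.0, weight=1.2, bias=-3.0)
```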

7
Q

Name two resampling techniques

A

Bootstrapping
Cross-Validation

8
Q

What is bootstrapping?

A

It works by sampling with replacement from the original data and treating the data points that were not selected as test samples.
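
The following sketch shows one bootstrap draw under simple assumptions: the data is a plain list, and `bootstrap_split` is an illustrative helper name, not a library function.

```python
import random

def bootstrap_split(data, rng):
    """Draw len(data) items with replacement; unselected items form the test set."""
    n = len(data)
    chosen = [rng.randrange(n) for _ in range(n)]  # indices may repeat
    chosen_set = set(chosen)
    sample = [data[i] for i in chosen]
    out_of_bag = [data[i] for i in range(n) if i not in chosen_set]
    return sample, out_of_bag

data = list(range(10))
sample, out_of_bag = bootstrap_split(data, random.Random(0))
```

The training sample has the same size as the original data but usually contains duplicates, which is why some points are left over for testing.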

9
Q

What is Cross-Validation?

A

This technique is used to validate model performance. The training data is divided into K parts; K-1 parts are used for training and the remaining part acts as the test set.
The process is repeated K times, each time holding out a different part, and the average of the K scores is taken as the performance estimate.
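
A rough sketch of K-fold cross-validation in plain Python. The helper names (`k_fold_indices`, `cross_validate`) and the toy "predict the mean" model are assumptions made for illustration; any fit and score functions could be passed in.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n items."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

def cross_validate(xs, ys, k, fit, score):
    """Train on K-1 parts, score on the held-out part, and average the K scores."""
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        scores.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(scores) / len(scores)

# Toy model for illustration: always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)

def mean_abs_error(model, xs, ys):
    return sum(abs(y - model) for y in ys) / len(ys)

estimate = cross_validate([0] * 6, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3, fit_mean, mean_abs_error)
```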

10
Q

When does Undersampling take place?

A

When samples from the majority class are removed so that the classes are more balanced

11
Q

When does oversampling take place?

A

When samples from the minority class are copied so that the classes are more balanced
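
To make the contrast concrete, here is a sketch of random undersampling (dropping majority-class rows) and random oversampling (copying minority-class rows). The function names and the toy rows are illustrative only.

```python
import random

def random_undersample(majority, minority, rng):
    """Keep a random subset of the majority class, the same size as the minority."""
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, rng):
    """Copy random minority rows (with replacement) until both classes match in size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

rng = random.Random(0)
majority = [("no", i) for i in range(8)]   # 8 rows of the common class
minority = [("yes", i) for i in range(2)]  # 2 rows of the rare class

under_major, under_minor = random_undersample(majority, minority, rng)
over_major, over_minor = random_oversample(majority, minority, rng)
```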

12
Q

Name 3 Unsupervised learning algorithms

A

Principal Component Analysis
K-Means Clustering
Hierarchical Clustering

13
Q

What is Principal Component Analysis?

A

Identifying a set of mutually uncorrelated linear combinations of the features that have maximum variance. It also helps uncover latent interactions among the variables in an unsupervised framework.
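
A sketch of the idea for the special case of two features, where the principal components of the covariance matrix have a closed form. `pca_2d` is a hypothetical helper written for this card; real analyses with more features would typically rely on a linear-algebra library.

```python
import math

def pca_2d(points):
    """Return (first direction, variance along it, variance along the second axis)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Orientation of the axis of maximum variance.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(angle), math.sin(angle))
    # Eigenvalues of the covariance matrix = variance along each component.
    half_trace = (sxx + syy) / 2
    radius = math.hypot((sxx - syy) / 2, sxy)
    return direction, half_trace + radius, half_trace - radius

# Points lying exactly on the line y = x: all variance falls on one component.
direction, var_first, var_second = pca_2d([(0, 0), (1, 1), (2, 2)])
```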

14
Q

What is Machine Learning?

A

Machine Learning is the use of mathematical and/or statistical models to gain knowledge from data in order to make predictions.

15
Q

Name an unsupervised machine-learning technique

A

Clustering

16
Q

What are Latent variable models?

A

Latent variable models are commonly used for data preprocessing, such as reducing the number of features in a dataset (dimensionality reduction) or decomposing the dataset into multiple components.

17
Q

What is clustering?

A

A clustering problem is where you want to discover the inherent groupings in the
data, such as grouping customers by purchasing behavior.

18
Q

Which type of learning (supervised or unsupervised) makes use of clustering?

A

Unsupervised

18
Q

Which type of learning (supervised or unsupervised) makes use of classification?

A

Supervised

19
Q

Which type of learning (supervised or unsupervised) makes use of regression?

A

Supervised

20
Q

Which type of machine learning makes use of labelled input and output data during the training phase of the machine learning lifecycle?

A

Supervised learning

21
Q

To be able to classify new and unseen datasets and predict outcomes, what does a supervised learning model need to learn?

A

The relationship between input and output data

22
Q

Which type of machine learning has input variables (X) and an output variable (Y)?

A

Supervised learning

23
Q

Why do we call it supervised learning?

A

Because part of the approach requires human oversight: the training data must be labelled with the correct outputs.

24
Q

What is classification?

A

Classification is used when the output variable is categorical.

25
Q

Give an example of categorical data

A

Yes/no, male/female or true/false

26
Q

What is regression?

A

Regression is used when the output variable is a real or continuous value.

27
Q

Give an example of regression variables

A

Salary based on work experience, or weight based on height

28
Q

Give an example of an algorithm used for regression problems

A

Linear regression, support vector regression or a regression tree

29
Q

Give an example of a classification problem

A

The machine needs to tell different kinds of fruit apart (apple, banana and cherry)

30
Q

Which type of learning does not make use of output variables?

A

Unsupervised

31
Q

Unsupervised learning makes use of output variables (true or false)

A

False

32
Q

What do you call the output of unsupervised learning?

A

Pseudo output

33
Q

What is anomaly detection?

A

It is when machine learning automatically detects unusual data points in a dataset.
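
One simple way to flag unusual points, shown here only as a sketch, is to mark values that lie more than a chosen number of standard deviations from the mean. The threshold of 2.0 and the function name `z_score_outliers` are assumptions for illustration; many other anomaly detection methods exist.

```python
import statistics

def z_score_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds threshold std devs."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / spread > threshold]

readings = [10, 11, 9, 10, 12, 10, 11, 50]  # made-up data with one obvious outlier
outliers = z_score_outliers(readings)
```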

34
Q

What is association mining?

A

It identifies sets of items that frequently occur together in your dataset.
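
As a sketch of the counting step behind association mining, the snippet below finds pairs of items that appear together in at least `min_support` baskets. The baskets and the helper name `frequent_pairs` are invented for illustration; full algorithms also handle larger item sets and derive rules from them.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Count how often each pair of items appears together; keep the frequent ones."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair a fixed order
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
pairs = frequent_pairs(baskets, min_support=3)
```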

35
Q

What are latent variable models?

A

Models commonly used for data preprocessing, such as reducing the number of features in a dataset

36
Q

Give a real-world example of unsupervised learning

A

Computer vision (object recognition)
Medical imaging
Anomaly detection
Customer personas (habits)
Recommendation engines

37
Q

What is the difference between unsupervised and supervised learning?

A

Unsupervised learning, in contrast to supervised learning:
No output data is given
Data is not labeled
Computationally complex
Less accurate and trustworthy
Number of classes is not known

38
Q

What is accuracy in supervised learning?

A

The ability of a model to make correct predictions
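
For classification, accuracy is commonly measured as the fraction of predictions that match the true labels. A minimal sketch, with made-up labels:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    correct = sum(1 for truth, guess in zip(y_true, y_pred) if truth == guess)
    return correct / len(y_true)

y_true = ["yes", "no", "yes", "no"]
y_pred = ["yes", "no", "no", "no"]
score = accuracy(y_true, y_pred)  # 3 of the 4 predictions are correct
```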

39
Q

What is interpretability in supervised learning?

A

The degree to which the model allows for human understanding

40
Q

Give an example of an interpretable model

A

Linear regression
Random forest

41
Q

Give an example of a non-interpretable model

A

SVM (Support Vector Machine)
LSTM (Long Short-Term Memory)
Deep learning (DL)

42
Q

What is K-means clustering?

A

A method of vector quantization that aims to partition n observations into k clusters, in which each observation belongs to the cluster with the nearest mean

43
Q

Which learning category does K-means clustering fall into?

A

Unsupervised

44
Q

Why use K-means clustering?

A

K-means is used to group unlabelled data by their features rather than by predefined categories. The goal is to split the data into k different clusters and report the location of the centre of mass of each cluster.
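
A compact sketch of the usual K-means loop: assign every point to its nearest centre, then move each centre to the mean (centre of mass) of its points, and repeat. The function name, the fixed number of iterations and the sample points are assumptions for illustration.

```python
import random

def k_means(points, k, rng, iterations=20):
    """Return (centroids, clusters) after a fixed number of assign/update rounds."""
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            # index of the centroid with the smallest squared distance to p
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2, rng=random.Random(0))
```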

45
Q

What does the K represent in K-means clustering?

A

The K represents the number of groups (clusters) created.

46
Q

What is hierarchical clustering?

A

An algorithm that creates clusters with a predominant ordering from top to bottom

47
Q

What does hierarchical clustering do?

A

Hierarchical clustering separates data into groups based on some measure of similarity.

48
Q

What is agglomerative hierarchical clustering?

A

Agglomerative (“bottom-up”) clustering starts with each observation as its own cluster. Clusters are merged into larger groups as we move up the tree.
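
The bottom-up merging can be sketched for one-dimensional data as follows. The choice of single linkage (distance between the closest members of two clusters) and the name `agglomerative` are assumptions made for this example; other linkage rules are equally valid.

```python
def agglomerative(values, target_clusters):
    """Merge the two closest clusters repeatedly until target_clusters remain."""
    clusters = [[v] for v in values]  # every observation starts as its own cluster
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

three_groups = agglomerative([1, 2, 9, 10, 20], target_clusters=3)
two_groups = agglomerative([1, 2, 9, 10, 20], target_clusters=2)
```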

49
Q

What is divisive clustering?

A

Divisive (“top-down”) clustering starts with one cluster containing all observations. The cluster is split into subgroups as we move down the tree.