Modeling - ML Models Flashcards

1
Q

XGBoost

A
  • Extreme Gradient Boosted Trees
    • Boosted group of decision trees
    • new trees made to correct errors of previous trees
    • uses gradient descent to minimize loss as new trees are added
  • Classification or regression (using regression trees)
  • regularization term penalizes complexity of each tree
  • a node is split only if the split yields a positive reduction of the loss function
  • gamma (minimum loss reduction) sets the complexity cost of each additional leaf: a split must reduce the loss by more than gamma to be kept
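The split rule above can be sketched with the usual second-order gain formula. This is a minimal illustration, not XGBoost's actual code: the function name `split_gain` and the gradient statistics passed to it are made up for the example, and `lam` stands for the L2 regularization term.

```python
def split_gain(g_left, h_left, g_right, h_right, lam=1.0, gamma=0.0):
    """Loss reduction from splitting one leaf into a left and a right child.

    g_* and h_* are the sums of first- and second-order gradients of the
    loss over the samples in each child (hypothetical inputs here).
    """
    def score(g, h):
        # Quality of a leaf holding gradient sum g and hessian sum h.
        return g * g / (h + lam)

    gain = 0.5 * (score(g_left, h_left)
                  + score(g_right, h_right)
                  - score(g_left + g_right, h_left + h_right))
    # gamma is charged for the extra leaf; keep the split only if the result is > 0.
    return gain - gamma


cheap_leaf = split_gain(-4.0, 3.0, 5.0, 4.0, lam=1.0, gamma=0.5)
costly_leaf = split_gain(-4.0, 3.0, 5.0, 4.0, lam=1.0, gamma=5.0)
print(cheap_leaf > 0, costly_leaf > 0)
```

With a small gamma the split is kept; raising gamma makes the same split unprofitable, which is how gamma prunes the tree.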
1
Q

Logistic Regression

A

Supervised linear classification model (the decision boundary is linear in the features)
The logistic (sigmoid) function maps a linear combination of the features to the probability of each possible outcome
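As a small sketch of how the logistic function turns a linear score into a probability (the weights, bias, and feature values below are invented for illustration):

```python
import math


def predict_proba(features, weights, bias):
    """Probability of the positive class for one sample."""
    # Linear combination of the features ...
    z = bias + sum(w * x for w, x in zip(weights, features))
    # ... squashed into (0, 1) by the logistic (sigmoid) function.
    return 1.0 / (1.0 + math.exp(-z))


on_boundary = predict_proba([1.0, 2.0], weights=[2.0, -1.0], bias=0.0)
positive_side = predict_proba([3.0, 1.0], weights=[2.0, -1.0], bias=0.0)
print(on_boundary, positive_side)
```

A sample with a score of zero sits on the decision boundary and gets probability 0.5; larger scores move the probability toward 1.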

2
Q

K-means

A
  • method for grouping n observations into K clusters
  • each observation belongs to the cluster with the nearest mean
  • unsupervised
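The assign-then-recompute loop behind this card can be sketched in plain Python. This is an illustrative version of the standard iteration with hand-picked starting centroids, not a production implementation; the sample points are made up.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing cluster means."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each observation to the cluster with the nearest mean.
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

No labels are supplied anywhere, which is what makes the method unsupervised.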
3
Q

Linear Regression

A

Supervised
Regression Model

4
Q

SVM

A
  • Supervised learning models for classification or regression
  • finds a hyperplane in N-dimensional space that distinctly classifies the datapoints
  • If the classes can’t be separated by a straight line (a linear hyperplane), a non-linear kernel is needed to find a separating hyperplane in a transformed feature space
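To see why a non-linear transformation helps, here is a toy sketch using an explicit feature map in place of a kernel (a kernel achieves the same effect implicitly). The XOR-style points, the map, and the hand-chosen hyperplane are all assumptions made for the example; nothing is trained.

```python
# XOR-style data: no single straight line separates the +1 and -1 points.
points = {(1, 1): 1, (-1, -1): 1, (1, -1): -1, (-1, 1): -1}


def feature_map(x1, x2):
    # Add a product feature; in this 3-D space the classes become separable.
    return (x1, x2, x1 * x2)


def classify(point, weights=(0.0, 0.0, 1.0), bias=0.0):
    # A hyperplane in the mapped space: sign of w . phi(x) + b.
    z = sum(w * f for w, f in zip(weights, feature_map(*point))) + bias
    return 1 if z >= 0 else -1


predictions = {p: classify(p) for p in points}
print(predictions == points)
```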
5
Q

Decision Trees

A
  • flowchart-like structure in which each internal node represents a test on an attribute and each leaf node represents a class label
  • paths from root to leaf represent classification rules
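The flowchart idea can be sketched as a nested dictionary, where following one path from root to leaf applies one classification rule. The attributes and labels are invented for illustration.

```python
# Internal nodes test an attribute; leaves (plain strings) are class labels.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "windy",
            "branches": {True: "stay in", False: "play"},
        },
        "rainy": "stay in",
    },
}


def classify(node, sample):
    # Walk from the root, taking the branch that matches each test outcome.
    while isinstance(node, dict):
        node = node["branches"][sample[node["attribute"]]]
    return node


label = classify(tree, {"outlook": "sunny", "windy": False})
print(label)
```

The path taken here reads as the rule "if outlook is sunny and it is not windy, then play".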
6
Q

Random Forest

A
  • ensemble of decision tree classifiers
  • each tree is built from an independently drawn random sample of the dataset (and typically a random subset of the features)
  • tree classifiers are then combined by averaging probabilistic predictions
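The combining step can be sketched as a plain average of per-tree class probabilities. The three probability vectors below are made-up stand-ins for the outputs of already-trained trees.

```python
def forest_predict_proba(tree_probas):
    """Average the class-probability vectors produced by each tree."""
    n_trees = len(tree_probas)
    n_classes = len(tree_probas[0])
    return [sum(p[c] for p in tree_probas) / n_trees for c in range(n_classes)]


# Hypothetical outputs of three trees for one sample, classes [0, 1].
tree_probas = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
averaged = forest_predict_proba(tree_probas)
predicted_class = max(range(len(averaged)), key=lambda c: averaged[c])
print(averaged, predicted_class)
```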
7
Q

RNN

A
  • Recurrent neural network
  • connections between nodes can create a cycle, allowing output from some nodes to affect subsequent inputs to same nodes
  • Infinite impulse response class of networks
    • term borrowed from linear time-invariant systems (filters)
    • the impulse response h(t) does not become exactly zero past a certain point; it continues indefinitely
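A one-unit recurrent cell makes the feedback loop concrete. This is a toy sketch with hand-picked weights (`w_rec`, `w_in` are assumptions), fed a single impulse at the first time step.

```python
import math


def rnn_impulse_response(steps, w_rec=0.5, w_in=1.0):
    """Hidden state of a single tanh unit after an impulse at t = 0."""
    h = 0.0
    states = []
    for t in range(steps):
        x = 1.0 if t == 0 else 0.0  # input is zero after the first step
        # The previous hidden state feeds back into the same unit.
        h = math.tanh(w_rec * h + w_in * x)
        states.append(h)
    return states


response = rnn_impulse_response(8)
print(response)
```

Because each state depends on the previous one, the effect of the impulse shrinks step by step but never becomes exactly zero within the simulated window.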
8
Q

CNN

A
  • most commonly applied to visual images
  • uses convolution kernels that slide over the input to produce feature maps, typically mapping a high-dimensional input to a lower-dimensional representation
  • finite impulse response class of networks
    • impulse response does become exactly zero at times t > T for some finite T
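A one-dimensional sketch of a convolution layer shows both points above: the output is shorter than the input, and the response to a single impulse is exactly zero outside the kernel's width. The kernel values are arbitrary, and, as in most deep learning libraries, the kernel is applied without flipping.

```python
def conv1d(signal, kernel):
    """Slide the kernel over the signal (no padding, stride 1)."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


impulse = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
kernel = [0.5, 0.3, 0.2]
response = conv1d(impulse, kernel)
print(response)
```

Eight inputs produce six outputs, and only the three positions where the kernel overlaps the impulse are non-zero.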
9
Q

Collaborative Filtering

A
  • Technique for recommender systems
  • makes automatic predictions about a user’s interests by collecting preferences or taste information from many users (collaborating)
  • if person A has the same opinion as person B on one issue, A is more likely to share B’s opinion on a different issue than that of a randomly chosen person
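One common way to act on this idea is user-based filtering: weight other users' ratings by how similar their past ratings are to yours. The sketch below uses cosine similarity over shared items; the users, items, and ratings are invented, and real recommenders usually add refinements such as mean-centering.

```python
import math


def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        s = cosine(ratings[user], theirs)
        num += s * theirs[item]
        den += abs(s)
    return num / den if den else None


ratings = {
    "alice": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 5},    # agrees with alice so far
    "carol": {"a": 1, "b": 5, "c": 1},  # disagrees with alice so far
}
prediction = predict(ratings, "alice", "c")
print(prediction)
```

The prediction for alice leans toward bob's rating because bob's past opinions match hers more closely than carol's do.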
10
Q

Semantic Segmentation

A
  • deep learning algorithm that associates a label or category with every pixel in an image
  • used to recognize a collection of pixels that form distinct categories
  • try to draw a boundary around every object and know pixel level details
  • labeling every pixel in image and knowing to which class it belongs
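The final labeling step can be sketched as picking, for every pixel, the class with the highest score. The per-pixel scores below are made-up stand-ins for what a trained network would output, and the class names are hypothetical.

```python
def label_pixels(scores, class_names):
    """Return a label map: one class name per pixel."""
    labels = []
    for row in scores:
        labels.append(
            [class_names[max(range(len(px)), key=lambda c: px[c])] for px in row]
        )
    return labels


class_names = ["sky", "car", "road"]
# A 2x2 image; each pixel holds one score per class.
scores = [
    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]],
    [[0.1, 0.2, 0.7], [0.1, 0.8, 0.1]],
]
label_map = label_pixels(scores, class_names)
print(label_map)
```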
11
Q

Instance Segmentation

A

Segment and show different instances of same class

12
Q

Linear Learner

A

SageMaker built-in algorithm
supervised learning algorithms used for classification or regression
For regression - basically Linear Regression.
For classification - linear threshold function is used. Can do binary or multi-class.
Uses Stochastic Gradient descent

13
Q

DeepAR

A

SageMaker built-in algorithm
Forecasting algorithm
Forecasting scalar time series using RNN.

14
Q

Random Cut Forest

A

For anomaly detection
Unsupervised
Can detect unexpected spikes in time series data

15
Q

KNN

A

K-Nearest Neighbors
supervised
simple classification or regression algorithm.
Find K closest points to a sample point and return most frequent label or average value
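A minimal sketch of the classification case, using squared Euclidean distance and a majority vote. The training points and labels are invented for the example.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the most frequent label among the k closest training points."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


train = [
    ((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
    ((8, 8), "blue"), ((9, 8), "blue"),
]
near_red = knn_classify(train, (2, 2), k=3)
near_blue = knn_classify(train, (8, 9), k=3)
print(near_red, near_blue)
```

For regression, the vote would be replaced by the average of the neighbors' values.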

16
Q

PCA

A

Dimensionality reduction
Unsupervised

17
Q

Factorization Machines

A

dealing with sparse data
good for item recommendations
supervised
classification or regression
pair-wise interactions
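The pairwise-interaction idea can be sketched with the standard second-order factorization machine prediction: a bias, a linear term, and one interaction per feature pair whose weight is the dot product of two learned factor vectors. All parameter values below are made up rather than learned.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine prediction for one sample."""
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            # The interaction weight is the dot product of the factor vectors.
            weight = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += weight * x[i] * x[j]
    return w0 + linear + pairwise


x = [1.0, 0.0, 1.0]                       # sparse input: feature 1 is absent
w0 = 0.5                                  # global bias
w = [0.1, 0.2, 0.3]                       # per-feature weights
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one factor vector per feature
prediction = fm_predict(x, w0, w, v)
print(prediction)
```

Because interaction weights come from shared factor vectors rather than one free parameter per pair, they can be estimated even when a given pair rarely co-occurs in sparse data.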

18
Q

BlazingText

A

provides highly optimized implementations of the Word2Vec and text classification algorithms
- Word2Vec (unsupervised): word embeddings useful for downstream tasks such as sentiment analysis, entity recognition, translation
- text classification (supervised): predicts labels for a sentence
- text classification is useful for web searches, information retrieval, ranking, document classification

19
Q

Sequence2Sequence

A
  • supervised algorithm
  • input is sequence of tokens
  • output generated is another sequence of tokens
  • works well for text summarization
  • encodes and decodes sequences of tokens, such as words
20
Q

Object2Vec

A
  • generalizes the Word2Vec word-embedding technique that is optimized in the BlazingText algorithm
  • like Word2Vec but with arbitrary objects
21
Q

Neural Topic Model (NTM)

A

Topic modeling algorithm

22
Q

Latent Dirichlet Allocation (LDA)

A

Topic modeling algorithm

23
Q

Amazon Comprehend

A

Advanced text analytics (Use natural language processing to extract insights & relationships from unstructured text)

24
Q

Amazon CodeGuru

A

Automated code reviews (Automate code reviews & identify your most expensive lines of code)

25
Q

Amazon Lex

A

Chatbots (Easily build conversational agents to improve customer service & increase contact center efficiency)

26
Q

Amazon Forecast

A

Demand forecasting (Build accurate forecasting models on the same machine learning forecasting technology used by Amazon.com)

27
Q

Amazon Textract

A

Document analysis (Automatically extract text and data from millions of documents in just hours, reducing manual efforts)

28
Q

Amazon Kendra

A

Enterprise search (Add natural language search capabilities to your apps so users can find the information they need more easily)

29
Q

Amazon Fraud Detector

A

Fraud prevention (Identify potentially fraudulent online activities based on the same technology used at Amazon.com)

30
Q

Amazon Rekognition

A

Image and video analysis (Add image and video analysis to your applications to catalog assets, automate media workflows, and extract meaning)

31
Q

Amazon Personalize

A

Personalized recommendations (Personalize experiences for your customers using machine learning technology perfected from years of use on Amazon.com)

32
Q

Amazon Translate

A

Real-time translation (Expand your reach through efficient and cost-effective translation to reach audiences in multiple languages)

33
Q

Amazon Polly

A

Text to speech (Turn text into life-like speech to give voice to your applications)

34
Q

Amazon Transcribe

A

Transcription (Easily add high-quality speech-to-text capabilities to your applications and workflows)
