Modeling 2 Flashcards

(85 cards)

1
Q

Object Detection

A

it detects objects in an image with bounding boxes

2
Q

How does Object Detection work?

A

with a single deep neural network

CNN with the Single Shot MultiBox Detector (SSD) algorithm
- the base CNN can be VGG-16 or ResNet-50

3
Q

How does Object Detection express confidence?

A

each detected object is returned with a confidence score

4
Q

how to train object detection?

A

i. train from scratch

ii. use pre-trained models based on ImageNet

5
Q

Object Detection input?

A

RecordIO or image format (JPG, PNG)

with image format, supply a JSON file per training image
- provides metadata such as bounding boxes and labels

6
Q

Object Detection output

A

all instances of objects in the image with categories and confidence scores

7
Q

Object Detection transfer learning mode

A

use pre-trained model for the base network weights, instead of random initial weights

8
Q

How does Object Detection avoid overfitting?

A

flip
rescale
jitter
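
A rough illustration of why flipping needs care in object detection: when an image is mirrored, its bounding boxes have to be mirrored too. The sketch below assumes boxes are given as (left, top, width, height) in pixels. The function names and the box layout are illustrative assumptions, not SageMaker's internal implementation.

```python
def flip_box_horizontally(box, image_width):
    """Mirror a (left, top, width, height) box across the vertical centre line."""
    left, top, width, height = box
    # The old right edge becomes the new left edge, measured from the other side.
    new_left = image_width - (left + width)
    return (new_left, top, width, height)


def flip_image_horizontally(pixels):
    """Mirror an image stored as a list of rows (each row a list of pixel values)."""
    return [list(reversed(row)) for row in pixels]


# A box 10 px from the left edge of a 100 px wide image ends up 10 px from the right.
flipped_box = flip_box_horizontally((10, 5, 20, 8), 100)
flipped_image = flip_image_horizontally([[1, 2, 3], [4, 5, 6]])
```

Flipping twice returns the original box, which is a handy sanity check for this kind of augmentation code.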

9
Q

Object Detection hyperparameters

A

usual ones in a CNN

mini_batch_size
learning_rate
optimizer
- sgd, adam, rmsprop, adadelta

10
Q

Object Detection instance types

A

GPU instances for training (the CNN is computationally demanding)

multi-GPU and multi-machine training supported (scales up nicely)

ml.p2.xlarge
ml.p2.8xlarge
ml.p2.16xlarge
ml.p3.2xlarge
ml.p3.8xlarge
ml.p3.16xlarge

for inference:
CPU or GPU
C5, M5, P2, P3

11
Q

Image Classification

A

like Object detection but simpler

doesn’t tell you where objects are, but gives you a label for the whole image

12
Q

Image Classification Input

A

Apache MXNet RecordIO

  • not protobuf
  • for interoperability with other deep learning frameworks

Raw jpg, png images

image format requires .lst files to associate

  • image index
  • class label
  • path to image

augmented manifest image format enables pipe mode
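
As a small sketch of the .lst idea above: each line ties an image index and a class label to an image path. The tab separator and the column order (index, label, path) follow the list above. The file names and class ids are made up for illustration.

```python
def make_lst_lines(labelled_paths):
    """Build tab-separated .lst lines: image index, class label, path to image."""
    lines = []
    for index, (path, label) in enumerate(labelled_paths):
        lines.append(f"{index}\t{label}\t{path}")
    return lines


# Hypothetical file names and class ids, for illustration only.
examples = [("dogs/dog1.jpg", 0), ("cats/cat1.jpg", 1)]
lst_lines = make_lst_lines(examples)
print("\n".join(lst_lines))
```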

13
Q

what is pipe mode?

A

streams training data from S3 instead of copying it over to the training instance first

14
Q

How does Image Classification work?

A

ResNet CNN

full training mode:
- network initialized with random weights

Transfer Learning mode:

  • initialized with pre-trained weights
  • top fully-connected layer is initialized with random weights
  • network is fine-tuned with new training data
15
Q

Default image specifications for Image Classification

A

224x224
3-channel
(ImageNet’s dataset)

16
Q

Image Classification hyperparameters

A

batch size
learning rate
optimizer

optimizer-specific parameters

  • weight decay
  • beta 1
  • beta 2
  • eps
  • gamma
17
Q

Image classification instance types

A

GPU instances for training (P2, P3)
multi-GPU and multi-machine training supported

CPU or GPU for inference (C4, P2, P3)

18
Q

Semantic Segmentation

A

Pixel level object classification
not like object detection with bounding boxes
not like image classification with labels

19
Q

Semantic Segmentation use cases

A

self-driving vehicles
medical imaging diagnosis
robot sensing

20
Q

Semantic segmentation output

A

segmentation mask

21
Q

Semantic Segmentation training input

A

jpg, png

label maps to describe annotations
- for training and validation

augmented manifest image format supported for pipe mode

jpg images accepted for inference

22
Q

Semantic Segmentation: what is it built on?

A

MXNet Gluon and Gluon CV

23
Q

Semantic Segmentation algorithms

A

Fully-Convolutional Network (FCN)

Pyramid Scene Parsing (PSP)

DeepLabV3

24
Q

Choices of backbones for Semantic Segmentation

A

ResNet50
ResNet101
Both trained on ImageNet

25
Semantic Segmentation training from scratch or incremental
both are supported
26
Semantic Segmentation hyperparameters
epochs
learning rate
batch size
optimizer
algorithm
backbone
27
Semantic Segmentation instance types
GPU only: P2, P3
single machine only
ml.p2.xlarge
ml.p2.8xlarge
ml.p2.16xlarge
ml.p3.8xlarge
ml.p3.16xlarge
28
Inference instances for Semantic Segmentation
CPU: C5, M5
GPU: P2, P3
29
Random cut forest
anomaly detection
unsupervised
detects:
- unexpected spikes in time series data
- breaks in periodicity
- unclassifiable data points
based on an algorithm developed by Amazon
30
random cut forest output
assigns an anomaly score to each data point
31
random cut forest training input
RecordIO-protobuf or CSV
can use File or Pipe mode on either
optional test channel for computing accuracy, precision, recall and F1 on labeled data
32
How does random cut forest work?
creates a forest of trees, where each tree is a partition of the training data
looks at the expected change in complexity of a tree as a result of adding a point to it
33
How is data sampled in Random Cut Forest?
the data is sampled randomly, then the model is trained on the sample
34
Is it possible to use Random Cut Forest in Kinesis Analytics?
Yes. It can work on streaming data too.
35
random cut forest hyperparameters
num_trees
- increasing reduces noise
num_samples_per_tree
- 1/num_samples_per_tree approximates the ratio of anomalous to normal data
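
A quick arithmetic sketch of the num_samples_per_tree rule of thumb above: if you expect roughly 0.5% of points to be anomalous, then 1/0.005 suggests about 200 samples per tree. The helper name is hypothetical, and the result is only a starting point for tuning.

```python
def suggest_num_samples_per_tree(expected_anomaly_ratio):
    """Choose num_samples_per_tree so that 1/num_samples_per_tree is close to the ratio."""
    if not 0 < expected_anomaly_ratio <= 1:
        raise ValueError("expected_anomaly_ratio must be in (0, 1]")
    return round(1 / expected_anomaly_ratio)


# Assumed anomaly ratio of 0.5% (an illustrative figure, not from any dataset).
samples = suggest_num_samples_per_tree(0.005)
print(samples)
```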
36
Random cut forest instance types
does not use GPU
use M4, C4, C5 for training
ml.c5.xlarge for inference
37
Neural Topic Modeling
organizes documents into topics
classify or summarize documents based on topics
not just TF-IDF
unsupervised
38
Neural Topic Modeling algorithm
Neural Variational Inference
39
Training input for Neural Topic Modeling
four data channels
- train is required
- validation, test and auxiliary are optional
RecordIO-protobuf or CSV
words must be tokenized into integers
- every document must contain a count for every word in the vocabulary in CSV
- the auxiliary channel is for the vocabulary
File or Pipe mode (Pipe is faster)
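
A minimal sketch of the "count for every word in the vocabulary" requirement: each document becomes one row with a count per vocabulary word, zeros included. The vocabulary and document here are made up, and using the position in the vocabulary list as the integer id is an assumption for illustration.

```python
from collections import Counter


def to_count_vector(tokens, vocabulary):
    """Return one count per vocabulary word, in vocabulary order (zeros included)."""
    counts = Counter(tokens)
    # Words that are not in the vocabulary are simply ignored.
    return [counts[word] for word in vocabulary]


vocabulary = ["cat", "dog", "fish"]          # integer id = position in this list
document = "dog cat dog".split()
row = to_count_vector(document, vocabulary)  # one CSV row for this document
print(",".join(str(count) for count in row))
```

The same row shape applies to the CSV input for LDA, which also expects a count for every vocabulary word.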
40
how to use Neural Topic Modeling
define how many topics we have
41
Does Neural Topic Modeling give us topic names?
No
topics are a latent representation based on top-ranking words
one of two topic modeling algorithms in SageMaker - you can try them both
42
Neural Topic Modeling: important hyperparameters
lowering mini_batch_size and learning_rate can reduce validation loss
- at the expense of training time
num_topics
43
Neural Topic Modeling instance types
GPU or CPU
GPU recommended for training
CPU, which is cheaper, is OK for inference
44
Latent Dirichlet Allocation (LDA)
topic modeling
not based on deep learning
unsupervised
- topics are unlabeled, which means they are just groupings of documents with a shared subset of words
can be used for things other than words
45
How can you use LDA for things other than words?
cluster customers based on purchases
harmonic analysis in music
46
LDA input for training
train channel, optional test channel
RecordIO-protobuf or CSV
each document has counts for every word in the vocabulary (CSV)
Pipe mode only supported with RecordIO
47
LDA: un/supervised?
unsupervised
48
LDA: what can the optional test channel be used for?
scoring results
- per-word log likelihood
49
LDA vs Neural Topic Modeling (NTM)
similar to NTM but CPU-based
- therefore cheaper / more efficient
50
LDA hyperparameters
num_topics
alpha0
- initial guess for the concentration parameter
- smaller values generate sparse topic mixtures
- larger values (>1.0) produce uniform mixtures
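
To see the effect of the concentration parameter, the sketch below draws topic mixtures from a symmetric Dirichlet prior and measures how dominant the largest topic is on average. Splitting alpha0 evenly across topics, the helper names, and the specific values 0.5 and 50 are assumptions for illustration. This is not SageMaker's LDA code.

```python
import random


def sample_topic_mixture(alpha0, num_topics, rng):
    """Draw one mixture from a symmetric Dirichlet whose parameters sum to alpha0."""
    alpha = alpha0 / num_topics
    # A Dirichlet sample is a set of independent Gamma draws, normalised to sum to 1.
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_topics)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_largest_share(alpha0, num_topics=5, trials=2000, seed=0):
    """Average weight of the most dominant topic over many sampled mixtures."""
    rng = random.Random(seed)
    shares = (max(sample_topic_mixture(alpha0, num_topics, rng)) for _ in range(trials))
    return sum(shares) / trials


sparse = mean_largest_share(0.5)    # small alpha0: one topic tends to dominate
uniform = mean_largest_share(50.0)  # large alpha0: weights close to 1/num_topics
print(sparse, uniform)
```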
51
LDA instance type
Single CPU
52
KNN
K-Nearest-Neighbors
simple classification or regression algorithm
supervised
53
KNN Classification
find the K closest points to a sample point and return the most frequent label
54
KNN Regression
Find the K closest points to a sample point and return the average value
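
The two KNN cards above can be sketched in a few lines of plain Python. This is a brute-force version using Euclidean distance, with made-up data. It skips the sampling, dimensionality reduction and indexing that SageMaker's implementation adds.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k (features, target) pairs closest to query by Euclidean distance."""
    return sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]


def knn_classify(train, query, k):
    """Classification: the most frequent label among the k nearest points."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k):
    """Regression: the average target value of the k nearest points."""
    values = [value for _, value in k_nearest(train, query, k)]
    return sum(values) / len(values)


labelled = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((5, 5), "b"), ((6, 5), "b")]
valued = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]
predicted_label = knn_classify(labelled, (0.5, 0.5), 3)
predicted_value = knn_regress(valued, (2,), 3)
```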
55
KNN input
train channel contains the data
test channel emits accuracy or MSE
RecordIO-protobuf or CSV
- for CSV training, the first column is the label
File or Pipe mode on either
56
KNN in SageMaker, how does it work?
1- data is sampled
2- SageMaker includes a dimensionality reduction stage
- avoids sparse data (curse of dimensionality)
- at the cost of noise / accuracy
- sign or fjlt methods
3- build an index for looking up neighbours
4- serialize the model
5- query the model for a given K
57
KNN hyperparameters
K!
sample_size
58
KNN Instance types
training on CPU or GPU
- ml.m5.2xlarge
- ml.p2.xlarge
inference
- CPU for lower latency
- GPU for higher throughput on large batches
59
K-Means
unsupervised clustering
divides the data into K groups, where members of a group are as similar as possible to each other
- you define what "similar" means
- measured by Euclidean distance
SageMaker offers web-scale k-means clustering
60
K-Means input
train channel, optional test channel
- train: ShardedByS3Key
- test: FullyReplicated
RecordIO-protobuf or CSV
File or Pipe mode on either
61
K-Means under the hood
every observation is mapped to n-dimensional space
- n is the number of features
works to optimize the centers of K clusters
"extra cluster centers" may be specified to improve accuracy (which end up getting reduced to k)
- K = k * x
62
K-Means algorithm
determine initial cluster centers
- random or k-means++ approach
- k-means++ tries to make the initial clusters far apart
iterate over the training data and calculate cluster centers
reduce clusters from K to k
- using Lloyd's method with k-means++
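
A minimal sketch of the two named building blocks, k-means++ seeding followed by Lloyd's method, on toy 2-D points. It deliberately leaves out SageMaker's streaming over mini-batches and the reduction from K extra centers down to k. The function names and data are illustrative assumptions.

```python
import math
import random


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: later centers are picked with probability proportional to D^2."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Squared distance from each point to its nearest already-chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def lloyd(points, centers, iterations=20):
    """Lloyd's method: assign each point to its nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = sorted(lloyd(points, kmeans_pp_init(points, 2, random.Random(0))))
print(centers)
```

Because the seeding favours points far from existing centers, the two starting centers almost always land in different blobs, which is the "far apart" behaviour the card describes.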
63
K-Means hyperparameters
K!
- choosing K is tricky
- plot within-cluster sum of squares as a function of K
- use the elbow method
- basically optimize for tightness of clusters
mini_batch_size
extra_center_factor
init_method
64
K-Means instance type
CPU or GPU, but CPU recommended
only one GPU per instance is used on GPU
- so use p*.xlarge if you use GPU
65
PCA: Principal Component Analysis
dimensionality reduction
avoids the curse of dimensionality while minimizing loss of information
66
PCA: un/supervised?
unsupervised
67
What are the reduced dimensions called?
components
- the first component has the largest possible variability
- the second component has the next largest
68
PCA input
RecordIO-protobuf or CSV
File or Pipe mode on either
69
PCA under the hood?
a covariance matrix is created, then singular value decomposition (SVD) is applied
two modes
- regular: for sparse data and a moderate number of observations and features
- randomized: for a large number of observations and features; uses an approximation algorithm
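
A stdlib-only sketch of the covariance step, using toy data that lies on the line y = 2x. Note the swap: instead of SVD, it uses power iteration to find only the first component (the direction of largest variance), which keeps the example free of numerical libraries. The names and data are illustrative.

```python
import math


def covariance_matrix(rows):
    """Mean-center the data and return its sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)] for i in range(d)]


def first_component(cov, iterations=100):
    """Power iteration: repeatedly multiply by cov and normalise the result."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
component = first_component(covariance_matrix(data))
print(component)
```

Since all the variance lies along y = 2x, the first component points along (1, 2) scaled to unit length.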
70
PCA hyperparameters
algorithm_mode
subtract_mean
- unbiases the data
71
PCA instance type
CPU or GPU
- it depends on the specifics of the input data
72
Factorization Machines
classification and regression
deals with sparse data
73
Factorization Machines use cases
click prediction
item recommendations
- since an individual user doesn't interact with most pages / products, the data is sparse
74
Factorization Machines: un/supervised?
supervised
- classification or regression
75
Are Factorization Machines limited to pair-wise interactions?
yes, e.g. user - item
76
Factorization Machines input
RecordIO-protobuf with Float32
- sparse data means CSV isn't practical
77
How do Factorization Machines work?
finds factors we can use to predict a classification (e.g. click or not, purchase or not) or a value (e.g. a predicted rating)
- given a matrix representing some pair of things (users and items)
usually used in the context of recommender systems
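
A sketch of the standard second-order factorization machine score, to make "factors" and "pair-wise interactions" concrete: each feature gets a small factor vector, and the interaction weight for a pair of features is the dot product of their vectors. All weights and the input below are made-up numbers, and the function name is hypothetical.

```python
def fm_predict(x, w0, w, v):
    """Score = bias + linear terms + pair-wise interactions weighted by <v_i, v_j>."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise


# One-hot style sparse input: [user_A, user_B, item_X, item_Y]; values are made up.
x = [1.0, 0.0, 0.0, 1.0]
w0 = 0.5
w = [0.1, 0.2, 0.3, 0.4]
v = [[1.0, 2.0], [0.0, 1.0], [2.0, 0.0], [0.5, 0.5]]
score = fm_predict(x, w0, w, v)
print(score)
```

With user_A and item_Y active, only that one pair contributes an interaction term, which is why sparse inputs stay cheap to score.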
78
Factorization Machines hyperparameters
initialization methods for bias, factors, and linear terms
- uniform, normal or constant
- can tune properties of each method
79
Factorization Machines instance types
CPU or GPU
CPU recommended
GPU only works for dense data
80
IP insights
finds suspicious behaviour based on IP addresses
- identify logins from anomalous IPs
- identify accounts creating resources from anomalous IPs
81
IP Insights: un/supervised?
unsupervised
82
IP insights input
usernames, account IDs (raw data, no need to pre-process)
training channel, optional validation channel (computes AUC score)
CSV only (entity, IP)
83
IP insights, how is it used?
uses a neural network to learn latent vector representations of entities and IP addresses
entities are hashed and embedded
- need a sufficiently large hash size
automatically generates negative samples during training by randomly pairing entities and IPs
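
A rough sketch of the two mechanics mentioned above: hashing an entity into one of num_entity_vectors embedding rows, and building negative samples by randomly pairing entities with IPs. The choice of SHA-256, the filtering of already-observed pairs, and all names and data are assumptions for illustration. The algorithm's actual internals are not described in these cards.

```python
import hashlib
import random


def entity_bucket(entity, num_entity_vectors):
    """Hash an entity name to an embedding row (SHA-256 is an illustrative choice)."""
    digest = hashlib.sha256(entity.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_entity_vectors


def negative_samples(observed, count, rng):
    """Randomly pair entities with IPs, keeping only pairs never observed together.

    Assumes at least one unobserved (entity, IP) combination exists.
    """
    entities = sorted({entity for entity, _ in observed})
    ips = sorted({ip for _, ip in observed})
    seen = set(observed)
    negatives = []
    while len(negatives) < count:
        pair = (rng.choice(entities), rng.choice(ips))
        if pair not in seen:
            negatives.append(pair)
    return negatives


observed = [("alice", "10.0.0.1"), ("bob", "10.0.0.2"), ("carol", "10.0.0.3")]
negatives = negative_samples(observed, 4, random.Random(0))
bucket = entity_bucket("alice", 20000)
```

A hash size that is too small makes different entities collide in the same bucket, which is why the cards stress a sufficiently large num_entity_vectors.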
84
IP Insights hyperparameters
num_entity_vectors
- hash size
- set to twice the number of unique entity identifiers
vector_dim
- size of embedding vectors
- scales model size
- too large results in overfitting
epochs, learning rate, batch size, etc.
85
IP Insights instance type
CPU or GPU
GPU recommended
- ml.p3.2xlarge or higher
- can use multiple GPUs
size of CPU instance depends on
- vector_dim
- num_entity_vectors