Modeling 3 Flashcards

1
Q

Amazon Comprehend

A

A higher-level AI/ML service, separate from SageMaker

Does natural language processing (NLP) and text analytics

2
Q

Amazon Comprehend input

A
Social media
Emails
Web pages
Documents
Transcripts
Medical records (Comprehend Medical)
3
Q

What does Amazon Comprehend extract?

A
Entities
Key phrases
Sentiment
Language
Syntax
Topics
Document classification
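
A minimal sketch of pulling a few of these results through the Comprehend API. The helper name `analyze_text` is hypothetical; it assumes `client` behaves like `boto3.client("comprehend")`, whose `detect_sentiment`, `detect_entities`, and `detect_key_phrases` calls take `Text` and `LanguageCode`.

```python
def analyze_text(client, text, language_code="en"):
    """Collect sentiment, entities, and key phrases for one piece of text.

    `client` is expected to behave like boto3.client("comprehend").
    """
    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    entities = client.detect_entities(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)
    return {
        "sentiment": sentiment["Sentiment"],
        "entities": [(e["Text"], e["Type"]) for e in entities["Entities"]],
        "key_phrases": [p["Text"] for p in phrases["KeyPhrases"]],
    }


# Usage against the real service (needs boto3 and AWS credentials):
#   import boto3
#   print(analyze_text(boto3.client("comprehend"), "Amazon opened an office in Seattle."))
```

Passing the client in keeps the helper easy to exercise with a stub before pointing it at a real account.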
4
Q

Can you train Amazon Comprehend on your own data?

A

Yes, you can train custom models on your own data

You can also use the out-of-the-box (pre-trained) models

5
Q

Amazon Translate

A

Uses deep learning to translate text

6
Q

Can you define custom terminology for Amazon Translate?

A

Yes, by supplying a terminology file in CSV or TMX format

Appropriate for proper names, brand names, etc.
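
A small sketch of what a CSV terminology file can look like: a header row of language codes, then one row per term. The helper name and the sample terms are hypothetical.

```python
import csv
import io


def build_terminology_csv(source_lang, target_lang, terms):
    """Return CSV text: a header row of language codes, then one row per term."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source_lang, target_lang])
    for source_term, target_term in terms:
        writer.writerow([source_term, target_term])
    return buffer.getvalue()


# Hypothetical brand names that should come through translation unchanged.
terminology = build_terminology_csv(
    "en", "fr", [("Amazon Translate", "Amazon Translate"), ("Prime Day", "Prime Day")]
)
print(terminology)
```

With boto3, a file like this would typically be uploaded through the Translate client's `import_terminology` call and then referenced by name via `TerminologyNames` in `translate_text`.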

7
Q

Amazon Transcribe

A

Speech to text

8
Q

Does Amazon Transcribe support streaming audio?

A

Yes, over HTTP/2 or WebSocket

You must specify the language
- originally French, English, and Spanish only; more languages have since been added

9
Q

Amazon Transcribe input

A

FLAC
MP3
MP4
WAV

10
Q

Does Amazon Transcribe do speaker identification?

A

Yes

Specify how many speakers are in the audio and it does the rest

11
Q

Does Amazon Transcribe do channel identification?

A

Yes
e.g. two callers on separate channels can be transcribed separately
The transcripts are merged based on the timing of utterances

12
Q

Does Amazon Transcribe support custom vocabularies?

A

Yes
Give it a list of special words, names, acronyms

Can also use vocabulary tables that include pronunciation (how a term sounds)
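
A sketch of how speaker labels, channel identification, and a custom vocabulary show up as settings on a batch transcription job. The helper name is hypothetical; the keys follow the `start_transcription_job` parameters in boto3. Treating speaker labels and channel identification as mutually exclusive is an assumption of this sketch.

```python
def build_transcription_request(job_name, media_uri, media_format, language_code,
                                max_speakers=None, separate_channels=False,
                                vocabulary_name=None):
    """Build keyword arguments for a Transcribe batch job."""
    if max_speakers and separate_channels:
        # Assumption: the two modes are alternatives, so refuse to combine them.
        raise ValueError("choose speaker labels or channel identification, not both")
    settings = {}
    if max_speakers:
        # Speaker identification: tell Transcribe how many speakers to expect.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if separate_channels:
        # Channel identification: transcribe each audio channel separately.
        settings["ChannelIdentification"] = True
    if vocabulary_name:
        # Name of a custom vocabulary created beforehand.
        settings["VocabularyName"] = vocabulary_name
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "MediaFormat": media_format,  # e.g. "flac", "mp3", "mp4", "wav"
        "LanguageCode": language_code,
    }
    if settings:
        request["Settings"] = settings
    return request


# Hypothetical job name, bucket, and vocabulary.
request = build_transcription_request(
    "support-call-001", "s3://example-bucket/call.mp3", "mp3", "en-US",
    max_speakers=2, vocabulary_name="product-names",
)
print(request)
```

The resulting dict would be passed as `boto3.client("transcribe").start_transcription_job(**request)`.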

13
Q

Amazon Polly

A
Neural text-to-speech, many voices & languages 
supports:
- Lexicons
- SSML
- Speech Marks
14
Q

Does Amazon Polly handle Lexicons?

A

Yes

Lexicons customize how specific words and phrases are pronounced
e.g. map "W3C" to "World Wide Web Consortium"

15
Q

SSML

A

Speech Synthesis Markup Language

Alternative to plain text
Gives control over emphasis, pronunciation, breathing, whispering, speech rate, pitch, pauses
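
A minimal sketch of SSML for three of these controls (emphasis, a pause, and speech rate/pitch), built as a string and checked for well-formed XML. The helper name and sample sentences are hypothetical, and tag support varies by voice and engine.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def build_ssml(intro, key_point, aside):
    """Wrap three text fragments in SSML tags for emphasis, a pause, and prosody."""
    return (
        "<speak>"
        f"{escape(intro)} "
        f'<emphasis level="strong">{escape(key_point)}</emphasis>'
        '<break time="500ms"/>'
        f'<prosody rate="slow" pitch="low">{escape(aside)}</prosody>'
        "</speak>"
    )


ssml = build_ssml("Polly reads text aloud.", "SSML controls how.", "Use it sparingly.")
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed XML
print(ssml)
```

With boto3, markup like this would be sent through the Polly client's `synthesize_speech` call with `TextType="ssml"`.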

16
Q

Polly Speech Marks

A

Can encode when a sentence or word starts and ends in the audio stream
Useful for lip-syncing animation
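
A sketch of reading speech marks, assuming they arrive as newline-delimited JSON objects with `time` (milliseconds into the audio), `type`, `start`, `end`, and `value` fields. The sample data is hand-written for illustration, not real Polly output.

```python
import json


def parse_speech_marks(raw):
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]


def word_timeline(marks):
    """Return (time_ms, word) pairs for word-level marks only."""
    return [(m["time"], m["value"]) for m in marks if m["type"] == "word"]


# Hand-written sample in the assumed shape; not real output.
sample = "\n".join([
    '{"time": 0, "type": "sentence", "start": 0, "end": 11, "value": "Hello world"}',
    '{"time": 6, "type": "word", "start": 0, "end": 5, "value": "Hello"}',
    '{"time": 370, "type": "word", "start": 6, "end": 11, "value": "world"}',
])
print(word_timeline(parse_speech_marks(sample)))
```

An animation loop could step through the timeline and switch mouth shapes as each word's timestamp is reached.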

17
Q

Amazon Rekognition

A

Computer Vision
Object and scene detection
- can use a collection of known faces

Image moderation
Facial Analysis
Celebrity recognition
Face comparison
Text in image
Video analysis
- objects, people, and celebrities marked on a timeline
- people pathing
18
Q

Amazon Rekognition input

A

Video
- Kinesis Video Streams (H.264 encoded, 5-30 FPS; favor resolution over frame rate)

Image

  • S3
  • image bytes sent as part of the request
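
A sketch of the two image input styles as they appear in the `Image` argument of Rekognition image calls: an S3 object reference, or raw bytes in the request. The helper name is hypothetical.

```python
def image_argument(bucket=None, key=None, image_bytes=None):
    """Build the Image argument from either an S3 location or raw bytes."""
    if image_bytes is not None and (bucket or key):
        raise ValueError("pass either image_bytes or bucket/key, not both")
    if image_bytes is not None:
        # Image sent inline as part of the request.
        return {"Bytes": image_bytes}
    if bucket and key:
        # Image read by the service from S3.
        return {"S3Object": {"Bucket": bucket, "Name": key}}
    raise ValueError("need image_bytes, or both bucket and key")


# Hypothetical bucket and object key.
print(image_argument(bucket="example-bucket", key="photos/cat.jpg"))
```

The result would be passed as, for example, `boto3.client("rekognition").detect_labels(Image=image_argument(...), MaxLabels=10)`.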
19
Q

Amazon Forecast

A

Fully managed service

Time-series forecasting with ML

20
Q

Amazon Forecast models

A
ARIMA
DeepAR
ETS
NPTS
Prophet

AutoML chooses the best model

21
Q

Amazon Forecast input

A

works with any time series

  • price
  • promotion
  • economic performance
  • etc

can combine with associated data to find relationships

22
Q

Amazon Forecast use cases

A

Inventory planning
Financial planning
Resource Planning

23
Q

Amazon Forecast is based on which resources?

A

dataset groups
predictors
forecasts

24
Q

Amazon Lex

A

Natural language chatbot engine

A bot is built around intents
Lambda functions are invoked to fulfill the intent
Slots specify extra information needed by the intent
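
A sketch of a fulfillment Lambda, assuming the Lex V2 event and response shapes (`sessionState`, `intent`, `slots`, `dialogAction`). The `PizzaSize` slot and the messages are hypothetical.

```python
def lambda_handler(event, context):
    """Fulfill an intent once its required slot has been filled."""
    intent = event["sessionState"]["intent"]
    slots = intent.get("slots") or {}
    size_slot = slots.get("PizzaSize")  # hypothetical slot name

    if not size_slot:
        # The slot is still empty, so ask the user for it.
        return {
            "sessionState": {
                "dialogAction": {"type": "ElicitSlot", "slotToElicit": "PizzaSize"},
                "intent": intent,
            },
            "messages": [
                {"contentType": "PlainText", "content": "What size would you like?"}
            ],
        }

    size = size_slot["value"]["interpretedValue"]
    # Business logic (placing the order) would go here.
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": dict(intent, state="Fulfilled"),
        },
        "messages": [
            {"contentType": "PlainText", "content": f"Your {size} pizza is on its way."}
        ],
    }
```

The handler only reads the event and returns a dict, so it can be tried locally with a hand-built event before wiring it to a bot.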

25
Q

A use case for Lex?

A

Building an Alexa-style voice assistant

Use Transcribe to convert voice to text

Use Lex to extract the intents

Use Polly to speak the response back to the user

26
Q

Where to deploy Lex?

A

AWS mobile SDK
Facebook Messenger
Slack
Twilio

27
Q

Amazon Personalize

A

collaborative filtering engine

recommender system

feed in data about a user, in return it gives you what other stuff this user might be interested in

28
Q

Amazon Textract

A

Optical Character Recognition (OCR)

Supports tables, forms, and fields

29
Q

Amazon DeepRacer

A

Reinforcement learning powered 1/18 scale race car

30
Q

DeepLens

A

Deep learning-enabled video camera

Integrated with Rekognition, SageMaker, Polly, TensorFlow, MXNet, and Caffe

31
Q

DeepLens output

A

Kinesis Video Streams

32
Q

Reinforcement learning

A

Learning about an environment and how to navigate it optimally as you encounter different states within it

There are:

  • Environment: the layout, e.g. a board or maze
  • Choices (actions)
  • Conditions (states)
  • Rewards: values associated with taking an action in a given state
  • Observation: e.g. surroundings in a maze, state of a chess board

The agent keeps track of the reward or penalty associated with each action in a given condition

It uses those values to inform its future choices
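
One common way to "keep track of the reward for each action in each state" is tabular Q-learning. The sketch below uses a hypothetical 5-cell corridor with the goal at the right end; the environment and all parameter values are made up for illustration, not taken from the course.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def step(state, action):
    """Deterministic environment: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice with random tie-breaking."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table q[state][action] of estimated long-term reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # The goal is terminal, so it contributes no future value.
            future = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
    return q


q_table = train()
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
print(policy)
```

The learned values for "right" decay by the discount factor with each extra step from the goal, which is what steers the greedy policy toward it.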

33
Q

What’s an MDP?

A

Markov Decision Process

A mathematical framework for modeling decision making

An MDP is a discrete-time stochastic control process
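
In the usual textbook notation (symbols here are the standard ones, not taken from the course), an MDP and its Bellman optimality equation can be written as:

```latex
% An MDP: states, actions, transition probabilities, rewards, discount factor.
\mathcal{M} = (S, A, P, R, \gamma), \qquad
P(s' \mid s, a) = \Pr(s_{t+1} = s' \mid s_t = s,\ a_t = a)

% Bellman optimality equation for the optimal state-value function.
V^*(s) = \max_{a \in A} \sum_{s' \in S} P(s' \mid s, a)\,
         \bigl[ R(s, a, s') + \gamma\, V^*(s') \bigr]
```

"Discrete-time" refers to the step index t, and "stochastic" to the transition probabilities P.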

34
Q

Does SageMaker offer reinforcement learning?

A

Yes
It uses a deep learning framework with TensorFlow and MXNet

Supports the Intel Coach and Ray RLlib toolkits

35
Q

Custom, open-source or commercial environments supported

A
MATLAB
Simulink
EnergyPlus
RoboSchool
PyBullet
Amazon Sumerian
AWS RoboMaker
36
Q

Is it possible to distribute training with SageMaker RL?

A

Yes
It can distribute training and/or environment rollout

Multi-core and multi-instance

37
Q

Reinforcement Learning Hyperparameters

A

Parameters of your choosing can be abstracted as hyperparameters

Hyperparameter tuning in SageMaker can then optimize them

38
Q

Reinforcement Learning instance type

A

No specific guidance given by AWS

It’s deep learning though, so GPUs may help

Supports multiple instances and cores

39
Q

SageMaker Automatic Model Tuning

A

Define the hyperparameters you care about, the ranges you want to try, and the metric you are optimizing for

e.g.
- learning rate
- batch size
- depth
- etc
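
A sketch of how those choices map onto a tuning job configuration, using the dict layout of `HyperParameterTuningJobConfig` in the SageMaker API. The helper name, metric name, and ranges are hypothetical.

```python
def build_tuning_config(max_jobs=20, max_parallel=2):
    """Return a HyperParameterTuningJobConfig-style dict for a tuning job."""
    return {
        "Strategy": "Bayesian",
        "HyperParameterTuningJobObjective": {
            "Type": "Minimize",
            "MetricName": "validation:rmse",  # hypothetical objective metric
        },
        "ResourceLimits": {
            "MaxNumberOfTrainingJobs": max_jobs,
            "MaxParallelTrainingJobs": max_parallel,
        },
        "ParameterRanges": {
            # Range bounds are passed as strings in this API.
            "ContinuousParameterRanges": [
                {"Name": "learning_rate", "MinValue": "0.0001", "MaxValue": "0.1",
                 "ScalingType": "Logarithmic"},
            ],
            "IntegerParameterRanges": [
                {"Name": "batch_size", "MinValue": "32", "MaxValue": "512",
                 "ScalingType": "Linear"},
                {"Name": "depth", "MinValue": "2", "MaxValue": "8",
                 "ScalingType": "Linear"},
            ],
        },
    }


config = build_tuning_config()
print(config["ParameterRanges"])
```

A dict like this would be passed as `HyperParameterTuningJobConfig` to the SageMaker client's `create_hyper_parameter_tuning_job` call, alongside a training job definition.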

40
Q

How does SageMaker save time and money when it comes to Automatic Model Tuning?

A

It spins up a hyperparameter tuning job that trains as many combinations as you allow

Potentially a lot of training instances are spun up

It also learns as it goes, so it doesn’t have to try every possible combination

41
Q

SageMaker Auto Tuning best practices?

A

don’t optimize too many hyperparameters at once

limit ranges to as small a range as possible

use logarithmic scales when appropriate

don’t run too many training jobs concurrently
- it limits how well the process can learn as it goes

make sure the training jobs running on multiple instances report the correct objective metric in the end

42
Q

SageMaker and Apache Spark?

A

Yes
There is a SageMaker Spark library that you can use in a Spark driver script

Instead of a Spark MLlib implementation, use a SageMakerEstimator

e.g. XGBoost, PCA, K-Means

43
Q

SageMaker Spark integration

A

Connect a notebook to a remote EMR cluster running Spark
or use Zeppelin

The training DataFrame should have:

  • a features column that is a vector of doubles
  • an optional labels column of doubles

Call fit on the SageMakerEstimator to get a SageMakerModel

Call transform on the SageMakerModel to make inferences

Works with Spark Pipelines

44
Q

Why integrate SageMaker with Spark?

A

Combine pre-processing of big data in Spark with training and inference in SageMaker

45
Q

Where does the training code used by SageMaker come from?

A

Whether it’s your own code, a built-in algorithm from SageMaker, or a model you’ve purchased in the marketplace, all training code deployed to SageMaker training instances comes from ECR

46
Q

Which SageMaker algorithm would be best suited for identifying topics in text documents in an unsupervised setting?

A

Latent Dirichlet Allocation is a topic modeling technique. Neural Topic Model would also be a correct answer.