MLA-C01 Flashcards by Craig Arcuri

Before you can use auto scaling, you must have already created an Amazon SageMaker ______________.

model endpoint.
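
As a rough illustration of why the endpoint has to exist first: auto scaling is registered against a specific endpoint variant. The sketch below only builds the request parameters; the endpoint name, variant name, policy name, capacities, and target value are hypothetical placeholders, not values from these cards.

```python
def build_autoscaling_requests(endpoint_name, variant_name,
                               min_instances=1, max_instances=4,
                               target_invocations=70.0):
    """Build keyword arguments for Application Auto Scaling (a sketch).

    The returned dicts are meant to be passed to
    boto3.client("application-autoscaling").register_scalable_target(**target)
    and .put_scaling_policy(**policy). All default values are illustrative.
    """
    # The resource ID names an existing endpoint and one of its variants.
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    }
    target = dict(common, MinCapacity=min_instances, MaxCapacity=max_instances)
    policy = dict(
        common,
        PolicyName=f"{endpoint_name}-target-tracking",  # hypothetical name
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration={
            "TargetValue": target_invocations,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
        },
    )
    return target, policy


target, policy = build_autoscaling_requests("my-endpoint", "AllTraffic")
```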

You can have multiple model _____________ for the same endpoint.

versions

Amazon SageMaker ____________ provides tools to help explain how machine learning (ML) models make predictions.

Clarify

An ____________ can be thought of as the answer to a "why" question that helps humans understand the cause of a prediction.

explanation

On AWS, AI/ML practitioners can use Amazon SageMaker ____________, which uses Shapley values to help answer how different variables influence model behavior.

Clarify

Debug model output tensors from machine learning training jobs in real time and detect non-converging issues using Amazon SageMaker ____________.

Debugger

___________ is the extent to which you can explain the internal mechanics of an ML or deep learning system in human terms.

Explainability

Amazon SageMaker _________ produces metrics that measure the predictive quality of machine learning model candidates.

Autopilot

The ratio of the number of correctly classified items to the total number of (correctly and incorrectly) classified items.

Accuracy

measures how well an algorithm predicts the true positives (TP) out of all of the positives that it identifies.

Precision

uses natural language processing (NLP) to extract insights about the content of documents. It develops insights by recognizing the entities, key phrases, language, sentiments, and other common elements in a document.

Amazon Comprehend

a text translation service that uses advanced machine learning technologies to provide high-quality translation on demand. Use it to translate unstructured text documents or to build applications that work in multiple languages.

Amazon Translate

a fully managed, automatic speech recognition (ASR) service that makes it easy for developers to add speech to text capabilities to their applications.

Amazon Transcribe

a cloud service that converts text into lifelike speech. You can use it to develop applications that increase engagement and accessibility.

Amazon Polly

a cloud-based image and video analysis service that makes it easy to add advanced computer vision capabilities to your applications.

Amazon Rekognition

a fully managed service that uses statistical and machine learning algorithms to deliver highly accurate time-series forecasts.

Amazon Forecast

an AWS service for building conversational interfaces for applications using voice and text.

Amazon Lex

a fully managed machine learning service that uses your data to generate item recommendations for your users. It can also generate user segments based on the users’ affinity for certain items or item metadata.

Amazon Personalize

a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.

Amazon Textract

an intelligent search service that uses natural language processing and advanced machine learning algorithms to return specific answers to search questions from your data.

Amazon Kendra

allows you to conduct a human review of machine learning (ML) systems to guarantee precision.

Amazon Augmented AI (A2I)

uses machine learning (ML) to make it easier for customers to accurately detect anomalies in their metrics.

Amazon Lookout for Metrics

a fully managed service enabling customers to identify potentially fraudulent activities and catch more online fraud faster.

Amazon Fraud Detector

a fully managed, generative-AI powered assistant that you can configure to answer questions, provide summaries, generate content, and complete tasks based on your enterprise data.

Amazon Q Business

Amazon Polly (text to speech) is the opposite of Amazon ____________.

Transcribe

______________ measures how many actual positives were predicted as positive.

Recall

_____________ is the harmonic mean of precision and recall.

F1-measure
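
The accuracy, precision, recall, and F1 cards can be tied together with a small sketch that computes all four from confusion-matrix counts. The function name and the example counts are illustrative only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    # Accuracy: correctly classified items over all classified items.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: true positives over everything predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: true positives over everything that is actually positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up counts: 40 TP, 10 FP, 20 FN, 30 TN.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

With these counts precision (0.8) is higher than recall (about 0.67), and F1 lands between the two.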

It measures the ability of the model to predict a higher score for positive examples as compared to negative examples.

AUC (Area Under Curve)
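
One way to read this definition is as a pairwise comparison: AUC equals the fraction of (positive, negative) pairs in which the positive example gets the higher score. The sketch below computes it that way on made-up labels and scores; it is quadratic in the number of examples, so it is for intuition rather than production use.

```python
def auc_by_pairs(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly.

    labels are 0/1; ties between a positive and a negative count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Illustrative data: 5 of the 6 positive/negative pairs are ordered correctly.
print(auc_by_pairs([1, 1, 0, 0, 1], [0.9, 0.4, 0.5, 0.2, 0.8]))
```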

_________ is a method used in machine learning to reduce errors in predictive data analysis.

Boosting

____________ improves machine learning models' predictive accuracy and performance by converting multiple weak learners into a single strong learning model.

Boosting

____________ are data structures in machine learning that work by dividing the dataset into smaller and smaller subsets based on their features

Decision trees

Boosting creates an ____________ model by combining several weak decision trees sequentially.

ensemble

In ________, data scientists improve the accuracy of weak learners by training several of them at once on multiple datasets. In contrast, boosting trains weak learners one after another.

bagging

__________ is a popular and efficient open-source implementation of the gradient boosted trees algorithm.

XGBoost

___________ boosting is a supervised learning algorithm that tries to accurately predict a target variable by combining multiple estimates from a set of simpler models.

Gradient

Amazon SageMaker _____________ reduces data prep time for tabular, image, and text data from weeks to minutes.

Data Wrangler

With SageMaker ________________ you can simplify data preparation and feature engineering through a visual and natural language interface.

Data Wrangler

SageMaker ____________ is a no-code ML tool that helps business analysts generate accurate ML predictions without writing code or needing any ML experience.

Canvas

Amazon SageMaker ____________ is a fully managed, purpose-built repository to store, share, and manage features for machine learning (ML) models.

Feature Store

___________are inputs to ML models used during training and inference.

Features

SageMaker ____________ tags and indexes feature groups so they are easily discoverable through the visual interface of Amazon SageMaker Studio.

Feature Store

Amazon SageMaker ____________ offers the most comprehensive set of human-in-the-loop capabilities, allowing you to harness the power of human feedback across the ML lifecycle to improve the accuracy and relevancy of models.

Ground Truth

You can complete a variety of human-in-the-loop tasks with SageMaker ___________, from data generation and annotation to model review, customization, and evaluation, either through a self-service or an AWS-managed offering.

Ground Truth

SageMaker _________ helps identify potential bias during data preparation without writing code.

Clarify

SQL function used for anomaly detection on numeric columns in a stream

RANDOM_CUT_FOREST

Its name is derived from “Linux” and “cluster”

Lustre

a type of parallel distributed file system, for large-scale computing

Lustre

a fully managed Windows file system share drive

FSx for Windows File Server

a network drive you can attach to your instances while they run

EBS

Managed NFS (network file system) that can be mounted on many EC2 instances

EFS

Data Warehouse vs Data Lake. Warehouse is ________________. Lake is ___________

Structured, Unstructured

Binary format that stores both the data and its schema

Avro

Columnar storage format optimized for analytics.

Parquet

Find K “nearest” (most similar) rows and average their values

K Nearest Neighbor (KNN)
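
A minimal sketch of KNN imputation as described on this card: for a row with a missing value, find the K most similar complete rows and average their values for that column. It assumes only one column has gaps (marked with None), that all other features are numeric and present, and that at least one complete row exists; the data is invented.

```python
import math


def knn_impute(rows, target_idx, k=2):
    """Fill None at target_idx with the mean over the k nearest complete rows."""
    complete = [r for r in rows if r[target_idx] is not None]
    if not complete:
        raise ValueError("no complete rows to learn from")
    filled = []
    for row in rows:
        if row[target_idx] is not None:
            filled.append(list(row))
            continue

        def distance(other, row=row):
            # Euclidean distance over every column except the missing one.
            return math.sqrt(sum((a - b) ** 2
                                 for i, (a, b) in enumerate(zip(row, other))
                                 if i != target_idx))

        nearest = sorted(complete, key=distance)[:k]
        new_row = list(row)
        new_row[target_idx] = sum(r[target_idx] for r in nearest) / len(nearest)
        filled.append(new_row)
    return filled


data = [
    [1.0, 1.0, 10.0],
    [1.1, 0.9, 12.0],
    [5.0, 5.0, 50.0],
    [1.05, 1.0, None],   # missing value to impute
]
print(knn_impute(data, target_idx=2, k=2)[3])
```

The last row sits next to the first two rows, so its missing value becomes their average rather than being pulled toward the distant third row.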

Find linear or non-linear relationships between the missing feature and other features

Regression

Duplicate samples from the minority class

Oversampling

Instead of creating more positive samples, remove negative ones

Undersampling
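
The two resampling cards above can be sketched with plain random sampling. This assumes a binary problem where the minority class is present and smaller than the majority; function names, the seed, and the toy data are illustrative.

```python
import random


def split_by_class(samples, labels, minority):
    """Separate (sample, label) pairs into minority and majority lists."""
    pairs = list(zip(samples, labels))
    minority_items = [p for p in pairs if p[1] == minority]
    majority_items = [p for p in pairs if p[1] != minority]
    return minority_items, majority_items


def random_oversample(samples, labels, minority, seed=0):
    """Duplicate random minority samples until both classes are the same size."""
    rng = random.Random(seed)
    minority_items, majority_items = split_by_class(samples, labels, minority)
    needed = max(0, len(majority_items) - len(minority_items))
    extra = [rng.choice(minority_items) for _ in range(needed)]
    return majority_items + minority_items + extra


def random_undersample(samples, labels, minority, seed=0):
    """Drop random majority samples until both classes are the same size."""
    rng = random.Random(seed)
    minority_items, majority_items = split_by_class(samples, labels, minority)
    keep = min(len(minority_items), len(majority_items))
    return rng.sample(majority_items, keep) + minority_items


samples = list(range(10))
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]   # class 1 is the minority
print(len(random_oversample(samples, labels, minority=1)))
print(len(random_undersample(samples, labels, minority=1)))
```

Oversampling keeps all the data at the cost of repeated rows; undersampling throws information away, which is why it is usually the less attractive option unless the dataset is very large.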

measures how “spread-out” the data is

Variance

________________ 𝜎 is just the square root of the variance.

Standard Deviation

Data points that lie more than one ___________________ from the mean can be considered unusual.

Standard Deviation
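
The variance and standard deviation cards can be checked with a short worked example. The sketch assumes population variance (dividing by N, not N - 1) and uses made-up data; the one-sigma cutoff for "unusual" follows the card above and is a parameter so other cutoffs can be tried.

```python
import math


def variance(values):
    """Population variance: the mean of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def std_dev(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))


def unusual_points(values, n_sigma=1.0):
    """Return the values lying more than n_sigma standard deviations from the mean."""
    mean = sum(values) / len(values)
    sigma = std_dev(values)
    return [v for v in values if abs(v - mean) > n_sigma * sigma]


data = [1, 4, 5, 4, 8]
print(variance(data))        # about 5.04
print(std_dev(data))         # about 2.24
print(unusual_points(data))  # [1, 8]
```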

Bucket observations together based on ranges of values.

Binning
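
A small sketch of binning, assuming hand-picked bucket edges: each numeric value is replaced by the label of the range it falls into. The edges and labels below are invented for illustration.

```python
import bisect


def bin_value(value, edges, labels):
    """Map a number to a bucket label.

    edges must be sorted; labels needs one more entry than edges.
    A value equal to an edge goes into the bucket above it.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("labels must have len(edges) + 1 entries")
    return labels[bisect.bisect_right(edges, value)]


age_edges = [18, 35, 65]
age_labels = ["under 18", "18-34", "35-64", "65+"]
print([bin_value(age, age_edges, age_labels) for age in (17, 18, 40, 70)])
```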

Create a “bucket” for every category. The bucket for your category has a 1; all others have a 0.

One Hot Encoding
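
One-hot encoding as described here fits in a few lines. The sketch orders categories alphabetically, which is an arbitrary choice made for this example.

```python
def one_hot_encode(values):
    """Return (categories, rows) where each row has a single 1 for its category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1   # the bucket for this category gets a 1
        rows.append(row)
    return categories, rows


categories, encoded = one_hot_encode(["red", "green", "blue", "green"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```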

______________ for deploying to edge devices

SageMaker Neo

___________ values are used to determine the contribution of each feature toward a model’s predictions

Shapley

Used on the final output layer of a multi-class classification problem

Softmax
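
For reference, softmax turns the raw scores of the final layer into a probability for each class. The sketch below subtracts the maximum score before exponentiating, a common numerical-stability step; the input scores are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)                       # subtract the max for stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(probs.index(max(probs)))  # index of the predicted class
```

The largest score always gets the largest probability, so the predicted class is unchanged; softmax only rescales the scores into a distribution.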

Choosing an activation function: For multi-class classification, use _________ on the output layer

softmax

Choosing an activation function: ________ do well with Tanh

RNNs

Choosing an activation function: For everything else

Start with ReLU

Choosing an activation function: _________ for really deep networks

Swish

When you have data that doesn’t neatly align into columns:
* Images that you want to find features within
* Machine translation
* Sentence classification
* Sentiment analysis

Convolutional Neural Network (CNN)

RNNs: what are they for?

Time-series data

When you want to predict future behavior based on past behavior

Recurrent Neural Network (RNN)

Sequence to sequence, Sequence to vector, Vector to sequence, Encoder -> Decoder

RNN topologies

_________ batch sizes tend to not get stuck in local minima

Small

____________ batch sizes can converge on the wrong solution at random

Large

_________ learning rates can overshoot the correct solution

Large

____________ learning rates increase training time

Small
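
The two learning-rate cards can be seen on a toy problem. The sketch runs plain gradient descent on f(x) = x^2 (minimum at x = 0); the function, starting point, step count, and the three rates are all chosen purely for illustration.

```python
def gradient_descent(learning_rate, start=10.0, steps=50):
    """Minimize f(x) = x**2 with fixed-step gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        gradient = 2 * x              # derivative of x**2
        x -= learning_rate * gradient
    return x


for rate in (0.01, 0.4, 1.1):
    print(rate, gradient_descent(rate))
```

With a rate of 0.01 the result is still far from 0 after 50 steps (slow training); 0.4 converges quickly; 1.1 overshoots on every step and moves further from the minimum each time.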

________________ techniques are intended to prevent overfitting.

Regularization

Preventing overfitting in ML in general: a regularization term is added as weights are learned

L1 and L2 Regularization

L1: sum of the _______________

absolute values of the weights

L2: sum of the ______________

squares of the weights
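
As a quick check on the two definitions: the L1 term sums the absolute values of the weights and the L2 term sums their squares, each scaled by a regularization strength. The weights and the strength (called lam here) are made-up values.

```python
def l1_penalty(weights, lam):
    """L1 regularization term: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


weights = [0.5, -2.0, 0.0, 1.5]
print(l1_penalty(weights, lam=0.1))  # about 0.4
print(l2_penalty(weights, lam=0.1))  # about 0.65
```

The L2 term punishes the large weight (-2.0) much more heavily than the small ones, while L1 penalizes every unit of weight equally, which is why L1 tends to push small weights all the way to zero.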

We need to understand true positives and true negatives, as well as false positives and false negatives.

confusion matrix

Percent of positives rightly predicted

Recall

AKA Correct Positives

Precision

Plot of true positive rate (recall) vs. false positive rate at various threshold settings.

ROC Curve

The area under the ROC curve is… wait for it..

AUC
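
A sketch of how the ROC curve and its area can be computed by hand: sweep the decision threshold over the observed scores, record (false positive rate, true positive rate) at each one, and integrate with the trapezoid rule. It assumes 0/1 labels with both classes present; the data is invented.

```python
def roc_points(labels, scores):
    """Return ROC points (FPR, TPR), starting at (0, 0), for each threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    # Lowering the threshold step by step admits more predicted positives.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_auc(points):
    """Area under a curve given as (x, y) points sorted by x."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.2, 0.8]
curve = roc_points(labels, scores)
print(curve)
print(trapezoid_auc(curve))
```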

Generate N new training sets by random sampling with replacement

Bagging
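
The sampling step of bagging is short enough to sketch directly: each new training set is drawn from the original data with replacement, so some rows repeat and others are left out. The seed and toy data are illustrative; training a model on each set and averaging the predictions is not shown.

```python
import random


def bootstrap_sets(data, n_sets, seed=0):
    """Generate n_sets training sets by sampling data with replacement."""
    rng = random.Random(seed)
    return [[rng.choice(data) for _ in data] for _ in range(n_sets)]


original = ["a", "b", "c", "d", "e"]
for training_set in bootstrap_sets(original, n_sets=3):
    print(training_set)
```

Because the sets are independent of each other, the models trained on them can be fit in parallel, which is the contrast with boosting on the next card.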

Training is sequential; each classifier takes into account the previous one’s success.

Boosting

Define the hyperparameters you care about and the ranges you want to try, and the metrics you are optimizing for

Automatic Model Tuning

* Don’t optimize too many hyperparameters at once
* Limit your ranges to as small a range as possible
* Use logarithmic scales when appropriate
* Don’t run too many training jobs concurrently
  * This limits how well the process can learn as it goes
* Make sure training jobs running on multiple instances report the correct objective metric in the end

Automatic Model Tuning: Best Practices
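
To illustrate the "logarithmic scales" advice: when a hyperparameter such as a learning rate spans several orders of magnitude, sampling uniformly in log space gives each order of magnitude an equal chance. This is a standalone sketch of the idea, not SageMaker's tuning implementation; the range and seed are made up.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(42)
samples = [sample_log_uniform(1e-5, 1e-1, rng) for _ in range(1000)]

# Roughly half the samples fall below 1e-3, the geometric midpoint of the range.
# Linear sampling would put about 99% of them above 1e-3.
print(sum(1 for s in samples if s < 1e-3))
```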

Stop training in a tuning job early if it is not improving the objective significantly

Early Stopping

Uses one or more previous tuning jobs as a starting point

Warm Start

Automates:
* Algorithm selection
* Data preprocessing
* Model tuning
* All infrastructure
* It does all the trial & error for you

SageMaker Autopilot

Visual IDE for machine learning

SageMaker Studio

* Create and share Jupyter notebooks with SageMaker Studio
* Switch between hardware configurations (no infrastructure to manage)

SageMaker Notebooks

Organize, capture, compare, and search your ML jobs

SageMaker Experiments

* Saves internal model state at periodic intervals
  * Gradients / tensors over time as a model is trained
* Define rules for detecting unwanted conditions while training
* A debug job is run for each rule you configure
* Logs & fires a CloudWatch event when the rule is hit

SageMaker Debugger

* Catalog your models, manage model versions
* Associate metadata with models
* Manage approval status of a model
* Deploy models to production
* Automate deployment with CI/CD

SageMaker Model Registry

___________________ is a visualization toolkit for TensorFlow or PyTorch
* Visualize loss and accuracy
* Visualize model graph
* View histograms of weights and biases over time
* Project embeddings to lower dimensions
* Profiling

TensorBoard

* Compile & optimize training jobs on GPU instances
* Can accelerate training up to 50%
* Converts models into hardware-optimized instructions
* Tested with Hugging Face transformers library, or bring your own model
* Incompatible with SageMaker distributed training libraries

SageMaker Training Compiler

* Retain and re-use provisioned infrastructure
* Useful if repeatedly training a model to speed things up
* Use by setting KeepAlivePeriodInSeconds in your training job’s resource config

Warm Pools
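
A minimal sketch of where KeepAlivePeriodInSeconds sits: it is one more key in the resource configuration of a training job request. The instance type, count, volume size, and keep-alive value below are placeholders, and the rest of the training job request is omitted.

```python
def resource_config_with_warm_pool(instance_type, instance_count,
                                   volume_gb, keep_alive_seconds):
    """Build a training-job ResourceConfig dict that requests a warm pool."""
    return {
        "InstanceType": instance_type,
        "InstanceCount": instance_count,
        "VolumeSizeInGB": volume_gb,
        # How long the provisioned instances are retained after the job ends.
        "KeepAlivePeriodInSeconds": keep_alive_seconds,
    }


# Placeholder values for illustration only.
config = resource_config_with_warm_pool("ml.m5.xlarge", 1, 30, 600)
print(config)
```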

* Creates snapshots during your training
* You can re-start from these points if necessary
* Or use them for troubleshooting, to analyze the model at different points
* Automatic synchronization with S3 (from /opt/ml/checkpoints)

Checkpointing

* Run automatically when using ml.g or ml.p instance types
* Replaces any faulty instances
* Runs GPU health checks
* Ensures the NVIDIA Collective Communications Library (NCCL) is working

Cluster Health Checks and Automatic Restarts

* You can of course run multiple training jobs in parallel
  * “job parallelism”
* Individual training jobs can also be parallelized
  * Distributed data parallelism
  * Distributed model parallelism

Distributed Training

* Network device attached to your SageMaker instances
* Makes better use of your bandwidth
* Promises performance of an on-premises High Performance Computing (HPC) cluster in the cloud

Elastic Fabric Adapter (EFA)

______________ produces a weighted average of all token embeddings. The magic is in computing the attention weights.

Self-attention

A mask can be applied to prevent tokens from “peeking” into future tokens (words)

Masked Self-Attention
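
The two self-attention cards can be sketched in plain Python. This is a deliberately simplified version: it uses the token embeddings directly as queries, keys, and values, whereas a real transformer first applies learned projection matrices. The toy embeddings are made up.

```python
import math


def softmax(scores):
    """Turn scores into weights that sum to 1 (-inf scores become weight 0)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings, causal_mask=False):
    """Return (outputs, weights); each output is a weighted average of embeddings."""
    dim = len(embeddings[0])
    outputs, all_weights = [], []
    for i, query in enumerate(embeddings):
        scores = []
        for j, key in enumerate(embeddings):
            if causal_mask and j > i:
                # Masked: token i may not "peek" at later tokens.
                scores.append(float("-inf"))
            else:
                dot = sum(q * k for q, k in zip(query, key))
                scores.append(dot / math.sqrt(dim))   # scaled dot product
        weights = softmax(scores)
        mixed = [sum(w * value[c] for w, value in zip(weights, embeddings))
                 for c in range(dim)]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
_, masked_weights = self_attention(tokens, causal_mask=True)
for row in masked_weights:
    print(row)
```

With the mask on, the first token can only attend to itself, so its weights are [1.0, 0.0, 0.0]; each later row gains one more non-zero weight.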

* Chat!
* Question answering
* Text classification
  * e.g., sentiment analysis
* Named entity recognition
* Summarization
* Translation
* Code generation
* Text generation
  * e.g., automated customer service

Applications of Transformers

* Tokenization, token encoding
* Token embedding
  * Captures semantic relationships between tokens, token similarities
* Positional encoding
  * Captures the position of the token in the input relative to other nearby tokens
  * Uses an interleaved sinusoidal function so it works on any length

LLM Input processing
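
The "interleaved sinusoidal function" can be sketched as follows, assuming the standard formulation in which even dimensions use sine and odd dimensions use cosine, with wavelengths set by a base of 10000. The embedding size and positions below are arbitrary.

```python
import math


def positional_encoding(position, d_model):
    """Sinusoidal positional encoding vector for one token position."""
    encoding = []
    for i in range(d_model):
        # Each sin/cos pair shares one frequency; frequency drops as i grows.
        angle = position / (10000 ** ((2 * (i // 2)) / d_model))
        encoding.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return encoding


print(positional_encoding(0, 4))   # [0.0, 1.0, 0.0, 1.0]
print(positional_encoding(1, 4))
```

Because the encoding is a formula rather than a lookup table, it can be evaluated for any position, which is what lets it work on sequences of any length.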

* The stack of decoders outputs a vector at the end
* Multiply this with the token embeddings
* This gives you scores (logits), which softmax converts to probabilities, of each token being the right next token (word) in the sequence

LLM Output processing

MLA-C01 Flashcards

Cert Exam Study (111 cards)