Domain 1 Flashcards

Question

The three layers of deep neural networks

Answer 1

An input layer, several hidden layers, and an output layer of nodes.

Answer 2

Image classification and natural language processing, where there is a need to identify complex relationships between data objects.

Answer 3

they don't need the relevant features given to them.

Answer 4

Identifying patterns from structured and labeled data. Examples include classification and recommendation systems.

Answer 5

Unstructured data like images, videos, and text. Tasks for deep learning include image classification and natural language processing, where there is a need to identify the complex relationships between pixels and words. Only deep learning uses neural networks to simulate human intelligence.

Answer 6

Gen AI Notes

Answer 7

Increasing business efficiency; solving complex problems; making better decisions

Answer 8

Costs outweigh benefits; models cannot meet interpretability requirements (you can't know how a neural network made a decision, so use a rules-based system instead); systems must be deterministic (produce the same output for the same input) rather than probabilistic

Answer 9

supervised learning problem

Answer 10

classification problem (supervised learning).

Answer 11

regression problem.

Answer 12

assigns an input to one of several classes based on the input attributes.

Answer 13

assigns an input to one of several classes based on the input attributes. An example is the prediction of the topic most relevant to a tax document.

Answer 14

Regression problem. Regression estimates the value of a dependent target variable based on one or more other variables.
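To make the regression idea concrete, here is a minimal sketch of a least-squares fit with a single input variable. The function name and the age and weight numbers are made up for illustration and are not from the course material.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: age in months (input) and weight in kg (target).
ages = [150.0, 160.0, 170.0, 180.0]
weights = [50.0, 58.0, 66.0, 74.0]

slope, intercept = fit_simple_linear_regression(ages, weights)
predicted_weight = slope * 175.0 + intercept  # estimate for an unseen input
```

The target is a continuous number, which is what separates regression from classification.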

Answer 15

If we have multiple independent variables, such as weight and age, then we have a multiple linear regression problem.

Answer 16

a class of techniques that are used to classify data objects into groups, called clusters. It attempts to find discrete groupings within data. Members of a group are as similar as possible to one another, and as different as possible from members of other groups.
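One common clustering technique is k-means. The card does not name a specific algorithm, so treat the following as one illustrative choice: a tiny one-dimensional version with hand-picked starting centroids and made-up points.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Very small 1-D k-means: assign points to the nearest centroid, then re-average."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its members (keep it if the cluster is empty).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centroids, clusters = kmeans_1d(points, centroids=[0.0, 5.0])
```

No labels are supplied; the groups emerge only from how close the points are to one another.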

Answer 17

Cluster analysis

Answer 18

Anomaly detection

Answer 19

Amazon Rekognition

Answer 20

Amazon Textract

Answer 21

Amazon Comprehend

Answer 22

Amazon Comprehend

Answer 23

Amazon Transcribe

Answer 24

Amazon Kendra

Answer 25

Amazon Personalize

Answer 26

Amazon Translate

Answer 27

Amazon Forecast

Answer 28

Amazon Fraud Detector

Answer 29

Identify the business goal

Answer 30

Define success criteria; align stakeholders

Answer 31

Frame the ML problem

Answer 32

Define the ML task, including inputs, outputs, and metrics; determine feasibility; start with the simplest model options; do a cost-benefit analysis

Answer 33

Start with the simplest things: AI/ML hosted services and pre-trained models. Fully customize only if needed.

Answer 34

Data sources; data ingestion, including ETL; labels

Answer 35

Gathering, transforming, and storing data in a new central location

Answer 36

Labeling, as you likely don't already have the data labeled and need to do that

Answer 37

Looking for missing data, masking PII data, cleaning it, and splitting it.

Answer 38

80% for training the model; 10% for model evaluation; 10% for final testing before production deployment
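The 80/10/10 split can be sketched in a few lines. The function name and the fixed seed are assumptions made for this example; real projects often use a library helper instead.

```python
import random


def split_dataset(records, seed=0):
    """Shuffle, then split into 80% train, 10% evaluation, 10% test."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n = len(shuffled)
    train_end = int(n * 0.8)
    eval_end = train_end + int(n * 0.1)
    return shuffled[:train_end], shuffled[train_end:eval_end], shuffled[eval_end:]


# Hypothetical dataset of 100 record IDs.
train, evaluation, test = split_dataset(range(100))
```

Shuffling before slicing keeps any ordering in the source data from leaking into one particular split.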

Answer 39

which characteristics of the dataset should be used as features to train the model. This is the subset that is relevant and contributes to minimizing the error rate of a trained model. You should reduce the features in your training data to only those that are needed for inference. Features can be combined to further reduce the number of features. Reducing the number of features reduces the amount of memory and computing power required for training.

Answer 40

Crawls source systems, discovers metadata and schemas, understands the source data. Only metadata is stored in the data catalog

Answer 41

AWS Glue DataBrew

Answer 42

Amazon SageMaker Ground Truth

Answer 43

Amazon SageMaker Canvas

Answer 44

Amazon SageMaker Feature Store is a centralized store for features and associated metadata, so features can be easily discovered and reused. Feature Store makes it easy to create, share, and manage features for ML development. Feature Store accelerates this process by reducing repetitive data processing and curation work required to convert raw data into features for training an ML algorithm. You can create workflow pipelines that convert raw data into features and add them to feature groups.

Answer 45

Parameters

Answer 46

False. This can't be done in one iteration, because the algorithm has not learned yet. It has no knowledge of how changing weights will shift the output closer toward the expected value. Therefore, it watches the weights and outputs from previous iterations, and shifts the weights in a direction that lowers the error in the generated output. This iterative process stops either when a defined number of iterations have been run, or when the change in error is below a target value.
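The iterative process and its two stopping conditions can be illustrated with plain gradient descent on a single weight. Gradient descent is one common way to shift weights toward lower error and is an assumption here, since the card does not name an algorithm; the data and parameter values are made up.

```python
def train_weight(xs, ys, learning_rate=0.01, max_iterations=1000, tolerance=1e-9):
    """Fit y = weight * x by repeatedly nudging the weight to reduce mean squared error."""
    weight = 0.0
    previous_error = float("inf")
    iterations_run = 0
    n = len(xs)
    for _ in range(max_iterations):  # stop condition 1: iteration budget
        error = sum((weight * x - y) ** 2 for x, y in zip(xs, ys)) / n
        if abs(previous_error - error) < tolerance:  # stop condition 2: error barely changed
            break
        # Gradient of the mean squared error with respect to the weight.
        gradient = sum(2 * (weight * x - y) * x for x, y in zip(xs, ys)) / n
        weight -= learning_rate * gradient  # shift the weight toward lower error
        previous_error = error
        iterations_run += 1
    return weight, iterations_run


# Hypothetical data generated from y = 2x.
weight, iterations_run = train_weight([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

The weight approaches 2 over many small steps rather than in one iteration, and the loop exits early once the error stops improving.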

Answer 47

There are usually multiple algorithms to consider for a model. The best practice is to run many training jobs in parallel, using different algorithms and settings. This is known as running experiments, which helps you land on the best-performing solution.

Answer 48

known as hyperparameters

Answer 49

the URL of the S3 bucket containing your training data. You also specify the compute resources you want to use for training, and the output bucket for the model artifacts. You specify the algorithm by giving SageMaker the path to a Docker container image that contains the training algorithm. In the Amazon Elastic Container Registry, Amazon ECR, you can specify the location of SageMaker provided algorithms and deep learning containers, or the location of your custom container, containing a custom algorithm. You also need to set the hyperparameters required by the algorithm.
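As a rough sketch, the inputs listed above could be gathered into a request dictionary like the one below. The field names are intended to mirror the SageMaker CreateTrainingJob request and should be verified against the current API reference. Every bucket, image URI, role ARN, instance type, and hyperparameter shown is a made-up placeholder, and nothing here calls AWS.

```python
# Hypothetical placeholders throughout; replace with real values before use.
training_job_request = {
    "TrainingJobName": "example-training-job",
    # Algorithm: path to a Docker image in Amazon ECR (placeholder URI).
    "AlgorithmSpecification": {
        "TrainingImage": "123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algorithm:latest",
        "TrainingInputMode": "File",
    },
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    # Training data location in S3.
    "InputDataConfig": [
        {
            "ChannelName": "train",
            "DataSource": {
                "S3DataSource": {
                    "S3DataType": "S3Prefix",
                    "S3Uri": "s3://example-bucket/training-data/",
                }
            },
        }
    ],
    # Output bucket for the model artifacts.
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/model-artifacts/"},
    # Compute resources used for training.
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 10,
    },
    # Hyperparameters required by the algorithm, passed as strings.
    "HyperParameters": {"epochs": "10", "learning_rate": "0.1"},
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}
```

Each top-level key lines up with one item from the card: data location, compute resources, output bucket, algorithm image, and hyperparameters.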

Answer 50

Amazon SageMaker Experiments

Answer 51

also known as hyperparameter tuning, finds the best version of a model by running many training jobs on your dataset. To do this, AMT uses the algorithm and ranges of hyperparameters that you specify. It then chooses the hyperparameter values that create a model that performs best, as measured by a metric that you choose.
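The idea of trying many hyperparameter combinations and keeping the best one can be sketched with an exhaustive grid search. This is a conceptual illustration only and is not how AMT is invoked. The `validation_score` function is a hypothetical stand-in for "train a model and measure the chosen metric", and the hyperparameter names and ranges are invented.

```python
import itertools


def validation_score(learning_rate, depth):
    """Hypothetical stand-in for training a model and returning a metric (higher is better)."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2


# Ranges of hyperparameters to explore (made-up values).
search_space = {"learning_rate": [0.01, 0.1, 0.5], "depth": [2, 4, 8]}

# One candidate configuration per combination of values.
candidates = [
    dict(zip(search_space, values))
    for values in itertools.product(*search_space.values())
]

# Each candidate corresponds to one training job; keep the best-scoring one.
best = max(candidates, key=lambda params: validation_score(**params))
```

A managed tuner typically searches more cleverly than a full grid, but the contract is the same: you supply ranges and a metric, and it returns the best-performing values.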

Answer 52

Batch inference

Answer 53

Batch inference; real-time inference; self-managed; hosted (SageMaker inference)

Answer 54

Batch transform (offline inference, large datasets); asynchronous (long processing times, large payloads); serverless (intermittent traffic, periods of no traffic); real-time (live predictions, sustained traffic, low latency, consistent performance)

Answer 55

Amazon SageMaker Model Monitor

Answer 56

Infrastructure as code (IaC); rapid experimentation; version control; active performance monitoring; automatic model retraining and validation when there are data and code changes

Answer 57

Productivity; repeatability; reliability; auditability; data and model quality

Answer 58

Amazon SageMaker Model Building Pipelines

Answer 59

CodeCommit; SageMaker Model Registry; SageMaker Feature Store; third party

Answer 60

SageMaker Pipelines; Amazon Managed Workflows for Apache Airflow; AWS Step Functions; third party

Answer 61

A confusion matrix is a table with the actual values typically across the top and the predicted values on the left. It is used to summarize the performance of a classification model when it's evaluated against test data.

Answer 62

which is simply the percentage of correct predictions
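The four cells of a binary confusion matrix, and accuracy computed from them, can be sketched as follows. The labels and predictions are invented for illustration, and "fish" is treated as the positive class.

```python
def confusion_counts(actual, predicted, positive="fish"):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # predicted fish, actually fish
        elif p == positive:
            fp += 1  # predicted fish, actually not fish
        elif a == positive:
            fn += 1  # predicted not fish, actually fish
        else:
            tn += 1  # predicted not fish, actually not fish
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Hypothetical test-set labels and model predictions.
actual = ["fish", "fish", "fish", "not fish", "not fish", "not fish", "not fish", "fish"]
predicted = ["fish", "fish", "not fish", "fish", "not fish", "not fish", "not fish", "fish"]

counts = confusion_counts(actual, predicted)
# Accuracy: correct predictions (TP + TN) over all predictions.
accuracy = (counts["tp"] + counts["tn"]) / len(actual)
```

Here 6 of the 8 predictions are correct, so accuracy is 0.75.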

Answer 63

Precision measures how well an algorithm predicts true positives out of all the positives that it identifies. The formula is the number of true positives divided by the number of true positives, plus the number of false positives.

Answer 64

If we want to minimize the false negatives, then we can use a metric known as recall. For example, we want to make sure we don't miss a case where someone has a disease and we say they don't. The formula is the number of true positives divided by the number of true positives plus the number of false negatives.
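The precision and recall formulas from the two cards above translate directly into code. The counts below are hypothetical.

```python
def precision(tp, fp):
    """True positives divided by everything the model labeled positive."""
    return tp / (tp + fp)


def recall(tp, fn):
    """True positives divided by everything that actually was positive."""
    return tp / (tp + fn)


# Hypothetical counts from a confusion matrix.
tp, fp, fn = 8, 2, 4

model_precision = precision(tp, fp)  # 8 / (8 + 2)
model_recall = recall(tp, fn)        # 8 / (8 + 4)
```

The two metrics share a numerator and differ only in which kind of error appears in the denominator: false positives for precision, false negatives for recall.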

Answer 65

No, but you can use F1

Answer 66

Combines recall and precision into one figure, allowing you to optimize for both.
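F1 is the harmonic mean of precision and recall. A minimal sketch, with made-up input values and a guard for the case where both are zero:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical values: a model that is precise but misses half the positives.
f1 = f1_score(precision=0.8, recall=0.5)
```

Because it is a harmonic mean, F1 is pulled toward the lower of the two values, so a model cannot score well by maximizing only one of them.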

Answer 67

which is the false positives divided by the sum of the false positives and true negatives. In our example, this metric shows us how the model is handling the images that are not fish. It is a measure of how many of the predictions were of fish out of the images that were not fish.

Answer 68

Closely related to the false positive rate is the true negative rate, which is the ratio of the true negatives to the sum of the false positives and true negatives. It is a measure of how many of the predictions were of not fish out of the images that were not fish.
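The false positive rate and true negative rate share a denominator, so they always sum to 1. A short sketch with hypothetical counts for the images that are not fish:

```python
def false_positive_rate(fp, tn):
    """Share of actual negatives that were wrongly predicted positive."""
    return fp / (fp + tn)


def true_negative_rate(fp, tn):
    """Share of actual negatives that were correctly predicted negative."""
    return tn / (fp + tn)


# Hypothetical counts: 10 images are not fish; 2 were predicted as fish.
fp, tn = 2, 8

fpr = false_positive_rate(fp, tn)  # 2 / 10
tnr = true_negative_rate(fp, tn)   # 8 / 10
```

Knowing one rate is enough to recover the other, since `tnr == 1 - fpr`.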

(98 cards)