AWS Services Flashcards

1
Q

Amazon Athena

A

Analytics

Use SQL to query data in S3; query results are saved to S3
Can use for preprocessing, feature engineering
Less performant than a data warehouse, but more convenient (serverless, nothing to provision)
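
A minimal sketch of the "query S3, save output to S3" flow, assuming boto3 is installed and AWS credentials are configured. The table, database, and bucket names are hypothetical placeholders. The request is built separately so it can be inspected without calling AWS.

```python
def build_athena_query_request(sql, database, output_s3_uri):
    """Build keyword arguments for boto3's Athena start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        # Athena writes the result files to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def run_query(request):
    """Submit the query; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so building the request needs no SDK

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
request = build_athena_query_request(
    sql="SELECT label, COUNT(*) AS n FROM training_events GROUP BY label",
    database="ml_features",
    output_s3_uri="s3://example-bucket/athena-results/",
)
```

Calling run_query(request) would start the query asynchronously; the results land under the output location once it finishes.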

2
Q

Amazon Elastic MapReduce

EMR

A

Analytics

Distributed data processing
Massive parallel compute tasks
Master node manages the cluster; core nodes (scalable) run tasks and store data in HDFS; task nodes (scalable, optional) run tasks only
Apache Spark - fast analytics engine, can run on EMR or SageMaker

3
Q
Amazon Kinesis
(basic functionality and four services)
A

Analytics

Ingest and process large-scale streaming data in real time, highly scalable

Amazon Kinesis Data Analytics
Amazon Kinesis Data Firehose
Amazon Kinesis Data Streams
Amazon Kinesis Video Streams
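
A minimal sketch of ingesting one record into Kinesis Data Streams, assuming boto3 is installed and credentials are configured. The stream name and event fields are hypothetical. The partition key decides which shard receives the record.

```python
import json


def build_put_record_request(stream_name, event, partition_key):
    """Build keyword arguments for boto3's Kinesis put_record call."""
    return {
        "StreamName": stream_name,
        # Kinesis stores the payload as bytes.
        "Data": json.dumps(event).encode("utf-8"),
        "PartitionKey": partition_key,
    }


def send(request):
    """Send the record; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so building the request needs no SDK

    response = boto3.client("kinesis").put_record(**request)
    return response["SequenceNumber"]


# Hypothetical stream and event, for illustration only.
record = build_put_record_request(
    stream_name="example-clickstream",
    event={"user_id": "u-123", "action": "view"},
    partition_key="u-123",
)
```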

4
Q

Amazon QuickSight

A

Analytics

Business intelligence (BI) tool
Reporting and data visualization

5
Q

AWS Batch

A

Compute

Dynamically provisions compute resources for your batch jobs
(EC2, Fargate, Spot Instances, etc.)

6
Q

Amazon Elastic Compute Cloud

EC2

A

Compute

Scalable compute instances

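Amazon Machine Image (AMI) - template used to launch instances; Deep Learning AMIs ship with conda environments, ML libraries, and GPU drivers

Instance types for ML: compute optimized or accelerated computing (GPU)
GPU: p2 (ml.p2 in SageMaker)
CPU recommended: m4 or c4 (ml.m4 or ml.c4 in SageMaker)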

7
Q

Amazon Elastic Container Registry

ECR

A

Containers

Managed container image registry

8
Q

Amazon Elastic Container Service

ECS

A

Containers

Run, scale, and manage containers on a cluster (fully managed container orchestration)

9
Q

Amazon Elastic Kubernetes Service

EKS

A

Containers

Managed Kubernetes for deploying and managing containers at scale

10
Q

AWS Glue

A

Database

Data integration and ETL; crawlers scan data in S3 to infer its schema, which is stored in the Data Catalog

Easy to set up and run with minimal effort
ETL scripts in Python or Scala
Job Systems - managed infrastructure for ETL workflows
Crawlers and Classifiers - scan data, classify, extract schema info, store metadata
Data Catalog - store, annotate, and share metadata
ETL operations - auto generate ETL scripts based on metadata

11
Q

Amazon Redshift

A

Database

Data warehouse

12
Q

AWS IoT Greengrass

A

Internet of Things

Build, deploy, and manage device software at the edge
Control IoT fleet

13
Q

AWS CloudTrail

A

Management and Governance

Logs API calls and account activity made through the AWS console, CLI, and SDKs

14
Q

Amazon CloudWatch

A

Management and Governance

Monitor resources and applications: usage metrics, logs, and alarms

15
Q

Amazon Virtual Private Cloud

VPC

A

Networking and Content Delivery

Manage virtual network

16
Q

AWS Identity and Access Management

IAM

A

Security, Identity, and Compliance

Control who can access which AWS resources (users, groups, roles, policies)
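
As an illustration of how access is expressed, here is a sketch of an IAM policy document granting read-only access to one S3 bucket. The bucket name and the helper function are hypothetical; the JSON layout follows the standard IAM policy grammar.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Return an IAM policy document allowing list and read on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # bucket itself (ListBucket)
                    f"arn:aws:s3:::{bucket_name}/*",  # objects in it (GetObject)
                ],
            }
        ],
    }


# Hypothetical bucket name, for illustration only.
policy = read_only_bucket_policy("example-training-data")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON could be attached to a user, group, or role; anything not explicitly allowed stays denied by default.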

17
Q

AWS Fargate

A

Serverless

Run containers without having to manually manage underlying resources

18
Q

AWS Lambda

A

Serverless

Run code in response to events without provisioning servers, on high-availability compute infrastructure
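
A minimal sketch of a Python Lambda handler. Lambda calls the function with the event payload and a runtime context object. The "features" field is a hypothetical input shape chosen for illustration.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda invokes for each event."""
    features = event.get("features", [])
    return {
        "statusCode": 200,
        "body": json.dumps({"feature_count": len(features)}),
    }


# Local smoke test; in Lambda the service supplies event and context.
result = lambda_handler({"features": [0.1, 0.2, 0.3]}, None)
```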

19
Q

Amazon Elastic File System

EFS

A

Storage

Managed file system that grows and shrinks automatically as you add and delete files

Mount on EC2 instances, Lambda functions, or containers

20
Q

Amazon Elastic Block Store

EBS

A

Storage

Scalable, high-performance block storage for EC2 instances
Breaks data into blocks and stores them as separate pieces
Best for data that changes often and needs low-latency access (e.g., databases, boot volumes)

21
Q

Amazon FSx

A

Storage

High performance and throughput
Fully managed file systems, including Windows File Server and Lustre

22
Q

Amazon S3

Simple Storage Service

A

Storage

Store any data: structured, unstructured, or anything in between
Commonly used as a data lake
Security: IAM users, bucket policies; encryption: server-side encryption, AWS Key Management Service (KMS)
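
A minimal sketch of server-side encryption with a KMS key on upload, assuming boto3 is installed and credentials are configured. The bucket, object key, and KMS key alias are hypothetical placeholders.

```python
def build_encrypted_put_request(bucket, key, body, kms_key_id):
    """Build keyword arguments for boto3's S3 put_object call with SSE-KMS."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Ask S3 to encrypt the object at rest using a KMS key.
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


def upload(request):
    """Upload the object; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so building the request needs no SDK

    boto3.client("s3").put_object(**request)


# Hypothetical names, for illustration only.
put_request = build_encrypted_put_request(
    bucket="example-data-lake",
    key="raw/2024/events.csv",
    body=b"id,label\n1,cat\n",
    kms_key_id="alias/example-key",
)
```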

23
Q

Amazon Mechanical Turk

A

Crowdsourced workforce for labeling jobs

24
Q

AWS Database Migration Service

A

Migrate databases to AWS, for example from on-premises

25
Q

Elastic Inference Accelerator

A

attach to EC2 / SageMaker / Deep Learning Containers

accelerates deep learning inference workloads