Collection Flashcards by Kevin Cheung

KDS

Retention : 1-365 days (default 24 hours)
Record = partition key + data blob (up to 1MB)
Provisioned mode
- IN : 1MB or 1000 records per shard per sec
- OUT : 2MB per shard per sec
On-demand mode
- Default capacity of 4MB or 4000 records per second
- Scales automatically based on the observed throughput peak during the last 30 days
Replicates to 3 AZ
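
The provisioned limits above can be turned into a quick shard estimate. This is a minimal sketch, not an official sizing formula: the function name `required_shards` is hypothetical, the MB/s figures come from this card, and the 1000 records/s write limit is the per-shard quota assumed here.

```python
import math

# Per-shard limits in provisioned mode (MB/s values from the card above;
# the records/s write limit is an assumption stated in the lead-in).
SHARD_IN_MB_PER_SEC = 1.0
SHARD_IN_RECORDS_PER_SEC = 1000
SHARD_OUT_MB_PER_SEC = 2.0


def required_shards(in_mb_per_sec, in_records_per_sec, out_mb_per_sec):
    """Return the smallest shard count that satisfies all three limits."""
    by_in_bytes = math.ceil(in_mb_per_sec / SHARD_IN_MB_PER_SEC)
    by_in_records = math.ceil(in_records_per_sec / SHARD_IN_RECORDS_PER_SEC)
    by_out_bytes = math.ceil(out_mb_per_sec / SHARD_OUT_MB_PER_SEC)
    # A stream always has at least one shard.
    return max(1, by_in_bytes, by_in_records, by_out_bytes)
```

For example, 5MB/s in, 2000 records/s in and 12MB/s out is bounded by the read side, so the estimate is 6 shards.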

Kinesis Producer SDK

Use Cases : Supports multiple programming languages (AWS SDK)
PutRecord (one record) vs PutRecords (many records)
PutRecords uses batching and increases throughput
ProvisionedThroughputExceededException
- Solution : Retry with backoff, increase the number of shards, and choose a well-distributed partition key
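
The "retries with backoff" solution can be sketched as follows. This is an illustration only: `ThroughputExceeded` is a hypothetical stand-in for the SDK's throttling error, `call_with_backoff` is a made-up helper name, and the delay values are assumptions, not recommended settings.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for ProvisionedThroughputExceededException."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Call operation(); on ThroughputExceeded, wait and retry.

    The wait is a random fraction (jitter) of a cap that doubles
    with every failed attempt, up to max_delay seconds.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

Passing `sleep` and `rng` as parameters keeps the helper easy to test; in real use the defaults would be left alone and `operation` would wrap the PutRecord or PutRecords call.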

Kinesis Producer Library

Use Cases : High performance and long-running producers
Synchronous and Asynchronous API
Batching (collection + aggregation) helps make full use of the 1MB/s or 1000 records/s per-shard limit
Compression must be implemented by users
KPL records must be decoded with KCL or a special helper library
RecordMaxBufferedTime : 100ms by default

Kinesis Agent

Use Cases : Monitor log files and send them to KDS
Built on top of KPL
Features
- Writes from multiple directories to multiple Kinesis streams
- Preprocesses data before sending
- Handles file rotation, checkpointing and retry
- Emits metrics to CloudWatch for monitoring

Kinesis Consumer SDK

2MB per shard per second, shared across all consumers
GetRecords returns up to 10MB (then throttles for 5 seconds) or up to 10,000 records per call
Max 5 GetRecords API calls per shard per second
200ms latency
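
The 200ms figure follows from the limit of 5 GetRecords calls per shard per second. A small worked calculation, assuming the shard's calls and its 2MB/s are split evenly between consumers (the function name is hypothetical):

```python
def classic_consumer_share(num_consumers):
    """Per-consumer polling interval and throughput on one shard.

    Assumes the 5 GetRecords calls/s and 2MB/s of a shard are
    divided evenly between the consumers polling it.
    """
    calls_per_sec = 5 / num_consumers
    return {
        "poll_interval_ms": 1000 / calls_per_sec,
        "mb_per_sec": 2 / num_consumers,
    }
```

One consumer can poll every 200ms at 2MB/s; with five consumers each one polls about once per second at 0.4MB/s.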

Kinesis Client Library

Reads records from Kinesis, including records produced by KPL
Shards are shared among multiple consumers in one group
Checkpointing feature to resume progress
Leverages DynamoDB for checkpointing
- Make sure to provision enough WCU / RCU
- Or use on-demand capacity; otherwise DynamoDB throttling will slow down KCL
ExpiredIteratorException
- Solution : increase WCU

Kinesis Connector Library

S3
DynamoDB
Redshift
ElasticSearch

Kinesis and Lambda

Lambda can source records from KDS
The Lambda consumer has a library to de-aggregate records produced by the KPL

Kinesis Enhanced Fan Out

2MB per shard per second for each consumer
Kinesis pushes data to consumers over HTTP/2
70ms latency
Default limit of 5 consumers using enhanced fan out per data stream
Uses the SubscribeToShard API

Auto Scaling

API call to change the number of shards is UpdateShardCount
We can implement AutoScaling with AWS Lambda
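
A Lambda-based scaler mostly needs to decide the next shard count. The sketch below is one possible decision rule, not an AWS-provided algorithm: the function name, the 1MB/s per-shard write figure and the `headroom` factor are assumptions. It clamps the result because a single UpdateShardCount call cannot more than double or more than halve the current count.

```python
import math


def target_shard_count(current_shards, incoming_mb_per_sec,
                       mb_per_shard=1.0, headroom=0.75):
    """Pick the next shard count for a stream.

    headroom is a hypothetical tuning knob: aim to run each shard
    at 75% of its write limit.
    """
    desired = max(1, math.ceil(incoming_mb_per_sec / (mb_per_shard * headroom)))
    upper = current_shards * 2                     # at most double per call
    lower = max(1, math.ceil(current_shards / 2))  # at most halve per call
    return min(upper, max(lower, desired))
```

The Lambda function would then pass the result as the target shard count to UpdateShardCount, typically triggered by a CloudWatch alarm on the stream's incoming bytes.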

KDS Security

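EIF (encryption in flight) : SSL/TLS (HTTPS endpoints)
EAR (encryption at rest) : KMS
VPC endpoints for private access from within a VPC
KCL -> must be granted read and write access to its DynamoDB table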

Kinesis Data Firehose

Fully managed
Near real time (60 sec latency)
Auto scaling
Spark / KCL cannot read from KDF
Destinations : S3, Splunk, Redshift, Elasticsearch
Record size up to 1MB
Replicates records to 3 AZ
Retention 24 hours

KDF Buffer

Buffer interval (time) : 2 mins
Buffer size : 32MB
The buffer is flushed when either limit is reached first
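
A Firehose buffer is delivered when either its size or its time limit is hit, whichever comes first. A tiny sketch of that rule, using the two values from this card as defaults (the function name and parameter names are hypothetical):

```python
def should_flush(buffered_bytes, seconds_since_last_flush,
                 size_limit_bytes=32 * 1024 * 1024, interval_seconds=120):
    """Return True when either buffer limit has been reached."""
    return (buffered_bytes >= size_limit_bytes
            or seconds_since_last_flush >= interval_seconds)
```

A low-volume stream therefore flushes on the time limit, while a high-volume stream flushes on the size limit well before 2 minutes have passed.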

SQS Standard

Fully managed
1-14 days retention
Less than 10ms latency on publish and receive
256KB max per message (body + metadata)
Horizontal scaling in terms of number of consumers
Max 120,000 in-flight messages being processed by consumers

SQS Producing Messages

Optional delivery delay
The response returns
- message ID
- MD5 hash of the body
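
The returned MD5 hash lets a producer confirm that SQS received the body intact. A minimal sketch, assuming the response is a dict shaped like the SendMessage result with `MessageId` and `MD5OfMessageBody` keys; the helper names are hypothetical.

```python
import hashlib


def body_md5(message_body):
    """Hex MD5 digest of the UTF-8 encoded message body."""
    return hashlib.md5(message_body.encode("utf-8")).hexdigest()


def verify_send_response(message_body, response):
    """Compare the locally computed digest with the one in the response."""
    return response.get("MD5OfMessageBody") == body_md5(message_body)
```

A mismatch would indicate the body was corrupted in transit, in which case the message can be sent again.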

SQS Consuming Messages

Poll up to 10 messages at a time
Process each message within the visibility timeout
Delete the message using its receipt handle
Max 120,000 in-flight messages being processed by consumers

SQS FIFO Queue

Name of queue must end in .fifo
Lower throughput (up to 3,000 messages per second with batching and 300 per second without; high-throughput mode raises these limits)
Messages are processed in order by the consumer
Messages are sent exactly once (duplicates are removed)

SQS Security

EIF : HTTPS
EAR : KMS
IAM policies control access to the SQS API

IoT Device Gateway

Serves as the entry point for IoT devices connecting to AWS
Supports MQTT, WebSockets and HTTP 1.1 protocols
Fully managed
Scales automatically to support over 1 billion Things

IoT Message Broker

Pub/Sub pattern with low latency
Messages are sent using MQTT, WebSockets and HTTP 1.1
Messages are published to topics
The message broker forwards messages to all clients connected to the topic

IoT Authentication

3 authN methods for Things
- X.509 certificates
- AWS SigV4
- Custom tokens with custom authorizers
For mobile
- Cognito Identities
Web / Desktop / CLI
- IAM
- Federated Identities

IoT Authorization

AWS IoT Policies
- Attached to X.509 certificates or Cognito Identities
- Able to revoke any device at any time
- IoT Policies are JSON documents
- Can be attached to groups instead of individual Things
IAM Policies
- Attached to users, groups or roles
- Used for controlling access to the AWS IoT APIs

IoT Device Shadow

JSON doc representing the state of a connected Thing
IoT Thing will retrieve the state when online and adapt
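
How a Thing "adapts" can be illustrated by comparing what the shadow wants with what the device last reported. This sketch assumes a shadow document with `state.desired` and `state.reported` sections and only handles flat (non-nested) attributes; the function name and the sample attributes are hypothetical.

```python
def shadow_delta(shadow):
    """Return the desired attributes the device has not yet applied."""
    state = shadow.get("state", {})
    desired = state.get("desired", {})
    reported = state.get("reported", {})
    return {
        key: value
        for key, value in desired.items()
        if key not in reported or reported[key] != value
    }


# Hypothetical shadow of a connected light bulb.
example_shadow = {
    "state": {
        "desired": {"light": "on", "color": "red"},
        "reported": {"light": "off", "color": "red"},
    }
}
```

Here the device would only need to switch the light on, since the color already matches.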

IoT Rules Engine

Rules are defined on the MQTT topics
Rule = when it is triggered, Action = what it does
Use Cases
- Augment or filter data received from a device
- Write data received from a device to a DynamoDB table
- Save a file to S3
Rules need IAM roles to perform their actions

IoT Greengrass

IoT Greengrass brings the compute layer directly to the device
We can execute AWS Lambda functions on the devices
Operates offline
Deploy functions from the cloud directly to the devices

Database Migration Service

Homogeneous and heterogeneous migrations
Continuous data replication using Change Data Capture
Requires an EC2 instance to perform the replication tasks

Database Migration Service Schema Conversion Tool

Prefer compute-intensive instances to optimize data conversions

Direct Connect (DX)

Provides a dedicated private connection from a remote network to your VPC
Requires a Virtual Private Gateway on your VPC
Use Cases
- Increased bandwidth throughput
- Consistent network experience
- Hybrid environments
Supports IPv4 and IPv6
If DX connects to one or more VPCs in different regions, use a Direct Connect Gateway

Direct Connect Connection Types

Dedicated
- 1, 10 or 100 Gbps
- Physical Ethernet port dedicated to a customer
Hosted
- 50 Mbps, 500 Mbps, up to 10 Gbps
- Capacity can be added or removed on demand
Lead times are often longer than 1 month to establish a new connection

Direct Connect Encryption

In transit : not encrypted, but private
AWS Direct Connect + VPN provides an IPsec-encrypted private connection

Direct Connect Max Resiliency

Multiple DX connections (on separate devices) in each of multiple locations

Snowcone

Light, portable device for edge computing, storage and data transfer
8TB of storage
Connect it to the internet and use AWS DataSync to send data

Snowball Edge

Storage Optimized
- 80TB
Compute Optimized
- 42TB
Provides block storage and S3-compatible object storage

Snowmobile

100PB of capacity
High security : temperature controlled, GPS, 24/7 video surveillance
Better than Snowball if transferring more than 10PB

AWS OpsHub

- Manage your Snow Family Device

Amazon Managed Streaming for Apache Kafka (MSK)

Fully managed
Data stored on EBS volumes
Message size : 1MB by default, configurable higher (e.g. 10MB)
Choose the number of AZ
Choose the VPC and subnets
Choose the broker instance type
Choose the size of the EBS volumes
Durability & Availability
- Ensure the replication factor (RF) is at least 2 for 2 AZ clusters and at least 3 for 3 AZ clusters
- Set minimum in-sync replicas (minISR) to at most RF-1
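
The two durability rules on this card can be written as a quick configuration check. This is a sketch only: the function name is hypothetical, and the "RF at least equal to the number of AZs" test simply generalises the 2 AZ and 3 AZ cases stated here.

```python
def check_durability(num_azs, replication_factor, min_insync_replicas):
    """Return a list of rule violations (empty when the settings pass)."""
    problems = []
    if replication_factor < num_azs:
        problems.append(
            f"RF {replication_factor} is below {num_azs} for a {num_azs} AZ cluster"
        )
    if min_insync_replicas > replication_factor - 1:
        problems.append(
            f"minISR {min_insync_replicas} exceeds RF-1 ({replication_factor - 1})"
        )
    if min_insync_replicas < 1:
        problems.append("minISR must be at least 1")
    return problems
```

For a 3 AZ cluster, RF 3 with minISR 2 passes, while RF 2 with minISR 2 breaks both rules.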

MSK Security

EIF : TLS
EAR : KMS
AuthN and AuthZ
- Mutual TLS + Kafka ACLs
- SASL / SCRAM + Kafka ACLs
- IAM

MSK Monitoring

CloudWatch Metrics
- Basic monitoring, enhanced monitoring, topic level monitoring
Prometheus
Broker log delivery
- To CloudWatch Logs, S3, KDF

MSK Connect

Managed Kafka Connect workers
Auto-scaling capabilities
Deploy any Kafka Connect connector to MSK as a plugin

KDS > SQS

Ability for multiple applications to consume the same stream concurrently
Ability to consume records in the same order a few hours later

KDF Sources

KDF API
KDS
Other AWS Services
Kinesis Agent
AWS Lambda

KDF + Lambda Transformation

All transformed records from Lambda must be returned to Firehose with the following 3 parameters
- recordId
- result
- data
Enable source record backup and KDF will deliver the un-transformed incoming data to a separate S3 bucket
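
The three required parameters can be seen in a minimal transformation handler. This is a sketch under assumptions: incoming records are taken to be base64-encoded JSON objects, and the `processed` flag is a made-up transformation for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Return every incoming record with recordId, result and data."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical transformation
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")
            ).decode("utf-8")
            output.append({"recordId": record["recordId"],
                           "result": "Ok",
                           "data": encoded})
        except (ValueError, TypeError):
            # Not valid base64/JSON or not a JSON object: hand the
            # original data back and mark the record as failed.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
    return {"records": output}
```

Each output record reuses the incoming `recordId`, and `result` is one of `Ok`, `Dropped` or `ProcessingFailed`.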

Kinesis Video Streams

Fully managed
Service for media ingestion, storage and processing
Use Cases
- Smart Home : Stream video and audio from camera-equipped home devices
- Smart City
- Industrial Automation
Integrates with ML frameworks

Kinesis Video Stream Concepts

Video Stream
- Resource that enables you to capture live video and other time-encoded data
Fragment
- Self-contained sequence of media frames
Chunk
- KVS stores video in chunks

Collection Flashcards

(44 cards)