Collection Flashcards

(44 cards)

1
Q

KDS

A
  • Retention 1-365 days
  • Record = Partition Key + Data Blob (up to 1MB)
  • Provisioned
    • IN : 1MB per shard per sec
    • OUT : 2MB per shard per sec
  • On-demand
    • default capacity of 4MB or 4000 records per second
    • scales automatically based on the observed throughput peak during the last 30 days
  • replicates to 3 AZ
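
A rough sizing sketch for provisioned mode, using the per-shard limits on this card plus the 1,000 records/s per-shard write limit (not listed on this card). The function name `shards_needed` is made up for illustration.

```python
import math


def shards_needed(in_mb_per_s: float, in_records_per_s: float, out_mb_per_s: float) -> int:
    """Smallest shard count that satisfies all three provisioned-mode limits."""
    by_ingest_bytes = math.ceil(in_mb_per_s / 1.0)          # 1 MB/s in per shard
    by_ingest_records = math.ceil(in_records_per_s / 1000)  # 1,000 records/s in per shard
    by_egress_bytes = math.ceil(out_mb_per_s / 2.0)         # 2 MB/s out per shard
    return max(1, by_ingest_bytes, by_ingest_records, by_egress_bytes)


# 3.5 MB/s in, 2,500 records/s in, 9 MB/s out: egress is the bottleneck.
print(shards_needed(3.5, 2500, 9))  # 5
```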
2
Q

Kinesis Producer SDK

A
  • Use Cases : Support multiple programming languages
  • PutRecord vs PutRecords
  • PutRecords uses batching and increases throughput
  • ProvisionedThroughputExceededException
    • Solution : Retries with backoff, increase # shards and choose a well-distributed partition key
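
A minimal sketch of batching with PutRecords and retrying only the failed entries with exponential backoff. It assumes `client` behaves like `boto3.client("kinesis")`; the helper names `chunk` and `put_with_backoff` and the default delays are illustrative choices, not part of any AWS SDK.

```python
import random
import time

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def chunk(records, size=MAX_RECORDS_PER_CALL):
    """Yield successive slices of at most `size` records."""
    for i in range(0, len(records), size):
        yield records[i:i + size]


def put_with_backoff(client, stream_name, records,
                     max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Send records via PutRecords, retrying only the failed entries."""
    for batch in chunk(records):
        pending = batch
        for attempt in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=pending)
            if response["FailedRecordCount"] == 0:
                pending = []
                break
            # Result entries line up with request entries; failed ones carry ErrorCode.
            pending = [rec for rec, res in zip(pending, response["Records"])
                       if "ErrorCode" in res]
            # Exponential backoff with a little jitter before the next attempt.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
        if pending:
            raise RuntimeError(
                f"{len(pending)} records still failing after {max_attempts} attempts")
```

Each record is a dict such as `{"Data": b"...", "PartitionKey": "device-42"}`. If the same partition key keeps failing, the shard is hot and backoff alone will not help.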
3
Q

Kinesis Producer Library

A
  • Use Cases : High performance and long-running producers
  • Synchronous and Asynchronous API
  • Batching –> 1MB/s or 1000 records /s
  • Compression must be implemented by users
  • KPL records must be decoded with KCL or special helper library
  • RecordMaxBufferedTime 100ms
4
Q

Kinesis Agent

A
  • Use Cases : Monitor log files and send them to KDS
  • On top of KPL
  • Features
    • write from multiple directories to multiple kinesis streams
    • preprocess data before sending
    • Able to handle file rotation, checkpointing and retry
    • Emit metrics to CloudWatch for monitoring
5
Q

Kinesis Consumer SDK

A
  • 2MB per shard per second
  • GetRecords returns up to 10MB (then throttles for 5 seconds) or up to 10,000 records per call
  • Max 5 GetRecords API calls per shard per second
  • 200ms latency
6
Q

Kinesis Client Library

A
  • Read records from Kinesis produced by KPL
  • Share multiple shards among multiple consumers in one group
  • Checkpointing feature to resume progress
  • Leverage DynamoDB for checkpointing
    • Make sure to provision enough WCU / RCU
    • Or use on-demand capacity for the DynamoDB table; a throttled table slows down KCL
  • ExpiredIteratorException
    • Solution : increase WCU
7
Q

Kinesis Connector Library

A
  • S3
  • DynamoDB
  • Redshift
  • ElasticSearch
8
Q

Kinesis and Lambda

A
  • Lambda can source records from KDS
  • Lambda consumer has a library to de-aggregate records produced by the KPL
9
Q

Kinesis Enhanced Fan Out

A
  • 2MB /consumer /sec /shard
  • Kinesis pushes data to consumer over HTTP2
  • 70 ms latency
  • Default limit of 5 consumers using enhanced fan out per data stream
  • Use SubscribeToShard API
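
A small arithmetic sketch comparing the classic (shared) consumer model from the Consumer SDK card with enhanced fan-out, per shard. Function names are made up for illustration, and the poll-interval figure assumes consumers split the 5 GetRecords calls per second evenly.

```python
def classic_mb_per_consumer(consumers: int) -> float:
    """Classic consumers share one 2 MB/s read budget per shard."""
    return 2.0 / consumers


def classic_poll_interval_ms(consumers: int) -> float:
    """5 GetRecords calls/s per shard, split evenly across consumers."""
    return consumers * 1000 / 5


def enhanced_mb_per_consumer(consumers: int) -> float:
    """Each enhanced fan-out consumer gets its own 2 MB/s per shard."""
    return 2.0


for n in (1, 2, 5):
    print(n, classic_mb_per_consumer(n), classic_poll_interval_ms(n),
          enhanced_mb_per_consumer(n))
```

With 5 classic consumers on one shard, each gets about 0.4 MB/s and can poll roughly once per second, which is why enhanced fan-out is the usual choice once several applications read the same stream.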
10
Q

Auto Scaling

A
  • API call to change the number of shards is UpdateShardCount
  • We can implement AutoScaling with AWS Lambda
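
A hedged sketch of the scaling step a Lambda function might perform. It assumes `client` behaves like `boto3.client("kinesis")`. The clamp reflects the documented rule that a single UpdateShardCount call cannot go above double or below half of the current shard count; verify that against current quotas. `clamp_target` and `scale_stream` are illustrative names.

```python
def clamp_target(current: int, desired: int) -> int:
    """Keep the target within [ceil(current / 2), current * 2]."""
    lower = max(1, -(-current // 2))  # ceil(current / 2) without importing math
    upper = current * 2
    return min(max(desired, lower), upper)


def scale_stream(client, stream_name: str, current: int, desired: int) -> int:
    """Call UpdateShardCount once, moving as far toward `desired` as allowed."""
    target = clamp_target(current, desired)
    if target != current:
        client.update_shard_count(
            StreamName=stream_name,
            TargetShardCount=target,
            ScalingType="UNIFORM_SCALING",
        )
    return target
```

A large jump (for example 4 to 20 shards) would take several invocations under this assumption: 4 to 8, 8 to 16, 16 to 20.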
11
Q

KDS Security

A
  • EIF : SSL
  • EAR : KMS
  • VPC
  • KCL –> grant read and write access to DynamoDB table
12
Q

Kinesis Data Firehose

A
  • Fully managed
  • Near real time (60 sec latency)
  • Auto scaling
  • Spark / KCL do not read from KDF
  • Destination : s3, Splunk, Redshift, ElasticSearch
  • Record Size 1MB
  • Replicates records to 3 AZ
  • Retention 24 hours
13
Q

KDF Buffer

A
  • 2 mins
  • 32MB
14
Q

SQS Standard

A
  • Fully managed
  • 1-14 days retention
  • 10ms latency
  • 256KB msg body + metadata
  • Horizontal scaling in terms of the number of consumers
  • Max 120,000 in-flight messages being processed by consumers
15
Q

SQS Producing Messages

A
  • Can set an optional delivery delay
  • Response returns
    • msg id
    • md5 hash of the body
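
A small sketch of how a producer could check the returned MD5 against the body it sent. `MD5OfMessageBody` is the field name in the SendMessage response; the helper name `body_matches_md5` is made up for illustration.

```python
import hashlib


def body_matches_md5(body: str, md5_of_message_body: str) -> bool:
    """Compare the local MD5 of the UTF-8 body with the value SQS returned."""
    local = hashlib.md5(body.encode("utf-8")).hexdigest()
    return local == md5_of_message_body


# Stand-in for a SendMessage response; the MessageId value is hypothetical.
response = {"MessageId": "example-id",
            "MD5OfMessageBody": "5d41402abc4b2a76b9719d911017c592"}
print(body_matches_md5("hello", response["MD5OfMessageBody"]))  # True
```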
16
Q

SQS Consuming Messages

A
  • Poll up to 10 msg at a time
  • Process the message within the visibility timeout
  • Delete the msg using its receipt handle
  • max 120,000 in-flight msg being processed by consumers
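
The poll, process, delete cycle can be sketched as follows, assuming `client` behaves like `boto3.client("sqs")`. `drain_once` and `handler` are illustrative names. The handler has to finish within the visibility timeout, otherwise the message becomes visible to other consumers again.

```python
def drain_once(client, queue_url: str, handler) -> int:
    """Receive up to 10 messages, process each, then delete it."""
    response = client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,  # API maximum per call
        WaitTimeSeconds=20,      # long polling
    )
    processed = 0
    for message in response.get("Messages", []):
        handler(message["Body"])
        # Deletion uses the receipt handle from this specific receive call.
        client.delete_message(QueueUrl=queue_url,
                              ReceiptHandle=message["ReceiptHandle"])
        processed += 1
    return processed
```

If `handler` raises, the message is not deleted and will be redelivered once its visibility timeout expires.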
17
Q

SQS FIFO Queue

A
  • Name of queue must end in .fifo
  • Lower throughput (3,000 msg per sec with batching and 300 per second without)
  • messages are processed in order by the consumer
  • msg are processed exactly once (duplicate sends are removed)
18
Q

SQS Security

A
  • EIF : HTTPs
  • EAR : KMS
  • IAM
19
Q

IoT Device Gateway

A
  • Serves as entry point for IoT devices connecting to AWS
  • Supports MQTT, Websocket and HTTP1.1 protocols
  • Fully managed
  • Scale automatically to support over 1 billion Things
20
Q

IoT Message Broker

A
  • Pub Sub pattern with low latency
  • Msg sent using MQTT, WebSocket and HTTP1.1
  • Msg are published into topics
  • Msg broker forwards msg to all clients connected to the topic
21
Q

IoT Authentication

A
  • 3 authN
    • X.509 certificates
    • AWS SigV4
    • Custom tokens with custom authorizers
  • For mobile
    • Cognito Identities
  • Web / Desktop / CLI
    • IAM
    • Federated Identities
22
Q

IoT Authorization

A
  • AWS IoT Policies
    • Attach to X.509 certificates or Cognito Identities
    • Able to revoke any device at any time
    • IoT Policies are JSON doc
    • Can be attached to groups instead of individual Things
  • IAM Policies
    • Attached to users, group or roles
    • Used for controlling IoT AWS APIs
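
A sketch of what an IoT policy JSON document can look like, built as a Python dict. The region, account ID, client ID, and topic name are placeholders chosen for illustration.

```python
import json

# Hypothetical identifiers; replace with real values.
REGION, ACCOUNT, THING = "us-east-1", "123456789012", "sensor-1"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # Let the device connect only under its own client ID.
            "Effect": "Allow",
            "Action": "iot:Connect",
            "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT}:client/{THING}",
        },
        {   # Let it publish only to its own data topic.
            "Effect": "Allow",
            "Action": "iot:Publish",
            "Resource": f"arn:aws:iot:{REGION}:{ACCOUNT}:topic/sensors/{THING}/data",
        },
    ],
}

policy_document = json.dumps(policy, indent=2)
print(policy_document)
```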
23
Q

IoT Device Shadow

A
  • JSON doc representing the state of a connected Thing
  • IoT Thing will retrieve the state when online and adapt
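
A simplified sketch of how a Thing might work out what to change when it comes back online. It assumes the usual shadow layout with `state.desired` and `state.reported`, and compares only top-level keys. `shadow_delta` is an illustrative name.

```python
def shadow_delta(shadow: dict) -> dict:
    """Return desired settings that differ from what the device last reported."""
    state = shadow.get("state", {})
    desired = state.get("desired", {})
    reported = state.get("reported", {})
    return {key: value for key, value in desired.items()
            if reported.get(key) != value}


shadow = {
    "state": {
        "desired": {"light": "on", "temp": 21},
        "reported": {"light": "off", "temp": 21},
    }
}
print(shadow_delta(shadow))  # {'light': 'on'}
```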
24
Q

IoT Rules Engines

A
  • Rules are defined on the MQTT topics
  • Rules = when it is triggered
  • Use Cases
    • Augment or filter data received from a device
    • Write data received from a device to a DynamoDB database
    • Save a file to S3
  • Rules need IAM roles to perform their actions
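
Rules select messages by MQTT topic filter. A minimal sketch of the standard MQTT wildcard semantics (`+` for one level, `#` for the remaining levels), assuming `#` only appears as the last level. `topic_matches` is an illustrative name, not an AWS API.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Check whether a concrete topic matches an MQTT topic filter."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True  # multi-level wildcard matches everything that remains
        if i >= len(topic_levels):
            return False  # filter is longer than the topic
        if level != "+" and level != topic_levels[i]:
            return False  # literal level mismatch
    return len(filter_levels) == len(topic_levels)


print(topic_matches("sensors/+/data", "sensors/a1/data"))    # True
print(topic_matches("sensors/+/data", "sensors/a1/status"))  # False
print(topic_matches("sensors/#", "sensors/a1/data"))         # True
```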
25
Q

IoT Greengrass

A
  • IoT Greengrass brings the compute layer directly to the device
  • We can execute AWS Lambda functions on the devices
  • Operate offline
  • Deploy functions from the cloud directly to the devices
26
Q

Database Migration Service

A
  • Homogeneous and heterogeneous migrations
  • Continuous data replication using Change Data Capture
  • Requires an EC2 instance to perform the replication tasks
27
Q

Database Migration Service Schema Conversion Tool

A
  • Prefer compute-intensive instances to optimize data conversions
28
Q

Direct Connect (DX)

A
  • Provides a dedicated private connection from a remote network to your VPC
  • Requires setting up a Virtual Private Gateway on your VPC
  • Use Cases
    • Increase Bandwidth Throughput
    • Consistent network experience
    • Hybrid Env
  • Supports IPv4 and IPv6
  • If DX connects to one or more VPCs in different regions, use Direct Connect Gateway
29
Q

Direct Connect Connection Types

A
  • Dedicated
    • 1, 10, 100 Gbps
    • physical Ethernet port dedicated to a customer
  • Hosted
    • 50 Mbps, 500 Mbps, up to 10 Gbps
    • Capacity can be added or removed on demand
  • Lead times are often longer than 1 month to establish a new connection
30
Q

Direct Connect Encryption

A
  • In transit : not encrypted, but private
  • AWS Direct Connect + VPN provides an IPsec-encrypted private connection
31
Q

Direct Connect Max Resiliency

A
  • Separate connections terminating on separate devices in more than one location
32
Q

Snowcone

A
  • Light
  • Device for edge computing, storage and data transfer
  • 8 TB
  • Connect it to internet and use AWS DataSync to send data
33
Q

Snowball Edge

A
  • Storage Optimized
    • 80TB
  • Compute Optimized
    • 42TB
  • Provide block storage and S3-compatible object storage
34
Q

Snowmobile

A
  • 100PB
  • High Security, Temperature Controlled, GPS, 24/7 video surveillance
  • Better than Snowball if transferring more than 10PB
35
Q

AWS OpsHub

A
  • Manage your Snow Family Device
36
Q

Amazon Managed Streaming for Apache Kafka (MSK)

A
  • Fully managed
  • Data Stored in EBS
  • Message size 1MB by default, configurable higher (e.g. 10MB)
  • Choose number of AZ
  • Choose the VPC and subnet
  • Choose the broker instance type
  • Choose the size of EBS volume
  • Durability & Availability
    • Ensure the replication factor (RF) is at least 2 for 2 AZ clusters and at least 3 for 3 AZ clusters
    • Set minimum in-sync replicas (minISR) to at most RF-1
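
The two durability rules on this card can be written as a quick sanity check. `durability_problems` is an illustrative helper, not part of any Kafka or AWS tooling.

```python
def durability_problems(num_azs: int, replication_factor: int, min_isr: int) -> list:
    """Return a list of rule violations; an empty list means the settings pass."""
    problems = []
    if replication_factor < num_azs:
        problems.append(
            f"RF {replication_factor} is below the AZ count {num_azs}")
    if min_isr > replication_factor - 1:
        problems.append(
            f"minISR {min_isr} exceeds RF-1 ({replication_factor - 1})")
    return problems


print(durability_problems(3, 3, 2))  # []
print(durability_problems(3, 2, 2))  # two violations
```

Keeping minISR at RF-1 or lower means one broker can be down (for example during patching) while producers using `acks=all` can still write.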
37
Q

MSK Security

A
  • EIF : TLS
  • EAR : KMS
  • AuthN and AuthZ
    • Mutual TLS + Kafka ACLs
    • SASL / SCRAM + Kafka ACLs
    • IAM
38
Q

MSK Monitoring

A
  • CloudWatch Metrics
    • Basic monitoring, enhanced monitoring, topic level monitoring
  • Prometheus
  • Broker Log delivery
    • To CloudWatch, S3, KDS
39
Q

MSK Connect

A
  • Managed Kafka Connect Workers
  • Auto-scaling capabilities
  • Deploy any Kafka Connect connectors to MSK as a plugin
40
Q

KDS > SQS

A
  • Ability for multiple applications to consume the same stream concurrently
  • Ability to consume records in the same order a few hours later
41
Q

KDF Sources

A
  • KDF API
  • KDS
  • Other AWS Services
  • Kinesis Agent
  • AWS Lambda
42
Q

KDF + Lambda Transformation

A
  • All transformed records from Lambda must be returned to Firehose with the following 3 parameters
    • recordId
    • result
    • data
  • Enable source record backup and KDF will deliver the untransformed incoming data to a separate S3 bucket
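
A minimal sketch of a transformation handler that returns the three required parameters for every record. It assumes the incoming payloads are JSON objects; the added `processed` field is purely illustrative.

```python
import base64
import json


def lambda_handler(event, context):
    """Return every incoming record with recordId, result and data."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # illustrative transformation
            encoded = base64.b64encode(
                (json.dumps(payload) + "\n").encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"],
                           "result": "Ok",
                           "data": encoded})
        except (ValueError, TypeError):
            # Bad base64, bad JSON, or a non-object payload: flag it as failed
            # and hand the original data back unchanged.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
    return {"records": output}
```

`result` accepts `Ok`, `Dropped`, or `ProcessingFailed`. Every `recordId` in the request must appear in the response, otherwise Firehose treats the missing records as failed.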
43
Q

Kinesis Video Streams

A
  • Fully managed
  • Service for media ingestion, storage and processing
  • Use Cases
    • Smart Home : Stream video and audio from camera-equipped home devices
    • Smart City
    • Industrial Automation
  • Integrates with ML Framework
44
Q

Kinesis Video Stream Concepts

A
  • Video Stream
    • Resource that enables you to capture live video and other time-encoded data
  • Fragment
    • Self-contained sequence of media frames
  • Chunk
    • KVS stores videos in chunks