Collection Flashcards

1
Q

KDS

A
  • Retention 1-365 days
  • Record = Partition Key + Data Blob 1MB
  • Provisioned
    • IN : 1MB per shard per sec
    • OUT : 2MB per shard per sec
  • On-demand
    • 4MB or 4000 records per second
    • scales automatically based on peak throughput observed during the last 30 days
  • replicates to 3 AZ
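
The provisioned limits on this card turn into a simple sizing calculation. A minimal sketch, assuming a write limit of 1000 records per second per shard in addition to the 1MB/s figure; the function name and parameters are illustrative, not an AWS API:

```python
import math

# Per-shard limits in provisioned mode.
IN_MB_PER_SHARD = 1.0         # write: 1 MB/s per shard
OUT_MB_PER_SHARD = 2.0        # read: 2 MB/s per shard
IN_RECORDS_PER_SHARD = 1000   # assumption: 1000 records/s write limit per shard


def shards_needed(in_mb_s: float, in_records_s: float, out_mb_s: float) -> int:
    """Return the smallest shard count that satisfies all three limits."""
    return max(
        1,
        math.ceil(in_mb_s / IN_MB_PER_SHARD),
        math.ceil(in_records_s / IN_RECORDS_PER_SHARD),
        math.ceil(out_mb_s / OUT_MB_PER_SHARD),
    )
```

For example, 5 MB/s and 3000 records/s of writes with 12 MB/s of reads needs 6 shards, because the read side is the tightest constraint.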
2
Q

Kinesis Producer SDK

A
  • Use Cases : Support multiple programming languages
  • PutRecord vs PutRecords
  • PutRecords uses batching and increases throughput
  • ProvisionedThroughputExceededException
    • Solution : retry with backoff, increase the number of shards and choose a well-distributed partition key
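
The "retry with backoff" advice can be sketched without any AWS dependency. This is a minimal illustration, assuming the caller passes in a `put` function that raises on throttling; `ThroughputExceeded` and `put_with_backoff` are hypothetical names standing in for the real exception and SDK call:

```python
import random
import time


class ThroughputExceeded(Exception):
    """Stand-in for ProvisionedThroughputExceededException (hypothetical class)."""


def put_with_backoff(put, record, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call put(record), retrying with exponential backoff and jitter on throttling."""
    for attempt in range(max_attempts):
        try:
            return put(record)
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Base delay doubles each attempt (0.1s, 0.2s, 0.4s, ...) plus random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

The jitter spreads retries from many producers so they do not all hit the same shard again at the same instant.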
3
Q

Kinesis Producer Library

A
  • Use Cases : High performance and long-running producers
  • Synchronous and Asynchronous API
  • Batching –> 1MB/s or 1000 records /s
  • Compression must be implemented by users
  • KPL records must be decoded with KCL or special helper library
  • RecordMaxBufferedTime 100ms
4
Q

Kinesis Agent

A
  • Use Cases : Monitor log files and send them to KDS
  • Built on top of KPL
  • Features
    • write from multiple directories to multiple Kinesis streams
    • preprocess data before sending
    • Able to handle file rotation, checkpointing and retry
    • Emit metrics to CloudWatch for monitoring
5
Q

Kinesis Consumer SDK

A
  • 2MB per shard per second
  • GetRecords returns up to 10MB of data or up to 10,000 records per call
  • Max 5 GetRecords API calls per shard per second
  • 200ms latency
6
Q

Kinesis Client Library

A
  • Read records from Kinesis produced by KPL
  • Share multiple shards among multiple consumers in one group
  • Checkpointing feature to resume progress
  • Leverage DynamoDB for checkpointing
    • Make sure to provision enough WCU / RCU
    • Or use on-demand capacity for DynamoDB, otherwise an under-provisioned table will slow down KCL
  • ExpiredIteratorException
    • Solution : increase WCU
7
Q

Kinesis Connector Library

A
  • S3
  • DynamoDB
  • Redshift
  • ElasticSearch
8
Q

Kinesis and Lambda

A
  • Lambda can source records from KDS
  • Lambda consumer has a library to de-aggregate records from the KPL
9
Q

Kinesis Enhanced Fan Out

A
  • 2MB /consumer /sec /shard
  • Kinesis pushes data to consumers over HTTP/2
  • 70 ms latency
  • Default limit of 5 consumers using enhanced fan out per data stream
  • Use SubscribeToShard API
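
The difference between shared (polling) consumers and enhanced fan-out comes down to simple arithmetic. A rough sketch, assuming polling consumers split one shard's 2MB/s and its 5 GetRecords calls per second evenly; the function names are illustrative:

```python
SHARD_READ_MB_S = 2.0         # read limit per shard
GET_RECORDS_CALLS_PER_S = 5   # GetRecords calls per shard per second (shared mode)


def per_consumer_read_mb_s(consumers: int, enhanced_fan_out: bool) -> float:
    """Read throughput each consumer can expect from one shard."""
    if enhanced_fan_out:
        return SHARD_READ_MB_S            # each registered consumer gets its own 2 MB/s
    return SHARD_READ_MB_S / consumers    # polling consumers split the shard's 2 MB/s


def min_poll_interval_s(consumers: int) -> float:
    """Shortest polling interval per consumer that stays within 5 calls/s per shard."""
    return consumers / GET_RECORDS_CALLS_PER_S
```

With 4 polling consumers each one gets 0.5 MB/s per shard, while 4 enhanced fan-out consumers each keep the full 2 MB/s.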
10
Q

Auto Scaling

A
  • API call to change the number of shards is UpdateShardCount
  • We can implement AutoScaling with AWS Lambda
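
The decision logic inside such a Lambda might look like the sketch below. It assumes a target utilization of 70% of the 1MB/s per-shard write limit, and it assumes a single UpdateShardCount call can at most double or halve the current count; both are assumptions to verify against the current service limits, and `next_shard_count` is a hypothetical helper:

```python
import math


def next_shard_count(current: int, in_mb_s: float, target_utilization: float = 0.7) -> int:
    """Pick a shard count that keeps ingest near target_utilization of 1 MB/s per shard."""
    desired = max(1, math.ceil(in_mb_s / (1.0 * target_utilization)))
    # Assumption: one UpdateShardCount call may at most double or halve the count.
    upper = current * 2
    lower = max(1, math.ceil(current / 2))
    return min(max(desired, lower), upper)
```

The returned value would then be passed as the target shard count to UpdateShardCount, with repeated invocations closing any remaining gap.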
11
Q

KDS Security

A
  • EIF (encryption in flight) : SSL / TLS
  • EAR (encryption at rest) : KMS
  • VPC
  • KCL –> grant read and write access to DynamoDB table
12
Q

Kinesis Data Firehose

A
  • Fully managed
  • Near real time (60 sec latency)
  • Auto scaling
  • Spark / KCL do not read from KDF
  • Destination : s3, Splunk, Redshift, ElasticSearch
  • Record Size 1MB
  • Replicates records to 3 AZ
  • Retention 24 hours
13
Q

KDF Buffer

A
  • Buffer time : e.g. 2 mins
  • Buffer size : e.g. 32MB
  • The buffer is flushed when either limit is reached first
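
The buffering behaviour can be modelled with a toy class that flushes on whichever threshold is hit first, size or age. This is a simplified sketch, not how Firehose is implemented: it only checks the age when a new record arrives, whereas the real service flushes on its own timer. The class and parameter names are illustrative:

```python
class Buffer:
    """Toy buffer that flushes when either a size or an age threshold is reached."""

    def __init__(self, max_bytes=32 * 1024 * 1024, max_age_s=120):
        self.max_bytes = max_bytes
        self.max_age_s = max_age_s
        self.records = []
        self.size = 0
        self.opened_at = None

    def add(self, record: bytes, now: float):
        """Add a record at time `now`; return the flushed batch, or None."""
        if self.opened_at is None:
            self.opened_at = now  # the age clock starts with the first record
        self.records.append(record)
        self.size += len(record)
        if self.size >= self.max_bytes or now - self.opened_at >= self.max_age_s:
            batch, self.records, self.size, self.opened_at = self.records, [], 0, None
            return batch
        return None
```

A high-volume stream tends to hit the size limit first, while a low-volume stream is flushed by the time limit.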
14
Q

SQS Standard

A
  • Fully managed
  • 1-14 days retention
  • 10ms latency
  • 256KB msg body + metadata
  • Horizontal scaling in terms of number of consumers
  • Max 120,000 in-flight messages being processed by consumers
15
Q

SQS Producing Messages

A
  • Provide delay delivery
  • Get back
    • msg id
    • md5 hash of the body
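
The returned MD5 hash lets the producer check that the body arrived intact. A minimal sketch, assuming the digest is computed over the UTF-8 bytes of the body; the helper names are illustrative:

```python
import hashlib


def body_md5(body: str) -> str:
    """Hex MD5 digest of the UTF-8 encoded message body."""
    return hashlib.md5(body.encode("utf-8")).hexdigest()


def verify_send(body: str, returned_md5: str) -> bool:
    """True if the digest returned with the message id matches the body we sent."""
    return body_md5(body) == returned_md5
```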
16
Q

SQS Consuming Messages

A
  • Poll up to 10 msgs at a time
  • Process the message within the visibility timeout
  • Delete the msg using its receipt handle
  • max 120,000 in-flight msg being processed by consumers
17
Q

SQS FIFO Queue

A
  • Name of queue must end in .fifo
  • Lower throughput (3,000 msg per sec with batching and 300 per second without)
  • messages are processed in order by consumer
  • msg are sent exactly once
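
Ordering and exactly-once sending can be illustrated with a toy in-memory queue. This sketch assumes a 5-minute deduplication window and, when no explicit deduplication id is given, a SHA-256 hash of the body as the id; the class is hypothetical and only models the behaviour:

```python
import hashlib
from typing import Optional

DEDUP_WINDOW_S = 300  # assumption: 5-minute deduplication interval


class FifoQueueSketch:
    """Toy in-memory queue: keeps send order and drops duplicates inside the window."""

    def __init__(self):
        self.messages = []   # kept in the order they were accepted
        self.seen = {}       # dedup id -> time it was first accepted

    def send(self, body: str, now: float, dedup_id: Optional[str] = None) -> bool:
        """Return True if accepted, False if dropped as a duplicate."""
        # Assumption: without an explicit id, a SHA-256 hash of the body is used.
        key = dedup_id or hashlib.sha256(body.encode("utf-8")).hexdigest()
        first_seen = self.seen.get(key)
        if first_seen is not None and now - first_seen < DEDUP_WINDOW_S:
            return False
        self.seen[key] = now
        self.messages.append(body)
        return True
```

A retry of the same message inside the window is silently dropped, which is what makes producer retries safe.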
18
Q

SQS Security

A
  • EIF : HTTPs
  • EAR : KMS
  • IAM
19
Q

IoT Device Gateway

A
  • Serves as entry point for IoT devices connecting to AWS
  • Supports MQTT, WebSocket and HTTP 1.1 protocols
  • Fully managed
  • Scales automatically to support over 1 billion Things
20
Q

IoT Message Broker

A
  • Pub Sub pattern with low latency
  • Msg sent using MQTT, WebSocket and HTTP1.1
  • Msg are published into topics
  • Msg broker forwards msg to all clients connected to the topic
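
The publish/subscribe pattern on this card can be sketched as a tiny in-memory broker. This only illustrates the pattern, with exact topic-name matching, and has nothing to do with the actual AWS IoT implementation; the class and topic names are made up:

```python
from collections import defaultdict


class Broker:
    """Toy message broker: clients subscribe to topics, publishers post to topics."""

    def __init__(self):
        self.subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Forward message to every subscriber of topic; return the delivery count."""
        # .get avoids creating an empty entry for topics nobody subscribed to.
        callbacks = self.subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

The publisher never needs to know who is listening, which is what decouples devices from the applications that consume their data.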
21
Q

IoT Authentication

A
  • 3 authN
    • X.509 certificates
    • AWS SigV4
    • Custom tokens with custom authorizers
  • For mobile
    • Cognito Identities
  • Web / Desktop / CLI
    • IAM
    • Federated Identities
22
Q

IoT Authorization

A
  • AWS IoT Policies
    • Attach to X.509 certificates or Cognito Identities
    • Able to revoke any device at any time
    • IoT Policies are JSON doc
    • Can be attached to groups instead of individual Things
  • IAM Policies
    • Attached to users, groups or roles
    • Used for controlling access to AWS IoT APIs
23
Q

IoT Device Shadow

A
  • JSON doc representing the state of a connected Thing
  • IoT Thing will retrieve the state when online and adapt
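
How a Thing "adapts" after coming back online can be sketched as a comparison between the state the cloud wants and the state the device last reported. The sketch assumes a shadow document with `desired` and `reported` sections under a `state` key; the helper name and the sample fields are illustrative:

```python
def shadow_delta(shadow: dict) -> dict:
    """Return the desired settings that differ from what the device last reported."""
    state = shadow.get("state", {})
    desired = state.get("desired", {})
    reported = state.get("reported", {})
    return {k: v for k, v in desired.items() if reported.get(k) != v}


# Example: the light was switched on in the cloud while the device was offline.
shadow = {
    "state": {
        "desired": {"light": "on", "temp": 21},
        "reported": {"light": "off", "temp": 21},
    }
}
changes = shadow_delta(shadow)  # only the keys the device still has to apply
```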
24
Q

IoT Rules Engines

A
  • Rules are defined on the MQTT topics
  • Rule = when it is triggered, Action = what it does
  • Use Cases
    • Augment or filter data received from a device
    • Write data received from a device to a DynamoDB database
    • Save a file to S3
  • Rules need IAM roles to perform their actions
25
Q

IoT Greengrass

A
  • IoT Greengrass brings the compute layer to the device directly
  • We can execute AWS Lambda functions on the devices
  • Operate offline
  • Deploy functions from the cloud directly to the devices
26
Q

Database Migration Service (DMS)

A
  • Homogeneous and heterogeneous migrations
  • Continuous data replication using Change Data Capture (CDC)
  • Requires an EC2 instance to perform the replication tasks
27
Q

Database Migration Service Schema Conversion Tool (SCT)

A
  • Prefer compute-intensive instances to optimize data conversions
28
Q

Direct Connect (DX)

A
  • Provides a dedicated private connection from a remote network to your VPC
  • Requires setting up a Virtual Private Gateway on your VPC
  • Use Cases
    • Increase Bandwidth Throughput
    • Consistent network experience
    • Hybrid Env
  • Support IPv4 and IPv6
  • If DX is set up to one or more VPCs in different regions, use Direct Connect Gateway
29
Q

Direct Connection Types

A
  • Dedicated
    • 1, 10 or 100 Gbps
    • physical ethernet port dedicated to a customer
  • Hosted
    • 50 Mbps, 500 Mbps, up to 10 Gbps
    • Capacity can be added or removed on demand
  • Lead times are often longer than 1 month to establish a new connection
30
Q

Direct Connect Encryption

A
  • In transit : not encrypted, but private
  • AWS Direct Connect + VPN provides an IPsec-encrypted private connection
31
Q

Direct Connect Max Resiliency

A
  • Multiple DX connections terminating on separate devices in more than one location
32
Q

Snowcone

A
  • Light
  • Device for edge computing, storage and data transfer
  • 8 TB
  • Connect it to internet and use AWS DataSync to send data
33
Q

Snowball Edge

A
  • Storage Optimized
    • 80TB
  • Compute Optimized
    • 42TB
  • Provide block storage and S3-compatible object storage
34
Q

Snowmobile

A
  • 100PB
  • High Security, Temperature Controlled, GPS, 24/7 video surveillance
  • Better than Snowball if transferring more than 10PB
35
Q

AWS OpsHub

A
  • Manage your Snow Family Device
36
Q

AWS Managed Streaming Kafka

A
  • Fully managed
  • Data Stored in EBS
  • Message size 1MB by default, configurable up to 10MB
  • Choose number of AZ
  • Choose the VPC and subnet
  • Choose the broker instance type
  • Choose the size of EBS volume
  • Durability & Availability
    • Ensure the replication factor (RF) is at least 2 for 2 AZ clusters and at least 3 for 3 AZ clusters
    • Set minimum in-sync replicas (minISR) to at most RF-1
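
The two durability rules can be written as a small pre-flight check on a cluster configuration. A minimal sketch; `check_durability` is a hypothetical helper and not part of any Kafka or MSK tooling:

```python
def check_durability(num_azs: int, replication_factor: int, min_isr: int) -> list:
    """Return a list of problems with the settings; an empty list means they pass."""
    problems = []
    # Rule 1: RF of at least 2 for 2-AZ clusters and at least 3 for 3-AZ clusters.
    if replication_factor < num_azs:
        problems.append(f"RF {replication_factor} is below the AZ count {num_azs}")
    # Rule 2: minISR must not exceed RF-1, so one broker can be down without blocking writes.
    if min_isr > replication_factor - 1:
        problems.append(f"minISR {min_isr} exceeds RF-1 ({replication_factor - 1})")
    return problems
```

A 3-AZ cluster with RF 3 and minISR 2 passes both checks, while RF 2 with minISR 2 fails both.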
37
Q

MSK Security

A
  • EIF : TLS
  • EAR : KMS
  • AuthN and AuthZ
    • Mutual TLS + Kafka ACLs
    • SASL / SCRAM + Kafka ACLs
    • IAM
38
Q

MSK Monitoring

A
  • CloudWatch Metrics
    • Basic monitoring, enhanced monitoring, topic level monitoring
  • Prometheus
  • Broker Log delivery
    • To CloudWatch, S3, KDS
39
Q

MSK Connect

A
  • Managed Kafka Connect Workers
  • Auto-scaling capabilities
  • Deploy any Kafka Connect connectors to MSK as a plugin
40
Q

KDS > SQS

A
  • Ability for multiple applications to consume the same stream concurrently
  • Ability to consume records in the same order a few hours later
41
Q

KDF Sources

A
  • KDF API
  • KDS
  • Other AWS Services
  • Kinesis Agent
  • AWS Lambda
42
Q

KDF + Lambda Transformation

A

  • All transformed records from Lambda must be returned to Firehose with the following 3 parameters
    • recordId
    • result
    • data
  • Enable source record backup and KDF will deliver the un-transformed incoming data to a separate S3 bucket
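
A transformation function that returns the three required parameters might look like the sketch below. It assumes the incoming event carries a `records` list whose `data` fields are base64-encoded JSON, and that `Ok` and `ProcessingFailed` are accepted values for `result`; the `message` field and the upper-casing are purely hypothetical examples of a transformation:

```python
import base64
import json


def lambda_handler(event, context):
    """Upper-case the `message` field of each record and hand it back to Firehose."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["message"] = str(payload.get("message", "")).upper()
            data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except (ValueError, TypeError, AttributeError):
            # Return the original data unchanged and flag the record as failed.
            output.append({"recordId": record["recordId"],
                           "result": "ProcessingFailed",
                           "data": record["data"]})
    return {"records": output}
```

Every incoming recordId appears exactly once in the response, so Firehose can match each result to its source record.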

43
Q

Kinesis Video Streams

A
  • Fully managed
  • Service for media ingestion, storage and processing
  • Use Cases
    • Smart Home : Stream video and audio from camera-equipped home devices
    • Smart City
    • Industrial Automation
  • Integrates with ML Framework
44
Q

Kinesis Video Stream Concepts

A
  • Video Stream
    • resource that enables you to capture live video and other time-encoded data
  • Fragment
    • Self-contained sequence of media frames
  • Chunk
    • KVS stores videos in chunks