Section 19: AWS Integration & Messaging: SQS, SNS & Kinesis: Kinesis Flashcards
(94 cards)
What does AWS Kinesis do?
Collect, process, and analyze real-time video and data streams
What kind of real-time data is Kinesis well suited to ingest?
Logs, metrics, website clickstreams, and IoT telemetry data
What are the four Kinesis services? (names only)
- Kinesis Data Streams
- Kinesis Data Firehose
- Kinesis Data Analytics
- Kinesis Video Streams
Which Kinesis type is best to: capture, process, and store data streams?
Your options are: Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, Kinesis Video Streams.
Kinesis Data Streams
Which Kinesis type is best to: load data streams into AWS data stores?
Your options are: Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, Kinesis Video Streams.
Kinesis Data Firehose
Which Kinesis option is best suited to: analyze data streams with SQL or Apache Flink?
Your options are: Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, Kinesis Video Streams.
Kinesis Data Analytics
Which Kinesis option is best to: capture, process, and store video streams?
Your options are: Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, Kinesis Video Streams.
Kinesis Video Streams
What is a shard, in terms of data?
A part of a dataset, when that dataset has been partitioned.
Can applications, clients, the AWS SDK, the Kinesis Producer Library (KPL), and the Kinesis Agent all be Kinesis Data Streams producers?
Yes. Several producers can write to the same stream at the same time.
Is a Kinesis data stream made up of shards?
Yes. Again, as it pertains to data, a shard is a part of a dataset.
Can you scale up the number of shards in a Kinesis data stream?
Yes
- What is the maximum size of a record that can be sent to a Kinesis data stream?
- What is the maximum write throughput per shard, in MB/sec?
- What is the alternative per-shard write limit, in records per second?
- 1 MB
- 1 MB/sec per shard
- 1,000 records/sec per shard
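The three limits above apply together: a write is throttled as soon as either the bytes-per-second or the records-per-second limit is hit. A minimal sketch of that check, assuming 1 MB means 1024 * 1024 bytes; the function name `fits_in_one_second` is made up for illustration and is not part of any AWS library:

```python
# Per-shard write limits quoted on the card above.
# Assumption: "1 MB" is treated as 1024 * 1024 bytes.
MAX_RECORD_BYTES = 1024 * 1024
MAX_BYTES_PER_SEC = 1024 * 1024
MAX_RECORDS_PER_SEC = 1000


def fits_in_one_second(record_sizes):
    """Return True if records of these sizes (in bytes) could all be
    written to a single shard within one second."""
    if any(size > MAX_RECORD_BYTES for size in record_sizes):
        return False  # a single record is already too large
    return (len(record_sizes) <= MAX_RECORDS_PER_SEC
            and sum(record_sizes) <= MAX_BYTES_PER_SEC)


# 1,000 small records fit; one more record breaks the count limit.
print(fits_in_one_second([500] * 1000))   # True
print(fits_in_one_second([500] * 1001))   # False
# Only 600 records, but 1,200,000 bytes breaks the byte limit.
print(fits_in_one_second([2000] * 600))   # False
```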
When a producer sends a record to a shard in a Kinesis data stream, what three things does the stored record consist of?
- Partition key (groups data by shard within a stream; the producer must specify it when putting records into the stream)
- Data blob (up to 1 MB; supplied by the producer)
- Sequence number (unique per partition key within a shard; assigned by Kinesis when the record is written, not by the producer)
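To make the producer side concrete, here is a small sketch that builds the arguments for a single-record write. The helper `build_put_record_request`, the stream name, and the key are hypothetical; `StreamName`, `PartitionKey`, and `Data` are the parameter names of the PutRecord API. The actual boto3 call is left in a comment because it needs credentials and a real stream.

```python
def build_put_record_request(stream_name, partition_key, payload):
    """Build keyword arguments for one PutRecord call (hypothetical helper)."""
    data = payload.encode("utf-8") if isinstance(payload, str) else payload
    # Assumption: the 1 MB record limit is treated as 1024 * 1024 bytes.
    if len(data) > 1024 * 1024:
        raise ValueError("data blob exceeds 1 MB")
    return {
        "StreamName": stream_name,
        "PartitionKey": partition_key,
        "Data": data,
    }


request = build_put_record_request("clickstream-demo", "user-42", '{"page": "/home"}')
print(request)

# With boto3 installed and AWS credentials configured, this could be sent as:
#   import boto3
#   response = boto3.client("kinesis").put_record(**request)
#   response["SequenceNumber"]  # assigned by Kinesis, not by the producer
```

Note that the request carries no sequence number: the producer supplies only the partition key and the data blob, and the sequence number comes back in the response.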
Can all of the following be consumers of Kinesis Data Streams records?
* AWS Lambda
* Kinesis Data Firehose
* Kinesis Data Analytics
* Custom consumer (AWS SDK), in classic (shared) or enhanced fan-out mode
* Kinesis Client Library (KCL), a library that simplifies reading from a data stream
* Applications running on EC2 (shown on one slide but not on another)
Yes
What three things are in each record delivered from a shard to a Kinesis Data Streams consumer?
- Partition key
- Sequence number
- data blob
What are the two rates at which records can be delivered from a shard to Kinesis Data Streams consumers?
- 2 MB/sec per shard, shared across all consumers (classic)
- 2 MB/sec per shard per consumer (enhanced fan-out)
The mode is chosen per consumer, not per stream, so shared and enhanced fan-out consumers can read from the same stream at the same time.
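The difference between the two read modes shows up once several consumers read the same stream. A rough sketch of the arithmetic, assuming all consumers use the same mode and that shared consumers split the per-shard limit evenly; the function name is made up for illustration:

```python
READ_MB_PER_SEC_PER_SHARD = 2.0  # per-shard read limit from the card above


def per_consumer_read_mb_per_sec(shards, consumers, enhanced_fan_out):
    """Estimate the read throughput each consumer can expect, in MB/sec."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must be at least 1")
    total = shards * READ_MB_PER_SEC_PER_SHARD
    if enhanced_fan_out:
        return total             # every consumer gets the full rate
    return total / consumers     # shared: all consumers split one budget


# 4 shards read by 4 consumers:
print(per_consumer_read_mb_per_sec(4, 4, enhanced_fan_out=False))  # 2.0
print(per_consumer_read_mb_per_sec(4, 4, enhanced_fan_out=True))   # 8.0
```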
About Kinesis Data Streams:
1. What is the retention period?
2. Can you reprocess (replay) data?
3. Can data be deleted once it has been inserted into Kinesis?
- Between 1 day and 365 days, inclusive
- Yes
- No. Data inserted into Kinesis is immutable.
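The Kinesis API expresses retention in hours rather than days, so the 1 to 365 day range becomes 24 to 8,760 hours. A small sketch of the conversion with a range check; `retention_hours` is a hypothetical helper, not an AWS function:

```python
MIN_RETENTION_HOURS = 24          # 1 day
MAX_RETENTION_HOURS = 365 * 24    # 365 days = 8,760 hours


def retention_hours(days):
    """Convert a retention period in days to hours, rejecting out-of-range values."""
    hours = days * 24
    if not MIN_RETENTION_HOURS <= hours <= MAX_RETENTION_HOURS:
        raise ValueError("retention must be between 1 and 365 days")
    return hours


print(retention_hours(7))    # 168
print(retention_hours(365))  # 8760

# With boto3, the resulting value would be passed as RetentionPeriodHours, e.g.:
#   boto3.client("kinesis").increase_stream_retention_period(
#       StreamName="clickstream-demo",  # hypothetical stream name
#       RetentionPeriodHours=retention_hours(7),
#   )
```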
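About Kinesis Data Streams:
4. Does data that shares the same partition key go to the same shard?
Yes. Records that share a partition key are routed to the same shard, which preserves their order; this is what the course calls ordering. A partition key is not the same thing as a shard: many partition keys can map to one shard.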
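The partition key versus shard confusion is easier to clear up with the routing rule in hand: Kinesis takes the MD5 hash of the partition key, reads it as a 128-bit integer, and picks the shard whose hash key range contains that value. The sketch below assumes the shards split the hash space evenly, which holds for a freshly created stream but not necessarily after shards have been split or merged; `shard_for_key` is a made-up name.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard a partition key maps to,
    assuming shard_count shards with equal hash key ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


# The same key always lands on the same shard, so its records stay ordered.
print(shard_for_key("user-42", 4) == shard_for_key("user-42", 4))  # True

# Many different keys share a small number of shards.
keys = [f"user-{i}" for i in range(1000)]
print(sorted({shard_for_key(k, 4) for k in keys}))
```

Ordering is therefore guaranteed per partition key (within its shard), not across the whole stream.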
- What producers can write to a Kinesis data stream?
- AWS SDK
- Kinesis Producer Library (KPL)
- Kinesis Agent
- What consumers can read from a Kinesis data stream?
- Write your own, using the Kinesis Client Library (KCL) or the AWS SDK
- Alternatively, use a managed consumer such as AWS Lambda, Kinesis Data Firehose, or Kinesis Data Analytics
Kinesis Data Streams has two capacity modes. What are they?
- Provisioned mode
- On-demand mode
Given the following requirements, do you use the provisioned or the on-demand capacity mode of Kinesis Data Streams?
* you need to choose the number of shards provisioned, and scale manually or using the API
* you need each shard to have up to 1 MB/s in (or 1,000 records per second)
* you need each shard to get 2 MB/s out (for a classic or enhanced fan-out consumer)
* you need to pay per shard provisioned per hour
Capacity mode: Provisioned
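In provisioned mode you pick the shard count yourself, so it helps to see how the per-shard limits above turn into a number. A rough sizing sketch, assuming classic (shared) consumers so that `read_mb_per_sec` is the combined read rate of all consumers; the function name and the sample workloads are made up:

```python
import math


def shards_needed(write_mb_per_sec, records_per_sec, read_mb_per_sec):
    """Estimate the minimum shard count that satisfies every per-shard limit."""
    by_write_bytes = math.ceil(write_mb_per_sec / 1.0)   # 1 MB/s in per shard
    by_write_count = math.ceil(records_per_sec / 1000)   # 1,000 records/s in per shard
    by_read_bytes = math.ceil(read_mb_per_sec / 2.0)     # 2 MB/s out per shard
    return max(1, by_write_bytes, by_write_count, by_read_bytes)


# Read-heavy workload: the 2 MB/s out limit decides.
print(shards_needed(3.5, 2000, 10))   # 5
# Many tiny records: the 1,000 records/s limit decides.
print(shards_needed(0.2, 4500, 1))    # 5
```

The largest of the three per-limit counts wins, because every limit has to hold at once.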
Given the following requirements, do you use the provisioned or the on-demand capacity mode of Kinesis Data Streams?
* you don’t want to provision or manage the capacity
* you’re perfectly happy with 4 MB/s in or 4,000 records per second (this is the default capacity provisioned for this capacity mode)
* you’re very happy for your stream to scale automatically based on the observed throughput peak during the last 30 days
* you’re happy to pay per stream per hour and for data in/out per GB
Capacity mode: On-demand
Let’s talk about Kinesis Data Streams security:
1. How do you control access/authorization?
2. Can you encrypt data in flight? Using what?
3. Can you encrypt data at rest? Using what?
- Control access/authorization using IAM policies
- Encryption in flight using HTTPS endpoints
- Encryption at rest using KMS
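As an illustration of the IAM part, here is a sketch of an identity policy that lets a producer write to one stream and nothing else. The helper name, region, account ID, and stream name are placeholders; `kinesis:PutRecord` and `kinesis:PutRecords` are the IAM actions for the two write APIs. Treat it as a starting point rather than a complete policy for your setup.

```python
import json


def producer_policy(region, account_id, stream_name):
    """Build a write-only IAM policy document for a single Kinesis data stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
                "Resource": f"arn:aws:kinesis:{region}:{account_id}:stream/{stream_name}",
            }
        ],
    }


# Placeholder values, not a real account or stream.
policy = producer_policy("us-east-1", "123456789012", "clickstream-demo")
print(json.dumps(policy, indent=2))
```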