Kinesis Consumers Flashcards Preview

Flashcards in Kinesis Consumers Deck (21)
1
Q

What are the 7 ways to consume data from a Kinesis stream?

A
  • Kinesis SDK
  • Kinesis Client Library (KCL)
  • Kinesis Connector Library
  • 3rd party libraries
  • Kinesis Firehose
  • AWS Lambda
  • Kinesis Consumer Enhanced Fan Out
2
Q

What are some examples of 3rd party libraries to consume data from a stream?

A
  • Spark
  • Log4J Appenders
  • Flume
  • Kafka Connect
3
Q

What is the read throughput limit per shard?

A

2 MB per second per shard

4
Q

How much data does the SDK GetRecords return?

A

Up to 10 MB per call. That exceeds the 2 MB/s shard limit, so after a 10 MB response you need to wait 5 seconds before the next call
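
The 5-second wait follows from dividing the response size by the 2 MB/s shard read limit. A minimal sketch of that arithmetic, assuming 1 MB = 1024 * 1024 bytes (the function name is hypothetical):

```python
# Per-shard read limit from the card above: 2 MB per second.
SHARD_READ_LIMIT_BYTES_PER_SEC = 2 * 1024 * 1024


def seconds_to_wait(bytes_returned):
    """Time the shard needs to 'pay back' a GetRecords response of this size."""
    return bytes_returned / SHARD_READ_LIMIT_BYTES_PER_SEC


# A full 10 MB response uses up 5 seconds of read capacity.
print(seconds_to_wait(10 * 1024 * 1024))
```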

5
Q

How many GetRecords API calls can be made per shard per second?

A

5 GetRecords API calls
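
One way to stay under 5 calls per second is to pause 200 ms between GetRecords calls. The sketch below assumes `client` is a boto3 Kinesis client (for example `boto3.client("kinesis")`); the function name, the `max_calls` cap, and the injectable `sleep` parameter are hypothetical conveniences, not part of any AWS library.

```python
import time


def poll_shard(client, stream_name, shard_id, max_calls, sleep=time.sleep):
    """Read one shard with the SDK, pausing 200 ms between GetRecords calls."""
    iterator = client.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",  # start from the oldest available record
    )["ShardIterator"]

    records = []
    for _ in range(max_calls):
        response = client.get_records(ShardIterator=iterator, Limit=1000)
        records.extend(response["Records"])
        iterator = response.get("NextShardIterator")
        if iterator is None:  # the shard has been closed and fully read
            break
        sleep(0.2)  # 1 second / 5 calls = 200 ms between calls
    return records
```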

6
Q

What is the GetRecords API latency?

A

200ms

7
Q

Can you increase throughput by adding more consumers to read from the same shard?

A

No. If more consumers read from the same shard, they share the 2 MB/s limit and the 5 API calls per second. Both limits apply per shard, not per consumer
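
To make the sharing concrete, the sketch below splits the per-shard limits evenly across consumers. Even splitting is an assumption made for illustration; in practice consumers compete for the shared budget. The function name is hypothetical.

```python
def per_consumer_share(num_consumers):
    """Even split of one shard's read limits across classic consumers."""
    return {
        "mb_per_sec": 2 / num_consumers,              # 2 MB/s shared
        "calls_per_sec": 5 / num_consumers,           # 5 GetRecords calls/s shared
        "poll_interval_ms": 1000 * num_consumers / 5, # time between calls per consumer
    }


print(per_consumer_share(1))
print(per_consumer_share(5))
```

With 5 consumers on one shard, each can poll only about once per second, so the effective latency rises well above 200 ms.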

8
Q

How would you get around the issue of multiple consumers sharing the read limits of shards?

A

Use Enhanced Fan-Out
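
A consumer has to be registered with the stream before it can use fan-out. A minimal sketch, assuming `client` is a boto3 Kinesis client; the function name is hypothetical, and the stream ARN and consumer name are supplied by the caller.

```python
def register_fan_out_consumer(client, stream_arn, consumer_name):
    """Register a consumer for enhanced fan-out and return its ARN."""
    response = client.register_stream_consumer(
        StreamARN=stream_arn,
        ConsumerName=consumer_name,
    )
    # The consumer ARN is what a SubscribeToShard call later refers to.
    return response["Consumer"]["ConsumerARN"]
```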

9
Q

What are 6 features of the Kinesis Client Library?

A
  • Available for multiple languages such as Java, Node, and Python
  • Reads records produced with the KPL (de-aggregation)
  • Shares shards among multiple consumers in one group and handles shard discovery
  • Checkpointing system to resume progress
  • Uses DynamoDB for checkpointing
  • Record processors process the data
10
Q

What should you do if your KCL is not reading fast enough even if your stream has enough throughput?

A

Check the DynamoDB table used for checkpointing. It may not have enough WCU/RCU to checkpoint efficiently, so increase its capacity

11
Q

What are 4 features of using Lambda to read from a stream?

A
  • Lambda has a library to de-aggregate records from KPL
  • Lambda can be used to run lightweight ETL
  • Lambda can be used to trigger notifications
  • Lambda has a configurable batch size
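
As an illustration of the lightweight ETL case, the sketch below shows a handler that decodes one batch of Kinesis records. Kinesis delivers each payload base64-encoded under `record["kinesis"]["data"]`. Treating the payloads as JSON is an assumption for this example, and it does not de-aggregate KPL records.

```python
import base64
import json


def handler(event, context):
    """Decode a batch of Kinesis records delivered to Lambda."""
    decoded = []
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        decoded.append(json.loads(payload))  # assumes producers wrote JSON
    return {"processed": len(decoded), "items": decoded}
```

The number of records in `event["Records"]` is bounded by the batch size configured on the event source mapping.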
12
Q

What is the difference in read throughput between classic consumers and Enhanced Fan-Out consumers?

A

Enhanced Fan-Out pushes data to consumers, so each consumer gets its own 2 MB per second per shard. Classic consumers share a single 2 MB per second per shard

13
Q

What are 3 reasons you would choose Classic Consumers?

A
  • Low number of consuming applications (fewer than 5)
  • You can tolerate latency of about 200ms
  • Minimize cost
14
Q

What are 4 reasons you would choose Enhanced Fan Out Consumers?

A
  • Multiple consumers for the same stream
  • Low latency of about 70ms
  • Higher costs
  • Default limit of 5 consumers using enhanced fan out per data stream
15
Q

How would you divide a hot shard?

A

Use shard splitting

16
Q

What is shard splitting?

A

It is the process of splitting an existing shard into 2 new shards. The old shard is closed to new writes and its data remains readable until it expires, while the 2 new shards take over new records
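
A split needs the hash key at which the second child shard starts. The sketch below picks the midpoint of the parent's hash key range, which is one common choice and not the only one. It assumes `client` is a boto3 Kinesis client and `shard` is a shard description as returned by ListShards; the function name is hypothetical.

```python
def split_shard_evenly(client, stream_name, shard):
    """Split a shard at the midpoint of its hash key range."""
    start = int(shard["HashKeyRange"]["StartingHashKey"])
    end = int(shard["HashKeyRange"]["EndingHashKey"])
    # Round up so the new starting key is always greater than `start`.
    midpoint = (start + end + 1) // 2
    client.split_shard(
        StreamName=stream_name,
        ShardToSplit=shard["ShardId"],
        NewStartingHashKey=str(midpoint),  # hash keys are passed as decimal strings
    )
    return midpoint
```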

17
Q

What is merging shards?

A

It combines 2 adjacent shards into 1, reducing the number of shards in the stream

18
Q

What happens when we merge shards?

A

The old shards are closed, then deleted once their data expires
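
Only shards with adjacent hash key ranges can be merged. The sketch below checks adjacency before calling the API. It assumes `client` is a boto3 Kinesis client and that `left` and `right` are shard descriptions as returned by ListShards; both function names are hypothetical.

```python
def are_adjacent(left, right):
    """True if `right` starts exactly where `left` ends in hash key space."""
    left_end = int(left["HashKeyRange"]["EndingHashKey"])
    right_start = int(right["HashKeyRange"]["StartingHashKey"])
    return left_end + 1 == right_start


def merge_if_adjacent(client, stream_name, left, right):
    """Merge two shards, skipping the call when they are not adjacent."""
    if not are_adjacent(left, right):
        return False
    client.merge_shards(
        StreamName=stream_name,
        ShardToMerge=left["ShardId"],
        AdjacentShardToMerge=right["ShardId"],
    )
    return True
```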

19
Q

Is auto scaling possible in Kinesis?

A

It's possible but not easy. You can use the UpdateShardCount API, but there are manual steps
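
A minimal sketch of one such manual step, doubling the open shard count through UpdateShardCount. It assumes `client` is a boto3 Kinesis client; the function name and the choice to double are illustrative, and the decision of when to scale is left to the caller.

```python
def double_shard_count(client, stream_name):
    """Look up the current open shard count and request twice as many."""
    summary = client.describe_stream_summary(StreamName=stream_name)
    current = summary["StreamDescriptionSummary"]["OpenShardCount"]
    target = current * 2
    client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target,
        ScalingType="UNIFORM_SCALING",
    )
    return target
```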

20
Q

What are 3 main limitations of Kinesis scaling?

A
  • Resharding cannot be done in parallel, so plan capacity in advance
  • You can only perform 1 resharding operation at a time and it takes a few seconds
  • For example, doubling 1,000 shards to 2,000 takes about 30,000 seconds, which is 8.3 hours
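
The 8.3-hour figure can be reproduced with simple arithmetic. The 30 seconds per split used below is an assumption inferred from the numbers on this card (30,000 seconds for 1,000 splits); the function name is hypothetical.

```python
def time_to_double(shard_count, seconds_per_split=30):
    """Sequential resharding: one split per existing shard, one at a time."""
    total_seconds = shard_count * seconds_per_split
    return total_seconds, total_seconds / 3600


seconds, hours = time_to_double(1000)
print(seconds, round(hours, 1))
```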
21
Q

What are 5 options for Kinesis Security?

A
  • Control access using IAM policies
  • Encryption in flight using HTTPS endpoints
  • Encryption at rest using KMS
  • Manually implemented client-side encryption
  • VPC endpoints to access within a VPC
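
As an illustration of the IAM option, the sketch below builds a read-only policy for a single stream. The set of actions is one reasonable choice for a classic consumer and not a definitive list; the function name and the ARN in the usage line are placeholders.

```python
import json


def read_only_consumer_policy(stream_arn):
    """Build an IAM policy document that allows reading one stream."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "kinesis:DescribeStreamSummary",
                    "kinesis:ListShards",
                    "kinesis:GetShardIterator",
                    "kinesis:GetRecords",
                ],
                "Resource": stream_arn,
            }
        ],
    }


# Placeholder ARN for illustration only.
policy = read_only_consumer_policy(
    "arn:aws:kinesis:us-east-1:123456789012:stream/example-stream"
)
print(json.dumps(policy, indent=2))
```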