Data Analytics Flashcards
(602 cards)
What does ETL stand for?
Extract, Transform, Load
What is the AWS alternative to Apache Kafka?
Amazon Kinesis
How is a Kinesis data stream divided?
Shards
Kinesis Streams retention period
- default: 24 hours
- up to 365 days
Can multiple applications consume the same stream in Kinesis?
YES
How is Kinesis Data Streams billed?
per shard provisioned
Size of Data Blob in Kinesis Streams
up to 1MB
Kinesis producer maximum write rate
1 MB/s or 1,000 records/s per shard
What error does a producer receive if it goes above the provisioned throughput?
ProvisionedThroughputExceededException
Two types of consumers in Kinesis Streams
- Consumer Classic
- Consumer Enhanced Fan-Out
What is Kinesis Agent?
Kinesis Agent is a stand-alone Java software application that offers an easy way to collect and send data to Kinesis Data Streams
What is a hot shard in Kinesis Streams?
Some shards in your Kinesis data stream might receive more records than others. This can lead to throttling errors in the stream, resulting in overworked shards, also known as hot shards.
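A minimal sketch of how a skewed partition key produces a hot shard. Kinesis maps each partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The sketch assumes the shards split the hash space evenly, which is not guaranteed after resharding. The names shard_for_key and records_per_shard and the tenant keys are hypothetical.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # partition keys are hashed into a 128-bit integer space


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash range contains the key.

    Assumption: shard_count shards that divide the hash space evenly.
    """
    hash_value = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    return hash_value * shard_count // HASH_SPACE


def records_per_shard(partition_keys, shard_count: int) -> Counter:
    """Count how many records land on each shard index."""
    return Counter(shard_for_key(key, shard_count) for key in partition_keys)


# A skewed workload: 90 of 100 records share a single partition key,
# so at least 90 records end up on the same shard.
skewed_keys = ["tenant-big"] * 90 + [f"tenant-{i}" for i in range(10)]
print(records_per_shard(skewed_keys, shard_count=4))
```

Because every record with the same partition key goes to the same shard, adding shards alone does not help here. The key itself has to be spread out.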
Potential solutions to ProvisionedThroughputExceeded
- retries with backoff
- increase shards (scaling)
- ensure the partition key is optimal
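The first solution above, retries with backoff, can be sketched as follows. This is an illustration under assumptions, not the SDK's own retry logic: ThroughputExceeded is a hypothetical stand-in for the service's throttling error, send is any callable that writes one record, and the delay values are arbitrary.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Hypothetical stand-in for the service's throttling error."""


def put_with_backoff(send, record, max_attempts=5, base_delay=0.1,
                     max_delay=5.0, sleep=time.sleep):
    """Call send(record), retrying with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return send(record)
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the delay ceiling on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many producers from retrying in lockstep.
            sleep(random.uniform(0, ceiling))
```

Passing sleep as a parameter keeps the function easy to exercise without real waiting.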
What is Kinesis Producer Library?
Easy-to-use and highly configurable C++/Java library that helps you write to a Kinesis data stream
Two types of API in KPL?
- Synchronous
- Asynchronous
What is the purpose of batching in Kinesis Producer Library?
increase throughput and decrease cost
Kinesis Producer Library two types of batching
- Aggregation
- Collection
What might be the effect of increasing RecordMaxBufferTime in KPL?
- additional processing delay
- higher packing efficiencies and better performance
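The trade-off above can be illustrated with a toy simulation. It is not the KPL's actual buffering algorithm. The functions batch_by_buffer_time and summarize, and the parameter names max_buffer_ms and max_batch_size, are hypothetical. A batch is flushed when its oldest record has waited max_buffer_ms or when it is full.

```python
def batch_by_buffer_time(arrival_times_ms, max_buffer_ms, max_batch_size=500):
    """Group sorted arrival times into (flush_time_ms, records) batches."""
    batches = []
    current = []
    for t in arrival_times_ms:
        # Flush the open batch if its deadline passed before this record arrived.
        if current and t >= current[0] + max_buffer_ms:
            batches.append((current[0] + max_buffer_ms, current))
            current = []
        current.append(t)
        # Flush immediately once the batch is full.
        if len(current) == max_batch_size:
            batches.append((t, current))
            current = []
    if current:
        batches.append((current[0] + max_buffer_ms, current))
    return batches


def summarize(batches):
    """Return (number of batches, worst delay of any record in ms)."""
    worst_delay = max(flush - records[0] for flush, records in batches)
    return len(batches), worst_delay


# One record every 10 ms for one second (100 records in total).
arrivals = range(0, 1000, 10)
print(summarize(batch_by_buffer_time(arrivals, max_buffer_ms=100)))  # (10, 100)
print(summarize(batch_by_buffer_time(arrivals, max_buffer_ms=500)))  # (2, 500)
```

A longer buffer time packs the same records into fewer requests, at the price of a larger worst-case delay.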
Can KPL be used if the application cannot tolerate additional delay?
No. Use the AWS SDK directly instead.
Maximum consumer (read) throughput per shard?
2 MB/s
Maximum producer (write) throughput per shard?
1MB/s
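A rough sizing sketch built from the per-shard limits on these cards (1 MB/s or 1,000 records/s for writes, 2 MB/s for reads). It assumes provisioned capacity, an even spread of partition keys, and standard consumers that share the read limit. The function name shards_needed is hypothetical.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # producer limit per shard, MB/s
WRITE_RECORDS_PER_SHARD = 1000  # producer limit per shard, records/s
READ_MB_PER_SHARD = 2.0         # shared consumer limit per shard, MB/s


def shards_needed(write_mb_per_s, records_per_s, read_mb_per_s):
    """Smallest shard count that satisfies every per-shard limit."""
    return max(
        math.ceil(write_mb_per_s / WRITE_MB_PER_SHARD),
        math.ceil(records_per_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_s / READ_MB_PER_SHARD),
        1,  # a stream always has at least one shard
    )


# 5 MB/s in, 3,000 records/s, and three standard consumers each reading
# the full 5 MB/s (15 MB/s of total reads): the read side dominates.
print(shards_needed(5, 3000, 15))  # 8
```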
When to use Kinesis Enhanced Fan-Out consumers?
- Multiple consumer applications for the same stream
- Low latency requirement (about 70 ms)
When to use standard Kinesis consumers?
- Low number of consuming applications (1 to 3) for the same stream
- Can tolerate about 200 ms latency
- Need to minimize cost
Default limit of consumers when using Enhanced Fan Out Kinesis Consumer
5