Data Warehousing Flashcards
(91 cards)
Is Redshift good for ELT?
Yes
Can a Lambda function be triggered by AWS IoT?
Yes
Can a Lambda function be triggered by Kinesis?
Yes
Can Apache Spark notebooks run on EMR?
Yes
Can Apache Spark read from S3?
Yes
Can Apache Zeppelin be used to visualize data in Amazon Redshift?
Yes
Is Redshift a columnar database?
Yes
Is Redshift MPP?
Yes
Is Redshift ANSI SQL Compliant?
Yes
In addition to data compression and columnar storage, how is I/O reduced in Redshift?
Zone maps. Each 1 MB block has a zone map: in-memory metadata that tracks the minimum and maximum values within the block. If a column (for example, a date column) is sorted, Redshift can use the zone maps to skip blocks that cannot contain the requested values, so the matching blocks are found faster. Amazon Redshift does not use indexes the way a conventional database does.
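A minimal sketch of how min/max metadata per block lets a scan skip blocks, and why sorting helps. This is a toy model in Python, not Redshift's implementation; the names `Block`, `make_blocks`, and `blocks_to_scan`, the two-value block size, and the sample dates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    values: List[int]  # column values stored in this block
    min_value: int     # zone map: smallest value in the block
    max_value: int     # zone map: largest value in the block


def make_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max for each."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def blocks_to_scan(blocks: List[Block], lo: int, hi: int) -> List[int]:
    """Return indexes of blocks whose [min, max] range overlaps [lo, hi]."""
    return [
        i for i, block in enumerate(blocks)
        if block.max_value >= lo and block.min_value <= hi
    ]


# Hypothetical date column stored as YYYYMMDD integers.
dates = [20240105, 20240301, 20240110, 20240215,
         20240120, 20240310, 20240201, 20240125]

unsorted_blocks = make_blocks(dates, block_size=2)
sorted_blocks = make_blocks(sorted(dates), block_size=2)

# Query: all rows in January 2024.
unsorted_hits = blocks_to_scan(unsorted_blocks, 20240101, 20240131)
sorted_hits = blocks_to_scan(sorted_blocks, 20240101, 20240131)
print(len(unsorted_hits), len(sorted_hits))
```

With the unsorted column every block's range overlaps January, so all four blocks must be read; with the sorted column only the first two qualify.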
Can Redshift Clusters be managed via API?
Yes
Does Redshift support ODBC and JDBC?
Yes
Describe the Redshift architecture.
One leader node that communicates with multiple compute nodes, which house the data.
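The leader/compute split can be pictured as a scatter-gather pattern. The sketch below is a toy model, not Redshift internals: the classes `LeaderNode` and `ComputeNode`, the modulo-based row distribution, and the sample `order_id`/`amount` rows are assumptions chosen only to show data living on compute nodes while the leader coordinates.

```python
class ComputeNode:
    """Holds a share of the rows and computes partial results locally."""

    def __init__(self):
        self.rows = []

    def partial_sum(self, column):
        return sum(row[column] for row in self.rows)


class LeaderNode:
    """Receives requests, fans them out to compute nodes, and merges results."""

    def __init__(self, node_count):
        self.nodes = [ComputeNode() for _ in range(node_count)]

    def load(self, rows, dist_key):
        # Assumption: rows are spread by a simple modulo on an integer key.
        for row in rows:
            target = row[dist_key] % len(self.nodes)
            self.nodes[target].rows.append(row)

    def total(self, column):
        # Each node sums its own rows; the leader only adds the partial sums.
        return sum(node.partial_sum(column) for node in self.nodes)


leader = LeaderNode(node_count=3)
orders = [{"order_id": i, "amount": i * 10} for i in range(1, 7)]
leader.load(orders, dist_key="order_id")

rows_per_node = [len(node.rows) for node in leader.nodes]
grand_total = leader.total("amount")
print(rows_per_node, grand_total)
```

The leader never stores rows itself in this model; it only routes work and combines partial answers, which is the point of the card.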
Does Redshift encrypt data at rest?
Yes, with AES-256
Does Amazon Redshift take care of key management?
Yes
Anti-Patterns for Redshift
Small datasets, OLTP, Unstructured data, BLOB data
What are the 2 methods used to send data to Kinesis Firehose?
PutRecord and PutRecordBatch
What is the max size for a Firehose PutRecord?
1,000 KB
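A small sketch of a client-side size check before calling PutRecord. It assumes the limit on the card is read as 1,000 KiB (1,024,000 bytes) and that the size is measured on the raw payload; the constant and function names are hypothetical.

```python
# Assumption: "1,000 KB" is interpreted as 1,000 KiB of raw payload.
MAX_PUT_RECORD_BYTES = 1000 * 1024


def fits_in_put_record(payload: bytes) -> bool:
    """Return True if the payload is within the assumed PutRecord size limit."""
    return len(payload) <= MAX_PUT_RECORD_BYTES


small = b"x" * 512
too_big = b"x" * (MAX_PUT_RECORD_BYTES + 1)
print(fits_in_put_record(small), fits_in_put_record(too_big))
```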
Kinesis Agent
A stand-alone Java application that can send data to Kinesis streams and Kinesis Firehose. It can be installed on Linux servers.
Can the Kinesis Agent monitor multiple files and write to multiple streams?
Yes
What is the max buffer size for Kinesis Firehose?
3 MB
Can Kinesis Firehose invoke a Lambda function?
Yes
Why should a record separator be added to Kinesis Stream data?
Kinesis stream bundles records together. If you don’t add a record separator, you can’t split the records later.
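A minimal sketch of the separator idea, using a newline after each JSON record. The helper names `with_separator` and `split_records` and the sample events are assumptions for illustration; the concatenation below only imitates records being bundled together.

```python
import json


def with_separator(record: dict) -> bytes:
    """Serialize a record and append a newline as the record separator."""
    # json.dumps escapes newlines inside strings, so "\n" is a safe delimiter.
    return (json.dumps(record) + "\n").encode("utf-8")


def split_records(blob: bytes) -> list:
    """Recover individual records from a newline-delimited bundle."""
    return [json.loads(line) for line in blob.decode("utf-8").splitlines() if line]


events = [{"id": 1}, {"id": 2}, {"id": 3}]

# Bundled with separators: the records can be split apart again.
bundled = b"".join(with_separator(e) for e in events)
recovered = split_records(bundled)

# Bundled without separators: the result is no longer parseable as one record.
no_separator = b"".join(json.dumps(e).encode("utf-8") for e in events)
try:
    json.loads(no_separator)
    parse_failed = False
except json.JSONDecodeError:
    parse_failed = True

print(recovered, parse_failed)
```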
What is the buffer size range for delivery to S3?
1 MB to 128 MB