Domain 4: Analysis Flashcards
RANDOM_CUT_FOREST
Kinesis Data Analytics SQL function for anomaly detection on numeric columns in a stream
Kinesis Firehose Buffer Limits
Buffer size: 1 to 128 MB
Buffer interval: 60 to 900 seconds
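A minimal sketch of how the two limits interact, assuming the ranges on this card and the usual Firehose behavior of flushing when either threshold is reached first. The function and constant names are hypothetical, not part of any AWS SDK.

```python
# Ranges taken from the flashcard; all names here are illustrative.
MIN_SIZE_MB, MAX_SIZE_MB = 1, 128
MIN_INTERVAL_S, MAX_INTERVAL_S = 60, 900


def valid_buffering_hints(size_mb: int, interval_s: int) -> bool:
    """Return True when both hints fall inside the ranges on the card."""
    return (MIN_SIZE_MB <= size_mb <= MAX_SIZE_MB
            and MIN_INTERVAL_S <= interval_s <= MAX_INTERVAL_S)


def should_flush(buffered_mb: float, elapsed_s: float,
                 size_mb: int, interval_s: int) -> bool:
    """Flush as soon as either the size or the time threshold is met."""
    return buffered_mb >= size_mb or elapsed_s >= interval_s
```

For example, with hints of 5 MB and 300 seconds, a quiet stream that has buffered only 1 MB still flushes once 300 seconds have elapsed.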
Kinesis Data Analytics Supported Sources
Kinesis Streams and Kinesis Firehose
Kinesis Data Analytics Supported Destinations
Kinesis Streams, Kinesis Firehose, Lambda
What happens if a record arrives late to a Kinesis Data Analytics application?
The record is written to the error stream
In what form does Kinesis Data Analytics provision capacity?
Kinesis Processing Units (KPUs)
How much memory is provided per KPU?
4 GB
What is the default limit on KPUs per Kinesis Data Analytics application?
8
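As a quick arithmetic check on the two KPU cards, here is a small sketch that assumes memory scales linearly with KPU count. The constant and function names are hypothetical.

```python
GB_PER_KPU = 4    # memory per KPU, from the card above
DEFAULT_KPUS = 8  # default KPU figure, from the card above


def total_memory_gb(kpus: int = DEFAULT_KPUS) -> int:
    """Total memory in GB, assuming linear scaling with KPU count."""
    return kpus * GB_PER_KPU


print(total_memory_gb())  # 32
```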
What is the name of the visualization tool in the Elastic Stack?
Kibana
Is Elasticsearch serverless?
No, you still have to scale servers
What should Elasticsearch NOT be used for?
- OLTP (RDS or DynamoDB instead)
- Ad-Hoc Querying (Athena instead)
How can data be imported to Elasticsearch?
Kinesis, DynamoDB, Logstash, Beats, Elasticsearch API
What query engine does Athena use?
Presto
What data formats does Athena support?
CSV, JSON, Parquet, ORC, Avro
Is Athena serverless?
Yes
Does Athena support unstructured data?
Yes
Which data formats are columnar?
ORC and Parquet
Which data formats are splittable?
ORC, Parquet, Avro
Which notebooks can Athena integrate with?
Jupyter, Zeppelin, RStudio
What is the cost rate for Athena?
$5 per TB scanned
Do cancelled queries count toward Athena charges?
Yes, for the data scanned before the query was cancelled
Do failed queries count toward Athena charges?
No
What data format will be the most cost effective in Athena?
Columnar (ORC, Parquet)
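A rough sketch of why columnar formats are cheaper at the $5 per TB rate: Athena bills by bytes scanned, and a columnar file lets a query read only the columns it needs. The table size and column counts below are made-up illustration values, a binary terabyte is assumed, and compression and any per-query minimums are ignored.

```python
PRICE_PER_TB_USD = 5.0     # rate from the card above
BYTES_PER_TB = 1024 ** 4   # assumption: binary terabyte


def athena_query_cost(bytes_scanned: int) -> float:
    """Approximate query cost in USD for the given bytes scanned."""
    return bytes_scanned / BYTES_PER_TB * PRICE_PER_TB_USD


# Hypothetical 1 TB table with 10 equally sized columns.
table_bytes = 1 * BYTES_PER_TB

# Row-based format (e.g. CSV): the whole table is scanned.
row_cost = athena_query_cost(table_bytes)

# Columnar format: a query touching 2 of 10 columns scans about 20%.
columnar_cost = athena_query_cost(table_bytes * 2 // 10)

print(f"row-based: ${row_cost:.2f}, columnar: ${columnar_cost:.2f}")
```

Under these assumptions the same query costs about $5 against the row-based table and about $1 against the columnar one.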
Does Athena charge for DDL processing?
No