What does OLAP stand for?
Online Analytical Processing
What does Columnar Data Storage optimize for?
read performance
It allows queries to read only the necessary columns and achieves better compression
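To make the read-performance point concrete, here is a minimal sketch in plain Python that compares how many values an aggregate touches in a row-oriented layout versus a column-oriented one. The table, field names, and helper functions are made up for illustration and do not reflect how any particular engine stores data.

```python
# Toy comparison of row-oriented and column-oriented layouts.
# The table and field names are hypothetical.
rows = [
    {"order_id": 1, "region": "EU", "amount": 40.0},
    {"order_id": 2, "region": "US", "amount": 25.0},
    {"order_id": 3, "region": "EU", "amount": 35.0},
]

# Column-oriented layout: one contiguous list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(table):
    """Sum `amount` while counting every field of every row as read."""
    values_read = 0
    total = 0.0
    for row in table:
        values_read += len(row)  # the whole row is loaded to reach one field
        total += row["amount"]
    return total, values_read


def total_amount_column_store(cols):
    """Sum `amount` while reading only the `amount` column."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


print(total_amount_row_store(rows))       # total plus number of values touched
print(total_amount_column_store(columns))
```

Both functions return the same total, but the column layout touches a third as many values here. Values within one column also share a type and tend to repeat, which is why columnar files usually compress better.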
When should a developer choose Amazon Redshift over traditional RDS databases?
For OLAP use cases
What are the two key performance technologies Redshift uses?
Massively Parallel Processing (MPP)
and
Columnar Data Storage
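As a rough analogy for Massively Parallel Processing, the sketch below splits a dataset across several workers, lets each compute a partial result, and then combines the partials. Threads stand in for compute nodes here, and the function names are hypothetical; this illustrates the divide-and-combine idea only, not Redshift's actual execution engine.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(values, slice_count):
    """Distribute values round-robin across `slice_count` slices."""
    return [values[i::slice_count] for i in range(slice_count)]


def parallel_sum(values, slice_count=4):
    """Each worker sums its own slice; a final step combines the partial sums."""
    slices = split_into_slices(values, slice_count)
    with ThreadPoolExecutor(max_workers=slice_count) as pool:
        partial_sums = list(pool.map(sum, slices))
    return sum(partial_sums)


print(parallel_sum(list(range(1, 101))))
```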
How can you run standard SQL queries against data stored in S3?
Amazon Athena
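A minimal sketch of submitting a query to Athena from Python with boto3's `start_query_execution`. The SQL text, database name, and S3 results location are hypothetical placeholders, and actually running `start_query` assumes boto3 is installed and AWS credentials are configured.

```python
def build_athena_request(sql, database, output_location):
    """Assemble keyword arguments for Athena's StartQueryExecution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def start_query(sql, database, output_location):
    """Submit the query; requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder above works without it

    client = boto3.client("athena")
    response = client.start_query_execution(
        **build_athena_request(sql, database, output_location)
    )
    return response["QueryExecutionId"]


request = build_athena_request(
    "SELECT region, SUM(amount) FROM orders GROUP BY region",
    "sales_db",                      # hypothetical database name
    "s3://example-bucket/results/",  # hypothetical results location
)
```

The call is asynchronous: it returns a query execution ID, and the results are written to the S3 output location once the query finishes.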
Is Amazon Athena serverless?
Yes
What is the primary format your data should be in for cost-effective querying with Athena?
Columnar Formats (like Parquet or ORC)
Athena charges only for the data each query scans
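Because the charge depends on bytes scanned, a rough estimate shows why columnar formats help. The sketch below assumes an illustrative rate of 5.00 USD per TB scanned and a hypothetical query that needs only a quarter of the table's columns; actual pricing varies by region and should be checked against the current AWS price list.

```python
BYTES_PER_TB = 1024 ** 4


def estimate_scan_cost(bytes_scanned, price_per_tb=5.0):
    """Estimate the cost in USD of a query that scans `bytes_scanned` bytes.

    `price_per_tb` is an assumed illustrative rate, not an authoritative price.
    """
    return bytes_scanned / BYTES_PER_TB * price_per_tb


# Row-oriented file (for example CSV): the query scans the full 1 TB.
full_scan_cost = estimate_scan_cost(BYTES_PER_TB)

# Columnar file: hypothetically only a quarter of the bytes are read.
columnar_scan_cost = estimate_scan_cost(BYTES_PER_TB // 4)

print(full_scan_cost, columnar_scan_cost)
```

Compression in Parquet or ORC typically reduces the scanned bytes further, which this estimate does not model.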
What is the best choice for developers needing a managed solution for full-text search, log aggregation, and real-time application monitoring?
Amazon OpenSearch Service
(previously called Amazon Elasticsearch Service)
What is the key difference between OpenSearch and a traditional database?
OpenSearch is a search and analytics engine designed for unstructured data; it prioritizes fast, flexible search over transactional ACID compliance
What does EMR stand for?
Elastic MapReduce
When should a developer use EMR instead of Redshift, Athena, or Lambda?
When the workload needs a customizable, complex big data framework (like Hadoop, Spark, or Hive)
Where is input data typically ingested for EMR?
S3
Where is output data usually written to from EMR?
S3
What is the serverless ETL service that AWS offers?
AWS Glue
What specialized component of AWS Glue is essential for services like Athena to understand the structure of data in S3?
The Glue Data Catalog
What AWS service simplifies setting up a secure data lake and centrally manages security and access permissions for data stored in S3?
AWS Lake Formation
What is the key benefit of Lake Formation’s security model?
It provides fine-grained access control (down to the column/row level) for various analytics services (Athena, Redshift, EMR) reading from the S3 data lake.
What is AWS’s Business Intelligence offering?
Amazon QuickSight