Chapter 5 - ANALYTICS: Amazon Athena, Amazon EMR, Amazon Kinesis, Amazon Redshift, AWS Glue, AWS Data Pipeline, Amazon QuickSight, AWS Lake Formation, Elasticsearch Flashcards

1
Q

Which AWS service will you use for real-time analytics of streaming data such as IoT telemetry data, application logs, and website clickstreams?

  1. Amazon Athena
  2. Amazon Kinesis
  3. Amazon Elasticsearch Service
  4. Amazon QuickSight
A
  2. Amazon Kinesis
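A producer application forwards such events to a Kinesis data stream with `PutRecord`. A minimal sketch of assembling that request with the boto3 parameter shape (the stream name and event payload are hypothetical):

```python
import json

def build_put_record(stream_name, event, partition_key):
    """Assemble keyword arguments for a Kinesis PutRecord call.

    The actual call would be boto3.client("kinesis").put_record(**kwargs);
    here we only build the request so its shape is easy to inspect.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),  # payload must be bytes
        "PartitionKey": partition_key,              # determines the target shard
    }

# Hypothetical clickstream event
kwargs = build_put_record(
    "clickstream",                       # assumed stream name
    {"page": "/home", "user": "u-42"},
    "u-42",
)
```

Keying the partition on a user or session identifier keeps one user's events ordered within a single shard.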
2
Q

Which of the following are Kinesis services? Choose 4.

  1. Kinesis Video Streams
  2. Kinesis Data Streams
  3. Kinesis Data Firehose
  4. Kinesis QuickSight
  5. Kinesis Data Analytics
A
  1. Kinesis Video Streams
  2. Kinesis Data Streams
  3. Kinesis Data Firehose
  5. Kinesis Data Analytics
3
Q

You want to collect log and event data from sources such as servers, desktops, and mobile devices and then have a custom application continuously process the data, generate metrics, power live dashboards, and emit aggregated data into stores such as Amazon S3. Which main AWS service will you use?

  1. Kinesis Data Streams
  2. Kinesis Data Firehose
  3. Kinesis Video Streams
  4. Kinesis Data Analytics
A
  1. Kinesis Data Streams
4
Q

Which of the following are ideal use cases for Kinesis Data Streams? Choose 3.

  1. Real time data analytics
  2. Long term data storage and analytics
  3. Log and data feed intake and processing
  4. Real time metrics and reporting
  5. ETL Batch jobs
A
  1. Real time data analytics
  3. Log and data feed intake and processing
  4. Real time metrics and reporting
5
Q

What are features of Amazon Redshift? Choose 3.

  1. Fully managed data warehouse service.
  2. Allows you to run complex analytic queries against petabytes of structured data using sophisticated query optimization, columnar storage on high-performance storage, and massively parallel query execution.
  3. Also includes Amazon Athena, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3 data lakes
  4. Also includes Amazon Redshift Spectrum, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3 data lakes
  5. Fully managed data lake service.
A
  1. Fully managed data warehouse service.
  2. Allows you to run complex analytic queries against petabytes of structured data using sophisticated query optimization, columnar storage on high-performance storage, and massively parallel query execution.
  4. Also includes Amazon Redshift Spectrum, allowing you to directly run SQL queries against exabytes of unstructured data in Amazon S3 data lakes.
6
Q

You are working as a solutions architect for a financial services company that is planning to create a new data warehouse solution leveraging Amazon Redshift. The raw data will first be exported to S3 and an EMR cluster and then copied into Redshift. The query results will be exported to another S3 data lake. How can you ensure that all data exchange (COPY, UNLOAD) between Redshift and other AWS resources does not traverse the internet, while also leveraging VPC security and monitoring features?

  1. Use AWS Glue to copy and upload data to Redshift cluster
  2. Use AWS Data pipeline to copy and upload data to Redshift cluster
  3. Enable enhanced VPC routing on your Redshift cluster
  4. Enable VPC flow logs on your Redshift cluster
A
  3. Enable enhanced VPC routing on your Redshift cluster
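Enhanced VPC routing can be toggled on an existing cluster via the `ModifyCluster` API. A hedged sketch of building the boto3 call parameters (the cluster identifier is hypothetical):

```python
def modify_cluster_kwargs(cluster_id, enable=True):
    """Arguments for redshift.modify_cluster that force COPY/UNLOAD
    traffic through the VPC instead of over the public internet."""
    return {
        "ClusterIdentifier": cluster_id,
        "EnhancedVpcRouting": enable,
    }

kwargs = modify_cluster_kwargs("sales-dw")  # hypothetical cluster name
# boto3.client("redshift").modify_cluster(**kwargs) would apply it
```

With enhanced VPC routing on, COPY/UNLOAD traffic is subject to the VPC's security groups, route tables, and VPC Flow Logs.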
7
Q

Which AWS service will you use for business analytics dashboards and visualizations?

  1. Amazon Athena
  2. Amazon EMR
  3. Amazon Elasticsearch Service
  4. Amazon QuickSight
A
  4. Amazon QuickSight
8
Q

You are the solutions architect for a national retail chain with stores in major cities. Each store uses an on-premises application for sales transactions. At the end of the day, at 11 pm, data from each store should be uploaded to Amazon storage, amounting to more than 30 TB of data; the data should then be processed in Hadoop and the results stored in a data warehouse. Which combination of AWS services will you use?

  1. Amazon Data Pipeline, Amazon S3, Amazon EMR, Amazon DynamoDB
  2. Amazon Data Pipeline, Amazon Elastic Block Storage, Amazon S3, Amazon EMR, Amazon Redshift
  3. Amazon Data Pipeline, Amazon S3, Amazon EMR, Amazon Redshift
  4. Amazon Data Pipeline, Amazon Kinesis, Amazon S3, Amazon EMR, Amazon Redshift, Amazon EC2
A
  3. Amazon Data Pipeline, Amazon S3, Amazon EMR, Amazon Redshift
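The final step of such a pipeline is typically a Redshift COPY that loads the processed S3 output in parallel. A minimal sketch of assembling the statement (the table, bucket prefix, and IAM role are hypothetical):

```python
def build_copy_statement(table, s3_path, iam_role):
    """Assemble a Redshift COPY statement that loads S3 data in parallel
    across the cluster's slices."""
    return (
        f"COPY {table} "
        f"FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS PARQUET;"
    )

sql = build_copy_statement(
    "daily_sales",
    "s3://example-bucket/sales/2024-01-15/",            # hypothetical prefix
    "arn:aws:iam::123456789012:role/RedshiftCopyRole",  # hypothetical role
)
```

Pointing COPY at a prefix rather than a single file lets Redshift split the load across multiple files for parallelism.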
9
Q

Which AWS Analytics service gives you the ability to process nearly unlimited streams of data?

  1. Amazon Kinesis Streams
  2. Amazon Kinesis Firehose
  3. Amazon EMR
  4. Amazon Redshift
A
  1. Amazon Kinesis Streams
10
Q

Which of the following are scenarios where Amazon QuickSight cannot be used?

  1. Highly formatted canned Reports
  2. Quick interactive ad-hoc exploration and optimized visualization of data. Create and share dashboards and KPI’s to provide insight into your data
  3. Analyze and visualize data in various AWS resources, e.g., Amazon RDS databases, Amazon Redshift, Amazon Athena, and Amazon S3.
  4. Analyze and visualize data from on premise databases like SQL Server, Oracle, PostgreSQL, and MySQL
  5. Analyze and visualize data in data sources that can be connected to using JDBC/ODBC connection.
A
  1. Highly formatted canned reports
11
Q

Which of the following AWS services can you leverage to analyze logs for customer-facing applications and websites? Choose 2.

  1. Amazon S3
  2. Amazon Elasticsearch
  3. Amazon Athena
  4. Amazon CloudWatch
A
  2. Amazon Elasticsearch
  3. Amazon Athena
12
Q

Which AWS service will you use for data warehouse and analytics requirements?

  1. DynamoDB
  2. Aurora
  3. Redshift
  4. S3
A
  3. Redshift
13
Q

Which AWS database service will you choose for Online Analytical Processing (OLAP)?

  1. Amazon RDS
  2. Amazon Redshift
  3. Amazon Glacier
  4. Amazon DynamoDB
A
  2. Amazon Redshift
14
Q

Which AWS service reduces the complexity and upfront costs of setting up Hadoop by providing a fully managed, on-demand Hadoop framework?

  1. Amazon Redshift
  2. Amazon Kinesis
  3. Amazon EMR
  4. Amazon Hadoop
A
  3. Amazon EMR
15
Q

Which of the following use cases is not well suited for Amazon EMR?

  1. Log processing and analytics
  2. Large extract, transform, and load (ETL) data movement
  3. Ad targeting and click stream analytics
  4. Genomics, Predictive analytics, Ad hoc data mining and analytics
  5. Small Data Set and ACID transaction requirements
  6. Risk modeling and threat analytics
A
  5. Small data sets and ACID transaction requirements
16
Q

You want to do clickstream analysis of a website to detect user behavior by analyzing the sequence of clicks a user makes, the amount of time the user spends, where they usually begin the navigation, and how it ends. By tracking this user behavior in real time, you want to update recommendations, perform advanced A/B testing, push notifications based on session length, and much more. Which AWS services will you use to ingest the captured clickstream data and analyze the sessions?

  1. Data ingestion: Kinesis Data Streams; sessionization analytics: Kinesis Data Analytics
  2. Data ingestion: Kinesis Data Firehose; sessionization analytics: Kinesis Data Analytics
  3. Data ingestion: Kinesis Data Streams; sessionization analytics: AWS Glue
  4. Data ingestion: Kinesis Data Streams; sessionization analytics: Amazon EMR
A
  1. Data ingestion: Kinesis Data Streams; sessionization analytics: Kinesis Data Analytics
17
Q

Which Kinesis service is integrated with Amazon S3, Amazon Redshift, and Amazon Elasticsearch Service?

  1. Kinesis Data Streams
  2. Kinesis Data Firehose
  3. Kinesis Quicksight
  4. Kinesis Data Analytics
A
  2. Kinesis Data Firehose
18
Q

Your company has recently migrated on-premises applications to AWS, deploying them in VPCs. For proactive monitoring and audit purposes, they want to continuously analyze the CloudTrail event logs to collect operational metrics in real time. For example:

  • Total calls by IP, service, API call, IAM user
  • Amazon EC2 API failures (or any other service)
  • Anomalous behavior of Amazon EC2 API (or any other service)
  • Top 10 API calls across all services

Which AWS services will you use?

  1. S3, Kinesis Data Analytics, Lambda, DynamoDB
  2. EC2, S3, Kinesis Data Analytics, DynamoDB
  3. EC2, S3, Kinesis Data Analytics, Lambda, DynamoDB
  4. Kinesis Data Firehose, S3, Kinesis Data Analytics, Lambda, DynamoDB
A
  4. Kinesis Data Firehose, S3, Kinesis Data Analytics, Lambda, DynamoDB
19
Q

Which of the following are features of the EMR HDFS file system? Choose 4.

  1. It is a distributed, scalable, and portable file system for Hadoop. HDFS is an implementation of the Hadoop FileSystem API, which models POSIX file system behavior.
  2. It allows clusters to store data in Amazon S3.
  3. Instance store and/or EBS volume storage is used for HDFS data.
  4. Amazon EBS volumes attached to EMR clusters are ephemeral: the volumes are deleted upon cluster and instance termination.
  5. HDFS is common choice for persistent clusters.
  6. HDFS is common choice for transient clusters.
A
  1. It is a distributed, scalable, and portable file system for Hadoop. HDFS is an implementation of the Hadoop FileSystem API, which models POSIX file system behavior.
  3. Instance store and/or EBS volume storage is used for HDFS data.
  4. Amazon EBS volumes attached to EMR clusters are ephemeral: the volumes are deleted upon cluster and instance termination.
  5. HDFS is a common choice for persistent clusters.
20
Q

Scenario for Q20-Q21. ABC Tolls operates toll highways throughout the country. Customers that register with ABC Tolls receive a transceiver for their automobile. When a customer drives through a tolling area, a sensor receives information from the transceiver and records details of the transaction to a relational database. Their current solution stores records in a file system as part of a batch process. ABC Tolls has a traditional batch architecture: each day, a scheduled extract-transform-load (ETL) process runs that processes the daily transactions and transforms them so they can be loaded into their Amazon Redshift data warehouse. The next day, the ABC Tolls business analysts review the data using a reporting tool. In addition, once a month (at the end of the billing cycle) another process aggregates all the transactions for each of the ABC Tolls customers to calculate their monthly payment. ABC Tolls would like to make some modifications to its system. Q20-Q21 are each specific to one requirement.

The first requirement comes from the business analyst team. They have asked for the ability to run reports from their data warehouse with data that is no older than 30 minutes. The ABC Tolls engineering team determines that their current architecture needs some modifications to support this requirement. They have decided to build a streaming data ingestion and analytics system. Which of the following statements are correct to meet this requirement? Choose 2.

  1. Create a Kinesis Firehose delivery stream and configure it so that it would copy data to their Amazon Redshift table every 15 minutes.
  2. Use the Amazon Kinesis Agent on servers to forward their data to Kinesis Firehose.
  3. Create a Kinesis data stream and configure it so that it would copy data to their Amazon Redshift table every 15 minutes.
  4. Use the Amazon Kinesis Agent on servers to forward their data to Kinesis data stream.
A
  1. Create a Kinesis Firehose delivery stream and configure it so that it would copy data to their Amazon Redshift table every 15 minutes.
  2. Use the Amazon Kinesis Agent on servers to forward their data to Kinesis Firehose.
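The 15-minute cadence maps to the buffering hints Firehose applies to its S3 staging area before issuing each Redshift COPY. A sketch of building that configuration fragment (the 64 MB size is an assumed value, and the interval limits reflect the documented 60-900 second range):

```python
def buffering_hints(minutes, size_mb=64):
    """BufferingHints for the S3 staging area of a Firehose -> Redshift
    delivery stream; Firehose issues a COPY after each buffer flush."""
    seconds = minutes * 60
    if not 60 <= seconds <= 900:
        raise ValueError("IntervalInSeconds must be between 60 and 900")
    return {"IntervalInSeconds": seconds, "SizeInMBs": size_mb}

hints = buffering_hints(15)  # flush (and COPY) every 15 minutes
```

Firehose delivers when either the interval or the size threshold is reached first, so data can arrive sooner than 15 minutes under heavy load.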
21
Q

Scenario for Q20-Q21: same ABC Tolls scenario as described in Q20 above.

ABC Tolls is also developing a new mobile application for its customers. While developing the application, they decided to create some new features. One feature will give customers the ability to set a spending threshold for their account. If a customer’s cumulative toll bill surpasses this threshold, ABC Tolls wants to send an in-application message to notify the customer that the threshold has been breached, within 10 minutes of the breach occurring. Which AWS services will they use to deliver this feature so that the solution is scalable and cost-effective?

  1. Kinesis Analytics, Kinesis Streams, EC2, SNS, DynamoDB
  2. Kinesis Analytics, Kinesis Streams, Lambda, SQS, DynamoDB
  3. Kinesis Analytics, Kinesis Streams, EC2, SQS, DynamoDB
  4. Kinesis Analytics, Kinesis Streams, Lambda, SNS, DynamoDB
A
  4. Kinesis Analytics, Kinesis Streams, Lambda, SNS, DynamoDB
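The core of the feature is a running total compared against the customer's limit. The stream-side logic reduces to something like the sketch below, where the in-memory dict stands in for the DynamoDB item and the breach flag would drive the SNS publish (all names are hypothetical):

```python
def process_toll(totals, customer_id, amount, threshold):
    """Accumulate a customer's toll spend and report whether this
    transaction pushes the cumulative total past the threshold.

    `totals` stands in for a DynamoDB table keyed by customer; a breach
    would be published to SNS to drive the in-app notification.
    """
    before = totals.get(customer_id, 0.0)
    after = before + amount
    totals[customer_id] = after
    breached_now = before <= threshold < after  # fires only on the crossing
    return after, breached_now

totals = {}
process_toll(totals, "cust-1", 8.0, 10.0)              # still under threshold
_, breach = process_toll(totals, "cust-1", 3.0, 10.0)  # crosses 10.0 -> notify
```

Flagging only the transaction that crosses the threshold avoids sending a notification on every subsequent toll.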
22
Q

You want to leverage Amazon Web Services to build an end-to-end log analytics solution that collects, ingests, processes, and loads both batch and streaming data, and makes the processed data available to your users in near real time, in the analytics systems they are already using. The solution should be highly reliable and cost-effective, scale automatically to varying data volumes, and require almost no IT administration. It should be extensible to use cases such as analyzing log data from websites, mobile devices, servers, sensors, and more, for applications such as digital marketing, application monitoring, fraud detection, ad tech, gaming, and IoT. Which services will you use to build this log analytics solution?

  1. Kinesis Firehose, Kinesis Analytics, S3, Elasticsearch, Kibana
  2. Kinesis Firehose, Kinesis Analytics, RDS Aurora, Elasticsearch, Kibana
  3. Kinesis Firehose, Kinesis Analytics, DynamoDB, Elasticsearch, Kibana
  4. EC2, Lambda, S3, Elasticsearch, Kibana
A
  1. Kinesis Firehose, Kinesis Analytics, S3, Elasticsearch, Kibana
23
Q

You want to leverage AWS native components for clickstream analytics: collecting, analyzing, and reporting aggregate data about which webpages someone visits on your website and in what order. The solution should provide these capabilities:

  • Streaming data ingestion that can process millions of website clicks (clickstream data) a day from global websites.
  • Near real-time visualizations and recommendations, with web usage metrics that include events per hour, visitor count, web/HTTP user agents (e.g., a web browser), abnormal events, aggregate event count, referrers, and recent events.
  • A recommendation engine built on a data warehouse.
  • Analysis and visualizations of your clickstream data, both real-time and analytical.

Which AWS native services will you use to build this solution?

  1. Amazon IoT Core, Amazon Elasticsearch, Amazon S3, Amazon RDS, Amazon Redshift, Amazon QuickSight, Amazon Athena
  2. Amazon Kinesis Data Firehose, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon QuickSight, Amazon Athena
  3. Amazon IoT Core, Amazon Elasticsearch, Amazon S3, Amazon DynamoDB, Amazon Redshift, Amazon QuickSight, Amazon Athena
  4. Amazon EC2, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon QuickSight, Amazon Athena
A
  2. Amazon Kinesis Data Firehose, Amazon Elasticsearch, Amazon S3, Amazon Redshift, Amazon QuickSight, Amazon Athena
24
Q

Which of the following are features of the EMR EMRFS file system? Choose 3.

  1. EMRFS is common choice for persistent clusters.
  2. EMRFS is common choice for transient clusters.
  3. Is an implementation of the Hadoop file system used for reading and writing regular files from Amazon EMR directly to Amazon S3.
  4. Provides the convenience of storing persistent data in Amazon S3 for use with Hadoop while also providing features like Amazon S3 server-side encryption, read-after-write consistency, and list consistency.
A
  2. EMRFS is a common choice for transient clusters.
  3. Is an implementation of the Hadoop file system used for reading and writing regular files from Amazon EMR directly to Amazon S3.
  4. Provides the convenience of storing persistent data in Amazon S3 for use with Hadoop while also providing features like Amazon S3 server-side encryption, read-after-write consistency, and list consistency.
25
Q

Which AWS service is a fully managed, serverless ETL (extract, transform, and load) service that makes it simple and cost-effective to categorize your data, clean it, enrich it, and move it reliably between various data stores?

  1. AWS EMR
  2. AWS Data Pipeline
  3. AWS Glue
  4. AWS Data Migration Service
A
  3. AWS Glue
26
Q

Which of the following is not a use case for using AWS Glue?

  1. To build a data warehouse to organize, cleanse, validate, and format data.
  2. Run serverless queries against your Amazon S3 data lake.
  3. Create event-driven ETL pipelines with AWS Glue.
  4. Create business intelligence dashboard on top of data warehouse.
A
  4. Create business intelligence dashboard on top of data warehouse.
27
Q

You are using Kinesis Data Streams and Kinesis Data Analytics for ingestion and real-time clickstream analytics of a newly launched website. The solution worked fine while there were limited users, but as the popularity of the website increased you are observing performance issues and exceptions in the logs. What should you do to improve performance?

  1. Increase the number of shards in the Kinesis stream.
  2. Decrease the number of shards in the Kinesis stream.
  3. Replace the Kinesis data stream with Kinesis Data Firehose.
  4. Replace the Kinesis data stream with Lambda.
A
  1. Increase the number of shards in the Kinesis stream.
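Resharding decisions can be sized from the per-shard write limits of 1 MB/s and 1,000 records/s. A small estimator as a sketch (the example traffic numbers are hypothetical):

```python
import math

def shards_needed(mb_per_sec, records_per_sec):
    """Minimum open shards for a given ingest rate, using the
    per-shard write limits of 1 MB/s and 1,000 records/s."""
    by_bytes = math.ceil(mb_per_sec / 1.0)        # 1 MB/s per shard
    by_records = math.ceil(records_per_sec / 1000.0)  # 1,000 records/s per shard
    return max(1, by_bytes, by_records)

# e.g. 5 MB/s of clicks arriving at 12,000 records/s needs 12 shards
```

Whichever limit is hit first (bytes or record count) dictates the shard count, which is why throughput exceptions can appear even well below 1 MB/s per shard.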
28
Q

You are building an ETL solution for daily sales report analysis. All the regional headquarters in the country upload their sales data between 7 pm and 11 pm to an S3 bucket. Upon upload, each file should be transformed and loaded into a data warehouse. Which services will you use to design this solution in the most cost-effective way? Choose 2.

  1. Configure S3 event notification to trigger a lambda function which will kick start ETL job whenever a file is uploaded.
  2. Use AWS Glue for ETL and Redshift for Data warehouse
  3. Use AWS Data Pipeline for ETL and Redshift for Data warehouse
  4. Use AWS Glue for ETL and Amazon EMR for Data warehouse
A
  1. Configure S3 event notification to trigger a lambda function which will kick start ETL job whenever a file is uploaded.
  2. Use AWS Glue for ETL and Redshift for Data warehouse
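The event-driven trigger can be a small Lambda that reads the S3 event notification and starts a Glue job run per uploaded file. A sketch in which the job name (`sales-etl`) and the `--source_path` job argument are assumptions, with the Glue client injected so the logic runs without AWS credentials:

```python
def extract_uploads(s3_event):
    """Pull (bucket, key) pairs out of an S3 event notification payload."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in s3_event.get("Records", [])
    ]

def handler(event, context=None, glue=None):
    """Lambda entry point: start one Glue job run per uploaded file.

    `glue` would normally be boto3.client("glue"); it is injected here
    so the routing logic can be exercised locally.
    """
    runs = []
    for bucket, key in extract_uploads(event):
        args = {"--source_path": f"s3://{bucket}/{key}"}  # assumed job argument
        if glue is not None:
            glue.start_job_run(JobName="sales-etl", Arguments=args)
        runs.append(args)
    return runs
```

Because both Lambda and Glue are serverless, nothing runs (or bills compute) outside the 7-11 pm upload window.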
29
Q

You are launching an Amazon Redshift cluster in a virtual private cloud (VPC). What information do you need to provide apart from the VPC ID?

  1. Redshift Subnet Group
  2. DW Subnet Group
  3. Cluster Subnet Group
  4. DB Subnet Group
A
  3. Cluster Subnet Group
30
Q

What are the differences between AWS Glue and AWS Data Pipeline? Choose 3.

  1. AWS Glue provides a managed ETL service that runs on a serverless Apache Spark environment. AWS Data Pipeline provides a managed orchestration service that gives you greater flexibility in terms of the execution environment, access and control over the compute resources that run your code, as well as the code itself that does data processing.
  2. AWS Data Pipeline, you don’t have to worry about configuring and managing the underlying compute resources. AWS Glue launches compute resources in your account allowing you direct access to the Amazon EC2 instances or Amazon EMR clusters.
  3. AWS Glue, you don’t have to worry about configuring and managing the underlying compute resources. AWS Data Pipeline launches compute resources in your account allowing you direct access to the Amazon EC2 instances or Amazon EMR clusters.
  4. AWS Glue ETL jobs are Scala or Python based. AWS Data Pipeline, you can setup heterogeneous set of jobs that run on a variety of engines like Hive, Pig, etc.
  5. AWS Data Pipeline provides a managed ETL service that runs on a serverless Apache Spark environment. AWS Glue provides a managed orchestration service that gives you greater flexibility in terms of the execution environment, access and control over the compute resources that run your code, as well as the code itself that does data processing.
A
  1. AWS Glue provides a managed ETL service that runs on a serverless Apache Spark environment. AWS Data Pipeline provides a managed orchestration service that gives you greater flexibility in terms of the execution environment, access and control over the compute resources that run your code, as well as the code itself that does data processing.
  3. With AWS Glue, you don’t have to worry about configuring and managing the underlying compute resources. AWS Data Pipeline launches compute resources in your account, allowing you direct access to the Amazon EC2 instances or Amazon EMR clusters.
  4. AWS Glue ETL jobs are Scala or Python based. With AWS Data Pipeline, you can set up a heterogeneous set of jobs that run on a variety of engines like Hive, Pig, etc.
31
Q

Which of the following are components of AWS Data Pipeline? Choose 3.

  1. Pipeline definition
  2. Pipeline
  3. Data Catalog
  4. Task Runner
A
  1. Pipeline definition
  2. Pipeline
  4. Task Runner
32
Q

Which of the following are not components of AWS Glue? Choose 2.

  1. AWS Glue Data Node
  2. AWS Glue Console
  3. AWS Glue Data Catalog
  4. AWS Glue Job Scheduler
  5. AWS Glue Crawlers and Classifiers
  6. AWS Glue ETL Operations
  7. AWS Glue Jobs System
A
  1. AWS Glue Data Node
  4. AWS Glue Job Scheduler
33
Q

Which of the following are features of Amazon Athena? Choose 3.

  1. Is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
  2. Query jobs are executed on a clusters of EC2 instances.
  3. Is serverless, so there is no infrastructure to setup or manage, and you can start analyzing data immediately.
  4. Uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Apache Parquet and Avro.
  5. Is an interactive query service that makes it easy to analyze data in Amazon RDS using standard SQL
A
  1. Is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL.
  3. Is serverless, so there is no infrastructure to set up or manage, and you can start analyzing data immediately.
  4. Uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Apache Parquet, and Avro.
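An Athena interaction through boto3 is asynchronous: you submit the query with `start_query_execution` and fetch results later. A sketch of assembling those parameters (the database, query, and results bucket are hypothetical):

```python
def athena_query_kwargs(sql, database, output_s3):
    """Arguments for athena.start_query_execution; results land in the
    S3 output location and are retrieved later with get_query_results."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_s3},
    }

kwargs = athena_query_kwargs(
    "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    "weblogs",                       # assumed Glue/Athena database
    "s3://example-athena-results/",  # assumed results bucket
)
# boto3.client("athena").start_query_execution(**kwargs) would submit it
```

There is no cluster to size or start; the only infrastructure-like choice is where the result files are written.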
34
Q

Which of the following data sources is not supported by Amazon QuickSight?

  1. Amazon RDS, Amazon Aurora, Amazon Redshift
  2. Amazon Athena and Amazon S3
  3. Excel spreadsheets or flat files (CSV, TSV, CLF, and ELF)
  4. EBS and EFS
  5. Connect to on-premises databases like SQL Server, MySQL and PostgreSQL
  6. Import data from SaaS applications like Salesforce
A
  4. EBS and EFS