Amazon Simple Storage Service (S3) | Query in Place Flashcards

1
Q

Can I tier objects from Standard - IA to Amazon Glacier?

Query in Place

Amazon Simple Storage Service (S3) | Storage

A

Yes. In addition to using lifecycle policies to migrate objects from Standard to Standard - IA, you can also set up lifecycle policies to tier objects from Standard - IA to Amazon Glacier.
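As a rough illustration of the answer above, the two-step tiering can be written as one lifecycle rule with two transitions. This is a minimal sketch, not an official recipe: the function names, the rule ID, the `logs/` prefix, and the 30/90-day thresholds are all assumptions chosen for the example. Only `put_bucket_lifecycle_configuration` is a real boto3 call, and it is invoked only when you call `apply_lifecycle` yourself.

```python
def build_tiering_rule(prefix, ia_after_days=30, glacier_after_days=90):
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier.

    The day counts are illustrative defaults, not values from the flashcard.
    """
    if glacier_after_days <= ia_after_days:
        raise ValueError("Glacier transition must come after the Standard-IA transition")
    return {
        "Rules": [
            {
                "ID": "tier-to-ia-then-glacier",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


def apply_lifecycle(bucket, config):
    """Send the configuration to S3 (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )
```

Calling `build_tiering_rule("logs/")` only builds the dictionary, so the rule can be inspected before it is applied to a bucket.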

2
Q

What is “Query in Place” functionality?

A

Amazon S3 allows customers to run sophisticated queries against stored data without needing to extract, transform, and load (ETL) it into a separate analytics platform. Querying this data in place on Amazon S3 can significantly increase performance and reduce cost for analytics solutions that use S3 as a data lake. S3 offers multiple query-in-place options, including S3 Select, Amazon Athena, and Amazon Redshift Spectrum, so you can choose the one that best fits your use case. You can even use Amazon S3 Select with AWS Lambda to build serverless apps that take advantage of the in-place processing capabilities provided by S3 Select.

3
Q

What is S3 Select?

A

S3 Select is an Amazon S3 feature (currently in Preview) that makes it easy to retrieve specific data from the contents of an object using simple SQL expressions without having to retrieve the entire object. You can use S3 Select to retrieve a subset of data using SQL clauses, like SELECT and WHERE, from delimited text files and JSON objects in Amazon S3.
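To make the SELECT/WHERE idea concrete, here is a small sketch of the request parameters for a query against a CSV object. The bucket, key, and column names are made up for the example, and `build_select_request` is a hypothetical helper. The parameter names follow boto3's `select_object_content` call, which the sketch itself does not invoke.

```python
def build_select_request(bucket, key, expression):
    """Assemble keyword arguments for boto3's S3 select_object_content call.

    Assumes the object is a CSV file whose first row holds column names.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": expression,
        # "USE" lets the SQL expression refer to columns by header name.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


# Hypothetical bucket, key, and columns, for illustration only.
request = build_select_request(
    "example-bucket",
    "data/orders.csv",
    "SELECT s.order_id, s.total FROM S3Object s WHERE s.country = 'DE'",
)

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("s3").select_object_content(**request)
```

Only the rows and columns matching the expression would come back, rather than the whole object.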

4
Q

What can I do with S3 Select?

A

You can use S3 Select to retrieve a smaller, targeted data set from an object using simple SQL statements. You can use S3 Select with AWS Lambda to build serverless applications that retrieve data from Amazon S3 efficiently, instead of retrieving and processing the entire object. You can also use S3 Select with Big Data frameworks, such as Presto, Apache Hive, and Apache Spark, to scan and filter the data in Amazon S3.
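The Lambda pattern mentioned above might look roughly like the sketch below. It assumes a JSON Lines object and a hypothetical event shape with `bucket`, `key`, and `expression` fields, neither of which comes from the flashcard. The `s3_client` parameter is an addition that allows a stand-in client to be passed during local testing. The response handling reflects how boto3 returns S3 Select results as a stream of events.

```python
def collect_records(event_stream):
    """Join the Records payloads from an S3 Select event stream into text."""
    chunks = []
    for event in event_stream:
        # The stream also carries Stats and End events, which are skipped here.
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    # Decode once at the end so a character split across chunks stays intact.
    return b"".join(chunks).decode("utf-8")


def handler(event, context, s3_client=None):
    """Lambda-style entry point; the event fields are assumed, not standard."""
    if s3_client is None:
        import boto3  # available in the Lambda Python runtime

        s3_client = boto3.client("s3")
    response = s3_client.select_object_content(
        Bucket=event["bucket"],
        Key=event["key"],
        ExpressionType="SQL",
        Expression=event["expression"],
        InputSerialization={"JSON": {"Type": "LINES"}},
        OutputSerialization={"JSON": {}},
    )
    return {"rows": collect_records(response["Payload"]).splitlines()}
```

Because filtering happens inside S3, the function only receives the matching records, which keeps its memory use and run time small.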

5
Q

Why should I use S3 Select?

A

S3 Select provides a new way to retrieve specific data using SQL statements from the contents of an object stored in Amazon S3 without having to retrieve the entire object. S3 Select simplifies scanning and filtering the contents of objects into a smaller, targeted dataset, and can improve the performance of doing so by up to 400%. With S3 Select, you can also perform operational investigations on log files in Amazon S3 without the need to operate or manage a compute cluster.

6
Q

How do I get started with S3 Select?

A

Amazon S3 Select is currently available in Limited Preview. To apply for access to this Preview, complete the Amazon S3 Select Preview Application Form. During the Preview, you can use Amazon S3 Select through the available Presto connector, with AWS Lambda, or from any other application using the S3 Select SDK for Java or Python.

7
Q

What is Amazon Athena?

A

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to set up or manage, and you can start analyzing data immediately. You don’t even need to load your data into Athena; it works directly with data stored in S3. To get started, just log in to the Athena Management Console, define your schema, and start querying. Amazon Athena uses Presto with full standard SQL support and works with a variety of standard data formats, including CSV, JSON, ORC, Apache Parquet, and Avro. While Amazon Athena is ideal for quick, ad-hoc querying and integrates with Amazon QuickSight for easy visualization, it can also handle complex analysis, including large joins, window functions, and arrays.
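Besides the console, Athena queries can be submitted programmatically. The sketch below shows one possible flow: start a query, poll for completion, and read the first page of results. `run_athena_query` and its parameters are illustrative names, and the database, table, and output location you pass in are your own. The three client methods it calls (`start_query_execution`, `get_query_execution`, `get_query_results`) are part of the boto3 Athena client.

```python
import time


def run_athena_query(athena, sql, database, output_location,
                     poll_seconds=1.0, max_polls=60):
    """Run a query through an Athena client and return the first page of rows.

    `athena` is expected to behave like boto3.client("athena").
    """
    started = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Athena runs queries asynchronously, so poll until a final state is reached.
    for _ in range(max_polls):
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state == "SUCCEEDED":
            break
        if state in ("FAILED", "CANCELLED"):
            raise RuntimeError(f"Athena query {query_id} ended in state {state}")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"Athena query {query_id} did not finish in time")

    # Only the first page is read; for SELECT queries the first row is the header.
    results = athena.get_query_results(QueryExecutionId=query_id)
    return [
        [col.get("VarCharValue") for col in row["Data"]]
        for row in results["ResultSet"]["Rows"]
    ]
```

A call might look like `run_athena_query(boto3.client("athena"), "SELECT status, COUNT(*) FROM access_logs GROUP BY status", "weblogs", "s3://example-bucket/athena-results/")`, where the table, database, and bucket names are placeholders.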
