Amazon Redshift | Data Integration and Loading Flashcards

1
Q

Do your prices include taxes?

Data Integration and Loading

Amazon Redshift | Database

A

Except as otherwise noted, our prices are exclusive of applicable taxes and duties, including VAT and applicable sales tax. For customers with a Japanese billing address, use of AWS services is subject to Japanese Consumption Tax.

2
Q

How do I load data into my Amazon Redshift data warehouse?

A

You can load data into Amazon Redshift from a range of data sources, including Amazon S3, Amazon DynamoDB, Amazon EMR, AWS Data Pipeline, or any SSH-enabled host on Amazon EC2 or on-premises. Amazon Redshift attempts to load your data in parallel into each compute node to maximize the rate at which you can ingest data into your data warehouse cluster. For more details on loading data into Amazon Redshift, please see our Getting Started Guide.
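As an illustration of loading from Amazon S3, the sketch below only builds the text of a COPY statement; it does not connect to a cluster. The table name, bucket prefix, and IAM role ARN are hypothetical placeholders, and CSV is an assumed file format.

```python
def build_copy_from_s3(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads CSV files under an S3 prefix."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )


# All three arguments are hypothetical values chosen for illustration.
statement = build_copy_from_s3(
    "sales",
    "s3://example-bucket/sales/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```

Pointing COPY at a prefix that holds several files, rather than one large file, is what gives Redshift something to spread across compute nodes.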

3
Q

Can I load data using SQL ‘INSERT’ statements?

A

Yes. Clients can connect to Amazon Redshift using ODBC or JDBC and issue SQL INSERT statements to load data. Note that this is slower than loading from Amazon S3 or Amazon DynamoDB: those methods load data in parallel into each compute node, while INSERT statements load via the single leader node.
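When INSERT is the only option, sending many rows in one multi-row statement at least cuts the number of round trips through the leader node. The sketch below builds such a statement. The `%s` placeholders follow the style used by Python drivers such as psycopg2, which is an assumption here (the card itself mentions only ODBC and JDBC), and the table and column names are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def build_multirow_insert(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> Tuple[str, List[Any]]:
    """Return (sql, params) for one INSERT statement covering every row."""
    if not rows:
        raise ValueError("at least one row is required")
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("every row must have one value per column")
    # One "(%s, %s, ...)" group per row, joined into a single VALUES list.
    row_group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_group] * len(rows))
        + ";"
    )
    # Flatten the rows so the values line up with the placeholders.
    params = [value for row in rows for value in row]
    return sql, params


# Hypothetical table and data, for illustration only.
sql, params = build_multirow_insert("events", ["id", "name"], [(1, "a"), (2, "b")])
print(sql)
print(params)
```

This still routes every row through the leader node, so it narrows the gap with COPY rather than closing it.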

4
Q

How do I load data from my existing Amazon RDS, Amazon EMR, Amazon DynamoDB, and Amazon EC2 data sources to Amazon Redshift?

A

You can use our COPY command to load data in parallel directly to Amazon Redshift from Amazon EMR, Amazon DynamoDB, or any SSH-enabled host. Redshift Spectrum also enables you to load data from Amazon S3 into your cluster with a simple INSERT INTO command. This could enable you to load data from various formats such as Parquet and RC into your cluster. Note that if you use this approach, you will accrue Redshift Spectrum charges for the data scanned from Amazon S3.
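The two approaches in this answer can be sketched as statement builders: a COPY that reads from a DynamoDB table, and an INSERT INTO ... SELECT that pulls rows from a Redshift Spectrum external table. All table, schema, and role names are hypothetical, the external schema is assumed to exist already, and the READRATIO value of 50 is an arbitrary example.

```python
def build_copy_from_dynamodb(
    table: str, dynamodb_table: str, iam_role_arn: str, read_ratio: int = 50
) -> str:
    """Return a COPY statement that loads from a DynamoDB table.

    READRATIO is the percentage of the DynamoDB table's provisioned
    throughput that the load is allowed to consume.
    """
    if not 1 <= read_ratio <= 200:
        raise ValueError("read_ratio must be between 1 and 200")
    return (
        f"COPY {table} FROM 'dynamodb://{dynamodb_table}' "
        f"IAM_ROLE '{iam_role_arn}' READRATIO {read_ratio};"
    )


def build_insert_from_spectrum(
    target_table: str, external_schema: str, external_table: str
) -> str:
    """Return an INSERT INTO ... SELECT that reads a Spectrum external table."""
    return (
        f"INSERT INTO {target_table} "
        f"SELECT * FROM {external_schema}.{external_table};"
    )


# Hypothetical names, for illustration only.
role = "arn:aws:iam::123456789012:role/ExampleRedshiftRole"
copy_sql = build_copy_from_dynamodb("orders", "OrdersTable", role)
insert_sql = build_insert_from_spectrum("orders", "spectrum_schema", "orders_parquet")
print(copy_sql)
print(insert_sql)
```

Selecting only the needed columns instead of `*` would reduce the data scanned from Amazon S3 for columnar formats such as Parquet, and with it the Spectrum charge.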

In addition, many ETL companies have certified Amazon Redshift for use with their tools, and a number are offering free trials to help you get started loading your data. AWS Data Pipeline provides a high-performance, reliable, fault-tolerant solution to load data from a variety of AWS data sources. You can use AWS Data Pipeline to specify the data source and the desired data transformations, and then execute a pre-written import script to load your data into Amazon Redshift. Also, AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics. You can create and run an AWS Glue ETL job with a few clicks in the AWS Management Console.
