Databases Flashcards
(22 cards)
What are the benefits of databases?
- structure the data
- build indexes to efficiently query/search through the data
- define relationships between your datasets
- flexible, scalable, high-performance, highly functional
Name a few examples of databases
key-value, document, graph, in-memory, search database
What do the AWS database-services provide?
- quick provisioning, high availability, vertical and horizontal scaling
- automated backup & restore, operations, upgrades
- Operating System Patching is handled by AWS
- Monitoring, alerting
Can you run a database on EC2 without using the database-services?
Yes, but you must handle resiliency, backups, patching, high availability, fault tolerance, scaling, etc. yourself.
Name the six engines of Amazon RDS
- Aurora
- PostgreSQL
- MySQL
- MariaDB
- Oracle Database
- SQL Server
Name three benefits of RDS
- automated backups
- database snapshots
- automatic host replacement
What does RDS provide?
- automated provisioning, OS patching
- Continuous backups and restore to specific timestamp (Point in Time Restore)!
- Monitoring dashboards
- Read replicas for improved read performance
- Multi AZ setup for DR (Disaster Recovery)
- Maintenance windows for upgrades
- Scaling capability (vertical and horizontal)
- Storage backed by EBS (gp2 or io1)
Can you SSH into your RDS database instance?
No
What do you know about Amazon Aurora?
- PostgreSQL and MySQL are supported
- 5 times performance improvement over MySQL on RDS
- costs about 20% more than RDS but is more efficient
What do you know about Amazon DynamoDB?
- NoSQL database (key-value store)
- millions of requests per second
- low latency
- integrated with IAM for security, authorization and administration
- low cost and auto scaling capabilities
What do you know about DynamoDB Accelerator - DAX?
- fully managed in-memory cache for DynamoDB
- 10x performance improvement: from single-digit millisecond latency to microsecond latency
- secure, highly scalable & highly available
- difference with ElastiCache at the CCP level: DAX is only used for and is integrated with DynamoDB, while ElastiCache can be used for other databases
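A minimal sketch of the read-through caching idea behind an in-memory cache in front of a table. It does not use the DAX client or any AWS API; SlowTable and ReadThroughCache are hypothetical names invented for illustration, and the TTL value is an assumption.

```python
import time


class SlowTable:
    """Stand-in for a database table; counts how often it is read."""

    def __init__(self, items):
        self._items = dict(items)
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self._items.get(key)


class ReadThroughCache:
    """Serves repeated reads from memory, falling back to the table on a miss."""

    def __init__(self, table, ttl_seconds=300.0):
        self._table = table
        self._ttl = ttl_seconds
        self._entries = {}  # key -> (value, expiry time)

    def get(self, key):
        now = time.monotonic()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: no table read
        value = self._table.get(key)  # cache miss: read through to the table
        self._entries[key] = (value, now + self._ttl)
        return value


table = SlowTable({"user#1": {"name": "Ada"}})
cache = ReadThroughCache(table)
first = cache.get("user#1")   # miss, reads the table
second = cache.get("user#1")  # hit, served from memory
```

Only the first lookup reaches the table; the second is answered from memory, which is where the latency gain of a cache comes from.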
What do you know about Redshift?
- OLAP - online analytical processing (analytics and data warehousing)
- load data once every hour, not every second
- 10x better performance than other data warehouses, scales to PBs of data
- columnar storage of data (instead of row based)
- massively parallel query execution (MPP), highly available
- pay as you go based on the instances provisioned
- has a sql interface for performing the queries
- BI tools such as AWS Quicksight or Tableau integrate with it
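A small sketch of why columnar storage suits analytical queries, using plain Python lists rather than Redshift itself. The sample orders and the to_columns helper are made up for illustration.

```python
# Row-based layout: one record per row, every field stored together.
rows = [
    {"order_id": 1, "region": "EU", "amount": 40.0},
    {"order_id": 2, "region": "US", "amount": 25.0},
    {"order_id": 3, "region": "EU", "amount": 35.0},
]


def to_columns(records):
    """Pivot row records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column only touches that column's values.
total_amount = sum(columns["amount"])

# The row layout has to walk every full record to get the same answer.
total_from_rows = sum(record["amount"] for record in rows)
```

Both totals agree; the difference is that the columnar version reads a single contiguous list, which is the access pattern analytical (OLAP) queries tend to have.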
What do you know about Amazon Elastic MapReduce (EMR)?
- helps create Hadoop clusters (Big Data) to analyze and process vast amounts of data
- the clusters can be made of hundreds of EC2 instances
- takes care of all the provisioning and configuration
- use cases: data processing, machine learning, web indexing, big data…
What do you know about Athena?
- fully serverless query service with SQL capabilities
- used to query data in S3
- pay per query
- output results back to S3
- secured through IAM
- use case: one-time SQL queries, serverless queries on S3, log analytics
What do you know about Amazon QuickSight?
- serverless machine learning-powered business intelligence service to create interactive dashboards
- fast, automatically scalable, embeddable, with per-session pricing
- integrated with RDS, Aurora, Athena, Redshift, S3
Name four use cases for Amazon QuickSight
- business analytics
- building visualizations
- perform ad-hoc analysis
- get business insights using data
What do you know about DocumentDB?
- is to MongoDB (a NoSQL database) what Aurora is to PostgreSQL/MySQL
- MongoDB is used to store, query and index JSON data
- similar deployment concepts as Aurora
- fully managed, highly available with replication across 3 AZs
- automatically scales to workloads with millions of requests per second
What do you know about Amazon Neptune?
- fully managed graph database
- highly available across 3 AZs with up to 15 read replicas
- can store billions of relations and query the graph with millisecond latency
- great for knowledge graphs (wikipedia), fraud detection, recommendation engines, social networking
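A rough sketch of the kind of relationship query a graph database is built for, here a "followed by people you follow" recommendation. It uses a plain dictionary, not Neptune or any graph query language; the follows data and the recommend function are hypothetical.

```python
from collections import Counter

# Adjacency sets: who each account follows (made-up sample data).
follows = {
    "ann": {"bob", "cat"},
    "bob": {"cat", "dan"},
    "cat": {"dan", "eve"},
    "dan": set(),
    "eve": set(),
}


def recommend(graph, user):
    """Rank accounts followed by the user's follows, excluding existing ones."""
    direct = graph.get(user, set())
    counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            if candidate != user and candidate not in direct:
                counts[candidate] += 1
    # Sort by count descending, then by name, so the result is deterministic.
    return sorted(counts, key=lambda name: (-counts[name], name))


suggestions = recommend(follows, "ann")
```

Here "dan" ranks first because two of ann's follows point to him, while "eve" is reached through only one.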
What do you know about Amazon Quantum Ledger Database (QLDB)?
- a ledger is a book recording financial transactions
- fully managed, serverless, highly available, replication across 3 AZs
- used to review the history of all the changes made to your application data over time
- immutable system: no entry can be removed or modified, cryptographically verifiable
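A minimal sketch of what "cryptographically verifiable" can mean for an append-only ledger: each entry's hash covers the previous entry's hash, so changing history is detectable. This is a generic hash chain, assumed here for illustration; it is not the QLDB API or its actual journal format, and the Journal class is hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


class Journal:
    """Append-only log where each entry's hash covers the previous hash."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, data):
        payload = prev_hash + json.dumps(data, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def append(self, data):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS_HASH
        entry_hash = self._digest(prev_hash, data)
        self._entries.append({"data": data, "prev": prev_hash, "hash": entry_hash})
        return entry_hash

    def verify(self):
        """Recompute every hash in order; any edited entry breaks the chain."""
        prev_hash = GENESIS_HASH
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != self._digest(prev_hash, entry["data"]):
                return False
            prev_hash = entry["hash"]
        return True


journal = Journal()
journal.append({"account": "A", "change": 100})
journal.append({"account": "A", "change": -30})
intact = journal.verify()

# Simulate tampering with an earlier entry, then verify again.
journal._entries[0]["data"]["change"] = 1000
tampered_ok = journal.verify()
```

The untouched journal verifies, while the tampered one does not, because the stored hash of the first entry no longer matches its data.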
What do you know about Amazon Managed Blockchain?
- blockchain makes it possible to build applications where multiple parties can execute transactions without the need for a trusted, central authority
- is a managed service to join public blockchain networks or create your own scalable private network
- compatible with the frameworks Hyperledger Fabric & Ethereum
What do you know about AWS Glue?
- managed extract, transform and load (ETL) service
- useful to prepare and transform data for analytics
- fully serverless service
- Glue Data Catalog: a catalog of datasets that can be used by Athena, Redshift, EMR
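A toy sketch of the extract, transform, load steps that an ETL service automates. It uses only the Python standard library, not the Glue API; the CSV sample, the cleaning rules, and the list used as a "warehouse" are all assumptions for illustration.

```python
import csv
import io

# Made-up raw input: inconsistent region casing and one missing amount.
RAW_CSV = """order_id,region,amount
1,eu,40.0
2,us,
3,EU,35.5
"""


def extract(text):
    """Read raw CSV text into a list of string-valued records."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(records):
    """Drop records without an amount and normalize types and casing."""
    cleaned = []
    for record in records:
        if not record["amount"]:
            continue  # skip incomplete rows
        cleaned.append({
            "order_id": int(record["order_id"]),
            "region": record["region"].upper(),
            "amount": float(record["amount"]),
        })
    return cleaned


def load(records, target):
    """Append cleaned records to the target store and report how many."""
    target.extend(records)
    return len(records)


warehouse = []
loaded = load(transform(extract(RAW_CSV)), warehouse)
```

Two of the three rows survive the transform step; the row with the missing amount is dropped before loading.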
What do you know about Database Migration Service (DMS)?
- quickly and securely migrate databases to AWS; resilient and self-healing
- the source database remains available during the migration
- supports homogeneous and heterogeneous migrations