RDS Flashcards

1
Q

What are the 2 general types of DBs in the industry?

A

Relational or “SQL” (there are many engines, e.g. Aurora, MariaDB)

Non-Relational or “NoSQL” (DynamoDB)

2
Q

What’s the difference between SQL and NoSQL style DBs?

A

The schema.

SQL = rigid; the schema (tables, columns, types) is defined in advance & every row in a table is uniquely identifiable.

NoSQL = not rigid, flatter; relationships between items are more relaxed.
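
A minimal sketch of the schema difference, using Python's built-in sqlite3 for the relational side and plain dicts as a stand-in for a NoSQL table. The table, column, and key names are made up for illustration.

```python
import sqlite3

# Relational side: the schema is declared up front and every row needs a unique key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")

def try_insert(sql):
    """Run an INSERT and report whether the schema accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.Error:
        return False

# Duplicate primary key: rejected, because rows must stay uniquely identifiable.
duplicate_ok = try_insert("INSERT INTO users (id, name) VALUES (1, 'bob')")
# Column that was never declared: rejected, because the schema is fixed in advance.
unknown_column_ok = try_insert("INSERT INTO users (id, nickname) VALUES (2, 'al')")

# Non-relational side (illustrated with plain dicts): items share a key but
# are otherwise free to carry different attributes.
items = {
    "user#1": {"name": "alice"},
    "user#2": {"name": "bob", "nickname": "bobby", "tags": ["admin"]},
}
```

Both rejected inserts fail for schema reasons, while the dict-based items accept a new attribute without any change to a table definition.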

3
Q

Can you run a DB on an EC2 instance?

A

Yes, but it’s not recommended over services like RDS or Redshift unless there is a specific reason:

○ Access to the OS of the DB Instance; you can only do this if running on EC2, i.e. you can’t in RDS (this is a very rare request/requirement)

○ Advanced DB tuning for which you need Root Access

○ Vendor demands

○ DB or DB Version that AWS doesn’t provide

○ Specific OS/DB combination, or an ARCH that AWS doesn’t provide

4
Q

What are some negatives about running your own DB on EC2?

A

○ Admin overhead - both from managing an EC2 and a DB Host

○ Backup/DR/replication management falls on customer

○ EC2 is running in a single AZ - if the zone fails, then access to the DB fails

○ Features - AWS DB products are more feature rich than running your own DB on EC2

○ EC2 is either ON or OFF; it doesn’t scale as rapidly based on load compared to an AWS DB product

5
Q

How do you connect to an RDS instance?

A

Via the RDS endpoint, which is a CNAME (i.e. you do not connect via an IP address).

The CNAME is a DNS name that is unique to the instance.

6
Q

What type of storage is created with RDS?

A

When you provision an RDS instance, it will auto-provision storage for that instance -> this is in the form of EBS within the same AZ.

7
Q

What is RDS Multi-AZ?

Is there an added cost?

A

Feature that provides HA to a DB instance.

This is an option that’s enabled when spinning up an RDS instance that will also spin up HW/resources inside another AZ as a secondary/backup instance.

This is an added cost.

8
Q

What kind of replication is used automatically in an RDS Multi-AZ scenario?

A

Synchronous Replication.

REMINDER - the replica is not for added capacity; it is used purely as a standby/failover target.

9
Q

Is there disruption when a failure occurs in a RDS Multi-AZ scenario?

A

Yes. When RDS detects a failure, it repoints the DB’s CNAME to the standby replica; there is a minor disruption while the switch takes place.
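
A toy model of why apps connect by name rather than by IP, assuming failover works by repointing the endpoint name. The endpoint name, IP addresses, and the dict standing in for DNS are all hypothetical.

```python
# Toy DNS table: the endpoint name is what applications connect to.
# The name and IP addresses below are made up for illustration.
ENDPOINT = "mydb.example.internal"
dns = {ENDPOINT: "10.0.1.10"}        # currently resolves to the primary
STANDBY_IP = "10.0.2.10"             # standby in another AZ

def resolve(name):
    """Return the IP the endpoint name currently points at."""
    return dns[name]

def fail_over(name, standby_ip):
    """Model of a failover: repoint the name at the standby."""
    dns[name] = standby_ip

before = resolve(ENDPOINT)
fail_over(ENDPOINT, STANDBY_IP)
after = resolve(ENDPOINT)
```

The app keeps using the same name before and after; only the address behind it changes, which is where the short disruption comes from.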

10
Q

Can RDS Multi-AZ span across Regions?

A

No. It can span across different AZ’s but must be in the same Region.

11
Q

What is the Recovery Point Objective (RPO)?

A

The time between the last successful backup and the point in time of a failure, i.e. how much data (measured in time) can be lost.

Ex > last backup was at 3am and there was an outage at 11am » this would mean there is an 8-hour RPO

12
Q

What is the Recovery Time Objective (RTO)?

A

The time between a failure and the point at which the system comes back into service.

Represents how long it takes to restore from a backup.
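
A small worked example of both objectives, reusing the 3am backup / 11am outage times from the RPO card. The 1pm back-in-service time and the calendar date are assumptions added for illustration.

```python
from datetime import datetime

# Times from the RPO example; the calendar date is arbitrary.
last_backup = datetime(2024, 1, 1, 3, 0)        # 3am backup
outage = datetime(2024, 1, 1, 11, 0)            # 11am failure
# Hypothetical: service is back at 1pm after restoring the backup.
back_in_service = datetime(2024, 1, 1, 13, 0)

def hours_between(start, end):
    """Elapsed time between two datetimes, in hours."""
    return (end - start).total_seconds() / 3600

data_loss_hours = hours_between(last_backup, outage)       # RPO side
downtime_hours = hours_between(outage, back_in_service)    # RTO side
```

RPO looks backwards from the failure (data lost), RTO looks forwards from it (time out of service).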

13
Q

What are the 2 types of backups from an RDS perspective?

A

Manual Snapshots

Automated Backups

Both of these live in/point to S3

14
Q

What are Manual Snapshots? What is the retention period?

A

Manual snaps of a DB at a specific point in time.

You can take as many as you want in whatever increments - you control the RPO.

They live in RDS forever unless you manually delete/clear them.

15
Q

What are Automated Backups? What is the retention period?

Does this have any disruption associated to it?

A

Same architecture as the manual snapshots, but they occur automatically.

Backups will cause a slight disruption in services, so you want to perform backups in off-business hours during a service window.

Auto-backups are not retained indefinitely; they are automatically cleaned up » you can set the retention period (0 to 35 days).
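
A rough sketch of what a retention window means in practice, assuming one backup per day. The function name and dates are hypothetical, and this is not how RDS implements cleanup internally.

```python
from datetime import date, timedelta

def expired_backups(backup_dates, today, retention_days):
    """Return the automated backups that fall outside the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in backup_dates if d < cutoff]

today = date(2024, 3, 31)
daily_backups = [today - timedelta(days=n) for n in range(40)]  # 40 daily backups
to_delete = expired_backups(daily_backups, today, retention_days=35)
```

With 40 daily backups and a 35-day window, only the backups older than 35 days are cleaned up.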

16
Q

RDS BACKUP & RESTORE SUMMARY

A

· When you perform a Restore (from a backup), RDS will create a brand new RDS instance that comes with a new DB endpoint address
○ You have to update any apps to point at the new address, which will be its CNAME

· Snapshots = single point in time; this means it’s fixed » it’s the point in time that the snapshot was created; this has a direct correlation to the RPO value depending on when the snapshot was taken

· Automated = backups work differently. Because Transaction Logs are sent to S3 every 5 min, you can choose a specific point in time to restore the DB to when there is a failure
» Backups are restored from the closest snapshot, and then transaction logs are replayed from that point onward, all the way up to your chosen recovery point

· Restores from snapshots are NOT fast; this affects RTO
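
The snapshot-plus-log-replay idea from the summary above can be sketched as follows. Times are plain integers (minutes), and the data shapes and function name are assumptions chosen to keep the example small.

```python
def restore_to_point_in_time(snapshots, tx_log, target_time):
    """Rebuild DB state at target_time from the closest earlier snapshot plus logs.

    snapshots: list of (time, state_dict); tx_log: list of (time, key, value).
    Times are plain integers (minutes) to keep the sketch simple.
    """
    # 1. Pick the most recent snapshot taken at or before the target time.
    snap_time, snap_state = max(
        (s for s in snapshots if s[0] <= target_time), key=lambda s: s[0]
    )
    state = dict(snap_state)  # a restore builds a new copy, not an in-place change
    # 2. Replay logged transactions after the snapshot, up to the target time.
    for t, key, value in sorted(tx_log):
        if snap_time < t <= target_time:
            state[key] = value
    return state

snapshots = [(0, {"balance": 100}), (60, {"balance": 150})]
tx_log = [(30, "balance", 150), (65, "balance", 175), (70, "balance", 20)]

restored = restore_to_point_in_time(snapshots, tx_log, target_time=68)
```

Restoring to minute 68 starts from the minute-60 snapshot and replays only the minute-65 transaction, so the bad write at minute 70 is left out.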

17
Q

What is an RDS Read Replica? What type of replication is used?

A

These are “read-only” copies of an RDS instance.

RR are kept in sync using Asynchronous replication » data is written to the primary DB first, and then once it’s successfully done that, it’s replicated to the Read Replicas.
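
A toy contrast between the synchronous replication used for a Multi-AZ standby and the asynchronous replication used for Read Replicas. The class and attribute names are made up; real replication happens at the storage/engine level, not in application code.

```python
class Primary:
    """Toy primary that replicates writes to a standby and to a read replica."""

    def __init__(self):
        self.data = {}
        self.standby = {}        # Multi-AZ standby: written synchronously
        self.replica = {}        # read replica: written asynchronously
        self.pending = []        # changes committed but not yet shipped

    def write(self, key, value):
        self.data[key] = value
        self.standby[key] = value          # synchronous: part of the write itself
        self.pending.append((key, value))  # asynchronous: queued for later

    def ship_to_replica(self):
        """Apply queued changes to the read replica (happens after the commit)."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

db = Primary()
db.write("order:1", "paid")
stale_read = db.replica.get("order:1")   # replica has not caught up yet
db.ship_to_replica()
fresh_read = db.replica.get("order:1")
```

The standby always has the write as soon as it is committed, while a read from the replica can briefly return older data (replica lag).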

18
Q

Can Read Replicas be created in different Regions?

A

Yes - can be in the same region as the RDS instance or in a different region, i.e. Cross-Region Replication.

19
Q

What are the 2 main reasons Read Replicas are important?

A

Scale out a DB READ capacity - You can have 5 x direct RR’s per DB instance.

Offer lower RTO recovery for any instance during a failure - If there is a primary DB failure, you can promote a Read Replica DB to be read + write very quickly, making that the new primary DB.

20
Q

What are the main differences between Aurora and RDS?

** Aurora aka Aurora Provisioned Architecture **

A

Aurora:

  • uses Cluster topology versus dedicated EBS storage
  • Single primary instance + 0 or more Replicas (up to 15)
  • Combines the “read” benefits of Read Replicas and the HA benefits of Multi-AZ (scale and availability)
  • Uses shared cluster volumes for storage/compute instances

Aurora replicates data to six storage nodes across multiple AZs to withstand the loss of an entire AZ (Availability Zone) or two storage nodes without any availability impact to the client’s applications. On the other hand, RDS MySQL allows only up to five replicas and the replication process is slower than Aurora.

21
Q

Aurora REVIEW:

A

→ Aurora copies your data to 6 x Storage Nodes in 3 x AZ’s; your instance cluster has access to all available storage

→ Can have up to 15 x Replicas of the data; any of these can be used as a failover target

→ When data is written to the primary DB, Aurora will synchronously copy that data across AZ’s to multiple storage nodes that are associated with your cluster

→ When you create a cluster, you don’t allocate the amount of Storage needed - you simply get billed for whatever is used
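
A sketch of why six storage copies in three AZs survive an AZ loss. It assumes two nodes per AZ and a quorum scheme of 4-of-6 for writes and 3-of-6 for reads; those quorum numbers come from Aurora's published design, not from these cards, so treat them as assumptions.

```python
# Six storage nodes, assumed to be spread two per AZ.
NODES = {"az1": ["n1", "n2"], "az2": ["n3", "n4"], "az3": ["n5", "n6"]}
WRITE_QUORUM = 4   # assumption: 4-of-6 acknowledgements needed for a write
READ_QUORUM = 3    # assumption: 3-of-6 needed for a read

def healthy_nodes(failed_azs=(), failed_nodes=()):
    """Count nodes still reachable after the given failures."""
    return sum(
        1
        for az, nodes in NODES.items()
        for node in nodes
        if az not in failed_azs and node not in failed_nodes
    )

def can_write(healthy):
    return healthy >= WRITE_QUORUM

def can_read(healthy):
    return healthy >= READ_QUORUM

after_az_loss = healthy_nodes(failed_azs=("az1",))
after_az_plus_node = healthy_nodes(failed_azs=("az1",), failed_nodes=("n3",))
```

Under these assumptions, losing a whole AZ still leaves enough nodes to write, and losing an AZ plus one more node still leaves enough to read.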

22
Q

Are backups for Aurora similar to RDS?

A

Yes.

Normal backups/auto-backups/manual snapshots - works exactly like RDS.

Restores create a new cluster, by default, just like regular RDS.

23
Q

What is Aurora Serverless Architecture?

A

The version of Aurora where you don’t need to provision DB instances of a specific size OR need to worry about managing the instances at all.

Very close to DBaaS type of product/service.

24
Q

What is an ACU?

A

Aurora Capacity Unit (this is for Aurora Serverless deployments specifically)

Represent “x” amount of Compute and an associated “x” amount of Memory.

You can set the minimum/maximum values where the cluster will dynamically scale up/down based on load within the min/max parameters.
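
The min/max behaviour can be sketched as a simple clamp. The function name and the ACU values are hypothetical, and the real scaling logic (how load maps to ACUs, how fast it moves) is not modelled here.

```python
def target_acus(load_acus, min_acus, max_acus):
    """Clamp the capacity the load asks for into the configured min/max range."""
    return max(min_acus, min(load_acus, max_acus))

# Hypothetical cluster configured with min=2 and max=16 ACUs.
quiet = target_acus(load_acus=1, min_acus=2, max_acus=16)
busy = target_acus(load_acus=10, min_acus=2, max_acus=16)
spike = target_acus(load_acus=40, min_acus=2, max_acus=16)
```

Capacity follows the load inside the range, never drops below the minimum, and never exceeds the maximum (which also caps cost).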

25
Q

Are ACU’s shared or local resources?

A

Shared.

They are stateless so they can be shared across many AWS customers; not local.

26
Q

What is the ACU shared pool of resources called when using Aurora Serverless ARCH?

A

Proxy Fleet.

Applications connect to the compute/memory resources provided by the ACU’s via a “Proxy Fleet”

AWS manages the proxy fleet.

27
Q

What do apps point @ when connecting to a primary Aurora Cluster for R/W operations?

A

Cluster Endpoint.

28
Q

Main Use Cases for Aurora RDS Serverless

A

○ Infrequently used apps

○ Very low volume websites

○ New applications where you’re unsure of the levels of load (you don’t know the size of the DB; serverless will auto-scale)

○ Variable workloads - apps that have peaks and valleys; avoids provisioning static capacity

○ Unpredictable workloads; you can set the min and max ACU’s very low and very high just to play it safe

○ Development/Test DB’s

○ Multi-Tenant Apps - if your customer uses more/creates more load and your cost for the license goes up, it’s fine because the revenue from the customer paying you also goes up

29
Q

What is Aurora Global DB?

What is the Aurora Global DB scale at this moment in time?

A

Global DB allows you to create global-level replication using Aurora from a “master region” to up to 5 x secondary/standby AWS regions.

Each “standby region” can have up to 16 x RR’s where apps can use these regions for READS and handle R/W from the primary instance/region.

30
Q

Main Use Cases for Aurora Global DB

A

○ Cross-region DR and business recovery

○ Global READ scaling; low latency + performance improvements to international customers, for example

○ ~1s or less replication between the primary to secondary regions

○ Secondary regions can have up to 16 x replicas (Primary has up to 15 x replicas + the read/write primary instance for a total of 16)

○ Currently MAX of 5 x regions

REMINDER:
○ ONE-WAY replication from Primary > Secondary
○ No impact on DB performance when replicating because the replication happens at the Storage layer

31
Q

What is an Aurora Multi-Master Cluster?

A

An advanced feature of Aurora that allows the service to have multiple Primary instances that can perform both READS and WRITES.

In Multi-Master mode, ALL instances of the DB are in R/W mode; this means there is no lengthy failover time if an instance fails, because there is no single “primary DB” to replace.

32
Q

How do apps connect in Aurora Multi-Master mode?

A

Applications connect to either ONE or ALL of the instances - there is NO load-balancing across instances and there is no “cluster endpoint” to point to.

With M-M Cluster, the Application points to multiple instances (versus the cluster endpoint pointing to only one instance at a time for R/W operations) - This allows for automatic failover with little or no disruption at all.

Fault Tolerance versus HA.
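
A rough sketch of application-side failover when every instance is writable. The instance names and dict layout are hypothetical, and the shared cluster storage behind the instances is not modelled (each toy instance just keeps its own dict).

```python
def write_with_failover(instances, key, value):
    """Write to the first reachable instance and return its name.

    instances: dict of name -> {"up": bool, "data": dict}, tried in order.
    """
    for name, instance in instances.items():
        if instance["up"]:
            instance["data"][key] = value   # every instance accepts writes
            return name
    raise RuntimeError("no reachable instance")

# Hypothetical two-instance cluster.
cluster = {
    "instance-a": {"up": True, "data": {}},
    "instance-b": {"up": True, "data": {}},
}

first = write_with_failover(cluster, "k", 1)
cluster["instance-a"]["up"] = False          # instance-a fails
second = write_with_failover(cluster, "k", 2)
```

Because the app already knows about the other writable instance, the second write goes through without waiting for any promotion step.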

33
Q

What is DB Migration Services (DMS)?

A

It’s basically an EC2 instance with DMS software running on top, where you give it connection info for the Source and Destination DBs by creating an endpoint config for each.

Service is capable of moving DB’s both INTO or OUT of AWS.

34
Q

How does DB Migration Services (DMS) function?

A

DMS runs a Replication Instance on top of an EC2 instance.

The replication instance runs one or more replication tasks which is where all the configuration parameters are defined when migrating from one DB to another.

35
Q

What are the 2 most important Replication Tasks when using DMS?

A

Source and destination endpoints (contain info on the DB)

Source and Target DB’s (actual DB)

36
Q

When migrating DBs using DMS, can both DB’s be on-premise?

A

No - one DB must always be in AWS.

37
Q

What are the 3 methods/options for migrating data when using DMS?

A

Full Load = copies DB entirely from source to destination.

Full Load + CDC (change data capture) = full data transfer but while it’s happening it’s still capturing changes to the DB. After full load has been transferred, then any captured changes can be applied to the destination target.

CDC-Only = use a 3rd party tool/alternative method/external tooling for Full Load and only use DMS for the changes applied to the target DB.
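
A minimal sketch of the Full Load + CDC idea: copy everything, then apply the changes captured while the copy was running. The function name and data shapes are assumptions; DMS itself works against real database engines, not dicts.

```python
def full_load_plus_cdc(source, changes_during_load):
    """Copy the source, then apply changes that arrived while copying.

    source: dict snapshot at the start of the load.
    changes_during_load: ordered list of (key, value) writes; value None = delete.
    """
    target = dict(source)                    # phase 1: full load
    for key, value in changes_during_load:   # phase 2: apply captured changes
        if value is None:
            target.pop(key, None)
        else:
            target[key] = value
    return target

source_at_start = {"a": 1, "b": 2}
captured = [("b", 20), ("c", 3), ("a", None)]
target = full_load_plus_cdc(source_at_start, captured)
```

Full Load alone would stop after phase 1, and CDC-Only would run just phase 2 against a target that was loaded by some other tool.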

38
Q

What is the Schema Conversion Tool (SCT)?

A

Assists with DB schema conversion when migrating to a target DB, i.e. you can modify the DB schema as part of a migration.

REVIEW - RDS has a rigid schema/relationship with data that’s not changed easily.

39
Q

DMS Overview:

A

The Replication Instance running on top of EC2 performs the migration between the Source & Destination endpoints. The endpoints store the connection information for Source and Target DBs.

Jobs can be Full Load, Full Load + CDC, and CDC-Only.

40
Q

What are the 2 different DB transaction models?

What is each focused on?

A
1. ACID - focused on consistency

2. BASE - focused on availability

41
Q

What is the CAP Theorem?

What does the acronym stand for?

A

(C)onsistency - every READ will be the most recent WRITE

(A)vailability - always get a successful response, but with no guarantee the READ was the most recent WRITE

(P)artition Tolerance - system split into partitions, where the system still functions even with dropped messages and errors

42
Q

ACID v BASE Theory

A

→ If you have a DB system that has multiple nodes and a network is involved, then you generally have a CHOICE between having Consistency and Availability

EX) A DB node fails, or comms between some of the nodes in a cluster fail. A request is sent to that node. You can either:

○ Cancel the operation which improves consistency but decreases availability

○ Let the operation go which improves availability but decreases consistency
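
The two choices above can be sketched with a toy node that is cut off from its peer. The "CP"/"AP" mode labels and the function name are assumptions for illustration only.

```python
def handle_write(node, key, value, partitioned, mode):
    """Handle a write on a node that may be cut off from its peers.

    mode "CP": refuse the write during a partition (consistent, less available).
    mode "AP": accept it locally (available, peers may now disagree).
    """
    if partitioned and mode == "CP":
        return "rejected"
    node[key] = value
    return "accepted"

node_a, node_b = {"x": 1}, {"x": 1}   # two replicas that start in agreement

cp_result = handle_write(node_a, "x", 2, partitioned=True, mode="CP")
consistent_after_cp = node_a == node_b

ap_result = handle_write(node_a, "x", 2, partitioned=True, mode="AP")
consistent_after_ap = node_a == node_b
```

Cancelling keeps both nodes in agreement but the caller gets an error; letting the write through keeps the caller happy but the nodes now hold different values.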

43
Q

What type of database is ACID referring to?

A

RDS-style DB’s (SQL)

ACID limits a DB’s ability to scale.

44
Q

What type of database is BASE referring to?

A

NoSQL like DynamoDB

BASE offers very high performance and scalability

45
Q

Does RDS support Encryption at Rest?

Encryption in Transit?

A

Yes, both.

In transit - SSL/TLS

At rest - KMS

46
Q

When doing encryption at rest with RDS, which entity handles the crypto functions - the host/EBS storage, or the RDS instance?

A

The host and underlying EBS volume

Data is encrypted/decrypted when entering or leaving the host that the RDS instance is sitting in.

47
Q

Once you turn on encryption for an RDS instance, can you turn it off?

A

No - once it’s on, the instance will stay encrypted.

48
Q

What is different about the encryption with RDS MSSQL (Microsoft SQL Server) and RDS Oracle?

A

Crypto functions are done by the DB instance itself, not the host or the storage (this is the opposite of RDS encryption @ rest)

49
Q

What is Cloud HSM? What DB Instance Vendor/Type uses it?

A

RDS Oracle

Encrypts the data with customer managed keys as it is written to the instance. This method removes AWS from the chain of trust, since AWS has no access to the keys.

50
Q

What is IAM Authentication in RDS?

Does it allow Authentication or Authorization?

A

Allows an IAM User or Role to assume a Local Database User for access to an RDS Instance.

Authentication-ONLY