Database Specialty - RDS Flashcards

1
Q

Instance Class Types in RDS

A

Standard, Memory-optimized, Burstable performance

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Storage Type

A

General Purpose, Provisioned IOPS

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

Parameter groups important points

A

> Define configuration values specific to the select DB engine
Default parameter group cannot be edited
To make config changes, you must create a new parameter group
New parameter group inherits settings from the default parameter group
Can be applied to any DB instances within the AWS region

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

Restoring from a snapshot points

A
  • By default, restored cluster gets applied with
    • New security group
    • Default parameter group
    • Option group that was associated with the snapshot
  • While restoring from a snapshot, be sure to
    • Choose the correct security group to
      ensure connectivity for the restored DB
    • Choose correct parameter group for the
      restored DB
    • Recommended to retain parameter group
      of the snapshot to help restore with the
      correct parameter group
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

PITR with RDS

A
  • Point-In-Time Recovery
  • Can only restore to a new instance
  • The backup retention period controls the
    PITR window
  • Can restore to any point in time during your
    backup retention period
  • RDS uploads DB transaction logs to S3 every
    5 minutes (affects latest restorable time)
  • You can move/restore a DB instance from
    outside VPC to inside VPC with PITR (but not
    other way round)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

Exporting DB Snapshot Data to S3

A
  • All types of backups can be exported (automatic/manual
    or those created with AWS Backup service)
  • How to export?
    • Setup an S3 bucket with appropriate IAM
      permissions and create a KMS key for SSE
    • Export the snapshot using console (Actions à Export to Amazon S3) or using start-export-task CLI command
  • Export runs in the background
  • Doesn’t affect the DB performance
  • Data exported in Apache Parquet format (=compressed and consistent)
  • Allows you to analyze the DB data using Athena or Redshift Spectrum
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Multi-AZ Deployments in RDS

A
  • For high availability, data durability and fault-tolerance (not used for scaling)
  • Offers SYNC replication to standby instance in another AZ over low latency links
  • Performs automatic failover to standby instance in another AZ in case of planned or unplanned outage
  • Uses DNS routing to point to the new master (no need to update connection strings)
  • Failover times (RTO) are typically 60-120 seconds (minimal downtime)
  • Backups are taken from standby instead of primary to ensure performance level
    during backup activity
  • Recommended for production use cases
  • To force a failover or simulate AZ-failure, reboot the master instance and choose Reboot with failover
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

RDS Read Replicas

A
  • Read-only copies of master (primary) DB instance
  • Up to 5 Read Replicas
  • Within AZ, Cross AZ or Cross Region
  • Replication is ASYNC, so reads are eventually
    consistent
  • Applications must update the connection string to leverage read replicas
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

RDS Read Replicas 2

A
  • Boost DB performance
    and durability
  • Useful for scaling of read- heavy workloads
  • Can be promoted to primary (complements
    Multi-AZ)
  • To create a replica, you must enable automatic backups with at least one
    day retention period
  • Replica can be Multi-AZ (= a replica with its own standby instance)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

RDS Read Replicas as Multi-AZ

A
  • Supported for MySQL / MariaDB / PostgreSQL / Oracle
  • Works as a DR target. When promoted to primary, it works as Multi-AZ
  • There’s added network cost when data goes from one AZ to another
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

RDS Read Replicas – Use Case

A
  • You have a production database that is
    taking on normal load
  • You want to run a reporting application
    to run some analytics
  • You create a Read Replica to run the new
    workload there
  • The production application is unaffected
  • Read replicas are used for SELECT (=read) only kind of statements (not INSERT, UPDATE, DELETE)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Promoting a Read Replica to a Standalone DB
Instance

A
  • Promoted instance is rebooted and becomes an independent DB instance
    (separate from its source)
  • Will no longer work as a replica. Does not affect other replicas of the original DB
    instance
  • You cannot promote a replica to a standalone instance while a backup is running
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Promoting a Read Replica to a Standalone DB
Instance – Use cases

A
  • Use as a DR strategy
  • Avoid performance penalty of DDL operations (like rebuilding indexes)
  • Perform DDL ops on a read replica and promote it to a standalone instance. Then
    point your app to this new instance.
  • Sharding (splitting a large DB
    into multiple smaller DBs)
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Enabling writes on a read replica

A
  • For MySQL / MariaDB read replica, set
    the parameter read_only = 0 for the read replica to make it writable
  • You can then perform DDL operations on the read replica as needed without affecting the source DB
  • Actions taken on the read replica don’t
    affect the performance of the source DB instance
  • You can then promote the replica to a standalone DB
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

RDS Read Replica capabilities

A
  • Can create multiple read replicas in quick succession
  • Can use DB snapshot to perform PITR of a Read Replica
  • Can create a replica from an existing replica
  • reduces replication load from the master DB instance
  • second-tier replica can have higher replication lag
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Cross-Region Read Replicas in RDS

A
  • Supported for MariaDB, MySQL, Oracle, and PostgreSQL
  • Not supported for SQL Server
  • Advantages
  • Enhanced DR capability
  • Scale read operations closer to the
    end-users
  • Limitations
  • Higher replica lag times
  • AWS does not guarantee more than
    five cross-region read replica instances
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

RDS replicas with an external database

A
  • Replication b/w an external DB and an
    RDS replica
  • Supported for MySQL / MariaDB engines
  • Two ways
  • Binlog replication
  • GTID based Replication
18
Q

RDS Disaster Recovery Strategies

A
  • To ensure business continuity despite
    unexpected failures/events
  • Multi-AZ is not enough (it can’t protect from
    logical DB corruption, malicious attacks etc.)
  • Key metrics for DR plan – RTO and RPO
  • RDS PITR offers RPO of 5 minutes (typically)
  • RTO (Recovery time objective)
  • How long it takes you to recover after a disaster
  • Expressed in hours
  • RPO (Recovery point objective)
  • How much data you could lose due to a disaster
  • Expressed in hours (e.g. RPO of 1 hour means you could lose an hour worth of data)
19
Q

Comparing RDS DR Strategies Automated backups

A

RTO - Good
RPO - Better
Cost - Low
Scope - Single region

20
Q

Comparing RDS DR Strategies Manual snapshots

A

RTO - Better
RPO - Good
Cost - Medium
Scope - Cross-Region

21
Q

Comparing RDS DR Strategies Read replicas

A

RTO - Best
RPO - Best
Cost - High
Scope - Cross-Region

22
Q

RDS Disaster Recovery Strategies

A
  • To ensure business continuity despite
    unexpected failures/events
  • Multi-AZ is not enough (it can’t protect from
    logical DB corruption, malicious attacks etc.)
  • Key metrics for DR plan – RTO and RPO
  • RDS PITR offers RPO of 5 minutes (typically)
  • RTO (Recovery time objective)
  • How long it takes you to recover after a disaster
  • Expressed in hours
  • RPO (Recovery point objective)
  • How much data you could lose due to a disaster
  • Expressed in hours (e.g. RPO of 1 hour means you could
    lose an hour worth of data)
23
Q

Troubleshooting high replica lag

A
  • Asynchronous logical replication typically results in replica lag
  • You can monitor ReplicaLag metrics in
    CloudWatch
  • ReplicaLag metric reports Seconds_Behind_Master values
  • Replication delays can happen due to:
  • Long-running queries on the primary instance (slow query log can help)
  • Insufficient instance class size or storage
  • Parallel queries executed on the primary instance
24
Q

Troubleshooting replication errors - Recommendations:

A
  • Size the replica to match the source DB (storage size and DB instance class)
  • Use compatible DB parameter group settings for source DB and replica
  • Ex. max_allowed_packet for read replica must same as that of the source DB
    instance
  • Monitor the Replication State field of the replica instance
  • If Replication State = Error, then see error details in the Replication Error field
  • Use RDS event notifications to get alerts on such replica issues
25
Troubleshooting replication errors (contd.)
* Writing to tables on a read replica * Set read_only=0 to make read replica writable * Use only for maintenance tasks (like creating indexes only on replica) * If you write to tables on read replica, it might make it incompatible with source DB and break the replication * So set read_only=1 immediately after completing maintenance tasks * Replication is only supported with transactional storage engines like InnoDB. Using engines like MyISAM will cause replication errors * Using unsafe nondeterministic queries such as SYSDATE() can break replication * You can either skip replication errors (if its not a major one) or delete and recreate the replica
26
Troubleshooting MySQL read replica issues
* Errors or data inconsistencies b/w source instance and replica * Can happen due to binlog events or InnoDB redo logs aren’t flushed during a replica or source instance failure * Must manually delete and recreate the replica * Preventive recommendations: * sync_binlog=1 * innodb_flush_log_at_trx_commit=1 * innodb_support_xa=1 * These settings might reduce performance (so test before moving to production)
27
Performance hit on new read replicas
* RDS snapshots are EBS snapshots stored in S3 * When you spin up a new replica, its EBS volume loads lazily in the background * This results in first-touch penalty (when you query any data, it takes longer to retrieve it for the first time) * Suggestions: * If DB is small, run “SELECT * FROM ” query on each table on the replica * Initiate a full table scan with VACUUM ANALYZE (in PostgreSQL) * Another reason could be an empty buffer pool (cache for table and index data)
28
Scaling in RDS
* Vertical Scaling (Scaling up) * Single-AZ instance will be unavailable during scaling op * Multi-AZ setup offers minimal downtime during scaling op – standby DB gets upgraded first and then primary will failover to the upgraded instance * Horizontal Scaling (Scaling out) * Useful for read-heavy workloads * Use read-replicas * Replicas also act as a DR target
29
Sharding in RDS
* Sharding = horizontal partitioning * Split and distribute data across multiple DBs (called shards) * Mapping / routing logic maintained at application tier * Offers additional fault tolerance (since no single point of failure) * If any shard goes through failover, other shards are not impacted
30
RDS Monitoring - common metrics, native logs, manual monitoring tools
* Common metrics * CPU, RAM, disk space consumption / Network traffic / DB connections / IOPS metrics * Native logs / extensions * e.g. pgaudit extension in PostgreSQL for auditing (DML / DCL / DDL etc) * Manual Monitoring Tools * RDS console (DB connections, R/W ops, storage consumption, memory utilization, N/W traffic) * AWS Trusted Advisor (cost optimization, security, fault tolerance, performance improvement checks) * CloudWatch (service health status etc.)
31
RDS Monitoring - Automated Monitoring Tools
* RDS event notifications * Database logs (can be exported to CloudWatch Logs) * CloudWatch (Metrics / Alarms / Logs) * Enhanced Monitoring (real-time) * Performance Insights * RDS Recommendations * CloudTrail (captures all RDS API calls, can be viewed in CloudTrail console or delivered to an S3 bucket) * Up to 90 days of your account activity can be viewed in CloudTrail console (can create a trail to deliver the audit logs to S3)
32
RDS Notifications / Event subscriptions
* Available within the RDS console * Allows you to create CloudWatch alarms to notify you whenever certain metric data crosses a threshold * You can send alarm notifications to an SNS topic (email / SMS) * You can also subscribe to RDS events * Event sources can be snapshots, instances, security groups, parameter groups, clusters, cluster snapshots, etc. * Events like DB instance creation, deletion, availability (shutdown / restart), backup, recovery, failover, failure, backtrack, config change etc.
33
RDS Recommendations
* Periodic automated suggestions for DB instances, read replicas, and DB parameter groups
34
RDS Logs
* View / watch / download DB logs from the RDS console * Can export logs to CloudWatch Logs (log types vary by DB engine) * CloudWatch Logs never expire. To expire them, set log group retention policy (1 day - 10 yrs) * Logs are accessible from RDS console even if you disable log export to CloudWatch Logs
35
RDS Logs - Log types that can be exported to CloudWatch Logs - topics
* Alert log – Oracle * Audit log – Oracle, MariaDB, MySQL (must use option group with MARIADB_AUDIT_PLUGIN option for MariaDB and MySQL to audit database activity) * Listener log – Oracle * Trace log – Oracle * Error log – SQL Server, MariaDB, MySQL * Postgresql log – PostgreSQL (contains audit logs) * Upgrade log – PostgreSQL * General log – MariaDB, MySQL * Slow query log – MariaDB, MySQL
36
Exporting AWS RDS logs to S3
* RDS database log files can be accessed via RDS console, CLI or API * Transaction logs cannot be accessed * You can export log data from CloudWatch Logs to S3 by creating an export task in CloudWatch (createexport-task CLI command) * Log files can also be downloaded using the RDS API and uploaded to S3 (using Lambda or AWS SDK)
37
RDS Enhanced Monitoring
* To analyze real-time OS level metrics (CPU / memory usage etc.) * To monitor different processes or threads that are using the CPU * Helps identify performance issues * Increased granularity of 1 to 60 seconds * 1, 5, 10, 15, 30, or 60 seconds * Requires an agent to be installed on the DB server to collect metrics
38
RDS Performance Insights
* Offers visual dashboard for performance tuning, analysis and monitoring * Monitors DB load for the instance (if the instance has multiple DBs, you’ll see aggregated metrics) * DB load – average number of active sessions (AAS – average active sessions) * Performance problems will appear as spikes in the DB load graph * Helps identify performance bottlenecks, expensive SQL statements, etc.
39
RDS Performance Insights - DB load
* You can visualize the DB load, filter it by waits / SQL / hosts / users * Waits - wait state for CPU, IO, Lock etc. * SQL – SQL statements * Hosts * Users * Identify slow queries (top SQL), locks
40
CloudWatch Application Insights
* For .NET and SQL Server * Also supports DynamoDB tables * Identifies and sets up key metrics, logs, and alarms for SQL Server workloads * Uses CloudWatch events and alarms * Useful for problem detection, notification and troubleshooting
41
RDS on VMware
* Lets you deploy RDS DBs in on-premises VMware environments (VMware vSphere) * Same user interface as in AWS * Supports MySQL, PostgreSQL, and SQL Server * Fully managed DBs * Uses health monitoring to detect unhealthy database instances and automatically recovers them * Support manual and automatic backups with PITR * Can use CloudWatch for monitoring
42
RDS – Good things to know
* Read replica can be made writable (for MySQL/MariaDB) * For other engines * read replica cannot be made writable, but you can promote it to make it writable * For Oracle and SQL Server * Automatic backups or manual snapshots are not supported on the replica * For Oracle * Does not yet support Oracle RAC (a cluster DB with a shared cache architecture) * For SQL Server * Supports both Multi-AZ options – Database Mirroring and Always On * For PostgreSQL * Only manual snapshots are supported on the replica (no automatic backups) * Set log retention with parameter rds.log_retention_period * For MySQL/MariaDB * Set log retention with stored procedures * e.g. call mysql.rds_set_configuration('binlog reten