Chapter 17 of 26
High-Performing Databases: Amazon RDS and Related Options
Database performance questions often hide in details like read patterns and bursty workloads; learn how to tune RDS and choose complementary services to keep queries fast.
Positioning RDS in High-Performance Architectures
Where RDS Fits
Amazon RDS is a managed relational database service. AWS handles provisioning, patching, and backups so you can focus on schema, queries, and integration.
Key Performance Levers
RDS performance levers: engine (MySQL, PostgreSQL, Aurora, etc.), instance class, storage type, Multi-AZ, read replicas, and parameter/connection tuning.
Link to Well-Architected
The performance efficiency pillar focuses on efficient use of computing resources. For RDS, that means right-sizing, scaling reads, and avoiding overprovisioning.
What You Will Learn
You will design and tune RDS, use read replicas, decide when to add caching, and compare relational vs non-relational options for SAA-style performance questions.
Choosing RDS Engine and Instance Class for Performance
Engine Choice Matters
Engine affects performance: Aurora for high throughput and HA, standard MySQL/PostgreSQL/MariaDB for typical apps, Oracle/SQL Server mainly for legacy/licensing.
Aurora for High Performance
Aurora uses a distributed storage layer and supports many read replicas, giving better throughput and availability than standard engines on the same instance size.
Instance Class Families
Burstable (t3/t4g) for dev or low-average CPU, general purpose (m5/m6g) as a default, memory-optimized (r5/r6g) for read-heavy or analytics-style queries.
Exam Clues
Phrases like "bursty, low average" hint at burstable. "Consistent heavy load" or "mission-critical" point to general purpose or memory-optimized, often Aurora.
Storage, IOPS, and Multi-AZ: Getting the Baseline Right
Storage Types
gp3 is the default SSD with configurable IOPS and throughput. io1/io2 gives very high, predictable IOPS. Magnetic is legacy and not for performance.
IOPS vs Throughput
IOPS is operations per second, key for OLTP with many small reads/writes. Throughput is MB/s, key for large sequential operations.
Multi-AZ Basics
Multi-AZ for standard RDS engines uses synchronous replication to a standby in another AZ, improving availability but not read capacity.
Aurora and HA
Aurora’s storage is inherently multi-AZ and replicated. You add reader instances to scale reads; storage replication is built in.
Scaling Reads with Read Replicas and Aurora Readers
Why Read Replicas
Read-heavy workloads can overwhelm a single primary. Read replicas offload SELECT queries, reducing CPU and I/O load on the primary.
Standard RDS Replicas
Standard RDS replicas use asynchronous replication. They can be in or across regions, but may show replication lag for very recent writes.
Aurora Readers
Aurora provides a reader endpoint that load-balances across up to 15 reader instances, with fast, storage-level replication.
Exam Clues
Need higher read throughput? Add replicas. Need the latest data? Read from the primary/writer. Multi-AZ alone does not increase read capacity.
Connection Management, Pooling, and Burst Handling
Why Connections Matter
Each DB connection uses memory and CPU. Connection storms during bursts can cause timeouts, slow queries, or "too many connections" errors.
Connection Pooling
Use pooling at the app tier: reuse connections and limit max connections per instance to avoid overwhelming the database.
What RDS Proxy Does
RDS Proxy maintains a shared pool of connections between your apps and RDS/Aurora, reducing overhead and smoothing bursts.
Exam Pattern
If Lambda or many short-lived clients are overloading RDS with connections, the likely answer is to introduce RDS Proxy and tune pooling.
Caching Strategies with Databases (ElastiCache and Application Caches)
Why Add a Cache
Caching serves frequent reads from memory instead of disk, reducing DB load and latency, especially for hot, rarely changing data.
Caching Options
You can cache in the app process or use a distributed cache like Amazon ElastiCache (Redis or Memcached) shared across instances.
Cache Patterns
Common patterns: read-through and cache-aside, where the app checks the cache first and populates it from RDS on a miss.
Exam Signals
Phrases like "frequently read, rarely updated" and "reduce database load" usually indicate adding ElastiCache in front of RDS.
Relational vs Non-Relational: Performance Tradeoffs
Relational Strengths
RDS/Aurora excel at structured data, SQL joins, and ACID transactions. They scale reads well but writes are harder to scale.
DynamoDB Strengths
DynamoDB is a fully managed NoSQL store with single-digit millisecond latency and horizontal scaling for both reads and writes.
When to Use RDS
Choose RDS when you need SQL, complex joins, and strong transactional guarantees across multiple rows or tables.
When to Use DynamoDB
Choose DynamoDB for huge scale, simple key-based access patterns, and strict low-latency requirements at any scale.
End-to-End Design: Read-Heavy Ecommerce Catalog
Scenario Overview
Ecommerce site: read-heavy product catalog, occasional writes, low-latency product pages during flash sales, and strong consistency for checkout.
Core Database Choice
Choose Amazon Aurora MySQL-compatible as the system of record for products and orders, for relational integrity and high performance.
Baseline Aurora Setup
Create an Aurora cluster with one writer and two readers, using memory-optimized instances to keep hot data in memory.
Read Scaling and Caching
Point browsing queries at the reader endpoint, add ElastiCache Redis for product details with cache-aside, and use TTL plus invalidation on updates.
Handling Bursts
Place RDS Proxy between the app and Aurora to pool connections and smooth spikes during flash sales, improving stability and latency.
Thought Exercise: Fix the Slow Reporting Database
Imagine you are called in to troubleshoot a slow reporting system.
Current setup
- Single RDS PostgreSQL instance (db.m5.large, gp3 storage).
- Hosts both OLTP (web app) and heavy reporting queries (BI dashboards).
- Symptoms during business hours:
- Web app is slow.
- Reporting queries time out.
- CPU on RDS is 90–100%.
Your task
Take 2–3 minutes and sketch (mentally or on paper) an improved architecture using only AWS managed services. Answer these prompts:
- Isolation: How would you separate OLTP and reporting workloads so they do not compete for CPU and I/O?
- Read scaling: Which features of RDS or Aurora would you use to spread reporting reads?
- Caching: Is there any data that could be cached to reduce repeated heavy queries?
- Engine/instance: Would you change the engine (e.g., to Aurora) or instance class? Why?
When you are done, compare to a reference solution:
- Create one or more read replicas (or Aurora readers) dedicated to reporting. Point BI tools at these replicas instead of the primary.
- Optionally migrate to Aurora PostgreSQL for more read replicas and better throughput.
- Scale up to a larger memory-optimized instance to support more in-memory data and connections.
- Introduce ElastiCache for common summary data that dashboards hit repeatedly.
As you design, keep the performance efficiency pillar in mind: use resources efficiently while meeting requirements, not just "throw the biggest instance" at the problem.
Quiz 1: RDS Performance Concepts
Test your understanding of core RDS performance features.
A startup runs a read-heavy social media app on a single RDS MySQL instance. During peak hours, CPU usage is high and read queries are slow. The app must continue using MySQL, and developers want minimal code changes. What is the BEST first step to improve read performance?
- Enable Multi-AZ on the existing RDS instance and direct read traffic to the standby
- Create one or more RDS MySQL read replicas and direct read-only queries to the replicas
- Migrate the database to DynamoDB and redesign the data model for key-value access
- Switch the storage type from gp3 to magnetic to reduce latency
Show Answer
Answer: B) Create one or more RDS MySQL read replicas and direct read-only queries to the replicas
Read replicas are designed to scale read-heavy workloads. Creating RDS MySQL read replicas and directing read-only queries to them offloads the primary and improves performance with minimal code changes. Multi-AZ improves availability, not read capacity. DynamoDB would require a major redesign. Magnetic storage is slower and not appropriate for performance.
Quiz 2: Caching and Connection Management
Check how well you can recognize when to use caching and RDS Proxy.
An application uses AWS Lambda functions to query an RDS PostgreSQL database. During traffic spikes, the database experiences thousands of new connections per second, causing timeouts and high latency. Which combination of changes is MOST appropriate to improve performance?
- Increase RDS storage size and switch to Multi-AZ
- Add an ElastiCache cluster and move all data from RDS into the cache
- Introduce RDS Proxy between Lambda and RDS, and ensure Lambda functions reuse pooled connections
- Convert the RDS instance to Aurora Serverless v2 and disable connection limits
Show Answer
Answer: C) Introduce RDS Proxy between Lambda and RDS, and ensure Lambda functions reuse pooled connections
RDS Proxy is specifically designed to manage and pool database connections for serverless and spiky workloads like Lambda. Putting RDS Proxy between Lambda and RDS reduces connection storms and improves stability. ElastiCache can help with read load but does not solve connection storms by itself. Multi-AZ improves availability, not connection behavior. Aurora Serverless v2 does not remove connection limits and does not by itself fix connection storms.
Key Term Review: RDS Performance and Related Services
Flip through these cards to reinforce core concepts.
- Amazon RDS
- A managed relational database service that handles provisioning, patching, backups, and basic monitoring for engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
- Read replica
- A read-only copy of a database instance that uses asynchronous replication from the primary to offload read traffic and improve read scalability.
- Multi-AZ deployment (standard RDS engines)
- A high-availability configuration where data is synchronously replicated to a standby instance in another Availability Zone. It improves availability and durability but does not increase read capacity.
- Amazon Aurora reader endpoint
- A cluster endpoint in Aurora that automatically load-balances read-only connections across available reader instances to scale read throughput.
- RDS Proxy
- A fully managed database proxy for RDS and Aurora that pools and shares database connections, improving scalability, resilience, and performance for applications with many short-lived connections.
- Amazon ElastiCache
- A managed in-memory data store service compatible with Redis and Memcached, commonly used as a caching layer to reduce database load and improve latency.
- gp3 storage (RDS)
- General Purpose SSD storage for RDS where you can configure baseline IOPS and throughput independently of storage size, suitable for most production workloads.
- Provisioned IOPS (io1/io2) for RDS
- SSD storage type that provides high, predictable IOPS and low latency for I/O-intensive workloads, at a higher cost than gp3.
- Aurora Serverless v2
- An on-demand, auto-scaling configuration for Aurora that adjusts database capacity in fine-grained increments based on workload, useful for variable or unpredictable workloads.
- DynamoDB vs RDS (performance)
- DynamoDB is a fully managed NoSQL service that scales reads and writes horizontally with single-digit millisecond latency, ideal for massive scale and simple access patterns; RDS is best for relational data, complex queries, and transactions.
Key Terms
- DynamoDB
- A fully managed NoSQL key-value and document database service that delivers single-digit millisecond performance at any scale with horizontal partitioning.
- RDS Proxy
- A fully managed database proxy for RDS and Aurora that pools and shares database connections, improving scalability, resilience, and performance for applications with many short-lived connections.
- Amazon RDS
- A managed relational database service that handles provisioning, patching, backups, and basic monitoring for engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
- gp3 storage
- General Purpose SSD storage type for RDS that allows you to configure baseline IOPS and throughput independently of storage size.
- Read replica
- A read-only copy of a database instance that uses asynchronous replication from the primary to offload read traffic and improve read scalability.
- Amazon Aurora
- A high-performance, highly available relational database engine compatible with MySQL and PostgreSQL that uses a distributed, fault-tolerant storage system and supports up to 15 read replicas.
- Amazon ElastiCache
- A managed in-memory data store service compatible with Redis and Memcached, commonly used as a caching layer to reduce database load and improve latency.
- Connection pooling
- An application or middleware technique where a pool of database connections is maintained and reused, reducing overhead and avoiding connection storms.
- Cache-aside pattern
- A caching pattern where the application checks the cache first and, on a miss, loads data from the database and populates the cache before returning the result.
- Multi-AZ deployment
- A high-availability configuration where data is synchronously replicated to a standby instance in another Availability Zone; for standard RDS engines it improves availability and durability but does not increase read capacity.
- Aurora Serverless v2
- An on-demand auto-scaling configuration for Aurora that adjusts database capacity in fine-grained increments based on workload demand.
- Provisioned IOPS (io1/io2)
- SSD storage types for RDS that provide high, predictable I/O operations per second and low latency for I/O-intensive workloads.