SkarpSkarp

Chapter 17 of 26

High-Performing Databases: Amazon RDS and Related Options

Database performance questions often hide in details like read patterns and bursty workloads; learn how to tune RDS and choose complementary services to keep queries fast.

27 min readen

Positioning RDS in High-Performance Architectures

Where RDS Fits

Amazon RDS is a managed relational database service. AWS handles provisioning, patching, and backups so you can focus on schema, queries, and integration.

Key Performance Levers

RDS performance levers: engine (MySQL, PostgreSQL, Aurora, etc.), instance class, storage type, Multi-AZ, read replicas, and parameter/connection tuning.

Link to Well-Architected

The performance efficiency pillar focuses on efficient use of computing resources. For RDS, that means right-sizing, scaling reads, and avoiding overprovisioning.

What You Will Learn

You will design and tune RDS, use read replicas, decide when to add caching, and compare relational vs non-relational options for SAA-style performance questions.

Choosing RDS Engine and Instance Class for Performance

Engine Choice Matters

Engine affects performance: Aurora for high throughput and HA, standard MySQL/PostgreSQL/MariaDB for typical apps, Oracle/SQL Server mainly for legacy/licensing.

Aurora for High Performance

Aurora uses a distributed storage layer and supports many read replicas, giving better throughput and availability than standard engines on the same instance size.

Instance Class Families

Burstable (t3/t4g) for dev or low-average CPU, general purpose (m5/m6g) as a default, memory-optimized (r5/r6g) for read-heavy or analytics-style queries.

Exam Clues

Phrases like "bursty, low average" hint at burstable. "Consistent heavy load" or "mission-critical" point to general purpose or memory-optimized, often Aurora.

Storage, IOPS, and Multi-AZ: Getting the Baseline Right

Storage Types

gp3 is the default SSD with configurable IOPS and throughput. io1/io2 gives very high, predictable IOPS. Magnetic is legacy and not for performance.

IOPS vs Throughput

IOPS is operations per second, key for OLTP with many small reads/writes. Throughput is MB/s, key for large sequential operations.

Multi-AZ Basics

Multi-AZ for standard RDS engines uses synchronous replication to a standby in another AZ, improving availability but not read capacity.

Aurora and HA

Aurora’s storage is inherently multi-AZ and replicated. You add reader instances to scale reads; storage replication is built in.

Scaling Reads with Read Replicas and Aurora Readers

Why Read Replicas

Read-heavy workloads can overwhelm a single primary. Read replicas offload SELECT queries, reducing CPU and I/O load on the primary.

Standard RDS Replicas

Standard RDS replicas use asynchronous replication. They can be in or across regions, but may show replication lag for very recent writes.

Aurora Readers

Aurora provides a reader endpoint that load-balances across up to 15 reader instances, with fast, storage-level replication.

Exam Clues

Need higher read throughput? Add replicas. Need the latest data? Read from the primary/writer. Multi-AZ alone does not increase read capacity.

Connection Management, Pooling, and Burst Handling

Why Connections Matter

Each DB connection uses memory and CPU. Connection storms during bursts can cause timeouts, slow queries, or "too many connections" errors.

Connection Pooling

Use pooling at the app tier: reuse connections and limit max connections per instance to avoid overwhelming the database.

What RDS Proxy Does

RDS Proxy maintains a shared pool of connections between your apps and RDS/Aurora, reducing overhead and smoothing bursts.

Exam Pattern

If Lambda or many short-lived clients are overloading RDS with connections, the likely answer is to introduce RDS Proxy and tune pooling.

Caching Strategies with Databases (ElastiCache and Application Caches)

Why Add a Cache

Caching serves frequent reads from memory instead of disk, reducing DB load and latency, especially for hot, rarely changing data.

Caching Options

You can cache in the app process or use a distributed cache like Amazon ElastiCache (Redis or Memcached) shared across instances.

Cache Patterns

Common patterns: read-through and cache-aside, where the app checks the cache first and populates it from RDS on a miss.

Exam Signals

Phrases like "frequently read, rarely updated" and "reduce database load" usually indicate adding ElastiCache in front of RDS.

Relational vs Non-Relational: Performance Tradeoffs

Relational Strengths

RDS/Aurora excel at structured data, SQL joins, and ACID transactions. They scale reads well but writes are harder to scale.

DynamoDB Strengths

DynamoDB is a fully managed NoSQL store with single-digit millisecond latency and horizontal scaling for both reads and writes.

When to Use RDS

Choose RDS when you need SQL, complex joins, and strong transactional guarantees across multiple rows or tables.

When to Use DynamoDB

Choose DynamoDB for huge scale, simple key-based access patterns, and strict low-latency requirements at any scale.

End-to-End Design: Read-Heavy Ecommerce Catalog

Scenario Overview

Ecommerce site: read-heavy product catalog, occasional writes, low-latency product pages during flash sales, and strong consistency for checkout.

Core Database Choice

Choose Amazon Aurora MySQL-compatible as the system of record for products and orders, for relational integrity and high performance.

Baseline Aurora Setup

Create an Aurora cluster with one writer and two readers, using memory-optimized instances to keep hot data in memory.

Read Scaling and Caching

Point browsing queries at the reader endpoint, add ElastiCache Redis for product details with cache-aside, and use TTL plus invalidation on updates.

Handling Bursts

Place RDS Proxy between the app and Aurora to pool connections and smooth spikes during flash sales, improving stability and latency.

Thought Exercise: Fix the Slow Reporting Database

Imagine you are called in to troubleshoot a slow reporting system.

Current setup

  • Single RDS PostgreSQL instance (db.m5.large, gp3 storage).
  • Hosts both OLTP (web app) and heavy reporting queries (BI dashboards).
  • Symptoms during business hours:
  • Web app is slow.
  • Reporting queries time out.
  • CPU on RDS is 90–100%.

Your task

Take 2–3 minutes and sketch (mentally or on paper) an improved architecture using only AWS managed services. Answer these prompts:

  1. Isolation: How would you separate OLTP and reporting workloads so they do not compete for CPU and I/O?
  2. Read scaling: Which features of RDS or Aurora would you use to spread reporting reads?
  3. Caching: Is there any data that could be cached to reduce repeated heavy queries?
  4. Engine/instance: Would you change the engine (e.g., to Aurora) or instance class? Why?

When you are done, compare to a reference solution:

  • Create one or more read replicas (or Aurora readers) dedicated to reporting. Point BI tools at these replicas instead of the primary.
  • Optionally migrate to Aurora PostgreSQL for more read replicas and better throughput.
  • Scale up to a larger memory-optimized instance to support more in-memory data and connections.
  • Introduce ElastiCache for common summary data that dashboards hit repeatedly.

As you design, keep the performance efficiency pillar in mind: use resources efficiently while meeting requirements, not just "throw the biggest instance" at the problem.

Quiz 1: RDS Performance Concepts

Test your understanding of core RDS performance features.

A startup runs a read-heavy social media app on a single RDS MySQL instance. During peak hours, CPU usage is high and read queries are slow. The app must continue using MySQL, and developers want minimal code changes. What is the BEST first step to improve read performance?

  1. Enable Multi-AZ on the existing RDS instance and direct read traffic to the standby
  2. Create one or more RDS MySQL read replicas and direct read-only queries to the replicas
  3. Migrate the database to DynamoDB and redesign the data model for key-value access
  4. Switch the storage type from gp3 to magnetic to reduce latency
Show Answer

Answer: B) Create one or more RDS MySQL read replicas and direct read-only queries to the replicas

Read replicas are designed to scale read-heavy workloads. Creating RDS MySQL read replicas and directing read-only queries to them offloads the primary and improves performance with minimal code changes. Multi-AZ improves availability, not read capacity. DynamoDB would require a major redesign. Magnetic storage is slower and not appropriate for performance.

Quiz 2: Caching and Connection Management

Check how well you can recognize when to use caching and RDS Proxy.

An application uses AWS Lambda functions to query an RDS PostgreSQL database. During traffic spikes, the database experiences thousands of new connections per second, causing timeouts and high latency. Which combination of changes is MOST appropriate to improve performance?

  1. Increase RDS storage size and switch to Multi-AZ
  2. Add an ElastiCache cluster and move all data from RDS into the cache
  3. Introduce RDS Proxy between Lambda and RDS, and ensure Lambda functions reuse pooled connections
  4. Convert the RDS instance to Aurora Serverless v2 and disable connection limits
Show Answer

Answer: C) Introduce RDS Proxy between Lambda and RDS, and ensure Lambda functions reuse pooled connections

RDS Proxy is specifically designed to manage and pool database connections for serverless and spiky workloads like Lambda. Putting RDS Proxy between Lambda and RDS reduces connection storms and improves stability. ElastiCache can help with read load but does not solve connection storms by itself. Multi-AZ improves availability, not connection behavior. Aurora Serverless v2 does not remove connection limits and does not by itself fix connection storms.

Key Term Review: RDS Performance and Related Services

Flip through these cards to reinforce core concepts.

Amazon RDS
A managed relational database service that handles provisioning, patching, backups, and basic monitoring for engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
Read replica
A read-only copy of a database instance that uses asynchronous replication from the primary to offload read traffic and improve read scalability.
Multi-AZ deployment (standard RDS engines)
A high-availability configuration where data is synchronously replicated to a standby instance in another Availability Zone. It improves availability and durability but does not increase read capacity.
Amazon Aurora reader endpoint
A cluster endpoint in Aurora that automatically load-balances read-only connections across available reader instances to scale read throughput.
RDS Proxy
A fully managed database proxy for RDS and Aurora that pools and shares database connections, improving scalability, resilience, and performance for applications with many short-lived connections.
Amazon ElastiCache
A managed in-memory data store service compatible with Redis and Memcached, commonly used as a caching layer to reduce database load and improve latency.
gp3 storage (RDS)
General Purpose SSD storage for RDS where you can configure baseline IOPS and throughput independently of storage size, suitable for most production workloads.
Provisioned IOPS (io1/io2) for RDS
SSD storage type that provides high, predictable IOPS and low latency for I/O-intensive workloads, at a higher cost than gp3.
Aurora Serverless v2
An on-demand, auto-scaling configuration for Aurora that adjusts database capacity in fine-grained increments based on workload, useful for variable or unpredictable workloads.
DynamoDB vs RDS (performance)
DynamoDB is a fully managed NoSQL service that scales reads and writes horizontally with single-digit millisecond latency, ideal for massive scale and simple access patterns; RDS is best for relational data, complex queries, and transactions.

Key Terms

DynamoDB
A fully managed NoSQL key-value and document database service that delivers single-digit millisecond performance at any scale with horizontal partitioning.
RDS Proxy
A fully managed database proxy for RDS and Aurora that pools and shares database connections, improving scalability, resilience, and performance for applications with many short-lived connections.
Amazon RDS
A managed relational database service that handles provisioning, patching, backups, and basic monitoring for engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Amazon Aurora.
gp3 storage
General Purpose SSD storage type for RDS that allows you to configure baseline IOPS and throughput independently of storage size.
Read replica
A read-only copy of a database instance that uses asynchronous replication from the primary to offload read traffic and improve read scalability.
Amazon Aurora
A high-performance, highly available relational database engine compatible with MySQL and PostgreSQL that uses a distributed, fault-tolerant storage system and supports up to 15 read replicas.
Amazon ElastiCache
A managed in-memory data store service compatible with Redis and Memcached, commonly used as a caching layer to reduce database load and improve latency.
Connection pooling
An application or middleware technique where a pool of database connections is maintained and reused, reducing overhead and avoiding connection storms.
Cache-aside pattern
A caching pattern where the application checks the cache first and, on a miss, loads data from the database and populates the cache before returning the result.
Multi-AZ deployment
A high-availability configuration where data is synchronously replicated to a standby instance in another Availability Zone; for standard RDS engines it improves availability and durability but does not increase read capacity.
Aurora Serverless v2
An on-demand auto-scaling configuration for Aurora that adjusts database capacity in fine-grained increments based on workload demand.
Provisioned IOPS (io1/io2)
SSD storage types for RDS that provide high, predictable I/O operations per second and low latency for I/O-intensive workloads.

Finished reading?

Test your understanding with a custom practice exam on this chapter.

Test yourself