Chapter 19 of 26
Cost-Optimized Databases and Data Transfer: RDS, Caching, and Network Economics
Database and data transfer charges can surprise even experienced teams. This module explains how to keep relational workloads and network paths cost-effective while still meeting performance and availability goals.
Big Picture: Why Databases and Data Transfer Get Expensive
Why This Matters
Databases and data transfer often become surprise cost centers. This module connects your EC2 and storage knowledge to relational databases (Amazon RDS) and network paths.
Cost Optimization Pillar
The AWS Well-Architected cost optimization pillar is about continually refining systems to achieve business outcomes at minimum cost. RDS and data transfer are key places to apply it.
Main Cost Levers
Your main levers: how you size/configure RDS, how much traffic hits the DB vs caches/replicas, and where data flows (intra-AZ, cross-AZ, cross-Region, internet).
Exam Mindset
On the exam, if two designs meet reliability and performance, the one that reduces RDS capacity and cross-AZ/cross-Region traffic is usually the cost-optimized answer.
RDS Cost Components: What You Actually Pay For
Instance Hours
RDS charges per DB instance-hour based on instance class and engine. Aurora Serverless v2 uses Aurora capacity units (ACUs) instead of instance-hours.
Storage Types
You pay per GB-month of storage. gp3 is cheaper and flexible; io1/io2 provide provisioned IOPS at higher cost. Aurora storage auto-scales and is billed per GB used.
I/O and Throughput
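Whether I/O is billed separately depends on the storage type and engine. With gp3 and provisioned IOPS storage, I/O is covered by the storage and IOPS you provision rather than billed per request. Aurora Standard charges per million I/O requests, which can add up for chatty workloads; Aurora I/O-Optimized removes the per-request charge in exchange for higher instance and storage rates.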
Backups and Snapshots
Backup storage up to the size of your provisioned database storage is included at no extra charge. Extra manual snapshots, long retention, and cross-Region copies incur additional GB-month and data transfer charges.
Multi-AZ and Replicas
Multi-AZ and read replicas are separate instances with their own compute and storage costs. Exam questions often hide these costs behind "high availability" wording.
Worked Example: Comparing Two RDS Configurations
Two Options
Option A: Single-AZ db.m6g.large, 200 GB gp3, 7-day backups. Option B: Multi-AZ db.m6g.large (primary + standby), same storage and backups.
Cost Differences
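Option B roughly doubles both instance and storage cost, because the standby is a second instance with its own copy of the storage. Replication traffic between the primary and the standby is built into RDS pricing, not billed as a separate data transfer line item.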
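To make the comparison between Option A and Option B concrete, the sketch below plugs placeholder rates into both options. The hourly and per-GB prices are hypothetical values chosen for easy arithmetic, not AWS list prices, and the 2x multiplier encodes the assumption that Multi-AZ pays for a second instance and a second copy of storage. Backups are left out because they are the same in both options.

```python
# Hypothetical rates for illustration only; look up real prices for your Region.
HOURS_PER_MONTH = 730
INSTANCE_RATE_PER_HOUR = 0.15      # assumed Single-AZ rate for db.m6g.large
STORAGE_RATE_PER_GB_MONTH = 0.10   # assumed Single-AZ gp3 rate


def monthly_cost(storage_gb: int, multi_az: bool) -> float:
    """Estimate monthly instance plus storage cost for one RDS deployment."""
    # Assumption: Multi-AZ pays for a standby instance and its storage copy.
    multiplier = 2 if multi_az else 1
    instance = INSTANCE_RATE_PER_HOUR * HOURS_PER_MONTH * multiplier
    storage = STORAGE_RATE_PER_GB_MONTH * storage_gb * multiplier
    return round(instance + storage, 2)


option_a = monthly_cost(storage_gb=200, multi_az=False)
option_b = monthly_cost(storage_gb=200, multi_az=True)
print(f"Option A (Single-AZ): ${option_a:.2f}")
print(f"Option B (Multi-AZ):  ${option_b:.2f}")
```

Under these assumed rates, instance hours dominate the bill, so the decision turns mostly on whether the workload needs the standby instance.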
Network Angle
App-to-DB traffic can incur cross-AZ charges if the app runs in a different AZ from the primary DB. Placing high-volume app instances in the same AZ as the primary avoids these charges, though concentrating the app tier in one AZ reduces resilience.
Choosing on the Exam
If questions stress high availability and automatic failover, choose Multi-AZ. If they stress minimizing cost with no strict HA requirement, Single-AZ may be correct.
Read Replicas and Scaling Reads Cost-Effectively
What Are Read Replicas?
Read replicas are separate RDS instances that receive asynchronous replication from a primary. They exist to scale read traffic, not to provide synchronous HA.
Costs and Trade-offs
Each replica has full instance and storage cost. They can be cheaper than a huge primary for read-heavy workloads, but are only eventually consistent.
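One way an application might take advantage of replicas is to split traffic by statement type. The following is a minimal sketch, not an RDS feature: the `ReadWriteRouter` class, the `needs_fresh_read` flag, and the endpoint names are all hypothetical. Writes, and reads that must see the latest write, go to the primary; other reads rotate across replicas.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._readers = itertools.cycle(replicas or [primary])

    def endpoint_for(self, is_write: bool, needs_fresh_read: bool = False) -> str:
        # Replicas lag the primary, so reads that must see the latest
        # write are sent to the primary as well.
        if is_write or needs_fresh_read:
            return self.primary
        return next(self._readers)


# Hypothetical endpoint names, not real hosts.
router = ReadWriteRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for(is_write=True))
print(router.endpoint_for(is_write=False))
print(router.endpoint_for(is_write=False))
```

The more reads that can tolerate replica lag, the smaller the primary can be, which is where the cost saving comes from.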
Aurora Read Scaling
Aurora read replicas share a common storage volume with the writer, reducing storage duplication. A cluster can have up to 15 read replicas, and Aurora Global Database extends reads to other Regions.
Common Exam Trap
Multi-AZ is for high availability of writes, not read scaling. If the need is "scale reads and reduce cost", think read replicas and/or caching, not just Multi-AZ.
Caching to Reduce Database Load and Cost
Why Cache?
Caching avoids unnecessary DB queries. ElastiCache (Redis/Memcached) can serve hot data from memory, reducing RDS load and instance size requirements.
Patterns
Typical flow: app checks cache, on miss queries RDS, then stores the result in cache with a TTL. Cache entire query results for frequently accessed pages or lists.
Cost Angle
ElastiCache nodes cost money, but can offload huge read volumes. Often, "ElastiCache + smaller RDS" is cheaper than a single very large RDS instance.
Consistency Trade-offs
Caches can be stale. Use TTLs and invalidation strategies. For critical, always-fresh data, you may bypass cache or use very short TTLs.
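One common invalidation strategy is to delete the cached entry whenever the underlying row changes. The sketch below assumes plain dictionaries as stand-ins for the cache and the database; the function and key names are hypothetical.

```python
cache = {}                                    # stand-in for ElastiCache
database = {"article:1": "draft headline"}    # stand-in for RDS


def read_article(key):
    if key not in cache:
        cache[key] = database[key]   # populate the cache on a miss
    return cache[key]


def update_article(key, value):
    database[key] = value    # write to the source of truth first
    cache.pop(key, None)     # then drop the cached copy so the next read refetches


read_article("article:1")
update_article("article:1", "final headline")
print(read_article("article:1"))  # final headline
```

Without the `cache.pop` call, the last read would keep returning the draft headline until a TTL expired.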
Performance Efficiency
Caching aligns with the performance efficiency pillar: using resources efficiently as demand changes, by serving frequent reads from fast, cheap memory.
RDS Deployment Options: Cost vs Resilience
Single-AZ
Single-AZ RDS has one instance in one AZ. It is cheapest but cannot automatically fail over if the AZ fails, leading to longer downtime.
Multi-AZ Instances
Multi-AZ creates a synchronous standby in another AZ with automatic failover. It improves availability but doubles instance costs and increases storage cost.
Read Replicas vs HA
Read replicas scale reads and can be promoted manually. They are not the same as Multi-AZ, which is focused on automatic high availability.
Aurora Choices
Aurora replicates storage across AZs by default and supports multiple readers and Global Database for cross-Region DR and low-latency reads.
Choosing on the Exam
Mention of RPO/RTO, AZ failure, or automatic failover points to Multi-AZ or Aurora. Emphasis on lowest cost with tolerable downtime points to Single-AZ.
Data Transfer Pricing Basics: Intra-AZ, Cross-AZ, Cross-Region, Internet
Intra-AZ
Traffic between resources in the same AZ is typically free or low cost. Placing EC2 and RDS in the same AZ minimizes data transfer charges.
Cross-AZ
Traffic between different AZs in the same Region is billed per GB. Chatty app-to-DB traffic across AZs can become a noticeable cost.
Cross-Region
Data transfer between Regions is more expensive than cross-AZ. This applies to flows such as RDS cross-Region read replicas and S3 Cross-Region Replication.
Internet Egress
Data leaving AWS to the internet is usually the most expensive. Inbound traffic from the internet is generally free; outbound egress is what you optimize away.
Exam Pattern
When comparing designs, the cost-optimized one usually keeps traffic local (same AZ/Region) and minimizes internet egress and cross-Region flows.
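A rough way to compare designs is to multiply each flow's monthly volume by a per-GB rate for its path type. The rates in this sketch are hypothetical and chosen only to reflect the relative ordering described above; they are not AWS prices.

```python
# Hypothetical per-GB rates that show relative ordering only.
RATE_PER_GB = {
    "intra_az": 0.00,
    "cross_az": 0.02,
    "cross_region": 0.04,
    "internet_egress": 0.09,
}


def transfer_cost(path: str, gb: float) -> float:
    """Monthly cost of moving `gb` gigabytes over one path type."""
    return round(RATE_PER_GB[path] * gb, 2)


monthly_gb = 5_000
for path in RATE_PER_GB:
    print(f"{path:16s} ${transfer_cost(path, monthly_gb):.2f}")
```

The same 5,000 GB costs nothing when kept inside one AZ and the most when sent to the internet, which is why keeping traffic local is the usual cost-optimized choice.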
Cost-Aware Use of CloudFront and VPC Endpoints
CloudFront Basics
CloudFront caches content at edge locations, reducing load on origins and often lowering internet egress cost compared to direct S3/EC2 delivery.
CloudFront Exam Angle
For global users accessing S3 static sites or APIs, CloudFront usually improves performance and reduces cost. Expect this pattern on the exam.
VPC Endpoints
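Gateway endpoints (S3/DynamoDB) and interface endpoints (PrivateLink) let VPC resources reach AWS services without traversing the public internet. Gateway endpoints work through route table entries; interface endpoints use private IP addresses in your subnets.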
NAT vs Endpoints
NAT gateways bill an hourly charge plus a per-GB data processing charge. Heavy S3/DynamoDB traffic via NAT is costly. A gateway endpoint keeps that traffic on the AWS backbone and avoids NAT data processing charges.
Common Trap
If a scenario complains about high NAT costs for S3/DynamoDB, the fix is usually "add a VPC gateway endpoint", not "buy a bigger NAT".
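The saving from a gateway endpoint can be estimated from the volume of S3 or DynamoDB traffic that currently flows through the NAT gateway. The rates below are assumptions for illustration: a hypothetical NAT data processing rate, and no per-GB charge for the gateway endpoint.

```python
# Hypothetical rates; check current pricing for your Region.
NAT_PROCESSING_RATE_PER_GB = 0.045    # assumed NAT gateway data processing rate
GATEWAY_ENDPOINT_RATE_PER_GB = 0.0    # assumption: no per-GB charge for gateway endpoints


def monthly_nat_savings(s3_gb_via_nat: float) -> float:
    """Per-GB charges avoided by routing S3 traffic through a gateway endpoint."""
    before = s3_gb_via_nat * NAT_PROCESSING_RATE_PER_GB
    after = s3_gb_via_nat * GATEWAY_ENDPOINT_RATE_PER_GB
    return round(before - after, 2)


print(monthly_nat_savings(10_000))  # 450.0
```

This estimate covers only the per-GB portion. If other traffic still needs the NAT gateway, its hourly charge remains.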
Thought Exercise: Re-Design a Costly Architecture
Imagine this architecture for a news website:
- Users around the world access a web app running on EC2 instances in two AZs behind an Application Load Balancer (ALB).
- The app uses RDS for MySQL in Multi-AZ with the primary in AZ A and standby in AZ B.
- The app instances in both AZs connect to the primary DB.
- Static images are stored in S3 and served directly to users (no CloudFront).
- Private subnets use a single NAT gateway in AZ A to reach S3.
Costs are high. Mentally walk through these questions and jot down answers:
- Cross-AZ traffic: Where is cross-AZ data transfer happening? How could you reduce it while still being resilient?
- Database load: What data is likely read frequently and could be cached? Where would you insert ElastiCache, and how might that affect RDS size/cost?
- Internet egress: How could CloudFront change the cost profile of serving static content and maybe some API responses?
- NAT gateway: Why might NAT charges be high here? What VPC feature could you add to reduce them?
Try to produce a revised high-level design that:
- Keeps high-volume traffic within the same AZ when possible.
- Uses caching to offload RDS.
- Uses CloudFront and VPC endpoints to reduce egress and NAT costs.
You do not need exact prices. Focus on which flows get cheaper and why.
Quiz 1: RDS and Caching for Cost Optimization
Check your understanding of RDS deployment and caching strategies.
A startup runs a read-heavy analytics dashboard on RDS for PostgreSQL. The single-AZ db.r6g.large instance is CPU-bound during business hours. They want to minimize cost while improving read performance. Which option is MOST cost-optimized?
- A) Upgrade the RDS instance to a larger db.r6g.4xlarge and keep the architecture the same.
- B) Enable Multi-AZ on the existing instance so reads can be split between primary and standby.
- C) Add one or more read replicas and introduce ElastiCache to cache common query results.
- D) Migrate from RDS to a larger EC2 instance running self-managed PostgreSQL.
Answer: C) Add one or more read replicas and introduce ElastiCache to cache common query results.
Read-heavy workloads are ideal for scaling out with read replicas and caching. Option C offloads reads to replicas and ElastiCache, allowing the primary to stay smaller. Multi-AZ (B) does not provide read scaling; the standby is not used for reads. Simply scaling up (A or D) increases cost significantly and does not leverage cheaper caching.
Quiz 2: Data Transfer and Network Economics
Test your understanding of data transfer cost patterns and optimization tools.
A company has private subnets in a VPC. Instances frequently read and write large objects to an S3 bucket in the same Region. The finance team notices high NAT gateway data processing charges. What is the BEST way to reduce these costs while maintaining security?
- A) Move the S3 bucket to a different Region with cheaper data transfer pricing.
- B) Add a VPC gateway endpoint for S3 and update route tables for the private subnets.
- C) Replace the NAT gateway with a larger NAT instance in a public subnet.
- D) Expose the S3 bucket publicly and access it over the internet instead of through the NAT.
Answer: B) Add a VPC gateway endpoint for S3 and update route tables for the private subnets.
A VPC gateway endpoint for S3 lets instances in private subnets access S3 over the AWS backbone without using the NAT gateway, eliminating NAT data processing charges for that traffic. Moving Regions (A) adds cross-Region transfer and complexity without removing the NAT from the path. Replacing the NAT (C) does not solve per-GB costs. Making the bucket public (D) weakens security, and instances in private subnets would still need the NAT gateway to reach the bucket's public endpoint.
Key Term Review
Review these core terms and patterns, which you will see on the exam.
- Multi-AZ RDS deployment: An RDS configuration with a primary DB instance and a synchronous standby in a different Availability Zone, providing automatic failover for high availability but not additional read capacity.
- Read replica (RDS): A separate RDS instance that receives asynchronous replication from a primary database and is used to scale read traffic or offload reporting workloads.
- Aurora Global Database: An Amazon Aurora feature that creates a primary cluster in one Region and one or more secondary read-only clusters in other Regions with low-latency replication, used for global reads and disaster recovery.
- ElastiCache: A managed in-memory caching service (Redis or Memcached) used to offload read traffic from databases and improve application performance and cost-efficiency.
- Intra-AZ vs cross-AZ data transfer: Intra-AZ traffic (within the same Availability Zone) is typically free or low cost, while cross-AZ traffic (between AZs in the same Region) is billed per GB and should be minimized for high-volume flows.
- Internet egress: Data transferred from AWS to the public internet, usually the most expensive type of data transfer and a key target for optimization using CloudFront and caching.
- CloudFront for cost optimization: Using CloudFront to cache and deliver content from edge locations, reducing origin load and often lowering data transfer out charges compared to serving directly from S3 or EC2.
- VPC gateway endpoint: A VPC feature that provides private, routed access from a VPC to S3 or DynamoDB without using public IPs or NAT gateways, reducing NAT data processing costs and improving security.
- Cost optimization pillar (Well-Architected): The cost optimization pillar includes the continual process of refinement and improvement of a system over its entire lifecycle to build and operate cost-aware systems that achieve business outcomes and minimize costs.
- Performance efficiency pillar (Well-Architected): The performance efficiency pillar focuses on the efficient use of computing resources to meet requirements and maintain that efficiency as demand changes and technologies evolve.
Key Terms
- Aurora: A cloud-native relational database engine from AWS that is compatible with MySQL and PostgreSQL and uses a distributed, shared storage architecture across multiple Availability Zones.
- Multi-AZ: An RDS deployment pattern where a primary DB instance synchronously replicates to a standby in another Availability Zone for high availability and automatic failover.
- Amazon RDS: A managed relational database service that supports engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Aurora, handling backups, patching, and basic operations.
- CloudFront: Amazon's content delivery network service that caches and serves content from edge locations to reduce latency and origin load.
- ElastiCache: A managed in-memory caching service for Redis and Memcached used to reduce database load and improve application performance.
- Read replica: An asynchronously replicated copy of a primary database instance used to scale read traffic and offload reporting or analytics workloads.
- VPC endpoint: A private connection between a VPC and supported AWS services that does not require an internet gateway, NAT device, VPN connection, or AWS Direct Connect.
- Gateway endpoint: A type of VPC endpoint that provides a target for a route in your route table for traffic destined to S3 or DynamoDB.
- Data transfer out: Network traffic leaving an AWS resource, either to the internet, another AZ, or another Region; often a significant cost component.
- Interface endpoint: A type of VPC endpoint powered by AWS PrivateLink that uses private IP addresses in your VPC to connect to supported services.