
Chapter 19 of 26

Cost-Optimized Databases and Data Transfer: RDS, Caching, and Network Economics

Database and data transfer charges can surprise even experienced teams. This module explains how to keep relational workloads and network paths cost-effective while still meeting performance and availability goals.


Big Picture: Why Databases and Data Transfer Get Expensive

Why This Matters

Databases and data transfer often become surprise cost centers. This module connects your EC2 and storage knowledge to relational databases (Amazon RDS) and network paths.

Cost Optimization Pillar

The AWS Well-Architected cost optimization pillar is about continually refining systems to achieve business outcomes at minimum cost. RDS and data transfer are key places to apply it.

Main Cost Levers

Your main levers: how you size/configure RDS, how much traffic hits the DB vs caches/replicas, and where data flows (intra-AZ, cross-AZ, cross-Region, internet).

Exam Mindset

On the exam, if two designs meet reliability and performance, the one that reduces RDS capacity and cross-AZ/cross-Region traffic is usually the cost-optimized answer.

RDS Cost Components: What You Actually Pay For

Instance Hours

RDS charges per DB instance-hour based on instance class and engine. Aurora Serverless v2 uses Aurora capacity units (ACUs) instead of instance-hours.

Storage Types

You pay per GB-month of storage. gp3 is the lower-cost general-purpose option and lets you provision IOPS and throughput independently of volume size; io1/io2 provide provisioned IOPS at higher cost. Aurora storage auto-scales and is billed per GB used.

I/O and Throughput

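I/O billing depends on the engine and storage type: some RDS setups include I/O in the storage price, others bill per million requests. Aurora's standard configuration charges per million I/O operations, which can add up for chatty workloads; Aurora I/O-Optimized removes the per-request charge in exchange for higher instance and storage rates.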

Backups and Snapshots

Backup storage up to the size of your provisioned database storage is included at no extra charge. Extra manual snapshots, long retention, and cross-Region copies incur additional GB-month and data transfer charges.
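As a rough illustration of the "included up to DB size" rule, the sketch below estimates how much backup storage would be billable. The function names and the per-GB rate are assumptions made for this example, not published prices.

```python
def billable_backup_gb(provisioned_gb: float, backup_gb: float) -> float:
    """Backup storage beyond the free allowance.

    Assumes the allowance equals the provisioned database storage.
    """
    return max(0.0, backup_gb - provisioned_gb)


def monthly_backup_cost(provisioned_gb: float, backup_gb: float,
                        rate_per_gb_month: float) -> float:
    """Monthly backup charge; the rate is a placeholder supplied by the caller."""
    return billable_backup_gb(provisioned_gb, backup_gb) * rate_per_gb_month


if __name__ == "__main__":
    # A 200 GB database holding 350 GB of automated backups and snapshots.
    print(billable_backup_gb(200, 350))
    # Hypothetical rate of $0.10 per GB-month.
    print(monthly_backup_cost(200, 350, 0.10))
```

The useful takeaway is the shape of the formula: only the overage is charged, so trimming retention matters most once backups exceed the database size.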

Multi-AZ and Replicas

Multi-AZ and read replicas are separate instances with their own compute and storage costs. Exam questions often hide these costs behind "high availability" wording.

Worked Example: Comparing Two RDS Configurations

Two Options

Option A: Single-AZ db.m6g.large, 200 GB gp3, 7-day backups. Option B: Multi-AZ db.m6g.large (primary + standby), same storage and backups.

Cost Differences

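Option B roughly doubles instance cost and also raises storage cost, because the standby keeps its own copy of the data. The synchronous replication traffic between primary and standby is built into RDS Multi-AZ pricing, not billed as a separate transfer line item.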
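The comparison can be sketched numerically. The hourly and per-GB rates below are hypothetical placeholders, and the model assumes the Multi-AZ standby simply doubles both compute and storage; check current pricing before relying on any figure.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # common approximation for monthly estimates


@dataclass
class RdsConfig:
    name: str
    instance_hourly: float  # hypothetical $/hour for one db.m6g.large
    storage_gb: int
    storage_rate: float     # hypothetical $/GB-month for gp3
    multi_az: bool

    def monthly_cost(self) -> float:
        # Assumption: the standby has its own compute and storage,
        # so both components are modeled as doubled for Multi-AZ.
        copies = 2 if self.multi_az else 1
        compute = self.instance_hourly * HOURS_PER_MONTH * copies
        storage = self.storage_gb * self.storage_rate * copies
        return compute + storage


option_a = RdsConfig("Option A (Single-AZ)", 0.15, 200, 0.10, multi_az=False)
option_b = RdsConfig("Option B (Multi-AZ)", 0.15, 200, 0.10, multi_az=True)

for option in (option_a, option_b):
    print(f"{option.name}: ${option.monthly_cost():.2f}/month")
```

Backups are left out because both options use the same 7-day retention, so they do not change the difference between the two.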

Network Angle

App-to-DB traffic incurs cross-AZ charges when the app runs in a different AZ from the primary DB. Keeping high-volume app instances in the primary's AZ avoids these charges, at the price of less AZ-level resilience for the app tier.

Choosing on the Exam

If questions stress high availability and automatic failover, choose Multi-AZ. If they stress minimizing cost with no strict HA requirement, Single-AZ may be correct.

Read Replicas and Scaling Reads Cost-Effectively

What Are Read Replicas?

Read replicas are separate RDS instances that receive asynchronous replication from a primary. They exist to scale read traffic, not to provide synchronous HA.

Costs and Trade-offs

Each replica has full instance and storage cost. They can be cheaper than a huge primary for read-heavy workloads, but are only eventually consistent.
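To make the trade-off concrete, here is a minimal sketch comparing "small primary plus replicas" with "one large primary". The hourly rates and the reads-per-replica capacity are invented for illustration, and storage is ignored even though each replica also pays for it.

```python
import math

HOURS_PER_MONTH = 730


def replicas_needed(read_qps: float, qps_per_replica: float) -> int:
    """Smallest replica count that covers the read load (capacity is an assumption)."""
    return math.ceil(read_qps / qps_per_replica)


def monthly_compute(hourly_rate: float, instance_count: int = 1) -> float:
    """Compute-only monthly cost for a number of identical instances."""
    return hourly_rate * HOURS_PER_MONTH * instance_count


# Hypothetical figures: a small instance at $0.25/h handling about 1,000 read QPS,
# versus a single large instance at $2.00/h.
replicas = replicas_needed(read_qps=2500, qps_per_replica=1000)
scale_out = monthly_compute(0.25, 1 + replicas)  # primary plus replicas
scale_up = monthly_compute(2.00)

print(f"replicas={replicas} scale_out=${scale_out:.2f} scale_up=${scale_up:.2f}")
```

Whether scale-out wins depends entirely on the real prices and on whether the application tolerates replica lag.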

Aurora Read Scaling

Aurora read replicas share a common storage volume, reducing storage duplication. A cluster can have up to 15 read replicas, and Aurora Global Database extends reads across Regions.

Common Exam Trap

Multi-AZ is for high availability of writes, not read scaling. If the need is "scale reads and reduce cost", think read replicas and/or caching, not just Multi-AZ.

Caching to Reduce Database Load and Cost

Why Cache?

Caching avoids unnecessary DB queries. ElastiCache (Redis/Memcached) can serve hot data from memory, reducing RDS load and instance size requirements.

Patterns

Typical flow: app checks cache, on miss queries RDS, then stores the result in cache with a TTL. Cache entire query results for frequently accessed pages or lists.
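The flow described above is often called cache-aside. Below is a minimal sketch of it, using a small in-process TTL cache as a stand-in for a Redis or Memcached client; `TtlCache`, `get_with_cache`, and `load_from_db` are hypothetical names, not part of any AWS SDK.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Tiny in-memory stand-in for an ElastiCache client."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


def get_with_cache(cache: TtlCache, key: str,
                   load_from_db: Callable[[str], Any],
                   ttl_seconds: float = 60.0) -> Any:
    """Cache-aside read: check the cache, fall back to the DB, then populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(key)  # only runs on a miss
    cache.set(key, value, ttl_seconds)
    return value


if __name__ == "__main__":
    cache = TtlCache()
    fake_query = lambda key: f"rows for {key}"  # stands in for an RDS query
    print(get_with_cache(cache, "front-page", fake_query))  # miss, hits the "DB"
    print(get_with_cache(cache, "front-page", fake_query))  # served from cache
```

In this sketch a `None` result is never cached, so lookups for missing rows always reach the database; a real deployment may want to cache negative results too.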

Cost Angle

ElastiCache nodes cost money, but can offload huge read volumes. Often, "ElastiCache + smaller RDS" is cheaper than a single very large RDS instance.
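A back-of-the-envelope way to reason about this: the reads that still reach the database are the total reads multiplied by the cache miss rate. The sketch below shows that arithmetic; the function names and the monthly figures are hypothetical.

```python
def db_reads_per_second(total_reads_per_second: float,
                        cache_hit_ratio: float) -> float:
    """Reads that still reach the database after caching."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_reads_per_second * (1.0 - cache_hit_ratio)


def cache_plus_small_db_is_cheaper(large_db_monthly: float,
                                   small_db_monthly: float,
                                   cache_monthly: float) -> bool:
    """True when the cache plus a smaller DB costs less than one large DB."""
    return small_db_monthly + cache_monthly < large_db_monthly


if __name__ == "__main__":
    # 10,000 reads/s with a 90% hit ratio leaves roughly 1,000 reads/s for RDS.
    print(db_reads_per_second(10_000, 0.90))
    # Hypothetical monthly costs in dollars.
    print(cache_plus_small_db_is_cheaper(1400, 350, 250))
```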

Consistency Trade-offs

Caches can be stale. Use TTLs and invalidation strategies. For critical, always-fresh data, you may bypass cache or use very short TTLs.
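One common invalidation strategy is to delete the cached entry whenever the underlying row changes. The sketch below uses plain dictionaries as stand-ins for RDS and ElastiCache; the function names and the sample key are made up for illustration.

```python
from typing import Dict, Optional

database: Dict[str, str] = {"headline:1": "Old title"}  # stand-in for RDS
cache: Dict[str, str] = {}                               # stand-in for ElastiCache


def read_headline(key: str) -> Optional[str]:
    """Serve from cache when possible; otherwise read the DB and populate."""
    if key in cache:
        return cache[key]
    value = database.get(key)
    if value is not None:
        cache[key] = value
    return value


def update_headline(key: str, new_value: str) -> None:
    """Write to the source of truth first, then drop the stale cache entry."""
    database[key] = new_value
    cache.pop(key, None)  # the next read repopulates the cache


if __name__ == "__main__":
    print(read_headline("headline:1"))   # populates the cache
    update_headline("headline:1", "New title")
    print(read_headline("headline:1"))   # reflects the update
```

Invalidating on write keeps readers from seeing the old value for a full TTL, while the TTL still acts as a safety net for writes that bypass this code path.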

Performance Efficiency

Caching aligns with the performance efficiency pillar: using resources efficiently as demand changes, by serving frequent reads from fast, cheap memory.

RDS Deployment Options: Cost vs Resilience

Single-AZ

Single-AZ RDS has one instance in one AZ. It is cheapest but cannot automatically fail over if the AZ fails, leading to longer downtime.

Multi-AZ Instances

Multi-AZ creates a synchronous standby in another AZ with automatic failover. It improves availability but doubles instance costs and increases storage cost.

Read Replicas vs HA

Read replicas scale reads and can be promoted manually. They are not the same as Multi-AZ, which is focused on automatic high availability.

Aurora Choices

Aurora replicates storage across AZs by default and supports multiple readers and Global Database for cross-Region DR and low-latency reads.

Choosing on the Exam

Mention of RPO/RTO, AZ failure, or automatic failover points to Multi-AZ or Aurora. Emphasis on lowest cost with tolerable downtime points to Single-AZ.
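The selection logic in this section can be written down as a small study aid. This is only a heuristic encoding of the rules above, with a hypothetical function name and parameters; it is not AWS guidance and ignores engine-specific details.

```python
from typing import List


def suggest_deployment(automatic_failover: bool,
                       scale_reads: bool,
                       cross_region: bool = False) -> List[str]:
    """Map exam-style requirements to the deployment options discussed above."""
    choices: List[str] = []
    if cross_region:
        choices.append("Aurora Global Database")
    elif automatic_failover:
        choices.append("Multi-AZ or Aurora")
    else:
        choices.append("Single-AZ")  # cheapest when downtime is tolerable
    if scale_reads:
        choices.append("read replicas and/or caching")
    return choices


if __name__ == "__main__":
    print(suggest_deployment(automatic_failover=True, scale_reads=False))
    print(suggest_deployment(automatic_failover=False, scale_reads=True))
```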

Data Transfer Pricing Basics: Intra-AZ, Cross-AZ, Cross-Region, Internet

Intra-AZ

Traffic between resources in the same AZ is typically free when it uses private IP addresses. Placing EC2 and RDS in the same AZ minimizes data transfer charges.

Cross-AZ

Traffic between different AZs in the same Region is billed per GB in each direction. Chatty app-to-DB traffic across AZs can become a noticeable cost.

Cross-Region

Data transfer between Regions is more expensive than cross-AZ. It includes RDS cross-Region replicas and S3 cross-Region replication.

Internet Egress

Data leaving AWS to the internet is usually the most expensive. Inbound traffic from the internet is generally free, so outbound egress is what you optimize.

Exam Pattern

When comparing designs, the cost-optimized one usually keeps traffic local (same AZ/Region) and minimizes internet egress and cross-Region flows.
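To compare designs quickly, it can help to classify each flow by path type and apply a per-GB rate. In the sketch below, the rates are placeholders chosen only to preserve the ordering described in this section, and `Endpoint`, `classify`, and `monthly_transfer_cost` are hypothetical names.

```python
from dataclasses import dataclass

# Placeholder $/GB rates: illustrative ordering only, not actual prices.
RATE_PER_GB = {
    "intra-az": 0.00,
    "cross-az": 0.01,
    "cross-region": 0.02,
    "internet": 0.09,
}


@dataclass(frozen=True)
class Endpoint:
    region: str  # e.g. "us-east-1"; empty string for an internet client
    az: str      # e.g. "us-east-1a"; empty string for an internet client


INTERNET = Endpoint(region="", az="")


def classify(src: Endpoint, dst: Endpoint) -> str:
    """Return the path type for traffic sent from src to dst."""
    if not dst.region:
        return "internet"
    if src.region != dst.region:
        return "cross-region"
    if src.az != dst.az:
        return "cross-az"
    return "intra-az"


def monthly_transfer_cost(src: Endpoint, dst: Endpoint, gb_per_month: float) -> float:
    """Estimated monthly charge for one direction of a flow."""
    return RATE_PER_GB[classify(src, dst)] * gb_per_month


if __name__ == "__main__":
    app = Endpoint("us-east-1", "us-east-1a")
    db = Endpoint("us-east-1", "us-east-1b")
    print(classify(app, db), monthly_transfer_cost(app, db, 5_000))
    print(classify(app, INTERNET), monthly_transfer_cost(app, INTERNET, 5_000))
```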

Cost-Aware Use of CloudFront and VPC Endpoints

CloudFront Basics

CloudFront caches content at edge locations, reducing load on origins and often lowering internet egress cost compared to direct S3/EC2 delivery.

CloudFront Exam Angle

For global users accessing S3 static sites or APIs, CloudFront usually improves performance and reduces cost. Expect this pattern on the exam.

VPC Endpoints

Gateway endpoints (S3/DynamoDB) add a target to your route tables, while interface endpoints (PrivateLink) place private IPs in your subnets. Both let VPC resources reach AWS services without traversing the public internet.

NAT vs Endpoints

NAT gateways bill per hour and per GB processed. Heavy S3/DynamoDB traffic via NAT is costly. A gateway endpoint keeps that traffic on AWS backbone and avoids NAT data charges.
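A quick estimate shows why this matters. The sketch below models a NAT gateway bill as an hourly charge plus a per-GB processing charge, and assumes the gateway endpoint itself adds no charge; both rates are hypothetical placeholders.

```python
HOURS_PER_MONTH = 730


def nat_monthly_cost(gb_processed: float, hourly_rate: float,
                     per_gb_rate: float) -> float:
    """Hourly charge plus per-GB data processing for one NAT gateway."""
    return hourly_rate * HOURS_PER_MONTH + gb_processed * per_gb_rate


def savings_from_gateway_endpoint(s3_gb: float, per_gb_rate: float) -> float:
    """NAT data processing avoided when S3 traffic moves to a gateway endpoint.

    The NAT hourly charge remains if other traffic still needs the NAT.
    """
    return s3_gb * per_gb_rate


if __name__ == "__main__":
    # Hypothetical rates: $0.045/hour and $0.045/GB; 10 TB of S3 traffic per month.
    before = nat_monthly_cost(10_000, 0.045, 0.045)
    saved = savings_from_gateway_endpoint(10_000, 0.045)
    print(f"before=${before:.2f} saved=${saved:.2f} after=${before - saved:.2f}")
```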

Common Trap

If a scenario complains about high NAT costs for S3/DynamoDB, the fix is usually "add a VPC gateway endpoint", not "buy a bigger NAT".
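For reference, adding a gateway endpoint is a small change. The helper below only builds the request parameters so it can run without AWS credentials; `gateway_endpoint_request` is a hypothetical name, and the VPC and route table IDs are placeholders.

```python
from typing import Dict, List


def gateway_endpoint_request(vpc_id: str, region: str,
                             route_table_ids: List[str],
                             service: str = "s3") -> Dict[str, object]:
    """Build parameters for creating a gateway endpoint for S3 or DynamoDB."""
    if service not in ("s3", "dynamodb"):
        raise ValueError("gateway endpoints cover only S3 and DynamoDB")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.{service}",
        # Routes are added to these tables so private subnets bypass the NAT.
        "RouteTableIds": list(route_table_ids),
    }


if __name__ == "__main__":
    params = gateway_endpoint_request(
        vpc_id="vpc-0123456789abcdef0",          # placeholder ID
        region="us-east-1",
        route_table_ids=["rtb-0123456789abcdef0"],  # placeholder ID
    )
    print(params)
```

These keys are intended to match the keyword arguments of the boto3 EC2 client's `create_vpc_endpoint` call, so with boto3 installed and credentials configured something like `boto3.client("ec2").create_vpc_endpoint(**params)` should apply them; verify against the current SDK documentation before use.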

Thought Exercise: Re-Design a Costly Architecture

Imagine this architecture for a news website:

  • Users around the world access a web app running on EC2 instances in two AZs behind an Application Load Balancer (ALB).
  • The app uses RDS for MySQL in Multi-AZ with the primary in AZ A and standby in AZ B.
  • The app instances in both AZs connect to the primary DB.
  • Static images are stored in S3 and served directly to users (no CloudFront).
  • Private subnets use a single NAT gateway in AZ A to reach S3.

Costs are high. Mentally walk through these questions and jot down answers:

  1. Cross-AZ traffic: Where is cross-AZ data transfer happening? How could you reduce it while still being resilient?
  2. Database load: What data is likely read frequently and could be cached? Where would you insert ElastiCache, and how might that affect RDS size/cost?
  3. Internet egress: How could CloudFront change the cost profile of serving static content and maybe some API responses?
  4. NAT gateway: Why might NAT charges be high here? What VPC feature could you add to reduce them?

Try to produce a revised high-level design that:

  • Keeps high-volume traffic within the same AZ when possible.
  • Uses caching to offload RDS.
  • Uses CloudFront and VPC endpoints to reduce egress and NAT costs.

You do not need exact prices. Focus on which flows get cheaper and why.

Quiz 1: RDS and Caching for Cost Optimization

Check your understanding of RDS deployment and caching strategies.

A startup runs a read-heavy analytics dashboard on RDS for PostgreSQL. The single-AZ db.r6g.large instance is CPU-bound during business hours. They want to minimize cost while improving read performance. Which option is MOST cost-optimized?

  A. Upgrade the RDS instance to a larger db.r6g.4xlarge and keep the architecture the same.
  B. Enable Multi-AZ on the existing instance so reads can be split between primary and standby.
  C. Add one or more read replicas and introduce ElastiCache to cache common query results.
  D. Migrate from RDS to a larger EC2 instance running self-managed PostgreSQL.

Answer: C) Add one or more read replicas and introduce ElastiCache to cache common query results.

Read-heavy workloads are ideal for scaling out with read replicas and caching. Option C offloads reads to replicas and ElastiCache, allowing the primary to stay smaller. Multi-AZ (B) does not provide read scaling; in a Multi-AZ instance deployment the standby does not serve reads. Simply scaling up (A or D) increases cost significantly and does not leverage cheaper caching.

Quiz 2: Data Transfer and Network Economics

Test your understanding of data transfer cost patterns and optimization tools.

A company has private subnets in a VPC. Instances frequently read and write large objects to an S3 bucket in the same Region. The finance team notices high NAT gateway data processing charges. What is the BEST way to reduce these costs while maintaining security?

  A. Move the S3 bucket to a different Region with cheaper data transfer pricing.
  B. Add a VPC gateway endpoint for S3 and update route tables for the private subnets.
  C. Replace the NAT gateway with a larger NAT instance in a public subnet.
  D. Expose the S3 bucket publicly and access it over the internet instead of through the NAT.

Answer: B) Add a VPC gateway endpoint for S3 and update route tables for the private subnets.

A VPC gateway endpoint for S3 lets instances in private subnets access S3 over the AWS backbone without using the NAT gateway, eliminating NAT data processing charges for that traffic. Moving Regions (A) may not help and adds complexity and cross-Region transfer. Replacing the NAT (C) does not solve per-GB costs. Making the bucket public (D) weakens security and does not remove the NAT charges, because instances in private subnets would still reach the public S3 endpoint through the NAT.

Key Term Review

Review these core terms and patterns you will see on the exam.

Multi-AZ RDS deployment
An RDS configuration with a primary DB instance and a synchronous standby in a different Availability Zone, providing automatic failover for high availability but not additional read capacity.
Read replica (RDS)
A separate RDS instance that receives asynchronous replication from a primary database and is used to scale read traffic or offload reporting workloads.
Aurora Global Database
An Amazon Aurora feature that creates a primary cluster in one Region and one or more secondary read-only clusters in other Regions with low-latency replication, used for global reads and disaster recovery.
ElastiCache
A managed in-memory caching service (Redis or Memcached) used to offload read traffic from databases and improve application performance and cost-efficiency.
Intra-AZ vs cross-AZ data transfer
Intra-AZ traffic (within the same Availability Zone) is typically free or low cost, while cross-AZ traffic (between AZs in the same Region) is billed per GB and should be minimized for high-volume flows.
Internet egress
Data transferred from AWS to the public internet, usually the most expensive type of data transfer and a key target for optimization using CloudFront and caching.
CloudFront for cost optimization
Using CloudFront to cache and deliver content from edge locations, reducing origin load and often lowering data transfer out charges compared to serving directly from S3 or EC2.
VPC gateway endpoint
A VPC feature that provides private, routed access from a VPC to S3 or DynamoDB without using public IPs or NAT gateways, reducing NAT data processing costs and improving security.
Cost optimization pillar (Well-Architected)
The cost optimization pillar includes the continual process of refinement and improvement of a system over its entire lifecycle to build and operate cost-aware systems that achieve business outcomes and minimize costs.
Performance efficiency pillar (Well-Architected)
The performance efficiency pillar focuses on the efficient use of computing resources to meet requirements and maintain that efficiency as demand changes and technologies evolve.

Key Terms

Aurora
A cloud-native relational database engine from AWS that is compatible with MySQL and PostgreSQL and uses a distributed, shared storage architecture across multiple Availability Zones.
Multi-AZ
An RDS deployment pattern where a primary DB instance synchronously replicates to a standby in another Availability Zone for high availability and automatic failover.
Amazon RDS
A managed relational database service that supports engines like MySQL, PostgreSQL, MariaDB, Oracle, SQL Server, and Aurora, handling backups, patching, and basic operations.
CloudFront
Amazon's content delivery network service that caches and serves content from edge locations to reduce latency and origin load.
ElastiCache
A managed in-memory caching service for Redis and Memcached used to reduce database load and improve application performance.
Read replica
An asynchronously replicated copy of a primary database instance used to scale read traffic and offload reporting or analytics workloads.
VPC endpoint
A private connection between a VPC and supported AWS services that does not require an internet gateway, NAT device, VPN connection, or AWS Direct Connect.
Gateway endpoint
A type of VPC endpoint that provides a target for a route in your route table for traffic destined to S3 or DynamoDB.
Data transfer out
Network traffic leaving an AWS resource, either to the internet, another AZ, or another Region; often a significant cost component.
Interface endpoint
A type of VPC endpoint powered by AWS PrivateLink that uses private IP addresses in your VPC to connect to supported services.
