Chapter 6 of 9
High-Performing Storage and Databases: S3, EBS, RDS, and DynamoDB
Storage and database questions often hide performance bottlenecks in the fine print—throughput limits, access patterns, or consistency needs. This module trains you to spot those clues and choose the right mix of S3, EBS, RDS, DynamoDB, and caching.
Big Picture: Storage and Database Performance on AWS
Why This Module Matters
Storage and database choices often decide whether a system flies or crawls. Performance bottlenecks usually hide in access patterns, throughput limits, and consistency needs.
Core Services
We focus on four core AWS data services: S3 (object), EBS (block), RDS/Aurora (relational), and DynamoDB (NoSQL). We also touch on EFS, FSx, caching, and read replicas.
Performance Dimensions
Always ask: access pattern, latency, throughput, consistency, and concurrency. These five dimensions guide you to the right storage or database option.
Current Capabilities
We use AWS capabilities as of mid-2026. Some services changed over time; we emphasize how they work now so you choose the right modern patterns.
Step 1: Object vs Block vs File – Choosing the Right Storage
Three Storage Models
AWS gives you object (S3), block (EBS), and file (EFS/FSx) storage. The right choice depends on how your app reads and writes data.
S3 – Object Storage
S3 stores objects in buckets, accessed by key. Great for big files, backups, logs, and static assets. High durability and scale, but objects are written whole: you can read a byte range of an object, yet you cannot modify part of it in place.
EBS – Block Storage
EBS is a network-attached disk, normally attached to one EC2 instance at a time (io1/io2 Multi-Attach is the exception). It offers low-millisecond latency and configurable IOPS, which suits databases and random I/O workloads.
EFS and FSx – File Storage
EFS and FSx provide shared file systems. EFS is NFS for Linux; FSx offers specialized file systems (Lustre, ONTAP, Windows). Multiple servers can mount them.
Choosing Quickly
Think: huge object store over HTTP → S3; low-latency disk for one server → EBS; shared POSIX file system → EFS/FSx.
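The rule of thumb above can be written as a tiny decision helper. This is only a sketch of the heuristic in this step, not an AWS API: the `Workload` fields and the `choose_storage` function are hypothetical names invented for illustration, and real designs weigh more factors (cost, latency targets, existing tooling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    # Hypothetical flags that summarize the keywords in an exam question.
    needs_shared_file_system: bool  # many servers mount the same files
    needs_low_latency_disk: bool    # one server needs a fast local-style disk
    windows_clients: bool = False   # file shares are consumed by Windows hosts


def choose_storage(w: Workload) -> str:
    """Map workload keywords to a storage model, following the rule of thumb."""
    if w.needs_shared_file_system:
        # Shared file storage: EFS for Linux/NFS, FSx for Windows shares.
        return "FSx for Windows File Server" if w.windows_clients else "EFS"
    if w.needs_low_latency_disk:
        # Block storage attached to a single instance.
        return "EBS"
    # Default: large objects fetched by key over HTTP.
    return "S3"


print(choose_storage(Workload(False, True)))   # EBS
print(choose_storage(Workload(True, False)))   # EFS
print(choose_storage(Workload(False, False)))  # S3
```

The order of the checks matters: a shared file system requirement rules out EBS and S3 first, which mirrors how you would eliminate answer options.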
Step 2: Matching Storage to Workload – Practical Scenarios
Scenario A: Video Streaming
Millions of large video files, global users, traffic spikes. Best fit: S3 plus CloudFront. S3 scales automatically and handles huge throughput.
Scenario B: OLTP Database
Many small random reads/writes with low latency. Best fit: EBS gp3/io2 for the DB instance. Provision IOPS and throughput to meet demand.
Scenario C: Shared Repository
Dozens of EC2 instances need a shared file system. Best fit: EFS for Linux instances (POSIX over NFS) or FSx for Windows File Server for Windows instances. S3 is not a POSIX file system.
Mental Images
Picture a bucket for S3, a disk for EBS, and a shared folder for EFS/FSx. Map question keywords to these images to choose quickly.
Step 3: Relational vs NoSQL – RDS, Aurora, and DynamoDB
Relational Databases: RDS & Aurora
Relational DBs use SQL, schemas, and ACID transactions. RDS manages engines like MySQL and PostgreSQL; Aurora is AWS’s high-performance relational engine.
Aurora Performance
Aurora uses distributed storage, auto-scales storage, supports many read replicas, and offers strong performance and fast recovery for OLTP workloads.
DynamoDB: NoSQL at Scale
DynamoDB is a key-value and document store with single-digit ms latency, on-demand or provisioned capacity, and global tables for multi-Region scale.
Choosing Relational vs NoSQL
Need complex joins and multi-table transactions? Choose RDS/Aurora. Need huge scale with simple key-based access? Choose DynamoDB.
Step 4: Choose the Right DB – Thought Exercise
For each mini-scenario, decide whether RDS/Aurora or DynamoDB is a better fit for performance and requirements. Justify your choice in one sentence.
- Banking transactions system
- Features: account balances, money transfers between accounts, strong consistency, complex reports.
- Your choice and why?
- User session store for a global web app
- Features: store simple JSON blobs (session data), very high RPS, low latency, access by session ID, no complex queries.
- Your choice and why?
- Product catalog with flexible attributes
- Features: millions of products, each with different attributes (size, color, etc.), access mainly by product ID or category, high read volume.
- Your choice and why?
Pause and write your answers. Then compare with the guidance below.
Suggested answers (do not peek before thinking):
- Banking system → RDS/Aurora (needs strong ACID transactions and complex queries).
- Session store → DynamoDB (simple key-value access, very high scale, low latency).
- Product catalog → Often DynamoDB (flexible attributes, high reads), possibly with a search service like OpenSearch for complex filtering.
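The reasoning behind these answers can be sketched as a small function. The `Requirements` fields and `choose_database` are hypothetical names used only for illustration, and the fallback to a relational engine when nothing decisive is known is an assumption of this sketch, not a rule from AWS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    # Hypothetical flags extracted from a scenario description.
    needs_joins: bool
    needs_multi_table_transactions: bool
    key_based_access: bool


def choose_database(req: Requirements) -> str:
    """Apply the relational-vs-NoSQL rule of thumb from this module."""
    if req.needs_joins or req.needs_multi_table_transactions:
        return "RDS/Aurora"
    if req.key_based_access:
        return "DynamoDB"
    # Assumption: with no clear signal, default to a relational engine
    # because it supports ad hoc queries.
    return "RDS/Aurora"


banking = Requirements(True, True, False)
sessions = Requirements(False, False, True)
catalog = Requirements(False, False, True)

print(choose_database(banking))   # RDS/Aurora
print(choose_database(sessions))  # DynamoDB
print(choose_database(catalog))   # DynamoDB
```

Note that the catalog only lands on DynamoDB because its access is mainly by product ID or category; if complex filtering dominated, you would pair it with a search service as suggested above.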
Step 5: Performance Tuning – EBS, RDS, and DynamoDB
Tuning EBS
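Pick the right EBS type: gp3 for general-purpose SSD, io2 for sustained high IOPS. On gp3 you provision IOPS and throughput independently of volume size. If one volume is not enough, stripe several with RAID 0 for more aggregate throughput, keeping in mind that RAID 0 adds no redundancy.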
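To get a feel for RAID 0 sizing, here is a minimal estimate of how many striped volumes a target would need. It assumes ideal linear scaling across the stripe and ignores the EC2 instance's own EBS bandwidth limit, which often becomes the real ceiling. The per-volume numbers in the example are illustrative settings chosen for the arithmetic, not quoted service limits, and `volumes_needed` is a hypothetical helper.

```python
import math


def volumes_needed(target_iops: int, target_mibps: int,
                   per_volume_iops: int, per_volume_mibps: int) -> int:
    """Estimate RAID 0 stripe width, assuming performance adds up linearly."""
    by_iops = math.ceil(target_iops / per_volume_iops)
    by_throughput = math.ceil(target_mibps / per_volume_mibps)
    # The stricter of the two requirements decides the stripe width.
    return max(by_iops, by_throughput, 1)


# Example: the workload needs 40,000 IOPS and 1,500 MiB/s, and each volume
# is provisioned (hypothetically) for 16,000 IOPS and 1,000 MiB/s.
print(volumes_needed(40_000, 1_500, 16_000, 1_000))  # 3
```

Here IOPS is the binding constraint (three volumes), even though two volumes would satisfy throughput alone.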
Tuning RDS/Aurora
Scale up instances, add read replicas, use Aurora reader endpoints, and manage connections with RDS Proxy. Do not forget indexes and query tuning.
Tuning DynamoDB Capacity
Choose on-demand for unpredictable traffic or provisioned with auto scaling for steady workloads. In provisioned mode, adjust read and write capacity units (RCUs/WCUs) to meet throughput needs.
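Estimating provisioned capacity is simple arithmetic. One RCU covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one WCU covers one write per second of an item up to 1 KB; item sizes round up to the next unit. The helper names below are hypothetical, and the sketch leaves out transactional reads and writes, which cost more.

```python
import math


def read_capacity_units(reads_per_sec: int, item_kb: float,
                        strongly_consistent: bool = True) -> int:
    """RCUs for a steady read rate; items round up to 4 KB blocks."""
    units_per_read = math.ceil(item_kb / 4)
    total = reads_per_sec * units_per_read
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(writes_per_sec: int, item_kb: float) -> int:
    """WCUs for a steady write rate; items round up to 1 KB blocks."""
    return writes_per_sec * math.ceil(item_kb)


# 500 reads/s of 6 KB items, and 100 writes/s of 2.5 KB items.
print(read_capacity_units(500, 6))                             # 1000
print(read_capacity_units(500, 6, strongly_consistent=False))  # 500
print(write_capacity_units(100, 2.5))                          # 300
```

The rounding is why trimming an item from just over 4 KB to just under it can halve your read cost.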
Avoiding Hot Partitions
In DynamoDB, hot partitions occur when one partition key gets too much traffic. Design partition keys that spread load evenly. DAX can absorb repeated reads of a hot key, but it does not relieve a hot key on the write path.
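One common way to spread writes for a naturally hot key is write sharding: append a suffix derived from something that varies per item, so the traffic lands on several partition key values instead of one. The sketch below is an assumption-laden illustration; the `#` separator, the shard count of 10, and both function names are hypothetical choices, not DynamoDB features.

```python
import hashlib


def sharded_partition_key(base_key: str, item_id: str,
                          shard_count: int = 10) -> str:
    """Derive a stable shard suffix from item_id and append it to base_key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> list[str]:
    """Every key a reader must query to reassemble the full data set."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


# Hypothetical example: scores for one popular game, sharded by player ID.
key = sharded_partition_key("game-42", "player-1001")
print(key in all_shard_keys("game-42"))  # True
```

The trade-off is on the read side: fetching everything for `game-42` now means querying each shard key and merging the results, so pick the shard count with that cost in mind.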
Step 6: Caching and Read Replicas – Speeding Up Reads
Why Cache?
Caching stores hot data in memory, giving microsecond to sub-millisecond access. It protects your database from read storms and cuts latency.
ElastiCache Pattern
App checks ElastiCache first; on a miss, it reads from the DB and then populates the cache. Great for sessions, user profiles, and product data.
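The flow just described is usually called cache-aside (or lazy loading). Here is a minimal sketch in which a plain dictionary stands in for ElastiCache and a callback stands in for the database query; the `CacheAside` class and its method names are hypothetical, and a real cache would also set a TTL on each entry.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Check the cache first; on a miss, load from the DB and populate."""

    def __init__(self, load_from_db: Callable[[str], Optional[str]]) -> None:
        self._cache: Dict[str, str] = {}  # stand-in for ElastiCache
        self._load_from_db = load_from_db
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._load_from_db(key)
        if value is not None:
            # Populate the cache so the next read skips the database.
            self._cache[key] = value
        return value

    def invalidate(self, key: str) -> None:
        """Call after a write so readers do not see stale data."""
        self._cache.pop(key, None)


products = {"p1": "Widget"}          # stand-in for a database table
store = CacheAside(products.get)
store.get("p1")                      # miss: goes to the database
store.get("p1")                      # hit: served from memory
print(store.hits, store.misses)      # 1 1
```

Only successful lookups are cached here, so repeated requests for a missing key still reach the database; whether to cache "not found" results is a design choice that depends on your traffic.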
Read Replicas
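RDS and Aurora read replicas handle read-only queries. Send reporting and dashboards to replicas to offload the primary DB. Replicas can lag slightly behind the primary, so keep reads that must see the latest write on the primary.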
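In application code, offloading reads often comes down to choosing an endpoint per query. The sketch below rotates read-only statements across replicas and sends everything else to the primary. The `EndpointRouter` class, the hostnames, and the "starts with SELECT" test are all illustrative assumptions; that test is deliberately naive and would misroute statements such as `SELECT ... FOR UPDATE`. With Aurora you would typically point reads at the reader endpoint instead of rotating hosts yourself.

```python
import itertools
from typing import List


class EndpointRouter:
    """Send reads to replicas in round-robin order, writes to the primary."""

    def __init__(self, primary: str, replicas: List[str]) -> None:
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, sql: str) -> str:
        # Naive heuristic (assumption): a statement is a read if it
        # starts with SELECT.
        is_read = sql.lstrip().lower().startswith("select")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = EndpointRouter(
    "primary.example.internal",
    ["replica-1.example.internal", "replica-2.example.internal"],
)
print(router.endpoint_for("SELECT * FROM orders"))       # replica-1.example.internal
print(router.endpoint_for("UPDATE orders SET paid = 1")) # primary.example.internal
```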
DynamoDB Caching
Use DAX or ElastiCache to cache DynamoDB reads, especially for read-heavy workloads needing even lower latency.
E-commerce Example
Product data in Aurora, cached in Redis. Product pages hit Redis; Aurora read replicas handle remaining reads. This keeps the site fast during sales.
Step 7: Quick Check – Picking the Right Tool
Answer this question to check your understanding of storage and database choices.
You are designing a high-traffic leaderboard service for a mobile game. Requirements: (1) store player scores as simple key-value pairs, (2) extremely high read/write throughput with single-digit millisecond latency, (3) traffic is spiky and unpredictable, (4) you need to scale globally in the future. Which combination is the **best** starting point?
- A) RDS MySQL with Multi-AZ and read replicas
- B) Aurora PostgreSQL with a large instance and ElastiCache
- C) DynamoDB with on-demand capacity and potential global tables
- D) S3 for scores plus CloudFront for global distribution
Answer: C) DynamoDB with on-demand capacity and potential global tables
DynamoDB is designed for high-throughput key-value workloads with single-digit millisecond latency. On-demand capacity handles spiky, unpredictable traffic. Global tables support future multi-Region replication. RDS and Aurora are better for relational workloads, and S3 is not suitable for low-latency, high-frequency score updates.
Step 8: Key Term Review
Review these terms to reinforce the most important concepts for high-performing storage and databases on AWS.
- S3 (Simple Storage Service)
- AWS object storage service. Stores objects in buckets, accessed by key over HTTP/S. Great for large files, backups, logs, and data lakes with high durability and scalability.
- EBS (Elastic Block Store)
- Block storage volumes for a single EC2 instance. Low-latency random I/O, ideal for databases and OS disks. Performance depends on volume type, size, and provisioned IOPS.
- EFS (Elastic File System)
- Managed NFS file system for Linux instances. Shared, elastic, and scalable file storage with POSIX semantics, mounted by many instances at once.
- FSx
- Family of managed high-performance file systems (e.g., Lustre, NetApp ONTAP, Windows File Server, OpenZFS) for specialized workloads and shared storage.
- RDS (Relational Database Service)
- Managed relational database service supporting engines like MySQL and PostgreSQL. Handles backups, patching, and provides Multi-AZ and read replicas.
- Aurora
- AWS-built high-performance relational database compatible with MySQL and PostgreSQL. Uses distributed storage, supports many replicas, and offers strong performance and resilience.
- DynamoDB
- Fully managed NoSQL key-value and document database. Single-digit millisecond latency at scale, with on-demand or provisioned capacity and optional global tables.
- Read Replica
- A read-only copy of a database used to offload read traffic from the primary instance. Common in RDS and Aurora for scaling reads.
- DAX (DynamoDB Accelerator)
- In-memory cache for DynamoDB that provides microsecond read latency by caching responses to read requests.
- Hot Partition (DynamoDB)
- A partition receiving disproportionately high traffic due to poor key distribution. Causes throttling and performance issues; avoided by designing better partition keys.
Key Terms
- S3
- AWS Simple Storage Service; highly durable, scalable object storage for files, backups, and data lakes.
- DAX
- DynamoDB Accelerator; an in-memory cache that integrates with DynamoDB to provide microsecond read latency.
- EBS
- Elastic Block Store; network-attached block storage for a single EC2 instance, used for OS and database volumes.
- EFS
- Elastic File System; managed NFS file storage that can be mounted by many Linux instances simultaneously.
- FSx
- Family of managed file systems (Lustre, NetApp ONTAP, Windows File Server, OpenZFS) optimized for specific high-performance workloads.
- RDS
- Relational Database Service; managed service for traditional relational database engines like MySQL and PostgreSQL.
- ACID
- Set of properties (Atomicity, Consistency, Isolation, Durability) that guarantee reliable processing of database transactions.
- IOPS
- Input/Output Operations Per Second; a measure of how many read/write operations a storage device can perform per second.
- Aurora
- AWS-built relational database compatible with MySQL and PostgreSQL, with a distributed storage engine for high performance and availability.
- DynamoDB
- Fully managed NoSQL key-value and document database with single-digit millisecond latency and automatic scaling.
- ElastiCache
- AWS managed in-memory cache service supporting Redis and Memcached, used to reduce database load and latency.
- Read replica
- Read-only database instance that receives replicated data from a primary instance, used to scale read workloads.
- Hot partition
- In DynamoDB, a partition that receives too much traffic relative to others, often due to skewed partition key design, causing throttling and latency.
- On-demand capacity (DynamoDB)
- Billing and scaling mode where DynamoDB automatically adapts capacity to traffic without specifying RCUs/WCUs.
- Provisioned capacity (DynamoDB)
- Mode where you specify read and write capacity units; auto scaling can adjust them based on utilization.